Chapter 12: Abstract Containers

Don't hesitate to send in feedback: send an e-mail if you like the C++ Annotations; if you think that important material was omitted; if you find errors or typos in the text or the code examples; or if you just feel like e-mailing. Send your e-mail to Frank B. Brokken.

Please state the document version you're referring to, as found in the title (in this document: 6.5.0) and please state chapter and paragraph name or number you're referring to.

All received mail is processed conscientiously, and received suggestions for improvements will usually have been processed by the time a new version of the Annotations is released. Except for the incidental case I will normally not acknowledge the receipt of suggestions for improvements. Please don't interpret this as me not appreciating your efforts.

C++ offers several predefined datatypes, all part of the Standard Template Library, which can be used to implement solutions to frequently occurring problems. The datatypes discussed in this chapter are all containers: you can put stuff inside them, and you can retrieve the stored information from them.

The interesting part is that the kind of data that can be stored inside these containers has been left unspecified by the time the containers were constructed. That's why they are spoken of as abstract containers.

Abstract containers rely heavily on templates, which are covered near the end of the C++ Annotations, in chapter 18. However, in order to use the abstract containers, only a minimal grasp of the template concept is needed. In C++ a template is in fact a recipe for constructing a function or a complete class. The recipe tries to abstract the functionality of the class or function as much as possible from the data on which the class or function operates. As the data types on which the templates operate were not known by the time the template was constructed, the datatypes are either inferred from the context in which a template function is used, or they are mentioned explicitly by the time a template class is used (the term that's used here is instantiated). In situations where the types are explicitly mentioned, the angle bracket notation is used to indicate which data types are required. For example, below (in section 12.2) we'll encounter the pair container, which requires the explicit mentioning of two data types. E.g., to define a pair variable containing both an int and a string, the notation

    pair<int, string> myPair;
is used. Here, myPair is defined as a pair variable, containing both an int and a string.

The angle bracket notation is used intensively in the following discussion of abstract containers. Actually, understanding this part of templates is the only real requirement for using abstract containers. Now that we've introduced this notation, we can postpone the more thorough discussion of templates to chapter 18, and concentrate on their use in this chapter.

Most of the abstract containers are sequential containers: they represent a series of data which can be stored and retrieved in some sequential way. Examples are the vector, implementing an extendable array, the list, implementing a datastructure in which insertions and deletions can be easily realized, a queue, also called a FIFO ( first in, first out) structure, in which the first element that is entered will be the first element that will be retrieved, and the stack, which is a first in, last out ( FILO or LIFO) structure.

Apart from the sequential containers, several special containers are available. The pair is a basic container in which a pair of values (of types that are left open for further specification) can be stored, like two strings, two ints, a string and a double, etc.. Pairs are often used to return data elements that naturally come in pairs. For example, the map is an abstract container storing keys and their associated values. Elements of these maps are returned as pairs.

A variant of the pair is the complex container, implementing operations that are defined on complex numbers.

All abstract containers described in this chapter and the string datatype discussed in chapter 4 are part of the Standard Template Library. There also exists an abstract container for the implementation of a hashtable, but that container is not (yet) accepted by the ANSI/ISO standard. Nevertheless, the final section of this chapter will cover the hashtable to some extent. It may be expected that containers like hash_map and other, now still considered an extension, will become part of the ANSI/ISO standard at the next release: apparently by the time the standard was frozen these containers were not yet fully available. Now that they are available they cannot be official part of the C++ library , but they are in fact available, albeit as extensions.

All containers support the following operators:

Note that before a user-defined type (usually a class-type) can be stored in a container, the user-defined type should at least support: Closely linked to the standard template library are the generic algorithms. These algorithms may be used to perform frequently occurring tasks or more complex tasks than is possible with the containers themselves, like counting, filling, merging, filtering etc.. An overview of generic algorithms and their applications is given in chapter 17. Generic algorithms usually rely on the availability of iterators, which represent begin and end-points for processing data stored within containers. The abstract containers usually support constructors and members expecting iterators, and they often have members returning iterators (comparable to the string::begin() and string::end() members). In the remainder of this chapter the iterator concept is not covered. Refer to chapter 17 for this.

The url http://www.sgi.com/Technology/STL is worth visiting by those readers who are looking for more information about the abstract containers and the standard template library than can be provided in the C++ annotations.

Containers often collect data during their lifetimes. When a container goes out of scope, its destructor tries to destroy its data elements. This only succeeds if the data elements themselves are stored inside the container. If the data elements of containers are pointers, the data pointed to by these pointers will not be destroyed, resulting in a memory leak. A consequence of this scheme is that the data stored in a container should be considered the ` property' of the container: the container should be able to destroy its data elements when the container's destructor is called. So, normally containers should contain no pointer data. Also, a container should not be required to contain const data, as const data prevent the use of many of the container's members, like the assignment operator.

12.1: Notations used in this chapter

In this chapter about containers, the following notational convention is used: Some containers, e.g., the map container, contain pairs of values, usually called `keys' and `values'. For such containers the following notational convention is used in addition:

12.2: The `pair' container

The pair container is a rather basic container. It can be used to store two elements, called first and second, and that's about it. Before pair containers can be used the following preprocessor directive must have been specified:
    #include <utility>
The data types of a pair are specified when the pair variable is defined (or declared), using the standard template (see chapter Templates) angle bracket notation:
    pair<string, string> piper("PA28", "PH-ANI");
    pair<string, string> cessna("C172", "PH-ANG");
here, the variables piper and cessna are defined as pair variables containing two strings. Both strings can be retrieved using the first and second fields of the pair type:
    cout << piper.first << endl <<      // shows 'PA28'
            cessna.second << endl;      // shows 'PH-ANG'
The first and second members can also be used to reassign values:
    cessna.first = "C152";
    cessna.second = "PH-ANW";
If a pair object must be completely reassigned, an anonymous pair object can be used as the right-hand operand of the assignment. An anonymous variable defines a temporary variable (which receives no name) solely for the purpose of (re)assigning another variable of the same type. Its generic form is
    type(initializer list)
Note that when a pair object is used the type specification is not completed by just mentioning the containername pair. It also requires the specification of the data types which are stored within the pair. For this the (template) angle bracket notation is used again. E.g., the reassignment of the cessna pair variable could have been accomplished as follows:
    cessna = pair<string, string>("C152", "PH-ANW");
In cases like these, the type specification can become quite elaborate, which has caused a revival of interest in the possibilities offered by the typedef keyword. If a lot of pair<type1, type2> clauses are used in a source, the typing effort may be reduced and legibility might be improved by first defining a name for the clause, and then using the defined name later. E.g.,
    typedef pair<string, string> pairStrStr;

    cessna = pairStrStr("C152", "PH-ANW");
Apart from this (and the basic set of operations (assignment and comparisons)) the pair offers no further functionality. It is, however, a basic ingredient of the upcoming abstract containers map, multimap and hash_map.

12.3: Sequential Containers

12.3.1: The `vector' container

The vector class implements an expandable array. Before vector containers can be used the following preprocessor directive must have been specified:
    #include <vector>
The following constructors, operators, and member functions are available:

12.3.2: The `list' container

The list container implements a list data structure. Before list containers can be used the following preprocessor directive must have been specified:
    #include <list>
The organization of a list is shown in figure 7.

Figure 7 is shown here.
Figure 7: A list data-structure


In figure 7 it is shown that a list consists of separate list-elements, connected to each other by pointers. The list can be traversed in two directions: starting at Front the list may be traversed from left to right, until the 0-pointer is reached at the end of the rightmost list-element. The list can also be traversed from right to left: starting at Back, the list is traversed from right to left, until eventually the 0-pointer emanating from the leftmost list-element is reached.

As a subtlety note that the representation given in figure 7 is not necessarily used in actual implementations of the list. For example, consider the following little program:

    int main()
    {
        list<int> l;
        cout << "size: " << l.size() << ", first element: " <<
                l.front() << endl;
    }
When this program is run it might actually produce the output:
    size: 0, first element: 0
Its front element can even be assigned a value. In this case the implementor has choosen to insert a hidden element to the list, which is actually a circular list, where the hidden element serves as terminating element, replacing the 0-pointers in figure 7. As noted, this is a subtlety, which doesn't affect the conceptual notion of a list as a data structure ending in 0-pointers. Note also that it is well known that various implementations of list-structures are possible (cf. Aho, A.V., Hopcroft J.E. and Ullman, J.D., (1983) Data Structures and Algorithms (Addison-Wesley)).

Both lists and vectors are often appropriate data structures in situations where an unknown number of data elements must be stored. However, there are some rules of thumb to follow when a choice between the two data structures must be made.

Other considerations related to the choice between lists and vectors should also be given some thought. Although it is true that the vector is able to grow dynamically, the dynamic growth does involve a lot data-copying. Clearly, copying a million large data structures takes a considerable amount of time, even on fast computers. On the other hand, inserting a large number of elements in a list doesn't require us to copy non-involved data. Inserting a new element in a list merely requires us to juggle some pointers. In figure 8 this is shown: a new element is inserted between the second and third element, creating a new list of four elements.

Figure 8 is shown here.
Figure 8: Adding a new element to a list


Removing an element from a list also is a simple matter. Starting again from the situation shown in figure 7, figure 9 shows what happens if element two is removed from our list. Again: only pointers need to be juggled. In this case it's even simpler than adding an element: only two pointers need to be rerouted.

Figure 9 is shown here.
Figure 9: Removing an element from a list


Summarizing the comparison between lists and vectors, it's probably best to conclude that there is no clear-cut answer to the question what data structure to prefer. There are rules of thumb, which may be adhered to. But if worse comes to worst, a profiler may be required to find out what's best.

But, no matter what the thoughts on the subject are, the list container is available, so let's see what we can do with it. The following constructors, operators, and member functions are available:

12.3.3: The `queue' container

The queue class implements a queue data structure. Before queue containers can be used the following preprocessor directive must have been specified:
    #include <queue>
A queue is depicted in figure 10.

Figure 10 is shown here.
Figure 10: A queue data-structure


In figure 10 it is shown that a queue has one point (the back) where items can be added to the queue, and one point (the front) where items can be removed (read) from the queue. A queue is therefore also called a FIFO data structure, for first in, first out. It is most often used in situations where events should be handled in the same order as they are generated.

The following constructors, operators, and member functions are available for the queue container:

Note that the queue does not support iterators or a subscript operator. The only elements that can be accessed are its front and back element. A queue can be emptied by:

12.3.4: The `priority_queue' container

The priority_queue class implements a priority queue data structure. Before priority_queue containers can be used the following preprocessor directive must have been specified:
    #include <queue>
A priority queue is identical to a queue, but allows the entry of data elements according to priority rules. An example of a situation where the priority queue is encountered in real-life is found at the check-in terminals at airports. At a terminal the passengers normally stand in line to wait for their turn to check in, but late passengers are usually allowed to jump the queue: they receive a higher priority than other passengers.

The priority queue uses operator<() of the data type stored in the priority ueue to decide about the priority of the data elements. The smaller the value, the lower the priority. So, the priority queue could be used to sort values while they arrive. A simple example of such a priority queue application is the following program: it reads words from cin and writes a sorted list of words to cout:

#include <iostream>
#include <string>
#include <queue>
using namespace std;

int main()
{
    priority_queue<string> q;
    string word;

    while (cin >> word)
        q.push(word);

    while (q.size())
    {
        cout << q.top() << endl;
        q.pop();
    }
}
Unfortunately, the words are listed in reversed order: because of the underlying <-operator the words appearing later in the ASCII-sequence appear first in the priority queue. A solution to that problem is to define a wrapper class around the string datatype, in which the operator<() has been defined according to our wish, i.e., making sure that the words appearing early in the ASCII-sequence will appear first in the queue. Here is the modified program:
#include <iostream>
#include <string>
#include <queue>

class Text
{
    std::string d_s;

    public:
        Text(std::string const &str)
        :
            d_s(str)
        {}
        operator std::string const &() const
        {
            return d_s;
        }
        bool operator<(Text const &right) const
        {
            return d_s > right.d_s;
        }
};

using namespace std;

int main()
{
    priority_queue<Text> q;
    string word;

    while (cin >> word)
        q.push(word);

    while (q.size())
    {
        word = q.top();
        cout << word << endl;
        q.pop();
    }
}
In the above program the wrapper class defines the operator<() just the other way around than the string class itself, resulting in the preferred ordering. Other possibilities would be to store the contents of the priority queue in, e.g., a vector, from which the elements can be read in reversed order.

The following constructors, operators, and member functions are available for the priority_queue container:

Note that the priority queue does not support iterators or a subscript operator. The only element that can be accessed is its top element. A priority queue can be emptied by:

12.3.5: The `deque' container

The deque (pronounce: `deck') class implements a doubly ended queue data structure (deque). Before deque containers can be used the following preprocessor directive must have been specified:
    #include <deque>
A deque is comparable to a queue, but it allows reading and writing at both ends. Actually, the deque data type supports a lot more functionality than the queue, as will be clear from the following overview of available member functions. A deque is a combination of a vector and two queues, operating at both ends of the vector. In situations where random insertions and the addition and/or removal of elements at one or both sides of the vector occurs frequently, using a deque should be considered.

The following constructors, operators, and member functions are available for deques:

12.3.6: The `map' container

The map class implements a (sorted) associative array. Before map containers can be used, the following preprocessor directive must have been specified:
    #include <map>
A map is filled with key/value pairs, which may be of any container-acceptable type. Since types are associated with both the key and the value, we must specify two types in the angle bracket notation, comparable to the specification we've seen with the pair (section 12.2) container. The first type represents the type of the key, the second type represents the type of the value. For example, a map in which the key is a string and the value is a double can be defined as follows:
    map<string, double> object;
The key is used to access its associated information. That information is called the value. For example, a phone book uses the names of people as the key, and uses the telephone number and maybe other information (e.g., the zip-code, the address, the profession) as the value. Since a map sorts its keys, the key's operator<() must be defined, and it must be sensible to use it. For example, it is generally a bad idea to use pointers for keys, as sorting pointers is something different than sorting the values these pointers point to.

The two fundamental operations on maps are the storage of Key/Value combinations, and the retrieval of values, given their keys. The index operator, using a key as the index, can be used for both. If the index operator is used as lvalue, insertion will be performed. If it is used as rvalue, the key's associated value is retrieved. Each key can be stored only once in a map. If the same key is entered again, the new value replaces the formerly stored value, which is lost.

A specific key/value combination can be implicitly or explicitly inserted into a map. If explicit insertion is required, the key/value combination must be constructed first. For this, every map defines a value_type which may be used to create values that can be stored in the map. For example, a value for a map<string, int> can be constructed as follows:

    map<string, int>::value_type siValue("Hello", 1);
The value_type is associated with the map<string, int>: the type of the key is string, the type of the value is int. Anonymous value_type objects are also often used. E.g.,
    map<string, int>::value_type("Hello", 1);
Instead of using the line map<string, int>::value_type(...) over and over again, a typedef is often used to reduce typing and to improve legibility:
    typedef map<string, int>::value_type StringIntValue
Using this typedef, values for the map<string, int> may now be constructed using:
    StringIntValue("Hello", 1);
Finally, pairs may be used to represent key/value combinations used by maps:
    pair<string, int>("Hello", 1);

The following constructors, operators, and member functions are available for the map container:

As mentioned at the beginning of this section, the map represents a sorted associative array. In a map the keys are sorted. If an application must visit all elements in a map (or just the keys or the values) the begin() and end() iterators must be used. The following example shows how to make a simple table listing all keys and values in a map:
    #include <iostream>
    #include <iomanip>
    #include <map>

    using namespace std;

    int main()
    {
        pair<string, int>
            pa[] =
            {
                pair<string,int>("one", 10),
                pair<string,int>("two", 20),
                pair<string,int>("three", 30),
            };
        map<string, int>
            object(&pa[0], &pa[3]);

        for
        (
            map<string, int>::iterator it = object.begin();
                it != object.end();
                    ++it
        )
            cout << setw(5) << it->first.c_str() <<
                    setw(5) << it->second << endl;
    }
    /*
        Generated output:
      one   10
    three   30
      two   20
    */

12.3.7: The `multimap' container

Like the map, the multimap class implements a (sorted) associative array. Before multimap containers can be used the following preprocessor directive must have been specified:
    #include <map>
The main difference between the map and the multimap is that the multimap supports multiple values associated with the same key, whereas the map contains single-valued keys. Note that the multimap also accepts multiple identical values associated with identical keys.

The map and the multimap have the same set of member functions, with the exception of the index operator ( operator[]()), which is not supported with the multimap. This is understandable: if multiple entries of the same key are allowed, which of the possible values should be returned for object[key]?

Refer to section 12.3.6 for an overview of the multimap member functions. Some member functions, however, deserve additional attention when used in the context of the multimap container. These members are discussed below.

Although the functions lower_bound() and upper_bound() act identically in the map and multimap containers, their operation in a multimap deserves some additional attention. The next example illustrates multimap::lower_bound(), multimap::upper_bound() and multimap::equal_range applied to a multimap:
    #include <iostream>
    #include <map>
    using namespace std;

    int main()
    {
        pair<string, int> pa[] =
        {
            pair<string,int>("alpha", 1),
            pair<string,int>("bravo", 2),
            pair<string,int>("charley", 3),
            pair<string,int>("bravo", 6),   // unordered `bravo' values
            pair<string,int>("delta", 5),
            pair<string,int>("bravo", 4),
        };
        multimap<string, int> object(&pa[0], &pa[6]);

        typedef multimap<string, int>::iterator msiIterator;

        msiIterator it = object.lower_bound("brava");

        cout << "Lower bound for `brava': " <<
                it->first << ", " << it->second << endl;

        it = object.upper_bound("bravu");

        cout << "Upper bound for `bravu': " <<
                it->first << ", " << it->second << endl;

        pair<msiIterator, msiIterator>
            itPair = object.equal_range("bravo");

        cout << "Equal range for `bravo':\n";
        for (it = itPair.first; it != itPair.second; ++it)
            cout << it->first << ", " << it->second << endl;
        cout << "Upper bound: " << it->first << ", " << it->second << endl;

        cout << "Equal range for `brav':\n";
        itPair = object.equal_range("brav");
        for (it = itPair.first; it != itPair.second; ++it)
            cout << it->first << ", " << it->second << endl;
        cout << "Upper bound: " << it->first << ", " << it->second << endl;
    }
    /*
        Generated output:

        Lower bound for `brava': bravo, 2
        Upper bound for `bravu': charley, 3
        Equal range for `bravo':
        bravo, 2
        bravo, 6
        bravo, 4
        Upper bound: charley, 3
        Equal range for `brav':
        Upper bound: bravo, 2
    */
In particular note the following characteristics:

12.3.8: The `set' container

The set class implements a sorted collection of values. Before set containers can be used the following preprocessor directive must have been specified:
    #include <set>
A set is filled with values, which may be of any container-acceptable type. Each value can be stored only once in a set.

A specific value to be inserted into a set can be explicitly created: Every set defines a value_type which may be used to create values that can be stored in the set. For example, a value for a set<string> can be constructed as follows:

    set<string>::value_type setValue("Hello");
The value_type is associated with the set<string>. Anonymous value_type objects are also often used. E.g.,
    set<string>::value_type("Hello");
Instead of using the line set<string>::value_type(...) over and over again, a typedef is often used to reduce typing and to improve legibility:
    typedef set<string>::value_type StringSetValue
Using this typedef, values for the set<string> may be constructed as follows:
    StringSetValue("Hello");
Alternatively, values of the set's type may be used immediately. In that case the value of type Type is implicitly converted to a set<Type>::value_type.

The following constructors, operators, and member functions are available for the set container:

12.3.9: The `multiset' container

Like the set, the multiset class implements a sorted collection of values. Before multiset containers can be used the following preprocessor directive must have been specified:
    #include <set>
The main difference between the set and the multiset is that the multiset supports multiple entries of the same value, whereas the set contains unique values.

The set and the multiset have the same set of member functions. Refer to section 12.3.8 for an overview of the multiset member functions. Some member functions, however, deserve additional attention when used in the context of the multiset container. These members are discussed below.

Although the functions lower_bound() and upper_bound() act identically in the set and multiset containers, their operation in a multiset deserves some additional attention. In particular note that with the multiset container lower_bound() and upper_bound() produce the same result for non-existing keys: they both return the first element having a key exceeding the provided key.

Here is an example showing the use of various member functions of a multiset:

    #include <iostream>
    #include <set>

    using namespace std;

    int main()
    {
        string
            sa[] =
            {
                "alpha",
                "echo",
                "hotel",
                "mike",
                "romeo"
            };

        multiset<string>
            object(&sa[0], &sa[5]);

        object.insert("echo");
        object.insert("echo");

        multiset<string>::iterator
            it = object.find("echo");

        for (; it != object.end(); ++it)
            cout << *it << " ";
        cout << endl;

        cout << "Multiset::equal_range(\"ech\")\n";
        pair
        <
            multiset<string>::iterator,
            multiset<string>::iterator
        >
            itpair = object.equal_range("ech");

        if (itpair.first != object.end())
            cout << "lower_bound() points at " << *itpair.first << endl;
        for (; itpair.first != itpair.second; ++itpair.first)
            cout << *itpair.first << " ";

        cout << endl <<
                object.count("ech") << " occurrences of 'ech'" << endl;

        cout << "Multiset::equal_range(\"echo\")\n";
        itpair = object.equal_range("echo");

        for (; itpair.first != itpair.second; ++itpair.first)
            cout << *itpair.first << " ";

        cout << endl <<
                object.count("echo") << " occurrences of 'echo'" << endl;

        cout << "Multiset::equal_range(\"echoo\")\n";
        itpair = object.equal_range("echoo");

        for (; itpair.first != itpair.second; ++itpair.first)
            cout << *itpair.first << " ";

        cout << endl <<
                object.count("echoo") << " occurrences of 'echoo'" << endl;
    }
    /*
        Generated output:

        echo echo echo hotel mike romeo
        Multiset::equal_range("ech")
        lower_bound() points at echo

        0 occurrences of 'ech'
        Multiset::equal_range("echo")
        echo echo echo
        3 occurrences of 'echo'
        Multiset::equal_range("echoo")

        0 occurrences of 'echoo'
    */

12.3.10: The `stack' container

The stack class implements a stack data structure. Before stack containers can be used the following preprocessor directive must have been specified:
    #include <stack>
A stack is also called a first in, last out ( FILO or LIFO) data structure, as the first item to enter the stack is the last item to leave. A stack is an extremely useful data structure in situations where data must temporarily remain available. For example, programs maintain a stack to store local variables of functions: the lifetime of these variables is determined by the time these functions are active, contrary to global (or static local) variables, which live for as long as the program itself lives. Another example is found in calculators using the Reverse Polish Notation ( RPN), in which the operands of operators are entered in the stack, whereas operators pop their operands off the stack and push the results of their work back onto the stack.

As an example of the use of a stack, consider figure 11, in which the contents of the stack is shown while the expression (3 + 4) * 2 is evaluated. In the RPN this expression becomes 3 4 + 2 *, and figure 11 shows the stack contents after each token (i.e., the operands and the operators) is read from the input. Notice that each operand is indeed pushed on the stack, while each operator changes the contents of the stack.

Figure 11 is shown here.
Figure 11: The contents of a stack while evaluating 3 4 + 2 *


The expression is evaluated in five steps. The caret between the tokens in the expressions shown on the first line of figure 11 shows what token has just been read. The next line shows the actual stack-contents, and the final line shows the steps for referential purposes. Note that at step 2, two numbers have been pushed on the stack. The first number (3) is now at the bottom of the stack. Next, in step 3, the + operator is read. The operator pops two operands (so that the stack is empty at that moment), calculates their sum, and pushes the resulting value (7) on the stack. Then, in step 4, the number 2 is read, which is dutifully pushed on the stack again. Finally, in step 5 the final operator * is read, which pops the values 2 and 7 from the stack, computes their product, and pushes the result back on the stack. This result (14) could then be popped to be displayed on some medium.

From figure 11 we see that a stack has one point (the top) where items can be pushed onto and popped off the stack. This top element is the stack's only immediately visible element. It may be accessed and modified directly.

Bearing this model of the stack in mind, let's see what we can formally do with it, using the stack container. For the stack, the following constructors, operators, and member functions are available:

Note that the stack does not support iterators or a subscript operator. The only elements that can be accessed is its top element. A stack can be emptied by:

12.3.11: The `hash_map' and other hashing-based containers

The map is a sorted data structure. The keys in maps are sorted using the operator<() of the key's data type. Generally, this is not the fastest way to either store or retrieve data. The main benefit of sorting is that a listing of sorted keys appeals more to humans than an unsorted list. However, a by far faster method to store and retrieve data is to use hashing.

Hashing uses a function (called the hash function) to compute an (unsigned) number from the key, which number is thereupon used as an index in the table in which the keys are stored. Retrieval of a key is as simple as computing the hash value of the provided key, and looking in the table at the computed index location: if the key is present, it is stored in the table, and its value can be returned. If it's not present, the key is not stored.

Collisions occur when a computed index position is already occupied by another element. For these situations the abstract containers have solutions available, but that topic is beyond the subject of this chapter.

The Gnu g++ compiler supports the hash_(multi)map and hash_(multi)set containers. Below the hash_map container is discussed. Other containers using hashing ( hash_multimap, hash_set and hash_multiset) operate correspondingly.

Concentrating on the hash_map, its constructor needs a key type, a value type, an object creating a hash value for the key, and an object comparing two keys for equality. Hash functions are available for char const * keys, and for all the scalar numerical types char, short, int etc.. If another data type is used, a hash function and an equality test must be implemented, possibly using function objects (see section 9.10). For both situations examples are given below.

The class implementing the hash function could be called hash. Its function call operator ( operator()()) returns the hash value of the key that is passed as its argument.

A generic algorithm (see chapter 17) exists for the test of equality (i.e., equal_to()), which can be used if the key's data type supports the equality operator. Alternatively, a specialized function object could be constructed here, supporting the equality test of two keys. Again, both situations are illustrated below.

The hash_map class implements an associative array in which the key is stored according to some hashing scheme. Before hash_map containers can be used the following preprocessor directive must have been specified:

    #include <ext/hash_map>
The hash_(multi)map is not yet part of the ANSI/ISO standard. Once this container becomes part of the standard, it is likely that the ext/ prefix in the #include preprocessor directive can be removed. Note that starting with the Gnu g++ compiler version 3.2 the __gnu_cxx namespace is used for symbols defined in the ext/ header files. See also section 2.1.

Constructors, operators and member functions available for the map are also available for the hash_map. The map and hash_map support the same set of operators and member functions. However, the efficiency of a hash_map in terms of speed should greatly exceed the efficiency of the map. Comparable conclusions may be drawn for the hash_set, hash_multimap and the hash_multiset.

Compared to the map container, the hash_map has an additional constructor:

        hash_map<...> hash(n);
where n is a size_t value, may be used to construct a hash_map consisting of an initial number of at least n empty slots to put key/value combinations in. This number is automatically extended when needed.

The hashed key type is almost always text. So, a hash_map in which the key's data type is either char const * or a string occurs most often. If the following header file is installed in the C++ compiler's INCLUDE path as the file hashclasses.h, sources may specify the following preprocessor directive to make a set of classes available that can be used to instantiate a hash table

    #include <hashclasses.h>
Otherwise, sources must specify the following preprocessor directive:
    #include <ext/hash_map>
#ifndef _INCLUDED_HASHCLASSES_H_
#define _INCLUDED_HASHCLASSES_H_

#include <string>
#include <cctype>

/*
    Note that with the Gnu g++ compiler 3.2 (and beyond?) the ext/ header
    uses the __gnu_cxx namespace for symbols defined in these header files.

    When using compilers before version 3.2, do:
        #define __gnu_cxx   std
    before including this file to circumvent problems that may occur
    because of these namespace conventions which were not yet used in versions
    before 3.2.

*/

#include <ext/hash_map>
#include <algorithm>

/*
    This file is copyright (c) GPL, 2001-2004
    ==========================================
    august 2004: redundant include guards removed

    october 2002:   provisions for using the hashclasses with the g++ 3.2
                compiler were incorporated.

    april 2002: namespace FBB introduced
                abbreviated class templates defined,
                see the END of this comment section for examples of how
                to use these abbreviations.

    jan 2002:   redundant include guards added,
                required header files adapted,
                for_each() rather than transform() used

    With hash_maps using char const * for the keys:
                         ============

    * Use `HashCharPtr' as 3rd template argument for case-sensitive keys
    * Use `HashCaseCharPtr' as 3rd template argument for case-insensitive
      keys

    * Use `EqualCharPtr' as 4th template argument for case-sensitive keys
    * Use `EqualCaseCharPtr' as 4th template argument for case-insensitive
      keys


    With hash_maps using std::string for the keys:
                         ===========

    * Use `HashString' as 3rd template argument for case-sensitive keys
    * Use `HashCaseString' as 3rd template argument for case-insensitive keys

    * OMIT the 4th template argument for case-sensitive keys
    * Use `EqualCaseString' as 4th template argument for case-insensitive
        keys


    Examples, using int as the value type. Any other type can be used instead
              for the value type:

                                    // key is char const *, case sensitive
        __gnu_cxx::hash_map<char const *, int, FBB::HashCharPtr,
                            FBB::EqualCharPtr >
            hashtab;

                                    // key is char const *, case insensitive
        __gnu_cxx::hash_map<char const *, int, FBB::HashCaseCharPtr,
                                         FBB::EqualCaseCharPtr >
            hashtab;

                                    // key is std::string, case sensitive
        __gnu_cxx::hash_map<std::string, int, FBB::HashString>
            hashtab;

                                    // key is std::string, case insensitive
        __gnu_cxx::hash_map<std::string, int, FBB::HashCaseString,
                                        FBB::EqualCaseString>
            hashtab;

    Instead of the above full typedeclarations, the following shortcuts should
    work as well:

        FBB::CharPtrHash<int>       // key is char const *, case sensitive
            hashtab;

        FBB::CharCasePtrHash<int>   // key is char const *, case insensitive
            hashtab;

        FBB::StringHash<int>        // key is std::string, case sensitive
            hashtab;

        FBB::StringCaseHash<int>    // key is std::string, case insensitive
            hashtab;

    With these template types iterators and other map-members are also
    available. E.g.,

    --------------------------------------------------------------------------
    extern FBB::StringHash<int> dh;

    for (FBB::StringHash<int>::iterator it = dh.begin(); it != dh.end(); it++)
        std::cout << it->first << " - " << it->second << std::endl;
    --------------------------------------------------------------------------

    Feb. 2001 - April 2002
    Frank B. Brokken (f.b.brokken@rug.nl)
*/

namespace FBB
{

    class HashCharPtr
    {
        public:
            size_t operator()(char const *str) const
            {
                return __gnu_cxx::hash<char const *>()(str);
            }
    };

    class EqualCharPtr
    {
        public:
            bool operator()(char const *x, char const *y) const
            {
                return !strcmp(x, y);
            }
    };

    class HashCaseCharPtr
    {
        public:
            size_t operator()(char const *str) const
            {
                std::string s = str;
                for_each(s.begin(), s.end(), *this);
                return __gnu_cxx::hash<char const *>()(s.c_str());
            }
            void operator()(char &c) const
            {
                c = tolower(c);
            }
    };

    class EqualCaseCharPtr
    {
        public:
            bool operator()(char const *x, char const *y) const
            {
                return !strcasecmp(x, y);
            }
    };

    class HashString
    {
        public:
            size_t operator()(std::string const &str) const
            {
                return __gnu_cxx::hash<char const *>()(str.c_str());
            }
    };

    class HashCaseString: public HashCaseCharPtr
    {
        public:
            size_t operator()(std::string const &str) const
            {
                return HashCaseCharPtr::operator()(str.c_str());
            }
    };

    class EqualCaseString
    {
        public:
            bool operator()(std::string const &s1, std::string const &s2) const
            {
                return !strcasecmp(s1.c_str(), s2.c_str());
            }
    };


    template<typename Value>
    class CharPtrHash: public
        __gnu_cxx::hash_map<char const *, Value, HashCharPtr, EqualCharPtr >
    {
        public:
            CharPtrHash()
            {}
            template <typename InputIterator>
            CharPtrHash(InputIterator first, InputIterator beyond)
            :
                __gnu_cxx::hash_map<char const *, Value, HashCharPtr,
                                    EqualCharPtr>(first, beyond)
            {}
    };

    template<typename Value>
    class CharCasePtrHash: public
        __gnu_cxx::hash_map<char const *, Value, HashCaseCharPtr,
                                                 EqualCaseCharPtr >
    {
        public:
            CharCasePtrHash()
            {}
            template <typename InputIterator>
            CharCasePtrHash(InputIterator first, InputIterator beyond)
            :
                __gnu_cxx::hash_map<char const *, Value,
                            HashCaseCharPtr, EqualCaseCharPtr>
                            (first, beyond)
            {}
    };

    template<typename Value>
    class StringHash: public __gnu_cxx::hash_map<std::string, Value,
                                                 HashString>
    {
        public:
            StringHash()
            {}
            template <typename InputIterator>
            StringHash(InputIterator first, InputIterator beyond)
            :
                __gnu_cxx::hash_map<std::string, Value, HashString>
                             (first, beyond)
            {}
    };


    template<typename Value>
    class StringCaseHash: public
            __gnu_cxx::hash_map<std::string, int, HashCaseString,
                                EqualCaseString>
    {
        public:
            StringCaseHash()
            {}
            template <typename InputIterator>
            StringCaseHash(InputIterator first, InputIterator beyond)
            :
                __gnu_cxx::hash_map<std::string,
                            int, HashCaseString,
                            EqualCaseString>(first, beyond)
            {}
    };

    template<typename Key, typename Value>
    class Hash: public
            __gnu_cxx::hash_map<Key, Value,
                        __gnu_cxx::hash<Key>(),
                        equal<Key>())
    {};

}
#endif
The following program defines a hash_map containing the names of the months of the year and the number of days these months (usually) have. Then, using the subscript operator the days in several months are displayed. The equality operator used the generic algorithm equal_to<string>, which is the default fourth argument of the hash_map constructor:
    #include <iostream>
        // the following header file must be available in the compiler's
        // INCLUDE path:
    #include <hashclasses.h>
    using namespace std;
    using namespace FBB;

    int main()
    {
        __gnu_cxx::hash_map<string, int, HashString > months;
        // Alternatively, using the classes defined in hashclasses.h,
        // the following definitions could have been used:
        //      CharPtrHash<int> months;
        // or:
        //      StringHash<int> months;

        months["january"] = 31;
        months["february"] = 28;
        months["march"] = 31;
        months["april"] = 30;
        months["may"] = 31;
        months["june"] = 30;
        months["july"] = 31;
        months["august"] = 31;
        months["september"] = 30;
        months["october"] = 31;
        months["november"] = 30;
        months["december"] = 31;

        cout << "september -> " << months["september"] << endl <<
                "april     -> " << months["april"] << endl <<
                "june      -> " << months["june"] << endl <<
                "november  -> " << months["november"] << endl;
    }
    /*
        Generated output:
    september -> 30
    april     -> 30
    june      -> 30
    november  -> 30
    */

The hash_multimap, hash_set and hash_multiset containers are used analogously. For these containers the equal and hash classes must also be defined. The hash_multimap also requires the hash_map header file.

Before the hash_set and hash_multiset containers can be used the following preprocessor directive must have been specified:

    #include <ext/hash_set>

12.4: The `complex' container

The complex container is a specialized container in that it defines operations that can be performed on complex numbers, given possible numerical real and imaginary data types.

Before complex containers can be used the following preprocessor directive must have been specified:

    #include <complex>
The complex container can be used to define complex numbers, consisting of two parts, representing the real and imaginary parts of a complex number.

While initializing (or assigning) a complex variable, the imaginary part may be left out of the initialization or assignment, in which case this part is 0 (zero). By default, both parts are zero.

When complex numbers are defined, the type definition requires the specification of the datatype of the real and imaginary parts. E.g.,

    complex<double>
    complex<int>
    complex<float>
Note that the real and imaginary parts of complex numbers have the same datatypes.

Below it is silently assumed that the used complex type is complex<double>. Given this assumption, complex numbers may be initialized as follows:

Anonymous complex values may also be used. In the following example two anonymous complex values are pushed on a stack of complex numbers, to be popped again thereafter:
    #include <iostream>
    #include <complex>
    #include <stack>

    using namespace std;

    int main()
    {
        stack<complex<double> >
            cstack;

        cstack.push(complex<double>(3.14, 2.71));
        cstack.push(complex<double>(-3.14, -2.71));

        while (cstack.size())
        {
            cout << cstack.top().real() << ", " <<
                    cstack.top().imag() << "i" << endl;
            cstack.pop();
        }
    }
    /*
        Generated output:
    -3.14, -2.71i
    3.14, 2.71i
    */
Note the required extra blank space between the two closing pointed arrows in the type specification of cstack.

The following member functions and operators are defined for complex numbers (below, value may be either a primitve scalar type or a complex object):