Chapter 4: The `string' data type

Don't hesitate to send in feedback: send an e-mail if you like the C++ Annotations; if you think that important material was omitted; if you find errors or typos in the text or the code examples; or if you just feel like e-mailing. Send your e-mail to Frank B. Brokken.

Please state the document version you're referring to, as found in the title (in this document: 6.5.0) and please state chapter and paragraph name or number you're referring to.

All received mail is processed conscientiously, and received suggestions for improvements will usually have been processed by the time a new version of the Annotations is released. Except for the incidental case I will normally not acknowledge the receipt of suggestions for improvements. Please don't interpret this as me not appreciating your efforts.

C++ offers a large number of facilities to implement solutions for common problems. Most of these facilities are part of the Standard Template Library or they are implemented as generic algorithms (see chapter 17).

Among the facilities C++ programmers have developed over and over again are those for manipulating chunks of text, commonly called strings. The C programming language offers rudimentary string support: the ASCII-Z terminated series of characters is the foundation on which a large amount of code has been built. (We define an ASCII-Z string as a series of ASCII characters terminated by the ASCII character zero (hence -Z), which has the value zero and should not be confused with the character '0', which usually has the value 0x30.)

Standard C++ now offers a string type. In order to use string-type objects, the header file <string> must be included in sources.

Actually, string objects are class type variables, and the class is formally introduced in chapter 6. However, in order to use a string, it is not necessary to know what a class is. In this section the operators that are available for strings and several other operations are discussed. The operations that can be performed on strings take the form

        stringVariable.operation(argumentList)
For example, if string1 and string2 are variables of type string, then
        string1.compare(string2)
can be used to compare both strings. A function like compare(), which is part of the string class, is called a member function. The string class offers a large number of these member functions, as well as extensions of some well-known operators, like the assignment (=) and the comparison operator (==). These operators and functions are discussed in the following sections.

4.1: Operations on strings

Some of the operations that can be performed on strings return indices within the strings. Whenever such an operation fails to find an appropriate index, the value string::npos is returned. This value is a (symbolic) value of type string::size_type, which is an unsigned integral type (in practice usually the same type as size_t, which may be wider than an unsigned int).

Note that in all operations with strings, string objects as well as char const * values and variables can be used.

Some string-members use iterators. Iterators will be covered in section 17.2. The member functions using iterators are listed in the next section (4.2); they are not further illustrated below.

The following operations can be performed on strings:

4.2: Overview of operations on strings

In this section the available operations on strings are summarized. There are four subparts here: the string-initializers, the string-iterators, the string-operators and the string-member functions.

The member functions are ordered alphabetically by the name of the operation. Below, object is a string-object, and argument is either a string const & or a char const *, unless overloaded versions tailored to string and char const * parameters are explicitly mentioned. Object is used in cases where a string object is initialized or given a new value. The entity referred to by argument always remains unchanged.

Furthermore, opos indicates an offset into the object string, apos indicates an offset into the argument string. Analogously, on indicates a number of characters in the object string, and an indicates a number of characters in the argument string. Both opos and apos must refer to existing offsets, or an exception (std::out_of_range) will be thrown. In contrast to this, an and on may exceed the number of available characters, in which case only the available characters will be considered.

When streams are involved, istr indicates a stream from which information is extracted, ostr indicates a stream into which information is inserted.

With member functions the types of the parameters are given in a function-prototypical way. With several member functions iterators are used. At this point in the Annotations it's a bit premature to discuss iterators, but for referential purposes they have to be mentioned nevertheless. So, a forward reference is used here: see section 17.2 for a more detailed discussion of iterators. Like apos and opos, iterators must also refer to an existing character, or to an available iterator range of the string to which they refer.

Finally, note that all string-member functions returning indices in object return the predefined constant string::npos if no suitable index could be found.

4.2.1: Initializers

The following string constructors are available:

4.2.2: Iterators

See section 17.2 for details about iterators. As a quick introduction to iterators: an iterator acts like a pointer, and pointers can often be used in situations where iterators are requested. Iterators almost always come in pairs: the begin-iterator points to the first entity that will be considered, the end-iterator points just beyond the last entity that will be considered. Iterators play an important role in the context of generic algorithms (cf. chapter 17).

4.2.3: Operators

The following string operators are available:

4.2.4: Member functions

The string member functions are listed in alphabetical order. The member name, prefixed by the string class, is given first. Then the full prototype and a description are given. Values of the type string::size_type represent index positions within a string. For all practical purposes, these values may be interpreted as unsigned.

The special value string::npos, defined by the string class, represents a non-existing index. This value is returned by all members returning indices when they could not perform their requested tasks. Note that the string's length is not returned as a valid index. E.g., when calling a member `find_first_not_of(" ")' (see below) on a string object holding 10 blank space characters, npos is returned, as the string only contains blanks. The final 0-byte that is used in C to indicate the end of an ASCII-Z string is not considered part of a C++ string, and so the member function will return npos, rather than length().

In the following overview, `size_type' should always be read as `string::size_type'.