Software Design Using C++

STL: Strings

Introduction

It is important to understand that there are several systems for setting up and handling strings. They are different and not completely compatible, so you must always be clear about what type of string you are using. Here are three commonly used systems for strings:

C-style strings (sometimes called C-strings, though that name should not be confused with the third type of string below): character arrays with a null character ('\0') marking the end of the string.
Class string: the one discussed here.
Microsoft's CString class: look for CString under help in Visual C++.

Why use one or the other? C-style strings are rather simple to understand. However, if functions like strcpy, strncpy, and strcat are not used carefully, they can lead to security flaws, especially buffer overflows. The string class provides more functionality and manages its own storage, which protects against many buffer overflows, so it is to be preferred when using strings (especially when copying strings) in main memory. The CString object-oriented approach also provides more functionality and may be required in writing certain types of Windows software. C-style strings are useful when you want to write string data to a file or read string data from a file and need to know exactly how many characters you have. As the example below shows in part, it is possible to convert between these styles of strings. However, it is best to keep such conversions to a minimum for the sake of efficiency.

The string class

The string class is a container class within the STL. It provides an object-oriented and safer alternative to the C-style character arrays that are often used for strings. It also provides the [] notation (as in array subscripting) to access individual characters of a string object. However, the [] notation does not give bounds checking and so does not prevent you from going off the end of the string object. The string class also provides the at() method for accessing individual characters in a string object. This method does provide bounds checking (it throws an out_of_range exception when the index is not valid) and thus is safer. As a short example of how to output character n of a string object str, you could use cout << str[n]; without bounds checking or use cout << str.at(n); with bounds checking.
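The difference between [] and at() can be sketched as follows. This is only an illustration, not part of string.cpp: the helper name CharAtOrDefault and its fallback parameter are made up for this sketch, and names are qualified with std:: so that it does not depend on a using directive.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (not from string.cpp): returns character n of Str,
// or Fallback when n is not a valid index.
char CharAtOrDefault(const std::string& Str, std::string::size_type n, char Fallback)
{
    try
    {
        // at() checks the index and throws std::out_of_range when n >= Str.length().
        return Str.at(n);
    }
    catch (const std::out_of_range&)
    {
        // With Str[n] there would be no exception to catch here; an index past
        // the end of the string is simply undefined behavior.
        return Fallback;
    }
}
```

For instance, CharAtOrDefault("madam", 2, '?') gives the character d, while an index of 10 gives the fallback character instead of running off the end of the string.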

Example

string.cpp example

As you can see in this example, to use the string class you need to include the string header file. The first line of the main function shows two of the ways in which to initialize a string:


string MsgA, MsgB("Hello, world!");

MsgA is initialized to the empty string, while MsgB is initialized using the literal C-style string "Hello, world!". A string object can be printed using the usual << operator as follows:


cout << MsgB;

As the example shows, the = assignment operator can be used to assign one string to another. This works even if the string on the right-hand side is a C-style string, not a string object. The + operator can be used to concatenate two strings, where one of the two operands can even be a C-style string or a single character. The ability to use these operators makes the string class quite convenient. You can also use the [] array subscript notation on a string object to get the character at that position. This is used in the example to print out the first five letters of a string.
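As a rough sketch of these operators in use (this is not the code from string.cpp; the function names BuildGreeting and FirstChars are hypothetical and chosen only for illustration):

```cpp
#include <string>

// Hypothetical helper: uses = and + with a string object, a C-style string,
// and a single character.
std::string BuildGreeting()
{
    std::string First, Second("world");
    First = "Hello";                                   // assign a C-style string to a string object
    std::string Result = First + ", " + Second + '!';  // + accepts a C-style string or a char
    return Result;
}

// Hypothetical helper: collects the first n characters of Str using the []
// notation, stopping early if Str is shorter than n.
std::string FirstChars(const std::string& Str, std::string::size_type n)
{
    std::string Result;
    for (std::string::size_type k = 0; k < n && k < Str.length(); k++)
        Result = Result + Str[k];   // Str[k] is a single char; no bounds check is done
    return Result;
}
```

Here FirstChars(BuildGreeting(), 5) would produce the string "Hello". Note that the loop condition itself keeps k in range, since [] will not do so.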

The example ends by showing how to convert between string objects and character array strings. The conversion from a string object to a character array is the more difficult: you have to copy the data into the character array from a source pointer that you get by applying the c_str function to your string object. The character array must be large enough to hold all of the characters plus the terminating null character. The assignment operator can be used to copy a character array string into a string object, so conversion in that direction is easy.
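One possible way to write both conversions is sketched below. The names ToCharArray and FromCharArray are assumptions made for this illustration, and the explicit size check is an added precaution that string.cpp may or may not include.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: copies Source into the character array Dest, which has
// room for DestSize characters. Returns false, leaving Dest untouched, when
// the text plus its terminating null character would not fit.
bool ToCharArray(const std::string& Source, char Dest[], std::size_t DestSize)
{
    if (Source.length() + 1 > DestSize)
        return false;
    std::strcpy(Dest, Source.c_str());   // c_str() gives a pointer to null-terminated data
    return true;
}

// Hypothetical helper: the other direction only needs the assignment operator.
std::string FromCharArray(const char Source[])
{
    std::string Result;
    Result = Source;
    return Result;
}
```

Checking the length before calling strcpy is what keeps this conversion from overflowing the array, which is the main risk when moving from a string object back to a C-style string.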

The string class also has additional capabilities that are not covered here. See the references for more information.

Additional Examples

It is useful to compare how to program with objects of type string and how to program with C-style strings. The first of the two examples in this section uses objects of type string, while the second example solves the exact same problem using the same algorithm but with C-style strings. The two programs look very much alike. One key difference occurs when you check the length of a string. With string objects we do something like Response.length(), while with C-style strings we instead use strlen(Response). With C-style strings we also use a typedef to create our own string type and have to decide on a maximum length for the strings. This is not the case with the string class.
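The length difference might look like this in code. The constant MaxString, the typedef name StringType, and the two function names are assumptions made for this sketch; the actual example programs may use different names and a different maximum length.

```cpp
#include <cstring>
#include <string>

const int MaxString = 80;             // assumed maximum, counting the terminating null
typedef char StringType[MaxString];   // assumed typedef name for C-style strings

// With a string object, the length comes from a member function.
int LengthOfObject(const std::string& Response)
{
    return static_cast<int>(Response.length());
}

// With a C-style string, strlen counts characters up to the terminating null.
int LengthOfCString(const StringType Response)
{
    return static_cast<int>(std::strlen(Response));
}
```

Both functions report 5 for "madam". The C-style version can only hold strings of up to MaxString - 1 characters, while the string object grows as needed.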

The overall idea in these two programs is to allow the user to repeatedly enter a string and have it checked to see if it is or is not a palindrome. A palindrome is a string that reads the same forwards and backwards, such as "madam". Some people allow spaces, which are ignored when deciding if a string is a palindrome, so that "race car" is a palindrome, but these programs do not handle this. Here is a drawing of the "madam" palindrome sitting in a string:

[the madam palindrome]

If you are wondering about the algorithm used in the IsPalindrome function, it works like this: If the length of the string is zero, we have the empty string, which can be considered to be a very trivial palindrome. If the length of the string is one, we have a one-character string, which is trivially a palindrome. For lengths two or greater, we distinguish between whether the length is odd or even.

If the length is odd, such as 5 for a word like "madam", we find the midpoint as (5 - 1) / 2 = 2, the index of the letter d in "madam". Since the middle letter matches with itself, there is nothing to check for it. We then use a for loop to compare the characters at indices 0 and 4 (the first and last characters), then the characters at indices 1 and 3. If both pairs have matches, then we have a palindrome. If we ever fail to have a match, as with "modem", then the word is not a palindrome. Notice that the loop stops after we check the character at the index given by our midpoint minus one.

The case where the length is even is similar, but there is no character in the middle. Consider the example "deed" of length 4. We find the midpoint as (4 - 1) / 2 = 1, due to the integer division. Our for loop then checks to see if the characters at indices 0 and 3 match, then it checks those at indices 1 and 2. Notice that the loop stops after we check the character at the index given by our midpoint.
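A minimal sketch of this algorithm for string objects is given below. It follows the description above but is not necessarily identical to the IsPalindrome function in the example programs; the variable names Mid and Last are assumptions.

```cpp
#include <string>

// Returns true if Str reads the same forwards and backwards.
// Spaces are not ignored, so "race car" is not treated as a palindrome.
bool IsPalindrome(const std::string& Str)
{
    int Length = static_cast<int>(Str.length());

    // The empty string and one-character strings are trivially palindromes.
    if (Length <= 1)
        return true;

    int Mid = (Length - 1) / 2;   // integer division, e.g. (4 - 1) / 2 = 1

    // Odd length: the middle character matches itself, so stop at Mid - 1.
    // Even length: there is no middle character, so stop at Mid.
    int Last = (Length % 2 == 1) ? Mid - 1 : Mid;

    for (int k = 0; k <= Last; k++)
        if (Str[k] != Str[Length - 1 - k])
            return false;         // mismatched pair, as with "modem"

    return true;
}
```

With this sketch, "madam" and "deed" are reported as palindromes, while "modem" fails at indices 1 and 3. The C-style version would differ mainly in taking a character array parameter and using strlen to find the length.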