Notes N13

Math 10 -- D. C. Smolarski, S.J.
Santa Clara University, Department of Mathematics and Computer Science


Introduction to Arrays of char

Recall that the intrinsic data type char refers to a single character. In other words, if letter has been declared to be of type char, then letter can hold only one character. One can get into trouble by attempting to store more than one character in such a variable.

C/C++ uses arrays of type char to store several characters, that is, to store what are usually called "words" or "strings" of characters. In fact, an array of type char (with certain restrictions) is commonly referred to as a "string."

Strings in C/C++ are usually terminated (in memory, i.e., in an array) with a special character, called the null character, which is '\0' (octal code zero). Thus, one element of any array of char must be left for the null character. In other words, if we declare

	char name[20];
then the array name can hold, at maximum, 19 printable characters in addition to the null character which indicates the end of the printable string. This means it is possible to store fewer than 19 characters in such an array without having to be concerned about introducing blanks at the end.
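As a small sketch of this point, the following function stores a three-letter string in a 20-element array and measures it. The function name name_length_demo and the sample text "Ann" are assumptions made only for illustration.

```cpp
#include <cstring>   // std::strlen

// Hypothetical helper: returns how many characters precede the null character.
std::size_t name_length_demo()
{
    // Elements 0-2 hold 'A', 'n', 'n'; element 3 holds '\0'.
    // The remaining elements are also set to zero by this initialization.
    char name[20] = "Ann";

    // strlen counts characters up to, but not including, the null character.
    return std::strlen(name);
}
```

Calling name_length_demo() gives 3, even though sizeof(name) inside the function would still be 20: the array's capacity and the length of the string stored in it are separate quantities.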

The standard stream input (cin) and output (cout) functions and operators can be used with character arrays with this caution: As with numbers, white space (e.g., a blank or tab) will end the input. Thus, if "John Jones" were in an input file, and the following code were in a program,

	cin >> name ;
(assuming name was declared as above), then name would contain the letters 'J', 'o', 'h', 'n' in locations 0 through 3, the null character in location 4, and unknown data in locations 5 through 19.
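The behavior just described can be tried without an input file. In this sketch an istringstream (an assumption, standing in for cin or a file) supplies the text "John Jones", and the hypothetical function first_word_length reports how much of it was stored.

```cpp
#include <cstring>   // std::strlen
#include <sstream>   // std::istringstream

// Hypothetical helper: reads one "word" with >> and returns its length.
std::size_t first_word_length()
{
    std::istringstream in("John Jones");   // stands in for the input file
    char name[20];

    in >> name;   // stops at the blank: 'J','o','h','n' and then '\0'

    return std::strlen(name);
}
```

The function returns 4, and "Jones" is still waiting in the stream for the next input statement. If a word might be longer than the array, placing setw(20) (from iomanip) in front of name limits the extraction to 19 characters.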

Similarly, when using cout and the output operator with an array of char, if the full array identifier name is given, the string of characters starting with the character in location 0 and going up to the character before the null character will be printed. The null character itself is non-printable and signals the stream output function to ignore it and anything that comes after it in the array.
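To illustrate how the null character cuts off the output, the sketch below places a null character in the middle of an array and then "prints" the array. An ostringstream stands in for cout so the result can be inspected, and both it and the function name printed_text are assumptions for illustration (the return type std::string is used only to hand back the captured output).

```cpp
#include <sstream>   // std::ostringstream
#include <string>    // std::string

// Hypothetical helper: returns exactly what the output operator would print.
std::string printed_text()
{
    char name[20] = "John Jones";
    name[4] = '\0';            // replace the blank with a null character

    std::ostringstream out;    // stands in for cout
    out << name;               // output stops just before location 4

    return out.str();
}
```

The letters of "Jones" are still in locations 5 through 9, but only "John" is printed.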

To read a specific number of characters on a line into a character array (regardless of whether the string contains blanks or not), one can use the stream input member function getline or the member function get (which are both member functions of cin or any other file stream input variable which may be defined). The functions getline and get may take two arguments, the first being the identifier that has been declared as an array of type char and the second being the maximum length of the array. It is permitted to use a third (optional) argument which is the stopping delimiter character, which, by default, is the new line symbol, '\n'. (This stopping symbol works in these functions in a similar way to how it works in the function ignore [cf. Notes N5].)

These functions will then "get" the next designated number of characters on the "line" and store them in the designated array (always ending the "string" with a null).

Thus, for example,

	cin.getline(name,15);
or
	cin.get(name,15);
will read the next 14 characters on the input line (blanks and non-blanks alike) and store them in char array name (locations 0 through 13) and terminate the string with a null character in location 14, for a total of 15 elements.

If there are fewer than the designated number of characters left on the line, get and getline will read in only those characters remaining before the newline character. They terminate the string with a null character immediately after the characters read in. In the previous example, if there are only 5 characters on a line, those characters would be placed in array locations 0 through 4 and the null character would be put into location 5 rather than location 14.
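Both cases can be checked with a short sketch. Here an istringstream stands in for cin, and the function names and sample lines are assumptions chosen for illustration.

```cpp
#include <cstring>   // std::strlen
#include <sstream>   // std::istringstream

// Hypothetical helper: the line holds only 5 characters.
std::size_t short_line_length()
{
    std::istringstream in("Hello\nmore data\n");
    char name[20];

    in.getline(name, 15);       // stops at the newline after "Hello"
    return std::strlen(name);   // null character sits in location 5
}

// Hypothetical helper: the line holds more than 14 characters.
std::size_t long_line_length()
{
    std::istringstream in("abcdefghijklmnopqrstuvwxyz\n");
    char name[20];

    in.get(name, 15);           // stops after 14 characters
    return std::strlen(name);   // null character sits in location 14
}
```

short_line_length() returns 5 and long_line_length() returns 14, matching the locations of the null character described above.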

The official difference between get and getline when used in this context is that getline extracts the terminating delimiter (with the next input starting from after this character), while get leaves the delimiter in the input stream. However, some implementations have other differences. (Note also that get is a very versatile input function and can be used in other contexts with different parameters.)
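One way to see this difference is to look at the next waiting character with peek after each call. The sketch below assumes an istringstream in place of cin, and the two function names are hypothetical.

```cpp
#include <sstream>   // std::istringstream

// Hypothetical helper: what is waiting in the stream after get?
char next_after_get()
{
    std::istringstream in("abc\nxyz\n");
    char word[10];

    in.get(word, 10);                      // reads "abc", leaves '\n' behind
    return static_cast<char>(in.peek());   // look at, but do not remove, the next character
}

// Hypothetical helper: what is waiting in the stream after getline?
char next_after_getline()
{
    std::istringstream in("abc\nxyz\n");
    char word[10];

    in.getline(word, 10);                  // reads "abc" and removes '\n'
    return static_cast<char>(in.peek());
}
```

next_after_get() returns '\n', while next_after_getline() returns 'x'. A second get call issued right after the first would therefore find the newline at once and store no characters.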

To avoid problems, getline should normally be used only when one wants to read everything on a line of data in the input file, and the second value should match the standard length of the line. If there is more data on a line, for example of a numeric type, then get should be used instead.
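A sketch of the situation this advice covers: a line holding a 10-character name field followed by a number. The field width, the sample line, and the function names are assumptions for illustration, and an istringstream again stands in for cin.

```cpp
#include <sstream>   // std::istringstream

// Hypothetical helper: read a fixed-width name with get, then a number.
int quantity_after_name()
{
    std::istringstream in("Widget    42\n");   // 10-character name field, then 42
    char name[11];
    int quantity = 0;

    in.get(name, 11);   // reads the 10 characters "Widget    "
    in >> quantity;     // the number is still available to read

    return quantity;
}

// Hypothetical helper: the same read attempted with getline.
bool getline_fails_on_long_line()
{
    std::istringstream in("Widget    42\n");
    char name[11];

    in.getline(name, 11);   // array fills up before the newline is found
    return in.fail();       // the stream is now in a failed state
}
```

quantity_after_name() returns 42. getline_fails_on_long_line() returns true, because getline marks the stream as failed when it fills the array without reaching the newline, and later input statements would not succeed until the stream is cleared.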

If there is more data on the input line, that data may be read by other cin statements, or it could be ignored by using something such as

	cin.ignore(100,'\n');
Remember that the member functions of cin (an object of the class istream) are also available for use by any file input variable defined as an ifstream variable.
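The following sketch combines ignore with the input functions above. It is written against an istringstream, which here stands in for cin or an ifstream variable; the function name and sample data are hypothetical.

```cpp
#include <cstring>   // std::strlen
#include <sstream>   // std::istringstream

// Hypothetical helper: read a number, skip the rest of its line, read the next line.
std::size_t second_line_length()
{
    std::istringstream in("17 extra stuff\nJohn Jones\n");
    int count = 0;
    char name[20];

    in >> count;             // reads 17
    in.ignore(100, '\n');    // discards " extra stuff" and the newline
    in.getline(name, 20);    // reads the whole second line, blank included

    return std::strlen(name);
}
```

The function returns 10, the length of "John Jones". Without the call to ignore, getline would have stored " extra stuff" from the first line instead.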

String Functions and Operations

Although strings (arrays of char) can be initialized by using an equals sign when declared, a string cannot be reset by using an equals sign later in a program. Thus, given the declaration
	char name[20];
it would be illegal to attempt the following assignment
	name = "Hello";
Such an assignment can only be accomplished by using a library function contained in the library string.h which must be referenced by an #include statement at the head of the program file.

One could reset the value of name by invoking

	strcpy(name,"Hello");
where strcpy abbreviates "string-copy." The second argument can be either a literal string of characters (between a set of double quotes) or another array of char.
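A minimal sketch showing both kinds of second argument follows. The function name copy_demo and the array other are assumptions for illustration; the code includes cstring, the C++ name for the same library as string.h.

```cpp
#include <cstring>   // std::strcpy, std::strlen

// Hypothetical helper: returns true if both copies worked as expected.
bool copy_demo()
{
    char name[20] = "Jones";   // initialization with an equals sign is allowed
    char other[20];

    std::strcpy(name, "Hello");   // second argument is a literal string
    std::strcpy(other, name);     // second argument is another array of char

    // other should now hold 'H','e','l','l','o' followed by '\0'.
    return other[0] == 'H' && std::strlen(other) == 5;
}
```

strcpy does not check sizes, so the first array must be large enough to hold every character of the second string plus the null character.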

One cannot compare the contents of two different string arrays by the standard relational operators used for numbers (such as == and <), either. To compare strings stored in two different arrays (either for equality or for alphabetical order), one makes use of strcmp (string-compare). The function strcmp is an integer function that can be used as a logical function as follows.

Assume string1 and string2 are two strings that one wishes to compare. One tests the two strings using strcmp by invoking

	strcmp(string1, string2)
in an appropriate if statement (or set of statements).

strcmp(string1, string2) will return the value of 0 (which is interpreted by an if statement as false), if string1 and string2 are the same (i.e., the characters are the same from location 0 in both arrays until the null character in both arrays). The function will return a negative value (which is interpreted by an if statement as true), if string1 comes before string2 in alphabetical order, and it will return a positive value (which is also interpreted by an if statement as true), if string1 comes AFTER string2 in alphabetical order.
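Because a result of 0 means "the same" yet counts as false, it is usually clearer to compare the result of strcmp with 0 explicitly. The sketch below does this; the function name order_of and its -1/0/1 return values are assumptions made for illustration.

```cpp
#include <cstring>   // std::strcmp

// Hypothetical helper: returns -1 if first comes before second,
// 0 if the two strings are the same, and 1 if first comes after second.
int order_of(const char first[], const char second[])
{
    int result = std::strcmp(first, second);

    if (result == 0)
        return 0;     // same characters up to the null character
    else if (result < 0)
        return -1;    // first precedes second
    else
        return 1;     // first follows second
}
```

For example, order_of("apple", "banana") gives -1 and order_of("same", "same") gives 0. The comparison uses the numeric character codes, so on systems using ASCII every upper-case letter comes before every lower-case letter, and order_of("Zebra", "apple") also gives -1.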

There are also other functions in string.h which are not commonly used in elementary programs. For more details on such functions, see a standard reference.


This page is maintained by Dennis C. Smolarski, S.J. dsmolarski@math.scu.edu
© Copyright 1997, 1998, 1999 Dennis C. Smolarski, SJ, All rights reserved. Last changed: 25 February 1999.