Chapter 13
Essentials of Fortran-90/95/2003
Character Variables

Math 60 -- D. C. Smolarski, S.J.
Santa Clara University, Department of Mathematics and Computer Science


13.1 Declaring Character Variables

In the earliest versions of Fortran, character information was stored in integer variables. FORTRAN-77 introduced the data type, CHARACTER, as a new standard type for variables. The character type declaration statement is used in a way similar to that of other Fortran declaration statements, that is,
	CHARACTER <variable list>
By default, variables declared to be of data type CHARACTER store a single character. To declare variables that have the capacity to store more than one character, the length is included in one of the following ways (assuming that n is an integer number):
	CHARACTER*n <variable list>
or, starting with Fortran-90,
        CHARACTER (LEN=n) :: <variable list>
or
        CHARACTER (n) :: <variable list>
(The last two forms are the recommended Fortran-90/95/2003 style. Note that LEN= is actually optional within the parentheses.)

In the last three examples, the integer n refers to the number of characters which the programmer wishes to store in one variable. This '*n' designation can also be appended to any variable in the variable list. If '*n' is appended to variables, the local number supersedes any number appended to the word CHARACTER.

For example, suppose we have the following declarations:

	CHARACTER (LEN=20) ::  NAME
	CHARACTER*3 PETE, SAM*5, FRED
	CHARACTER MSG1*2,MSG2*4,MSG3*10,MSG4
These declarations would declare that
NAME could store 20 characters
PETE could store 3 characters
FRED could store 3 characters
SAM could store 5 characters
MSG1 could store 2 characters
MSG2 could store 4 characters
MSG3 could store 10 characters
and MSG4 could store only 1 character.
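It is also permitted to declare a character entity with length "*", which indicates that the length is determined elsewhere: for a named constant (declared with PARAMETER), the length is taken from the constant's value in the declaration; for a dummy argument of a subprogram, the length is taken from the actual argument passed in.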
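
The declared lengths can be checked with the standard intrinsic function LEN. The following is a minimal sketch, not part of the original notes: the program name LENDEMO and the named constant GREETING are made up for illustration, and the other declarations are copied from the example above.

```fortran
PROGRAM LENDEMO
  IMPLICIT NONE
  ! Declarations copied from the example in the text
  CHARACTER (LEN=20) :: NAME
  CHARACTER*3 PETE, SAM*5, FRED
  CHARACTER MSG1*2, MSG2*4, MSG3*10, MSG4
  ! Hypothetical named constant: LEN=* takes the length from the value
  CHARACTER (LEN=*), PARAMETER :: GREETING = 'HELLO'

  ! LEN reports the declared length even before a value is assigned
  WRITE(*,*) LEN(NAME), LEN(PETE), LEN(SAM), LEN(FRED)    ! 20 3 5 3
  WRITE(*,*) LEN(MSG1), LEN(MSG2), LEN(MSG3), LEN(MSG4)   ! 2 4 10 1
  WRITE(*,*) LEN(GREETING)                                ! 5
END PROGRAM LENDEMO
```

The exact spacing of the printed numbers depends on the compiler, since list-directed output is used.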

13.2 Assignments

To assign "constant" values (i.e., a string of characters) to a character variable, one surrounds the string with single quotes (or, starting with Fortran-90, double quotes) in an assignment statement, e.g.,
	PETE = 'SAM'
	SAM = 'PATTY'
One may also read character data from an input file into a character variable of the proper size.

To indicate the empty string, one includes two single or double quotes in a row, e.g.,

        PETE = ''
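
As a small sketch of these quoting rules (the program name QUOTES and the variables W1, W2, W3 are made up for illustration), the following shows both kinds of quotes, an apostrophe inside a string, and the effect of assigning the empty string to a variable of fixed length:

```fortran
PROGRAM QUOTES
  IMPLICIT NONE
  CHARACTER (LEN=5) :: W1, W2, W3
  W1 = 'PATTY'      ! single quotes
  W2 = "DON'T"      ! double quotes allow an apostrophe inside
  W3 = 'DON''T'     ! a doubled apostrophe inside single quotes
  WRITE(*,'(1X,A,1X,A,1X,A)') W1, W2, W3
  W1 = ''           ! empty string: W1 is filled with five blanks
  WRITE(*,*) (W1 == ' ')   ! .TRUE., since blanks compare equal to blanks
END PROGRAM QUOTES
```

Because W1 always holds exactly five characters, assigning the empty string leaves it full of blanks rather than giving it length zero.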

13.3 Input/Output

To read values from outside the program into a character variable or to output a character value using a WRITE statement, one uses the A field descriptor in a FORMAT statement. For example,
	PETE = 'SAM'
	SAM = 'DENIS'
	WRITE(6,10) PETE, SAM
	10 FORMAT(1X,A3,2X,A5)
or
        PROGRAM TEST5
	CHARACTER*5 A,B
	READ(5,101) A,B
	101 FORMAT(2A5)
	WRITE(6,102) B,A
	102 FORMAT(1X,2A5)
	STOP
	END
Suppose the second example had an input of
    column:   1  2  3  4  5  6  7  8  9 10
    input:    N  O  W     I  S     T  H  E
so that A receives 'NOW I' and B receives 'S THE'. The output would be
    F-77:  cc  1  2  3  4  5  6  7  8  9 10
    F-03:   1  2  3  4  5  6  7  8  9 10 11
               S     T  H  E  N  O  W     I
(where cc indicates the pre-Fortran-2003 "carriage control" non-printing character, cf. section 3.7; in Fortran 2003 style the leading blank produced by 1X is printed in column 1).

If one uses free-format (list-directed) input, the character string should be enclosed in single quotes, but quotes should not be used with formatted input. With free-format input, blanks and commas separate one input value from the next, so the READ statement needs a signal, the quote, to indicate where a character value begins and ends. In contrast, when using FORMAT statements, the field descriptor determines the columns in which the data is to be found as well as the type of the data, so such additional signals are not needed (and any quotes present would be read as part of the data).
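
The difference between the two kinds of input can be sketched without an input file by reading from a character variable (an "internal file"). This is only an illustration under stated assumptions: the program name LISTIN and the variable LINE, which stands in for one line of input, are made up for the example.

```fortran
PROGRAM LISTIN
  IMPLICIT NONE
  CHARACTER (LEN=20) :: LINE
  CHARACTER (LEN=6)  :: A
  CHARACTER (LEN=3)  :: B

  ! LINE plays the role of one input record (hypothetical data)
  LINE = "'NOW IS' 'THE'"
  ! Free-format read: the quotes keep the blank inside NOW IS
  READ(LINE,*) A, B
  WRITE(*,'(1X,A6,A1,A3)') A, '|', B

  ! Formatted read: the A descriptors select columns 1-6 and 7-9,
  ! so the data carries no quotes
  LINE = 'NOW ISTHE'
  READ(LINE,'(A6,A3)') A, B
  WRITE(*,'(1X,A6,A1,A3)') A, '|', B
END PROGRAM LISTIN
```

Both WRITE statements print NOW IS|THE (after the leading blank from 1X).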

13.4 Assignments of Different Length

When using character variables and strings of different length in assignment statements, the intuitive rules normally hold, namely:
  1. if a longer string is assigned to a shorter length variable, the (right) end of the string is truncated;
  2. if a shorter string is assigned to a longer length variable, the (right) end of the string is padded with blanks.
For example,
        PROGRAM TEST5
	CHARACTER A*2,B*3,C*4
	B = 'SAM'
	A = B
	C = B
	WRITE(6,103) A,C
	103 FORMAT(1X,A2,2X,A4)
	STOP
	END

    Output:
    F-77:  cc  1  2  3  4  5  6  7  8  9 10
    F-03:   1  2  3  4  5  6  7  8  9 10 11
               S  A        S  A  M
When A gets B's value, the final M in SAM is truncated since A can only store two characters. When C gets the value of B, C is padded with another character, a blank, at the right end, since C can store 4 characters, and B only has 3.

13.5 Concatenation Operator

Fortran-90/95/2003 provides a way of joining two strings together by means of what is called the concatenation operator. This operator is indicated by two slash signs together, //. This operation unites two separate strings into one. For example,
        PROGRAM TEST7
	CHARACTER NAME1*3,NAME2*4,NAME3*7
	NAME1 = 'PAT'
	NAME2 = 'RICK'
	NAME3 = NAME1//NAME2
	WRITE(6,23) NAME3
	23 FORMAT(1X,A7)
	STOP
	END
would have the following output
    F-77:  cc  1  2  3  4  5  6  7  8  9 10
    F-03:   1  2  3  4  5  6  7  8  9 10 11
               P  A  T  R  I  C  K
Concatenation's precedence is below that of the arithmetic operators and above that of the relational and logical operators.
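
One consequence of the padding rule of section 13.4 is that trailing blanks take part in concatenation. The following sketch illustrates this; the program name CATPAD and the variables FIRST, LAST and FULL are made up for the example, and TRIM is the standard Fortran-90 intrinsic function that removes trailing blanks.

```fortran
PROGRAM CATPAD
  IMPLICIT NONE
  CHARACTER (LEN=5) :: FIRST
  CHARACTER (LEN=4) :: LAST
  CHARACTER (LEN=9) :: FULL
  FIRST = 'PAT'               ! stored as 'PAT  ' (two trailing blanks)
  LAST  = 'RICK'
  FULL  = FIRST//LAST         ! the blanks are kept: 'PAT  RICK'
  WRITE(*,'(1X,A)') FULL
  FULL  = TRIM(FIRST)//LAST   ! blanks removed first: 'PATRICK  '
  WRITE(*,'(1X,A)') FULL
END PROGRAM CATPAD
```

In TEST7 above the problem does not arise because NAME1 is declared with exactly the length of 'PAT'.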

13.6 Other Character Operations: Functions For Character Data

Fortran-90/95/2003 also provides other ways of working with character data. The following functions can be useful.
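
As a sketch, a few standard intrinsic functions for character data can be exercised as follows. The selection shown here (LEN, LEN_TRIM, INDEX, TRIM) is an assumption and may not match the functions these notes had in mind; the program name CHFUNC and the variable S are made up for the example.

```fortran
PROGRAM CHFUNC
  IMPLICIT NONE
  CHARACTER (LEN=12) :: S
  S = 'SANTA CLARA'              ! 11 characters plus one trailing blank
  WRITE(*,*) LEN(S)              ! declared length: 12
  WRITE(*,*) LEN_TRIM(S)         ! length without trailing blanks: 11
  WRITE(*,*) INDEX(S, 'CLARA')   ! starting position of 'CLARA': 7
  WRITE(*,*) '[' // TRIM(S) // ']'   ! TRIM drops the trailing blank
END PROGRAM CHFUNC
```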

13.7 Comparing Character Data

It is possible to compare two character strings, since the comparison is made based on the numeric code of the characters, and the code follows alphabetical order. Thus, if one thinks of alphabetical order as a different version of numeric order, then the logical relational operators can be used with the expected results. For example, 'A' .LE. 'C' is .TRUE. and 'Z' .GT. 'P' is also .TRUE. since 'less than' and 'greater than' in the alphabetical context refer to being earlier or later in the normally ordered list (this is similar to the meaning of these operators with numbers).

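In addition, Fortran-90/95/2003 also provides four character comparison functions, LGE, LGT, LLE and LLT, that perform the same comparisons (as the logical relational operators) and return logical values as outputs. For example, LGE(A,B) is .TRUE. if the string A is equal to the string B or follows it in the ASCII ordering, and .FALSE. otherwise.

The last three functions (LGT for "greater than", LLE for "less than or equal to", LLT for "less than") are used in the same way as LGE with similar output values.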

One must be careful that one does not confuse the 'human' meaning and 'human' implied order of character strings with the actual computer (alphabetical) order. Take the following example:

	CHARACTER*3 A,B,C
	A = 'TWO'
	B = 'TEN'
	C = 'SIX'
Then, we have that
A .GE. B is .TRUE.
and
C .LE. B is .TRUE.
Even though 2 is less than 10, nevertheless, as far as the dictionary is concerned, the word 'two' comes after the word 'ten' and thus 'two' > 'ten'.

Different implementations use different coding systems, so even though all the capital letters follow the established order, and all the small letters follow the established order, there are no standard rules for whether small letters come before capital letters or where digits and special characters fit into the schema. When using the character functions LGE, LGT, LLE, LLT, the ASCII sequence (digits before upper-case letters before lower-case letters) is always followed, even on a non-ASCII machine. When using the logical relational operators, however, the coding system of the implementation is used.
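
The comparisons discussed in this section can be tried out with a short program. This is a minimal sketch; the program name CMPDEMO is made up for the example, and the variables are those of the TWO/TEN/SIX example above.

```fortran
PROGRAM CMPDEMO
  IMPLICIT NONE
  CHARACTER (LEN=3) :: A, B, C
  A = 'TWO'
  B = 'TEN'
  C = 'SIX'
  ! Relational operators: 'TWO' follows 'TEN', and 'SIX' precedes 'TEN'
  WRITE(*,*) A .GE. B, C .LE. B
  ! The comparison functions give the same answers for capital letters
  WRITE(*,*) LGE(A, B), LLE(C, B)
  ! In ASCII 'B' comes before 'a', so LLT('a','B') is always .FALSE.
  WRITE(*,*) LLT('a', 'B')
  ! The operator form depends on the coding system of the implementation
  WRITE(*,*) 'a' .LT. 'B'
END PROGRAM CMPDEMO
```

The first two WRITE statements each print two true values and the third prints a false value on any machine; only the result of the last line may differ between implementations.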

13.8 Substrings

One can also specify substrings in Fortran-90/95/2003 by using a notation similar to array notation. For example, if A can store 10 characters, to indicate the substring consisting of characters 4 through 8, we can write A(4:8). If one omits the first or last number, the compiler will assume the first or last character in the complete string. This notation can be used anywhere and on either side of assignment statements, and combined with the concatenation operator.

For example,

	PROGRAM TEST9
	CHARACTER*24 B,C
	B = 'A SMALL STEP FOR A MAN'
	C = B(3:3)//B(:1)//B(4:4)
	B(18:) = C
	WRITE(*,*) B
	STOP
	END
This program assigns to C the value of 'SAM' (the third, first and fourth characters of B), and then replaces the eighteenth through last characters of B with C, i.e., B now becomes 'A SMALL STEP FOR SAM'.

Fortran-90/95/2003 permits the same character positions to be indicated on both sides of the assignment statement. For example,

	A(4:6) = A(2:4)
is legal even though the fourth character position appears on both sides.
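
A short sketch of such an overlapping assignment follows (the program name OVERLAP and the initial value of A are made up for the example). The right side is evaluated in full before anything is stored, so the old value of the fourth character is the one that is copied.

```fortran
PROGRAM OVERLAP
  IMPLICIT NONE
  CHARACTER (LEN=10) :: A
  A = 'ABCDEFGHIJ'
  ! A(2:4) is 'BCD'; it replaces characters 4 through 6 ('DEF')
  A(4:6) = A(2:4)
  WRITE(*,'(1X,A)') A     ! prints ABCBCDGHIJ after the leading blank
END PROGRAM OVERLAP
```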


This page is maintained by Dennis C. Smolarski, S.J. dsmolarski@math.scu.edu
© Copyright 1998-2005 Dennis C. Smolarski, S.J., All rights reserved.
Last changed: 23 June 2003.