[Santa Clara University]
Department of Mathematics
and Computer Science

Math 169 Notes -- Icon



A. "Hello World!" in Icon

The well-known sample program in Icon can be written as follows:
	procedure main()
		write("Hello world!")
	end

B. History

Icon is a language that is a successor to SNOBOL4. It retains the pattern-matching and string manipulation operations of SNOBOL4 and combines with these the block-structure form of languages like Algol, Pascal and C/C++.

C. Identifiers

Identifiers can be used to name program segments and variables. An Icon identifier must begin with a letter of the alphabet or an underscore. It can be followed by any number of letters, underscores, or digits. UPPERCASE and lowercase letters are considered to be different.

D. Reserved Words

Reserved words are used to indicate programming constructs (such as "while" and "if" expressions) and delimit program segments (such as "procedure" and "end").

Icon reserved words are written with all lower case letters.

E. Output Function

For output, Icon includes the function "write", which takes zero or more expressions (typically variables and literal strings) as its arguments, writes their values one after another, and then ends the line. Literal strings are surrounded by double quotes ("..."). The arguments are placed between a matched set of parentheses and separated by commas.
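
As a small sketch of "write" with several arguments (the variable name is chosen only for illustration):

```icon
procedure main()
    name := "Icon"
    write("Hello from ", name, "!")   # writes: Hello from Icon!
    write("The sum is ", 2 + 3)       # writes: The sum is 5
    write()                           # writes an empty line
end
```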

F. Program Structure

An Icon program is a collection of procedures.

Each procedure begins with the reserved word "procedure" and ends with the reserved word "end". The name of the procedure follows the reserved word "procedure" and is an identifier of the language. A matched set of parentheses follows the procedure name and included within the parentheses are any arguments to the procedure.

Even if the procedure has no arguments, the parentheses are nevertheless included.

As in C/C++, the main program is the procedure named "main", written here with no parameters.

For example,

	procedure main()
		write("Hello world")
	end

G. Subprograms

A subprogram is a procedure with a name other than "main". It is invoked (as in Pascal or C/C++) by writing the name of the procedure followed by its arguments in parentheses.

For example,

	procedure hello()
		write("Hello world")
	end
	procedure main()
		hello()
	end
NOTE that when hello is called from the main program, the empty parentheses are included.

Procedures can explicitly return a value by using the reserved word "return" before the expression whose value is to be returned. Procedures can explicitly fail by using the reserved word "fail" (or by ending without an explicit return).

For example,

	if count > 0 then
		return count
	else fail
A procedure can be suspended by invoking the expression "suspend" (followed by an expression that provides the return value). When the procedure is resumed because the calling expression needs another value, it continues from the expression after the point of suspension.
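
The following is a minimal sketch of "return", "fail", and "suspend" together. The procedure names positive and evens are made up for illustration.

```icon
# Hypothetical helper: returns count if it is positive, otherwise fails.
procedure positive(count)
    if count > 0 then
        return count
    else fail
end

# Hypothetical generator: suspends 2, 4, 6, ... up to limit, then fails
# by running off the end of the procedure.
procedure evens(limit)
    local i
    i := 2
    while i <= limit do {
        suspend i
        i +:= 2
    }
end

procedure main()
    write(positive(3))       # writes 3
    write(positive(-3))      # positive fails, so nothing is written
    every write(evens(6))    # writes 2, 4, 6 on separate lines
end
```

Here "every" (see the section on iteration) keeps resuming evens until it fails.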

H. Comments

Comments are indicated by a pound sign, #. Everything following the pound sign on the same line is ignored.

I. Expressions

Technically, Icon has no statements. It only has "expressions." Each expression can have a value, a side effect (like output), or can succeed or fail. Multiple expressions on one line are separated by semicolons (as in Fortran-90/95). The use of semicolons at the end of lines is not necessary. Unlike Pascal, at the end of a procedure (after the reserved word "end"), no semi-colon or period is needed.

J. Variables

In general, variables do not have to be declared and they adjust their type to fit the context. Assignment is indicated by := . For example,
	procedure main()
		i := 1
		write(i)
		i := "hello"
		write(i)
	end
produces the output of
	1
	hello
If there is more than one procedure in a program, by default all variables are local to each separate procedure. Certain identifiers can be made global by using a "global" declaration outside all procedures. A "global" declaration begins with the reserved word "global" followed by the list of identifiers (separated by commas).

Variables can also explicitly be declared as local inside each procedure by using the reserved word "local" followed by the list of identifiers. This line should be the first one in a procedure (after the procedure heading line).

Variables are usually created when a procedure is called and destroyed when it is completed. To keep a variable and its last value alive from one invocation of a procedure to another, the variable can be declared as "static" (in a way similar to declaring variables global or local).
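
A minimal sketch of the three kinds of declaration follows. The names total, bump, step, and calls are invented for the example, and the reserved word "initial" (not described in these notes) is used to give the static variable its starting value on the first call only.

```icon
global total              # visible in every procedure

procedure bump()
    local step            # created anew on each call
    static calls          # keeps its value between calls
    initial calls := 0    # executed only on the first call
    step := 5
    calls +:= 1
    total +:= step
    return calls
end

procedure main()
    total := 0
    bump()
    bump()
    n := bump()
    write(n, " calls, total = ", total)   # writes: 3 calls, total = 15
end
```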

K. Input Function

For input, Icon includes the library function "read" that usually takes no arguments. The value of "read" is a character string that is the next complete line of text from the standard input file. The function "read" fails if there are no more lines in the input file.
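
As a sketch of reading from standard input, the following program numbers each input line. The loop ends when "read" fails at the end of the input.

```icon
procedure main()
    count := 0
    while line := read() do {
        count +:= 1
        write(count, ": ", line)
    }
end
```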

One can read from a file either by using Unix file redirection when invoking the executable file, or by explicitly opening a file and associating the actual file name with an identifier which is then used as the argument to the "read" function. For example:

procedure main()
    intext := open("sonnet.inp")        # open the file for reading
    while line := read(intext) do {     # read fails at end of file
        write(line)
    }
end
Note that this code makes use of the "success or failure" feature of Icon expressions within a "while" loop, concepts discussed in the next two sections.

L. Success or Failure

Each Icon expression is said to succeed or fail. It succeeds if it accomplishes what it was meant to accomplish (e.g., a procedure call or an assignment) and fails if something else happens (e.g., when a logical comparison is false, as when testing whether 3 < 2). For example, when the input function "read" is used on an empty file, it fails, and if it is included in an assignment or write expression, the larger expression also fails.

Thus, if the input file is empty,

read() fails,
line := read() also fails, and
write(read()) also fails.
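
The same propagation of failure can be seen without any input file. In this small sketch the comparison 3 < 2 fails, so the enclosing "write" is never called:

```icon
procedure main()
    write("before")
    write(3 < 2)      # the comparison fails, so nothing is written
    write("after")
    if 3 < 2 then write("never written")
    else write("3 < 2 failed")
end
```

The output is the three lines before, after, and 3 < 2 failed.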

M. Control Structures

1. "While" loops are similar to Pascal while loops with the following changes. The expression after the reserved word "while" can be any expression that can fail, including an assignment. The "do" section is optional. Multiple expressions in the "do" section are grouped by using curly braces {, } (as in C/C++).

2. "Until" loops are related to Pascal REPEAT loops with the following differences. The reserved word "until" comes first followed by an expression that can fail. The "do" section then follows and it is optional. As in a "while" structure, multiple expressions are grouped by using curly braces.

3. If-then-else structures are used as in Pascal/Fortran-90/C/C++. The "else" clause is optional.

4. A loop can be terminated immediately by invoking the expression "break" (similar to the Fortran-90/95 exit statement). Normally this would be used as part of an if-then-else structure for early termination. The rest of the body of the loop can be skipped by invoking the expression "next" (similar to the Fortran-90/95 cycle statement). This would be used as part of an if-then-else structure, and control is then immediately transferred to the beginning of the loop structure, ignoring any remaining expressions within the body of the loop.

5. An Icon "repeat" loop is significantly different from the Pascal structure with the same name. The Icon "repeat" loop has no conditional clause (e.g., "until"), and repeats indefinitely whether the expressions in its body succeed or fail. It can only be terminated using a "break" expression.

6. A "case" structure exists, similar to the one in Pascal (or the SELECT CASE structure in Fortran-90/95). The options are grouped as clauses with braces. Each clause has a left side (the evaluation option), a colon, and a right side (the action expression). If the key expression matches one of the options listed, the corresponding action is performed. There is a "default" option if none of the options listed equals the key expression. For example,

	case i of {
		j+1	: write("high")
		j-1	: write("low")
		default	: write("otherwise")
		}
7. Note that in "conditional" expressions (i.e., if, while, until expressions), the condition can be any expression that can succeed or fail, including an assignment, such as line := read() . This is distinctly different from what is allowed in similar structures in other languages.

As in other languages, the reserved word "not" can be used before any expression to reverse its success/failure value.
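
The loop forms above can be put together in one short sketch (the variable i and the limits are arbitrary). Note that "not" binds tightly, so the comparison it negates is placed in parentheses.

```icon
procedure main()
    i := 0
    while i < 3 do {
        i +:= 1
        write("while ", i)           # while 1, while 2, while 3
    }
    until i = 0 do i -:= 1           # counts i back down to 0
    repeat {
        i +:= 1
        if i > 5 then break          # the only way out of repeat
        if i % 2 = 0 then next       # skip even values
        write("repeat ", i)          # repeat 1, repeat 3, repeat 5
    }
    if not (i < 5) then write("done at ", i)   # done at 6
end
```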

N. Emergency Program Termination

To terminate the program immediately (e.g., because some erroneous condition has arisen), one can invoke the expression "stop(s)" where s is a string. The program immediately writes the string s (to the standard error output) and then stops. This can also be used to abruptly terminate any looping expression.
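
A typical use, sketched here with the file name from the earlier input example, is to stop when a file cannot be opened (the function "open" fails in that case):

```icon
procedure main()
    if not (intext := open("sonnet.inp")) then
        stop("cannot open sonnet.inp")
    while write(read(intext))    # copy the file to standard output
end
```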

O. Pattern Matching

Icon is designed (as was SNOBOL4) to facilitate pattern matching and character string manipulation. To aid in this, the language includes several pattern matching functions.
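find(s1,s2) succeeds if s1 is a substring of s2 occurring anywhere in s2. The resulting value is the location in s2 of the first character of s1. If s1 occurs more than once, find is a generator that can produce each location in turn, from left to right.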
match(s1,s2) succeeds if s1 is an initial substring of s2 occurring at the beginning of s2. The resulting value is one plus the length of s1.
Thus,
find("on", "motion") succeeds (with a value of 5), and
match("on", "motion") fails (given no value), but
match("mo", "motion") succeeds (with a value of 3).
These and other functions can be used as conditional expressions in if, while, or until expressions.
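
The values above can be checked with a short program. The last line is an extra illustration using "every" (see the section on iteration) to show every position that find can produce.

```icon
procedure main()
    write(find("on", "motion"))         # writes 5
    write(match("mo", "motion"))        # writes 3
    if match("on", "motion") then write("matched")
    else write("no match at the start") # this branch is taken
    every write(find("o", "motion"))    # writes 2, then 5
end
```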

P. Arithmetic Expressions, Binary Operators

Icon includes the following binary arithmetic operators:
+ addition
- subtraction
* multiplication
/ division
% "mod", i.e., remainder after division
^ exponentiation
Each operator can precede the assignment symbol to produce an "augmented assignment." For example, x +:= y is an abbreviation for x := x+y. The % operator has the same precedence as * and /. The absolute value function is "abs".
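
A brief sketch of these operators follows. Note that division of two integers produces an integer, while division involving a real number produces a real number.

```icon
procedure main()
    x := 7
    write(x / 2)       # writes 3 (integer division)
    write(x % 2)       # writes 1
    write(7.0 / 2)     # writes 3.5
    write(2 ^ 10)      # writes 1024
    x +:= 3            # same as x := x + 3
    write(x)           # writes 10
    write(abs(-4))     # writes 4
end
```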

Q. Numerical Comparisons

The comparison (i.e., relational/logical/boolean) operators are similar to those in Pascal, C/C++, Fortran-90/95:
< less than
<= less than or equal to
= equal to
> greater than
>= greater than or equal to
~= not equal to
It should be noted that if a comparison succeeds, the value of the expression is that of the right argument! For example,
	3 < 5
succeeds and produces a value of 5. Thus,
	n := 4 < 10
assigns a value of 10 to the identifier n.

Compound comparisons can be written as in standard mathematical expressions, such as:

	1 <= n <= 10
This groups from left to right, and succeeds if the value of n is between 1 and 10, inclusive.
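
Both points can be seen in a small sketch: the value of a successful comparison is its right argument, which is what makes the chained form work.

```icon
procedure main()
    n := 4 < 10
    write(n)                    # writes 10
    n := 7
    # (1 <= n) produces 7, and then 7 <= 10 succeeds
    if 1 <= n <= 10 then write(n, " is in range")
    # (1 <= 12) produces 12, and then 12 <= 10 fails
    if not (1 <= 12 <= 10) then write("12 is out of range")
end
```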

R. Repeated Evaluation

If a generating function is used in a numerical comparison, e.g.,
		find(s1,s2) > 10
and the comparison fails, the function is resumed to produce additional values until the comparison succeeds. Only if it never succeeds will the entire expression fail. Once the comparison succeeds, the function is not evaluated again.
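
As a sketch with a made-up string: "an" occurs in "banana bandana" at positions 2, 4, 9, and 12. The comparison below rejects 2 and 4, so find is resumed until it produces 9.

```icon
procedure main()
    s := "banana bandana"
    if (i := find("an", s)) > 6 then
        write(i)      # writes 9
end
```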

S. Iteration

The structure "every ... do" will cause the expression that follows the "every" to be re-evaluated for all possible values, executing the expression that follows the "do" for each one (if that optional clause is included).

One can generate a sequence of numbers by using

		i to j by k 
(the "by" clause is optional).

Thus,

 
		every i:= 1 to 9 by 2 do write(i^2)
produces 1, 9, 25, 49, 81.
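
Two further sketches of "every": a descending sequence using a negative "by" value, and a loop with no "do" clause that writes every value a generator produces.

```icon
procedure main()
    every i := 10 to 1 by -3 do write(i)    # writes 10, 7, 4, 1
    every write(find("a", "banana"))        # writes 2, 4, 6
end
```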

T. Alternation and Conjunction

Alternation of possible values is expressed by the pipe (|) operator. Thus (0|1) generates 0 and 1 as values. In the expression
	if i = (0|1) then write("okay")
okay is written if i is either 0 or 1. This can be understood as the "or" logical operator as well.

Conjunction, that is the "and" logical operator, is expressed by the single ampersand, &.
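
A short sketch combining the two operators (the values of i and j are arbitrary). The & operator has the lowest precedence, so the two comparisons need no extra parentheses.

```icon
procedure main()
    i := 1
    j := 5
    if i = (0 | 1) & j > 3 then write("both hold")   # written
    every write(1 | 2 | 3)                           # writes 1, 2, 3
end
```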

U. Other Assignment Features

Multiple assignments are permitted, e.g., x := y := z and group from RIGHT to LEFT, i.e., as x := (y := z).

Augmented assignments (as in C/C++) are permitted, e.g., i +:= 1 is interpreted as i := i+ 1.

Exchange of values is indicated by an equals sign with colons on both sides, e.g., x :=: y.
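
The three forms can be traced in a few lines:

```icon
procedure main()
    x := y := 3                  # both x and y become 3
    i := 10
    i +:= 1                      # i is now 11
    x :=: i                      # exchange: x is 11, i is 3
    write(x, " ", y, " ", i)     # writes: 11 3 3
end
```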

V. Unary Operators

Icon has a number of unary operators, for example:
?s produces one of the characters in string s randomly chosen
!s generates the characters of string s one by one, from first to last, left to right
*x produces the size of x where x is a string, cset or structure
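
A sketch of the three operators applied to one string:

```icon
procedure main()
    s := "Icon"
    write(*s)          # writes 4
    every write(!s)    # writes I, c, o, n on separate lines
    write(?s)          # writes one of I, c, o, n, chosen at random
end
```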

W. String Scanning

The scanning operator is the question mark ? .

The syntax of use is: expr1 ? expr2
where expr1 is the subject to be scanned
and expr2 does the scanning.

The function move(i) which increments the position in the subject by i characters (if possible, else it fails), can be used in a possible scanning expression. The value of move(i) is the portion of the subject between the old and new positions.

The functions find and match can be used in a scanning expression. In this case, the second argument is omitted and the subject is assumed to be the argument to be searched.

For example

	write(text ? find("the"))
writes the position of the first occurrence of "the" in text.

The function tab(i) sets the position in the subject to (right before) the i-th character when i is positive and "returns" the substring from the current position up to the i-th character. If i is 0, tab(i) indicates the right edge of the subject. If i is negative, tab(i) indicates |i| characters from the right edge of the subject.
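
The following sketch shows move and tab in scanning expressions, using a made-up subject string of 11 characters:

```icon
procedure main()
    "hello world" ? {
        write(move(5))     # writes hello; the position is now 6
        move(1)            # skip the blank
        write(tab(0))      # writes world (up to the right edge)
    }
    "hello world" ? {
        # find produces 7, so tab(7) moves to just before the "w"
        if tab(find("wor")) then
            write(tab(-2)) # writes wor (stops 2 from the right edge)
    }
end
```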

X. Csets

"Csets" are Icon character sets, i.e., a set of characters. A cset can be defined by enclosing a string of characters using SINGLE quotes, i.e., vowel := 'aeiouAEIOU' .

Predefined csets include:

&letters all upper and lowercase letters
&ucase all uppercase letters
&lcase all lowercase letters
&digits the ten single digits
&cset all 256 characters
&ascii the first 128 characters in ASCII
Icon has five operations on csets:
++ set union (binary)
** set intersection (binary)
-- set difference (binary)
~ set complement (unary)
* set size (unary)
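
As a sketch of these operations (the cset name consonant is chosen only for illustration):

```icon
procedure main()
    vowel := 'aeiouAEIOU'
    consonant := &letters -- vowel
    write(*consonant)              # writes 42 (52 letters less 10 vowels)
    write(*(vowel ** &lcase))      # writes 5
    write(*(&digits ++ 'abc'))     # writes 13
    write(*(~&ascii))              # writes 128 (the rest of the 256)
end
```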
Icon has functions that take csets as arguments.
upto(c) generates the position in the subject in which any character in cset c occurs.
many(c) produces the position after a sequence of characters in c
any(c) produces the position after any character in c.
These functions can take additional arguments: the second argument is the subject (by default, the current value of &subject), and the third argument is the position at which the search starts (by default, the current scanning position when the subject is omitted, and 1 when a subject is given).

The following is an example of the use of these functions:

procedure main()
    text := "Does anyone have an apple or an ugly orange?"
    text ? {
        while tab(upto(&letters)) do {
            write("Upto =", upto(&letters))
            write("many =", many(&letters))
            write("tab many=", tab(many(&letters)))
        }
    }
end
The output is:
Upto =1
many =5
tab many=Does
Upto =6
many =12
tab many=anyone
Upto =13
many =17
tab many=have
Upto =18
many =20
tab many=an
Upto =21
many =26
tab many=apple
Upto =27
many =29
tab many=or
Upto =30
many =32
tab many=an
Upto =33
many =37
tab many=ugly
Upto =38
many =44
tab many=orange
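
The same functions can also be called outside a scanning expression by giving the subject as an explicit second argument. A sketch, with made-up strings:

```icon
procedure main()
    write(upto('aeiou', "strength"))         # writes 4
    write(many(&digits, "2002 May"))         # writes 5
    write(any(&ucase, "Icon"))               # writes 2
    if not any(&ucase, "icon") then
        write("no capital first")            # written
    every write(upto('aeiou', "education"))  # writes 1, 3, 5, 7, 8
end
```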

Y. Lists (Arrays), Tables, Sets

One dimensional arrays are called lists and are declared by using the list function which determines the initial size. Elements of a list are specified by using square brackets to enclose the subscript. All the values of a list may be initialized when the list is created by using a second, optional parameter.

Thus,

		vector := list(100,0.0)
creates a list of size 100 called vector, each element of which has been initialized to 0.0 . The second element of vector is designated as vector[2]. A list can also be created by enclosing the values of the elements in square brackets, i.e., city := ["Tucson", "Los Angeles", "San Jose"].

An empty list (containing no values) can be created by [] or list(0).

A multi-dimensional array can be simulated by creating a list of lists. Thus,

		board := list(8)
		every !board := list(5)
creates a two-dimensional array with first dimension 8 and second dimension 5. The i-th,j-th element is designated as board[i][j].
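
A sketch that exercises these list forms follows. The board rows are initialized to 0 here (an addition to the example above) so that the elements can be written.

```icon
procedure main()
    vector := list(100, 0.0)
    vector[2] := 3.5
    write(*vector)                   # writes 100
    city := ["Tucson", "Los Angeles", "San Jose"]
    every write(!city)               # writes the three names in order
    board := list(8)
    every !board := list(5, 0)       # a separate row list for each element
    board[3][4] := 1
    write(board[3][4], " ", board[1][1])   # writes: 1 0
end
```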

Lists can be used as queues and stacks. The operation put(L,x) enqueues an element x onto the right end of list L, and get(L) dequeues an element from the left end of L. The operation push(L,x) adds an element x to the left end of list L, and pop(L) removes a value from the left end of L.
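
A sketch of the four operations on one list:

```icon
procedure main()
    L := []
    put(L, 1); put(L, 2); put(L, 3)   # L holds 1, 2, 3
    write(get(L))                     # writes 1; L holds 2, 3
    push(L, 0)                        # L holds 0, 2, 3
    write(pop(L))                     # writes 0; L holds 2, 3
    write(*L)                         # writes 2
end
```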

A table is a collection of items that can be referenced by "subscripts" that are neither numeric nor ordered. In other words, a table is like an array (list), except that the subscripts can be anything, including arbitrary strings. A table is created by invoking the function table and giving a default value as the argument; this is the value produced for any subscript that has not yet been assigned. A table automatically expands whenever a value is assigned to a new subscript. Thus

		words := table(0)
creates a table named words. One can thereafter create a new element merely by assigning to an element of words with some subscript, e.g.,
		words["The"] := 1
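
Because the default value is 0, a table created this way can be used directly for counting. A sketch with a made-up word list:

```icon
procedure main()
    words := table(0)
    every w := !["the", "cat", "the", "hat"] do
        words[w] +:= 1
    write(words["the"])     # writes 2
    write(*words)           # writes 3 (the, cat, hat)
    write(words["dog"])     # writes 0, the default value
end
```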
A set is an unordered collection of values. A set is created by invoking the function set on a collection of values, e.g., set([1,2,3,3,1]) creates a set with three members, 1, 2, and 3. The function member(S,x) succeeds if x is a member of set S (and it produces x as a value). The functions insert(S,x) and delete(S,x) are also part of the language. Set operations are designated as follows:
++ set union
** set intersection
-- set difference
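
A sketch of these set functions and operations follows. The built-in function sort, which is not described in these notes, is used only to write the members in a fixed order.

```icon
procedure main()
    S := set([1, 2, 3, 3, 1])
    write(*S)                     # writes 3
    insert(S, 4)                  # S holds 1, 2, 3, 4
    delete(S, 1)                  # S holds 2, 3, 4
    if member(S, 4) then write("4 is a member")
    T := S ** set([3, 4, 5])      # intersection: 3, 4
    every write(!sort(T))         # writes 3, then 4
end
```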


This page is maintained by Dennis C. Smolarski, S.J. dsmolarski@math.scu.edu
Last updated: 14 May 2002.