Department of Mathematics and Computer Science
        OUTPUT = "Hello world!"
END
Some of the features of SNOBOL include user-defined functions and data types, automatic data-type conversion, heterogeneous arrays (i.e., arrays whose elements may have different data types), and a "table" data type with associative lookup (i.e., an "array" with arbitrary, non-numeric subscripts).
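A few of these features can be sketched in a short program. This is only an illustration under stated assumptions: the names STOCK, A, TWICE and START are invented for the example, and the angle-bracket subscript syntax is that of standard SNOBOL4 (some implementations also accept square brackets).

```snobol
* Associative table: subscripts may be arbitrary strings.
        STOCK = TABLE()
        STOCK<'APPLE'> = 3
        STOCK<'PEAR'> = '4'
* The string '4' is converted to an integer for the addition.
        OUTPUT = STOCK<'APPLE'> + STOCK<'PEAR'>
* Heterogeneous array: each element may hold a different type.
        A = ARRAY(3)
        A<1> = 'TEXT'
        A<2> = 42
        A<3> = 3.14
* User-defined function (hypothetical name TWICE); the goto
* skips over the function body so it is not run at start-up.
        DEFINE('TWICE(S)')         :(START)
TWICE   TWICE = S S                :(RETURN)
START   OUTPUT = TWICE(A<1>)
END
```

Run as written, this should print 7 and then TEXTTEXT.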
If a pattern-matching operation succeeds, one may wish to perform a replacement within the subject string or take some other action. If the operation does not succeed ("fails"), one may wish to perform an alternative action.
The ability to detect the success or failure of a pattern match is fundamental to SNOBOL. A pattern match can also be combined, in the same statement, with an assignment that replaces the substring that was matched.
Pattern matching is accomplished by putting a subject string (or variable) on a statement line, followed by a pattern string.
The actual "matching" command is indicated by the blank space between the subject and the pattern. Any statement line may end with a branch field: a colon followed by an S and a branch label in parentheses, to which control is transferred on success, or by an F and a branch label in parentheses, to which control is transferred on failure.
Thus, the following code will replace X's with A's and Y's with B's in an input string:
        TEXT = INPUT
ONE     TEXT 'X' = 'A'             :S(ONE)
TWO     TEXT 'Y' = 'B'             :S(TWO)
        OUTPUT = TEXT
END
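The same success and failure branching can drive a read loop. The following is a minimal sketch, assuming that text arrives on standard input and that reading INPUT fails at end of file, as it does in standard SNOBOL4 implementations; the label LOOP is invented for the example.

```snobol
* Read lines until INPUT fails at end of file.
LOOP    TEXT = INPUT               :F(END)
* Replace one X per pass; loop while the match keeps succeeding.
ONE     TEXT 'X' = 'A'             :S(ONE)
TWO     TEXT 'Y' = 'B'             :S(TWO)
        OUTPUT = TEXT              :(LOOP)
END
```

Each replacement statement changes only the first occurrence it finds, which is why it branches back to itself on success and falls through to the next statement on failure.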
Patterns can contain alternatives (indicated by the "pipe" or vertical bar, with
spaces on either side),
and concatenation of several subpatterns is also indicated by the blank
space "operator." One can also use the dot operator to "capture" the substring
that a pattern finally matches in a variable. Thus,
        X = 'BREAD AND BUTTER'
        PAT = (('B' | 'R') ('E' | 'EA') ('D' | 'DS')) . Y
        X PAT
        OUTPUT = Y
END
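results in Y getting the value READ: the attempt starting at the initial B fails (B is followed by R, not E), so the match succeeds starting at the R, using the alternatives R, EA, and D.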
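The dot operator binds more tightly than concatenation, which in turn binds more tightly than the vertical bar; this is why the whole pattern above is parenthesized before being captured in Y. As a sketch of capturing two pieces separately (HEAD, TAIL and NOPE are names invented for this example):

```snobol
        X = 'BREAD AND BUTTER'
* Each dot applies only to the parenthesized group just before it.
        PAT = ('BUT' | 'BR') . HEAD ('EAD' | 'TER') . TAIL
        X PAT                      :F(NOPE)
        OUTPUT = HEAD '-' TAIL     :(END)
NOPE    OUTPUT = 'NO MATCH'
END
```

Here the match succeeds at the start of the subject with the alternatives BR and EAD, so the program should print BR-EAD.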
Some of the built-in functions and patterns are:
TRIM(X) -- removes all trailing blanks from input string X
SIZE(X) -- returns the length of input string X
TAB(n) -- matches everything from the current cursor position up to position n
LEN(n) -- matches any string of length n
ARB -- matches any string of arbitrary length
ANY(X) -- matches any single character contained in the input string X, considered as a set of characters
NOTANY(X) -- matches any single character NOT contained in the input string X, considered as a set of characters
SPAN(X) -- matches the longest nonempty substring (of the subject string), starting at the cursor, whose characters are all in X
BREAK(X) -- matches the substring (of the subject string) from the cursor up to, but not including, the next character that is in X
POS(n) -- matches the null string if the matching cursor is at position n
REM -- matches the remainder of the string starting at the current cursor position
The following is an example making use of some of these functions to capture sections of a subject string.
        CARD = 'DENNIS SMOLARSKI 554-4124'
        X = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
        PAT = BREAK(X) SPAN(X) . FIRST SPAN(' ') SPAN(X) . LAST SPAN(' ') REM . TEL
        CARD PAT
        OUTPUT = "FIRSTNAME=" FIRST
        OUTPUT = "LASTNAME=" LAST
        OUTPUT = "TELEPHONE=" TEL
END
The pattern PAT tells the interpreter to skip ahead ("break") to the
first character in the set X (i.e., a letter of the alphabet), then move
over ("span") the longest string consisting of characters in X,
capturing this value in FIRST; then span the blanks; then span
another string of letters, capturing it in LAST; then span blanks again;
and finally capture the REMainder of the subject string in TEL.
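The same functions can be combined with replacement to peel fields off a string one at a time. The following is a sketch only, assuming a comma-separated subject; the names LINE, FIELD, ITEM, NEXT and LAST are invented for the example.

```snobol
        LINE = 'RED,GREEN,BLUE'
* A field is everything up to the next comma, plus the comma itself.
        FIELD = BREAK(',') . ITEM LEN(1)
* Delete the matched field from LINE; fail when no comma remains.
NEXT    LINE FIELD =               :F(LAST)
        OUTPUT = ITEM              :(NEXT)
* Whatever is left after the final comma is the last field.
LAST    OUTPUT = LINE
END
```

This should print RED, GREEN and BLUE on separate lines. BREAK fails when none of its characters occurs in the rest of the subject, and that failure is what ends the loop.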
This page is maintained by Dennis C. Smolarski, S.J.
dsmolarski@math.scu.edu
Last updated: 14 May 2002.