[Santa Clara University]
Department of Mathematics
and Computer Science

Math 169 Notes -- Python



A. "Hello World!" in Python

The well-known sample program can be written in Python in several ways:

Interactive mode:

          print "Hello world!"
Functional mode:
          def greeting():
              print "Hello world!"
followed by the function invocation,
          greeting()
Unix Shell Script: Assuming the file containing the following script has been made executable and the Python interpreter is located in /usr/bin, one could run the simple Python shell script
          #! /usr/bin/python
          print "Hello world!"

B. History

Python was created in the early 1990s by Guido van Rossum in the Netherlands as a successor to ABC. It was designed as a scripting language with facility for character manipulation (as with SNOBOL and Icon) as well as mathematical functionality. For example, it shares with Icon (and various versions of LISP) the ability to have unlimited precision integer arithmetic. Nevertheless it retains the modern block-structure form of languages like Algol, Pascal and C/C++.

Unlike many languages, Python (like LISP) can be used in an interactive mode. One can also create functions to be stored in independent files and imported into the Python environment for execution.

C. Identifiers

Identifiers can be used to name program segments and variables. A Python identifier must begin with a letter of the alphabet or an underscore. It can be followed by any number of letters, underscores, or digits. UPPERCASE and lowercase letters are considered to be different.

D. Reserved Words

Reserved words are used to indicate programming constructs (such as "while" and "if" expressions) and delimit program segments (such as "def").

Python reserved words are written with all lower case letters.

E. Output Function

For output, Python includes the statement "print", which is followed by one or more expressions (variables, literal strings, and so on). Literal strings are surrounded by single or double quotes ('...' or "..."), or by triple single or triple double quotes ('''...''' or """..."""). The items to be printed are separated by commas.

Output to a file is addressed in a separate section below.

F. Program Structure

One can use Python interactively within the Python environment, similar to the way one can interact with LISP or Matlab. One can, alternatively, create a program as with other languages.

A Python program can be written as a collection of commands and statements or structures within the interactive environment. These commands can also be put in a separate file which can be invoked from the interpreter. Another alternative is to write a collection of procedures within a file (with extension .py) and import the file, and then invoke the procedures.

As with other languages, when writing a long program, it is good to structure it as a set of interrelated procedures/functions, some of which call others. Many Python programmers use the IDLE Python GUI, which makes it easy to save and debug code.

The starting function may be given any name. Many programmers, however, following the C/C++/Java style, label the main function "main."

One imports a module by using the keyword import followed by the file name (without extension .py). In this case, one invokes any function by including the filename followed by a dot and then the function name. For example, if file mymath.py contains a function area to compute the area of a circle with input r being the value of the radius, then one must type

      import mymath
      mymath.area(4)
to get the desired answer for a circle of radius 4.

One can avoid repeating the file name when using functions it contains by specifying functions explicitly or using the wildcard symbol *. In other words, one may type

      from mymath import area
      area(4)
or
      from mymath import *
      area(4)
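
As a minimal sketch, the hypothetical file mymath.py used in the examples above might contain nothing more than the following. The notes do not show the file's contents, so the formula and the use of math.pi are assumptions made for illustration.

```python
# mymath.py -- hypothetical contents of the module imported above
import math

def area(r):
    """Return the area of a circle of radius r."""
    return math.pi * r * r
```

With this file in the current directory, mymath.area(4) evaluates to roughly 50.27.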

G. Functions

A function is any named collection of commands/statements. It is invoked (as in Pascal or C/C++) by writing the name of the function followed by parentheses enclosing any actual parameters. In the definition, the name of the function follows the keyword def and is followed by parentheses enclosing the formal parameters (if any). The right parenthesis is followed by a colon. All interior statements must be indented (as shown in the example above).

Functions can explicitly return a value by using the reserved word return followed by the value (or any expression) to be returned.

For example,

        def factorial(n):
            r = 1
            while n > 0:
                r = r * n
                n = n - 1
            return r
Note that there are no enclosing braces to delimit the interior of a function nor any specific keyword (e.g., END) to indicate the conclusion. The body of the function (and of structures such as "while") is indicated solely via indentation.

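Parameters are passed by assignment: each formal parameter becomes a new local name for the object supplied by the caller. Rebinding a parameter inside the function (as with n = n - 1 above) has no effect on the caller, so numbers and strings behave as if passed by value; changes made in place to a mutable object such as a list, however, are visible to the caller. As with Fortran-90, one can also label the parameters when invoking a function, in which case they need not be given in order.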
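
The labeled form of a call can be sketched as follows. The function cylinder_volume and its parameter names are hypothetical, chosen only to show that labeled arguments may be given in any order.

```python
def cylinder_volume(radius, height):
    """Volume of a cylinder (hypothetical example function)."""
    return 3.14159 * radius * radius * height

# Positional call: arguments are matched by position.
v1 = cylinder_volume(2, 10)

# Labeled call: arguments are matched by name, so the order is free.
v2 = cylinder_volume(height=10, radius=2)

same = (v1 == v2)   # both calls compute the same value
```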

H. Comments

Comments are indicated by a pound sign, #. Everything following the pound sign on the same line is ignored. Comments can be put on the same line as executable code, following such code.

I. Statements

Normally each statement is written on its own line, and no semicolon is needed at the end of a line. (Several simple statements may be placed on one line if they are separated by semicolons, but this is generally discouraged.)

If a statement needs to run over more than one line, one places the backslash character \ at the end of the line to be continued. The backslash is not used if the split occurs within grouping symbols, i.e., ( ), [ ], or { }. The continued line is usually indented for readability. For example,

      x = [1, 2, 3, 4,
            5, 6]
      y = 123 + 34.0/456.7 - \
            54.2

J. Variables and Data Types

In general, variables do not have to be declared and they adjust their type to fit the context. Assignment is indicated by = . For example,
        def main():
            i = 1
            print i
            i = "hello"
            print i
        main()
produces the output of
        1
        hello
Python has a wealth of different fundamental (unstructured) data types.

Strings are enclosed in single quotes (and can have embedded double quotes), or in double quotes (and can have embedded single quotes), or in triple single or triple double quotes (and can have embedded newlines). One can include the C/C++ backslash escape characters in strings, e.g., \n, \t, \\, \', etc.

(Short) Integers are as in other languages. They are written without any decimal point and, on a typical 32-bit machine, store about 9 digits of accuracy.

Long Integers are designated by a trailing lower-case or upper-case L, as in 3L or 1234567890123456L. These are integers with unlimited precision.

Floats are numbers including a decimal point. They are always stored in double precision (the C double type of the underlying machine); there is no separate single-precision type.

Complex Numbers are written as the sum of the real and imaginary parts with the convention that the imaginary unit is designated (as in most Engineering disciplines) as j rather than i, so one writes 3 + 2j for example. (Note that standard math functions in the math module do not take complex arguments; instead one must import the complex math module, cmath, and specially designate functions from that module, e.g., cmath.sqrt(-1).)
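
A short sketch of the numeric types described above; the particular values are arbitrary choices made for illustration.

```python
import cmath

big = 2 ** 100              # integers grow as needed; no overflow occurs
root = cmath.sqrt(-1)       # complex square root; math.sqrt(-1) would raise an error
z = (3 + 2j) * (1 - 1j)     # complex arithmetic uses the j notation
```

Here big holds the exact 31-digit value of 2 to the 100th power, and z is the complex number 5 - 1j.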

By default all variables are local to each separate function. Identifiers that have been given a value in the main level are visible to a function when it is invoked. If a function assigns a value to an identifier already in use at the main level, a separate local identifier is created and the other identifier's value is not affected (unless the function declares the identifier with the keyword global). One can determine all visible variables/identifiers and their values via the commands globals() and locals().

Local variables are created when a function is called and destroyed when it returns.
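
The visibility rules above can be sketched as follows. The names count, show, and shadow are hypothetical.

```python
count = 10          # identifier defined at the main level

def show():
    return count    # main-level identifiers are visible inside a function

def shadow():
    count = 99      # assignment creates a separate local identifier
    return count

seen = show()       # 10
local = shadow()    # 99
after = count       # still 10: the main-level value is unchanged
```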

K. Input Functions

For keyboard input, one can use the function raw_input which takes a character string argument to be displayed on the screen as a prompt. It will consider any input entered to be a character string. Usually this function is used with an appropriate conversion (coercion) function, such as int, float, long if one expects numeric input. For example,
      x = float(raw_input("enter a number:"))
will print out enter a number: on the screen and wait for user input. The function raw_input will read in a number as a character string and the function float will convert the string to a real number. Because one can be explicit about the resulting data type, this combination of functions is usually the safer approach to take in writing longer code.

In short segments, for keyboard input, one can also use the function input which takes a character string argument to be displayed on the screen. It will convert input to the appropriate data type (including a string enclosed in quotes). For example,

      x = input("enter a number:")
will print out enter a number: on the screen and wait for user input of a number. After the number (and Enter) has been typed, the value is stored in variable x.

Note that a character string entered without enclosing quotes will result in an error message.

Because input evaluates whatever is typed as a Python expression, unexpected input can cause errors or run arbitrary code. For this reason several authors (including the notes at the main Python web site, http://www.python.org/doc/current/lib/built-in-funcs.html [see the entry for input] ) warn against using input and recommend using raw_input instead.
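
The conversion step that accompanies raw_input can be sketched on its own. The helper to_number is hypothetical, and the strings are hard-coded (standing in for what a user might type) so that the sketch runs without a keyboard.

```python
def to_number(text):
    """Convert a typed string to a float, or return None if it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None

# Strings of the kind raw_input would return.
good = to_number("3.75")     # 3.75
bad = to_number("three")     # None, because float() rejects the string
```

Catching ValueError in this way lets a program ask again instead of stopping with an error message.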

L. File I/O

To read from or write to a file, one must first open the file in either read or write mode and associate it with an internal identifier, called the file object. For example,
        myinfile = open("textfile.txt",'r')
opens the local disk file textfile.txt as a file to be read (because of 'r') and associates it with the local identifier myinfile. (To designate a file for output, one uses 'w', for write.)

Then to read a single line from the file, one uses the function readline associated with the local identifier as if it were a C/C++ structured variable, e.g.,

       line = myinfile.readline()
To write to the file associated with local identifier myoutfile, one uses write, which takes as an argument whatever is to be written, for example,
       myoutfile.write("The end of the output.\n")
To close files, one uses the close function, e.g.,
       myinfile.close()
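
Putting these steps together, a minimal sketch writes two lines to a scratch file and reads them back. The file name and its location in the system's temporary directory are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical scratch file in the system's temporary directory.
path = os.path.join(tempfile.gettempdir(), "textfile_demo.txt")

myoutfile = open(path, 'w')            # open for writing
myoutfile.write("first line\n")
myoutfile.write("The end of the output.\n")
myoutfile.close()

myinfile = open(path, 'r')             # reopen the same file for reading
line1 = myinfile.readline()            # includes the trailing newline
line2 = myinfile.readline()
line3 = myinfile.readline()            # empty string signals end of file
myinfile.close()

os.remove(path)                        # clean up the scratch file
```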

M. Lists

Python lists share some of the characteristics of LISP lists as well as arrays in other programming languages. Python lists consist of data objects separated by commas and enclosed by square brackets. For example,
        x = [1, 2, 3]
assigns a list of three elements to the variable x.

Like with arrays in C/C++ (and other languages), elements in a Python list can be accessed directly via a subscript enclosed in square brackets (starting at 0 as with C/C++), e.g., x[0].

Like in LISP, a Python list can have elements of different data types, including lists. It does not have to be declared before use, and, in fact, can be expanded or condensed after being created. For example

        x = [1, 2.5, 90000000L, "hello"]
If a list element is itself a list, elements in that list can be accessed by a second subscript (as with C/C++ two dimensional arrays). For example,
        x = [[1,2,3], [11,22,33],"hello"]
        x[1][2]
results in 33 being printed out.

Operations on lists can be done via assignments or via functions (i.e., variable methods).

The length of a list x is obtained via the function len(x).

To append to the end of a list, one can write

        x[len(x):] = [5, 6, 7]
or
        x.append(5)
        x.append(6)
        x.append(7)
To append to the front of a list, one can write
        x[:0] = [-1, 0]
or
        x.insert(0,0)
        x.insert(0,-1)
One can also insert an element into a list at an arbitrary point in this way (e.g., at a point making it have subscript 2 in the revised list)
        x.insert(2,"goodbye")
To remove elements starting with subscript 2 and going up to (but not including) subscript 4, one can write
        x[2:4] = []
or
        del x[2:4]
To remove the first instance of specific item value in a list, one uses
         x.remove(3)
If the value 3 happens not to be in the list, one gets an error.

One can reverse elements in a list, or sort elements in a list, or find the maximum/minimum elements in a list using built-in methods and functions. For example,

         x.reverse()        
         x.sort()
         xmin = min(x)
         xmax = max(x)
One can test to see whether a value is in or not in a list using the operator in or not in. For example,
         3 in x
         2 not in x
The result is True or False (in versions of Python before 2.3, 1 or 0); in arithmetic, True and False behave as the integers 1 and 0.

One can concatenate two lists using the plus sign. For example,

         z = [0, 1, 2, 3] + [4, 5, 6, 7]
One can replicate values of a list to create a new list by using the times symbol and an integer value. For example,
         z = [1, 3] * 4
results in [1, 3, 1, 3, 1, 3, 1, 3].

Other list functions and methods can be found in a Python book or web search.
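
The list operations above can be traced in one short sketch. The starting list is an arbitrary choice, and each comment shows the list after the statement on that line.

```python
x = [1, 2, 3]
x[len(x):] = [5, 6, 7]       # [1, 2, 3, 5, 6, 7]
x[:0] = [-1, 0]              # [-1, 0, 1, 2, 3, 5, 6, 7]
x.insert(2, "goodbye")       # [-1, 0, 'goodbye', 1, 2, 3, 5, 6, 7]
del x[2:4]                   # [-1, 0, 2, 3, 5, 6, 7]
x.remove(3)                  # [-1, 0, 2, 5, 6, 7]
x.reverse()                  # [7, 6, 5, 2, 0, -1]
x.sort()                     # [-1, 0, 2, 5, 6, 7]

has_two = 2 in x             # membership test
z = [1, 3] * 2 + [0]         # replication, then concatenation: [1, 3, 1, 3, 0]
```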

N. Control Structures

Control structures in Python reflect the control structures in other scientific languages with this major exception: The "blocks" interior to these structures are indicated by indentation rather than by braces (as in C/C++/Java) or keywords (e.g., begin ... end [Pascal] or END IF, END DO, END SELECT as in Fortran).

Moreover, each "header" line (i.e., a line beginning with a keyword) must end with a colon.

1. While loops are similar to while loops in other languages with the following changes. The expression after the reserved word "while" can be any conditional expression (no parentheses are needed). A while statement can have an optional "else:" clause that is executed once when the condition becomes false. The "body" of the while segment can include the keywords break (which terminates the loop immediately, skipping any else clause) or continue (which skips the remainder of the current iteration and returns to test the condition again).

As an example, the following is taken from the sample function given above:

            while n > 0:
                r = r * n
                n = n - 1

2. If-then-else structures are used as in Pascal/Fortran-90/C/C++. The "else" clause is optional.

The keyword then is not used (similar to C/C++).

An "else if" clause may be included (with another condition), but the keyword is written elif.
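
A minimal sketch of the if/elif/else form; the function classify is a hypothetical example.

```python
def classify(n):
    """Describe the sign of n (hypothetical helper for illustration)."""
    if n > 0:
        return "positive"
    elif n < 0:
        return "negative"
    else:
        return "zero"

results = [classify(5), classify(-2), classify(0)]
```

Only the first branch whose condition holds is executed, so results is ['positive', 'negative', 'zero'].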

3. For Loops in Python are both similar and dissimilar to counted loops in other languages.

The major similarity is that such loops are repeated a set number of times. The major dissimilarity is that the values taken by the loop variable can come from any sequence (a list, a string, and so on), not merely from an arithmetic progression as in most other languages.

The syntax of the for loop includes the keyword for followed by a loop variable, followed by the keyword in, followed by a sequence from which the loop variable takes its values, followed by the colon (on the header line). Following this header line, the body of the loop appears, indented. An else: clause may follow as with the while loop.

Most generally, one specifies a sequence of values that the loop variable takes on in turn. For example,

        x = [1.0, 2.0, 3.0]
        for n in x:
           print 1/n
is a valid loop structure based on the sequence defined by the assignment to x.

Programmers often make use of the range function to obtain a continuous set of values. For example range(len(x)) would produce a sequence of integers corresponding to the subscript range of the list x (i.e., its "length" assuming the initial value corresponds to subscript 0).

One can also use range with two arguments, indicating the starting value and the upper value (that is not included). For example, range(2,6) results in the list [2,3,4,5].

One can also use range with three arguments, indicating the starting value, the final value (not included), and the step value (as with Fortran), which may be negative. Thus, range(6,1,-2) produces the list [6, 4, 2].

Instead of using the function range (which constructs and stores the values specified), one may use the function xrange, which does not store the values specified but creates them as needed by a for loop. For short loops the two perform about the same; the advantage of xrange is the memory it saves when the range is very large.
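
The range forms and the else: clause can be sketched as follows. The list(...) wrapper is an assumption added so that the results are shown as lists regardless of the Python version; it is harmless when range already returns a list.

```python
up = list(range(2, 6))            # [2, 3, 4, 5]
down = list(range(6, 1, -2))      # [6, 4, 2]

x = [1.0, 2.0, 4.0]
total = 0.0
for i in range(len(x)):           # i takes the subscripts 0, 1, 2
    total = total + 1 / x[i]

# The else clause runs because the loop finishes without a break.
for n in x:
    if n < 0:
        found_negative = True
        break
else:
    found_negative = False
```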

O. Arithmetic Expressions, Binary Operators

Python includes the following binary arithmetic operators:
+ addition
- subtraction
* multiplication
/ division (if both operands are integers, the quotient is rounded down to an integer, e.g., 7/2 gives 3)
% "mod", i.e., remainder after division
** exponentiation
The % operator has the same precedence as * and /. The absolute value function is "abs". The other common math functions are part of the math or cmath modules.
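
A few of these operators in use; the operands are arbitrary values chosen for illustration.

```python
import math

remainder = 17 % 5            # 2
power = 2 ** 10               # 1024
mixed = 2 + 7 % 3 * 2         # % and * bind tighter than +: 2 + ((7 % 3) * 2) = 4
magnitude = abs(-4.5)         # 4.5
root = math.sqrt(16.0)        # 4.0
```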

P. Numerical Comparisons

The comparison (i.e., relational/logical/boolean) operators are similar to those in Pascal, C/C++, Fortran-90/95:
< less than
<= less than or equal to
== equal to
> greater than
>= greater than or equal to
!= not equal to
Compound comparisons can be written as in standard mathematical expressions, such as:
	1 <= n <= 10

The standard logical operators and, or, and not are part of the language. Other character and symbolic operators are part of the language (e.g., in (for membership in a sequence), ^ (for bitwise exclusive or)). See a complete manual for the full list.
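
A brief sketch of compound comparisons and the logical operators; the function in_range is a hypothetical example.

```python
def in_range(n):
    """True when n lies between 1 and 10 inclusive."""
    return 1 <= n <= 10        # same as (1 <= n) and (n <= 10)

checks = [in_range(1), in_range(10), in_range(11)]
outside = not in_range(0) or in_range(5)
flipped = 6 ^ 3                # bitwise exclusive or: 110 ^ 011 = 101
```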

Q. Dictionaries

A dictionary (in Python) is similar to a table in Icon. It is a collection of items that can be referenced by "subscripts" that are neither numeric nor ordered. In other words, a dictionary is like a Python list, except that the subscripts can be almost any immutable value, such as numbers and arbitrary strings.

An empty dictionary is created by associating a name in an assignment statement with curly braces enclosing nothing, e.g.,

              sampledict = {}
To add an element to the dictionary, one uses an assignment statement similar to that used with a C/C++ array, except that the subscript is put in quotes if it is to be a literal string. For example,
              sampledict['sam'] = 'blue'
              sampledict[2] = 'green'
are legitimate assignments. Alternatively, one can define an initial dictionary explicitly in this way:
              sampledict = {'sam':'blue',2:'green'}
To see all the "keys" (i.e., subscripts), one makes use of the intrinsic method "keys()", for example,
              sampledict.keys()
and all the keys of "sampledict" will be displayed as a list. Similarly, one can display the values using the intrinsic method "values()" and both keys and values together as a list of pairs using the intrinsic method "items()".

One can test for the presence of a specific key (subscript) using the "has_key" method, e.g.,

              sampledict.has_key('sam')
which will return a true value (True, or 1 in versions of Python before 2.3) if 'sam' is, in fact, being used as a "subscript" and is associated with some value (and False, or 0, otherwise). An alternative method is "get" with two arguments, the first being the key (subscript) being sought and the second, optional, argument being the value returned if the dictionary does not contain the desired key. For example,
              sampledict.get('sam','Not in Dictionary')
will return "Not in Dictionary" if 'sam' is not a valid key.

One can also combine two dictionaries using the "update" method. For example, if one wishes to augment dictionary "x" with the contents of dictionary "z", one invokes

              x.update(z)
and "x" will contain all it formerly contained augmented by the contents of "z"; where the same key appears in both, the value from "z" replaces the one in "x".
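
The dictionary operations above can be sketched as follows. The keys and values are arbitrary, and the key test is written with the in operator rather than has_key, an assumption made so that the sketch does not depend on a particular Python version.

```python
sampledict = {}
sampledict['sam'] = 'blue'
sampledict['pat'] = 'green'

known = sampledict.get('sam', 'Not in Dictionary')      # 'blue'
unknown = sampledict.get('lee', 'Not in Dictionary')    # 'Not in Dictionary'

extra = {'pat': 'red', 'lee': 'gold'}
sampledict.update(extra)          # 'pat' is overwritten, 'lee' is added

names = sorted(sampledict.keys())     # ['lee', 'pat', 'sam']
present = 'lee' in sampledict         # membership test on the keys
```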

R. Strings

A string is a sequence of characters associated together, delimited as a constant in expressions by quote marks. Individual characters in a string may be referenced via C/C++ array notation, but no character in a string can be changed (strings are immutable). Note that the subscript -1 designates the last character in a string. Thus, given
              x = "Hello"
x[0] will return the value of 'H' and x[-1] will return the value of 'o'.

Attempting to change a character in the string, e.g., via

              x[0]='G'
will result in an error.

String operation: Concatenation of two strings can be achieved via the plus operator. Thus,

              y = "John" + "son"
results in y receiving the value of the string 'Johnson'. The plus operator, however, can be less efficient for joining several strings together than the intrinsic function join that is part of the string module.
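
A short sketch of string indexing and immutability. The slice x[1:], used to build a replacement string, is borrowed from the list notation shown earlier and is an addition for illustration.

```python
x = "Hello"
first = x[0]          # 'H'
last = x[-1]          # 'o'

try:
    x[0] = 'G'        # strings cannot be modified in place
    changed = True
except TypeError:
    changed = False

y = 'G' + x[1:]       # build a new string instead: 'Gello'
```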

S. Functions in the module string

If one imports the module string, one has access to a variety of powerful functions. The following is a non-exhaustive survey of the more important functions.

string.join: This will concatenate into one long string the individual strings given as a list. By default a single blank is inserted between the strings; an optional second parameter specifies a different separator (which may be the empty string). E.g.,

         string.join(["John", "son"], "")
results in the string 'Johnson', whereas string.join(["John", "son"]) results in 'John son'.

string.split: This will split the string provided as an argument at each 'white space', whether the white space is a tab, newline, or blank. Additional arguments are possible to change the "split" character or to determine how many "splits" should take place.

string.atof: This will take a character string (ASCII) and convert it to a floating-point number.

string.atoi: This will take a character string (ASCII) and convert it to an integer. An optional additional parameter specifies the base in which the string is to be interpreted.

string.atol: This will take a character string (ASCII) and convert it to a long integer.

string.strip: This will take a character string and remove white space at the beginning or end. The related functions string.rstrip and string.lstrip will strip white space from only the right or the left end of the input string.

string.find: This takes two character strings as arguments and returns an integer indicating the beginning location of the second string within the first string. E.g.,

           string.find("Mississippi","ss")
returns a value of 2. The return value is -1 if the substring does not appear in the first string. The related function string.rfind starts the search at the right end of the larger string. The function string.count returns the number of times the second string appears in the first string.

string.replace: This takes three character strings as arguments and creates a new string based on the first string (without actually changing it), in which every occurrence of the second string is replaced by the third. For example,

           string.replace('abracadabra','a','xx')
returns 'xxbrxxcxxdxxbrxx'.

Other functions enable a programmer to create a "translation" table for characters (e.g., a table relating ( to [, ) to ], ~ to !, etc.), and then invoke this table on a string to "translate" all characters found in the table into their "translation" characters.
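
The same operations are also available as methods of string objects, e.g., s.find(t) in place of string.find(s, t). The sketch below uses that method spelling, an assumption made here so that the example does not depend on the string module; the sample strings are those used above where possible.

```python
words = "To be or not to be".split()          # split at white space
joined = "-".join(["John", "son"])            # 'John-son'
trimmed = "  padded \n".strip()               # 'padded'
first = "Mississippi".find("ss")              # 2
last = "Mississippi".rfind("ss")              # 5
missing = "Mississippi".find("xyz")           # -1
howmany = "Mississippi".count("ss")           # 2
swapped = "abracadabra".replace("a", "xx")    # 'xxbrxxcxxdxxbrxx'
number = float("2.5") + int("ff", 16)         # 2.5 + 255 = 257.5
```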

SAMPLE CODE

The following example, taken from The Quick Python Book by D. Harms and K. McDonald, illustrates the power of Python in dealing with strings and dictionaries. It analyzes a string (in this case the explicit sample string is part of the code), and counts the number of times each word appears in the string. The dictionary occurrences stores the words as keys and associates with each word a number as its value indicating its frequency.

           import string
           SampleString = "To be or not to be"
           occurrences = {}
           for word in string.split(SampleString):
               occurrences[word] = occurrences.get(word, 0) + 1
           for word in occurrences.keys():
               print "The word", word, "occurs", \
                     occurrences[word], "time(s) in the string"

T. Regular Expressions and Pattern Matching

Python was designed with character string processing and manipulation in mind, attempting to incorporate some of the power inherent in Unix shell commands, Perl, and Tcl to manipulate strings.

For additional information as to how regular expressions are used in Unix/Linux, type man regexp at the Unix/Linux prompt.

Regular expression patterns in Python are similar to those in Unix. For example, ( ) are used to group alternatives separated by |, [ ] are used to group single character alternatives (or a range of characters specified with a -, as in a-z), and a + after [ ] is used to indicate one or more repetitions of characters within the brackets.

Regular expression functions are included in the re module which must be imported before use. If a regular expression is to be used repeatedly, it is usual practice to "compile" the expression to increase the speed of the code. As examples, the following are valid regular expressions:

     regexp1 = re.compile("hello|Hello")
     regexp2 = re.compile("(h|H)ello")
     regexp3 = re.compile("[hH]ello")
     regexp4 = re.compile("[a-zA-Z]+")
The first three expressions, regexp1, regexp2, regexp3, all designate the same two strings, "hello" and "Hello". The last expression, regexp4 represents any string consisting of upper case or lower case letters of the alphabet having at least one letter.
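
A minimal sketch of two of these compiled expressions applied to sample strings. The test strings are arbitrary, and findall (a method of compiled expressions that returns every match as a list) is not otherwise discussed in these notes.

```python
import re

regexp3 = re.compile("[hH]ello")
regexp4 = re.compile("[a-zA-Z]+")

hit = regexp3.search("Well, Hello there") is not None    # True
miss = regexp3.search("HELLO") is not None               # False: only the h may vary
words = regexp4.findall("abc 123 Def")                   # ['abc', 'Def']
```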

The following sample code, taken from The Quick Python Book, shows how to use regular expressions to search through lines of an input file and count how many contain the string "hello".

      import re
      regexp = re.compile("hello")
      count = 0
      file = open("textfile",'r')
      for line in file.readlines():
          if regexp.search(line):
              count = count + 1
      file.close()
      print count
Further information may be found on-line or in published books on Python.

U. The exec and eval Statements

Python has two commands similar to the eval function in LISP. Both take either a literal character string or a variable which has a character string as its value and interprets the character string as if it were typed in by a user.

The eval function takes an expression (either a literal character string or stored in a variable) within parentheses as its argument and returns the value as if the expression had been typed into the interpreter. For example,

          x=1
          y='x + 2'
          eval(y)
returns the value of 3. Similarly,
          eval('2 + 3')
returns the value of 5.

The exec statement consists of the keyword exec followed by a statement (either a literal character string or stored in a variable) without parentheses and executes the statement as if it were typed into the interpreter. A Python "statement" usually does not return a value as does an expression. It may, for instance, set a variable to a (new) value or perform some action. For example,

          exec 'x=2'
or
          c = 'x=2'
          exec c
sets x to the value of 2 but does not print out anything on the interpreter's screen.

Alternatively,

          exec 'print "Hello world!"'
or
          d = "Hello world!"
          exec 'print d'
prints Hello world! on the screen.
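
The difference between evaluating an expression and executing a statement can be sketched with an explicit dictionary that holds the variables. Both the dictionary namespace and the parenthesized call form of exec are assumptions made so that the sketch runs in a self-contained way; the notes above use the bare statement form.

```python
namespace = {'x': 1}

value = eval('x + 2', namespace)     # evaluates an expression: 3

exec('x = 2', namespace)             # executes a statement; no value is returned
updated = namespace['x']             # 2

again = eval('x + 2', namespace)     # 4, since x has changed
```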


This page is maintained by Dennis C. Smolarski, S.J. dsmolarski@math.scu.edu
Last updated: 23 October 2005. Minor corrections: 9 November 2005, 31 Jan 2006.