Copyright © 2006 David Schmidt

Chapter 5:
Mutable Structures: Lists and Dictionaries


5.1 Mutable data structures: permanent positions
5.2 Lists
    5.2.1 Vote counting
    5.2.2 Spreadsheets and game boards
    5.2.3 Semantics of list assignment
5.3 Design: Select a data structure
    5.3.1 Case study: Appointments calendar
5.4 Dictionaries
    5.4.1 Python dictionaries
    5.4.2 Case study: Appointments calendar rebuilt
    5.4.3 Semantics of dictionaries
5.5 Nested lists
    5.5.1 Case study: Appointments calendar as a grid
    5.5.2 Design: Lists versus grids: the slide puzzle
    5.5.3 Semantics of nested lists
5.6 Foundations: a loop maintains a list's data invariant
5.7 Design: When to use tuples, lists, and dictionaries
5.8 Summary


Strings and tuples are examples of data structures, because they are ``structures'' that hold multiple pieces of ``data.'' (Strings hold multiple characters, and tuples hold whatever you want to insert into them!)

Strings and tuples are also called immutable structures, because once you build one, you cannot alter its internal contents. This is a subtle point, but it's worth some thought. Say you construct this string:

"abcdefghijklm"
You might even name it:
mystring = "abcdefghijklm"
but you cannot change the internals of the string --- for example, you cannot replace the d in the middle by, say, "X". The best you can do is build a new string from the one you started with and assign the new string to the cell that held the original one:
mystring = mystring[:3] + "X" + mystring[4:]
You cannot do this assignment: mystring[3] = "X", which is trying to replace character 3 in the string by a new character.
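Both points can be tried out in a few lines. This is only a sketch: the names position, newstring, and assignment_worked are made up for the illustration, the try ... except lines are there only so that the refused assignment does not stop the program, and print is written with parentheses so that the lines also run on newer versions of Python.

```python
mystring = "abcdefghijklm"
position = 3

# build a NEW string from pieces of the old one; the old one is not altered
newstring = mystring[:position] + "X" + mystring[position + 1:]
print(newstring)   # abcXefghijklm
print(mystring)    # abcdefghijklm

# assigning to one character of a string is refused:
try:
    mystring[position] = "X"
    assignment_worked = True
except TypeError:
    assignment_worked = False
print(assignment_worked)   # False
```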

In a similar way, if we have a playing card:

mycard = ("hearts", "jack")
it makes no sense to alter the card after it is built:
mycard[1] = "ace"  # this is not allowed on a tuple !

In this chapter, we learn about two forms of mutable data structure that will let you change their internal contents by assignment: a list and a dictionary.

Remember to read Dawson, Chapter 5.


5.1 Mutable data structures: permanent positions

The previous chapter showed us that tuples are good for holding a ``small collection'' (like a playing card, or a pixel, or a date) or for collecting a sequence of values (like the words extracted from an English sentence or a deck of playing cards).

But there are other structures that are based on the idea of ``permanent position'' or ``index number'' or ``address'' or ``key.'' When you use a spreadsheet program, you see a grid of cells in front of you. You can insert numbers or strings into the cells, but the cells can also be empty. Further, you can erase or replace the number you placed in the cell with another. The cells are not data, but they are ``permanent positions'' that you can fill with data. Each cell has a position --- an index --- in the spreadsheet.

Yet another example is a chessboard, which is an 8-by-8 grid of squares. You can place a chess piece on a square and you can move the piece off the square, leaving it empty. But the board always keeps its permanent positions of 64 squares.

A third example is a calendar, where each day of the month has its own square. You can insert appointments into a square, cancel appointments, or you can have a day without appointments. But the days are permanently positioned in the calendar, and each has its own index number.

There is a special data structure for modelling spreadsheets, game boards, and calendars. The structure has cells into which one can insert and update information. (This makes the structure mutable.) The usual name for this structure is array or matrix, but in Python, it is called a list.

(If you have used arrays in languages like Basic, C, or Java, you will soon see that Python lists are different because they can grow as needed. Some languages use the name, vector, to describe an array that ``grows.'')


5.2 Lists

Here is the official definition: a Python list is a sequence of cells, indexed by the integers 0, 1, 2, and so on, where each cell is mutable (changeable). Each cell holds a value or element, which can be a number, string, tuple, and so on. The operations we learned for string and tuple sequences also work for lists, but most importantly, we can assign to specific cells within a list.

Here is an example: I want to model my clothes chest in the computer. The chest has a drawer for shirts, one for pants, and one for pairs of socks. Here is a list that describes the shirts, pants, and socks I have at the moment:

chest = ["3 shirts", "2 pants", "no socks"]
The chest holds 3 shirts, 2 pants, and no pairs of socks. The list is written as a sequence, separated by commas and surrounded by square brackets. (Remember that Python tuples are surrounded by rounded brackets --- parentheses.)

Here is a drawing of the list in computer storage:


I can print my pants count by using the indexing operation:

print "I have", chest[1]
And I will see printed I have 2 pants. A for-loop quickly prints my inventory:
for drawer in chest :
    print drawer
This prints
3 shirts
2 pants
no socks
Or, I can merely say print chest. (Try it!) This prints the list on one line, with its commas and brackets:
['3 shirts', '2 pants', 'no socks']

If I want to print the drawer number alongside its contents, I can use this loop:

for index in range(len(chest)) :
    print "drawer", index, "holds", chest[index]
which prints
drawer 0 holds 3 shirts
drawer 1 holds 2 pants
drawer 2 holds no socks
The function, len(LIST), computes the number of cells in its argument, LIST. (In the example, len(chest) computes to 3.) The function, range(N), is a Python shortcut for generating the list (sequence), [0, 1, 2, ..., N-1], so that range(len(chest)) generates exactly the indexes 0, 1, 2, that we should use to index the cells in chest. The use of range in for is not so elegant, but it makes the Python for-loop behave like a Fortran- or C-style for-loop.
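Here is a small sketch of how len and range fit together. The names size, indexes, and last_index are invented for the example. The call list(range(size)) is used so that the result is a list on newer versions of Python as well as on the version used in these notes, and print is written with parentheses for the same reason.

```python
chest = ["3 shirts", "2 pants", "no socks"]

size = len(chest)             # the number of cells: 3
indexes = list(range(size))   # the indexes of those cells
print(indexes)                # [0, 1, 2]

# the last cell lives at index  len(chest) - 1,  not at  len(chest)
last_index = size - 1
print(chest[last_index])      # no socks
```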

If I launder my socks and place them into the chest, I can update my inventory with an assignment statement to the appropriate cell of the list:

chest[2] = "3 pairs of socks"
This replaces "no socks" by "3 pairs of socks" in cell 2 of the list. Such an assignment is impossible to do on a tuple. (The reason is that tuples are stored differently inside computer storage than are lists. A tuple's size and contents are fixed when it is built, so it can be stored compactly. A list is stored with extra room and bookkeeping so that its cells can be changed and new cells can be added.)

There is a method that makes a list grow by one cell to hold a new value: LIST.append(EXPRESSION). For example, say that we have

chest = ["3 shirts", "2 pants", "no socks"]
and we wish to add a drawer for shorts:
chest.append("5 shorts")
print chest shows us:
['3 shirts', '2 pants', 'no socks', '5 shorts']
append is useful for building a list when we do not know the correct size in advance. For example, this loop builds a list to hold all the positive numbers supplied by a user as input:
===============

number_list = []  # list starts empty
OK = True
while OK :
    num = int(raw_input("Type a positive number (or a nonpositive, to quit): "))
    if num > 0 :
        number_list.append(num)  # place  num  in a new cell in  number_list
    else :
        OK = False
print number_list

================
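To watch the loop work without typing at the keyboard, we can let a list stand in for the user's input. This is a sketch under that assumption: the list typed and the counter position are invented for the illustration and are not part of the program above.

```python
typed = [4, 9, 2, 0, 7]   # stands in for the numbers a user would type

number_list = []          # list starts empty
position = 0              # which "typed" number is read next
OK = True
while OK and position < len(typed):
    num = typed[position]
    position = position + 1
    if num > 0:
        number_list.append(num)   # place  num  in a new cell
    else:
        OK = False                # a nonpositive number means quit
print(number_list)        # [4, 9, 2]
```

The 7 is never stored, because the loop quits as soon as it reads the 0.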

Finally, we can store other items than strings and numbers into a list. Here is an example, where each drawer of the chest holds a ``label'' of the items saved in the drawer plus a count of the items:

chest = [ ["shirts", 3], ["pants", 2], ["socks", 0] ]
Each cell of chest holds a two-celled list: a string (the label) followed by an integer (the count). I can print the inventory like this:
for drawer in chest:
  print "I have", drawer[1], drawer[0]
This prints
I have 3 shirts
I have 2 pants
I have 0 socks
When I wash three pairs of socks, the inventory can be updated like this:
chest[2][1] = chest[2][1] + 3
We make good use of such nested lists later in the chapter.
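The double indexing reads from left to right: chest[2] selects a drawer, and the [1] that follows selects the count inside that drawer. A short sketch (the variable total is made up for the example; print uses parentheses so the lines also run on newer versions of Python):

```python
chest = [["shirts", 3], ["pants", 2], ["socks", 0]]

# wash three pairs of socks: drawer 2, then cell 1 of that drawer
chest[2][1] = chest[2][1] + 3
print(chest[2])    # ['socks', 3]

# add up the counts held in cell 1 of every drawer
total = 0
for drawer in chest:
    total = total + drawer[1]
print(total)       # 8
```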


5.2.1 Vote counting

A classic use for lists is vote counting, where a sequence of randomly supplied inputs is tallied into categories.

Perhaps you must write a program that tallies the votes of a four-candidate election. (For simplicity, we will say that the candidates' ``names'' are Candidate 0, Candidate 1, Candidate 2, and Candidate 3. ) Votes arrive one at a time, where a vote for Candidate i is denoted by the number, i. For example, two votes for Candidate 3 followed by one vote for Candidate 0 would appear:

Type your vote (0,1,2,3): 3
Type your vote (0,1,2,3): 3
Type your vote (0,1,2,3): 0
and so on.

Vote counting goes smoothly when we use a list that holds the totals for the candidates:

votes = 4 * [0]
This makes a list of four cells, where each cell's value starts at zero. Of course, votes[0] holds Candidate 0's votes, and so on. When a vote arrives, it must be added to the appropriate element:
v = int(raw_input("Type your vote (0,1,2,3): "))
votes[v] = votes[v] + 1
The algorithm for vote counting follows the ``input processing pattern'':
===================================================

processing = True
while processing :
    v = int(raw_input("Type your vote (0,1,2,3): "))
    if v < 0 or v > 3 :
        processing = False   # bad vote, so quit
    else :
        votes[v] = votes[v] + 1

========================================================
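Here is the same counting loop in a form that can be run without a keyboard. It is only a sketch: the list cast, which stands in for the votes typed by the users, and the counter position are assumptions made for the illustration.

```python
cast = [3, 3, 0, 1, 3, 7, 2]   # made-up votes; the 7 is a bad vote

votes = 4 * [0]                # one cell per candidate, each starting at 0
position = 0
processing = True
while processing and position < len(cast):
    v = cast[position]
    position = position + 1
    if v < 0 or v > 3:
        processing = False     # bad vote, so quit
    else:
        votes[v] = votes[v] + 1
print(votes)                   # [1, 1, 0, 3]
```

The vote for Candidate 2 that follows the bad vote is not counted, because the loop has already quit.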
Once all the votes are counted, they must be printed. The simplest way is to state
print votes  #  this prints the entire list, e.g.,  [25, 44, 12, 2]
Alternatively, we can state
for candidate in votes :
    print candidate,
print
This prints the list's cells' contents on one line, like this:
25 44 12 2
To label the output in a useful way, we can use a while-loop:
count = 0
while count != len(votes) :
    print "Candidate", count, "has", votes[count], "votes"
    count = count + 1
which would print
Candidate 0 has 25 votes
Candidate 1 has 44 votes
Candidate 2 has 12 votes
Candidate 3 has 2 votes
The same printout comes from this for-loop:
for count in range(len(votes)) :
    print "Candidate", count, "has", votes[count], "votes"
Ensure that you understand all these different ways of printing the vote totals.

Finally, what would happen if a person casts a vote for Candidate 5 (who does not exist)? The above program prevents such a vote, because if we tried to perform,

votes[5] = votes[5] + 1
the program would stop with this error message:
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: list index out of range
Since 5 is not an acceptable index to one of the cells in votes, there is an IndexError.

Exercises

  1. Modify the vote-counting program so that it first asks for the number of candidates in the election. After the number is typed, then votes are cast as usual and the results are printed.
  2. Modify the program so that it first requests the names of the candidates. After the names are typed, the votes are cast as usual and the results are printed with each candidate's name and votes. (Hint: Use a second list to hold the names.)
  3. Write a program that reads a series of integers in the range 1 to 20. The input is terminated by an integer that does not fall in this range. For its output, the program prints the integer(s) that appeared most often in the input, the integer(s) that appeared least often, and the average of all the inputs.


5.2.2 Spreadsheets and game boards

Spreadsheets, calendars, and game boards are well modelled by lists, because they are structures with a fixed number of cells.

Spreadsheets

A calendar is just a spreadsheet where each cell is numbered by a day of the month. Here is a series of examples. First, to make a large list to represent the 28 days of February, we say,

february = [ "" ] * 28
This makes a 28-celled list named february. (The * 28 ``multiplies'' the one-celled list into a 28-celled one.)
print february
shows the list of 28 empty strings.
february[13] = "valentine's day"
inserts a message into cell 13 of the list. (Remember that the indexes for the list start with 0, so cell number 13 is the fourteenth day in the list.)

We can print the calendar's contents, one week at a time, like this:

for count in range(len(february)) :  # counts from 0 to 27
    if count % 7 == 0 :
        print  # start a new week on a new line
    print "February", count+1, ":", february[count], ". ",
The example can be improved if we make ``February 1'' print at the correct position for the day of the week when it falls. (This is a good exercise to work.)

Gameboards

A chessboard's squares are cells, and we can use a numbering scheme based on 0,1,2,... This works surprisingly well. A chessboard has 64 squares:

ROW_SIZE = 8
BOARD_SIZE = ROW_SIZE * ROW_SIZE
board = [ () ] * BOARD_SIZE   # place an empty tuple on each square
Chess pieces can be placed on the squares like this:
board[6] = ("white", "pawn")
If we wish to move the piece on square k to the right by one square, we compute the new position and then make two assignments to the board:
new_position = k + 1
board[new_position] = board[k]
board[k] = ()
If the last assignment, board[k] = (), is omitted, then the chess piece would rest on both squares simultaneously.

Say that we want to move a chess piece at square k downwards one row; it's this simple:

new_position = k + ROW_SIZE
board[new_position] = board[k]
board[k] = ()
In both the previous examples, we should add if-commands that ensure that the new_position does not extend ``off the board.'' This is left for you to do as an exercise.

A chessboard should be displayed as a square of 8 rows, each of which holds 8 squares. We use this standard technique to print the list as a square:

ROW_SIZE = 8
for count in range(BOARD_SIZE) :
    if count % ROW_SIZE == 0 :   # start each row on a new line
        print
    print board[count],          # print the square's content, no newline
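The 0,1,2,... numbering works because a square's index determines its row and its column. Here is a minimal sketch, assuming that rows and columns are both numbered from 0; the names row, column, and index are made up for the example.

```python
ROW_SIZE = 8

k = 14                     # a square's index in the one-dimensional board
row = k // ROW_SIZE        # whole-number division gives the row
column = k % ROW_SIZE      # the remainder gives the column
print(row)                 # 1
print(column)              # 6

# going the other way recovers the index
index = row * ROW_SIZE + column
print(index)               # 14
```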

Later in this chapter we learn how to make and manipulate a true grid, which is a list of 8 rows (cells), where each row (cell) itself holds a list of 8 squares:

# make the board:
board = [ ]
for row in range(ROW_SIZE) :
    board.append([ () ] * ROW_SIZE)

board[5][5] = ("white", "pawn")  # notice the _two_ indexes -- row 5, cell 5 !

# print the board, one row per line:
for row in board :
    for square in row :
        print square, 
    print
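The board above is built with a loop, one fresh row at a time, and not with the tempting shortcut [[ () ] * ROW_SIZE] * ROW_SIZE. The reason is the aliasing explained in the next section: the shortcut repeats one and the same row. The sketch below uses a 3-by-3 board to keep things small; the names shared and board are chosen just for this illustration.

```python
ROW_SIZE = 3

# the shortcut: every cell of  shared  holds the SAME row
shared = [[()] * ROW_SIZE] * ROW_SIZE
shared[0][0] = "pawn"
print(shared[1][0])    # pawn   (the pawn shows up in every row!)

# the loop: every cell of  board  holds its own row
board = []
for row in range(ROW_SIZE):
    board.append([()] * ROW_SIZE)
board[0][0] = "pawn"
print(board[1][0])     # ()     (the other rows are untouched)
```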


5.2.3 Semantics of list assignment

Because lists are mutable --- they have cells that can be updated by assignment --- they are too large to be saved in a program's namespace. (The reasons for this are explained in the Data Structures course that you take after this course.) Instead, lists are saved in a new storage region, called the heap:

Say that a Python program contains this command:
chest = ["3 shirts", "2 pants"]
The command constructs the list of two strings in the heap, remembers the storage address where the list was made, and assigns that address to the variable in the namespace:

The diagram shows that the list was constructed at some address, say, addr1, in the heap, and addr1 is assigned to chest. Later, if the list's value is needed, as in,
print chest[1]
the expression, chest[1], is computed like this:
  1. The value of chest is fetched; it is addr1.
  2. Since addr1 is a heap address, the list at addr1 is found.
  3. The list is indexed by 1 and the answer, "2 pants", is returned.
Read the steps like this:
print chest[1]  ==>  print addr1[1]  ==>  print "2 pants"

Although the Python interpreter tries to hide from you the heap and the addresses of lists, it is not completely successful. Here is an example:

none = "nothing here"
chest = ["3 shirts", "2 pants"]
box = []
box = chest    # box  holds the same address as  chest !
box[0] = none
print chest    # prints  ['nothing here', '2 pants'] !
As noted in the commentary, the assignment, box = chest, places the address of chest's list into box's cell, meaning that assignments to box's cells also affect chest's cells. This is called aliasing.

We play out the example in steps: The beginning configuration has an empty namespace and heap:


After the first three commands are executed, we see that two lists are constructed in the heap and their addresses are assigned to variables in the namespace:

The next assignment, box = chest assigns chest's address to box:

and this means the assignment that follows updates the list at addr1:

The update affects chest, as we see when the print command executes:

The example suggests aliasing is bad, but this is too harsh a statement. It is true that aliasing can be confusing in examples like the one above, but programs that build large databases of information use aliasing as a trick to link together data structures and reduce duplicate information.

For example, a database for the public library uses aliasing to ``link'' each patron to a list of books borrowed, and each borrowed book is linked to a list holding the book's bibliographic information. The links (aliases) eliminate duplicating information that would otherwise overwhelm the computer's storage. You will learn in a Data Structures course how to use aliasing in this smart way.
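When aliasing is not wanted, a program can ask for a separate copy of a list. The sketch below shows one way, the slice chest[:], which builds a new list holding the same elements. It also uses the operator is, not covered in this chapter, which asks whether two names hold the same address. The names alias and duplicate are made up for the example.

```python
chest = ["3 shirts", "2 pants"]

alias = chest          # same address: no new list is built
duplicate = chest[:]   # a slice builds a new list with the same elements

alias[0] = "nothing here"

print(chest)               # ['nothing here', '2 pants']
print(duplicate)           # ['3 shirts', '2 pants']
print(alias is chest)      # True
print(duplicate is chest)  # False
```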

Exercises

Draw the execution semantics of these examples and then test the examples with the Python interpreter:
  1. x = [1, 2, 3]
    x[1] = "a"
    print x
    
  2. x = []
    for cell in x :
        print cell
    
  3. x = []
    for i in range(4) :
        x.append(i)
    print x
    
  4. vec = [("shirts", 2)]
    print vec[0]
    vec[0][0] = 5
    print vec[0]
    
  5. vec = [1, 2, 3]
    for cell in vec :
        cell = 0
    print vec
    
  6. none = 0
    items = ("shirts", "pants", "socks")
    chest = [3, 2, none]
    for index in range(len(chest)) :
        print chest[index], items[index]
    
    chest[2] = chest[2] + 3
    
    new_item = "shorts"
    items = items + (new_item,)
    chest.append(5)
    


5.3 Design: Select a data structure

Back in Lecture 2, we learned the basic steps of program design. Now that we know about data structures (tuples and lists, in particular), it's time to revise the design steps:
1. describe the program's intended behavior
2. select (and draw) the program's data structure(s)
3. design the program's algorithm (which ``controls'' the data structure)
4. write the program from the algorithm and test it
5. document the program
Selecting the data structure is often the crucial step in building a program --- if we can draw a picture of the data structure that the program must build and maintain, then the rest of the design falls into place.

We now consider an example.


5.3.1 Case study: Appointments calendar

The techniques we saw in the clothes-chest example are useful for programs like spreadsheets, which collect, save, and display information. As a first project, let's study how a simple appointments calendar would be designed and built with lists.

Keeping it small, let's say that the calendar holds appointments for exactly one afternoon of a day --- its user can make appointments during the noon hour, and at the hours 1pm, 2pm, 3pm, and 4pm. The behavior goes like this, where the user types information in response to the prompts:

$ python Appointments.py

Type request: M(ake appt), V(iew appts), Q(uit): M
Type appt. time (hh:mm): 12:30
Type your appointment: lunch

Type request: M(ake appt), V(iew appts), Q(uit): M
Type appt. time (hh:mm): 2:00
Type your appointment: snack

Type request: M(ake appt), V(iew appts), Q(uit): 12:55
I don't understand your request; please try again.

Type request: M(ake appt), V(iew appts), Q(uit): m
Type appt. time (hh:mm): 12:55
Type your appointment: nap

Type request: M(ake appt), V(iew appts), Q(uit): V
Your appointments: 
12:30: lunch
12:55: nap
2:00: snack

Type request: M(ake appt), V(iew appts), Q(uit): Q
The program that makes the calendar come to life holds appointments for each of the 5 hours of the afternoon, and some hours might have no appointments at all. This suggests we use a list whose cells represent the 5 hours for appointments. Here is a sketch of the data structure:

The picture shows that each cell will hold zero or more appointments. How might we save a sequence of appointments in a cell? A tuple of appointments is a good solution (but it is not the only one).

How is the list maintained? The answer is given by the algorithm we write, which looks like the standard ``input-transaction processing'' pattern:

processing = True
while processing :
    READ AN INPUT TRANSACTION;
    if THE TRANSACTION INDICATES THAT THE LOOP SHOULD STOP :
        processing = False
    else :
        PROCESS THE TRANSACTION
For the appointment manager, the PROCESS THE TRANSACTION part will ask if the user wants to make a new appointment or view the whole calendar. Here is the algorithm:
# start with an empty calendar for noon, 1pm, 2pm, 3pm, 4pm:
calendar = [(), (), (), (), ()]

processing = True
while processing :
    read the request code ("M" or "V" or "Q")
    if the request code == "Q" : # Quit
        processing = False
    elif code == "V" : # View all appointments
        print all the appointments in the calendar
    elif code == "M" : # Make a new appointment
        read the time of the new appointment
        read the appointment text
        store the appointment into the  calendar
Now, let's refine the "V" (print all the appointments in the calendar) part --- it is a for-loop:
for hour in calendar :
    ... print all the appointments in the  hour  (*)
Since each hour (cell) of the calendar holds a tuple, we do line (*) with another loop, which prints the contents of the tuple:
for hour in calendar :
    for appointment in hour :
        print appointment

Next, consider how we might make an appointment. First, we read the time of the new appointment (e.g., "12:30") and the text ("lunch"). We concatenate these two strings together and store the information into the tuple that lives at the cell for the noon hour. The coding will look something like this:

# appt_time =  ... READ "12:30"
# appt =  ... READ "lunch"
# appt_hour = 0  # we somehow calculate this from the "12" in "12:30"!
new_appt = appt_time + ": " + appt
calendar[appt_hour] = calendar[appt_hour] + (new_appt,)
But the missing part is: how do we extract the appt_hour, 12, from the string, "12:30" ? The answer uses a new Python trick, a method named split:
# split the time in two, into its hours part and its minutes part:
hour_and_minutes = appt_time.split(":")  # split  is a string method
# get the hours part from the list,  [hours, minutes]:
appt_hour = int(hour_and_minutes[0])
The new method, STRING.split(SEPARATOR_STRING), examines STRING and ``splits'' it into a list of those pieces that are separated by SEPARATOR_STRING. For example,
appt_time = "12:30"
hour_and_minutes = appt_time.split(":")
print hour_and_minutes 
prints the list, ['12', '30']. The split method is a good trick to use when extracting numbers (or individual words) from a string.
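Here is a sketch that carries the idea one step further, from the typed time to the number of the calendar cell. It assumes the five-celled calendar described above, with the noon hour kept in cell 0. The names appt_minute, cell, and in_range are invented for the illustration.

```python
appt_time = "2:05"

hour_and_minutes = appt_time.split(":")   # the list  ['2', '05']
appt_hour = int(hour_and_minutes[0])      # 2
appt_minute = int(hour_and_minutes[1])    # 5

# the calendar has cells 0..4, and noon (12) is kept in cell 0
if appt_hour == 12:
    cell = 0
else:
    cell = appt_hour

in_range = 0 <= cell <= 4   # is there really such a cell?
print(cell)                 # 2
print(in_range)             # True
```

A time like "7:15" would give in_range the value False, which a program can check before it indexes the calendar.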

Figure 1 assembles the program:

FIGURE 1==============================

# Appointments
#   maintains a calendar of afternoon appointments from noon up to 5pm.
# assumed input: a series of requests of these three forms:
#   (i) M  (for ``Make a new appointment'')
#       followed by the time, typed in the format,  hh:mm
#       followed by the appointment, typed on a single line
#   (ii) V (for ``View all appointments'')
#   (iii) Q (for ``Quit'')
# guaranteed output: when  V  is typed, a listing of all appointments is printed

# start with an empty calendar for noon, 1pm, 2pm, 3pm, 4pm:
calendar = [(), (), (), (), ()] 

more_requests = True

while more_requests :

    request = raw_input("\nType request: M(ake appt), V(iew appts), Q(uit): ")
    code = request[0].upper()

    if code == "Q" :  # Quit
        more_requests = False

    elif code == "V" : # View all appointments
        print "Your appointments: "
        for hour in calendar :
            for appointment in hour :
                print appointment

    elif code == "M" : # Make a new appointment
        # read the time of the appointment:
        appt_time = raw_input("Type appt. time (hh:mm): ")
        # split the time in two, into its hours part and its minutes part:
        hour_and_minutes = appt_time.split(":")  # split  is a string method
        # get the hours part from the list,  [hours, minutes]:
        appt_hour = int(hour_and_minutes[0])
        # read the appointment:
        appt = raw_input("Type your appointment: ")
        # save it in the calendar;  first, treat noon hour (12) as 0 hour:
        if appt_hour == 12 :
            appt_hour = 0
        new_appt = appt_time + ": " + appt
        calendar[appt_hour] = calendar[appt_hour] + (new_appt,)
    else :
        print "I don't understand your request; please try again."

print "Have a nice day."

raw_input("\n\npress Enter to finish")

ENDFIGURE=============================================
Here is what the data structure looks like after the example appointments have been added to it:

Now we should test the program with the behavior seen earlier and try other behaviors, too.

Exercise

Modify the calendar manager to behave as follows:
  1. Add a command, H, which lets a user view the appointments for one specific hour. For example:
    Type request: M(ake appt), V(iew appts), H(our to view), Q(uit): H
    Type appt. hour: 12
    Your appointments:
    12:30: lunch
    12:55: nap
    
  2. Make the program refuse to add an appointment if there is another appointment already scheduled at the exact same time. (Hint: an appointment is no longer stored as a string, but as a tuple, (hour, minute, text).)
  3. Add a command that lets the user delete an appointment at a specific time.


5.4 Dictionaries

So far, we have studied three forms of sequences --- strings, tuples, and lists. When we extract information from a sequence, we use integer indexes: 0, 1, 2, etc. We used sequences to model real-life cells and grids (the appointments manager and slide-puzzle), but not every problem can be solved in this way. In particular, sequence data structures do not work well with data that have keys.

Consider a library's card catalog. Each library book must be entered into the card catalog, where the book is uniquely identified by its key --- its catalog number. Such catalog numbers are typically a mix of letters and numbers and punctuation, e.g., QA79.03. There is no simple way of using integer codes like 0, 1, 2, ... to stand for the catalog numbers.

Another example is a telephone book, where a person's name is used as the key for finding the person's telephone number:


Yet another example is a numerical phone book, where the telephone number is used as the key for finding the name of the number's owner.

Indeed, just about every real-life database uses keys to organize and find information.

A data structure that holds elements, each of which has a unique key, is called a dictionary. (The name comes from real-life dictionaries, which are collections of definitions whose keys are words.) Python is one of the few languages that has a dictionary data structure ready for you to use.

(Other languages, like C and Pascal, have a simpler version of dictionary, called a record structure or struct. You will soon see that, unlike record structures, Python dictionaries can grow as needed and the set of allowable keys/fieldnames is not fixed in advance.)


5.4.1 Python dictionaries

A Python dictionary is written as a collection of KEY : ELEMENT pairs, separated by commas, surrounded by braces. Here is an example of a telephone book for three persons, where the persons' names (strings) are used as the keys to find the phone numbers:
tel_book = {"Jane Sprat": 4098, "Jake Jones": 4139, "Lana Lang": 2135}
To look up Lana Lang's phone number, we use her name as the index (key):
print tel_book["Lana Lang"]
This prints 2135. But it is an error if we look up a key that is not in the dictionary. For this reason we should first check whether the key is present, e.g.,
if "Mary Hartman" in tel_book :
    print tel_book["Mary Hartman"]
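The check can be written out in full, or it can be handed to the dictionary method get, which returns the special value None when the key is absent. The sketch below shows both; the names name, number, same_number, and lana are made up for the example, and print uses parentheses so the lines also run on newer versions of Python.

```python
tel_book = {"Jane Sprat": 4098, "Jake Jones": 4139, "Lana Lang": 2135}

name = "Mary Hartman"
if name in tel_book:
    number = tel_book[name]
else:
    number = None          # None marks "no such entry"
print(number)              # None

# the method  get  does the same check in one step
same_number = tel_book.get(name)
lana = tel_book.get("Lana Lang")
print(same_number)         # None
print(lana)                # 2135
```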

If Lana Lang receives a new phone number, say, 9999, we can update the dictionary:

tel_book["Lana Lang"] = 9999
We can print the contents of the telephone book this simply:
print tel_book
For the above example, we see that the entries are not printed alphabetically by key, but like this:
{'Jake Jones': 4139, 'Lana Lang': 9999, 'Jane Sprat': 4098}
The reason for the ``mixed up'' printout is that the dictionary is stored within the computer as a structure called a hash table (which we will not study here --- you will learn about it in your Data Structures course).

If a new person is added to the telephone book, we can update the dictionary with an assignment, e.g.,

tel_book["Mary Hartman"] = 6744
but we cannot add a second person with the same name (key) as someone already in the dictionary --- every element must have a distinct key, so an assignment that uses an existing key replaces that key's element rather than adding a new one.

We can easily delete an entry from a dictionary, e.g.,

moved_away = "Jake Jones"
if moved_away in tel_book :
    del tel_book[moved_away]
We use a for-loop to systematically examine all the keys in a dictionary and so process all the dictionary's elements:
for k in tel_book :
    print k, tel_book[k]
(The loop looks at the keys, one by one, held in the dictionary.)

Finally, to print the dictionary's contents, ordered by its keys, you can use two new Python methods, keys and sort:

keylist = tel_book.keys()  # builds a list of all the keys used in  tel_book
keylist.sort()  # sorts (reorders) the list alphabetically
for k in keylist :  
    print k, ":", tel_book[k]
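A shorter route to the same printout is the built-in function sorted, which builds a sorted list of a dictionary's keys in one step. This is offered as an alternative sketch, not as a replacement for the loop above. It has the side benefit of also running on newer versions of Python, where keys no longer returns a list that can be sorted in place. The name ordered is made up for the example.

```python
tel_book = {"Jane Sprat": 4098, "Jake Jones": 4139, "Lana Lang": 2135}

ordered = sorted(tel_book)   # a new list of the keys, in alphabetical order
print(ordered)               # ['Jake Jones', 'Jane Sprat', 'Lana Lang']

for k in ordered:
    print(k + " : " + str(tel_book[k]))
```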

Of course, a dictionary can store arbitrary values with its keys. Here is a modified phone book, which keeps addresses and phone numbers:

tel_book = {"Jane Sprat": ("200 First St.", 4098),
            "Jake Jones": ("County Jail", 4139),
            "Lana Lang": ("Smallville", 2135) }

A dictionary's keys are usually strings, but we can also use integers. Here is the clothes-chest example, which started this chapter, expressed as a dictionary with integer keys:

chest = {0: "3 shirts", 1: "2 pants", 2: "no socks"}
Here is yet another modelling of the chest, where the drawers are named by the items they hold:
chest = {"shirts": 3, "pants": 2, "socks": 0}
Indeed, any immutable value (strings, numbers, booleans, even tuples) can be used as a key to a dictionary. The keys can be mixed, as can be the values:
places_Ive_been_in_my_town = { "101 Elm St." : "my house", 
                               ("3rd", "Vine") : "burger king",
                               100 : ["mall", "food court"] }
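The restriction to immutable keys can be seen by trying a list as a key. In this sketch the dictionary places and the flag list_key_allowed are invented for the illustration, and the try ... except lines are there only so that the refused assignment does not stop the program.

```python
places = {("3rd", "Vine"): "burger king"}   # a tuple key is fine
print(places[("3rd", "Vine")])              # burger king

# a list is mutable, so it is refused as a key:
try:
    places[["3rd", "Vine"]] = "burger king"
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
print(list_key_allowed)                     # False
```

Notice that a list is perfectly acceptable as an element, as in the entry for key 100 above; only the keys must be immutable.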


5.4.2 Case study: Appointments calendar rebuilt

The appointments-manager program we built earlier in this chapter used a list, where the list's cells represented the hours of the day; appointments were saved within tuples. If we reconsider the program, we realize that each appointment has a ``key'' --- the time of the appointment.

Recall the behavior of the appointments manager: Its user types the time and purpose of each appointment:

Type request: M(ake appt), V(iew appts), Q(uit): M
Type appt. time (hh:mm): 12:30
Type your appointment: lunch

Type request: M(ake appt), V(iew appts), Q(uit): M
Type appt. time (hh:mm): 2:00
Type your appointment: snack

Type request: M(ake appt), V(iew appts), Q(uit): m
Type appt. time (hh:mm): 12:55
Type your appointment: nap

Type request: M(ake appt), V(iew appts), Q(uit): V
Your appointments:
12:30: lunch
12:55: nap
2:00: snack
An appointment can be viewed as a ``record'' whose ``key'' is the time of the appointment. Here is a drawing of the appointments organized as a dictionary:

We can reuse most of the algorithm we designed for the appointment manager. Indeed, the replacement of the list by the dictionary makes the program shorter and simpler because we use an appointment's time as its key for insertion:

FIGURE=============================================

# Appointments
#   maintains a calendar of appointments from noon to 11:59pm

# the calendar begins as an empty dictionary
calendar = {}

more_requests = True

while more_requests :

    request = raw_input("Type request: M(ake appt), V(iew appts), Q(uit): ")
    code = request[0].upper()

    if code == "Q" :  # Quit
        more_requests = False

    elif code == "V" : # View all appointments
        print "Your appointments: "
        # print all appointments:
        for time in calendar :   # examine all keys
            print time, calendar[time]

    elif code == "M" : # Make a new appointment
        # get the time of the appointment:
        appt_time = raw_input("Type appt. time (hh:mm): ")
        # get the appointment:
        appt = raw_input("Type your appointment: ")
        # save it in the calendar:
        calendar[appt_time] = appt
    else :
        print "I don't understand your request; please try again."

print calendar # print the whole thing

raw_input("\nHave a nice day.")

ENDFIGURE======================================


5.4.3 Semantics of dictionaries

Since dictionaries are mutable data structures, they are stored in the heap, like lists. A dictionary is a collection of KEY: ELEMENT pairs, and the pairs are not saved in any certain order. Here is how we will display dictionaries in the heap:
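In the Namespace/Heap notation used for lists in this chapter, a small dictionary might be pictured as in the comment at the top of the sketch below (the address name addr1 is illustrative). The code then checks one consequence of heap storage: assigning a dictionary to a second variable copies only its address, so both names refer to the same dictionary. The variable names are invented for the illustration.

```python
# Namespace:       | Heap:
# chest : addr1    | addr1: { "shirts": 3, "pants": 2 }
chest = { "shirts" : 3, "pants" : 2 }
same_chest = chest            # copies the address addr1, not the dictionary

same_chest["socks"] = 0       # update through the second name ...
print("socks" in chest)       # True: the first name sees the new key

separate = dict(chest)        # dict(...) builds a new dictionary in the heap
separate["hats"] = 1
print("hats" in chest)        # False: the original is unaffected
```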


Exercises

Improve the appointments manager so that
  1. if a new appointment is entered at the same time as an existing appointment, the new appointment is not entered and an error message is printed;
  2. the user can delete an appointment by stating its time;
  3. the appointments print so that they are sorted by the time of day.


5.5 Nested lists

A grid or a matrix is a data structure where the cells are identified by two indexes -- a ``latitude'' and a ``longitude,'' like on a map. The indexes are usually called the row number and the column number.

To understand this, think about how we draw a spreadsheet or a game board: it is a grid that looks like this:


(The numberings will be explained in a moment.) We insert data (numbers or words or chess pieces) into the grid's cells.

What is important here is the numbering --- rather than think about cell ``number 7'' or cell ``number 14,'' we think about the cell in ``row 1, column 2'' or ``row 3, column 1'' --- there are two dimensions to the structure we design.

In Python we build such a structure with a list of lists:

[["00", "01", "02", "03"], ["10", "11", "12", "13"], ["20", "21", "22", "23"], ["30", "31", "32", "33"]]
To see this more clearly, let's arrange the nested list into its ``rows'':
[ ["00", "01", "02", "03"],
  ["10", "11", "12", "13"],
  ["20", "21", "22", "23"],
  ["30", "31", "32", "33"] ]

This form of nested list is often called a matrix or two-dimensional array. The numbering used in the example matrix just seen is not an accident: If we write

grid = [ ["00", "01", "02", "03"],
         ["10", "11", "12", "13"],
         ["20", "21", "22", "23"],
         ["30", "31", "32", "33"] ]
and then say
print grid[1]
we would see
['10', '11', '12', '13']
which corresponds to ``row 1'' of the matrix. If we wish to fetch and print the string "21" in grid, we would say
print grid[2][1]
which names the string in ``row 2'', ``column 1'' in grid.

Here are some other exercises that you should try:

# print the grid in four lines, one row per line:
for row in grid :
    print row
# print all the strings in the grid, listed one row per line on the display:
for row in grid :
    for cell in row :
        print cell,
    print   # start a new line
# print each string on its own line, labelled with its row-column position:
size = len(grid) # how many rows are in the grid
for i in range(size) :
    for j in range(size) :  # we are assuming that the grid is _square_
        print "row", i, "column", j, ": ", grid[i][j]
# build a square matrix, the size given by the user, filled with strings
# that show the cell numbers, like in the above example:
size = int(raw_input("Type a grid size (a nonnegative int): "))
grid = []
if size > 0 :
    for i in range(size) :
        new_row = []
        for j in range(size) :
            value = str(i) + str(j) 
            new_row.append(value)
        grid.append(new_row)
# place "XX"s across the diagonal of  grid:
for i in range(size) :
    grid[i][i] = "XX"

Grids are best used when you model cells whose addresses are based on two index numbers. A good example is a calendar for an entire year, where each day of the year is indexed by two coordinates --- the month and day. Another example is the layout of a multi-story hotel, where each room is indexed by its floor number and its room number. Gameboards, where the squares are identified by row, column numbers, are another example where grids might be used.


5.5.1 Case study: Appointments calendar as a grid

Let's return to the appointments manager one more time. Perhaps we decide that the hours of the day are subdivided into minutes. Thus, an appointment like this one:
Type request: M(ake appt), V(iew appts), Q(uit): M
Type appt. time (hh:mm): 2:00
Type your appointment: snack
is a request to file the appointment, "snack", at the two coordinates 2 (for the row) and 00 (for the column). This means the calendar is modelled as a huge grid (nested list) with 5 rows, each row containing 60 elements (one for each minute of the hour).

The grid is created like this:

HOURS = 5  # for  noon, 1pm, 2pm, 3pm, 4pm
MINUTES = 60  # in each hour

# build the nested list:
calendar = []   # will hold all the appointments
for i in range(HOURS) :  # count from 0 to 4
    calendar.append(MINUTES * [""])  # add one hour (row) of cells to  calendar
If you print calendar, you will see 300 cells, each containing an empty string.

When we make a new appointment, we split the appointment time into its row and column coordinates and place the appointment in the selected cell (provided the cell isn't already filled with a previous appointment):

appt_time = raw_input("Type appt. time (hh:mm): ")
hour_and_minutes = appt_time.split(":")  # split the time in two
hour = int(hour_and_minutes[0])  # get the hours and minutes:
minute = int(hour_and_minutes[1])

# read the appointment:
appt = raw_input("Type your appointment: ")

if hour == 12 :   # convert noon hour into  hour = 0 :
    hour = 0
if calendar[hour][minute] != "" :   # is the cell already filled ?
    print "error: you already have an appointment at that time"
else :
    calendar[hour][minute] = appt
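The fragment above trusts the user to type a time that fits the grid; a time such as 7:15 would index a row that does not exist. As a hedged sketch, the splitting and checking can be gathered into one function that returns None for a time the 5-row calendar cannot hold. The function name to_coordinates, the choice of None as the failure signal, and the rejection of an hour of 0 are assumptions made for this illustration.

```python
HOURS = 5      # rows: noon, 1pm, 2pm, 3pm, 4pm
MINUTES = 60   # columns: one per minute

def to_coordinates(appt_time) :
    """Converts a time string like "2:05" into (row, column) coordinates.
       Returns None if the text is malformed or lies outside the calendar."""
    parts = appt_time.split(":")
    if len(parts) != 2 :
        return None
    try :
        hour = int(parts[0])
        minute = int(parts[1])
    except ValueError :      # the parts were not numerals
        return None
    if hour == 12 :          # the noon hour is stored in row 0
        hour = 0
    elif hour < 1 :          # "0:15" is not a time on this calendar
        return None
    if hour >= HOURS or minute < 0 or minute >= MINUTES :
        return None
    return (hour, minute)

print(to_coordinates("2:00"))     # (2, 0)
print(to_coordinates("12:30"))    # (0, 30)
print(to_coordinates("7:15"))     # None
```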

Here is the completed program:

FIGURE====================================================

import string

HOURS = 5  # model  noon, 1pm, 2pm, 3pm, 4pm
MINUTES = 60
# build the nested list:
calendar = []   # will hold all appointments
for i in range(HOURS) :  # count from 0 to 4
    calendar.append(MINUTES * [""])  # add one hour of cells to the calendar

more_requests = True
while more_requests :
    request = raw_input("\nType request: M(ake appt), V(iew appts), Q(uit): ")
    code = request[0].upper()

    if code == "Q" :  # Quit
        more_requests = False

    elif code == "V" : # View all appointments
        print "Your appointments: "
        for hour in range(HOURS):
            for minute in range(MINUTES):
                appt = calendar[hour][minute]
                if appt != "" :
                    if hour == 0 :
                        print "12:" + string.zfill(minute,2)  + ":", appt
                    else :
                        print str(hour) + ":" + string.zfill(minute,2) + ":", appt

    elif code == "M" : # Make a new appointment
        appt_time = raw_input("Type appt. time (hh:mm): ")
        hour_and_minutes = appt_time.split(":")  # split the time in two
        hour = int(hour_and_minutes[0])  # get the hours and minutes:
        minute = int(hour_and_minutes[1])
        appt = raw_input("Type your appointment: ")
        if hour == 12 :   # convert noon hour into  hour = 0 :
            hour = 0
        if calendar[hour][minute] != "" :   # is the cell already filled ?
            print "error: you already have an appointment at that time"
        else :
            calendar[hour][minute] = appt
    else :
        print "I don't understand your request; please try again."

raw_input("\n\npress Enter to finish")

ENDFIGURE======================================================
Notice that it takes some work to print the time properly, as hours and minutes, from the row and column coordinates when all the appointments are printed. (See the use of import string and string.zfill(...).)

Modelling the calendar as a grid has a huge disadvantage: almost all of the cells of the calendar will remain empty. Perhaps it is not a tragedy to waste 300 cells in our small example, but this form of model would be unacceptable for a monthly calendar.
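The waste can be measured. The sketch below builds the 5-by-60 grid as in the program, files the three appointments from the earlier sample session, and counts the empty cells; for comparison it stores the same appointments in a dictionary keyed by (row, column) tuples. The dictionary variant and the variable names are illustrations, not code from the figure.

```python
HOURS = 5
MINUTES = 60

# the grid model: one cell per minute
grid_calendar = []
for i in range(HOURS) :
    grid_calendar.append(MINUTES * [""])

# the dictionary model: one entry per appointment
dict_calendar = {}

# the sample appointments, as (row, column, purpose); noon is row 0
for (hour, minute, appt) in [(0, 30, "lunch"), (2, 0, "snack"), (0, 55, "nap")] :
    grid_calendar[hour][minute] = appt
    dict_calendar[(hour, minute)] = appt

empty_cells = 0
for row in grid_calendar :
    for cell in row :
        if cell == "" :
            empty_cells = empty_cells + 1

print(empty_cells)          # 297 cells of the grid hold nothing
print(len(dict_calendar))   # 3 entries are all the dictionary needs
```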


5.5.2 Design: Lists versus grids: the slide puzzle

Let's consider a gameboard and how it might be implemented either with a list or a grid.

Behavior

I trust you have played with a slide puzzle --- this is a board, shaped like a square, which holds interlocking pieces that you can slide. There is only one empty space, into which you can slide a piece. Here is how the puzzle might look when you first touch it:

Perhaps the objective is to rearrange the numbers that label the pieces so that the numbers appear in ascending order. It will take some work, but you have time to kill, so you start, say, by sliding piece 1 into the empty space:

Now, you may move either piece 5, 2 (or 1 again); perhaps you move 5:

You continue playing with the puzzle for as long as you like.

Data structure selection

Let's build a computerized slide puzzle, which has the same behavior as the physical one shown above. To build the program, we must select a data structure to represent the slide puzzle inside the computer. Since the puzzle board has cells and a cell might be empty, this suggests some form of list.

The obvious choice is a grid of integers, like this


where the integer 0 represents the empty space. It is easy to write the Python data structure:
puzzle = [ [15, 14, 13, 12],
           [11, 10,  9,  8],
           [ 7,  6,  5,  4],
           [ 3,  2,  1,  0] ]
But a second choice is a simple list, like this:
puzzle = [15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

Which representation is better? The answer depends --- assuming the human does not use the row-column coordinate numbers of the squares to slide pieces (instead, she would type the number on the piece itself), a two-coordinate indexing system is not essential to the game. This means either form of data structure will do.

So, we will construct the solution both ways and compare the results.

The algorithm

We must design an algorithm that controls the puzzle, that is, lets a human repeatedly move the pieces (numbers) within the slide puzzle. Here is the basic strategy:
while True :
    print the puzzle
    ask the human for the number on the piece to move
    try to move the piece (*)
Let's focus on the third step (*) in the loop: moving the selected piece, say, the piece labeled by number num. We must
    find where piece  num  is situated in the  puzzle
    once found, verify that it is located adjacent to the empty space (number 0)
    swap  num  and  0  in the  puzzle
To refine these steps further, we must fix our choice of data structure for the puzzle. We study the grid first and the list second.

Code the solution with a grid

If we model the puzzle like this,
puzzle = [ [15, 14, 13, 12],
           [11, 10,  9,  8],
           [ 7,  6,  5,  4],
           [ 3,  2,  1,  0] ]
then the difficult steps in the solution are indeed to
    find where piece  num  is situated in the  puzzle
    once found, verify that it is located adjacent to the empty space (0)
    swap  num  and  0  in the  puzzle
To locate the piece the human wishes to move, we search the puzzle, examining all its integers, until we find num. To search a grid, we use a loop inside a loop, like this:
PUZZLE_SIZE = 4  # it's a 4-by-4 puzzle

# search the puzzle for the piece numbered  num:
for i in range(PUZZLE_SIZE):   # counts  0, 1, 2, 3.
    for j in range(PUZZLE_SIZE) :
        if num == puzzle[i][j] :
            ... remember i and j ...
Once we find num at position i, j, we must verify that i, j are adjacent to the coordinates of the empty space:
# Say that we wish to move the piece at coordinates (i,j) into the empty_space,
# and  empty_space = (row, column)  is the location of the empty space.
# Check if the  empty_space  is adjacent to  i,j :
if (empty_space == (i-1, j))     \
    or (empty_space == (i+1, j)) \
    or (empty_space == (i, j-1)) \
    or (empty_space == (i, j+1)) :
    # if True,
    ... swap  num  and  0  in the  puzzle ...
else : # False
    print "illegal move --- try again"
Here is the completed coding:
FIGURE===============================================

# SlidePuzzle
#   implements a slide-puzzle game.
# assumed input: a series of integers between 1 and 15, which indicate the
#   pieces to move in the slide puzzle
# guaranteed output: the slide puzzle, as it appears after each move

import string  # required to use  zfill  function to print the board

PUZZLE_SIZE = 4  # it is convenient to remember the puzzle's size
# the puzzle, where 0 marks the empty space/cell:
puzzle = [ [15, 14, 13, 12],
           [11, 10,  9,  8],
           [ 7,  6,  5,  4],
           [ 3,  2,  1,  0] ]

# It is convenient to remember which cell is empty:
empty_space = (3, 3)

# loop forever to move the pieces dictated by the user:
while True :

    # print the puzzle:
    for row in puzzle :
        for item in row :
            if item == 0 :
                print "  ",
            else :
                print string.zfill(item, 2),  # print a two-symbol number
        print

    # ask for the next move:
    num = int(raw_input("\nType number of piece to move: "))

    # search the puzzle for the piece numbered  num:
    piece = ()  # will remember the cell coordinates where  num  rests
    for i in range(PUZZLE_SIZE):
        for j in range(PUZZLE_SIZE) :
            if num == puzzle[i][j] :
                piece = (i, j)  # we found  num  at coordinates  i,j

    # did we find the piece ?
    if piece == () :
        print "illegal move --- try again"
    else :  # we did find it, so let's try to move it:
        # check if the piece is adjacent to the empty space: 
        i = piece[0]
        j = piece[1]
        if (empty_space == (i-1, j)        # above ?
            or empty_space == (i+1, j)     # below ?
            or empty_space == (i, j-1)     # to the left ?
            or empty_space == (i, j+1)) :  # to the right ?
            # if True, it's ok to move  num  into the empty space:
            puzzle[empty_space[0]][empty_space[1]] = num
            # remember that the former cell holding  num  is now empty:
            puzzle[i][j] = 0
            empty_space = (i, j)
        else : # False, so it's not ok to move  num:
            print "illegal move --- try again"

ENDFIGURE=====================================================
Because we work with a nested list, we require nested loops to print the list and to search it for the piece that the player wishes to move. Also, when we check that the piece is adjacent to the empty space, we must check both the row and the column variations.
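The four comparisons in the adjacency test can also be packaged as a function. The sketch below is one possible formulation, not the figure's code: two cells are adjacent exactly when their row distance and column distance add up to 1. The name is_adjacent is invented for the illustration.

```python
def is_adjacent(cell, empty_space) :
    """Returns True when  cell  and  empty_space, both (row, column)
       tuples, are neighbors horizontally or vertically (not diagonally)."""
    row_distance = abs(cell[0] - empty_space[0])
    column_distance = abs(cell[1] - empty_space[1])
    return row_distance + column_distance == 1

empty_space = (3, 3)                        # as in the starting puzzle
print(is_adjacent((3, 2), empty_space))     # True: piece 1, left of the empty space
print(is_adjacent((2, 3), empty_space))     # True: piece 4, above the empty space
print(is_adjacent((2, 2), empty_space))     # False: piece 5 is diagonal to it
```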

Code the solution with a flat list

Say that we use a flat list, like
puzzle = [15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
Again, the interesting steps in the solution are
    find where piece  num  is situated in the  puzzle
    once found, verify that it is located adjacent to the empty space (0)
    swap  num  and  0  in the  puzzle
To find the desired num in puzzle, we use an ordinary searching loop:
for i in range(len(puzzle)):
    if puzzle[i] == num :
        location = i   # remember that num is located at position  i
        break
The most difficult part of the solution is checking that the position of the piece is adjacent to the empty space. A first attempt reads somewhat like this:
empty_space =  ...  # remembers the index of the empty space in the  puzzle

if (empty_space == location - 1                # to the left ?
    or empty_space == location + 1             # to the right ?
    or empty_space == location - ROW_SIZE      # above ?
    or empty_space == location + ROW_SIZE) :   # below ?
     ... swap  num  and  0  in the  puzzle ...
else :
     ... can't move the piece ...
We check whether the empty space is immediately above the position of the piece to move merely by subtracting the ROW_SIZE --- a neat trick. (For example, if the piece to be moved rests at puzzle[7], and the empty space rests at puzzle[3], then the latter is immediately above the former.)
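The arithmetic behind the trick is the correspondence between a flat index and a row, column pair. As a sketch (the function names are invented for this illustration), division by ROW_SIZE recovers the row and the remainder recovers the column:

```python
ROW_SIZE = 4

def to_row_column(index) :
    """Converts a flat-list index into its (row, column) position."""
    return (index // ROW_SIZE, index % ROW_SIZE)

def to_index(row, column) :
    """Converts a (row, column) position back into a flat-list index."""
    return row * ROW_SIZE + column

print(to_row_column(7))     # (1, 3): the piece in the example above
print(to_row_column(3))     # (0, 3): the empty space, one row up, same column
print(to_index(1, 3))       # 7
```

Subtracting ROW_SIZE from an index therefore keeps the column and moves up exactly one row.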

Here is the solution based on the above ideas:

FIGURE================================================

# SlidePuzzle
#   implements a slide-puzzle game.
# assumed input: a series of integers between 1 and 15, which indicate the
#   pieces to move in the slide puzzle
# guaranteed output: the slide puzzle, as it appears after each move

import string  # required to use  zfill  function to print the board

ROW_SIZE = 4  # it is convenient to remember the puzzle's size
# the puzzle, where 0 marks the empty space/cell:
puzzle = [15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

# it is convenient to remember which cell is empty:
empty_space = 15

# loop forever to move the pieces dictated by the user:
while True :
    # print the puzzle:
    for count in range(len(puzzle)) :
        if count % ROW_SIZE  == 0 :
            print
        if puzzle[count] == 0 :  # empty space ?
            print "  ",
        else :
            print string.zfill(puzzle[count], 2),  # print a two-symbol number

    # ask for the next move:
    num = int(raw_input("\nType number of piece to move: "))

    # search the puzzle for the piece numbered  num:
    location = -1  # will remember the cell where  num  rests
    for i in range(len(puzzle)):
        if puzzle[i] == num :
            location = i
            break

    # did we find the piece ?
    if location == -1 :
        print "illegal move --- try again"
    else :  # we did find it, so let's try to move it:
        # check if the piece is located adjacent to the empty space: 
        if ((empty_space == location - 1) and (location % ROW_SIZE != 0))  \
          or ((empty_space == location + 1) and (empty_space % ROW_SIZE != 0)) \
          or (empty_space == location - ROW_SIZE) \
          or (empty_space == location + ROW_SIZE) :
            # if True, it's ok to move  num  into the empty space:
            puzzle[empty_space] = num
            # remember that the former cell holding  num  is now empty:
            puzzle[location] = 0
            empty_space = location
        else : # False, so it's not ok to move  num:
            print "illegal move --- try again"

ENDFIGURE===========================================================

The only complication in this solution is the check whether the piece to be moved rests immediately to the left or to the right of the empty space: The if-command must be improved to read

if ((empty_space == location - 1) and (location % ROW_SIZE != 0))  \
     or ((empty_space == location + 1) and (empty_space % ROW_SIZE != 0)) \
     or (empty_space == location - ROW_SIZE) \
     or (empty_space == location + ROW_SIZE) :
When the empty space rests at the end of one row and the piece to be moved rests at the beginning of the next row, this is not an adjacency. (For example, if the piece to be moved rests at puzzle[4] and the empty space rests at puzzle[3], then the latter is not immediately to the left of the former.)
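To see why the extra tests matter, the sketch below places the first attempt and the improved test side by side as two functions (their names are invented here) and tries them on the row-boundary case just described.

```python
ROW_SIZE = 4

def first_attempt(location, empty_space) :
    """The adjacency test without the row-boundary checks."""
    return (empty_space == location - 1
            or empty_space == location + 1
            or empty_space == location - ROW_SIZE
            or empty_space == location + ROW_SIZE)

def improved(location, empty_space) :
    """The adjacency test with the row-boundary checks."""
    return ((empty_space == location - 1 and location % ROW_SIZE != 0)
            or (empty_space == location + 1 and empty_space % ROW_SIZE != 0)
            or empty_space == location - ROW_SIZE
            or empty_space == location + ROW_SIZE)

# piece at puzzle[4] (start of row 1), empty space at puzzle[3] (end of row 0):
print(first_attempt(4, 3))   # True: wrongly reports an adjacency
print(improved(4, 3))        # False: correctly rejects the move
print(improved(14, 15))      # True: a real left-right adjacency is still accepted
```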

Comparison

Either solution is acceptable for the slide puzzle. The nested-list solution requires nested loops for printing and lookup, but it portrays the puzzle in exactly the same dimensions as the real, physical puzzle.

The flat-list solution uses simple techniques for printing and lookup, but there is a small complication when checking whether the piece to be moved is to the left or right of the empty space.


5.5.3 Semantics of nested lists

In an earlier section, we learned that lists were stored in the heap. This is also true for nested lists, and in case you are curious, here is a diagram of how the puzzle nested list appears in heap storage for the slide-puzzle program:

Since each row of the puzzle is itself a list, the puzzle is in fact a list of addresses of lists. This layout is a bit difficult for humans to appreciate, but computers work well with it, because it linearizes the matrix in computer storage, which is itself linear.

The picture shows us one important idea: we must be very careful when constructing a matrix for the first time. Suppose we try the ``multiplication trick'' twice, like this:

# INCORRECT attempt to build an 8-by-8 game board:
board = [ [ "" ] * 8 ] * 8
What happens is that
  1. A list of eight strings is constructed by [ "" ] * 8.
  2. The address of the 8-string list is duplicated 8 times and placed into another list of 8 cells.
The result looks like this:
Namespace:      | Heap:
                |
board : addr2   | addr1: [ "", "", "", "", "", "", "", "" ]
                | addr2: [ addr1, addr1, addr1, addr1, addr1, addr1, addr1, addr1 ]
Rather than a 64-celled board, we have an 8-celled one in disguise. The correct way of building the 64-celled board looks like this:
board = []                 # the matrix we will build; starts empty
for i in range(8) :
    board.append( [ "" ] * 8 ) # build a new list of 8 cells and
                               #   append it to the matrix
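The sharing is easy to observe. In the sketch below (variable names invented for the illustration), one assignment to the incorrectly built board shows up in every row, while the correctly built board changes in one cell only.

```python
# INCORRECT: all eight rows are the same list in the heap
bad_board = [ [ "" ] * 8 ] * 8
bad_board[0][0] = "X"
print(bad_board[7][0])                # X: the mark appears in the last row, too
print(bad_board[0] is bad_board[7])   # True: both rows are one and the same list

# correct: each row is a list of its own
board = []
for i in range(8) :
    board.append( [ "" ] * 8 )
board[0][0] = "X"
print(board[7][0] == "")              # True: the last row is untouched
```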

Exercises

Improve the slide-puzzle program in the following ways:
  1. When the user types q or quit, the program halts.
  2. When the program starts, it asks the user for the size of the puzzle, and the program builds a puzzle of the specified size. (E.g., a size of 3 builds a puzzle with pieces numbered from 1 to 8.)
  3. The program prints the puzzle so that the numbers line up vertically and the empty space is printed as a blank (and not as 0).


5.6 Foundations: a loop maintains a list's data invariant

A list is an example of a sequence data structure, and in the previous chapter, we learned that loops that process sequences generate sequences of facts that are summarized as ``for all'' assertions. All the techniques we used to state logical properties of tuples apply also to lists.

We also learned that some sequences, like an alphabetized string, should be given a data invariant that states the key logical property that is always true about the sequence.

Data invariants are crucial for lists, because lists are mutable. When we alter the value of a cell in a list, we risk ruining the list's data invariant, so we should always state a data invariant for the list and always check that the invariant is kept true when we update the list.

A list's data invariant might be quite simple, such as

numbers = [2, 3, 6, 3, 8, 0]  # data invariant:  all values in  numbers are nonnegative ints
The invariant tells us that we should never encounter, say, a string within a cell of numbers.

In the case of the slide-puzzle game that we just saw, the data invariant states the correct appearance of the slide puzzle:

puzzle = ...  # data invariant: puzzle  is a permutation of  range(16)
That is, puzzle holds the integers 0 to 15 in some mixed up order.

It is clear that a correctly working slide-puzzle game must preserve the puzzle's data invariant! This must be an invariant property of the loop that reads the human's move and makes the move on the puzzle --- the updated puzzle is still a permutation of the original. Of course, the game's loop must also enforce the primary rule of the slide-puzzle game, which is that only a number adjacent to 0 may be exchanged with the 0.
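The puzzle's data invariant can be written as an executable check. The sketch below uses the flat-list form of the puzzle; the function name is_valid_puzzle is an assumption made for the illustration, and the second ``move'' is a deliberately buggy one.

```python
def is_valid_puzzle(puzzle) :
    """Returns True when the flat list  puzzle  holds each of the
       integers 0 to 15 exactly once, in any order."""
    return sorted(puzzle) == list(range(16))

puzzle = [15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(is_valid_puzzle(puzzle))      # True

# a legal move: swap piece 1 (at index 14) with the empty space (index 15)
puzzle[15] = puzzle[14]
puzzle[14] = 0
print(is_valid_puzzle(puzzle))      # True: a swap preserves the invariant

# a buggy "move" that copies a piece without clearing its old cell
puzzle[10] = puzzle[11]
print(is_valid_puzzle(puzzle))      # False: 4 appears twice and 5 is lost
```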

Sophisticated data invariants can also appear in simple programs. Consider again the vote-counting program from the beginning of the chapter:

====================================================

votes = 4 * [0]  # data invariant: 
                 #  for all i in 0..3, votes[i] == quantity of i's read as input
processing = True
while processing :
    # invariant:  data invariant for  votes  is preserved
    v = int(raw_input("Type your vote (0,1,2,3): "))
    if v < 0 or v > 3 :
        processing = False   # bad vote, so quit
    else :
        votes[v] = votes[v] + 1

========================================================
The sole purpose of the program is to preserve the list's data invariant as the input numbers are read one by one. This is not an accident --- the job of many programs is merely to preserve a data structure's data invariant. As a result,
Every mutable data structure should possess a useful data invariant that is preserved while the program executes.
When you design a data structure, always ask, ``What is the structure's data invariant?'' The answer you give will tell you much about the algorithm you write that must process and maintain the data structure.
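The vote counter's invariant can likewise be turned into an executable check. The sketch below replaces keyboard input with a list of already-typed votes so that it can run unattended; the function names count_votes and invariant_holds are assumptions made for the illustration.

```python
def invariant_holds(votes, votes_read) :
    """for all i in 0..3,  votes[i] == quantity of i's in  votes_read"""
    for i in range(4) :
        if votes[i] != votes_read.count(i) :
            return False
    return True

def count_votes(inputs) :
    """Tallies the votes in  inputs  until a bad vote (or the end) is met."""
    votes = 4 * [0]
    votes_read = []                 # the legal votes seen so far
    for v in inputs :
        if v < 0 or v > 3 :
            break                   # bad vote, so quit
        votes[v] = votes[v] + 1
        votes_read.append(v)
        assert invariant_holds(votes, votes_read)   # checked on every pass
    return votes

print(count_votes([3, 0, 3, 1, 3, 9, 2]))    # [1, 1, 0, 3]
```

The vote of 9 stops the loop, so the final 2 is never counted.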


5.7 Design: When to use tuples, lists, and dictionaries

It is time for some review.

First, tuples are used to build values that are small collections (playing cards and pixels). Tuples are also used to collect values into sequences (collecting words from a sentence or collecting the cards in a card deck).

In contrast, lists are used to build structures like game boards and grids, which have ``spaces'' or ``cells'', can be ``indexed'' by a fixed ``address'', and can be updated with new values. Use a list to build a data structure when

Lists can be unnested (``flat'') or nested (``grids'' or two-dimensional arrays). When we model a linear structure, like a daily timetable, we should use a flat list. It is tempting to use a nested list when we model a gameboard, like a tic-tac-toe board. But often, the computer programming of such a board is simpler with a flat list, and in the case of a slide puzzle or tic-tac-toe board, a flat list works surprisingly well.

When choosing between a flat list and a nested list, ask these questions:

  1. Are the cells of the list explicitly indexed? If yes, do they require one index number or two? If two indexes are used, then a nested list is appropriate.
  2. Is it important to portray all of the vertical, diagonal, and horizontal relationships between the cells of the structure? If the answer is ``yes,'' then use a nested list.

For example, a monthly calendar should be modelled with a flat list, even though the calendar is printed as a grid --- there is no relationship between the days (cells) of the month, and we refer to each day with a single index number. In contrast, an 8-by-8 board for playing chess is better modelled as a grid, because chessboard squares are often addressed by row, column coordinates, and some chess pieces move vertically and horizontally, others move diagonally, and others move in the shape of the letter ``L.'' Computing these forms of movement is easier with a nested list.

Dictionaries are used to build books, directories, and databases where elements are located by their keys. Use a dictionary when

Tuples are often used to represent the elements that are stored into lists and dictionaries. Tuples work well as ``small values'' like a playing card (suit, count), a chess piece (color, piece), or a person's statistics (name, gender, birthdate).


5.8 Summary

This chapter introduced two forms of mutable (changeable) data structure --- lists and dictionaries. Lists are similar to arrays (but lists can grow as needed), and dictionaries are similar to record structures (but dictionaries can grow as needed and the key set is not fixed in advance).

Lists

A list is written with this syntax:

[ ELEMENTS ]
where ELEMENTS are zero or more EXPRESSIONs, separated by commas. Each expression can compute to any value at all --- number, boolean, string, tuple, list, etc. The elements in a list are saved in the list's cells, which are indexed (numbered) by 0, 1, 2, ....

A list can be assigned to a variable, as usual:

gameboard = [ "", "", "", "", "", "", "", "" ]
We can use a shortcut to make a list whose items start with the same value:
gameboard = [ "" ] * 8

Given a list, LIS, we can compute its length (number of cells) like this:

len( LIS )
The primary operation on lists is indexing, and we can write this indexing expression:
LIS [ INT_EXPRESSION ]
where LIS is a list, and INT_EXPRESSION is an expression that computes to an integer that falls between 0 and len( LIS ) -1. The expression returns the element saved at cell number INT_EXPRESSION.

We update a list's cell with this assignment command:

LIS [ INT_EXPRESSION ] = EXPRESSION
The assignment destroys the value that was formerly held at cell INT_EXPRESSION and replaces it with the value of EXPRESSION.

Since a list is a SEQUENCE, we can use the for-loop to systematically process a list's elements:

for ITEM in LIS :
    ... ITEM ...
where ITEM is a variable name that stands for each element from list LIS.

There is a special operation, range, that constructs a list of numbers: range(NUM) builds the list, [0, 1, 2, ..., NUM-1]. (E.g., range(3) computes [0, 1, 2].) We can use range to print a list's index numbers and contents:

for index in range(len(LIS)) :
    print index, LIS[index]

Here are two methods that alter lists:

And here is a useful method for breaking an input string into a list of the words it holds:
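The string method split does this job; the appointments program used it with ":" as the separator, and with no argument it separates the string at its blanks. A short sketch, with variable names invented for the illustration:

```python
sentence = "the quick brown fox"
words = sentence.split()           # split at blanks
print(words)                       # ['the', 'quick', 'brown', 'fox']

time_parts = "12:30".split(":")    # split at a chosen separator
print(time_parts)                  # ['12', '30']
```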

Lists can be nested inside lists. Here is how to build an 8-by-8 chessboard, a matrix:

board = [ ]
for row in range(8) : 
    board.append([ "" ] * 8)
We assign to individual cells of the board with two indexes:
board[5][5] = ("white", "pawn")
and we use nested for-loops to print the board:
for row in board :
    for square in row :
        print square, "|",
    print

Dictionaries

A dictionary is written with this syntax:

{ KEY_ELEMENT_PAIRS }
where KEY_ELEMENT_PAIRS are zero or more KEY : ELEMENTs, separated by commas. Each KEY must be an immutable value (number, boolean, string, or tuple). Each ELEMENT can compute to any value at all --- number, boolean, string, list, dictionary, etc. The elements in a dictionary are saved in a hash table structure.

A dictionary can be assigned to a variable, as usual.

Given a dictionary, DICT, we can find an element by using its key:

DICT [ KEY ]
If KEY is not found in DICT, it is an error, so it is better to ask first if the KEY is present:
if KEY in DICT :
    ... DICT[KEY] ...

We update a dictionary with this assignment command:

DICT [ KEY ] = EXPRESSION
If the KEY is new to DICT, then a new key, element pair is added. If the KEY is already in use, the assignment destroys the value that was formerly associated with KEY and replaces it with the value of EXPRESSION.

We can use the for-loop to systematically process a dictionary:

for K in DICT :
    ... K ... DICT[K] ...
where K is a variable name that stands for each key saved in dictionary DICT.

The operation, del DICT[K], deletes key K and its element from DICT. (If K is not in DICT, it is an error.)

Here are two useful methods for dictionaries:

Finally, here is a simple way to print the contents of a dictionary ordered by the keys:
keylist = my_dictionary.keys()
keylist.sort()
for k in keylist:
    print k, ":", my_dictionary[k]