The Java Collections components
Many of the data structures we studied are already coded for
you in the Java libraries, within the package java.util.
Here is a brief summary.
Stacks and Vectors
Two classes belong to the earliest version of java.util:
-
class Stack: This is an array-based implementation of
a stack, with the standard push, pop, and top (called peek)
operations.
-
class Vector: This is an array that can grow as needed.
You can insert an object into a Vector by adding
it to the end of the array: addElement(Object e) (and the
array grows by one cell).
Or, you can use an integer index, as
with a standard array: add(int index, Object e). (This operation
shifts the elements of higher index to the right by one cell, and
the array grows by one.)
Lookups can be done with an index: get(int index);
or, you can read the element at either end of the array without
giving an index number: firstElement() and lastElement().
Because the insertion operation does not
overwrite a cell in the Vector, there is an explicit
operation, remove(int index), for deleting a value from a Vector.
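The operations just described can be tried out in a few lines. The
following is a minimal sketch, not part of the original notes: the class
name StackVectorDemo and its method names are made up for illustration,
and it uses generic types (Java 5 and later) where the notes use plain
Object.

```java
import java.util.Stack;
import java.util.Vector;

// Hypothetical demo class; the names are illustrative only.
class StackVectorDemo {
    // Pushes three strings, peeks at the top, then pops everything.
    static String popAll() {
        Stack<String> s = new Stack<>();
        s.push("a");
        s.push("b");
        s.push("c");
        StringBuilder out = new StringBuilder();
        out.append(s.peek());          // peek reads the top without removing it
        while (!s.isEmpty()) {
            out.append(s.pop());       // pop removes the top: c, then b, then a
        }
        return out.toString();
    }

    // Builds a Vector with appends, an indexed insert, and a removal.
    static Vector<String> buildVector() {
        Vector<String> v = new Vector<>();
        v.addElement("x");             // x
        v.addElement("z");             // x z
        v.add(1, "y");                 // x y z   (z shifts right by one cell)
        v.addElement("w");             // x y z w
        v.remove(0);                   // y z w   (the rest shift left)
        return v;
    }

    public static void main(String[] args) {
        System.out.println(popAll());
        Vector<String> v = buildVector();
        System.out.println(v + " first=" + v.firstElement()
                             + " last=" + v.lastElement());
    }
}
```

Note that get, firstElement, and lastElement only read a cell; only
remove takes a value out of the Vector.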
The ``Collections Framework''
Within java.util is
a family of data structures that
share standard operations and properties; they are called
the ``collections framework.'' The framework is ``defined''
by several
Java interfaces that state the standard operations that
classes in the framework must implement.
Here are the two most important interfaces:
-
interface Collection: The interface states the standard
operations that one would expect of a data structure (a ``collection''):
insertion, lookup, deletion. Here are the Java names:
-
public boolean add(Object o)
-
public boolean contains(Object o) (This is a kind of lookup.)
-
public boolean remove(Object o)
-
public boolean isEmpty()
-
public Iterator iterator() (This will be explained later.)
-
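interface Map: The operations use keys to do insertions, etc.:
-
public Object put(Object key, Object o) (This returns the object previously
stored under key, or null if there was none.)
-
public Object get(Object key)
-
public Object remove(Object key) (This returns the object that was removed,
or null if the key was absent.)
-
public boolean isEmpty()
-
public Set keySet() and public Collection values() (A Map has no
iterator() method of its own; you ask one of these for an iterator.)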
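To see what the return values of these operations look like, here is a
small sketch. It is an illustration only: the class InterfaceDemo and its
methods are hypothetical names, and the code uses generic types (Java 5
and later) rather than the plain Object of the signatures above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the names are illustrative only.
class InterfaceDemo {
    // Works on any Collection, whichever class implements it.
    static String collectionOps(Collection<String> c) {
        boolean added = c.add("apple");        // true: the collection changed
        boolean found = c.contains("apple");   // true: a kind of lookup
        boolean removed = c.remove("apple");   // true: something was removed
        return added + " " + found + " " + removed + " " + c.isEmpty();
    }

    // Works on any Map, whichever class implements it.
    static String mapOps(Map<String, Integer> m) {
        Integer before = m.put("k", 1);    // null: no earlier value under "k"
        Integer replaced = m.put("k", 2);  // 1: the value that was overwritten
        Integer got = m.get("k");          // 2
        Integer removed = m.remove("k");   // 2: the value that was removed
        return before + " " + replaced + " " + got + " " + removed
               + " " + m.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(collectionOps(new ArrayList<String>()));
        System.out.println(mapOps(new HashMap<String, Integer>()));
    }
}
```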
Here are two interfaces that add more operations to Collection:
-
interface List extends Collection, which gives you operations
to add and look up objects using array-indexes. (This means the data
structure is a kind of array or numbered sequence; it's too bad
that they call it a ``list''!)
-
interface Set extends Collection. This interface requires
that there are no duplicate objects in a collection, as in a mathematical set.
It provides set-like operations such as union (addAll),
intersection (retainAll), and set subtraction (removeAll); these method
names are inherited from Collection.
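The set-like operations can be sketched as follows. This is an
illustration under assumptions: SetOpsDemo and its three helper methods
are made-up names, TreeSet is chosen only so that results print in sorted
order, and generic types (Java 5 and later) are used. Because addAll,
retainAll, and removeAll change the set they are called on, each helper
works on a copy.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; the names are illustrative only.
class SetOpsDemo {
    static Set<Integer> union(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);  // copy, so a is unchanged
        result.addAll(b);                        // keep everything in a or b
        return result;
    }

    static Set<Integer> intersection(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);                     // keep only what is also in b
        return result;
    }

    static Set<Integer> difference(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.removeAll(b);                     // drop whatever is in b
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> a = new TreeSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> b = new TreeSet<>(Arrays.asList(3, 4));
        System.out.println(union(a, b));
        System.out.println(intersection(a, b));
        System.out.println(difference(a, b));
        System.out.println(a.add(2));   // false: 2 is already in the set
    }
}
```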
Classes that implement interface List
There are two important classes that implement interface List,
that is, are numbered sequences:
-
class ArrayList: This is really just a Java Vector,
that is, an array that grows as needed, recoded to fit into the
Collections package.
-
class LinkedList: This is a doubly-linked list,
extended with operations that let you find Cell number k
in the list and return the object held in it. (Also, you can insert
a new object into the middle of the linked list by adding it
at position k.)
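Because both classes implement interface List, the same indexed
operations work on either one. The sketch below is illustrative only:
ListDemo and indexedOps are hypothetical names, and generic types (Java 5
and later) are assumed.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class ListDemo {
    // Runs the same indexed operations on whichever List is supplied.
    static String indexedOps(List<String> list) {
        list.add("a");                  // a
        list.add("c");                  // a c
        list.add(1, "b");               // a b c   (insert at position 1)
        String second = list.get(1);    // look up by position: b
        list.remove(0);                 // b c
        return second + " " + list;
    }

    public static void main(String[] args) {
        System.out.println(indexedOps(new ArrayList<String>()));
        System.out.println(indexedOps(new LinkedList<String>()));
    }
}
```

The results are the same, but the cost differs: get(k) is a direct array
access in an ArrayList, whereas a LinkedList must walk along its cells to
reach position k.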
You should use these two classes to build other data structures
that must be ``smart'' arrays or ``smart'' linked lists. For example,
you might build a class Queue like this:
import java.util.*;

public class Queue
{ private LinkedList my_queue;

  public Queue()
  { my_queue = new LinkedList(); }

  public void enqueue(Object ob)
  { my_queue.add(ob); }  // adds ob to the end of the linked list

  public Object dequeue()
  { Object answer = null;
    if ( !my_queue.isEmpty() )
       { answer = my_queue.remove(0); }  // remove the front object in the list
    return answer;
  }
  ...
}
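For comparison, here is a variation on the same idea together with a
short usage example. It is a sketch, not the class from these notes: the
name SimpleQueue is made up, the class is generic (Java 5 and later) so
that dequeue needs no cast, and an isEmpty method is added as an
assumption about what a caller would want.

```java
import java.util.LinkedList;

// Hypothetical generic variant of the queue; not the notes' own class.
class SimpleQueue<T> {
    private final LinkedList<T> items = new LinkedList<>();

    // Adds ob to the end of the linked list.
    public void enqueue(T ob) {
        items.add(ob);
    }

    // Removes and returns the front object, or null if the queue is empty.
    public T dequeue() {
        return items.isEmpty() ? null : items.remove(0);
    }

    public boolean isEmpty() {
        return items.isEmpty();
    }

    public static void main(String[] args) {
        SimpleQueue<String> q = new SimpleQueue<>();
        q.enqueue("first");
        q.enqueue("second");
        System.out.println(q.dequeue());   // objects leave in arrival order
        System.out.println(q.dequeue());
        System.out.println(q.dequeue());   // the queue is empty by now
    }
}
```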
Classes that implement interface Map
These classes store ``Records'' --- objects paired with their keys:
-
class TreeMap implements Map:
This is an ordered binary tree (a binary search tree) that uses the keys
to store the objects. The tree is balanced using the ``red-black''
balancing strategy, which is a bit more complicated than the AVL-balancing
strategy but uses similar ideas.
-
class HashMap implements Map:
This is a classical hash table. You may state the initial size of the
hash table when you construct a HashMap object; if you do not, a default
size is used, and the table grows as it fills.
When a key,object pair is inserted, the key's hashCode() method supplies
the hash code; for String keys, this is polynomial coding with base 31.
Collisions are resolved
using linked-list chains (``buckets'') within the array elements.
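The difference between the two classes shows up when you look at the
keys, as the sketch below illustrates. The names MapDemo, fill, and
polyHash are hypothetical, and generic types (Java 5 and later) are
assumed. The polyHash method recomputes base-31 polynomial coding by hand
so that it can be compared with the value String's own hashCode() returns.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; the names are illustrative only.
class MapDemo {
    // Inserts the same three records into whichever Map is supplied.
    static Map<String, Integer> fill(Map<String, Integer> m) {
        m.put("pear", 3);
        m.put("apple", 1);
        m.put("fig", 2);
        return m;
    }

    // Polynomial coding with base 31, written out by hand.
    static int polyHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);   // overflow wraps around, as in String
        }
        return h;
    }

    public static void main(String[] args) {
        // A TreeMap keeps its keys in sorted order.
        System.out.println(fill(new TreeMap<String, Integer>()).keySet());
        // A HashMap promises no particular order; 64 is an optional initial size.
        System.out.println(fill(new HashMap<String, Integer>(64)).keySet());
        System.out.println(polyHash("fig") == "fig".hashCode());
    }
}
```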
Classes that implement interface Set
Interface Set is supposed to describe data structures that implement
sets, having operations for set membership, union, and intersection.
The Java language does not do well at providing set data structures,
so instead we are asked to choose between a tree-simulation of a set
and a hash-table simulation of a set. Neither solution is ideal.
-
class TreeSet implements Set: It's a binary tree that does
not use any keys to save its objects. (Instead, the object's value
is used as a ``key'' for storing the object in the tree.)
-
class HashSet implements Set: This is a hash table, where
hash codes are manufactured from
the object's value.
Indeed, both TreeSets and HashSets are really just TreeMaps and HashMaps
in which each inserted object serves as its own key.
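A short sketch of the two classes in use follows. SetClassesDemo and load
are made-up names, and generic types (Java 5 and later) are assumed. The
same five words, two of them repeated, are inserted into each kind of set.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; the names are illustrative only.
class SetClassesDemo {
    // Inserts five words, two of them duplicates, into the given set.
    static Set<String> load(Set<String> s) {
        String[] words = { "pear", "apple", "pear", "fig", "apple" };
        for (String w : words) {
            s.add(w);   // add returns false, and changes nothing, for a duplicate
        }
        return s;
    }

    public static void main(String[] args) {
        Set<String> tree = load(new TreeSet<String>());
        Set<String> hash = load(new HashSet<String>());
        System.out.println(tree);                   // sorted by the words' values
        System.out.println(hash.size());            // duplicates were not stored
        System.out.println(hash.contains("fig"));   // set membership
    }
}
```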
Iterators
One standard sticky problem with data structures is printing the structure's
contents in a simple way --- for example, we might copy the
objects within a binary tree or a hash table into an array
and return the array for printing.
An iterator behaves like an ordered ``array'' of the contents
of a data structure that you read one object at a time.
There is a Java interface, interface Iterator.
An iterator has at least these two methods:
-
public Object next(): shows us one of the objects in the
data structure that we have not yet seen
-
public boolean hasNext(): tells us if there are more objects
to look at
To understand these operations, let's compare them to an array.
Say that we copied the objects held in a tree into an array named
iter. Then we write this loop to print the contents:
Object[] iter = ... copy contents of tree into array ...
for ( int i = 0; i != iter.length; i = i + 1 )
    { Object next_object = iter[i];
      System.out.println( next_object.toString() );
      // remember that toString is a Java method that tries to convert
      // an object into a string for printing. It often works!
    }
You do the same work with an iterator: Say that you
built a data structure, my_data_structure,
with one of the Java Collections classes
listed above. Next, you added some objects into the structure,
and now you want to print the contents:
Iterator iter = my_data_structure.iterator();  // gives us an ``array''-like
                       // view, named iter, of the objects in my data structure
while ( iter.hasNext() )   // are there more objects to look at ?
      { Object next_object = iter.next();   // get the next object
        System.out.println( next_object.toString() );
      }
The iterator structure hides the details about whether an array
or a linked list or whatever is the best structure for returning
the contents of a data structure for printing.
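To illustrate the point, the sketch below runs one and the same loop over
three different structures. IteratorDemo and joinAll are hypothetical
names, and generic types (Java 5 and later) are assumed. For the map, the
iterator is obtained from keySet().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical demo class; the names are illustrative only.
class IteratorDemo {
    // Visits every object in c with hasNext/next and joins them with commas.
    static String joinAll(Collection<String> c) {
        StringBuilder out = new StringBuilder();
        Iterator<String> iter = c.iterator();
        while (iter.hasNext()) {           // are there more objects to look at?
            out.append(iter.next());       // get the next object
            if (iter.hasNext()) {
                out.append(",");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // An array-based list: objects come back in insertion order.
        System.out.println(joinAll(new ArrayList<String>(Arrays.asList("b", "a", "c"))));
        // A tree-based set: objects come back in sorted order.
        System.out.println(joinAll(new TreeSet<String>(Arrays.asList("b", "a", "c"))));
        // A map: iterate over its keys.
        Map<String, Integer> m = new TreeMap<>();
        m.put("two", 2);
        m.put("one", 1);
        System.out.println(joinAll(m.keySet()));
    }
}
```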
Summary
Now that you understand and know how to program
linked lists, ordered trees, and hash tables, you can
intelligently use the classes in the Java Collections package
and save time when you are next asked to build a ``smart''
data structure.
You can read the local documentation for java.util
at
-
http://www.cis.ksu.edu/VirtualHelp/Info/JDK1.3.1/docs/api/java/util/package-summary.html
Or, visit Sun's web site, java.sun.com, for the latest writeup.