Spelling trees and hash tables are sometimes called dictionaries, because they store objects whose keys are ``words'' (sequences of symbols).
A standard use of a ``dictionary'' is counting occurrences of words in a file. For practice in building hash-tables, you must use a hash table to implement a word-frequency counting program. The program reads a file of words and counts the occurrences of each of the words. Then, the program prints all the words and their frequency of occurrences. For example, a text file that contains these lines:
Bed be been bee it. It be bed, been it? It be!would generate this output, in alphabetical order:
Frequency counts: be: 3 bed: 2 bee: 1 been: 2 it: 4Upper-case letters are converted to lower case; punctuation is ignored. Assume that the input file consists of letter-only words and the punctuation symbols, , . ; : - ? !
Once started, the program shows a dialog that requests the name of a sequential text file, FILENAME. The program reads FILENAME, counts the words, and prints the results in the command window and into an output sequential file named FILENAME.out. The words and their counts appear in alphabetical order.
package HashTable public class HashTable { private Word[] table; // array that holds words and their counts private int TABLE_SIZE; // the length of the array ... // methods that add/increment a word into the table // and print the table's contents // The hash function should be placed here, also: public int hash(String key) { ... } // calculates key's hash code, an int in the range, 0..TABLE_SIZE-1 }If you use ``buckets'' (linked lists) to handle collisions, the class looks like this:
package HashTable public class HashTable { private Cell[] table; // array that holds lists of words and their counts private int TABLE_SIZE; // the length of the array ... }where each Cell object will hold a Word object as its value part.
The objects that hold words and their counts will probably look like this:
package HashTable; public class Word { private String the_word; // the word acts as its own ``key'' private int count; // how many times the word has appeared ... // methods that increment count and return values of the_word and count }You should construct the the hash table as an array of prime-numbered size (e.g., 101 elements). Also, use polynomial coding for computing the hash function and use an appropriate base, e.g., 31, 37, or 41.
There is only one annoyance to writing this program: When the program prints the final listing of words and their counts, it must do so in alphabetical order. You should not assume that the words are inserted into the hash table in alphabetical order (indeed, they will not be inserted alphabetically), so you must some standard sorting technique to sort the words for printing.
Here are some Java-related notes that might be helpful:
StringTokenizer t = new StringTokenizer(text_line, ",.;:-?! ");will be useful for extracting the words within the string, text_line.
String lower_case_word = input_word.toLowerCase();
char c = ...; total = total + (c * (int)(Math.pow(base, x)));adds to integer variable, total, the underlying integer value of c multiplied by basex.