CS 117, Winter 2001

Word frequencies, due 2/21/01

For this assignment, you may talk to other people about coding strategy, algorithms, etc., but I want each of you to write your own code. Submit your code using the Homework Submission Program.

The Goal

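You are going to write a program that reads a text file and reports how many times each word appears in it. In the example below, the words are listed from most frequent to least frequent, with ties in alphabetical order.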

So, for example, if the file contains the text:

The moose and the kudu
frolicked in the
meadow with their friends
the okapi and the gnu.

the output should be:

the         5
and         2
friends     1
frolicked   1
gnu         1
in          1
kudu        1
meadow      1
moose       1
okapi       1
their       1
with        1
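
The sample output above appears to list words by decreasing count, with ties broken alphabetically, and to line the counts up in a column. Here is a minimal Python sketch of that ordering and layout. The function name format_counts, the two parallel lists it takes, and the column width (longest word plus three spaces) are assumptions made for illustration; the assignment does not prescribe any of them.

```python
def format_counts(words, counts):
    """Return report lines sorted by decreasing count, then alphabetically."""
    pairs = list(zip(words, counts))
    # Negating the count puts larger counts first; the word itself breaks ties.
    pairs.sort(key=lambda pair: (-pair[1], pair[0]))
    # Assumed layout: pad each word to the longest word's length plus 3.
    width = max((len(word) for word, _ in pairs), default=0) + 3
    return [f"{word:<{width}}{count}" for word, count in pairs]


if __name__ == "__main__":
    for line in format_counts(["moose", "the", "and"], [1, 5, 2]):
        print(line)
```

Sorting on the pair (negative count, word) handles both orderings in one pass, so no second sort is needed for the ties.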

Note that punctuation should be removed, and that "the" and "The" are considered to be the same word.
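
One possible way to handle both requirements is sketched below in Python. The helper names normalize and words_in are hypothetical. The sketch assumes that "punctuation" means the ASCII punctuation characters and that punctuation inside a word is removed too (so "don't" becomes "dont"); the assignment does not say how to treat those cases.

```python
import string


def normalize(token):
    """Lowercase a token and strip every ASCII punctuation character from it."""
    # With a third argument, str.maketrans maps those characters to None,
    # so translate() deletes them.
    table = str.maketrans("", "", string.punctuation)
    return token.lower().translate(table)


def words_in(text):
    """Split text on whitespace and return the non-empty normalized words."""
    cleaned = (normalize(token) for token in text.split())
    # A token made only of punctuation, such as "--", normalizes to "".
    return [word for word in cleaned if word]
```

With these definitions, words_in("The moose and the kudu") gives the five lowercase words in order, and normalize("gnu.") gives "gnu".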

Advice

I recommend that you use the "incremental development" approach in writing this program. That means that you should plan a sequence of partial solutions to the problem, each slightly more ambitious than the one before it.

For example, you might write programs that do the following:

  1. Get the text file's name from the user, and print all the words, in lowercase, on the screen. Don't worry about duplicate words for this first version of the program, and don't count the words. When this works, SAVE A COPY OF YOUR CODE AND DON'T TOUCH IT AGAIN.

  2. Get the text file's name, read the words into an array of strings, and then print them out. Again, don't worry about duplicate words. SAVE A COPY OF YOUR CODE AND DON'T TOUCH IT AGAIN.

  3. Read the words into an array of strings. Keep a separate array of counters. If you encounter a word that's already in the string array, add 1 to that word's counter. Otherwise, add the word to the string array and set the corresponding counter to 1. SAVE A COPY OF YOUR CODE AND DON'T TOUCH IT AGAIN.

  4. Etc.
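
To make the steps above concrete, here is a rough Python sketch of steps 1 through 3. It is only an illustration under stated assumptions, not the required solution: the function names read_words and count_words are hypothetical, Python lists stand in for the arrays, and punctuation is left alone to keep the sketch short.

```python
def read_words(filename):
    """Steps 1 and 2: return the file's whitespace-separated words, lowercased."""
    with open(filename) as infile:
        return infile.read().lower().split()


def count_words(words):
    """Step 3: count words using a list of strings and a parallel list of counters."""
    seen = []    # distinct words, in order of first appearance
    counts = []  # counts[i] is the number of times seen[i] has appeared
    for word in words:
        if word in seen:
            # Already in the string list: add 1 to that word's counter.
            counts[seen.index(word)] += 1
        else:
            # New word: add it and set the corresponding counter to 1.
            seen.append(word)
            counts.append(1)
    return seen, counts
```

For instance, count_words(["the", "moose", "and", "the", "kudu"]) returns (["the", "moose", "and", "kudu"], [2, 1, 1, 1]). Searching the list for every word is slow for large files, but it mirrors the two-array design described in step 3.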

The idea is to plan a sequence of small steps, each of which is manageable on its own, that add up to the complete program. Among the many benefits of this approach is the constant availability of a partial solution that can be handed in for grading, or demonstrated to a customer, or released to the public for early testing.

Start early, keep in touch, and have fun.



Jeff Ondich, Department of Mathematics and Computer Science, Carleton College, Northfield, MN 55057
(507) 646-4364, jondich@carleton.edu