I will make no recommendation here as to how to structure your program internally. The only interface I insist on is the command line interface. If your program is called with no command line arguments, it should take its input from stdin. If your program is called with one command line argument, it should interpret that argument as a file name, and take input from that file (if it exists). In all cases, your program's output should go to stdout, with error messages sent to stderr. See here for a little help processing command line arguments.
When you get your program working, try it on some big files (if you look around the NeXT directory structure, you might find Hamlet or something similar to use--the Web is also full of enormous text files waiting to be downloaded and counted). Just for fun, time your program using the Unix utility time. Now try your program on the small dictionary file /usr/dict/words (about 25000 words, one per line). How does your program perform on this one (by the way, you might want to redirect the output to a file in this case).
Program modularly. You may be able to use much of the code from this program on your next assignment (a simple spell-checker), but only if you separate this program into reasonable pieces. The functions that get words from the input should be separate from the BST functions, etc.
What do you think should constitute a word? I'll leave that one up to you, but it seems to me that "word?" is not a word, though a simplistic approach to word reading might have found this very "word" at the end of the previous sentence.
Start early, keep in touch, and have fun.
What is the most common word in Hamlet, anyway?