Artificial intelligence is an active and exciting area of computer science. Can computers be made to do tasks that seem "intelligent"? Some of the goals (and partial successes) of artificial intelligence include important areas such as medical diagnosis, drug discovery, and robotics. For this assignment, we will visit the part of artificial intelligence referred to as "natural language processing," i.e. getting a computer to work with a natural language such as English.
More specifically, you will write a program called AIAuthor that will actually write original prose in the style of a chosen human author. Can computers be creative? We'll find out!
This is a team assignment.
Find a source text. Specifically, find a text file that contains some text that you want the computer to creatively imitate. Project Gutenberg is a great place to start.
The technique that you will use works conceptually as follows:
Read the source text into memory as a single string. Unlike previous examples of reading text (like for the spelling dictionary), we actually want to retain spaces in the original document. Therefore, you should read in an entire line at a time, instead of a word at time. Check out the Scanner class nextLine method.
You will also want to use a StringBuilder instead of a
String to hold the data as you read it in. The problem with
using a String to hold the data is that every time you read a
new line, you need to append it to the old one. Every time that you
append to a String, Java creates a brand new one by copying
the old String and then appending the new text. This is
extemely slow. A StringBuilder is much faster way to manage
appending multiple strings of text. When you're done appending, you
can convert the StringBuilder back to a normal
String. Here's an example:
StringBuilder sbtext = new StringBuilder(); sbtext.append("hello"); sbtext.append(" goodbye "); sbtext.append("friends"); String text = sbtext.toString(); // converts the StringBuilder to a normal string
Be careful when you append the lines together that you append a space appropriately at the end of each line. If you don't handle this right, the last character of each line and the first character of the next next line will always appear right next to each other.
Here's an example of some text that my program generates, using a window size of 5 on Grimm's fairy tales:
I will you learn home to taken his way, and around him, and left his come intender him in they cried: 'The little child,' said the other; 'let us for the cloak, and brough the peasant was to King.Alternatively, when I tried it with a window size of 8, here's what I got one of the times that I ran it:
The servants running across the stables to salt and butter to eat and drink, the head you must get up quickly, and fell fainting to the test.For part 1 of this assignment, your program should at least be able to read in a source text, ask for a window size, and choose a random seed from the text (which should be printed to the screen). Of course, you can turn in more of the entire assignment (or the whole thing) for part 1 if you like.
Make sure that you use good design. Think carefully about how to layout your program so that you use classes and objects appropriately, and so that your methods remain short and self-contained.
Once you get your program running, try to get your program to write the most funny or insightful text that you can.