Our Process

We explored three methods to address the problem of computer-generated text. The original paper we referred to proposed recurrent neural networks as a possible solution, citing their ability to adapt to a given dataset and their general robustness. We took this advice to heart and implemented two different types of recurrent neural network (a basic RNN and a sequence-to-sequence model). We soon realized that in order to properly evaluate our RNNs we needed some benchmark against which to measure their success, which is why we built the n-gram model. However, the n-gram model generated better suggestions than we originally expected, even outperforming the RNN in many cases, so we included it in our final product.

RNN

The RNN was our first attempt at creating a text-generating AI. Many of the implementation-level details of the original paper's RNN were either missing or only vaguely alluded to through links. With no common code base to draw inspiration from, we turned to TensorFlow (an open-source neural network library for Python). Basing our network's structure on a tutorial by Rowel Atienza, we implemented LSTM (long short-term memory) cells and eventually branched off of our basic RNN model to create a sequence-to-sequence model.
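
For illustration, here is a minimal sketch of the kind of word-level next-word model described above. It is written with the tf.keras API rather than the TF 1.x graph code from the Atienza tutorial, and the vocabulary size, context length, and layer sizes are placeholders, not the values we actually used.

    import tensorflow as tf

    vocab_size = 5000   # assumed vocabulary size, not the project's real value
    context_len = 3     # assumed number of preceding words fed to the model

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, 128),               # map word ids to dense vectors
        tf.keras.layers.LSTM(256),                                 # LSTM cell carries short- and long-term context
        tf.keras.layers.Dense(vocab_size, activation="softmax"),   # probability distribution over the next word
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    # model.fit(contexts, next_words, ...) would train on (context, next-word) pairs,
    # where contexts has shape (num_examples, context_len).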

Seq2Seq

Through further research we found that many recurrent neural network projects were moving to sequence-to-sequence implementations as a way to improve their results, so we decided to do the same. The sequence-to-sequence expansion lets our model take an input of one length and produce an output of variable length, which allows the network to output more creative and varied sentences. A further improvement over our basic RNN is that the sequence-to-sequence model outputs full sentences rather than single words. These full-sentence outputs also have a set beginning and end, making them inherently structured and slightly more coherent.
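
A rough sketch of the encoder-decoder structure behind a sequence-to-sequence model is shown below. It is illustrative only: the layer sizes, token handling, and tf.keras usage are assumptions rather than our exact implementation.

    import tensorflow as tf

    vocab_size, embed_dim, hidden = 5000, 128, 256  # assumed sizes

    # Encoder: read the input words and summarise them into the LSTM's final states.
    enc_in = tf.keras.Input(shape=(None,), dtype="int32")
    enc_emb = tf.keras.layers.Embedding(vocab_size, embed_dim)(enc_in)
    _, state_h, state_c = tf.keras.layers.LSTM(hidden, return_state=True)(enc_emb)

    # Decoder: generate the output sentence word by word, starting from the
    # encoder's states, until it emits a stop token (so the output length can vary).
    dec_in = tf.keras.Input(shape=(None,), dtype="int32")
    dec_emb = tf.keras.layers.Embedding(vocab_size, embed_dim)(dec_in)
    dec_out, _, _ = tf.keras.layers.LSTM(hidden, return_sequences=True,
                                         return_state=True)(dec_emb,
                                                            initial_state=[state_h, state_c])
    logits = tf.keras.layers.Dense(vocab_size, activation="softmax")(dec_out)

    model = tf.keras.Model([enc_in, dec_in], logits)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

The key point is that the encoder compresses the whole input into its final states, and the decoder keeps producing words until it emits a stop token, which is what decouples the output length from the input length.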

N-Gram

We also decided to create a simpler model, which stochastically generates text based solely on the immediately preceding words. We stored the model in a trie so we could query contexts of varying lengths when predicting the next word. We encoded our input so that every training sentence ends in a stop token, so the model consistently generated whole sentences rather than simply picking up where the user left off. The main drawback of this approach is that the only input the model saw was the final word of the input sentence and the stop character, leading to sentences that don't necessarily follow from the input.
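
The idea can be sketched roughly as follows. For brevity this version stores contexts as a flat mapping from word tuples to follower counts rather than a literal trie, and the names (build_model, generate, END) are hypothetical, not taken from our code.

    import random
    from collections import defaultdict

    END = "</s>"  # sentence-stop token appended to every training sentence

    def build_model(sentences, n=2):
        """Map each (n-1)-word context to counts of the words that follow it."""
        model = defaultdict(lambda: defaultdict(int))
        for words in sentences:
            words = list(words) + [END]
            for i in range(len(words) - n + 1):
                context = tuple(words[i:i + n - 1])
                model[context][words[i + n - 1]] += 1
        return model

    def generate(model, seed, n=2, max_len=30):
        """Stochastically extend `seed` until the stop token or max_len is reached."""
        out = list(seed)
        while len(out) < max_len:
            context = tuple(out[-(n - 1):]) if n > 1 else ()
            followers = model.get(context)
            if not followers:
                break
            words, counts = zip(*followers.items())
            nxt = random.choices(words, weights=counts)[0]
            if nxt == END:
                break
            out.append(nxt)
        return out

    # Example: generate(build_model([["the", "cat", "sat"], ["the", "dog", "ran"]]), ["the"])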