Want to use our code? Look at this first!

This page gives a high-level overview of how our code is organized and how to use it.

Code structure

The diagram below shows the folders in the code dump from the Downloads page. It relates each folder to the major components described on the SMT Intro page and shows the major classes in each one. Solid arrows indicate subclassing or interface implementation, dashed arrows indicate that one class is heavily used by another, and the outline colors indicate which components of the project use which other components.

Class descriptions

Below are brief descriptions of the role each class in the diagram above plays. See the source files for more detailed documentation.

Corpus Accessor

Parameter Store

Language Model

Translation Models

Decoders

Using the code

The code dump includes a makefile, so running make compiles the project. After compiling, run the runDecoder.sh script in the Decoders/ folder to perform translations. Running the script with no arguments prints usage info. To save space, we provide trained parameters only for Spanish to English, so the sourceLang and targetLang arguments should be set to es and en, respectively.
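
The build-and-run steps above can be sketched as a small shell helper. This is only a sketch under stated assumptions: the function name build_and_show_usage is hypothetical (it is not part of the code dump), the makefile is assumed to sit at the root of the unpacked dump, and runDecoder.sh is assumed to be executable. It deliberately passes no arguments to the script, since the script's own usage message is the authority on how sourceLang and targetLang are supplied.

```shell
# Hypothetical helper, not part of the code dump.
# Usage: build_and_show_usage [path-to-unpacked-code-dump]
build_and_show_usage() {
    root="${1:-.}"

    # Assumption: the makefile lives at the root of the unpacked dump.
    if [ ! -f "$root/makefile" ] && [ ! -f "$root/Makefile" ] && [ ! -f "$root/GNUmakefile" ]; then
        echo "no makefile found in $root" >&2
        return 1
    fi

    # Compile the project; stop here if the build fails.
    (cd "$root" && make) || return 1

    # With no arguments, runDecoder.sh prints its usage info.
    # Assumption: the script is executable; if not, run chmod +x on it first.
    (cd "$root/Decoders" && ./runDecoder.sh)
}
```

Calling build_and_show_usage from the root of the unpacked dump (or passing that folder's path as the only argument) should compile the project and then print the decoder's usage message. From there, supply es for sourceLang and en for targetLang in whatever form the usage message specifies.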