CS 251: Interpreter, if/let

It's time to start actually implementing the interpreter portion of the project!

Getting started

Check out your blank repository (called "interpreter" this time, with your team name attached to it), and then download this zip file. Add the files to your repository as usual. As in the past, you have a choice regarding linkedlist.c, talloc.c, tokenizer.c, and parser.c. If you want to continue to build your own interpreter that is truly yours, you can copy in these files from the last project, and move forward. Alternatively, if you would prefer to use my versions instead, you can do so. In the lib directory you'll see a variety of .o files, which are compiled binary versions of my .c files. If you would prefer to use one or many of these in your project instead of your own, you can replace them in the Makefile. Specifically, in the Makefile, on the line starting with SRCS, you can replace talloc.c with lib/talloc.o, etc. I suspect you are likely best off taking all of the files together, as opposed to mixing and matching yours with mine, but you may try if you like.

Note that compiled binaries are operating system specific (sometimes version specific). I have compiled these .o files to work with the course virtual machine.

The key missing file is interpeter.c, which you will need to create.

To compile your code, issue the command "make" at the command prompt. (This will give an error if you haven't yet created the .c files you need, or have changed the Makefile to refer to the necessary .o files.)

To run your code, first create a file with some Scheme code in it, and give it a filename (e.g., input.txt). Then issue this command at a prompt:

      ./interpreter < input.txt
    

Your code should read data from standard input. In that case, the above command will redirect the text in the input.txt file into your program.

To run your code through valgrind, I have provided a shell script to set all the appropriate options. You can do it this way:

      ./valgrind.sh  < input.txt
   

Functionality

For this phase of the project, you must support the correct evaluation of the following types of Scheme expressions:

  1. Boolean, integer, real, and string literals. (They evaluate to themselves.)

  2. (if test trueExpr falseExpr)

    You may assume that test evaluates to a boolean.

  3. (let list-of-bindings body).

    You will need to implement the creation of new frames. Much more on this in a bit.

  4. Variables. Now that you've implemented let and frames, you should be able to evaluate a bound symbol.

When you're done with this part of the project, you should be able to evaluate very simple programs like this:

(let ((x #t)) (if x 3 5))

This should evaluate to 3. Your programs won't yet be able to handle functions; that's coming later.

Core functionality

At the core of your interpreter, you will need to implement a function to evaluate Scheme code:

Value *eval(Value *tree, Frame *frame) { ... }

Given an expression tree and a frame in which to evaluate that expression, eval returns the value of the expression. I'll explain the "frame" in a moment. Here's a skeleton of my eval function, to give you an idea of where to start:

Value *eval(Value *tree, Frame *frame) {
   switch (tree->type)  {
     case INT_TYPE: {
        ...
        break;
     }
     case ......: {
        ...
        break;
     }  
     case SYMBOL_TYPE: {
        return lookUpSymbol(tree, frame);
        break;
     }  
     case CONS_TYPE: {
        Value *first = car(tree);
        Value *args = cdr(tree);

        // Sanity and error checking on first...

        if (!strcmp(first->s,"if")) {
            result = evalIf(args,frame);
        }

        // .. other special forms here...

        else {
           // not a recognized special form
           evalationError();
        }
        break;
     }

      ....
    }    
    ....
}

Frames, and bindings

We will be discussing this in class, but here is a brief overview of how to evaluate Scheme code.

The eval function mentioned above takes a pointer to a "frame"; what is a frame? A frame is a collection of bindings. OK, what's a binding? A binding is a mapping from a variable name (i.e. a symbol) to a value. Frames are created whenever we introduce new variable names. For example, in the program

(let ((x 5) (y "hello")) (if #t x y))

the bindings for x and y are stored in a single frame. You will have to construct a frame whenever eval encounters a let. This frame should be passed in when calling eval on the body of the let expression. The frame is used to resolve (find the value of) each variable encountered during evaluation. When eval tries to evaluate x, it looks in the current frame to find a value for x.

There's one other crucial detail here: frames can be chained together to create a larger environment consisting of multiple frames. For example, in the program

(let ((x "hello")) (let ((y "goodbye")) (if #t x y)))

each of the two let expressions creates a separate frame. When eval evaluates x, it first checks the innermost let's frame; since that frame only contains a binding for y, it then checks the outer let's frame, where it finds a binding for x with value "hello".

To summarize, evaluating a let expression of the form:

(let ((var1 expr1) (var2 expr2) ... (varn exprn)) body)

consists of the following steps:

  1. Let e be a pointer to the current frame. Create a new frame f whose parent frame is e.
  2. For i = 1 through n:
    1. Let vali be the result of evaluating expri in frame e.
    2. Add a binding from vari to vali to f.
  3. Evaluate body using frame f and return the result.

You will need to implement data structures for frames and bindings. The easiest approach is to use linked lists. The linked list of frames essentially forms a stack: you push on a new frame just before evaluating the body of the let expression, and pop the frame off before returning (although the "pop" really happens automatically when you return from eval). Within each frame you should store a list of variable/value pairs (i.e. bindings), using anotherr linked list. When you need to resolve a variable, check the current frame first, then its parent if that variable is not in the current frame, then its parent, and so on until you reach a frame with no parent.

Evaluation errors

There are going to be many cases in which evaluation is not possible. For example:

In each of these cases you may immediately quit, printing "evaluation error". Make sure to clean up memory on your way out; use texit for this.

It may help you to print error messages explaining where evaluation went wrong. You may include such messages after "evaluation error", if you wish.

Multiple S-expressions

As discussed in the parser assignment, a Scheme program is a list of S-expressions. You should call eval on each S-expression. This is what the function interpret, that you need to write, should do: it is a thin wrapper that calls eval for each top-level S-expression in the program. You should print out any necessary results before moving on to the next S-expression.

Sample output

$ cat test1
3
5
(if #f 7 12)
$ ./interpreter < test1
3
5
12

$ cat test2
(let ((x 5)) x)
(if #t x y)
$ ./interpreter < test2
5
evaluation error

$ cat test3
(let ((foo "hello")
      (bar 7)
      (baz #f))
  (if baz foo bar))
$ ./interpreter < test3
7

Memory errors

Your code should have no memory leaks or memory errors when run on any input (correct or incorrect) using valgrind. We will be checking this during grading.

What to submit

Push all of your code into your git repository. Your submitted files should be exactly what you checked out, but also with a file named interpeter.c. Your Makefile may have changed if you modified it to work with my binaries. If you did this, add a readme.txt file explaining what you did.

For the rest of the interpreter project, the work that you do will be cumulative. Starting over with a new repository every time will get fairly cumbersome. Therefore, you will to continue to push your work for the rest of the project using this same repository. We need to know which commit is intended for which submission, so starting with this assignment you should tag your submissions. After you have committed your final change for this particular assignment, enter the following commands:

      git commit -am "Going to add a tag for this particular commit"
      git push
      git tag if-let
      git push --tags

Make sure to include both of those push commands.

Look on the web server after the push, and in the commit history you should be able to see the tags appear.

You should wait to tag until you're sure you have committed the version you want us to grade. That said, in the unlikely event that you goof and realize that you want to commit a newer version for us to grade, you'll need to use another tag. Reusing the same tag for a different commit is generally a really bad idea. If you need to tag another commit, append a ".1" (or whatever version number you're up go. For example:

      git commit -am "Oops: grade this one, I really mean it"
      git push
      git tag if-let.1
      git push --tags

You should also include in your submission a collection of at least 4 pairs of test files. (More is better!) Each pair of files should be named interpreter-test.input.XX and interpreter-test.output.XX, where XX is a test number. It should be the case that if I run the following command then I get no output:

./interpreter < interpreter-test.input.XX | diff interpreter-test.output.XX -

When we test your code for grading purposes, we will use our own set of test files which we will run using exactly the same command as described above. Test carefully.

Your code should have no memory leaks or memory errors when run on any input (correct or incorrect) using valgrind. We will be checking this as well during grading.

Remember, you can test your code in valgrind this way:

./valgrind.sh < interpreter-test.input.XX

Have fun interpreting!


This assignment was originally created by David Liben-Nowell and has since been updated by Dave Musicant and Laura Effinger-Dean.