CS 251: Programming Languages

Spring 2018

proj5: Eval

Due: Wednesday, 05/16 at 22:00

1. Goals

To start evaluating Scheme code.

2. Introduction

Now that you've parsed tokenized Scheme code into tree format, it's time to start evaluating! We will do this over a series of three checkpoints. First, we start the evaluator with the ability to handle if, let, and quote special forms. Next in two checkpoints, we handle the application of lambda expressions and the application of primitive procedures. Finally, we wrap up the project with many, many odds and ends.

3. Tasks

Create a header file interpreter.h (with #include guard, as always) with the following data type and function prototypes:
```
struct Frame {
   Value *bindings;
   struct Frame *parent;
};
typedef struct Frame Frame;

void interpret(Value *tree);
Value *eval(Value *expr, Frame *frame);
```
The eval function takes a parse tree of a single S-expression (as in the output of your parser) and an environment frame in which to evaluate the expression, and returns a pointer to a Value representing the value. The interpret function handles a list of S-expressions (as in a Scheme program), calls eval on each S-expression in the top-level (global) environment, and prints each result in turn.
Implement eval and interpret in interpreter.c. Your eval function must handle the following Scheme features: literals, if, quote, let, and let-bound variables.

Create main.c that, after including appropriate header files, simply does this:

int main(void) {
   Value *list = tokenize();
   Value *tree = parse(list);
   interpret(tree);
   tfree();
   return 0;
}

Modify your Makefile to compile your new interpreter code. When I'm in your directory and I execute the command make interpreter, I should get an executable file called interpreter in that directory.
Create a collection of at least 4 pairs of test files. (More is better!) Each pair of files should be named test.eval.input.XX and test.eval.output.XX, where XX is a test number. It should be the case that if I run the following command then I get no output:
```
./interpreter < test.eval.input.XX | diff - test.eval.output.XX
```
(Your code may be tested against my test files or other teams' test files.)

4. Specification

From this point forward, the specification is fairly straightforward: support the correct evaluation of various types of Scheme expressions.

Numbers, strings, and booleans. (They evaluate to themselves.)
(if test consequent alternate).
Recall that everything except #f is true in Scheme and that if short-circuits.
Scheme also allows the omission of alternate (check R5RS standard linked from the website); you do not need to support this usage. (To do this, you will likely want to add a VOID_TYPE, something we will do in the next checkpoint.)
(quote expr).
This is equivalent to 'expr, but you only need to be able to handle the version with quote.
Supporting 'expr is a possible extension. If you go in this optional direction, you will likely find that this is easier to do by first going back and modifying the parser (and the tokenizer).
```
$ cat test1
(quote a)
(quote (a b c))
(quote (a b (quote (c d e))))

$ cat test1 | ./interpreter
a
(a b c)
(a b (quote (c d e)))
```
Do not implement this by trying to somehow wrap up the quoted expression in some sort of enclosure indicating that it is quoted. For example, don't do this by creating a QUOTE_TYPE for a Value or some such. It might help you get this part of the assignment to work, but you'll get into hot water later. This reflects a common and interesting point of confusion: quote does not tell the interpreter to do something to its arguments; rather, it tells the interpreter to not do something.

If you are feeling a strong urge to add something to the parse tree indicating that something is quoted, don't do it. If you are experiencing such compulsions because your tree is not printing properly, past experience has been that the problem is likely in your code for printing your parse tree, not structuring it in the first place.
(let list-of-bindings body).
You will need to implement the creation of new frames. Much more on this in a bit.
Variables.
Now that you've implemented let and frames, you should be able to evaluate a let-bound symbol.

When you're done with this part of the project, you should be able to evaluate very simple programs like this:

(let ((x #t))
  (if x 3 5))

This should evaluate to 3. Your programs won't yet be able to handle functions; that's coming later.

Sample code

At the core of your interpreter, you will need to implement a function to evaluate Scheme code:

Here is a skeleton of my eval function:

Value *eval(Value *tree, Frame *frame) {
   switch (tree->type) {
      case INT_TYPE:
         ...
         break;

      // ... more cases ...

      case SYMBOL_TYPE:
         return lookUpSymbol(tree, frame);
         break;
      case CONS_TYPE: {
         Value *first = car(tree);
         Value *args = cdr(tree);

         // Sanity and error checking on first...

         if (!strcmp(first->s, "if")) {
            result = evalIf(args, frame);
         }

         // ... other special forms here ...

         else {
            // not a recognized special form
            evaluationError();
         }
         break;
      }
      ...
   }
   ...
}

Frames and bindings

We will be discussing/have discussed this in class, but here is a brief overview of how to evaluate Scheme code.

The eval function mentioned above takes a pointer to a "frame"; what is a frame? A frame $F$ is a collection of bindings and a pointer to the parent frame of $F$. A binding is a mapping from a variable name (i.e., a symbol) to a value.

The frame is used to resolve (find the value of) each variable encountered during evaluation. When we evaluate a symbol $x$ in a frame $F$, we look for a binding with the name $x$. If found, the associated value in that binding is the value of $x$. If not, repeat in the parent frame of $F$ (if it exists). The unique top-level frame is the one that has no parent. It is an error if we reach the top-level frame and still cannot locate a binding for the name $x$.

(In the next checkpoint, you will need to create the top-level frame to support introducing bindings to that frame using define. For now, since all bindings are local, you may use NULL to represent the top-level frame. Of course, you may opt to create the top-level frame now if you wish.)

Frames are created whenever we introduce new variable names in a let expression (and also in other contexts, to come in later checkpoints). This frame should be passed in when calling eval on the body of the let expression. For example, in the program

(let ((x 23)
      (y 251))
  (if #t x y))

the bindings for x and y are stored in a single frame. On the other hand, in the program

(let ((x 23))        ;; creates frame F_1
  (let ((y 251))     ;; creates frame F_2, whose parent frame is F_1
    (if #t x y)))    ;; evaluated in frame F_2

each of the two let expressions creates a separate frame. Let's call them $F_1$ and $F_2$. When eval evaluates (if #t x y), the current frame is $F_2$, the one created by the inner let expression. When resolving x, we first check for a binding with the name x in the current frame $F_2$. Since it is not found, we repeat the process in the parent frame of $F_2$, namely in $F_1$, and find the binding for x with value 23.

To summarize, to evaluate a let expression of the form

(let ((var-1 expr-1) (var-2 expr-2) ... (var-n expr-n)) body)

in frame $F$, do the following:

For each i from 1 to n, in any order, evaluate expr-i in $F$ and let $v_i$ be the result.
Create a new frame $G$ whose parent frame is $F$.
For each i from 1 to n, add to $G$ a binding of the name var-i to the value $v_i$.
Evaluate body in $G$ and return the result.

Evaluation errors

There are going to be many cases in which evaluation is not possible. For example:

When an if expression has the wrong number of arguments.
When the list-of-bindings for let is not a nested list of the right form.
When you encounter a variable that is not bound in the current frame or any of its ancestors.
etc...

In each of these cases you may give up evaluating by printing "evaluation error" and quitting. Of course, make sure to clean up memory on your way out; use texit for this.

It may help you to print error messages explaining where evaluation went wrong. You may include such messages after "evaluation error" if you wish.

Sample output

For example, here are a few sample runs of my code:

$ cat test1
3
5
(if #f 7 12)

$ ./interpreter < test1
3
5
12

$ cat test2
(let ((x 5)) x)
(if #t x y)
23

$ ./interpreter < test2
5
evaluation error

$ cat test3
(let ((foo "hello")
      (bar (quote foo))
      (baz #f))
  (if baz foo bar))

$ ./interpreter < test3
foo

5. Possible extensions

Your project grade will be mostly based on correctness (and style), as a sum of the scores you receive from the individual checkpoints. However, a small fraction of your project grade will be based on ambition, to be assessed on the project as a whole at the end of term. To that end, you are required to pursue some extensions beyond the specifications of the checkpoints.

To aid in this, I will list some ideas at the end of checkpoints. While listed here, these possible extensions will not be graded as part of this checkpoint. Rather, in a file called readme.txt, write down what extensions you've done (if any). We will look at this file when grading the finished project at the end of term. These ideas may range from routine to near impossible. I try to list them in ascending order of difficulty. Always refer to the R5RS standard (linked from the course website) when in need of more details.

Handle (let list-of-bindings body-1 body-2 ... body-n): the expressions body-1, body-2, ..., body-n are evaluated in order, and the value of the last one is returned.
Handle (if test consequent) variant that is missing alternate.
Handle 'expr as a shorthand for (quote expr).
Informative error messages. You will likely want a variant of printTree from the previous checkpoint which, instead of printing, returns an expression as a string.

To reiterate, you do not need to be pursuing these extensions at this time. (Some of these will be easier to do later.) You also do not need to (actually, should not) implement every extension possible. You are more than welcome to also pursue extensions of your own devising. (If you have a crazy idea, feel free to chat with me first.)

6. Submission

Your code should have no memory leaks or memory errors when run on any input (correct or incorrect) using Valgrind. We will be checking this during grading, so you should test your code via Valgrind yourself.

Follow the instructions from the previous assignment. In particular:

Update credits.txt if applicable.
Push all your code into your GitHub repository, including all created/modified files described as above, including the 4 (or more) pairs of test files you created.
When you are sure you are done, tag your commit with proj5 and push the tag.

This assignment was originally created by David Liben-Nowell and has since been updated by Dave Musicant, Laura Effinger-Dean, and Jed Yang.

Start early, have fun, and discuss questions on Moodle.