COS 371: Programming Languages

Spring 2020

proj1: Linked list

Due: Wednesday, 04/01 at 22:00

1. Goals

To implement a linked list that will be used throughout the interpreter project.

2. Introduction

Your course project is to develop a Scheme interpreter in C. To help you along the way, I will provide a number of checkpoints: well-specified intermediate tasks for you to complete. This first checkpoint is to develop a linked list.

Lists are amazing. We can build our entire Scheme interpreter using a linked list as our only data structure.

Even though a more sophisticated data structure than a linked list may prove useful, better, faster, or easier at times, a correct and flexible linked list implementation is the only structure you really need.

3. A Value type holding different types of values

One of the first questions to arise will be "What is this a linked list of? Integers? Strings? More lists?" The catch is that you'll be storing lists created in Scheme, which can contain values of a whole variety of types. You might think: "Ah! I'll have a Value class, with subclasses for IntValue, StringValue, etc." And that would be totally awesome, if you were implementing this list using an OOP (object-oriented programming) language. Oops.

In C, one way to handle this type of typing issue is to use union types. (See our online C reference that I assigned or other online tutorials.) These types allow you to have a single datatype that sometimes includes an int, sometimes a char *, etc.) Every such value should also include a tag telling you what kind of data you're storing.

There are two important but separate things we'll have to do.

Define type tags:
```
typedef enum {
   INT_TYPE, DOUBLE_TYPE, STR_TYPE /* ... */
} valueType;
```
We define a list of constants INT_TYPE, DOUBLE_TYPE, etc., which we will use to tag our Values to indicate which type of value is being stored. (Secretly, these constants are just integers, but using magic numbers is very, very bad for readability and maintainability.)
Define a Value type itself:
```
struct Value {
   valueType type;
   union {
      int i;
      double d;
      char *s;
      /* ... */
   };
};

typedef struct Value Value;
```
The Value struct includes a field (called type) containing one of these type tags, as well as a union of various possible fields. As usual, we use typedef to make Value a shorthand for struct Value.

For example, we could create a Value containing the integer 23 as follows:

Value myInt;
myInt.type = INT_TYPE;
myInt.i = 23;

Warning: C will let you set myInt.d = 3.14159265358979, despite myInt.type's value.

If you have a Value val, you will probably need to handle val differently depending on what type is stored in it. The easiest approach is to use a switch statement:

switch (val.type) {
   case INT_TYPE:
      // do something with val.i
      break;
   case DOUBLE_TYPE:
      // do something with val.d
      break;
   /* ... */
   default:
      /* ??? */
}

4. A list of links that link lists

You will thus want to make some sort of linked list implementation where the nodes contain Values. There are many different ways that one can construct a linked list. Since some of the choices below will cause you trouble much later on in the project, I will give you some guidance:

A linked list could contain a head field that points to a Node, or a linked list could simply be a pointer to a Node. Use the latter approach.
The most common approach you have likely seen for a node is a Node struct that has a data field (a Value *) and a next field (a Node *).
Don't do it this way for the assignment. There's a better way that will save you much pain later.

Because you will eventually be using your linked list to represent Scheme lists, you will have a much easier time if your linked list actually resembles a Scheme list.
Specifically, each node should be a "cons cell" with two fields, which I'll call car and cdr (of course). Each of these fields is a Value *.
```
struct ConsCell {
   Value *car;
   Value *cdr;
};
```
Since cdr should point to another node (or the empty list, see next point), we deduce that ConsCell should be a valid type of Value.
The empty list can be represented by a NULL pointer or a pointer to a Value of NULL_TYPE that represents the empty list. Use the latter approach.

To summarize, add CONS_TYPE and NULL_TYPE into the valueType enumeration and add a field c of type struct ConsCell to the union in the Value struct:

typedef enum {
   INT_TYPE, DOUBLE_TYPE, STR_TYPE, CONS_TYPE, NULL_TYPE /* ... */
} valueType;

struct Value {
   valueType type;
   union {
      int i;
      double d;
      char *s;
      /* ... */
      struct ConsCell {
         struct Value *car;
         struct Value *cdr;
      } c;
   };
};

typedef struct Value Value;

The astute student will notice two changes to struct ConsCell when moved into the union. I now write struct in front of Value (because the typedef abbreviation cannot be established before we are done defining struct Value) and naming the ConsCell type c in the union.

So, a linked list has type Value *. This Value is of NULL_TYPE if the list is empty. Otherwise, it is of CONS_TYPE and holds a cons cell. The car of that cell points to a Value holding the first element of the list. The cdr of that cell points to another Value representing the rest of the list. (Unless something has made the list improper in some way, if you cdr down a linked list you will eventually reach a Value of NULL_TYPE.)

5. Instructions

Check out from GitHub your team's blank repository (your team name, followed by -project).
Download and add the contents of the starter archive to your Git repository. You are encouraged to simply place files in the root of your repository, not in a proj1 subdirectory or anything like that. You will continuously change many of the files for each checkpoint. Git will keep track of your changes.
The files you get include:
- value.h: defines the Value datatype, as described above.
- linkedlist.h: prototypes all of the functions that you will need to write from scratch.
- linkedlist_test.c: a tester function that makes some nodes, puts them into a linked list, displays them, and cleans up memory afterwards.
- Makefile: allows for a nicely packaged sequence of commands to compile code.
The missing file here is linkedlist.c, which you'll need to create yourself in order to implement everything listed in linkedlist.h.
To compile your code, issue the command:
```
make
```
This will follow the instructions in the Makefile for building your project in order to produce an executable called linkedlist.
At first, it won't build at all because your linkedlist.c file isn't there. Create the file and for now, create every function that you need with no code inside it so that you can get everything to build. Once you have done that, you can begin implementing your functions, following an incremental development plan and testing as you go, as always.

To run your code, issue this command at a prompt:
```
./linkedlist
```

6. Notes

When we test your code for grading purposes, we will use value.h and linkedlist.h exactly as provided here, so don't change them. We will also use linkedlist_test.c exactly as provided here. Furthermore, we will use a fancier linkedlist_test.c, of our own devising, that follows the specifications in linkedlist.h, to see if you've carefully tested your linked list. The linkedlist_test.c that I have provided very intentionally does not test all possible cases. You should test further.

Your code should have no memory leaks or memory errors when run on any input (correct or incorrect) using Valgrind. We will be checking this during grading.
Use either command to run your code through Valgrind:
```
valgrind --leak-check=full --show-leak-kinds=all ./linkedlist
make memtest
```
For your own testing purposes, feel free to edit (and push) linkedlist_test.c and Makefile. But remember, your code needs to run with the original linkedlist_test.c that we provide. When we grade, we will simply copy out your linkedlist.c file (and no other files) into our own project, and test it.
Look up how to use assert (hint: man assert at the terminal) to make assertions. The idea is that your linked list code is for internal use only, so you should expect your functions to be called on correct arguments, and thus you need NOT gracefully handle incorrect arguments. Of course, to err is human, and you may inadvertently call your linked list functions with incorrect arguments at some point in your interpreter project. When this happens, your C program may or may not fail. This makes debugging harder. Instead, use assert to guarantee failure when passed incorrect arguments. Trust me, this will make your lives easier later.
To make cleanup easier (or even possible), you will likely want to do a deep copy of the char *s for Value of STR_TYPE. Look up and use strcpy (hint: man strcpy at the terminal).

7. Submission

Follow the instructions from the previous assignments. That is:

Create a file called credits.txt, write "proj1" (you will add to this same file in subsequent assignments), and below it, declare "We didn't get any help" or list anyone who helped you with the assignment (including lab assistants and classmates), and any websites, books or other sources you used for reference. Also briefly describe how you were helped. Please be thorough.
Add, commit, and push regularly.
When you are done, double check to make sure you have committed the version you want graded; tag and push the tag:
```
git tag proj1
git push --tags
```
(Recall that a normal git push does not push tags. Be sure to check on GitHub to make sure that it received your final submission.)

This assignment was originally created by David Liben-Nowell and has since been updated by Dave Musicant, Laura Effinger-Dean, and Jed Yang.

Start early, have fun, and discuss questions on Moodle.