Lab Assignment 3 - Pipes

Deadlines

  • Tell me partners: due Wednesday, January 31st, at 10:00pm

  • Design document: due Thursday, February 1st, at 10:00pm (post on a “design pod” channel on Slack)

  • Peer feedback: due Tuesday, February 6th, at 10:00pm (again as a post on Slack)

  • Lab #2 retrospective: due Tuesday, February 6th, at 10:00pm (on Gradescope)

  • Code implementation: due Monday, February 12th, at 10:00pm (on Gradescope)

Introduction

A pipe is a sequential communication channel between two endpoints, supporting writes on one end, and reads on the other. Reads and writes are asynchronous and buffered, up to some internally defined size.

The OS kernel allocates and manages this buffer:

  • A user process calling read on the read end of a pipe will block (e.g., it does not immediately return, like we saw with waiting on condition variables and acquiring locks) until there are bytes to read (read is allowed to do a partial read, meaning it returns having read fewer bytes than the user requested).
  • A user process calling write on the write end of a pipe will block if there is no more room in the internal buffer.

Pipes are a simple way to support interprocess communication, especially between processes started by fork.

Goals

Your task is to implement the UNIX-based system call for pipe. This opens two file descriptors and maintains a connection between them (BBQ, anyone?) to enable inter-process communication.

Collaboration policy

For this and most other assignments, you are encouraged to talk at a high level with your classmates. However, any low-level discussions (e.g., that involve actual code instead of ideas) should be with at most one other student in the class.

See our class collaboration policy for more information.

Lab Assignment #2 retrospective

Instead of doing quite so much peer review for this lab, you’ll spend some of that time reflecting on how Lab #2 went. Your goal is to generate ~2 paragraphs answering the following questions:

  • What went well?
  • What really didn’t go well?
  • What are some things that took more time than you expected?
  • What, if anything, went better than you expected?
  • Did you follow your implementation plan from your design? If not, what steps did you end up tackling earlier?
  • If you could tell students in future offerings of this course something to help them with this lab, what would you tell them?

You should complete this individually and submit it on Gradescope.

Getting started

Pull any changes to the baseline code by running

git pull upstream main

and merging. If you have any issues with this, please let me know!

After the merge, double check that your code still passes the Lab Assignments #1 and #2 tests.

Background

File operations

From the user’s perspective, pipes are just file descriptors that applications can read from and write to using the standard read and write system calls (sys_read and sys_write from Lab Assignment #1). Thus, functions to read and write with pipes need to operate on file structs retrieved from a process’s open-file-descriptor table. But, if sys_read calls fs_read_file, how can it handle the fact that reading from a pipe works differently underneath the hood than reading from a file??

To answer this, we can look at the implementation of fs_read_file in kernel/fs/fs.c:

ssize_t
fs_read_file(struct file *file, void *buf, size_t count, offset_t *ofs)
{
    ssize_t rs = 0;
    if (file->oflag != FS_WRONLY) {
        rs = file->f_ops->read(file, buf, count, ofs);
    }
    return rs;
}

This function does two things:

  1. Checks that the file is opened opened as write only (FS_WRONLY).
  2. Uses the file struct’s f_ops field to call its read operation.

Diving into this more, we can see that the f_ops field is a pointer to a struct file_operations. This struct contains four function pointers, and is how we can control the behavior of read, write, readdir, and close for a particular file struct. Here is a truncated (comments removed) display of this struct (note the funky function pointer syntax):

/*
 * File operations
 */
struct file_operations {
    ssize_t (*read)(struct file *file, void *buf, size_t count, offset_t *ofs);
    ssize_t (*write)(struct file *file, const void *buf, size_t count, offset_t *ofs);
    err_t (*readdir)(struct file *dir, struct dirent *dirent);
    void (*close)(struct file *file);
};

For example, stdin is a file struct that has specialized behavior. in kernel/console.c, we find:

static struct file_operations stdin_ops = {
    .read = stdin_read,
};

In the console_init function that initializes stdin and stdout, we find:

    stdin.oflag = FS_RDONLY;
    stdout.oflag = FS_WRONLY;
    stdin.f_ops = &stdin_ops;
    stdout.f_ops = &stdout_ops;

This is making it so that when fs_read_file calls f_ops->read(file, buf, count, ofs) with stdin, stdin_read gets called. We can do the same thing with pipe operations:

static ssize_t pipe_read(struct file *file, void *buf, size_t count, offset_t *ofs);
static ssize_t pipe_write(struct file *file, const void *buf, size_t count, offset_t *ofs);
static void pipe_close(struct file *p);

static struct file_operations pipe_ops = {
   .read = pipe_read,
   .write = pipe_write,
   .close = pipe_close
};

Like for stdin and its stdin_ops, we would initialize a pipe’s f_ops field to point to pipe_ops.

Pipe information

Although a pipe appears as just another file to the user, internally it has more to keep track of than just what’s in osv’s current struct file. This pipe-specific data likely includes:

  • synchronization primitives (locks and condition variables)
  • a buffer (char array) to hold bytes written to the pipe
  • status information (such as whether each end is open)

As discussed above, pipe operations will need to apply to file structs. One possibility would be to add all the information for pipes directly to the file struct. However, this is not a very flexible or modular solution. For one thing, it would involve allocating a pipe buffer for every file struct whether or not it was actually a pipe.

A better and more general solution is to add a generic void *info field to the file struct that can be used to point to extra information that various kinds of files might need. When creating a pipe, then, you can use fs_alloc_file to allocate a file struct, and then initialize the info field. In pipe_read, pipe_write, and pipe_close, we can retrieve this information with

struct pipe *p = (struct pipe *)file->info;

Implementation

The pipe system call does not specify how large the internal buffer needs to be, but you may use 512 bytes as the buffer size. You should be able to write any number of bytes to a pipe (although write will eventually block if no other process is reading from the pipe).

Concurrent operations (reads and writes) can happen in any order, but each operation completes as an atomic unit. Because pipes support concurrency, your pipe implementation will need to use locks and condition variables.

You’ll be implementing the pipe system call:

/*
 * Corresponds to int pipe(int *fds);
 *
 * Creates a pipe and two open file descriptors. The file descriptors
 * are written to the array at fds, with fds[0] as the read end of the 
 * pipe and fds[1] as the write end of the pipe.
 * 
 * Return:
 * ERR_OK on success
 * ERR_FAULT if fds address is invalid
 * ERR_NOMEM if 2 new file descriptors are not available
 */
sysret_t
sys_pipe(void *arg);

The pipe system call should create a pipe (a holding area for written data) and open two file descriptors, one for reading and one for writing. A write to a pipe with no open read descriptors should return an error. A read from a pipe with no open write descriptors should return any remaining buffered data, and 0 to indicate EOF (end of file) if no buffered data remains.

For memory management, you can use kmalloc and kfree or the kmem_cache-based approach used by the BBQ code.

Getting started

To store data written to a pipe, pipes can make use of a blocking bounded queue like the one we discussed with condition variables. Remember that OSTEP Chapter 30 (section 30.2 specifically) walks through the implementation of this kind of buffer, including whey two condition variables are needed. It’s up to you whether to integrate a BBQ as a separate kernel data structure that’s used in the implementation of pipes, or to adapt the code for the pipe implementation. Note that the code we’ve discussed before was for a queue of ints, whereas a pipe should have a buffer of bytes (chars).

The reference solution for this lab assignment made changes to:

  • include/kernel/fs.h
  • kernel/syscall.c

It also added new files:

  • include/kernel/pipe.h
  • kernel/pipe.c

Testing

After you implement the system call described above, test your Lab Assignment #3 code by either running individual tests in the osv shell (i.e., make qemu and then enter the name of the test file in user/lab3 without the .c) or run python3 test.py 3 to run all of the tests.

Some of the tests depend on fork, wait, and exit. If you were not able to get these working in Lab 2, please see me.

Debugging

When debugging, I would recommend running individual tests in osv. Note that test.py depends on observing TEST-NAME passed in the output from osv, and the presence of debugging print statements can interfere with this (as output from various processes can get mixed together). It’s also important, therefore, that you disable any debugging print statements before submission to Gradescope.

Hints

Here are some things to keep in mind:

  • The user passes fds into sys_pipe, so you will fill in that array (think of it as an int[2]) with the file descriptors assigned to the file structs for the read and write ends of the pipe.

  • Think of a pipe as a blocking bounded queue plus extra information the pipe needs (e.g., whether the read and write ends are open). This means pipe_read should be similar to bbq_remove and pipe_write should be similar to bbq_insert.

  • Although cool, a reader-writer lock is probably a more complex solution than is necessary.

  • It is possible for one process to call pipe_read while another calls pipe_write; your implementation should be thread-safe.

  • The functions pipe_read, pipe_write, and pipe_close that you will implement are the ones that f_ops->read, f_ops->write, and f_ops->close will point to for pipe files. Inside pipe_read there is no other function you can call to perform the read for you—you are implementing that function.

  • You should only read as many bytes from the buffer as the user requests (read them one at a time in a loop, copying the byte from the pipe’s data array to the array (buf) the user passed in).

  • The offset variable used for the file system is not relevant for pipes; it just has to be an argument to pipe_read and pipe_write to conform to the file API.

What to turn in

You will submit your work for this assignment via Gradescope.

Gradescope lets you submit via GitHub, which is probably the easiest method. All you’ll need to do is to connect your GitHub account (the Gradescope submission page has a button for this) and select the repository and the branch you wish to submit. Alternatively, you can create a .zip file of the osv-w24 directory and upload that. The arch, include, and kernel directories from your submission will be used in testing.

When you submit, the autograder will compile your code and run the test cases. The Gradescope autograder will run each test multiple times.

Although you are allowed to submit your answers as many times as you like, you should not treat Gradescope as your only debugging tool. Many people may submit their assignments near the deadline, and thus Gradescope will take longer to process the requests. You may not get feedback in a timely manner to help you debug problems.

Grading

This assignment will be graded out of 100 points, with 90 points coming from the tests shown in the table below. Comments explaining your approach can help earn partial credit if there are tests that don’t pass. The remaining 10 points are based on coding style, so make sure to submit clean, well-organized code.

Test    Points
pipe-bad-args 10
pipe-basic 10
pipe-errors 10
pipe-race 20
pipe-robust 20
pipe-test 20