Lab Assignment 1 - System Calls

Deadlines

  • Due: Monday, January 15, at 10:00pm

Introduction

In this assignment, you will add some system calls to osv for interacting with the file system.

Goals

Your task is to implement the system calls listed below. Unimplemented system calls will panic if they are called, so as you implement each system call you should remove the panic.

For this assignment, you do not need to worry about synchronization. There will only be one process.

Collaboration policy

For this and most other assignments, you are encouraged to talk at a high level with your classmates. However, any low-level discussions (e.g., that involve actual code instead of ideas) should be with at most one other student in the class.

See our class collaboration policy for more information.

Logistics with git

If your undergrad experience is like mine, you have probably been told often to use version control, but rarely shown how. Ask me about my Ray Caster assignment for a scary story on why version control is great.

The repository you’ve cloned for osv is already based on git, and I super strongly encourage you to push your changes to a private GitHub repo. Here’s why:

  • You can still easily pull any changes I make (e.g., typos).
  • You can push your changes to a place they can’t easily be lost.
  • This is a convenient way to submit on Gradescope.
  • If you add me as a collaborator, I can easily check your progress remotely if you are debugging and want some help.
  • It’s really good experience for both industry and grad school.

A git repository can have one or more remotes. These are other copies of the repository that one can pull changes from and push changes to. Because you initially cloned the osv-w24 repo from mantis in Lab Assignment 0, the mantis repo is named the origin remote. You can see this by running (from the osv-w24 directory):

$ git remote -v
origin  ssh://tamert@mantis.mathcs.carleton.edu:/web-pages/www.cs.carleton.edu/faculty/tamert/courses/cs332-w24/osv-w24 (fetch)
origin  ssh://tamert@mantis.mathcs.carleton.edu:/web-pages/www.cs.carleton.edu/faculty/tamert/courses/cs332-w24/osv-w24 (push)

By default, the git pull and git push commands interact with the origin remote. Since you will want that default to point at your own GitHub repo, the first step is to rename the existing origin remote to upstream:

git remote rename origin upstream

Now you will get any updates I make to the original repo by running the following (do this now):

git pull upstream main

At this point, you’re ready to put your repo on GitHub. To do so, follow these steps:

  1. Go to github.com and sign in (create an account if you don’t have one–it’s free).

  2. On the left-hand side, you should see a green button to create a new repository.

  3. Give it a name, set it to Private, and click Create repository.

  4. At the top, under Quick setup, toggle to SSH (GitHub has deprecated HTTPS for some tasks).

  5. Now, navigate to your osv-w24 repo, and from that terminal run the three commands under …or push an existing repository from the command line:

     a. git remote add origin <YOUR-REPO-URL-HERE> adds the GitHub repo as a new remote named origin for your local repository (either type or copy-paste the line with your actual GitHub URL).

     b. git push -u origin main sends the code from the local repo to the GitHub repo (i.e., it pushes the contents of the main branch to the origin remote). Note that for this step, you’ll need to have set up an SSH key on GitHub.

  6. Add me as a collaborator on the GitHub repo by going to Settings->Manage Access on the GitHub repo page. My GitHub account is tanya-amert.

Background

Existing system calls

The osv operating system has to support a bunch of system calls. What follows is a list of the ones that are already implemented (look in kernel/syscall.c to see their implementations).

  • Process system calls:

    • int spawn(const char *args): creates a new process

    • int getpid(): returns the pid of the calling process

  • File system system calls:

    • int link(const char *oldpath, const char *newpath): creates a hardlink for a file

    • int unlink(const char *pathname): removes a hardlink

    • int mkdir(const char *pathname): creates a directory

    • int chdir(const char *path): changes the current working directory

    • int rmdir(const char *pathname): removes a directory

  • Utility system calls:

    • void meminfo(): prints information about the current process’s address space

    • void info(struct sys_info *info): reports system info

Trap

In osv, software interrupts are used to implement system calls. When a user application needs to invoke a system call, it issues an interrupt with the instruction int 0x40. System call numbers are defined in include/lib/syscall-num.h. Before issuing the int instruction, the user program is responsible for setting the register %rax to the chosen system call number.

The software interrupt is captured by the registered trap vector (arch/x86_64/kernel/vectors.S), and the handler in arch/x86_64/kernel/vectors.S will run. The handler reaches the trap function in arch/x86_64/kernel/trap.c, and the trap function routes the interrupt to the syscall function implemented in kernel/syscall.c. The syscall function then routes the call to the respective handler in kernel/syscall.c.
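
The last routing step can be pictured as a table lookup keyed by the system call number. The following is only a minimal sketch of that idea, not osv's actual code: every name prefixed with demo_ or DEMO_, the system call numbers, and the error value are hypothetical stand-ins (the real numbers are in include/lib/syscall-num.h).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical numbers; the real ones live in include/lib/syscall-num.h. */
enum { DEMO_SYS_GETPID = 0, DEMO_SYS_CLOSE = 1, DEMO_NUM_SYSCALLS = 2 };

#define DEMO_ERR_INVAL (-1) /* hypothetical error value */

typedef int64_t demo_sysret_t;
typedef demo_sysret_t (*demo_handler_t)(void *arg);

/* Toy handlers that ignore their argument. */
static demo_sysret_t demo_sys_getpid(void *arg) { (void)arg; return 42; }
static demo_sysret_t demo_sys_close(void *arg) { (void)arg; return 0; }

/* One handler per system call number. */
static demo_handler_t demo_table[DEMO_NUM_SYSCALLS] = {
    [DEMO_SYS_GETPID] = demo_sys_getpid,
    [DEMO_SYS_CLOSE] = demo_sys_close,
};

/* Models the syscall step: the number set in %rax selects the handler. */
demo_sysret_t demo_syscall(uint64_t rax, void *arg)
{
    if (rax >= DEMO_NUM_SYSCALLS) {
        return DEMO_ERR_INVAL; /* unknown system call number */
    }
    assert(demo_table[rax] != NULL); /* every number in range has a handler */
    return demo_table[rax](arg);
}
```

In this sketch, demo_syscall(DEMO_SYS_GETPID, NULL) runs the toy getpid handler, while an out-of-range number is rejected before any handler runs.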

File, file descriptor, and inode

The kernel needs to keep track of open files so it can read, write, and eventually close them. A file descriptor is an integer that represents an open file. Somewhere in the kernel you will need to keep track of these open files. Remember that file descriptor numbers must be reusable across processes: file descriptor 4 in one process may refer to a different open file than file descriptor 4 in another (although they could reference the same open file).

Traditionally, the file descriptor is an index into an array of open files.

The console is simply a file (file descriptor) from the user application’s point of view. Reading from the keyboard and writing to the screen is done via the kernel file system call interface. Currently, reading from and writing to the console is implemented as a hard-coded number, but as you implement file descriptors, you should use stdin and stdout file structs as backing files for console reserved file descriptors (0 and 1).
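
To make the "index into an array" idea concrete, here is a minimal sketch of a per-process table with descriptors 0 and 1 reserved for the console. It is an illustration under stated assumptions, not the reference solution: the table size, the error values, and all demo_ names are hypothetical, and struct demo_file merely stands in for osv's file struct.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FDS 8      /* hypothetical table size */
#define DEMO_ERR_NOMEM (-1) /* hypothetical error values */
#define DEMO_ERR_INVAL (-2)

/* Stand-in for the kernel's file struct. */
struct demo_file { const char *name; };

/* Stand-ins for the console's stdin and stdout file structs. */
static struct demo_file demo_stdin = { "stdin" };
static struct demo_file demo_stdout = { "stdout" };

/* Per-process table: the fd is the index, NULL marks a free slot. */
struct demo_fdtable { struct demo_file *files[DEMO_MAX_FDS]; };

void demo_fdtable_init(struct demo_fdtable *t)
{
    assert(t != NULL);
    for (int i = 0; i < DEMO_MAX_FDS; i++) {
        t->files[i] = NULL;
    }
    t->files[0] = &demo_stdin;  /* fd 0: console input */
    t->files[1] = &demo_stdout; /* fd 1: console output */
}

/* Returns the lowest free fd, or an error value when the table is full. */
int demo_fd_alloc(struct demo_fdtable *t, struct demo_file *f)
{
    assert(t != NULL && f != NULL);
    for (int fd = 0; fd < DEMO_MAX_FDS; fd++) {
        if (t->files[fd] == NULL) {
            t->files[fd] = f;
            return fd;
        }
    }
    return DEMO_ERR_NOMEM;
}

/* Frees a slot so that a later allocation can reuse its number. */
int demo_fd_release(struct demo_fdtable *t, int fd)
{
    assert(t != NULL);
    if (fd < 0 || fd >= DEMO_MAX_FDS || t->files[fd] == NULL) {
        return DEMO_ERR_INVAL;
    }
    t->files[fd] = NULL;
    return 0;
}
```

Because each process would own its own table, the same number can name different files in different processes, and a full table produces an error value rather than a panic.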

Implementation

For this assignment, you are provided with an Assignment 1 Design Document to guide you through your implementation. This will also serve as an example when you write your own design docs for future assignments.

Hints

As you work, keep the following in mind:

  • File descriptors are just integers.

  • Look at already-implemented system calls to see how to parse the arguments. (Example: kernel/syscall.c:sys_read.)

  • If a new file descriptor is allocated, it must be saved in the process’s file descriptor table. Similarly, if a file descriptor is released, this must be reflected in the file descriptor table.

  • A full file descriptor table is a user error (you should return an error value instead of calling panic).

  • A complete file system is already implemented and available from within the kernel. You can use fs_read_file/fs_write_file to read from and write to a file. You can use fs_open_file to open a file. If you decide to have multiple file descriptors referring to a single file struct, make sure to call fs_reopen_file() on the file each time. You can find information about a file in the file struct and the inode struct inside of the file struct.

  • For this assignment, the reference solution makes changes to kernel/syscall.c, kernel/proc.c, and include/kernel/proc.h.

What to implement

  1. File descriptor opening: sysret_t sys_open(void *arg)
  2. File descriptor reading: sysret_t sys_read(void *arg)
  3. Closing a file: sysret_t sys_close(void *arg)
  4. File descriptor writing: sysret_t sys_write(void *arg)
  5. Reading a directory: sysret_t sys_readdir(void *arg)
  6. Duplicating a file descriptor: sysret_t sys_dup(void *arg)
  7. Getting file status: sysret_t sys_fstat(void *arg)

More details

  1. File descriptor opening
/*
 * Corresponds to int open(const char *pathname, int flags, int mode); 
 * 
 * pathname: path to the file
 * flags: access mode of the file
 * mode: file permission mode if flags contains FS_CREAT
 * 
 * Open the file specified by pathname. Argument flags must include exactly one
 * of the following access modes:
 *   FS_RDONLY - Read-only mode
 *   FS_WRONLY - Write-only mode
 *   FS_RDWR   - Read-write mode
 * flags can additionally include FS_CREAT. If FS_CREAT is included, a new file
 * is created with the specified permission (mode) if it does not exist yet.
 * 
 * Each open file maintains a current position, initially zero.
 *
 * Return:
 * On success, non-negative file descriptor. The file descriptor returned by a
 *   successful call will be the lowest-numbered file descriptor not currently
 *   open for the process.
 * On failure:
 *   ERR_FAULT - Address of pathname is invalid.
 *   ERR_INVAL - flags has invalid value.
 *   ERR_NOTEXIST - File specified by pathname does not exist, and FS_CREAT is not
 *                  specified in flags.
 *   ERR_NOTEXIST - A directory component in pathname does not exist.
 *   ERR_NORES - Failed to allocate inode in directory (FS_CREAT is specified).
 *   ERR_FTYPE - A component used as a directory in pathname is not a directory.
 *   ERR_NOMEM - Failed to allocate memory.
 */
sysret_t
sys_open(void *arg);
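
The "exactly one access mode" rule can be checked with a bit mask. The sketch below assumes a particular encoding: the DEMO_ constants and their numeric values are hypothetical, so check the real FS_RDONLY, FS_WRONLY, FS_RDWR, and FS_CREAT definitions before relying on this approach.

```c
/* Hypothetical flag values, chosen only for this sketch. */
#define DEMO_RDONLY  0x000
#define DEMO_WRONLY  0x001
#define DEMO_RDWR    0x002
#define DEMO_CREAT   0x100
#define DEMO_ACCMODE 0x003 /* bits that hold the access mode */

/* Returns 1 if flags has one valid access mode plus optional CREAT, else 0. */
int demo_flags_valid(int flags)
{
    int access = flags & DEMO_ACCMODE;
    int other = flags & ~DEMO_ACCMODE;

    if (access != DEMO_RDONLY && access != DEMO_WRONLY && access != DEMO_RDWR) {
        return 0; /* e.g. WRONLY and RDWR bits both set */
    }
    /* Beyond the access mode, only the create flag is accepted. */
    return other == 0 || other == DEMO_CREAT;
}
```

A system call built this way would return ERR_INVAL whenever the check fails, before touching the file system.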
  2. File descriptor reading
/*
 * Corresponds to ssize_t read(int fd, void *buf, size_t count);
 * 
 * fd: file descriptor of a file
 * buf: buffer to write read bytes to
 * count: number of bytes to read
 * 
 * Read from a file descriptor. Reads up to count bytes from the current position
 * of the file descriptor fd and places those bytes into buf. The current position
 * of the file descriptor is updated by number of bytes read.
 * 
 * If there are insufficient available bytes to complete the request,
 * reads as many as possible before returning with that number of bytes. 
 * Fewer than count bytes can be read in various conditions:
 *   - If the current position + count is beyond the end of the file.
 *   - If this is a pipe or console device and fewer than count bytes are available.
 *   - If this is a pipe and the other end of the pipe has been closed.
 *
 * Return:
 * On success, the number of bytes read (non-negative). The file position is
 *   advanced by this number.
 * On failure:
 *   ERR_FAULT - Address of buf is invalid.
 *   ERR_INVAL - fd isn't a valid open file descriptor.
 */
sysret_t
sys_read(void *arg);
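
The short-read rule for regular files comes down to clamping count to the bytes remaining. Here is a minimal sketch using an in-memory buffer in place of a real file; struct demo_open_file and demo_read are hypothetical names, and in osv the actual transfer would go through fs_read_file instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* In-memory stand-in for an open file with a current position. */
struct demo_open_file {
    const char *data; /* file contents */
    size_t size;      /* total length in bytes */
    size_t pos;       /* current position */
};

/* Copies up to count bytes from the current position and advances it. */
size_t demo_read(struct demo_open_file *f, void *buf, size_t count)
{
    assert(f != NULL && buf != NULL && f->pos <= f->size);
    size_t avail = f->size - f->pos;
    size_t n = count < avail ? count : avail; /* clamp to what is left */
    memcpy(buf, f->data + f->pos, n);
    f->pos += n;
    return n;
}
```

Once the position reaches the end of the file, every further call in this sketch returns 0 and leaves the position unchanged.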
  3. Closing a file
/*
 * Corresponds to int close(int fd);
 * 
 * fd: file descriptor of a file
 * 
 * Close the given file descriptor.
 *
 * Return:
 * ERR_OK - File successfully closed.
 * ERR_INVAL - fd isn't a valid open file descriptor.
 */
sysret_t
sys_close(void *arg);
  4. File descriptor writing
/*
 * Corresponds to ssize_t write(int fd, const void *buf, size_t count);
 * 
 * fd: file descriptor of a file
 * buf: buffer of bytes to write to the given fd
 * count: number of bytes to write
 * 
 * Write to a file descriptor. Writes up to count bytes from buf to the current
 * position of the file descriptor. The current position of the file descriptor
 * is updated by that number of bytes.
 * 
 * If the full write cannot be completed (for example, if the disk runs out of
 * space), writes as many bytes as possible before returning with that number.
 *
 * Return:
 * On success, the number of bytes (non-negative) written. The file position is
 *   advanced by this number.
 * On failure:
 *   ERR_FAULT - Address of buf is invalid.
 *   ERR_INVAL - fd isn't a valid open file descriptor.
 *   ERR_END - fd refers to a pipe with no open read end.
 */
sysret_t
sys_write(void *arg);
  5. Reading a directory
/*
 * Corresponds to int readdir(int fd, struct dirent *dirent);
 * 
 * fd: file descriptor of a directory
 * dirent: struct dirent pointer
 * 
 * Populate the struct dirent pointer with the next entry in a directory. 
 * The current position of the file descriptor is updated to the next entry.
 * Only fds corresponding to directories are valid for readdir.
 *
 * Return:
 * ERR_OK - A directory entry is successfully read into dirent.
 * ERR_FAULT - Address of dirent is invalid.
 * ERR_INVAL - fd isn't a valid open file descriptor.
 * ERR_FTYPE - fd does not point to a directory.
 * ERR_NOMEM - Failed to allocate memory.
 * ERR_END - End of the directory is reached.
 */
sysret_t
sys_readdir(void *arg);
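
The position-per-entry behavior can be sketched with an array standing in for a directory. Everything here is hypothetical (the demo_ types, the fixed-size name field, and the numeric error values); it only illustrates that each call yields one entry and that the end of the directory is reported as an error value rather than an entry.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_ERR_OK 0     /* hypothetical return values */
#define DEMO_ERR_END (-1)

struct demo_dirent { char name[16]; };

/* Array-backed stand-in for an open directory. */
struct demo_dir {
    const struct demo_dirent *entries;
    size_t count; /* number of entries */
    size_t pos;   /* index of the next entry to return */
};

/* Copies the next entry into out and advances the position by one entry. */
int demo_readdir(struct demo_dir *d, struct demo_dirent *out)
{
    assert(d != NULL && out != NULL);
    if (d->pos >= d->count) {
        return DEMO_ERR_END; /* no entries left */
    }
    *out = d->entries[d->pos];
    d->pos++;
    return DEMO_ERR_OK;
}
```

A caller would typically loop until the end-of-directory value comes back.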
  6. Duplicating a file descriptor
/*
 * Corresponds to int dup(int fd);
 * 
 * fd: file descriptor of a file
 * 
 * Duplicate the file descriptor fd. The new file descriptor must be the smallest
 * unused file descriptor. Reading/writing from a dupped fd should advance the
 * file position of the original fd and vice versa.
 *
 * Return:
 * On success, non-negative file descriptor.
 * On failure:
 *   ERR_INVAL if fd is invalid.
 *   ERR_NOMEM if no available new file descriptor.
 */
sysret_t
sys_dup(void *arg);
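
One way to get the shared-position behavior is to have both descriptors point at the same file struct. The sketch below shows that design under loud assumptions: the demo_ names, table size, and error values are hypothetical, and the refcount increment only marks the spot where the hints say to call fs_reopen_file(); it does not claim to reproduce what that function does.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_FDS 4      /* hypothetical table size */
#define DEMO_ERR_INVAL (-1) /* hypothetical error values */
#define DEMO_ERR_NOMEM (-2)

/* Stand-in for an open file: a position plus a count of fds using it. */
struct demo_file { size_t pos; int refcount; };

struct demo_proc { struct demo_file *fds[DEMO_MAX_FDS]; };

/* Makes the lowest free fd refer to the same open file as fd. */
int demo_dup(struct demo_proc *p, int fd)
{
    assert(p != NULL);
    if (fd < 0 || fd >= DEMO_MAX_FDS || p->fds[fd] == NULL) {
        return DEMO_ERR_INVAL;
    }
    for (int newfd = 0; newfd < DEMO_MAX_FDS; newfd++) {
        if (p->fds[newfd] == NULL) {
            p->fds[newfd] = p->fds[fd]; /* same struct, so same position */
            p->fds[newfd]->refcount++;  /* where fs_reopen_file() would go */
            return newfd;
        }
    }
    return DEMO_ERR_NOMEM; /* table is full */
}
```

Since there is only one position field, advancing it through either descriptor is visible through the other.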
  7. Getting file status
/*
 * Corresponds to int fstat(int fd, struct stat *stat);
 * 
 * fd: file descriptor of a file
 * stat: struct stat pointer
 *
 * Get the file status in the struct stat pointer passed in to the function.
 * Console (stdin, stdout) and all console dupped fds are not valid fds for fstat. 
 * Only real files in the file system are valid for fstat.
 *
 * Return:
 * ERR_OK - File status is written in stat.
 * ERR_FAULT - Address of stat is invalid.
 * ERR_INVAL - fd isn't a valid open file descriptor or refers to a non-file.
 */
sysret_t
sys_fstat(void *arg);

Testing

After you implement each of the system calls described above, you can look through the user/lab1/* files and then run individual tests in the osv shell program by typing close-test, open-bad-args, etc.

To run all tests in lab 1, run python3 test.py 1 in the osv-w24 directory (from your normal shell, not within osv). This script relies on Python version >= 3.6. For each test passed, you should see a passed <testname> message. At the end of the test it will display a score for the test run.

What to turn in

You will submit your work for this assignment via Gradescope.

Gradescope lets you submit via GitHub, which is probably the easiest method. All you’ll need to do is to connect your GitHub account (the Gradescope submission page has a button for this) and select the repository and the branch you wish to submit. Alternatively, you can create a .zip file of the osv-w24 directory and upload that. The arch, include, and kernel directories from your submission will be used in testing.

When you submit, the autograder will compile your code and run the test cases.

Although you are allowed to submit your answers as many times as you like, you should not treat Gradescope as your only debugging tool. Many people submit their assignments near the deadline, and thus Gradescope will take longer to process the requests. You may not get feedback in a timely manner to help you debug problems.

Grading

This assignment will be graded out of 100 points, with 90 points coming from the tests shown in the table below. Comments explaining your approach can help earn partial credit if there are tests that don’t pass. The remaining 10 points are based on coding style, so make sure to submit clean, well-organized code.

Test             Points
close-test       10
dup-console      7
dup-read         7
fd-limit         5
fstat-test       7
open-bad-args    10
open-twice       10
read-bad-args    10
read-small       10
readdir-test     7
write-bad-args   7