Lab Assignment 0 - Hello osv

3 January, 2024

Deadlines

Due: Monday, January 8, at 4:00pm

Introduction

All of our assignments are based on osv, an experimental operating system kernel used for teaching the principles and practice of operating systems. The baseline code for osv is a complete, bootable operating system. It provides some simple system calls that make it capable of running a minimal shell program.

Goals

This assignment is about exploring the osv codebase and generally getting oriented. In future assignments, your task will be to complete osv and add some new functionality.

Collaboration policy

For this and most other assignments, you are encouraged to talk at a high level with your classmates. However, any low-level discussions (e.g., that involve actual code instead of ideas) should be with at most one other student in the class.

See our class collaboration policy for more information.

Getting `osv`

Linux environment

You will need an x86-64 Linux machine with gcc and qemu to work with osv; your work will be done in C. The CS department server mantis is already set up to work; otherwise, you can get your local computer working.

If you want to work on your local computer, and it’s Mac or Windows, you will need to set up a separate Linux environment.

On Windows 10/11, this is pretty straightforward due to Windows Subsystem for Linux. Follow this guide to set it up.

On Mac or earlier versions of Windows, you will need a Linux virtual machine (VM). Instructions are available for the official CS VM on the Carleton wiki. Note that it’s pretty large (11 GB to download, and ~30 GB once expanded, as it has lots of handy stuff pre-installed), so you may want to use the Xubuntu VM from osboxes.org instead; you’d want the free VirtualBox to run the VM.

If you need a refresher on navigating a Linux command line, see this tutorial or the Unix resources on the course web page.

Instead of developing the operating system on a real, physical personal computer (PC), we will use a program called an emulator that (mostly) faithfully emulates a complete PC: the code you write for the emulator will boot on a real PC, too. Using an emulator simplifies debugging; you can, for example, set breakpoints inside of the emulated x86-64 architecture, which is difficult to do with the silicon version of an x86-64 architecture.

In osv, we will use qemu, a modern and fast emulator. mantis already has it installed; if you are working on your local machine, you will likely need to install qemu yourself with the following command (recall that $ is the prompt; don’t type it):

$ sudo apt install qemu qemu-system-x86

or the equivalent if you’re using a non-Debian/Ubuntu Linux distribution (a.k.a. distro).

Cloning the `git` repository

To acquire the osv code that will be the basis for the assignments in this course, you will need to use git. I have set up the repository (i.e., a collection of files with versions managed by git) on the CS mantis server, so you will clone the repo from there to get the code. Run the command below to do so, replacing YOUR_CARLETON_USERNAME with your own username (the part of your Carleton email before the @). If prompted about the authenticity of mantis, enter yes to continue. Type in your Carleton password when prompted. (Note that nothing may appear on screen as you type; just be careful and hit Return/Enter when you’re done.)

$ git clone ssh://YOUR_CARLETON_USERNAME@mantis.mathcs.carleton.edu:/web-pages/www.cs.carleton.edu/faculty/tamert/courses/cs332-w24/osv-w24

This will create an osv-w24 directory wherever you ran that command.

To learn more about git, take a look at the git section of the course resources page.

Running `osv`

Run make in the osv-w24 directory to build the osv kernel. Make sure the build successfully completes before continuing.

Here is what my successful build looks like (it’s long, so there is more where the ... is):

...
ild/user/lab4/sbrk-small  build/user/lab4/malloc-test  build/user/lab4/sbrk-lar
ge  build/user/lab4/bad-mem-access  build/user/cat  build/user/lab5/cow-small  
build/user/lab5/cow-large  build/user/lab5/cow-multiple  build/user/lab5/cow-lo
w-mem  build/user/lab2/race-test  build/user/lab2/fork-fd  build/user/lab2/wait
-twice  build/user/lab2/fork-tree  build/user/lab2/exit-test  build/user/lab2/f
ork-test  build/user/lab2/wait-bad-args  build/user/ls  build/user/sh  build/us
er/lab1/read-bad-args  build/user/lab1/readdir-test  build/user/lab1/open-twice
build/user/lab1/dup-read  build/user/lab1/read-small  build/user/lab1/fstat-t
est  build/user/lab1/fd-limit  build/user/lab1/write-bad-args  build/user/lab1/
open-bad-args  build/user/lab1/close-test  build/user/lab1/dup-console
tamert@mantis:~/cs332/osv-w24$

Now you’re ready to run qemu. You’ll need to supply the files build/fs.img and build/osv.img, created by the make process, as the contents of the emulated PC’s virtual hard disk. You don’t have to worry too much about this yet, but just so you know, those hard disk images contain:

our boot loader build/arch/x86_64/boot,
our kernel build/kernel/kernel.elf, and
a list of user applications in build/user.

Fortunately, make can take care of this for us, too (read the Makefile to learn more). Run make qemu to run qemu with the options required to set the hard disk and direct serial port output to the terminal. Some text should appear in the terminal, like this:

E820: physical memory map [mem 0x126000-0x1FFE0000]
 [0x0 - 0x9FC00] usable
 [0x9FC00 - 0xA0000] reserved
 [0xF0000 - 0x100000] reserved
 [0x100000 - 0x1FFE0000] usable
 [0x1FFE0000 - 0x20000000] reserved

cpu 0 is up and scheduling
cpu 1 is up and scheduling
OSV initialization...Done

$

Press Ctrl-a x to exit qemu.

If you see warnings such as "TCG doesn't support requested feature: CPUID.01H:ECX.vmx [bit 5]", you can safely ignore them.

Organization of source code

When you clone the repo, it should have the following structure:

osv
├── arch(x86_64)      // all architecture-dependent code for different architectures
│   └── boot          //   bootloader code
│   └── include(arch) //   architecture-dependent header files (contain architecture specific macros)
│   └── kernel        //   architecture-dependent kernel code
│   └── user          //   architecture-dependent user code (user space syscall invocation code)
│   └── Rules.mk      //   architecture-dependent makefile macros
├── include           // all architecture-independent header files
│   └── kernel        //   header files for kernel code
│   └── lib           //   header files for library code, used by both kernel and user code
├── kernel            // all kernel source code
│   └── drivers       //   driver code
│   └── mm            //   memory-management-related code, both physical and virtual memory
│   └── fs            //   file system code, generic filesys interface (VFS)
│       └── sfs       //     a simple file system implementation implementing VFS
│   └── Rules.mk      //   kernel-specific makefile macros
├── user              // all source code for user applications
│   └── lab*          //   tests for assignment* (e.g., lab0 has tests for assignment0)
│   └── Rules.mk      //   user applications makefile macros
├── lib               // all source code for library code
│   └── Rules.mk      //   library makefile macros
├── tools             // utility program for building osv
└── Makefile          // Makefile for building osv kernel

After compilation (running make), a new folder build will appear, which contains the kernel and fs image and all of the intermediate binaries (.d, .o, .asm).

Part 1: Debugging `osv`

The purpose of this first exercise is to get you started with qemu and qemu+gdb debugging.

A note on x86-64 assembly

The definitive reference for x86-64 assembly language programming with Intel’s instruction set architecture is the Intel 64 and IA-32 Architectures Software Developer’s Manuals. They cover all of the features of the most recent processors, including many that we won’t need in class but that you may be interested in learning about anyway.

An equivalent (and perhaps friendlier) set of manuals is AMD64 Architecture Programmer’s Manual. Save the Intel/AMD architecture manuals for later or use them for reference when you want to look up the definitive explanation of a particular processor feature or instruction.

You don’t have to read them now, but you may want to refer to some of this material when reading and writing x86-64 assembly.

`gdb`

You can use gdb as a remote debugger for osv, just as you did in CS 208 for various assignments. We have provided a gdbinit file for you to use as your ~/.gdbinit file (look up “Linux dotfiles” during a weekend internet rabbithole sometime if you’re curious). This file tells gdb to connect to the port that qemu listens on when you run make qemu-gdb, and also loads the kernel symbols. You can create your ~/.gdbinit using the following command:

$ cp arch/x86_64/gdbinit ~/.gdbinit

To attach gdb to osv, you need to open two separate terminals. Both of them should be in the osv-w24 root directory. In one terminal, type make qemu-gdb. This starts the qemu process and waits for gdb to attach. In another terminal, type gdb. Now the gdb process is attached to qemu. (As another rabbithole, you can look up tmux as a way to easily have two terminals available in one window.)

In osv, when the bootloader loads the kernel from disk to memory, the CPU operates in 32-bit mode. The starting point of the 32-bit kernel is in arch/x86_64/kernel/entry.S. The file entry.S sets up 64-bit virtual memory (more on that later this term) and enables 64-bit mode. You don’t need to understand entry.S; it jumps to the main function in kernel/main.c, which is the starting point of our 64-bit OS.

Question #1

After attaching gdb to the qemu instance, set a breakpoint at the entrance of osv by typing in b main. You should see a breakpoint set at main in kernel/main.c. Then type c to continue execution; osv will go through booting and stop at main.

Which line of code in main prints the physical memory table? (Hint: use the n command to have gdb execute one line of C code at a time.)

Note: As part of your answer to Question #1 in your submission, make sure you explain how you determined the answer. What did you look for and/or find in gdb and the source code that led you to your answer?

Question #2

We can examine memory using gdb’s x command. The gdb manual has full details, but for now, it is enough to know that the command x/nx ADDR prints n 32-bit “words” of memory starting at address ADDR. (Note that both xs in the command are lowercase; the second x tells gdb to display the memory contents in hexadecimal.)

To examine instructions in memory (besides the immediate next one to be executed, which gdb prints automatically), you can use the x/i command. This command has the syntax x/ni ADDR, where n is the number of consecutive instructions to disassemble and ADDR is the memory address at which to start disassembling.

Repeat the previous process to break at main. What is the memory address of main (hint: use p main)? Does gdb work with real physical addresses? Explain your answer.

Question #3

You may have noticed that when gdb hit your breakpoint, the message specified a thread:

Thread 1 hit Breakpoint 1, main () at kernel/main.c:34

When osv boots up, it has multiple threads, as you can see by running info threads within gdb. We’ll talk a lot more about threads in the coming weeks. For now, just know that the basic idea is that each thread is an independent unit of execution, which enables simultaneous computation.

For this assignment, I just want you to explore gdb’s capabilities when it comes to multi-threaded programs. At the start of main, the output of the info threads command indicates that one thread is halted.

Explore and answer the following questions, again with explanations of how you got to your answers:

What function starts the thread running?
What main function does the second thread start in?
What happens if you restart qemu/gdb and set a breakpoint for that function?

Submission

For this assignment, you should submit a file a0.txt with your answers to the questions listed above to the CS 332 Gradescope.

Evaluation

This assignment will be graded out of 50 points, as shown in the table below. Comments explaining your approach are always expected, and can help earn partial credit if you don’t fully complete a part of the assignment.

Grade breakdown:

Component     Points
--------------------------
Submission    20 points
Question #1   10 points
Question #2   10 points
Question #3   10 points