Final exam

Due 5:00PM Monday, 20 November 2023

Submit via Moodle as final.tar. Include your answers in final.pdf or final.txt, with source code and/or data (if any) in appropriately named files.

This is a take-home exam. You may use your textbook and online sources, as long as you cite them clearly. You may not speak with anyone other than Jeff Ondich about the content of this exam. In particular, you may not talk with your classmates or post questions on online forums. You may, however, ask questions during class and via the #questions channel of our class Slack workspace (and, of course, via Slack direct message to Jeff).

If you draw a diagram for any of these problems, you may of course use drawing software. But if it's easier, feel free to draw your diagrams by hand and include images in your PDF submission. Regardless of how you present your diagrams, please make sure they are easy to read.

When answering questions on this exam, you should show your work to get full credit. For example, in the very first question 1(a), it's not sufficient to provide the correct 8-digit hexadecimal number. You must also explain (concisely and clearly) why that particular 32-bit number represents the given real number.

Floating point numbers

Read Section 2.4 ("Floating Point") of your textbook. You might also find this section of Dive Into Systems or part of the Wikipedia page on floating point helpful. All of the following questions refer to the IEEE 754 single precision (32-bit) representation of real numbers.
1. [2] What is the representation of the real number -19.375? Give your answer as a single 8-digit hexadecimal number.
2. [2] To what real number does the 32-bit IEEE 754 representation 0x408aaaab correspond? Please express your answer as a fraction. (Note: since the significand appears to have a repeating pattern, you may find that extending that pattern infinitely gives you a much simpler fraction than the rounded-off version stored in this 32-bit float.)
3. [1] Give a 32-bit representation of NaN (i.e., "not a number"). Show your answer as an 8-digit hexadecimal number.
4. [1] Show a small amount of C code that can be used to cause a float-type variable x to contain NaN. (You can test this using a printf("%f", x); statement.)
Bytes in context

Suppose we have a UTF-8-encoded text file named letter.txt containing my latest missive to my friend Rob, and that file starts like so:

Rob! 😀 I can't wait for you to get home from Crete...

This really is just a text file (without a byte order mark), so the very first byte of the file is 0x52 (that is, a capital R). (The emoji in the first line is named "GRINNING FACE" and has Unicode codepoint U+1F600.)

For each of the following questions, you may assume that we are going to compile with gcc and execute this C code on mantis.mathcs.carleton.edu:

    int file_descriptor = open("letter.txt", O_RDONLY);
    if (file_descriptor < 0) {
        perror("Trouble opening file");
        exit(1);
    }

    // Here, do whatever the question asks, using the read() function.
    // ...

    close(file_descriptor);

That is, for each question below, you open the file, you do the thing the question requires you to do, and then you close the file. Let's get started.
1. [2] Suppose you read the first sizeof(int) bytes of the file into an int variable and then print that variable as a decimal integer. What gets printed, and why?
2. [1] Same question, but for sizeof(long) bytes into a long variable.
3. [1] Same question, but for sizeof(float) bytes into a float variable (and print as a real number, not as an integer).
4. [2] Suppose you read the first line of text, not including the newline character, from letter.txt into a character buffer char buffer[20]. What bytes are found in buffer[0],...,buffer[7], inclusive, and why?
5. [2] Suppose you read the first line of text, not including the newline character, from letter.txt into a character buffer char buffer[20]. Suppose you then manage to get the program to jump (or retq, or whatever) to the address of buffer[0] and start executing code there. What is the first instruction it will try to execute? How did you figure that out?
Educating Jeff. [2] Is there a book, a movie, a podcast, a website, a game, a writer, a musician, etc. etc. you think I should know about? Let me know about it!
Exploring an executable

Consider the executable program problem4. If you run this program on mantis like so:

./problem4

it will tell you only that the expected command-line syntax is:

./problem4 textfile

For the questions below, you will investigate what this program does, how it does it, and how it might be vulnerable to a buffer overflow attack.
1. [1] If you view the first few bytes of problem4, you'll see that three of those bytes spell "ELF". What is ELF, and why does it show up in this executable program?
2. [1] Provide a list of functions contained in problem4. (Don't include obvious system functions like read or exit.) How did you obtain this list?
3. [1] What does the function get_count_for_file do? How do you know?
4. [1] What does the function get_count_for_line do? How do you know?
5. [1] In the assembly code (try using gdb), there's a function call callq __ctype_b_loc@plt. What does this function do?
6. [1] What does problem4 do? That is, what simple computational problem is it trying to solve? How do you know?
7. [7] This program is vulnerable to a buffer overflow. By exploiting this vulnerability, it is possible, for example, to specify a carefully crafted input file for which problem4 runs to completion without crashing but prints the wrong answer.
  
  Show me an input file for which the correct answer should be 0, but for which problem4 prints an answer of 9. Include a detailed explanation of why your input file produces this result.
  
  Here are a few questions to consider as you seek to exploit this buffer overflow vulnerability. These are just to help you frame your thinking; you don't need to answer them in your writeup unless you think those answers are important for a clear explanation of what's going on.
  - In get_count_for_file, there's an int variable that holds the file descriptor for the opened input file. Where in the stack is that variable stored?
  - In get_count_for_file, there's an int variable that holds the running count of the thing this function is intended to count. Where in the stack is that variable stored?
  - In get_count_for_file, there's a char buffer that holds each line read from the file. Where in the stack is that buffer?
  - Does read_line null-terminate the line it reads? Keep this in mind when you're thinking about exactly how many bytes to overflow the buffer. You don't want to overflow so far that you corrupt stuff that needs to remain uncorrupted.
  - Are you going to be able to pull this off with an entirely ASCII text file? (Hint: no.) Do you have any tools that will help you construct a file where you can control exactly what bytes are in the file? (Hint: yes.)
  One final note. If you don't succeed in getting the program to print the desired wrong answer, you can get partial credit by explaining in detail what you were trying to do and what you have learned about the structure of get_count_for_file's stack frame.
Have a great break! Thanks for being a wonderful class. It was fun working with you.