Assembly to C
Package: asm-to-c-package.tar
Upload via Moodle as: asm-to-c.tar
I encourage you to collaborate on this one! It's much more fun to talk to each other and to have friends to share your befuddlement.
Goals
- Get familiar with x86_64 assembly language basics
- Practice recognizing how C code constructs get translated into equivalent assembly language
Rubric
Background
What does a compiler do?
That question has a long, complicated answer. But in brief, our compiler (gcc) takes C sources as input, and produces an executable program as output. The executable program contains, among other things, machine language instructions whose behavior implements the computations articulated in the original C code.
Machine language is just bits, and is thus hard to read. So if we want to understand the relationship between C code and the machine language generated from it, we're better off asking gcc to output assembly language code instead. Assembly isn't particularly easy to read either, but it's a lot easier than machine language. As a general rule, each assembly language instruction corresponds to exactly one machine language instruction, and vice versa. There are some exceptions (e.g., sometimes one assembly language instruction is an alias for a sequence of two or three machine language instructions), but as a rough guide, you can think of assembly and machine language instructions as being in one-to-one correspondence. As a result, by understanding the assembly language generated by gcc, we will be very close to understanding the machine language as well.
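To make the C-to-assembly correspondence concrete, here is a minimal sketch. The function name `add` is made up for illustration, and the assembly in the comment assumes gcc targeting x86_64 Linux with light optimization and Intel-syntax output. Your compiler version and flags may produce something different, so treat the listing as representative rather than exact.

```c
/* A tiny function whose whole job is to add its two parameters. */
long add(long a, long b)
{
    return a + b;
}

/*
 * Assumed output (gcc, x86_64 Linux, light optimization, Intel syntax).
 * The first two integer parameters arrive in rdi and rsi, and the
 * return value is placed in rax:
 *
 *   add:
 *       lea     rax, [rdi+rsi]
 *       ret
 */
```

Notice that the single C statement became a single `lea` instruction followed by `ret`. With optimization turned off, the same function usually compiles to a longer sequence that copies the parameters to the stack first.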
For this assignment, you are going to practice understanding the correspondence between simple C code and its equivalent assembly language by solving a sequence of puzzles. For each puzzle, you will read some given assembly language and try to come up with the original C code that generated it. This is a simple form of reverse engineering, and it's pretty fun.
Though we could use gcc on mantis, for this assignment we're instead going to use an extremely handy tool called the Compiler Explorer. You'll put some C code into the input panel, and the output panel will show you the assembly language generated by the selected compiler. As you adjust your C code, you'll be able to watch the changes in the assembly language, and then compare your assembly code to the puzzle's code.
What should you do?
In the asm-to-c-package.tar package, you will find several files named puzzle0.asm, puzzle1.asm, etc. For each puzzle, your job will go like this:
- Study the puzzleN.asm code to understand what it does. You should try to understand it holistically rather than just line-by-line, so you'll be able to describe the code's purpose in a single short sentence.
- Write an equivalent C function (or sometimes two functions). Then use the Compiler Explorer to compile it to assembly, and see how closely your assembly matches the contents of puzzleN.asm. Refine your code until you feel the match-up is close enough. (Exact matches are great, of course, but close matches might also be correct. Very slight changes in source code can change the assembly code, even if the code's end result is unchanged.)
- Write a one-sentence description of the purpose of the code in puzzleN.asm. For example, you can imagine a description like "this function takes one positive integer parameter n and returns the nth prime number".
- Put your code in a file named puzzleN.c, and put your name(s) and one-sentence description in a comment at the top of the source code.
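As a rough illustration of what a finished puzzleN.c might look like, here is a sketch built around the example description above. The author names are placeholders, the function names `is_prime` and `nth_prime` are invented for this example, and it is not the solution to any actual puzzle.

```c
/*
 * puzzleN.c
 * Authors: Your Name, Partner Name   (placeholders)
 * Description: This function takes one positive integer parameter n
 * and returns the nth prime number.
 */

/* Helper: returns 1 if candidate is prime, 0 otherwise. */
static int is_prime(int candidate)
{
    if (candidate < 2) {
        return 0;
    }
    /* Trial division by every d with d*d <= candidate. */
    for (int d = 2; d * d <= candidate; d++) {
        if (candidate % d == 0) {
            return 0;
        }
    }
    return 1;
}

/* Returns the nth prime, counting from nth_prime(1) == 2. */
int nth_prime(int n)
{
    int count = 0;      /* primes found so far */
    int candidate = 1;  /* last number examined */
    while (count < n) {
        candidate++;
        if (is_prime(candidate)) {
            count++;
        }
    }
    return candidate;
}
```

This example uses two functions, which mirrors the "sometimes two functions" case: the assembly for such a puzzle would contain two labeled routines, one calling the other.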
Hand it all in by combining all your puzzleN.c files into asm-to-c.tar.
Compiler Explorer settings
Set it up like this: