Assembly to C
Package: asm-and-c-package.tar
Upload via Moodle as: asm-and-c.tar
NOTE: this assignment has two due dates.
- For Tuesday, April 28, submit work for puzzle1.asm, puzzle2.asm, and puzzle3.asm, including one reverse-engineered puzzle.
- For Thursday, April 30, submit work for puzzle4.asm, puzzle5.asm, and puzzle6.asm, including one reverse-engineered puzzle.
You may work alone or with one other person for this assignment. I encourage you to collaborate on this one; it's much more fun to talk to each other and to have friends to share your befuddlement.
Goals
- Get familiar with x86_64 assembly language basics
- Identify assembly language patterns representing common structures from higher-level languages
- Trace possible execution flows through assembly code
Rubric
Assignment overview
For this assignment, you are going to practice understanding the correspondence between simple C code and its equivalent assembly language by studying a sequence of puzzles.
For each puzzle, you will read some assembly language and try to identify the assembly structures that correspond to C-language structures like loops, if/else statements, function calls, etc. For two of the puzzles, you will also try to write C code that compiles to the puzzle's assembly code. Overall, for this assignment, you will be doing a simple form of reverse engineering.
To help us study assembly code, we will use Compiler Explorer, as described briefly in this getting-started document. You'll put some C code into the input panel, and the output panel will show you the assembly language generated by the selected compiler. As you adjust your C code, you'll be able to watch the changes in the assembly language, and then compare your assembly code to the puzzle's code.
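As a warm-up, here is a small function you could paste into Compiler Explorer's input panel. It is a hypothetical example, not one of the puzzles; the name count_negatives and its parameters are made up for illustration. The comments describe the general shapes to look for, but the exact instructions and labels you see will depend on the compiler and flags you select.

```c
/* Counts how many of the first `count` elements of `values` are negative. */
int count_negatives(const int *values, int count)
{
    int negatives = 0;

    /* Loop: in the assembly, look for a label and a jump back to it,
     * guarded by a compare of the loop counter against `count`. */
    for (int i = 0; i < count; i++) {
        /* Conditional: look for a compare followed by a conditional jump
         * that skips the increment. With optimization turned on, the
         * compiler may use a branch-free sequence instead. */
        if (values[i] < 0) {
            negatives++;
        }
    }

    return negatives;
}
```

Try small changes, such as turning the for loop into a while loop or adding an else branch, and watch which parts of the output panel change.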
What you should do
In the asm-and-c-package.tar package, you will find several files named puzzle0.asm, puzzle1.asm, etc. For each puzzle, your job will go like this:
- Study puzzleN.asm to understand what it does. Although it's good to try to understand it line-by-line, you will also want to understand the puzzle holistically. Each puzzle's code performs a computationally familiar task, and can be described in a single short sentence.
- Fill in the comment at the top of the puzzleN.asm file, indicating which structures are present in the puzzle:
  - conditional branching (if, if/else)
  - loop (for, while, do-while)
  - nested loop
  - function call
  - recursive function call
- For each "# Next: TODO" marked in the inline comments, provide the label(s) of the instruction(s) that could possibly be executed immediately after the instruction on the TODO line.
- Fill in the "Possible order" comments at the top of the assembly file, giving three possible orders in which the labels may be executed during a single invocation of the function function1.
- Fill in the "Registers and sizes" comment at the top of the assembly file, listing the registers used for the parameters passed to function1, and explicitly indicating the sizes of the registers. For example, if the only parameter to function1 is found in %rdi and it is always referenced as %edi, you’d write something like "%edi (4 bytes)". If there are two parameters, with the first being an 8-byte parameter and the second being a 2-byte parameter, you’d write "%rdi (8 bytes)" and "%si (2 bytes)".
- For your choice of two puzzles (one from 1-3 and another from 4-6), write a C file puzzleN.c containing a function that, when entered into Compiler Explorer, produces the assembly language code found in puzzleN.asm. At the top of puzzleN.c, include a comment containing a one-sentence description of the purpose of function1.
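To make the last two steps concrete, here is a sketch of what a puzzleN.c file might look like. It is a hypothetical example and not the solution to any puzzle; the parameter names and types are assumptions chosen to match the two-parameter case described above. The register notes assume the System V x86_64 calling convention used on Linux, where the first two integer or pointer arguments arrive in %rdi and %rsi.

```c
/*
 * function1 returns the sum of the first `count` elements of `values`.
 * (Hypothetical example; not one of the assignment's puzzles.)
 */

/*
 * Registers and sizes for this signature, assuming the System V x86_64
 * calling convention:
 *   values: a pointer, 8 bytes, passed in %rdi   -> "%rdi (8 bytes)"
 *   count:  a short, 2 bytes, passed in the low
 *           16 bits of %rsi, which is named %si  -> "%si (2 bytes)"
 * The exact register names that appear in the generated code can vary
 * with the compiler and optimization level.
 */
long function1(const long *values, short count)
{
    long total = 0;

    for (short i = 0; i < count; i++) {
        total += values[i];
    }

    return total;
}
```

Changing a parameter's C type (for example, short to int) is a quick way to see how the register name used in the assembly changes with the parameter's size.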
NOTE: when you are writing your two puzzleN.c files, you may find that
some of the labels in the puzzleN.asm will not
show up in the assembly generated by Compiler Explorer. For example, puzzle4.asm
has labels CA, CB, CC, and CD, all
of which were added by the CS faculty after compiling was finished. Those extra labels
are there to help you trace through the code. Similarly, if you write code in Compiler Explorer
and it gives you the same code as one of the puzzles but the L* labels have different
numbers in different places than the puzzle, that's fine--the numbers don’t
have to exactly match, but they should be in the right places.
Hand it in
For each of the two deadlines, combine your edited puzzleN.asm files and your
puzzleN.c file in a tar file named asm-and-c.tar and submit it
via Moodle.
One last note...
This is weird stuff at first. We'll spend a lot of time in class giving you the tools you need to complete this assignment. Be patient and persistent. The pieces will fall into place before long.