The compiler, like the parser, has seen some major changes. The largest change comes by way of the use of the C runtime and the pseudo-assembly calls now emitted.
To follow the code in the compiler I fear we have to delve back into assembly, the assembly-like runtime instructions, and theory.
To understand how the code from the compiler works I need to cover a little more assembly.
Consider adding the numbers two and three. On a piece of paper, you can just write “2+3=5” and be done with it. You might be tempted, therefore, to transpose that into C or whatever intermediary language you’re using.
It’s not always that easy. For one, you might not get the entire equation at once. For another, how do you handle variables? Or variables which point to another expression?
In assembly, and the C runtime, we’ll use the registers to hold our values prior to doing the actual arithmetic.
The result of the addition is stored in the eax register so that subsequent operands can easily be added. To extend the previous example, consider 2 + 3 + 4 + 5:
setl(2, eax); /* eax is 2 */
addl(edx, eax); /* eax is now 5 */
addl(edx, eax); /* eax is now 9 */
addl(edx, eax); /* eax is now 14 */
Notice a pattern? After the first operand is assigned to the eax register, all subsequent operands are only ever assigned to the edx register since the eax register always holds the result of the last instruction. We can use this pattern to our advantage when compiling our code.
Nesting Arithmetic Operations
What about nested expressions like this: 1 + (5 - 2)?
If we use the above pattern, we have a problem because each nested expression starts the process anew. Since the first operand is always assigned to the eax register, the value of 1 stored in it would be overwritten by the value 5.
/* now the inner expression, (5 - 2) */
setl(5, eax); /* uh oh */
subl(edx, eax); /* eax is now 3 */
/* now what? The 1 is lost! */
addl(edx, eax); /* result is 5, which is clearly wrong */
Remember, we’re not writing this code by hand. If we were, we could be clever in our use of the registers.
Maybe we could ensure we run the nested expression first? It’s still not that simple. What if you have two nested expressions at the same level. Since we’re not writing this by hand we need an algorithm or some other technique to solve our problem.
Enter the push and pop instructions. We can temporarily store intermediate values on the stack.
pushl(eax); /* store the value in eax on the stack */
subl(edx, eax); /* eax is now 3 */
movl(eax, edx); /* move the value of 3 stored in eax to edx */
popl(eax); /* store the last value pushed (1) in eax */
addl(edx, eax); /* eax is now 4 */
Note that the value of 1 was originally stored in the eax register before it was pushed. In order to preserve the order of the values (1 was the first operand in the addition expression) we have to first move the value of eax to edx and then pop the stored value back into eax after we’re done handling the expression of higher precedence. This is to preserve the order of the operands.
This is pretty hard, I know.
Another thing I’d like to bring to your attention is more of a curiosity than anything. If you recall in Part 9 I said that the stack was more elastic than a standard stack data structure. Push and pop both move the stack pointer, increasing and decreasing the size of the active stack frame. The frame size is only frozen during a further function calls or the frame exits.
While you should probably always pop any information you push, strictly speaking it doesn't actually matter in the end. The location of the stack pointer doesn't matter when the function goes to exit because the leave instruction automatically moves it to the location of the base pointer. Anything you pushed, but haven’t popped, is lost and becomes unreachable.
Writing By Hand and Learning by Example
I spent a lot of time examining the assembly output from GCC and writing out assembly by hand for the expressions I’d created.
I have pages and pages of notes where I worked out the order of instructions to make sure I got the correct end result. I computed everything, step by step, by hand. I wanted to ensure its correctness before ever laying down a piece of code.
With GCC you can use the -S option to output assembly. Ensure that most optimizations are turned off using the -O0 option. You may have to use variables to view how equations are calculated because even with optimizations turned off, simple arithmetic is still optimized out.
When entering a function it may have parameters and local variables that are declared. Let’s look at a simple function to add two numbers:
(decl add (a b int) int (+ a b))
Let’s assume that both the stack pointer (ESP register) and the base pointer (EBP register) both point at the bottom of the stack. Call it index 0 (zero).
The declared function ‘add’ takes two integer arguments and has no local variables. That means we need at least 8 bytes of stack space. To reserve that space, we need to call the enter instruction.
Eight bytes are now reserved for the add function.
If you recall from Part 10, when enter is called the current location of the base pointer is stored on the stack and its position is moved up by four bytes. The ebp register is now at index 4 and the esp register is still at index 0. The stack pointer is then moved to the base pointer and moved a further 8 bytes up the stack, making its resting place index 12.
Hopefully, an illustration will help:
Step 1: original positions of stack and base pointers
0 <- base pointer, stack pointer
Step 2: store address (index) of base pointer
0 <- base pointer, stack pointer
Step 3: move the base pointer up 4 bytes
0 <- stack pointer
4 <- base pointer
Step 4: move stack pointer to base pointer
4 <- base pointer, stack pointer
Step 5: increase stack pointer by 8 bytes, as requested via call to enter()
0 <- original base pointer value; the return value, is stored here
4 <- base pointer
12 <- stack pointer
We have now established the active frame. It consists of the addresses between the base pointer and the stack pointer, bytes four through eleven (a total of 8 bytes).
The data for the parameters ‘a’ and ‘b’ have already been stored at the bottom of the frame by some code you haven’t seen, yet. Just assume that it’s true. Since integers are four bytes we can use this knowledge to retrieve these values.
By moving the base pointer by a relative amount we can retrieve these stored values. The first four bytes are located in the bytes between ebp+0 and ebp+3. That is the variable ‘a’. The second four bytes are located between ebp+4 and ebp+7. That’s variable ‘b’.
4 <- base pointer, bottom of the active frame & location of ‘a’ variable (ebp+0)
8 <- location of ‘b’ variable (ebp+4)
12 <- stack pointer, top of the active frame
Knowing that, we can use the knowledge we have for doing arithmetic to compute the answer!
Finally, we call the leave instruction to return the base and stack pointers back to their previous locations. In our contrived example, they’d both be returned to index zero, the bottom of the stack.
Note that we don’t have to use ebp+0. We could just use the ebp pointer. The +0 is an artefact of the code used to generate the C output, as you will see later. I used the same code here as the compiler would emit for continuity.
The whole function will look like this:
That’s all there is to it!
No Return Value?
Actually, there is. The eax register holds the return value of the function. Since that was set when we did the addition there’s no more work to be done!
Keeping the above in mind we can now look at storing parameters when calling functions. The function call itself is nothing special, it looks just like a standard C function call. That said, storing the parameters is pretty interesting.
Before calling a function, we need to store any parameters we need to pass to it. Well, we know where those parameters will be stored: at the bottom of the stack frame. However, at this point, that frame hasn't been created yet; so what gives?
Based on the knowledge we already have we actually know where to put them. Remember that the stack pointer is already at the top of the current stack frame? Well, that’s our ticket to fame!
When the enter instruction is given, we know that four bytes are used to store the current location of the base pointer. So, by utilizing the stack pointer and adding four to it, we’ll be at the location of where the base pointer will be in the forthcoming frame for the function call!
(add 2 3)
setl(2, esp+4); /* ebp+0 in the next frame */
setl(3, esp+8); /* ebp+4 in the next frame */
/* call the add function */
Clear as mud? How about an illustration to compare with the ones given above.
0 <- stack pointer (esp)
4 <- location of ‘a’ parameter (esp+4/ebp+0)
8 <- location of ‘b’ parameter (esp+8/ebp+4)
Bare in mind that it is vitally important that when you are storing parameters that you do nothing else prior to calling the function!
EAX and EDX
There are four, standard, general purpose registers that all x86 architectures have as discussed in the section on assembly.
We only really need two of those registers for the vast majority of our code. The question you might be wonder is, why only eax and edx and not the others?
I'm not aware of any specific reason other than a historical one. I’d be happy to be proved wrong! It may be that the eax register has been optimized in some way to make accumulating data faster or easier. Of that knowledge I am, unfortunately, woefully ignorant.
Regardless of that, it just makes sense to follow suit to the intended purpose of the registers. If anything, it makes looking at the generated code much easier for anyone else familiar with assembly. They'll expect those registers to be used.
I can tell you that in assembly, eax always stores the return value of the function. In C, main must always return a value. This value is passed to the OS when the program exits and that value must be stored in the eax register.
So other than consistency and familiarity I have no other good reasons for you. There are no rules that I know of that specify you MUST use those registers. What I will caution you on is that you would be breaking an established norm which may or may not matter.
On to Compiling!
You’ll finally get to see this code in action in the next section.
If you've gotten this far the next, and final, stage shouldn't be too hard to follow.