Saturday, 14 March 2015

Objects In Memory

Until recently, I can't say I had ever given much thought to how objects are represented in memory. I don't write performance-critical applications where a few milliseconds of extra speed might matter. I also don't make programs for hardware with limited resources.

In short, I didn't feel like I needed to know.

That is, until I started learning about compilers and assembly. Suddenly, how data was represented in memory mattered and I began having to ask a lot of questions.

Questions like: What does a struct or array look like in memory? How are their elements accessed? Is extra space needed for the struct itself?

There are even more questions to ask about word boundaries, cache hits, and paging, but for the sake of brevity, we'll keep to just those posed.

Because Science!

OK, not really.

C is pretty close to the hardware, and its x86 assembly is what I understand best, so I wrote an example in C and peeked at the assembly output.

struct { int a, b; } s = {1, 2};

A simple sizeof call revealed that the above struct was 8 bytes. Each integer takes up 4 bytes of space so this is a reasonable result.
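If you want to check this yourself, a minimal program will do (the 8-byte result assumes 4-byte ints, which is typical but not guaranteed by the C standard):

#include <stdio.h>

int main(void) {
    struct { int a, b; } s = {1, 2};
    /* Two 4-byte ints laid out side by side: 8 bytes total. */
    printf("%zu\n", sizeof s);  /* prints 8 on most platforms */
    return 0;
}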

Next, I inspected the assembly generated by gcc. The values 1 and 2 were moved onto the stack, side by side. We would call this segment of memory contiguous.

Finally, I wrote a third test that passed the struct as an argument to a function. Again, only the two values, represented by a and b, were copied onto the stack.
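For the curious, the third test looked something like this (the names are my own; compile with gcc -S to see the copies for yourself, though the exact stack slots or registers used depend on your calling convention):

#include <stdio.h>

struct pair { int a, b; };

/* The struct is passed by value: both fields are copied
   into the function's argument area. */
void take(struct pair p) {
    printf("%d %d\n", p.a, p.b);
}

int main(void) {
    struct pair s = {1, 2};
    take(s);
    return 0;
}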

Other than how they are arranged in memory, the fields of a struct are treated like any other variable.

Conclusions, Struct Test

Based on my tests, I was able to draw the following conclusions:

  • The space needed for a C struct is the sum of its fields (see note below);
  • The memory for a C struct is kept in a contiguous block of memory for easy access.

Note: In truth, data types might need to be aligned and can have extra padding added, making a struct an unexpected size depending on its layout. If you're interested in learning more, this link was recommended to me.
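To see that padding in action, mix field sizes (the result is compiler- and platform-dependent; 8 is typical with gcc):

#include <stdio.h>

int main(void) {
    struct { char c; int i; } s;
    /* 1 byte for c + 4 bytes for i = 5, but the compiler
       usually pads after c so that i stays 4-byte aligned,
       giving a total of 8. */
    printf("%zu\n", sizeof s);
    return 0;
}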

In the end, a struct is only meaningful to the compiler and programmer. Each field is merely a memory address and nothing else about the struct is known to the CPU.

What about methods?

Function Pointers

Actually, before we can discuss object methods, we need to talk about functions. And to talk about functions, we need to cover pointers.

A pointer is just like any other variable except that the value it holds is a memory address.

The size of a pointer is the size of a memory address in the target system. If the platform is 64 bits, a pointer is 8 bytes. If it’s 32 bits, the pointer is 4 bytes and so on.

void func(void) {}
void (*f)(void) = func;


Taking sizeof f on my system, which is 64-bit, confirms that a pointer is 8 bytes.
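Here's the same check as a complete, runnable program:

#include <stdio.h>

void func(void) {}

int main(void) {
    /* A function pointer holds a memory address, so it is
       8 bytes on a 64-bit system. */
    void (*f)(void) = func;
    printf("%zu\n", sizeof f);
    return 0;
}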

Note: Taking the sizeof the function name, in this case func, gives the size 1 byte. To understand why, check out this response on Stack Overflow.

Methods

A method might seem a lot more complex than a function pointer but I’ll quickly demonstrate why it’s not.

In Go, you see something like this:

type myStruct struct {}
func (s myStruct) Method() {}


So why is it not so complex? Well, it's because another way to express this same method is:

func Method(s myStruct) {}

Once you recognise that a method is merely syntactic sugar, dispatching the call becomes much simpler. In both cases, move the data in argument s onto the stack or into a register and call the function as normal, treating the first parameter as the struct data.
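Here's a rough sketch of that desugaring in C (the names are illustrative, not what the Go compiler actually emits):

#include <stdio.h>

typedef struct { int x; } myStruct;

/* The "method" is an ordinary function whose first
   parameter is the receiver, passed by value. */
void myStruct_Method(myStruct s) {
    printf("x = %d\n", s.x);
}

int main(void) {
    myStruct s = {42};
    myStruct_Method(s);  /* s.Method() becomes this call */
    return 0;
}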

Arrays

Armed with this knowledge, we can deduce how an array works pretty easily. An array is merely a segment of contiguous memory.

int arr[5] = {0, 1, 2, 3, 4};

The assembly for this C array moves each value onto the stack one at a time, side by side. All the data for the array is kept packed together, but other than that there's nothing special going on.

All the CPU needs to know is where the first element of the array can be found. We can access the rest by using a byte offset from the first element.

Note: If you ever wondered why an array in C decays to a pointer (like I did), then there's your answer.

To access a specific index, x86 assembly uses SIB (scale-index-base) addressing: base + (scale * index).

Base is the address of the start of the array, in this case arr. The scale is the size of each element; for a standard 4-byte int, that's 4 bytes. From the address of arr, add 4 bytes multiplied by i to get the ith element.
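Here's that arithmetic spelled out in C:

#include <stdio.h>

int main(void) {
    int arr[5] = {0, 1, 2, 3, 4};
    int i = 3;
    /* base + (scale * index): the address of arr plus
       sizeof(int) bytes times i. */
    printf("%d\n", arr[i]);      /* prints 3 */
    printf("%d\n", *(arr + i));  /* same element, same address */
    return 0;
}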

Note: In C, arr[i] is equivalent to *(arr + i) for this very reason. In both cases, you are accessing the ith element relative to arr's address.

Note: A C array does not carry any information about itself, like its length. This is why C strings are terminated with a null character ('\0'); it's the only way C functions can detect the end of a character array. It's also one reason the C specification does not require bounds checking.
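That end-of-string scan looks something like this (a hand-rolled strlen, purely for illustration):

#include <stdio.h>

/* Walk the character array until the terminating '\0'. */
size_t my_strlen(const char *s) {
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

int main(void) {
    printf("%zu\n", my_strlen("hello"));  /* prints 5 */
    return 0;
}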

Implementing

I hope my explanations will help you implement arrays, structures (or classes), function pointers and methods in your own language. These things can seem very daunting, but they're not actually that hard.

The hard part is wrapping your head around it!

In the end, the generated code is pretty simple. The complexity is, generally speaking, going to be in your compiler.

If you're interested in what some Go types look like, check out this blog post by Russ Cox.

Sunday, 8 March 2015

Concurrency vs Parallelism

Today, while I was reading up on instruction pipelines, I found my mind wandering slightly off on a tangent. I started to think about concurrency and parallelism.

It occurred to me that there is a really simple way to illustrate the difference if you're struggling with the concept yourself. There is also a great talk by Rob Pike called "Concurrency Is Not Parallelism" that goes a lot deeper into the comparison and may help you, too.

Some Kind of Process

Let's use the example of a computer assembly line. We can define a series of steps that must be followed to properly build our computer. The stages are as follows:


  1. Place an empty case on the conveyor and put the cover beside it.
  2. Insert and fasten the motherboard to the case.
  3. Insert the CPU, memory and a graphics card.
  4. Insert a disc drive and a hard drive.
  5. Finally, insert the power supply unit, hook up all the cables and close the case.


Note: A common word, and metaphor, for a process like this is a “pipeline” because the product of each stage flows into the next and only travels in one direction.

Concurrency

Each step is dependent on the one that comes before, but each can be worked on independently of the others. Each stage can have a separate person working on it, or a single person can move between the steps.

Concurrency, to my mind, is taking a process (a task that requires a specific order) and dividing it up into steps that can be worked on independently of one another.

Parallelism

Adding a second line, completely separate from the first, is an example of parallelism. It requires, at minimum, two workers, one for each line.

Concurrency in Parallel

A single worker is like a single-core CPU. This worker must manage its time between the tasks, bouncing from stage to stage.

This same process can have many workers, each one attached to a stage in the process. This is a concurrent solution that has been parallelized.
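To make that concrete, here's a minimal sketch of two parallel assembly lines using POSIX threads (the stage functions are hypothetical stand-ins for real work; compile with gcc -pthread):

#include <pthread.h>
#include <stdio.h>

#define UNITS 3

/* Each stage is an independent step in the pipeline. */
static void place_case(int id) { printf("unit %d: case placed\n", id); }
static void fit_board(int id)  { printf("unit %d: motherboard in\n", id); }
static void close_case(int id) { printf("unit %d: case closed\n", id); }

/* One worker walks each unit through every stage: a concurrent
   design run serially, like a single core bouncing between tasks. */
static void *line(void *arg) {
    int first = *(int *)arg;
    for (int id = first; id < first + UNITS; id++) {
        place_case(id);
        fit_board(id);
        close_case(id);
    }
    return NULL;
}

int main(void) {
    /* Two separate lines running at once: the same concurrent
       design, parallelized across two workers (threads). */
    pthread_t a, b;
    int first_a = 0, first_b = UNITS;
    pthread_create(&a, NULL, line, &first_a);
    pthread_create(&b, NULL, line, &first_b);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}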

Summation

Concurrency is a matter of design and has no direct hardware constraints. A concurrent solution can run serially, but it can often be run in parallel with ease.