The Transterpreter Project

Concurrency, everywhere.


Transterpreter Summer '09 - Day 17 - The Harvard machine

Today was Saturday and we only worked half a day. We still got things done though.

First off we did some cleanup and abstraction. In commits 5866, 5867, 5868, 5869, 5870, 5871, and 5872 we made various changes:

  • Split the code into multiple files.
  • Handle the return codes from running the interpreter, so that we can print sensible deadlock, error, and status messages.
  • Make an occam wiring module (wiring.module), containing the bindings to the Wiring functions that we're using.
  • Bind the serial output functions from Wiring so that we can output characters to the serial port from occam.

Once we had printing from occam working, the next step for any hotheaded occam programmer is to run commstime. Since everything seemed to have gone great so far, running a slightly more complex program should be absolutely no problem. Two days' worth of porting had yielded parallel blinkenlights, so we were confident that commstime should 'just work'.

What is Commstime?

Commstime is a small benchmark used to measure the cost of process startup and shutdown in occam. It also provides a measure of the context switch time, which is probably the figure that is used more often. (Peeking into the future, the source of the commstime we use can be found in the repository: commstime.occ.)

Commstime represents a much larger program than anything we have run on the Arduino before. While the benchmark portion of commstime is only slightly larger than the programs we have been testing with so far, there is also a largeish number of support functions for printing the results of the benchmark. All in all, the bytecode for the commstime program comes to just over two kilobytes (incidentally just over the amount of available RAM on our AVR). This is compiled into the Transterpreter (as a static array of bytes) and the whole thing (interpreter + bytecode) is uploaded to flash.

So, Let's Run Commstime...

Bang! The machine crashes. Things that make you go hmm...


Now, we had noticed, during our initial debugging sessions on the Transterpreter, that the virtual machine's workspace pointer (think of it as the stack pointer) seemed to live at an address awfully close to the instruction pointer. We did not think too much of it at the time; perhaps the flash and RAM were simply mapped such that our pointers ended up very close together. Given that the values of both the workspace pointer and the instruction pointer were very low, and that there is 32KB of flash on the AVR chip we are using, we should probably have been more suspicious than we initially were.

Since the commstime program is 'large', we did have a suspicion that it was perhaps being copied into memory. The bytecode for commstime is larger than the available memory on the chip, so we are going to have a problem if the bytecode is in fact being copied into RAM. The first port of call is therefore to check the declaration of the bytecode array in the C file that contains it (which is currently linked in directly with the Transterpreter): it should be const. It looks OK, but several failed attempts at running commstime convince us that something must be up. We turn the debugging output of the instruction and workspace pointers on again and, as earlier, observe that they are closer than we'd like. It now seems prudent to look at the AVR manuals, you know, to figure out how memory is laid out on the chip, and stuff. According to the manual, flash and RAM are laid out as follows:

  • Flash starts at 0x0000 and ends at 0x3FFF, with the bootloader living at the far end of flash (close to 0x3FFF).
  • SRAM starts at 0x0000 and finishes at 0x08FF.
    • Registers and I/O registers are mapped at 0x0000 to 0x00FF.
    • The general-purpose RAM is mapped at 0x0100 to 0x08FF.

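This is when it dawned on us that, for both memories to start at 0x0000, the AVR must be based on the Harvard architecture, which means that it has separate program and data memories. The same numeric address can therefore refer to a location in flash or to a location in RAM, depending on which memory is being accessed.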

Armed with this new information, we quickly found, in the avr-libc documentation, the magic which would enable us to specify that we really really want the bytecode array to stay in flash (instead of being copied into RAM). After adding the correct attribute to the declaration of the array, we naively compiled, uploaded, and ran the commstime program again to see if it now worked. It didn't, and this time it crashed in a slightly different way.

Reading just a little bit further on in the avr-libc documentation, it quickly became clear that when reading constants out of flash we must use a special macro, which ensures that the generated code reads out of flash and not RAM.
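As a rough illustration of the two avr-libc pieces involved, here is a minimal sketch of an array placed in flash with PROGMEM and read back with pgm_read_byte. The array name, its contents, and the read_bytecode_byte helper are made up for this example and are not the Transterpreter's actual code; the non-AVR fallback macros exist only so that the sketch also compiles on a desktop machine, where there is a single address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>   /* avr-libc: PROGMEM and pgm_read_byte */
#else
/* Host fallback (assumption for testing only): one flat address space. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* Hypothetical stand-in for the compiled bytecode; PROGMEM keeps it in flash. */
static const uint8_t example_bytecode[] PROGMEM = { 0x40, 0x41, 0xD0, 0x22, 0xF0 };

/* Read one byte of bytecode. On the AVR a plain example_bytecode[offset]
 * would read from RAM at the same numeric address, so the read has to go
 * through pgm_read_byte to fetch from program memory instead. */
uint8_t read_bytecode_byte(size_t offset)
{
    assert(offset < sizeof(example_bytecode));
    return pgm_read_byte(&example_bytecode[offset]);
}
```

The important point is that the attribute and the macro come as a pair: marking the array alone moves the data, but every read of it must then be routed through the macro.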

The Problem and Potential Solutions

The Transterpreter is a von Neumann architecture machine: it has one address space containing both code and data. Thus, inside the virtual machine there is no distinction between reading code and reading data. This makes it hard to apply a trivial fix to the codebase, i.e. we cannot just replace all reads out of program memory with a call that uses the appropriate avr-libc macro to correctly read out of flash. In the virtual machine we simply don't know if we are reading code or data.

What further complicates matters is that code and constant data are interspersed in the bytecode files, with no markers as to which is which. Our bytecode files have no sections which mark one part as code and another as data. We are therefore also denied the opportunity to apply a simple fix which would check whether we are reading out of a data segment of the bytecode file, as we simply don't have that information available.

A third thing that could have helped is a specific constant loading instruction in our virtual machine. This would make it easy to identify loads of constants out of program memory, and hence flash, and we could then presumably redefine the code for the load constant instruction to explicitly load out of flash. But we don't have a specific constant loading instruction, so this is not a quick fix either.

The Fix

Since we are trying to get up and running quickly, we are not really interested in implementing support for any of the potential solutions above. They are all too invasive, as they would require changes in the bytecode format (i.e. introducing sections) and/or changes in the compiler (i.e. introducing new instructions). We are really only interested in changing the virtual machine and, at that, we'd like to change it as little as possible.

Fortunately, the virtual machine contains machinery to allow the user to specify a memory interface. This memory interface is used for all reading from and writing to memory. Thus it is possible to substitute the default memory interface (a set of macros which do a one-to-one mapping of addresses supplied by the virtual machine to the address space of the host's memory) with a memory interface that is, for example, able to inspect the addresses and remap incoming requests to flash or RAM as appropriate.
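To give a feel for what such a remapping memory interface could look like, here is a minimal sketch. Everything in it is an assumption made for illustration: the names vm_read_byte and vm_write_byte, the VM_FLASH_BASE split point, and the tiny arrays are hypothetical and do not reflect the Transterpreter's real memory interface macros. The idea is simply that the virtual machine sees one flat address range, and the interface decides per access whether the byte lives in RAM or in flash.

```c
#include <assert.h>
#include <stdint.h>

#ifdef __AVR__
#include <avr/pgmspace.h>   /* avr-libc: PROGMEM and pgm_read_byte */
#else
/* Host fallback (assumption for testing only): one flat address space. */
#define PROGMEM
#define pgm_read_byte(addr) (*(const uint8_t *)(addr))
#endif

/* Hypothetical layout: VM addresses at or above VM_FLASH_BASE refer to
 * bytecode in flash; everything below refers to the workspace in RAM. */
#define VM_FLASH_BASE 0x8000u
#define VM_RAM_SIZE   64u

static const uint8_t vm_flash[] PROGMEM = { 0x11, 0x22, 0x33, 0x44 };
static uint8_t vm_ram[VM_RAM_SIZE];

/* Route a read to flash or RAM by inspecting the VM address. */
uint8_t vm_read_byte(uint16_t vm_addr)
{
    if (vm_addr >= VM_FLASH_BASE) {
        uint16_t offset = (uint16_t)(vm_addr - VM_FLASH_BASE);
        assert(offset < sizeof(vm_flash));
        return pgm_read_byte(&vm_flash[offset]);
    }
    assert(vm_addr < VM_RAM_SIZE);
    return vm_ram[vm_addr];
}

/* Writes only ever target RAM; flash is read-only from the VM's view. */
void vm_write_byte(uint16_t vm_addr, uint8_t value)
{
    assert(vm_addr < VM_RAM_SIZE);
    vm_ram[vm_addr] = value;
}
```

With an interface along these lines the interpreter itself never needs to know whether it is fetching code or data; the price is an address comparison on every memory access.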

While we have not yet implemented this, the last commit today (changeset 5873) is mostly a note to ourselves that we need to implement a new memory interface for the AVR which enables us to map both RAM and flash into the same address range.


