The Transterpreter Project

Concurrency, everywhere.


Transterpreter Summer '09 - Day 16 - It works, huzzah!

Yesterday left us with the ability to upload the Transterpreter runtime, but not the ability to use it. Today, with the first full day of work, we were hoping to end up with running code. Our very first attempt at this was the very exciting 'do nothing' program:

PROC main()
  SKIP
:

which worked with out to much hassle (changeset 5862). Good, but not quite cause for celebration yet. Next, we decided to move onto a more exciting program. One that never exits the interpreter and embodies a programming pattern often seen in occam: The WHILE TRUE loop. Most often there is code doing 'stuff' in the body of the loop, but for testing we sometimes see the following code, which is what we tried to run on the Arduino next:

PROC main()
  WHILE TRUE
    SKIP
:

After uploading this program, it promptly crashed the Arduino. Mmmmmm.

We then spent a considerable amount time digging around, during which I had to resurrect my Transputer bytecode reading abilities. Eventually we found that the slinker1 still had a bug which we are sure was fixed a long time ago. The bug in the slinker occurs only in WHILE TRUE: SKIP loops, which one would not expect to encounter in any serious program and thus do not see a lot of testing. The code which should be generated in the case of WHILE TRUE: SKIP is a backwards jump of four byte sized instructions (on a 16 bit machine)2. In the slinker however, the jump is represented as a jump to the absolute address of itself, but when we go to convert that to a relative jump, we fail to detect that we are jumping from the same address to the same address, and should therefore turn the absolute jump into a relative backwards jump (and not a relative forwards jump which is what currently happens). This failure is do to to the use of a < where we should use a <= in the backwards jump detection. Fixed (again! changeset 5863).

Onwards; Towards Working Code

Next we increase the stakes and aim for running the following blinkenlights program (the code shown here has has some of the support code used in the checked in version stripped out, but this is ok will later be pushed into a library anyway. Full source code for blink.occ is in the repository):

PROC main ()
  SEQ
    pinMode (LED.PIN, OUTPUT)
    WHILE TRUE
      SEQ
        digitalWrite (LED.PIN, HIGH)
        delay (100)
        digitalWrite (LED.PIN, LOW)
        delay (500)
:

Getting this code to work was just an exercise in setting up the FFI functionality in the Transterpreter and wrapping the various functions required to poke the pins on the chip. Initial explorations looked using PLACED occam variables, which enables us to place a specific variable at a certain memory address. Reads and writes to that memory address should then poke the pins. This works great, on platforms where pins are mapped into memory. On the AVR chips however, you need to use a special IO instruction and direct memory access to the pins is therefore not currently possible from occam.

We uploaded the new TVM and, as if by magic (ie occam code), the LED started blinking. Success!!! (there is a very exciting video at the end of the post showing the various blinkenlights).

Adam and myself working hard

A Note on Time

As we're currently running on top of the Arduino code, we get to use clock set up by that environment. The Arduino environment can give us time in both microseconds micros() and milliseconds millis(). Initially we used micros(), which returns long (32 bits on the Arduino), but occam only does time in the current word length, ie 16 bits on the Transterpreter Arduino port. When we using a long it is possible to get around 71 minutes out of the microsecond timer, but since we only use the lest significant 16 bits, we only get around 65 milliseconds before the counter wraps. Not that useful. So we have to switch to the millisecond timer (which also returns a long for around 50 days before rollover) but we only use the least significant 16 bits for just over one minute worth of counting before rollover. Better but still not great.

The point of talking about this is to highlight a problem with occam that we really would like to fix. It is not possible to declare timers that return values in anything but the ports wordsize, which is unhelpful on small platforms where it is hard to get granularity without rolling over the clock very quickly. Also it is not possible to declare timers in occam with different granularities, ie I cannot say "give me a timer with millisecond and a timer with second granularity". This would certainly help alleviate the problem as it would be possible to declare timers with high granularity for short periods of waiting and low granularity timers for long periods of waiting3.

I believe that on the Transputer chip there were two granularities of timer, but these were selected depending on whether a particular process was running at high or low priority. No syntax exists (or existed), nor did any instructions exist for selecting timer granularity.

Concurrent Printing

Timing aside, occam is a concurrent programming language and we really want to write concurrent programs. A single blinking LED is all fine and good, but it is multiple blinkenlights that we are interested in. The code to parallelise the blinkenlights introduced a parameterisation of the actual blinking of the LED. With this change the single LED blinking code from above would look like this:

PROC flash.pin (VAL INT pin, on, off)
  SEQ
    pinMode (pin, OUTPUT)
    WHILE TRUE
      SEQ
        digitalWrite (pin, HIGH)
        delay (on)
        digitalWrite (pin, LOW)
        delay (off)
:

PROC main ()
  flash.pin (LED.1.PIN, 100, 500)
:

The change to parallelise the blinkenlights is pretty simple, introducing a PAR and a replicated PAR block. The parallel blinkenlights is introduced in changeset 5865 which modifies the blink.occ program to look roughly like the following snippet:

PROC flash.pin (VAL INT pin, on, off)
  ... as above ...
:

PROC main ()
  PAR
    flash.pin (LED.1.PIN, 200, 500)
    flash.pin (LED.2.PIN, 300, 200)
    PAR i = 2 FOR 6
      flash.pin (i, i * 100, i * 230)
:

flashing for LED.1.PIN and LED.2.PIN are set up individually and not in the replicated PAR i = 2 FOR 6, as the pin constants do not follow the sequence generated by the replicated PAR.

This code makes the LEDs flash in pretty patterns. The code is also pretty clear (we think). We reuse the function that blinks a single LED (which uses the delay process to introduce delays between turning the LED on and off). To blink multiple LEDs with different delays, we simply run as many of these processes in parallel as we need. In fact the code above is simply a syntactic shorthand for this:

PROC main ()
  PAR
    flash.pin (LED.1.PIN, 200, 500)
    flash.pin (LED.2.PIN, 300, 200)
    flash.pin (2, 2 * 100, 2 * 230)
    flash.pin (3, 3 * 100, 3 * 230)
    flash.pin (4, 4 * 100, 4 * 230)
    flash.pin (5, 5 * 100, 5 * 230)
    flash.pin (6, 6 * 100, 6 * 230)
    flash.pin (7, 7 * 100, 7 * 230)
:

So, in occam, when we wish to start blinking multiple LEDs at with different delays at the same time, we simply run more of these processes in parallel, and we get the desired behaviour.

This is in contrast with using a more traditional imperative language where one can not naively make the transition from a function which blinks one LED using provided delay function, to just using that function multiple times. Why? The delay function that is used in the example Arduino LED blinking code stops the world (adapted from the Arduino Blink program):

void flash_pin(int pin, int on, int off) {
     digitalWrite(ledPin, HIGH);
     delay(on);
     digitalWrite(ledPin, LOW);
     delay(off);
}

That means, that when the delay is used, nothing else can run, and our LEDs will only blink one at a time. The Arduino tutorial proposes a solution for performing delays without using the delay function: blink without delay but this radically changes the structure of the program from the much simpler one using delay(). Using this method we also loose the ability to show users a natural progression from using a simple, reasonably intuitive parameterised pin flashing routine, to blink one LED, and then using it to blink multiple LEDs. Not so when we use a parallel language, such as occam.

Evidence (The Video)


  1. The slinker is the bytecode scrubber and linker we use when compiling code for the Transterpreter. It is written in Scheme (PLT Scheme to be exact). There is also now a new linker, the plinker (written in Perl), but this does not currently support 16bit plinking. The slinker does support 16bit slinking, but is missing some other features, such as dead-code elimination. 

  2. We jump four bytes back as the slinker does not optimally prefix the generated bytecode (optimal prefixing being the task of determining the minimum amount of bytes required to represent the loading of particular address offset, due to the Transputer bytecode format4 which uses a variable number of bytes to encode operands). 

  3. You might wonder why getting the time is not just a function call in occam and we should therefore be able to return whatever we like. While we can do this, there is a special syntax for timers which is used, especially in the ALTernation construct, where enables the reading from a number of channels or a timeout value. It probably is possible to fudge something, ie with a timeout process which uses a function to read time with any granularity and wordsize we like and then sends a signal when the timer fires, which can be used in the ALT. This is a kludge though. 

  4. The Transputer bytecode, which the Transterpreter interprets, has an instruction length of one byte (eight bits). The most significant nibble of each instruction contains the function (fn) and the least significant nibble contains the operand (op). This only leaves 16 instructions, which is not very much, and arguments in the range of 0-15 (unsigned), which is not in itself enough to do very many interesting things. In order to overcome this operand size problem two of the 16 instructions are used for prefixing and negative prefixing, that is, building up a larger operand in the special operand register, on which the remaining 14 instructions operate. To be able to have more than 14 instructions, a further instruction is reserved as the operator and uses the current value in the operand register in order to jump to a secondary instruction contained in the secondary instruction table. None of these secondary instructions can take arguments (other than those passed on the stack). 



Metadata