How protothreads really work

What goes on behind the magical macros in the C protothreads library? Why are they macros? How do the macros work? Read on for the explanation.

In the C implementation of protothreads, all protothread operations are hidden behind C macros. The reason for building the protothread library on C macros rather than C functions is that protothreads alter the flow of control. This is typically difficult to do with C functions since such an implementation typically would require low-level assembly code to work. By implementing protothreads with macros, it is possible for them to alter the flow of control using only standard C constructs.

This page explains how protothreads work under the hood. We do this by taking a look at how the C preprocessor expands the protothread macros, and by looking at how the resulting C code is executed.

First, we'll introduce a simple example program written with protothreads. Since this is a simple program we can show the entire program, including the main() function from which the protothread is driven. The code, shown below, waits for a counter to reach a certain threshold, prints out a message, and resets the counter. This is done in a while() loop that runs forever. The counter is increased in the main() function.

#include "pt.h"
 
static int counter;
static struct pt example_pt;
 
static
PT_THREAD(example(struct pt *pt))
{
  PT_BEGIN(pt);
  
  while(1) {
    PT_WAIT_UNTIL(pt, counter == 1000);
    printf("Threshold reached\n");
    counter = 0;
  }
  
  PT_END(pt);
}
 
int main(void)
{
  counter = 0;
  PT_INIT(&example_pt);
  
  while(1) {
    example(&example_pt);
    counter++;
  }
  return 0;
}

Before we let the C preprocessor expand the above code, we'll take a look at how the protothread macros are defined. In order to make things easier, we use a simpler definition than the actual definition from the protothreads source code. (The definition used here is a combined version of the protothread macros and the local continuation macros implemented with the C switch statement.) This definition looks like:

struct pt { unsigned short lc; };
#define PT_THREAD(name_args)  char name_args
#define PT_BEGIN(pt)          switch(pt->lc) { case 0:
#define PT_WAIT_UNTIL(pt, c)  pt->lc = __LINE__; case __LINE__: \
                              if(!(c)) return 0
#define PT_END(pt)            } pt->lc = 0; return 2
#define PT_INIT(pt)           pt->lc = 0

We see that the struct pt consists of a single unsigned short called lc, short for local continuation. This unsigned short variable is the source of the "two byte overhead" frequently mentioned on the protothread web pages. Furthermore, we see that the PT_THREAD macro simply puts a char before its argument. Also, we note how the PT_BEGIN and PT_END macros open and close a C switch statement, respectively. But the PT_WAIT_UNTIL macro is the most complex looking of them all. It contains one assignment, one case statement, one if statement, and even a return statement! Also, it uses the built-in __LINE__ macro twice. The __LINE__ macro is a special macro that the C preprocessor will expand to the line number at which the macro is issued. Finally, the PT_INIT macro simply initializes the lc variable to zero.

Many of the statements used in the protothread macros are not commonly used in macros. The return statement used in the PT_WAIT_UNTIL macro breaks the flow of control in the function the macro is used. For this reason, many people dislike the use return statements in macros. The PT_BEGIN macro opens a switch statement, but does not close it. The PT_END macro closes a compound statement that it has not itself opened. These things does look weird when looked at without the perspective of protothreads. However, in the context of protothreads these things are absolutely essential to the correct operation of protothreads: the macros has to change the flow of control in the C function in which they are used. This is indeed the whole point of protothreads.

Ok, enough of talk about how the protothread macros look weird to seasoned C developers. We now instead look at how the protothread in the example above looks when expanded by the C preprocessor. To make it easier to see what is happening, we put the original and the expanded versions side by side.

static
PT_THREAD(example(struct pt *pt))
{
  PT_BEGIN(pt);
  
  while(1) {
    PT_WAIT_UNTIL(pt,
      counter == 1000);
    printf("Threshold reached\n");
    counter = 0;
  }
  
  PT_END(pt);
}
static
char example(struct pt *pt)
{
  switch(pt->lc) { case 0:
 
  while(1) {
    pt->lc = 12; case 12:
    if(!(counter == 1000)) return 0;
    printf("Threshold reached\n");
    counter = 0;
  }
 
  } pt->lc = 0; return 2;
}

At the first line of the code we see how the PT_THREAD macro has expanded so that the example() protothread has been turned into a regular C function that returns a char. The return value of the protothread function can be used to determine if the protothread is blocked waiting for something to happen or if the protothread has exited or ended.

The PT_BEGIN macro has expanded to a PT(switch) statement with an open brace. If we look down to the end of the function we see that the expansion of the PT_END macro contains the closing brace for the switch. After the opening brace, we see how the PT_BEGIN expansion contains a case 0: statement. This is to ensure that the code after the PT_BEGIN statement is the first to be executed the first time the protothread is run. Recall that PT_INIT set pt->lc to zero.

Moving past the while(1) line, we see that the PT_WAIT_UNTIL macro has been expanded into something that contains the number 12. The pt->lc variable is set to 12, and a case 12: statement follows right after the assignment. After this, the counter variable is checked to see if it has reached 1000 or not. If not, the example() function now executed an explicit return! This all may seem surprising: where did the number 12 come from and why does the function return in the middle of the while(1) loop? To understand this we need to take a look at how the code is executed in the example() function the next time it is called.

The next time the example() function is called from the main() function, the pt->lc variable will not be zero but 12, as it was set in the expansion of the PT_WAIT_UNTIL macro. This makes the switch(pt->lc) jump to the case 12: statement. This statement is just before the if statement where counter variable is checked to see if it has reached 1000! So the counter variable is checked again. If it has not reached 1000, the example() function returns again. The next time the function is invoked, the switch jumps to the case 12: again and reevaluates the counter == 1000 statement. It will continue to do so until the counter variable reaches 1000. Then, the printf statement is executed and the counter variable is set to zero, before the while(1) loop loops again.

But where did the number 12 come from? It is the line number of the PT_WAIT_UNTIL statement (check it by counting lines in the original program on your screen!). The nice thing with line numbers are that they are monotonically increasing. That is, if we put another PT_WAIT_UNTIL statement later in our program, the line number will be different from the first PT_WAIT_UNTIL statement. Therefore, the switch(pt->lc) knows exactly where to jump - there are no ambiguities.

Did you notice anything slightly strange with the code above? The switch(pt->lc) statement jumped right into the while(1) loop. The case 12: statement was inside the while(1) loop! Does this really work? Yes, it does in fact work! This feature of the C programming language was probably first found by Tom Duff in his wonderful programming trick dubbed Duff's Device. The same trick has also been used by Simon Tatham to implement coroutines in C (a terrific piece of code).

Ok, now that you know how protothreads work inside its wicked macros, download the library and try it out for yourself!