Next: Cilk-NOW job architecture Up: Adaptive and Reliable Previous: Introduction

# The Cilk language and work-stealing scheduler

In this section we overview the Cilk parallel multithreaded language and its runtime system's work-stealing scheduler [6, 8, 26]. For brevity, we shall not present the entire Cilk language, and we shall omit some details of the work-stealing algorithm. Since Cilk-2 forms the basis for the Cilk-NOW system, we shall focus on the Cilk-2 language and on the Cilk-2 runtime system as implemented without adaptive parallelism or fault tolerance.

A Cilk program contains one or more Cilk procedures, and each Cilk procedure contains one or more Cilk threads. A Cilk procedure is the parallel equivalent of a C function, and a Cilk thread is a nonsuspending piece of a procedure. The Cilk runtime system manipulates and schedules the threads. The runtime system is not aware of the grouping of threads into procedures. Cilk procedures are purely an abstraction supported by the cilk2c type-checking preprocessor [33].

Consider a program that uses double recursion to compute the Fibonacci function. The Fibonacci function fib(n) for positive n is defined as

Figure 2 shows how this function is written as a Cilk procedure consisting of two Cilk threads: Fib and Sum. While double recursion is a terrible way to compute Fibonacci numbers, this toy example does illustrate a common pattern occurring in divide-and-conquer applications: recursive calls solve smaller subcases and then the partial results are merged to produce the final result.

Figure 2: A Cilk procedure to compute the nth Fibonacci number. This procedure contains two threads, Fib and Sum.

Figure 3: The Fib thread spawns a successor and two children. For the successor, it creates a closure with 2 empty argument slots, and for each child, it creates a closure with a continuation referring to one of these empty slots. The background shading denotes Cilk procedures.

A continuation is a global reference to an empty argument slot of a closure, implemented as a compound data structure containing a pointer to a closure and an offset that designates one of the closure's argument slots. Continuations are typed with the C data type of the slot in the closure. In the Cilk language, continuations are declared by the type modifier keyword cont. For example, the Fib thread declares two integer continuations, x and y.

Using the spawn_next primitive, a thread spawns a successor thread by creating a closure for the successor. The successor thread is part of the same procedure as its predecessor. For example, in the Fib thread, the statement spawn_next Sum (k, ?x, ?y) allocates a closure with Sum as the thread and three argument slots, as illustrated in Figure 3. The first slot is initialized with the continuation k and the last two slots are empty. The continuation variables x and y are initialized to refer to these two empty slots, and the join counter is set to 2. This closure is waiting.

Similarly, using the spawn primitive, a thread spawns a child thread by creating a closure for the child. The child thread is the initial thread of a newly spawned child procedure. The spawn statement is semantically identical to spawn_next. For example, the Fib thread spawns two children as shown in Figure 3. The statement spawn Fib (x, n-1) allocates a closure with Fib as the thread and two argument slots. The first slot is initialized with the continuation x which, as a consequence of the previous statement, refers to a slot in its parent's successor closure. The second slot is initialized with the value of n-1. The join counter is set to zero, so the thread is ready.

An executing thread sends a value to a waiting thread by placing the value into an argument slot of the waiting thread's closure. The send_argument statement sends a value to the empty argument slot of a waiting closure specified by its argument. The types of the continuation and the value must be compatible. The join counter of the waiting closure is decremented, and if it becomes zero, then the closure is ready. For example, the statement send_argument (k, n) in Fib writes the value of n into an empty argument slot in the parent procedure's waiting Sum closure and decrements its join counter. When the Sum closure's join counter reaches zero, it is ready. When the Sum thread gets executed, it adds its two arguments, x and y, and then uses send_argument to "return" this result up to its parent procedure's waiting Sum thread.