Check out the new USENIX Web site. next up previous
Next: Topological Race Detection Up: TRaDeA Topological Approach Previous: Introduction

Synchronisation in Java

 

Java [2] is an object-oriented language that was designed with multi-threading in mind [11, 12, 21]. In Java there are only two fundamental data types: `primitive types' and `reference types'. Primitive types consist of booleans, integers, floats, ...Reference types contain a reference to an object or contain `null'. These objects are created dynamically on the heap. A garbage collector is responsible for destroying them when they are no longer referenced [8]. Objects themselves contain primitive types or references.

A race between two (or more) threads occurs when they modify a member variable of an object in an unpredictable order. Races on variables on the stack are impossible since the stack can only be manipulated by the thread it belongs to.

To avoid data races, a programmer can force fragments of code running on different threads to execute in a certain order by adding extra synchronisation operations. Java offers several constructs that enforce extra synchronisationgif:

The fragments of code of a thread that are separated from each other by a synchronisation operation are abstracted into the notion of `events'. Notice that the synchronisation operations are not considered events themselves. The tex2html_wrap_inline586 event of thread tex2html_wrap_inline588 will be denoted by tex2html_wrap_inline590 . Two events, tex2html_wrap_inline592 and tex2html_wrap_inline594 , are said to be `ordered', tex2html_wrap_inline596 , if there exists a set of synchronisations that force event tex2html_wrap_inline592 always to occur before event tex2html_wrap_inline594 . A data race occurs when there is no set of synchronisations that force the events modifying a shared variable to occur in a fixed order.

TRaDe models the ordering of events by using a construct called a `vector clock' as defined in [6, 7, 20]. Vector clocks are tuples of integers with a dimension equal to the maximum degree of parallelism (number of threads) in the application. The first event, tex2html_wrap_inline602 of every thread tex2html_wrap_inline588 is assigned the vector clock, tex2html_wrap_inline606 , with components

equation67

The value of the vector clock of the next event in a thread is calculated using the vector clocks of its preceding events. If event tex2html_wrap_inline590 on thread tex2html_wrap_inline588 is guaranteed to occur after events tex2html_wrap_inline612 , its vector clock is updated as follows

eqnarray74

For our purposes, the most important property of vector clocks, is that they can be used to verify whether two events are ordered. Two events, a and b, are ordered iff

equation82

Two events are parallel, i.e. not ordered, iff

equation84

If we define W(a) the set of all locations written to during event a and R(a) the set of all locations read during event a, then two events, a and b, will be involved in a data race iff

  equation86

i.e. the two events are executed in parallel, both events access a common variable and at least one event modifies the variable.

The ordering between events is obtained by observing the synchronisation operations in Java. The start member function of Thread is used by one thread to start the execution of a second thread. The join member function allows one thread to wait for the end of the execution of a second thread. These operations impose an ordering on the events of these threads as can be seen in Figure 2. The value of the vector clocks is shown in the figure, illustrating the calculation of the vector clocks at synchronisation operations.

  figure96
Figure 2:   Synchronisation using the Thread class

A lock is associated with every Object in Java. A thread can try to take this lock using the bytecodes monitorenter and monitorexit. When the object is already locked, the thread will wait until it is unlocked and can then proceed. This construct does not impose a fixed time order on the code of the two threads involved, it just enforces mutual exclusion. It does suggest that the programmer is aware of a potential race and has used this construct as a means of synchronisation. We therefore consider this a `de facto' ordering, depicted in Figure 3 by a dashed arrow.

  figure107
Figure 3:   Synchronisation through a locked object

The synchronized keyword is applied to a subset of the member functions of a class, the `monitor'. When a thread invokes one of these member functions on an object of the synchronised class, Java ensures that none of the other member functions in the monitor is being executed. This is implemented through the object locking mechanism mentioned above. When a synchronised member function is executed, the lock of the object containing the member function is taken. When the member function finishes, the lock is released.

A final set of synchronisation primitives is wait and notify(All) which are member functions of every Object. When a thread invokes wait on an object, the execution of the thread is halted until another thread executes notify(All) on that very same object. At that time the first thread in line can continue its execution. This imposes the ordering depicted by the dotted arrow in Figure 4. However, a thread is only allowed to invoke wait or notify on an object if that thread owns the lock of that object, so in reality it suffices to observe the ordering between the monitorenter and monitorexit depicted by the solid arrows.

  figure124
Figure 4:   Synchronisation through signals

To detect data races, an access history for every object is constructed. In this access history, every read an write operation to the object must be stored together with the identity of the thread that performed the operation and the vector clock of the event to which the operation belongs. When a new read or write operation occurs, it is compared to the previous operations stored in the access history. If condition 5 is satisfied, a race is found.

Of course, as the program continues to execute, the access histories grow without bounds. In [5] it is shown that it suffices to store only the lastgif write operation in the access history, since these write operations must be ordered or a race would already have occurred. As a consequence, only one write operation is stored in the access history. Also, only the read operations which are parallel with each other need to be stored. So at most one read operation for every thread present in the program must be stored in the access history.

Still, this can amount to a very large overhead, especially if there are many threads (remember that the size of each vector clock grows proportionally to the number of threads). One way out is to reduce the number of objects for which an access history must be maintained. In the next section, we shall present a method which makes this possible.


next up previous
Next: Topological Race Detection Up: TRaDeA Topological Approach Previous: Introduction

Mark Christiaens
Fri Feb 23 11:13:35 MET 2001