

Reader-Writer-Locking/RCU Analogy

Although RCU has been used in a great many interesting and surprising ways, one of the most straightforward is as a replacement for reader-writer locking. In this section, we present this analogy by protecting the simple doubly-linked-list data structure shown in Figure 6 first with reader-writer locks and then with RCU, and comparing the results. This structure has a struct list_head that is manipulated by the standard Linux list-manipulation primitives, a search key, and a single long integer for data. The RCU primitive call_rcu() requires some space in the data element, which is supplied by the my_rcu_head field.

Figure 6: List Element Data Structure
    struct el {
        struct list_head list;
        long key;           /* search key */
        long data;
        struct rcu_head my_rcu_head;
    };
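
The code in Table 2 below also references a global list header head and a lock list_lock, whose declarations do not appear in the paper. Under the kernel conventions of the time, they might look something like the following (list_lock is an rwlock_t in the reader-writer-lock column and a spinlock_t in the RCU column; a given file would declare one or the other):

    static LIST_HEAD(head);    /* global list header */

    /* Reader-writer-lock version: */
    static rwlock_t list_lock = RW_LOCK_UNLOCKED;

    /* RCU version: */
    static spinlock_t list_lock = SPIN_LOCK_UNLOCKED;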

The reader-writer-lock/RCU analogy substitutes primitives as shown in Table 1. The asterisked primitives are no-ops in non-preemptible kernels; in preemptible kernels, they suppress preemption, which is an extremely cheap operation on the local task structure. Note that since neither rcu_read_lock() nor rcu_read_unlock() blocks irq or softirq contexts, primitives must be added for this purpose where needed. For example, read_lock_irqsave() must become rcu_read_lock() followed by local_irq_save().
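
As a sketch of that last transformation, a read-side critical section that must also exclude local irq handlers might be converted as follows (reusing the list_lock from the examples below):

    /* Reader-writer lock, also blocking local irqs: */
    unsigned long flags;

    read_lock_irqsave(&list_lock, flags);
    /* ... read-side critical section ... */
    read_unlock_irqrestore(&list_lock, flags);

    /* RCU equivalent: irqs must be suppressed separately: */
    rcu_read_lock();
    local_irq_save(flags);
    /* ... read-side critical section ... */
    local_irq_restore(flags);
    rcu_read_unlock();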


Table 1: Reader-Writer-Lock/RCU Substitutions

    Reader-Writer Lock      Read-Copy Update
    ------------------      ----------------
    rwlock_t                spinlock_t
    read_lock()             rcu_read_lock() *
    read_unlock()           rcu_read_unlock() *
    write_lock()            spin_lock()
    write_unlock()          spin_unlock()
    list_add()              list_add_rcu()
    list_add_tail()         list_add_tail_rcu()
    list_del()              list_del_rcu()
    list_for_each()         list_for_each_rcu()

    * no-op unless CONFIG_PREEMPT, in which case preemption is suppressed


Deletion from the list is illustrated by the upper section of Table 2. The write_lock() and write_unlock() are replaced by spin_lock() and spin_unlock(), respectively, the list_del() is replaced by list_del_rcu(), and the direct call to my_free() is replaced by call_rcu(). The call_rcu() will, after a grace period elapses, pass p to my_free(), using the struct rcu_head in the my_rcu_head field to keep track of the deferred function call and its argument.
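
The paper does not show my_free() itself. A minimal sketch consistent with the cast used in Table 2, assuming the element was allocated with kmalloc(), might be:

    /* Hypothetical helper: frees a list element.  Called directly in
     * the reader-writer-lock version, and after a grace period via
     * call_rcu() in the RCU version. */
    static void my_free(struct el *p)
    {
        kfree(p);
    }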


Table 2: Reader-Writer-Lock and Read-Copy-Update Analogy
(each code fragment is shown first in its reader-writer-lock form, then in its RCU form)
Deletion, reader-writer lock:

    void delete(long mykey)
    {
        struct el *p;

        write_lock(&list_lock);
        p = search(mykey);
        if (p != NULL) {
            list_del(&p->list);
        }
        write_unlock(&list_lock);
        if (p != NULL) {
            my_free(p);
        }
    }
Deletion, RCU:

    void delete(long mykey)
    {
        struct el *p;

        spin_lock(&list_lock);
        p = search(mykey);
        if (p != NULL) {
            list_del_rcu(&p->list);
        }
        spin_unlock(&list_lock);
        if (p != NULL) {
            call_rcu(&p->my_rcu_head,
                     (void (*)(void *))my_free, p);
        }
    }
Insertion, reader-writer lock:

    void insert(long key, long data)
    {
        struct el *p;

        p = kmalloc(sizeof(*p), GFP_ATOMIC);
        if (p == NULL) {
            return;  /* allocation failure; error handling elided */
        }
        p->key = key;
        p->data = data;
        write_lock(&list_lock);
        list_add_tail(&p->list, &head);
        write_unlock(&list_lock);
    }
Insertion, RCU:

    void insert(long key, long data)
    {
        struct el *p;

        p = kmalloc(sizeof(*p), GFP_ATOMIC);
        if (p == NULL) {
            return;  /* allocation failure; error handling elided */
        }
        p->key = key;
        p->data = data;
        spin_lock(&list_lock);
        list_add_tail_rcu(&p->list, &head);
        spin_unlock(&list_lock);
    }
Search (locking handled by the caller), reader-writer lock:

    struct el *search(long mykey)
    {
        struct list_head *lp;
        struct el *p;

        list_for_each(lp, &head) {
            p = list_entry(lp, struct el, list);
            if (p->key == mykey) {
                return p;
            }
        }
        return NULL;
    }
Search (locking handled by the caller), RCU:

    struct el *search(long mykey)
    {
        struct list_head *lp;
        struct el *p;

        list_for_each_rcu(lp, &head) {
            p = list_entry(lp, struct el, list);
            if (p->key == mykey) {
                return p;
            }
        }
        return NULL;
    }
Read-only search (usage), reader-writer lock:

    struct el *p;

    read_lock(&list_lock);
    p = search(mykey);
    if (p == NULL) {
        /* handle error condition */
    } else {
        /* access *p without modifying it */
    }
    read_unlock(&list_lock);
Read-only search (usage), RCU:

    struct el *p;

    rcu_read_lock();  /* no-op unless CONFIG_PREEMPT */
    p = search(mykey);
    if (p == NULL) {
        /* handle error condition */
    } else {
        /* access *p without modifying it */
    }
    rcu_read_unlock();  /* no-op unless CONFIG_PREEMPT */


Insertion into the list is illustrated by the second section of Table 2. Again, the write_lock() and write_unlock() are replaced by the simple spinlock primitives spin_lock() and spin_unlock(), respectively. The list_add_tail() is replaced by list_add_tail_rcu(). The rest of the code remains the same.
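
The _rcu suffix is doing real work here: list_add_tail_rcu() must order the initialization of the new element's pointers before the store that makes it reachable, so that a concurrent reader never sees a half-linked element. A simplified sketch of the underlying idea (the kernel's actual implementation differs in detail):

    static inline void __list_add_rcu(struct list_head *new,
                                      struct list_head *prev,
                                      struct list_head *next)
    {
        new->next = next;
        new->prev = prev;
        smp_wmb();          /* order initialization before publication */
        next->prev = new;
        prev->next = new;
    }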

Searching the list is illustrated by the third section of Table 2. Locking is handled by the caller, so the two variants differ only in that the list_for_each() is replaced by list_for_each_rcu().
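
On the read side, list_for_each_rcu() must in turn ensure that each ->next pointer is fetched before anything is dereferenced through it, which on DEC Alpha requires an explicit data-dependency barrier. A simplified sketch of the idea, not the kernel's exact macro (which also prefetches), using a GCC statement expression as kernel code commonly does:

    #define list_for_each_rcu(pos, head) \
        for (pos = (head)->next; \
             ({ smp_read_barrier_depends(); pos != (head); }); \
             pos = pos->next)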

Searching the list for read-only access is illustrated by the last section of Table 2. The only difference is that the read_lock() and read_unlock() primitives are replaced by rcu_read_lock() and rcu_read_unlock(), respectively. Again, the bulk of the code remains the same for both cases.

Although this analogy can be quite compelling and useful, there are some caveats:

  1. Read-side critical sections may see "stale data" that has been removed from the list but not yet freed. There are some situations (e.g., routing tables for best-effort protocols) where this is not a problem. In other situations, such stale data may be detected and rejected [Pugh90], as illustrated in Section 4.
  2. Read-side critical sections may run concurrently with write-side critical sections.
  3. The grace period delays the freeing of memory, which means that both the memory footprint and the cache footprint of the code will be somewhat larger with RCU than with reader-writer locking.
  4. When converting to RCU, write-side reader-writer-locking code that modifies list elements in place must often be restructured to prevent read-side RCU code from seeing the data in an inconsistent state. In many cases this restructuring is quite straightforward: for example, creating a new list element with the desired state, then replacing the old element with the new one, as sketched below.
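
As an illustration of the fourth caveat, an in-place update of the ->data field under the reader-writer lock might be restructured for RCU along the following lines. This update() function is hypothetical, not part of the paper's example; it reuses search(), my_free(), and the my_rcu_head field from Table 2 and Figure 6:

    void update(long mykey, long newdata)
    {
        struct el *old_p, *new_p;

        new_p = kmalloc(sizeof(*new_p), GFP_ATOMIC);
        if (new_p == NULL) {
            return;  /* allocation failure; error handling elided */
        }
        spin_lock(&list_lock);
        old_p = search(mykey);
        if (old_p == NULL) {
            spin_unlock(&list_lock);
            kfree(new_p);
            return;
        }
        /* Fully initialize the copy before linking it in. */
        new_p->key = old_p->key;
        new_p->data = newdata;
        list_add_tail_rcu(&new_p->list, &old_p->list);  /* insert before old */
        list_del_rcu(&old_p->list);
        spin_unlock(&list_lock);
        call_rcu(&old_p->my_rcu_head,
                 (void (*)(void *))my_free, old_p);
    }

Concurrent readers then see either the old element or the new one, but never a partially updated element.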

Where it applies, this analogy can deliver full parallelism with almost no increase in complexity. For example, Section 4 shows how applying this analogy to System V IPC yields order-of-magnitude speedups with a very small increase in code size and complexity.

RCU has also been used as a lazy barrier-synchronization mechanism, as a mode-change control mechanism, and for more sophisticated list maintenance. Retrofitting existing code with RCU as shown above can produce significant performance gains, but of course the best results are obtained by designing RCU into the algorithms and code from the start.

Other methods have been proposed for eliminating locking [Herlihy93], and, when combined with more recent refinements [Michael02a,Michael02b], these methods are practical in some circumstances. However, they still require expensive atomic operations on shared storage, resulting in pipeline stalls, cache thrashing, and memory contention, even for read-only accesses.

