1 struct el *search(long addr)
2 {
3 struct el *p;
5 p = head->next;
6 while (p != head) {
7 if (p->address == addr) {
8 return (p);
9 }
10 p = p->next;
11 }
12 return (NULL);
13 }
Figure 8: Read-Copy Search
The search() function can return a reference to an
already-deleted element, but the kfree rcu() guar-
antees that the element will not be freed (and thus
possibly re-used for some other purpose) while this
reference exists (see Figure 20 for a definition of
kfree rcu()). There are a number of techniques
that may be used to ensure that search() returns
references only to elements that have not yet been
deleted; see Section 7.3 for an example. However,
there are quite a few algorithms that tolerate “stale
data”, for example, many algorithms that track
state external to the machine must deal with stale
data in any case due to communications delays.
The delete function is quite similar to that
of a single-threaded application, with the addi-
tion of locking, and with kfree() replaced by
kfree rcu(). The internal implementation of
kfree rcu() waits for a grace period before freeing
the specified block of memory (see Section 4.2), and
also provides the required read-write barriers that
allow this function to execute correctly on weakly
consistent machines.
The search() function contains absolutely no locks
or atomic instructions, which means that the per-
formance of this function will scale with CPU core
clock rate, rather than the much slower memory
latencies for an implementation based on locks or
atomic operations. In addition, the search() does
not disable interrupts, which means that read-copy
update can improve performance of UP as well as
SMP kernels. However, search() can return stale
data. This can be prevented, if need be, see for
example Section 7.3.
Note that delete() is very similar to its reference-
count counterpart, including the global lock. This
particular implementation will therefore give good
1 void delete(struct el *p)
2 {
3 spin_lock(&list_lock);
4 p->next->prev = p->prev;
5 p->prev->next = p->next;
6 spin_unlock(&list_lock);
7 kfree_rcu(p, NULL);
8 }
Figure 9: Read-Copy Deletion
1 /* Read-only access. */
2
3 p = search(addr);
4 /* Read-only access to the structure. */
5 /* Next yield of CPU acts as release. */
6
7 /* Access and deletion. */
8
9 spin_lock(&list_lock);
10 p = search(addr);
11 /* Access and update p. */
12 spin_unlock(&list_lock);
13 if (to_be_deleted) {
14 delete(p);
15 }
16 /* Next yield of CPU acts as release. */
Figure 10: Read-Copy search/delete Usage
speedups only if there are many more searches than
deletions. In many situations (e.g., routing-table
updates), this will be the case. In other situations,
the deletion function might use a more complex but
more highly parallel design.
Figure 10 shows how the read-copy search() and
delete() functions might be used. Line 3 shows
how a read-only operation might be carried out.
Note that there is absolutely no cacheline bounc-
ing if all operations are read-only. Lines 9-15 show
how an update operation, possibly including a dele-
tion, might be carried out. The list lock serializes
concurrent modifications.
2.3 Discussion
The reference-count and read-copy search() and
delete() functions each have their strengths. The
read-copy functions avoid all cacheline bouncing for
reading tasks, but can return references to deleted
elements, and cannot hold a reference to elements
across a voluntary context switch. There are hybrid