Program 1 The functionality supported by the memory management scheme.
1 node_t *NewNode(int size);
2 void DeleteNode(node_t *node);
3 node_t *DeRefLink(node_t **link);
4 void ReleaseRef(node_t *node);
5 bool CASRef(node_t **link, node_t *old, node_t *_new);
6 void StoreRef(node_t **link, node_t *node);
Program 2 Callback procedures for the memory management.
1 void TerminateNode(block_t *node) {
2   StoreRef(&node->next, NULL);
3 }
4 void CleanUpNode(block_t *node) {
5   block_t *next = DeRefLink(&node->next);
6   block_t *next2 = DeRefLink(&globalTailBlock);
7   CASRef(&node->next, next, next2);
8 }
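The intended call pattern of these operations can be sketched with a small single-threaded mock; the reference-counting bodies below are our own illustrative assumptions (the real scheme is lock-free and considerably more involved), meant only to show the contract that DeRefLink returns a node with a guaranteed reference which must eventually be handed back via ReleaseRef:

```c
#include <stddef.h>
#include <stdbool.h>

/* Hypothetical single-threaded mock of the interface in Program 1.
   The refs field and the counting below are assumptions for
   illustration; the real memory management scheme is lock-free. */
typedef struct node {
    struct node *next;
    int refs;             /* references currently held by callers */
} node_t;

node_t *DeRefLink(node_t **link) {
    node_t *node = *link;
    if (node != NULL)
        node->refs++;     /* caller now holds a guaranteed reference */
    return node;
}

void ReleaseRef(node_t *node) {
    if (node != NULL)
        node->refs--;     /* caller hands its reference back */
}

void StoreRef(node_t **link, node_t *node) {
    *link = node;         /* real version also adjusts counts, atomically */
}

bool CASRef(node_t **link, node_t *old, node_t *_new) {
    if (*link != old)
        return false;     /* real version uses an atomic compare-and-swap */
    *link = _new;
    return true;
}

/* Walk a list hand-over-hand: dereference the successor before
   releasing the current node; returns the number of nodes visited. */
int traverse(node_t **head) {
    int count = 0;
    node_t *node = DeRefLink(head);
    while (node != NULL) {
        node_t *next = DeRefLink(&node->next);
        ReleaseRef(node);
        node = next;
        count++;
    }
    return count;
}
```

The hand-over-hand pattern in traverse is the reason DeRefLink and ReleaseRef come in pairs: a node may only be reclaimed once no thread holds a guaranteed reference to it.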
for increasing scalability besides allowing disjoint Enqueue and Dequeue operations to
execute in parallel.
3 The New Algorithm
The underlying data structure of our algorithmic design is a linked list of arrays, depicted in Figure 1. In this data structure, every array element contains a pointer to some arbitrary value. Both the Enqueue and Dequeue operations use increasing array indices as each array element gets occupied or removed, respectively. To ensure consistency, items are inserted into or removed from each array element using the CAS atomic synchronization primitive. To ensure that an Enqueue operation cannot succeed with a CAS at a lower array index than where the concurrent Dequeue operations are operating, the CAS primitive must be able to distinguish (i.e., avoid the ABA problem) between "used" and "unused" array indices. For this purpose two null pointer values [11] are used; one (NULL) for the empty indices and another (NULL2) for the removed indices. As each array gets fully occupied (or emptied), new array blocks are added to (or removed from) the linked list data structure. Two shared pointers, globalHeadBlock and globalTailBlock, globally indicate the first and last active blocks, respectively. These shared pointers are also concurrently updated using CAS operations as the linked list data structure changes. However, as these updates are done lazily (not atomically together with the addition of a new array block), the actual first or last active block might have to be found by following the next pointers of the linked list.
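The two-null-value technique can be sketched as follows; this is a minimal illustration of the idea, assuming a C11 atomic slot and encoding NULL2 as the address of a private dummy object (the paper does not specify this encoding):

```c
#include <stdbool.h>
#include <stdatomic.h>
#include <stddef.h>

/* Sketch of the two-null-value idea: an empty slot holds NULL and a
   removed slot holds NULL2, so an Enqueue's CAS can never succeed on
   an index that a Dequeue has already passed. The encoding of NULL2
   as the address of a dummy object is an assumption. */
static int null2_dummy;
#define NULL2 ((void *)&null2_dummy)

/* Try to insert: succeeds only while the slot is still empty (NULL). */
bool try_enqueue_slot(_Atomic(void *) *slot, void *item) {
    void *expected = NULL;
    return atomic_compare_exchange_strong(slot, &expected, item);
}

/* Try to remove: succeeds only on a real item, marking the slot NULL2
   so that no later Enqueue can reuse this index. */
void *try_dequeue_slot(_Atomic(void *) *slot) {
    void *item = atomic_load(slot);
    if (item == NULL || item == NULL2)
        return NULL;                   /* empty or already removed */
    if (atomic_compare_exchange_strong(slot, &item, NULL2))
        return item;
    return NULL;                       /* lost the race to another thread */
}
```

Because a removed index holds NULL2 rather than reverting to NULL, a slow Enqueue that still expects NULL is guaranteed to fail its CAS, which is exactly the ABA distinction described above.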
As a successful update of a shared pointer causes a cache miss for the other threads that concurrently access that pointer, the overall strategy for improving the performance and scalability of the new algorithm is to avoid accessing pointers that can be concurrently updated [5]. Moreover, our algorithm achieves fewer updates by not having shared variables with explicit information about which array index is currently the next active one for Enqueue or Dequeue. Instead each thread is storing its