mo
to
SELECT ?Y WHERE {
?X memberOf X-Lab .
?X type Professor .
?X teacherOf ?Y .
}
X-Lab
Prof
?X
ty
?Y
OS
DS
SPARQL Graph Results
Fig. 2: A SPARQL query (Q
1
) on sample RDF graph.
For example, as shown in Figure
2, the query Q
1
re-
trieves all objects that were taught (to) by a Professor
who is a member (mo) of X-Lab. The query can also be
graphically represented by a query graph, in which ver-
tices represent the subjects and objects of the triple pat-
terns; the black vertices represent constants, a nd the red
vertices repres ent variables; The edges represent pred-
icates in the required patterns (GP). The query results
(?Y, described in RD) include DS and OS.
Difference from graph analytics. Readers might be
curious about the relationship between RDF queries and
graph analytics [
28, 18, 31, 19, 13, 41, 55, 56], especially
a recent design [
50] used one-sided RDMA to implement
message-passing primitives. However, there are several
fundamental differences between RDF queries and graph
analytics.
First, RDF queries are user-centric; thus minimizing
the roundtrip latency is more important than maximizing
network throughput. Second, RDF queries only touch a
small subset of a graph instead of processing the entire
graph, making it not worthwhile to dedicate all resources
to run a single query. Third, graph-analytics is usually
done in a batch-oriented manner in contrast to concur-
rently serving multiple RDF queries.
2.2 Existing Solutions
We then discuss two representative approaches adopted
in existing state-of-the-art RDF systems.
Triple store and triple join: A majority of existing
systems store and index RDF data as a set of triples in
relational databas e s, and excessively leverage triple join
operations to process SPARQL queries. Generally, query
processing consists of two phases: Scan and Join. In
the Scan phase, the R D F engine decompose s a SPARQL
query into a set of triple patterns. For the query in Fig-
ure
2, the triple patterns are {?X memberOf X-Lab}, {?X
type Professor} and {?X teacherOf ?Y}. For each triple
pattern, it generates a temporary query table with bind-
ings by scanning the triple store. In the Join phase, the
query tables are joined to produce the final query results.
Some prior work [
54] has summarized the inherent
limitations of triple-store based approach. First, triple
stores rely excessively on costly join operations, espe-
cially for distributed merge/hash-join. Second, the scan-
join approach may generate large redundant intermediate
results. Finally, while using redundant six primary SPO
4
4
S, P and O stand for subject, predicate and object accordingly.
SELECT ?X ?Y ?Z WHERE {
?X teacherOf ?Y .
?Z takesCourse ?Y .
?Z advisor ?X .
}
OS
tc
to
ad
Logan
Marie
ad
tc
?X
?Y
to
?Z
SPARQL Graph Results
Fig. 3: A SPARQL query (Q
2
) on sample RDF graph.
permutation indexes [
49] can accelerate scan operations,
such indexes lead to heavy memory pressure.
Graph store and graph exploration: Instead of join-
ing query tables, Trinity.RDF [
49] stores RDF data in a
native graph model on top of a distributed in-memory
key/value store, and leverages fast graph-exploration
strategy for query processing. It further adopts one-step
pruning (i.e., the constraint in the immediately prior step)
to reduce the intermediate results. As an example, con-
sidering Q
1
in Figure
2 over the data in Figure 1, after
exploring the type of Professor for each member of X-
Lab with respect to the data in Figure 1, we find that the
possible binding for ?X is only Erik and Logan, and the
rest of members are pruned.
However, the graph exploration in Trinity.RDF relies
on a final centralized join to filter out non-matching re-
sults. For example, the query Q
2
in Figure
3 asks for ad-
visors (?X), courses (?Y) and students (?Z) such that the
advisor advises (ad) the student who also takes a course
(tc) taught by (to) the advisor. After exploring all three
triple patterns in Q
2
with respect to the data in Figure
1,
the non-matching bindings, namely, Logan
−→
to OS, OS
←−
tc
Raven and Raven
−→
ad Erik will not be pruned until a final
join. Prior work [
21, 37] indicates that the final join is
a potential bottleneck, especially for queries with cycles
and/or large intermediate results.
2.3 RDMA and Its Characteristics
Remote Direct Memory Access (RDMA) is a cross-node
memory access technique with low-latency and low CPU
overhead, due to complete bypassing of target OS ker-
nel and/or CPU. RDMA provides both two-sided mes-
sage passing interfaces like SEND/RECV Verbs as well
as one-sided operations such as READ, WRITE and
two atomic operations (fetch-and-add and compare-and-
swap). As noted in prior work [
30, 16, 48], one-sided
operations are usually less disruptive than its two-sided
counterparts due to no CPU involvement to the target
machine. To minimize interference among multiple ma-
chines during query processing, we focus on one-sided
RDMA operations in this paper. However, it should be
straightforward to use two-sided RDMA operations in
Wukong as well.
Figure
4(a) shows the throughput (in Kbps ) of dif-
ferent communication primitives. RDMA undoubtedly
achieves the highest throughput for all payload sizes,
while the throughput of TCP/IP over IPoIB (IP over In-
finiBand) or 10GbE approaches that of one-sided RDMA
USENIX Association 12th USENIX Symposium on Operating Systems Design and Implementation 319