Track Join: An Efficient Join Algorithm for Reducing Network Traffic in Distributed Databases


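"Track Join - Distributed Joins with Minimal Network Traffic (sigmod14II) - Computer Science"

This paper proposes Track Join, a new distributed join algorithm that minimizes network traffic, addressing the problem that network communication is slow in large-scale data analytics. In current distributed parallel database systems, speeding up databases on modern hardware has been studied extensively, while communication reduction has received comparatively little attention. Existing parallel database management systems (DBMSs) mainly rely on algorithms optimized for disk, with only minor modifications for the network. These algorithms do not avoid redundant transfers of tuples across the network; a more sophisticated algorithm may place more load on the CPU but can avoid them.

The core of Track Join is to generate an optimal transfer schedule for each distinct join key, thereby minimizing network communication. It offers a new trade-off between CPU and network: some CPU resources are spent in exchange for a significant reduction in network traffic. Evaluation on real and synthetic data shows that Track Join adapts to diverse cases and degrees of data locality. In terms of both network traffic and execution time, Track Join performs well, especially on large distributed joins, where it reduces unnecessary data transfers and improves overall efficiency.

An implementation of Track Join may involve the following key technical points:

1. **Join key analysis**: The algorithm first identifies and analyzes the join keys of the tables taking part in the join, to determine where matching data is located.
2. **Network traffic optimization**: By generating an optimal transfer schedule, each node transfers only the data that is needed, avoiding repeated transfers.
3. **Intelligent scheduling**: Track Join may include a scheduler that adjusts the transfer strategy according to data distribution, network topology, and system resources.
4. **Balancing local computation and remote communication**: Work is done locally wherever possible, without hurting overall performance, to reduce remote communication.
5. **Adaptivity**: Track Join adapts to different workloads and degrees of data locality, so it remains efficient under different data distributions.

In summary, Track Join is an important advance in distributed databases. Through its algorithm design it addresses the network communication bottleneck in distributed environments and improves the efficiency of large-scale data analysis. The result has practical value for systems that process large volumes of data and depend on efficient joins, such as cloud computing platforms and large data warehouses.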

Algorithm 2-phase track join (R to S): process T_i

    T_{R|S} ← {}
    while any process R_j or any process S_j sends do
        for all <key_{R|S} k> from n_{R|S} do
            T_{R|S} ← T_{R|S} + <k, n_{R|S}>
        end for
    end while
    barrier
    for all distinct key_{R|S} k in T_{R|S} do
        for all <k, process_R n_R> in T_{R|S} do
            for all <k, process_S n_S> in T_{R|S} do
                send <k, n_S> to n_R
            end for
        end for
    end for

In the second phase, we only transfer tuples from one table. Assuming we transfer R tuples, node T (process T) sends messages to each location with matching R tuples, including the key and the set of S tuples' locations. Finally, R tuples are selectively broadcast to the tracked S locations, instead of all nodes in the network, and are joined locally.
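The two phases described above can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the paper's implementation: nodes are modeled as list indices, the tracker for a key is assumed to be `hash(k) % n`, and the function name `two_phase_track_join` and the `sent` counter are hypothetical names introduced here.

```python
from collections import defaultdict


def two_phase_track_join(r_parts, s_parts):
    """Simulate 2-phase track join (R -> S) over in-memory partitions.

    r_parts[i] and s_parts[i] are lists of (key, payload) held by node i.
    Returns (sorted joined rows, number of R tuples sent to another node).
    """
    n = len(r_parts)

    # Phase 1 (tracking): every node reports its distinct keys to the
    # tracker node selected by hashing the key (an assumption of this sketch).
    r_locs = [defaultdict(set) for _ in range(n)]
    s_locs = [defaultdict(set) for _ in range(n)]
    for node in range(n):
        for k in {k for k, _ in r_parts[node]}:
            r_locs[hash(k) % n][k].add(node)
        for k in {k for k, _ in s_parts[node]}:
            s_locs[hash(k) % n][k].add(node)

    # Phase 2 (selective broadcast): each tracker tells every R location
    # where matching S tuples live; R tuples go only to those nodes.
    received = [[] for _ in range(n)]
    sent = 0
    for tracker in range(n):
        for k, r_nodes in r_locs[tracker].items():
            for src in r_nodes:
                for dst in s_locs[tracker].get(k, ()):
                    for key, payload in r_parts[src]:
                        if key == k:
                            received[dst].append((key, payload))
                            if dst != src:  # local copies cost no traffic
                                sent += 1

    # Local join at each S location.
    out = []
    for node in range(n):
        for k, p_r in received[node]:
            for k_s, p_s in s_parts[node]:
                if k_s == k:
                    out.append((k, p_r, p_s))
    return sorted(out), sent


r_parts = [[(1, "a"), (2, "b")], [(1, "c")], []]
s_parts = [[(1, "x")], [], [(1, "y"), (3, "z")]]
rows, sent = two_phase_track_join(r_parts, s_parts)
```

In this toy input, key 2 has no match in S and key 3 has no match in R, so neither causes any payload transfer; only the R tuples with key 1 move, and only to the nodes that hold matching S tuples.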

Choosing whether to send R tuples to S tuple locations, or S tuples to R locations, has to be decided by the query optimizer before the query starts executing, similar to the traditional inner–outer relation distinction of hash join.

2-phase track join transfers payloads from one table only. If the input tables have mostly unique keys and the selectivity is high, the cost consists of tracking plus min(|R|, |S|).

Algorithm 3-phase track join: process T_i

    barrier
    T_{R|S} ← {}
    while any process R_j or any process S_j sends do
        for all <key_{R|S} k, count c> from n_{R|S} do
            T_{R|S} ← T_{R|S} + <k, n_{R|S}, c>
        end for
    end while
    barrier
    for all distinct key_{R|S} k in T_{R|S} do
        R, S ← {}, {}
        for all <k, process_R n_R, count c> in T_{R|S} do
            R ← R + <n_R, c · width_R>
        end for
        for all <k, process_S n_S, count c> in T_{R|S} do
            S ← S + <n_S, c · width_S>
        end for
        RS_cost ← broadcast R to S
        SR_cost ← broadcast S to R
        if RS_cost < SR_cost then
            for all <k, process_R n_R> in T_{R|S} do
                for all <k, process_S n_S> in T_{R|S} do
                    send <k, n_S> to n_R
                end for
            end for
        else
            for all <k, process_S n_S> in T_{R|S} do
                for all <k, process_R n_R> in T_{R|S} do
                    send <k, n_R> to n_S
                end for
            end for
        end if
    end for

Algorithm 3-phase track join: process S_i

    ... symmetric with process R_i of 3-phase track join ...

Algorithm 3-phase track join: process R_i

    T_R ← {}
    for all <key_R k, payload_R p_R> in table R do
        T_R ← T_R + <k, p_R>
    end for
    barrier
    for all distinct key_R k in T_R do
        c ← |k in T_R|
        send <k, c> to process T_(hash(k) mod N)
    end for
    barrier
    while any process T_j or any process S_j sends do
        if source is process T_j then
            for all <key_R k, process_S n_S> from n_T do
                for all <k, payload_R p_R> in T_R do
                    send <k, p_R> to n_S
                end for
            end for
        else if source is process S_j then
            for all <key_S k, payload_S p_S> from n_S do
                for all <k, payload_R p_R> in T_R do
                    commit <k, p_R, p_S>
                end for
            end for
        end if
    end while

2.2 3-Phase Track Join

In the 3-phase (or double broadcast) track join, we can decide whether to broadcast R tuples to the locations of S tuples, or vice versa. The decision is taken per distinct join key. In order to decide which selective broadcast direction is cheaper, we need to know how many tuples will be transferred in each case. Thus, instead of only tracking nodes with at least one matching tuple, we also track the number of matches. To generalize for variable lengths, we transfer the sum of matching tuple widths, rather than a count.
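The extended tracking step described above can be sketched as follows. This is a minimal illustration, not the paper's code: the function name `tracking_messages` is hypothetical, payloads are assumed to be `bytes`, the width of a tuple is assumed to be its payload length, and the tracker for a key is assumed to be `hash(key) % num_nodes`.

```python
from collections import defaultdict


def tracking_messages(node_id, tuples, num_nodes):
    """Build the tracking messages one node sends for its table fragment.

    tuples is a list of (key, payload) pairs with bytes payloads.
    Returns {tracker node: [(key, node_id, total width in bytes), ...]}.
    """
    # Sum the widths of all local tuples that share a key.
    widths = defaultdict(int)
    for key, payload in tuples:
        widths[key] += len(payload)

    # Route one <key, node, width> message per distinct key to its tracker.
    messages = defaultdict(list)
    for key, total in widths.items():
        messages[hash(key) % num_nodes].append((key, node_id, total))
    return dict(messages)


msgs = tracking_messages(2, [(7, b"abc"), (7, b"de"), (4, b"x")], 4)
```

Sending one aggregated width per distinct key keeps the tracking message count equal to the number of distinct local keys, regardless of how many duplicates each key has.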

Bi-directional selective broadcast can distinguish cases in which moving S tuples would transfer fewer bytes than moving R tuples. The decision whether to selectively broadcast R → S or vice versa is taken for each distinct key independently and we do not rely on the optimizer to pick the least expensive direction overall for the entire join.

The cost estimation for one selective broadcast direction is shown below. In practice, we compute both directions and pick the cheaper one. The complexity is O(n), where n is the number of nodes with at least one matching tuple for the given join key. The total number of steps required is less than the number of tuples. Thus, the theoretical complexity is linear. The messages that carry location information, logically seen as key and node pairs, have size equal to M.

Algorithm track join: broadcast R to S

    R_all ← Σ_i |R_i|
    R_local ← Σ_i |R_i|, where |S_i| > 0
    R_nodes ← |i, where |R_i| > 0 ∧ i ≠ self|
    S_nodes ← |i, where |S_i| > 0|
    RS_cost ← R_all · S_nodes − R_local + R_nodes · S_nodes · M
    return RS_cost
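The cost estimate and the per-key direction choice can be sketched in a few lines. This is an illustrative reading of the pseudocode, under assumptions: per-key sizes are given as dictionaries mapping node id to total bytes, `self_node` stands for the tracker evaluating the cost, and the names `broadcast_cost` and `choose_direction` are hypothetical.

```python
def broadcast_cost(src, dst, self_node, m):
    """Estimated bytes to selectively broadcast src tuples to dst locations.

    src and dst map node id -> total bytes of matching tuples for one key.
    m is the size of one location message (key and node pair).
    """
    src_all = sum(src.values())
    # Bytes of src tuples already on a node that also holds dst tuples;
    # that local copy does not cross the network.
    src_local = sum(w for node, w in src.items() if dst.get(node, 0) > 0)
    # Source nodes other than the tracker need location messages.
    src_nodes = sum(1 for node, w in src.items() if w > 0 and node != self_node)
    dst_nodes = sum(1 for w in dst.values() if w > 0)
    return src_all * dst_nodes - src_local + src_nodes * dst_nodes * m


def choose_direction(r, s, self_node, m):
    """Pick the cheaper selective broadcast direction for one join key."""
    rs_cost = broadcast_cost(r, s, self_node, m)
    sr_cost = broadcast_cost(s, r, self_node, m)
    return ("R->S", rs_cost) if rs_cost < sr_cost else ("S->R", sr_cost)


r_sizes = {0: 100, 1: 50}   # bytes of matching R tuples per node
s_sizes = {1: 10, 2: 10}    # bytes of matching S tuples per node
direction, cost = choose_direction(r_sizes, s_sizes, self_node=0, m=8)
```

In this example the S tuples are much smaller than the R tuples, so moving S to the R locations is cheaper even though it needs more location messages.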
