Improving Delta Sync Efficiency for Web-based Cloud Storage Services

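"This research paper explores practical web-based delta synchronization for cloud storage services. The authors, from Tsinghua University, Yale University, and the University of California, San Diego, focus on overcoming the inefficiency of delta sync in web browsers."

Delta sync is a key technique for improving the network efficiency of cloud storage services such as Dropbox. Today, however, practical delta sync is available only in PC clients and mobile apps, not in web browsers, even though the browser is the most pervasive and OS-independent access method. To understand the obstacles, the researchers implemented a traditional delta sync solution, named WebRsync, in JavaScript and ran it inside web browsers. They found that inefficient JavaScript execution in the browser caused frequent stagnation and even crashes. In response, they reversed the traditional delta sync approach and moved all chunk search and comparison operations to the server side. This relieves the browser but substantially increases the computation load on the servers, so they further apply locality matching and a more efficient checksum to reduce that server-side overhead.

The paper may discuss the following points in detail:

1. Performance bottleneck analysis of WebRsync: why JavaScript execution is inefficient, including the constraints of the browser environment and the performance of the JavaScript runtime.
2. Server-side delta sync optimization: how chunk search and comparison are moved to the server, and how server resources are managed to avoid overload.
3. Locality matching and a more efficient checksum: how these two techniques reduce the server-side computation overhead introduced by the reversed design.
4. Experiments and evaluation: validation of the new approach through performance comparison, latency measurement, and stability analysis.
5. Security and privacy: how user data remains protected when more computation is moved to the server side.
6. Outlook and future work: possible further improvements, such as more advanced computing techniques or distributed computing frameworks, to raise the efficiency and reliability of web-based delta sync.

Overall, the paper aims to make sync efficient for cloud storage services in the web environment. By redesigning the delta sync workflow and optimizing server-side computation, it gives web browser users a smoother and more reliable cloud storage experience.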

Practical Web-based Delta Synchronization for Cloud Storage Services

He Xiao

Tsinghua University

Zhenhua Li

Tsinghua University

Ennan Zhai

Yale University

Tianyin Xu

UCSD

Abstract

Delta synchronization (sync) is known to be crucial for network-level efficiency of cloud storage services (e.g., Dropbox). Practical delta sync techniques are, however, only available for PC clients and mobile apps, but not web browsers, the most pervasive and OS-independent access method. To understand obstacles of web-based delta sync, we implemented a traditional delta sync solution (named WebRsync) for web browsers using JavaScript, and found that WebRsync severely suffers from the inefficiency of JavaScript execution inside web browsers, thus leading to frequent stagnation and even crashing. Given that the computation burden on the web browser mainly stems from data chunk search and comparison, we reverse the traditional delta sync approach by lifting all chunk search and comparison operations from the client side into the server side. Inevitably, this brings enormous computation overhead to the servers. Hence, we further leverage locality matching and a more efficient checksum to reduce the overhead. The resulting solution (called WebR2sync+) outpaces WebRsync by an order of magnitude, and it is able to simultaneously support ∼7300 web clients’ delta sync using an ordinary VM server based on a Dropbox-like system architecture.

1 Introduction

Recent years have witnessed enormous popularity of cloud storage services, such as Dropbox, Google Drive, iCloud Drive, and Microsoft OneDrive. They have not only provided a convenient and pervasive data store for billions of Internet users [5], but also become a critical component of numerous online applications (e.g., Dropbox’s support for DocuSign, Google Drive’s support for Gmail, and OneDrive’s support for Office 365).

The popularity of cloud storage services inevitably brings tremendous network traffic overhead to both the client and cloud sides [15]. Therefore, a lot of efforts have been made to improve the network-level efficiency of cloud storage services, including batched synchronization (sync), deferred sync, delta sync, compression and deduplication [12, 14, 17, 18]. Among these efforts, delta sync is known to be of particular importance for its fine granularity (i.e., the client only sends the changed content of a file to the cloud), thus achieving significant traffic savings in the presence of users’ file edits [19].

Unfortunately, delta sync is currently only practical for PC clients and mobile apps, but not web browsers, the most pervasive and OS-independent access method [17]. For example, after a file f is edited into a new version f′ by the user, Dropbox’s PC client or mobile app only uploads the altered bits to the cloud; in contrast, the web browser has to upload the whole content of f′ to the cloud. This gap severely affects web-based user experiences in terms of both sync performance and traffic cost.

To understand the potential obstacles of web-based delta sync, we implement a delta sync solution (referred to as WebRsync) for web browsers using JavaScript based on rsync [7], the de facto delta sync protocol for PC clients. Also, we develop an automated tool (called StagMeter) to accurately measure the stagnation of web browsers. Our experimental results show that WebRsync severely suffers from the inefficiency of JavaScript running inside web browsers. Under typical file editing workloads, WebRsync is slower than PC client-based delta sync by up to 25 times, thus causing web browsers to frequently freeze and even crash.

Specifically, when a user edits a file from f to f′, WebRsync first requests the server side to execute (data) chunk segmentation and fingerprinting operations on f, and then requests the client side to perform chunk search and comparison operations on f′. During the process, the computation overhead on the client side is around 7 times that on the server side. In more detail, the client-side computation burden mainly stems from chunk search (∼65%) and comparison (∼22%).
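
The division of work in WebRsync can be illustrated with a minimal Python sketch of rsync-style delta sync. It is not the authors' JavaScript implementation: the function names (`fingerprint`, `make_delta`, `apply_delta`), the 8-byte `BLOCK_SIZE`, and the choice of Adler-32 and SHA-256 as the weak and strong checksums are all assumptions made for readability. For simplicity, the weak checksum is recomputed at every offset instead of being rolled incrementally.

```python
import hashlib
import zlib

BLOCK_SIZE = 8  # hypothetical chunk size, kept tiny so the example is easy to trace


def fingerprint(old: bytes) -> dict:
    """Server side: segment the old file f into chunks and fingerprint each one."""
    table = {}
    for index, start in enumerate(range(0, len(old), BLOCK_SIZE)):
        chunk = old[start:start + BLOCK_SIZE]
        weak = zlib.adler32(chunk)                # cheap checksum used for search
        strong = hashlib.sha256(chunk).digest()   # stand-in strong checksum
        table.setdefault(weak, []).append((strong, index))
    return table


def make_delta(new: bytes, table: dict) -> list:
    """Client side: chunk search and comparison over the new file f'."""
    delta, literal, pos = [], bytearray(), 0
    while pos < len(new):
        window = new[pos:pos + BLOCK_SIZE]
        match = None
        # Search: look up the weak checksum. Comparison: confirm with the strong one.
        for strong, index in table.get(zlib.adler32(window), []):
            if hashlib.sha256(window).digest() == strong:
                match = index
                break
        if match is None:
            literal.append(new[pos])  # no chunk of f starts here, keep one literal byte
            pos += 1
        else:
            if literal:
                delta.append(("data", bytes(literal)))
                literal = bytearray()
            delta.append(("copy", match))  # reuse chunk number `match` of f
            pos += len(window)
    if literal:
        delta.append(("data", bytes(literal)))
    return delta


def apply_delta(old: bytes, delta: list) -> bytes:
    """Server side: rebuild f' from f plus the delta sent by the client."""
    out = bytearray()
    for kind, value in delta:
        if kind == "copy":
            out += old[value * BLOCK_SIZE:(value + 1) * BLOCK_SIZE]
        else:
            out += value
    return bytes(out)
```

With f = `b"the quick brown fox jumps over the lazy dog"` and f′ = `b"the quick red fox jumps over the lazy dog!"`, `make_delta` emits four copy instructions and only 10 literal bytes, and `apply_delta` returns f′. The byte-by-byte loop in `make_delta` is the part that runs in the browser in this arrangement, which is consistent with the text's observation that search and comparison dominate the client-side cost.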

Motivated by the above observations, our first effort is to “reverse” the WebRsync process by handing all chunk search and comparison operations over to the server side. Meanwhile, chunk segmentation and fingerprinting operations are shifted to the client side. The resulting solution is referred to as WebR2sync, denoting web-based reverse rsync (more details are described in §3.1 and Figure 4).
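
The reversed division of labour can be sketched in Python as follows. This is only an illustration under stated assumptions, not the protocol of §3.1: the message flow (checksum list up, match map down, literal chunks up), the function names, the 8-byte `BLOCK_SIZE`, and the use of Adler-32 and SHA-256 as weak and strong checksums are all hypothetical.

```python
import hashlib
import zlib

BLOCK_SIZE = 8  # hypothetical chunk size, kept tiny so the example is easy to trace


def client_fingerprint(new: bytes) -> list:
    """Client: cheap segmentation and fingerprinting of the new file f'."""
    chunks = [new[s:s + BLOCK_SIZE] for s in range(0, len(new), BLOCK_SIZE)]
    return [(zlib.adler32(c), hashlib.sha256(c).digest()) for c in chunks]


def server_match(old: bytes, checksums: list) -> dict:
    """Server: search f for every chunk of f'. Returns {chunk index in f': offset in f}."""
    wanted = {}
    for index, (weak, strong) in enumerate(checksums):
        wanted.setdefault(weak, []).append((strong, index))
    matches = {}
    for offset in range(len(old)):  # the sliding search now runs on the server
        window = old[offset:offset + BLOCK_SIZE]
        candidates = wanted.get(zlib.adler32(window))
        if not candidates:
            continue
        digest = hashlib.sha256(window).digest()
        for strong, index in candidates:
            if strong == digest and index not in matches:
                matches[index] = offset
    return matches


def client_literals(new: bytes, matches: dict) -> dict:
    """Client: upload only the chunks of f' that the server could not find in f."""
    count = (len(new) + BLOCK_SIZE - 1) // BLOCK_SIZE
    return {i: new[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
            for i in range(count) if i not in matches}


def server_rebuild(old: bytes, matches: dict, literals: dict) -> bytes:
    """Server: assemble f' from matched regions of f and the uploaded literal chunks."""
    out = bytearray()
    for i in range(len(matches) + len(literals)):
        if i in matches:
            out += old[matches[i]:matches[i] + BLOCK_SIZE]
        else:
            out += literals[i]
    return bytes(out)
```

In this sketch the client never scans byte by byte; it hashes each fixed chunk of f′ once. The per-offset loop in `server_match` is where the cost moves, which matches the text's point that reversing the roles shifts the heavy computation onto the server.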

Although WebR2sync significantly cuts the computation burden on the web client (and thus effectively avoids stagnation/crashing), it brings enormous computation overhead to the server side. To this end, we make two-fold additional efforts to optimize the server-side computation overhead. First, we exploit the locality of
