No. X    YAN Xiai et al.: Cloud Storage Fault-tolerant Research Based on Adaptive Switching Between Replication and Erasure Codes
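Cloud Storage Fault-tolerant Research Based on Adaptive Switching Between Replication and Erasure Codes
YAN Xiai①②    ZHANG Dafang*①    ZHANG Boyun②
①(College of Information Science and Engineering, Hunan University, Changsha 410082, China)
②(Department of Information Technology, Hunan Police Academy, Changsha 410138, China)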
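Abstract: Cloud storage, an emerging network application model, has attracted wide attention from users, but because of its complexity and openness, data failure has become the norm, so data fault tolerance is increasingly important. This paper compares and analyzes two common fault-tolerance methods, full replication and erasure codes, proposes a cloud storage fault-tolerance algorithm based on adaptive switching between replication and erasure codes according to the access patterns of files in the cloud, and builds a corresponding cloud storage fault-tolerance framework. The algorithm adaptively selects a fault-tolerance strategy according to each file's access frequency and storage size, which reduces the storage overhead of rarely accessed files and improves the access time of frequently accessed files. Experimental results show that the algorithm saves 40% of storage space compared with full replication and improves access time by 48% compared with erasure codes, and that it outperforms comparable hybrid data redundancy algorithms in both respects.
Key words: Cloud storage; Fault tolerance; Replication; Erasure codes; Adaptive switching
CLC number: TP391    Document code: A    Article ID: 1009-5896(2017)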
DOI: 10.11999/JEIT160182
Cloud Storage Fault-tolerant Research Based on Adaptive Switching
Scheme Between Replication and Erasure Codes
YAN Xiai①②    ZHANG Dafang①    ZHANG Boyun②
①(College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China)
②(Department of Information Technology, Hunan Police Academy, Changsha 410138, China)
Abstract: As a new network application mode, cloud storage attracts much attention, but due to its complexity and openness, data failure has become the norm, so fault tolerance is becoming more and more important. In this paper, a comparative analysis of replication and erasure codes is made, an adaptive switching scheme between replication and erasure codes for cloud storage fault-tolerant systems is proposed, and a corresponding fault tolerance framework is constructed. The algorithm selects the fault-tolerant scheme adaptively according to the frequency of data access and the remaining memory, which reduces the storage overhead of rarely accessed files and improves the access time of frequently accessed files. Experimental results show that the proposed algorithm outperforms the replication-based scheme in storage cost by up to 40% and the erasure codes scheme in access time cost by up to 48%, and that it is better than similar hybrid data redundancy algorithms in both areas.
Key words: Cloud storage; Fault tolerance; Replication; Erasure codes; Adaptive switching
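The adaptive selection summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the names `FileStats`, `choose_scheme`, `hot_threshold` and `small_file_bytes`, the threshold values, and the rule that small files stay replicated are all assumptions made for illustration. The two overhead helpers only state the standard storage cost of n-way replication and of a code with k data and m parity blocks.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    accesses_per_day: float  # observed access frequency (hypothetical metric)
    size_bytes: int          # stored size of the file


def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data under full replication."""
    return float(copies)


def erasure_overhead(k: int, m: int) -> float:
    """Raw bytes stored per user byte with k data blocks and m parity blocks."""
    return (k + m) / k


def choose_scheme(stats: FileStats,
                  hot_threshold: float = 100.0,
                  small_file_bytes: int = 1 << 20) -> str:
    """Pick a redundancy scheme for one file.

    Assumed rule (not taken from the paper): frequently accessed files and
    small files stay replicated for fast reads; large, rarely accessed files
    are erasure-coded to save space.
    """
    if stats.accesses_per_day >= hot_threshold:
        return "replication"
    if stats.size_bytes < small_file_bytes:
        return "replication"
    return "erasure_code"


if __name__ == "__main__":
    cold_large = FileStats(accesses_per_day=1, size_bytes=64 << 20)
    hot_large = FileStats(accesses_per_day=500, size_bytes=64 << 20)
    print(choose_scheme(cold_large), erasure_overhead(6, 3))
    print(choose_scheme(hot_large), replication_overhead(3))
```

Under these assumptions, three-way replication stores 3 bytes per byte of user data, while a code with 6 data and 3 parity blocks stores 1.5, which is the kind of gap that motivates moving rarely accessed files to erasure codes.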
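Received: 2016-03-01; Revised: 2016-09-23; Published online:
*Corresponding author: ZHANG Dafang dfzhang@hnu.edu.cn
Foundation Items: The National Natural Science Foundation of China (61472130, 61471169), The National 973 Program (2012CB315805), The Public Security Theory and Soft Science Research Projects of Ministry of Public Security (2013LLYJHNST040), The Hunan Province Science and Technology Department Research Project (2014FJ3049), The Open Research Fund of Key Laboratory of Network Crime Investigation of Hunan Province (2016WLZC006)
Online publication time: 2016-11-08 10:58:02
Online publication URL: http://www.cnki.net/kcms/detail/11.4494.TN.20161108.1058.002.html
1 Introduction
Cloud storage offers users low operation and maintenance costs, performance that scales on demand, and more efficient storage capacity, and it has been accepted by more and more users [1]. However, because the cloud storage environment is complex and open, data failure has also drawn the attention of users. For example, in 2011 a disk in an Alibaba Cloud server failed, and a restart performed during maintenance caused the data from that period to be lost; in 2012 Gmail suffered a large-scale data loss in which the data of about 150000 Gmail users became unavailable [2]. The primary task of a cloud storage system is to guarantee high availability and high reliability of data [3,4], so a high-performance, low-overhead fault-tolerance mechanism must be built.
Two data redundancy methods are commonly used for fault tolerance: replication and erasure codes. As data volumes grow, cloud storage fault tolerance is gradually shifting from replication to erasure codes [5]. Erasure codes effectively reduce the redundant space, but decoding is complex and introduces more latency. It follows that a single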