2) Secondary Deduplication
Secondary deduplication systems adopt standard
deduplication methods. A secondary storage system, which
facilitates data backup and restore operations, requires high
I/O throughput.
Both Sparse Indexing [5] and SiLo [14] efficiently alleviate
the disk bottleneck by exploiting similarity among data
segments: deduplication is performed only when two data
segments are similar to each other. Nam et al. introduced a
Chunk Fragmentation Level (CFL) monitor [31]. When the CFL
monitor indicates that chunk fragmentation is worsening, it
selectively rewrites some chunks to reduce fragmentation,
thereby achieving high restore performance.
Dedupv1 [18] and ChunkStash [6] store the chunk-index
metadata in flash memory, exploiting the high random-read
performance of SSDs to accelerate index lookups. SAR [7]
stores unique data chunks with high reference counts in SSDs,
again exploiting the high random-read performance of SSDs to
greatly improve restore performance.
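The chunk-index lookup that these systems accelerate can be illustrated with a minimal sketch. The fixed-size chunking, SHA-1 fingerprints, and in-memory dictionary index below are simplifying assumptions for illustration, not the actual Dedupv1 or ChunkStash design (which keeps the index on flash):

```python
import hashlib

CHUNK_SIZE = 4096  # fixed-size chunking; real systems often use content-defined chunking


def dedup_write(data: bytes, index: dict, store: list) -> list:
    """Split data into chunks and store only chunks whose fingerprint is new.

    Returns a 'recipe': the list of store positions needed to rebuild the data.
    """
    recipe = []
    for off in range(0, len(data), CHUNK_SIZE):
        chunk = data[off:off + CHUNK_SIZE]
        fp = hashlib.sha1(chunk).hexdigest()  # chunk fingerprint
        if fp not in index:            # index lookup: the random-read hot spot
            index[fp] = len(store)     # new unique chunk
            store.append(chunk)
        recipe.append(index[fp])
    return recipe


index, store = {}, []
r1 = dedup_write(b"A" * 8192, index, store)                 # two identical chunks
r2 = dedup_write(b"A" * 4096 + b"B" * 4096, index, store)   # one old, one new chunk
```

Every write triggers one index lookup per chunk, which is why placing the index on a medium with fast random reads (an SSD) directly accelerates the deduplication path.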
3) Primary Deduplication
Primary deduplication systems store data such as emails,
multimedia, and databases, which users frequently access in a
random fashion. In contrast to data restore in secondary
systems, primary storage systems simply read the files
demanded by users each time rather than restoring the entire
dataset. As such, read latency can be significantly
reduced.
iDedup [4] exploits spatial locality by deduplicating only data
sequences that are stored contiguously on disk, which reduces
disk fragmentation and thereby improves read speed. Moreover,
iDedup takes advantage of temporal locality by caching
metadata in a small memory; it therefore dramatically reduces
disk I/Os and improves write speed.
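The contiguity requirement can be sketched as follows. This is an illustration in the spirit of iDedup's design, not its actual implementation; the threshold value, the fingerprint-to-address index, and the run-detection loop are all assumptions made for the example:

```python
def contiguous_dup_runs(fingerprints, index, threshold=4):
    """Return runs of incoming blocks eligible for deduplication.

    A run qualifies only if its existing duplicates occupy consecutive
    on-disk addresses AND the run is at least `threshold` blocks long,
    which limits fragmentation on later sequential reads.
    """
    runs, run = [], []
    prev_lba = None
    for i, fp in enumerate(fingerprints):
        lba = index.get(fp)  # on-disk address of the existing copy, if any
        if lba is not None and (prev_lba is None or lba == prev_lba + 1):
            run.append(i)    # extends a disk-contiguous duplicate sequence
        else:
            if len(run) >= threshold:
                runs.append(run)         # long enough: deduplicate this run
            run = [i] if lba is not None else []
        prev_lba = lba
    if len(run) >= threshold:
        runs.append(run)
    return runs
```

Short or scattered duplicate runs are deliberately written as new data: the capacity loss is traded for sequential read performance.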
More importantly, data deduplication has been integrated into
practical storage products (e.g., NetApp ASIS [35] and EMC
Celerra [36]) and file systems (e.g., ZFS [37], SDFS [34],
LessFS [32], and LBFS [11]).
B. SSD Endurance
SSDs are favored by IT companies and end users thanks to
their low access latency, high energy efficiency, and high
storage density. However, the most serious challenge for SSDs
lies in their limited write endurance (i.e., an SSD can only
sustain a limited volume of written bytes) [27]. This problem
has two causes:
1) Storage units inside flash chips must be erased before
any re-write operation. To pursue high storage density and low
cost, mainstream SSDs have adopted multi-level cell (MLC)
flash instead of single-level cell (SLC) flash. An SLC flash
memory array typically supports approximately 100,000
erase cycles for each basic unit. For MLC flash, which can
store more than one bit in each unit, this value drops as low as
5,000 ~ 10,000 cycles or even lower [16, 28].
2) Write amplification. Each erase unit inside a flash chip
generally contains hundreds of pages, the basic read/write unit
of flash [28]. When an erase unit containing a mix of valid
pages and invalid pages (old versions of updated data) is
about to be erased, the valid pages must first be written
elsewhere, thereby imposing extra writes. Therefore, the amount
of data written to the flash chips of an SSD is much larger than
the amount of data issued to the SSD.
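The extra writes are commonly quantified by the write amplification factor (WAF): the ratio of data written to flash to data issued by the host. A back-of-the-envelope sketch under a simple garbage-collection model (the model is a standard approximation chosen for illustration, not taken from the cited works):

```python
def write_amplification(valid_fraction: float) -> float:
    """Approximate WAF when reclaimed erase units still hold valid pages.

    Reclaiming one erase unit frees (1 - valid_fraction) of its pages for
    host data but forces valid_fraction of its pages to be copied elsewhere,
    so per host page the device writes 1 + valid_fraction/(1 - valid_fraction)
    pages, i.e. WAF = 1 / (1 - valid_fraction).
    """
    assert 0.0 <= valid_fraction < 1.0
    return 1.0 / (1.0 - valid_fraction)


# If erase units average 50% valid pages when reclaimed, every host page
# written costs one extra copy, so the flash absorbs twice the host traffic.
```

The sketch makes the endurance link concrete: halving the average valid fraction at reclaim time (e.g., by reducing duplicate or overwritten data) directly reduces the bytes worn into the flash.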
In recent years, a considerable amount of research has
been done to enhance SSD endurance. For example, Griffin
employed hard disk drives or HDDs as a write cache for SSDs
to coalesce overwrites, aiming to significantly reduce write
traffic to the SSDs [16]. Chen et al. proposed a hetero-buffer,
which consists of DRAM and a reorder area [17]: the DRAM is
devoted to improving the hit ratio, while the reorder area aims
at mitigating write amplification. Both of these techniques
allow SSDs to reduce the number of erased physical blocks.
I-CASH arranges SSDs to store seldom-changed and mostly-read
reference data, whereas an HDD stores a log of the changed
deltas of the SSD data [30]. In the I-CASH case, SSDs obviate
the need to handle random writes. Kim et al. applied data
deduplication technology inside SSDs to reduce write
amplification effects [10].
C. Cache Management for SSDs
Traditional cache replacement algorithms (e.g., FIFO,
LRU, LFU, LIRS [15], and ARC [9]) strive for high cache hit
rates by frequently updating cached contents without any
restriction. The write endurance of SSD products is inadequate
to sustain such excessive write traffic, which inevitably leads
to a short lifetime for the SSD cache (see an example in
Table 1). To overcome this problem, vendors and researchers
have proposed various solutions that exploit application
characteristics to limit the number of writes to the SSD cache.
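The underlying issue can be made concrete with a sketch: under unrestricted admission, every cache miss inserts a block, so the SSD absorbs one write per miss. The simulation below is purely illustrative (trace, capacity, and the one-write-per-admission cost model are assumptions):

```python
from collections import OrderedDict


def lru_ssd_writes(trace, capacity):
    """Simulate an SSD cache under plain LRU and count device writes.

    With unrestricted admission, every miss writes the block into the
    SSD cache, so the write count equals the miss count.
    """
    cache = OrderedDict()
    writes = 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)       # hit: refresh recency, no SSD write
        else:
            writes += 1                    # miss: block written into the SSD
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return writes
```

A scan over many one-touch blocks writes every one of them to the SSD even though none is ever reused, which is exactly the wasted wear that admission-limiting schemes try to avoid.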
For example, Oracle Exadata's Smart Flash Cache [24]
combines cache management and database logic, skipping
unimportant data in the flash cache to reduce SSD writes.
NetApp [33], employing SSDs as a read cache, categorizes data
into different priorities, namely metadata, normal data, and
low-priority data. SSDs can be configured to cache only data
with specified priorities, with the result of reducing the write
load. LARC [19] manages a virtual LRU queue of limited
length in front of the SSD cache; hitting this LRU queue is the
only condition for entering the SSD cache, which reduces the
amount of written data. EMC's FAST Cache [20], Intel's
Fi