PRIMS: Building Extremely Reliable Non-Volatile Storage

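Summary: This article discusses PRIMS, a technique for non-volatile random-access memory (NVRAM) that aims to make storage systems more reliable and robust by guarding against both file system and NVRAM corruption. By keeping persistent metadata in an erasure-encoded log structure, PRIMS sustains high throughput on small writes while keeping that metadata correct, checks integrity on every operation, and scans NVRAM on-line to confirm that the file system remains intact.

As byte-addressable non-volatile memory (NVRAM) is used more widely for critical data, keeping that data safe matters more and more. Existing NVRAM-based file systems, however, lack features that guard against file system or NVRAM corruption, and most file systems check consistency only after a crash, which is too late to catch latent problems.

The paper (uploaded under the title "PRIMS应用技术11") was written by Kevin M. Greenan and Ethan L. Miller of the Storage Systems Research Center at the University of California, Santa Cruz. PRIMS (Persistent, Reliable In-Memory Storage) is designed to address these problems by providing file storage that survives multiple errors in NVRAM, whether they come from errant operating system writes or from media corruption.

The core idea in PRIMS is to store persistent metadata in an erasure-encoded log structure. This structure lets the system periodically verify the correctness of file system operations without giving up performance: on small writes, the paper reports throughput an order of magnitude higher than page protection. PRIMS also checks integrity on every operation, whereas most file systems check consistency only after a failure, and it scans the entire NVRAM on-line to confirm that the file system stays consistent. When errors are found, it can correct them using file system logs and error correction information.

PRIMS thus offers a new approach to reliable storage in NVRAM: its error detection and correction strategy improves a file system's ability to recover from hardware faults and software errors, which is relevant to data centers, cloud environments, and other settings with strict data-safety requirements.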

PRIMS: Making NVRAM Suitable for Extremely Reliable Storage†

Kevin M. Greenan

kmgreen@cs.ucsc.edu

Ethan L. Miller

elm@cs.ucsc.edu

Storage Systems Research Center

University of California, Santa Cruz

Abstract

Non-volatile byte-addressable memories are becoming more common, and are increasingly used for critical data that must not be lost. However, existing NVRAM-based file systems do not include features that guard against file system corruption or NVRAM corruption. Furthermore, most file systems check consistency only after the system has already crashed. We are designing PRIMS to address these problems by providing file storage that can survive multiple errors in NVRAM, whether caused by errant operating system writes or by media corruption. PRIMS uses an erasure-encoded log structure to store persistent metadata, making it possible to periodically verify the correctness of file system operations while achieving throughput rates of an order of magnitude higher than page-protection during small writes. It also checks integrity on every operation and performs on-line scans of the entire NVRAM to ensure that the file system is consistent. If errors are found, PRIMS can correct them using file system logs and extensive error correction information. While PRIMS is designed for reliability, we expect it to have excellent performance, thanks to the ability to do word-aligned reads and writes in NVRAM.

1 Introduction

Byte-addressable, non-volatile memory (NVRAM) technologies such as magnetoresistive random access memory (MRAM) and phase-change memory (PRAM) have recently emerged as viable competitors to Flash RAM [1, 2]. These relatively low capacity technologies are perfect for permanent metadata storage, and can greatly improve file system performance, reliability and power consumption. Unfortunately, due to the increased chance of data corruption, storing permanent structures in NVRAM is generally regarded as unsafe, particularly when compared to disk. The simplicity of most memory access interfaces makes erroneous writes more likely, resulting in data corruption; it is far easier to manipulate structures in memory than on disk. Such behavior is common in OS kernels, in which buggy code can issue erroneous wild writes that accidentally overwrite memory used by another module or application.

† This research was funded in part by NSF-0306650, the Dept. of Energy-funded Petascale Data Storage Institute, and by SSRC industrial partners.

The goal of PRIMS (Persistent, Reliable In-Memory Storage) is to provide reliable storage in NVRAM without hindering the access speed of byte-addressable memory. Given the limitations of current in-memory reliability mechanisms, we believe that a log-based scheme using software erasure codes is the most effective way to ensure the consistency of persistent, memory-resident data. We present a log-based approach that has the ability to detect and correct errors at multiple byte-granularity without using page-based access control or specialized hardware support. PRIMS consists of a single, erasure-encoded log structure that is used to detect and correct hardware errors, software errors and file system inconsistencies.
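
To make the idea of an erasure-encoded log concrete, here is a minimal sketch in Python. It is not the PRIMS design: the class `ParityLog`, its fixed entry size, the per-entry CRC32, and the single XOR parity block are all assumptions chosen for illustration. A single parity block can rebuild only one damaged entry, so surviving multiple errors, as the paper aims to, would need a stronger erasure code. The sketch also assumes the checksums and the parity block themselves are intact.

```python
import zlib
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))


class ParityLog:
    """Toy log (hypothetical): fixed-size entries, one CRC32 per entry,
    and one XOR parity block covering all entries."""

    def __init__(self, entry_size=16):
        self.entry_size = entry_size
        self.entries = []                 # log entries, as mutable bytearrays
        self.checksums = []               # CRC32 recorded when each entry was appended
        self.parity = bytes(entry_size)   # running XOR of all entries

    def append(self, payload: bytes):
        if len(payload) > self.entry_size:
            raise ValueError("payload larger than entry size")
        block = payload.ljust(self.entry_size, b"\0")
        self.entries.append(bytearray(block))
        self.checksums.append(zlib.crc32(block))
        self.parity = xor_blocks([self.parity, block])

    def scan(self):
        """Return indices of entries whose CRC32 no longer matches."""
        return [i for i, e in enumerate(self.entries)
                if zlib.crc32(bytes(e)) != self.checksums[i]]

    def repair(self):
        """Rebuild a single damaged entry from the parity block and the others."""
        bad = self.scan()
        if len(bad) > 1:
            raise RuntimeError("single parity can rebuild only one entry")
        for i in bad:
            others = [bytes(e) for j, e in enumerate(self.entries) if j != i]
            self.entries[i] = bytearray(xor_blocks([self.parity] + others))
        return bad


# Usage: simulate a wild write into the second entry, then detect and repair it.
log = ParityLog()
for record in (b"create /a", b"write /a 0 4", b"unlink /b"):
    log.append(record)
log.entries[1][0] ^= 0xFF
damaged = log.scan()
log.repair()
```

The `scan` method stands in for an on-line integrity pass over the log, and `repair` for correction from redundancy; both run in software, with no page-based access control involved.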

2 Motivation

Modern operating systems protect critical regions of memory using access control bits in the paging structures. While page-level access control is an effective tool for preventing wild writes in write caches, it is not the best solution for protecting small, persistent structures in byte-addressable memory because every protected write requires a TLB flush and two structure modifications to mark a page as read-write and read-only. During periods of frequent small writes, these permission changes have a dramatic effect on performance.
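
The per-write overhead described above can be tallied with a simple counting model. This is only a sketch: `ProtectionCost` and `protected_writes` are hypothetical names, and the model counts events (one TLB flush and two paging-structure modifications per protected write, as stated in the text) without assigning them any measured cost.

```python
from dataclasses import dataclass


@dataclass
class ProtectionCost:
    tlb_flushes: int = 0
    page_table_updates: int = 0


def protected_writes(num_writes: int) -> ProtectionCost:
    """Count overhead events when every small write is bracketed by
    unprotecting and re-protecting its page."""
    cost = ProtectionCost()
    for _ in range(num_writes):
        cost.page_table_updates += 1  # mark the page read-write
        # ... the small write itself would happen here ...
        cost.page_table_updates += 1  # mark the page read-only again
        cost.tlb_flushes += 1         # flush the stale translation
    return cost


# Usage: overhead events for a burst of 1000 small metadata writes.
burst = protected_writes(1000)
```

Because the overhead grows linearly with the number of writes regardless of their size, it weighs most heavily on workloads made up of many small writes.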

Disk interfaces also decrease the likelihood of wild writes by requiring access through device drivers containing complex I/O routines. The probability of rogue code accidentally corrupting disk blocks while evading the controlled device driver interface is extremely low. This strict I/O interface greatly improves data reliability with respect to software errors, but hinders performance on low-latency media, such as NVRAM.

In addition to software errors, hardware errors such as random bit flips and cell wear may occur on the media, leading to data corruption. Hardware-based error correction schemes require a specialized controller and resolve errors beyond the correction capability by rebooting the system. Obviously, rebooting is not an option when protecting persistent data in memory; thus, a more robust scheme is necessary. Hardware-based error correction is also computed independent of any software implementation; as a result, wild
