Multi-Rail High-Level Design Overview: System Architecture and Protocol Analysis

PDF, 1.13 MB. Updated 2024-07-16.
"Multi-Rail High-Level Design" 文件是一份由 Amir Shehata 和 Olaf Weber 编写的关于分布式文件系统 Lustre 中多路径(Multi-Rail)设计的高级文档,旨在详细介绍如何实现高可用性和性能优化。文档包含了设计概述、系统结构、组件功能、用例场景以及调试需求等内容。 在设计概述部分,文档提到了多路径设计的目标是提高网络的可靠性和性能。系统层面的设计包括了 inetctl 工具、DLC 库、LNet IOCTL 接口、LNet 机制、主 NID(Network Interface Identifier)、PTLRPC(Progressive Transport Layer RPC)协议、LNDs(Logical Network Domains)以及 NUMA(Non-Uniform Memory Access)选择和动态对等发现功能。LNet 是 Lustre 网络层的核心,负责节点间的通信,而 PTLRPC 是在其之上构建的进程间通信机制。LNDs 则是实现网络协议(如 TCP/IP)的具体接口。 文档中还探讨了不同网络场景下的用例和边缘情况,这有助于理解多路径设计如何应对各种实际部署环境。动态对等发现功能允许系统自动识别和适应网络变化。此外,文档也提出了调试需求,特别是在用户空间和内核空间的实现细节,如 lnetctl 的使用、DLC 库的调试接口以及 LNetCtl IOCTL 对接口的操作。 在用户空间方面,文档介绍了如何通过 lnetctl 工具进行配置,以及 DLC 库和 LNetCtl IOCTL 在添加和删除网络接口时的角色。例如,添加或删除一个网络(Net)与添加或删除一个网络接口(NI)的过程有区别,涉及到 Network to Network Interface 的 CPT 继承问题。 在内核空间,文档讨论了线程模型、锁的使用以及 NUMA 意识的扩展,包括 NUMA 距离的计算,以及新的 CPT 接口和内存描述符的设计,这些都是为了优化多路径下内存访问的效率。 这份文档详细阐述了 Lustre 多路径设计的高级概念和技术细节,为理解和实现高性能、高可靠的分布式文件系统提供了重要的参考。
Uploaded 2020-03-31
Contents

Multi-Rail Scope and Requirements Document
  Acronym Table
  Scope
    Problem Statement
    In-Scope
    Out-of-Scope
  Project Overview
    Use Cases
      Single Network
      Multi-Network/Single FS
      Multi-Network/Multi-FS
    Networks and Network Interfaces
    User Defined Selection Policy
    System Overview
      High-level Data structure overview
  Requirements
    Process
    Categorization
    Classes
    Terms
    Requirement Format
    Notes
    Configuration Requirements
      Local Network Configuration
      Remote Peer Configuration
      Policy Configuration
      General Configuration
    Functional Requirements
      Interface Selection and Message Sending Requirements
      Dynamic NID Discovery
      Debugging Requirements
      Network Interface Health
      Backward Compatibility Requirements
      Testing Requirements
      Documentation Requirements
  Requirement Discussion
    Selection Policies
      User Defined Selection Policy (UDSP)
      Local NI first vs Peer NID first
      Selection Algorithm ...