netmap: a novel framework for fast packet I/O
netmap is a novel framework proposed by Luigi Rizzo at the Università di Pisa, designed for applications that must move large volumes of traffic over high-speed (1 Gbps to 10 Gbps) links, such as routers, traffic monitors, and firewalls. These applications need to process millions of packets per second without sacrificing performance, which conventional systems struggle to deliver because of bottlenecks such as frequent dynamic memory allocation, per-packet system overhead, and memory copies. netmap's core contribution is to reduce packet-processing cost in three ways:

1. Optimized memory management: resources are preallocated, so no dynamic allocation takes place when packets are received or sent, reducing both the cost and the complexity of memory management.
2. Minimized system overhead: operations are batched, so system-level costs are amortized over large numbers of packets instead of being paid on each one.
3. Shared memory: buffers and metadata are shared between the kernel and user space, eliminating unnecessary copies while keeping device registers and other critical kernel memory protected.

Compared with previous work, netmap not only outperforms most comparable solutions but also integrates tightly with existing architectures, so developers can exploit this fast packet handling without major changes to the underlying hardware or to their applications. This flexibility makes netmap a practical platform for building high-performance applications suited to modern networks, and its techniques may also inform improvements in network device drivers and operating systems.
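The amortization idea in point 2 can be illustrated with a toy cost model. The numbers below are purely illustrative (they are not measurements from the paper): the point is only that a fixed per-call cost divided over a batch shrinks the effective per-packet cost.

```python
def per_packet_cost(syscall_ns, work_ns, batch):
    """Effective cost per packet when one system call covers `batch` packets."""
    return work_ns + syscall_ns / batch

# Illustrative figures: 1000 ns of fixed syscall overhead, 100 ns of real
# per-packet work. With no batching the syscall dominates; with a batch of
# 256 it almost disappears.
single = per_packet_cost(1000, 100, 1)      # 1100 ns per packet
batched = per_packet_cost(1000, 100, 256)   # ~104 ns per packet
print(f"batch=1: {single:.0f} ns/pkt, batch=256: {batched:.0f} ns/pkt")
```

At 104 ns per packet a single core could in principle sustain roughly 10 Mpps, versus under 1 Mpps at 1100 ns, which is why batching is central to every fast packet I/O design discussed below.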
[…]ware architectures is that most systems barely reach 0.5–1 Mpps per core from userspace, and even remaining in the kernel yields only modest speed improvements, usually within a factor of 2.
3 Related (and unrelated) work
It is useful at this point to present some techniques proposed in the literature, or used in commercial systems, to improve packet processing speeds. This will be instrumental in understanding their advantages and limitations, and how our netmap framework can make use of them.
Socket APIs: The Berkeley Packet Filter, or BPF [12], is one of the most popular systems for direct access to raw packet data. BPF taps into the data path of a network device driver, and dispatches a copy of each sent or received packet to a file descriptor, from which userspace processes can read or write. Linux has a similar mechanism through the AF_PACKET socket family. BPF can coexist with regular traffic from/to the system, but usually BPF clients need to put the card in promiscuous mode, causing large amounts of traffic to be delivered to the host stack (and immediately dropped).
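A BPF filter is a small program run by an in-kernel virtual machine against each packet; it returns the number of bytes to accept (0 to drop). The sketch below is a deliberately simplified userspace interpreter for a BPF-style instruction set, assumed for illustration only (it does not use the real BPF opcode encoding); the example program accepts only frames whose EtherType is 0x0800 (IPv4).

```python
# Toy BPF-style instruction set (illustrative, not the real encoding).
# Each instruction is (opcode, k, jt, jf).
LD_ABS_H, JEQ_K, RET_K = range(3)

def run_filter(prog, pkt):
    """Interpret a filter program over a raw frame; return accept length."""
    acc, pc = 0, 0
    while pc < len(prog):
        op, k, jt, jf = prog[pc]
        if op == LD_ABS_H:                       # load big-endian halfword at offset k
            acc = (pkt[k] << 8) | pkt[k + 1]
            pc += 1
        elif op == JEQ_K:                        # conditional relative jump
            pc += 1 + (jt if acc == k else jf)
        elif op == RET_K:                        # return: bytes to accept, 0 = drop
            return k
    return 0

# "Accept IPv4 only": load the EtherType (offset 12 in an Ethernet header),
# compare against 0x0800, accept or drop.
ipv4_only = [
    (LD_ABS_H, 12, 0, 0),
    (JEQ_K, 0x0800, 0, 1),
    (RET_K, 0xFFFF, 0, 0),   # accept up to 0xFFFF bytes
    (RET_K, 0, 0, 0),        # drop
]
```

A real BPF program has the same shape (loads, conditional jumps, returns), which is what makes it safe to run inside the kernel: no loops, bounded execution.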
Packet filter hooks: Netgraph (FreeBSD), Netfilter (Linux), and NDIS Miniport drivers (Windows) are in-kernel mechanisms used when packet duplication (as in BPF) is not necessary, or when the application (e.g. a firewall, or an IDS) manipulates traffic as part of a packet processing chain. These hooks make it possible to intercept traffic from/to the driver and pass it to processing modules without additional data copies. Note however that even the packet filter hooks rely on the standard mbuf/skbuf based packet representation, so the cost of metadata management (Section 2.2) still remains. Netslice [11] is an example of a system that uses the netfilter hooks to export traffic to userspace processes through a suitable kernel module.
Direct buffer access: One easy way to remove the data copies involved in the kernel-userland transition is to run the application code directly within the kernel. Systems based on kernel-mode Click [8, 4] follow this approach. Click permits easy construction of packet processing chains through the composition of modules, some of which support fast access to the NIC (even though they retain an skbuf-based packet representation).
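Click's central idea, composing a processing pipeline out of small reusable elements, can be sketched in a few lines. This is a hypothetical toy model, not Click's actual API: each element is a function that returns the (possibly modified) packet, or None to drop it.

```python
def run_chain(elements, pkt):
    """Push one packet through a chain of elements; any element may drop it."""
    for elem in elements:
        pkt = elem(pkt)
        if pkt is None:          # element dropped the packet
            return None
    return pkt

def count(stats):
    """Element factory: count every packet that reaches this element."""
    def elem(pkt):
        stats["seen"] += 1
        return pkt
    return elem

def drop_short(min_len):
    """Element factory: drop frames shorter than min_len bytes."""
    def elem(pkt):
        return pkt if len(pkt) >= min_len else None
    return elem

stats = {"seen": 0}
chain = [count(stats), drop_short(60)]
passed = [p for p in (b"x" * 64, b"y" * 10) if run_chain(chain, p) is not None]
```

The appeal of this style is that elements are independently testable and reorderable; the cost Click pays in the kernel is that each element still handles full skbuf metadata per packet.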
The kernel environment is much more constrained than the one available in user space, so a number of recent proposals try a different approach: instead of running the application in the kernel, they remove the system call overhead by exposing NIC registers and data structures to user space. This approach generally requires modified device drivers, and poses some risks at runtime, because the NIC's DMA engine can write to arbitrary memory addresses (unless limited by hardware mechanisms such as IOMMUs), and a misbehaving client can potentially trash data anywhere in the system.
UIO-IXGBE [9] implements exactly what we have described above: buffers, hardware rings and NIC registers (see Figure 1) are directly accessible to user programs, with obvious risks for the stability of the system.
PF_RING [2] exports to userspace clients a shared memory region containing a ring of pre-allocated packet buffers. The kernel is in charge of copying data between skbufs and the shared buffers, so the system is safe and no driver modifications are needed. This approach amortizes the system call costs over batches of packets, but retains the data copy and skbuf management overhead. An evolution of PF_RING called DNA [3] avoids the copy because the memory-mapped ring buffers are directly accessed by the NIC. As with UIO-IXGBE, DNA clients have direct access to the NIC's registers and rings.
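The shared structure at the heart of this family of designs is a ring of fixed-size, pre-allocated buffers with producer and consumer indices. Below is a minimal single-producer/single-consumer sketch (a toy model of the concept, not PF_RING's or netmap's actual layout): the "kernel" side copies packet data into a pre-allocated slot, the "user" side reads it out, and no per-packet allocation ever happens.

```python
class PacketRing:
    """Toy ring of pre-allocated packet buffers (one slot always left empty
    to distinguish a full ring from an empty one)."""

    def __init__(self, slots=8, slot_size=2048):
        self.bufs = [bytearray(slot_size) for _ in range(slots)]  # preallocated
        self.lens = [0] * slots
        self.slots = slots
        self.head = 0    # next slot the producer writes
        self.tail = 0    # next slot the consumer reads

    def produce(self, data):
        """'Kernel' side: copy a packet into the next free slot, or drop."""
        nxt = (self.head + 1) % self.slots
        if nxt == self.tail:          # ring full: drop the packet
            return False
        buf = self.bufs[self.head]
        n = min(len(data), len(buf))  # truncate to the slot size
        buf[:n] = data[:n]
        self.lens[self.head] = n
        self.head = nxt
        return True

    def consume(self):
        """'User' side: read the oldest packet, or None if the ring is empty."""
        if self.tail == self.head:
            return None
        data = bytes(self.bufs[self.tail][:self.lens[self.tail]])
        self.tail = (self.tail + 1) % self.slots
        return data
```

PF_RING keeps the copy into the slot (safe, no driver changes); DNA and netmap instead let the NIC DMA directly into the slots, which removes the copy and leaves only the index updates on the fast path.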
The PacketShader [5] I/O engine (PSIOE) is one of the closest relatives to our proposals. PSIOE uses a custom device driver that replaces the skbuf-based API with a simpler one, using preallocated buffers. Custom ioctl()s are used to synchronize the kernel with userspace applications, and multiple packets are passed up and down through a memory area shared between the kernel and the application. The kernel is in charge of copying packet data between the shared memory and packet buffers. Unlike netmap, PSIOE only supports one specific network card, and does not support select()/poll(), requiring modifications to applications in order to let them use the new API.
Hardware solutions: Some hardware has been designed specifically to support high speed packet capture, or possibly generation, together with special features such as timestamping, filtering, and forwarding. Usually these cards come with custom device drivers and user libraries to access the hardware. As an example, DAG [1, 7] cards are FPGA-based devices for wire-rate packet capture and precise timestamping, using fast on-board memory for the capture buffers (at the time, the I/O bus was unable to sustain line rate). NetFPGA [10] is another example of an FPGA-based card whose firmware can be programmed to implement specific functions directly in the NIC, offloading some work from the main CPU.
3.1 Unrelated work
Much of the commercial interest in high-speed networking goes to TCP acceleration and hardware virtualization, so it is important to clarify where netmap stands in this respect. netmap is a framework to reduce the cost of moving traffic between the hardware and the host stack. Popular hardware features related to TCP acceleration, such as hardware checksumming or even […]