Tip: High-quality blog posts about Ceph are also worth consulting, such as Ceph Write Throughput 1,
Ceph Write Throughput 2, Argonaut v. Bobtail Performance Preview, and Bobtail Performance - I/O Scheduler
Comparison.
2.1.1 CPU
Ceph metadata servers dynamically redistribute their load, which is CPU intensive, so metadata servers
should have significant processing power (e.g., quad-core or better CPUs). Ceph OSDs run the RADOS service,
calculate data placement with CRUSH, replicate data, and maintain their own copy of the cluster map, so
they need a reasonable amount of processing power (e.g., dual-core processors). Monitors simply maintain a
master copy of the cluster map, so they are not CPU intensive. You must also consider whether the host
machine will run CPU-intensive processes in addition to Ceph daemons. For example, if your hosts will run
compute VMs (e.g., OpenStack Nova), make sure those processes leave sufficient processing power for the
Ceph daemons. We recommend running additional CPU-intensive processes on separate hosts.
2.1.2 RAM
Metadata servers and monitors must be capable of serving their data quickly, so they should have plenty of
RAM (e.g., 1GB of RAM per daemon instance). OSDs do not require as much RAM for regular operations
(e.g., 200MB of RAM per daemon instance); however, during recovery they need significantly more
(e.g., 500MB-1GB per daemon instance). Generally, more RAM is better.
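The per-daemon figures above can be turned into a rough per-host estimate. The following is a minimal sketch, not an official sizing tool: the function name `estimate_host_ram_mb` and the table `RAM_MB` are hypothetical, and the recovery figure for OSDs assumes the upper end of the 500MB-1GB range quoted above.

```python
# Per-daemon RAM guidance from this section, in MB.
# Assumption: OSD recovery uses the upper end (1GB) of the 500MB-1GB range.
RAM_MB = {
    "mds": {"normal": 1024, "recovery": 1024},
    "mon": {"normal": 1024, "recovery": 1024},
    "osd": {"normal": 200, "recovery": 1024},
}


def estimate_host_ram_mb(daemons, recovery=False):
    """Sum the suggested RAM for the Ceph daemons on one host.

    daemons maps a daemon kind ("mds", "mon", "osd") to its instance count.
    The result excludes the operating system and any non-Ceph processes.
    """
    mode = "recovery" if recovery else "normal"
    return sum(RAM_MB[kind][mode] * count for kind, count in daemons.items())


if __name__ == "__main__":
    host = {"osd": 4, "mon": 1}  # hypothetical host layout
    print(estimate_host_ram_mb(host))                 # 1824
    print(estimate_host_ram_mb(host, recovery=True))  # 5120
```

Sizing for the recovery case is the safer choice, since that is when OSDs need the most memory.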
2.1.3 Data Storage
Plan your data storage configuration carefully. There are significant cost and performance tradeoffs to
consider when planning for data storage. Simultaneous OS operations, and simultaneous read and write
requests from multiple daemons against a single drive, can slow performance considerably. There are also
file system limitations to consider: btrfs is not quite stable enough for production, but it can journal
and write data simultaneously, whereas XFS and ext4 cannot.
Important: Since Ceph has to write all data to the journal before it can send an ACK (for XFS and
ext4 at least), keeping journal performance and OSD performance in balance is really important!
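To see why this balance matters, consider a deliberately simplified model. It assumes, beyond what the text states, that sustained acknowledged write throughput for one OSD is capped by the slower of its journal bandwidth and its data disk bandwidth, and that OSDs sharing one journal device split its bandwidth evenly. The function name and all numbers are hypothetical.

```python
def acked_write_mb_s(journal_mb_s, data_disk_mb_s, osds_sharing_journal=1):
    """Rough upper bound on sustained acknowledged writes for one OSD, in MB/s.

    Simplifying assumptions (not taken from the Ceph documentation):
    - every write goes to the journal before it is acknowledged;
    - the journal device's bandwidth is split evenly among the OSDs using it;
    - over time the data disk must absorb everything the journal accepts.
    """
    if osds_sharing_journal < 1:
        raise ValueError("osds_sharing_journal must be at least 1")
    journal_share = journal_mb_s / osds_sharing_journal
    return min(journal_share, data_disk_mb_s)


if __name__ == "__main__":
    # Hypothetical figures: a 400 MB/s journal device and 120 MB/s data disks.
    print(acked_write_mb_s(400, 120, osds_sharing_journal=2))  # 120: data disk is the limit
    print(acked_write_mb_s(400, 120, osds_sharing_journal=4))  # 100.0: journal is the limit
```

In this model, adding OSDs to a shared journal device eventually moves the bottleneck from the data disks to the journal, which is the imbalance the note warns about.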
2.1.3.1 Hard Disk Drives
OSDs should have plenty of hard disk drive space for object data. We recommend a minimum hard disk drive
size of 1 terabyte. Consider the cost-per-gigabyte advantage of larger disks. We recommend dividing the price
of the hard disk drive by the number of gigabytes to arrive at a cost per gigabyte, because larger drives may
have a significant impact on the cost-per-gigabyte. For example, a 1 terabyte hard disk priced at $75.00 has a