HBase Distributed Database Explained: Column-Oriented Storage and Strong Consistency

Updated 2024-08-15 · 1.44 MB · PPT
"本文档主要介绍了HBase的基本概念、特性以及其在Hadoop生态系统中的位置。内容包括HBase与传统关系型数据库的对比、CAP理论、NOSQL的一致性模型,以及HBase的逻辑数据模型、体系结构和各组件的职责。此外,还提到了Region的定位策略以及LSM-Tree等数据结构在HBase中的应用。" 在HBase中,Put/Get操作是核心的数据读写操作。Put操作用于向表中插入数据,而Get操作则用于读取数据。HBase作为一个NoSQL数据库,它与传统的ACID事务保障的关系型数据库有着显著的不同。HBase设计之初是为了满足互联网时代对大数据处理的高并发读写需求、海量数据存储和访问以及良好的伸缩性、可用性和可靠性。 HBase在Hadoop生态系统中占据重要位置,作为分布式列式存储系统,它充分利用了HDFS的分布式存储能力。HBase的特点包括基于列式的高效存储,提供强一致性的数据访问,具有高可靠性、高性能,并且能够自动切分和迁移Region以实现水平扩展。它无需预先定义Schema,允许灵活的数据模型。 HBase的逻辑数据模型由Table、Region、ColumnFamily、Row、Column和Value组成。Table是数据的基本单元,Region是Table的物理分割,ColumnFamily是一组列的集合,Row是数据的行标识,Column是ColumnFamily下的具体列,Value则是列对应的值,而TimeStamp用于记录数据版本。 HBase的体系结构包括Client、Zookeeper、Master和RegionServer。Client负责访问HBase并维护缓存以提高性能;Zookeeper用于选举和监控Master,存储Region的入口地址和元数据;Master负责Region的分配、负载均衡以及故障恢复;RegionServer则实际存储和处理Region的数据,执行Split和Compact操作。 Region的定位是通过-ROOT-和.META.表实现的,这是HBase的元数据存储。LSM-Tree(Log-Structured Merge Tree)数据结构使得HBase能够在写入性能和读取效率之间找到平衡,支持快速写入和范围查询,但可能会导致全表扫描。 此外,HBase支持多种过滤器,如BooleanFilter,用于快速定位数据是否存在于特定集合中,虽然可能有少量误判,但能有效提升查询效率。HBase的设计理念和特性使其成为处理大规模、高并发数据场景的理想选择。
