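Contents
I. Enable HDFS Rack Awareness
1. Add a property to core-site.xml
2. Create the rack-awareness script
3. Create the rack mapping file
4. Distribute the files to the other nodes
5. Restart HDFS to apply rack awareness
II. Host Planning
III. Install and Configure a Fully Distributed HBase Cluster
1. Configure environment variables on all nodes
2. Extract the package and configure the environment
3. Edit $HBASE_HOME/conf/regionservers
4. Create the temporary directory used by HBase
5. Edit the HBase configuration file
6. Create the backup-masters file
7. Distribute the configuration to the other nodes
IV. Start the HBase Cluster
1. Start HBase
2. Check the HBase processes
3. Check the web UI
4. Check the HBase znodes in ZooKeeper
V. Verify the Installation
1. Check status and run a simple read/write test in hbase shell
2. Automatic failover test
(1) Simulate a failure
(2) Check status
(3) Recover from the failure
(4) Check status
(5) Fail over again
References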
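A fully distributed HBase cluster depends on ZooKeeper and Hadoop. Their installation, deployment, and operation were covered in detail in the previous article, "Building a Real-Time Data Warehouse on HBase & Phoenix (1): Hadoop HA Installation and Deployment". This article continues in the same host environment and installs and configures a fully distributed HBase cluster.
I. Enable HDFS Rack Awareness
HBase stores its data on HDFS. To improve performance, first enable HDFS rack awareness. Perform the following steps on node1.
1. Add a property to core-site.xml
# Edit $HADOOP_HOME/etc/hadoop/core-site.xml
vim $HADOOP_HOME/etc/hadoop/core-site.xml
# Add the following property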
<property>
  <name>topology.script.file.name</name>
  <value>/root/hadoop-3.3.6/etc/hadoop/topology.sh</value>
</property>
2. Create the rack-awareness script
# Edit /root/hadoop-3.3.6/etc/hadoop/topology.sh
vim /root/hadoop-3.3.6/etc/hadoop/topology.sh
Contents:
#!/bin/bash
# Directory that holds the rack mapping file topology.data
HADOOP_CONF=/root/hadoop-3.3.6/etc/hadoop
while [ $# -gt 0 ] ;
do
# Assign the first argument (a node IP address or hostname) to nodeArg
nodeArg=$1
# Open the rack mapping file read-only
exec
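The listing above stops at exec, so it cannot run as shown. The following is a minimal sketch of what a complete script could look like, not the author's original. It assumes the three-column topology.data layout used in the next step (IP address, hostname, rack) and falls back to /default-rack for unknown nodes. The function name resolve_rack and the TOPOLOGY_DATA override are hypothetical, introduced only for illustration and testing.

```shell
#!/bin/bash
# Sketch of a complete topology.sh (an assumption, not the original script).
# TOPOLOGY_DATA is a hypothetical override; it defaults to the path used in this article.
TOPOLOGY_DATA="${TOPOLOGY_DATA:-/root/hadoop-3.3.6/etc/hadoop/topology.data}"

# Print the rack for one IP address or hostname, or /default-rack if it is unknown.
resolve_rack() {
  local node="$1" ip host rack
  if [ -r "$TOPOLOGY_DATA" ]; then
    # Each line is expected to be: <ip> <hostname> <rack>
    while read -r ip host rack || [ -n "$ip" ]; do
      if [ "$ip" = "$node" ] || [ "$host" = "$node" ]; then
        echo "$rack"
        return 0
      fi
    done < "$TOPOLOGY_DATA"
  fi
  echo "/default-rack"
}

# Hadoop passes one or more addresses as arguments and expects one rack per argument.
for node in "$@"; do
  resolve_rack "$node"
done
```

Running the script by hand with an address, for example ./topology.sh 172.18.4.188, is a quick way to confirm the mapping before restarting HDFS.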
Make the script executable:
chmod 755 /root/hadoop-3.3.6/etc/hadoop/topology.sh
3. Create the rack mapping file
# Edit /root/hadoop-3.3.6/etc/hadoop/topology.data
vim /root/hadoop-3.3.6/etc/hadoop/topology.data
Contents:
172.18.4.126 node1 /dc1/rack1
172.18.4.188 node2 /dc1/rack1
172.18.4.71 node3 /dc1/rack1
172.18.4.86 node4 /dc1/rack1
4. Distribute the files to the other nodes
scp /root/hadoop-3.3.6/etc/hadoop/core-site.xml node2:/root/hadoop-3.3.6/etc/hadoop/
scp /root/hadoop-3.3.6/etc/hadoop/topology.sh node2:/root/hadoop-3.3.6/etc/hadoop/
scp /root/hadoop-3.3.6/etc/hadoop/topology.data node2:/root/hadoop-3.3.6/etc/hadoop/
scp /root/hadoop-3.3.6/etc/hadoop/core-site.xml node3:/root/hadoop-3.3.6/etc/hadoop/
scp /root/hadoop-3.3.6/etc/hadoop/topology.sh node3:/root/hadoop-3.3.6/etc/hadoop/
scp /root/hadoop-3.3.6/etc/hadoop/topology.data node3:/root/hadoop-3.3.6/etc/hadoop/
scp /root/hadoop-3.3.6/etc/hadoop/core-site.xml node4:/root/hadoop-3.3.6/etc/hadoop/
scp /root/hadoop-3.3.6/etc/hadoop/topology.sh node4:/root/hadoop-3.3.6/etc/hadoop/
scp /root/hadoop-3.3.6/etc/hadoop/topology.data node4:/root/hadoop-3.3.6/etc/hadoop/
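The nine scp commands above follow one pattern, so they can be written as a nested loop. This is a sketch under the same assumptions as the article (passwordless SSH as root, identical paths on every node). The function name distribute_conf and the DRY_RUN switch are hypothetical conveniences, not part of Hadoop.

```shell
#!/bin/bash
# Directory holding the Hadoop configuration, as used throughout this article.
CONF_DIR=/root/hadoop-3.3.6/etc/hadoop

# Copy the three rack-awareness files to node2, node3, and node4.
# With DRY_RUN=1 (a hypothetical switch) the commands are only printed.
distribute_conf() {
  local node file
  for node in node2 node3 node4; do
    for file in core-site.xml topology.sh topology.data; do
      if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "scp $CONF_DIR/$file $node:$CONF_DIR/"
      else
        scp "$CONF_DIR/$file" "$node:$CONF_DIR/"
      fi
    done
  done
}
```

Calling DRY_RUN=1 distribute_conf prints the commands for review; calling distribute_conf without it performs the copies.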
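5. Restart HDFS to apply rack awareness
# Run on node1
stop-dfs.sh
# Run on node1, node2, and node3
hdfs --daemon stop journalnode
hdfs --daemon start journalnode
# Run on node1
start-dfs.sh
Run hdfs dfsadmin -printTopology to print the rack information. The output shows that the cluster now places each DataNode in its configured rack.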
[root@vvml-yz-hbase-test~]#hdfs dfsadmin -printTopology
Rack: /dc1/rack1
172.18.4.71:9866 (node3) In Service
172.18.4.188:9866 (node2) In Service
172.18.4.86:9866 (node4) In Service
[root@vvml-yz-hbase-test~]#
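II. Host Planning
Required package: HBase-2.5.7
The table below lists the processes that will run on each of the four nodes. For simplicity, all commands in this installation are run as the operating system's root user.
| Process       | node1 | node2 | node3 | node4 |
| HMaster       |   *   |       |       |   *   |
| HRegionServer |       |   *   |   *   |   *   |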
III. Install and Configure a Fully Distributed HBase Cluster
1. Configure environment variables on all nodes
# Edit /etc/profile
vim /etc/profile
# Add the following two lines
export HBASE_HOME=/root/hbase-2.5.7-hadoop3/
export PATH=$HBASE_HOME/bin:$PATH
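# Apply the changes
source /etc/profile
Perform the following steps on node1.
2. Extract the package and configure the environment
# Extract (in /root, so that the resulting directory matches HBASE_HOME)
tar -zxvf hbase-2.5.7-hadoop3-bin.tar.gz
# Copy the Hadoop configuration files into the HBase configuration directory
# to avoid java.lang.IllegalArgumentException: java.net.UnknownHostException: mycluster
cp $HADOOP_HOME/etc/hadoop/core-site.xml $HBASE_HOME/conf/
cp $HADOOP_HOME/etc/hadoop/hdfs-site.xml $HBASE_HOME/conf/
# Edit $HBASE_HOME/conf/hbase-env.sh to set the HBase runtime environment
vim $HBASE_HOME/conf/hbase-env.sh
# Append to the end of the file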
export JAVA_HOME=/usr/java/jdk1.8.0_202-amd64
export HBASE_LOG_DIR=${HBASE_HOME}/logs
export HBASE_MANAGES_ZK=false
export HBASE_CLASSPATH=/root/hadoop-3.3.6/etc/hadoop
export HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP="true"
export HBASE_PID_DIR=${HBASE_HOME}/tmp
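- HBASE_MANAGES_ZK: when true, HBase starts and manages its own bundled ZooKeeper, which is normally used only in test environments. Setting it to false makes HBase use the independently running ZooKeeper ensemble.
- HBASE_CLASSPATH: points HBase at the Hadoop cluster. It must be set to the Hadoop configuration directory; otherwise HBase cannot resolve the Hadoop cluster name (mycluster here).
- HBASE_DISABLE_HADOOP_CLASSPATH_LOOKUP: stops HBase from scanning Hadoop's jars. If they are scanned, the error "object is not an instance of declaring class" is likely to occur.
3. Edit $HBASE_HOME/conf/regionservers
Replace the file contents with the list below. By default the file contains only localhost.
# Edit $HBASE_HOME/conf/regionservers
vim $HBASE_HOME/conf/regionservers
Contents: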
node2
node3
node4
4. Create the temporary directory used by HBase
mkdir $HBASE_HOME/tmp
5. Edit the HBase configuration file
# Back up the original file
cp $HBASE_HOME/conf/hbase-site.xml $HBASE_HOME/conf/hbase-site.xml.bak
# Edit $HBASE_HOME/conf/hbase-site.xml
vim $HBASE_HOME/conf/hbase-site.xml
Configuration:
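<configuration>
  <property><name>hbase.cluster.distributed</name><value>true</value></property>
  <property><name>hbase.rootdir</name><value>hdfs://mycluster/hbase</value></property>
  <property><name>hbase.tmp.dir</name><value>/root/hbase-2.5.7-hadoop3/tmp</value></property>
  <property><name>hbase.zookeeper.quorum</name><value>node1:2181,node2:2181,node3:2181</value></property>
  <property><name>hbase.zookeeper.property.dataDir</name><value>/var/lib/zookeeper/data</value></property>
  <!-- Keep true on HDFS, which supports hflush/hsync; false is only for file systems without those capabilities. -->
  <property><name>hbase.unsafe.stream.capability.enforce</name><value>true</value></property>
  <property><name>hbase.wal.provider</name><value>filesystem</value></property>
  <property><name>hbase.client.keyvalue.maxsize</name><value>10485760</value></property>
  <property><name>hbase.master.distributed.log.splitting</name><value>true</value></property>
  <property><name>hbase.client.scanner.caching</name><value>5000</value></property>
  <property><name>hbase.hregion.max.filesize</name><value>107374182400</value></property>
  <property><name>hbase.hregion.memstore.flush.size</name><value>268435456</value></property>
  <property><name>hbase.regionserver.handler.count</name><value>200</value></property>
  <property><name>hbase.regionserver.global.memstore.lowerLimit</name><value>0.38</value></property>
  <property><name>hbase.hregion.memstore.block.multiplier</name><value>8</value></property>
  <property><name>hbase.server.thread.wakefrequency</name><value>1000</value></property>
  <property><name>hbase.rpc.timeout</name><value>400000</value></property>
  <property><name>hbase.hstore.blockingStoreFiles</name><value>5000</value></property>
  <property><name>hbase.client.scanner.timeout.period</name><value>1000000</value></property>
  <property><name>zookeeper.session.timeout</name><value>180000</value></property>
  <property><name>hbase.regionserver.optionallogflushinterval</name><value>5000</value></property>
  <property><name>hbase.client.write.buffer</name><value>5242880</value></property>
  <property><name>hbase.hstore.compactionThreshold</name><value>5</value></property>
  <property><name>hbase.hstore.compaction.max</name><value>12</value></property>
  <property><name>hbase.regionserver.regionSplitLimit</name><value>1</value></property>
  <property><name>hbase.regionserver.thread.compaction.large</name><value>5</value></property>
  <property><name>hbase.regionserver.thread.compaction.small</name><value>8</value></property>
  <property><name>hbase.master.logcleaner.ttl</name><value>3600000</value></property>
  <property><name>hbase.hregion.majorcompaction</name><value>0</value></property>
  <property><name>dfs.client.hedged.read.threadpool.size</name><value>20</value></property>
  <property><name>dfs.client.hedged.read.threshold.millis</name><value>5000</value></property>
  <property><name>phoenix.schema.isNamespaceMappingEnabled</name><value>true</value></property>
  <property><name>phoenix.schema.mapSystemTablesToNamespace</name><value>true</value></property>
</configuration>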
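With this many properties in one file, a name entered twice with different values is easy to miss. The sketch below lists property names that occur more than once in a Hadoop-style XML configuration file. It assumes each name sits on one line inside a <name>...</name> element; find_duplicate_properties is a hypothetical helper name, not an HBase tool.

```shell
#!/bin/bash
# Print every property name that appears more than once in the given XML file.
# Assumes <name>...</name> elements are not split across lines.
find_duplicate_properties() {
  grep -o '<name>[^<]*</name>' "$1" \
    | sed -e 's/<name>//' -e 's/<\/name>//' \
    | sort | uniq -d
}

# Check the file edited in this step, if it is present on this machine.
site_file="${HBASE_HOME:-}/conf/hbase-site.xml"
if [ -f "$site_file" ]; then
  find_duplicate_properties "$site_file"
fi
```

An empty result means no name is repeated; any name that is printed should be reduced to a single entry before the file is distributed.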
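6. Create the backup-masters file
# Edit $HBASE_HOME/conf/backup-masters
vim $HBASE_HOME/conf/backup-masters
Contents:
node4
Note: this file must not contain comments. At startup every line, including a comment line, is treated as a server hostname, and the startup fails.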
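Since a stray comment in this file breaks startup, a quick check before starting the cluster can help. The sketch below accepts a host list only if every line looks like a bare hostname or IP address. The helper name check_host_list and the character pattern are assumptions made for illustration, not HBase's own validation.

```shell
#!/bin/bash
# Return 0 if every line of the file is a bare hostname or IP address,
# 1 if any line contains something else (a comment, spaces, or a blank line),
# and 2 if the file cannot be read.
check_host_list() {
  [ -r "$1" ] || return 2
  if grep -Evq '^[A-Za-z0-9][A-Za-z0-9.-]*$' "$1"; then
    return 1
  fi
  return 0
}

# Check the file created in this step, if it is present on this machine.
hosts_file="${HBASE_HOME:-}/conf/backup-masters"
if [ -f "$hosts_file" ]; then
  if check_host_list "$hosts_file"; then
    echo "backup-masters contains only host names"
  else
    echo "backup-masters contains a line that is not a host name"
  fi
fi
```

The same check applies to the regionservers file, which is read line by line in the same way.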
7. Distribute the configuration to the other nodes
scp -r $HBASE_HOME node2:/root/
scp -r $HBASE_HOME node3:/root/
scp -r $HBASE_HOME node4:/root/
IV. Start the HBase Cluster
1. Start HBase
# Run on node1
start-hbase.sh
# Output
[root@vvml-yz-hbase-test~]#start-hbase.sh
running master, logging to /root/hbase-2.5.7-hadoop3//logs/hbase-root-master-vvml-yz-hbase-test.172.18.4.126.out
node2: running regionserver, logging to /root/hbase-2.5.7-hadoop3/bin/../logs/hbase-root-regionserver-vvml-yz-hbase-test.172.18.4.188.out
node3: running regionserver, logging to /root/hbase-2.5.7-hadoop3/bin/../logs/hbase-root-regionserver-vvml-yz-hbase-test.172.18.4.71.out
node4: running regionserver, logging to /root/hbase-2.5.7-hadoop3/bin/../logs/hbase-root-regionserver-vvml-yz-hbase-test.172.18.4.86.out
node4: running master, logging to /root/hbase-2.5.7-hadoop3/bin/../logs/hbase-root-master-vvml-yz-hbase-test.172.18.4.86.out
[root@vvml-yz-hbase-test~]#
2. Check the HBase processes
jps shows the HMaster and HRegionServer processes:
# node1
[root@vvml-yz-hbase-test~]#jps
578 NameNode
32724 JournalNode
9621 QuorumPeerMain
3654 HMaster
15563 ResourceManager
13645 JobHistoryServer
32079 DFSZKFailoverController
4367 Jps
[root@vvml-yz-hbase-test~]#
# node2
[root@vvml-yz-hbase-test~]#jps
1249 DataNode
17219 NodeManager
4291 HRegionServer
1925 JournalNode
4969 Jps
15007 QuorumPeerMain
[root@vvml-yz-hbase-test~]#
# node3
[root@vvml-yz-hbase-test~]#jps
5316 QuorumPeerMain
12452 DataNode
13144 JournalNode
7483 NodeManager
15356 HRegionServer
16030 Jps
[root@vvml-yz-hbase-test~]#
# node4
[root@vvml-yz-hbase-test~]#jps
8352 NodeManager
22480 HRegionServer
19857 NameNode
10531 ResourceManager
23555 Jps
19206 DFSZKFailoverController
22599 HMaster
19116 DataNode
[root@vvml-yz-hbase-test~]#
3. Check the web UI
Web addresses:
http://node1:16010/
http://node4:16010/
As the figure below shows, node1 is the active Master and node4 is the Backup Master.
4. Check the HBase znodes in ZooKeeper
zkCli.sh -server node1:2181
...
[zk: node1:2181(CONNECTED) 0] ls /hbase
[backup-masters, draining, flush-table-proc, hbaseid, master, master-maintenance, meta-region-server, namespace, online-snapshot, rs, running, splitWAL, switch, table]
[zk: node1:2181(CONNECTED) 1]
V. Verify the Installation
1. Check status and run a simple read/write test in hbase shell
# Enter hbase shell
hbase shell
# Check status
status
# Create a test table
create 'test', 'cf'
# List the table
list 'test'
# Show the table structure
describe 'test'
# Insert data
put 'test', 'row1', 'cf:a', 'value1'
put 'test', 'row2', 'cf:b', 'value2'
put 'test', 'row3', 'cf:c', 'value3'
# Full table scan
scan 'test'
# Query by rowkey
get 'test', 'row1'
# Exit hbase shell
exit
Output:
[root@vvml-yz-hbase-test~]#hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.5.7-hadoop3, r6788f98356dd70b4a7ff766ea7a8298e022e7b95, Thu Dec 14 16:16:10 PST 2023
Took 0.0011 seconds
hbase:001:0> status
1 active master, 1 backup masters, 3 servers, 0 dead, 0.6667 average load
Took 0.4587 seconds
hbase:002:0> create 'test', 'cf'
Created table test
Took 0.6557 seconds
=> Hbase::Table - test
hbase:003:0> list 'test'
TABLE
test
1 row(s)
Took 0.0209 seconds
=> ["test"]
hbase:004:0> describe 'test'
Table test is ENABLED
test, {TABLE_ATTRIBUTES => {METADATA => {'hbase.store.file-tracker.impl' => 'DEFAULT'}}}
COLUMN FAMILIES DESCRIPTION
{NAME => 'cf', INDEX_BLOCK_ENCODING => 'NONE', VERSIONS => '1', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', IN_MEMORY => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536 B (64KB)'}
1 row(s)
Quota is disabled
Took 0.1227 seconds
hbase:005:0> put 'test', 'row1', 'cf:a', 'value1'
Took 0.0683 seconds
hbase:006:0> put 'test', 'row2', 'cf:b', 'value2'
Took 0.0053 seconds
hbase:007:0> put 'test', 'row3', 'cf:c', 'value3'
Took 0.0093 seconds
hbase:008:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=2024-03-07T11:20:22.299, value=value1
row2 column=cf:b, timestamp=2024-03-07T11:20:29.270, value=value2
row3 column=cf:c, timestamp=2024-03-07T11:20:33.538, value=value3
3 row(s)
Took 0.0241 seconds
hbase:009:0> get 'test', 'row1'
COLUMN CELL
cf:a timestamp=2024-03-07T11:20:22.299, value=value1
1 row(s)
Took 0.0071 seconds
hbase:010:0> exit
[root@vvml-yz-hbase-test~]#
The output shows one active master, one backup master, and three RegionServers.
2. Automatic failover test
(1) Simulate a failure
# On the active master node (node1 here), kill the HMaster process
jps|grep HMaster|awk '{print $1}'|xargs kill -9
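The one-liner above matches any jps line that contains the text HMaster. A slightly stricter variant, shown here as a sketch, selects only lines whose process name is exactly HMaster. The helper name hmaster_pids is hypothetical, and the -r flag in the usage comment assumes GNU xargs, where it skips running kill when no PID was found.

```shell
#!/bin/bash
# Read jps output on stdin and print the PID of every process named exactly HMaster.
hmaster_pids() {
  awk '$2 == "HMaster" { print $1 }'
}

# Usage on the active master node (left as a comment so nothing is killed by accident):
#   jps | hmaster_pids | xargs -r kill -9
```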
(2) Check status
[root@vvml-yz-hbase-test~]#hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.5.7-hadoop3, r6788f98356dd70b4a7ff766ea7a8298e022e7b95, Thu Dec 14 16:16:10 PST 2023
Took 0.0010 seconds
hbase:001:0> status
1 active master, 0 backup masters, 3 servers, 0 dead, 1.0000 average load
Took 0.4730 seconds
hbase:002:0> put 'test', 'row4', 'cf:d', 'value4'
Took 0.1254 seconds
hbase:003:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=2024-03-07T11:20:22.299, value=value1
row2 column=cf:b, timestamp=2024-03-07T11:20:29.270, value=value2
row3 column=cf:c, timestamp=2024-03-07T11:20:33.538, value=value3
row4 column=cf:d, timestamp=2024-03-07T11:26:55.140, value=value4
4 row(s)
Took 0.0244 seconds
hbase:004:0> exit
[root@vvml-yz-hbase-test~]#
Only one active master remains, and reads and writes work normally.
(3) Recover from the failure
# Run on node1
hbase-daemon.sh start master
jps
Output:
[root@vvml-yz-hbase-test~]#hbase-daemon.sh start master
running master, logging to /root/hbase-2.5.7-hadoop3//logs/hbase-root-master-vvml-yz-hbase-test.172.18.4.126.out
[root@vvml-yz-hbase-test~]#jps
578 NameNode
7138 Jps
32724 JournalNode
9621 QuorumPeerMain
15563 ResourceManager
13645 JobHistoryServer
6781 HMaster
32079 DFSZKFailoverController
[root@vvml-yz-hbase-test~]#
(4) Check status
[root@vvml-yz-hbase-test~]#hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.5.7-hadoop3, r6788f98356dd70b4a7ff766ea7a8298e022e7b95, Thu Dec 14 16:16:10 PST 2023
Took 0.0011 seconds
hbase:001:0> status
1 active master, 1 backup masters, 3 servers, 0 dead, 1.0000 average load
Took 5.4624 seconds
hbase:002:0> status 'detailed'
version 2.5.7-hadoop3
0 regionsInTransition
active master: node4:16000 1709780353589
RpcServer.priority.RWQ.Fifo.write.handler=0,queue=0,port=16000: status=Waiting for a call, state=WAITING, startTime=1709781875067, completionTime=-1
RpcServer.priority.RWQ.Fifo.write.handler=1,queue=0,port=16000: status=Waiting for a call, state=WAITING, startTime=1709781875221, completionTime=-1
RpcServer.default.FPBQ.Fifo.handler=199,queue=19,port=16000: status=Servicing call from 172.18.4.126:36280: GetClusterStatus, state=RUNNING, startTime=1709781969214, completionTime=-1
1 backup masters
node1:16000 1709782204055
master coprocessors: []
3 live servers
node2:16020 1709780352218
...
node3:16020 1709780355334
...
node4:16020 1709780353541
...
0 dead servers
Took 0.0275 seconds
=> #
hbase:003:0> exit
[root@vvml-yz-hbase-test~]#
node1 and node4 have now swapped roles: node4 is the active master, node1 is the backup master, and all three RegionServers are healthy.
(5) Fail over again
# Run on node4
jps|grep HMaster|awk '{print $1}'|xargs kill -9
# Check status
[root@vvml-yz-hbase-test~]#hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.5.7-hadoop3, r6788f98356dd70b4a7ff766ea7a8298e022e7b95, Thu Dec 14 16:16:10 PST 2023
Took 0.0011 seconds
hbase:001:0> status
1 active master, 0 backup masters, 3 servers, 0 dead, 1.0000 average load
Took 5.4625 seconds
hbase:002:0> exit
[root@vvml-yz-hbase-test~]#
# Run on node4
hbase-daemon.sh start master
# Check status
[root@vvml-yz-hbase-test~]#hbase shell
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.5.7-hadoop3, r6788f98356dd70b4a7ff766ea7a8298e022e7b95, Thu Dec 14 16:16:10 PST 2023
Took 0.0010 seconds
hbase:001:0> status 'detailed'
version 2.5.7-hadoop3
0 regionsInTransition
active master: node1:16000 1709782204055
RpcServer.priority.RWQ.Fifo.write.handler=0,queue=0,port=16000: status=Waiting for a call, state=WAITING, startTime=1709782709881, completionTime=-1
RpcServer.priority.RWQ.Fifo.write.handler=1,queue=0,port=16000: status=Waiting for a call, state=WAITING, startTime=1709782709895, completionTime=-1
RpcServer.default.FPBQ.Fifo.handler=199,queue=19,port=16000: status=Servicing call from 172.18.4.86:39042: GetClusterStatus, state=RUNNING, startTime=1709782750795, completionTime=-1
1 backup masters
node4:16000 1709782808170
master coprocessors: []
3 live servers
node2:16020 1709780352218
...
node3:16020 1709780355334
...
node4:16020 1709780353541
...
0 dead servers
Took 0.4674 seconds
=> #
hbase:002:0> put 'test', 'row5', 'cf:e', 'value5'
Took 0.1138 seconds
hbase:003:0> scan 'test'
ROW COLUMN+CELL
row1 column=cf:a, timestamp=2024-03-07T11:20:22.299, value=value1
row2 column=cf:b, timestamp=2024-03-07T11:20:29.270, value=value2
row3 column=cf:c, timestamp=2024-03-07T11:20:33.538, value=value3
row4 column=cf:d, timestamp=2024-03-07T11:26:55.140, value=value4
row5 column=cf:e, timestamp=2024-03-07T11:41:18.171, value=value5
5 row(s)
Took 0.0293 seconds
hbase:004:0> exit
[root@vvml-yz-hbase-test~]#
node1 and node4 have swapped roles once more: node1 is the active master, node4 is the backup master, all three RegionServers are healthy, and reads and writes work normally.
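The failover checks in this section read the first line of the status output by eye. For repeated tests, the number of backup masters can be pulled out of that line with a small filter. This is a sketch only: backup_master_count is a hypothetical helper, and the usage comment assumes the non-interactive mode of the shell (hbase shell -n) is available in your installation.

```shell
#!/bin/bash
# Read hbase shell `status` output on stdin and print the number of backup masters,
# taken from a line such as:
#   1 active master, 1 backup masters, 3 servers, 0 dead, 1.0000 average load
backup_master_count() {
  sed -n 's/.*, \([0-9][0-9]*\) backup masters.*/\1/p' | head -n 1
}

# Usage (left as a comment because it needs a running cluster):
#   echo "status" | hbase shell -n 2>/dev/null | backup_master_count
```

A result of 0 right after killing an HMaster and 1 after restarting it corresponds to the transitions shown in the outputs above.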
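References:
- Hadoop 3.x rack awareness mechanism and configuration
- Big data open-source framework environment setup (5): installing and deploying a fully distributed HBase cluster
- Apache HBase ™ Reference Guide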