Hadoop Cluster Deployment and Operations Practice (Hadoop 2.7)
Lab environment:
192.168.100.1 node1.robin.com
192.168.100.2 node2.robin.com
192.168.100.3 node3.robin.com
192.168.100.4 node4.robin.com
192.168.100.5 node5.robin.com
192.168.100.6 node6.robin.com
Roles on node1 and node2: NameNode, DFSZKFailoverController; software required: JDK, Hadoop 2.7.1
Roles on node3: ResourceManager; software required: JDK, Hadoop 2.7.1
Roles on node4, node5, and node6: JournalNode, DataNode, NodeManager, QuorumPeerMain; software required: JDK, Hadoop 2.7.1, ZooKeeper 3.4.6
1. Configure local hostname resolution:
[root@node1 ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.100.1 node1.robin.com
192.168.100.2 node2.robin.com
192.168.100.3 node3.robin.com
192.168.100.4 node4.robin.com
192.168.100.5 node5.robin.com
192.168.100.6 node6.robin.com
[root@node1 ~]# for ((x=1;x<=6;x++));do scp /etc/hosts node$x.robin.com:/etc/ ; done
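A quick resolution check before moving on; a minimal sketch, run on node1 (getent consults /etc/hosts through NSS, so it verifies exactly what the cluster will use):

for ((x=1;x<=6;x++)); do
    # print the resolved address, or flag the name that failed to resolve
    getent hosts node$x.robin.com || echo "node$x.robin.com did not resolve"
done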
2. Install the JDK
[root@node1 ~]# for ((x=1;x<=6;x++));do scp jdk-7u45-linux-x64.rpm node$x.robin.com:/root/ ; done
On every node, install the package with rpm -ivh jdk-7u45-linux-x64.rpm; a loop that drives this from node1 is sketched below.
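A minimal sketch of that loop (each ssh prompts for the root password, since root-level SSH trust is never configured in this walkthrough):

for ((x=1;x<=6;x++)); do
    # install the JDK RPM that was copied to /root above
    ssh node$x.robin.com "rpm -ivh /root/jdk-7u45-linux-x64.rpm"
done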
Edit the /etc/profile file as follows:
[root@node1 ~]# tail -3 /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_45
export HADOOP_HOME=/opt/hadoop
export PATH=$JAVA_HOME/jre/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
[root@node1 ~]# for((x=1;x<=6;x++));do scp /etc/profile node$x.robin.com:/etc/ ; done
Run source /etc/profile on every node, then confirm the new JDK version with java -version.
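This check can be looped as well; a sketch along the same lines (the explicit source matters because non-interactive ssh sessions do not read /etc/profile, and java -version prints to stderr):

for ((x=1;x<=6;x++)); do
    echo "== node$x =="
    ssh node$x.robin.com "source /etc/profile; java -version"
done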
3. Create the hadoop user and configure SSH mutual trust
Create a hadoop user on every node, as follows:
useradd hadoop
echo "redhat" | passwd --stdin hadoop
Then operate on node1 as follows:
[root@node1 ~]# su - hadoop
[hadoop@node1 ~]$ ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
41:1a:d7:b1:df:bc:1d:bc:8e:ec:f9:ef:c5:48:f8:90 hadoop@node1.robin.com
The key's randomart image is:
+--[ RSA 2048]----+
| . o... |
| = .. |
| . . . |
| . . =. |
| S E +o |
| + =o|
| +.+|
| . + .|
| .=.++|
+-----------------+
[hadoop@node1 ~]$ ssh-copy-id -i node1.robin.com
The authenticity of host 'robin (192.168.1.1)' can't be established.
RSA key fingerprint is 44:69:99:88:ac:45:67:7c:fe:95:b0:93:7e:af:38:4d.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'robin,192.168.1.1' (RSA) to the list of known hosts.
hadoop@robin's password:
Now try logging into the machine, with "ssh 'robin'", and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
[hadoop@node1 ~]$
[hadoop@node1 ~]$ for((x=2;x<=6;x++));do scp -r .ssh node$x.robin.com:~ ; done
The authenticity of host 'node2 (192.168.1.2)' can't be established.
RSA key fingerprint is a7:24:ed:2e:56:5f:5c:f7:f4:fe:c0:ee:ef:51:a1:2d.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node2,192.168.1.2' (RSA) to the list of known hosts.
hadoop@node2's password:
id_rsa.pub 100% 407 0.4KB/s 00:00
known_hosts 100% 799 0.8KB/s 00:00
authorized_keys 100% 407 0.4KB/s 00:00
id_rsa 100% 1675 1.6KB/s 00:00
The authenticity of host 'node3 (192.168.1.3)' can't be established.
RSA key fingerprint is 00:38:94:de:68:83:5e:48:77:83:e0:7d:14:33:a1:91.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node3,192.168.1.3' (RSA) to the list of known hosts.
hadoop@node3's password:
id_rsa.pub 100% 407 0.4KB/s 00:00
known_hosts 100% 1198 1.2KB/s 00:00
authorized_keys 100% 407 0.4KB/s 00:00
id_rsa 100% 1675 1.6KB/s 00:00
The authenticity of host 'node4 (192.168.1.4)' can't be established.
RSA key fingerprint is 84:9a:aa:db:b2:2c:38:bb:5f:32:61:b5:e8:c3:9e:8a.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node4,192.168.1.4' (RSA) to the list of known hosts.
hadoop@node4's password:
id_rsa.pub 100% 407 0.4KB/s 00:00
known_hosts 100% 1597 1.6KB/s 00:00
authorized_keys 100% 407 0.4KB/s 00:00
id_rsa 100% 1675 1.6KB/s 00:00
The authenticity of host 'node5 (192.168.1.5)' can't be established.
RSA key fingerprint is e2:6a:3f:08:2b:9b:af:39:54:ff:47:5f:a9:ee:af:06.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node5,192.168.1.5' (RSA) to the list of known hosts.
hadoop@node5's password:
id_rsa.pub 100% 407 0.4KB/s 00:00
known_hosts 100% 1996 2.0KB/s 00:00
authorized_keys 100% 407 0.4KB/s 00:00
id_rsa 100% 1675 1.6KB/s 00:00
The authenticity of host 'node6 (192.168.1.6)' can't be established.
RSA key fingerprint is 9d:27:25:89:50:cd:a3:53:b1:0b:56:d0:cd:7d:eb:ae.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'node6,192.168.1.6' (RSA) to the list of known hosts.
hadoop@node6's password:
id_rsa.pub 100% 407 0.4KB/s 00:00
known_hosts 100% 2395 2.3KB/s 00:00
authorized_keys 100% 407 0.4KB/s 00:00
id_rsa 100% 1675 1.6KB/s 00:00
[hadoop@node1 ~]$
SSH between the nodes in both directions to confirm that the trust works.
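One way to test the mesh without manual logins; a sketch to run as the hadoop user on each node in turn (-o StrictHostKeyChecking=no merely pre-accepts host keys for the test; every hostname should print without a password prompt):

for ((x=1;x<=6;x++)); do
    ssh -o StrictHostKeyChecking=no node$x.robin.com hostname
done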
4. Configure the ZooKeeper cluster
[root@node1 ~]# for((x=4;x<=6;x++));do scp zookeeper-3.4.6.tar.gz node$x.robin.com:/tmp ;done
root@node4's password:
zookeeper-3.4.6.tar.gz 100% 17MB 16.9MB/s 00:01
root@node5's password:
zookeeper-3.4.6.tar.gz 100% 17MB 16.9MB/s 00:00
root@node6's password:
zookeeper-3.4.6.tar.gz 100% 17MB 16.9MB/s 00:01
[root@node1 ~]#
[root@node5 ~]# chown hadoop.hadoop /opt
[root@node5 ~]# su - hadoop
[hadoop@node5 ~]$ tar xfz /tmp/zookeeper-3.4.6.tar.gz -C /opt/
[hadoop@node5 ~]$
[hadoop@node5 ~]$ cd /opt/
[hadoop@node5 opt]$ ls
rh zookeeper-3.4.6
[hadoop@node5 opt]$ mv zookeeper{-3.4.6,}
[hadoop@node5 opt]$ ls
rh zookeeper
[hadoop@node5 opt]$ ls zookeeper/conf/
configuration.xsl log4j.properties zoo_sample.cfg
[hadoop@node5 opt]$ cp zookeeper/conf/zoo{_sample,}.cfg
[hadoop@node5 opt]$ vim zookeeper/conf/zoo.cfg
[hadoop@node5 opt]$ grep -P -v "^($|#)" zookeeper/conf/zoo.cfg
tickTime=2000
The heartbeat interval, in milliseconds, between ZooKeeper servers or between clients and servers; one heartbeat is sent every tickTime.
initLimit=10
The maximum number of heartbeat intervals that servers in the ensemble (the "clients" here are not user clients, but the Follower servers connecting to the Leader) may take to complete the initial connection. If no reply has arrived after 10 heartbeats (i.e. tickTimes), the connection is considered failed; the total allowance is 10*2000 ms = 20 seconds.
syncLimit=5
The maximum number of tickTime intervals a request/response exchange between the Leader and a Follower may take; the total allowance is 5*2000 ms = 10 seconds.
dataDir=/opt/zookeeper/data
The directory where ZooKeeper stores its data; by default the transaction logs are written here as well.
clientPort=2181
The port ZooKeeper listens on for client connections.
server.1=node4.robin.com:2888:3888
server.2=node5.robin.com:2888:3888
server.3=node6.robin.com:2888:3888
server.A=B:C:D: A is a number identifying the server; B is its hostname or IP address; C is the port this server uses to exchange data with the ensemble Leader; D is the port used for leader election should the current Leader fail. In a pseudo-cluster configuration, B is the same for every instance, so each ZooKeeper instance must be assigned distinct communication and election ports.
[hadoop@node5 opt]$ mkdir /opt/zookeeper/data
[hadoop@node5 opt]$ echo 2 > /opt/zookeeper/data/myid    # write this server's number (matches server.2 above)
[root@node4 ~]# chown hadoop.hadoop /opt
[root@node6 ~]# chown hadoop.hadoop /opt
[hadoop@node5 opt]$ scp -r /opt/zookeeper node4.robin.com:/opt/
[hadoop@node5 opt]$ scp -r /opt/zookeeper node6.robin.com:/opt/
[root@node4 ~]# su - hadoop
[hadoop@node4 ~]$ echo 1 > /opt/zookeeper/data/myid
[root@node6 ~]# su - hadoop
[hadoop@node6 ~]$ echo 3 > /opt/zookeeper/data/myid
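Equivalently, the three myid files can be written from one host; a minimal sketch, assuming the hadoop user's SSH trust from step 3 is in place (node4 gets 1, node5 gets 2, node6 gets 3, matching the server.N lines in zoo.cfg):

for x in 4 5 6; do
    # $((x-3)) maps node4/5/6 to ids 1/2/3
    ssh node$x.robin.com "echo $((x-3)) > /opt/zookeeper/data/myid"
done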
[hadoop@node4 ~]$ /opt/zookeeper/bin/zkServer.sh start
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@node4 ~]$ /opt/zookeeper/bin/zkServer.sh status
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Mode: follower
[hadoop@node4 ~]$
[hadoop@node5 opt]$ /opt/zookeeper/bin/zkServer.sh start
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@node5 opt]$ /opt/zookeeper/bin/zkServer.sh status
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Mode: leader
[hadoop@node5 opt]$
[hadoop@node6 ~]$ /opt/zookeeper/bin/zkServer.sh start
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
[hadoop@node6 ~]$ /opt/zookeeper/bin/zkServer.sh status
JMX enabled by default
Using config: /opt/zookeeper/bin/../conf/zoo.cfg
Mode: follower
[hadoop@node6 ~]$
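The whole ensemble can also be polled from any one host with ZooKeeper's four-letter-word commands; a quick sketch, assuming nc (netcat) is installed (a healthy 3.4.x server answers ruok with imok):

for x in 4 5 6; do
    echo -n "node$x: "
    echo ruok | nc node$x.robin.com 2181; echo
done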
5. Configure Hadoop
1. First, extract and configure Hadoop on node1:
[root@node1 ~]# chown hadoop.hadoop /opt
[root@node1 ~]# su - hadoop
[hadoop@node1 ~]$ tar xfz /tmp/hadoop-2.7.1.tar.gz -C /opt
[hadoop@node1 ~]$ mv /opt/hadoop{-2.7.1,}
[hadoop@node1 ~]$ ls /opt/hadoop/
bin include libexec NOTICE.txt sbin
etc lib LICENSE.txt README.txt share
[hadoop@node1 ~]$ cd /opt/hadoop/etc/hadoop/    <the configuration files that need editing are listed below>
[hadoop@node1 hadoop]$ ls
core-site.xml  hdfs-site.xml             slaves
hadoop-env.sh  mapred-site.xml.template  yarn-site.xml
[hadoop@node1 hadoop]$
[hadoop@node1 hadoop]$ vim hadoop-env.sh    <set the JAVA_HOME path in the Hadoop runtime environment file>
[hadoop@node1 hadoop]$ grep "^export JAVA_HOME" hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_45
[hadoop@node1 hadoop]$ vim core-site.xml    <define the cluster-wide nameservice and point at the ZooKeeper ensemble>
[hadoop@node1 hadoop]$ tail -17 core-site.xml
<configuration>
<!-- Set the HDFS nameservice to ns1 -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<!-- Hadoop temporary directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/tmp</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
<name>ha.zookeeper.quorum</name>
<value>node4.robin.com:2181,node5.robin.com:2181,node6.robin.com:2181</value>
</property>
</configuration>
[hadoop@node1 hadoop]$ vim hdfs-site.xml    <set the HDFS-related properties>
[hadoop@node1 hadoop]$ tail -62 hdfs-site.xml
<configuration>
<!-- The HDFS nameservice is ns1; must match core-site.xml -->
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<!-- ns1 has two NameNodes: nn1 and nn2 -->
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<!-- RPC address of nn1 -->
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>node1.robin.com:9000</value>
</property>
<!-- HTTP address of nn1 -->
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>node1.robin.com:50070</value>
</property>
<!-- RPC address of nn2 -->
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>node2.robin.com:9000</value>
</property>
<!-- HTTP address of nn2 -->
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>node2.robin.com:50070</value>
</property>
<!-- Where the NameNode shared edits are stored on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://node4.robin.com:8485;node5.robin.com:8485;node6.robin.com:8485/ns1</value>
</property>
<!-- Where each JournalNode stores its data on local disk -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/hadoop/journal</value>
</property>
<!-- Enable automatic NameNode failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Failover proxy provider implementation -->
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing method -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<!-- sshfence requires passwordless SSH -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
</configuration>
[hadoop@node1 hadoop]$
[hadoop@node1 hadoop]$ vim slaves    <list the DataNode hosts>
[hadoop@node1 hadoop]$ cat slaves
node4.robin.com
node5.robin.com
node6.robin.com
[hadoop@node1 hadoop]$
[hadoop@node1 hadoop]$ cp mapred-site.xml{.template,}
[hadoop@node1 hadoop]$ vim mapred-site.xml    <run MapReduce on the YARN framework>
[hadoop@node1 hadoop]$ tail -7 mapred-site.xml
<configuration>
<!-- Run the MR framework on YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
[hadoop@node1 hadoop]$
[hadoop@node1 hadoop]$ vim yarn-site.xml    <set the YARN ResourceManager node>
[hadoop@node1 hadoop]$ tail -13 yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<!-- ResourceManager address -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>node3.robin.com</value>
</property>
<!-- Auxiliary service the NodeManager loads: the shuffle service -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
[hadoop@node1 hadoop]$
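With the files in place, the effective values can be spot-checked before distributing anything; a quick sketch using hdfs getconf (available because $HADOOP_HOME/bin is on the PATH from the /etc/profile edit above). Each command should echo back the value configured earlier:

[hadoop@node1 hadoop]$ hdfs getconf -confKey fs.defaultFS
hdfs://ns1
[hadoop@node1 hadoop]$ hdfs getconf -confKey dfs.ha.namenodes.ns1
nn1,nn2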
Next, make sure the hadoop user has write permission on /opt on every node:
[root@node2 ~]# chown hadoop.hadoop /opt/
[root@node2 ~]# ll /opt/ -d