1. Install Zookeeper
1) Install packages
- zookeeper, zookeeper-server
2) Deploy configuration files
- zoo.cfg (also set myid on each server)
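The relationship between zoo.cfg and myid can be sketched as follows. This is a minimal illustration, not the author's deployment code: the host names are hypothetical, and 2888/3888 are the conventional peer and leader-election ports, assumed here because the post does not show its zoo.cfg. The point is that the number N in each `server.N` line must match the content of the myid file on that host.

```python
# Hypothetical ZooKeeper hosts; replace with the real inventory.
zookeeper_hosts = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]


def server_lines(hosts, peer_port=2888, election_port=3888):
    """Return the server.N lines for zoo.cfg, numbering hosts from 1."""
    return [
        f"server.{i}={host}:{peer_port}:{election_port}"
        for i, host in enumerate(hosts, start=1)
    ]


def myid_for(host, hosts):
    """Return the myid file content for one host; it must equal N in server.N."""
    return str(hosts.index(host) + 1)


for line in server_lines(zookeeper_hosts):
    print(line)
print("myid on zk2:", myid_for("zk2.example.com", zookeeper_hosts))
```

Every server gets the same zoo.cfg, while the myid file differs per host.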
2. Install Hadoop
1) NameNode
- Install packages (hadoop, hadoop-hdfs-namenode)
- Create the NameNode directory (dfs_namenode_name_dir)
2) YARN ResourceManager
- Install packages (hadoop, hadoop-yarn-resourcemanager)
3) JournalNode
- Install package (hadoop-hdfs-journalnode)
- Create the JournalNode directory (dfs_journalnode_edits_dir)
4) ZKFC
- Install package (hadoop-hdfs-zkfc)
5) Common Hadoop packages
- Install packages on all nodes (hadoop, hadoop-hdfs, hadoop-mapreduce)
3. Configure Hadoop
1) Deploy configuration files (for HA)
- core-site.xml
...
<property>
<name>fs.defaultFS</name>
<value>hdfs://{{ dfs_nameservice }}</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>{% for host in groups["zookeeper"] %}{{ host }}:{{ zookeeper_client_port }}{% if not loop.last %},{% endif %}{% endfor %}</value>
</property>
...
- hdfs-site.xml
...
<property>
<name>dfs.nameservices</name>
<value>{{ dfs_nameservice }}</value>
</property>
<property>
<name>dfs.ha.namenodes.{{ dfs_nameservice }}</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.{{ dfs_nameservice }}.nn1</name>
<value>{{ dfs_namenode_rpc_address_nn1 }}:{{ fs_default_port }}</value>
</property>
<property>
<name>dfs.namenode.rpc-address.{{ dfs_nameservice }}.nn2</name>
<value>{{ dfs_namenode_rpc_address_nn2 }}:{{ fs_default_port }}</value>
</property>
<property>
<name>dfs.namenode.http-address.{{ dfs_nameservice }}.nn1</name>
<value>{{ dfs_namenode_rpc_address_nn1 }}:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.{{ dfs_nameservice }}.nn2</name>
<value>{{ dfs_namenode_rpc_address_nn2 }}:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://{% for host in groups["hdfs-journalnode"] %}{{ host }}:8485{% if not loop.last %};{% endif %}{% endfor %}/{{ dfs_nameservice }}</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>{{ dfs_journalnode_edits_dir|join(',') }}</value>
</property>
<property>
<name>dfs.journalnode.http-address</name>
<value>0.0.0.0:8480</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>shell(/etc/hadoop/conf/shellfence.sh $target_address $target_namenodeid)</value>
</property>
...
- yarn-site.xml
...
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>{{ dfs_nameservice }}</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>{{ groups["yarn-resourcemanager"][0] }}</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>{{ groups["yarn-resourcemanager"][1] }}</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>{{ groups['yarn-resourcemanager'][0] }}:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>{{ groups['yarn-resourcemanager'][1] }}:8088</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>{% for host in groups["zookeeper"] %}{{ host }}:{{ zookeeper_client_port }}{% if not loop.last %},{% endif %}{% endfor %}</value>
</property>
...
- hadoop-env.sh, yarn-env.sh, mapred-site.xml, topology.data, etc.
- topology.sh, shellfence.sh
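To check what the Jinja2 loops in the templates above should render, the same strings can be built in plain Python. This is only a sketch: the host names, the client port 2181, and the nameservice name `mycluster` are hypothetical stand-ins for the inventory groups and variables (`zookeeper_client_port`, `dfs_nameservice`) used in the templates.

```python
def zk_quorum(hosts, client_port):
    """Comma-separated host:port list, as used by ha.zookeeper.quorum
    and yarn.resourcemanager.zk-address."""
    return ",".join(f"{host}:{client_port}" for host in hosts)


def qjournal_uri(hosts, nameservice, port=8485):
    """Semicolon-separated JournalNode list wrapped in a qjournal:// URI,
    as used by dfs.namenode.shared.edits.dir."""
    nodes = ";".join(f"{host}:{port}" for host in hosts)
    return f"qjournal://{nodes}/{nameservice}"


# Hypothetical inventory values for illustration only.
zookeeper = ["zk1", "zk2", "zk3"]
journalnodes = ["jn1", "jn2", "jn3"]

print(zk_quorum(zookeeper, 2181))
# zk1:2181,zk2:2181,zk3:2181
print(qjournal_uri(journalnodes, "mycluster"))
# qjournal://jn1:8485;jn2:8485;jn3:8485/mycluster
```

Note the different separators: the ZooKeeper quorum uses commas, while the JournalNode list inside the qjournal URI uses semicolons and has no trailing separator before the nameservice path.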
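2) Initialize and start
- Start the JournalNodes
- Format the NameNode
- Start Zookeeper, then run formatZK
- Start the active NameNode
- Start ZKFC on the active NameNode (start the NameNode first, then ZKFC)
- Run -bootstrapStandby on the standby NameNode, then start the standby NameNode
- Start ZKFC on the standby NameNode
- Start the DataNodes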
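The ordering above can be written down as an explicit plan. The sketch below only prints the steps; it does not run anything. The `hdfs namenode -format`, `hdfs zkfc -formatZK`, and `hdfs namenode -bootstrapStandby` commands are standard HDFS commands, but the `service ...` names are an assumption based on the package names listed earlier (including a `hadoop-hdfs-datanode` service that the post does not mention), and the role labels `nn1`/`nn2` simply follow the NameNode IDs in hdfs-site.xml.

```python
# Ordered bootstrap plan as (target role, command) pairs.
# Service names are assumed from the package names; adjust for your distribution.
STEPS = [
    ("journalnodes", "service hadoop-hdfs-journalnode start"),
    ("nn1", "sudo -u hdfs hdfs namenode -format"),
    ("zookeeper", "service zookeeper-server start"),
    ("nn1", "sudo -u hdfs hdfs zkfc -formatZK"),
    ("nn1", "service hadoop-hdfs-namenode start"),
    ("nn1", "service hadoop-hdfs-zkfc start"),
    ("nn2", "sudo -u hdfs hdfs namenode -bootstrapStandby"),
    ("nn2", "service hadoop-hdfs-namenode start"),
    ("nn2", "service hadoop-hdfs-zkfc start"),
    ("datanodes", "service hadoop-hdfs-datanode start"),
]


def render_plan(steps):
    """Return numbered, human-readable lines for the plan."""
    return [f"{i:02d} [{role}] {cmd}" for i, (role, cmd) in enumerate(steps, 1)]


def index_of(steps, fragment):
    """Position of the first step whose command contains the given fragment."""
    return next(i for i, (_, cmd) in enumerate(steps) if fragment in cmd)


for line in render_plan(STEPS):
    print(line)
```

The JournalNodes come first because formatting the NameNode in an HA setup also initializes the shared edits directory on them. Once everything is up, `hdfs haadmin -getServiceState nn1` (and the same for `nn2`) reports which NameNode is active.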
Reference
gymbombom.github.io/2019/12/11/1-hadoop-HA-cluster-install/