Hadoop Cluster Setup Process

by 잇서니, April 1, 2021

1. Installing ZooKeeper

1) Install packages

zookeeper, zookeeper-server

2) Deploy the configuration file

- zoo.cfg (and set myid on each server)
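
As a rough illustration of how zoo.cfg and myid relate: each `server.N` line in zoo.cfg names one ensemble member, and the `myid` file in that member's data directory holds the matching number N. The sketch below uses hypothetical helper functions, not the author's deployment code. The hostnames are made up, and the ports 2888/3888 (the conventional peer and leader-election ports) are assumptions.

```python
def zoo_cfg_server_lines(hosts, peer_port=2888, election_port=3888):
    """Return the server.N lines for zoo.cfg, numbering hosts from 1."""
    return [
        f"server.{i}={host}:{peer_port}:{election_port}"
        for i, host in enumerate(hosts, start=1)
    ]


def myid_for(host, hosts):
    """Return the myid file content for one host: its 1-based position."""
    return str(hosts.index(host) + 1)


# Hypothetical hostnames, for illustration only.
example_hosts = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]
print("\n".join(zoo_cfg_server_lines(example_hosts)))
print("myid on zk2:", myid_for("zk2.example.com", example_hosts))
```

The key point is that the number in `myid` must agree with the `server.N` entry for the same machine, so deriving both from one ordered host list avoids mismatches.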

 

2. Installing Hadoop

1) NameNode

- Install packages (hadoop, hadoop-hdfs-namenode)

- Create the NameNode directory (dfs_namenode_name_dir)

 

2) YARN ResourceManager

- Install packages (hadoop, hadoop-yarn-resourcemanager)

 

3) JournalNode

- Install packages (hadoop-hdfs-journalnode)

- Create the JournalNode edits directory (dfs_journalnode_edits_dir)

 

4) ZKFC

- Install packages (hadoop-hdfs-zkfc)

 

5) Hadoop (all nodes)

- Install packages on every node (hadoop, hadoop-hdfs, hadoop-mapreduce)

 

3. Configuring Hadoop

1) Deploy the configuration files (for HA)

- core-site.xml

...
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://{{ dfs_nameservice }}</value>
    </property>

    <property>
        <name>ha.zookeeper.quorum</name>
        <value>{% for host in groups["zookeeper"] %}{{ host }}:{{ zookeeper_client_port }}{% if not loop.last %},{% endif %}{% endfor %}</value>
    </property>

...
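
The Jinja2 loop in ha.zookeeper.quorum emits host:port for every member of the zookeeper inventory group and puts a comma after each entry except the last. A minimal Python sketch of the same joining logic is shown here so the rendered value is easy to picture. The function name, the hostnames, and the port 2181 (ZooKeeper's usual client port, standing in for zookeeper_client_port) are assumptions for illustration.

```python
def zk_quorum(hosts, client_port):
    """Join hosts as host:port pairs separated by commas, with no trailing comma."""
    return ",".join(f"{host}:{client_port}" for host in hosts)


# Hypothetical inventory group and port, for illustration only.
zookeeper_group = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]
print(zk_quorum(zookeeper_group, 2181))
```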

 

- hdfs-site.xml


...
  <property>
      <name>dfs.nameservices</name>
      <value>{{ dfs_nameservice }}</value>
  </property>

  <property>
      <name>dfs.ha.namenodes.{{ dfs_nameservice }}</name>
      <value>nn1,nn2</value>
  </property>

  <property>
      <name>dfs.namenode.rpc-address.{{ dfs_nameservice }}.nn1</name>
      <value>{{ dfs_namenode_rpc_address_nn1 }}:{{ fs_default_port }}</value>
  </property>
  <property>
      <name>dfs.namenode.rpc-address.{{ dfs_nameservice }}.nn2</name>
      <value>{{ dfs_namenode_rpc_address_nn2 }}:{{ fs_default_port }}</value>
  </property>

  <property>
      <name>dfs.namenode.http-address.{{ dfs_nameservice }}.nn1</name>
      <value>{{ dfs_namenode_rpc_address_nn1 }}:50070</value>
  </property>
  <property>
      <name>dfs.namenode.http-address.{{ dfs_nameservice }}.nn2</name>
      <value>{{ dfs_namenode_rpc_address_nn2 }}:50070</value>
  </property>

  <property>
      <name>dfs.namenode.shared.edits.dir</name>
      <value>qjournal://{% for host in groups["hdfs-journalnode"] %}{{ host }}:8485{% if not loop.last %};{% endif %}{% endfor %}/{{ dfs_nameservice }}</value>
  </property>

  <property>
      <name>dfs.journalnode.edits.dir</name>
      <value>{{ dfs_journalnode_edits_dir|join(',') }}</value>
  </property>
  
  <property>
      <name>dfs.journalnode.http-address</name>
      <value>0.0.0.0:8480</value>
  </property>

  <property>
      <name>dfs.ha.fencing.methods</name>
      <value>shell(/etc/hadoop/conf/shellfence.sh $target_address $target_namenodeid)</value>
  </property>
...
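
The dfs.namenode.shared.edits.dir value follows the pattern qjournal://host1:port;host2:port;.../nameservice, with semicolons between JournalNodes. The following Python sketch builds that string the same way the template loop does. The function name, the hostnames, and the nameservice "mycluster" are hypothetical; port 8485 is taken from the template above.

```python
def qjournal_uri(journal_hosts, nameservice, port=8485):
    """Build a qjournal:// URI listing every JournalNode, separated by ';'."""
    host_list = ";".join(f"{host}:{port}" for host in journal_hosts)
    return f"qjournal://{host_list}/{nameservice}"


# Hypothetical JournalNode hosts and nameservice, for illustration only.
journal_group = ["jn1.example.com", "jn2.example.com", "jn3.example.com"]
print(qjournal_uri(journal_group, "mycluster"))
```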

 

- yarn-site.xml

...
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>{{ dfs_nameservice }}</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>{{ groups["yarn-resourcemanager"][0] }}</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>{{ groups["yarn-resourcemanager"][1] }}</value>
    </property>

    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>{{ groups['yarn-resourcemanager'][0] }}:8088</value>
    </property>

    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>{{ groups['yarn-resourcemanager'][1] }}:8088</value>
    </property>

    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>{% for host in groups["zookeeper"] %}{{ host }}:{{ zookeeper_client_port }}{% if not loop.last %},{% endif %}{% endfor %}</value>
    </property>

...

 

- hadoop-env.sh, yarn-env.sh, mapred-site.xml, topology.data, etc.

- topology.sh, shellfence.sh

 

2) Initialization and startup

- Start the JournalNodes

- Format the NameNode

- Start ZooKeeper, then run formatZK

- Start the active NameNode

- Start ZKFC (start the NameNode first, then ZKFC)

- Run -bootstrapStandby on the standby NameNode, then start the standby NameNode

- Start ZKFC

- Start the DataNodes
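
The sequence above can be written down as an ordered plan, which makes the dependencies easier to check (for example, that each ZKFC starts only after its NameNode). This is a sketch under stated assumptions, not the author's actual playbook. The hdfs subcommands (`hdfs namenode -format`, `hdfs zkfc -formatZK`, `hdfs namenode -bootstrapStandby`) are standard Hadoop commands. The `service ... start` names are guessed from the package names in this post, the hadoop-hdfs-datanode service is not mentioned in the post at all, and the host on which formatZK runs and the reading of the second ZKFC step as the standby host are also assumptions.

```python
# Each step is (host role, command). Service names are assumptions
# derived from the package names listed earlier in this post.
STARTUP_STEPS = [
    ("journalnode", "service hadoop-hdfs-journalnode start"),
    ("active namenode", "hdfs namenode -format"),
    ("zookeeper", "service zookeeper-server start"),
    ("active namenode", "hdfs zkfc -formatZK"),  # host choice is an assumption
    ("active namenode", "service hadoop-hdfs-namenode start"),
    ("active namenode", "service hadoop-hdfs-zkfc start"),
    ("standby namenode", "hdfs namenode -bootstrapStandby"),
    ("standby namenode", "service hadoop-hdfs-namenode start"),
    ("standby namenode", "service hadoop-hdfs-zkfc start"),
    ("datanode", "service hadoop-hdfs-datanode start"),  # assumed service name
]


def format_plan(steps):
    """Return the plan as numbered lines; nothing is executed."""
    return [
        f"{number:2d}. [{role}] {command}"
        for number, (role, command) in enumerate(steps, start=1)
    ]


print("\n".join(format_plan(STARTUP_STEPS)))
```

Printing the plan rather than executing it keeps the sketch safe to run anywhere. Formatting a NameNode is destructive, so the real commands should only be run by hand or by a reviewed automation tool.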

 

 

Reference link

gymbombom.github.io/2019/12/11/1-hadoop-HA-cluster-install/
