[AWS EMR Hadoop Hands-on] Installation

Hyunjun Kim · August 14, 2025

Data_Engineering


If the cost of EMR is a concern and you only want to test functionality, you can do that with Hadoop in standalone mode. The official Hadoop manual describes how to set up a single-node cluster.

1 Installation

The goal of this exercise is to set up a cluster for Hadoop practice. It is not a guide to operating a cluster, so the settings favor convenience for the exercises.

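1.1 Prerequisites

1.1.1 Creating a MySQL RDS instance for HA (multi primary)

If you are not going to use EMR Multi Primary or Hue, you do not need to create a separate MySQL.

It is required, however, for an HA Hadoop ecosystem.

If you already have a separate MySQL server, you do not have to use RDS.

If you use RDS

For a small MySQL instance for practice: select free tier, then move to standard
Required setting: public accessible Yes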

1.1.2 Creating the MySQL database and user for HA (multi primary)

Connect with the mysql CLI.

Create the database for Hive.

create database hive;
show databases;
create user 'hive' identified by 'hive';
grant all privileges on hive.* to 'hive'@'%';

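1.2 Cluster creation settings

In the AWS console, go to EMR > Create cluster and proceed.

1.2.1 Selecting packages

The following packages were used in this exercise. Slightly different versions should not change the basic functionality much, but some differences are possible.

Compatibility and successful installation have not been verified for any additional packages.
If you need an EMR cluster with additional packages and a problem occurs during installation, debugging can take a long time.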

EMR version: emr-6.8.0

HDFS is installed as Hadoop 3.2.1-amzn-8.

Packages

Flink 1.15.1
Ganglia 3.7.2
HBase 2.4.12
HCatalog 3.1.3
Hadoop 3.2.1
Hive 3.1.3
JupyterHub 1.4.1
Livy 0.7.1
Spark 3.3.0
Tez 0.9.2
Trino 388
ZooKeeper 3.5.10

1.2.2 Instance Group

The exercise uses the m5.xlarge instance type (4 vCore, 16 GiB memory, EBS only storage).

A smaller type also works, but since many packages are installed, expect it to be slower.

EBS Root Volume: 50 GB

Later exercises practice working within a fixed resource pool, so no Task nodes were configured.

The value of dfs.replication in hdfs-site.xml depends on the number of Core nodes. See the manual.
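
To make that dependency concrete, here is a minimal sketch of the default mapping as described in the EMR HDFS configuration documentation for releases of this era: 1 replica for up to three core nodes, 2 for four to nine, and 3 for ten or more. The function name `default_dfs_replication` is hypothetical, and the thresholds are an assumption that should be checked against the manual for your EMR release.

```python
def default_dfs_replication(core_nodes: int) -> int:
    """Return the assumed EMR default for dfs.replication.

    The thresholds (<=3 -> 1, 4..9 -> 2, >=10 -> 3) are an assumption taken
    from the EMR documentation; verify them for your release.
    """
    if core_nodes < 1:
        raise ValueError("a cluster needs at least one core node")
    if core_nodes <= 3:
        return 1
    if core_nodes <= 9:
        return 2
    return 3


if __name__ == "__main__":
    # Print the assumed default for a few cluster sizes.
    for n in (1, 3, 4, 9, 10):
        print(n, default_dfs_replication(n))
```

With a small practice cluster the default is likely to be a single replica, so losing a core node can mean losing HDFS blocks.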

1.2.3 Cluster scaling and provisioning option

Set Cluster Size Manually

1.2.4 Networking

The subnet must have Enable auto-assign public IPv4 address turned on.
Primary security group: EMR managed security group
Core and task nodes: EMR managed security group
EMR Managed Security Group manual

1.2.5 Security

Service role for Amazon EMR: EMR_DefaultRole is recommended.
- Manual
- Even if you customize it, do so by adding to the default rather than starting from scratch.
IAM role for instance profile: EMR_EC2_DefaultRole is recommended.
- Manual
- Even if you customize it, do so by adding to the default rather than starting from scratch.

💡 If EMR_DefaultRole does not appear, create it with the following steps.

Select IAM in the AWS Console
Roles
Click the Create Role button
Step 1: Select trusted entity
1. For Type, select AWS service
2. Under Use case, search for EMR and select EMR
3. Next
Step 2: Add Permission: proceed with AmazonElasticMapReduceRole, which is selected by default
Step 3: Name, review, and create: set a name and click Next

You can then select the role you created during the EMR creation step.

💡 If EMR_EC2_DefaultRole does not appear, create it with the following steps.

Follow the same first three steps as when creating EMR_DefaultRole above.
Step 1: Select trusted entity
1. For Type, select AWS service
2. Under Use case, search for EMR and select EMR Role for EC2
3. Next
Step 2: Add Permission: proceed with AmazonElasticMapReduceforEC2Role, which is selected by default
Step 3: Name, review, and create: set a name and click Next
Open the settings page of the role you created and add permissions
1. Click Permissions > Permission policies > Add permissions > Attach policies
2. Under Other permission policies, search for EMR and add the following policies
  1. AmazonElasticMapReduceFullAccess
  2. AmazonElasticMapReduceforAutoScalingRole
3. Click Add permissions

You can then select the role you created during the EMR creation step.

1.2.6 Cluster Logs

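Storing cluster logs in S3 lets you check them easily without connecting to each EC2 instance. The logs also help you find the cause when the cluster terminates because of an unexpected error. If instances are scaled in automatically or terminated and their logs cannot be viewed, troubleshooting becomes difficult.

Create a log directory for EMR and attach it.

A subdirectory with a unique name is created for each cluster, and the logs accumulate under it.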

1.2.7 EMR Configuration Override

When you configure Multi Primary (Master) for HA, the metadata database must be external. Only MySQL is supported.

Set up a separate MySQL, follow the section above on creating the MySQL database and user for HA (multi primary), and then add the settings below.

hive-site configuration manual

ℹ️ The jupyter settings below are optional, so they do not appear in the lecture video. EMR's JupyterHub uses sparkmagic internally, and you can configure the auth information that sparkmagic uses.

jupyter configuration manual (optional)

base64 encoding for the jupyter password

Enter the password you want at https://www.base64encode.org/ and click encode.
Use the result as the jupyter password value.
The exercise uses username: jupyter, password: anVweXRlcg== (the encoded value of jupyter).
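
If you would rather not paste a password into a third-party website, the same encoding can be done locally. The sketch below uses only the Python standard library; the helper name `encode_password` is hypothetical.

```python
import base64


def encode_password(plain: str) -> str:
    """Base64-encode a password for use as base64_password."""
    return base64.b64encode(plain.encode("utf-8")).decode("ascii")


if __name__ == "__main__":
    # "jupyter" encodes to the value used in this exercise.
    print(encode_password("jupyter"))  # anVweXRlcg==
```

Keep in mind that base64 is an encoding, not encryption: anyone who can read the configuration can decode the password.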

❗ Note that the user password for logging in to the JupyterHub web UI after installation is separate. [Manual](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-jupyterhub-user-access.html)

user: jovyan
password: jupyter

The settings below reflect the manuals above. Add them under the Software settings - optional menu.

If you use the AWS CLI, save the settings below as a JSON file and pass it as a parameter, for example --configurations file://$yourconfig.json.
Replace $your_rds_hostname with your own MySQL host name.

[
  {
    "Classification": "hive-site",
    "Properties": {
      "javax.jdo.option.ConnectionDriverName": "org.mariadb.jdbc.Driver",
      "javax.jdo.option.ConnectionPassword": "hive",
      "javax.jdo.option.ConnectionURL": "jdbc:mysql://$your_rds_hostname:3306/hive?createDatabaseIfNotExist=true",
      "javax.jdo.option.ConnectionUserName": "hive"
    }
  },
  {
    "Classification": "jupyter-sparkmagic-conf",
    "Properties": {
      "kernel_python_credentials": "{\"username\":\"jupyter\",\"base64_password\":\"anVweXRlcg==\",\"url\":\"http:\/\/localhost:8998\",\"auth\":\"None\"}"
    }
  }
]

$your_rds_hostname must be changed.
The username: jupyter and password: anVweXRlcg== (the encoded value of jupyter) inside jupyter's kernel_python_credentials (optional, not required) can be changed separately.
Note that the driver used by oozie and hive is the MariaDB JDBC driver. (With the MySQL driver you get class not found.)
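
Editing the escaped kernel_python_credentials string by hand is error-prone, so one option is to generate the file. The sketch below rebuilds the same two classifications in Python and lets `json.dumps` handle the nested escaping. The function name `build_configurations` and the sample host name are hypothetical; the property keys and values are the ones shown above.

```python
import json


def build_configurations(rds_hostname: str,
                         hive_user: str = "hive",
                         hive_password: str = "hive") -> list:
    """Build the EMR configuration override as a Python list."""
    # sparkmagic credentials; stored as a JSON string inside the property value.
    credentials = {
        "username": "jupyter",
        "base64_password": "anVweXRlcg==",
        "url": "http://localhost:8998",
        "auth": "None",
    }
    return [
        {
            "Classification": "hive-site",
            "Properties": {
                # MariaDB JDBC driver, as noted in the text.
                "javax.jdo.option.ConnectionDriverName": "org.mariadb.jdbc.Driver",
                "javax.jdo.option.ConnectionUserName": hive_user,
                "javax.jdo.option.ConnectionPassword": hive_password,
                "javax.jdo.option.ConnectionURL": (
                    f"jdbc:mysql://{rds_hostname}:3306/hive"
                    "?createDatabaseIfNotExist=true"
                ),
            },
        },
        {
            "Classification": "jupyter-sparkmagic-conf",
            "Properties": {
                "kernel_python_credentials": json.dumps(credentials),
            },
        },
    ]


if __name__ == "__main__":
    # "my-rds.example.com" is a placeholder host name, not a real endpoint.
    print(json.dumps(build_configurations("my-rds.example.com"), indent=2))
```

Redirecting the printed output to a file gives you something you can paste into the console or pass to the AWS CLI.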

If you need settings other than hive, add each one as a separate JSON object inside the top-level array.

[
  {
    "Classification": "hive-site",
    "Properties": {
      "<property-name>": "<value>"
    }
  },
  {
    "Classification": "oozie-site",
    "Properties": {
      "<property-name>": "<value>"
    }
  }
]

Classification names and the items under Properties differ for each software package, so be sure to check the manual.

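1.3 EMR cluster creation complete

1.3.1 Checking status and the Primary address

EMR creates the EC2 instances itself during cluster creation, so check them from the dashboard as follows.

When Status is waiting, the cluster is ready to use.
- It shows waiting because no job has been submitted.
Click view primary node public DNS to find the primary node.

1.3.2 Connecting to the Primary node terminal over SSH

For details, follow the manual for your environment.

Use the EC2 key file (pem) you specified when creating the cluster.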

ssh -i $your_key_file hadoop@$your_primary_node

If the following hdfs commands run successfully, the installation worked.

hdfs --help
hdfs version
hdfs dfs -help
hdfs dfs -ls /

1.3.3 Checking the active master node

The following command shows the active HDFS namenode.

hdfs commands work even on a node that is not active.

hdfs haadmin -getAllServiceState

The following command shows the active YARN resource manager.

yarn rmadmin -getAllServiceState

yarn commands work even on a node that is not active.

1.3.4 Hadoop configuration files on the cluster

You can find the configuration files at the following path on the Primary node.

ls -al /etc/hadoop/conf/

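1.4 If EMR cluster creation fails midway

1.4.1 Checking events

The EMR cluster in question > Events

Open this menu to see the order of events and the reason for the failure.

1.4.2 Reloading the settings of an existing or failed EMR cluster

If you want to keep the settings you originally tried and change only the part that failed:

Actions > Clone Cluster keeps the same settings and takes you to the EMR creation console.

If you want to see the settings expressed as an AWS CLI command:

Actions > View command for cloning cluster

❗ When cloning, password settings are masked (*). You have to fix them manually after cloning.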

1.4.3 Error: The "map public IP on launch" feature for subnet $your_subnet in VPC $your_vpc must be enabled in order to launch a multi-master cluster in a public subnet.

❗ The "map public IP on launch" feature for subnet …
If you hit the error above, change the settings of the selected subnet. Enable the following two settings.
Enable auto-assign public IPv4 address
Enable resource name DNS A record on launch

1.4.4 BOOTSTRAP_FAILED

Manual

Check the logs under the bootstrap-actions directory.
1. There are several instances under s3://emr-log-location/example-cluster-id/node. Look in the path that contains a bootstrap-actions directory.
2. In that log, check which bootstrap action number failed.
If the bootstrap action identified in the steps above is not one you configured, check the puppet log as follows.
- Direct access on EC2: the log is on disk at /var/log/provision-node/apps-phase/0/{UUID}/puppet.log
- (If you configured logs to be pushed to S3) Access via S3: s3://example-log-location/example-cluster-id/node/example-instance-id/provision-node/apps-phase/0/{UUID}/puppet.log.gz
If you cannot find an error in the puppet log from the steps above, check the bootstrap action logs and puppet logs under the other instance-ids in the cluster's directory. The error may have occurred on a different instance.
If none of the above works, you have to comb through every stderr log in the cluster's log directory. Log files on S3 are occasionally found truncated.

Case examples

bootstrap-actions

2023-01-07 08:56:15,457 INFO i-08026ffb5574067e2: new instance started
2023-01-07 08:56:20,504 ERROR i-08026ffb5574067e2: failed to start. bootstrap action 1 failed with non-zero exit code.
2023-01-07 08:57:51,195 INFO i-0e1dce635b968dcff: new instance started
2023-01-07 08:57:51,195 INFO i-0e1dce635b968dcff: all bootstrap actions complete and instance ready
2023-01-07 08:57:52,055 INFO i-05bc2ee0e9e775662: new instance started
2023-01-07 08:57:52,055 INFO i-05bc2ee0e9e775662: all bootstrap actions complete and instance ready
2023-01-07 08:57:52,286 INFO i-07e46be496b950d83: new instance started
2023-01-07 08:57:52,286 INFO i-07e46be496b950d83: all bootstrap actions complete and instance ready
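
With many instances it is easy to miss the one ERROR line, so a small filter can help. The sketch below assumes the line layout seen in the excerpt above (timestamp, level, instance id, message) and collects the ERROR messages per instance. The regular expression and the function name `failed_instances` are assumptions for illustration, not part of any EMR tooling.

```python
import re

# Assumed layout: "<date> <time>,<ms> <LEVEL> <instance-id>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) (?P<instance>i-[0-9a-f]+): (?P<message>.*)$"
)


def failed_instances(log_text: str) -> dict:
    """Map each instance id to the ERROR messages logged for it."""
    failures = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and match.group("level") == "ERROR":
            failures.setdefault(match.group("instance"), []).append(
                match.group("message")
            )
    return failures


# Lines copied from the excerpt above.
SAMPLE = """\
2023-01-07 08:56:15,457 INFO i-08026ffb5574067e2: new instance started
2023-01-07 08:56:20,504 ERROR i-08026ffb5574067e2: failed to start. bootstrap action 1 failed with non-zero exit code.
2023-01-07 08:57:51,195 INFO i-0e1dce635b968dcff: new instance started
"""

if __name__ == "__main__":
    for instance, messages in failed_instances(SAMPLE).items():
        print(instance, messages)
```

The instance id it reports tells you which directory under node/ to open next.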

Checking the puppet log

s3://de-lecture/emr-log/j-2AAC2QKY5Y136/node/i-0d8bfde73425748e4/provision-node/apps-phase/0/6e6c91d9-0db5-47cc-ae96-6a63e7aee254/puppet.log.gz puppet.txt
Searching for error turned up the following text.
- Error: Could not connect to the database: java.lang.ClassNotFoundException: com.mysql.jdbc.Driver

In another case, privileges on the hive database in MySQL were set incorrectly, and the following access failure was logged.

2023-01-07 09:03:27 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema] (info): Starting to evaluate the resource (984 of 1138)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): Initializing the schema to: 3.1.0
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): Metastore connection URL:	 jdbc:mysql://de-emr.cgfpdcibdeds.ap-northeast-2.rds.amazonaws.com:3306/hive?createDatabaseIfNotExist=true
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): Metastore Connection Driver :	 org.mariadb.jdbc.Driver
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): Metastore connection User:	 hive
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): Underlying cause: java.sql.SQLSyntaxErrorException : Could not connect to address=(host=de-emr.cgfpdcibdeds.ap-northeast-2.rds.amazonaws.com)(port=3306)(type=master) : Access denied for user 'hive'@'%' to database 'hive'
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): SQL Error code: 1044
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): org.apache.hadoop.hive.metastore.HiveMetaException: Failed to get schema version.
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getConnectionToMetastore(HiveSchemaHelper.java:94)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.apache.hive.beeline.schematool.HiveSchemaTool.getConnectionToMetastore(HiveSchemaTool.java:163)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.apache.hive.beeline.schematool.HiveSchemaTool.testConnectionToMetastore(HiveSchemaTool.java:174)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.apache.hive.beeline.schematool.HiveSchemaToolTaskInit.execute(HiveSchemaToolTaskInit.java:53)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.apache.hive.beeline.schematool.HiveSchemaTool.main(HiveSchemaTool.java:353)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at java.lang.reflect.Method.invoke(Method.java:498)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): Caused by: java.sql.SQLSyntaxErrorException: Could not connect to address=(host=de-emr.cgfpdcibdeds.ap-northeast-2.rds.amazonaws.com)(port=3306)(type=master) : Access denied for user 'hive'@'%' to database 'hive'
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:62)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:192)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1392)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.internal.util.Utils.retrieveProxy(Utils.java:635)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.MariaDbConnection.newConnection(MariaDbConnection.java:150)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.Driver.connect(Driver.java:89)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at java.sql.DriverManager.getConnection(DriverManager.java:664)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at java.sql.DriverManager.getConnection(DriverManager.java:247)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.apache.hadoop.hive.metastore.tools.HiveSchemaHelper.getConnectionToMetastore(HiveSchemaHelper.java:88)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	... 10 more
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): Caused by: java.sql.SQLException: Access denied for user 'hive'@'%' to database 'hive'
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.readErrorPacket(AbstractQueryProtocol.java:1681)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.readPacket(AbstractQueryProtocol.java:1543)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.getResult(AbstractQueryProtocol.java:1506)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.additionalData(AbstractConnectProtocol.java:1136)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.postConnectionQueries(AbstractConnectProtocol.java:890)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.createConnection(AbstractConnectProtocol.java:595)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	at org.mariadb.jdbc.internal.protocol.AbstractConnectProtocol.connectWithoutProxy(AbstractConnectProtocol.java:1387)
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): 	... 16 more
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): *** schemaTool failed ***
2023-01-07 09:05:04 +0000 Puppet (err): '/usr/lib/hive/bin/schematool -dbType mysql -initSchema -verbose' returned 1 instead of one of [0]
2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (err): change from 'notrun' to ['0'] failed: '/usr/lib/hive/bin/schematool -dbType mysql -initSchema -verbose' returned 1 instead of one of [0]
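
Puppet logs like the one above are long, and most lines are stack frames. A minimal sketch for narrowing them down: read a downloaded puppet.log or puppet.log.gz and keep only the lines containing a few marker strings. The marker list and the function names `open_log` and `find_error_lines` are assumptions chosen from the cases in this post; adjust them to your own failure.

```python
import gzip
from typing import Iterable, Iterator, List

# Marker strings taken from the log excerpts in this post (an assumption,
# not an exhaustive list of what puppet can emit).
MARKERS = ("(err)", "Error:", "Caused by:", "Access denied")


def open_log(path: str) -> Iterator[str]:
    """Yield lines from a plain or gzip-compressed log file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            yield line.rstrip("\n")


def find_error_lines(lines: Iterable[str]) -> List[str]:
    """Return only the lines that contain one of the marker strings."""
    return [line for line in lines if any(m in line for m in MARKERS)]


# Lines copied from the excerpt above.
SAMPLE_LINES = [
    "2023-01-07 09:03:27 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema] (info): Starting to evaluate the resource (984 of 1138)",
    "2023-01-07 09:05:04 +0000 /Stage[main]/Hadoop_hive::Init_metastore_schema/Exec[init hive-metastore schema]/returns (notice): Caused by: java.sql.SQLException: Access denied for user 'hive'@'%' to database 'hive'",
    "2023-01-07 09:05:04 +0000 Puppet (err): '/usr/lib/hive/bin/schematool -dbType mysql -initSchema -verbose' returned 1 instead of one of [0]",
]

if __name__ == "__main__":
    for hit in find_error_lines(SAMPLE_LINES):
        print(hit)
```

For a real file, pass `open_log("puppet.log.gz")` to `find_error_lines` after downloading the log from S3.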
