Data processing - Spark SQL
Graph processing - GraphX
Java 8 / Spring Boot (Gradle)
Hadoop 3.2.2
Spark 3.1.3
Notes
Java 8 prior to version 8u92 support is deprecated as of Spark 3.0.0. For the Scala API, Spark 3.1.3 uses Scala 2.12.
https://icefree.tistory.com/entry/Spark-Window-10%EC%97%90-Spark%EC%84%A4%EC%B9%98
// https://mvnrepository.com/artifact/org.apache.spark/spark-sql
compileOnly group: 'org.apache.spark', name: 'spark-sql_2.12', version: '3.1.3'
// https://mvnrepository.com/artifact/org.apache.spark/spark-core
implementation group: 'org.apache.spark', name: 'spark-core_2.12', version: '3.1.3'
Caused by: java.lang.IllegalArgumentException: LoggerFactory is not a Logback LoggerContext but Logback is on the classpath. Either remove Logback or the competing implementation (class org.slf4j.impl.Reload4jLoggerFactory loaded from file:/C:/Users/-----/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-reload4j/1.7.36/db708f7d959dee1857ac524636e85ecf2e1781c1/slf4j-reload4j-1.7.36.jar). If you are using WebLogic you will need to add 'org.slf4j' to prefer-application-packages in WEB-INF/weblogic.xml: org.slf4j.impl.Reload4jLoggerFactory
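// Exclude Logback so that slf4j-reload4j (the binding named in the error above) is the only SLF4J implementation
implementation("org.springframework.boot:spring-boot-starter-web") {
    exclude module: "logback-classic"
}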
Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.SparkSession
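A likely cause of this ClassNotFoundException is the compileOnly scope on spark-sql in the dependency block earlier in these notes: Gradle puts compileOnly dependencies on the compile classpath but leaves them off the runtime classpath, so SparkSession compiles fine and then cannot be found at startup. A minimal sketch of the change, assuming the application is launched directly (from the IDE or bootRun) rather than submitted to a cluster that supplies the Spark jars itself:

```groovy
dependencies {
    // Assumption: Spark runs embedded in the Spring Boot process,
    // so both artifacts must be on the runtime classpath.
    implementation group: 'org.apache.spark', name: 'spark-sql_2.12', version: '3.1.3'
    implementation group: 'org.apache.spark', name: 'spark-core_2.12', version: '3.1.3'
}
```

If the job is later packaged for spark-submit, switching back to compileOnly would be the usual choice, since the cluster then provides Spark.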
Command line is too long. Shorten command line for .
Fix: http://jmlim.github.io/intellij/2020/02/27/intellij-idea-command-line-is-too-long-error/
After fixing that, the following error occurred:
A master URL must be set in your configuration
Fix: https://gankrin.org/how-to-fix-spark-error-a-master-url-must-be-set/
After fixing that, the following error occurred:
Internal compiler error
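For reference, the "master URL" error means Spark found no spark.master setting when the context was created. The usual in-code fix is to set the master on the SparkSession builder or SparkConf. Since SparkConf also picks up JVM system properties that start with "spark.", the build script can supply it instead. A minimal sketch, assuming the app is started through the Spring Boot Gradle plugin's bootRun task and should run Spark in local mode:

```groovy
bootRun {
    // Assumption: local mode using all available cores; replace with a real
    // master URL (for example spark://host:7077) when targeting a cluster.
    systemProperty 'spark.master', 'local[*]'
}
```

This only takes effect when the app is launched via bootRun. An IDE run configuration would need the equivalent -Dspark.master=local[*] VM option.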
Caused by: java.lang.ClassNotFoundException: org.codehaus.janino.InternalCompilerException
Janino is the culprit, but excluding it from spark-sql and spark-core still produces the same error.
Could the Java version be the problem?
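A more likely explanation than the Java version is a Janino version mismatch. Spark 3.1.x is built against Janino 3.0.x (3.0.16 as far as I know), while Spring Boot's dependency management can raise Janino to a 3.1.x release, where InternalCompilerException lives in org.codehaus.commons.compiler instead of org.codehaus.janino. Excluding Janino does not help, because Spark SQL still needs it for code generation. A minimal sketch of pinning the version instead, assuming the build applies the io.spring.dependency-management plugin:

```groovy
// Assumption: the Spring dependency-management plugin is applied, so this
// property overrides the Janino version managed by the Spring Boot BOM.
// It covers both janino and commons-compiler, which must stay in sync.
ext['janino.version'] = '3.0.16'
```

Running `gradle dependencies --configuration runtimeClasspath` afterwards should show which Janino version was actually resolved.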