CDH's Spark drops spark-sql and hive-thriftserver by default. I wanted to try out the thriftserver, so I rebuilt Spark myself.
Official reference:
http://spark.apache.org/docs/2.3.3/building-spark.html#building-a-runnable-distribution
Environment:
- #java
- export JAVA_HOME="/usr/lib/java/jdk1.8.0_144"
- export JRE_HOME="$JAVA_HOME/jre"
- export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib
- export PATH=$JAVA_HOME/bin:$PATH
- #maven
- export MAVEN_HOME="/home/etluser/kong/spark/apache-maven-3.6.2"
- export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"
- export PATH=$MAVEN_HOME/bin:$PATH
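Before starting a long build, it may help to confirm that the shell actually picks up the JDK and Maven configured above. The following is a minimal sketch; `check_tool` is a hypothetical helper name introduced here, not part of Spark or Maven.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a command is on PATH.
# Returns non-zero when the command is missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1 -> $(command -v "$1")"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Print the versions only when the tools are actually available.
check_tool java && java -version
check_tool mvn && mvn -version

# Show the JVM options Maven will run with (set in the environment above).
echo "MAVEN_OPTS=${MAVEN_OPTS:-<unset>}"
```

If either tool is reported as missing, the `export PATH=...` lines were probably not sourced into the current shell.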
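Edit ./dev/make-distribution.sh:
1. Use more build threads, according to the server's actual configuration (in Maven, `-T 1C` means one thread per CPU core, so `-T 10C` means ten threads per core)
2. Hardcode the relevant VERSION variables and comment out the code that looks them up
3. Set the hadoop, flume, and zk (ZooKeeper) versions to the matching CDH versions (the XML properties listed last; properties of this form live in pom.xml rather than in the shell script)
- vim spark-2.3.4/dev/make-distribution.sh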
- BUILD_COMMAND=("$MVN" -T 1C clean package -DskipTests $@)
Change it to:
- BUILD_COMMAND=("$MVN" -T 10C package -DskipTests $@)
- #VERSION=$("$MVN" help:evaluate -Dexpression=project.version $@ 2>/dev/null | grep -v "INFO" | tail -n 1)
- #SCALA_VERSION=$("$MVN" help:evaluate -Dexpression=scala.binary.version $@ 2>/dev/null\
- # | grep -v "INFO"\
- # | tail -n 1)
- #SPARK_HADOOP_VERSION=$("$MVN" help:evaluate -Dexpression=hadoop.version $@ 2>/dev/null\
- # | grep -v "INFO"\
- # | tail -n 1)
- #SPARK_HIVE=$("$MVN" help:evaluate -Dexpression=project.activeProfiles -pl sql/hive $@ 2>/dev/null\
- # | grep -v "INFO"\
- # | fgrep --count "<id>hive</id>";\
- # Reset exit status to 0, otherwise the script stops here if the last grep finds nothing\
- # because we use "set -o pipefail"
- # echo -n)
- VERSION=2.3.4
- SCALA_VERSION=2.11.8
- SPARK_HADOOP_VERSION=2.6.0-cdh5.13.3
- SPARK_HIVE=1
- <hadoop.version>2.6.0-cdh5.14.0</hadoop.version>
- <flume.version>1.6.0-cdh5.14.0</flume.version>
- <zookeeper.version>3.4.5-cdh5.14.0</zookeeper.version>
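Each of the commented-out assignments starts Maven once just to read a single property, then filters the log output down to the value. Skipping those Maven runs is one likely benefit of hardcoding the values. The sketch below runs the same filter on simulated output; `fake_mvn_output` is a hypothetical stand-in for the real `mvn help:evaluate` call, and its log lines are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `mvn help:evaluate -Dexpression=project.version`:
# a few log lines followed by the property value on the last line.
fake_mvn_output() {
  printf '%s\n' '[INFO] Scanning for projects...' '[INFO]' '2.3.4'
}

# Same filter as in make-distribution.sh:
# drop the INFO lines, then keep the last remaining line.
VERSION=$(fake_mvn_output | grep -v "INFO" | tail -n 1)
echo "VERSION=$VERSION"
```

With the hardcoded assignments, the script gets the same values without calling Maven, but they must be updated by hand if the Spark or CDH version changes.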
Edit settings.xml in Maven's conf directory and add the Aliyun mirror (inside the <mirrors> element):
- <mirror>
- <id>alimaven</id>
- <mirrorOf>central</mirrorOf>
- <name>aliyun-maven</name>
- <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
- </mirror>
Add the CDH (Cloudera) repository to spark-2.3.4/pom.xml (inside the <repositories> element):
- <repository>
- <id>cloudera</id>
- <name>cloudera Repository</name>
- <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
- </repository>
Run:
./make-distribution.sh --name hadoop2.6 --tgz -Pyarn -Phadoop-2.6 -Dhadoop.version=2.6.0-cdh5.13.3 -Phive -Phive-thriftserver -DskipTests
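Then wait...
Done!
The generated tarball (with the version and --name used above, it should be named spark-2.3.4-bin-hadoop2.6.tgz)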
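Since the goal was to try the thriftserver, a possible next step is to unpack the tarball and start it. This is a sketch under assumptions: it assumes the tarball sits in the current directory, that the Thrift server listens on its default port 10000 on localhost, and that no extra Hive or YARN configuration is needed. A real CDH cluster will likely need its own settings.

```shell
#!/usr/bin/env bash
# Values taken from the build settings above.
VERSION=2.3.4
NAME=hadoop2.6
DIST="spark-${VERSION}-bin-${NAME}"
TARBALL="${DIST}.tgz"

if [ -f "$TARBALL" ]; then
  tar -xzf "$TARBALL"
  cd "$DIST" || exit 1
  # Start the Spark Thrift server (ships with builds made with -Phive-thriftserver).
  ./sbin/start-thriftserver.sh
  # Connect with beeline; the server may need a few seconds before it accepts connections.
  ./bin/beeline -u jdbc:hive2://localhost:10000
else
  echo "expected $TARBALL in the current directory, not found"
fi
```

The server can be stopped again with `./sbin/stop-thriftserver.sh` from the same directory.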
Source: http://www.bubuko.com/infodetail-3298487.html