To learn Spark, I set up 4 virtual machines and installed Spark 3.0.0 on them.
The underlying Hadoop cluster is already installed; see the earlier post on installing and deploying a 4-node Hadoop 3.2.1 distributed cluster learning environment on OL7.7.
First, go to http://spark.apache.org/downloads.html and download the matching package (the "without hadoop" build, since Hadoop is already in place).
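If you prefer the command line, the same package can presumably be pulled straight from the Apache archive (the URL below follows the usual archive layout and is my assumption, not part of the original post):
- [hadoop@master ~]$ wget https://archive.apache.org/dist/spark/spark-3.0.0/spark-3.0.0-bin-without-hadoop.tgz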
Extract the archive:
- [hadoop@master ~]$ sudo tar -zxf spark-3.0.0-bin-without-hadoop.tgz -C /usr/local
- [hadoop@master ~]$ cd /usr/local
- [hadoop@master /usr/local]$ sudo mv ./spark-3.0.0-bin-without-hadoop/ spark
- [hadoop@master /usr/local]$ sudo chown -R hadoop: ./spark
Add the environment variables below on all four nodes (see the example after the list):
- export SPARK_HOME=/usr/local/spark
- export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin
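For example, on each node they can be appended to the hadoop user's ~/.bashrc and reloaded (a sketch; adjust if you keep environment variables elsewhere):
- [hadoop@master ~]$ echo 'export SPARK_HOME=/usr/local/spark' >> ~/.bashrc
- [hadoop@master ~]$ echo 'export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin' >> ~/.bashrc
- [hadoop@master ~]$ source ~/.bashrc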
Configure Spark
In the Spark installation directory (/usr/local/spark), copy the template config with cp ./conf/spark-env.sh.template ./conf/spark-env.sh, then append the following to the end of conf/spark-env.sh:
- export SPARK_MASTER_IP=192.168.168.11
- export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
- export SPARK_LOCAL_DIRS=/usr/local/hadoop
- export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)
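Because this is the "without hadoop" build, the SPARK_DIST_CLASSPATH line is what lets Spark find the Hadoop jars. A quick sanity check is to run the wrapped command directly and confirm it prints the Hadoop classpath:
- [hadoop@master ~]$ /usr/local/hadoop/bin/hadoop classpath    # should print a long list of Hadoop conf and jar paths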
Then configure the worker nodes: cp ./conf/slaves.template ./conf/slaves and set its contents to:
- master
- slave1
- slave2
- slave3
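Note that master is listed as well, so a Worker will also run on the master node. If you want to write the file in one step instead of editing it, something like this works (a sketch, run from /usr/local/spark):
- [hadoop@master /usr/local/spark]$ printf 'master\nslave1\nslave2\nslave3\n' > ./conf/slaves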
Hard-code JAVA_HOME by appending the following at the end of sbin/spark-config.sh:
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_191
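The append can be done without opening an editor; the JDK path is the one used on this cluster, so adjust it to your own layout:
- [hadoop@master /usr/local/spark]$ echo 'export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_191' >> ./sbin/spark-config.sh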
Copy the spark directory to the other nodes and fix its ownership on each of them (the chown below is run on every slave node, from /usr/local):
- sudo scp -r /usr/local/spark/ slave1:/usr/local/
- sudo scp -r /usr/local/spark/ slave2:/usr/local/
- sudo scp -r /usr/local/spark/ slave3:/usr/local/
- sudo chown -R hadoop ./spark/
- ...
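The copy and the ownership fix can also be expressed as two small loops (a sketch; it assumes the same ssh and sudo access the commands above rely on):
- [hadoop@master ~]$ for h in slave1 slave2 slave3; do sudo scp -r /usr/local/spark/ $h:/usr/local/; done    # copy the directory to every slave
- [hadoop@master ~]$ for h in slave1 slave2 slave3; do ssh $h 'sudo chown -R hadoop /usr/local/spark'; done    # fix ownership on every slave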
Start the cluster
Start the Hadoop cluster first: /usr/local/hadoop/sbin/start-all.sh
Then start the Spark cluster.
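The original does not spell out the command; with the layout above it would presumably be Spark's own start script under sbin (it has the same name as Hadoop's, so use the full path), after which jps should show a Master process on the master node plus Worker processes on every host listed in conf/slaves:
- [hadoop@master ~]$ /usr/local/spark/sbin/start-all.sh
- [hadoop@master ~]$ jps    # on master: Master and Worker (plus the Hadoop daemons); on the slaves: Worker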
Monitor it through the web UI on port 8080 of the master node (http://master:8080).
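As a quick smoke test, the bundled SparkPi example can be submitted to the standalone master (spark://master:7077 assumes the default standalone port and that the hostname master resolves on every node; the examples jar name may differ, hence the glob):
- [hadoop@master ~]$ spark-submit --master spark://master:7077 --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples_*.jar 100    # should end with a line like "Pi is roughly 3.14..."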
That completes the installation.
Source: https://www.jb51.net/article/190482.htm