How to Install Apache Spark on Ubuntu/Linux

Apache Spark Installation

Apache Spark is a framework and in-memory data processing engine. For in-memory workloads it can run up to 100 times faster than Hadoop MapReduce. It supports development in Java, Scala, Python, and R. Nowadays it is mostly used for streaming and machine learning workloads.

Prerequisites for Spark Installation:

1. Update the package lists on Ubuntu using:

sudo apt-get update

After you enter your password, it refreshes the package lists.

2. Now install the JDK, which provides Java:

sudo apt-get install default-jdk

The Java version must be later than 1.6. Check the requirements of the Spark release you download, since newer releases need Java 8 or later.
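To confirm that the installed Java meets this requirement, a small check like the following may help. It is only a sketch: java_major_version is a hypothetical helper (not part of Java or Spark), and it assumes that java -version prints the version inside double quotes, as OpenJDK builds do.

```shell
# Hypothetical helper: map a Java version string to its major number.
# "1.8.0_292" -> 8 (legacy 1.x scheme), "11.0.2" -> 11 (newer scheme).
java_major_version() {
    v="$1"
    case "$v" in
        1.*) v="${v#1.}" ;;      # drop the legacy "1." prefix
    esac
    echo "${v%%[._-]*}"          # keep everything before the first separator
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr; the version is the first quoted field.
    installed=$(java -version 2>&1 | awk -F '"' '/version/ {print $2; exit}')
    major=$(java_major_version "$installed")
    if [ -n "$major" ] && [ "$major" -gt 6 ]; then
        echo "Java $installed is newer than 1.6"
    else
        echo "Java version '$installed' is too old or could not be parsed"
    fi
else
    echo "java not found on PATH"
fi
```

Comparing only the major number keeps the check simple and is enough for the "later than 1.6" rule stated above.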

Step 1: Download the Spark tarball from the official Apache Spark website.

Step 2: Place the tarball file in your Hadoop directory.

Step 3: Extract the downloaded tarball using the command below:

tar -xzvf spark-2.x.x-bin-hadoop2.x.tgz
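Before extracting, it can be useful to preview the archive and confirm which directory it will create. The following is a minimal sketch, assuming the tarball keeps the placeholder name used in this guide and ends in .tgz. spark_dir_name is a hypothetical helper, not a Spark or tar command.

```shell
# Placeholder name: substitute the file you actually downloaded.
TARBALL="spark-2.x.x-bin-hadoop2.x.tgz"

# Hypothetical helper: a Spark tarball normally unpacks into a directory
# named after the file, minus its .tgz suffix.
spark_dir_name() {
    basename "$1" .tgz
}

if [ -f "$TARBALL" ]; then
    tar -tzf "$TARBALL" | head -n 5      # preview the first few entries
    tar -xzf "$TARBALL"                  # extract into the current directory
    echo "Extracted to: $(spark_dir_name "$TARBALL")"
else
    echo "Tarball not found: $TARBALL"
fi
```

The directory name printed at the end is the value to use for SPARK_HOME, prefixed with the path where you extracted it.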

Step 4: After extracting the tarball, we get the Spark directory. Update the SPARK_HOME and PATH variables in the .bashrc file using the commands below:

export SPARK_HOME=/home/slthupili/INSTALL/spark-2.x.x-bin-hadoop2.x
export PATH=$PATH:$SPARK_HOME/bin
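If you prefer to add these settings from the command line rather than editing the file by hand, something like the sketch below could work. append_once is a hypothetical helper, and the PATH line assumes the Spark launch scripts live in $SPARK_HOME/bin. The sketch writes to a scratch file unless BASHRC is set, so trying it does not change your shell configuration.

```shell
# Hypothetical helper: append a line to a file only if it is not already there.
append_once() {
    file="$1"
    line="$2"
    touch "$file"
    # -q quiet, -x whole-line match, -F fixed string (no regex)
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# For real use, run with BASHRC="$HOME/.bashrc"; a scratch file is the default.
BASHRC="${BASHRC:-$(mktemp)}"

# Single quotes keep $PATH and $SPARK_HOME unexpanded until the file is sourced.
append_once "$BASHRC" 'export SPARK_HOME=/home/slthupili/INSTALL/spark-2.x.x-bin-hadoop2.x'
append_once "$BASHRC" 'export PATH=$PATH:$SPARK_HOME/bin'
cat "$BASHRC"
```

Because each line is added only once, the sketch can be rerun without piling up duplicate exports.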


Step 5: To check the .bashrc changes, open a new terminal and type the ‘echo $SPARK_HOME’ command.
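Printing the variable shows that it is set, but not that it points at a usable Spark directory. A slightly stronger check might look like the sketch below. looks_like_spark_home is a hypothetical helper, and it assumes a Spark install contains an executable bin/spark-shell launcher.

```shell
# Hypothetical helper: succeed only if the given directory looks like a
# Spark install, i.e. it contains an executable bin/spark-shell launcher.
looks_like_spark_home() {
    [ -n "$1" ] && [ -x "$1/bin/spark-shell" ]
}

# ${SPARK_HOME:-} expands to an empty string when the variable is unset.
if looks_like_spark_home "${SPARK_HOME:-}"; then
    echo "SPARK_HOME points at a Spark directory: $SPARK_HOME"
else
    echo "SPARK_HOME is unset or does not contain bin/spark-shell"
fi
```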

Step 6: After successfully installing Spark, check it with the Spark shell in the terminal using the command below:

spark-shell

Step 7: Check the Spark version and Scala version using the commands below: