How to Install Apache Sqoop on Ubuntu 16.04 (Linux), Step by Step

This post explains how to install Apache Sqoop on Ubuntu 16.04 (Linux) step by step.

Apache Sqoop is a tool for transferring data between an RDBMS and HDFS, in both directions, using Sqoop commands in a Big Data environment.

Simple steps to install Sqoop on Ubuntu 16.04

Step 1: First, download the Sqoop tarball from the Apache mirrors website.
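For example, assuming you are installing Sqoop 1.4.7, the tarball can be downloaded with wget from the Apache archive (the mirror URL may vary):

wget https://archive.apache.org/dist/sqoop/1.4.7/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz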

Step 2: After downloading the tarball, change its permissions.
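For example, the following command makes the tarball readable and executable for your user (the exact permission bits are a choice, not a requirement):

chmod 755 sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz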

Step 3: Once the permissions are changed, extract the tarball using the below command:

tar -xzvf sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz

Alternatively, you can extract the tarball directly through the file manager.

Step 4: Next, update the environment variables in the .bashrc file. Add SQOOP_HOME and PATH using the lines below:

export SQOOP_HOME=/home/sreekanth/Hadoop/sqoop-1.4.7
export PATH=$PATH:$SQOOP_HOME/bin
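After saving the .bashrc file, reload it so the changes take effect in the current terminal:

source ~/.bashrc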

Step 5: After updating the SQOOP_HOME path, verify that it is set correctly using the below command:

echo $SQOOP_HOME

Note: Open a new terminal (command line) to verify whether the Sqoop home path is set.
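If the variable is set correctly, the echo command prints the path configured in Step 4, for example:

/home/sreekanth/Hadoop/sqoop-1.4.7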

Step 6: After verifying the Sqoop installation, copy the MySQL connector jar file into the Sqoop libraries path. Sqoop depends on an external database, so the MySQL JDBC driver jar is needed to import/export large data sets.
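For example, assuming you downloaded MySQL Connector/J 5.1.46 (the exact jar name and version may differ on your system), copy it into the Sqoop lib directory:

cp mysql-connector-java-5.1.46.jar $SQOOP_HOME/lib/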

Step 7: Once the above steps are completed successfully, verify the Sqoop installation on Ubuntu 16.04.

Step 8: Check that Sqoop is installed on Ubuntu 16.04 by using the below command:

sqoop version
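If the installation succeeded, the command prints the installed version, with output along the lines of:

Sqoop 1.4.7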


Summary: Basically, Apache Sqoop is used for importing and exporting data using Sqoop commands. It is installed on top of Hadoop and integrates with an external database system such as MySQL. On Ubuntu 16.04, the prerequisites are JDK 1.8 (or a later version) and Hadoop; with those in place, it is simple to install Apache Sqoop from the tarball on the official Apache mirrors website. Here we provided simple Sqoop installation steps for Hadoop admins and developers on a single-node cluster. Be careful when updating the .bashrc file: if even a single word is wrong, the environment will not work for the entire cluster.