Apache Sqoop Installation on Ubuntu

Apache Sqoop is one of the Hadoop ecosystem components. It is mainly used to transfer bulk data between Hadoop (HDFS) and structured data stores such as relational databases, in either direction.

Prerequisites:

Before installing Sqoop, you need a working Hadoop 2.x.x installation and a Sqoop 1.x.x release that is compatible with it.

Step 1: Download the Sqoop 1.x.x tar ball from the website below:

Sqoop Download

Step 2: After downloading, extract the Sqoop tar ball using the command below:

tar -xzvf sqoop-1.x.x.bin-hadoop-2.x.x-alpha.tar.gz

Step 3: Update the ~/.bashrc file with the SQOOP_HOME and PATH variables:

export SQOOP_HOME=/home/slthupili/INSTALL/sqoop-1.x.x.bin-hadoop-2.x.x

export PATH=$PATH:$SQOOP_HOME/bin
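
The two lines above can also be added from the command line. The following is a minimal sketch, assuming a Bash-style profile file; add_sqoop_env is a hypothetical helper name, and the install directory is whatever path the tar ball was extracted to. It appends the variables only once, so running it again does not duplicate the entries.

```shell
# add_sqoop_env is a hypothetical helper, not part of Sqoop.
# $1: directory where the Sqoop tar ball was extracted
# $2: profile file to update (defaults to ~/.bashrc)
add_sqoop_env() {
  local sqoop_home="$1"
  local profile="${2:-$HOME/.bashrc}"

  # Skip if a SQOOP_HOME export is already present in the profile.
  if grep -q '^export SQOOP_HOME=' "$profile" 2>/dev/null; then
    echo "SQOOP_HOME already set in $profile"
    return 0
  fi

  # Append both variables; single quotes keep $PATH and $SQOOP_HOME
  # unexpanded so they are resolved when the profile is loaded.
  {
    echo "export SQOOP_HOME=$sqoop_home"
    echo 'export PATH=$PATH:$SQOOP_HOME/bin'
  } >> "$profile"
  echo "Updated $profile"
}

# Example call (commented out so nothing is modified by accident):
# add_sqoop_env /home/slthupili/INSTALL/sqoop-1.x.x.bin-hadoop-2.x.x
```

After running the helper, open a new terminal or run 'source ~/.bashrc' so the changes take effect.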

Step 4: To check the bashrc changes, open a new terminal and type 'echo $SQOOP_HOME'

Step 5: To integrate Sqoop with a MySQL database, you must place the MySQL connector JAR file (mysql-connector-java-5.1.38.jar) in the $SQOOP_HOME/lib directory.
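
A quick way to confirm the connector is in place is to look for it in the lib directory. The sketch below assumes the JAR file name starts with mysql-connector-java; has_mysql_connector is a hypothetical helper name used only for illustration.

```shell
# has_mysql_connector is a hypothetical helper, not part of Sqoop.
# Succeeds if at least one MySQL connector JAR exists in the given directory.
has_mysql_connector() {
  local lib_dir="$1"
  local jar
  for jar in "$lib_dir"/mysql-connector-java*.jar; do
    # If nothing matches, the pattern stays literal and this test fails.
    [ -e "$jar" ] && return 0
  done
  return 1
}

# Example usage (commented out; requires SQOOP_HOME to be set):
# has_mysql_connector "$SQOOP_HOME/lib" && echo "connector found" || echo "connector missing"
```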

Step 6: Check the Sqoop version using the command below:

sqoop version
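
If the version command is not found, the environment is usually the cause. The following is a small sketch for narrowing that down, assuming the launcher script sits at bin/sqoop under SQOOP_HOME; check_sqoop_install is a hypothetical helper name.

```shell
# check_sqoop_install is a hypothetical helper, not part of Sqoop.
# Reports the first setup problem it finds and returns non-zero.
check_sqoop_install() {
  if [ -z "${SQOOP_HOME:-}" ]; then
    echo "SQOOP_HOME is not set"
    return 1
  fi
  if [ ! -x "$SQOOP_HOME/bin/sqoop" ]; then
    echo "No executable sqoop launcher found in $SQOOP_HOME/bin"
    return 1
  fi
  echo "Sqoop launcher found at $SQOOP_HOME/bin/sqoop"
}

# Example usage (commented out):
# check_sqoop_install && sqoop version
```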

These simple steps are all that is needed to install Sqoop on top of Hadoop on Ubuntu.

Sqoop can import data from a relational database management system (RDBMS) such as MySQL into the Hadoop Distributed File System (HDFS). It automates most of this process, relying on the database to describe the schema of the data to be imported. Sqoop uses MapReduce to import and export the data.
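
As an illustration of such an import, here is a minimal sketch of a wrapper around the sqoop import command. The wrapper name sqoop_import_table and the DRY_RUN switch are hypothetical, and the JDBC URL, user name, table, and target directory in the usage line are placeholders to replace with real values. The -P option makes Sqoop prompt for the database password instead of taking it on the command line.

```shell
# sqoop_import_table is a hypothetical wrapper, not part of Sqoop.
# $1: JDBC URL, $2: database user, $3: table name, $4: HDFS target directory
# Set DRY_RUN=1 to print the command instead of running it.
sqoop_import_table() {
  local jdbc_url="$1" db_user="$2" table="$3" target_dir="$4"

  # Build the command as positional parameters so arguments stay separate.
  set -- sqoop import \
    --connect "$jdbc_url" \
    --username "$db_user" \
    -P \
    --table "$table" \
    --target-dir "$target_dir" \
    --num-mappers 1

  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Example usage with placeholder values (commented out):
# DRY_RUN=1 sqoop_import_table jdbc:mysql://localhost/testdb dbuser employees /user/hadoop/employees
```

A single mapper is used here to keep the example simple; larger tables are normally imported with several mappers, which requires a primary key or a split column on the table.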