How to Install Hadoop on Ubuntu/Linux as a Single-Node Cluster

Hadoop is one of the most widely used Big Data technologies today. It is a solution for storing and processing large amounts of data: HDFS handles storage and MapReduce handles processing. In many modern deployments, MapReduce is being replaced by Apache Spark, which is typically much faster because it keeps intermediate data in memory instead of writing it to disk between stages.

Step 1: First, update the system software repositories using the command below:
sudo apt-get update

Step 2: Install Java 1.8 (OpenJDK 8) using the command below:

sudo apt-get install openjdk-8-jdk

Step 3: Check the Java version using the command below:

java -version

Step 4: Install SSH using the command below:

sudo apt-get install ssh

To set up passwordless SSH communication, enter the commands below in a terminal:

ssh localhost

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
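
Once the public key has been appended to authorized_keys, connecting to the local machine should no longer prompt for a password. You can verify this, and then return to your original shell, with:

ssh localhost
exit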

STEP 5: Download the Hadoop 2.6.0 tarball from one of the Apache mirrors listed on the official Apache website.
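
For example, the release is also kept in the Apache archive; the exact URL may differ depending on the mirror you choose:

wget https://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz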

STEP 6: Extract the downloaded tarball using the command below:

tar -xzvf hadoop-2.6.0.tar.gz

The configuration files are located in the etc/hadoop directory of the extracted Hadoop folder.

STEP 7: Edit the following 8 configuration files as part of the Hadoop installation:

1. core-site.xml

2. mapred-site.xml

3. mapred-env.sh

4. yarn-site.xml

5. hdfs-site.xml

6. hadoop-env.sh

7. yarn-env.sh

8. slaves

STEP 8: Open the core-site.xml file and add the properties below.
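
A minimal single-node configuration typically just sets the default filesystem URI. The port 9000 used here is a common convention rather than a value from this article, so adjust it to your environment:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>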

STEP 9: Open the hadoop-env.sh file and update the JAVA_HOME path.
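
Assuming the OpenJDK 8 package installed in Step 2 (on amd64 systems the JVM usually lives under /usr/lib/jvm), the line would typically be:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64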

STEP 10: Open mapred-env.sh and update JAVA_HOME to the same path used in hadoop-env.sh.

STEP 11: Open the hdfs-site.xml file and add the properties below:
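
For a single-node cluster the replication factor is usually set to 1, since there is only one DataNode to hold each block. The snippet below is a typical example rather than this article's exact configuration:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>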

STEP 12: Open mapred-site.xml and set the MapReduce framework name to "yarn".
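
In Hadoop 2.6.0 this file ships as a template, so copy it first and then add the standard property shown below:

cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>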

STEP 13: Open yarn-env.sh and update the JAVA_HOME path in that file as well (same value as in hadoop-env.sh).

STEP 14: Open yarn-site.xml and add the properties below to configure the Resource Manager.
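
The essential property for running MapReduce on YARN is the shuffle auxiliary service; the Resource Manager hostname entry is optional on a single node and is shown here only as an illustration:

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
  </property>
</configuration>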

STEP 15: Open the slaves file and check that the hostname is localhost. For a single-node cluster the file should contain exactly one line: localhost.

STEP 16: Set the JAVA_HOME, HADOOP_HOME, and PATH variables in your ~/.bashrc file:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_HOME=/home/gopalkrishna/INSTALL/hadoop-2.6.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

STEP 17: To verify the .bashrc changes, open a new terminal (or run source ~/.bashrc in the current one) and type the command below; it should print the Hadoop path set in STEP 16:

echo $HADOOP_HOME

STEP 18: Before starting the NameNode for the first time, format it using the command below:

hadoop namenode -format
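
In Hadoop 2.x the hadoop namenode form still works but prints a deprecation warning; the equivalent current command is:

hdfs namenode -format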

STEP 19: To start all the Hadoop daemons in 2.x.x, use the start-all.sh command:
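
The script lives in $HADOOP_HOME/sbin, which was added to PATH in STEP 16:

start-all.sh

In Hadoop 2.x this script is marked as deprecated and simply invokes start-dfs.sh and start-yarn.sh, so you can also run those two scripts individually.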

STEP 20: To check whether the NameNode, DataNode, Node Manager, and the other daemons are running, use the command below:

jps
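
On a healthy single-node cluster the output should list daemons similar to the following (process IDs are omitted here; yours will differ):

NameNode
DataNode
SecondaryNameNode
ResourceManager
NodeManager
Jps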

STEP 21: To access the NameNode web UI, open the link below in your browser:

http://localhost:50070

STEP 22: To start the Job History Server in the Hadoop cluster, use the command below:

mr-jobhistory-daemon.sh start historyserver

STEP 23: To access the Resource Manager web UI in the Hadoop cluster, open:

http://localhost:8088

STEP 24: To access the Job History Server web UI in the Hadoop cluster, open:

http://localhost:19888

STEP 25: To stop all the Hadoop daemons in 2.x.x, use the stop-all.sh command:
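
As with start-all.sh, this script is deprecated in 2.x in favour of stop-dfs.sh and stop-yarn.sh, but it still works:

stop-all.sh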

STEP 26: To stop the Job History Server in 2.x.x, use the command below:

mr-jobhistory-daemon.sh stop historyserver