In this article, we explain how to integrate Hadoop with the Eclipse IDE, using a Java program as the example.
Prerequisites:
- Java 1.7 or later
- Eclipse
- Hadoop 2.7
How to integrate Hadoop in Eclipse:
Step 1: Open the Eclipse IDE, click “Open Perspective”, choose “Java”, and then click “OK”.
Step 2: Create a new Java project in Eclipse.
Step 3: Enter the project name, then click the “Finish” button.
Step 4: Select the “src” folder, create a new “Package”, and give it a name such as “org.hadoop.eclipse”.
Step 5: Next, choose the “New” option and create a Java class. The “Source folder” and “Package” fields are mandatory. Enter a class name such as “first_example”, then click the “Finish” button.
Step 6: Add the Hadoop jars listed below, such as the Hadoop client, which are required for connectivity between Eclipse and the Hadoop cluster (1.x.x stands for the jar version):
- hadoop-ant-1.x.x.jar
- hadoop-client-1.x.x.jar
- hadoop-examples-1.x.x.jar
- hadoop-minicluster-1.x.x.jar
- hadoop-test-1.x.x.jar
- hadoop-tools-1.x.x.jar
Step 7: Right-click the project name, choose the “Build Path” option, and then select “Configure Build Path”.
Step 8: Go to the “Libraries” tab, click “Add External JARs”, select the jar files, and then click the “OK” button.
Step 9: Once the common jars and the remaining Hadoop jar files have been added, click the “OK” button.
Step 10: Now write the first Java program in Eclipse and connect to the Hadoop environment with the help of the libraries and Hadoop jars. While you write the program, Eclipse can import the required packages automatically, whether the dependencies come from jars or from a Maven pom file.
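Before writing a full Hadoop program, it can help to confirm that the jars really are on the build path. The following is a minimal sketch, not an official Hadoop utility: the class name HadoopClasspathCheck and the method findMissing are hypothetical, and the three class names it looks up are assumed to be provided by the jars you added. It uses only the Java standard library, so it compiles even when the jars are missing and simply reports what it could not find.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; it is not part of Hadoop or of the steps above.
class HadoopClasspathCheck {

    // Well-known Hadoop classes that are assumed to be supplied by the added jars.
    static final String[] EXPECTED_CLASSES = {
        "org.apache.hadoop.conf.Configuration",
        "org.apache.hadoop.fs.FileSystem",
        "org.apache.hadoop.mapreduce.Job"
    };

    // Returns the names from classNames that cannot be loaded from the classpath.
    static List<String> findMissing(String... classNames) {
        List<String> missing = new ArrayList<>();
        ClassLoader loader = HadoopClasspathCheck.class.getClassLoader();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run its static initializers.
                Class.forName(name, false, loader);
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = findMissing(EXPECTED_CLASSES);
        if (missing.isEmpty()) {
            System.out.println("All expected Hadoop classes were found on the build path.");
        } else {
            System.out.println("Not found on the build path: " + missing);
        }
    }
}
```

Run it from Eclipse with “Run As” and “Java Application”. If any class is reported as not found, the jar that provides it is probably missing from the “Libraries” tab.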
Summary: These steps give beginners a simple way to integrate Hadoop with the Eclipse IDE (Integrated Development Environment). If you get an error, try adding more jars to the build path. Some errors are caused by version mismatches. To fix them, use jars of the same version as your Hadoop installation.
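Version mismatches of the kind mentioned in the summary can be spotted by listing the Hadoop jars on the classpath together with their versions. The sketch below is one possible way to do that with the standard library only. The class HadoopJarVersions and its methods are hypothetical names, and the file-name pattern (hadoop-NAME-VERSION.jar) is an assumption about how the jars are named, so jars that follow a different scheme are ignored.

```java
import java.io.File;
import java.util.HashSet;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the jar naming pattern below is an assumption.
class HadoopJarVersions {

    // Matches file names such as hadoop-client-1.2.1.jar and captures name and version.
    private static final Pattern HADOOP_JAR =
        Pattern.compile("(hadoop-[A-Za-z-]+?)-(\\d+(?:\\.\\d+)*)\\.jar");

    // Maps each Hadoop jar name found in the classpath string to its version.
    static Map<String, String> versionsOnClasspath(String classpath) {
        Map<String, String> versions = new TreeMap<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            String fileName = new File(entry).getName();
            Matcher m = HADOOP_JAR.matcher(fileName);
            if (m.matches()) {
                versions.put(m.group(1), m.group(2));
            }
        }
        return versions;
    }

    // True when the Hadoop jars found do not all share a single version.
    static boolean hasMixedVersions(Map<String, String> versions) {
        return new HashSet<>(versions.values()).size() > 1;
    }

    public static void main(String[] args) {
        // java.class.path holds the classpath the program was launched with.
        Map<String, String> versions =
            versionsOnClasspath(System.getProperty("java.class.path"));
        for (Map.Entry<String, String> e : versions.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
        if (hasMixedVersions(versions)) {
            System.out.println("Warning: Hadoop jars with different versions are on the classpath.");
        }
    }
}
```

When the program is started from Eclipse, the jars added through “Configure Build Path” are normally part of the launch classpath, so the output should reflect what was added in the steps above.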