Top 10 Hadoop Interview Questions

1. What exactly is Hadoop?

Hadoop is an open-source software framework for storing and processing very large amounts of data across a cluster of machines. It combines a distributed file system (HDFS) with a distributed processing model (MapReduce).

2. Why do we need Hadoop in IT?

A. Storage

B. Analytics

C. Data quality

D. High availability

E. Commodity hardware

3. What is the difference between Hadoop 2.x and Hadoop 3.x?

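Hadoop 2 supports high availability with one active NameNode and a single standby NameNode.

Hadoop 3 supports more than one standby NameNode, so the cluster can tolerate the failure of more than one NameNode.

Hadoop 2 has much more storage overhead than Hadoop 3: it relies on 3x replication for fault tolerance (200% overhead), while Hadoop 3 adds erasure coding, which can reduce the overhead to about 50%.

Hadoop 2 does not support GPUs as a schedulable resource, but Hadoop 3 does (through YARN).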
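To put rough numbers on the storage-overhead difference, here is a minimal arithmetic sketch. It assumes Hadoop 2 uses 3x replication and Hadoop 3 uses erasure coding with a Reed-Solomon layout of 6 data units plus 3 parity units; that layout is an assumption chosen for illustration, and other policies exist. The function names are hypothetical.

```python
def replication_overhead(replicas: int) -> float:
    """Extra storage, as a fraction of the raw data size, for n-way replication."""
    # One copy is the data itself; every additional copy is overhead.
    return float(replicas - 1)


def erasure_coding_overhead(data_units: int, parity_units: int) -> float:
    """Extra storage, as a fraction of the raw data size, for erasure coding."""
    # Only the parity units are overhead.
    return parity_units / data_units


if __name__ == "__main__":
    print(f"3x replication overhead: {replication_overhead(3):.0%}")
    print(f"RS(6,3) erasure coding overhead: {erasure_coding_overhead(6, 3):.0%}")
```

Under these assumptions, 1 TB of data occupies 3 TB with replication and 1.5 TB with erasure coding.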

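4. What is data locality in Hadoop?

Data locality means sending the processing logic to the node where the data is stored in HDFS, instead of moving the data across the network to the logic.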
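The idea can be sketched as a tiny scheduling rule: prefer a free node that already holds the block, and only fall back to another node when none is available. This is a simplified illustration, not Hadoop's actual scheduler (which also considers rack locality, among other things); `pick_node` and the node and block names are hypothetical.

```python
from typing import Dict, List, Optional


def pick_node(
    block: str,
    block_locations: Dict[str, List[str]],
    free_nodes: List[str],
) -> Optional[str]:
    """Choose a node to process `block`, preferring nodes that store it."""
    # Data-local choice: a free node that already holds a replica of the block.
    for node in block_locations.get(block, []):
        if node in free_nodes:
            return node
    # Fallback: any free node; the block must then be read over the network.
    return free_nodes[0] if free_nodes else None


if __name__ == "__main__":
    locations = {"blk_1": ["node2", "node3"]}
    print(pick_node("blk_1", locations, ["node1", "node3"]))
    print(pick_node("blk_1", locations, ["node1"]))
```

In the first call the task runs on a node that stores the block; in the second, no such node is free, so the task runs remotely.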

5. How is Security achieved in Hadoop?

Hadoop achieves security mainly through Kerberos, which authenticates users and services before they can access the cluster.

6. What are the different modes in which Hadoop runs?

A. Standalone mode

B. Pseudo-distributed mode

C. Fully distributed mode

7. What is Safemode in Hadoop?

Safemode is a maintenance state of the NameNode during which it does not allow any modifications to the file system. While in Safemode, the HDFS cluster is read-only and the NameNode does not replicate or delete blocks.

Commands:

hdfs dfsadmin -safemode get

hdfs dfsadmin -safemode enter

hdfs dfsadmin -safemode leave



8. What are the main components of the Hadoop ecosystem?

A) HDFS - Hadoop Distributed File System

B) MapReduce - Programming paradigm, based on Java

C) Pig - To process and analyse structured and semi-structured data

D) Hive - To process and analyse structured data

E) HBase - NoSQL database

F) Sqoop - Import/export of structured data

G) Oozie - Workflow scheduler

H) ZooKeeper - Coordination and configuration service
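The MapReduce paradigm from the list above can be illustrated with the classic word count. The sketch below is a pure Python simulation of the map, shuffle, and reduce steps on a single machine; it does not use the Hadoop API, and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in the line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word: str, counts: List[int]) -> Tuple[str, int]:
    """Reduce: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(w, c) for w, c in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big cluster"]))
```

In a real cluster the map and reduce steps run in parallel on different nodes, and the framework performs the shuffle between them.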

9. What is the difference between the NameNode, the Checkpoint Node, and the Backup Node in the Hadoop ecosystem?

NameNode - The master node of HDFS. It manages the file system metadata: the namespace and the mapping of files to blocks.



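Checkpoint Node - Uses the same directory structure as the NameNode and periodically creates checkpoints of the namespace: it downloads the image file and the edit log from the NameNode, merges them, and uploads the new image back.

Backup Node - Keeps an up-to-date copy of the namespace in memory, synchronized with the NameNode. To create a new checkpoint, it only needs to save its current in-memory state to an image file.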
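Conceptually, a checkpoint is the result of replaying the edit log on top of the last saved namespace image. The sketch below illustrates that merge with a deliberately simplified model: the namespace is a dictionary from path to file size, and the edit log is a list of `create` and `delete` operations. These data structures and the `apply_edits` name are assumptions made for illustration, not the actual HDFS formats.

```python
from typing import Dict, List, Tuple

FsImage = Dict[str, int]     # hypothetical model: path -> file size in bytes
Edit = Tuple[str, str, int]  # hypothetical model: (operation, path, size)


def apply_edits(fsimage: FsImage, edits: List[Edit]) -> FsImage:
    """Return a new image: the old image with the logged edits replayed in order."""
    new_image = dict(fsimage)  # leave the previous checkpoint untouched
    for op, path, size in edits:
        if op == "create":
            new_image[path] = size
        elif op == "delete":
            new_image.pop(path, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return new_image


if __name__ == "__main__":
    old_image = {"/logs/a.txt": 100}
    edit_log = [("create", "/logs/b.txt", 250), ("delete", "/logs/a.txt", 0)]
    print(apply_edits(old_image, edit_log))
```

After the merge, the edit log can start empty again, which keeps NameNode restarts fast.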

10. What are the benefits of Hadoop?

A) Ability to handle big data

B) Runs on commodity hardware and is open source

C) Ability to handle multiple data types

 
