Infosys Big Data Interview Questions and Answers

This article shares Big Data and Hadoop interview questions for freshers and experienced candidates.

Interview Pattern:

I attended three rounds in total:
1. Technical Round – I
2. Technical Round – II
3. Managerial Round

Once these rounds are done, HR discusses the package.

I recently interviewed for Big Data technologies. Below are the questions and answers.

1. Technical Round – I

In this round they asked basic questions:

1. Tell me about yourself and explain your experience and projects.

2. Explain your roles and responsibilities in your current project, then choose any project from your career and explain your roles and responsibilities in it.

3. What is Big Data and Hadoop?

4. Which Hadoop services are you using in your current projects? Explain any one service.

5. What is Hive? How do you connect DBeaver to the cluster?

In the DBeaver connection settings, enter the Hive JDBC URL as the host URL and add the Hive standalone JAR file to the driver libraries. After that, log in with the project credentials.
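As a rough illustration of what that JDBC URL looks like, here is a minimal Python sketch that assembles one. The helper name `hive_jdbc_url` and the host name are made up for this example, and the port 10000 and database `default` are only the usual HiveServer2 defaults, so check the values for your own cluster. Secured clusters typically need extra parameters appended to the URL, which this sketch does not cover.

```python
def hive_jdbc_url(host, port=10000, database="default"):
    """Build a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database."""
    return f"jdbc:hive2://{host}:{port}/{database}"


# Hypothetical host name, for illustration only.
url = hive_jdbc_url("hive-gateway.example.com")
print(url)
```

The resulting string is what goes into the JDBC URL field of the DBeaver connection.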

6. What are the major differences between Hive and Beeline?

Simply put, the Hive CLI connects directly to HDFS and the Hive metastore, whereas Beeline is a JDBC client that connects to HiveServer2 using the Hive standalone JAR files.
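To make the JDBC side of that difference concrete, the sketch below builds a Beeline command line in Python. The helper `beeline_command`, the host, the user name, and the query are all hypothetical. Only the `-u` (JDBC URL), `-n` (user name), and `-e` (query) options come from Beeline itself.

```python
def beeline_command(jdbc_url, user, query):
    """Return the argument list for running one query through Beeline."""
    # -u: JDBC URL of HiveServer2, -n: user name, -e: query to execute
    return ["beeline", "-u", jdbc_url, "-n", user, "-e", query]


# Hypothetical connection details, for illustration only.
cmd = beeline_command(
    "jdbc:hive2://hive-gateway.example.com:10000/default",
    "etl_user",
    "SHOW DATABASES;",
)
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd)` on a machine where Beeline is installed. This sketch only builds and prints it.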

7. Explain Spark standalone mode and Spark cluster mode.

Basically, these are deploy modes for launching a Spark application. In cluster mode, the driver runs inside the cluster on one of the worker nodes, while the other worker nodes run the executors. In client mode, the node from which the Spark application is submitted acts as the Spark driver.
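The deploy mode is chosen at submit time, and it decides where the driver runs: on the submitting node in client mode, or inside the cluster in cluster mode. The Python sketch below builds a `spark-submit` command for either mode. The helper `spark_submit_command` and the script name `etl_job.py` are hypothetical, and `yarn` is just one possible value for `--master`.

```python
def spark_submit_command(app, master, deploy_mode):
    """Return the spark-submit argument list for the given deploy mode."""
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")
    return [
        "spark-submit",
        "--master", master,            # e.g. yarn or spark://host:7077
        "--deploy-mode", deploy_mode,  # where the driver runs
        app,
    ]


# Hypothetical application script, for illustration only.
client_cmd = spark_submit_command("etl_job.py", "yarn", "client")
cluster_cmd = spark_submit_command("etl_job.py", "yarn", "cluster")
print(" ".join(client_cmd))
print(" ".join(cluster_cmd))
```

The two commands differ only in the `--deploy-mode` value.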

2. Technical Round – II

This round had somewhat tougher, scenario-based and real-time interview questions:

1. How do you set up YARN queues and assign particular users to them?

2. The NameNode in your cluster went down. How do you fix it?

3. What are the major issues you have faced in your entire career?

4. Do you have any idea about Kafka? How do you set the retention period for a particular topic?

5. Solr shards went down. How do you resolve the issue without a restart?

6. You have around 10 nodes. How do you build the cluster with proper capacity planning?

7. Explain the Azure Blob Storage architecture. How does it store large data?