1. How do you handle Kafka back pressure in Spark Streaming, and which configuration parameters control it?
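For reference, Spark's DStream-based Kafka integration handles back pressure through configuration properties rather than code; a typical set looks like the following (the numeric values are illustrative assumptions, not recommendations):

```
# Enable the back-pressure rate controller (DStream API)
spark.streaming.backpressure.enabled=true
# Starting rate before the controller has any batch statistics
spark.streaming.backpressure.initialRate=1000
# Hard cap on records read per Kafka partition per second
spark.streaming.kafka.maxRatePerPartition=1000
```

With Structured Streaming, the analogous knob is the `maxOffsetsPerTrigger` option on the Kafka source.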
2. How do you tune Spark performance through executor configuration?
3. What is the ideal way to decide the number of executors, and how much RAM should each executor be given?
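The sizing reasoning behind questions 2 and 3 is usually arithmetic: reserve a core and some memory on each node for the OS and Hadoop daemons, cap executors at roughly 5 cores each for healthy HDFS throughput, split the remaining memory evenly, and leave a slice for off-heap overhead. A minimal sketch (the 5-core cap, the ~7% overhead fraction, and the 1 core/1 GB reservations are common rules of thumb, not fixed limits):

```python
def size_executors(nodes, cores_per_node, mem_per_node_gb):
    """Rule-of-thumb Spark executor sizing for a YARN cluster."""
    usable_cores = cores_per_node - 1          # reserve 1 core per node for OS/daemons
    usable_mem = mem_per_node_gb - 1           # reserve ~1 GB per node likewise
    cores_per_executor = min(usable_cores, 5)  # ~5 cores keeps HDFS I/O efficient
    executors_per_node = usable_cores // cores_per_executor
    mem_per_executor = usable_mem / executors_per_node
    heap_gb = int(mem_per_executor * 0.93)     # leave ~7% for executor memoryOverhead
    total_executors = nodes * executors_per_node - 1  # -1 for the YARN ApplicationMaster
    return total_executors, cores_per_executor, heap_gb

# Classic interview example: 10 nodes, 16 cores and 64 GB RAM each
print(size_executors(10, 16, 64))  # -> (29, 5, 19)
```

This yields the widely quoted answer of 29 executors with 5 cores and 19 GB heap each, i.e. `--num-executors 29 --executor-cores 5 --executor-memory 19G`.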
4. How do you scale Kafka brokers and integrate them with Spark Streaming without stopping the cluster, and what commands are involved?
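Adding a broker requires no downtime: start a new broker with a unique `broker.id`, then move some existing partitions onto it with Kafka's partition-reassignment tool. A hedged sketch (host names, broker IDs, and file names are placeholders; older Kafka versions take `--zookeeper` instead of `--bootstrap-server`):

```
# Generate a candidate reassignment plan spreading topics over brokers 0-3
kafka-reassign-partitions.sh --bootstrap-server broker1:9092 \
  --topics-to-move-json-file topics.json --broker-list "0,1,2,3" --generate

# Apply the plan, then poll until it completes
kafka-reassign-partitions.sh --bootstrap-server broker1:9092 \
  --reassignment-json-file plan.json --execute
kafka-reassign-partitions.sh --bootstrap-server broker1:9092 \
  --reassignment-json-file plan.json --verify
```

A running Spark Streaming consumer keeps working through the move because it discovers the new partition leaders via the bootstrap metadata it already holds.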
5. How do you delete records in Hive, and how do you delete duplicate records with a script?
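Plain Hive tables do not support row-level `DELETE` unless they are transactional (ORC with ACID enabled), so the portable way to drop duplicates is to overwrite the table keeping one row per key. A sketch in HiveQL (the table and columns `events`, `id`, `payload`, `ts` are hypothetical):

```sql
-- Keep only the latest row per id
INSERT OVERWRITE TABLE events
SELECT id, payload, ts
FROM (
  SELECT id, payload, ts,
         row_number() OVER (PARTITION BY id ORDER BY ts DESC) AS rn
  FROM events
) t
WHERE rn = 1;

-- On a transactional (ACID) table, direct deletes also work:
-- DELETE FROM events WHERE id = 42;
```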
6. Can more than one replica of a block exist on the same rack?
7. While importing a database of 10 tables from MySQL into HDFS using Sqoop, one table fails. What is the solution?
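Rather than re-running the whole `import-all-tables` job, the usual fix is to re-import only the failed table. A sketch (the connection string, credentials, table name, and target directory are placeholders):

```
sqoop import \
  --connect jdbc:mysql://dbhost:3306/sales \
  --username etl_user -P \
  --table failed_table \
  --target-dir /user/hive/warehouse/failed_table \
  --num-mappers 4
```

Alternatively, re-run `sqoop import-all-tables` with `--exclude-tables` listing the nine tables that already imported successfully.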
8. If you submit a Spark job and the cluster goes down mid-process after some RDDs have already been created, what happens to those RDDs and how is the data recovered? (Hint: RDDs are recomputed from their lineage graph, or restored from a checkpoint if one was written.)