Most Frequently Asked Apache Storm Interview Questions and Answers

Top 5 Apache Storm Interview Questions:

1. What is the difference between Apache Kafka and Apache Storm?

Apache Kafka is a distributed, robust messaging system that can handle large volumes of data and passes messages from one endpoint to another. Data streams are partitioned and spread over a cluster of machines, which allows streams larger than any single machine could handle.

Apache Storm, by contrast, is a real-time message processing system in which data can be edited or manipulated as it arrives. It can pull data from Kafka and apply the required manipulation to the stream in real time.
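The division of labour between the two systems can be pictured with a toy sketch. It uses only the Java standard library, not the Kafka or Storm APIs: the in-memory queue is a stand-in for a Kafka topic, and the `process` method is a stand-in for a Storm processing step. The class and method names are hypothetical and chosen only for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.function.Function;

public class QueueVsProcessorSketch {

    // Stand-in for the messaging layer (Kafka's role):
    // it only stores messages and hands them out in order.
    static Queue<String> publish(List<String> messages) {
        return new ArrayDeque<>(messages);
    }

    // Stand-in for the processing layer (Storm's role):
    // it pulls each message from the queue and transforms it.
    static List<String> process(Queue<String> topic, Function<String, String> transform) {
        List<String> out = new ArrayList<>();
        String msg;
        while ((msg = topic.poll()) != null) {
            out.add(transform.apply(msg));
        }
        return out;
    }

    public static void main(String[] args) {
        Queue<String> topic = publish(Arrays.asList("click", "view", "click"));
        List<String> results = process(topic, String::toUpperCase);
        System.out.println(results); // prints [CLICK, VIEW, CLICK]
    }
}
```

The queue never changes the messages and the processor never stores them, which is the distinction the answer above draws.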

2. What are the key benefits of using Storm for Real-Time Processing?

Fast: The Apache Storm project cites a benchmark of over a million tuples processed per second per node.

Fault-Tolerant: Apache Storm detects faults automatically and restarts failed workers.

Easy to Operate: Apache Storm is easy to set up and operate.

3. Does Apache Storm act as a Proxy server?

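No. mod_proxy is a module of the Apache HTTP Server, where it implements a proxy, gateway, or cache; it is not part of Apache Storm. Storm is a distributed real-time computation system and does not act as a proxy server.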

4. How can you kill a topology in Apache Storm?

Simply run: storm kill {stormname}

Pass storm kill the same name you used when submitting the topology.

5. What are the common configurations in Apache Storm?

Apache Storm offers a variety of configurations that can be set for a topology. Here are some common ones:

  1. Config.TOPOLOGY_WORKERS: Sets the number of worker processes used to execute the topology.
  2. Config.TOPOLOGY_ACKER_EXECUTORS: Sets the number of executors that track tuple trees and detect when a spout tuple has been fully processed. If this is left unset or set to null, Storm uses as many acker executors as there are workers configured for the topology.
  3. Config.TOPOLOGY_MAX_SPOUT_PENDING: Sets the maximum number of spout tuples that can be pending on a single spout task at once.
  4. Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS: The maximum amount of time a spout tuple has to be fully processed before it is considered failed.
  5. Config.TOPOLOGY_SERIALIZATIONS: Registers additional serializers with Storm so that you can use custom types within tuples.
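As a rough illustration of how the first four settings fit together, here is a minimal sketch that needs only the Java standard library. It uses a plain map and the string keys behind the Config constants instead of the Storm classes themselves; in a real topology you would use org.apache.storm.Config (itself a HashMap) and its setters such as setNumWorkers and setMaxSpoutPending. The values are illustrative, and effectiveAckers is a hypothetical helper that only mimics the documented fallback for acker executors; it is not part of Storm.

```java
import java.util.HashMap;
import java.util.Map;

public class TopologyConfigSketch {

    // String keys behind the Config constants listed above.
    static final String TOPOLOGY_WORKERS = "topology.workers";
    static final String TOPOLOGY_ACKER_EXECUTORS = "topology.acker.executors";
    static final String TOPOLOGY_MAX_SPOUT_PENDING = "topology.max.spout.pending";
    static final String TOPOLOGY_MESSAGE_TIMEOUT_SECS = "topology.message.timeout.secs";

    // Builds a topology configuration with illustrative values.
    static Map<String, Object> buildConf() {
        Map<String, Object> conf = new HashMap<>();
        conf.put(TOPOLOGY_WORKERS, 4);               // four worker processes
        conf.put(TOPOLOGY_MAX_SPOUT_PENDING, 1000);  // cap on un-acked tuples per spout task
        conf.put(TOPOLOGY_MESSAGE_TIMEOUT_SECS, 30); // tuple fails if not fully processed in 30 s
        // TOPOLOGY_ACKER_EXECUTORS is deliberately left unset here.
        return conf;
    }

    // Hypothetical helper (not a Storm API): mimics the rule that an unset
    // acker count falls back to the number of workers.
    static int effectiveAckers(Map<String, Object> conf) {
        Object ackers = conf.get(TOPOLOGY_ACKER_EXECUTORS);
        if (ackers == null) {
            return (Integer) conf.get(TOPOLOGY_WORKERS);
        }
        return (Integer) ackers;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = buildConf();
        System.out.println("workers = " + conf.get(TOPOLOGY_WORKERS));
        System.out.println("ackers  = " + effectiveAckers(conf));
    }
}
```

Setting the acker count explicitly, for example conf.put(TOPOLOGY_ACKER_EXECUTORS, 2), makes effectiveAckers return that value instead of the worker count.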