Kafka is a mandatory skill for a Big Data engineer. Here are some simple Kafka commands with explanations. First, go to the Kafka installation path, for example:
/home/Bigdata/Kafka_Installation/kafka_2.10-0.10.1.1
To start the Zookeeper server (Kafka uses Zookeeper, so a Zookeeper server must be running before the Kafka server is started):
bin/zookeeper-server-start.sh config/zookeeper.properties
To start the Kafka server:
bin/kafka-server-start.sh config/server.properties
To create a topic with a single partition and one replica:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic sreekanthtopic
To list all the topics created in Kafka:
bin/kafka-topics.sh --list --zookeeper localhost:2181
To start the console producer, which sends messages to the topic:
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic sreekanthtopic
To start the console consumer, which receives the messages sent by the producer:
bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic sreekanthtopic --from-beginning
To describe the created Kafka topic:
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic sreekanthtopic
To set up a multi-broker Kafka cluster, first copy the server configuration once for each additional broker:
cp config/server.properties config/server-1.properties
cp config/server.properties config/server-2.properties
Edit the properties file of each new broker so that its id, port, and log directory are unique:
nano config/server-1.properties
    broker.id=1
    port=9093
    log.dir=/tmp/kafka-logs-1
nano config/server-2.properties
    broker.id=2
    port=9094
    log.dir=/tmp/kafka-logs-2
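The same edits can be scripted instead of typed into nano. The following is only a minimal sketch: make_broker_config is a hypothetical helper (not part of Kafka), and it assumes the per-broker log directory should follow the /tmp/kafka-logs-ID pattern used above. It removes any existing broker.id, port, log.dir, and log.dirs lines from the copy, because a leftover log.dirs entry would take precedence over log.dir, and then appends the new values.

```shell
#!/bin/sh
# make_broker_config BASE ID PORT
# Hypothetical helper, not part of Kafka: writes server-ID.properties
# next to BASE and prints the path of the file it created.
make_broker_config() {
    base="$1"; id="$2"; port="$3"
    out="$(dirname "$base")/server-$id.properties"

    # Copy BASE without the settings that must differ per broker.
    sed -E '/^(broker\.id|port|log\.dirs?)[[:space:]]*=/d' "$base" > "$out"

    # Append this broker's own values.
    {
        echo "broker.id=$id"
        echo "port=$port"
        echo "log.dir=/tmp/kafka-logs-$id"   # assumed naming pattern
    } >> "$out"

    echo "$out"
}
```

For the two extra brokers in this post it would be called as make_broker_config config/server.properties 1 9093 and make_broker_config config/server.properties 2 9094.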
To start Kafka server 1:
bin/kafka-server-start.sh config/server-1.properties
To start Kafka server 2:
bin/kafka-server-start.sh config/server-2.properties
Create a new topic with a replication factor of 3, one replica on each of the three brokers started above:
bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 3 --partitions 1 --topic multisreekanthtopic
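Kafka refuses to create a topic whose replication factor is larger than the number of available brokers. A small guard in front of the create command makes that failure explicit. This is an illustration only: check_replication_factor is a hypothetical helper, and it assumes the caller already knows how many brokers are running.

```shell
#!/bin/sh
# check_replication_factor RF BROKERS
# Hypothetical pre-flight check, not part of Kafka.
# Succeeds only when RF is at least 1 and at most the number of running brokers.
check_replication_factor() {
    rf="$1"; brokers="$2"
    if [ "$rf" -ge 1 ] && [ "$rf" -le "$brokers" ]; then
        return 0
    fi
    echo "replication factor $rf is not valid with $brokers running broker(s)" >&2
    return 1
}
```

With the three brokers in this post, check_replication_factor 3 3 succeeds and check_replication_factor 5 3 fails, so the create command can be chained after it with &&.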
To describe the created Kafka topic:
bin/kafka-topics.sh --describe --zookeeper localhost:2181 --topic multisreekanthtopic
In the describe output, “Leader” is the node responsible for all reads and writes for the given partition. Each node acts as the leader for a share of the partitions.
“Replicas” is the list of nodes that replicate the log for this partition, regardless of whether they are the leader or even whether they are currently alive.
Finally, “Isr” is the set of “in-sync” replicas. This is the subset of the replicas list that is currently alive and caught up to the leader.
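To make the relationship between the replicas list and the in-sync set concrete, the sketch below reads describe output and prints every partition whose in-sync set is smaller than its replicas list. It is a rough illustration: under_replicated is a hypothetical helper, it assumes the partition lines use the "Topic: ... Partition: ... Leader: ... Replicas: ... Isr: ..." layout, and the sample line fed to it is made up for the demonstration rather than captured from a real cluster.

```shell
#!/bin/sh
# under_replicated: hypothetical helper, not part of Kafka.
# Reads `kafka-topics.sh --describe` output on stdin and prints the
# partitions that have fewer in-sync replicas than assigned replicas.
under_replicated() {
    awk '
        {
            topic = part = leader = replicas = isr = ""
            # Each label is followed by its value in the next field.
            for (i = 1; i < NF; i++) {
                if ($i == "Topic:")     topic = $(i + 1)
                if ($i == "Partition:") part = $(i + 1)
                if ($i == "Leader:")    leader = $(i + 1)
                if ($i == "Replicas:")  replicas = $(i + 1)
                if ($i == "Isr:")       isr = $(i + 1)
            }
            if (replicas == "") next   # summary line or unrelated output
            n_rep = split(replicas, r, ",")
            n_isr = split(isr, s, ",")
            if (n_isr < n_rep)
                printf "%s-%s leader=%s replicas=%s isr=%s\n", topic, part, leader, replicas, isr
        }
    '
}

# Made-up sample line: broker 0 has dropped out of the in-sync set.
printf '\tTopic: multisreekanthtopic\tPartition: 0\tLeader: 1\tReplicas: 1,2,0\tIsr: 1,2\n' | under_replicated
# prints: multisreekanthtopic-0 leader=1 replicas=1,2,0 isr=1,2
```

Against a live cluster the describe command shown earlier can be piped straight into under_replicated. kafka-topics.sh also offers an --under-replicated-partitions option for --describe that gives a similar view without any extra scripting.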
Summary: Kafka is a message broker that sits between producers and consumers in a production environment. This post covered the basic Kafka commands, with explanations, for beginners as well as experienced engineers working in a Big Data environment.