How to Install Kafka on Linux/Ubuntu (Single-Node Cluster)

Apache Kafka is a distributed messaging system. Here is a step-by-step process for installing Apache Kafka on a Linux/Ubuntu operating system.



Prerequisites:

Kafka requires Java and Zookeeper to run. JDK 1.7 or above is mandatory for the Kafka installation; install it using the commands below:

$ sudo apt update
$ sudo apt install default-jdk
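
To confirm the prerequisite before going further, a small check like the one below can help. It is only a sketch: java_major_version and check_java are hypothetical helper names, and the parsing assumes the usual first line printed by java -version (for example: openjdk version "11.0.2").

```shell
#!/bin/sh
# Sketch: confirm that a JDK is installed and that it is version 1.7 (7) or newer.

# java_major_version (hypothetical helper): turns a version string such as
# "1.8.0_292" or "11.0.2" into its major number (8 or 11).
java_major_version() {
    version=$1
    case "$version" in
        1.*) version=${version#1.} ;;   # legacy scheme: 1.7 -> 7, 1.8 -> 8
    esac
    # Keep only the leading digits.
    echo "${version%%[!0-9]*}"
}

# check_java (hypothetical helper): succeeds when java is on PATH and is 7 or newer.
check_java() {
    if ! command -v java >/dev/null 2>&1; then
        echo "java not found; install a JDK first" >&2
        return 1
    fi
    # java -version writes to stderr, so redirect it before parsing the first line.
    raw=$(java -version 2>&1 | sed -n '1s/.*version "\([^"]*\)".*/\1/p')
    major=$(java_major_version "$raw")
    [ -n "$major" ] && [ "$major" -ge 7 ]
}
```

Calling check_java after the apt commands should return success if the JDK was installed correctly.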

Step 1: Download the Kafka binary files from the official Apache website:

https://archive.apache.org/dist/kafka/

Step 2: Extract the tarball using the command below:

$ tar -xzvf kafka_2.12-0.10.1.1.tgz


Step 3: After extraction, you will see the Kafka directory (kafka_2.12-0.10.1.1).
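
If you are not sure which directory the tarball creates, you can list its contents without extracting it. The sketch below assumes a gzip-compressed archive with a single top-level directory; top_level_dir is a hypothetical helper name.

```shell
#!/bin/sh
# top_level_dir ARCHIVE (hypothetical helper): prints the first path component
# of the first entry in a .tgz archive, without extracting anything.
top_level_dir() {
    tar -tzf "$1" | head -n 1 | cut -d/ -f1
}

# Example (file name taken from Step 2):
# top_level_dir kafka_2.12-0.10.1.1.tgz
```

The printed name is the directory that KAFKA_HOME should point to in the next step.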

Step 4: Update the KAFKA_HOME and PATH variables in the ~/.bashrc file:

 export KAFKA_HOME=/home/your_path/INSTALL/kafka_2.12-0.10.1.1
 export PATH=$PATH:$KAFKA_HOME/bin
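
Instead of editing the file by hand, the two lines can be appended from the shell. The following is a sketch, not the only way to do it: append_once is a hypothetical helper that adds a line only if the file does not already contain it, so running it twice does not duplicate the exports. The path is the placeholder from above and must be replaced with your own.

```shell
#!/bin/sh
# append_once FILE LINE (hypothetical helper): add LINE to FILE unless an
# identical line is already present.
append_once() {
    file=$1
    line=$2
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Example usage (single quotes keep $PATH and $KAFKA_HOME unexpanded in the file):
# append_once "$HOME/.bashrc" 'export KAFKA_HOME=/home/your_path/INSTALL/kafka_2.12-0.10.1.1'
# append_once "$HOME/.bashrc" 'export PATH=$PATH:$KAFKA_HOME/bin'
```

Note that there must be no spaces around the = sign, otherwise the shell rejects the export.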


Step 5: After the bashrc changes, open a new terminal and verify them using the command below:

$ echo $KAFKA_HOME






After installing Apache Kafka on Linux/Ubuntu, start the Kafka server. Before starting the Kafka server, start the Zookeeper server on your single-node cluster using the commands below:

$ cd $KAFKA_HOME
$ bin/zookeeper-server-start.sh config/zookeeper.properties

After the Zookeeper server has started, start the Kafka server in a separate terminal:

$ bin/kafka-server-start.sh config/server.properties
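
If you prefer to start both servers from one script, each one has to be up before the next step runs. The sketch below assumes that nc (netcat) is installed and that the default ports 2181 (Zookeeper) and 9092 (Kafka) are in use; wait_until is a hypothetical helper name. The -daemon option of the start scripts runs each server in the background.

```shell
#!/bin/sh
# wait_until TRIES DELAY CMD... (hypothetical helper): re-run CMD until it
# succeeds, at most TRIES times, sleeping DELAY seconds between attempts.
wait_until() {
    tries=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example usage (assumes nc is available and default ports are used):
# cd "$KAFKA_HOME"
# bin/zookeeper-server-start.sh -daemon config/zookeeper.properties
# wait_until 30 1 nc -z localhost 2181 || exit 1
# bin/kafka-server-start.sh -daemon config/server.properties
# wait_until 30 1 nc -z localhost 9092 || exit 1
```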

After starting the Kafka server, create topics; messages can then be passed from producer to consumer. The steps above complete the Kafka installation for a single-node (pseudo-cluster) setup, for use alongside the Hadoop ecosystem.

Summary: Installing Apache Kafka on a Linux/Ubuntu operating system is simple, and so is using it. On Cloudera, Kafka must be downloaded as a separate bundle in Cloudera Manager to set it up on a multi-node cluster. On Hortonworks, Kafka is installed through Ambari.

 

Kafka Interview Questions and Answers

1. What is Kafka?

Kafka is an open source message broker project written in Scala and Java. Kafka was originally developed at LinkedIn and later released as open source.




2. What are the components of Kafka?

The major components of Kafka are:

Topic: A group of messages of the same type.

Producer: Publishes messages to a topic.

Consumer: Subscribes to topics and pulls data from the brokers.

Brokers: The servers where the published messages are stored.

3. What role does Zookeeper play in a Kafka cluster?

Kafka is an open source distributed system, and it is built to use Zookeeper. The basic responsibility of Zookeeper is to coordinate the different nodes in a cluster. Offsets are periodically committed to Zookeeper, so that if any node fails, it can recover from the previously committed offset. Zookeeper is also responsible for configuration management, leader detection, detecting when a node leaves or joins the cluster, and synchronization.

4. What is the difference between Kafka and Flume?

Flume's major use case is Hadoop integration: it is integrated with Hadoop's monitoring system, file formats, file systems, and utilities. Flume is the best option when you have non-relational data sources. Kafka, by contrast, is a distributed publish-subscribe messaging system. Kafka was not developed specifically for Hadoop, and using Kafka to read and write data to Hadoop is considerably harder than with Flume. Kafka is a highly reliable and scalable enterprise messaging system for connecting multiple systems.




5. Is it possible to use Kafka without Zookeeper?

In Zookeeper-based Kafka releases such as the 0.10.x version installed above, it is impossible to use Kafka without Zookeeper, because it is not possible to bypass Zookeeper and connect directly to the server. If Zookeeper is down, Kafka will not be able to serve any client request.

6. How do you start a Kafka server?

Because Kafka uses Zookeeper, the Zookeeper server has to be started first. You can use the convenience script packaged with Kafka to start a single-node Zookeeper instance:

> bin/zookeeper-server-start.sh config/zookeeper.properties

Now the Kafka server can be started:

> bin/kafka-server-start.sh config/server.properties

What is Apache Kafka? Why Kafka? The Architecture of Kafka Explained

Apache Kafka is a distributed publish-subscribe system, also called a distributed messaging system. That is, it accepts messages from different producers in a streaming fashion and delivers the same messages to distributed consumers.

Apache Kafka is an open source stream-processing platform for developers. It is written in Scala and Java, languages that between them cover functional and object-oriented programming.

Kafka acts as a message broker between producers and consumers, delivering messages with low latency.

The architecture of Apache Kafka is shown in the diagram below:

[Diagram: Apache Kafka architecture, showing topics and messages]


Producer: A producer is a process that publishes messages to a topic through a Kafka broker.

Consumer: A consumer is a process that consumes (receives) messages from a Kafka broker.

Topic: A named group of messages of the same type; every message is written to a topic.

Kafka Cluster: A group of brokers responsible for the smooth flow of messages between producers and consumers.

How to install Apache Kafka on Ubuntu in simple steps:

Without Zookeeper, there is no Kafka. This means you first set up Zookeeper and then proceed with the Kafka installation.

Zookeeper is a centralized service for distributed systems that works as a key-value store. It provides centralized configuration for distributed systems.

Start the Zookeeper server and then the Kafka server using the commands below at the command prompt:

Kafka/bin > zookeeper-server-start.sh ../config/zookeeper.properties

Kafka/bin > kafka-server-start.sh ../config/server.properties

Create a topic in Kafka using the command below:

Kafka/bin > kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic sreekanth

List the existing topics to confirm that it was created:

Kafka/bin > kafka-topics.sh --list --zookeeper localhost:2181

Send a message to the topic from the producer console:

Kafka/bin > kafka-console-producer.sh --broker-list localhost:9092 --topic sreekanth

Type a message and press Enter, for example: Hello Apache Kafka

Read the topic from the consumer console:

Kafka/bin > kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic sreekanth --from-beginning

The message Hello Apache Kafka then appears in the consumer console.
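
The same producer and consumer check can be done without typing into two consoles. The sketch below is one possible way, under a few assumptions: the Kafka scripts are on PATH, the broker listens on localhost:9092, and the console consumer accepts --bootstrap-server and --timeout-ms. round_trip is a hypothetical helper name.

```shell
#!/bin/sh
# round_trip TOPIC MESSAGE (hypothetical helper): publish MESSAGE to TOPIC,
# then read the topic from the beginning and succeed only if MESSAGE comes back.
round_trip() {
    topic=$1
    message=$2
    # The console producer reads messages from standard input, one per line.
    printf '%s\n' "$message" |
        kafka-console-producer.sh --broker-list localhost:9092 \
            --topic "$topic" >/dev/null || return 1
    # --timeout-ms makes the consumer exit once no new message arrives in time.
    kafka-console-consumer.sh --bootstrap-server localhost:9092 \
        --topic "$topic" --from-beginning --timeout-ms 10000 2>/dev/null |
        grep -qxF -- "$message"
}

# Example usage:
# round_trip sreekanth "Hello Apache Kafka" && echo "Kafka is working"
```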