Cloudera with Kerberos:
Cloudera is one of the most popular big data distributions, and since its merger with Hortonworks the combined platform has gained additional features. This article explains, step by step, how to set up Kerberos on the latest version of Cloudera (CDH) using Cloudera Manager. Start by logging in to the Cloudera Web UI and reviewing the cluster setup and the Cloudera Management Service.
Once you are logged in, the cluster can be secured with mechanisms such as Kerberos or SASL. Here we choose MIT Kerberos and show how to install it and enable it through Cloudera Manager.
What is Cloudera?
Cloudera is one of the largest big data distributions, with a simple Web UI. It provides Hadoop ecosystem services such as HDFS, Hive, Spark, Hue, and Impala for Hadoop users.
What is Kerberos?
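Kerberos is a network authentication protocol developed at MIT that relies on a trusted third party, the Key Distribution Center (KDC), to verify the identity of users and services. It is the standard way to secure a Hadoop cluster, whose default authentication simply trusts the user name supplied by the client.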
MIT Kerberos Installation on Cloudera Manager:
First, install MIT Kerberos for the Cloudera cluster with the following steps.
Step 1: Install the KDC server. The command below installs the latest available version on CentOS / RHEL / Oracle Linux; for Ubuntu and other operating systems, the Cloudera documentation lists different commands. Use yum to install the krb5 server, libraries, and workstation packages:
yum install krb5-server krb5-libs krb5-workstation
Step 2: After yum has installed the krb5 server and libraries, edit the Kerberos configuration file (/etc/krb5.conf) and replace the default realm with the realm for your environment.
Step 3: In the same file, set the KDC and admin server entries of the realm section to your hostname or FQDN (Fully Qualified Domain Name).
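The changes from Steps 2 and 3 can be sketched as a small script that renders a minimal krb5.conf into a sample file for review. This is only an illustration: the realm HADOOP.com is the one used later in this article, while the host name kdc.hadoop.com, the domain hadoop.com, and the output path are hypothetical placeholders. The script does not touch /etc/krb5.conf.

```shell
#!/bin/sh
# Sketch: render a minimal krb5.conf to a sample file for review.
# REALM follows the article; KDC_HOST and DOMAIN are hypothetical values.
REALM="HADOOP.com"
KDC_HOST="kdc.hadoop.com"
DOMAIN="hadoop.com"
OUT="${OUT:-./krb5.conf.sample}"

cat > "$OUT" <<EOF
[libdefaults]
 default_realm = ${REALM}
 dns_lookup_realm = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true

[realms]
 ${REALM} = {
  kdc = ${KDC_HOST}
  admin_server = ${KDC_HOST}
 }

[domain_realm]
 .${DOMAIN} = ${REALM}
 ${DOMAIN} = ${REALM}
EOF

echo "Wrote $OUT"
```

After checking the generated file, the relevant sections can be merged into /etc/krb5.conf as root.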
Step 4: After changing the krb5 configuration file, create the Kerberos database with kdb5_util. The KDC server stores its principals and keys in this database, and the -s option also creates a stash file so that the KDC can start without prompting for the master password.
kdb5_util create -s
Step 5: Once the Kerberos database has been created, start the KDC server and the KDC admin server with the commands below.
service krb5kdc start
service kadmin start
Step 6: After starting the KDC server and the KDC admin server, enable both services at boot so that they come back automatically when the host reboots:
chkconfig krb5kdc on
chkconfig kadmin on
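The service and chkconfig commands in Steps 5 and 6 apply to SysV-style init. On hosts that run systemd (for example CentOS / RHEL 7 and later), systemctl is the usual equivalent. The sketch below only prints the commands for either case so they can be reviewed before being run as root; the function name kdc_service_cmds is made up for this example.

```shell
#!/bin/sh
# Sketch: print (do not run) the commands that start the KDC daemons
# and enable them at boot. kdc_service_cmds is a hypothetical helper.
kdc_service_cmds() {
    init="${1:-}"
    if [ -z "$init" ]; then
        # Auto-detect: systemd hosts ship the systemctl binary
        if command -v systemctl >/dev/null 2>&1; then
            init="systemd"
        else
            init="sysv"
        fi
    fi
    for svc in krb5kdc kadmin; do
        if [ "$init" = "systemd" ]; then
            echo "systemctl start $svc"
            echo "systemctl enable $svc"
        else
            echo "service $svc start"
            echo "chkconfig $svc on"
        fi
    done
}

# With no argument the init system is detected; pass "systemd" or "sysv" to force one
kdc_service_cmds "$@"
```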
Step 7: Next, create the admin principal for Kerberos. Run the following command on the KDC host and set a password for the principal when prompted:
kadmin.local -q "addprinc admin/admin"
Step 8: The new admin principal needs permissions in the KDC ACL (Access Control List) file, kadm5.acl. Open this file so that its entry can be changed to match your own realm.
Step 9: The file contains the entry “*/admin@EXAMPLE.com” followed by “*”, which grants all privileges. Change the realm in that entry to the realm you configured, as shown below:
*/admin@EXAMPLE.com * to */admin@HADOOP.com *
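The realm change in Step 9 can also be scripted with sed. The sketch below works on a sample file that it creates itself, with one line standing in for the packaged default entry, so it is safe to try anywhere. On CentOS / RHEL the real file is normally /var/kerberos/krb5kdc/kadm5.acl; that path and the sample file name are assumptions to verify on your own host.

```shell
#!/bin/sh
# Sketch: replace the example realm in a kadm5.acl-style file.
# ACL_FILE is a throwaway sample, not the live ACL file.
ACL_FILE="./kadm5.acl.sample"
NEW_REALM="HADOOP.com"

# Stand-in for the default entry: principal pattern, a tab, then permissions
printf '%s\t%s\n' '*/admin@EXAMPLE.com' '*' > "$ACL_FILE"

# Rewrite the realm (matches EXAMPLE.com in any letter case); keep a .bak copy
sed -i.bak "s/@EXAMPLE\.[Cc][Oo][Mm]/@${NEW_REALM}/" "$ACL_FILE"

cat "$ACL_FILE"
```

Once the output looks right, the same sed expression can be pointed at the live ACL file as root.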
Step 10: After changing the realm, save the kadm5.acl file and restart the kadmin server with the command below.
service kadmin restart
Step 11: Now connect to the kadmin server, authenticating as the principal “admin/admin@HADOOP.com”. Enter the password of this principal when prompted.
kadmin -p admin/admin@HADOOP.com
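Before moving on to Cloudera Manager, it can help to confirm that the admin principal exists and can obtain a ticket. The sketch below uses the standard MIT Kerberos tools kadmin.local, kinit, and klist. It defaults to a dry run that only prints the commands; the DRY_RUN switch and the helper names are hypothetical conveniences for this example, not part of Kerberos.

```shell
#!/bin/sh
# Sketch: sanity-check the admin principal on the KDC host.
# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
PRINCIPAL="admin/admin@HADOOP.com"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

check_admin_principal() {
    run kadmin.local -q "listprincs"   # the admin principal should appear in the list
    run kinit "$PRINCIPAL"             # prompts for the principal's password
    run klist                          # shows the ticket cache if kinit succeeded
}

check_admin_principal
```

Running it with DRY_RUN=0 as root on the KDC host executes the three commands in order.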
Step 12: Open the Cloudera Web UI with the admin user and password, then click Administration -> Security in the top-right corner. On the Security page, choose the option to enable Kerberos through Cloudera Manager.
Step 13: The Enable Kerberos wizard opens. Select all of the options it lists to enable Kerberos for the cluster.
Step 14: Enter the values for the KDC and realm as listed below, then click “Continue”.
A. KDC Type: select "MIT KDC".
B. Kerberos Security Realm: enter your realm name, such as "HADOOP.com".
C. KDC Server Host and KDC Admin Server Host: enter the hostname of your KDC.
D. Kerberos Encryption Types: enter the encryption types to use, and so on.
Step 15: After completing the KDC settings, go to the KRB5 configuration page, select “Manage krb5.conf through Cloudera Manager”, keep the default values, and click “Continue”.
Step 16: Enter your KDC account credentials, that is, the username and password of the KDC account.
Step 17: After you click “Continue”, Cloudera Manager imports the KDC Account Manager credentials. If the import fails, correct the KDC configuration and repeat this step.
Step 18: With the KDC Account Manager credentials in place, Cloudera Manager sets up the Kerberos principals for the Hadoop cluster services such as HDFS, Hue, Hive, Spark, Oozie, and ZooKeeper.
Step 19: In this step, configure the ports for the DataNode Transceiver and the DataNode HTTP Web UI. The default ports are usually fine; if you want different ports, change them here, then click “Continue”.
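As background for the port step: on Linux, ports below 1024 can only be bound by root, and secure DataNodes are commonly given such low-numbered ports for that reason. The sketch below assumes the values the wizard typically proposes, 1004 for the transceiver and 1006 for the HTTP Web UI; treat both numbers as assumptions and compare them with what your own wizard shows. The helper is_privileged_port is made up for this example.

```shell
#!/bin/sh
# Sketch: report whether the chosen DataNode ports are privileged (1-1023).
# 1004 and 1006 are assumed defaults; override them via the environment.
DN_TRANSCEIVER_PORT="${DN_TRANSCEIVER_PORT:-1004}"
DN_HTTP_PORT="${DN_HTTP_PORT:-1006}"

is_privileged_port() {
    [ "$1" -ge 1 ] && [ "$1" -le 1023 ]
}

for port in "$DN_TRANSCEIVER_PORT" "$DN_HTTP_PORT"; do
    if is_privileged_port "$port"; then
        echo "port $port: privileged"
    else
        echo "port $port: unprivileged"
    fi
done
```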
Step 20: After setting the ports, the wizard deploys the Kerberos configuration to the cluster services. Click “Continue”.
Step 21: When the deployment on the cluster services has completed, click the “Finish” button. Kerberos is now enabled for the cluster, and Cloudera Manager shows a congratulations message.
Step 22: At this point Kerberos is enabled for all services on the Cloudera cluster.
Step 23: Restart all services from the Cloudera Web UI. Once every service has restarted, Kerberos is fully in effect on the cluster services.
Step 24: If HDFS or any other service fails, stop that service and start it again manually from the Cloudera Manager Web UI.
Summary: The steps above enable Kerberos on a Cloudera cluster with the help of Cloudera Manager, using an MIT KDC server to secure the entire cluster. Kerberos provides strong third-party authentication for data security, and either Active Directory or an MIT KDC can act as the KDC for a Hadoop cluster. If you also need secured folder-level access, use Ranger; for a gateway, use Knox. Cloudera Manager offers a simple Web UI for managing services such as Hive, Spark, and ZooKeeper and for applying security to their data, which sets it apart from the other big data distributions. If a service fails, restarting it is usually enough; most failures come from network or socket connection issues between Cloudera Manager and the cluster services.