How to Install HIVE with MySQL on Ubuntu/Linux in Hadoop

Apache Hive is a data warehouse system, mostly used for summarizing and querying structured data. Hive is one of the components of Hadoop and is built on top of HDFS. It is meant for data in tabular form (structured data), not for unstructured flat files.

Step 1: Download the hive-1.2.2 tarball from the official Apache mirrors website:

http://apache.mirrors.tds.net/hive/hive-1.2.2

Step 2: Extract the tarball in your path using the below command:

tar -xzvf apache-hive-1.2.2-bin.tar.gz
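The extraction step can also be sketched as a small shell function that checks for the tarball first. This is only a sketch: the function name extract_hive, the TARBALL location under ~/Downloads, and the INSTALL_DIR default are assumptions made for illustration, so adjust them to your own setup.

```shell
# Hypothetical locations; override them in the environment if yours differ.
TARBALL="${TARBALL:-$HOME/Downloads/apache-hive-1.2.2-bin.tar.gz}"
INSTALL_DIR="${INSTALL_DIR:-$HOME/Big_Data}"

extract_hive() {
    # $1 = path to the tarball, $2 = directory to extract into
    if [ ! -f "$1" ]; then
        echo "tarball not found: $1" >&2
        return 1
    fi
    # Create the destination if needed, then unpack the archive inside it.
    mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Only extracts when the tarball is really there.
extract_hive "$TARBALL" "$INSTALL_DIR" || echo "Skipping extraction"
```

Checking for the file first gives a clearer message than the raw tar error when the download is missing or was saved elsewhere.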

Step 3: Update the HIVE_HOME and PATH variables in the .bashrc file:

export HIVE_HOME=/home/sreekanth/Big_Data/apache-hive-1.2.2-bin

export PATH=$PATH:$HIVE_HOME/bin

Step 4: Save the .bashrc file after the update, then go to the next step.
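If you prefer to script the .bashrc update instead of editing the file by hand, a minimal sketch could look like the following. The function name add_hive_env is hypothetical, and the function only appends each export line when it is not already present, so running it twice does not duplicate entries.

```shell
add_hive_env() {
    # $1 = shell rc file to update, $2 = Hive installation directory
    rc_file="$1"
    home_line="export HIVE_HOME=$2"
    # Single quotes keep $PATH and $HIVE_HOME literal in the rc file.
    path_line='export PATH=$PATH:$HIVE_HOME/bin'
    touch "$rc_file"
    # grep -x matches whole lines, -F treats the text as a fixed string.
    grep -qxF "$home_line" "$rc_file" || printf '%s\n' "$home_line" >> "$rc_file"
    grep -qxF "$path_line" "$rc_file" || printf '%s\n' "$path_line" >> "$rc_file"
}
```

For example, calling add_hive_env "$HOME/.bashrc" "$HOME/Big_Data/apache-hive-1.2.2-bin" would add the two lines to your own .bashrc; the directory shown is an assumed install location.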

Step 5: To check the .bashrc changes, open a new terminal and type the command:

echo $HIVE_HOME
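Printing the variable only shows that it is set. A slightly stricter check, sketched below under the assumption that the Hive launcher script lives at bin/hive inside the install directory, also confirms that the path points at a usable installation. The function name check_hive_home is hypothetical.

```shell
check_hive_home() {
    # Fails when HIVE_HOME is empty or unset.
    if [ -z "${HIVE_HOME:-}" ]; then
        echo "HIVE_HOME is not set" >&2
        return 1
    fi
    # Fails when the hive launcher is missing or not executable.
    if [ ! -x "$HIVE_HOME/bin/hive" ]; then
        echo "no executable hive launcher under $HIVE_HOME/bin" >&2
        return 1
    fi
    echo "HIVE_HOME=$HIVE_HOME"
}

check_hive_home || true
```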

Step 6: Remove the jline-0.9.94.jar file from your Hadoop library path to avoid incompatibility issues between this Hive version and hadoop-2.6.0.
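The exact location of the jar depends on your installation; on Hadoop 2.x it is commonly found under share/hadoop/yarn/lib, but treat that as an assumption and confirm it on your machine. The sketch below searches the whole Hadoop directory instead of hard-coding a path. The function name remove_old_jline is hypothetical, and HADOOP_HOME is assumed to point at your Hadoop installation.

```shell
remove_old_jline() {
    # $1 = Hadoop installation directory
    # Prints every matching jar, then deletes it. Other jline versions are untouched.
    find "$1" -type f -name 'jline-0.9.94.jar' -print -exec rm -f {} +
}

# Runs only when HADOOP_HOME is set and is a real directory.
if [ -n "${HADOOP_HOME:-}" ] && [ -d "$HADOOP_HOME" ]; then
    remove_old_jline "$HADOOP_HOME"
fi
```

If you would rather keep a backup, move the printed files aside instead of deleting them.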

Step 7: There are 2 types of metastores we can configure in Hive to store metadata:

Internal: the embedded Derby database in Hive. It supports only one user at a time.

External: a MySQL database, which can be used by multiple users.

In case your conf directory does not contain a hive-site.xml file, create one.

Step 8: Configure the hive-site.xml file with the MySQL connection settings for the metastore (JDBC connection URL, driver name, user name and password).
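As a rough illustration of what such a configuration might contain, the sketch below writes a minimal hive-site.xml with the four standard javax.jdo.option connection properties. Everything specific in it is an assumption: the function name write_hive_site, the database name metastore, the host localhost:3306, and the hiveuser/hivepassword credentials are placeholders to replace with your own values.

```shell
write_hive_site() {
    # $1 = Hive conf directory
    site_file="$1/hive-site.xml"
    # Never clobber a configuration that already exists.
    if [ -e "$site_file" ]; then
        echo "refusing to overwrite existing $site_file" >&2
        return 1
    fi
    cat > "$site_file" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <!-- All values below are placeholders; change them for your MySQL server. -->
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/metastore?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hiveuser</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hivepassword</value>
  </property>
</configuration>
EOF
}

# Runs only when HIVE_HOME is set and has a conf directory.
if [ -n "${HIVE_HOME:-}" ] && [ -d "$HIVE_HOME/conf" ]; then
    write_hive_site "$HIVE_HOME/conf" || true
fi
```

The MySQL user named in the file must already exist and have privileges on the metastore database.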

Step 9: For the external MySQL metastore, we need the MySQL connector jar file.

Step 10: Copy the MySQL connector jar file into the $HIVE_HOME/lib path.
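Placing the connector jar in the Hive lib directory can be sketched as below. The function name install_connector and the CONNECTOR_JAR variable are hypothetical; set CONNECTOR_JAR to wherever you saved the connector jar, since its file name depends on the version you downloaded.

```shell
install_connector() {
    # $1 = path to the MySQL connector jar, $2 = Hive installation directory
    if [ ! -f "$1" ]; then
        echo "connector jar not found: $1" >&2
        return 1
    fi
    if [ ! -d "$2/lib" ]; then
        echo "no lib directory under: $2" >&2
        return 1
    fi
    cp "$1" "$2/lib/"
}

# Runs only when both variables are provided.
if [ -n "${CONNECTOR_JAR:-}" ] && [ -n "${HIVE_HOME:-}" ]; then
    install_connector "$CONNECTOR_JAR" "$HIVE_HOME"
fi
```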

Step 11: Start all the Hadoop daemons using the start-all.sh command before running Hive. If you run the hive command while the daemons are not running, it shows a connection refused error and Hive does not work.
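To confirm the daemons are really up before launching Hive, you can inspect the output of jps, the process listing tool that ships with the JDK. The sketch below assumes a single-node Hadoop 2.x setup where start-all.sh brings up NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager; the function name missing_daemons is hypothetical.

```shell
missing_daemons() {
    # Reads `jps` output ("<pid> <name>" per line) on stdin and prints
    # each expected daemon that does not appear in it.
    jps_output=$(cat)
    for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
        # Exact match on the second column, so NameNode does not
        # accidentally match SecondaryNameNode.
        printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' ||
            echo "$daemon"
    done
}

# Runs only when the JDK's jps tool is on the PATH.
if command -v jps >/dev/null 2>&1; then
    jps | missing_daemons
fi
```

An empty result means every expected daemon was found; any name printed is a daemon to investigate before running hive.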

Step 12: Now run the hive command; Hive should start successfully on your machine.

Step 13: Check the Hive version using the below command:

hive --version

Why do we use Hive?

We use Hive for data summarization and for querying tabular data in the Hadoop system. The default Hive metastore database, Derby, supports only one user. MySQL is mostly used when there is large data and multiple users.