Metastore in Apache Hive

In the Hadoop ecosystem, Hive keeps all the structure information of the various tables and partitions in the warehouse, including the column and column type information needed to read and write data, and the corresponding HDFS files where the data is stored. The metastore is the central repository of this Hive metadata.

The metastore is the internal database of Hive, and it is responsible for managing this metadata.

The metadata that the metastore stores includes things like:

1. IDs of databases

2. IDs of tables and indexes

3. Creation time of indexes

4. Creation time of tables

5. Input format used for tables

6. Output format used for tables

Hive mainly supports three metastore modes:

1. Embedded Metastore

2. Local Metastore

3. Remote Metastore

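1. Embedded Metastore:

By default, the metastore service runs in the same JVM as the Hive service and contains an embedded Derby database instance backed by the local disk. This is called the embedded metastore. Because only one embedded Derby instance can open the database files on disk at a time, only one Hive session can use this metastore at once.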

Default configuration for Embedded Metastore

Below are the metastore configuration details in “hive-site.xml”:



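<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:;databaseName=/var/lib/hive/metastore/metastore_db;create=true</value>
<description>JDBC connect string for an embedded metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.apache.derby.jdbc.EmbeddedDriver</value>
<description>Driver class name for an embedded metastore</description>
</property>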



2. Local Metastore:

The local metastore supports multiple users by using a standalone database. The metastore service still runs in the same JVM as the Hive service, as in the embedded metastore, but it connects to a database running in a separate process.

If we use a MySQL database for the local metastore, then we need the following configuration in “hive-site.xml”:



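<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://host/dbname?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>

<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>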



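3. Remote Metastore:

In the remote metastore mode, one or more metastore servers run in separate processes from the Hive service.

This setup offers better manageability and security, because the database tier can sit behind a firewall and clients no longer need the database credentials.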
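
As a rough illustration of the remote setup, a Hive client can be pointed at a metastore server through the hive.metastore.uris property in “hive-site.xml”. The sketch below assumes a hypothetical host name, metastore-host, and the metastore's default Thrift port 9083. Neither value comes from the configuration shown earlier, so adjust both to your cluster.

```xml
<!-- Client-side sketch: metastore-host is a placeholder host name (assumption),
     9083 is the default port of the metastore Thrift service -->
<property>
<name>hive.metastore.uris</name>
<value>thrift://metastore-host:9083</value>
<description>Thrift URI of the remote metastore server</description>
</property>
```

With this property set, the client talks to the metastore server over Thrift and does not need the JDBC connection settings itself. Those settings stay on the metastore server. Several URIs can be given as a comma-separated list when more than one metastore server is running.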