MapR Architecture


Before Hadoop was introduced in 2007, no single data platform could provide a scalable architecture for handling fast-growing data under a unified security model.

There are three important pillars of a data platform:

1. Distributed metadata

2. Variety of protocols and API support

3. Variety of data persistence models, such as objects, files, tables, and event queues


Distributed Metadata:

A centralized metadata service leads to a number of restrictions:

1. Creates a single point of failure

2. Creates a hotspot that limits the scalability of the cluster

3. Limits sharing of data artifacts

4. Limits the number of data artifacts that can be stored in the cluster.

MapR has built a distributed metadata service from the ground up that removes all of these restrictions.

CLDB (Container Location Database) serves as MapR's level-I metadata service and maintains metadata about the volumes, containers, and nodes in the entire cluster.

Metadata about data artifacts such as objects, files, tables, topics, and directories is maintained as level-II metadata, which is stored in the name container.
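The two-level lookup described above can be sketched as a toy model. This is only an illustration, not MapR's actual API: the class names `Cldb` and `NameContainer`, the `resolve` function, the node names, and the assumption that each volume points at one name container are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Cldb:
    """Level-I metadata (toy model): volumes, containers, and nodes."""
    # container id -> nodes hosting a replica of that container
    container_nodes: dict = field(default_factory=dict)
    # volume name -> id of that volume's name container (assumed layout)
    volume_name_container: dict = field(default_factory=dict)

    def locate(self, container_id):
        return self.container_nodes[container_id]


@dataclass
class NameContainer:
    """Level-II metadata (toy model): artifact path -> data container ids."""
    entries: dict = field(default_factory=dict)


def resolve(cldb, name_containers, volume, path):
    """Resolve an artifact path to the nodes that hold its data."""
    # Level I: find the name container for the volume.
    name_container_id = cldb.volume_name_container[volume]
    # Level II: the name container knows which data containers hold the artifact.
    data_container_ids = name_containers[name_container_id].entries[path]
    # Level I again: map each data container to the nodes hosting it.
    return {cid: cldb.locate(cid) for cid in data_container_ids}


cldb = Cldb(
    container_nodes={
        1: ["node-a", "node-b"],
        2: ["node-b", "node-c"],
        3: ["node-a", "node-c"],
    },
    volume_name_container={"vol1": 1},
)
name_containers = {1: NameContainer(entries={"/logs/app.log": [2, 3]})}

placement = resolve(cldb, name_containers, "vol1", "/logs/app.log")
print(placement)
```

In this sketch the level-I service only tracks coarse objects (volumes, containers, nodes), while per-artifact metadata lives in containers that are themselves distributed, which is the property that avoids a single metadata hotspot.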

Variety of APIs and Protocol Support:

The MapR Data Platform provides data interoperability across different APIs, so different applications can work with the same data using different APIs:


2. S3 API





Variety of Data Persistence:

The MapR data container is the unit of storage allocation and management. Each container stores a variety of data elements such as objects, files, tables, and directories.

It supports two types of data elements:

1. File chunks

2. Key-value stores

These two data elements are the building blocks in MapR. Files are built by threading file chunks across containers. Directories are built over key-value stores. Tables are built on top of files and key-value stores, which act as an index.
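A minimal sketch of how the two data elements could compose, under stated assumptions: the `Container` class, the `write_file` and `read_file` helpers, the tiny `CHUNK_SIZE`, and the round-robin placement are all hypothetical choices made for illustration. The text only says that file chunks are threaded across containers and that directories are built over key-value stores.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # deliberately tiny so the example spans several containers


@dataclass
class Container:
    """Toy container holding both kinds of data elements."""
    container_id: int
    chunks: dict = field(default_factory=dict)  # chunk key -> bytes
    kv: dict = field(default_factory=dict)      # store name -> key-value store


def write_file(containers, name, data):
    """Split data into chunks and thread them across containers.

    Returns the ordered chunk map: a list of (container_id, chunk_key).
    Round-robin placement is an assumption made for this sketch.
    """
    chunk_map = []
    for offset in range(0, len(data), CHUNK_SIZE):
        index = offset // CHUNK_SIZE
        target = containers[index % len(containers)]
        key = f"{name}#{index}"
        target.chunks[key] = data[offset:offset + CHUNK_SIZE]
        chunk_map.append((target.container_id, key))
    return chunk_map


def read_file(containers_by_id, chunk_map):
    """Reassemble a file by following its chunk map in order."""
    return b"".join(containers_by_id[cid].chunks[key] for cid, key in chunk_map)


containers = [Container(0), Container(1), Container(2)]
by_id = {c.container_id: c for c in containers}

# A directory modeled as a key-value store: file name -> chunk map.
directory = containers[0].kv.setdefault("/data", {})
directory["hello.txt"] = write_file(containers, "/data/hello.txt", b"hello, mapr!")

restored = read_file(by_id, directory["hello.txt"])
print(restored)
```

Here the directory entry is just a key-value record pointing at the chunks, so listing or renaming a file touches the key-value store only, while the file contents stay spread across containers.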

The MapR Data Platform was architected to solve most enterprise data problems and to reduce the need for separate data tools.

The heart of the MapR Data Platform is the Data Container.

The Data Container provides:

1. Different data persistence models, such as files, tables, and objects

2. Distributed scale-out storage

3. Data loss prevention

4. Failure resilience and disaster recovery