OOZIE WORKFLOW





[Figure: simple Oozie workflow diagram]

Oozie Workflow:

A workflow in Oozie is a sequence of actions arranged in a Directed Acyclic Graph (DAG). The actions are bound by control dependencies: an action can start only after the action before it has completed and produced its output. Later actions are therefore not independent of the actions that precede them.
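As a rough illustration of control dependency in a DAG, the sketch below orders a handful of actions so that each one runs only after the actions it depends on. It is not how Oozie is implemented; the action names and the run_workflow helper are hypothetical, and it assumes Python 3.9 or later for the standard graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each action maps to the set of actions it depends on.
workflow = {
    "ingest": set(),
    "clean": {"ingest"},
    "hive_report": {"clean"},
    "export": {"hive_report"},
}


def run_workflow(dag):
    """Visit actions in dependency order and return the order used."""
    executed = []
    # static_order() raises graphlib.CycleError if the graph is not acyclic.
    for action in TopologicalSorter(dag).static_order():
        # Every dependency has already run by the time an action is reached.
        assert dag[action] <= set(executed)
        executed.append(action)
    return executed


order = run_workflow(workflow)
print(order)  # ['ingest', 'clean', 'hive_report', 'export']
```

Because the graph must be acyclic, a dependency loop such as "a needs b, b needs a" is rejected before anything runs.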

A workflow action can be a Java action, a Hive action, a Shell script action, and so on. A workflow can also contain decision nodes, which choose the path a job follows based on a condition.
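A decision node can be pictured as a switch statement: conditions are checked in order, the first one that holds selects the next node, and a default is used when none match. The following is a minimal sketch of that idea only; the decide function, the conditions, and the node names are all hypothetical.

```python
def decide(cases, default, context):
    """Return the target of the first case whose predicate is true, else default."""
    for predicate, target in cases:
        if predicate(context):
            return target
    return default


# Hypothetical conditions and node names, checked in order.
cases = [
    (lambda ctx: ctx["input_size_mb"] == 0, "end"),
    (lambda ctx: ctx["input_size_mb"] > 1024, "hive_action"),
]

next_node = decide(cases, "shell_action", {"input_size_mb": 2048})
print(next_node)  # hive_action
```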

Oozie also schedules recurring (cron-style) jobs, such as Kafka jobs and script jobs. It detects the completion of tasks through callbacks and polling. When Oozie starts a task, it gives the task a unique callback HTTP URL, and the task notifies that URL when it completes. If the task fails to invoke the callback URL, Oozie can poll the task for completion.
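The callback-then-poll pattern described above can be sketched in a few lines. This is a simplified in-memory model, not Oozie's actual code: the Task and Scheduler classes, the oozie.example URL, and the direct function call standing in for an HTTP request are all assumptions made for illustration.

```python
import uuid


class Task:
    """Hypothetical task that may or may not report back when it finishes."""

    def __init__(self, name, calls_back):
        self.name = name
        self.calls_back = calls_back
        self.done = False

    def run(self, callback_url, notify):
        self.done = True
        if self.calls_back:
            notify(callback_url)  # stands in for an HTTP request to the URL


class Scheduler:
    def __init__(self):
        self.pending = {}    # callback URL -> task still awaiting completion
        self.completed = []  # (task name, how completion was detected)

    def start(self, task):
        # Each task gets its own unique callback URL (made-up host name).
        url = f"http://oozie.example/callback?id={uuid.uuid4()}"
        self.pending[url] = task
        task.run(url, self.on_callback)
        return url

    def on_callback(self, url):
        task = self.pending.pop(url)
        self.completed.append((task.name, "callback"))

    def poll(self):
        # Fallback: check tasks that never invoked their callback URL.
        for url, task in list(self.pending.items()):
            if task.done:
                del self.pending[url]
                self.completed.append((task.name, "polling"))


scheduler = Scheduler()
scheduler.start(Task("hive_report", calls_back=True))
scheduler.start(Task("shell_export", calls_back=False))
scheduler.poll()
print(scheduler.completed)
# [('hive_report', 'callback'), ('shell_export', 'polling')]
```

The first task is marked complete as soon as it calls back; the second never does, so it is picked up only when the scheduler polls.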

Oozie has three main types of jobs:

Oozie Workflow Jobs – Directed Acyclic Graphs (DAGs) that specify a sequence of actions to be executed.

Oozie Coordinator Jobs – Workflow jobs that are triggered by time and data availability (scheduling).

Oozie Bundle – A package of multiple coordinator jobs and their workflow jobs (cron-style jobs).
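One way to picture how the three job types relate is as nested data structures: a bundle groups coordinators, and each coordinator triggers a workflow when its time has come and its input data is available. The sketch below models only that relationship; the class fields, the due_runs method, and the sample dates are hypothetical and do not mirror Oozie's configuration format.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Workflow:
    name: str
    actions: list  # action names, assumed to be already in dependency order


@dataclass
class Coordinator:
    workflow: Workflow
    start: datetime
    frequency: timedelta

    def due_runs(self, now, data_ready):
        """Return the scheduled times up to `now` whose input data is available."""
        runs, t = [], self.start
        while t <= now:
            if data_ready(t):
                runs.append(t)
            t += self.frequency
        return runs


@dataclass
class Bundle:
    name: str
    coordinators: list = field(default_factory=list)


daily = Coordinator(
    workflow=Workflow("daily_report", ["ingest", "clean", "hive_report"]),
    start=datetime(2024, 1, 1),
    frequency=timedelta(days=1),
)
bundle = Bundle("reports", [daily])

# Assume the input data for 3 January never arrived.
runs = daily.due_runs(datetime(2024, 1, 4), lambda t: t.day != 3)
print([t.day for t in runs])  # [1, 2, 4]
```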