In this article, we explain how to resolve the java.lang.OutOfMemoryError: Java heap space error in a Big Data environment.
Exception message: 'ERROR [HY000] [Microsoft][Hardy] (35) Error from server: error code: '2' error message: 'Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 6, vertexId=vertex_112325028493241_01108_02_03, diagnostics=[Task failed, taskId=task_0811123345345534_00659, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: Java heap space at org.apache.hadoop.hive.serde2.WriteBuffers.nextBufferToWrite(WriteBuffers.java:261)
The above error is memory related. It typically appears when Spark or Hadoop jobs process a huge volume of data on the cluster, for example monthly or yearly jobs. These jobs consume more container memory than is available to them at the YARN (Yet Another Resource Negotiator) cluster level. So how do we fix this type of issue?
Step 1: First, re-run the application and monitor whether the job now runs properly.
Step 2: If the application still fails, set the Hive parameters below and re-run the application:
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
After you re-run the application with the above configuration, you may still see this type of error at the end of the log file:
FAILED : Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
If the jobs run on ADF (Azure Data Factory) pipelines, try the parameters below:
Increasing the container size and the Java heap memory at the cluster level gives each task more room to hold its working data, which can let the job finish and may shorten its run time. This applies to Spark or Hadoop developers working in a Big Data environment or on Azure HDInsight.
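As a rough sketch of the container-size and heap increase described above, the Hive session settings below could be placed before the failing query. The property names are standard Hive, Tez, and MapReduce settings, but every value is an illustrative assumption and not a figure from the original job. The heap (-Xmx) is set to roughly 80% of the container size, a common rule of thumb that leaves room for off-heap overhead inside the YARN container.

```sql
-- Assumed values for illustration only; tune them to the cluster's YARN limits.
-- Tez engine (the TezTask failure): container size in MB, heap about 80% of it.
set hive.tez.container.size=4096;
set hive.tez.java.opts=-Xmx3276m;

-- MapReduce engine (the MapRedTask failure): same idea per map and reduce task.
set mapreduce.map.memory.mb=4096;
set mapreduce.map.java.opts=-Xmx3276m;
set mapreduce.reduce.memory.mb=8192;
set mapreduce.reduce.java.opts=-Xmx6553m;
```

The requested container size has to stay within yarn.scheduler.maximum-allocation-mb, otherwise YARN will not grant the request.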
This type of error appears in different scenarios and in different tools, for example Talend and Informatica, each with its own Java memory parameters. In some scenarios the memory settings have to be passed as cloud-specific parameters: on Azure, AWS (Amazon Web Services), and GCP (Google Cloud Platform) they need to be configured explicitly, otherwise long-running jobs or cron jobs fail with this type of error. If you find any other resolutions, please share them in the comment box.