Error: Caused by: java.io.IOException Failed in Spark




Spark Error:

Spark throws an error when a Spark jar file is submitted to the Hadoop cluster.

error: caused by: java.io.IOException: Failed to connect to Cluster

The error appears when the Spark jar file is submitted with the spark-submit command below in cluster deploy mode. The submission fails because it cannot connect to the cluster:

spark-submit --class com.spark.core.LinesWithErrorStarting --master spark://custer:7077 --deploy-mode cluster LINES-WITH-STARTING-ERROR_WC.jar Data.log SparkOutDir7

Resolution:

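The Spark daemons (master and workers) must be running before the above command is submitted. The first script below starts them and the second stops them, which is useful when the cluster has to be restarted: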

$SPARK_HOME/sbin/start-all.sh

$SPARK_HOME/sbin/stop-all.sh
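
After the daemons are started, it can help to confirm that the master is accepting connections before submitting the job. The following is a minimal sketch, assuming bash and the coreutils timeout command are installed; master_reachable is a hypothetical helper name, not part of Spark.

```shell
# Hypothetical helper: succeeds (exit status 0) if a TCP connection to
# HOST:PORT can be opened within 3 seconds, and fails otherwise.
master_reachable() {
  host="$1"
  port="$2"
  # /dev/tcp is a bash feature, so the check runs in a bash child process.
  # Inside that child, $0 is the host and $1 is the port.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}
```

With the master URL used in this post, the call would be master_reachable custer 7077. A non-zero exit status suggests the master is not up or not reachable from the machine you are submitting from.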

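In cluster deploy mode, the driver program runs inside the cluster (on a worker node with the standalone master used here, or inside the ApplicationMaster on YARN), not on the machine where you type the spark-submit command. That machine only hosts the driver in client mode, so in cluster mode the master and workers must be up and reachable before the job is submitted.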

After starting the Spark daemons, submit the Spark jar with the following spark-submit command:

spark-submit --class com.spark.core.LinesWithErrorStarting --master spark://custer:7077 --deploy-mode cluster --num-executors 2 --driver-memory 512m --executor-memory 512m --executor-cores 1 LINES-WITH-STARTING-ERROR_WC.jar Data.log SparkOutDir7

If you still face the same error, enable the "External Shuffle Service" in the spark-submit command:

spark-submit --class com.spark.core.LinesWithErrorStarting --master spark://custer:7077 --deploy-mode cluster --num-executors 2 --driver-memory 512m --executor-memory 512m --executor-cores 1 --conf spark.shuffle.service.enabled=true LINES-WITH-STARTING-ERROR_WC.jar Data.log SparkOutDir7
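
In standalone mode the external shuffle service runs inside the worker process, so the --conf flag above only tells the application to use it; the workers also need to be started with the same property. One possible way, sketched here under the assumption that you can edit $SPARK_HOME/conf/spark-env.sh on every worker node, is to pass the property through SPARK_WORKER_OPTS and then restart the daemons:

```shell
# Sketch for $SPARK_HOME/conf/spark-env.sh on each worker node (assumed location).
# Appends the property to any worker options that are already set.
export SPARK_WORKER_OPTS="${SPARK_WORKER_OPTS:-} -Dspark.shuffle.service.enabled=true"
```

The workers read this file at startup, so the change takes effect only after they are stopped and started again.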