Zookeeper Socket Connection Error for Client in Hadoop | Big Data | Spark





I am getting a Zookeeper socket connection error when running Spark jobs. In my Big Data environment, we set up a large Hadoop cluster to process large data sets. When I trigger Spark jobs, they fail with “Closed socket connection”.

Zookeeper Socket Connection error:

Zookeeper INFO [Thread - Server Cxn] - Closed socket connection for client /localhost:59522 (no session established for client)
2020-04-25 07:24:23,785 - WARN [SyncThread:3:FileTxnLog@2340] - fsync-ing the write-ahead log in SyncThread:3 took 1550ms which will adversely effect operation latency. See the Zookeeper troubleshooting guide
2020-04-25 07:24:23,785 - WARN [SyncThread:3:FileTxnLog@2340] - Accepted socket connection from /localhost:59586
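The fsync warning in this log reports how long the write-ahead log sync took, so it can help to scan the Zookeeper log for other slow syncs. Below is a minimal Python sketch of that idea. The function name `slow_fsyncs` and the 1000 ms default threshold are my own assumptions for illustration, not part of Zookeeper.

```python
import re

# Matches the duration in the slow-fsync warning, e.g. "... took 1550ms ...".
# The pattern tolerates stray spaces around "fsync-ing" and a "tooks" misspelling.
FSYNC_PATTERN = re.compile(
    r"fsync\s*-?\s*ing the write[- ]ahead log.*?tooks?\s+(\d+)\s*ms"
)


def slow_fsyncs(lines, threshold_ms=1000):
    """Return the fsync durations (in ms) that are at or above threshold_ms.

    `threshold_ms` is a hypothetical cut-off chosen for this example.
    """
    durations = []
    for line in lines:
        match = FSYNC_PATTERN.search(line)
        if match and int(match.group(1)) >= threshold_ms:
            durations.append(int(match.group(1)))
    return durations
```

For example, `slow_fsyncs(open("zookeeper.log"))` would list the slow syncs in a log file, where `zookeeper.log` is a placeholder for wherever your Zookeeper log is stored.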

Solution:





This error is a Zookeeper connection timeout issue. The running Zookeeper instance was not connected to the Hadoop cluster, so the jobs failed with a connection timeout. The log also shows that syncing took place on the Zookeeper side after the container had already gone away.
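Before restarting anything, it can be useful to confirm whether Zookeeper is answering at all. One way is the `ruok` four-letter command, to which a healthy server replies `imok`. Here is a minimal Python sketch. The function name `zookeeper_is_ok` is hypothetical, and the host and port are assumptions (2181 is the usual default client port). Note that on Zookeeper 3.5.3 and later, `ruok` only works if it is listed in `4lw.commands.whitelist`, so a `False` result may just mean the command is disabled.

```python
import socket


def zookeeper_is_ok(host="localhost", port=2181, timeout=5.0):
    """Send 'ruok' to a Zookeeper server and report whether it answers 'imok'.

    host and port are assumptions; adjust them to your cluster.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            # Read until we have the 4-byte answer or the server closes the socket.
            while len(reply) < 4:
                chunk = sock.recv(4 - len(reply))
                if not chunk:
                    break
                reply += chunk
    except OSError:
        # Covers refused connections, timeouts, and other socket errors.
        return False
    return reply == b"imok"
```

If this returns `False` on every Zookeeper node, the problem is likely on the Zookeeper side rather than in the Spark job itself.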

Stop the Zookeeper services, wait a few minutes, and then start them again.
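The stop, wait, start sequence can be scripted. The sketch below assumes a plain Zookeeper installation that ships the standard `bin/zkServer.sh` script. The install path, the two-minute wait, and the function names are my own assumptions. If your distribution manages Zookeeper through a cluster manager, restart it from there instead.

```python
import subprocess
import time


def restart_commands(zk_home):
    """Build the stop and start commands for zkServer.sh under zk_home."""
    script = f"{zk_home}/bin/zkServer.sh"
    return [[script, "stop"], [script, "start"]]


def restart_zookeeper(zk_home, wait_seconds=120,
                      run=subprocess.run, sleep=time.sleep):
    """Stop Zookeeper, wait a while, then start it again.

    zk_home and wait_seconds are assumptions for this example; `run` and
    `sleep` are injectable so the sequence can be tried without a real server.
    """
    stop_cmd, start_cmd = restart_commands(zk_home)
    run(stop_cmd, check=True)   # raises CalledProcessError if the stop fails
    sleep(wait_seconds)         # give the old process time to shut down
    run(start_cmd, check=True)
```

Calling `restart_zookeeper("/opt/zookeeper")` on a Zookeeper node would perform the restart, where `/opt/zookeeper` is a placeholder for your actual install directory.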

After restarting the Zookeeper services, I reran the failed Spark jobs, and this time they did not fail.




Summary: In a Hadoop cluster, Zookeeper plays a major role as a centralized service that maintains configuration information and offers distributed services to key-value stores. After setting up Spark on the Hadoop cluster, I ran Spark jobs for large-scale data processing. After some time, a job failed in the cluster with the Zookeeper socket connection error. I restarted the Spark job and got the same error again, and the log files showed the same error. I then stopped the Zookeeper services in the Hadoop cluster and started them again, after which Zookeeper came back up automatically and the jobs ran within the cluster. This type of error is caused by a Zookeeper connection timeout, or it may be a network connection problem. It is a common error in Big Data distributions.