java.lang.NoSuchMethodError: PoolConfig.setMinEvictableIdleTime

Date: 2022-08-27

Context

When using Spark Structured Streaming to sink Kafka messages into HDFS, I hit this error:

java.lang.NoSuchMethodError: org.apache.spark.sql.kafka010.consumer.InternalKafkaConsumerPool$PoolConfig.setMinEvictableIdleTime(Ljava/time/Duration;)V

The environment I am using:

  • Spark: 3.3.0 (Scala 2.12)
  • Kafka: kafka_2.13-3.2.0 (Kafka 3.2.0, Scala 2.13).
  • org.apache.spark:spark-sql-kafka-0-10_2.12:3.3.0 (for Spark 3.3.0, Kafka broker 0.10.0+, Scala 2.12)

The error occurred when Spark was trying to establish an internal Kafka consumer to read messages from the topic.

Look into the details

The error is raised on class PoolConfig, where the JVM cannot find method setMinEvictableIdleTime. This method comes from the Apache Commons Pool library (commons-pool2), which PoolConfig builds on.

According to Maven Central, org.apache.spark:spark-sql-kafka-0-10_2.12:3.3.0 depends on commons-pool2 version 2.11.1.

The exception is raised at line 186 of InternalKafkaConsumerPool.scala (InternalKafkaConsumerPool.scala#L186).

Class PoolConfig inherits from BaseObjectPoolConfig. In the base class, method setMinEvictableIdleTime was added in version 2.10.0. Before that version, method setMinEvictableIdleTimeMillis was used.
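To check which variant of the method is actually visible at runtime, a small reflection probe can help. The following is a minimal sketch, not part of Spark or commons-pool2: the class name MethodProbe and its helpers hasPublicMethod and loadedFrom are hypothetical names chosen for illustration. It only gives a meaningful answer when it runs with the same classpath as the failing Spark job.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.time.Duration;

// Hypothetical helper for illustration; not part of Spark or commons-pool2.
public class MethodProbe {

    /** Returns true if the named class exposes a public method with this name and parameter types. */
    public static boolean hasPublicMethod(String className, String methodName, Class<?>... paramTypes) {
        try {
            // Load without running static initializers; we only inspect the class.
            Class<?> cls = Class.forName(className, false, MethodProbe.class.getClassLoader());
            Method m = cls.getMethod(methodName, paramTypes); // also searches superclasses
            return m != null;
        } catch (ClassNotFoundException | NoSuchMethodException e) {
            return false;
        }
    }

    /** Returns the location (usually a jar) the class was loaded from, or a short marker string. */
    public static String loadedFrom(String className) {
        try {
            Class<?> cls = Class.forName(className, false, MethodProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // JDK bootstrap classes have no code source.
            return src == null ? "unknown" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        String cls = "org.apache.commons.pool2.impl.BaseObjectPoolConfig";
        System.out.println("loaded from: " + loadedFrom(cls));
        System.out.println("setMinEvictableIdleTime(Duration): "
                + hasPublicMethod(cls, "setMinEvictableIdleTime", Duration.class));
        System.out.println("setMinEvictableIdleTimeMillis(long): "
                + hasPublicMethod(cls, "setMinEvictableIdleTimeMillis", long.class));
    }
}
```

If the Duration variant is reported as missing while the Millis variant is present, the class was most likely loaded from a commons-pool2 jar older than 2.10.0, and the "loaded from" line shows which one.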

So my first thought was that an older version of commons-pool2 was being used. However, the Spark job logs show that version 2.11.1 was loaded:

2022-08-26T23:38:09,085 INFO [Thread-6] org.apache.spark.executor.Executor - Fetching spark://localhost:39883/jars/org.apache.commons_commons-pool2-2.11.1.jar with timestamp 1661521085729
2022-08-26T23:38:09,086 INFO [Thread-6] org.apache.spark.util.Utils - Fetching spark://localhost:39883/jars/org.apache.commons_commons-pool2-2.11.1.jar to /tmp/spark-547fe757-e24b-4675-843d-0122d27b6daf/userFiles-223b6753-1c52-4816-baf2-bf324f94e01f/fetchFileTemp5942499760833953456.tmp
2022-08-26T23:38:09,089 INFO [Thread-6] org.apache.spark.util.Utils - /tmp/spark-547fe757-e24b-4675-843d-0122d27b6daf/userFiles-223b6753-1c52-4816-baf2-bf324f94e01f/fetchFileTemp5942499760833953456.tmp has been previously copied to /tmp/spark-547fe757-e24b-4675-843d-0122d27b6daf/userFiles-223b6753-1c52-4816-baf2-bf324f94e01f/org.apache.commons_commons-pool2-2.11.1.jar
2022-08-26T23:38:09,094 INFO [Thread-6] org.apache.spark.executor.Executor - Adding file:/tmp/spark-547fe757-e24b-4675-843d-0122d27b6daf/userFiles-223b6753-1c52-4816-baf2-bf324f94e01f/org.apache.commons_commons-pool2-2.11.1.jar to class loader

Then I looked into the Spark (3.3.0) jars folder and found version 1.5.4 of commons-pool there: commons-pool-1.5.4.jar
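Jar file names alone do not show which jars can supply a given class. Note that commons-pool 1.x uses the package org.apache.commons.pool, while commons-pool2 uses org.apache.commons.pool2, so it can be worth scanning a jars folder by class name. The sketch below does that; JarScanner, toEntryName and jarsContaining are hypothetical names for illustration, and the default folder "jars" is an assumption about where the program is run from.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not part of Spark.
public class JarScanner {

    /** Converts a class name such as "a.b.C" to its jar entry path "a/b/C.class". */
    public static String toEntryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Lists the names of the jars in dir that contain the given class, sorted by name. */
    public static List<String> jarsContaining(Path dir, String className) {
        List<String> hits = new ArrayList<>();
        if (!Files.isDirectory(dir)) {
            return hits; // nothing to scan
        }
        String entry = toEntryName(className);
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : jars) {
                try (JarFile jf = new JarFile(jar.toFile())) {
                    if (jf.getEntry(entry) != null) {
                        hits.add(jar.getFileName().toString());
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(hits);
        return hits;
    }

    public static void main(String[] args) {
        // Assumed defaults: a "jars" folder in the working directory and the commons-pool2 base config class.
        Path dir = Paths.get(args.length > 0 ? args[0] : "jars");
        String cls = args.length > 1 ? args[1] : "org.apache.commons.pool2.impl.BaseObjectPoolConfig";
        System.out.println(jarsContaining(dir, cls));
    }
}
```

Run from the Spark home directory, this prints the jars under jars/ that contain the class. An empty list would mean that no jar in that folder provides commons-pool2 classes at all.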

Resolution

I then manually downloaded commons-pool2 version 2.11.1 into the Spark jars folder:

spark-3.3.0/jars$ wget https://repo1.maven.org/maven2/org/apache/commons/commons-pool2/2.11.1/commons-pool2-2.11.1.jar
spark-3.3.0/jars$ ls | grep commons-pool
commons-pool-1.5.4.jar
commons-pool2-2.11.1.jar

After rerunning my Spark Structured Streaming application, the issue was resolved.

Warning - I am not 100% sure whether adding this library to the Spark jars folder will cause issues for Spark. So far I have not hit any. Please be cautious when adopting this method.
Last modified by Raymond.