Error when connecting to Oracle database in PySpark

Nguyen, 2023-01-09 (11 months ago)

This is the code I run in a PySpark environment (Spark 3.1.2):

jdbcDF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:oracle:thin:@10.0.1.1:1521/sbank") \
    .option("dbtable", "sa.a") \
    .option("user", "g") \
    .option("password", "zxc") \
    .option("driver", "oracle.jdbc.driver.OracleDriver") \
    .load()


But it shows the error below:


Py4JJavaError                             Traceback (most recent call last)
/tmp/ipykernel_29/4076487584.py in <module>
----> 1 jdbcDF = spark.read \
      2     .format("jdbc") \
      3     .option("url", "jdbc:oracle:thin:@10.0.1.1:1521/sbank") \
      4     .option("dbtable", "sa.a") \
      5     .option("user", "g") \

/usr/local/spark/python/pyspark/sql/readwriter.py in load(self, path, format, schema, **options)
    208             return self._df(self._jreader.load(self._spark._sc._jvm.PythonUtils.toSeq(path)))
    209         else:
--> 210             return self._df(self._jreader.load())
    211 
    212     def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,

/usr/local/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py in __call__(self, *args)
   1302 
   1303         answer = self.gateway_client.send_command(command)
-> 1304         return_value = get_return_value(
   1305             answer, self.gateway_client, self.target_id, self.name)
   1306 

/usr/local/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
    109     def deco(*a, **kw):
    110         try:
--> 111             return f(*a, **kw)
    112         except py4j.protocol.Py4JJavaError as e:
    113             converted = convert_exception(e.java_exception)

/usr/local/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
    324             value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
    325             if answer[1] == REFERENCE_TYPE:
--> 326                 raise Py4JJavaError(
    327                     "An error occurred while calling {0}{1}{2}.\n".
    328                     format(target_id, ".", name), value)

Py4JJavaError: An error occurred while calling o137.load

Can anyone help me to solve that? Thank you in advance.

I added ojdbc11.jar to the jars folder of Spark.
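One quick way to double-check that step is to list the Oracle driver jars that sit in Spark's jars folder. This is only a minimal sketch: the helper name `find_ojdbc_jars` is made up for illustration, and `/usr/local/spark` is assumed as the Spark home only because that path appears in the traceback above.

```python
import glob
import os


def find_ojdbc_jars(spark_home):
    """Return the names of ojdbc*.jar files found in <spark_home>/jars, sorted."""
    pattern = os.path.join(spark_home, "jars", "ojdbc*.jar")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


# SPARK_HOME is used when it is set; the fallback path is an assumption
# taken from the traceback, so adjust it for your installation.
spark_home = os.environ.get("SPARK_HOME", "/usr/local/spark")
print(find_ojdbc_jars(spark_home))
```

If the list is empty, or contains a different driver version than expected, the jar was probably copied to a different Spark installation than the one the notebook uses.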




Comments
Kontext, 11 months ago
#1786 Re: Error when connecting to Oracle database in PySpark

I'm glad it works for you.

Nguyen, 11 months ago
#1785 Re: Error when connecting to Oracle database in PySpark

Thanks Kontext. I followed that article: https://kontext.tech/article/1060/pyspark-read-data-from-oracle-database

That was successful.

My JDK version is 1.8.0_352 (OpenJDK 64-Bit Server VM).

The logs shown above are all the output I got when I ran that code.

Kontext, 11 months ago
#1781 Re: Error when connecting to Oracle database in PySpark

Hi Nguyen,

Welcome to Kontext!

For questions like this, you can post in our Forums in the future.

Have you followed this article? PySpark - Read Data from Oracle Database (https://kontext.tech/article/1060/pyspark-read-data-from-oracle-database).

Can you please try ojdbc 8 instead of 11? ojdbc 11 requires JDK 11. Spark 3.1.2 can run on JDK 11 technically. What is your JDK version?
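To make the version question concrete, here is a rough sketch that maps a Java version string to a driver jar. The function names are hypothetical, and the rule it encodes is only the one stated in this reply (ojdbc11 needs JDK 11 or later, otherwise use ojdbc8). Reading the version through `spark.sparkContext._jvm` relies on a private PySpark attribute, so treat that part as an assumption rather than a supported API.

```python
def java_major_version(version):
    """Parse a java.version string: '1.8.0_352' -> 8, '11.0.17' -> 11."""
    parts = version.split(".")
    # Versions before Java 9 are reported as 1.x (for example 1.8.0_352).
    if parts[0] == "1" and len(parts) > 1:
        return int(parts[1])
    # Newer versions start with the major number; drop suffixes like '-ea'.
    return int(parts[0].split("-")[0].split("+")[0])


def suggested_ojdbc_jar(version):
    """Pick ojdbc11.jar only when the JVM is Java 11 or later."""
    return "ojdbc11.jar" if java_major_version(version) >= 11 else "ojdbc8.jar"


def jvm_version(spark):
    """Ask the JVM behind an existing SparkSession for its java.version."""
    # _jvm is a private attribute of SparkContext (assumption: still available).
    return spark.sparkContext._jvm.java.lang.System.getProperty("java.version")
```

In a running session, `suggested_ojdbc_jar(jvm_version(spark))` would give the jar to try; for a JVM reporting `1.8.0_352` it returns `"ojdbc8.jar"`.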

The error message is not detailed. Can you paste the full error logs?
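If the notebook cuts the Java part of the error off, a sketch like the following may help to print it. It relies on `Py4JJavaError` carrying the Java throwable as `java_exception` (the attribute that `pyspark/sql/utils.py` reads in the traceback) and on the standard `toString()` and `getCause()` methods of Java throwables. The helper names are made up for illustration.

```python
def cause_chain(java_exception):
    """Collect the text of a Java throwable and of each nested cause."""
    messages = []
    current = java_exception
    while current is not None:
        messages.append(current.toString())
        current = current.getCause()
    return messages


def load_with_full_error(reader):
    """Call reader.load() and print every nested Java cause if it fails."""
    # Imported here so cause_chain can be used without py4j installed.
    from py4j.protocol import Py4JJavaError

    try:
        return reader.load()
    except Py4JJavaError as e:
        for depth, message in enumerate(cause_chain(e.java_exception)):
            print("  " * depth + message)
        raise
```

Here `reader` would be the `spark.read.format("jdbc")...` chain from the question without the final `.load()`. The innermost line printed is usually the most useful one to paste.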

