EMR - Expected scheme-specific part at index * : s3:

2022-04-28
Raymond Raymond Cloud Computing


When reading data files from S3 with Spark on AWS EMR, or when running a spark-submit command, the following exception can occur:

java.net.URISyntaxException: Expected scheme-specific part at index * : s3:
at org.apache.hadoop.fs.Path.initialize(Path.java:263)

Debug this issue

The error message already hints at the cause: a URI is malformed or in an invalid format.

To fix the issue, check every URL that appears in your spark-submit command or Spark scripts.
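Inside a Spark script, this check can be automated before a path is handed to Spark. The following is a minimal sketch, not part of Spark or EMR: the helper name check_s3_uri and the list of accepted schemes (s3, s3a, s3n) are assumptions for illustration, and it only uses Python's standard urllib.parse module.

```python
from urllib.parse import urlsplit

# Schemes assumed to be valid for S3 paths here; adjust to your environment.
S3_SCHEMES = {"s3", "s3a", "s3n"}


def check_s3_uri(uri: str) -> list[str]:
    """Return a list of problems found in an S3 URI (empty if none were found)."""
    problems = []
    # Check the raw string first: whitespace anywhere in the URI is suspicious.
    if any(ch.isspace() for ch in uri):
        problems.append("contains whitespace")
    parts = urlsplit(uri)
    if parts.scheme not in S3_SCHEMES:
        problems.append("unexpected scheme")
    # For s3://bucket/key, the bucket name is parsed as the network location.
    if not parts.netloc:
        problems.append("missing bucket name after '://'")
    return problems
```

A script could call check_s3_uri on each input path and stop with a clear message when the returned list is not empty, instead of failing later with the Java exception.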

For instance, the following command has spaces in the URL:

spark-submit --py-files "s3://mybucket/ spark/package.zip" ...

To resolve this issue, remove the space from the S3 path so that it reads s3://mybucket/spark/package.zip.
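For long commands, a stray space can be hard to spot by eye. The sketch below tokenizes a command line with Python's shlex module and reports S3 arguments that contain whitespace. The function name find_s3_args_with_spaces and the trailing job.py argument in the sample command are hypothetical and only serve the illustration.

```python
import shlex

S3_PREFIXES = ("s3://", "s3a://", "s3n://")


def find_s3_args_with_spaces(command: str) -> list[str]:
    """Return the S3 URIs in a shell command that contain whitespace."""
    suspects = []
    for token in shlex.split(command):
        # Options such as --py-files accept comma-separated lists of paths.
        for item in token.split(","):
            is_s3 = item.lstrip().startswith(S3_PREFIXES)
            if is_s3 and any(ch.isspace() for ch in item):
                suspects.append(item)
    return suspects


# Hypothetical command; job.py stands in for the rest of the arguments.
cmd = 'spark-submit --py-files "s3://mybucket/ spark/package.zip" job.py'
print(find_s3_args_with_spaces(cmd))  # ['s3://mybucket/ spark/package.zip']
```

Note that this only catches spaces inside quoted arguments. If the path is not quoted, the shell splits it into two separate arguments, and that case needs a different check.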

