When reading data files from S3 in Spark on AWS EMR, or when running a
spark-submit command, the following exception can occur:
java.net.URISyntaxException: Expected scheme-specific part at index * : s3:
Debug this issue
The error message itself already hints at where the issue occurs: a URI is malformed or in an invalid format.
To fix this issue, check every URI and path in your
spark-submit command and Spark scripts.
For instance, the following command has a space in the S3 URL:
spark-submit --py-files "s3://mybucket/ spark/package.zip" ...
To resolve this issue, remove the space from the S3 path.
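Stray whitespace like this is easy to miss in a long command, so it can help to check S3 paths before they reach spark-submit. The following is a minimal sketch in Python, not part of Spark or EMR: the helper name check_s3_uri and the list of accepted schemes (s3, s3a, s3n) are assumptions made for illustration. It only flags whitespace, an unexpected scheme, or a missing bucket name, and does not verify that the object exists.

```python
import re
from urllib.parse import urlparse

# Schemes accepted by this sketch; adjust to whatever your cluster uses (assumption).
ALLOWED_SCHEMES = ("s3", "s3a", "s3n")


def check_s3_uri(uri):
    """Return a list of problems found in an S3 URI (empty list if none)."""
    problems = []
    stripped = uri.strip()

    # Whitespace around the URI, e.g. from a copy-paste or a quoting mistake.
    if uri != stripped:
        problems.append("leading or trailing whitespace")

    # Whitespace inside the URI, as in "s3://mybucket/ spark/package.zip".
    if re.search(r"\s", stripped):
        problems.append("whitespace inside the path")

    parsed = urlparse(stripped)
    if parsed.scheme not in ALLOWED_SCHEMES:
        problems.append("unexpected scheme: %r" % parsed.scheme)
    if not parsed.netloc:
        problems.append("missing bucket name")

    return problems


if __name__ == "__main__":
    for candidate in ["s3://mybucket/ spark/package.zip",
                      "s3://mybucket/spark/package.zip"]:
        print(repr(candidate), check_s3_uri(candidate))
```

Running such a check over every path passed to options like --py-files, and over paths built inside your Spark scripts, gives a clearer message than the URISyntaxException raised later by the JVM.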