Run Multiple Python Scripts PySpark Application with yarn-cluster Mode

Raymond · 7 months ago
#1545 Re: Run Multiple Python Scripts PySpark Application with yarn-cluster Mode

Hi Venu,

The code example you provided performs a plain local file write, which has nothing to do with Spark:

with open("/user/user_name/myfile.ttl", mode="w+") as file:  # It's a Turtle file.
    file.write("This is truth")

The above lines will run inside the driver application's container in the Spark cluster, so the file is written to that container's local filesystem, not to HDFS.

That is why I made the comments before.
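To make the distinction concrete, here is a minimal sketch (the triple data, the output path, and the `spark` session name are hypothetical illustrations, not taken from your script) contrasting a plain local write with writing through Spark so the output lands on HDFS:

```python
# Hypothetical example data: (subject, predicate, object) IRI tuples.
def triples_to_turtle(triples):
    """Format (subject, predicate, object) IRI tuples as Turtle statements."""
    return ["<%s> <%s> <%s> ." % t for t in triples]

triples = [("http://example.org/s", "http://example.org/p", "http://example.org/o")]
lines = triples_to_turtle(triples)

# A plain-Python write like the snippet above runs on whichever node hosts
# the driver container, so "myfile.ttl" lands on that container's local
# disk (and disappears when the container is cleaned up), not on HDFS:
with open("myfile.ttl", mode="w") as f:
    f.write("\n".join(lines) + "\n")

# To persist the output on HDFS in yarn-cluster mode, write through Spark
# itself, e.g. (assuming an active SparkSession bound to the name `spark`):
#   spark.sparkContext.parallelize(lines) \
#        .saveAsTextFile("hdfs:///user/user_name/myfile_ttl")
```

The commented `saveAsTextFile` call is the usual way to get driver/executor output onto HDFS; a local `open()` path can never reach HDFS on its own.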

To illustrate further, could you share the complete script, if that is okay?

Quoted from venu · 7 months ago
Re: Run Multiple Python Scripts PySpark Application with yarn-cluster Mode

Hi Raymond

Thanks for the reply!

I have some doubts. The .txt file was just an example; what I actually want to store is a .ttl (Turtle) file containing RDF (Resource Description Framework) triples. I don't need to read the file: I have already read the data using Spark and stored it in a DataFrame. After transforming the data, I just want to write the program's output to a file in yarn-cluster mode.

Note: I have tried your suggestion, but it still gives the same error.

Can you please provide some more detailed explanation/solution?

Forum discussion under the Spark column.