Pass Environment Variables to Executors in PySpark

2019-12-03 · pyspark · spark · spark-2-x

Sometimes it is necessary to pass environment variables to Spark executors. To do so, use the setExecutorEnv function of the SparkConf class.

Code snippet

In the following code snippet, an environment variable named ENV_NAME is set to the value 'ENV_Value' for all executors.

from pyspark import SparkConf
from pyspark.sql import SparkSession

appName = "Python Example - Pass Environment Variable to Executors"
master = 'yarn'

# Build the Spark configuration, including the executor environment variable
conf = SparkConf().setMaster(master).setAppName(appName) \
    .setExecutorEnv('ENV_NAME', 'ENV_Value')

# Create the Spark session from the configuration
spark = SparkSession.builder.config(conf=conf).getOrCreate()
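
To verify that the variable actually reaches the executors, you can read it back with Python's os.environ inside a task. The following sketch assumes the spark session created above; the range size and partition count are arbitrary choices for illustration.

import os

# Each task reads ENV_NAME from its own process environment;
# distinct() collapses the values reported by all partitions.
rdd = spark.sparkContext.parallelize(range(4), 2)
seen = rdd.map(lambda _: os.environ.get('ENV_NAME')).distinct().collect()
print(seen)  # expected: ['ENV_Value']

Under the hood, setExecutorEnv simply sets the spark.executorEnv.ENV_NAME configuration property, so the same effect can also be achieved with --conf spark.executorEnv.ENV_NAME=ENV_Value on spark-submit.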