Get the Current Spark Context Settings/Configurations


In Spark, there are a number of settings/configurations you can specify, including application properties and runtime parameters. The full list is documented in the official configuration guide:

https://spark.apache.org/docs/latest/configuration.html
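
Application properties are typically fixed when the session is created. For example, they can be supplied on the session builder, as in the minimal sketch below (spark.sql.shuffle.partitions is only an example property, not part of the original article; any key from the configuration guide can be used):

from pyspark.sql import SparkSession

# Application properties can be supplied on the builder before the
# session is created. spark.sql.shuffle.partitions is only used as an
# example property here.
spark = SparkSession.builder \
    .appName("Config Example") \
    .master("local[2]") \
    .config("spark.sql.shuffle.partitions", "8") \
    .getOrCreate()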

Get current configurations

To retrieve all the current configurations, you can use the following code (Python):

from pyspark.sql import SparkSession

appName = "PySpark Partition Example"
master = "local[8]"

# Create a Spark session.
spark = SparkSession.builder \
    .appName(appName) \
    .master(master) \
    .getOrCreate()

configurations = spark.sparkContext.getConf().getAll()
for conf in configurations:
    print(conf)

* The above code targets Spark 2.0 and later, where SparkSession is the unified entry point. For earlier releases, the same settings can be read from a SparkContext directly, as the sketch below shows.
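
The following is a minimal sketch for pre-2.0 versions (the application name and master are just the same example values used above):

from pyspark import SparkConf, SparkContext

# Build a SparkConf and create the context directly
# (the pre-2.0 entry point).
conf = SparkConf().setAppName("PySpark Partition Example").setMaster("local[8]")
sc = SparkContext(conf=conf)

# getConf().getAll() returns a list of (key, value) tuples.
for item in sc.getConf().getAll():
    print(item)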

The output of either approach looks similar to the following:

('spark.rdd.compress', 'True')
('spark.app.name', 'PySpark Partition Example')
('spark.app.id', 'local-1554464117837')
('spark.master', 'local[8]')
('spark.serializer.objectStreamReset', '100')
('spark.executor.id', 'driver')
('spark.submit.deployMode', 'client')
('spark.driver.host', 'Raymond-Alienware')
('spark.driver.port', '11504')
('spark.ui.showConsoleProgress', 'true')
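
To look up a single setting, or to adjust a runtime parameter without recreating the session, you can use the session's conf interface. The sketch below assumes the spark session from the example above; spark.sql.shuffle.partitions is used only as an example of a runtime-mutable property:

# Read a single configuration value; a default can be supplied
# for keys that have not been set.
print(spark.conf.get("spark.master"))
print(spark.conf.get("spark.sql.shuffle.partitions", "200"))

# Runtime parameters can be changed on the live session; static
# properties such as spark.master cannot be modified this way.
spark.conf.set("spark.sql.shuffle.partitions", "8")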
