Turn off INFO logs in Spark


Spark implements logging in all of its modules, and at the default INFO level the console output can become too verbose. This article shows you how to hide INFO logs in the console output.

Spark logging level

The log level can be set using the function pyspark.SparkContext.setLogLevel.

The function is defined as follows:

def setLogLevel(self, logLevel):
        """
        Control our logLevel. This overrides any user-defined log settings.
        Valid log levels include: ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, WARN
        """
        self._jsc.setLogLevel(logLevel)
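
As a small illustration of the valid level names listed in the docstring, the following sketch checks a level name before it is passed to setLogLevel. The helper normalize_log_level and the constant VALID_LOG_LEVELS are hypothetical names introduced here for illustration; they are not part of PySpark.

```python
# Level names copied from the setLogLevel docstring above.
VALID_LOG_LEVELS = frozenset(
    ["ALL", "DEBUG", "ERROR", "FATAL", "INFO", "OFF", "TRACE", "WARN"]
)


def normalize_log_level(level: str) -> str:
    """Return the upper-cased level name, or raise ValueError if it is not valid.

    Hypothetical helper for illustration; it is not part of PySpark.
    """
    name = level.strip().upper()
    if name not in VALID_LOG_LEVELS:
        raise ValueError(
            f"Unknown log level {level!r}; expected one of {sorted(VALID_LOG_LEVELS)}"
        )
    return name


print(normalize_log_level("warn"))  # prints WARN
```

With a Spark session available, the result could be passed on as spark.sparkContext.setLogLevel(normalize_log_level("warn")), so that a mistyped level fails early with a plain Python error.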

Set log level to WARN

The following code sets the log level to WARN:

from pyspark.sql import SparkSession

appName = "Spark - Setting Log Level"
master = "local"

# Create Spark session
spark = SparkSession.builder \
    .appName(appName) \
    .master(master) \
    .getOrCreate()

spark.sparkContext.setLogLevel("WARN")

When running the script with some actions, the console still prints the INFO logs that are emitted before setLogLevel is called, such as those written while the Spark session starts.

Change Spark logging config file

Follow these steps to configure logging at the system level (this requires access to the Spark conf folder):

  1. Navigate to the Spark home folder.
  2. Go to the conf subfolder, which holds all configuration files.
  3. Create the file log4j.properties from the template file log4j.properties.template.
  4. Edit log4j.properties and change the default (root) logging level from INFO to WARN.
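
The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the article's own method: it assumes the template contains a root logger line of the form log4j.rootCategory=INFO, console, as in the Log4j 1.x template shipped with older Spark releases. The function names set_root_level and create_log4j_properties are hypothetical. Newer Spark releases that use Log4j 2 ship a log4j2.properties.template with a different syntax, which this sketch does not handle.

```python
import re
from pathlib import Path

# Matches the root logger line of the Log4j 1.x template, for example
# "log4j.rootCategory=INFO, console" (assumed template content).
ROOT_LOGGER = re.compile(r"^(log4j\.rootCategory\s*=\s*)\w+", re.MULTILINE)


def set_root_level(properties_text: str, level: str = "WARN") -> str:
    """Return the properties text with the root logger level replaced."""
    return ROOT_LOGGER.sub(lambda m: m.group(1) + level, properties_text)


def create_log4j_properties(conf_dir: str, level: str = "WARN") -> Path:
    """Write log4j.properties from log4j.properties.template with a new root level.

    conf_dir is the Spark conf folder (steps 1 and 2); the template is read
    and the edited copy is written next to it (steps 3 and 4).
    """
    conf = Path(conf_dir)
    template = conf / "log4j.properties.template"
    target = conf / "log4j.properties"
    target.write_text(set_root_level(template.read_text(), level))
    return target


# In-memory demonstration on lines written in the style of the template.
sample = (
    "log4j.rootCategory=INFO, console\n"
    "log4j.appender.console=org.apache.log4j.ConsoleAppender\n"
)
print(set_root_level(sample))
```

Calling create_log4j_properties with the path of the conf folder writes the edited file. Note that it overwrites an existing log4j.properties, so back that file up first if it already contains custom settings.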

Run the application again and the console output is much cleaner, with the INFO logs no longer shown.


For Scala

The system-level Spark configuration above applies to all programming languages supported by Spark, including Scala.

To change the log level programmatically, use the following Scala code:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()
spark.sparkContext.setLogLevel("WARN")

If you use the Spark shell, you can access the SparkContext directly via sc:

sc.setLogLevel("WARN")


Run Spark code

You can easily run Spark code on Windows or UNIX-like (Linux, macOS) systems. If you don't have a Spark environment yet, follow the installation guides to set one up.

Last modified by Raymond 2 months ago.