
This page summarizes the steps to install Spark 2.2.1 in your Windows environment.

Tools and Environment

  • GIT Bash
  • Command Prompt
  • Windows 10

Download Binary Package

Download the Spark 2.2.1 binary package from the Apache Spark downloads site (the file used here is spark-2.2.1-bin-hadoop2.7.tgz).

In my case, I am saving the file to the folder F:\DataAnalytics.

Unzip binary package

Open Git Bash, change directory (cd) to the folder where you saved the binary package, and then unzip it (note that in Git Bash the drive F:\ is addressed as /f/):

$ cd /f/DataAnalytics

fahao@Raymond-Alienware MINGW64 /f/DataAnalytics
$ tar -xvzf spark-2.2.1-bin-hadoop2.7.tgz

In my case, spark is extracted to: F:\DataAnalytics\spark-2.2.1-bin-hadoop2.7

Set up environment variables

  • Follow the section ‘JAVA_HOME environment variable’ in the following page to set up JAVA_HOME.
  • Set the SPARK_HOME environment variable to your Spark installation directory (F:\DataAnalytics\spark-2.2.1-bin-hadoop2.7 in my case).
  • Add ‘%SPARK_HOME%\bin’ to your PATH environment variable.
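Instead of using the System Properties dialog, the same variables can be persisted from Command Prompt with setx. The JDK path below is only an example, so adjust it to your actual installation:

```shell
:: Persist the environment variables for the current user.
:: The JAVA_HOME path is an example -- point it at your own JDK folder.
setx JAVA_HOME "C:\Program Files\Java\jdk1.8.0_161"
setx SPARK_HOME "F:\DataAnalytics\spark-2.2.1-bin-hadoop2.7"

:: Append Spark's bin folder so spark-shell and spark-submit resolve on PATH.
setx PATH "%PATH%;%SPARK_HOME%\bin"
```

Note that setx writes the user-scope variables and only takes effect in newly opened Command Prompt windows, so open a fresh window before verifying.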

Verify the installation

Verify command

Run the following command in Command Prompt to verify the installation.
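The verification command itself was not preserved in this copy of the page; a typical way to verify a Spark installation (my assumption, not necessarily the exact command from the original) is to launch the Spark shell:

```shell
:: Start the interactive Spark shell; a successful install prints the
:: Spark 2.2.1 version banner and drops you into a Scala REPL.
%SPARK_HOME%\bin\spark-shell
```

Type :quit to leave the shell once you have confirmed it starts cleanly.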


The screen should be similar to the following screenshot:


Run examples

Execute the following command in Command Prompt to run one of the examples shipped with the Spark installation (class SparkPi with argument 10).

%SPARK_HOME%\bin\run-example.cmd SparkPi 10

The output looks like the following:
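run-example.cmd is a thin wrapper around spark-submit. The equivalent direct invocation would be roughly the following (the jar name assumes Spark 2.2.1 built against Scala 2.11, which is the default for this release):

```shell
:: Submit the bundled SparkPi example directly via spark-submit,
:: running locally on all available cores, with 10 sample partitions.
%SPARK_HOME%\bin\spark-submit ^
  --class org.apache.spark.examples.SparkPi ^
  --master local[*] ^
  %SPARK_HOME%\examples\jars\spark-examples_2.11-2.2.1.jar 10
```

Knowing the spark-submit form is useful later, since it is the same command you use to run your own application jars.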

Spark context UI

As printed in the console output, the Spark context Web UI is available while the application runs (by default at http://localhost:4040).

The following is a screenshot of the UI:


Spark developer tools

Refer to the following page if you are interested in any Spark developer tools.

Last modified by Raymond 3 years ago.
