This article shows how to convert a list of Python dictionaries to a Spark DataFrame. The code snippets run in Spark 2.x environments.

Input

The input data (a list of dictionaries) looks like the following:

data = [{"Category": 'Category A', 'ItemID': 1, 'Amount': 12.40},
        {"Category": 'Category B', 'ItemID': 2, 'Amount': 30.10},
        {"Category": 'Category C', 'ItemID': 3, 'Amount': 100.01},
        {"Category": 'Category A', 'ItemID': 4, 'Amount': 110.01},
        {"Category": 'Category B', 'ItemID': 5, 'Amount': 70.85}
        ]

Solution 1 - Infer schema

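In Spark 2.x, a DataFrame can be created directly from a list of Python dictionaries, and the schema is inferred automatically. Note that Spark 2.x prints a warning that inferring the schema from dict is deprecated and recommends pyspark.sql.Row instead.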

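from pyspark.sql import SparkSession

# Reuse the active session (e.g. in the pyspark shell) or create one; the app name is arbitrary
spark = SparkSession.builder.appName('python-dict-list').getOrCreate()

def infer_schema():
    # Create the DataFrame; column names and types are inferred from the dictionaries
    df = spark.createDataFrame(data)
    print(df.schema)
    df.show()

The output looks like the following:
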
StructType(List(StructField(Amount,DoubleType,true),StructField(Category,StringType,true),StructField(ItemID,LongType,true)))
+------+----------+------+
|Amount|  Category|ItemID|
+------+----------+------+
|  12.4|Category A|     1|
|  30.1|Category B|     2|
|100.01|Category C|     3|
|110.01|Category A|     4|
| 70.85|Category B|     5|
+------+----------+------+
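
The inferred schema above follows two visible patterns: columns are ordered alphabetically by dictionary key, and each Python value type maps to a Spark type (float to DoubleType, str to StringType, int to LongType). The plain-Python sketch below mimics only that visible behaviour, so you can predict the result before running Spark. The helper name predict_inferred_fields and the small type table are illustrative assumptions, not part of the PySpark API, and Spark's real inference handles many more types.

```python
# Illustrative only: mimics the inference result shown above, not Spark's actual code.
# Mapping from exact Python types to the Spark type names seen in the output.
PYTHON_TO_SPARK = {
    float: 'DoubleType',
    str: 'StringType',
    int: 'LongType',
    bool: 'BooleanType',
}


def predict_inferred_fields(record):
    """Return (column, Spark type name) pairs, sorted by column name."""
    fields = []
    for key in sorted(record):
        value = record[key]
        # Look up the exact type so that bool is not mistaken for int
        spark_type = PYTHON_TO_SPARK.get(type(value), 'unknown')
        fields.append((key, spark_type))
    return fields


record = {"Category": 'Category A', 'ItemID': 1, 'Amount': 12.40}
print(predict_inferred_fields(record))
```

For the first input record this prints the pairs in the same order as the inferred schema: Amount, Category, ItemID.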

Solution 2 - Explicit schema

Of course, you can also define the schema directly when creating the data frame:

from pyspark.sql.types import StructType, StructField, StringType, IntegerType, FloatType

def explicit_schema():
    # Create a schema for the dataframe
    schema = StructType([
        StructField('Category', StringType(), False),
        StructField('ItemID', IntegerType(), False),
        StructField('Amount', FloatType(), True)
    ])

    # Create data frame
    df = spark.createDataFrame(data, schema)
    print(df.schema)
    df.show()

In this way, you can control the data types explicitly. The output looks like the following:

StructType(List(StructField(Category,StringType,false),StructField(ItemID,IntegerType,false),StructField(Amount,FloatType,true)))
+----------+------+------+
|  Category|ItemID|Amount|
+----------+------+------+
|Category A|     1|  12.4|
|Category B|     2|  30.1|
|Category C|     3|100.01|
|Category A|     4|110.01|
|Category B|     5| 70.85|
+----------+------+------+
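
Notice that the column order differs from the inferred one: here the columns follow the order declared in the schema, whereas schema inference on dictionaries lists them alphabetically by key.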
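
If you want inferred types but a fixed column order, one option is to turn each dictionary into a tuple and pass a list of column names as the schema argument of createDataFrame. The sketch below is a minimal illustration under that assumption. to_rows and build_dataframe are hypothetical helper names, and the spark argument is assumed to be an existing SparkSession.

```python
def to_rows(records, columns):
    """Convert dictionaries to tuples whose values follow the given column order."""
    return [tuple(record[name] for name in columns) for record in records]


def build_dataframe(spark, records, columns):
    """Create a DataFrame with inferred types and the column order given in `columns`.

    `spark` is assumed to be an existing SparkSession.
    """
    # A list of column names as the schema keeps type inference but fixes names and order
    return spark.createDataFrame(to_rows(records, columns), columns)


columns = ['Category', 'ItemID', 'Amount']
sample = [{"Category": 'Category A', 'ItemID': 1, 'Amount': 12.40},
          {"Category": 'Category B', 'ItemID': 2, 'Amount': 30.10}]
print(to_rows(sample, columns))
```

With a running session, build_dataframe(spark, data, columns).show() should list the columns as Category, ItemID, Amount while still inferring the types from the values.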

Summary

You can easily convert a list of Python dictionaries to a Spark DataFrame in Spark 2.x, either by letting Spark infer the schema or by specifying it explicitly.

Complete code

The code is available on GitHub:

https://github.com/FahaoTang/spark-examples/tree/master/python-dict-list

Last modified by Raymond 4 months ago.
