Pandas DataFrame Plot - Bar Chart
Recently, I've been doing some visualization and plotting with Pandas DataFrames in Jupyter Notebook. In this article, I'll show some examples of plotting bar charts (including stacked bar charts with series) from a Pandas DataFrame. I'm using Jupyter Notebook as the IDE and code execution environment.
Prepare the data
Use the following code snippet to create a Pandas DataFrame object in memory:
import pandas as pd
from datetime import datetime


def str_to_date(date_str):
    # Parse a 'YYYY-MM-DD' string into a datetime.date object
    return datetime.strptime(date_str, '%Y-%m-%d').date()


# One record per (date, type) combination
data = [
    {'DATE': str_to_date('2020-01-01'), 'TYPE': 'TypeA', 'SALES': 1000},
    {'DATE': str_to_date('2020-01-01'), 'TYPE': 'TypeB', 'SALES': 200},
    {'DATE': str_to_date('2020-01-01'), 'TYPE': 'TypeC', 'SALES': 300},
    {'DATE': str_to_date('2020-02-01'), 'TYPE': 'TypeA', 'SALES': 700},
    {'DATE': str_to_date('2020-02-01'), 'TYPE': 'TypeB', 'SALES': 400},
    {'DATE': str_to_date('2020-02-01'), 'TYPE': 'TypeC', 'SALES': 500},
    {'DATE': str_to_date('2020-03-01'), 'TYPE': 'TypeA', 'SALES': 300},
    {'DATE': str_to_date('2020-03-01'), 'TYPE': 'TypeB', 'SALES': 900},
    {'DATE': str_to_date('2020-03-01'), 'TYPE': 'TypeC', 'SALES': 100},
]
df = pd.DataFrame(data)
df
The content of the dataframe looks like the following:
| | DATE | TYPE | SALES |
|---|---|---|---|
| 0 | 2020-01-01 | TypeA | 1000 |
| 1 | 2020-01-01 | TypeB | 200 |
| 2 | 2020-01-01 | TypeC | 300 |
| 3 | 2020-02-01 | TypeA | 700 |
| 4 | 2020-02-01 | TypeB | 400 |
| 5 | 2020-02-01 | TypeC | 500 |
| 6 | 2020-03-01 | TypeA | 300 |
| 7 | 2020-03-01 | TypeB | 900 |
| 8 | 2020-03-01 | TypeC | 100 |
We will use this dataframe to create visuals/charts.
pandas.DataFrame.plot function
Refer to the official documentation for the pandas.DataFrame.plot function.
* If you are using a different version of Pandas, navigate to the corresponding version of the documentation.
matplotlib
The plot function depends on matplotlib, so make sure it is installed on your system. If it is not, you can install it with the following command:
!pip install matplotlib
If matplotlib is already present, pip reports that the requirement is already satisfied.
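Before installing anything, it can be handy to check from inside the notebook whether matplotlib is importable at all. The following is a minimal sketch using only the standard library; the variable name `matplotlib_installed` is just an illustrative choice.

```python
import importlib.util

# find_spec returns None when a top-level package cannot be located
matplotlib_installed = importlib.util.find_spec("matplotlib") is not None

if matplotlib_installed:
    import matplotlib
    print("matplotlib version:", matplotlib.__version__)
else:
    print("matplotlib is missing; install it before calling DataFrame.plot")
```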
Bar chart
Use the following code to plot a bar chart:
df.plot(kind='bar', x='DATE', y='SALES')
The chart shows nine bars, one per row of the dataframe, so each date appears three times on the X-axis.
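To confirm what was actually drawn, you can inspect the Axes object that plot returns. This is a small sketch rather than part of the original walkthrough: it rebuilds the same sample data in a more compact way and selects the non-interactive Agg backend, which is an assumption made so the code also runs outside Jupyter (in a notebook you can drop those two lines).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display

from datetime import date
import pandas as pd

# Same sample data as above, built column by column
df = pd.DataFrame({
    "DATE": [date(2020, month, 1) for month in (1, 2, 3) for _ in range(3)],
    "TYPE": ["TypeA", "TypeB", "TypeC"] * 3,
    "SALES": [1000, 200, 300, 700, 400, 500, 300, 900, 100],
})

ax = df.plot(kind="bar", x="DATE", y="SALES")

# Each bar is a Rectangle stored in ax.patches
bar_count = len(ax.patches)
bar_heights = [bar.get_height() for bar in ax.patches]
print(bar_count)
print(bar_heights)
```

Because no grouping has been applied yet, the bars simply follow the row order of the dataframe.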
Bar chart - groupby
Let's add a groupby and see how it looks:
df.groupby(['DATE','TYPE']).sum().plot(kind='bar')
The output is slightly better, as each X-axis label now shows the TYPE alongside the DATE.
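The reason TYPE shows up on the X-axis is that the grouped result is indexed by both grouping columns. The sketch below, which rebuilds the sample data so it can run on its own, prints that two-level index.

```python
from datetime import date
import pandas as pd

# Same sample data as above, built column by column
df = pd.DataFrame({
    "DATE": [date(2020, month, 1) for month in (1, 2, 3) for _ in range(3)],
    "TYPE": ["TypeA", "TypeB", "TypeC"] * 3,
    "SALES": [1000, 200, 300, 700, 400, 500, 300, 900, 100],
})

grouped = df.groupby(["DATE", "TYPE"]).sum()

# The index has two levels, DATE and TYPE; SALES is the only remaining column
print(grouped.index.names)
print(grouped.head(3))
```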
Bar chart - groupby and unstack
Let's unstack the dataframe after groupby.
df.groupby(['DATE','TYPE']).sum().unstack().plot(kind='bar')
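The output now shows one group of bars per date, with one bar per type.
This is what we are looking for; however, the word 'None' appears in the legend. It comes from the unnamed SALES level of the column MultiIndex that unstack creates.
To get rid of it, we just need to specify the y parameter: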
df.groupby(['DATE','TYPE']).sum().unstack().plot(kind='bar',y='SALES')
The legend now lists only the type names.
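To see where the 'None' came from, it helps to look at the column names before and after selecting SALES. The sketch below rebuilds the sample data so it runs on its own. It also shows pivot_table as one possible alternative way to reach the same wide table; that is a substitution on my part, not the approach used in this article.

```python
from datetime import date
import pandas as pd

# Same sample data as above, built column by column
df = pd.DataFrame({
    "DATE": [date(2020, month, 1) for month in (1, 2, 3) for _ in range(3)],
    "TYPE": ["TypeA", "TypeB", "TypeC"] * 3,
    "SALES": [1000, 200, 300, 700, 400, 500, 300, 900, 100],
})

wide = df.groupby(["DATE", "TYPE"]).sum().unstack()

# Two column levels: the unnamed one holding 'SALES', and 'TYPE'
column_level_names = list(wide.columns.names)
print(column_level_names)

# Selecting 'SALES' drops the unnamed level and leaves one column per type
sales_only = wide["SALES"]
print(list(sales_only.columns))

# Alternative (not from the article): pivot_table builds the same wide table directly
pivoted = df.pivot_table(index="DATE", columns="TYPE", values="SALES", aggfunc="sum")
print(pivoted)
```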
Stacked bar chart
Setting the stacked parameter to True in the plot function changes the chart to a stacked bar chart.
df.groupby(['DATE','TYPE']).sum().unstack().plot(kind='bar',y='SALES', stacked=True)
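In a stacked chart, the height of each full bar equals the total sales for that date. The following sketch checks this against the drawn rectangles. It rebuilds the sample data, selects SALES before plotting (equivalent to passing y='SALES'), and uses the Agg backend, which is an assumption made so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display

from datetime import date
import pandas as pd

# Same sample data as above, built column by column
df = pd.DataFrame({
    "DATE": [date(2020, month, 1) for month in (1, 2, 3) for _ in range(3)],
    "TYPE": ["TypeA", "TypeB", "TypeC"] * 3,
    "SALES": [1000, 200, 300, 700, 400, 500, 300, 900, 100],
})

sales_by_type = df.groupby(["DATE", "TYPE"]).sum().unstack()["SALES"]
ax = sales_by_type.plot(kind="bar", stacked=True)

# Total sales per date, computed from the data
monthly_totals = sales_by_type.sum(axis=1).tolist()

# Top edge of the tallest stack, computed from the drawn segments
segment_count = len(ax.patches)
tallest_stack = max(bar.get_y() + bar.get_height() for bar in ax.patches)
print(monthly_totals, segment_count, tallest_stack)
```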
Cumulative stacked bar chart
To create a cumulative stacked bar chart, we need to use the groupby function again, this time followed by cumsum:
df.groupby(['DATE','TYPE']).sum().groupby(level=[1]).cumsum().unstack().plot(kind='bar',y='SALES', stacked = True)
Each segment now shows the running total of sales for its type up to that date.
We group by level=[1] because that is the TYPE level of the index, and we want to accumulate sales by type across dates.
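Printing the intermediate table makes the accumulation easier to follow. This sketch rebuilds the sample data and refers to the index level by its name, "TYPE", which here is the same level as position 1.

```python
from datetime import date
import pandas as pd

# Same sample data as above, built column by column
df = pd.DataFrame({
    "DATE": [date(2020, month, 1) for month in (1, 2, 3) for _ in range(3)],
    "TYPE": ["TypeA", "TypeB", "TypeC"] * 3,
    "SALES": [1000, 200, 300, 700, 400, 500, 300, 900, 100],
})

grouped = df.groupby(["DATE", "TYPE"]).sum()

# Running total within each type, in date order, then one column per type
cumulative = grouped.groupby(level="TYPE").cumsum().unstack()["SALES"]
print(cumulative)
```

For example, TypeA goes from 1000 in January to 1700 in February (1000 + 700) and 2000 in March (1700 + 300).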
Horizontal bar chart
To create a horizontal bar chart, we just need to change the chart kind to barh.
df.groupby(['DATE','TYPE']).sum().unstack().plot(kind='barh',y='SALES')
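With barh, the values are drawn as bar widths and the first date ends up at the bottom of the chart. If you prefer the first date at the top, one option is matplotlib's invert_yaxis. The sketch below rebuilds the sample data, selects SALES before plotting (equivalent to passing y='SALES'), and uses the Agg backend as an assumption so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display

from datetime import date
import pandas as pd

# Same sample data as above, built column by column
df = pd.DataFrame({
    "DATE": [date(2020, month, 1) for month in (1, 2, 3) for _ in range(3)],
    "TYPE": ["TypeA", "TypeB", "TypeC"] * 3,
    "SALES": [1000, 200, 300, 700, 400, 500, 300, 900, 100],
})

sales_by_type = df.groupby(["DATE", "TYPE"]).sum().unstack()["SALES"]
ax = sales_by_type.plot(kind="barh")

# Horizontal bars carry the sales values in their width
bar_widths = sorted(bar.get_width() for bar in ax.patches)
print(bar_widths)

# Optional: put the first date at the top instead of the bottom
ax.invert_yaxis()
```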