Kontext Copilot - An AI-powered assistant for data analytics that runs on your local computer. Learn more
Get started
Pandas DataFrame groupBy and then COUNT DISTINCT
insights Stats
warning Please login first to view stats information.
Kontext
Code Snippets & Tips
Code snippets and tips for various programming languages/frameworks. All code examples are under MIT or Apache 2.0 license unless specified otherwise.
Code description
This code snippet shows you how to group a pandas DataFrame via certain column and then do a distinct count of unique values in another column.
It is similar as COUNT DISTINCT aggregation functions in SQL. It also sort the values by the group by column.
Example output:
category value user group-count
0 A 0 5 7
80 A 80 4 7
70 A 70 7 7
60 A 60 10 7
50 A 50 9 7
.. ... ... ... ...
29 J 29 9 7
19 J 19 9 7
9 J 9 9 7
89 J 89 8 7
99 J 99 7 7
[100 rows x 4 columns]
Code snippet
import pandas as pd import random categories = [] users = [] values = [] for i in range(0, 100): categories.append(chr(i % 10+65)) values.append(i) users.append(random.randint(1, 10)) df = pd.DataFrame({'category': categories, 'value': values, 'user': users}) print(df) df['group-count'] = df.groupby(by='category')['user'].transform('nunique') print(df.sort_values(by=['category']))
copyright
This page is subject to Site terms.
comment Comments
No comments yet.