
PySpark DataFrame - Calculate Distinct Count of Column(s)

Author: Kontext | Posted: 2 years ago | Language: English

Code description

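This code snippet shows how to calculate distinct counts in a PySpark DataFrame using the countDistinct PySpark SQL function. It counts the distinct values of a single column (ID) and the distinct combinations of all columns (ID and Value), which is the number of distinct rows.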

Output:

+---+-----+
| ID|Value|
+---+-----+
|101|   56|
|101|   67|
|102|   70|
|103|   93|
|104|   70|
+---+-----+

+-----------------+------------------+
|DistinctCountOfID|DistinctCountOfRow|
+-----------------+------------------+
|                4|                 5|
+-----------------+------------------+