Concatenate Columns in Spark DataFrame
Code description
This code snippet shows one example of concatenating columns with a separator in a Spark DataFrame, using the function concat_ws
directly. For the Spark SQL version, refer to Spark SQL - Concatenate w/o Separator (concat_ws and concat).
Syntax of concat_ws
```python
pyspark.sql.functions.concat_ws(sep: str, *cols: ColumnOrName)
```
Output:
```
+-----+--------+--------------+
| col1|    col2|     col1_col2|
+-----+--------+--------------+
|Hello| Kontext| Hello,Kontext|
|Hello|Big Data|Hello,Big Data|
+-----+--------+--------------+
```
Code snippet
```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import concat_ws

app_name = "PySpark concat_ws Example"
master = "local"

spark = SparkSession.builder \
    .appName(app_name) \
    .master(master) \
    .getOrCreate()
spark.sparkContext.setLogLevel("WARN")

# Create a DataFrame with two string columns
df = spark.createDataFrame(
    [['Hello', 'Kontext'], ['Hello', 'Big Data']],
    ['col1', 'col2'])

# Concatenate the two columns using separator ','
df = df.withColumn('col1_col2', concat_ws(',', df.col1, df.col2))
df.show()
```