Tag - spark

Debug PySpark Code in Visual Studio Code

21 views   0 comments last modified about 16 days ago

This page summarizes the steps required to run and debug PySpark (Spark for Python) in Visual Studio Code. Install Python and pip: install Python from the official website: https://...

Implement SCD Type 2 Full Merge via Spark Data Frames

307 views   0 comments last modified about 2 months ago

Overview: SQL developers who are familiar with SCD and merge statements may wonder how to implement the same on big data platforms, given that databases and storage in Hadoop are not designed or optimised for record-level updates and inserts. In this post, I’m going to demons...

PySpark: Convert JSON String Column to Array of Object (StructType) in Data Frame

421 views   0 comments last modified about 3 months ago

This post shows how to derive a new column in a Spark data frame from a JSON array string column. I am running the code in Spark 2.2.1, though it is compatible with Spark 1.6.0 (with fewer JSON SQL functions). Prerequisites: Refer to the following post to install Spark in Windows. ...

Install Zeppelin 0.7.3 in Windows

2456 views   6 comments last modified about 2 years ago

This post summarizes the steps to install Zeppelin 0.7.3 in a Windows environment. Tools and Environment: GIT Bash, Command Prompt, Windows 10. Download Binary Package: Download the latest binary package from the following website: ...

Write and Read Parquet Files in Spark/Scala

7228 views   2 comments last modified about 2 years ago

In this page, I’m going to demonstrate how to write and read parquet files in Spark/Scala by using the Spark SQLContext class. Reference: What is parquet format? Go to the following project site to understand more about parquet. ...

Load Data into HDFS from SQL Server via Sqoop

1073 views   0 comments last modified about 12 months ago

This page shows how to import data from SQL Server into Hadoop via Apache Sqoop. Prerequisites: Please follow the link below to install Sqoop on your machine if you don’t have an environment ready. ...

Write and Read Parquet Files in HDFS through Spark/Scala

4053 views   0 comments last modified about 2 years ago

In my previous post (Write and Read Parquet Files in Spark/Scala), I demonstrated how to write and read parquet files in Spark/Scala. In that post, the parquet file destination was a local folder. In this page...

Read Text File from Hadoop in Zeppelin through Spark Context

2872 views   0 comments last modified about 2 years ago

Background: This page provides an example of loading a text file from HDFS through the SparkContext (sc) in Zeppelin. Reference: The details about this method can be found at: SparkContext.textFile ...

Install Spark 2.2.1 in Windows

473 views   0 comments last modified about 2 years ago

This page summarizes the steps to install Spark 2.2.1 in your Windows environment. Tools and Environment: GIT Bash, Command Prompt, Windows 10. Download Binary Package: Download the latest binary from the following site: ...
