
Install Big Data Tools (Spark, Zeppelin, Hadoop) in Windows for Learning and Practice

By Raymond Tang, last modified about 4 months ago


Are you a Windows/.NET developer who wants to learn big data concepts and tools on Windows?

If so, follow the links below to install them on your PC. Installation is usually easier on Linux/UNIX, but it is not difficult on Windows either, since these tools are based on Java.
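Because everything here runs on Java, it can help to confirm that a Java runtime is visible before starting any of the guides. The snippet below is a minimal sketch and is not taken from the linked guides: it assumes Python is available on the machine, and the function name `check_java` is hypothetical. It only reports the `JAVA_HOME` environment variable and whether a `java` executable can be found on the `PATH`; the individual guides may require specific Java versions that this does not check.

```python
import os
import shutil


def check_java(env=None, which=shutil.which):
    """Report whether a Java runtime looks reachable from this shell.

    `env` and `which` are injectable only to make the function easy to test;
    by default the real environment and PATH lookup are used.
    """
    env = os.environ if env is None else env
    return {
        "java_home": env.get("JAVA_HOME"),  # None when the variable is unset
        "java_on_path": which("java"),      # None when no java executable is found
    }


if __name__ == "__main__":
    for name, value in check_java().items():
        print(f"{name}: {value or 'not found'}")
```

If either value comes back as "not found", installing a JDK and setting `JAVA_HOME` is probably the first thing to sort out before following the guides.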

Installation guides

All of the following guides are based on Windows 10. The steps should be the same on other Windows versions, though some of the screenshots may look different.

Install Zeppelin 0.7.3 in Windows

Install Hadoop 3.0.0 in Windows (Single Node)

Install Spark 2.2.1 in Windows

Install Apache Sqoop in Windows

Configure Hadoop 3.1.0 in a Multi Node Cluster

Learning tutorials - latest update (2018-05-06)

Use Hadoop File System Task in SSIS to Write File into HDFS

Invoke Hadoop WebHDFS APIs in .NET Core

Write and Read Parquet Files in Spark/Scala

Write and Read Parquet Files in HDFS through Spark/Scala

Convert String to Date in Spark (Scala)

Read Text File from Hadoop in Zeppelin through Spark Context

Connecting Apache Zeppelin to your SQL Server

Load Data into HDFS from SQL Server via Sqoop

Default Ports Used by Hadoop Services (HDFS, MapReduce, YARN)

I will keep updating my blog with new tutorials. Feel free to subscribe to this blog (RSS).

