Hadoop, Hive & HBase

Articles

Build Latest Hadoop on Windows 10 natively via Docker

2022-12-11

Hadoop build error - Could not find a SASL library (GSASL (gsasl) or Cyrus SASL (libsasl2))

2022-12-11

Introduction to Hive Bucketed Table

2022-08-24

Configure HiveServer2 to Enable Transactions (ACID Support)

2022-08-20

Hive ACID Inserts, Updates and Deletes with ORC

2022-08-17

Hive SQL - Analytics with GROUP BY and GROUPING SETS, Cubes, Rollups

2022-07-23

Extract Values from XML Column in Hive Tables

2022-07-23

Hive SQL - Virtual Columns

2022-07-23

Hive SQL - Cluster By and Distribute By

2022-07-10

Hive SQL - Differences between Order By and Sort By

2022-07-10

Hive SQL - Aggregate Functions Overview with Examples

2022-07-10

Hive - Create External Table for Multiline CSV Files

2022-06-01

Install Hadoop 3.3.2 in WSL on Windows

2022-04-18

Connect to HBase in Python via HappyBase

2022-03-22

Install Ambari 2.7.6 on Windows via WSL to Provision Hadoop Cluster

2021-12-29

Install Hadoop 3.3.1 on Windows 10 Step by Step Guide

This detailed step-by-step guide shows you how to install the latest Hadoop v3.3.1 on Windows 10. It uses the Hadoop 3.3.1 winutils tool, so WSL is not required. This version was released on June 15, 2021.

2021-10-12

Hadoop 3.3.1 winutils

2021-09-27

Install HBase in WSL - Pseudo-Distributed Mode

A detailed step-by-step guide to installing an HBase 2.4.1 pseudo-distributed cluster with Hadoop 3.2.0 on the Ubuntu distro in Windows Subsystem for Linux (WSL).

2021-02-03

Install HBase in WSL - Standalone Mode

2021-02-02

Hadoop Daemon Log Files Location

2021-01-21

Python: Load Data from Hive

2021-01-06

Apache Hive 3.1.2 Installation on Linux Guide

2020-12-27

Install Hadoop 3.3.0 on macOS

This article provides step-by-step guidance for installing Hadoop 3.3.0 on macOS. Hadoop 3.3.0 was released on July 14, 2020, and is the first release in the Apache Hadoop 3.3 line. It brings significant changes compared with Hadoop 3.2.0, such as Java 11 runtime support, a protobuf upgrade to 3.7.1, scheduling of opportunistic containers, and non-volatile SCM support in HDFS cache directives.

2020-12-22

Create Temporary Table - Hive SQL

2020-08-25

Create Table as SELECT - Hive SQL

2020-08-25

Create Bucketed Sorted Table - Hive SQL

2020-08-25

Create Partitioned Table - Hive SQL

2020-08-25

Create Table Stored as CSV, TSV, JSON Format - Hive SQL

2020-08-25

Create Table with Parquet, Orc, Avro - Hive SQL

2020-08-25

Create, Drop, and Truncate Table - Hive SQL

2020-08-24

Create, Drop, Alter and Use Database - Hive SQL

2020-08-24

Load File into HDFS through WebHDFS APIs

2020-08-22

Apache Hive 3.1.2 Installation on Windows 10

2020-08-10

Install Hadoop 3.3.0 on Linux

2020-08-04

Install Hadoop 3.3.0 on Windows 10 Step by Step Guide

This detailed step-by-step guide shows you how to install the latest Hadoop v3.3.0 on Windows 10. It uses the Hadoop 3.3.0 winutils tool, so WSL is not required. This version was released on July 14, 2020, and is the first release in the Apache Hadoop 3.3 line. It brings significant changes compared with Hadoop 3.2.0, such as Java 11 runtime support, a protobuf upgrade to 3.7.1, scheduling of opportunistic containers, and non-volatile SCM support in HDFS cache directives.

2020-08-01

Hadoop 3.3.0 winutils

2020-08-01

Install Hadoop 3.3.0 on Windows 10 using WSL

2020-07-31

Hive: Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V

2020-04-20

Ingest Data into HDFS from NAS or Windows Shared Folder

2020-03-08

Differences between Hive External and Internal (Managed) Tables

2020-02-22

Fix for Hadoop 3.2.1 namenode format issue on Windows 10

2020-01-25

Compile and Build Hadoop 3.2.1 on Windows 10 Guide

2020-01-19

Install Hadoop 3.2.1 on Windows 10 Step by Step Guide

This detailed step-by-step guide shows you how to install the latest Hadoop (v3.2.1) on Windows 10. It also provides a temporary fix for bug HDFS-14084 (java.lang.UnsupportedOperationException INFO).

2020-01-18

Apache Hive 3.1.1 Installation on Windows 10 using Windows Subsystem for Linux

2019-05-18

Install Hadoop 3.2.0 on Windows 10 using Windows Subsystem for Linux (WSL)

2019-05-11

HiveServer2 Cannot Connect to Hive Metastore Resolutions/Workarounds

2019-04-15

Configure a SQL Server Database as Remote Hive Metastore

2019-04-14

Copy Files from Hadoop HDFS to Local

2019-03-27

Hadoop on Windows - UNHEALTHY Data Nodes Fix

2019-03-26

Apache Hive 3.0.0 Installation on Windows 10 Step by Step Guide

2019-03-25

Resolve Hadoop ‘Name node is in safe mode’ Error

2018-05-13

Configure YARN and MapReduce Resources in Hadoop Cluster

2018-05-13

Default Ports Used by Hadoop Services (HDFS, MapReduce, YARN)

2018-04-29

Configure Hadoop 3.1.0 in a Multi Node Cluster

2018-04-28

Use Hadoop File System Task in SSIS to Write File into HDFS

2018-02-25

Hadoop datanode issue and resolution - ‘Incompatible clusterIDs’

2018-02-19

Install Hadoop 3.0.0 on Windows (Single Node)

2018-02-18