Hadoop, Hive & HBase
Articles
Build Latest Hadoop on Windows 10 natively via Docker
Hadoop build error - Could not find a SASL library (GSASL (gsasl) or Cyrus SASL (libsasl2))
Introduction to Hive Bucketed Table
Configure HiveServer2 to Enable Transactions (ACID Support)
Hive ACID Inserts, Updates and Deletes with ORC
Hive SQL - Analytics with GROUP BY and GROUPING SETS, Cubes, Rollups
Extract Values from XML Column in Hive Tables
Hive SQL - Virtual Columns
Hive SQL - Cluster By and Distribute By
Hive SQL - Differences between Order By and Sort By
Hive SQL - Aggregate Functions Overview with Examples
Hive - Create External Table for Multiline CSV Files
Install Hadoop 3.3.2 in WSL on Windows
Connect to HBase in Python via HappyBase
Install Ambari 2.7.6 on Windows via WSL to Provision Hadoop Cluster
Install Hadoop 3.3.1 on Windows 10 Step by Step Guide
This detailed step-by-step guide shows you how to install the latest Hadoop v3.3.1 on Windows 10. It uses the Hadoop 3.3.1 winutils tool, so WSL is not required. Hadoop 3.3.1 was released on June 15, 2021.
Hadoop 3.3.1 winutils
Install HBase in WSL - Pseudo-Distributed Mode
A detailed step-by-step guide to installing an HBase 2.4.1 pseudo-distributed cluster with Hadoop 3.2.0 on an Ubuntu distro in Windows Subsystem for Linux (WSL).
Install HBase in WSL - Standalone Mode
Hadoop Daemon Log Files Location
Python: Load Data from Hive
Apache Hive 3.1.2 Installation on Linux Guide
Install Hadoop 3.3.0 on macOS
This article provides step-by-step guidance for installing Hadoop 3.3.0 on macOS. Hadoop 3.3.0 was released on July 14, 2020. It is the first release of the Apache Hadoop 3.3 line. There are significant changes compared with Hadoop 3.2.0, such as Java 11 runtime support, a protobuf upgrade to 3.7.1, scheduling of opportunistic containers, and non-volatile SCM support in HDFS cache directives.
Create Temporary Table - Hive SQL
Create Table as SELECT - Hive SQL
Create Bucketed Sorted Table - Hive SQL
Create Partitioned Table - Hive SQL
Create Table Stored as CSV, TSV, JSON Format - Hive SQL
Create Table with Parquet, Orc, Avro - Hive SQL
Create, Drop, and Truncate Table - Hive SQL
Create, Drop, Alter and Use Database - Hive SQL
Load File into HDFS through WebHDFS APIs
Apache Hive 3.1.2 Installation on Windows 10
Install Hadoop 3.3.0 on Linux
Install Hadoop 3.3.0 on Windows 10 Step by Step Guide
This detailed step-by-step guide shows you how to install the latest Hadoop v3.3.0 on Windows 10. It uses the Hadoop 3.3.0 winutils tool, so WSL is not required. Hadoop 3.3.0 was released on July 14, 2020. It is the first release of the Apache Hadoop 3.3 line. There are significant changes compared with Hadoop 3.2.0, such as Java 11 runtime support, a protobuf upgrade to 3.7.1, scheduling of opportunistic containers, and non-volatile SCM support in HDFS cache directives.
Hadoop 3.3.0 winutils
Install Hadoop 3.3.0 on Windows 10 using WSL
Hive: Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
Ingest Data into HDFS from NAS or Windows Shared Folder
Differences between Hive External and Internal (Managed) Tables
Fix for Hadoop 3.2.1 namenode format issue on Windows 10
Compile and Build Hadoop 3.2.1 on Windows 10 Guide
Install Hadoop 3.2.1 on Windows 10 Step by Step Guide
This detailed step-by-step guide shows you how to install the latest Hadoop (v3.2.1) on Windows 10. It also provides a temporary fix for bug HDFS-14084 (java.lang.UnsupportedOperationException INFO).