Create Table with Parquet, Orc, Avro - Hive SQL


This page shows how to create Hive tables stored in the Parquet, Orc and Avro file formats using Hive SQL (HQL).

The following examples create managed tables. Similar syntax can be used to create external tables if Parquet, Orc or Avro files already exist in HDFS.
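As a rough illustration of the external table variant mentioned above, the sketch below points a table at Parquet files that are assumed to already exist in HDFS. The table name customer_parquet_ext and the directory /data/customer_parquet are hypothetical placeholders, not names from this article; replace them with your own.

```sql
-- Sketch only: the table name and the HDFS path are assumed for illustration.
-- EXTERNAL means Hive tracks the metadata but does not own the data files,
-- so dropping the table leaves the files in place.
CREATE EXTERNAL TABLE IF NOT EXISTS hql.customer_parquet_ext(cust_id INT, name STRING, created_date DATE)
COMMENT 'An external table over existing customer Parquet files.'
STORED AS PARQUET
LOCATION '/data/customer_parquet';
```

For Orc or Avro files, the same pattern should apply with STORED AS ORC or STORED AS AVRO, provided the column definitions match the schema of the existing files.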

Create table stored as Parquet


CREATE TABLE IF NOT EXISTS hql.customer_parquet(cust_id INT, name STRING, created_date DATE)
COMMENT 'A table to store customer records.'
STORED AS PARQUET;

Create table stored as Orc


CREATE TABLE IF NOT EXISTS hql.customer_orc(cust_id INT, name STRING, created_date DATE)
COMMENT 'A table to store customer records.'
STORED AS ORC;

Create table stored as Avro


CREATE TABLE IF NOT EXISTS hql.customer_avro(cust_id INT, name STRING, created_date DATE)
COMMENT 'A table to store customer records.'
STORED AS AVRO;
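To check which storage format a table actually uses, one option is to insert a test row and inspect the table metadata. The sketch below uses the hql.customer_parquet table from this page; the sample values are made up for illustration.

```sql
-- Insert one sample row (values are illustrative only).
-- SELECT without FROM avoids relying on VALUES-clause type handling.
INSERT INTO TABLE hql.customer_parquet
SELECT 1, 'Alice', CAST('2021-01-01' AS DATE);

-- Read the row back.
SELECT cust_id, name, created_date FROM hql.customer_parquet;

-- Show detailed metadata, including the SerDe library and the input/output formats.
DESCRIBE FORMATTED hql.customer_parquet;
```

In the DESCRIBE FORMATTED output, the SerDe Library and InputFormat rows indicate the storage format. If they name text classes rather than Parquet ones, the table was created without a STORED AS clause.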

Install Hive database

If you don't have a Hive database available for practicing Hive SQL, you can install Hive on Windows 10 via WSL.

Examples on this page are based on Hive 3.* syntax.

Run query

All of these SQL statements can be run using the Beeline CLI:

$HIVE_HOME/bin/beeline --silent=true

The command above starts Beeline against the default HiveServer2 service. Once Beeline is loaded, type the following command to open a connection to HiveServer2 at localhost:10000:

0: jdbc:hive2://localhost:10000> !connect jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000: hive
Enter password for jdbc:hive2://localhost:10000:
1: jdbc:hive2://localhost:10000>
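The CREATE TABLE statements on this page place their tables in a database named hql. Assuming that database does not exist yet, a minimal sketch of what to run at the Beeline prompt before creating the tables is:

```sql
-- Create the database used by the examples on this page, if it is missing.
CREATE DATABASE IF NOT EXISTS hql;

-- Confirm the database is listed.
SHOW DATABASES;

-- After running the CREATE TABLE statements, list the tables in the database.
SHOW TABLES IN hql;
```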


Last modified by Administrator 4 months ago.