Create Table with Parquet, ORC, Avro - Hive SQL


This page shows how to create Hive tables stored in Parquet, ORC or Avro format via Hive SQL (HQL).

The following examples create managed tables; similar syntax can be applied to create external tables when Parquet, ORC or Avro files already exist in HDFS, as shown in the sketch below.
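For instance, if Parquet files already exist in an HDFS directory, an external table can be declared over them with a sketch like the following (the table name customer_parquet_ext and the path /data/customer are hypothetical; point LOCATION at your actual directory):

CREATE EXTERNAL TABLE IF NOT EXISTS hql.customer_parquet_ext(cust_id INT, name STRING, created_date DATE)
COMMENT 'An external table over existing Parquet files.'
STORED AS PARQUET
LOCATION '/data/customer';  -- hypothetical path; replace with your data directory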

Create table stored as Parquet

Example:

CREATE TABLE IF NOT EXISTS hql.customer_parquet(cust_id INT, name STRING, created_date DATE)
COMMENT 'A table to store customer records.'
STORED AS PARQUET;
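Once created, the table behaves like any other Hive table; rows written into it are serialized as Parquet transparently. A quick smoke test might look like this (the sample row is made up):

-- insert a sample record and read it back
INSERT INTO hql.customer_parquet VALUES (1, 'John', DATE '2021-01-01');
SELECT * FROM hql.customer_parquet;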

Create table stored as ORC

Example:

CREATE TABLE IF NOT EXISTS hql.customer_orc(cust_id INT, name STRING, created_date DATE)
COMMENT 'A table to store customer records.'
STORED AS ORC;
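ORC tables also accept format-specific options through TBLPROPERTIES; for example, orc.compress selects the compression codec (ZLIB is the default). A variant of the table above using Snappy compression might look like this (the table name customer_orc_snappy is illustrative):

CREATE TABLE IF NOT EXISTS hql.customer_orc_snappy(cust_id INT, name STRING, created_date DATE)
COMMENT 'A table to store customer records.'
STORED AS ORC
TBLPROPERTIES ('orc.compress'='SNAPPY');  -- codec options include NONE, ZLIB, SNAPPY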

Create table stored as Avro

Example:

CREATE TABLE IF NOT EXISTS hql.customer_avro(cust_id INT, name STRING, created_date DATE)
COMMENT 'A table to store customer records.'
STORED AS AVRO;
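With STORED AS AVRO, Hive derives the Avro schema from the column list. The schema can also be supplied explicitly through the avro.schema.literal (or avro.schema.url) table property; in that case the column list is omitted and Hive derives the columns from the schema. A minimal sketch (the table name customer_avro_explicit is illustrative):

CREATE TABLE IF NOT EXISTS hql.customer_avro_explicit
COMMENT 'A table to store customer records.'
STORED AS AVRO
TBLPROPERTIES ('avro.schema.literal'='{
  "type": "record",
  "name": "Customer",
  "namespace": "hql",
  "fields": [
    {"name": "cust_id", "type": "int"},
    {"name": "name", "type": "string"}
  ]
}');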

Install Hive

If you don't have an available Hive environment to practice Hive SQL, follow the article Install Hive on Windows 10 via WSL to set one up.

Examples on this page are based on Hive 3.* syntax.

Run queries

All these SQL statements can be run using the Beeline CLI:

$HIVE_HOME/bin/beeline --silent=true

The above command starts the Beeline shell in silent mode. Once Beeline is loaded, type the following command to connect to the default HiveServer2 service:

0: jdbc:hive2://localhost:10000> !connect jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000: hive
Enter password for jdbc:hive2://localhost:10000:
1: jdbc:hive2://localhost:10000>
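
Alternatively, the JDBC URL and user name can be passed on the command line when launching Beeline, which skips the interactive !connect step (the hive user name matches the one used above):

$HIVE_HOME/bin/beeline -u jdbc:hive2://localhost:10000 -n hive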

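Once connected, you can confirm that a table was created with the expected storage format; DESCRIBE FORMATTED prints, among other details, the SerDe and input/output format classes:

-- check the storage details of the Parquet table created earlier
USE hql;
SHOW TABLES;
DESCRIBE FORMATTED customer_parquet;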