To find the number of rows/records in a Hive table, we can use the Spark SQL count aggregate function (see Hive SQL - Aggregate Functions Overview with Examples). This article provides an example Scala code snippet to implement it. spark-shell is used directly for simplicity; the snippet can also run in Jupyter Notebook or Zeppelin with a Spark kernel. Alternatively, it can be compiled into a JAR file and submitted as a job via spark-submit.
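The sketch below illustrates the approach under the assumptions described above: the database and table names (test_db.test_table) are placeholders, and the SparkSession is built explicitly so the same code can be pasted into spark-shell (where the session already exists as spark) or packaged into a JAR for spark-submit.

```scala
import org.apache.spark.sql.SparkSession

object CountHiveTableRows {
  def main(args: Array[String]): Unit = {
    // In spark-shell the session already exists as `spark`; it is created
    // explicitly here so the snippet can also be compiled and run via spark-submit.
    val spark = SparkSession.builder()
      .appName("Count Hive table rows")
      .enableHiveSupport()
      .getOrCreate()

    // `test_db.test_table` is a placeholder; replace it with your own Hive table.
    val countDf = spark.sql("SELECT COUNT(*) AS row_count FROM test_db.test_table")
    countDf.show()

    // Equivalent DataFrame API call, which returns the count as a Long.
    val rowCount: Long = spark.table("test_db.test_table").count()
    println(s"Row count: $rowCount")

    spark.stop()
  }
}
```

Both calls produce the same result; the SQL form prints a one-row DataFrame, while the DataFrame API's count() returns the value directly for use in further Scala logic.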