java lite-log hive

Connect to Hive via HiveServer2 JDBC Driver

9   0   about 4 days ago

This post shows you how to connect to HiveServer2 via the Hive JDBC driver in Java. The way to connect to HiveServer1 is very similar, though the driver names are different: Version Drive...

hadoop hive

Apache Hive 3.0.0 Installation on Windows 10 Step by Step Guide

105   0   about 4 days ago

If you have been following my website, you would know I’ve published a number of articles about installing big data tools/framewo...

lite-log hive

HiveServer2 Cannot Connect to Hive Metastore Resolutions/Workarounds

15   0   about 4 days ago

Since Hive 3.x, a new authentication feature for HiveServer2 clients has been added. When starting the HiveServer2 service (Hive version 3.0.0), you may encounter errors like: ‘HiveServer2 metastore.RetryingMetaStoreClient: RetryingMetaStoreClient trying reconnect as [username]  (auth:S...

sql server hive

Configure a SQL Server Database as Remote Hive Metastore

13   0   about 4 days ago

In one of my previous posts, I showed how to configure Apache Hive 3.0.0 in Windows 10. Apache Hive 3.0.0 Installation on Windows 10 Step by Step Guide ...


Install Big Data Tools (Spark, Zeppelin, Hadoop) in Windows for Learning and Practice

1,475   4   about 5 days ago

Are you a Windows/.NET developer who wants to learn big data concepts and tools on Windows? If yes, you can follow the links below to install them on your PC. The installations are usually easier to do in Linux/UNIX but they are not difficult to implement in Windows either since the...

core gulp

Migrate from Bower to Gulp for Client Libraries Management in ASP.NET Core

4   0   about 5 days ago

Background If you have been working on ASP.NET projects in the past few years, you have probably heard of or used quite a few client library management frameworks/tools, for example Bower, npm, Gulp, Grunt, Webpack, Yarn, Parcel, Libman, etc. Before SPA became popular, the default ASP.NET (or A...

spark pyspark partitioning

Data Partitioning Functions in Spark (PySpark) Deep Dive

23   0   about 13 days ago

In my previous post about Data Partitioning in Spark (PySpark) In-depth Walkthrough , I mentioned how to repartition data frames in Spark using repartition ...

lite-log spark pyspark

Get the Current Spark Context Settings/Configurations

12   0   about 14 days ago

In Spark, there are a number of settings/configurations you can specify including application properties and runtime parameters. Ge...

lite-log spark pyspark hive

Read Data from Hive in Spark 1.x and 2.x

19   0   about 15 days ago

Spark 2.x From Spark 2.0, you can use Spark session builder to enable Hive support directly. The following example (Python) shows how to implement it. from pyspark.sql import SparkSession appName = "PySpark Hive Example" master = "local" # Create Spark session with Hive...

python spark pyspark

Data Partitioning in Spark (PySpark) In-depth Walkthrough

36   0   about 20 days ago

Data partitioning is critical to data processing performance, especially when processing large volumes of data in Spark. Partitions in Spark won’t span across nodes, though one node can contain more than one partition. When processing, Spark assigns one task for each partition and each worker threa...

python lite-log spark pyspark

PySpark - Fix PermissionError: [WinError 5] Access is denied

28   0   about 23 days ago

When running the pyspark or spark-submit command in Windows to execute Python scripts, you may encounter the following error: PermissionError: [WinError 5] Access is denied As the message suggests, permissions are not set up correctly. To resolve this issue y...

python spark pyspark hive

Spark - Save DataFrame to Hive Table

31   0   about 23 days ago

From Spark 2.0, you can easily read data from a Hive data warehouse and also write/append new data to Hive tables. This page shows how to work with Hive in Spark, including: Create DataFrame from existing Hive table; Save DataFrame to a new Hive table; Append data ...

lite-log hadoop hdfs

Copy Files from Hadoop HDFS to Local

17   0   about 23 days ago

Copy file from HDFS to local Use the following command: hadoop fs [-copyToLocal [-f] [-p] [-ignoreCrc] [-crc] <src> ... <localdst>] For example, copy a file from /hdfs-file.txt in HDFS to local /tmp/ using the following command: ...

lite-log hadoop

Hadoop on Windows - UNHEALTHY Data Nodes Fix

13   0   about 24 days ago

Solution to fix the issue If you have been running Hadoop on Windows machines, you may encounter issues with unhealthy data nodes. Usually this happens if there is not enough disk space on your local drive. For example, if I start the HDFS and YARN daemons under the context...

lite-log hadoop hdfs

Hadoop datanode issue and resolution - ‘Incompatible clusterIDs’

212   0   about 24 days ago

Issue After finishing the installation of Hadoop 3.0.0 on my Windows machine: Install Hadoop 3.0.0 in Windows (Single Node) , I got the following error after I formatted the name node several ti...

sql server python spark pyspark

Connect to SQL Server in Spark (PySpark)

44   0   about 27 days ago

Spark is an analytics engine for big data processing. There are various ways to connect to a database in Spark. This page summarizes some common approaches to connecting to SQL Server using Python as the programming language. ...


HTTPS is now enabled in this website

20   0   about 28 days ago

To provide better security, Kontext is now fully HTTPS enabled. You can view the SSL certificate in your browser: ...

core 2 core dotnetcore open-banking

ASP.NET Core 2.2 Implementation for Consumer Data Standards

121   0   about 2 months ago

I’ve just started an ASP.NET Core 2.2 based implementation of the Australian Consumer Data Standards (published by Data 61). The Open Banking initiative will follow these standards. The purpose is to help you get familiar with these standards, especially the APIs that need to be implemented. ...


Querying Teradata and SQL Server - Tutorial 1: The SELECT Statement

33,977   7   about 4 years ago

SELECT is one of the most commonly used statements. In this tutorial, I will cover the following items: Two of the principal query clauses—FROM and SELECT Data Types Built-in functions CASE expressions and variations like ISNULL and COALESCE. * The functio...


Install Teradata Express by Using VMware Player 6.0 in Windows

14,021   23   about 5 years ago

In this article, I am going to introduce how to install Teradata Express in virtual machines in Windows. Download software 1) Download VMware Player for Windows 32-bit and 64-bit from the following link (version 6.0): ...

hadoop yarn hdfs

Install Hadoop 3.0.0 in Windows (Single Node)

13,996   15   about 2 years ago

This page summarizes the steps to install Hadoop 3.0.0 in your Windows environment. Reference page: ...


Working with SQL Server Compact 4.0 using Entity Framework 6 and ADO.NET

11,957   0   about 5 years ago

SQL Server Compact 4.0 (CE 4.0) is a free SQL Server embedded database ideal for building standalone and occasionally connected applications for mobile devices, desktops, Web clients and others. In one of my projects, I used it as the database for logging errors, which assumes the errors will onl...


Create ETL Project with Teradata through SSIS

10,514   2   about 4 years ago

Infosphere DataStage is adopted as ETL (Extract, Transform, Load) tool in many Teradata based data warehousing projects. With the Teradata ODBC and .NET data providers, you can also use the BI tools from Microsoft, i.e. SSIS. In my previous post, I demonstrated how to install Teradata Tool...

core 2

Server.MapPath Equivalent in ASP.NET Core 2

10,418   0   about 2 years ago

In traditional ASP.NET applications, Server.MapPath is commonly used to generate an absolute path on the web server. However, this has been removed from ASP.NET Core. So what is the equivalent way of doing it?


Generate Formatted Excel Destination (Output) in SSIS Data Flow Task

10,241   0   about 5 years ago

SSIS (SQL Server Integration Services) provides a number of convenient tasks to enable data integration. Exporting data from a database to an Excel file is a common task in ETL (Extract, Transform, Load) projects. Users/customers often raise formatting requests regarding the Excel extract. To g...

spark scala parquet

Write and Read Parquet Files in Spark/Scala

8,042   2   about 2 years ago

On this page, I’m going to demonstrate how to write and read parquet files in Spark/Scala by using the Spark SQLContext class. Reference What is parquet format? Go to the following project site to understand more about parquet. ...

dotnet core angular core 2

Issue - Unable to get property 'apply' of undefined or null reference occurred in Angular 4.*, VS2017 15.3, ASP.NET Core 2.0

7,784   10   about 2 years ago

Issue Context After installing Visual Studio 2017 15.3 preview and the .NET Core 2.0 preview SDK, I upgraded one of my existing ASP.NET Core projects to 2.0. The project was created using the ‘dotnet new angular’ SPA template.  I also upgraded all the client app packages to the latest. For exa...

java kerberos

Java Kerberos Authentication Configuration Sample & SQL Server Connection Practice

7,389   2   about 3 years ago

Overview Recently, I have been working on an ETL framework to load various source data (i.e. files, SQL Server, Oracle and Teradata) into Teradata. Due to some limitations, Java was chosen as the implementation language though IBM Infosphere DataStage is available to use. DataStage has p...

core identity core 2

Retrieve Identity username, email and other information in ASP.NET Core

7,251   0   about 2 years ago

The identity system in ASP.NET has evolved over time. If you are using ASP.NET Core, you have probably found that the User property is an instance of ClaimsPrincipal in controllers or Razor views. Thus, to retrieve the information, you need to use the claims.


Connect to Teradata Virtual Machine Guest from Windows Host

6,920   16   about 4 years ago

In my previous posts about Querying Teradata and SQL Server, I logged into the virtual machine's graphical interface to manage the database. However, I often found it resource intensive as there is only 4GB of memory in my laptop. Instead, I will use text mode to start the virtual machine and co...


[C#] Connect to Teradata Database via .NET Data Provider

5,333   2   about 4 years ago

In this post, I will demonstrate how to connect to Teradata database via .NET Data Provider for Teradata using C#. Prerequisites Install the .NET Data Provider for Teradata from the following link: ...

teradata python

Connect to Teradata database through Python

5,151   3   about 2 years ago

Teradata published an official Python module which can be used in DevOps projects. More details can be found at the following GitHub site: Install Teradata module ...


Create and Debug C/C++ Programs with Eclipse and Cygwin in Windows

4,677   0   about 4 years ago

In this post, I am going to demonstrate how to use Eclipse to create and debug C/C++ programs for Unix/Linux in Windows. I am going to use Cygwin GCC as the toolchain. Cygwin GDB will also be installed for debugging purposes. I am using Windows 10 and JRE 1.8 in the following steps. Install E...


Resolve the Issues in Upgrading Entity Framework to Version 6.1

4,575   0   about 5 years ago

When upgrading your Entity Framework to Entity Framework 6.1 (EF6) from version 5.0, you may meet a number of issues. I have summarized all the issues I’ve encountered and their resolutions for your reference. Upgrade to EF6 Microsoft has provided one summary about upgrading to E...

lite-log spark hdfs scala parquet

Write and Read Parquet Files in HDFS through Spark/Scala

4,478   0   about 2 years ago

In my previous post, I demonstrated how to write and read parquet files in Spark/Scala. The parquet file destination is a local folder. Write and Read Parquet Files in Spark/Scala In this page...

lite-log scala

Convert String to Date in Spark (Scala)

4,326   0   about 2 years ago

Context This page demonstrates how to convert a string to java.util.Date in Spark via Scala. Prerequisites If you have not installed Spark, follow the page below to install it: ...


about 6 days ago

Thanks for posting this. Great article! I struggled a lot to get Hadoop installed on my machine. Finally, I was able to install it with the steps given in the post below.

I had Java pre-installed, and the problem I was facing was with the path of Java! It had spaces in it. Anyways, thanks for posting this. Happy learning!

about 29 days ago

Yes, you can load your text file into HDFS via the CLI, the WebHDFS API, or any other tool or programming language that supports this. You can then do transformations using tools like Apache Beam, Spark or notebooks (Zeppelin or Jupyter), etc. These tools can then also write into a SQL Server database through ODBC/JDBC or native SQL Server drivers.
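As a rough illustration of the WebHDFS option mentioned above, the sketch below only builds the URL for a WebHDFS `op=CREATE` request; it does not send anything. The function name `webhdfs_create_url`, the host, and the file path are hypothetical, and the default port 9870 is an assumption based on the Hadoop 3 NameNode HTTP default (Hadoop 2 used 50070). In a real upload the URL is sent with HTTP PUT, the NameNode answers with a redirect to a DataNode, and the file content is then sent to that second URL.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def webhdfs_create_url(host: str, hdfs_path: str, port: int = 9870,
                       user: Optional[str] = None,
                       overwrite: bool = False) -> str:
    """Build the URL for a WebHDFS op=CREATE request (to be sent with HTTP PUT).

    Assumptions: plain HTTP, NameNode web port 9870 (Hadoop 3 default),
    and simple (non-Kerberos) authentication via the user.name parameter.
    """
    if not hdfs_path.startswith("/"):
        raise ValueError("hdfs_path must be absolute, e.g. /data/input.txt")
    params = {"op": "CREATE", "overwrite": str(overwrite).lower()}
    if user:
        params["user.name"] = user
    # quote() percent-encodes spaces and other unsafe characters in the path
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and path, for illustration only
    print(webhdfs_create_url("localhost", "/data/input.txt", user="hadoop"))
```

The CLI route (`hdfs dfs -put`) is usually simpler for a one-off load; the REST route tends to be useful when the machine holding the text file has no Hadoop client installed.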

about 29 days ago

Can we load data from a text file to HDFS, do calculations/corrections, and load the data into a SQL Server DB table?

about 2 months ago

Hi, is your issue resolved?  If not, please paste your detailed error messages here. 

about 2 months ago


Did you set up the ODBC data source successfully?

If you are encountering connection issues from your windows host to the VM, you can check my following post to ensure the connection is well established:

Connect to Teradata Virtual Machine Guest from Windows Host

about 2 months ago

Thanks a lot for this tutorial. But I have a problem establishing the connection from Python (installed on Windows) to Teradata (installed in VMware on the same computer).

What should I do?

about 11 months ago

I can get it to work by using the following approach:

1) Create an IIS website (http://localhost/Test/) with one page index.html:

        <h1>Test iframe</h1>
        <iframe src="http://localhost:8080/#/notebook/2D7J63CN7" width="600px" height="400px" style="border:1px solid #000000"></iframe>

2) And then open the website in the browser: http://localhost/Test/

If you open with a file URL, then the content cannot be displayed due to security reasons:


When you publish your website, your Zeppelin site should also be deployed into a server that your user can access.

about 11 months ago

Hi Raymond Tang. I'm back because I tried but didn't succeed in embedding a Zeppelin notebook as an iframe in my website. I have something like this in my website:

        <div id="interactivForm">
            <iframe id="MyInterpreter" src="http://localhost:8085/#/notebook/2DHDGTVNU"></iframe>
        </div>

but it doesn't work. However, I can access http://localhost:8085/#/notebook/2DHDGTVNU without a problem. Do you know how to do that?
Thank you.

about 11 months ago

You can use the <iframe> HTML element to embed Zeppelin into your website.

This also means that your Zeppelin website (*:8080 by default) needs to be exposed to all your users (i.e. their networks).

about 11 months ago

Hi Raymond. For now I don't really want to do any authentication. I only want to give anonymous users the opportunity to execute Spark in my website using Zeppelin. Do you know how to do that, or do you have a tutorial where I can see how to do it step by step? Thank you very much.

about 11 months ago

You can embed Zeppelin into a website.

However, you need to decide how to pass user credentials from your website through to the Zeppelin website, depending on the authentication type.

For the authentication part, you can refer to the following website: 

Based on my current understanding, I don't think you can currently implement automatic logon directly without extending Zeppelin (but I might be wrong). 

about 11 months ago

Hi everyone,

Do you know if it is possible to embed a Zeppelin notebook in a webpage, as an iframe or by another method, so that users can come and execute their own code? Do you think that is possible? Does anyone have an idea how to do that? Thank you!

about 11 months ago

Nw, I'm glad it worked. :)

about 11 months ago

I am sorry. It works!

about 11 months ago

Sorry, I'm back again. When I run this command %HADOOP_HOME%\bin\hdfs dfs -put file:///G:/DataAnalytics/test.txt / I get an error: put: `G:/DataAnalytics/test.txt': No such file or directory. But I followed the configuration step by step and my DataAnalytics folder is in G:.

about 11 months ago

Hi. Thank you for your answer. I found a solution to my problem. In case it helps someone, the problem was related to the syntax of my system username: it contains a space. To fix it, edit /etc/hadoop/hadoop-env.cmd. At the end of this file you will find set HADOOP_IDENT_STRING=%USERNAME% ; change this to any string you want, but without a space. For example, with set HADOOP_IDENT_STRING=myuser the problem will be fixed.
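A minimal sketch of the workaround described above, assuming the only requirement is that the value of HADOOP_IDENT_STRING contains no whitespace. The helper names `safe_ident_string` and `env_line` and the fallback value `hadoop` are hypothetical, and replacing whitespace with underscores is just one possible choice; any space-free string should do.

```python
import re


def safe_ident_string(username: str, fallback: str = "hadoop") -> str:
    """Return a whitespace-free value suitable for HADOOP_IDENT_STRING.

    Runs of whitespace are replaced with a single underscore; an empty
    result falls back to a fixed name (an assumption, not a Hadoop rule).
    """
    cleaned = re.sub(r"\s+", "_", username.strip())
    return cleaned or fallback


def env_line(username: str) -> str:
    """Build the line to put at the end of hadoop-env.cmd."""
    return f"set HADOOP_IDENT_STRING={safe_ident_string(username)}"


if __name__ == "__main__":
    # "John Smith" is a made-up Windows username containing a space
    print(env_line("John Smith"))
```

The printed line would replace `set HADOOP_IDENT_STRING=%USERNAME%` in hadoop-env.cmd.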

about 11 months ago

Did you follow all the steps in this post? For example, you need to ensure the winutils tool is installed: 

Overwrite your bin folder (%HADOOP_HOME%\bin) with the files from this link:

The currently available version is 3.0.0. I am not sure whether it fixes the issue for 3.0.1, but it is worth a try.

about 11 months ago

Good morning, I am trying to install Hadoop 3.0.1 on my Windows machine, but when I test my configuration it gives me this error: Error: Can not find or load the main class. Can someone help me please?

Thank you