Share. Ask. Meet.

Join Our Community for Cloud, Data and AI Professionals.

Featured columns/forums

Articles about Apache Hadoop installation, performance tuning and general tutorials.

Apache Spark installation guides, performance tuning tips, general tutorials, etc.

Tutorials and information about Teradata.

Code snippets for various programming languages/frameworks.

Post any questions about the Kontext website here, including feature suggestions, bug reports, and other feedback.

ML.NET is an open source and cross-platform machine learning framework. With ML.NET, you can create custom ML models using C# or F# without having to leave the .NET ecosystem. This column publishes articles about ML.NET.

Featured posts

Compile and Build Hadoop 3.2.1 on Windows 10 Guide

Tags: windows, hadoop

Views: 55 | Comments: 0 | Likes: 1 | Posted: 1 day ago

This article provides detailed steps about how to compile and build Hadoop (incl. native libs) on Windows 10. The following guide is based on Hadoop release 3.2.1. ...

Tags: hadoop, yarn, hdfs

Views: 26557 | Comments: 30 | Likes: 2 | Posted: 2 years ago

This page summarizes the steps to install Hadoop 3.0.0 on your Windows environment. Reference page: https://wiki.apache.org/hadoop/Hadoop2OnWindows ...

Tags: .net core, entity-framework

Views: 13994 | Comments: 4 | Likes: 0 | Posted: 2 years ago

SQLite is a self-contained and embedded SQL database engine. In .NET Core, Entity Framework Core provides APIs to work with SQLite. This page provides sample code to create a SQLite database using the Microsoft.EntityFrameworkCore.Sqlite package. Create sample project ...

Tags: python, spark, pyspark

Views: 5462 | Comments: 0 | Likes: 0 | Posted: 7 months ago

In Spark, the SparkContext.parallelize function can be used to convert a Python list to an RDD, and the RDD can then be converted to a DataFrame object. The following sample code is based on Spark 2.x. On this page, I am going to show you how to convert the following list to a data frame: data = [(...

Latest Hadoop 3.2.1 Installation on Windows 10 Step by Step Guide

Tags: windows, hadoop, yarn

Views: 32 | Comments: 0 | Likes: 0 | Posted: 2 days ago

This detailed step-by-step guide shows you how to install the latest Hadoop (v3.2.1) on Windows 10. It also provides a temporary fix for bug HDFS-14084 (java.lang.UnsupportedOperationException INFO).

Tags: hadoop, yarn, hdfs

Views: 7714 | Comments: 0 | Likes: 0 | Posted: 2 years ago

This page summarizes the default ports used by Hadoop services. It is useful when configuring network interfaces in a cluster. Hadoop 3.1.0 HDFS The secondary namenode http/https server address and port. ...

Tags: hadoop, hive

Views: 11365 | Comments: 11 | Likes: 1 | Posted: 11 months ago

If you have been following my website, you would know I’ve published a number of articles about installing big data tools/framewo...

Tags: .NET, dotnet core, spark, parquet, hive

Views: 1067 | Comments: 2 | Likes: 0 | Posted: 9 months ago

I’ve been following the Mobius project for a while and have been waiting for this day. .NET for Apache Spark v0.1.0 was just published on 2019-04-25 on GitHub. It provides high performance APIs for programming Apache Spark applications with C# and F#. It is .NET Standard compliant and can run in Wind...

Kontext release v0.6.6

Tags: kontext

Views: 15 | Comments: 0 | Likes: 0 | Posted: 17 days ago

Kontext v0.6.6 is now released with a few changes/enhancements. Changes SEO enhancements Added a number of Facebook and Twitter meta tags to the head section of each page. Robots.txt is updated to make it simple. ...

Latest posts

Compile and Build Hadoop 3.2.1 on Windows 10 Guide

Tags: windows, hadoop

Views: 55 | Comments: 0 | Likes: 1 | Posted: 1 day ago

This article provides detailed steps about how to compile and build Hadoop (incl. native libs) on Windows 10. The following guide is based on Hadoop release 3.2.1. ...

Latest Hadoop 3.2.1 Installation on Windows 10 Step by Step Guide

Tags: windows, hadoop, yarn

Views: 32 | Comments: 0 | Likes: 0 | Posted: 2 days ago

This detailed step-by-step guide shows you how to install the latest Hadoop (v3.2.1) on Windows 10. It also provides a temporary fix for bug HDFS-14084 (java.lang.UnsupportedOperationException INFO).

Kontext release v0.6.7

Tags: kontext

Views: 14 | Comments: 0 | Likes: 0 | Posted: 9 days ago

Kontext v0.6.7 is now released with a few changes/enhancements. Changes The following sections list the new features/changes i...

Kontext release v0.6.6

Tags: kontext

Views: 15 | Comments: 0 | Likes: 0 | Posted: 17 days ago

Kontext v0.6.6 is now released with a few changes/enhancements. Changes SEO enhancements Added a number of Facebook and Twitter meta tags to the head section of each page. Robots.txt is updated to make it simple. ...

Machine Learning with .NET in Jupyter Notebooks

Tags: machine-learning, jupyter-notebook, C#, dotnet core

Views: 106 | Comments: 0 | Likes: 0 | Posted: 18 days ago

In this article, I'm going to show you how to install Jupyter on Windows and then install the .NET kernel for Jupyter notebooks. It also shows a machine learning example using ML.NET. The target audience is .NET developers who want to expand their skills in data engineering and science domain...

Kontext release v0.6.5

Tags: kontext

Views: 13 | Comments: 0 | Likes: 0 | Posted: 20 days ago

Kontext v0.6.5 is now released with a few changes/enhancements. Changes RSS changes RSS subscriptions are now increased to 200 items and the d...

Tags: pyspark, spark-2-x, python

Views: 33 | Comments: 0 | Likes: 0 | Posted: 20 days ago

This article shows you how to convert a Python dictionary list to a Spark DataFrame. The code snippets run on Spark 2.x environments. Input The input data (a dictionary list) looks like the following: data = [{"Category": 'Category A', 'ItemID': 1, 'Amount': 12.40}, ...

Improve PySpark Performance using Pandas UDF with Apache Arrow

Tags: pyspark, spark, spark-2-x, pandas

Views: 123 | Comments: 0 | Likes: 2 | Posted: 23 days ago

Apache Arrow is an in-memory columnar data format that can be used in Spark to efficiently transfer data between JVM and Python processes. This currently is most beneficial to Python users that work with Pandas/NumPy data. In this article, ...

Tags: pyspark, spark-2-x, spark

Views: 11 | Comments: 0 | Likes: 0 | Posted: 26 days ago

This article shows you how to read and write XML files in Spark. Sample XML file Create a sample XML file named test.xml with the following content: <?xml version="1.0"?> <data> <record id="1"> <rid>1</rid> <nam...