Kontext Column

Created for everyone to publish data, programming and cloud related articles.
Follow three steps to create your columns.


Tags: linux, WSL, ubuntu, big-data-on-wsl
Views: 9601 | Likes: 4 | 2 years ago

This page shows how to manually install Windows Subsystem for Linux (WSL) on a non-system drive. Open PowerShell as Administrator and run the following command to enable the WSL feature: Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Windows-Subsystem-Linux Run the following ...

Install Hadoop 3.2.1 on Windows 10 Step by Step Guide

Tags: windows10, hadoop, yarn, big-data-on-windows-10
Views: 13392 | Likes: 13 | 9 months ago

This detailed step-by-step guide shows you how to install the latest Hadoop (v3.2.1) on Windows 10. It also provides a temporary fix for bug HDFS-14084 (java.lang.UnsupportedOperationException INFO).

Tags: python, spark, pyspark, spark-advanced
Views: 32425 | Likes: 9 | 2 years ago

Data partitioning is critical to data processing performance, especially when processing large volumes of data in Spark. Partitions in Spark won’t span across nodes, though one node can contain more than one partition. When processing, Spark assigns one task for each partition and each worker threads ...
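
As a rough illustration of the one-task-per-partition idea in the teaser above, here is a minimal plain-Python sketch. It does not use Spark: hash_partition, run_task and the two-thread pool are hypothetical stand-ins chosen for illustration, and real Spark partitioners and schedulers differ in detail.

```python
from concurrent.futures import ThreadPoolExecutor


def hash_partition(records, num_partitions):
    """Assign each (key, value) record to a partition by hashing its key."""
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[hash(key) % num_partitions].append((key, value))
    return partitions


def run_task(partition):
    """One task handles exactly one partition; here it just sums the values."""
    return sum(value for _, value in partition)


# Small integer keys hash to themselves in CPython, so the layout is predictable.
records = [(i, i * 10) for i in range(8)]
partitions = hash_partition(records, num_partitions=4)

# Assumption for illustration: two worker threads share the four tasks.
with ThreadPoolExecutor(max_workers=2) as pool:
    partial_sums = list(pool.map(run_task, partitions))

total = sum(partial_sums)
print(partial_sums, total)
```

With more partitions than threads, each thread simply picks up another task when it finishes one, which is why the partition count (not the thread count) determines how many tasks exist in this sketch.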

Tags: sqlite, entity-framework, dotnetcore
Views: 28484 | Likes: 2 | 3 years ago

SQLite is a self-contained, embedded SQL database engine. In .NET Core, Entity Framework Core provides APIs to work with SQLite. This page provides sample code to create a SQLite database using the package Microsoft.EntityFrameworkCore.Sqlite. Create a .NET Core 2.x console application in ...

Tags: python, spark, pyspark, spark-dataframe
Views: 23678 | Likes: 0 | 2 years ago

In Spark, the SparkContext.parallelize function can be used to convert a Python list to an RDD, and the RDD can then be converted to a DataFrame object. The following sample code is based on Spark 2.x. On this page, I am going to show you how to convert the following list to a data frame: data = [('Category A' ...

Tags: hadoop, hive, big-data-on-windows-10
Views: 23078 | Likes: 5 | 2 years ago

In this article, I’m going to demo how to install Hive 3.0.0 on Windows 10. Before installing Apache Hive, please ensure you have Hadoop available in your Windows environment. We cannot run Hive without Hadoop. I recommend installing Hadoop 3.x to work with Hive 3.0.0. There are two ...

Pandas DataFrame Plot - Pie Chart

Tags: plot, pandas, jupyter-notebook, python, pandas-plot
Views: 3177 | Likes: 0 | 6 months ago

This article provides examples of plotting pie charts using the pandas.DataFrame.plot function. The data I'm going to use is the same as in the other article, Pandas DataFrame Plot - Bar Chart. I'm also using Jupyter Notebook to plot them. The DataFrame has 9 records: DATE TYPE ...

Tags: python, spark, pyspark, hive, spark-database-connect
Views: 22219 | Likes: 4 | 2 years ago

From Spark 2.0, you can easily read data from a Hive data warehouse and also write/append new data to Hive tables. This page shows how to work with Hive in Spark, including: creating a DataFrame from an existing Hive table; saving a DataFrame to a new Hive table; appending data to an existing Hive table via ...

Install Hadoop 3.3.0 on Windows 10 Step by Step Guide

Tags: windows10, hadoop, yarn, hdfs, big-data-on-windows-10
Views: 1858 | Likes: 0 | 2 months ago

This detailed step-by-step guide shows you how to install the latest Hadoop v3.3.0 on Windows 10. It leverages the Hadoop 3.3.0 winutils tool, and WSL is not required. This version was released on July 14, 2020. It is the first release of the Apache Hadoop 3.3 line. There are significant changes compared with Hadoop 3.2.0, such as Java 11 runtime support, protobuf upgrade to 3.7.1, scheduling of opportunistic containers, non-volatile SCM support in HDFS cache directives, etc.

Install Hadoop 3.0.0 on Windows (Single Node)

Tags: hadoop, yarn, hdfs, big-data-on-windows-10
Views: 35745 | Likes: 3 | 3 years ago

This page summarizes the steps to install Hadoop 3.0.0 in your Windows environment. Reference pages: https://wiki.apache.org/hadoop/Hadoop2OnWindows and https://hadoop.apache.org/docs/r1.2.1/cluster_setup.html Note: A newer version of the installation guide, for the latest Hadoop 3.2.1, is available. I ...

Tags: pyspark, spark, spark-2-x, spark-file-operations
Views: 11005 | Likes: 0 | 10 months ago

Spark provides rich APIs to save data frames in many different file formats such as CSV, Parquet, ORC, Avro, etc. CSV is commonly used in data applications, though nowadays binary formats are gaining momentum. In this article, I am going to show you how to save a Spark data frame as a CSV file in ...

Tags: hadoop, linux, WSL, big-data-on-wsl
Views: 17509 | Likes: 9 | 2 years ago

In my previous post, I showed how to configure a single node Hadoop instance on Windows 10. The steps are not too difficult to follow if you have a Java programming background. However, there is one step that is not very straightforward: the native Hadoop executable (winutils.exe) is not included in the ...

Tags: pyspark, spark-2-x, teradata, SQL Server, spark-database-connect
Views: 4670 | Likes: 0 | 7 months ago

In my previous article, Connect to SQL Server in Spark (PySpark), I mentioned ways to read data from SQL Server databases as a dataframe using JDBC. We can also use JDBC to write data from a Spark dataframe to database tables. In the following sections, I'm going to show you how to ...

Install Hadoop 3.3.0 on Windows 10 using WSL

Tags: linux, hadoop, WSL, big-data-on-wsl
Views: 1257 | Likes: 3 | 2 months ago

Hadoop 3.3.0 was released on July 14, 2020. It is the first release of the Apache Hadoop 3.3 line. There are significant changes compared with Hadoop 3.2.0, such as Java 11 runtime support, protobuf upgrade to 3.7.1, scheduling of opportunistic containers, non-volatile SCM support in HDFS ...

Tags: spark, scala, parquet, spark-file-operations
Views: 20947 | Likes: 0 | 3 years ago

On this page, I’m going to demonstrate how to write and read Parquet files in Spark/Scala by using the Spark SQLContext class. Go to the following project site to learn more about Parquet: https://parquet.apache.org/ If you have not installed Spark, follow this page to set it up: Install Big Data ...

Get Started on Reunified .NET 5

Tags: .NET
Views: 6 | Likes: 0 | 2 days ago

In May 2019, Microsoft announced the roadmap for .NET at the Build conference. .NET 5 is the update that unifies divergent frameworks, reduces code complexity and supports cross-platform reach including desktop, Web, mobile, cloud and device platforms. On 13th September 2020, Microsoft announced .NET ...

C# 9.0 New Features

Tags: C#, .NET
Views: 26 | Likes: 0 | 2 days ago

.NET 5.0 release candidate 1 (rc.1) was published on 2020-09-14, which marks another big step towards the official .NET 5.0 release. As part of 5.0, C# 9.0 will be released with a bunch of new features. This article summarizes some of the new features with examples. Download .NET 5.0 SDK from this ...

Introduction to C# Interactive

Tags: C#, .NET
Views: 7 | Likes: 0 | 3 days ago

Python, R and many other scripting languages generally support interactive programming features in their IDEs. When C# was initially created, all C# programs needed to be compiled into MSIL before they could run in .NET runtime environments (unless the code was dynamically compiled). ...

Tags: asp.net core, asp.net core 3, C#
Views: 4 | Likes: 0 | 3 days ago

This page summarizes how to retrieve client and server IP addresses in ASP.NET Core applications. The client IP address can be retrieved via the HttpContext.Connection object. This property exists in both the Razor page model and the ASP.NET MVC controller. Property RemoteIpAddress ...

Statistics with R (Part II)

Tags: r-lang
Views: 5 | Likes: 0 | 3 days ago

In the article Statistics with R (Part I), we walked through basic statistics calculations using R and also regression models, including linear regression, multiple regression, logistic regression and Poisson regression. In this part, we will continue to explore more complicated analysis including ...

Statistics with R (Part I)

Tags: r-lang
Views: 7 | Likes: 0 | 3 days ago

Until now, we've gone through R programming basics, data types, packages and IDEs, data APIs to work with data sources and various plotting functions. Let's now dive into the most important part: statistics and modelling with R. After all, R was created for statistics. Warning: Due ...

Plotting with R (Part II)

Tags: plot, r-lang
Views: 8 | Likes: 0 | 3 days ago

In Plotting with R (Part I), I summarized the functions that can be used in R plotting. In this part, we continue the journey and plot richer and more complex charts such as pie charts, bar charts, box plots, histograms, line charts and scatterplots using those functions. A pie chart can be drawn using ...

Plotting with R (Part I)

Tags: plot, r-lang
Views: 6 | Likes: 0 | 3 days ago

For data analysts, it is critical to use charts to tell data stories clearly. R has numerous libraries to create charts and graphs. This article summarizes the high-level R plotting APIs (including graphical parameters) and provides examples of plotting Pie Chart, Bar ...

Tags: r-lang
Views: 8 | Likes: 0 | 3 days ago

R provides rich APIs to interact with source data such as databases and files (CSV, XML, JSON, etc.). With SparklyR, R can also be used to interact with big data platforms like Hadoop. This article shows examples of using R to load data from relational databases and text files. The ...

Tags: r-lang
Views: 4 | Likes: 0 | 3 days ago

In many scenarios, we need to generate data directly in memory. This article provides examples of generating regular and random sequences with R. It also shows you how to reshape or restructure data. In the preceding articles, we already used quite a few functions to generate regular ...

Tags: r-lang
Views: 5 | Likes: 0 | 3 days ago

In this series, we've walked through R programming basics and advanced data types. This article will focus on R packages and IDEs so that you can program efficiently with R. Let's recap these commonly mentioned R terms: Package: An extension of the R base system with code, data and ...

Tags: r-lang
Views: 5 | Likes: 0 | 3 days ago

R implements a number of useful data types to support complex analytics and calculations. This article focuses on String, Vector, List, Matrix, Array, Factor and Data Frame. It also shows examples of expanding data frames, for example, adding or dropping columns for data frames, adding rows for data ...

Tags: r-lang
Views: 8 | Likes: 0 | 4 days ago

This article provides a basic introduction to programming with R, including atomic vectors, variables, operations, branching, loops and functions. Note: All examples can run in RStudio or R Tools for Visual Studio on Windows. For more about these two IDEs, refer to R Introduction. We always start ...

Tags: C#
Views: 5 | Likes: 0 | 4 days ago

C# regular expressions can be used to match and replace certain text patterns in a string variable. The following regular expression can be used to remove all heading tags, including h1 to h9, from an HTML text string: <[hH][1-9][^>]*>[^<]*</[hH][1-9]\s*> var html = "Your HTML ...
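
The article's own sample is C# and is truncated above. As a hedged sketch, the same pattern can be tried in Python, whose re module accepts this expression unchanged. The function name strip_headings and the sample string are made up for illustration and are not taken from the article.

```python
import re

# Pattern copied from the teaser; compiled once for reuse.
HEADING_PATTERN = re.compile(r"<[hH][1-9][^>]*>[^<]*</[hH][1-9]\s*>")


def strip_headings(html: str) -> str:
    """Remove heading elements whose content contains no nested tags."""
    return HEADING_PATTERN.sub("", html)


sample = "<h1>Title</h1><p>Body text</p><H2 class='sub'>Sub</H2>"
cleaned = strip_headings(sample)
print(cleaned)
```

Two limits of the pattern are worth noting: it does not check that the opening and closing heading levels match, and because [^<]* stops at the first "<", a heading that contains nested markup (for example a bold tag) is left untouched.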

Tags: .NET, C#, dotnetcore, Linq
Views: 12 | Likes: 0 | 4 days ago

Language-Integrated Query (LINQ) is a set of technologies based on the integration of query capabilities directly into the C# or VB language in .NET. It allows intuitive queries against SQL databases, XML, object lists, etc. This article shows how to return the top N records randomly. The ...
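
The teaser is cut off before the LINQ query appears, so the article's actual approach is not reproduced here. As an analogous sketch in Python (not LINQ), picking N records at random without replacement might look like the following. The function pick_random_records and the sample data are hypothetical.

```python
import random


def pick_random_records(records, n, seed=None):
    """Return up to n distinct records chosen uniformly at random."""
    rng = random.Random(seed)  # passing a seed makes the selection reproducible
    return rng.sample(records, min(n, len(records)))


people = ["Ann", "Bob", "Cid", "Dee", "Eve"]
chosen = pick_random_records(people, 3, seed=42)
print(chosen)
```

Capping n at the list length avoids the ValueError that random.sample raises when asked for more items than exist.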

All columns

Programming with R language - tutorials about R. 

Digitalkora is proud to announce that it currently offers top-rated, career-oriented Digital Marketing Courses in Bangalore under the supervision of digital marketing experts. Our course content is carefully planned around industry requirements. We concentrate on giving practical ...

Streaming analytics related tutorials and ideas.

I came across an issue while running a Sqoop import to a partitioned table and found a workaround for it, so I'm sharing my two cents. Let’s begin. Create the Hive partitioned table and import data at the same time: sqoop import --create-hive-table \ --connect ...

Code snippets for various programming languages/frameworks.

AspNetCore.XmlRpc - an XML Remote Procedure Call library for ASP.NET Core.

Data analytics with Google Cloud Platform.