
Analytics & BI

Data Analytics, Big Data, Data Storage and Business Intelligence.


python spark pyspark

Implement SCD Type 2 Full Merge via Spark Data Frames

72 views   0 comments last modified about 19 days ago

Overview: SQL developers who are familiar with SCD and merge statements may wonder how to implement the same in big data platforms, considering that databases and storage in Hadoop are not designed or optimised for record-level updates and inserts. In this post, I’m going to demons...
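To make the idea concrete, here is a minimal plain-Python sketch of SCD Type 2 full-merge logic. It is not the Spark data frame code from the post. The column names (`start_date`, `end_date`, `is_current`), the `HIGH_DATE` marker and both function names are assumptions for illustration, and "full merge" is assumed to mean that the source is a complete extract, so keys missing from it are treated as deleted. Because the storage does not support record-level updates, the sketch returns a complete new copy of the target instead of modifying rows in place.

```python
from datetime import date

# Assumed convention: an open-ended record carries this far-future end date.
HIGH_DATE = date(9999, 12, 31)


def new_version(incoming, key, attrs, load_date):
    """Build a current (open-ended) version row from a source record."""
    row = {key: incoming[key]}
    row.update({a: incoming[a] for a in attrs})
    row.update(start_date=load_date, end_date=HIGH_DATE, is_current=True)
    return row


def scd2_full_merge(target, source, key, attrs, load_date):
    """Return a rebuilt target list after merging a full source extract."""
    source_by_key = {row[key]: row for row in source}
    merged, current_keys = [], set()
    for row in target:
        if not row["is_current"]:
            merged.append(dict(row))  # history is carried over untouched
            continue
        current_keys.add(row[key])
        incoming = source_by_key.get(row[key])
        changed = incoming is not None and any(
            row[a] != incoming[a] for a in attrs
        )
        if incoming is None or changed:
            # Expire the current version: deleted or changed in the source.
            merged.append({**row, "end_date": load_date, "is_current": False})
            if changed:
                merged.append(new_version(incoming, key, attrs, load_date))
        else:
            merged.append(dict(row))  # unchanged record stays current
    # Keys with no current version in the target are inserted as new rows.
    for k, incoming in source_by_key.items():
        if k not in current_keys:
            merged.append(new_version(incoming, key, attrs, load_date))
    return merged
```

Calling `scd2_full_merge(target, source, "id", ["name"], load_date)` expires changed and deleted records, appends a new current version for each changed or new key, and leaves unchanged and historical rows as they are.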

lite-log hadoop sqoop

Password Security Solution for Sqoop

21 views   0 comments last modified about 2 months ago

In Sqoop, there are multiple approaches for passing in RDBMS passwords.

Options

Option 1 - clear-text password through the --password argument

    sqoop [subcommand] --username user --password pwd

This is the weakest approach as the password is exposed directly...

python spark

PySpark: Convert JSON String Column to Array of Object (StructType) in Data Frame

145 views   0 comments last modified about 2 months ago

This post shows how to derive a new column in a Spark data frame from a JSON array string column. I am running the code in Spark 2.2.1, though it is compatible with Spark 1.6.0 (with fewer JSON SQL functions). Prerequisites: refer to the following post to install Spark in Windows. ...
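As a rough illustration of the derivation, here is a plain-Python sketch that parses a JSON array string held in one field of each row into a list of objects stored under a new field. It is not the PySpark code from the post: the function name and the column names `attrs` and `attrs_parsed` are hypothetical, and rows are modelled as plain dictionaries. In PySpark itself, this kind of derivation is typically done with `pyspark.sql.functions.from_json` and an explicit schema.

```python
import json


def parse_json_array_column(rows, column, new_column):
    """Return copies of rows with `new_column` holding the parsed JSON array."""
    result = []
    for row in rows:
        raw = row.get(column)
        # Treat a missing or empty string as an empty array (an assumption).
        parsed = json.loads(raw) if raw else []
        result.append({**row, new_column: parsed})
    return result


rows = [
    {"id": 1, "attrs": '[{"key": "a", "value": 1}, {"key": "b", "value": 2}]'},
    {"id": 2, "attrs": ""},
]
parsed_rows = parse_json_array_column(rows, "attrs", "attrs_parsed")
```

The original string column is kept alongside the parsed one, so the derivation can be checked against its input.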

java bigquery gcp dataflow gcs

Load CSV File from Google Cloud Storage to BigQuery Using Dataflow

1463 views   0 comments last modified about 7 months ago

This page documents the detailed steps to load a CSV file from GCS into BigQuery using Dataflow, as a demo of creating a simple data flow with Dataflow Tools for Eclipse. However, it doesn’t necessarily mean this is the right use case for Dataflow. Alternatively ...

azure power-bi

Advanced analytics on big data with Azure - Tutorial

418 views   0 comments last modified about 7 months ago

Microsoft Azure provides a number of data analytics products and services. Users can tailor solutions to meet different requirements, for example, architecture for a modern data warehouse, advanced analytics on big data or real-time analytics. The following diagram sho...

power-bi bigquery

Use Google Cloud BigQuery as Data Source in Power BI

912 views   0 comments last modified about 8 months ago

BigQuery is Google’s serverless data warehouse in Google Cloud. Power BI can consume data from various sources including RDBMS, NoSQL, Cloud, Services, etc. It is also easy to get data from BigQuery in Power BI. In this article, I am going to demonstrate how to connect to BigQuery to create...

zeppelin spark

Install Zeppelin 0.7.3 in Windows

2245 views   6 comments last modified about 12 months ago

This post summarizes the steps to install Zeppelin 0.7.3 in a Windows environment.

Tools and Environment

  • Git Bash
  • Command Prompt
  • Windows 10

Download Binary Package

Download the latest binary package from the following website: ...

hadoop yarn hdfs

Install Hadoop 3.0.0 in Windows (Single Node)

11138 views   14 comments last modified about 13 months ago

This page summarizes the steps to install Hadoop 3.0.0 in your Windows environment. Reference page: https://wiki.apache.org/hadoop/Hadoop2OnWindows ...

lite-log power-bi

Data Analysis Expressions to Create Static Tables in Power BI

132 views   0 comments last modified about 9 months ago

DATATABLE

    StaticTable1 = DATATABLE("IntCol",INTEGER,"StringCol",STRING,{{1,"User1"},{2,"User2"}})

The above expression generates a table with two columns, IntCol and StringCol: ...

power-bi google-analytics

Power Analytics with Power BI and Google Analytics

419 views   0 comments last modified about 9 months ago

Power BI is my favourite BI and visualization tool as it is very simple yet powerful. It supports not only traditional data sources like databases, CSV, JSON and XML, but also emerging sources that are available in HDFS, Spark, R, Salesforce, Google Analytics and cloud platforms...


Contacts

  • enquiry[at]kontext.tech
