Dear Kontext users,
I hope you and your family are well and ready for the festive season. Thank you very much for using Kontext in 2021, and I wish you a happy and safe holiday! Sharing knowledge is Kontext's passion, and we warmly invite you to create columns on Kontext and become one of our writers in 2022. This week's newsletter features articles about Spark, Hyperspace, JMESPath, .NET, data masking, JSON Lines, Hadoop and more.
What?! Yes, you heard that right. You can use Hyperspace to create indexes for your Spark DataFrames. This can potentially accelerate your queries and reduce costs in your data lake solution. Find more details and sample code in this article.
For XML documents, XPath can be used to query data, including nodes (elements) and attributes. Similarly, for JSON documents, we can use JMESPath. Azure CLI, for example, supports JMESPath for querying resource information. This article shows some examples of querying JSON using the JMESPath .NET library.
Check out this diagram to understand the differences between static and dynamic data masking, both of which are commonly used in data projects.
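As a rough illustration of the distinction, here is a minimal Python sketch. It assumes the usual meaning of the two terms: static masking produces a separate, permanently masked copy of the data, while dynamic masking leaves the stored data untouched and masks values at read time depending on who is asking. The function names, the "admin" role and the email masking rule are all hypothetical and are not taken from the diagram.

```python
import copy


def mask_email(email):
    # Hypothetical masking rule: keep the first character and the domain.
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain


def static_mask(rows):
    # Static masking: build a new, permanently masked copy of the data set.
    masked = copy.deepcopy(rows)
    for row in masked:
        row["email"] = mask_email(row["email"])
    return masked


def dynamic_read(rows, role):
    # Dynamic masking: stored rows stay as they are; masking is applied
    # while reading, based on the (assumed) role of the caller.
    if role == "admin":
        return [dict(row) for row in rows]
    return [{**row, "email": mask_email(row["email"])} for row in rows]


source = [{"id": 1, "email": "alice@example.com"}]

masked_copy = static_mask(source)               # e.g. handed to a test environment
analyst_view = dynamic_read(source, "analyst")  # masked at query time
admin_view = dynamic_read(source, "admin")      # sees the original values
```

In this sketch the source rows are never modified. The static approach trades extra storage for a copy that is safe to hand over, while the dynamic approach keeps one data set and decides what to reveal on each read.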
A JSON Lines text file is a newline-delimited JSON document: each line holds one complete JSON value, typically an object. The format is commonly used in data-related products. For example, Spark's JSON reader expects JSON Lines input by default, and BigQuery provides APIs to load JSON Lines files.
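To make the format concrete, the following minimal sketch writes and reads JSON Lines using only Python's standard library. The helper names and sample records are made up for illustration, and an in-memory buffer stands in for a real file.

```python
import io
import json


def write_json_lines(records, stream):
    # One JSON value per line, each terminated by a newline.
    for record in records:
        stream.write(json.dumps(record) + "\n")


def read_json_lines(stream):
    # Parse each non-empty line independently of the others.
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


records = [{"id": 1, "name": "Spark"}, {"id": 2, "name": "Hadoop"}]

buffer = io.StringIO()  # stands in for a file opened in text mode
write_json_lines(records, buffer)

buffer.seek(0)
parsed = list(read_json_lines(buffer))
```

Because every line is a self-contained JSON value, a reader can process the file line by line or split it across workers without parsing the whole document first.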
This Hadoop installation guide for Windows is still very relevant: many Kontext users have recently used it to configure Hadoop on a Windows machine. Follow it if you also want to set one up for practice.