#1726 Re: Hadoop 3.3.0 winutils (5 days ago)
Hello, the compiled native HDFS lib is available here: winutils/hadoop-3.3.0/bin at master · kontext-tech/winutils (github.com)
Liviu
#1724 Re: Hadoop 3.3.0 winutils (8 days ago)

Hi Raymond,

Thanks for sharing, but you don't have HDFS in your compiled code (for 3.3.0).

#1721 Re: Install Python 3.9.1 on WSL (14 days ago)

I'm glad that you find it helpful :)

Alexander
#1719 Re: Install Python 3.9.1 on WSL (24 days ago)

Thank you! Super useful

#1718 Re: Azure DevOps dotnet publish Task Fail for .NET 5 Projects (26 days ago)

Hi Asif,

I eventually upgraded to use .NET 6 for my Azure Functions and there are no issues: Build Azure Functions V4 with .NET 6.

Do you have to use .NET Core 3? If not, I would suggest upgrading to the latest versions.

Asif
#1717 Re: Azure DevOps dotnet publish Task Fail for .NET 5 Projects (26 days ago)
I am also getting the same error. I followed the above-suggested steps, but the error still persists:

For switch syntax, type "MSBuild -help"
##[error]Error: The process '/usr/bin/dotnet' failed with exit code 1
##[warning].NET 5 has some compatibility issues with older Nuget versions(<=5.7), so if you are using an older Nuget version(and not dotnet cli) to restore, then the dotnet cli commands (e.g. dotnet build) which rely on such restored packages might fail. To mitigate such error, you can either: (1) - Use dotnet cli to restore, (2) - Use Nuget version 5.8 to restore, (3) - Use global.json using an older sdk version(<=3) to build
Info: Azure Pipelines hosted agents have been updated and now contain .Net 5.x SDK/Runtime along with the older .Net Core version which are currently lts. Unless you have locked down a SDK version for your project(s), 5.x SDK might be picked up which might have breaking behavior as compared to previous versions. You can learn more about the breaking changes here: https://docs.microsoft.com/en-us/dotnet/core/tools/ and https://docs.microsoft.com/en-us/dotnet/core/compatibility/ . To learn about more such changes and troubleshoot, refer here: https://docs.microsoft.com/en-us/azure/devops/pipelines/tasks/build/dotnet-core-cli?view=azure-devops#troubleshooting
##[error]Dotnet command failed with non-zero exit code on the following projects : /home/vsts/work/1/s/Sempra.ADF.AzureFunctions/Sempra.ADF.AzureFunctions.csproj
Finishing: DotNetCoreCLI
#1716 Re: Introduction to Hive Bucketed Table (26 days ago)

I have not figured out why it is not being used. I would expect behavior similar to Spark bucketed table pruning: Spark Bucketing and Bucket Pruning Explained.

AdaptiveSparkPlan isFinalPlan=false
+- SortMergeJoin [id#0L], [id#1L], Inner
   :- Sort [id#0L ASC NULLS FIRST], false, 0
   :  +- Filter (id#0L IN (100,1000) AND isnotnull(id#0L))
   :     +- FileScan parquet test_db.spark_bucket_table1[id#0L] Batched: true, Bucketed: true, DataFilters: [id#0L IN (100,1000), isnotnull(id#0L)], Format: Parquet, Location: InMemoryFileIndex(1 paths)[hdfs://localhost:9000/user/hive/warehouse/test_db.db/spark_bucket_table1], PartitionFilters: [], PushedFilters: [In(id, [100,1000]), IsNotNull(id)], ReadSchema: struct<id:bigint>, SelectedBucketsCount: 2 out of 100
   +- Sort [id#1L ASC NULLS FIRST], false, 0
      +- Filter (id#1L IN (100,1000) AND isnotnull(id#1L))
         +- FileScan parquet test_db.spark_bucket_table2[id#1L] Batched: true, Bucketed: true, DataFilters: [id#1L IN (100,1000), isnotnull(id#1L)], Format: Parquet, Location: InMemoryFileIndex(1 paths)[hdfs://localhost:9000/user/hive/warehouse/test_db.db/spark_bucket_table2], PartitionFilters: [], PushedFilters: [In(id, [100,1000]), IsNotNull(id)], ReadSchema: struct<id:bigint>, SelectedBucketsCount: 2 out of 100

I have also raised this with the Spark user community and will update once I get any feedback or if I figure it out myself.
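
For reference, a minimal PySpark sketch of the setup behind the plan above (assumptions: Spark 3.x with Hive support enabled; the test_db database and table names match the plan, but the row counts and filter values are only illustrative):

from pyspark.sql import SparkSession

# Assumed setup: Spark 3.x with Hive support so saveAsTable writes Hive-compatible metadata.
spark = (SparkSession.builder
         .appName("bucket-pruning-demo")
         .enableHiveSupport()
         .getOrCreate())

spark.sql("CREATE DATABASE IF NOT EXISTS test_db")

# Write two tables bucketed into 100 buckets on the join/filter column.
df = spark.range(0, 100000)
(df.write.mode("overwrite")
   .bucketBy(100, "id").sortBy("id")
   .saveAsTable("test_db.spark_bucket_table1"))
(df.write.mode("overwrite")
   .bucketBy(100, "id").sortBy("id")
   .saveAsTable("test_db.spark_bucket_table2"))

# Filter on specific values of the bucketed column and join; the physical plan
# should report "SelectedBucketsCount: 2 out of 100" for each scan, as above.
t1 = spark.table("test_db.spark_bucket_table1").where("id IN (100, 1000)")
t2 = spark.table("test_db.spark_bucket_table2").where("id IN (100, 1000)")
t1.join(t2, "id").explain()

With both tables bucketed on id into 100 buckets, the IN filter should prune each scan down to the two matching buckets, which is what the SelectedBucketsCount entries in the plan above show.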

Michał
#1715 Re: Introduction to Hive Bucketed Table (27 days ago)

What about TODO: ? :)

#1714 Re: PySpark: Convert Python Array/List to Spark Data Frame (2 months ago)

Thanks for the feedback, Ravi. Welcome to Kontext!

Ravi
#1713 Re: PySpark: Convert Python Array/List to Spark Data Frame (2 months ago)

Very nice code and explanation. Excellent feature in PySpark.
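
As a quick illustration of the conversion the article covers (a minimal sketch assuming Spark 3.x; the column names and sample data are only examples, not taken from the article):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("list-to-dataframe").getOrCreate()

# Convert a plain Python list of tuples into a Spark DataFrame,
# supplying the column names explicitly.
data = [(1, "Alice"), (2, "Bob"), (3, "Carol")]
df = spark.createDataFrame(data, schema=["id", "name"])
df.show()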