Forums

This forum is for general programming and development-related discussions.

Discuss cloud computing technologies, learning resources, etc.

Discuss big data frameworks and technologies such as Hadoop, Spark, etc.

Please publish any questions related to the Kontext website here, including feature suggestions, bug reports, and other feedback. Visit the Help Centre to learn how to use the Kontext platform efficiently.

#1530 Re: Hadoop 3.3.1 winutils (1d ago)

Thanks for the suggestion. I've added the Hadoop project license file, as the Hadoop project actually has different licenses for different parts of the project.

https://github.com/kontext-tech/winutils/blob/master/LICENSE.txt 

#1529 Re: Hadoop 3.3.1 winutils (1d ago)

Hi - can you please add the original Apache License back to the GitHub repository? All of these repos are forked from https://github.com/steveloughran/winutils/blob/master/, which is under Apache 2.0.

#1528 Re: Install Apache Spark 3.0.0 on Windows 10 (3d ago)

If you use Derby for the Hive metastore, please make sure the current directory in your Command Prompt is the same one you were in when you ran the init command; otherwise you will have to initialize the metastore again. I suspect that is what caused your error, but I will need to look into the details to be sure.

The data warehouse folder exists in HDFS, not directly in the local file system.
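
As a rough illustration of the working-directory point: with the default embedded Derby setup, the `metastore_db` folder is created relative to the directory the command was launched from, so a quick check is whether that folder exists where you are about to start Hive. This is only a sketch; the helper name `has_derby_metastore` is made up for the example, and it assumes the connection URL in hive-site.xml has not been changed from the default.

```python
from pathlib import Path


def has_derby_metastore(directory="."):
    """Return True if `directory` contains a Derby `metastore_db` folder.

    Assumption: default embedded Derby configuration, where the database
    folder is created relative to the directory Hive was initialized from.
    """
    return (Path(directory) / "metastore_db").is_dir()


if __name__ == "__main__":
    if has_derby_metastore():
        print("metastore_db found here; start Hive from this directory.")
    else:
        print("No metastore_db here; change to the directory used for the init command.")
```

If the check fails, change to the folder where the init command was originally run before starting Hive again.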

#1527 Re: Apache Hive 3.1.2 Installation on Windows 10 (3d ago)

Hi Orland, I may be able to find some time this week (after my work hours or on the weekend).

Can you please send an email with your timezone and preferred time to the following mailbox?

Contact us 

I will try to organize one session with you. 


#1526 Re: Install Hadoop 3.2.1 on Windows 10 Step by Step Guide (3d ago)

Hi.

What does this mean? Here's the link to the full output: https://www.dropbox.com/s/00rjsiyu8ezdf2w/yarn%20node%20manager.txt?dl=0

This is my output for the Hive metastore. It shows warnings, and I have no access to HiveServer2.

https://www.dropbox.com/s/ec16lpp8d0tz1n9/--servicemetastoreoutput.txt?dl=0

2021-10-19 13:39:44,152 WARN nativeio.NativeIO: NativeIO.getStat error (3): The system cannot find the path specified.

 -- file path: tmp/hadoop-User/nm-local-dir/filecache

2021-10-19 13:39:44,219 WARN nativeio.NativeIO: NativeIO.getStat error (3): The system cannot find the path specified.

 -- file path: tmp/hadoop-User/nm-local-dir/usercache

2021-10-19 13:39:44,285 WARN nativeio.NativeIO: NativeIO.getStat error (3): The system cannot find the path specified.

 -- file path: tmp/hadoop-User/nm-local-dir/nmPrivate
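
To pull the affected paths out of a longer log, a small sketch like the following may help. It assumes only the two-line layout shown in the excerpt above (a warning line followed by a `-- file path:` line); the function name `missing_paths` is hypothetical.

```python
import re

# Matches the "-- file path: ..." line that follows each NativeIO warning.
PATH_LINE = re.compile(r"^\s*--\s*file path:\s*(?P<path>.+?)\s*$")


def missing_paths(log_text):
    """Return every path reported on a '-- file path:' line, in order."""
    paths = []
    for line in log_text.splitlines():
        match = PATH_LINE.match(line)
        if match:
            paths.append(match.group("path"))
    return paths


if __name__ == "__main__":
    sample = (
        "2021-10-19 13:39:44,152 WARN nativeio.NativeIO: NativeIO.getStat error (3): "
        "The system cannot find the path specified.\n"
        "\n"
        " -- file path: tmp/hadoop-User/nm-local-dir/filecache\n"
    )
    for path in missing_paths(sample):
        print(path)
```

Each printed path is one that the NodeManager could not find, which makes it easier to check them one by one.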

#1525 Re: Install Apache Spark 3.0.0 on Windows 10 (3d ago)

Nope, it didn't. BTW Raymond, I managed to run Hive smoothly the other day after installation and was able to access HiveServer2. Now when I try to connect, I can access Hive, but hive --help doesn't work and I can't connect to HiveServer2 either when I run these commands:

$HIVE_HOME/bin/hive --service metastore &

$HIVE_HOME/bin/hive --service hiveserver2 start &

Also, I don't have a hive folder with a warehouse subfolder (/user/hive/warehouse) in my users directory.

#1524 Re: Apache Hive 3.1.2 Installation on Windows 10 (3d ago)

Hi again Raymond, is there any way I can get in touch with you through chat? I need mentoring in Hive and Spark for some exercises, and I find this topic quite challenging.


#1523 Re: Install Apache Spark 3.0.0 on Windows 10 (5d ago)

Did your Spark session crash after you saw the warning message?

WARN executor.ProcfsMetricsGetter: Exception when trying to compute pagesize, as a result reporting of ProcessTree metrics is stopped.

If it doesn't crash, it is OK. One recommendation is to create a PYSPARK_PYTHON environment variable that points to the Python executable on your machine. However, since you are using spark-shell (Scala), I don't think Python matters.

Can you run the following command in Command Prompt:

getconf PAGESIZE

You should see the page size printed as a number.

The getconf command is provided by Git Bash, so if you cannot run it successfully, it suggests you have not added the Git Bash bin folder to the PATH environment variable correctly. Please do that as I suggested in the preceding comment.
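
If you prefer to check this from a script, here is a minimal sketch using Python's standard `shutil.which`, which looks a command up on PATH. The wrapper name `find_tool` is just for illustration.

```python
import shutil


def find_tool(name="getconf"):
    """Return the full path of `name` if it is found on PATH, else None."""
    return shutil.which(name)


if __name__ == "__main__":
    location = find_tool()
    if location:
        print(f"getconf found at {location}")
    else:
        print("getconf not found; check that the Git Bash bin folder is on PATH")
```

Run it from the same kind of terminal you use to start Spark, since PATH can differ between terminals.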

#1522 Re: Install Apache Spark 3.0.0 on Windows 10 (5d ago)

Sorry I forgot to mention that %SPARK_HOME% works with Command Prompt.

For Git Bash, please use $SPARK_HOME to access the environment variable.

To add the Git Bash bin folder to PATH, add C:\Program Files\Git\usr\bin to the PATH environment variable. Depending on where Git is installed on your computer, change the path accordingly. Alternatively, you can go directly to the Spark installation folder and copy the file manually instead of using a command.
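
To confirm the folder really ended up in PATH, a small sketch like this can be used. It compares entries case-insensitively and ignores trailing slashes, which is an assumption that suits Windows paths; the function name `is_on_path` is hypothetical.

```python
import os


def is_on_path(directory, path_value, sep=";"):
    """Return True if `directory` is one of the entries in `path_value`.

    Entries are compared case-insensitively with trailing slashes removed,
    which matches how Windows treats paths (an assumption of this sketch).
    """
    def normalize(entry):
        return entry.strip().rstrip("\\/").lower()

    wanted = normalize(directory)
    return any(normalize(entry) == wanted for entry in path_value.split(sep) if entry.strip())


if __name__ == "__main__":
    git_bin = r"C:\Program Files\Git\usr\bin"  # adjust to your Git install location
    current = os.environ.get("PATH", "")
    print("Git Bash bin on PATH:", is_on_path(git_bin, current, os.pathsep))
    print("SPARK_HOME =", os.environ.get("SPARK_HOME", "(not set)"))
```

Open a new terminal after changing PATH, because terminals that were already open keep the old value.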

#1521 Re: Install Apache Spark 3.0.0 on Windows 10 (5d ago)

This is the warning. Thank you.

Top contributors
#    Articles  Threads  Comments
1    549       7        168
2    50        5        6
3    3         0        0
4    1         0        6
5    0         2        0
6    0         1        0
7    0         0        11
8    0         0        8
9    0         0        8
10   0         0        7