Raymond Tang

Big Data Engineer, Full Stack .NET and Cross-Platform Software Engineer/Architect


I'm passionate about building data-driven, scalable, cloud-native applications and products.

Microsoft MVP C#/.NET (2010-2016)/Visual Studio | MCP | MCSE: Data Management and Analytics | Google Cloud Platform Certified Professional Data Engineer

LinkedIn · MVP Reconnect

Posts

Bootstrap 4 Mega Full Width Dropdown Menu

Tags: HTML, Javascript, bootstrap

5 views · 0 likes · 25 days ago

On Kontext, a full width mega dropdown menu is implemented using Bootstrap 4. If you'd like to implement similar menus, please follow these steps.  Environment The following code snippets work with Bootstrap 4.4.1. It should also work with other 4.x versions but I didn't test i...

View · Frontend & Javascript

Tags: teradata, python

198 views · 0 likes · 1 month ago

Pandas is commonly used by Python users to perform data operations. In many scenarios, the results need to be saved to storage such as Teradata. This article shows you how to do that easily using JayDeBeApi or  ...

View · Spark + PySpark
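To give a feel for the pattern the post above describes, here is a minimal sketch of writing a pandas DataFrame to Teradata via JayDeBeApi; the host, credentials, jar locations and table name below are placeholders, not values from the article:

import jaydebeapi  # pip install JayDeBeApi
import pandas as pd

# A hypothetical DataFrame to persist.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

# Connect using the Teradata JDBC driver; URL, credentials and jar paths
# are placeholders.
conn = jaydebeapi.connect(
    "com.teradata.jdbc.TeraDriver",
    "jdbc:teradata://hostname/DATABASE=mydb",
    ["username", "password"],
    ["terajdbc4.jar", "tdgssconfig.jar"],
)
cursor = conn.cursor()
# Insert all DataFrame rows in one batch.
cursor.executemany(
    "INSERT INTO mydb.mytable (id, name) VALUES (?, ?)",
    df.values.tolist(),
)
conn.commit()
conn.close()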

Tags: python

59 views · 0 likes · 2 months ago

CSV is a common data format used in many applications. It's also a common task for data workers to read and parse CSV and then save it into another storage system such as an RDBMS (Teradata, SQL Server, MySQL). In my previous article  ...

View · Python Programming
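The teaser above is truncated, but the general read-parse-save flow looks roughly like this sketch (SQLite stands in for Teradata/SQL Server/MySQL so the example is self-contained; the file and table names are invented):

import sqlite3
import pandas as pd

# Read and parse the CSV file into a DataFrame.
df = pd.read_csv("input.csv")  # hypothetical input file

# Save the parsed rows into an RDBMS table.
with sqlite3.connect("example.db") as conn:
    df.to_sql("csv_data", conn, if_exists="replace", index=False)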

Tags: teradata, SQL

29 views · 0 likes · 2 months ago

In SQL Server, we can use the TRUNCATE statement to clear all the records in a table, and it usually performs better compared with DELETE statements as no transaction log is written for each individual row deletion. The syntax looks like the following: TRUNCATE TABLE { database_name.schema_name.tab...

View · Code snippets
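For illustration, the statement can be issued from client code like any other SQL; here is a sketch with pyodbc against SQL Server (the connection string and table name are placeholders, not values from the article):

import pyodbc  # pip install pyodbc

# Placeholder connection string.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=mydb;Trusted_Connection=yes;"
)
conn.autocommit = True
# TRUNCATE removes all rows with minimal logging compared to row-by-row DELETE.
conn.cursor().execute("TRUNCATE TABLE dbo.mytable")
conn.close()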

Tags: teradata, python, Java

126 views · 0 likes · 2 months ago

Python JayDeBeApi module allows you to connect from Python to Teradata databases using Java JDBC drivers. In article Connect to Teradata database through Python , I showed ho...

View · Python Programming
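A minimal connect-and-query sketch of that approach, under the same assumptions as the earlier example (Teradata JDBC jars downloaded locally; placeholder host and credentials):

import jaydebeapi

conn = jaydebeapi.connect(
    "com.teradata.jdbc.TeraDriver",
    "jdbc:teradata://hostname/DATABASE=mydb",
    ["username", "password"],
    ["terajdbc4.jar", "tdgssconfig.jar"],
)
cursor = conn.cursor()
cursor.execute("SELECT 1")  # simple connectivity check
print(cursor.fetchall())
conn.close()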

Tags: hadoop, hive, Java

172 views · 1 like · 2 months ago

When I was configuring Hive 3.0.0 in Hadoop 3.2.1 environment, I encountered the following error: Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V Ro...

View · Hadoop

Tags: pandas, sqlite

35 views · 0 likes · 2 months ago

In my previous posts, I showed how to use  jaydebeapi or sqlite3 pack...

View · Python Programming
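The teaser cuts off, but the pandas-with-SQLite pattern it refers to is roughly this sketch (the database file and table name are invented):

import sqlite3
import pandas as pd

with sqlite3.connect("example.db") as conn:
    # Load a table created elsewhere straight into a DataFrame.
    df = pd.read_sql("SELECT * FROM csv_data", conn)
print(df.head())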

Modern Web Application - Azure Blob Storage for Uploaded Files

Tags: Azure, asp.net core, dotnetcore

79 views · 0 likes · 2 months ago

With cloud platforms like Azure, we can totally separate user content storage from web application storage to decouple components from each other and to make the application easy to scale and deploy. This article provides detailed information, with code snippets, about how to use Azure's serverless Blob Storage product and App Service to enable a horizontally scalable web application for users to upload files (BLOBs).

View · Azure

Azure SQL Database Automated Backup Strategy

Tags: Azure, SQL Server

41 views · 0 likes · 2 months ago

When designing the architecture of the Kontext platform, Azure SQL Database was chosen as the storage for relational data. TDE and other advanced security features are always enabled to protect the database. Backup plans are also employed to ensure I can always restore the database to a point in tim...

View · Azure

Tags: sqlite, python, Java

46 views · 0 likes · 2 months ago

To read data from a SQLite database in Python, you can use the built-in sqlite3 package. Another approach is to use the SQLite JDBC driver via  ...

View · Python Programming
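As a quick sketch of the first approach mentioned above, using only the standard library (the file and table names are invented):

import sqlite3

conn = sqlite3.connect("example.db")  # hypothetical database file
cursor = conn.cursor()
cursor.execute("SELECT id, name FROM csv_data")
for row in cursor.fetchall():
    print(row)
conn.close()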

Comments

Hi Tim,

Just an update on the previous issue (I understand you have fixed it, but I'd like to post my findings here too in case other people are interested).

I've done the following steps to see if I can run the Hadoop daemons without Administrator rights:

  • Created a local computer account named hadoop.
  • Set up environment variables for this account.
  • Reconfigured the HDFS dfs locations for both data and namespace.
  • Formatted the namenode again using this local account:
hadoop namenode -format
  • Started the HDFS daemons:
start-dfs.cmd

The commands ran successfully without any errors.


  • Started the YARN daemons:
start-yarn.cmd

Interestingly, this time NodeManager started successfully while ResourceManager could not, due to the following error:


org.apache.hadoop.service.ServiceStateException: java.io.IOException: Mkdirs failed to create file:/tmp/hadoop-yarn-hadoop/node-attribute

For the YARN tmp folder, I had configured it as follows:

<property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>file:///F:/tmp</value>
</property>

So I then tried the following steps:

  • Stopped all the running Hadoop daemons.
  • Deleted the existing tmp folder and recreated it using the hadoop local account.
  • Deleted the DFS folder and recreated it.
  • Reformatted the namenode.
  • Restarted HDFS: the services started successfully.
  • Started the YARN daemons.

This time the services all started successfully without any errors.

I verified that through the ResourceManager UI too.

So to summarize:

  • You don't necessarily need to create the tmp folder under your user directory.
  • You can run Hadoop services without Administrator privileges on Windows as long as the HDFS and tmp directories are set up correctly using the Windows account that runs the Hadoop daemons.

Hope the above helps.

In reply to:

Tim · 19 days ago
Re: Install Hadoop 3.2.1 on Windows 10 Step by Step Guide

Hello,

So far I have been able to get around my need for admin by changing my config so the tmp-nm folder is in my Documents instead of directly in tmp on the C drive.

However, it seems I still have some issues. Two of them seem to point to the wrong version of winutils.exe. I am running Windows 10 64-bit and am trying to get Hadoop 3.2.1 running. One symptom of the wrong version is the warning repeated over and over in the YARN NodeManager window:

WARN util.SysInfoWindows: Expected split length of sysInfo to be 11. Got 7

Another was the failure code of a job I submitted to insert data into a table from the Hive prompt. The job details were found in the Hadoop cluster local UI:

Application application_1589548856723_0001 failed 2 times due to AM Container for appattempt_1589548856723_0001_000002 exited with exitCode: 1639
Failing this attempt.Diagnostics: [2020-05-15 09:53:23.804]Exception from container-launch.
Container id: container_1589548856723_0001_02_000001
Exit code: 1639
Exception message: Incorrect command line arguments.
Shell output: Usage: task create [TASKNAME] [COMMAND_LINE] |
task isAlive [TASKNAME] |
task kill [TASKNAME]
task processList [TASKNAME]
Creates a new task jobobject with taskname
Checks if task jobobject is alive
Kills task jobobject
Prints to stdout a list of processes in the task
along with their resource usage. One process per line
and comma separated info per process
ProcessId,VirtualMemoryCommitted(bytes),
WorkingSetSize(bytes),CpuTime(Millisec,Kernel+User)
[2020-05-15 09:53:23.831]Container exited with a non-zero exit code 1639.


Some sites have said these two issues are symptoms of having the wrong winutils.exe.

I have some other issues I'll wait to post after I can get these fixed.

I have used the link in this article to get winutils.exe. I have also tried other winutils.exe builds I found out there. However, for the other ones I've tried, when starting YARN the NodeManager window fills with errors like:

2020-05-15 10:12:16,444 ERROR util.SysInfoWindows: java.io.IOException: Cannot run program "C:\Users\XXX\Documents\Big-Data\Hadoop\hadoop-3.2.1\bin\winutils.exe": CreateProcess error=216, This version of %1 is not compatible with the version of Windows you're running. Check your computer's system information and then contact the software publisher

So those ones are worse: I can't even get YARN started due to that error.

So with the version I am using now I can get YARN to start, although I get the "WARN util.SysInfoWindows: Expected split length of sysInfo to be 11. Got 7" warning, but the actual Hive insert fails anyway...

Appreciate the help. How do I find or know whether a winutils.exe is meant for Windows 10 64-bit and Hadoop 3.2.1?


Hi Tim,

On my computer, the paths for HADOOP_HOME and JAVA_HOME are configured to locations without any spaces, as I was worried that spaces might cause problems in the applications.

That's also the reason most Windows Hadoop installation guides recommend configuring them in a path that has no spaces. This is even more important for Hive installation.

So I think you are right that the issue was due to the space in your environment variables.

The JAVA_HOME environment variable is set up in the following file:

%HADOOP_HOME%\etc\hadoop\hadoop-env.cmd

And also in Step 6 of this page, we've added class paths for JARs:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property> 
        <name>mapreduce.application.classpath</name>
        <value>%HADOOP_HOME%/share/hadoop/mapreduce/*,%HADOOP_HOME%/share/hadoop/mapreduce/lib/*,%HADOOP_HOME%/share/hadoop/common/*,%HADOOP_HOME%/share/hadoop/common/lib/*,%HADOOP_HOME%/share/hadoop/yarn/*,%HADOOP_HOME%/share/hadoop/yarn/lib/*,%HADOOP_HOME%/share/hadoop/hdfs/*,%HADOOP_HOME%/share/hadoop/hdfs/lib/*</value>
    </property>
</configuration>

You can try to change them to absolute values with double quotes to see if they work. 

To save all the trouble, I would highly recommend installing Java in a path without spaces, or creating a symbolic link to the Java folder in a location without a space in the path, for example (run from an elevated Command Prompt; the paths here are just examples):

mklink /D C:\Java "C:\Program Files\Java\jre8"

In reply to:

Tim · 18 days ago
Re: Install Hadoop 3.2.1 on Windows 10 Step by Step Guide

Hi Raymond,

Ok, I got my work IT helpdesk to add my ID to the "Create symbolic links" policy. That worked fine, so I am now past the "exitCode=1: CreateSymbolicLink error (1314): A required privilege is not held by the client" error.

Now, it is throwing this error:

Application application_1589579676240_0001 failed 2 times due to AM Container for appattempt_1589579676240_0001_000002 exited with exitCode: 1
Failing this attempt.Diagnostics: [2020-05-15 17:57:08.681]Exception from container-launch.
Container id: container_1589579676240_0001_02_000001
Exit code: 1
Shell output: 1 file(s) moved.
"Setting up env variables"
"Setting up job resources"
"Copying debugging information"
C:\Users\V121119\Documents\Big-Data\tmp-nm\usercache\XXX\appcache\application_1589579676240_0001\container_1589579676240_0001_02_000001>rem Creating copy of launch script
C:\Users\V121119\Documents\Big-Data\tmp-nm\usercache\XXX\appcache\application_1589579676240_0001\container_1589579676240_0001_02_000001>copy "launch_container.cmd" "C:/Users/V121119/Documents/Big-Data/Hadoop/hadoop-3.2.1/logs/userlogs/application_1589579676240_0001/container_1589579676240_0001_02_000001/launch_container.cmd"
1 file(s) copied.
C:\Users\V121119\Documents\Big-Data\tmp-nm\usercache\XXX\appcache\application_1589579676240_0001\container_1589579676240_0001_02_000001>rem Determining directory contents
C:\Users\V121119\Documents\Big-Data\tmp-nm\usercache\XXX\appcache\application_1589579676240_0001\container_1589579676240_0001_02_000001>dir 1>>"C:/Users/XXX/Documents/Big-Data/Hadoop/hadoop-3.2.1/logs/userlogs/application_1589579676240_0001/container_1589579676240_0001_02_000001/directory.info"
"Launching container"
[2020-05-15 17:57:08.696]Container exited with a non-zero exit code 1. Last 4096 bytes of stderr :
'C:\Program' is not recognized as an internal or external command,
operable program or batch file.


I think I know what is happening but don't know how to fix it. One of my first errors was in hadoop-config.cmd, where it checks if not exist %JAVA_HOME%\bin\java.exe, but the problem is the path: C:\Program Files\Java\jre8. Please note this directory, "Program Files", has a space in it. The result was that I had to modify hadoop-config.cmd to put double quotes around the check, making it:

if not exist "%JAVA_HOME%\bin\java.exe"

This resolved that problem. Then another place I found an issue was on this line:

for /f "delims=" %%A in ('%JAVA% -Xmx32m %HADOOP_JAVA_PLATFORM_OPTS% -classpath "%CLASSPATH%" org.apache.hadoop.util.PlatformName') do set JAVA_PLATFORM=%%A

This was failing for a similar reason: some values in the classpath list were C:\Program Files\..., and once it hit the space it blew up. For this line of code I just remmed it out, since my HADOOP_JAVA_PLATFORM_OPTS is empty; I am not sure what I would have done had HADOOP_JAVA_PLATFORM_OPTS been populated. In any case, these were preliminary issues and all dealt with the fact that the C:\Program Files... path was causing issues due to the space. Therefore, when I saw this latest exception, I assumed it too was hitting a space in the Java path or in some member of the classpath, but I am not sure where to modify or how to work around it.

As of 5/15 6:10pm EDT - this is my current issue - you may disregard the prior comments if you wish since they are resolved...   Thanks


Hi Tim,

I made similar changes to yours:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>F:/big-data/data/tmp</value>
    </property>
</configuration>

And I could not start the YARN NodeManager service because of the following error:

Caused by: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Permissions incorrectly set for dir F:/big-data/data/tmp/filecache, should be rwxr-xr-x, actual value = rwxrwxr-x

This issue is recorded here:

I cannot resolve this problem without running the commands as Administrator. 

Based on the JIRA links, these issues should have been fixed; however, it may not work for me because my Windows account is not a local account or a domain account.

I will find some time to try using a local account directly (without a Microsoft account) to see if it works.

It seems you didn't hit any issues when changing the local tmp folder, is that correct?

In reply to:

Comment is deleted or blocked.


Hi Tim,

Have you checked the YARN web portal to see if the Spark application was submitted successfully? You should be able to find more details there too (assuming you run Spark with master set to yarn).

I’m working today and will try to replicate what you did on my machine when I am off work.



In reply to:

Comment is deleted or blocked.


Can you add the environment variables into your bash profile?

vi ~/.bashrc

And then insert the following lines (replace the values with your paths as shown in your screenshot):

export HADOOP_HOME='/cygdrive/f/DataAnalytics/hadoop-3.0.0'
export PATH=$PATH:$HADOOP_HOME/bin
export HIVE_HOME='/cygdrive/f/DataAnalytics/apache-hive-3.0.0-bin'
export PATH=$PATH:$HIVE_HOME/bin
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_HOME/lib/*.jar

Save the file after inserting, then run source ~/.bashrc (or reopen the terminal) so the changes take effect.

It's very hard to debug without access to your environment. 

In reply to:

Praveen · 2 months ago
Re: Apache Hive 3.0.0 Installation on Windows 10 Step by Step Guide

Still facing the same issue...

Tried with the derby command. I have shown the echoed paths of Hive and Hadoop in the screenshot below...

Can you help me with this?



Hello Praveen,

If you close the cygwin windows and reopen it and then type the following:

echo $HIVE_HOME
echo $HADOOP_HOME

Does that still list all the values?

Also, I noticed you are using MySQL as the Hive metastore; have you configured all the values correctly? If not, I would recommend using Derby if you are just installing Hive for learning. Derby is built in; however, it only supports one session concurrently.

In reply to:

Praveen · 2 months ago
Re: Apache Hive 3.0.0 Installation on Windows 10 Step by Step Guide

While I try to execute the below command in Cygwin:

$HIVE_HOME/bin/schematool -dbType derby -initSchema

I am facing issues... I have installed Hadoop 3.1.0.

Can anyone help me with this?


Hi Saad,

Refer to the Reference section on this page: Default Ports Used by Hadoop Services (HDFS, MapReduce, YARN). It has the links to the official documentation about all the parameters you can configure in HDFS and YARN, and it also shows the default values for each configuration.

For different versions of Hadoop, the default values might be different. 

In reply to:

Saad · 2 months ago
Re: Install Hadoop 3.2.1 on Windows 10 Step by Step Guide

Hi,

http://localhost:9870/dfshealth.html#tab-overview

http://localhost:9864/datanode.html
These two links were not opening once I reached the end, so I started changing values in hdfs-site.xml to some other path locations on the E drive, and then I think I got lost.

Today when I run start-dfs.cmd, the data and name nodes start without any error and I can see the above two URLs without any error.

Thanks for the quick reply.

Can you also guide me on where I can find and change port values like 8088, 9870, etc.?

 Thanks again for this tutorial.

Regards,

Saad


Hi Saad,

I don't see any error message in the log you pasted.

Can you please be more specific about the errors you encountered?

For formatting the namenode, it is correct to expect the namenode daemon to shut down after the format is done. We will start all the HDFS and YARN daemons at the end.

In reply to:

Saad · 2 months ago
Re: Install Hadoop 3.2.1 on Windows 10 Step by Step Guide

Hello Raymond,

I am learning about Hadoop and was following your detailed guide on Windows installation. Here is the output from formatting the namenode:

2020-04-27 22:32:04,347 INFO namenode.NameNode: createNameNode [-format]
Formatting using clusterid: CID-1d0c51aa-5dde-446b-99c1-3997255160fa
2020-04-27 22:32:05,369 INFO namenode.FSEditLog: Edit logging is async:true
2020-04-27 22:32:05,385 INFO namenode.FSNamesystem: KeyProvider: null
2020-04-27 22:32:05,387 INFO namenode.FSNamesystem: fsLock is fair: true
2020-04-27 22:32:05,388 INFO namenode.FSNamesystem: Detailed lock hold time metrics enabled: false
2020-04-27 22:32:05,428 INFO namenode.FSNamesystem: fsOwner             = saad (auth:SIMPLE)
2020-04-27 22:32:05,431 INFO namenode.FSNamesystem: supergroup          = supergroup
2020-04-27 22:32:05,431 INFO namenode.FSNamesystem: isPermissionEnabled = true
2020-04-27 22:32:05,432 INFO namenode.FSNamesystem: HA Enabled: false
2020-04-27 22:32:05,535 INFO common.Util: dfs.datanode.fileio.profiling.sampling.percentage set to 0. Disabling file IO profiling
2020-04-27 22:32:05,554 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit: configured=1000, counted=60, effected=1000
2020-04-27 22:32:05,554 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
2020-04-27 22:32:05,562 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
2020-04-27 22:32:05,563 INFO blockmanagement.BlockManager: The block deletion will start around 2020 Apr 27 22:32:05
2020-04-27 22:32:05,566 INFO util.GSet: Computing capacity for map BlocksMap
2020-04-27 22:32:05,566 INFO util.GSet: VM type       = 64-bit
2020-04-27 22:32:05,568 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
2020-04-27 22:32:05,568 INFO util.GSet: capacity      = 2^21 = 2097152 entries
2020-04-27 22:32:05,579 INFO blockmanagement.BlockManager: Storage policy satisfier is disabled
2020-04-27 22:32:05,580 INFO blockmanagement.BlockManager: dfs.block.access.token.enable = false
2020-04-27 22:32:05,588 INFO Configuration.deprecation: No unit for dfs.namenode.safemode.extension(30000) assuming MILLISECONDS
2020-04-27 22:32:05,589 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
2020-04-27 22:32:05,589 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.min.datanodes = 0
2020-04-27 22:32:05,589 INFO blockmanagement.BlockManagerSafeMode: dfs.namenode.safemode.extension = 30000
2020-04-27 22:32:05,591 INFO blockmanagement.BlockManager: defaultReplication         = 1
2020-04-27 22:32:05,591 INFO blockmanagement.BlockManager: maxReplication             = 512
2020-04-27 22:32:05,591 INFO blockmanagement.BlockManager: minReplication             = 1
2020-04-27 22:32:05,592 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
2020-04-27 22:32:05,592 INFO blockmanagement.BlockManager: redundancyRecheckInterval  = 3000ms
2020-04-27 22:32:05,592 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
2020-04-27 22:32:05,593 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
2020-04-27 22:32:05,646 INFO namenode.FSDirectory: GLOBAL serial map: bits=29 maxEntries=536870911
2020-04-27 22:32:05,646 INFO namenode.FSDirectory: USER serial map: bits=24 maxEntries=16777215
2020-04-27 22:32:05,647 INFO namenode.FSDirectory: GROUP serial map: bits=24 maxEntries=16777215
2020-04-27 22:32:05,647 INFO namenode.FSDirectory: XATTR serial map: bits=24 maxEntries=16777215
2020-04-27 22:32:05,664 INFO util.GSet: Computing capacity for map INodeMap
2020-04-27 22:32:05,664 INFO util.GSet: VM type       = 64-bit
2020-04-27 22:32:05,664 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
2020-04-27 22:32:05,665 INFO util.GSet: capacity      = 2^20 = 1048576 entries
2020-04-27 22:32:05,666 INFO namenode.FSDirectory: ACLs enabled? false
2020-04-27 22:32:05,666 INFO namenode.FSDirectory: POSIX ACL inheritance enabled? true
2020-04-27 22:32:05,667 INFO namenode.FSDirectory: XAttrs enabled? true
2020-04-27 22:32:05,667 INFO namenode.NameNode: Caching file names occurring more than 10 times
2020-04-27 22:32:05,674 INFO snapshot.SnapshotManager: Loaded config captureOpenFiles: false, skipCaptureAccessTimeOnlyChange: false, snapshotDiffAllowSnapRootDescendant: true, maxSnapshotLimit: 65536
2020-04-27 22:32:05,677 INFO snapshot.SnapshotManager: SkipList is disabled
2020-04-27 22:32:05,681 INFO util.GSet: Computing capacity for map cachedBlocks
2020-04-27 22:32:05,681 INFO util.GSet: VM type       = 64-bit
2020-04-27 22:32:05,682 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
2020-04-27 22:32:05,683 INFO util.GSet: capacity      = 2^18 = 262144 entries
2020-04-27 22:32:05,713 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.window.num.buckets = 10
2020-04-27 22:32:05,714 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.num.users = 10
2020-04-27 22:32:05,714 INFO metrics.TopMetrics: NNTop conf: dfs.namenode.top.windows.minutes = 1,5,25
2020-04-27 22:32:05,720 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
2020-04-27 22:32:05,721 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
2020-04-27 22:32:05,723 INFO util.GSet: Computing capacity for map NameNodeRetryCache
2020-04-27 22:32:05,723 INFO util.GSet: VM type       = 64-bit
2020-04-27 22:32:05,724 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
2020-04-27 22:32:05,724 INFO util.GSet: capacity      = 2^15 = 32768 entries
2020-04-27 22:32:05,765 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1264791665-192.168.10.2-1588008725757
2020-04-27 22:32:05,810 INFO common.Storage: Storage directory E:\big-data\data\dfs\namespace_logs has been successfully formatted.
2020-04-27 22:32:05,841 INFO namenode.FSImageFormatProtobuf: Saving image file E:\big-data\data\dfs\namespace_logs\current\fsimage.ckpt_0000000000000000000 using no compression
2020-04-27 22:32:05,939 INFO namenode.FSImageFormatProtobuf: Image file E:\big-data\data\dfs\namespace_logs\current\fsimage.ckpt_0000000000000000000 of size 399 bytes saved in 0 seconds .
2020-04-27 22:32:05,957 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2020-04-27 22:32:05,963 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid=0 when meet shutdown.
2020-04-27 22:32:05,963 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at DESKTOP-ROC4R5P/192.168.10.2
************************************************************/

I have downloaded the jar and put it in the folder as well:

https://github.com/FahaoTang/big-data/blob/master/hadoop-hdfs-3.2.1.jar

Can you help me with what I am setting wrong? It would be a great help and guidance.

Regards,

Saad


This article is for Hive 3.1.1 installation on Windows 10 using WSL. All the commands need to run in a WSL bash window (not Command Prompt).

Based on your screenshot, you are trying to install it on Windows 10 directly. If that's the case, please follow this article:

Apache Hive 3.0.0 Installation on Windows 10 Step by Step Guide

It has been tested by quite a few users with successful installations.

In reply to:

Yathish · 2 months ago
Re: Apache Hive 3.1.1 Installation on Windows 10 using Windows Subsystem for Linux

How do we run schematool on Windows?

It also fails to run with Cygwin available.


I’m glad it is working. Have fun with your big data journey!
In reply to:

Saikat · 2 months ago
Re: Apache Hive 3.0.0 Installation on Windows 10 Step by Step Guide

Hey Raymond,

Thanks a lot brother for your prompt response.

This worked like a charm after amending mapred-site.xml with %HADOOP_HOME%. I made a mistake using the Unix convention for the variable and was trying the same thing with $HADOOP_HOME$.

Successfully inserted data. Cheers!


Columns

ML.NET is an open source and cross-platform machine learning framework. With ML.NET, you can create custom ML models using C# or F# without having to leave the .NET ecosystem. This column publishes articles about ML.NET.

View

Code snippets for various programming languages/frameworks.

View

Data analytics with Google Cloud Platform.

View

Data analytics, application development with Microsoft Azure cloud platform.

View

Posts about Apache Sqoop, a tool designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases.

View

PowerShell, Bash, ksh, sh, Perl, etc.

View

General IT information for programming.

View