Install Hadoop 3.3.1 on Windows 10 Step by Step Guide

Raymond (2021-10-12)

This detailed step-by-step guide shows you how to install the latest Hadoop, v3.3.1, on Windows 10. It uses the Hadoop 3.3.1 winutils tool; WSL (Windows Subsystem for Linux) is not required. Hadoop 3.3.1 was released on June 15, 2021 and is the second release of the Apache Hadoop 3.3 line.

Please follow all the instructions carefully. Once you complete the steps, you will have a shiny pseudo-distributed single-node Hadoop to work with.

References

Refer to the following articles if you prefer to install other versions of Hadoop, configure a multi-node cluster, or use WSL.

Required tools

Before you start, make sure you have the following tools available in Windows 10.

PowerShell

We will use this tool to download the package.

In my system, the PowerShell version table is listed below:

$PSVersionTable
Name                           Value
----                           -----
PSVersion                      5.1.19041.1237
PSEdition                      Desktop
PSCompatibleVersions           {1.0, 2.0, 3.0, 4.0...}
BuildVersion                   10.0.19041.1237
CLRVersion                     4.0.30319.42000
WSManStackVersion              3.0
PSRemotingProtocolVersion      2.3
SerializationVersion           1.1.0.1
Git Bash or 7-Zip

We will use Git Bash or 7-Zip to unpack the Hadoop binary package.

You can choose either tool, or any other tool that can extract *.tar.gz files on Windows.

Command Prompt

We will use it to start Hadoop daemons and run some commands as part of the installation process.
Java JDK

JDK is required to run Hadoop as the framework is built using Java.

In my system, the JDK version is jdk1.8.0_161.

Check out the supported JDK versions on the following page:

https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+Java+Versions

Starting from Hadoop 3.3.1, the Java 11 runtime is also supported.

Now we will start the installation process. 

Step 1 - Download Hadoop binary package

Go to the download page of the official website:

Apache Download Mirrors - Hadoop 3.3.1

Then choose one of the mirror links. The page lists the mirrors closest to you based on your location. I am choosing the following mirror link:

https://dlcdn.apache.org/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz

Note: In the following sections, this URL will be used to download the package. Your URL might be different from mine; replace the link accordingly.

Download the package

Note: In this guide, I am installing Hadoop in the folder big-data on my F drive (F:\big-data). If you prefer to install on another drive, remember to change the path accordingly in the following command lines. This directory is also called the destination directory in the following sections.

Open PowerShell and then run the following command lines one by one:

$dest_dir="F:\big-data"
$url = "https://dlcdn.apache.org/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz"
$client = new-object System.Net.WebClient
$client.DownloadFile($url,$dest_dir+"\hadoop-3.3.1.tar.gz")

It may take a few minutes to download. 

Once the download completes, you can verify it:

PS F:\big-data> cd $dest_dir
PS F:\big-data> ls


    Directory: F:\big-data


Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----        12/10/2021   7:28 PM      605187279 hadoop-3.3.1.tar.gz

You can also directly download the package through your web browser and save it to the destination directory.
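Optionally, you can verify the integrity of the download. Apache publishes a checksum file alongside each release; the sketch below computes the local SHA-512 hash so you can compare it against the published value (the .sha512 URL is an assumption based on the standard Apache layout):

# Compute the SHA-512 hash of the downloaded package (Get-FileHash is built into PowerShell 4+).
Get-FileHash "$dest_dir\hadoop-3.3.1.tar.gz" -Algorithm SHA512
# Compare the printed hash with the contents of:
# https://downloads.apache.org/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz.sha512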

Warning: Please keep this PowerShell window open, as we will use some variables from this session in the following steps. If you have already closed it, that is okay; just remember to reinitialize the variables above ($dest_dir, $url and $client).

Step 2 - Unpack the package

Now we need to unpack the downloaded package using a GUI tool (like 7-Zip) or the command line. I will use Git Bash to unpack it.

Open Git Bash and change the directory to the destination folder (Git Bash does not inherit the PowerShell variables, so type the path explicitly):

cd /f/big-data

And then run the following command to unzip:

tar -xvzf hadoop-3.3.1.tar.gz

The command will take quite a few minutes as there are numerous files included and the latest version introduced many new features.

After the unzip command is completed, a new folder hadoop-3.3.1 is created under the destination folder. 


Note: When running the command, you may see errors like the following:

tar: Exiting with failure status due to previous errors

You can ignore them for now.
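If you prefer not to install Git Bash or 7-Zip, recent Windows 10 builds (version 1803 and later) ship a built-in tar.exe (bsdtar) that can extract .tar.gz archives; a minimal sketch, assuming your build includes it:

# Run from PowerShell or Command Prompt in the destination directory.
cd F:\big-data
tar -xzf hadoop-3.3.1.tar.gz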

Step 3 - Install Hadoop native IO binary

Hadoop on Linux includes optional native IO support; on Windows, however, native IO is mandatory, and without it your installation will not work. The Windows native IO libraries are not included in the Apache Hadoop release, so we need to build or download them separately.

Note: The following repository provides pre-built Hadoop Windows native libraries:
https://github.com/kontext-tech/winutils
Warning: These libraries are not signed and there is no guarantee that they are 100% safe. We use them purely for test-and-learn purposes.

Download all the files in the following location and save them to the bin folder under the Hadoop folder. For my environment, the full path is F:\big-data\hadoop-3.3.1\bin. Remember to change it to your own path accordingly.

https://github.com/kontext-tech/winutils/tree/master/hadoop-3.3.1/bin
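If you prefer to script this step, the sketch below downloads two of the key binaries via the raw GitHub URLs, reusing the $client and $dest_dir variables from Step 1 (a sketch only; the exact file list is an assumption, so check the repository and download every file in its bin folder):

# Download native IO binaries into the Hadoop bin folder (file names are illustrative).
$bin_url = "https://raw.githubusercontent.com/kontext-tech/winutils/master/hadoop-3.3.1/bin"
$hadoop_bin = "$dest_dir\hadoop-3.3.1\bin"
$client.DownloadFile("$bin_url/winutils.exe", "$hadoop_bin\winutils.exe")
$client.DownloadFile("$bin_url/hadoop.dll", "$hadoop_bin\hadoop.dll")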

After this, the bin folder looks like the following:
[screenshot: bin folder contents]

Step 4 - (Optional) Java JDK installation

Java JDK is required to run Hadoop. If you have not installed Java JDK, please install it.

You can install JDK 8 from the following page:

https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

Once you complete the installation, please run the following command in PowerShell or Git Bash to verify:

$ java -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)

If you get an error like 'cannot find java command or executable', don't worry; we will resolve it in the next step.

Step 5 - Configure environment variables

Now that we've downloaded and unpacked all the artefacts, we need to configure two important environment variables.

Configure JAVA_HOME environment variable

As mentioned earlier, Hadoop requires Java, and we need to configure the JAVA_HOME environment variable (it is not strictly mandatory, but I recommend setting it).

First, we need to find out the location of the Java JDK. In my system, the path is D:\Java\jdk1.8.0_161.

Your location may be different depending on where you installed your JDK.

Then run the following command in the previous PowerShell window:

SETX JAVA_HOME "D:\Java\jdk1.8.0_161" 

Remember to quote the path, especially if it contains spaces.

Note: You can set the environment variable at system level by adding the /M option; however, in case you don't have permission to change system variables, you can set it at user level.

The output looks like the following:

[screenshot: SETX output]
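As noted above, the system-level variant only differs by the /M switch and must run from an elevated prompt (the JDK path below is from my system, so adjust it to yours):

SETX JAVA_HOME "D:\Java\jdk1.8.0_161" /M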

Configure HADOOP_HOME environment variable

Similarly, we need to create a new environment variable for HADOOP_HOME using the following command. The path should be your extracted Hadoop folder. For my environment it is F:\big-data\hadoop-3.3.1.

If you used PowerShell to download the package and the window is still open, you can simply run the following command:

SETX HADOOP_HOME "$dest_dir\hadoop-3.3.1"

Alternatively, you can specify the full path:

SETX HADOOP_HOME "F:\big-data\hadoop-3.3.1"

Now you can also verify the two environment variables in the system:

[screenshot: environment variables]
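Because SETX persists values to the registry rather than the current session, one way to inspect the stored user-level values without opening a new window is via the .NET API (a quick check, not required):

# Read back the user-level values that SETX just saved.
[Environment]::GetEnvironmentVariable("JAVA_HOME", "User")
[Environment]::GetEnvironmentVariable("HADOOP_HOME", "User")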

Configure PATH environment variable

Once we finish setting up the above two environment variables, we need to add the bin folders to the PATH environment variable. 

If the PATH environment variable already exists in your system, you can manually add the following two paths to it:

  • %JAVA_HOME%/bin
  • %HADOOP_HOME%/bin

Alternatively, you can run the following command to add them:

setx PATH "$env:PATH;$env:JAVA_HOME/bin;$env:HADOOP_HOME/bin"

If you don't have other user variables set up in the system, you can also directly add a Path environment variable that references the others to keep it short:

[screenshot: Path variable]

Close the PowerShell window, open a new one, and type winutils.exe to verify that the steps above completed successfully:

[screenshot: winutils.exe usage output]

You should also be able to run the following command:

hadoop -version
java version "1.8.0_161"
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.161-b12, mixed mode)
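Note that hadoop -version prints the Java runtime details, as shown above. To print the Hadoop build information itself, run the version subcommand without the dash:

hadoop version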

Step 6 - Configure Hadoop

Now we are ready for the most important part: the Hadoop configuration, which involves Core, YARN, MapReduce and HDFS settings.

Configure core site

Edit file core-site.xml in %HADOOP_HOME%\etc\hadoop folder. For my environment, the actual path is F:\big-data\hadoop-3.3.1\etc\hadoop.

Replace the configuration element with the following:

<configuration>
   <property>
     <name>fs.default.name</name>
     <value>hdfs://0.0.0.0:19000</value>
   </property>
</configuration>

Note: fs.default.name is deprecated in Hadoop 3 in favor of fs.defaultFS, but both names still work in 3.3.1.

Configure HDFS

Edit file hdfs-site.xml in the %HADOOP_HOME%\etc\hadoop folder.

Before editing, please create two folders in your system: one for the namenode directory and another for the data directory. For my system, I created the following two sub-folders:

  • F:\big-data\data\dfs\namespace_logs_331
  • F:\big-data\data\dfs\data_331

Replace the configuration element with the following (remember to replace the two paths with your own):

<configuration>
   <property>
     <name>dfs.replication</name>
     <value>1</value>
   </property>
   <property>
     <name>dfs.namenode.name.dir</name>
     <value>file:///F:/big-data/data/dfs/namespace_logs_331</value>
   </property>
   <property>
     <name>dfs.datanode.data.dir</name>
     <value>file:///F:/big-data/data/dfs/data_331</value>
   </property>
</configuration>

In Hadoop 3, the property names are slightly different from previous versions. Refer to the following official documentation to learn more about the configuration properties:

Hadoop 3.3.1 hdfs-default.xml

Note: We set the DFS replication factor to 1 because we are configuring a single node; by default the value is 3.
Note: The directory configurations are not mandatory; by default Hadoop will use its temporary folder. For this tutorial, I recommend customizing the values.

Configure MapReduce and YARN site

Edit file mapred-site.xml in the %HADOOP_HOME%\etc\hadoop folder.

Replace the configuration element with the following:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property> 
        <name>mapreduce.application.classpath</name>
        <value>%HADOOP_HOME%/share/hadoop/mapreduce/*,%HADOOP_HOME%/share/hadoop/mapreduce/lib/*,%HADOOP_HOME%/share/hadoop/common/*,%HADOOP_HOME%/share/hadoop/common/lib/*,%HADOOP_HOME%/share/hadoop/yarn/*,%HADOOP_HOME%/share/hadoop/yarn/lib/*,%HADOOP_HOME%/share/hadoop/hdfs/*,%HADOOP_HOME%/share/hadoop/hdfs/lib/*</value>
    </property>
</configuration>

Edit file yarn-site.xml in the %HADOOP_HOME%\etc\hadoop folder and replace the configuration element with the following:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>

Step 7 - Initialize HDFS

Run the following command in Command Prompt to format the namenode:

hdfs namenode -format

The following is an example when it is formatted successfully:

[screenshot: namenode format output]
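After a successful format, the namenode directory configured in hdfs-site.xml should contain a current sub-folder with the metadata files. A quick sanity check (the path below is from my configuration, so adjust it to yours):

dir F:\big-data\data\dfs\namespace_logs_331\current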

Step 8 - Start HDFS daemons 

Run the following command to start HDFS daemons in Command Prompt:

%HADOOP_HOME%\sbin\start-dfs.cmd

Two Command Prompt windows will open: one for the datanode and another for the namenode, as the following screenshot shows:

[screenshot: datanode and namenode windows]

You can verify the processes by running the following command:

jps

The results should include the two processes:

F:\big-data>jps
15704 DataNode
16828 Jps
23228 NameNode

Verify the HDFS web portal UI via this link: http://localhost:9870/dfshealth.html#tab-overview.

[screenshot: HDFS overview page]

You can also navigate to a data node UI:

[screenshot: datanode UI]
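As an optional smoke test, you can create a folder in HDFS and list the root directory (the folder name /test is just a placeholder):

hadoop fs -mkdir /test
hadoop fs -ls /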

Step 9 - Start YARN daemons

Warning: You may encounter permission issues if you start the YARN daemons as a normal user. To avoid them, open a Command Prompt window using Run as administrator. Alternatively, you can follow this comment, which doesn't require Administrator permission and uses a local Windows account:
https://kontext.tech/comment/314

Run the following command in an elevated Command Prompt window (Run as administrator) to start YARN daemons:

%HADOOP_HOME%\sbin\start-yarn.cmd

Similarly, two Command Prompt windows will open: one for the resource manager and another for the node manager, as the following screenshot shows:

[screenshot: resource manager and node manager windows]

Once all services have started successfully, you can open the YARN resource manager UI:

http://localhost:8088

[screenshot: YARN resource manager UI]
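To confirm YARN can actually schedule work, you can optionally submit the bundled pi example in Command Prompt (the examples jar ships with the release, but double-check the jar name in your share\hadoop\mapreduce folder):

yarn jar %HADOOP_HOME%\share\hadoop\mapreduce\hadoop-mapreduce-examples-3.3.1.jar pi 2 10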

Step 10 - Verify Java processes

Run the following command to verify all running processes:

jps

The output looks like the following screenshot:

[screenshot: jps output]

  • We can see the process ID of each Java process for HDFS/YARN.
  • Your process IDs may be different from mine.

Step 11 - Shutdown YARN & HDFS daemons

You don't need to keep the services running all the time. You can stop them by running the following commands one by one once you finish the test:

%HADOOP_HOME%\sbin\stop-yarn.cmd
%HADOOP_HOME%\sbin\stop-dfs.cmd
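After both scripts finish, running jps again should list only the Jps process itself:

jps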
Congratulations! You've successfully completed the installation of Hadoop 3.3.1 on Windows 10.

Let me know if you encounter any issues. Enjoy your latest Hadoop on Windows 10.

Comments
NA VN (12 months ago)

I got this error when running "%HADOOP_HOME%\sbin\start-dfs.cmd":

[screenshot: error message]

- My hdfs-site.xml: [screenshot]
- My folder: [screenshot]
- Java version: [screenshot]
- Hadoop version: 3.3.1
- My environment: [screenshot]


I have tried "hdfs namenode -format", but it still does not work.

Raymond (12 months ago)

I also suggest using WSL if you find it difficult to install natively on Windows: Install Hadoop 3.3.2 in WSL on Windows (kontext.tech)

NA VN (12 months ago)

I deleted everything and then read your article carefully. I installed Hadoop successfully.

Thanks so much for the great tutorial.

For those who can't install it, the hardest step is setting the environment variables. You should point directly to the bin folder. Hope this helps!

Raymond (12 months ago)

I'm glad it is now working for you. 

Raymond (12 months ago)

Can you delete those two sub-folders and try formatting again?

Super Gamer (3 years ago)

How do I use it with Hadoop?

Raymond (3 years ago)

Can you please be more specific?

Once you've configured it, you can just use it as a Hadoop instance. For example, you can use the HDFS CLI to interact with it.

Arya Sanjaya (3 years ago)

Hi,

I managed to get a 64-bit winutils.exe and used that file instead of the one you provided.

The resource manager is up but HDFS is not. Please advise on what I did wrong? Thanks.


STARTUP_MSG: java = 1.8.0_321
************************************************************/
2022-02-08 13:56:05,932 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2022-02-08 13:56:06,424 INFO checker.ThrottledAsyncChecker: Scheduling a check for [DISK]file:/C:/hadoop-3.3.1/data/dfs/datanode331
2022-02-08 13:56:06,562 WARN checker.StorageLocationChecker: Exception checking StorageLocation [DISK]file:/C:/hadoop-3.3.1/data/dfs/datanode331
java.lang.UnsatisfiedLinkError: org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Ljava/lang/String;I)Z
	at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
	at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:793)
	at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1215)
	at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:160)
	at org.apache.hadoop.util.DiskChecker.checkDirInternal(DiskChecker.java:142)
	at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:116)
	at org.apache.hadoop.hdfs.server.datanode.StorageLocation.check(StorageLocation.java:239)
	at org.apache.hadoop.hdfs.server.datanode.StorageLocation.check(StorageLocation.java:52)
	at org.apache.hadoop.hdfs.server.datanode.checker.ThrottledAsyncChecker$1.call(ThrottledAsyncChecker.java:142)
	at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125)
	at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:69)
	at org.apache.hadoop.thirdparty.com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
2022-02-08 13:56:06,564 ERROR datanode.DataNode: Exception in secureMain
org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 0, volumes configured: 1, volumes failed: 1, volume failures tolerated: 0
	at org.apache.hadoop.hdfs.server.datanode.checker.StorageLocationChecker.check(StorageLocationChecker.java:233)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2841)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2754)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2798)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2942)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2966)
2022-02-08 13:56:06,568 INFO util.ExitUtil: Exiting with status 1: org.apache.hadoop.util.DiskChecker$DiskErrorException: Too many failed volumes - current valid volumes: 0, volumes configured: 1, volumes failed: 1, volume failures tolerated: 0
2022-02-08 13:56:06,571 INFO datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at ARYA-DUGI-LAPTOP/192.168.1.18
************************************************************/

C:\hadoop-3.3.1\sbin>jps
2244 Jps
26232 ResourceManager

C:\hadoop-3.3.1\sbin>

Raymond (3 years ago)

It failed because HDFS is not working, probably due to the same error I mentioned earlier. Unfortunately, I cannot help you much as I don't have a Windows 11 system to test with (my laptop CPU unfortunately is not supported).

Arya Sanjaya (3 years ago)

Hi, I am having issues with winutils.exe. Please advise.

ARYA (arya.sanjaya@pradita.ac.id)

The command returned:

Program 'winutils.exe.exe' failed to run: The specified executable is not a valid application for this OS platform.
At line:1 char:1
+ winutils.exe
+ CategoryInfo : ResourceUnavailable: (:) [], ApplicationFailedException
+ FullyQualifiedErrorId : NativeCommandFailed


System info:

OS Name: Microsoft Windows 11 Pro
OS Version: 10.0.22000 N/A Build 22000
OS Manufacturer: Microsoft Corporation
OS Configuration: Standalone Workstation
OS Build Type: Multiprocessor Free
Registered Owner: arya.sanjaya@outlook.com
Registered Organization: N/A
Product ID: 00330-52275-85811-AAOEM
Original Install Date: 11-Nov-21, 1:22:08 AM
System Boot Time: 04-Feb-22, 1:54:58 PM
System Manufacturer: Acer
System Model: TravelMate P2410-G2-M
System Type: x64-based PC
Processor(s): 1 Processor(s) Installed.
  [01]: Intel64 Family 6 Model 142 Stepping 10 GenuineIntel ~1600 Mhz
BIOS Version: Insyde Corp. V3.02, 12-Nov-18
Windows Directory: C:\WINDOWS
System Directory: C:\WINDOWS\system32
Boot Device: \Device\HarddiskVolume1
System Locale: en-us;English (United States)
Input Locale: en-us;English (United States)
Time Zone: (UTC+07:00) Bangkok, Hanoi, Jakarta
Total Physical Memory: 16,261 MB
Available Physical Memory: 7,949 MB
Virtual Memory: Max Size: 21,758 MB
Virtual Memory: Available: 8,754 MB
Virtual Memory: In Use: 13,004 MB
Page File Location(s): C:\pagefile.sys
Domain: WORKGROUP
Logon Server: \\ARYA-DUGI-LAPTO
Hotfix(s): 5 Hotfix(s) Installed.
  [01]: KB5008880
  [02]: KB5004567
  [03]: KB5008295
  [04]: KB5009566
  [05]: KB5007414
Network Card(s): 4 NIC(s) Installed.
  [01]: TAP-Windows Adapter V9
       Connection Name: Ethernet 2
       Status: Media disconnected
  [02]: Intel(R) Ethernet Connection I219-LM
       Connection Name: Ethernet
       Status: Media disconnected
  [03]: Intel(R) Dual Band Wireless-AC 7265
       Connection Name: Wi-Fi
       DHCP Enabled: Yes
       DHCP Server: 192.168.1.1
       IP address(es)
       [01]: 192.168.1.18
       [02]: fe80::7c59:ef36:46a1:13b5
  [04]: Bluetooth Device (Personal Area Network)
       Connection Name: Bluetooth Network Connection
       Status: Media disconnected
Hyper-V Requirements: VM Monitor Mode Extensions: Yes
                      Virtualization Enabled In Firmware: Yes
                      Second Level Address Translation: Yes
                      Data Execution Prevention Available: Yes


Raymond (3 years ago)

I have not tried installing this on Windows 11, so I can't provide accurate advice about this one. However, can you try running winutils directly instead of winutils.exe? The path winutils.exe.exe in the error message is not right.

Arya Sanjaya (3 years ago)

I tried to reinstall winutils from GitHub, but I still get an error like this:

Error while running command to get file permissions : java.io.IOException: Cannot run program "C:\hadoop-3.3.1\bin\winutils.exe": CreateProcess error=216, This version of %1 is not compatible with the version of Windows you're running. Check your computer's system information and then contact the software publisher.
Raymond (3 years ago)

Looks like there is a compatibility issue, since the native libs were built for Windows 10. As I am not using Windows 11, I cannot really debug this issue for you until I upgrade my system.

Can you try the following to see if it works?

Right-click the winutils.exe program and click Properties. Go to the Compatibility tab and set Compatibility mode to Windows 10.

Arya Sanjaya (3 years ago)

My datanode and namenode always shut down after running start-dfs.cmd and start-yarn.cmd. Any advice on how to resolve this?
Thanks


Raymond (3 years ago)

That usually means the installation was not successful and you will need to look into details to find out the actual error. Did you successfully complete all the steps before starting DFS and YARN?

Rakesh Sharma (3 years ago)

Everything worked except the jps command.

Raymond (3 years ago)

The jps command is located in the bin sub-folder of the JDK home folder (JAVA_HOME). If it doesn't work, it usually means your PATH environment variable doesn't include %JAVA_HOME%\bin.


White Portal (3 years ago)

😄 Everything worked. 👍👍👍

Thank you very much! 😁


Raymond (3 years ago)

You are welcome. I'm glad that everything now works for you. Have fun learning big data.

White Portal (3 years ago)

Now when opening Eclipse IDE 2021-12, this happens:

Would I have to use https://eclipse.en.uptodown.com/windows/download/2065150 ? 🤔

Raymond (3 years ago)

You have several options:

  1. Download an earlier Eclipse version that supports JDK 1.8.
  2. Or replace JDK 1.8 with JDK 11, since it is supported by Hadoop 3.3.1 (i.e. change JAVA_HOME to the JDK 11 installation folder). This way, your JAVA_HOME will work for both Hadoop and Eclipse.
  3. Or install JDK 11 additionally but keep JAVA_HOME pointing at the 1.8 version, and then change the Eclipse config file (eclipse.ini in your Eclipse installation folder, if I remember correctly) to use JDK 11 directly.

White Portal (3 years ago)

It worked out 😄:

But mine is like this:

And yours is like this:

What is it? 🤔

Raymond (3 years ago)

Congratulations! If you click the About link, you should see a screenshot similar to mine. Feel free to navigate around the resource manager portal.

White Portal (3 years ago)

It worked, but it's different from yours:


Here too:



Raymond (3 years ago)

It was not successful, as no node manager started successfully. Can you start the YARN services using an Administrator Command Prompt (Run as administrator)? I believe that will likely resolve the permission error in your node manager.

White Portal (3 years ago)

Now this problem 😥:


Raymond (3 years ago)

Hi White,

You are getting closer. For this issue, can you please manually add the required paths to your user PATH environment variable? You can remove the redundant entries that already exist in your system PATH environment variable.

I provided that PowerShell script to help people easily add those paths to PATH, but for some users like you, errors like this will occur if the system PATH environment variable is very long.

White Portal (3 years ago)

Is this version OK?

With this link of Hadoop:

https://dlcdn.apache.org/hadoop/common/hadoop-3.3.1/hadoop-3.3.1.tar.gz

Raymond (3 years ago)

Yes, JDK 1.8 is OK.

White Portal (3 years ago)

When I run:

setx PATH "$env:PATH;$env:JAVA_HOME/bin;$env:HADOO_HOME/bin"

It stays like this:


Raymond (3 years ago)

Hello White, I think you missed the character P in your Hadoop home environment variable; it should be:

$env:HADOOP_HOME

White Portal (3 years ago)

When I go to http://localhost:8088, it gives a problem.



Raymond (3 years ago)

Hi White, 

You cannot open HDFS portal because the services were not started successfully. The video doesn't show full details of your error message. Can you please paste the detailed error messages?

Also, can you check whether you can run the following command successfully in Command Prompt?

%HADOOP_HOME%\bin\winutils.exe

White Portal (3 years ago)

I can run HDFS but I can't run YARN. When running "%HADOOP_HOME%\sbin\start-yarn.cmd":

1st window:

https://www.mediafire.com/file/rb5kv7b81hkhxff/1%25C2%25AA_Window.txt/file

2nd window:

https://www.mediafire.com/file/vgnulsp4rrnviie/2%25C2%25AA_Window.txt/file

When I run "%HADOOP_HOME%\bin\winutils.exe":

https://www.mediafire.com/file/69hc7fxs44o2m2n/Command.txt/file

And now?

Raymond (3 years ago)

Your nodemanager failed at the step of adding filter authentication:

http.HttpServer2: Added filter authentication (class=org.apache.hadoop.security.authentication.server.AuthenticationFilter) to context logs
webapp.WebApps: Registered webapp guice modules

Can you confirm whether you've followed the exact steps in this article? Also, what is your JDK version? Can you try starting the YARN services in an Administrator Command Prompt (Run as administrator)?

From the logs, I can see the commands started with the argument 192.168.56.1. This is also suspicious to me.

White Portal (3 years ago)

I use jdk-17.0.1. And yes, I ran it as administrator. And I followed the tutorial exactly. What do I do now?

Raymond (3 years ago)

That is the root cause: Hadoop 3.3.1 only supports Java 8 and 11 (Java 11 for runtime only, not compiling). I mentioned this in the prerequisites section. I would suggest installing Java 8:

Hadoop Java Versions - Hadoop - Apache Software Foundation
