Hadoop on Windows - UNHEALTHY Data Nodes Fix
Solution to fix the issue
If you have been running Hadoop on Windows machines, you may encounter issues with unhealthy data nodes.
Usually this happens when there is not enough disk space on your local drive.
For example, if I start the HDFS and YARN daemons from the C drive, the local temporary folders will be created on the C drive.
C:\>%HADOOP_HOME%\sbin\start-dfs.cmd
C:\>%HADOOP_HOME%\sbin\start-yarn.cmd
By default, YARN checks the disk utilization of each local directory, and the default threshold is 90%. If your C drive has less than 10% free space (which was my case), YARN will report unhealthy-node errors:
local-dirs have errors: [ /tmp/hadoop-fahao/nm-local-dir : Directory is not writable: \tmp\hadoop-fahao\nm-local-dir ]
*Your user name can be different from mine.
To fix this problem, you can change the YARN configuration to skip the disk check or raise the utilization threshold to 99%; alternatively, you can free up some disk space.
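As a sketch, raising the threshold can be done in yarn-site.xml via the disk health checker property (the 99.0 value below is only an example; tune it for your disks, and restart the NodeManager afterwards):

```xml
<!-- yarn-site.xml: raise the per-disk utilization threshold (default is 90.0) -->
<property>
  <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
  <value>99.0</value>
</property>
```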
However, in my scenario the cause was different: my Hadoop cluster is configured on the F drive, which has plenty of space. So if I start these daemons from the F drive, the issue is gone.
C:\WINDOWS\system32>cd /D F:
F:\>%HADOOP_HOME%\sbin\start-dfs.cmd
F:\>%HADOOP_HOME%\sbin\start-yarn.cmd
starting yarn daemons
I can also confirm that the temporary directories are now created on the F drive.
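A more robust alternative, sketched below, is to pin the temporary directories explicitly so that the drive you start the daemons from no longer matters. The F:/tmp paths here are assumptions matching my layout; substitute your own:

```xml
<!-- core-site.xml: put Hadoop's temporary data on the F drive (example path) -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/F:/tmp/hadoop</value>
</property>

<!-- yarn-site.xml: point the NodeManager local dirs at the F drive (example path) -->
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/F:/tmp/hadoop/nm-local-dir</value>
</property>
```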
For UNIX/Linux systems
Of course, you may encounter similar issues on UNIX/Linux systems. To fix them, ensure the disk holding your local temporary folder has enough free space, i.e. its usage ratio is below the YARN-configured threshold.
The default temporary folder is: /tmp/hadoop-{hdusername}/nm-local-dir.
*Replace {hdusername} with your user name.
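To check whether a directory's disk would pass YARN's default health check, here is a minimal cross-platform sketch using Python's standard library (the path and the 90.0 threshold mirror the defaults discussed above; adjust both for your setup):

```python
import shutil

# Directory whose disk YARN uses for nm-local-dir; adjust to your setup.
path = "/tmp"

# shutil.disk_usage returns total, used, and free bytes for the filesystem.
usage = shutil.disk_usage(path)
used_ratio = 100.0 * (usage.total - usage.free) / usage.total

# YARN's default max-disk-utilization-per-disk-percentage is 90.0.
threshold = 90.0
print(f"{path}: {used_ratio:.1f}% used; healthy for YARN: {used_ratio < threshold}")
```

If the script reports the usage ratio above the threshold, either free up space or raise the threshold in yarn-site.xml as described earlier.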