
Issue

After finishing the installation of Hadoop 3.0.0 on my Windows machine (following Install Hadoop 3.0.0 in Windows (Single Node)), I ran into a problem after formatting the name node several times.

The following error is thrown when I try to start Hadoop HDFS.

2018-02-19 22:02:06,848 WARN common.Storage: Failed to add storage directory [DISK]file:/F:/DataAnalytics/dfs/data
java.io.IOException: Incompatible clusterIDs in F:\DataAnalytics\dfs\data: namenode clusterID = CID-ef46d03c-27ff-45f3-88ae-a2b6f207b001; datanode clusterID = CID-242a04f8-6b63-4325-b56e-14b608af786c
         at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:722)
         at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadStorageDirectory(DataStorage.java:286)
         at org.apache.hadoop.hdfs.server.datanode.DataStorage.loadDataStorage(DataStorage.java:399)
         at org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:379)
         at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:544)
         at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1690)
         at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1650)
         at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:376)
         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
         at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
         at java.lang.Thread.run(Thread.java:748)
2018-02-19 22:02:06,851 ERROR datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid 94e366db-2ed5-40a2-bc36-7335bfcdec05) service to /0.0.0.0:19000. Exiting.
java.io.IOException: All specified directories have failed to load.

Root cause

This issue happens because the cluster IDs in the Hadoop name node and data node VERSION files are not consistent.
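To confirm the mismatch, compare the clusterID lines in the two VERSION files. The paths below are assumptions based on the error message above (data node storage under F:\DataAnalytics\dfs\data) and a typical name node directory next to it; check dfs.namenode.name.dir and dfs.datanode.data.dir in your hdfs-site.xml for the actual locations.

findstr clusterID F:\DataAnalytics\dfs\name\current\VERSION
findstr clusterID F:\DataAnalytics\dfs\data\current\VERSION

If the two commands print different CID values, the data node will refuse to join the newly formatted name node.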


The root cause: I formatted the name node using the following command without first deleting the files in the data node directory:

hadoop namenode -format

Resolution

Before formatting the name node, ensure the files under the <dfs.data.dir>/ directories are deleted on all data nodes.
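As a sketch, on a single-node Windows setup like the one above the recovery looks roughly like this. The data directory path is taken from the error message; replace it (and HADOOP_HOME) with the values for your environment.

:: Stop the HDFS daemons first
%HADOOP_HOME%\sbin\stop-dfs.cmd

:: Delete the stale data node storage so the old clusterID is removed
rmdir /S /Q F:\DataAnalytics\dfs\data

:: Format the name node (this generates a new clusterID)
hadoop namenode -format

:: Start HDFS again; the data node will register with the new clusterID
%HADOOP_HOME%\sbin\start-dfs.cmd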
