Invoke Hadoop WebHDFS APIs in .NET Core
Background
Apache doesn't provide an official native .NET API for Hadoop HDFS. However, the WebHDFS HTTP REST API supports the complete FileSystem/FileContext interface for HDFS.
Thus, we can use these web APIs to perform HDFS operations from other programming languages such as C#.
WebHDFS APIs reference
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/WebHDFS.html
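Per that reference, every WebHDFS call is an HTTP request to a URL of the form http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=<OPERATION>. The following is a minimal sketch of how such a URL could be assembled in C#. The WebHdfsUrl class and its Build method are hypothetical helpers written for this article, not part of any library, and the file name in the second call is made up to show escaping.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not part of any library): builds a WebHDFS URL of the
// form http://<host>:<port>/webhdfs/v1/<path>?op=<operation>.
public static class WebHdfsUrl
{
    public static string Build(string host, int port, string hdfsPath, string operation)
    {
        // Escape each path segment separately so the '/' separators survive.
        var segments = hdfsPath
            .Split('/', StringSplitOptions.RemoveEmptyEntries)
            .Select(Uri.EscapeDataString);
        var path = string.Join("/", segments);
        return $"http://{host}:{port}/webhdfs/v1/{path}?op={operation}";
    }
}

public class Program
{
    public static void Main()
    {
        // Root directory listing.
        Console.WriteLine(WebHdfsUrl.Build("127.0.0.1", 9870, "/", "LISTSTATUS"));
        // Made-up file name containing a space, to show the escaping.
        Console.WriteLine(WebHdfsUrl.Build("127.0.0.1", 9870, "/data/Sales 2020.csv", "OPEN"));
    }
}
```

Escaping per segment keeps the slashes intact while still encoding spaces and other reserved characters in file names.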
Examples
List files
The following code snippet retrieves the file list of the root directory on my local Hadoop node:
using System;
using System.IO;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        WebHdfsListStatusApi();
        Console.ReadLine();
    }

    static void WebHdfsListStatusApi()
    {
        var protocol = "http";
        var host = "127.0.0.1";
        var port = 9870; // default NameNode HTTP port in Hadoop 3.x
        var hdfsFilePath = ""; // empty = HDFS root; HDFS paths use forward slashes
        var operation = "LISTSTATUS";
        var url = $"{protocol}://{host}:{port}/webhdfs/v1/{hdfsFilePath}?op={operation}";
        var request = (HttpWebRequest)WebRequest.Create(url);
        // Dispose the response and the reader so the connection is released.
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            var result = reader.ReadToEnd(); // JSON response body
            Console.WriteLine(result);
        }
    }
}
The output looks like the following screenshot:
The following is the output in Postman:
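The LISTSTATUS response is a JSON document with a FileStatuses object that wraps a FileStatus array, as described in the WebHDFS reference. As a rough sketch, the entries could be pulled out with System.Text.Json (built into .NET Core 3.0 and later). The ListStatusParser class is a hypothetical helper, and the JSON string below is a hand-written sample in the documented shape, not output captured from a real cluster.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: extracts name, type and length from a LISTSTATUS body.
public static class ListStatusParser
{
    public static List<(string Name, string Type, long Length)> Parse(string json)
    {
        var entries = new List<(string Name, string Type, long Length)>();
        using (var doc = JsonDocument.Parse(json))
        {
            // Documented shape: {"FileStatuses":{"FileStatus":[ {...}, ... ]}}
            var statuses = doc.RootElement
                .GetProperty("FileStatuses")
                .GetProperty("FileStatus");
            foreach (var item in statuses.EnumerateArray())
            {
                entries.Add((
                    item.GetProperty("pathSuffix").GetString(),
                    item.GetProperty("type").GetString(),
                    item.GetProperty("length").GetInt64()));
            }
        }
        return entries;
    }
}

public class Program
{
    public static void Main()
    {
        // Hand-written sample for illustration only; real responses carry more fields.
        var sample = "{\"FileStatuses\":{\"FileStatus\":[" +
            "{\"pathSuffix\":\"Sales.csv\",\"type\":\"FILE\",\"length\":1024}," +
            "{\"pathSuffix\":\"tmp\",\"type\":\"DIRECTORY\",\"length\":0}]}}";
        foreach (var (name, type, length) in ListStatusParser.Parse(sample))
        {
            Console.WriteLine($"{type}\t{length}\t{name}");
        }
    }
}
```

In a real program, the string passed to Parse would be the response body read from the LISTSTATUS request instead of the sample.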
Get file content
Similarly, you can read the content of a file with the OPEN operation:
using System;
using System.IO;
using System.Net;

class Program
{
    static void Main(string[] args)
    {
        WebHdfsGetFileContent();
        Console.ReadLine();
    }

    static void WebHdfsGetFileContent()
    {
        var protocol = "http";
        var host = "127.0.0.1";
        var port = 9870; // default NameNode HTTP port in Hadoop 3.x
        var hdfsFilePath = "Sales.csv"; // relative to the HDFS root; use forward slashes
        var operation = "OPEN";
        var url = $"{protocol}://{host}:{port}/webhdfs/v1/{hdfsFilePath}?op={operation}";
        var request = (HttpWebRequest)WebRequest.Create(url);
        // Dispose the response and the reader so the connection is released.
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            var result = reader.ReadToEnd(); // file content as text
            Console.WriteLine(result);
        }
    }
}
The following screenshot is the sample output:
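WebRequest is marked obsolete in .NET 6 and later, so on newer runtimes the same OPEN call could be written with HttpClient instead. The following is only a sketch under the same assumptions as the example above (local node at 127.0.0.1:9870 and a Sales.csv file in the HDFS root). The WebHdfsReader class and ReadFileAsync method are illustrative names, not an existing API.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class WebHdfsReader
{
    // A single shared HttpClient instance is reused for all requests.
    private static readonly HttpClient Client = new HttpClient();

    // Hypothetical helper: issues op=OPEN and returns the file content as text.
    public static async Task<string> ReadFileAsync(string host, int port, string hdfsFilePath)
    {
        var url = $"http://{host}:{port}/webhdfs/v1/{hdfsFilePath}?op=OPEN";
        // By default HttpClient follows the redirect that the NameNode
        // issues to the DataNode holding the file data.
        using (var response = await Client.GetAsync(url))
        {
            response.EnsureSuccessStatusCode(); // throws HttpRequestException on 4xx/5xx
            return await response.Content.ReadAsStringAsync();
        }
    }
}

public class Program
{
    public static async Task Main()
    {
        try
        {
            Console.WriteLine(await WebHdfsReader.ReadFileAsync("127.0.0.1", 9870, "Sales.csv"));
        }
        catch (HttpRequestException ex)
        {
            // Raised when the node is unreachable or the request is rejected.
            Console.WriteLine($"Request failed: {ex.Message}");
        }
    }
}
```

Note that the redirect points at a DataNode host name, so the machine running this code must be able to resolve and reach that host as well.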