
Use Hadoop File System Task in SSIS to Write File into HDFS

1420 views last modified about 2 years ago Raymond Tang

SSIS hadoop hdfs


SQL Server Integration Services (SSIS) provides tasks to perform operations against Hadoop, for example:

  • Hadoop File System Task
  • Hadoop Hive Task
  • Hadoop Pig Task

In a Data Flow Task, you can also use:

  • Hadoop HDFS Source
  • Hadoop HDFS Destination

On this page, I’m going to demonstrate how to write a file into HDFS using the SSIS Hadoop File System Task.




Refer to the following page to install Hadoop if you don’t have an instance to play with.

Install Hadoop 3.0.0 in Windows (Single Node)


SSIS can be installed via SQL Server Data Tools (SSDT). In this example, I am using version 15.1.

Create Hadoop connection manager

In your SSIS package, create a Hadoop Connection Manager:


In the WebHDFS tab of the editor, specify the following details:


Click the Test Connection button to ensure you can connect, then click OK.
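
Because the connection manager is configured through its WebHDFS tab, a similar connectivity check can be reproduced outside SSIS against the WebHDFS REST API. The following is only a minimal Python sketch, not part of the SSIS package. The host localhost, the port 9870 (the default name node HTTP port in Hadoop 3) and the user hadoop are assumptions, and webhdfs_url and can_connect are hypothetical helper names.

```python
import json
import urllib.request
from urllib.parse import urlencode


def webhdfs_url(host, port, path, op, user):
    """Build a WebHDFS REST URL for an HDFS path and operation."""
    query = urlencode({"op": op, "user.name": user})
    return f"http://{host}:{port}/webhdfs/v1{path}?{query}"


def can_connect(host, port, user, timeout=5):
    """Return True if the name node answers a LISTSTATUS request on '/'."""
    url = webhdfs_url(host, port, "/", "LISTSTATUS", user)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return "FileStatuses" in json.load(resp)
    except (OSError, ValueError):
        # Network failure, HTTP error, or a body that is not JSON.
        return False


# Assumed values: replace with your own name node host, port and user.
print(webhdfs_url("localhost", 9870, "/", "LISTSTATUS", "hadoop"))
```

Calling can_connect("localhost", 9870, "hadoop") returns False rather than raising when the name node cannot be reached, which makes it convenient for a quick pre-flight check.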


Create a file connection manager

Create a local CSV file

Create a local CSV file named F:\DataAnalytics\Sales.csv with the following content:


Create a file connection manager

Create a file connection manager Sales.csv which points to the file created above.


Create Hadoop File System Task

Use the two connection managers created above to create a Hadoop File System Task:


With the above settings, the task uploads Sales.csv to /Sales.csv in HDFS.
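
For readers who want to see what such an upload looks like at the protocol level, the WebHDFS REST API creates a file in two steps: a PUT to the name node with op=CREATE, which answers with a 307 redirect, and a second PUT that sends the file bytes to the data node named in the Location header. The sketch below assumes the SSIS task behaves along these lines, which is not confirmed here. The function names, the overwrite setting and the timeout are hypothetical choices.

```python
import http.client
from urllib.parse import quote, urlencode, urlsplit


def create_request_target(hdfs_path, user, overwrite=True):
    """Build the path and query string for the WebHDFS CREATE operation."""
    query = urlencode({
        "op": "CREATE",
        "user.name": user,
        "overwrite": "true" if overwrite else "false",
    })
    return f"/webhdfs/v1{quote(hdfs_path)}?{query}"


def upload(local_path, host, port, hdfs_path, user):
    """Upload a local file with the two-step WebHDFS CREATE flow."""
    # Step 1: ask the name node where to write. No file data is sent yet.
    conn = http.client.HTTPConnection(host, port, timeout=30)
    conn.request("PUT", create_request_target(hdfs_path, user))
    resp = conn.getresponse()
    resp.read()
    conn.close()
    if resp.status != 307:
        raise RuntimeError(f"expected a 307 redirect, got {resp.status}")
    target = urlsplit(resp.getheader("Location"))

    # Step 2: send the file bytes to the data node from the Location header.
    with open(local_path, "rb") as f:
        data = f.read()
    conn = http.client.HTTPConnection(target.hostname, target.port, timeout=30)
    conn.request("PUT", f"{target.path}?{target.query}", body=data)
    resp = conn.getresponse()
    resp.read()
    conn.close()
    return resp.status  # WebHDFS documents 201 Created on success
```

A call such as upload(r"F:\DataAnalytics\Sales.csv", "localhost", 9870, "/Sales.csv", "hadoop") would mirror the task settings above, with the host, port and user being assumed values.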

Run the package

Run the package or execute the task to make sure it completes successfully:


Verify the result via HDFS CLI

Use the following command to list the HDFS root directory and check that the file was uploaded successfully:

hdfs dfs -ls /
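
The same check can be scripted against WebHDFS, whose LISTSTATUS operation returns a JSON document with a FileStatuses object containing a FileStatus array. The sketch below only parses such a response; the sample body is a hypothetical, trimmed example (the length value is made up), and list_entries is a hypothetical helper name.

```python
import json


def list_entries(liststatus_json):
    """Return (name, type, length) for each entry in a LISTSTATUS response."""
    statuses = json.loads(liststatus_json)["FileStatuses"]["FileStatus"]
    return [(s["pathSuffix"], s["type"], s["length"]) for s in statuses]


# Hypothetical response body, trimmed to the fields used here.
sample = (
    '{"FileStatuses": {"FileStatus": ['
    '{"pathSuffix": "Sales.csv", "type": "FILE", "length": 120}]}}'
)

entries = list_entries(sample)
uploaded = any(name == "Sales.csv" and kind == "FILE" for name, kind, _ in entries)
print(uploaded)
```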


You can also print out the content via the following command:

hdfs dfs -cat /Sales.csv


Verify the result through Name Node web UI



WebHDFS REST API reference


It is very easy to upload files into HDFS through SSIS. You can also upload a whole directory into HDFS through this task if you change the file connection manager to point to a folder.
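
To reason about where files end up when a folder is uploaded, it can help to list the expected target paths first. The sketch below assumes the folder is mirrored recursively under the HDFS destination, which may differ from how the task is actually configured. plan_directory_upload is a hypothetical helper, and the folder layout and the /data destination are made up for the demonstration.

```python
import os
import posixpath
import tempfile


def plan_directory_upload(local_dir, hdfs_dir):
    """Map each local file under local_dir to its target HDFS path."""
    plan = []
    for root, _dirs, files in os.walk(local_dir):
        rel = os.path.relpath(root, local_dir)
        # HDFS paths always use forward slashes, whatever the local OS uses.
        parts = [] if rel == os.curdir else rel.split(os.sep)
        for name in sorted(files):
            local_file = os.path.join(root, name)
            plan.append((local_file, posixpath.join(hdfs_dir, *parts, name)))
    return plan


# Demonstration on a throwaway folder with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "2018"))
    for rel_path in ("Sales.csv", os.path.join("2018", "Sales.csv")):
        with open(os.path.join(tmp, rel_path), "w") as f:
            f.write("placeholder\n")
    targets = sorted(t for _local, t in plan_directory_upload(tmp, "/data"))
    print(targets)
```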

If you have any questions, please let me know.
