Teradata FastLoad with Optional Quotes

Raymond Raymond event 2020-09-13 visibility 3,394
more_vert

In article Teradata FastLoad - Load CSV File, it shows how to load CSV into Teradata. The input file is very basic CSV file. This article expands on that to provide examples about loading CSV files with optional quoted fields to skip delimiters. It also shows how to load files with multi-character quotes. 

Example CSV file

Create a sample CSV file named test_fastload3.csv with the following content:

ID, Attr, Value, CreatedDate, Comments
1,A,10,2020-09-13,"This is a comment with comma,"
2,B,20,2020-09-13,This is a comment without comma.
3,C,30,2020-09-13,This is a comment without comma.
4,D,40,2020-09-13,This is a comment without comma.

Column 'Comments' in the first record is double quoted as there is a comma in the content. 

Optional double quoted fields

FastLoad command SET RECORD has options available to specify quote characters. It also provides options about mandatory and optional quotes. 

Create a FastLoad script file named load-csv3.fastload. The content looks like the following:

LOGON 192.168.119.128/DBC,DBC;
DATABASE TestDb;
DROP TABLE FASTLOAD_CSV3;
DROP TABLE FASTLOAD_CSV3_ERR_1;
DROP TABLE FASTLOAD_CSV3_ERR_2;
CREATE SET TABLE FASTLOAD_CSV3
(
ID VARCHAR(10) NOT NULL,
Col_Attr VARCHAR(1),
Col_Value VARCHAR(10),
CreatedDate VARCHAR(10),
Comments VARCHAR(255)
)
UNIQUE PRIMARY INDEX(ID);

BEGIN LOADING FASTLOAD_CSV3 ERRORFILES FASTLOAD_CSV3_ERR_1, FASTLOAD_CSV3_ERR_2;
	SET RECORD VARTEXT DELIMITER ',' QUOTE OPTIONAL '"';
	RECORD 2;
	DEFINE	ID (VARCHAR(10)), 
			Col_Attr (VARCHAR(1)), 
			Col_Value(VARCHAR(10)), 
			CreatedDate (VARCHAR(10)),
			Comments (VARCHAR(255))
			FILE = test_fastload3.csv;
   
	INSERT INTO FASTLOAD_CSV3 (ID, Col_Attr, Col_Value, CreatedDate, Comments) 
		VALUES (:ID, :Col_Attr, :Col_Value, :CreatedDate, :Comments) ; 
END LOADING;
LOGOFF ;

SET RECORD command is set as this:

SET RECORD VARTEXT DELIMITER ',' QUOTE OPTIONAL '"';

Quote character is specified as double quotes (") and it is optional. If all fields are all quoted, the value can be specified as YES:

SET RECORD VARTEXT DELIMITER ',' QUOTE YES '"';

Run FastLoad script

Command fastload can be used to run the script file:

fastload < load-csv3.fastload

Output:

...
0008    SET RECORD VARTEXT DELIMITER ',' QUOTE OPTIONAL '"';

**** 23:30:14 Now set to read 'Variable-Length Text' records
**** 23:30:14 Delimiter character(s) is set to ','
**** 23:30:14 Field values are optionally quoted with '"'
**** 23:30:14 Command completed successfully

**** 23:06:55 Starting record number set to  : 2
...

**** 23:30:14 END LOADING COMPLETE

     Total Records Read              =  5
      - skipped by RECORD command    =  1
      - sent to the RDBMS            =  4
     Total Error Table 1             =  0  ---- Table has been dropped
     Total Error Table 2             =  0  ---- Table has been dropped
     Total Inserts Applied           =  4
     Total Duplicate Rows            =  0

     Start:   Sun Sep 13 23:30:14 2020
     End  :   Sun Sep 13 23:30:14 2020

Verify the data

Run the following SQL statement to verify the result:

SELECT * FROM TestDb.FASTLOAD_CSV3;

The result set looks like the following screenshot:

20200913133149-image.png

Multi-characters quotes

FastLoad can also handle fields are quoted with multiple character. For example, column 'Comments' in the first record is quoted with string '"|"'.

ID, Attr, Value, CreatedDate, Comments
1,A,10,2020-09-13,"|"This is a comment with comma,"|"
2,B,20,2020-09-13,This is a comment without comma.
3,C,30,2020-09-13,This is a comment without comma.
4,D,40,2020-09-13,This is a comment without comma.

To load this file, SET RECORD command can be changed to the following:

SET RECORD VARTEXT DELIMITER ',' QUOTE OPTIONAL '"|"';

Run the FastLoad script and it will parse the command differently:

0008    SET RECORD VARTEXT DELIMITER ',' QUOTE OPTIONAL '"|"';

**** 23:35:19 Now set to read 'Variable-Length Text' records
**** 23:35:19 Delimiter character(s) is set to ','
**** 23:35:19 Field values are optionally quoted with '"|"'
**** 23:35:19 Command completed successfully

Different characters for opening and closing quotes

FastLoad can also load delimited files with different opening and closing quote characters.

For example the following file is optionally quoted by '[...]'.

ID, Attr, Value, CreatedDate, Comments
1,A,10,2020-09-13,[This is a comment with comma,]
2,B,20,2020-09-13,This is a comment without comma.
3,C,30,2020-09-13,This is a comment without comma.
4,D,40,2020-09-13,This is a comment without comma.

Change SET RECORD command to the following:

SET RECORD VARTEXT DELIMITER ',' QUOTE OPTIONAL '[' ']';

The output for SET RECORD command looks like the following lines:

0008    SET RECORD VARTEXT DELIMITER ',' QUOTE OPTIONAL '[' ']';

**** 23:40:23 Now set to read 'Variable-Length Text' records
**** 23:40:23 Delimiter character(s) is set to ','
**** 23:40:23 Field values are optionally quoted with
              opening quote '[' and closing quote ']'
**** 23:40:23 Command completed successfully

References

Teradata FastLoad Reference

More from Kontext
comment Comments
No comments yet.

Please log in or register to comment.

account_circle Log in person_add Register

Log in with external accounts