Import a CSV file into SQL Server with SSIS

Step by step tutorial to import a csv flat file into a SQL Server table with SSIS.

It is possible to import a CSV file into a database with the SQL Server BI suite and SSIS in particular. Often, data integration projects require to load data from flat files or external databases. Files structures are either CSV, TXT, RAW files or other less common formats such as fixed width files. This SSIS tutorial explains how to load a CSV file into a SQL Server database using Integration Services. The loading is done in two parts consisting of several steps.

The first part consists of setting up the connection to the text file, i.e. pointing to its location on the disk. The second part is to create and configure the destination table and the connection to the target database to integrate the rows contained in the file.

1. Set up the connection to the CSV file, the target table and the data flow

Of course, the prerequisites must be met first, i.e. preparing the source file, connect to a SQL Server database with SSMS and install Visual Studio and the SSIS projects extension for the package development.

1.1 Prepare the source file and the target table

Start by downloading the file used in the example and creating the target table of customers. The Customers_Data.csv file contains 49 rows, consisting of one header row and 48 rows of automatically generated data.

The purpose of this package is to import the 48 rows from the text file into the SQL Server customers table without errors. The creation code of the table is available here. Execute the code in an exercise or development database with SSMS.

-- Create the customers table
CREATE TABLE [dbo].[Customers](
   [CustomerID] [int] NOT NULL,
   [FirstName] [nvarchar](20) NULL,
   [LastName] [nvarchar](20) NULL,
   [City] [nvarchar](20) NULL,
   [Country] [nvarchar](50) NULL,
   CONSTRAINT [CustomersPrimaryKeyCustomerID] 
      PRIMARY KEY CLUSTERED ([CustomerID] ASC)
);
GO

1.2 Use an existing package or create a new one

This step is about how to manage the connection to the flat file to be imported into the SQL Server database. This step is done in a SSIS data flow.

  • Open Microsoft Visual studio 2019 and create a new SSIS project or open an existing project.
  • Open an existing package to modify. Alternatively, right click and add a new package to the SSIS project.

1.3 Add the SSIS data flow and the Flat File Source

From the workspace, locate the SSIS toolbox on the left. Select Data flow Task and drag it onto the workspace and double click to open it.

New SSIS data flow to load a text file into a SQL database
New SSIS data flow to load a text file into a SQL database

In the SSIS Toolbox, now select Flat File Source and drag the component onto the workspace. Then double click on it to open it and set it up.

Add the Flat File Source to the SSIS data stream (Flat File Source)
Add the Flat File Source to the SSIS data stream (Flat File Source)

Using SSIS variables to manage file paths

With SSIS, almost all components and properties can be set with variables. It is good practice to use variables to manage paths, connection strings and changing SSIS settings efficiently. Like changing environments from development to test or production for example. This topic will be covered in a dedicated tutorial.

1.4 Create a new connection to the text file

Create and configure a new flat file connection by selecting the following items:

  1. Enter the name of the connection
  2. The path of the file
  3. Select the encoding of the file (code page)
  4. The format, in our case it is delimited
  5. Check the delimiter at the end of the header (header row delimiter)
  6. If the column names are in the first line of the file
Configure a new connection to the CSV file in the SSIS package
Configure a new connection to the CSV file in the SSIS package

1.5 Set up the columns and double check the data

You need to set up the columns and check the data:

  1. Validate or modify the row delimiter
  2. Choose the column delimiter, which is the character at the end of each line of data
  3. Check if the data and in particular the accents or special characters are displayed correctly. For example, lines 29 and 32 have an error because the accents are misinterpreted. The encoding must be changed.
Choose the row and column delimiter and check the data
Choose the row and column delimiter and check the data

1.6 Choose the right encoding for the source file with Notepad++

Change the encoding of the source file until the data is displayed correctly. To find the encoding of the file, simply use Notepad++ and open the file. There are 2 possibilities to check the encoding:

  • From the Encoding menu
  • At the bottom right of the software where information about the file is displayed.
Double check the encoding of a text file easily with Notepad++
Double check the encoding of a text file easily with Notepad++

To load the text file without errors, change the encoding and use 65001 (UTF-8) instead of 1252 (ANSI – Latin I) proposed by default by SSIS.

Change the file encoded to 65001 (UTF-8) on the code page setting
Change the file encoded to 65001 (UTF-8) on the code page setting

1.7 Adjust the type and length of columns in the text file

Once the correct encoding has been selected, click on the Columns tab again and check the data again.

The data and accents are now displayed correctly
The data and accents are now displayed correctly

Now click on the Advanced tab. Next, rename the columns if necessary and assign them a consistent data type. By default, SSIS uses the DT_STR type with a length of 50 characters. Use the DT_WSTR type with the correct length to avoid conversion errors later in the development of the SSIS package. So adapt the type of each column to the length of the fields in our customer table.

  1. CustomerID: int becomes a DT_WSTR(8) as the column will be used in a derived column.
  2. FirstName : nvarchar(20)
  3. LastName : nvarchar(20)
  4. City : nvarchar(20)
  5. Country : nvarchar(50)
Use the DT_WSTR type when connecting to the flat file to facilitate integration

Important note: the names and data types of the columns in the flat file connection must match the names and data types of the columns of the target table in the database, which will make data mapping in SSIS much easier. Now check the columns in the Flat File Source component:

Check that the columns are mapped correctly in the source component
Check that the columns are mapped correctly in the source component

2. Configure the OLE DB destination component and the connection to the database from SSIS

The second part of the text file import into a SQL Server database is the mapping and the connection in SSIS. Indeed, configure now the target database component and load the data with the right format. A format compatible with the data of the flat file.

2.1 Add the Derived Column component and link it to the source

  • In the SSIS toolbox, from the Common part. Drag the Derived Column transformation into the data flow.
  • Then link the Flat File Source component to Derived Column as shown in the image below.
Add the Derived Column component and link it to the Flat File Source
Add the Derived Column component and link it to the Flat File Source

2.2 Add a derived column to convert the customer ID to integer

The CustomerID column is of type integer, so a new one must be created based on the one in the file. To convert it to a new SSIS column of type Integer (DT_I8).

Create a derived column to convert Customer ID from text to integer
Create a derived column to convert Customer ID from text to integer

2.3 Now configure the OLE DB Destination component

To do this, you must first create a connection to the target database from the SSIS connection manager. Set up the data mapping in the Mappings tab to check that the flat file and SQL Server table columns match. Use the Derived Column.CustomerID column and not the original flat file column.

Use the new column for the Customer ID derived column
Use the new column for the Customer ID derived column

Automatic mapping with SSIS

Note: It is possible to automatically map SSIS columns in the next step. Make sure to name all source and target columns the same.

2.4 Run the SSIS package to import the text file and check the table

Once the data flow is fully configured, run the SSIS package to insert the rows into the customer table. The 48 rows are processed and integrated into the database.

Press F5 or click Run to launch the SSIS package
Press F5 or click Run to launch the SSIS package

From SSMS, check the data with these two SQL queries. The table has 48 rows, identical to the contents of the file.

SELECT COUNT(*) AS [NumberOfCustomers]
FROM   [dbo].[Customers];

SELECT *  
FROM   [dbo].[Customers];
Execute the 2 selection queries to count and display the rows
Execute the 2 selection queries to count and display the rows

Finally, run a SELECT query to check the data loaded into the database. This example allows you to easily import a text file into a SQL Server database with SSIS. It is also possible to load several flat files into a database with an SSIS loop (For Each File component).

3. Avoid text file import errors with Integration Services

A common error is the SSIS code 0x80019002 which indicates that the maximum number of errors allowed has been reached. It is indeed possible to set the number of errors tolerated via the MaximumErrorCount parameter. However, this is not recommended except in the case of automatic management of errors and re-loads. Three key elements to check to successfully integrate files and avoid classic errors:

  1. Column separator
  2. Line separator
  3. Encoding of the CSV file

So, CSV and flat files integration in general with SSIS and Visual Studio is powerful. However, errors are common, unless you prepare and check the data types in both the source file and the target table beforehand. To take it a step further, it is possible to load an Excel file into a table with SSIS in the same way. The previous tutorial is bout how to create a simple data flow to load a table into another and enrich the data.

Be the first to comment

Leave a Reply

Your email address will not be published.


*