How to import CSV files into a SQL Server table with SSIS ? Step by step tutorial to load flat files into an MS database.
It is possible to import a CSV file into a SQL Server table, with the SQL Server BI suite and SSIS in particular. Often, data integration projects require to load data from flat files or external databases. Files structures are either CSV, TXT, RAW files or other less common formats such as fixed width files. This SSIS tutorial explains how to load a CSV file into a SQL Server database using Integration Services. The loading is done in two parts consisting of several steps.
The first part consists of setting up the connection to the text file, i.e. pointing to its location on the disk. The second part is to create and configure the destination table and the connection to the target database to integrate the rows contained in the file.
Table of Contents
1. Set up the connection to the CSV file and the target table
Of course, the prerequisites must be met first, i.e. :
- Preparing the source file
- Connect to a SQL Server database with SSMS
- And install Visual Studio and the SSIS projects extension for the package development.
1.1 Prepare the source file to import
To start the tutorial and import a CSV file into a SQL Server table with SSIS, follow this first step. Start by downloading the file used in the example and creating the target table of customers. The Customers_Data.csv file contains 49 rows, consisting of one header row and 48 rows of automatically generated data.
1.2 Prepare the target table with SSMS
The purpose of this package is to import the 48 rows from the text file into the SQL Server customers table without errors. The creation code of the table is available here. Execute the code in an exercise or development database with SSMS.
-- Create the customers table CREATE TABLE [dbo].[Customers]( [CustomerID] [int] NOT NULL, [FirstName] [nvarchar](20) NULL, [LastName] [nvarchar](20) NULL, [City] [nvarchar](20) NULL, [Country] [nvarchar](50) NULL, CONSTRAINT [CustomersPrimaryKeyCustomerID] PRIMARY KEY CLUSTERED ([CustomerID] ASC) ); GO
2. Use an existing SSIS package or create a new one
This step is about how to manage the connection to the flat file to be imported into the SQL Server database. This step is done in a SSIS data flow.
- Open Microsoft Visual studio 2019 and create a new SSIS project or open an existing project.
- Open an existing package to modify. Alternatively, right click and add a new package to the SSIS project.
2.1 Add the SSIS data flow and the Flat File Source
From the workspace, locate the SSIS toolbox on the left. Select Data flow Task and drag it onto the workspace and double click to open it.
In the SSIS Toolbox, now select Flat File Source and drag the component onto the workspace. Then double click on it to open it and set it up.
Using SSIS variables to manage file paths
With SSIS, almost all components and properties can be set with variables. It is good practice to use variables to manage paths, connection strings and changing SSIS settings efficiently. Like changing environments from development to test or production for example. This topic will be covered in a dedicated tutorial.
2.2 Create a new connection to the text file
Create and configure a new flat file connection by selecting the following items:
- Enter the name of the connection
- The path of the file
- Select the encoding of the file (code page)
- The format, in our case it is delimited
- Check the delimiter at the end of the header (header row delimiter)
- If the column names are in the first line of the file
2.3 Set up the columns and double check the data
You need to set up the columns and check the data:
- Validate or modify the row delimiter
- Choose the column delimiter, which is the character at the end of each line of data
- Check if the data and in particular the accents or special characters are displayed correctly. For example, lines 29 and 32 have an error because the accents are misinterpreted. The encoding must be changed.
2.4 Choose the right encoding for the source file with Notepad++
Change the encoding of the source file until the data is displayed correctly. To find the encoding of the file, simply use Notepad++ and open the file. There are 2 possibilities to check the encoding:
- From the Encoding menu
- At the bottom right of the software where information about the file is displayed.
To load the text file without errors, change the encoding and use 65001 (UTF-8) instead of 1252 (ANSI – Latin I) proposed by default by SSIS.
2.5 Adjust the type and length of columns in the text file
Once the correct encoding has been selected, click on the Columns tab again and check the data again.
Now click on the Advanced tab. Next, rename the columns if necessary and assign them a consistent data type. By default, SSIS uses the DT_STR type with a length of 50 characters. Use the DT_WSTR type with the correct length to avoid conversion errors later in the development of the SSIS package. So adapt the type of each column to the length of the fields in our customer table.
- CustomerID: int becomes a DT_WSTR(8) as the column will be used in a derived column.
- FirstName : nvarchar(20)
- LastName : nvarchar(20)
- City : nvarchar(20)
- Country : nvarchar(50)
Important note: the names and data types of the columns in the flat file connection must match the names and data types of the columns of the target table in the database, which will make data mapping in SSIS much easier. Now check the columns in the Flat File Source component:
3. Configure the SSIS OLE DB destination and the connection
The second part of the text file import into a SQL Server database is the mapping and the connection in SSIS. Indeed, configure now the target database component and load the data with the right format. A format compatible with the data of the flat file.
3.1 Add the Derived Column component and link it to the source
- In the SSIS toolbox, from the Common part. Drag the Derived Column transformation into the data flow.
- Then link the Flat File Source component to Derived Column as shown in the image below.
3.2 Configure the SSIS derived column to convert the customer ID to integer
The CustomerID column is of type integer, so a new one must be created based on the one in the file. To convert it to a new SSIS column of type Integer (DT_I8).
3.3 Configure the SSIS OLE DB Destination component
To do this, you must first create a connection to the target database from the SSIS connection manager. Set up the data mapping in the Mappings tab to check that the flat file and SQL Server table columns match. Use the Derived Column.CustomerID column and not the original flat file column.
Automatic mapping with SSIS
Note: It is possible to automatically map SSIS columns in the next step. Make sure to name all source and target columns the same.
4. Run the SSIS package to import the csv file
Once the data flow is fully configured, run the SSIS package to insert the rows into the customer table. The 48 rows are processed and integrated into the database.
5. Check the data imported using SSMS
From SSMS, check the data with these two SQL queries. The table has 48 rows, identical to the contents of the file.
SELECT COUNT(*) AS [NumberOfCustomers] FROM [dbo].[Customers]; SELECT * FROM [dbo].[Customers];
Finally, run a SELECT query to check the data loaded into the database. This example allows you to easily import a text file into a SQL Server database with SSIS. It is also possible to load several flat files into a database with an SSIS loop (For Each File component).
How to avoid csv text file import errors with Integration Services ?
A common error is the SSIS code 0x80019002 which indicates that the maximum number of errors allowed has been reached. It is indeed possible to set the number of errors tolerated via the MaximumErrorCount parameter. However, this is not recommended except in the case of automatic management of errors and re-loads. Three key elements to check to successfully integrate files and avoid classic errors:
- Column separator
- Line separator
- Encoding of the CSV file
So, CSV and flat files integration in general with SSIS and Visual Studio is powerful. However, errors are common, unless you prepare and check the data types in both the source file and the target table beforehand. To take it a step further, it is possible to load an Excel file into a table with SSIS in the same way. The previous tutorial is bout how to create a simple data flow to load a table into another and enrich the data.
To conclude, this step by step SSIS tutorial explains how to import a CSV file into a SQL Server table using a package developed with Visual Studio.