Import Excel cells with SSIS into SQL Server and filter the data

How to import and filtrer a range of specific Excel cells into a SQL Server table with SSIS?

When importing data from an XLSX file into a SQL table, it is possible to import specific Excel cells with the SSIS options. The range will necessarily be contiguous, in other words, the Excel cells to be loaded are side by side. However, to load multiple non-contiguous ranges of cells, the method presented in this tutorial must be adapted.

To load an Excel range where the cells are separated or only certain targeted Excel cells. It is necessary to go through a specific SSIS data flow development to process each range separately. It is necessary to run several data flows or a loop with dynamic queries.

To load a particular range of Excel cells, it is necessary to write a static query. This query specifies which cells are to be loaded. Excel spreadsheets are extraordinarily powerful and are very often integrated into databases for data warehouse projects.

1. Download the Excel file used in the tutorial

Here is a sample of the data range used in the Excel file, it contains 24 rows of data, and after each block of 5 rows, there is a blank row, to test the behaviour of the filters from SSIS.

Excel data range from the Excel file used in the tutorial
Excel data range from the Excel file used in the tutorial

Here is the link to download the source Excel file.

2. SQL query to load an Excel range with SSIS

In this first example of a SQL query, the query retrieves all the cells of an Excel table. That is, from the 3rd to the 13th row, and from column A to column C of the MS Excel file to be loaded with the package.

Storing or not storing the source data in an Excel table does not change the SQL query behaviour. In other words, this query loads the Excel cells without filtering the data upstream.

SELECT * FROM [Sheet1$A3:C13];
Import Excel cells with SSIS using a SQL query in the Excel component
Import Excel cells with SSIS using a SQL query in the Excel component

3. SSIS query to filter out empty Excel rows

In this second example, the SQL query is used to filter the data. And therefore to select all rows with the first cell of the selected range not null. And from the 3rd to the 10th row, from column A to column D, of the Microsoft Excel file.

SELECT * FROM [Sheet1$A3:C13]
WHERE CustomerID IS NOT NULL;

Use the HDR=YES parameter to indicate that the column names are fixed and used in the query for example.

Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\data\Customers_Data_Sample.xlsx;Extended Properties="EXCEL 12.0 XML;HDR=YES";
SQL Query with a filter on the Excel column using SSIS
SQL Query with a filter on the Excel column using SSIS

4. Filter Excel data without column names

To use generic column names such as F1, F2, F3, etc. Use syntax like this to construct the Excel connection string. This option is particularly useful for loading data with column names that may change.

SELECT * FROM [Sheet1$A3:C13]
WHERE F1 IS NOT NULL;

So use the HDR=NO option to not enable column name handling. HDR is simply an abbreviation for Header.

Data Source=C:\dossier\file.xlsx;Provider=Microsoft.ACE.OLEDB.12.0;Extended Properties="Excel 12.0 Xml;HDR=NO";
SQL query on Excel File from SSIS with a filter on generic column name
SQL query on Excel File from SSIS with a filter on generic column name

5. Query to select Excel rows dynamically

In this last example of a query, here is how to load Excel cells with SSIS and select the data up to the last row of the table without knowing the number of rows. To do this use this variant of the query, where you only need to fill in the name of the rightmost column of the range to be loaded without the row number. The empty rows are loaded, to filter them, add the WHERE clause again as in the previous examples.

SELECT * FROM [Sheet1$A3:E];
SQL query to load an Excel range and select all rows having data
SQL query to load an Excel range and select all rows having data

6. Create the complete SSIS package in 4 steps

To finally put these queries into practice, import Excel files into SQL Server and perform other transformation tasks. Such as filtering and sorting the data before loading it into the database. Also use SSIS to import Excel data in both directions, i.e. import data from Excel to SQL Server or export data from MS SQL to an Excel file.

The creation of the package is done in 4 steps, in an Integration Services data flow:

  1. First create a connection with the SSIS manager to point to the source file.
  2. Add an Excel Source that uses the connection and use the query to filter the cells.
  3. Create an OLE DB destination to the target database and table and configure it.
  4. Then configure the data mapping.

For details of the steps, here is how to load an Excel file into a SQL Server table. This SSIS tutorial explains step by step how to load cells from an Excel sheet into a table. It is also possible to load an Excel file into a table with the SQL Server Wizard.

Be the first to comment

Leave a Reply

Your email address will not be published.


*