Import an XML file into a SQL table with SSIS

Tutorial to import an XML file into a SQL Server table with an SSIS package.

To import a file in XML format into a SQL Server table with SSIS, it is necessary to provide the schema in XSD format that represents the structure of the file. In this case, the XSD file is automatically generated by Visual Studio and Integration Services. This tutorial explains step by step how to import an XML file into a SQL table with the native XML Source component.

1. Prepare the SQL table with SSMS and the XML file to import

The destination table is the customer table with a basic structure. Here is the SQL Server code to create it from a connection to the target database with SSMS.

-- Create the client table
CREATE TABLE [dbo].[Customers](
   [CustomerID] [int] NOT NULL,
   [FirstName] [nvarchar](20) NULL,
   [LastName] [nvarchar](20) NULL,
   [City] [nvarchar](20) NULL,
   [Country] [nvarchar](50) NULL,
   CONSTRAINT [CustomersPrimaryKeyCustomerID] PRIMARY KEY CLUSTERED ([CustomerID] ASC)
);
GO

Download the XML file used in this tutorial from the link below. Due to restrictions, the file is renamed to txt and with comments. So, rename the file to Customers_Data.csv and remove the comments, i.e. delete the first and last line of the file.

In case you have trouble downloading and using the file, here is a sample of the file in XML format with only the first 6 lines.

<?xml version="1.0" encoding="UTF-8"?>
<Customers>
   <Customer>
      <CustomerID>1</CustomerID>
      <FirstName>Ali</FirstName>
      <LastName>Ahmed</LastName>
      <City>Cairo</City>
      <Country>Egypt</Country>
   </Customer>
   <Customer>
      <CustomerID>2</CustomerID>
      <FirstName>Johnny</FirstName>
      <LastName>John</LastName>
      <City>Toronto</City>
      <Country>Canada</Country>
   </Customer>
   <Customer>
      <CustomerID>3</CustomerID>
      <FirstName>John</FirstName>
      <LastName>Doe</LastName>
      <City>Mexico City</City>
      <Country>Mexico</Country>
   </Customer>
   <Customer>
      <CustomerID>4</CustomerID>
      <FirstName>Shu</FirstName>
      <LastName>Abbas</LastName>
      <City>Paris</City>
      <Country>France</Country>
   </Customer>
   <Customer>
      <CustomerID>5</CustomerID>
      <FirstName>Jeane</FirstName>
      <LastName>Raffin</LastName>
      <City>Liushutun</City>
      <Country>China</Country>
   </Customer>
   <Customer>
      <CustomerID>6</CustomerID>
      <FirstName>Legra</FirstName>
      <LastName>Leate</LastName>
      <City>Błaszki</City>
      <Country>Poland</Country>
   </Customer>
</Customers>

2. Configure the XML source file from the SSIS Data Flow

First create an SSIS package in a project, and add a data flow into it. Then add the XML Source and OLE DB Destination components to the SSIS data flow.

Use the XML Source and the OLE DB Destination SSIS components to import the XML file into a table
Use the XML Source and the OLE DB Destination SSIS components to import the XML file

Open the XML component and point to the XML file.

  1. Select the path of the XML file to import: C:\data\Customers_Data.xml
  2. Use Generate XSD and choose the path of the file: C:\data\Customers_Data_VisualStudio.xsd
  3. Click on Columns to check the mapping
Set up the source XML file to be imported into the database and the XSD schema with SSIS
Set up the source XML file to be imported into the database and the XSD schema with SSIS

The XSD schema does not indicate the length of the fields by default and the package therefore assigns the DT_WSTR type with a length of 255 characters. This is the SSIS data type (DT_WSTR,255). The package therefore displays this warning message.

SSIS Error No maximum length was specified for the XML Source
SSIS Error No maximum length was specified for the XML Source

Warning at {EF1A3256-9D27-4D85-9F3A-570513FF215F} [XML Source [69]]: No maximum length was specified for the XML Source.Outputs[Customer].Columns[FirstName] with external data type System.String. The SSIS Data Flow Task data type “DT_WSTR” with a length of 255 will be used.

3. SSIS error No maximum length was specified for the XML Source

This is only a warning message, but it is possible to correct the problem. To do this, modify the XSD file and specify the lengths for all text fields. The XSD file generated automatically by Visual Studio is of the following type.

Automatically generated XSD schema file in Visual Studio XML Source
Automatically generated XSD schema file in Visual Studio XML Source

The XML element line does not contain the maxLength property. Replace each such line:

<xs:element minOccurs="0" name="FirstName" type="xs:string" />

With a block with the explicit length of the text field, for example:

<xs:element minOccurs="0" name="FirstName" >
 <xs:simpleType>
   <xs:restriction base="xs:string">
     <xs:minLength value="0"/>
     <xs:maxLength value="20"/>
   </xs:restriction>
 </xs:simpleType>
</xs:element>

The result is a file with the corresponding length for each column in the SQL table. So, the modified file shows the length of each XML text field, as in the picture below.

Update the XSD schema file with the maxLength property
Update the XSD schema file with the maxLength property

Save the new XSD file as C:\dataCustomers_Data.xsd. Finally point the SSIS XML component to the new and modified XML schema.

Configure the XML file to be imported with SSIS with the new XSD file
Configure the XML file to be imported with SSIS with the new XSD file

Once the warning is corrected, check the mapping of the columns in the source component.

Check the columns of the XML file to be imported with SSIS
Check the columns of the XML file to be imported with SSIS

4. Configure the connection to the destination SQL table

Open the OLE DB Destination component and choose the table in which to import the XML file. This is the dbo.Customers table created in step 1 of this tutorial.

Choose the target table to import the customers XML file
Choose the target table to import the customers XML file

5. Run the package to import the XML data into SQL Server

Finally, run the package and check that the data is inserted into the table without error with an SQL query from SSMS.

Run the SSIS package and import the XML file into the SQL Server database
Run the SSIS package and import the XML file into the SQL Server database

Use this SQL SELECT query to read the imported data from an SSMS connection.

SELECT *
FROM [dbo][Customers];

Conclusion on XML file import into SQL Server with SSIS

The XML Source component allows you to import an XML file into a MS SQL table with SSIS. The component is easy to use, especially with the option to automatically generate the XSD schema directly from the XML file provided. However, to work in control and avoid warnings and truncations, make manual adjustments, especially on the length of text fields.

Other tutorials on data integration with SSIS

Be the first to comment

Leave a Reply

Your email address will not be published.


*