Group multiple tables with SSIS Union All

Develop an SSIS package step-by-step to group data from multiple SQL Server tables into one using the Union All transformation.

To group the contents of several tables or data sources in general, it is possible to use a SQL Union All query, here is how to do it without code with SSIS. A prerequisite for this type of query is to build all source tables with the same data structure, i.e., column type and length. Or at least at the input of the Union All component. This operation is equivalent to a classic SQL Union All query.

1. Create several tables to be grouped and the target table

First of all, create three source tables with the same structure and insert rows into them. Each table has 16 rows, for a total of 48 rows.

-- Create the first customers table
CREATE TABLE [dbo].[Customers_1](
   [CustomerID] [int] NOT NULL,
   [FirstName] [nvarchar](20) NULL,
   [LastName] [nvarchar](20) NULL,
   [City] [nvarchar](20) NULL,
   [Country] [nvarchar](50) NULL,
   CONSTRAINT [CustomersPKCustomerID_1] PRIMARY KEY CLUSTERED ([CustomerID] ASC)
);
GO

-- Create the second customers table
CREATE TABLE [dbo].[Customers_2](
   [CustomerID] [int] NOT NULL,
   [FirstName] [nvarchar](20) NULL,
   [LastName] [nvarchar](20) NULL,
   [City] [nvarchar](20) NULL,
   [Country] [nvarchar](50) NULL,
   CONSTRAINT [CustomersPKCustomerID_2] PRIMARY KEY CLUSTERED ([CustomerID] ASC)
);
GO

-- Create the third customers table
CREATE TABLE [dbo].[Customers_3](
   [CustomerID] [int] NOT NULL,
   [FirstName] [nvarchar](20) NULL,
   [LastName] [nvarchar](20) NULL,
   [City] [nvarchar](20) NULL,
   [Country] [nvarchar](50) NULL,
   CONSTRAINT [CustomersPKCustomerID_3] PRIMARY KEY CLUSTERED ([CustomerID] ASC)
);
GO

Create a fourth table, the target, used to store the results, i.e., all data from the 3 sources tables.

-- Create the target and global customers table
CREATE TABLE [dbo].[Customers_All](
   [CustomerID] [int] NOT NULL,
   [FirstName] [nvarchar](20) NULL,
   [LastName] [nvarchar](20) NULL,
   [City] [nvarchar](20) NULL,
   [Country] [nvarchar](50) NULL,
   CONSTRAINT [CustomersPKCustomerID_All] PRIMARY KEY CLUSTERED ([CustomerID] ASC)
);
GO

2. Feed the SQL tables to be grouped with SSIS

Download this file and rename it with an .sql extension. Or run its contents directly from an SSMS window.

If you are unable to download the file because of security restrictions, then here are its contents:

-- table 1
INSERT [dbo].[Customers_1] VALUES (1, N'Ali', N'Ahmed', N'Cairo', N'Egypt' );
INSERT [dbo].[Customers_1] VALUES (2, N'Johnny', N'John', N'Toronto', N'Canada' );
INSERT [dbo].[Customers_1] VALUES (3, N'John', N'Doe', N'Mexico City', N'Mexico' );
INSERT [dbo].[Customers_1] VALUES (4, N'Shu', N'Abbas', N'Paris', N'France' );
INSERT [dbo].[Customers_1] VALUES (5, N'Jeane', N'Raffin', N'Liushutun', N'China' );
INSERT [dbo].[Customers_1] VALUES (6, N'Legra', N'Leate', N'Błaszki', N'Poland' );
INSERT [dbo].[Customers_1] VALUES (7, N'Sullivan', N'Goadby', N'Xiaoguwei', N'China' );
INSERT [dbo].[Customers_1] VALUES (8, N'Tom', N'Ellams', N'Lop Buri', N'Thailand' );
INSERT [dbo].[Customers_1] VALUES (9, N'Trace', N'Fowell', N'Sriwing', N'Indonesia' );
INSERT [dbo].[Customers_1] VALUES (10, N'Christoffer', N'Rochford', N'Mburukullu', N'Indonesia' );
INSERT [dbo].[Customers_1] VALUES (11, N'Dru', N'Dunridge', N'Mistrató', N'Colombia' );
INSERT [dbo].[Customers_1] VALUES (12, N'Dud', N'Gravenor', N'Sandaogou', N'China' );
INSERT [dbo].[Customers_1] VALUES (13, N'Lori', N'Garshore', N'Jeziora Wielkie', N'Poland' );
INSERT [dbo].[Customers_1] VALUES (14, N'Cordy', N'Kentish', N'Vydreno', N'Russia' );
INSERT [dbo].[Customers_1] VALUES (15, N'Ancell', N'Wileman', N'Claresholm', N'Canada' );
INSERT [dbo].[Customers_1] VALUES (16, N'Elsworth', N'Chantrell', N'Sumberjo', N'Indonesia' );

-- table 2
INSERT [dbo].[Customers_2] VALUES (17, N'Beverly', N'Giraudy', N'Cigembong', N'Indonesia' );
INSERT [dbo].[Customers_2] VALUES (18, N'Tamas', N'Zavattiero', N'Jangkat', N'Indonesia' );
INSERT [dbo].[Customers_2] VALUES (19, N'Benedick', N'Schaffler', N'Itaparica', N'Brazil' );
INSERT [dbo].[Customers_2] VALUES (20, N'Leonard', N'Brimman', N'Moscow', N'Russia' );
INSERT [dbo].[Customers_2] VALUES (21, N'Morton', N'Capelle', N'Uk', N'Russia' );
INSERT [dbo].[Customers_2] VALUES (22, N'Larissa', N'Rawle', N'Tembayangan Barat', N'Indonesia' );
INSERT [dbo].[Customers_2] VALUES (23, N'Karalee', N'Wall', N'Clifden', N'Ireland' );
INSERT [dbo].[Customers_2] VALUES (24, N'Miller', N'Shakesbye', N'Gongnong', N'China' );
INSERT [dbo].[Customers_2] VALUES (25, N'Deidre', N'Cacacie', N'Itapeva', N'Brazil' );
INSERT [dbo].[Customers_2] VALUES (26, N'Gerri', N'Fawcus', N'Karma', N'Belarus' );
INSERT [dbo].[Customers_2] VALUES (27, N'Rudie', N'Helix', N'Gamut', N'Philippines' );
INSERT [dbo].[Customers_2] VALUES (28, N'Maddi', N'Rounce', N'Gavrilov-Yam', N'Russia' );
INSERT [dbo].[Customers_2] VALUES (29, N'Cornall', N'Beazer', N'Lérida', N'Colombia' );
INSERT [dbo].[Customers_2] VALUES (30, N'Ardelia', N'Pearcehouse', N'Fontenay-sous-Bois', N'France' );
INSERT [dbo].[Customers_2] VALUES (31, N'Adrian', N'Legh', N'Porto Velho', N'Brazil' );
INSERT [dbo].[Customers_2] VALUES (32, N'Buiron', N'Eads', N'Göteborg', N'Sweden' );

-- table 3
INSERT [dbo].[Customers_3] VALUES (33, N'Danette', N'Howick', N'Mombasa', N'Kenya' );
INSERT [dbo].[Customers_3] VALUES (34, N'Geno', N'Hannaway', N'Bantuanon', N'Philippines' );
INSERT [dbo].[Customers_3] VALUES (35, N'Kaleb', N'Hoovart', N'Largo', N'United States' );
INSERT [dbo].[Customers_3] VALUES (36, N'Yanaton', N'Coneybeare', N'Yanqi', N'China' );
INSERT [dbo].[Customers_3] VALUES (37, N'Jaimie', N'Zincke', N'Greeley', N'United States' );
INSERT [dbo].[Customers_3] VALUES (38, N'Bonnibelle', N'Dorman', N'Canhestros', N'Portugal' );
INSERT [dbo].[Customers_3] VALUES (39, N'Baxter', N'Howland', N'Finzes', N'Portugal' );
INSERT [dbo].[Customers_3] VALUES (40, N'Elmira', N'Dary', N'Lubao', N'China' );
INSERT [dbo].[Customers_3] VALUES (41, N'Beitris', N'Teresi', N'Impendle', N'South Africa' );
INSERT [dbo].[Customers_3] VALUES (42, N'Gustavo', N'Poston', N'Tuquan', N'China' );
INSERT [dbo].[Customers_3] VALUES (43, N'Renae', N'Macenzy', N'Al Ghayl', N'Yemen' );
INSERT [dbo].[Customers_3] VALUES (44, N'Solly', N'Davydenko', N'Durham', N'United States' );
INSERT [dbo].[Customers_3] VALUES (45, N'Herve', N'Baxandall', N'Ponggeok', N'Indonesia' );
INSERT [dbo].[Customers_3] VALUES (46, N'Page', N'Meek', N'Gufeng', N'China' );
INSERT [dbo].[Customers_3] VALUES (47, N'Cosme', N'Cranage', N'Willowmore', N'South Africa' );
INSERT [dbo].[Customers_3] VALUES (48, N'Lind', N'Bransden', N'Laval', N'France' );

Then check the number of lines in each table.

select count(*) from [Customers_1];
select count(*) from [Customers_2];
select count(*) from [Customers_3];
Check the number of rows in each table with an SQL query from SSMS
Check the number of rows in each table with an SQL query from SSMS

3. Create an SSIS data stream with the Union All transformation

Use as many sources as there are tables to be grouped in the SSIS package. To make the component work, you need at least 2 tables, and the structure of the columns to be grouped must be identical.

  1. Add three OLE DB Source components and rename them explicitly.
  2. Add the Union All transformation.
Add 3 OLE DB sources and the Union All transformation to the SSIS data flow
Add 3 OLE DB sources and the Union All transformation to the SSIS data flow

Configure each Source to point to the corresponding customer table.

Set up the first source table for the Union All SSIS
Set up the first source table for the Union All SSIS

This example is deliberately simple because all the tables have the same structure. The mapping of columns is therefore automatic because SSIS is based on the names of the columns in the links between components. Naming the sources and targets as much as possible therefore greatly facilitates the development of Integration Services data flows.

Set up the second source table for the Union All SSIS
Set up the second source table for the Union All SSIS
Set up the third source table to be grouped using the SSIS Union All
Set up the third source table to be grouped using the SSIS Union All

4. Check the SSIS Union All component

After configuring all the source tables, then link the OLE DB Source components to the Union All transformation. Link the tables in order to facilitate control and correction of the mapping in the component.

Link the 3 source SSIS components to the transformation to aggregate the data
Link the 3 source SSIS components to the transformation to group the data

Open the transformation component and check the columns. Here all columns have identical names and types. It is possible to perform the following operations.

  • Rename the output columns by clicking and editing the name.
  • Change the source columns by clicking on a row and changing the source from the source.
Mapping of source columns to group using the SSIS Union All transformation
Mapping of source columns to group using the SSIS Union All transformation

5. Set up the destination table

Finally add an OLE DB Destination component and point to the customers table with the grouped data. That is the Customers_All table, created in step 1 of this tutorial.

Configure the target table to store the joined data
Configure the target table to store the grouped data

6. Run the SSIS package to combine data from source tables

Now press F5 to execute the package and the data flow and moreover check the outcome. Indeed, the goal here is to aggregate the 16 rows of data from each source table into the single target table and have 48 different lines. It is the equivalent of the T-SQL Union All operator.

Successfully execution of the SSIS package storing data from the three tables into one
Successfully execution of the SSIS package storing data from the three tables into one

7. Check the results with an SQL query from SSMS

For the last step, check the data inserted in the target table with an SQL query to be executed from SSMS.

SELECT COUNT(*) AS [Lines] FROM [Customers_All];
The target table contains the 48 rows of the 3 source tables.
The target table contains the 48 rows of the 3 source tables.

It is also possible to control the rows of a SQL Server table directly from the OLE DB SSIS components.

Read data from the SQL Server target table directly from the SSIS Viewer
Read data from the SQL Server target table directly from the SSIS Viewer

Conclusion on using SSIS Union All to group multiple tables

To conclude, this illustrated course explains general concepts and illustrates the use of the SSIS data grouping component with a basic case study. In projects, it is important to prepare tables and their structures in advance to ensure smooth mapping and consistent data types throughout the ETL project development process.

Be the first to comment

Leave a Reply

Your email address will not be published.


*