Tutorial to use the SSIS Merge Join transformation with two SQL Server tables into one target table.
Use the Merge Join with the SSIS component allows you to perform a join and merge two or more data sources based on a common key. In other words, the SSIS Merge Join component allows you to merge different heterogeneous or non-heterogeneous sources into one, such as flat files, XML documents, Excel files, or SQL Server tables.
This SSIS tutorial explains how to use the SSIS component that merges two SQL Server source tables into one target table. With SSIS, whether for on-premises versions or on the Microsoft Azure Cloud, the component is useful for managing distinct tables without using T-SQL or .NET code.
Table of Contents
1. Prepare the source and target tables to join with SSMS
From an SSMS tab, connect to the target database and run the following SQL code to create two source tables to merge into a target table. The source tables are:
- The sales table: dbo.Sales
- And the customers table: dbo.Customers
1.1 Create the sales source table
The first source table is the sales table, it will be enriched in an SSIS data flow with data from the customers table, first and last name.
-- Create the Sales table CREATE TABLE [dbo].[Sales]( [CustomerID] [int] NOT NULL, [MonthID] [nvarchar](20) NOT NULL, [YearID] [nvarchar](20) NOT NULL, [Qty] [numeric](10,2) NOT NULL, [Price] [numeric](10,2) NOT NULL, [Sales] [numeric](10,2) NOT NULL, [ProductID] [int] NOT NULL, [ProductName] [nvarchar](100) NOT NULL, CONSTRAINT [SalesPK] PRIMARY KEY CLUSTERED ( [CustomerID], [MonthID], [YearID], [ProductID] ASC ) ); -- Insert data into the Sales table INSERT [dbo].[Sales] VALUES (1, N'1', N'2022', 10.00, 879.20, 8792.00, 31, N'Babka'); INSERT [dbo].[Sales] VALUES (1, N'3', N'2021', 5.00, 312.00, 1560.00, 40, N'Unreal'); INSERT [dbo].[Sales] VALUES (1, N'6', N'2021', 9.00, 627.00, 5643.00, 27, N'Qrunch'); INSERT [dbo].[Sales] VALUES (1, N'7', N'2022', 6.00, 621.70, 3730.20, 10, N'Ambrosi'); INSERT [dbo].[Sales] VALUES (1, N'7', N'2022', 4.00, 653.10, 2612.40, 47, N'Quickick'); INSERT [dbo].[Sales] VALUES (1, N'9', N'2021', 2.00, 734.90, 1469.80, 47, N'Quickick'); INSERT [dbo].[Sales] VALUES (2, N'2', N'2021', 9.00, 895.60, 8060.40, 47, N'Quickick'); INSERT [dbo].[Sales] VALUES (2, N'3', N'2022', 11.00, 51.80, 569.80, 49, N'Menudito'); INSERT [dbo].[Sales] VALUES (2, N'7', N'2021', 11.00, 112.20, 1234.20, 19, N'Chiavie'); INSERT [dbo].[Sales] VALUES (2, N'9', N'2022', 7.00, 581.80, 4072.60, 33, N'Jans'); INSERT [dbo].[Sales] VALUES (3, N'2', N'2022', 4.00, 576.60, 2306.40, 5, N'Amoy'); INSERT [dbo].[Sales] VALUES (3, N'6', N'2021', 6.00, 551.10, 3306.60, 8, N'Gilda'); INSERT [dbo].[Sales] VALUES (4, N'0', N'2021', 7.00, 15.40, 107.80, 26, N'Lasco'); INSERT [dbo].[Sales] VALUES (4, N'1', N'2022', 2.00, 706.60, 1413.20, 27, N'Qrunch'); INSERT [dbo].[Sales] VALUES (4, N'1', N'2022', 9.00, 491.70, 4425.30, 30, N'Paldo'); INSERT [dbo].[Sales] VALUES (4, N'3', N'2022', 4.00, 810.20, 3240.80, 22, N'Bourbon'); INSERT [dbo].[Sales] VALUES (4, N'4', N'2021', 9.00, 648.90, 5840.10, 42, N'Barbacoa'); INSERT [dbo].[Sales] VALUES (4, N'5', N'2021', 7.00, 284.90, 1994.30, 5, N'Amoy'); INSERT [dbo].[Sales] VALUES (5, N'11', N'2021', 3.00, 914.90, 2744.70, 45, N'Predue'); INSERT [dbo].[Sales] VALUES (5, N'11', N'2022', 9.00, 903.80, 8134.20, 44, N'Kemps'); INSERT [dbo].[Sales] VALUES (5, N'4', N'2022', 10.00, 133.90, 1339.00, 6, N'Haystack'); INSERT [dbo].[Sales] VALUES (5, N'6', N'2021', 10.00, 940.20, 9402.00, 34, N'Amazin'); INSERT [dbo].[Sales] VALUES (5, N'7', N'2021', 7.00, 352.90, 2470.30, 32, N'Maverik'); INSERT [dbo].[Sales] VALUES (5, N'7', N'2022', 3.00, 116.40, 349.20, 7, N'Kalamata'); INSERT [dbo].[Sales] VALUES (6, N'1', N'2021', 4.00, 283.30, 1133.20, 23, N'Oatsnack'); INSERT [dbo].[Sales] VALUES (6, N'4', N'2021', 4.00, 604.90, 2419.60, 10, N'Ambrosi'); INSERT [dbo].[Sales] VALUES (6, N'4', N'2022', 5.00, 184.30, 921.50, 6, N'Haystack'); INSERT [dbo].[Sales] VALUES (6, N'5', N'2022', 9.00, 464.40, 4179.60, 18, N'Lurpak'); INSERT [dbo].[Sales] VALUES (6, N'8', N'2021', 6.00, 836.10, 5016.60, 40, N'Unreal'); INSERT [dbo].[Sales] VALUES (6, N'9', N'2022', 4.00, 254.80, 1019.20, 1, N'Boncora'); INSERT [dbo].[Sales] VALUES (7, N'0', N'2021', 7.00, 100.90, 706.30, 45, N'Predue'); INSERT [dbo].[Sales] VALUES (7, N'10', N'2021', 4.00, 161.50, 646.00, 10, N'Ambrosi'); INSERT [dbo].[Sales] VALUES (7, N'10', N'2022', 7.00, 484.50, 3391.50, 50, N'Armanino'); INSERT [dbo].[Sales] VALUES (7, N'3', N'2022', 7.00, 748.80, 5241.60, 38, N'Exo'); INSERT [dbo].[Sales] VALUES (7, N'6', N'2021', 1.00, 623.50, 623.50, 45, N'Predue'); INSERT [dbo].[Sales] VALUES (7, N'7', N'2021', 6.00, 185.00, 1110.00, 28, N'Basilico'); INSERT [dbo].[Sales] VALUES (7, N'8', N'2022', 7.00, 625.50, 4378.50, 4, N'Sartori'); INSERT [dbo].[Sales] VALUES (7, N'8', N'2022', 3.00, 125.40, 376.20, 38, N'Exo'); INSERT [dbo].[Sales] VALUES (8, N'1', N'2021', 8.00, 949.80, 7598.40, 16, N'Homekist'); INSERT [dbo].[Sales] VALUES (8, N'10', N'2021', 9.00, 944.30, 8498.70, 41, N'Baked'); INSERT [dbo].[Sales] VALUES (8, N'10', N'2022', 7.00, 578.10, 4046.70, 2, N'Master'); INSERT [dbo].[Sales] VALUES (8, N'12', N'2022', 3.00, 309.20, 927.60, 42, N'Barbacoa'); INSERT [dbo].[Sales] VALUES (8, N'2', N'2022', 4.00, 417.80, 1671.20, 25, N'Scrapple'); INSERT [dbo].[Sales] VALUES (8, N'4', N'2021', 7.00, 509.20, 3564.40, 4, N'Sartori');
1.2 Create the customers source table
The second source table is the customers table, which stores the customer number, first name, last name, city, and country. It contains 8 rows.
-- Create the Customers table CREATE TABLE [dbo].[Customers]( [CustomerID] [int] NOT NULL, [FirstName] [nvarchar](20) NULL, [LastName] [nvarchar](20) NULL, [City] [nvarchar](20) NULL, [Country] [nvarchar](50) NULL, CONSTRAINT [CustomersPrimaryKeyCustomerID] PRIMARY KEY CLUSTERED ([CustomerID] ASC) ); GO -- Insert 8 lines into the Customers table INSERT INTO [dbo].[Customers] VALUES (1, N'Ali', N'Ahmed', N'Cairo', N'Egypt'); INSERT INTO [dbo].[Customers] VALUES (2, N'Johnny', N'John', N'Toronto', N'Canada'); INSERT INTO [dbo].[Customers] VALUES (3, N'John', N'Doe', N'Mexico City', N'Mexico'); INSERT INTO [dbo].[Customers] VALUES (4, N'Shu', N'Abbas', N'Paris', N'France'); INSERT INTO [dbo].[Customers] VALUES (5, N'Jeane', N'Raffin', N'Liushutun', N'China'); INSERT INTO [dbo].[Customers] VALUES (6, N'Legra', N'Leate', N'Błaszki', N'Poland'); INSERT INTO [dbo].[Customers] VALUES (7, N'Sullivan', N'Goadby', N'Xiaoguwei', N'China'); INSERT INTO [dbo].[Customers] VALUES (8, N'Tom', N'Ellams', N'Lop Buri', N'Thailand');
1.3 Create the enriched sales target table
The target table is an enriched sales table with the first and last names of customers: dbo.SalesEnriched. The enriched table has the same structure, but with the following two additional columns:
- FirstName: Customer’s first name
- LastName: Customer’s last name
CREATE TABLE [dbo].[SalesEnriched]( [CustomerID] [int] NOT NULL, [MonthID] [nvarchar](20) NOT NULL, [YearID] [nvarchar](20) NOT NULL, [Qty] [numeric](10,2) NOT NULL, [Price] [numeric](10,2) NOT NULL, [Sales] [numeric](10,2) NOT NULL, [ProductID] [int] NOT NULL, [ProductName] [nvarchar](100) NOT NULL, [FirstName] [nvarchar](20) NULL, [LastName] [nvarchar](20) NULL, CONSTRAINT [SalesSalesEnrichedPK] PRIMARY KEY CLUSTERED ( [CustomerID], [MonthID], [YearID], [ProductID] ASC ) );
2. Create an SSIS data flow using Merge Join
Prior to this, create a connection to the SQL Server database from the SSIS package.
2.1 Prepare the source components
In an SSIS data flow, add the following components, then rename the components with the names of the source and target tables.
- 2 OLE DB Source components
- A Merge Join transformation component
- An OLE DB destination
Choose the first source table for the SSIS join, i.e., the left table.
In the same way, select the second source table, the customers table.
Finally, connect the sources to the SSIS Merge Join transformation, and the following window appears. Choose the left input for the dbo.Sales table.
2.2 Configure the source data sorting
The package displays an SSIS error:
[Merge Join [2]] Error: “Merge Join.Inputs[Merge Join Left Input]” must be sorted. If possible, sort the data at the source and mark the data as sorted by setting the IsSorted and SortKeyPosition properties in the Advanced Editor. Otherwise, add a Sort Transformation to the path before the merge.
The message is quite explicit, so just sort the join columns using the following 2 properties, from the component’s advanced editor.
- IsSorted: to indicate that the source is sorted.
- SortKeyPosition: to indicate the order of the columns used for sorting.
Repeat these operations for both source tables.
From the OLE DB Source components:
- Right-click
- In the Input and Output Properties tab
- Select OLE DB Source Output
- Change the IsSorted property to True
Still in the advanced SSIS editor, select this time the column used as the join key, here it is CustomerID, and modify the SortKeyPosition property to 1. In a join with multiple columns, assign the same sort values to the corresponding columns in the two source tables.
3. Configure the SSIS Merge Join transformation
Open the Merge Join transformation component and configure the options as follows:
- Choose the type of SSIS join: Left outer join.
- Select all columns from the first table.
- Select only first name and last name from the second table.
4. Prepare the target table to store the SSIS Merge Join data
Configure the last component of the data flow and point it to the enriched sales table.
Verify column mapping from the Mappings tab. Columns from both source tables are properly mapped to the target table.
5. Run the SSIS Merge join and check the target data
Finally, run the SSIS package. The 44 sales data rows are successfully integrated into the enriched table with no errors. In this simple tutorial, unenriched rows are not handled and therefore not redirected.
Finally, click on View Data from the OLE DB Destination component to read the table and verify that the two columns are enriched with the names and first names of the clients.
6. Conclusion on the SSIS Merge Join component
The SSIS join allows you to group data from two tables without writing a single line of code. The only prerequisite is to sort the data beforehand, directly from the advanced editor or using the SSIS Sort Transformation.
Be the first to comment