SQL Server subqueries used with SELECT, WHERE and JOIN operators.
Transact-SQL (T-SQL) is a crucial extension of SQL used predominantly in Microsoft SQL Server environments, here we’ll learn how to write subqueries. This guide dives into T-SQL subqueries, a high-level concept that allows for nesting SQL queries within other queries. By exploring their role in SQL INSERT operations, SQL JOINS, and SQL GROUP BY clauses, we illustrate these concepts using a ‘Sales’ database table.
This tutorial aims to enrich your understanding of T-SQL subqueries, enabling you to write more efficient and flexible SQL code, and enhancing your data analysis and database management skills. Technically, a subquery is a SQL query nested inside a larger query. A subquery may occur in:
- A SELECT clause
- A FROM clause
- A WHERE clause
- A JOIN operation
The subquery can be nested inside a SELECT, INSERT, UPDATE, DELETE, or another subquery. A subquery can return a set of rows or just a single row. Here we are going to explore subqueries in T-SQL using the Sales
table created above.
Prerequisite : Create the sample table used in the examples
Use the Sales table creation script available in this tutorial to create a table in SSMS and reproduce the queries.
1. Subqueries in INSERT statements
Subqueries can be used in the INSERT
statement to insert data into a table from one or more other tables. First example: If we had another table named DiscountedSales
, and we wanted to add all sales from 2022 where the quantity was above 5 from the Sales
table, the query would look something like this:
INSERT INTO DiscountedSales SELECT * FROM Sales WHERE YearID = '2022' AND Qty > 5;
This subquery is pulling the relevant data from the Sales
table and inserting it into the DiscountedSales
table.
2. Subqueries in JOINS queries
In SQL Server and in generic SQL, you can join a table to the result set of a subquery, like in this second example. Suppose we want to join Sales
with a subquery that returns the maximum sales for each customer.
SELECT S1.CustomerID, S1.Sales, S2.MaxSales FROM Sales S1 JOIN ( SELECT CustomerID, MAX(Sales) as MaxSales FROM Sales GROUP BY CustomerID ) S2 ON S1.CustomerID = S2.CustomerID;
This subquery joins the Sales
table to a derived table S2
that contains the maximum sales for each customer.
3. Subqueries in GROUP BY
A subquery can also be used in a SQL Server GROUP BY clause. Let’s calculate the total sales for each customer, but only include customers that have made sales in 2022.
SELECT CustomerID, SUM(Sales) FROM Sales WHERE CustomerID IN ( SELECT DISTINCT CustomerID FROM Sales WHERE YearID = '2022' ) GROUP BY CustomerID;
The subquery here selects distinct CustomerID
s where YearID
is ‘2022’, and then the outer query calculates the total sales for these customers.
Here’s what your result will look like, if you use the sample table provided in the beginning of the tutorial:
CustomerID | SUM(Sales) |
---|---|
1 | 21238.40 |
2 | 13937.00 |
3 | 2306.40 |
4 | 9079.30 |
5 | 20046.70 |
6 | 9674.50 |
7 | 14100.80 |
8 | 5715.50 |
4. SQL Server Scalar Subqueries
A scalar subquery is a subquery that returns exactly one column value from one row. Let’s find the average sales of the best-selling product with this 4th example to illustrate the scalar T-SQL subqueries.
SELECT AVG(Sales) FROM Sales WHERE ProductID = ( SELECT TOP 1 ProductID FROM Sales GROUP BY ProductID ORDER BY SUM(Qty) DESC );
The subquery identifies the ProductID
of the best-selling product by quantity, and then the outer query calculates the average sales of this product.
5. Correlated Subqueries
A correlated SQL Server subquery works like a nested loop: the subquery only has access to rows related to a single record at a time in the outer query. The subquery can refer to a column from a table listed in the FROM list of the outer query. Let’s find the customers who have sales above the average sales of all customers.
SELECT DISTINCT S1.CustomerID FROM Sales S1 WHERE S1.Sales > ( SELECT AVG(S2.Sales) FROM Sales S2 WHERE S1.CustomerID = S2.CustomerID );
The correlated subquery calculates the average sales for each customer one at a time, and then the outer query compares these average sales to each individual sales transaction.
Conclusion on SQL Server subqueries
Subqueries in SQL provide powerful ways to perform complex calculations, filter data, and even transform data. You can use them in various places in your SQL statements to solve tricky data problems. Remember that while subqueries can be very useful, they can also be very slow, especially for large data sets. Always test your queries to ensure they perform as expected.
Be the first to comment