SQL Server subqueries tutorial with examples

SQL Server subqueries used with SELECT, WHERE and JOIN operators.

Transact-SQL (T-SQL) is a crucial extension of SQL used predominantly in Microsoft SQL Server environments, here we’ll learn how to write subqueries. This guide dives into T-SQL subqueries, a high-level concept that allows for nesting SQL queries within other queries. By exploring their role in SQL INSERT operations, SQL JOINS, and SQL GROUP BY clauses, we illustrate these concepts using a ‘Sales’ database table.

This tutorial aims to enrich your understanding of T-SQL subqueries, enabling you to write more efficient and flexible SQL code, and enhancing your data analysis and database management skills. Technically, a subquery is a SQL query nested inside a larger query. A subquery may occur in:

  • A SELECT clause
  • A FROM clause
  • A WHERE clause
  • A JOIN operation

The subquery can be nested inside a SELECT, INSERT, UPDATE, DELETE, or another subquery. A subquery can return a set of rows or just a single row. Here we are going to explore subqueries in T-SQL using the Sales table created above.

Prerequisite : Create the sample table used in the examples

Use the Sales table creation script available in this tutorial to create a table in SSMS and reproduce the queries.

1. Subqueries in INSERT statements

Subqueries can be used in the INSERT statement to insert data into a table from one or more other tables. First example: If we had another table named DiscountedSales, and we wanted to add all sales from 2022 where the quantity was above 5 from the Sales table, the query would look something like this:

INSERT INTO DiscountedSales
SELECT *
FROM   Sales
WHERE  YearID = '2022'
   AND Qty > 5;

This subquery is pulling the relevant data from the Sales table and inserting it into the DiscountedSales table.

2. Subqueries in JOINS queries

In SQL Server and in generic SQL, you can join a table to the result set of a subquery, like in this second example. Suppose we want to join Sales with a subquery that returns the maximum sales for each customer.

SELECT S1.CustomerID, S1.Sales, S2.MaxSales
FROM Sales S1
JOIN (
    SELECT CustomerID, MAX(Sales) as MaxSales
    FROM Sales
    GROUP BY CustomerID
) S2 ON S1.CustomerID = S2.CustomerID;

This subquery joins the Sales table to a derived table S2 that contains the maximum sales for each customer.

3. Subqueries in GROUP BY

A subquery can also be used in a SQL Server GROUP BY clause. Let’s calculate the total sales for each customer, but only include customers that have made sales in 2022.

SELECT CustomerID, SUM(Sales)
FROM Sales
WHERE CustomerID IN (
    SELECT DISTINCT CustomerID
    FROM Sales
    WHERE YearID = '2022'
)
GROUP BY CustomerID;

The subquery here selects distinct CustomerIDs where YearID is ‘2022’, and then the outer query calculates the total sales for these customers.

Here’s what your result will look like, if you use the sample table provided in the beginning of the tutorial:

CustomerIDSUM(Sales)
121238.40
213937.00
32306.40
49079.30
520046.70
69674.50
714100.80
85715.50

4. SQL Server Scalar Subqueries

A scalar subquery is a subquery that returns exactly one column value from one row. Let’s find the average sales of the best-selling product with this 4th example to illustrate the scalar T-SQL subqueries.

SELECT AVG(Sales)
FROM Sales
WHERE ProductID = (
    SELECT TOP 1 ProductID
    FROM Sales
    GROUP BY ProductID
    ORDER BY SUM(Qty) DESC
);

The subquery identifies the ProductID of the best-selling product by quantity, and then the outer query calculates the average sales of this product.

5. Correlated Subqueries

A correlated SQL Server subquery works like a nested loop: the subquery only has access to rows related to a single record at a time in the outer query. The subquery can refer to a column from a table listed in the FROM list of the outer query. Let’s find the customers who have sales above the average sales of all customers.

SELECT DISTINCT S1.CustomerID 
FROM Sales S1
WHERE S1.Sales > (
    SELECT AVG(S2.Sales)
    FROM Sales S2
    WHERE S1.CustomerID = S2.CustomerID
);

The correlated subquery calculates the average sales for each customer one at a time, and then the outer query compares these average sales to each individual sales transaction.

Conclusion on SQL Server subqueries

Subqueries in SQL provide powerful ways to perform complex calculations, filter data, and even transform data. You can use them in various places in your SQL statements to solve tricky data problems. Remember that while subqueries can be very useful, they can also be very slow, especially for large data sets. Always test your queries to ensure they perform as expected.

Be the first to comment

Leave a Reply

Your email address will not be published.


*