How to use the GROUP BY query in T-SQL to aggregate data?

Examples of GROUP BY queries in T-SQL to aggregate data from a table with SQL Server.

These GROUP BY query examples in T-SQL shows multiple ways of grouping data. Grouping is a clause used in SQL Server or standard SQL code to group rows from a query result according to one or more columns. The GROUP BY clause is typically used with aggregation functions such as SUM, AVG, MAX, MIN, COUNT, etc., to produce grouped results. Here are 10 concrete examples of using the GROUP BY clause in T-SQL.

In all of these examples, the GROUP BY query in T-SQL groups rows based on the specified column(s), and the aggregation functions are used to produce grouped, i.e., aggregated results.

1. Create example tables for the T-SQL GROUP BY queries

To simplify the tutorial, here are ready and functional scripts to create tables and execute queries. The first one creates the products table and the second one the orders table. We need at least two tables to create meaningful join queries.

1.1 Query to create the products table

Here is the code to create the products table with 10 rows of data.

-- Create the dbo.Products table
CREATE TABLE products (
   product_id INT PRIMARY KEY,
   name VARCHAR(255) NOT NULL,
   category VARCHAR(255) NOT NULL,
   quantity INT NOT NULL,
   price DECIMAL(10,2) NOT NULL,
   purchase_price DECIMAL(10,2) NOT NULL
);

-- Insert a few sample line into the dbo.Products table
INSERT INTO products 
   (product_id, name, category, quantity, price, purchase_price)
VALUES (1, 'Laptop', 'IT', 10, 1000, 500),
       (2, 'Touch Tablet', 'IT', 5, 500, 250),
       (3, 'LED TV', 'Electronics', 20, 2000, 1000),
       (4, 'Smartphone', 'Electronics', 15, 300, 150),
       (5, 'Vacuum Cleaner', 'Household Appliances', 30, 100, 50),
       (6, 'Washing Machine', 'Household Appliances', 25, 500, 250),
       (7, 'Refrigerator', 'Household Appliances', 35, 700, 350),
       (8, 'Microwave Oven', 'Household Appliances', 40, 100, 50),
       (9, 'Washing Machine', 'Household Appliances', 45, 800, 400),
       (10, 'Extractor Hood', 'Household Appliances', 50, 150, 75);

1.2 Query to create the orders table

-- Create the orders table: dbo.Orders
CREATE TABLE dbo.Orders(
   order_id INT PRIMARY KEY,
   customer_id INT NOT NULL,
   order_date DATE NOT NULL,
   product_id INT NOT NULL,
   quantity INT NOT NULL,
   price DECIMAL(10,2) NOT NULL,
   sales DECIMAL(10,2) NOT NULL
);

-- Insert some sample rows into the dbo.Orders table
INSERT INTO dbo.Orders 
   (order_id, customer_id, order_date, product_id, quantity, price, sales)
VALUES (1, 1, '2022-01-01', 1, 2, 1000, 2000),
       (2, 2, '2022-01-02', 2, 3, 500, 1500),
       (3, 3, '2022-01-03', 3, 1, 2000, 2000),
       (4, 4, '2022-01-04', 4, 5, 300, 1500),
       (5, 5, '2022-01-05', 5, 1, 100, 100),
       (6, 6, '2022-01-06', 6, 2, 500, 1000),
       (7, 7, '2022-01-07', 7, 1, 700, 700),
       (8, 8, '2022-01-08', 8, 4, 100, 400),
       (9, 9, '2022-01-09', 9, 1, 800, 800),
       (10, 10, '2022-01-10', 10, 2, 150, 300);

2. SQL query example to calculate the sum of sales per year

To get the sum of sales per year, simply use and adapt the following SQL query.

SELECT
   year,
   SUM(sales)
FROM dbo.Orders
GROUP BY
   year;

3. SQL Server query to get the number of orders per customer

To get the number of orders for each customer, you can use the following query.

SELECT
   customer_id,
   COUNT(*)
FROM dbo.Orders
GROUP BY
   customer_id;

4. GROUP BY with total sales by year and country

To display the total sales per year and per country, copy and paste the next script, again using the SUM built-in SQL Server function.

SELECT
   year,
   country,
   SUM(sales)
FROM dbo.Orders
GROUP BY
   year,
   country;

5. T-SQL example query to display the number of orders by country and year

To get the number of orders per country and per year, you can use the following query.

SELECT
   year,
   country,
   COUNT(*)
FROM dbo.Orders
GROUP BY
   year,
   country;

Now, here are five more complex code examples using the GROUP BY clause. And also using various aggregation functions like:

  • Average
  • Max

And combined with built-in T-SQL time functions like the following ones:

  • MONTH
  • YEAR

6. Average by month and year using GROUP BY, YEAR, MONTH and AVG functions

To get the average quantity ordered per month and per year, you can use the following query.

SELECT
   YEAR(order_date) AS year,
   MONTH(order_date) AS month,
   AVG(quantity)
FROM orders
GROUP BY
   YEAR(order_date),
   MONTH(order_date);

7. GROUP BY Query with Orders by Category and Year

In order to calculate the number of orders per product category and per year, use the following T-SQL query example and adapt it to your current needs.

SELECT
   YEAR(order_date) AS year,
   p.category, COUNT(*)
FROM dbo.Orders o
INNER JOIN dbo.Products p
   ON o.product_id = p.product_id
GROUP BY
   YEAR(order_date),
   p.category;

8. Calculate the highest price paid by each customer in T-SQL

This query retrieves the maximum price paid by each customer from the Orders table. To do so, it’s grouping the data by customer_id, which means it’s considering each customer separately. Then, for each customer, it finds the highest price in their orders.

SELECT
   customer_id,
   MAX(price)
FROM dbo.Orders
GROUP BY
   customer_id;

9. SQL Server GROUP BY scripts to calculate the sum of sales by category and year

The following script combines data from the ‘orders’ and ‘products’ tables based on their matching product IDs. This way, it can get information about each order and the corresponding product at the same time. The query then groups this combined data by both year and product category. The year is derived from the order date. For each group, it calculates the total sales. So it provides total sales, split by year and product category.

SELECT
   YEAR(order_date) AS year,
   p.category,
   SUM(o.sales)
FROM orders o
INNER JOIN products p
   ON o.product_id = p.product_id
GROUP BY
   YEAR(order_date),
   p.category;

10. Customer who spent more per category and year with total amount and average

In this query, we first create a CTE named SalesData that calculates the total amount spent by each customer in each category for each year. The total spent is calculated as the product of price and quantity from the Orders table, grouped by year, category, and customer_id.

In the main query, we join SalesData with the Orders table using customer_id and calculate the average order value using the window function AVG(). This value is calculated for each customer_id for each year. The result is then ordered by year, category and total spent descending

WITH SalesData AS (
    SELECT 
        YEAR(o.order_date) AS year,
        p.category,
        o.customer_id,
        SUM(o.price * o.quantity) AS total_spent
    FROM 
        dbo.Orders o
    JOIN 
        products p ON
	       o.product_id = p.product_id
    GROUP BY 
        YEAR(o.order_date),
	   p.category,
	   o.customer_id
)
SELECT 
    sd.year,
    sd.category,
    sd.customer_id,
    sd.total_spent,
    AVG(o.price * o.quantity)
        OVER (PARTITION BY sd.customer_id, sd.year) AS avg_order_value
FROM 
    SalesData sd
JOIN 
    dbo.Orders o ON
        sd.customer_id = o.customer_id
ORDER BY
    sd.year,
    sd.category,
    sd.total_spent desc; 

Here the result in a table format after running the query in SSMS.

yearcategorycustomer_idtotal_spentavg_order_value
2022Electronics32000.002000.000000
2022Electronics41500.001500.000000
2022Household Appliances61000.001000.000000
2022Household Appliances9800.00800.000000
2022Household Appliances7700.00700.000000
2022Household Appliances8400.00400.000000
2022Household Appliances10300.00300.000000
2022Household Appliances5100.00100.000000
2022IT12000.002000.000000
2022IT21500.001500.000000

Conclusion on the GROUP BY with SQL Server

It is important to note that these GROUP BY query examples in T-SQL should be used with caution as it can impact the performance of the query if used improperly. Indeed, the GROUP BY clause can be resource-intensive if used on a large table or if combined with other costly clauses such as DISTINCT or ORDER BY. It is therefore recommended to understand how GROUP BY works and to use it appropriately in your SQL queries.

Data Aggregation with SSIS

To go further, here are other ways to aggregate data with SQL Server, this time with SSIS.

Be the first to comment

Leave a Reply

Your email address will not be published.


*