Alright, let's put the concepts from this chapter into practice. We've learned how aggregate functions like COUNT
, SUM
, AVG
, MIN
, and MAX
summarize data, how GROUP BY
helps us perform these calculations on specific subsets of our data, and how HAVING
lets us filter based on the results of those aggregations. Now it's time to write some queries!
For these exercises, imagine we have two tables: Products
and Orders
.
Sample Products
Table:
product_id | product_name | category | price |
---|---|---|---|
101 | Quantum Widget | Electronics | 75.00 |
102 | Flux Capacitor | Electronics | 1200.00 |
103 | Turbo Encabulator | Gadgets | 250.50 |
104 | Sky Hook | Hardware | 35.20 |
105 | Left-Handed Wrench | Hardware | 15.00 |
106 | Photon Phaser | Gadgets | 150.75 |
107 | Sonic Screwdriver | Gadgets | 99.99 |
Sample Orders
Table:
order_id | customer_id | order_date | order_total |
---|---|---|---|
1 | CUST-A | 2023-10-26 | 150.75 |
2 | CUST-B | 2023-10-26 | 75.00 |
3 | CUST-A | 2023-10-27 | 250.50 |
4 | CUST-C | 2023-10-28 | 50.20 |
5 | CUST-B | 2023-10-28 | 1275.00 |
6 | CUST-A | 2023-10-29 | 35.20 |
Let's get started.
First, let's get a general overview of our orders. We want to find the total number of orders, the total sales amount across all orders, the average order value, and the highest and lowest order amounts.
SELECT
COUNT(*) AS total_orders,
SUM(order_total) AS total_revenue,
AVG(order_total) AS average_order_value,
MIN(order_total) AS minimum_order_value,
MAX(order_total) AS maximum_order_value
FROM
Orders;
Explanation:
COUNT(*)
: Counts the total number of rows in the Orders
table. We use the alias total_orders
to make the output clear.SUM(order_total)
: Calculates the sum of all values in the order_total
column, giving us the total revenue. Aliased as total_revenue
.AVG(order_total)
: Computes the average of the order_total
column. Aliased as average_order_value
. Remember, this is calculated as SUM(order_total)/COUNT(order_total).MIN(order_total)
: Finds the smallest value in the order_total
column. Aliased as minimum_order_value
.MAX(order_total)
: Finds the largest value in the order_total
column. Aliased as maximum_order_value
.Expected Result:
(Based on the sample data)
total_orders | total_revenue | average_order_value | minimum_order_value | maximum_order_value |
---|---|---|---|---|
6 | 1836.65 | 306.1083 | 35.20 | 1275.00 |
Note: The exact average might be displayed with more or fewer decimal places depending on the SQL database system. |
Now, let's see how many orders each customer has placed. This requires grouping the rows by customer_id
before counting.
SELECT
customer_id,
COUNT(*) AS number_of_orders
FROM
Orders
GROUP BY
customer_id;
Explanation:
GROUP BY customer_id
: This clause groups the rows from the Orders
table based on unique values in the customer_id
column. All rows for 'CUST-A' are in one group, all rows for 'CUST-B' in another, and so on.COUNT(*)
: This aggregate function is now applied to each group separately. It counts the number of rows within each customer's group.SELECT customer_id, ...
: We select the customer_id
to know which group each count belongs to.Expected Result:
customer_id | number_of_orders |
---|---|
CUST-A | 3 |
CUST-B | 2 |
CUST-C | 1 |
Let's expand on the previous query. For each customer, let's find out not only how many orders they placed, but also their total spending and average order value.
SELECT
customer_id,
COUNT(*) AS number_of_orders,
SUM(order_total) AS total_spent,
AVG(order_total) AS average_spent
FROM
Orders
GROUP BY
customer_id;
Explanation:
customer_id
.SUM(order_total)
and AVG(order_total)
. These functions are calculated independently for each customer group created by the GROUP BY
clause.Expected Result:
customer_id | number_of_orders | total_spent | average_spent |
---|---|---|---|
CUST-A | 3 | 436.45 | 145.4833 |
CUST-B | 2 | 1350.00 | 675.0000 |
CUST-C | 1 | 50.20 | 50.2000 |
Let's switch to the Products
table. We can group by category
to find out how many products we have in each category and what their average price is.
SELECT
category,
COUNT(*) AS number_of_products,
AVG(price) AS average_price
FROM
Products
GROUP BY
category;
Explanation:
GROUP BY category
: Groups rows based on the product category ('Electronics', 'Gadgets', 'Hardware').COUNT(*)
: Counts the products within each category group.AVG(price)
: Calculates the average price for products within each category group.Expected Result:
category | number_of_products | average_price |
---|---|---|
Electronics | 2 | 637.5000 |
Gadgets | 3 | 167.0800 |
Hardware | 2 | 25.1000 |
Let's visualize the average price per category.
Average price calculated for products within each distinct category.
Suppose we want to find only those customers who are "big spenders," defined here as having spent more than $500 in total. We first need to calculate the total spending per customer (SUM(order_total)
grouped by customer_id
) and then filter these summarized results. This is where HAVING
comes in.
SELECT
customer_id,
SUM(order_total) AS total_spent
FROM
Orders
GROUP BY
customer_id
HAVING
SUM(order_total) > 500;
Explanation:
customer_id
and calculates the SUM(order_total)
for each customer.HAVING SUM(order_total) > 500
: This clause filters the results of the aggregation. It only keeps the groups (customers) where the calculated sum is greater than 500.WHERE SUM(order_total) > 500
. The WHERE
clause filters individual rows before aggregation happens. HAVING
filters groups after aggregation.Expected Result:
customer_id | total_spent |
---|---|
CUST-B | 1350.00 |
Let's try a more complex scenario combining row filtering (WHERE
), grouping (GROUP BY
), and group filtering (HAVING
). Find the average order total for each customer, but only consider orders placed after 2023-10-26. Furthermore, only show customers whose average order total (for these recent orders) is greater than $100.
SELECT
customer_id,
AVG(order_total) AS average_recent_order_value
FROM
Orders
WHERE
order_date > '2023-10-26' -- Filter rows first
GROUP BY
customer_id -- Group remaining rows
HAVING
AVG(order_total) > 100; -- Filter groups based on aggregate
Explanation:
WHERE order_date > '2023-10-26'
: Filters the Orders
table, removing orders placed on or before 2023-10-26. Order IDs 1 and 2 are excluded.GROUP BY customer_id
: Groups the remaining rows (orders 3, 4, 5, 6) by customer.
AVG(order_total)
: Calculates the average order_total
for each of these groups.
HAVING AVG(order_total) > 100
: Filters these calculated group averages, keeping only those greater than 100.Expected Result:
customer_id | average_recent_order_value |
---|---|
CUST-A | 142.85 |
CUST-B | 1275.00 |
This practical exercise walked through using aggregate functions, grouping data effectively, and filtering those groups based on calculated values. Experiment with these queries on your own data or modify them using different conditions and aggregates. Consistent practice is the best way to become comfortable with these fundamental SQL operations for data analysis.
© 2025 ApX Machine Learning