A few days ago at my day job, I was working on optimizing some SQL queries. Things were getting messy and I wasn’t sure where to start.
That’s when I decided to check out some suggestions from ChatGPT and it recommended using CTEs and window functions . At first I had no idea what they were. After trying them out I realize they make things a bit simpler and efficient.
Example Scenario #
To make these concepts more practical let us consider a fictional table that might be found in a services like Zomato or Swiggy.
Common Table Expressions (CTEs) #
CTEs help structure and simplify SQL queries by creating a temporary result set that can be referenced within the main query. This allows complex queries to be broken down into smaller logical components.
1WITH HighValueOrders AS (
2 SELECT customer_name, SUM(amount) AS total_spent
3 FROM orders
4 GROUP BY customer_name
5 HAVING SUM(amount) > 50
6)
7
8SELECT customer_name, total_spent FROM HighValueOrders;
In this example, the CTE named HighValueOrders
calculates the total amount spent by each customer and only includes customers who have spent more than 50 Rs. in total. This makes the overall query easier to read and maintain.
CTEs are re-evaluated each time they’re referenced within a query, which can lead to performance overhead, especially with large datasets. However, they support recursion which is helpful for hierarchical data retrieval.
Subqueries #
SQL can run queries inside other queries. This feature is called a subquery . Subqueries are useful for solving problems in stages, allowing for data filtering and calculation in separate steps.
Non-Correlated Subquery
1SELECT restaurant_name
2FROM orders
3WHERE amount > (
4 SELECT AVG(amount)
5 FROM orders
6);
Here, the subquery calculates the average order amount across all orders. This value is then used by the outer query to find restaurants with orders exceeding this average. Since the subquery runs only once , non-correlated subqueries are generally more efficient.
Correlated Subqueries
1SELECT o1.restaurant_name, o1.amount
2FROM orders o1
3WHERE o1.amount > (
4 SELECT AVG(o2.amount)
5 FROM orders o2
6 WHERE o2.restaurant_name = o1.restaurant_name
7);
In this case, the subquery calculates the average order amount for each restaurant. For every row in the outer query, the subquery recalculates the average specific to that restaurant.
Because the subquery executes for each row of the outer query, correlated subqueries can be resource-intensive and slower .
Window Functions #
Window functions in SQL provide additional calculations across a set of rows related to the current row without aggregating or collapsing the data. They allow calculations such as running totals, averages, and ranks, while preserving the details of each individual row.
1SELECT customer_name, amount,
2 AVG(amount) OVER (PARTITION BY customer_name) AS avg_spent_per_customer
3FROM orders;
The window function (AVG(amount) OVER (PARTITION BY customer_name)
) provides additional context without changing the individual order details.
Window functions are great for calculating things like running totals, row rankings, or moving averages. They can outperform traditional subqueries by reducing the need for multiple scans of the same data. Yet, they may introduce sorting operations , which canf be costly.
Share your thoughts below or try writing a query using one of these methods. Thanks for taking a read.