Let’s dive into subqueries for a moment. Specifically, how to use a SQL subquery in the SELECT
list. It’s a handy little trick that can help us fetch related data without resorting to complex joins.
Sample Database
First, let’s set up some sample tables:
-- Create the Authors table
CREATE TABLE Authors (
AuthorID INT PRIMARY KEY,
AuthorName VARCHAR(100)
);
-- Create the Books table
CREATE TABLE Books (
BookID INT PRIMARY KEY,
Title VARCHAR(200),
AuthorID INT,
PublicationYear INT,
FOREIGN KEY (AuthorID) REFERENCES Authors(AuthorID)
);
-- Create the Sales table
CREATE TABLE Sales (
SaleID INT PRIMARY KEY,
BookID INT,
QuantitySold INT,
SaleDate DATE,
FOREIGN KEY (BookID) REFERENCES Books(BookID)
);
-- Populate the Authors table
INSERT INTO Authors (AuthorID, AuthorName) VALUES
(1, 'J.K. Rowling'),
(2, 'George Orwell'),
(3, 'Jane Austen'),
(4, 'Stephen King');
-- Populate the Books table
INSERT INTO Books (BookID, Title, AuthorID, PublicationYear) VALUES
(1, 'Harry Potter and the Philosopher''s Stone', 1, 1997),
(2, '1984', 2, 1949),
(3, 'Pride and Prejudice', 3, 1813),
(4, 'The Shining', 4, 1977),
(5, 'Harry Potter and the Chamber of Secrets', 1, 1998);
-- Populate the Sales table
INSERT INTO Sales (SaleID, BookID, QuantitySold, SaleDate) VALUES
(1, 1, 100, '2023-01-15'),
(2, 2, 50, '2023-02-20'),
(3, 3, 75, '2023-03-10'),
(4, 4, 60, '2023-04-05'),
(5, 5, 90, '2023-05-12'),
(6, 1, 120, '2023-06-18'),
(7, 2, 40, '2023-07-22'),
(8, 3, 80, '2023-08-30');
Now that we’ve got our tables set up and populated, let’s see how we can write a query that uses a subquery in the SELECT
list.
Example 1
Imagine we want to display each book’s title along with its total sales. Here’s how we can do that:
SELECT
b.Title,
b.PublicationYear,
(SELECT SUM(QuantitySold)
FROM Sales s
WHERE s.BookID = b.BookID) AS TotalSales
FROM
Books b
ORDER BY
TotalSales DESC;
Result:
Title PublicationYear TotalSales
---------------------------------------- --------------- ----------
Harry Potter and the Philosopher's Stone 1997 220
Pride and Prejudice 1813 155
1984 1949 90
Harry Potter and the Chamber of Secrets 1998 90
The Shining 1977 60
In this query, we’re using a SQL subquery in the SELECT
list to calculate the total sales for each book. The subquery is correlated, meaning it refers to the outer query. We can see this with the part that goes WHERE s.BookID = b.BookID
part. The b
part is the table from the outer query.
Example 2
Here’s another example where we use a subquery in the SELECT
list to find the most recent publication year for each author:
SELECT
a.AuthorName,
(SELECT MAX(PublicationYear)
FROM Books b
WHERE b.AuthorID = a.AuthorID) AS LatestPublicationYear
FROM
Authors a
ORDER BY
LatestPublicationYear DESC;
Result:
AuthorName LatestPublicationYear
------------- ---------------------
J.K. Rowling 1998
Stephen King 1977
George Orwell 1949
Jane Austen 1813
Here, the inner query got the most recent publication for each author, and the outer query returned the authors’ names and ordered the results.
Some Considerations
A correlated subquery can’t be run independently of the outer query. In other words, we can’t copy and paste the subquery and run it by itself. That’s because it refers to outer query, and so it requires the outer query in order to run.
Putting a subquery into the SELECT
list (and using subqueries in general) can have its advantages:
- It’s often more readable than a complex join, especially for simple calculations.
- It allows us to perform aggregations without grouping in the main query.
- It can be more efficient in some cases, particularly when dealing with large datasets.
However, it’s worth noting that some subqueries can result in performance issues. It all depends on the query. For more complex scenarios or when dealing with large amounts of data, you might want to consider using SQL joins or Common Table Expressions (CTEs) instead.
SQL Joins as an Alternative
Many subqueries can be rewritten using SQL joins. Some RDBMSs rewrite our subqueries as joins behind the scenes anyway. Sometimes when we encounter performance issues with a query, we can try rewriting it to use a join instead of a subquery (or vice-versa).
See my SQL Joins Tutorial for a quick-start guide to understanding joins and including them in your SQL code.