SQL stands for Structured Query Language. It’s the go-to query language for most of us who develop relational databases. If you want to work with relational database management systems, learning SQL is pretty much non-negotiable. Especially if you want to become a SQL developer.
If you’re interested in becoming a SQL developer, here are ten essential SQL concepts that you need to learn.
The SELECT
Statement
The SELECT
statement is the cornerstone of SQL queries. It allows us to retrieve data from one or more tables in a database. Understanding how to use the SELECT
statement is a fundamental part of querying SQL databases.
Here’s how the SELECT
statement works:
SELECT column1, column2
FROM table_name;
One thing about SQL is that it uses simple language that is often quite self-explanatory. We can see from this code that we’re simply trying to select (or retrieve) two columns from a specified table. More specifically, we’re trying to retrieve the data from those columns.
Here’s a real world example:
SELECT
CatId,
CatName
FROM Cats;
Result:
catid | catname
-------+---------
1 | Fluffy
2 | Scratch
3 | Purr
Here, we retrieved data from two columns (CatId
and CatName
) of a table called Cats
. This assumes that we’ve previously created a table of that name and that it contains those columns. It also assumes that there’s data in those columns (although this is not required – if there’s no data, then we’d get an empty result).
In any case, the table contained three rows, and so those three rows were returned.
We can also use various clauses to filter and sort our results. This is where SQL really shines.
The WHERE
Clause
One of the most common clauses in SQL is the WHERE
clause. The WHERE
clause is used to filter data based on specific conditions. It’s essential for narrowing down our query results to only the data you need. In the previous query, the table contained just three rows, but imagine if it contained hundreds, or even thousands? This is where the WHERE
clause really earns it’s keep.
Example:
SELECT *
FROM Customers
WHERE TotalPurchases > 70000;
Example result:
CustomerID CustomerName TotalPurchases
---------- -------------- --------------
2 GlobalTech 75000
4 MegaStore 100000
6 TechGiants 80000
9 BigRetailer 95000
14 Enterprise Inc 85000
This query retrieves all customers with total purchases over 70,000 (the >
sign is an “operator” that means “greater than”).
We know that it selected all columns from the table, because the query uses the asterisk (*
). This is a wildcard character that means “all columns”. It saves us from having to type out all of the columns in the table (let alone remember them!).
This table contained a lot more rows than this, but these were the only ones that satisfied our search criteria (which was determined with the WHERE
clause).
We can use plenty more operators with the WHERE
clause. This allows us to create precise, targeted queries. Here are 12 Commonly Used SQL Operators to get you started.
JOIN
Operations
SQL joins are used to combine rows from two or more tables based on a related column between them.
Example:
SELECT
orders.order_id,
customers.customer_name
FROM orders
INNER JOIN customers
ON orders.customer_id = customers.customer_id;
This query combines data from the orders
and customers
tables to show order IDs alongside customer names.
Having the ability to perform joins in SQL allows us to normalise our database, so that our data is separated across multiple tables that may or may not be related. When they’re related, we can create a relationship between them (see What is a Relationship?), which helps us when designing our SQL joins across those tables.
Understanding different types of joins (INNER
, LEFT
, RIGHT
, FULL
, etc) is crucial for querying relational databases. See my SQL Joins Tutorial for a comprehensive run-down of SQL joins.
Aggregate Functions
Aggregate functions perform calculations on a set of values and return a single result. Common aggregate functions in SQL include COUNT()
, SUM()
, AVG()
, MAX()
, and MIN()
. These can be very handy for doing stuff like summarising data, generating reports, etc.
Example:
SELECT AVG(TotalPurchases)
FROM Customers
WHERE TotalPurchases > 70000;
Example result:
87000
Here, I returned the average of the total purchases for those customers who purchased over 70,000. In an earlier query, we retrieved all rows that matched this criteria. In this example we returned a single row consisting of the average value of those rows.
The GROUP BY
Clause
The GROUP BY
clause is used with aggregate functions to group the result set by one or more columns. It’s handy for creating summary reports and analysing data at different levels of granularity.
Example:
SELECT
product_category,
SUM(sales_amount) as total_sales
FROM sales
GROUP BY product_category;
Example result:
product_category total_sales
---------------- -----------
Accessories 259.97
Clothing 149.96
Electronics 3449.93
Footwear 89.99
This query summarises total sales for each product category in the sales
table. We used the SUM()
function to add up all the amounts in the sales_amount
column for each product category.
The HAVING
Clause
The HAVING
clause is used to filter the results of GROUP BY
queries. It’s similar to the WHERE
clause but operates on grouped data rather than individual rows.
Example:
SELECT
product_category,
SUM(sales_amount) as total_sales
FROM sales
GROUP BY product_category
HAVING SUM(sales_amount) > 200;
Result:
product_category total_sales
---------------- -----------
Accessories 259.97
Electronics 3449.93
This is almost the same query as the previous one, with the exception that we used the HAVING
clause to filter the results to just those rows that have a total sales amount of over 200.
Here’s another example:
SELECT
department,
COUNT(*) as employee_count
FROM employees
GROUP BY department
HAVING COUNT(*) > 10;
Result:
department | employee_count
------------+----------------
Marketing | 12
Sales | 14
IT | 11
This query shows departments with more than 10 employees, along with the number of employees in each respective department. We were able to count the number of employees by using the COUNT()
function.
Subqueries
Subqueries, also known as nested queries, are queries within queries. Subqueries allow us to use the results of one query as input for another query, enabling more complex data retrieval and manipulation.
Example:
SELECT
employee_name
FROM employees
WHERE department_id IN (
SELECT department_id
FROM departments
WHERE location = 'London'
);
This query finds employees working in departments located in London. In this case there are two tables; employees
and departments
. The outer query returns the employee name from the employees
table, while the inner query gets the department information from the departments
table and restricts it to just those with a location of London. This relies on both tables having a department_id
column.
See my article Understanding the SQL Subquery for more information and subquery examples.
UNION
and UNION ALL
The UNION
operator is used to combine the result sets of two or more SELECT
statements. UNION
removes duplicate rows, while UNION ALL
retains all rows, including duplicates.
It’s also possible to use UNION DISTINCT
, which is the same as UNION
by itself (it removes duplicates).
These operators are useful for merging data from different tables or queries.
Example:
SELECT product_name FROM products_2023
UNION
SELECT product_name FROM products_2024;
This query combines unique product names from two different tables.
Here it is with UNION ALL
:
SELECT product_name FROM products_2023
UNION ALL
SELECT product_name FROM products_2024;
That will also combine product names from two different tables, but this one will also include any duplicates (whereas the previous won’t).
INSERT
, UPDATE
, and DELETE
Statements
While SELECT
is used for retrieving data, INSERT
, UPDATE
, and DELETE
are used for modifying data in the database.
Examples:
INSERT INTO customers (customer_name, email) VALUES ('John Doe', '[email protected]');
UPDATE employees SET salary = salary * 1.05 WHERE performance_rating = 'Excellent';
DELETE FROM orders WHERE order_date < '2023-01-01';
These statements add a new customer, give a five percent raise to top performers, and remove old orders, respectively.
Indexing
Once you’ve developed enough SQL skills to write complex queries against databases, you’ll soon realise that some queries take longer than others. And some take a lot longer.
There are many reasons why some queries run faster than others, just as there are many ways to deal with slow queries. But there’s one thing you can be sure of; the more data your database contains, the slower your queries will become. This is where indexing can help.
Indexes are data structures that can improve the speed of data retrieval operations on database tables. They can help the database engine find and retrieve data much more quickly than scanning the entire table.
Here’s an example of SQL code that creates an index:
CREATE INDEX idx_last_name ON employees (last_name);
This statement creates an index called idx_last_name
on the last_name
column of the employees
table. This index has the potential to speed up queries that search or sort by that column.
Learn More
Check out my SQL Tutorial for Beginners, which is a comprehensive guide that helps beginners learn the fundamentals of SQL. I walk you through using SQL to create a small database, create a relationship, populate the database with data, query it, and more.