Understanding the RANK() Function in SQL

In SQL databases, the RANK() function is a window function that returns the rank of the current row within its partition, with gaps.

By “with gaps” I mean that if there are any ties for a given rank, there will be a gap between that rank value and the next rank value.

If you don’t want such gaps, use the DENSE_RANK() function instead, as it returns the rank without gaps.

Continue reading

Understanding the CUME_DIST() Function in SQL

Many relational database management systems (RDBMSs) provide us with a range of window functions.

The CUME_DIST() function is a window function that’s commonly implemented in RDBMSs for the purpose of calculating the cumulative distribution across a data set. In other words, it calculates the relative position of a specified value in a group of values.

Continue reading

Create a Window Function in SQL

Most of the major RDBMSs allow us to create window functions in SQL queries.

A window function is a special aggregate-like function that enables partitioning and ordering of data within a result set. Unlike regular aggregate functions, which collapse multiple rows into a single result, window functions retain individual rows in the output, augmenting them with additional computed values.

The following example demonstrates how to create a window function in SQL.

Continue reading

Get the Most Populous City in Each District When Using a Window Function in SQL

While it’s true that we can use the SQL MAX() function to get the maximum value in a column, what if we want to return the value from another column instead of the actual maximum value itself? And what if we want it partitioned by category, so that it’s based on the maximum value from each category? And what if we want all values listed out, including those that aren’t the most populous?

In the following example, we use a window function to solve this problem.

Continue reading

Overview of the WINDOW Clause in SQL

The WINDOW clause is an optional clause that we can use in our SQL queries to create a named window. The named window can then be used as part of a window function.

When creating a window function, a SQL developer will often define it directly in the OVER clause. But that’s not the only way to do it. We can alternatively use the WINDOW clause to define it in a named window, and then refer to that named window in the OVER clause.

Continue reading

Compute a Moving Average in SQL

SQL provides us with an easy way to automatically compute the moving average of a given column.

The moving average (also known as the rolling average, running average, moving mean (MM), or rolling mean) is a series of averages of different selections of the full data set. The moving average changes as we move through the data set. We can add the moving average to our SQL query result sets to see how it changes across the result set.

Continue reading

Create a Running Total in SQL

We can use SQL to create a running total of a given column. By this I mean, we can create a column that calculates the cumulative sum of a given column.

The running total/cumulative sum is the sum of all rows up until the current row. The result is that the sum increases (or decreases in the case of negative values) with each row returned in the result set. Each row’s value is added to the cumulative amount from all prior rows, so for any given row, we get the total sum for all rows up to that point – the “running total”.

Continue reading

Calculate a Grand Total using a Window Function in SQL

When we use an OVER() clause to create a window function in SQL, we often use a PARTITION BY clause to partition the results. This can be handy if we want to do stuff like calculate subtotals.

But we can also use an empty OVER clause to calculate a grand total.

We might want to do this if we’re already using another OVER clause to calculate subtotals, but we also want a column to provide the grand total.

Continue reading