The release of SQL Server 2022 came with the introduction of the DATE_BUCKET()
function.
The DATE_BUCKET()
function allows us to arrange data into groups that represent fixed intervals of time. It returns the date/time value that corresponds to the start of each date/time bucket, as defined by the arguments passed to the function.
Syntax
The syntax goes like this:
DATE_BUCKET (datepart, number, date [, origin ] )
The function returns the date/time value corresponding to the start of each date/time bucket, from the timestamp defined by the origin
parameter, or the default origin value of 1900-01-01 00:00:00.000
if the origin parameter is not specified.
Example
Here’s an example to demonstrate:
DECLARE
@date date = '2030-01-05',
@origin date = '2030-01-01';
SELECT DATE_BUCKET(day, 1, @date, @origin);
Result:
2030-01-05
In this example, my width bucket is 1 day wide. Therefore, it moved along the timeline in increments of 1 day (starting from the origin date). In this case, the resulting date is the same as my @date
argument.
Let’s change the second argument to 2, 3, and 4 respectively:
DECLARE
@date date = '2030-01-05',
@origin date = '2030-01-01';
SELECT
DATE_BUCKET(day, 1, @date, @origin) AS "1 day",
DATE_BUCKET(day, 2, @date, @origin) AS "2 days",
DATE_BUCKET(day, 3, @date, @origin) AS "3 days",
DATE_BUCKET(day, 4, @date, @origin) AS "4 days";
Result:
+------------+------------+------------+------------+ | 1 day | 2 days | 3 days | 4 days | |------------+------------+------------+------------| | 2030-01-05 | 2030-01-05 | 2030-01-04 | 2030-01-05 | +------------+------------+------------+------------+
Here’s an explanation of the result:
- The first column is the same as the previous example – it has a bucket width of 1 day.
- The second column has a bucket width of 2 days. Therefore, it increments along the timeline in blocks of 2 days. This causes the
@date
value to fall within the third bucket. Therefore, the resulting date corresponds to the start of the third bucket. - The third column has a bucket width of 3 days. This increments along the timeline in blocks of 3 days. This causes the
@date
value to fall within the second bucket. Therefore, the resulting date corresponds to the start of the second bucket. - The fourth column has a bucket width of 4 days. This increments along the timeline in blocks of 4 days. This causes the
@date
value to fall within the second bucket. Therefore, the resulting date corresponds to the start of the second bucket.
Let’s walk slowly through another example:
DECLARE
@date date = '2030-01-05',
@origin date = '2030-01-01';
SELECT DATE_BUCKET(day, 7, @date, @origin);
Result:
2030-01-01
In this case, my date bucket is 7 days wide, but there are only 5 days between @date
and @origin
. Therefore, the output falls within the first bucket. The output in this case is the start of the first bucket (which is the origin date).
Now, let’s increase the gap between the two dates to more than 7:
DECLARE
@date date = '2030-01-10',
@origin date = '2030-01-01';
SELECT DATE_BUCKET(day, 7, @date, @origin);
Result:
2030-01-08
This time, the output has incremented to the start of a new date bucket.
Remember, in this example, each date bucket spans 7 days. Therefore if we want to progress to the next date bucket, we’d need to provide a date that’s at least 15 days away from the origin:
DECLARE
@date date = '2030-01-15',
@origin date = '2030-01-01';
SELECT DATE_BUCKET(day, 7, @date, @origin);
Result:
2030-01-15
Default Origin
The fourth argument is optional, so we can omit it without error. When we do this, DATE_BUCKET()
uses the default origin date of 1900-01-01 00:00:00.000
.
Example:
DECLARE @date date = '2030-01-15';
SELECT DATE_BUCKET(day, 7, @date);
Result:
2030-01-14
Valid Date Parts
The first argument can be any of the following:
Date Part | Abbreviations |
---|---|
day | dd , d |
week | wk , ww |
month | mm , m |
quarter | qq , q |
year | yy , yyyy |
hour | hh |
minute | mi , n |
second | ss , s |
millisecond | ms |
Note that the actual date parts returned in the output will depend on the date part being used, and how narrow the bucket is.
Here’s an example that demonstrates what I mean:
DECLARE @date datetime2 = '2030-01-15 10:30:45.1234567';
SELECT
'Day' AS "Datepart",
DATE_BUCKET(dd, 1, @date) AS "Output"
UNION ALL
SELECT
'Hour',
DATE_BUCKET(hh, 1, @date)
UNION ALL
SELECT
'Minutes',
DATE_BUCKET(mi, 1, @date)
UNION ALL
SELECT
'Seconds',
DATE_BUCKET(ss, 1, @date)
UNION ALL
SELECT
'Milliseconds',
DATE_BUCKET(ms, 1, @date);
Result:
+--------------+-----------------------------+ | Datepart | Output | |--------------+-----------------------------| | Day | 2030-01-15 00:00:00.0000000 | | Hour | 2030-01-15 10:00:00.0000000 | | Minutes | 2030-01-15 10:30:00.0000000 | | Seconds | 2030-01-15 10:30:45.0000000 | | Milliseconds | 2030-01-15 10:30:45.1230000 | +--------------+-----------------------------+
Date & Origin Data Types
The third and fourth arguments (i.e. date and origin) can be an expression that resolves to one of the following data types:
- date
- datetime
- datetime2
- datetimeoffset
- smalldatetime
- time
Database Example
Here’s an example of using the DATE_BUCKET()
function in a GROUP BY
clause (in addition to the SELECT
clause), to group the results by their date bucket:
SELECT
DATE_BUCKET(week, 1, AccountOpenedDate) AS AccountOpenedDateBucket,
COUNT(CustomerName) AS CustomerCount
FROM WideWorldImporters.Sales.Customers
WHERE AccountOpenedDate BETWEEN '2012-12-31' AND '2013-12-31'
GROUP BY DATE_BUCKET(week, 1, AccountOpenedDate)
ORDER BY AccountOpenedDateBucket;
Result:
+---------------------------+-----------------+ | AccountOpenedDateBucket | CustomerCount | |---------------------------+-----------------| | 2012-12-31 | 602 | | 2013-01-21 | 1 | | 2013-02-18 | 1 | | 2013-03-25 | 1 | | 2013-04-01 | 2 | | 2013-04-08 | 3 | | 2013-04-29 | 1 | | 2013-05-06 | 1 | | 2013-06-10 | 2 | | 2013-06-17 | 1 | | 2013-07-15 | 1 | | 2013-07-29 | 1 | | 2013-08-05 | 1 | | 2013-08-12 | 1 | | 2013-09-02 | 1 | | 2013-09-30 | 1 | | 2013-10-21 | 1 | | 2013-11-18 | 1 | | 2013-11-25 | 2 | | 2013-12-23 | 1 | +---------------------------+-----------------+
In this case, I used the COUNT()
function to return a count of the customers in each date bucket.
And here’s a similar query, except this time we use a month as the bucket width:
SELECT
DATE_BUCKET(month, 1, AccountOpenedDate) AS AccountOpenedDateBucket,
COUNT(CustomerName) AS CustomerCount
FROM WideWorldImporters.Sales.Customers
WHERE AccountOpenedDate BETWEEN '2012-12-31' AND '2013-12-31'
GROUP BY DATE_BUCKET(month, 1, AccountOpenedDate)
ORDER BY AccountOpenedDateBucket;
Result:
+---------------------------+-----------------+ | AccountOpenedDateBucket | CustomerCount | |---------------------------+-----------------| | 2013-01-01 | 603 | | 2013-02-01 | 1 | | 2013-03-01 | 1 | | 2013-04-01 | 5 | | 2013-05-01 | 2 | | 2013-06-01 | 3 | | 2013-07-01 | 1 | | 2013-08-01 | 3 | | 2013-09-01 | 1 | | 2013-10-01 | 2 | | 2013-11-01 | 2 | | 2013-12-01 | 2 | +---------------------------+-----------------+
Where Can DATE_BUCKET()
be Used?
We can use the DATE_BUCKET()
function in the following clauses:
GROUP BY
HAVING
ORDER BY
SELECT <list>
WHERE
When was DATE_BUCKET()
Introduced?
The DATE_BUCKET()
function was first implemented in SQL Server 2022 (16.x). However, prior to that it was first implemented in Azure SQL Edge (see my article from 2021).