SQL Server supports partitioned tables and indexes. When a partitioned table or index is partitioned, its data is divided into units that can be spread across more than one filegroup.
Therefore, to create a partitioned table in SQL Server, you first need to create the filegroup/s that will hold each partition. You also need to create a partition function and a partition scheme.
So it goes like this:
- Create filegroup/s
- Create a partition function
- Create a partition scheme
- Create the partitioned table
Below is an example of using these steps to create a table with four partitions.
Create Filegroups
First, we add four filegroups to the database called Test, and then specify the physical file for each of those filegroups.
ALTER DATABASE Test
ADD FILEGROUP MoviesFg1;
GO
ALTER DATABASE Test
ADD FILEGROUP MoviesFg2;
GO
ALTER DATABASE Test
ADD FILEGROUP MoviesFg3;
GO
ALTER DATABASE Test
ADD FILEGROUP MoviesFg4;
ALTER DATABASE Test
ADD FILE
(
NAME = MoviesFg1dat,
FILENAME = '/var/opt/mssql/data/MoviesFg1dat.ndf',
SIZE = 5MB,
MAXSIZE = 100MB,
FILEGROWTH = 5MB
)
TO FILEGROUP MoviesFg1;
ALTER DATABASE Test
ADD FILE
(
NAME = MoviesFg2dat,
FILENAME = '/var/opt/mssql/data/MoviesFg2dat.ndf',
SIZE = 5MB,
MAXSIZE = 100MB,
FILEGROWTH = 5MB
)
TO FILEGROUP MoviesFg2;
GO
ALTER DATABASE Test
ADD FILE
(
NAME = MoviesFg3dat,
FILENAME = '/var/opt/mssql/data/MoviesFg3dat.ndf',
SIZE = 5MB,
MAXSIZE = 100MB,
FILEGROWTH = 5MB
)
TO FILEGROUP MoviesFg3;
GO
ALTER DATABASE Test
ADD FILE
(
NAME = MoviesFg4dat,
FILENAME = '/var/opt/mssql/data/MoviesFg4dat.ndf',
SIZE = 5MB,
MAXSIZE = 100MB,
FILEGROWTH = 5MB
)
TO FILEGROUP MoviesFg4;
GO
You’ll need to change this code, depending on your requirements. You’ll also need to change the file paths to suit your environment. For example, if you’re on Windows, your path might look more like D:\mssql\data\MoviesFg4dat.ndf
.
Also, if you need more partitions add more filegroups here. Conversely, if you need less partitions, specify less filegroups here.
Create a Partition Function
Next we create a partition function called MoviesPartitionFunction that will partition the table into four partitions.
CREATE PARTITION FUNCTION MoviesPartitionFunction (int)
AS RANGE LEFT FOR VALUES (1, 100, 1000);
GO
The int part specifies the data type of the column used for partitioning.
All data types are valid for use as partitioning columns, except text, ntext, image, xml, timestamp, varchar(max), nvarchar(max), varbinary(max), alias data types, or CLR user-defined data types.
Here, I use three boundary values (1, 100
, and 1000
) to specify four partitions. These boundary values must either match or be implicitly convertible to the data type specified in parentheses after the partition function’s name.
Given these boundary values, and the fact that I specified a RANGE LEFT
partition, the four partitions will hold values as specified in the following table.
Partition | Values |
---|---|
1 | <= 1 |
2 | > 1 AND <= 100 |
3 | > 100 AND <=1000 |
4 | > 1000 |
If I had specified a RANGE RIGHT
partition, the breakdown would be slightly different, as outlined in the following table.
Partition | Values |
---|---|
1 | < 1 |
2 | >= 1 AND < 100 |
3 | >= 100 AND < 1000 |
4 | >= 1000 |
The same concept applies if the partitioning column uses other data types, such as date/time values.
Create a Partition Scheme
Next we need to create a partition scheme.
A partition scheme maps the partitions of a partitioned table or index to the new filegroups.
In our case, the code will look like this:
CREATE PARTITION SCHEME MoviesPartitionScheme
AS PARTITION MoviesPartitionFunction
TO (MoviesFg1, MoviesFg2, MoviesFg3, MoviesFg4);
GO
Notice that we reference the partition function that we created in the previous step. We also reference the filegroups that we created in the first step.
Create the Partitioned Table
Finally we can create the partitioned table.
CREATE TABLE Movies (
MovieId int IDENTITY PRIMARY KEY,
MovieName varchar(60)
)
ON MoviesPartitionScheme (MovieId);
GO
The only difference between this and creating an unpartitioned table is that when creating a partitioned table, we use the ON
argument to specify a partition scheme to use. In our case, we specify the partition scheme we created in the previous step, and specify the MovieId column as the partitioning column.
You’ll notice that the MovieId column has a data type of int, which matches the boundary values that we specified when creating the partition function.
Note that if you use a computed column in a partition function, it must be explicitly marked PERSISTED
.
Check the Partition Function
You can use the sys.partition_functions
view to return all partition functions.
SELECT * FROM sys.partition_functions;
Result (using vertical output):
name | MoviesPartitionFunction function_id | 65536 type | R type_desc | RANGE fanout | 4 boundary_value_on_right | 0 is_system | 0 create_date | 2020-10-10 05:37:41.330 modify_date | 2020-10-10 05:37:41.330
Check the Partition Scheme
You can use sys.partition_schemes
to check the partition scheme.
SELECT * FROM sys.partition_schemes;
Result (using vertical output):
name | MoviesPartitionScheme data_space_id | 65601 type | PS type_desc | PARTITION_SCHEME is_default | 0 is_system | 0 function_id | 65536
Alternatively, you could use the following query to return other details, such as the schema, table, index, etc.
SELECT
object_schema_name(i.object_id) AS [Schema],
object_name(i.object_id) AS [Object],
i.name AS [Index],
s.name AS [Partition Scheme]
FROM sys.indexes i
INNER JOIN sys.partition_schemes s ON i.data_space_id = s.data_space_id;
Result (using vertical output):
Schema | dbo Object | Movies Index | PK__Movies__4BD2941A0ED85ACA Partition Scheme | MoviesPartitionScheme
Check the Partitioned Table
You can run the sys.dm_db_partition_stats
view to return page and row-count information for every partition in the current database.
But running that before inserting any data into the table will result in most statistics being zero.
So I’m going to insert data first.
INSERT INTO Movies
SELECT name FROM OtherDb.dbo.Movies;
Result:
(4079 rows affected)
We can see that 4,079 rows were inserted.
Now let’s query the sys.dm_db_partition_stats
view.
SELECT *
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('dbo.Movies');
Result:
+-------------------+-------------+------------+--------------------+--------------------------+--------------------------+------------------------------+-----------------------+---------------------------+--------------------------------+------------------------------------+-------------------+-----------------------+-------------+ | partition_id | object_id | index_id | partition_number | in_row_data_page_count | in_row_used_page_count | in_row_reserved_page_count | lob_used_page_count | lob_reserved_page_count | row_overflow_used_page_count | row_overflow_reserved_page_count | used_page_count | reserved_page_count | row_count | |-------------------+-------------+------------+--------------------+--------------------------+--------------------------+------------------------------+-----------------------+---------------------------+--------------------------------+------------------------------------+-------------------+-----------------------+-------------| | 72057594048413696 | 2030630277 | 1 | 1 | 1 | 2 | 9 | 0 | 0 | 0 | 0 | 2 | 9 | 1 | | 72057594048479232 | 2030630277 | 1 | 2 | 1 | 2 | 9 | 0 | 0 | 0 | 0 | 2 | 9 | 99 | | 72057594048544768 | 2030630277 | 1 | 3 | 3 | 5 | 25 | 0 | 0 | 0 | 0 | 5 | 25 | 900 | | 72057594048610304 | 2030630277 | 1 | 4 | 10 | 12 | 33 | 0 | 0 | 0 | 0 | 12 | 33 | 3079 | +-------------------+-------------+------------+--------------------+--------------------------+--------------------------+------------------------------+-----------------------+---------------------------+--------------------------------+------------------------------------+-------------------+-----------------------+-------------+
This view returns a lot of columns, so let’s narrow the columns down to just a couple.
SELECT
partition_number,
row_count
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('dbo.Movies');
Result:
+--------------------+-------------+ | partition_number | row_count | |--------------------+-------------| | 1 | 1 | | 2 | 99 | | 3 | 900 | | 4 | 3079 | +--------------------+-------------+
We can see how the rows are allocated across the partitions. They’re allocated exactly as we specified in the partition function. The rows total 4,079, which is exactly how many rows we inserted.
However, it’s worth noting that the Microsoft documentation actually states that this column is just an approximate count of the rows in each partition.
Best Practice
Microsoft recommends that we always keep empty partitions at both ends of the partition range.
This is in case you need to either split or merge the partitions in the future.
The reason for this recommendation is to guarantee that the partition split and the partition merge don’t incur any unexpected data movement.
Therefore, given the data in my example, I could change the partition function to look something like this:
CREATE PARTITION FUNCTION MoviesPartitionFunction (int)
AS RANGE LEFT FOR VALUES (-1, 100, 10000);
GO
Or if I anticipate more than 10,000 rows, I could use a larger number (or create more partitions).
If I were to recreate all the steps again, to create my partitioned table, my partition stats would look like this:
SELECT
partition_number,
row_count
FROM sys.dm_db_partition_stats
WHERE object_id = OBJECT_ID('dbo.Movies');
Result:
+--------------------+-------------+ | partition_number | row_count | |--------------------+-------------| | 1 | 0 | | 2 | 100 | | 3 | 3979 | | 4 | 0 | +--------------------+-------------+
Now my data is concentrated into the middle two partitions, and the partitions at both ends are empty.