Hello,
In this post, I want to mention a few things about the data storage metadata in SQL Server, or metadata storage, as the title mentiones. What’s that, you say? 🙂
Well, to be more exact, my goal is to cover the following DMVs, with some meaningful examples and explanations. And maybe at the end, I will draw a pretty picture, to keep this functionality in mind:
- indexes
- partitions
- allocation_units (+ sys.system_internals_allocation_units)
In order to check what can be found in these views, I will follow the next steps:
- Create an empty database (I am using SSMS for this, because it’s a lot faster than doing it from a query, but remember: the most secure way to work in SQL Server is by using code snippets, which you can save and reuse at a later point in time. I will stress that as many times as I have the opportunity). If you want to create it from a script, just paste the code below:
CREATE DATABASE TestDB2
- Check the values from the three views, when there isn’t anything in the database yet:
SELECT * FROM sys.indexes SELECT * FROM sys.partition SELECT * FROM sys.allocation_units
- What we see, is that there are some rows in these views, even though the database has just been created. The indexes view has 146 rows on my new database, sys.partitions has 145 and the allocation units has 170.
The next step is to create a table and see what happens in this views afterwards. I am using the following query for table creation:
CREATE TABLE dbo.Employee ( LastName varchar(25) NOT NULL, FirstName varchar(15) NOT NULL, Address varchar(100) NOT NULL, PhoneNr char(12) NOT NULL, JobLevel smallint NOT NULL );
And then I query again the view, to see what changed. The first change that we see is that the number of rows increased by 1 in the views. But let’s see with what.
- For sys.indexes, I will use a WHERE clause, to search for changes related to my Employee table.
SELECT * FROM sys.indexes WHERE OBJECT_ID = OBJECT_ID(N'dbo.Employee')
The following row was inserted in the indexes view:
The things that catch my eye in this row are the following:
- The value from the name column is NULL
- Index_id = 0
- Type = 0
- Type_desc = HEAP
After searching a bit on the sys.indexes table (https://msdn.microsoft.com/en-us/library/ms173760.aspx), I learn the following:
Column name | Data type | Description | ||||||||||||||||
object_id | int | ID of the object to which this index belongs. | ||||||||||||||||
name | sysname | Name of the index. name is unique only within the object.
(NULL = Heap) |
||||||||||||||||
index_id | int | ID of the index. index_id is unique only within the object.
0 = Heap |
||||||||||||||||
type | tinyint | Type of index:
|
||||||||||||||||
type_desc | nvarchar(60) | Description of the index |
So, to sum up:
- every time we create a new table in SQL Server, a new row is created in sys.indexes, even though the table does not (yet) have an index.
- a table without a clustered index is called a HEAP.
- The index_id is unique in the context of the table where it’s created. This means that if we see two rows with index_id = 13 for example, we MUST also look at the object_id, to match the table to which they belong.
- Index_id = 0 means that that’s actually the row for the table (for the heap) and we can also take a look at the name column, to make sure that it’s NULL. For the other “real” indexes, the name column contains the name of the index.
- Sys.partitions
SELECT * FROM sys.partitions WHERE OBJECT_ID = OBJECT_ID(N'dbo.Employee')
These are the results:
This view contains a row for each partition of all the tables and most types of indexes in the database. Special index types such as Full-Text, Spatial, and XML are not included in this view. All tables and indexes in SQL Server contain at least one partition, whether or not they are explicitly partitioned.
- And finally, let’s query the allocation units. Unfortunately, this view must be linked to the sys.partitions one, because otherwise it is not readable with ease. So I will make the following query:
SELECT OBJECT_NAME(object_id) AS TableName, partition_id AS PartitionId, partition_number AS PartitionNumber, rows AS NumberOfRows, allocation_unit_id AS AllocationUnitID, type_desc AS AllocationUnit_Description, total_pages AS TotalPages FROM sys.partitions PART INNER JOIN sys.allocation_units ALLOC ON PART.partition_id = ALLOC.container_id WHERE object_id = object_id(N'dbo.Employee');
This simple table shows only one row because it has only one partition, no nonclustered indexes, and only one type of data (IN_ROW_DATA). Here is the result:
Every partition in a SQL Server table can contain 3 types of data, each stored on its own set of pages. And each of these types of pages is called an Allocation Unit. Below are the 3 types of Allocation Units.
- IN_ROW_DATA
- ROW_OVERFLOW_DATA
- LOB_DATA
So, an allocation unit is basically just a set of particular type of pages.
Now, to test that values are added in these DMVs, let’s create add some more interesting data: larger columns, more indexes and data in the table.
- Let’s add some new columns, that can’t be fit inside a single page (a page has a fixed length of 8KB and can contain rows of a maximum of 8060 bytes. Also, if a row’s length is more than 8060, it cannot be stored on a normal page – IN-ROW-DATA, but it will be stored on a row-overflow page).
ALTER TABLE dbo.Employee ADD LongAddress varchar(8000); ALTER TABLE dbo.Employee ADD FirstImpressions text;
The allocation units has new rows now:
Let’s add some indexes now:
CREATE NONCLUSTERED INDEX [IDX_JobLevel_NC] ON [dbo].[Employee]( [Address] ASC, [JobLevel] ASC ) CREATE CLUSTERED INDEX [IDX_Name_Clustered] ON [dbo].[Employee]( [LastName] ASC, [FirstName] ASC )
The sys.indexes view has changed: it does not contain a single row for the heap anymore, but it contains two rows, each for the index that has been created. The name column is not NULL anymore, in neither of the two cases.
The partitions table also contains more rows now: the initial row is now used for the clustered index, while a new row has been added for the nonclustered index. Also, the index_id column has changed.
A query joining all these DMVs, which tells us some useful information is the following:
SELECT OBJECT_NAME(IND.object_id) AS TableName, IND.name AS IndexName, IND.index_id IndexID, IND.type_desc as IndexType, PART.partition_id PartitionID, PART.partition_number AS PartitionNumber, PART.rows NumberOfRows, ALLOC.allocation_unit_id AS AllocationUnitID, ALLOC.type_desc as PageTypeDescription, ALLOC.total_pages AS NumberOfPages FROM sys.indexes IND INNER JOIN sys.partitions PART ON IND.object_id = PART.object_id AND IND.index_id = PART.index_id INNER JOIN sys.allocation_units ALLOC ON PART.partition_id = ALLOC.container_id WHERE IND.object_id = object_id(N'dbo.Employee');
The results of this query are:
What if we add some rows?
I have inserted 3 rows in this table, using the following query:
INSERT INTO [dbo].[Employee] (LastName, FirstName, Address, PhoneNr, JobLevel, LongAddress, FirstImpressions) VALUES ('Mouse', 'Mickey', '7484 Roundtree Drive', '849-555-0139', 1, '7484 Roundtree Drive but with a much longer text here', 'I am writing the first impressions of a new employee') ,('Mouse', 'Minnie', '1399 Firestone Drive', '181-555-0156', 2, '1399 Firestone Drive but with a much longer text here', NULL) ,('Duck', 'Donald', '5415 San Gabriel Dr.', '330-555-2568', 3, NULL, NULL)
And now, we have the following information:
We can see that the actual number of rows is presented correctly in this view and number of pages that were used to store this information.