SET STATISTICS IO ON is an option that shows, at the page level, the physical and logical IO that goes on for each statement in a batch. Its not the most detailed inspection of a the query performance but it gives a nice insight into whats going in when the query is executing. I’m going to ignore the LOB statistics for now. Here’s the first run to see what it pulls back.
use AdventureWorks; go set statistics io on; go -- clear the page cache so pages must be read from disk DBCC DROPCLEANBUFFERS go -- initial read of 181 records select * from Person.StateProvince; /* Three pages in total were read in, 1 from the physical read and 2 from reading ahead from the first. Table 'StateProvince'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 2, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. */ select * from Person.StateProvince; /* As expected the physical reads are now down to 0, all the data is in the cache. Table 'StateProvince'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. */ -- two statements together, just to see the output select * from Person.StateProvince; select * from Person.CountryRegion; /* -- all logical reads as the cache is still hot. Table 'StateProvince'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. -- the cache for the second table was cold and required -- physical reads Table 'CountryRegion'. Scan count 1, logical reads 4, physical reads 1, read-ahead reads 8, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. */ set statistics io off; go
That works fine but the confusing thing is the why for the first query 3 pages were physically read in (including the read ahead) and 4 pages were logically read. The logical reads counter is the number of times a page is read from the buffer cache, not the total number of distinct pages read by the query; so reading the same page twice adds two to the count. Thanks to Inside Microsoft SQL Server 2005: Query Tuning and Optimization for clearing this up.
The next task was to take a look at the number of pages in the clustered index for the page and work out if a page was indeed read multiple times. After all its a simple query without any sub-queries or need to perform multiple scans; the scan count bears that out. sys.dm_db_index_physical_stats returns a row for each level of a (B-Tree) index, the query above will use the clustered index as it requires a simple scan and must return all the columns. So lets take a look at the pages in that index.
-- look at the number of pages at all levels of the clustered
-- index in the Person.StateProvince table
select
cast(index_type_desc as varchar(20)) as index_type_desc,
cast(alloc_unit_type_desc as varchar(20)) as alloc_unit_type_desc,
index_level,
page_count,
record_count
from
sys.dm_db_index_physical_stats(
db_id('AdventureWorks'),
object_id('Person.StateProvince'),
1,
1,
'DETAILED')
order by
index_level desc
/*
index_type_desc alloc_unit_type_desc index_level page_count record_count
-------------------- -------------------- ----------- -------------------- --------------------
CLUSTERED INDEX IN_ROW_DATA 1 1 2
CLUSTERED INDEX IN_ROW_DATA 0 2 181
*/
This makes sense now. There are two data pages at the leaf level (level 0) in the clustered index for the table, between them they hold the 181 rows in the table. There is a single non leaf level in the index, at level 1, which of course contains a single page; the root page. The root page has two records, one each to point at the two (data) pages on the next level down on the index tree. So the previous full table scan may have taken the following path through the index:
- Read the root node and determine the first page to read, this will be the left most data page.
- Read the left most data page.
- Read the root node to determine the next page to read, this will be the right most data page.
- Read the right most data page.
That makes four page reads. To test the idea all thats needed is a query that can be resolved completely on one data page, that should only require two page reads. The query below selects records with an ID less than 10 and should require one read of the root page to determine the data page with records that have an ID less than 10 (it will be the left one). After that it should be able to read all the records that qualify from a single page, as the pages each have approximately 90 records and we know there should only be 9 records returned.
set statistics io on; go -- read the records with an id less than 10, they should be contained on a -- single leaf level page and it should take a single scan -- of the root page. select * from Person.StateProvince where StateProvinceID < 10; /* As predicted only two pages were read to answer the query Table 'StateProvince'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. */ set statistics io off; go
Post a Comment