Skip to content

Snapshot Isolation Levels and Phantom Reads

I’m trying to study for the next MS SQL Server 2005 exam and keep getting annoyed by practice questions that are wrong. The latest one to bother me was this statement in one of the answers:

“Both read committed snapshot isolation and snapshot isolation can produce phantom records.”

Wikipedia has a nice definition of the problem of Phantom Reads:

“A phantom read occurs when, in the course of a transaction, two identical queries are executed, and the collection of rows returned by the second query is different from the first.”

The practice question is wrong; under READ COMMITTED SNAPSHOT isolation phantom reads may occur but they will not occur under full SNAPSHOT isolation. The simple difference between the two modes of snapshot isolation is the starting point and duration for which they guarantee a transactionally consistent view of the data.

  • READ COMMITTED SNAPSHOT isolation presents a view of the data at the point that the statement started, this view is consistent for the duration of the statement.
  • SNAPSHOT isolation presents a view of the data from point that the transaction started that is consistent for the duration of the transaction. The transaction is not considered to have started until the first statement of the transaction starts and it then finishes with either COMMIT or ROLLBACK (or an error under XACT_ABORT I guess).

With a definition like this (that I just made up) there is no way that SNAPSHOT isolation can allow phantom reads to occur. If it did it would not be presenting a consistent view of data throughout the duration of the transaction in the way that SERIALIZABLE transaction does. With the exception of the problem of updates SNAPSHOT isolation is essentially a row version based implementation of the SERIALIZABLE level.

Examples of Phantom Reads

Below are three queries that demonstrate phantom reads occurring under READ COMMITTED SNAPSHOT and not occurring under SNAPSHOT isolation. Follow along by running query 1 to setup the initial data, then jumping between the three queries to perform the required reads and insert the phantom row.

-- query 1 - setup the db and initial data.
create database snap ;
go

use snap;
go

-- turn on both read committed snapshot and full snapshot 
-- isolation modes. 
alter database
    snap
set 
    allow_snapshot_isolation on;
alter database
    snap
set 
    read_committed_snapshot on;

create table 
    snap_test
(
    id  char(1) not null
)
on [primary]

insert into snap_test values ('a');
insert into snap_test values ('b');
insert into snap_test values ('c');

-- go to query 2 and then 3 to perform the initial read

-- a quick check of the snapshot transactions and the version store

-- we have two transactions using the version store, one is full snapshot
select 
    transaction_id,
    transaction_sequence_num ,
    is_snapshot, 
    first_snapshot_sequence_num, 
    max_version_chain_traversed
from 
    sys.dm_tran_active_snapshot_database_transactions

/*
transaction_id       transaction_sequence_num is_snapshot first_snapshot_sequence_num max_version_chain_traversed
-------------------- ------------------------ ----------- --------------------------- ---------------------------
391601               274                      1           0                           0
391612               275                      0           0                           0
*/

-- the version store is empty
select 
    * 
from 
    sys.dm_tran_version_store

-- insert the phantom row
insert into
    snap_test
values 
    ('z');

-- go to query 2 and then 3 to check for phantom rows

-- back from checking for phantom rows

-- the two transactions have not needed to visit the version store
select 
    transaction_id,
    transaction_sequence_num ,
    is_snapshot, 
    first_snapshot_sequence_num, 
    max_version_chain_traversed
from 
    sys.dm_tran_active_snapshot_database_transactions
    
/*
transaction_id       transaction_sequence_num is_snapshot first_snapshot_sequence_num max_version_chain_traversed
-------------------- ------------------------ ----------- --------------------------- ---------------------------
391601               274                      1           0                           0
391612               275                      0           0                           0
*/

-- as the version store is still empty

select 
    * 
from 
    sys.dm_tran_version_store
-- query 2 - snapshot isolation
use snap;

set transaction isolation level snapshot;

begin transaction

-- initial read
select * from snap_test

/*
id
----
a
b
c

(3 row(s) affected)
*/

-- go to query 3 and perform the initial read committed read

-- after the phantom row has been inserted, perform the read again
select * from snap_test

-- same result, no phantom records have appeared in the result set.
/*
id
----
a
b
c

(3 row(s) affected)
*/

-- query 3 - read commited snapshot isolation
use snap;

set transaction isolation level read committed;

begin transaction

-- initial read
select * from snap_test

/*
id
----
a
b
c

(3 row(s) affected)
*/

-- back to query 1

-- after the phantom row has been inserted, read again

select * from snap_test

-- phantom row (-1) is now in the result set.
-- read committed snapshot only gurantees consistency for the duration of the statement
-- not the transaction.

/*
id
----
a
b
c
z

(4 row(s) affected)
*/

-- go back to query 1

Under the SNAPSHOT level the ‘z’ row was not returned, just what I thought would happen. But then I started wondering, how does SQL Server know not to return the ‘z’ row.

Pseudo Range Locks under SNAPSHOT isolation

Under SERIALIZATION isolation a transaction is protected from phantom reads by taking Key-Range Locks that prevent another transaction from inserting (or updating or deleting) a row that would have a key value in the locked range. Clearly this approach is not taken under SNAPSHOT isolation as a second transaction can insert rows that qualify for the first transactions read; the first transaction just ignores them.

Thinking about it: ignoring a record because it would result in a phantom read is not that different to ignoring a record because it was modified after the transaction or statement started. In this case the query will examine the version store for the record and return the row at the appropriate version. The appropriate version for a record is determined by using the new Transaction Sequence Number (XSN) associated with each transaction and added to each record modified after either of the snapshot modes have been enabled for the database.

So my theory is that a SNAPSHOT isolation transaction uses the same mechanism to ignore records in the table that were created after the transaction started.

After snapshots have been enabled each record that is modified has 14 bytes appended to it to maintain row versioning. The first 8 bytes are used to describe the file, page and row in the tempdb that contains the previous version and the next 6 bytes are used to store the XSN of the transaction that created/modified this version. The previous record in the version store will contain the same information; its version number and a pointer to the previous version if there is one. So a query can walk this chain of versions until it finds the latest version that was created before the current transaction or statement started.

The XSN for the two transactions started in my example above were shown in the transaction_sequence_num column when querying the sys.dm_tran_active_snapshot_database_transactions dmv. To view the XSN associated with the records I’ve used the technique described in Inside Microsoft SQL Server 2005: The Storage Engine to view the contents of a data page. (The DBCC calls used here are unsupported and super secret so are not listed in the Books Online.)

-- first call to list the pages in the table heap.
-- pages of type 1 are data pages, there should be a single page of this 
-- type. The PagePID column is the page number we need below, 73 in my case.
dbcc ind (snap, snap_test, -1)
-- this trace flag is needed to use the DBCC PAGE method below
dbcc traceon(3604)
-- view the contents of the single data page, 
-- params are db_name, file_number, page_number and print_options (0 to 3)
dbcc page (snap, 1,73,3)

/*
-- some results omitted

Slot 0 Offset 0x60 Length 22
Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP VERSIONING_INFO
Memory Dump @0x4D5DC060

00000000:   50000500 610100fe 00000000 00000000 †P...a...........         
00000010:   0f010000 0000††††††††††††††††††††††††......                   

Version Information = 
    Transaction Timestamp: 271
    Version Pointer: Null

Slot 0 Column 0 Offset 0x4 Length 1
id = a     

Slot 1 Offset 0x76 Length 22
Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP VERSIONING_INFO
Memory Dump @0x4D5DC076

00000000:   50000500 620100fe 00000000 00000000 †P...b...........         
00000010:   10010000 0000††††††††††††††††††††††††......                   

Version Information = 
    Transaction Timestamp: 272
    Version Pointer: Null

Slot 1 Column 0 Offset 0x4 Length 1
id = b                               

Slot 2 Offset 0x8c Length 22
Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP VERSIONING_INFO
Memory Dump @0x4D5DC08C

00000000:   50000500 630100fe 00000000 00000000 †P...c...........         
00000010:   11010000 0000††††††††††††††††††††††††......                   

Version Information = 
    Transaction Timestamp: 273
    Version Pointer: Null

Slot 2 Column 0 Offset 0x4 Length 1
id = c                               

Slot 3 Offset 0xa2 Length 22
Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP VERSIONING_INFO
Memory Dump @0x4D5DC0A2

00000000:   50000500 7a0100fe 00000000 00000000 †P...z...........         
00000010:   15010000 0000††††††††††††††††††††††††......                   

Version Information = 
    Transaction Timestamp: 277
    Version Pointer: Null

Slot 3 Column 0 Offset 0x4 Length 1
id = z                   
*/

My SNAPSHOT isolation transaction above had a XSN of 274. Looking at the first row in the data page at slot 0, the output from DBCC says this record has a Transaction Timestamp of 271. There for the record was created or modified before my transaction was started and should be included in the result set. The same can be said for the next two records at slots 1 and 2, which were created by transactions 272 and 273. As a side, this is a nice demonstration of implicit system transactions been created for each of the three INSERT statements used to populate the initial data.

When we get to fourth row at slot 3 its easy to see a mechanism for the SNAPSHOT transaction to ignore it; its XSN is 277 which is greater than the 274 assigned to the transaction. In this way the system can provide keyset stability for a SNAPSHOT transaction without requiring locks to prevent other transactions from going about their business.

Post a Comment

Your email is never published nor shared. Required fields are marked *
*
*