Friday, April 27, 2012

How SQL Server deletes records - DELETE internals

I was surprised with the fact that a lot of developers don't know that SQL Server does not perform immediate delete of their DELETE command, but it is doing an a background thread called GHOST CLEANUP periodically. When row delete is requested by transaction, that row is being just marked as "ghosted" and left to the background process to actually removes data. Thus, performances are enhanced and in case of ROLLBACK less work is done.  In a system table sys.dm_exec_requests you can periodically notice appearance of the GHOST CLEANUP as command. 
So, let's see what SLQ Server actually does when we are executing 


DELETE FROM  MyTable WHERE ... 


At the beginning let's create test table and insert couple of records. 



CREATE TABLE [dbo].[DeleteInternals](
[Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[Name] [char](4) NOT NULL
)
GO


insert into DeleteInternals([Name]) values('Row1')
insert into DeleteInternals([Name]) values('Row2')
insert into DeleteInternals([Name]) values('Row3')
insert into DeleteInternals([Name]) values('Row4')




To discover internals of the table structure, we will use undocumented commands DBCC IND and DBCC PAGE which are heavily used internally in SQL Server but they are not documented and not supported which means you are using them at your own risk. 


To find out which page IDs to look at with DBCC PAGE we are executing


DBCC IND(TestDatabase, DeleteInternals, 1)





Some important fileds for us in the output are
  • PageFID - the file ID of the page
  • PagePID - the page number in the file
  • OBjectID - the ID of the object which page is part of
  • IndexID - the ID of the index which page is part of
  • PageType - where 1 = Data Page, 2 = Index Page
  • IndexLevel - as SQL Server uses doubly-linked lists in a B-Tree for indexes, the data are actually leaf level (level 0) of the clustered index.
 In order to get output back to the console from DBCC commands we need to turn on T3604


DBCC TRACEON(3604)
GO
DBCC PAGE(TestDatabase, 1,399,1)


Here is output from this command:




Important stuff for our example here are following records marked in a red square):

  • m_slotCnt = 4 - which shows number of records on the page
  • m_GhostRecCnt = 0 - count of "ghosted" records
  • RecordType = PRIMARY_RECORD for a Slot1 where "Row2" data is located which we will delete later on
  • Row - Offset table - offsets for all records. 



After executing DELETE command we will take a look at a page structure again. But, to prevent GHOST CLEANUP process to execute and cleans our page we will enclose DELETE command in a transaction which we will not commit for some time: 



begin transaction 
delete from DeleteInternals where Id = 2


Let's take a look at page content: 


DBCC PAGE(TestDatabase, 1,399,1)















 We still have 4 records (m_slotCnt) but we can notice now that we have one "ghosted" record, m_ghostRecCnt = 1. That record is exactly our "Row2" which is marked now as GHOST_DATA_RECORD (RecordType). Row - Offset table remains unchanged. 


After committing transaction by executing


commit


and inspecting again page structure


DBCC PAGE(TestDatabase, 1,399,1)




 We see now, after committing transaction that number of records is equal 3 (m_slotCnt = 3) and number of "ghosted" records is equal 0 (m_ghostRecCnt = 0) 
Also, Row - Offset table shows all three records, but you can notice that their offset remains as previous, before delete commnad is executed, what means that SQL Server didn't compacted page. About that, we speak some when next time. 



No comments: