Deadlock due to Lock Partitioning

In a previous post we have discussed how to analyze deadlock involving lock partitions. In this post we are going to look into a rare deadlock condition that arises only when lock partitioning is enabled. The same script will only cause blocking on a server where Lock Partitioning is not enabled.

The deadlock I am going to demonstrate below is reproducible in SQL server 2008 R2 (10.50.1600.1) and SQL server 2012 (11.0.2100.60)

Setup

Make sure lock partitioning is enabled. Run the below script to create the necessary table. You can skip this step if you have created the table from the previous post.

IF DB_ID(‘DeadLockTest’) IS NULL
CREATE Database DeadLockTest
GO

USE DeadLockTest
GO

CREATE TABLE dbo.lpdl(id int identity(1,1) primary key clustered, filler char(100) default ‘abc’)
GO
SET NOCOUNT ON
INSERT INTO dbo.lpdl DEFAULT VALUES
GO 100

Open a new session (Session 1) and run the below script to create a head blocker thread.

BEGIN TRAN

UPDATE dbo.lpdl
SET filler = ‘xyz’
WHERE id = 1

–ROLLBACK TRAN

Open another session (Session 2) and run the below script.

BEGIN TRAN

UPDATE dbo.lpdl
SET filler = ‘xyz’
WHERE id = 2

SELECT * FROM dbo.lpdl WITH(TABLOCKX)

ROLLBACK TRAN

The update statement in the above script will succeed and the SELECT will be blocked due to the (TABLOCKX) hint.

Now open one more session (Session 3) and run the below script.

BEGIN TRAN

UPDATE dbo.lpdl
SET filler = ‘xyz’
WHERE id = 3

SELECT * FROM dbo.lpdl WITH(TABLOCKX)

ROLLBACK TRAN

The update statement in the above script will probably succeed and the SELECT will be blocked due to the (TABLOCKX) hint. The deadlock can be simulated only when the IX lock request of Session 3 succeed and it is waiting for X Lock on all partitions. The probability of the IX lock request being succeeded is solely dependent on the local lock partition assigned to the thread. Query the sys.dm_tran_locks to see the lock partitions assigned to each thread.

SELECT request_session_id, resource_lock_partition,request_mode,request_status
FROM sys.dm_tran_locks
WHERE resource_type = ‘OBJECT’
AND request_mode = ‘IX’

My results looks like the following.

request_session_id	resource_lock_partition	request_mode	request_status
51	4	IX	GRANT
55	9	IX	GRANT
56	10	IX	GRANT

If the request_status for any of the session is WAIT, start over by switching the order of queries or by starting new connections.

If you see all IX locks in GRANT state as above, Check the X lock requests also. Session2 will be waiting for an X lock on the local partition of Session 1and Session 3 will be waiting for the X lock on partition 0 since all distributed acquire must start on partition 0.

At this point, go back to Session1 and run the ROLLBACK TRAN command to remove the blocking. Now session 3 will fail with the deadlock error.

Analyze

Lets build the deadlock analysis table from the trace data to see what is happening here.

Process	Resource	Mode	State	Command
process4eb4748	OBJECT:26:242099903:0	X	WAIT	SELECT
process4eb4748	OBJECT: 26:242099903:10	IX	GRANT	UPDATE
process4c9a153	OBJECT: 26:242099903:10	X	WAIT	SELECT
process4c9a153	OBJECT:26:242099903:0	X	GRANT	SELECT

This tells us that process process4eb4748 (Session 3) is holding an IX lock on lock partition 10 and it is waiting for an X lock on lock partition 0. Remember that X lock needs to be acquired on all lock partitions starting from lock partition 0.

Meanwhile, process process4c9a153 is holding an X lock on lock partition 0 (It is also having X lock on all lock partitions from 0 to 9) and is waiting for an X lock on lock partition 10. Since both the processes are holding and waiting for non-compatible locks on each other, a deadlock ensues.

This deadlock will never happen on a server where lock partitioning is not enabled because, due to the FIFO model of Lock Manager, the first UPDATE statement of the third session will always be blocked by the second session.

Fix

This is an extremely rare scenario. The key to avoid these kind of issues is avoiding Object level locks. Don’t start looking for a trace flag to disable lock partitioning (which is btw –T1229). Lock partitioning is an extremely useful feature for reducing lock contention especially on Sch-M, SCh-S, IS and IX locks. Remember that all queries requires Sch-S/Sch-M locks not just on the tables but also on several internal metadata structures.

Note: Apparently there is a bug in the Lock partitioning implementation which causes a distributed acquire for Sch-S locks which can result in deadlock condition. This issue is fixed in SQL 2012.

UPDATE : Robert Dorr discusses the above mentioned bug in his blog post titled “Strange Sch-S / Sch-M Deadlock on Machines with 16 or More Schedulers”

This post is part of the series SQL Server deadlocks and live locks – Common Patterns. If you have questions, please leave a comment here or reach me at @sqlindian

Written by sqlindian

July 18, 2012 at 8:00 PM

Posted in Deadlocks, Locks

Tagged with Deadlock, Lock Partition

3 Responses

Subscribe to comments with RSS.

Here’s another demo script without using index hints. This deadlock involves Sch-S and Sch-M locks. This script also just get blocked when lock partitioning is not enabled and causes deadlock when it is enabled. The probability of the deadlock still depends upn the lock partition number.

Execute the below scripts in the order.

Session1

BEGIN TRAN

UPDATE dbo.lpdl
SET filler = ‘xyz’
WHERE id = 3

Session 2
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
IF EXISTS(SELECT 1 FROM dbo.lpdl)
alter table dbo.lpdl check constraint all
rollback tran

Session 3
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
IF EXISTS(SELECT 1 FROM dbo.lpdl)
alter table dbo.lpdl check constraint all
rollback tran

Now go back to session 1 and run ROLLBACK TRAN. Session 3 will fail with a deadlock error.

sqlindian

July 19, 2012 at 12:58 AM

Reply
[…] Regardless of whether lock portioning was enabled or not, the above script will cause a deadlock. But there are extremely rare scenarios where the lock partitioning feature will cause a deadlock whereas the normal locking behavior will only cause blocking. I will talk about that in the next post. […]

Deadlocks Involving Lock Partitions « sqlindian – On Locks and Deadlocks

October 11, 2012 at 11:29 PM

Reply
Is this deadlock due to lock partitioning?

Proc [Database Id = 119 Object Id = 1650104919]

Proc [Database Id = 119 Object Id = 1650104919]

Proc [Database Id = 119 Object Id = 1650104919]

Free Huh

February 20, 2014 at 10:00 PM

Reply

sqlindian – On Locks and Deadlocks