PostgreSQL How to deal TUPLE LOCK : 2 - one more transactions waiting one more transactions release tuple lock

14 minute read

背景

接上一篇 :

《PostgreSQL How to deal TUPLE LOCK : 1 - “One transaction lock single or multiple tuples rows”》

http://blog.163.com/digoal@126/blog/static/16387704020130312271679/

本文基于以下版本进行讲解(已经修复了multixact的问题).

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b78647a0e6f7b110273e98601f26d3d1db0ad931

在这个补丁前, 关于MultiXact情形下key share锁升级为share锁的问题, 我们看看会发生什么?

SESSION A :

digoal=# create table test (id int);  
CREATE TABLE  
digoal=# insert into test values (1),(2);  
INSERT 0 2  
digoal=# begin;  
BEGIN  
digoal=# select * from test for no key update;  
 id   
----  
  1  
  2  
(2 rows)  

SESSION B :

digoal=# begin;  
BEGIN  
digoal=# select * from test for key share;  
 id   
----  
  1  
  2  
(2 rows)  

接下来BUG发生了, 居然能够上升到share锁, “no key update和share是冲突的, 不应该上升”.

问题出在哪里呢? 因为此时tuple lock的信息不是存储在infomask里面的, 而是存储在pg_multixact里面, 所以问题一定是出在multixact的锁的处理上.

digoal=# select * from test for share;  
 id   
----  
  1  
  2  
(2 rows)  

补丁请参见 :

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b78647a0e6f7b110273e98601f26d3d1db0ad931;hp=77a3082fc546774808b76f58173caec3852ebf62

打补丁后, 看看情况如何?

SESSION B 执行 digoal=# select * from test for share;时等待锁. 没有发生锁升级的情况.

说明这个补丁已经修复了前面提到的BUG.

进入主题

上一篇BLOG讲的是单事务锁单行或多行的情形, 不存在并发.

这一篇要讲多个事务锁单行或多行的情形, 主要解释一下几个问题。

1. 多个事务锁锁单行, 这些锁信息依靠tuple head的t_infomask以及t_infomask2来存储是不够用的.

那么这些锁信息存储在哪里呢?

2. 当多个事务请求锁同一条tuple时, 是如何处理的? 锁请求顺序如何?

1. 单条记录同时被多个事务共享锁, 这些锁信息依靠tuple head的t_infomask以及t_infomask2来存储是不够用的.

那么这些锁信息存储在哪里呢?

当多个事务共享锁单条记录时, 该记录的tuple head中的t_infomask HEAP_XMAX_IS_MULTI 比特位将会被设置.

同时HeapTupleFields.t_xmax 将改为MultiXactId, 而非TransactionId.

这些锁信息存储在以下数据结构 :

每个MultiXactId中包含多个member(member数组).

00068 typedef struct xl_multixact_create  
00069 {  
00070     MultiXactId mid;            /* new MultiXact's ID */  
00071     MultiXactOffset moff;       /* its starting offset in members file */  
00072     int32       nmembers;       /* number of member XIDs */  
00073     MultiXactMember members[FLEXIBLE_ARRAY_MEMBER];  
00074 } xl_multixact_create;  

member的数据结构如下, 包含事务ID以及状态信息 :

00052 typedef struct MultiXactMember  
00053 {  
00054     TransactionId   xid;  
00055     MultiXactStatus status;  
00056 } MultiXactMember;  

状态信息是个枚举类型, 如下, 用来存储锁信息 :

00032 /*  
00033  * Possible multixact lock modes ("status").  The first four modes are for  
00034  * tuple locks (FOR KEY SHARE, FOR SHARE, FOR NO KEY UPDATE, FOR UPDATE); the  
00035  * next two are used for update and delete modes.  
00036  */  
00037 typedef enum  
00038 {  
00039     MultiXactStatusForKeyShare = 0x00,  
00040     MultiXactStatusForShare = 0x01,  
00041     MultiXactStatusForNoKeyUpdate = 0x02,  
00042     MultiXactStatusForUpdate = 0x03,  
00043     /* an update that doesn't touch "key" columns */  
00044     MultiXactStatusNoKeyUpdate = 0x04,  
00045     /* other updates, and delete */  
00046     MultiXactStatusUpdate = 0x05  
00047 } MultiXactStatus;  

看起来好像有2个是重复的, 如下 :

MultiXactStatusForNoKeyUpdate = 0x02 以及 MultiXactStatusNoKeyUpdate = 0x04 ;   
MultiXactStatusForUpdate = 0x03 以及 MultiXactStatusUpdate = 0x05 ;   

其实没有重复, 因为一种是显锁(select for …), 一种是隐锁(update).

来看个例子区分一下 :

先使用A和B会话, 设置TUPLE的头信息中的t_infomask.HEAP_XMAX_IS_MULTI .

SESSION A :

digoal=# begin;  
BEGIN  
digoal=# select * from test for key share;  
 id   
----  
  1  
  2  
(2 rows)  

SESSION B :

digoal=# begin;  
BEGIN  
digoal=# select * from test for key share;  
 id   
----  
  1  
  2  
(2 rows)  

SESSION C :

digoal=# begin;  
BEGIN  
digoal=# select * from test for no key update;  
 id   
----  
  1  
  2  
(2 rows)  

SESSION D :

digoal=# select * from pgrowlocks('test');  
 locked_row | locker | multi |       xids       |                     modes                     |        pids           
------------+--------+-------+------------------+-----------------------------------------------+---------------------  
 (0,1)      |     25 | t     | {1774,1775,1776} | {"Key Share","Key Share","For No Key Update"} | {10099,10102,12822}  
 (0,2)      |     25 | t     | {1774,1775,1776} | {"Key Share","Key Share","For No Key Update"} | {10099,10102,12822}  
(2 rows)  

注意这里SESSION C获得的是显锁, For No Key Update.

再来看看隐锁 :

先使用A和B会话, 设置TUPLE的头信息中的t_infomask.HEAP_XMAX_IS_MULTI .

SESSION A :

digoal=# begin;  
BEGIN  
digoal=# select * from test for key share;  
 id | info   
----+------  
  1 |   
  2 |   
(2 rows)  

SESSION B :

digoal=# select * from test for key share;  
 id | info   
----+------  
  1 |   
  2 |   
(2 rows)  

SESSION C :

digoal=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
          12822  
(1 row)  

SESSION C GDB :

gdb  
  
(gdb) attach 12822  
(gdb) b MultiXactIdWait  
Breakpoint 1 at 0x48b1f0: file heapam.c, line 5403.  

SESSION C :

digoal=# begin;  
BEGIN  
digoal=# update test set info='test' where id=1;  

SESSION C GDB :

Breakpoint 1, MultiXactIdWait (multi=26, status=MultiXactStatusNoKeyUpdate, remaining=0x7ffff88ddb94, infomask=4496)  
    at heapam.c:5403  
5403            Do_MultiXactIdWait(multi, status, remaining, infomask, false);  
(gdb) p status  
$1 = MultiXactStatusNoKeyUpdate  
(gdb) c  
Continuing.  

在gdb中打印status, 显然等待的是MultiXactStatusNoKeyUpdate锁, 而不是MultiXactStatusForNoKeyUpdate . 这就是显锁和隐锁的区别.

SESSION D :

digoal=# select * from pgrowlocks('test');  
 locked_row | locker | multi |       xids       |                   modes                   |        pids           
------------+--------+-------+------------------+-------------------------------------------+---------------------  
 (0,1)      |     27 | t     | {1781,1782,1783} | {"Key Share","Key Share","No Key Update"} | {10099,10102,12822}  
 (0,2)      |     26 | t     | {1781,1782}      | {"Key Share","Key Share"}                 | {10099,10102}  
(2 rows)  

注意这里SESSION C获得的是隐锁, No Key Update. 与之前的例子用select … for no key update的到的不一样.

2. 当多个事务请求锁同一条tuple时, 是如何处理的? 锁请求顺序如何?

为什么要关心多个事务对同一条TUPLE请求锁呢, 来看一个场景.

假设按顺序在不同的会话中执行以下SQL.

SESSION A :

select * from test for share;  

SESSION B :

select * from test for share;  

SESSION C :

select * from test for update;  
-- 这里等待是毋庸置疑的.  

SESSION D :

select * from test for share;  
-- 这里需要等待吗?  
-- 如果SESSION D不需要等待锁, 那么持续不断的涌现出类似SESSION D这样的SQL, SESSION C就杯具了, 一直需要等待下去.  
-- 当然这种事情不会发生. 因为C会先获得锁, D继续等待. 原因看后面的介绍.  

下面使用gdb来跟踪一下以上场景, 看看SESSION D要不要等待. :

SESSION A :

digoal=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
          10099  
(1 row)  
digoal=# begin;  
BEGIN  
digoal=# select * from test for share;  
 id | info   
----+------  
  2 |   
  1 | test  
(2 rows)  

SESSION B :

digoal=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
          10102  
(1 row)  
digoal=# begin;  
BEGIN  
digoal=# select * from test for share;  
 id | info   
----+------  
  2 |   
  1 | test  
(2 rows)  

SESSION C GDB :

[root@db-172-16-3-150 ~]# gdb  
(gdb) attach 12822  
Attaching to process 12822  
(gdb) b LockTuple  
Breakpoint 1 at 0x65d240: file lmgr.c, line 388.  
(gdb) c  
Continuing.  

SESSION C :

digoal=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
          12822  
(1 row)  
digoal=# begin;  
BEGIN  
digoal=# select * from test for update;  

SESSION C GDB :

Breakpoint 1, LockTuple (relation=0x2b7faf660578, tid=0x7ffff88ddce4, lockmode=8) at lmgr.c:388  
388     {  
(gdb) n  
391             SET_LOCKTAG_TUPLE(tag,  
(gdb) p tag  
$1 = {locktag_field1 = 8, locktag_field2 = 0, locktag_field3 = 2, locktag_field4 = 0, locktag_type = 0 '\000',   
  locktag_lockmethodid = 0 '\000'}  
(gdb) n  
388     {  
(gdb)   
391             SET_LOCKTAG_TUPLE(tag,  
(gdb) p tag  
$2 = {locktag_field1 = 8, locktag_field2 = 0, locktag_field3 = 2, locktag_field4 = 0, locktag_type = 0 '\000',   
  locktag_lockmethodid = 0 '\000'}  
(gdb) n  
397             (void) LockAcquire(&tag, lockmode, false, false);  
(gdb) p lockmode  
$3 = 8  
(gdb) p tag  
$4 = {locktag_field1 = 16384, locktag_field2 = 16742, locktag_field3 = 0, locktag_field4 = 0, locktag_type = 0 '\000',   
  locktag_lockmethodid = 0 '\000'}  
(gdb) n  
391             SET_LOCKTAG_TUPLE(tag,  
(gdb)   
397             (void) LockAcquire(&tag, lockmode, false, false);  
(gdb)   
398     }  
(gdb)   
heap_lock_tuple (relation=0x2b7faf660578, tuple=0x7ffff88ddce0, cid=0, mode=LockTupleExclusive, nowait=0 '\000',   
    follow_updates=1 '\001', buffer=0x7ffff88ddd18, hufd=0x7ffff88ddd00) at heapam.c:4011  
4011                    if (mode == LockTupleKeyShare)  
(gdb)   
4085                    else if (mode == LockTupleShare)  
(gdb)   
4106                    else if (mode == LockTupleNoKeyExclusive)  
(gdb)   
4192                            if (infomask & HEAP_XMAX_IS_MULTI)  
(gdb)   
4194                                    MultiXactStatus status = get_mxact_status_for_lock(mode, false);  
(gdb)   
4197                                    if (status >= MultiXactStatusNoKeyUpdate)  
(gdb)   
4201                                    if (nowait)  
(gdb)   
4211                                            MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);  
(gdb) p xwait  
$5 = 28  
-- 因为SESSION A和SESSION B共享锁住了2行, 所以锁信息在pg_multixact中, 因此这里跟踪到的是MultiXactIdWait.  
-- 如果是单事务锁, 这里应该跟踪到XactLockTableWait.  
(gdb) n  
... 等待锁.  

SESSION X :

digoal=# select * from pgrowlocks('test');  
 locked_row | locker | multi |    xids     |     modes     |     pids        
------------+--------+-------+-------------+---------------+---------------  
 (0,2)      |     28 | t     | {1785,1786} | {Share,Share} | {10099,10102}  
 (0,3)      |     28 | t     | {1785,1786} | {Share,Share} | {10099,10102}  
(2 rows)  
  
digoal=# select * from heap_page_items(get_raw_page('test', 0));  
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid   
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------  
  1 |   8160 |        1 |     28 |      2 |     27 |        0 | (0,3)  |       16385 |       4416 |     24 |        |        
  2 |   8128 |        1 |     28 |      2 |     28 |        0 | (0,2)  |           1 |       4560 |     24 |        |        
  3 |   8088 |        1 |     33 |   1783 |     28 |        0 | (0,3)  |       32770 |      12754 |     24 |        |        
(3 rows)  

gdb跟踪到的xwait就是这里的t_max : 28.

SESSION D :

digoal=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
          14886  
(1 row)  

SESSION D GDB :

(gdb) attach 14886  
Attaching to program: /home/pgdev/pgsql9.3/bin/postgres, process 14886  
(gdb) b LockTuple  
Note: breakpoint 1 also set at pc 0x65d240.  
Breakpoint 2 at 0x65d240: file lmgr.c, line 388.  
(gdb) c  
Continuing.  

SESSION D :

digoal=# begin;  
BEGIN  
digoal=# select * from test for share;  

SESSION D GDB :

Breakpoint 1, LockTuple (relation=0x2b7faf643e30, tid=0x7ffff88ddce4, lockmode=2) at lmgr.c:388  
388     {  
(gdb) n  
391             SET_LOCKTAG_TUPLE(tag,  
(gdb)   
388     {  
(gdb)   
391             SET_LOCKTAG_TUPLE(tag,  
(gdb)   
397             (void) LockAcquire(&tag, lockmode, false, false);  
(gdb)   
391             SET_LOCKTAG_TUPLE(tag,  
(gdb)   
397             (void) LockAcquire(&tag, lockmode, false, false);  
(gdb)   
  
等待, 未跟踪到MultiXactIdWait.   
那这里其实是进入了LockAcquireExtended函数的这个逻辑  
  
00900     /*  
00901      * If lock requested conflicts with locks requested by waiters, must join  
00902      * wait queue.  Otherwise, check for conflict with already-held locks.  
00903      * (That's last because most complex check.)  
00904      */  
00905     if (lockMethodTable->conflictTab[lockmode] & lock->waitMask)  
00906         status = STATUS_FOUND;  
00907     else  
00908         status = LockCheckConflicts(lockMethodTable, lockmode,  
00909                                     lock, proclock, MyProc);  

另一种方法是使用开发参数进行跟踪 :

SESSION C :

digoal=# \set VERBOSITY verbose  
digoal=# set trace_locks = on;  
SET  
digoal=# set trace_userlocks = on;  
SET  
digoal=# set client_min_messages=debug5;  
DEBUG:  00000: CommitTransactionCommand  
LOCATION:  finish_xact_command, postgres.c:2409  
DEBUG:  00000: CommitTransaction  
LOCATION:  ShowTransactionState, xact.c:4388  
DEBUG:  00000: name: unnamed; blockState:       STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:   
LOCATION:  ShowTransactionStateRec, xact.c:4426  
LOG:  00000: LockReleaseAll: lockmethod=1  
LOCATION:  LockReleaseAll, lock.c:1941  
LOG:  00000: LockReleaseAll done  
LOCATION:  LockReleaseAll, lock.c:2164  
LOG:  00000: LockReleaseAll: lockmethod=2  
LOCATION:  LockReleaseAll, lock.c:1941  
LOG:  00000: LockReleaseAll done  
LOCATION:  LockReleaseAll, lock.c:2164  
SET  
digoal=# begin;  
DEBUG:  00000: StartTransactionCommand  
LOCATION:  start_xact_command, postgres.c:2385  
DEBUG:  00000: StartTransaction  
LOCATION:  ShowTransactionState, xact.c:4388  
DEBUG:  00000: name: unnamed; blockState:       DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:   
LOCATION:  ShowTransactionStateRec, xact.c:4426  
DEBUG:  00000: ProcessUtility  
LOCATION:  PortalRunUtility, pquery.c:1154  
DEBUG:  00000: CommitTransactionCommand  
LOCATION:  finish_xact_command, postgres.c:2409  
BEGIN  
  
digoal=# select * from test for update;  
  
DEBUG:  00000: StartTransactionCommand  
LOCATION:  start_xact_command, postgres.c:2385  
LOG:  00000: LockAcquire: lock [16384,16742] RowShareLock  
LINE 1: select * from test for update;  
                      ^  
LOCATION:  LockAcquireExtended, lock.c:729  
LOG:  00000: LockAcquire: lock [16384,16742] AccessShareLock  
LOCATION:  LockAcquireExtended, lock.c:729  
LOG:  00000: LockRelease: lock [16384,16742] AccessShareLock  
LOCATION:  LockRelease, lock.c:1751  
LOG:  00000: LockAcquire: lock [16384,16745] AccessShareLock  
LOCATION:  LockAcquireExtended, lock.c:729  
LOG:  00000: LockAcquire: lock [16384,16742] RowShareLock  
LOCATION:  LockAcquireExtended, lock.c:729  
LOG:  00000: LockAcquire: lock [16384,16742] AccessExclusiveLock  
LOCATION:  LockAcquireExtended, lock.c:729  
LOG:  00000: LockAcquire: new: lock(0x2b7faf13b898) id(16384,16742,0,2,3,1) grantMask(0) req(0,0,0,0,0,0,0)=0 grant(0,0,0,0,0,0,0)=0 wait(0) type(AccessExclusiveLock)  
LOCATION:  LOCK_PRINT, lock.c:318  
LOG:  00000: LockAcquire: new: proclock(0x2b7faf1c12c8) lock(0x2b7faf13b898) method(1) proc(0x2b7faf3da6a0) hold(0)  
LOCATION:  PROCLOCK_PRINT, lock.c:330  
LOG:  00000: LockCheckConflicts: no conflict: proclock(0x2b7faf1c12c8) lock(0x2b7faf13b898) method(1) proc(0x2b7faf3da6a0) hold(0)  
LOCATION:  PROCLOCK_PRINT, lock.c:330  
-- 先检测与已经存在的请求锁的队列中的锁是否有冲突. 有冲突则加入等待队列. 没有冲突则与已经HOLD的锁检测冲突性.如果发现冲突则加入队列中.  
-- 这就能解释本文的第二个问题了.  
00900     /*  
00901      * If lock requested conflicts with locks requested by waiters, must join  
00902      * wait queue.  Otherwise, check for conflict with already-held locks.  
00903      * (That's last because most complex check.)  
00904      */  
LOG:  00000: GrantLock: lock(0x2b7faf13b898) id(16384,16742,0,2,3,1) grantMask(100) req(0,0,0,0,0,0,0)=1 grant(0,0,0,0,0,0,0)=1 wait(0) type(AccessExclusiveLock)  
LOCATION:  LOCK_PRINT, lock.c:318  

SESSION D :

digoal=# set trace_locks = on;  
SET  
digoal=# set trace_userlocks = on;  
SET  
digoal=# \set VERBOSITY verbose  
digoal=# set client_min_messages=debug5;  
DEBUG:  00000: CommitTransactionCommand  
LOCATION:  finish_xact_command, postgres.c:2409  
DEBUG:  00000: CommitTransaction  
LOCATION:  ShowTransactionState, xact.c:4388  
DEBUG:  00000: name: unnamed; blockState:       STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:   
LOCATION:  ShowTransactionStateRec, xact.c:4426  
LOG:  00000: LockReleaseAll: lockmethod=1  
LOCATION:  LockReleaseAll, lock.c:1941  
LOG:  00000: LockReleaseAll done  
LOCATION:  LockReleaseAll, lock.c:2164  
LOG:  00000: LockReleaseAll: lockmethod=2  
LOCATION:  LockReleaseAll, lock.c:1941  
LOG:  00000: LockReleaseAll done  
LOCATION:  LockReleaseAll, lock.c:2164  
SET  
digoal=# begin;  
DEBUG:  00000: StartTransactionCommand  
LOCATION:  start_xact_command, postgres.c:2385  
DEBUG:  00000: StartTransaction  
LOCATION:  ShowTransactionState, xact.c:4388  
DEBUG:  00000: name: unnamed; blockState:       DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:   
LOCATION:  ShowTransactionStateRec, xact.c:4426  
DEBUG:  00000: ProcessUtility  
LOCATION:  PortalRunUtility, pquery.c:1154  
DEBUG:  00000: CommitTransactionCommand  
LOCATION:  finish_xact_command, postgres.c:2409  
BEGIN  
  
digoal=# select * from test for share;  
DEBUG:  00000: StartTransactionCommand  
LOCATION:  start_xact_command, postgres.c:2385  
  
LOG:  00000: LockAcquire: lock [16384,16742] RowShareLock  
LINE 1: select * from test for share;  
                      ^  
LOCATION:  LockAcquireExtended, lock.c:729  

对应代码:

00725 #ifdef LOCK_DEBUG  
00726     if (LOCK_DEBUG_ENABLED(locktag))  
00727         elog(LOG, "LockAcquire: lock [%u,%u] %s",  
00728              locktag->locktag_field1, locktag->locktag_field2,  
00729              lockMethodTable->lockModeNames[lockmode]);  
00730 #endif  
  
这里的16384是dbid, 16742是relid, 代码如下 :   
  
00240 #define SET_LOCKTAG_TUPLE(locktag,dboid,reloid,blocknum,offnum) \  
00241     ((locktag).locktag_field1 = (dboid), \  
00242      (locktag).locktag_field2 = (reloid), \  
00243      (locktag).locktag_field3 = (blocknum), \  
00244      (locktag).locktag_field4 = (offnum), \  
00245      (locktag).locktag_type = LOCKTAG_TUPLE, \  
00246      (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)  

验证如下 :

pgdev@db-172-16-3-150-> psql  
psql (9.3devel)  
Type "help" for help.  
digoal=# select oid,datname from pg_database where oid=16384;  
  oid  | datname   
-------+---------  
 16384 | digoal  
(1 row)  
digoal=# \c digoal  
You are now connected to database "digoal" as user "postgres".  
digoal=# select relname,relkind from pg_class where oid=16742;  
 relname | relkind   
---------+---------  
 test    | r  
(1 row)  
  
LOG:  00000: LockAcquire: lock [16384,16742] AccessShareLock  
LOCATION:  LockAcquireExtended, lock.c:729  
  
LOG:  00000: LockRelease: lock [16384,16742] AccessShareLock  
LOCATION:  LockRelease, lock.c:1751  
  
LOG:  00000: LockAcquire: lock [16384,16745] AccessShareLock  
LOCATION:  LockAcquireExtended, lock.c:729  
  
LOG:  00000: LockAcquire: lock [16384,16742] RowShareLock  
LOCATION:  LockAcquireExtended, lock.c:729  
  
LOG:  00000: LockAcquire: lock [16384,16742] RowShareLock  
LOCATION:  LockAcquireExtended, lock.c:729  
  
LOG:  00000: LockAcquire: found: lock(0x2b7faf13b898) id(16384,16742,0,2,3,1) grantMask(100) req(0,0,0,0,0,0,0)=1 grant(0,0,0,0,0,0,0)=1 wait(0) type(RowShareLock)  
LOCATION:  LOCK_PRINT, lock.c:318  

代码如下 :

00298 inline static void  
00299 LOCK_PRINT(const char *where, const LOCK *lock, LOCKMODE type)  
00300 {  
00301     if (LOCK_DEBUG_ENABLED(&lock->tag))  
00302         elog(LOG,  
00303              "%s: lock(%p) id(%u,%u,%u,%u,%u,%u) grantMask(%x) "  
00304              "req(%d,%d,%d,%d,%d,%d,%d)=%d "  
00305              "grant(%d,%d,%d,%d,%d,%d,%d)=%d wait(%d) type(%s)",  
00306              where, lock,  
00307              lock->tag.locktag_field1, lock->tag.locktag_field2,  
00308              lock->tag.locktag_field3, lock->tag.locktag_field4,  
00309              lock->tag.locktag_type, lock->tag.locktag_lockmethodid,  
00310              lock->grantMask,  
00311              lock->requested[1], lock->requested[2], lock->requested[3],  
00312              lock->requested[4], lock->requested[5], lock->requested[6],  
00313              lock->requested[7], lock->nRequested,  
00314              lock->granted[1], lock->granted[2], lock->granted[3],  
00315              lock->granted[4], lock->granted[5], lock->granted[6],  
00316              lock->granted[7], lock->nGranted,  
00317              lock->waitProcs.size,  
00318              LockMethods[LOCK_LOCKMETHOD(*lock)]->lockModeNames[type]);  
00319 }  
  
tag枚举列表如下  
  
00158 /*  
00159  * LOCKTAG is the key information needed to look up a LOCK item in the  
00160  * lock hashtable.  A LOCKTAG value uniquely identifies a lockable object.  
00161  *  
00162  * The LockTagType enum defines the different kinds of objects we can lock.  
00163  * We can handle up to 256 different LockTagTypes.  
00164  */  
00165 typedef enum LockTagType  
00166 {  
00167     LOCKTAG_RELATION,           /* whole relation */  
00168     /* ID info for a relation is DB OID + REL OID; DB OID = 0 if shared */  
00169     LOCKTAG_RELATION_EXTEND,    /* the right to extend a relation */  
00170     /* same ID info as RELATION */  
00171     LOCKTAG_PAGE,               /* one page of a relation */  
00172     /* ID info for a page is RELATION info + BlockNumber */  
00173     LOCKTAG_TUPLE,              /* one physical tuple */  
00174     /* ID info for a tuple is PAGE info + OffsetNumber */  
00175     LOCKTAG_TRANSACTION,        /* transaction (for waiting for xact done) */  
00176     /* ID info for a transaction is its TransactionId */  
00177     LOCKTAG_VIRTUALTRANSACTION, /* virtual transaction (ditto) */  
00178     /* ID info for a virtual transaction is its VirtualTransactionId */  
00179     LOCKTAG_OBJECT,             /* non-relation database object */  
00180     /* ID info for an object is DB OID + CLASS OID + OBJECT OID + SUBID */  
00181   
00182     /*  
00183      * Note: object ID has same representation as in pg_depend and  
00184      * pg_description, but notice that we are constraining SUBID to 16 bits.  
00185      * Also, we use DB OID = 0 for shared objects such as tablespaces.  
00186      */  
00187     LOCKTAG_USERLOCK,           /* reserved for old contrib/userlock code */  
00188     LOCKTAG_ADVISORY            /* advisory user locks */  
00189 } LockTagType;  
  
LOCK数据结构如下 :   
  
00281 /*  
00282  * Per-locked-object lock information:  
00283  *  
00284  * tag -- uniquely identifies the object being locked  
00285  * grantMask -- bitmask for all lock types currently granted on this object.  
00286  * waitMask -- bitmask for all lock types currently awaited on this object.  
00287  * procLocks -- list of PROCLOCK objects for this lock.  
00288  * waitProcs -- queue of processes waiting for this lock.  
00289  * requested -- count of each lock type currently requested on the lock  
00290  *      (includes requests already granted!!).  
00291  * nRequested -- total requested locks of all types.  
00292  * granted -- count of each lock type currently granted on the lock.  
00293  * nGranted -- total granted locks of all types.  
00294  *  
00295  * Note: these counts count 1 for each backend.  Internally to a backend,  
00296  * there may be multiple grabs on a particular lock, but this is not reflected  
00297  * into shared memory.  
00298  */  
00299 typedef struct LOCK  
00300 {  
00301     /* hash key */  
00302     LOCKTAG     tag;            /* unique identifier of lockable object */  
00303   
00304     /* data */  
00305     LOCKMASK    grantMask;      /* bitmask for lock types already granted */  
00306     LOCKMASK    waitMask;       /* bitmask for lock types awaited */  
00307     SHM_QUEUE   procLocks;      /* list of PROCLOCK objects assoc. with lock */  
00308     PROC_QUEUE  waitProcs;      /* list of PGPROC objects waiting on lock */  
00309     int         requested[MAX_LOCKMODES];       /* counts of requested locks */  
00310     int         nRequested;     /* total of requested[] array */  
00311     int         granted[MAX_LOCKMODES]; /* counts of granted locks */  
00312     int         nGranted;       /* total of granted[] array */  
00313 } LOCK;  
  
解释 id(16384,16742,0,2,3,1)  
  
00193 /*  
00194  * The LOCKTAG struct is defined with malice aforethought to fit into 16  
00195  * bytes with no padding.  Note that this would need adjustment if we were  
00196  * to widen Oid, BlockNumber, or TransactionId to more than 32 bits.  
00197  *  
00198  * We include lockmethodid in the locktag so that a single hash table in  
00199  * shared memory can store locks of different lockmethods.  
00200  */  
00201 typedef struct LOCKTAG  
00202 {  
00203     uint32      locktag_field1; /* a 32-bit ID field */  
00204     uint32      locktag_field2; /* a 32-bit ID field */  
00205     uint32      locktag_field3; /* a 32-bit ID field */  
00206     uint16      locktag_field4; /* a 16-bit ID field */  
00207     uint8       locktag_type;   /* see enum LockTagType */  
00208     uint8       locktag_lockmethodid;   /* lockmethod indicator */  
00209 } LOCKTAG;  
00132 /* These identify the known lock methods */  
00133 #define DEFAULT_LOCKMETHOD  1  
00134 #define USER_LOCKMETHOD     2  
  
本例是行锁, 所以设置tag的值对应以下宏 :   
  
00240 #define SET_LOCKTAG_TUPLE(locktag,dboid,reloid,blocknum,offnum) \  
00241     ((locktag).locktag_field1 = (dboid), \  
00242      (locktag).locktag_field2 = (reloid), \  
00243      (locktag).locktag_field3 = (blocknum), \  
00244      (locktag).locktag_field4 = (offnum), \  
00245      (locktag).locktag_type = LOCKTAG_TUPLE, \  
00246      (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)  
  
因此 id(16384,16742,0,2,3,1)  
  
16384 : lock->tag.locktag_field1 = dboid  
16742 : lock->tag.locktag_field2 = reloid  
0 : lock->tag.locktag_field3 = blocknum  
2 : lock->tag.locktag_field4 = offnum  
3 : lock->tag.locktag_type = LOCKTAG_TUPLE (见LockTagType枚举)  
1 : lock->tag.locktag_lockmethodid = DEFAULT_LOCKMETHOD (见DEFAULT_LOCKMETHOD定义)  

offnum对应CTID :

digoal=# select ctid,* from test;  
 ctid  | id | info   
-------+----+------  
 (0,2) |  2 |   
 (0,3) |  1 | test  
(2 rows)  
  
LOG:  00000: LockAcquire: new: proclock(0x2b7faf1c1160) lock(0x2b7faf13b898) method(1) proc(0x2b7faf3da0e0) hold(0)  
LOCATION:  PROCLOCK_PRINT, lock.c:330  
  
LOG:  00000: LockCheckConflicts: conflicting: proclock(0x2b7faf1c1160) lock(0x2b7faf13b898) method(1) proc(0x2b7faf3da0e0) hold(0)  
LOCATION:  PROCLOCK_PRINT, lock.c:330  
  
LOG:  00000: WaitOnLock: sleeping on lock: lock(0x2b7faf13b898) id(16384,16742,0,2,3,1) grantMask(100) req(0,1,0,0,0,0,0)=2 grant(0,0,0,0,0,0,0)=1 wait(0) type(RowShareLock)  
LOCATION:  LOCK_PRINT, lock.c:318  

小结

1.
单条记录同时被多个事务共享锁时, 锁信息存储在pg_multixact中.

记录的头信息中t_infomask HEAP_XMAX_IS_MULTI 比特位将会被设置。

记录的头信息中t_xmax代表的是MultiXactId.

2.

当多个事务并发请求锁同一条tuple时, 先请求锁的会话如果与已经存在的锁发生冲突, 将会产生一个锁队列, 后来的会话请求锁时需要先检测锁队列是否冲突再检测与已经HOLD住的锁是否冲突. 因此不会造成多个共享锁持续占用TUPLE而独占锁一直都请求不到锁的情况。

参考

1. src/backend/access/heap/README.tuplock

2. src/include/access/htup_details.h

3. src/include/access/multixact.h

4. src/backend/access/transam/multixact.c

5. src/include/access/heapam.h

6. src/backend/access/heap/heapam.c

7. src/backend/utils/adt/lockfuncs.c

8. src/backend/storage/lmgr/lock.c

9. src/backend/storage/lmgr/lmgr.c

10. http://www.postgresql.org/docs/devel/static/pgrowlocks.html

11. http://www.postgresql.org/docs/devel/static/pageinspect.html

12. https://github.com/postgres/postgres/commit/0ac5ad5134f2769ccbaefec73844f8504c4d6182

13. http://blog.163.com/digoal@126/blog/static/16387704020130249109133/

14. http://blog.163.com/digoal@126/blog/static/16387704020130305109687/

15. http://blog.163.com/digoal@126/blog/static/163877040201210134586363/

16. http://blog.163.com/digoal@126/blog/static/1638770402011515105557166/

17. http://blog.163.com/digoal@126/blog/static/16387704020130312271679/

Flag Counter

digoal’s 大量PostgreSQL文章入口