PostgreSQL How to deal TUPLE LOCK : 2 - one more transactions waiting one more transactions release tuple lock
背景
接上一篇 :
《PostgreSQL How to deal TUPLE LOCK : 1 - “One transaction lock single or multiple tuples | rows”》 |
http://blog.163.com/digoal@126/blog/static/16387704020130312271679/
本文基于以下版本进行讲解(已经修复了multixact的问题).
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b78647a0e6f7b110273e98601f26d3d1db0ad931
在这个补丁前, 关于MultiXact情形下key share锁升级为share锁的问题, 我们看看会发生什么?
SESSION A :
digoal=# create table test (id int);
CREATE TABLE
digoal=# insert into test values (1),(2);
INSERT 0 2
digoal=# begin;
BEGIN
digoal=# select * from test for no key update;
id
----
1
2
(2 rows)
SESSION B :
digoal=# begin;
BEGIN
digoal=# select * from test for key share;
id
----
1
2
(2 rows)
接下来BUG发生了, 居然能够上升到share锁, “no key update和share是冲突的, 不应该上升”.
问题出在哪里呢? 因为此时tuple lock的信息不是存储在infomask里面的, 而是存储在pg_multixact里面, 所以问题一定是出在multixact的锁的处理上.
digoal=# select * from test for share;
id
----
1
2
(2 rows)
补丁请参见 :
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b78647a0e6f7b110273e98601f26d3d1db0ad931;hp=77a3082fc546774808b76f58173caec3852ebf62
打补丁后, 看看情况如何?
SESSION B 执行 digoal=# select * from test for share;时等待锁. 没有发生锁升级的情况.
说明这个补丁已经修复了前面提到的BUG.
进入主题
二
上一篇BLOG讲的是单事务锁单行或多行的情形, 不存在并发.
这一篇要讲多个事务锁单行或多行的情形, 主要解释一下几个问题。
1. 多个事务锁锁单行, 这些锁信息依靠tuple head的t_infomask以及t_infomask2来存储是不够用的.
那么这些锁信息存储在哪里呢?
2. 当多个事务请求锁同一条tuple时, 是如何处理的? 锁请求顺序如何?
1. 单条记录同时被多个事务共享锁, 这些锁信息依靠tuple head的t_infomask以及t_infomask2来存储是不够用的.
那么这些锁信息存储在哪里呢?
当多个事务共享锁单条记录时, 该记录的tuple head中的t_infomask HEAP_XMAX_IS_MULTI 比特位将会被设置.
同时HeapTupleFields.t_xmax 将改为MultiXactId, 而非TransactionId.
这些锁信息存储在以下数据结构 :
每个MultiXactId中包含多个member(member数组).
00068 typedef struct xl_multixact_create
00069 {
00070 MultiXactId mid; /* new MultiXact's ID */
00071 MultiXactOffset moff; /* its starting offset in members file */
00072 int32 nmembers; /* number of member XIDs */
00073 MultiXactMember members[FLEXIBLE_ARRAY_MEMBER];
00074 } xl_multixact_create;
member的数据结构如下, 包含事务ID以及状态信息 :
00052 typedef struct MultiXactMember
00053 {
00054 TransactionId xid;
00055 MultiXactStatus status;
00056 } MultiXactMember;
状态信息是个枚举类型, 如下, 用来存储锁信息 :
00032 /*
00033 * Possible multixact lock modes ("status"). The first four modes are for
00034 * tuple locks (FOR KEY SHARE, FOR SHARE, FOR NO KEY UPDATE, FOR UPDATE); the
00035 * next two are used for update and delete modes.
00036 */
00037 typedef enum
00038 {
00039 MultiXactStatusForKeyShare = 0x00,
00040 MultiXactStatusForShare = 0x01,
00041 MultiXactStatusForNoKeyUpdate = 0x02,
00042 MultiXactStatusForUpdate = 0x03,
00043 /* an update that doesn't touch "key" columns */
00044 MultiXactStatusNoKeyUpdate = 0x04,
00045 /* other updates, and delete */
00046 MultiXactStatusUpdate = 0x05
00047 } MultiXactStatus;
看起来好像有2个是重复的, 如下 :
MultiXactStatusForNoKeyUpdate = 0x02 以及 MultiXactStatusNoKeyUpdate = 0x04 ;
MultiXactStatusForUpdate = 0x03 以及 MultiXactStatusUpdate = 0x05 ;
其实没有重复, 因为一种是显锁(select for …), 一种是隐锁(update).
来看个例子区分一下 :
先使用A和B会话, 设置TUPLE的头信息中的t_infomask.HEAP_XMAX_IS_MULTI .
SESSION A :
digoal=# begin;
BEGIN
digoal=# select * from test for key share;
id
----
1
2
(2 rows)
SESSION B :
digoal=# begin;
BEGIN
digoal=# select * from test for key share;
id
----
1
2
(2 rows)
SESSION C :
digoal=# begin;
BEGIN
digoal=# select * from test for no key update;
id
----
1
2
(2 rows)
SESSION D :
digoal=# select * from pgrowlocks('test');
locked_row | locker | multi | xids | modes | pids
------------+--------+-------+------------------+-----------------------------------------------+---------------------
(0,1) | 25 | t | {1774,1775,1776} | {"Key Share","Key Share","For No Key Update"} | {10099,10102,12822}
(0,2) | 25 | t | {1774,1775,1776} | {"Key Share","Key Share","For No Key Update"} | {10099,10102,12822}
(2 rows)
注意这里SESSION C获得的是显锁, For No Key Update.
再来看看隐锁 :
先使用A和B会话, 设置TUPLE的头信息中的t_infomask.HEAP_XMAX_IS_MULTI .
SESSION A :
digoal=# begin;
BEGIN
digoal=# select * from test for key share;
id | info
----+------
1 |
2 |
(2 rows)
SESSION B :
digoal=# select * from test for key share;
id | info
----+------
1 |
2 |
(2 rows)
SESSION C :
digoal=# select pg_backend_pid();
pg_backend_pid
----------------
12822
(1 row)
SESSION C GDB :
gdb
(gdb) attach 12822
(gdb) b MultiXactIdWait
Breakpoint 1 at 0x48b1f0: file heapam.c, line 5403.
SESSION C :
digoal=# begin;
BEGIN
digoal=# update test set info='test' where id=1;
SESSION C GDB :
Breakpoint 1, MultiXactIdWait (multi=26, status=MultiXactStatusNoKeyUpdate, remaining=0x7ffff88ddb94, infomask=4496)
at heapam.c:5403
5403 Do_MultiXactIdWait(multi, status, remaining, infomask, false);
(gdb) p status
$1 = MultiXactStatusNoKeyUpdate
(gdb) c
Continuing.
在gdb中打印status, 显然等待的是MultiXactStatusNoKeyUpdate锁, 而不是MultiXactStatusForNoKeyUpdate . 这就是显锁和隐锁的区别.
SESSION D :
digoal=# select * from pgrowlocks('test');
locked_row | locker | multi | xids | modes | pids
------------+--------+-------+------------------+-------------------------------------------+---------------------
(0,1) | 27 | t | {1781,1782,1783} | {"Key Share","Key Share","No Key Update"} | {10099,10102,12822}
(0,2) | 26 | t | {1781,1782} | {"Key Share","Key Share"} | {10099,10102}
(2 rows)
注意这里SESSION C获得的是隐锁, No Key Update. 与之前的例子用select … for no key update的到的不一样.
2. 当多个事务请求锁同一条tuple时, 是如何处理的? 锁请求顺序如何?
为什么要关心多个事务对同一条TUPLE请求锁呢, 来看一个场景.
假设按顺序在不同的会话中执行以下SQL.
SESSION A :
select * from test for share;
SESSION B :
select * from test for share;
SESSION C :
select * from test for update;
-- 这里等待是毋庸置疑的.
SESSION D :
select * from test for share;
-- 这里需要等待吗?
-- 如果SESSION D不需要等待锁, 那么持续不断的涌现出类似SESSION D这样的SQL, SESSION C就杯具了, 一直需要等待下去.
-- 当然这种事情不会发生. 因为C会先获得锁, D继续等待. 原因看后面的介绍.
下面使用gdb来跟踪一下以上场景, 看看SESSION D要不要等待. :
SESSION A :
digoal=# select pg_backend_pid();
pg_backend_pid
----------------
10099
(1 row)
digoal=# begin;
BEGIN
digoal=# select * from test for share;
id | info
----+------
2 |
1 | test
(2 rows)
SESSION B :
digoal=# select pg_backend_pid();
pg_backend_pid
----------------
10102
(1 row)
digoal=# begin;
BEGIN
digoal=# select * from test for share;
id | info
----+------
2 |
1 | test
(2 rows)
SESSION C GDB :
[root@db-172-16-3-150 ~]# gdb
(gdb) attach 12822
Attaching to process 12822
(gdb) b LockTuple
Breakpoint 1 at 0x65d240: file lmgr.c, line 388.
(gdb) c
Continuing.
SESSION C :
digoal=# select pg_backend_pid();
pg_backend_pid
----------------
12822
(1 row)
digoal=# begin;
BEGIN
digoal=# select * from test for update;
SESSION C GDB :
Breakpoint 1, LockTuple (relation=0x2b7faf660578, tid=0x7ffff88ddce4, lockmode=8) at lmgr.c:388
388 {
(gdb) n
391 SET_LOCKTAG_TUPLE(tag,
(gdb) p tag
$1 = {locktag_field1 = 8, locktag_field2 = 0, locktag_field3 = 2, locktag_field4 = 0, locktag_type = 0 '\000',
locktag_lockmethodid = 0 '\000'}
(gdb) n
388 {
(gdb)
391 SET_LOCKTAG_TUPLE(tag,
(gdb) p tag
$2 = {locktag_field1 = 8, locktag_field2 = 0, locktag_field3 = 2, locktag_field4 = 0, locktag_type = 0 '\000',
locktag_lockmethodid = 0 '\000'}
(gdb) n
397 (void) LockAcquire(&tag, lockmode, false, false);
(gdb) p lockmode
$3 = 8
(gdb) p tag
$4 = {locktag_field1 = 16384, locktag_field2 = 16742, locktag_field3 = 0, locktag_field4 = 0, locktag_type = 0 '\000',
locktag_lockmethodid = 0 '\000'}
(gdb) n
391 SET_LOCKTAG_TUPLE(tag,
(gdb)
397 (void) LockAcquire(&tag, lockmode, false, false);
(gdb)
398 }
(gdb)
heap_lock_tuple (relation=0x2b7faf660578, tuple=0x7ffff88ddce0, cid=0, mode=LockTupleExclusive, nowait=0 '\000',
follow_updates=1 '\001', buffer=0x7ffff88ddd18, hufd=0x7ffff88ddd00) at heapam.c:4011
4011 if (mode == LockTupleKeyShare)
(gdb)
4085 else if (mode == LockTupleShare)
(gdb)
4106 else if (mode == LockTupleNoKeyExclusive)
(gdb)
4192 if (infomask & HEAP_XMAX_IS_MULTI)
(gdb)
4194 MultiXactStatus status = get_mxact_status_for_lock(mode, false);
(gdb)
4197 if (status >= MultiXactStatusNoKeyUpdate)
(gdb)
4201 if (nowait)
(gdb)
4211 MultiXactIdWait((MultiXactId) xwait, status, NULL, infomask);
(gdb) p xwait
$5 = 28
-- 因为SESSION A和SESSION B共享锁住了2行, 所以锁信息在pg_multixact中, 因此这里跟踪到的是MultiXactIdWait.
-- 如果是单事务锁, 这里应该跟踪到XactLockTableWait.
(gdb) n
... 等待锁.
SESSION X :
digoal=# select * from pgrowlocks('test');
locked_row | locker | multi | xids | modes | pids
------------+--------+-------+-------------+---------------+---------------
(0,2) | 28 | t | {1785,1786} | {Share,Share} | {10099,10102}
(0,3) | 28 | t | {1785,1786} | {Share,Share} | {10099,10102}
(2 rows)
digoal=# select * from heap_page_items(get_raw_page('test', 0));
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------
1 | 8160 | 1 | 28 | 2 | 27 | 0 | (0,3) | 16385 | 4416 | 24 | |
2 | 8128 | 1 | 28 | 2 | 28 | 0 | (0,2) | 1 | 4560 | 24 | |
3 | 8088 | 1 | 33 | 1783 | 28 | 0 | (0,3) | 32770 | 12754 | 24 | |
(3 rows)
gdb跟踪到的xwait就是这里的t_max : 28.
SESSION D :
digoal=# select pg_backend_pid();
pg_backend_pid
----------------
14886
(1 row)
SESSION D GDB :
(gdb) attach 14886
Attaching to program: /home/pgdev/pgsql9.3/bin/postgres, process 14886
(gdb) b LockTuple
Note: breakpoint 1 also set at pc 0x65d240.
Breakpoint 2 at 0x65d240: file lmgr.c, line 388.
(gdb) c
Continuing.
SESSION D :
digoal=# begin;
BEGIN
digoal=# select * from test for share;
SESSION D GDB :
Breakpoint 1, LockTuple (relation=0x2b7faf643e30, tid=0x7ffff88ddce4, lockmode=2) at lmgr.c:388
388 {
(gdb) n
391 SET_LOCKTAG_TUPLE(tag,
(gdb)
388 {
(gdb)
391 SET_LOCKTAG_TUPLE(tag,
(gdb)
397 (void) LockAcquire(&tag, lockmode, false, false);
(gdb)
391 SET_LOCKTAG_TUPLE(tag,
(gdb)
397 (void) LockAcquire(&tag, lockmode, false, false);
(gdb)
等待, 未跟踪到MultiXactIdWait.
那这里其实是进入了LockAcquireExtended函数的这个逻辑
00900 /*
00901 * If lock requested conflicts with locks requested by waiters, must join
00902 * wait queue. Otherwise, check for conflict with already-held locks.
00903 * (That's last because most complex check.)
00904 */
00905 if (lockMethodTable->conflictTab[lockmode] & lock->waitMask)
00906 status = STATUS_FOUND;
00907 else
00908 status = LockCheckConflicts(lockMethodTable, lockmode,
00909 lock, proclock, MyProc);
另一种方法是使用开发参数进行跟踪 :
SESSION C :
digoal=# \set VERBOSITY verbose
digoal=# set trace_locks = on;
SET
digoal=# set trace_userlocks = on;
SET
digoal=# set client_min_messages=debug5;
DEBUG: 00000: CommitTransactionCommand
LOCATION: finish_xact_command, postgres.c:2409
DEBUG: 00000: CommitTransaction
LOCATION: ShowTransactionState, xact.c:4388
DEBUG: 00000: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
LOCATION: ShowTransactionStateRec, xact.c:4426
LOG: 00000: LockReleaseAll: lockmethod=1
LOCATION: LockReleaseAll, lock.c:1941
LOG: 00000: LockReleaseAll done
LOCATION: LockReleaseAll, lock.c:2164
LOG: 00000: LockReleaseAll: lockmethod=2
LOCATION: LockReleaseAll, lock.c:1941
LOG: 00000: LockReleaseAll done
LOCATION: LockReleaseAll, lock.c:2164
SET
digoal=# begin;
DEBUG: 00000: StartTransactionCommand
LOCATION: start_xact_command, postgres.c:2385
DEBUG: 00000: StartTransaction
LOCATION: ShowTransactionState, xact.c:4388
DEBUG: 00000: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
LOCATION: ShowTransactionStateRec, xact.c:4426
DEBUG: 00000: ProcessUtility
LOCATION: PortalRunUtility, pquery.c:1154
DEBUG: 00000: CommitTransactionCommand
LOCATION: finish_xact_command, postgres.c:2409
BEGIN
digoal=# select * from test for update;
DEBUG: 00000: StartTransactionCommand
LOCATION: start_xact_command, postgres.c:2385
LOG: 00000: LockAcquire: lock [16384,16742] RowShareLock
LINE 1: select * from test for update;
^
LOCATION: LockAcquireExtended, lock.c:729
LOG: 00000: LockAcquire: lock [16384,16742] AccessShareLock
LOCATION: LockAcquireExtended, lock.c:729
LOG: 00000: LockRelease: lock [16384,16742] AccessShareLock
LOCATION: LockRelease, lock.c:1751
LOG: 00000: LockAcquire: lock [16384,16745] AccessShareLock
LOCATION: LockAcquireExtended, lock.c:729
LOG: 00000: LockAcquire: lock [16384,16742] RowShareLock
LOCATION: LockAcquireExtended, lock.c:729
LOG: 00000: LockAcquire: lock [16384,16742] AccessExclusiveLock
LOCATION: LockAcquireExtended, lock.c:729
LOG: 00000: LockAcquire: new: lock(0x2b7faf13b898) id(16384,16742,0,2,3,1) grantMask(0) req(0,0,0,0,0,0,0)=0 grant(0,0,0,0,0,0,0)=0 wait(0) type(AccessExclusiveLock)
LOCATION: LOCK_PRINT, lock.c:318
LOG: 00000: LockAcquire: new: proclock(0x2b7faf1c12c8) lock(0x2b7faf13b898) method(1) proc(0x2b7faf3da6a0) hold(0)
LOCATION: PROCLOCK_PRINT, lock.c:330
LOG: 00000: LockCheckConflicts: no conflict: proclock(0x2b7faf1c12c8) lock(0x2b7faf13b898) method(1) proc(0x2b7faf3da6a0) hold(0)
LOCATION: PROCLOCK_PRINT, lock.c:330
-- 先检测与已经存在的请求锁的队列中的锁是否有冲突. 有冲突则加入等待队列. 没有冲突则与已经HOLD的锁检测冲突性.如果发现冲突则加入队列中.
-- 这就能解释本文的第二个问题了.
00900 /*
00901 * If lock requested conflicts with locks requested by waiters, must join
00902 * wait queue. Otherwise, check for conflict with already-held locks.
00903 * (That's last because most complex check.)
00904 */
LOG: 00000: GrantLock: lock(0x2b7faf13b898) id(16384,16742,0,2,3,1) grantMask(100) req(0,0,0,0,0,0,0)=1 grant(0,0,0,0,0,0,0)=1 wait(0) type(AccessExclusiveLock)
LOCATION: LOCK_PRINT, lock.c:318
SESSION D :
digoal=# set trace_locks = on;
SET
digoal=# set trace_userlocks = on;
SET
digoal=# \set VERBOSITY verbose
digoal=# set client_min_messages=debug5;
DEBUG: 00000: CommitTransactionCommand
LOCATION: finish_xact_command, postgres.c:2409
DEBUG: 00000: CommitTransaction
LOCATION: ShowTransactionState, xact.c:4388
DEBUG: 00000: name: unnamed; blockState: STARTED; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
LOCATION: ShowTransactionStateRec, xact.c:4426
LOG: 00000: LockReleaseAll: lockmethod=1
LOCATION: LockReleaseAll, lock.c:1941
LOG: 00000: LockReleaseAll done
LOCATION: LockReleaseAll, lock.c:2164
LOG: 00000: LockReleaseAll: lockmethod=2
LOCATION: LockReleaseAll, lock.c:1941
LOG: 00000: LockReleaseAll done
LOCATION: LockReleaseAll, lock.c:2164
SET
digoal=# begin;
DEBUG: 00000: StartTransactionCommand
LOCATION: start_xact_command, postgres.c:2385
DEBUG: 00000: StartTransaction
LOCATION: ShowTransactionState, xact.c:4388
DEBUG: 00000: name: unnamed; blockState: DEFAULT; state: INPROGR, xid/subid/cid: 0/1/0, nestlvl: 1, children:
LOCATION: ShowTransactionStateRec, xact.c:4426
DEBUG: 00000: ProcessUtility
LOCATION: PortalRunUtility, pquery.c:1154
DEBUG: 00000: CommitTransactionCommand
LOCATION: finish_xact_command, postgres.c:2409
BEGIN
digoal=# select * from test for share;
DEBUG: 00000: StartTransactionCommand
LOCATION: start_xact_command, postgres.c:2385
LOG: 00000: LockAcquire: lock [16384,16742] RowShareLock
LINE 1: select * from test for share;
^
LOCATION: LockAcquireExtended, lock.c:729
对应代码:
00725 #ifdef LOCK_DEBUG
00726 if (LOCK_DEBUG_ENABLED(locktag))
00727 elog(LOG, "LockAcquire: lock [%u,%u] %s",
00728 locktag->locktag_field1, locktag->locktag_field2,
00729 lockMethodTable->lockModeNames[lockmode]);
00730 #endif
这里的16384是dbid, 16742是relid, 代码如下 :
00240 #define SET_LOCKTAG_TUPLE(locktag,dboid,reloid,blocknum,offnum) \
00241 ((locktag).locktag_field1 = (dboid), \
00242 (locktag).locktag_field2 = (reloid), \
00243 (locktag).locktag_field3 = (blocknum), \
00244 (locktag).locktag_field4 = (offnum), \
00245 (locktag).locktag_type = LOCKTAG_TUPLE, \
00246 (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
验证如下 :
pgdev@db-172-16-3-150-> psql
psql (9.3devel)
Type "help" for help.
digoal=# select oid,datname from pg_database where oid=16384;
oid | datname
-------+---------
16384 | digoal
(1 row)
digoal=# \c digoal
You are now connected to database "digoal" as user "postgres".
digoal=# select relname,relkind from pg_class where oid=16742;
relname | relkind
---------+---------
test | r
(1 row)
LOG: 00000: LockAcquire: lock [16384,16742] AccessShareLock
LOCATION: LockAcquireExtended, lock.c:729
LOG: 00000: LockRelease: lock [16384,16742] AccessShareLock
LOCATION: LockRelease, lock.c:1751
LOG: 00000: LockAcquire: lock [16384,16745] AccessShareLock
LOCATION: LockAcquireExtended, lock.c:729
LOG: 00000: LockAcquire: lock [16384,16742] RowShareLock
LOCATION: LockAcquireExtended, lock.c:729
LOG: 00000: LockAcquire: lock [16384,16742] RowShareLock
LOCATION: LockAcquireExtended, lock.c:729
LOG: 00000: LockAcquire: found: lock(0x2b7faf13b898) id(16384,16742,0,2,3,1) grantMask(100) req(0,0,0,0,0,0,0)=1 grant(0,0,0,0,0,0,0)=1 wait(0) type(RowShareLock)
LOCATION: LOCK_PRINT, lock.c:318
代码如下 :
00298 inline static void
00299 LOCK_PRINT(const char *where, const LOCK *lock, LOCKMODE type)
00300 {
00301 if (LOCK_DEBUG_ENABLED(&lock->tag))
00302 elog(LOG,
00303 "%s: lock(%p) id(%u,%u,%u,%u,%u,%u) grantMask(%x) "
00304 "req(%d,%d,%d,%d,%d,%d,%d)=%d "
00305 "grant(%d,%d,%d,%d,%d,%d,%d)=%d wait(%d) type(%s)",
00306 where, lock,
00307 lock->tag.locktag_field1, lock->tag.locktag_field2,
00308 lock->tag.locktag_field3, lock->tag.locktag_field4,
00309 lock->tag.locktag_type, lock->tag.locktag_lockmethodid,
00310 lock->grantMask,
00311 lock->requested[1], lock->requested[2], lock->requested[3],
00312 lock->requested[4], lock->requested[5], lock->requested[6],
00313 lock->requested[7], lock->nRequested,
00314 lock->granted[1], lock->granted[2], lock->granted[3],
00315 lock->granted[4], lock->granted[5], lock->granted[6],
00316 lock->granted[7], lock->nGranted,
00317 lock->waitProcs.size,
00318 LockMethods[LOCK_LOCKMETHOD(*lock)]->lockModeNames[type]);
00319 }
tag枚举列表如下
00158 /*
00159 * LOCKTAG is the key information needed to look up a LOCK item in the
00160 * lock hashtable. A LOCKTAG value uniquely identifies a lockable object.
00161 *
00162 * The LockTagType enum defines the different kinds of objects we can lock.
00163 * We can handle up to 256 different LockTagTypes.
00164 */
00165 typedef enum LockTagType
00166 {
00167 LOCKTAG_RELATION, /* whole relation */
00168 /* ID info for a relation is DB OID + REL OID; DB OID = 0 if shared */
00169 LOCKTAG_RELATION_EXTEND, /* the right to extend a relation */
00170 /* same ID info as RELATION */
00171 LOCKTAG_PAGE, /* one page of a relation */
00172 /* ID info for a page is RELATION info + BlockNumber */
00173 LOCKTAG_TUPLE, /* one physical tuple */
00174 /* ID info for a tuple is PAGE info + OffsetNumber */
00175 LOCKTAG_TRANSACTION, /* transaction (for waiting for xact done) */
00176 /* ID info for a transaction is its TransactionId */
00177 LOCKTAG_VIRTUALTRANSACTION, /* virtual transaction (ditto) */
00178 /* ID info for a virtual transaction is its VirtualTransactionId */
00179 LOCKTAG_OBJECT, /* non-relation database object */
00180 /* ID info for an object is DB OID + CLASS OID + OBJECT OID + SUBID */
00181
00182 /*
00183 * Note: object ID has same representation as in pg_depend and
00184 * pg_description, but notice that we are constraining SUBID to 16 bits.
00185 * Also, we use DB OID = 0 for shared objects such as tablespaces.
00186 */
00187 LOCKTAG_USERLOCK, /* reserved for old contrib/userlock code */
00188 LOCKTAG_ADVISORY /* advisory user locks */
00189 } LockTagType;
LOCK数据结构如下 :
00281 /*
00282 * Per-locked-object lock information:
00283 *
00284 * tag -- uniquely identifies the object being locked
00285 * grantMask -- bitmask for all lock types currently granted on this object.
00286 * waitMask -- bitmask for all lock types currently awaited on this object.
00287 * procLocks -- list of PROCLOCK objects for this lock.
00288 * waitProcs -- queue of processes waiting for this lock.
00289 * requested -- count of each lock type currently requested on the lock
00290 * (includes requests already granted!!).
00291 * nRequested -- total requested locks of all types.
00292 * granted -- count of each lock type currently granted on the lock.
00293 * nGranted -- total granted locks of all types.
00294 *
00295 * Note: these counts count 1 for each backend. Internally to a backend,
00296 * there may be multiple grabs on a particular lock, but this is not reflected
00297 * into shared memory.
00298 */
00299 typedef struct LOCK
00300 {
00301 /* hash key */
00302 LOCKTAG tag; /* unique identifier of lockable object */
00303
00304 /* data */
00305 LOCKMASK grantMask; /* bitmask for lock types already granted */
00306 LOCKMASK waitMask; /* bitmask for lock types awaited */
00307 SHM_QUEUE procLocks; /* list of PROCLOCK objects assoc. with lock */
00308 PROC_QUEUE waitProcs; /* list of PGPROC objects waiting on lock */
00309 int requested[MAX_LOCKMODES]; /* counts of requested locks */
00310 int nRequested; /* total of requested[] array */
00311 int granted[MAX_LOCKMODES]; /* counts of granted locks */
00312 int nGranted; /* total of granted[] array */
00313 } LOCK;
解释 id(16384,16742,0,2,3,1)
00193 /*
00194 * The LOCKTAG struct is defined with malice aforethought to fit into 16
00195 * bytes with no padding. Note that this would need adjustment if we were
00196 * to widen Oid, BlockNumber, or TransactionId to more than 32 bits.
00197 *
00198 * We include lockmethodid in the locktag so that a single hash table in
00199 * shared memory can store locks of different lockmethods.
00200 */
00201 typedef struct LOCKTAG
00202 {
00203 uint32 locktag_field1; /* a 32-bit ID field */
00204 uint32 locktag_field2; /* a 32-bit ID field */
00205 uint32 locktag_field3; /* a 32-bit ID field */
00206 uint16 locktag_field4; /* a 16-bit ID field */
00207 uint8 locktag_type; /* see enum LockTagType */
00208 uint8 locktag_lockmethodid; /* lockmethod indicator */
00209 } LOCKTAG;
00132 /* These identify the known lock methods */
00133 #define DEFAULT_LOCKMETHOD 1
00134 #define USER_LOCKMETHOD 2
本例是行锁, 所以设置tag的值对应以下宏 :
00240 #define SET_LOCKTAG_TUPLE(locktag,dboid,reloid,blocknum,offnum) \
00241 ((locktag).locktag_field1 = (dboid), \
00242 (locktag).locktag_field2 = (reloid), \
00243 (locktag).locktag_field3 = (blocknum), \
00244 (locktag).locktag_field4 = (offnum), \
00245 (locktag).locktag_type = LOCKTAG_TUPLE, \
00246 (locktag).locktag_lockmethodid = DEFAULT_LOCKMETHOD)
因此 id(16384,16742,0,2,3,1)
16384 : lock->tag.locktag_field1 = dboid
16742 : lock->tag.locktag_field2 = reloid
0 : lock->tag.locktag_field3 = blocknum
2 : lock->tag.locktag_field4 = offnum
3 : lock->tag.locktag_type = LOCKTAG_TUPLE (见LockTagType枚举)
1 : lock->tag.locktag_lockmethodid = DEFAULT_LOCKMETHOD (见DEFAULT_LOCKMETHOD定义)
offnum对应CTID :
digoal=# select ctid,* from test;
ctid | id | info
-------+----+------
(0,2) | 2 |
(0,3) | 1 | test
(2 rows)
LOG: 00000: LockAcquire: new: proclock(0x2b7faf1c1160) lock(0x2b7faf13b898) method(1) proc(0x2b7faf3da0e0) hold(0)
LOCATION: PROCLOCK_PRINT, lock.c:330
LOG: 00000: LockCheckConflicts: conflicting: proclock(0x2b7faf1c1160) lock(0x2b7faf13b898) method(1) proc(0x2b7faf3da0e0) hold(0)
LOCATION: PROCLOCK_PRINT, lock.c:330
LOG: 00000: WaitOnLock: sleeping on lock: lock(0x2b7faf13b898) id(16384,16742,0,2,3,1) grantMask(100) req(0,1,0,0,0,0,0)=2 grant(0,0,0,0,0,0,0)=1 wait(0) type(RowShareLock)
LOCATION: LOCK_PRINT, lock.c:318
小结
1.
单条记录同时被多个事务共享锁时, 锁信息存储在pg_multixact中.
记录的头信息中t_infomask HEAP_XMAX_IS_MULTI 比特位将会被设置。
记录的头信息中t_xmax代表的是MultiXactId.
2.
当多个事务并发请求锁同一条tuple时, 先请求锁的会话如果与已经存在的锁发生冲突, 将会产生一个锁队列, 后来的会话请求锁时需要先检测锁队列是否冲突再检测与已经HOLD住的锁是否冲突. 因此不会造成多个共享锁持续占用TUPLE而独占锁一直都请求不到锁的情况。
参考
1. src/backend/access/heap/README.tuplock
2. src/include/access/htup_details.h
3. src/include/access/multixact.h
4. src/backend/access/transam/multixact.c
5. src/include/access/heapam.h
6. src/backend/access/heap/heapam.c
7. src/backend/utils/adt/lockfuncs.c
8. src/backend/storage/lmgr/lock.c
9. src/backend/storage/lmgr/lmgr.c
10. http://www.postgresql.org/docs/devel/static/pgrowlocks.html
11. http://www.postgresql.org/docs/devel/static/pageinspect.html
12. https://github.com/postgres/postgres/commit/0ac5ad5134f2769ccbaefec73844f8504c4d6182
13. http://blog.163.com/digoal@126/blog/static/16387704020130249109133/
14. http://blog.163.com/digoal@126/blog/static/16387704020130305109687/
15. http://blog.163.com/digoal@126/blog/static/163877040201210134586363/
16. http://blog.163.com/digoal@126/blog/static/1638770402011515105557166/
17. http://blog.163.com/digoal@126/blog/static/16387704020130312271679/