PostgreSQL How to deal TUPLE LOCK : 1 - One transaction lock single or multiple tuples rows

9 minute read

背景

注意, 本文基于2013-01-08到2013-01-31之间的postgresql9.3 devel分支进行的讲解.

也就是以下这个行锁补丁后的版本. 其他版本目前没有这个特性.

https://github.com/postgres/postgres/commit/0ac5ad5134f2769ccbaefec73844f8504c4d6182

同时昨天在写这篇BLOG的时候发现了几个BUG, 已经提交了BUG报告, 今天早上得到了修复. 如下 :

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=77a3082fc546774808b76f58173caec3852ebf62

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=b78647a0e6f7b110273e98601f26d3d1db0ad931

在此非常感谢 Alvaro Herrera .

所以如果你看到的是2013-01-31后修复了BUG的版本则与下面的讲解会有一些出入。

这两个BUG分别是pgrowlocks 的锁显示不正确问题以及multixact锁升级问题.

接下来就开始吧.

我们知道PostgreSQL中有一个系统视图pg_locks.

digoal=# \d+ pg_locks  
                     View "pg_catalog.pg_locks"  
       Column       |   Type   | Modifiers | Storage  | Description   
--------------------+----------+-----------+----------+-------------  
 locktype           | text     |           | extended |   
 database           | oid      |           | plain    |   
 relation           | oid      |           | plain    |   
 page               | integer  |           | plain    |   
 tuple              | smallint |           | plain    |   
 virtualxid         | text     |           | extended |   
 transactionid      | xid      |           | plain    |   
 classid            | oid      |           | plain    |   
 objid              | oid      |           | plain    |   
 objsubid           | smallint |           | plain    |   
 virtualtransaction | text     |           | extended |   
 pid                | integer  |           | plain    |   
 mode               | text     |           | extended |   
 granted            | boolean  |           | plain    |   
 fastpath           | boolean  |           | plain    |   
View definition:  
 SELECT l.locktype, l.database, l.relation, l.page, l.tuple, l.virtualxid,   
    l.transactionid, l.classid, l.objid, l.objsubid, l.virtualtransaction,   
    l.pid, l.mode, l.granted, l.fastpath  
   FROM pg_lock_status() l(locktype, database, relation, page, tuple, virtualxid, transactionid, classid, objid, objsubid, virtualtransaction, pid, mode, granted, fastpath);  

用来获取当前数据库集群中的锁信息. 详见

src/backend/utils/adt/lockfuncs.c

对用户级函数展现的数据结构 :

src/include/storage/lock.h

/*  
 * These structures hold information passed from lmgr internals to the lock  
 * listing user-level functions (in lockfuncs.c).  
 */  
  
typedef struct LockInstanceData  
{  
        LOCKTAG         locktag;                /* locked object */  
        LOCKMASK        holdMask;               /* locks held by this PGPROC */  
        LOCKMODE        waitLockMode;   /* lock awaited by this PGPROC, if any */  
        BackendId       backend;                /* backend ID of this PGPROC */  
        LocalTransactionId lxid;        /* local transaction ID of this PGPROC */  
        int                     pid;                    /* pid of this PGPROC */  
        bool            fastpath;               /* taken via fastpath? */  
} LockInstanceData;  
  
typedef struct LockData  
{  
        int                     nelements;              /* The length of the array */  
        LockInstanceData *locks;  
} LockData;  

注意行级别的锁并不存在这些结构中, 因为一个事务可能对很多行加锁. 如果都存在内存中, 耗费非常巨大.

这点在以前的一篇BLOG中已经分析过, 有兴趣的朋友可以阅读 :

http://blog.163.com/digoal@126/blog/static/163877040201210134586363/

第一个问题 : PostgreSQL How to deal TUPLE LOCK - “One transaction lock single or multiple tuples | rows”

由于PostgreSQL 9.3最近增加了2种行锁模式, 也就是说有4种行锁模式. 所以简单的提一下.

对于行只被1个事务加锁的情形来说, 不管是锁单行还是多行, 信息都是存储在tuple head上的.

PostgreSQL 9.3 对tuple的头数据中的t_infomask以及t_infomask2做了调整. 如下 :

src/include/access/htup_details.h

1. t_infomask :

老版本的t_infomask中锁信息占用了2个比特位如下 :

#define HEAP_XMAX_EXCL_LOCK             0x0040  /* xmax is exclusive locker */  
#define HEAP_XMAX_SHARED_LOCK   0x0080  /* xmax is shared locker */  

注意老版本的t_infomask中0x0010未使用;

0x0010在新版本中被启用了, 用来定义key shared lock.

#define HEAP_XMAX_KEYSHR_LOCK   0x0010  /* xmax is a key-shared locker */  

同时新版本对的0x0080比特位的含义重新定义如下 :

#define HEAP_XMAX_EXCL_LOCK             0x0040  /* xmax is exclusive locker */  
#define HEAP_XMAX_LOCK_ONLY             0x0080  /* xmax, if valid, is only a locker */  

由于新版本有4种行锁模式, 仅使用0x0010是不够的. 怎么办呢?

新版本用到了t_infomask2中的1个比特来存储key update的锁信息. 表示该记录已经被更改并且key值也被更改了, 或者该记录被删除.

#define HEAP_KEYS_UPDATED               0x2000  /* tuple was updated and key cols modified, or tuple deleted */  

综上所述, 新版本的shared locker和no key update信息在tuple头信息中没地方存了. 去哪了呢?

shared locker如下 : 如果同时定义了HEAP_XMAX_EXCL_LOCK HEAP_XMAX_KEYSHR_LOCK则代表shared locker.
#define HEAP_XMAX_SHR_LOCK      (HEAP_XMAX_EXCL_LOCK | HEAP_XMAX_KEYSHR_LOCK)  

从以上XMAX掩码注释来看, no key update在tuple头的infomask中真的找不到地方存储了.

但实际上并不是这样. 必须从掩码的设置函数来介绍 :

从函数compute_new_xmax_infomask中截取单事务设置xmax t_infomask 与 t_infomask2标记的部分代码如下.

4种行锁模式都能在TUPLE头信息中合适的设置 :

                        new_infomask |= HEAP_XMAX_LOCK_ONLY;  
                        switch (mode)  
                        {  
                                case LockTupleKeyShare:  
                                        new_xmax = add_to_xmax;  
                                        new_infomask |= HEAP_XMAX_KEYSHR_LOCK;  
                                        break;  
                                case LockTupleShare:  
                                        new_xmax = add_to_xmax;  
                                        new_infomask |= HEAP_XMAX_SHR_LOCK;  
                                        break;  
                                case LockTupleNoKeyExclusive:  
                                        new_xmax = add_to_xmax;  
                                        new_infomask |= HEAP_XMAX_EXCL_LOCK;  
                                        break;  
                                case LockTupleExclusive:  
                                        new_xmax = add_to_xmax;  
                                        new_infomask |= HEAP_XMAX_EXCL_LOCK;  
                                        new_infomask2 |= HEAP_KEYS_UPDATED;  
                                        break;  
                                default:  
                                        new_xmax = InvalidTransactionId;        /* silence compiler */  
                                        elog(ERROR, "invalid lock mode");  
                        }  

单个事务对TUPLE加锁时, 首先设置HEAP_XMAX_LOCK_ONLY,

no key update实际上是标记的HEAP_XMAX_EXCL_LOCK.所以这里就能解释本文下面测试的第一个场景了.

获取tuple上是否加了什么锁的宏定义在src/backend/access/heap/heapam.c中, 其中HEAP_XMAX_IS_EXCL_LOCKED 判据不足,因为如果设置了HEAP_XMAX_IS_EXCL_LOCKED 那么t_infomask2的HEAP_KEYS_UPDATED 也需要判断才能分出是key update还是no key update.

宏如下 :

#define HEAP_XMAX_IS_SHR_LOCKED(infomask) \  
    (((infomask) & HEAP_LOCK_MASK) == HEAP_XMAX_SHR_LOCK)  
#define HEAP_XMAX_IS_EXCL_LOCKED(infomask) \  
    (((infomask) & HEAP_LOCK_MASK) == HEAP_XMAX_EXCL_LOCK)  
#define HEAP_XMAX_IS_KEYSHR_LOCKED(infomask) \  
    (((infomask) & HEAP_LOCK_MASK) == HEAP_XMAX_KEYSHR_LOCK)  

HEAP_LOCK_MASK的定义如下 :

#define HEAP_LOCK_MASK  (HEAP_XMAX_SHR_LOCK | HEAP_XMAX_EXCL_LOCK | \  
                                                 HEAP_XMAX_KEYSHR_LOCK)  

同时pgrowlocks扩展包在单事务锁行的场景也用到了的以上这些宏来获取锁信息. 如下 :

显然由于判据不足, 以下代码目前存在BUG :

				if (infomask & HEAP_XMAX_LOCK_ONLY)  
                                {  
                                        if (HEAP_XMAX_IS_SHR_LOCKED(infomask))  
                                                snprintf(values[Atnum_modes], NCHARS, "{For Share}");  
                                        else if (HEAP_XMAX_IS_KEYSHR_LOCKED(infomask))  
                                                snprintf(values[Atnum_modes], NCHARS, "{For Key Share}");  
                                        else if (HEAP_XMAX_IS_EXCL_LOCKED(infomask))  
                                                snprintf(values[Atnum_modes], NCHARS, "{For Update}");   
                                                   
						 # 这里应该再加个infomask2的判断, 因为还不能定到底是key update或者no key update;   
                                                 # 或改成"For [no] key update"  
                                        else  
                                                /* neither keyshare nor exclusive bit it set */  
                                                snprintf(values[Atnum_modes], NCHARS,  
                                                                 "{transient upgrade status}");  
                                }  
                                else  
                                {  
                                        if (tuple->t_data->t_infomask2 & HEAP_KEYS_UPDATED)  
                                                snprintf(values[Atnum_modes], NCHARS, "{Key Update}");  
                                        else  
                                                snprintf(values[Atnum_modes], NCHARS, "{Update}");  
                                }  

正因为这个BUG, 我们能看到以下第一个测试select for no key update, 使用pgrowlocks输出的确是For Update.

测试

1. 测试单个事务对TUPLE加for no key update锁. 并使用pgrowlocks和pageinspect查看锁信息以及t_infomask信息.

SESSION A :

digoal=# create table test (id int, info text);  
CREATE TABLE  
digoal=# insert into test values (1,'abc'),(2,'digoal');  
INSERT 0 2  
digoal=# begin;  
BEGIN  
digoal=# select * from test where id=1 for no key update;  
 id | info   
----+------  
  1 | abc  
(1 row)  

SESSION B :

-- 使用pgrowlocks看到的锁模式是for update, HEAP_XMAX_IS_MULTI未设置, 所以multi=false. 根据前面的介绍,locker就是xmax的值.  
digoal=# select * from pgrowlocks('test');  
 locked_row | locker | multi |  xids  |     modes      |  pids    
------------+--------+-------+--------+----------------+--------  
 (0,1)      |   1752 | f     | {1752} | {"For Update"} | {6096}  
(1 row)  
-- 使用pageinspect看到的信息如下.  
-- xmax invalid 掩码未设置并且multixact掩码未设置时, t_xmax的内容表示锁该行事务号.  
digoal=# select * from heap_page_items(get_raw_page('test', 0));  
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid   
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------  
  1 |   8160 |        1 |     32 |   1751 |   1752 |        0 | (0,1)  |           2 |        450 |     24 |        |        
  2 |   8120 |        1 |     35 |   1751 |      0 |        0 | (0,2)  |           2 |       2306 |     24 |        |        
(2 rows)  

加锁的行lp=1 的t_infomask=450, 16进制 0x01C2 , 因此包含如下MASK :

#define HEAP_XMIN_COMMITTED             0x0100  /* t_xmin committed */  
#define HEAP_XMAX_EXCL_LOCK             0x0040  /* xmax is exclusive locker */  
#define HEAP_XMAX_LOCK_ONLY             0x0080  /* xmax, if valid, is only a locker */  
#define HEAP_HASVARWIDTH                0x0002  /* has variable-width attribute(s) */  

xmax掩码信息与compute_new_xmax_infomask函数的处理逻辑一致.

2. 测试单个事务对TUPLE加for key share锁. 并使用pgrowlocks和pageinspect查看锁信息以及t_infomask信息.
SESSION A :

digoal=# begin;  
BEGIN  
digoal=# select * from test where id=1 for key share;  
 id | info   
----+------  
  1 | abc  
(1 row)  

SESSION B :

digoal=# select * from pgrowlocks('test');  
 locked_row | locker | multi |  xids  |       modes       |  pids    
------------+--------+-------+--------+-------------------+--------  
 (0,1)      |   1755 | f     | {1755} | {"For Key Share"} | {6096}  
(1 row)  
digoal=# select * from heap_page_items(get_raw_page('test', 0));  
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid   
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------  
  1 |   8160 |        1 |     32 |   1751 |   1755 |        0 | (0,1)  |           2 |        402 |     24 |        |        
  2 |   8120 |        1 |     35 |   1751 |      0 |        0 | (0,2)  |           2 |       2306 |     24 |        |        
(2 rows)  

加锁的行lp=1 的t_infomask=402, 16进制 0x0192 , 因此包含如下MASK :

#define HEAP_XMIN_COMMITTED             0x0100  /* t_xmin committed */  
#define HEAP_XMAX_KEYSHR_LOCK           0x0010  /* xmax is a key-shared locker */  
#define HEAP_XMAX_LOCK_ONLY             0x0080  /* xmax, if valid, is only a locker */  
#define HEAP_HASVARWIDTH                0x0002  /* has variable-width attribute(s) */  

xmax掩码信息与compute_new_xmax_infomask函数的处理逻辑一致.

3. 测试单个事务对TUPLE加for share锁. 并使用pgrowlocks和pageinspect查看锁信息以及t_infomask信息.

SESSION A :

digoal=# begin;  
BEGIN  
digoal=# select * from test where id=1 for share;  
 id | info   
----+------  
  1 | abc  
(1 row)  

SESSION B :

digoal=# select * from pgrowlocks('test');  
 locked_row | locker | multi |  xids  |     modes     |  pids    
------------+--------+-------+--------+---------------+--------  
 (0,1)      |   1754 | f     | {1754} | {"For Share"} | {6096}  
(1 row)  
digoal=# select * from heap_page_items(get_raw_page('test', 0));  
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid   
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------  
  1 |   8160 |        1 |     32 |   1751 |   1754 |        0 | (0,1)  |           2 |        466 |     24 |        |        
  2 |   8120 |        1 |     35 |   1751 |      0 |        0 | (0,2)  |           2 |       2306 |     24 |        |        
(2 rows)  

加锁的行lp=1 的t_infomask=466, 16进制 0x01D2 , 因此包含如下MASK :

#define HEAP_XMIN_COMMITTED             0x0100  /* t_xmin committed */  
#define HEAP_XMAX_KEYSHR_LOCK           0x0010  /* xmax is a key-shared locker */  
#define HEAP_XMAX_EXCL_LOCK             0x0040  /* xmax is exclusive locker */  
#define HEAP_XMAX_LOCK_ONLY             0x0080  /* xmax, if valid, is only a locker */  
#define HEAP_HASVARWIDTH                0x0002  /* has variable-width attribute(s) */  

xmax掩码信息与compute_new_xmax_infomask函数的处理逻辑一致.

htup_detail.h头文件中设置了: HEAP_XMAX_EXCL_LOCK和HEAP_XMAX_KEYSHR_LOCK掩码同时设置表示shared lock.

4. 测试单个事务对TUPLE加for update锁. 并使用pgrowlocks和pageinspect查看锁信息以及t_infomask信息.

SESSION A :

digoal=# begin;  
BEGIN  
digoal=# select * from test where id=1 for update;  
 id | info   
----+------  
  1 | abc  
(1 row)  

SESSION B :

digoal=# select * from pgrowlocks('test');  
 locked_row | locker | multi |  xids  |     modes      |  pids    
------------+--------+-------+--------+----------------+--------  
 (0,1)      |   1756 | f     | {1756} | {"For Update"} | {6096}  
(1 row)  
digoal=# select * from heap_page_items(get_raw_page('test', 0));  
 lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid   
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------  
  1 |   8160 |        1 |     32 |   1751 |   1756 |        0 | (0,1)  |        8194 |        450 |     24 |        |        
  2 |   8120 |        1 |     35 |   1751 |      0 |        0 | (0,2)  |           2 |       2306 |     24 |        |        
(2 rows)  

加锁的行lp=1 的t_infomask=450, 16进制 0x01C2 , 因此包含如下MASK :

#define HEAP_XMIN_COMMITTED             0x0100  /* t_xmin committed */  
#define HEAP_XMAX_EXCL_LOCK             0x0040  /* xmax is exclusive locker */  
#define HEAP_XMAX_LOCK_ONLY             0x0080  /* xmax, if valid, is only a locker */  
#define HEAP_HASVARWIDTH                0x0002  /* has variable-width attribute(s) */  

t_infomask2=8194, 16进制 0x2002 , 因此包含如下MASK :

#define HEAP_KEYS_UPDATED               0x2000  /* tuple was updated and key cols  
                                                                                 * modified, or tuple deleted */  

xmax掩码信息与compute_new_xmax_infomask函数的处理逻辑一致.

小结

1. 单事务锁行时, 4种行锁模式和infomask的关系 :

1.1 select … for key share

t_infomask 设置 HEAP_XMAX_LOCK_ONLY  以及 HEAP_XMAX_KEYSHR_LOCK  

1.2 select … for share

t_infomask 设置 HEAP_XMAX_LOCK_ONLY  以及 HEAP_XMAX_SHR_LOCK [也就是HEAP_XMAX_KEYSHR_LOCK和HEAP_XMAX_EXCL_LOCK]  

1.3 select … for no key update

t_infomask 设置 HEAP_XMAX_LOCK_ONLY  以及 HEAP_XMAX_EXCL_LOCK  

1.4 select … for update [也就是key update]

t_infomask 设置 HEAP_XMAX_LOCK_ONLY  以及 HEAP_XMAX_EXCL_LOCK   
  
同时 t_infomask2 设置 HEAP_KEYS_UPDATED  

2. 有兴趣的朋友可以用gdb跟踪一下pgrowlocks的函数调用.

3. 下一篇将讲解一下多个事务对同一条TUPLE请求锁的情况, 因为infomask的信息量有限, 所以xmax不再存储xid而是存储MultiXactId.

有兴趣的朋友可以关注下一篇BLOG.  

《PostgreSQL How to deal TUPLE LOCK : 2 - “one|more transactions waiting one|more transactions release tuple lock” 》
http://blog.163.com/digoal@126/blog/static/16387704020131172754749/

参考

1. src/backend/access/heap/README.tuplock

2. src/include/access/htup_details.h

3. src/include/access/multixact.h

4. src/backend/access/transam/multixact.c

5. src/include/access/heapam.h

6. src/backend/access/heap/heapam.c

7. src/backend/utils/adt/lockfuncs.c

8. src/backend/storage/lmgr/lock.c

9. src/backend/storage/lmgr/lmgr.c

10. http://www.postgresql.org/docs/devel/static/pgrowlocks.html

11. http://www.postgresql.org/docs/devel/static/pageinspect.html

12. https://github.com/postgres/postgres/commit/0ac5ad5134f2769ccbaefec73844f8504c4d6182

13. http://blog.163.com/digoal@126/blog/static/16387704020130249109133/

  1. http://blog.163.com/digoal@126/blog/static/16387704020130305109687/

15. http://blog.163.com/digoal@126/blog/static/163877040201210134586363/

16. http://blog.163.com/digoal@126/blog/static/1638770402011515105557166/

17. 新版本t_infomask信息如下 :

/*  
 * information stored in t_infomask:  
 */  
#define HEAP_HASNULL                    0x0001  /* has null attribute(s) */  
#define HEAP_HASVARWIDTH                0x0002  /* has variable-width attribute(s) */  
#define HEAP_HASEXTERNAL                0x0004  /* has external stored attribute(s) */  
#define HEAP_HASOID                             0x0008  /* has an object-id field */  
#define HEAP_XMAX_KEYSHR_LOCK   0x0010  /* xmax is a key-shared locker */  
#define HEAP_COMBOCID                   0x0020  /* t_cid is a combo cid */  
#define HEAP_XMAX_EXCL_LOCK             0x0040  /* xmax is exclusive locker */  
#define HEAP_XMAX_LOCK_ONLY             0x0080  /* xmax, if valid, is only a locker */  
  
                                                                                /* xmax is a shared locker */  
#define HEAP_XMAX_SHR_LOCK      (HEAP_XMAX_EXCL_LOCK | HEAP_XMAX_KEYSHR_LOCK)  
  
#define HEAP_LOCK_MASK  (HEAP_XMAX_SHR_LOCK | HEAP_XMAX_EXCL_LOCK | \  
                                                 HEAP_XMAX_KEYSHR_LOCK)  
#define HEAP_XMIN_COMMITTED             0x0100  /* t_xmin committed */  
#define HEAP_XMIN_INVALID               0x0200  /* t_xmin invalid/aborted */  
#define HEAP_XMAX_COMMITTED             0x0400  /* t_xmax committed */  
#define HEAP_XMAX_INVALID               0x0800  /* t_xmax invalid/aborted */  
#define HEAP_XMAX_IS_MULTI              0x1000  /* t_xmax is a MultiXactId */  
#define HEAP_UPDATED                    0x2000  /* this is UPDATEd version of row */  
#define HEAP_MOVED_OFF                  0x4000  /* moved to another place by pre-9.0  
                                                                                 * VACUUM FULL; kept for binary  
                                                                                 * upgrade support */  
#define HEAP_MOVED_IN                   0x8000  /* moved from another place by pre-9.0  
                                                                                 * VACUUM FULL; kept for binary  
                                                                                 * upgrade support */  
#define HEAP_MOVED (HEAP_MOVED_OFF | HEAP_MOVED_IN)  
  
#define HEAP_XACT_MASK                  0xFFF0  /* visibility-related bits */  

18. t_infomask2 :
新增了#define HEAP_KEYS_UPDATED 0x2000 /* tuple was updated and key cols modified, or tuple deleted */

/*  
 * information stored in t_infomask2:  
 */  
#define HEAP_NATTS_MASK                 0x07FF  /* 11 bits for number of attributes */  
/* bits 0x1800 are available */  
#define HEAP_KEYS_UPDATED               0x2000  /* tuple was updated and key cols  
                                                                                 * modified, or tuple deleted */  
#define HEAP_HOT_UPDATED                0x4000  /* tuple was HOT-updated */  
#define HEAP_ONLY_TUPLE                 0x8000  /* this is heap-only tuple */  
  
#define HEAP2_XACT_MASK                 0xE000  /* visibility-related bits */  

Flag Counter

digoal’s 大量PostgreSQL文章入口