一个笛卡尔积的update from语句引发的(内存泄露?)问题

14 minute read

场景

PostgreSQL支持多表JOIN的更新操作,但是如果SQL没有写好,可能会导致出现笛卡尔积的情况。

如果是条查询语句,出现笛卡尔积时,没什么问题,大不了就是查询慢一点。

如果是条更新语句,现在看来可能代码中有内存泄露的BUG,已反馈给PG社区。

另外需要注意PostgreSQL不允许自关联的更新,但实际上使用别名可以规避这个语法错误

如果业务确实有自关联的更新操作需求,可以使用别名的方法。

postgres=# explain (analyze,verbose,timing,buffers,costs) UPDATE tbl1 SET in_predict = true FROM tbl1 , tbl2 p WHERE c.uid = p.uid;  
ERROR:  table name "tbl1" specified more than once  

使用别名规避以上错误

UPDATE tbl1 SET in_predict = true FROM tbl1 c, tbl2 p WHERE c.uid = p.uid;  

另外需要注意,以上QUERY已经产生了笛卡尔积,三个表(tbl1, tbl1, tbl2)的JOIN,但是只提供了一个JOIN条件。

虽然以上是一条问题SQL,但是接下来的问题也是这样的问题SQL发现的。

复现query

执行以下QUERY生成测试数据

CREATE TABLE tbl1 (  
        id      bigserial PRIMARY KEY,  
        uid     bigint,  
        times   bigint,  
        hostname_num    bigint,  
        ip_num  bigint,  
        session_num     bigint,  
        device_num      bigint,  
        page_num        bigint,  
        platform_num    bigint,  
        duration        bigint,  
        duration_min    bigint,  
        duration_max    bigint,  
        in_server       bigint,  
        start_time_max  timestamp,  
        end_time_max    timestamp,  
        in_heartbeet    boolean,  
        in_predict              boolean  
);  
  
CREATE INDEX ON tbl1 USING BTREE(duration DESC NULLS LAST);  
CREATE INDEX ON tbl1 USING BTREE(ip_num DESC NULLS LAST);  
CREATE INDEX ON tbl1 USING BTREE(times DESC NULLS LAST);  
CREATE INDEX ON tbl1 USING BTREE(uid DESC NULLS LAST);  
  
  
CREATE FUNCTION fill_data_tbl1(numRows int) RETURNS VOID AS $$  
DECLARE  
BEGIN  
                INSERT INTO tbl1 (uid, times, hostname_num, ip_num,   
        session_num, device_num, page_num, platform_num,   
        duration, duration_min, duration_max, in_server,   
        start_time_max, end_time_max, in_heartbeet, in_predict)   
        select 13000000 + cnt, FLOOR(RANDOM() * 100), FLOOR(RANDOM() * 100),   
        FLOOR(RANDOM() * 100), FLOOR(RANDOM() * 100), FLOOR(RANDOM() * 100),   
        FLOOR(RANDOM() * 100), FLOOR(RANDOM() * 100), FLOOR(RANDOM() * 100),   
        FLOOR(RANDOM() * 100), FLOOR(RANDOM() * 100), FLOOR(RANDOM() * 100),  
        TO_TIMESTAMP(FLOOR(RANDOM() * 84151112122)), TO_TIMESTAMP(FLOOR(RANDOM() * 84151112122)),   
        false, false from generate_series(1, numrows) t(cnt);  
END;  
$$ LANGUAGE plpgsql;  
  
select fill_data_tbl1(1500000);  
  
CREATE TABLE tbl2 (  
        id      bigint PRIMARY KEY,  
        uid     bigint  
);  
  
CREATE INDEX ON tbl2 USING BTREE(uid DESC NULLS LAST);  
  
CREATE FUNCTION fill_data_tbl2(numRows int) RETURNS VOID AS $$  
DECLARE  
BEGIN  
                INSERT INTO tbl2 (id, uid) select cnt, 13001000 + cnt from generate_series(1, numrows) t(cnt);  
END;  
$$ LANGUAGE plpgsql;  
  
select fill_data_tbl2(1500000);  

执行以下SQL即可复现问题。

UPDATE tbl1 SET in_predict = true FROM tbl1 c, tbl2 p WHERE c.uid = p.uid;  

SQL的执行计划如下

postgres=# explain UPDATE tbl1 SET in_predict = true FROM tbl1 c, tbl2 p WHERE c.uid = p.uid;  
                                                      QUERY PLAN                                                       
-----------------------------------------------------------------------------------------------------------------------
 Update on tbl1  (cost=37.85..39154698995.45 rows=2252418000000 width=139)
   ->  Nested Loop  (cost=37.85..39154698995.45 rows=2252418000000 width=139)
         ->  Seq Scan on tbl1  (cost=0.00..43894.12 rows=1501612 width=127)
         ->  Materialize  (cost=37.85..133276.33 rows=1500000 width=12)
               ->  Merge Join  (cost=37.85..118451.33 rows=1500000 width=12)
                     Merge Cond: (c.uid = p.uid)
                     ->  Index Scan Backward using tbl1_uid_idx on tbl1 c  (cost=0.43..58384.61 rows=1501612 width=14)
                     ->  Index Scan Backward using tbl2_uid_idx on tbl2 p  (cost=0.43..37588.43 rows=1500000 width=14)
(8 rows)

这里join会用到临时空间,所以如果在这一步使用temp_file_limit限制了临时文件的使用,可能会报错。

postgres=# set temp_file_limit ='10MB';  
SET  
postgres=# explain (analyze,verbose,timing,costs,buffers) UPDATE tbl1 SET in_predict = true FROM tbl1 c, tbl2 p WHERE c.uid = p.uid;  
ERROR:  temporary file size exceeds temp_file_limit (10240kB)  

复现问题时,可以把它设到无限大(-1),其实本例设成1GB就够用了,因为造成问题的不是temp file.

postgres=# set temp_file_limit ='1GB';  
SET  
postgres=# explain (analyze,verbose,timing,costs,buffers) UPDATE tbl1 SET in_predict = true FROM tbl1 c, tbl2 p WHERE c.uid = p.uid;  

问题分析

这样会无休止执行下去,同时导致内存不断增长,这个内存占用和temp file没有关联,应该是某处内存泄露导致的。

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                                                                                           
14105 digoal  20   0 8655m 8.3g 101m R 100.0  1.6  18:48.08 postgres: postgres postgres [local] EXPLAIN  

pmap可以看到该进程一处Anonymous内存不断的膨胀

[root@iZ28tqoemgtZ 20733]# pmap -XX 14105  
14105:   postgres: postgres postgres [local] EXPLAIN  
         Address Perm   Offset Device    Inode     Size     Rss     Pss Shared_Clean Shared_Dirty Private_Clean Private_Dirty Referenced Anonymous AnonHugePages Swap KernelPageSize MMUPageSize Locked                VmFlagsMapping  
        00400000 r-xp 00000000  fd:01  1311512     7240    2532    1954          896            0          1636             0       2532         0             0    0              4           4      0    rd ex mr mw me dw  postgres  
        00d12000 r--p 00712000  fd:01  1311512        4       4       0            0            4             0             0          4         4             0    0              4           4      0    rd mr mw me dw ac  postgres  
        00d13000 rw-p 00713000  fd:01  1311512       48      48      30            0           20             0            28         48        48             0    0              4           4      0 rd wr mr mw me dw ac  postgres  
        00d1f000 rw-p 00000000  00:00        0      292      64      58            0            8             0            56         60        64             0    0              4           4      0    rd wr mr mw me ac    
        013f5000 rw-p 00000000  00:00        0      248     216     155            0           72             0           144        164       216             0    0              4           4      0    rd wr mr mw me ac  [heap]  
        01433000 rw-p 00000000  00:00        0     4608    4412    4412            0            0             0          4412       4412      4412             0    0              4           4      0    rd wr mr mw me ac  [heap]  
    7eff9c68a000 rw-p 00000000  00:00        0  1602320 1601756 1601756            0            0             0       1601756    1601756   1601756       1200128    0              4           4      0    rd wr mr mw me ac    
    7efffe34e000 rw-p 00000000  00:00        0      516     516     516            0            0             0           516        516       516             0    0              4           4      0    rd wr mr mw me ac    
    7efffe3cf000 rw-p 00000000  00:00        0     2052    2048    2048            0            0             0          2048       2048      2048             0    0              4           4      0    rd wr mr mw me ac    
    7efffe611000 rw-p 00000000  00:00        0     1804    1800    1800            0            0             0          1800       1800      1800             0    0              4           4      0    rd wr mr mw me ac    
    7efffe7d4000 rw-s 00000000  00:04  5610332  8620720  321376  309186            0        23340             0        298036     321376         0             0    0              4           4      0 rd wr sh mr mw me ms  zero (deleted)  
    7f020ca80000 r--p 00000000  fd:01   151196   103580       4       0            4            0             0             0          4         0             0    0              4           4      0          rd mr mw me  locale-archive  
    7f0212fa7000 r-xp 00000000  fd:01   132852       88      16       0           16            0             0             0         16         0             0    0              4           4      0       rd ex mr mw me  libpthread-2.17.so  
    7f0212fbd000 ---p 00016000  fd:01   132852     2048       0       0            0            0             0             0          0         0             0    0              4           4      0             mr mw me  libpthread-2.17.so  
    7f02131bd000 r--p 00016000  fd:01   132852        4       4       0            0            4             0             0          0         4             0    0              4           4      0       rd mr mw me ac  libpthread-2.17.so  
    7f02131be000 rw-p 00017000  fd:01   132852        4       4       4            0            0             0             4          4         4             0    0              4           4      0    rd wr mr mw me ac  libpthread-2.17.so  
    7f02131bf000 rw-p 00000000  00:00        0       16       4       4            0            0             0             4          4         4             0    0              4           4      0    rd wr mr mw me ac    
    7f02131c3000 r-xp 00000000  fd:01   132826     1752     432       9          432            0             0             0        432         0             0    0              4           4      0       rd ex mr mw me  libc-2.17.so  
    7f0213379000 ---p 001b6000  fd:01   132826     2048       0       0            0            0             0             0          0         0             0    0              4           4      0             mr mw me  libc-2.17.so  
    7f0213579000 r--p 001b6000  fd:01   132826       16      16       2            0           16             0             0         12        16             0    0              4           4      0       rd mr mw me ac  libc-2.17.so  
    7f021357d000 rw-p 001ba000  fd:01   132826        8       8       8            0            0             0             8          8         8             0    0              4           4      0    rd wr mr mw me ac  libc-2.17.so  
    7f021357f000 rw-p 00000000  00:00        0       20      16      12            0            4             0            12         12        16             0    0              4           4      0    rd wr mr mw me ac    
    7f0213584000 r-xp 00000000  fd:01   132972     1028      68      32           44            0            24             0         68         0             0    0              4           4      0       rd ex mr mw me  libm-2.17.so  
    7f0213685000 ---p 00101000  fd:01   132972     2044       0       0            0            0             0             0          0         0             0    0              4           4      0             mr mw me  libm-2.17.so  
    7f0213884000 r--p 00100000  fd:01   132972        4       4       0            0            4             0             0          4         4             0    0              4           4      0       rd mr mw me ac  libm-2.17.so  
    7f0213885000 rw-p 00101000  fd:01   132972        4       4       0            0            4             0             0          4         4             0    0              4           4      0    rd wr mr mw me ac  libm-2.17.so  
    7f0213886000 r-xp 00000000  fd:01   132971       12       4       0            4            0             0             0          4         0             0    0              4           4      0       rd ex mr mw me  libdl-2.17.so  
    7f0213889000 ---p 00003000  fd:01   132971     2044       0       0            0            0             0             0          0         0             0    0              4           4      0             mr mw me  libdl-2.17.so  
    7f0213a88000 r--p 00002000  fd:01   132971        4       4       0            0            4             0             0          4         4             0    0              4           4      0       rd mr mw me ac  libdl-2.17.so  
    7f0213a89000 rw-p 00003000  fd:01   132971        4       4       0            0            4             0             0          0         4             0    0              4           4      0    rd wr mr mw me ac  libdl-2.17.so  
    7f0213a8a000 r-xp 00000000  fd:01   132976       28       4       0            4            0             0             0          4         0             0    0              4           4      0       rd ex mr mw me  librt-2.17.so  
    7f0213a91000 ---p 00007000  fd:01   132976     2044       0       0            0            0             0             0          0         0             0    0              4           4      0             mr mw me  librt-2.17.so  
    7f0213c90000 r--p 00006000  fd:01   132976        4       4       0            0            4             0             0          4         4             0    0              4           4      0       rd mr mw me ac  librt-2.17.so  
    7f0213c91000 rw-p 00007000  fd:01   132976        4       4       0            0            4             0             0          0         4             0    0              4           4      0    rd wr mr mw me ac  librt-2.17.so  
    7f0213c92000 r-xp 00000000  fd:01   132819      132      28       0           28            0             0             0         28         0             0    0              4           4      0    rd ex mr mw me dw  ld-2.17.so  
    7f0213cf0000 rw-p 00000000  00:00        0      132     128     128            0            0             0           128        128       128             0    0              4           4      0    rd wr mr mw me ac    
    7f0213e12000 rw-p 00000000  00:00        0      392     312     312            0            0             0           312        312       312             0    0              4           4      0    rd wr mr mw me ac    
    7f0213e75000 r--s 00000000  fd:01   401548      212       0       0            0            0             0             0          0         0             0    0              4           4      0          rd mr me ms  hosts  
    7f0213eaa000 rw-s 00000000  00:11  5610334        4       0       0            0            0             0             0          0         0             0    0              4           4      0 rd wr sh mr mw me ms  PostgreSQL.1804289383  
    7f0213eab000 rw-s 00000000  00:04 23166976        4       0       0            0            0             0             0          0         0             0    0              4           4      0 rd wr sh mr mw me ms  SYSV001d4fe9 (deleted)  
    7f0213eac000 rw-p 00000000  00:00        0       24      24       6            0           20             0             4         16        24             0    0              4           4      0    rd wr mr mw me ac    
    7f0213eb2000 r--p 00020000  fd:01   132819        4       4       0            0            4             0             0          4         4             0    0              4           4      0    rd mr mw me dw ac  ld-2.17.so  
    7f0213eb3000 rw-p 00021000  fd:01   132819        4       4       4            0            0             0             4          4         4             0    0              4           4      0 rd wr mr mw me dw ac  ld-2.17.so  
    7f0213eb4000 rw-p 00000000  00:00        0        4       4       0            0            4             0             0          4         4             0    0              4           4      0    rd wr mr mw me ac    
    7ffffb25f000 rw-p 00000000  00:00        0      152      60      28            0           36             0            24         24        60             0    0              4           4      0 rd wr mr mw me gd ac  [stack]  
    7ffffb336000 r-xp 00000000  00:00        0        8       4       0            4            0             0             0          4         0             0    0              4           4      0    rd ex mr mw me de  [vdso]  
ffffffffff600000 r-xp 00000000  00:00        0        4       0       0            0            0             0             0          0         0             0    0              4           4      0                rd ex  [vsyscall]  
                                               ======== ======= ======= ============ ============ ============= ============= ========== ========= ============= ==== ============== =========== ======   
                                               10357736 1935944 1922464         1432        23556          1660       1909296    1935824   1611476       1200128    0            188         188      0 KB   

使用bt打印该进程的调用栈

(gdb) bt   
#0  0x00007ffffb336ddf in gettimeofday ()  
#1  0x000000000063d26a in InstrStopNode (instr=0x14d4500, nTuples=1) at instrument.c:82  
#2  0x000000000062d813 in ExecProcNode (node=0x14cb548) at execProcnode.c:530  
#3  0x0000000000650578 in ExecModifyTable (node=0x14cacc8) at nodeModifyTable.c:1363  
#4  0x000000000062d53e in ExecProcNode (node=0x14cacc8) at execProcnode.c:389  
#5  0x000000000062ad21 in ExecutePlan (estate=0x14ca928, planstate=0x14cacc8, operation=CMD_UPDATE, sendTuples=0 '\000', numberTuples=0, direction=ForwardScanDirection, dest=0xd149e0 <donothingDR>) at execMain.c:1549  
#6  0x00000000006290ea in standard_ExecutorRun (queryDesc=0x14ca1e8, direction=ForwardScanDirection, count=0) at execMain.c:337  
#7  0x0000000000628fec in ExecutorRun (queryDesc=0x14ca1e8, direction=ForwardScanDirection, count=0) at execMain.c:285  
#8  0x00000000005c218f in ExplainOnePlan (plannedstmt=0x14ca150, into=0x0, es=0x14a54f8, queryString=0x1465b98 "explain (analyze,verbose,timing,costs,buffers) UPDATE tbl1 SET in_predict = true FROM tbl1 c, tbl2 p WHERE c.uid = p.uid;",   
    params=0x0, planduration=0x7ffffb281e40) at explain.c:503  
#9  0x00000000005c1ed9 in ExplainOneQuery (query=0x14a5348, into=0x0, es=0x14a54f8, queryString=0x1465b98 "explain (analyze,verbose,timing,costs,buffers) UPDATE tbl1 SET in_predict = true FROM tbl1 c, tbl2 p WHERE c.uid = p.uid;",   
    params=0x0) at explain.c:357  
#10 0x00000000005c1b9f in ExplainQuery (stmt=0x1467260, queryString=0x1465b98 "explain (analyze,verbose,timing,costs,buffers) UPDATE tbl1 SET in_predict = true FROM tbl1 c, tbl2 p WHERE c.uid = p.uid;", params=0x0, dest=0x14a5460)  
    at explain.c:245  
#11 0x0000000000790baf in standard_ProcessUtility (parsetree=0x1467260, queryString=0x1465b98 "explain (analyze,verbose,timing,costs,buffers) UPDATE tbl1 SET in_predict = true FROM tbl1 c, tbl2 p WHERE c.uid = p.uid;",   
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x14a5460, completionTag=0x7ffffb2820e0 "") at utility.c:658  
#12 0x00000000007903f2 in ProcessUtility (parsetree=0x1467260, queryString=0x1465b98 "explain (analyze,verbose,timing,costs,buffers) UPDATE tbl1 SET in_predict = true FROM tbl1 c, tbl2 p WHERE c.uid = p.uid;",   
    context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x14a5460, completionTag=0x7ffffb2820e0 "") at utility.c:334  
#13 0x000000000078f61a in PortalRunUtility (portal=0x14988a8, utilityStmt=0x1467260, isTopLevel=1 '\001', dest=0x14a5460, completionTag=0x7ffffb2820e0 "") at pquery.c:1183  
#14 0x000000000078f384 in FillPortalStore (portal=0x14988a8, isTopLevel=1 '\001') at pquery.c:1057  
#15 0x000000000078eced in PortalRun (portal=0x14988a8, count=9223372036854775807, isTopLevel=1 '\001', dest=0x1467998, altdest=0x1467998, completionTag=0x7ffffb2822c0 "") at pquery.c:781  
#16 0x0000000000789336 in exec_simple_query (query_string=0x1465b98 "explain (analyze,verbose,timing,costs,buffers) UPDATE tbl1 SET in_predict = true FROM tbl1 c, tbl2 p WHERE c.uid = p.uid;") at postgres.c:1104  
#17 0x000000000078d1ce in PostgresMain (argc=1, argv=0x13f6b28, dbname=0x13f6988 "postgres", username=0x13f6960 "postgres") at postgres.c:4030  
#18 0x0000000000719440 in BackendRun (port=0x14127f0) at postmaster.c:4239  
#19 0x0000000000718bd4 in BackendStartup (port=0x14127f0) at postmaster.c:3913  
#20 0x0000000000715612 in ServerLoop () at postmaster.c:1684  
#21 0x0000000000714c99 in PostmasterMain (argc=1, argv=0x13f5b90) at postmaster.c:1292  
#22 0x00000000006767fa in main (argc=1, argv=0x13f5b90) at main.c:228  

使用oprofile分析当时数据库的问题

下载并安装最新oprofile

http://oprofile.sourceforge.net/news/

采集信息

# mkdir op_results
# cd op_results
# operf --system-wide --lazy-conversion  

等待120秒后ctrl+c 退出

生成报告

#opreport -l -f -g -w -x -t 1 /home/digoal/pgsql9.5/bin/postgres  

Using /home/digoal/oprof/oprofile_data/samples/ for samples directory.
CPU: Intel Broadwell microarchitecture, speed 2494.28 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000
vma      samples  %        linenr info                 symbol name
0046a7f0 260      13.3952  /home/digoal/postgresql-9.5.3/src/backend/access/common/heaptuple.c:150 heap_fill_tuple
00491a50 208      10.7161  /home/digoal/postgresql-9.5.3/src/backend/access/heap/heapam.c:3217 heap_update
005b5100 157       8.0886  /home/digoal/postgresql-9.5.3/src/backend/executor/execQual.c:5486 ExecProject
0046a5d0 157       8.0886  /home/digoal/postgresql-9.5.3/src/backend/access/common/heaptuple.c:88 heap_compute_data_size
005ed880 86        4.4307  /home/digoal/postgresql-9.5.3/src/backend/nodes/bitmapset.c:860 bms_first_member
005b1030 78        4.0185  /home/digoal/postgresql-9.5.3/src/backend/executor/execJunk.c:263 ExecFilterJunk
0069f6e0 78        4.0185  /home/digoal/postgresql-9.5.3/src/backend/storage/lmgr/lwlock.c:906 LWLockAcquire
007b4ce0 69        3.5549  /home/digoal/postgresql-9.5.3/src/backend/utils/mmgr/aset.c:659 AllocSetAlloc
0046ae30 66        3.4003  /home/digoal/postgresql-9.5.3/src/backend/access/common/heaptuple.c:695 heap_form_tuple
005b3bd0 64        3.2973  /home/digoal/postgresql-9.5.3/src/backend/executor/execMain.c:1663 ExecConstraints
005bf310 48        2.4730  /home/digoal/postgresql-9.5.3/src/backend/executor/instrument.c:70 InstrStopNode
0069ed30 47        2.4214  /home/digoal/postgresql-9.5.3/src/backend/storage/lmgr/lwlock.c:1525 LWLockRelease
005ede30 33        1.7002  /home/digoal/postgresql-9.5.3/src/backend/nodes/bitmapset.c:111 bms_copy
00787400 31        1.5971  /home/digoal/postgresql-9.5.3/src/backend/utils/cache/relcache.c:4202 RelationGetIndexAttrBitmap
005cc0e0 30        1.5456  /home/digoal/postgresql-9.5.3/src/backend/executor/nodeModifyTable.c:790 ExecUpdate
004897a0 30        1.5456  /home/digoal/postgresql-9.5.3/src/backend/access/hash/hashfunc.c:317 hash_any
007b5350 25        1.2880  /home/digoal/postgresql-9.5.3/src/backend/utils/mmgr/aset.c:929 AllocSetFree
007a0180 25        1.2880  /home/digoal/postgresql-9.5.3/src/backend/utils/hash/dynahash.c:856 hash_search_with_hash_value
007b5a40 25        1.2880  /home/digoal/postgresql-9.5.3/src/backend/utils/mmgr/mcxt.c:813 palloc
00683960 23        1.1850  /home/digoal/postgresql-9.5.3/src/backend/storage/buffer/bufmgr.c:239 GetPrivateRefCountEntry
007b5740 22        1.1334  /home/digoal/postgresql-9.5.3/src/backend/utils/mmgr/mcxt.c:914 pfree
005cc510 20        1.0304  /home/digoal/postgresql-9.5.3/src/backend/executor/nodeModifyTable.c:1289 ExecModifyTable
004698a0 20        1.0304  /home/digoal/postgresql-9.5.3/src/backend/access/common/heaptuple.c:1327 slot_getsomeattrs

查看对应的函数调用

#opreport -l -f -g -w -x -d -t 1 /home/digoal/pgsql9.5/bin/postgres  

查看对应的函数调用, 哪些语句花了更多的CPU

#opannotate -x -s -t 1 /home/digoal/pgsql9.5/bin/postgres -i heap_fill_tuple  

update对应的select分析

这个UPDATE对应的查询不会导致内存膨胀

select * from tbl1, tbl1 c, tbl2 p WHERE c.uid = p.uid;  
  
postgres=# explain select * from tbl1, tbl1 c, tbl2 p WHERE c.uid = p.uid; 
                                                 QUERY PLAN                                                 
------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=37.85..69948391099.48 rows=2252418000000 width=260)
   ->  Merge Join  (cost=37.85..118451.33 rows=1500000 width=138)
         Merge Cond: (c.uid = p.uid)
         ->  Index Scan Backward using tbl1_uid_idx on tbl1 c  (cost=0.43..58384.61 rows=1501612 width=122)
         ->  Index Scan Backward using tbl2_uid_idx on tbl2 p  (cost=0.43..37588.43 rows=1500000 width=16)
   ->  Materialize  (cost=0.00..79264.18 rows=1501612 width=122)
         ->  Seq Scan on tbl1  (cost=0.00..43894.12 rows=1501612 width=122)
(7 rows)
  
postgres=# explain (analyze,verbose,timing,buffers,costs) select * from tbl1, tbl1 c, tbl2 p WHERE c.uid = p.uid;  

同样采集系统信息进行分析

operf --system-wide --lazy-conversion  
  
after 120 second  
  
ctrl+c  

查看报告

#opreport -l -f -g -w -x -t 1 /home/digoal/pgsql9.5/bin/postgres  

Using /home/digoal/oprof/oprofile_data/samples/ for samples directory.  
CPU: Intel Broadwell microarchitecture, speed 2494.28 MHz (estimated)  
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000  
vma      samples  %        linenr info                 symbol name  
005b5100 70       34.8259  /home/digoal/postgresql-9.5.3/src/backend/executor/execQual.c:5486 ExecProject  
00469570 44       21.8905  /home/digoal/postgresql-9.5.3/src/backend/access/common/heaptuple.c:1080 slot_deform_tuple  
005bf310 21       10.4478  /home/digoal/postgresql-9.5.3/src/backend/executor/instrument.c:70 InstrStopNode  
005bf640 11        5.4726  /home/digoal/postgresql-9.5.3/src/backend/executor/instrument.c:53 InstrStartNode  
005b45d0 7         3.4826  /home/digoal/postgresql-9.5.3/src/backend/executor/execProcnode.c:368 ExecProcNode  
007b4ce0 6         2.9851  /home/digoal/postgresql-9.5.3/src/backend/utils/mmgr/aset.c:659 AllocSetAlloc  
0068ac40 6         2.9851  /home/digoal/postgresql-9.5.3/src/backend/storage/file/buffile.c:359 BufFileRead  
007c1130 5         2.4876  /home/digoal/postgresql-9.5.3/src/backend/utils/sort/tuplestore.c:888 tuplestore_gettuple  
007b5350 4         1.9900  /home/digoal/postgresql-9.5.3/src/backend/utils/mmgr/aset.c:929 AllocSetFree  
005cd850 4         1.9900  /home/digoal/postgresql-9.5.3/src/backend/executor/nodeNestloop.c:61 ExecNestLoop  
005bc400 4         1.9900  /home/digoal/postgresql-9.5.3/src/backend/executor/execTuples.c:391 ExecStoreMinimalTuple  
005bc160 3         1.4925  /home/digoal/postgresql-9.5.3/src/backend/executor/execTuples.c:444 ExecClearTuple  
007b5670 3         1.4925  /home/digoal/postgresql-9.5.3/src/backend/utils/mmgr/mcxt.c:396 GetMemoryChunkSpace  

问题对比分析

update时跟踪到了AllocSetAlloc,但是比AllocSetFree多很多。

select时同时跟踪到了AllocSetAlloc,AllocSetFree,都不太多,虽然两者的比例差不多。

update时会不会是这里有内存泄露呢?

进一步分析
分配分析

#opannotate -x -s -t 1 /home/digoal/pgsql9.5/bin/postgres -i AllocSetAlloc|less

               :/*
               : * AllocSetAlloc
               : *              Returns pointer to allocated memory of given size or NULL if
               : *              request could not be completed; memory is added to the set.
               : *
               : * No request may exceed:
               : *              MAXALIGN_DOWN(SIZE_MAX) - ALLOC_BLOCKHDRSZ - ALLOC_CHUNKHDRSZ
               : * All callers use a much-lower limit.
               : */
               :static void *
               :AllocSetAlloc(MemoryContext context, Size size)
    16 23.1884 :{ /* AllocSetAlloc total:     69 100.000 */
               :        AllocSet        set = (AllocSet) context;
               :        AllocBlock      block;
               :        AllocChunk      chunk;
               :        int                     fidx;
               :        Size            chunk_size;
               :        Size            blksize;

....
               :        fidx = AllocSetFreeIndex(size);
     8 11.5942 :        chunk = set->freelist[fidx];
     2  2.8986 :        if (chunk != NULL)
               :        {
               :                Assert(chunk->size >= size);
               :
     5  7.2464 :                set->freelist[fidx] = (AllocChunk) chunk->aset;
               :
     4  5.7971 :                chunk->aset = (void *) set;
               :
...
               :                AllocAllocInfo(set, chunk);
     2  2.8986 :                return AllocChunkGetPointer(chunk);
               :        }
               :
               :        /*
               :         * Choose the actual chunk size to allocate.
               :         */
               :        chunk_size = (1 << ALLOC_MINBITS) << fidx;
               :        Assert(chunk_size >= size);
               :
               :        /*
               :         * If there is enough room in the active allocation block, we will put the
               :         * chunk into that block.  Else must start a new one.
               :         */
     2  2.8986 :        if ((block = set->blocks) != NULL)
               :        {
     2  2.8986 :                Size            availspace = block->endptr - block->freeptr;
               :
    12 17.3913 :                if (availspace < (chunk_size + ALLOC_CHUNKHDRSZ))
               :                {
...
     1  1.4493 :        chunk->size = chunk_size;
               :#ifdef MEMORY_CONTEXT_CHECKING
               :        chunk->requested_size = size;
               :        VALGRIND_MAKE_MEM_NOACCESS(&chunk->requested_size,
               :                                                           sizeof(chunk->requested_size));
               :        /* set mark to catch clobber of "unused" space */
               :        if (size < chunk->size)
               :                set_sentinel(AllocChunkGetPointer(chunk), size);
               :#endif
               :#ifdef RANDOMIZE_ALLOCATED_MEMORY
               :        /* fill the allocated space with junk */
               :        randomize_mem((char *) AllocChunkGetPointer(chunk), size);
               :#endif
               :
               :        AllocAllocInfo(set, chunk);
     4  5.7971 :        return AllocChunkGetPointer(chunk);
     3  4.3478 :}
               :

回收分析

#opannotate -x -s -t 1 /home/digoal/pgsql9.5/bin/postgres -i AllocSetFree|less

               :/*
               : * AllocSetFree
               : *              Frees allocated memory; memory is removed from the set.
               : */
               :static void
               :AllocSetFree(MemoryContext context, void *pointer)
     7 28.0000 :{ /* AllocSetFree total:     25 100.000 */
               :        AllocSet        set = (AllocSet) context;
               :        AllocChunk      chunk = AllocPointerGetChunk(pointer);
               :
               :        AllocFreeInfo(set, chunk);

小结

1. 写SQL时需要注意,避免产生笛卡尔积。
2. 这个问题已反馈给社区,看看是不是存在内存泄露。

问题已修

tom lane的速度太快了,从报BUG到现在10个小时过去,已经修复 。

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blobdiff;f=src/backend/access/heap/heapam.c;h=6a27ef4140091b0c53d3a68f9d947824b2ffe8c2;hp=c63dfa0bafc606ea3dc1ee9c7427d92c28ed09d4;hb=ae4760d667c71924932ab32e14996b5be1831fc6;hpb=ca9cb940d23dc8869a635fa27a08e60837b17c07

Fix small query-lifespan memory leak in bulk updates.

When there is an identifiable REPLICA IDENTITY index on the target table,
heap_update leaks the id_attrs bitmapset.  That's not many bytes, but it
adds up over enough rows, since the code typically runs in a query-lifespan
context.  Bug introduced in commit e55704d8b, which did a rather poor job
of cloning the existing use-pattern for RelationGetIndexAttrBitmap().

Per bug #14293 from Zhou Digoal.  Back-patch to 9.4 where the bug was
introduced.

Report: <20160824114320.15676.45171@wrigleys.postgresql.org>
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index c63dfa0..6a27ef4 100644 (file)
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -3802,6 +3802,7 @@ l2:
            ReleaseBuffer(vmbuffer);
        bms_free(hot_attrs);
        bms_free(key_attrs);
+       bms_free(id_attrs);
        return result;
    }
 
@@ -4268,6 +4269,7 @@ l2:
 
    bms_free(hot_attrs);
    bms_free(key_attrs);
+   bms_free(id_attrs);
 
    return HeapTupleMayBeUpdated;
 }

Flag Counter

digoal’s 大量PostgreSQL文章入口