执行计划维度统计判断执行计划翻转统计每种执行计划的统计信息 use pg_stat_plans in PostgreSQL 9.0, 9.1 and 9.2
背景
pg_stat_plans 类似于auto_explain和pg_stat_statements的结合. 或者说它是从pg_stat_statements扩展而来.
pg_stat_statements一般被用来跟踪数据库中执行的SQL语句的次数, 共计执行时间, 块扫描的信息等等. 如下 :
Column | Type | Modifiers
---------------------+------------------+-----------
userid | oid |
dbid | oid |
query | text |
calls | bigint |
total_time | double precision |
rows | bigint |
shared_blks_hit | bigint |
shared_blks_read | bigint |
shared_blks_dirtied | bigint |
shared_blks_written | bigint |
local_blks_hit | bigint |
local_blks_read | bigint |
local_blks_dirtied | bigint |
local_blks_written | bigint |
temp_blks_read | bigint |
temp_blks_written | bigint |
blk_read_time | double precision |
blk_write_time | double precision |
而pg_stat_plans呢? 除了记录调用次数, 共计执行时间以外, 还需要记录planid(从query tree hash而来), 如下 :
Column | Type | Modifiers
---------------------+------------------+-----------
userid | oid |
dbid | oid |
planid | oid |
query | text |
had_our_search_path | boolean |
query_valid | boolean |
calls | bigint |
total_time | double precision |
rows | bigint |
shared_blks_hit | bigint |
shared_blks_read | bigint |
shared_blks_written | bigint |
local_blks_hit | bigint |
local_blks_read | bigint |
local_blks_written | bigint |
temp_blks_read | bigint |
temp_blks_written | bigint |
last_startup_cost | double precision |
last_total_cost | double precision |
换句话说, pg_stat_plans并没有真正记录explain的内容, 而是换算成了hash值, 这可以省去不少存储空间, 因为这些都是消耗内存的.
pg_stat_plans需要的共享内存在数据库启动时就被分配, 约等于pg_stat_plans.max * plans_query_size bytes;
安装
1. 下载
https://github.com/2ndQuadrant/pg_stat_plans.git
2. 下载后解压到 $PG_SRC/contrib/
3. 编译安装需要用到pg_config, 所以需要把它放到PATH中.
su - root
. /home/postgres/.bash_profile
cd $PG_SRC/contrib/pg_stat_plans/
gmake clean
gmake
gmake install
4. 9.1和9.2 使用create extension 安装.
9.0则使用pg_stat_plans.sql来安装.
5. 安装完后需要修改postgresql.conf重启数据库.
简单的配置如下 :
# postgresql.conf
shared_preload_libraries = 'pg_stat_plans'
# Optionally:
pg_stat_plans.max = 10000
pg_stat_plans.track = all
如果plans_query_size bytes=2048的话, 需要大约20MB的共享内存.(如果track_activity_query_size没有设置的话, query_size使用PostgreSQL的全局参数值)
其他可用到的参数 :
pg_stat_plans adds the following configuration parameters:
pg_stat_plans.max (integer)
pg_stat_plans.max is the maximum number of plans tracked by the module
(i.e., the maximum number of rows in the pg_stat_plans view).
If more distinct plans than that are observed, information about the least-executed statements is discarded.
The default value is 1000. This parameter can only be set at server start.
pg_stat_plans.track (enum)
pg_stat_plans.track controls which statements' plans are counted by the module.
Specify top to track top-level statements (those issued directly by clients), all to also track nested statements
(such as statements invoked within functions), or none to disable plan statistics collection.
The default value is top. Only superusers can change this setting.
pg_stat_plans.save (boolean)
pg_stat_plans.save specifies whether to save plan statistics across server shutdowns.
If it is off then statistics are not saved at shutdown nor reloaded at server start.
The default value is on. This parameter can only be set in the postgresql.conf file or on the server command line.
pg_stat_plans.planid_notice (boolean)
Raise notice of a plan's id after its execution. Useful for verifying explain output on an ad-hoc basis.
pg_stat_plans.explain_format (enum)
pg_stat_plans.explain_format selects the EXPLAIN output format to be used
(i.e the format that will be returned by pg_stat_plans_explain()).
The allowed values are text, xml, json, and yaml. The default is text.
pg_stat_plans.verbose (boolean)
pg_stat_plans.verbose specifies if explain output should be verbose
(that is, equivalent to specifying VERBOSE with SQL EXPLAIN). The default is off.
pg_stat_plans.plans_query_size (integer)
Controls the length in bytes of the stored SQL query string.
Because truncating the stored strings prevents subsequently explaining the entry, it may be necessary to increase this value.
The default is 2048.
配置好后重启数据库.
测试
我这里使用PostgreSQL 9.2beta4进行测试.
1. 在PSQL中查看是否安装好了pg_stat_plans.
postgres=# \dx
List of installed extensions
Name | Version | Schema | Description
---------------+---------+------------+-----------------------------------
pg_stat_plans | 0.1 | public | Monitors execution costs of plans
2. 可以看到新增了3个函数, 2个视图.
-[ RECORD 7 ]-------+---------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------
Schema | public
Name | pg_stat_plans_explain
Result data type | text
Argument data types | planid oid, userid oid DEFAULT NULL::oid, dbid oid DEFAULT NULL::oid, encodingid oid DEFAULT NULL::oid
Type | normal
-[ RECORD 8 ]-------+---------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------
Schema | public
Name | pg_stat_plans_pprint
Result data type | text
Argument data types | sqltext text
Type | normal
-[ RECORD 9 ]-------+---------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------
Schema | public
Name | pg_stat_plans_reset
Result data type | void
Argument data types |
Type | normal
postgres=# \dv
List of relations
-[ RECORD 1 ]-----------------
Schema | public
Name | pg_stat_plans
Type | view
Owner | postgres
-[ RECORD 2 ]-----------------
Schema | public
Name | pg_stat_plans_queries
Type | view
Owner | postgres
3. 创建测试表, 插入测试数据.
分别测试一个SQL的两种执行计划. 那么pg_stat_plans将跟踪到这个改变, 并在pg_log中记录WARNING告警.
postgres=# drop table test;
DROP TABLE
postgres=# create table test (id int);
CREATE TABLE
postgres=# select pg_stat_plans_reset();
pg_stat_plans_reset
---------------------
(1 row)
postgres=# insert into test select generate_series(1,100000);
INSERT 0 100000
postgres=# select * from test where id=1;
id
----
1
(1 row)
创建索引, 执行同一条SQL, 执行计划将发生变化. 以此来观察pg_stat_plans的变化.
postgres=# create index idx_test on test(id);
CREATE INDEX
postgres=# select * from test where id=1;
id
----
1
(1 row)
查询pg_stat_plans, 可以看出planid=2758236521的记录发生了变化, query_valid=t变成了query_valid=f ;
postgres=# select planid,query,query_valid,pg_stat_plans_explain(planid, userid, dbid) from pg_stat_plans where dbid=12788 and query ~ 'test';
-[ RECORD 1 ]---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
planid | 2758236521
query | select * from test where id=1;
query_valid | f
pg_stat_plans_explain | ***** Existing entry's planid (2758236521) and explain of original SQL query string planid (1252991393) differ *****
| Index Only Scan using idx_test on test (cost=0.00..2.30 rows=1 width=4)
| Index Cond: (id = 1)
-[ RECORD 5 ]---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
planid | 1252991393
query | select * from test where id=1;
query_valid | t
pg_stat_plans_explain | Index Only Scan using idx_test on test (cost=0.00..2.30 rows=1 width=4)
| Index Cond: (id = 1)
从这个信息来看, planid=2758236521对应的SQL, 当前的执行计划(通过pg_stat_plans_explain(planid, userid, dbid)得到)与执行时的执行计划不一致, 也就是发生了改变, 所以”Existing entry’s planid (2758236521) and explain of original SQL query string planid (1252991393) differ”.
并且query_valid修改成f. 同时在pg_stat_plans中新增了一条planid=1252991393的执行计划. 与当前执行计划一致.
柱状图 :
postgres=# select * from pg_stat_plans_queries where normalize_query ~ 'test';
-[ RECORD 1 ]-------+----------------------------------------------
userid | 10
dbid | 12788
plan_ids | {2758236521,1252991393}
calls_histogram | {1,5}
avg_time_histogram | {21.224,0.016}
normalize_query | select * from test where id=?;
calls | 6
total_time | 21.304
time_variance | 224.889632
time_stddev | 14.9963206154043
rows | 2
shared_blks_hit | 444
shared_blks_read | 3
shared_blks_written | 0
local_blks_hit | 0
local_blks_read | 0
local_blks_written | 0
temp_blks_read | 0
temp_blks_written | 0
解释 :
plan_ids 存储planid的数组
calls_histogram 存储的是对应的调用次数的数组
avg_time_histogram 存储的是对应的平均执行时间的数组.
从这里查询到的信息, 使用planid=2758236521这个执行计划的SQL, 调用了1次, 平均执行时间21.224毫秒.
使用planid=2758236521这个执行计划的SQL, 调用了5次, 平均执行时间0.016毫秒.
4. 接下来观察一下当执行计划与原先一致时, query_valid再变回去t值.
删除索引 :
postgres=# drop index idx_test;
DROP INDEX
重新查看pg_stat_plans_explain :
postgres=# select planid,query,query_valid,pg_stat_plans_explain(planid, userid, dbid) from pg_stat_plans where dbid=12788 and query ~ 'test';
NOTICE: updated pg_stat_plans query string of entry 2758236521
CONTEXT: SQL statement "EXPLAIN select * from test where id=1;"
WARNING: Existing pg_stat_plans entry planid (1252991393) differs from new plan for query (2758236521).
-[ RECORD 1 ]---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
planid | 2758236521
query | select * from test where id=1;
query_valid | f
pg_stat_plans_explain | Seq Scan on test (cost=0.00..1693.00 rows=1 width=4)
| Filter: (id = 1)
-[ RECORD 6 ]---------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
planid | 1252991393
query | select * from test where id=1;
query_valid | t
pg_stat_plans_explain | ***** Existing entry's planid (1252991393) and explain of original SQL query string planid (2758236521) differ *****
| Seq Scan on test (cost=0.00..1693.00 rows=1 width=4)
| Filter: (id = 1)
显然这里发生了调换, planid=2758236521的执行计划变回去了, 而planid=1252991393的SQL的执行计划则与之前index only scan不同.
所以这里发生了两个更新记录的操作. 都是修改query_valid.
小结
1. pg_stat_plans 一般可以用来发现数据库中执行计划发生变化的SQL。
第一, 从pg_stat_plans视图的query_valid字段值为false可以发现当前的SQL执行计划与执行这个SQL时用到的执行计划不一致.
第二, 从PostgreSQL log中的WARNING也能发现这个不一致.
2012-10-22 17:08:12.118 CST,"postgres","postgres",6720,"[local]",50850932.1a40,2,"SELECT",2012-10-22 16:52:02 CST,2/426,0,WARNING,55000,"Existing pg_stat_plans entry planid (2758236521) differs from new plan for query (1252991393).",,,,,,,,,"psql"
2. pg_stat_plans_explain(planid, userid, dbid)
函数将调用PostgreSQL中的explain得到执行计划(query tree). 调用pg_stat_plans_explain除了得到当前的执行计划外, 还需要对query tree的hash值与planid进行匹配, 如果不匹配则修改query_valid字段中存储的值为false. 所以不调用pg_stat_plans_explain()的话不会对query_valid的内容进行更改.
3. planId |= hash_any(jstate.jumble, jstate.jumble_len)
, 不同的执行计划可能得到同样的hash值, 也就是planid. 尽管这种概率较小.
【参考】
1. https://github.com/2ndQuadrant/pg_stat_plans
2. http://blog.163.com/digoal@126/blog/static/163877040201151534631313/
3. http://blog.163.com/digoal@126/blog/static/163877040201221382150858/
4. http://blog.ioguix.net/
5. src/backend/access/hash/hashfunc.c
00289 /*
00290 * hash_any() -- hash a variable-length key into a 32-bit value
00291 * k : the key (the unaligned variable-length array of bytes)
00292 * len : the length of the key, counting by bytes
00293 *
00294 * Returns a uint32 value. Every bit of the key affects every bit of
00295 * the return value. Every 1-bit and 2-bit delta achieves avalanche.
00296 * About 6*len+35 instructions. The best hash table sizes are powers
00297 * of 2. There is no need to do mod a prime (mod is sooo slow!).
00298 * If you need less than 32 bits, use a bitmask.
00299 *
00300 * Note: we could easily change this function to return a 64-bit hash value
00301 * by using the final values of both b and c. b is perhaps a little less
00302 * well mixed than c, however.
00303 */
00304 Datum
00305 hash_any(register const unsigned char *k, register int keylen)
00306 {
00307 register uint32 a,
00308 b,
00309 c,
00310 len;
00311
00312 /* Set up the internal state */
00313 len = keylen;
00314 a = b = c = 0x9e3779b9 + len + 3923095;
00315
00316 /* If the source pointer is word-aligned, we use word-wide fetches */
00317 if (((intptr_t) k & UINT32_ALIGN_MASK) == 0)
00318 {
00319 /* Code path for aligned source data */
00320 register const uint32 *ka = (const uint32 *) k;
00321
00322 /* handle most of the key */
00323 while (len >= 12)
00324 {
00325 a += ka[0];
00326 b += ka[1];
00327 c += ka[2];
00328 mix(a, b, c);
00329 ka += 3;
00330 len -= 12;
00331 }
00332
00333 /* handle the last 11 bytes */
00334 k = (const unsigned char *) ka;
00335 #ifdef WORDS_BIGENDIAN
00336 switch (len)
00337 {
00338 case 11:
00339 c += ((uint32) k[10] << 8);
00340 /* fall through */
00341 case 10:
00342 c += ((uint32) k[9] << 16);
00343 /* fall through */
00344 case 9:
00345 c += ((uint32) k[8] << 24);
00346 /* the lowest byte of c is reserved for the length */
00347 /* fall through */
00348 case 8:
00349 b += ka[1];
00350 a += ka[0];
00351 break;
00352 case 7:
00353 b += ((uint32) k[6] << 8);
00354 /* fall through */
00355 case 6:
00356 b += ((uint32) k[5] << 16);
00357 /* fall through */
00358 case 5:
00359 b += ((uint32) k[4] << 24);
00360 /* fall through */
00361 case 4:
00362 a += ka[0];
00363 break;
00364 case 3:
00365 a += ((uint32) k[2] << 8);
00366 /* fall through */
00367 case 2:
00368 a += ((uint32) k[1] << 16);
00369 /* fall through */
00370 case 1:
00371 a += ((uint32) k[0] << 24);
00372 /* case 0: nothing left to add */
00373 }
00374 #else /* !WORDS_BIGENDIAN */
00375 switch (len)
00376 {
00377 case 11:
00378 c += ((uint32) k[10] << 24);
00379 /* fall through */
00380 case 10:
00381 c += ((uint32) k[9] << 16);
00382 /* fall through */
00383 case 9:
00384 c += ((uint32) k[8] << 8);
00385 /* the lowest byte of c is reserved for the length */
00386 /* fall through */
00387 case 8:
00388 b += ka[1];
00389 a += ka[0];
00390 break;
00391 case 7:
00392 b += ((uint32) k[6] << 16);
00393 /* fall through */
00394 case 6:
00395 b += ((uint32) k[5] << 8);
00396 /* fall through */
00397 case 5:
00398 b += k[4];
00399 /* fall through */
00400 case 4:
00401 a += ka[0];
00402 break;
00403 case 3:
00404 a += ((uint32) k[2] << 16);
00405 /* fall through */
00406 case 2:
00407 a += ((uint32) k[1] << 8);
00408 /* fall through */
00409 case 1:
00410 a += k[0];
00411 /* case 0: nothing left to add */
00412 }
00413 #endif /* WORDS_BIGENDIAN */
00414 }
00415 else
00416 {
00417 /* Code path for non-aligned source data */
00418
00419 /* handle most of the key */
00420 while (len >= 12)
00421 {
00422 #ifdef WORDS_BIGENDIAN
00423 a += (k[3] + ((uint32) k[2] << 8) + ((uint32) k[1] << 16) + ((uint32) k[0] << 24));
00424 b += (k[7] + ((uint32) k[6] << 8) + ((uint32) k[5] << 16) + ((uint32) k[4] << 24));
00425 c += (k[11] + ((uint32) k[10] << 8) + ((uint32) k[9] << 16) + ((uint32) k[8] << 24));
00426 #else /* !WORDS_BIGENDIAN */
00427 a += (k[0] + ((uint32) k[1] << 8) + ((uint32) k[2] << 16) + ((uint32) k[3] << 24));
00428 b += (k[4] + ((uint32) k[5] << 8) + ((uint32) k[6] << 16) + ((uint32) k[7] << 24));
00429 c += (k[8] + ((uint32) k[9] << 8) + ((uint32) k[10] << 16) + ((uint32) k[11] << 24));
00430 #endif /* WORDS_BIGENDIAN */
00431 mix(a, b, c);
00432 k += 12;
00433 len -= 12;
00434 }
00435
00436 /* handle the last 11 bytes */
00437 #ifdef WORDS_BIGENDIAN
00438 switch (len) /* all the case statements fall through */
00439 {
00440 case 11:
00441 c += ((uint32) k[10] << 8);
00442 case 10:
00443 c += ((uint32) k[9] << 16);
00444 case 9:
00445 c += ((uint32) k[8] << 24);
00446 /* the lowest byte of c is reserved for the length */
00447 case 8:
00448 b += k[7];
00449 case 7:
00450 b += ((uint32) k[6] << 8);
00451 case 6:
00452 b += ((uint32) k[5] << 16);
00453 case 5:
00454 b += ((uint32) k[4] << 24);
00455 case 4:
00456 a += k[3];
00457 case 3:
00458 a += ((uint32) k[2] << 8);
00459 case 2:
00460 a += ((uint32) k[1] << 16);
00461 case 1:
00462 a += ((uint32) k[0] << 24);
00463 /* case 0: nothing left to add */
00464 }
00465 #else /* !WORDS_BIGENDIAN */
00466 switch (len) /* all the case statements fall through */
00467 {
00468 case 11:
00469 c += ((uint32) k[10] << 24);
00470 case 10:
00471 c += ((uint32) k[9] << 16);
00472 case 9:
00473 c += ((uint32) k[8] << 8);
00474 /* the lowest byte of c is reserved for the length */
00475 case 8:
00476 b += ((uint32) k[7] << 24);
00477 case 7:
00478 b += ((uint32) k[6] << 16);
00479 case 6:
00480 b += ((uint32) k[5] << 8);
00481 case 5:
00482 b += k[4];
00483 case 4:
00484 a += ((uint32) k[3] << 24);
00485 case 3:
00486 a += ((uint32) k[2] << 16);
00487 case 2:
00488 a += ((uint32) k[1] << 8);
00489 case 1:
00490 a += k[0];
00491 /* case 0: nothing left to add */
00492 }
00493 #endif /* WORDS_BIGENDIAN */
00494 }
00495
00496 final(a, b, c);
00497
00498 /* report the result */
00499 return UInt32GetDatum(c);
00500 }