PostgreSQL 函数调试、诊断、优化 & auto_explain
背景
PostgreSQL的服务端编程能力比较强,不仅支持像java, R, python这种流行的编程语言,另外还内置了一个与Oracle plsql功能几乎一致的plpgsql语言。
对于开发人员来说,有了语言还不够的,调试是非常常见的需求。
所以PostgreSQL的函数语言怎么调试呢,使用pldebugger即可。
pldebugger的安装
https://git.postgresql.org/gitweb/?p=pldebugger.git;a=summary
plpgsql调试方法
https://www.pgadmin.org/docs/1.22/debugger.html
除了函数的调试,还有函数内SQL的执行计划如何观测。
auto_explain
auto_explain 一个比较给力的模块.
用于向日志输出超出定义阈值执行时间的SQL的执行计划。更强大的是可以选择是否输出内嵌SQL的执行计划(如函数中的SQL).
auto_explain的输出更详细的解释可以参考EXPLAIN的解释。或者man EXPLAIN.
SYNOPSIS
EXPLAIN [ ( option [, ...] ) ] statement
EXPLAIN [ ANALYZE ] [ VERBOSE ] statement
where option can be one of:
ANALYZE [ boolean ]
VERBOSE [ boolean ]
COSTS [ boolean ]
BUFFERS [ boolean ]
FORMAT { TEXT | XML | JSON | YAML }
开启方式有两种 :
1. 通过配置文件postgresql.conf
需要重启数据库。
2. 通过LOAD ‘auto_explain’;
这个只在当前SESSION生效,不需要重启数据库, 需要超级用户权限。
其实还可以设置local_preload_libraries,session_preload_libraries参数来启用。
例子
1. 修改配置文件:
shared_preload_libraries = 'auto_explain'
新增如下配置:
auto_explain.log_min_duration = 0 # 为了方便查看,这里把时间设置为0,所有SQL都会被auto_explain捕获输出.实际使用的时候适当调大。如 100ms
auto_explain.log_analyze = true
# 以下可选
auto_explain.log_verbose = true
auto_explain.log_buffers = true
auto_explain.log_nested_statements = true
然后重启数据库.重启时会输出一个LOG
postgres@db5-> LOG: loaded library "$libdir/auto_explain.so"
下面来执行几条SQL
postgres@db5-> psql -h 127.0.0.1 digoal digoal
psql (9.1beta1)
Type "help" for help.
digoal=> select * from tbl_user_info limit 2;
id | firstname | lastname | corp
----+-----------+----------+----------
1 | zhou1 | digoal1 | sky-mobi
2 | zhou2 | digoal2 | sky-mobi
(2 rows)
digoal=> select count(*) from tbl_test;
count
---------
1000100
(1 row)
日志输出
2011-06-08 15:19:14.390 CST,"digoal","digoal",13789,"127.0.0.1:59549",4def2270.35dd,1,"SELECT",2011-06-08 15:19:12 CST,2/18,0,LOG,00
000,"duration: 0.040 ms plan:
Query Text: select * from tbl_user_info limit 2;
Limit (cost=0.00..0.04 rows=2 width=31) (actual time=0.020..0.023 rows=2 loops=1)
Output: id, firstname, lastname, corp
Buffers: shared hit=1
-> Seq Scan on public.tbl_user_info (cost=0.00..183.00 rows=10000 width=31) (actual time=0.014..0.015 rows=2 loops=1)
Output: id, firstname, lastname, corp
Buffers: shared hit=1",,,,,,,,,"psql"
2011-06-08 15:19:17.902 CST,"digoal","digoal",13789,"127.0.0.1:59549",4def2270.35dd,2,"SELECT",2011-06-08 15:19:12 CST,2/19,0,LOG,00
000,"duration: 1232.064 ms plan:
Query Text: select count(*) from tbl_test;
Aggregate (cost=16927.25..16927.26 rows=1 width=0) (actual time=1232.044..1232.045 rows=1 loops=1)
Output: count(*)
Buffers: shared hit=4426
-> Seq Scan on public.tbl_test (cost=0.00..14427.00 rows=1000100 width=0) (actual time=0.015..626.872 rows=1000100 loops=1)
Output: id
Buffers: shared hit=4426",,,,,,,,,"psql"
2. 通过LOAD ‘auto_explain’ ;
这个只在当前SESSION生效,不需要重启数据库, 需要超级用户权限。
首先先恢复postgresql.conf的配置,去除前面的配置.然后重启数据库.
普通用户不允许加载auto_explain模块.(普通用户只允许加载$libdir/plugins目录下的模块,但是auto_explain即使拷贝到这个目录也不行)
digoal=> load 'auto_explain';
ERROR: access to library "auto_explain" is not allowed
digoal=> \c digoal postgres
You are now connected to database "digoal" as user "postgres".
digoal=# load 'auto_explain';
LOAD
digoal=# set auto_explain.log_min_duration=0;
SET
digoal=# select * from tbl_user_info limit 2;
id | firstname | lastname | corp
----+-----------+----------+----------
1 | zhou1 | digoal1 | sky-mobi
2 | zhou2 | digoal2 | sky-mobi
(2 rows)
查看日志
2011-06-08 15:25:33.361 CST,"postgres","digoal",13968,"127.0.0.1:15445",4def23d5.3690,1,"SELECT",2011-06-08 15:25:09 CST,2/9,0,LOG,0
0000,"duration: 0.048 ms plan:
Query Text: select * from tbl_user_info limit 2;
Limit (cost=0.00..0.04 rows=2 width=31)
-> Seq Scan on tbl_user_info (cost=0.00..183.00 rows=10000 width=31)",,,,,,,,,"psql"
连接到digoal用户,这个LOAD的模块消失(后面会有解释).
digoal=# load 'auto_explain';
LOAD
digoal=# set session auto_explain.log_min_duration=0;
SET
digoal=# \c digoal digoal
You are now connected to database "digoal" as user "digoal".
digoal=> select * from tbl_user_info limit 3;
id | firstname | lastname | corp
----+-----------+----------+----------
1 | zhou1 | digoal1 | sky-mobi
2 | zhou2 | digoal2 | sky-mobi
3 | zhou3 | digoal3 | sky-mobi
(3 rows)
查看日志,没有auto_explain的输出.即使这里使用了session来定义这个参数的有效范围。
原因是session重新分配了,因为\c digoal digoal 相当于重新连接,backend也重新生成,如下.
digoal=> select * from pg_stat_activity;
datid | datname | procpid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_s
tart | xact_start | query_start | waiting | current_query
-------+---------+---------+----------+---------+------------------+-------------+-----------------+-------------+------------------
-------------+-------------------------------+-------------------------------+---------+---------------------------------
16430 | digoal | 14155 | 16423 | digoal | psql | 127.0.0.1 | | 5959 | 2011-06-08 15:31:
32.794906+08 | 2011-06-08 15:31:40.616346+08 | 2011-06-08 15:31:40.616346+08 | f | select * from pg_stat_activity;
(1 row)
digoal=> \c digoal postgres
You are now connected to database "digoal" as user "postgres".
digoal=# select * from pg_stat_activity;
datid | datname | procpid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_
start | xact_start | query_start | waiting | current_query
-------+---------+---------+----------+----------+------------------+-------------+-----------------+-------------+-----------------
--------------+-------------------------------+-------------------------------+---------+---------------------------------
16430 | digoal | 14161 | 10 | postgres | psql | 127.0.0.1 | | 5961 | 2011-06-08 15:31
:43.937297+08 | 2011-06-08 15:31:45.178005+08 | 2011-06-08 15:31:45.178005+08 | f | select * from pg_stat_activity;
(1 row)
那么如何避免这样的情况发生呢? 答案是使用set role .
digoal=# load 'auto_explain';
LOAD
digoal=# set session auto_explain.log_min_duration=0;
SET
digoal=# set role digoal;
SET
digoal=> select * from tbl_user_info limit 3;
id | firstname | lastname | corp
----+-----------+----------+----------
1 | zhou1 | digoal1 | sky-mobi
2 | zhou2 | digoal2 | sky-mobi
3 | zhou3 | digoal3 | sky-mobi
(3 rows)
再看日志,已经有了 :
2011-06-08 15:29:30.011 CST,"postgres","digoal",14062,"127.0.0.1:5939",4def24bc.36ee,1,"SELECT",2011-06-08 15:29:00 CST,2/29,0,LOG,0
0000,"duration: 0.031 ms plan:
Query Text: select * from tbl_user_info limit 3;
Limit (cost=0.00..0.05 rows=3 width=31)
-> Seq Scan on tbl_user_info (cost=0.00..183.00 rows=10000 width=31)",,,,,,,,,"psql"
如果你不想看日志这么麻烦,想在client直接显示,也很方便,设置client_min_messages=’log’就可以看到auto explain的输出了。
set client_min_messages='log';
set auto_explain.log_min_duration = 0;
set auto_explain.log_analyze = true;
set auto_explain.log_verbose = true;
set auto_explain.log_buffers = true;
set auto_explain.log_nested_statements = true;
postgres=# do language plpgsql $$
declare
begin
perform 1 from pg_class where oid=1;
end;
$$;
LOG: duration: 0.008 ms plan:
Query Text: SELECT 1 from pg_class where oid=1
Index Only Scan using pg_class_oid_index on pg_catalog.pg_class (cost=0.27..1.29 rows=1 width=4) (actual time=0.006..0.006 rows=0 loops=1)
Output: 1
Index Cond: (pg_class.oid = '1'::oid)
Heap Fetches: 0
Buffers: shared hit=2
DO
另一个例子
load 'auto_explain';
set auto_explain.log_analyze =on;
set auto_explain.log_buffers =on;
set auto_explain.log_min_duration =0;
set auto_explain.log_nested_statements =on;
set auto_explain.log_timing =on;
set auto_explain.log_verbose =on;
set client_min_messages ='log';
postgres=# do language plpgsql $$
declare
arr text;
begin
select string_agg((random()*100000)::int::text, ',') into arr from generate_series(1,10);
execute 'select * from a where id in ('||arr||')';
end;
$$ ;
LOG: duration: 0.024 ms plan:
Query Text: select string_agg((random()*100000)::int::text, ',') from generate_series(1,10)
Aggregate (cost=25.01..25.02 rows=1 width=32) (actual time=0.021..0.021 rows=1 loops=1)
Output: string_agg((((random() * '100000'::double precision))::integer)::text, ','::text)
-> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000 width=0) (actual time=0.008..0.009 rows=10 loops=1)
Output: generate_series
Function Call: generate_series(1, 10)
LOG: duration: 0.104 ms plan:
Query Text: select * from a where id in (23707,28634,42724,79820,82297,8859,19271,54725,5089,42662)
Index Scan using a_pkey on public.a (cost=0.43..17.72 rows=10 width=45) (actual time=0.036..0.100 rows=10 loops=1)
Output: id, info, crt_time
Index Cond: (a.id = ANY ('{23707,28634,42724,79820,82297,8859,19271,54725,5089,42662}'::integer[]))
Buffers: shared hit=40
DO
Time: 1.126 ms
参考
1. man LOAD
2. man EXPLAIN
3. http://www.postgresql.org/docs/9.6/static/auto-explain.html