WHY prepared Statement running slower in some situation CASE
背景
在某些情况下,使用PREPARED STATEMENT或函数,可能会比直接执行SQL更慢。为什么呢?
这个要从执行计划说起,
下面来看一个测试表
digoal=> \d tbl_user
Table "digoal.tbl_user"
Column | Type | Modifiers
-----------+-----------------------+-----------
id | integer | not null
firstname | character varying(32) |
lastname | character varying(32) |
corp | character varying(32) |
age | integer |
Indexes:
"tbl_user_pkey" PRIMARY KEY, btree (id)
"idx_user_age" btree (age)
digoal=> insert into tbl_user select generate_series(1,100000),'zhou','digoal','sky-mobi',27 ;
INSERT 0 100000
digoal=> insert into tbl_user select generate_series(100001,100100),'zhou','digoal','sky-mobi',generate_series(1,100) ;
INSERT 0 100
digoal=> analyze tbl_user;
ANALYZE
digoal=> select age,count(*) from tbl_user group by age order by count(*);
age | count
-----+--------
68 | 1
8 | 1
11 | 1
80 | 1
16 | 1
39 | 1
54 | 1
3 | 1
47 | 1
61 | 1
96 | 1
67 | 1
87 | 1
14 | 1
46 | 1
99 | 1
48 | 1
17 | 1
28 | 1
83 | 1
36 | 1
94 | 1
15 | 1
84 | 1
88 | 1
66 | 1
77 | 1
38 | 1
4 | 1
30 | 1
89 | 1
60 | 1
50 | 1
74 | 1
33 | 1
73 | 1
95 | 1
6 | 1
40 | 1
56 | 1
53 | 1
62 | 1
71 | 1
19 | 1
29 | 1
93 | 1
2 | 1
21 | 1
57 | 1
51 | 1
72 | 1
92 | 1
97 | 1
23 | 1
41 | 1
31 | 1
35 | 1
65 | 1
75 | 1
52 | 1
76 | 1
20 | 1
69 | 1
5 | 1
44 | 1
7 | 1
37 | 1
85 | 1
34 | 1
82 | 1
81 | 1
25 | 1
32 | 1
12 | 1
58 | 1
1 | 1
10 | 1
79 | 1
26 | 1
42 | 1
90 | 1
18 | 1
59 | 1
78 | 1
98 | 1
100 | 1
86 | 1
13 | 1
49 | 1
22 | 1
63 | 1
9 | 1
24 | 1
91 | 1
64 | 1
70 | 1
45 | 1
43 | 1
55 | 1
27 | 100001
(100 rows)
如果按照AGE为条件查询,可能全表扫描(如age=27),或走索引(如age=1)。
digoal=> explain analyze select * from tbl_user where age=1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_user_age on tbl_user (cost=0.00..4.32 rows=3 width=29) (actual time=0.018..0.019 rows=1 loops=1)
Index Cond: (age = 1)
Total runtime: 0.042 ms
(3 rows)
Time: 0.516 ms
digoal=> explain analyze select * from tbl_user where age=27;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
Seq Scan on tbl_user (cost=0.00..1989.25 rows=100000 width=29) (actual time=0.010..21.112 rows=100001 loops=1)
Filter: (age = 27)
Total runtime: 27.784 ms
(3 rows)
Time: 28.114 ms
使用prepared statement看看情况如何:
digoal=> prepare p_user (int) as select * from tbl_user where age=$1;
PREPARE
Time: 12.408 ms
digoal=> explain analyze execute p_user(1);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_user_age on tbl_user (cost=0.00..105.55 rows=3229 width=29) (actual time=0.012..0.012 rows=1 loops=1)
Index Cond: (age = $1)
Total runtime: 0.038 ms
(3 rows)
Time: 0.191 ms
digoal=> explain analyze execute p_user(27);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_user_age on tbl_user (cost=0.00..105.55 rows=3229 width=29) (actual time=0.016..23.100 rows=100001 loops=1)
Index Cond: (age = $1)
Total runtime: 30.069 ms
(3 rows)
Time: 30.403 ms
10W左右数据量的情况下用索引比全表扫描相差3ms左右
插入1000W左右数据再看看情况,
digoal=> insert into tbl_user select generate_series(100101,9999999),'zhou','digoal','sky-mobi',27 ;
INSERT 0 9899899
digoal=> explain analyze execute p_user(1);
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_user_age on tbl_user (cost=0.00..159018.74 rows=5000094 width=29) (actual time=0.015..0.015 rows=1 loops=1)
Index Cond: (age = $1)
Total runtime: 0.036 ms
(3 rows)
Time: 0.535 ms
digoal=> explain analyze execute p_user(27);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
--------
Index Scan using idx_user_age on tbl_user (cost=0.00..159018.74 rows=5000094 width=29) (actual time=0.014..2352.636 rows=9999900 l
oops=1)
Index Cond: (age = $1)
Total runtime: 3042.357 ms
(3 rows)
Time: 3042.689 ms
digoal=> explain analyze select * from tbl_user where age=27;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on tbl_user (cost=0.00..198533.34 rows=9999854 width=29) (actual time=0.011..2062.048 rows=9999900 loops=1)
Filter: (age = 27)
Total runtime: 2737.911 ms
(3 rows)
Time: 2738.430 ms
digoal=> explain analyze select * from tbl_user where age=1;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_user_age on tbl_user (cost=0.00..15.03 rows=333 width=29) (actual time=0.012..0.013 rows=1 loops=1)
Index Cond: (age = 1)
Total runtime: 0.031 ms
(3 rows)
Time: 0.403 ms
相差约300mS
并不是说这样就不建议使用prepared statement了,prepared statement的使用对于降低CPU开销和服务端代码重用来说是非常有效的。
好在PostgreSQL有算法,可以优化plan cache,即使使用了prepared statement,也可以对于不同的输入值,选择不同的执行计划。
请参考
[《执行计划选择算法 与 绑定变量 - PostgreSQL prepared statement: SPI_prepare, prepare | execute COMMAND, PL/pgsql STYLE: custom & generic plan cache》](../201212/20121224_01.md) |