PostgreSQL 11 preview - 强制auto prepared statment开关(自动化plan cache)(类似Oracle cursor_sharing force)

2 minute read


通常一个QUERY在执行时分为几个步骤,语义解析、SQL REWRITE、生成所有可选执行路径、选择最优路径、执行等多个步骤。

对于同一类SQL,只是更换SQL中的一些FILTER变量时,实际上很多步骤是可以共享的。例如语义解析、SQL REWRITE、生成执行计划都是可共享的,这些步骤通常比较耗CPU资源,通过prepared statement对于高并发的小事务来说,可以大幅降低CPU开销,降低延迟,性能提升非常明显。



select * from tbl where id=1;  
select * from tbl where id=2;  

2、绑定变量,一次语义解析、SQL REWRITE、生成执行计划,多次BIND,EXEC。

prepare ps1(int) as select * from tbl where id=$1;  
execute ps1(1);  
execute ps1(2);  




《学习 PostgreSQL Frontend/Backend protocol (通信协议)》




《PostgreSQL Oracle 兼容性之 - DBMS_SQL(存储过程动态SQL中使用绑定变量)》



《PostgreSQL plan cache 源码浅析 - 如何确保不会计划倾斜》


但是并不是所有场景都可以用绑定变量的,例如我们在使用连接池时,如果需要用transaction级复用机制,那么就无法使用绑定变量,因为事务结束后,连接就可能被其他会话复用,而你再次发起execute ps请求时,可能从池里分配给你的连接并不是你之前执行prepare statement的连接,导致ps不存在的错误。

那么怎么让数据库在执行简单SQL的时候,能够用上prepared statement呢?


Oracle CURSOR_SHARING 强制绑定变量

设置CURSOR_SHARING=force,即使你使用的是简单SQL,那么在数据库中也会自动帮你转换为prepared statement,当然相比正式的绑定变量还是更耗费资源一点,毕竟parser省不了。

CURSOR_SHARING determines what kind of SQL statements can share the same cursors.  
Forces statements that may differ in some literals, but are otherwise identical, to share a cursor, unless the literals affect the meaning of the statement.  
Causes statements that may differ in some literals, but are otherwise identical, to share a cursor, unless the literals affect either the meaning of the statement or the degree to which the plan is optimized.  
Only allows statements with identical text to share the same cursor.  

PostgreSQL 数据库强制自动绑定变量

PostgreSQL 11可能会加入这个patch,即使你执行的是简单SQL,内部也会自动对其绑定。



Hi hackers,  
There were a lot of discussions about query plan caching in hackers   
mailing list, but I failed to find some clear answer for my question and   
the current consensus on this question in Postgres community. As far as   
I understand current state is the following:  
1. We have per-connection prepared statements.  
2. Queries executed inside plpgsql code are implicitly prepared.  
It is not always possible or convenient to use prepared statements.  
For example, if pgbouncer is used to perform connection pooling.  
Another use case (which is actually the problem I am trying to solve   
now) is partitioning.  
Efficient execution of query to partitioned table requires hardcoded   
value for partitioning key.  
Only in this case optimizer will be able to construct efficient query   
plan which access only affected tables (partitions).  
My small benchmark for distributed partitioned table based on pg_pathman   
+ postgres_fdw shows 3 times degrade of performance in case of using   
prepared statements.  
But without prepared statements substantial amount of time is spent in   
query compilation and planning. I was be able to speed up benchmark more   
than two time by  
sending prepared queries directly to the remote nodes.  
So what I am thinking now is implicit query caching. If the same query   
with different literal values is repeated many times, then we can try to   
generalize this query and replace it with prepared query with   
parameters. I am not considering now shared query cache: is seems to be   
much harder to implement. But local caching of generalized queries seems   
to be not so difficult to implement and requires not so much changes in   
Postgres code. And it can be useful not only for sharding, but for many   
other cases where prepared statements can not be used.  
I wonder if such option was already considered and if it was for some   
reasons rejected: can you point me at this reasons?  


Flag Counter

digoal’s 大量PostgreSQL文章入口