A look at the PostgreSQL optimizer - how does PostgreSQL handle duplicate values inside IN?
Background
Suppose a business app collects user location data, so the database holds behavior logs recording the places each user has visited.
Now we want to find out which places a certain group of users went to today.
What happens then?
A single person may report many records at one location, and different people may visit the same place, so the same place name can appear in many duplicate records.
A query like the following therefore ends up with a large number of duplicate values inside the IN list.
select * from address_table where address in (select visited_place from user_tracking_table where <some group of users>);
There are many ways to optimize this; I have written a few articles you can refer to:
《distinct xx和count(distinct xx)的变态递归优化方法 - 索引收敛(skip scan)扫描》
《用PostgreSQL找回618秒逝去的青春 - 递归收敛优化》
《时序数据合并场景加速分析和实现 - 复合索引,窗口分组查询加速,变态递归加速》
This article is not about those optimization techniques; instead, it looks at what the PostgreSQL optimizer itself does when it runs into this situation.
Scenario simulation
Table A
The address table; ID is the unique identifier of a place.
Table B
The user behavior table; ID is the user ID and LID is the ID of a place the user visited.
Create the tables
postgres=# create table a(id int primary key, info text);
CREATE TABLE
postgres=# create table b(id int, lid int, info text);
CREATE TABLE
Insert test data
10 million addresses:
postgres=# insert into a select generate_series(1,10000000), 'test';
10 million user behavior rows, where most LID values are duplicates (only address IDs in the 0-1000 range are generated, i.e. about 1001 distinct values):
postgres=# insert into b select 1000000*random(), 1000*random(), 'test' from generate_series(1,10000000);
INSERT 0 10000000
Test: which places have these users been to?
(This merely simulates the scenario; there are many ways to optimize it, but today we just watch what PG's own optimizer does.)
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from a where id in (select lid from b);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=179057.62..181506.16 rows=1001 width=9) (actual time=3816.572..3820.227 rows=1000 loops=1)
Output: a.id, a.info
Buffers: shared hit=58062
-> HashAggregate (cost=179057.19..179067.20 rows=1001 width=4) (actual time=3816.541..3816.763 rows=1001 loops=1)
Output: b.lid
Group Key: b.lid
Buffers: shared hit=54055
-> Seq Scan on public.b (cost=0.00..154056.75 rows=10000175 width=4) (actual time=0.019..1227.238 rows=10000000 loops=1)
Output: b.id, b.lid, b.info
Buffers: shared hit=54055
-> Index Scan using a_pkey on public.a (cost=0.43..2.43 rows=1 width=9) (actual time=0.003..0.003 rows=1 loops=1001)
Output: a.id, a.info
Index Cond: (a.id = b.lid)
Buffers: shared hit=4007
Planning time: 0.200 ms
Execution time: 3820.401 ms
(16 rows)
Analysis
PostgreSQL chose to hash-aggregate the LID column of table B first (the aggregate, including the sequential scan feeding it, took about 3.8 seconds), which turns the IN into a match against a small set of values. Because table A's ID column is indexed, the matching itself was cheap: 1001 index lookups at about 0.003 ms each, roughly 3 ms in total.
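Where does the rows=1001 estimate on the HashAggregate come from? The planner's n_distinct statistic for b.lid, which you can inspect via the standard pg_stats view:

-- refresh the statistics, then check the estimated number of distinct lid values
analyze b;
select n_distinct
from pg_stats
where schemaname = 'public' and tablename = 'b' and attname = 'lid';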
Of course, PG's optimization is cost-based (CBO), so if your cost factors are not set well you may get a different plan, such as a merge join; the session-local experiment sketched below shows one way to see an alternative.
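A diagnostic sketch (not a tuning recommendation) using the standard planner setting enable_hashagg; depending on your cost factors you may then see a sort-based or merge-join plan:

begin;
-- disable hash aggregation for this transaction only
set local enable_hashagg = off;
explain select * from a where id in (select lid from b);
rollback;  -- the setting is discarded with the transaction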
Now for a more inventive optimization: since table B contains so many duplicate values, recursive convergence (a recursive skip scan) can be applied.
For details, see
《用PostgreSQL找回618秒逝去的青春 - 递归收敛优化》
Strongly recommended reading.
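A minimal sketch of the recursive convergence idea from that article, assuming an index on b(lid) so that every recursion step is a single index probe (the index name is mine):

-- an index so that min(lid) / next-lid lookups are index probes
create index if not exists idx_b_lid on b (lid);

-- walk the distinct lid values one index probe at a time
with recursive t(lid) as (
  (select min(lid) from b)
  union all
  select (select min(lid) from b where lid > t.lid)
  from t
  where t.lid is not null
)
select a.*
from a
where a.id in (select lid from t where lid is not null);

With roughly 1000 distinct values among 10 million rows, this replaces the full sequential scan of b with about one index descent per distinct lid.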
Other tests
1. How does PG handle duplicate constants inside IN?
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from a where id in (0,0,0,0,0,0,0);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Index Scan using a_pkey on public.a (cost=0.43..11.17 rows=7 width=9) (actual time=0.021..0.021 rows=0 loops=1)
Output: id, info
Index Cond: (a.id = ANY ('{0,0,0,0,0,0,0}'::integer[]))
Buffers: shared hit=3
Planning time: 0.116 ms
Execution time: 0.057 ms
(6 rows)
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from a where id in (0,0,0,0,0,0,0,1);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Index Scan using a_pkey on public.a (cost=0.43..12.62 rows=8 width=9) (actual time=0.029..0.030 rows=1 loops=1)
Output: id, info
Index Cond: (a.id = ANY ('{0,0,0,0,0,0,0,1}'::integer[]))
Buffers: shared hit=7
Planning time: 0.116 ms
Execution time: 0.067 ms
(6 rows)
These values usually come from the client, so it is advisable for the client to aggregate and deduplicate them before sending them to the database.
In practice, if the client sends only a few values, this hardly matters.
If the client sends a very large number of values and you do not want to deduplicate on the client side, you can also load them into a temporary table; PG can then deduplicate them with a hash aggregate, as sketched below.
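A hypothetical sketch of the temp-table approach (table and column names are mine):

begin;
create temp table tmp_lids (lid int) on commit drop;
-- in practice the client would COPY or batch-insert its raw, duplicate-laden list
insert into tmp_lids values (0),(0),(0),(0),(1);
select a.* from a where a.id in (select lid from tmp_lids);
commit;

Written as a subquery, the IN list lets PG deduplicate tmp_lids server-side with a hash aggregate, just as it did for table b above.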
2. SRF (set-returning function)
When the IN subquery selects a constant from an SRF, PostgreSQL can propagate that constant into the index condition on table A; if the probe on A finds nothing, the SRF side of the semi join is never executed at all.
For example:
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from a where id in (select 0 from generate_series(1,100000000));
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Nested Loop Semi Join (cost=0.44..12.47 rows=1 width=9) (actual time=0.026..0.026 rows=0 loops=1)
Output: a.id, a.info
Buffers: shared hit=3
-> Index Scan using a_pkey on public.a (cost=0.43..2.45 rows=1 width=9) (actual time=0.024..0.024 rows=0 loops=1)
Output: a.id, a.info
Index Cond: (a.id = 0)
Buffers: shared hit=3
-> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000 width=0) (never executed)
Output: generate_series.generate_series
Function Call: generate_series(1, 100000000)
Planning time: 0.146 ms
Execution time: 0.073 ms
(12 rows)
Table A contains no row with id = 0 (its IDs start at 1), so the outer index scan returns nothing and the generate_series function scan is never executed (note the "never executed" in the plan).
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from a where id in (select 1 from generate_series(1,1000000));
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Nested Loop Semi Join (cost=0.44..12.47 rows=1 width=9) (actual time=133.626..133.635 rows=1 loops=1)
Output: a.id, a.info
Buffers: shared hit=4
-> Index Scan using a_pkey on public.a (cost=0.43..2.45 rows=1 width=9) (actual time=0.028..0.036 rows=1 loops=1)
Output: a.id, a.info
Index Cond: (a.id = 1)
Buffers: shared hit=4
-> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000 width=0) (actual time=133.591..133.591 rows=1 loops=1)
Output: generate_series.generate_series
Function Call: generate_series(1, 1000000)
Planning time: 0.148 ms
Execution time: 184.720 ms
(12 rows)
id = 1, on the other hand, does exist in table A, so the function scan was executed.
But if the 0 branch is wrapped in a UNION ALL, the function scans are executed after all; here the planner uses a hash aggregate to collapse the 11 million generated rows to 2 distinct values before probing the index.
postgres=# explain (analyze,verbose,timing,costs,buffers) select * from a where id in (select 0 from generate_series(1,10000000) union all select 1 from generate_series(1,1000000));
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=25.44..519.50 rows=5000088 width=9) (actual time=8042.867..8042.873 rows=1 loops=1)
Output: a.id, a.info
Buffers: shared hit=7, temp read=18801 written=18799
-> HashAggregate (cost=25.00..27.00 rows=200 width=4) (actual time=8042.823..8042.824 rows=2 loops=1)
Output: (0)
Group Key: (0)
Buffers: temp read=18801 written=18799
-> Append (cost=0.00..20.00 rows=2000 width=4) (actual time=1571.580..5694.255 rows=11000000 loops=1)
Buffers: temp read=18801 written=18799
-> Function Scan on pg_catalog.generate_series (cost=0.00..10.00 rows=1000 width=4) (actual time=1571.577..3656.432 rows=10000000 loops=1)
Output: 0
Function Call: generate_series(1, 10000000)
Buffers: temp read=17091 written=17090
-> Function Scan on pg_catalog.generate_series generate_series_1 (cost=0.00..10.00 rows=1000 width=4) (actual time=156.018..365.703 rows=1000000 loops=1)
Output: 1
Function Call: generate_series(1, 1000000)
Buffers: temp read=1710 written=1709
-> Index Scan using a_pkey on public.a (cost=0.43..2.45 rows=1 width=9) (actual time=0.018..0.019 rows=0 loops=2)
Output: a.id, a.info
Index Cond: (a.id = (0))
Buffers: shared hit=7
Planning time: 0.323 ms
Execution time: 8087.089 ms
(23 rows)
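Note the temp read/written buffers in this plan: materializing the generate_series output (and the aggregate) spilled to temp files. As a session-local experiment, raising work_mem may keep it in memory (the 256MB figure is an assumed illustration, not a recommendation):

set work_mem = '256MB';
explain (analyze, buffers)
select * from a
where id in (select 0 from generate_series(1,10000000)
             union all
             select 1 from generate_series(1,1000000));
reset work_mem;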
References
《distinct xx和count(distinct xx)的变态递归优化方法 - 索引收敛(skip scan)扫描》
《用PostgreSQL找回618秒逝去的青春 - 递归收敛优化》
《时序数据合并场景加速分析和实现 - 复合索引,窗口分组查询加速,变态递归加速》