PostgreSQL 优化器案例之 - order by limit 索引选择问题

13 minute read

背景

当我们在执行一个这样的SQL时，假如有这样几个索引(c1,c2) (id)，数据库到底该用哪个索引呢？

explain select * from tbl where c1=200 and c2=200 order by id limit 10;  
或  
explain select * from tbl where c1=200 and c2 between 100 and 300 order by id limit 10;  

实际上PG会通过计算成本得到应该使用哪个索引。

但是实际上计算公式本身的通用性，使得当C1，C2数据分布倾斜时，可能导致某些C1，C2输入值的执行计划不准确。

走c1,c2索引，当满足c1,c2条件的记录很少时。如果数据比较多，则SORT耗费的成本就较大，导致整个耗时变大。

走id索引，当满足c1,c2条件的记录很多时，（并且c1,c2满足条件的数据分布在id偏小的端），如果分布在大端，则需要扫描更多的记录才能找到满足条件的记录。

如果数据库统计信息中可以评估出来满足条件的c1,c2的记录与ID顺序分布的关系，也可以作为优化器计算成本的一种参考输入（当然要做到这个可能比较复杂，同时也会增加优化器计算成本的开销）。

下面通过实例来说明。

例子

1、建测试表

postgres=# create table tbl (id int, c1 int, c2 int, c3 int, c4 int);  
CREATE TABLE  

2、写入一批随机数据，ID从1到1000万。

postgres=# insert into tbl select generate_series(1,10000000), random()*100, random()*100, random()*100, random()*100;  
INSERT 0 10000000  

3、写入另一批100万条数据，c1,c2 与前面1000万的值不一样。

postgres=# insert into tbl select generate_series(10000001,11000000), 200,200,200,200;  
INSERT 0 1000000  

4、创建两个索引，也就是本文需要重点关注的，到底走哪个索引更划算

postgres=# create index idx_tbl_1 on tbl(id);  
CREATE INDEX  
postgres=# create index idx_tbl_2 on tbl(c1,c2,c3,c4);  
CREATE INDEX  

5、收集统计信息

postgres=# vacuum analyze tbl;  
VACUUM  

6、查看下面SQL的执行计划，走了id索引

postgres=# explain select * from tbl where c1=200 and c2=200 order by id limit 10;  
                                      QUERY PLAN                                        
--------------------------------------------------------------------------------------  
 Limit  (cost=0.43..32.59 rows=10 width=20)  
   ->  Index Scan using idx_tbl_1 on tbl  (cost=0.43..323244.26 rows=100533 width=20)  
         Filter: ((c1 = 200) AND (c2 = 200))  
(3 rows)  

还是走了id索引

postgres=# explain select * from tbl where c1=200 and c2=200 order by id limit 1000;  
                                      QUERY PLAN                                        
--------------------------------------------------------------------------------------  
 Limit  (cost=0.43..3215.74 rows=1000 width=20)  
   ->  Index Scan using idx_tbl_1 on tbl  (cost=0.43..323244.26 rows=100533 width=20)  
         Filter: ((c1 = 200) AND (c2 = 200))  
(3 rows)  

当LIMIT达到50000时，走了c1,c2的索引。为什么呢？

postgres=# explain select * from tbl where c1=200 and c2=200 order by id limit 50000;  
                                         QUERY PLAN                                           
--------------------------------------------------------------------------------------------  
 Limit  (cost=70355.06..70480.06 rows=50000 width=20)  
   ->  Sort  (cost=70355.06..70606.39 rows=100533 width=20)  
         Sort Key: id  
         ->  Bitmap Heap Scan on tbl  (cost=1457.82..62005.97 rows=100533 width=20)  
               Recheck Cond: ((c1 = 200) AND (c2 = 200))  
               ->  Bitmap Index Scan on idx_tbl_2  (cost=0.00..1432.69 rows=100533 width=0)  
                     Index Cond: ((c1 = 200) AND (c2 = 200))  
(7 rows)  

7、分析以上执行计划的含义

首先，表的记录数(1100万)除以”满足c1=200 and c2=200 条件的记录数”(100533)，得到平均需要扫描多少条记录，可以得到一条满足c1=200 and c2=200条件的记录.

postgres=# select 11000000/100533.0;  
       ?column?         
----------------------  
 109.4168084111684720  
(1 row)  

也就是说每扫描109.4条记录，可以得到一条满足条件的记录。（优化器这么算，是认为数据分布是均匀的。）

但是，实际上，数据分布是不均匀的，c1=200 and c2=200的记录在表的末端（1000万条记录后面），也就是说需要扫描1000万条记录后，才能得到1条满足c1=200 and c2=200的记录。

并不是估算的每扫描109.4条记录，可以得到一条满足条件的记录。

问题就出在这里。

8、我们再来分析一下为什么limit 50000时，选择了c1,c2的索引。而不是id的索引

使用ID索引时，需要扫描100533条记录，同时需要排序，直到排序完成，总成约70606.39。然后就是GET HEAP TUPLE的成本。

当使用id的索引扫描时，返回多少条记录能达到70606.39的成本呢？

以limit 1000的3215.74成本为例  
  
postgres=# select 70606.39/3215.74;  
      ?column?         
---------------------  
 21.9564983487471002  
(1 row)  
  
postgres=# select 21.956*1000;  
 ?column?    
-----------  
 21956.000  
(1 row)  

分水岭

经过以上分析，也就是说，LIMIT 21956时，走ID索引扫描的执行计划，成本可达到70606.39。

所以limit 21956是一个分水岭，大于这个值时，可能使用c1,c2的索引扫描，而小于它，则会使用ID索引扫描.

如下

postgres=# explain select * from tbl where c1=200 and c2=200 order by id limit 22000;  
                                         QUERY PLAN                                           
--------------------------------------------------------------------------------------------  
 Limit  (cost=69759.69..69814.69 rows=22000 width=20)  
   ->  Sort  (cost=69759.69..70011.02 rows=100533 width=20)  
         Sort Key: id  
         ->  Bitmap Heap Scan on tbl  (cost=1457.82..62005.97 rows=100533 width=20)  
               Recheck Cond: ((c1 = 200) AND (c2 = 200))  
               ->  Bitmap Index Scan on idx_tbl_2  (cost=0.00..1432.69 rows=100533 width=0)  
                     Index Cond: ((c1 = 200) AND (c2 = 200))  
(7 rows)  

postgres=# explain select * from tbl where c1=200 and c2=200 order by id limit 21000;  
                                      QUERY PLAN                                        
--------------------------------------------------------------------------------------  
 Limit  (cost=0.43..67521.75 rows=21000 width=20)  
   ->  Index Scan using idx_tbl_1 on tbl  (cost=0.43..323244.26 rows=100533 width=20)  
         Filter: ((c1 = 200) AND (c2 = 200))  
(3 rows)  

真实的执行耗时

很显然，使用id扫描，一定会慢，因为满足条件的数据都分布在1000万行后面。

1、c1,c2索引扫描，直接命中数据，加排序（100万条），略快。

postgres=# explain  analyze select * from tbl where c1=200 and c2=200 order by id limit 22000;  
                                                                  QUERY PLAN                                                                    
----------------------------------------------------------------------------------------------------------------------------------------------  
 Limit  (cost=69759.69..69814.69 rows=22000 width=20) (actual time=293.961..299.054 rows=22000 loops=1)  
   ->  Sort  (cost=69759.69..70011.02 rows=100533 width=20) (actual time=293.960..296.006 rows=22000 loops=1)  
         Sort Key: id  
         Sort Method: top-N heapsort  Memory: 3255kB  
         ->  Bitmap Heap Scan on tbl  (cost=1457.82..62005.97 rows=100533 width=20) (actual time=47.919..175.698 rows=1000000 loops=1)  
               Recheck Cond: ((c1 = 200) AND (c2 = 200))  
               Heap Blocks: exact=6370  
               ->  Bitmap Index Scan on idx_tbl_2  (cost=0.00..1432.69 rows=100533 width=0) (actual time=47.160..47.160 rows=1000000 loops=1)  
                     Index Cond: ((c1 = 200) AND (c2 = 200))  
 Planning time: 0.152 ms  
 Execution time: 300.664 ms  
(11 rows)  

2、id 索引扫描，慢。

postgres=# explain  analyze select * from tbl where c1=200 and c2=200 order by id limit 21000;  
                                                                QUERY PLAN                                                                  
------------------------------------------------------------------------------------------------------------------------------------------  
 Limit  (cost=0.43..67521.75 rows=21000 width=20) (actual time=1404.932..1412.594 rows=21000 loops=1)  
   ->  Index Scan using idx_tbl_1 on tbl  (cost=0.43..323244.26 rows=100533 width=20) (actual time=1404.930..1409.639 rows=21000 loops=1)  
         Filter: ((c1 = 200) AND (c2 = 200))  
         Rows Removed by Filter: 10000000  
 Planning time: 0.139 ms  
 Execution time: 1414.142 ms  
(6 rows)  

3、limit 10同样，id 索引扫描，慢。

postgres=# explain ( analyze,verbose,timing,costs,buffers) select * from tbl where c1=200 and c2=200 order by id limit 10;  
                                                                  QUERY PLAN                                                                    
----------------------------------------------------------------------------------------------------------------------------------------------  
 Limit  (cost=0.43..32.59 rows=10 width=20) (actual time=1403.861..1403.865 rows=10 loops=1)  
   Output: id, c1, c2, c3, c4  
   Buffers: shared hit=91020  
   ->  Index Scan using idx_tbl_1 on public.tbl  (cost=0.43..323244.26 rows=100533 width=20) (actual time=1403.859..1403.861 rows=10 loops=1)  
         Output: id, c1, c2, c3, c4  
         Filter: ((tbl.c1 = 200) AND (tbl.c2 = 200))  
         Rows Removed by Filter: 10000000  
         Buffers: shared hit=91020  
 Planning time: 0.127 ms  
 Execution time: 1403.893 ms  
(10 rows)  

优化方法

1、使用HINT或者改SQL，强制不走ID扫描。

并不适用于所有场景，比如数据分布均匀时，那么PG的这周成本计算方法就对口，那么什么时候使用ID，什么时候使用C1，C2索引扫描就合乎常理了。

postgres=# explain ( analyze,verbose,timing,costs,buffers) select * from tbl where c1=200 and c2=200 order by id+0 limit 10;  
                                                                  QUERY PLAN                                                                    
----------------------------------------------------------------------------------------------------------------------------------------------  
 Limit  (cost=64429.79..64429.81 rows=10 width=24) (actual time=409.622..409.626 rows=10 loops=1)  
   Output: id, c1, c2, c3, c4, ((id + 0))  
   Buffers: shared hit=10205  
   ->  Sort  (cost=64429.79..64681.12 rows=100533 width=24) (actual time=409.620..409.621 rows=10 loops=1)  
         Output: id, c1, c2, c3, c4, ((id + 0))  
         Sort Key: ((tbl.id + 0))  
         Sort Method: top-N heapsort  Memory: 25kB  
         Buffers: shared hit=10205  
         ->  Bitmap Heap Scan on public.tbl  (cost=1457.82..62257.30 rows=100533 width=24) (actual time=47.347..237.455 rows=1000000 loops=1)  
               Output: id, c1, c2, c3, c4, (id + 0)  
               Recheck Cond: ((tbl.c1 = 200) AND (tbl.c2 = 200))  
               Heap Blocks: exact=6370  
               Buffers: shared hit=10205  
               ->  Bitmap Index Scan on idx_tbl_2  (cost=0.00..1432.69 rows=100533 width=0) (actual time=46.577..46.577 rows=1000000 loops=1)  
                     Index Cond: ((tbl.c1 = 200) AND (tbl.c2 = 200))  
                     Buffers: shared hit=3835  
 Planning time: 0.133 ms  
 Execution time: 409.670 ms  
(18 rows)  

2、如果c1,c2是等值查询的话，可以用以下索引，那么效率是最高的。

postgres=# create index idx_tbl_3 on tbl(c1,c2,id);  
CREATE INDEX  
  
  
postgres=# explain ( analyze,verbose,timing,costs,buffers) select * from tbl where c1=200 and c2 =200 order by id limit 10;  
                                                              QUERY PLAN                                                                 
---------------------------------------------------------------------------------------------------------------------------------------  
 Limit  (cost=0.56..6.93 rows=10 width=20) (actual time=0.102..0.106 rows=10 loops=1)  
   Output: id, c1, c2, c3, c4  
   Buffers: shared hit=1 read=4  
   I/O Timings: read=0.047  
   ->  Index Scan using idx_tbl_3 on public.tbl  (cost=0.56..64086.79 rows=100533 width=20) (actual time=0.101..0.103 rows=10 loops=1)  
         Output: id, c1, c2, c3, c4  
         Index Cond: ((tbl.c1 = 200) AND (tbl.c2 = 200))  
         Buffers: shared hit=1 read=4  
         I/O Timings: read=0.047  
 Planning time: 0.142 ms  
 Execution time: 0.131 ms  
(11 rows)  

3、建议方法:

注意方法2 不适合非等值查询，

postgres=# explain ( analyze,verbose,timing,costs,buffers) select * from tbl where c1=200 and c2 between 100 and 300 order by id limit 10;  
                                                                  QUERY PLAN                                                                    
----------------------------------------------------------------------------------------------------------------------------------------------  
 Limit  (cost=0.43..35.32 rows=10 width=20) (actual time=1371.094..1371.099 rows=10 loops=1)  
   Output: id, c1, c2, c3, c4  
   Buffers: shared hit=91020  
   ->  Index Scan using idx_tbl_1 on public.tbl  (cost=0.43..350743.84 rows=100533 width=20) (actual time=1371.092..1371.095 rows=10 loops=1)  
         Output: id, c1, c2, c3, c4  
         Filter: ((tbl.c2 >= 100) AND (tbl.c2 <= 300) AND (tbl.c1 = 200))  
         Rows Removed by Filter: 10000000  
         Buffers: shared hit=91020  
 Planning time: 0.278 ms  
 Execution time: 1371.128 ms  
(10 rows)  

但是不用担心，我们依旧可以使用其他等值查询列，加上排序列组成复合索引，在INDEX SCAN中使用FILTER来加速。

新增如下索引，加速如下

postgres=# create index idx_tbl_4 on tbl(c1,id);  
CREATE INDEX  
  
postgres=# explain ( analyze,verbose,timing,costs,buffers) select * from tbl where c1=200 and c2 between 100 and 300 order by id limit 10;  
                                                               QUERY PLAN                                                                 
----------------------------------------------------------------------------------------------------------------------------------------  
 Limit  (cost=0.43..10.47 rows=10 width=20) (actual time=0.105..0.110 rows=10 loops=1)  
   Output: id, c1, c2, c3, c4  
   Buffers: shared hit=1 read=3  
   I/O Timings: read=0.051  
   ->  Index Scan using idx_tbl_4 on public.tbl  (cost=0.43..100877.50 rows=100533 width=20) (actual time=0.104..0.107 rows=10 loops=1)  
         Output: id, c1, c2, c3, c4  
         Index Cond: (tbl.c1 = 200)  
         Filter: ((tbl.c2 >= 100) AND (tbl.c2 <= 300))  
         Buffers: shared hit=1 read=3  
         I/O Timings: read=0.051  
 Planning time: 0.172 ms  
 Execution time: 0.134 ms  
(12 rows)  

附Oracle

《Oracle migration to Greenplum - (含 Ora2pg)》

http://www.dba-oracle.com/t_OracleAutotrace.htm

SQL> create table tbl(id int, c1 int, c2 int, c3 int, c4 int);

Table created.

SQL> insert into tbl select rownum,trunc(dbms_random.value(0, 100)),trunc(dbms_random.value(0, 100)),trunc(dbms_random.value(0, 100)),trunc(dbms_random.value(0, 100)) from dual connect by level <=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> insert into tbl select rownum+10000000, 200,200,200,200 from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index idx_tbl_1 on tbl(id);

Index created.

SQL> create index idx_tbl_2 on tbl(c1,c2,c3,c4);

Index created.

SQL> set linesize 512
SQL> set pagesize 50000

SQL> set autotrace on;

SQL> exec DBMS_STATS.GATHER_TABLE_STATS('JIUDU','TBL'); 

PL/SQL procedure successfully completed.

SQL> select * from (select * from tbl where c1=200 and c2=200 order by id) t where rownum<10;

        ID         C1         C2         C3         C4
---------- ---------- ---------- ---------- ----------
  10000001        200        200        200        200
  10000002        200        200        200        200
  10000003        200        200        200        200
  10000004        200        200        200        200
  10000005        200        200        200        200
  10000006        200        200        200        200
  10000007        200        200        200        200
  10000008        200        200        200        200
  10000009        200        200        200        200

9 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 745043579

----------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |     9 |   585 |       | 10253   (2)| 00:02:04 |
|*  1 |  COUNT STOPKEY          |      |       |       |       |            |          |
|   2 |   VIEW                  |      | 84875 |  5387K|       | 10253   (2)| 00:02:04 |
|*  3 |    SORT ORDER BY STOPKEY|      | 84875 |  1491K|  2672K| 10253   (2)| 00:02:04 |
|*  4 |     TABLE ACCESS FULL   | TBL  | 84875 |  1491K|       |  9767   (2)| 00:01:58 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<10)
   3 - filter(ROWNUM<10)
   4 - filter("C1"=200 AND "C2"=200)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      34868  consistent gets
          0  physical reads
          0  redo size
        937  bytes sent via SQL*Net to client
        500  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          9  rows processed

SQL> select * from (select * from tbl where c1=1 and c2=1 order by id) t where rownum<10;

        ID         C1         C2         C3         C4
---------- ---------- ---------- ---------- ----------
      9697          1          1         78         39
     20586          1          1         81         71
     27820          1          1         33         64
     44324          1          1         26         27
     47079          1          1          3          5
     64669          1          1         13         49
     73715          1          1         20         74
     80903          1          1         96         25
     98368          1          1         59          9

9 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 447312937

--------------------------------------------------------------------------------------------
| Id  | Operation                      | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |           |     9 |   585 |   641   (1)| 00:00:08 |
|*  1 |  COUNT STOPKEY                 |           |       |       |            |          |
|   2 |   VIEW                         |           |   704 | 45760 |   641   (1)| 00:00:08 |
|*  3 |    SORT ORDER BY STOPKEY       |           |   704 | 12672 |   641   (1)| 00:00:08 |
|   4 |     TABLE ACCESS BY INDEX ROWID| TBL       |   704 | 12672 |   640   (0)| 00:00:08 |
|*  5 |      INDEX RANGE SCAN          | IDX_TBL_2 |   704 |       |     5   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<10)
   3 - filter(ROWNUM<10)
   5 - access("C1"=1 AND "C2"=1)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1072  consistent gets
         11  physical reads
          0  redo size
        969  bytes sent via SQL*Net to client
        500  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          9  rows processed


SQL> create index idx_tbl_3 on tbl(c1,c2,id);
Index created.

SQL> select * from (select * from tbl where c1=200 and c2 between 100 and 300 order by id) t where rownum < 10;

        ID         C1         C2         C3         C4
---------- ---------- ---------- ---------- ----------
  10000001        200        200        200        200
  10000002        200        200        200        200
  10000003        200        200        200        200
  10000004        200        200        200        200
  10000005        200        200        200        200
  10000006        200        200        200        200
  10000007        200        200        200        200
  10000008        200        200        200        200
  10000009        200        200        200        200

9 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 745043579

----------------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |     9 |   585 |       | 10253   (2)| 00:02:04 |
|*  1 |  COUNT STOPKEY          |      |       |       |       |            |          |
|   2 |   VIEW                  |      | 84875 |  5387K|       | 10253   (2)| 00:02:04 |
|*  3 |    SORT ORDER BY STOPKEY|      | 84875 |  1491K|  2672K| 10253   (2)| 00:02:04 |
|*  4 |     TABLE ACCESS FULL   | TBL  | 84875 |  1491K|       |  9767   (2)| 00:01:58 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<10)
   3 - filter(ROWNUM<10)
   4 - filter("C1"=200 AND "C2">=100 AND "C2"<=300)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
      34868  consistent gets
          0  physical reads
          0  redo size
        937  bytes sent via SQL*Net to client
        500  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          9  rows processed

SQL> select * from (select * from tbl where c1=200 and c2 =200 order by id) t where rownum < 10;

        ID         C1         C2         C3         C4
---------- ---------- ---------- ---------- ----------
  10000001        200        200        200        200
  10000002        200        200        200        200
  10000003        200        200        200        200
  10000004        200        200        200        200
  10000005        200        200        200        200
  10000006        200        200        200        200
  10000007        200        200        200        200
  10000008        200        200        200        200
  10000009        200        200        200        200

9 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 1825274432

-------------------------------------------------------------------------------------------
| Id  | Operation                     | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |           |     9 |   585 |    12   (0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |           |       |       |            |          |
|   2 |   VIEW                        |           |    10 |   650 |    12   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| TBL       |    10 |   180 |    12   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | IDX_TBL_3 |       |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<10)
   4 - access("C1"=200 AND "C2"=200)


Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          6  consistent gets
          2  physical reads
          0  redo size
        937  bytes sent via SQL*Net to client
        500  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          9  rows processed

同时也发现一个问题，Oracle可能无法使用index filter来优化，例如将index2,index3删除后，留下ID索引，Oracle无法走索引，而PG可以。

SQL> select * from (select * from tbl where c1=1 and c2 =1 order by id) t where rownum < 10;

        ID         C1         C2         C3         C4
---------- ---------- ---------- ---------- ----------
      9697          1          1         78         39
     20586          1          1         81         71
     27820          1          1         33         64
     44324          1          1         26         27
     47079          1          1          3          5
     64669          1          1         13         49
     73715          1          1         20         74
     80903          1          1         96         25
     98368          1          1         59          9

9 rows selected.


Execution Plan
----------------------------------------------------------
Plan hash value: 745043579

--------------------------------------------------------------------------------
| Id  | Operation               | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |      |     9 |   585 |  9766   (2)| 00:01:58 |
|*  1 |  COUNT STOPKEY          |      |       |       |            |          |
|   2 |   VIEW                  |      |   704 | 45760 |  9766   (2)| 00:01:58 |
|*  3 |    SORT ORDER BY STOPKEY|      |   704 | 12672 |  9766   (2)| 00:01:58 |
|*  4 |     TABLE ACCESS FULL   | TBL  |   704 | 12672 |  9765   (2)| 00:01:58 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<10)
   3 - filter(ROWNUM<10)
   4 - filter("C1"=1 AND "C2"=1)


Statistics
----------------------------------------------------------
        186  recursive calls
          0  db block gets
      34893  consistent gets
          0  physical reads
          0  redo size
        969  bytes sent via SQL*Net to client
        500  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          7  sorts (memory)
          0  sorts (disk)
          9  rows processed
  

PG


postgres=# explain analyze select * from tbl where c1=1 and c2 =1 order by id limit 10;
                                                          QUERY PLAN                                                           
-------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..3703.11 rows=10 width=20) (actual time=7.199..23.926 rows=10 loops=1)
   ->  Index Scan using idx_tbl_1 on tbl  (cost=0.43..323243.84 rows=873 width=20) (actual time=7.198..23.921 rows=10 loops=1)
         Filter: ((c1 = 1) AND (c2 = 1))
         Rows Removed by Filter: 142814
 Planning time: 0.119 ms
 Execution time: 23.950 ms
(6 rows)

小结

当SQL查询中包括排序，以及其他字段的过滤条件，并使用LIMIT快速返回少量数据时，如果满足条件的数据分布在排序键的末端，那么优化器给出的执行计划可能是不好的，导致通过排序索引扫描更多的数据后才能命中需要的记录。

然而，数据库目前使用的评估走排序键时，LIMIT需要扫描多少条记录，使用了数据均匀分布的假设，所以在数据（满足条件的数据与排序键本身的相关性不均匀）分布不均匀时，导致成本估算不准(oracle干脆走全表扫描)。

建议优化方法：

增加索引，创建等值查询条件列(s)加排序列(s)组成的复合索引，降低扫描量。

例子

select * from tbl where c1=200 and c2 between 100 and 300 order by id limit 10;  
  
增加索引  
  
(c1,id)  -- 索引扫描, filter c2  
  
已有  
(c1,c2)  -- 索引扫描, sort id  
(id)     -- 索引扫描, filter c1,c2  

select * from tbl where c1=200 and c2 =200 order by id limit 10;  
  
增加索引  
  
(c1,c2,id)  -- 索引扫描  
  
已有  
(c1,c2)  -- 索引扫描, sort id  
(id)     -- 索引扫描, filter c1,c2  

参考

《PostgreSQL 10 黑科技 - 自定义统计信息》

digoal’s 大量PostgreSQL文章入口

Twitter Facebook Google+ LinkedIn

Digoal.zhou

PostgreSQL 优化器案例之 - order by limit 索引选择问题

背景

例子

分水岭

真实的执行耗时

优化方法

附Oracle

小结

参考

digoal’s 大量PostgreSQL文章入口

You May Also Enjoy

PostgreSQL(PPAS 兼容Oracle) 从零开始入门手册 - 珍藏版

PostgreSQL pipelinedb 流计算插件 - IoT应用 - 实时轨迹聚合

PostgreSQL plpgsql 存储过程、函数 - 状态、异常变量打印、异常捕获… - GET [STACKED] DIAGNOSTICS

PostgreSQL datediff 日期间隔（单位转换）兼容SQL用法