分页优化, add max_tag column speedup Query in max match enviroment

5 minute read

背景

昨天在一个业务库中发现一个比较耗时的SQL, 如下 :

select t.APP_ID, t.APP_VER, t.CN_NAME, t.PACKAGE, t.APK_SIZE, t.APP_SHOW_VER, t.DESCRIPTION,t.CONTENT_PROVIDER,at.APP_TAG,h.SCORE       
from   
(select APP_ID,max(APP_VER) APP_VER from test1 group by APP_ID) s  
join test1 t  
on s.APP_ID=t.APP_ID and s.APP_VER=t.APP_VER and t.DELETED=0    
left outer join test2 at   
on t.APP_ID=at.APP_ID       
left outer join   
test3 h   
on t.APP_ID=h.APP_ID       
limit 24 offset 0;  

注意到这里面有一个子查询select APP_ID,max(APP_VER) APP_VER from test1 group by APP_ID, 用来取出app_id上面的max(app_ver).

也就是要检索的是最大版本的app_id. 每个表上的app_id上都有索引.

create index idx_test1_2 on test1(app_id);  
create index idx_test2_1 on test2(app_id);  
create index idx_test3_1 on test3(app_id);  

目前的执行计划如下 :

digoal=> explain analyze select t.APP_ID, t.APP_VER, t.CN_NAME, t.PACKAGE, t.APK_SIZE, t.APP_SHOW_VER, t.DESCRIPTION,t.CONTENT_PROVIDER,at.APP_TAG,h.SCORE       
from   
(select APP_ID,max(APP_VER) APP_VER from test1 group by APP_ID) s  
join test1 t  
on s.APP_ID=t.APP_ID and s.APP_VER=t.APP_VER and t.DELETED=0    
left outer join test2 at   
on t.APP_ID=at.APP_ID       
left outer join   
test3 h   
on t.APP_ID=h.APP_ID       
limit 24 offset 0;  
                                                                         QUERY PLAN                                                   
                          
------------------------------------------------------------------------------------------------------------------------------------  
------------------------  
 Limit  (cost=0.00..13251.13 rows=24 width=530) (actual time=0.054..0.447 rows=24 loops=1)  
   ->  Nested Loop Left Join  (cost=0.00..41409.79 rows=75 width=530) (actual time=0.053..0.442 rows=24 loops=1)  
         ->  Merge Left Join  (cost=0.00..41387.81 rows=75 width=527) (actual time=0.046..0.318 rows=24 loops=1)  
               Merge Cond: (t.app_id = at.app_id)  
               ->  Merge Join  (cost=0.00..41372.03 rows=75 width=526) (actual time=0.036..0.290 rows=24 loops=1)  
                     Merge Cond: (test1.app_id = t.app_id)  
                     Join Filter: ((max(test1.app_ver)) = t.app_ver)  
                     ->  GroupAggregate  (cost=0.00..19900.37 rows=80420 width=11) (actual time=0.020..0.160 rows=25 loops=1)  
                           ->  Index Scan using idx_test1_2 on test1  (cost=0.00..18542.94 rows=110646 width=11) (actual time=0.013.  
.0.116 rows=62 loops=1)  
                     ->  Index Scan using idx_test1_2 on test1 t  (cost=0.00..18819.56 rows=109790 width=526) (actual time=0.012..0.  
076 rows=60 loops=1)  
                           Filter: (deleted = 0::numeric)  
               ->  Index Scan using idx_test2_1 on test2 at  (cost=0.00..14.39 rows=476 width=7) (actual time=0.009..0.010 rows=6 lo  
ops=1)  
         ->  Index Scan using idx_test3_1 on test3 h  (cost=0.00..0.28 rows=1 width=9) (actual time=0.004..0.005 rows=1 loops=24)  
               Index Cond: (t.app_id = app_id)  
 Total runtime: 0.565 ms  
(15 rows)  
  
-- offset 100000 后执行时间就变得很漫长  
digoal=> explain analyze select t.APP_ID, t.APP_VER, t.CN_NAME, t.PACKAGE, t.APK_SIZE, t.APP_SHOW_VER, t.DESCRIPTION,t.CONTENT_PROVIDER,at.APP_TAG,h.SCORE       
digoal-> from   
digoal-> (select APP_ID,max(APP_VER) APP_VER from test1 group by APP_ID) s  
digoal-> join test1 t  
digoal-> on s.APP_ID=t.APP_ID and s.APP_VER=t.APP_VER and t.DELETED=0    
digoal-> left outer join test2 at   
digoal-> on t.APP_ID=at.APP_ID       
digoal-> left outer join   
digoal-> test3 h   
digoal-> on t.APP_ID=h.APP_ID       
digoal-> limit 24 offset  100000;  
                                                                            QUERY PLAN                                                
                                
------------------------------------------------------------------------------------------------------------------------------------  
------------------------------  
 Limit  (cost=42009.25..42009.25 rows=1 width=531) (actual time=1060.506..1060.506 rows=0 loops=1)  
   ->  Nested Loop Left Join  (cost=0.00..42009.25 rows=73 width=531) (actual time=0.088..1051.995 rows=92075 loops=1)  
         ->  Merge Left Join  (cost=0.00..41987.83 rows=73 width=528) (actual time=0.073..605.712 rows=92075 loops=1)  
               Merge Cond: (t.app_id = at.app_id)  
               ->  Merge Join  (cost=0.00..41972.06 rows=73 width=527) (actual time=0.056..564.375 rows=92075 loops=1)  
                     Merge Cond: (test1.app_id = t.app_id)  
                     Join Filter: ((max(test1.app_ver)) = t.app_ver)  
                     ->  GroupAggregate  (cost=0.00..20200.50 rows=79932 width=11) (actual time=0.030..279.882 rows=92796 loops=1)  
                           ->  Index Scan using idx_test1_2 on test1  (cost=0.00..18847.95 rows=110646 width=11) (actual time=0.015.  
.155.840 rows=110646 loops=1)  
                     ->  Index Scan using idx_test1_2 on test1 t  (cost=0.00..19124.56 rows=109857 width=527) (actual time=0.019..15  
4.435 rows=109855 loops=1)  
                           Filter: (deleted = 0::numeric)  
               ->  Index Scan using idx_test2_1 on test2 at  (cost=0.00..14.39 rows=476 width=7) (actual time=0.015..0.283 rows=476   
loops=1)  
         ->  Index Scan using idx_test3_1 on test3 h  (cost=0.00..0.28 rows=1 width=9) (actual time=0.004..0.004 rows=1 loops=92075)  
               Index Cond: (t.app_id = app_id)  
 Total runtime: 1060.683 ms  
(15 rows)  

优化1

消除子查询, select APP_ID,max(APP_VER) APP_VER from test1 group by APP_ID .

这是个读多写少的表, 所以可以这么来优化.

通过增加一个ismax字段, 标记该app_id的app_ver是否是max版本. 因此需要建立一个触发器来完成这个字段的更新, 确保最新的状态.

另外需要一个约束, 确保不会出现ismax重复为true的情况. (由于并发的情况下这个比较难保证, 新增的数据可能都会认为自己是max的版本,所以还有优化2).

alter table test1 add column ismax boolean ;  
update test1 set ismax = true where (app_id,app_ver) in (select app_id,max(app_ver) from test1 group by app_id);  
create index idx_test1_1 on test1(app_id) where ismax is true and deleted=0;  
  
-- 修改后的查询SQL :   
select t.APP_ID, t.APP_VER, t.CN_NAME, t.PACKAGE, t.APK_SIZE, t.APP_SHOW_VER, t.DESCRIPTION,t.CONTENT_PROVIDER,at.APP_TAG,h.SCORE       
from   
test1 t  
left outer join test2 at   
on (t.APP_ID=at.APP_ID and t.DELETED=0 and t.ismax is true)  
left outer join   
test3 h   
on (t.APP_ID=h.APP_ID)  
limit 24 offset 0;  
  
-- 修改后的执行计划  
digoal=> explain analyze select t.APP_ID, t.APP_VER, t.CN_NAME, t.PACKAGE, t.APK_SIZE, t.APP_SHOW_VER, t.DESCRIPTION,t.CONTENT_PROVIDER,at.APP_TAG,h.SCORE       
from   
test1 t  
left outer join test2 at   
on (t.APP_ID=at.APP_ID and t.DELETED=0 and t.ismax is true)  
left outer join   
test3 h   
on (t.APP_ID=h.APP_ID)  
limit 24 offset 0;  
                                                                       QUERY PLAN                                                     
                       
------------------------------------------------------------------------------------------------------------------------------------  
---------------------  
 Limit  (cost=0.00..5.04 rows=24 width=530) (actual time=0.060..0.242 rows=24 loops=1)  
   ->  Merge Right Join  (cost=0.00..23241.69 rows=110646 width=530) (actual time=0.059..0.231 rows=24 loops=1)  
         Merge Cond: (h.app_id = t.app_id)  
         ->  Index Scan using idx_test3_1 on test3 h  (cost=0.00..2511.68 rows=89962 width=9) (actual time=0.017..0.030 rows=8 loops  
=1)  
         ->  Materialize  (cost=0.00..19122.03 rows=110646 width=527) (actual time=0.037..0.163 rows=24 loops=1)  
               ->  Merge Left Join  (cost=0.00..18845.41 rows=110646 width=527) (actual time=0.032..0.126 rows=24 loops=1)  
                     Merge Cond: (t.app_id = at.app_id)  
                     Join Filter: ((t.ismax IS TRUE) AND (t.deleted = 0::numeric))  
                     ->  Index Scan using idx_test1_2 on test1 t  (cost=0.00..18542.94 rows=110646 width=530) (actual time=0.013..0.  
055 rows=24 loops=1)  
                     ->  Materialize  (cost=0.00..15.58 rows=476 width=7) (actual time=0.014..0.017 rows=8 loops=1)  
                           ->  Index Scan using idx_test2_1 on test2 at  (cost=0.00..14.39 rows=476 width=7) (actual time=0.013..0.0  
15 rows=3 loops=1)  
 Total runtime: 0.345 ms  
(12 rows)  
  
-- 修改后的最长执行时间.  
digoal=> explain analyze select t.APP_ID, t.APP_VER, t.CN_NAME, t.PACKAGE, t.APK_SIZE, t.APP_SHOW_VER, t.DESCRIPTION,t.CONTENT_PROVIDER,at.APP_TAG,h.SCORE       
from   
test1 t  
left outer join test2 at   
on (t.APP_ID=at.APP_ID and t.DELETED=0 and t.ismax is true)  
left outer join   
test3 h   
on (t.APP_ID=h.APP_ID)  
limit 24 offset  100000;  
                                                                          QUERY PLAN                                                  
                             
------------------------------------------------------------------------------------------------------------------------------------  
---------------------------  
 Limit  (cost=23606.24..23606.24 rows=1 width=531) (actual time=584.619..584.619 rows=0 loops=1)  
   ->  Merge Right Join  (cost=0.00..23606.24 rows=110646 width=531) (actual time=0.060..576.238 rows=110646 loops=1)  
         Merge Cond: (h.app_id = t.app_id)  
         ->  Index Scan using idx_test3_1 on test3 h  (cost=0.00..2566.38 rows=91875 width=9) (actual time=0.017..83.853 rows=91875   
loops=1)  
         ->  Materialize  (cost=0.00..19427.10 rows=110646 width=528) (actual time=0.037..353.038 rows=110646 loops=1)  
               ->  Merge Left Join  (cost=0.00..19150.48 rows=110646 width=528) (actual time=0.032..261.624 rows=110646 loops=1)  
                     Merge Cond: (t.app_id = at.app_id)  
                     Join Filter: ((t.ismax IS TRUE) AND (t.deleted = 0::numeric))  
                     ->  Index Scan using idx_test1_2 on test1 t  (cost=0.00..18847.95 rows=110646 width=531) (actual time=0.011..15  
9.184 rows=110646 loops=1)  
                     ->  Materialize  (cost=0.00..15.58 rows=476 width=7) (actual time=0.015..0.642 rows=900 loops=1)  
                           ->  Index Scan using idx_test2_1 on test2 at  (cost=0.00..14.39 rows=476 width=7) (actual time=0.013..0.3  
47 rows=476 loops=1)  
 Total runtime: 584.740 ms  
(12 rows)  

优化2

通过优化1, 查询性能基本上有1倍左右的提升.

但是从总时长来看, 越到后面, 每一次翻页都需要消耗几百毫秒. 每页显示的数量越少, 全部翻完所消耗的时间就越长.

我们来用一个函数测试一下使用大小不一样的分页, 看看时间分别是多少.

create or replace function f_test1(i_limit int) returns int as $$  
declare  
v_work int;  
v_count int;  
v_offset int;  
i int;  
begin  
v_work := 0;  
v_count := 0;  
v_offset := 0;  
i := 1;  
raise notice 'start time:%',clock_timestamp();  
loop  
  select count(*) into v_work from   
    (select 1 from   
      test1 t  
      left outer join test2 at   
      on (t.APP_ID=at.APP_ID and t.DELETED=0 and t.ismax is true)  
      left outer join   
      test3 h   
      on (t.APP_ID=h.APP_ID)  
      limit i_limit offset v_offset  
    )t;  
  if v_work=0 then  
    exit;  
  end if;  
  v_offset := i * i_limit;  
  v_count := v_count + v_work;  
  i := i+1;  
end loop;  
raise notice 'end time:%',clock_timestamp();  
return v_count;  
end;  
$$ language plpgsql;  
  
-- 测试每页10000条  
digoal=> select * from f_test1(10000);  
NOTICE:  start time:2012-06-21 10:44:34.148575+08  
NOTICE:  end time:2012-06-21 10:44:36.095619+08  
 f_test1   
---------  
  110646  
(1 row)  
耗时2秒.  
  
-- 测试每页100条  
digoal=> select * from f_test1(100);  
NOTICE:  start time:2012-06-21 10:44:45.448933+08  
NOTICE:  end time:2012-06-21 10:46:53.070959+08  
 f_test1   
---------  
  110646  
(1 row)  
耗时128秒.  

所以第二种优化手段是, 提高每页获取的数量, 增加应用层缓存, 也就是说每次取的页数多一点, 不用每一页都来数据库取. 当然如果应用能够一次把数据全取过去就最好了.

最后一点, 本例的分页SQL都没有ORDER BY, 算是个业务层的BUG, 这种分页是不可取的, 因为无法保证返回的顺序.

参考修改如下 :

select t.APP_ID, t.APP_VER, t.CN_NAME, t.PACKAGE, t.APK_SIZE, t.APP_SHOW_VER, t.DESCRIPTION,t.CONTENT_PROVIDER,at.APP_TAG,h.SCORE       
from   
test1 t  
left outer join test2 at   
on (t.APP_ID=at.APP_ID and t.DELETED=0 and t.ismax is true)  
left outer join   
test3 h   
on (t.APP_ID=h.APP_ID)  
order by t.app_id  
limit 24 offset 0;  

Flag Counter

digoal’s 大量PostgreSQL文章入口