PostgreSQL 11 preview - 分区表智能并行聚合、分组计算(已类似MPP架构,性能暴增)

2 minute read

背景

PostgreSQL 并行计算开始在细节方面进行打磨,例如11已添加了JOIN的分区并行,当两个分区表的分区定义一致时,在分区字段上JOIN就可以用到分区与分区之间直接并行JOIN,而不需要将数据APPEND后在JOIN。

《PostgreSQL 11 preview - 分区表智能并行JOIN (已类似MPP架构,性能暴增)》

现在又一个PATCH要提交了,还是和分区表有关,这次是以分区为单位的聚合、分组计算。

实际上也蛮好理解的,分两种情况,

1、一种是多阶段聚合,原理如下:(常用在MPP数据库,目前PG的单表多阶段并行聚合也是这么做的)

《PostgreSQL 10 自定义并行计算聚合函数的原理与实践 - (含array_agg合并多个数组为单个一元数组的例子)》

2、第二种是智能分区并行聚合,这种情况与第一种不矛盾,但是在非多阶段聚合时也可能能起到优化作用,比如说根据分区字段进行分组聚合,那么分区与分区之间实际上是没有计算集合的重叠的,各自算各自的。

Hi all,  
  
Declarative partitioning is supported in PostgreSQL 10 and work is already  
in  
progress to support partition-wise joins. Here is a proposal for  
partition-wise  
aggregation/grouping.  Our initial performance measurement has shown 7 times  
performance when partitions are on foreign servers and approximately 15%  
when  
partitions are local.  
  
Partition-wise aggregation/grouping computes aggregates for each partition  
separately.  If the group clause contains the partition key, all the rows  
belonging to a given group come from one partition, thus allowing aggregates  
to be computed completely for each partition.  Otherwise, partial aggregates  
computed for each partition are combined across the partitions to produce  
the  
final aggregates. This technique improves performance because:  
i. When partitions are located on foreign server, we can push down the  
aggregate to the foreign server.  
ii. If hash table for each partition fits in memory, but that for the whole  
relation does not, each partition-wise aggregate can use an in-memory hash  
table.  
iii. Aggregation at the level of partitions can exploit properties of  
partitions like indexes, their storage etc.  
  
Attached an experimental patch for the same based on the partition-wise join  
patches posted in [1].  
  
This patch currently implements partition-wise aggregation when group clause  
contains the partitioning key.  A query below, involving a partitioned table  
with 3 partitions containing 1M rows each, producing total 30 groups showed  
15% improvement over non-partition-wise aggregation. Same query showed 7  
times  
improvement when the partitions were located on the foreign servers.  
Patch needs a lot of improvement including:  
1. Support for partial partition-wise aggregation  
2. Estimating number of groups for every partition  
3. Estimating cost of partition-wise aggregation based on sample partitions  
similar to partition-wise join  
and much more.  
  
In order to support partial aggregation on foreign partitions, we need  
support  
to fetch partially aggregated results from the foreign server. That can be  
handled as a separate follow-on patch.  
  
Though is lot of work to be done, I would like to get suggestions/opinions  
from  
hackers.  
  
I would like to thank Ashutosh Bapat for providing a draft patch and helping  
me off-list on this feature while he is busy working on partition-wise join  
feature.  
  
[1]  
https://www.postgresql.org/message-id/CAFjFpRcbY2QN3cfeMTzVEoyF5Lfku-ijyNR%3DPbXj1e%3D9a%3DqMoQ%40mail.gmail.com  
  
Thanks  
  
--   
Jeevan Chalke  
Principal Software Engineer, Product Development  
EnterpriseDB Corporation  
The Enterprise PostgreSQL Company  

这里有两个例子:

开启分区智能聚合,各个分区聚合后再APPEND结果。

Here is the sample plan:  
  
postgres=# set enable_partition_wise_agg to true;  
SET  
postgres=# EXPLAIN ANALYZE SELECT a, count(*) FROM plt1 GROUP BY a;  
                                                  QUERY  
PLAN  
--------------------------------------------------------------------------------------------------------------  
 Append  (cost=5100.00..61518.90 rows=30 width=12) (actual  
time=324.837..944.804 rows=30 loops=1)  
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual  
time=324.837..324.838 rows=10 loops=1)  
         Relations: Aggregate on (public.fplt1_p1 plt1)  
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual  
time=309.954..309.956 rows=10 loops=1)  
         Relations: Aggregate on (public.fplt1_p2 plt1)  
   ->  Foreign Scan  (cost=5100.00..20506.30 rows=10 width=12) (actual  
time=310.002..310.004 rows=10 loops=1)  
         Relations: Aggregate on (public.fplt1_p3 plt1)  
 Planning time: 0.370 ms  
 Execution time: 945.384 ms  
(9 rows)  

关闭分区智能聚合,需要将数据汇聚后,再进行聚合操作。

postgres=# set enable_partition_wise_agg to false;  
SET  
postgres=# EXPLAIN ANALYZE SELECT a, count(*) FROM plt1 GROUP BY a;  
                                                              QUERY  
PLAN  
---------------------------------------------------------------------------------------------------------------------------------------  
 HashAggregate  (cost=121518.01..121518.31 rows=30 width=12) (actual  
time=6498.452..6498.459 rows=30 loops=1)  
   Group Key: plt1.a  
   ->  Append  (cost=0.00..106518.00 rows=3000001 width=4) (actual  
time=0.595..5769.592 rows=3000000 loops=1)  
         ->  Seq Scan on plt1  (cost=0.00..0.00 rows=1 width=4) (actual  
time=0.007..0.007 rows=0 loops=1)  
         ->  Foreign Scan on fplt1_p1  (cost=100.00..35506.00 rows=1000000  
width=4) (actual time=0.587..1844.506 rows=1000000 loops=1)  
         ->  Foreign Scan on fplt1_p2  (cost=100.00..35506.00 rows=1000000  
width=4) (actual time=0.384..1839.633 rows=1000000 loops=1)  
         ->  Foreign Scan on fplt1_p3  (cost=100.00..35506.00 rows=1000000  
width=4) (actual time=0.402..1876.505 rows=1000000 loops=1)  
 Planning time: 0.251 ms  
 Execution time: 6499.018 ms  
(9 rows)  

场景

1、枚举、哈希分区表:

因为在不同分区中,分区字段值非常鲜明分离。可以按分区字段聚合,按分区字段GROUP BY。

2、范围分区表:

某些按分区字段GROUP BY(看GROUP BY的表达式是否有优化空间)

3、继承表,根据子表约束,选择是否可以使用分区智能聚合、GROUP。

参考

https://www.postgresql.org/message-id/flat/CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com#CAM2+6=V64_xhstVHie0Rz=KPEQnLJMZt_e314P0jaT_oJ9MR8A@mail.gmail.com

https://commitfest.postgresql.org/17/1250/

Flag Counter

digoal’s 大量PostgreSQL文章入口