PostgreSQL 9.x, 10, 11 hash分区表用法举例

5 minute read

背景

PostgreSQL 10开始内置分区表语法，当时只支持了range,list两种分区，实际上可以通过LIST实现HASH分区。

PostgreSQL 10 hash 分区表

使用list支持hash分区

postgres=# create table p (id int , info text, crt_time timestamp) partition by list (abs(mod(id,4)));  
CREATE TABLE  
  
postgres=# create table p0 partition of p for values in (0);  
CREATE TABLE  
postgres=# create table p1 partition of p for values in (1);  
CREATE TABLE  
postgres=# create table p2 partition of p for values in (2);  
CREATE TABLE  
postgres=# create table p3 partition of p for values in (3);  
CREATE TABLE  

分区表如下

postgres=# \d+ p  
                                                Table "public.p"  
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description   
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------  
 id       | integer                     |           |          |         | plain    |              |   
 info     | text                        |           |          |         | extended |              |   
 crt_time | timestamp without time zone |           |          |         | plain    |              |   
Partition key: LIST (abs(mod(id, 4)))  
Partitions: p0 FOR VALUES IN (0),  
            p1 FOR VALUES IN (1),  
            p2 FOR VALUES IN (2),  
            p3 FOR VALUES IN (3)  

写入数据

postgres=# insert into p select generate_series(1,1000),md5(random()::text),now();  
INSERT 0 1000  
postgres=# select tableoid::regclass,id from p limit 10;  
 tableoid | id   
----------+----  
 p0       |  4  
 p0       |  8  
 p0       | 12  
 p0       | 16  
 p0       | 20  
 p0       | 24  
 p0       | 28  
 p0       | 32  
 p0       | 36  
 p0       | 40  
(10 rows)  

普通的查询，无法做到分区的过滤

postgres=# explain select * from p where id=1 ;  
                        QUERY PLAN                          
----------------------------------------------------------  
 Append  (cost=0.00..96.50 rows=24 width=44)  
   ->  Seq Scan on p0  (cost=0.00..24.12 rows=6 width=44)  
         Filter: (id = 1)  
   ->  Seq Scan on p1  (cost=0.00..24.12 rows=6 width=44)  
         Filter: (id = 1)  
   ->  Seq Scan on p2  (cost=0.00..24.12 rows=6 width=44)  
         Filter: (id = 1)  
   ->  Seq Scan on p3  (cost=0.00..24.12 rows=6 width=44)  
         Filter: (id = 1)  
(9 rows)  

一定要带上分区条件，才可以做到分区过滤

postgres=# explain select * from p where id=1 and abs(mod(id, 4))=abs(mod(1, 4));  
                        QUERY PLAN                          
----------------------------------------------------------  
 Append  (cost=0.00..32.60 rows=1 width=44)  
   ->  Seq Scan on p1  (cost=0.00..32.60 rows=1 width=44)  
         Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))  
(3 rows)  

PostgreSQL 11 hash 分区表

PostgreSQL 11同样可以使用与10一样的方法，LIST来实现HASH分区，但是有一个更加优雅的方法，直接使用HASH分区。

postgres=# create table p (id int , info text, crt_time timestamp) partition by hash (id);  
CREATE TABLE  
  
postgres=# create table p0 partition of p  for values WITH (MODULUS 4, REMAINDER 0);  
CREATE TABLE  
postgres=# create table p1 partition of p  for values WITH (MODULUS 4, REMAINDER 1);  
CREATE TABLE  
postgres=# create table p2 partition of p  for values WITH (MODULUS 4, REMAINDER 2);  
CREATE TABLE  
postgres=# create table p3 partition of p  for values WITH (MODULUS 4, REMAINDER 3);  
CREATE TABLE  

表结构如下

postgres=# \d+ p  
                                                Table "public.p"  
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description   
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------  
 id       | integer                     |           |          |         | plain    |              |   
 info     | text                        |           |          |         | extended |              |   
 crt_time | timestamp without time zone |           |          |         | plain    |              |   
Partition key: HASH (id)  
Partitions: p0 FOR VALUES WITH (modulus 4, remainder 0),  
            p1 FOR VALUES WITH (modulus 4, remainder 1),  
            p2 FOR VALUES WITH (modulus 4, remainder 2),  
            p3 FOR VALUES WITH (modulus 4, remainder 3)  

表分区定义，内置的约束是一个HASH函数的返回值

postgres=# \d+ p0  
                                                Table "public.p0"  
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description   
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------  
 id       | integer                     |           |          |         | plain    |              |   
 info     | text                        |           |          |         | extended |              |   
 crt_time | timestamp without time zone |           |          |         | plain    |              |   
Partition of: p FOR VALUES WITH (modulus 4, remainder 0)  
Partition constraint: satisfies_hash_partition('180289'::oid, 4, 0, id)  
  
  
postgres=# \d+ p1  
                                                Table "public.p1"  
  Column  |            Type             | Collation | Nullable | Default | Storage  | Stats target | Description   
----------+-----------------------------+-----------+----------+---------+----------+--------------+-------------  
 id       | integer                     |           |          |         | plain    |              |   
 info     | text                        |           |          |         | extended |              |   
 crt_time | timestamp without time zone |           |          |         | plain    |              |   
Partition of: p FOR VALUES WITH (modulus 4, remainder 1)  
Partition constraint: satisfies_hash_partition('180289'::oid, 4, 1, id)  

这个hash函数的定义如下，他一定是一个immutable 函数，所以可以用于分区过滤

postgres=# \x  
Expanded display is on.  
postgres=# \df+ satisfies_hash_partition  
List of functions  
-[ RECORD 1 ]-------+--------------------------------------  
Schema              | pg_catalog  
Name                | satisfies_hash_partition  
Result data type    | boolean  
Argument data types | oid, integer, integer, VARIADIC "any"  
Type                | func  
Volatility          | immutable  
Parallel            | safe  
Owner               | postgres  
Security            | invoker  
Access privileges   |   
Language            | internal  
Source code         | satisfies_hash_partition  
Description         | hash partition CHECK constraint  

PostgreSQL 11终于可以只输入分区字段值就可以做到分区过滤了

postgres=# explain select * from p where id=1;  
                        QUERY PLAN                          
----------------------------------------------------------  
 Append  (cost=0.00..24.16 rows=6 width=44)  
   ->  Seq Scan on p0  (cost=0.00..24.12 rows=6 width=44)  
         Filter: (id = 1)  
(3 rows)  
  
postgres=# explain select * from p where id=0;  
                        QUERY PLAN                          
----------------------------------------------------------  
 Append  (cost=0.00..24.16 rows=6 width=44)  
   ->  Seq Scan on p0  (cost=0.00..24.12 rows=6 width=44)  
         Filter: (id = 0)  
(3 rows)  
  
postgres=# explain select * from p where id=2;  
                        QUERY PLAN                          
----------------------------------------------------------  
 Append  (cost=0.00..24.16 rows=6 width=44)  
   ->  Seq Scan on p2  (cost=0.00..24.12 rows=6 width=44)  
         Filter: (id = 2)  
(3 rows)  
  
postgres=# explain select * from p where id=3;  
                        QUERY PLAN                          
----------------------------------------------------------  
 Append  (cost=0.00..24.16 rows=6 width=44)  
   ->  Seq Scan on p1  (cost=0.00..24.12 rows=6 width=44)  
         Filter: (id = 3)  
(3 rows)  

它受控于一个开关，当关闭后，就无法只通过分区值来过滤分区。

postgres=# set enable_partition_pruning =off;  
SET  
postgres=# explain select * from p where id=0;  
                        QUERY PLAN                          
----------------------------------------------------------  
 Append  (cost=0.00..96.62 rows=24 width=44)  
   ->  Seq Scan on p0  (cost=0.00..24.12 rows=6 width=44)  
         Filter: (id = 0)  
   ->  Seq Scan on p1  (cost=0.00..24.12 rows=6 width=44)  
         Filter: (id = 0)  
   ->  Seq Scan on p2  (cost=0.00..24.12 rows=6 width=44)  
         Filter: (id = 0)  
   ->  Seq Scan on p3  (cost=0.00..24.12 rows=6 width=44)  
         Filter: (id = 0)  
(9 rows)  

PostgreSQL 继承表 hash 分区表实现

《PostgreSQL 传统 hash 分区方法和性能》

postgres=# explain select * from tbl where abs(mod(id,4)) = abs(mod(1,4)) and id=1;    
                                QUERY PLAN                                    
--------------------------------------------------------------------------    
 Append  (cost=0.00..979127.84 rows=3 width=45)    
   ->  Seq Scan on tbl  (cost=0.00..840377.67 rows=2 width=45)    
         Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))    
   ->  Seq Scan on tbl1  (cost=0.00..138750.17 rows=1 width=45)    
         Filter: ((id = 1) AND (abs(mod(id, 4)) = 1))    
(5 rows)