Greenplum 列存表(AO表)的膨胀、垃圾检查与空间收缩(含修改分布键)
背景
Greenplum支持行存储(堆存储)与AO存储,堆存储的垃圾回收和膨胀检测方法请参考:
《如何检测、清理Greenplum膨胀、垃圾 - 阿里云HybridDB for PG最佳实践》
对于AO存储,虽然是appendonly,但实际上GP是支持DELETE和UPDATE的,被删除或更新的行,通过BITMAP来标记。
AO存储是块级组织,当一个块内的数据大部分都被删除或更新掉时,扫描它浪费的成本实际上是很高的。
如何检查AO表的膨胀,收缩AO表呢?
如何查看表的存储结构
pg_class.relstorage表示这个对象是什么存储:
postgres=# select distinct relstorage from pg_class ;
relstorage
------------
a -- 行存储AO表
h -- heap堆表、索引
x -- 外部表(external table)
v -- 视图
c -- 列存储AO表
(5 rows)
查询当前数据库有哪些AO表:
postgres=# select t2.nspname, t1.relname from pg_class t1, pg_namespace t2 where t1.relnamespace=t2.oid and relstorage in ('c', 'a');
nspname | relname
----------+-------------------
postgres | tbl_tag
postgres | tbl_pos_1_prt_p1
postgres | tbl_pos_1_prt_p2
postgres | tbl_pos_1_prt_p3
postgres | tbl_pos_1_prt_p4
postgres | tbl_pos_1_prt_p5
postgres | tbl_pos_1_prt_p6
postgres | tbl_pos_1_prt_p7
postgres | tbl_pos_1_prt_p8
postgres | tbl_pos_1_prt_p9
postgres | tbl_pos_1_prt_p10
postgres | tbl_pos
postgres | xx_czrk_qm_col
postgres | ao1
(14 rows)
查询当前数据库有哪些堆表:
select t2.nspname, t1.relname from pg_class t1, pg_namespace t2 where t1.relnamespace=t2.oid and relstorage in ('h') and relkind='r';
查看AO表的膨胀率(有多少垃圾)
使用这个函数可以查看AO表的膨胀率
postgres=# \df+ gp_toolkit.__gp_aovisimap_compaction_info
List of functions
-[ RECORD 1 ]-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Schema | gp_toolkit
Name | __gp_aovisimap_compaction_info
Result data type | SETOF record
Argument data types | ao_oid oid, OUT content integer, OUT datafile integer, OUT compaction_possible boolean, OUT hidden_tupcount bigint, OUT total_tupcount bigint, OUT percent_hidden numeric
Type | normal
Data access | no sql
Volatility | volatile
Owner | dege.zzz
Language | plpgsql
Source code |
| DECLARE
| hinfo_row RECORD;
| threshold float;
| BEGIN
| EXECUTE 'show gp_appendonly_compaction_threshold' INTO threshold;
| FOR hinfo_row IN SELECT gp_segment_id,
| gp_toolkit.__gp_aovisimap_hidden_typed(ao_oid)::gp_toolkit.__gp_aovisimap_hidden_t
| FROM gp_dist_random('gp_id') LOOP
| content := hinfo_row.gp_segment_id;
| datafile := (hinfo_row.__gp_aovisimap_hidden_typed).seg;
| hidden_tupcount := (hinfo_row.__gp_aovisimap_hidden_typed).hidden;
| total_tupcount := (hinfo_row.__gp_aovisimap_hidden_typed).total;
| compaction_possible := false;
| IF total_tupcount > 0 THEN
| percent_hidden := (100 * hidden_tupcount / total_tupcount::numeric)::numeric(5,2);
| ELSE
| percent_hidden := 0::numeric(5,2);
| END IF;
| IF percent_hidden > threshold THEN
| compaction_possible := true;
| END IF;
| RETURN NEXT;
| END LOOP;
| RAISE NOTICE 'gp_appendonly_compaction_threshold = %', threshold;
| RETURN;
| END;
|
Description |
例子
postgres=# select * from gp_toolkit.__gp_aovisimap_compaction_info('postgres.ao1'::regclass);
NOTICE: gp_appendonly_compaction_threshold = 10
content | datafile | compaction_possible | hidden_tupcount | total_tupcount | percent_hidden
---------+----------+---------------------+-----------------+----------------+----------------
2 | 1 | f | 21 | 208283 | 0.01
9 | 1 | f | 25 | 208303 | 0.01
16 | 1 | f | 16 | 208352 | 0.01
4 | 1 | f | 23 | 208356 | 0.01
6 | 1 | f | 23 | 208386 | 0.01
3 | 1 | f | 14 | 208333 | 0.01
1 | 1 | f | 14 | 208329 | 0.01
14 | 1 | f | 12 | 208350 | 0.01
15 | 1 | f | 24 | 208346 | 0.01
7 | 1 | f | 22 | 208329 | 0.01
8 | 1 | f | 18 | 208334 | 0.01
0 | 1 | f | 21 | 208314 | 0.01
18 | 1 | f | 16 | 208417 | 0.01
11 | 1 | f | 24 | 208337 | 0.01
17 | 1 | f | 31 | 208380 | 0.01
12 | 1 | f | 12 | 208367 | 0.01
13 | 1 | f | 22 | 208365 | 0.01
5 | 1 | f | 22 | 208367 | 0.01
10 | 1 | f | 18 | 208347 | 0.01
20 | 1 | f | 17 | 208384 | 0.01
27 | 1 | f | 22 | 208348 | 0.01
19 | 1 | f | 31 | 208425 | 0.01
23 | 1 | f | 28 | 208344 | 0.01
26 | 1 | f | 14 | 208339 | 0.01
25 | 1 | f | 21 | 208386 | 0.01
24 | 1 | f | 14 | 208332 | 0.01
21 | 1 | f | 28 | 208360 | 0.01
33 | 1 | f | 18 | 208354 | 0.01
31 | 1 | f | 23 | 208335 | 0.01
22 | 1 | f | 17 | 208309 | 0.01
28 | 1 | f | 21 | 208314 | 0.01
29 | 1 | f | 23 | 208329 | 0.01
30 | 1 | f | 22 | 208342 | 0.01
35 | 1 | f | 18 | 208305 | 0.01
34 | 1 | f | 26 | 208363 | 0.01
32 | 1 | f | 25 | 208396 | 0.01
36 | 1 | f | 23 | 208323 | 0.01
38 | 1 | f | 22 | 208367 | 0.01
37 | 1 | f | 12 | 208334 | 0.01
39 | 1 | f | 12 | 208389 | 0.01
41 | 1 | f | 16 | 208373 | 0.01
42 | 1 | f | 30 | 208376 | 0.01
40 | 1 | f | 31 | 208397 | 0.01
43 | 1 | f | 16 | 208378 | 0.01
46 | 1 | f | 29 | 208371 | 0.01
47 | 1 | f | 16 | 208382 | 0.01
45 | 1 | f | 17 | 208368 | 0.01
44 | 1 | f | 29 | 208381 | 0.01
(48 rows)
解读:
1、content:对应gp_configuration.content,表示greenplum每个节点的唯一编号。
2、datafile:这条记录对应的这个表的其中一个数据文件的编号,每个数据文件假设1GB。
3、compaction_possible:这个数据文件是否可以被收缩。(通过gp_appendonly_compaction_threshold参数和percent_hidden值判断)。
4、hidden_tupcount:有多少条记录已更新或删除(不可见)。
5、total_tupcount:总共有多少条记录(包括已更新或删除的记录)。
6、percent_hidden:不可见记录的占比。如果这个占比大于gp_appendonly_compaction_threshold参数,那么执行vacuum时,会收缩这个数据文件。
检查系统中膨胀率超过N的AO表
膨胀率超过千分之2的AO表:
select * from (
select t2.nspname, t1.relname, (gp_toolkit.__gp_aovisimap_compaction_info(t1.oid)).*
from pg_class t1, pg_namespace t2 where t1.relnamespace=t2.oid and relstorage in ('c', 'a')
) t
where t.percent_hidden > 0.2;
返回结果
nspname | relname | content | datafile | compaction_possible | hidden_tupcount | total_tupcount | percent_hidden
----------+-------------------+---------+----------+---------------------+-----------------+----------------+----------------
postgres | tbl_pos_1_prt_p1 | 1 | 1 | t | 20833382 | 20833382 | 100.00
postgres | tbl_pos_1_prt_p1 | 7 | 1 | t | 20833495 | 20833628 | 100.00
postgres | tbl_pos_1_prt_p1 | 5 | 1 | t | 20833628 | 20833495 | 100.00
postgres | tbl_pos_1_prt_p1 | 3 | 1 | t | 20833469 | 20833469 | 100.00
....
显然膨胀了100%,验证如下(1条记录有效,占用了15GB空间):
postgres=# select pg_size_pretty(pg_relation_size('tbl_pos_1_prt_p1'));
pg_size_pretty
----------------
15 GB
(1 row)
postgres=# select count(*) from tbl_pos_1_prt_p1;
count
-------
1
(1 row)
vacuum可以直接收缩(因为膨胀率大于gp_appendonly_compaction_threshold参数):
postgres=# vacuum tbl_pos_1_prt_p1;
VACUUM
postgres=# select count(*) from tbl_pos_1_prt_p1;
count
-------
1
(1 row)
postgres=# select pg_size_pretty(pg_relation_size('tbl_pos_1_prt_p1'));
pg_size_pretty
----------------
40 bytes
(1 row)
VACUUM后,只占用40字节。
收缩已膨胀的AO表
方法有三:
1、执行VACUUM。(当膨胀率大于gp_appendonly_compaction_threshold参数时),为共享锁。
2、执行VACUUM FULL。(不管gp_appendonly_compaction_threshold参数的设置,都会回收垃圾空间。),为DDL锁。
3、执行重分布。(不管gp_appendonly_compaction_threshold参数,都会回收垃圾空间。),为DDL锁。
alter table <table_name> set with (reorganize=true) distributed randomly;
或
alter table <table_name> set with (reorganize=true) distributed by (<column_names1>,<column_names2>....)
set distribute可以回收索引的膨胀空间。set distribute 加载的锁与DDL锁类似,是排它锁。建议在没有业务的时候执行,不要堵塞业务。
同时set distribute只要分布条件不变,就是在节点内完成的,不会涉及数据的重分布。
建议的操作流程:
1 记录下表的分布列
2 执行set distribute (REORGANIZE=true)
如果是随机分布,则设置为随机分布
跑linux任务定时清理膨胀
结合本文提到的方法,取出膨胀的表实施vacuum或vacuum freeze或vacuum full.
export PGHOST=$1
export PGPORT=$2
export PGDATABASE=$3
export PGUSER=$4
export PGPASSWORD=$5
for tbl in `psql -q -t -A -c "select quote_ident(schemaname)||'.'||quote_ident(tablename) from pg_tables"`
do
psql <<EOF
set vacuum_freeze_min_age=0 ;
-- vacuum freeze $tbl ;
vacuum $tbl ;
EOF
done
参考
http://greenplum.org/docs/510/admin_guide/managing/maintain.html
If the ratio of hidden rows to total rows in a segment file is less than a threshold value (10, by default),
the segment file is not compacted.
The threshold value can be configured with the gp_appendonly_compaction_threshold server configuration parameter.
VACUUM FULL ignores the value of gp_appendonly_compaction_threshold and rewrites the segment file regardless of the ratio.
You can use the __gp_aovisimap_compaction_info() function in the the gp_toolkit schema to investigate
the effectiveness of a VACUUM operation on append-optimized tables.
《如何检测、清理Greenplum膨胀、垃圾 - 阿里云HybridDB for PG最佳实践》
https://wiki.postgresql.org/wiki/Show_database_bloat