HOW to Change PostgreSQL’s TOAST_TUPLE_THRESHOLD
背景
TOAST表的设计,在一个需要频繁更新(小字段更新)操作的表中非常有用,可以大大降低不必要的大字段带来的IO和CPU(大字段上有索引时) 。
优化场景可参考 :
http://blog.163.com/digoal@126/blog/static/16387704020130931040444/
但是PostgreSQL对什么时候启用TOAST存储做了限制,
检测是否需要TOAST的函数如下 :
src/backend/catalog/toasting.c
/*
* Check to see whether the table needs a TOAST table. It does only if
* (1) there are any toastable attributes, and (2) the maximum length
* of a tuple could exceed TOAST_TUPLE_THRESHOLD. (We don't want to
* create a toast table for something like "f1 varchar(20)".)
*/
static bool
needs_toast_table(Relation rel)
{
int32 data_length = 0;
bool maxlength_unknown = false;
bool has_toastable_attrs = false;
TupleDesc tupdesc;
Form_pg_attribute *att;
int32 tuple_length;
int i;
tupdesc = rel->rd_att;
att = tupdesc->attrs;
for (i = 0; i < tupdesc->natts; i++)
{
if (att[i]->attisdropped)
continue;
data_length = att_align_nominal(data_length, att[i]->attalign);
if (att[i]->attlen > 0)
{
/* Fixed-length types are never toastable */
data_length += att[i]->attlen;
}
else
{
int32 maxlen = type_maximum_size(att[i]->atttypid,
att[i]->atttypmod);
if (maxlen < 0)
maxlength_unknown = true;
else
data_length += maxlen;
if (att[i]->attstorage != 'p')
has_toastable_attrs = true;
}
}
if (!has_toastable_attrs)
return false; /* nothing to toast? */
if (maxlength_unknown)
return true; /* any unlimited-length attrs? */
tuple_length = MAXALIGN(offsetof(HeapTupleHeaderData, t_bits) +
BITMAPLEN(tupdesc->natts)) +
MAXALIGN(data_length);
return (tuple_length > TOAST_TUPLE_THRESHOLD);
}
TOAST_TUPLE_THRESHOLD如下 :
src/include/access/tuptoaster.h
/*
* Find the maximum size of a tuple if there are to be N tuples per page.
*/
#define MaximumBytesPerTuple(tuplesPerPage) \
MAXALIGN_DOWN((BLCKSZ - \
MAXALIGN(SizeOfPageHeaderData + (tuplesPerPage) * sizeof(ItemIdData))) \
/ (tuplesPerPage))
/*
* These symbols control toaster activation. If a tuple is larger than
* TOAST_TUPLE_THRESHOLD, we will try to toast it down to no more than
* TOAST_TUPLE_TARGET bytes through compressing compressible fields and
* moving EXTENDED and EXTERNAL data out-of-line.
*
* The numbers need not be the same, though they currently are. It doesn't
* make sense for TARGET to exceed THRESHOLD, but it could be useful to make
* it be smaller.
*
* Currently we choose both values to match the largest tuple size for which
* TOAST_TUPLES_PER_PAGE tuples can fit on a heap page.
*
* XXX while these can be modified without initdb, some thought needs to be
* given to needs_toast_table() in toasting.c before unleashing random
* changes. Also see LOBLKSIZE in large_object.h, which can *not* be
* changed without initdb.
*/
#define TOAST_TUPLES_PER_PAGE 4
#define TOAST_TUPLE_THRESHOLD MaximumBytesPerTuple(TOAST_TUPLES_PER_PAGE)
#define TOAST_TUPLE_TARGET TOAST_TUPLE_THRESHOLD
从上面定义可以得出, 如果你编译PostgreSQL时指定的数据块是8KB的,
--with-blocksize=BLOCKSIZE
set table block size in kB [8]
那么TOAST_TUPLE_THRESHOLD约为2KB
MAXALIGN_DOWN(
(BLCKSZ - MAXALIGN(SizeOfPageHeaderData + (tuplesPerPage) * sizeof(ItemIdData)))
/
(tuplesPerPage)
)
如果想要在字段长度[可选压缩后]大于200字节后就启用TOAST, 可以调整TOAST_TUPLES_PER_PAGE为40.
重新编译即可.
正文
重新编译后测试验证 :
postgres=# create table test (info text);
CREATE TABLE
改为external, 不压缩. 这样更能看出结果.
postgres=# alter table test alter column info set storage external;
ALTER TABLE
postgres=# select pg_column_size(repeat('a', 200));
pg_column_size
----------------
204
(1 row)
postgres=# insert into test values (repeat('a', 200));
INSERT 0 1
postgres=# select pg_relation_size(reltoastrelid::regclass) from pg_class where relname='test';
pg_relation_size
------------------
8192
(1 row)
可以看出200字节已经存储到TOAST表了.
在未修改TOAST_TUPLES_PER_PAGE的PostgreSQL数据库中测试如下 :
digoal=> truncate test;
TRUNCATE TABLE
digoal=> insert into test values (repeat('a', 200));
INSERT 0 1
digoal=> select pg_relation_size(reltoastrelid::regclass) from pg_class where relname='test';
pg_relation_size
------------------
0
(1 row)
可以看出200字节未触发存储到TOAST表的动作
digoal=> insert into test values (repeat('a', 2000));
INSERT 0 1
digoal=> select pg_relation_size(reltoastrelid::regclass) from pg_class where relname='test';
pg_relation_size
------------------
0
(1 row)
2000字节未触发.
digoal=> insert into test values (repeat('a', 2100));
INSERT 0 1
digoal=> select pg_relation_size(reltoastrelid::regclass) from pg_class where relname='test';
pg_relation_size
------------------
8192
(1 row)
2100字节触发TOAST.
参考
1. http://blog.163.com/digoal@126/blog/static/1638770402012116115354333/
2. http://blog.163.com/digoal@126/blog/static/16387704020120524144140/
3. http://blog.163.com/digoal@126/blog/static/163877040201122910531988/
4. src/include/access/tuptoaster.h
5. src/backend/catalog/toasting.c