请勿滥用unlogged table & hash index

4 minute read

背景

unlogged table和hash index同样都不会写XLOG,所以如果你用流复制来搞HA,一定概要搞清楚一个问题,切换到备库的话unlogged table数据会被清掉,而hash index也没有,走hash index会失败。

unlogged table 的风险以及修复手段可以见 :

http://blog.163.com/digoal@126/blog/static/163877040201582621345351/

hash index则风险略小,但是也必须重建,但是个人还是建议大家不要使用hash index,改用btree,因为性能确实相差无几。

批量将数据库集群的hash index修改为btree index的方法:

例子:

先并行创建一个btree索引,然后并行删除对应的hash 索引。

$ vi test.sh  
#!/bin/bash  
  
for db in `psql -n -q -t -h 127.0.0.1 -p 1921 -U postgres postgres -c "copy (select datname from pg_database where datname <>'template0') to stdout;"`  
do   
  
psql -n -q -t -h 127.0.0.1 -p 1921 -U postgres $db -c "with t1(sql,nsp,idx) as (select regexp_replace(indexdef,'USING hash','USING btree'),schemaname,indexname from pg_indexes where indexdef ~ 'USING hash'), t2(sql_create,sql_drop) as (select regexp_replace(sql,'CREATE INDEX','CREATE INDEX CONCURRENTLY'), 'DROP INDEX CONCURRENTLY '||quote_ident(nsp)||'.'||quote_ident(idx) from t1) select regexp_replace(sql_create,'CONCURRENTLY (.*) ON','CONCURRENTLY \1_0926 ON')  ||'; '|| sql_drop ||'; ' from t2;"|psql -a -e -h 127.0.0.1 -p 1921 -U postgres $db -f -  
  
done  
  
$ . ./test.sh  

这个with查询的结果如下举例:

postgres=# with t1(sql,nsp,idx) as (select regexp_replace(indexdef,'USING hash','USING btree'),schemaname,indexname from pg_indexes where indexdef ~ 'USING hash'),  
t2(sql_create,sql_drop) as (select regexp_replace(sql,'CREATE INDEX','CREATE INDEX CONCURRENTLY'), 'DROP INDEX CONCURRENTLY '||quote_ident(nsp)||'.'||quote_ident(idx) from t1)  
select regexp_replace(sql_create,'CONCURRENTLY (.*) ON','CONCURRENTLY \1_0926 ON')  ||'; '|| sql_drop ||'; ' from t2;  
                                            ?column?                                               
-------------------------------------------------------------------------------------------------  
 CREATE INDEX CONCURRENTLY hi1_0926 ON t USING btree (id); DROP INDEX CONCURRENTLY public.hi1;   
 CREATE INDEX CONCURRENTLY hi2_0926 ON s1.tbl USING btree (id); DROP INDEX CONCURRENTLY s1.hi2;   
(2 rows)  

或者在每个数据库调用这个inline code:

do language plpgsql $$  
declare  
  v_sql text;   
  v_schema name;   
  v_idx name;   
  sql1 text;  
begin  
for v_sql,v_schema,v_idx in select regexp_replace(indexdef,'USING hash','USING btree'),schemaname,indexname from pg_indexes where indexdef ~ 'USING hash'  
loop  
  sql1='DROP INDEX '||quote_ident(v_schema)||'.'||quote_ident(v_idx);   
  execute sql1;   
  execute v_sql;   
end loop;   
end;   
$$;   
  
postgres=# \d t  
       Table "public.t"  
 Column |  Type   | Modifiers   
--------+---------+-----------  
 id     | integer | not null  
Indexes:  
    "t_pkey" PRIMARY KEY, btree (id)  
    "i1" btree (id)  
    "i2" btree (id)  
  
postgres=# \d s1.tbl  
        Table "s1.tbl"  
 Column |  Type   | Modifiers   
--------+---------+-----------  
 id     | integer |   
Indexes:  
    "i1" btree (id)  

关于hash和btree性能:

查询性能:

postgres=# create table tbl(id int, info text);  
CREATE TABLE  
postgres=# insert into tbl select generate_series(1,1000000);  
INSERT 0 1000000  
postgres=# create index idx_tbl1 on tbl using hash (id);  
  
$ vi test.sql  
\setrandom id 1 1000000  
select * from tbl where id=:id;  
  
postgres@digoal-> pgbench -M prepared -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 30  
progress: 1.0 s, 24219.3 tps, lat 0.258 ms stddev 0.436  
progress: 2.0 s, 29387.1 tps, lat 0.270 ms stddev 0.401  
progress: 3.0 s, 29281.0 tps, lat 0.271 ms stddev 0.442  
progress: 4.0 s, 29231.7 tps, lat 0.272 ms stddev 0.844  
......  
transaction type: Custom query  
scaling factor: 1  
query mode: prepared  
number of clients: 8  
number of threads: 8  
duration: 30 s  
number of transactions actually processed: 876806  
latency average: 0.270 ms  
latency stddev: 0.592 ms  
tps = 29202.503991 (including connections establishing)  
tps = 29379.956438 (excluding connections establishing)  
statement latencies in milliseconds:  
        0.003062        \setrandom id 1 1000000  
        0.266481        select * from tbl where id=:id;  
  
postgres=# drop index idx_tbl1 ;  
DROP INDEX  
postgres=# create index idx_tbl1 on tbl using btree (id);  
CREATE INDEX  
  
postgres@digoal-> pgbench -M prepared -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 30  
progress: 1.0 s, 28414.2 tps, lat 0.240 ms stddev 0.306  
progress: 2.0 s, 31192.2 tps, lat 0.255 ms stddev 0.605  
progress: 3.0 s, 31022.8 tps, lat 0.256 ms stddev 0.451  
progress: 4.0 s, 29587.1 tps, lat 0.268 ms stddev 0.671  
......  
transaction type: Custom query  
scaling factor: 1  
query mode: prepared  
number of clients: 8  
number of threads: 8  
duration: 30 s  
number of transactions actually processed: 903467  
latency average: 0.263 ms  
latency stddev: 0.678 ms  
tps = 30088.054150 (including connections establishing)  
tps = 30229.295069 (excluding connections establishing)  
statement latencies in milliseconds:  
        0.002900        \setrandom id 1 1000000  
        0.259402        select * from tbl where id=:id;  

更新性能

$ vi test.sql  
\setrandom id 1 1000000  
update tbl set id=1+:id where id=:id;  
  
postgres@digoal-> pgbench -M prepared -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 30  
progress: 1.0 s, 12500.2 tps, lat 0.570 ms stddev 0.864  
progress: 2.0 s, 17456.9 tps, lat 0.456 ms stddev 0.641  
progress: 3.0 s, 18242.3 tps, lat 0.435 ms stddev 0.234  
progress: 4.0 s, 17693.0 tps, lat 0.450 ms stddev 0.909  
progress: 5.0 s, 17753.3 tps, lat 0.448 ms stddev 0.758  
......  
transaction type: Custom query  
scaling factor: 1  
query mode: prepared  
number of clients: 8  
number of threads: 8  
duration: 30 s  
number of transactions actually processed: 521331  
latency average: 0.456 ms  
latency stddev: 0.702 ms  
tps = 17372.386945 (including connections establishing)  
tps = 17430.542432 (excluding connections establishing)  
statement latencies in milliseconds:  
        0.003072        \setrandom id 1 1000000  
        0.452489        update tbl set id=1+:id where id=:id;  
  
postgres=# drop index idx_tbl1 ;  
DROP INDEX  
postgres=# create index idx_tbl1 on tbl using hash (id);  
CREATE INDEX  
  
postgres@digoal-> pgbench -M prepared -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 30  
progress: 1.0 s, 16321.8 tps, lat 0.411 ms stddev 0.521  
progress: 2.0 s, 17372.0 tps, lat 0.458 ms stddev 0.409  
progress: 3.0 s, 16731.5 tps, lat 0.475 ms stddev 1.094  
progress: 4.0 s, 16972.9 tps, lat 0.469 ms stddev 0.880  
progress: 5.0 s, 17392.7 tps, lat 0.457 ms stddev 0.607  
......  
transaction type: Custom query  
scaling factor: 1  
query mode: prepared  
number of clients: 8  
number of threads: 8  
duration: 30 s  
number of transactions actually processed: 527778  
latency average: 0.450 ms  
latency stddev: 0.678 ms  
tps = 17587.300360 (including connections establishing)  
tps = 17671.181609 (excluding connections establishing)  
statement latencies in milliseconds:  
        0.002955        \setrandom id 1 1000000  
        0.446435        update tbl set id=1+:id where id=:id;  
  
SIZE  
postgres=# create index idx_tbl1 on tbl using hash (id);  
CREATE INDEX  
postgres=# create index idx_tbl2 on tbl using btree (id);  
CREATE INDEX  
postgres=# \di+ idx_tbl*  
                         List of relations  
 Schema |   Name   | Type  |  Owner   | Table | Size  | Description   
--------+----------+-------+----------+-------+-------+-------------  
 public | idx_tbl1 | index | postgres | tbl   | 32 MB |   
 public | idx_tbl2 | index | postgres | tbl   | 21 MB |   
(2 rows)  

小结

查询和更新性能相差无几。

大小,btree是hash的2/3。

Flag Counter

digoal’s 大量PostgreSQL文章入口