PostgreSQL bloom filter index 扩展 for bigint

2 minute read

背景

凡是支持HASH函数,以及相等operator的类型,都可以使用bloom filter index .

扩展方法见本文。

原文

https://obartunov.livejournal.com/201027.html

Bloom index by default works for int4 and text, but other types with hash function and equality operator could be supported.

Just use opclass interface, for example, for type bigint

create extension bloom;  
  
postgres=# select * from pg_opclass where opcname='bigint_ops';  
 opcmethod | opcname | opcnamespace | opcowner | opcfamily | opcintype | opcdefault | opckeytype   
-----------+---------+--------------+----------+-----------+-----------+------------+------------  
(0 rows)  

创建bigint的bloom filter index支持。

CREATE OPERATOR CLASS bigint_ops   
DEFAULT FOR TYPE bigint USING bloom AS   
OPERATOR 1 = (bigint, bigint),  
FUNCTION 1 hashint8(bigint);  
  
  
postgres=# select * from pg_opclass where opcname='bigint_ops';  
 opcmethod |  opcname   | opcnamespace | opcowner | opcfamily | opcintype | opcdefault | opckeytype   
-----------+------------+--------------+----------+-----------+-----------+------------+------------  
    136065 | bigint_ops |         2200 |       10 |    136074 |        20 | t          |          0  
(1 row)	  

Now, you can build bloom index for bigint data type.

Data types, which could be supported by bloom index.

查询可以支持bloom filter的类型,HASH函数

凡是支持HASH函数,以及相等operator操作符的类型,都可以使用bloom filter index .

postgres=# select oid,* from pg_am;  
  oid   | amname |  amhandler  | amtype   
--------+--------+-------------+--------  
    403 | btree  | bthandler   | i  
    405 | hash   | hashhandler | i  
    783 | gist   | gisthandler | i  
   2742 | gin    | ginhandler  | i  
   4000 | spgist | spghandler  | i  
   3580 | brin   | brinhandler | i  
  18204 | rum    | rumhandler  | i  
 136050 | bloom  | blhandler   | i  
(8 rows)  

查询可以支持bloom filter的类型,HASH函数

SELECT oc.opcintype::regtype, p.amproc FROM pg_opclass oc  
JOIN pg_amproc p ON p.amprocfamily = oc.opcfamily  
WHERE oc.opcmethod = 405 -- hash am   
AND oc.opcdefault -- 默认proc for this am  
-- https://www.postgresql.org/docs/devel/static/xindex.html Strategies number  
AND p.amprocnum = 1    
AND p.amproclefttype = oc.opcintype   
AND p.amprocrighttype = oc.opcintype;  
          opcintype          |     amproc       
-----------------------------+----------------  
 character                   | hashbpchar  
 "char"                      | hashchar  
 date                        | hashint4  
 anyarray                    | hash_array  
 real                        | hashfloat4  
 double precision            | hashfloat8  
 inet                        | hashinet  
 smallint                    | hashint2  
 integer                     | hashint4  
 bigint                      | hashint8  
 interval                    | interval_hash  
 macaddr                     | hashmacaddr  
 name                        | hashname  
 oid                         | hashoid  
 oidvector                   | hashoidvector  
 text                        | hashtext  
 time without time zone      | time_hash  
 numeric                     | hash_numeric  
 timestamp with time zone    | timestamp_hash  
 time with time zone         | timetz_hash  
 timestamp without time zone | timestamp_hash  
 boolean                     | hashchar  
 bytea                       | hashvarlena  
 xid                         | hashint4  
 cid                         | hashint4  
 abstime                     | hashint4  
 reltime                     | hashint4  
 aclitem                     | hash_aclitem  
 uuid                        | uuid_hash  
 pg_lsn                      | pg_lsn_hash  
 macaddr8                    | hashmacaddr8  
 anyenum                     | hashenum  
 anyrange                    | hash_range  
 jsonb                       | jsonb_hash  
(34 rows)  

创建索引,例子

postgres=# create table test(id int, c1 int8, c2 int8, c3 int8);  
CREATE TABLE  
postgres=# create index idx_test_1 on test using bloom (c1,c2,c3);  
CREATE INDEX  
  
  
postgres=# set enable_seqscan=off;  
SET  
postgres=# explain select * from test where c1=1::int8 and c2=1::int8 and c3=1::int8;  
                                       QUERY PLAN                                         
----------------------------------------------------------------------------------------  
 Bitmap Heap Scan on test  (cost=15.73..15.75 rows=1 width=28)  
   Recheck Cond: ((c1 = '1'::bigint) AND (c2 = '1'::bigint) AND (c3 = '1'::bigint))  
   ->  Bitmap Index Scan on idx_test_1  (cost=0.00..15.73 rows=1 width=0)  
         Index Cond: ((c1 = '1'::bigint) AND (c2 = '1'::bigint) AND (c3 = '1'::bigint))  
(4 rows)  
  
postgres=# set enable_seqscan =on;  
SET  
postgres=# explain select * from test where c1=1::int8 and c2=1::int8 and c3=1::int8;  
                                  QUERY PLAN                                    
------------------------------------------------------------------------------  
 Seq Scan on test  (cost=0.00..9.85 rows=1 width=28)  
   Filter: ((c1 = '1'::bigint) AND (c2 = '1'::bigint) AND (c3 = '1'::bigint))  
(2 rows)  

参考

https://obartunov.livejournal.com/201027.html

https://www.postgresql.org/docs/devel/static/xindex.html

Flag Counter

digoal’s 大量PostgreSQL文章入口