PostgreSQL 列存储引擎 susql (志铭奉献)

17 minute read

背景

susql在PostgreSQL9.5基础之上,增加了支持列存储,支持zlib压缩,支持COPY跳过异常行的功能。

可以在以下WEB下载测试。

https://github.com/susql/susql/releases

感谢志铭为PG社区的付出。

新增功能扩展语法介绍:

Synopsis  
CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXISTS ] table_name ( [  
  { column_name data_type [ COLLATE collation ] [ COMPRESS compression ] [ column_constraint [ ... ] ]  
    | table_constraint  
    | LIKE source_table [ like_option ... ]   
    | COMPRESS default_compression }  
    [, ... ]  
] )  
[ INHERITS ( parent_table [, ... ] ) ]  
[ WITH ( storage_parameter [= value] [, ... ] ) | WITH OIDS | WITHOUT OIDS ]  
[ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]  
[ TABLESPACE tablespace_name ]  
[ STORED AS { HEAP | ORC } ]  
[ SORT BY { column_name [,...]}]  
Parameters  
  
STORED AS  
HEAP: row storage format(postgres heap)(default).  
ORC: column storage format.  
  
SORT BY  
When table is STORED AS ORC, sort the column for reading optimize.  
  
COMPRESS  
To set column compression type for a table. COMPRESS default_compression is for default compression type if individual column compression is not specific.  
There is a buildin compression type PGLZ, and an extension compression type ZLIB(create extension dc_zlib fistly to use ZLIB compression).  

注意insert需要vacuum后才能转换到列存储, copy进来的数据不需要转换直接进 of 文件(即列存)。

COPY table_name [ ( column_name [, ...] ) ]  
    FROM { 'filename' | PROGRAM 'command' | DIRECTORY 'directory_name' [ RECURSIVE { 'NUMBER' } ] | STDIN }  
    [ [ WITH ] ( option [, ...] ) ]  
  
COPY { table_name [ ( column_name [, ...] ) ] | ( query ) }  
    TO { 'filename' | PROGRAM 'command' | STDOUT }  
    [ [ WITH ] ( option [, ...] ) ]  
  
where option can be one of:  
    FORMAT format_name  
    OIDS [ boolean ]  
    FREEZE [ boolean ]  
    DELIMITER 'delimiter_character'  
    NULL 'null_string'  
    HEADER [ boolean ]  
    QUOTE 'quote_character'  
    ESCAPE 'escape_character'  
    FORCE_QUOTE { ( column_name [, ...] ) | * }  
    FORCE_NOT_NULL ( column_name [, ...] )  
    FORCE_NULL ( column_name [, ...] )  
    ENCODING 'encoding_name'  
    UNSTRICT [ boolean ]  
    UNSTRICT_NUM { number }  
  
Parameters  
DIRECTORY Indicates the input is a directory and will copy all file in this directory.  
RECURSIVE {NUMBER} Specifies if the recusive directory copy will be applied.  
UNSTRICT Specifies that whether continue copy process when some line is broken.  
UNSTRICT_NUM Specifies how many broken lines reach to stop the copy process. Only effects when UNSTRICT is set.  

通过create compress可以创建压缩方法。

postgres=# create compress configuration zlib2 (template=zlib, level=9);  
CREATE COMPRESS CONFIGURATION  
Time: 9.996 ms  
  
postgres=# create compress configuration pglz2 (template=pglz, ...);  

pglz用的参数如下:

 * PGLZ_Strategy -  
 *  
 *              Some values that control the compression algorithm.  
 *  
 *              min_input_size          Minimum input data size to consider compression.  
 *  
 *              max_input_size          Maximum input data size to consider compression.  
 *  
 *              min_comp_rate           Minimum compression rate (0-99%) to require.  
 *                                                      Regardless of min_comp_rate, the output must be  
 *                                                      smaller than the input, else we don't store  
 *                                                      compressed.  
 *  
 *              first_success_by        Abandon compression if we find no compressible  
 *                                                      data within the first this-many bytes.  
 *  
 *              match_size_good         The initial GOOD match size when starting history  
 *                                                      lookup. When looking up the history to find a  
 *                                                      match that could be expressed as a tag, the  
 *                                                      algorithm does not always walk back entirely.  
 *                                                      A good match fast is usually better than the  
 *                                                      best possible one very late. For each iteration  
 *                                                      in the lookup, this value is lowered so the  
 *                                                      longer the lookup takes, the smaller matches  
 *                                                      are considered good.  
 *  
 *              match_size_drop         The percentage by which match_size_good is lowered  
 *                                                      after each history check. Allowed values are  
 *                                                      0 (no change until end) to 100 (only check  
 *                                                      latest history entry at all).  

安装:

[root@digoal ~]# rpm -ivh susql50-5.0.1-0.el6.x86_64.rpm   
error: Failed dependencies:  
        susql50-libs(x86-64) = 5.0.1-0.el6 is needed by susql50-5.0.1-0.el6.x86_64  
[root@digoal ~]# rpm -ivh susql50-libs-5.0.1-0.el6.x86_64.rpm   
Preparing...                ########################################### [100%]  
   1:susql50-libs           ########################################### [100%]  
/sbin/ldconfig: /opt/gcc4.9.3/lib/libstdc++.so.6.0.20-gdb.py is not an ELF file - it has the wrong magic bytes at the start.  
  
/sbin/ldconfig: /opt/gcc4.9.3/lib64/libstdc++.so.6.0.20-gdb.py is not an ELF file - it has the wrong magic bytes at the start.  
  
[root@digoal ~]# rpm -ivh susql50-5.0.1-0.el6.x86_64.rpm   
Preparing...                ########################################### [100%]  
   1:susql50                ########################################### [100%]  
[root@digoal ~]# rpm -ivh susql50-server-5.0.1-0.el6.x86_64.rpm   
Preparing...                ########################################### [100%]  
   1:susql50-server         ########################################### [100%]  
/sbin/ldconfig: /opt/gcc4.9.3/lib/libstdc++.so.6.0.20-gdb.py is not an ELF file - it has the wrong magic bytes at the start.  
  
/sbin/ldconfig: /opt/gcc4.9.3/lib64/libstdc++.so.6.0.20-gdb.py is not an ELF file - it has the wrong magic bytes at the start.  
  
error reading information on service susql-5.0: No such file or directory  
[root@digoal ~]# rpm -ivh susql50-contrib-5.0.1-0.el6.x86_64.rpm   
Preparing...                ########################################### [100%]  
   1:susql50-contrib        ########################################### [100%]  

初始化数据库

[root@digoal ~]# mkdir /data01/susql  
[root@digoal ~]# chown susql:susql /data01/susql  
[root@digoal ~]# su - susql  
  
-bash-4.1$ vi env_pg.sh  
export PS1="$USER@`/bin/hostname -s`-> "  
export PGPORT=1923  
export PGDATA=/data01/susql/pgdata/pg_root  
export LANG=en_US.utf8  
export PGHOME=/usr/susql-5.0  
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH  
export DATE=`date +"%Y%m%d%H%M"`  
export PATH=$PGHOME/bin:$PATH:.  
export MANPATH=$PGHOME/share/man:$MANPATH  
export PGHOST=$PGDATA  
export PGUSER=postgres  
export PGDATABASE=postgres  
alias rm='rm -i'  
alias ll='ls -lh'  
unalias vi  
  
[root@digoal ~]# su - susql  
-bash-4.1$ . ./env_pg.sh   
-bash: unalias: vi: not found  
susql@digoal-> initdb -D $PGDATA -E UTF8 --locale=C -U postgres -W  
The files belonging to this database system will be owned by user "susql".  
This user must also own the server process.  
  
The database cluster will be initialized with locale "C".  
The default text search configuration will be set to "english".  
  
Data page checksums are disabled.  
  
creating directory /data01/susql/pgdata/pg_root ... ok  
creating subdirectories ... ok  
selecting default max_connections ... 100  
selecting default shared_buffers ... 128MB  
selecting dynamic shared memory implementation ... posix  
creating configuration files ... ok  
creating template1 database in /data01/susql/pgdata/pg_root/base/1 ... ok  
initializing pg_authid ... ok  
Enter new superuser password:   
Enter it again:   
setting password ... ok  
initializing dependencies ... ok  
creating system views ... ok  
loading system objects' descriptions ... ok  
creating collations ... ok  
creating conversions ... ok  
creating dictionaries ... ok  
setting privileges on built-in objects ... ok  
creating information schema ... ok  
loading PL/pgSQL server-side language ... ok  
vacuuming database template1 ... ok  
copying template1 to template0 ... ok  
copying template1 to postgres ... ok  
syncing data to disk ... ok  
  
WARNING: enabling "trust" authentication for local connections  
You can change this by editing pg_hba.conf or using the option -A, or  
--auth-local and --auth-host, the next time you run initdb.  
  
Success. You can now start the database server using:  
  
    pg_ctl -D /data01/susql/pgdata/pg_root -l logfile start  
  
  
susql@digoal-> cd $PGDATA  

修改配置文件

susql@digoal-> grep "^[a-z]" postgresql.conf  
listen_addresses = '0.0.0.0'            # what IP address(es) to listen on;  
port = 1923                             # (change requires restart)  
max_connections = 100                   # (change requires restart)  
unix_socket_directories = '.'   # comma-separated list of directories  
unix_socket_permissions = 0700          # begin with 0 to use octal notation  
tcp_keepalives_idle = 70                # TCP_KEEPIDLE, in seconds;  
tcp_keepalives_interval = 10            # TCP_KEEPINTVL, in seconds;  
tcp_keepalives_count = 10               # TCP_KEEPCNT;  
shared_buffers = 4096MB                 # min 128kB  
dynamic_shared_memory_type = posix      # the default is the first option  
bgwriter_delay = 10ms                   # 10-10000ms between rounds  
bgwriter_lru_maxpages = 1000            # 0-1000 max buffers written/round  
synchronous_commit = off                # synchronization level;  
full_page_writes = off                  # recover from partial page writes  
wal_compression = off                   # enable compression of full-page writes  
wal_log_hints = off                     # also do full page writes of non-critical updates  
wal_buffers = 16MB                      # min 32kB, -1 sets based on shared_buffers  
wal_writer_delay = 10ms         # 1-10000 milliseconds  
checkpoint_timeout = 35min              # range 30s-1h  
max_wal_size = 4GB  
checkpoint_completion_target = 0.9      # checkpoint target duration, 0.0 - 1.0  
effective_cache_size = 4GB  
log_destination = 'csvlog'              # Valid values are combinations of  
logging_collector = on                  # Enable capturing of stderr and csvlog  
log_directory = 'pg_log'                # directory where log files are written,  
log_filename = 'postgresql-%a.log'      # log file name pattern,  
log_truncate_on_rotation = on           # If on, an existing log file with the  
log_rotation_age = 1d                   # Automatic rotation of logfiles will  
log_rotation_size = 0                   # Automatic rotation of logfiles will  
log_checkpoints = on  
log_connections = on  
log_disconnections = on  
log_error_verbosity = verbose            # terse, default, or verbose messages  
log_timezone = 'PRC'  
log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and  
datestyle = 'iso, mdy'  
timezone = 'PRC'  
lc_messages = 'C'                       # locale for system error message  
lc_monetary = 'C'                       # locale for monetary formatting  
lc_numeric = 'C'                        # locale for number formatting  
lc_time = 'C'                           # locale for time formatting  
default_text_search_config = 'pg_catalog.english'  
enable_copyto_flush = on                #When set, Copy To operation will explicit flush the data when finishing  
copy_extension = on                     #When set, Copy Extension Feature Enabled  

测试

susql@digoal-> psql  
psql (9.5.0 (SuSQL 5.0.1))  
Type "help" for help.  
postgres=# create extension dc_zlib;  
CREATE EXTENSION  
postgres=# create table test_col(  
id int,   
info text compress zlib,   
crt_time timestamp,   
c1 int,   
c2 int,   
c3 int,   
c4 int,   
c5 int,   
c6 int,   
c7 int,   
c8 int,   
c9 int,   
c10 int,   
c11 int,   
c12 int)   
stored as orc   
sort by id;  
  
postgres=# create table test_heap(  
id int,   
info text compress zlib,   
crt_time timestamp,   
c1 int,   
c2 int,   
c3 int,   
c4 int,   
c5 int,   
c6 int,   
c7 int,   
c8 int,   
c9 int,   
c10 int,   
c11 int,   
c12 int)   
stored as heap;  
  
postgres=# create table test_heap_pglz(  
id int,   
info text compress pglz,   
crt_time timestamp,   
c1 int,   
c2 int,   
c3 int,   
c4 int,   
c5 int,   
c6 int,   
c7 int,   
c8 int,   
c9 int,   
c10 int,   
c11 int,   
c12 int)   
stored as heap;  
  
  
postgres=# \d+++ test_heap  
                                         Table "public.test_heap"  
  Column  |            Type             | Modifiers | Storage  | Compression | Stats target | Description   
----------+-----------------------------+-----------+----------+-------------+--------------+-------------  
 id       | integer                     |           | plain    |             |              |   
 info     | text                        |           | extended | zlib        |              |   
 crt_time | timestamp without time zone |           | plain    |             |              |   
 c1       | integer                     |           | plain    |             |              |   
 c2       | integer                     |           | plain    |             |              |   
 c3       | integer                     |           | plain    |             |              |   
 c4       | integer                     |           | plain    |             |              |   
 c5       | integer                     |           | plain    |             |              |   
 c6       | integer                     |           | plain    |             |              |   
 c7       | integer                     |           | plain    |             |              |   
 c8       | integer                     |           | plain    |             |              |   
 c9       | integer                     |           | plain    |             |              |   
 c10      | integer                     |           | plain    |             |              |   
 c11      | integer                     |           | plain    |             |              |   
 c12      | integer                     |           | plain    |             |              |   
Stored As: HEAP  
  
postgres=# \d+++ test_col  
                                         Table "public.test_col"  
  Column  |            Type             | Modifiers | Storage  | Compression | Stats target | Description   
----------+-----------------------------+-----------+----------+-------------+--------------+-------------  
 id       | integer                     |           | plain    |             |              |   
 info     | text                        |           | extended | zlib        |              |   
 crt_time | timestamp without time zone |           | plain    |             |              |   
 c1       | integer                     |           | plain    |             |              |   
 c2       | integer                     |           | plain    |             |              |   
 c3       | integer                     |           | plain    |             |              |   
 c4       | integer                     |           | plain    |             |              |   
 c5       | integer                     |           | plain    |             |              |   
 c6       | integer                     |           | plain    |             |              |   
 c7       | integer                     |           | plain    |             |              |   
 c8       | integer                     |           | plain    |             |              |   
 c9       | integer                     |           | plain    |             |              |   
 c10      | integer                     |           | plain    |             |              |   
 c11      | integer                     |           | plain    |             |              |   
 c12      | integer                     |           | plain    |             |              |   
Stored As: ORC  
  
postgres=# \d+++ test_heap_pglz   
                                      Table "public.test_heap_pglz"  
  Column  |            Type             | Modifiers | Storage  | Compression | Stats target | Description   
----------+-----------------------------+-----------+----------+-------------+--------------+-------------  
 id       | integer                     |           | plain    |             |              |   
 info     | text                        |           | extended | pglz        |              |   
 crt_time | timestamp without time zone |           | plain    |             |              |   
 c1       | integer                     |           | plain    |             |              |   
 c2       | integer                     |           | plain    |             |              |   
 c3       | integer                     |           | plain    |             |              |   
 c4       | integer                     |           | plain    |             |              |   
 c5       | integer                     |           | plain    |             |              |   
 c6       | integer                     |           | plain    |             |              |   
 c7       | integer                     |           | plain    |             |              |   
 c8       | integer                     |           | plain    |             |              |   
 c9       | integer                     |           | plain    |             |              |   
 c10      | integer                     |           | plain    |             |              |   
 c11      | integer                     |           | plain    |             |              |   
 c12      | integer                     |           | plain    |             |              |   
Stored As: HEAP  

性能测试

postgres=# \timing  
Timing is on.  
postgres=# insert into test_heap select i,repeat(md5(random()::text),64),clock_timestamp(),i,i,i,i,i,i,i,i,i,i,i,i from generate_series(1,1000000) t(i);  
INSERT 0 1000000  
Time: 18183.374 ms  
postgres=# insert into test_col select i,repeat(md5(random()::text),64),clock_timestamp(),i,i,i,i,i,i,i,i,i,i,i,i from generate_series(1,1000000) t(i);  
INSERT 0 1000000  
Time: 19871.817 ms  
postgres=# insert into test_heap select trunc(5000000*random()),repeat(md5(random()::text),64),clock_timestamp(),i,i,i,i,i,i,i,i,i,i,i,i from generate_series(1,1000000) t(i);  
INSERT 0 1000000  
Time: 20575.763 ms  
postgres=# insert into test_col select trunc(5000000*random()),repeat(md5(random()::text),64),clock_timestamp(),i,i,i,i,i,i,i,i,i,i,i,i from generate_series(1,1000000) t(i);  
INSERT 0 1000000  
Time: 20440.462 ms  
  
postgres=# select count(*) from test_col where id<10;  
 count   
-------  
    10  
(1 row)  
  
Time: 417.611 ms  
postgres=# select count(*) from test_col where id<10;  
 count   
-------  
    10  
(1 row)  
  
Time: 386.153 ms  
postgres=# select count(*) from test_heap where id<10;  
 count   
-------  
    12  
(1 row)  
  
Time: 204.857 ms  
postgres=# select count(*) from test_heap where id<10;  
 count   
-------  
    12  
(1 row)  
  
Time: 210.323 ms  
postgres=# select count(*) from test_col ;  
  count    
---------  
 2000000  
(1 row)  
  
Time: 383.868 ms  
postgres=# select count(*) from test_heap;  
  count    
---------  
 2000000  
(1 row)  
Time: 182.571 ms  
  
postgres=# \dt+   
                         List of relations  
 Schema |      Name      | Type  |  Owner   |  Size  | Description   
--------+----------------+-------+----------+--------+-------------  
 public | test_col       | table | postgres | 319 MB |   
 public | test_heap      | table | postgres | 319 MB |   
 public | test_heap_pglz | table | postgres | 319 MB |   
(3 rows)  
  
postgres=# select count(id) from test_heap;  
  count    
---------  
 2000000  
(1 row)  
  
Time: 210.082 ms  
  
postgres=# select count(id) from test_col;  
  count    
---------  
 2000000  
(1 row)  
  
Time: 468.594 ms  
  
postgres=# explain analyze select count(distinct id) from test_col;  
                                                        QUERY PLAN                                                           
---------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=8.83..8.84 rows=1 width=4) (actual time=1743.652..1743.652 rows=1 loops=1)  
   ->  Custom Scan (ORC) on test_col  (cost=0.00..7.06 rows=706 width=4) (actual time=0.018..431.255 rows=2000000 loops=1)  
 Planning time: 0.112 ms  
 Execution time: 1743.735 ms  
(4 rows)  
  
Time: 1744.215 ms  
postgres=# explain analyze select count(distinct id) from test_heap;  
                                                        QUERY PLAN                                                           
---------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=65816.71..65816.72 rows=1 width=4) (actual time=1513.416..1513.416 rows=1 loops=1)  
   ->  Seq Scan on test_heap  (cost=0.00..60816.77 rows=1999977 width=4) (actual time=0.017..221.881 rows=2000000 loops=1)  
 Planning time: 0.068 ms  
 Execution time: 1513.478 ms  
(4 rows)  
  
Time: 1513.900 ms  
  
postgres=# explain analyze select count(distinct (id,info)) from test_col;  
                                                         QUERY PLAN                                                           
----------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=8.83..8.84 rows=1 width=36) (actual time=15137.273..15137.274 rows=1 loops=1)  
   ->  Custom Scan (ORC) on test_col  (cost=0.00..7.06 rows=706 width=36) (actual time=0.014..513.342 rows=2000000 loops=1)  
 Planning time: 0.127 ms  
 Execution time: 15137.395 ms  
(4 rows)  
  
Time: 15138.411 ms  
postgres=# explain analyze select count(distinct (id,info)) from test_heap;  
                                                         QUERY PLAN                                                           
----------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=65816.71..65816.72 rows=1 width=72) (actual time=13865.787..13865.787 rows=1 loops=1)  
   ->  Seq Scan on test_heap  (cost=0.00..60816.77 rows=1999977 width=72) (actual time=0.012..235.289 rows=2000000 loops=1)  
 Planning time: 0.104 ms  
 Execution time: 13865.856 ms  
(4 rows)  
  
Time: 13866.441 ms  

排序

postgres=# set work_mem='1GB';  
postgres=# explain analyze select id from test_heap order by id;  
                                                        QUERY PLAN                                                           
---------------------------------------------------------------------------------------------------------------------------  
 Sort  (cost=270132.69..275132.69 rows=2000000 width=4) (actual time=766.044..959.705 rows=2000000 loops=1)  
   Sort Key: id  
   Sort Method: quicksort  Memory: 142903kB  
   ->  Seq Scan on test_heap  (cost=0.00..60817.00 rows=2000000 width=4) (actual time=0.012..254.047 rows=2000000 loops=1)  
 Planning time: 0.148 ms  
 Execution time: 1041.256 ms  
(6 rows)  
  
postgres=# explain analyze select * from test_heap order by id;  
                                                         QUERY PLAN                                                            
-----------------------------------------------------------------------------------------------------------------------------  
 Sort  (cost=270132.69..275132.69 rows=2000000 width=128) (actual time=820.897..1104.178 rows=2000000 loops=1)  
   Sort Key: id  
   Sort Method: quicksort  Memory: 580403kB  
   ->  Seq Scan on test_heap  (cost=0.00..60817.00 rows=2000000 width=128) (actual time=0.017..202.813 rows=2000000 loops=1)  
 Planning time: 0.107 ms  
 Execution time: 1193.899 ms  
(6 rows)  
  
postgres=# explain analyze select * from test_col order by id;  
                                                             QUERY PLAN                                                                
-------------------------------------------------------------------------------------------------------------------------------------  
 Sort  (cost=229315.69..234315.69 rows=2000000 width=128) (actual time=1269.664..1553.249 rows=2000000 loops=1)  
   Sort Key: id  
   Sort Method: quicksort  Memory: 580403kB  
   ->  Custom Scan (ORC) on test_col  (cost=0.00..20000.00 rows=2000000 width=128) (actual time=0.019..417.134 rows=2000000 loops=1)  
 Planning time: 0.145 ms  
 Execution time: 1646.298 ms  
(6 rows)  
  
postgres=# explain analyze select id from test_col order by id;  
                                                            QUERY PLAN                                                               
-----------------------------------------------------------------------------------------------------------------------------------  
 Sort  (cost=229315.69..234315.69 rows=2000000 width=4) (actual time=1001.483..1207.091 rows=2000000 loops=1)  
   Sort Key: id  
   Sort Method: quicksort  Memory: 142903kB  
   ->  Custom Scan (ORC) on test_col  (cost=0.00..20000.00 rows=2000000 width=4) (actual time=0.020..466.365 rows=2000000 loops=1)  
 Planning time: 0.112 ms  
 Execution time: 1293.078 ms  
(6 rows)  
  
postgres=# set work_mem='1MB';  
SET  
postgres=# explain analyze select id from test_col order by id;  
                                                            QUERY PLAN                                                               
-----------------------------------------------------------------------------------------------------------------------------------  
 Sort  (cost=284006.69..289006.69 rows=2000000 width=4) (actual time=1837.612..2159.556 rows=2000000 loops=1)  
   Sort Key: id  
   Sort Method: external merge  Disk: 27392kB  
   ->  Custom Scan (ORC) on test_col  (cost=0.00..20000.00 rows=2000000 width=4) (actual time=0.019..465.012 rows=2000000 loops=1)  
 Planning time: 0.110 ms  
 Execution time: 2246.295 ms  
(6 rows)  
  
postgres=# explain analyze select id from test_heap order by id;  
                                                        QUERY PLAN                                                           
---------------------------------------------------------------------------------------------------------------------------  
 Sort  (cost=324823.69..329823.69 rows=2000000 width=4) (actual time=1742.758..2080.076 rows=2000000 loops=1)  
   Sort Key: id  
   Sort Method: external merge  Disk: 27384kB  
   ->  Seq Scan on test_heap  (cost=0.00..60817.00 rows=2000000 width=4) (actual time=0.016..278.976 rows=2000000 loops=1)  
 Planning time: 0.079 ms  
 Execution time: 2176.640 ms  
(6 rows)  

验证列存储的sort by id是否起作用,从测试结果来看并没有排序存储。

postgres=# select id from test_col limit 1 offset 1000000;  
   id      
---------  
 3492623  
(1 row)  
  
postgres=# select id from test_col limit 1 offset 1000001;  
   id     
--------  
 292687  
(1 row)  
  
postgres=# select id from test_col limit 1 offset 1000002;  
   id      
---------  
 3435092  
(1 row)  

查看物理文件

postgres=# select pg_relation_filepath ('test_heap'::regclass);  
 pg_relation_filepath   
----------------------  
 base/13245/16436  
(1 row)  
  
Time: 0.593 ms  
postgres=# select pg_relation_filepath ('test_col'::regclass);  
 pg_relation_filepath   
----------------------  
 base/13245/16408  
(1 row)  
Time: 0.456 ms  
  
postgres=# \q  
susql@digoal-> cd $PGDATA/base/  
susql@digoal-> ll 13245/16436*  
-rw------- 1 susql susql 319M Feb 29 09:04 13245/16436  
-rw------- 1 susql susql 104K Feb 29 09:04 13245/16436_fsm  
susql@digoal-> ll 13245/16408*  
-rw------- 1 susql susql 319M Feb 29 09:04 13245/16408  
-rw------- 1 susql susql 104K Feb 29 09:04 13245/16408_fsm  
-rw------- 1 susql susql 8.0K Feb 29 08:58 13245/16408_of  

使用insert插入的数据,没有写入列存储,需要调用vacuum重铸。

postgres=# vacuum test_col ;  
VACUUM  
-rw------- 1 susql susql    0 Feb 29 12:57 16408  
-rw------- 1 susql susql  16K Feb 29 12:57 16408_fsm  
-rw------- 1 susql susql 319M Feb 29 12:57 16408_of  

重铸后,sort by起作用了。

postgres=# select id from test_heap offset 1000000 limit 10;  
   id      
---------  
 4668069  
 3385308  
 3298340  
 2234740  
 3610229  
 2569580  
 1705913  
  828489  
  856546  
 1197041  
(10 rows)  
  
Time: 123.598 ms  
postgres=# select id from test_col offset 1000000 limit 10;  
   id     
--------  
 990214  
 990215  
 990216  
 990217  
 990218  
 990219  
 990220  
 990221  
 990222  
 990223  
(10 rows)  
  
Time: 100.979 ms  

查看编译项

susql@digoal-> pg_config  
BINDIR = /usr/susql-5.0/bin  
DOCDIR = /usr/susql-5.0/doc  
HTMLDIR = /usr/susql-5.0/doc  
INCLUDEDIR = /usr/susql-5.0/include  
PKGINCLUDEDIR = /usr/susql-5.0/include  
INCLUDEDIR-SERVER = /usr/susql-5.0/include/server  
LIBDIR = /usr/susql-5.0/lib  
PKGLIBDIR = /usr/susql-5.0/lib  
LOCALEDIR = /usr/susql-5.0/share/locale  
MANDIR = /usr/susql-5.0/share/man  
SHAREDIR = /usr/susql-5.0/share  
SYSCONFDIR = /etc/sysconfig/susql  
PGXS = /usr/susql-5.0/lib/pgxs/src/makefiles/pgxs.mk  
CONFIGURE = '--enable-rpath' '--prefix=/usr/susql-5.0' '--includedir=/usr/susql-5.0/include' '--mandir=/usr/susql-5.0/share/man' '--datadir=/usr/susql-5.0/share' '--enable-nls' '--with-libxml' '--with-libxslt' '--with-system-tzdata=/usr/share/zoneinfo' '--sysconfdir=/etc/sysconfig/susql' '--docdir=/usr/susql-5.0/doc' 'CFLAGS=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -DLINUX_OOM_ADJ=0' 'LDFLAGS=-Wl,--as-needed'  
CC = gcc  
CPPFLAGS = -D_GNU_SOURCE -I/usr/include/libxml2  
CFLAGS = -DPGPG -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -DLINUX_OOM_ADJ=0  
CFLAGS_SL = -fpic  
LDFLAGS = -L../../../src/common -Wl,--as-needed -Wl,--as-needed -Wl,-rpath,'/usr/susql-5.0/lib',--enable-new-dtags  
LDFLAGS_EX =   
LDFLAGS_SL =   
LIBS = -lpgcommon -lpgport -lxslt -lxml2 -lz -lreadline -lrt -lcrypt -ldl -lm   
VERSION = PostgreSQL 9.5.0 (SuSQL 5.0.1)  

验证copy跳过异常的功能

postgres=# copy test_heap to '/data01/susql/test.csv' ;  
COPY 2000000  
-rw-r--r-- 1 susql susql 4.1G Feb 29 09:22 test.csv  
  
[root@digoal susql]# echo "err" >> test.csv   
[root@digoal susql]# echo "err" >> test.csv   
[root@digoal susql]# echo "err" >> test.csv   
[root@digoal susql]# echo "err" >> test.csv   
[root@digoal susql]# echo "err" >> test.csv   
[root@digoal susql]# head -n 10 test.csv >> test.csv  
  
  
postgres=# copy test_col from '/data01/susql/test.csv' with( UNSTRICT true, UNSTRICT_NUM 10);  
INFO:  00000: invalid input syntax for integer: "err" (822)  
CONTEXT:  COPY test_col, file /data01/susql/test.csv, line 2000001, column id: "err"  
LOCATION:  PerformErrorLoggingForCopy, copy.c:5117  
INFO:  00000: invalid input syntax for integer: "err" (822)  
CONTEXT:  COPY test_col, file /data01/susql/test.csv, line 2000002, column id: "err"  
LOCATION:  PerformErrorLoggingForCopy, copy.c:5117  
INFO:  00000: invalid input syntax for integer: "err" (822)  
CONTEXT:  COPY test_col, file /data01/susql/test.csv, line 2000003, column id: "err"  
LOCATION:  PerformErrorLoggingForCopy, copy.c:5117  
INFO:  00000: invalid input syntax for integer: "err" (822)  
CONTEXT:  COPY test_col, file /data01/susql/test.csv, line 2000004, column id: "err"  
LOCATION:  PerformErrorLoggingForCopy, copy.c:5117  
INFO:  00000: invalid input syntax for integer: "err" (822)  
CONTEXT:  COPY test_col, file /data01/susql/test.csv, line 2000005, column id: "err"  
LOCATION:  PerformErrorLoggingForCopy, copy.c:5117  
COPY 2000010  

压缩比比较:

postgres=# create compress configuration zlib0 (template=zlib, level=1);  
CREATE COMPRESS CONFIGURATION  
Time: 0.814 ms  
postgres=# create compress configuration zlib2 (template=zlib, level=9);  
CREATE COMPRESS CONFIGURATION  
Time: 0.814 ms  
postgres=# create table test_heap_zlib0(                         
id int,   
info text compress zlib0,   
crt_time timestamp,   
c1 int,   
c2 int,   
c3 int,   
c4 int,   
c5 int,   
c6 int,   
c7 int,   
c8 int,   
c9 int,   
c10 int,   
c11 int,   
c12 int)   
stored as heap;  
CREATE TABLE  
postgres=# create table test_heap_zlib2(                         
id int,   
info text compress zlib2,   
crt_time timestamp,   
c1 int,   
c2 int,   
c3 int,   
c4 int,   
c5 int,   
c6 int,   
c7 int,   
c8 int,   
c9 int,   
c10 int,   
c11 int,   
c12 int)   
stored as heap;  
CREATE TABLE  
postgres=# insert into test_heap_zlib0 select * from test_heap;  
INSERT 0 4000000  
postgres=# insert into test_heap_zlib2 select * from test_heap;  
INSERT 0 4000000  
postgres=# select pg_size_pretty(pg_total_relation_size('test_heap'));  
 pg_size_pretty   
----------------  
 638 MB  
(1 row)  
  
Time: 1.190 ms  
postgres=# select pg_size_pretty(pg_total_relation_size('test_heap_zlib0'));  
 pg_size_pretty   
----------------  
 638 MB  
(1 row)  
  
Time: 0.344 ms  
postgres=# select pg_size_pretty(pg_total_relation_size('test_heap_zlib2'));  
 pg_size_pretty   
----------------  
 638 MB  
(1 row)  

小结

1. 本测试用例下,pglz和zlib的压缩比一致。

2. 本测试用例下,列存的查询效率并没有比行存的效率高。列存使用了9.5新增的custom scan provider接口,和pgstrom使用GPU的方法类似。

3. 列存储在物理文件上和行存储一致,没有按列分割文件。insert插入的数据需要vacuum后才能转成列存,COPY进来的数据不需要执行vacuum。

4. COPY可以跳过错误的行,并输出行号。

本文测试用到的数据量还比较小,感兴趣的同学可以测试更大数据量,看看列存储的性能表现。

有问题可以找志铭.

Flag Counter

digoal’s 大量PostgreSQL文章入口