Greenplum 6已合并到PostgreSQL 9.3版本 - 比上一代GP提升:8倍读,195倍更新、删除 - 另有大量PG新特性
背景
Greenplum 6已合并到PostgreSQL 9.3版本 - 相比5性能提升:读8倍,更新、删除195倍 - 另有大量PG新特性,详见PostgreSQL release notes
https://www.postgresql.org/docs/11/static/release.html
例如
1、GIN、SPGIST 索引接口。在模糊查询、全文检索、向量相似计算方面终于有索引加速了。
2、异步事务,小事务写入有大幅提升。
3、物化视图,OLAP中很好用的功能。
4、事件触发器,基于事件控制细粒度DDL权限。
5、整体性能增强。
以上特性都是通过升级PostgreSQL版本加入的。
其他增强:
1、增加跨表的分布式死锁检测。
2、更新、删除由表级排他锁改成行级排他锁,大幅提升DML性能。
gpdb 6 部署示例
与如下同测试环境:
《Deepgreen(Greenplum) 多机部署测试 , TPC-H VS citus》
配置ECS虚拟机OS参数 (all host)
1、内核参数
vi /etc/sysctl.conf
# add by digoal.zhou
fs.aio-max-nr = 1048576
fs.file-max = 76724600
# 可选:kernel.core_pattern = /data01/corefiles/core_%e_%u_%t_%s.%p
# /data01/corefiles 事先建好,权限777,如果是软链接,对应的目录修改为777
kernel.sem = 4096 2147483647 2147483646 512000
# 信号量, ipcs -l 或 -u 查看,每16个进程一组,每组信号量需要17个信号量。
kernel.shmall = 107374182
# 所有共享内存段相加大小限制 (建议内存的80%),单位为页。
kernel.shmmax = 274877906944
# 最大单个共享内存段大小 (建议为内存一半), >9.2的版本已大幅降低共享内存的使用,单位为字节。
kernel.shmmni = 819200
# 一共能生成多少共享内存段,每个PG数据库集群至少2个共享内存段
net.core.netdev_max_backlog = 10000
net.core.rmem_default = 262144
# The default setting of the socket receive buffer in bytes.
net.core.rmem_max = 4194304
# The maximum receive socket buffer size in bytes
net.core.wmem_default = 262144
# The default setting (in bytes) of the socket send buffer.
net.core.wmem_max = 4194304
# The maximum send socket buffer size in bytes.
net.core.somaxconn = 4096
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.tcp_keepalive_intvl = 20
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_mem = 8388608 12582912 16777216
net.ipv4.tcp_fin_timeout = 5
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syncookies = 1
# 开启SYN Cookies。当出现SYN等待队列溢出时,启用cookie来处理,可防范少量的SYN攻击
net.ipv4.tcp_timestamps = 1
# 减少time_wait
net.ipv4.tcp_tw_recycle = 0
# 如果=1则开启TCP连接中TIME-WAIT套接字的快速回收,但是NAT环境可能导致连接失败,建议服务端关闭它
net.ipv4.tcp_tw_reuse = 1
# 开启重用。允许将TIME-WAIT套接字重新用于新的TCP连接
net.ipv4.tcp_max_tw_buckets = 262144
net.ipv4.tcp_rmem = 8192 87380 16777216
net.ipv4.tcp_wmem = 8192 65536 16777216
net.nf_conntrack_max = 1200000
net.netfilter.nf_conntrack_max = 1200000
vm.dirty_background_bytes = 409600000
# 系统脏页到达这个值,系统后台刷脏页调度进程 pdflush(或其他) 自动将(dirty_expire_centisecs/100)秒前的脏页刷到磁盘
# 默认为10%,大内存机器建议调整为直接指定多少字节
vm.dirty_expire_centisecs = 3000
# 比这个值老的脏页,将被刷到磁盘。3000表示30秒。
vm.dirty_ratio = 95
# 如果系统进程刷脏页太慢,使得系统脏页超过内存 95 % 时,则用户进程如果有写磁盘的操作(如fsync, fdatasync等调用),则需要主动把系统脏页刷出。
# 有效防止用户进程刷脏页,在单机多实例,并且使用CGROUP限制单实例IOPS的情况下非常有效。
vm.dirty_writeback_centisecs = 100
# pdflush(或其他)后台刷脏页进程的唤醒间隔, 100表示1秒。
vm.swappiness = 0
# 不使用交换分区
vm.mmap_min_addr = 65536
vm.overcommit_memory = 0
# 在分配内存时,允许少量over malloc, 如果设置为 1, 则认为总是有足够的内存,内存较少的测试环境可以使用 1 .
vm.overcommit_ratio = 90
# 当overcommit_memory = 2 时,用于参与计算允许指派的内存大小。
vm.zone_reclaim_mode = 0
# 禁用 numa, 或者在vmlinux中禁止.
net.ipv4.ip_local_port_range = 40000 65535
# 本地自动分配的TCP, UDP端口号范围
fs.nr_open=20480000
# 单个进程允许打开的文件句柄上限
# 以下参数请注意
# vm.extra_free_kbytes = 4096000
# vm.min_free_kbytes = 2097152
# 如果是小内存机器,以上两个值不建议设置
# vm.nr_hugepages = 66536
# 建议shared buffer设置超过64GB时 使用大页,页大小 /proc/meminfo Hugepagesize
# vm.lowmem_reserve_ratio = 1 1 1
# 对于内存大于64G时,建议设置,否则建议默认值 256 256 32
2、资源限制
vi /etc/security/limits.conf
# nofile超过1048576的话,一定要先将sysctl的fs.nr_open设置为更大的值,并生效后才能继续设置nofile.
* soft nofile 1024000
* hard nofile 1024000
* soft nproc unlimited
* hard nproc unlimited
* soft core unlimited
* hard core unlimited
* soft memlock unlimited
* hard memlock unlimited
3、关闭透明大页,使用精准时钟(可选)
vi /etc/rc.local
touch /var/lock/subsys/local
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
# tsc 时钟
echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource
chmod +x /etc/rc.d/rc.local
部署gpdb (all host)
1、部署依赖(root执行)
rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install -y coreutils glib2 lrzsz mpstat dstat sysstat e4fsprogs xfsprogs ntp readline-devel zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel python-devel tcl-devel gcc make smartmontools flex bison perl-devel perl-ExtUtils* openldap-devel jadetex openjade bzip2 curl-devel curl apr-devel apr cmake3 python git iotop perf gcc-c++ dstat bzip2-devel krb5-devel libcurl-devel libevent-devel libkadm5 libyaml-devel libxml2-devel openssl-devel perl-ExtUtils-Embed python-devel sysstat python-pip xerces-c-devel
pip install --upgrade pip
pip install paramiko pycrypto psutil lockfile pidlockfile
2、部署gpdb(root执行)
cd ~
git clone https://github.com/greenplum-db/gpdb
cd gpdb
pip install -r python-dependencies.txt
pip install -r python-developer-dependencies.txt
./configure --disable-orca --with-perl --with-python --with-libxml --prefix=/opt/gpdb6
make -j 128
make install
3、检查当前GPDB的pg版本
/opt/gpdb6/bin/psql -V
psql (PostgreSQL) 9.3beta1
规划存储目录 (all host)
useradd digoal
passwd digoal
mkdir /data01/gpdb6
chown digoal:digoal /data01/gpdb6
初始化GPDB集群 (master host)
1、配置文件
vi /opt/gpdb6/greenplum_path.sh
# 追加
export MASTER_DATA_DIRECTORY=/data01/gpdb6/gp-1
export PGDATA=$MASTER_DATA_DIRECTORY
export PGHOST=127.0.0.1
export PGPORT=18000
export PGUSER=digoal
export PGPASSWORD=123
export PGDATABASE=postgres
2、环境变量
su - digoal
vi ~/.bash_profile
# 追加
. /opt/gpdb6/greenplum_path.sh
3、配置集群主机文件,包含所有节点(master, standby, segment, mirror hosts)
su - digoal
vi hostfile
digoal-citus-gpdb-test001
digoal-citus-gpdb-test002
digoal-citus-gpdb-test003
digoal-citus-gpdb-test004
digoal-citus-gpdb-test005
digoal-citus-gpdb-test006
digoal-citus-gpdb-test007
digoal-citus-gpdb-test008
digoal-citus-gpdb-test009
4、配置host认证
《Deepgreen(Greenplum) 多机部署测试 , TPC-H VS citus》
gpssh-exkeys -f ./hostfile
5、配置集群初始化文件
如果master host不想配置segment node,则需要修改一下以上hostfile,把master host去掉。
本例在所有主机上初始化segment.
vi cluster.conf
ARRAY_NAME="mpp1 cluster"
CLUSTER_NAME="mpp1 cluster"
MACHINE_LIST_FILE=hostfile
SEG_PREFIX=gp
DATABASE_PREFIX=gp
PORT_BASE=28000
declare -a DATA_DIRECTORY=(/data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6)
MASTER_HOSTNAME=digoal-citus-gpdb-test001
MASTER_DIRECTORY=/data01/gpdb6
MASTER_PORT=18000
IP_ALLOW=0.0.0.0/0
TRUSTED_SHELL=/usr/bin/ssh
CHECK_POINT_SEGMENTS=32
ENCODING=UNICODE
export MASTER_DATA_DIRECTORY
export TRUSTED_SHELL
DEFAULT_QD_MAX_CONNECT=250
QE_CONNECT_FACTOR=5
6、初始化集群
. /opt/gpdb6/greenplum_path.sh
gpinitsystem -c cluster.conf -h hostfile
7、配置参数
gpconfig -c max_connections -v 500 -m 400
gpconfig -c shared_buffers -v '1GB'
gpconfig -c max_prepared_transactions -v '1500'
gpconfig -c max_stack_depth -v '4MB'
gpconfig -c vacuum_cost_delay -v '0'
gpconfig -c synchronous_commit -v 'off'
gpconfig -c wal_buffers -v '16MB'
gpconfig -c wal_writer_delay -v '10ms'
gpconfig -c checkpoint_segments -v '128' --skipvalidation
gpconfig -c random_page_cost -v '1.3'
gpconfig -c log_statement -v 'ddl'
gpconfig -c vacuum_freeze_table_age -v '1200000000'
gpconfig -c autovacuum_freeze_max_age -v '1300000000' --skipvalidation
gpconfig -c autovacuum_vacuum_cost_delay -v '0' --skipvalidation
gpconfig -c autovacuum -v 'on' --skipvalidation
重启实例
gpstop -M fast -a
gpstart -a
以下参数不允许修改,详见GUC文件
src/backend/utils/misc/guc.c
autovacuum
autovacuum_freeze_max_age
autovacuum_vacuum_cost_delay
GPDB 6 改进评测
1 支持异步事务
PostgreSQL 8.3 就有了,异步事务开启后,对于IO性能较差的盘,小事务的性能提升非常明显。
synchronous_commit = off
wal_buffers = 16MB
wal_writer_delay = 10ms
2 gin 倒排索引
GIN倒排索引,支持多值列(例如数组、JSON、HSTORE、全文检索),多列任意组合查询索引加速。
例子
postgres=# create table t(id int, c1 int[]);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
postgres=# create index idx_t_1 on t using gin (c1);
CREATE INDEX
postgres=# create or replace function gen_rand_arr(int,int) returns int[] as $$
postgres$# select array(select (random()*$1)::int from generate_series(1,$2));
postgres$# $$ language sql strict;
CREATE FUNCTION
postgres=# select gen_rand_arr(100,10);
gen_rand_arr
-------------------------------
{3,85,71,73,91,2,29,81,69,77}
(1 row)
postgres=# insert into t select id,gen_rand_arr(10000,10) from generate_series(1,10000000) t(id);
INSERT 0 10000000
postgres=# explain analyze select * from t where c1 @> array[1,2];
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Gather Motion 4:1 (slice1; segments: 4) (cost=6.50..7.81 rows=1 width=65) (actual time=0.340..0.426 rows=5 loops=2)
-> Bitmap Heap Scan on t (cost=6.50..7.81 rows=1 width=65) (actual time=0.329..0.335 rows=2 loops=2)
Recheck Cond: (c1 @> '{1,2}'::integer[])
-> Bitmap Index Scan on idx_t_1 (cost=0.00..6.50 rows=1 width=0) (actual time=0.259..0.259 rows=0 loops=2)
Index Cond: (c1 @> '{1,2}'::integer[])
(slice0) Executor memory: 322K bytes.
(slice1) Executor memory: 779K bytes avg x 4 workers, 907K bytes max (seg1). Work_mem: 33K bytes max.
Memory used: 128000kB
Optimizer: legacy query optimizer
Total runtime: 1.208 ms
(10 rows)
set enable_bitmapscan =off;
postgres=# explain analyze select * from t where c1 @> array[1,2];
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Gather Motion 4:1 (slice1; segments: 4) (cost=0.00..16.50 rows=1 width=65) (actual time=503.962..636.780 rows=5 loops=2)
-> Seq Scan on t (cost=0.00..16.50 rows=1 width=65) (actual time=34.927..588.086 rows=2 loops=2)
Filter: (c1 @> '{1,2}'::integer[])
(slice0) Executor memory: 322K bytes.
(slice1) Executor memory: 54K bytes avg x 4 workers, 54K bytes max (seg0).
Memory used: 128000kB
Optimizer: legacy query optimizer
Total runtime: 1273.949 ms
(8 rows)
相比全表扫描,使用GIN索引的查询性能提升上千倍。
3 spgist 索引,范围类型
spgist索引接口,以及范围类型。
postgres=# create table t(id int, rg int4range);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
postgres=# create index idx_t_1 on t using spgist (rg);
CREATE INDEX
postgres=# insert into t values (1, int4range(1,100));
INSERT 0 1
postgres=# insert into t values (2, int4range(101,200));
INSERT 0 1
postgres=# explain select * from t where rg @> 1;
QUERY PLAN
-----------------------------------------------------------------------------
Gather Motion 4:1 (slice1; segments: 4) (cost=0.12..2.74 rows=1 width=18)
-> Index Scan using idx_t_1 on t (cost=0.12..2.74 rows=1 width=18)
Index Cond: (rg @> 1)
Optimizer: legacy query optimizer
(4 rows)
postgres=# select * from t where rg @> 1;
id | rg
----+---------
1 | [1,100)
(1 row)
范围类型在一些场景的应用
《会议室预定系统实践(解放开发) - PostgreSQL tsrange(时间范围类型) + 排他约束》
《PostgreSQL 黑科技 range 类型及 gist index 20x+ speedup than Mysql index combine query》
4 增加行级排他锁,优化分布式死锁检测
1、原有分布式死锁检测
postgres=# show deadlock_timeout ;
deadlock_timeout
------------------
1s
(1 row)
postgres=# create table a (id int, c1 int);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
Time: 156.785 ms
postgres=# create table b (id int, c1 int);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
Time: 87.583 ms
postgres=# insert into a values (1,1);
INSERT 0 1
Time: 58.180 ms
postgres=# insert into b values (1,1);
INSERT 0 1
Time: 72.548 ms
postgres=# begin;
BEGIN
postgres=# update b set c1=2 where id=1;
UPDATE 1
...
postgres=# update a set c1=2 where id=1;
UPDATE 1
postgres=# begin;
BEGIN
Time: 1.997 ms
postgres=# update a set c1=2 where id=1;
UPDATE 1
Time: 1.425 ms
...
postgres=# update b set c1=2 where id=1;
ERROR: deadlock detected
LINE 1: update b set c1=2 where id=1;
^
DETAIL: Process 26021 waits for ExclusiveLock on relation 36930 of database 12097; blocked by process 26091.
Process 26091 waits for ExclusiveLock on relation 36927 of database 12097; blocked by process 26021.
gpdb 6 与 gpdb 5 行为一致
postgres=# begin;
BEGIN
postgres=# update a set c1=2 where id=1;
UPDATE 1
...
postgres=# update b set c1=2 where id=1;
UPDATE 1
postgres=# begin;
BEGIN
postgres=# update b set c1=3 where id=1;
UPDATE 1
...
postgres=# update a set c1=3 where id=1;
ERROR: deadlock detected (seg2 127.0.0.1:24002 pid=3721)
DETAIL: Process 3721 waits for ShareLock on transaction 1306618; blocked by process 3703.
Process 3703 waits for ShareLock on transaction 1306619; blocked by process 3721.
HINT: See server log for query details.
2、gpdb6增加了行级锁
(gpdb6以前为表级排他锁)
对同一张表的delete\update操作,堵塞insert\update\delete
begin;
update a set c1=2 where id=1;
堵塞其他会话对同一张表的如下操作:insert\update\delete:
update a set c1=3 where id=2;
insert into a values (3,1);
delete from a where id=2;
gpdb6
行级锁,以上操作不堵塞。
行级分布式死锁检测,参数gp_global_deadlock_detector_period
postgres=# show gp_global_deadlock_detector_period;
gp_global_deadlock_detector_period
------------------------------------
2min
(1 row)
postgres=# create table a (id int, c1 int);
NOTICE: Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.
HINT: The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.
CREATE TABLE
postgres=# insert into a values (1,1),(2,2);
INSERT 0 2
postgres=# begin;
BEGIN
postgres=# update a set c1=2 where id=1;
UPDATE 1
...
postgres=# update a set c1=1 where id=2;
UPDATE 1
postgres=# begin;
BEGIN
postgres=# update a set c1=3 where id=2;
UPDATE 1
...
postgres=# update a set c1=3 where id=1;
ERROR: canceling statement due to user request: "cancelled by global deadlock detector"
5 TP性能提升
1亿数据量,TPCB,只读,读写混合测试。
pgbench -i -s 1000 -h 127.0.0.1 -p 15432 -U postgres postgres
pgbench -M simple -v -r -P 1 -c 16 -j 16 -h 127.0.0.1 -p 15432 -U postgres postgres -T 120 -S
pgbench -M simple -v -r -P 1 -c 16 -j 16 -h 127.0.0.1 -p 15432 -U postgres postgres -T 120
1 gpdb 5
postgres=# select version();
version
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 8.3.23 (Greenplum Database 5.10.2+7615c3b build ga) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.3.1 20170216 (Red Hat 6.3.1-3), 64-bit compiled on Aug 25 2018 08:21:26
(1 row)
postgres=# show gp_server_version;
gp_server_version
-------------------------
5.10.2+7615c3b build ga
(1 row)
postgres=# show gp_server_version_num;
gp_server_version_num
-----------------------
51002
(1 row)
1、tpcb 只读
transaction type: <builtin: select only>
scaling factor: 1000
query mode: simple
number of clients: 16
number of threads: 16
duration: 120 s
number of transactions actually processed: 361911
latency average = 5.306 ms
latency stddev = 0.854 ms
tps = 3014.205774 (including connections establishing)
tps = 3014.474824 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.002 \set aid random(1, 100000 * :scale)
5.303 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
2、tpcb 读写混合
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1000
query mode: simple
number of clients: 16
number of threads: 16
duration: 120 s
number of transactions actually processed: 1822
latency average = 1057.754 ms
latency stddev = 130.580 ms
tps = 15.074113 (including connections establishing)
tps = 15.075498 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.005 \set aid random(1, 100000 * :scale)
0.001 \set bid random(1, 1 * :scale)
0.001 \set tid random(1, 10 * :scale)
0.001 \set delta random(-5000, 5000)
2.417 BEGIN;
990.377 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
7.147 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
1.186 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
1.081 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
0.674 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
54.865 END;
2 gpdb 6
psql
psql (9.3beta1)
Type "help" for help.
postgres=# select version();
version
---------------------------------------------------------------------------------------------------------------------------------------------------------
PostgreSQL 9.3beta1 (Greenplum Database 6.0.0-alpha.0+dev.11201.gb2e98d4 build dev-oss) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit compiled on Sep 26 2018 15:00:09
(1 row)
postgres=# show gp_server_version;
gp_server_version
------------------------------------------------
6.0.0-alpha.0+dev.11201.gb2e98d4 build dev-oss
(1 row)
postgres=# show gp_server_version_num;
gp_server_version_num
-----------------------
60000
(1 row)
1、tpcb 只读
transaction type: <builtin: select only>
scaling factor: 1000
query mode: simple
number of clients: 16
number of threads: 16
duration: 120 s
number of transactions actually processed: 3006326
latency average = 0.639 ms
latency stddev = 0.107 ms
tps = 25052.487094 (including connections establishing)
tps = 25056.694155 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.002 \set aid random(1, 100000 * :scale)
0.637 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
2、tpcb 读写混合
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1000
query mode: simple
number of clients: 16
number of threads: 16
duration: 120 s
number of transactions actually processed: 351382
latency average = 5.465 ms
latency stddev = 9.487 ms
tps = 2927.029739 (including connections establishing)
tps = 2927.497105 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.002 \set aid random(1, 100000 * :scale)
0.001 \set bid random(1, 1 * :scale)
0.001 \set tid random(1, 10 * :scale)
0.000 \set delta random(-5000, 5000)
0.195 BEGIN;
0.779 UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
0.692 SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
0.703 UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
0.685 UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
0.566 INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
1.838 END;
小结
Greenplum 6已合并到PostgreSQL 9.3版本,相比5性能提升:读8倍,更新、删除195倍,同时有大量PG新特性,例如
1、GIN、SPGIST 索引接口。在模糊查询、全文检索、向量相似计算方面支持索引加速。 1000万行的数组检索,性能提升了1000倍。
2、支持异步事务,小事务写入有大幅提升。
3、支持物化视图,OLAP中很好用的功能。
4、事件触发器,基于事件控制细粒度DDL权限。
5、整体性能增强,OLTP 读8倍,更新、删除195倍。
以上特性都是通过升级PostgreSQL版本加入的。
其他增强:
1、增加跨表的分布式死锁检测。
2、更新、删除由表级排他锁改成行级排他锁,大幅提升DML性能。表现在OLTP方面,更新、删除混合测试TPCB 提升了195倍。
参考
https://www.postgresql.org/docs/11/static/release.html
https://github.com/greenplum-db/gpdb
《Deepgreen(Greenplum) 多机部署测试 , TPC-H VS citus》
《会议室预定系统实践(解放开发) - PostgreSQL tsrange(时间范围类型) + 排他约束》
《PostgreSQL 黑科技 range 类型及 gist index 20x+ speedup than Mysql index combine query》