PostgreSQL 优化CASE - 无序UUID性能问题诊断

16 minute read

背景

某数据库的IO等待非常高, 基本上iostat 看到的磁盘使用率都是接近100%的.

分析这个业务库有如下特征 :

1. 最大的单表20GB左右, 1.2亿数据, 有插入操作, 无更改无删除, 有少量查询.

2. 这些表的索引较少, 只有PK和时间字段上的索引, 注意这里的PK是text类型, 并且使用的是业务系统生成的UUID。

分析统计信息如下 :

1. 从pg_stat_statements分析占用CPU时间最多的SQL中, 多数是涉及大表的INSERT语句.

2. 从iostat看到的基本上是写入的请求居多. 读请求较少. 写请求的等待队列较大.

其他分析 :

1. 在业务系统上线初期并没有暴露出严重的IO等待问题, 跟进后面的分析这个应该是和业务量的增长有关, 增长后暴露出来的问题.

2. 存储没有异常, 也就是说不是硬件带来的问题. 操作系统层面也没有异常.

3. 从iostat上分析, IO写请求基本上集中在索引所在的目录.

那么到底是什么导致了这么高的写IO等待呢?

就是UUID的索引 , 为什么这么说呢? 来看几个内容取样 :

                id                  
----------------------------------  
 f649e41701d6469396b6256a52a449d7  
 731533dc86ba4a449d43a2cbfdf5c8e0  
 f28b6efce57e42ed8526293043482a44  
 24e5cb3208874f529cf03e9e114b11f1  
 f17da47fa0b34f2888d1e97b36c921f6  
 175bbb128d2d49df9950dc34a31dab8e  
 6bcc21b43ada42128cc45d1b11c4a05f  
 c1e7e50fb5824fc9b934314c820941fa  
 9a7e490db3c2409d80e6203e84440234  
 bae295da28d944a68ecf79b54672811e  
 640ad54a5dc84d29b08b74456b3d002a  
 09645443e4344dd183ab2c03d573ca2d  
 ddd7122ec0484125aeaf3a7599264f3f  
 7e9ae853159c456db5dc3820ad463638  
 026148ab2d694de69acb6b04055b418c  
 9d89d4c9c74b4d02a37dca934e177277  
 676d37cc9c9247328f96b6f8eff0f155  
 e9aa8f4b96794b9e85a9b46797d07896  

很乱, 每次插入的值和前面的值没有关系, 也没有顺序 .

因此每次插入的值可能存在索引中的位置也是非常随机的, 所以btree索引的树节点和叶节点需要不断的调整, 产生大量的离散IO .

下面在测试系统进行验证, 分别考虑几种情况.

1. 测试3中类型的主键, uuid , text , int8

2. 针对 uuid 和 text 测试4种存储配置(plain, main, external, extended)

3. 测试数据量不同的情况下对性能的影响.

4. 测试没有索引的情况.

测试需求 :

1. 为了生成uuid, 这里需要用到uuid-ossp模块.

2. 测试用到PostgreSQL 9.2.0版本.

安装PostgreSQL

1. 首先要安装uuid需要的依赖包, 在以下网址下载 :

http://www.ossp.org/pkg/lib/uuid/

wget ftp://ftp.ossp.org/pkg/lib/uuid/uuid-1.6.2.tar.gz  
tar -zxvf uuid-1.6.2.tar.gz  
cd uuid-1.6.2  
./configure --prefix=/opt/uuid-1.6.2  
make  
make install  

2. 安装PostgreSQL

wget http://ftp.postgresql.org/pub/source/v9.2.0/postgresql-9.2.0.tar.bz2  
tar -jxvf postgresql-9.2.0.tar.bz2  
CPPFLAGS=-I/opt/uuid-1.6.2/include ./configure --prefix=/home/pg9.2.0/pgsql9.2.0 --with-pgport=9200 --with-perl --with-python --with-tcl --with-openssl --with-pam --without-ldap --with-libxml --with-libxslt --enable-thread-safety --with-wal-blocksize=16 --with-ossp-uuid --with-libs=/opt/uuid-1.6.2/lib  
gmake world  
gmake install-world  

3. 初始化数据库略

4. 创建两个表空间, 分布在两个不同的独立物理硬盘上, 目的是将索引和数据文件分开, 容易观察和定位到IO问题的发生. (最好能够完全独立, 和GLOBAL表空间, pg_xlog都分开.)

我这里使用的表空间如下 :

 tbs_digoal       | postgres | /data04/pg9.2.0/tbs_digoal  
 tbs_digoal_idx | postgres | /data03/pgdata/pg9.2.0/tbs_digoal_idx  

其中索引所在表空间位于/dev/sda3, 表所在表空间位于/dev/sdb

5. 安装uuid-ossp模块

pg9.2.0@db-172-16-3-150-> psql digoal postgres  
psql (9.2.0)  
Type "help" for help.  
digoal=# create extension "uuid-ossp";  

6. 创建测试表, 指定索引表空间和表的表空间.

digoal=> create table test_uuid_pk (id uuid primary key using index tablespace tbs_digoal_idx) tablespace tbs_digoal;  
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "test_uuid_pk_pkey" for table "test_uuid_pk"  
CREATE TABLE  
digoal=> create table test_text_pk (id text primary key using index tablespace tbs_digoal_idx) tablespace tbs_digoal;  
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "test_text_pk_pkey" for table "test_text_pk"  
CREATE TABLE  
digoal=> create table test_int8_pk (id int8 primary key using index tablespace tbs_digoal_idx) tablespace tbs_digoal;  
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "test_int8_pk_pkey" for table "test_int8_pk"  
CREATE TABLE  
digoal=> create sequence test_seq start with 1 cache 100;  
CREATE SEQUENCE  

开始测试, 每次测试前手工执行checkpoint, 以免影响测试的结果.

需要关注几组数据 :

top里面的%wa, iostat里面的%util, pgbench里面的tps.

1. 测试uuid字段类型的主键, 使用pgbench测试插入离散uuid数据到测试表.

pgbench script :

vi pgbench.sql  
insert into test_uuid_pk (id) values (uuid_generate_v4());  

测试,

digoal=# checkpoint;  
CHECKPOINT  

pgbench测试结果 :

pg9.2.0@db-172-16-3-150-> pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digoal   
transaction type: Custom query  
scaling factor: 1  
query mode: prepared  
number of clients: 8  
number of threads: 8  
duration: 60 s  
number of transactions actually processed: 2462748  
tps = 41039.584555 (including connections establishing)  
tps = 41044.669903 (excluding connections establishing)  

测试过程中的top截取 :

Tasks: 247 total,  11 running, 236 sleeping,   0 stopped,   0 zombie  
Cpu(s): 41.1%us, 36.9%sy,  0.0%ni, 14.8%id,  4.1%wa,  0.2%hi,  2.8%si,  0.0%st  
Mem:  98992440k total, 89684600k used,  9307840k free,  1246100k buffers  
Swap:  8385920k total,        0k used,  8385920k free, 85007228k cached  
  
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                  
 6267 pg9.2.0   15   0  319m 2100 1524 S 75.0  0.0   0:32.29 pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digoa  
 6278 pg9.2.0   16   0 1184m 130m 128m R 68.1  0.1   0:28.46 postgres: digoal digoal [local] BIND                                     
 6277 pg9.2.0   16   0 1184m 129m 128m R 66.1  0.1   0:27.94 postgres: digoal digoal [local] idle                                     
 6282 pg9.2.0   16   0 1184m 129m 128m R 66.1  0.1   0:27.83 postgres: digoal digoal [local] idle                                     
 6284 pg9.2.0   16   0 1184m 129m 127m R 65.2  0.1   0:27.70 postgres: digoal digoal [local] INSERT                                   
 6280 pg9.2.0   16   0 1184m 129m 127m R 62.2  0.1   0:27.95 postgres: digoal digoal [local] idle                                     
 6281 pg9.2.0   16   0 1184m 129m 128m R 62.2  0.1   0:27.82 postgres: digoal digoal [local] idle                                     
 6283 pg9.2.0   16   0 1184m 129m 128m R 60.2  0.1   0:28.11 postgres: digoal digoal [local] INSERT                                   
 6276 pg9.2.0   16   0 1184m 129m 128m R 59.2  0.1   0:27.73 postgres: digoal digoal [local] INSERT  

测试过程中的iostat截取 :

avg-cpu:  %user   %nice %system %iowait  %steal   %idle  
          36.12    0.00   34.12    5.12    0.00   24.62  
  
Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util  
sda               0.00 27606.00  0.00 4723.00     0.00 265880.00    56.29    51.15   11.67   0.12  57.50  
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00  
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00  
sda3              0.00 27606.00  0.00 4723.00     0.00 265880.00    56.29    51.15   11.67   0.12  57.50  
sdb               0.00 26533.00  0.00 976.00     0.00 220064.00   225.48     1.43    1.48   0.76  74.50  
dm-0              0.00     0.00  0.00 27508.00     0.00 220064.00     8.00    71.27    2.59   0.03  74.50  
dm-1              0.00     0.00  0.00 32192.00     0.00 257536.00     8.00   352.96   11.68   0.02  57.60  
dm-2              0.00     0.00  0.00 27508.00     0.00 220064.00     8.00    71.31    2.59   0.03  74.50  

2. 测试text字段类型的主键, 使用pgbench测试插入离散uuid数据到测试表.

pgbench script :

vi pgbench.sql  
insert into test_text_pk (id) values (uuid_generate_v4());  

测试,

digoal=# checkpoint;  
CHECKPOINT  

pgbench测试结果 :

pg9.2.0@db-172-16-3-150-> pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digoal   
transaction type: Custom query  
scaling factor: 1  
query mode: prepared  
number of clients: 8  
number of threads: 8  
duration: 60 s  
number of transactions actually processed: 2133816  
tps = 34674.325489 (including connections establishing)  
tps = 34677.786574 (excluding connections establishing)  

测试过程中的top截取 :

Tasks: 244 total,   6 running, 238 sleeping,   0 stopped,   0 zombie  
Cpu(s): 32.6%us, 27.4%sy,  0.0%ni, 32.0%id,  6.0%wa,  0.1%hi,  1.9%si,  0.0%st  
Mem:  98992440k total, 89138112k used,  9854328k free,  1246048k buffers  
Swap:  8385920k total,        0k used,  8385920k free, 84442028k cached  
  
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                  
 5580 pg9.2.0   16   0 1184m  69m  68m R 52.5  0.1   0:07.09 postgres: digoal digoal [local] INSERT                                   
 5566 pg9.2.0   15   0  255m 2096 1524 S 52.1  0.0   0:07.94 pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digoa  
 5582 pg9.2.0   16   0 1184m  69m  67m S 51.5  0.1   0:06.96 postgres: digoal digoal [local] INSERT                                   
 5576 pg9.2.0   16   0 1184m  69m  68m R 50.8  0.1   0:07.08 postgres: digoal digoal [local] INSERT                                   
 5581 pg9.2.0   16   0 1184m  69m  67m S 49.5  0.1   0:06.93 postgres: digoal digoal [local] INSERT                                   
 5577 pg9.2.0   16   0 1184m  69m  67m S 49.1  0.1   0:07.01 postgres: digoal digoal [local] INSERT                                   
 5579 pg9.2.0   16   0 1184m  69m  68m S 49.1  0.1   0:06.90 postgres: digoal digoal [local] INSERT                                   
 5575 pg9.2.0   16   0 1184m  69m  67m R 48.5  0.1   0:07.07 postgres: digoal digoal [local] BIND                                     
 5578 pg9.2.0   16   0 1184m  69m  67m R 46.2  0.1   0:06.83 postgres: digoal digoal [local] INSERT  

测试过程中的iostat截取 :

avg-cpu:  %user   %nice %system %iowait  %steal   %idle  
          37.58    0.00   32.58    5.37    0.00   24.47  
  
Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util  
sda               0.00 28633.00  0.00 6048.00     0.00 272856.00    45.12    54.31    8.88   0.09  56.70  
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00  
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00  
sda3              0.00 28633.00  0.00 6048.00     0.00 272856.00    45.12    54.31    8.88   0.09  56.70  
sdb               0.00 29436.00  0.00 746.00     0.00 239496.00   321.04     1.71    2.29   0.89  66.60  
dm-0              0.00     0.00  0.00 30196.00     0.00 241568.00     8.00    90.18    2.98   0.02  66.60  
dm-1              0.00     0.00  0.00 34804.00     0.00 278432.00     8.00   322.95    9.19   0.02  56.70  
dm-2              0.00     0.00  0.00 30196.00     0.00 241568.00     8.00    90.23    2.98   0.02  66.60  

3. 测试text字段类型的主键, 使用pgbench测试插入顺序的唯一数据到测试表.

顺序的UUID例如 :

5a50fad8-81ac-4249-8390-f06e37fb1e9b1  
5a50fad8-81ac-4249-8390-f06e37fb1e9b2  
5a50fad8-81ac-4249-8390-f06e37fb1e9b3  
5a50fad8-81ac-4249-8390-f06e37fb1e9b4  
5a50fad8-81ac-4249-8390-f06e37fb1e9b5  
...  

pgbench script :

vi pgbench.sql  
insert into test_text_pk (id) values ('5a50fad8-81ac-4249-8390-f06e37fb1e9b'||nextval('test_seq'::regclass));  

测试,

digoal=# checkpoint;  
CHECKPOINT  
digoal=> truncate table test_text_pk ;  
TRUNCATE TABLE  

pgbench测试结果 :

pg9.2.0@db-172-16-3-150-> pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digoal   
transaction type: Custom query  
scaling factor: 1  
query mode: prepared  
number of clients: 8  
number of threads: 8  
duration: 60 s  
number of transactions actually processed: 4548223  
tps = 75802.392652 (including connections establishing)  
tps = 75811.583066 (excluding connections establishing)  

测试过程中的top截取 :

Tasks: 247 total,   6 running, 241 sleeping,   0 stopped,   0 zombie  
Cpu(s): 65.8%us, 26.7%sy,  0.0%ni,  6.3%id,  0.4%wa,  0.1%hi,  0.7%si,  0.0%st  
Mem:  98992440k total, 89949664k used,  9042776k free,  1246152k buffers  
Swap:  8385920k total,        0k used,  8385920k free, 85253040k cached  
  
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                  
 7266 pg9.2.0   15   0  320m 2104 1520 S 119.6  0.0   0:29.43 pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digo  
 7275 pg9.2.0   16   0 1180m 172m 170m S 80.1  0.2   0:19.20 postgres: digoal digoal [local] idle                                     
 7281 pg9.2.0   17   0 1180m 165m 163m R 79.1  0.2   0:18.45 postgres: digoal digoal [local] INSERT                                   
 7277 pg9.2.0   17   0 1180m 166m 164m R 77.1  0.2   0:18.62 postgres: digoal digoal [local] INSERT                                   
 7276 pg9.2.0   17   0 1180m 167m 165m R 75.1  0.2   0:18.69 postgres: digoal digoal [local] idle                                     
 7278 pg9.2.0   18   0 1180m 166m 164m R 75.1  0.2   0:18.57 postgres: digoal digoal [local] BIND                                     
 7280 pg9.2.0   17   0 1180m 167m 165m R 75.1  0.2   0:18.59 postgres: digoal digoal [local] INSERT                                   
 7279 pg9.2.0   17   0 1180m 167m 165m S 72.1  0.2   0:18.59 postgres: digoal digoal [local] idle                                     
 7282 pg9.2.0   17   0 1180m 164m 162m S 69.2  0.2   0:18.54 postgres: digoal digoal [local] idle    

测试过程中的iostat截取 :

avg-cpu:  %user   %nice %system %iowait  %steal   %idle  
          64.96    0.00   26.56    0.50    0.00    7.98  
  
Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util  
sda               0.00  1909.00  0.00 36.00     0.00 15560.00   432.22     0.35    9.75   0.56   2.00  
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00  
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00  
sda3              0.00  1909.00  0.00 36.00     0.00 15560.00   432.22     0.35    9.75   0.56   2.00  
sdb               0.00 10053.00  0.00 2817.00     0.00 102984.00    36.56     0.56    0.20   0.12  34.90  
dm-0              0.00     0.00  0.00 12873.00     0.00 102984.00     8.00    10.19    0.79   0.03  35.10  
dm-1              0.00     0.00  0.00 1945.00     0.00 15560.00     8.00    19.65   10.10   0.01   2.00  
dm-2              0.00     0.00  0.00 12873.00     0.00 102984.00     8.00    10.80    0.79   0.03  35.40  

4. 测试int8字段类型的主键, 使用pgbench测试插入整型数据到测试表.

pgbench script :

vi pgbench.sql  
insert into test_int8_pk (id) values (nextval('test_seq'::regclass));  

测试,

digoal=# checkpoint;  
CHECKPOINT  

pgbench测试结果 :

pg9.2.0@db-172-16-3-150-> pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digoal   
transaction type: Custom query  
scaling factor: 1  
query mode: prepared  
number of clients: 8  
number of threads: 8  
duration: 60 s  
number of transactions actually processed: 5088896  
tps = 84813.774212 (including connections establishing)  
tps = 84822.447895 (excluding connections establishing)  

测试过程中的top截取 :

Tasks: 247 total,   9 running, 238 sleeping,   0 stopped,   0 zombie  
Cpu(s): 63.0%us, 29.1%sy,  0.0%ni,  7.0%id,  0.2%wa,  0.1%hi,  0.5%si,  0.0%st  
Mem:  98992440k total, 89760120k used,  9232320k free,  1246116k buffers  
Swap:  8385920k total,        0k used,  8385920k free, 85081028k cached  
  
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                  
 6865 pg9.2.0   15   0  319m 2100 1524 S 126.5  0.0   0:35.91 pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digo  
 6877 pg9.2.0   18   0 1180m 114m 113m R 77.1  0.1   0:20.89 postgres: digoal digoal [local] INSERT                                   
 6880 pg9.2.0   18   0 1180m 115m 113m R 77.1  0.1   0:20.80 postgres: digoal digoal [local] idle                                     
 6876 pg9.2.0   18   0 1180m 114m 113m R 76.1  0.1   0:21.10 postgres: digoal digoal [local] idle                                     
 6878 pg9.2.0   18   0 1180m 116m 115m R 75.1  0.1   0:21.11 postgres: digoal digoal [local] INSERT                                   
 6879 pg9.2.0   18   0 1180m 115m 114m R 75.1  0.1   0:20.99 postgres: digoal digoal [local] idle                                     
 6874 pg9.2.0   17   0 1180m 117m 115m R 74.1  0.1   0:21.30 postgres: digoal digoal [local] idle                                     
 6875 pg9.2.0   18   0 1180m 115m 114m R 74.1  0.1   0:21.36 postgres: digoal digoal [local] INSERT                                   
 6881 pg9.2.0   18   0 1180m 115m 114m R 69.2  0.1   0:20.85 postgres: digoal digoal [local] INSERT  

测试过程中的iostat截取 :

avg-cpu:  %user   %nice %system %iowait  %steal   %idle  
          61.92    0.00   29.96    0.25    0.00    7.87  
  
Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util  
sda               0.00  2078.00  0.00 40.00     0.00 16944.00   423.60     0.15    3.65   0.53   2.10  
sda1              0.00    11.00  0.00  2.00     0.00   104.00    52.00     0.00    0.00   0.00   0.00  
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00  
sda3              0.00  2067.00  0.00 38.00     0.00 16840.00   443.16     0.15    3.84   0.55   2.10  
sdb               0.00  7377.00  0.00 2112.00     0.00 75912.00    35.94     0.40    0.19   0.12  25.20  
dm-0              0.00     0.00  0.00 9489.00     0.00 75912.00     8.00     5.81    0.61   0.03  25.90  
dm-1              0.00     0.00  0.00 2105.00     0.00 16840.00     8.00     7.83    3.72   0.01   2.10  
dm-2              0.00     0.00  0.00 9489.00     0.00 75912.00     8.00     5.83    0.61   0.03  26.10  

其他几种测试数据就不一一列出来了, 简单的进行总结.

1. 针对 uuid 和 text 测试4种存储配置 (plain, main, external, extended) .

默认是extended存储, 所以以上测试都是在extended下进行的.

数据量5000W后进行的测试 : 所以比较数据请参看后面的数据.

digoal=> \d+ test_text_pk  
                    Table "digoal.test_text_pk"  
 Column | Type | Modifiers | Storage  | Stats target | Description   
--------+------+-----------+----------+--------------+-------------  
 id     | text | not null  | extended |              |   
Indexes:  
    "test_text_pk_pkey" PRIMARY KEY, btree (id), tablespace "tbs_digoal_idx"  
Has OIDs: no  
  
digoal=> alter table test_text_pk alter column id set storage plain;  
ALTER TABLE  
  
pg9.2.0@db-172-16-3-150-> pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digoal   
transaction type: Custom query  
scaling factor: 1  
query mode: prepared  
number of clients: 8  
number of threads: 8  
duration: 60 s  
number of transactions actually processed: 216127  
tps = 3576.422811 (including connections establishing)  
tps = 3576.800549 (excluding connections establishing)  
  
Tasks: 249 total,   1 running, 248 sleeping,   0 stopped,   0 zombie  
Cpu(s):  3.6%us,  3.6%sy,  0.0%ni, 79.1%id, 13.5%wa,  0.0%hi,  0.2%si,  0.0%st  
Mem:  98992440k total, 92489108k used,  6503332k free,   760752k buffers  
Swap:  8385920k total,       60k used,  8385860k free, 88174496k cached  
  
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                  
10661 pg9.2.0   16   0 1184m 330m 328m S  6.9  0.3   0:04.10 postgres: digoal digoal [local] INSERT                                   
10654 pg9.2.0   16   0 1184m 326m 324m S  5.9  0.3   0:04.03 postgres: digoal digoal [local] INSERT                                   
10655 pg9.2.0   15   0 1184m 334m 332m S  5.9  0.3   0:04.14 postgres: digoal digoal [local] INSERT                                   
10658 pg9.2.0   16   0 1184m 322m 320m S  5.9  0.3   0:03.94 postgres: digoal digoal [local] INSERT                                   
10659 pg9.2.0   16   0 1184m 333m 331m S  5.9  0.3   0:04.16 postgres: digoal digoal [local] INSERT                                   
10656 pg9.2.0   16   0 1184m 322m 320m S  4.9  0.3   0:03.96 postgres: digoal digoal [local] INSERT                                   
10657 pg9.2.0   16   0 1184m 333m 331m S  4.9  0.3   0:04.15 postgres: digoal digoal [local] INSERT                                   
10660 pg9.2.0   16   0 1184m 331m 329m S  4.9  0.3   0:04.12 postgres: digoal digoal [local] INSERT                                   
10645 pg9.2.0   15   0  255m 2092 1524 S  3.9  0.0   0:03.35 pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digoa  
  
avg-cpu:  %user   %nice %system %iowait  %steal   %idle  
           0.00    0.00    1.50   14.96    0.00   83.54  
  
Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util  
sda               0.00   862.00  0.00 654.00     0.00 11880.00    18.17   145.69  218.26   1.53 100.10  
sda1              0.00    13.00  0.00  2.00     0.00   120.00    60.00     0.31  157.00 157.00  31.40  
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00  
sda3              0.00   849.00  0.00 652.00     0.00 11760.00    18.04   145.38  218.45   1.54 100.10  
sdb               0.00  5311.00  0.00 84.00     0.00 39896.00   474.95    23.67  294.74  11.64  97.80  
dm-0              0.00     0.00  0.00 5403.00     0.00 43224.00     8.00  1436.09  279.59   0.18  97.80  
dm-1              0.00     0.00  0.00 1522.00     0.00 12176.00     8.00   330.52  212.65   0.66 100.10  
dm-2              0.00     0.00  0.00 5403.00     0.00 43224.00     8.00  1436.09  279.59   0.18  97.80  

略. 这几种模式影响不明显. 主要压缩和数据存储开销的均衡, 也就是硬件的CPU能力和IO能力的抉择.

plain和external都不压缩, 一个是存在表的数据文件中一个是存在toast中.

main和extended都压缩, 一个是存在表的数据文件中一个是存在toast中.

2. 测试数据量不同的情况下对性能的影响.

数据量越大, 连续插入离散值带来的影响越大, tps越低. 而连续插入顺序值则几乎不受影响. 下面是把基础数据加到5000W后进行的测试. 可以看出连续插入离散值到主键的测试有明显的性能变化.

连续插入顺序值的测试结果如下 :

digoal=> truncate table test_int8_pk ;  
TRUNCATE TABLE  
digoal=> \d test_int8_pk  
 Table "digoal.test_int8_pk"  
 Column |  Type  | Modifiers   
--------+--------+-----------  
 id     | bigint | not null  
Indexes:  
    "test_int8_pk_pkey" PRIMARY KEY, btree (id), tablespace "tbs_digoal_idx"  
  
digoal=> alter table test_int8_pk drop constraint test_int8_pk_pkey;  
ALTER TABLE  

插入5000W数据, 为了加快速度, 先删除PK, 数据插入完后再加入进来.

digoal=> insert into test_int8_pk select generate_series (1,50000000);  
  
\c digoal digoal  
set work_mem='4096MB';  
set maintenance_work_mem='4096MB';  
alter table test_int8_pk add constraint test_int8_pk_pkey primary key(id) using index tablespace tbs_digoal_idx;  
alter sequence test_seq start with 50000001;  
\c digoal postgres  
checkpoint;  

测试数据 :

pg9.2.0@db-172-16-3-150-> pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digoal   
transaction type: Custom query  
scaling factor: 1  
query mode: prepared  
number of clients: 8  
number of threads: 8  
duration: 60 s  
number of transactions actually processed: 4557154  
tps = 75951.686893 (including connections establishing)  
tps = 75962.447482 (excluding connections establishing)  

TOP数据 :

Cpu(s): 65.4%us, 26.3%sy,  0.0%ni,  6.9%id,  0.4%wa,  0.1%hi,  0.9%si,  0.0%st  
Mem:  98992440k total, 93250652k used,  5741788k free,  1246256k buffers  
Swap:  8385920k total,        0k used,  8385920k free, 88484952k cached  
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                  
 8263 pg9.2.0   15   0  319m 2108 1520 S 115.6  0.0   0:25.50 pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digo  
 8274 pg9.2.0   18   0 1180m 137m 135m R 78.1  0.1   0:16.33 postgres: digoal digoal [local] INSERT                                   
 8276 pg9.2.0   18   0 1180m 134m 132m R 78.1  0.1   0:16.12 postgres: digoal digoal [local] idle                                     
 8273 pg9.2.0   18   0 1180m 134m 133m R 77.1  0.1   0:16.00 postgres: digoal digoal [local] idle                                     
 8277 pg9.2.0   18   0 1180m 136m 134m R 75.1  0.1   0:16.39 postgres: digoal digoal [local] INSERT                                   
 8272 pg9.2.0   18   0 1180m 143m 141m R 74.1  0.1   0:16.56 postgres: digoal digoal [local] INSERT                                   
 8275 pg9.2.0   18   0 1180m 137m 135m R 74.1  0.1   0:16.55 postgres: digoal digoal [local] INSERT                                   
 8278 pg9.2.0   18   0 1180m 133m 132m R 74.1  0.1   0:15.99 postgres: digoal digoal [local] idle                                     
 8279 pg9.2.0   18   0 1180m 135m 133m R 72.1  0.1   0:16.14 postgres: digoal digoal [local] INSERT  

IOSTAT 数据 :

avg-cpu:  %user   %nice %system %iowait  %steal   %idle  
          65.04    0.00   27.59    0.50    0.00    6.87  
Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util  
sda               0.00  1899.00  0.00 40.00     0.00 15512.00   387.80     0.29    7.17   0.47   1.90  
sda1              0.00    11.00  0.00  2.00     0.00   104.00    52.00     0.00    0.00   0.00   0.00  
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00  
sda3              0.00  1888.00  0.00 38.00     0.00 15408.00   405.47     0.29    7.55   0.50   1.90  
sdb               0.00 10063.00  0.00 2843.00     0.00 103248.00    36.32     0.58    0.21   0.12  34.30  
dm-0              0.00     0.00  0.00 12902.00     0.00 103216.00     8.00     9.76    0.76   0.03  34.40  
dm-1              0.00     0.00  0.00 1926.00     0.00 15408.00     8.00    14.49    7.52   0.01   1.90  
dm-2              0.00     0.00  0.00 12902.00     0.00 103216.00     8.00     9.78    0.76   0.03  34.50  

持续插入离散值(uuid)的测试结果如下 :

digoal=> alter table test_text_pk drop constraint test_text_pk_pkey;  
ALTER TABLE  
digoal=> alter table test_uuid_pk drop constraint test_uuid_pk_pkey;  
ALTER TABLE  
insert into test_text_pk (id) values (uuid_generate_v4());  

插入5000W数据 :

pgbench -M prepared -n -j 8 -c 8 -t 6250000 -f ./pgbench.sql -U digoal digoal   

新建PK约束 :

\c digoal digoal  
insert into test_text_pk (id) values (uuid_generate_v4());  
set work_mem='4096MB';  
set maintenance_work_mem='4096MB';  
alter table test_text_pk add constraint test_text_pk_pkey primary key(id) using index tablespace tbs_digoal_idx;  
\c digoal postgres  
checkpoint;  

测试 :

pg9.2.0@db-172-16-3-150-> pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digoal   
transaction type: Custom query  
scaling factor: 1  
query mode: prepared  
number of clients: 8  
number of threads: 8  
duration: 60 s  
number of transactions actually processed: 212645  
tps = 3528.551941 (including connections establishing)  
tps = 3529.030665 (excluding connections establishing)  

TOP数据 :

Tasks: 249 total,   2 running, 247 sleeping,   0 stopped,   0 zombie  
Cpu(s):  3.7%us,  4.0%sy,  0.0%ni, 78.3%id, 13.8%wa,  0.0%hi,  0.3%si,  0.0%st  
Mem:  98992440k total, 92512088k used,  6480352k free,   760656k buffers  
Swap:  8385920k total,       60k used,  8385860k free, 88177048k cached  
  
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                  
10378 pg9.2.0   16   0 1184m 320m 318m S  6.5  0.3   0:04.43 postgres: digoal digoal [local] INSERT                                   
10374 pg9.2.0   16   0 1184m 324m 322m S  5.9  0.3   0:04.54 postgres: digoal digoal [local] INSERT                                   
10375 pg9.2.0   16   0 1184m 321m 319m D  5.9  0.3   0:04.42 postgres: digoal digoal [local] INSERT                                   
10376 pg9.2.0   16   0 1184m 320m 318m S  5.9  0.3   0:04.46 postgres: digoal digoal [local] INSERT                                   
10377 pg9.2.0   16   0 1184m 325m 323m S  5.9  0.3   0:04.50 postgres: digoal digoal [local] INSERT waiting                           
10380 pg9.2.0   16   0 1184m 324m 322m S  5.9  0.3   0:04.49 postgres: digoal digoal [local] INSERT                                   
10379 pg9.2.0   16   0 1184m 328m 326m S  5.2  0.3   0:04.58 postgres: digoal digoal [local] INSERT                                   
10381 pg9.2.0   15   0 1184m 317m 315m S  4.6  0.3   0:04.37 postgres: digoal digoal [local] INSERT                                   
10365 pg9.2.0   15   0  256m 2088 1520 S  3.9  0.0   0:03.61 pgbench -M prepared -n -j 8 -c 8 -T 60 -f ./pgbench.sql -U digoal digoa  

IOSTAT 数据 :

avg-cpu:  %user   %nice %system %iowait  %steal   %idle  
           3.13    0.00    4.63   19.90    0.00   72.34  
  
Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util  
sda               0.00  1113.86  0.00 597.03     0.00 17964.36    30.09   144.10  236.67   1.66  99.01  
sda1              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00  
sda2              0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00  
sda3              0.00  1113.86  0.00 597.03     0.00 17964.36    30.09   144.10  236.67   1.66  99.01  
sdb               0.00  7987.13  0.00 82.18     0.00 38114.85   463.81    35.87  280.34  10.96  90.10  
dm-0              0.00     0.00  0.00 8123.76     0.00 64990.10     8.00  2162.83  166.67   0.11  90.10  
dm-1              0.00     0.00  0.00 1700.99     0.00 13607.92     8.00   711.46  415.44   0.58  99.01  
dm-2              0.00     0.00  0.00 8123.76     0.00 64990.10     8.00  2162.84  166.67   0.11  90.10  

3. 测试没有索引的情况.

在没有索引的情况下, uuid, text, int8的插入速度都很快, 关键是IO等待都在0点几左右. IO不再是问题. 这更加印证了连续插入离散值会带来索引的大量IO开销的事实.

小结

1. 在做主键的类型选择时, 尽量不要使用UUID类型的字段, 如果要使用, 请使用有序生成的UUID, 不要使用无序生成的UUID.

如果不是主键,仅仅是普通索引,而且没有排序、>, <, >=, <=的查询需求的话,可以建立hash index,而不是b-tree索引。

HASH INDEX可以解决无序数据频繁写入带来的b-tree索引也频繁分裂的问题。

2. 使用离散uuid值和text类型作为主键的IO等待大概是6%, int8作为主键IO等待大概是0.25% , 从这里看使用离散uuid作为主键值带来的IO开销是使用序列插入int8类型字段带来IO开销的24倍. 当表里面的数据越多, 使用离散uuid作为主键值带来的IO开销会越来越大, 与INT8的性能相差就更加大了. 几百倍都有可能.

3. checkpoint时, UUID索引所在的硬盘有大量的IO操作, 持续时间较长. 而表数据文件所在的硬盘IO操作很快完成.

4. 使用perf 可以更详细的观察到代码层面的瓶颈,佐证以上问题。

《PostgreSQL 源码性能诊断(perf profiling)指南》

Flag Counter

digoal’s 大量PostgreSQL文章入口