PostgreSQL 单机3.9万亿/天(计数器、序列、自增)

5 minute read

背景

数据库中,自增序列是常见的需求,例如计数,主键,唯一值,或者自动生成的流水号等等。

因此序列这个功能就应运而生,序列的功能在很多商业数据库中都支持需求,PostgreSQL当然也支持,而且更好用。

在PostgreSQL中可以创建多个序列,设置序列的起始值,步长,缓存大小,是否轮回等。

postgres=# \h create sequence
Command:     CREATE SEQUENCE
Description: define a new sequence generator
Syntax:
CREATE [ TEMPORARY | TEMP ] SEQUENCE [ IF NOT EXISTS ] name [ INCREMENT [ BY ] increment ]
    [ MINVALUE minvalue | NO MINVALUE ] [ MAXVALUE maxvalue | NO MAXVALUE ]
    [ START [ WITH ] start ] [ CACHE cache ] [ [ NO ] CYCLE ]
    [ OWNED BY { table_name.column_name | NONE } ]

序列的应用场景

1. 作为字段默认值,default nextval(‘序列名’);

postgres=# create table seq_test(id serial, info text);
CREATE TABLE
postgres=# \d+ seq_test
                                             Table "public.seq_test"
 Column |  Type   |                       Modifiers                       | Storage  | Stats target | Description 
--------+---------+-------------------------------------------------------+----------+--------------+-------------
 id     | integer | not null default nextval('seq_test_id_seq'::regclass) | plain    |              | 
 info   | text    |                                                       | extended |              | 

2. 作为继承表的分布键

例如某个表有64个继承表,为了区分每个继承表的数据,可以将继承表的主键都按64取模,得到的结果不同来区分,使用步长很好的解决这个问题。

create sequence seq0 minvalue 0 increment by 64 start with 0 ;
create sequence seq1 minvalue 0 increment by 64 start with 1 ;
...
create table tbl0(id int default nextval('seq0'), .....);
create table tbl1(id int default nextval('seq1'), .....);
...

3. 作为计数器

select nextval('序列名');

4. 其他

那么PostgreSQL序列的性能怎样呢?

http://thebuild.com/blog/2015/10/30/dont-assume-postgresql-is-slow/

这个是某位网友的测试,我接下来会在某个32核的机器上测试一下,序列值的生成性能将达到每秒4.5千万。

Don’t Assume PostgreSQL is Slow

You can’t build a real-life system without caching.

That being said, it’s often the case that parts of the system you think are going to be slow aren’t. 

I’ve noticed a tendency to build out a huge stack of components (”we’ll have PostgreSQL, and Redis, and Celery, and Varnish, and…”) without actually measuring where the bottlenecks are.

Example: A counter.

  Suppose you need a global counter for something. 

  It needs to be super-fast, and available across all of the web front ends. 

  It’s not transactional (you never “uncount” based on a rollback), but you do want it to be persistent.

Option 1: Drop Redis into the stack, use INCR to keep the counter, and have some other process that reads the counter and spills it into PostgreSQL, 

  then have some other process that picks up the count when Redis starts and initializes it 

  (or be smart enough to read from both places and add them when yo need it), and accept that there are windows in which you might use counts.

Option 2: Use SERIAL in PostgreSQL.

But option 2 is really really really slow compared to super-ultra-fast Redis, right?

Not really (test on an Amazon i2-2xlarge instance, client over local sockets, Python client):

  Ten million INCRs in Redis: 824 seconds.

  Ten million SELECT nextval('') in PostgreSQL: 892 seconds.

So: Slower. 6.8 microseconds per increment slower. And no elaborate Redis tooling.

So, build for operation, apply load, then decide what to cache. Human intuition about what might be slow is almost certainly wrong.

32核序列生成性能测试数据

使用unix socket连接,厕所12种场景,每次取多条时,统一为每次取10万条。

1. 单个序列(nocache),1个客户端,每个客户端,每次取一条

create sequence seq;

vi test.sql
select nextval('seq');

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 1 -j 1 -T 100

tps = 44112.339267 (including connections establishing)
tps = 44114.552052 (excluding connections establishing)
4.4万/s

瓶颈

perf record -ag -p   PID

perf report --stdio
     3.58%  postgres  [kernel.kallsyms]      [k] _spin_lock_irqsave
     3.15%  postgres  postgres               [.] AllocSetAlloc.lto_priv.1064
     2.17%  postgres  postgres               [.] PostgresMain

2. 单个序列(nocache),1个客户端,每个客户端,每次取多条

vi test.sql
select nextval('seq') from generate_series(1,100000);

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 1 -j 1 -T 100

tps = 12.785754 (including connections establishing)
tps = 12.786493 (excluding connections establishing)
127.8万/s

瓶颈

     5.77%  postgres  postgres            [.] int8out
     4.28%  postgres  postgres            [.] SearchCatCache
     3.84%  postgres  libc-2.12.so        [.] memcpy

3. 单个序列(cache),1个客户端,每个客户端,每次取一条

postgres=# alter sequence seq cache 100000;
ALTER SEQUENCE

tps = 45007.157046 (including connections establishing)
tps = 45009.591614 (excluding connections establishing)
4.5万/s

瓶颈

     3.47%  postgres  [kernel.kallsyms]      [k] _spin_lock_irqsave
     3.05%  postgres  postgres               [.] AllocSetAlloc.lto_priv.1064
     2.44%  postgres  [kernel.kallsyms]      [k] _spin_lock

4. 单个序列(cache),1个客户端,每个客户端,每次取多条

tps = 16.761021 (including connections establishing)
tps = 16.761574 (excluding connections establishing)
167.6万/s

瓶颈

     7.62%  postgres  postgres           [.] int8ou
     5.45%  postgres  postgres           [.] SearchCatCache
     4.94%  postgres  postgres           [.] AllocSetAlloc.lto_priv.1064
     4.65%  postgres  libc-2.12.so       [.] memcpy
     4.32%  postgres  postgres           [.] hash_search_with_hash_value

5. 单个序列(nocache),64个客户端,每个客户端,每次取一条

postgres=# alter sequence seq cache 1;
ALTER SEQUENCE

vi test.sql
select nextval('seq');

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 100

tps = 418597.316363 (including connections establishing)
tps = 418849.992275 (excluding connections establishing)
41.8万/s

瓶颈

     5.41%  postgres  postgres               [.] LWLockWaitListLock.lto_priv.1132
     3.33%  postgres  postgres               [.] GetSnapshotData
     3.21%  postgres  [kernel.kallsyms]      [k] _spin_lock

6. 多个序列(nocache),64个客户端,每个客户端,每次取一条

postgres=# do language plpgsql     
declare
begin
  for i in 0..63 loop
    execute 'create sequence seq_'||i||' increment by 64 MINVALUE 0 start '||i;
  end loop;
end;
    ;

vi test.sql
\set id random(0,63)
select nextval('seq_'||:id);

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 100

tps = 1078827.770563 (including connections establishing)
tps = 1079469.007184 (excluding connections establishing)
107万/s

7. 单个序列(nocache),64个客户端,每个客户端,每次取多条

vi test.sql
select nextval('seq') from generate_series(1,100000);

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 100

tps = 6.671587 (including connections establishing)
tps = 6.675305 (excluding connections establishing)
66.7万/s

瓶颈

    24.25%  postgres  postgres            [.] LWLockWaitListLock.lto_priv.1132
            |
            --- LWLockWaitListLock.lto_priv.1132

     8.07%  postgres  postgres            [.] LWLockAcquire.constprop.859
            |
            --- LWLockAcquire.constprop.859

     6.38%  postgres  postgres            [.] LWLockDequeueSelf.lto_priv.1133
            |
            --- LWLockDequeueSelf.lto_priv.1133

8. 多个序列(nocache),64个客户端,每个客户端,每次取多条

vi test.sql
\set id random(0,63)
select nextval('seq_'||:id) from generate_series(1,100000);

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 100

tps = 15.196673 (including connections establishing)
tps = 15.206898 (excluding connections establishing)
152万/s

瓶颈

pgbench分配的随机值不够均匀,可能导致冲突依旧存在,因此性能并没有上去多少。

如果使用更多的序列,可以看到无冲突情况下,多并发取值的效果。

9. 单个序列(cache),64个客户端,每个客户端,每次取一条

postgres=# alter sequence seq cache 100000;
ALTER SEQUENCE

vi test.sql
select nextval('seq');

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 100

tps = 1323611.827976 (including connections establishing)
tps = 1324468.027166 (excluding connections establishing)
132万/s

瓶颈

    10.06%  postgres  postgres               [.] s_lock
     8.31%  postgres  postgres               [.] GetSnapshotData
     2.87%  postgres  postgres               [.] AllocSetAlloc.lto_priv.1064
     2.03%  postgres  postgres               [.] LWLockRelease

10. 多个序列(cache),64个客户端,每个客户端,每次取一条

postgres=# do language plpgsql     
declare
begin
  for i in 0..63 loop
    execute 'alter sequence seq_'||i||' cache 100000';
  end loop;
end;
    ;

tps = 1286465.725029 (including connections establishing)
tps = 1287261.097736 (excluding connections establishing)
128万/s

11. 单个序列(cache),64个客户端,每个客户端,每次取多条

tps = 455.050231 (including connections establishing)
tps = 455.313609 (excluding connections establishing)
4550万/s

瓶颈

     6.20%  postgres  postgres            [.] int8out
     5.44%  postgres  postgres            [.] SearchCatCache
     5.38%  postgres  libc-2.12.so        [.] memcpy
     4.04%  postgres  postgres            [.] AllocSetAlloc.lto_priv.1064
     3.95%  postgres  postgres            [.] printtup.lto_priv.1769

12. 多个序列(cache),64个客户端,每个客户端,每次取多条

tps = 260.329785 (including connections establishing)
tps = 260.488070 (excluding connections establishing)
2603万/s

瓶颈

nextval属于volatile函数,||拼接序列名,带来了操作符的开销
     7.31%  postgres  postgres              [.] SearchCatCache
     6.55%  postgres  postgres              [.] AllocSetAlloc.lto_priv.1064
     3.20%  postgres  libc-2.12.so          [.] __strlen_sse42
     3.13%  postgres  postgres              [.] ExecMakeFunctionResultNoSets
     3.06%  postgres  postgres              [.] nocachegetattr

小结

如果需要较为正确的perf诊断,PG的编译参数要改一下。

《PostgreSQL 源码性能诊断(perf profiling)指南》

1. 序列的cache是会话层级的,例如一次cache 100个序列值,那么一次会消耗100,如果会话退出,这100个没有被用完的序列值也会被消耗掉。

cache的好处是减少锁冲突。

如果你不能用cache,那么可以用多个序列来解决锁冲突的问题。

如果需要并轨,只需要将多个序列的步长设为一样,同时起始值错开即可并轨。

2. 单个序列值,并发批量取序列值,如何提升性能?

设置cache

3. 并发单步取序列值,不允许cache的情况下,如何提升性能?

使用多个序列,每个会话对应一个序列,多个序列要求步长一致,起始值不一致即可。

4. 从上面的测试情况来看,理论上多个序列的性能应该更好,但是测试使用了||来拼接出序列名,带来了额外的开销,所以性能并未体现有些。

以单序列,批量并行取序列值的最好成绩来估算,每秒生成4550万自增值,那么一天可以生成3.9万亿个自增值,你想让它成为瓶颈都难。

Flag Counter

digoal’s 大量PostgreSQL文章入口