数据库平滑switchover的要素 - 会话资源漂移

8 minute read

背景

数据库迁移、切换是很普遍的一个话题,但是大多数的方案,对用户来说都是有感知的,所以用户的体验并不好。

例如用户使用了绑定变量语句,主备角色切换后绑定变量语句没有了,会导致调用报错。

我们需要维护主库的硬件,那么可以在中间件层面,将主备数据库的角色进行平滑调换,维护好硬件,再平滑的调换回来。

数据库主备切换时,如何做到会话级无感知?首先我们要了解会话中都有哪些内容,哪些内容是需要随角色切换一起迁移的。从而做到用户无感知。

(本文HA指中间件层级的HA,并非APP直连数据库,VIP切换的那种HA。)

简单的switchover过程举例:

等待所有会话的事务结束,会话都处于idle状态,冻结会话,不允许提交任何SQL,然后进行角色切换,并将每个会话的资源状态平移。

会话资源状态

会话中有些什么状态?通过discard这条SQL就可以了解。

https://www.postgresql.org/docs/10/static/sql-discard.html

DISCARD — discard session state  

discard all相当于执行如下

SET SESSION AUTHORIZATION DEFAULT;  
RESET ALL;  
DEALLOCATE ALL;  
CLOSE ALL;  
UNLISTEN *;  
SELECT pg_advisory_unlock_all();  
DISCARD PLANS;  
DISCARD SEQUENCES;  
DISCARD TEMP;  

会话资源中目前可能包含如下(每个PG版本可能有些许差异):

SESSOIN角色、参数设置、绑定变量语句、游标、异步消息监听、AD锁、序列、临时表等。

下面介绍一下每种资源的查询方法,以及在新的主库上进行资源复原的方法。

一、SESSION AUTHORIZATION

超级用户可以将会话用户设置为其他用户,普通用户无权切换用户。

当前用户为postgres,设置SESSION AUTHORIZATION为test

postgres=# set SESSION AUTHORIZATION test;  
SET  
postgres=> show SESSION AUTHORIZATION;  
 session_authorization   
-----------------------  
 test  
(1 row)  
postgres=> select usename from pg_stat_activity where pid=pg_backend_pid();  
 usename    
----------  
 postgres  
(1 row)  

查询方法

postgres=> show SESSION AUTHORIZATION;  
 session_authorization   
-----------------------  
 test  
(1 row)  
postgres=> select usename from pg_stat_activity where pid=pg_backend_pid();  
 usename    
----------  
 postgres  
(1 row)  

复原方法

当pg_stat_activity.usename不等于SESSION AUTHORIZATION时,需要通过如下方法复原它。

postgres=# set SESSION AUTHORIZATION test;  
SET  

二、参数

PostgreSQL的一些参数是允许用户在会话、事务中进行设置的。如下context in (‘user’,’superuser’)时,用户可以在会话或事务中设置。

postgres=# select distinct context from pg_settings ;  
      context        
-------------------  
 superuser-backend  
 sighup  
 superuser  
 postmaster  
 internal  
 user  
 backend  
(7 rows)  

设置例子

postgres=> set tcp_keepalives_count=1;  
SET  
  
source表示参数来自哪里的设置,如果来自会话或事务级设置,则显示session  
  
postgres=> select distinct source from pg_settings ;  
        source          
----------------------  
 session  
 default  
 command line  
 configuration file  
 client  
 override  
 environment variable  
(7 rows)  
  
重置方法  
  
postgres=# reset tcp_keepalives_count;  
RESET  
postgres=# select name,setting,reset_val,source,context from pg_settings where name='tcp_keepalives_count';  
         name         | setting | reset_val | source  | context   
----------------------+---------+-----------+---------+---------  
 tcp_keepalives_count | 3       | 0         | default | user  
(1 row)  

查询方法

postgres=# select name,setting,reset_val,source,context from pg_settings where source ='session' and setting<>reset_val;  
         name         | setting | reset_val | source  | context   
----------------------+---------+-----------+---------+---------  
 tcp_keepalives_count | 1       | 0         | session | user  
(1 row)  

复原方法

postgres=> set tcp_keepalives_count=1;  
SET  

三、绑定变量语句

使用绑定变量可以减少数据库的parser, plan开销,提高高并发的查询性能,同时避免SQL注入。

不同的驱动,有不同的使用方法。

https://www.postgresql.org/docs/10/static/libpq-exec.html#libpq-exec-main

使用绑定变量的例子

CREATE OR REPLACE FUNCTION public.getps()  
 RETURNS void  
 LANGUAGE plpgsql  
 STRICT  
AS $function$  
declare   
  rec record;  
begin  
  for rec in select t from pg_prepared_statements t loop  
    raise notice '%', (rec.*)::text;  
  end loop;  
end;  
$function$;  
  
  
create table ps(id int primary key, info text);  
insert into ps select generate_series(1,10000), 'test';  
  
  
vi test.sql  
  
\set id random(1,10000)  
select * from ps where id=:id;  
select getps();  
  
使用绑定变量的模式,调用SQL  
  
pgbench -M prepared -n -r -P 1 -f ./test.sql -c 1 -j 1  
  
NOTICE:  ("(P0_1,""select * from ps where id=$1;"",""2017-06-19 15:22:21.821454+08"",{integer},f)")  
NOTICE:  ("(P0_2,""select getps();"",""2017-06-19 15:22:21.822045+08"",{},f)")  
.....  

查询方法

postgres=# \d pg_prepared_statements  
                  View "pg_catalog.pg_prepared_statements"  
     Column      |           Type           | Collation | Nullable | Default   
-----------------+--------------------------+-----------+----------+---------  
 name            | text                     |           |          |   
 statement       | text                     |           |          |   
 prepare_time    | timestamp with time zone |           |          |   
 parameter_types | regtype[]                |           |          |   
 from_sql        | boolean                  |           |          |   
  
postgres=# select * from pg_prepared_statements;  
 name | statement | prepare_time | parameter_types | from_sql   
------+-----------+--------------+-----------------+----------  
(0 rows)  
  
postgres=# prepare a(int) as select * from ps where id=$1;  
PREPARE  
postgres=# execute a(1);  
 id | info   
----+------  
  1 | test  
(1 row)  
  
postgres=# select * from pg_prepared_statements;  
 name |                    statement                    |         prepare_time         | parameter_types | from_sql   
------+-------------------------------------------------+------------------------------+-----------------+----------  
 a    | prepare a(int) as select * from ps where id=$1; | 2017-06-19 15:23:24.68617+08 | {integer}       | t  
(1 row)  

复原方法

不同的驱动,复原方法不一样。

请根据pg_prepared_statements的内容进行复原。

NOTICE:  ("(P0_1,""select * from ps where id=$1;"",""2017-06-19 15:22:21.821454+08"",{integer},f)")  
  
PGresult *PQprepare(PGconn *conn,  
                    const char *stmtName,  
                    const char *query,  
                    int nParams,  
                    const Oid *paramTypes);  

四、游标

如果我们使用了hold选项,那么游标不会随事务结束而关闭,因此在迁移会话时也需要注意是否有这类游标。

postgres=# \h declare  
Command:     DECLARE  
Description: define a cursor  
Syntax:  
DECLARE name [ BINARY ] [ INSENSITIVE ] [ [ NO ] SCROLL ]  
    CURSOR [ { WITH | WITHOUT } HOLD ] FOR query  
  
postgres=# begin;  
BEGIN  
postgres=# declare cur cursor with hold for select * from ps where id=1;  
DECLARE CURSOR  
postgres=# end;  
COMMIT  
postgres=# select * from pg_cursors ;  
 name |                           statement                           | is_holdable | is_binary | is_scrollable |         creation_time           
------+---------------------------------------------------------------+-------------+-----------+---------------+-------------------------------  
 cur  | declare cur cursor with hold for select * from ps where id=1; | t           | f         | t             | 2017-06-19 15:27:58.604183+08  
(1 row)  
  
postgres=# close cur;  
CLOSE CURSOR  
postgres=# select * from pg_cursors ;  
 name | statement | is_holdable | is_binary | is_scrollable | creation_time   
------+-----------+-------------+-----------+---------------+---------------  
(0 rows)  
  

查询方法

postgres=# select * from pg_cursors ;  
 name |                           statement                           | is_holdable | is_binary | is_scrollable |         creation_time           
------+---------------------------------------------------------------+-------------+-----------+---------------+-------------------------------  
 cur  | declare cur cursor with hold for select * from ps where id=1; | t           | f         | t             | 2017-06-19 15:27:58.604183+08  
(1 row)  

复原方法

postgres=# declare cur cursor with hold for select * from ps where id=1;  
DECLARE CURSOR  

五、异步消息监听

PostgreSQL的异步消息,可以通过异步消息,推送事件。例子如下:

https://www.postgresql.org/docs/10/static/libpq-notify.html

https://www.postgresql.org/docs/10/static/libpq-example.html#libpq-example-2

postgres=# listen a;  
LISTEN  
postgres=# notify a , 'hello i am digoal';  
NOTIFY  
Asynchronous notification "a" with payload "hello i am digoal" received from server process with PID 21412.  

查询方法

查询已经开启了哪些异步监听

postgres=# select pg_listening_channels();  
 pg_listening_channels   
-----------------------  
 a  
(1 row)  

复原方法

postgres=# listen a;  
LISTEN  

六、advisory lock

advisory lock可以用于秒杀、解决高并发锁冲突问题、解决无空洞序列值问题等。

《PostgreSQL 使用advisory lock实现行级读写堵塞》

《PostgreSQL 无缝自增ID的实现 - by advisory lock》

《PostgreSQL 使用advisory lock或skip locked消除行锁冲突, 提高几十倍并发更新效率》

《聊一聊双十一背后的技术 - 不一样的秒杀技术, 裸秒》

postgres=# \df *.*advis*  
                                        List of functions  
   Schema   |               Name               | Result data type | Argument data types |  Type    
------------+----------------------------------+------------------+---------------------+--------  
 pg_catalog | pg_advisory_lock                 | void             | bigint              | normal  
 pg_catalog | pg_advisory_lock                 | void             | integer, integer    | normal  
 pg_catalog | pg_advisory_lock_shared          | void             | bigint              | normal  
 pg_catalog | pg_advisory_lock_shared          | void             | integer, integer    | normal  
 pg_catalog | pg_advisory_unlock               | boolean          | bigint              | normal  
 pg_catalog | pg_advisory_unlock               | boolean          | integer, integer    | normal  
 pg_catalog | pg_advisory_unlock_all           | void             |                     | normal  
 pg_catalog | pg_advisory_unlock_shared        | boolean          | bigint              | normal  
 pg_catalog | pg_advisory_unlock_shared        | boolean          | integer, integer    | normal  
 pg_catalog | pg_advisory_xact_lock            | void             | bigint              | normal  
 pg_catalog | pg_advisory_xact_lock            | void             | integer, integer    | normal  
 pg_catalog | pg_advisory_xact_lock_shared     | void             | bigint              | normal  
 pg_catalog | pg_advisory_xact_lock_shared     | void             | integer, integer    | normal  
 pg_catalog | pg_try_advisory_lock             | boolean          | bigint              | normal  
 pg_catalog | pg_try_advisory_lock             | boolean          | integer, integer    | normal  
 pg_catalog | pg_try_advisory_lock_shared      | boolean          | bigint              | normal  
 pg_catalog | pg_try_advisory_lock_shared      | boolean          | integer, integer    | normal  
 pg_catalog | pg_try_advisory_xact_lock        | boolean          | bigint              | normal  
 pg_catalog | pg_try_advisory_xact_lock        | boolean          | integer, integer    | normal  
 pg_catalog | pg_try_advisory_xact_lock_shared | boolean          | bigint              | normal  
 pg_catalog | pg_try_advisory_xact_lock_shared | boolean          | integer, integer    | normal  
(21 rows)  

advisory lock分为事务级锁和会话级锁,在会话迁移时,会话处于IDLE状态, 只需要关注会话级锁。

postgres=# select pg_try_advisory_lock(1);  
 pg_try_advisory_lock   
----------------------  
 t  
(1 row)  

查询方法

postgres=# select * from pg_locks where locktype='advisory' and pid=pg_backend_pid();  
 locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction |  pid  |     mode      | granted | fastpath   
----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+---------------+---------+----------  
 advisory |    13158 |          |      |       |            |               |       0 |     1 |        1 | 3/123301864        | 21412 | ExclusiveLock | t       | f  
(1 row)  

复原方法

注意复原时,需要指定是否为shared lock。

postgres=# select pg_try_advisory_lock(1);  
 pg_try_advisory_lock   
----------------------  
 t  
(1 row)  

七、序列

序列使用后,会在会话中存储最后一次使用的序列的VAL,以及每个序列被使用后的最后一次获取的VAL。

postgres=# create sequence seq1;  
CREATE SEQUENCE  
  
没有被调用的序列,返回错误。  
postgres=# select currval('seq');  
ERROR:  currval of sequence "seq" is not yet defined in this session  
  
没有调用过任何序列,返回错误。  
postgres=# select lastval();  
ERROR:  lastval is not yet defined in this session  
  
调用序列  
postgres=# select nextval('seq1');  
 nextval   
---------  
       1  
(1 row)  
  
返回会话中指定序列最后一次调用的VAL  
postgres=# select currval('seq1');  
 currval   
---------  
       1  
(1 row)  
  
返回整个会话中最后一次序列调用的VAL  
postgres=# select lastval();  
 lastval   
---------  
       1  
(1 row)  

查询方法

postgres=# select * from seq1;  
 last_value | log_cnt | is_called   
------------+---------+-----------  
          1 |      32 | t  
(1 row)  
postgres=# select nextval('seq1');  
 nextval   
---------  
       2  
(1 row)  
  
postgres=# select * from seq1;  
 last_value | log_cnt | is_called   
------------+---------+-----------  
          2 |      31 | t  
(1 row)  

复原方法

序列虽然可以设置当前值,但是会影响全局,强烈建议不要这么做。

目前没有好的方法复原序列在会话中的lastval。

八、临时表

postgres=# create temp table tmp(id int, info text);  
CREATE TABLE  
  
postgres=# select oid,relname from pg_class where relpersistence ='t' and relkind='r' and pg_table_is_visible(oid);  
  oid  | relname   
-------+---------  
 44804 | tmp  
(1 row)  

查询方法

postgres=# select oid,relname from pg_class where relpersistence ='t' and relkind='r' and pg_table_is_visible(oid);  
  oid  | relname   
-------+---------  
 44804 | tmp  
(1 row)  
  
********* QUERY **********  
SELECT c.oid,  
  n.nspname,  
  c.relname  
FROM pg_catalog.pg_class c  
     LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace  
WHERE c.relname ~ '^(tmp)$'  
  AND pg_catalog.pg_table_is_visible(c.oid)  
ORDER BY 2, 3;  
**************************  
  
********* QUERY **********  
SELECT c.relchecks, c.relkind, c.relhasindex, c.relhasrules, c.relhastriggers, c.relrowsecurity, c.relforcerowsecurity, c.relhasoids, pg_catalog.array_to_string(c.reloptions || array(select 'toast.' || x from pg_catalog.unnest(tc.reloptions) x), ', ')  
, c.reltablespace, CASE WHEN c.reloftype = 0 THEN '' ELSE c.reloftype::pg_catalog.regtype::pg_catalog.text END, c.relpersistence, c.relreplident  
FROM pg_catalog.pg_class c  
 LEFT JOIN pg_catalog.pg_class tc ON (c.reltoastrelid = tc.oid)  
WHERE c.oid = '44810';  
**************************  
  
********* QUERY **********  
SELECT a.attname,  
  pg_catalog.format_type(a.atttypid, a.atttypmod),  
  (SELECT substring(pg_catalog.pg_get_expr(d.adbin, d.adrelid) for 128)  
   FROM pg_catalog.pg_attrdef d  
   WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum AND a.atthasdef),  
  a.attnotnull, a.attnum,  
  (SELECT c.collname FROM pg_catalog.pg_collation c, pg_catalog.pg_type t  
   WHERE c.oid = a.attcollation AND t.oid = a.atttypid AND a.attcollation <> t.typcollation) AS attcollation, a.attidentity,  
  NULL AS indexdef,  
  NULL AS attfdwoptions,  
  a.attstorage,  
  CASE WHEN a.attstattarget=-1 THEN NULL ELSE a.attstattarget END AS attstattarget, pg_catalog.col_description(a.attrelid, a.attnum)  
FROM pg_catalog.pg_attribute a  
WHERE a.attrelid = '44810' AND a.attnum > 0 AND NOT a.attisdropped  
ORDER BY a.attnum;  
**************************  
  
********* QUERY **********  
SELECT inhparent::pg_catalog.regclass,          pg_get_expr(c.relpartbound, inhrelid),          pg_get_partition_constraintdef(inhrelid) FROM pg_catalog.pg_class c JOIN pg_catalog.pg_inherits ON c.oid = inhrelid WHERE c.oid = '44810' AND c.relispartition;  
**************************  
  
********* QUERY **********  
SELECT pol.polname, pol.polpermissive,  
CASE WHEN pol.polroles = '{0}' THEN NULL ELSE array_to_string(array(select rolname from pg_roles where oid = any (pol.polroles) order by 1),',') END,  
pg_catalog.pg_get_expr(pol.polqual, pol.polrelid),  
pg_catalog.pg_get_expr(pol.polwithcheck, pol.polrelid),  
CASE pol.polcmd  
WHEN 'r' THEN 'SELECT'  
WHEN 'a' THEN 'INSERT'  
WHEN 'w' THEN 'UPDATE'  
WHEN 'd' THEN 'DELETE'  
END AS cmd  
FROM pg_catalog.pg_policy pol  
WHERE pol.polrelid = '44810' ORDER BY 1;  
**************************  
  
********* QUERY **********  
SELECT oid, stxrelid::pg_catalog.regclass, stxnamespace::pg_catalog.regnamespace AS nsp, stxname,  
  (SELECT pg_catalog.string_agg(pg_catalog.quote_ident(attname),', ')  
   FROM pg_catalog.unnest(stxkeys) s(attnum)  
   JOIN pg_catalog.pg_attribute a ON (stxrelid = a.attrelid AND  
        a.attnum = s.attnum AND NOT attisdropped)) AS columns,  
  (stxkind @> '{d}') AS ndist_enabled,  
  (stxkind @> '{f}') AS deps_enabled  
FROM pg_catalog.pg_statistic_ext stat WHERE stxrelid = '44810'  
ORDER BY 1;  
**************************  
  
********* QUERY **********  
SELECT pub.pubname  
 FROM pg_catalog.pg_publication pub  
 LEFT JOIN pg_catalog.pg_publication_rel pr  
      ON (pr.prpubid = pub.oid)  
WHERE pr.prrelid = '44810' OR pub.puballtables  
ORDER BY 1;  
**************************  
  
********* QUERY **********  
SELECT c.oid::pg_catalog.regclass FROM pg_catalog.pg_class c, pg_catalog.pg_inherits i WHERE c.oid=i.inhparent AND i.inhrelid = '44810' AND c.relkind != 'p' ORDER BY inhseqno;  
**************************  
  
********* QUERY **********  
SELECT c.oid::pg_catalog.regclass, pg_get_expr(c.relpartbound, c.oid) FROM pg_catalog.pg_class c, pg_catalog.pg_inherits i WHERE c.oid=i.inhrelid AND i.inhparent = '44810' AND EXISTS (SELECT 1 FROM pg_class c WHERE c.oid = '44810') ORDER BY c.oid::pg_catalog.regclass::pg_catalog.text;  
**************************  
  
                                   Table "pg_temp_3.tmp"  
 Column |  Type   | Collation | Nullable | Default | Storage  | Stats target | Description   
--------+---------+-----------+----------+---------+----------+--------------+-------------  
 id     | integer |           |          |         | plain    |              |   
 info   | text    |           |          |         | extended |              |   

复原方法

postgres=# create temp table tmp(id int, info text);  
CREATE TABLE  

小结

主备切换时,将会话资源状态进行平移,可以大幅提升客户端的体验,使得数据库硬件维护、迁移等工作也会变得更加轻松。

中间件需要维护客户端连接和数据库会话的映射关系,平移后映射关系同样需要保持一致。

参考

https://www.postgresql.org/docs/10/static/libpq-exec.html#libpq-exec-main

https://www.postgresql.org/docs/10/static/sql-discard.html

Flag Counter

digoal’s 大量PostgreSQL文章入口