PostgreSQL 逻辑备份一致性讲解 - Why pg_dump backup a database in consistent status

3 minute read

背景

今天一位群里的朋友在问怎么做一致的逻辑备份, 比如在hot_standby上使用pg_dump备份数据库时,是不是要先把recovery停掉再进行备份?

回答,不需要停止其他读写,使用pg_dump备份是一致的备份。

为什么呢?

因为pg_dump备份时使用的是serializable隔离级别, 如在hot_standby上面备份将使用repeatable read隔离级别.(目前hot_standby数据库不支持serializable隔离级别). 使用这两种隔离级别进行备份,整个事务从开始时就是看到的一个数据库的snapshot。

pg_dump代码中开头就有一段介绍如下:

 *      pg_dump will read the system catalogs in a database and dump out a  
 *      script that reproduces the schema in terms of SQL that is understood  
 *      by PostgreSQL  
 *  
 *      Note that pg_dump runs in a transaction-snapshot mode transaction,  
 *      so it sees a consistent snapshot of the database including system  
 *      catalogs. However, it relies in part on various specialized backend  
 *      functions like pg_get_indexdef(), and those things tend to run on  
 *      SnapshotNow time, ie they look at the currently committed state.  So  
 *      it is possible to get 'cache lookup failed' error if someone  
 *      performs DDL changes while a dump is happening. The window for this  
 *      sort of thing is from the acquisition of the transaction snapshot to  
 *      getSchemaData() (when pg_dump acquires AccessShareLock on every  
 *      table it intends to dump). It isn't very large, but it can happen.  

另外代码中还有一段关于启动事务的如下,

        /*  
         * Start transaction-snapshot mode transaction to dump consistent data.  
         */  
        do_sql_command(g_conn, "BEGIN");  
        if (g_fout->remoteVersion >= 90100)  
        {  
                if (serializable_deferrable)  
                        do_sql_command(g_conn,  
                                                   "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, "  
                                                   "READ ONLY, DEFERRABLE");  
                else  
                        do_sql_command(g_conn,  
                                                   "SET TRANSACTION ISOLATION LEVEL REPEATABLE READ");  
        }  
        else  
                do_sql_command(g_conn, "SET TRANSACTION ISOLATION LEVEL SERIALIZABLE");  

接下来我们测试一下:

1. SESSION A:

postgres@db-172-16-3-33-> pg_dump -f ./digoal.dmp -F p -v -C -E UTF8 -h 127.0.0.1 -U postgres digoal  
pg_dump: reading schemas  
pg_dump: reading user-defined tables  
pg_dump: reading extensions  
pg_dump: reading user-defined functions  
pg_dump: reading user-defined types  
pg_dump: reading procedural languages  
pg_dump: reading user-defined aggregate functions  
pg_dump: reading user-defined operators  
pg_dump: reading user-defined operator classes  
pg_dump: reading user-defined operator families  
pg_dump: reading user-defined text search parsers  
pg_dump: reading user-defined text search templates  
pg_dump: reading user-defined text search dictionaries  
pg_dump: reading user-defined text search configurations  
pg_dump: reading user-defined foreign-data wrappers  
pg_dump: reading user-defined foreign servers  
pg_dump: reading default privileges  
pg_dump: reading user-defined collations  
pg_dump: reading user-defined conversions  
pg_dump: reading type casts  
pg_dump: reading table inheritance information  
pg_dump: reading rewrite rules  
pg_dump: finding extension members  
pg_dump: finding inheritance relationships  
pg_dump: reading column info for interesting tables  
pg_dump: finding the columns and types of table "restore_point"  
pg_dump: finding default expressions of table "restore_point"  
pg_dump: finding the columns and types of table "t1"  
pg_dump: finding the columns and types of table "t2"  
pg_dump: finding the columns and types of table "t3"  
pg_dump: finding the columns and types of table "a1"  
pg_dump: finding the columns and types of table "a2"  
pg_dump: flagging inherited columns in subtables  
pg_dump: reading indexes  
pg_dump: reading indexes for table "t1"  
pg_dump: reading indexes for table "t2"  
pg_dump: reading constraints  
pg_dump: reading triggers  
pg_dump: reading large objects  
pg_dump: reading dependency data  
pg_dump: saving encoding = UTF8  
pg_dump: saving standard_conforming_strings = on  
pg_dump: saving database definition  
pg_dump: creating DATABASE digoal  
pg_dump: connecting to new database "digoal"  
pg_dump: creating SCHEMA public  
pg_dump: creating COMMENT SCHEMA public  
pg_dump: creating EXTENSION plpgsql  
pg_dump: creating COMMENT EXTENSION plpgsql  
pg_dump: creating TABLE a1  
pg_dump: creating TABLE a2  
pg_dump: creating TABLE restore_point  
pg_dump: creating SEQUENCE restore_point_id_seq  
pg_dump: creating SEQUENCE OWNED BY restore_point_id_seq  
pg_dump: executing SEQUENCE SET restore_point_id_seq  
pg_dump: creating TABLE t1  
pg_dump: creating TABLE t2  
pg_dump: creating TABLE t3  
pg_dump: creating DEFAULT id  
pg_dump: restoring data for table "a1"  
pg_dump: dumping contents of table a1  

到这个点的时候,pg_dump还未结束,在SESSION B执行:

digoal=# begin;  
BEGIN  
digoal=# drop table t2;  
  

处于waiting状态. 这个要等到SESSION A的备份完成.

或者在SESSION B执行:

digoal=# begin;  
BEGIN  
digoal=# truncate table t2;  
  

也处于waiting状态. 这个也要等到SESSION A的备份完成.

但是执行创建新表是可以的, 给已有表创建索引也是可以的. 但是删除已有表的索引需要等待.

既然会堵塞对已有表的一些DDL操作, 那么会不会堵塞对表的DML操作呢?

在备份过程中我在SESSION B执行如下 :

digoal=> delete from t1 where id=10000000;  
DELETE 1  
digoal=> delete from t2 where id=10000000;  
DELETE 1  
digoal=> delete from t2 where id=1000000;  
DELETE 1  
digoal=> delete from t1 where id=1000000;  
DELETE 1  

可以执行DML,也就是说pg_dump不会堵塞对表的DML操作。

那么从备份开始到备份结束是不是处于同一个事务中呢, 我们看一下备份这个进程的事务开始时间和QUERY开始时间.

digoal=# select usename,datname,current_query,xact_start,query_start from pg_stat_activity;  
 usename  | datname |                                   current_query                                    |          xact_start        
     |          query_start            
----------+---------+------------------------------------------------------------------------------------+--------------------------  
-----+-------------------------------  
 postgres | digoal  | select usename,datname,current_query,xact_start,query_start from pg_stat_activity; | 2012-05-11 15:39:44.92349  
3+08 | 2012-05-11 15:39:44.923493+08  
 postgres | digoal  | COPY public.a1 (id) TO stdout;                                                     | 2012-05-11 15:39:43.59293  
4+08 | 2012-05-11 15:39:43.687405+08  
(2 rows)  

结束前再执行查询看看xact_start时间未变化, 说明pg_dump备份是在一个事务中完成的.

digoal=# select usename,datname,current_query,xact_start,query_start from pg_stat_activity;  
 usename  | datname |                                   current_query                                    |          xact_start        
     |          query_start            
----------+---------+------------------------------------------------------------------------------------+--------------------------  
-----+-------------------------------  
 postgres | digoal  | select usename,datname,current_query,xact_start,query_start from pg_stat_activity; | 2012-05-11 15:39:52.60046  
3+08 | 2012-05-11 15:39:52.600463+08  
 postgres | digoal  | COPY public.t3 (id) TO stdout;                                                     | 2012-05-11 15:39:43.59293  
4+08 | 2012-05-11 15:39:51.891716+08  
(2 rows)  

备份结束后,我们看一下备份的数据是不是一致的,换句话说,我执行的那几个DML操作,删除了4条记录,这4条记录在备份开始时是存在的,如果是一致的备份,这个4条记录应该在备份的记录里面存在。

我在digoal.dmp这个文件里面找到了这4条记录,再一次验证pg_dump的备份是一致的。

小结

1. 使用pg_dump备份的时候,这个过程产生的垃圾数据不可以被VACUUM掉,所以一个很长时间的pg_dump操作会带来更多的垃圾数据.换句话说表的膨胀。

Flag Counter

digoal’s 大量PostgreSQL文章入口