PostgreSQL Selectivity Tablespace PITR - 部分表空间恢复
背景
今天在群里面一位老兄在问PostgreSQL能不能做基于数据库实例的PITR.
这位老兄可能是Oracle的用户, 因为ORACLE是支持基于表空间的RMAN恢复的.
至于为什么要这么做, 显然例如我要恢复的数据库只在某个表空间里面, 而其他表空间的数据我是不关心的. 基于表空间的恢复在恢复过程中占用更小的物理存储, 并且能加快恢复速度. 所以还是很有市场需求的.
理论上来讲, PG是不支持这么做的. 但是可以通过仅恢复要恢复的数据库实例涉及的表空间们来模拟基于数据库实例的恢复.
要成功的实施PostgreSQL的基于表空间的数据库PITR, 必须满足几个条件.
1. 有完整的基础数据文件备份和归档文件备份.
2. 恢复时pg_root, 也就是源数据是必选的(即initdb时指定的$PGDATA).
3. 如果要恢复的表空间不是pg_default表空间, 那么这个表空间的基础备份需要作为恢复材料放到指定位置. 而其他不需要的表空间不需要拷贝到指定位置.
实例讲解
下面我通过例子来讲解一下,
1. 我的测试库已有的表空间如下,
postgres-# \db
List of tablespaces
Name | Owner | Location
------------+----------+---------------------------------------------
digoal | postgres | /pgdata/digoal/1921/data02/pg_tbs/digoal
digoal_01 | postgres | /pgdata/digoal/1921/data03/pg_tbs/digoal_01
digoal_02 | postgres | /pgdata/digoal/1921/data04/pg_tbs/digoal_02
digoal_03 | postgres | /pgdata/digoal/1921/data05/pg_tbs/digoal_03
digoal_04 | postgres | /pgdata/digoal/1921/data06/pg_tbs/digoal_04
pg_default | postgres |
pg_global | postgres |
(7 rows)
2. 新建两个数据库test1和test2 , 分布使用digoal_01和digoal_02作为默认表空间. 我的目的是只恢复test1库, 而不关心test2库的数据.
postgres=# create database test1 tablespace digoal_01;
CREATE DATABASE
postgres=# create database test2 tablespace digoal_02;
CREATE DATABASE
连接到test1库
postgres=# \c test1
You are now connected to database "test1" as user "postgres".
建立测试表
test1=# create table test (id int);
CREATE TABLE
插入一些测试数据
连接到test2库
test1=# \c test2
You are now connected to database "test2" as user "postgres".
建立测试表
test2=# create table test (id int);
CREATE TABLE
插入测试数据
开始备份
test2=# select pg_start_backup('test');
pg_start_backup
-----------------
1C/B8000020
(1 row)
创建备份目录
postgres@db-172-16-3-150-> cd /pgdata/digoal/1921/data01
postgres@db-172-16-3-150-> mkdir -p pg_bak/pg_root
进入$PGDATA, 准备备份
postgres@db-172-16-3-150-> cd $PGDATA
postgres@db-172-16-3-150-> ll
total 100K
-rw------- 1 postgres postgres 176 Apr 26 11:03 backup_label
-rw------- 1 postgres postgres 188 Mar 13 09:59 backup_label.old
drwx------ 5 postgres postgres 4.0K Apr 25 15:16 base
drwx------ 2 postgres postgres 4.0K Apr 26 11:00 global
drwx------ 2 postgres postgres 4.0K Mar 13 13:13 pg_clog
-rw------- 1 postgres postgres 4.5K Mar 13 09:59 pg_hba.conf
-rw------- 1 postgres postgres 1.6K Mar 13 09:59 pg_ident.conf
drwx------ 4 postgres postgres 4.0K Mar 13 09:59 pg_multixact
drwx------ 2 postgres postgres 4.0K Apr 26 10:28 pg_notify
drwx------ 2 postgres postgres 4.0K Mar 13 09:59 pg_serial
drwx------ 2 postgres postgres 4.0K Apr 26 10:28 pg_stat_tmp
drwx------ 2 postgres postgres 4.0K Mar 19 11:32 pg_subtrans
drwx------ 2 postgres postgres 4.0K Apr 26 10:40 pg_tblspc
drwx------ 2 postgres postgres 4.0K Mar 13 09:59 pg_twophase
-rw------- 1 postgres postgres 4 Mar 13 09:59 PG_VERSION
lrwxrwxrwx 1 postgres postgres 34 Mar 13 10:02 pg_xlog -> /pgdata/digoal/1921/data01/pg_xlog
-rw------- 1 postgres postgres 19K Apr 26 10:58 postgresql.conf
-rw------- 1 postgres postgres 24 Apr 26 10:28 postmaster.opts
-rw------- 1 postgres postgres 87 Apr 26 10:28 postmaster.pid
-rw-r--r-- 1 postgres postgres 4.8K Mar 13 08:12 recovery.done
备份pg_xlog和pg_tblspc以外的所有文件
postgres@db-172-16-3-150-> cp -r backup_label backup_label.old base global pg_clog pg_hba.conf pg_ident.conf pg_multixact pg_notify pg_serial pg_stat_tmp pg_subtrans pg_twophase PG_VERSION postgresql.conf postmaster.opts recovery.done /pgdata/digoal/1921/data01/pg_bak/pg_root/
记录下pg_tblspc的软链接信息
postgres@db-172-16-3-150-> cd pg_tblspc/
postgres@db-172-16-3-150-> ll
total 0
lrwxrwxrwx 1 postgres postgres 40 Apr 26 10:40 16385 -> /pgdata/digoal/1921/data02/pg_tbs/digoal
lrwxrwxrwx 1 postgres postgres 43 Mar 13 09:59 16682 -> /pgdata/digoal/1921/data03/pg_tbs/digoal_01
lrwxrwxrwx 1 postgres postgres 43 Mar 13 09:59 16684 -> /pgdata/digoal/1921/data04/pg_tbs/digoal_02
lrwxrwxrwx 1 postgres postgres 43 Mar 13 09:59 16685 -> /pgdata/digoal/1921/data05/pg_tbs/digoal_03
lrwxrwxrwx 1 postgres postgres 43 Mar 13 09:59 16686 -> /pgdata/digoal/1921/data06/pg_tbs/digoal_04
备份test1库的默认表空间, 其他表空间我就不备份了, 因为我不打算还原它们.
postgres@db-172-16-3-150-> cp -r /pgdata/digoal/1921/data03/pg_tbs/digoal_01 /pgdata/digoal/1921/data01/pg_bak/
关闭备份标签.
postgres=# select * from pg_stop_backup();
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
pg_stop_backup
----------------
1C/B80001F8
(1 row)
再次在test1和test2库插入测试数据, 等会恢复后我们看看基础备份以后产生的数据会不会正常恢复.
postgres=# \c test1
You are now connected to database "test1" as user "postgres".
test1=# insert into test select generate_series(1,1000000);
INSERT 0 1000000
test1=# insert into test select generate_series(1,1000000);
INSERT 0 1000000
test1=# insert into test select generate_series(1,1000000);
INSERT 0 1000000
test1=# select count(*) from test;
count
----------
20000000
(1 row)
test1=# \c test2
test2=# insert into test select generate_series(1,1000000);
INSERT 0 1000000
test2=# insert into test select generate_series(1,1000000);
INSERT 0 1000000
test2=# insert into test select generate_series(1,1000000);
INSERT 0 1000000
...........................................
...........................................
test2=# select count(*) from test;
count
----------
20000000
(1 row)
test2=# checkpoint;
CHECKPOINT
停止数据库
postgres@db-172-16-3-150-> pg_ctl stop -m fast
waiting for server to shut down...... done
server stopped
以下就是关键步骤了, 基于表空间的PITR.
把老的pg_root目录改个名字
postgres@db-172-16-3-150-> cd $PGDATA/..
postgres@db-172-16-3-150-> ll
total 24K
drwx------ 2 postgres postgres 16K Mar 13 09:56 lost+found
drwx------ 12 postgres postgres 4.0K Apr 26 11:18 pg_root
drwxrwxr-x 3 postgres postgres 4.0K Mar 13 09:58 pg_tbs
postgres@db-172-16-3-150-> mv pg_root pg_root_old
还原前面的pg_root基础备份
postgres@db-172-16-3-150-> cp -r /pgdata/digoal/1921/data01/pg_bak/pg_root ./
postgres@db-172-16-3-150-> ll
total 28K
drwx------ 2 postgres postgres 16K Mar 13 09:56 lost+found
drwxrwxr-x 11 postgres postgres 4.0K Apr 26 11:21 pg_root
drwx------ 12 postgres postgres 4.0K Apr 26 11:18 pg_root_old
drwxrwxr-x 3 postgres postgres 4.0K Mar 13 09:58 pg_tbs
修改权限为700
postgres@db-172-16-3-150-> chmod -R 700 pg_root
新建pg_xlog和pg_tblspc并修改权限为700
postgres@db-172-16-3-150-> cd pg_root
mkdir pg_xlog
chmod 700 pg_xlog
mkdir pg_tblspc
chmod 700 *
把要恢复的表空间的软链接先做好, 其他软链接不做
cd pg_tblspc
ln -s /pgdata/digoal/1921/data03/pg_tbs/digoal_01 16682
以下选用第3点方法.
注意, 别指望以下能帮你节省掉其他不需要的表空间的恢复过程中产生的数据(即基础备份以来对这些表空间更改过或新增的BLOCK).
因为会报错, 我后面会贴出错误报告.
ln -s /dev/null 16385
ln -s /dev/null 16684
ln -s /dev/null 16685
ln -s /dev/null 16686
比较靠谱的方法是 :
1. 找到类似黑洞的文件系统, 建的文件在里面都是虚假的.
2. 弄个程序把过程中产生的大文件至空.
3. 如果其他表空间在此期间没有产生太多的数据就让他去吧
4. 修改PG源码, 让它支持基于表空间或数据库实例级别的PITR.
基于数据库实例级别的PITR注意事项, 为了避免产生不一致数据, 这种恢复可能丢失跨库事务的一致性.
基于表空间的恢复注意事项, 可能丢失同一数据库的事务一致性, 因为同一个数据库中可以有多个表空间. 所以基于表空间的PITR, 应该把要恢复的表空间以及该表空间所在库的其他表空间也进行PITR.
我这里为了测试方便选择了第3点.
接下来把需要恢复的表空间的目录改名,
postgres@db-172-16-3-150-> cd /pgdata/digoal/1921/data03/pg_tbs
postgres@db-172-16-3-150-> mv digoal_01 digoal_01_old
把需要恢复的digoal_01表空间的备份文件拷贝到它所在的目录, 并修改权限700
postgres@db-172-16-3-150-> cp -r /pgdata/digoal/1921/data01/pg_bak/digoal_01 ./
postgres@db-172-16-3-150-> chmod -R 700 digoal_01
新建recovery.conf文件, 告诉PostgreSQL要恢复了.
postgres@db-172-16-3-150-> cd $PGDATA
postgres@db-172-16-3-150-> vi recovery.conf
restore_command = 'cp /pgdata/digoal/1921/data01/pg_arch/%f %p' # e.g. 'cp /mnt/server/archivedir/%f %p'
recovery_target_timeline = 'latest'
postgres@db-172-16-3-150-> chmod 700 recovery.conf
启动数据库进行恢复
pg_ctl start -D $PGDATA
恢复后可以看到pg_tblspc中会新建一个目录16684, 其实就是digoal_02的表空间目录.
lrwxrwxrwx 1 postgres postgres 43 Mar 13 09:59 16684 -> /pgdata/digoal/1921/data04/pg_tbs/digoal_02
因为恢复过程中的WAL中包含了digoal_02这个表空间的XLOG日志(还记得我前面在test2库中插入的操作吗?), 恢复时会主动创建这些文件.
postgres@db-172-16-3-150-> cd $PGDATA/pg_tblspc
postgres@db-172-16-3-150-> ll
total 4.0K
lrwxrwxrwx 1 postgres postgres 43 Apr 26 12:41 16682 -> /pgdata/digoal/1921/data03/pg_tbs/digoal_01
drwx------ 3 postgres postgres 4.0K Apr 26 12:42 16684
postgres@db-172-16-3-150-> du -sh *
0 16682
692M 16684
恢复结束后就可以连到test1库了, 检查数据是否正确.
postgres=# \c test1
You are now connected to database "test1" as user "postgres".
test1=# select count(*) from test;
count
----------
20000000
(1 row)
从验证结果来看, PG的基于表空间的PITR是可行的.
小结
恢复过程中过滤掉不需要的表空间变更信息, 就不会产生那一堆的其他表空间的变更数据了, PostgreSQL内核中加入基于数据库或表空间的恢复不会太遥远.
附
使用/dev/null做黑洞的失败记录
postgres@db-172-16-3-150-> pg_ctl start
LOG: could not open temporary-files directory "pg_tblspc/16685/PG_9.1_201105231/pgsql_tmp": Not a directory
LOG: could not open tablespace directory "pg_tblspc/16685/PG_9.1_201105231": Not a directory
LOG: could not open temporary-files directory "pg_tblspc/16686/PG_9.1_201105231/pgsql_tmp": Not a directory
LOG: could not open tablespace directory "pg_tblspc/16686/PG_9.1_201105231": Not a directory
LOG: could not open temporary-files directory "pg_tblspc/16684/PG_9.1_201105231/pgsql_tmp": Not a directory
LOG: could not open tablespace directory "pg_tblspc/16684/PG_9.1_201105231": Not a directory
LOG: could not open temporary-files directory "pg_tblspc/16385/PG_9.1_201105231/pgsql_tmp": Not a directory
LOG: could not open tablespace directory "pg_tblspc/16385/PG_9.1_201105231": Not a directory
到最后会恢复失败,csv日志如下 :
postgres@db-172-16-3-150-> cd /var/applog/pg_log/digoal/1921/
2012-04-26 12:28:52.916 CST,,,24354,,4f98cef9.5f22,20,,2012-04-26 12:28:41 CST,1/0,0,FATAL,42809,"could not stat directory ""pg_tblspc/16684/PG_9.1_201105231/3944339"": Not a directory",,,,,"xlog redo insert: rel 16684/3944339/3944343; tid 22123/203",,,,""
2012-04-26 12:28:52.921 CST,,,24346,,4f98cef2.5f1a,2,,2012-04-26 12:28:34 CST,,0,LOG,00000,"startup process (PID 24354) exited with exit code 1",,,,,,,,,""
2012-04-26 12:28:52.921 CST,,,24346,,4f98cef2.5f1a,3,,2012-04-26 12:28:34 CST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,""