PostgreSQL standby会不会做检查点? 以及做检查点的用处

2 minute read

背景

当数据库异常关闭时,数据库关闭时来不及或根本没有机会创建一个一致性的检查点,所以需要从上一个一致性检查点开始恢复。

实际上是数据库启动时检查控制文件中的数据库状态,如果状态不是shutdown的,那么说明数据库是异常关闭的(当然我们说,除了recovery状态),总之就需要从检查点开始恢复。

src/include/catalog/pg_control.h

/*  
 * System status indicator.  Note this is stored in pg_control; if you change  
 * it, you must bump PG_CONTROL_VERSION  
 */  
typedef enum DBState  
{  
        DB_STARTUP = 0,  
        DB_SHUTDOWNED,  
        DB_SHUTDOWNED_IN_RECOVERY,  
        DB_SHUTDOWNING,  
        DB_IN_CRASH_RECOVERY,  
        DB_IN_ARCHIVE_RECOVERY,  
        DB_IN_PRODUCTION  
} DBState;  

那么问题来了,对于standby,一直是出于recovery状态的,即使停库也可能是这个状态(实际上是shutdown_in_recovery),所以STANDBY停库再起来,理论上来说又需要重上一个检查点开始恢复WAL,即使这个数据库已经恢复到了最新的WAL。

为了提高性能,缩短recovery时间,pg在standby上面也支持了checkpoint,叫做CreateRestartPoint。

CreateRestartPoint@src/backend/access/transam/xlog.c

/*  
 * Establish a restartpoint if possible.  
 *  
 * This is similar to CreateCheckPoint, but is used during WAL recovery  
 * to establish a point from which recovery can roll forward without  
 * replaying the entire recovery log.  
 *  
 * Returns true if a new restartpoint was established. We can only establish  
 * a restartpoint if we have replayed a safe checkpoint record since last  
 * restartpoint.  
 */  
bool  
CreateRestartPoint(int flags)  
{  

当standby关闭,重启时,从restartpoint开始APPLY WAL,而不是上游节点(主节点)在WAL日志中写的检查点开始。

停库时,备库写检查点

/*  
 * This must be called ONCE during postmaster or standalone-backend shutdown  
 */  
void  
ShutdownXLOG(int code, Datum arg)  
{  
        /* Don't be chatty in standalone mode */  
        ereport(IsPostmasterEnvironment ? LOG : NOTICE,  
                        (errmsg("shutting down")));  
  
        /*  
         * Signal walsenders to move to stopping state.  
         */  
        WalSndInitStopping();  
  
        /*  
         * Wait for WAL senders to be in stopping state.  This prevents commands  
         * from writing new WAL.  
         */  
        WalSndWaitStopping();  
  
        if (RecoveryInProgress())   // 备库写检查点  
                CreateRestartPoint(CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_IMMEDIATE);  
        else  
        {  
                /*  
                 * If archiving is enabled, rotate the last XLOG file so that all the  
                 * remaining records are archived (postmaster wakes up the archiver  
                 * process one more time at the end of shutdown). The checkpoint  
                 * record will go to the next XLOG file and won't be archived (yet).  
                 */  
                if (XLogArchivingActive() && XLogArchiveCommandSet())  
                        RequestXLogSwitch(false);  
  
                CreateCheckPoint(CHECKPOINT_IS_SHUTDOWN | CHECKPOINT_IMMEDIATE);  
        }  
        ShutdownCLOG();  
        ShutdownCommitTs();  
        ShutdownSUBTRANS();  
        ShutdownMultiXact();  
}  
CreateRestartPoint  
  
...  
        LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);  
        if (ControlFile->state == DB_IN_ARCHIVE_RECOVERY &&  
                ControlFile->checkPointCopy.redo < lastCheckPoint.redo)  
        {  
                ControlFile->checkPoint = lastCheckPointRecPtr;  
                ControlFile->checkPointCopy = lastCheckPoint;  
                ControlFile->time = (pg_time_t) time(NULL);  
  
                /*  
                 * Ensure minRecoveryPoint is past the checkpoint record.  Normally,  
                 * this will have happened already while writing out dirty buffers,  
                 * but not necessarily - e.g. because no buffers were dirtied.  We do  
                 * this because a non-exclusive base backup uses minRecoveryPoint to  
                 * determine which WAL files must be included in the backup, and the  
                 * file (or files) containing the checkpoint record must be included,  
                 * at a minimum. Note that for an ordinary restart of recovery there's  
                 * no value in having the minimum recovery point any earlier than this  
                 * anyway, because redo will begin just after the checkpoint record.  
                 */  
                if (ControlFile->minRecoveryPoint < lastCheckPointEndPtr)  
                {  
                        ControlFile->minRecoveryPoint = lastCheckPointEndPtr;  
                        ControlFile->minRecoveryPointTLI = lastCheckPoint.ThisTimeLineID;  
  
                        /* update local copy */  
                        minRecoveryPoint = ControlFile->minRecoveryPoint;  
                        minRecoveryPointTLI = ControlFile->minRecoveryPointTLI;  
                }  
                if (flags & CHECKPOINT_IS_SHUTDOWN)  
                        ControlFile->state = DB_SHUTDOWNED_IN_RECOVERY;  // 备库正常关闭时,状态改写为DB_SHUTDOWNED_IN_RECOVERY  
                UpdateControlFile();  
        }  

小结

PostgreSQL备库也可以写检查点,目的是避免每次重启备库都需要从上一个检查点(由主库产生,在WAL中回放出来的)APPLY后面所有的WAL。

从而forward 备库停库后重启时WAL的位点。

参考

CreateRestartPoint@src/backend/access/transam/xlog.c

Flag Counter

digoal’s 大量PostgreSQL文章入口