影响或控制PostgreSQL垃圾回收的参数或因素

3 minute read

背景

哪些参数或因素将影响或控制PostgreSQL的垃圾回收呢?

参数

全局参数

1、控制VACUUM命令的睡眠时间,当vacuum的阶段性COST大于vacuum_cost_limit时,睡眠一段时间后继续。

vacuum_cost_delay = 0                   # 0-100 milliseconds  

2、VACUUM时,不同数据块(在SHARED BUFFER中命中的块、未命中的块、脏块)的成本

#vacuum_cost_page_hit = 1               # 0-10000 credits  
#vacuum_cost_page_miss = 10             # 0-10000 credits  
#vacuum_cost_page_dirty = 20            # 0-10000 credits  

3、VACUUM阶段性COST阈值

#vacuum_cost_limit = 200                # 1-10000 credits  

4、对只读事务设置的超时时间,防止LONG SQL带来的膨胀

#old_snapshot_threshold = -1            # 1min-60d; -1 disables; 0 is immediate  
                                        # (change requires restart)  

5、在CKPT后第一次产生或变更的脏块,是否记录整个数据块的内容到WAL中。

full_page_writes = off                  # recover from partial page writes  

6、是否开启WAL压缩

#wal_compression = off                  # enable compression of full-page writes  

7、自动垃圾回收相关进程

#------------------------------------------------------------------------------  
# AUTOVACUUM PARAMETERS  
#------------------------------------------------------------------------------  
  
是否开启自动垃圾回收  
  
#autovacuum = on                        # Enable autovacuum subprocess?  'on'  
                                        # requires track_counts to also be on.  
  
运行时长超过阈值的AUTO VACUUM将被记录下来  
  
#log_autovacuum_min_duration = -1       # -1 disables, 0 logs all actions and  
                                        # their durations, > 0 logs only  
                                        # actions running at least this number  
                                        # of milliseconds.  
  
最多允许多少个垃圾回收WORKER进程同时工作  
autovacuum_max_workers = 9              # max number of autovacuum subprocesses  
                                        # (change requires restart)  
  
轮询查询完所有数据库是否有需要进行垃圾回收的对象的周期。  
autovacuum_naptime = 1min         # time between autovacuum runs  
  
判断是否需要进行垃圾回收、收集统计信息的阈值1:最小被影响记录数  
#autovacuum_vacuum_threshold = 50       # min number of row updates before  
                                        # vacuum  
#autovacuum_analyze_threshold = 50      # min number of row updates before  
                                        # analyze  
  
  
判断是否需要进行垃圾回收、收集统计信息的阈值2:被影响记录数占比  
autovacuum_vacuum_scale_factor = 0.00002        # fraction of table size before vacuum  
autovacuum_analyze_scale_factor = 0.00001       # fraction of table size before analyze  
  
不管有没有开启autovacuum,当年龄达到这个阈值,都会强制触发freeze  
#autovacuum_freeze_max_age = 200000000  # maximum XID age before forced vacuum  
                                        # (change requires restart)  
#autovacuum_multixact_freeze_max_age = 400000000        # maximum multixact age  
                                        # before forced vacuum  
                                        # (change requires restart)  
  
自动垃圾回收的睡眠时间  
autovacuum_vacuum_cost_delay = 0        # default vacuum cost delay for  
                                        # autovacuum, in milliseconds;  
                                        # -1 means use vacuum_cost_delay  
  
自动垃圾回收睡眠前的COST阈值  
autovacuum_vacuum_cost_limit = 0        # default vacuum cost limit for  
                                        # autovacuum, -1 means use  
                                        # vacuum_cost_limit  

主库参数

是否延迟回收垃圾,可能导致膨胀,但是可以降低只读备库的查询冲突的可能性。但是可能导致主库频繁进行垃圾回收,并回收不掉。  
  
#vacuum_defer_cleanup_age = 0   # number of xacts by which cleanup is delayed  

备库参数

备库是否把事务快照发回给主库,主库不能回收备库运行中事务所需的脏数据,可能导致主库膨胀。甚至导致主库频繁进行垃圾回收,并回收不掉。  
如果备库apply本身就已经发生了延迟,那么即使备库上运行的查询很短暂,快照还是很老,FEEDBACK后,主库依旧很多垃圾无法回收。
  
#hot_standby_feedback = off             # send info from standby to prevent  
                                        # query conflicts  

客户端参数

执行vacuum或触发autovacuum时,年龄小于vacuum_freeze_min_age的记录,不会被FREEZE。  
  
当表的年龄大于vacuum_freeze_table_age时,VACUUM会扫描全表,并执行FREEZE。  
  
#vacuum_freeze_min_age = 50000000  
#vacuum_freeze_table_age = 150000000  
  
以下则针对multixact事务号的年龄  
  
#vacuum_multixact_freeze_min_age = 5000000  
#vacuum_multixact_freeze_table_age = 150000000  

表级参数

表的PAGE中,分配多少空间给INSERT、COPY时请求的空间,剩余的留给UPDATE,尽量实现HOT。  
       fillfactor (integer)  
           The fillfactor for a table is a percentage between 10 and 100.   
	   100 (complete packing) is the default.   
	   When a smaller fillfactor is specified,   
	   INSERT operations pack table pages only to the indicated percentage;   
	   the  
           remaining space on each page is reserved for updating rows on that page.   
	   This gives UPDATE a chance to place the updated copy of a row on the same page as the original,   
	   which is more efficient than placing it on a  
           different page. For a table whose entries are never updated,   
	   complete packing is the best choice,   
	   but in heavily updated tables smaller fillfactors are appropriate.   
	   This parameter cannot be set for TOAST tables.  
  
是否开启表的自动垃圾回收  
       autovacuum_enabled, toast.autovacuum_enabled (boolean)  
           Enables or disables the autovacuum daemon for a particular table.   
	   If true, the autovacuum daemon will perform automatic VACUUM and/or   
	   ANALYZE operations on this table following the rules discussed in   
	   Section 24.1.6. If  
           false, this table will not be autovacuumed, except to prevent   
	   transaction ID wraparound. See Section 24.1.5 for more about   
	   wraparound prevention. Note that the autovacuum daemon does not   
	   run at all (except to prevent  
           transaction ID wraparound) if the autovacuum parameter is false;   
	   setting individual tables' storage parameters does not override that.  
	   Therefore there is seldom much point in explicitly setting this storage parameter to  
           true, only to false.  
  
与全局参数类似功能  
       autovacuum_vacuum_threshold, toast.autovacuum_vacuum_threshold (integer)  
           Per-table value for autovacuum_vacuum_threshold parameter.  
  
       autovacuum_vacuum_scale_factor, toast.autovacuum_vacuum_scale_factor (float4)  
           Per-table value for autovacuum_vacuum_scale_factor parameter.  
  
       autovacuum_analyze_threshold (integer)  
           Per-table value for autovacuum_analyze_threshold parameter.  
  
       autovacuum_analyze_scale_factor (float4)  
           Per-table value for autovacuum_analyze_scale_factor parameter.  
  
       autovacuum_vacuum_cost_delay, toast.autovacuum_vacuum_cost_delay (integer)  
           Per-table value for autovacuum_vacuum_cost_delay parameter.  
  
       autovacuum_vacuum_cost_limit, toast.autovacuum_vacuum_cost_limit (integer)  
           Per-table value for autovacuum_vacuum_cost_limit parameter.  
  
       autovacuum_freeze_min_age, toast.autovacuum_freeze_min_age (integer)  
           Per-table value for vacuum_freeze_min_age parameter.   
	   Note that autovacuum will ignore per-table autovacuum_freeze_min_age   
	   parameters that are larger than half the system-wide autovacuum_freeze_max_age setting.  
  
       autovacuum_freeze_max_age, toast.autovacuum_freeze_max_age (integer)  
           Per-table value for autovacuum_freeze_max_age parameter.   
	   Note that autovacuum will ignore per-table autovacuum_freeze_max_age   
	   parameters that are larger than the system-wide setting (it can only be set smaller).  
  
       autovacuum_freeze_table_age, toast.autovacuum_freeze_table_age (integer)  
           Per-table value for vacuum_freeze_table_age parameter.  
  
       autovacuum_multixact_freeze_min_age, toast.autovacuum_multixact_freeze_min_age (integer)  
           Per-table value for vacuum_multixact_freeze_min_age parameter.   
	   Note that autovacuum will ignore per-table autovacuum_multixact_freeze_min_age   
	   parameters that are larger than half the system-wide  
           autovacuum_multixact_freeze_max_age setting.  
  
       autovacuum_multixact_freeze_max_age, toast.autovacuum_multixact_freeze_max_age (integer)  
           Per-table value for autovacuum_multixact_freeze_max_age parameter.   
	   Note that autovacuum will ignore per-table autovacuum_multixact_freeze_max_age   
	   parameters that are larger than the system-wide setting (it can only be  
           set smaller).  
  
       autovacuum_multixact_freeze_table_age, toast.autovacuum_multixact_freeze_table_age (integer)  
           Per-table value for vacuum_multixact_freeze_table_age parameter.  
  
       log_autovacuum_min_duration, toast.log_autovacuum_min_duration (integer)  
           Per-table value for log_autovacuum_min_duration parameter.  

因素

1、数据库不能回收LONG SQL事务快照后产生的垃圾,也即是说,数据库中最老的事务,决定了数据库可以回收的垃圾上限。在此事务快照后产生的垃圾都无法被回收。

2、snapshot too old,开启这个参数后,只读事务的允许时间超过这个时间后,将自动回滚。防止LONG SQL引起的膨胀。

3、备库开启了hot_standby_feedback = on时,备库如果运行LONG SQL,也会导致主库垃圾回收受限。导致膨胀。同时如果naptime很小,则可能导致频繁的无效VACUUM发生,导致数据库的IO,CPU飙高。

4、freeze是扫描全表的动作,如果大表发生freeze,可能导致大量的数据文件读写IO,以及WAL的写IO。通过wal日志分析,可以找到FREEZE的蛛丝马迹。

5、如果主库设置了vacuum_defer_cleanup_age 大于 0 ,可能导致膨胀,同时如果naptime很小,则可能导致频繁的无效VACUUM发生,导致数据库的IO,CPU飙高。

Flag Counter

digoal’s 大量PostgreSQL文章入口