[未完待续] PostgreSQL 一键诊断项 - 珍藏级

11 minute read

背景


一、诊断项类目 - 规格、配置

一 规格

1 数据库实例规格

检测方法

规格配置是否与实际配置相符

cpu
mem
iops
network bandwidth
disk size

告警规则
处理方法
背景知识

二 配置

1 ROLE,DB级定制配置

检测方法

select * from pg_db_role_setting;  

告警规则
处理方法
背景知识

2 数据库非默认配置

检测方法

select source,name,setting,unit from pg_settings where source<>'default';  

告警规则
处理方法
背景知识


二、诊断项类目 - 状态诊断

一 模块状态

1 HA 模块状态

2 agent状态

3 备份模块状态

4 归档日志上传模块状态

pg_stat_archiver

5 审计日志上传模块状态

6 任务模块以及任务运行状态

用户任务

备份任务

… …

二 数据库状态

1 心跳状态

检测方法
数据库读写是否正常
告警规则
处理方法
背景知识

2 连接数(活跃、剩余)

select current_setting('max_connections')::int - current_setting('superuser_reserved_connections')::int max_conn;  
  
select state, count(*) from pg_stat_activity group by 1;  

3 容灾节点状态

检测方法
1、容灾节点是否存在,

2、pg_stat_replication.state 状态值
告警规则
处理方法
背景知识

4 备库状态

检测方法
1、备库是否存在,

2、pg_stat_replication.state 状态值
告警规则
处理方法
背景知识

三 延迟

1 容灾节点延迟

检测方法
send,write,apply 延迟

还有多少WAL SIZE有容灾节点重搭风险. (wal_keep_segments - send延迟)

告警规则
处理方法
背景知识

2 备库延迟

检测方法
send,write,apply 延迟

select usename,application_name,client_addr,state,  
pg_wal_lsn_diff(pg_current_wal_insert_lsn(), sent_lsn) sent_delay,   
pg_wal_lsn_diff(pg_current_wal_insert_lsn(), write_lsn) write_delay,   
pg_wal_lsn_diff(pg_current_wal_insert_lsn(), flush_lsn) flush_delay,   
pg_wal_lsn_diff(pg_current_wal_insert_lsn(), replay_lsn) replay_delay   
from pg_stat_replication;  

还有多少WAL SIZE有备库重搭风险. (wal_keep_segments - send延迟)

9.4使用如下SQL

select usename,application_name,client_addr,state,  
pg_xlog_location_diff(pg_current_xlog_insert_location(), sent_location) sent_delay,   
pg_xlog_location_diff(pg_current_xlog_insert_location(), write_location) write_delay,   
pg_xlog_location_diff(pg_current_xlog_insert_location(), flush_location) flush_delay,   
pg_xlog_location_diff(pg_current_xlog_insert_location(), replay_location) replay_delay   
from pg_stat_replication;  

告警规则
处理方法
背景知识

3 SLOT延迟

风险,导致当前数据库节点WAL日志堆积、系统表膨胀。

检查方法

select slot_name,plugin,slot_type,database,temporary,active,active_pid,  
xmin,catalog_xmin,  
pg_wal_lsn_diff(pg_current_wal_insert_lsn(),restart_lsn) restart_delay,   
pg_wal_lsn_diff(pg_current_wal_insert_lsn(),confirmed_flush_lsn) flush_delay   
from pg_replication_slots;  

9.4使用如下SQL

select slot_name,plugin,slot_type,database,active,                       
xmin,catalog_xmin,  
pg_xlog_location_diff(pg_current_xlog_insert_location(),restart_lsn) restart_delay   
from pg_replication_slots; 

已有风险处理方法:

1、删除SLOT, 删掉后会自动清理。(删掉slot后,下游使用这个SLOT接收的业务程序报错,无法继续接收)

2、下游尽快消费延迟的SLOT的日志

3、手动设置SLOT的位点(相当于丢弃未被接收的部分,下游使用这个SLOT的业务程序,重新的SLOT位点继续接收)

函数URL

https://www.postgresql.org/docs/11/static/functions-admin.html#FUNCTIONS-REPLICATION

4 归档延迟

风险,导致当前数据库节点WAL日志堆积。

检查方法

pg_wal/archive_status/.ready 的数量 乘以 wal_segment_size   

三、诊断项类目 - 资源诊断

一 硬件资源

1 CPU

2 IOPS

3 MEM

page table
rss
cache

4 network

5 disk 空间剩余

6 disk inode 剩余

df -i  

二 数据库资源

1 数据文件空间

已使用

2 WAL(REDO)空间

已使用

3 实例级空间分布

1、库维度空间分布

select datname, pg_size_pretty(pg_database_size(oid)) from pg_database order by pg_database_size(oid) desc limit 10;  

2、表空间维度空间分布

select spcname, pg_size_pretty(pg_tablespace_size(oid)) from pg_tablespace order by pg_tablespace_size(oid) desc limit 10;  

4 库级空间分布

1、对象维度空间分布(按pg_class.relkind归类)

select relkind,pg_size_pretty(sum((pg_relation_size(oid)))) from pg_class group by 1 order by sum((pg_relation_size(oid))) desc;  

2、表维度空间分布

select relname,pg_size_pretty(pg_relation_size(oid)) from pg_class where relkind='r' order by pg_relation_size(oid) desc limit 10;  

3、索引维度空间分布

select relname,pg_size_pretty(pg_relation_size(oid)) from pg_class where relkind='i' order by pg_relation_size(oid) desc limit 10;  

4、user维度空间分布

5、schema维度空间分布


四、诊断项类目 - 性能诊断

一 热点SQL、对象、函数

1 自上一次重置计数器以来, TOP SQL(多维度)

《PostgreSQL 如何查找TOP SQL (例如IO消耗最高的SQL) (包含SQL优化内容) - 珍藏级》

1、总耗时TOP 10

select userid::regrole, dbid, query from pg_stat_statements order by total_time desc limit 10;      

2、IO耗时TOP 10

select userid::regrole, dbid, query from pg_stat_statements order by (blk_read_time+blk_write_time) desc limit 10;      

3、性能抖动TOP 10

select userid::regrole, dbid, query from pg_stat_statements order by stddev_time desc limit 10;      

2 当前, TOP SQL

1、IO TOP SQL

结合iotop工具

2、CPU TOP SQL

结合top工具

3 自上一次重置计数器以来, 全表扫描次数TOP对象

select * from pg_stat_all_tables order by seq_scan desc limit 10;  

4 自上一次重置计数器以来, 全表扫描记录数TOP对象

select * from pg_stat_all_tables order by seq_tup_read desc limit 10;  

5 索引数超过4并且SIZE大于10MB的top(dml)表

6 自上一次重置计数器以来, 热表

1、dml维度

select schemaname,relname,pg_size_pretty(pg_total_relation_size(relid)),  
n_tup_ins+n_tup_upd+n_tup_del+n_tup_hot_upd as iud_tuples from pg_stat_all_tables   
order by iud_tuples desc limit 10;  

2、查询(IO)维度

select schemaname,relname,pg_size_pretty(pg_total_relation_size(relid)),  
coalesce(heap_blks_read,0)+coalesce(toast_blks_read,0) read_blks from pg_statio_all_tables  
order by read_blks desc limit 10;  

7 自上一次重置计数器以来, 热索引

1、查询(tuple)维度

select schemaname,relname,indexrelname,pg_size_pretty(pg_total_relation_size(indexrelid)),  
idx_scan,idx_tup_read,idx_tup_fetch,idx_tup_read/(case idx_scan when 0 then 1.0 else idx_scan end)::float8 as sel from pg_stat_all_indexes   
order by idx_tup_read desc limit 10;  

2、查询(IO)维度

select schemaname,relname,indexrelname,pg_size_pretty(pg_total_relation_size(indexrelid)),  
coalesce(idx_blks_read,0) read_blks from pg_statio_all_indexes  
order by read_blks desc limit 10;  

3、选择性可能不好的索引

select schemaname,relname,indexrelname,pg_size_pretty(pg_total_relation_size(indexrelid)),  
idx_scan,idx_tup_read,idx_tup_fetch,idx_tup_read/(case idx_scan when 0 then 1.0 else idx_scan end)::float8 as sel from pg_stat_all_indexes   
where idx_tup_read/(case idx_scan when 0 then 1.0 else idx_scan end)::float8 > ?  -- 大于阈值,选择性不好  
order by sel desc limit 10;  

8 当前, 系统级 profiling 热点

perf 分析  

9 当前, 未完全合并的gin\brin索引

通过pageinspect查看

二 冷对象

1 冷表, 最近(重置统计信息计数器后)

1、dml维度

select schemaname,relname,pg_size_pretty(pg_total_relation_size(relid)),  
n_tup_ins+n_tup_upd+n_tup_del+n_tup_hot_upd as iud_tuples from pg_stat_all_tables   
where n_tup_ins+n_tup_upd+n_tup_del+n_tup_hot_upd < ? -- 低于阈值  
order by iud_tuples limit 10;  

2、查询(IO)维度

select schemaname,relname,pg_size_pretty(pg_total_relation_size(relid)),  
coalesce(heap_blks_read,0)+coalesce(toast_blks_read,0) read_blks from pg_statio_all_tables  
where coalesce(heap_blks_read,0)+coalesce(toast_blks_read,0) < ? -- 低于阈值  
order by read_blks desc limit 10;  

2 冷索引, 最近(重置统计信息计数器后)

最近(重置统计信息计数器后)未使用的索引

select * from pg_stat_all_indexes where idx_scan=0 and schemaname not in ('pg_catalog','pg_toast');  

三 慢SQL

1 慢SQL

1、历史慢SQL

分析csvlog  

2、当前慢SQL

select * from pg_stat_activity   
where now()-query_start > ?  -- 大于阈值  
order by now()-query_start desc;  

2 慢SQL的详细执行计划(auto_explain)

3 长事务(包括2PC)

select * from pg_stat_activity   
where now()-xact_start > ?  -- 大于阈值  
order by now()-xact_start desc;  
  
select * from pg_prepared_xacts   
where now()-prepared > ?  -- 大于阈值  
order by now()-prepared desc;  

四 垃圾、膨胀

1 垃圾 TOP

select * from pg_stat_all_tables   
where n_dead_tup >   
order by   

2 膨胀 TOP

1、表膨胀 TOP 5

SELECT    
  current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta,    
  ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,    
  CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,    
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,    
  CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint || $$ bytes$$ END AS wastedsize,    
  iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta,    
  ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,    
  CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages,    
  CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes,    
  CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint || $$ bytes$$ END AS wastedisize,    
  CASE WHEN relpages < otta THEN    
    CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END    
    ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint)    
      ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END    
  END AS totalwastedbytes    
FROM (    
  SELECT    
    nn.nspname AS schemaname,    
    cc.relname AS tablename,    
    COALESCE(cc.reltuples,0) AS reltuples,    
    COALESCE(cc.relpages,0) AS relpages,    
    COALESCE(bs,0) AS bs,    
    COALESCE(CEIL((cc.reltuples*((datahdr+ma-    
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta,    
    COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,    
    COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols    
  FROM    
     pg_class cc    
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$    
  LEFT JOIN    
  (    
    SELECT    
      ma,bs,foo.nspname,foo.relname,    
      (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,    
      (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2    
    FROM (    
      SELECT    
        ns.nspname, tbl.relname, hdr, ma, bs,    
        SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth,    
        MAX(coalesce(null_frac,0)) AS maxfracsum,    
        hdr+(    
          SELECT 1+count(*)/8    
          FROM pg_stats s2    
          WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname    
        ) AS nullhdr    
      FROM pg_attribute att     
      JOIN pg_class tbl ON att.attrelid = tbl.oid    
      JOIN pg_namespace ns ON ns.oid = tbl.relnamespace     
      LEFT JOIN pg_stats s ON s.schemaname=ns.nspname    
      AND s.tablename = tbl.relname    
      AND s.inherited=false    
      AND s.attname=att.attname,    
      (    
        SELECT    
          (SELECT current_setting($$block_size$$)::numeric) AS bs,    
            CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$)    
              IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr,    
          CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma    
        FROM (SELECT version() AS v) AS foo    
      ) AS constants    
      WHERE att.attnum > 0 AND tbl.relkind=$$r$$    
      GROUP BY 1,2,3,4,5    
    ) AS foo    
  ) AS rs    
  ON cc.relname = rs.relname AND nn.nspname = rs.nspname    
  LEFT JOIN pg_index i ON indrelid = cc.oid    
  LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid    
) AS sml order by wastedbytes desc limit 5    

2、索引膨胀 TOP 5

SELECT    
  current_database() AS db, schemaname, tablename, reltuples::bigint AS tups, relpages::bigint AS pages, otta,    
  ROUND(CASE WHEN otta=0 OR sml.relpages=0 OR sml.relpages=otta THEN 0.0 ELSE sml.relpages/otta::numeric END,1) AS tbloat,    
  CASE WHEN relpages < otta THEN 0 ELSE relpages::bigint - otta END AS wastedpages,    
  CASE WHEN relpages < otta THEN 0 ELSE bs*(sml.relpages-otta)::bigint END AS wastedbytes,    
  CASE WHEN relpages < otta THEN $$0 bytes$$::text ELSE (bs*(relpages-otta))::bigint || $$ bytes$$ END AS wastedsize,    
  iname, ituples::bigint AS itups, ipages::bigint AS ipages, iotta,    
  ROUND(CASE WHEN iotta=0 OR ipages=0 OR ipages=iotta THEN 0.0 ELSE ipages/iotta::numeric END,1) AS ibloat,    
  CASE WHEN ipages < iotta THEN 0 ELSE ipages::bigint - iotta END AS wastedipages,    
  CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta) END AS wastedibytes,    
  CASE WHEN ipages < iotta THEN $$0 bytes$$ ELSE (bs*(ipages-iotta))::bigint || $$ bytes$$ END AS wastedisize,    
  CASE WHEN relpages < otta THEN    
    CASE WHEN ipages < iotta THEN 0 ELSE bs*(ipages-iotta::bigint) END    
    ELSE CASE WHEN ipages < iotta THEN bs*(relpages-otta::bigint)    
      ELSE bs*(relpages-otta::bigint + ipages-iotta::bigint) END    
  END AS totalwastedbytes    
FROM (    
  SELECT    
    nn.nspname AS schemaname,    
    cc.relname AS tablename,    
    COALESCE(cc.reltuples,0) AS reltuples,    
    COALESCE(cc.relpages,0) AS relpages,    
    COALESCE(bs,0) AS bs,    
    COALESCE(CEIL((cc.reltuples*((datahdr+ma-    
      (CASE WHEN datahdr%ma=0 THEN ma ELSE datahdr%ma END))+nullhdr2+4))/(bs-20::float)),0) AS otta,    
    COALESCE(c2.relname,$$?$$) AS iname, COALESCE(c2.reltuples,0) AS ituples, COALESCE(c2.relpages,0) AS ipages,    
    COALESCE(CEIL((c2.reltuples*(datahdr-12))/(bs-20::float)),0) AS iotta -- very rough approximation, assumes all cols    
  FROM    
     pg_class cc    
  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> $$information_schema$$    
  LEFT JOIN    
  (    
    SELECT    
      ma,bs,foo.nspname,foo.relname,    
      (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,    
      (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2    
    FROM (    
      SELECT    
        ns.nspname, tbl.relname, hdr, ma, bs,    
        SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth,    
        MAX(coalesce(null_frac,0)) AS maxfracsum,    
        hdr+(    
          SELECT 1+count(*)/8    
          FROM pg_stats s2    
          WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname    
        ) AS nullhdr    
      FROM pg_attribute att     
      JOIN pg_class tbl ON att.attrelid = tbl.oid    
      JOIN pg_namespace ns ON ns.oid = tbl.relnamespace     
      LEFT JOIN pg_stats s ON s.schemaname=ns.nspname    
      AND s.tablename = tbl.relname    
      AND s.inherited=false    
      AND s.attname=att.attname,    
      (    
        SELECT    
          (SELECT current_setting($$block_size$$)::numeric) AS bs,    
            CASE WHEN SUBSTRING(SPLIT_PART(v, $$ $$, 2) FROM $$#"[0-9]+.[0-9]+#"%$$ for $$#$$)    
              IN ($$8.0$$,$$8.1$$,$$8.2$$) THEN 27 ELSE 23 END AS hdr,    
          CASE WHEN v ~ $$mingw32$$ OR v ~ $$64-bit$$ THEN 8 ELSE 4 END AS ma    
        FROM (SELECT version() AS v) AS foo    
      ) AS constants    
      WHERE att.attnum > 0 AND tbl.relkind=$$r$$    
      GROUP BY 1,2,3,4,5    
    ) AS foo    
  ) AS rs    
  ON cc.relname = rs.relname AND nn.nspname = rs.nspname    
  LEFT JOIN pg_index i ON indrelid = cc.oid    
  LEFT JOIN pg_class c2 ON c2.oid = i.indexrelid    
) AS sml order by wastedibytes desc limit 5  

3 系统膨胀时间点 - 多久以前的垃圾可以被回收

select least(t1,t2) from (select min(prepared) t1 from pg_prepared_xacts) tt1, (select min(xact_start) t2 from pg_stat_activity) tt2;  

4 自动垃圾回收状态

1、系统级是否关闭自动垃圾回收

select current_setting('autovacuum');  

2、已关闭自动垃圾回收的表

select relname, pg_size_pretty(pg_total_relation_size(oid)) from pg_class where reloptions @> array['autovacuum_enabled=off'];  

五 等待、冲突

1 当前锁等待

with      
t_wait as      
(      
  select a.mode,a.locktype,a.database,a.relation,a.page,a.tuple,a.classid,a.granted,     
  a.objid,a.objsubid,a.pid,a.virtualtransaction,a.virtualxid,a.transactionid,a.fastpath,      
  b.state,b.query,b.xact_start,b.query_start,b.usename,b.datname,b.client_addr,b.client_port,b.application_name     
    from pg_locks a,pg_stat_activity b where a.pid=b.pid and not a.granted     
),     
t_run as     
(     
  select a.mode,a.locktype,a.database,a.relation,a.page,a.tuple,a.classid,a.granted,     
  a.objid,a.objsubid,a.pid,a.virtualtransaction,a.virtualxid,a.transactionid,a.fastpath,     
  b.state,b.query,b.xact_start,b.query_start,b.usename,b.datname,b.client_addr,b.client_port,b.application_name     
    from pg_locks a,pg_stat_activity b where a.pid=b.pid and a.granted     
),     
t_overlap as     
(     
  select r.* from t_wait w join t_run r on     
  (     
    r.locktype is not distinct from w.locktype and     
    r.database is not distinct from w.database and     
    r.relation is not distinct from w.relation and     
    r.page is not distinct from w.page and     
    r.tuple is not distinct from w.tuple and     
    r.virtualxid is not distinct from w.virtualxid and     
    r.transactionid is not distinct from w.transactionid and     
    r.classid is not distinct from w.classid and     
    r.objid is not distinct from w.objid and     
    r.objsubid is not distinct from w.objsubid and     
    r.pid <> w.pid     
  )      
),      
t_unionall as      
(      
  select r.* from t_overlap r      
  union all      
  select w.* from t_wait w      
)      
select locktype,datname,relation::regclass,page,tuple,virtualxid,transactionid::text,classid::regclass,objid,objsubid,     
string_agg(     
'Pid: '||case when pid is null then 'NULL' else pid::text end||chr(10)||     
'Lock_Granted: '||case when granted is null then 'NULL' else granted::text end||' , Mode: '||case when mode is null then 'NULL' else mode::text end||' , FastPath: '||case when fastpath is null then 'NULL' else fastpath::text end||' , VirtualTransaction: '||case when virtualtransaction is null then 'NULL' else virtualtransaction::text end||' , Session_State: '||case when state is null then 'NULL' else state::text end||chr(10)||     
'Username: '||case when usename is null then 'NULL' else usename::text end||' , Database: '||case when datname is null then 'NULL' else datname::text end||' , Client_Addr: '||case when client_addr is null then 'NULL' else client_addr::text end||' , Client_Port: '||case when client_port is null then 'NULL' else client_port::text end||' , Application_Name: '||case when application_name is null then 'NULL' else application_name::text end||chr(10)||      
'Xact_Start: '||case when xact_start is null then 'NULL' else xact_start::text end||' , Query_Start: '||case when query_start is null then 'NULL' else query_start::text end||' , Xact_Elapse: '||case when (now()-xact_start) is null then 'NULL' else (now()-xact_start)::text end||' , Query_Elapse: '||case when (now()-query_start) is null then 'NULL' else (now()-query_start)::text end||chr(10)||      
'SQL (Current SQL in Transaction): '||chr(10)||    
case when query is null then 'NULL' else query::text end,      
chr(10)||'--------'||chr(10)      
order by      
  (  case mode      
    when 'INVALID' then 0     
    when 'AccessShareLock' then 1     
    when 'RowShareLock' then 2     
    when 'RowExclusiveLock' then 3     
    when 'ShareUpdateExclusiveLock' then 4     
    when 'ShareLock' then 5     
    when 'ShareRowExclusiveLock' then 6     
    when 'ExclusiveLock' then 7     
    when 'AccessExclusiveLock' then 8     
    else 0     
  end  ) desc,     
  (case when granted then 0 else 1 end)    
) as lock_conflict    
from t_unionall     
group by     
locktype,datname,relation,page,tuple,virtualxid,transactionid::text,classid,objid,objsubid ;    

2 历史等待事件统计

需要内核支持,PPAS已支持  

3 历史备库冲突统计

备库视图

select * from pg_stat_database_conflicts;  

六 预测

1 FREEZE 风暴预测

CPU/IO 风暴预测

《PostgreSQL Freeze 风暴预测续 - 珍藏级SQL》

七 概貌

1 数据库 统计信息

回滚比例, 命中比例, 数据块读写时间, 死锁, 复制冲突

select * from pg_stat_database;  

2 检查点, bgwriter 统计信息

select * from pg_stat_bgwriter;  

五、诊断项类目 - 隐患诊断

一 性能隐患

1 统计信息不准的对象(表、物化视图)

导致执行计划不准,影响SQL性能

select oid::regclass as table, relpages, pg_relation_size(oid)/current_setting('block_size')::float8 as real_pages from pg_class     
where relkind in ('r', 'm')  -- 表和物化视图    
and pg_relation_size(oid) > 1048576  -- 大于1MB    
and (pg_relation_size(oid)/current_setting('block_size')::float8 - relpages)/(pg_relation_size(oid)/current_setting('block_size')::float8) > 0.2;  -- 大于 20% 偏差     

二 停库隐患

1 库年龄

大于19亿,剩余1亿事务后强制停库。检查是否有长事务导致FREEZE失效。

select datname,age(datfrozenxid) from pg_database where age(datfrozenxid) > 1900000000;  

2 表年龄

大于19亿,剩余1亿事务后强制停库。检查是否有长事务导致FREEZE失效。

select relname,age(relfrozenxid),* from pg_class where relfrozenxid<>0 and age(relfrozenxid) > 1900000000;  

三 业务逻辑隐患

1 序列剩余调用次数

剩余调用次数低于10240000次

create or replace function f(OUT v_datname name, OUT v_role name, OUT v_nspname name, OUT v_relname name, OUT v_times_remain int8) returns setof record as $$  
declare  
begin  
  v_datname := current_database();  
  for v_role,v_nspname,v_relname in select rolname,nspname,relname from pg_authid t1 , pg_class t2 , pg_namespace t3 where t1.oid=t2.relowner and t2.relnamespace=t3.oid and t2.relkind='S'   
  LOOP  
    execute 'select (max_value-last_value)/increment_by from "'||v_nspname||'"."'||v_relname||'" where not is_cycled' into v_times_remain;  
    return next;  
  end loop;  
end;  
$$ language plpgsql;  
  
select * from f() where v_times_remain is not null and v_times_remain < 10240000  -- 剩余调用次数低于10240000次  
order by v_times_remain limit 10;  

四 无法登陆隐患

1 用户过期

select rolname from pg_roles where   
rolvaliduntil - now() < interval '30 d';  -- 用户过期时间小于30天  

五 数据丢失隐患

1 unlogged table 和 哈希索引(10没有风险)

select relkind,relname from pg_class where relpersistence='u';  
  
select * from pg_indexes where indexdef ~ 'USING hash';  

六 其他隐患

1 事件触发器

select current_database(),rolname,proname,evtname,evtevent,evtenabled,evttags from pg_event_trigger t1,pg_proc t2,pg_authid t3 where t1.evtfoid=t2.oid and t1.evtowner=t3.oid;  

六、诊断项类目 - 安全诊断

一 链路安全

1 是否开启公网访问,公网地址

2 LVS防火墙

0.0.0.0  
  
不太安全  

3 PROXY防火墙

0.0.0.0  
  
不太安全  

4 数据库防火墙 pg_hba.conf

除127.0.0.1,local 以外,如果有trust认证,视为不安全  

5 暴力破解攻击(os, pg)

grep -c "Failed password" /var/log/secure  
统计错误代码为28P01的报错  
  
28P01	invalid_password  

二 数据库安全

1 用户权限异常

列出(除白名单外)超级用户

select rolname from pg_roles where rolsuper not in ('白名单');  

2 普通用户自定义函数安全检查

检查定义为security invoker的UDF

存在安全陷阱隐患

select proname, prosrc from pg_proc where not prosecdef and proowner not in (select oid from pg_roles where rolsuper); 

3 SQL注入

1、最近一天(时间范围可选)ddl (drop, truncate)

取自审计日志

2、最近一天(时间范围可选)不带条件的DML (delete, update)

取自审计日志

3、最近一天(时间范围可选)影响行数超过N(可配置)的DML(update, delete)

取自审计日志(内核层支持DML输出影响函数)

4、最近一天(时间范围可选)返回结果超过N(可配置)的查询(select)

取自审计日志(内核层支持SELECT输出影响函数)


七、诊断项类目 - 软硬件异常诊断

一 硬件

1 SSD 剩余寿命

硬件厂商提供命令,或者smartctl

2 硬件错误

cat /var/log/mcelog   

二 软件

1 oom

dmesg|grep -i oom  

2 dmesg

异常信息  

3 core dump

检查coredump目录,例如 sysctl -a|grep kernel.core_pattern得到  
kernel.core_pattern = /corefiles/core_%e_%u_%t_%s.%p  

4 错误日志

1、PG 错误日志分析

统计:

错误代码、错误次数、错误描述

Flag Counter

digoal’s 大量PostgreSQL文章入口