Systemtap EXP: Trace PostgreSQL iostat per SQL statement 1

5 minute read

背景

本文是一个systemtap例子 , 左右如下 :   
跟踪PostgreSQL每条sql产生的 :   
io请求次数, 次数/s  
io请求字节数, 字节数/s  
  
跟踪会话产生的总的 :   
io请求次数, 次数/s  
io请求字节数, 字节数/s  
过滤os cache的读和写, 仅包含块设备读写操作.  
  
下一篇blog将输出更加精细化的结果, 区分块设备, 是否缓存输出.  

http://blog.163.com/digoal@126/blog/static/16387704020139152191581/

stap脚本如下 :   
vi test.stp  
global io_stat1%[120000], io_stat2%[120000], io_stat11%[120000], io_stat22%[120000]  
  
probe vfs.read.return {  
  if ($return>0) {  
    if (devname!="N/A" && execname()=="postgres") { /*skip read from cache, filter postgres otherwise*/  
      io_stat1[pid(),execname(),uid(),ppid(),"R"] <<< $return  
      io_stat11[pid(),execname(),uid(),ppid(),"R"] <<< (gettimeofday_us() - @entry(gettimeofday_us()))  
      io_stat2[pid(),execname(),uid(),ppid(),"R"] <<< $return  
      io_stat22[pid(),execname(),uid(),ppid(),"R"] <<< (gettimeofday_us() - @entry(gettimeofday_us()))  
    }  
  }  
}  
  
probe vfs.write.return {  
  if ($return>0) {  
    if (devname!="N/A" && execname()=="postgres") { /*skip read from cache, filter postgres otherwise*/  
      io_stat1[pid(),execname(),uid(),ppid(),"W"] <<< $return  
      io_stat11[pid(),execname(),uid(),ppid(),"W"] <<< (gettimeofday_us() - @entry(gettimeofday_us()))  
      io_stat2[pid(),execname(),uid(),ppid(),"W"] <<< $return  
      io_stat22[pid(),execname(),uid(),ppid(),"W"] <<< (gettimeofday_us() - @entry(gettimeofday_us()))  
    }  
  }  
}  
  
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("query__start") {  
  delete io_stat1[pid(),execname(),uid(),ppid(),"R"]  
  delete io_stat1[pid(),execname(),uid(),ppid(),"W"]  
  delete io_stat11[pid(),execname(),uid(),ppid(),"R"]  
  delete io_stat11[pid(),execname(),uid(),ppid(),"W"]  
}  
  
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("query__done") {  
  var1 = @count(io_stat1[pid(),execname(),uid(),ppid(),"R"])  
  var2 = @sum(io_stat1[pid(),execname(),uid(),ppid(),"R"]) / 1024  
  var3 = @count(io_stat1[pid(),execname(),uid(),ppid(),"W"])  
  var4 = @sum(io_stat1[pid(),execname(),uid(),ppid(),"W"]) / 1024  
  var5 = ((@sum(io_stat11[pid(),execname(),uid(),ppid(),"R"]) !=0) ? ((1000000 * var1) / @sum(io_stat11[pid(),execname(),uid(),ppid(),"R"])) : 0)  
  var6 = ((@sum(io_stat11[pid(),execname(),uid(),ppid(),"R"]) !=0) ? ((1000000 * var2) / @sum(io_stat11[pid(),execname(),uid(),ppid(),"R"])) : 0)  
  var7 = ((@sum(io_stat11[pid(),execname(),uid(),ppid(),"W"]) !=0) ? ((1000000 * var3) / @sum(io_stat11[pid(),execname(),uid(),ppid(),"W"])) : 0)  
  var8 = ((@sum(io_stat11[pid(),execname(),uid(),ppid(),"W"]) !=0) ? ((1000000 * var4) / @sum(io_stat11[pid(),execname(),uid(),ppid(),"W"])) : 0)  
  
  printf("query: %s\nRIO requests: %d\nRIO request Kbytes: %d\nWIO requests: %d\nWIO request Kbytes: %d\n", user_string($arg1), var1, var2, var3, var4)  
  printf("RIO requests/s: %d\nRIO request Kbytes/s: %d\nWIO requests/s: %d\nWIO request Kbytes/s: %d\n", var5, var6, var7, var8)  
  
  /* clear data */  
  delete io_stat1[pid(),execname(),uid(),ppid(),"R"]  
  delete io_stat1[pid(),execname(),uid(),ppid(),"W"]  
  delete io_stat11[pid(),execname(),uid(),ppid(),"R"]  
  delete io_stat11[pid(),execname(),uid(),ppid(),"W"]  
}  
  
probe end{  
  println("----------END----------")  
  foreach([a,b,c,d,e] in io_stat2) {  
    var1 = @count(io_stat2[a,b,c,d,e])  
    var2 = @sum(io_stat2[a,b,c,d,e]) / 1024  
    var3 = ((@sum(io_stat22[a,b,c,d,e]) !=0) ? ((1000000 * var1) / @sum(io_stat22[a,b,c,d,e])) : 0)  
    var4 = ((@sum(io_stat22[a,b,c,d,e]) !=0) ? ((1000000 * var2) / @sum(io_stat22[a,b,c,d,e])) : 0)  
    printdln("---",a,b,c,d,e)  
    printf("%sIO requests: %d\n%sIO request Kbytes: %d\n", e, var1, e, var2)  
    printf("%sIO requests/s: %d\n%sIO request Kbytes/s: %d\n", e, var3, e, var4)  
  }  
  delete io_stat1  
  delete io_stat11  
  delete io_stat2  
  delete io_stat22  
}  
执行跟踪, 使用systemtap 2.4 :   
[root@db-172-16-3-150 ~]# /opt/systemtap/bin/stap test.stp   
在数据库中执行以下SQL :   
digoal=# explain (analyze,verbose,costs,buffers,timing) select count(*) from generate_series(1,1000000);  
  
  
                                                                 QUERY PLAN                                                           
           
------------------------------------------------------------------------------------------------------------------------------------  
---------  
 Aggregate  (cost=12.50..12.51 rows=1 width=0) (actual time=602.663..602.663 rows=1 loops=1)  
   Output: count(*)  
   Buffers: temp read=1710 written=1709  
   ->  Function Scan on pg_catalog.generate_series  (cost=0.00..10.00 rows=1000 width=0) (actual time=224.212..449.395 rows=1000000   
loops=1)  
         Output: generate_series  
         Function Call: generate_series(1, 1000000)  
         Buffers: temp read=1710 written=1709  
 Total runtime: 607.463 ms  
(8 rows)  
数据库explain输出中可以看到, 此SQL包含临时文件的读写各1709, 1710次.  
stap输出如下, 请求次数与explain的输出吻合 :   
query: explain (analyze,verbose,costs,buffers,timing) select count(*) from generate_series(1,1000000);  
RIO requests: 1709  
RIO request Kbytes: 13671  
WIO requests: 1709  
WIO request Kbytes: 13671  
RIO requests/s: 131259  
RIO request Kbytes/s: 1050000  
WIO requests/s: 64233  
WIO request Kbytes/s: 513831  
其他SQL的跟踪举例 :   
digoal=# create table t(id int, info text, crt_time timestamp)  
digoal-# ;  
CREATE TABLE  
digoal=# insert into t select generate_series(1,1000000),md5(random()::text),clock_timestamp();  
INSERT 0 1000000  
digoal=# explain (analyze,verbose,costs,buffers,timing) select count(*) from t;  
                                                        QUERY PLAN                                                          
--------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=22196.75..22196.76 rows=1 width=0) (actual time=357.822..357.823 rows=1 loops=1)  
   Output: count(*)  
   Buffers: shared hit=9346  
   ->  Seq Scan on public.t  (cost=0.00..19626.60 rows=1028060 width=0) (actual time=0.019..212.063 rows=1000000 loops=1)  
         Output: id, info, crt_time  
         Buffers: shared hit=9346  
 Total runtime: 357.868 ms  
(7 rows)  
  
digoal=# create index idx_t_1 on t(id);  
CREATE INDEX  
digoal=# \di+ idx_t_1   
                         List of relations  
 Schema |  Name   | Type  |  Owner   | Table | Size  | Description   
--------+---------+-------+----------+-------+-------+-------------  
 public | idx_t_1 | index | postgres | t     | 21 MB |   
(1 row)  
stap输出结果 :   
query: create table t(id int, info text, crt_time timestamp)  
;  
RIO requests: 1  
RIO request Kbytes: 8  
WIO requests: 2  
WIO request Kbytes: 16  
RIO requests/s: 50000  
RIO request Kbytes/s: 400000  
WIO requests/s: 30769  
WIO request Kbytes/s: 246153  
query: insert into t select generate_series(1,1000000),md5(random()::text),clock_timestamp();  
RIO requests: 4  
RIO request Kbytes: 32  
WIO requests: 9351  
WIO request Kbytes: 74808  
RIO requests/s: 80000  
RIO request Kbytes/s: 640000  
WIO requests/s: 43972  
WIO request Kbytes/s: 351781  
query: explain (analyze,verbose,costs,buffers,timing) select count(*) from t;  
RIO requests: 0  
RIO request Kbytes: 0  
WIO requests: 0  
WIO request Kbytes: 0  
RIO requests/s: 0  
RIO request Kbytes/s: 0  
WIO requests/s: 0  
WIO request Kbytes/s: 0  
query: create index idx_t_1 on t(id);  
RIO requests: 0  
RIO request Kbytes: 0  
WIO requests: 2757  
WIO request Kbytes: 22056  
RIO requests/s: 0  
RIO request Kbytes/s: 0  
WIO requests/s: 57136  
WIO request Kbytes/s: 457090  
query: SELECT pg_catalog.quote_ident(c.relname) FROM pg_catalog.pg_class c WHERE c.relkind IN ('i') AND substring(pg_catalog.quote_ident(c.relname),1,1)='i' AND pg_catalog.pg_table_is_visible(c.oid) AND c.relnamespace <> (SELECT oid FROM pg_catalog.pg_namespace WHERE nspname = 'pg_catalog')  
UNION  
SELECT pg_catalog.quote_ident(n.nspname) || '.' FROM pg_catalog.pg_namespace n WHERE substring(pg_catalog.quote_ident(n.nspname) || '.',1,1)='i' AND (SELECT pg_catalog.count(*) FROM pg_catalog.pg_namespace WHERE substrin  
RIO requests: 0  
RIO request Kbytes: 0  
WIO requests: 0  
WIO request Kbytes: 0  
RIO requests/s: 0  
RIO request Kbytes/s: 0  
WIO requests/s: 0  
WIO request Kbytes/s: 0  
query: SELECT n.nspname as "Schema",  
  c.relname as "Name",  
  CASE c.relkind WHEN 'r' THEN 'table' WHEN 'v' THEN 'view' WHEN 'm' THEN 'materialized view' WHEN 'i' THEN 'index' WHEN 'S' THEN 'sequence' WHEN 's' THEN 'special' WHEN 'f' THEN 'foreign table' END as "Type",  
  pg_catalog.pg_get_userbyid(c.relowner) as "Owner",  
 c2.relname as "Table",  
  pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as "Size",  
  pg_catalog.obj_description(c.oid, 'pg_class') as "Description"  
FROM pg_catalog.pg_class c  
     LE  
RIO requests: 5  
RIO request Kbytes: 40  
WIO requests: 0  
WIO request Kbytes: 0  
RIO requests/s: 70422  
RIO request Kbytes/s: 563380  
WIO requests/s: 0  
WIO request Kbytes/s: 0  
ctrl结束stap, 将输出所有postgres相关读写统计.  
^C----------END----------  
22116---postgres---500---22111---W  
WIO requests: 599  
WIO request Kbytes: 131936  
WIO requests/s: 6108  
WIO request Kbytes/s: 1345448  
22118---postgres---500---22111---W  
WIO requests: 70  
WIO request Kbytes: 164  
WIO requests/s: 23071  
WIO request Kbytes/s: 54054  
22114---postgres---500---22111---W  
WIO requests: 8338  
WIO request Kbytes: 66704  
WIO requests/s: 65095  
WIO request Kbytes/s: 520762  
21758---postgres---500---22111---W  
WIO requests: 13819  
WIO request Kbytes: 110551  
WIO requests/s: 47110  
WIO request Kbytes/s: 376880  
21758---postgres---500---22111---R  
RIO requests: 1731  
RIO request Kbytes: 13847  
RIO requests/s: 123687  
RIO request Kbytes/s: 989424  
22112---postgres---500---22111---W  
WIO requests: 3  
WIO request Kbytes: 0  
WIO requests/s: 29126  
WIO request Kbytes/s: 0  
22998---postgres---500---22111---R  
RIO requests: 43  
RIO request Kbytes: 141  
RIO requests/s: 133540  
RIO request Kbytes/s: 437888  
23026---postgres---500---22111---R  
RIO requests: 50  
RIO request Kbytes: 156  
RIO requests/s: 126903  
RIO request Kbytes/s: 395939  
22989---postgres---500---22111---R  
RIO requests: 43  
RIO request Kbytes: 141  
RIO requests/s: 135646  
RIO request Kbytes/s: 444794  
23026---postgres---500---22111---W  
WIO requests: 1  
WIO request Kbytes: 8  
WIO requests/s: 13888  
WIO request Kbytes/s: 111111  
23028---postgres---500---22111---R  
RIO requests: 43  
RIO request Kbytes: 141  
RIO requests/s: 133540  
RIO request Kbytes/s: 437888  
22992---postgres---500---22111---R  
RIO requests: 43  
RIO request Kbytes: 141  
RIO requests/s: 135646  
RIO request Kbytes/s: 444794  
23030---postgres---500---22111---R  
RIO requests: 43  
RIO request Kbytes: 141  
RIO requests/s: 133540  
RIO request Kbytes/s: 437888  
22986---postgres---500---22111---R  
RIO requests: 43  
RIO request Kbytes: 141  
RIO requests/s: 132716  
RIO request Kbytes/s: 435185  
22994---postgres---500---22111---R  
RIO requests: 43  
RIO request Kbytes: 141  
RIO requests/s: 132307  
RIO request Kbytes/s: 433846  
23020---postgres---500---22111---R  
RIO requests: 43  
RIO request Kbytes: 141  
RIO requests/s: 131097  
RIO request Kbytes/s: 429878  
23023---postgres---500---22111---R  
RIO requests: 43  
RIO request Kbytes: 141  
RIO requests/s: 133540  
RIO request Kbytes/s: 437888  
22983---postgres---500---22111---R  
RIO requests: 43  
RIO request Kbytes: 141  
RIO requests/s: 130699  
RIO request Kbytes/s: 428571  
22117---postgres---500---22111---R  
RIO requests: 33  
RIO request Kbytes: 15  
RIO requests/s: 58303  
RIO request Kbytes/s: 26501  

参考

1. http://blog.163.com/digoal@126/blog/static/163877040201062810748700/

2. http://blog.163.com/digoal@126/blog/static/163877040201391234023230/

3. http://blog.163.com/digoal@126/blog/static/163877040201391391613269/

4. http://blog.163.com/digoal@126/blog/static/16387704020139152191581/

Flag Counter

digoal’s 大量PostgreSQL文章入口