Systemtap examples, DISK IO - 4 I/O Monitoring (By Device)

2 minute read

背景

例子来自traceio2.stp 脚本, 该脚本使用vfs.write, vfs.read事件, 在handler中输出指定设备的读写情况, dev号可以从stat -c中获得.  
[root@db-172-16-3-150 oracle_fdw-0.9.9]# man stat  
STAT(1)                          User Commands                         STAT(1)  
  
NAME  
       stat - display file or file system status  
       -c  --format=FORMAT  
              use the specified FORMAT instead of the default; output a newline after each use of FORMAT  
       %D     Device number in hex  
// 获得当前系统的设备号, 16进制  
[root@db-172-16-3-150 oracle_fdw-0.9.9]# stat -c %D /dev/sda  
5  
[root@db-172-16-3-150 oracle_fdw-0.9.9]# stat -c %D /dev/sdb  
5  
[root@db-172-16-3-150 oracle_fdw-0.9.9]# stat -c %D /dev/sdc  
5  
[root@db-172-16-3-150 oracle_fdw-0.9.9]# stat -c %D /dev/sdd  
5  
[root@db-172-16-3-150 oracle_fdw-0.9.9]# stat -c %D /dev/sde  
5  
  
脚本内容以及注解 :   
[root@db-172-16-3-150 network]# cd /usr/share/systemtap/testsuite/systemtap.examples/io  
[root@db-172-16-3-150 io]# cat traceio2.stp  
#!/usr/bin/stap  
  
global device_of_interest  
  
probe begin {  
  /* The following is not the most efficient way to do this.  
      One could directly put the result of usrdev2kerndev()  
      into device_of_interest.  However, want to test out  
      the other device functions */  
  dev = usrdev2kerndev($1)  
  device_of_interest = MKDEV(MAJOR(dev), MINOR(dev))  
// 也可以直接使用device_of_interest = usrdev2kerndev($1)  
}  
  
probe vfs.write, vfs.read  
{  
  if (dev == device_of_interest)  
    printf ("%s(%d) %s 0x%x\n",  
            execname(), pid(), probefunc(), dev)  
}  
// 当设备和用户输入的设备一致时, 输出.  
  
我这里修改一下, 再输出devname, 同时增加未匹配的输出.  
[root@db-172-16-3-150 io]# vi traceio2.stp  
#!/usr/bin/stap  
  
global device_of_interest  
  
probe begin {  
  /* The following is not the most efficient way to do this.  
      One could directly put the result of usrdev2kerndev()  
      into device_of_interest.  However, want to test out  
      the other device functions */  
  dev = usrdev2kerndev($1)  
  device_of_interest = MKDEV(MAJOR(dev), MINOR(dev))  
}  
  
probe vfs.write, vfs.read  
{  
  if (dev == device_of_interest)  
    printf ("Match: %s(%d) %s 0x%x, %s\n",  
            execname(), pid(), probefunc(), dev, devname)  
  else  
    printf ("Unmatch: %s(%d) %s 0x%x, %s\n",  
            execname(), pid(), probefunc(), dev, devname)  
}  
  
执行输出举例 :   
需加上0x, 表示16进制.  
[root@db-172-16-3-150 io]# stap traceio2.stp 0x5  
Unmatch: stapio(16383) vfs_read 0x7, N/A  
Unmatch: stapio(16383) vfs_read 0x7, N/A  
Unmatch: avahi-daemon(1707) vfs_read 0x6, N/A  
Unmatch: sqlplus(8805) vfs_read 0xb, N/A  
Match: mingetty(2095) vfs_read 0x5, N/A  
Match: mingetty(2093) vfs_read 0x5, N/A  
Unmatch: bash(4047) vfs_read 0xb, N/A  
Match: mingetty(2091) vfs_read 0x5, N/A  
Match: mingetty(2089) vfs_read 0x5, N/A  
Match: mingetty(2087) vfs_read 0x5, N/A  
Match: mingetty(2085) vfs_read 0x5, N/A  
Unmatch: oracle(8835) vfs_read 0x8, N/A  
Unmatch: vmstat(8919) vfs_read 0x3, N/A  
Unmatch: vmstat(8919) vfs_read 0x3, N/A  
Unmatch: vmstat(8919) vfs_read 0x3, N/A  
Unmatch: vmstat(8919) vfs_write 0xb, N/A  
Match: sshd(8436) vfs_read 0x5, N/A  
Unmatch: sshd(8436) vfs_write 0x6, N/A  
Unmatch: postgres(1674) vfs_read 0x8, N/A  
Unmatch: postgres(1674) vfs_read 0x8, N/A  
Unmatch: postgres(1675) vfs_read 0x8, N/A  
Unmatch: postgres(8282) vfs_read 0x8, N/A  
  
本文用到的几个probe alias原型(包含对应的call原型).  

参照

http://blog.163.com/digoal@126/blog/static/1638770402013101885114320/

这几个函数的源码位置信息: (源码这里就不贴出来了)

参照

http://blog.163.com/digoal@126/blog/static/1638770402013101885114320/

本文用到的几个dev相关函数 :   
Name  
    function::usrdev2kerndev — Converts a user-space device number into the format used in the kernel  
Synopsis  
    usrdev2kerndev:long(dev:long)  
Arguments  
    dev  
        Device number in user-space format.  
// stat -c %D得到的dev需要使用usrdev2kerndev转换后与vfs.read, vfs.write alias中定义的dev = __file_dev($file)变量匹配.  
  
Name  
    function::MKDEV — Creates a value that can be compared to a kernel device number (kdev_t)  
Synopsis  
    MKDEV:long(major:long,minor:long)  
Arguments  
    major  
        Intended major device number.  
    minor  
        Intended minor device number.  
// 返回值和usrdev2kerndev 含义一致, 但是MKDEV一般用于用户创建的设备上.  
  
Name  
    function::MAJOR — Extract major device number from a kernel device number (kdev_t)  
Synopsis  
    MAJOR:long(dev:long)  
Arguments  
    dev  
        Kernel device number to query.  
// 从kernel device中抽取major号  
  
Name  
    function::MINOR — Extract minor device number from a kernel device number (kdev_t)  
Synopsis  
    MINOR:long(dev:long)  
Arguments  
    dev  
        Kernel device number to query.  
// 从kernel device中抽取minor号  

参考

1. https://sourceware.org/systemtap/SystemTap_Beginners_Guide/mainsect-disk.html

2. https://sourceware.org/systemtap/examples/

3. /usr/share/systemtap/testsuite/systemtap.examples

4. systemtap-testsuite

5. /usr/share/systemtap/testsuite/systemtap.examples/index.txt

6. /usr/share/systemtap/testsuite/systemtap.examples/keyword-index.txt

7. /usr/share/systemtap/tapset

8. https://sourceware.org/systemtap/tapsets/API-ctime.html

9. https://sourceware.org/systemtap/tapsets/API-gettimeofday-s.html

Flag Counter

digoal’s 大量PostgreSQL文章入口