SystemTap Flight Recorder Mode
背景
SystemTap's flight recorder mode allows you to run a SystemTap
script run for long periods and just focus on recent output.
The flight recorder mode (the -F option) limits the amount of
output generated.
There are two variations of the flight recorder mode:
in-memory and file mode. In both cases the SystemTap script runs as a background process.
systemtap飞行模式适合后台执行systemtap 脚本, 同时只保留stap最近的输出到限定大小的文件或内存中.
根据输出目标, 飞行模式分内存和文件模式.
使用stap -F选项可以打开飞行模式.
内存模式 :
最近的输出写入内存, 默认的内存大小为1MB, 超出会覆盖旧的输出的数据, 修改内存大小可使用-s选项.
man stap
-sNUM Use NUM megabyte buffers for kernel-to-user data transfer. On a multiprocessor in bulk mode, this is a
per-processor amount.
例如 :
[root@db-172-16-3-150 ~]# stap -F --vp 11111 -s10 -e '
probe vfs.* {
printdln("**", pn(), cmdline_str(), pid())
}
probe end {
printf("exited\n")
}'
输出 :
Pass 1: parsed user script and 96 library script(s) using 152024virt/25216res/2108shr/23932data kb, in 240usr/20sys/255real ms.
Pass 2: analyzed script: 13 probe(s), 11 function(s), 6 embed(s), 0 global(s) using 385080virt/128228res/7468shr/119004data kb, in 2240usr/280sys/2599real ms.
Pass 3: using cached /root/.systemtap/cache/7f/stap_7f1b46850d3110d61fa046ef63687b5b_8423.c
Pass 4: using cached /root/.systemtap/cache/7f/stap_7f1b46850d3110d61fa046ef63687b5b_8423.ko
Pass 5: starting run.
Disconnecting from systemtap module.
To reconnect, type "staprun -A stap_7f1b46850d3110d61fa046ef63687b5b_1303"
Pass 5: run completed in 0usr/20sys/34real ms.
命令行回到交互模式, 而非输出vfs.*的handler, 注意以上输出中的2行信息 :
Disconnecting from systemtap module.
To reconnect, type "staprun -A stap_7f1b46850d3110d61fa046ef63687b5b_1303"
表示stap已经运行在飞行模式, 使用lsmod可以看到这个已加载的模块.
[root@db-172-16-3-150 ~]# lsmod|grep stap
stap_7f1b46850d3110d61fa046ef63687b5b_1303 83009 0
要观察这个stap的输出, 可以使用命令 :
[root@db-172-16-3-150 ~]# staprun -A stap_7f1b46850d3110d61fa046ef63687b5b_1303|less
vfs.read**/opt/systemtap/libexec/systemtap/stapio -L -R stap_7f1b46850d3110d61fa046ef63687b5b_1303 -F3**1303
vfs.do_sync_read**/opt/systemtap/libexec/systemtap/stapio -L -R stap_7f1b46850d3110d61fa046ef63687b5b_1303 -F3**1303
vfs.read**/opt/systemtap/libexec/systemtap/stapio -L -R stap_7f1b46850d3110d61fa046ef63687b5b_1303 -F3**1303
vfs.do_sync_read**/opt/systemtap/libexec/systemtap/stapio -L -R stap_7f1b46850d3110d61fa046ef63687b5b_1303 -F3**1303
vfs.write**/opt/systemtap/libexec/systemtap/stapio -L -R stap_7f1b46850d3110d61fa046ef63687b5b_1303 -F3**1303
... 略
ERROR: Couldn't write to output 1 for cpu 0, exiting.: Success
使用内存模式的话, 重连模块后, 断开模块请使用Ctrl+\
^\ERROR: Couldn't write to output 1 for cpu 0, exiting.: Success
^\
Disconnecting from systemtap module.
To reconnect, type "staprun -A stap_7f1b46850d3110d61fa046ef63687b5b_1375"
使用Ctrl + \断开模块后, 模块不会被卸载
[root@db-172-16-3-150 ~]# lsmod|grep stap
stap_7f1b46850d3110d61fa046ef63687b5b_1375 83009 0
使用staprun连接和断开模块的方法参考man staprun :
MODULE DETACHING AND ATTACHING
After the staprun program installs a Systemtap kernel module, users can detach from the kernel module and reat-
tach to it later. The -L option loads the module and automatically detaches. Users can also detach from the
kernel module interactively by sending the SIGQUIT signal from the keyboard (typically by typing Ctrl-\).
To reattach to a kernel module, the staprun -A option would be used.
不再需要的模块可以使用staprun -d来卸载, 例如 :
[root@db-172-16-3-150 ~]# staprun -d stap_7f1b46850d3110d61fa046ef63687b5b_1375
[root@db-172-16-3-150 ~]# lsmod|grep stap
参考 :
man staprun
-d Delete a module. Only detached or unused modules the user has permission to access will be deleted. Use
"*" (quoted) to delete all unused modules.
文件模式 :
与内存模式类似, 只是保存在文件中, 而非内存. 使用-o 选项.
例如 :
[root@db-172-16-3-150 ~]# stap -F --vp 11111 -S 10,2 -o /tmp/test.log -e '
probe vfs.* {
printdln("**", pn(), cmdline_str(), pid())
}
probe end {
printf("exited\n")
}'
Pass 1: parsed user script and 96 library script(s) using 152036virt/25252res/2120shr/23944data kb, in 240usr/10sys/254real ms.
Pass 2: analyzed script: 13 probe(s), 11 function(s), 6 embed(s), 0 global(s) using 385092virt/128252res/7480shr/119016data kb, in 2250usr/270sys/2593real ms.
Pass 3: using cached /root/.systemtap/cache/7f/stap_7f1b46850d3110d61fa046ef63687b5b_8423.c
Pass 4: using cached /root/.systemtap/cache/7f/stap_7f1b46850d3110d61fa046ef63687b5b_8423.ko
Pass 5: starting run.
1515
Pass 5: run completed in 0usr/10sys/28real ms.
输出单个文件最大10MB, 保留最近的2个文件(也就是说保留最近20MB的输出信息).
文件命名方式/tmp/test.log.[0-9]+
[root@db-172-16-3-150 ~]# ll /tmp/test*
-rw-r--r-- 1 root root 5190802 Nov 6 16:32 /tmp/test.log.0
注意stap的输出如下, 1515指的是stap后台进程的pid. 可用于关闭这个后台跑着的stap.
Pass 5: starting run.
1515
Pass 5: run completed in 0usr/10sys/28real ms.
使用文件模式, 我们不需要使用staprun 连接到模块去获取输出, 直接查看文件即可.
[root@db-172-16-3-150 ~]# less /tmp/test.log.0
vfs.read**/opt/systemtap/libexec/systemtap/stapio -o /tmp/test.log -D -R -S 10,2 stap_7f1b46850d3110d61fa046ef63687b5b_1514 -F3**1515
vfs.read**/opt/systemtap/libexec/systemtap/stapio -o /tmp/test.log -D -R -S 10,2 stap_7f1b46850d3110d61fa046ef63687b5b_1514 -F3**1515
vfs.read**postgres: wal writer process "" "" ""**7014
vfs.do_sync_read**postgres: wal writer process "" "" ""**7014
vfs.write**-bash**32376
vfs.read**-bash**32376
vfs.read**sshd: root@pts/2 ""**32374
vfs.write**sshd: root@pts/2 ""**32374
vfs.do_sync_write**sshd: root@pts/2 ""**32374
vfs.read**/opt/systemtap/libexec/systemtap/stapio -o /tmp/test.log -D -R -S 10,2 stap_7f1b46850d3110d61fa046ef63687b5b_1514 -F3**1515
... 略
关闭后台stap, 与内存模式不同的是, 我们可使用kill来关闭, 而内存模式直接卸载模块.
kill -s SIGTERM 1515
参考
1. https://sourceware.org/systemtap/SystemTap_Beginners_Guide/using-usage.html
2. man staprun