Systemtap Userspace probing - 1
背景
Systemtap Support for userspace probing is supported on kernels that are configured to include the utrace or uprobes extensions.
前面几篇blog介绍了DWARF和DWARF-less探针, 本文将要介绍一个新的探针分类, userspace probe, 这个顾名思义是和用户或者运行的程序更相关的探针.
要支持user-space probing, 首先内核要支持, 内核需包含了utrace或者uprobes扩展.
Systemtap最初是为内核空间监视设计的, 在0.6的版本后才加入了进程空间的监视, 即user-space probe.
SystemTap initially focused on kernel-space probing. However, there are many instances where user-space probing can help diagnose a problem. SystemTap 0.6 added support to allow probing user-space processes. SystemTap includes support for probing the entry into and return from a function in user-space processes, probing predefined markers in user-space code, and monitoring user-process events.
The SystemTap user-space probing requires the utrace kernel extensions which provide an API for tracking various user-space events. More details about the utrace infrastructure are available athttp://sourceware.org/systemtap/wiki/utrace. The following command determines whether the currently running Linux kernel provides the needed utrace support:
grep CONFIG_UTRACE /boot/config-`uname -r`
If the Linux kernel support user-space probing, the following output is printed:
CONFIG_UTRACE=y
The SystemTap user-space probing also needs the uprobes kernel module. If the uprobes kernel module is not available, you will see an error message like the following when attempting to run a script that requires the uprobes kernel module:
SystemTap's version of uprobes is out of date.
As root, or a member of the 'root' group, run
"make -C /usr/share/systemtap/runtime/uprobes".
Pass 4: compilation failed. Try again with another '--vp 0001' option.
If this occurs, you need to generate a uprobes.ko module for the kernel as directed.
UTRACE介绍截取自/usr/share/doc/kernel-doc-x.x.x/Documentation/utrace.txt :
The UTRACE is infrastructure code for tracing and controlling user
threads. This is the foundation for writing tracing engines, which
can be loadable kernel modules. The UTRACE interfaces provide three
basic facilities:
* Thread event reporting
Tracing engines can request callbacks for events of interest in
the thread: signals, system calls, exit, exec, clone, etc.
* Core thread control
Tracing engines can prevent a thread from running (keeping it in
TASK_TRACED state), or make it single-step or block-step (when
hardware supports it). Engines can cause a thread to abort system
calls, they change the behaviors of signals, and they can inject
signal-style actions at will.
* Thread machine state access
Tracing engines can read and write a thread's registers and
similar per-thread CPU state.
检查内核是否支持utrace :
[root@db-172-16-3-150 pg93]# grep CONFIG_UTRACE /boot/config-`uname -r`
CONFIG_UTRACE=y
userspace probe的几种使用方法.
1. 进程创建和进程消亡.
process.begin 所有进程创建时触发. (如果为target process模式则受指定进程限制.stap -c or -x; 或者是stap --unprivileged 则限定为当前用户的进程)
process("PATH").begin PATH可以为当前的相对路径或者绝对路径或者在当前环境下的$PATH中搜索指定的进程名称. 在该进程创建时触发.
process(PID).begin 指进程ID为指定PID的进程被创建时触发
以下同上, 只不过代表线程.
process.thread.begin
process("PATH").thread.begin
process(PID).thread.begin
.end 指进程消亡时触发. 其他含义同上.
process.end
process("PATH").end
process(PID).end
process.thread.end
process("PATH").thread.end
process(PID).thread.end
The .begin variant is called when a new process described by PID or PATH is created. If no PID or PATH argument is specified (for example process.begin), the probe flags any new process being spawned.
The .thread.begin variant is called when a new thread described by PID or PATH is created.
The .end variant is called when a process described by PID or PATH dies.
The .thread.end variant is called when a thread described by PID or PATH dies.
例如 :
相对路径 :
[root@db-172-16-3-39 pgsql]# cd /opt
[root@db-172-16-3-39 opt]# ll pgsql9.3beta2/bin/postgres
-rwxr-xr-x 1 root root 5710772 Aug 8 22:01 pgsql9.3beta2/bin/postgres
[root@db-172-16-3-39 opt]# stap -e 'probe process("pgsql9.3beta2/bin/postgres").begin {printf("%s, %s, %d, %d\n", execname(), pp(), pid(), cpu()); exit();}'
postgres, process("/opt/pgsql9.3beta2/bin/postgres").begin, 13707, 0
绝对路径 :
[root@db-172-16-3-39 opt]# cd
[root@db-172-16-3-39 ~]# stap -e 'probe process("/opt/pgsql9.3beta2/bin/postgres").begin {printf("%s, %s, %d, %d\n", execname(), pp(), pid(), cpu()); exit();}'
postgres, process("/opt/pgsql9.3beta2/bin/postgres").begin, 13707, 0
注意这个进程并不是探测过程中新建的, 而是已经存在的, 和systemtap介绍的不一样?
[root@db-172-16-3-39 ~]# ps -ewf|grep 13707
pg93 13707 3917 0 16:27 ? 00:00:00 postgres: postgres digoal [local] idle
root 14008 3537 0 16:29 pts/2 00:00:00 grep 13707
.end 指进程消亡时倒是对的.
[root@db-172-16-3-39 ~]# stap -e 'probe process("/opt/pgsql9.3beta2/bin/postgres").end {printf("%s, %s, %d, %d\n", execname(), pp(), pid(), cpu()); exit();}'
进程退出后, 输出如下 :
postgres, process("/opt/pgsql9.3beta2/bin/postgres").end, 13707, 4
2. 进程的系统调用
Constructs:
process.syscall
process("PATH").syscall
process(PID).syscall
process.syscall.return
process("PATH").syscall.return
process(PID).syscall.return
The .syscall variant is called when a thread described by PID or PATH makes a system call.
The system call number is available in the $syscall context variable.
The first six arguments of the system call are available in the $argN parameter, for example $arg1, $arg2, and so on.
The .syscall.return variant is called when a thread described by PID or PATH returns from a system call.
The system call number is available in the $syscall context variable.
The return value of the system call is available in the $return context variable.
以上为指定进程或所有进程的系统调用探针的用法.
操作系统中支持哪些系统调用可以通过stap -l 'syscall.**'
来输出.
例如 :
[root@db-172-16-3-39 ~]# stap -l 'syscall.**'|less
syscall.accept
syscall.accept.return
syscall.access
syscall.access.return
.....略
syscall.write
syscall.write.return
举例 :
$syscall表示系统调用号, $arg1 - $arg6表示该系统调函数的前6个参数. .return不支持$arg, 但是支持$return.
[root@db-172-16-3-39 ~]# stap -e 'probe process("/opt/pgsql9.3beta2/bin/postgres").syscall { printf("%s, %d, %s, %d, %d, %d\n", execname(), $syscall, pp(), pid(), cpu(), $arg1); } probe timer.s(1) { exit(); }'
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3918, 1, 3
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3918, 1, 140734936355648
postgres, 45, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 4583, 1, 6
postgres, 45, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 4591, 1, 7
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3920, 2, 3
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3921, 4, 3
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3922, 6, 3
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3924, 7, 3
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3921, 4, 140734936357808
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3920, 2, 140734936357776
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3922, 6, 140734936357808
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3924, 7, 140734936356176
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3923, 4, 3
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3923, 4, 140734936358000
postgres, 45, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 4758, 4, 7
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3922, 6, 3
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3922, 6, 140734936357808
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3922, 6, 3
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3922, 6, 140734936357808
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3922, 6, 3
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3922, 6, 140734936357808
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3922, 6, 3
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall, 3922, 6, 140734936357808
系统调用返回探针
[root@db-172-16-3-39 ~]# stap -e 'probe process("/opt/pgsql9.3beta2/bin/postgres").syscall.return { printf("%s, %d, %s, %d, %d, %d\n", execname(), $syscall, pp(), pid(), cpu(), $return); } probe timer.s(1) { exit(); }'
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall.return, 3922, 6, -11
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall.return, 3921, 4, -11
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall.return, 3923, 4, -11
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall.return, 3922, 6, 0
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall.return, 3922, 6, -11
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall.return, 3922, 6, 0
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall.return, 3922, 6, -11
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall.return, 3922, 6, 0
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall.return, 3922, 6, -11
postgres, 7, process("/opt/pgsql9.3beta2/bin/postgres").syscall.return, 3922, 6, 0
postgres, 0, process("/opt/pgsql9.3beta2/bin/postgres").syscall.return, 3922, 6, -11
3. 进程的函数或者语句级探针.
与内核的函数和语句级探针类似, 只是这里限定在指定进程的函数调用或者语句探针.
Constructs:
process("PATH").function("NAME")
process("PATH").statement("*@FILE.c:123")
process("PATH").function("*").return
process("PATH").function("myfun").label("foo")
Full symbolic source-level probes in userspace programs and shared libraries are supported.
These are exactly analogous to the symbolic DWARF-based kernel or module probes described previously and expose similar contextual $-variables.
Here is an example of prototype symbolic userspace probing support:
# stap -e 'probe process("ls").function("*").call {
log (probefunc()." ".$$parms)
}' \
-c 'ls -l'
To run, this script requires debugging information for the named program and utrace support in the kernel.
If you see a "pass 4a-time" build failure, check that your kernel supports utrace.
注意如果要使用进程的函数或语句级探针, 与DWARF-based kernel or module探针一样, 需要先安装debuginfo包(指对应的function或者statement指定的函数对应的debug信息), 当然也需要内核支持utrace.
[root@db-172-16-3-39 ~]# rpm -qa|grep debuginf
kernel-debuginfo-common-2.6.18-348.12.1.el5
kernel-debuginfo-2.6.18-348.12.1.el5
进程级的函数或语句探针, 同样支持函数参数, 本地变量, 全局变量等. 详细介绍请参考
http://blog.163.com/digoal@126/blog/static/1638770402013823101827553/
举例 :
当进程不支持debuginfo时, 会报错 :
[root@db-172-16-3-39 ~]# stap -e 'probe process("/opt/pgsql9.3beta2/bin/postgres").function("tcp_v4_connect") { printf("%s, %d, %d, %s\n", pp(), pid(), cpu(), $$vars); } probe timer.s(1) { exit(); }'
WARNING: cannot find module /opt/pgsql9.3beta2/bin/postgres debuginfo: No DWARF information found
semantic error: while resolving probe point: identifier 'process' at <input>:1:7
source: probe process("/opt/pgsql9.3beta2/bin/postgres").function("tcp_v4_connect") { printf("%s, %d, %d, %s\n", pp(), pid(), cpu(), $$vars); } probe timer.s(1) { exit(); }
^
semantic error: no match
Pass 2: analysis failed. Try again with another '--vp 01' option.
postgresql编译时未加–enable-dtrace参数 :
pg93@db-172-16-3-39-> pg_config --configure
'--prefix=/opt/pgsql9.3beta2' '--with-pgport=1921' '--with-perl' '--with-python' '--with-openssl' '--with-pam' '--without-ldap' '--with-libxml' '--with-libxslt' '--enable-thread-safety' '--with-wal-blocksize=16'
换个加了–enable-dtrace以及–enable-debug的试试 :
pg94@db-172-16-3-39-> pg_config --configure
'--prefix=/home/pg94/pgsql9.4devel' '--with-pgport=2999' '--with-perl' '--with-tcl' '--with-python' '--with-openssl' '--with-pam' '--without-ldap' '--with-libxml' '--with-libxslt' '--enable-thread-safety' '--with-wal-blocksize=16' '--enable-dtrace' '--enable-debug'
可以了
[root@db-172-16-3-150 ~]# /usr/bin/stap -e 'probe process("/home/pg93/pgsql9.3.1/bin/postgres").function("*") { printf("%s, %d, %d, %s\n", pp(), pid(), cpu(), probefunc()); }'
WARNING: u*probe failed postgres[10269] 'process("/home/pg93/pgsql9.3.1/bin/postgres").function("tas@../../../../src/include/storage/s_lock.h:204")' addr 00000000004b08fd rc -1
WARNING: u*probe failed postgres[10268] 'process("/home/pg93/pgsql9.3.1/bin/postgres").function("tas@../../../../src/include/storage/s_lock.h:204")' addr 00000000004b08fd rc -1
WARNING: u*probe failed postgres[10269] 'process("/home/pg93/pgsql9.3.1/bin/postgres").function("tas@../../../src/include/storage/s_lock.h:204")' addr 0000000000622070 rc -1
WARNING: u*probe failed postgres[10269] 'process("/home/pg93/pgsql9.3.1/bin/postgres").function("tas@../../../src/include/storage/s_lock.h:204")' addr 0000000000622160 rc -1
WARNING: u*probe failed postgres[10268] 'process("/home/pg93/pgsql9.3.1/bin/postgres").function("tas@../../../src/include/storage/s_lock.h:204")' addr 0000000000622070 rc -1
WARNING: u*probe failed postgres[10268] 'process("/home/pg93/pgsql9.3.1/bin/postgres").function("tas@../../../src/include/storage/s_lock.h:204")' addr 0000000000622160 rc -1
process("/home/pg93/pgsql9.3.1/bin/postgres").function("ResetLatch@/opt/soft_bak/postgresql-9.3.1/src/backend/port/pg_latch.c:552"), 10269, 1, ResetLatch
process("/home/pg93/pgsql9.3.1/bin/postgres").function("XLogBackgroundFlush@/opt/soft_bak/postgresql-9.3.1/src/backend/access/transam/xlog.c:2074"), 10269, 1, XLogBackgroundFlush
process("/home/pg93/pgsql9.3.1/bin/postgres").function("RecoveryInProgress@/opt/soft_bak/postgresql-9.3.1/src/backend/access/transam/xlog.c:6223"), 10269, 1, RecoveryInProgress
process("/home/pg93/pgsql9.3.1/bin/postgres").function("tas@../../../../src/include/storage/s_lock.h:204"), 10269, 1, tas
process("/home/pg93/pgsql9.3.1/bin/postgres").function("tas@../../../../src/include/storage/s_lock.h:204"), 10269, 1, tas
process("/home/pg93/pgsql9.3.1/bin/postgres").function("WaitLatch@/opt/soft_bak/postgresql-9.3.1/src/backend/port/pg_latch.c:194"), 10269, 1, WaitLatch
......略
使用stap -l输出支持的探针.
[root@db-172-16-3-150 ~]# /usr/bin/stap -l 'process("/home/pg93/pgsql9.3.1/bin/postgres").function("*")'|less
process("/home/pg93/pgsql9.3.1/bin/postgres").function("ATAddCheckConstraint@/opt/soft_bak/postgresql-9.3.1/src/backend/commands/tablecmds.c:5659")
process("/home/pg93/pgsql9.3.1/bin/postgres").function("ATAddForeignKeyConstraint@/opt/soft_bak/postgresql-9.3.1/src/backend/commands/tablecmds.c:5780")
process("/home/pg93/pgsql9.3.1/bin/postgres").function("ATColumnChangeRequiresRewrite@/opt/soft_bak/postgresql-9.3.1/src/backend/commands/tablecmds.c:7382")
process("/home/pg93/pgsql9.3.1/bin/postgres").function("ATController@/opt/soft_bak/postgresql-9.3.1/src/backend/commands/tablecmds.c:2913")
... 略
使用系统提供的例子 :
[root@db-172-16-3-39 ~]# stap -e 'probe process("ls").function("*").call {
> log (probefunc()." ".$$parms)
> }' \
> -c 'ls -l'
WARNING: cannot find module /bin/ls debuginfo: No DWARF information found
semantic error: while resolving probe point: identifier 'process' at <input>:1:7
source: probe process("ls").function("*").call {
^
semantic error: no match
Pass 2: analysis failed. Try again with another '--vp 01' option.
Missing separate debuginfos, use: debuginfo-install coreutils-5.97-34.el5.x86_64
以上报错原因是系统中未安装ls的debuginfo.
[root@db-172-16-3-39 ~]# stap -l 'process("/bin/ls").function("**")'
[root@db-172-16-3-39 ~]# stap -l 'process("/bin/ls").function("*")'
[root@db-172-16-3-39 ~]# stap -l 'process("/bin/ls").function("*.*")'
4. Absolute variant
A non-symbolic probe point such as process(PID).statement(ADDRESS).absolute is analogous to
kernel.statement(ADDRESS).absolute in that both use raw, unverified virtual addresses and provide no $variables.
The target PID parameter must identify a running process and ADDRESS must identify a valid instruction address.
All threads of the listed process will be probed.
This is a guru mode probe.
必须使用stap -g 模式运行.
5. 进程路径搜索(PATH).
For all process probes, PATH names refer to executables that are searched the same way that shells do: the explicit path specified if the path name begins with a slash (/) character sequence; otherwise $PATH is searched. For example, the following probe syntax:
probe process("ls").syscall {}
probe process("./a.out").syscall {}
works the same as:
probe process("/bin/ls").syscall {}
probe process("/my/directory/a.out").syscall {}
If a process probe is specified without a PID or PATH parameter, all user threads are probed. However, if systemtap is invoked in target process mode, process probes are restricted to the process hierarchy associated with the target process. If stap is running in -unprivileged mode, only processes owned by the current user are selected.
相对路径, 绝对路径, 或者当前环境$PATH下搜索.
例如 :
root@db-172-16-3-39-> which postgres
/home/pg94/pgsql9.4devel/bin/postgres
root@db-172-16-3-39-> whoami
root
当前环境中搜索
root@db-172-16-3-39-> stap -e 'probe process("postgres").syscall { printf("%s, %d\n", pp(), $syscall); exit() }'
process("/home/pg94/pgsql9.4devel/bin/postgres").syscall, 23
相对路径
root@db-172-16-3-39-> stap -e 'probe process("pgsql9.4devel/bin/postgres").syscall { printf("%s, %d\n", pp(), $syscall); exit() }'
process("/home/pg94/pgsql9.4devel/bin/postgres").syscall, 0
绝对路径
root@db-172-16-3-39-> stap -e 'probe process("/home/pg94/pgsql9.4devel/bin/postgres").syscall { printf("%s, %d\n", pp(), $syscall); exit() }'
process("/home/pg94/pgsql9.4devel/bin/postgres").syscall, 0
未完待续.
参考
1. https://sourceware.org/systemtap/langref/Probe_points.html
2. http://lwn.net/Articles/499190/
3. http://lwn.net/Articles/224772/
4. /usr/share/doc/kernel-doc-x.x.x/Documentation/utrace.txt
5. http://blog.163.com/digoal@126/blog/static/1638770402013823101827553/
6. https://sourceware.org/systemtap/SystemTap_Beginners_Guide/userspace-probing.html