PostgreSQL SystemTap on Linux - 1

4 minute read

背景

PostgreSQL 支持动态跟踪, 可以通过dtrace或者systemtap工具统计相关探针的信息.

安装systemtap

yum install systemtap kernel-debuginfo kernel-devel  

将安装以下包

systemtap-devel-1.8-6.el5  
systemtap-client-1.8-6.el5  
systemtap-runtime-1.8-6.el5  

后面将使用的stap命令

[root@db-172-16-3-39 ~]# rpm -qf /usr/bin/stap  
systemtap-devel-1.8-6.el5  
systemtap-client-1.8-6.el5  

检查stap是否正常

[root@db-172-16-3-39 ~]# stap   
stap: /usr/lib64/libelf.so.1: version `ELFUTILS_1.5' not found (required by stap)  
stap: /usr/lib64/libdw.so.1: version `ELFUTILS_0.148' not found (required by stap)  
stap: /usr/lib64/libdw.so.1: version `ELFUTILS_0.138' not found (required by stap)  
stap: /usr/lib64/libdw.so.1: version `ELFUTILS_0.142' not found (required by stap)  
stap: /usr/lib64/libdw.so.1: version `ELFUTILS_0.143' not found (required by stap)  
stap: /usr/lib64/libdw.so.1: version `ELFUTILS_0.149' not found (required by stap)  

这个错误的原因是库文件版本不正确, 用来老版本, 使用eu-readelf查看stap文件的两个环境变量, 如下 :

[root@db-172-16-3-39 ~]# eu-readelf -d /usr/bin/stap|grep -E "RPATH|RUNPATH"  
  RPATH             Library rpath: [/usr/lib64/systemtap]  
  RUNPATH           Library runpath: [/usr/lib64/systemtap]  

将路径加入到LD_LIBRARY_PATH中.

[root@db-172-16-3-39 ~]# export LD_LIBRARY_PATH=/usr/lib64/systemtap:$LD_LIBRARY_PATH  

stap现在正常了

[root@db-172-16-3-39 ~]# stap   
A script must be specified.  
Systemtap translator/driver (version 1.8/0.152 non-git sources)  
Copyright (C) 2005-2012 Red Hat, Inc. and others  
This is free software; see the source for copying conditions.  
enabled features: AVAHI LIBRPM LIBSQLITE3 NSS BOOST_SHARED_PTR TR1_UNORDERED_MAP NLS  
Usage: stap [options] FILE         Run script in file.  
   or: stap [options] -            Run script on stdin.  
   or: stap [options] -e SCRIPT    Run given script.  
   or: stap [options] -l PROBE     List matching probes.  
   or: stap [options] -L PROBE     List matching probes and local variables.  

测试 :

[root@db-172-16-3-39 pg94]# vi tps.d   
probe begin  
{  
  printf("hello\n")  
  exit()  
}  
[root@db-172-16-3-39 pg94]# stap tps.d   
Checking "/lib/modules/2.6.18-274.el5/build/.config" failed with error: No such file or directory  
Incorrect version or missing kernel-devel package, use: yum install kernel-devel-2.6.18-274.el5.x86_64   

这个错误是由于未安装当前正在运行的kernel对应的kernel-devel包.

[root@db-172-16-3-39 pg94]# rpm -qa|grep kernel  
kernel-headers-2.6.18-274.el5  
kernel-xen-devel-2.6.18-274.el5  
kernel-xen-2.6.18-274.el5  
kernel-2.6.18-274.el5  
[root@db-172-16-3-39 ~]# uname -a  
Linux db-172-16-3-39.sky-mobi.com 2.6.18-274.el5 #1 SMP Fri Jul 22 04:43:29 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux  

安装对应的kernel-devel版本.

yum install kernel-devel-2.6.18-274.el5.x86_64  

如果yum源中没有这个版本的kernel-devel包, 可以去安装光盘中找一找.

或者同时更新内核版本至新版本.

 yum install -y kernel.x86_64 kernel-devel.x86_64  
Package kernel-2.6.18-348.12.1.el5.x86_64 already installed and latest version  
Package kernel-devel-2.6.18-348.12.1.el5.x86_64 already installed and latest version  
vi /boot/grub/grub.conf  
default=0  
timeout=5  
splashimage=(hd0,0)/boot/grub/splash.xpm.gz  
hiddenmenu  
title CentOS (2.6.18-348.12.1.el5)  
        root (hd0,0)  
        kernel /boot/vmlinuz-2.6.18-348.12.1.el5 ro root=LABEL=/ rhgb quiet  
        initrd /boot/initrd-2.6.18-348.12.1.el5.img  

重启服务器

现在stap工作正常了 :

[root@db-172-16-3-39 postgresql-5e3e8e4]# stap -ve 'probe begin { log("hello world") exit() }'  
Pass 1: parsed user script and 85 library script(s) using 146788virt/23676res/3000shr/21384data kb, in 170usr/0sys/173real ms.  
Pass 2: analyzed script: 1 probe(s), 2 function(s), 0 embed(s), 0 global(s) using 147316virt/24396res/3224shr/21912data kb, in 10usr/0sys/6real ms.  
Pass 3: using cached /root/.systemtap/cache/3b/stap_3b2eaec778ce9832b394535505dde575_838.c  
Pass 4: using cached /root/.systemtap/cache/3b/stap_3b2eaec778ce9832b394535505dde575_838.ko  
Pass 5: starting run.  
hello world  
Pass 5: run completed in 0usr/20sys/306real ms.  

stap调试好后, 就可以用来跟踪postgresql了.

PostgreSQL编译时必须开启dtrace支持. 开启dtrace后, 数据库将启用代码中的探针或跟踪点.

PostgreSQL内建探针参考如下 :

src/backend/utils/probes.d  

http://www.postgresql.org/docs/devel/static/dynamic-trace.html

检查你的PostgreSQL是否开启了dtrace支持, 如下 :

pg94@db-172-16-3-39-> pg_config --configure  
'--prefix=/home/pg94/pgsql9.4devel' '--with-pgport=2999' '--with-perl' '--with-tcl' '--with-python' '--with-openssl' '--with-pam' '--without-ldap' '--with-libxml' '--with-libxslt' '--enable-thread-safety' '--with-wal-blocksize=16' '--enable-dtrace'  

如果没有–enable-dtrace, 那么需要重新编译一下你的PostgreSQL软件.

stap测试脚本1.

[root@db-172-16-3-39 pg94]# cat postgresql-query.stp   
global query_time, query_summary  
  
probe process("/home/pg94/pgsql9.4devel/bin/postgres").mark("query__start") {  
  query_time[tid(), $arg1] = gettimeofday_us();  
}  
  
probe process("/home/pg94/pgsql9.4devel/bin/postgres").mark("query__done") {  
  p = tid()  
  t = query_time[p, $arg1]; delete query_time[p, $arg1]  
  if (t) {  
    query_summary[p] <<< (gettimeofday_us() - t);  
  }  
}  
  
probe end {  
  printf("\ntid count min(us) avg(us) max(us)\n");  
  foreach (p in query_summary) {  
    printf("%d %d %d %d %d\n", p, @count(query_summary[p]),  
     @min(query_summary[p]), @avg(query_summary[p]), @max(query_summary[p]));  
  }  
}  

执行stap :

[root@db-172-16-3-39 pg94]# stap postgresql-query.stp   

执行以下SQL :

[root@db-172-16-3-39 pg94]# stap postgresql-query.stp   
digoal=# begin;  
BEGIN  
digoal=# select * from test for update;  
 id   
----  
  1  
  2  
  3  
  4  
  5  
  6  
  7  
  8  
  9  
 10  
(10 rows)  
digoal=# end;  
COMMIT  
digoal=# select txid_current();  
 txid_current   
--------------  
      5969062  
(1 row)  

结束stap, 输出 :

[root@db-172-16-3-39 pg94]# stap postgresql-query.stp   

按键Ctrl+C, 输出 :

tid count min(us) avg(us) max(us)  
17112 4 46 3794 14885  

这个tid对应PostgreSQL background process

[root@db-172-16-3-39 pg94]# ps -ewf|grep 17112  
pg94     17112 17005  0 15:15 ?        00:00:00 postgres: postgres digoal [local] idle  
digoal=# select pg_backend_pid();  
 pg_backend_pid   
----------------  
          17112  
(1 row)  

小结

1. 安装systemtap注意 :

需要安装kernel相关, 并且版本要一致, 启动的kernel版本也要一致 :

kernel  
kernel-debuginfo  
kernel-devel  

kernel-debuginfo的安装要用到debug源.

[root@db-172-16-3-39 pg94]# cd /etc/yum.repos.d/  
[root@db-172-16-3-39 yum.repos.d]# ll  
total 36  
-rw-r--r-- 1 root root 1926 Aug 29  2011 CentOS-Base.repo  
-rw-r--r-- 1 root root  631 Aug 29  2011 CentOS-Debuginfo.repo  
-rw-r--r-- 1 root root  626 Aug 29  2011 CentOS-Media.repo  
-rw-r--r-- 1 root root 5390 Aug 29  2011 CentOS-Vault.repo  
[root@db-172-16-3-39 yum.repos.d]# cat CentOS-Debuginfo.repo   
# CentOS-Base.repo  
#  
# The mirror system uses the connecting IP address of the client and the  
# update status of each mirror to pick mirrors that are updated to and  
# geographically close to the client.  You should use this for CentOS updates  
# unless you are manually picking other mirrors.  
#  
  
# All debug packages from all the various CentOS-5 releases  
# are merged into a single repo, split by BaseArch  
#  
# Note: packages in the debuginfo repo are currently not signed  
#  
  
[debug]  
name=CentOS-5 - Debuginfo  
baseurl=http://debuginfo.centos.org/5/$basearch/  
gpgcheck=0  
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-5  
enabled=0  

这里用到的源名为debug.

yum --enablerepo=debug list kernel-debuginfo  
yum --enablerepo=debug install kernel-debuginfo  

安装细节参考此文 :

http://pic.dhe.ibm.com/infocenter/lnxinfo/v3r0m0/topic/liaai.systemTap/liaaisystap_pdf.pdf

2. postgresql编译时必须加上–enable-dtrace参数, 否则stap时会出现类似以下错误.

[root@db-172-16-3-39 pg93]# stap test.stp   
semantic error: while resolving probe point: identifier 'process' at test.stp:3:7  
        source: probe process("/opt/pgsql9.3beta2/bin/postgres").mark("lock__wait__start")  
                      ^  
  
semantic error: no match  
semantic error: while resolving probe point: identifier 'process' at :8:7  
        source: probe process("/opt/pgsql9.3beta2/bin/postgres").mark("lock__wait__done")  
                      ^  
  
semantic error: while resolving probe point: identifier 'process' at :17:7  
        source: probe process("/opt/pgsql9.3beta2/bin/postgres").mark("lwlock__wait__start")  
                      ^  
  
semantic error: while resolving probe point: identifier 'process' at :22:7  
        source: probe process("/opt/pgsql9.3beta2/bin/postgres").mark("lwlock__wait__done")  
                      ^  
  
semantic error: while resolving probe point: identifier 'process' at :32:7  
        source: probe process("/opt/pgsql9.3beta2/bin/postgres").mark("lwlock__condacquire__fail")  
                      ^  
  
Pass 2: analysis failed.  Try again with another '--vp 01' option.  

3. 允许stap需要root权限, 或者将用户加入stapdev或者stapsys, 以及stapusr组.

如果普通用户不加入这两个组, 执行stap将报错 :

pg93@db-172-16-3-39-> stap -ve 'probe begin { log("hello world") exit() }'  
You are trying to run systemtap as a normal user.  
You should either be root, or be part of the group "stapusr" and possibly one of the groups "stapsys" or "stapdev".  
Systemtap translator/driver (version 1.8/0.152 non-git sources)  

加完组后正常

# usermod -G stapdev,stapusr pg94  
pg94@db-172-16-3-39-> stap -ve 'probe begin { log("hello world") exit() }'  
Pass 1: parsed user script and 85 library script(s) using 148892virt/23772res/3068shr/21396data kb, in 160usr/10sys/172real ms.  
Pass 2: analyzed script: 1 probe(s), 2 function(s), 0 embed(s), 0 global(s) using 149420virt/24492res/3292shr/21924data kb, in 10usr/0sys/6real ms.  
Pass 3: translated to C into "/tmp/stapcqtmUe/stap_758dbd41826239e5e3211a815f6bfc58_838_src.c" using 149420virt/24760res/3540shr/21924data kb, in 0usr/0sys/0real ms.  
Pass 4: compiled C into "stap_758dbd41826239e5e3211a815f6bfc58_838.ko" in 910usr/110sys/1028real ms.  
Pass 5: starting run.  
hello world  
Pass 5: run completed in 10usr/20sys/307real ms.  

本文就介绍到这里, 下次将介绍如何使用postgresql的探针.

参考

1. http://pgfoundry.org/projects/dtrace/

2. http://www.fosslc.org/drupal/content/probing-postgresql-dtrace-and-systemtap

3. http://www.postgresql.org/docs/devel/static/dynamic-trace.html

4. http://www.postgresql.org/docs/devel/static/install-procedure.html

5. http://www.ppurl.com/2011/05/dtrace-dynamic-tracing-in-oracle-solaris-mac-os-x-and-freebsd.html

6. http://www.emm.usp.br/downloads/pg/PG_perf_bootcamp.pdf

7. https://wiki.postgresql.org/wiki/DTrace

8. http://www.ppurl.com/?s=systemtap

9. http://www.ibm.com/developerworks/cn/linux/l-systemtap/

10. http://sourceware.org/systemtap/wiki/HomePage

11. http://fruli.krunch.be/~krunch/systemtap-osdcfr-20101010.pdf

12. http://blog.endpoint.com/2009/05/postgresql-with-systemtap.html

13. https://www.evernote.com/shard/s48/sh/1ccb0466-79b7-4090-9a5d-9371358ac54d/b8434e3e3b3130ce72422b9ae067e7b9

14. http://pic.dhe.ibm.com/infocenter/lnxinfo/v3r0m0/topic/liaai.systemTap/liaaisystap_pdf.pdf

15. http://linux.chinaunix.net/techdoc/develop/2008/12/28/1055546.shtml

16. http://os.51cto.com/art/201305/395819.htm

17. https://access.redhat.com/site/documentation/Red_Hat_Enterprise_Linux/?locale=en-US

Flag Counter

digoal’s 大量PostgreSQL文章入口