Systemtap EXP: PostgreSQL IN-BUILD mark Class 1 - transaction
背景
本文开始将以PostgreSQL内建的探针为例, 探讨一下围绕这些探针可以实现哪些想要的功能.
使用systemtap时, 需要把postgresql mark name中的-换成两个下划线.
第一个分类是事务相关探针 :
Name | Parameters | Description |
---|---|---|
transaction-start | (LocalTransactionId) | Probe that fires at the start of a new transaction. arg0 is the transaction ID. |
transaction-commit | (LocalTransactionId) | Probe that fires when a transaction completes successfully. arg0 is the transaction ID. |
transaction-abort | (LocalTransactionId) | Probe that fires when a transaction completes unsuccessfully. arg0 is the transaction ID. |
事务开始, 事务提交, 事务回滚3个mark.
探针信息 :
src/backend/utils/probes.h
/* TRACE_POSTGRESQL_TRANSACTION_START ( unsigned int) */
#if defined STAP_SDT_V1
#define TRACE_POSTGRESQL_TRANSACTION_START_ENABLED() __builtin_expect (transaction__start_semaphore, 0)
#define postgresql_transaction__start_semaphore transaction__start_semaphore
#else
#define TRACE_POSTGRESQL_TRANSACTION_START_ENABLED() __builtin_expect (postgresql_transaction__start_semaphore, 0)
#endif
__extension__ extern unsigned short postgresql_transaction__start_semaphore __attribute__ ((unused)) __attribute__ ((section (".probes")));
#define TRACE_POSTGRESQL_TRANSACTION_START(arg1) \
DTRACE_PROBE1(postgresql,transaction__start,arg1)
/* TRACE_POSTGRESQL_TRANSACTION_COMMIT ( unsigned int) */
#if defined STAP_SDT_V1
#define TRACE_POSTGRESQL_TRANSACTION_COMMIT_ENABLED() __builtin_expect (transaction__commit_semaphore, 0)
#define postgresql_transaction__commit_semaphore transaction__commit_semaphore
#else
#define TRACE_POSTGRESQL_TRANSACTION_COMMIT_ENABLED() __builtin_expect (postgresql_transaction__commit_semaphore, 0)
#endif
__extension__ extern unsigned short postgresql_transaction__commit_semaphore __attribute__ ((unused)) __attribute__ ((section (".probes")));
#define TRACE_POSTGRESQL_TRANSACTION_COMMIT(arg1) \
DTRACE_PROBE1(postgresql,transaction__commit,arg1)
/* TRACE_POSTGRESQL_TRANSACTION_ABORT ( unsigned int) */
#if defined STAP_SDT_V1
#define TRACE_POSTGRESQL_TRANSACTION_ABORT_ENABLED() __builtin_expect (transaction__abort_semaphore, 0)
#define postgresql_transaction__abort_semaphore transaction__abort_semaphore
#else
#define TRACE_POSTGRESQL_TRANSACTION_ABORT_ENABLED() __builtin_expect (postgresql_transaction__abort_semaphore, 0)
#endif
__extension__ extern unsigned short postgresql_transaction__abort_semaphore __attribute__ ((unused)) __attribute__ ((section (".probes")));
#define TRACE_POSTGRESQL_TRANSACTION_ABORT(arg1) \
DTRACE_PROBE1(postgresql,transaction__abort,arg1)
探针在pg源码中的信息 :
src/backend/access/transam/xact.c
/*
* StartTransaction
*/
static void
StartTransaction(void)
{
...
VirtualTransactionId vxid;
...
TRACE_POSTGRESQL_TRANSACTION_START(vxid.localTransactionId);
...
/*
* CommitTransaction
*
* NB: if you change this routine, better look at PrepareTransaction too!
*/
static void
CommitTransaction(void)
{
...
TRACE_POSTGRESQL_TRANSACTION_COMMIT(MyProc->lxid);
...
/*
* AbortTransaction
*/
static void
AbortTransaction(void)
...
TRACE_POSTGRESQL_TRANSACTION_ABORT(MyProc->lxid);
这三个探针对应的参数为本地事务号, 和全局事务号无关.
本地事务类型定义 :
src/include/storage/lock.h
/*
* Top-level transactions are identified by VirtualTransactionIDs comprising
* the BackendId of the backend running the xact, plus a locally-assigned
* LocalTransactionId. These are guaranteed unique over the short term,
* but will be reused after a database restart; hence they should never
* be stored on disk.
*
* Note that struct VirtualTransactionId can not be assumed to be atomically
* assignable as a whole. However, type LocalTransactionId is assumed to
* be atomically assignable, and the backend ID doesn't change often enough
* to be a problem, so we can fetch or assign the two fields separately.
* We deliberately refrain from using the struct within PGPROC, to prevent
* coding errors from trying to use struct assignment with it; instead use
* GET_VXID_FROM_PGPROC().
*/
typedef struct
{
BackendId backendId; /* determined at backend startup */
LocalTransactionId localTransactionId; /* backend-local transaction
* id */
} VirtualTransactionId;
使用事务相关的探针, 一般可以用于统计以下信息 :
1. 每秒新建, 提交, 回滚的事务数
2. 记录每个事务的时长
3. 结合内核探针, 可以记录每个事务的io开销, 网络开销, 单步指令等.
例子 :
1. 每秒新建, 提交, 回滚的事务数
digoal=#
CREATE OR REPLACE FUNCTION public.f_test(i_id integer)
RETURNS void
LANGUAGE plpgsql
STRICT
AS $function$
declare
begin
update test set info=md5(random()::text), crt_time=clock_timestamp() where id=i_id;
if not found then
insert into test(id,info,crt_time) values(i_id,md5(random()::text),clock_timestamp());
end if;
return;
exception when others then
return;
end;
$function$;
digoal=# create table test(id int primary key, info text, crt_time timestamp);
CREATE TABLE
pg93@db-172-16-3-150-> cat test.sql
\setrandom id 1 5000000
select f_test(:id);
[root@db-172-16-3-150 ~]# stap -e '
global var1
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__start") {
var1["START"]++
}
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__commit") {
var1["COMMIT"]++
}
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__abort") {
var1["ABORT"]++
}
probe timer.s(1) {
printf("START/s:%d, COMMIT/s:%d, ABORT/s:%d\n", var1["START"], var1["COMMIT"], var1["ABORT"])
var1["START"]=0
var1["COMMIT"]=0
var1["ABORT"]=0
}'
pg93@db-172-16-3-150-> pgbench -M prepared -n -r -f ./test.sql -c 8 -j 1 -T 10
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 1
duration: 10 s
number of transactions actually processed: 175602
tps = 17559.485329 (including connections establishing)
tps = 17601.731285 (excluding connections establishing)
statement latencies in milliseconds:
0.001609 \setrandom id 1 5000000
0.451537 select f_test(:id);
stap输出 :
START/s:0, COMMIT/s:0, ABORT/s:0
START/s:0, COMMIT/s:0, ABORT/s:0
START/s:7484, COMMIT/s:7483, ABORT/s:0
START/s:18035, COMMIT/s:18032, ABORT/s:0
START/s:17345, COMMIT/s:17346, ABORT/s:0
START/s:17151, COMMIT/s:17150, ABORT/s:0
START/s:17517, COMMIT/s:17520, ABORT/s:0
START/s:18048, COMMIT/s:18046, ABORT/s:0
START/s:17597, COMMIT/s:17600, ABORT/s:0
START/s:17648, COMMIT/s:17645, ABORT/s:0
START/s:17728, COMMIT/s:17724, ABORT/s:0
START/s:17346, COMMIT/s:17348, ABORT/s:0
START/s:9720, COMMIT/s:9725, ABORT/s:0
START/s:0, COMMIT/s:0, ABORT/s:0
START/s:0, COMMIT/s:0, ABORT/s:0
2. 记录每个事务的时长, 输出柱状图
首先要确保三个探针的本地事务号一致 :
[root@db-172-16-3-150 ~]# stap -e '
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__start") {
println(pn(),$arg1)
}
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__commit") {
println(pn(),$arg1)
}
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__abort") {
println(pn(),$arg1)
}'
digoal=# select 1;
?column?
----------
1
(1 row)
digoal=# begin;
BEGIN
digoal=# abort;
ROLLBACK
process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__start")55238
process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__commit")55238
process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__start")55239
process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__abort")55239
stap脚本如下 :
stap -e '
global var1%[819200], var2, var3
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__start") {
var1[pid(),$arg1] = gettimeofday_ms()
}
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__commit") {
if (var1[pid(),$arg1] != 0)
var2 <<< (gettimeofday_ms()-var1[pid(),$arg1])
}
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__abort") {
if (var1[pid(),$arg1] != 0)
var3 <<< (gettimeofday_ms()-var1[pid(),$arg1])
}
probe timer.s($1) {
if (@count(var2) != 0) {
printf("COMMIT/s:%d\n", @count(var2) / $1)
println(@hist_log(var2))
delete var2
}
if (@count(var3) != 0) {
printf("ABORT/s:%d\n", @count(var3) / $1)
println(@hist_log(var3))
delete var3
}
}' 3
pg93@db-172-16-3-150-> pgbench -M prepared -n -r -f ./test.sql -c 8 -j 1 -T 30
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 1
duration: 30 s
number of transactions actually processed: 737912
tps = 24596.734611 (including connections establishing)
tps = 24615.235172 (excluding connections establishing)
statement latencies in milliseconds:
0.002188 \setrandom id 1 5000000
0.321188 select f_test(:id);
输出
COMMIT/s:29660
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 72349
1 |@@@@@@@@@@@ 16600
2 | 32
4 | 0
8 | 0
COMMIT/s:57109
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 138167
1 |@@@@@@@@@@@ 33128
2 | 29
4 | 2
8 | 2
16 | 0
32 | 0
COMMIT/s:24830
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 58175
1 |@@@@@@@@@@@@@@ 16310
2 | 7
4 | 0
8 | 0
COMMIT/s:22335
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 51034
1 |@@@@@@@@@@@@@@@ 15960
2 | 9
4 | 0
8 | 2
16 | 0
32 | 0
COMMIT/s:20267
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 45321
1 |@@@@@@@@@@@@@@@@@ 15468
2 | 13
4 | 0
8 | 0
COMMIT/s:18493
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 40386
1 |@@@@@@@@@@@@@@@@@@ 15087
2 | 5
4 | 1
8 | 0
16 | 0
[其他]
1. 第二个例子可能存在偏差, 与stap全局变量锁有关, 不要除以定量时间即可.
修改如下 :
stap -e '
global var1%[819200], var2, var3, var4
probe begin {
var4=gettimeofday_ms()
}
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__start") {
var1[pid(),$arg1] = gettimeofday_ms()
}
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__commit") {
if (var1[pid(),$arg1] != 0)
var2 <<< (gettimeofday_ms()-var1[pid(),$arg1])
}
probe process("/home/pg93/pgsql9.3.1/bin/postgres").mark("transaction__abort") {
if (var1[pid(),$arg1] != 0)
var3 <<< (gettimeofday_ms()-var1[pid(),$arg1])
}
probe timer.s($1) {
now=gettimeofday_ms()
if (@count(var2) != 0) {
printf("COMMIT/s:%d\n", (1000*@count(var2)) / (now-var4))
println(@hist_log(var2))
delete var2
}
if (@count(var3) != 0) {
printf("ABORT/s:%d\n", (1000*@count(var3)) / (now-var4))
println(@hist_log(var3))
delete var3
}
var4=now
}' 3
输出精准 :
COMMIT/s:16364
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 39936
1 |@@@@@@@@@@@ 9137
2 | 20
4 | 0
8 | 0
COMMIT/s:16427
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 39927
1 |@@@@@@@@@@@ 9346
2 | 10
4 | 0
8 | 0
COMMIT/s:16436
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 40073
1 |@@@@@@@@@@@ 9223
2 | 13
4 | 0
8 | 0
COMMIT/s:16287
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 39649
1 |@@@@@@@@@@@ 9195
2 | 17
4 | 0
8 | 0
COMMIT/s:16233
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 39250
1 |@@@@@@@@@@@@ 9432
2 | 17
4 | 0
8 | 0
COMMIT/s:15605
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 37605
1 |@@@@@@@@@@@@ 9191
2 | 21
4 | 0
8 | 0
COMMIT/s:15581
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 37382
1 |@@@@@@@@@@@@ 9350
2 | 11
4 | 1
8 | 0
16 | 0
COMMIT/s:15422
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 36838
1 |@@@@@@@@@@@@ 9412
2 | 16
4 | 1
8 | 0
16 | 0
COMMIT/s:15521
value |-------------------------------------------------- count
0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 37117
1 |@@@@@@@@@@@@ 9437
2 | 11
4 | 0
8 | 0
结合内核探针的例子可参考 :
Systemtap EXP: Trace PostgreSQL iostat per SQL statement
http://blog.163.com/digoal@126/blog/static/16387704020139152191581/
Systemtap EXP: trace PostgreSQL netflow per session or per sql
http://blog.163.com/digoal@126/blog/static/16387704020139153195701/
Systemtap EXP: trace PostgreSQL instruction or block of instructions per sql or per session
http://blog.163.com/digoal@126/blog/static/16387704020139153455311/
参考
1. http://www.postgresql.org/docs/9.3/static/dynamic-trace.html
2. http://doxygen.postgresql.org/
3. src/backend/access/transam/xact.c
4. src/backend/utils/probes.h
5. src/include/storage/lock.h
6. https://sourceware.org/systemtap/tapsets/