Systemtap Statistics (aggregates) Data Type

6 minute read

背景

统计类型在以前写的几篇BLOG进行过述过.

可参考 :

http://blog.163.com/digoal@126/blog/static/16387704020138310438924/

http://blog.163.com/digoal@126/blog/static/16387704020138333731979/

http://blog.163.com/digoal@126/blog/static/16387704020138392759478/

本文可作为一个补充.

统计|聚合类型存储数字类型的统计流信息, 聚合类型只有1个操作符, <<<, 用以向统计类型中添加基础数据.   
统计类型数据的展现可以通过几个函数来实现, 例如统计类型变量中包含多少个基础数据, 这些基础数据的最大值, 最小值, 平均值, 求和等.  
统计类型变量必须声明为全局变量, 统计类型可以存储在普通的全局变量或者数组的元素中.   
例如 :   
global var;  
global arr;  
var <<< pid();  
arr[1] <<< pid();  
统计变量操作不需要exclusive lock, 所以操作起来比较快.  
Aggregate instances are used to collect statistics on numerical values,   
it is important to accumulate new data quickly and in large volume.   
These instances operate without exclusive locks, and store only aggregated stream statistics.   
Aggregates make sense only for global variables.   
They are stored individually or as elements of an associative array.   
For information about wrapping associative arrays with statistics elements,   

使用数组元素来存储统计类型时, 由于数组元素的个数在声明数组时固定了, 超出的话可以通过覆盖最早的元素值(wrap)来避免错误.参考wrap部分 :

http://blog.163.com/digoal@126/blog/static/1638770402013999511424/

8.1 The aggregation (< < <) operator  
The aggregation operator is ``< < <'', and its effect is similar to an assignment or a C++ output streaming operation.   
The left operand specifies a scalar or array-index l-value, which must be declared global.   
The right operand is a numeric expression.   
The meaning is intuitive: add the given number as a sample to the set of numbers to compute their statistics.   
The specific list of statistics to gather is given separately by the extraction functions.   
The following is an example.  
a <<< delta_timestamp  
writes[execname()] <<< count  
统计类型仅支持1个操作符 <<< , 即往统计类型变量中增加采样数据(或叫基础数据).  
a <<< delta_timestamp  
writes[execname()] <<< count  
  
从统计类型变量中得到统计信息的函数如下 :   
8.2 Extraction functions  
For each instance of a distinct extraction function operating on a given identifier, the translator computes a set of statistics.   
With each execution of an extraction function, the aggregation is computed for that moment across all processors.   
The first argument of each function is the same style of l-value as used on the left side of the aggregation operation.  
从统计类型中得到整型采样统计, 如采样数据的个数, 求和, 最小值, 最大值, 平均值.  
8.3 Integer extractors  
The following functions provide methods to extract information about aggregate.  
  
统计类型变量中包含多少个采样数据 :   
8.3.1 @count(s)  
This statement returns the number of samples accumulated in aggregate s.  
统计类型变量的采样数据求和 :   
8.3.2 @sum(s)  
This statement returns the total sum of all samples in aggregate s.  
统计类型变量的采样数据的最小值 :   
8.3.3 @min(s)  
This statement returns the minimum of all samples in aggregate s.  
统计类型变量的采样数据的最大值 :   
8.3.4 @max(s)  
This statement returns the maximum of all samples in aggregate s.  
统计类型变量的采样数据的平均值 :   
8.3.5 @avg(s)  
This statement returns the average value of all samples in aggregate s.  
举例 :   
因为stap脚本语言中数字类型只支持整型, 所以在输出中没有小数.  
[root@db-172-16-3-39 ~]# stap -D MAXACTION=199999999 -e 'global stat; probe begin {for(i=0;i<990000000;i++) stat <<< i; printd("/",@count(stat),@max(stat),@min(stat),@avg(stat),@sum(stat),"\n"); exit()}'  
990000000/989999999/0/494999999/490049999505000000/  
  
从统计类型中得到柱状采样分布的函数 :   
8.4 Histogram extractors  
The following functions provide methods to extract histogram information.  
Printing a histogram with the print family of functions renders a histogram object as a tabular "ASCII art" bar chart.  
有两个函数可以输出采样值分布的柱状图.  
第一个是@hist_linear函数.  
8.4.1 @hist_linear  
The statement @hist_linear(v,L,H,W) represents a linear histogram of aggregate v,   
where L and H represent the lower and upper end of a range of values ,  
W represents the width (or size) of each bucket within the range.   
The low and high values can be negative, but the overall difference (high minus low) must be positive.   
The width parameter must also be positive.  
语法  :   
@hist_linear(v,L,H,W)  
v 代表统计类型变量.  
L代表采样数据输出柱状图的小值边界.  
H代表采样数据的大值边界.  
W是bucket的大小.  
例如 :   
[root@db-172-16-3-39 ~]# stap -e 'global s; probe begin {for(i=-100;i<100;i++) s<<<i; println(@hist_linear(s, 0, 1000, 20)); exit()}'  
value |-------------------------------------------------- count  
   <0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 100  小于0的采样数据有100个  
    0 |@@@@@@@@@@                                          20  >=0 && <20 的采样数据有20个.  
   20 |@@@@@@@@@@                                          20  ...以此类推  
   40 |@@@@@@@@@@                                          20  
   60 |@@@@@@@@@@                                          20  
   80 |@@@@@@@@@@                                          20  
  100 |                                                     0  >=100 && <120 的采样数据有0个  
  120 |                                                     0  >=120 && <140 的采样数据有0个  
... 后面的数据都是0个, hist_linear把他们去掉不输出了. 采样数据在bucket中为空的话, 怎样去掉规则见下面的解释.  
空的bucket没有必要输出, 在中间部分的空bucket 前后各取HIST_ELISION个, 中间的去掉以~符号代替之. 首尾的只留HIST_ELISION个, 其余的去掉. 上面的例子出现的空bucket是在尾部, 所以保留2个, 其余的去掉.  
In the output, a range of consecutive empty buckets may be replaced with a tilde (~) character.   
This can be controlled on the command line with -D HIST_ELISION=< num> , where < num> specifies how many empty buckets at the top and bottom of the range to print.   
The default is 2. A < num> of 0 removes all empty buckets.   
A negative < num> disables removal.  -- 负的HIST_ELISION表示不移除任何空的bucket(仅限于出现在中间部分的空bucket).  
例如 :   
中间部分的空bucket以~代替 :   
[root@db-172-16-3-39 ~]# stap -D HIST_ELISION=2 -e 'global s; probe begin {for(i=-100;i<100;i++) s<<<i; for(i=300;i<400;i++) s<<<i; print(@hist_linear(s, 0, 1000, 20)); exit()}'  
value |-------------------------------------------------- count  
   <0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 100  
    0 |@@@@@@@@@@                                          20  
   20 |@@@@@@@@@@                                          20  
   40 |@@@@@@@@@@                                          20  
   60 |@@@@@@@@@@                                          20  
   80 |@@@@@@@@@@                                          20  
  100 |                                                     0  
  120 |                                                     0  
      ~  
  260 |                                                     0  
  280 |                                                     0  
  300 |@@@@@@@@@@                                          20  
  320 |@@@@@@@@@@                                          20  
  340 |@@@@@@@@@@                                          20  
  360 |@@@@@@@@@@                                          20  
  380 |@@@@@@@@@@                                          20  
  400 |                                                     0  
  420 |                                                     0  
HIST_ELISION不管多少末尾部分的空bucket还是一样会去掉 :   
[root@db-172-16-3-39 ~]# stap -D HIST_ELISION=-1 -e 'global s; probe begin {for(i=-100;i<100;i++) s<<<i; for(i=300;i<400;i++) s<<<i; print(@hist_linear(s, 0, 1000, 20)); exit()}'  
value |-------------------------------------------------- count  
   <0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 100  
    0 |@@@@@@@@@@                                          20  
   20 |@@@@@@@@@@                                          20  
   40 |@@@@@@@@@@                                          20  
   60 |@@@@@@@@@@                                          20  
   80 |@@@@@@@@@@                                          20  
  100 |                                                     0  
  120 |                                                     0  
  140 |                                                     0  
  160 |                                                     0  
  180 |                                                     0  
  200 |                                                     0  
  220 |                                                     0  
  240 |                                                     0  
  260 |                                                     0  
  280 |                                                     0  
  300 |@@@@@@@@@@                                          20  
  320 |@@@@@@@@@@                                          20  
  340 |@@@@@@@@@@                                          20  
  360 |@@@@@@@@@@                                          20  
  380 |@@@@@@@@@@                                          20  
  400 |                                                     0  
  420 |                                                     0  
以下为书上的例子 :   
For example, if you specify -D HIST_ELISION=3 and the histogram has 10 consecutive empty buckets, the first 3 and last 3 empty buckets will be printed and the middle 4 empty buckets will be represented by a tilde (~).  
  
The following is an example.  
  
global reads  
probe netdev.receive {  
    reads <<< length  
}  
probe end {  
    print(@hist_linear(reads, 0, 10240, 200))  
}  
This generates the following output.  
value |-------------------------------------------------- count  
    0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1650  
  200 |                                                      8  
  400 |                                                      0  
  600 |                                                      0  
      ~  
 1000 |                                                      0  
 1200 |                                                      0  
 1400 |                                                      1  
 1600 |                                                      0  
 1800 |                                                      0  
This shows that 1650 network reads were of a size between 0 and 199 bytes,   
8 reads were between 200 and 399 bytes,   
1 read was between 1200 and 1399 bytes.   
The tilde (~) character indicates the bucket for 800 to 999 bytes was removed because it was empty.   
Empty buckets for 2000 bytes and larger were also removed because they were empty.  
  
第二个是@hist_log函数.   
和@hist_linear类似, 只是W是2^n.  
8.4.2 @hist_log  
The statement @hist_log(v) represents a base-2 logarithmic histogram. Empty buckets are replaced with a tilde (~) character in the same way as @hist_linear() (see above).  
The following is an example.  
  
global reads  
probe netdev.receive {  
    reads <<< length  
}  
probe end {  
    print(@hist_log(reads))  
}  
This generates the following output.  
value |-------------------------------------------------- count  
    8 |                                                      0  
   16 |                                                      0  
   32 |                                                    254  
   64 |                                                      3  
  128 |                                                      2  
  256 |                                                      2  
  512 |                                                      4  
 1024 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 16689  
 2048 |                                                      0  
 4096 |                                                      0  
例子 :   
[root@db-172-16-3-39 ~]# stap -D HIST_ELISION=-1 -e 'global s; probe begin {for(i=-100;i<100;i++) s<<<i; for(i=300;i<400;i++) s<<<i; print(@hist_log(s)); exit()}'  
value |-------------------------------------------------- count  
 -256 |                                                     0  
 -128 |                                                     0  
  -64 |@@@@@@@@@@@@@@@@@@                                  37  
  -32 |@@@@@@@@@@@@@@@@                                    32  
  -16 |@@@@@@@@                                            16  
   -8 |@@@@                                                 8  
   -4 |@@                                                   4  
   -2 |@                                                    2  
   -1 |                                                     1  
    0 |                                                     1  
    1 |                                                     1  
    2 |@                                                    2  
    4 |@@                                                   4  
    8 |@@@@                                                 8  
   16 |@@@@@@@@                                            16  
   32 |@@@@@@@@@@@@@@@@                                    32  
   64 |@@@@@@@@@@@@@@@@@@                                  36  
  128 |                                                     0  
  256 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 100  
  512 |                                                     0  
 1024 |                                                     0  
  
8.5 Deletion  
The delete statement applied to an aggregate variable will reset it to the initial empty state.  

详见 :

http://blog.163.com/digoal@126/blog/static/1638770402013997490563/

例子 :   
[root@db-172-16-3-39 ~]# stap --vp 00001 -D HIST_ELISION=-1 -e 'global s; probe begin {for(i=-100;i<100;i++) s<<<i; for(i=300;i<400;i++) s<<<i; println("before delete cnt: ",@count(s)); delete s; println("after delete cnt: ",@count(s)); exit()}'  
Pass 5: starting run.  
before delete cnt: 300  
after delete cnt: 0  
Pass 5: run completed in 10usr/30sys/306real ms.  
使用delete清除统计类型变量后, 显然采样数据个数为0.  

参考

1. https://sourceware.org/systemtap/langref/Statistics_aggregates.html

2. http://blog.163.com/digoal@126/blog/static/16387704020138310438924/

3. http://blog.163.com/digoal@126/blog/static/16387704020138333731979/

4. http://blog.163.com/digoal@126/blog/static/16387704020138392759478/

5. http://blog.163.com/digoal@126/blog/static/1638770402013997490563/

6. https://sourceware.org/systemtap/tapsets/API-netdev-receive.html

Flag Counter

digoal’s 大量PostgreSQL文章入口