Systemtap Associative array Data Type

4 minute read

背景

数组变量在stap启动时以哈希表形式初始化, 在handler中使用数组时不需要再动态创建.  
  
Associative arrays are implemented as hash tables with a maximum size set at startup.   
Associative arrays are too large to be created dynamically for individual probe handler runs, so they must be declared as global.  
由于数组类型占用的空间明显大于数字或字符串类型, 考虑到安全, systemtap设计时就不允许数组作为handler本地变量动态创建. 所以数组变量必须声明为全局变量.  
  
数组的基本操作就是设置和读取元素值  
The basic operations for arrays are setting and looking up elements.   
These operations are expressed in awk syntax:   
the array name followed by an opening bracket ([), a comma-separated list of up to nine index index expressions, and a closing bracket (]).   
使用语法  
arrvar[idx1, ... idxn]  
Each index expression may be a string or a number, as long as it is consistently typed throughout the script.  
数组的索引, 元素值各自必须一致, 否则会报错, 例如 :   
arr1的索引和元素都不一致 :   
[root@db-172-16-3-39 ~]# stap -e 'global arr1; probe begin {arr1["a"]=1;arr1[1]="x";exit()}'  
semantic error: type mismatch (string vs. long): identifier 'arr1' at <input>:1:39  
        source: global arr1; probe begin {arr1["a"]=1;arr1[1]="x";exit()}  
                                                      ^  
  
semantic error: type was first inferred here (long): identifier 'arr1' at :1:27  
        source: global arr1; probe begin {arr1["a"]=1;arr1[1]="x";exit()}  
                                          ^  
  
semantic error: type mismatch (long vs. string): number '1' at :1:44  
        source: global arr1; probe begin {arr1["a"]=1;arr1[1]="x";exit()}  
                                                           ^  
  
semantic error: type mismatch (long vs. string): identifier 'arr1' at :1:39  
        source: global arr1; probe begin {arr1["a"]=1;arr1[1]="x";exit()}  
                                                      ^  
  
Pass 2: analysis failed.  Try again with another '--vp 01' option.  
arr1的元素类型不一致 :   
[root@db-172-16-3-39 ~]# stap -e 'global arr1; probe begin {arr1["a"]=1;arr1["b"]="x";exit()}'  
semantic error: type mismatch (string vs. long): identifier 'arr1' at <input>:1:39  
        source: global arr1; probe begin {arr1["a"]=1;arr1["b"]="x";exit()}  
                                                      ^  
  
semantic error: type was first inferred here (long): identifier 'arr1' at :1:27  
        source: global arr1; probe begin {arr1["a"]=1;arr1["b"]="x";exit()}  
                                          ^  
  
Pass 2: analysis failed.  Try again with another '--vp 01' option.  
元素和索引类型一致, 可以正常使用.  
[root@db-172-16-3-39 ~]# stap -e 'global arr1; probe begin {arr1["a"]=1;arr1["b"]=2;exit()}'  
arr1["b"]=0x2  
arr1["a"]=0x1  
  
7.1 Examples  
# Increment the named array slot:  
foo [4,"hello"] ++  
  
# Update a statistic:  
processusage [uid(),execname()] ++  
  
# Set a timestamp reference point:  
times [tid()] = get_cycles()  
  
# Compute a timestamp delta:  
delta = get_cycles() - times [tid()]  
数组的索引一般选取pid, execname, uid, tid等, 以便做唯一性区分.  
数组的索引含义有点类似数据库中的primary key.   
数组的元素value 含义则类似K-V中的value.  
  
7.2 Types of values  
Array elements may be set to a number, a string, or an aggregate.   
The type must be consistent throughout the use of the array.   
The first assignment to the array defines the type of the elements.   
Unset array elements may be fetched and return a null value (zero or empty string) as appropriate, but they are not seen by a membership test.  
数组的元素值类型, 可以是数字, 字符串, 或者统计类型  

统计类型参考 :

http://blog.163.com/digoal@126/blog/static/16387704020138310438924/

http://blog.163.com/digoal@126/blog/static/16387704020138333731979/

http://blog.163.com/digoal@126/blog/static/16387704020138392759478/

不管存储何种类型, 一个数组中所有元素的类型必须一致, 例如都存储数字.  
当然一个数组的索引的类型也必须一致.  
例子见本文开头部分.  
数组元素delete后, 清理为改元素类型的原始状态. 数字0, 字符串空"", 统计类型空.  
例子 :   
[root@db-172-16-3-39 ~]# stap -e 'global arr1; probe begin {arr1[1]=1; delete arr1[1]; println(arr1[1]); exit()}'  
0  
[root@db-172-16-3-39 ~]# stap -e 'global arr1; probe begin {arr1[1]="hello"; delete arr1[1]; println(arr1[1]); exit()}'  
  
[root@db-172-16-3-39 ~]# stap -e 'global arr1; probe begin {arr1[1] <<< 123; delete arr1[1]; println(@count(arr1[1])); exit()}'  
0  
  
7.3 Array capacity  
Array sizes can be specified explicitly or allowed to default to the maximum size as defined by MAXMAPENTRIES.   

参考

http://blog.163.com/digoal@126/blog/static/163877040201381021752228/

数组中可以存储多少个元素可以在声明全局变量时指定或者由MAXMAPENTRIES参数决定(默认2048).  
You can explicitly specify the size of an array as follows:  
global ARRAY[<size>]  
If you do not specify the size parameter, then the array is created to hold MAXMAPENTRIES number of elements.  
默认情况下超出存储的个数会报错. 例如 :   
[root@db-172-16-3-39 ~]# stap -e 'global arr1[10]; probe begin {for(i=0;i<100;i++) arr1[i]=i; exit()}'  
ERROR: Array overflow, check size limit (10) near identifier 'arr1' at <input>:1:50  
WARNING: Number of errors: 1, skipped probes: 0  
WARNING: /usr/bin/staprun exited with status: 1  
Pass 5: run failed.  Try again with another '--vp 00001' option.  
  
7.4 Array wrapping  
Arrays may be wrapped using the percentage symbol (%) causing previously entered elements to be overwritten if more elements are inserted than the array can hold.   
This works for both regular and statistics typed arrays.  
You can mark arrays for wrapping as follows:  
global ARRAY1%[<size>], ARRAY2%  
前面说的数组变量超出存储的个数会报错, 有一种使用场景是, 对于前面的元素进行覆盖. 而不报错.  
类似mongodb中的capped collection.  
例如 :   
[root@db-172-16-3-39 ~]# stap -e 'global arr1%[10]; probe begin {for(i=0;i<100;i++) arr1[i]=i; exit()}'  
arr1[99]=0x63  
arr1[98]=0x62  
arr1[97]=0x61  
arr1[96]=0x60  
arr1[95]=0x5f  
arr1[94]=0x5e  
arr1[93]=0x5d  
arr1[92]=0x5c  
arr1[91]=0x5b  
arr1[90]=0x5a  
  
7.5 Iteration, foreach  
Like awk, SystemTap's foreach creates a loop that iterates over key tuples of an array, not only values.   
The iteration may be sorted by any single key or a value by adding an extra plus symbol (+) or minus symbol (-) to the code or limited to only a few elements with the limit keyword.   
The following are examples.  
# Simple loop in arbitrary sequence:  
foreach ([a,b] in foo)  
    fuss_with(foo[a,b])  
  
# Loop in increasing sequence of value:  
foreach ([a,b] in foo+) { ... }  
  
# Loop in decreasing sequence of first key:  
foreach ([a-,b] in foo) { ... }  
  
# Print the first 10 tuples and values in the array in decreasing sequence  
foreach (v = [i,j] in foo- limit 10)  
    printf("foo[%d,%s] = %d\n", i, j, v)  
  
foreach中可以使用break以及continue语句  
The break and continue statements also work inside foreach loops.   
Since arrays can be large but probe handlers must execute quickly, you should write scripts that exit iteration early, if possible.   
foreach中不允许修改数组元素的值, 目的是为了简化handler长度操作, 降低handler对程序的影响.  
For simplicity, SystemTap forbids any modification of an array during iteration with a foreach.  

详细的foreach用法参考 :

http://blog.163.com/digoal@126/blog/static/1638770402013997490563/

7.6 Deletion  
The delete statement can either remove a single element by index from an array or clear an entire array at once.   

详细的DELETE用法参考 :

http://blog.163.com/digoal@126/blog/static/1638770402013997490563/

参考

1. https://sourceware.org/systemtap/langref/Associative_arrays.html

2. http://blog.163.com/digoal@126/blog/static/16387704020138310438924/

3. http://blog.163.com/digoal@126/blog/static/16387704020138333731979/

4. http://blog.163.com/digoal@126/blog/static/16387704020138392759478/

5. http://blog.163.com/digoal@126/blog/static/163877040201381021752228/

6. http://blog.163.com/digoal@126/blog/static/1638770402013997490563/

Flag Counter

digoal’s 大量PostgreSQL文章入口