PostgreSQL aggregate function 1 : General-Purpose Aggregate Functions

3 minute read

背景

PostgreSQL支持较多的聚合函数, 以PostgreSQL 9.4为例, 支持例如一般性的聚合, 统计学科的聚合, 排序集聚合, 假象集聚合等.

本文将对一般性聚合函数举例说明其功能和用法.

以下图表参考 :

http://www.postgresql.org/docs/9.4/static/functions-aggregate.html

Function Argument Type(s) Return Type Description
array_agg(expression) any array of the argument type input values, including nulls, concatenated into an array
avg(expression) smallint, int, bigint, real, double precision, numeric, or interval numeric for any integer-type argument, double precision for a floating-point argument, otherwise the same as the argument data type the average (arithmetic mean) of all input values
bit_and(expression) smallint, int, bigint, or bit same as argument data type the bitwise AND of all non-null input values, or null if none
bit_or(expression) smallint, int, bigint, or bit same as argument data type the bitwise OR of all non-null input values, or null if none
bool_and(expression) bool bool true if all input values are true, otherwise false
bool_or(expression) bool bool true if at least one input value is true, otherwise false
count(*) - bigint number of input rows
count(expression) any bigint number of input rows for which the value of expression is not null
every(expression) bool bool equivalent to bool_and
json_agg(expression) any json aggregates values as a JSON array
json_object_agg(name,value) (any, any) json aggregates name/value pairs as a JSON object
max(expression) any array, numeric, string, or date/time type same as argument type maximum value of expression across all input values
min(expression) any array, numeric, string, or date/time type same as argument type minimum value of expression across all input values
string_agg(expression,delimiter) (text, text) or (bytea, bytea) same as argument types input values concatenated into a string, separated by delimiter
sum(expression) smallint, int, bigint, real, double precision, numeric, interval, or money bigint for smallint or int arguments, numeric for bigint arguments, otherwise the same as the argument data type sum of expression across all input values
xmlagg(expression) xml xml concatenation of XML values (see alsoSection 9.14.1.7)

上图中所有聚合函数, 当没有行输入时, 除了count返回0, 其他都返回null.

使用sum, array_agg时, 当没有行输入, 返回NULL可能有点别扭, 那么你可以使用coalesce来替代NULL, 如coalesce(sum(x), 0)

coalesce(array_agg(x), ‘{}’::int[])

例子 :

聚合后得到数组, null将计入数组元素

postgres=# select array_agg(id) from (values(null),(1),(2)) as t(id);  
 array_agg    
------------  
 {NULL,1,2}  
(1 row)  

算平均值时不计算null

postgres=# select avg(id) from (values(null),(1),(2)) as t(id);  
        avg           
--------------------  
 1.5000000000000000  
(1 row)  
算bit与 或 时也不计算NULL
postgres=# select bit_and(id) from (values(null),(1),(2)) as t(id);  
 bit_and   
---------  
       0  
(1 row)  
postgres=# select bit_or(id) from (values(null),(1),(2)) as t(id);  
 bit_or   
--------  
      3  
(1 row)  

算布尔逻辑时也不计算NULL

postgres=# select bool_and(id) from (values(null),(true),(false)) as t(id);  
 bool_and   
----------  
 f  
(1 row)  

every是bool_and的别名, 实际上是SQL标准中定义的.

postgres=# select every(id) from (values(null),(true),(false)) as t(id);  
 every   
-------  
 f  
(1 row)  

SQL标准中还定义了any和some为bool_or的别名, 但是因为any和some还可以被解释为子查询, 所以在PostgreSQL中any和some的布尔逻辑聚合不可用.

postgres=# select any(id) from (values(null),(true),(false)) as t(id);  
ERROR:  syntax error at or near "any"  
LINE 1: select any(id) from (values(null),(true),(false)) as t(id);  
               ^  
postgres=# select some(id) from (values(null),(true),(false)) as t(id);  
ERROR:  syntax error at or near "some"  
LINE 1: select some(id) from (values(null),(true),(false)) as t(id);  
               ^  

bool_or的例子

postgres=# select bool_or(id) from (values(null),(true),(false)) as t(id);  
 bool_or   
---------  
 t  
(1 row)  

计算非空的表达式个数, count带表达式时, 不计算null

postgres=# select count(id) from (values(null),(1),(2)) as t(id);  
 count   
-------  
     2  
(1 row)  

计算表达式(含空值)的个数, count()计算null, 注意count()是一个独立的聚合函数. 请和count(express)区分开来.

postgres=# select count(*) from (values(null),(1),(2)) as t(id);  
 count   
-------  
     3  
(1 row)  
postgres=# select count(*) from (values(null),(null),(1),(2)) as t(id);  
 count   
-------  
     4  
(1 row)  

聚合后得到json, 不带key的json聚合

postgres=# select json_agg(id) from (values(null),(true),(false)) as t(id);  
      json_agg         
---------------------  
 [null, true, false]  
(1 row)  

聚合后得到json, 带key的json聚合, 注意key不能为null, 否则报错.

postgres=# select json_object_agg(c1,c2) from (values('a',null),('b',true),('c',false)) as t(c1,c2);  
             json_object_agg               
-----------------------------------------  
 { "a" : null, "b" : true, "c" : false }  
(1 row)  
postgres=# select json_object_agg(c1,c2) from (values(null,null),('b',true),('c',false)) as t(c1,c2);  
ERROR:  22023: field name must not be null  
LOCATION:  json_object_agg_transfn, json.c:1959  

计算最大最小值, max, min都不计算null

postgres=# select max(id) from (values(null),(1),(2)) as t(id);  
 max   
-----  
   2  
(1 row)  
postgres=# select min(id) from (values(null),(1),(2)) as t(id);  
 min   
-----  
   1  
(1 row)  

聚合后得到字符串, 字符串聚合

postgres=# select string_agg(c1,'***') from (values('a',null),('b',true),('c',false)) as t(c1,c2);  
 string_agg   
------------  
 a***b***c  
(1 row)  
postgres=# select string_agg(id,'***') from (values(null),('digoal'),('zhou')) as t(id);  
  string_agg     
---------------  
 digoal***zhou  
(1 row)  

计算总和, sum不计算null, 当所有行都是null时, 即没有任何行输入, 返回null.

postgres=# select sum(id) from (values(null),(1),(2)) as t(id);  
 sum   
-----  
   3  
(1 row)  
postgres=# select sum(id::int) from (values(null),(null),(null)) as t(id);  
 sum   
-----  
      
(1 row)  

聚合后得到xml

postgres=# select xmlagg(id::xml) from (values(null),('<foo>digoal</foo>'),('<bar/>')) as t(id);  
         xmlagg            
-------------------------  
 <foo>digoal</foo><bar/>  
(1 row)  

某些聚合函数得到的结果可能和行的输入顺序有关, 例如array_agg, json_agg, json_object_agg, string_agg, and xmlagg, 以及某些自定义聚合函数. 如何来实现呢?

支持聚合函数中使用order by的PostgreSQL版本可以用如下语法 :

postgres=# select string_agg(id,'***' order by id) from (values(null),('digoal'),('zhou')) as t(id);  
  string_agg     
---------------  
 digoal***zhou  
(1 row)  
postgres=# select string_agg(id,'***' order by id desc) from (values(null),('digoal'),('zhou')) as t(id);  
  string_agg     
---------------  
 zhou***digoal  
(1 row)  

不支持聚合函数中使用order by的PostgreSQL版本, 可以用如下语法 :

SELECT xmlagg(x) FROM (SELECT x FROM test ORDER BY y DESC) AS tab;  
postgres=# select string_agg(id,'***') from (select id from (values(null),('digoal'),('zhou')) as t(id) order by id desc) t;  
  string_agg     
---------------  
 zhou***digoal  
(1 row)  

参考

1. http://www.postgresql.org/docs/9.4/static/functions-aggregate.html

2. http://www.postgresql.org/docs/9.4/static/functions-xml.html

3. src/backend/utils/adt

这些函数的代码在src/backend/utils/adt这里可以查询到, 对应各自的类型.

Flag Counter

digoal’s 大量PostgreSQL文章入口