PostgreSQL 9.5 new feature - Support GROUPING SETS, CUBE and ROLLUP.
背景
PostgreSQL 9.5 已实现分组集合特性
+T431 Extended grouping capabilities YES
+T432 Nested and concatenated GROUPING SETS YES
+T433 Multiargument GROUPING function YES
用法参考
1. http://www.postgresql.org/docs/devel/static/queries-table-expressions.html#QUERIES-GROUPING-SETS
2. http://www.postgresql.org/docs/devel/static/functions-aggregate.html
回归测试输出
src/test/regress/expected/groupingsets.out
分组集合实现的是将多个分组查询的结果union all的效果。
例如:
=> SELECT * FROM items_sold;
brand | size | sales
-------+------+-------
Foo | L | 10
Foo | M | 20
Bar | M | 15
Bar | L | 5
(4 rows)
=> SELECT brand, size, sum(sales) FROM items_sold GROUP BY GROUPING SETS ((brand), (size), ());
brand | size | sum
-------+------+-----
Foo | | 30
Bar | | 20
| L | 15
| M | 35
| | 50
(5 rows)
其中GROUP BY GROUPING SETS ((brand), (size), ());
相当于以下三个group by的union all:
group by brand
group by size
group by ()
未在分组的列以NULL代替。
分组集合除了可以用GROUPING SETS来指定,另外还提供了两个特殊的写法rollup和cube.
ROLLUP ( e1, e2, e3, ... )
代表递减分组,一般用于异构结构的分组如国家,省份,城市,乡镇这样的结构查询。
逐级分组汇总结果,它相当于如下写法:
GROUPING SETS (
( e1, e2, e3, ... ),
...
( e1, e2 )
( e1 )
( ) -- 注意包含全集
)
还有一种写法是CUBE
CUBE ( a, b, c )
cube是任意组合,相当于:
GROUPING SETS (
( a, b, c ),
( a, b ),
( a, c ),
( a ),
( b, c ),
( b ),
( c ),
( ) -- 注意包含全集
)
在cube和rollup中使用括号可以将多个表达式作为单个表达式来处理:
ROLLUP ( a, (b,c), d )
递减,相当于
GROUPING SETS (
( a, b, c, d ),
( a, b, c ),
( a ),
( )
)
CUBE ( (a,b), (c,d) )
相当于:
GROUPING SETS (
( a, b, c, d ),
( a, b ),
( c, d ),
( )
)
同时cube,rollup,grouping sets还可以混合使用:
GROUP BY a, CUBE(b,c), GROUPING SETS ((d), (e))
相当于:
GROUP BY GROUPING SETS (
(a,b,c,d), (a,b,c,e),
(a,b,d), (a,b,e),
(a,c,d), (a,c,e),
(a,d), (a,e)
)
既然分组聚合相当于多个group by 的union all,那么我们如何区分当前输出的记录是哪个分组group by的输出呢?
用grouping(cols)可以表示未参与聚合的表达式的比特位,并转换为INT输出。
例如:
=> SELECT * FROM items_sold;
make | model | sales
-------+-------+-------
Foo | GT | 10
Foo | Tour | 20
Bar | City | 15
Bar | Sport | 5
(4 rows)
grouping()中必须包含group by后面的任意或所有列。
=> SELECT make, model, GROUPING(make,model), sum(sales) FROM items_sold GROUP BY ROLLUP(make,model);
make | model | grouping | sum
-------+-------+----------+-----
Foo | GT | 0 | 10
Foo | Tour | 0 | 20
Bar | City | 0 | 15
Bar | Sport | 0 | 5
Foo | | 1 | 30
Bar | | 1 | 20
| | 3 | 50
(7 rows)
grouping()中必须包含group by后面的任意或所有列,不能包含未参与聚合的列。
grouping()中的每个表达式用1个比特位表示。
postgres=# create table tbl(c1 int,c2 int,c3 int, c4 int, c5 int);
CREATE TABLE
postgres=# select c1,count(*),grouping(c1,c2,c3) from tbl group by cube(c1,c2,c3);
c1 | count | grouping
----+-------+----------
| 0 | 7 -- b'111'::int = 7
(1 row)
postgres=# select c1,count(*),grouping(c1,c2) from tbl group by cube(c1,c2,c3);
c1 | count | grouping
----+-------+----------
| 0 | 3 b'11'::int = 3
(1 row)
postgres=# select c1,count(*),grouping(c2) from tbl group by cube(c1,c2,c3);
c1 | count | grouping
----+-------+----------
| 0 | 1 b'1'::int = 1
(1 row)
postgres=# insert into tbl values (1,2,3,4,5);
INSERT 0 1
postgres=# insert into tbl values (1,2,3,4,6);
INSERT 0 1
postgres=# insert into tbl values (2,3,4,5,6);
INSERT 0 1
postgres=# select c1,count(*),grouping(c2) from tbl group by cube(c1,c2,c3);
c1 | count | grouping
----+-------+----------
1 | 2 | 0
1 | 2 | 0
1 | 2 | 1
2 | 1 | 0
2 | 1 | 0
2 | 1 | 1
| 3 | 1
1 | 2 | 1
| 2 | 1
2 | 1 | 1
| 1 | 1
| 2 | 0
| 2 | 0
| 1 | 0
| 1 | 0
(15 rows)
grouping中包含未参与聚合的列将报错:
postgres=# select c1,count(*),grouping(c4) from tbl group by cube(c1,c2,c3);
ERROR: arguments to GROUPING must be grouping expressions of the associated query level
LINE 1: select c1,count(*),grouping(c4) from tbl group by cube(c1,c2...
^
postgres=# select c1,count(*),grouping(c1,c2,c3,c4) from tbl group by cube(c1,c2,c3);
ERROR: arguments to GROUPING must be grouping expressions of the associated query level
LINE 1: select c1,count(*),grouping(c1,c2,c3,c4) from tbl group by c...
^
postgres=# select c1,count(*),grouping(c1,c2,c3) from tbl group by cube(c1,c2,c3);
c1 | count | grouping
----+-------+----------
1 | 2 | 0
1 | 2 | 1
1 | 2 | 3
2 | 1 | 0
2 | 1 | 1
2 | 1 | 3
| 3 | 7
1 | 2 | 2
| 2 | 6
2 | 1 | 2
| 1 | 6
| 2 | 4
| 2 | 5
| 1 | 4
| 1 | 5
(15 rows)
参考
1. http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f3d3118532175541a9a96ed78881a3b04a057128
2. http://www.postgresql.org/docs/devel/static/queries-table-expressions.html#QUERIES-GROUPING-SETS
3. http://www.postgresql.org/docs/devel/static/functions-aggregate.html