如何从PostgreSQL json中提取数组

4 minute read

背景

在PostgreSQL中使用JSON类型时,除了能存,大家更关心的是JSON中的数据的使用,例如
1. VALUE支持哪些类型,

通过以下方法可以提取JSON最外层的VALUE的数据类型

json_typeof(json)    
jsonb_typeof(jsonb)    

目前支持的类型如下
object, array, string, number, boolean, and null

2. 包含哪些KEY或VALUE,通过路径提取KEY或VALUE等。

3. 通过制定KEY提取VALUE,
通常通过KEY提取的VALUE还是JSON或JSONB类型,因为JSON本身就是嵌套的,但是可以通过json typeof得知它的类型。

json_typeof(json)    
jsonb_typeof(jsonb)    

但是SQL还是认为他是个JSON,要么就是转换为TEXT。

postgres=# create table t3(c1 jsonb);  
CREATE TABLE  
postgres=# insert into t3 values ('{"a":"v","b":12,"c":{"ab":"hello"},"d":12.3,"e":true,"f":[1,2,3,4],"g":["a","b"]}');  
INSERT 0 1  
  
postgres=# select pg_typeof(col), jsonb_typeof(col),col from (select c1->'a' col from t3) t;  
 pg_typeof | jsonb_typeof | col   
-----------+--------------+-----  
 jsonb     | string       | "v"  
(1 row)  
  
postgres=# select pg_typeof(col), jsonb_typeof(col),col from (select c1->'b' col from t3) t;  
 pg_typeof | jsonb_typeof | col   
-----------+--------------+-----  
 jsonb     | number       | 12  
(1 row)  
  
postgres=# select pg_typeof(col), jsonb_typeof(col),col from (select c1->'c' col from t3) t;  
 pg_typeof | jsonb_typeof |       col         
-----------+--------------+-----------------  
 jsonb     | object       | {"ab": "hello"}  
(1 row)  
  
postgres=# select pg_typeof(col), jsonb_typeof(col),col from (select c1->'d' col from t3) t;  
 pg_typeof | jsonb_typeof | col    
-----------+--------------+------  
 jsonb     | number       | 12.3  
(1 row)  
  
postgres=# select pg_typeof(col), jsonb_typeof(col),col from (select c1->'e' col from t3) t;  
 pg_typeof | jsonb_typeof | col    
-----------+--------------+------  
 jsonb     | boolean      | true  
(1 row)  
  
postgres=# select pg_typeof(col), jsonb_typeof(col),col from (select c1->'f' col from t3) t;  
 pg_typeof | jsonb_typeof |     col        
-----------+--------------+--------------  
 jsonb     | array        | [1, 2, 3, 4]  
(1 row)  
  
postgres=# select pg_typeof(col), jsonb_typeof(col),col from (select c1->'g' col from t3) t;  
 pg_typeof | jsonb_typeof |    col       
-----------+--------------+------------  
 jsonb     | array        | ["a", "b"]  
(1 row)  

4. 特定VALUE类型的处理,例如数组。

当VALUE是ARRAY时,如果需要返回ARRAY给SQL层,暂时还没有内置这样的操作符,需要自定义,本文将介绍。

如何返回JSON中存储的数组

大多数JSON操作符,返回的要么是TEXT要么是JSON.

例如以下查询
操作符->返回数组

postgres=# select pg_typeof('{"a":[1,2,3],"b":[4,5,6]}'::json->'a'), '{"a":[1,2,3],"b":[4,5,6]}'::json->'a';  
 pg_typeof | ?column?   
-----------+----------  
 json      | [1,2,3]  
(1 row)  

操作符-»返回text

postgres=# select pg_typeof('{"a":[1,2,3],"b":[4,5,6]}'::json->>'a'), '{"a":[1,2,3],"b":[4,5,6]}'::json->>'a';  
 pg_typeof | ?column?   
-----------+----------  
 text      | [1,2,3]  
(1 row)  

如果明确json_typeof或jsonb_typeof是array,则可以通过以下function将数组转换为行
不管是什么数组,都返回text行

json_array_elements_text(json)  
  
jsonb_array_elements_text(jsonb)  
  
postgres=# select pg_typeof(col),col from (select json_array_elements_text('{"a":"B","b":[1,2,3,4,5,6]}'::json->'b') col) t;  
 pg_typeof | col   
-----------+-----  
 text      | 1  
 text      | 2  
 text      | 3  
 text      | 4  
 text      | 5  
 text      | 6  
(6 rows)  

使用数组构造器,可以将多条记录构造为数组。

postgres=# SELECT array(select json_array_elements_text('{"a":"B","b":[1,2,3,4,5,6]}'::json->'b'));  
     array       
---------------  
 {1,2,3,4,5,6}  
(1 row)  
  
postgres=# SELECT pg_typeof(array(select json_array_elements_text('{"a":"B","b":[1,2,3,4,5,6]}'::json->'b')));  
 pg_typeof   
-----------  
 text[]  
(1 row)  

如何转换JSONB数组的类型

如果想构造int[],在构造前,可以将行的输出转换为对应的格式,

postgres=# SELECT array(select (json_array_elements_text('{"a":"B","b":[1,2,3,4,5,6]}'::json->'b'))::int );  
     array       
---------------  
 {1,2,3,4,5,6}  
(1 row)  
  
postgres=# SELECT pg_typeof(array(select (json_array_elements_text('{"a":"B","b":[1,2,3,4,5,6]}'::json->'b'))::int ));  
 pg_typeof   
-----------  
 integer[]  
(1 row)  
  

编写JSON数组转换为SQL数组的函数

可以将以上方法转换为函数来使用,输入json typeof == array的json或jsonb对象,输出text数组。
转换jsonb, json array to text array

CREATE OR REPLACE FUNCTION json_arr2text_arr(_js jsonb)  
   RETURNS text[] AS  
$func$  
SELECT ARRAY(SELECT jsonb_array_elements_text(_js))  
$func$  
LANGUAGE sql IMMUTABLE;  
  
  
CREATE OR REPLACE FUNCTION json_arr2text_arr(_js json)  
   RETURNS text[] AS  
$func$  
SELECT ARRAY(SELECT json_array_elements_text(_js))  
$func$  
LANGUAGE sql IMMUTABLE;  

转换jsonb, json array to int array

CREATE OR REPLACE FUNCTION json_arr2int_arr(_js jsonb)  
   RETURNS int[] AS  
$func$  
SELECT ARRAY( SELECT (jsonb_array_elements_text(_js))::int )  
$func$  
LANGUAGE sql IMMUTABLE;  
  
CREATE OR REPLACE FUNCTION json_arr2int_arr(_js json)  
   RETURNS int[] AS  
$func$  
SELECT ARRAY( SELECT (json_array_elements_text(_js))::int )  
$func$  
LANGUAGE sql IMMUTABLE;  

例子

postgres=# select col, pg_typeof(col) from (select json_arr2text_arr(c1->'f') col from t3) t;  
    col    | pg_typeof   
-----------+-----------  
 {1,2,3,4} | text[]  
(1 row)  
  
postgres=# select col, pg_typeof(col) from (select json_arr2int_arr(c1->'f') col from t3) t;  
    col    | pg_typeof   
-----------+-----------  
 {1,2,3,4} | integer[]  
(1 row)  
  
postgres=# select col, pg_typeof(col) from (select json_arr2text_arr(c1->'g') col from t3) t;  
  col  | pg_typeof   
-------+-----------  
 {a,b} | text[]  
(1 row)  

应用场景

例如业务系统在JSON中存储了一些ARRAY,这些ARRAY通常会包括一些全包含,全不包含,相交等查询,达到查询的目的。

这里就需要用到数组的包含,相交,不相干操作符来判断,那就会用到前面提到的数组的转换需求。

例子

例如JSON存储的是乐高积木的各属性。

JSON某个KEY存储的数组代表乐高积木对应型号的小零件部件号清单,然后用户需要查询哪些积木型号包含了指定零件编号,这里涉及的就是全包含的查询。

包含’a’零件

postgres=# select * from t3 where json_arr2text_arr(c1->'g') @> array['a'];  
                                                 c1                                                    
-----------------------------------------------------------------------------------------------------  
 {"a": "v", "b": 12, "c": {"ab": "hello"}, "d": 12.3, "e": true, "f": [1, 2, 3, 4], "g": ["a", "b"]}  
(1 row)  

同时包含’a’,’c’零件

postgres=# select * from t3 where json_arr2text_arr(c1->'g') @> array['a','c'];  
 c1   
----  
(0 rows)  

同时包含1,2零件

postgres=# select * from t3 where json_arr2int_arr(c1->'f') @> array[1,2];  
                                                 c1                                                    
-----------------------------------------------------------------------------------------------------  
 {"a": "v", "b": 12, "c": {"ab": "hello"}, "d": 12.3, "e": true, "f": [1, 2, 3, 4], "g": ["a", "b"]}  
(1 row)  

同时包含1,6零件

postgres=# select * from t3 where json_arr2int_arr(c1->'f') @> array[1,6];  
 c1   
----  
(0 rows)  

包含1或6零件

postgres=# select * from t3 where json_arr2int_arr(c1->'f') && array[1,6];  
                                                 c1                                                    
-----------------------------------------------------------------------------------------------------  
 {"a": "v", "b": 12, "c": {"ab": "hello"}, "d": 12.3, "e": true, "f": [1, 2, 3, 4], "g": ["a", "b"]}  
(1 row)  

即不包含1 也不包含6零件

postgres=# select * from t3 where not(json_arr2int_arr(c1->'f') && array[1,6]);  
 c1   
----  
(0 rows)  

不包含6零件

postgres=# select * from t3 where not(json_arr2int_arr(c1->'f') && array[6]);  
                                                 c1                                                    
-----------------------------------------------------------------------------------------------------  
 {"a": "v", "b": 12, "c": {"ab": "hello"}, "d": 12.3, "e": true, "f": [1, 2, 3, 4], "g": ["a", "b"]}  
(1 row)  

如果你不想转换为INT,那么在条件中也不要使用INT数组

postgres=# select * from t3 where not(json_arr2text_arr(c1->'f') && array['6']);  
                                                 c1                                                    
-----------------------------------------------------------------------------------------------------  
 {"a": "v", "b": 12, "c": {"ab": "hello"}, "d": 12.3, "e": true, "f": [1, 2, 3, 4], "g": ["a", "b"]}  
(1 row)  

索引

为了提升这种查询的速度,我们可以这样建立索引。

postgres=# create index idx_t3_1 on t3 using gin (json_arr2text_arr(c1->'f'));
CREATE INDEX
postgres=# set enable_seqscan=off;
SET
postgres=# explain select * from t3 where json_arr2text_arr(c1->'f') && array['1','6'];
                                  QUERY PLAN                                   
-------------------------------------------------------------------------------
 Bitmap Heap Scan on t3  (cost=12.25..16.52 rows=1 width=32)
   Recheck Cond: (json_arr2text_arr((c1 -> 'f'::text)) && '{1,6}'::text[])
   ->  Bitmap Index Scan on idx_t3_1  (cost=0.00..12.25 rows=1 width=0)
         Index Cond: (json_arr2text_arr((c1 -> 'f'::text)) && '{1,6}'::text[])
(4 rows)

postgres=# explain select * from t3 where json_arr2text_arr(c1->'f') @> array['1','6'];
                                  QUERY PLAN                                   
-------------------------------------------------------------------------------
 Bitmap Heap Scan on t3  (cost=12.25..16.52 rows=1 width=32)
   Recheck Cond: (json_arr2text_arr((c1 -> 'f'::text)) @> '{1,6}'::text[])
   ->  Bitmap Index Scan on idx_t3_1  (cost=0.00..12.25 rows=1 width=0)
         Index Cond: (json_arr2text_arr((c1 -> 'f'::text)) @> '{1,6}'::text[])
(4 rows)

参考

http://dba.stackexchange.com/questions/54283/how-to-turn-json-array-into-postgres-array
https://www.postgresql.org/docs/9.6/static/functions-json.html
https://www.postgresql.org/docs/9.6/static/functions-array.html

Flag Counter

digoal’s 大量PostgreSQL文章入口