PostgreSQL 10.0 preview 功能增强 - SQL:2016标准(之SQL/JSON) Oracle 12c兼容
背景
PostgreSQL 很多年前,就支持JSON类型,属于最早支持JSON类型的关系数据库。内置了两个JSON类型json与jsonb。
而Oracle从12c才开始支持JSON。
SQL标准2016中,包含了一个SQL/JSON的数据模型标准,这个标准的内容购买页面 https://www.iso.org/standard/63556.html
Oracle 12c支持的JSON与该标准非常接近(不知道是不是Oracle想用标准来后来居上呢?)
那么虽然PostgreSQL早在几年前就支持了JSON,由于oracle市场还是很大,很多用户也许会习惯ORACLE的那一套JSON用法。
PostgreSQL json, jsonb已有的生态,包括自身的用法,周边的例如pljava, pljavascript, plv8等存储过程引擎。感兴趣可以谷歌百科。
个人认为实用性已超SQL标准,但是拗不过人多。客户第一。
所以,PostgreSQL社区依旧会在10.0支持SQL:2016的SQL/JSON标准,真是道高一尺魔高一丈啊。
Hi there,
Attached patch is an implementation of SQL/JSON data model from SQL-2016
standard (ISO/IEC 9075-2:2016(E)), which was published 2016-12-15 and is
available only for purchase from ISO web site (
https://www.iso.org/standard/63556.html). Unfortunately I didn't find any
public sources of the standard or any preview documents, but Oracle
implementation of json support in 12c release 2 is very close (
http://docs.oracle.com/database/122/ADJSN/json-in-oracle-database.htm),
also we used https://livesql.oracle.com/ to understand some details.
Postgres has already two json data types - json and jsonb and implementing
another json data type, which strictly conforms the standard, would be not
a good idea. Moreover, SQL standard doesn’t describe data type, but only
data model, which “comprises SQL/JSON items and SQL/JSON sequences. The
components of the SQL/JSON data model are:
1) An SQL/JSON item is defined recursively as any of the following:
a) An SQL/JSON scalar, defined as a non-null value of any of the following
predefined (SQL) types:
character string with character set Unicode, numeric, Boolean, or datetime.
b) An SQL/JSON null, defined as a value that is distinct from any value of
any SQL type.
NOTE 122 — An SQL/JSON null is distinct from the SQL null value.
c) An SQL/JSON array, defined as an ordered list of zero or more SQL/JSON
items, called the SQL/JSON
elements of the SQL/JSON array.
d) An SQL/JSON object, defined as an unordered collection of zero or more
SQL/JSON members….
“
Our jsonb corresponds to SQL/JSON with UNIQUE KEYS and implicit ordering of
keys and our main intention was to provide support of jsonb as a most
important and usable data type.
We created repository for reviewing (ask for write access) -
https://github.com/postgrespro/sqljson/tree/sqljson
Examples of usage can be found in src/test/regress/sql/sql_json.sql
The whole documentation about json support should be reorganized and added,
and we plan to do this before release. We need help of community here.
Our goal is to provide support of main features of SQL/JSON to release 10,
as we discussed at developers meeting in Brussels (Andrew Dunstan has
kindly agreed to review the patch).
We had not much time to develop the complete support, because of standard
availability), but hope all major features are here, namely, all nine
functions as described in the standard (but see implementation notes below):
“All manipulation (e.g., retrieval, creation, testing) of SQL/JSON items is
performed through a number of SQL/JSON functions. There are nine such
functions, categorized as SQL/JSON retrieval functions and SQL/JSON
construction functions. The SQL/JSON retrieval functions are characterized
by operating on JSON data and returning an SQL value (possibly a Boolean
value) or a JSON value. The SQL/JSON construction functions return JSON
data created from operations on SQL data or other JSON data.
The SQL/JSON retrieval functions are:
— <JSON value function>: extracts an SQL value of a predefined type from a
JSON text.
— <JSON query>: extracts a JSON text from a JSON text.
— <JSON table>: converts a JSON text to an SQL table.
— <JSON predicate>: tests whether a string value is or is not properly
formed JSON text.
— <JSON exists predicate>: tests whether an SQL/JSON path expression
returns any SQL/JSON items.
The SQL/JSON construction functions are:
— <JSON object constructor>: generates a string that is a serialization of
an SQL/JSON object.
— <JSON array constructor>: generates a string that is a serialization of
an SQL/JSON array.
— <JSON object aggregate constructor>: generates, from an aggregation of
SQL data, a string that is a serialization
of an SQL/JSON object.
— <JSON array aggregate constructor>: generates, from an aggregation of SQL
data, a string that is a serialization
of an SQL/JSON array.
A JSON-returning function is an SQL/JSON construction function or
JSON_QUERY.”
The standard describes SQL/JSON path language, which used by SQL/JSON query
operators to query JSON. It defines path language as string literal. We
implemented the path language as JSONPATH data type, since other
approaches are not friendly to planner and executor.
The functions and JSONPATH provide a new functionality for json support,
namely, ability to operate (in standard specified way) with json structure
at SQL-language level - the often requested feature by the users.
The patch is consists of about 15000 insertions (about 5000 lines are from
tests), passes all regression tests and doesn’t touches critical parts, so
we hope with community help to bring it to committable state.
Authors: Nikita Glukhov, Teodor Sigaev, Oleg Bartunov and Alexander Korotkov
Implementation notes:
1.
We didn’t implemented ‘datetime’ support, since it’s not clear from
standard.
2.
JSON_OBJECT/JSON_OBJECTAGG (KEY <key> VALUE <value>, ...) doesn’t
implemented, only (<key>:<value>, …) and (<key> VALUE <value>, …) are
supported, because of grammar conflicts with leading KEY keyword.
3.
FORMAT (JSON|JSONB)) in JSON_ARRAYAGG with subquery doesn’t supported,
because of grammar conflicts with non-reserved word FORMAT.
4.
JSONPATH implemented only for jsonb data type , so JSON_EXISTS(),
JSON_VALUE(), JSON_QUERY() and JSON_TABLE() doesn’t works if context item
is of json data type.
5.
Some methods and predicates for JSONPATH not yet implemented, for
example .type(), .size(), .keyvalue(), predicates like_regex, starts
with, etc. They are not key features and we plan to make them in next
release.
6.
JSONPATH doesn’t support expression for index array, like [2+3 to
$upperbound], only simple constants like [5, 7 to 12] are supported.
7.
JSONPATH extensions to standard: .** (wildcard path accessor), .key
(member accessor without leading @).
8.
FORMAT JSONB extension to standard for returning jsonb - standard
specifies possibility of returning custom type.
9.
JSON_EXISTS(), JSON_VALUE(), JSON_QUERY() are implemented using new
executor node JsonExpr.
10.
JSON_TABLE() is transformed into joined subselects with JSON_VALUE() and
JSON_QUERY() in target list.
11.
JSON_OBJECT(), JSON_ARRAY() constructors and IS JSON predicate are
transformed into raw function calls.
12.
Added explicit casts bytea=>jsonb and jsonb=>bytea (for jsonb=>bytea
output using RETURNING bytea FORMAT JSONB and corresponding bytea=>jsonb
input using <jsonb_bytea_expr> FORMAT JSONB).
Best regards,
Oleg
玩法
https://github.com/postgrespro/sqljson/blob/sqljson/src/test/regress/sql/sql_json.sql
这个patch的讨论,详见邮件组,本文末尾URL。
PostgreSQL社区的作风非常严谨,一个patch可能在邮件组中讨论几个月甚至几年,根据大家的意见反复的修正,patch合并到master已经非常成熟,所以PostgreSQL的稳定性也是远近闻名的。
参考
https://commitfest.postgresql.org/13/1063/
https://www.postgresql.org/message-id/flat/CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com#CAF4Au4w2x-5LTnN_bxky-mq4=WOqsGsxSpENCzHRAzSnEd8+WQ@mail.gmail.com