PostgreSQL 10.0 preview 功能增强 - JSON 内容全文检索
背景
PostgreSQL 10.0 支持JSON内容的全文检索了。
同样支持ts rank和phrase 索引哦。
ts rank, phrase请参考
《从难缠的模糊查询聊开 - PostgreSQL独门绝招之一 GIN , GiST , SP-GiST , RUM 索引原理与技术背景》
Hi all
I would like to propose patch with a set of new small functions for fts in
case of
jsonb data type:
* to_tsvector(config, jsonb) - make a tsvector from all string values and
elements of jsonb object. To prevent the situation, when tsquery can find
a
phrase consisting of lexemes from two different values/elements, this
function will add an increment to position of each lexeme from every new
value/element.
* ts_headline(config, jsonb, tsquery, options) - generate a headline
directly
from jsonb object
Here are the examples how they work:
=# select to_tsvector('{"a": "aaa bbb", "b": ["ccc ddd"], "c": {"d": "eee
fff"}}'::jsonb);
to_tsvector
-------------------------------------------------
'aaa':1 'bbb':2 'ccc':4 'ddd':5 'eee':7 'fff':8
(1 row)
=# select ts_headline('english', '{"a": "aaa bbb", "b": {"c": "ccc
ddd"}}'::jsonb, tsquery('bbb & ddd & hhh'), 'StartSel = <, StopSel = >');
ts_headline
----------------------
aaa <bbb> ccc <ddd>
(1 row)
Any comments or suggestions?
这个patch的讨论,详见邮件组,本文末尾URL。
PostgreSQL社区的作风非常严谨,一个patch可能在邮件组中讨论几个月甚至几年,根据大家的意见反复的修正,patch合并到master已经非常成熟,所以PostgreSQL的稳定性也是远近闻名的。
参考
https://commitfest.postgresql.org/13/1054/
https://www.postgresql.org/message-id/flat/CA+q6zcWm_1Ygg5QOq0gYbnB_=zq7G51uexQt3QEgDJa0qQnPKw@mail.gmail.com#CA+q6zcWm_1Ygg5QOq0gYbnB_=zq7G51uexQt3QEgDJa0qQnPKw@mail.gmail.com