PostgreSQL 10.0 preview: feature enhancement - full-text search over JSON content

Background

PostgreSQL 10.0 adds full-text search over JSON content.

Ranking (ts rank) and phrase search are supported as well.

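For ts rank and phrase search, see

"Starting from tricky fuzzy queries: one of PostgreSQL's signature techniques, the principles and technical background of GIN, GiST, SP-GiST, and RUM indexes"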
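
As a rough sketch of how ranking and phrase search might be combined on JSON content, the statements below assume a hypothetical table `docs` with a `jsonb` column `body` (neither name comes from the patch). They build an expression GIN index on `to_tsvector('english', body)` and then run a phrase query ordered by `ts_rank`. The form with an explicit configuration is used because index expressions must be immutable, which is assumed here to hold for the jsonb variant just as it does for the text variant.

```sql
-- Hypothetical table; the names docs and body are assumptions for illustration.
CREATE TABLE docs (id serial PRIMARY KEY, body jsonb);

INSERT INTO docs (body) VALUES
  ('{"title": "full text search", "tags": ["postgres", "json"]}'),
  ('{"title": "text search configuration", "note": {"text": "full vacuum"}}');

-- Expression index over every string value found in the JSON document.
CREATE INDEX docs_body_fts ON docs USING gin (to_tsvector('english', body));

-- Phrase query ('text' immediately followed by 'search'), ranked with ts_rank.
SELECT id,
       ts_rank(to_tsvector('english', body), q) AS rank
FROM docs, phraseto_tsquery('english', 'text search') AS q
WHERE to_tsvector('english', body) @@ q
ORDER BY rank DESC;
```

The `WHERE` clause repeats the indexed expression exactly so that the planner is able to consider the GIN index.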

The original proposal, as posted to the mailing list:

Hi all
  
I would like to propose patch with a set of new small functions for fts in
case of jsonb data type:

* to_tsvector(config, jsonb) - make a tsvector from all string values and
  elements of jsonb object. To prevent the situation, when tsquery can find
  a phrase consisting of lexemes from two different values/elements, this
  function will add an increment to position of each lexeme from every new
  value/element.

* ts_headline(config, jsonb, tsquery, options) - generate a headline
  directly from jsonb object

Here are the examples how they work:
  

=# select to_tsvector('{"a": "aaa bbb", "b": ["ccc ddd"], "c": {"d": "eee fff"}}'::jsonb);
                   to_tsvector  
-------------------------------------------------  
 'aaa':1 'bbb':2 'ccc':4 'ddd':5 'eee':7 'fff':8  
(1 row)  
  
  
=# select ts_headline('english', '{"a": "aaa bbb", "b": {"c": "ccc ddd"}}'::jsonb, tsquery('bbb & ddd & hhh'), 'StartSel = <, StopSel = >');
     ts_headline  
----------------------  
 aaa <bbb> ccc <ddd>  
(1 row)  

  
Any comments or suggestions?  
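
The position increment described in the proposal can be checked with phrase operators. Going by the positions in the quoted `to_tsvector` output (`'bbb':2`, `'ccc':4`), a sketch like the following would be expected to accept `aaa <-> bbb`, which lies inside a single value, and to reject `bbb <-> ccc`, whose lexemes come from two different values. The `'simple'` configuration is an assumption made here to keep stemming and stop words out of the picture. Results are not reproduced, since they depend on the version in use.

```sql
-- Same JSON document as in the proposal's first example.
SELECT to_tsvector('simple', doc) @@ to_tsquery('simple', 'aaa <-> bbb') AS same_value,
       to_tsvector('simple', doc) @@ to_tsquery('simple', 'bbb <-> ccc') AS across_values
FROM (VALUES ('{"a": "aaa bbb", "b": ["ccc ddd"], "c": {"d": "eee fff"}}'::jsonb)) AS t(doc);
```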

For the discussion of this patch, see the mailing-list thread linked at the end of this article.

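The PostgreSQL community works very rigorously. A patch may be discussed on the mailing list for months or even years and revised repeatedly based on feedback, so by the time it is merged into master it is already very mature. This is a large part of why PostgreSQL is so well known for its stability.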

References

https://commitfest.postgresql.org/13/1054/

https://www.postgresql.org/message-id/flat/CA+q6zcWm_1Ygg5QOq0gYbnB_=zq7G51uexQt3QEgDJa0qQnPKw@mail.gmail.com#CA+q6zcWm_1Ygg5QOq0gYbnB_=zq7G51uexQt3QEgDJa0qQnPKw@mail.gmail.com
