PostgreSQL schemaless 的实现(类mongodb collection)

1 minute read

背景

使用mongodb时，并不需要先建表(collection)，直接就可以往里面写，原因是collection本事就是固定的BSON结构。

所以当用户插入时，如果表不存在，建一个BSON结构的colleciton即可。

而关系数据库无法做到这一点，因为关系数据库的表结构都是事先构建，并且在写入数据时，是需要检查对应的类型，约束的。

那么有没有办法让PostgreSQL关系数据库也实现类似mongo这种schemaless的表呢?

函数式写入

用户通过调用函数，写入数据。

在函数中处理并实现schemaless。

例子

创建一个自动建表的函数，用于自动创建目标表。

create or replace function create_schemaless(target name) returns void as $$  
declare  
begin  
  execute format('create table if not exists %I (content jsonb)', target);  
exception when others then  
  return;  
end;  
$$ language plpgsql strict;  

创建一个插入数据的函数，使用动态SQL，如果遇到表不存在的错误，则调用建表函数进行建表。

create or replace function ins_schemaless(target name, content jsonb) returns void as $$  
declare  
begin  
  execute format('insert into %I values (%L)', target, content);  
  exception   
    WHEN SQLSTATE '42P01' THEN   
    perform create_schemaless(target);  
    execute format('insert into %I values (%L)', target, content);   
end;  
$$ language plpgsql strict;  

调用函数插入数据，不需要建表，会自动创建。

postgres=# select ins_schemaless('abcde','{"a":123.1}');  
 ins_schemaless   
----------------  
   
(1 row)  
  
postgres=# select * from abcde;  
   content      
--------------  
 {"a": 123.1}  
(1 row)  
  
postgres=# select ins_schemaless('abcde','{"a":123.1}');  
 ins_schemaless   
----------------  
   
(1 row)  
  
postgres=# select * from abcde;  
   content      
--------------  
 {"a": 123.1}  
 {"a": 123.1}  
(2 rows)  
  
postgres=# select ins_schemaless('abcdefg','{"a":123.1}');  
 ins_schemaless   
----------------  
   
(1 row)  
  
postgres=# select * from abcdefg;  
   content      
--------------  
 {"a": 123.1}  
(1 row)  

函数支持并发插入，不会相互影响。

性能

由于使用了动态SQL，性能略差。

transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 32  
number of threads: 32  
duration: 120 s  
number of transactions actually processed: 26908558  
latency average = 0.143 ms  
latency stddev = 1.397 ms  
tps = 224219.413026 (including connections establishing)  
tps = 224353.960206 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.143  select ins_schemaless('c','{}');  

使用绑定变量，性能如下。

transaction type: ./test.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 32  
number of threads: 32  
duration: 120 s  
number of transactions actually processed: 39684200  
latency average = 0.097 ms  
latency stddev = 2.192 ms  
tps = 330698.368601 (including connections establishing)  
tps = 330708.294542 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.097  insert into c values ('{}');  

损失11万的QPS，获得schemaless的能力，要不要呢？

当然，如果是从内核层面来支持就更爽了，性能问题可能也能得到解决，比如。

insert into schemaless tbl values (jsonb);

digoal’s 大量PostgreSQL文章入口

Twitter Facebook Google+ LinkedIn

Digoal.zhou

PostgreSQL schemaless 的实现(类mongodb collection)

背景

函数式写入

例子

性能

digoal’s 大量PostgreSQL文章入口

You May Also Enjoy

PostgreSQL(PPAS 兼容Oracle) 从零开始入门手册 - 珍藏版

PostgreSQL pipelinedb 流计算插件 - IoT应用 - 实时轨迹聚合

PostgreSQL plpgsql 存储过程、函数 - 状态、异常变量打印、异常捕获… - GET [STACKED] DIAGNOSTICS

PostgreSQL datediff 日期间隔（单位转换）兼容SQL用法