PostgreSQL FDW mongo_fdw usage

4 minute read

背景

https://wiki.postgresql.org/wiki/Foreign_data_wrappers

mongo_fdw是PostgreSQL众多FDW中的一个, 显而易见是连接mongoDB的.

最近在看CitusDB的时候, 文档中提到了mongo_fdw, 实际上在CitusDB中并没有发挥这个extension的distribute query特性.

用在PostgreSQL单节点的环境当然也是可以的.

目前mongo_fdw由EDB在维护,建议用户使用EDB的版本。本文成文太早,还没有EDB版本的MONGO_FDW。

https://github.com/EnterpriseDB/mongo_fdw

下面介绍一下mongo_fdw在PostgreSQL数据库环境中的安装和用法.

1. 安装

下载

ocz@db-172-16-3-150-> wget --no-check-certificate https://github.com/citusdata/mongo_fdw/archive/master.zip  
ocz@db-172-16-3-150-> unzip master  
ocz@db-172-16-3-150-> mv mongo_fdw-master postgresql-9.2.1/contrib/  

编译

su - root  
[root@db-172-16-3-150 ~]# . /home/ocz/.bash_profile   
root@db-172-16-3-150-> which pg_config  
/home/ocz/pgsql9.2.1/bin/pg_config  
root@db-172-16-3-150-> cd /home/ocz/postgresql-9.2.1/contrib/mongo_fdw-master/  
root@db-172-16-3-150-> make clean  
root@db-172-16-3-150-> make  
root@db-172-16-3-150-> make install  

2. 创建extension

root@db-172-16-3-150-> su - ocz  
ocz@db-172-16-3-150-> psql  
psql (9.2.1)  
Type "help" for help.  
postgres=# \c digoal postgres  
digoal=# create extension mongo_fdw;  
CREATE EXTENSION  

3. 创建mongoDB测试环境

[root@db-172-16-3-150 ~]# wget http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.2.3.tgz  
[root@db-172-16-3-150 ~]# tar -zxvf mongodb-linux-x86_64-2.2.3.tgz  
[root@db-172-16-3-150 ~]# mv mongodb-linux-x86_64-2.2.3 /opt/  
  
[root@db-172-16-3-150 ~]# useradd mongo  
[root@db-172-16-3-150 ~]# chown -R mongo:mongo /opt/mongodb-linux-x86_64-2.2.3  
  
[root@db-172-16-3-150 ~]# su - mongo  
mongo@db-172-16-3-150-> vi .bash_profile  
export MONGO_HOME=/opt/mongodb-linux-x86_64-2.2.3  
export PATH=$MONGO_HOME/bin:$PATH:.  
umask 022  
alias rm='rm -i'  
alias ll='ls -lh'  
  
su - root  
[root@db-172-16-3-150 data03]# mkdir -p /data03/mongodata  
[root@db-172-16-3-150 data03]# chown -R mongo:mongo /data03/mongodata  
  
[root@db-172-16-3-150 data03]# vi /opt/mongodb-linux-x86_64-2.2.3/mongo11111.conf  
logpath = /data03/mongodata/mongod_5281.log  
logappend = true  
fork = true  
port = 11111  
noauth = true  
bind_ip = 0.0.0.0  
dbpath = /data03/mongodata/  
maxConns = 20000  
pidfilepath = /data03/mongodata/mongod_11111.pid  
nounixsocket = true  
directoryperdb = true  
journal = true  
journalCommitInterval = 40  
profile = 0  
nohttpinterface = true  
nssize = 2000  
oplogSize = 31280  
[root@db-172-16-3-150 mongodb-linux-x86_64-2.2.3]# chown mongo:mongo /opt/mongodb-linux-x86_64-2.2.3/mongo11111.conf  

4. 启动mongodb数据库

su - mongo  
mongo@db-172-16-3-150-> mongod -f /opt/mongodb-linux-x86_64-2.2.3/mongo11111.conf  
mongo@db-172-16-3-150-> mongo 127.0.0.1:11111/admin  
MongoDB shell version: 2.2.3  
connecting to: 127.0.0.1:11111/admin  
Welcome to the MongoDB shell.  
For interactive help, type "help".  
For more comprehensive documentation, see  
        http://docs.mongodb.org/  
Questions? Try the support group  
        http://groups.google.com/group/mongodb-user  
> show dbsshow dbs  
local   (empty)  

5. 创建外部表

[root@db-172-16-3-150 ~]# su - ocz  
ocz@db-172-16-3-150-> psql digoal digoal  
psql (9.2.1)  
Type "help" for help.  
digoal=> \c digoal postgres  
You are now connected to database "digoal" as user "postgres".  
digoal=# \dx  
                                  List of installed extensions  
   Name    | Version |   Schema   |                         Description                            
-----------+---------+------------+--------------------------------------------------------------  
 dblink    | 1.0     | public     | connect to other PostgreSQL databases from within a database  
 hll       | 1.0     | public     | type for storing hyperloglog data  
 mongo_fdw | 1.0     | public     | foreign data wrapper for MongoDB access  
 plcoffee  | 1.3.0   | pg_catalog | PL/CoffeeScript (v8) trusted procedural language  
 plls      | 1.3.0   | pg_catalog | PL/LiveScript (v8) trusted procedural language  
 plpgsql   | 1.0     | pg_catalog | PL/pgSQL procedural language  
 plv8      | 1.3.0   | pg_catalog | PL/JavaScript (v8) trusted procedural language  
(7 rows)  
  
digoal=# CREATE SERVER mongo_server FOREIGN DATA WRAPPER mongo_fdw  
  OPTIONS (address '127.0.0.1', port '11111');  
  
digoal=# CREATE FOREIGN TABLE customer_reviews  
(  
    customer_id TEXT,  
    review_date TIMESTAMP,  
    review_rating INTEGER,  
    product_id CHAR(10),  
    product_title TEXT,  
    product_group TEXT,  
    product_category TEXT,  
    similar_product_ids CHAR(10)[]  
)  
SERVER mongo_server  
OPTIONS (database 'test', collection 'customer_reviews');  

6. 查询外部表

digoal=# select * from customer_reviews ;  
 customer_id | review_date | review_rating | product_id | product_title | product_group | product_category | similar_product_ids   
-------------+-------------+---------------+------------+---------------+---------------+------------------+---------------------  
(0 rows)  
> show dbsshow dbs  
local   (empty)  
test    (empty)  

7. 下载测试数据

[root@db-172-16-3-150 ~]# wget http://examples.citusdata.com/customer_reviews_1998.json.gz  
[root@db-172-16-3-150 ~]# wget http://examples.citusdata.com/customer_reviews_1999.json.gz  
[root@db-172-16-3-150 ~]# gunzip customer_reviews_1998.json.gz  

8. 导入测试数据

[root@db-172-16-3-150 ~]# . /home/mongo/.bash_profile   
root@db-172-16-3-150-> which mongo  
/opt/mongodb-linux-x86_64-2.2.3/bin/mongo  
root@db-172-16-3-150-> mongoimport --port 11111 --db test --collection customer_reviews --type json --file customer_reviews_1998.json  
connected to: 127.0.0.1:11111  
root@db-172-16-3-150-> mongo 127.0.0.1:11111/test  
MongoDB shell version: 2.2.3  
connecting to: 127.0.0.1:11111/test  
> show collections  
customer_reviews  
system.indexes  
> db.customer_reviews.count()  
589859  

9. PostgreSQL 外部表查询

digoal=# select * from customer_reviews limit 1;  
  customer_id  |     review_date     | review_rating | product_id |               product_title                | product_group |   p  
roduct_category   |                   similar_product_ids                      
---------------+---------------------+---------------+------------+--------------------------------------------+---------------+----  
------------------+----------------------------------------------------------  
 AE22YDHSBFYIP | 1970-12-30 00:00:00 |             5 | 1551803542 | Start and Run a Coffee Bar (Start & Run a) | Book          | Bus  
iness & Investing | {0471136174,0910627312,047112138X,0786883561,0201570483}  
(1 row)  
digoal=# select count(*) from customer_reviews ;  
 count    
--------  
 589859  
(1 row)  
digoal=# SELECT  
    round(avg(review_rating), 2),  
    width_bucket(length(product_title), 1, 50, 5) as title_length,  
    count(*)  
FROM  
    customer_reviews  
WHERE  
    product_group='Book' AND  
    review_date >= '1998-01-01' AND  
    review_date < date '1998-01-01' + interval '1 year'  
GROUP BY  
    title_length  
ORDER BY  
    title_length;  
 round | title_length | count    
-------+--------------+--------  
  4.39 |            1 |  40058  
  4.29 |            2 | 125688  
  4.41 |            3 |  76425  
  4.36 |            4 |  51531  
  4.30 |            5 |  34974  
  4.39 |            6 |  30349  
(6 rows)  
  
digoal=# explain analyze verbose SELECT  
    round(avg(review_rating), 2),  
    width_bucket(length(product_title), 1, 50, 5) as title_length,  
    count(*)  
FROM  
    customer_reviews  
WHERE  
    product_group='Book' AND  
    review_date >= '1998-01-01' AND  
    review_date < date '1998-01-01' + interval '1 year'  
GROUP BY  
    title_length  
ORDER BY  
    title_length;  
                                                                                                                        QUERY PLAN    
                                                                                                                        
------------------------------------------------------------------------------------------------------------------------------------  
----------------------------------------------------------------------------------------------------------------------  
 Sort  (cost=21457.71..21457.71 rows=1 width=36) (actual time=3221.772..3221.772 rows=6 loops=1)  
   Output: (round(avg(review_rating), 2)), (width_bucket((length(product_title))::double precision, 1::double precision, 50::double   
precision, 5)), (count(*))  
   Sort Key: (width_bucket((length(customer_reviews.product_title))::double precision, 1::double precision, 50::double precision, 5)  
)  
   Sort Method: quicksort  Memory: 25kB  
   ->  HashAggregate  (cost=21457.68..21457.70 rows=1 width=36) (actual time=3221.746..3221.754 rows=6 loops=1)  
         Output: round(avg(review_rating), 2), (width_bucket((length(product_title))::double precision, 1::double precision, 50::dou  
ble precision, 5)), count(*)  
         ->  Foreign Scan on public.customer_reviews  (cost=5.00..21457.56 rows=15 width=36) (actual time=225.305..3036.528 rows=359  
025 loops=1)  
               Output: width_bucket((length(product_title))::double precision, 1::double precision, 50::double precision, 5), review  
_rating  
               Filter: ((customer_reviews.review_date >= '1998-01-01 00:00:00'::timestamp without time zone) AND (customer_reviews.r  
eview_date < '1999-01-01 00:00:00'::timestamp without time zone) AND (customer_reviews.product_group = 'Book'::text))  
               Foreign Namespace: test.customer_reviews  
 Total runtime: 3222.184 ms  
(11 rows)  

参考

1. https://github.com/citusdata/mongo_fdw

2. http://pgxn.org/dist/mongo_fdw

3. https://github.com/EnterpriseDB/mongo_fdw

其他FDW

1. hive

https://github.com/youngwookim/hive-fdw-for-postgresql

2. PostgreSQL Foreign Table - pgsql_fdw

http://blog.163.com/digoal@126/blog/static/163877040201231514057303/

3. PostgreSQL Foreign Table - oracle_fdw 1

http://blog.163.com/digoal@126/blog/static/163877040201181505331588/

4. PostgreSQL Foreign Table - oracle_fdw 2

http://blog.163.com/digoal@126/blog/static/16387704020118151162340/

5. PostgreSQL Foreign Table - oracle_fdw 3

http://blog.163.com/digoal@126/blog/static/16387704020118951953408/

6. PostgreSQL Foreign Table - file_fdw

http://blog.163.com/digoal@126/blog/static/163877040201141641148311/

7. PostgreSQL Foreign Table - redis_fdw

http://blog.163.com/digoal@126/blog/static/16387704020119181188247/

8. PostgreSQL Foreign Table - mysql_fdw 1

http://blog.163.com/digoal@126/blog/static/1638770402011111233524987/

9. PostgreSQL Foreign Table - mysql_fdw 2

http://blog.163.com/digoal@126/blog/static/16387704020121108551698/

10. PostgreSQL file text array fdw used for unpredictable columns of text file

http://blog.163.com/digoal@126/blog/static/163877040201302410511382/

Flag Counter

digoal’s 大量PostgreSQL文章入口