Greenplum 表空间和filespace的用法

8 minute read

背景

Greenplum支持表空间,创建表空间时,需要指定filespace。

postgres=# \h create tablespace;  
Command:     CREATE TABLESPACE  
Description: define a new tablespace  
Syntax:  
CREATE TABLESPACE tablespace_name [OWNER username] FILESPACE filespace_name  

那么什么是filespace呢?

GP在初始化完后,有一个默认的filespace : pg_system。

表空间pg_default和pg_global都放在这个filespace下面。

也就是说一个filespace可以被多个表空间使用。

postgres=# select oid,* from pg_filespace;  
 oid  |  fsname   | fsowner   
------+-----------+---------  
 3052 | pg_system |      10  
(1 row)  
postgres=# select * from pg_tablespace;  
  spcname   | spcowner | spclocation | spcacl | spcprilocations | spcmirlocations | spcfsoid   
------------+----------+-------------+--------+-----------------+-----------------+----------  
 pg_default |       10 |             |        |                 |                 |     3052  
 pg_global  |       10 |             |        |                 |                 |     3052  
(2 rows)  

还有TEMPORARY_FILES和TRANSACTION_FILES对应的filespace如下:

$gpfilespace --showtempfilespace  
20151218:16:02:07:063949 gpfilespace:127.0.0.1:digoal-[INFO]:-  
A tablespace requires a file system location to store its database  
files. A filespace is a collection of file system locations for all components  
in a Greenplum system (primary segment, mirror segment and master instances).  
Once a filespace is created, it can be used by one or more tablespaces.  
  
20151218:16:02:07:063949 gpfilespace:127.0.0.1:digoal-[INFO]:-Getting filespace information for TEMPORARY_FILES  
20151218:16:02:08:063949 gpfilespace:127.0.0.1:digoal-[INFO]:-Checking for filespace consistency  
20151218:16:02:08:063949 gpfilespace:127.0.0.1:digoal-[INFO]:-Obtaining current filespace entries used by TEMPORARY_FILES  
20151218:16:02:09:063949 gpfilespace:127.0.0.1:digoal-[INFO]:-TEMPORARY_FILES OIDs are consistent for pg_system filespace  
20151218:16:02:11:063949 gpfilespace:127.0.0.1:digoal-[INFO]:-TEMPORARY_FILES entries are consistent for pg_system filespace  
20151218:16:02:11:063949 gpfilespace:127.0.0.1:digoal-[INFO]:-Obtaining current filespace entries used by TEMPORARY_FILES  
20151218:16:02:11:063949 gpfilespace:127.0.0.1:digoal-[INFO]:-Current Filespace for TEMPORARY_FILES is pg_system  
20151218:16:02:11:063949 gpfilespace:127.0.0.1:digoal-[INFO]:-1    /disk1/digoal/gpdata/gpseg-1  
20151218:16:02:11:063949 gpfilespace:127.0.0.1:digoal-[INFO]:-25    /disk1/digoal/gpdata_mirror/gpseg0  
20151218:16:02:11:063949 gpfilespace:127.0.0.1:digoal-[INFO]:-2    /disk1/digoal/gpdata/gpseg0  
......  
  
$gpfilespace --showtransfilespace  
20151218:16:09:41:071104 gpfilespace:127.0.0.1:digoal-[INFO]:-  
A tablespace requires a file system location to store its database  
files. A filespace is a collection of file system locations for all components  
in a Greenplum system (primary segment, mirror segment and master instances).  
Once a filespace is created, it can be used by one or more tablespaces.  
  
20151218:16:09:41:071104 gpfilespace:127.0.0.1:digoal-[INFO]:-Getting filespace information for TRANSACTION_FILES  
20151218:16:09:41:071104 gpfilespace:127.0.0.1:digoal-[INFO]:-Checking for filespace consistency  
20151218:16:09:41:071104 gpfilespace:127.0.0.1:digoal-[INFO]:-Obtaining current filespace entries used by TRANSACTION_FILES  
20151218:16:09:42:071104 gpfilespace:127.0.0.1:digoal-[INFO]:-TRANSACTION_FILES OIDs are consistent for pg_system filespace  
20151218:16:09:44:071104 gpfilespace:127.0.0.1:digoal-[INFO]:-TRANSACTION_FILES entries are consistent for pg_system filespace  
20151218:16:09:44:071104 gpfilespace:127.0.0.1:digoal-[INFO]:-Obtaining current filespace entries used by TRANSACTION_FILES  
20151218:16:09:44:071104 gpfilespace:127.0.0.1:digoal-[INFO]:-Current Filespace for TRANSACTION_FILES is pg_system  
20151218:16:09:44:071104 gpfilespace:127.0.0.1:digoal-[INFO]:-1    /disk1/digoal/gpdata/gpseg-1  
20151218:16:09:44:071104 gpfilespace:127.0.0.1:digoal-[INFO]:-25    /disk1/digoal/gpdata_mirror/gpseg0  
20151218:16:09:44:071104 gpfilespace:127.0.0.1:digoal-[INFO]:-2    /disk1/digoal/gpdata/gpseg0  
20151218:16:09:44:071104 gpfilespace:127.0.0.1:digoal-[INFO]:-26    /disk1/digoal/gpdata_mirror/gpseg1  
......  

如果我们的greenplum集群中,有SSD硬盘,又有SATA硬盘。怎样更好的利用这些空间呢?

方法1.

用flashcache或bcache,通过device mapper技术,将SSD和SATA绑定,做成块设备。

再通过 逻辑卷管理 或者 软RAID 或者 brtfs or zfs管理起来,做成大的文件系统。

(还有一种方法是用RHEL 7提供的LVM,可以将SSD作为二级缓存)

这种方法对GP来说,是混合动力,可以创建一个或多个文件系统(都具备混合动力)。

所以建议只需要一个pg_system filespace就够了(除非容量到了文件系统管理的极限,那样的话可以分成多个文件系统)。

用多个文件系统的情况下,就需要对每个文件系统,创建对应的目录,以及filespace。

方法2.

SSD和SATA分开,各自创建各自的文件系统。

对每个文件系统,创建对应的目录,以及filespace。

创建filespace非常简单,分几步。

如下:

1. 创建目录,需要在所有的角色对应的主机中创建。给予gp 操作系统管理用户对应的权限。

master  
$ mkdir /ssd1/gpdata/master_p  
$ chown gpadmin:gpadmin /ssd1/gpdata/master_p  
  
master standby  
$ mkdir /ssd1/gpdata/master_s  
$ chown gpadmin:gpadmin /ssd1/gpdata/master_s  
  
segment  
$ mkdir /ssd1/gpdata_p  
$ chown gpadmin:gpadmin /ssd1/gpdata_p  
  
segment mirror  
$ mkdir /ssd1/gpdata_s  
$ chown gpadmin:gpadmin /ssd1/gpdata_s  

2. 查看系统配置

postgres=# select dbid,content,role,preferred_role,hostname,port from gp_segment_configuration order by role,dbid;  
 dbid | content | role | preferred_role |     hostname      | port    
------+---------+------+----------------+-------------------+-------  
    2 |       0 | m    | p              | digoal.sqa.zmf | 40000  
    3 |       1 | m    | p              | digoal.sqa.zmf | 40001  
    4 |       2 | m    | p              | digoal.sqa.zmf | 40002  
    5 |       3 | m    | p              | digoal.sqa.zmf | 40003  
    6 |       4 | m    | p              | digoal.sqa.zmf | 40004  
    7 |       5 | m    | p              | digoal.sqa.zmf | 40005  
    8 |       6 | m    | p              | digoal.sqa.zmf | 40006  
    9 |       7 | m    | p              | digoal.sqa.zmf | 40007  
   10 |       8 | m    | p              | digoal.sqa.zmf | 40008  
   11 |       9 | m    | p              | digoal.sqa.zmf | 40009  
   12 |      10 | m    | p              | digoal.sqa.zmf | 40010  
   13 |      11 | m    | p              | digoal.sqa.zmf | 40011  
   14 |      12 | m    | p              | digoal.sqa.zmf | 40012  
   15 |      13 | m    | p              | digoal.sqa.zmf | 40013  
   16 |      14 | m    | p              | digoal.sqa.zmf | 40014  
   17 |      15 | m    | p              | digoal.sqa.zmf | 40015  
   18 |      16 | m    | p              | digoal.sqa.zmf | 40016  
   19 |      17 | m    | p              | digoal.sqa.zmf | 40017  
   20 |      18 | m    | p              | digoal.sqa.zmf | 40018  
   21 |      19 | m    | p              | digoal.sqa.zmf | 40019  
   22 |      20 | m    | p              | digoal.sqa.zmf | 40020  
   23 |      21 | m    | p              | digoal.sqa.zmf | 40021  
   24 |      22 | m    | p              | digoal.sqa.zmf | 50011  
    1 |      -1 | p    | p              | digoal.sqa.zmf |  1921  
   25 |       0 | p    | m              | digoal.sqa.zmf | 41000  
   26 |       1 | p    | m              | digoal.sqa.zmf | 41001  
   27 |       2 | p    | m              | digoal.sqa.zmf | 41002  
   28 |       3 | p    | m              | digoal.sqa.zmf | 41003  
   29 |       4 | p    | m              | digoal.sqa.zmf | 41004  
   30 |       5 | p    | m              | digoal.sqa.zmf | 41005  
   31 |       6 | p    | m              | digoal.sqa.zmf | 41006  
   32 |       7 | p    | m              | digoal.sqa.zmf | 41007  
   33 |       8 | p    | m              | digoal.sqa.zmf | 41008  
   34 |       9 | p    | m              | digoal.sqa.zmf | 41009  
   35 |      10 | p    | m              | digoal.sqa.zmf | 41010  
   36 |      11 | p    | m              | digoal.sqa.zmf | 41011  
   37 |      12 | p    | m              | digoal.sqa.zmf | 41012  
   38 |      13 | p    | m              | digoal.sqa.zmf | 41013  
   39 |      14 | p    | m              | digoal.sqa.zmf | 41014  
   40 |      15 | p    | m              | digoal.sqa.zmf | 41015  
   41 |      16 | p    | m              | digoal.sqa.zmf | 41016  
   42 |      17 | p    | m              | digoal.sqa.zmf | 41017  
   43 |      18 | p    | m              | digoal.sqa.zmf | 41018  
   44 |      19 | p    | m              | digoal.sqa.zmf | 41019  
   45 |      20 | p    | m              | digoal.sqa.zmf | 41020  
   46 |      21 | p    | m              | digoal.sqa.zmf | 41021  
   47 |      22 | p    | m              | digoal.sqa.zmf | 41022  
(47 rows)  

3. 创建配置文件,格式如下,假如我要创建一个名为ssd1的filespace。

字段包含(hostname, dbid, DIR/$prefix + $content)

$ vi conf  
filespace:ssd1  
digoal.sqa.zmf:1:/ssd1/gpdata/master_p/gp-1  
digoal.sqa.zmf:2:/ssd1/gpdata_p/gp0  
digoal.sqa.zmf:3:/ssd1/gpdata_p/gp1  
......  
digoal.sqa.zmf:25:/ssd1/gpdata_s/gp0  
digoal.sqa.zmf:26:/ssd1/gpdata_s/gp1  
......  

还有一种方法是使用gpfilespace -o conf来生成配置文件。(在提示时输入目录名DIR的部分即可)

4. 创建filespace

gpfilespace -c conf -h 127.0.0.1 -p 1921 -U digoal -W digoal  
  
20151218:17:16:39:108364 gpfilespace:127.0.0.1:digoal-[INFO]:-Connecting to database  
20151218:17:16:39:108364 gpfilespace:127.0.0.1:digoal-[INFO]:-Filespace "ssd1" successfully created  
......  

然后gpfilespace会自动在数据库执行以下DDL SQL。创建对应的filespace。

所以我们也可以自己在数据库中执行SQL来创建filespace。

CREATE FILESPACE ssd1   
(  
  1: '/disk1/digoal/new_p/gp-1',  
  2: '/disk1/digoal/new_p/gp0',  
  3: '/disk1/digoal/new_p/gp1',  
  4: '/disk1/digoal/new_p/gp2',  
  5: '/disk1/digoal/new_p/gp3',  
  6: '/disk1/digoal/new_p/gp4',  
  7: '/disk1/digoal/new_p/gp5',  
  8: '/disk1/digoal/new_p/gp6',  
  9: '/disk1/digoal/new_p/gp7',  
  10: '/disk1/digoal/new_p/gp8',  
  11: '/disk1/digoal/new_p/gp9',  
  12: '/disk1/digoal/new_p/gp10',  
  13: '/disk1/digoal/new_p/gp11',  
  14: '/disk1/digoal/new_p/gp12',  
  15: '/disk1/digoal/new_p/gp13',  
  16: '/disk1/digoal/new_p/gp14',  
  17: '/disk1/digoal/new_p/gp15',  
  18: '/disk1/digoal/new_p/gp16',  
  19: '/disk1/digoal/new_p/gp17',  
  20: '/disk1/digoal/new_p/gp18',  
  21: '/disk1/digoal/new_p/gp19',  
  22: '/disk1/digoal/new_p/gp20',  
  23: '/disk1/digoal/new_p/gp21',  
  24: '/disk1/digoal/new_p/gp22',  
  25: '/disk1/digoal/new_s/gp0',  
  26: '/disk1/digoal/new_s/gp1',  
  27: '/disk1/digoal/new_s/gp2',  
  28: '/disk1/digoal/new_s/gp3',  
  29: '/disk1/digoal/new_s/gp4',  
  30: '/disk1/digoal/new_s/gp5',  
  31: '/disk1/digoal/new_s/gp6',  
  32: '/disk1/digoal/new_s/gp7',  
  33: '/disk1/digoal/new_s/gp8',  
  34: '/disk1/digoal/new_s/gp9',  
  35: '/disk1/digoal/new_s/gp10',  
  36: '/disk1/digoal/new_s/gp11',  
  37: '/disk1/digoal/new_s/gp12',  
  38: '/disk1/digoal/new_s/gp13',  
  39: '/disk1/digoal/new_s/gp14',  
  40: '/disk1/digoal/new_s/gp15',  
  41: '/disk1/digoal/new_s/gp16',  
  42: '/disk1/digoal/new_s/gp17',  
  43: '/disk1/digoal/new_s/gp18',  
  44: '/disk1/digoal/new_s/gp19',  
  45: '/disk1/digoal/new_s/gp20',  
  46: '/disk1/digoal/new_s/gp21',  
  47: '/disk1/digoal/new_s/gp22'  
);  

现在你可以使用这个filespace了.

例如

1. 将temp , trans移动到这个新的filespace.

$gpfilespace --movetempfilespace ssd1  
  
20151218:17:17:29:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-  
A tablespace requires a file system location to store its database  
files. A filespace is a collection of file system locations for all components  
in a Greenplum system (primary segment, mirror segment and master instances).  
Once a filespace is created, it can be used by one or more tablespaces.  
  
20151218:17:17:29:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Database was started in NORMAL mode  
20151218:17:17:29:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Stopping Greenplum Database  
20151218:17:17:57:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Starting Greenplum Database in master only mode  
20151218:17:18:02:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Checking if filespace ssd1 exists  
20151218:17:18:02:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Checking if filespace is same as current filespace  
20151218:17:18:02:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Stopping Greenplum Database in master only mode  
20151218:17:18:04:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Checking for connectivity  
20151218:17:18:04:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Obtaining current filespace information  
20151218:17:18:04:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Obtaining current filespace entries used by TEMPORARY_FILES  
20151218:17:18:04:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Obtaining segment information ...  
20151218:17:18:04:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Creating RemoteOperations list  
20151218:17:18:04:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Moving TEMPORARY_FILES filespace from pg_system to ssd1 ...  
20151218:17:18:06:008363 gpfilespace:127.0.0.1:digoal-[INFO]:-Starting Greenplum Database  
  
  
$gpfilespace --movetransfilespace ssd1  
...  
20151218:17:19:17:055389 gpfilespace:127.0.0.1:digoal-[INFO]:-Moving TRANSACTION_FILES filespace from pg_system to ssd1 ...  
20151218:17:21:16:055389 gpfilespace:127.0.0.1:digoal-[INFO]:-Starting Greenplum Database  

2. 创建表空间,使用这个filespace.

postgres=# create tablespace tbs_ssd1 filespace ssd1;  
CREATE TABLESPACE  
postgres=# create table tt(id int) tablespace tbs_ssd1;  
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.  
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.  
CREATE TABLE  
postgres=# select * from pg_tablespace ;  
  spcname   | spcowner | spclocation | spcacl | spcprilocations | spcmirlocations | spcfsoid   
------------+----------+-------------+--------+-----------------+-----------------+----------  
 pg_default |       10 |             |        |                 |                 |     3052  
 pg_global  |       10 |             |        |                 |                 |     3052  
 tbs_ssd1   |       10 |             |        |                 |                 |    69681  
(3 rows)  
  
postgres=# select * from pg_filespace;  
  fsname   | fsowner   
-----------+---------  
 pg_system |      10  
 ssd1      |      10  
(2 rows)  

greenplum为什么会引入filespace的概念?

因为主机目录结构可能不一样,所以原有的目录结构式的方法来创建表空间,可能不够灵活。

最后,如何查看每个节点的filespace和location的关系?

digoal=# select a.dbid,a.content,a.role,a.port,a.hostname,b.fsname,c.fselocation from gp_segment_configuration a,pg_filespace b,pg_filespace_entry c where a.dbid=c.fsedbid and b.oid=c.fsefsoid order by content;  
 dbid | content | role | port  |     hostname      |  fsname   |             fselocation                
------+---------+------+-------+-------------------+-----------+--------------------------------------  
    1 |      -1 | p    |  1921 | digoal193096.zmf | pg_system | /data01/gpdata/master_pgdata/gpseg-1  
    2 |       0 | p    | 40000 | digoal193096.zmf | pg_system | /data01/gpdata/gpseg0  
    3 |       1 | p    | 40001 | digoal193096.zmf | pg_system | /data01/gpdata/gpseg1  
    4 |       2 | p    | 40002 | digoal193096.zmf | pg_system | /data01/gpdata/gpseg2  
    5 |       3 | p    | 40000 | digoal199092.zmf | pg_system | /data01/gpdata/gpseg3  
    6 |       4 | p    | 40001 | digoal199092.zmf | pg_system | /data01/gpdata/gpseg4  
    7 |       5 | p    | 40002 | digoal199092.zmf | pg_system | /data01/gpdata/gpseg5  
    8 |       6 | p    | 40000 | digoal200164.zmf | pg_system | /data01/gpdata/gpseg6  
    9 |       7 | p    | 40001 | digoal200164.zmf | pg_system | /data01/gpdata/gpseg7  
   10 |       8 | p    | 40002 | digoal200164.zmf | pg_system | /data01/gpdata/gpseg8  
   11 |       9 | p    | 40000 | digoal204016.zmf | pg_system | /data01/gpdata/gpseg9  
   12 |      10 | p    | 40001 | digoal204016.zmf | pg_system | /data01/gpdata/gpseg10  
   13 |      11 | p    | 40002 | digoal204016.zmf | pg_system | /data01/gpdata/gpseg11  
   14 |      12 | p    | 40000 | digoal204063.zmf | pg_system | /data01/gpdata/gpseg12  
   15 |      13 | p    | 40001 | digoal204063.zmf | pg_system | /data01/gpdata/gpseg13  
   16 |      14 | p    | 40002 | digoal204063.zmf | pg_system | /data01/gpdata/gpseg14  
   17 |      15 | p    | 40003 | digoal193096.zmf | pg_system | /data01/gpdata/gpseg15  
   18 |      16 | p    | 40003 | digoal199092.zmf | pg_system | /data01/gpdata/gpseg16  
   19 |      17 | p    | 40003 | digoal200164.zmf | pg_system | /data01/gpdata/gpseg17  
   20 |      18 | p    | 40003 | digoal204016.zmf | pg_system | /data01/gpdata/gpseg18  
   21 |      19 | p    | 40003 | digoal204063.zmf | pg_system | /data01/gpdata/gpseg19  
   22 |      20 | p    | 40000 | digoal209198.zmf | pg_system | /data01/gpdata/gpseg22  
   23 |      21 | p    | 40001 | digoal209198.zmf | pg_system | /data01/gpdata/gpseg23  
   24 |      22 | p    | 40002 | digoal209198.zmf | pg_system | /data01/gpdata/gpseg24  
   25 |      23 | p    | 40003 | digoal209198.zmf | pg_system | /data01/gpdata/gpseg25  
(25 rows)  

参考

gpfilespace -h  

Flag Counter

digoal’s 大量PostgreSQL文章入口