ZFS 12x SATA JBOD vs MSA 2312FC 24x SAS
Background
Today I pitted two hosts against each other to compare the performance of ZFS against a SAN storage array.
ZFS host
Lenovo Reno/Raleigh
8 cores, Intel(R) Xeon(R) CPU E5-2407 0 @ 2.20GHz
24GB RAM
12 x 2TB SATA disks: 2 in RAID1 (system), the other 10 in a zpool (9-disk raidz + 1 spare, plus a partition on the RAID1 volume used as the log device)
Non-default filesystem properties: atime=off, compression=lz4; compression ratio about 3.16
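The original pool-creation commands are not shown in the post; a minimal sketch, assuming the pool name zp1 and the device layout that appear in the zpool status output further below (sda..sdi in the raidz, sdj as spare, the RAID1 partition /dev/sdk4 as log):
# zpool create zp1 raidz sda sdb sdc sdd sde sdf sdg sdh sdi spare sdj
# zpool add zp1 log /dev/sdk4
# zfs set atime=off zp1
# zfs set compression=lz4 zp1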
Storage-array host
DELL R610
16 cores, Intel(R) Xeon(R) CPU E5630 @ 2.53GHz
32GB RAM
Storage: two MSA 2312FC arrays, each with 12 x 300GB SAS disks, 10 of them in RAID 10 and 2 as hot spares.
Configuration of one of the arrays:
Controllers
-----------
Controller ID: A
Serial Number: 3CL947R707
Hardware Version: 56
CPLD Version: 8
Disks: 12
Vdisks: 1
Cache Memory Size (MB): 1024
Host Ports: 2
Disk Channels: 2
Disk Bus Type: SAS
Status: Running
Failed Over: No
Fail Over Reason: Not applicable
# show disks
Location Serial Number Vendor Rev How Used Type Size
Rate(Gb/s) SP Status
-----------------------------------------------------------------------
1.1 3QP2EN7V00009006CJT4 SEAGATE 0004 VDISK VRSC SAS 300.0GB
3.0 OK
1.2 3QP232CM00009952PTMA SEAGATE 0004 VDISK VRSC SAS 300.0GB
3.0 OK
1.3 3QP2GKLZ00009008V1VA SEAGATE 0004 VDISK VRSC SAS 300.0GB
3.0 OK
1.4 3QP2G2LL00009008WAYU SEAGATE 0004 VDISK VRSC SAS 300.0GB
3.0 OK
1.5 3QP2EN0700009007DAPA SEAGATE 0004 VDISK VRSC SAS 300.0GB
3.0 OK
1.6 3QP2G6AE00009008V39E SEAGATE 0004 VDISK VRSC SAS 300.0GB
3.0 OK
1.7 6SJ4ZX1S0000N239DF6Q SEAGATE 0008 GLOBAL SP SAS 300.0GB
3.0 OK
1.8 6SJ4ZTLT0000N239FM3P SEAGATE 0008 VDISK VRSC SAS 300.0GB
3.0 OK
1.9 6SJ4ZY6H0000N2407XKP SEAGATE 0008 VDISK VRSC SAS 300.0GB
3.0 OK
1.10 3QP2FVR100009008Z16H SEAGATE 0004 VDISK VRSC SAS 300.0GB
3.0 OK
1.11 3QP2DZEX00009008WBN9 SEAGATE 0004 VDISK VRSC SAS 300.0GB
3.0 OK
1.12 3QP2CXWS00009008WBLN SEAGATE 0004 GLOBAL SP SAS 300.0GB
3.0 OK
-----------------------------------------------------------------------
Name Size Free Own Pref RAID Disks Spr Chk Status Jobs
Serial Number
------------------------------------------------------------------------
vd01 1498.4GB 100.4GB A A RAID10 10 0 16k FTOL VRSC 66%
00c0ffda61090000144de15100000000
# show cache
System Cache Parameters
-----------------------
Operation Mode: Active-Active ULP
Controller A Cache Parameters
-----------------------------
Write Back Status: Enabled
CompactFlash Status: Installed
Cache Flush: Enabled
Controller B Cache Parameters
-----------------------------
Write Back Status: Enabled
CompactFlash Status: Installed
Cache Flush: Enabled
Filesystem on the storage-array host
[root@db- ~]# lvs
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
lv01 vgdata01 -wi-ao 300.00G
lv02 vgdata01 -wi-ao 100.00G
lv03 vgdata01 -wi-ao 1.17T
lv04 vgdata01 -wi-ao 1001.99G
[root@db- ~]# pvs
PV VG Fmt Attr PSize PFree
/dev/mpath/d09_msa1_vd01vol01 vgdata01 lvm2 a-- 1.27T 0
/dev/mpath/d09_msa2_vd01vol01 vgdata01 lvm2 a-- 1.27T 0
The logical volumes are formatted as ext4 and mounted with noatime,nodiratime.
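A sketch of the corresponding /etc/fstab entry (the mount point and the choice of LV holding the PostgreSQL data are assumptions; only the ext4/noatime/nodiratime choice is stated above):
/dev/vgdata01/lv03   /data01   ext4   noatime,nodiratime   0 0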
The test workload is PostgreSQL 9.2.8.
For now only read speed has been tested, because the ZFS host is a streaming-replication standby (the database configuration on both hosts is identical); a write-speed test follows further below.
COUNT query on an 18GB table on ZFS
digoal=> select count(*) from tbl;
count
----------
48391818
(1 row)
Time: 9998.065 ms
The same query against the storage array
digoal=> select count(*) from tbl;
count
----------
48391818
(1 row)
Time: 64707.770 ms
With the LZ4 compression in ZFS, this test data shrinks by a factor of roughly 2.5. The table's file path and on-disk sizes on the ZFS host (apparent size from ll -h, compressed on-disk size from du -sh):
pg_relation_filepath
-------------------------------------------------
pg_tblspc/16384/PG_9.2_201204301/70815/10088356
> ll -h pg_tblspc/16384/PG_9.2_201204301/70815/10088356*
-rw------- 1 postgres postgres 1.0G Jun 19 00:59 pg_tblspc/16384/PG_9.2_201204301/70815/10088356
-rw------- 1 postgres postgres 1.0G Jun 19 04:24 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.1
-rw------- 1 postgres postgres 1.0G Jun 19 01:21 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.10
-rw------- 1 postgres postgres 1.0G Jun 19 05:15 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.11
-rw------- 1 postgres postgres 1.0G Jun 19 04:52 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.12
-rw------- 1 postgres postgres 1.0G Jun 19 01:50 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.13
-rw------- 1 postgres postgres 1.0G Jun 19 04:22 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.14
-rw------- 1 postgres postgres 1.0G Jun 19 03:32 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.15
-rw------- 1 postgres postgres 1.0G Jun 19 02:05 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.16
-rw------- 1 postgres postgres 575M Jun 19 04:26 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.17
-rw------- 1 postgres postgres 1.0G Jun 19 04:30 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.2
-rw------- 1 postgres postgres 1.0G Jun 19 01:27 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.3
-rw------- 1 postgres postgres 1.0G Jun 19 03:24 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.4
-rw------- 1 postgres postgres 1.0G Jun 19 00:52 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.5
-rw------- 1 postgres postgres 1.0G Jun 19 03:39 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.6
-rw------- 1 postgres postgres 1.0G Jun 19 04:53 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.7
-rw------- 1 postgres postgres 1.0G Jun 19 00:49 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.8
-rw------- 1 postgres postgres 1.0G Jun 19 05:14 pg_tblspc/16384/PG_9.2_201204301/70815/10088356.9
-rw------- 1 postgres postgres 4.5M Jun 19 03:38 pg_tblspc/16384/PG_9.2_201204301/70815/10088356_fsm
-rw------- 1 postgres postgres 288K Jun 19 05:12 pg_tblspc/16384/PG_9.2_201204301/70815/10088356_vm
du -sh pg_tblspc/16384/PG_9.2_201204301/70815/10088356*
415M pg_tblspc/16384/PG_9.2_201204301/70815/10088356
405M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.1
427M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.10
428M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.11
425M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.12
425M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.13
427M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.14
427M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.15
428M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.16
237M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.17
403M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.2
413M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.3
427M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.4
432M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.5
423M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.6
425M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.7
433M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.8
428M pg_tblspc/16384/PG_9.2_201204301/70815/10088356.9
3.5M pg_tblspc/16384/PG_9.2_201204301/70815/10088356_fsm
36K pg_tblspc/16384/PG_9.2_201204301/70815/10088356_vm
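The savings can also be checked at the dataset level; a sketch, assuming the pool name zp1 seen in the zpool status output later in the post:
# zfs get compression,compressratio zp1
The compressratio reported there is the roughly 3.16 quoted above for the whole filesystem, while this particular table compresses about 2.5x.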
Supplement: write speed test
Test model:
postgres=# create table test (id int primary key, info text, crt_time timestamp);
CREATE TABLE
postgres=# create or replace function f(v_id int) returns void as
$$
declare
begin
update test set info=md5(now()::text),crt_time=now() where id=v_id;
if not found then
insert into test values (v_id, md5(now()::text), now());
end if;
return;
exception when others then
return;
end;
$$ language plpgsql strict;
CREATE FUNCTION
$ vi test.sql
\setrandom vid 1 5000000
select f(:vid);
Results
ZFS host
pgbench -M prepared -n -r -f ./test.sql -c 8 -j 4 -T 30
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 4
duration: 30 s
number of transactions actually processed: 1529642
tps = 50987.733547 (including connections establishing)
tps = 50998.421896 (excluding connections establishing)
statement latencies in milliseconds:
0.002064 \setrandom vid 1 5000000
0.153280 select f(:vid);
postgres=# select count(*) from test;
count
---------
1317641
(1 row)
Storage-array host
pgbench -M prepared -n -r -f ./test.sql -c 8 -j 4 -T 30
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 4
duration: 30 s
number of transactions actually processed: 717486
tps = 23915.516813 (including connections establishing)
tps = 23921.744263 (excluding connections establishing)
statement latencies in milliseconds:
0.003088 \setrandom vid 1 5000000
0.328250 select f(:vid);
postgres=# select count(*) from test;
count
--------
668395
(1 row)
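The final row counts are lower than the transaction counts because \setrandom draws keys from 1..5000000 with replacement: the expected number of distinct keys after N transactions is 5000000 * (1 - (1 - 1/5000000)^N), which gives roughly 1.32 million for N = 1529642 and roughly 0.67 million for N = 717486, consistent with the counts above.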
Other notes
1. pg_test_fsync results with and without the slog
With slog
pg_test_fsync
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a*
fdatasync 303.897 ops/sec 3291 usecs/op
fsync 329.612 ops/sec 3034 usecs/op
fsync_writethrough n/a
open_sync n/a*
* This file system and its mount options do not support direct
I/O, e.g. ext4 in journaled mode.
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a*
fdatasync 328.331 ops/sec 3046 usecs/op
fsync 326.671 ops/sec 3061 usecs/op
fsync_writethrough n/a
open_sync n/a*
* This file system and its mount options do not support direct
I/O, e.g. ext4 in journaled mode.
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write n/a*
2 * 8kB open_sync writes n/a*
4 * 4kB open_sync writes n/a*
8 * 2kB open_sync writes n/a*
16 * 1kB open_sync writes n/a*
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 324.818 ops/sec 3079 usecs/op
write, close, fsync 325.872 ops/sec 3069 usecs/op
Non-Sync'ed 8kB writes:
write 78023.363 ops/sec 13 usecs/op
Without slog
pg_test_fsync
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a*
fdatasync 325.150 ops/sec 3076 usecs/op
fsync 320.737 ops/sec 3118 usecs/op
fsync_writethrough n/a
open_sync n/a*
* This file system and its mount options do not support direct
I/O, e.g. ext4 in journaled mode.
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a*
fdatasync 313.791 ops/sec 3187 usecs/op
fsync 313.884 ops/sec 3186 usecs/op
fsync_writethrough n/a
open_sync n/a*
* This file system and its mount options do not support direct
I/O, e.g. ext4 in journaled mode.
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write n/a*
2 * 8kB open_sync writes n/a*
4 * 4kB open_sync writes n/a*
8 * 2kB open_sync writes n/a*
16 * 1kB open_sync writes n/a*
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 328.620 ops/sec 3043 usecs/op
write, close, fsync 328.271 ops/sec 3046 usecs/op
Non-Sync'ed 8kB writes:
write 71741.498 ops/sec 14 usecs/op
iostat shows that with the slog present, all of pg_test_fsync's load is pushed onto the slog block device, while without the slog the load falls on the vdev's block devices; since this is a raidz, it is spread across all of them.
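For example, watching extended device statistics while pg_test_fsync runs (a generic invocation, not necessarily the exact command used here):
# iostat -xk 2
With the slog present, the w/s and wkB/s columns concentrate on the log partition (sdk4 here); without it, they spread across sda through sdi.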
If the slog were an SSD, pg_test_fsync would perform much better. For example, using /dev/shm to simulate an SSD:
# cd /dev/shm
# dd if=/dev/zero of=./test.img bs=1k count=2048000
# zpool add zp1 log /dev/shm/test.img
# zpool status
pool: zp1
state: ONLINE
scan: none requested
config:
NAME STATE READ WRITE CKSUM
zp1 ONLINE 0 0 0
raidz1-0 ONLINE 0 0 0
sda ONLINE 0 0 0
sdb ONLINE 0 0 0
sdc ONLINE 0 0 0
sdd ONLINE 0 0 0
sde ONLINE 0 0 0
sdf ONLINE 0 0 0
sdg ONLINE 0 0 0
sdh ONLINE 0 0 0
sdi ONLINE 0 0 0
logs
/dev/shm/test.img ONLINE 0 0 0
spares
sdj AVAIL
With memory serving as the slog, fsync performance clearly improves.
pg_test_fsync
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.
Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a*
fdatasync 6695.657 ops/sec 149 usecs/op
fsync 8079.750 ops/sec 124 usecs/op
fsync_writethrough n/a
open_sync n/a*
* This file system and its mount options do not support direct
I/O, e.g. ext4 in journaled mode.
Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync
is Linux's default)
open_datasync n/a*
fdatasync 6247.616 ops/sec 160 usecs/op
fsync 3140.959 ops/sec 318 usecs/op
fsync_writethrough n/a
open_sync n/a*
* This file system and its mount options do not support direct
I/O, e.g. ext4 in journaled mode.
Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB
in different write open_sync sizes.)
1 * 16kB open_sync write n/a*
2 * 8kB open_sync writes n/a*
4 * 4kB open_sync writes n/a*
8 * 2kB open_sync writes n/a*
16 * 1kB open_sync writes n/a*
Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
write, fsync, close 6330.570 ops/sec 158 usecs/op
write, close, fsync 6989.741 ops/sec 143 usecs/op
Non-Sync'ed 8kB writes:
write 77800.273 ops/sec 13 usecs/op
One thing worth noting: if synchronous_commit is turned off in PostgreSQL, a slog on an ordinary spinning disk is actually sufficient. The write-speed test earlier in this post demonstrates that nicely.
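A minimal sketch of that setting (not necessarily the exact configuration used for the write test above):
$ vi $PGDATA/postgresql.conf
synchronous_commit = off
$ pg_ctl reload
With it off, commits return before the WAL is flushed to disk, so a crash can lose the last few transactions but does not leave the database inconsistent.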
2. The block devices used to build a zpool are best referenced by-id, because device names can change under Linux: /dev/sda may become /dev/sdb after a reboot.
For the slog this must not happen, as it can lead to data corruption.
Listing the by-id names:
# ll /dev/disk/by-id/*
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064b0a6dc -> ../../sdd
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064b0a6dc-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064b0a6dc-part9 -> ../../sdd9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064b563d5 -> ../../sda
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064b563d5-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064b563d5-part9 -> ../../sda9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bbc776 -> ../../sde
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bbc776-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bbc776-part9 -> ../../sde9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bbf23b -> ../../sdh
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bbf23b-part1 -> ../../sdh1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bbf23b-part9 -> ../../sdh9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bbfc66 -> ../../sdf
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bbfc66-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bbfc66-part9 -> ../../sdf9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bc046a -> ../../sdj
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bc046a-part1 -> ../../sdj1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bc046a-part9 -> ../../sdj9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bf56da -> ../../sdc
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bf56da-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bf56da-part9 -> ../../sdc9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bf65dd -> ../../sdb
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bf65dd-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064bf65dd-part9 -> ../../sdb9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064c02880 -> ../../sdi
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064c02880-part1 -> ../../sdi1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064c02880-part9 -> ../../sdi9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064c04f5a -> ../../sdg
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064c04f5a-part1 -> ../../sdg1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-35000c50064c04f5a-part9 -> ../../sdg9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/scsi-3600605b0079e70801b0e33ff07ebffa3 -> ../../sdk
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-3600605b0079e70801b0e33ff07ebffa3-part1 -> ../../sdk1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-3600605b0079e70801b0e33ff07ebffa3-part2 -> ../../sdk2
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/scsi-3600605b0079e70801b0e33ff07ebffa3-part3 -> ../../sdk3
lrwxrwxrwx 1 root root 10 Jun 19 12:43 /dev/disk/by-id/scsi-3600605b0079e70801b0e33ff07ebffa3-part4 -> ../../sdk4
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064b0a6dc -> ../../sdd
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064b0a6dc-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064b0a6dc-part9 -> ../../sdd9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064b563d5 -> ../../sda
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064b563d5-part1 -> ../../sda1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064b563d5-part9 -> ../../sda9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bbc776 -> ../../sde
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bbc776-part1 -> ../../sde1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bbc776-part9 -> ../../sde9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bbf23b -> ../../sdh
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bbf23b-part1 -> ../../sdh1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bbf23b-part9 -> ../../sdh9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bbfc66 -> ../../sdf
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bbfc66-part1 -> ../../sdf1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bbfc66-part9 -> ../../sdf9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bc046a -> ../../sdj
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bc046a-part1 -> ../../sdj1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bc046a-part9 -> ../../sdj9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bf56da -> ../../sdc
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bf56da-part1 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bf56da-part9 -> ../../sdc9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bf65dd -> ../../sdb
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bf65dd-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064bf65dd-part9 -> ../../sdb9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064c02880 -> ../../sdi
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064c02880-part1 -> ../../sdi1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064c02880-part9 -> ../../sdi9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064c04f5a -> ../../sdg
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064c04f5a-part1 -> ../../sdg1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x5000c50064c04f5a-part9 -> ../../sdg9
lrwxrwxrwx 1 root root 9 Jun 19 2014 /dev/disk/by-id/wwn-0x600605b0079e70801b0e33ff07ebffa3 -> ../../sdk
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x600605b0079e70801b0e33ff07ebffa3-part1 -> ../../sdk1
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x600605b0079e70801b0e33ff07ebffa3-part2 -> ../../sdk2
lrwxrwxrwx 1 root root 10 Jun 19 2014 /dev/disk/by-id/wwn-0x600605b0079e70801b0e33ff07ebffa3-part3 -> ../../sdk3
lrwxrwxrwx 1 root root 10 Jun 19 12:43 /dev/disk/by-id/wwn-0x600605b0079e70801b0e33ff07ebffa3-part4 -> ../../sdk4
If /dev/sd* names have already been used, the device can be removed and re-added by id:
# zpool remove zp1 /dev/sdk4
# zpool add zp1 log /dev/disk/by-id/scsi-3600605b0079e70801b0e33ff07ebffa3-part4
The slog generally does not need to be large; a few GB is about enough. For the L2ARC, on the other hand, bigger is better.
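Since the slog briefly holds the only copy of in-flight synchronous writes, mirroring it (as recommended in the summary below) is the safer layout; a sketch using the by-id name above, where the second log partition and the L2ARC device are hypothetical placeholders:
# zpool remove zp1 /dev/disk/by-id/scsi-3600605b0079e70801b0e33ff07ebffa3-part4
# zpool add zp1 log mirror /dev/disk/by-id/scsi-3600605b0079e70801b0e33ff07ebffa3-part4 /dev/disk/by-id/<second-log-partition>
# zpool add zp1 cache /dev/disk/by-id/<ssd-for-l2arc>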
Summary
The test coverage is fairly narrow, but it still highlights a few points.
1. Thanks to the SLOG, ZFS write performance exceeded that of a storage array configured like this, so ZFS is quite suitable for database use.
2. The read test here never exceeded the memory size, so it is clearly not conclusive: once the working set exceeds memory, the query on the 18GB table takes about 70 seconds. Adding an SSD as L2ARC would further improve read performance.
3. With ZFS compression the on-disk footprint shrinks, but the latency and CPU overhead of compressing and decompressing must also be taken into account.
4. The slog is important and is best mirrored; if the underlying device is already RAID-protected, mirroring can be skipped. Do not imitate the memory-backed slog used here in production; it was only a stand-in for an SSD.