zpool use 4KiB sector

8 minute read

背景

默认情况下, 创建top-level dev如果不指定扇区大小, 那么使用vdevs中最小的扇区大小作为zpool的扇区大小.  
但是某些设备虽然是4K的扇区, 但是可能内核读到的扇区大小可能是512字节.  
这样的话, 将对影响有比较大的影响, 所以在这种情况下, 创建zpool时可以指定扇区的大小. 添加top-devel时也可以指定扇区的大小, 尽量使用扇区大小一致的设备.  
  
zpool创建完后不允许修改扇区的大小.  
[root@db-172-16-3-150 man8]# zpool set ashift=9 zpp   
cannot set property for 'zpp': property 'ashift' can only be set at creation time  
  
top-level vdev的扇区大小指定.  
[root@db-172-16-3-150 man8]# zpool create -o ashift=12 zpp mirror /ssd1/zfs.1 /ssd1/zfs.2  
[root@db-172-16-3-150 man8]# zpool get all zpp|grep ashift  
zpp   ashift                 12                     local  
  
[root@db-172-16-3-150 man8]# zpool get all zpp  
NAME  PROPERTY               VALUE                  SOURCE  
zpp   size                   95.5M                  -  
zpp   capacity               0%                     -  
zpp   altroot                -                      default  
zpp   health                 ONLINE                 -  
zpp   guid                   10118330420533444691   default  
zpp   version                -                      default  
zpp   bootfs                 -                      default  
zpp   delegation             on                     default  
zpp   autoreplace            off                    default  
zpp   cachefile              -                      default  
zpp   failmode               wait                   default  
zpp   listsnapshots          off                    default  
zpp   autoexpand             off                    default  
zpp   dedupditto             0                      default  
zpp   dedupratio             1.00x                  -  
zpp   free                   94.9M                  -  
zpp   allocated              612K                   -  
zpp   readonly               off                    -  
zpp   ashift                 12                     local  
zpp   comment                -                      default  
zpp   expandsize             0                      -  
zpp   freeing                0                      default  
zpp   feature@async_destroy  enabled                local  
zpp   feature@empty_bpobj    enabled                local  
zpp   feature@lz4_compress   enabled                local  
添加top-level vdev时也可以指定.  
[root@db-172-16-3-150 zpp]# zpool add -o ashift=10 zpp mirror /ssd1/zfs.5 /ssd1/zfs.6  
[root@db-172-16-3-150 zpp]# dd if=/dev/zero of=./test1 bs=1k count=102400  
102400+0 records in  
102400+0 records out  
104857600 bytes (105 MB) copied, 2.75559 s, 38.1 MB/s  
[root@db-172-16-3-150 zpp]# zpool list -v zpp  
NAME   SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT  
zpp   286M   204M  82.8M    71%  1.00x  ONLINE  -  
  mirror  95.5M  78.3M  17.2M         -  
    /ssd1/zfs.1      -      -      -         -  
    /ssd1/zfs.2      -      -      -         -  
  mirror  95.5M  81.1M  14.4M         -  
    /ssd1/zfs.3      -      -      -         -  
    /ssd1/zfs.4      -      -      -         -  
  mirror  95.5M  44.3M  51.2M         -  
    /ssd1/zfs.5      -      -      -         -  
    /ssd1/zfs.6      -      -      -         -  

参考

1. man zpool

The following property can be set at creation time:  
  
       ashift  
  
           Pool  sector  size exponent, to the power of 2 (internally referred to as "ashift"). I/O operations will be  
           aligned to the specified size boundaries. Additionally, the minimum (disk) write size will be  set  to  the  
           specified  size,  so  this  represents a space vs. performance trade-off. The typical case for setting this  
           property is when performance is important and the underlying disks use 4KiB sectors but report 512B sectors  
           to the OS (for compatibility reasons); in that case, set ashift=12 (which is 1<<12 = 4096).  
  
           For  optimal  performance,  the  pool sector size should be greater than or equal to the sector size of the  
           underlying disks. Since the property cannot be changed after pool creation, if in a given  pool,  you  ever  
           want to use drives that report 4KiB sectors, you must set ashift=12 at pool creation time.  

2. 其他属性

   Properties  
       Each pool has several properties associated with it. Some properties are read-only statistics while others  are  
       configurable and change the behavior of the pool. The following are read-only properties:  
  
       available           Amount  of  storage available within the pool. This property can also be referred to by its  
                           shortened column name, "avail".  
  
       capacity            Percentage of pool space used. This property can also be referred to by its shortened  col-  
                           umn name, "cap".  
  
       expandsize          Amount  of  uninitialized  space within the pool or device that can be used to increase the  
                           total capacity of the pool.  Uninitialized space consists of any space on  an  EFI  labeled  
                           vdev  which  has  not been brought online (i.e. zpool online -e).  This space occurs when a  
                           LUN is dynamically expanded.  
  
       free                The amount of free space available in the pool.  
  
       freeing             After a file system or snapshot is destroyed, the space it was using  is  returned  to  the  
                           pool  asynchronously.  freeing  is the amount of space remaining to be reclaimed. Over time  
                           freeing will decrease while free increases.  
  
       health              The current health of the pool. Health can be "ONLINE", "DEGRADED", "FAULTED", "  OFFLINE",  
                           "REMOVED", or "UNAVAIL".  
  
       guid                A unique identifier for the pool.  
  
       size                Total size of the storage pool.  
  
       unsupported@feature_Information  about unsupported features that are enabled on the pool. See zpool-features(5)  
                           for details.  
  
       used                Amount of storage space used within the pool.  
  
       The space usage properties report actual physical space available to the storage pool. The physical  space  can  
       be  different  from the total amount of space that any contained datasets can actually use. The amount of space  
       used in a raidz configuration depends on the characteristics of  the  data  being  written.  In  addition,  ZFS  
       reserves  some  space for internal accounting that the zfs(8) command takes into account, but the zpool command  
       does not. For non-full pools of a reasonable size, these effects should be invisible. For small pools, or pools  
       that are close to being completely full, these discrepancies may become more noticeable.  
  
       The following property can be set at creation time:  
  
       ashift  
  
           Pool  sector  size exponent, to the power of 2 (internally referred to as "ashift"). I/O operations will be  
           aligned to the specified size boundaries. Additionally, the minimum (disk) write size will be  set  to  the  
           specified  size,  so  this  represents a space vs. performance trade-off. The typical case for setting this  
           property is when performance is important and the underlying disks use 4KiB sectors but report 512B sectors  
           to the OS (for compatibility reasons); in that case, set ashift=12 (which is 1<<12 = 4096).  
  
           For  optimal  performance,  the  pool sector size should be greater than or equal to the sector size of the  
           underlying disks. Since the property cannot be changed after pool creation, if in a given  pool,  you  ever  
           want to use drives that report 4KiB sectors, you must set ashift=12 at pool creation time.  
  
       The following property can be set at creation time and import time:  
  
       altroot  
  
           Alternate root directory. If set, this directory is prepended to any mount points within the pool. This can  
           be used when examining an unknown pool where the mount points cannot be trusted, or in  an  alternate  boot  
           environment,  where the typical paths are not valid. altroot is not a persistent property. It is valid only  
           while the system is up. Setting altroot defaults to using cachefile=none, though  this  may  be  overridden  
           using an explicit setting.  
  
       The following properties can be set at creation time and import time, and later changed with the zpool set com-  
       mand:  
  
       autoexpand=on | off  
  
           Controls automatic pool expansion when the underlying LUN is grown. If set to on, the pool will be  resized  
           according  to  the size of the expanded device. If the device is part of a mirror or raidz then all devices  
           within that mirror/raidz group must be expanded before the new space is made available  to  the  pool.  The  
           default behavior is off. This property can also be referred to by its shortened column name, expand.  
  
       autoreplace=on | off  
           Controls  automatic device replacement. If set to "off", device replacement must be initiated by the admin-  
           istrator by using the "zpool replace" command. If set to "on", any new device, found in the  same  physical  
           location  as  a  device  that previously belonged to the pool, is automatically formatted and replaced. The  
           default behavior is "off". This property can also be referred to by its shortened column name, "replace".  
  
       bootfs=pool/dataset  
  
           Identifies the default bootable dataset for the root pool. This property is expected to be  set  mainly  by  
           the installation and upgrade programs.  
  
       cachefile=path | none  
  
           Controls  the  location  of where the pool configuration is cached. Discovering all pools on system startup  
           requires a cached copy of the configuration data that is stored on the root file system. All pools in  this  
           cache  are automatically imported when the system boots. Some environments, such as install and clustering,  
           need to cache this information in a different location so that pools are not automatically  imported.  Set-  
           ting  this  property  caches the pool configuration in a different location that can later be imported with  
           "zpool import -c". Setting it to the special value "none" creates a temporary pool that  is  never  cached,  
           and the special value ’’ (empty string) uses the default location.  
  
           Multiple  pools  can  share  the  same cache file. Because the kernel destroys and recreates this file when  
           pools are added and removed, care should be taken when attempting to access this file. When the  last  pool  
           using a cachefile is exported or destroyed, the file is removed.  
  
       comment=text  
  
           A  text  string consisting of printable ASCII characters that will be stored such that it is available even  
           if the pool becomes faulted.  An administrator can provide additional information about a pool  using  this  
           property.  
  
       delegation=on | off  
  
           Controls  whether  a  non-privileged user is granted access based on the dataset permissions defined on the  
           dataset. See zfs(8) for more information on ZFS delegated administration.  
  
       failmode=wait | continue | panic  
  
           Controls the system behavior in the event of catastrophic pool  failure.  This  condition  is  typically  a  
           result of a loss of connectivity to the underlying storage device(s) or a failure of all devices within the  
           pool. The behavior of such an event is determined as follows:  
  
           wait        Blocks all I/O access until the device connectivity is recovered and the  errors  are  cleared.  
                       This is the default behavior.  
  
           continue    Returns  EIO  to  any  new  write I/O requests but allows reads to any of the remaining healthy  
                       devices. Any write requests that have yet to be committed to disk would be blocked.  
  
           panic       Prints out a message to the console and generates a system crash dump.  
  
       feature@feature_name=enabled  
           The value of this property is the current state of feature_name. The only valid  value  when  setting  this  
           property  is  enabled  which  moves feature_name to the enabled state. See zpool-features(5) for details on  
           feature states.  
  
       listsnaps=on | off  
  
           Controls whether information about snapshots associated with this pool is output when  "zfs  list"  is  run  
           without the -t option. The default value is "off".  
  
       version=version  
  
           The  current  on-disk version of the pool. This can be increased, but never decreased. The preferred method  
           of updating pools is with the "zpool upgrade" command, though this property can be  used  when  a  specific  
           version  is needed for backwards compatibility. Once feature flags are enabled on a pool this property will  
           no longer have a value.  

Flag Counter

digoal’s 大量PostgreSQL文章入口