为什么cgroup blkio不能限制分区

1 minute read

背景

在使用cgroup blkio子系统限制块设备的IOPS时,有没有遇到过这样的错误?

#echo "8:1 10000" >./blkio.throttle.write_iops_device   
bash: echo: write error: No such device  

当限制的块设备不是设备本身时(而是分区),会报错。

那么是什么原因造成的呢?

例子

系统中有一个sda的盘,这个盘分了5个区。

#ll /dev/sd*  
brw-rw---- 1 root disk 8, 0 Aug  6 13:09 /dev/sda  
brw-rw---- 1 root disk 8, 1 Aug  6 13:09 /dev/sda1  
brw-rw---- 1 root disk 8, 2 Aug  6 13:09 /dev/sda2  
brw-rw---- 1 root disk 8, 3 Aug  6 13:09 /dev/sda3  
brw-rw---- 1 root disk 8, 4 Aug  6 13:09 /dev/sda4  
brw-rw---- 1 root disk 8, 5 Aug  6 13:09 /dev/sda5  

对整盘进行IOPS限制,正常

#echo "8:0 10000" >./blkio.throttle.write_iops_device   

对单个分区进行IOPS限制,失败

#echo "8:1 10000" >./blkio.throttle.write_iops_device   
bash: echo: write error: No such device  

查看了cgroup的代码,发现了问题在blkio_policy_parse_and_set这里,如果发现不是块设备本身,而是分区,则报错。

kernel-2.6.32 / block / blk-cgroup.c

...  
static int blkio_policy_parse_and_set(char *buf,  
	struct blkio_policy_node *newpn, enum blkio_policy_id plid, int fileid)  
{  
...  
	dev = MKDEV(major, minor);  
  
	disk = get_gendisk(dev, &part);  
	if (!disk || part) {  
		ret = -ENODEV;  
		goto out;  
	}  
...  

get_gendisk的帮助手册如下

man get_gendisk

Name  
  
get_gendisk — get partitioning information for a given device  
  
Synopsis  
  
struct gendisk * get_gendisk (	dev_t devt,  
 	int * partno);  
   
Arguments  
  
devt  
device to get partitioning information for  
  
partno  
returned partition index  
  
Description  
  
This function gets the structure containing partitioning information for the given device devt.  

知道原因之后就好理解了,还好cgroup blkio子系统支持device map设备,所以如果你要对单个分区进行IOPS的限制,可以用LVM做一层,就可以限制了。

例如

pvcreate /dev/sdb  
vgcreate vgdata01 /dev/sdb  
lvcreate -n lv01 -l 50%VG vgdata01   
lvcreate -n lv02 -l 40%VG vgdata01   
  
#dmsetup ls  
vgdata01-lv02   (253, 1)  
vgdata01-lv01   (253, 0)  
  
#echo "253:1 10000" >./blkio.throttle.write_iops_device   

修改lvm逻辑卷的major, minor号

man lvcreate  
  
       --minor minor  
              Sets the minor number.  Minor numbers are not supported with pool volumes.  
  
       -j, --major major  
              Sets the major number.  Major numbers are not supported with pool volumes.  This option is supported only on older systems (kernel version 2.4) and is ignored on modern Linux systems where major num-  
              bers are dynamically assigned.  

参考

1. https://www.kernel.org/doc/Documentation/devices.txt

Flag Counter

digoal’s 大量PostgreSQL文章入口