From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-btrfs-owner@vger.kernel.org>
Received: from cn.fujitsu.com ([59.151.112.132]:64493 "EHLO heian.cn.fujitsu.com"
	rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP
	id S1750702AbaLOBXu convert rfc822-to-8bit (ORCPT );
	Sun, 14 Dec 2014 20:23:50 -0500
Message-ID: <548E377D.6030804@cn.fujitsu.com>
Date: Mon, 15 Dec 2014 09:21:01 +0800
From: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
MIME-Version: 1.0
To: Grzegorz Kowal, linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH v2 1/3] Btrfs: get more accurate output in df command.
References: <36be817396956bffe981a69ea0b8796c44153fa5.1418203063.git.yangds.fnst@cn.fujitsu.com>
	<548B4117.1040007@inwind.it>
In-Reply-To:
Content-Type: text/plain; charset="UTF-8"; format=flowed
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:

On 12/14/2014 10:32 PM, Grzegorz Kowal wrote:
> Hi,
>
> I see another problem on a 1-device fs after applying this patch. I set
> up a 30GB system partition:
>
> in gdisk key 'i'
> Partition size: 62914560 sectors (30.0 GiB) -> 62914560 * 512 =
> 32212254720 = 30.0GiB
>
> before applying the patch df -B1 shows
>
> Filesystem    1B-blocks        Used   Available Use% Mounted on
> /dev/sda3   32212254720 18149384192 12821815296  59% /
>
> The total size matches exactly the one reported by gdisk.
>
> After applying the patch df -B1 shows
>
> Filesystem    1B-blocks        Used   Available Use% Mounted on
> /dev/sda3   29761732608 17037860864 12723871744  58% /
>
> The total size is 29761732608 bytes now, which is 2337 MiB (2.28GiB)
> smaller than the partition size.

Hi Grzegorz, this is very similar to the question discussed in my
previous mail. I believe the missing 2.28GiB is used for metadata at
the DUP level. That is, there is a space of almost 4.56GiB
(2.28GiB * 2) used for metadata on the device, and because the
metadata is DUP, the user can only see 2.28GiB of it at the FS space
level:

  2.28G
-----|---------------- Filesystem space (27.72G)
     |  \
     |    \
     |      \
     | 4.56GB \
-----|----|----------------- Device space (30.0G)
     DUP  |

The main idea of the new btrfs_statfs() is to consider the space
information at the FS level rather than at the device space level;
the @size in df is then the size of the FS space. As we all agreed
to report @size as 5G in the case of "mkfs.btrfs /dev/vdb (5G)
/dev/vdc (5G) -d raid1", we have already agreed to present df at the
FS space level without showing the detail of the device space. In
this case, the total size of the FS space is 27.72G, so the @size
in df is 27.72G.
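To make the arithmetic concrete, here is the same calculation as a
tiny user-space sketch. It only restates the numbers from your df
output above; the variable names are mine for illustration, not
kernel identifiers:

#include <stdio.h>

/*
 * Illustration of the FS-level @size when metadata is DUP.
 * The figures are taken from the df -B1 output above.
 */
int main(void)
{
	unsigned long long device_total  = 32212254720ULL; /* 30.0 GiB partition    */
	unsigned long long meta_fs_level = 2450522112ULL;  /* ~2.28 GiB of metadata */

	/* DUP keeps two copies, so the device-level footprint doubles. */
	unsigned long long meta_dev_level = meta_fs_level * 2;

	/*
	 * FS-level size = device space minus the hidden second copy:
	 * 30.0G - 4.56G + 2.28G = 27.72G.
	 */
	unsigned long long fs_total = device_total - meta_dev_level + meta_fs_level;

	printf("%llu\n", fs_total); /* prints 29761732608, the new df total */
	return 0;
}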
Does it make sense to you? As there are two complaints about the same
change of @size in df, I have to admit it may not be so easy to
understand. Does anyone have a suggestion about it?

Thanx
Yang

> IMHO the total size should correspond exactly to the partition size,
> at least in the case of a fs on one device, and anything used by the
> file system and user should go to the used space.
>
> Am I missing something here?
>
> Thanks,
> Grzegorz
>
> On Sun, Dec 14, 2014 at 12:27 PM, Grzegorz Kowal wrote:
>> Hi,
>>
>> I see another problem on a 1-device fs after applying this patch. I set up
>> a 30GB system partition:
>>
>> in gdisk key 'i'
>> Partition size: 62914560 sectors (30.0 GiB) -> 62914560 * 512 = 32212254720
>> = 30.0GiB
>>
>> before applying the patch df -B1 shows
>>
>> Filesystem    1B-blocks        Used   Available Use% Mounted on
>> /dev/sda3   32212254720 18149384192 12821815296  59% /
>>
>> The total size matches exactly the one reported by gdisk.
>>
>> After applying the patch df -B1 shows
>>
>> Filesystem    1B-blocks        Used   Available Use% Mounted on
>> /dev/sda3   29761732608 17037860864 12723871744  58% /
>>
>> The total size is 29761732608 bytes now, which is 2337 MiB (2.28GiB) smaller
>> than the partition size.
>>
>> IMHO the total size should correspond exactly to the partition size, at
>> least in the case of a fs on one device, and anything used by the file
>> system and user should go to the used space.
>>
>> Am I missing something here?
>>
>> Thanks,
>> Grzegorz
>>
>> On Sun, Dec 14, 2014 at 9:29 AM, Dongsheng Yang wrote:
>>> On Sat, Dec 13, 2014 at 3:25 AM, Goffredo Baroncelli wrote:
>>>> On 12/11/2014 09:31 AM, Dongsheng Yang wrote:
>>>>> When the function btrfs_statfs() calculates the total size of the fs,
>>>>> it calculates the total size of the disks and then divides it by a
>>>>> factor. But in some use cases the result is not good for the user.
>>>>
>>>> Hi Yang, during my tests I discovered an error:
>>>>
>>>> $ sudo lvcreate -L +10G -n disk vgtest
>>>> $ sudo /home/ghigo/mkfs.btrfs -f /dev/vgtest/disk
>>>> Btrfs v3.17
>>>> See http://btrfs.wiki.kernel.org for more information.
>>>>
>>>> Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
>>>> fs created label (null) on /dev/vgtest/disk
>>>> 	nodesize 16384 leafsize 16384 sectorsize 4096 size 10.00GiB
>>>> $ sudo mount /dev/vgtest/disk /mnt/btrfs1/
>>>> $ df /mnt/btrfs1/
>>>> Filesystem              1K-blocks    Used Available Use% Mounted on
>>>> /dev/mapper/vgtest-disk   9428992 1069312   8359680  12% /mnt/btrfs1
>>>> $ sudo ~/btrfs fi df /mnt/btrfs1/
>>>> Data, single: total=8.00MiB, used=256.00KiB
>>>> System, DUP: total=8.00MiB, used=16.00KiB
>>>> System, single: total=4.00MiB, used=0.00B
>>>> Metadata, DUP: total=1.00GiB, used=112.00KiB
>>>> Metadata, single: total=8.00MiB, used=0.00B
>>>> GlobalReserve, single: total=16.00MiB, used=0.00B
>>>>
>>>> What seems strange to me is the 9428992 KiB of total disk space reported
>>>> by df. About 600MiB are missing!
>>>>
>>>> Without your patch, I got:
>>>>
>>>> $ df /mnt/btrfs1/
>>>> Filesystem              1K-blocks  Used Available Use% Mounted on
>>>> /dev/mapper/vgtest-disk  10485760 16896   8359680   1% /mnt/btrfs1
>>>>
>>> Hi Goffredo, thanx for pointing this out. Let me try to explain the
>>> reason for it.
>>>
>>> In the case you provided here, the metadata is 1G in DUP, which means
>>> we spent 2G of device space for it. But from the viewpoint of the user,
>>> they can only see 9G for this fs. It is shown as below:
>>>
>>>      1G
>>> -----|---------------- Filesystem space (9G)
>>>      |  \
>>>      |    \
>>>      |      \
>>>      | 2G     \
>>> -----|----|----------------- Device space (10G)
>>>      DUP  |
>>>
>>> The main idea of the new btrfs_statfs() is to hide the detail of the
>>> device space and only show the information at the FS space level.
>>>
>>> Actually, a similar idea is adopted in btrfs fi df.
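To make "the FS space level" concrete: below is a small user-space
model of the new accounting, fed with the chunk totals from the
btrfs fi df output above. All names are invented for the
illustration; only the arithmetic mirrors the patch.

#include <stdio.h>

/* Model of the new btrfs_statfs() accounting, in KiB.
 * Chunk totals are the ones from "btrfs fi df" above. */
int main(void)
{
	/* FS-level totals: what the user can address. */
	unsigned long long data_total = 8192, data_used = 256;
	unsigned long long meta_total = 1048576 + 8192; /* DUP 1GiB + single 8MiB */
	unsigned long long sys_total  = 8192 + 4096;    /* DUP 8MiB + single 4MiB */

	/* Device-level consumption: DUP chunks cost twice their FS size. */
	unsigned long long dev_total   = 10485760;      /* the 10 GiB LV */
	unsigned long long dev_used    = 8192 + (2 * 1048576 + 8192) + (2 * 8192 + 4096);
	unsigned long long unallocated = dev_total - dev_used;

	/* Data is "single" here, so all unallocated space could hold data. */
	unsigned long long f_blocks = data_total + meta_total + sys_total + unallocated;
	unsigned long long used     = data_used + meta_total + sys_total;

	/* Prints: size 9428992, used 1069312, avail 8359680 -- exactly df above. */
	printf("size %llu, used %llu, avail %llu\n",
	       f_blocks, used, f_blocks - used);
	return 0;
}

In other words, the space missing from the new df total is exactly the
second copy of the DUP chunks: 1GiB of metadata plus 8MiB of system.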
>>>
>>> Example:
>>> 	# lvcreate -L +10G -n disk Group0
>>> 	# mkfs.btrfs -f /dev/Group0/disk
>>> 	# dd if=/dev/zero of=/mnt/data bs=1M count=10240
>>> 	dd: error writing ‘/mnt/data’: No space left on device
>>> 	8163+0 records in
>>> 	8162+0 records out
>>> 	8558477312 bytes (8.6 GB) copied, 207.565 s, 41.2 MB/s
>>> 	# df /mnt
>>> 	Filesystem              1K-blocks    Used Available Use% Mounted on
>>> 	/dev/mapper/Group0-disk   9428992 8382896      2048 100% /mnt
>>> 	# btrfs fi df /mnt
>>> 	Data, single: total=7.97GiB, used=7.97GiB
>>> 	System, DUP: total=8.00MiB, used=16.00KiB
>>> 	System, single: total=4.00MiB, used=0.00B
>>> 	Metadata, DUP: total=1.00GiB, used=8.41MiB
>>> 	Metadata, single: total=8.00MiB, used=0.00B
>>> 	GlobalReserve, single: total=16.00MiB, used=0.00B
>>>
>>> Now all the space is used and we get an ENOSPC error, but from the
>>> output of btrfs fi df we can find that almost 9G (data 8G +
>>> metadata 1G) is used. The difference is that it shows the DUP
>>> explicitly here, which makes it clearer.
>>>
>>> I hope this description is clear to you :).
>>>
>>> If there is anything confusing or that sounds incorrect to you,
>>> please point it out.
>>>
>>> Thanx
>>> Yang
>>>
>>>>> Example:
>>>>> 	# mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>>> 	# mount /dev/vdf1 /mnt
>>>>> 	# dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>>> 	# df -h /mnt
>>>>> 	Filesystem      Size  Used Avail Use% Mounted on
>>>>> 	/dev/vdf1       3.0G 1018M  1.3G  45% /mnt
>>>>> 	# btrfs fi show /dev/vdf1
>>>>> 	Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
>>>>> 		Total devices 2 FS bytes used 1001.53MiB
>>>>> 		devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
>>>>> 		devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2
>>>>>
>>>>> a. df -h should report Size as 2GiB rather than as 3GiB.
>>>>>    Because this is a 2-device raid1, the limiting factor is devid 1 @2GiB.
>>>>>
>>>>> b. df -h should report Avail as 0.97GiB or less, rather than as 1.3GiB.
>>>>> 	 1.85		(the capacity of the allocated chunk)
>>>>> 	-1.018		(the file stored)
>>>>> 	+(2-1.85=0.15)	(the residual capacity of the disks,
>>>>> 			 considering a raid1 fs)
>>>>> 	---------------
>>>>> 	= 0.97
>>>>>
>>>>> This patch drops the factor altogether and calculates the size
>>>>> observable to the user without considering which raid level the data
>>>>> is in or what the exact size on disk is.
>>>>>
>>>>> After this patch is applied:
>>>>> 	# mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>>> 	# mount /dev/vdf1 /mnt
>>>>> 	# dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>>> 	# df -h /mnt
>>>>> 	Filesystem      Size  Used Avail Use% Mounted on
>>>>> 	/dev/vdf1       2.0G  1.3G  713M  66% /mnt
>>>>> 	# df /mnt
>>>>> 	Filesystem     1K-blocks    Used Available Use% Mounted on
>>>>> 	/dev/vdf1        2097152 1359424    729536  66% /mnt
>>>>> 	# btrfs fi show /dev/vdf1
>>>>> 	Label: none  uuid: e98c1321-645f-4457-b20d-4f41dc1cf2f4
>>>>> 		Total devices 2 FS bytes used 1001.55MiB
>>>>> 		devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
>>>>> 		devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2
>>>>>
>>>>> a). The @Size is 2G as we expected.
>>>>> b). @Available is 700M = 1.85G - 1.3G + (2G - 1.85G).
>>>>> c). @Used is changed to 1.3G rather than 1018M as above, because this
>>>>>     patch does not treat the free space in the metadata and system
>>>>>     chunks as available to the user. It is true that the user cannot
>>>>>     use this space to store data, so it should not be counted as
>>>>>     available; at the same time, this keeps @Used + @Available == @Size
>>>>>     as far as possible.
>>>>>
>>>>> Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
>>>>> ---
>>>>> Changelog:
>>>>> 	v1 -> v2:
>>>>> 	a. account the total_bytes of the metadata chunks and
>>>>> 	   system chunks as used to the user.
>>>>> 	b. set data_stripes to the correct value in RAID0.
>>>>>
>>>>>  fs/btrfs/extent-tree.c | 13 ++----------
>>>>>  fs/btrfs/super.c       | 56 ++++++++++++++++++++++----------------------------
>>>>>  2 files changed, 26 insertions(+), 43 deletions(-)
>>>>>
>>>>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>>>>> index a84e00d..9954d60 100644
>>>>> --- a/fs/btrfs/extent-tree.c
>>>>> +++ b/fs/btrfs/extent-tree.c
>>>>> @@ -8571,7 +8571,6 @@ static u64 __btrfs_get_ro_block_group_free_space(struct list_head *groups_list)
>>>>>  {
>>>>>  	struct btrfs_block_group_cache *block_group;
>>>>>  	u64 free_bytes = 0;
>>>>> -	int factor;
>>>>>
>>>>>  	list_for_each_entry(block_group, groups_list, list) {
>>>>>  		spin_lock(&block_group->lock);
>>>>> @@ -8581,16 +8580,8 @@ static u64 __btrfs_get_ro_block_group_free_space(struct list_head *groups_list)
>>>>>  			continue;
>>>>>  		}
>>>>>
>>>>> -		if (block_group->flags & (BTRFS_BLOCK_GROUP_RAID1 |
>>>>> -					  BTRFS_BLOCK_GROUP_RAID10 |
>>>>> -					  BTRFS_BLOCK_GROUP_DUP))
>>>>> -			factor = 2;
>>>>> -		else
>>>>> -			factor = 1;
>>>>> -
>>>>> -		free_bytes += (block_group->key.offset -
>>>>> -			       btrfs_block_group_used(&block_group->item)) *
>>>>> -			       factor;
>>>>> +		free_bytes += block_group->key.offset -
>>>>> +			      btrfs_block_group_used(&block_group->item);
>>>>>
>>>>>  		spin_unlock(&block_group->lock);
>>>>>  	}
>>>>> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
>>>>> index 54bd91e..40f41a2 100644
>>>>> --- a/fs/btrfs/super.c
>>>>> +++ b/fs/btrfs/super.c
>>>>> @@ -1641,6 +1641,8 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
>>>>>  	u64 used_space;
>>>>>  	u64 min_stripe_size;
>>>>>  	int min_stripes = 1, num_stripes = 1;
>>>>> +	/* How many stripes used to store data, without considering mirrors. */
>>>>> +	int data_stripes = 1;
>>>>>  	int i = 0, nr_devices;
>>>>>  	int ret;
>>>>>
>>>>> @@ -1657,12 +1659,15 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
>>>>>  	if (type & BTRFS_BLOCK_GROUP_RAID0) {
>>>>>  		min_stripes = 2;
>>>>>  		num_stripes = nr_devices;
>>>>> +		data_stripes = num_stripes;
>>>>>  	} else if (type & BTRFS_BLOCK_GROUP_RAID1) {
>>>>>  		min_stripes = 2;
>>>>>  		num_stripes = 2;
>>>>> +		data_stripes = 1;
>>>>>  	} else if (type & BTRFS_BLOCK_GROUP_RAID10) {
>>>>>  		min_stripes = 4;
>>>>>  		num_stripes = 4;
>>>>> +		data_stripes = 2;
>>>>>  	}
>>>>>
>>>>>  	if (type & BTRFS_BLOCK_GROUP_DUP)
>>>>> @@ -1733,14 +1738,17 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
>>>>>  	i = nr_devices - 1;
>>>>>  	avail_space = 0;
>>>>>  	while (nr_devices >= min_stripes) {
>>>>> -		if (num_stripes > nr_devices)
>>>>> +		if (num_stripes > nr_devices) {
>>>>>  			num_stripes = nr_devices;
>>>>> +			if (type & BTRFS_BLOCK_GROUP_RAID0)
>>>>> +				data_stripes = num_stripes;
>>>>> +		}
>>>>>
>>>>>  		if (devices_info[i].max_avail >= min_stripe_size) {
>>>>>  			int j;
>>>>>  			u64 alloc_size;
>>>>>
>>>>> -			avail_space += devices_info[i].max_avail * num_stripes;
>>>>> +			avail_space += devices_info[i].max_avail * data_stripes;
>>>>>  			alloc_size = devices_info[i].max_avail;
>>>>>  			for (j = i + 1 - num_stripes; j <= i; j++)
>>>>>  				devices_info[j].max_avail -= alloc_size;
>>>>> @@ -1772,15 +1780,13 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
>>>>>  static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>>>>  {
>>>>>  	struct btrfs_fs_info *fs_info = btrfs_sb(dentry->d_sb);
>>>>> -	struct btrfs_super_block *disk_super = fs_info->super_copy;
>>>>>  	struct list_head *head = &fs_info->space_info;
>>>>>  	struct btrfs_space_info *found;
>>>>>  	u64 total_used = 0;
>>>>> +	u64 total_alloc = 0;
>>>>>  	u64 total_free_data = 0;
>>>>>  	int bits = dentry->d_sb->s_blocksize_bits;
>>>>>  	__be32 *fsid = (__be32 *)fs_info->fsid;
>>>>> -	unsigned factor = 1;
>>>>> -	struct btrfs_block_rsv *block_rsv = &fs_info->global_block_rsv;
>>>>>  	int ret;
>>>>>
>>>>>  	/*
>>>>> @@ -1792,38 +1798,20 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>>>>  	rcu_read_lock();
>>>>>  	list_for_each_entry_rcu(found, head, list) {
>>>>>  		if (found->flags & BTRFS_BLOCK_GROUP_DATA) {
>>>>> -			int i;
>>>>> -
>>>>> -			total_free_data += found->disk_total - found->disk_used;
>>>>> +			total_free_data += found->total_bytes - found->bytes_used;
>>>>>  			total_free_data -=
>>>>>  				btrfs_account_ro_block_groups_free_space(found);
>>>>> -
>>>>> -			for (i = 0; i < BTRFS_NR_RAID_TYPES; i++) {
>>>>> -				if (!list_empty(&found->block_groups[i])) {
>>>>> -					switch (i) {
>>>>> -					case BTRFS_RAID_DUP:
>>>>> -					case BTRFS_RAID_RAID1:
>>>>> -					case BTRFS_RAID_RAID10:
>>>>> -						factor = 2;
>>>>> -					}
>>>>> -				}
>>>>> -			}
>>>>> +			total_used += found->bytes_used;
>>>>> +		} else {
>>>>> +			/* For metadata and system, we treat the total_bytes as
>>>>> +			 * not available to file data. So show it as Used in df.
>>>>> + **/ >>>>> + total_used += found->total_bytes; >>>>> } >>>>> - >>>>> - total_used += found->disk_used; >>>>> + total_alloc += found->total_bytes; >>>>> } >>>>> - >>>>> rcu_read_unlock(); >>>>> >>>>> - buf->f_blocks = div_u64(btrfs_super_total_bytes(disk_super), >>>>> factor); >>>>> - buf->f_blocks >>= bits; >>>>> - buf->f_bfree = buf->f_blocks - (div_u64(total_used, factor) >> >>>>> bits); >>>>> - >>>>> - /* Account global block reserve as used, it's in logical size >>>>> already */ >>>>> - spin_lock(&block_rsv->lock); >>>>> - buf->f_bfree -= block_rsv->size >> bits; >>>>> - spin_unlock(&block_rsv->lock); >>>>> - >>>>> buf->f_bavail = total_free_data; >>>>> ret = btrfs_calc_avail_data_space(fs_info->tree_root, >>>>> &total_free_data); >>>>> if (ret) { >>>>> @@ -1831,8 +1819,12 @@ static int btrfs_statfs(struct dentry *dentry, >>>>> struct kstatfs *buf) >>>>> mutex_unlock(&fs_info->fs_devices->device_list_mutex); >>>>> return ret; >>>>> } >>>>> - buf->f_bavail += div_u64(total_free_data, factor); >>>>> + buf->f_bavail += total_free_data; >>>>> buf->f_bavail = buf->f_bavail >> bits; >>>>> + buf->f_blocks = total_alloc + total_free_data; >>>>> + buf->f_blocks >>= bits; >>>>> + buf->f_bfree = buf->f_blocks - (total_used >> bits); >>>>> + >>>>> mutex_unlock(&fs_info->chunk_mutex); >>>>> mutex_unlock(&fs_info->fs_devices->device_list_mutex); >>>>> >>>>> >>>> >>>> -- >>>> gpg @keyserver.linux.it: Goffredo Baroncelli >>>> Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5 >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" >>>> in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > . >