From: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
To: <kreijack@inwind.it>, <1i5t5.duncan@cox.net>, <rwhite@pobox.com>,
	<lists@colorremedies.com>
Cc: <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH v2 1/3] Btrfs: get more accurate output in df command.
Date: Sat, 13 Dec 2014 17:57:38 +0800
Message-ID: <548C0D92.2030608@cn.fujitsu.com>
In-Reply-To: <548B2D34.9060509@inwind.it>

On 12/13/2014 02:00 AM, Goffredo Baroncelli wrote:
> On 12/11/2014 09:31 AM, Dongsheng Yang wrote:
>> When btrfs_statfs() calculates the total size of the fs, it sums up
>> the total size of the disks and then divides it by a factor. But in
>> some use cases, the result is not helpful to the user.
> I am checking it; to me it seems a good improvement. However
> I noticed that df no longer seems to report the space consumed
> by the metadata chunk; e.g.:
>
> # I have two disks of 5GB each
> $ sudo ~/mkfs.btrfs -f -m raid1 -d raid1 /dev/vgtest/disk /dev/vgtest/disk1
>
> $ df -h /mnt/btrfs1/
> Filesystem               Size  Used Avail Use% Mounted on
> /dev/mapper/vgtest-disk  5.0G  1.1G  4.0G  21% /mnt/btrfs1
>
> $ sudo btrfs fi show
> Label: none  uuid: 884414c6-9374-40af-a5be-3949cdf6ad0b
> 	Total devices 2 FS bytes used 640.00KB
> 	devid    2 size 5.00GB used 2.01GB path /dev/dm-1
> 	devid    1 size 5.00GB used 2.03GB path /dev/dm-0
>
> $ sudo ./btrfs fi df /mnt/btrfs1/
> Data, RAID1: total=1.00GiB, used=512.00KiB
> Data, single: total=8.00MiB, used=0.00B
> System, RAID1: total=8.00MiB, used=16.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, RAID1: total=1.00GiB, used=112.00KiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=16.00MiB, used=0.00B
>
> In this case the filesystem is empty (it was a new filesystem!). However a
> 1G metadata chunk was already allocated. This is the reason why the free
> space is only 4GB.

Actually, the original btrfs_statfs() did not consider the space in the
metadata chunk as available either. You would still get 5G in this case
with the original btrfs_statfs(), but only because of a bug in it. Since
my patch uses a different way to calculate @available for df, I did not
mention this problem before. Let me describe it below.

In the original btrfs_statfs(), only the free space in data chunks is
counted as available:
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
        list_for_each_entry_rcu(found, head, list) {
                if (found->flags & BTRFS_BLOCK_GROUP_DATA) {
                        int i;

                        total_free_data += found->disk_total - found->disk_used;
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Later, total_free_data is added to @available in the df output:
        buf->f_bavail = total_free_data;
BUT this is incorrect! It should be:
        buf->f_bavail = div_u64(total_free_data, factor);
Because the division by the factor is missing, this bug mistakenly adds
about 1G too much (the duplicated RAID1 copy of the data chunk's free
space) to @available. Unfortunately, the free space in the metadata
chunk is also 1G, so it looks as if that were what was being counted.

That is why you get 5G in the case you provided here.
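
A rough back-of-the-envelope for your example (assuming about 3G on each
5G device is still unallocated after the 1G data and 1G metadata RAID1
chunks were created):

     2.00G   free space in the data chunk on disk (disk_total - disk_used)
   + 3.00G   allocatable from unallocated space (~6G raw returned by
             btrfs_calc_avail_data_space(), which the old code does
             divide by the factor)
   ---------
     5.00G   Avail reported by the old code; with the missing
             div_u64(total_free_data, factor) it would be 1G + 3G = 4G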

Conclusion:
    Even in the original btrfs_statfs(), the free space in metadata was
not considered available; a bug just made it look as if it were. My new
btrfs_statfs() still treats the free space in metadata as not available,
and furthermore it accounts it into @used.
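
To summarize the new accounting in one place, here is a simplified
sketch of what the patch below does (not the literal patch, and not
compilable on its own; everything is in logical bytes, no RAID factor;
"allocatable_data" is only a stand-in name for the value that
btrfs_calc_avail_data_space() computes):

        list_for_each_entry_rcu(found, head, list) {
                if (found->flags & BTRFS_BLOCK_GROUP_DATA) {
                        /* only free space in data chunks is available
                         * (ignoring read-only block groups for brevity) */
                        total_free_data += found->total_bytes - found->bytes_used;
                        total_used += found->bytes_used;
                } else {
                        /* metadata/system chunks count entirely as used */
                        total_used += found->total_bytes;
                }
                total_alloc += found->total_bytes;
        }

        /* allocatable_data: data space still creatable from unallocated disk */
        buf->f_bavail = (total_free_data + allocatable_data) >> bits;
        buf->f_blocks = (total_alloc + allocatable_data) >> bits;
        buf->f_bfree  = buf->f_blocks - (total_used >> bits);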
>
> On my system the metadata/data ratio is 234MB/8.82GB = ~3%, so excluding
> the metadata chunk from the free space may not be a big problem.
>
>> Example:
>> 	# mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>> 	# mount /dev/vdf1 /mnt
>> 	# dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>> 	# df -h /mnt
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt
>> 	# btrfs fi show /dev/vdf1
>> Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
>> 	Total devices 2 FS bytes used 1001.53MiB
>> 	devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
>> 	devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2
>> a. df -h should report Size as 2GiB rather than as 3GiB.
>> Because this is 2 device raid1, the limiting factor is devid 1 @2GiB.
>>
>> b. df -h should report Avail as 0.97GiB or less, rather than as 1.3GiB.
>>      1.85           (the capacity of the allocated chunk)
>>     -1.018          (the file stored)
>>     +(2-1.85=0.15)  (the residual capacity of the disks
>>                      considering a raid1 fs)
>>     ---------------
>> =   0.97
>>
>> This patch drops the factor entirely and calculates the size observable
>> to the user, without considering which raid level the data is in or what
>> the exact size on disk is.
>> After this patch applied:
>> 	# mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>> 	# mount /dev/vdf1 /mnt
>> 	# dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>> 	# df -h /mnt
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/vdf1       2.0G  1.3G  713M  66% /mnt
>> 	# df /mnt
>> Filesystem     1K-blocks    Used Available Use% Mounted on
>> /dev/vdf1        2097152 1359424    729536  66% /mnt
>> 	# btrfs fi show /dev/vdf1
>> Label: none  uuid: e98c1321-645f-4457-b20d-4f41dc1cf2f4
>> 	Total devices 2 FS bytes used 1001.55MiB
>> 	devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
>> 	devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2
>> a). The @Size is 2G as we expected.
>> b). @Available is 700M = 1.85G - 1.3G + (2G - 1.85G).
>> c). @Used is changed to 1.3G rather than 1018M as above, because
>>      this patch does not treat the free space in the metadata chunk
>>      and system chunk as available to the user. Indeed, the user
>>      cannot use that space to store data, so it should not be
>>      counted as available. At the same time, this keeps
>>      @Used + @Available == @Size as far as possible for the user.
>>
>> Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
>> ---
>> Changelog:
>>          v1 -> v2:
>>                  a. account the total_bytes in metadata chunk and
>>                     system chunk as used to user.
>>                  b. set data_stripes to the correct value in RAID0.
>>
>>   fs/btrfs/extent-tree.c | 13 ++----------
>>   fs/btrfs/super.c       | 56 ++++++++++++++++++++++----------------------------
>>   2 files changed, 26 insertions(+), 43 deletions(-)
>>
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index a84e00d..9954d60 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -8571,7 +8571,6 @@ static u64 __btrfs_get_ro_block_group_free_space(struct list_head *groups_list)
>>   {
>>   	struct btrfs_block_group_cache *block_group;
>>   	u64 free_bytes = 0;
>> -	int factor;
>>   
>>   	list_for_each_entry(block_group, groups_list, list) {
>>   		spin_lock(&block_group->lock);
>> @@ -8581,16 +8580,8 @@ static u64 __btrfs_get_ro_block_group_free_space(struct list_head *groups_list)
>>   			continue;
>>   		}
>>   
>> -		if (block_group->flags & (BTRFS_BLOCK_GROUP_RAID1 |
>> -					  BTRFS_BLOCK_GROUP_RAID10 |
>> -					  BTRFS_BLOCK_GROUP_DUP))
>> -			factor = 2;
>> -		else
>> -			factor = 1;
>> -
>> -		free_bytes += (block_group->key.offset -
>> -			       btrfs_block_group_used(&block_group->item)) *
>> -			       factor;
>> +		free_bytes += block_group->key.offset -
>> +			      btrfs_block_group_used(&block_group->item);
>>   
>>   		spin_unlock(&block_group->lock);
>>   	}
>> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
>> index 54bd91e..40f41a2 100644
>> --- a/fs/btrfs/super.c
>> +++ b/fs/btrfs/super.c
>> @@ -1641,6 +1641,8 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
>>   	u64 used_space;
>>   	u64 min_stripe_size;
>>   	int min_stripes = 1, num_stripes = 1;
>> +	/* How many stripes used to store data, without considering mirrors. */
>> +	int data_stripes = 1;
>>   	int i = 0, nr_devices;
>>   	int ret;
>>   
>> @@ -1657,12 +1659,15 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
>>   	if (type & BTRFS_BLOCK_GROUP_RAID0) {
>>   		min_stripes = 2;
>>   		num_stripes = nr_devices;
>> +		data_stripes = num_stripes;
>>   	} else if (type & BTRFS_BLOCK_GROUP_RAID1) {
>>   		min_stripes = 2;
>>   		num_stripes = 2;
>> +		data_stripes = 1;
>>   	} else if (type & BTRFS_BLOCK_GROUP_RAID10) {
>>   		min_stripes = 4;
>>   		num_stripes = 4;
>> +		data_stripes = 2;
>>   	}
>>   
>>   	if (type & BTRFS_BLOCK_GROUP_DUP)
>> @@ -1733,14 +1738,17 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
>>   	i = nr_devices - 1;
>>   	avail_space = 0;
>>   	while (nr_devices >= min_stripes) {
>> -		if (num_stripes > nr_devices)
>> +		if (num_stripes > nr_devices) {
>>   			num_stripes = nr_devices;
>> +			if (type & BTRFS_BLOCK_GROUP_RAID0)
>> +				data_stripes = num_stripes;
>> +		}
>>   
>>   		if (devices_info[i].max_avail >= min_stripe_size) {
>>   			int j;
>>   			u64 alloc_size;
>>   
>> -			avail_space += devices_info[i].max_avail * num_stripes;
>> +			avail_space += devices_info[i].max_avail * data_stripes;
>>   			alloc_size = devices_info[i].max_avail;
>>   			for (j = i + 1 - num_stripes; j <= i; j++)
>>   				devices_info[j].max_avail -= alloc_size;
>> @@ -1772,15 +1780,13 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
>>   static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>   {
>>   	struct btrfs_fs_info *fs_info = btrfs_sb(dentry->d_sb);
>> -	struct btrfs_super_block *disk_super = fs_info->super_copy;
>>   	struct list_head *head = &fs_info->space_info;
>>   	struct btrfs_space_info *found;
>>   	u64 total_used = 0;
>> +	u64 total_alloc = 0;
>>   	u64 total_free_data = 0;
>>   	int bits = dentry->d_sb->s_blocksize_bits;
>>   	__be32 *fsid = (__be32 *)fs_info->fsid;
>> -	unsigned factor = 1;
>> -	struct btrfs_block_rsv *block_rsv = &fs_info->global_block_rsv;
>>   	int ret;
>>   
>>   	/*
>> @@ -1792,38 +1798,20 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>   	rcu_read_lock();
>>   	list_for_each_entry_rcu(found, head, list) {
>>   		if (found->flags & BTRFS_BLOCK_GROUP_DATA) {
>> -			int i;
>> -
>> -			total_free_data += found->disk_total - found->disk_used;
>> +			total_free_data += found->total_bytes - found->bytes_used;
>>   			total_free_data -=
>>   				btrfs_account_ro_block_groups_free_space(found);
>> -
>> -			for (i = 0; i < BTRFS_NR_RAID_TYPES; i++) {
>> -				if (!list_empty(&found->block_groups[i])) {
>> -					switch (i) {
>> -					case BTRFS_RAID_DUP:
>> -					case BTRFS_RAID_RAID1:
>> -					case BTRFS_RAID_RAID10:
>> -						factor = 2;
>> -					}
>> -				}
>> -			}
>> +			total_used += found->bytes_used;
>> +		} else {
>> +			/* For metadata and system, we treat the total_bytes as
>> +			 * not available to file data. So show it as Used in df.
>> +			 **/
>> +			total_used += found->total_bytes;
>>   		}
>> -
>> -		total_used += found->disk_used;
>> +		total_alloc += found->total_bytes;
>>   	}
>> -
>>   	rcu_read_unlock();
>>   
>> -	buf->f_blocks = div_u64(btrfs_super_total_bytes(disk_super), factor);
>> -	buf->f_blocks >>= bits;
>> -	buf->f_bfree = buf->f_blocks - (div_u64(total_used, factor) >> bits);
>> -
>> -	/* Account global block reserve as used, it's in logical size already */
>> -	spin_lock(&block_rsv->lock);
>> -	buf->f_bfree -= block_rsv->size >> bits;
>> -	spin_unlock(&block_rsv->lock);
>> -
>>   	buf->f_bavail = total_free_data;
>>   	ret = btrfs_calc_avail_data_space(fs_info->tree_root, &total_free_data);
>>   	if (ret) {
>> @@ -1831,8 +1819,12 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
>>   		mutex_unlock(&fs_info->fs_devices->device_list_mutex);
>>   		return ret;
>>   	}
>> -	buf->f_bavail += div_u64(total_free_data, factor);
>> +	buf->f_bavail += total_free_data;
>>   	buf->f_bavail = buf->f_bavail >> bits;
>> +	buf->f_blocks = total_alloc + total_free_data;
>> +	buf->f_blocks >>= bits;
>> +	buf->f_bfree = buf->f_blocks - (total_used >> bits);
>> +
>>   	mutex_unlock(&fs_info->chunk_mutex);
>>   	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
>>   
>>
>


