[bug] df reports wrong Size and Avail on raid1, 3.18rc2

Linux Btrfs filesystem development

* [bug] df reports wrong Size and Avail on raid1, 3.18rc2
@ 2014-10-29  2:19 Chris Murphy
  2014-10-29  2:26 ` Eric Sandeen
  2014-12-09 11:20 ` [PATCH] Btrfs: get more accurate output in fd command Dongsheng Yang
  0 siblings, 2 replies; 19+ messages in thread
From: Chris Murphy @ 2014-10-29  2:19 UTC (permalink / raw)
  To: Btrfs BTRFS

3.18.0-0.rc2.git1.1.fc22.x86_64
btrfs-progs-3.17-1.fc21.x86_64

# btrfs fi show /mnt
Label: 'btrfs1'  uuid: 0f1c615f-30a0-4166-8a3c-987849551513
	Total devices 2 FS bytes used 233.54GiB
	devid    1 size 465.76GiB used 236.03GiB path /dev/sdb
	devid    2 size 298.09GiB used 236.03GiB path /dev/sdc

# df -h /mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb        382G  234G   65G  79% /mnt


a. df -h should report Size as 298GiB rather than as 382GiB.
Because this is 2 device raid1, the limiting factor is devid 2 @ 298GiB.

b. df -h should report Avail as 62GiB or less, rather than as 65GiB.
298.09 - 236.03 = 62.06
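
The expectation in (a) and (b) can be checked with a minimal sketch. It
assumes the simple model stated above: a two-device raid1 can never hold
more than the smaller device, and new chunks need unallocated room on
both devices. The function name is illustrative only, nothing here comes
from btrfs itself, and free space inside already-allocated chunks is
ignored.

```python
def raid1_two_device_expectation(devices):
    """devices: list of (size_gib, allocated_gib) for exactly two devices.

    Returns (expected_size, expected_unallocated) under the simple model
    that every raid1 chunk needs space on both devices.
    """
    if len(devices) != 2:
        raise ValueError("this sketch only models two-device raid1")
    # The smaller device limits how much can ever be mirrored.
    expected_size = min(size for size, _ in devices)
    # New chunks need room on both devices, so the tighter one wins.
    expected_unallocated = min(size - used for size, used in devices)
    return expected_size, expected_unallocated


# Figures taken from the 'btrfs fi show' output above.
size, avail = raid1_two_device_expectation([(465.76, 236.03), (298.09, 236.03)])
print(f"Size ~ {size:.2f} GiB, unallocated ~ {avail:.2f} GiB")
```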


Chris Murphy

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [bug] df reports wrong Size and Avail on raid1, 3.18rc2
  2014-10-29  2:19 [bug] df reports wrong Size and Avail on raid1, 3.18rc2 Chris Murphy
@ 2014-10-29  2:26 ` Eric Sandeen
  2014-12-09 11:20 ` [PATCH] Btrfs: get more accurate output in fd command Dongsheng Yang
  1 sibling, 0 replies; 19+ messages in thread
From: Eric Sandeen @ 2014-10-29  2:26 UTC (permalink / raw)
  To: Chris Murphy, Btrfs BTRFS

On 10/28/14 9:19 PM, Chris Murphy wrote:
> 3.18.0-0.rc2.git1.1.fc22.x86_64
> btrfs-progs-3.17-1.fc21.x86_64
> 
> # btrfs fi show /mnt
> Label: 'btrfs1'  uuid: 0f1c615f-30a0-4166-8a3c-987849551513
> 	Total devices 2 FS bytes used 233.54GiB
> 	devid    1 size 465.76GiB used 236.03GiB path /dev/sdb
> 	devid    2 size 298.09GiB used 236.03GiB path /dev/sdc
> 
> # df -h /mnt
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/sdb        382G  234G   65G  79% /mnt
> 
> 
> a. df -h should report Size as 298GiB rather than as 382GiB.
> Because this is 2 device raid1, the limiting factor is devid 2 @ 298GiB.
> 
> b. df -h should report Avail as 62GiB or less, rather than as 65GiB.
> 298.09 - 236.03 = 62.06

Is there an fstest for btrfs disk space reporting?

ext2/3/4 used to keep getting "overhead" wrong for various filesystem
types ... until we wrote a regression test.

Just sayin' :)

-Eric


* [PATCH] Btrfs: get more accurate output in fd command.
  2014-10-29  2:19 [bug] df reports wrong Size and Avail on raid1, 3.18rc2 Chris Murphy
  2014-10-29  2:26 ` Eric Sandeen
@ 2014-12-09 11:20 ` Dongsheng Yang
  2014-12-09 18:47   ` Goffredo Baroncelli
  2014-12-10 13:59   ` Shriramana Sharma
  1 sibling, 2 replies; 19+ messages in thread
From: Dongsheng Yang @ 2014-12-09 11:20 UTC (permalink / raw)
  To: lists; +Cc: linux-btrfs, Dongsheng Yang

When btrfs_statfs() calculates the total size of the fs, it takes the total
size of the disks and divides it by a factor. But in some use cases the
result is misleading to the user.

Example:
	# mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
	# mount /dev/vdf1 /mnt
	# dd if=/dev/zero of=/mnt/zero bs=1M count=1000
	# df -h /mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdf1       3.0G 1018M  1.3G  45% /mnt

	# btrfs fi show /dev/vdf1
Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
	Total devices 2 FS bytes used 1001.53MiB
	devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
	devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2

a. df -h should report Size as 2GiB rather than as 3GiB.
Because this is 2 device raid1, the limiting factor is devid 1 @2GiB.

b. df -h should report Avail as 0.15GiB or less, rather than as 1.3GiB.
2 - 1.85 = 0.15

This patch drops the factor altogether and calculates the size observable to
the user, without considering which raid level the data is in or how much
space it actually occupies on disk.
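
The difference in the reported Size can be illustrated with a rough
sketch. It is a simplified model of the arithmetic in the diff below, not
the kernel code: the function names are made up for illustration, and the
"after" figures assume that the 1.85GiB allocated on devid 1 is all
mirrored chunks and that only the 0.15GiB left on devid 1 can still be
mirrored.

```python
def size_before_patch(device_sizes_gib, factor):
    # Old btrfs_statfs(): raw bytes across all devices divided by the
    # replication factor (2 for raid1, raid10 and dup, otherwise 1).
    return sum(device_sizes_gib) / factor


def size_after_patch(total_alloc_gib, free_unallocated_gib):
    # Patched btrfs_statfs(): logical size of the allocated chunks plus the
    # logical space that btrfs_calc_avail_data_space() reports as still
    # allocatable.
    return total_alloc_gib + free_unallocated_gib


before = size_before_patch([2.0, 4.0], factor=2)
# Assumed inputs, see the note above this block.
after = size_after_patch(total_alloc_gib=1.85, free_unallocated_gib=2.0 - 1.85)
print(f"before: {before:.1f}G, after: {after:.1f}G")
```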

After this patch is applied:
	# mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
	# mount /dev/vdf1 /mnt
	# dd if=/dev/zero of=/mnt/zero bs=1M count=1000
	# df -h /mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdf1       2.0G 1018M  713M  59% /mnt

	# btrfs fi show /dev/vdf1
Label: none  uuid: e98c1321-645f-4457-b20d-4f41dc1cf2f4
	Total devices 2 FS bytes used 1001.55MiB
	devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
	devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2

Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
---
 fs/btrfs/extent-tree.c | 13 ++-----------
 fs/btrfs/super.c       | 47 +++++++++++++++++++----------------------------
 2 files changed, 21 insertions(+), 39 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index a84e00d..9954d60 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -8571,7 +8571,6 @@ static u64 __btrfs_get_ro_block_group_free_space(struct list_head *groups_list)
 {
 	struct btrfs_block_group_cache *block_group;
 	u64 free_bytes = 0;
-	int factor;
 
 	list_for_each_entry(block_group, groups_list, list) {
 		spin_lock(&block_group->lock);
@@ -8581,16 +8580,8 @@ static u64 __btrfs_get_ro_block_group_free_space(struct list_head *groups_list)
 			continue;
 		}
 
-		if (block_group->flags & (BTRFS_BLOCK_GROUP_RAID1 |
-					  BTRFS_BLOCK_GROUP_RAID10 |
-					  BTRFS_BLOCK_GROUP_DUP))
-			factor = 2;
-		else
-			factor = 1;
-
-		free_bytes += (block_group->key.offset -
-			       btrfs_block_group_used(&block_group->item)) *
-			       factor;
+		free_bytes += block_group->key.offset -
+			      btrfs_block_group_used(&block_group->item);
 
 		spin_unlock(&block_group->lock);
 	}
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 54bd91e..83c2c3c 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1641,6 +1641,8 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
 	u64 used_space;
 	u64 min_stripe_size;
 	int min_stripes = 1, num_stripes = 1;
+	/* How many stripes used to store data, without considering mirrors. */
+	int data_stripes = 1;
 	int i = 0, nr_devices;
 	int ret;
 
@@ -1657,12 +1659,15 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
 	if (type & BTRFS_BLOCK_GROUP_RAID0) {
 		min_stripes = 2;
 		num_stripes = nr_devices;
+		data_stripes = 2;
 	} else if (type & BTRFS_BLOCK_GROUP_RAID1) {
 		min_stripes = 2;
 		num_stripes = 2;
+		data_stripes = 1;
 	} else if (type & BTRFS_BLOCK_GROUP_RAID10) {
 		min_stripes = 4;
 		num_stripes = 4;
+		data_stripes = 2;
 	}
 
 	if (type & BTRFS_BLOCK_GROUP_DUP)
@@ -1740,7 +1745,7 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
 			int j;
 			u64 alloc_size;
 
-			avail_space += devices_info[i].max_avail * num_stripes;
+			avail_space += devices_info[i].max_avail * data_stripes;
 			alloc_size = devices_info[i].max_avail;
 			for (j = i + 1 - num_stripes; j <= i; j++)
 				devices_info[j].max_avail -= alloc_size;
@@ -1772,14 +1777,13 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
 static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(dentry->d_sb);
-	struct btrfs_super_block *disk_super = fs_info->super_copy;
 	struct list_head *head = &fs_info->space_info;
 	struct btrfs_space_info *found;
 	u64 total_used = 0;
+	u64 total_alloc = 0;
 	u64 total_free_data = 0;
 	int bits = dentry->d_sb->s_blocksize_bits;
 	__be32 *fsid = (__be32 *)fs_info->fsid;
-	unsigned factor = 1;
 	struct btrfs_block_rsv *block_rsv = &fs_info->global_block_rsv;
 	int ret;
 
@@ -1792,38 +1796,17 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
 	rcu_read_lock();
 	list_for_each_entry_rcu(found, head, list) {
 		if (found->flags & BTRFS_BLOCK_GROUP_DATA) {
-			int i;
-
-			total_free_data += found->disk_total - found->disk_used;
+			total_free_data += found->total_bytes - found->bytes_used;
 			total_free_data -=
 				btrfs_account_ro_block_groups_free_space(found);
-
-			for (i = 0; i < BTRFS_NR_RAID_TYPES; i++) {
-				if (!list_empty(&found->block_groups[i])) {
-					switch (i) {
-					case BTRFS_RAID_DUP:
-					case BTRFS_RAID_RAID1:
-					case BTRFS_RAID_RAID10:
-						factor = 2;
-					}
-				}
-			}
 		}
 
-		total_used += found->disk_used;
+		total_used += found->bytes_used;
+		total_alloc += found->total_bytes;
 	}
 
 	rcu_read_unlock();
 
-	buf->f_blocks = div_u64(btrfs_super_total_bytes(disk_super), factor);
-	buf->f_blocks >>= bits;
-	buf->f_bfree = buf->f_blocks - (div_u64(total_used, factor) >> bits);
-
-	/* Account global block reserve as used, it's in logical size already */
-	spin_lock(&block_rsv->lock);
-	buf->f_bfree -= block_rsv->size >> bits;
-	spin_unlock(&block_rsv->lock);
-
 	buf->f_bavail = total_free_data;
 	ret = btrfs_calc_avail_data_space(fs_info->tree_root, &total_free_data);
 	if (ret) {
@@ -1831,8 +1814,16 @@ static int btrfs_statfs(struct dentry *dentry, struct kstatfs *buf)
 		mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 		return ret;
 	}
-	buf->f_bavail += div_u64(total_free_data, factor);
+	buf->f_bavail += total_free_data;
 	buf->f_bavail = buf->f_bavail >> bits;
+	buf->f_blocks = total_alloc + total_free_data;
+	buf->f_blocks >>= bits;
+	buf->f_bfree = buf->f_blocks - (total_used >> bits);
+	/* Account global block reserve as used, it's in logical size already */
+	spin_lock(&block_rsv->lock);
+	buf->f_bfree -= block_rsv->size >> bits;
+	spin_unlock(&block_rsv->lock);
+
 	mutex_unlock(&fs_info->chunk_mutex);
 	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 
-- 
1.8.4.2



* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-09 11:20 ` [PATCH] Btrfs: get more accurate output in fd command Dongsheng Yang
@ 2014-12-09 18:47   ` Goffredo Baroncelli
  2014-12-10  1:08     ` Dongsheng Yang
  2014-12-10 13:59   ` Shriramana Sharma
  1 sibling, 1 reply; 19+ messages in thread
From: Goffredo Baroncelli @ 2014-12-09 18:47 UTC (permalink / raw)
  To: Dongsheng Yang; +Cc: lists, linux-btrfs

Hi Dongsheng
On 12/09/2014 12:20 PM, Dongsheng Yang wrote:
> When function btrfs_statfs() calculate the tatol size of fs, it is calculating
> the total size of disks and then dividing it by a factor. But in some usecase,
> the result is not good to user.
> 
> Example:
> 	# mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
> 	# mount /dev/vdf1 /mnt
> 	# dd if=/dev/zero of=/mnt/zero bs=1M count=1000
> 	# df -h /mnt
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt
> 
> 	# btrfs fi show /dev/vdf1
> Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
> 	Total devices 2 FS bytes used 1001.53MiB
> 	devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
> 	devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2
> 
> a. df -h should report Size as 2GiB rather than as 3GiB.
> Because this is 2 device raid1, the limiting factor is devid 1 @2GiB.

I agree

> b. df -h should report Avail as 0.15GiB or less, rather than as 1.3GiB.
> 2 - 1.85 = 0.15

I cannot agree; the avail should be: 
    1.85           (the capacity of the allocated chunk)
   -1.018          (the file stored)
   +(2-1.85=0.15)  (the residual capacity of the disks
                    considering a raid1 fs)
   ---------------
=   0.97           
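
The same sum as a small sketch, using the rounded figures from this
message (the helper name is hypothetical). With these inputs the result
comes to about 0.98, in line with the roughly 0.97 above, and it reduces
to the size of the smaller disk minus the stored file.

```python
def expected_avail(disk_size, chunk_capacity, file_stored):
    """Avail = free space inside allocated chunks + unallocated residual."""
    free_in_chunks = chunk_capacity - file_stored
    residual = disk_size - chunk_capacity  # still allocatable as raid1
    return free_in_chunks + residual


# GiB figures from the example: smaller disk, allocated chunks, stored file.
avail = expected_avail(disk_size=2.0, chunk_capacity=1.85, file_stored=1.018)
print(f"avail ~ {avail:.3f} GiB")
```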

> 
> This patch drops the factor at all and calculate the size observable to
> user without considering which raid level the data is in and what's the
> size exactly in disk.
> 
> After this patch applied:
> 	# mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
> 	# mount /dev/vdf1 /mnt
> 	# dd if=/dev/zero of=/mnt/zero bs=1M count=1000
> 	# df -h /mnt
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdf1       2.0G 1018M  713M  59% /mnt

I am confused: in this example you reported Avail as 713MB, when previously
you stated that the right value should be 150MB...

What happens when the filesystem is RAID5/RAID6 or Linear ?


[...]
-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-09 18:47   ` Goffredo Baroncelli
@ 2014-12-10  1:08     ` Dongsheng Yang
  2014-12-10 10:53       ` Robert White
  0 siblings, 1 reply; 19+ messages in thread
From: Dongsheng Yang @ 2014-12-10  1:08 UTC (permalink / raw)
  To: kreijack; +Cc: lists, linux-btrfs

On 12/10/2014 02:47 AM, Goffredo Baroncelli wrote:
> Hi Dongsheng
> On 12/09/2014 12:20 PM, Dongsheng Yang wrote:
>> When function btrfs_statfs() calculate the tatol size of fs, it is calculating
>> the total size of disks and then dividing it by a factor. But in some usecase,
>> the result is not good to user.
>>
>> Example:
>> 	# mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>> 	# mount /dev/vdf1 /mnt
>> 	# dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>> 	# df -h /mnt
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt
>>
>> 	# btrfs fi show /dev/vdf1
>> Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
>> 	Total devices 2 FS bytes used 1001.53MiB
>> 	devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
>> 	devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2
>>
>> a. df -h should report Size as 2GiB rather than as 3GiB.
>> Because this is 2 device raid1, the limiting factor is devid 1 @2GiB.
> I agree
>
>> b. df -h should report Avail as 0.15GiB or less, rather than as 1.3GiB.
>> 2 - 1.85 = 0.15
> I cannot agree; the avail should be:
>      1.85           (the capacity of the allocated chunk)
>     -1.018          (the file stored)
>     +(2-1.85=0.15)  (the residual capacity of the disks
>                      considering a raid1 fs)
>     ---------------
> =   0.97

My bad here. It should be 0.97. That was my mistake in this changelog.
I will update it in the next version.
>> This patch drops the factor at all and calculate the size observable to
>> user without considering which raid level the data is in and what's the
>> size exactly in disk.
>>
>> After this patch applied:
>> 	# mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>> 	# mount /dev/vdf1 /mnt
>> 	# dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>> 	# df -h /mnt
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/vdf1       2.0G 1018M  713M  59% /mnt
> I am confused: in this example you reported as Avail 713MB, when previous
> you stated that the right value should be 150MB...

As you pointed out above, the right value should be 970MB or less (some
space is used for metadata and system).
The 713MB is the result I get.
>
> What happens when the filesystem is RAID5/RAID6 or Linear ?

The original df did not consider RAID5/6, so it still does not work well
with this patch applied. But I will update this patch to handle these
scenarios in V2.

Thanx
Yang

  [...]



* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-10  1:08     ` Dongsheng Yang
@ 2014-12-10 10:53       ` Robert White
  2014-12-10 13:21         ` Duncan
                           ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Robert White @ 2014-12-10 10:53 UTC (permalink / raw)
  To: Dongsheng Yang, kreijack; +Cc: lists, linux-btrfs

On 12/09/2014 05:08 PM, Dongsheng Yang wrote:
> On 12/10/2014 02:47 AM, Goffredo Baroncelli wrote:
>> Hi Dongsheng
>> On 12/09/2014 12:20 PM, Dongsheng Yang wrote:
>>> When function btrfs_statfs() calculate the tatol size of fs, it is
>>> calculating
>>> the total size of disks and then dividing it by a factor. But in some
>>> usecase,
>>> the result is not good to user.
>>>
>>> Example:
>>>     # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>     # mount /dev/vdf1 /mnt
>>>     # dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>     # df -h /mnt
>>> Filesystem      Size  Used Avail Use% Mounted on
>>> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt
>>>
>>>     # btrfs fi show /dev/vdf1
>>> Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
>>>     Total devices 2 FS bytes used 1001.53MiB
>>>     devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
>>>     devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2
>>>
>>> a. df -h should report Size as 2GiB rather than as 3GiB.
>>> Because this is 2 device raid1, the limiting factor is devid 1 @2GiB.
>> I agree

NOPE.

The model you propose is too simple.

While the data portion of the file system is set to RAID1, the metadata
portion of the filesystem is still set to the default of DUP. As such it
is impossible to guess how much space is "free", since it is unknown
beforehand how the space will be used.

IF, say, this were used as a typical mail spool, web cache, or any
number of similar small-file applications, virtually all of the data may
end up in the metadata chunks. The "blocks free" in this usage are
indistinguishable from any other file system.

For all that DUP data the correct size is 3GiB because there will be two
copies of all metadata but they could _all_ end up on /dev/vdf2.

So you have a RAID-1 region that is constrained to 2GiB. You have 2GiB
more storage for all your metadata, but the constraint is DUP (so
everything is written twice "somewhere").

So the space breakdown is, if optimally packed, actually

2GiB mirrored, for _data_, takes up 4GiB total spread evenly across
/dev/vdf2 (2GiB) and /dev/vdf1 (2GiB).

_AND_ 1GiB of metadata, written twice to /dev/vdf2 (2GiB)

So free space is 3GiB on the presumption that data and metadata will be
equally used.

The program, not being psychic, can only make a fair-usage guess about 
future use.

Now we have accounted for all 6GiB of raw storage _and_ the report of 
3GiB free.
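
That accounting can be written out as a short sketch. It only restates
the breakdown above and therefore assumes, as this message does, that
metadata is DUP and that both metadata copies end up on /dev/vdf2; the
variable names are illustrative.

```python
# Raw device sizes from the example, in GiB.
raw_size = {"vdf1": 2.0, "vdf2": 4.0}

# Assumed optimal packing from the breakdown above.
data_logical = 2.0      # raid1: one copy on each device
metadata_logical = 1.0  # dup: both copies land on vdf2

raw_used = {
    "vdf1": data_logical,                         # one mirror of the data
    "vdf2": data_logical + 2 * metadata_logical,  # other mirror plus dup metadata
}

usable = data_logical + metadata_logical
print(f"raw accounted: {sum(raw_used.values())} GiB, usable: {usable} GiB")
```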

IF you wanted everything to be RAID-1 you should have instead done

# mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1 -m raid1

The mistake is yours; the rest of your analysis is, therefore, completely
inapplicable. Please read all the documentation before making that sort
of filesystem. Your data will thank you later.

DISCLAIMER: I have _not_ looked at the numbers you would get if you used
the corrected command.

>>
>>> b. df -h should report Avail as 0.15GiB or less, rather than as 1.3GiB.
>>> 2 - 1.85 = 0.15
>> I cannot agree; the avail should be:
>>      1.85           (the capacity of the allocated chunk)
>>     -1.018          (the file stored)
>>     +(2-1.85=0.15)  (the residual capacity of the disks
>>                      considering a raid1 fs)
>>     ---------------
>> =   0.97
>
> My bad here. It should be 0.97. My mistake in this changelog.
> I will update it in next version.
>>> This patch drops the factor at all and calculate the size observable to
>>> user without considering which raid level the data is in and what's the
>>> size exactly in disk.
>>>
>>> After this patch applied:
>>>     # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>     # mount /dev/vdf1 /mnt
>>>     # dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>     # df -h /mnt
>>> Filesystem      Size  Used Avail Use% Mounted on
>>> /dev/vdf1       2.0G 1018M  713M  59% /mnt
>> I am confused: in this example you reported as Avail 713MB, when previous
>> you stated that the right value should be 150MB...
>
> As you pointed above, the right value should be 970MB or less (Some
> space is used for metadata and system).
> And the 713MB is my result of it.
>>
>> What happens when the filesystem is RAID5/RAID6 or Linear ?
>
> The original df did not consider the RAID5/6. So it still does not work
> well with
> this patch applied. But I will update this patch to handle these
> scenarios in V2.
>
> Thanx
> Yang
>
>   [...]
>

* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-10 10:53       ` Robert White
@ 2014-12-10 13:21         ` Duncan
  2014-12-10 15:02           ` Dongsheng Yang
  2014-12-10 20:36           ` Robert White
  2014-12-10 14:51         ` Dongsheng Yang
  2014-12-10 18:25         ` Goffredo Baroncelli
  2 siblings, 2 replies; 19+ messages in thread
From: Duncan @ 2014-12-10 13:21 UTC (permalink / raw)
  To: linux-btrfs

Robert White posted on Wed, 10 Dec 2014 02:53:40 -0800 as excerpted:

> On 12/09/2014 05:08 PM, Dongsheng Yang wrote:
>> On 12/10/2014 02:47 AM, Goffredo Baroncelli wrote:
>>> Hi Dongsheng On 12/09/2014 12:20 PM, Dongsheng Yang wrote:
>>>> When function btrfs_statfs() calculate the tatol size of fs, it is
>>>> calculating the total size of disks and then dividing it by a factor.
>>>> But in some usecase, the result is not good to user.
>>>>
>>>> Example:
>>>>     # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>>     # mount /dev/vdf1 /mnt
>>>>     # dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>>     # df -h /mnt
>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt
>>>>
>>>>     # btrfs fi show /dev/vdf1
>>>> Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
>>>>     Total devices 2 FS bytes used 1001.53MiB
>>>>     devid    1 size 2.00GiB  used 1.85GiB path /dev/vdf1
>>>>     devid    2 size 4.00GiB  used 1.83GiB path /dev/vdf2
>>>>
>>>> a. df -h should report Size as 2GiB rather than as 3GiB.
>>>> Because this is 2 device raid1, the limiting factor is devid 1 @2GiB.
>>> I agree
> 
> NOPE.
> 
> The model you propose is too simple.
> 
> While the data portion of the file system is set to RAID1 the metadata
> portion of the filesystem is still set to the default of DUP.

Metadata defaults to DUP only on a single-device filesystem.  On a multi-
device filesystem, metadata defaults to raid1.  (FWIW, for both, data 
defaults to single.)

And in the example, the mkfs was supplied with two devices, so there's no 
dup metadata remaining from a formerly single-device filesystem, either.  
(Tho there will be the small single-mode stubs, empty, remaining from the 
mkfs process, as no balance has been run to delete them yet, but those 
are much smaller and empty.)
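
The defaults described above can be summed up in a tiny sketch. It models
only what this message states (behaviour as of the btrfs-progs discussed
in this thread), it is not taken from the mkfs.btrfs source, and the
function and parameter names are hypothetical.

```python
def default_profiles(num_devices, data=None, metadata=None):
    """Profiles a fresh filesystem ends up with, per the description above.

    data / metadata stand in for explicit -d / -m options; None means
    "use the default".
    """
    if num_devices < 1:
        raise ValueError("need at least one device")
    default_metadata = "dup" if num_devices == 1 else "raid1"
    return {
        "data": data or "single",
        "metadata": metadata or default_metadata,
    }


# The example from the patch: two devices, '-d raid1', no '-m' given.
print(default_profiles(2, data="raid1"))
```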

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-10 13:21         ` Duncan
@ 2014-12-10 15:02           ` Dongsheng Yang
  2014-12-10 19:05             ` Goffredo Baroncelli
  2014-12-11  3:53             ` Duncan
  2014-12-10 20:36           ` Robert White
  1 sibling, 2 replies; 19+ messages in thread
From: Dongsheng Yang @ 2014-12-10 15:02 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On Wed, Dec 10, 2014 at 9:21 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> Robert White posted on Wed, 10 Dec 2014 02:53:40 -0800 as excerpted:
>
>> On 12/09/2014 05:08 PM, Dongsheng Yang wrote:
>>> On 12/10/2014 02:47 AM, Goffredo Baroncelli wrote:
>>>> Hi Dongsheng On 12/09/2014 12:20 PM, Dongsheng Yang wrote:
>>>>> When function btrfs_statfs() calculate the tatol size of fs, it is
>>>>> calculating the total size of disks and then dividing it by a factor.
>>>>> But in some usecase, the result is not good to user.
>>>>>
>>>>> Example:
>>>>>     # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>>>     # mount /dev/vdf1 /mnt
>>>>>     # dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>>>     # df -h /mnt
>>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>>> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt
>>>>>
>>>>>     # btrfs fi show /dev/vdf1
>>>>> Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
>>>>>     Total devices 2 FS bytes used 1001.53MiB
>>>>>     devid    1 size 2.00GiB  used 1.85GiB path /dev/vdf1
>>>>>     devid    2 size 4.00GiB  used 1.83GiB path /dev/vdf2
>>>>>
>>>>> a. df -h should report Size as 2GiB rather than as 3GiB.
>>>>> Because this is 2 device raid1, the limiting factor is devid 1 @2GiB.
>>>> I agree
>>
>> NOPE.
>>
>> The model you propose is too simple.
>>
>> While the data portion of the file system is set to RAID1 the metadata
>> portion of the filesystem is still set to the default of DUP.
>
> Metadata defaults to DUP only on a single-device filesystem.  On a multi-
> device filesystem, metadata defaults to raid1.  (FWIW, for both, data
> defaults to single.)

Exactly. Thanx for your clarification. :)
>
> And in the example, the mkfs was supplied with two devices, so there's no
> dup metadata remaining from a formerly single-device filesystem, either.
> (Tho there will be the small single-mode stubs, empty, remaining from the
> mkfs process, as no balance has been run to delete them yet, but those
> are much smaller and empty.)

Yes. One unrelated question: how about deleting them at the end of mkfs?

Thanx
>
> --
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman
>

* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-10 15:02           ` Dongsheng Yang
@ 2014-12-10 19:05             ` Goffredo Baroncelli
  2014-12-11  8:23               ` Dongsheng Yang
  2014-12-11  3:53             ` Duncan
  1 sibling, 1 reply; 19+ messages in thread
From: Goffredo Baroncelli @ 2014-12-10 19:05 UTC (permalink / raw)
  To: Dongsheng Yang, Duncan; +Cc: linux-btrfs

On 12/10/2014 04:02 PM, Dongsheng Yang wrote:
> On Wed, Dec 10, 2014 at 9:21 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>> Robert White posted on Wed, 10 Dec 2014 02:53:40 -0800 as excerpted:
[...]
>> And in the example, the mkfs was supplied with two devices, so there's no
>> dup metadata remaining from a formerly single-device filesystem, either.
>> (Tho there will be the small single-mode stubs, empty, remaining from the
>> mkfs process, as no balance has been run to delete them yet, but those
>> are much smaller and empty.)
> 
> Yes. One question not related here: how about delete them in the end of mkfs?
> 
> Thanx

A btrfs balance should remove them. If you don't want to balance a full
filesystem, you can filter the chunks by usage (set a low usage).
This was discussed recently in another thread...

BR
Goffredo


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-10 19:05             ` Goffredo Baroncelli
@ 2014-12-11  8:23               ` Dongsheng Yang
  0 siblings, 0 replies; 19+ messages in thread
From: Dongsheng Yang @ 2014-12-11  8:23 UTC (permalink / raw)
  To: kreijack, Dongsheng Yang, Duncan; +Cc: linux-btrfs

On 12/11/2014 03:05 AM, Goffredo Baroncelli wrote:
> On 12/10/2014 04:02 PM, Dongsheng Yang wrote:
>> On Wed, Dec 10, 2014 at 9:21 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>>> Robert White posted on Wed, 10 Dec 2014 02:53:40 -0800 as excerpted:
> [...]
>>> And in the example, the mkfs was supplied with two devices, so there's no
>>> dup metadata remaining from a formerly single-device filesystem, either.
>>> (Tho there will be the small single-mode stubs, empty, remaining from the
>>> mkfs process, as no balance has been run to delete them yet, but those
>>> are much smaller and empty.)
>> Yes. One question not related here: how about delete them in the end of mkfs?
>>
>> Thanx
> A btrfs balance should remove them. If you don't want to balance a full
> filesystem, you can filter the chunk by usage (set a low usage).
> Recently it was discussed in a tread...

Thanx Goffredo, it works well for me.
>
> BR
> Goffredo
>
>



* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-10 15:02           ` Dongsheng Yang
  2014-12-10 19:05             ` Goffredo Baroncelli
@ 2014-12-11  3:53             ` Duncan
  2014-12-11  8:25               ` Dongsheng Yang
  1 sibling, 1 reply; 19+ messages in thread
From: Duncan @ 2014-12-11  3:53 UTC (permalink / raw)
  To: linux-btrfs

Dongsheng Yang posted on Wed, 10 Dec 2014 23:02:15 +0800 as excerpted:

>> And in the example, the mkfs was supplied with two devices, so there's
>> no dup metadata remaining from a formerly single-device filesystem,
>> either. (Tho there will be the small single-mode stubs, empty,
>> remaining from the mkfs process, as no balance has been run to delete
>> them yet, but those are much smaller and empty.)
> 
> Yes. One question not related here: how about delete them in the end of
> mkfs?

GB covered the old, manual balance method.  Do a btrfs balance -dusage=0
-musage=0 (or whatever, someone posted his recipe doing the same thing 
except with the single profiles instead of zero usage), and those stubs 
should disappear, as they're empty so there's nothing to rewrite when the 
balance does its thing and it simply removes them.
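
For anyone scripting the manual method, here is a minimal sketch of that
usage-filtered balance. It assumes the current "btrfs balance start"
spelling of the command and a filesystem mounted at /mnt (a placeholder),
and it defaults to a dry run that only prints the command; the helper
names are made up for illustration.

```python
import subprocess


def balance_empty_chunks_cmd(mountpoint, usage=0):
    """Build the argv for a usage-filtered balance of data and metadata."""
    return ["btrfs", "balance", "start",
            f"-dusage={usage}", f"-musage={usage}", mountpoint]


def remove_empty_chunks(mountpoint, dry_run=True):
    cmd = balance_empty_chunks_cmd(mountpoint)
    if dry_run:
        # Only show what would be run; nothing touches the filesystem.
        print(" ".join(cmd))
        return cmd
    # Needs root and a mounted btrfs filesystem at 'mountpoint'.
    subprocess.run(cmd, check=True)
    return cmd


remove_empty_chunks("/mnt")  # placeholder mount point, dry run only
```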

FWIW I actually have a mkfs helper script here that takes care of a bunch 
of site-default options such as dual-device raid1 both data/metadata, 
skinny-metadata, etc, and it actually prompts for a mountpoint (assuming 
it's already setup in fstab) and will do an immediate mount and balance 
usage=0 to eliminate the stubs if that mountpoint is filled in, again 
assuming it appears in fstab as well.  Since I keep fully separate 
filesystems to avoid putting all my data eggs in the same not-yet-fully-
stable btrfs basket, and my backup system includes periodically blowing 
away the backup and (after booting to the new backup) the working copy 
with a fresh mkfs for a clean start, the mkfs helper script is useful, 
and since I was already doing that, it was reasonably simple to extend it 
to handle the mount and stub-killing balance immediately after the mkfs.

But at least in theory, that old manual method shouldn't be necessary 
with a current (IIRC 3.18 required) kernel, since btrfs should now 
automatically detect empty chunks and automatically rebalance to remove 
them as necessary.  However, I've been busy and haven't actually tried 
3.18 yet, and thus obviously haven't done a mkfs and mount of a fresh 
filesystem to see how long it actually takes to trigger and remove those 
stubs, so for all I know it takes awhile to kick in, and if people are 
bothered by the display of the stubs before it does, they can of course 
still do it the old way.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-11  3:53             ` Duncan
@ 2014-12-11  8:25               ` Dongsheng Yang
  0 siblings, 0 replies; 19+ messages in thread
From: Dongsheng Yang @ 2014-12-11  8:25 UTC (permalink / raw)
  To: Duncan, linux-btrfs

On 12/11/2014 11:53 AM, Duncan wrote:
> Dongsheng Yang posted on Wed, 10 Dec 2014 23:02:15 +0800 as excerpted:
>
>>> And in the example, the mkfs was supplied with two devices, so there's
>>> no dup metadata remaining from a formerly single-device filesystem,
>>> either. (Tho there will be the small single-mode stubs, empty,
>>> remaining from the mkfs process, as no balance has been run to delete
>>> them yet, but those are much smaller and empty.)
>> Yes. One question not related here: how about delete them in the end of
>> mkfs?
> GB covered the old, manual balance method.  Do a btrfs balance -dusage=0
> -musage=0 (or whatever, someone posted his recipe doing the same thing
> except with the single profiles instead of zero usage), and those stubs
> should disappear, as they're empty so there's nothing to rewrite when the
> balance does its thing and it simply removes them.
>
> FWIW I actually have a mkfs helper script here that takes care of a bunch
> of site-default options such as dual-device raid1 both data/metadata,
> skinny-metadata, etc, and it actually prompts for a mountpoint (assuming
> it's already setup in fstab) and will do an immediate mount and balance
> usage=0 to eliminate the stubs if that mountpoint is filled in, again
> assuming it appears in fstab as well.  Since I keep fully separate
> filesystems to avoid putting all my data eggs in the same not-yet-fully-
> stable btrfs basket, and my backup system includes periodically blowing
> away the backup and (after booting to the new backup) the working copy
> with a fresh mkfs for a clean start, the mkfs helper script is useful,
> and since I was already doing that, it was reasonably simple to extend it
> to handle the mount and stub-killing balance immediately after the mkfs.
>
>
> But at least in theory, that old manual method shouldn't be necessary
> with a current (IIRC 3.18 required) kernel, since btrfs should now
> automatically detect empty chunks and automatically rebalance to remove
> them as necessary.  However, I've been busy and haven't actually tried
> 3.18 yet, and thus obviously haven't done a mkfs and mount of a fresh
> filesystem to see how long it actually takes to trigger and remove those
> stubs, so for all I know it takes awhile to kick in, and if people are
> bothered by the display of the stubs before it does, they can of course
> still do it the old way.

Thanks Duncan, I tried the old manual method and it works well for me. I will 
try the new kernel later.
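
For reference, the manual stub cleanup discussed above can be scripted. This is only a sketch: it builds the command line and does not run it unless asked. It assumes the current btrfs-progs spelling `btrfs balance start` (the quoted shorthand omits `start`); the function names and the `/mnt` mountpoint are made up for illustration.

```python
import subprocess


def stub_cleanup_argv(mountpoint):
    """Build a balance command restricted to completely empty chunks."""
    # -dusage=0 / -musage=0 select only data / metadata chunks with 0% usage,
    # so there is nothing to rewrite and the empty chunks are simply removed.
    return ["btrfs", "balance", "start", "-dusage=0", "-musage=0", mountpoint]


def run_stub_cleanup(mountpoint):
    # Needs root and a mounted btrfs filesystem; not called in this sketch.
    return subprocess.run(stub_cleanup_argv(mountpoint), check=True)


# Show the command that would be run for a hypothetical mountpoint.
print(" ".join(stub_cleanup_argv("/mnt")))
```

Keeping the argv construction separate from the call makes it easy to inspect the command before running it as root.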
>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-10 13:21         ` Duncan
  2014-12-10 15:02           ` Dongsheng Yang
@ 2014-12-10 20:36           ` Robert White
  2014-12-10 21:03             ` Goffredo Baroncelli
  1 sibling, 1 reply; 19+ messages in thread
From: Robert White @ 2014-12-10 20:36 UTC (permalink / raw)
  To: Duncan, linux-btrfs

On 12/10/2014 05:21 AM, Duncan wrote:
> Robert White posted on Wed, 10 Dec 2014 02:53:40 -0800 as excerpted:
>
>> On 12/09/2014 05:08 PM, Dongsheng Yang wrote:
>>> On 12/10/2014 02:47 AM, Goffredo Baroncelli wrote:
>>>> Hi Dongsheng On 12/09/2014 12:20 PM, Dongsheng Yang wrote:
>>>>> When function btrfs_statfs() calculates the total size of fs, it is
>>>>> calculating the total size of disks and then dividing it by a factor.
>>>>> But in some usecase, the result is not good to user.
>>>>>
>>>>> Example:
>>>>>      # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>>>      # mount /dev/vdf1 /mnt
>>>>>      # dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>>>      # df -h /mnt
>>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>>> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt
>>>>>
>>>>>      # btrfs fi show /dev/vdf1
>>>>> Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
>>>>>      Total devices 2 FS bytes used 1001.53MiB
>>>>>      devid    1 size 2.00GiB  used 1.85GiB path /dev/vdf1
>>>>>      devid    2 size 4.00GiB  used 1.83GiB path /dev/vdf2
>>>>>
>>>>> a. df -h should report Size as 2GiB rather than as 3GiB.
>>>>> Because this is 2 device raid1, the limiting factor is devid 1 @2GiB.
>>>> I agree
>>
>> NOPE.
>>
>> The model you propose is too simple.
>>
>> While the data portion of the file system is set to RAID1 the metadata
>> portion of the filesystem is still set to the default of DUP.

Well my bad... /D'oh...

Though I'd say the documentation needs to be updated. The only mention 
of changes from the default is this bit.

From man mkfs.btrfs as distributed in the source tree:

[QUOTE]
        -m|--metadata <profile>
            Specify how metadata must be spanned across the devices
            specified. Valid values are raid0, raid1, raid5, raid6,
            raid10, single or dup.

            Single device will have dup set by default except in the
            case of SSDs which will default to single. This is because
            SSDs can remap blocks internally so duplicate blocks could
            end up in the same erase block which negates the benefits
            of doing metadata duplication.
[/QUOTE]

No mention is made of RAID1 for a multi-device FS; the only two defaults 
listed are DUP and single.

ASIDE: The wiki page mentions RAID1 but doesn't mention the SSD fallback 
to single; and it's annotated as potentially out of date. But I never 
looked there because I had the manual page locally.

I tested it and sure enough, it's RAID1...

I also noticed that the default for data goes from single to RAID0 in a 
two slice build.

I generally don't expect defaults to change in undocumented ways, 
particularly since that makes make-plus-add behave differently from 
make-as-multi.

Without other guidance I'd been assuming that

mkfs.btrfs d1 d2 d3 ...
--vs--
mkfs.btrfs d1
btrfs dev add d2
btrfs dev add d3
...

would net the same resultant system. I have only ever done the latter 
until today.

Do (or will) the defaults change when three, four, or more slices are used 
to build the system?
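
The defaults as reported in this thread (btrfs-progs v3.17) can be summarised in a small lookup. This is a sketch of the observed behaviour only, not a reading of the mkfs.btrfs source; the function name is made up, and treating every device count above one the same way is an assumption, which is exactly the open question above.

```python
def observed_default_profiles(num_devices, ssd=False):
    """Default mkfs.btrfs profiles as described in this thread (v3.17)."""
    if num_devices < 1:
        raise ValueError("need at least one device")
    if num_devices == 1:
        # Per the quoted man page: dup metadata, except single on SSDs.
        return {"data": "single", "metadata": "single" if ssd else "dup"}
    # Observed for a two-device mkfs; assumed (not verified) for 3+ devices.
    return {"data": "raid0", "metadata": "raid1"}


print(observed_default_profiles(1))
print(observed_default_profiles(2))
```

The point of the comparison is that `mkfs.btrfs d1` followed by `btrfs dev add d2` keeps the single-device profiles, while `mkfs.btrfs d1 d2` starts from the multi-device ones.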

I'll take a stab at updating the manual page.

-- Rob.



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-10 20:36           ` Robert White
@ 2014-12-10 21:03             ` Goffredo Baroncelli
  0 siblings, 0 replies; 19+ messages in thread
From: Goffredo Baroncelli @ 2014-12-10 21:03 UTC (permalink / raw)
  To: Robert White; +Cc: Duncan, linux-btrfs

On 12/10/2014 09:36 PM, Robert White wrote:
[...]
> I tested it and sure enough, it's RAID1...
> 
> I also noticed that the default for data goes from single to RAID0 in
> a two slice build.
> 
> I generally don't expect defaults to change in undocumented ways.
> Particularly since that makes make-plus-add orthogonal to
> make-as-multi.
> 
> Without other guidance I'd been assuming that
> 
> mkfs.btrfs d1 d2 d3 ... 
> --vs-- 
> mkfs.btrfs d1 
> btrfs dev add d2 
> btrfs dev add d3 ...
> 
> would net the same resultant system. I have only ever done the latter
> until today.
> 
> Does/Will the defaults change when three, four, or more slices are
> used to build the system?
> 
> I'll take a stab at updating the manual page.

Why not have mkfs.btrfs print the raid profiles it used?


> 
> -- Rob.
> 
> 
> -- To unsubscribe from this list: send the line "unsubscribe
> linux-btrfs" in the body of a message to majordomo@vger.kernel.org 
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-10 10:53       ` Robert White
  2014-12-10 13:21         ` Duncan
@ 2014-12-10 14:51         ` Dongsheng Yang
  2014-12-10 18:25         ` Goffredo Baroncelli
  2 siblings, 0 replies; 19+ messages in thread
From: Dongsheng Yang @ 2014-12-10 14:51 UTC (permalink / raw)
  To: Robert White; +Cc: Dongsheng Yang, kreijack, lists, linux-btrfs

On Wed, Dec 10, 2014 at 6:53 PM, Robert White <rwhite@pobox.com> wrote:
> On 12/09/2014 05:08 PM, Dongsheng Yang wrote:
>>
>> On 12/10/2014 02:47 AM, Goffredo Baroncelli wrote:
>>>
>>> Hi Dongsheng
>>> On 12/09/2014 12:20 PM, Dongsheng Yang wrote:
>>>>
>>>> When function btrfs_statfs() calculates the total size of fs, it is
>>>> calculating
>>>> the total size of disks and then dividing it by a factor. But in some
>>>> usecase,
>>>> the result is not good to user.
>>>>
>>>> Example:
>>>>     # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>>     # mount /dev/vdf1 /mnt
>>>>     # dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>>     # df -h /mnt
>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt
>>>>
>>>>     # btrfs fi show /dev/vdf1
>>>> Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
>>>>     Total devices 2 FS bytes used 1001.53MiB
>>>>     devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
>>>>     devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2
>>>>
>>>> a. df -h should report Size as 2GiB rather than as 3GiB.
>>>> Because this is 2 device raid1, the limiting factor is devid 1 @2GiB.
>>>
>>> I agree
>
>
> NOPE.
>
> The model you propose is too simple.
>
> While the data portion of the file system is set to RAID1 the metadata
> portion of the filesystem is still set to the default of DUP. As such it is
> impossible to guess how much space is "free" since it is unknown how the
> space will be used before hand.
>
> IF, say, this were used as a typical mail spool, web cache, or any number of
> similar small-file applications virtually all of the data may end up in the
> metadata chunks. The "blocks free" in this usage are indistinguishable from
> any other file system.
>
> For all that DUP data the correct size is 3GiB because there will be two
> copies of all metadata but they could _all_ end up on /dev/vdf2.
>
> So you have a RAID-1 region that is constrained to 2GiB. You have 2GiB more
> storage for all your metadata, but the constraint is DUP (so everything is
> written twice "somewhere")
>
> So the space breakdown is, if optimally packed, actually

The issue you point out here really exists. If all the data is stored inline,
the raid level will probably differ from the raid level we set with "-d".

Giving an exact guess of the future use is, I would say,
impossible.

But I think 2G is a more proper @size than 3G in this case.

Let's compare them below:

2G:
    a). It's readable to the user: we build a btrfs with two devices of 2G and 4G,
and we get an fs of 2G. That is how raid1 should be understood.
    b). Even if all data is stored in inline extents, the @size will grow
at the same time. That is, if, as you said, we get 3G of data in it, the @size
will also be reported as 3G by the df command.

3G:
   a). It is strange to the user: why do we get an fs of 3G in raid1 with 2G
and 4G devices?
And why can I not use all of the 3G capacity df reported? (We cannot assume a
user understands what an inline extent is.)
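
The 2G versus 3G comparison can be reproduced with a few lines of arithmetic. This is a sketch, not the kernel's btrfs_statfs() logic: it assumes two-copy raid1, ignores chunk granularity and metadata overhead, and the function names are made up.

```python
def naive_size(device_sizes, copies=2):
    # Total raw space divided by the number of copies; this matches the
    # 3.0G (and the 382G in the original report) that df printed.
    return sum(device_sizes) / copies


def raid1_usable(device_sizes):
    # Every chunk needs a second copy on a *different* device, so the
    # largest device can only be paired with space on the other devices.
    total = sum(device_sizes)
    largest = max(device_sizes)
    return min(total / 2, total - largest)


print(naive_size([2.0, 4.0]))    # 3.0
print(raid1_usable([2.0, 4.0]))  # 2.0
```

With the devices from the original report (465.76GiB and 298.09GiB) the same two functions give roughly 381.9 and 298.09, which is the gap described at the start of the thread.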

So I prefer 2G to 3G here. Furthermore, I have cooked a new patch that treats
space in metadata chunks and system chunks more properly, as shown below.
        # df -h /mnt
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdf1       2.0G  1.3G  713M  66% /mnt
        # df /mnt
Filesystem     1K-blocks    Used Available Use% Mounted on
/dev/vdf1        2097152 1359424    729536  66% /mnt
        # btrfs fi show /dev/vdf1
Label: none  uuid: e98c1321-645f-4457-b20d-4f41dc1cf2f4
        Total devices 2 FS bytes used 1001.55MiB
        devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
        devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2
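
A note on the Use% column in the output above: 1359424/2097152 is about 65%, yet df prints 66%. The sketch below assumes GNU df computes the percentage as Used / (Used + Avail), rounded up; that is my understanding of coreutils and is not stated anywhere in this thread. The function name is made up.

```python
def df_use_percent(used, avail):
    """Use% as Used / (Used + Avail), rounded up to a whole percent."""
    total = used + avail
    # Integer ceiling division, so 65.08% is shown as 66%.
    return (used * 100 + total - 1) // total


# 1K-block figures from the `df /mnt` output above.
print(df_use_percent(1359424, 729536))  # 66
# Rounded -h figures (1018M used, 713M avail) from the earlier patched output.
print(df_use_percent(1018, 713))        # 59
```

Under that assumption, Size does not enter the percentage at all, which is why Used + Avail need not add up to Size.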

Does this make more sense to you, Robert?

Thanks
Yang

>
> 2GiB mirrored, for _data_, takes up 4GiB total spread evenly across
> /dev/vdf2 (2GiB) and /dev/vdf1 (2GiB).
>
> _AND_ 1GiB of metadata, written twice to /dev/vdf2 (2GiB)
>
> So free space is 3GiB on the presumption that data and metadata will be
> equally used.
>
> The program, not being psychic, can only make a fair-usage guess about
> future use.
>
> Now we have accounted for all 6GiB of raw storage _and_ the report of 3GiB
> free.
>
> IF you wanted everything to be RAID-1 you should have instead done
>
> # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1 -m raid1
>
> The mistake is yours, the rest of your analysis is, therefore, completely
> inapplicable. Please read all the documentation before making that sort of
> filesystem. Your data will thank you later.
>
> DISCLAIMER: I have _not_ looked at the numbers you would get if you used the
> corrected command.
>
>
>
>>>
>>>> b. df -h should report Avail as 0.15GiB or less, rather than as 1.3GiB.
>>>> 2 - 1.85 = 0.15
>>>
>>> I cannot agree; the avail should be:
>>>      1.85           (the capacity of the allocated chunk)
>>>     -1.018          (the file stored)
>>>     +(2-1.85=0.15)  (the residual capacity of the disks
>>>                      considering a raid1 fs)
>>>     ---------------
>>> =   0.97
>>
>>
>> My bad here. It should be 0.97. My mistake in this changelog.
>> I will update it in next version.
>>>>
>>>> This patch drops the factor at all and calculate the size observable to
>>>> user without considering which raid level the data is in and what's the
>>>> size exactly in disk.
>>>>
>>>> After this patch applied:
>>>>     # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>>     # mount /dev/vdf1 /mnt
>>>>     # dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>>     # df -h /mnt
>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>> /dev/vdf1       2.0G 1018M  713M  59% /mnt
>>>
>>> I am confused: in this example you reported as Avail 713MB, when previous
>>> you stated that the right value should be 150MB...
>>
>>
>> As you pointed above, the right value should be 970MB or less (Some
>> space is used for metadata and system).
>> And the 713MB is my result of it.
>>>
>>>
>>> What happens when the filesystem is RAID5/RAID6 or Linear ?
>>
>>
>> The original df did not consider the RAID5/6. So it still does not work
>> well with
>> this patch applied. But I will update this patch to handle these
>> scenarios in V2.
>>
>> Thanx
>> Yang
>>
>>   [...]
>>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-10 10:53       ` Robert White
  2014-12-10 13:21         ` Duncan
  2014-12-10 14:51         ` Dongsheng Yang
@ 2014-12-10 18:25         ` Goffredo Baroncelli
  2014-12-11  8:28           ` Dongsheng Yang
  2 siblings, 1 reply; 19+ messages in thread
From: Goffredo Baroncelli @ 2014-12-10 18:25 UTC (permalink / raw)
  To: Robert White, Dongsheng Yang; +Cc: lists, linux-btrfs

On 12/10/2014 11:53 AM, Robert White wrote:
> On 12/09/2014 05:08 PM, Dongsheng Yang wrote:
>> On 12/10/2014 02:47 AM, Goffredo Baroncelli wrote:
>>> Hi Dongsheng On 12/09/2014 12:20 PM, Dongsheng Yang wrote:
>>>> When function btrfs_statfs() calculates the total size of fs, it
>>>> is calculating the total size of disks and then dividing it by
>>>> a factor. But in some usecase, the result is not good to user.
>>>> 
>>>> Example:
>>>>     # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>>     # mount /dev/vdf1 /mnt
>>>>     # dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>>     # df -h /mnt
>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt
>>>> 
>>>>     # btrfs fi show /dev/vdf1
>>>> Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
>>>>     Total devices 2 FS bytes used 1001.53MiB
>>>>     devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
>>>>     devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2
>>>> 
>>>> a. df -h should report Size as 2GiB rather than as 3GiB.
>>>> Because this is 2 device raid1, the limiting factor is devid 1 @2GiB.
>>> I agree
> 
> NOPE.
> 
> The model you propose is too simple.
> 
> While the data portion of the file system is set to RAID1 the
> metadata portion of the filesystem is still set to the default of
> DUP. As such it is impossible to guess how much space is "free" since
> it is unknown how the space will be used before hand.


Hi Robert,

sorry, but you are talking about a different problem.
Yang is trying to solve a problem where it is impossible to fill
all the disk space because some portion is not raid1 protected. So
it is incorrect to report all space/2 as free space.

Instead you are stating that *if* the metadata are stored as DUP (which
is not the case here, because the metadata are raid1, see below), it is possible
to fill all the disk space.

This is a complex problem. The fact that BTRFS allows different
raid levels makes it very difficult to evaluate the free space
(as space available directly to the user). There is no simple answer.

I am still convinced that the best free space *estimation* is to consider
the ratio disk-space-consumed/file-allocated constant, and to evaluate
the free space as

disk-space-unused*file-allocated/disk-space-consumed.

Of course there are pathological cases that make this
prediction fail completely. But I consider it the best estimation
possible for the average user.
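
The estimate above can be written as a small function. This is a sketch of the formula as stated, not of any code in btrfs or btrfs-progs; the function and parameter names are made up, and all three inputs are assumed to be in the same unit.

```python
def estimate_free(disk_unused, file_allocated, disk_consumed):
    """Estimate user-visible free space from the observed consumption ratio.

    disk_unused:    raw space not yet consumed on any device
    file_allocated: space occupied by files as seen by the user
    disk_consumed:  raw space consumed on the devices for those files
    """
    if disk_consumed <= 0:
        # With nothing consumed yet there is no ratio to extrapolate from.
        raise ValueError("disk_consumed must be positive")
    return disk_unused * file_allocated / disk_consumed


# Pure raid1: 1 unit of file data consumes 2 raw units, so 4 unused raw
# units are estimated as 2 units of free space.
print(estimate_free(4.0, 1.0, 2.0))  # 2.0
```

The ratio is taken from past usage, so a filesystem whose future writes use a different profile mix than its existing data is one of the pathological cases mentioned.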

But again, this is a different problem from the one raised by 
Yang.



[...]

> IF you wanted everything to be RAID-1 you should have instead done

> # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1 -m raid1
> 
> The mistake is yours, the rest of your analysis is, therefore, completely
> inapplicable. Please read all the documentation before making that
> sort of filesystem. Your data will thank you later.
> 
> DISCLAIMER: I have _not_ looked at the numbers you would get if you
> used the corrected command.

Sorry, but you are wrong. Running mkfs.btrfs -d raid1 /dev/loop[01] puts 
both data and metadata in raid1. IIRC, if you have more than
one disk, the metadata switches to raid1 automatically.

$ sudo mkfs.btrfs -d raid1 /dev/loop[01]
Btrfs v3.17
See http://btrfs.wiki.kernel.org for more information.

Performing full device TRIM (10.00GiB) ...
Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
Performing full device TRIM (30.00GiB) ...
adding device /dev/loop1 id 2
fs created label (null) on /dev/loop0
	nodesize 16384 leafsize 16384 sectorsize 4096 size 40.00GiB
ghigo@venice:/tmp$ sudo mount /dev/loop0 t/
ghigo@venice:/tmp$ sudo dd if=/dev/zero of=t/fill bs=4M count=10
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.018853 s, 2.2 GB/s
ghigo@venice:/tmp$ sync
ghigo@venice:/tmp$ sudo btrfs fi df t/
Data, RAID1: total=1.00GiB, used=40.50MiB
Data, single: total=8.00MiB, used=0.00B
System, RAID1: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, RAID1: total=1.00GiB, used=160.00KiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=16.00MiB, used=0.00B
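
The `btrfs fi df` output above shows RAID1 for data, system and metadata, plus the empty single-profile stubs left over from mkfs. A small parser makes that explicit. This is a sketch that assumes the line format shown in this message; the regular expression and the function name are made up for illustration.

```python
import re

LINE = re.compile(
    r"^(?P<type>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[\d.]+\w+), used=(?P<used>[\d.]+\w+)$"
)

SAMPLE = """\
Data, RAID1: total=1.00GiB, used=40.50MiB
Data, single: total=8.00MiB, used=0.00B
System, RAID1: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00B
Metadata, RAID1: total=1.00GiB, used=160.00KiB
Metadata, single: total=8.00MiB, used=0.00B
GlobalReserve, single: total=16.00MiB, used=0.00B
"""


def empty_single_stubs(fi_df_output):
    """Return (type, total) for unused single-profile allocations."""
    stubs = []
    for line in fi_df_output.splitlines():
        m = LINE.match(line.strip())
        if not m or m["type"] == "GlobalReserve":
            # GlobalReserve is a reservation, not a leftover mkfs stub.
            continue
        if m["profile"] == "single" and m["used"] == "0.00B":
            stubs.append((m["type"], m["total"]))
    return stubs


print(empty_single_stubs(SAMPLE))
```

For the sample this lists the Data, System and Metadata single entries, and none of the RAID1 ones.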

[...]

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-10 18:25         ` Goffredo Baroncelli
@ 2014-12-11  8:28           ` Dongsheng Yang
  0 siblings, 0 replies; 19+ messages in thread
From: Dongsheng Yang @ 2014-12-11  8:28 UTC (permalink / raw)
  To: kreijack, Robert White; +Cc: lists, linux-btrfs

On 12/11/2014 02:25 AM, Goffredo Baroncelli wrote:
> On 12/10/2014 11:53 AM, Robert White wrote:
>> On 12/09/2014 05:08 PM, Dongsheng Yang wrote:
>>> On 12/10/2014 02:47 AM, Goffredo Baroncelli wrote:
>>>> Hi Dongsheng On 12/09/2014 12:20 PM, Dongsheng Yang wrote:
>>>>> When function btrfs_statfs() calculates the total size of fs, it
>>>>> is calculating the total size of disks and then dividing it by
>>>>> a factor. But in some usecase, the result is not good to user.
>>>>>
>>>>> Example:
>>>>>     # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>>>     # mount /dev/vdf1 /mnt
>>>>>     # dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>>>     # df -h /mnt
>>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>>> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt
>>>>>
>>>>>     # btrfs fi show /dev/vdf1
>>>>> Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
>>>>>     Total devices 2 FS bytes used 1001.53MiB
>>>>>     devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
>>>>>     devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2
>>>>>
>>>>> a. df -h should report Size as 2GiB rather than as 3GiB.
>>>>> Because this is 2 device raid1, the limiting factor is devid 1 @2GiB.
>>>> I agree
>> NOPE.
>>
>> The model you propose is too simple.
>>
>> While the data portion of the file system is set to RAID1 the
>> metadata portion of the filesystem is still set to the default of
>> DUP. As such it is impossible to guess how much space is "free" since
>> it is unknown how the space will be used before hand.
>
> Hi Robert,
>
> sorry but you are talking about a different problem.
> Yang is  trying to solve a problem where it is impossible to fill
> all the disk space because some portion is not raid1 protected. So
> it is incorrect to report all space/2 as free space.
>
> Instead you are stating that *if* the metadata are stored as DUP (and
> is not this case, because the metadata are raid1, see below), it is possible
> to fill all the disk space.
>
> This is a complex problem. The fact that BTRFS allows different
> raid levels causes to be very difficult to evaluate the free space (
> as space available directly to the user). There is no a simple answer.
>
> I am still convinced that the best free space *estimation* is considering
> the ratio disk-space-consumed/file-allocated constant, and evaluate
> the free space as the
>
> disk-space-unused*file-allocate/disk-space-consumed.
>
> Of course there are pathological cases that make this
> prediction fails completely. But I consider the best estimation
> possible for the average users.
>
> But again this is a different problem that the one raised by
> Yang.

Thanks Goffredo, I have cooked a v2 for this problem. I will send it out 
soon.
>
>
>
> [...]
>
>> IF you wanted everything to be RAID-1 you should have instead done
>> # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1 -m raid1
>>
>> The mistake is yours, the rest of your analysis is, therefore, completely
>> inapplicable. Please read all the documentation before making that
>> sort of filesystem. Your data will thank you later.
>>
>> DISCLAIMER: I have _not_ looked at the numbers you would get if you
>> used the corrected command.
> Sorry, but you are wrong. Doing mkfs.btrfs -d raid1 /dev/loop[01] leads
> to have both data and metadata  in raid1. IIRC if you have more than
> one disks, the metadata switched to raid1 automatically.
>
> $ sudo mkfs.btrfs -d raid1 /dev/loop[01]
> Btrfs v3.17
> See http://btrfs.wiki.kernel.org for more information.
>
> Performing full device TRIM (10.00GiB) ...
> Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
> Performing full device TRIM (30.00GiB) ...
> adding device /dev/loop1 id 2
> fs created label (null) on /dev/loop0
> 	nodesize 16384 leafsize 16384 sectorsize 4096 size 40.00GiB
> ghigo@venice:/tmp$ sudo mount /dev/loop0 t/
> ghigo@venice:/tmp$ sudo dd if=/dev/zero of=t/fill bs=4M count=10
> 10+0 records in
> 10+0 records out
> 41943040 bytes (42 MB) copied, 0.018853 s, 2.2 GB/s
> ghigo@venice:/tmp$ sync
> ghigo@venice:/tmp$ sudo btrfs fi df t/
> Data, RAID1: total=1.00GiB, used=40.50MiB
> Data, single: total=8.00MiB, used=0.00B
> System, RAID1: total=8.00MiB, used=16.00KiB
> System, single: total=4.00MiB, used=0.00B
> Metadata, RAID1: total=1.00GiB, used=160.00KiB
> Metadata, single: total=8.00MiB, used=0.00B
> GlobalReserve, single: total=16.00MiB, used=0.00B
>
> [...]
>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-09 11:20 ` [PATCH] Btrfs: get more accurate output in fd command Dongsheng Yang
  2014-12-09 18:47   ` Goffredo Baroncelli
@ 2014-12-10 13:59   ` Shriramana Sharma
  2014-12-10 14:56     ` Dongsheng Yang
  1 sibling, 1 reply; 19+ messages in thread
From: Shriramana Sharma @ 2014-12-10 13:59 UTC (permalink / raw)
  To: linux-btrfs

On Tue, Dec 9, 2014 at 4:50 PM, Dongsheng Yang
<yangds.fnst@cn.fujitsu.com> wrote:
>         # df -h /mnt
> Filesystem      Size  Used Avail Use% Mounted on
> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt

LOL -- not being a user of RAID I can't comment on the patch, but I
was somewhat wondering what the "fd" command in the subject line is...
:-)

-- 
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] Btrfs: get more accurate output in fd command.
  2014-12-10 13:59   ` Shriramana Sharma
@ 2014-12-10 14:56     ` Dongsheng Yang
  0 siblings, 0 replies; 19+ messages in thread
From: Dongsheng Yang @ 2014-12-10 14:56 UTC (permalink / raw)
  To: Shriramana Sharma; +Cc: linux-btrfs

On Wed, Dec 10, 2014 at 9:59 PM, Shriramana Sharma <samjnaa@gmail.com> wrote:
> On Tue, Dec 9, 2014 at 4:50 PM, Dongsheng Yang
> <yangds.fnst@cn.fujitsu.com> wrote:
>>         # df -h /mnt
>> Filesystem      Size  Used Avail Use% Mounted on
>> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt
>
> LOL -- not being a user of RAID I can't comment on the patch, but I
> was somewhat wondering what the "fd" command in the subject line is...
> :-)

Yeah, it should be "df". :)
>
> --
> Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2014-12-11  8:31 UTC | newest]

Thread overview: 19+ messages
2014-10-29  2:19 [bug] df reports wrong Size and Avail on raid1, 3.18rc2 Chris Murphy
2014-10-29  2:26 ` Eric Sandeen
2014-12-09 11:20 ` [PATCH] Btrfs: get more accurate output in fd command Dongsheng Yang
2014-12-09 18:47   ` Goffredo Baroncelli
2014-12-10  1:08     ` Dongsheng Yang
2014-12-10 10:53       ` Robert White
2014-12-10 13:21         ` Duncan
2014-12-10 15:02           ` Dongsheng Yang
2014-12-10 19:05             ` Goffredo Baroncelli
2014-12-11  8:23               ` Dongsheng Yang
2014-12-11  3:53             ` Duncan
2014-12-11  8:25               ` Dongsheng Yang
2014-12-10 20:36           ` Robert White
2014-12-10 21:03             ` Goffredo Baroncelli
2014-12-10 14:51         ` Dongsheng Yang
2014-12-10 18:25         ` Goffredo Baroncelli
2014-12-11  8:28           ` Dongsheng Yang
2014-12-10 13:59   ` Shriramana Sharma
2014-12-10 14:56     ` Dongsheng Yang
