Linux Btrfs filesystem development
From: Robert White <rwhite@pobox.com>
To: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>, kreijack@inwind.it
Cc: lists@colorremedies.com, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] Btrfs: get more accurate output in fd command.
Date: Wed, 10 Dec 2014 02:53:40 -0800
Message-ID: <54882634.4000809@pobox.com>
In-Reply-To: <54879CFE.1090909@cn.fujitsu.com>

On 12/09/2014 05:08 PM, Dongsheng Yang wrote:
> On 12/10/2014 02:47 AM, Goffredo Baroncelli wrote:
>> Hi Dongsheng
>> On 12/09/2014 12:20 PM, Dongsheng Yang wrote:
>>> When the function btrfs_statfs() calculates the total size of the fs, it
>>> calculates the total size of the disks and then divides it by a factor.
>>> But in some use cases, the result is not useful to the user.
>>>
>>> Example:
>>>     # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>     # mount /dev/vdf1 /mnt
>>>     # dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>     # df -h /mnt
>>> Filesystem      Size  Used Avail Use% Mounted on
>>> /dev/vdf1       3.0G 1018M  1.3G  45% /mnt
>>>
>>>     # btrfs fi show /dev/vdf1
>>> Label: none  uuid: f85d93dc-81f4-445d-91e5-6a5cd9563294
>>>     Total devices 2 FS bytes used 1001.53MiB
>>>     devid    1 size 2.00GiB used 1.85GiB path /dev/vdf1
>>>     devid    2 size 4.00GiB used 1.83GiB path /dev/vdf2
>>>
>>> a. df -h should report Size as 2GiB rather than as 3GiB.
>>> Because this is 2 device raid1, the limiting factor is devid 1 @2GiB.
>> I agree

NOPE.

The model you propose is too simple.

While the data portion of the filesystem is set to RAID1, the metadata
portion of the filesystem is still set to the default of DUP. As such it
is impossible to guess how much space is "free" since it is unknown
beforehand how the space will be used.

IF, say, this were used as a typical mail spool, web cache, or any
number of similar small-file applications, virtually all of the data may
end up in the metadata chunks. The "blocks free" in this usage are
indistinguishable from any other filesystem.

For all that DUP data the correct size is 3GiB because there will be two 
copies of all metadata but they could _all_ end up on /dev/vdf2.

So you have a RAID-1 region that is constrained to 2GiB. You have 2GiB
more storage for all your metadata, but the constraint is DUP (so
everything is written twice "somewhere").

So the space breakdown is, if optimally packed, actually

2GiB mirrored, for _data_, takes up 4GiB total spread evenly across
/dev/vdf2 (2GiB) and /dev/vdf1 (2GiB).

_AND_ 1GiB of metadata, written twice to /dev/vdf2 (2GiB).

So free space is 3GiB on the presumption that data and metadata will be
equally used.

The program, not being psychic, can only make a fair-usage guess about 
future use.

Now we have accounted for all 6GiB of raw storage _and_ the report of 
3GiB free.
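
As a rough sketch of that accounting, the arithmetic can be written out
as below. The variable and function names are illustrative assumptions
only (nothing here comes from btrfs itself), and the layout is the
"optimally packed" one described above, not a prediction of how the
allocator will actually place chunks.

```python
# Sketch of the optimally packed layout described above. Sizes are in GiB.
# All names are illustrative assumptions, not btrfs terminology or API.
SIZE_VDF1 = 2
SIZE_VDF2 = 4


def raw_usage(logical_gib, copies):
    """Raw space consumed when each logical GiB is stored `copies` times."""
    return logical_gib * copies


data_logical = 2      # RAID1 data: one copy on each device
metadata_logical = 1  # DUP metadata: both copies assumed to land on vdf2

data_raw = raw_usage(data_logical, copies=2)          # 2 on vdf1 + 2 on vdf2
metadata_raw = raw_usage(metadata_logical, copies=2)  # both copies on vdf2

raw_total = data_raw + metadata_raw
usable_total = data_logical + metadata_logical

print("raw GiB consumed:", raw_total)
print("usable GiB reported:", usable_total)
```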

IF you wanted everything to be RAID-1 you should have instead done

# mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1 -m raid1

The mistake is yours; the rest of your analysis is, therefore, completely
inapplicable. Please read all the documentation before making that sort
of filesystem. Your data will thank you later.

DISCLAIMER: I have _not_ looked at the numbers you would get if you used
the corrected command.
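
For reference only, the simple model quoted at the top ("the limiting
factor is devid 1 @2GiB") can be sketched for the case where both data
and metadata are RAID1 on exactly two devices. This is an assumption-laden
estimate, not output from df or btrfs: the function name is hypothetical,
and it ignores system chunks and any other overhead.

```python
# Hypothetical helper: upper bound on usable space for a two-device RAID1
# layout, where every chunk needs one copy on each device.
def two_device_raid1_capacity(size_a_gib, size_b_gib):
    """The smaller device limits how much can be mirrored."""
    return min(size_a_gib, size_b_gib)


capacity = two_device_raid1_capacity(2, 4)
print("estimated usable GiB:", capacity)
```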


>>
>>> b. df -h should report Avail as 0.15GiB or less, rather than as 1.3GiB.
>>> 2 - 1.85 = 0.15
>> I cannot agree; the avail should be:
>>      1.85           (the capacity of the allocated chunk)
>>     -1.018          (the file stored)
>>     +(2-1.85=0.15)  (the residual capacity of the disks
>>                      considering a raid1 fs)
>>     ---------------
>> =   0.97
>
> My bad here. It should be 0.97. My mistake in this changelog.
> I will update it in next version.
>>> This patch drops the factor altogether and calculates the size observable
>>> to the user without considering which raid level the data is in or what
>>> the exact size on disk is.
>>>
>>> After this patch applied:
>>>     # mkfs.btrfs -f /dev/vdf1 /dev/vdf2 -d raid1
>>>     # mount /dev/vdf1 /mnt
>>>     # dd if=/dev/zero of=/mnt/zero bs=1M count=1000
>>>     # df -h /mnt
>>> Filesystem      Size  Used Avail Use% Mounted on
>>> /dev/vdf1       2.0G 1018M  713M  59% /mnt
>> I am confused: in this example you reported Avail as 713MB, when previously
>> you stated that the right value should be 150MB...
>
> As you pointed above, the right value should be 970MB or less (Some
> space is used for metadata and system).
> And the 713MB is my result of it.
>>
>> What happens when the filesystem is RAID5/RAID6 or Linear ?
>
> The original df did not consider the RAID5/6. So it still does not work
> well with this patch applied. But I will update this patch to handle
> these scenarios in V2.
>
> Thanx
> Yang
>
>   [...]

