From: Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
To: Robert White <rwhite@pobox.com>,
	Grzegorz Kowal <custos.mentis@gmail.com>,
	linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: [PATCH v2 1/3] Btrfs: get more accurate output in df command.
Date: Mon, 15 Dec 2014 16:26:32 +0800
Message-ID: <548E9B38.9080202@cn.fujitsu.com>
In-Reply-To: <548E929B.2090203@pobox.com>

On 12/15/2014 03:49 PM, Robert White wrote:
> On 12/14/2014 10:06 PM, Robert White wrote:
>> On 12/14/2014 05:21 PM, Dongsheng Yang wrote:
>>> Anyone have some suggestion about it?
>> (... strong advocacy for raw numbers...)

Hi Robert, thanks for such a detailed reply.

You are proposing to report the raw numbers in the df command, right?

Let's compare the space information at the FS level and at the device level.

Example:
/dev/sda == 1TiB
/dev/sdb == 2TiB

mkfs.btrfs /dev/sda /dev/sdb -d raid1

(1). If we report the raw numbers in the df command, we will get
@size=3T, @used=0, @available=3T. That is not a bad idea so far; as you said,
users can take the raid level into account when they use the fs. But then if
we fill 1T of data into it, we will get @size=3T, @used=2T, @available=1T.
And at this moment, we will get ENOSPC when writing any more data. That is
unacceptable. Why do you tell me there is 1T of space available when I can't
write one byte into it?
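
To make the numbers in (1) concrete, here is a minimal sketch of the two
views for a two-device raid1. The helper name raid1_views is hypothetical
(it is not part of btrfs or btrfs-progs), sizes are whole TiB, and metadata
and chunk granularity are ignored; it only illustrates the arithmetic.

```bash
#!/bin/bash
# Illustrative only: two-device raid1, sizes in whole TiB.
# raid1_views is a hypothetical helper, not a btrfs tool.
raid1_views() {
    local dev_a=$1 dev_b=$2 data=$3

    # Raw (device-level) view: every byte of data is stored twice.
    local raw_size=$(( dev_a + dev_b ))
    local raw_used=$(( 2 * data ))
    local raw_avail=$(( raw_size - raw_used ))

    # FS-level view: with two devices, mirrored capacity is
    # limited by the smaller device.
    local fs_size=$(( dev_a < dev_b ? dev_a : dev_b ))
    local fs_avail=$(( fs_size - data ))

    echo "raw: size=${raw_size}T used=${raw_used}T available=${raw_avail}T"
    echo "fs:  size=${fs_size}T used=${data}T available=${fs_avail}T"
}

# /dev/sda == 1TiB, /dev/sdb == 2TiB, after writing 1TiB of data
raid1_views 1 2 1
```

For the example devices this prints raw available=1T next to fs
available=0T, which is exactly the gap that ends in ENOSPC.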

(2). Actually, there was an older btrfs_statfs() that reported the raw
numbers to the user.
To solve the problem mentioned in (1), we need to report the space
information at the FS level.
The current btrfs_statfs() is designed like this, but it does not work in
all cases.
My new btrfs_statfs() here follows this design and implements it
to show a *better* output to the user.

Thanks
Yang
>
> Concise Example to attempt to be clearer:
>
> /dev/sda == 1TiB
> /dev/sdb == 2TiB
> /dev/sdc == 3TiB
> /dev/sdd == 3TiB
>
> mkfs.btrfs /dev/sd{a..d} -d raid0
> mount /dev/sda /mnt
>
> Now compare ::
>
> #!/bin/bash
> dd if=/dev/urandom of=/mnt/example bs=1G
>
> vs
>
> #!/bin/bash
> typeset -i counter
> for ((counter=0;;counter++)); do
> dd if=/dev/urandom of=/mnt/example$counter bs=44 count=1
> done
>
> vs
>
> #!/bin/bash
> typeset -i counter
> for ((counter=0;;counter++)); do
> dd if=/dev/urandom of=/mnt/example$counter bs=44 count=1
> done &
> dd if=/dev/urandom of=/mnt/example bs=1G
>
> Now repeat the above 3 models for
> mkfs.btrfs /dev/sd{a..d} -d raid5
>
>
> ......
>
> As you watch these six examples evolve you can ponder the ultimate 
> futility of doing adaptive prediction within statfs().
>
> Then go back and change the metadata from the default of RAID1 to 
> RAID5 or RAID6 or RAID10.
>
> Then go back and try
>
> mkfs.btrfs /dev/sd{a..d} -d raid10
>
> then balance when the big file runs out of space, then resume the big 
> file with oflag=append
>
> ......
>
> Unlike _all_ our predecessors, we are active at both the semantic file 
> storage level _and_ the physical media management level.
>
> None of the prior filesystems match this new ground exactly.
>
> The only real option is to expose the raw numbers and then tell people 
> the corner cases.
>
> Absolutely unavailable blocks, such as the massive waste of 5TiB in 
> the above sized media if raid10 were selected for both data and 
> metadata would be subtracted from size if and only if it's 
> _impossible_ for it to be accessed by this sort of restriction. But 
> even in this case, the correct answer for size is 4TiB because that 
> exactly answers "how big is this filesystem".
>
> It might be worth having a "dev_item.bytes_excluded" or unusable or 
> whatever to account for the difference between total_bytes and 
> bytes_used and the implicit bytes available. This would account for 
> the 0,1,2,2 TiB that a raid10 of the example sizes could never reach 
> in the current geometry. I'm betting that this sort of number also 
> shows up as some number of sectors in any filesystem that has an odd 
> tidbit of size up at the top where no structure is ever going to fit. 
> That's just a feature of the way disks use GB instead of GiB and msdos 
> style partitions love the number 63.
>
> So resize sets the size. Geometry limitations may reduce the effective 
> size by some, or a _lot_, but then the used-vs-available should _not_ 
> try to correct for whatever geometry is in use, even when that might be 
> simple: if it does it well in the simple cases like raid10/raid10, it 
> would still have to botch it up on the hard cases.
>
>
> .
>
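
The "0,1,2,2 TiB" figure in the quoted raid10 example can be reproduced with
a deliberately simplified model. The assumptions are mine, not the real
allocator: each allocation takes the same amount from every device that
still has free space, and allocation stops once fewer than four devices have
free space. The helper name raid10_unreachable is hypothetical.

```bash
#!/bin/bash
# Simplified model, NOT the real btrfs chunk allocator (assumptions):
#  - sizes are whole TiB, allocated 1 TiB per device per step
#  - every step takes from all devices that still have free space
#  - raid10 needs at least 4 devices with free space to allocate
raid10_unreachable() {
    local -a free=("$@")
    local min_devs=4
    local with_space d i

    while :; do
        # Count the devices that can still contribute to a stripe.
        with_space=0
        for d in "${free[@]}"; do
            (( d > 0 )) && (( with_space++ ))
        done
        (( with_space < min_devs )) && break

        # Take one unit from every device with free space.
        for i in "${!free[@]}"; do
            (( free[i] > 0 )) && (( free[i]-- ))
        done
    done

    # Whatever is left can never be allocated in this geometry.
    echo "${free[*]}"
}

# /dev/sda..sdd == 1, 2, 3, 3 TiB
raid10_unreachable 1 2 3 3
```

For the example sizes this prints "0 1 2 2": 5TiB in total that no raid10
allocation can reach, leaving 4TiB of raw space actually usable.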

