linux-btrfs.vger.kernel.org archive mirror
* What to do about df and btrfs fi df
@ 2014-02-10 16:41 Josef Bacik
  2014-02-10 17:06 ` Hugo Mills
                   ` (3 more replies)
  0 siblings, 4 replies; 29+ messages in thread
From: Josef Bacik @ 2014-02-10 16:41 UTC (permalink / raw)
  To: lin >> "linux-btrfs@vger.kernel.org"

Hello,

So first of all this is going to get a lot of responses, so straight 
away I'm only going to consider your opinion if I recognize your name 
and think you are a sane person.  This basically means any big 
contributors and we'll make sanity exceptions for cwillu.

These are just broad strokes, let us not get bogged down in the details, 
I just want to come to a consensus on how things _generally_ should be 
portrayed to the user.  We can worry about implementation details once 
we agree on the direction we want to go.

We all know space is a loaded question with btrfs, so I'm just going to 
explain the reasoning of why we chose what we chose originally and then 
offer the direction we should go in.  If you agree say yay, if not 
please provide a very concise alternative suggestion with a very short 
explanation of why it is better than I'm suggesting.  I'm not looking to 
spend a whole lot of time on this problem.

Also this isn't going to address b_avail, cause frankly that is some 
fucking voodoo right there, suffice it to say we will just adjust 
b_avail based on how we should represent total and used.

===== ye olde df =====

I don't remember what we did originally, but IIRC we would only show 
used space from the block groups and would show the entire size of the 
fs.  So for example with two 1 tb drives in RAID1 you'd see ENOSPC and 
look at df and it would show total of 2TB and used at 1TB.  Obviously 
not good, so we switched to the mechanism we have today, which is you 
see 2TB for total, you see 2TB for used and you see 0 for available.  We 
just scaled up the used and available based on your raid multiplier.
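
To make the two schemes concrete, here is a minimal sketch of the arithmetic for the 2 x 1TB RAID1 example above. The function names are made up for illustration and this is not the kernel's statfs code; it only mirrors the description in this section.

```python
TB = 10**12  # decimal terabyte, to keep the numbers round

def olde_df(raw_total, bg_used):
    # Original scheme as recalled above: raw device size as total,
    # unscaled block group usage as used.
    return {"total": raw_total, "used": bg_used}

def current_df(raw_total, bg_used, raid_multiplier):
    # Current scheme: usage is scaled up by the RAID multiplier.
    used = bg_used * raid_multiplier
    return {"total": raw_total, "used": used, "avail": raw_total - used}

raw_total = 2 * TB  # two 1TB drives
bg_used = 1 * TB    # filesystem is full: 1TB stored, mirrored once

old = olde_df(raw_total, bg_used)
new = current_df(raw_total, bg_used, raid_multiplier=2)
print(old["total"] // TB, old["used"] // TB)                      # 2 1
print(new["total"] // TB, new["used"] // TB, new["avail"] // TB)  # 2 2 0
```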

===== btrfs fi df =====

I made this for me because of ENOSPC issues but of course it's also 
really useful for users.  It is just a dump of the block group 
information and their flags, so really just spits out bytes_used and 
total_bytes and flags.  Because at the block_group/space_info level in 
btrfs we don't care about how much actual space is taken up this number 
is not adjusted for RAID values, and these numbers are reflected in the 
tool's output.  So if you have RAID1 you need to mentally multiply the 
Total and Used values by 2 because that is how much actual space is 
being used.
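
The mental multiplication can be written down as a small sketch. The table and helper name below are illustrative assumptions, not btrfs-progs code, and the 10GiB/6GiB figures are invented.

```python
# Copies kept per profile. Parity profiles (RAID5/6) are left out because
# their overhead depends on how many devices each stripe spans.
COPIES = {"single": 1, "raid0": 1, "dup": 2, "raid1": 2, "raid10": 2}

def raw_bytes(reported_bytes, profile):
    # Convert a value as printed by btrfs fi df into actual disk space taken.
    return reported_bytes * COPIES[profile]

GiB = 2**30
total, used = 10 * GiB, 6 * GiB  # a made-up "Data, RAID1" line
print(raw_bytes(total, "raid1") // GiB, raw_bytes(used, "raid1") // GiB)  # 20 12
```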

=====  What to do moving forward =====

Flip what both of these do.  Do not multiply for normal df, and multiply 
for btrfs fi df.

===== New and improved df =====

Since this is the lowest common denominator we should just spit out how 
much space is used based on the block groups and then divide the 
remaining space that hasn't been allocated yet by the raid multiplier.

This is going to be kind of tricky once we do per-subvolume RAID levels, 
but this falls under the b_avail voodoo which is just a guess anyway, so 
for this we will probably take the biggest multiplier and use that to 
show how much available space you have.

This way with RAID1 it shows you have 1tb of total space and you've used 
1tb of space.
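
A rough sketch of the proposed df numbers, under stated assumptions: the function and parameter names are invented, and deriving total as raw size divided by the multiplier is one reading of the RAID1 example above rather than something spelled out here.

```python
TB = 10**12

def proposed_df(raw_total, raw_allocated, bg_used, multipliers_in_use):
    # With mixed RAID levels, take the biggest multiplier (pessimistic guess).
    m = max(multipliers_in_use)
    # Used comes straight from the block groups, unscaled.
    used = bg_used
    # Space not yet allocated to any block group is divided by the multiplier.
    avail = (raw_total - raw_allocated) // m
    # Assumption: total is the raw size divided by the same multiplier.
    total = raw_total // m
    return {"total": total, "used": used, "avail": avail}

# Full 2 x 1TB RAID1: everything allocated, 1TB stored.
full = proposed_df(2 * TB, 2 * TB, 1 * TB, [2])
print(full["total"] // TB, full["used"] // TB, full["avail"] // TB)  # 1 1 0
```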

===== New and improved btrfs fi df =====

Since people using this tool are already going to be better informed and 
since we are already given the block group flags we can go ahead and do 
the raid multiplier in btrfs-progs and spit out the adjusted numbers 
rather than the raw numbers we get from the ioctl.  This will just be a 
progs thing and that way we can possibly add an option to not apply the 
multipliers and just get the raw output.

===== Conclusion =====

Let me know if this is acceptable to everybody.  Remember this is just 
broad strokes, keep your responses short and simple or I simply won't 
read them.  Thanks,

Josef

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: What to do about df and btrfs fi df
  2014-02-10 16:41 What to do about df and btrfs fi df Josef Bacik
@ 2014-02-10 17:06 ` Hugo Mills
  2014-02-10 18:24   ` cwillu
  2014-02-10 22:14   ` Goffredo Baroncelli
  2014-02-10 22:26 ` Goffredo Baroncelli
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 29+ messages in thread
From: Hugo Mills @ 2014-02-10 17:06 UTC (permalink / raw)
  To: Josef Bacik; +Cc: lin >> "linux-btrfs@vger.kernel.org"

tl;dr: Yes to proposed df changes. Keep btrfs fi df as-is.

On Mon, Feb 10, 2014 at 11:41:51AM -0500, Josef Bacik wrote:
[snip]
> =====  What to do moving forward =====
> 
> Flip what both of these do.  Do not multiply for normal df, and
> multiply for btrfs fi df.
> 
> ===== New and improved df =====
> 
> Since this is the lowest common denominator we should just spit out
> how much space is used based on the block groups and then divide the
> remaining space that hasn't been allocated yet by the raid
> multiplier.
> 
> This is going to be kind of tricky once we do per-subvolume RAID
> levels, but this falls under the b_avail voodoo which is just a
> guess anyway, so for this we will probably take the biggest
> multiplier and use that to show how much available space you have.

   Biggest multiplier leads to the pessimistic estimate, which is what
I'd prefer to see here, so that's good. Agree with this.

> This way with RAID1 it shows you have 1tb of total space and you've
> used 1tb of space.
> 
> ===== New and improved btrfs fi df =====
> 
> Since people using this tool are already going to be better informed
> and since we are already given the block group flags we can go ahead
> and do the raid multiplier in btrfs-progs and spit out the adjusted
> numbers rather than the raw numbers we get from the ioctl.  This
> will just be a progs thing and that way we can possibly add an
> option to not apply the multipliers and just get the raw output.

   Keep this unchanged, IMO.

(a) I quite like the non-multiplied version as it is, as it gives you
    the quantities of real, actual data stored -- the value you
    generally care about anyway ("how much stuff do I have on here?").

(b) Using the non-multiplied version here as well as above would then
    give >gasp< comparable values for btrfs fi df and Plain Old df.
    Less confusion all round, I think.

(c) The difficulty with using multiplied values is the behaviour of
    parity RAID on filesystems with different sized devices: there
    isn't a single multiplier that will give an accurate answer at
    all. (Detailed arguments available on application ;) )
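
To illustrate point (c), here is a toy model, not the real btrfs allocator. It assumes each allocation takes one unit from every device that still has room, needs at least two devices, and spends one stripe member on parity (RAID5). With devices of 2, 2 and 1 units the raw-to-data ratio is 3:2 for the first allocation and 2:1 for the second, so no single multiplier fits.

```python
def raid5_fill(device_sizes, step=1):
    # Toy model: stripe across every device that still has `step` units free.
    free = list(device_sizes)
    raw_used = 0
    data_stored = 0
    history = []
    while True:
        live = [i for i, f in enumerate(free) if f >= step]
        if len(live) < 2:  # assumed minimum stripe width
            break
        for i in live:
            free[i] -= step
        raw_used += step * len(live)
        data_stored += step * (len(live) - 1)  # one member holds parity
        history.append((len(live), raw_used, data_stored))
    return history

for n_devs, raw, data in raid5_fill([2, 2, 1]):
    print(n_devs, raw, data)
# 3 3 2
# 2 5 3
```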

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
            --- "Can I offer you anything? Tea? Seedcake? ---            
                         Glass of Amontillado?"                          


* Re: What to do about df and btrfs fi df
  2014-02-10 17:06 ` Hugo Mills
@ 2014-02-10 18:24   ` cwillu
  2014-02-10 18:28     ` Josef Bacik
  2014-02-11  1:02     ` Roger Binns
  2014-02-10 22:14   ` Goffredo Baroncelli
  1 sibling, 2 replies; 29+ messages in thread
From: cwillu @ 2014-02-10 18:24 UTC (permalink / raw)
  To: Hugo Mills, Josef Bacik, lin >> linux-btrfs@vger.kernel.org

I concur.

The regular df data used number should be the amount of space required
to hold a backup of that content (assuming that the backup maintains
reflinks and compression and so forth).

There's no good answer for available space; the statfs syscall isn't
rich enough to cover all the bases even in the face of dup metadata
and single data (i.e., the common case), and a truly conservative
estimate (report based on the highest-usage raid level in use) would
report space/2 on that same common case.  "Highest-usage data raid
level in use" is probably the best compromise, with a big warning that
large numbers of small files will not actually fit, posted
in some mythical place that users look.

I would like to see the information from btrfs fi df and btrfs fi show
summarized somewhere (ideally as a new btrfs fi df output), as both
sets of numbers are really necessary, or at least have btrfs fi df
include the amount of space not allocated to a block group.

Re regular df: are we adding space allocated to a block group (raid1,
say) but not in actual use in a file as the N/2 space available in the
block group, or the N space it takes up on disk?  This probably
matters a bit less than it used to, but if it's N/2, that leaves us
open to "empty filesystem, 100GB free, write a 80GB file and then
delete it, wtf, only 60GB free now?" reporting issues.


* Re: What to do about df and btrfs fi df
  2014-02-10 18:24   ` cwillu
@ 2014-02-10 18:28     ` Josef Bacik
  2014-02-10 18:36       ` cwillu
  2014-02-11  1:02     ` Roger Binns
  1 sibling, 1 reply; 29+ messages in thread
From: Josef Bacik @ 2014-02-10 18:28 UTC (permalink / raw)
  To: cwillu, Hugo Mills, lin >> linux-btrfs@vger.kernel.org



On 02/10/2014 01:24 PM, cwillu wrote:
> I concur.
>
> The regular df data used number should be the amount of space required
> to hold a backup of that content (assuming that the backup maintains
> reflinks and compression and so forth).
>
> There's no good answer for available space; the statfs syscall isn't
> rich enough to cover all the bases even in the face of dup metadata
> and single data (i.e., the common case), and a truly conservative
> estimate (report based on the highest-usage raid level in use) would
> report space/2 on that same common case.  "Highest-usage data raid
> level in use" is probably the best compromise, with a big warning that
> that many large numbers of small files will not actually fit, posted
> in some mythical place that users look.
>
> I would like to see the information from btrfs fi df and btrfs fi show
> summarized somewhere (ideally as a new btrfs fi df output), as both
> sets of numbers are really necessary, or at least have btrfs fi df
> include the amount of space not allocated to a block group.
>
> Re regular df: are we adding space allocated to a block group (raid1,
> say) but not in actual use in a file as the N/2 space available in the
> block group, or the N space it takes up on disk?  This probably
> matters a bit less than it used to, but if it's N/2, that leaves us
> open to "empty filesystem, 100GB free, write a 80GB file and then
> delete it, wtf, only 60GB free now?" reporting issues.
>

The only case we add the actual allocated chunk space is for metadata, 
for data we only add the actual used number.  So say you write an 80gb 
file and then delete it but during the writing we allocated a 1 gig 
chunk for metadata you'll see only 99gb free, make sense?  We could 
(should?) roll this into the b_avail magic and make "used" really only 
reflect data usage, opinions on this?  Thanks,

Josef


* Re: What to do about df and btrfs fi df
  2014-02-10 18:28     ` Josef Bacik
@ 2014-02-10 18:36       ` cwillu
  2014-02-10 18:41         ` Josef Bacik
  0 siblings, 1 reply; 29+ messages in thread
From: cwillu @ 2014-02-10 18:36 UTC (permalink / raw)
  To: Josef Bacik; +Cc: Hugo Mills, lin >> linux-btrfs@vger.kernel.org

IMO, used should definitely include metadata, especially given that we
inline small files.

I can convince myself both that this implies that we should roll it
into b_avail, and that we should go the other way and only report the
actual used number for metadata as well, so I might just plead
insanity here.

On Mon, Feb 10, 2014 at 12:28 PM, Josef Bacik <jbacik@fb.com> wrote:
>
>
> On 02/10/2014 01:24 PM, cwillu wrote:
>>
>> I concur.
>>
>> The regular df data used number should be the amount of space required
>> to hold a backup of that content (assuming that the backup maintains
>> reflinks and compression and so forth).
>>
>> There's no good answer for available space; the statfs syscall isn't
>> rich enough to cover all the bases even in the face of dup metadata
>> and single data (i.e., the common case), and a truly conservative
>> estimate (report based on the highest-usage raid level in use) would
>> report space/2 on that same common case.  "Highest-usage data raid
>> level in use" is probably the best compromise, with a big warning that
>> that many large numbers of small files will not actually fit, posted
>> in some mythical place that users look.
>>
>> I would like to see the information from btrfs fi df and btrfs fi show
>> summarized somewhere (ideally as a new btrfs fi df output), as both
>> sets of numbers are really necessary, or at least have btrfs fi df
>> include the amount of space not allocated to a block group.
>>
>> Re regular df: are we adding space allocated to a block group (raid1,
>> say) but not in actual use in a file as the N/2 space available in the
>> block group, or the N space it takes up on disk?  This probably
>> matters a bit less than it used to, but if it's N/2, that leaves us
>> open to "empty filesystem, 100GB free, write a 80GB file and then
>> delete it, wtf, only 60GB free now?" reporting issues.
>>
>
> The only case we add the actual allocated chunk space is for metadata, for
> data we only add the actual used number.  So say say you write 80gb file and
> then delete it but during the writing we allocated a 1 gig chunk for
> metadata you'll see only 99gb free, make sense?  We could (should?) roll
> this into the b_avail magic and make "used" really only reflect data usage,
> opinions on this?  Thanks,
>
> Josef


* Re: What to do about df and btrfs fi df
  2014-02-10 18:36       ` cwillu
@ 2014-02-10 18:41         ` Josef Bacik
  2014-02-10 18:54           ` cwillu
                             ` (3 more replies)
  0 siblings, 4 replies; 29+ messages in thread
From: Josef Bacik @ 2014-02-10 18:41 UTC (permalink / raw)
  To: cwillu; +Cc: Hugo Mills, lin >> linux-btrfs@vger.kernel.org



On 02/10/2014 01:36 PM, cwillu wrote:
> IMO, used should definitely include metadata, especially given that we
> inline small files.
>
> I can convince myself both that this implies that we should roll it
> into b_avail, and that we should go the other way and only report the
> actual used number for metadata as well, so I might just plead
> insanity here.
>

I could be convinced to do this.  So we have

total: (total disk bytes) / (raid multiplier)
used: (total used in data block groups) +
	(total used in metadata block groups)
avail: total - (total used in data block groups +
		total metadata block groups)

That seems like the simplest to code up.  Then we can argue about 
whether to use the total metadata size or just the used metadata size 
for b_avail.  Seem reasonable?
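
The three formulas above translate almost directly into code. This is only a sketch with invented names, using the total metadata size for avail, and the example assumes a single-profile 100GB filesystem with nothing left in use after the 80GB file is deleted.

```python
GB = 10**9

def df_numbers(total_disk_bytes, raid_multiplier, data_used, meta_used, meta_total):
    total = total_disk_bytes // raid_multiplier
    used = data_used + meta_used
    # Metadata counts at its allocated size here, not its used size.
    avail = total - (data_used + meta_total)
    return total, used, avail

# 100GB filesystem, 80GB file written and deleted, one 1GB metadata chunk
# still allocated (metadata usage simplified to zero).
total, used, avail = df_numbers(100 * GB, 1, 0, 0, 1 * GB)
print(total // GB, used // GB, avail // GB)  # 100 0 99
```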

Josef


* Re: What to do about df and btrfs fi df
  2014-02-10 18:41         ` Josef Bacik
@ 2014-02-10 18:54           ` cwillu
  2014-02-10 19:05           ` Hugo Mills
                             ` (2 subsequent siblings)
  3 siblings, 0 replies; 29+ messages in thread
From: cwillu @ 2014-02-10 18:54 UTC (permalink / raw)
  To: Josef Bacik; +Cc: Hugo Mills, lin >> linux-btrfs@vger.kernel.org

>> IMO, used should definitely include metadata, especially given that we
>> inline small files.
>>
>> I can convince myself both that this implies that we should roll it
>> into b_avail, and that we should go the other way and only report the
>> actual used number for metadata as well, so I might just plead
>> insanity here.
>
> I could be convinced to do this.  So we have
>
> total: (total disk bytes) / (raid multiplier)
> used: (total used in data block groups) +
>         (total used in metadata block groups)
> avail: total - (total used in data block groups +
>                 total metadata block groups)
>
> That seems like the simplest to code up.  Then we can argue about whether to
> use the total metadata size or just the used metadata size for b_avail.
> Seem reasonable?

I can't think of any situations where this results in tears.


* Re: What to do about df and btrfs fi df
  2014-02-10 18:41         ` Josef Bacik
  2014-02-10 18:54           ` cwillu
@ 2014-02-10 19:05           ` Hugo Mills
  2014-02-11 17:36           ` David Sterba
  2014-02-17 17:08           ` David Sterba
  3 siblings, 0 replies; 29+ messages in thread
From: Hugo Mills @ 2014-02-10 19:05 UTC (permalink / raw)
  To: Josef Bacik; +Cc: cwillu, lin >> linux-btrfs@vger.kernel.org

On Mon, Feb 10, 2014 at 01:41:23PM -0500, Josef Bacik wrote:
> 
> 
> On 02/10/2014 01:36 PM, cwillu wrote:
> >IMO, used should definitely include metadata, especially given that we
> >inline small files.
> >
> >I can convince myself both that this implies that we should roll it
> >into b_avail, and that we should go the other way and only report the
> >actual used number for metadata as well, so I might just plead
> >insanity here.
> >
> 
> I could be convinced to do this.  So we have
> 
> total: (total disk bytes) / (raid multiplier)
> used: (total used in data block groups) +
> 	(total used in metadata block groups)
> avail: total - (total used in data block groups +
> 		total metadata block groups)
> 
> That seems like the simplest to code up.  Then we can argue about
> whether to use the total metadata size or just the used metadata
> size for b_avail.  Seem reasonable?

   My vote on that bikeshed: total metadata size. But I'll accept any
other answer. :)

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Well, you don't get to be a kernel hacker simply by looking ---   
                    good in Speedos. -- Rusty Russell                    


* Re: What to do about df and btrfs fi df
  2014-02-10 17:06 ` Hugo Mills
  2014-02-10 18:24   ` cwillu
@ 2014-02-10 22:14   ` Goffredo Baroncelli
  1 sibling, 0 replies; 29+ messages in thread
From: Goffredo Baroncelli @ 2014-02-10 22:14 UTC (permalink / raw)
  To: Hugo Mills, Josef Bacik,
	lin >> "linux-btrfs@vger.kernel.org"

On 02/10/2014 06:06 PM, Hugo Mills wrote:
>    Biggest multiplier leads to the pessimistic estimate, which is what
> I'd prefer to see here, so that's good. Agree with this.

I would prefer to use as "raid multiplier" the ratio 

 total data block groups + total metadata block group
--------------------------------------------------------------
    disk space allocated for data and metadata block groups

I hope that this would work better when we have a filesystem composed 
by small (inlined) files or when we will have per-subvolume RAID levels.
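
A minimal sketch of that ratio, with invented block group figures. Note that as written it is logical space over disk space, so it is at most 1; multiplying the unallocated space by it (instead of dividing by a multiplier) is an assumption about how it would be applied.

```python
GiB = 2**30

def data_to_disk_ratio(block_groups):
    # Logical size of all data and metadata block groups ...
    logical = sum(bg["total"] for bg in block_groups)
    # ... over the disk space actually allocated to hold them.
    raw = sum(bg["total"] * bg["copies"] for bg in block_groups)
    return logical / raw

block_groups = [
    {"type": "Data, single", "total": 2 * GiB, "copies": 1},
    {"type": "Metadata, DUP", "total": 1 * GiB, "copies": 2},
]
ratio = data_to_disk_ratio(block_groups)  # 3 GiB logical on 4 GiB of disk
unallocated = 100 * GiB
print(ratio, ratio * unallocated / GiB)   # 0.75 75.0
```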

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: What to do about df and btrfs fi df
  2014-02-10 16:41 What to do about df and btrfs fi df Josef Bacik
  2014-02-10 17:06 ` Hugo Mills
@ 2014-02-10 22:26 ` Goffredo Baroncelli
  2014-02-10 22:56   ` cwillu
  2014-02-11 13:14   ` Josef Bacik
  2014-02-11 20:53 ` Sandy McArthur
  2014-02-12  3:55 ` Anand Jain
  3 siblings, 2 replies; 29+ messages in thread
From: Goffredo Baroncelli @ 2014-02-10 22:26 UTC (permalink / raw)
  To: Josef Bacik, lin >> "linux-btrfs@vger.kernel.org"

On 02/10/2014 05:41 PM, Josef Bacik wrote:
> ===== New and improved btrfs fi df =====
> 
> Since people using this tool are already going to be better informed
> and since we are already given the block group flags we can go ahead
> and do the raid multiplier in btrfs-progs and spit out the adjusted
> numbers rather than the raw numbers we get from the ioctl.  This will
> just be a progs thing and that way we can possibly add an option to
> not apply the multipliers and just get the raw output.

In the past [1] I proposed the following approach.

$ sudo btrfs filesystem df /mnt/btrfs1/
Disk size:		 400.00GB
Disk allocated:		   8.04GB
Disk unallocated:	 391.97GB
Used:			  11.29MB
Free (Estimated):	 250.45GB	(Max: 396.99GB, min: 201.00GB)
Data to disk ratio:	     63 %

The space was given in terms of "disk space" and in terms of 
"filesystem space". Other than that, there is an estimate of the free
space, with pessimistic and optimistic values.

[1] See "[PATCH V3][BTRFS-PROGS] Enhance btrfs fi df with raid5/6 support" 
dated 03/10/2013 01:17 PM

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: What to do about df and btrfs fi df
  2014-02-10 22:26 ` Goffredo Baroncelli
@ 2014-02-10 22:56   ` cwillu
  2014-02-11 13:14   ` Josef Bacik
  1 sibling, 0 replies; 29+ messages in thread
From: cwillu @ 2014-02-10 22:56 UTC (permalink / raw)
  To: kreijack@inwind.it; +Cc: Josef Bacik, lin >> linux-btrfs@vger.kernel.org

> In the past [1] I proposed the following approach.
>
> $ sudo btrfs filesystem df /mnt/btrfs1/
> Disk size:               400.00GB
> Disk allocated:            8.04GB
> Disk unallocated:        391.97GB
> Used:                     11.29MB
> Free (Estimated):        250.45GB       (Max: 396.99GB, min: 201.00GB)
> Data to disk ratio:          63 %

Note that a big chunk of the problem is "what do we do with the
regular system df output".  I don't mind this as a btrfs fi df summary
though.


* Re: What to do about df and btrfs fi df
  2014-02-10 18:24   ` cwillu
  2014-02-10 18:28     ` Josef Bacik
@ 2014-02-11  1:02     ` Roger Binns
  2014-02-11  3:13       ` cwillu
  1 sibling, 1 reply; 29+ messages in thread
From: Roger Binns @ 2014-02-11  1:02 UTC (permalink / raw)
  To: linux-btrfs

On 10/02/14 10:24, cwillu wrote:
> The regular df data used number should be the amount of space required 
> to hold a backup of that content (assuming that the backup maintains 
> reflinks and compression and so forth).
> 
> There's no good answer for available space;

I think the flipside of the above works well.  How large a group of files
can you expect to create before you will get ENOSPC?

That, for example, is the check that code does when it looks at df - "I
need to put in XGB of files - will it fit?"  It is also what users do.

This is also what NTFS under Windows does with compression.  If it says
you have 5GB of space left then you will be able to put in 5GB of
uncompressible files.  Of course if they are compressible then you don't
end up consuming all the free space.

Roger


* Re: What to do about df and btrfs fi df
  2014-02-11  1:02     ` Roger Binns
@ 2014-02-11  3:13       ` cwillu
  2014-02-11  3:35         ` ronnie sahlberg
  2014-02-11 19:58         ` Roger Binns
  0 siblings, 2 replies; 29+ messages in thread
From: cwillu @ 2014-02-11  3:13 UTC (permalink / raw)
  To: Roger Binns; +Cc: linux-btrfs

On Mon, Feb 10, 2014 at 7:02 PM, Roger Binns <rogerb@rogerbinns.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 10/02/14 10:24, cwillu wrote:
>> The regular df data used number should be the amount of space required
>> to hold a backup of that content (assuming that the backup maintains
>> reflinks and compression and so forth).
>>
>> There's no good answer for available space;
>
> I think the flipside of the above works well.  How large a group of files
> can you expect to create before you will get ENOSPC?
>
> That for example is the check code does that looks at df - "I need to put
> in XGB of files - will it fit?"  It is also what users do.

But the answer changes dramatically depending on whether it's large
numbers of small files or a small number of large files, and the
conservative worst-case choice means we report a number that is half
what is probably expected.


* Re: What to do about df and btrfs fi df
  2014-02-11  3:13       ` cwillu
@ 2014-02-11  3:35         ` ronnie sahlberg
  2014-02-11 19:58         ` Roger Binns
  1 sibling, 0 replies; 29+ messages in thread
From: ronnie sahlberg @ 2014-02-11  3:35 UTC (permalink / raw)
  To: cwillu; +Cc: Roger Binns, Btrfs BTRFS

On Mon, Feb 10, 2014 at 7:13 PM, cwillu <cwillu@cwillu.com> wrote:
> On Mon, Feb 10, 2014 at 7:02 PM, Roger Binns <rogerb@rogerbinns.com> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 10/02/14 10:24, cwillu wrote:
>>> The regular df data used number should be the amount of space required
>>> to hold a backup of that content (assuming that the backup maintains
>>> reflinks and compression and so forth).
>>>
>>> There's no good answer for available space;
>>
>> I think the flipside of the above works well.  How large a group of files
>> can you expect to create before you will get ENOSPC?
>>
>> That for example is the check code does that looks at df - "I need to put
>> in XGB of files - will it fit?"  It is also what users do.
>
> But the answer changes dramatically depending on whether it's large
> numbers of small files or a small number of large files, and the
> conservative worst-case choice means we report a number that is half
> what is probably expected.

I don't think that is a problem, as long as the "avail guesstimate" is
conservative.

Scenario:
A user has 10G of files and df reports that there are 11G available.
I think the expectation is that copying these 10G into the filesystem
will not ENOSPC.
After the copy completes, whether the new avail number is ==1G or >>1G
is less important IMHO.

I.e. I like to see df output as a "you can write AT LEAST this much
more data until the filesystem is full".


That was my 5 cent.


* Re: What to do about df and btrfs fi df
  2014-02-10 22:26 ` Goffredo Baroncelli
  2014-02-10 22:56   ` cwillu
@ 2014-02-11 13:14   ` Josef Bacik
  2014-02-11 18:20     ` Goffredo Baroncelli
  1 sibling, 1 reply; 29+ messages in thread
From: Josef Bacik @ 2014-02-11 13:14 UTC (permalink / raw)
  To: kreijack, lin >> "linux-btrfs@vger.kernel.org"



On 02/10/2014 05:26 PM, Goffredo Baroncelli wrote:
> On 02/10/2014 05:41 PM, Josef Bacik wrote:
>> ===== New and improved btrfs fi df =====
>>
>> Since people using this tool are already going to be better informed
>> and since we are already given the block group flags we can go ahead
>> and do the raid multiplier in btrfs-progs and spit out the adjusted
>> numbers rather than the raw numbers we get from the ioctl.  This will
>> just be a progs thing and that way we can possibly add an option to
>> not apply the multipliers and just get the raw output.
>
> In the past [1] I proposed the following approach.
>
> $ sudo btrfs filesystem df /mnt/btrfs1/
> Disk size:		 400.00GB
> Disk allocated:		   8.04GB
> Disk unallocated:	 391.97GB
> Used:			  11.29MB
> Free (Estimated):	 250.45GB	(Max: 396.99GB, min: 201.00GB)
> Data to disk ratio:	     63 %
>
> The space was given in terms of "disk space" and in terms of
> "filesystem space". Other that there is an indication of an estimation of
> the free space, with the pessimistic and optimistic values.
>
> [1] See "[PATCH V3][BTRFS-PROGS] Enhance btrfs fi df with raid5/6 support"
> dated 03/10/2013 01:17 PM
>

The problem I had with this patch was it didn't give me a way to get the 
original output.  I as a developer really need to have the raw dump of 
the block group info as I'm doing stuff.  So I like this output, but I 
still need my old output, if you fix that part up I'll review/ack it. 
Thanks,

Josef


* Re: What to do about df and btrfs fi df
  2014-02-10 18:41         ` Josef Bacik
  2014-02-10 18:54           ` cwillu
  2014-02-10 19:05           ` Hugo Mills
@ 2014-02-11 17:36           ` David Sterba
  2014-02-17 17:08           ` David Sterba
  3 siblings, 0 replies; 29+ messages in thread
From: David Sterba @ 2014-02-11 17:36 UTC (permalink / raw)
  To: Josef Bacik; +Cc: cwillu, Hugo Mills, lin >> linux-btrfs@vger.kernel.org

On Mon, Feb 10, 2014 at 01:41:23PM -0500, Josef Bacik wrote:
> On 02/10/2014 01:36 PM, cwillu wrote:
> >IMO, used should definitely include metadata, especially given that we
> >inline small files.
> >
> >I can convince myself both that this implies that we should roll it
> >into b_avail, and that we should go the other way and only report the
> >actual used number for metadata as well, so I might just plead
> >insanity here.
> >
> 
> I could be convinced to do this.  So we have
> 
> total: (total disk bytes) / (raid multiplier)
> used: (total used in data block groups) +
> 	(total used in metadata block groups)
> avail: total - (total used in data block groups +
> 		total metadata block groups)

Sounds reasonable to me.

> That seems like the simplest to code up.  Then we can argue about whether to
> use the total metadata size or just the used metadata size for b_avail.

I tend to vote for 'total metadata size'. Based on the common use cases
that consume only metadata (reflink, snapshot), I'd expect to see no
change in the 'avail' value; an increased 'used' makes sense.


* Re: What to do about df and btrfs fi df
  2014-02-11 13:14   ` Josef Bacik
@ 2014-02-11 18:20     ` Goffredo Baroncelli
  2014-02-11 18:33       ` Josef Bacik
  2014-02-11 18:56       ` Hugo Mills
  0 siblings, 2 replies; 29+ messages in thread
From: Goffredo Baroncelli @ 2014-02-11 18:20 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs@vger.kernel.org

On 02/11/2014 02:14 PM, Josef Bacik wrote:
> 
> 
> On 02/10/2014 05:26 PM, Goffredo Baroncelli wrote:
>> On 02/10/2014 05:41 PM, Josef Bacik wrote:
>>> ===== New and improved btrfs fi df =====
[...]

Hi Josef

> The problem I had with this patch was it didn't give me a way to get
> the original output.  I as a developer really need to have the raw
> dump of the block group info as I'm doing stuff.  So I like this
> output, but I still need my old output, if you fix that part up I'll
> review/ack it. Thanks,

I am open to improving this patch. What about the following output (it 
is a copy-and-paste mock-up, no code yet; the numbers are invented)?

$ sudo btrfs filesystem df /mnt/btrfs1/
Disk size:		 400.00GB
Disk unallocated:	 391.97GB
Disk allocation:
                        Allocated	Used
   Data, single:           2.01GB          1.00GB
   System, DUP:            4.00MB          2.00MB
   System, single:         4.00MB          1.00MB
   Metadata, DUP:          2.00GB        750.00MB
   Metadata, single:       8.00MB          2.20MB
                           ------         -------
   Total:                  7.00GB          1.75GB

Free (Estimated):	 250.45GB	(Max: 396.99GB, min: 201.00GB)
Data to disk ratio:	     63 %

Do you like it? Do you have any further suggestions?

Anyway, if you want a more understandable block group info dump, 
I suggest you take a look at the other command (same patch set):

$ sudo ./btrfs filesystem disk-usage -t /mnt/btrfs1/
         Data   Data    Metadata Metadata System System             
         Single RAID6   Single   RAID5    Single RAID5   Unallocated
                                                                    
/dev/vdb 8.00MB  1.00GB   8.00MB   1.00GB 4.00MB  4.00MB     97.98GB
/dev/vdc      -  1.00GB        -   1.00GB      -  4.00MB     98.00GB
/dev/vdd      -  1.00GB        -   1.00GB      -  4.00MB     98.00GB
/dev/vde      -  1.00GB        -   1.00GB      -  4.00MB     98.00GB
         ====== ======= ======== ======== ====== ======= ===========
Total    8.00MB  2.00GB   8.00MB   3.00GB 4.00MB 12.00MB    391.97GB
Used       0.00 11.25MB     0.00  36.00KB   0.00  4.00KB            






-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: What to do about df and btrfs fi df
  2014-02-11 18:20     ` Goffredo Baroncelli
@ 2014-02-11 18:33       ` Josef Bacik
  2014-02-11 18:46         ` Goffredo Baroncelli
  2014-02-11 18:56       ` Hugo Mills
  1 sibling, 1 reply; 29+ messages in thread
From: Josef Bacik @ 2014-02-11 18:33 UTC (permalink / raw)
  To: kreijack, linux-btrfs@vger.kernel.org



On 02/11/2014 01:20 PM, Goffredo Baroncelli wrote:
> On 02/11/2014 02:14 PM, Josef Bacik wrote:
>>
>>
>> On 02/10/2014 05:26 PM, Goffredo Baroncelli wrote:
>>> On 02/10/2014 05:41 PM, Josef Bacik wrote:
>>>> ===== New and improved btrfs fi df =====
> [...]
>
> Hi Josef
>
>> The problem I had with this patch was it didn't give me a way to get
>> the original output.  I as a developer really need to have the raw
>> dump of the block group info as I'm doing stuff.  So I like this
>> output, but I still need my old output, if you fix that part up I'll
>> review/ack it. Thanks,
>
> I am open to improve this patch. What about the following output (it
> was a copy and paste, no code for now, the number are invented)
>
> $ sudo btrfs filesystem df /mnt/btrfs1/
> Disk size:		 400.00GB
> Disk unallocated:	 391.97GB
> Disk allocation:
>                          Allocated	Used
>     Data, single:           2.01GB,         1.00GB
>     System, DUP:            4.00MB          2.00MB
>     System, single:         4.00MB          1.00MB
>     Metadata, DUP:          2.00GB        750.00MB
>     Metadata, single:       8.00MB          2.20MB
>                             ------         -------
>     Total:                  7.00GB          1.75GB
>
> Free (Estimated):	 250.45GB	(Max: 396.99GB, min: 201.00GB)
> Data to disk ratio:	     63 %
>
> Do you like ? Do you have further suggestions ?
>
> Anyway if you want a more understandable block group info dump,
> I suggest you to give a look to the other command (same patch set):
>
> $ sudo ./btrfs filesystem disk-usage -t /mnt/btrfs1/
>           Data   Data    Metadata Metadata System System
>           Single RAID6   Single   RAID5    Single RAID5   Unallocated
>
> /dev/vdb 8.00MB  1.00GB   8.00MB   1.00GB 4.00MB  4.00MB     97.98GB
> /dev/vdc      -  1.00GB        -   1.00GB      -  4.00MB     98.00GB
> /dev/vdd      -  1.00GB        -   1.00GB      -  4.00MB     98.00GB
> /dev/vde      -  1.00GB        -   1.00GB      -  4.00MB     98.00GB
>           ====== ======= ======== ======== ====== ======= ===========
> Total    8.00MB  2.00GB   8.00MB   3.00GB 4.00MB 12.00MB    391.97GB
> Used       0.00 11.25MB     0.00  36.00KB   0.00  4.00KB
>

I did not notice this before. As long as I still have a way to get to
this information, I'm good with what you had originally.  Please update
and resend, and I'll review it.  Thanks,

Josef


* Re: What to do about df and btrfs fi df
  2014-02-11 18:33       ` Josef Bacik
@ 2014-02-11 18:46         ` Goffredo Baroncelli
  0 siblings, 0 replies; 29+ messages in thread
From: Goffredo Baroncelli @ 2014-02-11 18:46 UTC (permalink / raw)
  To: Josef Bacik, linux-btrfs@vger.kernel.org

On 02/11/2014 07:33 PM, Josef Bacik wrote:
> 
> 
> On 02/11/2014 01:20 PM, Goffredo Baroncelli wrote:
>> On 02/11/2014 02:14 PM, Josef Bacik wrote:
>>>
>>>
>>> On 02/10/2014 05:26 PM, Goffredo Baroncelli wrote:
>>>> On 02/10/2014 05:41 PM, Josef Bacik wrote:
>>>>> ===== New and improved btrfs fi df =====
>> [...]
>>
>> Hi Josef
>>
>>> The problem I had with this patch was it didn't give me a way to get
>>> the original output.  I as a developer really need to have the raw
>>> dump of the block group info as I'm doing stuff.  So I like this
>>> output, but I still need my old output, if you fix that part up I'll
>>> review/ack it. Thanks,
>>
>> I am open to improve this patch. What about the following output (it
>> was a copy and paste, no code for now, the number are invented)
>>
>> $ sudo btrfs filesystem df /mnt/btrfs1/
>> Disk size:         400.00GB
>> Disk unallocated:     391.97GB
>> Disk allocation:
>>                          Allocated    Used
>>     Data, single:           2.01GB,         1.00GB
>>     System, DUP:            4.00MB          2.00MB
>>     System, single:         4.00MB          1.00MB
>>     Metadata, DUP:          2.00GB        750.00MB
>>     Metadata, single:       8.00MB          2.20MB
>>                             ------         -------
>>     Total:                  7.00GB          1.75GB
>>
>> Free (Estimated):     250.45GB    (Max: 396.99GB, min: 201.00GB)
>> Data to disk ratio:         63 %
>>
>> Do you like ? Do you have further suggestions ?
>>
>> Anyway if you want a more understandable block group info dump,
>> I suggest you to give a look to the other command (same patch set):
>>
>> $ sudo ./btrfs filesystem disk-usage -t /mnt/btrfs1/
>>           Data   Data    Metadata Metadata System System
>>           Single RAID6   Single   RAID5    Single RAID5   Unallocated
>>
>> /dev/vdb 8.00MB  1.00GB   8.00MB   1.00GB 4.00MB  4.00MB     97.98GB
>> /dev/vdc      -  1.00GB        -   1.00GB      -  4.00MB     98.00GB
>> /dev/vdd      -  1.00GB        -   1.00GB      -  4.00MB     98.00GB
>> /dev/vde      -  1.00GB        -   1.00GB      -  4.00MB     98.00GB
>>           ====== ======= ======== ======== ====== ======= ===========
>> Total    8.00MB  2.00GB   8.00MB   3.00GB 4.00MB 12.00MB    391.97GB
>> Used       0.00 11.25MB     0.00  36.00KB   0.00  4.00KB
>>
> 
> I did not notice this before, as long as I have a way to get to this
> information still then I'm good with what you had originally. I guess
> update and resend and I'll review it. Thanks,
>
> Josef

OK, give me a few days to rebase the patches.

> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: What to do about df and btrfs fi df
  2014-02-11 18:20     ` Goffredo Baroncelli
  2014-02-11 18:33       ` Josef Bacik
@ 2014-02-11 18:56       ` Hugo Mills
  2014-02-12 21:03         ` Goffredo Baroncelli
  1 sibling, 1 reply; 29+ messages in thread
From: Hugo Mills @ 2014-02-11 18:56 UTC (permalink / raw)
  To: kreijack; +Cc: Josef Bacik, linux-btrfs@vger.kernel.org


On Tue, Feb 11, 2014 at 07:20:23PM +0100, Goffredo Baroncelli wrote:
> On 02/11/2014 02:14 PM, Josef Bacik wrote:
> > 
> > 
> > On 02/10/2014 05:26 PM, Goffredo Baroncelli wrote:
> >> On 02/10/2014 05:41 PM, Josef Bacik wrote:
> >>> ===== New and improved btrfs fi df =====
> [...]
> 
> Hi Josef
> 
> > The problem I had with this patch was it didn't give me a way to get
> > the original output.  I as a developer really need to have the raw
> > dump of the block group info as I'm doing stuff.  So I like this
> > output, but I still need my old output, if you fix that part up I'll
> > review/ack it. Thanks,
> 
> I am open to improve this patch. What about the following output (it 
> was a copy and paste, no code for now, the number are invented)
> 
> $ sudo btrfs filesystem df /mnt/btrfs1/
> Disk size:		 400.00GB
> Disk unallocated:	 391.97GB
> Disk allocation:
>                         Allocated	Used
>    Data, single:           2.01GB,         1.00GB
>    System, DUP:            4.00MB          2.00MB
>    System, single:         4.00MB          1.00MB
>    Metadata, DUP:          2.00GB        750.00MB
>    Metadata, single:       8.00MB          2.20MB
>                            ------         -------
>    Total:                  7.00GB          1.75GB

   Two minor nits here: please put a space between the number and the
units, and distinguish between e.g. MB (powers of 10) and MiB (powers
of 2).
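
   As an illustration of the two nits above, here is a minimal sketch in
Python (btrfs-progs itself is written in C). The name format_size and
its binary parameter are hypothetical and are not part of btrfs-progs;
the sketch only shows a space before the unit and a distinct suffix
for powers of 2 versus powers of 10.

```python
def format_size(num_bytes, binary=True):
    """Format a byte count with a space between the number and the unit.

    binary=True  uses powers of 2  (KiB, MiB, GiB, ...).
    binary=False uses powers of 10 (kB, MB, GB, ...).
    Hypothetical helper for illustration only.
    """
    base = 1024 if binary else 1000
    if binary:
        units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    else:
        units = ["B", "kB", "MB", "GB", "TB", "PB"]

    value = float(num_bytes)
    idx = 0
    # Scale down until the value fits the current unit or units run out.
    while value >= base and idx < len(units) - 1:
        value /= base
        idx += 1
    return f"{value:.2f} {units[idx]}"


if __name__ == "__main__":
    n = 4 * 1024 * 1024
    print(format_size(n))                # binary units
    print(format_size(n, binary=False))  # decimal units
```

   The same byte count gives different numbers in the two unit systems,
which is why the suffix has to say which one is meant.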

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
    --- But somewhere along the line, it seems / That pimp became ---    
                       cool,  and punk mainstream.                       



* Re: What to do about df and btrfs fi df
  2014-02-11  3:13       ` cwillu
  2014-02-11  3:35         ` ronnie sahlberg
@ 2014-02-11 19:58         ` Roger Binns
  1 sibling, 0 replies; 29+ messages in thread
From: Roger Binns @ 2014-02-11 19:58 UTC (permalink / raw)
  To: linux-btrfs

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/02/14 19:13, cwillu wrote:
> But the answer changes dramatically depending on whether it's large 
> numbers of small files or a small number of large files, and the 
> conservative worst-case choice means we report a number that is half 
> what is probably expected.

Perfect is the enemy of good.

We aren't talking about a billion zero byte files and expecting them to
take no space.  It is things like a user with a file manager grabbing some
files and eyeballing if they will fit in the destination. Or the file
manager itself giving a warning before the copying starts ("they might not
fit").

In both cases the sum of the source file sizes is compared to the df on
the destination.
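
A rough sketch of that comparison in Python, assuming the file manager
simply sums the source sizes and compares them with the free space the
destination reports. The function name probably_fits and the free_bytes
override are hypothetical; on Unix, shutil.disk_usage() takes its
"free" figure from statvfs (f_bavail * f_frsize), which is the same
number df shows as available.

```python
import os
import shutil


def probably_fits(source_paths, dest_dir, free_bytes=None):
    """Return True if the summed source sizes fit in the destination.

    free_bytes may be passed explicitly (useful for testing); otherwise
    it is read from the filesystem that holds dest_dir.
    Hypothetical helper for illustration only.
    """
    needed = sum(os.path.getsize(p) for p in source_paths)
    if free_bytes is None:
        free_bytes = shutil.disk_usage(dest_dir).free
    return needed <= free_bytes


if __name__ == "__main__":
    # Check whether this script itself would fit in the current directory.
    print(probably_fits([__file__], "."))
```

This is only an eyeball check: it ignores metadata overhead, and it is
only as good as the available number the filesystem reports.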

Roger

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)

iEYEARECAAYFAlL6gNsACgkQmOOfHg372QR+WACfd91k2MYzbBbb3RFFuLCJUyw0
tw0AoI51yxrXCGFYHJBEK3+rwqR6i/iY
=RIiX
-----END PGP SIGNATURE-----



* Re: What to do about df and btrfs fi df
  2014-02-10 16:41 What to do about df and btrfs fi df Josef Bacik
  2014-02-10 17:06 ` Hugo Mills
  2014-02-10 22:26 ` Goffredo Baroncelli
@ 2014-02-11 20:53 ` Sandy McArthur
  2014-02-12  3:09   ` Kostia Khlebopros
  2014-02-12  3:55 ` Anand Jain
  3 siblings, 1 reply; 29+ messages in thread
From: Sandy McArthur @ 2014-02-11 20:53 UTC (permalink / raw)
  To: linux-btrfs@vger.kernel.org; +Cc: Josef Bacik

Maybe this is too much of a break from tradition but I think df should
report the min(device free space, remaining quota) for the particular
volume being queried.
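
A minimal sketch of that rule in Python, assuming the free space and
the quota figures are already known as byte counts. The function and
parameter names are hypothetical; nothing here is an existing btrfs
interface.

```python
def df_available(device_free, quota_limit=None, quota_used=0):
    """Available bytes as min(device free space, remaining quota).

    quota_limit=None means the volume has no quota, so only the device
    free space applies. Hypothetical helper for illustration only.
    """
    if quota_limit is None:
        return device_free
    # Never report a negative remainder if the quota is already exceeded.
    remaining_quota = max(quota_limit - quota_used, 0)
    return min(device_free, remaining_quota)


if __name__ == "__main__":
    print(df_available(100, quota_limit=50, quota_used=20))  # quota is the limit
    print(df_available(10, quota_limit=50, quota_used=20))   # device is the limit
```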


On Mon, Feb 10, 2014 at 11:41 AM, Josef Bacik <jbacik@fb.com> wrote:
> Hello,
>
> So first of all this is going to get a lot of responses, so straight away
> I'm only going to consider your opinion if I recognize your name and think
> you are a sane person.  This basically means any big contributors and we'll
> make sanity exceptions for cwillu.
>
> These are just broad strokes, let us not get bogged down in the details, I
> just want to come to a consensus on how things _generally_ should be
> portrayed to the user.  We can worry about implementation details once we
> agree on the direction we want to go.
>
> We all know space is a loaded question with btrfs, so I'm just going to
> explain the reasoning of why we chose what we chose originally and then
> offer the direction we should go in.  If you agree say yay, if not please
> provide a very concise alternative suggestion with a very short explanation
> of why it is better than I'm suggesting.  I'm not looking to spend a whole
> lot of time this problem.
>
> Also this isn't going to address b_avail, cause frankly that is some fucking
> voodoo right there, suffice it to say we will just adjust b_avail based on
> how we should represent total and used.
>
> ===== ye olde df =====
>
> I don't remember what we did originally, but IIRC we would only show used
> space from the block groups and would show the entire size of the fs.  So
> for example with two 1 tb drives in RAID1 you'd see ENOSPC and look at df
> and it would show total of 2TB and used at 1TB.  Obviously not good, so we
> switched to the mechanism we have today, which is you see 2TB for total, you
> see 2TB for used and you see 0 for available.  We just scaled up the used
> and available based on your raid multiplier.
>
> ===== btrfs fi df =====
>
> I made this for me because of ENOSPC issues but of course it's also really
> useful for users.  It is just a dump of the block group information and
> their flags, so really just spits out bytes_used and total_bytes and flags.
> Because at the block_group/space_info level in btrfs we don't care about how
> much actual space is taken up this number is not adjusted for RAID values,
> and these numbers are reflected in the tools output.  So if you have RAID1
> you need to mentally multiply the Total and Used values by 2 because that is
> how much actual space is being used.
>
> =====  What to do moving forward =====
>
> Flip what both of these do.  Do not multiply for normal df, and multiply for
> btrfs fi df.
>
> ===== New and improved df =====
>
> Since this is the lowest common denominator we should just spit out how much
> space is used based on the block groups and then divide the remaining space
> that hasn't been allocated yet by the raid multiplier.
>
> This is going to be kind of tricky once we do per-subvolume RAID levels, but
> this falls under the b_avail voodoo which is just a guess anyway, so for
> this we will probably take the biggest multiplier and use that to show how
> much available space you have.
>
> This way with RAID1 it shows you have 1tb of total space and you've used 1tb
> of space.
>
> ===== New and improved btrfs fi df =====
>
> Since people using this tool are already going to be better informed and
> since we are already given the block group flags we can go ahead and do the
> raid multiplier in btrfs-progs and spit out the adjusted numbers rather than
> the raw numbers we get from the ioctl.  This will just be a progs thing and
> that way we can possibly add an option to not apply the multipliers and just
> get the raw output.
>
> ===== Conclusion =====
>
> Let me know if this is acceptable to everybody.  Remember this is just broad
> strokes, keep your responses short and simple or I simply won't read them.
> Thanks,
>
> Josef
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Sandy McArthur

"He who dares not offend cannot be honest."
- Thomas Paine


* RE: What to do about df and btrfs fi df
  2014-02-11 20:53 ` Sandy McArthur
@ 2014-02-12  3:09   ` Kostia Khlebopros
  2014-02-12 21:12     ` Goffredo Baroncelli
  0 siblings, 1 reply; 29+ messages in thread
From: Kostia Khlebopros @ 2014-02-12  3:09 UTC (permalink / raw)
  To: Sandy McArthur, linux-btrfs@vger.kernel.org; +Cc: Josef Bacik


Are there any plans to have "btrfs fi df" report more precise values, rather than values rounded off to the nearest hundredth of a unit? Whole kilobytes (1024 bytes = 1 KiB) or bytes would be nice.

Current output:

# btrfs fi df /data
Data, single: total=1.37TiB, used=1.35TiB
System, DUP: total=8.00MiB, used=192.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=3.00GiB, used=1.62GiB
Metadata, single: total=8.00MiB, used=0.00

Better would be:

# btrfs fi df /data
Data, single: total=14123412341 bytes, used=1342343 bytes
...etc you know what I mean...

I wish there was more you know.

Kind Regards,

Kostia





* Re: What to do about df and btrfs fi df
  2014-02-10 16:41 What to do about df and btrfs fi df Josef Bacik
                   ` (2 preceding siblings ...)
  2014-02-11 20:53 ` Sandy McArthur
@ 2014-02-12  3:55 ` Anand Jain
  3 siblings, 0 replies; 29+ messages in thread
From: Anand Jain @ 2014-02-12  3:55 UTC (permalink / raw)
  To: lin >> "linux-btrfs@vger.kernel.org"; +Cc: Josef Bacik



  Most of the btrfs-progs output has to be (re)designed from the point
  of view of the end user.

  E.g. 'btrfs su list /mnt' could have been much better from the
  perspective of the end user (who should not have to look into the
  source code). Of course it makes sense to the developers themselves,
  but that's irrelevant.  And the good news is that it's getting better.

  When creating btrfs-progs output, it has to match the context the
  end user will be in when using btrfs. Advanced details could go
  behind a --verbose/--debug option.


Thanks, -Anand


* Re: What to do about df and btrfs fi df
  2014-02-11 18:56       ` Hugo Mills
@ 2014-02-12 21:03         ` Goffredo Baroncelli
  0 siblings, 0 replies; 29+ messages in thread
From: Goffredo Baroncelli @ 2014-02-12 21:03 UTC (permalink / raw)
  To: Hugo Mills; +Cc: Josef Bacik, linux-btrfs@vger.kernel.org

Hi Hugo,

On 02/11/2014 07:56 PM, Hugo Mills wrote:

>> $ sudo btrfs filesystem df /mnt/btrfs1/
>> Disk size:		 400.00GB
>> Disk unallocated:	 391.97GB
>> Disk allocation:
>>                         Allocated	Used
>>    Data, single:           2.01GB,         1.00GB
>>    System, DUP:            4.00MB          2.00MB
>>    System, single:         4.00MB          1.00MB
>>    Metadata, DUP:          2.00GB        750.00MB
>>    Metadata, single:       8.00MB          2.20MB
>>                            ------         -------
>>    Total:                  7.00GB          1.75GB
> 
>    Two minor nits here: please put a space between the number and the
> units, and distinguish between e.g. MB (powers of 10) and MiB (powers
> of 2).

I will use the pretty_size_snprintf() function, which uses powers-of-2
units but unfortunately doesn't seem to add the space.
The space could be added with a dedicated patch.

> 
>    Hugo.
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: What to do about df and btrfs fi df
  2014-02-12  3:09   ` Kostia Khlebopros
@ 2014-02-12 21:12     ` Goffredo Baroncelli
  0 siblings, 0 replies; 29+ messages in thread
From: Goffredo Baroncelli @ 2014-02-12 21:12 UTC (permalink / raw)
  To: Kostia Khlebopros, Sandy McArthur, linux-btrfs@vger.kernel.org
  Cc: Josef Bacik

Hi Kostia,

On 02/12/2014 04:09 AM, Kostia Khlebopros wrote:
> Any plans on having "brtfs fi df" report more precise values rather
> then rounded off to the nearest hundredth of a unit. full
> kilobytes(1024 bytes =1Kib) or in bytes would be nice
> 
> Current output:
> 
> # btrfs fi df /data
> Data, single: total=1.37TiB, used=1.35TiB
> System, DUP: total=8.00MiB, used=192.00KiB
> System, single: total=4.00MiB, used=0.00
> Metadata, DUP: total=3.00GiB, used=1.62GiB
> Metadata, single: total=8.00MiB, used=0.00
> 
> Better would be:
> 
> # btrfs fi df /data
> Data, single: total=14123412341 bytes, used=1342343 bytes
> ...etc you know what I mean...
> 

My patches have a switch '-b' which sets the units to bytes.

$ sudo ./btrfs fi df /
Disk size:		138.05GiB
Disk allocated:		 25.04GiB
Disk unallocated:	113.01GiB
Used:			 21.36GiB
Free (Estimated):	105.61GiB	(Max: 114.68GiB, min: 58.17GiB)
Data to disk ratio:	     92 %

$ sudo ./btrfs fi df -b /
Disk size:		      148229656576
Disk allocated:		       26881294336
Disk unallocated:	      121348362240
Used:			       22939594752
Free (Estimated):	      113402084997	(Max: 123134189568, min: 62460008448)
Data to disk ratio:	              92 %
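
The two outputs above can be cross-checked: dividing each byte count
from the '-b' run by 2^30 reproduces the GiB figures of the first run.
A small Python sketch of that check follows; the helper name to_gib is
hypothetical, and the numbers are copied from the two outputs above.

```python
GIB = 2 ** 30  # one GiB in bytes


def to_gib(num_bytes):
    """Render a byte count as GiB with two decimals, no space before the unit."""
    return f"{num_bytes / GIB:.2f}GiB"


# (bytes from the '-b' output, value from the default output)
pairs = [
    (148229656576, "138.05GiB"),  # Disk size
    (26881294336, "25.04GiB"),    # Disk allocated
    (121348362240, "113.01GiB"),  # Disk unallocated
    (22939594752, "21.36GiB"),    # Used
    (113402084997, "105.61GiB"),  # Free (Estimated)
]

if __name__ == "__main__":
    for raw, pretty in pairs:
        print(raw, to_gib(raw), "matches" if to_gib(raw) == pretty else "differs")
```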


> I wish there was more you know.
> 
> Kind Regards,
> 
> Kostia

BR
G.Baroncelli


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it>
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


* Re: What to do about df and btrfs fi df
  2014-02-10 18:41         ` Josef Bacik
                             ` (2 preceding siblings ...)
  2014-02-11 17:36           ` David Sterba
@ 2014-02-17 17:08           ` David Sterba
  2014-02-18  8:33             ` Goswin von Brederlow
  3 siblings, 1 reply; 29+ messages in thread
From: David Sterba @ 2014-02-17 17:08 UTC (permalink / raw)
  To: Josef Bacik; +Cc: cwillu, Hugo Mills, lin >> linux-btrfs@vger.kernel.org

On Mon, Feb 10, 2014 at 01:41:23PM -0500, Josef Bacik wrote:
> 
> 
> On 02/10/2014 01:36 PM, cwillu wrote:
> >IMO, used should definitely include metadata, especially given that we
> >inline small files.
> >
> >I can convince myself both that this implies that we should roll it
> >into b_avail, and that we should go the other way and only report the
> >actual used number for metadata as well, so I might just plead
> >insanity here.
> >
> 
> I could be convinced to do this.  So we have
> 
> total: (total disk bytes) / (raid multiplier)
> used: (total used in data block groups) +
> 	(total used in metadata block groups)
> avail: total - (total used in data block groups +
> 		total metadata block groups)

The size of the global block reserve should IMO be subtracted from
'avail'; at the moment it is reported as free space, but in fact it is
not.

The "used" amount of the global reserve might be included in the
filesystem 'used', but I've observed the global reserve being used for
short periods of time under some heavy stress, and I'm convinced it
needs to be accounted for in the df report.
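
A minimal sketch of the quoted formulas in Python, with the global
block reserve subtracted from 'avail' as suggested here. This is not
the kernel's statfs code; the function name, the parameter names, and
the assumption that the block group figures are logical (not yet
multiplied by the RAID factor) are all hypothetical.

```python
def df_estimate(total_disk_bytes, raid_multiplier,
                data_used, metadata_used, metadata_total,
                global_reserve=0):
    """Return (total, used, avail) following the formulas quoted above.

    total = total disk bytes / raid multiplier
    used  = used in data block groups + used in metadata block groups
    avail = total - (used in data block groups + total metadata block groups)
            - global block reserve
    Illustrative sketch only, not the actual btrfs implementation.
    """
    total = total_disk_bytes // raid_multiplier
    used = data_used + metadata_used
    avail = total - (data_used + metadata_total) - global_reserve
    # Clamp so a nearly full filesystem never reports negative space.
    return total, used, max(avail, 0)


if __name__ == "__main__":
    TB, GB, MIB = 10 ** 12, 10 ** 9, 2 ** 20
    # Two 1 TB drives in RAID1 (multiplier 2), example figures invented.
    print(df_estimate(2 * TB, 2, 600 * GB, 2 * GB, 4 * GB, 512 * MIB))
```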


* Re: What to do about df and btrfs fi df
  2014-02-17 17:08           ` David Sterba
@ 2014-02-18  8:33             ` Goswin von Brederlow
  2014-02-18 16:43               ` David Sterba
  0 siblings, 1 reply; 29+ messages in thread
From: Goswin von Brederlow @ 2014-02-18  8:33 UTC (permalink / raw)
  To: dsterba, Josef Bacik, cwillu, Hugo Mills,
	lin >> linux-btrfs@vger.kernel.org

On Mon, Feb 17, 2014 at 06:08:20PM +0100, David Sterba wrote:
> On Mon, Feb 10, 2014 at 01:41:23PM -0500, Josef Bacik wrote:
> > 
> > 
> > On 02/10/2014 01:36 PM, cwillu wrote:
> > >IMO, used should definitely include metadata, especially given that we
> > >inline small files.
> > >
> > >I can convince myself both that this implies that we should roll it
> > >into b_avail, and that we should go the other way and only report the
> > >actual used number for metadata as well, so I might just plead
> > >insanity here.
> > >
> > 
> > I could be convinced to do this.  So we have
> > 
> > total: (total disk bytes) / (raid multiplier)
> > used: (total used in data block groups) +
> > 	(total used in metadata block groups)
> > avail: total - (total used in data block groups +
> > 		total metadata block groups)
> 
> The size of global block reserve should be IMO subtracted from 'avail',
> this reports the space as free, but is in fact not.

How much global block reserve is there? Does that explain why I can't
use the last 270G of my 19TB btrfs?
 
> The "used" amount of the global reserve might be included into
> filesystem 'used', but I've observed the global reserve used for short
> periods of time under some heavy stress, I'm convinced it needs to be
> accounted in the df report.

As a comparison, the ext2/3/4 filesystems have a percentage reserved for
root and do not show this in available. So you get a filesystem with
0 bytes free, but root can still write to it.

I would argue that available should not include the reserve. It is not
available for normal operations, right?

MfG
	Goswin


* Re: What to do about df and btrfs fi df
  2014-02-18  8:33             ` Goswin von Brederlow
@ 2014-02-18 16:43               ` David Sterba
  0 siblings, 0 replies; 29+ messages in thread
From: David Sterba @ 2014-02-18 16:43 UTC (permalink / raw)
  To: Goswin von Brederlow
  Cc: dsterba, Josef Bacik, cwillu, Hugo Mills,
	lin >> linux-btrfs@vger.kernel.org

On Tue, Feb 18, 2014 at 09:33:17AM +0100, Goswin von Brederlow wrote:
> > The size of global block reserve should be IMO subtracted from 'avail',
> > this reports the space as free, but is in fact not.
> 
> How much global block reserve is there? Does that explain why I can't
> use the last 270G of my 19TB btrfs?

The size is dynamically adjusted according to the current fs usage, but
is not larger than 512MB. I don't think it's related to the issue you
are observing.

> > The "used" amount of the global reserve might be included into
> > filesystem 'used', but I've observed the global reserve used for short
> > periods of time under some heavy stress, I'm convinced it needs to be
> > accounted in the df report.
> 
> As a comparison the ext2/3/4 filesystem has a % reserved for root and
> does not show this in available. So you get filesystem with 0 bytes
> free but root can still write to them.
> 
> I would argue that available should not include the reserve. It is not
> available for normal operations, right?

This is different from the ext2 reserve for root: it's space reserved
for certain internal filesystem operations, or as an emergency pool,
e.g. when one wants to delete files from a full filesystem.


Thread overview: 29+ messages
2014-02-10 16:41 What to do about df and btrfs fi df Josef Bacik
2014-02-10 17:06 ` Hugo Mills
2014-02-10 18:24   ` cwillu
2014-02-10 18:28     ` Josef Bacik
2014-02-10 18:36       ` cwillu
2014-02-10 18:41         ` Josef Bacik
2014-02-10 18:54           ` cwillu
2014-02-10 19:05           ` Hugo Mills
2014-02-11 17:36           ` David Sterba
2014-02-17 17:08           ` David Sterba
2014-02-18  8:33             ` Goswin von Brederlow
2014-02-18 16:43               ` David Sterba
2014-02-11  1:02     ` Roger Binns
2014-02-11  3:13       ` cwillu
2014-02-11  3:35         ` ronnie sahlberg
2014-02-11 19:58         ` Roger Binns
2014-02-10 22:14   ` Goffredo Baroncelli
2014-02-10 22:26 ` Goffredo Baroncelli
2014-02-10 22:56   ` cwillu
2014-02-11 13:14   ` Josef Bacik
2014-02-11 18:20     ` Goffredo Baroncelli
2014-02-11 18:33       ` Josef Bacik
2014-02-11 18:46         ` Goffredo Baroncelli
2014-02-11 18:56       ` Hugo Mills
2014-02-12 21:03         ` Goffredo Baroncelli
2014-02-11 20:53 ` Sandy McArthur
2014-02-12  3:09   ` Kostia Khlebopros
2014-02-12 21:12     ` Goffredo Baroncelli
2014-02-12  3:55 ` Anand Jain
