From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: btrfs space used issue
Date: Wed, 28 Feb 2018 14:24:40 -0500
Message-ID: <2892a866-fdc3-b337-4cd4-2cd4a18b9f21@gmail.com>
In-Reply-To: <pan$e116f$e5aa2400$88c9d453$f5589ed1@cox.net>
On 2018-02-28 14:09, Duncan wrote:
> vinayak hegde posted on Tue, 27 Feb 2018 18:39:51 +0530 as excerpted:
>
>> I am using btrfs, but I am seeing a huge size difference between
>> du -sh and df -h on an SSD.
>>
>> mount:
>> /dev/drbd1 on /dc/fileunifier.datacache type btrfs
>>
>> (rw,noatime,nodiratime,flushoncommit,discard,nospace_cache,recovery,commit=5,subvolid=5,subvol=/)
>>
>>
>> du -sh /dc/fileunifier.datacache/ - 331G
>>
>> df -h:  /dev/drbd1  746G  346G  398G  47%  /dc/fileunifier.datacache
>>
>> btrfs fi usage /dc/fileunifier.datacache/
>> Overall:
>>     Device size:          745.19GiB
>>     Device allocated:     368.06GiB
>>     Device unallocated:   377.13GiB
>>     Device missing:           0.00B
>>     Used:                 346.73GiB
>>     Free (estimated):     396.36GiB  (min: 207.80GiB)
>>     Data ratio:                1.00
>>     Metadata ratio:            2.00
>>     Global reserve:       176.00MiB  (used: 0.00B)
>>
>> Data,single: Size:365.00GiB, Used:345.76GiB
>>    /dev/drbd1  365.00GiB
>>
>> Metadata,DUP: Size:1.50GiB, Used:493.23MiB
>>    /dev/drbd1    3.00GiB
>>
>> System,DUP: Size:32.00MiB, Used:80.00KiB
>>    /dev/drbd1   64.00MiB
>>
>> Unallocated:
>>    /dev/drbd1  377.13GiB
>>
>>
>> Even if we consider 6G of metadata, it's 331+6 = 337.
>> Where is the other 9GB that shows as used?
>>
>> Please explain.
>
> Taking a somewhat higher level view than Austin's reply: on btrfs, plain
> df and, to a somewhat lesser extent, du[1] are at best good /estimations/
> of usage and, for df, of space remaining.  Btrfs' COW/copy-on-write
> semantics and features such as the various replication/raid schemes,
> snapshotting, etc, are things df/du don't really understand, as they
> simply don't have, and weren't /designed/ to have, that level of
> filesystem-specific insight.  So they, particularly df due to its
> whole-filesystem focus, aren't particularly accurate on btrfs.  Consider
> their output more a "best estimate given the rough data we have
> available" sort of report.
>
> To get the real filesystem focused picture, use btrfs filesystem usage,
> or btrfs filesystem show combined with btrfs filesystem df. That's what
> you should trust, altho various utilities that check for available space
> before doing something often use the kernel-call equivalent of (plain) df
> to ensure they have the required space, so it's worthwhile to keep an eye
> on it as the filesystem fills, as well.  If it gets too far out of sync
> with btrfs filesystem usage, or if btrfs filesystem usage unallocated
> drops below, say, five gigs, or data or metadata size vs. used shows a
> spread of multiple gigs (your data shows a spread of ~20 gigs ATM, but
> with 377 gigs still unallocated it's no big deal; it /would/ be a big
> deal if those were reversed, tho: only 20 gigs unallocated and a spread
> of 300+ gigs in data size vs. used), then corrective action such as a
> filtered rebalance may be necessary.
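>
> For example, something along these lines; this is only a sketch, using
> your mountpoint, and the 50% usage filter is just a common starting
> point, not a tuned value:
>
>     btrfs filesystem usage /dc/fileunifier.datacache
>     # reclaim data chunks that are under half used:
>     btrfs balance start -dusage=50 /dc/fileunifier.datacache
>
> The -dusage filter means only data chunks under 50% used get rewritten,
> so it's far cheaper than a full unfiltered balance.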
>
> There are entries in the FAQ discussing free space issues that you should
> definitely read if you haven't, altho they obviously address the general
> case, so if you have more questions about an individual case after having
> read them, here is a good place to ask. =:^)
>
> Everything having to do with "space" (see both the 1/Important-questions
> and 4/Common-questions sections) here:
>
> https://btrfs.wiki.kernel.org/index.php/FAQ
>
> Meanwhile, it's worth noting that, not entirely intuitively, btrfs' COW
> implementation can "waste" space on larger files that are mostly, but not
> entirely, rewritten.  An example is the best way to demonstrate.
> Consider each x a used block and each - an unused but still referenced
> block:
>
> Original file, written as a single extent (diagram works best with
> monospace, not arbitrarily rewrapped):
>
> xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> First rewrite of part of it:
>
> xxxxxxxxxxx------xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>            xxxxxx
>
>
> Nth rewrite, where some blocks of the original still remain as
> originally written:
>
> -----------------xxx------------------------------
> xxx---
>    xxxx----xxx
>        xxxx
>               xxx
>                     xxxxxxxxxxxxxxxxxxxxx---xxxxxx
>                                          xxx
>
>
> As you can see, that first really large extent remains fully referenced,
> altho only three blocks of it remain in actual use.  All those -s won't
> be returned to free space until those last three blocks get rewritten as
> well, thus freeing the entire original extent.
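>
> A rough way to watch this happen, if you're curious; the paths here are
> hypothetical, and it assumes the file fits in a single extent (128MiB
> being, as far as I know, btrfs' maximum data extent size):
>
>     dd if=/dev/zero of=/mnt/test/f bs=1M count=128   # one large extent
>     sync
>     # COW-rewrite everything except the first 1MiB:
>     dd if=/dev/zero of=/mnt/test/f bs=1M count=127 seek=1 conv=notrunc
>     sync
>
> du will still report 128MiB for the file, while btrfs filesystem usage
> should show data usage approaching 255MiB, because the original extent
> stays pinned by that first still-referenced 1MiB.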
>
> I believe this effect is what Austin was referencing when he suggested
> the defrag, tho defrag won't necessarily /entirely/ clear it up. One way
> to be /sure/ it's cleared up would be to rewrite the entire file,
> deleting the original, either by copying it to a different filesystem and
> back (with the off-filesystem copy guaranteeing that it can't use reflinks
> to the existing extents), or by using cp's --reflink=never option.
> (FWIW, I prefer the former, just to be sure, using temporary copies to a
> suitably sized tmpfs for speed where possible, tho obviously if the file
> is larger than your memory size that's not possible.)
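>
> In command form, the local-rewrite variant might look like this, with
> hypothetical paths:
>
>     cp --reflink=never /mnt/data/bigfile /mnt/data/bigfile.new
>     mv /mnt/data/bigfile.new /mnt/data/bigfile
>
> (The mv replaces the original, so its extents, other reflinks aside,
> are freed.)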
Correct, this is why I recommended trying a defrag.  I've actually never
seen things so bad that a simple defrag didn't fix them, however (though
I have seen a few cases where the target extent size had to be set
higher than the default of 20MB).  Also, as counter-intuitive as it
might sound, autodefrag really doesn't help much with this, and can
actually make things worse.
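
For concreteness, the sort of invocation I mean is the following, where
128M is simply an example of "higher than the default" rather than a
tuned recommendation:

    btrfs filesystem defragment -t 128M /path/to/file
    # or recursively over a directory tree:
    btrfs filesystem defragment -r -t 128M /path/to/dir

Just be aware that defrag breaks reflinks, so with snapshots around it
can temporarily *increase* space usage.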
This is also one of the things I was referring to in item 6 of the list
of causes I gave, partly because I couldn't come up with a good way to
explain it clearly (which I feel you did an excellent job of above).
The other big one there is the handling of xattrs and ACLs, which get
accounted for by `df` but generally aren't by `du` (at least, not
reliably).
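
If you want to see the xattr side of that, a trivial (and hypothetical)
demonstration:

    touch testfile
    setfattr -n user.note -v 'a reasonably long attribute value' testfile
    du -sh testfile   # unchanged; the xattr lives in the metadata tree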
>
> Of course where applicable, snapshots and dedup keep reflink-references
> to the old extents, so they must be adjusted or deleted as well, to
> properly free that space.
>
> ---
> [1] du: Because its purpose is different. du's primary purpose is
> telling you in detail what space files take up, per-file and per-
> directory, without particular regard to usage on the filesystem itself.
> df's focus, by contrast, is on the filesystem as a whole. So where two
> files share the same extent due to reflinking, du should and does count
> that usage for each file, because that's what each file /uses/ even if
> they both use the same extents.
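>
> An easy way to demonstrate that, with a hypothetical mountpoint:
>
>     cp --reflink=always /mnt/bigfile /mnt/bigfile.clone
>     du -sh /mnt   # counts the shared extents once per file
>     df -h /mnt    # used space barely moves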