From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: Shyam Prasad N <nspmangalore@gmail.com>,
Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: Btrfs occupies more space than du reports...
Date: Fri, 23 Feb 2018 08:23:47 -0500
Message-ID: <3968047d-32ef-780c-5375-77c923d96f38@gmail.com>
In-Reply-To: <CANT5p=q76WRoc8VHdLKqb8zZrGm3h4qpt2o4-zv=M-Mmd3rtBQ@mail.gmail.com>
On 2018-02-23 06:21, Shyam Prasad N wrote:
> Hi,
>
> Can someone explain me why there is a difference in the number of
> blocks reported by df and du commands below?
>
> =====================
> # df -h /dc
> Filesystem Size Used Avail Use% Mounted on
> /dev/drbd1 746G 519G 225G 70% /dc
>
> # btrfs filesystem df -h /dc/
> Data, single: total=518.01GiB, used=516.58GiB
> System, DUP: total=8.00MiB, used=80.00KiB
> Metadata, DUP: total=2.00GiB, used=1019.72MiB
> GlobalReserve, single: total=352.00MiB, used=0.00B
>
> # du -sh /dc
> 467G /dc
> =====================
>
> df shows 519G is used. While recursive check using du shows only 467G.
> The filesystem doesn't contain any snapshots/extra subvolumes.
> Neither does it contain any mounted filesystem under /dc.
> I also considered that it could be a void left behind by one of the
> open FDs held by a process. So I rebooted the system. Still no
> changes.
>
> The situation is even worse on a few other systems with similar configuration.
>
At least part of this is a difference in how each tool computes space usage.
* `df` calls `statvfs` to get its data, which tries to count physical
allocation, accounting for replication profiles. In other words, data in
chunks with the dup, raid1, and raid10 profiles gets counted twice, data
in raid5 and raid6 chunks gets counted with a bit of extra space for the
parity, etc.
* `btrfs fi df` looks directly at the filesystem itself and counts how
much space is available to each chunk type in the `total` values and how
much space is used in each chunk type in the `used` values. These are
logical values: each block is counted once, no matter how many copies
the profile stores. If you add together the data used value and twice
the system and metadata used values (both are DUP here), you get the
used value reported by regular `df` (or close to it, since `df` rounds
at a lower precision than `btrfs fi df` does).
* `du` scans the directory tree and looks at the file allocation values
returned from `stat` calls (or just looks at file sizes if you pass the
`--apparent-size` flag to it). Like `btrfs fi df`, it reports values
that do not include the extra copies made by replication. It also has a
couple of nasty caveats on BTRFS: it reports sizes for natively
compressed files _before_ compression, and it counts reflinked blocks
once for each link.
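To make the difference between the first and third tools concrete, here is a rough Python sketch of the two measurements. It is only an approximation of what `df` and `du` do, not their actual implementation: the function names are made up for illustration, the 512-byte unit for `st_blocks` is the Linux convention, and unlike `du -x` it does not stop at mount points.

```python
import os

def statvfs_used_bytes(path):
    """Roughly what df's "Used" column is derived from: one statvfs call."""
    st = os.statvfs(path)
    # Total blocks minus free blocks, in units of the fragment size.
    return (st.f_blocks - st.f_bfree) * st.f_frsize

def du_allocated_bytes(root):
    """Roughly what du sums: per-file allocation reported by stat."""
    seen = set()   # (device, inode) pairs, so hard links count only once
    total = 0

    def add(path):
        nonlocal total
        try:
            st = os.lstat(path)
        except OSError:
            return  # entry vanished or is unreadable; skip it
        key = (st.st_dev, st.st_ino)
        if key in seen:
            return
        seen.add(key)
        # Assumption: st_blocks is in 512-byte units, as on Linux.
        total += st.st_blocks * 512

    add(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            add(os.path.join(dirpath, name))
    return total

if __name__ == "__main__":
    import sys
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print("statvfs used:", statvfs_used_bytes(target))
    print("du-style sum:", du_allocated_bytes(target))
```

The first function asks the filesystem for one global number, while the second can only add up what each file claims for itself, which is why filesystem-level overhead never shows up in it.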
Now, this doesn't explain the entirety of the discrepancy with `du`, but
it should cover the whole difference between `df` and `btrfs fi df`.
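
As a quick check of that last claim, the arithmetic can be done with the numbers from your `btrfs fi df` output. This is just a sketch; the variable names are mine, and GlobalReserve is left out because it is not part of the sum described above.

```python
KiB = 1024
MiB = 1024 ** 2
GiB = 1024 ** 3

# "used" values copied from the `btrfs filesystem df -h /dc/` output above.
data_used = 516.58 * GiB        # Data, single: stored once
system_used = 80.00 * KiB       # System, DUP: stored twice
metadata_used = 1019.72 * MiB   # Metadata, DUP: stored twice

# Physical usage: single-profile data once, DUP chunks twice.
physical_used = data_used + 2 * (system_used + metadata_used)

print(f"{physical_used / GiB:.2f} GiB")  # prints "518.57 GiB"
```

That is about 518.57 GiB, which matches the 519G that `df -h` shows once it is rounded to whole gigabytes.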