From: Andrei Borzenkov <arvidjaar@gmail.com>
To: Martin Steigerwald <martin@lichtvoll.de>,
Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: Understanding "Used" in df
Date: Mon, 27 Jul 2020 19:42:58 +0300
Message-ID: <558ef4c5-ee61-8a0d-5ca5-43a07d6e64ac@gmail.com>
In-Reply-To: <1622535.kDMmNaIAU4@merkaba>
27.07.2020 14:38, Martin Steigerwald wrote:
> Zygo Blaxell - 23.07.20, 06:51:06 CEST:
>> On Wed, Jul 22, 2020 at 05:10:19PM +0200, Martin Steigerwald wrote:
>>> I have:
>>>
>>> % LANG=en df -hT /home
>>> Filesystem            Type   Size  Used Avail Use% Mounted on
>>> /dev/mapper/sata-home btrfs  300G  175G  123G  59% /home
>>>
>>> And:
>>>
>>> merkaba:~> btrfs fi sh /home
>>> Label: 'home' uuid: […]
>>>
>>> Total devices 2 FS bytes used 173.91GiB
>>> devid 1 size 300.00GiB used 223.03GiB path /dev/mapper/sata-home
>>> devid 2 size 300.00GiB used 223.03GiB path /dev/mapper/msata-home
>>>
>>> merkaba:~> btrfs fi df /home
>>> Data, RAID1: total=218.00GiB, used=171.98GiB
>>> System, RAID1: total=32.00MiB, used=64.00KiB
>>> Metadata, RAID1: total=5.00GiB, used=1.94GiB
>>> GlobalReserve, single: total=490.48MiB, used=0.00B
>>>
>>> As well as:
>>>
>>> merkaba:~> btrfs fi usage -T /home
>>>
>>> Overall:
>>> Device size: 600.00GiB
>>> Device allocated: 446.06GiB
>>> Device unallocated: 153.94GiB
>>> Device missing: 0.00B
>>> Used: 347.82GiB
>>> Free (estimated): 123.00GiB (min: 123.00GiB)
>>> Data ratio: 2.00
>>> Metadata ratio: 2.00
>>> Global reserve: 490.45MiB (used: 0.00B)
>>> Multiple profiles: no
>>>
>>>                            Data      Metadata System
>>> Id Path                    RAID1     RAID1    RAID1    Unallocated
>>> -- ----------------------  --------- -------- -------- -----------
>>>  1 /dev/mapper/sata-home   218.00GiB  5.00GiB 32.00MiB    76.97GiB
>>>  2 /dev/mapper/msata-home  218.00GiB  5.00GiB 32.00MiB    76.97GiB
>>> -- ----------------------  --------- -------- -------- -----------
>>>    Total                   218.00GiB  5.00GiB 32.00MiB   153.94GiB
>>>    Used                    171.97GiB  1.94GiB 64.00KiB
>>>
>>> I think I understand all of it, including just 123G instead of
>>> 300 - 175 = 125 GiB "Avail" in df -hT.
>>>
>>> But why 175 GiB "Used" in 'df -hT' when just 173.91GiB (see 'btrfs
>>> fi sh') is allocated *within* the block group / chunks?
>>
>> statvfs (the 'df' syscall) does not report a "used" number, only total
>> and available btrfs data blocks (no metadata blocks are counted).
>> 'df' computes "used" by subtracting f_blocks - f_bavail.
>>
>> 122.99875 = 300 - 171.97 - 5 - .03125
>>
>> df_free = total - data_used - metadata_allocated - system_allocated
>
> I get that one. That is for what is still free.
>
> But I do not understand "Used" in df as.
>
> 1) If it were doing 300 GiB - what is still available, it would do 300-122.99 = 177.01
>
df "Used" is computed as "total" - "free", where "free" is reported by
the filesystem. btrfs free is 76.97GiB + 49.12GiB. I suppose btrfs
internally rounds at least the first number to the full chunk size,
which gives us close to 125GiB.
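
The arithmetic can be checked with a minimal sketch that uses only the
figures quoted from 'btrfs fi sh' above. It assumes that 49.12GiB is the
per-device allocated space minus "FS bytes used"; the message does not
say where that number comes from, and the variable names are made up for
illustration. The sketch does not model any chunk-size rounding, which
remains a guess.

```python
# Figures copied from the quoted 'btrfs fi sh' output, in GiB.
DEVICE_SIZE = 300.00       # size of each RAID1 device
DEVICE_ALLOCATED = 223.03  # "used" per devid, i.e. space allocated to chunks
FS_BYTES_USED = 173.91     # "FS bytes used"

# Space on one device that is not allocated to any chunk.
unallocated = DEVICE_SIZE - DEVICE_ALLOCATED
# Assumption: the second term is allocated-but-unused space inside chunks.
free_in_chunks = DEVICE_ALLOCATED - FS_BYTES_USED
free = unallocated + free_in_chunks

print(f"unallocated    = {unallocated:.2f} GiB")
print(f"free in chunks = {free_in_chunks:.2f} GiB")
print(f"free           = {free:.2f} GiB")
print(f"total - free   = {DEVICE_SIZE - free:.2f} GiB")
```

Without any rounding, "total" - "free" here simply equals "FS bytes used",
so the difference from the 175G that df shows has to come from somewhere
else.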
> 2) If it would add together all allocated within a chunk…
>
> 171.98 GiB used in data + 64 KiB used in system + 1.94 GiB used in metadata ~= 174 GiB
>
> 3) It may consider all allocated system and metadata chunks as lost for writing
> data:
>
> 171.98 used in data + 32 MiB allocated in system + 5 GiB allocated in metadata ~= 176.98 GiB
>
> 4) It may consider 2 of those 5 GiB chunks for metadata as reclaimable and
> then it would go like this:
>
> 171.98 used in data + 32 MiB allocated in system + 3 GiB metadata ~= 174.98 GiB
>
"df" does not know anything about data vs. metadata vs. system reserve.
It only has the two values the filesystem returns, "free" and "avail",
and yes, they are computed independently.

"used" + "free" == "total", but do not expect "avail" to have any direct
relation to the other metrics.

Unfortunately, "df" does not display "free" (I was wrong in the other
post). But using stat ...
$ LANGUAGE=en stat -f .
...
Block size: 4096 Fundamental block size: 4096
Blocks: Total: 115164174 Free: 49153062 Available: 43297293
$ LANGUAGE=en df -B 4K .
Filesystem     4K-blocks     Used Available Use% Mounted on
/dev/sda4      115164174 66011112  43297293  61% /

115164174 - 49153062 == 66011112

But there is no way to compute "Available" from the other values; it is
whatever the filesystem returns.
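
The same relation can be sketched in Python on top of os.statvfs(),
which exposes the raw f_blocks, f_bfree and f_bavail counters. The
helper name df_numbers is made up for illustration. The Use% formula
(used / (used + avail), rounded up) is an assumption about how GNU df
behaves, but it does reproduce the 61% shown above.

```python
import math
import os


def df_numbers(total, free, avail):
    """Derive df-style "Used" and "Use%" from statvfs block counts.

    total, free and avail correspond to f_blocks, f_bfree and f_bavail.
    """
    used = total - free
    # Assumption: Use% is used / (used + avail), rounded up to an integer.
    visible = used + avail
    use_pct = math.ceil(100 * used / visible) if visible else 0
    return used, use_pct


# Block counts taken from the 'stat -f' output above (4K blocks).
used, use_pct = df_numbers(115164174, 49153062, 43297293)
print(f"Used: {used} blocks, Use%: {use_pct}%")

# The same calculation against a live filesystem (Unix only).
if hasattr(os, "statvfs"):
    st = os.statvfs(".")
    print(df_numbers(st.f_blocks, st.f_bfree, st.f_bavail))
```

Note that "avail" is only ever passed through: nothing in the function
derives it from "total" or "free".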
> That would be about right, but also as unpredictable as it can get.
>
>> Inline files count as metadata instead of data, so even when you are
>> out of data blocks (zero blocks free in df), you can sometimes still
>> write small files. Sometimes, when you write one small file, 1GB of
>> available space disappears as a new metadata block group is allocated.
>>
>> 'df' doesn't take metadata or data sharing into account at all, or
>> the space required to store csums, or bursty metadata usage workloads.
>> 'df' can't predict these events, so its accuracy is limited to no
>> better than about 0.5% of the size of the filesystem or +/- 1GB,
>> whichever is larger.
>
> So just assume that df output can be +/- 1 GiB off?
>
> I am just wondering because I aimed to explain this to participants of
> my Linux courses… and for now I have the honest answer that I have
> no clue why df displays "175 GiB" as used.
>
>>> Does this have something to do with that global reserve thing?
>>
>> 'df' currently tells you nothing about metadata (except in kernels
>> before 5.6, when you run too low on metadata space, f_bavail is
>> abruptly set to zero). That's about the only impact global reserve
>> has on 'df'.
>
> But it won't claim used or even just allocated metadata space as available
> for writing data?
>
>> Global reserve is metadata allocated-but-unused space, and all
>> metadata is not visible to df. The reserve ensures that critical
>> btrfs metadata operations can complete without running out of space,
>> by forcing non-critical long-running operations to commit
>> transactions when no metadata space is available outside the reserved
>> pool. It mostly works, though there are still a few bugs left that
>> lead to EROFS when metadata runs low.
>
> Hmmm, thanks.
>
> But as far as I understood also from the other post, Global Reserve is
> reserved but not reported as used in df?
>
> I am not sure whether I am getting it though.
>
> Best,
>