From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Inconsistent free space with false ENOSPC
Date: Wed, 23 Nov 2016 06:09:04 +0000 (UTC)
Message-ID: <pan$aca24$249fdeaf$20473dcc$c7efe94c@cox.net>
In-Reply-To: 010201588d22ec8f-40a41aa2-9046-462a-ab95-347f101dfd02-000000@eu-west-1.amazonses.com
Martin Raiber posted on Tue, 22 Nov 2016 17:43:46 +0000 as excerpted:
> On 22.11.2016 15:16 Martin Raiber wrote:
>> ...
>> Interestingly,
>> after running "btrfs check --repair" "df" shows 0 free space (Used
>> 516456408 Available 0), being inconsistent with the below other btrfs
>> free space information.
>>
>> btrfs fi usage output:
>>
>> Overall:
>> Device size: 512.00GiB
>> Device allocated: 512.00GiB
>> Device unallocated: 1.04MiB
>> Device missing: 0.00B
>> Used: 492.03GiB
>> Free (estimated): 19.59GiB (min: 19.59GiB)
>> Data ratio: 1.00
>> Metadata ratio: 2.00
>> Global reserve: 512.00MiB (used: 326.20MiB)
>>
>> Data,single: Size:507.98GiB, Used:488.39GiB
>> /dev/mapper/LUKS-CC-9a6043feb9d946269555a71ec0742c8b 507.98GiB
>>
>> Metadata,DUP: Size:2.00GiB, Used:1.82GiB
>> /dev/mapper/LUKS-CC-9a6043feb9d946269555a71ec0742c8b 4.00GiB
>>
>> System,DUP: Size:8.00MiB, Used:80.00KiB
>> /dev/mapper/LUKS-CC-9a6043feb9d946269555a71ec0742c8b 16.00MiB
>>
>> Unallocated:
>> /dev/mapper/LUKS-CC-9a6043feb9d946269555a71ec0742c8b 1.04MiB
> Looking at the code, it seems df shows zero if the available metadata
> space is smaller than the used global reserve. So this file system might
> be out of metadata space.
Yes, you're in a *serious* metadata bind.
Any time global reserve has anything above zero usage, it means the
filesystem is in dire straits, and well over half of your global reserve
is used, a state that is quite rare as btrfs really tries hard not to use
that space at all under normal conditions and under most conditions will
ENOSPC before using the reserve at all.
And the global reserve comes from metadata but isn't accounted in
metadata usage, so your available metadata is actually negative by the
amount of global reserve used.
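
As a rough sketch of that accounting, using only the figures from the btrfs fi usage output quoted above (the variable names are illustrative, and the df check follows Martin's reading of the code rather than anything verified here):

```python
# Figures taken from the "btrfs fi usage" output quoted above.
MIB_PER_GIB = 1024

metadata_size_mib = 2.00 * MIB_PER_GIB   # Metadata,DUP: Size:2.00GiB
metadata_used_mib = 1.82 * MIB_PER_GIB   # Metadata,DUP: Used:1.82GiB
reserve_used_mib = 326.20                # Global reserve: (used: 326.20MiB)

# Space that looks free if only the metadata "Used" figure is considered.
apparent_free_mib = metadata_size_mib - metadata_used_mib

# Used global reserve is not counted in metadata "Used", so subtract it too.
effective_free_mib = apparent_free_mib - reserve_used_mib

# Condition Martin describes: df reports zero when the available metadata
# space is smaller than the used global reserve.
df_reports_zero = apparent_free_mib < reserve_used_mib

print(f"apparent free metadata:  {apparent_free_mib:.2f} MiB")
print(f"effective free metadata: {effective_free_mib:.2f} MiB")
print(f"df would report zero:    {df_reports_zero}")
```

The effective figure comes out negative, which is the point: the metadata chunks are oversubscribed even though the Used number alone suggests a little room.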
Meanwhile, all available space is allocated to either data or metadata
chunks already -- no unallocated space left to allocate new metadata
chunks to take care of the problem (well, ~1 MiB unallocated, but that's
not enough to allocate a chunk, metadata chunks being nominally 256 MiB
in size and with metadata dup, a pair of metadata chunks must be
allocated together, so 512 MiB would be needed, and of course even if the
1 MiB could be allocated, it'd be ~1/2 MiB worth of metadata due to
metadata-dup and you're 300+ MiB into global reserve, so it wouldn't even
come close to fixing the problem).
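
The same point as a small calculation, assuming the nominal 256 MiB metadata chunk size and the two copies that metadata dup implies (names are illustrative only):

```python
# Numbers from the quoted "btrfs fi usage" output and the paragraph above.
unallocated_mib = 1.04          # Device unallocated: 1.04MiB
metadata_chunk_mib = 256        # nominal metadata chunk size
dup_copies = 2                  # metadata DUP stores two copies
reserve_used_mib = 326.20       # Global reserve: (used: 326.20MiB)

# Raw space needed to allocate one more DUP metadata chunk pair.
needed_mib = metadata_chunk_mib * dup_copies

# Can a new metadata chunk pair be allocated from what is left?
can_allocate = unallocated_mib >= needed_mib

# Even if the leftover could be used, DUP halves its usable capacity.
usable_if_allocated_mib = unallocated_mib / dup_copies

# How far short of covering the used reserve that would still be.
shortfall_mib = reserve_used_mib - usable_if_allocated_mib

print(f"needed for a DUP chunk pair: {needed_mib} MiB")
print(f"can allocate:                {can_allocate}")
print(f"usable from leftover:        {usable_if_allocated_mib:.2f} MiB")
print(f"still short by:              {shortfall_mib:.2f} MiB")
```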
Now normally, as mentioned in the ENOSPC discussion in the FAQ on the
wiki, the fix is to temporarily add (btrfs device add) another device of
some GiB (32 GiB should do reasonably well, 8 GiB may; a USB thumb drive
of suitable size can be used if necessary) and use the space it makes
available to do a balance (-dusage= incrementing from 0 to perhaps 30 to
70 percent; higher numbers will take longer and may not work at first) in
order to combine partially used chunks and free enough space to then
remove (btrfs device remove) the temporarily added device.
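
The sequence just described can be sketched as a list of commands. This only builds and prints them; it runs nothing. The device /dev/sdX, the mount point /mnt/backup and the exact usage steps are placeholders, not values from this thread:

```python
import shlex


def recovery_commands(temp_device, mountpoint,
                      usage_steps=(0, 5, 10, 20, 30, 50, 70)):
    """Build (but do not run) the add / balance / remove command sequence."""
    # Step 1: add the temporary device so balance has room to work.
    commands = [["btrfs", "device", "add", temp_device, mountpoint]]
    # Step 2: balance data chunks with a gradually increasing usage filter.
    for pct in usage_steps:
        commands.append(
            ["btrfs", "balance", "start", f"-dusage={pct}", mountpoint]
        )
    # Step 3: remove the temporary device once enough space is unallocated.
    commands.append(["btrfs", "device", "remove", temp_device, mountpoint])
    return commands


# Hypothetical device and mount point, for illustration only.
commands = recovery_commands("/dev/sdX", "/mnt/backup")
for argv in commands:
    print(shlex.join(argv))
```

In practice one would check btrfs fi usage between balance runs and stop increasing the usage value once enough space shows as unallocated; the remove step will not succeed until the data on the temporary device fits back on the original one.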
However, in your case the data usage is 488 of 508 GiB on a 512 GiB
device with space needed for several GiB of metadata as well, so while in
theory you could free up ~20 GiB of space that way and that should get
you out of the immediate bind, the filesystem will still be very close to
full, particularly after clearing out the global reserve usage, with
perhaps 16 GiB unallocated at best, ~97% used. And as any veteran
sysadmin or filesystem expert will tell you, filesystems in general like
10-20% free in order to be able to "breathe" or work most efficiently,
with btrfs being no exception, so while the above might get you out of
the immediate bind, it's unlikely to work for long.
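
For a sense of scale, here is the arithmetic behind those percentages, treating unallocated space as free space the way the paragraph above does (a simplification; the names are illustrative):

```python
device_gib = 512.0            # Device size: 512.00GiB
best_case_free_gib = 16.0     # optimistic unallocated space after balancing

# Fraction of the device still in use in the best case.
used_pct = (device_gib - best_case_free_gib) / device_gib * 100

# Free space corresponding to the 10-20% rule of thumb.
target_free_low_gib = device_gib * 0.10
target_free_high_gib = device_gib * 0.20

# Additional space to free beyond what the balance alone would reclaim.
extra_low_gib = target_free_low_gib - best_case_free_gib
extra_high_gib = target_free_high_gib - best_case_free_gib

print(f"best case used: {used_pct:.1f}%")
print(f"10-20% free is: {target_free_low_gib:.1f} to {target_free_high_gib:.1f} GiB")
print(f"still to free:  {extra_low_gib:.1f} to {extra_high_gib:.1f} GiB")
```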
Which means once you're out of the immediate bind, you're still going to
need to free some space, one way or another, and that might not be as
simple as the words make it appear.
It's worth noting that btrfs keeps the original full extents around until
all references to all (4 KiB on x86/amd64) blocks within the extent are
gone. So if you have an originally half GiB file that was in a single
extent, and have heavily rewritten most of it, thus triggering COW to
write most blocks elsewhere, if a single 4 KiB block from the original
remains unrewritten, it's quite likely that 4 KiB block from the original
will still be pinning the entire half GiB extent, keeping it from being
freed.
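
A deliberately simplified toy model of that pinning effect follows. It is not how btrfs tracks references internally; it only illustrates the rule that an extent is released when no block in it is referenced any longer:

```python
BLOCK_BYTES = 4 * 1024                 # 4 KiB block, as on x86/amd64
EXTENT_BYTES = 512 * 1024 * 1024       # the half GiB extent from the example


def pinned_bytes(referenced_blocks, extent_bytes=EXTENT_BYTES):
    """Toy rule: the whole extent stays on disk while any block is referenced."""
    return extent_bytes if referenced_blocks else 0


# Originally every block of the extent is referenced by the file.
total_blocks = EXTENT_BYTES // BLOCK_BYTES
referenced = set(range(total_blocks))

# Rewrite all but the last block: COW puts the new data elsewhere and
# drops the file's reference to each old block.
for block in range(total_blocks - 1):
    referenced.discard(block)

live_bytes = len(referenced) * BLOCK_BYTES
print(f"blocks still referenced: {len(referenced)} of {total_blocks}")
print(f"live data in old extent: {live_bytes} bytes")
print(f"space still pinned:      {pinned_bytes(referenced)} bytes")

# Only once the last block is rewritten too can the old extent go away.
referenced.discard(total_blocks - 1)
print(f"pinned after full rewrite: {pinned_bytes(referenced)} bytes")
```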
Of course snapshots, which you mentioned, complicate the picture by
continuing to keep references to extents as they were at the time the
snapshot was taken.
So getting rid of your oldest snapshots will probably release some space,
which can then be rebalanced into unallocated space, but it's quite
possible that you won't be able to reclaim as much space that way as you
might expect, particularly if as described above, some of your files are
mostly but not entirely rewritten since the original write, and a few
unchanged blocks remain, continuing to lock much larger extents into
place due to current references to blocks in the original extent.
What you might have to do is eliminate all snapshots holding references
to the old files, then copy the current files elsewhere (cp
--reflink=never, or simply a cross-filesystem copy so reflink-copy can't
be used), delete the existing copy, and copy/move them back into place,
thus releasing the old extent references and freeing the space those old
extents took.
Of course depending on the circumstances and how your backups are handled
(noting your urbackup.org email address here), it may be simpler to start
with a fresh filesystem, either blowing away the existing one and
starting over, or archiving the existing one as-is and starting over.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman