From: Marc Haber <mh+linux-btrfs@zugschlus.de>
To: linux-btrfs@vger.kernel.org
Subject: Re: Again, no space left on device while rebalancing and recipe doesnt work
Date: Sat, 5 Mar 2016 15:28:36 +0100 [thread overview]
Message-ID: <20160305142836.GD1902@torres.zugschlus.de> (raw)
In-Reply-To: <pan$b2129$4febff94$eb9a65a0$72f9d0cf@cox.net>
Hi,
I have not seen this message coming back to the mailing list. Was it
again too long?
I have pastebinned the log at http://paste.debian.net/412118/
On Tue, Mar 01, 2016 at 08:51:32PM +0000, Duncan wrote:
> There has been something bothering me about this thread that I wasn't
> quite pinning down, but here it is.
>
> If you look at the btrfs fi df/usage numbers, data chunk total vs. used
> are very close to one another (113 GiB total, 112.77 GiB used, single
> profile, assuming GiB data chunks, that's only a fraction of a single
> data chunk unused), so balance would seem to be getting thru them just
> fine.
Where would you see those numbers? I have those, pre-balance:
Mar 2 20:28:01 fan root: Data, single: total=77.00GiB, used=76.35GiB
Mar 2 20:28:01 fan root: System, DUP: total=32.00MiB, used=48.00KiB
Mar 2 20:28:01 fan root: Metadata, DUP: total=86.50GiB, used=2.11GiB
Mar 2 20:28:01 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B
> But there's a /huge/ spread between total vs. used metadata (32 GiB
> total, under 4 GiB used, clearly _many_ empty or nearly empty chunks),
> implying that has not been successfully balanced in quite some time, if
> ever.
This is possible, yes.
> So I'd surmise the problem is in metadata, not in data.
>
> Which would explain why balancing data works fine, but a whole-filesystem
> balance doesn't, because it's getting stuck on the metadata, not the data.
>
> Now the balance metadata filters include system as well, by default, and
> the -mprofiles=dup and -sprofiles=dup balances finished, apparently
> without error, which throws a wrench into my theory.
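For what it's worth, the usual way to compact near-empty metadata chunks is a usage-filtered balance rather than a profiles filter. Illustrative commands only (mountpoint `/mnt` and the 30% threshold are assumptions, adjust to taste):

```shell
# Rewrite only metadata chunks that are under 30% full; btrfs packs
# their contents together and returns the freed chunks to unallocated.
btrfs balance start -musage=30 /mnt

# Check the total-vs-used spread afterwards.
btrfs filesystem df /mnt
```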
Also finishes without changing things, post-balance:
Mar 2 21:55:37 fan root: Data, single: total=77.00GiB, used=76.36GiB
Mar 2 21:55:37 fan root: System, DUP: total=32.00MiB, used=80.00KiB
Mar 2 21:55:37 fan root: Metadata, DUP: total=99.00GiB, used=2.11GiB
Mar 2 21:55:37 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B
Wait, Metadata total actually _grew_???
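To make the total-vs-used spread concrete, here is a throwaway parser (not part of btrfs-progs; the function name is mine) over `btrfs fi df`-style lines like the ones logged above:

```python
import re

# Byte multipliers for the suffixes 'btrfs fi df' prints.
UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

LINE = re.compile(r"(\w+), (\w+): total=([\d.]+)(\w+), used=([\d.]+)(\w+)")

def parse_btrfs_df(lines):
    """Map chunk type -> (total_bytes, used_bytes) from 'btrfs fi df' output."""
    out = {}
    for line in lines:
        m = LINE.search(line)
        if m:
            typ, _profile, tot, tot_u, used, used_u = m.groups()
            out[typ] = (float(tot) * UNITS[tot_u], float(used) * UNITS[used_u])
    return out

pre = [
    "Data, single: total=77.00GiB, used=76.35GiB",
    "Metadata, DUP: total=86.50GiB, used=2.11GiB",
]
tot, used = parse_btrfs_df(pre)["Metadata"]
# ~84.39 GiB of allocated-but-nearly-empty metadata chunks pre-balance
print(f"metadata slack: {(tot - used) / 2**30:.2f} GiB")
```

Run against the post-balance lines, the metadata slack comes out even larger (99.00 - 2.11 GiB), which is what makes the result so surprising.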
> But while we have the btrfs fi df from before the attempt with the
> profiles filters, we don't have the same output from after.
We now have everything. New log attached.
> > I'd like to remove unused snapshots and keep the number of them to 4
> > digits, as a workaround.
>
> I'll strongly second that recommendation. Btrfs is known to have
> snapshot scaling issues at 10K snapshots and above. My strong
> recommendation is to limit snapshots per filesystem to 3000 or less, with
> a target of 2000 per filesystem or less if possible, and an ideal of 1000
> per filesystem or less if it's practical to keep it to that, which it
> should be with thinning, if you're only snapshotting 1-2 subvolumes, but
> may not be if you're snapshotting more.
I'm snapshotting /home every 10 minutes, the filesystem that I have
been posting logs from has about 400 snapshots, and snapshot cleanup
works fine. The slow snapshot removal is a different filesystem on the
same host which is on a rotating rust HDD, and is much bigger.
> By 3000 snapshots per filesystem, you'll be beginning to notice slowdowns
> in some btrfs maintenance commands if you're sensitive to it, tho it's
> still at least practical to work with, and by 10K, it's generally
> noticeable by all, at least once they thin down to 2K or so, as it's
> suddenly faster again! Above 100K, some btrfs maintenance commands slow
> to a crawl and doing that sort of maintenance really becomes impractical
> enough that it's generally easier to backup what you need to and blow
> away the filesystem to start again with a new one, than it is to try to
> recover the existing filesystem to a workable state, given that
> maintenance can at that point take days to weeks.
Ouch. This should not be the case, or btrfs subvolume snapshot should
at least emit a warning. It is not good that it is so easy to get a
filesystem into a state this bad.
> So 5-digits of snapshots on a filesystem is definitely well outside of
> the recommended range, to the point that in some cases, particularly
> approaching 6-digits of snapshots, it'll be more practical to simply
> ditch the filesystem and start over, than to try to work with it any
> longer. Just don't do it; setup your thinning schedule so your peak is
> 3000 snapshots per filesystem or under, and you won't have that problem
> to worry about. =:^)
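A quick back-of-the-envelope check that a thinning schedule keeps 10-minute snapshots under those limits (the retention tiers below are made-up examples, not a recommendation for any particular tool):

```python
# Hypothetical thinning schedule for snapshots taken every 10 minutes:
# keep everything for a day, hourly for a week, daily for a month,
# weekly for a year.
tiers = {
    "10-minute, last 24h": 24 * 6,   # 144
    "hourly, last 7 days": 7 * 24,   # 168
    "daily, last 30 days": 30,
    "weekly, last 52 weeks": 52,
}
total = sum(tiers.values())
print(total)  # 394 -- comfortably under the ~3000-per-filesystem guideline
```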
That needs to be documented prominently. The ZFS fanbois will love that.
> Oh, and btrfs quota management exacerbates the scaling issues
> dramatically. If you're using btrfs quotas
Am not, thankfully.
Greetings
Marc
--
-----------------------------------------------------------------------------
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany | lose things." Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature | How to make an American Quilt | Fax: *49 6224 1600421