From: Marc Haber <mh+linux-btrfs@zugschlus.de>
To: linux-btrfs@vger.kernel.org
Subject: Re: Again, no space left on device while rebalancing and recipe doesn't work
Date: Sat, 5 Mar 2016 15:28:36 +0100
Message-ID: <20160305142836.GD1902@torres.zugschlus.de>
In-Reply-To: <pan$b2129$4febff94$eb9a65a0$72f9d0cf@cox.net>

Hi,

I have not seen this message come back to the mailing list. Was it
too long again?

I have pastebinned the log at http://paste.debian.net/412118/

On Tue, Mar 01, 2016 at 08:51:32PM +0000, Duncan wrote:
> There has been something bothering me about this thread that I wasn't 
> quite pinning down, but here it is.
> 
> If you look at the btrfs fi df/usage numbers, data chunk total vs. used 
> are very close to one another (113 GiB total, 112.77 GiB used, single 
> profile, assuming GiB data chunks, that's only a fraction of a single 
> data chunk unused), so balance would seem to be getting thru them just 
> fine.

Where would you see those numbers? I have those, pre-balance:

Mar  2 20:28:01 fan root: Data, single: total=77.00GiB, used=76.35GiB
Mar  2 20:28:01 fan root: System, DUP: total=32.00MiB, used=48.00KiB
Mar  2 20:28:01 fan root: Metadata, DUP: total=86.50GiB, used=2.11GiB
Mar  2 20:28:01 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B
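
(For reference, those lines are btrfs filesystem df output; assuming
the filesystem is mounted at /mnt, the numbers come from something
like

  btrfs filesystem df /mnt
  # 'usage' additionally shows unallocated space per device:
  btrfs filesystem usage /mnt

with /mnt being just a placeholder here.)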

> But there's a /huge/ spread between total vs. used metadata (32 GiB 
> total, under 4 GiB used, clearly _many_ empty or nearly empty chunks), 
> implying that has not been successfully balanced in quite some time, if 
> ever.

This is possible, yes.
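
If so, a filtered metadata balance ought to compact the mostly-empty
chunks. A sketch, with /mnt as a stand-in mount point and the usage
threshold picked more or less out of thin air:

  # rewrite only metadata chunks that are at most 50% full
  btrfs balance start -musage=50 /mnt
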

>   So I'd surmise the problem is in metadata, not in data.
> 
> Which would explain why balancing data works fine, but a whole-filesystem 
> balance doesn't, because it's getting stuck on the metadata, not the data.
> 
> Now the balance metadata filters include system as well, by default, and 
> the -mprofiles=dup and -sprofiles=dup balances finished, apparently 
> without error, which throws a wrench into my theory.

It also finishes, without changing things for the better. Post-balance:
Mar  2 21:55:37 fan root: Data, single: total=77.00GiB, used=76.36GiB
Mar  2 21:55:37 fan root: System, DUP: total=32.00MiB, used=80.00KiB
Mar  2 21:55:37 fan root: Metadata, DUP: total=99.00GiB, used=2.11GiB
Mar  2 21:55:37 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B

Wait, the Metadata _total_ actually grew, from 86.50GiB to 99.00GiB,
while used stayed at 2.11GiB? That is almost 97GiB of allocated but
essentially empty metadata chunks.

> But while we have the btrfs fi df from before the attempt with the 
> profiles filters, we don't have the same output from after.
We now have everything. New log attached.

> > I'd like to remove unused snapshots and keep the number of them to 4
> > digits, as a workaround.
> 
> I'll strongly second that recommendation.  Btrfs is known to have 
> snapshot scaling issues at 10K snapshots and above.  My strong 
> recommendation is to limit snapshots per filesystem to 3000 or less, with 
> a target of 2000 per filesystem or less if possible, and an ideal of 1000 
> per filesystem or less if it's practical to keep it to that, which it 
> should be with thinning, if you're only snapshotting 1-2 subvolumes, but 
> may not be if you're snapshotting more.

I'm snapshotting /home every 10 minutes; the filesystem that I have
been posting logs from has about 400 snapshots, and snapshot cleanup
works fine. The slow snapshot removal is on a different filesystem on
the same host, which sits on a rotating-rust HDD and is much bigger.
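
(A 10-minute read-only snapshot job can be as simple as the following,
run from cron; the paths are placeholders, not a recommendation:

  btrfs subvolume snapshot -r /home /home/.snapshots/$(date +%Y%m%d-%H%M)

plus a matching "btrfs subvolume delete" in the thinning script.)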

> By 3000 snapshots per filesystem, you'll be beginning to notice slowdowns 
> in some btrfs maintenance commands if you're sensitive to it, tho it's 
> still at least practical to work with, and by 10K, it's generally 
> noticeable by all, at least once they thin down to 2K or so, as it's 
> suddenly faster again!  Above 100K, some btrfs maintenance commands slow 
> to a crawl and doing that sort of maintenance really becomes impractical 
> enough that it's generally easier to backup what you need to and blow 
> away the filesystem to start again with a new one, than it is to try to 
> recover the existing filesystem to a workable state, given that 
> maintenance can at that point take days to weeks.

Ouch. This should not be the case, or btrfs subvolume snapshot should
at least emit a warning. It is not good that it is so easy to get a
filesystem into a state this bad.
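
Until such a warning exists, counting by hand is at least easy,
assuming a mount point of /mnt:

  # -s restricts the listing to snapshot subvolumes
  btrfs subvolume list -s /mnt | wc -l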

> So 5-digits of snapshots on a filesystem is definitely well outside of 
> the recommended range, to the point that in some cases, particularly 
> approaching 6-digits of snapshots, it'll be more practical to simply 
> ditch the filesystem and start over, than to try to work with it any 
> longer.  Just don't do it; setup your thinning schedule so your peak is 
> 3000 snapshots per filesystem or under, and you won't have that problem 
> to worry about. =:^)

That needs to be documented prominently. The ZFS fanbois will love that.

> Oh, and btrfs quota management exacerbates the scaling issues 
> dramatically.  If you're using btrfs quotas

Am not, thankfully.
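
(Easy enough to double-check, /mnt again being a placeholder: "btrfs
qgroup show /mnt" errors out when quotas were never enabled.)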

Greetings
Marc

-- 
-----------------------------------------------------------------------------
Marc Haber         | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: +49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: +49 6224 1600421
