From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Reducing impact of periodic btrfs balance
Date: Thu, 19 May 2016 04:09:31 +0000 (UTC)
Message-ID: <pan$549b5$526f9b98$5d8778de$5121c19f@cox.net>
In-Reply-To: d40d8fc7-3766-1ccc-3253-9f870a0f4f85@cn.fujitsu.com
Qu Wenruo posted on Thu, 19 May 2016 09:33:19 +0800 as excerpted:
> Graham Cobb wrote on 2016/05/18 14:29 +0100:
>> Hi,
>>
>> I have a 6TB btrfs filesystem I created last year (about 60% used). It
>> is my main data disk for my home server so it gets a lot of usage
>> (particularly mail). I do frequent snapshots (using btrbk) so I have a
>> lot of snapshots (about 1500 now, although it was about double that
>> until I cut back the retention times recently).
>
> Even at 1500, it's still quite large, especially when they are all
> snapshots.
>
> The biggest problem of large amount of snapshots is, it will make any
> backref walk operation very slow. (O(n^3)~O(n^4))
> This includes: btrfs qgroup and balance, even fiemap (recently submitted
> patch will solve fiemap problem though)
>
> The btrfs design ensures snapshot creation fast, but that comes with the
> cost of backref walk.
>
>
> So, unless some super huge rework, I would prefer to keep the number of
> snapshots to a small amount, or avoid balance/qgroup.

Qu and Graham,

As you may have seen in my previous posts, my usual snapshot
recommendation is to stay under 250-300 snapshots per subvolume, and
definitely under 3000 per filesystem (preferably 2000, or 1000 if being
conservative). At the 2000 level, that allows snapshotting 6-8
subvolumes per filesystem before hitting the filesystem cap. The caps
are there because of scaling issues like the above, which are directly
related to the number of snapshots.
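
As a rough sketch of where the 6-8 figure comes from, here is the
arithmetic in Python. The function name is purely illustrative and not
anything btrfs provides; it simply divides the preferred 2000-snapshot
filesystem cap by the per-subvolume cap.

```python
def subvolumes_within_cap(filesystem_cap, per_subvolume_cap):
    """Whole subvolumes that can each reach per_subvolume_cap snapshots
    without the filesystem total exceeding filesystem_cap."""
    return filesystem_cap // per_subvolume_cap


# Preferred filesystem cap of 2000, per-subvolume caps of 300 and 250.
for per_subvolume in (300, 250):
    print(per_subvolume, subvolumes_within_cap(2000, per_subvolume))
```

That gives 6 subvolumes at 300 snapshots each and 8 at 250 each.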

Also, because the btrfs quota code dramatically compounds the scaling
issues, and because quota functionality has still never worked fully
correctly on btrfs, I recommend turning it off unless it's definitely
and specifically known to be needed. If it is actually needed, I
recommend giving strong consideration to a more mature filesystem where
quotas are known to work reliably, without the scaling issues they
present on btrfs.

So to Graham: are these 1.5K snapshots all of the same subvolume, or
split across several subvolumes? If they're all of the same subvolume,
or of only 2-3 subvolumes, you still have some work to do to get down to
recommended snapshot levels. Also, if you have quotas on and don't
specifically need them, try turning them off and see if that alone makes
it workable.

It's worth noting that a reasonable snapshot thinning program can help
quite a bit here, letting you still keep a reasonable retention, and
that 250-300 snapshots per subvolume fits very well within that model.

Consider: if you're starting with say hourly snapshots, then a year or
even three months out, are you really going to care which specific
hourly snapshot you retrieve a file from? Or would daily or weekly
snapshots do just as well, and actually make finding an appropriate
snapshot easier, as there are fewer to go through?

Generally speaking, most people starting with hourly snapshots can
delete every other snapshot, thinning by at least half, within a day or
two. Those snapshotting even more frequently can thin down to hourly
within hours. The reasoning is that if you haven't noticed a mistaken
deletion or whatever within a few hours, chances are good that recovery
from hourly snapshots is more than practical, and if you haven't noticed
it within a day or two, recovery from say two-hourly or six-hourly
snapshots will be fine. Similarly, a week out, most people can thin to
twice-daily or daily snapshots, and by 4 weeks out, perhaps to
Monday/Wednesday/Friday snapshots. By 13 weeks (one quarter) out, weekly
snapshots are often fine, and by six months (26 weeks) out, thinning to
quarterly (13-week) snapshots may be practical. If not, it certainly
should be within a year, tho well before a year is out, backups to
separate media should have taken over, allowing the oldest snapshots to
be dropped and finally reclaiming the space they were keeping locked up.
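
To make that schedule concrete, here is a minimal Python sketch of one
way to express it. Everything in it is an assumption for illustration:
the tier cut-offs are picked from the ranges above (two-hourly rather
than six-hourly, daily rather than twice-daily), the names TIERS and
thin() are made up, and it only decides which snapshot times to keep.
It does not touch btrfs, and it is not how btrbk or any other tool
implements retention.

```python
from datetime import datetime, timedelta

HOUR = timedelta(hours=1)
DAY = timedelta(days=1)
WEEK = timedelta(weeks=1)
EPOCH = datetime(1970, 1, 1)  # arbitrary origin used to align buckets

# Each tier: (age limit, bucket width, allowed weekdays or None).
# A snapshot falls into the first tier whose age limit exceeds its age,
# and at most one snapshot is kept per bucket within that tier.
# The exact cut-offs are assumptions chosen from the ranges in the text.
TIERS = [
    (2 * DAY, HOUR, None),         # first two days: hourly
    (WEEK, 2 * HOUR, None),        # up to a week: two-hourly
    (4 * WEEK, DAY, None),         # up to 4 weeks: daily
    (13 * WEEK, DAY, {0, 2, 4}),   # up to 13 weeks: Mon/Wed/Fri
    (26 * WEEK, WEEK, None),       # up to 26 weeks: weekly
    (52 * WEEK, 13 * WEEK, None),  # up to a year: quarterly
]


def thin(snapshots, now):
    """Return the snapshot times to keep; everything else may be deleted."""
    kept, seen = [], set()
    for ts in sorted(snapshots):  # oldest first, so the oldest in a bucket wins
        age = now - ts
        for tier, (limit, width, weekdays) in enumerate(TIERS):
            if age < limit:
                bucket = (tier, (ts - EPOCH) // width)
                day_ok = weekdays is None or ts.weekday() in weekdays
                if day_ok and bucket not in seen:
                    seen.add(bucket)
                    kept.append(ts)
                break  # a snapshot belongs to exactly one tier
        # Snapshots older than the last tier match no tier and are dropped.
    return kept
```

Fed a full year of hourly snapshot times for a single subvolume, a
schedule like this keeps well under the 250-300 per-subvolume figure.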

And primarily to Qu...

Is that 2K overall filesystem snapshot cap recommendation still too
high, even if per-subvolume snapshots are limited to 300-ish? Or is the
real problem per-subvolume snapshots? Consider people who have gone
subvolume mad and have say 50 separate subvolumes being snapshotted
(perhaps not too unreasonable in a VM context, with each VM on its own
subvolume). If a 300-ish cap per subvolume is maintained, would the
resulting 15K total snapshots per filesystem still work reasonably well,
so that I could drop the overall filesystem cap recommendation and
simply recommend a per-subvolume snapshot cap of a few hundred?

--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman