From: "Austin S. Hemmelgarn" <ahferroin7@gmail.com>
To: "Ellis H. Wilson III" <ellisw@panasas.com>,
	Hans van Kranenburg <hans.van.kranenburg@mendix.com>,
	Tomasz Pala <gotar@polanet.pl>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: btrfs-cleaner / snapshot performance analysis
Date: Mon, 12 Feb 2018 11:02:02 -0500
Message-ID: <57d8d368-9c96-db65-14a6-2af39cc509f9@gmail.com>
In-Reply-To: <c7e267a4-d47d-9713-c222-81dcb441e009@panasas.com>

On 2018-02-12 10:37, Ellis H. Wilson III wrote:
> On 02/11/2018 01:24 PM, Hans van Kranenburg wrote:
>> Why not just use `btrfs fi du <subvol> <snap1> <snap2>` now and then and
>> update your administration with the results? .. Instead of putting the
>> burden of keeping track of all administration during every tiny change
>> all day long?
> 
> I will look into that if using built-in group capacity functionality 
> proves to be truly untenable.  Thanks!
As a general rule, unless you really need to actively prevent a 
subvolume from exceeding its quota, that approach will be more 
reliable and have much less performance impact than using qgroups.
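For example (purely a sketch, with made-up mount point and snapshot 
paths; -s/--raw are just the options I'd reach for), something like 
this run from cron now and then would keep your records current 
without any qgroup overhead:

  # summarize total / exclusive / shared usage for a subvolume and
  # two of its snapshots
  btrfs filesystem du -s --raw /mnt/pool/subvol \
      /mnt/pool/.snapshots/subvol-2018-02-11 \
      /mnt/pool/.snapshots/subvol-2018-02-12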
> 
>>> CoW is still valuable for us as we're shooting to support on the order
>>> of hundreds of snapshots per subvolume,
>>
>> Hundreds will get you into trouble even without qgroups.
> 
> I should have been more specific.  We are looking to use up to a few 
> dozen snapshots per subvolume, but will have many (tens to hundreds of) 
> discrete subvolumes (each with up to a few dozen snapshots) in a BTRFS 
> filesystem.  If I have it wrong and the scalability issues in BTRFS do 
> not solely apply to subvolumes and their snapshot counts, please let me 
> know.
The issue isn't so much total number of snapshots as it is how many 
snapshots are sharing data.  If each of your individual subvolumes 
shares no data with any of the others via reflinks (so no deduplication 
across subvolumes, and no copying files around using reflinks or the 
clone ioctl), then I would expect things to be just fine without 
qgroups, provided you're not deleting huge numbers of snapshots at 
the same time.
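To be clear about what counts as sharing here: it's anything along 
these lines (paths hypothetical), where an extent ends up referenced 
from more than one subvolume:

  # reflink copy between two subvolumes on the same filesystem
  cp --reflink=always /mnt/pool/subvolA/big.img /mnt/pool/subvolB/big.img
  # (applications can do the same thing directly via the FICLONE /
  # FICLONERANGE ioctls)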

With qgroups involved, I really can't say for certain, as I've never 
done much with them myself, but based on my understanding of how it all 
works, I would expect multiple subvolumes with a small number of 
snapshots each to not have as many performance issues as a single 
subvolume with the same total number of snapshots.
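If you want to sanity-check that hunch on a test filesystem, then 
given the subject of this thread, timing snapshot removal under each 
layout with quotas enabled seems like the obvious test (paths 
hypothetical):

  btrfs quota enable /mnt/pool
  btrfs subvolume delete /mnt/pool/.snapshots/subvol-old
  # wait for btrfs-cleaner to finish dropping the deleted snapshot
  time btrfs subvolume sync /mnt/pool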
> 
> I will note you focused on my tiny desktop filesystem when making some 
> of your previous comments -- this is why I didn't want to share specific 
> details.  Our filesystem will be RAID0 with six large HDDs (12TB each). 
> Reliability concerns do not apply to our situation for technical 
> reasons, but if there are capacity scaling issues with BTRFS I should be 
> made aware of, I'd be glad to hear them.  I have not seen any in 
> technical documentation of such a limit, and experiments so far on 6x6TB 
> arrays have not shown any performance problems, so I'm inclined to 
> believe the only scaling issue exists with reflinks.  Correct me if I'm 
> wrong.
BTRFS in general works fine at that scale, depending of course on the 
level of concurrent access you need to support.  Each tree update has 
to take locks on nodes within the tree itself, so large numbers of 
clients writing to the same set of files concurrently can run into 
lock contention, especially if all of them are calling fsync() or 
fdatasync() regularly.  These issues can be mitigated by 
segregating workloads into their own subvolumes (each subvolume is a 
mostly independent filesystem tree), but it sounds like you're already 
doing that, so I don't think that would be an issue for you.
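For anyone else following along, "segregating workloads" here just 
means giving each one its own subvolume (names hypothetical):

  # each subvolume gets its own filesystem tree, so fsync()-heavy
  # writers in one don't fight over tree locks with writers in another
  btrfs subvolume create /mnt/pool/db
  btrfs subvolume create /mnt/pool/scratch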

The only other possibility I can think of is that the performance hit 
from qgroups may scale not just with the number of snapshots of a 
given subvolume, but also with the total amount of data in it (more 
data means more accounting work), though I'm not certain about that 
(it's just a hunch based on what I do know about qgroups).

Now, there are some other odd theoretical cases that may cause issues 
when dealing with really big filesystems, but they're either really 
specific edge cases (for example, starting with a really small 
filesystem and gradually scaling it up in size as it gets full) or 
happen at scales far larger than what you're talking about (on the 
order of double-digit petabytes at least).
