public inbox for linux-btrfs@vger.kernel.org
* Re: Questions about BTRFS balance and scrub on non-RAID setup
@ 2021-09-01  4:54 Duncan
From: Duncan @ 2021-09-01  4:54 UTC (permalink / raw)
  To: Andrej Friesen, linux-btrfs

Andrej Friesen posted on Tue, 31 Aug 2021 10:17:07 +0200 as excerpted:

>> You probably want to use autodefrag or a custom defragmentation
>> solution too. We weren't satisfied with autodefrag in some situations
>> (where clearly fragmentation crept in and IO performance suffered
>> until a manual defrag) and developed our own scheduler for triggering
>> defragmentation based on file writes and slow full filesystem scans,
> 
> The ceph cluster only uses SSDs, so I guess we do not suffer from
> the fragmentation problems we would with HDDs, as far as I understand
> SSDs.

Since I saw mention of btrfs snapshots as well...

It's worth mentioning that defrag (of course) triggers a write-out of
the new defragmented data, which, because btrfs snapshots are cow-based
(copy-on-write), duplicates blocks still locked into place by existing
snapshots.  With rewrite-in-place write patterns (typical for database
or VM-image usage), the combination of defrag and repeated snapshots
can eat up space rather fast.

(Snapshot-aware defrag was tried at one point, but due to the exploded
complexity of dealing with all the COW references, performance just
wasn't within the realm of the practical, as the defrag ended up making
little forward progress.  It was dropped in favor of a defrag that
breaks the cow-references and thus uses extra space, but at least
/works/ for its labeled purpose.)
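The space effect is easy to watch.  A minimal sketch, assuming a
subvolume at the hypothetical path /mnt/data; DRY_RUN=1 (the default
here) just prints each command instead of running it:

```shell
# Sketch: how defrag breaks snapshot sharing.  Paths are hypothetical.
# DRY_RUN=1 (default) prints each command instead of executing it.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi; }

run btrfs subvolume snapshot -r /mnt/data /mnt/data-snap
run btrfs filesystem du -s /mnt/data           # extents shared with snapshot
run btrfs filesystem defragment -r /mnt/data   # rewrites extents, breaks sharing
run btrfs filesystem du -s /mnt/data           # "Exclusive" grows; usage can near 2x
```

Comparing the "Shared" and "Exclusive" columns before and after the
defrag is what shows the duplication.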

So I'd suggest choosing one or the other, either snapshotting or
defrag, rather than both in combination; or at least limit their
combined use and keep an eye on space usage, deleting snapshots and/or
reducing defrag frequency to some fraction of the snapshot frequency
as necessary.

For ssds, autodefrag without manual defrag may be a reasonable
compromise (it's one I like personally, though my use-case isn't
commercial).  Autodefrag is said to be a performance bottleneck for
some database (and I suspect VM-image as well) use-cases, but on ssds
I suspect it should both mitigate that performance issue and likely
eliminate the need for more intensive manual/scheduled defrag runs.
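For reference, autodefrag is just a mount option; a hypothetical fstab
entry (device and mount point made up) would look like:

```
# /etc/fstab -- autodefrag must be requested explicitly; "ssd" is
# normally auto-detected on non-rotational devices.
/dev/rbd0  /srv/share  btrfs  defaults,noatime,autodefrag  0 0
```

It can also be set on an existing mount with
mount -o remount,autodefrag.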

The other thing to consider with below-btrfs-level snapshotting (I'm
out of my league with ceph/rbd, but I know it's definitely a problem
with lvm) is that btrfs, due to its multi-device functionality, cannot
be allowed to see other snapshots of the filesystem with the same
btrfs UUID.  (btrfs device scan is what makes btrfs aware of them, but
udev typically triggers a scan when it detects new devices, and with
lvm at least, udev device detection can trigger somewhat unexpectedly.)
When btrfs sees these other devices with the same btrfs UUID, it
considers them additional devices of a multi-device btrfs and can
attempt to write to them instead of the original target device,
potentially creating all sorts of mayhem!
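A quick way to check for the hazard is to look for duplicate
filesystem UUIDs among the visible block devices.  A sketch (the
device list would really come from e.g. lsblk or blkid; a made-up
sample is inlined here so the pipeline itself can be seen working):

```shell
# Detect duplicate filesystem UUIDs.  The real list would come from
# e.g. `lsblk -no NAME,UUID`; this sample is hypothetical.
sample='sda1 9cba33f1-0001
dm-3 9cba33f1-0001
sdb1 7f00aa12-0002'
dups=$(printf '%s\n' "$sample" | awk '{print $2}' | sort | uniq -d)
if [ -n "$dups" ]; then
  # e.g. an lvm snapshot exposing a byte-for-byte clone of the fs
  echo "duplicate filesystem UUID(s): $dups"
fi
```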

Like I said, I'm out of my league with ceph etc. and have no idea if
this even applies there, but when I saw rbd snapshots mentioned I
thought of the lvm snapshots problem and figured it was worth a
heads-up, in case further investigation is necessary.

Likewise I saw the mention of quotas and balance.  Balance with quotas
enabled similarly explodes, due to constant recalculation of the quota
information as the balance does its thing, increasing balance time
dramatically and often out of the realm of the practical.  So if quotas
are needed, minimize the use of balance; and if a balance is necessary,
turning quotas off temporarily may be the only way to make reasonable
forward progress on the balance.
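A sketch of that quotas-off balance (the mount point /mnt is
hypothetical; DRY_RUN=1, the default here, only prints the commands):

```shell
# Quotas-off balance sketch; /mnt is a made-up mount point.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi; }

run btrfs quota disable /mnt
# Filtered balance: only repack chunks under 50% utilization.
run btrfs balance start -dusage=50 -musage=50 /mnt
# Note: re-enabling quotas triggers a full qgroup rescan.
run btrfs quota enable /mnt
```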

But it sounds like btrfs quotas may not be necessary, thus avoiding
that problem entirely. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

* Questions about BTRFS balance and scrub on non-RAID setup
@ 2021-08-30 13:20 Andrej Friesen
  2021-08-30 14:18 ` Lionel Bouton
From: Andrej Friesen @ 2021-08-30 13:20 UTC (permalink / raw)
  To: linux-btrfs

Hey folks,

I have used btrfs now for a few years on my home server and have had a
good experience so far.

But now I need some advice, because my team and I want to use btrfs
in a product, and personal use is something really different from
enterprise use :-)

Use case and context for my questions:

A file system as a service for our customers.
This will be offered to the customer as a network share via NFS. That
also means we do not have any control over the usage patterns: no idea
how often or how much they write, or whether the files are small or
big.

Technically we only create one block device of several terabytes and
format it with btrfs. The block device we format is backed by a ceph
cluster.
Since the block device is already on distributed storage, we will not
do any raid configuration.

The kernel will be a recent 5.10.

Scrub:

Do I need to regularly scrub?
If so, what would be a recommendation for my use case?

My conclusion after reading about scrub: it checks for damaged data
and will repair the data if the filesystem has another copy of it.
Since we will run btrfs without raid, this is not needed, in my
opinion.
Am I right with my conclusion here?
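For reference, if we did scrub regularly, I assume it would be
something like this monthly cron entry (the mount point is made up):

```
# hypothetical monthly scrub from cron; /srv/share is made up
# -B = stay in foreground, -d = print per-device statistics
0 3 1 * *  root  btrfs scrub start -Bd /srv/share
```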

Balance:

Do I need to regularly balance my filesystem?
If so, what would be a recommendation for my use case?

I am a little bit confused about this one.
The FAQ (https://btrfs.wiki.kernel.org/index.php/FAQ#Do_I_need_to_run_a_balance_regularly.3F)
says:

> In general usage, no. A full unfiltered balance typically takes a long time, and will rewrite huge amounts of data unnecessarily. You may wish to run a balance on metadata only (see Balance_Filters) if you find you have very large amounts of metadata space allocated but unused, but this should be a last resort. At some point, this kind of clean-up will be made an automatic background process.

Others on the wider internet, however, say it makes sense to balance
regularly:

https://github.com/netdata/netdata/issues/3203#issuecomment-356026930

Something like this every day:
`btrfs balance start -dusage=50 -dlimit=2 -musage=50 -mlimit=4`

I also asked on IRC (username ajfriesen) about regular balance and
people seem to have different opinions on that topic as well.


What would a recommendation look like for my use case?
Would it make sense to update the FAQ in that regard?

PS: First-time mailing list user, please tell me if I did something wrong.


All the best
---
Andrej Friesen

https://www.ajfriesen.com/


Thread overview: 5+ messages
2021-09-01  4:54 Questions about BTRFS balance and scrub on non-RAID setup Duncan
2021-08-30 13:20 Andrej Friesen
2021-08-30 14:18 ` Lionel Bouton
2021-08-31  8:17   ` Andrej Friesen
2021-08-31 13:06     ` Lionel Bouton
