Re: Suggestion: Anti-fragmentation safety catch (RFC)

From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Suggestion: Anti-fragmentation safety catch (RFC)
Date: Tue, 25 Mar 2014 15:42:36 +0000 (UTC)
Message-ID: <pan$ee4df$a16c7a6$74fe8395$12ca6150@cox.net>
In-Reply-To: lgqk8i$j4p$1@ger.gmane.org

Martin posted on Tue, 25 Mar 2014 00:57:05 +0000 as excerpted:

> https://btrfs.wiki.kernel.org/index.php/Mount_options

> #### autodefrag (since [kernel] 3.0)
> 
> Will detect random writes into existing files and kick off background
> defragging. It is well suited to bdb or sqlite databases, but not
> virtualization images or big databases (yet). Once the developers make
> sure it doesn't defrag files over and over again, they'll move this
> toward the default.
> ####
> 
> Looks like I might be a good test case :-)
> 
> 
> What's the problem for big images or big databases? What is considered
> "big"?

"Big" is obviously relative and may depend to some extent on the physical 
device backing the filesystem, particularly SSD vs. spinning rust, as 
well as just how actively rewritten the file in question actually is.

Based on my own experience and what I've seen posted from others, 
autodefrag seems to work reasonably well into the lower hundreds of MiB, 
while once we're talking "gigs", something like the NOCOW file attribute 
tends to be a better solution.

Sizes of say half a gig to a gig are a gray area.  Autodefrag will 
probably work well enough on them for fast media (SSD) or if the file 
rewrite requests aren't coming in /too/ fast.  On slower spinning rust, 
or where internal file data rewrites are coming fast, rewriting the 
entire multi-hundred-megabyte file to defrag it every time an update of 
a few bytes comes in will likely bottleneck the system.  The effect is 
much like the one you posted to start this thread: a load average 
climbing into the hundreds with CPUs at 100% IO-wait, due to write 
magnification as a several-hundred-megabyte file gets rewritten in full 
for each update of a few bytes!
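
To put a rough number on that write magnification, here's a 
back-of-the-envelope sketch.  It takes the worst case described above at 
face value (every small update triggers a rewrite of the whole file).  
The 300 MiB file, 4 KiB update size and 100 MiB/s sustained write rate 
are purely illustrative assumptions, not measurements.

```python
MiB = 1024 ** 2


def write_amplification(file_size_bytes, update_size_bytes):
    """Bytes hitting the device per byte of payload, ASSUMING each
    update causes the whole file to be rewritten (worst case)."""
    return file_size_bytes / update_size_bytes


def sustainable_updates_per_second(file_size_bytes, device_bytes_per_s):
    """Updates per second the device can absorb under the same
    whole-file-rewrite assumption."""
    return device_bytes_per_s / file_size_bytes


file_size = 300 * MiB        # assumed file size
update_size = 4096           # assumed size of one small update
device_rate = 100 * MiB      # assumed sustained write rate of the device

print(write_amplification(file_size, update_size))
print(sustainable_updates_per_second(file_size, device_rate))
```

With those assumed figures each 4 KiB update costs 76800 times its own 
size in writes, and the device keeps up with only about one update every 
three seconds.  Anything arriving faster than that just queues up.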

Actually, if your use-case ends up being in or near that gray area, I'm 
sure some specific tests and hard numbers would be appreciated!  Maybe 
autodefrag is fine up to 1.5 GiB or so.  Or perhaps your system is slow 
enough and the incoming data stream heavy enough that trouble starts at 
say 300 MiB.  Or perhaps the half-gig to 1-gig range is right on.  
Regardless, if you can get hard data on it, please do share. =:^)
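
For anyone wanting to collect such numbers, something along these lines 
might be a starting point.  It's only a sketch: the function name, the 
4 KiB update size and the sizes in the demo are my own placeholders.  It 
times small random in-place rewrites with an fsync after each one, which 
is just a proxy.  Autodefrag works in the background, so load average 
and iowait want watching alongside, and fragmentation itself would need 
checking separately (filefrag, for instance).

```python
import os
import random
import tempfile
import time


def time_random_rewrites(path, file_size, updates, update_size=4096, seed=0):
    """Create a file of file_size bytes, then time `updates` small
    random in-place overwrites, each followed by fsync.
    Returns elapsed seconds for the overwrite phase only."""
    rng = random.Random(seed)
    chunk = b"\0" * (1024 * 1024)

    # Lay the file down sequentially, 1 MiB at a time.
    with open(path, "wb") as f:
        remaining = file_size
        while remaining > 0:
            n = min(remaining, len(chunk))
            f.write(chunk[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())

    payload = b"x" * update_size
    start = time.monotonic()
    with open(path, "r+b") as f:
        for _ in range(updates):
            # Pick a random offset that keeps the write inside the file.
            f.seek(rng.randrange(0, file_size - update_size + 1))
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())
    return time.monotonic() - start


if __name__ == "__main__":
    # Tiny demo in a temp dir; point `path` at the filesystem under
    # test and scale the sizes up for a real run.
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "testfile")
        elapsed = time_random_rewrites(path, 4 * 1024 * 1024, 20)
        print(f"20 updates on a 4 MiB file: {elapsed:.3f} s")
```

Running the same sizes with and without the autodefrag mount option, and 
stepping the file size up through the gray area, should show roughly 
where things start to fall over on a given system.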

Meanwhile, the NOCOW file attribute (chattr +C) mentioned a couple 
paragraphs up is recommended once the problem scales beyond what 
autodefrag can handle.  There are, however, a number of btrfs-specific 
peculiarities to NOCOW that take some familiarity with the topic to 
navigate cleanly.  That's out of scope for this post, and besides, there 
are quite a few other threads where it has been discussed, so I'll punt 
on that discussion for now.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
