Re: Very slow filesystem

linux-btrfs.vger.kernel.org archive mirror

From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Very slow filesystem
Date: Thu, 5 Jun 2014 04:45:07 +0000 (UTC)
Message-ID: <pan$66344$1e737ea5$54405fe6$fc9c8098@cox.net>
In-Reply-To: CAG1y0seEWf0RqDuJgG3Jx9iz-yPLh8piQ00dMAQ9aPLeCEp3vA@mail.gmail.com

Fajar A. Nugraha posted on Thu, 05 Jun 2014 10:22:49 +0700 as excerpted:

> (resending to the list as plain text, the original reply was rejected
> due to HTML format)
> 
> On Thu, Jun 5, 2014 at 10:05 AM, Duncan <1i5t5.duncan@cox.net> wrote:
>>
>> Igor M posted on Thu, 05 Jun 2014 00:15:31 +0200 as excerpted:
>>
>> > Why btrfs becames EXTREMELY slow after some time (months) of usage ?
>> > This is now happened second time, first time I though it was hard
>> > drive fault, but now drive seems ok.
>> > Filesystem is mounted with compress-force=lzo and is used for MySQL
>> > databases, files are mostly big 2G-8G.
>>
>> That's the problem right there, database access pattern on files over 1
>> GiB in size, but the problem along with the fix has been repeated over
>> and over and over and over... again on this list, and it's covered on
>> the btrfs wiki as well
> 
> Which part on the wiki? It's not on
> https://btrfs.wiki.kernel.org/index.php/FAQ or
> https://btrfs.wiki.kernel.org/index.php/UseCases

Most of the discussion and information is on the list, but there's a 
limited amount of information on the wiki in at least three places.  Two 
are on the mount options page, in the autodefrag and nodatacow options 
description:

* Autodefrag says it's well suited to bdb and sqlite dbs but not vm 
images or big dbs (yet).

* Nodatacow says performance gain is usually under 5% *UNLESS* the 
workload is random writes to large db files, where the difference can be 
VERY large.  (There's also mention of the fact that this turns off 
checksumming and compression.)

Of course that's the nodatacow mount option, not the NOCOW file 
attribute, which isn't to my knowledge discussed on the wiki, and given 
the wiki wording, one does indeed have to read a bit between the lines.  
But it is there if one looks.  That was certainly enough of a hint for me 
to mark the issue for further study as I did my initial pre-mkfs.btrfs 
research, for instance, and once I checked the list it was quickly 
confirmed, with additional detail, that it was indeed a problem.

* Additionally, there's some discussion in the FAQ under "Can 
copy-on-write be turned off for data blocks?", including discussion of 
the command used (chattr +C), a link to a script, an example of the shell 
commands, and the hint "will produce file suitable for a raw VM image -- 
the blocks will be updated in-place and are preallocated."
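
As a rough sketch of what that FAQ recipe amounts to (not the wiki's 
actual script), the Python below only prints the shell commands rather 
than running them.  The function name, the path and the 8G size are made 
up for illustration.  The ordering is the point: per chattr(1), +C on 
btrfs should be set on a new or empty file, so the attribute goes on 
before the file is preallocated, and an existing database file would have 
to be copied into a file created this way rather than flagged in place.

```python
import shlex


def nocow_file_commands(path, size):
    """Return shell commands that create a preallocated NOCOW file.

    Hypothetical helper for illustration only: it builds the command
    strings and does not execute anything.
    """
    quoted = shlex.quote(path)
    return [
        f"touch {quoted}",                 # create the file while it is still empty
        f"chattr +C {quoted}",             # set NOCOW before any data blocks exist
        f"fallocate -l {size} {quoted}",   # preallocate so blocks are updated in place
        f"lsattr {quoted}",                # check that the 'C' attribute is shown
    ]


if __name__ == "__main__":
    # Example path and size are assumptions, not taken from the thread.
    for command in nocow_file_commands("/var/lib/mysql-nocow/ibdata1", "8G"):
        print(command)
```

Setting +C on the containing directory instead makes files newly created 
in it inherit the attribute.  Either way, as with the nodatacow mount 
option, such files give up checksumming and compression.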

FWIW, if I did wiki editing there'd probably be a dedicated page 
discussing it, but for better or worse, I seem to work best on mailing 
lists and newsgroups, and every time I've tried contributing on the web, 
even when it has been to a web forum which one would think would be close 
enough to lists/groups for me to adapt to, it simply hasn't gone much of 
anywhere.  So these days I let other people more comfortable with editing 
wikis or doing web forums do that (and sometimes people do that by either 
actually quoting my list post nearly verbatim or simply linking to it, 
which I'm fine with, as after all that's where much of the info I post 
comes from in the first place), and I stick to the lists.  Since I don't 
directly contribute to the wiki I don't much criticize it, but there are 
indeed at least hints there for those who can read them, something I did 
myself so I know it's not asking the impossible.

> If COW and rewrite is the main issue, why don't zfs experience the
> extreme slowdown (that is, not if you have sufficient free space
> available, like 20% or so)?

My personal opinion?  Primarily two things:

1) zfs is far more mature than btrfs and has been in production usage for 
many years now, while btrfs is still barely getting the huge warnings 
stripped off.  There's a lot of btrfs optimization possible that simply 
hasn't occurred yet as the focus is still real data-destruction-risk 
bugs, and in fact, btrfs isn't yet feature-complete either, so there's 
still focus on raw feature development as well.  When btrfs gets to the 
maturity level that zfs is at now, I expect a lot of the problems we have 
now will have been dramatically reduced if not eliminated.  (And the devs 
are indeed working on this problem, among others.)

2) Stating the obvious, while both btrfs and zfs are COW based and have 
other similarities, btrfs is a different filesystem, with an entirely 
different implementation and somewhat different emphasis.  There 
consequently WILL be some differences, even when they're both mature 
filesystems.  It's entirely possible that something about the btrfs 
implementation makes it less suitable in general to this particular use-
case.

Additionally, while I don't have zfs experience myself nor do I find it a 
particularly feasible option for me due to licensing and political 
issues, from what I've read it tends to handle certain issues by simply 
throwing gigs on gigs of memory at the problem.  Btrfs is designed to 
require far less memory, and as such, will by definition be somewhat more 
limited in spots.  (Arguably, this is simply a specific case of #2 above, 
they're individual filesystems with differing implementation and 
emphasis, so WILL by definition have different ideal use-cases.)

Meanwhile, there's that specific mention of 20% zfs free-space available, 
above.  On btrfs, as long as some space remains unallocated to chunks, 
percentage free-space has little to no effect on performance.  And with 
metadata chunk-sizes of a quarter gig and data chunk-sizes of a gig, at 
the terabyte filesystem scale one chunk of each equates to well under 1% 
free, before free-space becomes a performance issue at all.
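
To put numbers on "well under 1%", here's a back-of-the-envelope 
calculation in Python.  It assumes binary units (GiB/TiB), the nominal 
chunk sizes given above, and a filesystem of exactly 1 TiB; the names are 
just for illustration.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4

DATA_CHUNK = 1 * GIB          # nominal data chunk size
METADATA_CHUNK = GIB // 4     # nominal metadata chunk size (a quarter gig)


def percent_of_filesystem(nbytes, fs_bytes):
    """Return nbytes as a percentage of the filesystem size."""
    return 100.0 * nbytes / fs_bytes


data_pct = percent_of_filesystem(DATA_CHUNK, TIB)
metadata_pct = percent_of_filesystem(METADATA_CHUNK, TIB)

print(f"one data chunk:     {data_pct:.3f}% of 1 TiB")      # 0.098%
print(f"one metadata chunk: {metadata_pct:.3f}% of 1 TiB")  # 0.024%
print(f"one of each:        {data_pct + metadata_pct:.3f}% of 1 TiB")  # 0.122%
```

So room for one more chunk of each type comes to roughly an eighth of a 
percent of a 1 TiB filesystem.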

So if indeed zfs is like many other filesystems in requiring 10-20% 
freespace in order to perform at best efficiency (I really don't know 
if that's the case or not, but it is part of the claim above), then that 
again simply emphasizes the differences between zfs and btrfs, since that 
literally has zero bearing at all on btrfs efficiency.

Rather, at least until btrfs gets automatic, entirely unattended 
chunkspace rebalance triggering, the btrfs issue is far more likely to be 
literally running out of either data or metadata space as all the chunks 
with freespace are allocated to the other one.  (Usually, it's metadata 
that runs out first, with lots of free space tied up in nearly empty data 
chunks.  But it can be either.  Of course a rebalance can be used to 
solve this problem, but at present it has to be triggered manually; 
there's no automatic rebalancing functionality at all.)
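
A toy model of that situation, in Python, may make it clearer.  This is 
NOT how btrfs actually allocates or balances; it only does the 
bookkeeping, and the device size, usage figures and function names are 
invented for the example.  Nine data chunks that are each only about a 
fifth full claim all the remaining space, so nothing is left to allocate 
a new metadata chunk from, until the data is repacked into fewer chunks.

```python
CHUNK_MIB = 1024  # nominal data chunk size (1 GiB), as given above


def unallocated_mib(device_mib, data_chunks_used_mib, metadata_mib):
    """Space not yet handed out to any data or metadata chunk."""
    return device_mib - len(data_chunks_used_mib) * CHUNK_MIB - metadata_mib


def repack(data_chunks_used_mib):
    """Toy stand-in for a balance: pack the same data into as few chunks as possible."""
    full, rest = divmod(sum(data_chunks_used_mib), CHUNK_MIB)
    return [CHUNK_MIB] * full + ([rest] if rest else [])


device = 10 * 1024        # hypothetical 10 GiB device
metadata = 1024           # 1 GiB already allocated to metadata, and full
data = [200] * 9          # nine data chunks, each holding only 200 MiB

before = unallocated_mib(device, data, metadata)
after = unallocated_mib(device, repack(data), metadata)

print(f"unallocated before repack: {before} MiB")  # 0
print(f"unallocated after repack:  {after} MiB")   # 7168
```

On a real filesystem the repacking is what a manual btrfs balance does, 
and as far as I know a usage filter (something like -dusage=N) can limit 
it to the mostly empty data chunks.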

So while zfs and btrfs might be similarly based on COW technology, they 
really are entirely different filesystems, with vastly different maturity 
levels and some pretty big differences in behavior as well as licensing 
and political philosophy, certainly now, but potentially even as btrfs 
matures to match zfs maturity, too.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
