From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Very slow filesystem
Date: Fri, 6 Jun 2014 19:59:55 +0000 (UTC)
Message-ID: <pan$8b124$462edb2f$89da01c1$20a210a8@cox.net>
In-Reply-To: CAKcLGm8fFX-Hqz0iEgk2qtsUXrSNW=XsKMHiVWZvV4AoJCBGVQ@mail.gmail.com
Mitch Harder posted on Fri, 06 Jun 2014 14:06:53 -0500 as excerpted:
> Every time you update your database, btrfs is going to update whichever
> 128 KiB blocks need to be modified.
>
> Even for a tiny modification, the new compressed block may be slightly
> more or slightly less than 128 KiB.
FWIW, I believe that's 128 KiB pre-compression. And at least without
compress-force, btrfs will try the compression and if the compressed size
is larger than the uncompressed size, it simply won't compress that
block. So 128 KiB is the largest amount of space that 128 KiB of data
could take with compression on, but it can be half that or less if the
compression happens to be good for that 128 KiB block.
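To make the "try it, and keep it only if it's smaller" behavior concrete, here is a minimal Python sketch. It is only an illustration of the idea, not btrfs code: zlib stands in for the filesystem's compressor, the function name store_block is made up for the example, and the real kernel heuristics are more involved than a single size comparison.

```python
import os
import zlib
from typing import Tuple

BLOCK = 128 * 1024  # 128 KiB of data, measured before compression


def store_block(data: bytes) -> Tuple[bool, bytes]:
    """Hypothetical helper: keep the compressed form only if it is smaller."""
    packed = zlib.compress(data)
    if len(packed) < len(data):
        return True, packed   # compression paid off
    return False, data        # fall back to the uncompressed block


# Highly repetitive data compresses well; random data does not.
compressible = (b"btrfs " * (BLOCK // 6 + 1))[:BLOCK]
incompressible = os.urandom(BLOCK)

for name, data in (("text-like", compressible), ("random", incompressible)):
    was_compressed, payload = store_block(data)
    print(name, was_compressed, len(payload))
```

Either way the stored size never exceeds the 128 KiB that went in, which is the upper bound described above.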
> If you have a 1-2 GB database that is being updated with any frequency,
> you can see how you will quickly end up with lots of metadata
> fragmentation as well as inefficient data block utilization.
> I think this will be the case even if you switch to NOCOW due to the
> compression.
That is one reason that, as I said, NOCOW turns off compression.
Compression simply doesn't work well with in-place updates, because as
you point out, the update may compress more or less well than the
original, and that won't work in-place. So NOCOW turns off compression
to avoid the problem.
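The in-place problem can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how btrfs lays out extents: zlib again stands in for the compressor, and slot_size is just a name for "the space the original compressed block occupied".

```python
import os
import zlib

BLOCK = 128 * 1024  # uncompressed block size

# The original block compresses very well (all zeros).
original = bytes(BLOCK)
slot_size = len(zlib.compress(original))  # space the compressed block occupies

# A small update: overwrite 4 KiB in the middle with poorly compressible data.
updated = bytearray(original)
updated[4096:8192] = os.urandom(4096)
new_size = len(zlib.compress(bytes(updated)))

# An in-place rewrite would have to fit into the old slot.
fits_in_place = new_size <= slot_size
print(slot_size, new_size, fits_in_place)
```

A 4 KiB change to the data makes the compressed block far larger than the slot it came from, so it cannot simply be written back over the original.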
If it's COW (that is, not NOCOW), then the COW-based out-of-place updates
avoid the problem of fitting more data in the same space, because the new
write can take more space in the new location if it has to.
But you are correct that compression and large, frequently updated
databases don't play well together either. Which is why turning off
compression when turning off COW isn't the big problem it would first
appear to be -- as it happens, the very same files where COW doesn't work
well, are also the ones where compression doesn't work well.
Similarly for checksumming. When there are enough updates, in addition
to taking more time to calculate and write, checksumming simply invites
race conditions between the last then-valid checksum and the next update
invalidating it. In addition, in many, perhaps most cases, the sorts of
apps that do constant internal updates have already evolved their own
data integrity verification methods in order to cope with issues on the
(after all, far more common) unverified filesystems, creating even more
possible race conditions and timing issues and making all that extra work
that btrfs normally does for verification unnecessary. Trying to do all
that in-place due to NOCOW is a recipe for failure or insanity, if not both.
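The window between an in-place data write and the matching checksum update can be shown with a deliberately simple, single-threaded Python sketch. Assumptions: zlib.crc32 stands in for whatever checksum the filesystem uses, the verify helper is made up for the example, and the "window" is just the point between two statements where a concurrent reader or a crash could land.

```python
import zlib

# A data block and its separately stored checksum.
block = bytearray(b"A" * 4096)
stored_csum = zlib.crc32(block)


def verify(data: bytearray, csum: int) -> bool:
    """Hypothetical helper: does the data still match the stored checksum?"""
    return zlib.crc32(data) == csum


# Step 1: the data is overwritten in place.
block[0:4] = b"BBBB"

# Anything that verifies here (a reader, or recovery after a crash)
# sees new data paired with the old checksum.
mid_update_ok = verify(block, stored_csum)

# Step 2: the checksum catches up with the data.
stored_csum = zlib.crc32(block)
after_update_ok = verify(block, stored_csum)

print(mid_update_ok, after_update_ok)
```

With out-of-place (COW) writes, the old data and old checksum stay valid until the new pair is complete, so this window does not exist in the same form.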
So when turning off COW, just turning off checksumming/verification and
compression along with it makes the most sense, and that's what btrfs
does. To do otherwise is just asking for trouble, which is why you very
rarely see in-place-update-by-default filesystems offering either
transparent compression or data verification as features.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman