linux-btrfs.vger.kernel.org archive mirror
From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Very slow filesystem
Date: Fri, 6 Jun 2014 19:59:55 +0000 (UTC)
Message-ID: <pan$8b124$462edb2f$89da01c1$20a210a8@cox.net>
In-Reply-To: CAKcLGm8fFX-Hqz0iEgk2qtsUXrSNW=XsKMHiVWZvV4AoJCBGVQ@mail.gmail.com

Mitch Harder posted on Fri, 06 Jun 2014 14:06:53 -0500 as excerpted:

> Every time you update your database, btrfs is going to update whichever
> 128 KiB blocks need to be modified.
> 
> Even for a tiny modification, the new compressed block may be slightly
> more or slightly less than 128 KiB.

FWIW, I believe that's 128 KiB pre-compression.  And at least without 
compress-force, btrfs will try the compression and if the compressed size 
is larger than the uncompressed size, it simply won't compress that 
block.  So 128 KiB is the largest amount of space that 128 KiB of data 
could take with compression on, but it can be half that or less if the 
compression happens to be good for that 128 KiB block.
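
To make that "compress only if it's actually smaller" rule concrete, 
here's a minimal Python sketch.  It is not btrfs code: zlib stands in for 
whatever compressor the filesystem uses, and stored_size() is a made-up 
helper name.  It only illustrates why 128 KiB is the ceiling for a 
128 KiB block.

```python
import os
import zlib

BLOCK_SIZE = 128 * 1024  # 128 KiB of data, measured before compression


def stored_size(block: bytes) -> int:
    """Bytes a block would occupy under a 'compress only if smaller' rule.

    Hypothetical helper for illustration; zlib is an assumed stand-in
    for the filesystem's compressor.
    """
    compressed = zlib.compress(block)
    # Keep the compressed form only when it actually saves space.
    if len(compressed) < len(block):
        return len(compressed)
    return len(block)


compressible = b"A" * BLOCK_SIZE        # highly repetitive data
incompressible = os.urandom(BLOCK_SIZE)  # random data does not compress

print(stored_size(compressible))    # far below BLOCK_SIZE
print(stored_size(incompressible))  # falls back to BLOCK_SIZE
```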

> If you have a 1-2 GB database that is being updated with any frequency,
> you can see how you will quickly end up with lots of metadata
> fragmentation as well as inefficient data block utilization.
> I think this will be the case even if you switch to NOCOW due to the
> compression.

That is one reason that, as I said, NOCOW turns off compression.  
Compression simply doesn't work well with in-place updates, because as 
you point out, the update may compress more or less well than the 
original, and that won't work in-place.  So NOCOW turns off compression 
to avoid the problem.  

If it's COW (that is, not NOCOW), then the COW-based out-of-place updates 
avoid the problem of fitting more data in the same space, because the new 
write can take more space in the new location if it has to.
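
A rough Python sketch of the fitting problem, under loud assumptions: 
zlib stands in for the filesystem's compressor, the record layout is 
invented, and fits_in_place() is a hypothetical helper.  It shows a 
small edit to a well-compressed 128 KiB block producing a compressed 
result too big for the original slot, which an out-of-place write 
wouldn't care about.

```python
import os
import zlib

BLOCK_SIZE = 128 * 1024  # one 128 KiB block, pre-compression


def fits_in_place(old_block: bytes, new_block: bytes) -> bool:
    """True if the recompressed block fits the space the old one used.

    Hypothetical helper; zlib is an assumed stand-in compressor.
    """
    return len(zlib.compress(new_block)) <= len(zlib.compress(old_block))


# An invented, highly compressible "database" block.
record = b"record:0000;"
original = (record * (BLOCK_SIZE // len(record) + 1))[:BLOCK_SIZE]

# Overwrite one 4 KiB page with data that does not compress.
updated = bytearray(original)
updated[8192:8192 + 4096] = os.urandom(4096)
updated = bytes(updated)

old_len = len(zlib.compress(original))
new_len = len(zlib.compress(updated))
print(old_len, new_len, fits_in_place(original, updated))
```

The uncompressed size is the same before and after the edit; only the 
compressed size moved, and that's the part an in-place update can't 
absorb.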

But you are correct that compression and large, frequently updated 
databases don't play well together either.  Which is why turning off 
compression when turning off COW isn't the big problem it would first 
appear to be -- as it happens, the very same files where COW doesn't work 
well, are also the ones where compression doesn't work well.

Similarly for checksumming.  When there are enough updates, in addition 
to taking more time to calculate and write, checksumming simply invites 
race conditions between the last then-valid checksum and the next update 
invalidating it.  In addition, in many, perhaps most cases, the sorts of 
apps that do constant internal updates have already evolved their own 
data integrity verification methods in order to cope with issues on the 
(after all, far more common) unverified filesystems.  That creates even 
more possible race conditions and timing issues, and makes all the extra 
work that btrfs normally does for verification unnecessary.  Trying to do 
all that in-place due to NOCOW is a recipe for failure or insanity, if 
not both.
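
As a toy illustration of that checksum window (a sketch only, nothing 
like the real btrfs code paths): the InPlaceBlock class and its 
crash_before_checksum flag are invented here, and plain CRC-32 from zlib 
is an assumed stand-in for the filesystem's checksum.  The data is 
overwritten in place first and the checksum second, so anything that 
interrupts the sequence between the two leaves a block that fails 
verification.

```python
import zlib


class InPlaceBlock:
    """Toy model of a block updated in place with a separate checksum."""

    def __init__(self, data: bytes) -> None:
        self.data = bytearray(data)
        self.checksum = zlib.crc32(self.data)

    def verify(self) -> bool:
        # The stored checksum must match the data currently on "disk".
        return zlib.crc32(self.data) == self.checksum

    def overwrite(self, offset: int, payload: bytes,
                  crash_before_checksum: bool = False) -> None:
        # Step 1: the data changes in place.
        self.data[offset:offset + len(payload)] = payload
        if crash_before_checksum:
            # Simulated interruption: the old checksum is now stale.
            return
        # Step 2: the checksum catches up with the new data.
        self.checksum = zlib.crc32(self.data)


block = InPlaceBlock(b"x" * 4096)
block.overwrite(0, b"good")
print(block.verify())  # both steps completed

block.overwrite(0, b"torn", crash_before_checksum=True)
print(block.verify())  # data is new, checksum is still the old one
```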

So when turning off COW, just turning off checksumming/verification and 
compression along with it makes the most sense, and that's what btrfs 
does.  To do otherwise is just asking for trouble, which is why you very 
rarely see in-place-update-by-default filesystems offering either 
transparent compression or data verification as features.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


  reply	other threads:[~2014-06-06 20:00 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-04 22:15 Very slow filesystem Igor M
2014-06-04 22:27 ` Fajar A. Nugraha
2014-06-04 22:40   ` Roman Mamedov
2014-06-04 22:45   ` Igor M
2014-06-04 23:17     ` Timofey Titovets
2014-06-05  3:05 ` Duncan
2014-06-05  3:22   ` Fajar A. Nugraha
2014-06-05  4:45     ` Duncan
2014-06-05  7:50   ` Igor M
2014-06-05 10:54     ` Russell Coker
2014-06-05 15:52   ` Igor M
2014-06-05 16:13     ` Timofey Titovets
2014-06-05 19:53       ` Duncan
2014-06-06 19:06         ` Mitch Harder
2014-06-06 19:59           ` Duncan [this message]
2014-06-07  2:29           ` Russell Coker
2014-06-05  8:08 ` Erkki Seppala
2014-06-05  8:12   ` Erkki Seppala

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='pan$8b124$462edb2f$89da01c1$20a210a8@cox.net' \
    --to=1i5t5.duncan@cox.net \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).