From: Hugo Mills <hugo@carfax.org.uk>
To: Christoph Anton Mitterer <calestyo@scientia.net>
Cc: Henk Slager <eye1tm@gmail.com>,
linux-btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: btrfs
Date: Sun, 5 Jun 2016 21:07:21 +0000
Message-ID: <20160605210721.GH24492@carfax.org.uk>
In-Reply-To: <1465160205.6702.38.camel@scientia.net>
On Sun, Jun 05, 2016 at 10:56:45PM +0200, Christoph Anton Mitterer wrote:
> On Sun, 2016-06-05 at 22:39 +0200, Henk Slager wrote:
> > > So the point I'm trying to make:
> > > People do probably not care so much whether their VM image/etc. is
> > > COWed or not, snapshots/etc. still work with that,... but they may
> > > likely care if the integrity feature is lost.
> > > So IMHO, nodatacow + checksumming deserves to be amongst the top
> > > priorities.
> > Have you tried blockdevice/HDD caching like bcache or dmcache in
> > combination with VMs on BTRFS?
> Not yet,... my personal use case is just some VMs on the notebook, and
> for this, the above would seem a bit overkill.
> For the larger VM cluster at the institute,... phew, to be honest I don't
> know by heart what we do there.
>
>
> > Or ZVOL for VMs in ZFS with L2ARC?
> Well but all this is an alternative solution,...
>
>
> > I assume the primary reason for wanting nodatacow + checksumming is
> > to avoid long seek times on HDDs due to growing fragmentation of the
> > VM images over time.
> Well the primary reason is wanting to have overall checksumming in the
> fs, regardless of which features one uses.
The problem is that you can't guarantee consistency with
nodatacow+checksums. If you have nodatacow, then data is overwritten,
in place. If you do that, then you can't have a fully consistent
checksum -- there are always race conditions between the checksum and
the data being written (or the data and the checksum, depending on
which way round you do it).
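
A minimal sketch of that race, assuming a toy in-memory "disk" rather
than anything btrfs actually does: the dict layout, the function names
and the simulated crash are all hypothetical and only illustrate the
ordering problem. With an in-place overwrite, a crash between the data
write and the checksum write leaves the two disagreeing; with CoW, the
old data and old checksum stay valid until a single pointer switch.

```python
import zlib


class Crash(Exception):
    """Simulated power loss between two separate writes."""


def new_disk(data):
    # Toy model (assumption): "data" holds physical blocks, "map" points a
    # logical block at (physical location, checksum of that location).
    return {"data": [data], "map": {0: (0, zlib.crc32(data))}}


def verify(disk, block):
    loc, csum = disk["map"][block]
    return zlib.crc32(disk["data"][loc]) == csum


def overwrite_in_place(disk, block, new_data, crash=False):
    # nodatacow-style: the data is overwritten where it lives, and the
    # checksum is updated by a separate, later write.
    loc, _ = disk["map"][block]
    disk["data"][loc] = new_data
    if crash:
        raise Crash
    disk["map"][block] = (loc, zlib.crc32(new_data))


def overwrite_cow(disk, block, new_data, crash=False):
    # CoW-style: new data goes to a fresh location; pointer and checksum
    # are then switched together in one metadata update.
    disk["data"].append(new_data)
    loc = len(disk["data"]) - 1
    if crash:
        raise Crash
    disk["map"][block] = (loc, zlib.crc32(new_data))


def run(overwrite):
    """Crash mid-overwrite, then report whether the checksum still matches."""
    disk = new_disk(b"old contents")
    try:
        overwrite(disk, 0, b"new contents", crash=True)
    except Crash:
        pass
    return verify(disk, 0)


if __name__ == "__main__":
    print("in-place consistent after crash:", run(overwrite_in_place))
    print("CoW consistent after crash:     ", run(overwrite_cow))
```

Swapping the order in overwrite_in_place (checksum first, data second)
only moves the window: a crash then leaves a new checksum over old data.
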
> I think we already have some situations where tools use/set btrfs
> features by themselves (i.e. automatically)... wasn't systemd creating
> subvols per default in some locations, when there's btrfs?
> So it's no big step to postgresql/etc. setting nodatacow, making people
> lose integrity without them even knowing.
>
> Of course, avoiding the fragmentation is the reason for the desire to
> have nodatacow.
>
>
> > But even if you have nodatacow + checksumming
> > implemented, it is then still HDD access and a VM imagefile itself is
> > not guaranteed to be contiguous.
> Uhm... sure, but that's no difference to other filesystems?!
>
>
> > It is clear that for VM images the number of extents will be large
> > over time (like 50k or so, autodefrag on),
> Wasn't it said that autodefrag performs badly for anything larger than
> ~1G?
I don't recall ever seeing someone saying that. Of course, I may
have forgotten seeing it...
> > but with a modern SSD used
> > as cache, it doesn't matter. It is still way faster than just HDD(s),
> > even with freshly copied image with <100 extents.
> Well the fragmentation has also many other consequences and not just
> seeks (assuming everyone would use SSDs, which is not, and probably won't
> be, the case for quite a while).
> Most obviously you get much more IOPS and btrfs itself will, AFAIU,
> also suffer from some issues due to the fragmentation.
This is a fundamental problem with all CoW filesystems. There are
some mitigations that can be put in place (true CoW rather than
btrfs's redirect-on-write, like some databases do, where the original
data is copied elsewhere before overwriting; cache aggressively and
with knowledge of the CoW nature of the FS, like ZFS does), but they
all have their drawbacks and pathological cases.
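
To make the trade-off concrete, here is a rough sketch, assuming a
file modelled as a list of physical block numbers. The function names
and the "writes issued" count are illustrative assumptions, not btrfs
internals. Redirect-on-write sends each overwritten block to a new
location, so a contiguous file splits into more extents;
copy-before-overwrite keeps the layout but pays an extra write per
update.

```python
def count_extents(layout):
    """Count runs of physically contiguous blocks in a file's layout."""
    return 1 + sum(1 for a, b in zip(layout, layout[1:]) if b != a + 1)


def redirect_on_write(layout, writes):
    # Each overwritten logical block is redirected to the next free
    # physical block; the old block is left behind untouched.
    layout = list(layout)
    next_free = max(layout) + 1
    for blk in writes:
        layout[blk] = next_free
        next_free += 1
    return layout, len(writes)  # (new layout, writes issued)


def copy_before_overwrite(layout, writes):
    # The old contents are copied elsewhere first, then the new data is
    # written in place: layout unchanged, two writes per update.
    return list(layout), 2 * len(writes)


if __name__ == "__main__":
    start = list(range(8))  # one contiguous extent
    for name, fn in [("redirect-on-write", redirect_on_write),
                     ("copy-before-overwrite", copy_before_overwrite)]:
        layout, ios = fn(start, [2, 5])
        print(name, "extents:", count_extents(layout), "writes:", ios)
```
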
Hugo.
--
Hugo Mills | How do you become King? You stand in the marketplace
hugo@... carfax.org.uk | and announce you're going to tax everyone. If you
http://carfax.org.uk/ | get out alive, you're King.
PGP: E2AB1DE4 | Harry Harrison