Re: mount option nodatacow for VMs on SSD?

linux-btrfs.vger.kernel.org archive mirror

From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: mount option nodatacow for VMs on SSD?
Date: Fri, 25 Nov 2016 12:01:37 +0000 (UTC)
Message-ID: <pan$c85ad$dae9ebf5$a0482059$7c135fb1@cox.net>
In-Reply-To: 20161125082840.GA32711@rus.uni-stuttgart.de

Ulli Horlacher posted on Fri, 25 Nov 2016 09:28:40 +0100 as excerpted:

> I have vmware and virtualbox VMs on btrfs SSD.
> 
> I read in
> https://btrfs.wiki.kernel.org/index.php/SysadminGuide#When_To_Make_Subvolumes
> 
>      certain types of data (databases, VM images and similar typically
>      big files that are randomly written internally) may require CoW to
>      be disabled for them.  So for example such areas could be placed in
>      a subvolume, that is always mounted with the option "nodatacow".
> 
> Does this apply to SSDs, too?

It can, because the root issue is the same: the COW-based fragmentation 
that's always a problem with this sort of frequently, randomly, partially 
rewritten file on COW-based filesystems.  But the symptoms tend to be much 
less severe on ssd, so it doesn't tend to be as big an issue there.

On multi-gig database files or VM images, files can end up with on the 
order of 100K (100 thousand) extents due to COW-based rewriting.  
Obviously this can be a HUGE problem on spinning rust due to its seek 
times, a problem zero-seek-time ssds don't have.  But the sheer amount of 
metadata overhead from tracking all those tiny extents can be a problem of 
its own, particularly when doing maintenance such as btrfs balance or 
btrfs check.  Both snapshotting and quota tracking amplify this 
tracking-overhead problem as well, and it's this problem that can still be 
an issue on ssds.
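
To see how fragmented a given image actually is, you can count its 
extents with filefrag (from e2fsprogs).  A minimal Python sketch, assuming 
filefrag's usual one-line summary of the form "NAME: N extents found"; the 
helper names here are made up for illustration, not part of any tool:

```python
import re
import subprocess

# Matches the summary line filefrag prints, e.g. "disk.img: 42 extents found".
_SUMMARY = re.compile(r":\s*(\d+) extents? found\s*$")


def parse_extent_count(summary_line: str) -> int:
    """Extract the extent count from one filefrag summary line."""
    match = _SUMMARY.search(summary_line)
    if match is None:
        raise ValueError(f"unrecognized filefrag output: {summary_line!r}")
    return int(match.group(1))


def count_extents(path: str) -> int:
    """Run filefrag on path and return the extent count it reports.

    Assumes filefrag is installed and the summary is the last output line.
    """
    result = subprocess.run(
        ["filefrag", path], check=True, capture_output=True, text=True
    )
    return parse_extent_count(result.stdout.strip().splitlines()[-1])
```

Calling count_extents() on a VM image before and after a few days of use 
gives a rough idea of whether fragmentation is building up at all.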

That said, the autodefrag mount option, which eliminates some of the 
heavy fragmentation due to copy-on-write (COW) that's the root problem, 
tends to be faster on ssd.  Between it ameliorating the root problem to a 
large extent and the faster speed of ssds, often that's all that's needed, 
particularly if you don't need quotas (so have them off) and only do 
relatively limited snapshotting.
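
If you want to check which btrfs mounts are actually running with 
autodefrag, the active options are listed in /proc/mounts.  A small Python 
sketch, assuming the usual "device mountpoint fstype options dump pass" 
field layout; the function name is hypothetical:

```python
from typing import List


def btrfs_mounts_missing_option(
    mounts_text: str, option: str = "autodefrag"
) -> List[str]:
    """Return mountpoints of btrfs filesystems mounted without `option`.

    mounts_text is the content of /proc/mounts (or text in the same format).
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip malformed lines and anything that is not btrfs.
        if len(fields) < 4 or fields[2] != "btrfs":
            continue
        mountpoint, options = fields[1], fields[3].split(",")
        if option not in options:
            missing.append(mountpoint)
    return missing
```

Feeding it the text of /proc/mounts lists the btrfs mountpoints that would 
still need autodefrag added, for instance in their fstab entries.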

The problem with both the nodatacow mount option and the nocow file 
attribute is that they disable some of the btrfs features and are 
weakened by other features that may well be a big part of the reason 
behind your choice of btrfs in the first place.  Both btrfs compression, 
if otherwise enabled, and checksumming and thus file integrity checking 
(and repair in the case of btrfs raid1/10), would be complicated or 
impossible to implement without COW, and thus are disabled in the NOCOW 
case.  Similarly, btrfs snapshotting depends on COW because the snapshot 
locks in place the existing version so a rewrite must be written 
elsewhere.  As a result, snapshotting weakens NOCOW to what has been 
called COW1: a block is COWed the first time it is rewritten after a 
snapshot, but after that, further writes to the same block are rewritten 
in place at the (new) existing block location.  If you only do very occasional 
snapshots that may not be a problem, but if you're doing regular 
snapshots, particularly automated and multiple per day, the effect of the 
snapshotting forced COW1s may be fragmentation as bad as if NOCOW wasn't 
in place in the first place.
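
The COW1 effect is easy to see in a toy model.  The Python sketch below is 
only an illustration of the reasoning above, not how btrfs tracks extents 
internally: it assumes whole-block writes, treats a snapshot as pinning 
every block written so far, and counts how many rewrites have to land in a 
new location.  All names are made up for the example.

```python
def count_relocations(events, nocow=True):
    """Count rewrites that must go to a new location.

    events: sequence of ("write", block_number) or ("snapshot",) tuples.
    nocow=True models a NOCOW file (COW1 after snapshots);
    nocow=False models ordinary COW, where every rewrite relocates.
    """
    written = set()  # blocks that already hold data
    pinned = set()   # blocks whose current location a snapshot references
    relocations = 0
    for event in events:
        if event[0] == "snapshot":
            # The snapshot locks the current version of every block in place.
            pinned = set(written)
            continue
        block = event[1]
        if block in written and (not nocow or block in pinned):
            relocations += 1
            # The new location is not referenced by the snapshot.
            pinned.discard(block)
        written.add(block)
    return relocations


events = [
    ("write", 0), ("write", 1),
    ("snapshot",),
    ("write", 0), ("write", 0), ("write", 1),
    ("snapshot",),
    ("write", 0),
]
print(count_relocations(events, nocow=True))   # 3
print(count_relocations(events, nocow=False))  # 4
```

With only two snapshots the NOCOW file already relocates on three of its 
four rewrites here, and the more often you snapshot relative to how often 
each block is rewritten, the closer the two numbers get.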

So to some degree, if you're going to be setting the nocow attribute or 
using the nodatacow mount option, you might as well just set up a 
different partition/volume and mkfs it to something other than btrfs for 
those files.  OTOH, the btrfs multi-device and storage pool features 
aren't affected, so if they are big reasons you're using btrfs, then 
there's some reason to keep using btrfs and simply use the nodatacow mount 
option or nocow attribute if autodefrag isn't enough on its own to handle it.
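
For the nocow attribute route, the usual approach is chattr +C on an empty 
directory, so that files created in it afterward inherit the attribute; 
per chattr(1), setting C on a file that already has data is not reliable.  
A hedged Python sketch that wraps chattr and lsattr; the function names 
are made up, and it assumes both tools are installed and that you have 
permission to change attributes on the directory:

```python
import os
import subprocess


def has_nocow_flag(lsattr_line: str) -> bool:
    """True if an lsattr output line shows the No_COW flag (uppercase C).

    The flags are the first whitespace-separated field; lowercase c is the
    compression flag and does not count.
    """
    parts = lsattr_line.split(None, 1)
    return bool(parts) and "C" in parts[0]


def make_nocow_dir(path: str) -> bool:
    """Create path if needed, set +C on it, and report whether it stuck.

    New files created inside the directory inherit the attribute; files
    that already existed there are not changed.
    """
    os.makedirs(path, exist_ok=True)
    subprocess.run(["chattr", "+C", path], check=True)
    listing = subprocess.run(
        ["lsattr", "-d", path], check=True, capture_output=True, text=True
    ).stdout
    return has_nocow_flag(listing)
```

Existing images would then need to be copied (not just moved within the 
same filesystem) into the directory so their data is rewritten as new 
NOCOW files.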

Bottom line, the fragmentation is much less of a problem on ssds, 
particularly with autodefrag which may well be enough, but as always, it 
can be installation and task dependent, so if it's going to be a 
production system, do your own testing and make your own decisions based 
on the results. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
