linux-btrfs.vger.kernel.org archive mirror
From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Does data checksumming remain for files with No_COW file attribute?
Date: Sun, 25 Sep 2016 05:44:30 +0000 (UTC)
Message-ID: <pan$9d4de$8d8969eb$2fd5352c$14a0ea2c@cox.net>
In-Reply-To: 20160924235014.GA2247@angband.pl

Adam Borowski posted on Sun, 25 Sep 2016 01:50:14 +0200 as excerpted:

> On Sun, Sep 25, 2016 at 02:25:32AM +0300, Alexander Tomokhov wrote:
>> Ok, so data checksumming does not remain for newly created empty files
>> with No_COW attribute.  I think it's an important trait of Btrfs
>> behavior and should be added to wiki.  So that users are informed that
>> disabling CoW on a per-file basis also loses checksum correctness of
>> such file.
> 
> Actually, it disables pretty much all btrfs features except for... CoW.
> 
> You lose:
> * checksums
> * compression
> * safety against power loss (torn writes, etc)
> * transactions (not that anyone uses them...)
> * etc
> 
> But, CoW is still there.
> 
> Try it: make a subvolume, create a
> FS_NO_COW file (preferably one big enough), snapshot the subvolume,
> filefrag -v both copies.  Write to one of them, changing only a part of
> file.  Wait for writeout, filefrag -v them again.
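
The experiment quoted above could be scripted roughly as follows.  This 
is only a sketch: the mount point, subvolume names, file name and sizes 
are all made up for illustration, and the script just prints the commands 
rather than running them, since they need root and a scratch btrfs 
filesystem.  Note that chattr +C is applied while the file is still 
empty, as the attribute only takes effect reliably on new or empty files.

```python
import shlex


def experiment_plan(mount="/mnt/btrfs", size_mib=64):
    """Return the shell commands for the snapshot-then-write experiment.

    All paths and sizes are hypothetical placeholders; adjust before use.
    """
    subvol = f"{mount}/nocow-test"
    snap = f"{mount}/nocow-test-snap"
    orig = f"{subvol}/big.img"
    copy = f"{snap}/big.img"
    return [
        ["btrfs", "subvolume", "create", subvol],
        ["touch", orig],
        ["chattr", "+C", orig],  # set NOCOW while the file is still empty
        ["dd", "if=/dev/urandom", f"of={orig}", "bs=1M", f"count={size_mib}"],
        ["sync"],
        ["btrfs", "subvolume", "snapshot", subvol, snap],
        ["filefrag", "-v", orig],  # extents before the write
        ["filefrag", "-v", copy],
        # Overwrite 1 MiB in the middle without truncating the file.
        ["dd", "if=/dev/urandom", f"of={orig}", "bs=1M", "count=1",
         "seek=1", "conv=notrunc"],
        ["sync"],  # wait for writeout
        ["filefrag", "-v", orig],  # extents after the write
        ["filefrag", "-v", copy],
    ]


plan = experiment_plan()
for cmd in plan:
    print(shlex.join(cmd))
```

Comparing the two filefrag -v listings for the original before and after 
the write should show whether the rewritten range moved to a new physical 
offset while the snapshot copy kept the old one.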

That's because snapshots depend on COW.  If you don't snapshot the file 
(or otherwise create additional reflinks to it, using cp --reflink=always, 
for instance), it'll be NOCOW.  But because snapshots (and other forms of 
multiple reflink) depend on COW, taking a snapshot (or otherwise multi-
reflinking) and then writing to one copy forces what has been referred to 
on this list as COW1, a single COW to break the multi-reflink.

However, COW1 doesn't change the NOCOW attribute, and further writes to 
the same block of the NOCOW file will overwrite the now-current block in 
place instead of COWing it... until the next snapshot (or another 
multi-reflink operation) locks that block in place too, of course, after 
which another COW1 will be required.
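
To make the COW1 rule concrete, here is a toy model in Python.  It is not 
btrfs code and every name in it (ToyFs, create_nocow_file, and so on) is 
invented for illustration; it only assumes that a write to a NOCOW file 
is relocated exactly when the target extent is referenced more than once.

```python
from collections import Counter
from itertools import count


class ToyFs:
    """Hypothetical in-memory stand-in for extent sharing; not btrfs code."""

    def __init__(self):
        self._next_extent = count()
        self.refs = Counter()  # extent id -> number of files referencing it

    def create_nocow_file(self, nblocks):
        blocks = [next(self._next_extent) for _ in range(nblocks)]
        for ext in blocks:
            self.refs[ext] += 1
        return blocks

    def snapshot(self, blocks):
        # A snapshot (or reflink copy) shares every extent with the original.
        for ext in blocks:
            self.refs[ext] += 1
        return list(blocks)

    def write(self, blocks, index):
        """Write one block of a NOCOW file; return 'cow1' or 'in-place'."""
        ext = blocks[index]
        if self.refs[ext] > 1:
            # Shared extent: one forced COW to break the sharing.
            self.refs[ext] -= 1
            new_ext = next(self._next_extent)
            self.refs[new_ext] = 1
            blocks[index] = new_ext
            return "cow1"
        return "in-place"  # unshared: NOCOW overwrites the existing extent


fs = ToyFs()
live = fs.create_nocow_file(4)
history = [fs.write(live, 0)]       # before any snapshot
snap1 = fs.snapshot(live)
history.append(fs.write(live, 0))   # first write after the snapshot
history.append(fs.write(live, 0))   # same block again
snap2 = fs.snapshot(live)
history.append(fs.write(live, 0))   # the next snapshot forces another COW1
print(history)  # ['in-place', 'cow1', 'in-place', 'cow1']
```

The snapshot keeps pointing at the old extent throughout, which is the 
whole point of the forced COW.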

This means that otherwise-NOCOW files that are both repeatedly 
overwritten and repeatedly snapshotted, with snapshots happening at about 
the same rate as rewrites or more frequently, will tend to fragment 
almost as fast as if they hadn't been set NOCOW in the first place.

So NOCOW still has an effect -- as long as rewrites are coming in more 
frequently than snapshots.  However, if the file is repeatedly snapshotted 
at the same or faster rate than it is rewritten, all those COW1s due to 
the repeated snapshotting will pretty effectively nullify the NOCOW 
setting, even though the attribute itself remains set.
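
A back-of-the-envelope simulation shows the rate dependence.  It tracks a 
single block on a made-up tick clock, and assumes (purely for 
illustration) that a snapshot landing on the same tick as a rewrite is 
taken first.  The function name and parameters are hypothetical.

```python
def cow_fraction(rewrite_every, snapshot_every, ticks=1000):
    """Fraction of rewrites of one block that are forced into a COW1."""
    shared = False  # True once a snapshot references the current extent
    rewrites = cows = 0
    for t in range(1, ticks + 1):
        if t % snapshot_every == 0:
            shared = True
        if t % rewrite_every == 0:
            rewrites += 1
            if shared:
                cows += 1       # COW1 breaks the sharing...
                shared = False  # ...and the new extent is unshared again
    return cows / rewrites


print(cow_fraction(rewrite_every=1, snapshot_every=10))   # 0.1
print(cow_fraction(rewrite_every=10, snapshot_every=10))  # 1.0
print(cow_fraction(rewrite_every=10, snapshot_every=1))   # 1.0
```

With ten rewrites per snapshot only one write in ten gets relocated; once 
snapshots keep pace with rewrites, every write does, just as it would 
without NOCOW.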

The alternative, of course, is to avoid snapshotting your NOCOW files 
(which also means losing send/receive, since send requires a read-only 
snapshot).  You can choose one or the other, but you can't have both 
without NOCOW yielding to snapshots.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

