From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Will Btrfs have an official command to "uncow" existing files?
Date: Mon, 22 Aug 2016 02:00:35 +0000 (UTC)
Message-ID: <pan$3c21f$ab5f3fdc$db27222f$e3bd5a66@cox.net>
In-Reply-To: 126611471805976@web2j.yandex.ru
Tomokhov Alexander posted on Sun, 21 Aug 2016 21:59:36 +0300 as excerpted:
> Btrfs wiki FAQ gives a link to example Python script:
> https://github.com/stsquad/scripts/blob/master/uncow.py
>
> But such a crucial and fundamental tool must exist in stock btrfs-progs.
> Filesystem with CoW technology at it's core must provide user sufficient
> control over CoW aspects. Running 3rd-party or manually written scripts
> for filesystem properties/metadata manipulation is not convenient, not
> safe and definitely not the way it must be done.
Why? No script or dedicated tool needed as it's a simple enough process.
Simply:
1. chattr +C (that being nocow) the containing directory.
Then either:
2a. mv the file to and from another filesystem, so it's actually created
new in the directory and thus inherits the nocow attribute at file
creation,
or
2b. mv out and then cp the file back into place with --reflink=never,
again, forcing the file to be created new in the directory, so it
inherits the nocow attribute at creation,
OR (replacing both steps above)
Create the empty file (using touch or similar), set it nocow, and use cat
srcfile >> destfile style redirection to fill it, so the file again gets
the nocow attribute set before it has content, but allowing you to set
only the file nocow, without setting the containing directory nocow.
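As a minimal sketch of that last variant (create the file empty, set it
nocow, then fill it with cat), assuming a POSIX shell: the function name
uncow_file, the temporary-file suffix and the CHATTR override variable are
purely illustrative, nothing btrfs-progs provides. It doesn't preserve
ownership, mode, timestamps or xattrs, and the file must not be open by
anything while it runs.

```shell
# Recreate a file as nocow without touching the containing directory.
# CHATTR is a hypothetical override hook (e.g. CHATTR=true for a dry run
# on a filesystem that doesn't support the C attribute).
uncow_file() {
    src=$1
    tmp="$src.nocow.$$"

    # Create the new file empty; the nocow attribute has to be set
    # before the file has any content.
    : > "$tmp" || return 1
    "${CHATTR:-chattr}" +C "$tmp" || { rm -f "$tmp"; return 1; }

    # Fill it by redirection, so the data is written out fresh.
    cat "$src" >> "$tmp" || { rm -f "$tmp"; return 1; }

    # Replace the original only after the copy succeeded.
    mv "$tmp" "$src"
}

# Usage (hypothetical path):
#   uncow_file /var/lib/db/table.dat
```

The rename at the end stays within one directory, so the new file keeps
the nocow attribute it was created with.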
Of course the general rule applies here as anywhere: if you're doing the
same thing to a whole bunch of files, a script will likely be more
efficient than handling each one manually. But that's nothing exceptional
for btrfs nocow, and no script or fancy tool is actually required.
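For the whole-bunch-of-files case, a first step might be finding out which
files still need the treatment. A rough sketch, assuming lsattr from
e2fsprogs prints the flag letters as the first field: the function name
list_cow_files and the LSATTR override variable are made up for
illustration. Files whose attributes can't be read at all are listed too.

```shell
# Print the regular files directly under a directory that do not carry
# the C (nocow) attribute. LSATTR is a hypothetical override hook.
list_cow_files() {
    dir=$1
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        # lsattr prints "<flags> <name>"; keep only the flags field.
        flags=$("${LSATTR:-lsattr}" -d "$f" 2>/dev/null | awk '{print $1}')
        case $flags in
            *C*) ;;                      # already nocow, skip it
            *)   printf '%s\n' "$f" ;;   # still cow (or unreadable)
        esac
    done
}

# Usage (hypothetical path):
#   list_cow_files /var/lib/libvirt/images
```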
The point being, cow is the default case, and should work /reasonably/
well in most cases, certainly well enough so that normal people doing
normal things shouldn't need to worry about it. The only people who will
need to worry about it, therefore, are people worried about the last bit
of optimization possible to various corner-case use-cases that don't
match default assumptions very well, and it's precisely these sorts of
people that are /technical/ enough to be able to understand both why they
might want nocow (and what the positives and negatives are going to be),
and how to actually get it.
> Also is it possible (at least in theory) to "uncow" files being
> currently opened in-place? Without the trickery with creation & renaming
> of files or directories. So that running "chattr +C" on a file would be
> sufficient. If possible, is it going to be implemented?
It's software. Of course it's possible, tho it's also possible the
negatives make it not worth the trouble. If the implementation simply
creates a new file and does a --reflink=never cp in the background when
the nocow file attribute is set, it may not be worth it.
As to whether it'll ultimately be implemented, I don't know, as I'm not a
dev. But even if it is, it might well be five years out or longer,
because there are simply way more "it'd be nice" btrfs-related ideas out
there than there are devs working on implementations. Projecting more
than five years out in a software development environment like the Linux
kernel doesn't make a lot of sense, so beyond "not soon", nobody really
knows. Add to that the fact that a lot of existing btrfs features took
rather longer to implement and stabilize than originally projected, and...
Meanwhile, if you aren't already, be aware that the basic concepts of
snapshots locking in references to existing extents as unchangeable
(snapshots being a btrfs feature that depends on its cow nature), and
nocow setting files as rewrite-in-place, basically can't work with each
other at the concept level, because once a snapshot is taken, extents
referenced by that snapshot really /cannot/ change until that snapshot is
deleted, something that can only be true if the extents are copy-on-write.
To work around this, btrfs makes the nocow attribute weaker than the
snapshot locking a particular extent in place, using a process sometimes
referred to as cow-once or cow1. When a change is written to an otherwise
nocow file that has been snapshotted, and thus has its extents locked in
place, the block of the file that is changed has to be cowed despite the
nocow attribute. However, the nocow attribute is kept, and any further
changes to that block are rewritten in place at the new location, without
further cowing. That holds until the next snapshot locks the new location
in place as well, at which point the next write to the same block cows it
once more, to a third location, which again remains in place until yet
another snapshot...
So snapshotting a nocow file means it's no longer absolutely nocow, it's
now cow1. If the rewrite rate is higher than the snapshot rate, the nocow
should still have some effect, but where the snapshot rate is higher than
the rewrite rate, the effect will be as if the file wasn't nocow at all,
because each snapshot will lock in the file as it then exists.
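To make the rate argument concrete, here's a toy model of cow1 for a
single block of a nocow file. It's plain shell bookkeeping, not btrfs
code, and the function name count_cows and the w/s event letters are my
own invention: w is a rewrite of the block, s is a snapshot.

```shell
# Count how many writes to one block get cowed, given a sequence of
# events: w = rewrite of the block, s = snapshot taken.
count_cows() {
    shared=0   # 1 while a snapshot references the block's current location
    cows=0
    for ev in "$@"; do
        case $ev in
            s) shared=1 ;;
            w) if [ "$shared" -eq 1 ]; then
                   cows=$((cows + 1))   # forced copy to a new location
                   shared=0             # the new location is private again
               fi ;;
        esac
    done
    echo "$cows"
}

# Rewrites more frequent than snapshots: only 2 of 8 writes are cowed.
count_cows w w w s w w w s w w   # prints 2

# A snapshot before every write: all 3 writes are cowed.
count_cows s w s w s w           # prints 3
```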
It is for this reason that the recommendation is to keep nocow files in a
dedicated subvolume, so snapshots to the parent subvolume exclude the
nocow files, and to minimize snapshotting of the nocow subvolume. As
long as snapshotting is occurring at all, however, there will be some
fragmentation over time, but by combining a limited snapshotting rate
with a reasonable defrag schedule targeting the nocow files as well,
fragmentation should at least remain under control.
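That setup might look something like the following sketch. The btrfs and
chattr subcommands are the stock ones; the function names, the example
path and the BTRFS/CHATTR override variables are hypothetical, there only
so the commands can be dry-run with echo.

```shell
# Create a dedicated subvolume and mark it nocow, so files created in it
# later inherit the attribute and parent-subvolume snapshots skip it.
setup_nocow_subvol() {
    path=$1
    "${BTRFS:-btrfs}" subvolume create "$path" || return 1
    "${CHATTR:-chattr}" +C "$path"
}

# Recursively defragment the nocow subvolume; meant to be run from
# whatever scheduler you already use (cron, a systemd timer, ...).
defrag_nocow_subvol() {
    "${BTRFS:-btrfs}" filesystem defragment -r "$1"
}

# Usage (hypothetical path):
#   setup_nocow_subvol /mnt/data/vm-images
#   defrag_nocow_subvol /mnt/data/vm-images
```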
And of course nocow has other negatives as well. Setting nocow turns off
both compression (if you otherwise have compression on) and checksumming,
thus killing two other major btrfs features, including its realtime file
integrity validation via checksumming.
So there are definitely negatives to nocow that must be weighed before
setting it. But it's worth keeping in mind that all of these features
are really only practical due to cow in the first place, which is why
other filesystems don't tend to have them. So while there is definitely a
tradeoff between cow and nocow, setting nocow doesn't turn off anything
you'd have had on a conventional filesystem anyway.
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman