From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: [auto-]defrag, nodatacow - general suggestions?(was: btrfs: poor performance on deleting many large files?)
Date: Thu, 17 Dec 2015 04:06:56 +0000 (UTC)
Message-ID: <pan$7df3b$2174f8e9$efd1ab67$2dbe2f31@cox.net>
In-Reply-To: 1450303141.6242.50.camel@scientia.net
Christoph Anton Mitterer posted on Wed, 16 Dec 2015 22:59:01 +0100 as
excerpted:
> On Wed, 2015-12-09 at 16:36 +0000, Duncan wrote:
>> But... as I've pointed out in other replies, in many cases including
>> this specific one (bittorrent), applications have already had to
>> develop their own integrity management features
> Well let's move discussion upon that into the "dear developers, can we
> have notdatacow + checksumming, plz?" where I showed in one of the more
> recent threads that bittorrent seems rather to be the only thing which
> does use that per default... while on the VM image front, nothing seems
> to support it, and on the DB front, some support it, but don't use it
> per default.
>
>> In the bittorrent case specifically, torrent chunks are already
>> checksummed, and if they don't verify upon download, the chunk is
>> thrown away and redownloaded.
> I'm not a bittorrent expert, because I don't use it, but that sounds to
> be more like the edonkey model, where - while there are checksums -
> these are only used until the download completes. Then you have the
> complete file, any checksum info thrown away, and the file again being
> "at risk" (i.e. not checksum protected).
[I'm breaking this into smaller replies again.]
Just to mention here that I said "integrity management features", which
includes more than checksumming. As Austin Hemmelgarn has been pointing
out, DBs and some VMs do COW, some DBs do checksumming or at least have
that option, and both VMs and DBs generally do at least some level of
consistency checking as they load. Those are all "integrity management
features" at some level.
As for bittorrent, I /think/ the checksums are in the torrent files
themselves (and if I'm not mistaken, each chunk of the file has its own
checksum recorded there, so as long as the torrent is active, uploading
or downloading, the torrent file and thus the checksums will by
definition be retained). And ideally, people will continue to torrent
the files long after they've finished downloading them, in which case
they'll still need the torrent files themselves, along with the
checksum info.
And for longer-term storage, people really should be copying/moving their
torrented files elsewhere, in such a way that they either eliminate the
fragmentation if the files weren't nocowed, or eliminate the nocow
attribute and get them checksum-protected as normal for files not
intended to be constantly randomly rewritten, which will be the case once
they're no longer being actively downloaded. Of course that's at the
slightly technically oriented user level, but then, the whole nocow
thing, or even caring about checksums and longer-term file integrity in
the first place, is also at the technically oriented user level. Normal
users will just download without worrying about nocow at all, and
perhaps wonder why the disk is thrashing so, but not be inclined to do
anything about it except perhaps switch back to their old filesystem,
where it was faster and the disk didn't sound as bad. So they'll either
stay on btrfs and automatically get the checksumming along with the
worse performance, or go back to a filesystem without checksumming and
think it's fine, as they know no different.
Meanwhile, if they do it correctly there's no window without protection,
as the torrent file can also be used to double-verify the file once it
has been moved, before deleting it.
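
To make that double-verify step a bit more concrete, here's a rough
sketch of what such a check could look like. It assumes the BitTorrent v1
metainfo layout, where the torrent carries one 20-byte SHA-1 digest per
fixed-size piece, concatenated in piece order, and it assumes a
single-file torrent. The function name bad_pieces() and its parameters
are made up for illustration; getting the piece length and the digest
blob out of the .torrent file (which is bencoded) is left out, and in
practice the client's own "recheck" function does the same job.

```python
import hashlib

SHA1_LEN = 20  # bytes per SHA-1 digest


def bad_pieces(path, piece_length, pieces):
    """Return the indices of pieces whose SHA-1 does not match.

    `pieces` is the blob of concatenated 20-byte SHA-1 digests, one per
    piece, in piece order (an assumption about how the caller extracted
    it from the torrent's metainfo).
    """
    expected = [pieces[i:i + SHA1_LEN]
                for i in range(0, len(pieces), SHA1_LEN)]
    bad = []
    with open(path, "rb") as f:
        for index, digest in enumerate(expected):
            chunk = f.read(piece_length)  # the last piece may be shorter
            if hashlib.sha1(chunk).digest() != digest:
                bad.append(index)
        if f.read(1):
            # Data beyond the last expected piece: the file is too long.
            bad.append(len(expected))
    return bad
```

An empty list back from bad_pieces() on the moved copy means every piece
still matches, and only then would the original be removed.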
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman