From: "Piotr Pawłow" <pp@siedziba.pl>
To: linux-btrfs@vger.kernel.org
Subject: Re: RAID1 failure and recovery
Date: Sun, 14 Sep 2014 16:53:24 +0200
Message-ID: <5415ABE4.7060800@siedziba.pl>
In-Reply-To: <20140914044405.GU5783@carfax.org.uk>
On 14.09.2014 06:44, Hugo Mills wrote:
>> I've done this before, by accident (pulled the wrong drive, reinserted
>> it). You can fix it by running a scrub on the device (btrfs scrub
>> start /dev/ice, I think).
> Checksums are done for each 4k block, so the increase in probability
> of a false negative is purely to do with the sheer volume of data.
> "Weak" checksums like the CRC32 that btrfs currently uses are indeed
> poor for detecting malicious targeted attacks on the data, but for
> random failures, such as a disk block being unreadable and returning
> zeroes or having bit errors, the odds of identifying the failure are
> still excellent.
I don't require "probably the universe will end sooner" kind of odds,
but I would at least like "better than winning the lottery" odds. Once
there are thousands of blocks to fix, the odds aren't that great. With
10 000 stale blocks, the chance that at least one of them passes its
CRC32 check by accident is about 10 000 / 2^32 =~ 1 / 430 000.
I wouldn't feel confident enough to add the disk back and let btrfs fix
it. I'd rather wipe the FS on it and do the "replace missing".
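
The estimate above can be checked with a few lines of Python. This is
only a sketch: it assumes each stale block matches its 32-bit checksum
independently with probability 2^-32, and the function name
undetected_odds is made up for illustration.

```python
import math


def undetected_odds(blocks: int, checksum_bits: int = 32) -> float:
    """Return N such that the chance of at least one undetected bad block is 1 in N.

    Assumption: every bad block collides with its checksum independently
    with probability 2**-checksum_bits (a random-garbage model).
    """
    p_single = 2.0 ** -checksum_bits
    # P(at least one collision) = 1 - (1 - p)^blocks, computed with
    # log1p/expm1 so the tiny probabilities do not lose precision.
    p_any = -math.expm1(blocks * math.log1p(-p_single))
    return 1.0 / p_any


if __name__ == "__main__":
    for n in (1_000, 10_000, 1_000_000):
        print(f"{n:>9} blocks: about 1 in {undetected_odds(n):,.0f}")
```

For 10 000 blocks this lands close to the 1 / 430 000 figure, and the
odds get worse roughly in proportion to the number of blocks that have
to be fixed.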
>> Additionally, nocow files are not checksummed. They will not be
>> corrected
>> and may return good data or random garbage, depending on which mirror is
>> accessed.
> Yes, this is a trade-off that you have to make for your own
> use-case and happiness. For some things (like a browser cache), I'd be
> happy with losing the checksums.
The point is, if I add a drive with old contents back, I will probably
have to delete all nocow files, because I'm not aware of any tool that
can compare both mirrors and tell me which files are identical on both
and which are different. Scrub will not detect them, as it works
separately on each device, and doesn't compare one mirror to the other.
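
This does not solve the comparison problem, but the nocow files can at
least be listed to see what is at risk. A rough sketch, assuming the
files were marked with chattr +C and therefore show the "C" flag in
lsattr output; the function names nocow_paths and list_nocow are
hypothetical.

```python
import string
import subprocess

# Characters that may appear in the flags column of lsattr output.
_FLAG_CHARS = set(string.ascii_letters + "-")


def nocow_paths(lsattr_output: str) -> list[str]:
    """Pick out paths whose lsattr flags column contains 'C' (No_COW)."""
    paths = []
    for line in lsattr_output.splitlines():
        flags, sep, path = line.partition(" ")
        # Skip blank lines and the "dir:" headers printed by lsattr -R.
        if not sep or not flags or not set(flags) <= _FLAG_CHARS:
            continue
        if "C" in flags:
            paths.append(path)
    return paths


def list_nocow(root: str) -> list[str]:
    """Run lsattr -R on root and return the nocow paths it reports."""
    result = subprocess.run(
        ["lsattr", "-R", root], capture_output=True, text=True, check=False
    )
    return nocow_paths(result.stdout)
```

Calling list_nocow("/home") would then give the candidates for deletion
or restore from backup. It says nothing about whether the two mirrors
agree.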
If I don't delete nocow files, I may get intermittent failures, like my
browser randomly not loading some pages, and wonder what's going on.
On a multi-user system, I risk exposing sensitive data to every user
who has nocow files, or access to nocow files.
Thus I think this practice is bad and dangerous, and I would advise
against doing that. I'd also like btrfs to reject devices with old
content by default.