linux-btrfs.vger.kernel.org archive mirror
From: "Piotr Pawłow" <pp@siedziba.pl>
To: linux-btrfs@vger.kernel.org
Subject: Re: RAID1 failure and recovery
Date: Sun, 14 Sep 2014 16:53:24 +0200
Message-ID: <5415ABE4.7060800@siedziba.pl>
In-Reply-To: <20140914044405.GU5783@carfax.org.uk>

On 14.09.2014 06:44, Hugo Mills wrote:
>> I've done this before, by accident (pulled the wrong drive, reinserted
>> it). You can fix it by running a scrub on the device (btrfs scrub
>> start /dev/ice, I think).
> Checksums are done for each 4k block, so the increase in probability 
> of a false negative is purely to do with the sheer volume of data.
> "Weak" checksums like the CRC32 that btrfs currently uses are indeed 
> poor for detecting malicious targeted attacks on the data, but for 
> random failures, such as a disk block being unreadable and returning 
> zeroes or having bit errors, the odds of identifying the failure are 
> still excellent. 

I don't require "the universe will probably end sooner" kind of odds,
but I would at least like "better than winning the lottery" odds. Once
there are thousands of blocks to fix, the odds aren't that great. With
10 000 stale blocks, the chance that at least one of them passes a
32-bit checksum by accident is about 10 000 / 2^32 =~ 1 / 430 000.
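
The estimate above can be checked with a short sketch. It assumes that
each stale block has an independent 1 in 2^32 chance of matching its
stored 32-bit checksum by accident, which is an idealization of how a
CRC behaves on random damage. The function names are made up for this
illustration and are not part of any btrfs tool.

```python
import math


def chance_of_missed_corruption(stale_blocks: int, checksum_bits: int = 32) -> float:
    """Probability that at least one stale block passes its checksum by chance.

    Assumption: every block independently matches with probability
    2**-checksum_bits.
    """
    p_single = 2.0 ** -checksum_bits
    # 1 - (1 - p)^n, computed with log1p/expm1 to keep precision for tiny p.
    return -math.expm1(stale_blocks * math.log1p(-p_single))


def one_in_odds(stale_blocks: int, checksum_bits: int = 32) -> float:
    """Express the same probability as 'one in N'."""
    return 1.0 / chance_of_missed_corruption(stale_blocks, checksum_bits)


if __name__ == "__main__":
    # 10 000 stale blocks, as in the estimate above: roughly 1 in 430 000.
    print(f"about 1 in {one_in_odds(10_000):,.0f}")
```

For block counts this small, the exact value is almost identical to the
simple product 10 000 / 2^32, so the back-of-the-envelope figure holds.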

I wouldn't feel confident enough to add the disk back and let btrfs fix 
it. I'd rather wipe the FS on it and do the "replace missing".

>> Additionally, nocow files are not checksummed. They will not be
>> corrected and may return good data or random garbage, depending on
>> which mirror is accessed.
>     Yes, this is a trade-off that you have to make for your own
> use-case and happiness. For some things (like a browser cache), I'd be
> happy with losing the checksums.

The point is, if I add a drive with old contents back, I will probably
have to delete all nocow files, because I'm not aware of any tool that
can compare both mirrors and tell me which files are identical on both
and which are different. Scrub will not detect the differences, as it
works separately on each device and doesn't compare one mirror to the
other.
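
To make clear what such a comparison would have to do, here is a minimal
sketch that compares two copies block by block and reports the blocks
that differ. It is only an illustration: it works on two ordinary files
given by path, it does not know how to read a specific btrfs mirror, and
the function name and the 4 KiB block size are assumptions chosen for
the example.

```python
from typing import Iterator


def differing_blocks(path_a: str, path_b: str, block_size: int = 4096) -> Iterator[int]:
    """Yield the index of every block whose bytes differ between two files.

    A length mismatch shows up as differing blocks at the tail, because
    the shorter file returns short or empty reads there.
    """
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        index = 0
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if not block_a and not block_b:
                return  # both files exhausted
            if block_a != block_b:
                yield index
            index += 1
```

Without checksums there is nothing to say which of two differing copies
is the good one, so a report like this could only flag the affected
files, not repair them.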

If I don't delete nocow files, I may get intermittent failures, like my 
browser randomly not loading some pages, and wonder what's going on.

On a multi-user system, I risk exposing sensitive data to every user who
owns nocow files or has access to them.

Thus I think this practice is bad and dangerous, and I would advise
against it. I'd also like btrfs to reject devices with old content by
default.

