Re: Expected behavior of bad sectors on one drive in a RAID1

From: Duncan <1i5t5.duncan@cox.net>
To: linux-btrfs@vger.kernel.org
Subject: Re: Expected behavior of bad sectors on one drive in a RAID1
Date: Tue, 20 Oct 2015 21:24:54 +0000 (UTC)
Message-ID: <pan$19a41$faa0e7e5$55c66e58$7ec1fc3a@cox.net>
In-Reply-To: 56269A77.1080709@gmail.com

Austin S Hemmelgarn posted on Tue, 20 Oct 2015 15:48:07 -0400 as
excerpted:

> FWIW, my assessment is based on some testing I did a while back (kernel
> 3.14 IIRC) using a VM.  The (significantly summarized of course)
> procedure I used was:
> 1. Create a basic minimalistic Linux system in a VM (in my case, I just
> used a stage3 tarball for Gentoo, with a paravirtualized Xen domain)
> using BTRFS as the root filesystem with a raid1 setup.  Make sure and
> verify that it actually boots.
> 2. Shutdown the VM, use btrfs-progs on the host to find the physical
> location of an arbitrary file (ideally one that is not touched at all
> during the boot process, IIRC, I think I used one of the e2fsprogs
> binaries), and then intentionally clear the CRC in one of the copies of
> a block from the file.
> 3. Boot the VM, read the file.
> 4. Shutdown the VM again.
> 5. Verify whether the file block you cleared the checksum on has a valid
> checksum now.
> 
> I repeated this more than a dozen times using different files and
> different methods of reading the file, and each time the CRC I had
> cleared was untouched.  Based on this, unless BTRFS does some kind of
> deferred re-write that doesn't get forced during a clean unmount of the
> FS, I felt it was relatively safe to conclude that it did not
> automatically fix corrupted blocks.  I did not however, test corrupting
> the block itself instead of the checksum, but I doubt that that would
> impact anything in this case.

AFAIK:

1) It would only run into the corruption if the raid1 read-scheduler
picked that copy, which it does based on whether the requesting PID is
even or odd.

However, statistically that should be a 50% hit rate, and if you tested
more than a dozen times, you'd need remarkable luck not to hit it on
at least /one/ of them.
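
To put a rough number on that luck, here is a minimal Python sketch.  It
assumes a two-copy raid1, that the copy is chosen as the PID modulo the
number of copies (a simplification of the even/odd behavior described
above), and that each test read is an independent coin flip.  Real tests
may not be independent, since repeated reads can be served from the page
cache or come from processes whose PIDs happen to share parity.  The
names pick_mirror and miss_probability are made up for illustration.

```python
def pick_mirror(pid: int, num_copies: int = 2) -> int:
    """Toy model: choose a copy index from the requesting PID's parity."""
    return pid % num_copies


def miss_probability(reads: int, hit_rate: float = 0.5) -> float:
    """Chance that `reads` independent reads all avoid the damaged copy."""
    return (1.0 - hit_rate) ** reads


if __name__ == "__main__":
    # Even PIDs map to copy 0, odd PIDs to copy 1 in this toy model.
    print(pick_mirror(1000), pick_mirror(1001))
    for n in (1, 6, 12):
        print(f"{n:2d} reads: P(never hit bad copy) = {miss_probability(n):.6f}")
```

Under those assumptions, a dozen reads that all miss the damaged copy
has probability 1/4096, roughly 0.024%.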

2) (Based on what I understood from the discussion of btrfs check's
--init-csum-tree patches a couple cycles ago, before which it was
clearing but not reinitializing...) Btrfs interprets missing checksums
differently than invalid checksums.  Would your "cleared" CRC be
interpreted as invalid or missing?  If missing, AFAIK it would leave it
missing.

In that case, corrupting the data block itself would indeed have had a
different result than "clearing" the csum, tho simply corrupting the
csum (rather than clearing it) should have resulted in an update.
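
As a toy illustration of that missing-vs-invalid distinction, here is a
Python sketch of the behavior as I understand it, not of the actual
kernel code.  Everything in it is an assumption made for illustration:
the function names, the use of None to represent a missing checksum,
and zlib.crc32 standing in for the CRC-32C that btrfs really uses.  It
models only the data-block case, where a copy fails to match the stored
checksum.

```python
import zlib
from typing import List, Optional, Tuple


def classify(data: bytes, stored_csum: Optional[int]) -> str:
    """Return 'missing', 'valid' or 'invalid' for one copy of a block."""
    if stored_csum is None:
        return "missing"
    return "valid" if zlib.crc32(data) == stored_csum else "invalid"


def read_block(copies: List[bytes], stored_csum: Optional[int],
               preferred: int) -> Tuple[bytes, Optional[int]]:
    """Return (data, index of the repaired copy or None) in the toy model."""
    data = copies[preferred]
    if classify(data, stored_csum) != "invalid":
        # 'valid': nothing to do.  'missing': nothing to verify against,
        # so the data is returned as-is and nothing is rewritten.
        return data, None
    for i, other in enumerate(copies):
        if i != preferred and classify(other, stored_csum) == "valid":
            copies[preferred] = other  # rewrite the bad copy from the good one
            return other, preferred
    raise OSError("no copy matches the stored checksum")


if __name__ == "__main__":
    good = b"hello"
    copies = [b"hellX", good]
    print(read_block(copies, zlib.crc32(good), preferred=0), copies)
    copies = [b"hellX", good]
    print(read_block(copies, None, preferred=0), copies)
```

In this model a mismatching copy gets rewritten from the good one, while
a block with no checksum at all is simply returned and left alone.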

However, by actually testing you've gone farther than I have, and pending 
further info to the contrary, I'll yield to that, changing my own 
thoughts on the matter as well, to "I formerly thought... but someone's 
testing some versions ago anyway suggested otherwise, so being too lazy 
to actually do my own testing, I'll cautiously agree with the results of 
his."

=:^)

Thanks.  I'd rather find out I was wrong, than not find out! =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman

