Re: raid10 corruption while removing failing disk

From: "Agustín DallʼAlba" <agustin@dallalba.com.ar>
To: Chris Murphy <lists@colorremedies.com>
Cc: Btrfs BTRFS <linux-btrfs@vger.kernel.org>
Subject: Re: raid10 corruption while removing failing disk
Date: Tue, 11 Aug 2020 17:40:27 -0300
Message-ID: <d46401cf4af5c6ebc7cc7ce584570bc901978151.camel@dallalba.com.ar>
In-Reply-To: <CAJCQCtSdJVw5o2hJ3OyE6-nvM2xpx=nRHLVNSgf9ydD2O--vMQ@mail.gmail.com>

On Tue, 2020-08-11 at 13:17 -0600, Chris Murphy wrote:
> That drive should have '/sys/block/sda/device/timeout' at least 120.
> Although I've seen folks on linux-raid@ suggest 180. I don't know what
> the actual maximum time for "deep recovery" these drives could have.

I'll do this. Is there any reason not to set _every_ drive to 180s? As
far as I can tell, a very long timeout doesn't really hurt when the
drives do support SCT ERC, and if I simply write a udev rule that
matches all disks I won't have to remember to do this again in the
future.
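
A minimal sketch of the one-shot version of this, assuming every disk
exposes its timeout at <sysfs>/<disk>/device/timeout like the sda path
quoted above. The function name and the `sysfs_block` parameter are
made up for illustration; writing to the real /sys/block needs root,
and the values do not survive a reboot, which is why a udev rule is
still the better place for it.

```python
from pathlib import Path


def set_disk_timeouts(seconds=180, sysfs_block=Path("/sys/block")):
    """Write `seconds` to device/timeout for every disk that has one.

    Returns the names of the disks that were updated. Entries without
    a device/timeout file (loop devices, for example) are skipped.
    """
    updated = []
    for disk in sorted(sysfs_block.iterdir()):
        timeout_file = disk / "device" / "timeout"
        if timeout_file.is_file():
            timeout_file.write_text(f"{seconds}\n")
            updated.append(disk.name)
    return updated
```

Calling `set_disk_timeouts()` as root applies 180 s to every disk
found; passing another directory as `sysfs_block` allows a dry run
against a copy of the tree.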

> As the signal in a sector weakens, the reads get slower. You can
> freshen the signal simply by rewriting data. Btrfs doesn't ever do
> overwrites, but you can use 'btrfs balance' for this task. Once a year
> seems reasonable, or as you notice reads becoming slower. And use a
> filtered balance to avoid doing it all at once.

I suspect it's the head that's damaged, not the sectors. For years I
forgot to set the idle3 timer on this drive (a power saving "feature"
of WD Greens) to something reasonable, and in the meantime the head
has parked 1.7 million times. With that in mind, writing to it sounds
like a bad idea to me.

> I only fully understood what you meant by this:
> > instead of `found BAB1746E wanted A8A48266` it prints `found 0000006E wanted 00000066`
> 
> once I re-read the first email that had the full 'btrfs check' output
> from the old version. And yeah I don't know why they're different now.

I looked at the code and I think it's just a presentation bug. In
disk-io.c:177 both `result` and `buf->data` are arrays of u8, while
they used to be cast to u32 in btrfs-progs v4.15. The memcmp checks
the full checksum anyway, so there's no worry about btrfs check doing
the wrong thing.
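
To illustrate the difference, here is a small Python model of the two
ways of printing a 4-byte checksum. It is not the btrfs-progs code:
the function names are hypothetical, and the byte values are derived
from the numbers quoted above under the assumption of a little-endian
machine.

```python
import struct


def format_csum_as_u32(csum: bytes) -> str:
    """Model of the older output: the first 4 bytes read as one u32.

    Little-endian byte order is an assumption (as on x86).
    """
    (value,) = struct.unpack_from("<I", csum)
    return f"{value:08X}"


def format_csum_first_byte(csum: bytes) -> str:
    """Model of the newer output: only element 0 of the u8 array is
    formatted, so the other three bytes never reach the message."""
    return f"{csum[0]:08X}"


# Byte sequences reconstructed from the quoted values (assumption).
found = bytes.fromhex("6E74B1BA")
wanted = bytes.fromhex("6682A4A8")

print("old:", format_csum_as_u32(found), format_csum_as_u32(wanted))
print("new:", format_csum_first_byte(found), format_csum_first_byte(wanted))
```

Under those assumptions the first line prints BAB1746E and A8A48266,
and the second prints 0000006E and 00000066, matching both versions
of the message.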

> Ballpark 8 hours for --repair given metadata size and spinning drives.
> It'll add some time adding --init-extent-tree which... is decently
> likely to be needed here. So the gotcha is, see if --repair works, and
> it fixes some stuff but still needs extent tree repaired anyway. Now
> you have to do that and it could be another 8 hours. Or do you go with
> the heavy hammer right away to save time and do both at once? But the
> heavy hammer is riskier.
> 
> Whether repair or start over, you need to have the backup plus 2x for
> important stuff. To do the repair you need to be prepared for the
> possibility things get worse. I'll argue strongly that it's a bug if
> things get worse (i.e. now you can't mount ro at all) but as a risk
> assessment, it has to be considered.

It's 16 hours I can run overnight vs 1-2 weeks of copying 4 TB of
non-essential data over the Internet at 100 Mbps. I think I'll make
sure there are two copies of the important stuff somewhere and take
the risk.

Is it worse to do the --repair while degraded? I'm sure the failing
drive will manage to ruin the day if I leave it connected; as I said,
it sometimes decides to hang forever.

Thanks a lot.
