From: Nix <nix@esperi.org.uk>
To: David Brown <david.brown@hesbynett.no>
Cc: Wols Lists <antlists@youngman.org.uk>,
	"Ravi (Tom) Hale" <ravi@hale.ee>,
	linux-raid@vger.kernel.org
Subject: Re: Fault tolerance with badblocks
Date: Tue, 09 May 2017 10:58:34 +0100
Message-ID: <87efvy73n9.fsf@esperi.org.uk>
In-Reply-To: <591171BD.3060707@hesbynett.no> (David Brown's message of "Tue, 09 May 2017 09:37:33 +0200")

On 9 May 2017, David Brown spake thusly:

> On 08/05/17 16:50, Nix wrote:
>
>> I wonder... scrubbing is not very useful with md, particularly with RAID
>> 6, because it does no writes unless something mismatches, and on failure
>> there is no attempt to determine which of the N disks is bad and rewrite
>> its contents from the other devices (nor, as I understand it, does it
>> clearly say which drive gave the error, so even failing it out and
>> resyncing it is hard).
>
> Please read Neil Brown's article on this: "Smart or simple RAID
> recovery?" <http://neil.brown.name/blog/20100211050355>

I have. The simple recovery is too simple. So you have a 40TiB RAID-6
array, say, and mismatch_cnt is consistently >0, but a low value, on
scrub. What can you do? The drive is probably not faulty or you'd have
many more mismatches from persistent misdirected reads or writes. md
doesn't repair the corruption, even though on RAID-6 it could. It
doesn't tell you which disk disagreed so you can fail it out. It doesn't
even tell you where the disagreement was so you can try to rebuild it by
hand. What on earth are you supposed to do in this case? Wipe the entire
array and restore from backup? For a *single* sector?
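
[A minimal sketch of why a single bad device is in principle locatable on
RAID-6. It assumes the usual P/Q construction over GF(2^8) with polynomial
0x11d and generator 2, and that at most one device is wrong at a given byte
position. The names syndromes() and locate() are hypothetical; this is an
illustration of the arithmetic, not md's code or behaviour.]

```python
# GF(2^8) log/antilog tables, polynomial 0x11d, generator 2 (assumed field).
EXP = [0] * 510
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def syndromes(data):
    """Return (P, Q) for one byte position across the data devices."""
    p = q = 0
    for i, d in enumerate(data):
        p ^= d                      # P is plain XOR parity
        q ^= gf_mul(EXP[i], d)      # Q weights device i by g**i
    return p, q


def locate(data, p_stored, q_stored):
    """Guess which single device disagrees at this byte position.

    Returns ("ok", None), ("P", None), ("Q", None),
    ("data", (index, corrected_byte)) or ("multiple", None).
    """
    p_calc, q_calc = syndromes(data)
    dp = p_stored ^ p_calc
    dq = q_stored ^ q_calc
    if dp == 0 and dq == 0:
        return "ok", None
    if dp == 0:
        return "Q", None            # only Q disagrees: Q itself is suspect
    if dq == 0:
        return "P", None            # only P disagrees: P itself is suspect
    # A single bad data device z gives dq == g**z * dp, so z = log(dq/dp).
    z = (LOG[dq] - LOG[dp]) % 255
    if z >= len(data):
        return "multiple", None     # not explainable by one bad device
    return "data", (z, data[z] ^ dp)
```

[With data bytes [0x11, 0x22, 0x33, 0x44] and device 2 corrupted after P and
Q were computed, locate() points at index 2 and recovers 0x33. Whether
acting on that guess automatically is wise is exactly what the article
quoted above argues about.]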

Right now I'm doing scrubs and ignoring the mismatch_cnt, because all it
can do is increase my worry level to no gain at all. I could just as
well do a dd over /dev/md*. It would have the same effect (only without
md's progress feedback and bandwidth throttling. You get progress
feedback, but you don't get told where errors are found?!)
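
[For reference, a minimal sketch of the scrub-and-count cycle described
here, using the md sysfs attributes sync_action and mismatch_cnt. The array
name "md0", the helper names and the sysfs_root parameter are assumptions
for illustration. Note that all it can report is a count, which is the
complaint above.]

```python
from pathlib import Path


def md_attr(array, name, sysfs_root="/sys/block"):
    """Path of an md sysfs attribute, e.g. /sys/block/md0/md/mismatch_cnt."""
    return Path(sysfs_root) / array / "md" / name


def start_check(array, sysfs_root="/sys/block"):
    """Ask md to start a read-only scrub of the array (needs root)."""
    md_attr(array, "sync_action", sysfs_root).write_text("check\n")


def scrub_status(array, sysfs_root="/sys/block"):
    """Return (current sync action, mismatch count) for the array."""
    action = md_attr(array, "sync_action", sysfs_root).read_text().strip()
    count = int(md_attr(array, "mismatch_cnt", sysfs_root).read_text().strip())
    return action, count
```

[Typical use would be start_check("md0"), then polling scrub_status("md0")
until the action returns to "idle" and reading the count.]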

> When the disk is asked to read a block, it pulls up the data and the ECC
> bits, and uses this to check and re-construct the 4K of data, and a
> measure of how many errors were corrected.  On modern high-capacity
> drives, it is normal that some errors are corrected on a read.  But if
> more than a certain level occur, then the firmware will trigger a
> re-write automatically to the same sector.  This will then be re-read.
> If the error rate is low, fine.  If it is high, then the sector will be
> remapped by the disk.
>
> So simply /reading/ the data, as far as the processor is concerned, will
> cause re-writes as and when needed.

Last time I asked a disk manufacturer about this, they said oh no, we
never correct on read, we can't: if we needed to correct on read, the
data would already be unreadable, so you have to trigger a write to get
sparing. Nice to see the drive firmware has improved in the last few
years... but one wonders how many disks actually *do* this. It's hard to
tell because sector sparing is so quiet: it's not always even reflected
in the SMART data, AIUI.
