From: NeilBrown <neilb@suse.de>
To: Phil Turmel <philip@turmel.org>
Cc: Piergiorgio Sartor <piergiorgio.sartor@nexgo.de>,
linux-raid@vger.kernel.org
Subject: Re: md road-map: 2011
Date: Thu, 17 Feb 2011 14:10:17 +1100
Message-ID: <20110217141017.01e30eab@notabene.brown>
In-Reply-To: <4D5C768A.1010502@turmel.org>
On Wed, 16 Feb 2011 20:14:50 -0500 Phil Turmel <philip@turmel.org> wrote:
> On 02/16/2011 07:52 PM, NeilBrown wrote:
> > So when you do the computation on all of the bytes in all of the blocks you
> > get a block full of answers.
> > If the answers are all the same - that tells you something fairly strong.
> > If they are "all different" then that is also a fairly strong statement.
> > But what if most are the same, but a few are different? How do you interpret
> > that?
>
> Actually, I was thinking about that. (You suckered me into reading that PDF
> some weeks ago.) I would be inclined to allow the kernel to make corrections
> where "all the same" covers individual sectors, per the sector size reported
> by the underlying device.
To see why I am strongly against having the kernel make automatic
corrections like this, see
http://neil.brown.name/blog/20100211050355
>
> Also, the comparison would have to ignore "neutral bytes", where P & Q
> happened to be correct for that byte position.
>
> > The point I'm trying to get to is that the result of this RAID6 calculation
> > isn't a simple "that device is bad". It is a block of data that needs to be
> > interpreted.
> >
> > I'd rather have user-space do that interpretation, so it may as well do the
> > calculation too.
> >
> > If you wanted to do it in the kernel, you would need to be very clear about
> > what information you provide, what it means exactly, and why it is sufficient.
>
> Given that the hardware is going to do error correction and checking at a
> sector size granularity, and the kernel would in fact rewrite that sector using
> this calculation if the hardware made a "fairly strong" statement that it can't
> be trusted, I'd argue that rewriting the sector is appropriate.
What the RAID6 calculation tells you is that something cannot be trusted. It
doesn't tell you what. It could be the controller, the cable, the drive
logic, or the rust on the media. Without that knowledge, correction can be
dangerous.
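
To make the "block full of answers" concrete, here is a rough user-space
sketch of the per-byte calculation. It assumes the usual RAID6 arithmetic
(GF(2^8) with polynomial 0x11d and generator 2); the function names
compute_pq, suspects and summarise are made up for illustration and are not
md or mdadm interfaces. It only reports which device each byte position
points at. It makes no attempt to decide what should be corrected.

```python
from collections import Counter

# Log/antilog tables for GF(2^8), polynomial 0x11d, generator 2
# (assumed here to match the usual RAID6 Q-syndrome arithmetic).
GF_EXP = [0] * 510
GF_LOG = [0] * 256
_x = 1
for _i in range(255):
    GF_EXP[_i] = _x
    GF_LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D
for _i in range(255, 510):
    GF_EXP[_i] = GF_EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return GF_EXP[GF_LOG[a] + GF_LOG[b]]


def compute_pq(data):
    """Return (P, Q) for a list of equal-length data blocks (bytes)."""
    n = len(data[0])
    p = bytearray(n)
    q = bytearray(n)
    for z, block in enumerate(data):
        coeff = GF_EXP[z]  # generator raised to the disk index
        for i, byte in enumerate(block):
            p[i] ^= byte
            q[i] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def suspects(data, p, q):
    """One answer per byte position.

    None       P and Q both agree with the data (a "neutral" byte)
    'P' / 'Q'  only that parity byte disagrees
    int        index of the data device a single-device error points at
    'multiple' not explainable by an error on any one device
    """
    calc_p, calc_q = compute_pq(data)
    answers = []
    for i in range(len(p)):
        ps = calc_p[i] ^ p[i]
        qs = calc_q[i] ^ q[i]
        if ps == 0 and qs == 0:
            answers.append(None)
        elif ps == 0:
            answers.append('Q')
        elif qs == 0:
            answers.append('P')
        else:
            # A single bad data byte on device z gives qs = g^z * ps.
            z = (GF_LOG[qs] - GF_LOG[ps]) % 255
            answers.append(z if z < len(data) else 'multiple')
    return answers


def summarise(answers):
    """Count the non-neutral answers so they can be interpreted."""
    return Counter(a for a in answers if a is not None)
```

Running suspects() over a stripe and passing the result to summarise()
gives the tally discussed above: one key means "all the same", many keys
means "all different", and anything in between is the case that still needs
interpreting. Even a single key only says which device the arithmetic points
at, not whether the controller, cable, drive logic or media is at fault.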
NeilBrown
>
> Any corrective action that isn't consistent at the sector level should be punted.
> I'm very curious what percentage that would be in production environments.
>
> Phil