From: Neil Brown <neilb@suse.de>
To: MRK <mrk@shiftmail.org>
Cc: Janos Haar <janos.haar@netcenter.hu>, linux-raid@vger.kernel.org
Subject: Re: Suggestion needed for fixing RAID6
Date: Tue, 4 May 2010 07:02:00 +1000
Message-ID: <20100504070200.55be73c7@notabene.brown>
In-Reply-To: <4BDE9FB6.80309@shiftmail.org>
On Mon, 03 May 2010 12:04:38 +0200
MRK <mrk@shiftmail.org> wrote:
> On 05/03/2010 04:17 AM, Neil Brown wrote:
> > On Sat, 1 May 2010 23:44:04 +0200
> > "Janos Haar"<janos.haar@netcenter.hu> wrote:
> >
> >
> >> The general problem is, I have one single-degraded RAID6 + 2 badblock disks
> >> inside, which have bad blocks in different locations.
> >> The big question is how to keep the integrity, or how to do the rebuild in 2
> >> steps instead of one continuous pass?
> >>
> > Once you have the fix that has already been discussed in this thread, the
> > only other problem I can see with this situation is if attempts to write good
> > data over the read-errors results in a write-error which causes the device to
> > be evicted from the array.
> >
> > And I think you have reported getting write
> > errors.
> >
>
> His dmesg AFAIR has never reported any error of the kind "raid5:%s: read
> error NOT corrected!! " (the error message you get on failed rewrite AFAIU)
> Up to now (after my patch) he only tried with MD above DM-COW and DM was
> dropping the drive on read error so I think MD didn't get any
> opportunity to rewrite.
Hmmm... fair enough.
>
> It is not clear to me what kind of error MD got from DM:
>
> Apr 29 09:50:29 Clarus-gl2k10-2 kernel: device-mapper: snapshots: Invalidating snapshot: Error reading/writing.
> Apr 29 09:50:29 Clarus-gl2k10-2 kernel: ata8: EH complete
> Apr 29 09:50:29 Clarus-gl2k10-2 kernel: raid5: Disk failure on dm-1, disabling device.
>
> I don't understand from what place the md_error() is called...
I suspect it is from raid5_end_write_request. It looks like we don't print
any message when the rewrite fails, only when the read after the rewrite
fails.
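
To make the two failure points concrete, here is a toy Python model of the
sequence described above. It is only a sketch, not kernel code: the names
`Device` and `handle_read_error`, the outcome strings, and the exact log
wording for the re-read case are assumptions made for illustration. The one
point it is meant to show is that a failed rewrite evicts the device without
any rewrite-specific message, so only the generic "Disk failure" line appears.

```python
from dataclasses import dataclass


@dataclass
class Device:
    """Hypothetical stand-in for one array member."""
    name: str
    write_ok: bool = True    # does the rewrite of the bad sector succeed?
    reread_ok: bool = True   # does the read after the rewrite succeed?
    failed: bool = False


def handle_read_error(dev, log):
    """Toy model of what happens after a read error on one member.

    Step 1: the data is rebuilt from the other members and written back.
    Step 2: the sector is read again to check that the rewrite took.
    """
    if not dev.write_ok:
        # Rewrite failed: the device is evicted, and nothing is logged
        # that says the failure happened during a rewrite.
        dev.failed = True
        log.append(f"raid5: Disk failure on {dev.name}, disabling device.")
        return "evicted-on-write"

    if not dev.reread_ok:
        # Only this path produces the "NOT corrected" message.
        # Assumption: the device is also evicted at this point.
        log.append(f"raid5:{dev.name}: read error NOT corrected!!")
        dev.failed = True
        return "evicted-on-reread"

    return "corrected"


if __name__ == "__main__":
    log = []
    outcome = handle_read_error(Device("dm-1", write_ok=False), log)
    print(outcome)
    print("\n".join(log))
```

In this model, a log that contains the "Disk failure" line but no "NOT
corrected" line is consistent with a failed rewrite, which matches the dmesg
excerpt quoted above.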
> but also in this case it doesn't look like a rewrite error...
>
... so I suspect it is a rewrite error, unless I missed something. What
message did you expect to see in the case of a rewrite error?
> I think without DM COW it should probably work in his case.
>
> Your new patch skips the rewriting and keeps the unreadable sectors,
> right? So that the drive isn't dropped on rewrite...
Correct.
>
> > The following patch should address this issue for you.
> > It is *not* a general-purpose fix, but a specific fix
> [CUT]
NeilBrown