From: NeilBrown <neilb@suse.de>
To: Jianpeng Ma <majianpeng@gmail.com>
Cc: linux-raid <linux-raid@vger.kernel.org>,
"周, 麒" <qi.g.zhou@gmail.com>, "边, 雨" <ycbzzjlbyby@gmail.com>
Subject: Re: [RFC PATCH] raid5: Correct some failed-stripe which because the badsector.
Date: Tue, 8 Jan 2013 09:17:41 +1100 [thread overview]
Message-ID: <20130108091741.4d9786cb@notabene.brown> (raw)
In-Reply-To: <2012092613251873407316@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 3966 bytes --]
On Wed, 26 Sep 2012 13:25:21 +0800 "Jianpeng Ma" <majianpeng@gmail.com> wrote:
> On 2012-09-26 09:20 NeilBrown <neilb@suse.de> Wrote:
> >On Sat, 22 Sep 2012 11:14:47 +0800 "Jianpeng Ma" <majianpeng@gmail.com> wrote:
> >
> >> Because raid5 had implemented badsector feature,some failed-stripe which because the
> >> badsector can try to correct instead of only failing.
> >>
> >> For example:
> >> 1:Create a raid5 by four disks missing/sdb/sdc/sdd.
> >> 2:If readed data from sdb and met error,so the stripe is failed
> >> 3:If later write-stripe contains sdb which overwrite.The stripe has a
> >> chance to correct.
> >
> >Your patch looks much more complicated that should be necessary, and you
> >don't provide any explanation for how it works, so it is very hard for me to
> >review.
> >
> Before implementing badsector log, failed-stripes were becasue the faulty or removed disks.
> For those, it's unnecessary to do something for those failed-stripe.
> But after implementing badsector log, failed-stripe can by some badsector.
> One example, RAID5 has four disk: one removed and one contained badsector, so the stripe is failed;
> Another example is recovery, when recovery a stripe and fail,so all the disks set badsector.
> But for above example, if full-write on this stripe,the badsector will be correct and failed-stripe become normal.
>
> This patch is do this to try recorrect some failed-stripe because badsectors.
> The workflow is:
> 1:in func analyse_stripe, we count the overwrite-on-failed-disks(contained badsector or faulty or removed), noworkeddisk(faulty or remove)
> 2:in func handle_stripe, it determines whether to be processed.There will be three cases:
> A: noworkdisk > max_degraded. It indicates there are many removed or faulty disks.It only failed
> B: All faild-disks are overwrited,which dosen't need read old data.So it can read other goog disks or computer parity data using RCW.
> C: Some failed-disks are overwrited.If stripe didn't set STRIPE_PREREAD_ACTIVE, stripe can delay for waitting other failed-disks write-data.
> Otherwise, it only failed the stripe.
> 3;in func handle_stripe_dirtying do two things:
> A: delay for failed-disks write-data
> B: set RCW to control this situation
> 4:After successly writed, one thing must to do.It will be rewrite and reread for check.
> But for this case, it's unnecessary to rewrite,it only reread the failed-disks(only contains badsector).
Thanks - I think I know what you are trying to do now.
Please resend the patch with *only* the changes that you actually need and
I'll try to review it.
i.e.
> >> @@ -548,9 +549,9 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> >> else
> >> rw = WRITE;
> >> } else if (test_and_clear_bit(R5_Wantread, &sh->dev[i].flags))
> >> - rw = READ;
> >> + rw = READ;
> >> else if (test_and_clear_bit(R5_WantReplace,
> >> - &sh->dev[i].flags)) {
> >> + &sh->dev[i].flags)) {
> >> rw = WRITE;
> >> replace_only = 1;
> >> } else
This is purely cosmetic so shouldn't be in the patch.
> >> @@ -1737,6 +1738,7 @@ static void raid5_end_read_request(struct bio * bi, int error)
> >> s = sh->sector + rdev->new_data_offset;
> >> else
> >> s = sh->sector + rdev->data_offset;
> >> +
> >> if (uptodate) {
> >> set_bit(R5_UPTODATE, &sh->dev[i].flags);
> >> if (test_bit(R5_ReadError, &sh->dev[i].flags)) {
So is this.
> >> @@ -1852,8 +1856,8 @@ static void raid5_end_write_request(struct bio *bi, int error)
> >> }
> >> }
> >> pr_debug("end_write_request %llu/%d, count %d, uptodate: %d.\n",
> >> - (unsigned long long)sh->sector, i, atomic_read(&sh->count),
> >> - uptodate);
> >> + (unsigned long long)sh->sector, i,
> >> + atomic_read(&sh->count), uptodate);
> >> if (i == disks) {
> >> BUG();
> >> return;
So is this.
etc.
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
prev parent reply other threads:[~2013-01-07 22:17 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-09-22 3:14 [RFC PATCH] raid5: Correct some failed-stripe which because the badsector Jianpeng Ma
2012-09-26 1:20 ` NeilBrown
2012-09-26 5:25 ` Jianpeng Ma
2013-01-07 22:17 ` NeilBrown [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130108091741.4d9786cb@notabene.brown \
--to=neilb@suse.de \
--cc=linux-raid@vger.kernel.org \
--cc=majianpeng@gmail.com \
--cc=qi.g.zhou@gmail.com \
--cc=ycbzzjlbyby@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.