Re: [RFC PATCH] raid5: Correct some failed-stripe which because the badsector.

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: NeilBrown <neilb@suse.de>
To: Jianpeng Ma <majianpeng@gmail.com>
Cc: linux-raid <linux-raid@vger.kernel.org>,
	"周, 麒" <qi.g.zhou@gmail.com>, "边, 雨" <ycbzzjlbyby@gmail.com>
Subject: Re: [RFC PATCH] raid5: Correct some failed-stripe which because the badsector.
Date: Tue, 8 Jan 2013 09:17:41 +1100	[thread overview]
Message-ID: <20130108091741.4d9786cb@notabene.brown> (raw)
In-Reply-To: <2012092613251873407316@gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3966 bytes --]

On Wed, 26 Sep 2012 13:25:21 +0800 "Jianpeng Ma" <majianpeng@gmail.com> wrote:

> On 2012-09-26 09:20 NeilBrown <neilb@suse.de> Wrote:
> >On Sat, 22 Sep 2012 11:14:47 +0800 "Jianpeng Ma" <majianpeng@gmail.com> wrote:
> >
> >> Because raid5 had implemented badsector feature,some failed-stripe which because the
> >> badsector can try to correct instead of only failing.
> >> 
> >> For example:
> >> 1:Create a raid5 by four disks missing/sdb/sdc/sdd.
> >> 2:If readed  data from sdb and met error,so the stripe is failed
> >> 3:If later write-stripe contains sdb which overwrite.The stripe has a
> >> chance to correct.
> >
> >Your patch looks much more complicated that should be necessary, and you
> >don't provide any explanation for how it works, so it is very hard for me to
> >review.
> >
> Before implementing badsector log, failed-stripes were becasue the faulty or removed disks.
> For those, it's unnecessary to do something for those failed-stripe.
> But after implementing badsector log, failed-stripe can by some badsector.
> One example, RAID5 has four disk: one removed and one contained badsector, so the stripe is failed;
> Another example is recovery, when recovery a stripe and fail,so all the disks set badsector.
> But for above example, if full-write on this stripe,the badsector will be correct and failed-stripe become normal.
> 
> This patch is do this to try recorrect some failed-stripe because badsectors.
> The workflow is:
> 1:in func analyse_stripe, we count the overwrite-on-failed-disks(contained badsector or faulty or removed), noworkeddisk(faulty or remove)
> 2:in func handle_stripe, it determines whether to be processed.There will be three cases:
> A: noworkdisk > max_degraded. It indicates there are many removed or faulty disks.It only failed
> B: All faild-disks are overwrited,which dosen't need read old data.So it can read other goog disks or computer parity data using RCW.
> C: Some failed-disks are overwrited.If stripe didn't set STRIPE_PREREAD_ACTIVE, stripe can delay for waitting other failed-disks write-data.
> Otherwise, it only failed the stripe.
> 3;in func handle_stripe_dirtying do two things:
> 	A: delay for failed-disks write-data
> 	B: set RCW to control this situation
> 4:After successly writed, one thing must to do.It will be rewrite and reread for check.
> But for this case, it's unnecessary to rewrite,it only reread the failed-disks(only contains badsector).


Thanks - I think I know what you are trying to do now.

Please resend the patch with *only* the changes that you actually need and
I'll try to review it.


i.e.

> >> @@ -548,9 +549,9 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
> >>  			else
> >>  				rw = WRITE;
> >>  		} else if (test_and_clear_bit(R5_Wantread, &sh->dev[i].flags))
> >> -			rw = READ;
> >> +				rw = READ;
> >>  		else if (test_and_clear_bit(R5_WantReplace,
> >> -					    &sh->dev[i].flags)) {
> >> +				&sh->dev[i].flags)) {
> >>  			rw = WRITE;
> >>  			replace_only = 1;
> >>  		} else

This is purely cosmetic so shouldn't be in the patch.


> >> @@ -1737,6 +1738,7 @@ static void raid5_end_read_request(struct bio * bi, int error)
> >>  		s = sh->sector + rdev->new_data_offset;
> >>  	else
> >>  		s = sh->sector + rdev->data_offset;
> >> +
> >>  	if (uptodate) {
> >>  		set_bit(R5_UPTODATE, &sh->dev[i].flags);
> >>  		if (test_bit(R5_ReadError, &sh->dev[i].flags)) {

So is this.


> >> @@ -1852,8 +1856,8 @@ static void raid5_end_write_request(struct bio *bi, int error)
> >>  		}
> >>  	}
> >>  	pr_debug("end_write_request %llu/%d, count %d, uptodate: %d.\n",
> >> -		(unsigned long long)sh->sector, i, atomic_read(&sh->count),
> >> -		uptodate);
> >> +		(unsigned long long)sh->sector, i,
> >> +		atomic_read(&sh->count), uptodate);
> >>  	if (i == disks) {
> >>  		BUG();
> >>  		return;

So is this.
etc.

NeilBrown



[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

     prev parent reply	other threads:[~2013-01-07 22:17 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-22  3:14 [RFC PATCH] raid5: Correct some failed-stripe which because the badsector Jianpeng Ma
2012-09-26  1:20 ` NeilBrown
2012-09-26  5:25   ` Jianpeng Ma
2013-01-07 22:17     ` NeilBrown [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130108091741.4d9786cb@notabene.brown \
    --to=neilb@suse.de \
    --cc=linux-raid@vger.kernel.org \
    --cc=majianpeng@gmail.com \
    --cc=qi.g.zhou@gmail.com \
    --cc=ycbzzjlbyby@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).