[PATCH md ] Fix BUG in raid5 resync code.

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH md ] Fix BUG in raid5 resync code.
       [not found] <20040604163651.7946.patches@notabene>
@ 2004-06-04  6:38 ` NeilBrown
  2004-06-07  9:33   ` Nick Maynard
  0 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2004-06-04  6:38 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid

Today I lucked upon a bug in raid5 that appears to have been there
since 2.4.0 and earlier.   I guess it doesn't trigger very often.

2.4 will need a similar fix.

### Comments for Changeset

This condtion on this loop is primarily to avoid the loop
if it doesn't appear to be needed.  However it optimises
a little too much and there is a case where it skips the
loop when it is really needed.  This patch fixes it.


Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>

### Diffstat output
 ./drivers/md/raid5.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~	2004-06-04 16:36:19.000000000 +1000
+++ ./drivers/md/raid5.c	2004-06-04 16:36:28.000000000 +1000
@@ -1037,7 +1037,7 @@ static void handle_stripe(struct stripe_
 	 * parity, or to satisfy requests
 	 * or to load a block that is being partially written.
 	 */
-	if (to_read || non_overwrite || (syncing && (uptodate+failed < disks))) {
+	if (to_read || non_overwrite || (syncing && (uptodate < disks))) {
 		for (i=disks; i--;) {
 			dev = &sh->dev[i];
 			if (!test_bit(R5_LOCKED, &dev->flags) && !test_bit(R5_UPTODATE, &dev->flags) &&

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH md ] Fix BUG in raid5 resync code.
       [not found] <20040604165208.8646.patches@notabene>
@ 2004-06-04  6:55 ` NeilBrown
  2004-06-04  7:10   ` David Greaves
  0 siblings, 1 reply; 6+ messages in thread
From: NeilBrown @ 2004-06-04  6:55 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: linux-raid

Hi Marcelo,
 This patch fixes a long standing bug in raid5, that is fairly hard to hit.

 As the comment below says, the if() condition on the for loop is there to 
optimised out the loop if it is known that it isn't needed, so making the test 
broader (as this patch does) cannot possibly hurt in correctness.
Please include this in an upcoming release.
Thanks,
NeilBrown

(patch against 2.4.27-pre5)

### Comments for Changeset

This condition on this loop is primarily to avoid the loop
if it doesn't appear to be needed.  However it optimises
a little too much and there is a case where it skips the
loop when it is really needed.  This patch fixes it.

Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>

### Diffstat output
 ./drivers/md/raid5.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~	2004-06-04 16:50:34.000000000 +1000
+++ ./drivers/md/raid5.c	2004-06-04 16:51:06.000000000 +1000
@@ -950,7 +950,7 @@ static void handle_stripe(struct stripe_
 	/* Now we might consider reading some blocks, either to check/generate
 	 * parity, or to satisfy requests
 	 */
-	if (to_read || (syncing && (uptodate+failed < disks))) {
+	if (to_read || (syncing && (uptodate < disks))) {
 		for (i=disks; i--;) {
 			bh = sh->bh_cache[i];
 			if (!buffer_locked(bh) && !buffer_uptodate(bh) &&

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH md ] Fix BUG in raid5 resync code.
  2004-06-04  6:55 ` [PATCH md ] Fix BUG in raid5 resync code NeilBrown
@ 2004-06-04  7:10   ` David Greaves
  2004-06-04 12:04     ` Neil Brown
  0 siblings, 1 reply; 6+ messages in thread
From: David Greaves @ 2004-06-04  7:10 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Hi Neil

Given I'm about to resync about 400Gb of data I'd like to check this 
before blindly storming ahead... :)

Would the following be the correct patch against 2.6.6?

--- drivers/md/raid5.c~ 2004-05-30 18:38:49.000000000 +0100
+++ drivers/md/raid5.c  2004-06-04 09:06:12.000000000 +0100
@@ -1037,7 +1037,7 @@
         * parity, or to satisfy requests
         * or to load a block that is being partially written.
         */
-       if (to_read || non_overwrite || (syncing && (uptodate+failed < 
disks))) {
+       if (to_read || non_overwrite || (syncing && (uptodate < disks))) {
                for (i=disks; i--;) {
                        dev = &sh->dev[i];
                        if (!test_bit(R5_LOCKED, &dev->flags) && 
!test_bit(R5_UPTODATE, &dev->flags) &&
c


Thanks

David

NeilBrown wrote:

>Hi Marcelo,
> This patch fixes a long standing bug in raid5, that is fairly hard to hit.
>
> As the comment below says, the if() condition on the for loop is there to 
>optimised out the loop if it is known that it isn't needed, so making the test 
>broader (as this patch does) cannot possibly hurt in correctness.
>Please include this in an upcoming release.
>Thanks,
>NeilBrown
>
>(patch against 2.4.27-pre5)
>
>### Comments for Changeset
>
>
>This condition on this loop is primarily to avoid the loop
>if it doesn't appear to be needed.  However it optimises
>a little too much and there is a case where it skips the
>loop when it is really needed.  This patch fixes it.
>
>
>Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>
>
>### Diffstat output
> ./drivers/md/raid5.c |    2 +-
> 1 files changed, 1 insertion(+), 1 deletion(-)
>
>diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
>--- ./drivers/md/raid5.c~current~	2004-06-04 16:50:34.000000000 +1000
>+++ ./drivers/md/raid5.c	2004-06-04 16:51:06.000000000 +1000
>@@ -950,7 +950,7 @@ static void handle_stripe(struct stripe_
> 	/* Now we might consider reading some blocks, either to check/generate
> 	 * parity, or to satisfy requests
> 	 */
>-	if (to_read || (syncing && (uptodate+failed < disks))) {
>+	if (to_read || (syncing && (uptodate < disks))) {
> 		for (i=disks; i--;) {
> 			bh = sh->bh_cache[i];
> 			if (!buffer_locked(bh) && !buffer_uptodate(bh) &&
>-
>To unsubscribe from this list: send the line "unsubscribe linux-raid" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>  
>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH md ] Fix BUG in raid5 resync code.
  2004-06-04  7:10   ` David Greaves
@ 2004-06-04 12:04     ` Neil Brown
  0 siblings, 0 replies; 6+ messages in thread
From: Neil Brown @ 2004-06-04 12:04 UTC (permalink / raw)
  To: David Greaves; +Cc: linux-raid

On Friday June 4, david@dgreaves.com wrote:
> Hi Neil
> 
> Given I'm about to resync about 400Gb of data I'd like to check this 
> before blindly storming ahead... :)
> 
> Would the following be the correct patch against 2.6.6?

Yes. Remove the "+failed".

NeilBrown

> 
> --- drivers/md/raid5.c~ 2004-05-30 18:38:49.000000000 +0100
> +++ drivers/md/raid5.c  2004-06-04 09:06:12.000000000 +0100
> @@ -1037,7 +1037,7 @@
>          * parity, or to satisfy requests
>          * or to load a block that is being partially written.
>          */
> -       if (to_read || non_overwrite || (syncing && (uptodate+failed < 
> disks))) {
> +       if (to_read || non_overwrite || (syncing && (uptodate < disks))) {
>                 for (i=disks; i--;) {
>                         dev = &sh->dev[i];
>                         if (!test_bit(R5_LOCKED, &dev->flags) && 
> !test_bit(R5_UPTODATE, &dev->flags) &&
> c
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH md ] Fix BUG in raid5 resync code.
  2004-06-04  6:38 ` NeilBrown
@ 2004-06-07  9:33   ` Nick Maynard
  2004-06-07 23:01     ` Neil Brown
  0 siblings, 1 reply; 6+ messages in thread
From: Nick Maynard @ 2004-06-07  9:33 UTC (permalink / raw)
  To: NeilBrown; +Cc: Andrew Morton, linux-raid

Hi Neil,

> Today I lucked upon a bug in raid5 that appears to have been there
> since 2.4.0 and earlier.   I guess it doesn't trigger very often.
>
> 2.4 will need a similar fix.
<snip>

Do you have any idea when we can expect this fix ported to 2.4?
Also, when the 2.6 kernel series will include it?

Thanks,

--

Nick Maynard
nick.maynard@alumni.doc.ic.ac.uk

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH md ] Fix BUG in raid5 resync code.
  2004-06-07  9:33   ` Nick Maynard
@ 2004-06-07 23:01     ` Neil Brown
  0 siblings, 0 replies; 6+ messages in thread
From: Neil Brown @ 2004-06-07 23:01 UTC (permalink / raw)
  To: Nick Maynard; +Cc: Andrew Morton, linux-raid

On Monday June 7, nick.maynard@alumni.doc.ic.ac.uk wrote:
> Hi Neil,
> 
> > Today I lucked upon a bug in raid5 that appears to have been there
> > since 2.4.0 and earlier.   I guess it doesn't trigger very often.
> >
> > 2.4 will need a similar fix.
> <snip>
> 
> Do you have any idea when we can expect this fix ported to 2.4?

It is in the linux-2.4 bk repository:
http://linux.bkbits.com:8080/linux-2.4/cset@40c31f28n8OMv8RVMIjLPDkq3APFWA?nav=index.html|ChangeSet@-2d

so it will be in the next (pre)release.

> Also, when the 2.6 kernel series will include it?

It is in Linux 2.6.7-rc3
http://linux.bkbits.com:8080/linux-2.5/cset@40c209c4CQWb5gFyz-f96tQ8QaBv-A?nav=index.html|ChangeSet@-3d

NeilBrown

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2004-06-07 23:01 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20040604165208.8646.patches@notabene>
2004-06-04  6:55 ` [PATCH md ] Fix BUG in raid5 resync code NeilBrown
2004-06-04  7:10   ` David Greaves
2004-06-04 12:04     ` Neil Brown
     [not found] <20040604163651.7946.patches@notabene>
2004-06-04  6:38 ` NeilBrown
2004-06-07  9:33   ` Nick Maynard
2004-06-07 23:01     ` Neil Brown

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).