From: Mike Fedyk <mfedyk@matchmail.com>
To: Neil Brown <neilb@cse.unsw.edu.au>
Cc: linux-kernel@vger.kernel.org
Subject: Re: 2.6 md raid5 disk faulty marking bug was: md: bug in file raid5.c, line 1909 in 2.4.22-pre7
Date: Tue, 2 Sep 2003 19:04:28 -0700
Message-ID: <20030903020428.GA15765@matchmail.com>
In-Reply-To: <16202.39333.830809.797201@gargle.gargle.HOWL>

On Tue, Aug 26, 2003 at 09:20:05AM +1000, Neil Brown wrote:
> On Friday August 22, mfedyk@matchmail.com wrote:
> > > As far as I can see, the 2.4 code would never set just MD_DISK_REMOVED
> > > (though it really should cope with it).  It is possible that the 2.6
> > > code does.  Has this array had 2.6 running on it?  Does it have any
> > > interesting history?
> > 
> > Yes, it was running 2.6-test2-mm2 or so (don't remember exactly, I can check
> > though if needed) previously, but I didn't notice any bug messages from
> > there, and seeing that it was 2.4 I was surprised to see bug
> > messages from md.
> 
> The 2.4 code is very fragile.  It can easily bug if the superblock
> looks wrong in some obscure way.
> 
> > 
> > Do you have any patches for 2.6 md?  Right now this system is still in
> > testing, and I'd like to help get this code path tested, and fixed.
> 
> The following should fix it.
> With this applied to 2.6, you can simply 
>   start the array under 2.6
>   stop the array
>   reboot into 2.4
> and it should be fine again.
> 
> NeilBrown
> 
> ==================================================
> Fix md superblock incompatibilities with 2.4 kernels.
> 
> 2.4 kernels are very fussy about some values in the superblock, and
> 2.6 got them wrong.  This fixes it.
> 

Yes, I can confirm this fixes it for me.  I applied it to
2.6.0-test4-mm3-1, without any troubles.

I tried rebuilding the array with 2.4 (zeroing the superblock in mdadm and
everything), but that didn't fix it, so I rebuilt with 2.6 and the below
patch, and so far so good.  I'll be beating on this machine some more...

Mike

> 
>  ----------- Diffstat output ------------
>  ./drivers/md/md.c |   13 ++++++-------
>  1 files changed, 6 insertions(+), 7 deletions(-)
> 
> diff ./drivers/md/md.c~current~ ./drivers/md/md.c
> --- ./drivers/md/md.c~current~	2003-08-24 08:07:18.000000000 +1000
> +++ ./drivers/md/md.c	2003-08-26 09:11:39.000000000 +1000
> @@ -638,14 +638,13 @@ static void super_90_sync(mddev_t *mddev
>  	/* make rdev->sb match mddev data..
>  	 *
>  	 * 1/ zero out disks
> -	 * 2/ Add info for each disk, keeping track of highest desc_nr
> -	 * 3/ any empty disks < highest become removed
> +	 * 2/ Add info for each disk, keeping track of highest desc_nr (next_spare);
> +	 * 3/ any empty disks < next_spare become removed
>  	 *
>  	 * disks[0] gets initialised to REMOVED because
>  	 * we cannot be sure from other fields if it has
>  	 * been initialised or not.
>  	 */
> -	int highest = 0;
>  	int i;
>  	int active=0, working=0,failed=0,spare=0,nr_disks=0;
>  
> @@ -716,17 +715,17 @@ static void super_90_sync(mddev_t *mddev
>  			spare++;
>  			working++;
>  		}
> -		if (rdev2->desc_nr > highest)
> -			highest = rdev2->desc_nr;
>  	}
>  	
> -	/* now set the "removed" bit on any non-trailing holes */
> -	for (i=0; i<highest; i++) {
> +	/* now set the "removed" and "faulty" bits on any missing devices */
> +	for (i=0 ; i < mddev->raid_disks ; i++) {
>  		mdp_disk_t *d = &sb->disks[i];
>  		if (d->state == 0 && d->number == 0) {
>  			d->number = i;
>  			d->raid_disk = i;
>  			d->state = (1<<MD_DISK_REMOVED);
> +			d->state |= (1<<MD_DISK_FAULTY);
> +			failed++;
>  		}
>  	}
>  	sb->nr_disks = nr_disks;
> 
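
For anyone following along, here is a rough userspace sketch of what the
patched loop appears to do: walk every slot below raid_disks and mark any
slot that was never filled in as both removed and faulty.  The struct, the
function name and SKETCH_MAX_DISKS are made up for illustration, and the
bit positions are what I recall from the 0.90 superblock header, so treat
them as assumptions rather than the kernel's actual definitions.

```c
#include <assert.h>

/* Assumed bit positions for the per-disk state word (0.90 superblock). */
enum {
	MD_DISK_FAULTY  = 0,
	MD_DISK_ACTIVE  = 1,
	MD_DISK_SYNC    = 2,
	MD_DISK_REMOVED = 3
};

#define SKETCH_MAX_DISKS 27	/* hypothetical bound for this sketch */

/* Hypothetical stand-in for mdp_disk_t, reduced to the fields used here. */
struct sketch_disk {
	unsigned int number;
	unsigned int raid_disk;
	unsigned int state;
};

/*
 * Mark every never-initialised slot in [0, raid_disks) as removed and
 * faulty, as the patched loop does.  Returns how many slots were marked,
 * which corresponds to the "failed++" in the patch.
 */
static int sketch_mark_missing(struct sketch_disk *disks, int raid_disks)
{
	int i, marked = 0;

	assert(raid_disks >= 0 && raid_disks <= SKETCH_MAX_DISKS);

	for (i = 0; i < raid_disks; i++) {
		struct sketch_disk *d = &disks[i];

		/* state == 0 and number == 0 means nothing was written here */
		if (d->state == 0 && d->number == 0) {
			d->number = i;
			d->raid_disk = i;
			d->state = (1u << MD_DISK_REMOVED);
			d->state |= (1u << MD_DISK_FAULTY);
			marked++;
		}
	}
	return marked;
}
```

The visible difference from the old loop is the bound: it runs to
raid_disks instead of the highest desc_nr seen, so a hole in the last
slot gets marked too, and the slot is also flagged faulty rather than
only removed.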

Thread overview: 5+ messages
2003-08-19 20:37 md: bug in file raid5.c, line 1909 in 2.4.22-pre7 Mike Fedyk
2003-08-22  5:22 ` Neil Brown
2003-08-22 17:21   ` 2.6 md raid5 disk faulty marking bug was: " Mike Fedyk
2003-08-25 23:20     ` Neil Brown
2003-09-03  2:04       ` Mike Fedyk [this message]