linux-raid.vger.kernel.org archive mirror
From: "3tcdgwg3" <3tcdgwg3@prodigy.net>
To: 3tcdgwg3 <3tcdgwg3@prodigy.net>, Neil Brown <neilb@cse.unsw.edu.au>
Cc: linux-raid@vger.kernel.org
Subject: Re: raid5, 2 drives dead at same time,kernel will Oops?
Date: Fri, 30 May 2003 13:33:03 -0700	[thread overview]
Message-ID: <020701c326ea$abfa0d80$7b07a8c0@pluto> (raw)
In-Reply-To: 00a901c31ffe$19ead340$7b07a8c0@pluto

Hi,

I have some other issues under this "more than one
arm broken in a raid5 array" condition. The next
important one is this:

I have two arrays, both raid5. A resync is running on
the first, and the resync of the second is scheduled
to start once the first array has finished syncing.
If I kill two arms in the first raid5 array at this
point, its resync stops but is never aborted;
consequently, the second raid5 array never gets a
chance to start its resync.
Is there a fix for this?
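
For reference, here is how I watch for the stall. This is only a
minimal sketch (not part of any tool; it assumes nothing beyond the
2.4 /proc/mdstat layout, and the file name and interval are
arbitrary): it prints the resync status lines every few seconds, so
you can see the percentage stop moving without the resync ever
being aborted.

/* mdwatch.c - print the resync lines from /proc/mdstat every 5s */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	char line[256];

	for (;;) {
		FILE *f = fopen("/proc/mdstat", "r");
		if (!f) { perror("/proc/mdstat"); return 1; }
		while (fgets(line, sizeof(line), f))
			if (strstr(line, "resync"))
				fputs(line, stdout);
		fclose(f);
		sleep(5);
	}
}

On a healthy system the second array's resync line shows up as soon
as the first array finishes; here it never does.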

Thanks

-W


----- Original Message -----
From: "3tcdgwg3" <3tcdgwg3@prodigy.net>
To: "Neil Brown" <neilb@cse.unsw.edu.au>
Cc: <linux-raid@vger.kernel.org>
Sent: Wednesday, May 21, 2003 6:04 PM
Subject: Re: raid5, 2 drives dead at same time,kernel will Oops?


> Neil,
> Preliminary test looks good; I will test more
> when I have time.
>
> Thanks,
> -Will.
> ----- Original Message -----
> From: "Neil Brown" <neilb@cse.unsw.edu.au>
> To: "3tcdgwg3" <3tcdgwg3@prodigy.net>
> Cc: <linux-raid@vger.kernel.org>
> Sent: Tuesday, May 20, 2003 7:42 PM
> Subject: Re: raid5, 2 drives dead at same time,kernel will Oops?
>
>
> > On Monday May 19, 3tcdgwg3@prodigy.net wrote:
> > > Hi,
> > >
> > > I am trying to simulate the case where two drives
> > > in an array fail at the same time.
> > > Using two IDE drives, I create a raid5 array
> > > with 4 arms, as follows:
> > >
> > > /dev/hdc1
> > > /dev/hde1
> > > /dev/hdc2
> > > /dev/hde2
> > >
> > > This is just for testing; I know that putting two
> > > arms on one hard drive doesn't make much sense.
> > >
> > > Anyway, when I run this array and power off one
> > > of the hard drives (/dev/hde) to simulate two arms
> > > failing at the same time, I get a system Oops. I am
> > > using a 2.4.18 kernel.
> > >
> > > Can anyone tell me if this is normal, or if there is a fix for it?
> > >
> >
> > Congratulations and thanks.  You have managed to trigger a bug that
> > no-one else has found.
> >
> > The following patch (against 2.4.20) should fix it.  If you can test
> > and confirm I would really appreciate it.
> >
> > NeilBrown
> >
> >
> > ------------------------------------------------------------
> > Handle concurrent failure of two drives in raid5
> >
> > If two drives both fail during a write request, raid5 doesn't
> > cope properly and will eventually oops.
> >
> > With this patch, blocks that have already been 'written'
> > are failed when double drive failure is noticed, as well as
> > blocks that are about to be written.
> >
> >  ----------- Diffstat output ------------
> >  ./drivers/md/raid5.c |   10 +++++++++-
> >  1 files changed, 9 insertions(+), 1 deletion(-)
> >
> > diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
> > --- ./drivers/md/raid5.c~current~ 2003-05-21 12:42:07.000000000 +1000
> > +++ ./drivers/md/raid5.c 2003-05-21 12:37:37.000000000 +1000
> > @@ -882,7 +882,7 @@ static void handle_stripe(struct stripe_
> >  	/* check if the array has lost two devices and, if so, some requests might
> >  	 * need to be failed
> >  	 */
> > -	if (failed > 1 && to_read+to_write) {
> > +	if (failed > 1 && to_read+to_write+written) {
> >  		for (i=disks; i--; ) {
> >  			/* fail all writes first */
> >  			if (sh->bh_write[i]) to_write--;
> > @@ -891,6 +891,14 @@ static void handle_stripe(struct stripe_
> >  				bh->b_reqnext = return_fail;
> >  				return_fail = bh;
> >  			}
> > +			/* and fail all 'written' */
> > +			if (sh->bh_written[i]) written--;
> > +			while ((bh = sh->bh_written[i])) {
> > +				sh->bh_written[i] = bh->b_reqnext;
> > +				bh->b_reqnext = return_fail;
> > +				return_fail = bh;
> > +			}
> > +
> >  			/* fail any reads if this device is non-operational */
> >  			if (!conf->disks[i].operational) {
> >  				spin_lock_irq(&conf->device_lock);
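
For anyone reading along: the new hunk reuses the same idiom as the
existing write-failure path just above it. Each buffer head is popped
off the per-device 'written' list and pushed onto the local
return_fail list, which handle_stripe later completes with
uptodate == 0. Below is a stand-alone model of that list splice; the
toy struct merely stands in for the kernel's buffer_head, so none of
this is kernel code.

#include <stdio.h>
#include <stdlib.h>

/* toy stand-in for the kernel's buffer_head and its b_reqnext chain */
struct bh {
	int id;
	struct bh *b_reqnext;
};

int main(void)
{
	struct bh *written = NULL, *return_fail = NULL, *b;
	int id;

	/* build a fake 'written' list: 3 -> 2 -> 1 */
	for (id = 1; id <= 3; id++) {
		b = malloc(sizeof(*b));
		b->id = id;
		b->b_reqnext = written;
		written = b;
	}

	/* the idiom from the patch: pop each node off one singly
	 * linked list and push it onto another, until the source
	 * list is empty */
	while ((b = written)) {
		written = b->b_reqnext;
		b->b_reqnext = return_fail;
		return_fail = b;
	}

	/* everything on return_fail would now be completed with
	 * uptodate == 0, i.e. reported back as failed I/O */
	for (b = return_fail; b; b = b->b_reqnext)
		printf("failing buffer %d\n", b->id);
	return 0;
}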
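
One more note on the simulation itself: instead of powering off the
drive, both arms can be knocked out from software with the md
SET_DISK_FAULTY ioctl (the same call raidsetfaulty uses). Here is a
minimal sketch; the device names match my test layout above, and the
header location is the one in the 2.4 tree, so treat those details
as assumptions if your setup differs.

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <linux/raid/md_u.h>	/* SET_DISK_FAULTY */

/* mark one component ("arm") of an md array as faulty */
static int fail_arm(int mdfd, const char *arm)
{
	struct stat st;

	if (stat(arm, &st) < 0) {
		perror(arm);
		return -1;
	}
	/* the ioctl argument is the component's device number */
	if (ioctl(mdfd, SET_DISK_FAULTY, st.st_rdev) < 0) {
		perror("SET_DISK_FAULTY");
		return -1;
	}
	return 0;
}

int main(void)
{
	int mdfd = open("/dev/md0", O_RDONLY);

	if (mdfd < 0) {
		perror("/dev/md0");
		return 1;
	}
	/* both arms live on the same physical disk in my setup, so
	 * failing them back to back mimics pulling that disk's power */
	fail_arm(mdfd, "/dev/hde1");
	fail_arm(mdfd, "/dev/hde2");
	close(mdfd);
	return 0;
}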


Thread overview: 8+ messages
2003-05-06 23:24 Upgrading Raid1 array to have 2. disks Anders Fugmann
2003-05-06 23:46 ` Mads Peter Bach
2003-05-07  0:00 ` Neil Brown
2003-05-08 19:11   ` Anders Fugmann
2003-05-20  0:33   ` raid5, 2 drives dead at same time,kernel will Oops? 3tcdgwg3
2003-05-21  2:42     ` Neil Brown
2003-05-22  1:04       ` 3tcdgwg3
2003-05-30 20:33         ` 3tcdgwg3 [this message]
