* RAID5 refuses to accept replacement drive.
@ 2006-10-25 16:52 greg
2006-10-25 17:33 ` Eli Stair
2006-10-25 21:25 ` Neil Brown
0 siblings, 2 replies; 5+ messages in thread
From: greg @ 2006-10-25 16:52 UTC (permalink / raw)
To: linux-raid; +Cc: neilb
Good morning to everyone, hope everyone's day is going well.
Neil, I sent this to your SUSE address a week ago but it may have
gotten trapped in a SPAM filter or lost in the shuffle.
I've used MD based RAID since it first existed. First time I've run
into a situation like this.
Environment:
Kernel: 2.4.33.3
MDADM: 2.4.1/2.5.3
MD: Three drive RAID5 (md3)
A 'silent' disk failure was experienced in a SCSI hot-swap chassis
during a yearly system upgrade. Machine failed to boot until 'nobd'
directive was given to LILO. Drive was mechanically dead but
electrically alive.
Drives were shuffled to get the machine operational. The machine came
up with md3 degraded. The md3 device refuses to accept a replacement
partition using the following syntax:
mdadm --manage /dev/md3 -a /dev/sde1
No output from mdadm, nothing in the logfiles. Tail end of strace is
as follows:
open("/dev/md3", O_RDWR) = 3
fstat64(0x3, 0xbffff8fc) = 0
ioctl(3, 0x800c0910, 0xbffff9f8) = 0
_exit(0) = ?
I 'zeroed' the superblock on /dev/sde1 to make sure there was nothing
to interfere. No change in behavior.
I know the 2.4 kernels are not in vogue but this is from a group of
machines which are expected to run a year at a time. Stability and
known behavior are the foremost goals.
Details on the MD device and component drives are included below.
We've handled a lot of MD failures, first time anything like this has
happened. I feel like there is probably a 'brown paper bag' solution
to this but I can't see it.
Thoughts?
Greg
---------------------------------------------------------------------------
/dev/md3:
Version : 00.90.00
Creation Time : Fri Jun 23 19:51:43 2006
Raid Level : raid5
Array Size : 5269120 (5.03 GiB 5.40 GB)
Device Size : 2634560 (2.51 GiB 2.70 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Wed Oct 11 04:33:06 2006
State : active, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : cdd418a1:4bc3da6b:1ec17a15:e73ecadd
Events : 0.25
Number Major Minor RaidDevice State
0 8 49 0 active sync /dev/sdd1
1 0 0 1 removed
2 8 33 2 active sync /dev/sdc1
---------------------------------------------------------------------------
Details for raid device 0:
---------------------------------------------------------------------------
/dev/sdd1:
Magic : a92b4efc
Version : 00.90.00
UUID : cdd418a1:4bc3da6b:1ec17a15:e73ecadd
Creation Time : Fri Jun 23 19:51:43 2006
Raid Level : raid5
Device Size : 2634560 (2.51 GiB 2.70 GB)
Array Size : 5269120 (5.03 GiB 5.40 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 3
Update Time : Wed Oct 11 04:33:06 2006
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : 52b602d5 - correct
Events : 0.25
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 49 0 active sync /dev/sdd1
0 0 8 49 0 active sync /dev/sdd1
1 1 0 0 1 faulty removed
2 2 8 33 2 active sync /dev/sdc1
---------------------------------------------------------------------------
Details for RAID device 2:
---------------------------------------------------------------------------
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : cdd418a1:4bc3da6b:1ec17a15:e73ecadd
Creation Time : Fri Jun 23 19:51:43 2006
Raid Level : raid5
Device Size : 2634560 (2.51 GiB 2.70 GB)
Array Size : 5269120 (5.03 GiB 5.40 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 3
Update Time : Wed Oct 11 04:33:06 2006
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : 52b602c9 - correct
Events : 0.25
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 2 8 33 2 active sync /dev/sdc1
0 0 8 49 0 active sync /dev/sdd1
1 1 0 0 1 faulty removed
2 2 8 33 2 active sync /dev/sdc1
---------------------------------------------------------------------------
As always,
Dr. G.W. Wettstein, Ph.D. Enjellic Systems Development, LLC.
4206 N. 19th Ave. Specializing in information infra-structure
Fargo, ND 58102 development.
PH: 701-281-1686
FAX: 701-281-3949 EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"We restored the user's real .pinerc from backup but another of our users
must still be missing those cows."
-- Malcolm Beattie
* Re: RAID5 refuses to accept replacement drive.
2006-10-25 16:52 RAID5 refuses to accept replacement drive greg
@ 2006-10-25 17:33 ` Eli Stair
2006-10-25 21:25 ` Neil Brown
1 sibling, 0 replies; 5+ messages in thread
From: Eli Stair @ 2006-10-25 17:33 UTC (permalink / raw)
To: greg; +Cc: linux-raid
A tangentially-related suggestion:
If you layer dm-multipath on top of the raw block (SCSI, FC) layer, you
add some complexity but gain periodic readsector0 path checks... so if
your spindle powers down unexpectedly but the controller thinks it's
still alive, you will still get a drive disconnect issued from below MD,
as device-mapper will fail the drive automatically and MD will see it
as faulty.
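For illustration, the relevant knobs sit in the defaults section of
multipath.conf. A rough sketch only; directive names should be checked
against the multipath-tools release actually installed, this is not a
tested configuration:

defaults {
        polling_interval  5            # seconds between path checks
        path_checker      readsector0  # read sector 0 to verify each path
        no_path_retry     fail         # fail I/O once all paths are gone
}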
Sorry, no useful suggestion on the recovery task...
/eli
greg@enjellic.com wrote:
> Good morning to everyone, hope everyone's day is going well.
>
> Neil, I sent this to your SUSE address a week ago but it may have
> gotten trapped in a SPAM filter or lost in the shuffle.
>
> I've used MD based RAID since it first existed. First time I've run
> into a situation like this.
>
> Environment:
> Kernel: 2.4.33.3
> MDADM: 2.4.1/2.5.3
> MD: Three drive RAID5 (md3)
>
> A 'silent' disk failure was experienced in a SCSI hot-swap chassis
> during a yearly system upgrade. Machine failed to boot until 'nobd'
> directive was given to LILO. Drive was mechanically dead but
> electrically alive.
>
> Drives were shuffled to get the machine operational. The machine came
> up with md3 degraded. The md3 device refuses to accept a replacement
> partition using the following syntax:
>
> mdadm --manage /dev/md3 -a /dev/sde1
>
> No output from mdadm, nothing in the logfiles. Tail end of strace is
> as follows:
>
> open("/dev/md3", O_RDWR) = 3
> fstat64(0x3, 0xbffff8fc) = 0
> ioctl(3, 0x800c0910, 0xbffff9f8) = 0
> _exit(0) = ?
>
> I 'zeroed' the superblock on /dev/sde1 to make sure there was nothing
> to interfere. No change in behavior.
>
> I know the 2.4 kernels are not in vogue but this is from a group of
> machines which are expected to run a year at a time. Stability and
> known behavior are the foremost goals.
>
> Details on the MD device and component drives are included below.
>
> We've handled a lot of MD failures, first time anything like this has
> happened. I feel like there is probably a 'brown paper bag' solution
> to this but I can't see it.
>
> Thoughts?
>
> Greg
>
> ---------------------------------------------------------------------------
> /dev/md3:
> Version : 00.90.00
> Creation Time : Fri Jun 23 19:51:43 2006
> Raid Level : raid5
> Array Size : 5269120 (5.03 GiB 5.40 GB)
> Device Size : 2634560 (2.51 GiB 2.70 GB)
> Raid Devices : 3
> Total Devices : 3
> Preferred Minor : 3
> Persistence : Superblock is persistent
>
> Update Time : Wed Oct 11 04:33:06 2006
> State : active, degraded
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 1
> Spare Devices : 0
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> UUID : cdd418a1:4bc3da6b:1ec17a15:e73ecadd
> Events : 0.25
>
> Number Major Minor RaidDevice State
> 0 8 49 0 active sync /dev/sdd1
> 1 0 0 1 removed
> 2 8 33 2 active sync /dev/sdc1
> ---------------------------------------------------------------------------
>
>
> Details for raid device 0:
>
> ---------------------------------------------------------------------------
> /dev/sdd1:
> Magic : a92b4efc
> Version : 00.90.00
> UUID : cdd418a1:4bc3da6b:1ec17a15:e73ecadd
> Creation Time : Fri Jun 23 19:51:43 2006
> Raid Level : raid5
> Device Size : 2634560 (2.51 GiB 2.70 GB)
> Array Size : 5269120 (5.03 GiB 5.40 GB)
> Raid Devices : 3
> Total Devices : 3
> Preferred Minor : 3
>
> Update Time : Wed Oct 11 04:33:06 2006
> State : active
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 1
> Spare Devices : 0
> Checksum : 52b602d5 - correct
> Events : 0.25
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 0 8 49 0 active sync /dev/sdd1
>
> 0 0 8 49 0 active sync /dev/sdd1
> 1 1 0 0 1 faulty removed
> 2 2 8 33 2 active sync /dev/sdc1
> ---------------------------------------------------------------------------
>
>
> Details for RAID device 2:
>
> ---------------------------------------------------------------------------
> /dev/sdc1:
> Magic : a92b4efc
> Version : 00.90.00
> UUID : cdd418a1:4bc3da6b:1ec17a15:e73ecadd
> Creation Time : Fri Jun 23 19:51:43 2006
> Raid Level : raid5
> Device Size : 2634560 (2.51 GiB 2.70 GB)
> Array Size : 5269120 (5.03 GiB 5.40 GB)
> Raid Devices : 3
> Total Devices : 3
> Preferred Minor : 3
>
> Update Time : Wed Oct 11 04:33:06 2006
> State : active
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 1
> Spare Devices : 0
> Checksum : 52b602c9 - correct
> Events : 0.25
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 2 8 33 2 active sync /dev/sdc1
>
> 0 0 8 49 0 active sync /dev/sdd1
> 1 1 0 0 1 faulty removed
> 2 2 8 33 2 active sync /dev/sdc1
> ---------------------------------------------------------------------------
>
> As always,
> Dr. G.W. Wettstein, Ph.D. Enjellic Systems Development, LLC.
> 4206 N. 19th Ave. Specializing in information infra-structure
> Fargo, ND 58102 development.
> PH: 701-281-1686
> FAX: 701-281-3949 EMAIL: greg@enjellic.com
> ------------------------------------------------------------------------------
> "We restored the user's real .pinerc from backup but another of our users
> must still be missing those cows."
> -- Malcolm Beattie
* Re: RAID5 refuses to accept replacement drive.
2006-10-25 16:52 RAID5 refuses to accept replacement drive greg
2006-10-25 17:33 ` Eli Stair
@ 2006-10-25 21:25 ` Neil Brown
1 sibling, 0 replies; 5+ messages in thread
From: Neil Brown @ 2006-10-25 21:25 UTC (permalink / raw)
To: greg; +Cc: linux-raid
On Wednesday October 25, greg@enjellic.com wrote:
> Good morning to everyone, hope everyone's day is going well.
>
> Neil, I sent this to your SUSE address a week ago but it may have
> gotten trapped in a SPAM filter or lost in the shuffle.
Yes, resending is always a good idea if I seem to be ignoring you.
(people who are really on-the-ball will probably start telling me it is a
resend the first time they mail me. I probably wouldn't notice.. :-)
>
> I've used MD based RAID since it first existed. First time I've run
> into a situation like this.
>
> Environment:
> Kernel: 2.4.33.3
> MDADM: 2.4.1/2.5.3
> MD: Three drive RAID5 (md3)
Old kernel, new mdadm. Not a tested combination unfortunately. I
guess I should try booting 2.4 somewhere and try it out...
>
> A 'silent' disk failure was experienced in a SCSI hot-swap chassis
> during a yearly system upgrade. Machine failed to boot until 'nobd'
> directive was given to LILO. Drive was mechanically dead but
> electrically alive.
>
> Drives were shuffled to get the machine operational. The machine came
> up with md3 degraded. The md3 device refuses to accept a replacement
> partition using the following syntax:
>
> mdadm --manage /dev/md3 -a /dev/sde1
>
> No output from mdadm, nothing in the logfiles. Tail end of strace is
> as follows:
>
> open("/dev/md3", O_RDWR) = 3
> fstat64(0x3, 0xbffff8fc) = 0
> ioctl(3, 0x800c0910, 0xbffff9f8) = 0
Those last two lines are a call to md_get_version,
probably the one in open_mddev.
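For reference, that ioctl number decodes to RAID_VERSION from
<linux/raid/md_u.h>: _IOR(MD_MAJOR, 0x10, mdu_version_t) works out to
0x800c0910 on i386. A minimal standalone equivalent, offered only as a
sketch for poking at the array by hand and not as mdadm's actual code:

/* Open the array, fstat it, and issue the same RAID_VERSION ioctl
 * that the trace above shows. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <linux/raid/md_u.h>    /* RAID_VERSION, mdu_version_t */

int main(void)
{
        mdu_version_t ver;
        struct stat st;
        int fd = open("/dev/md3", O_RDWR);

        if (fd < 0) { perror("open"); return 1; }
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }
        if (ioctl(fd, RAID_VERSION, &ver) < 0) { perror("RAID_VERSION"); return 1; }
        printf("md driver %d.%d.%d\n", ver.major, ver.minor, ver.patchlevel);
        close(fd);
        return 0;
}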
> _exit(0) = ?
But I can see no way that it would exit...
Are you comfortable with gdb?
Would you be interested in single stepping around and seeing what path
leads to the exit?
Another option is to use mdadm-1.9.0. That is likely to be more
reliable.
NeilBrown
* Re: RAID5 refuses to accept replacement drive.
@ 2006-10-31 19:27 greg
0 siblings, 0 replies; 5+ messages in thread
From: greg @ 2006-10-31 19:27 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
On Oct 26, 7:25am, Neil Brown wrote:
} Subject: Re: RAID5 refuses to accept replacement drive.
Hi Neil, hope your week is going well, thanks for the reply.
> > Environment:
> > Kernel: 2.4.33.3
> > MDADM: 2.4.1/2.5.3
> > MD: Three drive RAID5 (md3)
> Old kernel, new mdadm. Not a tested combination unfortunately. I
> guess I should try booting 2.4 somewhere and try it out...
Based on what I found, it's probably an old library issue as much as
anything.
More below.
> > Drives were shuffled to get the machine operational. The machine came
> > up with md3 degraded. The md3 device refuses to accept a replacement
> > partition using the following syntax:
> >
> > mdadm --manage /dev/md3 -a /dev/sde1
> >
> > No output from mdadm, nothing in the logfiles. Tail end of strace is
> > as follows:
> >
> > open("/dev/md3", O_RDWR) = 3
> > fstat64(0x3, 0xbffff8fc) = 0
> > ioctl(3, 0x800c0910, 0xbffff9f8) = 0
> Those last two lines are a call to md_get_version,
> probably the one in open_mddev.
>
> > _exit(0) = ?
>
> But I can see no way that it would exit...
>
> Are you comfortable with gdb?
> Would you be interested in single stepping around and seeing what path
> leads to the exit?
My apologies for not being quicker on the draw; I should have gone
grovelling with gdb first.
The problem appears to be due to what must be a broken implementation
of getopt_long in the installed version of the C library. Either that,
or the reasonably complex... :-) option parsing in mdadm is tripping
it up.
As I noted before, the following syntax fails:
mdadm --manage /dev/md3 -a /dev/sde1
After poking around a bit and watching the option parsing in gdb, I
noticed that the following syntax should work:
mdadm /dev/md3 -a /dev/sde1
I tried the latter command outside of GDB and things worked
perfectly. The drive was added to the RAID5 array and synchronization
proceeded properly.
I then failed out a drive element on one of the other MD devices on
the machine and was able to repeat the problem. The following refused
to work:
mdadm --manage /dev/md1 -a /dev/sdb2
While the following worked:
mdadm /dev/md1 -a /dev/sdb2
The getopt_long function is not picking up on the fact that -a should
have optarg set to /dev/sdb2 when the option is recognized. Instead
optarg is set to NULL and devs_found is left at 1 rather than 2. That
results in mdadm simply exiting without saying anything.
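A quick way to pin that down is a standalone probe of the installed C
library's getopt_long. The option table below is made up for the test
and is not mdadm's real one; the point is just to see whether a
detached argument such as '-a /dev/sdb2' comes back in optarg and what
is left over as non-option arguments:

#include <stdio.h>
#include <getopt.h>

int main(int argc, char *argv[])
{
        /* Toy option table, only for probing libc behaviour. */
        static const struct option longopts[] = {
                { "add",    required_argument, 0, 'a' },
                { "manage", no_argument,       0, 'M' },
                { 0, 0, 0, 0 }
        };
        int c;

        while ((c = getopt_long(argc, argv, "a:M", longopts, NULL)) != -1) {
                if (c == 'a')
                        printf("-a/--add seen, optarg = %s\n",
                               optarg ? optarg : "(null)");
                else if (c == 'M')
                        printf("--manage seen\n");
        }
        while (optind < argc)
                printf("non-option argument: %s\n", argv[optind++]);
        return 0;
}

With a getopt_long that behaves as the GNU documentation describes,
running it as './a.out --manage /dev/md3 -a /dev/sdb2' should report
optarg = /dev/sdb2 for -a and leave /dev/md3 as the lone non-option
argument; anything else points back at the library.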
I know the 1.x version of mdadm we were using before processed the
'mdadm --manage' syntax properly. This must have been the first time
we had to add a drive element back into an MD device since we upgraded
mdadm.
I would be happy to chase this a bit more or send you a statically
linked binary if you want to see what it is up to. At the very least
it may be worthwhile to issue a warning message on exit if mdadm has
an MD device specification, a mode specification and no devices.
I remember trying to build a statically linked copy of mdadm with
dietlibc and running into option parsing problems. The resultant binary
would always exit complaining that a device had not been specified. I
remember the dietlibc documentation noting that the GNU folks had an
inconsistent world view when it came to getopt processing
semantics... :-)
I suspect there is a common thread involved in both cases.
> NeilBrown
Hope the above is useful. Let me know if you have any
questions/issues.
Happy Halloween.
Greg
}-- End of excerpt from Neil Brown
As always,
Dr. G.W. Wettstein, Ph.D. Enjellic Systems Development, LLC.
4206 N. 19th Ave. Specializing in information infra-structure
Fargo, ND 58102 development.
PH: 701-281-1686
FAX: 701-281-3949 EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"Fools ignore complexity. Pragmatists suffer it. Some can avoid it.
Geniuses remove it."
-- Perliss' Programming Proverb #58
SIGPLAN National, Sept. 1982
* Re: RAID5 refuses to accept replacement drive.
@ 2006-11-03 15:51 greg
0 siblings, 0 replies; 5+ messages in thread
From: greg @ 2006-11-03 15:51 UTC (permalink / raw)
To: Neil Brown, greg; +Cc: linux-raid
On Oct 26, 7:25am, Neil Brown wrote:
} Subject: Re: RAID5 refuses to accept replacement drive.
Hi Neil, hope the end of the week is going well for you.
> On Wednesday October 25, greg@enjellic.com wrote:
> > Good morning to everyone, hope everyone's day is going well.
> >
> > Neil, I sent this to your SUSE address a week ago but it may have
> > gotten trapped in a SPAM filter or lost in the shuffle.
>
> Yes, resending is always a good idea if I seem to be ignoring you.
>
> (people who are really on-the-ball will probably start telling me it is a
> resend the first time they mail me. I probably wouldn't notice.. :-)
Did you get my reply on what I found when I poked at mdadm with gdb?
> NeilBrown
Have a good weekend.
}-- End of excerpt from Neil Brown
As always,
Dr. G.W. Wettstein, Ph.D. Enjellic Systems Development, LLC.
4206 N. 19th Ave. Specializing in information infra-structure
Fargo, ND 58102 development.
PH: 701-281-1686
FAX: 701-281-3949 EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"Fools ignore complexity. Pragmatists suffer it. Some can avoid it.
Geniuses remove it."
-- Perliss' Programming Proverb #58
SIGPLAN National, Sept. 1982