* RAID5 refuses to accept replacement drive.
@ 2006-10-25 16:52 greg
2006-10-25 17:33 ` Eli Stair
2006-10-25 21:25 ` Neil Brown
0 siblings, 2 replies; 5+ messages in thread
From: greg @ 2006-10-25 16:52 UTC (permalink / raw)
To: linux-raid; +Cc: neilb
Good morning to everyone, hope everyone's day is going well.
Neil, I sent this to your SUSE address a week ago but it may have
gotten trapped in a SPAM filter or lost in the shuffle.
I've used MD based RAID since it first existed. First time I've run
into a situation like this.
Environment:
Kernel: 2.4.33.3
MDADM: 2.4.1/2.5.3
MD: Three drive RAID5 (md3)
A 'silent' disk failure was experienced in a SCSI hot-swap chassis
during a yearly system upgrade. Machine failed to boot until 'nobd'
directive was given to LILO. Drive was mechanically dead but
electrically alive.
Drives were shuffled to get the machine operational. The machine came
up with md3 degraded. The md3 device refuses to accept a replacement
partition using the following syntax:
mdadm --manage /dev/md3 -a /dev/sde1
No output from mdadm, nothing in the logfiles. Tail end of strace is
as follows:
open("/dev/md3", O_RDWR) = 3
fstat64(0x3, 0xbffff8fc) = 0
ioctl(3, 0x800c0910, 0xbffff9f8) = 0
_exit(0) = ?
I 'zeroed' the superblock on /dev/sde1 to make sure there was nothing
to interfere. No change in behavior.
I know the 2.4 kernels are not in vogue but this is from a group of
machines which are expected to run a year at a time. Stability and
known behavior are the foremost goals.
Details on the MD device and component drives are included below.
We've handled a lot of MD failures, first time anything like this has
happened. I feel like there is probably a 'brown paper bag' solution
to this but I can't see it.
Thoughts?
Greg
---------------------------------------------------------------------------
/dev/md3:
Version : 00.90.00
Creation Time : Fri Jun 23 19:51:43 2006
Raid Level : raid5
Array Size : 5269120 (5.03 GiB 5.40 GB)
Device Size : 2634560 (2.51 GiB 2.70 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Wed Oct 11 04:33:06 2006
State : active, degraded
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Layout : left-symmetric
Chunk Size : 64K
UUID : cdd418a1:4bc3da6b:1ec17a15:e73ecadd
Events : 0.25
Number Major Minor RaidDevice State
0 8 49 0 active sync /dev/sdd1
1 0 0 1 removed
2 8 33 2 active sync /dev/sdc1
---------------------------------------------------------------------------
Details for raid device 0:
---------------------------------------------------------------------------
/dev/sdd1:
Magic : a92b4efc
Version : 00.90.00
UUID : cdd418a1:4bc3da6b:1ec17a15:e73ecadd
Creation Time : Fri Jun 23 19:51:43 2006
Raid Level : raid5
Device Size : 2634560 (2.51 GiB 2.70 GB)
Array Size : 5269120 (5.03 GiB 5.40 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 3
Update Time : Wed Oct 11 04:33:06 2006
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : 52b602d5 - correct
Events : 0.25
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 49 0 active sync /dev/sdd1
0 0 8 49 0 active sync /dev/sdd1
1 1 0 0 1 faulty removed
2 2 8 33 2 active sync /dev/sdc1
---------------------------------------------------------------------------
Details for RAID device 2:
---------------------------------------------------------------------------
/dev/sdc1:
Magic : a92b4efc
Version : 00.90.00
UUID : cdd418a1:4bc3da6b:1ec17a15:e73ecadd
Creation Time : Fri Jun 23 19:51:43 2006
Raid Level : raid5
Device Size : 2634560 (2.51 GiB 2.70 GB)
Array Size : 5269120 (5.03 GiB 5.40 GB)
Raid Devices : 3
Total Devices : 3
Preferred Minor : 3
Update Time : Wed Oct 11 04:33:06 2006
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 1
Spare Devices : 0
Checksum : 52b602c9 - correct
Events : 0.25
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 2 8 33 2 active sync /dev/sdc1
0 0 8 49 0 active sync /dev/sdd1
1 1 0 0 1 faulty removed
2 2 8 33 2 active sync /dev/sdc1
---------------------------------------------------------------------------
As always,
Dr. G.W. Wettstein, Ph.D. Enjellic Systems Development, LLC.
4206 N. 19th Ave. Specializing in information infra-structure
Fargo, ND 58102 development.
PH: 701-281-1686
FAX: 701-281-3949 EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"We restored the user's real .pinerc from backup but another of our users
must still be missing those cows."
-- Malcolm Beattie
* Re: RAID5 refuses to accept replacement drive.
2006-10-25 16:52 RAID5 refuses to accept replacement drive greg
@ 2006-10-25 17:33 ` Eli Stair
2006-10-25 21:25 ` Neil Brown
1 sibling, 0 replies; 5+ messages in thread
From: Eli Stair @ 2006-10-25 17:33 UTC (permalink / raw)
To: greg; +Cc: linux-raid
A tangentially-related suggestion:
If you layer dm-multipath on top of the raw block (SCSI, FC) layer, you
add some complexity but gain periodic readsector0 path checks... so if
your spindle powers down unexpectedly but the controller thinks it's
still alive, you will still get a drive disconnect issued from below MD,
as device-mapper will fail the drive automatically and MD will see it
as faulty.
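For illustration, the relevant knobs sit in the defaults section of
multipath.conf. A rough sketch only; directive names should be checked
against the multipath-tools release actually installed, this is not a
tested configuration:

defaults {
        polling_interval  5            # seconds between path checks
        path_checker      readsector0  # read sector 0 to verify each path
        no_path_retry     fail         # fail I/O once all paths are gone
}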
Sorry, no useful suggestion on the recovery task...
/eli
greg@enjellic.com wrote:
> Good morning to everyone, hope everyone's day is going well.
>
> Neil, I sent this to your SUSE address a week ago but it may have
> gotten trapped in a SPAM filter or lost in the shuffle.
>
> I've used MD based RAID since it first existed. First time I've run
> into a situation like this.
>
> Environment:
> Kernel: 2.4.33.3
> MDADM: 2.4.1/2.5.3
> MD: Three drive RAID5 (md3)
>
> A 'silent' disk failure was experienced in a SCSI hot-swap chassis
> during a yearly system upgrade. Machine failed to boot until 'nobd'
> directive was given to LILO. Drive was mechanically dead but
> electrically alive.
>
> Drives were shuffled to get the machine operational. The machine came
> up with md3 degraded. The md3 device refuses to accept a replacement
> partition using the following syntax:
>
> mdadm --manage /dev/md3 -a /dev/sde1
>
> No output from mdadm, nothing in the logfiles. Tail end of strace is
> as follows:
>
> open("/dev/md3", O_RDWR) = 3
> fstat64(0x3, 0xbffff8fc) = 0
> ioctl(3, 0x800c0910, 0xbffff9f8) = 0
> _exit(0) = ?
>
> I 'zeroed' the superblock on /dev/sde1 to make sure there was nothing
> to interfere. No change in behavior.
>
> I know the 2.4 kernels are not in vogue but this is from a group of
> machines which are expected to run a year at a time. Stability and
> known behavior are the foremost goals.
>
> Details on the MD device and component drives are included below.
>
> We've handled a lot of MD failures, first time anything like this has
> happened. I feel like there is probably a 'brown paper bag' solution
> to this but I can't see it.
>
> Thoughts?
>
> Greg
>
> ---------------------------------------------------------------------------
> /dev/md3:
> Version : 00.90.00
> Creation Time : Fri Jun 23 19:51:43 2006
> Raid Level : raid5
> Array Size : 5269120 (5.03 GiB 5.40 GB)
> Device Size : 2634560 (2.51 GiB 2.70 GB)
> Raid Devices : 3
> Total Devices : 3
> Preferred Minor : 3
> Persistence : Superblock is persistent
>
> Update Time : Wed Oct 11 04:33:06 2006
> State : active, degraded
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 1
> Spare Devices : 0
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> UUID : cdd418a1:4bc3da6b:1ec17a15:e73ecadd
> Events : 0.25
>
> Number Major Minor RaidDevice State
> 0 8 49 0 active sync /dev/sdd1
> 1 0 0 1 removed
> 2 8 33 2 active sync /dev/sdc1
> ---------------------------------------------------------------------------
>
>
> Details for raid device 0:
>
> ---------------------------------------------------------------------------
> /dev/sdd1:
> Magic : a92b4efc
> Version : 00.90.00
> UUID : cdd418a1:4bc3da6b:1ec17a15:e73ecadd
> Creation Time : Fri Jun 23 19:51:43 2006
> Raid Level : raid5
> Device Size : 2634560 (2.51 GiB 2.70 GB)
> Array Size : 5269120 (5.03 GiB 5.40 GB)
> Raid Devices : 3
> Total Devices : 3
> Preferred Minor : 3
>
> Update Time : Wed Oct 11 04:33:06 2006
> State : active
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 1
> Spare Devices : 0
> Checksum : 52b602d5 - correct
> Events : 0.25
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 0 8 49 0 active sync /dev/sdd1
>
> 0 0 8 49 0 active sync /dev/sdd1
> 1 1 0 0 1 faulty removed
> 2 2 8 33 2 active sync /dev/sdc1
> ---------------------------------------------------------------------------
>
>
> Details for RAID device 2:
>
> ---------------------------------------------------------------------------
> /dev/sdc1:
> Magic : a92b4efc
> Version : 00.90.00
> UUID : cdd418a1:4bc3da6b:1ec17a15:e73ecadd
> Creation Time : Fri Jun 23 19:51:43 2006
> Raid Level : raid5
> Device Size : 2634560 (2.51 GiB 2.70 GB)
> Array Size : 5269120 (5.03 GiB 5.40 GB)
> Raid Devices : 3
> Total Devices : 3
> Preferred Minor : 3
>
> Update Time : Wed Oct 11 04:33:06 2006
> State : active
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 1
> Spare Devices : 0
> Checksum : 52b602c9 - correct
> Events : 0.25
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Number Major Minor RaidDevice State
> this 2 8 33 2 active sync /dev/sdc1
>
> 0 0 8 49 0 active sync /dev/sdd1
> 1 1 0 0 1 faulty removed
> 2 2 8 33 2 active sync /dev/sdc1
> ---------------------------------------------------------------------------
>
> As always,
> Dr. G.W. Wettstein, Ph.D. Enjellic Systems Development, LLC.
> 4206 N. 19th Ave. Specializing in information infra-structure
> Fargo, ND 58102 development.
> PH: 701-281-1686
> FAX: 701-281-3949 EMAIL: greg@enjellic.com
> ------------------------------------------------------------------------------
> "We restored the user's real .pinerc from backup but another of our users
> must still be missing those cows."
> -- Malcolm Beattie
* Re: RAID5 refuses to accept replacement drive.
2006-10-25 16:52 RAID5 refuses to accept replacement drive greg
2006-10-25 17:33 ` Eli Stair
@ 2006-10-25 21:25 ` Neil Brown
1 sibling, 0 replies; 5+ messages in thread
From: Neil Brown @ 2006-10-25 21:25 UTC (permalink / raw)
To: greg; +Cc: linux-raid
On Wednesday October 25, greg@enjellic.com wrote:
> Good morning to everyone, hope everyone's day is going well.
>
> Neil, I sent this to your SUSE address a week ago but it may have
> gotten trapped in a SPAM filter or lost in the shuffle.
Yes, resending is always a good idea if I seem to be ignoring you.
(people who are really on-the-ball will probably start telling me it is a
resend the first time they mail me. I probably wouldn't notice.. :-)
>
> I've used MD based RAID since it first existed. First time I've run
> into a situation like this.
>
> Environment:
> Kernel: 2.4.33.3
> MDADM: 2.4.1/2.5.3
> MD: Three drive RAID5 (md3)
Old kernel, new mdadm. Not a tested combination unfortunately. I
guess I should try booting 2.4 somewhere and try it out...
>
> A 'silent' disk failure was experienced in a SCSI hot-swap chassis
> during a yearly system upgrade. Machine failed to boot until 'nobd'
> directive was given to LILO. Drive was mechanically dead but
> electrically alive.
>
> Drives were shuffled to get the machine operational. The machine came
> up with md3 degraded. The md3 device refuses to accept a replacement
> partition using the following syntax:
>
> mdadm --manage /dev/md3 -a /dev/sde1
>
> No output from mdadm, nothing in the logfiles. Tail end of strace is
> as follows:
>
> open("/dev/md3", O_RDWR) = 3
> fstat64(0x3, 0xbffff8fc) = 0
> ioctl(3, 0x800c0910, 0xbffff9f8) = 0
Those last two lines are a call to md_get_version,
probably the one in open_mddev.
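For reference, that ioctl number decodes to RAID_VERSION from
<linux/raid/md_u.h>: _IOR(MD_MAJOR, 0x10, mdu_version_t) works out to
0x800c0910 on i386. A minimal standalone equivalent, offered only as a
sketch for poking at the array by hand and not as mdadm's actual code:

/* Open the array, fstat it, and issue the same RAID_VERSION ioctl
 * that the trace above shows. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <linux/raid/md_u.h>    /* RAID_VERSION, mdu_version_t */

int main(void)
{
        mdu_version_t ver;
        struct stat st;
        int fd = open("/dev/md3", O_RDWR);

        if (fd < 0) { perror("open"); return 1; }
        if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }
        if (ioctl(fd, RAID_VERSION, &ver) < 0) { perror("RAID_VERSION"); return 1; }
        printf("md driver %d.%d.%d\n", ver.major, ver.minor, ver.patchlevel);
        close(fd);
        return 0;
}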
> _exit(0) = ?
But I can see no way that it would exit...
Are you comfortable with gdb?
Would you be interested in single stepping around and seeing what path
leads to the exit?
Another option is to use mdadm-1.9.0. That is likely to be more
reliable.
NeilBrown
* Re: RAID5 refuses to accept replacement drive.
@ 2006-10-31 19:27 greg
0 siblings, 0 replies; 5+ messages in thread
From: greg @ 2006-10-31 19:27 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
On Oct 26, 7:25am, Neil Brown wrote:
} Subject: Re: RAID5 refuses to accept replacement drive.
Hi Neil, hope your week is going well, thanks for the reply.
> > Environment:
> > Kernel: 2.4.33.3
> > MDADM: 2.4.1/2.5.3
> > MD: Three drive RAID5 (md3)
> Old kernel, new mdadm. Not a tested combination unfortunately. I
> guess I should try booting 2.4 somewhere and try it out...
Based on what I found, it's probably an old library issue as much as
anything.
More below.
> > Drives were shuffled to get the machine operational. The machine came
> > up with md3 degraded. The md3 device refuses to accept a replacement
> > partition using the following syntax:
> >
> > mdadm --manage /dev/md3 -a /dev/sde1
> >
> > No output from mdadm, nothing in the logfiles. Tail end of strace is
> > as follows:
> >
> > open("/dev/md3", O_RDWR) = 3
> > fstat64(0x3, 0xbffff8fc) = 0
> > ioctl(3, 0x800c0910, 0xbffff9f8) = 0
> Those last two lines are a call to md_get_version,
> probably the one in open_mddev.
>
> > _exit(0) = ?
>
> But I can see no way that it would exit...
>
> Are you comfortable with gdb?
> Would you be interested in single stepping around and seeing what path
> leads to the exit?
My apologies for not being quicker on the draw; I should have gone
grovelling with gdb first.
The problem appears to be due to what must be a broken implementation
of getopt_long in the installed version of the C library. Either that,
or the reasonably complex... :-) option parsing in mdadm is tripping
it up.
As I noted before, the following syntax fails:
mdadm --manage /dev/md3 -a /dev/sde1
After poking around a bit and watching the option parsing in gdb, I
noticed that the following syntax should work:
mdadm /dev/md3 -a /dev/sde1
I tried the latter command outside of GDB and things worked
perfectly. The drive was added to the RAID5 array and synchronization
proceeded properly.
I then failed out a drive element on one of the other MD devices on
the machine and was able to repeat the problem. The following refused
to work:
mdadm --manage /dev/md1 -a /dev/sdb2
While the following worked:
mdadm /dev/md1 -a /dev/sdb2
The getopt_long function is not picking up on the fact that -a should
have optarg set to /dev/sdb2 when the option is recognized. Instead
optarg is set to NULL and devs_found is left at 1 rather than 2. That
results in mdadm simply exiting without saying anything.
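A quick way to pin that down is a standalone probe of the installed C
library's getopt_long. The option table below is made up for the test
and is not mdadm's real one; the point is just to see whether a
detached argument such as '-a /dev/sdb2' comes back in optarg and what
is left over as non-option arguments:

#include <stdio.h>
#include <getopt.h>

int main(int argc, char *argv[])
{
        /* Toy option table, only for probing libc behaviour. */
        static const struct option longopts[] = {
                { "add",    required_argument, 0, 'a' },
                { "manage", no_argument,       0, 'M' },
                { 0, 0, 0, 0 }
        };
        int c;

        while ((c = getopt_long(argc, argv, "a:M", longopts, NULL)) != -1) {
                if (c == 'a')
                        printf("-a/--add seen, optarg = %s\n",
                               optarg ? optarg : "(null)");
                else if (c == 'M')
                        printf("--manage seen\n");
        }
        while (optind < argc)
                printf("non-option argument: %s\n", argv[optind++]);
        return 0;
}

With a getopt_long that behaves as the GNU documentation describes,
running it as './a.out --manage /dev/md3 -a /dev/sdb2' should report
optarg = /dev/sdb2 for -a and leave /dev/md3 as the lone non-option
argument; anything else points back at the library.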
I know the 1.x version of mdadm we were using before processed the
'mdadm --manage' syntax properly. This must have been the first time
we had to add a drive element back into an MD device since we upgraded
mdadm.
I would be happy to chase this a bit more or send you a statically
linked binary if you want to see what it is up to. At the very least
it may be worthwhile to issue a warning message on exit if mdadm has
an MD device specification, a mode specification and no devices.
I remember trying to build a statically linked copy of mdadm with
dietlibc and running into option parsing problems. The resultant binary
would always exit complaining that a device had not been specified. I
remember the dietlibc documentation noting that the GNU folks had an
inconsistent world view when it came to getopt processing
semantics... :-)
I suspect there is a common thread involved in both cases.
> NeilBrown
Hope the above is useful. Let me know if you have any
questions/issues.
Happy Halloween.
Greg
}-- End of excerpt from Neil Brown
As always,
Dr. G.W. Wettstein, Ph.D. Enjellic Systems Development, LLC.
4206 N. 19th Ave. Specializing in information infra-structure
Fargo, ND 58102 development.
PH: 701-281-1686
FAX: 701-281-3949 EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"Fools ignore complexity. Pragmatists suffer it. Some can avoid it.
Geniuses remove it."
-- Perliss' Programming Proverb #58
SIGPLAN National, Sept. 1982
* Re: RAID5 refuses to accept replacement drive.
@ 2006-11-03 15:51 greg
0 siblings, 0 replies; 5+ messages in thread
From: greg @ 2006-11-03 15:51 UTC (permalink / raw)
To: Neil Brown, greg; +Cc: linux-raid
On Oct 26, 7:25am, Neil Brown wrote:
} Subject: Re: RAID5 refuses to accept replacement drive.
Hi Neil, hope the end of the week is going well for you.
> On Wednesday October 25, greg@enjellic.com wrote:
> > Good morning to everyone, hope everyone's day is going well.
> >
> > Neil, I sent this to your SUSE address a week ago but it may have
> > gotten trapped in a SPAM filter or lost in the shuffle.
>
> Yes, resending is always a good idea if I seem to be ignoring you.
>
> (people who are really on-the-ball will probably start telling me it is a
> resend the first time they mail me. I probably wouldn't notice.. :-)
Did you get my reply on what I found when I poked at mdadm with gdb?
> NeilBrown
Have a good weekend.
}-- End of excerpt from Neil Brown
As always,
Dr. G.W. Wettstein, Ph.D. Enjellic Systems Development, LLC.
4206 N. 19th Ave. Specializing in information infra-structure
Fargo, ND 58102 development.
PH: 701-281-1686
FAX: 701-281-3949 EMAIL: greg@enjellic.com
------------------------------------------------------------------------------
"Fools ignore complexity. Pragmatists suffer it. Some can avoid it.
Geniuses remove it."
-- Perliss' Programming Proverb #58
SIGPLAN National, Sept. 1982