* RAID5 recovery trouble, bd_claim failed?
@ 2006-04-15 13:01 Nathanial Byrnes
2006-04-16 22:46 ` Neil Brown
2006-04-18 22:13 ` Maurice Hilarius
0 siblings, 2 replies; 18+ messages in thread
From: Nathanial Byrnes @ 2006-04-15 13:01 UTC (permalink / raw)
To: linux-raid
Hi All,
Recently I lost a disk in my RAID5 SW array. It seems that it took a
second disk with it. The other disk appears to still be functional (from
an fdisk perspective...). I am trying to get the array to work in
degraded mode via failed-disk in raidtab, but I always get the
following error when I try to raidstart the array:

md: could not bd_claim hde.
md: autostart failed!

Is it the case that I had been running in degraded mode before the
disk failure, and then lost the other disk? If so, how can I tell?
I have been messing about with mkraid -R and I have tried to
add /dev/hdf (a new disk) back to the array. However, I am fairly
confident that I have not kicked off the recovery process, so I am
imagining that once I get the superblocks in order, I should be able to
recover to the new disk?
My system and raid config are:
Kernel 2.6.13.1
Slack 10.2
RAID 5 which originally looked like:
/dev/hde
/dev/hdg
/dev/hdi
/dev/hdk
but when I moved the disks to another box with fewer IDE controllers, it became:
/dev/hde
/dev/hdf
/dev/hdg
/dev/hdh
How should I approach this?
Below is the output of mdadm --examine /dev/hd*
Thanks in advance,
Nate
/dev/hde:
Magic : a92b4efc
Version : 00.90.00
UUID : 38081921:59a998f9:64c1a001:ec534ef2
Creation Time : Fri Aug 22 16:34:37 2003
Raid Level : raid5
Device Size : 78150656 (74.53 GiB 80.03 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Wed Apr 12 02:26:37 2006
State : active
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0
Checksum : 165c1b4c - correct
Events : 0.37523832
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 1 33 0 1 active sync /dev/hde
0 0 0 0 0 removed
1 1 33 0 1 active sync /dev/hde
2 2 34 64 2 active sync /dev/hdh
3 3 34 0 3 active sync /dev/hdg
/dev/hdf:
Magic : a92b4efc
Version : 00.90.00
UUID : 38081921:59a998f9:64c1a001:ec534ef2
Creation Time : Fri Aug 22 16:34:37 2003
Raid Level : raid5
Device Size : 78150656 (74.53 GiB 80.03 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Wed Apr 12 02:26:37 2006
State : active
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0
Checksum : 165c1bc5 - correct
Events : 0.37523832
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 3 33 64 -1 sync /dev/hdf
0 0 0 0 0 removed
1 1 33 0 1 active sync /dev/hde
2 2 34 64 2 active sync /dev/hdh
3 3 33 64 -1 sync /dev/hdf
/dev/hdg:
Magic : a92b4efc
Version : 00.90.00
UUID : 38081921:59a998f9:64c1a001:ec534ef2
Creation Time : Fri Aug 22 16:34:37 2003
Raid Level : raid5
Device Size : 78150656 (74.53 GiB 80.03 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Wed Apr 12 06:12:58 2006
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 3
Spare Devices : 0
Checksum : 1898e1fd - correct
Events : 0.37523844
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 3 34 0 3 active sync /dev/hdg
0 0 0 0 0 removed
1 1 0 0 1 faulty removed
2 2 34 64 2 active sync /dev/hdh
3 3 34 0 3 active sync /dev/hdg
/dev/hdh:
Magic : a92b4efc
Version : 00.90.00
UUID : 38081921:59a998f9:64c1a001:ec534ef2
Creation Time : Fri Aug 22 16:34:37 2003
Raid Level : raid5
Device Size : 78150656 (74.53 GiB 80.03 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Wed Apr 12 06:12:58 2006
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 3
Spare Devices : 0
Checksum : 1898e23b - correct
Events : 0.37523844
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 2 34 64 2 active sync /dev/hdh
0 0 0 0 0 removed
1 1 0 0 1 faulty removed
2 2 34 64 2 active sync /dev/hdh
3 3 34 0 3 active sync /dev/hdg
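
The Events and Update Time fields in the output above are what md uses to
judge which superblocks are current. A minimal sketch of how they could be
compared across the members (device names taken from the thread; exact
--examine output formatting may differ between mdadm versions):

# Print the event counter, update time and state recorded on each member's
# superblock; members with the lower event count (hde and hdf here,
# 0.37523832 vs 0.37523844) hold stale metadata.
for d in /dev/hde /dev/hdf /dev/hdg /dev/hdh; do
    echo "== $d =="
    mdadm --examine "$d" | grep -E 'Events|Update Time|State :'
done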
^ permalink raw reply [flat|nested] 18+ messages in thread

* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-15 13:01 RAID5 recovery trouble, bd_claim failed? Nathanial Byrnes
@ 2006-04-16 22:46 ` Neil Brown
2006-04-17 2:54 ` Nathanial Byrnes
2006-04-18 22:13 ` Maurice Hilarius
1 sibling, 1 reply; 18+ messages in thread
From: Neil Brown @ 2006-04-16 22:46 UTC (permalink / raw)
To: Nathanial Byrnes; +Cc: linux-raid

On Saturday April 15, nate@qabal.org wrote:
> Hi All,
> Recently I lost a disk in my RAID5 SW array. It seems that it took a
> second disk with it. The other disk appears to still be functional (from
> an fdisk perspective...). I am trying to get the array to work in
> degraded mode via failed-disk in raidtab, but I always get the
> following error when I try to raidstart the array:
>
> md: could not bd_claim hde.
> md: autostart failed!
>
> Is it the case that I had been running in degraded mode before the
> disk failure, and then lost the other disk? If so, how can I tell?

raidstart is deprecated.  It doesn't work reliably.  Don't use it.

>
> I have been messing about with mkraid -R and I have tried to
> add /dev/hdf (a new disk) back to the array. However, I am fairly
> confident that I have not kicked off the recovery process, so I am
> imagining that once I get the superblocks in order, I should be able to
> recover to the new disk?
>
> My system and raid config are:
> Kernel 2.6.13.1
> Slack 10.2
> RAID 5 which originally looked like:
> /dev/hde
> /dev/hdg
> /dev/hdi
> /dev/hdk
>
> but when I moved the disks to another box with fewer IDE controllers, it became:
> /dev/hde
> /dev/hdf
> /dev/hdg
> /dev/hdh
>
> How should I approach this?

mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd*

If that doesn't work, add "--force" but be cautious of the data - do
an fsck at least.

NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
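
In concrete terms, the sequence Neil describes would look roughly like the
sketch below. The device names and the /dev/md0 target are taken from the
thread; the read-only fsck and read-only mount at the end are a cautious
assumption, not part of his reply:

# Try a normal assemble by array UUID first.
mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]

# Only if that refuses to start the array, retry with --force, which lets
# mdadm reconcile mismatched event counts on the freshest superblocks.
mdadm --assemble --force /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]

# Check the result before trusting it: a no-change fsck, then a read-only mount.
fsck -n /dev/md0
mount -o ro /dev/md0 /mnt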
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-16 22:46 ` Neil Brown
@ 2006-04-17 2:54 ` Nathanial Byrnes
2006-04-17 3:04 ` Neil Brown
0 siblings, 1 reply; 18+ messages in thread
From: Nathanial Byrnes @ 2006-04-17 2:54 UTC (permalink / raw)
To: Neil Brown; +Cc: Nathanial Byrnes, linux-raid

Hi Neil,
	Thanks for your reply. I tried that, but here is the error I
received:

root@finn:/etc# mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]
mdadm: failed to add /dev/hdf to /dev/md0: Device or resource busy
mdadm: /dev/md0 assembled from 2 drives and -1 spares - not enough to
start the array.

The output from lsraid against each device is as follows (I think that I
messed up my superblocks pretty well...):

root@finn:/etc# lsraid -d /dev/hde
[dev 9, 0] /dev/md/0 38081921.59A998F9.64C1A001.EC534EF2 offline
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 34, 64] /dev/hdh 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 34, 0] /dev/hdg 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 33, 64] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown
[dev 33, 0] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown
[dev 33, 0] /dev/hde 38081921.59A998F9.64C1A001.EC534EF2 unbound

root@finn:/etc# lsraid -d /dev/hdf
[dev 9, 0] /dev/md/0 38081921.59A998F9.64C1A001.EC534EF2 offline
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 34, 64] /dev/hdh 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 34, 0] /dev/hdg 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 33, 64] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown
[dev 33, 0] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown
[dev 33, 64] /dev/hdf 38081921.59A998F9.64C1A001.EC534EF2 unbound

root@finn:/etc# lsraid -d /dev/hdg
[dev 9, 0] /dev/md/0 38081921.59A998F9.64C1A001.EC534EF2 offline
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 34, 64] /dev/hdh 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 34, 0] /dev/hdg 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 33, 64] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown
[dev 33, 0] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown

root@finn:/etc# lsraid -d /dev/hdh
[dev 9, 0] /dev/md/0 38081921.59A998F9.64C1A001.EC534EF2 offline
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 34, 64] /dev/hdh 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 34, 0] /dev/hdg 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 33, 64] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown
[dev 33, 0] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown

Thanks again,
	Nate

On Mon, 2006-04-17 at 08:46 +1000, Neil Brown wrote:
> On Saturday April 15, nate@qabal.org wrote:
> > Hi All,
> > Recently I lost a disk in my RAID5 SW array. It seems that it took a
> > second disk with it. The other disk appears to still be functional (from
> > an fdisk perspective...). I am trying to get the array to work in
> > degraded mode via failed-disk in raidtab, but I always get the
> > following error when I try to raidstart the array:
> >
> > md: could not bd_claim hde.
> > md: autostart failed!
> >
> > Is it the case that I had been running in degraded mode before the
> > disk failure, and then lost the other disk? If so, how can I tell?
>
> raidstart is deprecated.  It doesn't work reliably.  Don't use it.
>
> >
> > I have been messing about with mkraid -R and I have tried to
> > add /dev/hdf (a new disk) back to the array. However, I am fairly
> > confident that I have not kicked off the recovery process, so I am
> > imagining that once I get the superblocks in order, I should be able to
> > recover to the new disk?
> >
> > My system and raid config are:
> > Kernel 2.6.13.1
> > Slack 10.2
> > RAID 5 which originally looked like:
> > /dev/hde
> > /dev/hdg
> > /dev/hdi
> > /dev/hdk
> >
> > but when I moved the disks to another box with fewer IDE controllers, it became:
> > /dev/hde
> > /dev/hdf
> > /dev/hdg
> > /dev/hdh
> >
> > How should I approach this?
>
> mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd*
>
> If that doesn't work, add "--force" but be cautious of the data - do
> an fsck at least.
>
> NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 2:54 ` Nathanial Byrnes
@ 2006-04-17 3:04 ` Neil Brown
2006-04-17 10:08 ` Nathanial Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Neil Brown @ 2006-04-17 3:04 UTC (permalink / raw)
To: Nathanial Byrnes; +Cc: Nathanial Byrnes, linux-raid

On Sunday April 16, nate@qabal.org wrote:
> Hi Neil,
> 	Thanks for your reply. I tried that, but here is the error I
> received:
>
> root@finn:/etc# mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]
> mdadm: failed to add /dev/hdf to /dev/md0: Device or resource busy
> mdadm: /dev/md0 assembled from 2 drives and -1 spares - not enough to
> start the array.

Why is /dev/hdf busy?  Is it in use?  mounted?  something?

>
> The output from lsraid against each device is as follows (I think that I
> messed up my superblocks pretty well...):

Sorry, but I don't use lsraid and cannot tell anything useful from its
output.

NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 3:04 ` Neil Brown
@ 2006-04-17 10:08 ` Nathanial Byrnes
2006-04-17 10:29 ` Neil Brown
0 siblings, 1 reply; 18+ messages in thread
From: Nathanial Byrnes @ 2006-04-17 10:08 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid

Please see below.

On Mon, 2006-04-17 at 13:04 +1000, Neil Brown wrote:
> On Sunday April 16, nate@qabal.org wrote:
> > Hi Neil,
> > 	Thanks for your reply. I tried that, but here is the error I
> > received:
> >
> > root@finn:/etc# mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]
> > mdadm: failed to add /dev/hdf to /dev/md0: Device or resource busy
> > mdadm: /dev/md0 assembled from 2 drives and -1 spares - not enough to
> > start the array.
>
> Why is /dev/hdf busy?  Is it in use?  mounted?  something?
>
Not that I am aware of. Here is the mount output:

root@finn:/etc# mount
/dev/sda1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
/dev/sdb1 on /usr type ext3 (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
nfsd on /proc/fs/nfsd type nfsd (rw)
usbfs on /proc/bus/usb type usbfs (rw)

lsof | grep hdf does not return any results.

Is there some other way to find out?

> >
> > The output from lsraid against each device is as follows (I think that I
> > messed up my superblocks pretty well...):
>
> Sorry, but I don't use lsraid and cannot tell anything useful from its
> output.

ok

>
> NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 10:08 ` Nathanial Byrnes
@ 2006-04-17 10:29 ` Neil Brown
2006-04-17 12:15 ` Nate Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Neil Brown @ 2006-04-17 10:29 UTC (permalink / raw)
To: Nathanial Byrnes; +Cc: linux-raid

On Monday April 17, nate@qabal.org wrote:
> >
> > Why is /dev/hdf busy?  Is it in use?  mounted?  something?
> >
> Not that I am aware of. Here is the mount output:
>
> root@finn:/etc# mount
> /dev/sda1 on / type ext3 (rw)
> proc on /proc type proc (rw)
> sysfs on /sys type sysfs (rw)
> /dev/sdb1 on /usr type ext3 (rw)
> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> nfsd on /proc/fs/nfsd type nfsd (rw)
> usbfs on /proc/bus/usb type usbfs (rw)
>
> lsof | grep hdf does not return any results.
>
> Is there some other way to find out?

cat /proc/swaps
cat /proc/mounts
cat /proc/mdstat

as well as 'lsof' should find it.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 10:29 ` Neil Brown
@ 2006-04-17 12:15 ` Nate Byrnes
2006-04-17 19:29 ` Nate Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Nate Byrnes @ 2006-04-17 12:15 UTC (permalink / raw)
To: Neil Brown; +Cc: Nathanial Byrnes, linux-raid

Hi Neil,
    Nothing references hdf, as you can see below. I have also rmmod'ed the
md and raid5 modules and modprobed them back in. Thoughts?

Thanks again,
    Nate

root@finn:~# cat /proc/swaps
Filename        Type        Size      Used   Priority
/dev/sdb2       partition   1050616   1028   -1

root@finn:~# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext3 rw 0 0
proc /proc proc rw,nodiratime 0 0
sysfs /sys sysfs rw 0 0
none /dev ramfs rw 0 0
/dev/sdb1 /usr ext3 rw 0 0
devpts /dev/pts devpts rw 0 0
nfsd /proc/fs/nfsd nfsd rw 0 0
usbfs /proc/bus/usb usbfs rw 0 0

root@finn:~# cat /proc/mdstat
Personalities : [raid5]
md0 : inactive hdh[2] hdg[3] hde[1]
      234451968 blocks

unused devices: <none>

Neil Brown wrote:
> On Monday April 17, nate@qabal.org wrote:
>
>>> Why is /dev/hdf busy?  Is it in use?  mounted?  something?
>>>
>> Not that I am aware of. Here is the mount output:
>>
>> root@finn:/etc# mount
>> /dev/sda1 on / type ext3 (rw)
>> proc on /proc type proc (rw)
>> sysfs on /sys type sysfs (rw)
>> /dev/sdb1 on /usr type ext3 (rw)
>> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
>> nfsd on /proc/fs/nfsd type nfsd (rw)
>> usbfs on /proc/bus/usb type usbfs (rw)
>>
>> lsof | grep hdf does not return any results.
>>
>> Is there some other way to find out?
>>
>
> cat /proc/swaps
> cat /proc/mounts
> cat /proc/mdstat
>
> as well as 'lsof' should find it.

^ permalink raw reply [flat|nested] 18+ messages in thread
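
One detail worth noting in the /proc/mdstat output above: md0 is already
partially assembled ("inactive") and therefore still holds a bd_claim on
hde, hdg and hdh. Whether something similar is pinning hdf is not visible
here, but stopping the stale, half-assembled array before retrying is a
common first step. A sketch of that (an inference from the output shown,
not advice given in the thread):

# Stop the half-assembled, inactive array so it releases its member disks...
mdadm --stop /dev/md0

# ...then retry the assembly from scratch.
mdadm --assemble --force /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]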
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 12:15 ` Nate Byrnes
@ 2006-04-17 19:29 ` Nate Byrnes
2006-04-17 21:43 ` Neil Brown
0 siblings, 1 reply; 18+ messages in thread
From: Nate Byrnes @ 2006-04-17 19:29 UTC (permalink / raw)
Cc: Neil Brown, linux-raid

Hi Neil, List,
    Am I just out of luck? Perhaps a full reboot? Something else?
Thanks,
    Nate

Nate Byrnes wrote:
> Hi Neil,
>     Nothing references hdf, as you can see below. I have also rmmod'ed the
> md and raid5 modules and modprobed them back in. Thoughts?
>
> Thanks again,
>     Nate
>
> root@finn:~# cat /proc/swaps
> Filename        Type        Size      Used   Priority
> /dev/sdb2       partition   1050616   1028   -1
>
> root@finn:~# cat /proc/mounts
> rootfs / rootfs rw 0 0
> /dev/root / ext3 rw 0 0
> proc /proc proc rw,nodiratime 0 0
> sysfs /sys sysfs rw 0 0
> none /dev ramfs rw 0 0
> /dev/sdb1 /usr ext3 rw 0 0
> devpts /dev/pts devpts rw 0 0
> nfsd /proc/fs/nfsd nfsd rw 0 0
> usbfs /proc/bus/usb usbfs rw 0 0
>
> root@finn:~# cat /proc/mdstat
> Personalities : [raid5]
> md0 : inactive hdh[2] hdg[3] hde[1]
>       234451968 blocks
>
> unused devices: <none>
>
> Neil Brown wrote:
>> On Monday April 17, nate@qabal.org wrote:
>>
>>>> Why is /dev/hdf busy?  Is it in use?  mounted?  something?
>>>>
>>> Not that I am aware of. Here is the mount output:
>>>
>>> root@finn:/etc# mount
>>> /dev/sda1 on / type ext3 (rw)
>>> proc on /proc type proc (rw)
>>> sysfs on /sys type sysfs (rw)
>>> /dev/sdb1 on /usr type ext3 (rw)
>>> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
>>> nfsd on /proc/fs/nfsd type nfsd (rw)
>>> usbfs on /proc/bus/usb type usbfs (rw)
>>>
>>> lsof | grep hdf does not return any results.
>>>
>>> Is there some other way to find out?
>>>
>>
>> cat /proc/swaps
>> cat /proc/mounts
>> cat /proc/mdstat
>>
>> as well as 'lsof' should find it.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 19:29 ` Nate Byrnes
@ 2006-04-17 21:43 ` Neil Brown
2006-04-17 22:21 ` Nathanial Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Neil Brown @ 2006-04-17 21:43 UTC (permalink / raw)
To: Nate Byrnes; +Cc: linux-raid

On Monday April 17, nate@qabal.org wrote:
> Hi Neil, List,
>     Am I just out of luck? Perhaps a full reboot? Something else?
> Thanks,
>     Nate

Reboot and try again seems like the best bet at this stage.

NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 21:43 ` Neil Brown
@ 2006-04-17 22:21 ` Nathanial Byrnes
2006-04-18 0:24 ` Neil Brown
0 siblings, 1 reply; 18+ messages in thread
From: Nathanial Byrnes @ 2006-04-17 22:21 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid

Unfortunately nothing changed.

On Tue, 2006-04-18 at 07:43 +1000, Neil Brown wrote:
> On Monday April 17, nate@qabal.org wrote:
> > Hi Neil, List,
> >     Am I just out of luck? Perhaps a full reboot? Something else?
> > Thanks,
> >     Nate
>
> Reboot and try again seems like the best bet at this stage.
>
> NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 22:21 ` Nathanial Byrnes
@ 2006-04-18 0:24 ` Neil Brown
2006-04-18 10:07 ` Nathanial Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Neil Brown @ 2006-04-18 0:24 UTC (permalink / raw)
To: Nathanial Byrnes; +Cc: linux-raid

On Monday April 17, nate@qabal.org wrote:
> Unfortunately nothing changed.

Weird... so hdf still reports as 'busy'?
Is it mentioned anywhere in /var/log/messages since reboot?

What version of mdadm are you using?  Try 2.4.1 and see if that works
differently.

NeilBrown

> On Tue, 2006-04-18 at 07:43 +1000, Neil Brown wrote:
> > On Monday April 17, nate@qabal.org wrote:
> > > Hi Neil, List,
> > >     Am I just out of luck? Perhaps a full reboot? Something else?
> > > Thanks,
> > >     Nate
> >
> > Reboot and try again seems like the best bet at this stage.
> >
> > NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-18 0:24 ` Neil Brown
@ 2006-04-18 10:07 ` Nathanial Byrnes
0 siblings, 0 replies; 18+ messages in thread
From: Nathanial Byrnes @ 2006-04-18 10:07 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid

2.4.1 behaves just like 2.1. So far, nothing in the syslog or messages.

On Tue, 2006-04-18 at 10:24 +1000, Neil Brown wrote:
> On Monday April 17, nate@qabal.org wrote:
> > Unfortunately nothing changed.
>
> Weird... so hdf still reports as 'busy'?
> Is it mentioned anywhere in /var/log/messages since reboot?
>
> What version of mdadm are you using?  Try 2.4.1 and see if that works
> differently.
>
> NeilBrown
>
> > On Tue, 2006-04-18 at 07:43 +1000, Neil Brown wrote:
> > > On Monday April 17, nate@qabal.org wrote:
> > > > Hi Neil, List,
> > > >     Am I just out of luck? Perhaps a full reboot? Something else?
> > > > Thanks,
> > > >     Nate
> > >
> > > Reboot and try again seems like the best bet at this stage.
> > >
> > > NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-15 13:01 RAID5 recovery trouble, bd_claim failed? Nathanial Byrnes
2006-04-16 22:46 ` Neil Brown
@ 2006-04-18 22:13 ` Maurice Hilarius
2006-04-18 23:39 ` Nathanial Byrnes
1 sibling, 1 reply; 18+ messages in thread
From: Maurice Hilarius @ 2006-04-18 22:13 UTC (permalink / raw)
To: Nathanial Byrnes; +Cc: linux-raid

Nathanial Byrnes wrote:
> Hi All,
> Recently I lost a disk in my RAID5 SW array. It seems that it took a
> second disk with it. The other disk appears to still be functional (from
> an fdisk perspective...). I am trying to get the array to work in
> degraded mode via failed-disk in raidtab, but I always get the
> following error when I try to raidstart the array:
>
>
Let me guess:
IDE disks, in pairs.
Jumpered as Master and Slave.

Right?

--
With our best regards,

Maurice W. Hilarius       Telephone: 01-780-456-9771
Hard Data Ltd.            FAX:       01-780-456-9772
11060 - 166 Avenue        email: maurice@harddata.com
Edmonton, AB, Canada      http://www.harddata.com/
T5X 1Y3

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-18 22:13 ` Maurice Hilarius
@ 2006-04-18 23:39 ` Nathanial Byrnes
2006-04-19 13:41 ` Maurice Hilarius
0 siblings, 1 reply; 18+ messages in thread
From: Nathanial Byrnes @ 2006-04-18 23:39 UTC (permalink / raw)
To: Maurice Hilarius; +Cc: linux-raid

Yes, I did not have the funding or approval to purchase more hardware
when I set it up (read: wife). Once it was working... the rest is
history.

On Tue, 2006-04-18 at 16:13 -0600, Maurice Hilarius wrote:
> Nathanial Byrnes wrote:
> > Hi All,
> > Recently I lost a disk in my RAID5 SW array. It seems that it took a
> > second disk with it. The other disk appears to still be functional (from
> > an fdisk perspective...). I am trying to get the array to work in
> > degraded mode via failed-disk in raidtab, but I always get the
> > following error when I try to raidstart the array:
> >
> >
> Let me guess:
> IDE disks, in pairs.
> Jumpered as Master and Slave.
>
> Right?

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-18 23:39 ` Nathanial Byrnes
@ 2006-04-19 13:41 ` Maurice Hilarius
2006-04-19 13:53 ` Nate Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Maurice Hilarius @ 2006-04-19 13:41 UTC (permalink / raw)
Cc: linux-raid, neilb, nate

Nathanial Byrnes wrote:
> Yes, I did not have the funding or approval to purchase more hardware
> when I set it up (read: wife). Once it was working... the rest is
> history.
>
OK, so if you have a pair of IDE disks, jumpered as Master and Slave,
and if one fails:

If the Master failed, re-jumper the remaining disk on that cable as
Master, with no Slave present.

If the Slave failed, re-jumper the remaining disk on that cable as
Master, with no Slave present.

Then you will have the remaining disk working normally, at least.

When you can afford it, I suggest buying a controller with enough ports
to support the number of drives you have, with no Master/Slave pairing.

Good luck!

And to the software guys trying to help: we need to start with the
(obvious) hardware problem before we advise on how to recover data from
a borked system. Once he has the jumpering on the drives sorted out, the
drive that went missing will be back again.

--
Regards,

Maurice

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-19 13:41 ` Maurice Hilarius
@ 2006-04-19 13:53 ` Nate Byrnes
2006-04-19 14:04 ` Maurice Hilarius
0 siblings, 1 reply; 18+ messages in thread
From: Nate Byrnes @ 2006-04-19 13:53 UTC (permalink / raw)
To: Maurice Hilarius; +Cc: linux-raid, neilb

Hi All,
    I'm not sure that is entirely the case. From a hardware
perspective, I can access all the disks from the OS, via fdisk and dd.
It is really just mdadm that is failing. Would I still need to work
the jumper issue?
Thanks,
    Nate

Maurice Hilarius wrote:
> Nathanial Byrnes wrote:
>
>> Yes, I did not have the funding or approval to purchase more hardware
>> when I set it up (read: wife). Once it was working... the rest is
>> history.
>>
>
> OK, so if you have a pair of IDE disks, jumpered as Master and Slave,
> and if one fails:
>
> If the Master failed, re-jumper the remaining disk on that cable as
> Master, with no Slave present.
>
> If the Slave failed, re-jumper the remaining disk on that cable as
> Master, with no Slave present.
>
> Then you will have the remaining disk working normally, at least.
>
> When you can afford it, I suggest buying a controller with enough ports
> to support the number of drives you have, with no Master/Slave pairing.
>
> Good luck!
>
> And to the software guys trying to help: we need to start with the
> (obvious) hardware problem before we advise on how to recover data from
> a borked system. Once he has the jumpering on the drives sorted out, the
> drive that went missing will be back again.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-19 13:53 ` Nate Byrnes
@ 2006-04-19 14:04 ` Maurice Hilarius
2006-04-19 14:20 ` Nate Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Maurice Hilarius @ 2006-04-19 14:04 UTC (permalink / raw)
To: Nate Byrnes; +Cc: linux-raid, neilb

Nate Byrnes wrote:
> Hi All,
>     I'm not sure that is entirely the case. From a hardware
> perspective, I can access all the disks from the OS, via fdisk and dd.
> It is really just mdadm that is failing. Would I still need to work
> the jumper issue?
> Thanks,
>     Nate
>
IF the disks are as we suspect (Master and Slave relationships) and IF
you now have either a failed or a removed drive, then you MUST correct
the jumpering.

Sure, you can often still see a disk that is misconfigured. It is almost
certain, however, that when you write to it you will simply cause
corruption on it.

Of course, so far this is all speculation, as you have not actually said
what the disks, controller interfaces, jumpering and so forth are.
I was merely speculating, based on what you have said.

No amount of software magic will "cure" a hardware problem.

--
With our best regards,

Maurice W. Hilarius       Telephone: 01-780-456-9771
Hard Data Ltd.            FAX:       01-780-456-9772
11060 - 166 Avenue        email: maurice@harddata.com
Edmonton, AB, Canada      http://www.harddata.com/
T5X 1Y3

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-19 14:04 ` Maurice Hilarius
@ 2006-04-19 14:20 ` Nate Byrnes
0 siblings, 0 replies; 18+ messages in thread
From: Nate Byrnes @ 2006-04-19 14:20 UTC (permalink / raw)
To: Maurice Hilarius; +Cc: Nate Byrnes, linux-raid

Hello,
    I replaced the failed disk. The configuration is /dev/hde and
/dev/hdf (replaced) on IDE channel 0, and /dev/hdg and /dev/hdh on IDE
channel 1, on a single PCI controller card. The issue here is that hde
is now also not accessible after the failure of hdf. I cannot see the
jumper configs as the server is at home, and I am at work. The general
thinking was that the hde superblock got hosed with the loss of hdf.

My initial post only discussed the disk ordering and device names. As I
had replaced the disk which had failed (in a previously fully
functioning array) with a new disk with exactly the same configuration
(jumpers, cable locations, etc.), and each of the disks could be
accessed, my thinking was that there would not be a hardware problem to
sort through. Is this logic flawed?

Thanks again,
    Nate

Maurice Hilarius wrote:
> Nate Byrnes wrote:
>
>> Hi All,
>>     I'm not sure that is entirely the case. From a hardware
>> perspective, I can access all the disks from the OS, via fdisk and dd.
>> It is really just mdadm that is failing. Would I still need to work
>> the jumper issue?
>> Thanks,
>>     Nate
>>
> IF the disks are as we suspect (Master and Slave relationships) and IF
> you now have either a failed or a removed drive, then you MUST correct
> the jumpering.
>
> Sure, you can often still see a disk that is misconfigured. It is almost
> certain, however, that when you write to it you will simply cause
> corruption on it.
>
> Of course, so far this is all speculation, as you have not actually said
> what the disks, controller interfaces, jumpering and so forth are.
> I was merely speculating, based on what you have said.
>
> No amount of software magic will "cure" a hardware problem.

^ permalink raw reply [flat|nested] 18+ messages in thread
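
For what it's worth, the master/slave question Maurice raises can be at
least partially checked without opening the case, by confirming which IDE
devices the kernel detected and that each drive still answers an identify
request. A sketch using standard dmesg/hdparm tooling (device names as used
in the thread; this only confirms detection, not the physical jumper
positions):

# Kernel's view of which IDE devices were detected on each channel.
dmesg | grep -E 'hd[e-h]'

# Identification data straight from each drive; a drive that answers
# here is at least responding on the bus.
for d in /dev/hde /dev/hdf /dev/hdg /dev/hdh; do
    echo "== $d =="
    hdparm -i "$d" | head -n 5
done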
end of thread, other threads:[~2006-04-19 14:20 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-04-15 13:01 RAID5 recovery trouble, bd_claim failed? Nathanial Byrnes
2006-04-16 22:46 ` Neil Brown
2006-04-17 2:54 ` Nathanial Byrnes
2006-04-17 3:04 ` Neil Brown
2006-04-17 10:08 ` Nathanial Byrnes
2006-04-17 10:29 ` Neil Brown
2006-04-17 12:15 ` Nate Byrnes
2006-04-17 19:29 ` Nate Byrnes
2006-04-17 21:43 ` Neil Brown
2006-04-17 22:21 ` Nathanial Byrnes
2006-04-18 0:24 ` Neil Brown
2006-04-18 10:07 ` Nathanial Byrnes
2006-04-18 22:13 ` Maurice Hilarius
2006-04-18 23:39 ` Nathanial Byrnes
2006-04-19 13:41 ` Maurice Hilarius
2006-04-19 13:53 ` Nate Byrnes
2006-04-19 14:04 ` Maurice Hilarius
2006-04-19 14:20 ` Nate Byrnes