* RAID5 recovery trouble, bd_claim failed?
@ 2006-04-15 13:01 Nathanial Byrnes
2006-04-16 22:46 ` Neil Brown
2006-04-18 22:13 ` Maurice Hilarius
0 siblings, 2 replies; 18+ messages in thread
From: Nathanial Byrnes @ 2006-04-15 13:01 UTC (permalink / raw)
To: linux-raid
Hi All,
Recently I lost a disk in my RAID5 SW array. It seems that it took a
second disk with it. The other disk appears to still be functional (from
an fdisk perspective...). I am trying to get the array to work in
degraded mode via failed-disk in raidtab, but I always get the
following error when I try to raidstart the array:

md: could not bd_claim hde.
md: autostart failed!

Is it the case that I had been running in degraded mode before the
disk failure, and then lost the other disk? If so, how can I tell?
I have been messing about with mkraid -R and I have tried to
add /dev/hdf (a new disk) back to the array. However, I am fairly
confident that I have not kicked off the recovery process, so I am
imagining that once I get the superblocks in order, I should be able to
recover to the new disk?
My system and raid config are:
Kernel 2.6.13.1
Slack 10.2
RAID 5 which originally looked like:
/dev/hde
/dev/hdg
/dev/hdi
/dev/hdk
but when I moved the disks to another box with fewer IDE controllers, it became:
/dev/hde
/dev/hdf
/dev/hdg
/dev/hdh
How should I approach this?
Below is the output of mdadm --examine /dev/hd*
Thanks in advance,
Nate
/dev/hde:
Magic : a92b4efc
Version : 00.90.00
UUID : 38081921:59a998f9:64c1a001:ec534ef2
Creation Time : Fri Aug 22 16:34:37 2003
Raid Level : raid5
Device Size : 78150656 (74.53 GiB 80.03 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Wed Apr 12 02:26:37 2006
State : active
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0
Checksum : 165c1b4c - correct
Events : 0.37523832
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 1 33 0 1 active sync /dev/hde
0 0 0 0 0 removed
1 1 33 0 1 active sync /dev/hde
2 2 34 64 2 active sync /dev/hdh
3 3 34 0 3 active sync /dev/hdg
/dev/hdf:
Magic : a92b4efc
Version : 00.90.00
UUID : 38081921:59a998f9:64c1a001:ec534ef2
Creation Time : Fri Aug 22 16:34:37 2003
Raid Level : raid5
Device Size : 78150656 (74.53 GiB 80.03 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Wed Apr 12 02:26:37 2006
State : active
Active Devices : 3
Working Devices : 3
Failed Devices : 1
Spare Devices : 0
Checksum : 165c1bc5 - correct
Events : 0.37523832
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 3 33 64 -1 sync /dev/hdf
0 0 0 0 0 removed
1 1 33 0 1 active sync /dev/hde
2 2 34 64 2 active sync /dev/hdh
3 3 33 64 -1 sync /dev/hdf
/dev/hdg:
Magic : a92b4efc
Version : 00.90.00
UUID : 38081921:59a998f9:64c1a001:ec534ef2
Creation Time : Fri Aug 22 16:34:37 2003
Raid Level : raid5
Device Size : 78150656 (74.53 GiB 80.03 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Wed Apr 12 06:12:58 2006
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 3
Spare Devices : 0
Checksum : 1898e1fd - correct
Events : 0.37523844
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 3 34 0 3 active sync /dev/hdg
0 0 0 0 0 removed
1 1 0 0 1 faulty removed
2 2 34 64 2 active sync /dev/hdh
3 3 34 0 3 active sync /dev/hdg
/dev/hdh:
Magic : a92b4efc
Version : 00.90.00
UUID : 38081921:59a998f9:64c1a001:ec534ef2
Creation Time : Fri Aug 22 16:34:37 2003
Raid Level : raid5
Device Size : 78150656 (74.53 GiB 80.03 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Update Time : Wed Apr 12 06:12:58 2006
State : clean
Active Devices : 2
Working Devices : 2
Failed Devices : 3
Spare Devices : 0
Checksum : 1898e23b - correct
Events : 0.37523844
Layout : left-symmetric
Chunk Size : 128K
Number Major Minor RaidDevice State
this 2 34 64 2 active sync /dev/hdh
0 0 0 0 0 removed
1 1 0 0 1 faulty removed
2 2 34 64 2 active sync /dev/hdh
3 3 34 0 3 active sync /dev/hdg
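
The Events and Update Time fields in the output above are what md uses to
judge which superblocks are current. A minimal sketch of how they could be
compared across the members (device names taken from the thread; exact
--examine output formatting may differ between mdadm versions):

# Print the event counter, update time and state recorded on each member's
# superblock; members with the lower event count (hde and hdf here,
# 0.37523832 vs 0.37523844) hold stale metadata.
for d in /dev/hde /dev/hdf /dev/hdg /dev/hdh; do
    echo "== $d =="
    mdadm --examine "$d" | grep -E 'Events|Update Time|State :'
done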
^ permalink raw reply [flat|nested] 18+ messages in thread

* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-15 13:01 RAID5 recovery trouble, bd_claim failed? Nathanial Byrnes
@ 2006-04-16 22:46 ` Neil Brown
2006-04-17 2:54 ` Nathanial Byrnes
2006-04-18 22:13 ` Maurice Hilarius
1 sibling, 1 reply; 18+ messages in thread
From: Neil Brown @ 2006-04-16 22:46 UTC (permalink / raw)
To: Nathanial Byrnes; +Cc: linux-raid

On Saturday April 15, nate@qabal.org wrote:
> Hi All,
> Recently I lost a disk in my RAID5 SW array. It seems that it took a
> second disk with it. The other disk appears to still be functional (from
> an fdisk perspective...). I am trying to get the array to work in
> degraded mode via failed-disk in raidtab, but I always get the
> following error when I try to raidstart the array:
>
> md: could not bd_claim hde.
> md: autostart failed!
>
> Is it the case that I had been running in degraded mode before the
> disk failure, and then lost the other disk? If so, how can I tell?

raidstart is deprecated.  It doesn't work reliably.  Don't use it.

>
> I have been messing about with mkraid -R and I have tried to
> add /dev/hdf (a new disk) back to the array. However, I am fairly
> confident that I have not kicked off the recovery process, so I am
> imagining that once I get the superblocks in order, I should be able to
> recover to the new disk?
>
> My system and raid config are:
> Kernel 2.6.13.1
> Slack 10.2
> RAID 5 which originally looked like:
> /dev/hde
> /dev/hdg
> /dev/hdi
> /dev/hdk
>
> but when I moved the disks to another box with fewer IDE controllers, it became:
> /dev/hde
> /dev/hdf
> /dev/hdg
> /dev/hdh
>
> How should I approach this?

mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd*

If that doesn't work, add "--force" but be cautious of the data - do
an fsck at least.

NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
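
In concrete terms, the sequence Neil describes would look roughly like the
sketch below. The device names and the /dev/md0 target are taken from the
thread; the read-only fsck and read-only mount at the end are a cautious
assumption, not part of his reply:

# Try a normal assemble by array UUID first.
mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]

# Only if that refuses to start the array, retry with --force, which lets
# mdadm reconcile mismatched event counts on the freshest superblocks.
mdadm --assemble --force /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]

# Check the result before trusting it: a no-change fsck, then a read-only mount.
fsck -n /dev/md0
mount -o ro /dev/md0 /mnt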
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-16 22:46 ` Neil Brown
@ 2006-04-17 2:54 ` Nathanial Byrnes
2006-04-17 3:04 ` Neil Brown
0 siblings, 1 reply; 18+ messages in thread
From: Nathanial Byrnes @ 2006-04-17 2:54 UTC (permalink / raw)
To: Neil Brown; +Cc: Nathanial Byrnes, linux-raid

Hi Neil,
	Thanks for your reply. I tried that, but here is the error I
received:

root@finn:/etc# mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]
mdadm: failed to add /dev/hdf to /dev/md0: Device or resource busy
mdadm: /dev/md0 assembled from 2 drives and -1 spares - not enough to
start the array.

The output from lsraid against each device is as follows (I think that I
messed up my superblocks pretty well...):

root@finn:/etc# lsraid -d /dev/hde
[dev 9, 0] /dev/md/0 38081921.59A998F9.64C1A001.EC534EF2 offline
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 34, 64] /dev/hdh 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 34, 0] /dev/hdg 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 33, 64] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown
[dev 33, 0] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown
[dev 33, 0] /dev/hde 38081921.59A998F9.64C1A001.EC534EF2 unbound

root@finn:/etc# lsraid -d /dev/hdf
[dev 9, 0] /dev/md/0 38081921.59A998F9.64C1A001.EC534EF2 offline
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 34, 64] /dev/hdh 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 34, 0] /dev/hdg 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 33, 64] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown
[dev 33, 0] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown
[dev 33, 64] /dev/hdf 38081921.59A998F9.64C1A001.EC534EF2 unbound

root@finn:/etc# lsraid -d /dev/hdg
[dev 9, 0] /dev/md/0 38081921.59A998F9.64C1A001.EC534EF2 offline
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 34, 64] /dev/hdh 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 34, 0] /dev/hdg 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 33, 64] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown
[dev 33, 0] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown

root@finn:/etc# lsraid -d /dev/hdh
[dev 9, 0] /dev/md/0 38081921.59A998F9.64C1A001.EC534EF2 offline
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev ?, ?] (unknown) 00000000.00000000.00000000.00000000 missing
[dev 34, 64] /dev/hdh 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 34, 0] /dev/hdg 38081921.59A998F9.64C1A001.EC534EF2 good
[dev 33, 64] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown
[dev 33, 0] (unknown) 38081921.59A998F9.64C1A001.EC534EF2 unknown

Thanks again,
	Nate

On Mon, 2006-04-17 at 08:46 +1000, Neil Brown wrote:
> On Saturday April 15, nate@qabal.org wrote:
> > Hi All,
> > Recently I lost a disk in my RAID5 SW array. It seems that it took a
> > second disk with it. The other disk appears to still be functional (from
> > an fdisk perspective...). I am trying to get the array to work in
> > degraded mode via failed-disk in raidtab, but I always get the
> > following error when I try to raidstart the array:
> >
> > md: could not bd_claim hde.
> > md: autostart failed!
> >
> > Is it the case that I had been running in degraded mode before the
> > disk failure, and then lost the other disk? If so, how can I tell?
>
> raidstart is deprecated.  It doesn't work reliably.  Don't use it.
>
> >
> > I have been messing about with mkraid -R and I have tried to
> > add /dev/hdf (a new disk) back to the array. However, I am fairly
> > confident that I have not kicked off the recovery process, so I am
> > imagining that once I get the superblocks in order, I should be able to
> > recover to the new disk?
> >
> > My system and raid config are:
> > Kernel 2.6.13.1
> > Slack 10.2
> > RAID 5 which originally looked like:
> > /dev/hde
> > /dev/hdg
> > /dev/hdi
> > /dev/hdk
> >
> > but when I moved the disks to another box with fewer IDE controllers, it became:
> > /dev/hde
> > /dev/hdf
> > /dev/hdg
> > /dev/hdh
> >
> > How should I approach this?
>
> mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd*
>
> If that doesn't work, add "--force" but be cautious of the data - do
> an fsck at least.
>
> NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 2:54 ` Nathanial Byrnes
@ 2006-04-17 3:04 ` Neil Brown
2006-04-17 10:08 ` Nathanial Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Neil Brown @ 2006-04-17 3:04 UTC (permalink / raw)
To: Nathanial Byrnes; +Cc: Nathanial Byrnes, linux-raid

On Sunday April 16, nate@qabal.org wrote:
> Hi Neil,
> 	Thanks for your reply. I tried that, but here is the error I
> received:
>
> root@finn:/etc# mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]
> mdadm: failed to add /dev/hdf to /dev/md0: Device or resource busy
> mdadm: /dev/md0 assembled from 2 drives and -1 spares - not enough to
> start the array.

Why is /dev/hdf busy?  Is it in use?  mounted?  something?

>
> The output from lsraid against each device is as follows (I think that I
> messed up my superblocks pretty well...):

Sorry, but I don't use lsraid and cannot tell anything useful from its
output.

NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 3:04 ` Neil Brown
@ 2006-04-17 10:08 ` Nathanial Byrnes
2006-04-17 10:29 ` Neil Brown
0 siblings, 1 reply; 18+ messages in thread
From: Nathanial Byrnes @ 2006-04-17 10:08 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid

Please see below.

On Mon, 2006-04-17 at 13:04 +1000, Neil Brown wrote:
> On Sunday April 16, nate@qabal.org wrote:
> > Hi Neil,
> > 	Thanks for your reply. I tried that, but here is the error I
> > received:
> >
> > root@finn:/etc# mdadm --assemble /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]
> > mdadm: failed to add /dev/hdf to /dev/md0: Device or resource busy
> > mdadm: /dev/md0 assembled from 2 drives and -1 spares - not enough to
> > start the array.
>
> Why is /dev/hdf busy?  Is it in use?  mounted?  something?
>
Not that I am aware of. Here is the mount output:

root@finn:/etc# mount
/dev/sda1 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
/dev/sdb1 on /usr type ext3 (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
nfsd on /proc/fs/nfsd type nfsd (rw)
usbfs on /proc/bus/usb type usbfs (rw)

lsof | grep hdf does not return any results.

Is there some other way to find out?

> >
> > The output from lsraid against each device is as follows (I think that I
> > messed up my superblocks pretty well...):
>
> Sorry, but I don't use lsraid and cannot tell anything useful from its
> output.

ok

>
> NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 10:08 ` Nathanial Byrnes
@ 2006-04-17 10:29 ` Neil Brown
2006-04-17 12:15 ` Nate Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Neil Brown @ 2006-04-17 10:29 UTC (permalink / raw)
To: Nathanial Byrnes; +Cc: linux-raid

On Monday April 17, nate@qabal.org wrote:
> >
> > Why is /dev/hdf busy?  Is it in use?  mounted?  something?
> >
> Not that I am aware of. Here is the mount output:
>
> root@finn:/etc# mount
> /dev/sda1 on / type ext3 (rw)
> proc on /proc type proc (rw)
> sysfs on /sys type sysfs (rw)
> /dev/sdb1 on /usr type ext3 (rw)
> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
> nfsd on /proc/fs/nfsd type nfsd (rw)
> usbfs on /proc/bus/usb type usbfs (rw)
>
> lsof | grep hdf does not return any results.
>
> Is there some other way to find out?

cat /proc/swaps
cat /proc/mounts
cat /proc/mdstat

as well as 'lsof' should find it.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 10:29 ` Neil Brown
@ 2006-04-17 12:15 ` Nate Byrnes
2006-04-17 19:29 ` Nate Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Nate Byrnes @ 2006-04-17 12:15 UTC (permalink / raw)
To: Neil Brown; +Cc: Nathanial Byrnes, linux-raid

Hi Neil,
    Nothing references hdf, as you can see below. I have also rmmod'ed the
md and raid5 modules and modprobed them back in. Thoughts?

Thanks again,
    Nate

root@finn:~# cat /proc/swaps
Filename        Type        Size      Used   Priority
/dev/sdb2       partition   1050616   1028   -1

root@finn:~# cat /proc/mounts
rootfs / rootfs rw 0 0
/dev/root / ext3 rw 0 0
proc /proc proc rw,nodiratime 0 0
sysfs /sys sysfs rw 0 0
none /dev ramfs rw 0 0
/dev/sdb1 /usr ext3 rw 0 0
devpts /dev/pts devpts rw 0 0
nfsd /proc/fs/nfsd nfsd rw 0 0
usbfs /proc/bus/usb usbfs rw 0 0

root@finn:~# cat /proc/mdstat
Personalities : [raid5]
md0 : inactive hdh[2] hdg[3] hde[1]
      234451968 blocks

unused devices: <none>

Neil Brown wrote:
> On Monday April 17, nate@qabal.org wrote:
>
>>> Why is /dev/hdf busy?  Is it in use?  mounted?  something?
>>>
>> Not that I am aware of. Here is the mount output:
>>
>> root@finn:/etc# mount
>> /dev/sda1 on / type ext3 (rw)
>> proc on /proc type proc (rw)
>> sysfs on /sys type sysfs (rw)
>> /dev/sdb1 on /usr type ext3 (rw)
>> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
>> nfsd on /proc/fs/nfsd type nfsd (rw)
>> usbfs on /proc/bus/usb type usbfs (rw)
>>
>> lsof | grep hdf does not return any results.
>>
>> Is there some other way to find out?
>>
>
> cat /proc/swaps
> cat /proc/mounts
> cat /proc/mdstat
>
> as well as 'lsof' should find it.

^ permalink raw reply [flat|nested] 18+ messages in thread
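
One detail worth noting in the /proc/mdstat output above: md0 is already
partially assembled ("inactive") and therefore still holds a bd_claim on
hde, hdg and hdh. Whether something similar is pinning hdf is not visible
here, but stopping the stale, half-assembled array before retrying is a
common first step. A sketch of that (an inference from the output shown,
not advice given in the thread):

# Stop the half-assembled, inactive array so it releases its member disks...
mdadm --stop /dev/md0

# ...then retry the assembly from scratch.
mdadm --assemble --force /dev/md0 --uuid=38081921:59a998f9:64c1a001:ec534ef2 /dev/hd[efgh]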
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 12:15 ` Nate Byrnes
@ 2006-04-17 19:29 ` Nate Byrnes
2006-04-17 21:43 ` Neil Brown
0 siblings, 1 reply; 18+ messages in thread
From: Nate Byrnes @ 2006-04-17 19:29 UTC (permalink / raw)
Cc: Neil Brown, linux-raid

Hi Neil, List,
    Am I just out of luck? Perhaps a full reboot? Something else?
Thanks,
    Nate

Nate Byrnes wrote:
> Hi Neil,
>     Nothing references hdf, as you can see below. I have also rmmod'ed the
> md and raid5 modules and modprobed them back in. Thoughts?
>
> Thanks again,
>     Nate
>
> root@finn:~# cat /proc/swaps
> Filename        Type        Size      Used   Priority
> /dev/sdb2       partition   1050616   1028   -1
>
> root@finn:~# cat /proc/mounts
> rootfs / rootfs rw 0 0
> /dev/root / ext3 rw 0 0
> proc /proc proc rw,nodiratime 0 0
> sysfs /sys sysfs rw 0 0
> none /dev ramfs rw 0 0
> /dev/sdb1 /usr ext3 rw 0 0
> devpts /dev/pts devpts rw 0 0
> nfsd /proc/fs/nfsd nfsd rw 0 0
> usbfs /proc/bus/usb usbfs rw 0 0
>
> root@finn:~# cat /proc/mdstat
> Personalities : [raid5]
> md0 : inactive hdh[2] hdg[3] hde[1]
>       234451968 blocks
>
> unused devices: <none>
>
> Neil Brown wrote:
>> On Monday April 17, nate@qabal.org wrote:
>>
>>>> Why is /dev/hdf busy?  Is it in use?  mounted?  something?
>>>>
>>> Not that I am aware of. Here is the mount output:
>>>
>>> root@finn:/etc# mount
>>> /dev/sda1 on / type ext3 (rw)
>>> proc on /proc type proc (rw)
>>> sysfs on /sys type sysfs (rw)
>>> /dev/sdb1 on /usr type ext3 (rw)
>>> devpts on /dev/pts type devpts (rw,gid=5,mode=620)
>>> nfsd on /proc/fs/nfsd type nfsd (rw)
>>> usbfs on /proc/bus/usb type usbfs (rw)
>>>
>>> lsof | grep hdf does not return any results.
>>>
>>> Is there some other way to find out?
>>>
>>
>> cat /proc/swaps
>> cat /proc/mounts
>> cat /proc/mdstat
>>
>> as well as 'lsof' should find it.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 19:29 ` Nate Byrnes
@ 2006-04-17 21:43 ` Neil Brown
2006-04-17 22:21 ` Nathanial Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Neil Brown @ 2006-04-17 21:43 UTC (permalink / raw)
To: Nate Byrnes; +Cc: linux-raid

On Monday April 17, nate@qabal.org wrote:
> Hi Neil, List,
>     Am I just out of luck? Perhaps a full reboot? Something else?
> Thanks,
>     Nate

Reboot and try again seems like the best bet at this stage.

NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 21:43 ` Neil Brown
@ 2006-04-17 22:21 ` Nathanial Byrnes
2006-04-18 0:24 ` Neil Brown
0 siblings, 1 reply; 18+ messages in thread
From: Nathanial Byrnes @ 2006-04-17 22:21 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid

Unfortunately nothing changed.

On Tue, 2006-04-18 at 07:43 +1000, Neil Brown wrote:
> On Monday April 17, nate@qabal.org wrote:
> > Hi Neil, List,
> >     Am I just out of luck? Perhaps a full reboot? Something else?
> > Thanks,
> >     Nate
>
> Reboot and try again seems like the best bet at this stage.
>
> NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-17 22:21 ` Nathanial Byrnes
@ 2006-04-18 0:24 ` Neil Brown
2006-04-18 10:07 ` Nathanial Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Neil Brown @ 2006-04-18 0:24 UTC (permalink / raw)
To: Nathanial Byrnes; +Cc: linux-raid

On Monday April 17, nate@qabal.org wrote:
> Unfortunately nothing changed.

Weird... so hdf still reports as 'busy'?
Is it mentioned anywhere in /var/log/messages since reboot?

What version of mdadm are you using?  Try 2.4.1 and see if that works
differently.

NeilBrown

> On Tue, 2006-04-18 at 07:43 +1000, Neil Brown wrote:
> > On Monday April 17, nate@qabal.org wrote:
> > > Hi Neil, List,
> > >     Am I just out of luck? Perhaps a full reboot? Something else?
> > > Thanks,
> > >     Nate
> >
> > Reboot and try again seems like the best bet at this stage.
> >
> > NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-18 0:24 ` Neil Brown
@ 2006-04-18 10:07 ` Nathanial Byrnes
0 siblings, 0 replies; 18+ messages in thread
From: Nathanial Byrnes @ 2006-04-18 10:07 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid

2.4.1 behaves just like 2.1. So far, nothing in the syslog or messages.

On Tue, 2006-04-18 at 10:24 +1000, Neil Brown wrote:
> On Monday April 17, nate@qabal.org wrote:
> > Unfortunately nothing changed.
>
> Weird... so hdf still reports as 'busy'?
> Is it mentioned anywhere in /var/log/messages since reboot?
>
> What version of mdadm are you using?  Try 2.4.1 and see if that works
> differently.
>
> NeilBrown
>
> > On Tue, 2006-04-18 at 07:43 +1000, Neil Brown wrote:
> > > On Monday April 17, nate@qabal.org wrote:
> > > > Hi Neil, List,
> > > >     Am I just out of luck? Perhaps a full reboot? Something else?
> > > > Thanks,
> > > >     Nate
> > >
> > > Reboot and try again seems like the best bet at this stage.
> > >
> > > NeilBrown

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-15 13:01 RAID5 recovery trouble, bd_claim failed? Nathanial Byrnes
2006-04-16 22:46 ` Neil Brown
@ 2006-04-18 22:13 ` Maurice Hilarius
2006-04-18 23:39 ` Nathanial Byrnes
1 sibling, 1 reply; 18+ messages in thread
From: Maurice Hilarius @ 2006-04-18 22:13 UTC (permalink / raw)
To: Nathanial Byrnes; +Cc: linux-raid

Nathanial Byrnes wrote:
> Hi All,
> Recently I lost a disk in my RAID5 SW array. It seems that it took a
> second disk with it. The other disk appears to still be functional (from
> an fdisk perspective...). I am trying to get the array to work in
> degraded mode via failed-disk in raidtab, but I always get the
> following error when I try to raidstart the array:
>
>
Let me guess:
IDE disks, in pairs.
Jumpered as Master and Slave.

Right?

--
With our best regards,

Maurice W. Hilarius       Telephone: 01-780-456-9771
Hard Data Ltd.            FAX:       01-780-456-9772
11060 - 166 Avenue        email: maurice@harddata.com
Edmonton, AB, Canada      http://www.harddata.com/
T5X 1Y3

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-18 22:13 ` Maurice Hilarius
@ 2006-04-18 23:39 ` Nathanial Byrnes
2006-04-19 13:41 ` Maurice Hilarius
0 siblings, 1 reply; 18+ messages in thread
From: Nathanial Byrnes @ 2006-04-18 23:39 UTC (permalink / raw)
To: Maurice Hilarius; +Cc: linux-raid

Yes, I did not have the funding or approval to purchase more hardware
when I set it up (read: wife). Once it was working... the rest is
history.

On Tue, 2006-04-18 at 16:13 -0600, Maurice Hilarius wrote:
> Nathanial Byrnes wrote:
> > Hi All,
> > Recently I lost a disk in my RAID5 SW array. It seems that it took a
> > second disk with it. The other disk appears to still be functional (from
> > an fdisk perspective...). I am trying to get the array to work in
> > degraded mode via failed-disk in raidtab, but I always get the
> > following error when I try to raidstart the array:
> >
> >
> Let me guess:
> IDE disks, in pairs.
> Jumpered as Master and Slave.
>
> Right?

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-18 23:39 ` Nathanial Byrnes
@ 2006-04-19 13:41 ` Maurice Hilarius
2006-04-19 13:53 ` Nate Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Maurice Hilarius @ 2006-04-19 13:41 UTC (permalink / raw)
Cc: linux-raid, neilb, nate

Nathanial Byrnes wrote:
> Yes, I did not have the funding or approval to purchase more hardware
> when I set it up (read: wife). Once it was working... the rest is
> history.
>
OK, so if you have a pair of IDE disks, jumpered as Master and Slave,
and if one fails:

If the Master failed, re-jumper the remaining disk on that cable as
Master, with no Slave present.

If the Slave failed, re-jumper the remaining disk on that cable as
Master, with no Slave present.

Then you will have the remaining disk working normally, at least.

When you can afford it, I suggest buying a controller with enough ports
to support the number of drives you have, with no Master/Slave pairing.

Good luck!

And to the software guys trying to help: we need to start with the
(obvious) hardware problem before we advise on how to recover data from
a borked system. Once he has the jumpering on the drives sorted out, the
drive that went missing will be back again.

--
Regards,

Maurice

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-19 13:41 ` Maurice Hilarius
@ 2006-04-19 13:53 ` Nate Byrnes
2006-04-19 14:04 ` Maurice Hilarius
0 siblings, 1 reply; 18+ messages in thread
From: Nate Byrnes @ 2006-04-19 13:53 UTC (permalink / raw)
To: Maurice Hilarius; +Cc: linux-raid, neilb

Hi All,
    I'm not sure that is entirely the case. From a hardware
perspective, I can access all the disks from the OS, via fdisk and dd.
It is really just mdadm that is failing. Would I still need to work
the jumper issue?
Thanks,
    Nate

Maurice Hilarius wrote:
> Nathanial Byrnes wrote:
>
>> Yes, I did not have the funding or approval to purchase more hardware
>> when I set it up (read: wife). Once it was working... the rest is
>> history.
>>
>
> OK, so if you have a pair of IDE disks, jumpered as Master and Slave,
> and if one fails:
>
> If the Master failed, re-jumper the remaining disk on that cable as
> Master, with no Slave present.
>
> If the Slave failed, re-jumper the remaining disk on that cable as
> Master, with no Slave present.
>
> Then you will have the remaining disk working normally, at least.
>
> When you can afford it, I suggest buying a controller with enough ports
> to support the number of drives you have, with no Master/Slave pairing.
>
> Good luck!
>
> And to the software guys trying to help: we need to start with the
> (obvious) hardware problem before we advise on how to recover data from
> a borked system. Once he has the jumpering on the drives sorted out, the
> drive that went missing will be back again.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-19 13:53 ` Nate Byrnes
@ 2006-04-19 14:04 ` Maurice Hilarius
2006-04-19 14:20 ` Nate Byrnes
0 siblings, 1 reply; 18+ messages in thread
From: Maurice Hilarius @ 2006-04-19 14:04 UTC (permalink / raw)
To: Nate Byrnes; +Cc: linux-raid, neilb

Nate Byrnes wrote:
> Hi All,
>     I'm not sure that is entirely the case. From a hardware
> perspective, I can access all the disks from the OS, via fdisk and dd.
> It is really just mdadm that is failing. Would I still need to work
> the jumper issue?
> Thanks,
>     Nate
>
IF the disks are as we suspect (Master and Slave relationships) and IF
you now have either a failed or a removed drive, then you MUST correct
the jumpering.

Sure, you can often still see a disk that is misconfigured. It is almost
certain, however, that when you write to it you will simply cause
corruption on it.

Of course, so far this is all speculation, as you have not actually said
what the disks, controller interfaces, jumpering and so forth are.
I was merely speculating, based on what you have said.

No amount of software magic will "cure" a hardware problem.

--
With our best regards,

Maurice W. Hilarius       Telephone: 01-780-456-9771
Hard Data Ltd.            FAX:       01-780-456-9772
11060 - 166 Avenue        email: maurice@harddata.com
Edmonton, AB, Canada      http://www.harddata.com/
T5X 1Y3

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: RAID5 recovery trouble, bd_claim failed?
2006-04-19 14:04 ` Maurice Hilarius
@ 2006-04-19 14:20 ` Nate Byrnes
0 siblings, 0 replies; 18+ messages in thread
From: Nate Byrnes @ 2006-04-19 14:20 UTC (permalink / raw)
To: Maurice Hilarius; +Cc: Nate Byrnes, linux-raid

Hello,
    I replaced the failed disk. The configuration is /dev/hde and
/dev/hdf (replaced) on IDE channel 0, and /dev/hdg and /dev/hdh on IDE
channel 1, on a single PCI controller card. The issue here is that hde
is now also not accessible after the failure of hdf. I cannot see the
jumper configs as the server is at home, and I am at work. The general
thinking was that the hde superblock got hosed with the loss of hdf.

My initial post only discussed the disk ordering and device names. As I
had replaced the disk which had failed (in a previously fully
functioning array) with a new disk with exactly the same configuration
(jumpers, cable locations, etc.), and each of the disks could be
accessed, my thinking was that there would not be a hardware problem to
sort through. Is this logic flawed?

Thanks again,
    Nate

Maurice Hilarius wrote:
> Nate Byrnes wrote:
>
>> Hi All,
>>     I'm not sure that is entirely the case. From a hardware
>> perspective, I can access all the disks from the OS, via fdisk and dd.
>> It is really just mdadm that is failing. Would I still need to work
>> the jumper issue?
>> Thanks,
>>     Nate
>>
> IF the disks are as we suspect (Master and Slave relationships) and IF
> you now have either a failed or a removed drive, then you MUST correct
> the jumpering.
>
> Sure, you can often still see a disk that is misconfigured. It is almost
> certain, however, that when you write to it you will simply cause
> corruption on it.
>
> Of course, so far this is all speculation, as you have not actually said
> what the disks, controller interfaces, jumpering and so forth are.
> I was merely speculating, based on what you have said.
>
> No amount of software magic will "cure" a hardware problem.

^ permalink raw reply [flat|nested] 18+ messages in thread
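
For what it's worth, the master/slave question Maurice raises can be at
least partially checked without opening the case, by confirming which IDE
devices the kernel detected and that each drive still answers an identify
request. A sketch using standard dmesg/hdparm tooling (device names as used
in the thread; this only confirms detection, not the physical jumper
positions):

# Kernel's view of which IDE devices were detected on each channel.
dmesg | grep -E 'hd[e-h]'

# Identification data straight from each drive; a drive that answers
# here is at least responding on the bus.
for d in /dev/hde /dev/hdf /dev/hdg /dev/hdh; do
    echo "== $d =="
    hdparm -i "$d" | head -n 5
done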
end of thread, other threads:[~2006-04-19 14:20 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-04-15 13:01 RAID5 recovery trouble, bd_claim failed? Nathanial Byrnes
2006-04-16 22:46 ` Neil Brown
2006-04-17 2:54 ` Nathanial Byrnes
2006-04-17 3:04 ` Neil Brown
2006-04-17 10:08 ` Nathanial Byrnes
2006-04-17 10:29 ` Neil Brown
2006-04-17 12:15 ` Nate Byrnes
2006-04-17 19:29 ` Nate Byrnes
2006-04-17 21:43 ` Neil Brown
2006-04-17 22:21 ` Nathanial Byrnes
2006-04-18 0:24 ` Neil Brown
2006-04-18 10:07 ` Nathanial Byrnes
2006-04-18 22:13 ` Maurice Hilarius
2006-04-18 23:39 ` Nathanial Byrnes
2006-04-19 13:41 ` Maurice Hilarius
2006-04-19 13:53 ` Nate Byrnes
2006-04-19 14:04 ` Maurice Hilarius
2006-04-19 14:20 ` Nate Byrnes