linux-raid.vger.kernel.org archive mirror
* Recovery raid5 after sata cable failure.
@ 2005-07-17 11:34 Francisco Zafra
  2005-07-17 22:10 ` Neil Brown
  2005-10-12 17:19 ` safe to test SATA array by pulling cables? Harry Mangalam
  0 siblings, 2 replies; 8+ messages in thread
From: Francisco Zafra @ 2005-07-17 11:34 UTC (permalink / raw)
  To: linux-raid

Hi all,
 
    I have had a raid5 array working without problems for some months. A SATA cable
failed and the raid5 kept working fine, keeping the superblock persistent, but now I
can't get the old device re-inserted into the array.
    This is the array right now:
 
root@Torero-2:/mnt/raid5 # mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.01
  Creation Time : Tue May 31 19:37:37 2005
     Raid Level : raid5
     Array Size : 1367507456 (1304.16 GiB 1400.33 GB)
    Device Size : 195358208 (186.31 GiB 200.05 GB)
   Raid Devices : 8
  Total Devices : 7
Preferred Minor : 0
    Persistence : Superblock is persistent
 
    Update Time : Sun Jul 17 13:06:17 2005
          State : clean, degraded
 Active Devices : 7
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 0
 
         Layout : left-symmetric
     Chunk Size : 512K
 
           UUID : c4ed8e45:2a036953:92bff479:7cf5bac9
         Events : 0.162797
 
    Number   Major   Minor   RaidDevice State
       0       8        1        0      active sync   /dev/sda1
       1       8       17        1      active sync   /dev/sdb1
       2       8       33        2      active sync   /dev/sdc1
       3       8       49        3      active sync   /dev/sdd1
       4       8       65        4      active sync   /dev/sde1
       5       8       81        5      active sync   /dev/sdf1
       6       8       97        6      active sync   /dev/sdg1
       7       0        0        -      removed


	And I tried to re-add the old disk this way:

root@Torero-2:/mnt/raid5 # mdadm /dev/md0 -a /dev/sdh1   
mdadm: Cannot open /dev/sdh1: Device or resource busy

	What is wrong? What am I doing wrong? sdh1 is absolutely unused, so I
don't understand the "resource busy" error.

	Thanks,

	Paco Zafra.



* RE: Recovery raid5 after sata cable failure.
       [not found] <Pine.LNX.4.61.0507170520090.8523@hoss.npl.com>
@ 2005-07-17 12:52 ` Francisco Zafra
  0 siblings, 0 replies; 8+ messages in thread
From: Francisco Zafra @ 2005-07-17 12:52 UTC (permalink / raw)
  To: 'Jim Radford'; +Cc: linux-raid

Hi, I am using LVM on this system, but only for the parallel ATA disks, which do
not use the raid system. Anyway, I deactivated the LVM units and tried again to
re-add the faulty drive, with the same results:

root@Torero-2:~ # mdadm /dev/md0 -a /dev/sdh1  
mdadm: Cannot open /dev/sdh1: Device or resource busy

Thanks,
Paco.

> -----Original Message-----
> From: Jim Radford [mailto:jradford@npl.com] 
> Sent: Sunday, July 17, 2005 14:21
> To: Francisco Zafra
> Subject: Re: Recovery raid5 after sata cable failure.
> 
> Francisco,
> 
> I was having a similar issue, and it was due to the fact that LVM 
> was starting before the raid array was assembled. If you are 
> using LVM, you might want to check that the array is assembled 
> before LVM starts. (It would probably be the same for EVMS.)
> 
> Regards,
> Jim
> 
> 
> 
> --
> ==========================================================================
> Jim Radford <jradford@npl.com>
> http://www.jimradford.com/
> ==========================================================================
> 



* Re: Recovery raid5 after sata cable failure.
  2005-07-17 11:34 Recovery raid5 after sata cable failure Francisco Zafra
@ 2005-07-17 22:10 ` Neil Brown
  2005-07-18  8:36   ` Francisco Zafra
  2005-10-12 17:19 ` safe to test SATA array by pulling cables? Harry Mangalam
  1 sibling, 1 reply; 8+ messages in thread
From: Neil Brown @ 2005-07-17 22:10 UTC (permalink / raw)
  To: Francisco Zafra; +Cc: linux-raid

On Sunday July 17, fzafra@gmail.com wrote:
> Hi all,
>  
>     I have had a raid5 array working without problems for some months. A SATA cable
> failed and the raid5 kept working fine, keeping the superblock persistent, but now I
> can't get the old device re-inserted into the array.
...
> 
> 	And I tried to re-add the old disk this way:
> 
> root@Torero-2:/mnt/raid5 # mdadm /dev/md0 -a /dev/sdh1   
> mdadm: Cannot open /dev/sdh1: Device or resource busy
> 
> 	What is wrong? What am I doing wrong? sdh1 is absolutely unused, so I
> don't understand the "resource busy" error.

Well, it definitely is busy...

Maybe it is still part of md0, but marked as 'faulty'.
If so (cat /proc/mdstat would tell you) you need to remove it first.
  mdadm /dev/md0 -r /dev/sdh1
  mdadm /dev/md0 -a /dev/sdh1

NeilBrown
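
The check suggested above can be sketched as a small shell helper. This is only an illustration, assuming the usual /proc/mdstat layout in which a member appears as "sdh1[7]" and a faulty member as "sdh1[7](F)". The name md_member_state and the optional file argument are hypothetical, not part of mdadm.

```shell
#!/bin/sh
# Sketch: report how a partition appears in an mdstat-style listing.
# md_member_state is a hypothetical helper, not an mdadm command.
# Usage: md_member_state <partition-name> [mdstat-file]
md_member_state() {
    part=$1
    file=${2:-/proc/mdstat}
    # Member entries look like "sdh1[7]"; a faulty one carries "(F)".
    if grep -Eq "(^| )${part}\[[0-9]+\]\(F\)" "$file"; then
        echo faulty
    elif grep -Eq "(^| )${part}\[[0-9]+\]" "$file"; then
        echo member
    else
        echo absent
    fi
}
```

If the helper prints "faulty", the remove-then-add sequence applies. If it prints "absent", the device is not attached to any array, so a hot remove would be expected to fail and the reason for "busy" has to be found elsewhere.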


* RE: Recovery raid5 after sata cable failure.
  2005-07-17 22:10 ` Neil Brown
@ 2005-07-18  8:36   ` Francisco Zafra
  2005-07-18  9:40     ` Neil Brown
  0 siblings, 1 reply; 8+ messages in thread
From: Francisco Zafra @ 2005-07-18  8:36 UTC (permalink / raw)
  To: 'Neil Brown'; +Cc: linux-raid

I already tried that:

root@Torero-2:~ # cat /proc/mdstat
Personalities : [linear] [raid5] 
md0 : active raid5 sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1] sda1[0]
      1367507456 blocks level 5, 512k chunk, algorithm 2 [8/7] [UUUUUUU_]
      
unused devices: <none>
root@Torero-2:~ # mdadm /dev/md0 -r /dev/sdh1
mdadm: hot remove failed for /dev/sdh1: No such device or address
root@Torero-2:~ # mdadm /dev/md0 -a /dev/sdh1
mdadm: Cannot open /dev/sdh1: Device or resource busy
root@Torero-2:~ # 

With no luck :(

Paco. 


* RE: Recovery raid5 after sata cable failure.
  2005-07-18  8:36   ` Francisco Zafra
@ 2005-07-18  9:40     ` Neil Brown
  2005-07-18  9:56       ` Francisco Zafra
  0 siblings, 1 reply; 8+ messages in thread
From: Neil Brown @ 2005-07-18  9:40 UTC (permalink / raw)
  To: Francisco Zafra; +Cc: linux-raid

On Monday July 18, fzafra@gmail.com wrote:
> I already tried that:
> 
> root@Torero-2:~ # cat /proc/mdstat
> Personalities : [linear] [raid5] 
> md0 : active raid5 sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1] sda1[0]
>       1367507456 blocks level 5, 512k chunk, algorithm 2 [8/7] [UUUUUUU_]
>       
> unused devices: <none>
> root@Torero-2:~ # mdadm /dev/md0 -r /dev/sdh1
> mdadm: hot remove failed for /dev/sdh1: No such device or address
> root@Torero-2:~ # mdadm /dev/md0 -a /dev/sdh1
> mdadm: Cannot open /dev/sdh1: Device or resource busy


Uhm, you might have a buggy version of mdadm.  If you have 1.10.0, get
an upgrade.

Otherwise either sdh1 or sdh must be:
  open by some process with O_EXCL
  open via a /dev/raw/* device
  part of an md device (which it obviously isn't)
  part of a dm device
  mounted as a filesystem
  an external-journal device for a jfs or ext3 or xfs filesystem
  in use as a swap device
  open for writing under a security level of 1 (whatever that means..)
  an mtd device that is open

(those are all the places that I can find that take an exclusive lock
 on a block device).

NeilBrown
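
A few of the cases in the list above can be checked from the shell. The following is a minimal sketch, not a complete test of every item: it covers only swap, mounted filesystems, and md/dm holders. The function names are hypothetical, the file arguments exist only so the checks can be tried against sample files, and the sysfs "holders" directory is an assumption that holds on newer kernels than the ones discussed in this thread.

```shell
#!/bin/sh
# Sketch: look for some common exclusive users of a block device.
# All function names are hypothetical helpers. File arguments default
# to the usual /proc and /sys locations.

# Succeeds if the device (e.g. /dev/sdh1) is listed as a swap device.
is_swap() {
    awk -v dev="$1" 'NR > 1 && $1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/swaps}"
}

# Succeeds if the device is mounted as a filesystem.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# Prints md/dm devices stacked on the partition, if the kernel exposes
# a holders directory for it (assumption: newer kernels only).
list_holders() {
    dir="${2:-/sys/class/block}/$1/holders"
    if [ -d "$dir" ]; then
        ls "$dir"
    fi
}

# Runs the three checks against a partition name such as sdh1.
report_busy() {
    dev=$1
    if is_swap "/dev/$dev"; then echo "$dev: in use as swap"; fi
    if is_mounted "/dev/$dev"; then echo "$dev: mounted"; fi
    holders=$(list_holders "$dev")
    if [ -n "$holders" ]; then echo "$dev: held by $holders"; fi
    return 0
}
```

Since either the partition or the whole disk may be the one that is held, it makes sense to call report_busy for both names, for example sdh1 and sdh.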


* RE: Recovery raid5 after sata cable failure.
  2005-07-18  9:40     ` Neil Brown
@ 2005-07-18  9:56       ` Francisco Zafra
  2005-07-18 10:15         ` Brad Campbell
  0 siblings, 1 reply; 8+ messages in thread
From: Francisco Zafra @ 2005-07-18  9:56 UTC (permalink / raw)
  To: 'Neil Brown'; +Cc: linux-raid

Hi Neil, 
For some hours I have been trying to solve it with the latest version:
root@Torero-2:~ # mdadm --version
mdadm - v2.0-devel-2 - DEVELOPMENT VERSION NOT FOR REGULAR USE - 7 July 2005

With the same results :(

I really don't think it is locked. I ran dd on it in an act of desperation and had
no problems:
root@Torero-2:~ # dd if=/dev/zero of=/dev/sdh bs=1k count=1000
1000+0 records in
1000+0 records out
1024000 bytes transferred in 0.417862 seconds (2450570 bytes/sec)

Not locked or anything... I have really run out of ideas with this...

Thanks for all your help.

Paco.



* Re: Recovery raid5 after sata cable failure.
  2005-07-18  9:56       ` Francisco Zafra
@ 2005-07-18 10:15         ` Brad Campbell
  0 siblings, 0 replies; 8+ messages in thread
From: Brad Campbell @ 2005-07-18 10:15 UTC (permalink / raw)
  To: Francisco Zafra; +Cc: 'Neil Brown', linux-raid

Francisco Zafra wrote:
> Hi Neil, 
> For some hours I have been trying to solve it with the latest version:
> root@Torero-2:~ # mdadm --version
> mdadm - v2.0-devel-2 - DEVELOPMENT VERSION NOT FOR REGULAR USE - 7 July 2005
> 
> With the same results :(
> 
> I really don't think it is locked. I ran dd on it in an act of desperation and had
> no problems:
> root@Torero-2:~ # dd if=/dev/zero of=/dev/sdh bs=1k count=1000
> 1000+0 records in
> 1000+0 records out
> 1024000 bytes transferred in 0.417862 seconds (2450570 bytes/sec)
> 

Asking a silly question perhaps..

fuser /dev/sdh

Regards,
Brad
-- 
"Human beings, who are almost unique in having the ability
to learn from the experience of others, are also remarkable
for their apparent disinclination to do so." -- Douglas Adams
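
Since either sdh1 or sdh could be the device that is held, the fuser check can be run against both. A minimal sketch follows, assuming sdX-style device names and the psmisc fuser. The helper names disk_of and check_openers are hypothetical.

```shell
#!/bin/sh
# Sketch: check both the partition and its parent disk for userspace openers.
# disk_of and check_openers are hypothetical helper names.

# Strip the trailing partition number: sdh1 -> sdh.
# Assumes sdX-style names; other naming schemes need different handling.
disk_of() {
    printf '%s\n' "$1" | sed 's/[0-9]*$//'
}

check_openers() {
    part=$1
    disk=$(disk_of "$part")
    # fuser exits non-zero when no process is using the given files.
    if fuser -v "/dev/$disk" "/dev/$part"; then
        echo "some process has /dev/$disk or /dev/$part open"
    else
        echo "no userspace process found for /dev/$disk or /dev/$part"
    fi
}
```

An empty result here does not rule out a kernel-side user such as md, dm, or swap, which may not appear in fuser output.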


* safe to test SATA array by pulling cables?
  2005-07-17 11:34 Recovery raid5 after sata cable failure Francisco Zafra
  2005-07-17 22:10 ` Neil Brown
@ 2005-10-12 17:19 ` Harry Mangalam
  1 sibling, 0 replies; 8+ messages in thread
From: Harry Mangalam @ 2005-10-12 17:19 UTC (permalink / raw)
  To: linux-raid

I've read conflicting views on whether it's safe to pull either or both of the 
SATA data cable and power cable from a disk in an array (when they are NOT in 
a hotswap cage) to test whether things work as expected during a real-life 
disk failure.

Is there a consensus on this?  Theoretically either should be handled by the 
controller (a 3ware 9500s) since this is a real life possibility, but will 
either event cause damage to the disk?  I'd suspect that removal of the data 
cable would simulate a disk loss with less physical trauma to the disk in 
question.


-- 
Cheers, Harry
Harry J Mangalam - 949 856 2847 (vox; email for fax) - hjm@tacgi.com 
            <<plain text preferred>>

