* Recovery raid5 after sata cable failure.
@ 2005-07-17 11:34 Francisco Zafra
2005-07-17 22:10 ` Neil Brown
2005-10-12 17:19 ` safe to test SATA array by pulling cables? Harry Mangalam
0 siblings, 2 replies; 8+ messages in thread
From: Francisco Zafra @ 2005-07-17 11:34 UTC (permalink / raw)
To: linux-raid
Hi all,
I have had a raid5 array working without problems for some months. A SATA
cable failed and the raid5 kept working fine, keeping the superblock
persistent, but now I can't get the old device re-inserted into the array.
This is the array right now:
root@Torero-2:/mnt/raid5 # mdadm --detail /dev/md0
/dev/md0:
        Version : 00.90.01
  Creation Time : Tue May 31 19:37:37 2005
     Raid Level : raid5
     Array Size : 1367507456 (1304.16 GiB 1400.33 GB)
    Device Size : 195358208 (186.31 GiB 200.05 GB)
   Raid Devices : 8
  Total Devices : 7
Preferred Minor : 0
    Persistence : Superblock is persistent

    Update Time : Sun Jul 17 13:06:17 2005
          State : clean, degraded
 Active Devices : 7
Working Devices : 7
 Failed Devices : 0
  Spare Devices : 0

         Layout : left-symmetric
     Chunk Size : 512K

           UUID : c4ed8e45:2a036953:92bff479:7cf5bac9
         Events : 0.162797

    Number   Major   Minor   RaidDevice   State
       0       8        1        0        active sync   /dev/sda1
       1       8       17        1        active sync   /dev/sdb1
       2       8       33        2        active sync   /dev/sdc1
       3       8       49        3        active sync   /dev/sdd1
       4       8       65        4        active sync   /dev/sde1
       5       8       81        5        active sync   /dev/sdf1
       6       8       97        6        active sync   /dev/sdg1
       7       0        0        -        removed
I then tried to re-add the old disk like this:
root@Torero-2:/mnt/raid5 # mdadm /dev/md0 -a /dev/sdh1
mdadm: Cannot open /dev/sdh1: Device or resource busy
What is wrong? What am I doing wrong? sdh1 is absolutely unused, so I
don't understand the "resource busy" error.
Thanks,
Paco Zafra.
* RE: Recovery raid5 after sata cable failure.
[not found] <Pine.LNX.4.61.0507170520090.8523@hoss.npl.com>
@ 2005-07-17 12:52 ` Francisco Zafra
0 siblings, 0 replies; 8+ messages in thread
From: Francisco Zafra @ 2005-07-17 12:52 UTC (permalink / raw)
To: 'Jim Radford'; +Cc: linux-raid
Hi, I am using LVM on this system, but only for the parallel ATA disks, which
are not part of the RAID. Anyway, I deactivated the LVM volumes and tried
again to re-add the faulty drive, with the same result:
root@Torero-2:~ # mdadm /dev/md0 -a /dev/sdh1
mdadm: Cannot open /dev/sdh1: Device or resource busy
Thanks,
Paco.
> -----Original Message-----
> From: Jim Radford [mailto:jradford@npl.com]
> Sent: Sunday, 17 July 2005 14:21
> To: Francisco Zafra
> Subject: Re: Recovery raid5 after sata cable failure.
>
> Francisco,
>
> I was having a similar issue, and it was due to the fact that LVM
> was starting before the raid array was assembled. If you are
> using LVM, you might want to check that the array is assembled
> before LVM starts. (It would probably be the same for EVMS.)
>
> Regards,
> Jim
>
>
> --
> ==========================================================================
> Jim Radford <jradford@npl.com>
> http://www.jimradford.com/
> ==========================================================================
>
* Re: Recovery raid5 after sata cable failure.
2005-07-17 11:34 Recovery raid5 after sata cable failure Francisco Zafra
@ 2005-07-17 22:10 ` Neil Brown
2005-07-18 8:36 ` Francisco Zafra
2005-10-12 17:19 ` safe to test SATA array by pulling cables? Harry Mangalam
1 sibling, 1 reply; 8+ messages in thread
From: Neil Brown @ 2005-07-17 22:10 UTC (permalink / raw)
To: Francisco Zafra; +Cc: linux-raid
On Sunday July 17, fzafra@gmail.com wrote:
> Hi all,
>
> I have had a raid5 array working without problems for some months. A SATA
> cable failed and the raid5 kept working fine, keeping the superblock
> persistent, but now I can't get the old device re-inserted into the array.
...
>
> I then tried to re-add the old disk like this:
>
> root@Torero-2:/mnt/raid5 # mdadm /dev/md0 -a /dev/sdh1
> mdadm: Cannot open /dev/sdh1: Device or resource busy
>
> What is wrong? What am I doing wrong? sdh1 is absolutely unused, so I
> don't understand the "resource busy" error.
Well, it definitely is busy...
Maybe it is still part of md0, but marked as 'faulty'.
If so (cat /proc/mdstat would tell you) you need to remove it first.
mdadm /dev/md0 -r /dev/sdh1
mdadm /dev/md0 -a /dev/sdh1
NeilBrown
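
The check Neil suggests can be scripted. What follows is a minimal sketch, not
part of the original message. It assumes the usual /proc/mdstat layout, in
which a failed member is listed with an "(F)" suffix (for example
sdh1[7](F)). The function name faulty_members and its optional file argument
are hypothetical and exist only so the parsing can be tried on saved output.

```shell
# List members of an md array that the kernel has flagged faulty "(F)".
# Usage: faulty_members ARRAY [MDSTAT_FILE]
# MDSTAT_FILE defaults to /proc/mdstat (the override is an assumption
# made here so the function can be run against a saved copy).
faulty_members() {
    array=$1
    mdstat=${2:-/proc/mdstat}
    # Array lines look like: "md0 : active raid5 sdh1[7](F) sdg1[6] ..."
    awk -v a="$array" '
        $1 == a && $2 == ":" {
            for (i = 3; i <= NF; i++)
                if ($i ~ /\(F\)/) {
                    sub(/\[.*/, "", $i)   # strip the "[7](F)" suffix
                    print $i
                }
        }' "$mdstat"
}

# Possible use (as root): remove, then re-add, each faulty member.
# for dev in $(faulty_members md0); do
#     mdadm /dev/md0 -r "/dev/$dev" && mdadm /dev/md0 -a "/dev/$dev"
# done
```

If the function prints nothing, the device is not listed as a faulty member
of the array, and the -r step has nothing to remove.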
* RE: Recovery raid5 after sata cable failure.
2005-07-17 22:10 ` Neil Brown
@ 2005-07-18 8:36 ` Francisco Zafra
2005-07-18 9:40 ` Neil Brown
0 siblings, 1 reply; 8+ messages in thread
From: Francisco Zafra @ 2005-07-18 8:36 UTC (permalink / raw)
To: 'Neil Brown'; +Cc: linux-raid
I already tried that:
root@Torero-2:~ # cat /proc/mdstat
Personalities : [linear] [raid5]
md0 : active raid5 sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1] sda1[0]
      1367507456 blocks level 5, 512k chunk, algorithm 2 [8/7] [UUUUUUU_]

unused devices: <none>
root@Torero-2:~ # mdadm /dev/md0 -r /dev/sdh1
mdadm: hot remove failed for /dev/sdh1: No such device or address
root@Torero-2:~ # mdadm /dev/md0 -a /dev/sdh1
mdadm: Cannot open /dev/sdh1: Device or resource busy
root@Torero-2:~ #
With no luck :(
Paco.
* RE: Recovery raid5 after sata cable failure.
2005-07-18 8:36 ` Francisco Zafra
@ 2005-07-18 9:40 ` Neil Brown
2005-07-18 9:56 ` Francisco Zafra
0 siblings, 1 reply; 8+ messages in thread
From: Neil Brown @ 2005-07-18 9:40 UTC (permalink / raw)
To: Francisco Zafra; +Cc: linux-raid
On Monday July 18, fzafra@gmail.com wrote:
> I already tried that:
>
> root@Torero-2:~ # cat /proc/mdstat
> Personalities : [linear] [raid5]
> md0 : active raid5 sdg1[6] sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1] sda1[0]
> 1367507456 blocks level 5, 512k chunk, algorithm 2 [8/7] [UUUUUUU_]
>
> unused devices: <none>
> root@Torero-2:~ # mdadm /dev/md0 -r /dev/sdh1
> mdadm: hot remove failed for /dev/sdh1: No such device or address
> root@Torero-2:~ # mdadm /dev/md0 -a /dev/sdh1
> mdadm: Cannot open /dev/sdh1: Device or resource busy
Uhm, you might have a buggy version of mdadm. If you have 1.10.0, get
an upgrade.
Otherwise either sdh1 or sdh must be:
   open by some process with O_EXCL
   open via a /dev/raw/* device
   part of an md device (which it obviously isn't)
   part of a dm device
   mounted as a filesystem
   an external-journal device for a jfs or ext3 or xfs filesystem
   in use as a swap device
   open for writing under a security level of 1 (whatever that means..)
   an mtd device that is open
(those are all the places that I can find that take an exclusive lock
on a block device).
NeilBrown
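
A few of the cases in this list can be checked from the shell. This is a
rough sketch under stated assumptions, not something from the original
message. It covers only the mounted, swap, and md/dm cases. It assumes a
kernel that exposes /sys/class/block/<name>/holders, which kernels from 2005
may not provide. The function name busy_reasons and its optional path
arguments are made up for illustration.

```shell
# Print possible reasons why a block device is held exclusively.
# Usage: busy_reasons NAME [MOUNTS_FILE] [SWAPS_FILE] [SYS_BLOCK_DIR]
# NAME is the kernel name without /dev/, e.g. sdh1. The three optional
# paths are hypothetical overrides so the checks can be tried on copies.
busy_reasons() {
    name=$1
    mounts=${2:-/proc/mounts}
    swaps=${3:-/proc/swaps}
    sysblock=${4:-/sys/class/block}

    # Mounted as a filesystem: field 1 of /proc/mounts is the device.
    awk -v d="/dev/$name" '$1 == d { print "mounted on " $2 }' "$mounts"

    # In use as swap: field 1 of /proc/swaps, skipping the header line.
    awk -v d="/dev/$name" 'NR > 1 && $1 == d { print "swap device" }' "$swaps"

    # Claimed by md or dm: the holders directory names the owning device.
    for h in "$sysblock/$name/holders"/*; do
        [ -e "$h" ] && echo "held by ${h##*/}"
    done
    return 0
}

# Possible use: check both the partition and the whole disk.
# busy_reasons sdh1; busy_reasons sdh
```

Both sdh1 and sdh are worth checking, since a claim on either one can make
the other report busy.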
* RE: Recovery raid5 after sata cable failure.
2005-07-18 9:40 ` Neil Brown
@ 2005-07-18 9:56 ` Francisco Zafra
2005-07-18 10:15 ` Brad Campbell
0 siblings, 1 reply; 8+ messages in thread
From: Francisco Zafra @ 2005-07-18 9:56 UTC (permalink / raw)
To: 'Neil Brown'; +Cc: linux-raid
Hi Neil,
For some hours now I have been trying to solve it with the latest version:
root@Torero-2:~ # mdadm --version
mdadm - v2.0-devel-2 - DEVELOPMENT VERSION NOT FOR REGULAR USE - 7 July 2005
With the same results :(
I really don't think it is locked. I ran dd on it in an act of desperation
and had no problems:
root@Torero-2:~ # dd if=/dev/zero of=/dev/sdh bs=1k count=1000
1000+0 records in
1000+0 records out
1024000 bytes transferred in 0.417862 seconds (2450570 bytes/sec)
Not locked or anything... I have really run out of ideas with this...
Thanks for all your help.
Paco.
* Re: Recovery raid5 after sata cable failure.
2005-07-18 9:56 ` Francisco Zafra
@ 2005-07-18 10:15 ` Brad Campbell
0 siblings, 0 replies; 8+ messages in thread
From: Brad Campbell @ 2005-07-18 10:15 UTC (permalink / raw)
To: Francisco Zafra; +Cc: 'Neil Brown', linux-raid
Francisco Zafra wrote:
> Hi Neil,
> For some hours now I have been trying to solve it with the latest version:
> root@Torero-2:~ # mdadm --version
> mdadm - v2.0-devel-2 - DEVELOPMENT VERSION NOT FOR REGULAR USE - 7 July 2005
>
> With the same results :(
>
> I really don't think it is locked. I ran dd on it in an act of desperation
> and had no problems:
> root@Torero-2:~ # dd if=/dev/zero of=/dev/sdh bs=1k count=1000
> 1000+0 records in
> 1000+0 records out
> 1024000 bytes transferred in 0.417862 seconds (2450570 bytes/sec)
>
Asking a silly question perhaps..
fuser /dev/sdh
Regards,
Brad
--
"Human beings, who are almost unique in having the ability
to learn from the experience of others, are also remarkable
for their apparent disinclination to do so." -- Douglas Adams
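
Brad's check can be run against the partition as well as the whole disk. A
minimal sketch, assuming fuser (from psmisc) is installed. The names
report_users and FUSER are hypothetical, and the FUSER override exists only
so the loop can be tried without the real tool. fuser mainly reports
processes, so a device claimed by md or dm inside the kernel may not show up
here.

```shell
# Report, for each device given, whether fuser finds anything using it.
# Relies on fuser's exit status: zero if at least one access was found.
# FUSER can name a replacement command (an assumption made for testing).
report_users() {
    for dev in "$@"; do
        if "${FUSER:-fuser}" "$dev" >/dev/null 2>&1; then
            echo "$dev: in use according to fuser"
        else
            echo "$dev: no user found by fuser"
        fi
    done
}

# Possible use (as root, so that other users' processes are visible):
# report_users /dev/sdh /dev/sdh1
```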
* safe to test SATA array by pulling cables?
2005-07-17 11:34 Recovery raid5 after sata cable failure Francisco Zafra
2005-07-17 22:10 ` Neil Brown
@ 2005-10-12 17:19 ` Harry Mangalam
1 sibling, 0 replies; 8+ messages in thread
From: Harry Mangalam @ 2005-10-12 17:19 UTC (permalink / raw)
To: linux-raid
I've read conflicting views on whether it's safe to pull either or both of
the SATA data cable and power cable from a disk in an array (when they are
NOT in a hotswap cage) to test whether things work as expected during a
real-life disk failure.
Is there a consensus on this? Theoretically either should be handled by the
controller (a 3ware 9500s) since this is a real life possibility, but will
either event cause damage to the disk? I'd suspect that removal of the data
cable would simulate a disk loss with less physical trauma to the disk in
question.
--
Cheers, Harry
Harry J Mangalam - 949 856 2847 (vox; email for fax) - hjm@tacgi.com
<<plain text preferred>>