* trouble repairing raid10
@ 2010-06-02 16:25 Nicolas Jungers
2010-06-03 0:19 ` Neil Brown
0 siblings, 1 reply; 4+ messages in thread
From: Nicolas Jungers @ 2010-06-02 16:25 UTC (permalink / raw)
To: linux-raid
I have a 4 HD raid10 with two failed drives. Every attempt I have made to add 2
replacement disks fails consistently.
mdadm -Af /dev/md1 /dev/sdm2 /dev/sdp2 /dev/sdb2 /dev/sdd2
mdadm: failed to add /dev/sdd2 to /dev/md1: Device or resource busy
mdadm: /dev/md1 assembled from 2 drives and 1 spare - not enough to
start the array.
or
root@disk:~# mdadm -AR /dev/md1 /dev/sdm2 /dev/sdp2
mdadm: failed to RUN_ARRAY /dev/md1: Input/output error
mdadm: Not enough devices to start the array.
root@disk:~# mdadm --add /dev/md1 /dev/sdb2
mdadm: add new device failed for /dev/sdb2 as 4: Invalid argument
The array is in near mode and I lost disk 0 and 1. Does it mean that my
data are toasted?
mdadm --examine /dev/sdm2
/dev/sdm2:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x0
Array UUID : d90ad6fe:1355134f:f83ffadc:a4fe7859
Name : m1:1
Creation Time : Thu Apr 1 21:28:58 2010
Raid Level : raid10
Raid Devices : 4
Avail Dev Size : 3907026909 (1863.02 GiB 2000.40 GB)
Array Size : 7814049792 (3726.03 GiB 4000.79 GB)
Used Dev Size : 3907024896 (1863.01 GiB 2000.40 GB)
Data Offset : 272 sectors
Super Offset : 8 sectors
State : clean
Device UUID : e217355e:632ac2f0:8120e55e:3878bd88
Update Time : Wed Jun 2 12:31:39 2010
Checksum : feef2809 - correct
Events : 1377156
Layout : near=2, far=1
Chunk Size : 1024K
Array Slot : 3 (failed, failed, 2, 3)
Array State : __uU 2 failed
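The size fields in the --examine output above can be cross-checked with a little arithmetic. The sketch below assumes that the sizes are counted in 512-byte sectors and that the used size is the available size rounded down to a whole number of chunks. Both are assumptions that happen to reproduce the figures reported here; the output itself does not state them. All variable and function names are illustrative only.

```python
# Cross-check of the size fields reported by "mdadm --examine" above.
SECTOR_BYTES = 512                 # assumption: sizes are in 512-byte sectors

avail_dev_sectors = 3907026909     # "Avail Dev Size"
chunk_sectors = 1024 * 1024 // SECTOR_BYTES   # "Chunk Size : 1024K" -> 2048 sectors
raid_devices = 4                   # "Raid Devices"
near_copies = 2                    # "Layout : near=2, far=1"

# Assumption: the used size is the available size rounded down to whole chunks.
used_dev_sectors = avail_dev_sectors // chunk_sectors * chunk_sectors

# Every chunk is stored near_copies times, so only 1/near_copies of the raw
# capacity is usable.
array_sectors = used_dev_sectors * raid_devices // near_copies


def gib(sectors):
    """Convert a sector count to binary gigabytes (GiB)."""
    return sectors * SECTOR_BYTES / 2**30


def gb(sectors):
    """Convert a sector count to decimal gigabytes (GB)."""
    return sectors * SECTOR_BYTES / 10**9


print("Used Dev Size:", used_dev_sectors,
      f"({gib(used_dev_sectors):.2f} GiB {gb(used_dev_sectors):.2f} GB)")
print("Array Size:   ", array_sectors,
      f"({gib(array_sectors):.2f} GiB {gb(array_sectors):.2f} GB)")
```

The computed values can be compared by eye with the "Used Dev Size" and "Array Size" lines of the report.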
* Re: trouble repairing raid10
From: Neil Brown @ 2010-06-03 0:19 UTC (permalink / raw)
To: Nicolas Jungers; +Cc: linux-raid
On Wed, 02 Jun 2010 18:25:58 +0200
Nicolas Jungers <nicolas@jungers.net> wrote:
> I have a 4 HD raid10 with two failed drives. Every attempt I have made to add 2
> replacement disks fails consistently.
>
> mdadm -Af /dev/md1 /dev/sdm2 /dev/sdp2 /dev/sdb2 /dev/sdd2
> mdadm: failed to add /dev/sdd2 to /dev/md1: Device or resource busy
Any idea why sdd2 is busy??
> mdadm: /dev/md1 assembled from 2 drives and 1 spare - not enough to
> start the array.
>
> or
>
> root@disk:~# mdadm -AR /dev/md1 /dev/sdm2 /dev/sdp2
> mdadm: failed to RUN_ARRAY /dev/md1: Input/output error
> mdadm: Not enough devices to start the array.
> root@disk:~# mdadm --add /dev/md1 /dev/sdb2
> mdadm: add new device failed for /dev/sdb2 as 4: Invalid argument
>
>
> The array is in near mode and I lost disk 0 and 1. Does it mean that my
> data are toasted?
Yes. RAID10 can survive the failure of 2 non-adjacent devices and sometimes
2 adjacent devices. But not 0 and 1 of a near=2 array.
So if those devices are really dead, so is your data.
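The rule above can be illustrated with a small model of the near layout. This is only a sketch: it assumes that the copies of each chunk are placed on consecutive device slots, wrapping around to the next stripe, and that far=1. It is not taken from the md driver, and the function names are made up for the illustration.

```python
from itertools import combinations


def chunk_devices(chunk, raid_devices, near_copies=2):
    """Device slots holding the copies of one logical chunk.

    Assumed model of the 'near' layout: copies sit on consecutive slots,
    wrapping around to the next stripe when the end of a stripe is reached.
    """
    first = chunk * near_copies
    return {(first + copy) % raid_devices for copy in range(near_copies)}


def survives(failed, raid_devices, near_copies=2):
    """True if every chunk still has at least one copy on a working device."""
    failed = set(failed)
    # The placement pattern repeats after at most raid_devices chunks.
    return all(
        not chunk_devices(chunk, raid_devices, near_copies) <= failed
        for chunk in range(raid_devices)
    )


if __name__ == "__main__":
    # List every two-device failure of a 4-device near=2 array.
    for pair in combinations(range(4), 2):
        verdict = "survives" if survives(pair, 4) else "data lost"
        print(pair, verdict)
```

Under this model, a 4-device near=2 array loses data when slots 0 and 1 (or 2 and 3) fail together, while the adjacent pair 1 and 2 is survivable. With an odd device count such as 5, any two cyclically adjacent slots share a chunk, which matches the "sometimes" above.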
If one of these is actually usable and just had a transient failure then you
could try re-creating the array with the drives, or 'missing', in the right
order and with the right layout/chunksize set.
You would need to be sure the 'Data Offset' was the same, which unfortunately
can require using exactly the same version of mdadm as created the array in
the first place.
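Before and after any re-creation attempt, the parameters that must match can be collected from the --examine output and compared. The following is a minimal sketch under stated assumptions: it expects the "Key : Value" format shown in this thread, the sample text is abbreviated from the report above, and the function names (parse_examine, recreation_parameters, mismatches) are hypothetical helpers, not part of mdadm.

```python
SAMPLE = """\
        Version : 1.2
     Raid Level : raid10
   Raid Devices : 4
    Data Offset : 272 sectors
         Layout : near=2, far=1
     Chunk Size : 1024K
"""


def parse_examine(text):
    """Split 'Key : Value' lines of --examine output into a dictionary."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # skip lines without a key/value separator
            fields[key.strip()] = value.strip()
    return fields


def recreation_parameters(fields):
    """Pick out the values that have to be identical after re-creation."""
    return {
        "metadata": fields["Version"],
        "level": fields["Raid Level"],
        "raid_devices": int(fields["Raid Devices"]),
        "layout": fields["Layout"],
        "chunk_kib": int(fields["Chunk Size"].rstrip("K")),
        "data_offset_sectors": int(fields["Data Offset"].split()[0]),
    }


def mismatches(original, recreated):
    """Return the parameter names whose values differ between two arrays."""
    return sorted(k for k in original if original[k] != recreated.get(k))


if __name__ == "__main__":
    wanted = recreation_parameters(parse_examine(SAMPLE))
    print(wanted)
    # Hypothetical result of a re-creation that used a different data offset.
    got = dict(wanted, data_offset_sectors=2048)
    print("differs in:", mismatches(wanted, got))
```

Saving the --examine output of every member before touching anything, then running the same comparison on the re-created array, makes a silently changed 'Data Offset' visible before any data is written.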
NeilBrown
* Re: trouble repairing raid10
From: Nicolas Jungers @ 2010-06-03 4:38 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
On 06/03/2010 02:19 AM, Neil Brown wrote:
> On Wed, 02 Jun 2010 18:25:58 +0200
> Nicolas Jungers<nicolas@jungers.net> wrote:
>
>> I have a 4 HD raid10 with two failed drives. Every attempt I have made to add 2
>> replacement disks fails consistently.
>>
>> mdadm -Af /dev/md1 /dev/sdm2 /dev/sdp2 /dev/sdb2 /dev/sdd2
>> mdadm: failed to add /dev/sdd2 to /dev/md1: Device or resource busy
>
> Any idea why sdd2 is busy??
No, because sdd2 is not busy. I have 4 spares (b, c, d and e), and whichever
one I put in fourth position in the above mdadm -Af command is reported as
busy. The same disk in third position gets an mdadm superblock written to
it. I therefore suspect an incorrect error message.
>> mdadm: /dev/md1 assembled from 2 drives and 1 spare - not enough to
>> start the array.
>>
>> or
>>
>> root@disk:~# mdadm -AR /dev/md1 /dev/sdm2 /dev/sdp2
>> mdadm: failed to RUN_ARRAY /dev/md1: Input/output error
>> mdadm: Not enough devices to start the array.
>> root@disk:~# mdadm --add /dev/md1 /dev/sdb2
>> mdadm: add new device failed for /dev/sdb2 as 4: Invalid argument
>>
>>
>> The array is in near mode and I lost disk 0 and 1. Does it mean that my
>> data are toasted?
>
> Yes. RAID10 can survive the failure of 2 non-adjacent devices and sometimes
> 2 adjacent devices. But not 0 and 1 of a near=2 array.
>
> So if those devices are really dead, so is your data.
>
> If one of these is actually usable and just had a transient failure then you
> could try re-creating the array with the drives, or 'missing', in the right
> order and with the right layout/chunksize set.
> You would need to be sure the 'Data Offset' was the same, which unfortunately
> can require using exactly the same version of mdadm as created the array in
> the first place.
I will try that. The array was created on a beta of Ubuntu 10.04 and is now
running on the shipped 10.04 (kernel 2.6.32).
* Re: trouble repairing raid10
From: Nicolas Jungers @ 2010-06-06 16:28 UTC (permalink / raw)
To: Neil Brown; +Cc: linux-raid
On 06/03/2010 02:19 AM, Neil Brown wrote:
> On Wed, 02 Jun 2010 18:25:58 +0200
> Nicolas Jungers<nicolas@jungers.net> wrote:
>
>> I have a 4 HD raid10 with two failed drives. Every attempt I have made to add 2
>> replacement disks fails consistently.
[snip]
>
> If one of these is actually usable and just had a transient failure then you
> could try re-creating the array with the drives, or 'missing', in the right
> order and with the right layout/chunksize set.
> You would need to be sure the 'Data Offset' was the same, which unfortunately
> can require using exactly the same version of mdadm as created the array in
> the first place.
I managed to copy the two failed disks onto a new one (same brand/model)
with (GNU) ddrescue for a grand total of 512 B lost. With that copy and
a copy of one of the non-failed disks I recreated (mdadm -C) the array
over the disks with the same creation parameters and two missing drives.
I'm not sure that the procedure was quicker than pulling the data back
from the backup, but nevertheless the exercise was interesting.
When thinking about it, could this not be automated/detected in some way
by mdadm or a related utility? Or documented in a FAQ? I had the
feeling that a recovery this close to easy could be eased by mdadm
itself, or am I dreaming?
N.