* 5 HDD RAID5 not starting after controller failure
@ 2007-06-03 9:47 Karsten Desler
2007-06-03 10:32 ` Neil Brown
From: Karsten Desler @ 2007-06-03 9:47 UTC
To: linux-raid
Hello,
I have a RAID5 that recently failed. sda-sdd are on the same SATA controller and
everything was running fine until Linux decided it was a good idea to disable
the controller's interrupt. After a reboot, the RAID no longer starts.
Before I do something stupid, I wanted to ask whether the following commands are
likely to restore the array with minimal data corruption.
mdadm --assemble /dev/md6 --run --force /dev/sda8 /dev/sdb8 /dev/sdc8 /dev/sdd8
mdadm /dev/md6 -a /dev/sde1
Looking at the event counters, sda-sdc agree and sdd is very close, so I'd guess
that this gives me the best chance of getting as little corruption as possible. Or does
it make more sense to start it with all disks active?
img2:~# mdadm -E /dev/sda8
/dev/sda8:
Magic : a92b4efc
Version : 00.90.03
UUID : f52098d4:862caf6c:4c56c819:9f529bcc
Creation Time : Thu Jan 25 02:41:06 2007
Raid Level : raid5
Device Size : 381431168 (363.76 GiB 390.59 GB)
Array Size : 1525724672 (1455.04 GiB 1562.34 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Sun Jun 3 11:10:03 2007
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : b3017829 - correct
Events : 0.253660
Layout : left-symmetric
Chunk Size : 64K
      Number   Major   Minor   RaidDevice State
this     0       8        8        0      active sync   /dev/sda8
   0     0       8        8        0      active sync   /dev/sda8
   1     1       8       24        1      active sync   /dev/sdb8
   2     2       8       40        2      active sync   /dev/sdc8
   3     3       8       56        3      active sync   /dev/sdd8
   4     4       8       65        4      active sync   /dev/sde1
img2:~# mdadm -E /dev/sdb8
/dev/sdb8:
Magic : a92b4efc
Version : 00.90.03
UUID : f52098d4:862caf6c:4c56c819:9f529bcc
Creation Time : Thu Jan 25 02:41:06 2007
Raid Level : raid5
Device Size : 381431168 (363.76 GiB 390.59 GB)
Array Size : 1525724672 (1455.04 GiB 1562.34 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Sun Jun 3 11:10:03 2007
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : b301783b - correct
Events : 0.253660
Layout : left-symmetric
Chunk Size : 64K
      Number   Major   Minor   RaidDevice State
this     1       8       24        1      active sync   /dev/sdb8
   0     0       8        8        0      active sync   /dev/sda8
   1     1       8       24        1      active sync   /dev/sdb8
   2     2       8       40        2      active sync   /dev/sdc8
   3     3       8       56        3      active sync   /dev/sdd8
   4     4       8       65        4      active sync   /dev/sde1
img2:~# mdadm -E /dev/sdc8
/dev/sdc8:
Magic : a92b4efc
Version : 00.90.03
UUID : f52098d4:862caf6c:4c56c819:9f529bcc
Creation Time : Thu Jan 25 02:41:06 2007
Raid Level : raid5
Device Size : 381431168 (363.76 GiB 390.59 GB)
Array Size : 1525724672 (1455.04 GiB 1562.34 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Sun Jun 3 11:10:03 2007
State : clean
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : b301784d - correct
Events : 0.253660
Layout : left-symmetric
Chunk Size : 64K
      Number   Major   Minor   RaidDevice State
this     2       8       40        2      active sync   /dev/sdc8
   0     0       8        8        0      active sync   /dev/sda8
   1     1       8       24        1      active sync   /dev/sdb8
   2     2       8       40        2      active sync   /dev/sdc8
   3     3       8       56        3      active sync   /dev/sdd8
   4     4       8       65        4      active sync   /dev/sde1
img2:~# mdadm -E /dev/sdd8
/dev/sdd8:
Magic : a92b4efc
Version : 00.90.03
UUID : f52098d4:862caf6c:4c56c819:9f529bcc
Creation Time : Thu Jan 25 02:41:06 2007
Raid Level : raid5
Device Size : 381431168 (363.76 GiB 390.59 GB)
Array Size : 1525724672 (1455.04 GiB 1562.34 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Sun Jun 3 11:10:02 2007
State : active
Active Devices : 5
Working Devices : 5
Failed Devices : 0
Spare Devices : 0
Checksum : b2fd9982 - correct
Events : 0.253661
Layout : left-symmetric
Chunk Size : 64K
      Number   Major   Minor   RaidDevice State
this     3       8       56        3      active sync   /dev/sdd8
   0     0       8        8        0      active sync   /dev/sda8
   1     1       8       24        1      active sync   /dev/sdb8
   2     2       8       40        2      active sync   /dev/sdc8
   3     3       8       56        3      active sync   /dev/sdd8
   4     4       8       65        4      active sync   /dev/sde1
img2:~# mdadm -E /dev/sde1
/dev/sde1:
Magic : a92b4efc
Version : 00.90.03
UUID : f52098d4:862caf6c:4c56c819:9f529bcc
Creation Time : Thu Jan 25 02:41:06 2007
Raid Level : raid5
Device Size : 381431168 (363.76 GiB 390.59 GB)
Array Size : 1525724672 (1455.04 GiB 1562.34 GB)
Raid Devices : 5
Total Devices : 5
Preferred Minor : 6
Update Time : Sun Jun 3 11:12:47 2007
State : clean
Active Devices : 1
Working Devices : 1
Failed Devices : 3
Spare Devices : 0
Checksum : b3017954 - correct
Events : 0.253664
Layout : left-symmetric
Chunk Size : 64K
      Number   Major   Minor   RaidDevice State
this     4       8       65        4      active sync   /dev/sde1
   0     0       0        0        0      removed
   1     1       0        0        1      faulty removed
   2     2       0        0        2      faulty removed
   3     3       0        0        3      faulty removed
   4     4       8       65        4      active sync   /dev/sde1
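To compare the event counters across the five dumps above without reading each one in full, a small filter can pull out just the Events line. This is only a sketch: `events_from_examine` is a hypothetical helper name, not part of mdadm, and it assumes the 0.90-superblock output format shown in this message ("Events : 0.253660").

```shell
#!/bin/sh
# Print the event counter from `mdadm -E` output read on stdin.
# Assumption: the line looks like "Events : 0.253660", as in the dumps above.
events_from_examine() {
    awk '$1 == "Events" && $2 == ":" { print $3; exit }'
}

# Example on a live system (needs root; device names are the ones from this thread):
# for dev in /dev/sd[abcd]8 /dev/sde1; do
#     printf '%s %s\n' "$dev" "$(mdadm -E "$dev" | events_from_examine)"
# done
```

Run against the dumps above, this should print 0.253660 for sda8, sdb8 and sdc8, 0.253661 for sdd8, and 0.253664 for sde1.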
Best Regards,
Karsten Desler
* Re: 5 HDD RAID5 not starting after controller failure
From: Neil Brown @ 2007-06-03 10:32 UTC
To: Karsten Desler; +Cc: linux-raid
On Sunday June 3, kdesler@soohrt.org wrote:
> Hello,
>
> I have a RAID5 that recently failed. sda-sdd are on the same SATA controller and
> everything was running fine until Linux decided it was a good idea to disable
> the controller's interrupt. After a reboot, the RAID no longer starts.
>
> Before I do something stupid, I wanted to ask whether the following commands are
> likely to restore the array with minimal data corruption.
>
> mdadm --assemble /dev/md6 --run --force /dev/sda8 /dev/sdb8 /dev/sdc8 /dev/sdd8
> mdadm /dev/md6 -a /dev/sde1
>
> Looking at the event counters, sda-sdc agree and sdd is very close, so I'd guess
> that this gives me the best chance of getting as little corruption as possible. Or does
> it make more sense to start it with all disks active?
It looks like the data is almost certainly all completely up to date.
The array was clean at event 253660.
A pending write caused md to try to update all the superblocks to
event 253661. This worked on d8 and e1 but failed on [abc]8.
So md tried to update the superblocks on the others to record the
failure.
This worked on e1 but not [abcd]8, so e1 ended up with an event count
of 253664 (253663 marked the failures, and 253664 marked that there
were no incomplete writes).
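The sequence described above can be checked against the reported counters by computing how far each member lags behind the newest superblock. This is a rough sketch, not something mdadm provides: `event_lag` is a hypothetical helper, and it assumes the two dot-separated fields of the 0.90 "Events" value are the high and low 32-bit halves of one counter.

```shell
#!/bin/sh
# Read lines of "device events" (e.g. "sda8 0.253660") on stdin and print
# each device with the number of events it is behind the newest member.
# Assumption: "hi.lo" encodes hi * 2^32 + lo.
event_lag() {
    awk 'BEGIN { max = 0 }
         {
             split($2, e, ".")
             n = e[1] * 4294967296 + e[2]
             dev[NR] = $1
             ev[NR] = n
             if (n > max) max = n
         }
         END {
             for (i = 1; i <= NR; i++)
                 printf "%s %d\n", dev[i], max - ev[i]
         }'
}

# Values taken from the mdadm -E dumps in this thread.
printf '%s\n' \
    'sda8 0.253660' 'sdb8 0.253660' 'sdc8 0.253660' \
    'sdd8 0.253661' 'sde1 0.253664' | event_lag
```

With the values from this thread, [abc]8 come out 4 events behind, d8 is 3 behind, and e1 is the newest, which matches the order of failures described above.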
So I would just
mdadm -ARf /dev/md6 /dev/sd[abcd]8 /dev/sde1
and let mdadm pick the best drives. Then add the remaining one in as
a hot-add.
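For readers less used to the short flags, `-ARf` corresponds to `--assemble --run --force`, and a hot-add in manage mode is `--add` (`-a`). The wrapper below is a cautious sketch rather than a recommended procedure: `run`, `assemble_forced`, `hot_add` and the `DRY_RUN` variable are hypothetical names introduced here, and by default it only prints the commands instead of executing them.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed, unless DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Forced assembly with long options (same meaning as `mdadm -ARf`).
# First argument is the md device, the rest are candidate members.
assemble_forced() {
    md=$1
    shift
    run mdadm --assemble --run --force "$md" "$@"
}

# Hot-add the one member that the forced assembly left out.
hot_add() {
    run mdadm "$1" --add "$2"
}
```

Usage would be `assemble_forced /dev/md6 /dev/sd[abcd]8 /dev/sde1`, then a look at `/proc/mdstat` to see which member was excluded, then `hot_add /dev/md6` with that device. Setting `DRY_RUN=0` makes the helpers actually invoke mdadm.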
NeilBrown
* Re: 5 HDD RAID5 not starting after controller failure
From: Karsten Desler @ 2007-06-03 11:02 UTC
To: Neil Brown; +Cc: linux-raid
* Neil Brown wrote:
> So I would just
> mdadm -ARf /dev/md6 /dev/sd[abcd]8 /dev/sde1
> and let mdadm pick the best drives. Then add the remaining one in as
> a hot-add.
Thank you very much. It chose to exclude /dev/sdc8 which I re-added and
the array is rebuilding now.
Best Regards,
Karsten Desler