* md RAID5: Disk wrongly marked "spare", need to force re-add it
@ 2013-04-12 20:08 Ben Bucksch
2013-04-13 14:19 ` Roy Sigurd Karlsbakk
2013-04-14 22:40 ` Oliver Schinagl
0 siblings, 2 replies; 23+ messages in thread
From: Ben Bucksch @ 2013-04-12 20:08 UTC (permalink / raw)
To: linux-raid
I have a RAID5 with 8 disks. It worked fine.
I made an update from ubuntu 10.04 to 12.04 using do-release-upgrade,
and rebooted.
before: kernel 2.6.32-41, mdadm v2.6.7.1, after: kernel 3.2.0-41, mdadm
3.2.5 - and I made a copy of the root partition before I updated, so I
can still boot both OSs
After I rebooted, one drive dropped from the array. I don't know why: it
was fine hardware-wise, and the same happened with my other RAID5 array,
where the dropped drive was fine, too. This seems like an MD or Ubuntu
bug to me.
So, I re-added it and resynced. When the resync was around or over 80%
done, another hard drive failed (maybe because the resync was touching
all sectors for the first time in a long while), this one with a real
hardware failure, so I had to permanently remove it. Now, the first
drive, which was dropped although it was fine, is marked as spare,
although it should have mostly good data on it. mdadm refuses to re-add
it, no matter what I try. mdadm 3.2.5 says "not possible" (see [1]),
while mdadm 2.6.7.1 re-adds it, but as spare, not as real member, so the
array still won't restart.
I need to forcibly re-add the drive that's marked as spare, because it
should have good data on it. I understand that some blocks may be out of
sync or corrupted, but the array has many TB and I want to get to the
rest of all the data that's still good. Even if I get 80% recovered,
that's still better than 0%.
NB 1:
Please do NOT respond with
* restore backup - the backup doesn't have the new data, which I
really need
* re-"create" the array - unless you can give me the exact --create
command that would recover it with data - other people tried this
based on suggestions in forums and they lost all data
Appendix 1:
dmesg during resync:
...
[45345.341865] XFS (dm-13): xfs_imap_to_bp: xfs_trans_read_buf()
returned error 5.
[45345.341909] XFS (dm-13): metadata I/O error: block 0x45986fc0
("xfs_trans_read_buf") error 5 buf count 8192
(many times, about the same handful of blocks)
[45345.610858] XFS (dm-13): metadata I/O error: block 0x7a434f20
("xfs_trans_read_buf") error 5 buf count 4096
(many times, about the same handful of blocks)
(And then probably MD shut down the disk, and the RAID, and so all the
other filesystems went in the trash, too:)
[45353.184049] XFS (dm-10): xfs_log_force: error 5 returned.
[45472.694283] XFS (dm-8): metadata I/O error: block 0x29e89be8
("xfs_trans_read_buf") error 5 buf count 4096
[45473.504044] XFS (dm-10): xfs_log_force: error 5 returned.
[45485.246757] XFS (dm-8): metadata I/O error: block 0x19010c29
("xlog_iodone") error 5 buf count 2560
[45485.248966] XFS (dm-8): xfs_do_force_shutdown(0x2) called from line
1007 of file /build/buildd/linux-3.2.0/fs/xfs/xfs_log.c. Return address
= 0xffffffffa031ede1
[45485.249011] XFS (dm-8): Log I/O Error Detected. Shutting down filesystem
[45485.251126] XFS (dm-8): Please umount the filesystem and rectify the
problem(s)
[45503.584037] XFS (dm-10): xfs_log_force: error 5 returned.
[45514.848040] XFS (dm-8): xfs_log_force: error 5 returned.
Appendix 2:
1. During resync, i.e. after sdl was wrongly dropped, and before sdk failed:
# cat /proc/mdstat
md0 : active raid5 sdl[8] sdp[5] sdq[7] sdk[1] sdj[0] sdo[4] sdn[6] sdm[3]
6837337472 blocks level 5, 64k chunk, algorithm 2 [8/7] [UU_UUUUU]
[>....................] recovery = 0.1% (1075328/976762496)
finish=468.6min speed=34700K/sec
# mdadm --detail /dev/md0
/dev/md0:
Version : 0.90
Creation Time : Sun Mar 22 15:51:17 2009
Raid Level : raid5
Array Size : 6837337472 (6520.59 GiB 7001.43 GB)
Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
Raid Devices : 8
Total Devices : 8
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Thu Apr 11 16:59:55 2013
State : clean, degraded, recovering
Active Devices : 7
Working Devices : 8
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 64K
Rebuild Status : 0% complete
UUID : c71c4168:de3a9b44:5ac2d0d1:4a2cd41c
Events : 0.13240512
Number Major Minor RaidDevice State
0 8 144 0 active sync /dev/sdj
1 8 160 1 active sync /dev/sdk
8 8 176 2 spare rebuilding /dev/sdl
3 8 192 3 active sync /dev/sdm
4 8 224 4 active sync /dev/sdo
5 8 240 5 active sync /dev/sdp
6 8 208 6 active sync /dev/sdn
7 65 0 7 active sync /dev/sdq
2. after sdk had a hardware fault during resync:
after resync, towards the end (between 78 and 100%), again:
md0 : active raid5 sdj[0] sdl[8](S) sdq[7] sdn[6] sdp[5] sdo[4] sdm[3]
sdk[9](F)
6837337472 blocks level 5, 64k chunk, algorithm 2 [8/6] [U__UUUUU]
*-disk:1
description: ATA Disk
product: WDC WD10EACS-00D
vendor: Western Digital
physical id: 0.1.0
bus info: scsi@7:0.1.0
logical name: /dev/sdk
version: 1A01
serial: WD-...8520
size: 931GiB (1TB)
capacity: 931GiB (1TB)
capabilities: 15000rpm
configuration: ansiversion=5
*-disk:2
description: ATA Disk
product: WDC WD10EACS-00D
vendor: Western Digital
physical id: 0.2.0
bus info: scsi@7:0.2.0
logical name: /dev/sdl
version: 1A01
serial: WD-WCAU45964913
size: 931GiB (1TB)
capacity: 931GiB (1TB)
capabilities: 15000rpm
configuration: ansiversion=5
3. Current state, after fix attempts:
(sdk has hardware failure
sdl is probably good, but marked spare)
# cat /proc/mdstat
md0 : inactive sdk[9](S) sdl[8](S) sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3]
7814099968 blocks
# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90
Creation Time : Sun Mar 22 15:51:17 2009
Raid Level : raid5
Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
Raid Devices : 8
Total Devices : 8
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Fri Apr 12 17:09:30 2013
State : active, degraded, Not Started
Active Devices : 6
Working Devices : 8
Failed Devices : 0
Spare Devices : 2
Layout : left-symmetric
Chunk Size : 64K
UUID : c71c4168:de3a9b44:5ac2d0d1:4a2cd41c
Events : 0.13274865
Number Major Minor RaidDevice State
0 8 144 0 active sync /dev/sdj
1 0 0 1 removed
2 0 0 2 removed
3 8 192 3 active sync /dev/sdm
4 8 224 4 active sync /dev/sdo
5 8 240 5 active sync /dev/sdp
6 8 208 6 active sync /dev/sdn
7 65 0 7 active sync /dev/sdq
8 8 176 - spare /dev/sdl
9 8 160 - spare /dev/sdk
# mdadm -E /dev/sd[jlmnopqk]
(sdl is the one I need to add:)
/dev/sdl:
Magic : a92b4efc
Version : 00.90.00
UUID : c71c4168:de3a9b44:5ac2d0d1:4a2cd41c
Creation Time : Sun Mar 22 15:51:17 2009
Raid Level : raid5
Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
Array Size : 6837337472 (6520.59 GiB 7001.43 GB)
Raid Devices : 8
Total Devices : 7
Preferred Minor : 0
Update Time : Fri Apr 12 15:00:38 2013
State : clean
Active Devices : 6
Working Devices : 7
Failed Devices : 2
Spare Devices : 1
Checksum : ca6e81a9 - correct
Events : 13274863
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 8 8 176 8 spare /dev/sdl
0 0 8 144 0 active sync /dev/sdj
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 192 3 active sync /dev/sdm
4 4 8 224 4 active sync /dev/sdo
5 5 8 240 5 active sync /dev/sdp
6 6 8 208 6 active sync /dev/sdn
7 7 65 0 7 active sync /dev/sdq
8 8 8 176 8 spare /dev/sdl
/dev/sdj:
Magic : a92b4efc
Version : 00.90.00
UUID : c71c4168:de3a9b44:5ac2d0d1:4a2cd41c
Creation Time : Sun Mar 22 15:51:17 2009
Raid Level : raid5
Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
Array Size : 6837337472 (6520.59 GiB 7001.43 GB)
Raid Devices : 8
Total Devices : 6
Preferred Minor : 0
Update Time : Fri Apr 12 17:09:30 2013
State : clean
Active Devices : 6
Working Devices : 6
Failed Devices : 2
Spare Devices : 0
Checksum : c9a40ffb - correct
Events : 13274865
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 0 8 144 0 active sync /dev/sdj
0 0 8 144 0 active sync /dev/sdj
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 192 3 active sync /dev/sdm
4 4 8 224 4 active sync /dev/sdo
5 5 8 240 5 active sync /dev/sdp
6 6 8 208 6 active sync /dev/sdn
7 7 65 0 7 active sync /dev/sdq
/dev/sdm:
Magic : a92b4efc
Version : 00.90.00
UUID : c71c4168:de3a9b44:5ac2d0d1:4a2cd41c
Creation Time : Sun Mar 22 15:51:17 2009
Raid Level : raid5
Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
Array Size : 6837337472 (6520.59 GiB 7001.43 GB)
Raid Devices : 8
Total Devices : 6
Preferred Minor : 0
Update Time : Fri Apr 12 17:09:30 2013
State : active
Active Devices : 6
Working Devices : 6
Failed Devices : 2
Spare Devices : 0
Checksum : c9a41030 - correct
Events : 13274865
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 3 8 192 3 active sync /dev/sdm
0 0 8 144 0 active sync /dev/sdj
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 192 3 active sync /dev/sdm
4 4 8 224 4 active sync /dev/sdo
5 5 8 240 5 active sync /dev/sdp
6 6 8 208 6 active sync /dev/sdn
7 7 65 0 7 active sync /dev/sdq
/dev/sdn:
Magic : a92b4efc
Version : 00.90.00
UUID : c71c4168:de3a9b44:5ac2d0d1:4a2cd41c
Creation Time : Sun Mar 22 15:51:17 2009
Raid Level : raid5
Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
Array Size : 6837337472 (6520.59 GiB 7001.43 GB)
Raid Devices : 8
Total Devices : 6
Preferred Minor : 0
Update Time : Fri Apr 12 17:09:30 2013
State : active
Active Devices : 6
Working Devices : 6
Failed Devices : 2
Spare Devices : 0
Checksum : c9a41046 - correct
Events : 13274865
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 6 8 208 6 active sync /dev/sdn
0 0 8 144 0 active sync /dev/sdj
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 192 3 active sync /dev/sdm
4 4 8 224 4 active sync /dev/sdo
5 5 8 240 5 active sync /dev/sdp
6 6 8 208 6 active sync /dev/sdn
7 7 65 0 7 active sync /dev/sdq
/dev/sdo:
Magic : a92b4efc
Version : 00.90.00
UUID : c71c4168:de3a9b44:5ac2d0d1:4a2cd41c
Creation Time : Sun Mar 22 15:51:17 2009
Raid Level : raid5
Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
Array Size : 6837337472 (6520.59 GiB 7001.43 GB)
Raid Devices : 8
Total Devices : 6
Preferred Minor : 0
Update Time : Fri Apr 12 17:09:30 2013
State : active
Active Devices : 6
Working Devices : 6
Failed Devices : 2
Spare Devices : 0
Checksum : c9a41052 - correct
Events : 13274865
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 4 8 224 4 active sync /dev/sdo
0 0 8 144 0 active sync /dev/sdj
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 192 3 active sync /dev/sdm
4 4 8 224 4 active sync /dev/sdo
5 5 8 240 5 active sync /dev/sdp
6 6 8 208 6 active sync /dev/sdn
7 7 65 0 7 active sync /dev/sdq
/dev/sdp:
Magic : a92b4efc
Version : 00.90.00
UUID : c71c4168:de3a9b44:5ac2d0d1:4a2cd41c
Creation Time : Sun Mar 22 15:51:17 2009
Raid Level : raid5
Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
Array Size : 6837337472 (6520.59 GiB 7001.43 GB)
Raid Devices : 8
Total Devices : 6
Preferred Minor : 0
Update Time : Fri Apr 12 17:09:30 2013
State : active
Active Devices : 6
Working Devices : 6
Failed Devices : 2
Spare Devices : 0
Checksum : c9a41064 - correct
Events : 13274865
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 5 8 240 5 active sync /dev/sdp
0 0 8 144 0 active sync /dev/sdj
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 192 3 active sync /dev/sdm
4 4 8 224 4 active sync /dev/sdo
5 5 8 240 5 active sync /dev/sdp
6 6 8 208 6 active sync /dev/sdn
7 7 65 0 7 active sync /dev/sdq
/dev/sdq:
Magic : a92b4efc
Version : 00.90.00
UUID : c71c4168:de3a9b44:5ac2d0d1:4a2cd41c
Creation Time : Sun Mar 22 15:51:17 2009
Raid Level : raid5
Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
Array Size : 6837337472 (6520.59 GiB 7001.43 GB)
Raid Devices : 8
Total Devices : 6
Preferred Minor : 0
Update Time : Fri Apr 12 17:09:30 2013
State : active
Active Devices : 6
Working Devices : 6
Failed Devices : 2
Spare Devices : 0
Checksum : c9a40fb1 - correct
Events : 13274865
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 7 65 0 7 active sync /dev/sdq
0 0 8 144 0 active sync /dev/sdj
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 192 3 active sync /dev/sdm
4 4 8 224 4 active sync /dev/sdo
5 5 8 240 5 active sync /dev/sdp
6 6 8 208 6 active sync /dev/sdn
7 7 65 0 7 active sync /dev/sdq
(sdk with serial ...8520 had the hardware fault:)
/dev/sdk:
Magic : a92b4efc
Version : 00.90.00
UUID : c71c4168:de3a9b44:5ac2d0d1:4a2cd41c
Creation Time : Sun Mar 22 15:51:17 2009
Raid Level : raid5
Used Dev Size : 976762496 (931.51 GiB 1000.20 GB)
Array Size : 6837337472 (6520.59 GiB 7001.43 GB)
Raid Devices : 8
Total Devices : 6
Preferred Minor : 0
Update Time : Fri Apr 12 17:09:30 2013
State : clean
Active Devices : 6
Working Devices : 6
Failed Devices : 2
Spare Devices : 0
Checksum : c9a410bf - correct
Events : 13274865
Layout : left-symmetric
Chunk Size : 64K
Number Major Minor RaidDevice State
this 9 8 160 -1 spare /dev/sdk
0 0 8 144 0 active sync /dev/sdj
1 1 0 0 1 faulty removed
2 2 0 0 2 faulty removed
3 3 8 192 3 active sync /dev/sdm
4 4 8 224 4 active sync /dev/sdo
5 5 8 240 5 active sync /dev/sdp
6 6 8 208 6 active sync /dev/sdn
7 7 65 0 7 active sync /dev/sdq
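The event counters in the superblock dumps above are what the rest of the thread turns on: sdl reports 13274863, while the other members report 13274865. Below is a minimal sketch for pulling those counters out of `mdadm -E` output. It assumes the 0.90 superblock layout shown here (a "/dev/sdX:" header line followed by one "Events : N" line per device). The function name `event_counts` is made up for illustration and is not part of mdadm.

```shell
#!/bin/sh
# Print "<device> <events>" for each superblock found in `mdadm -E` output
# read from stdin. Sketch only: assumes every section starts with a
# "/dev/...:" line and contains a single "Events : N" line, as above.
event_counts() {
    awk '
        /^\/dev\/[^ ]*:$/ { dev = substr($0, 1, length($0) - 1) }   # section header, colon stripped
        $1 == "Events" && $2 == ":" { print dev, $3 }                # event counter of that section
    '
}

# Hypothetical usage (needs root and the real member devices):
#   mdadm -E /dev/sd[jlmnopqk] | event_counts | sort -k2,2n
```

Sorting on the second column puts the member with the lowest counter first, which makes a device that lags behind by a few events easy to spot.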
NB 2:
The original problem is that md dropped a perfectly good drive from the
array, just because I upgraded the OS. It seems to me that Linux MD is
all too happy and quick to kick drives out of the array, and then
refuses to re-add them without a resync. This might be a fine approach on
paper, but not in reality, where the resync is probably when another
drive fails, and then you have no parity left and you're told that your
data is gone.
1. It shouldn't drop drives so quickly
2. It should allow me to re-add them, if I think the data is good
3. There must be a recovery mechanism, to at least partially recover
data. Arrays can easily have 10+ TB, and just because a few
blocks/sectors in one filesystem are bad doesn't mean that I need to
throw away all filesystems that are on that LVM, and all data in the
broken filesystem.
NB 3: Seems like other people have the exact same problem:
http://www.linuxquestions.org/questions/linux-server-73/mdadm-re-added-disk-treated-as-spare-750739/
http://forums.gentoo.org/viewtopic-t-716757.html
https://raid.wiki.kernel.org/index.php/RAID_Recovery#Recreating_an_array
NB 4: Last time I upgraded the OS on the RAID server, I ended up with a
similar mess, due to another md bug:
https://bugs.launchpad.net/ubuntu/+source/mdadm/+bug/136252
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-12 20:08 md RAID5: Disk wrongly marked "spare", need to force re-add it Ben Bucksch
@ 2013-04-13 14:19 ` Roy Sigurd Karlsbakk
2013-04-14 22:40 ` Oliver Schinagl
1 sibling, 0 replies; 23+ messages in thread
From: Roy Sigurd Karlsbakk @ 2013-04-13 14:19 UTC (permalink / raw)
To: Ben Bucksch; +Cc: linux-raid
> I have a RAID5 with 8 disks. It worked fine.
There's a thread every week or so about people who run lots of drives in
RAID-5 and lose a drive, and then another during rebuild, or have a
double disk failure in some other way. With eight drives in a single
RAID-5 and no good backup, it's like asking for trouble. Use RAID-6.
Vennlige hilsener / Best regards
roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
roy@karlsbakk.net
http://blogg.karlsbakk.net/
GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
--
In all pedagogy it is essential that the curriculum is presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of xenotypic etymology. In most cases adequate
and relevant synonyms exist in Norwegian.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-12 20:08 md RAID5: Disk wrongly marked "spare", need to force re-add it Ben Bucksch
2013-04-13 14:19 ` Roy Sigurd Karlsbakk
@ 2013-04-14 22:40 ` Oliver Schinagl
2013-04-15 1:34 ` Ben Bucksch
1 sibling, 1 reply; 23+ messages in thread
From: Oliver Schinagl @ 2013-04-14 22:40 UTC (permalink / raw)
To: Ben Bucksch; +Cc: linux-raid
On 04/12/13 22:08, Ben Bucksch wrote:
> I have a RAID5 with 8 disks. It worked fine.
Raid5 on 8 disks, EEP. Raid6! Really, raid6.
> I made an update from ubuntu 10.04 to 12.04 using do-release-upgrade,
> and rebooted.
> before: kernel 2.6.32-41, mdadm v2.6.7.1, after: kernel 3.2.0-41, mdadm
> 3.2.5 - and I made a copy of the root partition before I updated, so I
> can still boot both OSs
>
> After I rebooted, one drive dropped from the array. I don't know why: it
> was fine hardware-wise, and the same happened with my other RAID5 array,
> where the dropped drive was fine, too. This seems like an MD or Ubuntu
> bug to me.
I've seen that happen too, but with all disks marked as spare.
Apparently this can be a race condition with udev etc. You should have
tried mdadm -A /dev/md0 to see whether it would reassemble the array.
> So, I readded it and resynced. When the resync was around or over 80%
> done, another harddrive failed (maybe because the resync is now using
> all sectors the first time since a long time), this one with a real
> hardware failure, so I had to permanently removed it. Now, the first
> drive, which was dropped although it was fine, is marked as spare,
> although it should have mostly good data on it. mdadm refuses to re-add
> it, no matter what I try. mdadm 3.2.5 says "not possible" (see [1]),
> while mdadm 2.5.7.1 re-adds it, but as spare, not as real member, so the
> array still won't restart.
>
> I need to forcibly re-add the drive that's marked as spare, because it
> should have good data on it. I understand that some blocks may be out of
> sync or corrupted, but the array has many TB and I want to get to the
> rest of all the data that's still good. Even if I get 80% recovered,
> that's still better than 0%.
Firstly, have you written anything TO the array while resyncing? If
not, chances are your array is still in reasonable shape.
The 'spare' drive: I don't know what its status is. Theoretically, I
would assume that the data the resync wrote to the disk is exactly the
same as what was there before, so keep that in mind as a last resort.
But basically, you should ignore this drive; its data is not to be
trusted.
Now the broken drive. Check your cables!! And run smartctl on it to give
SMART a chance to 'fix' the drive somewhat and to check its
status/health.
Now check the event count for all your drives and compare. If the
'broken' drive is only a few off (1 or 2, I think I spotted below), try
the following:
mdadm --run --force -A /dev/md0 /dev/sd[1-7]
(leave out the earlier 'spare')
You should have your array back up and running.
mount -o ro /dev/md0 /mnt and copy anything you need off.
If you can't recover any files (due to not having enough free space) and
if smartctl said your broken drive was somewhat sane, you can try
re-adding it again and hope it'll work this time. If it won't let you,
mdadm --zero-superblock /dev/brokendisk. But do try to copy the most
important stuff off first; continuing may make things worse.
If it fails again (at 80% because of hardware failure) you can't re-use
the broken disk. It really is broken :p Re-force the assembly as above
and copy the rest; it's all you can do.
That said, if all else fails, your very last hope is to not use the
broken drive, and 'force' the above using the earlier marked spare.
Maybe you can get more data off the array then.
After recovering your data and replacing your broken disk, make it an 8
disk raid6 instead! (Or, if you need the space, a 10 disk raid6.) Raid5,
while awesome, is still asking for trouble on big arrays.
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-14 22:40 ` Oliver Schinagl
@ 2013-04-15 1:34 ` Ben Bucksch
2013-04-14 17:30 ` Oliver Schinagl
0 siblings, 1 reply; 23+ messages in thread
From: Ben Bucksch @ 2013-04-15 1:34 UTC (permalink / raw)
To: Oliver Schinagl; +Cc: linux-raid
Hey Oliver,
first off: thanks for trying to help me.
Oliver Schinagl wrote, On 15.04.2013 00:40:
> Firstly, have you written anything TOO the array while resyncing? If
> not, chances are your array is in a reasonable shape still.
I did write to the array (in fact, I ran bonnie++, which in retrospect
was very stupid, and I'm upset I did it, but hindsight is 20/20 - I
assumed the array was fine at that time), BUT if you look at the "event
count" of each drive, the sdl marked "spare" has an event count just 2
lower than all the others, so they are very close.
> Now check the event count for all your drivers and compare. If the
> 'broken' drive is only a few off (1 or 2 I think i spotted below, try
> the following)
Exactly.
> The 'spare' drive, I don't know what its status is.
According to SMART, it's just fine. Its event count is very close to
the others.
> Theoretically, I would assume that the resync the data written to the
> disk is exactly the same as it was before, so keep that in mind as a
> last resort.
Yes, that's my plan. My question is: HOW can I tell mdadm to use it?
> mdadm --run --force -A /dev/md0 /dev/sd...
I've tried that, and it tells me the array can't be started, because I
have a RAID 5 with 8 drives (in the normal situation), of which 6 are
good and 2 are marked as spares (1 working fine, 1 with a hardware
failure). So, after this command, the array ends up "inactive".
> Now the broken drive. Check your cables!! and run smartctl on it to
> give smart a chance to 'fix' the drive somewhat and check its
> status/health. ...
> If it fails again (at 80% because of hardware failure) you can't
> re-use the broken disk. It really is broken :p
It failed twice during resync, at around the same point, and smartctl
tells me it's broken, so I assume it's gone for good. (The failed drive
is also marked as "spare" currently.)
> your very last hope, is to not use the broken drive, and 'force' the
> above using the earlier marked spare.
How? I haven't managed to do that, that's my whole question.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it 2013-04-15 1:34 ` Ben Bucksch @ 2013-04-14 17:30 ` Oliver Schinagl 2013-04-15 10:26 ` Ben Bucksch 0 siblings, 1 reply; 23+ messages in thread From: Oliver Schinagl @ 2013-04-14 17:30 UTC (permalink / raw) To: Ben Bucksch; +Cc: linux-raid On 15-04-13 03:34, Ben Bucksch wrote: > Hey Oliver, > > first off: thanks for trying to help me. > > Oliver Schinagl wrote, On 15.04.2013 00:40: >> Firstly, have you written anything TOO the array while resyncing? If >> not, chances are your array is in a reasonable shape still. > > I did write to the array (in fact, I did a bonnie++, which in > retrospective is very stupid, and I'm upset I did it, but hindsight is > 20/20 - I assumed the array was fine at that time), BUT if you look at > the "event count" of each drive, the sdl marked "spare" has an event > count just 2 lower then all the others, so they are very close. > >> Now check the event count for all your drivers and compare. If the >> 'broken' drive is only a few off (1 or 2 I think i spotted below, try >> the following) > > Exactly. > >> The 'spare' drive, I don't know what its status is. > > According to SMART, it's just fine. Its event status is very close to > the others. > >> Theoretically, I would assume that the resync the data written to the >> disk is exactly the same as it was before, so keep that in mind as a >> last resort. > > Yes, that's my plan. My question is: HOW can I tell mdadm to use it? > >> mdadm --run --force -A /dev/md0 /dev/sd... > > I've tried that, and it tells me the array can't be started, because I > have RAID 5 with 8 drives (in normal situation), 6 good drives, and 2 > spares (1 working fine, 1 with hardware failure). So, after this > command, I end up in "inactive" operation mode. Make sure to list all known 'good' devices (don't list the really broken device). --run --force should make it come up. 
I recently (see previous thread) had an issue aswel and I found the order of commands mattered. I may have put the wrong ones up here. Doing history | grep mdadm the last used command, and thus probably the right one was: mdadm --assemble --run --force /dev/md0 /dev/sd[1-7]. Make sure to mdadm --stop /dev/md0 before trying to assemble it. > >> Now the broken drive. Check your cables!! and run smartctl on it to >> give smart a chance to 'fix' the drive somewhat and check its >> status/health. ... >> If it fails again (at 80% because of hardware failure) you can't >> re-use the broken disk. It really is broken :p > > It failed twice during resync, at around the same point, and smartctl > tells me it's broken, so I assume it's gone for good. (Also, the > failed drive is also marked as "spare" currently.) > >> your very last hope, is to not use the broken drive, and 'force' the >> above using the earlier marked spare. > > How? I haven't managed to do that, that's my whole question. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-raid" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-14 17:30 ` Oliver Schinagl
@ 2013-04-15 10:26 ` Ben Bucksch
2013-04-14 18:16 ` Oliver Schinagl
2013-04-18 13:17 ` Ben Bucksch
0 siblings, 2 replies; 23+ messages in thread
From: Ben Bucksch @ 2013-04-15 10:26 UTC (permalink / raw)
To: Oliver Schinagl; +Cc: linux-raid
Oliver Schinagl wrote, On 14.04.2013 19:30:
> mdadm --assemble --run --force /dev/md0 /dev/sd[1-7].
> Make sure to mdadm --stop /dev/md0 before trying to assemble it.
# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
# mdadm --assemble --run --force /dev/md0 /dev/sd[jlmnopq]
mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
mdadm: Not enough devices to start the array.
# cat /proc/mdstat
md0 : inactive sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3]
5860574976 blocks
(Note that sdl is not even listed)
# mdadm --re-add /dev/md0 /dev/sdl
mdadm: re-added /dev/sdl
# cat /proc/mdstat
md0 : inactive sdl[8](S) sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3]
6837337472 blocks
Now, sdl is listed, but as spare. I need it to be treated not as
spare, but as good drive with correct data (well, almost, 2 events off
only). How do I do that?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-15 10:26 ` Ben Bucksch
@ 2013-04-14 18:16 ` Oliver Schinagl
2013-04-18 13:17 ` Ben Bucksch
1 sibling, 0 replies; 23+ messages in thread
From: Oliver Schinagl @ 2013-04-14 18:16 UTC (permalink / raw)
To: Ben Bucksch; +Cc: linux-raid
On 15-04-13 12:26, Ben Bucksch wrote:
> Oliver Schinagl wrote, On 14.04.2013 19:30:
>> mdadm --assemble --run --force /dev/md0 /dev/sd[1-7].
>> Make sure to mdadm --stop /dev/md0 before trying to assemble it.
> # mdadm --stop /dev/md0
> mdadm: stopped /dev/md0
> # mdadm --assemble --run --force /dev/md0 /dev/sd[jlmnopq]
> mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
> mdadm: Not enough devices to start the array.
> # cat /proc/mdstat
> md0 : inactive sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3]
> 5860574976 blocks
> (Note that sdl is not even listed)
> # mdadm --re-add /dev/md0 /dev/sdl
> mdadm: re-added /dev/sdl
I don't think that can work. You want to bring up a degraded RAID5
array, i.e. with 7 of the 8 disks. It tried (and failed) to start the
array with only 6 disks. Re-adding sdl will make it want to resync.
How you can force that, however, I don't know. I had hoped that the
above command would actually do that for you.
> # cat /proc/mdstat
> md0 : inactive sdl[8](S) sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3]
> 6837337472 blocks
>
> Now, sdl is listed, but as spare. I need it to be treated not as
> spare, but as good drive with correct data (well, almost, 2 events off
> only). How do I do that?
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it 2013-04-15 10:26 ` Ben Bucksch 2013-04-14 18:16 ` Oliver Schinagl @ 2013-04-18 13:17 ` Ben Bucksch 2013-04-18 13:58 ` Maarten ` (2 more replies) 1 sibling, 3 replies; 23+ messages in thread From: Ben Bucksch @ 2013-04-18 13:17 UTC (permalink / raw) To: Oliver Schinagl; +Cc: linux-raid To re-summarize (for full info, see first post of thread): * There are 2 RAID5 arrays in the machine, each have 8 disks. * I upgraded Ubuntu 10.04 to 12.04. * After reboot, both arrays had each ejected one disk. The ejected disks are working fine (at least now). * During the resync mandated by above ejection, one other drive failed, this one fatally with a real hardware failure. * The second array resynced fine, further proving that the disks ejected during upgrade were working. * Now I am left with: originally 8-disk RAID5, 6 disks are healthy, 1 disk with hardware failure, and 1 disk that was ejected, but is working. * The latter is currently marked "spare" by md and has an event count (only) 2 events lower than the other 6 disks. * My task is to get the latter disk back online *with* its data, without resync. I desperately need help, please. Based on suggestions here by Oliver and on forums, I did (and the result is): > # mdadm --stop /dev/md0 > mdadm: stopped /dev/md0 > # mdadm --assemble --run --force /dev/md0 /dev/sd[jlmnopq] > mdadm: failed to RUN_ARRAY /dev/md0: Input/output error > mdadm: Not enough devices to start the array. > # cat /proc/mdstat > md0 : inactive sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3] > 5860574976 blocks > (Note that sdl is not even listed) > # mdadm --re-add /dev/md0 /dev/sdl > mdadm: re-added /dev/sdl > # cat /proc/mdstat > md0 : inactive sdl[8](S) sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3] > 6837337472 blocks > > Now, sdl is listed, but as spare. I need it to be treated not as > spare, but as good drive with correct data (well, almost, 2 events off > only). How do I do that? 
> ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it 2013-04-18 13:17 ` Ben Bucksch @ 2013-04-18 13:58 ` Maarten 2013-04-19 22:56 ` linux.news 2013-04-18 14:18 ` Roy Sigurd Karlsbakk 2013-04-18 14:38 ` Robin Hill 2 siblings, 1 reply; 23+ messages in thread From: Maarten @ 2013-04-18 13:58 UTC (permalink / raw) To: linux-raid On 18/04/13 15:17, Ben Bucksch wrote: > To re-summarize (for full info, see first post of thread): > * There are 2 RAID5 arrays in the machine, each have 8 disks. > * I upgraded Ubuntu 10.04 to 12.04. > * After reboot, both arrays had each ejected one disk. > The ejected disks are working fine (at least now). > * During the resync mandated by above ejection, > one other drive failed, this one fatally with a real hardware failure. > * The second array resynced fine, further proving that the > disks ejected during upgrade were working. > * Now I am left with: originally 8-disk RAID5, 6 disks are healthy, > 1 disk with hardware failure, and 1 disk that was ejected, but is > working. > * The latter is currently marked "spare" by md and has an event count > (only) 2 events lower than the other 6 disks. > * My task is to get the latter disk back online *with* its data, without > resync. > > I desperately need help, please. > > Based on suggestions here by Oliver and on forums, I did (and the result > is): > >> # mdadm --stop /dev/md0 >> mdadm: stopped /dev/md0 >> # mdadm --assemble --run --force /dev/md0 /dev/sd[jlmnopq] >> mdadm: failed to RUN_ARRAY /dev/md0: >> mdadm: Not enough devices to start the array. At this point, does dmesg show anything pointing to that input/output error ? The procedure is correct, I've used that myself on several occasions when confronted with a two-disk failure. What you ought to do before anything, is ascertain all those seven drives are fully readable without any problems occurring. I use dd_rescue for that; dd_rescue /dev/sd{x} /dev/null. You can run parallel dd_rescue sessions. 
When finished, verify that all dd_rescue sessions reported zero errors.
If any did not, clone that drive with dd_rescue to a fresh new drive, and
retry the assembly with the clone instead.
I have no (further) idea why mdadm insists there are not enough devices,
but I'm willing to bet that the input/output error is at the root of it.
So do the dd_rescue procedure as described.
Oh, and make sure you get that drive out of the array as a spare, as that
is definitely NOT what you want. When in doubt about how to do that
safely, clone it first and attempt removal later, if you value your data.
Good luck!
Maarten
>> # cat /proc/mdstat
>> md0 : inactive sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3]
>> 5860574976 blocks
>> (Note that sdl is not even listed)
>> # mdadm --re-add /dev/md0 /dev/sdl
>> mdadm: re-added /dev/sdl
>> # cat /proc/mdstat
>> md0 : inactive sdl[8](S) sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3]
>> 6837337472 blocks
>>
>> Now, sdl is listed, but as spare. I need it to be treated not as
>> spare, but as good drive with correct data (well, almost, 2 events off
>> only). How do I do that?
^ permalink raw reply [flat|nested] 23+ messages in thread
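The read check Maarten describes above can be scripted. The following is only a sketch under stated assumptions: it uses plain `dd` as a stand-in for `dd_rescue` (so that it runs on any system), and `read_check` is a hypothetical helper name, not part of any tool. It reads every given path end to end in a parallel job and reports which ones failed; nothing is written to the paths.

```shell
#!/bin/sh
# read_check PATH...: read each device (or file) in full, in parallel,
# and print a per-path verdict. Returns non-zero if any read failed.
# Assumes the paths have distinct basenames (true for /dev/sdX nodes).
read_check() {
    status_dir=$(mktemp -d) || return 1
    for dev in "$@"; do
        (
            # No conv=noerror here: the first read error makes dd fail.
            if dd if="$dev" of=/dev/null bs=1048576 2>/dev/null; then
                echo ok
            else
                echo FAILED
            fi > "$status_dir/$(basename "$dev")"
        ) &
    done
    wait    # block until every background read has finished
    failed=0
    for dev in "$@"; do
        result=$(cat "$status_dir/$(basename "$dev")")
        echo "$dev: $result"
        [ "$result" = ok ] || failed=1
    done
    rm -rf "$status_dir"
    return "$failed"
}
```

On the machine in this thread it would be called as root with the seven member devices, for example `read_check /dev/sdj /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq`. A drive reported as FAILED is the one to clone before any further assembly attempt.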
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it 2013-04-18 13:58 ` Maarten @ 2013-04-19 22:56 ` linux.news 2013-04-20 1:26 ` Ben Bucksch 0 siblings, 1 reply; 23+ messages in thread From: linux.news @ 2013-04-19 22:56 UTC (permalink / raw) To: linux-raid; +Cc: Maarten Maarten wrote, On 18.04.2013 15:58: > On 18/04/13 15:17, Ben Bucksch wrote: >> To re-summarize (for full info, see first post of thread): >> * There are 2 RAID5 arrays in the machine, each have 8 disks. >> * I upgraded Ubuntu 10.04 to 12.04. >> * After reboot, both arrays had each ejected one disk. >> The ejected disks are working fine (at least now). >> * During the resync mandated by above ejection, >> one other drive failed, this one fatally with a real hardware failure. >> * The second array resynced fine, further proving that the >> disks ejected during upgrade were working. >> * Now I am left with: originally 8-disk RAID5, 6 disks are healthy, >> 1 disk with hardware failure, and 1 disk that was ejected, but is >> working. >> * The latter is currently marked "spare" by md and has an event count >> (only) 2 events lower than the other 6 disks. >> * My task is to get the latter disk back online *with* its data, without >> resync. >> >> I desperately need help, please. >> >> Based on suggestions here by Oliver and on forums, I did (and the result >> is): >> >>> # mdadm --stop /dev/md0 >>> mdadm: stopped /dev/md0 >>> # mdadm --assemble --run --force /dev/md0 /dev/sd[jlmnopq] >>> mdadm: failed to RUN_ARRAY /dev/md0: >>> mdadm: Not enough devices to start the array. > At this point, does dmesg show anything pointing to that input/output > error ? The procedure is correct [630786.513314] md: md0 stopped. 
[630786.513341] md: unbind<sdl> [630786.590662] md: export_rdev(sdl) [630786.590744] md: unbind<sdj> [630786.670652] md: export_rdev(sdj) [630786.670887] md: unbind<sdq> [630786.750650] md: export_rdev(sdq) [630786.750707] md: unbind<sdn> [630786.830649] md: export_rdev(sdn) [630786.830712] md: unbind<sdp> [630786.910651] md: export_rdev(sdp) [630786.910710] md: unbind<sdo> [630786.990649] md: export_rdev(sdo) [630786.990700] md: unbind<sdm> [630787.070649] md: export_rdev(sdm) [630793.315121] md: md0 stopped. [630794.785328] md: bind<sdm> [630794.785512] md: bind<sdo> [630794.785695] md: bind<sdp> [630794.785891] md: bind<sdn> [630794.786643] md: bind<sdq> [630794.787009] md: bind<sdl> [630794.788164] md: bind<sdj> [630794.788236] md: kicking non-fresh sdl from array! [630794.788250] md: unbind<sdl> [630794.810082] md: export_rdev(sdl) [630794.812725] raid5: device sdj operational as raid disk 0 [630794.812734] raid5: device sdq operational as raid disk 7 [630794.812740] raid5: device sdn operational as raid disk 6 [630794.812745] raid5: device sdp operational as raid disk 5 [630794.812750] raid5: device sdo operational as raid disk 4 [630794.812755] raid5: device sdm operational as raid disk 3 [630794.813895] raid5: allocated 8490kB for md0 [630794.813966] 0: w=1 pa=0 pr=8 m=1 a=2 r=8 op1=0 op2=0 [630794.813974] 7: w=2 pa=0 pr=8 m=1 a=2 r=8 op1=0 op2=0 [630794.813980] 6: w=3 pa=0 pr=8 m=1 a=2 r=8 op1=0 op2=0 [630794.813986] 5: w=4 pa=0 pr=8 m=1 a=2 r=8 op1=0 op2=0 [630794.813993] 4: w=5 pa=0 pr=8 m=1 a=2 r=8 op1=0 op2=0 [630794.813999] 3: w=6 pa=0 pr=8 m=1 a=2 r=8 op1=0 op2=0 [630794.814005] raid5: not enough operational devices for md0 (2/8 failed) [630794.820671] RAID5 conf printout: [630794.820675] --- rd:8 wd:6 [630794.820680] disk 0, o:1, dev:sdj [630794.820685] disk 3, o:1, dev:sdm [630794.820689] disk 4, o:1, dev:sdo [630794.820693] disk 5, o:1, dev:sdp [630794.820697] disk 6, o:1, dev:sdn [630794.820701] disk 7, o:1, dev:sdq [630794.820945] raid5: failed 
to run raid set md0 [630794.826530] md: pers->run() failed ... [630794.834455] md: export_rdev(sdl) [630794.834463] md: export_rdev(sdl) The problem is: md: kicking non-fresh sdl from array! thus: raid5: not enough operational devices for md0 (2/8 failed) # mdadm -E /dev/sdl Checksum : ca6e81a9 - correct Events : 13274863 # mdadm -E /dev/sdn Checksum : c9a41046 - correct Events : 13274865 So, the question is: How do I convince md not to be so anal retentive and prevent me from accessing any of my data? The drive ***is fine***, has practically all the data (I don't care about these 2 events), just use it already. Nobody seems to know the magic shell commands to do that. The lack of a proper shell command for that effectively constitutes a dataloss bug. I've been patient, but I'm getting more and more upset at md. Thanks, Maarten, for your help. I hope 1) you or anybody else can help me, and I hope 2) these kinds of problems will be fixed once and for good by the devs. > Good luck! Thanks. Ben ^ permalink raw reply [flat|nested] 23+ messages in thread
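The two `mdadm -E` excerpts in the message above carry the whole diagnosis: sdl is behind the other members. A minimal sketch for pulling the counter out of `mdadm -E` output and computing the lag follows. `events_of` is a hypothetical helper name, and the sample text is copied from this message; on a live system the input would come from `mdadm -E /dev/sdX` instead.

```shell
#!/bin/sh
# events_of: print the "Events" counter found in `mdadm -E` output on stdin.
# Leading indentation before "Events" is tolerated.
events_of() {
    awk -F' *: *' '$1 ~ /^ *Events$/ { print $2; exit }'
}

# Sample input taken from the message above.
sdl_events=$(printf 'Checksum : ca6e81a9 - correct\nEvents : 13274863\n' | events_of)
sdn_events=$(printf 'Checksum : c9a41046 - correct\nEvents : 13274865\n' | events_of)

echo "sdl is $((sdn_events - sdl_events)) events behind sdn"
# prints: sdl is 2 events behind sdn
```

A member whose counter is lower than the rest is what the kernel log calls "non-fresh", which is why sdl is kicked during assembly even though the drive itself is healthy.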
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it 2013-04-19 22:56 ` linux.news @ 2013-04-20 1:26 ` Ben Bucksch 2013-04-20 1:53 ` Ben Bucksch 2013-04-21 21:46 ` NeilBrown 0 siblings, 2 replies; 23+ messages in thread From: Ben Bucksch @ 2013-04-20 1:26 UTC (permalink / raw) To: linux-raid; +Cc: Maarten linux.news@bucksch.org wrote, On 20.04.2013 00:56: > Maarten wrote, On 18.04.2013 15:58: >> On 18/04/13 15:17, Ben Bucksch wrote: >>> To re-summarize (for full info, see first post of thread): >>> * There are 2 RAID5 arrays in the machine, each have 8 disks. >>> * I upgraded Ubuntu 10.04 to 12.04. >>> * After reboot, both arrays had each ejected one disk. >>> The ejected disks are working fine (at least now). >>> * During the resync mandated by above ejection, >>> one other drive failed, this one fatally with a real hardware >>> failure. >>> * The second array resynced fine, further proving that the >>> disks ejected during upgrade were working. >>> * Now I am left with: originally 8-disk RAID5, 6 disks are healthy, >>> 1 disk with hardware failure, and 1 disk that was ejected, but is >>> working. >>> * The latter is currently marked "spare" by md and has an event count >>> (only) 2 events lower than the other 6 disks. >>> * My task is to get the latter disk back online *with* its data, >>> without >>> resync. >>> >>> I desperately need help, please. >>> >>> Based on suggestions here by Oliver and on forums, I did (and the >>> result >>> is): >>> >>>> # mdadm --stop /dev/md0 >>>> mdadm: stopped /dev/md0 >>>> # mdadm --assemble --run --force /dev/md0 /dev/sd[jlmnopq] >>>> mdadm: failed to RUN_ARRAY /dev/md0: >>>> mdadm: Not enough devices to start the array. >> At this point, does dmesg show anything pointing to that input/output >> error ? The procedure is correct > > [dmesg] > The problem is: > md: kicking non-fresh sdl from array! 
> thus: > raid5: not enough operational devices for md0 (2/8 failed) > > So, the question is: How do I convince md not to be so anal retentive > and prevent me from accessing any of my data? The drive ***is fine***, > has practically all the data (I don't care about these 2 events), just > use it already. Nobody seems to know the magic shell commands to do that. Good news: In my desperation, I now ran the following dangerous command: mdadm --create /dev/md0 --assume-clean --level=raid5 -n 8 --chunk=64 --layout=left-symmetric --metadata=0.90 /dev/sdj missing /dev/sdl /dev/sd[mopnq] and that worked. I can read my files again, without problem, all is happy. Before doing that, I saved the superblock, using (no warranty!): 1. mdadm -E /dev/sdj 2. "Used Dev Size" (in KB) * 1024 / 64 - 1 (use this as <skip blocks>) 3. dd if=/dev/sdl of=/root/sdj.mdsuperblock ibs=64 skip=<skip blocks> --- Thanks, Maarten and Oliver, for your help and moral support. --- I still maintain that all of this represents 2 design bugs in the md implementation: 1. ejecting devices out that are working 1.1. individual sectors not readable/writable, but rest of device working (This is very common these days with large drives) 1.2. temporary errors, e.g. disk not connected, loose cable, bad controller etc. 1.3. Linux distro upgrade, no disk problem at all (my case) 2. not allowing me to re-add ejected disks, with data, without resync The result of this is: 1. a device is ejected for no good reasons 2. a resync is triggered 3. the resync discovers a disk that is *really* broken I am left with 2 disks marked "failed", but only 1 actually failed, so normally I should be able to recover, yet I cannot read anything. This fails the very definition of RAID5, therefore is a bug. I have to do risky operations like re-create that can easily destroy all data. Effectively, md achieves the opposite that is intended: It actively risks and destroys my data. I am BEGGING you md raid devs to fix these. 
Ben Bucksch ^ permalink raw reply [flat|nested] 23+ messages in thread
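For reference, the position of the superblock that the three steps above try to save can be derived from the device size. This is a sketch under two assumptions: the documented layout of the 0.90 metadata format (the superblock sits at the start of the last 64 KiB-aligned block that begins at least 64 KiB before the end of the device), and a drive size of 1000204886016 bytes, which is not stated in the thread but is consistent with the "Used Dev Size" of 976762496 KiB shown in the `mdadm -E` output. `sb090_offset` is a hypothetical helper name.

```shell
#!/bin/sh
# sb090_offset SIZE_BYTES: byte offset of the v0.90 md superblock.
# Round the device size down to a 64 KiB boundary, then step back 64 KiB.
sb090_offset() {
    echo $(( $1 / 65536 * 65536 - 65536 ))
}

size_bytes=1000204886016    # assumed; query with: blockdev --getsize64 /dev/sdX
offset=$(sb090_offset "$size_bytes")
echo "superblock at byte $offset (64 KiB block $((offset / 65536)))"

# Read-only copy of that single 64 KiB block. Left commented out because it
# needs root and a real device:
# dd if=/dev/sdX of=/root/sdX.mdsuperblock bs=65536 skip=$((offset / 65536)) count=1
```

Note that `dd` interprets `ibs=64` as 64 bytes, not 64 KiB, so block counts computed in 64 KiB units need `bs=65536` (or `bs=64k`) to land on the intended offset.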
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-20 1:26 ` Ben Bucksch
@ 2013-04-20 1:53 ` Ben Bucksch
2013-04-21 7:23 ` Brad Campbell
2013-04-21 21:50 ` NeilBrown
2013-04-21 21:46 ` NeilBrown
1 sibling, 2 replies; 23+ messages in thread
From: Ben Bucksch @ 2013-04-20 1:53 UTC (permalink / raw)
To: linux-raid
Ben Bucksch wrote, On 20.04.2013 03:26:
> I can read my files again, without problem, all is happy.
Actually, no. The XFS filesystem structure is not sane. I must have done
something wrong. (If possible, please let me know what; all the data
should already be posted.)
At first, it looked OK, as if only one recently written directory was
broken. I unmounted one of the filesystems, ran xfs_repair, and after
re-mounting, almost all directories are gone. Almost 100% data loss. I
can't describe how upset I am with md.
Oh, and in case you're wondering about my backup: That's gone, too, due
to bugs in btrfs that trashed the FS and also stopped the dedicated
backup machine from booting automatically, so I don't have any current
backup either.
Ben
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-20 1:53 ` Ben Bucksch
@ 2013-04-21 7:23 ` Brad Campbell
2013-04-21 8:20 ` Ben Bucksch
2013-04-21 21:50 ` NeilBrown
1 sibling, 1 reply; 23+ messages in thread
From: Brad Campbell @ 2013-04-21 7:23 UTC (permalink / raw)
To: Ben Bucksch; +Cc: linux-raid
On 20/04/13 09:53, Ben Bucksch wrote:
> Ben Bucksch wrote, On 20.04.2013 03:26:
>> I can read my files again, without problem, all is happy.
>
> Actually, no. XFS filesystem structure is not sane. I must have done
> something wrong. (If possible, please let me know what, all data should
> be posted.)
>
> At first, it looked OK, as if only one recently written directory was
> broken. I unmounted one of the FS, did xfs_repair, and after
> re-mounting, almost all directories are gone. Almost 100% dataloss. I
> can't describe how upset I am against md.
As others have already told you, md does not go randomly kicking drives
from arrays. Your system had a failure of some kind which caused the
loss of two drives. You tried to recover it and managed to get a drive
into the spare state. After much troubleshooting, you used the cannon of
last resort, "--assume-clean", after which (without properly verifying
that your drives were in the correct order) you ran a terribly
destructive write to the disks and have almost certainly ruined any
chance you had of recovering your data.
I fail to see where the fault lies with md.
Had you searched or asked a little more, you would have found a number
of people who have written permutation scripts. These iterate over every
possible arrangement of drives and let you run a read-only fsck on each
one, which would have positively identified the correct order of your
disks.
Your best bet now is to post on the xfs list to find out if there is
_any_ way of undoing what you just did, or working around it (backup
superblocks or whatever), and then to run a permutation on your drives
to see if any combination shows you any valid data.
^ permalink raw reply [flat|nested] 23+ messages in thread
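The permutation approach Brad mentions can be sketched as follows. This is a hypothetical illustration, not one of the scripts he refers to: `permute` is an invented helper, and the script only prints candidate command lines without running anything. The `--create` options are copied from the command quoted elsewhere in this thread, and the fixed slots 0 and 3 to 7 are taken from the `mdadm -E` listings earlier in the thread, which leaves only slots 1 and 2 uncertain.

```shell
#!/bin/sh
# permute PREFIX ITEM...: print every ordering of ITEM..., one per line.
# The body runs in a subshell ( ... ) so each recursion level keeps its own
# variables. Pure text generation; no device is touched.
permute() (
    prefix=$1
    shift
    if [ "$#" -eq 0 ]; then
        echo "${prefix# }"
        exit 0
    fi
    i=1
    for item in "$@"; do
        rest=
        j=1
        for other in "$@"; do
            # collect every argument except the i-th one
            [ "$j" -ne "$i" ] && rest="$rest $other"
            j=$((j + 1))
        done
        permute "$prefix $item" $rest
        i=$((i + 1))
    done
)

# Print (do not run) one candidate per ordering of the two uncertain slots.
permute "" missing /dev/sdl | while read -r slot1 slot2; do
    echo "mdadm --create /dev/md0 --assume-clean --level=raid5 -n 8 --chunk=64" \
         "--layout=left-symmetric --metadata=0.90" \
         "/dev/sdj $slot1 $slot2 /dev/sdm /dev/sdo /dev/sdp /dev/sdn /dev/sdq"
done
```

Each printed candidate would still have to be tried against copies or copy-on-write overlays of the disks, followed by a read-only filesystem check, before anything is allowed to write to the real drives.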
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-21 7:23 ` Brad Campbell
@ 2013-04-21 8:20 ` Ben Bucksch
2013-04-21 10:45 ` Brad Campbell
2013-04-21 11:07 ` Roy Sigurd Karlsbakk
0 siblings, 2 replies; 23+ messages in thread
From: Ben Bucksch @ 2013-04-21 8:20 UTC (permalink / raw)
To: Brad Campbell; +Cc: linux-raid
Brad Campbell wrote, On 21.04.2013 09:23:
> As others have already told you, md does not go randomly kicking
> drives from arrays. Your system had a failure of some kind which
> caused the loss of two drives.
You ignore the facts and do "mi mi mi" in the face of bug reports. 2
different arrays each lost 1 drive, both at the same time at reboot
after the OS upgrade, and both drives are working fine. Facts.
And even *if* they had a temporary error, my case shows why it's a *bug*
to kick them out of the array. And it's a *bug* to not let me put them
back in with data. Tons of other people have suffered data loss because
of various temporary, easily recoverable problems and these 2 bugs.
People like you are the reason why people like me suffer data loss.
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it 2013-04-21 8:20 ` Ben Bucksch @ 2013-04-21 10:45 ` Brad Campbell 2013-04-21 18:17 ` Phil Turmel 2013-04-21 11:07 ` Roy Sigurd Karlsbakk 1 sibling, 1 reply; 23+ messages in thread From: Brad Campbell @ 2013-04-21 10:45 UTC (permalink / raw) To: Ben Bucksch; +Cc: linux-raid On 21/04/13 16:20, Ben Bucksch wrote: > Brad Campbell wrote, On 21.04.2013 09:23: >> As others have already told you, md does not go randomly kicking >> drives from arrays. Your system had a failure of some kind which >> caused the loss of two drives. > > You ignore the facts and do "mi mi mi" in face of bugs reports. 2 > different arrays lost 1 drive, both at the same time at reboot after the > OS upgrade, and both drives are working fine. Facts. Those are not facts, they are uninformed guesses at what happened. You have no facts other than something bad happened and two drives were ejected from the array. If you had actual facts then we'd have been able to assist you in determining what actually happened and how it might have been rectified. > And even *if* they had a temporary error, my case shows why it's a *bug* > to kick them out of the array. And it's a *bug* to not let me put them > back in with data. Tons of other people have suffered dataloss because > of various temporary, easily recoverable problems and these 2 bugs. It's not a bug. It is working as intended. That it is not working the way _you_ would like it to work is not a bug at all. When you have an array, you don't get "temporary errors". It's either good or its not. An error is an error is an error. You had an error, which means something in your storage stack is broken. That you can't figure out what it is is even more insidious and needs to be fixed before you can continue. May I point you at the source of both the kernel and md and suggest if you'd like it to "work" differently you might attempt to make it do so. Question. 
Have you ever worked with hardware arrays? What do you think would happen in the same set of circumstances with a hardware array (hint, precisely the same thing). The bonus with md is (if you know what you are doing and with the right assistance) you can do things like --create --assume-clean and get access to your data. You can't do that with any hardware array I've ever used. > People like you are the reason why people like me suffer dataloss. Riiiight. Remember, and I quote "I have to do risky operations like re-create that can easily destroy all data. Effectively, md achieves the opposite that is intended: It actively risks and destroys my data." So you knew the operation was risky, yet you went ahead without enough information to do it safely and blitzed all your data. I'm sorry, but that's not my fault. Again : "Good news: In my desperation, I now ran the following dangerous command: mdadm --create /dev/md0 --assume-clean --level=raid5 -n 8 --chunk=64 --layout=left-symmetric --metadata=0.90 /dev/sdj missing /dev/sdl /dev/sd[mopnq]" How did you verify you had your disks in the correct order? Where did that command line come from? This will be my last post on the subject. I pointed you at a path of action in my last post. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-21 10:45 ` Brad Campbell
@ 2013-04-21 18:17 ` Phil Turmel
2013-04-21 22:00 ` Ben Bucksch
0 siblings, 1 reply; 23+ messages in thread
From: Phil Turmel @ 2013-04-21 18:17 UTC (permalink / raw)
To: Brad Campbell; +Cc: Ben Bucksch, linux-raid
Hi Brad,

I'm sorry you've been sucked into exchanges with this troll. (I spotted
the attitude in the original post.) A couple comments below:

On 04/21/2013 06:45 AM, Brad Campbell wrote:
> Again : "Good news: In my desperation, I now ran the following dangerous
> command: mdadm --create /dev/md0 --assume-clean --level=raid5 -n 8
> --chunk=64 --layout=left-symmetric --metadata=0.90 /dev/sdj missing
> /dev/sdl /dev/sd[mopnq]"

I did find it interesting that Ben tried this on his own, given that's
the very advice he demanded not be given in his OP.

> How did you verify you had your disks in the correct order? Where did
> that command line come from?

Ben's shell skills seem to match his interpersonal communication skills
(weak). The construct /dev/sd[mopnq] expands as if it was specified
/dev/sd[mnopq]. His misunderstanding of bracket syntax has wrecked his
array. If he had used braces, or spelled out all the devices, he'd
probably be fine right now.

If he tries again, with this in mind, he might still be fine. (I haven't
checked if his attempted device order was correct, though.)

If I was inclined to respond directly to him, I'd further suggest he
review the list archives for "error recovery", "timeouts", and
"scrubbing". He might learn enough to not suffer so much next time.

> This will be my last post on the subject. I pointed you at a path of
> action in my last post.

If Ben's attitude had moderated in following posts, I might have set
aside my first impression and cut him some slack. That's a significant
consideration when non-native English is involved. But it's clearly not
the case here.

My first *and* last post on this topic.

Phil
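The bracket behaviour described above can be reproduced without touching
any real disks. The following is a minimal sketch, assuming bash; the
scratch directory and the empty files named like the thread's devices are
stand-ins invented for the demonstration. A bracket expression is a
pattern matched against existing names, and the shell sorts the matches,
whereas brace expansion only generates text in the order written.

```bash
#!/usr/bin/env bash
# Demo in a scratch directory; the empty files stand in for device nodes.
tmp=$(mktemp -d)
touch "$tmp"/sdm "$tmp"/sdn "$tmp"/sdo "$tmp"/sdp "$tmp"/sdq

# Bracket glob: a character class matched against existing names.
# The shell returns the matches sorted, so the written order is lost.
globbed=$(cd "$tmp" && echo sd[mopnq])

# Brace expansion (bash, not plain POSIX sh): pure text generation,
# so the written order is kept.
braced=$(echo sd{m,o,p,n,q})

# Spelling the names out keeps the written order in any shell.
explicit="sdm sdo sdp sdn sdq"

echo "glob:     $globbed"
echo "braces:   $braced"
echo "explicit: $explicit"
rm -rf "$tmp"
```

For an order-sensitive command such as mdadm --create, it may be worth
running the device list through echo first and reading the result before
committing to it.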
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-21 18:17 ` Phil Turmel
@ 2013-04-21 22:00 ` Ben Bucksch
0 siblings, 0 replies; 23+ messages in thread
From: Ben Bucksch @ 2013-04-21 22:00 UTC (permalink / raw)
Cc: linux-raid
Phil Turmel wrote, On 21.04.2013 20:17:
> I'm sorry you've been sucked into exchanges with this troll.
> [referring to me]

Thank you.

I came here for help. I *desperately* needed help with an actual, real
problem. My goal was to rescue my data, and make sure this never happens
again to anybody else. The goal of a troll is to enrage people and cause
useless debate.

Before I came here, I had already extensively searched on Google, and
found lots of unhelpful comments. I wanted to avoid these.
Unfortunately, I found the same kind of responses here. Responses like
"You should have used ABC" are not helpful to the problem at hand at
all, and in fact only serve to enrage the person so "advised". Such
advice is fair enough when offered *after* actual practical help.

A few people had tried to help, but could not, because apparently there
is no safe command to do what I need.

>> Again : "Good news: In my desperation, I now ran the following dangerous
>> command: mdadm --create /dev/md0 --assume-clean --level=raid5 -n 8
>> --chunk=64 --layout=left-symmetric --metadata=0.90 /dev/sdj missing
>> /dev/sdl /dev/sd[mopnq]"
> I did find it interesting that Ben tried this on his own, given that's
> the very advice he demanded not be given in his OP.

That's not correct. I wrote:

"Please do NOT respond with re-"create" the array - unless you can give
me the exact --create command that would recover it with data - other
people tried this based on suggestions in forums and they lost all data"

Note the "unless".

Unfortunately, notwithstanding my disclaimer, several people suggested
1) to use the disk that I already wrote is (really, actually) dead
2) hexediting superblocks (without any info on how)
3) using --create (ditto),
but *nobody* offered the exact command to run. (I had posted all
relevant device info for that purpose.)

I didn't feel comfortable with --create, but when nobody offered a real
alternative, I saw no other option than to try that. Yet, I was wrong to
do that, because:

> The construct /dev/sd[mopnq] expands as if it was specified
> /dev/sd[mnopq]. His misunderstanding of bracket syntax has wrecked his
> array. If he had used braces, or spelled out all the devices, he'd
> probably be fine right now.
>
> If he tries again, with this in mind, he might still be fine.

Ah, thanks. This is what I consider practical help. Thank you.

Indeed, I had no idea (not even thought of the mere possibility) that
the shell would reorder my [] device list. And in fact:

# cat /proc/mdstat
md0 : active raid5 sdq[7] sdp[6] sdo[5] sdn[4] sdm[3] sdl[2] sdj[0]

So, the [] was indeed what killed me and caused my dataloss.
Unfortunately, the most important FS on the array is now totally
corrupted.

FWIW, this is exactly why I had asked for a concrete --create command
for my case.

> That's a significant consideration when non-native English is
> involved. But it's clearly not the case here.

FWIW, I am not a native English speaker.

Ben
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-21 8:20 ` Ben Bucksch
2013-04-21 10:45 ` Brad Campbell
@ 2013-04-21 11:07 ` Roy Sigurd Karlsbakk
1 sibling, 0 replies; 23+ messages in thread
From: Roy Sigurd Karlsbakk @ 2013-04-21 11:07 UTC (permalink / raw)
To: Ben Bucksch; +Cc: linux-raid, Brad Campbell
> Brad Campbell wrote, On 21.04.2013 09:23:
> > As others have already told you, md does not go randomly kicking
> > drives from arrays. Your system had a failure of some kind which
> > caused the loss of two drives.
>
> You ignore the facts and do "mi mi mi" in face of bugs reports. 2
> different arrays lost 1 drive, both at the same time at reboot after
> the OS upgrade, and both drives are working fine. Facts.
>
> And even *if* they had a temporary error, my case shows why it's a
> *bug* to kick them out of the array. And it's a *bug* to not let me
> put them back in with data. Tons of other people have suffered
> dataloss because of various temporary, easily recoverable problems
> and these 2 bugs.
>
> People like you are the reason why people like me suffer dataloss.

Well, just restore from backup. Shouldn't be too hard.

Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 98013356
roy@karlsbakk.net
http://blogg.karlsbakk.net/
GPG Public key: http://karlsbakk.net/roysigurdkarlsbakk.pubkey.txt
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to
avoid excessive use of idioms of xenotypic etymology. In most cases
adequate and relevant synonyms exist in Norwegian.
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-20 1:53 ` Ben Bucksch
2013-04-21 7:23 ` Brad Campbell
@ 2013-04-21 21:50 ` NeilBrown
1 sibling, 0 replies; 23+ messages in thread
From: NeilBrown @ 2013-04-21 21:50 UTC (permalink / raw)
To: Ben Bucksch; +Cc: linux-raid
On Sat, 20 Apr 2013 03:53:49 +0200 Ben Bucksch <linux.news@bucksch.org>
wrote:

> Ben Bucksch wrote, On 20.04.2013 03:26:
> > I can read my files again, without problem, all is happy.
>
> Actually, no. XFS filesystem structure is not sane. I must have done
> something wrong. (If possible, please let me know what, all data should
> be posted.)
>
> At first, it looked OK, as if only one recently written directory was
> broken. I unmounted one of the FS, did xfs_repair, and after
> re-mounting, almost all directories are gone. Almost 100% dataloss. I
> can't describe how upset I am against md.

So data was accessible before "xfs_repair", is not accessible after
'xfs_repair', yet you blame md rather than xfs_repair? Interesting.

You've clearly had a bad experience - I'm sorry about that. I doubt
there is anything you can do to unrepair whatever xfs_repair did, but
the place to ask would be on the xfs list.

NeilBrown

>
> Oh, and in case you're wondering about my backup: That's gone, too, due
> to bugs in btrfs that trashed the FS and also stopped the dedicated
> backup machine from booting automatically, so I don't have any current
> backup either.
>
> Ben
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-20 1:26 ` Ben Bucksch
2013-04-20 1:53 ` Ben Bucksch
@ 2013-04-21 21:46 ` NeilBrown
1 sibling, 0 replies; 23+ messages in thread
From: NeilBrown @ 2013-04-21 21:46 UTC (permalink / raw)
To: Ben Bucksch; +Cc: linux-raid, Maarten
On Sat, 20 Apr 2013 03:26:43 +0200 Ben Bucksch <linux.news@bucksch.org>
wrote:

> linux.news@bucksch.org wrote, On 20.04.2013 00:56:
> > Maarten wrote, On 18.04.2013 15:58:
> >> On 18/04/13 15:17, Ben Bucksch wrote:
> >>> To re-summarize (for full info, see first post of thread):
> >>> * There are 2 RAID5 arrays in the machine, each have 8 disks.
> >>> * I upgraded Ubuntu 10.04 to 12.04.
> >>> * After reboot, both arrays had each ejected one disk.
> >>>   The ejected disks are working fine (at least now).
> >>> * During the resync mandated by above ejection,
> >>>   one other drive failed, this one fatally with a real hardware
> >>>   failure.
> >>> * The second array resynced fine, further proving that the
> >>>   disks ejected during upgrade were working.
> >>> * Now I am left with: originally 8-disk RAID5, 6 disks are healthy,
> >>>   1 disk with hardware failure, and 1 disk that was ejected, but is
> >>>   working.
> >>> * The latter is currently marked "spare" by md and has an event count
> >>>   (only) 2 events lower than the other 6 disks.
> >>> * My task is to get the latter disk back online *with* its data,
> >>>   without resync.
> >>>
> >>> I desperately need help, please.
> >>>
> >>> Based on suggestions here by Oliver and on forums, I did (and the
> >>> result is):
> >>>
> >>>> # mdadm --stop /dev/md0
> >>>> mdadm: stopped /dev/md0
> >>>> # mdadm --assemble --run --force /dev/md0 /dev/sd[jlmnopq]
> >>>> mdadm: failed to RUN_ARRAY /dev/md0:
> >>>> mdadm: Not enough devices to start the array.
> >> At this point, does dmesg show anything pointing to that input/output
> >> error ? The procedure is correct
> >
> > [dmesg]
> > The problem is:
> >   md: kicking non-fresh sdl from array!
> > thus:
> >   raid5: not enough operational devices for md0 (2/8 failed)
> >
> > So, the question is: How do I convince md not to be so anal retentive
> > and prevent me from accessing any of my data? The drive ***is fine***,
> > has practically all the data (I don't care about these 2 events), just
> > use it already. Nobody seems to know the magic shell commands to do that.
>
> Good news:
> In my desperation, I now ran the following dangerous command:
> mdadm --create /dev/md0 --assume-clean --level=raid5 -n 8 --chunk=64
> --layout=left-symmetric --metadata=0.90 /dev/sdj missing /dev/sdl
> /dev/sd[mopnq]
> and that worked. I can read my files again, without problem, all is happy.
>
> Before doing that, I saved the superblock, using (no warranty!):
> 1. mdadm -E /dev/sdj
> 2. "Used Dev Size" (in KB) * 1024 / 64 - 1 (use this as <skip blocks>)
> 3. dd if=/dev/sdl of=/root/sdj.mdsuperblock ibs=64 skip=<skip blocks>
>
> ---
>
> Thanks, Maarten and Oliver, for your help and moral support.
>
> ---
>
> I still maintain that all of this represents 2 design bugs in the md
> implementation:
> 1. ejecting devices out that are working

Without being able to examine the full sequence of events I cannot be
sure what happened here, but my best guess is that the working device
wasn't "ejected" so much as it simply wasn't included.

The modern approach to booting involves devices appearing
asynchronously, with filesystems being mounted as the relevant devices
appear. This is slightly awkward for md/raid. If you have a 5-disk
RAID5 and only 4 disks have appeared, do you start the array degraded,
or do you wait for the 5th disk to appear? What if the 5th disk has
been physically removed? That would mean waiting forever.

mdadm doesn't impose a policy but allows the boot scripts to choose one.
Some boot scripts might get this wrong.

If you have a write-intent-bitmap on your array, then getting it wrong
isn't too bad: when the 5th disk does appear it can easily be re-added.
Without the bitmap, it cannot.

My guess is that you got bitten by something going wrong in the init
scripts.

> 1.1. individual sectors not readable/writable, but rest of device working
> (This is very common these days with large drives)

Yes, this is a problem. There is code to handle it better by recording
bad blocks. It isn't quite production ready yet. And it'll never work
on 0.90 metadata.

> 1.2. temporary errors, e.g. disk not connected, loose cable, bad
> controller etc.
> 1.3. Linux distro upgrade, no disk problem at all (my case)

unless there are bugs in the distro scripts.

> 2. not allowing me to re-add ejected disks, with data, without resync

It *must* be hard to do this, because it *will* cause data loss.
Maybe it shouldn't be quite as hard as it is. But then there are lots
of improvements that could be made, but not very many developers
working on it.

NeilBrown

>
> The result of this is:
> 1. a device is ejected for no good reasons
> 2. a resync is triggered
> 3. the resync discovers a disk that is *really* broken
>
> I am left with 2 disks marked "failed", but only 1 actually failed, so
> normally I should be able to recover, yet I cannot read anything. This
> fails the very definition of RAID5, therefore is a bug. I have to do
> risky operations like re-create that can easily destroy all data.
> Effectively, md achieves the opposite that is intended: It actively
> risks and destroys my data.
>
> I am BEGGING you md raid devs to fix these.
>
> Ben Bucksch
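The write-intent bitmap mentioned above is enabled on an existing array
with mdadm's grow mode. What follows is a minimal sketch, not a tested
recovery procedure: it assumes the array is assembled and not resyncing,
and whether a given array has room for an internal bitmap needs checking
on the system in question. The function names and the MDADM override
variable are hypothetical helpers added for illustration; /dev/md0 and
/dev/sdX in the usage comments are placeholders.

```bash
# Command to run; override (for example MDADM=echo) to preview only.
MDADM=${MDADM:-mdadm}

# Add an internal write-intent bitmap to an assembled, idle array.
enable_bitmap() {
    "$MDADM" --grow --bitmap=internal "$1"
}

# Re-add a member that was left out at assembly time. With a bitmap,
# md only needs to resync the regions written while it was absent.
readd_member() {
    "$MDADM" "$1" --re-add "$2"
}

# Usage (as root, placeholder names):
#   enable_bitmap /dev/md0
#   readd_member /dev/md0 /dev/sdX
```

Setting MDADM=echo prints the arguments instead of running mdadm, which
gives a cheap way to read the command before it touches an array.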
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-18 13:17 ` Ben Bucksch
2013-04-18 13:58 ` Maarten
@ 2013-04-18 14:18 ` Roy Sigurd Karlsbakk
2013-04-18 14:38 ` Robin Hill
2 siblings, 0 replies; 23+ messages in thread
From: Roy Sigurd Karlsbakk @ 2013-04-18 14:18 UTC (permalink / raw)
To: Ben Bucksch; +Cc: linux-raid, Oliver Schinagl
----- Original message -----
> To re-summarize (for full info, see first post of thread):
> * There are 2 RAID5 arrays in the machine, each have 8 disks.

Once more: 8 drives in RAID-5 isn't very safe. The chance of a double
disk failure is rather high with that many drives in RAID-5. See the
list for examples.

Just my 2c

--
Best regards

roy
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-18 13:17 ` Ben Bucksch
2013-04-18 13:58 ` Maarten
2013-04-18 14:18 ` Roy Sigurd Karlsbakk
@ 2013-04-18 14:38 ` Robin Hill
2013-04-20 13:44 ` Oliver Schinagl
2 siblings, 1 reply; 23+ messages in thread
From: Robin Hill @ 2013-04-18 14:38 UTC (permalink / raw)
To: Ben Bucksch; +Cc: Oliver Schinagl, linux-raid
On Thu Apr 18, 2013 at 03:17:15PM +0200, Ben Bucksch wrote:

> To re-summarize (for full info, see first post of thread):
> * There are 2 RAID5 arrays in the machine, each have 8 disks.
> * I upgraded Ubuntu 10.04 to 12.04.
> * After reboot, both arrays had each ejected one disk.
>   The ejected disks are working fine (at least now).
> * During the resync mandated by above ejection,
>   one other drive failed, this one fatally with a real hardware failure.
> * The second array resynced fine, further proving that the
>   disks ejected during upgrade were working.
> * Now I am left with: originally 8-disk RAID5, 6 disks are healthy,
>   1 disk with hardware failure, and 1 disk that was ejected, but is
>   working.
> * The latter is currently marked "spare" by md and has an event count
>   (only) 2 events lower than the other 6 disks.
> * My task is to get the latter disk back online *with* its data, without
>   resync.
>
> I desperately need help, please.
>
> Based on suggestions here by Oliver and on forums, I did (and the result
> is):
>
> > # mdadm --stop /dev/md0
> > mdadm: stopped /dev/md0
> > # mdadm --assemble --run --force /dev/md0 /dev/sd[jlmnopq]
> > mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
> > mdadm: Not enough devices to start the array.
> > # cat /proc/mdstat
> > md0 : inactive sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3]
> > 5860574976 blocks
> > (Note that sdl is not even listed)
> > # mdadm --re-add /dev/md0 /dev/sdl
> > mdadm: re-added /dev/sdl
> > # cat /proc/mdstat
> > md0 : inactive sdl[8](S) sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3]
> > 6837337472 blocks
> >
That won't work here as sdl was being rebuilt at the time of the
failure. md therefore 'knows' that it doesn't have the correct data on
it and can't be used to assemble the array (I think the array position
of the disk is only written to the metadata when recovery completes).

> > Now, sdl is listed, but as spare. I need it to be treated not as
> > spare, but as good drive with correct data (well, almost, 2 events off
> > only). How do I do that?
> >
I can see two options here:

- Image the known faulty disk to a new one and then use that to force
  assemble the array (with the possibility of some data loss, depending
  on how much can be read from the faulty disk).
- Modify the metadata on sdl so that it shows as being a valid member of
  the array. This will require either manual editing of the superblock,
  or rerunning the "mdadm --create" command with the correct mdadm
  version (so data offsets match), disk order and parameters (chunk
  size, etc). If done correctly then there should be no data loss
  (providing that sdl, when re-added to the array, used the same data
  offset as it originally had), but get anything wrong and you'll be
  even further up the creek.

Personally I'd look into option one first. Despite the probability of
some data loss, I think it's a lower risk option overall.

Cheers,
Robin
--
     ___
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |
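Option one above, imaging the failing disk onto a fresh one, could be
sketched with GNU ddrescue as below. This is an assumption about
tooling, not something prescribed in the message; the three-pass plan,
the function name and the DDRESCUE override variable are hypothetical,
and the device and map-file names in the usage comment are placeholders.
The flags used are -f (allow overwriting a device), -n (skip the slow
scraping phase), -d (direct disc access), -r (retry passes) and -R
(read in reverse).

```bash
# Command to run; override (for example DDRESCUE=echo) to preview only.
DDRESCUE=${DDRESCUE:-ddrescue}

# Image a failing source disk onto a destination disk of at least the
# same size, keeping progress in a map file so runs can be resumed.
image_failing_disk() {
    local src=$1 dst=$2 map=$3
    # Pass 1: grab everything that reads cleanly, skip the slow areas.
    "$DDRESCUE" -f -n "$src" "$dst" "$map"
    # Pass 2: go back to the bad areas with direct access, 3 retries.
    "$DDRESCUE" -f -d -r3 "$src" "$dst" "$map"
    # Pass 3: the same again, reading from the end towards the start.
    "$DDRESCUE" -f -d -R -r3 "$src" "$dst" "$map"
}

# Usage (as root; double-check which disk is source and destination):
#   image_failing_disk /dev/sdX /dev/sdY /root/sdX.map
```

The map file is what makes repeated passes cheap: each run only revisits
the areas that are still marked unread or bad.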
* Re: md RAID5: Disk wrongly marked "spare", need to force re-add it
2013-04-18 14:38 ` Robin Hill
@ 2013-04-20 13:44 ` Oliver Schinagl
0 siblings, 0 replies; 23+ messages in thread
From: Oliver Schinagl @ 2013-04-20 13:44 UTC (permalink / raw)
To: Ben Bucksch, linux-raid
On 04/18/13 16:38, Robin Hill wrote:
> On Thu Apr 18, 2013 at 03:17:15PM +0200, Ben Bucksch wrote:
>
>> To re-summarize (for full info, see first post of thread):
>> * There are 2 RAID5 arrays in the machine, each have 8 disks.
>> * I upgraded Ubuntu 10.04 to 12.04.
>> * After reboot, both arrays had each ejected one disk.
>>   The ejected disks are working fine (at least now).
>> * During the resync mandated by above ejection,
>>   one other drive failed, this one fatally with a real hardware failure.
>> * The second array resynced fine, further proving that the
>>   disks ejected during upgrade were working.
>> * Now I am left with: originally 8-disk RAID5, 6 disks are healthy,
>>   1 disk with hardware failure, and 1 disk that was ejected, but is
>>   working.
>> * The latter is currently marked "spare" by md and has an event count
>>   (only) 2 events lower than the other 6 disks.
>> * My task is to get the latter disk back online *with* its data, without
>>   resync.
>>
>> I desperately need help, please.
>>
>> Based on suggestions here by Oliver and on forums, I did (and the result
>> is):
>>
>>> # mdadm --stop /dev/md0
>>> mdadm: stopped /dev/md0
>>> # mdadm --assemble --run --force /dev/md0 /dev/sd[jlmnopq]
>>> mdadm: failed to RUN_ARRAY /dev/md0: Input/output error
>>> mdadm: Not enough devices to start the array.
>>> # cat /proc/mdstat
>>> md0 : inactive sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3]
>>> 5860574976 blocks
>>> (Note that sdl is not even listed)
>>> # mdadm --re-add /dev/md0 /dev/sdl
>>> mdadm: re-added /dev/sdl
>>> # cat /proc/mdstat
>>> md0 : inactive sdl[8](S) sdj[0] sdq[7] sdn[6] sdp[5] sdo[4] sdm[3]
>>> 6837337472 blocks
>>>
> That won't work here as sdl was being rebuilt at the time of the
> failure. md therefore 'knows' that it doesn't have the correct data on
> and can't be used to assemble the array (I think the array position of
> the disk is only written to the metadata when recovery completes).
>
>>> Now, sdl is listed, but as spare. I need it to be treated not as
>>> spare, but as good drive with correct data (well, almost, 2 events off
>>> only). How do I do that?
>>>
> I can see two options here:
>
> - Image the known faulty disk to a new one and then use that to force
>   assemble the array (with the possibility of some data loss, depending
>   on how much can be read from the faulty disk).

IF you have an extra spare disk, this probably is the best idea then.
The event count might not even be off at all. Use ddrescue though, and
check its options and possibilities: going start -> end and end ->
start, letting it retry very often. You get the idea.

You could use the spare disk for this, but that removes any option of
trying to recover using that disk. (Hex editing the superblock is
possible but not easy.)

IF you have enough room on the other array, you can always
dd if=/dev/spare of=file_on_array

Also, make a backup of your superblock!!
dd if=/dev/sd* of=sd*.superblock bs=<superblocksize> count=1

> - Modify the metadata on sdl so that it shows as being a valid member of
>   the array. This will require either manual editing of the superblock,
>   or rerunning the "mdadm --create" command with the correct mdadm
>   version (so data offsets match), disk order and parameters (chunk
>   size, etc). If done correctly then there should be no data loss
>   (providing that sdl, when re-added to the array, used the same data
>   offset as it originally had), but get anything wrong any you'll be
>   even further up the creek.
>
> Personally I'd look into option one first. Despite the probability of
> some data loss, I think it's a lower risk option overall.

I agree, best option would be that

oliver

>
> Cheers,
> Robin
>
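For the superblock backup suggested above, the location matters: with
0.90 metadata (the format used in this thread) the superblock is a
4 KiB record at the start of the last 64 KiB-aligned 64 KiB block of
the member device, so a dd from the start of the disk would miss it. The
sketch below computes that offset and copies the record out. It assumes
0.90 metadata only (1.x formats store the superblock elsewhere); the
function names are hypothetical, and the device and file names in the
usage comment are placeholders.

```bash
# Byte offset of a 0.90 superblock for a device of the given size in
# bytes: round down to a 64 KiB boundary, then step back one 64 KiB block.
sb090_offset() {
    local size_bytes=$1
    local reserved=65536
    echo $(( size_bytes / reserved * reserved - reserved ))
}

# Copy the 4 KiB superblock of one member device to a file.
# Only reads from the device. Arguments: device, output file.
save_sb090() {
    local dev=$1 out=$2 size off
    size=$(blockdev --getsize64 "$dev") || return 1
    off=$(sb090_offset "$size")
    # off is a multiple of 64 KiB, so it divides evenly by 4096.
    dd if="$dev" of="$out" bs=4096 skip=$(( off / 4096 )) count=1
}

# Usage (as root, placeholder names):
#   save_sb090 /dev/sdX /root/sdX.mdsuperblock
```

Comparing the saved file against the output of mdadm -E for the same
device is one way to gain some confidence that the right block was
captured before attempting anything destructive.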