* Corrupted FS after recovery. Coincidence?
@ 2013-02-27 19:38 Jamie Thompson
2013-02-27 22:37 ` Adam Goryachev
2013-02-27 23:19 ` joystick
0 siblings, 2 replies; 4+ messages in thread
From: Jamie Thompson @ 2013-02-27 19:38 UTC (permalink / raw)
To: linux-raid
Hi all.
I just wanted to check with those more clued up that I'm not missing
something important. To save you wading through the logs, in summary my
filesystem got borked after I recovered an array when I realised I'd
used the device and not the partition and corrected the mistake. I'd
like to know whether it was just bad luck (perhaps RAM corruption of
some sort), or whether I did something wrong, as I was under the
impression that the validation would fail the disk if there was a
problem. I've done this before and it all went fine...
I have a remote server running Debian Testing...of relevance here are
the 3 SATA drives on an LSI SAS controller and the RAID5 in use on them,
though it also has two 2GB CompactFlash cards running in RAID1 with the
root filesystem on them.
At the weekend I pulled one of the drives to use elsewhere. I replaced
it with a new, larger drive (the old array drives are 500GB, the new one
is 1.5TB - I'll be getting more in due course to enlarge the array),
added a new partition over the whole disk, and added it to the array,
which then began to rebuild (a rough sketch of the commands follows the
logs below):
Old disk:
> Disk /dev/sdc: 499.3 GB, 499279462400 bytes
> 255 heads, 63 sectors/track, 60700 cylinders, total 975155200 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x5932fff1
>
> Device Boot Start End Blocks Id System
> /dev/sdc1 63 975145499 487572718+ fd Linux raid
> autodetect
New disk:
> Disk /dev/sdd: 1500.3 GB, 1500301910016 bytes
> 81 heads, 63 sectors/track, 574226 cylinders, total 2930277168 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0xbcba78d3
>
> Device Boot Start End Blocks Id System
> /dev/sdf1 2048 2930277167 1465137560 fd Linux raid
> autodetect
<pulls disk>
> Feb 25 18:41:33 mrlinux kernel: [281921.556280] RAID conf printout:
> Feb 25 18:41:33 mrlinux kernel: [281921.556288] --- level:5 rd:3 wd:2
> Feb 25 18:41:33 mrlinux kernel: [281921.556294] disk 0, o:1, dev:sde1
> Feb 25 18:41:33 mrlinux kernel: [281921.556299] disk 1, o:1, dev:sdc1
> Feb 25 18:41:33 mrlinux kernel: [281921.556304] disk 2, o:0, dev:sdd1
> Feb 25 18:41:33 mrlinux kernel: [281921.620023] RAID conf printout:
> Feb 25 18:41:33 mrlinux kernel: [281921.620029] --- level:5 rd:3 wd:2
> Feb 25 18:41:33 mrlinux kernel: [281921.620033] disk 0, o:1, dev:sde1
> Feb 25 18:41:33 mrlinux kernel: [281921.620036] disk 1, o:1, dev:sdc1
<inserts new disk>
> Feb 25 18:56:22 mrlinux kernel: [282809.983189] mptsas: ioc0:
> attaching sata device: fw_channel 0, fw_id 2, phy 0, sas_addr
> 0x455c3c41ddbfae97
> Feb 25 18:56:22 mrlinux kernel: [282809.994670] scsi 5:0:2:0:
> Direct-Access ATA WDC WD15EARS-00M AB51 PQ: 0 ANSI: 5
> Feb 25 18:56:22 mrlinux kernel: [282809.996926] sd 5:0:2:0: [sdf]
> 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
> Feb 25 18:56:22 mrlinux kernel: [282810.026435] sd 5:0:2:0: [sdf]
> Write Protect is off
> Feb 25 18:56:22 mrlinux kernel: [282810.026442] sd 5:0:2:0: [sdf] Mode
> Sense: 73 00 00 08
> Feb 25 18:56:22 mrlinux kernel: [282810.037183] sd 5:0:2:0: [sdf]
> Write cache: enabled, read cache: enabled, doesn't support
> DPO or FUA
> Feb 25 18:56:22 mrlinux kernel: [282810.098805] sdf: unknown
> partition table
> Feb 25 18:56:22 mrlinux kernel: [282810.159468] sd 5:0:2:0: [sdf]
> Attached SCSI disk
<partitions new disk>
> Feb 25 18:59:08 mrlinux kernel: [282976.397358] sdf: sdf1
<adds to array>
> Feb 25 18:59:42 mrlinux kernel: [283010.226883] md: bind<sdf>
> Feb 25 18:59:42 mrlinux kernel: [283010.266108] RAID conf printout:
> Feb 25 18:59:42 mrlinux kernel: [283010.266116] --- level:5 rd:3 wd:2
> Feb 25 18:59:42 mrlinux kernel: [283010.266122] disk 0, o:1, dev:sde1
> Feb 25 18:59:42 mrlinux kernel: [283010.266127] disk 1, o:1, dev:sdc1
> Feb 25 18:59:42 mrlinux kernel: [283010.266132] disk 2, o:1, dev:sdf
> Feb 25 18:59:42 mrlinux kernel: [283010.266226] md: recovery of RAID
> array md1
> Feb 25 18:59:42 mrlinux kernel: [283010.266235] md: minimum
> _guaranteed_ speed: 1000 KB/sec/disk.
> Feb 25 18:59:42 mrlinux kernel: [283010.266240] md: using maximum
> available idle IO bandwidth (but not more than 200000 KB/sec) for
> recovery.
> Feb 25 18:59:42 mrlinux kernel: [283010.266255] md: using 128k window,
> over a total of 487572608k.
<recovers 500GB in 03:56:38>
> Feb 25 22:56:20 mrlinux kernel: [297208.335405] md: md1: recovery done.
> Feb 25 22:56:20 mrlinux kernel: [297208.601814] RAID conf printout:
> Feb 25 22:56:20 mrlinux kernel: [297208.601822] --- level:5 rd:3 wd:3
> Feb 25 22:56:20 mrlinux kernel: [297208.601828] disk 0, o:1, dev:sde1
> Feb 25 22:56:20 mrlinux kernel: [297208.601833] disk 1, o:1, dev:sdc1
> Feb 25 22:56:20 mrlinux kernel: [297208.601837] disk 2, o:1, dev:sdf
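For reference, the partition-and-add step amounted to roughly the
following (reconstructed from the logs above; the exact invocations may
have differed slightly):
> fdisk /dev/sdf                 # created a single partition, sdf1, spanning the disk
> mdadm /dev/md1 --add /dev/sdf  # note: the whole device, not the partition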
The next day I noticed I'd made a mistake - I'd added /dev/sdf and not
/dev/sdf1. I've had this happen before and it confused the system during
boot, as it saw two superblocks. So I failed the disk, removed it from
the array, and recreated the partition table that had been obliterated.
I then re-added the disk (or, more accurately, the partition) to the
array (rough commands follow the logs below):
> Feb 26 09:50:52 mrlinux kernel: [336480.481487] md: cannot remove
> active disk sdf from md1 ...
<remembers you have to fail a disk before you can remove it...and does so>
> Feb 26 09:51:47 mrlinux kernel: [336535.564743] md/raid:md1: Disk
> failure on sdf, disabling device.
> Feb 26 09:51:47 mrlinux kernel: [336535.564746] md/raid:md1: Operation
> continuing on 2 devices.
> Feb 26 09:51:47 mrlinux kernel: [336535.597393] RAID conf printout:
> Feb 26 09:51:47 mrlinux kernel: [336535.597399] --- level:5 rd:3 wd:2
> Feb 26 09:51:47 mrlinux kernel: [336535.597403] disk 0, o:1, dev:sde1
> Feb 26 09:51:47 mrlinux kernel: [336535.597406] disk 1, o:1, dev:sdc1
> Feb 26 09:51:47 mrlinux kernel: [336535.597409] disk 2, o:0, dev:sdf
> Feb 26 09:51:48 mrlinux kernel: [336535.664011] RAID conf printout:
> Feb 26 09:51:48 mrlinux kernel: [336535.664017] --- level:5 rd:3 wd:2
> Feb 26 09:51:48 mrlinux kernel: [336535.664020] disk 0, o:1, dev:sde1
> Feb 26 09:51:48 mrlinux kernel: [336535.664023] disk 1, o:1, dev:sdc1
<removes disk from array>
> Feb 26 09:52:11 mrlinux kernel: [336558.705730] md: unbind<sdf>
> Feb 26 09:52:11 mrlinux kernel: [336558.769675] md: export_rdev(sdf)
<repartitions disk>
> Feb 26 09:54:06 mrlinux kernel: [336674.474874] sdf: sdf1
<re-adds disk to array>
> Feb 26 09:54:19 mrlinux kernel: [336687.596415] md: bind<sdf1>
> Feb 26 09:54:19 mrlinux kernel: [336687.636078] RAID conf printout:
> Feb 26 09:54:19 mrlinux kernel: [336687.636087] --- level:5 rd:3 wd:2
> Feb 26 09:54:19 mrlinux kernel: [336687.636094] disk 0, o:1, dev:sde1
> Feb 26 09:54:19 mrlinux kernel: [336687.636099] disk 1, o:1, dev:sdc1
> Feb 26 09:54:19 mrlinux kernel: [336687.636105] disk 2, o:1, dev:sdf1
> Feb 26 09:54:19 mrlinux kernel: [336687.636308] md: recovery of RAID
> array md1
> Feb 26 09:54:19 mrlinux kernel: [336687.636317] md: minimum
> _guaranteed_ speed: 1000 KB/sec/disk.
> Feb 26 09:54:19 mrlinux kernel: [336687.636322] md: using maximum
> available idle IO bandwidth (but not more than 200000 KB/sec) for
> recovery.
> Feb 26 09:54:19 mrlinux kernel: [336687.636342] md: using 128k window,
> over a total of 487572608k.
<recovers 500GB in 00:00:09>
> Feb 26 09:54:28 mrlinux kernel: [336696.647039] md: md1: recovery done.
> Feb 26 09:54:29 mrlinux kernel: [336696.726098] RAID conf printout:
> Feb 26 09:54:29 mrlinux kernel: [336696.726106] --- level:5 rd:3 wd:3
> Feb 26 09:54:29 mrlinux kernel: [336696.726112] disk 0, o:1, dev:sde1
> Feb 26 09:54:29 mrlinux kernel: [336696.726117] disk 1, o:1, dev:sdc1
> Feb 26 09:54:29 mrlinux kernel: [336696.726122] disk 2, o:1, dev:sdf1
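Again for reference, the corrective steps were roughly the following
(reconstructed from the logs; the exact invocations may have differed):
> mdadm /dev/md1 --fail /dev/sdf    # has to be failed before it can be removed
> mdadm /dev/md1 --remove /dev/sdf
> fdisk /dev/sdf                    # recreate the partition table / sdf1
> mdadm /dev/md1 --add /dev/sdf1    # add the partition this time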
Here's the strange thing, it recovered very quickly, which I thought was
nice, but I wonder if it created a problem, as about an hour later I
started getting errors in my logs:
> Feb 26 10:34:05 mrlinux kernel: [339073.432724] attempt to access
> beyond end of device
> Feb 26 10:34:05 mrlinux kernel: [339073.432732] dm-1: rw=0,
> want=15613016656, limit=58589184
> Feb 26 10:34:05 mrlinux kernel: [339073.533896] attempt to access
> beyond end of device
> Feb 26 10:34:05 mrlinux kernel: [339073.533905] dm-0: rw=0,
> want=1681676296, limit=97648640
> Feb 26 10:34:05 mrlinux kernel: [339073.533916] attempt to access
> beyond end of device
> Feb 26 10:34:05 mrlinux kernel: [339073.533920] dm-0: rw=0,
> want=18656264368, limit=97648640
> Feb 26 10:34:05 mrlinux kernel: [339073.533945] attempt to access
> beyond end of device
> Feb 26 10:34:05 mrlinux kernel: [339073.533950] dm-0: rw=0,
> want=18289707016, limit=97648640
> Feb 26 10:34:05 mrlinux kernel: [339073.533955] attempt to access
> beyond end of device
...and eventually:
> Feb 26 10:34:05 mrlinux kernel: [339073.534443] dm-0: rw=0,
> want=16783872552, limit=97648640
> Feb 26 10:34:05 mrlinux kernel: [339073.534447] attempt to access
> beyond end of device
> Feb 26 10:34:05 mrlinux kernel: [339073.534450] dm-0: rw=0,
> want=17686087704, limit=97648640
> Feb 26 10:34:05 mrlinux kernel: [339073.534515] attempt to access
> beyond end of device
> Feb 26 10:34:05 mrlinux kernel: [339073.534520] dm-0: rw=0,
> want=16716398592, limit=97648640
> Feb 26 10:37:57 mrlinux kernel: [339305.646807] EXT3-fs error (device
> dm-1): ext3_add_entry: bad entry in directory #1243562: rec_len % 4 !=
> 0 - offset=0, inode=873485355, rec_len=14129, name_len=108
> Feb 26 10:37:57 mrlinux kernel: [339305.647224] EXT3-fs error (device
> dm-1): ext3_add_entry: bad entry in directory #1243562: rec_len % 4 !=
> 0 - offset=0, inode=873485355, rec_len=14129, name_len=108
> Feb 26 10:37:57 mrlinux kernel: [339305.647591] EXT3-fs error (device
> dm-1): ext3_add_entry: bad entry in directory #1243562: rec_len % 4 !=
> 0 - offset=0, inode=873485355, rec_len=14129, name_len=108
> Feb 26 10:37:57 mrlinux kernel: [339305.647912] EXT3-fs error (device
> dm-1): ext3_add_entry: bad entry in directory #1243562: rec_len % 4 !=
> 0 - offset=0, inode=873485355, rec_len=14129, name_len=108
> Feb 26 10:37:57 mrlinux kernel: [339305.657177] attempt to access
> beyond end of device
> Feb 26 10:37:57 mrlinux kernel: [339305.657185] dm-0: rw=0,
> want=2157354904, limit=97648640
> Feb 26 10:37:57 mrlinux kernel: [339305.657192] attempt to access
> beyond end of device
> Feb 26 10:37:57 mrlinux kernel: [339305.657196] dm-0: rw=0,
> want=2157358136, limit=97648640
> Feb 26 10:37:57 mrlinux kernel: [339305.657202] attempt to access
> beyond end of device
...and finally, an hour after that, when I noticed and tried to check
the new disk:
> Feb 26 11:28:19 mrlinux kernel: [342327.335517] dm-0: rw=0,
> want=12776039208, limit=97648640
> Feb 26 11:28:19 mrlinux kernel: [342327.335523] attempt to access
> beyond end of device
> Feb 26 11:28:19 mrlinux kernel: [342327.335527] dm-0: rw=0,
> want=6906147664, limit=97648640
> Feb 26 11:28:19 mrlinux kernel: [342327.368414] apache2[12760]:
> segfault at 0 ip b7794e4f sp bffd1245 error 6 in apache2[b7752000+69000]
> Feb 26 11:30:43 mrlinux kernel: [342471.157911] smartctl[11466]:
> segfault at 4 ip b77b6a72 sp bfecf650 error 4 in
> ld-2.13.so[b77ac000+1c000]
> Feb 26 11:32:24 mrlinux kernel: [342572.338509] attempt to access
> beyond end of device
> Feb 26 11:32:24 mrlinux kernel: [342572.338518] dm-0: rw=0,
> want=6906147664, limit=97648640
> Feb 26 11:32:24 mrlinux kernel: [342572.338554] apache2[24950]:
> segfault at 0 ip b7794e4f sp bffd1245 error 6 in apache2[b7752000+69000]
> Feb 26 11:53:28 mrlinux kernel: [343835.861163] EXT3-fs error (device
> dm-1): ext3_add_entry: bad entry in directory #1243562: rec_len % 4 !=
> 0 - offset=0, inode=873485355, rec_len=14129, name_len=108
> Feb 26 11:53:28 mrlinux kernel: [343835.901372] EXT3-fs error (device
> dm-1): ext3_add_entry: bad entry in directory #1243562: rec_len % 4 !=
> 0 - offset=0, inode=873485355, rec_len=14129, name_len=108
> Feb 26 11:53:28 mrlinux kernel: [343835.930939] EXT3-fs error (device
> dm-1): ext3_add_entry: bad entry in directory #1243562: rec_len % 4 !=
> 0 - offset=0, inode=873485355, rec_len=14129, name_len=108
> Feb 26 11:53:28 mrlinux kernel: [343835.963208] EXT3-fs error (device
> dm-1): ext3_add_entry: bad entry in directory #1243562: rec_len % 4 !=
> 0 - offset=0, inode=873485355, rec_len=14129, name_len=108
> Feb 26 11:53:28 mrlinux kernel: [343835.965505] attempt to access
> beyond end of device
> Feb 26 11:53:28 mrlinux kernel: [343835.965514] dm-0: rw=0,
> want=14955012880, limit=97648640
> Feb 26 12:08:15 mrlinux kernel: [344722.924052] EXT3-fs error (device
> dm-1): ext3_add_entry: bad entry in directory #408850: rec_len % 4 !=
> 0 - offset=0, inode=134901586, rec_len=17695, name_len=24
> Feb 26 12:08:17 mrlinux kernel: [344724.810787] EXT3-fs (dm-1): error
> in ext3_new_inode: IO failure
> Feb 26 12:08:17 mrlinux kernel: [344724.885464] attempt to access
> beyond end of device
...and some more for good measure:
> Feb 26 12:08:35 mrlinux kernel: [344743.007400] EXT3-fs error (device
> dm-1): ext3_free_blocks: Freeing blocks not in datazone
> - block = 858599726, count = 1
> Feb 26 12:08:35 mrlinux kernel: [344743.069558] EXT3-fs error (device
> dm-1): ext3_free_blocks: Freeing blocks not in datazone
> - block = 1650811950, count = 1
Having rebooted, the segfaults are gone, and I can confirm that the new
disk seems fine:
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED
> WHEN_FAILED RAW_VALUE
> 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail
> Always - 0
> 3 Spin_Up_Time 0x0027 100 253 021 Pre-fail
> Always - 0
> 4 Start_Stop_Count 0x0032 100 100 000 Old_age
> Always - 4
> 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail
> Always - 0
> 7 Seek_Error_Rate 0x002e 200 200 000 Old_age
> Always - 0
> 9 Power_On_Hours 0x0032 100 100 000 Old_age
> Always - 46
> 10 Spin_Retry_Count 0x0032 100 253 000 Old_age
> Always - 0
> 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age
> Always - 0
> 12 Power_Cycle_Count 0x0032 100 100 000 Old_age
> Always - 4
> 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age
> Always - 3
> 193 Load_Cycle_Count 0x0032 200 200 000 Old_age
> Always - 108
> 194 Temperature_Celsius 0x0022 131 113 000 Old_age
> Always - 19
> 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age
> Always - 0
> 197 Current_Pending_Sector 0x0032 200 200 000 Old_age
> Always - 0
> 198 Offline_Uncorrectable 0x0030 100 253 000 Old_age
> Offline - 0
> 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age
> Always - 0
> 200 Multi_Zone_Error_Rate 0x0008 100 253 000 Old_age
> Offline - 0
>
> SMART Error Log Version: 1
> No Errors Logged
However, all three of my data partitions on the RAID5 volume have errors
beyond what the boot-time fsck -a/-p will repair, so I'm looking into
those. It doesn't look too bad though, as I was able to mount them and
copy everything to an external disk without trouble before starting
recovery.
If you made it this far, thanks! Any pointers?
- Jamie
* Re: Corrupted FS after recovery. Coincidence?
2013-02-27 19:38 Corrupted FS after recovery. Coincidence? Jamie Thompson
@ 2013-02-27 22:37 ` Adam Goryachev
2013-02-27 23:19 ` joystick
1 sibling, 0 replies; 4+ messages in thread
From: Adam Goryachev @ 2013-02-27 22:37 UTC (permalink / raw)
To: Jamie Thompson, linux-raid
Jamie Thompson <jamierocks@gmail.com> wrote:
>Here's the strange thing, it recovered very quickly, which I thought
>was
>nice, but I wonder if it created a problem, as about an hour later I
>started getting errors in my logs:
I'm not an expert in these things, but I would guess the following:
You re-added the partition instead of the drive, the MD metadata was at the end of the drive (same position regardless of whether it was a partition or a whole drive), and so MD just did a resync. However, the first section of the drive, or the offset of all data (from the beginning of the drive) was now wrong. You probably needed to clear the metadata on the drive before adding the partition to the array.
Someone else might confirm the above, but my suspicion is based on the fact that the resync was so quick (used the bitmap from the whole drive instead of making a new one for the partition).
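If you want to check, something like the following should show where the
metadata sits and what the bitmap looks like (just a sketch; adjust the
device names to suit):
  mdadm --examine /dev/sdf1         # superblock version and layout, if any metadata is present
  mdadm --detail /dev/md1           # the array's metadata version
  mdadm --examine-bitmap /dev/sdc1  # write-intent bitmap state on a current member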
PS, I'm glad you were able to get all your data back, though you should
probably verify that nothing was corrupted in the process, perhaps by
restoring from a backup and comparing files...
Regards,
Adam
--
Adam Goryachev
Website Managers
Phone: 02 8304 0000
www.websitemanagers.com.au
* Re: Corrupted FS after recovery. Coincidence?
2013-02-27 19:38 Corrupted FS after recovery. Coincidence? Jamie Thompson
2013-02-27 22:37 ` Adam Goryachev
@ 2013-02-27 23:19 ` joystick
2013-02-28 1:01 ` Jamie Thompson
1 sibling, 1 reply; 4+ messages in thread
From: joystick @ 2013-02-27 23:19 UTC (permalink / raw)
To: Jamie Thompson, linux-raid
On 02/27/13 20:38, Jamie Thompson wrote:
> Hi all.
>
> I just wanted to check with those more clued up that I'm not missing
> something important. To save you wading through the logs, in summary
> my filesystem got borked after I recovered an array when I realised
> I'd used the device and not the partition and corrected the mistake.
Not coincidence.
For sure MD cannot possibly recover 500GB in 9 seconds so something must
be wrong.
You do not show the metadata type. My guess is that it is at the end of
the disk (1.0 maybe), so when you added sdf1 MD thought it was a re-add
and re-synced only the parts that were dirty in the bitmap (changed
since the removal of sdf). However, since you moved the start of the
data on the disk, all data coming from that disk is offset and hence
bogus. That's why the metadata default for mdadm is version 1.2: you
don't risk this kind of crazy thing with 1.2.
With a non-degraded RAID5 (which is the situation after adding sdf1),
reads always come from the non-parity disks for every stripe. So when
you read, roughly 1/3 of the data comes from sdf1, all of it bogus.
Clearly ext3 is not happy with its metadata screwed up either, hence the
read errors you see.
If I am correct, the "fix" for your array is simple:
- fail sdf1
After that you can already read. Then do mdadm --zero-superblock
/dev/sdf1 (and maybe even mdadm --zero-superblock /dev/sdf and then
repartition the drive, just to be sure) so mdadm treats it like a new
drive. Then you can re-add it. Ensure it performs a full resync;
otherwise fail it again and report back here.
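In other words, roughly this (a sketch only; double-check the device
names before running anything):
  mdadm /dev/md1 --fail /dev/sdf1
  mdadm /dev/md1 --remove /dev/sdf1
  mdadm --zero-superblock /dev/sdf1
  mdadm --zero-superblock /dev/sdf   # optional; then repartition the drive
  mdadm /dev/md1 --add /dev/sdf1
  cat /proc/mdstat                   # verify a full recovery is running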
Too bad you performed fsck already with bogus sdf1 in the raid... Who
knows what mess it has done! I guess many files might be unreachable by
now. That was unwise.
For the backup you performed to an external disk: if my reasoning is
correct you can throw it away. This is unless you like to have 1/3 of
the content of your files full of bogus bytes. You will have more luck
backing up the array again after failing sdf1 (most parity data should
still be correct, except where fsck wrote data).
However, before proceeding with anything I suggest waiting for some
other opinion on the ML, 'cuz I am not infallible (euphemism).
Disassemble the raid in the meantime. That will at least make sure a
cron'd "repair" does not start, which would be disastrous.
Also, please tell us your kernel version and your cat /proc/mdstat
output so we can make better guesses.
Good luck
J.
* Re: Corrupted FS after recovery. Coincidence?
2013-02-27 23:19 ` joystick
@ 2013-02-28 1:01 ` Jamie Thompson
0 siblings, 0 replies; 4+ messages in thread
From: Jamie Thompson @ 2013-02-28 1:01 UTC (permalink / raw)
To: linux-raid; +Cc: joystick
On 27/02/2013 23:19, joystick wrote:
> Not coincidence.
I don't really believe in coincidences, but you never know :)
>
> For sure MD cannot possibly recover 500GB in 9 seconds so something
> must be wrong.
I wasn't sure if it had some clever way of noticing the change and had
simply "shifted" the data at the start to the end, leaving the middle as
it was. Linux software RAID and LVM are awesome, so it seemed possible :)
> You do not show the metadata type. My guess is that it is at the end
> of the disk (1.0 maybe), so when you added sdf1 MD thought it was a
> re-add and re-synced only the parts that were dirty in the bitmap
> (changed since the removal of sdf). However, since you moved the start
> of the data on the disk, all data coming from that disk is offset and
> hence bogus. That's why the metadata default for mdadm is version 1.2:
> you don't risk this kind of crazy thing with 1.2.
It's an old array I've had for years, so 0.90. :(
> With a non-degraded RAID5 (which is the situation after adding sdf1),
> reads always come from the non-parity disks for every stripe. So when
> you read, roughly 1/3 of the data comes from sdf1, all of it bogus.
> Clearly ext3 is not happy with its metadata screwed up either, hence
> the read errors you see.
>
> If I am correct, the "fix" for your array is simple:
> - fail sdf1
> After that you can already read. Then do mdadm --zero-superblock
> /dev/sdf1 (and maybe even mdadm --zero-superblock /dev/sdf and then
> repartition the drive, just to be sure) so mdadm treats it like a new
> drive. Then you can re-add it. Ensure it performs a full resync;
> otherwise fail it again and report back here.
>
> Too bad you performed fsck already with bogus sdf1 in the raid... Who
> knows what mess it has done! I guess many files might be unreachable
> by now. That was unwise.
After killing all the services that were dying from DB corruption (LDAP,
MySQL, etc.), I tried to fsck /var (where all the errors were coming
from), but couldn't unmount it, so I failed the new disk, as it was clear
the quick recovery was the most likely culprit, and then rebooted with
forced fscks. I guess I had a lucky hunch there! I'd already shut off
syslogd before I failed the new disk, as I was trying to unmount /var, so
these actions weren't logged.
Ok, so a --zero-superblock is all I need to ensure a quick bitmap-based
recovery doesn't happen again and I get a proper full rebuild? Cool.
> For the backup you performed to an external disk: if my reasoning is
> correct you can throw it away. This is unless you like to have 1/3 of
> the content of your files full of bogus bytes. You will have more luck
> backing up the array again after failing sdf1 (most parity data should
> still be correct, except where fsck wrote data).
:) My backup was made from the degraded array after a reboot and the
automatic safe repairs. So far, fsck -nv reports just 16 inodes with
errors on /home, all of which are old chat logs. /usr has 29 inodes with
errors, 13 of which I have the filenames for (so it's easy to grab those
files from their packages if recovery goes badly)...the other 16, well,
I guess I'll discover those in time. Finally, /var has just 15 inodes
with errors, all of which are wiki captcha images, apparently. So a
lucky escape there, it would seem!
Incidentally, I've made a handy little script I'm playing with whilst
waiting for the scans to complete:
> #!/bin/sh
> fsck -nv $1 | grep -ioE "inode ([0-9]+)" | cut -c 7- | sort | uniq |
> xargs -i -d \\n debugfs -R 'ncheck {}' $1 | grep -e "^[0-9]"
Give it a partition (e.g. /dev/main/homes) and it'll eventually show you
the filenames of the inodes with errors...the last bit of piping into
debugfs isn't quite right yet though, so I had to do that part manually.
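Something like this might tidy up that last step, though it's untested
(it assumes debugfs will happily take several inode numbers in a single
ncheck):
> #!/bin/sh
> # List the pathnames of the inodes that fsck -nv complains about.
> DEV="$1"
> INODES=$(fsck -nv "$DEV" 2>&1 | grep -oiE 'inode [0-9]+' \
>     | grep -oE '[0-9]+' | sort -nu | tr '\n' ' ')
> [ -n "$INODES" ] && debugfs -R "ncheck $INODES" "$DEV" 2>/dev/null | grep -E '^[0-9]'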
...I do love the *nix command line ;)
> However, before proceeding with anything I suggest waiting for some
> other opinion on the ML, 'cuz I am not infallible (euphemism).
> Disassemble the raid in the meantime. That will at least make sure a
> cron'd "repair" does not start, which would be disastrous.
Indeed. I'm being pressured to get the system back up, but I'm taking
very measured steps now! I've scped some of the backup I made to my
location and things seem fine...I want to do more checks though to be
sure. Touch wood, I might be lucky...
> Also, please tell us your kernel version and your cat /proc/mdstat
> output so we can make better guesses.
Certainly.
> mrlinux:/# uname -a
> Linux mrlinux 3.2.0-4-686-pae #1 SMP Debian 3.2.35-2 i686 GNU/Linux
>
> mrlinux:/# cat /proc/mdstat
> Personalities : [raid1] [raid6] [raid5] [raid4]
> md1 : active raid5 sde1[0] sdc1[1]
> 975145216 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
> bitmap: 175/233 pages [700KB], 1024KB chunk
>
> md0 : active raid1 sda1[0] sdb1[1]
> 1951744 blocks [2/2] [UU]
>
> unused devices: <none>
Nothing fancy :)
>
> Good luck
> J.
>
Thanks for your advice (and thanks Adam as well!)
- Jamie