* Problems after reshaping of Raid5 array
@ 2010-11-29 17:24 Michele Bonera
2010-11-29 19:12 ` Jan Ceuleers
2010-11-29 21:45 ` Neil Brown
0 siblings, 2 replies; 8+ messages in thread
From: Michele Bonera @ 2010-11-29 17:24 UTC
To: linux-raid
Hi all.
I'm in a bit of a panic... and I really need some help to solve this
(if that's possible...).
I have a storage server on my LAN where I save everything
for safety (sigh).
The system consists of a 32 GB SSD containing the OS
plus 4 WD EADS 1TB hard disks in RAID5 holding all my data.
The disks are seen by the system as sdb1, sdc1, sdd1, sde1.
Yesterday evening I added another WD, this time an EARS
(512 byte sectors): I created a partition on it, respecting the
alignment, then added it to the array and ran a
grow command:
mdadm --add /dev/md6 /dev/sdb1 (after adding it, the hd took sdb)
mdadm --grow /dev/md6 --raid-devices=5
The reshape started... and ran until today. Or rather, until the system
hung and I had to sync + remount read-only with the SysRq keys.
After rebooting, the reshape resumed, but the disk became sdb
rather than sdb1 in the RAID array, and the file system became unreadable.
Any ideas about what happened?
Thanks a lot for any suggestion you can give me.
I attach the mdadm -E and dumpe2fs outputs.
[-- Attachment #2: mdadm-e.tgz --]
[-- Type: application/x-gzip, Size: 803 bytes --]
[-- Attachment #3: dumpe2fs --]
[-- Type: application/octet-stream, Size: 1823 bytes --]
root@mizar:~# dumpe2fs /dev/md6
dumpe2fs 1.41.11 (14-Mar-2010)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 11e60234-97a2-47a0-8e52-8fe0c86aa667
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean with errors
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 161202176
Block count: 644794752
Reserved block count: 32239737
Free blocks: 42614406
Free inodes: 120363507
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 870
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
RAID stride: 32
RAID stripe width: 96
Filesystem created: Sun Nov 1 09:22:22 2009
Last mount time: Mon Nov 29 07:55:37 2010
Last write time: Mon Nov 29 16:46:00 2010
Mount count: 1
Maximum mount count: 31
Last checked: Sun Nov 28 23:04:27 2010
Check interval: 15552000 (6 months)
Next check after: Sat May 28 00:04:27 2011
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 4b69fffd-5dd1-425e-8ebd-bd29940c0722
Journal backup: inode blocks
Journal superblock magic number invalid!
* Re: Problems after reshaping of Raid5 array
From: Jan Ceuleers @ 2010-11-29 19:12 UTC
To: Michele Bonera; +Cc: linux-raid
On 29/11/10 18:24, Michele Bonera wrote:
> After rebooting, the reshape resumed, but the disk became sdb
> rather than sdb1 in the RAID array, and the file system became unreadable.
>
> Any ideas about what happened?
Michele,
I've had similar problems, which I resolved by changing the partition
type from fd (Linux RAID autodetect) to 83 (Linux), ensuring that the
initrd is able to assemble the RAID.
Worth a shot.
HTH, Jan
* Re: Problems after reshaping of Raid5 array
From: Michele Bonera @ 2010-11-29 20:04 UTC
To: Jan Ceuleers; +Cc: linux-raid
On Mon, 29/11/2010 at 20.12 +0100, Jan Ceuleers wrote:
> I've had similar problems, which I resolved by changing the partition
> type from fd (Linux RAID autodetect) to 83 (Linux), ensuring that the
> initrd is able to assemble the RAID.
> Worth a shot.
> HTH, Jan
Thanks for the reply, Jan, but the problem is not at the RAID level. Or
rather, the RAID has just finished reshaping and now has 5/5 of its
components and is active:
root@mizar:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0]
[raid1] [raid10]
md6 : active raid5 sdf1[2] sdb1[0] sda1[1] sdc[4] sde1[3]
3438905344 blocks level 5, 128k chunk, algorithm 2 [5/5] [UUUUU]
The problem is that I added a partition (sdc1), but after the
crash the array has sdc (the whole device) as one of its components. The
partition table on the disk was wiped out.
The result is that the filesystem on the array (/dev/md6) is unreadable and
fsck.ext3 can't fix it.
Bye
--
Michele Bonera
linux user group brescia
* Re: Problems after reshaping of Raid5 array
From: Jan Ceuleers @ 2010-11-30 17:00 UTC
To: Michele Bonera; +Cc: linux-raid
On 29/11/10 21:04, Michele Bonera wrote:
> The problem is that I added a partition (sdc1), but after the
> crash the array has sdc (the whole device) as one of its components. The
> partition table on the disk was wiped out.
>
> The result is that the filesystem on the array (/dev/md6) is unreadable and
> fsck.ext3 can't fix it.
Right. This is consistent with what I've seen as well.
I can't help you repair this filesystem, but once you have, my suggestion
remains: change the partition types of the RAID members. This is
because I've noticed that the in-kernel RAID detection and assembly code
sometimes gets it wrong (leading to the kinds of symptoms you've
reported), whereas mdadm run from the initrd does not.
* Re: Problems after reshaping of Raid5 array
From: Neil Brown @ 2010-11-29 21:45 UTC
To: Michele Bonera; +Cc: linux-raid
On Mon, 29 Nov 2010 18:24:18 +0100 Michele Bonera <mbonera@gmail.com> wrote:
> Hi all.
>
> I'm in a bit of a panic... and I really need some help to solve this
> (if that's possible...).
>
> I have a storage server on my LAN where I save everything
> for safety (sigh).
>
> The system consists of a 32 GB SSD containing the OS
> plus 4 WD EADS 1TB hard disks in RAID5 holding all my data.
> The disks are seen by the system as sdb1, sdc1, sdd1, sde1.
>
> Yesterday evening I added another WD, this time an EARS
> (512 byte sectors): I created a partition on it, respecting the
> alignment, then added it to the array and ran a
> grow command:
>
> mdadm --add /dev/md6 /dev/sdb1 (after adding it, the hd took sdb)
> mdadm --grow /dev/md6 --raid-devices=5
>
> The reshape started... and ran until today. Or rather, until the system
> hung and I had to sync + remount read-only with the SysRq keys.
>
> After rebooting, the reshape resumed, but the disk became sdb
> rather than sdb1 in the RAID array, and the file system became unreadable.
>
> Any ideas about what happened?
Yes.
I think you can fix it by simply failing and removing sdc.
Then md/raid5 will recover that data using the parity block, and that should
be correct.
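As an illustration of the recovery described above, here is a minimal Python sketch of how one stripe of a RAID5 array can rebuild a missing member's chunk by XOR-ing the surviving chunks with the parity chunk. The chunk contents and the helper name xor_blocks are made up for the example; a real md array rotates parity across the devices and uses much larger chunks (128k in this thread).

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe of a hypothetical 5-device RAID5: four data chunks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)

# Pretend the device holding chunk index 2 has been failed and removed.
survivors = data[:2] + data[3:] + [parity]

# XOR of everything that is left reproduces the missing chunk.
rebuilt = xor_blocks(survivors)
print(rebuilt == data[2])
```

This is why dropping the one member with bad contents can be enough: as long as the other four members are consistent, the fifth can be recomputed from them.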
It appears that the partition you created on the new device started at a
multiple of 64K. When this happens, the superblock at the end of the
partition also looks valid when seen at the end of the whole device.
Somehow mdadm got confused and chose the whole device (sdc) instead of the
partition (sdc1).
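The arithmetic behind this can be sketched in Python. The sketch assumes v0.90 metadata, where the superblock sits in the last full 64K-aligned block of the device (size rounded down to a multiple of 64K, minus 64K), and it assumes the partition runs to the end of the disk. The disk size and the two partition start sectors are hypothetical values chosen for illustration, not taken from the thread.

```python
MD_RESERVED_SECTORS = 128  # 64 KiB expressed in 512-byte sectors


def sb_offset(size_sectors):
    """Superblock offset inside a device of the given size (v0.90 layout):
    round the size down to a 64 KiB boundary, then step back 64 KiB."""
    return (size_sectors & ~(MD_RESERVED_SECTORS - 1)) - MD_RESERVED_SECTORS


def sb_on_disk(part_start, disk_sectors):
    """Absolute sector of the superblock of a partition that starts at
    part_start and runs to the end of the disk (an assumption here)."""
    return part_start + sb_offset(disk_sectors - part_start)


DISK_SECTORS = 1953525168  # hypothetical 1 TB drive

whole_disk = sb_offset(DISK_SECTORS)
aligned = sb_on_disk(2048, DISK_SECTORS)   # start is a multiple of 128 sectors
unaligned = sb_on_disk(63, DISK_SECTORS)   # start is not

print(whole_disk == aligned)    # same sector: sdc and sdc1 both look valid
print(whole_disk == unaligned)  # different sector: no ambiguity
```

With the aligned start, the partition's superblock lands on exactly the sector where a whole-disk superblock would be expected, so the same bytes look like valid metadata for both sdc and sdc1.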
I am surprised at this because since mdadm-2.5.1, mdadm will refuse to
assemble an array if it sees two devices that appear to have the same
superblock. Could you possibly be using something that old??
So when the reshape started, it was writing data for the 5th device to sdc1.
Then after you restarted, it was writing data for the 5th device to sdc, which
is the same drive of course, but at a different offset. So everything that was
written before the crash will look wrong.
So the thing to do is to stop md from reading from sdc at all, as that device
is clearly corrupt. So just fail and remove it. Then add it back in again.
If you do re-partition, try to make sure sdc1 does not start at a multiple of
64K (128 sectors).
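A quick way to check an existing or planned partition start against this rule could look like the Python sketch below. The function names are hypothetical; reading the start sector from /sys/block/<disk>/<partition>/start assumes a Linux sysfs layout, and the check only matters for metadata stored at the end of the device (such as v0.90).

```python
from pathlib import Path

MD_RESERVED_SECTORS = 128  # 64 KiB expressed in 512-byte sectors


def start_is_ambiguous(start_sector):
    """True if the partition starts on a 64 KiB boundary, so that its
    end-of-device superblock could also look valid on the whole disk."""
    return start_sector % MD_RESERVED_SECTORS == 0


def read_start_sector(disk, part, sysfs=Path("/sys/block")):
    """Read a partition's start sector (512-byte units) from sysfs."""
    return int((sysfs / disk / part / "start").read_text())


# Example with literal sector numbers; on a live system one might call
# start_is_ambiguous(read_start_sector("sdc", "sdc1")) instead.
print(start_is_ambiguous(2048))  # a 1 MiB-aligned start is a multiple of 128
print(start_is_ambiguous(63))    # a classic DOS-style start is not
```

Note that a 1 MiB-aligned start at sector 2048 is itself a multiple of 128 sectors, so an alignment chosen for 4K-sector drives will trip this check.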
NeilBrown
* Re: Problems after reshaping of Raid5 array
From: Michele Bonera @ 2010-11-30 7:23 UTC
To: Neil Brown; +Cc: linux-raid
On Tue, 30/11/2010 at 08.45 +1100, Neil Brown wrote:
> > Yesterday evening I added another WD, this time an EARS
> > (512 byte sectors): I created a partition on it, respecting the
Just to be precise: 4K-byte sectors (my mistake).
> Yes.
> I think you can fix it by simply failing and removing sdc.
> Then md/raid5 will recover that data using the parity block, and that should
> be correct.
> It appears that the partition you created on the new device started at a
> multiple of 64K. When this happens, the superblock at the end of the
> partition also looks valid when seen at the end of the whole device.
> Somehow mdadm got confused and chose the whole device (sdc) instead of the
> partition (sdc1).
I did it and it worked! Thanks a lot Neil!!!
> I am surprised at this because since mdadm-2.5.1, mdadm will refuse to
> assemble an array if it sees two devices that appear to have the same
> superblock. Could you possibly be using something that old??
The distribution is Ubuntu Server 10.04,
kernel 2.6.32-26-generic-pae, mdadm 2.6.7.1
Again many thanks Neil, you saved me! :)
Cheers,
--
Michele Bonera
linux user group brescia
* Re: Problems after reshaping of Raid5 array
From: Jan Ceuleers @ 2010-11-30 19:45 UTC
To: Michele Bonera; +Cc: Neil Brown, linux-raid
On 30/11/10 08:23, Michele Bonera wrote:
>> It appears that the partition you created on the new device started at a
>> multiple of 64K. When this happens, the superblock at the end of the
>> partition also looks valid when seen at the end of the whole device.
>> Somehow mdadm got confused and chose the whole device (sdc) instead of the
>> partition (sdc1).
>
> I did it and it worked! Thanks a lot Neil!!!
Michele,
Does it survive a reboot?
Jan
* Re: Problems after reshaping of Raid5 array
From: Michele Bonera @ 2010-12-03 7:03 UTC
To: Jan Ceuleers; +Cc: linux-raid
On Tue, 30/11/2010 at 20.45 +0100, Jan Ceuleers wrote:
> On 30/11/10 08:23, Michele Bonera wrote:
> >> It appears that the partition you created on the new device started at a
> >> multiple of 64K. When this happens, the superblock at the end of the
> >> partition also looks valid when seen at the end of the whole device.
> >> Somehow mdadm got confused and chose the whole device (sdc) instead of the
> >> partition (sdc1).
> >
> > I did it and it worked! Thanks a lot Neil!!!
> Michele,
> Does it survive a reboot?
> Jan
Yes. Now it's working perfectly...
Bye
--
Michele Bonera
linux user group brescia