* Help with corrupted MDADM Raid6
@ 2014-06-13 8:34 ptschack .
2014-06-13 10:25 ` NeilBrown
0 siblings, 1 reply; 8+ messages in thread
From: ptschack . @ 2014-06-13 8:34 UTC (permalink / raw)
To: linux-raid
[-- Attachment #1: Type: text/plain, Size: 1063 bytes --]
Hi,
I fear I may have messed up my MDADM RAID 6. Some info:
The RAID consists of 11 hard drives (3 TB each), 2 of which are
spares. It used to be 6 drives, no spares.
This happened: one drive (/dev/sdg) kept giving me SMART warnings
(failure imminent). As I was trying to do a fresh install at the time,
I restored a backup of my system drive (an SSD which is wholly
independent of the RAID) and booted off of that.
I hadn't realized that the backup was too old, from the time before I
grew the RAID... Apparently mdadm tried to mount the RAID as it used
to be (6 drives), and now it says the RAID consists of 6 drives, all
spares!
Plus the drive /dev/sdg seems to have totally failed now :(
I guess I somehow need to re-assemble (or re-create) the raid6 in a
degraded state, since one drive failed, but I'm not entirely clear on
how to do that. One question is if I should include the spares in a
reassembly or add them later.
Can anyone help?
I have attached the outputs of dumpe2fs and mdadm --examine of the
drives, plus my mdadm.conf.
Regards,
-P.
[-- Attachment #2: dumpe2fs.txt --]
[-- Type: text/plain; charset=US-ASCII; name="dumpe2fs.txt", Size: 18175 bytes --]
/dev/sda1:
dumpe2fs 1.42 (29-Nov-2011)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 5f78aca0-0621-4fad-a983-7230935b1b6b
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 183148544
Block count: 732566272
Reserved block count: 36628313
Free blocks: 721019450
Free inodes: 183148533
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 849
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Sun Mar 3 02:19:36 2013
Last mount time: n/a
Last write time: Sun Mar 3 02:19:37 2013
Mount count: 0
Maximum mount count: -1
Last checked: Sun Mar 3 02:19:36 2013
Check interval: 0 (<none>)
Lifetime writes: 137 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: a51a2206-3adb-4e90-89a2-4d0e69bd6da2
Journal backup: inode blocks
Journal superblock magic number invalid!
/dev/sdb1:
dumpe2fs 1.42 (29-Nov-2011)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 438ac068-902a-42fd-a6a0-1352da761cac
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 183148544
Block count: 732566272
Reserved block count: 36628313
Free blocks: 721019450
Free inodes: 183148533
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 849
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Sun Mar 3 02:21:42 2013
Last mount time: n/a
Last write time: Sun Mar 3 02:21:44 2013
Mount count: 0
Maximum mount count: -1
Last checked: Sun Mar 3 02:21:42 2013
Check interval: 0 (<none>)
Lifetime writes: 137 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 70b15171-13bc-4dec-8638-b8aa4f01f9b0
Journal backup: inode blocks
Journal superblock magic number invalid!
/dev/sdc1:
dumpe2fs 1.42 (29-Nov-2011)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 4775ff03-ad4f-4a7b-87c5-d675a5f9092a
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 183148544
Block count: 732566272
Reserved block count: 36628313
Free blocks: 721019450
Free inodes: 183148533
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 849
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Sun Mar 3 02:22:00 2013
Last mount time: n/a
Last write time: Sun Mar 3 02:22:02 2013
Mount count: 0
Maximum mount count: -1
Last checked: Sun Mar 3 02:22:00 2013
Check interval: 0 (<none>)
Lifetime writes: 137 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: d5a04c56-e394-462e-b89c-a9ce1c509eb7
Journal backup: inode blocks
Journal superblock magic number invalid!
/dev/sdd1:
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 4ffd6b00-4a71-48d5-a7e4-97822db89fb7
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 183148544
Block count: 732566272
Reserved block count: 36628313
Free blocks: 721019450
Free inodes: 183148533
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 849
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Sun Mar 3 02:22:19 2013
Last mount time: n/a
Last write time: Sun Mar 3 02:22:20 2013
Mount count: 0
Maximum mount count: -1
Last checked: Sun Mar 3 02:22:19 2013
Check interval: 0 (<none>)
Lifetime writes: 137 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 5f3b5e6f-eeb6-42ab-a4c9-f5fa828bd8a7
Journal backup: inode blocks
Journal superblock magic number invalid!
/dev/sde1:
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: fda7509c-7ad3-4d63-b67b-f2ede3c55701
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 183148544
Block count: 732566272
Reserved block count: 36628313
Free blocks: 721019450
Free inodes: 183148533
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 849
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Sun Mar 3 02:22:34 2013
Last mount time: n/a
Last write time: Sun Mar 3 02:22:36 2013
Mount count: 0
Maximum mount count: -1
Last checked: Sun Mar 3 02:22:34 2013
Check interval: 0 (<none>)
Lifetime writes: 137 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 91b25ca7-4847-4cc8-b076-947a58d44794
Journal backup: inode blocks
Journal superblock magic number invalid!
/dev/sdf1:
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: b6bd7f6f-a2be-4bf3-bb48-f3e3bd03155d
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 183148544
Block count: 732566272
Reserved block count: 36628313
Free blocks: 721019450
Free inodes: 183148533
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 849
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Sun Mar 3 02:22:51 2013
Last mount time: n/a
Last write time: Sun Mar 3 02:22:52 2013
Mount count: 0
Maximum mount count: -1
Last checked: Sun Mar 3 02:22:51 2013
Check interval: 0 (<none>)
Lifetime writes: 137 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 641bfcef-ce8c-49bd-af2f-f505fec137ea
Journal backup: inode blocks
Journal superblock magic number invalid!
/dev/sdg1:
Cannot find superblock
/dev/sdh1:
dumpe2fs 1.42 (29-Nov-2011)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: e3e0fe2f-5898-4035-b772-43c79670d82d
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 183148544
Block count: 732566272
Reserved block count: 36628313
Free blocks: 721019450
Free inodes: 183148533
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 849
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Sun Apr 20 10:06:40 2014
Last mount time: n/a
Last write time: Sun Apr 20 10:06:42 2014
Mount count: 0
Maximum mount count: -1
Last checked: Sun Apr 20 10:06:40 2014
Check interval: 0 (<none>)
Lifetime writes: 137 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 700aaa9e-e4f7-4002-9ced-9e064f88d2f0
Journal backup: inode blocks
Journal superblock magic number invalid!
/dev/sdi1:
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: c47385ef-9140-4266-80da-a0641f9c1199
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 183148544
Block count: 732566272
Reserved block count: 36628313
Free blocks: 721019450
Free inodes: 183148533
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 849
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Sun Apr 20 10:07:57 2014
Last mount time: n/a
Last write time: Sun Apr 20 10:07:59 2014
Mount count: 0
Maximum mount count: -1
Last checked: Sun Apr 20 10:07:57 2014
Check interval: 0 (<none>)
Lifetime writes: 137 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 747654a9-35b5-465c-aee5-3da187ed40dd
Journal backup: inode blocks
Journal superblock magic number invalid!
/dev/sdj1:
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 0f9b6506-8703-4673-b856-35a6ea36b0ea
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 183148544
Block count: 732566272
Reserved block count: 36628313
Free blocks: 721019450
Free inodes: 183148533
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 849
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Sun Apr 20 10:08:51 2014
Last mount time: n/a
Last write time: Sun Apr 20 10:08:52 2014
Mount count: 0
Maximum mount count: -1
Last checked: Sun Apr 20 10:08:51 2014
Check interval: 0 (<none>)
Lifetime writes: 137 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: c02a71ea-def8-4679-88dd-c485879eca8a
Journal backup: inode blocks
Journal features: (none)
Journal size: 128M
Journal length: 32768
Journal sequence: 0x00000001
Journal start: 0
[Bunch of info about Inodes and such here]
/dev/sdk1:
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 3c3969e7-b556-4acd-a57a-9082e275dad8
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 183148544
Block count: 732566272
Reserved block count: 36628313
Free blocks: 721019450
Free inodes: 183148533
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 849
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Sun Apr 6 12:35:30 2014
Last mount time: n/a
Last write time: Sun Apr 6 12:35:32 2014
Mount count: 0
Maximum mount count: -1
Last checked: Sun Apr 6 12:35:30 2014
Check interval: 0 (<none>)
Lifetime writes: 137 MB
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
Default directory hash: half_md4
Directory Hash Seed: 9ea8b5a1-aa8e-4902-aaa7-7760df1c5b40
Journal backup: inode blocks
dumpe2fs: Corrupt extent header while reading journal super block
[-- Attachment #3: mdadm.conf --]
[-- Type: application/octet-stream, Size: 775 bytes --]
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#
# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers
# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes
# automatically tag new arrays as belonging to the local system
HOMEHOST <system>
# instruct the monitoring daemon where to send mail alerts
MAILADDR root
# definitions of existing MD arrays
ARRAY /dev/md/0 metadata=1.2 UUID=2d18f556:0cbad263:017a87af:aadac8f7 name=brain:0
# This file was auto-generated on Sun, 03 Mar 2013 20:58:22 +0100
# by mkconf $Id$
[-- Attachment #4: mdadm_--examine.txt --]
[-- Type: text/plain, Size: 1396 bytes --]
mdadm: No md superblock detected on /dev/sda1.
mdadm: No md superblock detected on /dev/sdb1.
mdadm: No md superblock detected on /dev/sdc1.
mdadm: No md superblock detected on /dev/sdd1.
mdadm: No md superblock detected on /dev/sde1.
mdadm: No md superblock detected on /dev/sdf1.
mdadm: cannot open /dev/sdg1: No such file or directory
mdadm: No md superblock detected on /dev/sdh1.
mdadm: No md superblock detected on /dev/sdi1.
mdadm: No md superblock detected on /dev/sdj1.
/dev/sdk1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2d18f556:0cbad263:017a87af:aadac8f7
Name : brain:0 (local to host brain)
Creation Time : Sun Mar 3 02:34:30 2013
Raid Level : raid6
Raid Devices : 6
Avail Dev Size : 5860272128 (2794.40 GiB 3000.46 GB)
Array Size : 11720541440 (11177.58 GiB 12001.83 GB)
Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB)
Data Offset : 258048 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 41eecef8:f5c16378:0c42cbb7:a66262d7
Internal Bitmap : 8 sectors from superblock
Update Time : Sun Apr 6 12:36:57 2014
Checksum : d4129908 - correct
Events : 3848
Layout : left-symmetric
Chunk Size : 64K
Device Role : spare
Array State : AAAAAA ('A' == active, '.' == missing)
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Help with corrupted MDADM Raid6
2014-06-13 8:34 Help with corrupted MDADM Raid6 ptschack .
@ 2014-06-13 10:25 ` NeilBrown
2014-06-13 10:53 ` ptschack .
2014-06-14 9:54 ` ptschack .
0 siblings, 2 replies; 8+ messages in thread
From: NeilBrown @ 2014-06-13 10:25 UTC (permalink / raw)
To: ptschack .; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 2437 bytes --]
On Fri, 13 Jun 2014 10:34:55 +0200 "ptschack ." <ptschack@googlemail.com>
wrote:
> Hi,
>
> I fear I may have messed up my MDADM RAID 6. Some info:
>
> The RAID consists of 11 hard drives (3 TB each), 2 of which are
> spares. It used to be 6 drives, no spares.
>
> This happened: one drive (/dev/sdg) kept giving me SMART warnings
> (failure imminent). As I was trying to do a fresh install at the time,
> I restored a backup of my system drive (an SSD which is wholly
> independent of the RAID) and booted off of that.
> I hadn't realized that the backup was too old, from the time before I
> grew the RAID... Apparently mdadm tried to mount the RAID as it used
> to be (6 drives), and now it says the RAID consists of 6 drives, all
> spares!
That doesn't make sense. md stores the important information about an array
on the drives of the array. The information in /etc/mdadm.conf is largely
advisory.
The mdadm.conf that you attached looks perfectly OK and would work with
either the old 6-drive array or the new 9+2 drive array.
> Plus the drive /dev/sdg seems to have totally failed now :(
>
> I guess I somehow need to re-assemble (or re-create) the raid6 in a
> degraded state, since one drive failed, but I'm not entirely clear on
> how to do that. One question is if I should include the spares in a
> reassembly or add them later.
>
> Can anyone help?
Hopefully.
>
> I have attached the outputs of dumpe2fs and mdadm --examine of the
> drives, plus my mdadm.conf.
sdk1 is the only device to show an md superblock, and it was last updated on
6th April, so it is rather old. That is presumably before you made the array
larger.
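The sizes in the sdk1 superblock agree with that reading. A RAID6 array gives up two devices' worth of space to parity, so its size should be (Raid Devices - 2) times Used Dev Size. A minimal sketch of the check, assuming mdadm prints Used Dev Size in 512-byte sectors and Array Size in KiB (which is what the GiB figures next to them imply); the name raid6_array_kib is made up for illustration:

```shell
# Hypothetical helper: expected RAID6 array size in KiB.
# $1 = Raid Devices, $2 = Used Dev Size in 512-byte sectors.
raid6_array_kib() {
    # two devices' worth of space holds parity; 2 sectors make 1 KiB
    echo $(( ($1 - 2) * $2 / 2 ))
}

# Figures taken from the sdk1 superblock in the first message.
raid6_array_kib 6 5860270720   # prints 11720541440
```

The result equals the Array Size that sdk1 reports, so that superblock describes the 6-device array and nothing newer.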
sda1, sdb1, sdc1, sdd1, sde1, sdf1, sdh1, sdi1, sdj1 and sdk1 all appear to
contain ext3fs or ext4fs superblocks. I think some of those are from before
the devices were added into the array. Quite possibly md doesn't over-write
the ext3 superblock.
I wonder if maybe the array is actually on the whole devices rather than on
the partitions. What does
mdadm --examine /dev/sd[abcdefghijk]
report?
Certainly *don't* try to create the array until you have tried all other
options.
If you do find md superblocks on the whole devices, then you could try
mdadm --assemble /dev/md0 /dev/sd[a-k]
possibly removing from the list any devices which don't work, and possibly
adding "--force" if it doesn't work without it.
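Before reaching for "--force" it helps to see at a glance which devices carry a superblock and whether their event counts agree. A rough sketch that condenses "mdadm --examine" output into one line per device. It assumes the "Events" and "Device Role" labels as they appear in the --examine output posted in this thread, and summarize_examine is a made-up name, not part of mdadm:

```shell
# Hypothetical helper: one summary line per device from `mdadm --examine` text on stdin.
summarize_examine() {
    awk '
        function emit() {
            if (dev != "") printf "%s events=%s role=%s\n", dev, events, role
        }
        # A line such as "/dev/sda:" starts a new device section.
        /^\/dev\// { emit(); dev = $1; sub(/:$/, "", dev); events = "?"; role = "?" }
        # "Events : 39295"
        $1 == "Events" && $2 == ":" { events = $3 }
        # "Device Role : Active device 0" or "Device Role : spare"
        $1 == "Device" && $2 == "Role" {
            role = $4
            for (i = 5; i <= NF; i++) role = role " " $i
        }
        END { emit() }
    '
}

# Assumed usage (needs root, reads only):
#   mdadm --examine /dev/sd[a-k] 2>/dev/null | summarize_examine
```

Devices that print "events=?" have no md superblock in the examined location; members whose event counts differ from the rest are the ones "--force" would have to reconcile.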
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
* Re: Help with corrupted MDADM Raid6
2014-06-13 10:25 ` NeilBrown
@ 2014-06-13 10:53 ` ptschack .
2014-06-14 9:54 ` ptschack .
1 sibling, 0 replies; 8+ messages in thread
From: ptschack . @ 2014-06-13 10:53 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
Hi Neil,
Thanks for your answer. Some clarification:
I clearly remember creating the RAID on the partitions (i.e. sda1,
sdb1, ...), not the whole devices (sda, sdb, ...). Doing this, I
created an ext4 partition on each drive and then created the RAID (in
hindsight, I know now I should have done this differently, starting
with the filesystem).
Before restoring my backup of the system disk (unrelated to the RAID)
I ran GParted to just look at the drives, not make any changes. My
guess is that GParted somehow "restored" the superblocks on all disks
except sdk1, thus destroying the md superblocks on those drives.
Nevertheless, when I get home in a few hours I will run mdadm
--examine on the whole devices, but I am 99% sure that nothing will be
found. Would an assemble (or create, for that matter) be able to
restore missing superblocks?
Regards,
-P.
* Re: Help with corrupted MDADM Raid6
2014-06-13 10:25 ` NeilBrown
2014-06-13 10:53 ` ptschack .
@ 2014-06-14 9:54 ` ptschack .
2014-06-14 10:31 ` NeilBrown
1 sibling, 1 reply; 8+ messages in thread
From: ptschack . @ 2014-06-14 9:54 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 862 bytes --]
Hi Neil,
I ran mdadm --examine on the devices (as opposed to the partitions)
and was surprised:
Apparently the superblocks are present on some of the devices, even though I
remember it differently.
I attached the output of mdadm to this mail.
For the sake of clarity, this is what the RAID looked like before it
all went wrong:
11 drives (/dev/sd[abcdefghijk]) altogether as a RAID 6.
2 of those are spares (/dev/sd[jk]).
1 is either close to failing or has failed (/dev/sdg).
I tried
mdadm --assemble --run /dev/md0 -v /dev/sd[abcdefghi]
which gave me
mdadm: looking for devices for /dev/md0
mdadm: no RAID superblock on /dev/sdg
mdadm: /dev/sdg has no superblock - assembly aborted
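The assembly aborts because the glob hands mdadm a device (/dev/sdg) that has no superblock. One way to avoid that is to build the device list only from devices whose --examine output names the array. A minimal sketch, assuming the "Array UUID" label shown in the --examine output in this thread; members_of_array is a hypothetical helper, not an mdadm feature:

```shell
# Hypothetical helper: print the devices whose superblock carries the given array UUID.
# $1 = array UUID as printed by mdadm; `mdadm --examine` text is read from stdin.
members_of_array() {
    awk -v uuid="$1" '
        # "/dev/sda:" opens a device section
        /^\/dev\// { dev = $1; sub(/:$/, "", dev) }
        # "Array UUID : 2d18f556:..." identifies the array the device belongs to
        $1 == "Array" && $2 == "UUID" && $4 == uuid { print dev }
    '
}

# Assumed usage (needs root, reads only):
#   mdadm --examine /dev/sd[a-k] 2>/dev/null |
#       members_of_array 2d18f556:0cbad263:017a87af:aadac8f7
```

This only lists candidates. Whether the array can start still depends on how many members turn up.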
So it seems I somehow have to restore the superblocks on drives
/dev/sd[ghijk], or at least on /dev/sd[ghi].
Is this possible? Any help would be greatly appreciated!
Regards,
-P.
[-- Attachment #2: mdadm_--examine_devices.txt --]
[-- Type: text/plain, Size: 5713 bytes --]
/dev/sda:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2d18f556:0cbad263:017a87af:aadac8f7
Name : brain:0 (local to host brain)
Creation Time : Sun Mar 3 02:34:30 2013
Raid Level : raid6
Raid Devices : 9
Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
Array Size : 20510947520 (19560.76 GiB 21003.21 GB)
Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 205e9ef8:deb77ef0:c1616f65:488ee07b
Internal Bitmap : 8 sectors from superblock
Update Time : Mon Jun 9 21:52:48 2014
Checksum : 2b351b38 - correct
Events : 39295
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 0
Array State : AAAAAAAAA ('A' == active, '.' == missing)
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2d18f556:0cbad263:017a87af:aadac8f7
Name : brain:0 (local to host brain)
Creation Time : Sun Mar 3 02:34:30 2013
Raid Level : raid6
Raid Devices : 9
Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
Array Size : 20510947520 (19560.76 GiB 21003.21 GB)
Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : f8150037:720c981f:f6a476e2:b98fd23a
Internal Bitmap : 8 sectors from superblock
Update Time : Mon Jun 9 21:52:48 2014
Checksum : d4a96c49 - correct
Events : 39295
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 1
Array State : AAAAAAAAA ('A' == active, '.' == missing)
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2d18f556:0cbad263:017a87af:aadac8f7
Name : brain:0 (local to host brain)
Creation Time : Sun Mar 3 02:34:30 2013
Raid Level : raid6
Raid Devices : 9
Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
Array Size : 20510947520 (19560.76 GiB 21003.21 GB)
Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 138fa882:0164da5b:5efea797:bddf6041
Internal Bitmap : 8 sectors from superblock
Update Time : Mon Jun 9 21:52:48 2014
Checksum : 1853e661 - correct
Events : 39295
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 2
Array State : AAAAAAAAA ('A' == active, '.' == missing)
/dev/sdd:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2d18f556:0cbad263:017a87af:aadac8f7
Name : brain:0 (local to host brain)
Creation Time : Sun Mar 3 02:34:30 2013
Raid Level : raid6
Raid Devices : 9
Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
Array Size : 20510947520 (19560.76 GiB 21003.21 GB)
Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 96eb8a2f:03191596:5f97190b:60d0967e
Internal Bitmap : 8 sectors from superblock
Update Time : Mon Jun 9 21:52:48 2014
Checksum : b018818a - correct
Events : 39295
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 3
Array State : AAAAAAAAA ('A' == active, '.' == missing)
/dev/sde:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2d18f556:0cbad263:017a87af:aadac8f7
Name : brain:0 (local to host brain)
Creation Time : Sun Mar 3 02:34:30 2013
Raid Level : raid6
Raid Devices : 9
Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
Array Size : 20510947520 (19560.76 GiB 21003.21 GB)
Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : 6886190c:e41800e8:452f1d0e:0f68dab6
Internal Bitmap : 8 sectors from superblock
Update Time : Mon Jun 9 21:52:48 2014
Checksum : 19d94bd4 - correct
Events : 39295
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 4
Array State : AAAAAAAAA ('A' == active, '.' == missing)
/dev/sdf:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x1
Array UUID : 2d18f556:0cbad263:017a87af:aadac8f7
Name : brain:0 (local to host brain)
Creation Time : Sun Mar 3 02:34:30 2013
Raid Level : raid6
Raid Devices : 9
Avail Dev Size : 5860271024 (2794.40 GiB 3000.46 GB)
Array Size : 20510947520 (19560.76 GiB 21003.21 GB)
Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB)
Data Offset : 262144 sectors
Super Offset : 8 sectors
State : clean
Device UUID : fa319bfb:ffe914f0:36423faf:33634092
Internal Bitmap : 8 sectors from superblock
Update Time : Mon Jun 9 21:52:48 2014
Checksum : 8df7d698 - correct
Events : 39295
Layout : left-symmetric
Chunk Size : 64K
Device Role : Active device 5
Array State : AAAAAAAAA ('A' == active, '.' == missing)
/dev/sdg:
MBR Magic : aa55
Partition[0] : 4294967295 sectors at 1 (type ee)
/dev/sdh:
MBR Magic : aa55
Partition[0] : 4294967295 sectors at 1 (type ee)
/dev/sdi:
MBR Magic : aa55
Partition[0] : 4294967295 sectors at 1 (type ee)
/dev/sdj:
MBR Magic : aa55
Partition[0] : 4294967295 sectors at 1 (type ee)
/dev/sdk:
MBR Magic : aa55
Partition[0] : 4294967295 sectors at 1 (type ee)
* Re: Help with corrupted MDADM Raid6
2014-06-14 9:54 ` ptschack .
@ 2014-06-14 10:31 ` NeilBrown
2014-06-14 11:19 ` ptschack .
0 siblings, 1 reply; 8+ messages in thread
From: NeilBrown @ 2014-06-14 10:31 UTC (permalink / raw)
To: ptschack .; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 1806 bytes --]
On Sat, 14 Jun 2014 11:54:58 +0200 "ptschack ." <ptschack@googlemail.com>
wrote:
> Hi Neil,
>
> I ran mdadm --examine on the devices (as opposed to the partitions)
> and was surprised:
> Apparently the superblocks are present on some of the devices, even though I
> remember it differently.
> I attached the output of mdadm to this mail.
>
> For the sake of clarity, this is what the RAID looked like before it
> all went wrong:
>
> 11 drives (/dev/sd[abcdefghijk]) altogether as a RAID 6.
> 2 of those are spares (/dev/sd[jk]).
> 1 is either close to failing or has failed (/dev/sdg).
>
> I tried
>
> mdadm --assemble --run /dev/md0 -v /dev/sd[abcdefghi]
>
> which gave me
>
> mdadm: looking for devices for /dev/md0
> mdadm: no RAID superblock on /dev/sdg
> mdadm: /dev/sdg has no superblock - assembly aborted
>
> So it seems I somehow have to restore the superblocks on drives
> /dev/sd[ghijk], or at least on /dev/sd[ghi].
> Is this possible? Any help would be greatly appreciated!
>
> Regards,
> -P.
Well, you've definitely made progress. You've found 6 of the devices.
They all look consistent and it appears the array was completely coherent at
Mon Jun 9 21:52:48 2014
You think that the 7th device is dead or dying, so you just need to find 2
more (1 would do).
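The arithmetic behind "2 more (1 would do)": RAID6 keeps running with up to two members absent, so a 9-device array needs at least 7. A small sketch of that count; raid6_members_missing is a name invented here:

```shell
# Hypothetical helper: how many more members a RAID6 array needs before it can start.
# $1 = Raid Devices, $2 = members found with a usable superblock.
raid6_members_missing() {
    need=$(( $1 - 2 - $2 ))   # RAID6 tolerates two absent members
    if [ "$need" -lt 0 ]; then
        need=0
    fi
    echo "$need"
}

raid6_members_missing 9 6   # prints 1
```

With six members found, one more is enough to start the array degraded; a second one would restore some redundancy without relying on the failing sdg.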
Presumably these are sdh and sdi, but it is very strange that we cannot find
the superblock on either of them.
When was the last time the machine was rebooted prior to the date given
(9th Jun)?
Do you have boot logs from that time? Which lines contain 'md'?
In particular, "bind" lines will show you exactly which devices were included.
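Pulling the device names out of those "bind" lines can be scripted. A minimal sketch, assuming kernel messages of the form "md: bind<sda>" and a log at /var/log/kern.log (the path is an assumption; md_bound_devices is a made-up name):

```shell
# Hypothetical helper: list the devices named in "md: bind<...>" kernel log lines on stdin.
md_bound_devices() {
    sed -n 's/.*md: bind<\([^>]*\)>.*/\1/p'
}

# Assumed usage:
#   md_bound_devices < /var/log/kern.log
```

Names without a partition number (sda rather than sda1) would indicate that the array was built on whole devices.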
Maybe also try
od -x /dev/sdh | grep '4efc a92b'
If the superblock is at some strange location, that might find it.
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
* Re: Help with corrupted MDADM Raid6
2014-06-14 10:31 ` NeilBrown
@ 2014-06-14 11:19 ` ptschack .
2014-06-14 12:06 ` NeilBrown
0 siblings, 1 reply; 8+ messages in thread
From: ptschack . @ 2014-06-14 11:19 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 3044 bytes --]
Hi Neil,
Regrettably, I do not have logs from Jun 9th. This is what happened, in detail:
Before I grew the RAID, I made a backup of the system drive (sometime
around the beginning of May). Then I grew the RAID and the dm-crypt
container on it.
I then noticed that ext4 filesystems cannot be grown above a certain
limit, which is why I decided to convert to BTRFS.
Prior to Jun 9th I upgraded Ubuntu from 12.04 LTS to 14.04 LTS. The
reason was that I wanted the newest BTRFS utils for the conversion.
The conversion went smoothly, but the Ubuntu upgrade messed with some
services running on the server (e.g. various configs for web apps,
nothing to do with the raid). So I wanted to do a fresh install. I
didn't do a backup of the system, because I had the old backup which
had worked before.
I attempted the fresh install, looking at the disks with GParted
beforehand (as I said earlier, my theory is that GParted might have
messed up some of the md superblocks).
So after the fresh install, I wasn't able to start the RAID (error
message was input/output error).
So I thought I'd just restore the old backup, since that had worked
perfectly, and then make my way from there.
After the restore, the system asked me if I wanted to start a degraded
RAID. I thought it meant the RAID was degraded because of the failing
drive, and said yes.
It then showed me a RAID with 6 drives, all spares. At this point the
panic started to set in :(
I have attached some log excerpts from the beginning of May, before I
made the backup, when the old RAID was still functioning (kern.log and
syslog, grepped for 'md').
Furthermore, searching for the superblock with od gave me the following:
od -x /dev/sdh | grep '4efc a92b'
20234525260 8a2a c251 a28b 2f92 f63e 8d72 4efc a92b
103362752200 4efc a92b 3412 ad92 b451 bc40 5897 d215
od -x /dev/sdi | grep '4efc a92b'
135674640060 4efc a92b 89de a9d8 d2b8 395e 6f37 4597
I don't think those are the superblocks, but rather the "magic number"
being present somewhere on the drive :(
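One way to back up that suspicion: od prints its offsets in octal, and a real md superblock should start on a sector boundary, so its magic would sit at a byte offset that is a multiple of 512. A rough sketch that checks the three hits above under that assumption (`sector_remainder` is a made-up helper name):

```shell
# Position of a match inside its 512-byte sector, given a line from `od -x`.
#   $1 = offset column printed by od (octal)
#   $2 = index (0-7) of the 16-bit word on that line where the magic starts
sector_remainder() {
    # A leading 0 makes the shell read the offset as an octal constant.
    echo $(( (0$1 + 2 * $2) % 512 ))
}

sector_remainder 20234525260 6     # first hit on sdh
sector_remainder 103362752200 0    # second hit on sdh
sector_remainder 135674640060 0    # hit on sdi
```

None of the three results is 0, so none of the matches starts on a sector boundary, which fits the reading that these are coincidental byte patterns in the data rather than superblocks.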
Doing further research I found this:
http://kevin.deldycke.com/2007/03/how-to-recover-a-raid-array-after-having-zero-ized-superblocks/
Is there any "safe" way to restore the superblocks, or is re-creating
the RAID my final option?
Thanks again,
-P.
> Well, you've definitely made progress. You've found 6 of the devices.
> They all look consistent and it appears the array was completely coherent at
> Mon Jun 9 21:52:48 2014
>
> You think that the 7th device is dead or dying, so you just need to find 2
> more (1 would do).
>
> Presumably these are sdh and sdi, but it is very strange that we cannot find
> the superblock on either of them.
> When was the last time the machine was rebooted prior to the date given
> (9th Jun)?
> Do you have boot logs from that time? Which lines contain 'md'?
> Particularly "bind" lines will show you exactly which devices were included.
>
> Maybe also try
>
> od -x /dev/sdh | grep '4efc a92b'
>
> If the superblock is at some strange location, that might find it.
>
> NeilBrown
>
[-- Attachment #2: syslog.txt --]
[-- Type: text/plain, Size: 5792 bytes --]
May 3 15:07:34 brain kernel: [ 1.778123] md: bind<sda>
May 3 15:07:34 brain kernel: [ 2.434201] md: bind<sdb>
May 3 15:07:34 brain kernel: [ 2.459872] md: bind<sdc>
May 3 15:07:34 brain kernel: [ 3.178179] md: bind<sdd>
May 3 15:07:34 brain kernel: [ 3.808668] md: bind<sdf>
May 3 15:07:34 brain kernel: [ 3.818330] md: bind<sdg>
May 3 15:07:34 brain kernel: [ 3.993520] md: bind<sdl>
May 3 15:07:34 brain kernel: [ 4.496568] md: bind<sdi>
May 3 15:07:34 brain kernel: [ 4.499792] md: bind<sdj>
May 3 15:07:34 brain kernel: [ 4.503025] md: bind<sdh>
May 3 15:07:34 brain kernel: [ 4.749289] md: raid6 personality registered for level 6
May 3 15:07:34 brain kernel: [ 4.750109] md: raid5 personality registered for level 5
May 3 15:07:34 brain kernel: [ 4.750894] md: raid4 personality registered for level 4
May 3 15:07:34 brain kernel: [ 4.752556] md/raid:md0: device sdh operational as raid disk 8
May 3 15:07:34 brain kernel: [ 4.753329] md/raid:md0: device sdj operational as raid disk 6
May 3 15:07:34 brain kernel: [ 4.754071] md/raid:md0: device sdi operational as raid disk 7
May 3 15:07:34 brain kernel: [ 4.754803] md/raid:md0: device sdg operational as raid disk 5
May 3 15:07:34 brain kernel: [ 4.755525] md/raid:md0: device sdf operational as raid disk 4
May 3 15:07:34 brain kernel: [ 4.756247] md/raid:md0: device sdd operational as raid disk 3
May 3 15:07:34 brain kernel: [ 4.756965] md/raid:md0: device sdc operational as raid disk 2
May 3 15:07:34 brain kernel: [ 4.757675] md/raid:md0: device sdb operational as raid disk 1
May 3 15:07:34 brain kernel: [ 4.758368] md/raid:md0: device sda operational as raid disk 0
May 3 15:07:34 brain kernel: [ 4.759359] md/raid:md0: allocated 9616kB
May 3 15:07:34 brain kernel: [ 4.760128] md/raid:md0: raid level 6 active with 9 out of 9 devices, algorithm 2
May 3 15:07:34 brain kernel: [ 4.760944] created bitmap (22 pages) for device md0
May 3 15:07:34 brain kernel: [ 4.762166] md0: bitmap initialized from disk: read 2 pages, set 0 of 44711 bits
May 3 15:07:34 brain kernel: [ 4.775618] md0: detected capacity change from 0 to 21003210260480
May 3 15:07:34 brain kernel: [ 4.785806] md0: unknown partition table
May 3 15:07:34 brain kernel: [ 4.787191] md: bind<sdk>
May 3 15:07:34 brain kernel: [ 5.563969] md: linear personality registered for level -1
May 3 15:07:34 brain kernel: [ 5.564882] md: multipath personality registered for level -4
May 3 15:07:34 brain kernel: [ 5.565700] md: raid0 personality registered for level 0
May 3 15:07:34 brain kernel: [ 5.566577] md: raid1 personality registered for level 1
May 3 15:07:34 brain kernel: [ 5.569815] md: raid10 personality registered for level 10
May 3 15:07:42 brain kernel: [ 15.024766] type=1400 audit(1399122462.862:10): apparmor="STATUS" operation="profile_load" name="/usr/sbin/clamd" pid=2129 comm="apparmor_parser"
May 3 16:30:25 brain kernel: [ 1.148254] md: bind<sda>
May 3 16:30:25 brain kernel: [ 1.171116] md: bind<sdb>
May 3 16:30:25 brain kernel: [ 1.190905] md: bind<sdc>
May 3 16:30:25 brain kernel: [ 1.210995] md: bind<sdd>
May 3 16:30:25 brain kernel: [ 1.218889] md: bind<sdf>
May 3 16:30:25 brain kernel: [ 1.235742] md: bind<sde>
May 3 16:30:25 brain kernel: [ 1.796822] md: bind<sdk>
May 3 16:30:25 brain kernel: [ 2.300024] md: bind<sdh>
May 3 16:30:25 brain kernel: [ 2.307113] md: bind<sdj>
May 3 16:30:25 brain kernel: [ 2.310379] md: bind<sdg>
May 3 16:30:25 brain kernel: [ 2.313719] md: bind<sdi>
May 3 16:30:25 brain kernel: [ 2.559622] md: raid6 personality registered for level 6
May 3 16:30:25 brain kernel: [ 2.560418] md: raid5 personality registered for level 5
May 3 16:30:25 brain kernel: [ 2.561176] md: raid4 personality registered for level 4
May 3 16:30:25 brain kernel: [ 2.562814] md/raid:md0: device sdi operational as raid disk 6
May 3 16:30:25 brain kernel: [ 2.563557] md/raid:md0: device sdg operational as raid disk 8
May 3 16:30:25 brain kernel: [ 2.564277] md/raid:md0: device sdh operational as raid disk 7
May 3 16:30:25 brain kernel: [ 2.564985] md/raid:md0: device sde operational as raid disk 4
May 3 16:30:25 brain kernel: [ 2.565686] md/raid:md0: device sdf operational as raid disk 5
May 3 16:30:25 brain kernel: [ 2.566378] md/raid:md0: device sdd operational as raid disk 3
May 3 16:30:25 brain kernel: [ 2.567067] md/raid:md0: device sdc operational as raid disk 2
May 3 16:30:25 brain kernel: [ 2.567749] md/raid:md0: device sdb operational as raid disk 1
May 3 16:30:25 brain kernel: [ 2.568409] md/raid:md0: device sda operational as raid disk 0
May 3 16:30:25 brain kernel: [ 2.569355] md/raid:md0: allocated 9616kB
May 3 16:30:25 brain kernel: [ 2.570077] md/raid:md0: raid level 6 active with 9 out of 9 devices, algorithm 2
May 3 16:30:25 brain kernel: [ 2.570869] created bitmap (22 pages) for device md0
May 3 16:30:25 brain kernel: [ 2.572057] md0: bitmap initialized from disk: read 2 pages, set 0 of 44711 bits
May 3 16:30:25 brain kernel: [ 2.595329] md0: detected capacity change from 0 to 21003210260480
May 3 16:30:25 brain kernel: [ 2.601758] md0: unknown partition table
May 3 16:30:25 brain kernel: [ 3.183200] md: linear personality registered for level -1
May 3 16:30:25 brain kernel: [ 3.184010] md: multipath personality registered for level -4
May 3 16:30:25 brain kernel: [ 3.184764] md: raid0 personality registered for level 0
May 3 16:30:25 brain kernel: [ 3.185666] md: raid1 personality registered for level 1
May 3 16:30:25 brain kernel: [ 3.188831] md: raid10 personality registered for level 10
[-- Attachment #3: kern.log --]
[-- Type: text/x-log, Size: 5970 bytes --]
May 3 15:07:34 brain kernel: [ 1.778123] md: bind<sda>
May 3 15:07:34 brain kernel: [ 2.434201] md: bind<sdb>
May 3 15:07:34 brain kernel: [ 2.459872] md: bind<sdc>
May 3 15:07:34 brain kernel: [ 3.178179] md: bind<sdd>
May 3 15:07:34 brain kernel: [ 3.808668] md: bind<sdf>
May 3 15:07:34 brain kernel: [ 3.818330] md: bind<sdg>
May 3 15:07:34 brain kernel: [ 3.993520] md: bind<sdl>
May 3 15:07:34 brain kernel: [ 4.496568] md: bind<sdi>
May 3 15:07:34 brain kernel: [ 4.499792] md: bind<sdj>
May 3 15:07:34 brain kernel: [ 4.503025] md: bind<sdh>
May 3 15:07:34 brain kernel: [ 4.749289] md: raid6 personality registered for level 6
May 3 15:07:34 brain kernel: [ 4.750109] md: raid5 personality registered for level 5
May 3 15:07:34 brain kernel: [ 4.750894] md: raid4 personality registered for level 4
May 3 15:07:34 brain kernel: [ 4.752556] md/raid:md0: device sdh operational as raid disk 8
May 3 15:07:34 brain kernel: [ 4.753329] md/raid:md0: device sdj operational as raid disk 6
May 3 15:07:34 brain kernel: [ 4.754071] md/raid:md0: device sdi operational as raid disk 7
May 3 15:07:34 brain kernel: [ 4.754803] md/raid:md0: device sdg operational as raid disk 5
May 3 15:07:34 brain kernel: [ 4.755525] md/raid:md0: device sdf operational as raid disk 4
May 3 15:07:34 brain kernel: [ 4.756247] md/raid:md0: device sdd operational as raid disk 3
May 3 15:07:34 brain kernel: [ 4.756965] md/raid:md0: device sdc operational as raid disk 2
May 3 15:07:34 brain kernel: [ 4.757675] md/raid:md0: device sdb operational as raid disk 1
May 3 15:07:34 brain kernel: [ 4.758368] md/raid:md0: device sda operational as raid disk 0
May 3 15:07:34 brain kernel: [ 4.759359] md/raid:md0: allocated 9616kB
May 3 15:07:34 brain kernel: [ 4.760128] md/raid:md0: raid level 6 active with 9 out of 9 devices, algorithm 2
May 3 15:07:34 brain kernel: [ 4.760944] created bitmap (22 pages) for device md0
May 3 15:07:34 brain kernel: [ 4.762166] md0: bitmap initialized from disk: read 2 pages, set 0 of 44711 bits
May 3 15:07:34 brain kernel: [ 4.775618] md0: detected capacity change from 0 to 21003210260480
May 3 15:07:34 brain kernel: [ 4.785806] md0: unknown partition table
May 3 15:07:34 brain kernel: [ 4.787191] md: bind<sdk>
May 3 15:07:34 brain kernel: [ 5.563969] md: linear personality registered for level -1
May 3 15:07:34 brain kernel: [ 5.564882] md: multipath personality registered for level -4
May 3 15:07:34 brain kernel: [ 5.565700] md: raid0 personality registered for level 0
May 3 15:07:34 brain kernel: [ 5.566577] md: raid1 personality registered for level 1
May 3 15:07:34 brain kernel: [ 5.569815] md: raid10 personality registered for level 10
May 3 15:07:42 brain kernel: [ 15.024766] type=1400 audit(1399122462.862:10): apparmor="STATUS" operation="profile_load" name="/usr/sbin/clamd" pid=2129 comm="apparmor_parser"
May 3 16:30:25 brain kernel: [ 1.148254] md: bind<sda>
May 3 16:30:25 brain kernel: [ 1.171116] md: bind<sdb>
May 3 16:30:25 brain kernel: [ 1.190905] md: bind<sdc>
May 3 16:30:25 brain kernel: [ 1.210995] md: bind<sdd>
May 3 16:30:25 brain kernel: [ 1.218889] md: bind<sdf>
May 3 16:30:25 brain kernel: [ 1.235742] md: bind<sde>
May 3 16:30:25 brain kernel: [ 1.796822] md: bind<sdk>
May 3 16:30:25 brain kernel: [ 2.300024] md: bind<sdh>
May 3 16:30:25 brain kernel: [ 2.307113] md: bind<sdj>
May 3 16:30:25 brain kernel: [ 2.310379] md: bind<sdg>
May 3 16:30:25 brain kernel: [ 2.313719] md: bind<sdi>
May 3 16:30:25 brain kernel: [ 2.559622] md: raid6 personality registered for level 6
May 3 16:30:25 brain kernel: [ 2.560418] md: raid5 personality registered for level 5
May 3 16:30:25 brain kernel: [ 2.561176] md: raid4 personality registered for level 4
May 3 16:30:25 brain kernel: [ 2.562814] md/raid:md0: device sdi operational as raid disk 6
May 3 16:30:25 brain kernel: [ 2.563557] md/raid:md0: device sdg operational as raid disk 8
May 3 16:30:25 brain kernel: [ 2.564277] md/raid:md0: device sdh operational as raid disk 7
May 3 16:30:25 brain kernel: [ 2.564985] md/raid:md0: device sde operational as raid disk 4
May 3 16:30:25 brain kernel: [ 2.565686] md/raid:md0: device sdf operational as raid disk 5
May 3 16:30:25 brain kernel: [ 2.566378] md/raid:md0: device sdd operational as raid disk 3
May 3 16:30:25 brain kernel: [ 2.567067] md/raid:md0: device sdc operational as raid disk 2
May 3 16:30:25 brain kernel: [ 2.567749] md/raid:md0: device sdb operational as raid disk 1
May 3 16:30:25 brain kernel: [ 2.568409] md/raid:md0: device sda operational as raid disk 0
May 3 16:30:25 brain kernel: [ 2.569355] md/raid:md0: allocated 9616kB
May 3 16:30:25 brain kernel: [ 2.570077] md/raid:md0: raid level 6 active with 9 out of 9 devices, algorithm 2
May 3 16:30:25 brain kernel: [ 2.570869] created bitmap (22 pages) for device md0
May 3 16:30:25 brain kernel: [ 2.572057] md0: bitmap initialized from disk: read 2 pages, set 0 of 44711 bits
May 3 16:30:25 brain kernel: [ 2.595329] md0: detected capacity change from 0 to 21003210260480
May 3 16:30:25 brain kernel: [ 2.601758] md0: unknown partition table
May 3 16:30:25 brain kernel: [ 3.183200] md: linear personality registered for level -1
May 3 16:30:25 brain kernel: [ 3.184010] md: multipath personality registered for level -4
May 3 16:30:25 brain kernel: [ 3.184764] md: raid0 personality registered for level 0
May 3 16:30:25 brain kernel: [ 3.185666] md: raid1 personality registered for level 1
May 3 16:30:25 brain kernel: [ 3.188831] md: raid10 personality registered for level 10
May 3 16:30:32 brain kernel: [ 13.703548] type=1400 audit(1399127432.542:9): apparmor="STATUS" operation="profile_load" name="/usr/sbin/clamd" pid=2212 comm="apparmor_parser"
* Re: Help with corrupted MDADM Raid6
2014-06-14 11:19 ` ptschack .
@ 2014-06-14 12:06 ` NeilBrown
2014-06-14 17:14 ` ptschack .
0 siblings, 1 reply; 8+ messages in thread
From: NeilBrown @ 2014-06-14 12:06 UTC (permalink / raw)
To: ptschack .; +Cc: linux-raid
[-- Attachment #1: Type: text/plain, Size: 3747 bytes --]
On Sat, 14 Jun 2014 13:19:57 +0200 "ptschack ." <ptschack@googlemail.com>
wrote:
> Hi Neil,
>
> regrettably, I do not have logs from Jun 9th. This is what happened, in Detail:
>
> Before I grew the RAID, I made a backup of the system drive (Sometime
> around the beginning of may). Then I grew the RAID and the dm-crypt
> container on it.
> I then noticed that ext4 filesystems cannot be grown above a certain
> limit, which is why I decided to convert to BTRFS.
> Prior to Jun 9th I upgraded Ubuntu from 12.04 LTS to 14.04 LTS. The
> reason was that I wanted the newest BTRFS utils for the conversion.
> The conversion went smoothly, but the Ubuntu upgrade messed with some
> services running on the server (e.g. various configs for web apps,
> nothing to do with the raid). So I wanted to do a fresh install. I
> didn't do a backup of the system, because I had the old backup which
> had worked before.
>
> I attempted the fresh install, looking at the disks with GParted
> beforehand (as I said earlier, my theory is that GParted might have
> messed up some of the md superblocks).
> So after the fresh install, I wasn't able to start the RAID (error
> message was input/output error).
> So I thought I'll just restore the old backup, since that worked
> perfectly, and then make my way from there.
>
> After the restore, The system asked me if I wanted to start a degraded
> RAID. I thought it meant the raid was degraded because of the failing
> drive, and said yes.
> It then showed me a Raid with 6 Drives, all spares. At this point the
> panic started to set in :(
>
> I have attached some log excerpts from the beginning of may, before I
> made the backup and the old RAID was functioning (kern.log and syslog,
> grepped for 'md').
>
> Furthermore, searching for the superblock with od gave me the following:
>
> od -x /dev/sdh | grep '4efc a92b'
>
> 20234525260 8a2a c251 a28b 2f92 f63e 8d72 4efc a92b
> 103362752200 4efc a92b 3412 ad92 b451 bc40 5897 d215
>
> od -x /dev/sdi | grep '4efc a92b'
>
> 135674640060 4efc a92b 89de a9d8 d2b8 395e 6f37 4597
>
> I don't think those are the superblocks, but rather the "magic number"
> being present somewhere on the drive :(
Yes, I think you are correct.
>
> Doing further research I found this:
> http://kevin.deldycke.com/2007/03/how-to-recover-a-raid-array-after-having-zero-ized-superblocks/
>
> Is there any "safe" way to restore the superblocks, or is re-creating
> the RAID my final option?
It looks like the only option left is to create the array again.
Providing you use --assume-clean and don't add spares, this is fairly safe
and you can try it again if you get it wrong.
It might be good to use 'dd' to back up the first few megabytes of each drive
just to be safe: "mdadm --create" will only overwrite the metadata which is
in the first few K, so maybe that is enough, but more doesn't hurt.
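A minimal sketch of such a backup, assuming 16 MiB per drive counts as "a few megabytes" and that the images are written to a disk outside the array. The function name, the size, and the destination path are all placeholders:

```shell
# Copy the first N MiB of a device (or file) into an image file.
#   $1 = source device, $2 = destination image, $3 = size in MiB (default 16)
backup_head() {
    dd if="$1" of="$2" bs=1M count="${3:-16}" 2>/dev/null
}

# Example loop, left commented out so nothing runs before the names are checked:
# for d in a b c d e f h i; do
#     backup_head "/dev/sd$d" "/root/md-backup/sd$d.head.img"
# done
```

Writing an image back would be the same dd call with source and destination swapped, so the device names deserve a second look before doing that.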
Based on the logs you attached (which did have useful "bind" and
"operational as" lines) the order should be:
sda sdb sdc sdd sde sdf sdi sdh sdg
So something like
mdadm -C /dev/md0 -l6 -n9 -c 64 --assume-clean \
--data-offset=262144s /dev/sd{a,b,c,d,e,f,i,h} missing
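For reference, the device order can be read off mechanically from the "operational as raid disk" lines of the most recent boot in the attached log. A sketch, assuming the log format shown in the attachments; `order_from_log` is a made-up helper that reads log lines on stdin:

```shell
# Print device names sorted by raid disk number, given one boot's worth of
# "md/raid:md0: device sdX operational as raid disk N" lines on stdin.
order_from_log() {
    sed -n 's/.*device \(sd[a-z]*\) operational as raid disk \([0-9]*\).*/\2 \1/p' |
        sort -n |
        awk '{ printf "%s%s", sep, $2; sep = " " } END { print "" }'
}

# Lines copied from the 16:30:25 boot in the attached log.
sample_log='May 3 16:30:25 brain kernel: [ 2.562814] md/raid:md0: device sdi operational as raid disk 6
May 3 16:30:25 brain kernel: [ 2.563557] md/raid:md0: device sdg operational as raid disk 8
May 3 16:30:25 brain kernel: [ 2.564277] md/raid:md0: device sdh operational as raid disk 7
May 3 16:30:25 brain kernel: [ 2.564985] md/raid:md0: device sde operational as raid disk 4
May 3 16:30:25 brain kernel: [ 2.565686] md/raid:md0: device sdf operational as raid disk 5
May 3 16:30:25 brain kernel: [ 2.566378] md/raid:md0: device sdd operational as raid disk 3
May 3 16:30:25 brain kernel: [ 2.567067] md/raid:md0: device sdc operational as raid disk 2
May 3 16:30:25 brain kernel: [ 2.567749] md/raid:md0: device sdb operational as raid disk 1
May 3 16:30:25 brain kernel: [ 2.568409] md/raid:md0: device sda operational as raid disk 0'

printf '%s\n' "$sample_log" | order_from_log
```

The two boots in the log already assign different names to the same slots, so the names in any derived order should be checked against the current boot before they are passed to mdadm.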
Then try 'fsck -n' or similar. If that looks good, try
echo check > /sys/block/md0/md/sync_action
and when that has finished, check that "mismatch_cnt" is small.
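Sketched as a small script, assuming the array is /dev/md0 as above. The helper name and the 30-second polling interval are arbitrary choices:

```shell
# Wait for the current sync action to finish, then print mismatch_cnt.
#   $1 = md sysfs directory, e.g. /sys/block/md0/md
#   $2 = polling interval in seconds (default 30)
wait_and_report_mismatches() {
    while [ "$(cat "$1/sync_action")" != "idle" ]; do
        sleep "${2:-30}"
    done
    cat "$1/mismatch_cnt"
}

# Usage on the real array (as root), left commented out:
# echo check > /sys/block/md0/md/sync_action
# wait_and_report_mismatches /sys/block/md0/md
```

Taking the sysfs directory as a parameter keeps the helper usable for an array with a different name and lets it be tried against a dummy directory first.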
If it is all good you should be safe to add another device and let it
rebuild.
Then you can add a bitmap (--grow --bitmap=internal). I wouldn't add the
bitmap until the array seems to be otherwise OK.
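As a rough outline of those last two steps, the helper below only prints the commands instead of running them. /dev/sdX stands for whichever replacement drive is used and is a placeholder, as is the function name:

```shell
# Print (not run) the follow-up commands for an array and a new member device.
#   $1 = array device, $2 = device to add
print_followup_commands() {
    echo "mdadm $1 --add $2"                  # add the device, rebuild starts
    echo "cat /proc/mdstat"                   # watch the rebuild progress
    echo "mdadm --grow $1 --bitmap=internal"  # only once the array looks OK
}

print_followup_commands /dev/md0 /dev/sdX
```

Printing first makes it easy to double-check the device names before anything is typed as root.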
If the filesystem appears to be badly corrupted, you should stop the array,
and possibly try a different order of devices.
NeilBrown
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
* Re: Help with corrupted MDADM Raid6
2014-06-14 12:06 ` NeilBrown
@ 2014-06-14 17:14 ` ptschack .
0 siblings, 0 replies; 8+ messages in thread
From: ptschack . @ 2014-06-14 17:14 UTC (permalink / raw)
To: NeilBrown; +Cc: linux-raid
Hi Neil,
you are a lifesaver, it worked! The RAID is currently rebuilding and
the data is all there, phew!
In the future I will always keep copies of the superblocks backed up :-P
Send me your PayPal address if you want me to buy you a beer :)
Greetings,
-P.
On Sat, Jun 14, 2014 at 2:06 PM, NeilBrown <neilb@suse.de> wrote:
> On Sat, 14 Jun 2014 13:19:57 +0200 "ptschack ." <ptschack@googlemail.com>
> wrote:
>
>> Hi Neil,
>>
>> regrettably, I do not have logs from Jun 9th. This is what happened, in Detail:
>>
>> Before I grew the RAID, I made a backup of the system drive (Sometime
>> around the beginning of may). Then I grew the RAID and the dm-crypt
>> container on it.
>> I then noticed that ext4 filesystems cannot be grown above a certain
>> limit, which is why I decided to convert to BTRFS.
>> Prior to Jun 9th I upgraded Ubuntu from 12.04 LTS to 14.04 LTS. The
>> reason was that I wanted the newest BTRFS utils for the conversion.
>> The conversion went smoothly, but the Ubuntu upgrade messed with some
>> services running on the server (e.g. various configs for web apps,
>> nothing to do with the raid). So I wanted to do a fresh install. I
>> didn't do a backup of the system, because I had the old backup which
>> had worked before.
>>
>> I attempted the fresh install, looking at the disks with GParted
>> beforehand (as I said earlier, my theory is that GParted might have
>> messed up some of the md superblocks).
>> So after the fresh install, I wasn't able to start the RAID (error
>> message was input/output error).
>> So I thought I'll just restore the old backup, since that worked
>> perfectly, and then make my way from there.
>>
>> After the restore, The system asked me if I wanted to start a degraded
>> RAID. I thought it meant the raid was degraded because of the failing
>> drive, and said yes.
>> It then showed me a Raid with 6 Drives, all spares. At this point the
>> panic started to set in :(
>>
>> I have attached some log excerpts from the beginning of may, before I
>> made the backup and the old RAID was functioning (kern.log and syslog,
>> grepped for 'md').
>>
>> Furthermore, searching for the superblock with od gave me the following:
>>
>> od -x /dev/sdh | grep '4efc a92b'
>>
>> 20234525260 8a2a c251 a28b 2f92 f63e 8d72 4efc a92b
>> 103362752200 4efc a92b 3412 ad92 b451 bc40 5897 d215
>>
>> od -x /dev/sdi | grep '4efc a92b'
>>
>> 135674640060 4efc a92b 89de a9d8 d2b8 395e 6f37 4597
>>
>> I don't think those are the superblocks, but rather the "magic number"
>> being present somewhere on the drive :(
>
> Yes, I think you are correct.
>
>>
>> Doing further research I found this:
>> http://kevin.deldycke.com/2007/03/how-to-recover-a-raid-array-after-having-zero-ized-superblocks/
>>
>> Is there any "safe" way to restore the superblocks, or is re-creating
>> the RAID my final option?
>
> It looks like the only option left is to create the array again.
> Providing you use --assume-clean and don't add spares, this is fairly safe
> and you can try it again if you get it wrong.
>
> It might be good to use 'dd' to backup the first few megabytes of each drive
> just to be safe: "mdadm --create" will only overwrite the metadata which is
> in the first few K, so maybe that is enough, but more doesn't hurt.
>
> Based on the logs use attached (which did have useful "bind" and
> "operational as" lines) the order should be:
>
> sda sdb sdc sdd sde sdf sdi sdh sdg
>
> So something like
> mdadm -C /dev/md0 -l6 -n9 -c 64 --assume-clean \
> --data-offset=262144s /dev/sd{a,b,c,d,e,f,i,h} missing
>
> Then try 'fsck -n' or similar. If that looks good, try
> echo check > /sys/block/md0/md/sync_action
> and when that finished, check that "mismatch_cnt" is small.
>
> If it is all good you should be safe to add another device and let it
> rebuild.
>
> Then you can add a bitmap (--grow --bitmap=internal). I wouldn't add the
> bitmap until the array seems to be otherwise OK.
>
> If the filesystem appears to be badly corrupted, you should stop the array,
> and possibly try a different order of devices.
>
> NeilBrown
Thread overview: 8+ messages
2014-06-13 8:34 Help with corrupted MDADM Raid6 ptschack .
2014-06-13 10:25 ` NeilBrown
2014-06-13 10:53 ` ptschack .
2014-06-14 9:54 ` ptschack .
2014-06-14 10:31 ` NeilBrown
2014-06-14 11:19 ` ptschack .
2014-06-14 12:06 ` NeilBrown
2014-06-14 17:14 ` ptschack .