From: NeilBrown <neilb@suse.de>
To: Oliver Schinagl <oliver+list@schinagl.nl>
Cc: linux-raid@vger.kernel.org
Subject: Re: Recovering from the kernel bug, Neil?
Date: Mon, 10 Sep 2012 09:08:29 +1000 [thread overview]
Message-ID: <20120910090829.19100cdf@notabene.brown> (raw)
In-Reply-To: <504CFA7B.7090606@schinagl.nl>
On Sun, 09 Sep 2012 22:22:19 +0200 Oliver Schinagl <oliver+list@schinagl.nl>
wrote:
> Since I have had no reply as of yet, I wonder: if I were to arbitrarily
> change the data at offset 0x1100 to something that _might_ be right,
> could I horribly break something?
I doubt it would do any good.
I think that editing the metadata by 'hand' is not likely to be a useful
approach. You really want to get 'mdadm --create' to recreate the array with
the correct details. It should be possible to do this, though a little bit
of hacking or careful selection of mdadm version might be required.
What exactly do you know about the array? When you use mdadm to --create
the array, what details does it get wrong?
NeilBrown
>
> oliver
>
> On 08/19/12 15:56, Oliver Schinagl wrote:
> > Hi list,
> >
> > I've once again started to try to repair my broken array. I've tried
> > most things suggested by Neil before (create the array in place whilst
> > keeping data, etc.), only breaking it more (by having too new an mdadm).
> >
> > So instead, I made dd images of sda4 and sdb4, and of sda5 and sdb5
> > (both working raid10 arrays, f2 and o2 layouts). I then compared those
> > to an image of sdb6. Granted, I only used 256 MB worth of data.
> >
> > Using https://raid.wiki.kernel.org/index.php/RAID_superblock_formats I
> > compared my broken sdb6 array to the two working and active arrays.
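
For anyone repeating the comparison, the region in question is small enough to diff mechanically rather than by eye. A minimal sketch (image paths are hypothetical; the v1.2 superblock sits 4 KiB into the partition, and 512 bytes covers the struct plus the dev_roles area):

```python
SB_OFFSET = 0x1000  # v1.2 superblock starts 4 KiB into the member
SB_LEN = 512        # 256-byte struct plus room for the dev_roles area

def diff_superblocks(img_a, img_b):
    """Return the absolute byte offsets at which the superblock
    regions of two dd images differ."""
    with open(img_a, 'rb') as fa, open(img_b, 'rb') as fb:
        fa.seek(SB_OFFSET)
        fb.seek(SB_OFFSET)
        a = fa.read(SB_LEN)
        b = fb.read(SB_LEN)
    return [SB_OFFSET + i for i, (x, y) in enumerate(zip(a, b)) if x != y]
```
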
> >
> > I haven't completely finished comparing, since the wiki falls short at
> > the end, which I think is the more important bit for my situation.
> >
> > Some info about sdb6:
> >
> > /dev/sdb6:
> > Magic : a92b4efc
> > Version : 1.2
> > Feature Map : 0x0
> > Array UUID : cde37e2e:309beb19:3461f3f3:1ea70694
> > Name : valexia:opt (local to host valexia)
> > Creation Time : Sun Aug 28 17:46:27 2011
> > Raid Level : -unknown-
> > Raid Devices : 0
> >
> > Avail Dev Size : 456165376 (217.52 GiB 233.56 GB)
> > Data Offset : 2048 sectors
> > Super Offset : 8 sectors
> > State : active
> > Device UUID : 7b47e9ab:ea4b27ce:50e12587:9c572944
> >
> > Update Time : Mon May 28 20:53:42 2012
> > Checksum : 32e1e116 - correct
> > Events : 1
> >
> >
> > Device Role : spare
> > Array State : ('A' == active, '.' == missing)
> >
> >
> > Now my questions regarding trying to repair this array are the following:
> >
> > At offset 0x10A0, (metaversion 1.2 accounts for the 0x1000 extra) I
> > found on the wiki:
> >
> > "This is shown as "Array Slot" by the mdadm v2.x "--examine" command
> >
> > Note: This is a 32-bit unsigned integer, but the Device-Roles
> > (Positions-in-Array) Area indexes these values using only 16-bit
> > unsigned integers, and reserves the values 0xFFFF as spare and 0xFFFE as
> > faulty, so only 65,534 devices per array are possible."
> >
> > sda4 and sdb4 list this as 02 00 00 00 and 01 00 00 00. Sounds sensible,
> > although I would have expected 0x0 and 0x1, but I'm sure there's some
> > sensible explanation. sda5 and sdb5 however are slightly different, 03
> > 00 00 00 and 02 00 00 00. It quickly shows that, for some coincidental
> > reason, the 'b' parts have a higher number than the 'a' parts. So a
> > 02 00 00 00 on sdb6 (the broken array) should be okay.
> >
> > Then next, is 'resync_offset' at 0x10D0. I think all devices list it as
> > FF FF FF FF, but the broken device has it at 00 00 00 00. Any impact on
> > this one?
> >
> > Then of course there's the 0x10D8 checksum. mdadm currently says it
> > matches, but once I start editing things those probably won't match
> > anymore. Any way around that?
> >
> > Then offset 0x1100 is slightly different for each array. Array sd?5
> > looks like: FE FF FE FF 01 00 00 00
> > Array sd?4 looks similar enough, FE FF 01 00 00 00 FE FF
> >
> > Does this correspond to the 01, 02 and 03 value pairs for 0x10A0?
> >
> > The broken array reads FE FF FE FF FE FF FE, which probably is wrong?
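
Decoding those bytes with the 0xFFFF-spare / 0xFFFE-faulty convention quoted from the wiki above does appear to make them line up with the "Array Slot" values: dev_roles seems to be indexed by dev_number, so a device in slot 2 looks up entry 2. A small sketch (pure decoding, no device access):

```python
SPARE, FAULTY = 0xFFFF, 0xFFFE

def decode_roles(raw: bytes):
    """Decode the dev_roles area (0x1100 on a v1.2 member) as
    little-endian 16-bit entries, indexed by dev_number."""
    roles = []
    for i in range(0, len(raw) - 1, 2):
        v = int.from_bytes(raw[i:i + 2], 'little')
        roles.append({SPARE: 'spare', FAULTY: 'faulty'}.get(v, v))
    return roles

# The sd?4 bytes quoted above: FE FF 01 00 00 00 FE FF
print(decode_roles(bytes([0xFE, 0xFF, 0x01, 0x00, 0x00, 0x00, 0xFE, 0xFF])))
# -> ['faulty', 1, 0, 'faulty']: dev_numbers 1 and 2 hold roles 1 and 0
```
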
> >
> >
> > As for determining whether the first data block is offset, or 'real', I
> > compared dataoffsets 0x100000 - 0x100520-ish and noticed something that
> > looks like s_volume_name and s_last_mounted of ext4. Thus this should be
> > the 'real' first block. Since sdb6 has something that looks a lot like
> > what's on sdb5, 20 80 00 00 20 80 01 00 20 80 02 etc etc at 0x100000
> > this should be the first offset block, correct?
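
A sanity check that avoids eyeballing strings: an ext2/3/4 filesystem keeps its superblock 1024 bytes into the volume, with the 0xEF53 magic 56 bytes into that, so on a member with a 2048-sector data offset the magic should sit at 0x100000 + 1024 + 56. A sketch, where the read_at callback is hypothetical (in practice it would wrap a pread on the device or image):

```python
EXT_MAGIC = 0xEF53
DATA_OFFSET = 2048 * 512   # "Data Offset : 2048 sectors" from --examine
SB_IN_FS = 1024            # ext* superblock offset within the filesystem
MAGIC_IN_SB = 56           # s_magic offset within that superblock

def looks_like_ext_start(read_at) -> bool:
    """read_at(offset, length) -> bytes.  True if the member's first
    data block begins with an ext2/3/4 filesystem."""
    raw = read_at(DATA_OFFSET + SB_IN_FS + MAGIC_IN_SB, 2)
    return int.from_bytes(raw, 'little') == EXT_MAGIC
```
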
> >
> >
> > Assuming I can force somehow that mdadm recognizes my disk as part of an
> > array, and no longer a spare, how does mdadm know which of the two parts
> > it is? 'real' or offset? I haven't bumped into anything that would tell
> > mdadm that bit of information. The data seems to all be still very much
> > available, so I still have hope. I did try making a copy of the entire
> > partition and re-creating the array as missing /dev/loop0 (with loop0
> > being the dd-ed copy), but that didn't work.
> >
> > Finally, would it even be possible to 'restore' the first 127 MB of
> > sda6 (the part that the wrong version of mdadm destroyed by reserving
> > 128 MB for metadata instead of the usual 1 MB) using data from sdb6?
> >
> > Sorry for the long mail, I tried to be complete :)
> >
> > Oliver
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at http://vger.kernel.org/majordomo-info.html
>
Thread overview: 12+ messages
2012-08-19 13:56 Recovering from the kernel bug Oliver Schinagl
2012-09-09 20:22 ` Recovering from the kernel bug, Neil? Oliver Schinagl
2012-09-09 23:08 ` NeilBrown [this message]
2012-09-10 8:44 ` Oliver Schinagl
2012-09-11 6:16 ` NeilBrown
2012-09-14 10:07 ` Oliver Schinagl
2012-09-14 11:51 ` Small short question Was: " Oliver Schinagl
2012-09-14 16:43 ` Small short question Peter Grandi
2012-09-14 20:19 ` Oliver Schinagl
2012-09-20 2:22 ` Small short question Was: Re: Recovering from the kernel bug, Neil? NeilBrown
2012-09-20 17:05 ` Oliver Schinagl
2012-09-20 17:49 ` Chris Murphy