Re: RAID5 crashed for unknown reason on old 2.6.16 kernel

linux-raid.vger.kernel.org archive mirror

From: Neil Brown <neilb@suse.de>
To: Markus Hennig <mhennig@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: RAID5 crashed for unknown reason on old 2.6.16 kernel
Date: Tue, 29 Jun 2010 16:50:31 +1000
Message-ID: <20100629165031.6c96a635@notabene.brown>
In-Reply-To: <AANLkTik3OIS0AKHZaBHEuqi9NNFKYQ7sWh_iZecirJgS@mail.gmail.com>

On Mon, 28 Jun 2010 17:29:37 +0200
Markus Hennig <mhennig@gmail.com> wrote:

> Hi all,
> 
> for the (unlikely) case somebody is interested in a last update:
> 
> I learned in the meantime that the UUID as well as the superblock version
> is part of the checksum, and that the checksum is calculated on the
> first 1 KB of the 4 KB version-0.90 superblock.
> (https://raid.wiki.kernel.org/index.php/RAID_superblock_formats#The_version-0.90_Superblock_Format)
> 
> Via hexedit I set the UUID on HDD2 back to the correct value and also
> changed the version information from 0.91.00 (0x5B) to 0.90 (0x5A).
> With that done, the checksum was correct and equal to the expected one.
> 
> mdadm --assemble then worked like a charm and my RAID5 is back.

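For readers wondering why restoring the UUID and the version word brought the
checksum back in line: with an additive checksum, any changed word inside the
covered region changes the result. The sketch below is a minimal example of a
folded 32-bit word sum with the checksum field counted as zero. The function
name, the position of the checksum word, the size of the covered region and the
sample UUID word are all assumptions made for illustration; they are not taken
from this thread or from the md source.

```python
import struct


def folded_word_sum(raw, csum_word, byteorder=">"):
    """Add up the 32-bit words in raw, counting word csum_word as zero,
    then fold any carry above 32 bits back into the low 32 bits."""
    nwords = len(raw) // 4
    words = list(struct.unpack("%s%dI" % (byteorder, nwords), raw[:nwords * 4]))
    words[csum_word] = 0
    total = sum(words)
    return ((total & 0xFFFFFFFF) + (total >> 32)) & 0xFFFFFFFF


# Toy 16-word block; word 15 stands in for the checksum field (hypothetical).
CSUM_WORD = 15
good_words = [0xA92B4EFC, 0, 0x5A, 0, 0, 0x12345678] + [0] * 10  # UUID word is made up
bad_words = list(good_words)
bad_words[2] = 0x5B          # version word changed from 0x5A to 0x5B
bad_words[5] = 0xFFFFFFFF    # UUID word overwritten with all ones

good = struct.pack(">16I", *good_words)
bad = struct.pack(">16I", *bad_words)

print(hex(folded_word_sum(good, CSUM_WORD)))
print(hex(folded_word_sum(bad, CSUM_WORD)))
```

The two printed values differ, which is the same effect as the "Checksum : ...
- expected ..." mismatch reported for HDD2: the stored checksum no longer
matched the sum over the altered words.
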
Thanks for letting us know the resolution.
I cannot imagine how all those '1's got into the metadata where they
shouldn't be.
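
To make the stray '1' bits visible, the 64 bytes from the hexdump quoted further
down can be decoded as sixteen big-endian 32-bit words. This is only a sketch:
the field names are assumed to follow the version-0.90 layout described on the
wiki page linked in the quoted message, and the big-endian byte order is
assumed from the note that the Buffalo box is big-endian.

```python
import struct

# The first four rows of the hexdump of the HDD2 superblock, as quoted below.
HEXDUMP_ROWS = """
a9 2b 4e fc 00 00 00 00  00 00 00 5b 00 00 00 00
00 00 00 00 ff ff ff ff  41 a0 de f0 00 00 00 05
0e 83 39 c0 00 00 00 04  00 00 00 04 00 00 00 01
00 00 00 00 ff ff ff ff  ff ff ff ff ff ff ff ff
"""

# Assumed names for the first 16 words of a version-0.90 superblock.
FIELD_NAMES = [
    "md_magic", "major_version", "minor_version", "patch_version",
    "gvalid_words", "set_uuid0", "ctime", "level",
    "size", "nr_disks", "raid_disks", "md_minor",
    "not_persistent", "set_uuid1", "set_uuid2", "set_uuid3",
]

raw = bytes.fromhex("".join(HEXDUMP_ROWS.split()))
header = dict(zip(FIELD_NAMES, struct.unpack(">16I", raw)))

for name in FIELD_NAMES:
    print("%-15s 0x%08x" % (name, header[name]))

# Bits that differ between the 0x5B found on HDD2 and the 0x5A on the others.
print("version bit difference:", bin(header["minor_version"] ^ 0x5A))
```

Read this way, the four UUID words are all ones and the minor version differs
from 0x5A only in its lowest bit, while level (5) and raid_disks (4) still look
sane.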

Based on the update times and event counter, HDD2 was slightly 'older'
than the other devices.  Hopefully nothing had changed on the array in the
intervening time.
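
The comparison behind "slightly older" can be sketched in a few lines, using
the event counts reported in the quoted message (129 on HDD2, 131 on the
others). The helper name is hypothetical, and this is not how mdadm itself
decides which members to accept; it only illustrates the idea of ranking
members by event count.

```python
def split_by_events(events):
    """Return (freshest, stale) device names, judged by event count alone."""
    newest = max(events.values())
    freshest = sorted(dev for dev, count in events.items() if count == newest)
    stale = sorted(dev for dev, count in events.items() if count < newest)
    return freshest, stale


# Event counts as reported in the original post.
events = {"HDD1": 131, "HDD2": 129, "HDD3": 131, "HDD4": 131}
freshest, stale = split_by_events(events)
print("freshest:", freshest)
print("stale:   ", stale)
```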

You should have been able to assemble the array with just the 3 sane devices
and get a degraded RAID5, then add the fourth device and let it recover.
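
As a rough sketch of that alternative, the snippet below only builds and prints
the two mdadm command lines; it does not run them. The device names
(/dev/md0, /dev/loop0 and so on) are hypothetical placeholders for however the
disk images are attached. --assemble with --run and the manage-mode --add are
standard mdadm options, but check the man page of your mdadm version before
running anything against real data.

```python
import shlex


def degraded_assemble_plan(md_device, sane_members, stale_member):
    """Build (but do not execute) the two steps: start the array degraded
    from the sane members, then add the remaining device for recovery."""
    assemble = ["mdadm", "--assemble", "--run", md_device] + list(sane_members)
    add_back = ["mdadm", md_device, "--add", stale_member]
    return [assemble, add_back]


# Hypothetical device names; substitute the real members.
plan = degraded_assemble_plan(
    "/dev/md0",
    ["/dev/loop0", "/dev/loop2", "/dev/loop3"],
    "/dev/loop1",
)
for argv in plan:
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Note that the second step rewrites the added device with reconstructed data, so
it should only be run on a copy or once the degraded array's contents have been
checked.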

However what you did seems to have worked, so if your data looks OK, you
should be safe.

NeilBrown


> 
> That's it,
> Markus
> 
> 
> On Sat, Jun 26, 2010 at 11:22 PM, Markus Hennig <mhennig@gmail.com> wrote:
> > Hi all,
> >
> > my RAID5 with 4 disks crashed on a Buffalo "NAS" box (big-endian!) -
> > no logs of course...
> > I immediately made images of all disks and am now trying to recover my very
> > valuable content on a Linux box running GRML 4/10 (little-endian!)
> > with 2.6.33 and mdadm - v3.1.1.
> > Some blocks were not readable from HDD2; maybe that's the reason why
> > the Buffalo box shut down.
> >
> >
> > What I know already:
> >
> > - the RAID5 was created with a very old set of software:
> > linux-2.6.16-tshtgl.tgz   mdadm-2.5.2.tgz   xfsprogs-2.5.6_arm.tgz
> > - the Buffalo box blinked red on HDD2
> > - the box ran a rebuild on HDD4; I don't know whether it had already finished
> > - all disks are identical, 250 GB
> >
> >
> > Open questions for which I wasn't able to find an answer myself:
> >
> > What triggers the event count? And why is the event counter on HDD2
> > just 129, while on all others it is 131?
> > Can that cause problems while rescuing my data, and how can I work around it?
> >
> >
> > What is that "UUID : ffffffff:ffffffff:ffffffff:ffffffff" on HDD2?
> > What does it mean?
> >
> > It's really in the superblock on the hard disk:
> >  hexdump -s 488006273b -C hdd2_ddrescue
> >  3a2cc50200  a9 2b 4e fc 00 00 00 00  00 00 00 5b 00 00 00 00  |.+N........[....|
> >  3a2cc50210  00 00 00 00 ff ff ff ff  41 a0 de f0 00 00 00 05  |........A.......|
> >  3a2cc50220  0e 83 39 c0 00 00 00 04  00 00 00 04 00 00 00 01  |..9.............|
> >  3a2cc50230  00 00 00 00 ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
> >  3a2cc50240  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> > Would it help to rewrite the UUID via hexedit to the correct one?
> >
> >
> > Can somebody explain the meaning of:
> >  Reshape pos'n : 0
> >      New Level : raid0
> >     New Layout : left-asymmetric
> >  New Chunksize : 0
> > on HDD2 ?
> >
> >
> > What parameters are included in the checksum?
> > And how critical is that "Checksum : b8d2c453 - expected 45703820" on HDD2?
> >
> >
> > I have no explanation why "Version :" on HDD2 shows "0.91.00"...
> > I see 0x5B in the partition 3 superblock on HDD2 (and 0x5A on all
> > others), so it's really on the disk...  Weird...
> > Does anybody have an idea about that?
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

