Re: The mysterious case of the disappearing superblock ...

From: Roger Heflin <rogerheflin@gmail.com>
To: NeilBrown <neilb@suse.de>
Cc: anthony <antmbox@youngman.org.uk>,
	Linux RAID <linux-raid@vger.kernel.org>,
	Phil Turmel <philip@turmel.org>
Subject: Re: The mysterious case of the disappearing superblock ...
Date: Fri, 21 Jan 2022 11:04:37 -0600
Message-ID: <CAAMCDecC0+8DGqagttoqaEra=fT=qnoS_cwzpXM+xWcOUT+fPw@mail.gmail.com>
In-Reply-To: <164254680952.24166.7553126422166310408@noble.neil.brown.name>

I would first look for the superblock magic Neil mentions.  When a PV,
filesystem or other data volume goes missing, the usual cause is that
something like the partition start moved, so the magic is now either
outside the given partition or at the wrong offset within it.  So you
may want to take one disk and scan a wide range to see if you can find
it.  If you find it on that disk, you then have an idea where it may
be on the others.
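
A rough sketch of such a scan, in Python.  It looks for the 8 bytes
Neil describes (magic FC 4E 2B A9 followed by major version
01 00 00 00).  The function name, the 512-byte step and the chunk size
are my own assumptions for illustration, not anything mdadm provides.
Open the disk or an image of it read-only, and treat a hit as a
candidate only.

```python
import os

# First 8 bytes of a v1.x md superblock as described in this thread:
# magic 0xa92b4efc stored little-endian, then major version 1 (32-bit LE).
MD_SIGNATURE = bytes.fromhex("fc4e2ba9" "01000000")


def find_md_magic(path, start=0, length=None, step=512):
    """Return byte offsets (start + k*step) where the signature begins.

    `step` is an assumed alignment (512-byte sectors); `start` should be
    a multiple of it.  Hypothetical helper, not part of mdadm.
    """
    if step < len(MD_SIGNATURE):
        raise ValueError("step must be at least 8 bytes")
    hits = []
    with open(path, "rb") as dev:          # read-only, never writes
        if length is None:
            length = dev.seek(0, os.SEEK_END) - start
        end = start + length
        offset = start
        dev.seek(start)
        while offset < end:
            # Read a whole number of steps so candidates stay aligned.
            chunk = dev.read(min(step * 2048, end - offset))
            if not chunk:
                break
            for i in range(0, len(chunk), step):
                if chunk[i:i + 8] == MD_SIGNATURE:
                    hits.append(offset + i)
            offset += len(chunk)
    return hits
```

Something like find_md_magic("/dev/sda") (or a dd image of the disk)
would list the candidate offsets; anything it finds still needs a look
with od -x before you trust it.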

Since one of them is sda4, is that the last partition?  If it is not
the last, are you missing any other partitions?

I have never seen a disk that disappeared for no reason.  I have always
been able to find something pointing to what the human error was.  A
large part of being able to do that is that the machines/teams I
oversee have weekly data collects, similar to sosreport, of the active
kernel tables and config files, so I can see that before the reboot the
partition table was not where it is after boot.  The fix is then
usually as simple as restoring the partition table to match where it
was, and then all is good.  Even without that you can look for the
header magic and from that work out where the partition-table entry
for that partition should start.  I oversee about 20k systems, with
countless different hands of various experience levels doing work on
them, so I have seen pretty much every variation of issue, and I have
always been able to find evidence of a root cause.
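
To make the arithmetic concrete, here is a small Python sketch.  It
assumes a 1.2 superblock (4K into the member device, as confirmed
further down) and 512-byte logical sectors; the helper name and the
example offset are made up for illustration.

```python
SECTOR = 512       # assumed logical sector size
SB_OFFSET = 4096   # a 1.2 superblock sits 4K into the member device


def partition_start_sector(magic_offset, sb_offset=SB_OFFSET, sector=SECTOR):
    """Turn the byte offset of the magic on the whole disk into the
    sector at which the partition would have to start.  Hypothetical
    helper, not part of mdadm."""
    start = magic_offset - sb_offset
    if start < 0:
        raise ValueError("magic is too close to the start of the disk")
    if start % sector:
        raise ValueError("implied partition start is not sector-aligned")
    return start // sector


# Example: magic found 1052672 bytes into the disk (a made-up offset).
print(partition_start_sector(1052672))  # 2048
```

Compare the result with what the current partition table says for that
partition; a mismatch is the evidence you are looking for.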

On Fri, Jan 21, 2022 at 5:13 AM NeilBrown <neilb@suse.de> wrote:
>
> On Wed, 19 Jan 2022, anthony wrote:
> > You all know the story of how the cobbler's children are the worst shod,
> > I expect :-) Well, the superblock to my raid (containing /home, etc) has
> > disappeared, and I don't have a backup ... (well I do but it's now well
> > out of date).
> >
> > So, a new hard drive is on order, for backup ...
> >
> > Firstly, given that superblocks seem to disappear every now and then,
> > does anybody have any ideas for something that might help us track it
> > down? The 1.2 superblock is 4K into the device I believe? So if I copy
> > the first 8K ( dd if=/dev/sda4 of=sda4.img bs=4K count=2 ) of each
> > partition, that might help provide any clues as to what's happened to
> > it? What am I looking for? What is the superblock supposed to look like?
>
> Yes, 4K offset.  Yes, that dd command will get what you want it to.
> It hardly matters what the superblock should look like, because it
> won't be there.  The thing you want to know is: what is there?
> i.e.  you see random bytes and need to guess what they mean, so you can
> guess where they came from.
> Best to post the "od -x" output and crowd-source.
>
> Are you sure the partition starts haven't changed? Was the array made of
> whole-devices or of partitions?
>
> If you want to find out if the superblock got moved, then maybe searching
> for the magic number is best.
> Look at the start of super1.c in mdadm.  The first 4 bytes of the
> superblock are 0xa92b4efc little-endian.  So: FC 4E 2B A9
> The next 4 bytes are 01 00 00 00 ( the major version)
> Then the feature map - possibly 0.  Then 4 zero bytes.
>
> If you see something that looks like that, it is worth trying to point
> mdadm at it.  Create a loop device over it with an appropriate
> offset, and ask mdadm --examine to look at it.
>
>
> >
> > Secondly, once I've backed up my partitions, I obviously need to do
> > --create --assume-clean ... The only snag is, the array has been
> > rebuilt, so I doubt my data offset is the default. The history of the
> > array is simple. It's pretty new, so it will have been created with the
> > latest mdadm, and was originally a mirror of sda4 and sdb4.
> >
> > A new drive was added and the array upgraded to raid-5, and I BELIEVE
> > the order is sdc4, sda4, sdb1 - sdb1 being the new drive that was added.
> >
> > Am I safe to assume that sdc4 and sda4 will have the same data offset?
> > What is it likely to be? And seeing as it was the last added am I safe
> > to assume that sdb1 is the last drive, so all I have to do is see which
> > way round the other two should be?
>
> I would suggest creating some sparse files the same size as the device,
> create loop devices over them, and creating the array in the sequence
> you remember doing it - using "--assume-clean" to avoid rebuilds that
> would make those sparse files less sparse.
> Then look at the metadata written and assume it will be similar to
> that which was written to your array.
>
> NeilBrown
>
>
> >
> > At least the silver lining behind this, is that having been forced to
> > recover my own array, I'll understand it much better helping other
> > people recover theirs!
> >
> > Cheers,
> > Wol
> >
> >
