From: Phil Turmel <philip@turmel.org>
To: Fabian Knorr <knorrfab@fim.uni-passau.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: Recovering an Array with inconsistent Superblocks
Date: Sat, 04 Jan 2014 21:32:19 -0500
Message-ID: <52C8C433.5030403@turmel.org>
In-Reply-To: <1388873143.7641.20.camel@vessel>

On 01/04/2014 05:05 PM, Fabian Knorr wrote:
> Hi, Phil,
> 
> thank you very much for your reply.
> 
>> Side note: If you have a live spare available for a raid5, there's no
>> good reason not to reshape to a raid6, and very good reasons to do so.
> 
> I was worried that RAID6 would incur a significant load on the CPU,
> especially if one disk fails. The system is a single-core Intel Atom.

It does add more load, especially when degraded.  I guess it depends on
your usage pattern.  I would try it before I gave up on the idea.

>> Device names are not guaranteed to remain identical from one boot to
>> another.  And often won't be if a removable device is plugged in at that
>> time.  The linux MD driver keeps identity data in the superblock that
>> makes the actual device names immaterial.
>>
>> It is really important that we get a "map" of device names to drive
>> serial numbers, and adjust all future operations to ensure we are
>> working with the correct names.  An excerpt from "ls -l
>> /dev/disk/by-id/" would do.  And you need to verify it after every boot
>> until this crisis is resolved.
> 
> See the attachment "partitions". I grep'ed for raid partitions.
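
For the name-to-serial map, a minimal sketch: it reads the symlinks
under /dev/disk/by-id and prints each partition's current kernel name
next to its by-id name (which normally embeds model and serial).  The
function name map_by_id and its optional directory argument are made up
for illustration; they are not part of mdadm or udev.

```shell
# Print "kernel-name by-id-name" for every partition symlink in a
# by-id style directory.  The directory defaults to /dev/disk/by-id,
# but can be overridden to try the function on a scratch directory.
map_by_id() {
    dir="${1:-/dev/disk/by-id}"
    for link in "$dir"/*-part*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        target=$(readlink "$link")
        # Strip the leading path from both the target and the link name.
        printf '%s %s\n' "${target##*/}" "${link##*/}"
    done | sort
}
```

Running "map_by_id" with no argument after each boot and comparing the
output with the previous run would show whether any names have moved.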
> 
>> 1) raid.status appears to be from *after* your --add attempts.  That
>> means anything in those reports from those devices is useless.  So we
>> will have to figure out what that data was.
> 
> Could it be that --add only changed the superblock of one disk,
> namely /dev/sdb in the file from my first e-mail?

/dev/sda actually.

>> 2) You attempted to recreate the array.  If you left out
>> "--assume-clean", your data is toast.  Please show the precise command
>> line you used in your re-create attempt.  Also generate a fresh
>> "raid.status" for the current situation.
> 
> The only commands I used were --add /dev/sdb, --run, --assemble --scan,
> --assemble --scan --force and --stop. I didn't try to re-create it, at
> least not now. Also, the timestamp from raid.status (2011) is incorrect,
> the array was re-created from scratch in the summer of 2012. I can't
> tell why disks other than /dev/sdb1 have an invalid superblock.

This is very good news.  In fact, I think --assemble --force can still
be made to work....

>> 3) The array seems to think its member devices were /dev/sda through
>> /dev/sdh (not in that order).  Your "raid.status" has /dev/sd[abcefghi],
>> suggesting a rescue usb or some such is /dev/sdd.
> 
> Yes, that's correct.

Very good.

>> 4) Please describe the structure of the *content* of the array, so we
>> can suggest strategies to *safely* recognize when our future attempts to
>> --create --assume-clean have succeeded.  LVM?  Partitioned?  One big
>> filesystem?
> 
> I'm using the array as a physical volume for LVM.

Ok.

Try this:

mdadm --stop /dev/md0

mdadm -Afv /dev/md0 /dev/sd[bcefghi]1

It leaves out /dev/sda, which appears to have been the spare in the
original setup.
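
Since names can shift between boots, it may be worth previewing what
that pattern matches before running the assemble.  A minimal sketch,
using a hypothetical helper preview_members whose directory argument
defaults to /dev:

```shell
# List the names matched by the member pattern sd[bcefghi]1 inside a
# directory, one per line.  "a" and "d" are deliberately absent from
# the bracket set, so sda1 and sdd1 are never listed.
preview_members() {
    dir="${1:-/dev}"
    for dev in "$dir"/sd[bcefghi]1; do
        # An unmatched glob stays literal; only print names that exist.
        if [ -e "$dev" ]; then
            printf '%s\n' "$dev"
        fi
    done
}
```

If the list is not exactly the seven expected members, stop and recheck
the by-id map before assembling.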

If MD is happy after that, run fsck -n (check only, no repairs) on each
of your logical volumes to verify your FS integrity, and/or see the
extent of the damage (little or none, I think).
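
One way to cover every logical volume, as a sketch only: the
hypothetical helper fsck_cmds below does not run anything itself.  It
prints one "fsck -n" command per path it reads on stdin, so the list can
be reviewed first.  It assumes the volume group is already active
(vgchange -ay) and that the paths come from "lvs --noheadings -o lv_path".

```shell
# Read logical volume paths on stdin (leading blanks and empty lines
# are tolerated) and print a read-only fsck command for each one.
fsck_cmds() {
    while read -r lv; do
        [ -n "$lv" ] || continue
        printf 'fsck -n %s\n' "$lv"
    done
}
```

Usage would be along the lines of
"lvs --noheadings -o lv_path | fsck_cmds", then running the printed
commands one at a time.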

If that works, you can --add /dev/sda1 again, and it will become the
spare again.

If it doesn't work, show everything printed by "mdadm -Afv" above.
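
To capture everything the assemble prints, including anything sent to
stderr, a small wrapper along these lines may help.  The name run_logged
and the log file name in the usage line are made up for illustration.

```shell
# Run a command, merge its stderr into stdout, and copy the combined
# output to a log file while still showing it on the terminal.
# Note: the return status is that of tee, not of the command itself.
run_logged() {
    log="$1"
    shift
    "$@" 2>&1 | tee "$log"
}
```

For example: run_logged assemble.log mdadm -Afv /dev/md0 /dev/sd[bcefghi]1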

HTH,

Phil
