linux-raid.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.de>
To: Sam Clark <sclark_77@hotmail.com>
Cc: 'Phil Turmel' <philip@turmel.org>, linux-raid@vger.kernel.org
Subject: Re: RAID5 - Disk failed during re-shape
Date: Wed, 15 Aug 2012 07:05:33 +1000
Message-ID: <20120815070533.5f7eb1e5@notabene.brown>
In-Reply-To: <BLU153-ds7438ACF3B689B6D7DA9E294B70@phx.gbl>

On Tue, 14 Aug 2012 15:40:50 +0200 Sam Clark <sclark_77@hotmail.com> wrote:

> Thanks Neil, 
> 
> Tried that and failed on the first attempt, so I tried shuffling around the
> dev order.. unfortunately I don't know what they were previously, but I do
> recall being surprised that sdd was first on the list when I was looking at
> it previously, so perhaps a starting point.  Since there are some 120
> different permutations of device order (assuming all 5 could be anywhere), I
> modified the script to accept parameters and automated it a little further. 
> 
> I ended up with a few 'possible successes' but none that would mount (i.e.
> fsck actually ran and found problems with the superblocks, group descriptor
> checksums and Inode details, instead of failing with errorlevel 8).  The
> most successful so far were the ones with SDD as device 1 and SDE as device
> 2.. one particular combination (sdd sde sdb sdc sdf) seems to report every
> time "/dev/md_restore has been mounted 35 times without being checked, check
> forced.".. does this mean we're on the right combination? 

Certainly encouraging.  However it might just mean that the first device is
correct.  I think you only need to find the filesystem superblock to be able
to report that.
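
For reference, the device-order search described above can be sketched as
follows. This is only an illustration, not the script Sam actually ran: the
function name candidate_orders is hypothetical, and pinning sdd to the first
slot rests on the assumption (from Sam's recollection) that it really was the
first device.

```python
from itertools import permutations

# The five member devices named in this thread.
DEVICES = ["sdb", "sdc", "sdd", "sde", "sdf"]


def candidate_orders(devices, first=None):
    """Yield every ordering of devices, optionally pinning the first slot."""
    if first is None:
        yield from permutations(devices)
        return
    rest = [d for d in devices if d != first]
    for tail in permutations(rest):
        yield (first, *tail)


all_orders = list(candidate_orders(DEVICES))
pinned = list(candidate_orders(DEVICES, first="sdd"))
print(len(all_orders), len(pinned))  # 120 24
```

If the first device is indeed correct, only 24 orderings remain to be tested
instead of 120.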

> 
> In any case, that one produces a lot of output (some 54MB when fsck is piped
> to a file) that looks bad and still fails to mount.  (I assume that "mount
> -r /dev/md_restore /mnt/restore" is all I need to mount with?  I also tried
> with "-t ext4", but that didn't seem to help either).

54MB certainly seems like more than we were hoping for.
Yes, that mount command should be sufficient.  You could try adding "-o
noload".  I'm not sure what it does but from the code it looks like it tries
to be more forgiving of some stuff.


> 
> This is a summary of the errors that appear: 
> Pass 1: Checking inodes, blocks, and sizes
> (51 of these)
> Inode 198574650 has an invalid extent node (blk 38369280, lblk 0)
> Clear? no
> 
> (47 of these)
> Inode 223871986, i_blocks is 2737216, should be 0.  Fix? no
> 
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> /lost+found not found.  Create? no
> 
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> Block bitmap differences:  +(36700161--36700162) +36700164 +36700166
> +(36700168--36700170) (this goes on like this for many pages.. in fact, most
> of the 54 MB is here)
> 
> (and 492 of these) 
> Free blocks count wrong for group #3760 (24544, counted=16439).
> Fix? no
> 
> Free blocks count wrong for group #3761 (0, counted=16584).
> Fix? no
> 
> /dev/md_restore: ********** WARNING: Filesystem still has errors **********
> /dev/md_restore: 107033/274718720 files (5.6% non-contiguous),
> 976413581/1098853872 blocks
> 
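
With this much fsck output, a per-message tally makes it easier to compare
one candidate device order against another. The sketch below is a minimal
illustration: the pattern labels and the tally function are hypothetical, and
the sample lines are the ones quoted above rather than a full log.

```python
import re
from collections import Counter

# Labels are made up for this sketch; the patterns match the fsck
# messages quoted in this thread.
PATTERNS = {
    "invalid extent node": re.compile(r"has an invalid extent node"),
    "i_blocks wrong": re.compile(r"i_blocks is \d+, should be \d+"),
    "free blocks count wrong": re.compile(r"^Free blocks count wrong for group"),
}


def tally(lines):
    """Count how many lines match each known fsck message pattern."""
    counts = Counter()
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[label] += 1
                break
    return counts


sample = [
    "Inode 198574650 has an invalid extent node (blk 38369280, lblk 0)",
    "Inode 223871986, i_blocks is 2737216, should be 0.  Fix? no",
    "Free blocks count wrong for group #3760 (24544, counted=16439).",
    "Free blocks count wrong for group #3761 (0, counted=16584).",
]
for label, count in sorted(tally(sample).items()):
    print(label, count)
```

Running the same tally over the saved fsck output for each device order
should show which ordering produces the fewest complaints.
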
> 
> I also tried setting the reshape number to 1002152448 , 1002153984,
> 1002157056 , 1002158592 and 1002160128 (+/ - a couple of multiples) but
> output didn't seem to change much in any case.. Not sure if there are many
> different values worth testing there.

Probably not.

> 
> So, unless there's something else worth trying based on the above, it looks
> to me that it's time to raise the white flag and start again... it's not too
> bad, I'll recover most of the data.
> 
> Many thanks for your help so far, but if I may... 1 more question...
> Hopefully I won't lose a disk during re-shape in the future, but just in
> case I do, or for other unforeseen issues, what are good things to back up on
> a system?  Is it enough to back up the /etc/mdadm/mdadm.conf and /proc/mdstat
> on a regular basis?  Or should I also back up the device superblocks?  Or
> something else?  

There isn't really any need to back up anything.  Just don't use a buggy
kernel (which unfortunately I let out into the wild and got into Ubuntu).
The most useful thing if things do go wrong is the "mdadm --examine" output
of all devices.
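
A minimal sketch of capturing that output for later reference follows. The
helper names, the output directory and the file naming are assumptions made
for illustration; only the "mdadm --examine <device>" invocation comes from
the advice above. It needs to run as root on the machine holding the array.

```python
import subprocess
from pathlib import Path


def examine_command(device):
    """Build the argv for examining one member device."""
    return ["mdadm", "--examine", device]


def save_examine_output(devices, out_dir, run=subprocess.run):
    """Write the --examine output of each device to its own text file.

    The run parameter exists only so the function can be exercised
    without mdadm installed; by default it is subprocess.run.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for dev in devices:
        result = run(examine_command(dev), capture_output=True, text=True)
        target = out_dir / (Path(dev).name + ".examine.txt")
        # Keep stderr too, so a failed examine is still recorded.
        target.write_text(result.stdout + result.stderr)
        saved.append(target)
    return saved
```

For example, save_examine_output(["/dev/sdb", "/dev/sdc"], "/root/md-examine")
would leave one file per device; both the device paths and the directory are
placeholders to be replaced with the real array members.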


> 
> Ok, so that's actually 4 questions  ... sorry :-)
> 
> Thanks again for all your efforts. 
> Sam

Sorry we couldn't get your data back.

NeilBrown
