From: Eric Sandeen <sandeen@sandeen.net>
To: Michael Maier <m1278468@allmail.net>
Cc: xfs@oss.sgi.com
Subject: Re: Failure growing xfs with linux 3.10.5
Date: Thu, 15 Aug 2013 13:14:26 -0500
Message-ID: <520D1A82.1000709@sandeen.net>
In-Reply-To: <520D162B.5060901@allmail.net>

On 8/15/13 12:55 PM, Michael Maier wrote:
> Eric Sandeen wrote:
>> On 8/14/13 11:20 AM, Michael Maier wrote:
>>> Dave Chinner wrote:
>>
>> ...
>>
>>>> If it makes you feel any better, the bug that caused this had been
>>>> in the code for 15+ years and you are the first person I know of to
>>>> have ever hit it....
>>>
>>> Probably the second one :-) See
>>> http://thread.gmane.org/gmane.comp.file-systems.xfs.general/54428
>>>
>>>> xfs_repair doesn't appear to have any checks in it to detect this
>>>> situation or repair it - there are some conditions for zeroing the
>>>> unused parts of a superblock, but they are focussed around detecting
>>>> and correcting damage caused by a buggy Irix 6.5-beta mkfs from 15
>>>> years ago.
>>>
>>> The _big problem_ is: xfs_repair not only fails to repair it, but it
>>> _causes data loss_ in some situations!
>>>
>>
>> So as far as I can tell at this point, a few things have happened to
>> result in this unfortunate situation.  Congratulations, you hit a
>> perfect storm.  :(
> 
> I can reassure you - as it "only" hit my backup device and because I
> noticed the problem before I really needed it: I didn't suffer any data
> loss overall, because the original data is ok and I repeated the backup
> w/ the fixed FS now!
> 
>> 1) prior resize operations populated unused portions of backup sbs w/ junk
>> 2) newer kernels fail to verify superblocks in this state
>> 3) during your growfs under 3.10, that verification failure aborted
>>    backup superblock updates, leaving many unmodified
>> 4a) xfs_repair doesn't find or fix the junk in the backup sbs, and
>> 4b) when running, it looks for the superblocks which are "most matching"
>>     other superblocks on the disk, and takes that version as correct.
>>
>> So you had 16 superblocks (0-15) which were correct after the growfs.
>> But 16 didn't verify and was aborted, so nothing was updated after that.
>> This means that 16 onward have the wrong number of AGs and disk blocks;
>> i.e. they are the pre-growfs size, and there are 26 of them.
>>
>> Today, xfs_repair sees this 26-to-16 vote, and decides that the 26
>> matching superblocks "win," rewrites the first superblock with this
>> geometry, and uses that to verify the rest of the filesystem.  Hence
>> anything post-growfs looks out of bounds, and gets nuked.
>>
>> So right now, I'm thinking that the "proper geometry" heuristic should
>> be adjusted, but how to do that in general, I'm not sure.  Weighting
>> sb 0 heavily, especially if it matches many subsequent superblocks,
>> seems somewhat reasonable.
> 
> This would have been my next question! I repaired it w/ the git
> xfs_repair on the FS already reduced to its original size. I think if I
> had done the same w/ the grown FS, the FS would most probably have been
> reduced to its pre-growfs size.
> 
> Wouldn't it be better to not grow at all if there are problems detected?
> Means: Don't do the check after the growing, but before? Ok, I could
> have done it myself ... . From now on, I will do it like this!

Well, see the next couple of patches I'm about to send to the list ... ;)

But a prior check wouldn't have helped you, because repair didn't detect
the problem that growfs choked on.
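
To make the 26-to-16 vote quoted above concrete, here is a rough sketch
in Python. It is not xfs_repair's actual code: the Geometry fields, the
function names, the block counts, and the weight of 16 are all
assumptions chosen only for illustration. It shows a plain "most
matching" vote picking the stale geometry, and a vote that weights sb 0
heavily picking the grown one.

```python
from collections import Counter
from typing import NamedTuple, Sequence


class Geometry(NamedTuple):
    """Hypothetical stand-in for the geometry fields of a superblock."""
    agcount: int
    dblocks: int


def majority_geometry(sbs: Sequence[Geometry]) -> Geometry:
    """Plain vote: the geometry shared by the most superblocks wins."""
    return Counter(sbs).most_common(1)[0][0]


def weighted_geometry(sbs: Sequence[Geometry], primary_weight: int) -> Geometry:
    """Same vote, but superblock 0 counts primary_weight times."""
    votes = Counter(sbs)
    votes[sbs[0]] += primary_weight - 1
    return votes.most_common(1)[0][0]


# Made-up numbers; only the 16-vs-26 split comes from the thread.
grown = Geometry(agcount=42, dblocks=4200)   # sb 0-15, updated by growfs
stale = Geometry(agcount=32, dblocks=3200)   # sb 16 onward, never updated
sbs = [grown] * 16 + [stale] * 26

plain_winner = majority_geometry(sbs)                        # stale wins 26 to 16
weighted_winner = weighted_geometry(sbs, primary_weight=16)  # grown wins 31 to 26
```

How large the weight should be, and whether sb 0 should be trusted at
all when it is itself damaged, is exactly the open question; the sketch
only shows that the outcome flips once sb 0 counts for more than one vote.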

-Eric

