From: Dave Chinner <david@fromorbit.com>
To: Ben Myers <bpm@sgi.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH 1/3] xfs: don't shutdown log recovery on validation errors
Date: Thu, 13 Jun 2013 12:08:27 +1000
Message-ID: <20130613020827.GG29338@dastard>
In-Reply-To: <20130613010441.GX20932@sgi.com>

On Wed, Jun 12, 2013 at 08:04:41PM -0500, Ben Myers wrote:
> Hey Dave,
> 
> On Wed, Jun 12, 2013 at 12:19:06PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > Unfortunately, we cannot guarantee that items logged multiple times
> > and replayed by log recovery do not take objects back in time. When
> > they are taken back in time, they go into an intermediate state which
> > is corrupt, and hence verification that occurs on this intermediate
> > state causes log recovery to abort with a corruption shutdown.
> > 
> > Instead of causing a shutdown and unmountable filesystem, don't
> > verify post-recovery items before they are written to disk. This is
> > less than optimal, but there is no way to detect this issue for
> > non-CRC filesystems. If log recovery successfully completes, this
> > will be undone and the object will be made consistent by subsequent
> > transactions that are replayed, so in most cases we don't need to
> > take drastic action.
> > 
> > For CRC enabled filesystems, leave the verifiers in place - we need
> > to call them to recalculate the CRCs on the objects anyway. This
> > recovery problem can be solved for such filesystems - we have a LSN
> > stamped in all metadata at writeback time that we can use to determine
> > whether the item should be replayed or not. This is a separate piece
> > of work, so is not addressed by this patch.
> 
> Is there a test case for this one?  How are you reproducing this?

The test case was Dave Jones running sysrq-b on a hung test machine.
The machine would occasionally end up with a corrupt home directory.

http://oss.sgi.com/pipermail/xfs/2013-May/026759.html

Analysis from a metadump provided by Dave:

http://oss.sgi.com/pipermail/xfs/2013-June/026965.html

And Cai also appeared to be hitting this after a crash on 3.10-rc4,
as it's giving exactly the same "verifier failed during log recovery"
stack trace:

http://oss.sgi.com/pipermail/xfs/2013-June/026889.html

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

