From: Dave Chinner <david@fromorbit.com>
To: Ben Myers <bpm@sgi.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: inode buffers may not be valid during recovery readahead
Date: Sat, 31 Aug 2013 16:14:20 +1000 [thread overview]
Message-ID: <20130831061420.GY12779@dastard> (raw)
In-Reply-To: <20130830181520.GD1935@sgi.com>
On Fri, Aug 30, 2013 at 01:15:20PM -0500, Ben Myers wrote:
> Dave,
>
> On Tue, Aug 27, 2013 at 11:39:37AM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> >
> > CRC enabled filesystems fail log recovery with 100% reliability on
> > xfstests xfs/085 with the following failure:
>
> Unfortunately I have not been able to hit this one... not sure why.
>
> > XFS (vdb): Mounting Filesystem
> > XFS (vdb): Starting recovery (logdev: internal)
> > XFS (vdb): Corruption detected. Unmount and run xfs_repair
> > XFS (vdb): bad inode magic/vsn daddr 144 #0 (magic=0)
> > XFS: Assertion failed: 0, file: fs/xfs/xfs_inode_buf.c, line: 95
> >
> > The problem is that the inode buffer has not been recovered before
> > the readahead on the inode buffer is issued. The checkpoint being
> > recovered actually allocates the inode chunk we are doing readahead
> > from, so what comes from disk during readahead is essentially
> > random and the verifier barfs on it.
> >
> > This inode buffer readahead problem affects non-crc filesystems,
> > too, but xfstests does not trigger it at all on such
> > configurations....
> >
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
>
> I've been mulling this one over for a bit, and I'm not quite sure this
> is correct:
>
> My feeling is that in light of commit 9222a9cf, if we do take part of a
> buffer back in time, the write verifier should fail.
I don't see the connection between 9222a9cf ("xfs: don't shutdown
log recovery on validation errors") and this issue. 9222a9cf works
around are a longstanding architectural deficiency of log
recovery, while this is a completely new problem introduced by the
inode buffer readahead in log recovery.
> I think for a v2
> inode the read and write verifiers should both be disabled for the
> duration of recovery.
Why? It's an architectural requirement that the underlying buffer we
are replaying inodes into already contains inodes. I mean, what's
the first thing xlog_recover_inode_pass2() does?
/*
* Make sure the place we're flushing out to really looks
* like an inode!
*/
if (unlikely(dip->di_magic != cpu_to_be16(XFS_DINODE_MAGIC))) {
xfs_alert(mp,
"%s: Bad inode magic number, dip = 0x%p, dino bp = 0x%p, ino = %Ld",
__func__, dip, bp, in_f->ilf_ino);
XFS_ERROR_REPORT("xlog_recover_inode_pass2(1)",
XFS_ERRLEVEL_LOW, mp);
error = EFSCORRUPTED;
goto out_release;
}
i.e. running the verifier on inode buffer reads during log recovery
is exactly the right thing to be doing....
Readahead doesn't change this, either. It just introduces a new
ordering issue we need to handle.
> For v3 inodes, I suspect the current situation
> where we do use write verifiers is broken in the same way, at least
> until we pull in 'xfs: prevent transient corrupt states during log
> recovery', which, as you say, won't fix the problem for the v2 inode.
Again - that fix has no connection to this readahead bug. It's the
fix for v5 filesystems that was mentioned in 9222a9cf, and is needed
regardless of whether we are doing readahead or not.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2013-08-31 6:14 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-08-27 1:39 [PATCH] xfs: inode buffers may not be valid during recovery readahead Dave Chinner
2013-08-30 18:15 ` Ben Myers
2013-08-31 6:14 ` Dave Chinner [this message]
2013-09-03 22:17 ` Ben Myers
2013-09-03 23:50 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130831061420.GY12779@dastard \
--to=david@fromorbit.com \
--cc=bpm@sgi.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox