From: Dave Chinner <david@fromorbit.com>
To: Mark Tinguely <tinguely@sgi.com>
Cc: Fanael Linithien <fanael4@gmail.com>, xfs@oss.sgi.com
Subject: Re: Metadata CRC error upon unclean unmount
Date: Fri, 27 Jun 2014 08:47:27 +1000
Message-ID: <20140626224727.GS9508@dastard>
In-Reply-To: <53AC7CA9.9050505@sgi.com>

On Thu, Jun 26, 2014 at 03:03:53PM -0500, Mark Tinguely wrote:
> On 06/25/14 19:28, Dave Chinner wrote:
> >On Wed, Jun 25, 2014 at 11:21:44AM +1000, Dave Chinner wrote:
> >>On Tue, Jun 24, 2014 at 11:50:52PM +0200, Fanael Linithien wrote:
> >>Ok, so the CRC corresponds to the version of the AGI that was logged
> >>at lsn = 0x30000017e. That means the version on disk is a partial
> >>update without a CRC recalculation. Ok, so how can that happen?
> >>
> >>Given the lsn mismatch, I suspect log recovery has played a part as
> >>it will not update the LSN when replaying changes in the log. It
> >>should, however, always be attaching the appropriate verifier to
> >>the buffers being recovered so the CRC should be recalculated
> >>correctly.
> >
> >Ok, I have confirmed that this is occurring and behaving correctly.
> >
> >[ 24.437878] XFS (vdb): Mounting V5 Filesystem
> >[ 24.554429] XFS (vdb): Starting recovery (logdev: internal)
> >[ 24.623466] XFS (vdb): xfs_agi_write_verify: lsn reset block 0x2
> >[ 24.625263] XFS (vdb): xfs_btree_sblock_calc_crc: lsn reset block 0x8
> >[ 24.627307] XFS (vdb): xfs_btree_sblock_calc_crc: lsn reset block 0x10
> >[ 24.628729] XFS (vdb): xfs_btree_sblock_calc_crc: lsn reset block 0x18
> >[ 24.630085] XFS (vdb): xfs_dir3_data_write_verify: lsn reset block 0x20000
> >[ 24.631504] XFS (vdb): xfs_da3_node_write_verify: lsn reset block 0x20008
> >[ 24.632935] XFS (vdb): xfs_dir3_data_write_verify: lsn reset block 0x20010
> >[ 24.634360] XFS (vdb): xfs_dir3_data_write_verify: lsn reset block 0x20018
> >[ 24.635622] XFS (vdb): xfs_dir3_free_write_verify: lsn reset block 0x201e0
> >[ 24.636656] XFS (vdb): __write_verify: lsn reset block 0x201e8
> >[ 24.637510] XFS (vdb): __write_verify: lsn reset block 0x201f0
> >[ 24.638365] XFS (vdb): xfs_dir3_data_write_verify: lsn reset block 0x201f8
> >[ 24.639378] XFS (vdb): xfs_dir3_data_write_verify: lsn reset block 0x202c0
> >[ 24.640397] XFS (vdb): __write_verify: lsn reset block 0x202c8
> >[ 24.641260] XFS (vdb): xfs_dir3_data_write_verify: lsn reset block 0x202d0
> >[ 24.664330] XFS (vdb): Ending recovery (logdev: internal)
> >
> >But that also confirms that log recovery is recalculating the CRC
> >after replaying changes into that block:
> >
> ># xfs_db -c "agi 0" -c "p lsn" -c "p crc" /dev/vdb
> >lsn = 0xffffffffffffffff
> >crc = 0x788c4f63 (correct)
> >
> >So the common log recovery path for buffers is working as it is
> >designed to do.
> >
> >What I still don't understand yet is how changes after this recovery
> >phase are getting to disk without updating the CRC. That implies
> >buffers without verifiers being written....
> >
> >More debug to come...
> >
> >Cheers,
> >
> >Dave.
>
> Could an out of order CIL push cause this?

I don't think so - the issue appears to be that a CRC is not being
recalculated on a buffer before IO has been issued to disk, not that
there is incorrect metadata in the buffer. Regardless of how we
modify the buffer, the CRC should always match the contents of the
block on disk because we calculate it with the buffer locked and
just prior to it being written.
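
To illustrate that invariant, here is a minimal sketch in Python. It is
a toy model under stated assumptions, not XFS code: the names Buffer,
crc_write_verify, submit_write and demo are hypothetical, the block
layout (a 4-byte CRC field at the start) is invented for the example,
and zlib.crc32 is plain CRC32 standing in for the CRC32c that XFS uses.
It shows why a buffer that always runs a write verifier cannot reach
disk with a stale CRC, and how one written without a verifier can.

```python
import threading
import zlib

CRC_SIZE = 4  # toy layout: checksum lives in the first 4 bytes of the block


def calc_crc(block: bytes) -> int:
    """Checksum of everything after the CRC field (plain CRC32, an assumption)."""
    return zlib.crc32(block[CRC_SIZE:]) & 0xFFFFFFFF


def crc_write_verify(buf: "Buffer") -> None:
    """Hypothetical write verifier: stamp a fresh CRC into the buffer."""
    buf.data[:CRC_SIZE] = calc_crc(bytes(buf.data)).to_bytes(CRC_SIZE, "big")


class Buffer:
    """Toy metadata buffer with an optional write verifier."""

    def __init__(self, size: int, write_verify=None):
        self.data = bytearray(size)
        self.write_verify = write_verify
        self.lock = threading.Lock()

    def modify(self, offset: int, payload: bytes) -> None:
        with self.lock:
            self.data[offset:offset + len(payload)] = payload

    def submit_write(self, disk: dict, blkno: int) -> None:
        # The CRC is recalculated with the buffer locked, immediately
        # before the contents are copied out to "disk".
        with self.lock:
            if self.write_verify is not None:
                self.write_verify(self)
            disk[blkno] = bytes(self.data)


def crc_ok(block: bytes) -> bool:
    """Check the stored CRC against the block contents, as a reader would."""
    return int.from_bytes(block[:CRC_SIZE], "big") == calc_crc(block)


def demo():
    disk = {}

    # Verifier attached for every write: the on-disk CRC always matches.
    good = Buffer(64, write_verify=crc_write_verify)
    good.modify(8, b"agi v1")
    good.submit_write(disk, 2)
    good.modify(8, b"agi v2")
    good.submit_write(disk, 2)
    with_verifier = crc_ok(disk[2])

    # Verifier missing for the second write: new contents, old CRC.
    bad = Buffer(64, write_verify=crc_write_verify)
    bad.modify(8, b"agi v1")
    bad.submit_write(disk, 2)
    bad.write_verify = None
    bad.modify(8, b"agi v2")
    bad.submit_write(disk, 2)
    without_verifier = crc_ok(disk[2])

    return with_verifier, without_verifier
```

In this toy model demo() returns (True, False): the second case is a
partial update on disk carrying the CRC of the earlier version, which
is the symptom under discussion here.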

> SGI saw sequence 2 (and sometimes 3/4) of the cil push get in front
> of cil push sequence 1. Looks like the setting of
> log->l_cilp->xc_ctx->commit_lsn in xlog_cil_init_post_recovery()
> lets this happen.

I don't think that can actually happen - the CIL is not used until after
xlog_cil_init_post_recovery() is completed and transactions start
during EFI recovery. Any attempt to use it prior to that call will
oops on the null ctx_ticket.

As for the ordering issue, I'm pretty sure that was fixed in
commit f876e44 ("xfs: always do log forces via the workqueue").

Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com