linux-xfs.vger.kernel.org archive mirror
From: Eric Sandeen <sandeen@redhat.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: xfs <linux-xfs@vger.kernel.org>
Subject: Re: [PATCH] xfs_repair: don't flag log_incompat inconsistencies as corruptions
Date: Thu, 26 May 2022 16:53:30 -0500
Message-ID: <337aa926-ba8c-3383-c200-e54fde4182f1@redhat.com>
In-Reply-To: <Yo02nmlajIuFqVez@magnolia>

On 5/24/22 2:48 PM, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> While testing xfs/233 and xfs/127 with LARP mode enabled, I noticed
> errors such as the following:
> 
> xfs_growfs --BlockSize=4096 --Blocks=8192
> data blocks changed from 8192 to 2579968
> meta-data=/dev/sdf               isize=512    agcount=630, agsize=4096 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=1        finobt=1, sparse=1, rmapbt=1
>          =                       reflink=1    bigtime=1 inobtcount=1
> data     =                       bsize=4096   blocks=2579968, imaxpct=25
>          =                       sunit=0      swidth=0 blks
> naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
> log      =internal log           bsize=4096   blocks=3075, version=2
>          =                       sectsz=512   sunit=0 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0
> _check_xfs_filesystem: filesystem on /dev/sdf is inconsistent (r)
> *** xfs_repair -n output ***
> Phase 1 - find and verify superblock...
>         - reporting progress in intervals of 15 minutes
> Phase 2 - using internal log
>         - zero log...
>         - 23:03:47: zeroing log - 3075 of 3075 blocks done
>         - scan filesystem freespace and inode maps...
> would fix log incompat feature mismatch in AG 30 super, 0x0 != 0x1
> would fix log incompat feature mismatch in AG 8 super, 0x0 != 0x1
> would fix log incompat feature mismatch in AG 12 super, 0x0 != 0x1
> would fix log incompat feature mismatch in AG 24 super, 0x0 != 0x1
> would fix log incompat feature mismatch in AG 18 super, 0x0 != 0x1
> <snip>
> 
> 0x1 corresponds to XFS_SB_FEAT_INCOMPAT_LOG_XATTRS, which is the feature
> bit used to indicate that the log contains extended attribute log intent
> items.  This is a mechanism to prevent older kernels from trying to
> recover log items that they won't know how to recover.
> 
> I thought about this a little bit more, and realized that log_incompat
> feature bits are set on the primary sb prior to writing certain types
> of log records, and cleared once the log has written the committed
> changes back to the filesystem.  If the secondary superblocks are
> updated at any point during that interval (due to things like growfs or
> setting labels), the log_incompat field will now be set on the secondary
> supers.
> 
> Due to the ephemeral nature of the current log_incompat feature bits,
> a discrepancy between the primary and secondary supers is not a
> corruption.  If we're in dry run mode, we should log the discrepancy,
> but that's not a reason to end with EXIT_FAILURE.
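
The sequence described above can be replayed in a small standalone sketch. This is
not xfsprogs or kernel code: the struct, the demo_* names, and the number of
secondaries are all hypothetical, and only the 0x1 bit value comes from the report.
It sets the bit on the primary, resyncs the secondaries as a growfs-style update
would, clears the bit on the primary, and then classifies the leftover mismatch
the way the patch proposes (note it in dry-run mode, resync it otherwise).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_LOG_XATTRS		(1u << 0)	/* 0x1, the bit from the report */
#define DEMO_NR_SECONDARY	4		/* arbitrary size for this sketch */

/* Hypothetical model: only the log_incompat field of each superblock. */
struct demo_fs {
	uint32_t primary;
	uint32_t secondary[DEMO_NR_SECONDARY];
};

enum demo_result {
	DEMO_MATCH,	/* fields agree, nothing to do */
	DEMO_NOTE_ONLY,	/* dry run: log the discrepancy, not a corruption */
	DEMO_RESYNC,	/* repair mode: copy the primary's value over */
};

/* Copy the primary's field to every secondary, as a resync would. */
static void demo_sync_secondaries(struct demo_fs *fs)
{
	for (int i = 0; i < DEMO_NR_SECONDARY; i++)
		fs->secondary[i] = fs->primary;
}

/* Classify one secondary against the primary. */
static enum demo_result demo_check(const struct demo_fs *fs, int ag,
				   bool no_modify)
{
	assert(ag >= 0 && ag < DEMO_NR_SECONDARY);

	if (fs->primary == fs->secondary[ag])
		return DEMO_MATCH;
	return no_modify ? DEMO_NOTE_ONLY : DEMO_RESYNC;
}

/* Replay: set on primary, resync secondaries, clear on primary. */
static struct demo_fs demo_replay(void)
{
	struct demo_fs fs = { 0 };

	fs.primary |= DEMO_LOG_XATTRS;	/* log is about to hold new items */
	demo_sync_secondaries(&fs);	/* e.g. growfs during that window */
	fs.primary &= ~DEMO_LOG_XATTRS;	/* log has been written back */
	return fs;
}
```

After demo_replay(), the primary holds 0x0 while every secondary still holds 0x1,
which is the "0x0 != 0x1" pattern in the xfs_repair -n output.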

Interesting. This makes me wonder a few things.

This approach differs from the just-added handling of 
XFS_SB_FEAT_INCOMPAT_NEEDSREPAIR, where we /always/ ignore it. For now I think
that's a little different, because that flag only gets set from userspace, but
that could change in the future, maybe?

So I wonder why we have this feature getting noted and cleared, but the other
one always ignored.

I also notice that scrub tries to avoid setting it in the first place:

         * Don't write out a secondary super with NEEDSREPAIR or log incompat
         * features set, since both are ignored when set on a secondary.

... should growfs avoid it as well?

It feels like we're spreading this special handling around, copying (or not)
and ignoring (or not) at various points.  I kinda want to step back and think
about this a little.

It seems like the most consistent approach would be to always keep all supers
in sync, though I suppose that has costs. The second most consistent approach
would be to never copy these ephemeral features to the secondary.
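
To make the second option concrete, here is a rough sketch of what "never copy"
could look like. It is an illustration only, not a proposed patch: struct demo_sb
and the demo_* names are hypothetical stand-ins for the real superblock, and the
NEEDSREPAIR bit position is an assumed value for the example. The idea is that
whatever builds a secondary from the primary masks the ephemeral bits first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two feature fields under discussion. */
struct demo_sb {
	uint32_t features_incompat;
	uint32_t features_log_incompat;
};

/* Assumed bit position, for illustration only. */
#define DEMO_FEAT_INCOMPAT_NEEDSREPAIR	(1u << 4)

/*
 * Build the image to write to a secondary super: a copy of the primary
 * with NEEDSREPAIR and every log incompat bit cleared, so ephemeral
 * state never reaches a secondary.  The primary itself is not modified.
 */
static struct demo_sb demo_secondary_from_primary(const struct demo_sb *primary)
{
	struct demo_sb sec;

	assert(primary != NULL);

	sec = *primary;
	sec.features_incompat &= ~DEMO_FEAT_INCOMPAT_NEEDSREPAIR;
	sec.features_log_incompat = 0;
	return sec;
}
```

With that in place, a secondary would always carry log_incompat == 0, so a
checker could compare it against zero instead of against whatever the primary
happens to hold at the moment.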

Whatever the consistent future looks like, I guess we do have to deal with
inconsistent stuff in the wild, already.

Thoughts?

-Eric


> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  repair/agheader.c |   15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/repair/agheader.c b/repair/agheader.c
> index 2c2a26d1..478ed7e5 100644
> --- a/repair/agheader.c
> +++ b/repair/agheader.c
> @@ -286,15 +286,24 @@ check_v5_feature_mismatch(
>  		}
>  	}
>  
> +	/*
> +	 * Log incompat feature bits are set and cleared from the primary super
> +	 * as needed to protect against log replay on old kernels finding log
> +	 * records that they cannot handle.  Secondary sb resyncs performed as
> +	 * part of a geometry update to the primary sb (e.g. growfs, label/uuid
> +	 * changes) will copy the log incompat feature bits, but it's not a
> +	 * corruption for a secondary to have a bit set that is clear in the
> +	 * primary super.
> +	 */
>  	if (mp->m_sb.sb_features_log_incompat != sb->sb_features_log_incompat) {
>  		if (no_modify) {
> -			do_warn(
> -	_("would fix log incompat feature mismatch in AG %u super, 0x%x != 0x%x\n"),
> +			do_log(
> +	_("would sync log incompat feature in AG %u super, 0x%x != 0x%x\n"),
>  					agno, mp->m_sb.sb_features_log_incompat,
>  					sb->sb_features_log_incompat);
>  		} else {
>  			do_warn(
> -	_("will fix log incompat feature mismatch in AG %u super, 0x%x != 0x%x\n"),
> +	_("will sync log incompat feature in AG %u super, 0x%x != 0x%x\n"),
>  					agno, mp->m_sb.sb_features_log_incompat,
>  					sb->sb_features_log_incompat);
>  			dirty = true;
> 


Thread overview: 3+ messages
2022-05-24 19:48 [PATCH] xfs_repair: don't flag log_incompat inconsistencies as corruptions Darrick J. Wong
2022-05-26 21:53 ` Eric Sandeen [this message]
2022-05-27  0:23   ` Darrick J. Wong
