linux-xfs.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 08/20] xfs: implement the metadata repair ioctl flag
Date: Thu, 29 Mar 2018 10:42:42 +1100
Message-ID: <20180328234242.GF18129@dastard>
In-Reply-To: <20180328231915.GN4818@magnolia>

On Wed, Mar 28, 2018 at 04:19:15PM -0700, Darrick J. Wong wrote:
> On Tue, Mar 27, 2018 at 09:58:12PM -0700, Darrick J. Wong wrote:
> > On Wed, Mar 28, 2018 at 10:55:14AM +1100, Dave Chinner wrote:
> > > On Mon, Mar 26, 2018 at 04:56:46PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > Plumb in the pieces necessary to make the "repair" subfunction of
> > > > the scrub ioctl actually work.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Can you list the pieces here - the ioctl, the errtag for debug, etc?
> > 
> > "This means that we make XFS_SCRUB_IFLAG_REPAIR actually do something
> > when userspace calls the scrub ioctl and add a debugging error tag so
> > that xfstests can force the kernel to repair things even when it's not
> > necessary."
> > 
> > > 
> > > ....
> > > 
> > > > +config XFS_ONLINE_REPAIR
> > > > +	bool "XFS online metadata repair support"
> > > > +	default n
> > > > +	depends on XFS_FS && XFS_ONLINE_SCRUB
> > > > +	help
> > > > +	  If you say Y here you will be able to repair metadata on a
> > > > +	  mounted XFS filesystem.  This feature is intended to reduce
> > > > +	  filesystem downtime even further by fixing minor problems
> > > 
> > > s/even further//
> > 
> > Ok.
> > 
> > > > +	  before they cause the filesystem to go down.  However, it
> > > > +	  requires that the filesystem be formatted with secondary
> > > > +	  metadata, such as reverse mappings and inode parent pointers.
> > > > +
> > > > +	  This feature is considered EXPERIMENTAL.  Use with caution!
> > > > +
> > > > +	  See the xfs_scrub man page in section 8 for additional information.
> > > > +
> > > > +	  If unsure, say N.
> > > 
> > > .....
> > > 
> > > > diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
> > > > index faf1a4e..8bf3ded 100644
> > > > --- a/fs/xfs/libxfs/xfs_fs.h
> > > > +++ b/fs/xfs/libxfs/xfs_fs.h
> > > > @@ -542,13 +542,20 @@ struct xfs_scrub_metadata {
> > > >  /* o: Metadata object looked funny but isn't corrupt. */
> > > >  #define XFS_SCRUB_OFLAG_WARNING		(1 << 6)
> > > >  
> > > > +/*
> > > > + * o: IFLAG_REPAIR was set but metadata object did not need fixing or
> > > > + *    optimization and has therefore not been altered.
> > > > + */
> > > > +#define XFS_SCRUB_OFLAG_UNTOUCHED	(1 << 7)
> > > 
> > > bikeshed: CLEAN rather than UNTOUCHED?
> > 
> > The thing I don't like about using 'CLEAN' is that we only set this flag
> > if userspace told us to touch (i.e. repair) the structure and the kernel
> > decides to leave it alone.
> 
> XFS_SCRUB_OFLAG_NO_REPAIR_NEEDED?

Yeah, that's better.

> I uploaded the htmlized manpage for this ioctl:
> https://djwong.org/docs/ioctl-xfs-scrub-metadata.html
> 
> Hopefully that will make the meanings of all these flags and their
> intended usages clearer, which will make it easier to sift through all
> the code in here. :)

And that helps a lot, too.
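
For anyone skimming the thread, the flag semantics discussed above can
be modelled in a few lines of userspace C.  This is only a sketch: the
DEMO_* names and bit values are placeholders chosen for illustration
(the real definitions live in xfs_fs.h), demo_scrub() is a hypothetical
stand-in for the kernel side of the ioctl, and it assumes that a
requested repair always succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bits for this sketch only; see xfs_fs.h for the real ones. */
#define DEMO_IFLAG_REPAIR		(1u << 0) /* i: caller asks for repair */
#define DEMO_OFLAG_CORRUPT		(1u << 1) /* o: object is corrupt */
#define DEMO_OFLAG_NO_REPAIR_NEEDED	(1u << 7) /* o: repair asked, nothing to fix */

/*
 * Hypothetical model of the flag handling: take the input flags and
 * whether the object turned out to be corrupt, return the flags that
 * would be handed back to userspace.
 */
static uint32_t
demo_scrub(
	uint32_t	in_flags,
	bool		object_is_corrupt)
{
	uint32_t	flags = in_flags;

	/* Only the input flag and the two output flags above are modelled. */
	assert((in_flags & ~DEMO_IFLAG_REPAIR) == 0);

	if (object_is_corrupt) {
		/* Assumption: a requested repair always succeeds here. */
		if (in_flags & DEMO_IFLAG_REPAIR)
			return flags;
		return flags | DEMO_OFLAG_CORRUPT;
	}

	/*
	 * Set only when userspace asked for a repair and the object was
	 * left alone, which is why "CLEAN" would be a misleading name.
	 */
	if (in_flags & DEMO_IFLAG_REPAIR)
		flags |= DEMO_OFLAG_NO_REPAIR_NEEDED;
	return flags;
}
```

A plain scrub of a healthy object returns no output flags at all, so
the new flag never shows up unless a repair was requested.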

> > > > +	/*
> > > > +	 * Repair whatever's broken.  We have to clear the out flags during the
> > > > +	 * repair call because some of our iterator functions abort if any of
> > > > +	 * the corruption flags are set.
> > > > +	 */
> > > > +	scrub_oflags = sc->sm->sm_flags & XFS_SCRUB_FLAGS_OUT;
> > > > +	sc->sm->sm_flags &= ~XFS_SCRUB_FLAGS_OUT;
> > > > +	error = sc->ops->repair(sc, scrub_oflags);
> > > 
> > > Urk, that's a bit messy. Shouldn't we drive that inwards to just the
> > > iterator methods that have this problem? Then we can slowly work
> > > over the iterators that are problematic and fix them?
> > 
> > Hmm, I'll have a closer look tomorrow at which ones actually trigger
> > this...
> 
> Aha, it's the two calls to xfs_scrub_walk_agfl.  That function is almost
> generic enough to be a library function, so I'll promote it to a libxfs
> helper function:
> 
> /*
>  * Walk all the blocks in the AGFL.  The fn function can return any
>  * negative error code or XFS_BTREE_QUERY_RANGE_ABORT.
>  */
> int
> xfs_agfl_walk(
> 	struct xfs_mount	*mp,
> 	struct xfs_agf		*agf,
> 	struct xfs_buf		*agflbp,
> 	xfs_agfl_walk_fn	fn,
> 	void			*priv)
> {
> 	__be32			*agfl_bno;
> 	unsigned int		flfirst;
> 	unsigned int		fllast;
> 	int			i;
> 	int			error;
> 
> 	agfl_bno = XFS_BUF_TO_AGFL_BNO(mp, agflbp);
> 	flfirst = be32_to_cpu(agf->agf_flfirst);
> 	fllast = be32_to_cpu(agf->agf_fllast);
> 
> 	/* Nothing to walk in an empty AGFL. */
> 	if (agf->agf_flcount == cpu_to_be32(0))
> 		return 0;
> 
> 	/* first to last is a consecutive list. */
> 	if (fllast >= flfirst) {
> 		for (i = flfirst; i <= fllast; i++) {
> 			error = fn(mp, be32_to_cpu(agfl_bno[i]), priv);
> 			if (error)
> 				return error;
> 		}
> 
> 		return 0;
> 	}
> 
> 	/* first to the end */
> 	for (i = flfirst; i < xfs_agfl_size(mp); i++) {
> 		error = fn(mp, be32_to_cpu(agfl_bno[i]), priv);
> 		if (error)
> 			return error;
> 	}
> 
> 	/* the start to last. */
> 	for (i = 0; i <= fllast; i++) {
> 		error = fn(mp, be32_to_cpu(agfl_bno[i]), priv);
> 		if (error)
> 			return error;
> 	}
> 
> 	return 0;
> }
> 
> And then adapt the three callers to use the library function instead.
> 
> xfs_scrub_agfl_block() can return QUERY_ABORT if OFLAG_CORRUPT gets set,
> which will cause xfs_scrub_agfl to return as soon as it hits the first
> error.
> 
> xfs_repair_allocbt and xfs_repair_rmapbt can call the library function
> and then we don't need this weird quirk anymore.

Yup, that sounds like a good way to solve the problem. Thanks,
Darrick!
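
To make the abort behaviour concrete, here is a minimal userspace model
of the walk.  It is a sketch under stated assumptions, not the kernel
code: struct demo_agfl, demo_agfl_walk(), demo_visit_block() and
DEMO_WALK_ABORT are hypothetical stand-ins, the free list is a plain
array of host-endian block numbers, and the wrap-around is handled with
a modulo step rather than the two loops in the helper quoted above.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_WALK_ABORT	1	/* stand-in for XFS_BTREE_QUERY_RANGE_ABORT */

typedef int (*demo_walk_fn)(unsigned int bno, void *priv);

/* Hypothetical in-memory model of the AGF free list fields. */
struct demo_agfl {
	const unsigned int	*bno;	/* circular array of block numbers */
	unsigned int		size;	/* number of slots in the array */
	unsigned int		first;	/* index of the first active slot */
	unsigned int		last;	/* index of the last active slot */
	unsigned int		count;	/* number of active slots */
};

/* Visit every active slot from first to last, wrapping at size. */
static int
demo_agfl_walk(
	const struct demo_agfl	*fl,
	demo_walk_fn		fn,
	void			*priv)
{
	unsigned int		i;
	int			error;

	assert(fl->first < fl->size && fl->last < fl->size);

	/* Nothing to walk in an empty list. */
	if (fl->count == 0)
		return 0;

	for (i = fl->first; ; i = (i + 1) % fl->size) {
		error = fn(fl->bno[i], priv);
		if (error)
			return error;	/* callback asked us to stop */
		if (i == fl->last)
			break;
	}
	return 0;
}

/* Example callback state: record visits, abort on one "bad" block. */
struct demo_visit {
	unsigned int		seen[8];
	unsigned int		nr;
	unsigned int		bad_bno;
};

static int
demo_visit_block(
	unsigned int		bno,
	void			*priv)
{
	struct demo_visit	*v = priv;

	if (v->nr < 8)
		v->seen[v->nr++] = bno;
	return bno == v->bad_bno ? DEMO_WALK_ABORT : 0;
}
```

With a six-slot list holding blocks 10..15, first = 4 and last = 1, the
walk visits 14, 15, 10, 11 in that order.  If the callback flags block
15 as bad, the walk stops after two visits and the caller sees
DEMO_WALK_ABORT, so the early exit lives in the callback rather than in
the caller's handling of the output flags.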

-Dave.
-- 
Dave Chinner
david@fromorbit.com
