public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Eric Sandeen <sandeen@sandeen.net>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 06/11] xfs_scrub: reduce fsmap activity for media errors
Date: Mon, 21 Oct 2019 11:14:09 -0700	[thread overview]
Message-ID: <20191021181409.GD913374@magnolia> (raw)
In-Reply-To: <bda4db03-54a2-8ed7-26b2-369ddcc1b697@sandeen.net>

On Mon, Oct 21, 2019 at 12:17:52PM -0500, Eric Sandeen wrote:
> On 9/25/19 4:36 PM, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Right now we rather foolishly query the fsmap data for every single
> > media error that we find.  This is a silly waste of time since we
> > have yet to combine adjacent bad blocks into bad extents, so move the
> > rmap query until after we've constructed the bad block bitmap data.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  libfrog/bitmap.c |   10 +---
> >  scrub/phase6.c   |  148 ++++++++++++++++++++++++++++++++++++++----------------
> >  2 files changed, 108 insertions(+), 50 deletions(-)
> > 
> > 
> > diff --git a/libfrog/bitmap.c b/libfrog/bitmap.c
> > index 6a88ef48..5daa1081 100644
> > --- a/libfrog/bitmap.c
> > +++ b/libfrog/bitmap.c
> > @@ -314,7 +314,6 @@ bitmap_clear(
> >  }
> >  #endif
> >  
> > -#ifdef DEBUG
> >  /* Iterate the set regions of this bitmap. */
> >  int
> >  bitmap_iterate(
> > @@ -324,20 +323,19 @@ bitmap_iterate(
> >  {
> >  	struct avl64node	*node;
> >  	struct bitmap_node	*ext;
> > -	int			error = 0;
> > +	int			ret;
> >  
> >  	pthread_mutex_lock(&bmap->bt_lock);
> >  	avl_for_each(bmap->bt_tree, node) {
> >  		ext = container_of(node, struct bitmap_node, btn_node);
> > -		error = fn(ext->btn_start, ext->btn_length, arg);
> > -		if (error)
> > +		ret = fn(ext->btn_start, ext->btn_length, arg);
> > +		if (ret)
> >  			break;
> >  	}
> >  	pthread_mutex_unlock(&bmap->bt_lock);
> >  
> > -	return error;
> > +	return ret;
> >  }
> > -#endif
> >  
> >  /* Iterate the set regions of part of this bitmap. */
> >  int
> > diff --git a/scrub/phase6.c b/scrub/phase6.c
> > index ec821373..378ea0fb 100644
> > --- a/scrub/phase6.c
> > +++ b/scrub/phase6.c
> > @@ -341,27 +341,9 @@ xfs_report_verify_dirent(
> >  	return moveon;
> >  }
> >  
> > -/* Given bad extent lists for the data & rtdev, find bad files. */
> > -static bool
> > -xfs_report_verify_errors(
> > -	struct scrub_ctx		*ctx,
> > -	struct media_verify_state	*vs)
> > -{
> > -	bool				moveon;
> > -
> > -	/* Scan the directory tree to get file paths. */
> > -	moveon = scan_fs_tree(ctx, xfs_report_verify_dir,
> > -			xfs_report_verify_dirent, vs);
> > -	if (!moveon)
> > -		return false;
> > -
> > -	/* Scan for unlinked files. */
> > -	return xfs_scan_all_inodes(ctx, xfs_report_verify_inode, vs);
> > -}
> > -
> >  /* Report an IO error resulting from read-verify based off getfsmap. */
> >  static bool
> > -xfs_check_rmap_error_report(
> > +ioerr_fsmap_report(
> >  	struct scrub_ctx	*ctx,
> >  	const char		*descr,
> >  	struct fsmap		*map,
> > @@ -409,12 +391,31 @@ xfs_check_rmap_error_report(
> >  	return true;
> >  }
> >  
> > +static struct bitmap *
> > +bitmap_for_disk(
> > +	struct scrub_ctx		*ctx,
> > +	struct disk			*disk,
> > +	struct media_verify_state	*vs)
> > +{
> > +	dev_t				dev = xfs_disk_to_dev(ctx, disk);
> > +
> > +	/*
> > +	 * If we don't have parent pointers, save the bad extent for
> > +	 * later rescanning.
> > +	 */
> 
> This comment doesn't make sense here, does it?
> 
> > +	if (dev == ctx->fsinfo.fs_datadev)
> > +		return vs->d_bad;
> > +	else if (dev == ctx->fsinfo.fs_rtdev)
> > +		return vs->r_bad;
> > +	return NULL;
> > +}
> > +
> >  /*
> >   * Remember a read error for later, and see if rmap will tell us about the
> >   * owner ahead of time.
> >   */
> >  static void
> > -xfs_check_rmap_ioerr(
> > +remember_ioerr(
> >  	struct scrub_ctx		*ctx,
> >  	struct disk			*disk,
> >  	uint64_t			start,
> > @@ -422,32 +423,39 @@ xfs_check_rmap_ioerr(
> >  	int				error,
> >  	void				*arg)
> >  {
> > -	struct fsmap			keys[2];
> > -	char				descr[DESCR_BUFSZ];
> >  	struct media_verify_state	*vs = arg;
> >  	struct bitmap			*tree;
> > -	dev_t				dev;
> >  	int				ret;
> >  
> > -	dev = xfs_disk_to_dev(ctx, disk);
> > +	tree = bitmap_for_disk(ctx, disk, vs);
> > +	if (!tree)
> > +		return;
> >  
> > -	/*
> > -	 * If we don't have parent pointers, save the bad extent for
> > -	 * later rescanning.
> > -	 */
> > -	if (dev == ctx->fsinfo.fs_datadev)
> > -		tree = vs->d_bad;
> > -	else if (dev == ctx->fsinfo.fs_rtdev)
> > -		tree = vs->r_bad;
> > -	else
> > -		tree = NULL;
> > -	if (tree) {
> > -		ret = bitmap_set(tree, start, length);
> > -		if (ret)
> > -			str_liberror(ctx, ret, _("setting bad block bitmap"));
> > -	}
> 
> Maybe that comment should be here?

Oops, yes.

> > +	ret = bitmap_set(tree, start, length);
> > +	if (ret)
> > +		str_liberror(ctx, ret, _("setting bad block bitmap"));
> > +}
> > +
> > +struct walk_ioerr {
> > +	struct scrub_ctx	*ctx;
> > +	struct disk		*disk;
> > +};
> 
> comment here would be great.  Is this walking an ioerror?  Reporting an ioerror
> from a walk?  (or maybe also/instead a comment on walk_ioerrs())
> 
> also whee, functions and structures w/ the same name :D

Dorky C thing to pass variable scope to an iterator function.

> > +static int
> > +walk_ioerr(
> > +	uint64_t		start,
> > +	uint64_t		length,
> > +	void			*arg)
> > +{
> > +	struct walk_ioerr	*wioerr = arg;
> > +	struct fsmap		keys[2];
> > +	char			descr[DESCR_BUFSZ];
> > +	dev_t			dev;
> > +
> > +	dev = xfs_disk_to_dev(wioerr->ctx, wioerr->disk);
> >  
> > -	snprintf(descr, DESCR_BUFSZ, _("dev %d:%d ioerr @ %"PRIu64":%"PRIu64" "),
> > +	snprintf(descr, DESCR_BUFSZ,
> > +_("dev %d:%d ioerr @ %"PRIu64":%"PRIu64" "),
> >  			major(dev), minor(dev), start, length);
> >  
> >  	/* Go figure out which blocks are bad from the fsmap. */
> > @@ -459,8 +467,60 @@ xfs_check_rmap_ioerr(
> >  	(keys + 1)->fmr_owner = ULLONG_MAX;
> >  	(keys + 1)->fmr_offset = ULLONG_MAX;
> >  	(keys + 1)->fmr_flags = UINT_MAX;
> > -	xfs_iterate_fsmap(ctx, descr, keys, xfs_check_rmap_error_report,
> > +	xfs_iterate_fsmap(wioerr->ctx, descr, keys, ioerr_fsmap_report,
> >  			&start);
> > +	return 0;
> > +}
> > +
> > +static int
> > +walk_ioerrs(
> > +	struct scrub_ctx		*ctx,
> > +	struct disk			*disk,
> > +	struct media_verify_state	*vs)
> > +{
> > +	struct walk_ioerr		wioerr = {
> > +		.ctx			= ctx,
> > +		.disk			= disk,
> > +	};> +	struct bitmap			*tree;
> > +
> > +	if (!disk)
> > +		return 0;
> > +	tree = bitmap_for_disk(ctx, disk, vs);
> > +	if (!tree)
> > +		return 0;
> > +	return bitmap_iterate(tree, walk_ioerr, &wioerr);
> > +}
> > +
> > +/* Given bad extent lists for the data & rtdev, find bad files. */
> 
> maybe "find and report bad files" just to tie it in w/ the "report" in
> the function name?

Ok.  I'll work on improving the comments in this file.

> is this only for media errors?  Maybev xfs_report_media_errors?  There are
> so many things going on in scrub (and apparently such a limited namespace
> for functions :D) that I find myself wishing for a little more context
> when I read something in isolation...

Yes.  I'm dropping the xfs_ prefixes, fwiw....

--D

> 
> > +static bool
> > +xfs_report_verify_errors(
> > +	struct scrub_ctx		*ctx,
> > +	struct media_verify_state	*vs)
> > +{
> > +	bool				moveon;
> > +	int				ret;
> > +
> > +	ret = walk_ioerrs(ctx, ctx->datadev, vs);
> > +	if (ret) {
> > +		str_liberror(ctx, ret, _("walking datadev io errors"));
> > +		return false;
> > +	}
> > +
> > +	ret = walk_ioerrs(ctx, ctx->rtdev, vs);
> > +	if (ret) {
> > +		str_liberror(ctx, ret, _("walking rtdev io errors"));
> > +		return false;
> > +	}
> > +
> > +	/* Scan the directory tree to get file paths. */
> > +	moveon = scan_fs_tree(ctx, xfs_report_verify_dir,
> > +			xfs_report_verify_dirent, vs);
> > +	if (!moveon)
> > +		return false;
> > +
> > +	/* Scan for unlinked files. */
> > +	return xfs_scan_all_inodes(ctx, xfs_report_verify_inode, vs);
> >  }
> >  
> >  /* Schedule a read-verify of a (data block) extent. */
> > @@ -571,7 +631,7 @@ xfs_scan_blocks(
> >  	}
> >  
> >  	ret = read_verify_pool_alloc(ctx, ctx->datadev,
> > -			ctx->mnt.fsgeom.blocksize, xfs_check_rmap_ioerr,
> > +			ctx->mnt.fsgeom.blocksize, remember_ioerr,
> >  			scrub_nproc(ctx), &vs.rvp_data);
> >  	if (ret) {
> >  		str_liberror(ctx, ret, _("creating datadev media verifier"));
> > @@ -579,7 +639,7 @@ xfs_scan_blocks(
> >  	}
> >  	if (ctx->logdev) {
> >  		ret = read_verify_pool_alloc(ctx, ctx->logdev,
> > -				ctx->mnt.fsgeom.blocksize, xfs_check_rmap_ioerr,
> > +				ctx->mnt.fsgeom.blocksize, remember_ioerr,
> >  				scrub_nproc(ctx), &vs.rvp_log);
> >  		if (ret) {
> >  			str_liberror(ctx, ret,
> > @@ -589,7 +649,7 @@ xfs_scan_blocks(
> >  	}
> >  	if (ctx->rtdev) {
> >  		ret = read_verify_pool_alloc(ctx, ctx->rtdev,
> > -				ctx->mnt.fsgeom.blocksize, xfs_check_rmap_ioerr,
> > +				ctx->mnt.fsgeom.blocksize, remember_ioerr,
> >  				scrub_nproc(ctx), &vs.rvp_realtime);
> >  		if (ret) {
> >  			str_liberror(ctx, ret,
> > 

  reply	other threads:[~2019-10-21 18:14 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-25 21:36 [PATCH 00/11] xfs_scrub: fix IO error reporting Darrick J. Wong
2019-09-25 21:36 ` [PATCH 01/11] xfs_scrub: separate media error reporting for attribute forks Darrick J. Wong
2019-10-21 16:18   ` Eric Sandeen
2019-10-21 17:32     ` Darrick J. Wong
2019-09-25 21:36 ` [PATCH 02/11] xfs_scrub: improve reporting of file data media errors Darrick J. Wong
2019-10-21 16:33   ` Eric Sandeen
2019-10-21 17:56     ` Darrick J. Wong
2019-09-25 21:36 ` [PATCH 03/11] xfs_scrub: better reporting of metadata " Darrick J. Wong
2019-10-21 16:46   ` Eric Sandeen
2019-10-21 18:10     ` Darrick J. Wong
2019-09-25 21:36 ` [PATCH 04/11] xfs_scrub: improve reporting of file " Darrick J. Wong
2019-10-21 16:53   ` Eric Sandeen
2019-09-25 21:36 ` [PATCH 05/11] xfs_scrub: don't report media errors on unwritten extents Darrick J. Wong
2019-10-21 16:54   ` Eric Sandeen
2019-09-25 21:36 ` [PATCH 06/11] xfs_scrub: reduce fsmap activity for media errors Darrick J. Wong
2019-10-21 17:17   ` Eric Sandeen
2019-10-21 18:14     ` Darrick J. Wong [this message]
2019-09-25 21:36 ` [PATCH 07/11] xfs_scrub: request fewer bmaps when we can Darrick J. Wong
2019-10-21 18:05   ` Eric Sandeen
2019-10-21 18:15     ` Darrick J. Wong
2019-09-25 21:36 ` [PATCH 08/11] xfs_scrub: fix media verification thread pool size calculations Darrick J. Wong
2019-10-21 19:28   ` Eric Sandeen
2019-09-25 21:37 ` [PATCH 09/11] libfrog: clean up platform_nproc Darrick J. Wong
2019-10-21 19:31   ` Eric Sandeen
2019-10-21 20:13     ` Darrick J. Wong
2019-09-25 21:37 ` [PATCH 10/11] xfs_scrub: clean out the nproc global variable Darrick J. Wong
2019-10-21 19:32   ` Eric Sandeen
2019-09-25 21:37 ` [PATCH 11/11] xfs_scrub: create a new category for unfixable errors Darrick J. Wong
2019-10-21 19:45   ` Eric Sandeen
2019-10-21 20:29     ` Darrick J. Wong
2019-10-21 20:38       ` Eric Sandeen
  -- strict thread matches above, loose matches on Subject: below --
2019-09-06  3:38 [PATCH 00/11] xfs_scrub: fix IO error reporting Darrick J. Wong
2019-09-06  3:39 ` [PATCH 06/11] xfs_scrub: reduce fsmap activity for media errors Darrick J. Wong
2019-08-26 21:31 [PATCH 00/11] xfs_scrub: fix IO error reporting Darrick J. Wong
2019-08-26 21:31 ` [PATCH 06/11] xfs_scrub: reduce fsmap activity for media errors Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191021181409.GD913374@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=sandeen@sandeen.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox