linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Allison Henderson <allison.henderson@oracle.com>
Cc: Dave Chinner <david@fromorbit.com>, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 03/22] xfs: add helpers to collect and sift btree block pointers during repair
Date: Wed, 16 May 2018 11:06:15 -0700	[thread overview]
Message-ID: <20180516180615.GJ23858@magnolia> (raw)
In-Reply-To: <982db5b5-23a3-96e1-2e5b-87fae674f1f6@oracle.com>

On Wed, May 16, 2018 at 10:34:36AM -0700, Allison Henderson wrote:
> Ok, with the points Dave made:
> Reviewed by: Allison Henderson <allison.henderson@oracle.com>
> 
> On 05/16/2018 12:56 AM, Dave Chinner wrote:
> > On Tue, May 15, 2018 at 03:33:58PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Add some helpers to assemble a list of fs block extents.  Generally,
> > > repair functions will iterate the rmapbt to make a list (1) of all
> > > extents owned by the nominal owner of the metadata structure; then they
> > > will iterate all other structures with the same rmap owner to make a
> > > list (2) of active blocks; and finally we have a subtraction function to
> > > subtract all the blocks in (2) from (1), with the result that (1) is now
> > > a list of blocks that were owned by the old btree and must be disposed.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >   fs/xfs/scrub/repair.c |  207 +++++++++++++++++++++++++++++++++++++++++++++++++
> > >   fs/xfs/scrub/repair.h |   31 +++++++
> > >   2 files changed, 238 insertions(+)
> > > 
> > > 
> > > diff --git a/fs/xfs/scrub/repair.c b/fs/xfs/scrub/repair.c
> > > index 72f04a717150..8e8ecddd7537 100644
> > > --- a/fs/xfs/scrub/repair.c
> > > +++ b/fs/xfs/scrub/repair.c
> > > @@ -361,3 +361,210 @@ xfs_repair_init_btblock(
> > >   	return 0;
> > >   }
> > > +
> > > +/* Collect a dead btree extent for later disposal. */
> > > +int
> > > +xfs_repair_collect_btree_extent(
> > > +	struct xfs_scrub_context	*sc,
> > > +	struct xfs_repair_extent_list	*exlist,
> > > +	xfs_fsblock_t			fsbno,
> > > +	xfs_extlen_t			len)
> > > +{
> > > +	struct xfs_repair_extent	*rex;
> > > +
> > > +	trace_xfs_repair_collect_btree_extent(sc->mp,
> > > +			XFS_FSB_TO_AGNO(sc->mp, fsbno),
> > > +			XFS_FSB_TO_AGBNO(sc->mp, fsbno), len);
> > > +
> > > +	rex = kmem_alloc(sizeof(struct xfs_repair_extent), KM_MAYFAIL);
> > > +	if (!rex)
> > > +		return -ENOMEM;
> > Is this in transaction context? Regardless, I think we need to run
> > the entire of scrub/repair in a memalloc_nofs_save() context so we
> > don't have memory reclaim recursion issues...
> > 
> > [...]
> > 
> > > +/* Compare two btree extents. */
> > > +static int
> > > +xfs_repair_btree_extent_cmp(
> > > +	void				*priv,
> > > +	struct list_head		*a,
> > > +	struct list_head		*b)
> > > +{
> > > +	struct xfs_repair_extent	*ap;
> > > +	struct xfs_repair_extent	*bp;
> > > +
> > > +	ap = container_of(a, struct xfs_repair_extent, list);
> > > +	bp = container_of(b, struct xfs_repair_extent, list);
> > > +
> > > +	if (ap->fsbno > bp->fsbno)
> > > +		return 1;
> > > +	else if (ap->fsbno < bp->fsbno)
> > > +		return -1;
> > No need for the else there.
> Well, I think he meant to return 0 in the case of ap->fsbno == bp->fsbno? 
> Am i reading that right?  caller expects 1 for greater than, -1 for less
> than and 0 on equivalence?

Correct.  I think Dave was pointing out that else-after-return is
unnecessary, since the following behaves equivalently:

if (a > b)
	return 1;
if (a < b)
	return -1;
return 0;

Note that gcc (7.3, anyway) generates the same asm for either version so
I assume this is mostly stylistic cleanup to make the comparisons line
up?  I don't have a preference either way. :)

--D

> > > +	return 0;
> > > +}
> > > +
> > > +/*
> > > + * Remove all the blocks mentioned in sublist from the extents in exlist.
> > > + *
> > > + * The intent is that callers will iterate the rmapbt for all of its records
> > > + * for a given owner to generate exlist; and iterate all the blocks of the
> > generate @exlist
> > 
> > > + * metadata structures that are not being rebuilt and have the same rmapbt
> > > + * owner to generate sublist.  This routine subtracts all the extents
> > generate @sublist.
> > 
> > > + * mentioned in sublist from all the extents linked in exlist, which leaves
> > > + * exlist as the list of blocks that are not accounted for, which we assume
> > > + * are the dead blocks of the old metadata structure.  The blocks mentioned in
> > > + * exlist can be reaped.
> > > + */
> > > +#define XFS_REPAIR_EXT_LEFT_CONTIG	(1 << 0)
> > > +#define XFS_REPAIR_EXT_RIGHT_CONTIG	(1 << 1)
> > > +int
> > > +xfs_repair_subtract_extents(
> > > +	struct xfs_scrub_context	*sc,
> > > +	struct xfs_repair_extent_list	*exlist,
> > > +	struct xfs_repair_extent_list	*sublist)
> > > +{
> > > +	struct list_head		*lp;
> > > +	struct xfs_repair_extent	*ex;
> > > +	struct xfs_repair_extent	*newex;
> > > +	struct xfs_repair_extent	*subex;
> > > +	xfs_fsblock_t			sub_fsb;
> > > +	xfs_extlen_t			sub_len;
> > > +	int				state;
> > > +	int				error = 0;
> > > +
> > > +	if (list_empty(&exlist->list) || list_empty(&sublist->list))
> > > +		return 0;
> > > +	ASSERT(!list_empty(&sublist->list));
> > > +
> > > +	list_sort(NULL, &exlist->list, xfs_repair_btree_extent_cmp);
> > > +	list_sort(NULL, &sublist->list, xfs_repair_btree_extent_cmp);
> > > +
> > > +	subex = list_first_entry(&sublist->list, struct xfs_repair_extent,
> > > +			list);
> > > +	lp = exlist->list.next;
> > > +	while (lp != &exlist->list) {
> > > +		ex = list_entry(lp, struct xfs_repair_extent, list);
> > > +
> > > +		/*
> > > +		 * Advance subex and/or ex until we find a pair that
> > > +		 * intersect or we run out of extents.
> > > +		 */
> > > +		while (subex->fsbno + subex->len <= ex->fsbno) {
> > > +			if (list_is_last(&subex->list, &sublist->list))
> > > +				goto out;
> > > +			subex = list_next_entry(subex, list);
> > > +		}
> > So this is a O(n^2) algorithm, right? How does it scale with large
> > extent lists? Given that these extents are dynamically allocated,
> > and we're already burning 16 bytes for a list head on each extent,
> > would it be better to use a smarter structure better suited for
> > exact lookups? e.g. an rbtree only takes an extra 8 bytes per
> > extent, and we get O(log N) searches on the inner loop here...
> > 
> > I guess this isn't necessary to fix right now, but I think it's
> > going to be an issue for maybe mark this down as "needing to be
> > fixed before removing EXPERIMENTAL tags"?
> > 
> > > +		if (subex->fsbno >= ex->fsbno + ex->len) {
> > > +			lp = lp->next;
> > > +			continue;
> > > +		}
> > > +
> > > +		/* trim subex to fit the extent we have */
> > > +		sub_fsb = subex->fsbno;
> > > +		sub_len = subex->len;
> > > +		if (subex->fsbno < ex->fsbno) {
> > > +			sub_len -= ex->fsbno - subex->fsbno;
> > > +			sub_fsb = ex->fsbno;
> > > +		}
> > > +		if (sub_len > ex->len)
> > > +			sub_len = ex->len;
> > > +
> > > +		state = 0;
> > > +		if (sub_fsb == ex->fsbno)
> > > +			state |= XFS_REPAIR_EXT_LEFT_CONTIG;
> > > +		if (sub_fsb + sub_len == ex->fsbno + ex->len)
> > > +			state |= XFS_REPAIR_EXT_RIGHT_CONTIG;
> > Ok, I think "CONTIG" is not the right word to use here. In the BMAP
> > btrees, the merge state flags were to tell us whether the edge of
> > the new extent is contiguous with the left and right extents in
> > the tree, not whether the new extents overlapped to the left/right
> > edges.
> > 
> > i.e. we're checking whether extent start/end overlaps are aligned
> > here, not whether they are contiguous with some other extent. So in
> > this case, I'd just name the variables "LEFT_ALIGNED" and
> > "RIGHT_ALIGNED" and drop all the namespace bits from them.
> > 
> > > +		switch (state) {
> > > +		case XFS_REPAIR_EXT_LEFT_CONTIG:
> > > +			/* Coincides with only the left. */
> > And by calling them aligned, the comments become redundant:
> > 
> > 		case LEFT_ALIGNED:
> > > +			ex->fsbno += sub_len;
> > > +			ex->len -= sub_len;
> > > +			break;
> > > +		case XFS_REPAIR_EXT_RIGHT_CONTIG:
> > > +			/* Coincides with only the right. */
> > > +			ex->len -= sub_len;
> > > +			lp = lp->next;
> > > +			break;
> > > +		case XFS_REPAIR_EXT_LEFT_CONTIG | XFS_REPAIR_EXT_RIGHT_CONTIG:
> > > +			/* Total overlap, just delete ex. */
> > > +			lp = lp->next;
> > > +			list_del(&ex->list);
> > > +			kmem_free(ex);
> > > +			break;
> > > +		case 0:
> > > +			/*
> > > +			 * Deleting from the middle: add the new right extent
> > > +			 * and then shrink the left extent.
> > > +			 */
> > > +			newex = kmem_alloc(sizeof(struct xfs_repair_extent),
> > > +					KM_MAYFAIL);
> > > +			if (!newex) {
> > > +				error = -ENOMEM;
> > > +				goto out;
> > > +			}
> > > +			INIT_LIST_HEAD(&newex->list);
> > > +			newex->fsbno = sub_fsb + sub_len;
> > > +			newex->len = ex->len - (sub_fsb - ex->fsbno) - sub_len;
> > so: new len = old len - (length of first extent) - length of overlap.
> > 
> > I think this is more obvious as "new len = old end - new start", i.e.:
> > 
> > 			newex->len = ex->fsbno + ex->len - newex->fsbno;
> > 
> > Cheers,
> > 
> > Dave.
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2018-05-16 18:06 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-15 22:33 [PATCH v15.1 00/22] xfs-4.18: online repair support Darrick J. Wong
2018-05-15 22:33 ` [PATCH 01/22] xfs: add helpers to deal with transaction allocation and rolling Darrick J. Wong
2018-05-16  6:51   ` Dave Chinner
2018-05-16 16:46     ` Darrick J. Wong
2018-05-16 21:19       ` Dave Chinner
2018-05-16 16:48   ` Allison Henderson
2018-05-18  3:49   ` [PATCH v2 " Darrick J. Wong
2018-05-15 22:33 ` [PATCH 02/22] xfs: add helpers to allocate and initialize fresh btree roots Darrick J. Wong
2018-05-16  7:07   ` Dave Chinner
2018-05-16 17:15     ` Darrick J. Wong
2018-05-16 17:00   ` Allison Henderson
2018-05-15 22:33 ` [PATCH 03/22] xfs: add helpers to collect and sift btree block pointers during repair Darrick J. Wong
2018-05-16  7:56   ` Dave Chinner
2018-05-16 17:34     ` Allison Henderson
2018-05-16 18:06       ` Darrick J. Wong [this message]
2018-05-16 21:23         ` Dave Chinner
2018-05-16 21:33           ` Allison Henderson
2018-05-16 18:01     ` Darrick J. Wong
2018-05-16 21:32       ` Dave Chinner
2018-05-16 22:05         ` Darrick J. Wong
2018-05-17  0:41           ` Dave Chinner
2018-05-17  5:05             ` Darrick J. Wong
2018-05-18  3:51   ` [PATCH v2 " Darrick J. Wong
2018-05-29  3:10     ` Dave Chinner
2018-05-29 15:28       ` Darrick J. Wong
2018-05-15 22:34 ` [PATCH 04/22] xfs: add helpers to dispose of old btree blocks after a repair Darrick J. Wong
2018-05-16  8:32   ` Dave Chinner
2018-05-16 18:02     ` Allison Henderson
2018-05-16 19:34     ` Darrick J. Wong
2018-05-16 22:32       ` Dave Chinner
2018-05-16 23:18         ` Darrick J. Wong
2018-05-17  5:58           ` Darrick J. Wong
2018-05-18  3:53   ` [PATCH v2 " Darrick J. Wong
2018-05-29  3:14     ` Dave Chinner
2018-05-29 18:01       ` Darrick J. Wong
2018-05-15 22:34 ` [PATCH 05/22] xfs: recover AG btree roots from rmap data Darrick J. Wong
2018-05-16  8:51   ` Dave Chinner
2018-05-16 18:37     ` Darrick J. Wong
2018-05-16 19:18       ` Allison Henderson
2018-05-16 22:36       ` Dave Chinner
2018-05-17  5:53         ` Darrick J. Wong
2018-05-18  3:54   ` [PATCH v2 " Darrick J. Wong
2018-05-29  3:16     ` Dave Chinner
2018-05-15 22:34 ` [PATCH 06/22] xfs: add a repair helper to reset superblock counters Darrick J. Wong
2018-05-16 21:29   ` Allison Henderson
2018-05-18  3:56     ` Darrick J. Wong
2018-05-18  3:56   ` [PATCH v2 " Darrick J. Wong
2018-05-29  3:28     ` Dave Chinner
2018-05-29 22:07       ` Darrick J. Wong
2018-05-29 22:24         ` Dave Chinner
2018-05-29 22:43           ` Darrick J. Wong
2018-05-30  1:23             ` Dave Chinner
2018-05-30  3:22               ` Darrick J. Wong
2018-05-15 22:34 ` [PATCH 07/22] xfs: add helpers to attach quotas to inodes Darrick J. Wong
2018-05-16 22:21   ` Allison Henderson
2018-05-18  3:58   ` [PATCH v2 " Darrick J. Wong
2018-05-29  3:29     ` Dave Chinner
2018-05-15 22:34 ` [PATCH 08/22] xfs: repair superblocks Darrick J. Wong
2018-05-16 22:55   ` Allison Henderson
2018-05-29  3:42   ` Dave Chinner
2018-05-15 22:34 ` [PATCH 09/22] xfs: repair the AGF and AGFL Darrick J. Wong
2018-05-15 22:34 ` [PATCH 10/22] xfs: repair the AGI Darrick J. Wong
2018-05-15 22:34 ` [PATCH 11/22] xfs: repair free space btrees Darrick J. Wong
2018-05-15 22:34 ` [PATCH 12/22] xfs: repair inode btrees Darrick J. Wong
2018-05-15 22:35 ` [PATCH 13/22] xfs: repair the rmapbt Darrick J. Wong
2018-05-15 22:35 ` [PATCH 14/22] xfs: repair refcount btrees Darrick J. Wong
2018-05-15 22:35 ` [PATCH 15/22] xfs: repair inode records Darrick J. Wong
2018-05-15 22:35 ` [PATCH 16/22] xfs: zap broken inode forks Darrick J. Wong
2018-05-15 22:35 ` [PATCH 17/22] xfs: repair inode block maps Darrick J. Wong
2018-05-15 22:35 ` [PATCH 18/22] xfs: repair damaged symlinks Darrick J. Wong
2018-05-15 22:35 ` [PATCH 19/22] xfs: repair extended attributes Darrick J. Wong
2018-05-15 22:35 ` [PATCH 20/22] xfs: scrub should set preen if attr leaf has holes Darrick J. Wong
2018-05-15 22:35 ` [PATCH 21/22] xfs: repair quotas Darrick J. Wong
2018-05-15 22:36 ` [PATCH 22/22] xfs: implement live quotacheck as part of quota repair Darrick J. Wong
2018-05-18  3:47 ` [PATCH 0.5/22] xfs: grab the per-ag structure whenever relevant Darrick J. Wong
2018-05-30  6:44   ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180516180615.GJ23858@magnolia \
    --to=darrick.wong@oracle.com \
    --cc=allison.henderson@oracle.com \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).