All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org, djwong@kernel.org
Subject: Re: [PATCH 5/5] xfs: fix agfl wrapping
Date: Tue, 6 Mar 2018 07:02:44 -0500	[thread overview]
Message-ID: <20180306120244.GA10982@bfoster.bfoster> (raw)
In-Reply-To: <20180306021539.GF18989@magnolia>

On Mon, Mar 05, 2018 at 06:15:39PM -0800, Darrick J. Wong wrote:
> On Mon, Mar 05, 2018 at 02:24:30PM -0800, Darrick J. Wong wrote:
> > On Sat, Mar 03, 2018 at 08:59:50AM -0500, Brian Foster wrote:
> > > On Fri, Mar 02, 2018 at 08:12:07AM -0500, Brian Foster wrote:
> > > > On Thu, Mar 01, 2018 at 12:55:41PM -0800, Darrick J. Wong wrote:
> > > > > On Thu, Mar 01, 2018 at 12:28:33PM -0500, Brian Foster wrote:
> > > > > > On Wed, Feb 28, 2018 at 03:20:32PM -0800, Darrick J. Wong wrote:
> > > > > > > On Wed, Feb 28, 2018 at 05:43:51PM -0500, Brian Foster wrote:
> > > > > > > > On Tue, Feb 27, 2018 at 01:03:13PM -0800, Darrick J. Wong wrote:
> > > > > > > > > On Tue, Feb 27, 2018 at 02:35:49PM -0500, Brian Foster wrote:
> > > > > > > > > > On Thu, Feb 22, 2018 at 06:00:15PM -0800, Darrick J. Wong wrote:
> > > > > > > > > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > ...
> > > > > Going forward, I want the number of unpacked kernels to decrease as
> > > > > quickly as possible.  I understand that distro kernel maintainers are
> > > > > not willing to apply the packing patch to their kernel until we come up
> > > > > with a smooth transition path.
> > > > > 
> > > > 
> > > > I agree wrt to upstream, but note that I don't anticipate including the
> > > > padding fix downstream any time soon.
> > > > 
> > > > > I don't want to support fixing agfls to be 118 units long on 64-bit
> > > > > unpacked kernels and 119 units long on 32-bit unpacked kernels, and I
> > > > > only want to support the packed kernels with their 119 unit long agfls.
> > > > > An AGFL that starts at 0 and ends at flcount-1 is compatible with packed
> > > > > and unpacked kernels, so the v2 patch I sent last night removes the
> > > > > delicate per-case surgery in favor of a new strategy where the mount
> > > > > time and unmount time helpers both look for agfl configurationss that
> > > > > are known to cause problems, and solves them all with the same solution:
> > > > > moving the agfl list towards the start of the block.
> > > > > 
> > > > 
> > > > The purpose of the patch I sent was not for upstream unpacked support
> > > > going forward. Upstream has clearly moved forward with the packed
> > > > format. The goal of the patch was to explore a single/generic patch that
> > > > could be merged upstream/downstream and handle compatibility cleanly.
> > > > 
> > > 
> > > FWIW, here's a new variant of a bidirectional fixup. It's refactored and
> > > polished up a bit into a patch. It basically inspects the agf when first
> > > read for any evidence that the on-disk fields reflect a size mismatch
> > > with the current kernel and sets a flag if so. agfl gets/puts check the
> > > flag and thus the first transaction that attempts to modify a mismatched
> > > agfl swaps a block into or out of the gap slot appropriately.
> > > 
> > > This avoids the need for any new transactions or mount time scan and the
> > > (downstream motivated) packed -> unpacked case is only 10 or so lines of
> > > additional code. Only spot tested, but I _think_ it covers all of the
> > > cases. Hm?
> > 
> > Just from a quick glance this looks like a reasonable way to fix the
> > agfl wrapping to whatever the running kernel expects.  I tried feeding
> > it to the xfstest I wrote to exercise my agfl fixer[1], but even with
> > changing the test to fill the fs to enospc and delete everything I
> > couldn't get it to trigger reliably.
> > 
> > [1] https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfstests-dev.git/commit/?h=djwong-experimental&id=f085bb09c839da69daf921da33f5d13c80c9f165
> 
> Ok, I got it to trigger by updating that xfstest to fragment the free
> space after mounting.  According to the test results on 4.16-rc4 it
> fixes the deliberately tweaked agfls to work without complaint, so could
> you please write this up as a proper patch?
> 

Yeah, I had to trigger this variant by filling a small fs (based on a
couple metadumps I had created with wrapped packed/unpacked agfls) and
then fragmenting free space.

Thanks for looking. I need to run through some more thorough testing and
then I'll post it.

Brian

> --D
> 
> > 
> > --D
> > 
> > > Brian
> > > 
> > > --- 8< ---
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c
> > > index c02781a4c091..1cbf80d8481f 100644
> > > --- a/fs/xfs/libxfs/xfs_alloc.c
> > > +++ b/fs/xfs/libxfs/xfs_alloc.c
> > > @@ -2053,6 +2053,115 @@ xfs_alloc_space_available(
> > >  	return true;
> > >  }
> > >  
> > > +static int
> > > +xfs_agfl_ondisk_size(
> > > +	struct xfs_mount	*mp,
> > > +	int			first,
> > > +	int			last,
> > > +	int			count)
> > > +{
> > > +	int			active = count;
> > > +	int			agfl_size = XFS_AGFL_SIZE(mp);
> > > +
> > > +	if (count && last >= first)
> > > +		active = last - first + 1;
> > > +	else if (count)
> > > +		active = agfl_size - first + last + 1;
> > > +
> > > +	if (active == count + 1)
> > > +		return agfl_size - 1;
> > > +	if (active == count - 1 || first == agfl_size || last == agfl_size)
> > > +		return agfl_size + 1;
> > > +
> > > +	ASSERT(active == count);
> > > +	return agfl_size;
> > > +}
> > > +
> > > +static bool
> > > +xfs_agfl_need_padfix(
> > > +	struct xfs_mount	*mp,
> > > +	struct xfs_agf		*agf)
> > > +{
> > > +	int			f = be32_to_cpu(agf->agf_flfirst);
> > > +	int			l = be32_to_cpu(agf->agf_fllast);
> > > +	int			c = be32_to_cpu(agf->agf_flcount);
> > > +
> > > +	return xfs_agfl_ondisk_size(mp, f, l, c) != XFS_AGFL_SIZE(mp);
> > > +}
> > > +
> > > +static int
> > > +xfs_agfl_check_padfix(
> > > +	struct xfs_trans	*tp,
> > > +	struct xfs_buf		*agbp,
> > > +	struct xfs_buf		*agflbp,
> > > +	struct xfs_perag	*pag)
> > > +{
> > > +	struct xfs_mount	*mp = tp->t_mountp;
> > > +	struct xfs_agf		*agf = XFS_BUF_TO_AGF(agbp);
> > > +	__be32			*agfl_bno = XFS_BUF_TO_AGFL_BNO(mp, agflbp);
> > > +	int			agfl_size = XFS_AGFL_SIZE(mp);
> > > +	int			ofirst, olast, osize;
> > > +	int			nfirst, nlast;
> > > +	int			logflags = 0;
> > > +	int			startoff = 0;
> > > +
> > > +	if (!pag->pagf_needpadfix)
> > > +		return 0;
> > > +
> > > +	ofirst = nfirst = be32_to_cpu(agf->agf_flfirst);
> > > +	olast = nlast = be32_to_cpu(agf->agf_fllast);
> > > +	osize = xfs_agfl_ondisk_size(mp, ofirst, olast, pag->pagf_flcount);
> > > +
> > > +	/*
> > > +	 * If the on-disk agfl is smaller than what the kernel expects, the
> > > +	 * last slot of the on-disk agfl is a gap with bogus data. Move the
> > > +	 * first valid block into the gap and bump the pointer.
> > > +	 */
> > > +	if (osize < agfl_size) {
> > > +		ASSERT(pag->pagf_flcount != 0);
> > > +		agfl_bno[agfl_size - 1] = agfl_bno[ofirst];
> > > +		startoff = (char *) &agfl_bno[agfl_size - 1] - (char *) agflbp->b_addr;
> > > +		nfirst++;
> > > +		goto done;
> > > +	}
> > > +
> > > +	/*
> > > +	 * Otherwise, the on-disk agfl is larger than what the current kernel
> > > +	 * expects. If empty, just fix up the first and last pointers. If not,
> > > +	 * move the inaccessible block to the end of the valid range.
> > > +	 */
> > > +	nfirst = do_mod(nfirst, agfl_size);
> > > +	if (pag->pagf_flcount == 0) {
> > > +		nlast = (nfirst == 0 ? agfl_size - 1 : nfirst - 1);
> > > +		goto done;
> > > +	}
> > > +	if (nlast != agfl_size)
> > > +		nlast++;
> > > +	nlast = do_mod(nlast, agfl_size);
> > > +	agfl_bno[nlast] = agfl_bno[osize - 1];
> > > +	startoff = (char *) &agfl_bno[nlast] - (char *) agflbp->b_addr;
> > > +
> > > +done:
> > > +	if (nfirst != ofirst) {
> > > +		agf->agf_flfirst = cpu_to_be32(nfirst);
> > > +		logflags |= XFS_AGF_FLFIRST;
> > > +	}
> > > +	if (nlast != olast) {
> > > +		agf->agf_fllast = cpu_to_be32(nlast);
> > > +		logflags |= XFS_AGF_FLLAST;
> > > +	}
> > > +	if (startoff) {
> > > +		xfs_trans_buf_set_type(tp, agflbp, XFS_BLFT_AGFL_BUF);
> > > +		xfs_trans_log_buf(tp, agflbp, startoff,
> > > +				  startoff + sizeof(xfs_agblock_t) - 1);
> > > +	}
> > > +	if (logflags)
> > > +		xfs_alloc_log_agf(tp, agbp, logflags);
> > > +
> > > +	pag->pagf_needpadfix = false;
> > > +	return 0;
> > > +}
> > > +
> > >  /*
> > >   * Decide whether to use this allocation group for this allocation.
> > >   * If so, fix up the btree freelist's size.
> > > @@ -2258,6 +2367,12 @@ xfs_alloc_get_freelist(
> > >  	if (error)
> > >  		return error;
> > >  
> > > +	pag = xfs_perag_get(mp, be32_to_cpu(agf->agf_seqno));
> > > +	error = xfs_agfl_check_padfix(tp, agbp, agflbp, pag);
> > > +	if (error) {
> > > +		xfs_perag_put(pag);
> > > +		return error;
> > > +	}
> > >  
> > >  	/*
> > >  	 * Get the block number and update the data structures.
> > > @@ -2269,7 +2384,6 @@ xfs_alloc_get_freelist(
> > >  	if (be32_to_cpu(agf->agf_flfirst) == XFS_AGFL_SIZE(mp))
> > >  		agf->agf_flfirst = 0;
> > >  
> > > -	pag = xfs_perag_get(mp, be32_to_cpu(agf->agf_seqno));
> > >  	be32_add_cpu(&agf->agf_flcount, -1);
> > >  	xfs_trans_agflist_delta(tp, -1);
> > >  	pag->pagf_flcount--;
> > > @@ -2376,11 +2490,18 @@ xfs_alloc_put_freelist(
> > >  	if (!agflbp && (error = xfs_alloc_read_agfl(mp, tp,
> > >  			be32_to_cpu(agf->agf_seqno), &agflbp)))
> > >  		return error;
> > > +
> > > +	pag = xfs_perag_get(mp, be32_to_cpu(agf->agf_seqno));
> > > +	error = xfs_agfl_check_padfix(tp, agbp, agflbp, pag);
> > > +	if (error) {
> > > +		xfs_perag_put(pag);
> > > +		return error;
> > > +	}
> > > +
> > >  	be32_add_cpu(&agf->agf_fllast, 1);
> > >  	if (be32_to_cpu(agf->agf_fllast) == XFS_AGFL_SIZE(mp))
> > >  		agf->agf_fllast = 0;
> > >  
> > > -	pag = xfs_perag_get(mp, be32_to_cpu(agf->agf_seqno));
> > >  	be32_add_cpu(&agf->agf_flcount, 1);
> > >  	xfs_trans_agflist_delta(tp, 1);
> > >  	pag->pagf_flcount++;
> > > @@ -2588,6 +2709,7 @@ xfs_alloc_read_agf(
> > >  		pag->pagb_count = 0;
> > >  		pag->pagb_tree = RB_ROOT;
> > >  		pag->pagf_init = 1;
> > > +		pag->pagf_needpadfix = xfs_agfl_need_padfix(mp, agf);
> > >  	}
> > >  #ifdef DEBUG
> > >  	else if (!XFS_FORCED_SHUTDOWN(mp)) {
> > > diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
> > > index e0792d036be2..78a6377a9b38 100644
> > > --- a/fs/xfs/xfs_mount.h
> > > +++ b/fs/xfs/xfs_mount.h
> > > @@ -353,6 +353,7 @@ typedef struct xfs_perag {
> > >  	char		pagi_inodeok;	/* The agi is ok for inodes */
> > >  	uint8_t		pagf_levels[XFS_BTNUM_AGF];
> > >  					/* # of levels in bno & cnt btree */
> > > +	bool		pagf_needpadfix;
> > >  	uint32_t	pagf_flcount;	/* count of blocks in freelist */
> > >  	xfs_extlen_t	pagf_freeblks;	/* total free blocks */
> > >  	xfs_extlen_t	pagf_longest;	/* longest free space */
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

  reply	other threads:[~2018-03-06 12:02 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-02-23  1:59 [PATCH 0/5] xfs: fix various problems Darrick J. Wong
2018-02-23  1:59 ` [PATCH 1/5] xfs: don't iunlock the quota ip when quota block allocation fails Darrick J. Wong
2018-02-27 13:55   ` Brian Foster
2018-02-23  1:59 ` [PATCH 2/5] xfs: convert a few more directory asserts to corruption returns Darrick J. Wong
2018-02-27 13:55   ` Brian Foster
2018-02-23  2:00 ` [PATCH 3/5] xfs: check for cow blocks before trying to clear them during inode reclaim Darrick J. Wong
2018-02-27 13:55   ` Brian Foster
2018-02-23  2:00 ` [PATCH 4/5] xfs: convert XFS_AGFL_SIZE to a helper function Darrick J. Wong
2018-02-27 19:34   ` Brian Foster
2018-02-23  2:00 ` [PATCH 5/5] xfs: fix agfl wrapping Darrick J. Wong
2018-02-23  4:40   ` Darrick J. Wong
2018-02-23 20:33   ` Darrick J. Wong
2018-02-27 19:35   ` Brian Foster
2018-02-27 21:03     ` Darrick J. Wong
2018-02-28 22:43       ` Brian Foster
2018-02-28 23:20         ` Darrick J. Wong
2018-03-01 17:28           ` Brian Foster
2018-03-01 20:55             ` Darrick J. Wong
2018-03-02 13:12               ` Brian Foster
2018-03-03 13:59                 ` Brian Foster
2018-03-05 22:24                   ` Darrick J. Wong
2018-03-05 22:53                     ` Dave Chinner
2018-03-06  2:15                     ` Darrick J. Wong
2018-03-06 12:02                       ` Brian Foster [this message]
2018-03-01  6:37     ` Darrick J. Wong
2018-03-01  6:42   ` [PATCH v2 " Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180306120244.GA10982@bfoster.bfoster \
    --to=bfoster@redhat.com \
    --cc=darrick.wong@oracle.com \
    --cc=djwong@kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.