From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com
Subject: Re: [PATCH 2/3] xfs: periodically relog deferred intent items
Date: Thu, 17 Sep 2020 00:18:21 -0700
Message-ID: <20200917071821.GX7955@magnolia>
In-Reply-To: <20200917061148.GH12131@dread.disaster.area>

On Thu, Sep 17, 2020 at 04:11:48PM +1000, Dave Chinner wrote:
> On Wed, Sep 16, 2020 at 08:30:00PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > There's a subtle design flaw in the deferred log item code that can lead
> > to pinning the log tail.  Taking up the defer ops chain examples from
> > the previous commit, we can get trapped in sequences like this:
> > 
> > Caller hands us a transaction t0 with D0-D3 attached.  The defer ops
> > chain will look like the following if the transaction rolls succeed:
> > 
> > t1: D0(t0), D1(t0), D2(t0), D3(t0)
> > t2: d4(t1), d5(t1), D1(t0), D2(t0), D3(t0)
> > t3: d5(t1), D1(t0), D2(t0), D3(t0)
> > ...
> > t9: d9(t7), D3(t0)
> > t10: D3(t0)
> > t11: d10(t10), d11(t10)
> > t12: d11(t10)
> > 
> > In transaction 9, we finish d9 and try to roll to t10 while holding onto
> > an intent item for D3 that we logged in t0.
> > 
> > The previous commit changed the order in which we place new defer ops in
> > the defer ops processing chain to reduce the maximum chain length.  Now
> > make xfs_defer_finish_noroll capable of relogging the entire chain
> > periodically so that we can always move the log tail forward.  We do
> > this every seven loops, having observed that while most chains never
> > exceed seven items in length, the rest go far over that and seem to
> > be involved in most of the stall problems.
> > 
> > Callers are now required to ensure that the transaction reservation is
> > large enough to handle logging done items and new intent items for the
> > maximum possible chain length.  Most callers are careful to keep the
> > chain lengths low, so the overhead should be minimal.
> > 
> > (Note that in the next patch we'll make it so that we only relog on
> > demand, since 7 is an arbitrary number that I used here to get the basic
> > mechanics working.)
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_defer.c  |   30 ++++++++++++++++++++++++++++++
> >  fs/xfs/xfs_bmap_item.c     |   25 +++++++++++++++++++++++++
> >  fs/xfs/xfs_extfree_item.c  |   29 +++++++++++++++++++++++++++++
> >  fs/xfs/xfs_refcount_item.c |   27 +++++++++++++++++++++++++++
> >  fs/xfs/xfs_rmap_item.c     |   27 +++++++++++++++++++++++++++
> >  fs/xfs/xfs_trace.h         |    1 +
> >  fs/xfs/xfs_trans.h         |   10 ++++++++++
> >  7 files changed, 149 insertions(+)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
> > index 84a70edd0da1..7938e4d3af90 100644
> > --- a/fs/xfs/libxfs/xfs_defer.c
> > +++ b/fs/xfs/libxfs/xfs_defer.c
> > @@ -361,6 +361,28 @@ xfs_defer_cancel_list(
> >  	}
> >  }
> >  
> > +/*
> > + * Prevent a log intent item from pinning the tail of the log by logging a
> > + * done item to release the intent item; and then log a new intent item.
> > + * The caller should provide a fresh transaction and roll it after we're done.
> > + */
> > +static int
> > +xfs_defer_relog(
> > +	struct xfs_trans		**tpp,
> > +	struct list_head		*dfops)
> > +{
> > +	struct xfs_defer_pending	*dfp;
> > +
> > +	ASSERT((*tpp)->t_flags & XFS_TRANS_PERM_LOG_RES);
> > +
> > +	list_for_each_entry(dfp, dfops, dfp_list) {
> > +		trace_xfs_defer_relog_intent((*tpp)->t_mountp, dfp);
> > +		dfp->dfp_intent = xfs_trans_item_relog(dfp->dfp_intent, *tpp);
> 
> Any reason for xfs_trans_item_relog() when it's a one liner?

There aren't log intent items in userspace, so xfs_trans_item_relog
becomes a NOP macro in the xfsprogs port.
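
For illustration only, a minimal sketch of that kind of compile-time
switch.  Every name here (TOY_KERNEL, toy_log_item, toy_trans,
toy_trans_item_relog) is a made-up stand-in, not the real kernel or
xfsprogs definition:

```c
#include <stddef.h>

/* Toy stand-ins; these are NOT the real kernel or xfsprogs types. */
struct toy_log_item { int id; };
struct toy_trans { int nr_relogged; };

#ifdef TOY_KERNEL
/* "Kernel" build: pretend to relog the item and hand it back. */
static inline struct toy_log_item *
toy_trans_item_relog(struct toy_log_item *lip, struct toy_trans *tp)
{
	tp->nr_relogged++;
	return lip;
}
#else
/*
 * "Userspace" build: no log intent items exist, so the helper is a no-op
 * that still evaluates its arguments and yields NULL.
 */
#define toy_trans_item_relog(lip, tp) \
	((void)(lip), (void)(tp), (struct toy_log_item *)NULL)
#endif
```

Keeping the call behind a helper lets the shared libxfs caller stay
identical in both builds; only the definition differs.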

> > +	}
> > +
> > +	return xfs_defer_trans_roll(tpp);
> > +}
> > +
> >  /*
> >   * Log an intent-done item for the first pending intent, and finish the work
> >   * items.
> > @@ -422,6 +444,7 @@ xfs_defer_finish_noroll(
> >  	struct xfs_trans		**tp)
> >  {
> >  	struct xfs_defer_pending	*dfp;
> > +	unsigned int			nr_rolls = 0;
> >  	int				error = 0;
> >  	LIST_HEAD(dop_pending);
> >  
> > @@ -447,6 +470,13 @@ xfs_defer_finish_noroll(
> >  		if (error)
> >  			goto out_shutdown;
> >  
> > +		/* Every few rolls we relog all the intent items. */
> > +		if (!(++nr_rolls % 7)) {
> > +			error = xfs_defer_relog(tp, &dop_pending);
> > +			if (error)
> > +				goto out_shutdown;
> > +		}
> 
> Urk.
> 
> I think I've got a better idea: rather than a counter, use something
> meaningful as to whether the intent has been committed or not. e.g.
> use something like xfs_log_item_in_current_chkpt() to determine if
> we need to relog the intent.

I'll take a look at that in the morning.

> i.e. If the intent is active in the CIL, then we don't need to relog
> it. If the intent has been committed to the journal and is no longer
> in the CIL list, relog it so the next CIL push will move it forward
> in the journal.
> 
> The intent relogging functions look fine, though.

Thanks for digging through some of these. :)
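
As a rough sketch of the checkpoint-based test suggested above, here is
a toy model that relogs only the intents that are no longer in the
current checkpoint.  The types and functions (toy_intent,
toy_item_in_current_chkpt, toy_defer_relog) and the use of a bare
sequence number are assumptions made for illustration; this is not the
kernel's xfs_log_item_in_current_chkpt() or the eventual patch:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy intent: remembers the checkpoint sequence it was last logged in. */
struct toy_intent {
	unsigned long	logged_seq;
	unsigned int	relog_count;
};

/* Hypothetical stand-in for a "still in the current checkpoint?" test. */
static bool
toy_item_in_current_chkpt(const struct toy_intent *ip, unsigned long cur_seq)
{
	return ip->logged_seq == cur_seq;
}

/*
 * Relog every intent that has fallen out of the current checkpoint and
 * return how many were relogged.  Intents still in the current
 * checkpoint are skipped.
 */
static size_t
toy_defer_relog(struct toy_intent *items, size_t nr, unsigned long cur_seq)
{
	size_t	relogged = 0;

	for (size_t i = 0; i < nr; i++) {
		if (toy_item_in_current_chkpt(&items[i], cur_seq))
			continue;
		items[i].logged_seq = cur_seq;	/* moved into this checkpoint */
		items[i].relog_count++;
		relogged++;
	}
	return relogged;
}
```

Unlike the fixed "every seven rolls" counter, a second call with the
same sequence number relogs nothing, because every intent is already in
the current checkpoint.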

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
