From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org, bfoster@redhat.com
Subject: Re: [PATCH 3/3] xfs: periodically relog deferred intent items
Date: Mon, 28 Sep 2020 11:00:50 +1000 [thread overview]
Message-ID: <20200928010050.GB14422@dread.disaster.area> (raw)
In-Reply-To: <20200927233025.GA49547@magnolia>
On Sun, Sep 27, 2020 at 04:30:25PM -0700, Darrick J. Wong wrote:
> On Mon, Sep 28, 2020 at 09:08:23AM +1000, Dave Chinner wrote:
> > On Tue, Sep 22, 2020 at 10:33:19PM -0700, Darrick J. Wong wrote:
> > > @@ -361,6 +362,52 @@ xfs_defer_cancel_list(
> > > }
> > > }
> > >
> > > +/*
> > > + * Prevent a log intent item from pinning the tail of the log by logging a
> > > + * done item to release the intent item; and then log a new intent item.
> > > + * The caller should provide a fresh transaction and roll it after we're done.
> > > + */
> > > +static int
> > > +xfs_defer_relog(
> > > + struct xfs_trans **tpp,
> > > + struct list_head *dfops)
> > > +{
> > > + struct xfs_defer_pending *dfp;
> > > + xfs_lsn_t threshold_lsn;
> > > +
> > > + ASSERT((*tpp)->t_flags & XFS_TRANS_PERM_LOG_RES);
> > > +
> > > + /*
> > > + * Figure out where we need the tail to be in order to maintain the
> > > + * minimum required free space in the log.
> > > + */
> > > + threshold_lsn = xlog_grant_push_threshold((*tpp)->t_mountp->m_log, 0);
> > > + if (threshold_lsn == NULLCOMMITLSN)
> > > + return 0;
> >
> > This smells of premature optimisation.
> >
> > When we are in a tail-pushing scenario (i.e. any sort of
> > sustained metadata workload) this will always return true, and so we
> > will relog every intent that isn't in the current checkpoint every
> > time this is called. Under light load, we don't care if we add a
> > little bit of relogging overhead as the CIL slowly flushes/pushes -
> > it will have neglible impact on performance because there is little
> > load on the journal.
> >
> > However, when we are under heavy load the code will now be reading
> > the grant head and log position accounting variables during every
> > commit, hence greatly increasing the number and temporal
> > distribution of accesses to the hotest cachelines in the log. We
> > currently never access these cache lines during commit unless the
> > unit reservation has run out and we have to regrant physical log
> > space for the transaction to continue (i.e. we are into slow path
> > commit code). IOWs, this is like causing far more than double the
> > number of accesses to the grant head, the log tail, the
> > last_sync_lsn, etc, all of which is unnecessary exactly when we care
> > about minimising contention on the log space accounting variables...
> >
> > Given that it is a redundant check under heavy load journal load
> > when access to the log grant/head/tail are already contended,
> > I think we should just be checking the "in current checkpoint" logic
> > and not making it conditional on the log being near full.
>
> <nod> FWIW I broke this patch up again into the first part that
> only does relogging if the checkpoints don't match, and a second part
> that does the LSN push target check to see if I could observe any
> difference.
>
> Across a ~4h fstests run I noticed that there was about ~20% fewer
> relogs, but OTOH the total runtime didn't change noticeably. I kind of
> wondered if the increased cacheline contention would at least slow down
> the frontend a bit to give the log a chance to push things out, but
> haven't had time to dig any further than "ran fstests, recorded runtimes
> and grep | wc -l'd the ftrace log".
fstests doesn't generate anywhere near the load necessary to measure
log space accounting cacheline contention. That's one of the things
I use fsmark workloads for.
One of the things that "20% reduction" tells me, though, is that 80%
of the time that relogging is happening with this patch we are in a
tail pushing situation. i.e. the log is more often full than it is
empty when we are relogging.
That tends to support my statements that this is optimising the
wrong case.
FWIW, it seems to me that we need real stats for the deferops (i.e.
exposed in /proc/fs/xfs/stat) so we can monitor in realtime the
breakdown of work that is being done. Being able to see things like
how many defer rolls a transaction takes under a given workload is
very interesting information, and would give us insight into things
like whether the log count for transaction reservations is
reasonable or not....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2020-09-28 1:00 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-23 5:32 [PATCH v3 0/3] xfs: fix some log stalling problems in defer ops Darrick J. Wong
2020-09-23 5:33 ` [PATCH 1/3] xfs: change the order in which child and parent defer ops are finished Darrick J. Wong
2020-09-23 5:33 ` [PATCH 2/3] xfs: expose the log push threshold Darrick J. Wong
2020-09-25 11:15 ` Brian Foster
2020-09-25 18:59 ` Darrick J. Wong
2020-09-23 5:33 ` [PATCH 3/3] xfs: periodically relog deferred intent items Darrick J. Wong
2020-09-25 11:15 ` Brian Foster
2020-09-25 19:06 ` Darrick J. Wong
2020-09-27 23:08 ` Dave Chinner
2020-09-27 23:30 ` Darrick J. Wong
2020-09-28 1:00 ` Dave Chinner [this message]
2020-09-28 15:16 ` Brian Foster
2020-09-28 23:09 ` Dave Chinner
2020-09-29 12:27 ` Brian Foster
2020-09-29 17:01 ` Darrick J. Wong
2020-09-29 18:45 ` Brian Foster
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200928010050.GB14422@dread.disaster.area \
--to=david@fromorbit.com \
--cc=bfoster@redhat.com \
--cc=darrick.wong@oracle.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox