linux-xfs.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 12/12] xfs: wait for deferred inactivation when destroying unlinked inodes
Date: Fri, 18 Jan 2019 09:21:18 +1100
Message-ID: <20190117222117.GF6173@dastard>
In-Reply-To: <20190117184158.GE4424@magnolia>

On Thu, Jan 17, 2019 at 10:41:58AM -0800, Darrick J. Wong wrote:
> On Thu, Jan 03, 2019 at 11:46:44PM +1100, Dave Chinner wrote:
> > On Mon, Dec 31, 2018 at 06:18:04PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Now that we've constructed a mechanism to batch background inode
> > > inactivation work, in some cases we actually want to throttle the amount
> > > of backlog work that the frontend can generate.  We do this by making
> > > destroy_inode wait for inactivation when we're deleting things, assuming
> > > that deleted inodes are dropped and destroyed in process context and not
> > > from fs reclaim.
> > 
> > This would kill performance of highly concurrent unlink
> > workloads.
> > 
> > That said, the current unlink code needs severe throttling because
> > the unlinked inode list management does not scale at all well - get
> > more than a couple of hundred inodes into the AGI unlinked lists,
> > and xfs_iunlink_remove burns huge amounts of CPU.
> > 
> > So if it isn't throttled, it'll be just as slow, but burn huge
> > amounts of extra CPU walking the unlinked lists.....
> 
> ...so I've refactored the iunlink code to record "who points to this
> inode?" references, which speeds up iunlink_remove substantially.  I'll
> put that series out for review after letting it run on some real
> workloads.
> 
> I've also dropped this patch from the series entirely, just to see what
> happens.  The tradeoff here is that allocations see increased latency
> upon hitting ENOSPC because we now force inactivation to see if we can
> clean things out, but OTOH if we never hit ENOSPC then the rest of the
> fs runs considerably faster.

I think that's a valid trade-off - we make it in several other
places, so we expect things to slow down near/at ENOSPC.
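
i.e. roughly the shape below: the common path stays fast, and we only
pay the latency of forcing inactivation when the first attempt actually
fails with ENOSPC.  try_alloc() and force_inactivation() are made-up
stand-ins here, not anything in your series:

	#include <errno.h>

	/* Made-up stand-ins for whatever the series actually calls. */
	int try_alloc(void *mount, unsigned long blocks);
	void force_inactivation(void *mount);	/* drain the background queue */

	/*
	 * Keep the common path fast: only when the first attempt fails
	 * with ENOSPC do we pay for a synchronous inactivation pass,
	 * then retry exactly once before giving up.
	 */
	int
	alloc_with_retry(void *mount, unsigned long blocks)
	{
		int	error = try_alloc(mount, blocks);

		if (error != -ENOSPC)
			return error;

		force_inactivation(mount);
		return try_alloc(mount, blocks);
	}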

> > > +/*
> > > + * Decide if this inode is a candidate for unlinked inactivation throttling.
> > > + * We have to decide this prior to setting the NEED_INACTIVE iflag because
> > > + * once we flag the inode for inactivation we can't access it any more.
> > > + */
> > > +enum xfs_iwait
> > > +xfs_iwait_check(
> > > +	struct xfs_inode	*ip)
> > > +{
> > > +	struct xfs_mount	*mp = ip->i_mount;
> > > +	unsigned long long	x;
> > > +	unsigned long long	y;
> > > +	bool			rt = XFS_IS_REALTIME_INODE(ip);
> > > +
> > > +	/*
> > > +	 * Don't wait unless we're doing a deletion inactivation.  We assume
> > > +	 * that unlinked inodes that lose all their refcount are dropped,
> > > +	 * evicted, and destroyed immediately in the context of the unlink()ing
> > > +	 * process.
> > > +	 */
> > 
> > I think this is wrong - we want to push unlinked inode processing
> > into the background so we don't have to wait on it, not force
> > inactivation of unlinked inodes to wait for other inodes to be
> > inactivated.
> 
> The original idea behind most of this was that we'd try to slow down a
> rm -rf so that the fs doesn't find itself facing a gigantic flood of
> inactivation work, particularly if there were a lot of extents to free
> or a lot of inodes to free.  Under this scheme we don't ever wait for
> inactivation if we're just closing a linked file, but we could do so for
> deletions.
> 
> However, it is difficult to quantify what "gigantic" means here.  The
> situation I was trying to avoid is where the system gets bogged down
> with background processing work for a long time after the userspace
> process terminates, but perhaps it's better not to bother. :)

Multi-processor systems are ubiquitous now. If we can move things to
the background and burn otherwise idle CPU to do the work so that
users/apps don't have to wait on it, then I think that's fine. IMO,
the more work we can push into async background processing the
better we can optimise it (e.g. start batching or using bulk
operations rather than one-at-a-time).
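
As a very rough sketch of the shape I mean (not the actual patchset):
the front end just queues the inode on a lock-free list and returns,
and a single worker drains the whole backlog in one pass so the
expensive work can be batched:

	#include <linux/llist.h>
	#include <linux/slab.h>
	#include <linux/workqueue.h>

	struct inactive_item {
		struct llist_node	node;
		/* ...whatever identifies the inode to inactivate... */
	};

	static void inactive_worker(struct work_struct *work);

	static LLIST_HEAD(inactive_list);
	static DECLARE_WORK(inactive_work, inactive_worker);

	/* Front end: unlink()/iput() queues the inode and returns immediately. */
	static void
	defer_inactivation(struct inactive_item *item)
	{
		/* the first item added to an empty list kicks the worker */
		if (llist_add(&item->node, &inactive_list))
			schedule_work(&inactive_work);
	}

	/* Back end: drain the whole backlog in one pass so it can be batched. */
	static void
	inactive_worker(struct work_struct *work)
	{
		struct llist_node	*list = llist_del_all(&inactive_list);
		struct inactive_item	*item, *next;

		llist_for_each_entry_safe(item, next, list, node) {
			/* the expensive inactivation work happens here, in bulk */
			kfree(item);
		}
	}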

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


Thread overview: 26+ messages
2019-01-01  2:16 [PATCH 00/12] xfs: deferred inode inactivation Darrick J. Wong
2019-01-01  2:16 ` [PATCH 01/12] xfs: free COW staging extents when freezing filesystem Darrick J. Wong
2019-01-11 16:28   ` Brian Foster
2019-01-17 17:24     ` Darrick J. Wong
2019-01-17 18:14       ` Brian Foster
2019-01-17 20:20         ` Darrick J. Wong
2019-01-01  2:17 ` [PATCH 02/12] xfs: refactor the predicate part of xfs_free_eofblocks Darrick J. Wong
2019-01-11 19:05   ` Brian Foster
2019-01-17 17:33     ` Darrick J. Wong
2019-01-01  2:17 ` [PATCH 03/12] xfs: decide if inode needs inactivation Darrick J. Wong
2019-01-01  2:17 ` [PATCH 04/12] xfs: track unlinked inactive inode fs summary counters Darrick J. Wong
2019-01-01  2:17 ` [PATCH 05/12] xfs: track unlinked inactive inode quota counters Darrick J. Wong
2019-01-01  2:17 ` [PATCH 06/12] xfs: refactor walking of per-AG RECLAIM inodes Darrick J. Wong
2019-01-11 19:06   ` Brian Foster
2019-01-17 17:43     ` Darrick J. Wong
2019-01-01  2:17 ` [PATCH 07/12] xfs: refactor eofblocks inode match code Darrick J. Wong
2019-01-02  9:50   ` Nikolay Borisov
2019-01-17 18:05     ` Darrick J. Wong
2019-01-01  2:17 ` [PATCH 08/12] xfs: deferred inode inactivation Darrick J. Wong
2019-01-01  2:17 ` [PATCH 09/12] xfs: retry fs writes when there isn't space Darrick J. Wong
2019-01-01  2:17 ` [PATCH 10/12] xfs: force inactivation before fallocate when space is low Darrick J. Wong
2019-01-01  2:17 ` [PATCH 11/12] xfs: parallelize inode inactivation Darrick J. Wong
2019-01-01  2:18 ` [PATCH 12/12] xfs: wait for deferred inactivation when destroying unlinked inodes Darrick J. Wong
2019-01-03 12:46   ` Dave Chinner
2019-01-17 18:41     ` Darrick J. Wong
2019-01-17 22:21       ` Dave Chinner [this message]
