All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 12/12] xfs: wait for deferred inactivation when destroying unlinked inodes
Date: Fri, 18 Jan 2019 09:21:18 +1100	[thread overview]
Message-ID: <20190117222117.GF6173@dastard> (raw)
In-Reply-To: <20190117184158.GE4424@magnolia>

On Thu, Jan 17, 2019 at 10:41:58AM -0800, Darrick J. Wong wrote:
> On Thu, Jan 03, 2019 at 11:46:44PM +1100, Dave Chinner wrote:
> > On Mon, Dec 31, 2018 at 06:18:04PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Now that we've constructed a mechanism to batch background inode
> > > inactivation work, we actually want in some cases to throttle the amount
> > > of backlog work that the frontend can generate.  We do this by making
> > > destroy_inode wait for inactivation when we're deleting things, assuming
> > > that deleted inodes are dropped and destroyed in process context and not
> > > from fs reclaim.
> > 
> > This would kills performance of highly concurrent unlink
> > workloads.
> > 
> > That said, the current unlink code needs severe throttling because
> > the unlinked inode list management does not scale at all well - get
> > more than a a couple of hundred inodes into the AGI unlinked lists,
> > and xfs_iunlink_remove burns huge amounts of CPU.
> > 
> > So if it isn't throttled, it'll be just as slow, but burn huge
> > amounts amounts of extra CPU walking the unlinked lists.....
> 
> ...so I've refactored the iunlink code to record "who points to this
> inode?" references, which speeds up iunlink_remove substantially.  I'll
> put that series out for review after letting it run on some real
> workloads.
> 
> I've also dropped this patch from the series entirely, just to see what
> happens.  The tradeoff here is that allocations see increased latency
> upon hitting ENOSPC because we now force inactivation to see if we can
> clean things out, but OTOH if we never hit ENOSPC then the rest of the
> fs runs considerably faster.

I think that's a valid trade-off - we make it in several other
places, so we expect things to slow down near/at ENOSPC.

> > > +/*
> > > + * Decide if this inode is a candidate for unlinked inactivation throttling.
> > > + * We have to decide this prior to setting the NEED_INACTIVE iflag because
> > > + * once we flag the inode for inactivation we can't access it any more.
> > > + */
> > > +enum xfs_iwait
> > > +xfs_iwait_check(
> > > +	struct xfs_inode	*ip)
> > > +{
> > > +	struct xfs_mount	*mp = ip->i_mount;
> > > +	unsigned long long	x;
> > > +	unsigned long long	y;
> > > +	bool			rt = XFS_IS_REALTIME_INODE(ip);
> > > +
> > > +	/*
> > > +	 * Don't wait unless we're doing a deletion inactivation.  We assume
> > > +	 * that unlinked inodes that lose all their refcount are dropped,
> > > +	 * evicted, and destroyed immediately in the context of the unlink()ing
> > > +	 * process.
> > > +	 */
> > 
> > I think this is wrong - we want to push unlinked inode processing
> > into the background so we don't have to wait on it, not force
> > inactivation of unlinked inodes to wait for other inodes to be
> > inactivated.
> 
> The original idea behind most of this was that we'd try to slow down a
> rm -rf so that the fs doesn't find itself facing a gigantic flood of
> inactivation work, particularly if there were a lot of extents to free
> or a lot of inodes to free.  Under this scheme we don't ever wait for
> inactivation if we're just closing a linked file, but we could do so for
> deletions.
> 
> However, it is difficult to quantify what "gigantic" means here.  The
> situation I was trying to avoid is where the system gets bogged down
> with background processing work for a long time after the userspace
> process terminates, but perhaps it's better not to bother. :)

Multi-processor systems are ubiquitous now. If we can move things to
the background and burn otherwise idle CPU to do the work so that
users/apps don't have to wait on it, then I think that's fine. IMO,
the more work we can push into async background processing the
better we can optimise it (e.g. start batching or using bulk
operations rather than one-at-a-time).

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

      reply	other threads:[~2019-01-17 22:21 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-01  2:16 [PATCH 00/12] xfs: deferred inode inactivation Darrick J. Wong
2019-01-01  2:16 ` [PATCH 01/12] xfs: free COW staging extents when freezing filesystem Darrick J. Wong
2019-01-11 16:28   ` Brian Foster
2019-01-17 17:24     ` Darrick J. Wong
2019-01-17 18:14       ` Brian Foster
2019-01-17 20:20         ` Darrick J. Wong
2019-01-01  2:17 ` [PATCH 02/12] xfs: refactor the predicate part of xfs_free_eofblocks Darrick J. Wong
2019-01-11 19:05   ` Brian Foster
2019-01-17 17:33     ` Darrick J. Wong
2019-01-01  2:17 ` [PATCH 03/12] xfs: decide if inode needs inactivation Darrick J. Wong
2019-01-01  2:17 ` [PATCH 04/12] xfs: track unlinked inactive inode fs summary counters Darrick J. Wong
2019-01-01  2:17 ` [PATCH 05/12] xfs: track unlinked inactive inode quota counters Darrick J. Wong
2019-01-01  2:17 ` [PATCH 06/12] xfs: refactor walking of per-AG RECLAIM inodes Darrick J. Wong
2019-01-11 19:06   ` Brian Foster
2019-01-17 17:43     ` Darrick J. Wong
2019-01-01  2:17 ` [PATCH 07/12] xfs: refactor eofblocks inode match code Darrick J. Wong
2019-01-02  9:50   ` Nikolay Borisov
2019-01-17 18:05     ` Darrick J. Wong
2019-01-01  2:17 ` [PATCH 08/12] xfs: deferred inode inactivation Darrick J. Wong
2019-01-01  2:17 ` [PATCH 09/12] xfs: retry fs writes when there isn't space Darrick J. Wong
2019-01-01  2:17 ` [PATCH 10/12] xfs: force inactivation before fallocate when space is low Darrick J. Wong
2019-01-01  2:17 ` [PATCH 11/12] xfs: parallelize inode inactivation Darrick J. Wong
2019-01-01  2:18 ` [PATCH 12/12] xfs: wait for deferred inactivation when destroying unlinked inodes Darrick J. Wong
2019-01-03 12:46   ` Dave Chinner
2019-01-17 18:41     ` Darrick J. Wong
2019-01-17 22:21       ` Dave Chinner [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190117222117.GF6173@dastard \
    --to=david@fromorbit.com \
    --cc=darrick.wong@oracle.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.