From: Brian Foster <bfoster@redhat.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-xfs@vger.kernel.org, Ian Kent <raven@themaw.net>,
	rcu@vger.kernel.org
Subject: Re: [PATCH] xfs: require an rcu grace period before inode recycle
Date: Thu, 10 Feb 2022 15:47:13 -0500	[thread overview]
Message-ID: <YgV50a9ia0bHKTFh@bfoster> (raw)
In-Reply-To: <20220210054544.GI4285@paulmck-ThinkPad-P17-Gen-1>

On Wed, Feb 09, 2022 at 09:45:44PM -0800, Paul E. McKenney wrote:
> On Thu, Feb 10, 2022 at 03:09:17PM +1100, Dave Chinner wrote:
> > On Mon, Feb 07, 2022 at 08:36:21AM -0800, Paul E. McKenney wrote:
> > > On Mon, Feb 07, 2022 at 08:30:03AM -0500, Brian Foster wrote:
> > > Another approach is to use SLAB_TYPESAFE_BY_RCU.  This allows immediate
> > > reuse of freed memory, but also requires pointer traversals to the memory
> > > to do a revalidation operation.  (Sorry, no free lunch here!)
> > 
> > Can't do that with inodes - newly allocated/reused inodes have to go
> > through inode_init_always() which is the very function that causes
> > the problems we have now with path-walk tripping over inodes in an
> > intermediate re-initialised state because we recycled it inside a
> > RCU grace period.
> 
> So not just no free lunch, but this is also not a lunch that is consistent
> with the code's dietary restrictions.
> 
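
For anyone following along: the revalidation requirement mentioned
above looks roughly like the sketch below. This is a minimal,
hypothetical illustration of the general SLAB_TYPESAFE_BY_RCU lookup
pattern; the object type, fields and helpers are made up, not actual
XFS code:

	rcu_read_lock();
	obj = hash_lookup(key);		/* hypothetical lookup */
	/*
	 * The slab memory is type-stable, but the object may have been
	 * freed and reallocated at any point, so take a reference and
	 * then recheck the object's identity before trusting it.
	 */
	if (obj && refcount_inc_not_zero(&obj->refcount)) {
		if (obj->key != key) {
			/* Lost the race: the object was recycled. */
			obj_put(obj);	/* hypothetical release helper */
			obj = NULL;
		}
	} else {
		obj = NULL;
	}
	rcu_read_unlock();
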
> From what you said earlier in this thread, I am guessing that you have
> some other fix in mind.
> 

Yeah.. I've got an experiment running that essentially tracks pending
grace period cookies for destroyed inodes and avoids reusing those
inodes at allocation time until the associated grace period has
elapsed. It's crude at the moment, but the initial numbers I see
aren't that far off from the results produced by your expedited grace
period mechanism: mostly in the 40-50k cycles per second ballpark.
This is somewhat expected, because the current baseline behavior
relies on unsafe reuse of inodes before a grace period has elapsed.
We have to fall back to more physical inode allocations to get around
that, so the small batch alloc/free patterns simply won't be able to
spin as fast. The difference I do see with this sort of explicit
grace period tracking is that the results remain much closer to the
baseline kernel as background activity is ramped up.
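
FWIW, the core of the experiment looks something like the minimal
sketch below. The helper names and the i_destroy_gp field (an
unsigned long cookie imagined as added to the XFS inode) are invented
for illustration; only the RCU polling calls are real kernel API:

	#include <linux/rcupdate.h>

	/* At ->destroy_inode() time: record a grace period cookie. */
	static void xfs_inode_track_gp(struct xfs_inode *ip)
	{
		/*
		 * start_poll_synchronize_rcu() starts a grace period if
		 * one is not already in progress and returns a cookie
		 * identifying it.
		 */
		ip->i_destroy_gp = start_poll_synchronize_rcu();
	}

	/* At allocation time: only recycle the inode once safe. */
	static bool xfs_inode_safe_to_recycle(struct xfs_inode *ip)
	{
		/*
		 * True only if a full grace period has elapsed since
		 * the cookie was recorded. If not, the allocation path
		 * skips this inode and falls back to a physical
		 * allocation.
		 */
		return poll_state_synchronize_rcu(ip->i_destroy_gp);
	}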

However, one of the things I'd like to experiment with is whether
combining this approach with expedited grace periods provides any
opportunity for further optimization. For example, if we can identify
that a grace period has already elapsed between the time of
->destroy_inode() and the time the queue processing ultimately marks
the inode reclaimable, that might allow for some optimized allocation
behavior. I do see this occur occasionally with normal grace periods,
but not quite frequently enough to make a difference.
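
In terms of the same hypothetical i_destroy_gp cookie as above, the
check at queue processing time would be something like:

	if (poll_state_synchronize_rcu(ip->i_destroy_gp)) {
		/* A gp already elapsed; recycle with no further delay. */
	} else {
		/* Still pending; wait, or kick an expedited gp. */
	}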

What I observe right now is that the same test above runs much closer
to the baseline numbers when using the ikeep mount option, so I may
need to look into ways to mitigate the inode chunk allocation
overhead..

Brian

> 							Thanx, Paul
> 



Thread overview: 37+ messages
2022-01-21 14:24 [PATCH] xfs: require an rcu grace period before inode recycle Brian Foster
2022-01-21 17:26 ` Darrick J. Wong
2022-01-21 18:33   ` Brian Foster
2022-01-22  5:30     ` Paul E. McKenney
2022-01-22 16:55       ` Paul E. McKenney
2022-01-24 15:12       ` Brian Foster
2022-01-24 16:40         ` Paul E. McKenney
2022-01-23 22:43 ` Dave Chinner
2022-01-24 15:06   ` Brian Foster
2022-01-24 15:02 ` Brian Foster
2022-01-24 22:08   ` Dave Chinner
2022-01-24 23:29     ` Brian Foster
2022-01-25  0:31       ` Dave Chinner
2022-01-25 14:40         ` Paul E. McKenney
2022-01-25 22:36           ` Dave Chinner
2022-01-26  5:29             ` Paul E. McKenney
2022-01-26 13:21               ` Brian Foster
2022-01-25 18:30         ` Brian Foster
2022-01-25 20:07           ` Brian Foster
2022-01-25 22:45           ` Dave Chinner
2022-01-27  4:19             ` Al Viro
2022-01-27  5:26               ` Dave Chinner
2022-01-27 19:01                 ` Brian Foster
2022-01-27 22:18                   ` Dave Chinner
2022-01-28 14:11                     ` Brian Foster
2022-01-28 23:53                       ` Dave Chinner
2022-01-31 13:28                         ` Brian Foster
2022-01-28 21:39                   ` Paul E. McKenney
2022-01-31 13:22                     ` Brian Foster
2022-02-01 22:00                       ` Paul E. McKenney
2022-02-03 18:49                         ` Paul E. McKenney
2022-02-07 13:30                         ` Brian Foster
2022-02-07 16:36                           ` Paul E. McKenney
2022-02-10  4:09                             ` Dave Chinner
2022-02-10  5:45                               ` Paul E. McKenney
2022-02-10 20:47                                 ` Brian Foster [this message]
2022-01-25  8:16 ` [xfs] a7f4e88080: aim7.jobs-per-min -62.2% regression kernel test robot
