From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 6/6] xfs: lockless buffer lookup
Date: Thu, 7 Jul 2022 10:55:44 -0700 [thread overview]
Message-ID: <YsceIFL3iFwtHWx0@magnolia> (raw)
In-Reply-To: <20220707123633.GM227878@dread.disaster.area>
On Thu, Jul 07, 2022 at 10:36:33PM +1000, Dave Chinner wrote:
> On Wed, Jun 29, 2022 at 03:04:41PM -0700, Darrick J. Wong wrote:
> > On Mon, Jun 27, 2022 at 04:08:41PM +1000, Dave Chinner wrote:
> > > From: Dave Chinner <dchinner@redhat.com>
> > >
> > > Now that we have a standalone fast path for buffer lookup, we can
> > > easily convert it to use rcu lookups. When we continually hammer the
> > > buffer cache with trylock lookups, we end up with a huge amount of
> > > lock contention on the per-ag buffer hash locks:
> > >
> > > - 92.71% 0.05% [kernel] [k] xfs_inodegc_worker
> > > - 92.67% xfs_inodegc_worker
> > > - 92.13% xfs_inode_unlink
> > > - 91.52% xfs_inactive_ifree
> > > - 85.63% xfs_read_agi
> > > - 85.61% xfs_trans_read_buf_map
> > > - 85.59% xfs_buf_read_map
> > > - xfs_buf_get_map
> > > - 85.55% xfs_buf_find
> > > - 72.87% _raw_spin_lock
> > > - do_raw_spin_lock
> > > 71.86% __pv_queued_spin_lock_slowpath
> > > - 8.74% xfs_buf_rele
> > > - 7.88% _raw_spin_lock
> > > - 7.88% do_raw_spin_lock
> > > 7.63% __pv_queued_spin_lock_slowpath
> > > - 1.70% xfs_buf_trylock
> > > - 1.68% down_trylock
> > > - 1.41% _raw_spin_lock_irqsave
> > > - 1.39% do_raw_spin_lock
> > > __pv_queued_spin_lock_slowpath
> > > - 0.76% _raw_spin_unlock
> > > 0.75% do_raw_spin_unlock
> > >
> > > This is basically hammering the pag->pag_buf_lock from lots of CPUs
> > > doing trylocks at the same time. Most of the buffer trylock
> > > operations ultimately fail after we've done the lookup, so we're
> > > really hammering the buf hash lock whilst making no progress.
> > >
> > > We can also see significant spinlock traffic on the same lock just
> > > under normal operation when lots of tasks are accessing metadata
> > > from the same AG, so let's avoid all this by converting the lookup
> > > fast path to leverages the rhashtable's ability to do rcu protected
> > > lookups.
> > >
> > > We avoid races with the buffer release path by using
> > > atomic_inc_not_zero() on the buffer hold count. Any buffer that is
> > > in the LRU will have a non-zero count, thereby allowing the lockless
> > > fast path to be taken in most cache hit situations. If the buffer
> > > hold count is zero, then it is likely going through the release path
> > > so in that case we fall back to the existing lookup miss slow path.
> > >
> > > The slow path will then do an atomic lookup and insert under the
> > > buffer hash lock and hence serialise correctly against buffer
> > > release freeing the buffer.
> > >
> > > The use of rcu protected lookups means that buffer handles now need
> > > to be freed by RCU callbacks (same as inodes). We still free the
> > > buffer pages before the RCU callback - we won't be trying to access
> > > them at all on a buffer that has zero references - but we need the
> > > buffer handle itself to be present for the entire rcu protected read
> > > side to detect a zero hold count correctly.
> >
> > Hmm, so what still uses pag_buf_lock? Are we still using it to
> > serialize xfs_buf_rele calls?
>
> slow path lookup/insert and xfs_buf_rele calls.
>
> The only thing we are allowing lockless lookups on are buffers with
> at least one reference. With the LRU holding a reference, that means
> it the buffer is still actively cached or referenced by something
> else so won't disappear from under us. If the ref count is zero,
> then it means the buffer is being removed from the cache, so we
> need to go the slow way and take the pag_buf_lock to serialise the
> lookup against the release of the unreferenced buffer we found in
> the cache.
Ah, right, got it.
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
--D
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com
next prev parent reply other threads:[~2022-07-07 17:55 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-27 6:08 [PATCH 0/6 v2] xfs: lockless buffer lookups Dave Chinner
2022-06-27 6:08 ` [PATCH 1/6] xfs: rework xfs_buf_incore() API Dave Chinner
2022-06-29 7:30 ` Christoph Hellwig
2022-06-29 21:24 ` Darrick J. Wong
2022-06-27 6:08 ` [PATCH 2/6] xfs: break up xfs_buf_find() into individual pieces Dave Chinner
2022-06-28 2:22 ` Chris Dunlop
2022-06-29 7:35 ` Christoph Hellwig
2022-06-29 21:50 ` Darrick J. Wong
2022-06-27 6:08 ` [PATCH 3/6] xfs: merge xfs_buf_find() and xfs_buf_get_map() Dave Chinner
2022-06-29 7:40 ` Christoph Hellwig
2022-06-29 22:06 ` Darrick J. Wong
2022-07-07 12:39 ` Dave Chinner
2022-06-27 6:08 ` [PATCH 4/6] xfs: reduce the number of atomic when locking a buffer after lookup Dave Chinner
2022-06-29 22:00 ` Darrick J. Wong
2022-06-27 6:08 ` [PATCH 5/6] xfs: remove a superflous hash lookup when inserting new buffers Dave Chinner
2022-06-29 7:40 ` Christoph Hellwig
2022-06-29 22:01 ` Darrick J. Wong
2022-06-27 6:08 ` [PATCH 6/6] xfs: lockless buffer lookup Dave Chinner
2022-06-29 7:41 ` Christoph Hellwig
2022-06-29 22:04 ` Darrick J. Wong
2022-07-07 12:36 ` Dave Chinner
2022-07-07 17:55 ` Darrick J. Wong [this message]
2022-07-11 5:16 ` Christoph Hellwig
2022-07-07 2:40 ` [PATCH 0/6 v2] xfs: lockless buffer lookups Darrick J. Wong
-- strict thread matches above, loose matches on Subject: below --
2022-07-07 23:52 [PATCH 0/6 v3] " Dave Chinner
2022-07-07 23:52 ` [PATCH 6/6] xfs: lockless buffer lookup Dave Chinner
2022-07-10 0:15 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YsceIFL3iFwtHWx0@magnolia \
--to=djwong@kernel.org \
--cc=david@fromorbit.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox