public inbox for linux-xfs@vger.kernel.org
From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 6/6] xfs: lockless buffer lookup
Date: Thu, 7 Jul 2022 22:36:33 +1000	[thread overview]
Message-ID: <20220707123633.GM227878@dread.disaster.area> (raw)
In-Reply-To: <YrzMeZ/mW+yN94Oo@magnolia>

On Wed, Jun 29, 2022 at 03:04:41PM -0700, Darrick J. Wong wrote:
> On Mon, Jun 27, 2022 at 04:08:41PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > Now that we have a standalone fast path for buffer lookup, we can
> > easily convert it to use rcu lookups. When we continually hammer the
> > buffer cache with trylock lookups, we end up with a huge amount of
> > lock contention on the per-ag buffer hash locks:
> > 
> > -   92.71%     0.05%  [kernel]                  [k] xfs_inodegc_worker
> >    - 92.67% xfs_inodegc_worker
> >       - 92.13% xfs_inode_unlink
> >          - 91.52% xfs_inactive_ifree
> >             - 85.63% xfs_read_agi
> >                - 85.61% xfs_trans_read_buf_map
> >                   - 85.59% xfs_buf_read_map
> >                      - xfs_buf_get_map
> >                         - 85.55% xfs_buf_find
> >                            - 72.87% _raw_spin_lock
> >                               - do_raw_spin_lock
> >                                    71.86% __pv_queued_spin_lock_slowpath
> >                            - 8.74% xfs_buf_rele
> >                               - 7.88% _raw_spin_lock
> >                                  - 7.88% do_raw_spin_lock
> >                                       7.63% __pv_queued_spin_lock_slowpath
> >                            - 1.70% xfs_buf_trylock
> >                               - 1.68% down_trylock
> >                                  - 1.41% _raw_spin_lock_irqsave
> >                                     - 1.39% do_raw_spin_lock
> >                                          __pv_queued_spin_lock_slowpath
> >                            - 0.76% _raw_spin_unlock
> >                                 0.75% do_raw_spin_unlock
> > 
> > This is basically hammering the pag->pag_buf_lock from lots of CPUs
> > doing trylocks at the same time. Most of the buffer trylock
> > operations ultimately fail after we've done the lookup, so we're
> > really hammering the buf hash lock whilst making no progress.
> > 
> > We can also see significant spinlock traffic on the same lock just
> > under normal operation when lots of tasks are accessing metadata
> > from the same AG, so let's avoid all this by converting the lookup
> > fast path to leverage the rhashtable's ability to do rcu protected
> > lookups.
> > 
> > We avoid races with the buffer release path by using
> > atomic_inc_not_zero() on the buffer hold count. Any buffer that is
> > in the LRU will have a non-zero count, thereby allowing the lockless
> > fast path to be taken in most cache hit situations. If the buffer
> > hold count is zero, then it is likely going through the release path
> > so in that case we fall back to the existing lookup miss slow path.
> > 
> > The slow path will then do an atomic lookup and insert under the
> > buffer hash lock and hence serialise correctly against buffer
> > release freeing the buffer.
> > 
> > The use of rcu protected lookups means that buffer handles now need
> > to be freed by RCU callbacks (same as inodes). We still free the
> > buffer pages before the RCU callback - we won't be trying to access
> > them at all on a buffer that has zero references - but we need the
> > buffer handle itself to be present for the entire rcu protected read
> > side to detect a zero hold count correctly.
> 
> Hmm, so what still uses pag_buf_lock?  Are we still using it to
> serialize xfs_buf_rele calls?

slow path lookup/insert and xfs_buf_rele calls.

The only thing we are allowing lockless lookups on are buffers with
at least one reference. With the LRU holding a reference, that means
the buffer is still actively cached or referenced by something
else, so it won't disappear from under us. If the ref count is zero,
then it means the buffer is being removed from the cache, so we
need to go the slow way and take the pag_buf_lock to serialise the
lookup against the release of the unreferenced buffer we found in
the cache.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

Thread overview: 26+ messages
2022-06-27  6:08 [PATCH 0/6 v2] xfs: lockless buffer lookups Dave Chinner
2022-06-27  6:08 ` [PATCH 1/6] xfs: rework xfs_buf_incore() API Dave Chinner
2022-06-29  7:30   ` Christoph Hellwig
2022-06-29 21:24   ` Darrick J. Wong
2022-06-27  6:08 ` [PATCH 2/6] xfs: break up xfs_buf_find() into individual pieces Dave Chinner
2022-06-28  2:22   ` Chris Dunlop
2022-06-29  7:35   ` Christoph Hellwig
2022-06-29 21:50   ` Darrick J. Wong
2022-06-27  6:08 ` [PATCH 3/6] xfs: merge xfs_buf_find() and xfs_buf_get_map() Dave Chinner
2022-06-29  7:40   ` Christoph Hellwig
2022-06-29 22:06     ` Darrick J. Wong
2022-07-07 12:39       ` Dave Chinner
2022-06-27  6:08 ` [PATCH 4/6] xfs: reduce the number of atomic when locking a buffer after lookup Dave Chinner
2022-06-29 22:00   ` Darrick J. Wong
2022-06-27  6:08 ` [PATCH 5/6] xfs: remove a superflous hash lookup when inserting new buffers Dave Chinner
2022-06-29  7:40   ` Christoph Hellwig
2022-06-29 22:01   ` Darrick J. Wong
2022-06-27  6:08 ` [PATCH 6/6] xfs: lockless buffer lookup Dave Chinner
2022-06-29  7:41   ` Christoph Hellwig
2022-06-29 22:04   ` Darrick J. Wong
2022-07-07 12:36     ` Dave Chinner [this message]
2022-07-07 17:55       ` Darrick J. Wong
2022-07-11  5:16       ` Christoph Hellwig
2022-07-07  2:40 ` [PATCH 0/6 v2] xfs: lockless buffer lookups Darrick J. Wong
  -- strict thread matches above, loose matches on Subject: below --
2022-07-07 23:52 [PATCH 0/6 v3] " Dave Chinner
2022-07-07 23:52 ` [PATCH 6/6] xfs: lockless buffer lookup Dave Chinner
2022-07-10  0:15   ` Darrick J. Wong
