public inbox for linux-xfs@vger.kernel.org
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 0/6 v2] xfs: lockless buffer lookups
Date: Wed, 6 Jul 2022 19:40:07 -0700
Message-ID: <YsZHh2ZkopJFmaKx@magnolia>
In-Reply-To: <20220627060841.244226-1-david@fromorbit.com>

On Mon, Jun 27, 2022 at 04:08:35PM +1000, Dave Chinner wrote:
> Hi folks,
> 
> Current work to merge the XFS inode life cycle with the VFS inode
> life cycle is finding some interesting issues. If we have a path
> that hits buffer trylocks fairly hard (e.g. a non-blocking
> background inode freeing function), we end up hitting massive
> contention on the buffer cache hash locks:
> 
> -   92.71%     0.05%  [kernel]                  [k] xfs_inodegc_worker
>    - 92.67% xfs_inodegc_worker
>       - 92.13% xfs_inode_unlink
>          - 91.52% xfs_inactive_ifree
>             - 85.63% xfs_read_agi
>                - 85.61% xfs_trans_read_buf_map
>                   - 85.59% xfs_buf_read_map
>                      - xfs_buf_get_map
>                         - 85.55% xfs_buf_find
>                            - 72.87% _raw_spin_lock
>                               - do_raw_spin_lock
>                                    71.86% __pv_queued_spin_lock_slowpath
>                            - 8.74% xfs_buf_rele
>                               - 7.88% _raw_spin_lock
>                                  - 7.88% do_raw_spin_lock
>                                       7.63% __pv_queued_spin_lock_slowpath
>                            - 1.70% xfs_buf_trylock
>                               - 1.68% down_trylock
>                                  - 1.41% _raw_spin_lock_irqsave
>                                     - 1.39% do_raw_spin_lock
>                                          __pv_queued_spin_lock_slowpath
>                            - 0.76% _raw_spin_unlock
>                                 0.75% do_raw_spin_unlock
> 
> This is basically hammering the pag->pag_buf_lock from lots of CPUs
> doing trylocks at the same time. Most of the buffer trylock
> operations ultimately fail after we've done the lookup, so we're
> really hammering the buf hash lock whilst making no progress.
> 
> We can also see significant spinlock traffic on the same lock just
> under normal operation when lots of tasks are accessing metadata
> from the same AG, so let's avoid all this by creating a lookup fast
> path which leverages the rhashtable's ability to do RCU-protected
> lookups.
> 
> This is a rework of the initial lockless buffer lookup patch I sent
> here:
> 
> https://lore.kernel.org/linux-xfs/20220328213810.1174688-1-david@fromorbit.com/
> 
> And the alternative cleanup sent by Christoph here:
> 
> https://lore.kernel.org/linux-xfs/20220403120119.235457-1-hch@lst.de/
> 
> This version isn't quite as short as Christoph's, but it does roughly
> the same thing in killing the two-phase _xfs_buf_find() call
> mechanism. It separates the fast and slow paths a little more
> cleanly and doesn't have context dependent buffer return state from
> the slow path that the caller needs to handle. It also picks up the
> rhashtable insert optimisation that Christoph added.
> 
> This series passes fstests under several different configs and does
> not cause any obvious regressions in scalability testing that has
> been performed. Hence I'm proposing this as potential 5.20 cycle
> material.
> 
> Thoughts, comments?

Any chance there'll be a v3 (or just responses to the replies sent so
far) in time for 5.20?

--D

> Version 2:
> - based on 5.19-rc2
> - high speed collision of original proposals.
> 
> Initial versions:
> - https://lore.kernel.org/linux-xfs/20220403120119.235457-1-hch@lst.de/
> - https://lore.kernel.org/linux-xfs/20220328213810.1174688-1-david@fromorbit.com/
> 
> 
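
For readers following the cover letter, here is a minimal user-space
sketch of the idea it describes: a lookup fast path that never takes the
shared hash lock, so a failed buffer trylock costs nothing on that lock.
This is not the patch series and not kernel code. Every name below
(struct buf, struct cache, cache_lookup_fast and so on) is hypothetical,
C11 atomics and pthreads stand in for kernel primitives, and nothing is
ever removed or freed, so the RCU read-side section and grace-period
freeing that a real rhashtable lookup depends on are left out entirely.
The "take a hold only if the count is nonzero" step is an assumption
about how such a lookup is typically made safe, in the spirit of the
kernel's atomic_inc_not_zero().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NBUCKETS 64

/* Hypothetical stand-in for a cached buffer; this is not struct xfs_buf. */
struct buf {
	uint64_t key;			/* lookup key, e.g. a block number */
	atomic_int hold;		/* 0 means "going away", do not use */
	pthread_mutex_t lock;		/* stands in for the buffer lock */
	_Atomic(struct buf *) next;	/* hash chain link */
};

/* Hypothetical cache; insert_lock plays the role of the hash lock. */
struct cache {
	_Atomic(struct buf *) bucket[NBUCKETS];
	pthread_mutex_t insert_lock;
};

static void cache_init(struct cache *c)
{
	for (size_t i = 0; i < NBUCKETS; i++)
		atomic_init(&c->bucket[i], NULL);
	pthread_mutex_init(&c->insert_lock, NULL);
}

/* Take a hold only if the count is still nonzero. */
static bool buf_try_hold(struct buf *bp)
{
	int old = atomic_load_explicit(&bp->hold, memory_order_relaxed);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak_explicit(&bp->hold, &old,
				old + 1, memory_order_acquire,
				memory_order_relaxed))
			return true;
	}
	return false;
}

static void buf_release(struct buf *bp)
{
	int old = atomic_fetch_sub_explicit(&bp->hold, 1,
					    memory_order_release);

	assert(old > 0);
	(void)old;
}

/* Fast path: walk the chain with acquire loads, no shared lock taken. */
static struct buf *cache_lookup_fast(struct cache *c, uint64_t key)
{
	struct buf *bp = atomic_load_explicit(&c->bucket[key % NBUCKETS],
					      memory_order_acquire);

	while (bp != NULL) {
		if (bp->key == key)
			return buf_try_hold(bp) ? bp : NULL;
		bp = atomic_load_explicit(&bp->next, memory_order_acquire);
	}
	return NULL;
}

/* Slow path: only inserts serialise on the shared lock. */
static struct buf *cache_insert(struct cache *c, struct buf *new_bp)
{
	size_t i = new_bp->key % NBUCKETS;
	struct buf *bp;

	pthread_mutex_lock(&c->insert_lock);
	bp = cache_lookup_fast(c, new_bp->key);
	if (bp == NULL) {
		atomic_init(&new_bp->hold, 1);	/* caller's reference */
		pthread_mutex_init(&new_bp->lock, NULL);
		atomic_store_explicit(&new_bp->next,
			atomic_load_explicit(&c->bucket[i],
					     memory_order_relaxed),
			memory_order_relaxed);
		/* Release store publishes the fully initialised buffer. */
		atomic_store_explicit(&c->bucket[i], new_bp,
				      memory_order_release);
		bp = new_bp;
	}
	pthread_mutex_unlock(&c->insert_lock);
	return bp;
}

/* Lookup plus trylock; a busy buffer never touches insert_lock. */
static struct buf *cache_find_trylock(struct cache *c, uint64_t key)
{
	struct buf *bp = cache_lookup_fast(c, key);

	if (bp != NULL && pthread_mutex_trylock(&bp->lock) != 0) {
		buf_release(bp);
		return NULL;
	}
	return bp;
}

/* Single-threaded walk-through; returns 0 if every step behaves. */
int lockless_lookup_demo(void)
{
	static struct cache c;
	static struct buf b = { .key = 42 };
	struct buf *bp;

	cache_init(&c);
	if (cache_insert(&c, &b) != &b)
		return 1;
	bp = cache_find_trylock(&c, 42);	/* found, held and locked */
	if (bp != &b)
		return 2;
	if (cache_find_trylock(&c, 42) != NULL)	/* lock busy: hold dropped */
		return 3;
	if (cache_find_trylock(&c, 7) != NULL)	/* cache miss */
		return 4;
	pthread_mutex_unlock(&bp->lock);
	buf_release(bp);
	return atomic_load(&b.hold) == 1 ? 0 : 5;
}
```

In this sketch a caller whose trylock fails only pays for the chain walk
and one hold/release pair on the buffer itself. A cache that also
removes buffers would need deferred freeing (RCU in the kernel) before
the lockless walk is safe, which is deliberately out of scope here.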

Thread overview: 24+ messages
2022-06-27  6:08 [PATCH 0/6 v2] xfs: lockless buffer lookups Dave Chinner
2022-06-27  6:08 ` [PATCH 1/6] xfs: rework xfs_buf_incore() API Dave Chinner
2022-06-29  7:30   ` Christoph Hellwig
2022-06-29 21:24   ` Darrick J. Wong
2022-06-27  6:08 ` [PATCH 2/6] xfs: break up xfs_buf_find() into individual pieces Dave Chinner
2022-06-28  2:22   ` Chris Dunlop
2022-06-29  7:35   ` Christoph Hellwig
2022-06-29 21:50   ` Darrick J. Wong
2022-06-27  6:08 ` [PATCH 3/6] xfs: merge xfs_buf_find() and xfs_buf_get_map() Dave Chinner
2022-06-29  7:40   ` Christoph Hellwig
2022-06-29 22:06     ` Darrick J. Wong
2022-07-07 12:39       ` Dave Chinner
2022-06-27  6:08 ` [PATCH 4/6] xfs: reduce the number of atomic when locking a buffer after lookup Dave Chinner
2022-06-29 22:00   ` Darrick J. Wong
2022-06-27  6:08 ` [PATCH 5/6] xfs: remove a superflous hash lookup when inserting new buffers Dave Chinner
2022-06-29  7:40   ` Christoph Hellwig
2022-06-29 22:01   ` Darrick J. Wong
2022-06-27  6:08 ` [PATCH 6/6] xfs: lockless buffer lookup Dave Chinner
2022-06-29  7:41   ` Christoph Hellwig
2022-06-29 22:04   ` Darrick J. Wong
2022-07-07 12:36     ` Dave Chinner
2022-07-07 17:55       ` Darrick J. Wong
2022-07-11  5:16       ` Christoph Hellwig
2022-07-07  2:40 ` Darrick J. Wong [this message]