From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [GIT PULL v2] xfs: CIL and log scalability improvements
Date: Wed, 9 Jun 2021 13:46:53 -0700
Message-ID: <20210609204653.GZ2945738@locust>
In-Reply-To: <20210608044340.GK664593@dread.disaster.area>
On Tue, Jun 08, 2021 at 02:43:40PM +1000, Dave Chinner wrote:
> Hi Darrick,
>
> I've updated the branch and tag for the CIL and log scalability
> improvements to fix the CPU hotplug bug that was in the previous
> version. The code changes are limited to that fix; everything else in
> the series is unchanged.
>
> Please pull from the tag described below.
Pulled, thanks!
--D
>
> Cheers,
>
> Dave.
>
> The following changes since commit d07f6ca923ea0927a1024dfccafc5b53b61cfecc:
>
> Linux 5.13-rc2 (2021-05-16 15:27:44 -0700)
>
> are available in the Git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs.git tags/xfs-cil-scale-2-tag
>
> for you to fetch changes up to 7017b129e69c1b451fa926f2cac507c4128608dc:
>
> xfs: expanding delayed logging design with background material (2021-06-08 14:27:46 +1000)
>
> ----------------------------------------------------------------
> xfs: CIL and log scalability improvements
>
> Performance improvements are largely documented in the change logs of the
> individual patches. Headline numbers are an increase in transaction rate from
> 700k commits/s to 1.7M commits/s, and a reduction in FUA/flush operations by
> 2-3 orders of magnitude on metadata-heavy workloads that don't use fsync.
>
> Summary of series:
>
> Patches Modifications
> ------- -------------
> 1-7: log write FUA/FLUSH optimisations
> 8: bug fix
> 9-11: Async CIL pushes
> 12-25: xlog_write() rework
> 26-39: CIL commit scalability
>
> The log write FUA/FLUSH optimisations reduce the number of cache flushes
> required to flush the CIL to the journal. They extend the old pre-delayed
> logging ordering semantics, which applied to individual transactions written
> to the iclogs, to cover CIL checkpoint transactions rather than individual
> writes to the iclogs. In doing so, we reduce the cache flush requirements to
> once per CIL checkpoint rather than once per iclog write.
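
To put the "once per checkpoint rather than once per iclog write" change in numbers, here is a minimal arithmetic sketch. It is not code from the series: the function names and the sizes used in the usage note are assumptions made for illustration, and iclog header overhead is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Number of iclog writes needed to hold a checkpoint of ckpt_bytes.
 * Simplified model: ignores iclog headers and stripe roundoff. */
static size_t iclog_writes(size_t ckpt_bytes, size_t iclog_bytes)
{
	assert(iclog_bytes > 0);
	return (ckpt_bytes + iclog_bytes - 1) / iclog_bytes;
}

/* Old ordering model: one cache flush for every iclog write. */
static size_t flushes_per_iclog(size_t ckpt_bytes, size_t iclog_bytes)
{
	return iclog_writes(ckpt_bytes, iclog_bytes);
}

/* New ordering model: one cache flush for each non-empty checkpoint. */
static size_t flushes_per_checkpoint(size_t ckpt_bytes)
{
	return ckpt_bytes ? 1 : 0;
}
```

With an assumed 32 MiB checkpoint and 256 KiB iclogs, the first model issues 128 flushes and the second issues one.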
>
> The async CIL pushes fix a pipeline limitation that only allowed a single CIL
> push to be processed at a time. This was causing CIL checkpoint writing to
> become CPU bound as only a single CIL checkpoint could be pushed at a time. The
> checkpoint pipeline was designed to allow multiple pushes to be in flight at
> once and use careful ordering of the commit records to ensure correct recovery
> order, but the workqueue implementation didn't allow concurrent works to be run.
> The concurrent works now extend out to 4 CIL checkpoints running at a time,
> hence removing the CPU usage limitations without introducing new lock contention
> issues.
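
The two rules in that paragraph (a bounded number of pushes in flight, and commit records written strictly in sequence order) can be modelled in a few lines. This is a single-threaded sketch under stated assumptions: `struct pipeline` and both helpers are hypothetical names, and the real code uses a workqueue and waits rather than returning false.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_INFLIGHT 4	/* from the text: up to 4 checkpoints at a time */

/* Hypothetical model of the checkpoint pipeline state. */
struct pipeline {
	uint64_t next_start;	/* sequence number the next push will get */
	uint64_t next_commit;	/* lowest sequence with no commit record yet */
};

/* Start a new push unless MAX_INFLIGHT pushes are already in flight. */
static bool pipeline_try_start(struct pipeline *p, uint64_t *seq)
{
	if (p->next_start - p->next_commit >= MAX_INFLIGHT)
		return false;
	*seq = p->next_start++;
	return true;
}

/* A push may write its commit record only when every older push has
 * already written its own, so recovery sees them in sequence order. */
static bool pipeline_try_commit(struct pipeline *p, uint64_t seq)
{
	if (seq != p->next_commit || seq >= p->next_start)
		return false;
	p->next_commit++;
	return true;
}
```

In this model the checkpoint bodies may be written in any order; only the commit step is serialised.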
>
> The xlog_write() rework is long overdue. The code is complex, difficult to
> understand, full of tricky, subtle corner cases and just generally really hard
> to modify. This patchset reworks the xlog_write() API to reduce the processing
> overhead of writing out long log vector chains, and factors the xlog_write()
> code into a simple, compact fast path along with a clearer slow path to handle
> the complex cases.
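
The fast path / slow path split can be illustrated with a userspace sketch. None of this is the kernel code: the structures and the `write_*` helpers are hypothetical, and the sketch only assumes that the caller knows the total chain length up front, as the "pass lv chain length into xlog_write()" patch title suggests.

```c
#include <stddef.h>
#include <string.h>

struct region { const void *data; size_t len; };	/* one region to copy */
struct outbuf { char *base; size_t size; size_t used; };	/* iclog stand-in */

/* Fast path: the caller has checked that the whole chain fits, so the
 * loop copies without any per-region space checks. */
static void write_single(struct outbuf *b, const struct region *r, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		memcpy(b->base + b->used, r[i].data, r[i].len);
		b->used += r[i].len;
	}
}

/* Slow path: copy until the buffer is full, splitting the last region
 * if needed. Returns the number of bytes written. */
static size_t write_partial(struct outbuf *b, const struct region *r, size_t n)
{
	size_t written = 0;

	for (size_t i = 0; i < n; i++) {
		size_t space = b->size - b->used;
		size_t chunk = r[i].len < space ? r[i].len : space;

		memcpy(b->base + b->used, r[i].data, chunk);
		b->used += chunk;
		written += chunk;
		if (chunk < r[i].len)
			break;	/* buffer full, remainder left for the caller */
	}
	return written;
}

/* Pick the path from the total length, decided once per chain. */
static size_t write_chain(struct outbuf *b, const struct region *r,
			  size_t n, size_t total_len)
{
	if (total_len <= b->size - b->used) {
		write_single(b, r, n);
		return total_len;
	}
	return write_partial(b, r, n);
}
```

A real slow path would continue the remainder in the next buffer; the sketch stops at the split point to keep the two paths easy to compare.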
>
> The CIL commit scalability patchset removes spinlocks from the transaction
> commit fast path. These spinlocks are the performance limiting bottleneck in the
> transaction commit path, so we apply a variety of different techniques to do
> either atomic, lockless or per-cpu updates of the CIL tracking structures during
> commits. This greatly increases the throughput of the transaction commit
> engine, moving the contention point to the log space tracking algorithms after
> doubling throughput on 32-way workloads.
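
As a rough illustration of the per-cpu technique, the sketch below keeps one space-used counter per CPU and only sums them when a push needs the total. It is a userspace model, not the kernel's percpu API: `NR_CPUS_MODEL`, the 64-byte padding and all names are assumptions, and the caller is assumed to pass its own CPU number so that each slot has a single writer.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS_MODEL 32	/* assumed CPU count, matching "32-way" */

/* One counter per CPU, padded to an assumed 64-byte cache line so that
 * commits on different CPUs do not bounce the same line. */
struct pcp_counter {
	int64_t space_used;
	char pad[64 - sizeof(int64_t)];
};

struct cil_model {
	struct pcp_counter pcp[NR_CPUS_MODEL];
};

/* Commit fast path: touch only this CPU's slot, no shared lock. */
static void cil_account(struct cil_model *cil, unsigned int cpu, int64_t delta)
{
	cil->pcp[cpu % NR_CPUS_MODEL].space_used += delta;
}

/* Push time: fold every per-CPU slot into one total and reset it. */
static int64_t cil_fold(struct cil_model *cil)
{
	int64_t total = 0;

	for (size_t i = 0; i < NR_CPUS_MODEL; i++) {
		total += cil->pcp[i].space_used;
		cil->pcp[i].space_used = 0;
	}
	return total;
}
```

The trade-off in this model is that the total is exact only at fold time; the commit path never reads the other CPUs' slots.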
>
> ----------------------------------------------------------------
> Dave Chinner (40):
> xfs: log stripe roundoff is a property of the log
> xfs: separate CIL commit record IO
> xfs: remove xfs_blkdev_issue_flush
> xfs: async blkdev cache flush
> xfs: CIL checkpoint flushes caches unconditionally
> xfs: remove need_start_rec parameter from xlog_write()
> xfs: journal IO cache flush reductions
> xfs: Fix CIL throttle hang when CIL space used going backwards
> xfs: xfs_log_force_lsn isn't passed a LSN
> xfs: AIL needs asynchronous CIL forcing
> xfs: CIL work is serialised, not pipelined
> xfs: factor out the CIL transaction header building
> xfs: only CIL pushes require a start record
> xfs: embed the xlog_op_header in the unmount record
> xfs: embed the xlog_op_header in the commit record
> xfs: log tickets don't need log client id
> xfs: move log iovec alignment to preparation function
> xfs: reserve space and initialise xlog_op_header in item formatting
> xfs: log ticket region debug is largely useless
> xfs: pass lv chain length into xlog_write()
> xfs: introduce xlog_write_single()
> xfs:_introduce xlog_write_partial()
> xfs: xlog_write() no longer needs contwr state
> xfs: xlog_write() doesn't need optype anymore
> xfs: CIL context doesn't need to count iovecs
> xfs: use the CIL space used counter for emptiness checks
> xfs: lift init CIL reservation out of xc_cil_lock
> xfs: rework per-iclog header CIL reservation
> xfs: introduce CPU hotplug infrastructure
> xfs: introduce per-cpu CIL tracking structure
> xfs: implement percpu cil space used calculation
> xfs: track CIL ticket reservation in percpu structure
> xfs: convert CIL busy extents to per-cpu
> xfs: Add order IDs to log items in CIL
> xfs: convert CIL to unordered per cpu lists
> xfs: convert log vector chain to use list heads
> xfs: move CIL ordering to the logvec chain
> xfs: avoid cil push lock if possible
> xfs: xlog_sync() manually adjusts grant head space
> xfs: expanding delayed logging design with background material
>
> Documentation/filesystems/xfs-delayed-logging-design.rst | 361 ++++++++++++++++++++++++++----
> fs/xfs/libxfs/xfs_log_format.h | 4 -
> fs/xfs/libxfs/xfs_types.h | 1 +
> fs/xfs/xfs_bio_io.c | 35 +++
> fs/xfs/xfs_buf.c | 2 +-
> fs/xfs/xfs_buf_item.c | 39 ++--
> fs/xfs/xfs_dquot_item.c | 2 +-
> fs/xfs/xfs_file.c | 20 +-
> fs/xfs/xfs_inode.c | 10 +-
> fs/xfs/xfs_inode_item.c | 18 +-
> fs/xfs/xfs_inode_item.h | 2 +-
> fs/xfs/xfs_linux.h | 2 +
> fs/xfs/xfs_log.c | 1015 +++++++++++++++++++++++++++++++++++++++---------------------------------------------
> fs/xfs/xfs_log.h | 66 ++----
> fs/xfs/xfs_log_cil.c | 804 ++++++++++++++++++++++++++++++++++++++++++++++++------------------
> fs/xfs/xfs_log_priv.h | 123 ++++++-----
> fs/xfs/xfs_super.c | 52 ++++-
> fs/xfs/xfs_super.h | 1 -
> fs/xfs/xfs_sysfs.c | 1 +
> fs/xfs/xfs_trace.c | 1 +
> fs/xfs/xfs_trans.c | 18 +-
> fs/xfs/xfs_trans.h | 5 +-
> fs/xfs/xfs_trans_ail.c | 11 +-
> fs/xfs/xfs_trans_priv.h | 3 +-
> include/linux/cpuhotplug.h | 1 +
> 25 files changed, 1632 insertions(+), 965 deletions(-)
>
> --
> Dave Chinner
> david@fromorbit.com