public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 00/39 v5] xfs: CIL and log optimisations
Date: Thu, 3 Jun 2021 10:05:13 -0700	[thread overview]
Message-ID: <20210603170513.GH26402@locust> (raw)
In-Reply-To: <20210603052240.171998-1-david@fromorbit.com>

On Thu, Jun 03, 2021 at 03:22:01PM +1000, Dave Chinner wrote:
> Hi folks,
> 
> This is an update of the consolidated log scalability patchset I've been working
> on. Version 4 was posted here:
> 
> https://lore.kernel.org/linux-xfs/20210519121317.585244-1-david@fromorbit.com/
> 
> This version contains the changes Darrick requested during review. The only
> patch remaining without at least one RVB tag is patch 30.
> 
> Performance improvements are largely documented in the change logs of the
> individual patches. Headline numbers are an increase in transaction rate from
> 700k commits/s to 1.7M commits/s, and a reduction in fua/flush operations by
> 2-3 orders of magnitude on metadata heavy workloads that don't use fsync.
> 
> Summary of series:
> 
> Patches		Modifications
> -------		-------------
> 1-7:		log write FUA/FLUSH optimisations
> 8:		bug fix
> 9-11:		Async CIL pushes
> 12-25:		xlog_write() rework
> 26-39:		CIL commit scalability

From this latest posting, I see that the first nine patches all have
multiple reviews.  Some of patches 10-19 have review tags split between
Brian and Christoph, but neither have added them all the way through.
I think I'm the only one who has supplied RVB tags for all forty.

So my question is: at what point would you like me to pull the segments
of this patchset into upstream?  "The maintainer reviewed everything" is
of course our usual standard, but this touches a /lot/ of core logging
code, and logging isn't one of my stronger areas of familiarity.

I think 1-8 look fine for 5.14.  Do you want me to wait for Brian and/or
Christoph (or really, any third pair of eyes) to finish working their
way through 9-11 and 12-25 before merging them?

--D

> 
> Change log is below.
> 
> Cheers,
> 
> Dave.
> 
> Version 5:
> - fix typo in comment (patch 4)
> - fix typo in commit msg (patch 7)
> - removed unnecessary update of num_iovecs (patch 20)
> - fixed whitespace (patch 29)
> - removed unused space_used/curr_res from CIL pcp structure (patch 29)
> - added space_used to CIL pcp structure (patch 30)
> - add comment to xlog_cil_pcp_aggregate() (patch 30)
> - added space_reserved to CIL pcp structure (patch 31)
> - removed unnecessary commentary about CIL ordering issues from commit msg in
>   patch 33 as patch 35 addresses these issues..
> - fix whitespace in xlog_cil_free_logvec() (patch 35)
> - fix weird sentence in docco (patch 39)
> - propagate changes through patches to fix conflicts due to dependent changes.
>   (various patches)
> 
> 
> Version 4:
> - rebase on 5.13-rc2+
> - fixed completion logic for async cache flush
> - trimmed superflous comments about not requiring REQ_PREFLUSH for iclog IO
>   anymore.
> - ensure that setting/clearing XLOG_ICL_NEED_FLUSH is atomic (i.e. only modified
>   while holding the icloglock)
> - ensure callers only add iclog flush/fua flags appropriately before releasing
>   the iclog so that multiple independent writes to an iclog doesn't clear flags
>   other writes into the iclog depend on.
> - buffer log item dirty tracking patches merged so removed from series
> - replaced XFS_LSN_CMP() checks with direct lsn1 == lsn2 comparisons to simplify
>   the code
> - changed "push_async" to "push_commit_stable" to indicate that the push caller
>   wants the entire CIL checkpoint and commit record to be on stable storage when
>   it completes.
> - updated comment to indicate that iclog sync state is set according to the
>   caller's desire for a stable checkpoint to be performed.
> - Added comment explaining why the CIL workqueue is limited to 4 concurrent
>   works per filesystem.
> - debug overhead reduction patches merged so removed from series
> - cleaned up a couple of typedef uses.
> - updated pahole output for checkpoint header in commit message
> - Added BUILD_BUG_ON() to check the size of unmount records.
> - got rid of XFS_VOLUME define.
> - got rid of XLOG_TIC_LEN_MAX define.
> - cleaned up extra blank lines.
> - fixed double initialisation of lv in xlog_write_single().
> - removed the unnecessary change for reserved iclog space in
>   xlog_state_get_iclog_space().
> - no need to check for XLOG_CONTINUE_TRANS in xlog_write_partial() as it will
>   always be set.
> - added a patch for removing the optype parameter from xlog_write()
> - removed unused nvecs from struct xfs_cil_ctx
> - fixed whitespace damage in xlog_cil_pcp_dead()
> - added missing cpu dead accounting transfer in xlog_cil_pcp_dead().
> - factored out CIL push percpu structure aggregation into
>   xlog_cil_pcp_aggregate()
> - added the ctx->ticket->t_unit_res update back into the code even though it is
>   largely unnecessary.
> - cleaned up the pcp, cilpcp, pcptr mess in xlog_cil_pcp_alloc() and elsewhere
>   to use variable names consistently.
> - simplified the CIL sort comparison functions to a single comparison operation
> - fixed percpu CIL item list sort order where items in the same transaction
>   (order id) were reversed, leading to intents being replayed in the wrong
>   order.
> - split out log vector chain conversion to list_head into separate patch
> - Updated documentation with all the fixes and suggestions made.
> 
> 
> Version 3:
> - rebase onto 5.12-rc1+
> - aggregate many small dependent patchsets in one large one.
> - simplify xlog_wait_on_iclog_lsn() back to just a call to xlog_wait_on_iclog()
> - remove xfs_blkdev_issue_flush() instead of moving and renaming it.
> - pass bio to xfs_flush_bdev_async() so it doesn't need allocation.
> - skip cache flush in xfs_flush_bdev_async() if the underlying queue does not
>   require it.
> - fixed whitespace in xfs_flush_bdev_async()
> - remove the implicit external log's data device cache flush code and replace it
>   with an explicit flush in the unmount record write so that it works the same
>   as the new CIL checkpoint cache pre-flush mechanism. This mechanism now
>   guarantees metadata vs journal ordering for both internal and external logs.
> - updated various commit messages
> - fixed incorrect/unintended changes to xfs_log_force() behaviour
> - typedef uint64_t xfs_csn_t; and conversion.
> - removed stray trace_printk()s that were used for debugging.
> - fixed minor formatting details.
> - uninlined xlog_prepare_iovec()
> - fixed up "lv chain vector and size calculation" commit message to reflect we
>   are only calculating and passin gin the vector byte count.
> - reworked the loop in xlog_write_single() based on Christoph's suggestion. Much
>   cleaner!
> - added patch to pass log ticket down to xlog_sync() so that it accounts the
>   roundoff to the log ticket rather than directly modifying grant heads. Grant
>   heads are hot, so every little bit helps.
> - added patch to update delayed logging design doc with background material on
>   how transactions and log space accounting works in XFS.
> 
> Version 2:
> - fix ticket reservation roundoff to include 2 roundoffs
> - removed stale copied comment from roundoff initialisation.
> - clarified "separation" to mean "separation for ordering purposes" in commit
>   message.
> - added comment that newly activated, clean, empty iclogs have a LSN of 0 so are
>   captured by the "iclog lsn < start_lsn" case that avoids needing to wait
>   before releasing the commit iclog to be written.
> - added async cache flush infrastructure
> - convert CIL checkpoint push work it issue an unconditional metadata device
>   cache flush rather than asking the first iclog write to issue it via
>   REQ_PREFLUSH.
> - cleaned up xlog_write() to remove a redundant parameter and prepare the logic
>   for setting flags on the iclog based on the type of operational data is being
>   written to the log.
> - added XLOG_ICL_NEED_FUA flag to complement the NEED_FLUSH flag, allowing
>   callers to issue explicit flushes and clear the NEED_FLUSH flag before the
>   iclog is written without dropping the REQ_FUA requirement in /dev/null...
> - added CIL commit-in-start-iclog optimisation that clears the NEED_FLUSH flag
>   to avoid an unnecessary cache flush when issuing the iclog.
> - fixed typo in CIL throttle bugfix comment.
> - fixed trailing whitespace in commit message.
> 
> 

  parent reply	other threads:[~2021-06-03 17:05 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-03  5:22 [PATCH 00/39 v5] xfs: CIL and log optimisations Dave Chinner
2021-06-03  5:22 ` [PATCH 01/39] xfs: log stripe roundoff is a property of the log Dave Chinner
2021-06-03  5:22 ` [PATCH 02/39] xfs: separate CIL commit record IO Dave Chinner
2021-06-03  5:22 ` [PATCH 03/39] xfs: remove xfs_blkdev_issue_flush Dave Chinner
2021-06-03  5:22 ` [PATCH 04/39] xfs: async blkdev cache flush Dave Chinner
2021-06-03  5:22 ` [PATCH 05/39] xfs: CIL checkpoint flushes caches unconditionally Dave Chinner
2021-06-03  5:22 ` [PATCH 06/39] xfs: remove need_start_rec parameter from xlog_write() Dave Chinner
2021-06-03 22:29   ` Allison Henderson
2021-06-03  5:22 ` [PATCH 07/39] xfs: journal IO cache flush reductions Dave Chinner
2021-06-03 22:29   ` Allison Henderson
2021-06-04  5:15   ` [xfs] 80d287de7b: stress-ng.dir.ops_per_sec 80.1% improvement kernel test robot
2021-06-03  5:22 ` [PATCH 08/39] xfs: Fix CIL throttle hang when CIL space used going backwards Dave Chinner
2021-06-03 22:29   ` Allison Henderson
2021-06-03  5:22 ` [PATCH 09/39] xfs: xfs_log_force_lsn isn't passed a LSN Dave Chinner
2021-06-03 22:29   ` Allison Henderson
2021-06-03  5:22 ` [PATCH 10/39] xfs: AIL needs asynchronous CIL forcing Dave Chinner
2021-06-03 22:29   ` Allison Henderson
2021-06-03  5:22 ` [PATCH 11/39] xfs: CIL work is serialised, not pipelined Dave Chinner
2021-06-03  5:22 ` [PATCH 12/39] xfs: factor out the CIL transaction header building Dave Chinner
2021-06-03  5:22 ` [PATCH 13/39] xfs: only CIL pushes require a start record Dave Chinner
2021-06-03  5:22 ` [PATCH 14/39] xfs: embed the xlog_op_header in the unmount record Dave Chinner
2021-06-03  5:22 ` [PATCH 15/39] xfs: embed the xlog_op_header in the commit record Dave Chinner
2021-06-03  5:22 ` [PATCH 16/39] xfs: log tickets don't need log client id Dave Chinner
2021-06-03  5:22 ` [PATCH 17/39] xfs: move log iovec alignment to preparation function Dave Chinner
2021-06-03  5:22 ` [PATCH 18/39] xfs: reserve space and initialise xlog_op_header in item formatting Dave Chinner
2021-06-03  5:22 ` [PATCH 19/39] xfs: log ticket region debug is largely useless Dave Chinner
2021-06-03  5:22 ` [PATCH 20/39] xfs: pass lv chain length into xlog_write() Dave Chinner
2021-06-03  5:22 ` [PATCH 21/39] xfs: introduce xlog_write_single() Dave Chinner
2021-06-03  5:22 ` [PATCH 22/39] xfs:_introduce xlog_write_partial() Dave Chinner
2021-06-03  5:22 ` [PATCH 23/39] xfs: xlog_write() no longer needs contwr state Dave Chinner
2021-06-03  5:22 ` [PATCH 24/39] xfs: xlog_write() doesn't need optype anymore Dave Chinner
2021-06-03  5:22 ` [PATCH 25/39] xfs: CIL context doesn't need to count iovecs Dave Chinner
2021-06-03  5:22 ` [PATCH 26/39] xfs: use the CIL space used counter for emptiness checks Dave Chinner
2021-06-03  5:22 ` [PATCH 27/39] xfs: lift init CIL reservation out of xc_cil_lock Dave Chinner
2021-06-03  5:22 ` [PATCH 28/39] xfs: rework per-iclog header CIL reservation Dave Chinner
2021-06-03  5:22 ` [PATCH 29/39] xfs: introduce per-cpu CIL tracking structure Dave Chinner
2021-06-03  9:18   ` kernel test robot
2021-06-03 10:08   ` kernel test robot
2021-06-03 16:49   ` Darrick J. Wong
2021-06-03  5:22 ` [PATCH 30/39] xfs: implement percpu cil space used calculation Dave Chinner
2021-06-03 16:44   ` Darrick J. Wong
2021-06-03  5:22 ` [PATCH 31/39] xfs: track CIL ticket reservation in percpu structure Dave Chinner
2021-06-03 16:43   ` Darrick J. Wong
2021-06-03  5:22 ` [PATCH 32/39] xfs: convert CIL busy extents to per-cpu Dave Chinner
2021-06-03  5:22 ` [PATCH 33/39] xfs: Add order IDs to log items in CIL Dave Chinner
2021-06-03  5:22 ` [PATCH 34/39] xfs: convert CIL to unordered per cpu lists Dave Chinner
2021-06-03  5:22 ` [PATCH 35/39] xfs: convert log vector chain to use list heads Dave Chinner
2021-06-03 10:16   ` kernel test robot
2021-06-03  5:22 ` [PATCH 36/39] xfs: move CIL ordering to the logvec chain Dave Chinner
2021-06-03  5:22 ` [PATCH 37/39] xfs: avoid cil push lock if possible Dave Chinner
2021-06-03  5:22 ` [PATCH 38/39] xfs: xlog_sync() manually adjusts grant head space Dave Chinner
2021-06-03  5:22 ` [PATCH 39/39] xfs: expanding delayed logging design with background material Dave Chinner
2021-06-03 17:05 ` Darrick J. Wong [this message]
2021-06-03 22:43   ` [PATCH 00/39 v5] xfs: CIL and log optimisations Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210603170513.GH26402@locust \
    --to=djwong@kernel.org \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox