From: Bart Van Assche <bvanassche@acm.org>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
Christoph Hellwig <hch@lst.de>,
Damien Le Moal <dlemoal@kernel.org>,
Jaegeuk Kim <jaegeuk@kernel.org>,
Bart Van Assche <bvanassche@acm.org>
Subject: [PATCH v16 00/26] Improve write performance for zoned UFS devices
Date: Mon, 18 Nov 2024 16:27:49 -0800 [thread overview]
Message-ID: <20241119002815.600608-1-bvanassche@acm.org> (raw)
Hi Damien and Christoph,
This patch series improves small write IOPS by a factor of four (+300%) for
zoned UFS devices on my test setup with an UFSHCI 3.0 controller. Although
you are probably busy because the merge window is open, please take a look
at this patch series when you have the time. This patch series is organized
as follows:
- Bug fixes for existing code at the start of the series.
- The write pipelining support implementation comes after the bug fixes.
Thanks,
Bart.
Changes compared to v15:
- Reworked this patch series on top of the zone write plugging approach.
- Moved support for requeuing requests from the SCSI core into the block
layer core.
- In the UFS driver, instead of disabling write pipelining if
auto-hibernation is enabled, rely on the requeuing mechanism to handle
reordering caused by resuming from auto-hibernation.
Changes compared to v14:
- Removed the drivers/scsi/Kconfig.kunit and drivers/scsi/Makefile.kunit
files. Instead, modified drivers/scsi/Kconfig and added #include "*_test.c"
directives in the appropriate .c files. Removed the EXPORT_SYMBOL()
directives that were added to make the unit tests link.
- Fixed a double free in a unit test.
Changes compared to v13:
- Reworked patch "block: Preserve the order of requeued zoned writes".
- Addressed a performance concern by removing the eh_needs_prepare_resubmit
SCSI driver callback and by introducing the SCSI host template flag
.needs_prepare_resubmit instead.
- Added a patch that adds a 'host' argument to scsi_eh_flush_done_q().
- Made the code in unit tests less repetitive.
Changes compared to v12:
- Added two new patches: "block: Preserve the order of requeued zoned writes"
and "scsi: sd: Add a unit test for sd_cmp_sector()"
- Restricted the number of zoned write retries. To my surprise I had to add
"&& scmd->retries <= scmd->allowed" in the SCSI error handler to limit the
number of retries.
- In patch "scsi: ufs: Inform the block layer about write ordering", only set
ELEVATOR_F_ZBD_SEQ_WRITE for zoned block devices.
Changes compared to v11:
- Fixed a NULL pointer dereference that happened when booting from an ATA
device by adding an scmd->device != NULL check in scsi_needs_preparation().
- Updated Reviewed-by tags.
Changes compared to v10:
- Dropped the UFS MediaTek and HiSilicon patches because these are not correct
and because it is safe to drop these patches.
- Updated Acked-by / Reviewed-by tags.
Changes compared to v9:
- Introduced an additional scsi_driver callback: .eh_needs_prepare_resubmit().
- Renamed the scsi_debug kernel module parameter 'no_zone_write_lock' into
'preserves_write_order'.
- Fixed an out-of-bounds access in the unit scsi_call_prepare_resubmit() unit
test.
- Wrapped ufshcd_auto_hibern8_update() calls in UFS host drivers with
WARN_ON_ONCE() such that a kernel stack appears in case an error code is
returned.
- Elaborated a comment in the UFSHCI driver.
Changes compared to v8:
- Fixed handling of 'driver_preserves_write_order' and 'use_zone_write_lock'
in blk_stack_limits().
- Added a comment in disk_set_zoned().
- Modified blk_req_needs_zone_write_lock() such that it returns false if
q->limits.use_zone_write_lock is false.
- Modified disk_clear_zone_settings() such that it clears
q->limits.use_zone_write_lock.
- Left out one change from the mq-deadline patch that became superfluous due to
the blk_req_needs_zone_write_lock() change.
- Modified scsi_call_prepare_resubmit() such that it only calls list_sort() if
zoned writes have to be resubmitted for which zone write locking is disabled.
- Added an additional unit test for scsi_call_prepare_resubmit().
- Modified the sorting code in the sd driver such that only those SCSI commands
are sorted for which write locking is disabled.
- Modified sd_zbc.c such that ELEVATOR_F_ZBD_SEQ_WRITE is only set if the
write order is not preserved.
- Included three patches for UFS host drivers that rework code that wrote
directly to the auto-hibernation controller register.
- Modified the UFS driver such that enabling auto-hibernation is not allowed
if a zoned logical unit is present and if the controller operates in legacy
mode.
- Also in the UFS driver, simplified ufshcd_auto_hibern8_update().
Changes compared to v7:
- Split the queue_limits member variable `use_zone_write_lock' into two member
variables: `use_zone_write_lock' (set by disk_set_zoned()) and
`driver_preserves_write_order' (set by the block driver or SCSI LLD). This
should clear up the confusion about the purpose of this variable.
- Moved the code for sorting SCSI commands by LBA from the SCSI error handler
into the SCSI disk (sd) driver as requested by Christoph.
Changes compared to v6:
- Removed QUEUE_FLAG_NO_ZONE_WRITE_LOCK and instead introduced a flag in
the request queue limits data structure.
Changes compared to v5:
- Renamed scsi_cmp_lba() into scsi_cmp_sector().
- Improved several source code comments.
Changes compared to v4:
- Dropped the patch that introduces the REQ_NO_ZONE_WRITE_LOCK flag.
- Dropped the null_blk patch and added two scsi_debug patches instead.
- Dropped the f2fs patch.
- Split the patch for the UFS driver into two patches.
- Modified several patch descriptions and source code comments.
- Renamed dd_use_write_locking() into dd_use_zone_write_locking().
- Moved the list_sort() call from scsi_unjam_host() into scsi_eh_flush_done_q()
such that sorting happens just before reinserting.
- Removed the scsi_cmd_retry_allowed() call from scsi_check_sense() to make
sure that the retry counter is adjusted once per retry instead of twice.
Changes compared to v3:
- Restored the patch that introduces QUEUE_FLAG_NO_ZONE_WRITE_LOCK. That patch
had accidentally been left out from v2.
- In patch "block: Introduce the flag REQ_NO_ZONE_WRITE_LOCK", improved the
patch description and added the function blk_no_zone_write_lock().
- In patch "block/mq-deadline: Only use zone locking if necessary", moved the
blk_queue_is_zoned() call into dd_use_write_locking().
- In patch "fs/f2fs: Disable zone write locking", set REQ_NO_ZONE_WRITE_LOCK
from inside __bio_alloc() instead of in f2fs_submit_write_bio().
Changes compared to v2:
- Renamed the request queue flag for disabling zone write locking.
- Introduced a new request flag for disabling zone write locking.
- Modified the mq-deadline scheduler such that zone write locking is only
disabled if both flags are set.
- Added an F2FS patch that sets the request flag for disabling zone write
locking.
- Only disable zone write locking in the UFS driver if auto-hibernation is
disabled.
Changes compared to v1:
- Left out the patches that are already upstream.
- Switched the approach in patch "scsi: Retry unaligned zoned writes" from
retrying immediately to sending unaligned write commands to the SCSI error
handler.
Bart Van Assche (26):
blk-zoned: Fix a reference count leak
blk-zoned: Split disk_zone_wplugs_work()
blk-zoned: Split queue_zone_wplugs_show()
blk-zoned: Only handle errors after pending zoned writes have
completed
blk-zoned: Fix a deadlock triggered by unaligned writes
blk-zoned: Fix requeuing of zoned writes
block: Support block drivers that preserve the order of write requests
dm-linear: Report to the block layer that the write order is preserved
mq-deadline: Remove a local variable
blk-mq: Clean up blk_mq_requeue_work()
block: Optimize blk_mq_submit_bio() for the cache hit scenario
block: Rework request allocation in blk_mq_submit_bio()
block: Support allocating from a specific software queue
blk-mq: Restore the zoned write order when requeuing
blk-zoned: Document the locking order
blk-zoned: Document locking assumptions
blk-zoned: Uninline functions that are not in the hot path
blk-zoned: Make disk_should_remove_zone_wplug() more robust
blk-zoned: Add an argument to blk_zone_plug_bio()
blk-zoned: Support pipelining of zoned writes
scsi: core: Retry unaligned zoned writes
scsi: sd: Increase retry count for zoned writes
scsi: scsi_debug: Add the preserves_write_order module parameter
scsi: scsi_debug: Support injecting unaligned write errors
scsi: scsi_debug: Skip host/bus reset settle delay
scsi: ufs: Inform the block layer about write ordering
block/bfq-iosched.c | 2 +
block/blk-mq.c | 97 ++++++----
block/blk-mq.h | 3 +
block/blk-settings.c | 2 +
block/blk-zoned.c | 376 +++++++++++++++++++++++++-------------
block/blk.h | 33 +++-
block/kyber-iosched.c | 2 +
block/mq-deadline.c | 10 +-
drivers/md/dm-linear.c | 6 +
drivers/md/dm.c | 2 +-
drivers/scsi/scsi_debug.c | 22 ++-
drivers/scsi/scsi_error.c | 16 ++
drivers/scsi/sd.c | 7 +
drivers/ufs/core/ufshcd.c | 7 +
include/linux/blk-mq.h | 20 +-
include/linux/blkdev.h | 11 +-
16 files changed, 444 insertions(+), 172 deletions(-)
next reply other threads:[~2024-11-19 0:28 UTC|newest]
Thread overview: 73+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-19 0:27 Bart Van Assche [this message]
2024-11-19 0:27 ` [PATCH v16 01/26] blk-zoned: Fix a reference count leak Bart Van Assche
2024-11-19 2:23 ` Damien Le Moal
2024-11-19 20:21 ` Bart Van Assche
2024-11-19 0:27 ` [PATCH v16 02/26] blk-zoned: Split disk_zone_wplugs_work() Bart Van Assche
2024-11-19 0:27 ` [PATCH v16 03/26] blk-zoned: Split queue_zone_wplugs_show() Bart Van Assche
2024-11-19 2:25 ` Damien Le Moal
2024-11-19 0:27 ` [PATCH v16 04/26] blk-zoned: Only handle errors after pending zoned writes have completed Bart Van Assche
2024-11-19 2:50 ` Damien Le Moal
2024-11-19 20:51 ` Bart Van Assche
2024-11-21 3:23 ` Damien Le Moal
2024-11-21 17:43 ` Bart Van Assche
2024-11-19 0:27 ` [PATCH v16 05/26] blk-zoned: Fix a deadlock triggered by unaligned writes Bart Van Assche
2024-11-19 2:57 ` Damien Le Moal
2024-11-19 21:04 ` Bart Van Assche
2024-11-21 3:32 ` Damien Le Moal
2024-11-21 17:51 ` Bart Van Assche
2024-11-25 4:00 ` Damien Le Moal
2024-11-25 4:19 ` Damien Le Moal
2025-01-09 19:11 ` Bart Van Assche
2025-01-10 5:07 ` Damien Le Moal
2025-01-10 18:17 ` Bart Van Assche
2024-11-19 0:27 ` [PATCH v16 06/26] blk-zoned: Fix requeuing of zoned writes Bart Van Assche
2024-11-19 3:00 ` Damien Le Moal
2024-11-19 21:06 ` Bart Van Assche
2024-11-19 0:27 ` [PATCH v16 07/26] block: Support block drivers that preserve the order of write requests Bart Van Assche
2024-11-19 7:37 ` Damien Le Moal
2024-11-19 21:08 ` Bart Van Assche
2024-11-19 0:27 ` [PATCH v16 08/26] dm-linear: Report to the block layer that the write order is preserved Bart Van Assche
2024-11-19 0:27 ` [PATCH v16 09/26] mq-deadline: Remove a local variable Bart Van Assche
2024-11-19 7:38 ` Damien Le Moal
2024-11-19 21:11 ` Bart Van Assche
2024-11-19 0:27 ` [PATCH v16 10/26] blk-mq: Clean up blk_mq_requeue_work() Bart Van Assche
2024-11-19 7:39 ` Damien Le Moal
2024-11-19 0:28 ` [PATCH v16 11/26] block: Optimize blk_mq_submit_bio() for the cache hit scenario Bart Van Assche
2024-11-19 7:40 ` Damien Le Moal
2024-11-19 0:28 ` [PATCH v16 12/26] block: Rework request allocation in blk_mq_submit_bio() Bart Van Assche
2024-11-19 7:44 ` Damien Le Moal
2024-11-19 0:28 ` [PATCH v16 13/26] block: Support allocating from a specific software queue Bart Van Assche
2024-11-19 0:28 ` [PATCH v16 14/26] blk-mq: Restore the zoned write order when requeuing Bart Van Assche
2024-11-19 7:52 ` Damien Le Moal
2024-11-19 21:16 ` Bart Van Assche
2024-11-19 0:28 ` [PATCH v16 15/26] blk-zoned: Document the locking order Bart Van Assche
2024-11-19 7:52 ` Damien Le Moal
2024-11-19 0:28 ` [PATCH v16 16/26] blk-zoned: Document locking assumptions Bart Van Assche
2024-11-19 7:53 ` Damien Le Moal
2024-11-19 21:18 ` Bart Van Assche
2024-11-21 3:34 ` Damien Le Moal
2024-11-19 0:28 ` [PATCH v16 17/26] blk-zoned: Uninline functions that are not in the hot path Bart Van Assche
2024-11-19 7:55 ` Damien Le Moal
2024-11-19 21:20 ` Bart Van Assche
2024-11-21 3:36 ` Damien Le Moal
2024-11-19 0:28 ` [PATCH v16 18/26] blk-zoned: Make disk_should_remove_zone_wplug() more robust Bart Van Assche
2024-11-19 7:58 ` Damien Le Moal
2024-11-19 0:28 ` [PATCH v16 19/26] blk-zoned: Add an argument to blk_zone_plug_bio() Bart Van Assche
2024-11-19 0:28 ` [PATCH v16 20/26] blk-zoned: Support pipelining of zoned writes Bart Van Assche
2024-11-19 0:28 ` [PATCH v16 21/26] scsi: core: Retry unaligned " Bart Van Assche
2024-11-19 0:28 ` [PATCH v16 22/26] scsi: sd: Increase retry count for " Bart Van Assche
2024-11-19 0:28 ` [PATCH v16 23/26] scsi: scsi_debug: Add the preserves_write_order module parameter Bart Van Assche
2024-11-19 0:28 ` [PATCH v16 24/26] scsi: scsi_debug: Support injecting unaligned write errors Bart Van Assche
2024-11-19 0:28 ` [PATCH v16 25/26] scsi: scsi_debug: Skip host/bus reset settle delay Bart Van Assche
2024-11-19 0:28 ` [PATCH v16 26/26] scsi: ufs: Inform the block layer about write ordering Bart Van Assche
[not found] ` <37f95f44-ab1d-20db-e0c7-94946cb9d4eb@quicinc.com>
2024-11-22 18:20 ` Bart Van Assche
2024-11-23 0:34 ` Can Guo
2024-11-19 8:01 ` [PATCH v16 00/26] Improve write performance for zoned UFS devices Damien Le Moal
2024-11-19 19:08 ` Bart Van Assche
2024-11-21 3:20 ` Damien Le Moal
2024-11-21 18:00 ` Bart Van Assche
2024-11-25 3:59 ` Damien Le Moal
2025-01-09 19:02 ` Bart Van Assche
2025-01-10 5:10 ` Damien Le Moal
2024-11-19 12:25 ` Christoph Hellwig
2024-11-19 18:52 ` Bart Van Assche
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241119002815.600608-1-bvanassche@acm.org \
--to=bvanassche@acm.org \
--cc=axboe@kernel.dk \
--cc=dlemoal@kernel.org \
--cc=hch@lst.de \
--cc=jaegeuk@kernel.org \
--cc=linux-block@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox