From: Mike Snitzer <snitzer@kernel.org>
To: Damien Le Moal <dlemoal@kernel.org>
Cc: linux-block@vger.kernel.org, Jens Axboe <axboe@kernel.dk>,
linux-scsi@vger.kernel.org,
"Martin K . Petersen" <martin.petersen@oracle.com>,
dm-devel@lists.linux.dev, Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH 10/26] dm: Use the block layer zone append emulation
Date: Sat, 3 Feb 2024 12:58:48 -0500 [thread overview]
Message-ID: <Zb5-2LsnQtJHV2mL@redhat.com> (raw)
In-Reply-To: <20240202073104.2418230-11-dlemoal@kernel.org>
On Fri, Feb 02 2024 at 2:30P -0500,
Damien Le Moal <dlemoal@kernel.org> wrote:
> For targets requiring zone append operation emulation with regular
> writes (e.g. dm-crypt), we can use the block layer emulation provided by
> zone write plugging. Remove DM implemented zone append emulation and
> enable the block layer one.
>
> This is done by setting the max_zone_append_sectors limit of the
> mapped device queue to 0 for mapped devices that have a target table
> that cannot support native zone append operations. These includes
> mixed zoned and non-zoned targets, or targets that explicitly requested
> emulation of zone append (e.g. dm-crypt). For these mapped devices, the
> new field emulate_zone_append is set to true. dm_split_and_process_bio()
> is modified to call blk_zone_write_plug_bio() for such device to let the
> block layer transform zone append operations into regular writes. This
> is done after ensuring that the submitted BIO is split if it straddles
> zone boundaries.
>
> dm_revalidate_zones() is also modified to use the block layer provided
> function blk_revalidate_disk_zones() so that all zone resources needed
> for zone append emulation are allocated and initialized by the block
> layer without DM core needing to do anything. Since the device table is
> not yet live when dm_revalidate_zones() is executed, enabling the use of
> blk_revalidate_disk_zones() requires adding a pointer to the device
> table in struct mapped_device. This avoids errors in
> dm_blk_report_zones() trying to get the table with dm_get_live_table().
> The mapped device table pointer is set to the table passed as argument
> to dm_revalidate_zones() before calling blk_revalidate_disk_zones() and
> reset to NULL after this function returns to restore the live table
> handling for user call of report zones.
>
> All the code related to zone append emulation is removed from
> dm-zone.c. This leads to simplifications of the functions __map_bio()
> and dm_zone_endio(). This later function now only needs to deal with
> completions of real zone append operations for targets that support it.
>
> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Love the overall improvement to the DM core code and the broader block
layer by switching to this bio-based ZWP approach.
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
But one incremental suggestion inlined below.
> ---
> drivers/md/dm-core.h | 11 +-
> drivers/md/dm-zone.c | 470 ++++---------------------------------------
> drivers/md/dm.c | 44 ++--
> drivers/md/dm.h | 7 -
> 4 files changed, 68 insertions(+), 464 deletions(-)
>
<snip>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 8dcabf84d866..92ce3b2eb4ae 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -1419,25 +1419,12 @@ static void __map_bio(struct bio *clone)
> down(&md->swap_bios_semaphore);
> }
>
> - if (static_branch_unlikely(&zoned_enabled)) {
> - /*
> - * Check if the IO needs a special mapping due to zone append
> - * emulation on zoned target. In this case, dm_zone_map_bio()
> - * calls the target map operation.
> - */
> - if (unlikely(dm_emulate_zone_append(md)))
> - r = dm_zone_map_bio(tio);
> - else
> - goto do_map;
> - } else {
> -do_map:
> - if (likely(ti->type->map == linear_map))
> - r = linear_map(ti, clone);
> - else if (ti->type->map == stripe_map)
> - r = stripe_map(ti, clone);
> - else
> - r = ti->type->map(ti, clone);
> - }
> + if (likely(ti->type->map == linear_map))
> + r = linear_map(ti, clone);
> + else if (ti->type->map == stripe_map)
> + r = stripe_map(ti, clone);
> + else
> + r = ti->type->map(ti, clone);
>
> switch (r) {
> case DM_MAPIO_SUBMITTED:
> @@ -1774,19 +1761,33 @@ static void dm_split_and_process_bio(struct mapped_device *md,
> struct clone_info ci;
> struct dm_io *io;
> blk_status_t error = BLK_STS_OK;
> - bool is_abnormal;
> + bool is_abnormal, need_split;
>
> is_abnormal = is_abnormal_io(bio);
> - if (unlikely(is_abnormal)) {
> + if (likely(!md->emulate_zone_append))
> + need_split = is_abnormal;
> + else
> + need_split = is_abnormal || bio_straddle_zones(bio);
> + if (unlikely(need_split)) {
> /*
> * Use bio_split_to_limits() for abnormal IO (e.g. discard, etc)
> * otherwise associated queue_limits won't be imposed.
> + * Also split the BIO for mapped devices needing zone append
> + * emulation to ensure that the BIO does not cross zone
> + * boundaries.
> */
> bio = bio_split_to_limits(bio);
> if (!bio)
> return;
> }
>
> + /*
> + * Use the block layer zone write plugging for mapped devices that
> + * need zone append emulation (e.g. dm-crypt).
> + */
> + if (md->emulate_zone_append && blk_zone_write_plug_bio(bio, 0))
> + return;
> +
> /* Only support nowait for normal IO */
> if (unlikely(bio->bi_opf & REQ_NOWAIT) && !is_abnormal) {
> io = alloc_io(md, bio, GFP_NOWAIT);
Would prefer to see this incremental change included from the start:
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 92ce3b2eb4ae..1fd9bbf35db3 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1763,11 +1763,10 @@ static void dm_split_and_process_bio(struct mapped_device *md,
blk_status_t error = BLK_STS_OK;
bool is_abnormal, need_split;
- is_abnormal = is_abnormal_io(bio);
- if (likely(!md->emulate_zone_append))
- need_split = is_abnormal;
- else
+ need_split = is_abnormal = is_abnormal_io(bio);
+ if (static_branch_unlikely(&zoned_enabled) && unlikely(md->emulate_zone_append))
need_split = is_abnormal || bio_straddle_zones(bio);
+
if (unlikely(need_split)) {
/*
* Use bio_split_to_limits() for abnormal IO (e.g. discard, etc)
@@ -1781,12 +1780,14 @@ static void dm_split_and_process_bio(struct mapped_device *md,
return;
}
- /*
- * Use the block layer zone write plugging for mapped devices that
- * need zone append emulation (e.g. dm-crypt).
- */
- if (md->emulate_zone_append && blk_zone_write_plug_bio(bio, 0))
- return;
+ if (static_branch_unlikely(&zoned_enabled)) {
+ /*
+ * Use the block layer zone write plugging for mapped devices that
+ * need zone append emulation (e.g. dm-crypt).
+ */
+ if (unlikely(md->emulate_zone_append) && blk_zone_write_plug_bio(bio, 0))
+ return;
+ }
/* Only support nowait for normal IO */
if (unlikely(bio->bi_opf & REQ_NOWAIT) && !is_abnormal) {
next prev parent reply other threads:[~2024-02-03 17:58 UTC|newest]
Thread overview: 107+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-02 7:30 [PATCH 00/26] Zone write plugging Damien Le Moal
2024-02-02 7:30 ` [PATCH 01/26] block: Restore sector of flush requests Damien Le Moal
2024-02-04 11:55 ` Hannes Reinecke
2024-02-05 17:22 ` Bart Van Assche
2024-02-05 23:42 ` Damien Le Moal
2024-02-02 7:30 ` [PATCH 02/26] block: Remove req_bio_endio() Damien Le Moal
2024-02-04 11:57 ` Hannes Reinecke
2024-02-05 17:28 ` Bart Van Assche
2024-02-05 23:45 ` Damien Le Moal
2024-02-09 6:53 ` Damien Le Moal
2024-02-02 7:30 ` [PATCH 03/26] block: Introduce bio_straddle_zones() and bio_offset_from_zone_start() Damien Le Moal
2024-02-03 4:09 ` Bart Van Assche
2024-02-04 11:58 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 04/26] block: Introduce blk_zone_complete_request_bio() Damien Le Moal
2024-02-04 11:59 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 05/26] block: Allow using bio_attempt_back_merge() internally Damien Le Moal
2024-02-03 4:11 ` Bart Van Assche
2024-02-04 12:00 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 06/26] block: Introduce zone write plugging Damien Le Moal
2024-02-04 3:56 ` Ming Lei
2024-02-04 23:57 ` Damien Le Moal
2024-02-05 2:19 ` Ming Lei
2024-02-05 2:41 ` Damien Le Moal
2024-02-05 3:38 ` Ming Lei
2024-02-05 5:11 ` Christoph Hellwig
2024-02-05 5:37 ` Damien Le Moal
2024-02-05 5:50 ` Christoph Hellwig
2024-02-05 6:14 ` Damien Le Moal
2024-02-05 10:06 ` Ming Lei
2024-02-05 12:20 ` Damien Le Moal
2024-02-05 12:43 ` Damien Le Moal
2024-02-04 12:14 ` Hannes Reinecke
2024-02-05 17:48 ` Bart Van Assche
2024-02-05 23:48 ` Damien Le Moal
2024-02-06 0:52 ` Bart Van Assche
2024-02-02 7:30 ` [PATCH 07/26] block: Allow zero value of max_zone_append_sectors queue limit Damien Le Moal
2024-02-04 12:15 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 08/26] block: Implement zone append emulation Damien Le Moal
2024-02-04 12:24 ` Hannes Reinecke
2024-02-05 0:10 ` Damien Le Moal
2024-02-05 17:58 ` Bart Van Assche
2024-02-05 23:57 ` Damien Le Moal
2024-02-02 7:30 ` [PATCH 09/26] block: Allow BIO-based drivers to use blk_revalidate_disk_zones() Damien Le Moal
2024-02-04 12:26 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 10/26] dm: Use the block layer zone append emulation Damien Le Moal
2024-02-03 17:58 ` Mike Snitzer [this message]
2024-02-05 5:38 ` Damien Le Moal
2024-02-05 20:33 ` Mike Snitzer
2024-02-05 23:40 ` Damien Le Moal
2024-02-06 20:41 ` Mike Snitzer
2024-02-04 12:30 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 11/26] scsi: sd: " Damien Le Moal
2024-02-04 12:29 ` Hannes Reinecke
2024-02-06 1:55 ` Martin K. Petersen
2024-02-02 7:30 ` [PATCH 12/26] ublk_drv: Do not request ELEVATOR_F_ZBD_SEQ_WRITE elevator feature Damien Le Moal
2024-02-04 12:31 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 13/26] null_blk: " Damien Le Moal
2024-02-04 12:31 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 14/26] null_blk: Introduce zone_append_max_sectors attribute Damien Le Moal
2024-02-04 12:32 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 15/26] null_blk: Introduce fua attribute Damien Le Moal
2024-02-04 12:33 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 16/26] nvmet: zns: Do not reference the gendisk conv_zones_bitmap Damien Le Moal
2024-02-04 12:34 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 17/26] block: Remove BLK_STS_ZONE_RESOURCE Damien Le Moal
2024-02-04 12:34 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 18/26] block: Simplify blk_revalidate_disk_zones() interface Damien Le Moal
2024-02-04 12:35 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 19/26] block: mq-deadline: Remove support for zone write locking Damien Le Moal
2024-02-04 12:36 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 20/26] block: Remove elevator required features Damien Le Moal
2024-02-04 12:36 ` Hannes Reinecke
2024-02-02 7:30 ` [PATCH 21/26] block: Do not check zone type in blk_check_zone_append() Damien Le Moal
2024-02-04 12:37 ` Hannes Reinecke
2024-02-02 7:31 ` [PATCH 22/26] block: Move zone related debugfs attribute to blk-zoned.c Damien Le Moal
2024-02-04 12:38 ` Hannes Reinecke
2024-02-02 7:31 ` [PATCH 23/26] block: Remove zone write locking Damien Le Moal
2024-02-04 12:38 ` Hannes Reinecke
2024-02-02 7:31 ` [PATCH 24/26] block: Do not special-case plugging of zone write operations Damien Le Moal
2024-02-04 12:39 ` Hannes Reinecke
2024-02-02 7:31 ` [PATCH 25/26] block: Reduce zone write plugging memory usage Damien Le Moal
2024-02-04 12:42 ` Hannes Reinecke
2024-02-05 17:51 ` Bart Van Assche
2024-02-05 23:55 ` Damien Le Moal
2024-02-06 21:20 ` Bart Van Assche
2024-02-09 3:58 ` Damien Le Moal
2024-02-09 19:36 ` Bart Van Assche
2024-02-10 0:06 ` Damien Le Moal
2024-02-11 3:40 ` Bart Van Assche
2024-02-12 1:09 ` Damien Le Moal
2024-02-12 18:58 ` Bart Van Assche
2024-02-12 8:23 ` Damien Le Moal
2024-02-12 8:47 ` Damien Le Moal
2024-02-12 18:40 ` Bart Van Assche
2024-02-13 0:05 ` Damien Le Moal
2024-02-02 7:31 ` [PATCH 26/26] block: Add zone_active_wplugs debugfs entry Damien Le Moal
2024-02-04 12:43 ` Hannes Reinecke
2024-02-02 7:37 ` [PATCH 00/26] Zone write plugging Damien Le Moal
2024-02-03 12:11 ` Jens Axboe
2024-02-09 5:28 ` Damien Le Moal
2024-02-05 17:21 ` Bart Van Assche
2024-02-05 23:42 ` Damien Le Moal
2024-02-06 0:57 ` Bart Van Assche
2024-02-05 18:18 ` Bart Van Assche
2024-02-06 0:07 ` Damien Le Moal
2024-02-06 1:25 ` Bart Van Assche
2024-02-09 4:03 ` Damien Le Moal
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Zb5-2LsnQtJHV2mL@redhat.com \
--to=snitzer@kernel.org \
--cc=axboe@kernel.dk \
--cc=dlemoal@kernel.org \
--cc=dm-devel@lists.linux.dev \
--cc=hch@lst.de \
--cc=linux-block@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=martin.petersen@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).