From: Damien Le Moal <dlemoal@kernel.org>
To: Bart Van Assche <bvanassche@acm.org>, Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCH v23 01/16] block: Support block devices that preserve the order of write requests
Date: Tue, 12 Aug 2025 11:12:44 +0900
Message-ID: <7570f60f-932b-4b76-a87d-8f3f0760c44f@kernel.org>
In-Reply-To: <20250811200851.626402-2-bvanassche@acm.org>

On 8/12/25 5:08 AM, Bart Van Assche wrote:
> Some storage controllers preserve the request order per hardware queue.
> Some but not all device mapper drivers preserve the bio order. Introduce
> the feature flag BLK_FEAT_ORDERED_HWQ to allow block drivers and stacked
> drivers to indicate that the order of write commands is preserved per
> hardware queue and hence that serialization of writes per zone is not
> required if all pending writes are submitted to the same hardware queue.
> Add a sysfs attribute for controlling write pipelining support.

Why? Why would you want to disable write pipelining when it gives better
performance?

The commit message also does not describe BLK_FEAT_PIPELINE_ZWR, but I think
this enable/disable flag is not needed.

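To make the question concrete, here is a rough userspace model of how the two
flags relate in the hunks quoted below. It is only a sketch: the model_*()
helpers are made up for illustration, struct queue_limits is reduced to one
field, and the store helper assumes that a write to the sysfs attribute simply
sets or clears the bit.

```c
#include <stdint.h>

/* Userspace stand-ins; bit positions copied from the quoted blkdev.h hunk. */
typedef uint32_t blk_features_t;
#define BLK_FEAT_ORDERED_HWQ	((blk_features_t)(1u << 14))
#define BLK_FEAT_PIPELINE_ZWR	((blk_features_t)(1u << 15))

struct queue_limits {
	blk_features_t features;
};

/* Models the blk_validate_zoned_limits() hunk: ordering implies pipelining. */
void model_validate_zoned_limits(struct queue_limits *lim)
{
	if (lim->features & BLK_FEAT_ORDERED_HWQ)
		lim->features |= BLK_FEAT_PIPELINE_ZWR;
}

/*
 * Models the blk_stack_limits() hunk: a stacked queue keeps the ordering
 * flag only if the bottom device also has it.
 */
void model_stack_limits(struct queue_limits *t, const struct queue_limits *b)
{
	if (!(b->features & BLK_FEAT_ORDERED_HWQ))
		t->features &= ~BLK_FEAT_ORDERED_HWQ;
}

/* Hypothetical model of a write to pipeline_zoned_writes (assumed to toggle the bit). */
void model_sysfs_store(struct queue_limits *lim, int enable)
{
	if (enable)
		lim->features |= BLK_FEAT_PIPELINE_ZWR;
	else
		lim->features &= ~BLK_FEAT_PIPELINE_ZWR;
}

/* Nonzero if zoned writes would be pipelined for these limits. */
int model_pipelining_enabled(const struct queue_limits *lim)
{
	return (lim->features & BLK_FEAT_PIPELINE_ZWR) != 0;
}
```

In this model the two bits differ only after model_sysfs_store(&lim, 0), that
is, after a user explicitly turns pipelining off on a queue that preserves the
write order.
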
> 
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  Documentation/ABI/stable/sysfs-block | 15 +++++++++++++++
>  block/blk-settings.c                 | 10 ++++++++++
>  block/blk-sysfs.c                    |  7 +++++++
>  include/linux/blkdev.h               |  9 +++++++++
>  4 files changed, 41 insertions(+)
> 
> diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
> index 803f578dc023..5a42d99cf39a 100644
> --- a/Documentation/ABI/stable/sysfs-block
> +++ b/Documentation/ABI/stable/sysfs-block
> @@ -637,6 +637,21 @@ Description:
>  		I/O size is reported this file contains 0.
>  
>  
> +What:		/sys/block/<disk>/queue/pipeline_zoned_writes
> +Date:		August 2025
> +Contact:	Bart Van Assche <bvanassche@acm.org>
> +Description:
> +		[RW] If this attribute is present it means that the block driver
> +		and the storage controller both support preserving the order of
> +		zoned writes per hardware queue. This attribute controls whether
> +		or not pipelining zoned writes is enabled. If the value of this
> +		attribute is zero, the block layer restricts the queue depth for
> +		sequential writes per zone to one (zone append operations are
> +		not affected). If the value of this attribute is one, the block
> +		layer does not restrict the queue depth of sequential writes per
> +		zone to one.
> +
> +
>  What:		/sys/block/<disk>/queue/physical_block_size
>  Date:		May 2009
>  Contact:	Martin K. Petersen <martin.petersen@oracle.com>
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 07874e9b609f..01c0edf2308a 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -119,6 +119,14 @@ static int blk_validate_zoned_limits(struct queue_limits *lim)
>  	lim->max_zone_append_sectors =
>  		min_not_zero(lim->max_hw_zone_append_sectors,
>  			min(lim->chunk_sectors, lim->max_hw_sectors));
> +
> +	/*
> +	 * If both the block driver and the block device preserve the write
> +	 * order per hwq, enable zoned write pipelining.
> +	 */
> +	if (lim->features & BLK_FEAT_ORDERED_HWQ)
> +		lim->features |= BLK_FEAT_PIPELINE_ZWR;
> +
>  	return 0;
>  }
>  
> @@ -780,6 +788,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
>  		t->features &= ~BLK_FEAT_NOWAIT;
>  	if (!(b->features & BLK_FEAT_POLL))
>  		t->features &= ~BLK_FEAT_POLL;
> +	if (!(b->features & BLK_FEAT_ORDERED_HWQ))
> +		t->features &= ~BLK_FEAT_ORDERED_HWQ;
>  
>  	t->flags |= (b->flags & BLK_FLAG_MISALIGNED);
>  
> diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
> index 78ee8d324c7f..4bf0b663f25d 100644
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -270,6 +270,7 @@ QUEUE_SYSFS_FEATURE(rotational, BLK_FEAT_ROTATIONAL)
>  QUEUE_SYSFS_FEATURE(add_random, BLK_FEAT_ADD_RANDOM)
>  QUEUE_SYSFS_FEATURE(iostats, BLK_FEAT_IO_STAT)
>  QUEUE_SYSFS_FEATURE(stable_writes, BLK_FEAT_STABLE_WRITES);
> +QUEUE_SYSFS_FEATURE(pipeline_zwr, BLK_FEAT_PIPELINE_ZWR);
>  
>  #define QUEUE_SYSFS_FEATURE_SHOW(_name, _feature)			\
>  static ssize_t queue_##_name##_show(struct gendisk *disk, char *page)	\
> @@ -554,6 +555,7 @@ QUEUE_LIM_RO_ENTRY(queue_dax, "dax");
>  QUEUE_RW_ENTRY(queue_io_timeout, "io_timeout");
>  QUEUE_LIM_RO_ENTRY(queue_virt_boundary_mask, "virt_boundary_mask");
>  QUEUE_LIM_RO_ENTRY(queue_dma_alignment, "dma_alignment");
> +QUEUE_LIM_RW_ENTRY(queue_pipeline_zwr, "pipeline_zoned_writes");
>  
>  /* legacy alias for logical_block_size: */
>  static struct queue_sysfs_entry queue_hw_sector_size_entry = {
> @@ -700,6 +702,7 @@ static struct attribute *queue_attrs[] = {
>  	&queue_dax_entry.attr,
>  	&queue_virt_boundary_mask_entry.attr,
>  	&queue_dma_alignment_entry.attr,
> +	&queue_pipeline_zwr_entry.attr,
>  	&queue_ra_entry.attr,
>  
>  	/*
> @@ -746,6 +749,10 @@ static umode_t queue_attr_visible(struct kobject *kobj, struct attribute *attr,
>  	    !blk_queue_is_zoned(q))
>  		return 0;
>  
> +	if (attr == &queue_pipeline_zwr_entry.attr &&
> +	    !(q->limits.features & BLK_FEAT_ORDERED_HWQ))
> +		return 0;
> +
>  	return attr->mode;
>  }
>  
> diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
> index 95886b404b16..79d14b3d3309 100644
> --- a/include/linux/blkdev.h
> +++ b/include/linux/blkdev.h
> @@ -338,6 +338,15 @@ typedef unsigned int __bitwise blk_features_t;
>  /* skip this queue in blk_mq_(un)quiesce_tagset */
>  #define BLK_FEAT_SKIP_TAGSET_QUIESCE	((__force blk_features_t)(1u << 13))
>  
> +/*
> + * The request order is preserved per hardware queue by the block driver and by
> + * the block device. Set by the block driver.
> + */
> +#define BLK_FEAT_ORDERED_HWQ		((__force blk_features_t)(1u << 14))
> +
> +/* Whether to pipeline zoned writes. Controlled by the block layer. */
> +#define BLK_FEAT_PIPELINE_ZWR		((__force blk_features_t)(1u << 15))
> +
>  /* undocumented magic for bcache */
>  #define BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE \
>  	((__force blk_features_t)(1u << 15))

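A side note on the blkdev.h hunk: taken at face value, the quoted lines give
BLK_FEAT_PIPELINE_ZWR and the context define
BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE the same value, 1u << 15. Below is a
rough userspace sketch of how such an overlap can be detected. The FEAT_*
names and overlapping_bits() are hypothetical and exist only for this
illustration; the values are copied from the quote.

```c
#include <stddef.h>
#include <stdint.h>

/* Values as they appear in the quoted hunk and its context lines. */
#define FEAT_SKIP_TAGSET_QUIESCE		(1u << 13)
#define FEAT_ORDERED_HWQ			(1u << 14)
#define FEAT_PIPELINE_ZWR			(1u << 15)
#define FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE	(1u << 15)

/* Return a mask of every bit that is set in more than one entry of flags[]. */
uint32_t overlapping_bits(const uint32_t *flags, size_t n)
{
	uint32_t seen = 0, dup = 0;

	for (size_t i = 0; i < n; i++) {
		dup |= seen & flags[i];	/* bits already claimed by an earlier flag */
		seen |= flags[i];
	}
	return dup;
}
```

Passing the four values above to overlapping_bits() yields a nonzero mask,
while dropping either of the two bit-15 entries yields zero.
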

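For reference, this is roughly how userspace could read the proposed
attribute. It is a minimal sketch, assuming the attribute lands under the path
given in the quoted ABI text and holds a single integer; pipeline_attr_path()
and read_int_attr() are made-up helper names, not an existing API.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Build the attribute path from the quoted ABI text for a given disk name.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int pipeline_attr_path(char *buf, size_t len, const char *disk)
{
	int n = snprintf(buf, len,
			 "/sys/block/%s/queue/pipeline_zoned_writes", disk);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Read a single integer from a sysfs-style attribute file.
 * Returns 0 on success, -1 if the file is missing or cannot be parsed.
 * A missing file is expected when the attribute is hidden, which the
 * quoted queue_attr_visible() hunk does when BLK_FEAT_ORDERED_HWQ is unset.
 */
int read_int_attr(const char *path, long *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;
	ok = fscanf(f, "%ld", value) == 1;
	fclose(f);
	return ok ? 0 : -1;
}
```

A caller would build the path for a disk name such as "sda", call
read_int_attr() on it, and treat a -1 return as "attribute not present".
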
-- 
Damien Le Moal
Western Digital Research
