linux-block.vger.kernel.org archive mirror
* [PATCH v2 0/5] Increase SCSI IOPS
@ 2025-11-17 22:51 Bart Van Assche
  2025-11-17 22:52 ` [PATCH v2 1/5] block: Rename busy_tag_iter_fn into blk_mq_rq_iter_fn Bart Van Assche
                   ` (5 more replies)
  0 siblings, 6 replies; 12+ messages in thread
From: Bart Van Assche @ 2025-11-17 22:51 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, linux-block, John Garry, Hannes Reinecke,
	Bart Van Assche

Hi Martin,

This patch series increases scsi_debug IOPS by 5% on my test setup by disabling
SCSI budget management if it is not needed.

Please consider this patch series for the next merge window.

Thanks,

Bart.

Changes compared to v1 (https://lore.kernel.org/linux-scsi/20250910213254.1215318-1-bvanassche@acm.org/):
 - Added three block layer patches to introduce the function
   blk_mq_tagset_iter().
 - Applied the optimization not only for host-wide tags but also if there is
   only a single hardware queue.
 - Renamed scsi_device_check_in_flight() into scsi_device_check_allocated().
 - Added support for set->shared_tags == NULL in scsi_device_busy().

Bart Van Assche (5):
  block: Rename busy_tag_iter_fn into blk_mq_rq_iter_fn
  block: Introduce __blk_mq_tagset_iter()
  block: Introduce blk_mq_tagset_iter()
  scsi: core: Generalize scsi_device_busy()
  scsi: core: Improve IOPS in case of host-wide tags

 block/blk-mq-tag.c         | 67 ++++++++++++++++++++++++++------------
 block/blk-mq.h             |  4 +--
 drivers/scsi/scsi.c        |  3 +-
 drivers/scsi/scsi_lib.c    | 38 +++++++++++++++++++++
 drivers/scsi/scsi_scan.c   | 18 +++++++++-
 include/linux/blk-mq.h     |  6 ++--
 include/scsi/scsi_device.h |  5 +--
 7 files changed, 110 insertions(+), 31 deletions(-)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 1/5] block: Rename busy_tag_iter_fn into blk_mq_rq_iter_fn
  2025-11-17 22:51 [PATCH v2 0/5] Increase SCSI IOPS Bart Van Assche
@ 2025-11-17 22:52 ` Bart Van Assche
  2025-11-17 22:52 ` [PATCH v2 2/5] block: Introduce __blk_mq_tagset_iter() Bart Van Assche
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Bart Van Assche @ 2025-11-17 22:52 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, linux-block, John Garry, Hannes Reinecke,
	Bart Van Assche, Jens Axboe, Christoph Hellwig, Ming Lei

The name 'busy_tag_iter_fn' is not correct since blk_mq_all_tag_iter()
uses this function pointer type for requests that may not be "busy"
(started). Hence rename 'busy_tag_iter_fn' into 'blk_mq_rq_iter_fn'.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-mq-tag.c     | 16 ++++++++--------
 block/blk-mq.h         |  4 ++--
 include/linux/blk-mq.h |  4 ++--
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index c7a4d4b9cc87..8a61c481015e 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -247,7 +247,7 @@ void blk_mq_put_tags(struct blk_mq_tags *tags, int *tag_array, int nr_tags)
 struct bt_iter_data {
 	struct blk_mq_hw_ctx *hctx;
 	struct request_queue *q;
-	busy_tag_iter_fn *fn;
+	blk_mq_rq_iter_fn *fn;
 	void *data;
 	bool reserved;
 };
@@ -310,7 +310,7 @@ static bool bt_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
  *		bitmap_tags member of struct blk_mq_tags.
  */
 static void bt_for_each(struct blk_mq_hw_ctx *hctx, struct request_queue *q,
-			struct sbitmap_queue *bt, busy_tag_iter_fn *fn,
+			struct sbitmap_queue *bt, blk_mq_rq_iter_fn *fn,
 			void *data, bool reserved)
 {
 	struct bt_iter_data iter_data = {
@@ -326,7 +326,7 @@ static void bt_for_each(struct blk_mq_hw_ctx *hctx, struct request_queue *q,
 
 struct bt_tags_iter_data {
 	struct blk_mq_tags *tags;
-	busy_tag_iter_fn *fn;
+	blk_mq_rq_iter_fn *fn;
 	void *data;
 	unsigned int flags;
 };
@@ -378,7 +378,7 @@ static bool bt_tags_iter(struct sbitmap *bitmap, unsigned int bitnr, void *data)
  * @flags:	BT_TAG_ITER_*
  */
 static void bt_tags_for_each(struct blk_mq_tags *tags, struct sbitmap_queue *bt,
-			     busy_tag_iter_fn *fn, void *data, unsigned int flags)
+			blk_mq_rq_iter_fn *fn, void *data, unsigned int flags)
 {
 	struct bt_tags_iter_data iter_data = {
 		.tags = tags,
@@ -392,7 +392,7 @@ static void bt_tags_for_each(struct blk_mq_tags *tags, struct sbitmap_queue *bt,
 }
 
 static void __blk_mq_all_tag_iter(struct blk_mq_tags *tags,
-		busy_tag_iter_fn *fn, void *priv, unsigned int flags)
+		blk_mq_rq_iter_fn *fn, void *priv, unsigned int flags)
 {
 	WARN_ON_ONCE(flags & BT_TAG_ITER_RESERVED);
 
@@ -413,7 +413,7 @@ static void __blk_mq_all_tag_iter(struct blk_mq_tags *tags,
  *
  * Caller has to pass the tag map from which requests are allocated.
  */
-void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
+void blk_mq_all_tag_iter(struct blk_mq_tags *tags, blk_mq_rq_iter_fn *fn,
 		void *priv)
 {
 	__blk_mq_all_tag_iter(tags, fn, priv, BT_TAG_ITER_STATIC_RQS);
@@ -432,7 +432,7 @@ void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
  * @fn returns.
  */
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
-		busy_tag_iter_fn *fn, void *priv)
+		blk_mq_rq_iter_fn *fn, void *priv)
 {
 	unsigned int flags = tagset->flags;
 	int i, nr_tags, srcu_idx;
@@ -493,7 +493,7 @@ EXPORT_SYMBOL(blk_mq_tagset_wait_completed_request);
  * called for all requests on all queues that share that tag set and not only
  * for requests associated with @q.
  */
-void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_tag_iter_fn *fn,
+void blk_mq_queue_tag_busy_iter(struct request_queue *q, blk_mq_rq_iter_fn *fn,
 		void *priv)
 {
 	int srcu_idx;
diff --git a/block/blk-mq.h b/block/blk-mq.h
index af42dc018808..d20b87d17faa 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -189,9 +189,9 @@ void blk_mq_tag_resize_shared_tags(struct blk_mq_tag_set *set,
 void blk_mq_tag_update_sched_shared_tags(struct request_queue *q);
 
 void blk_mq_tag_wakeup_all(struct blk_mq_tags *tags, bool);
-void blk_mq_queue_tag_busy_iter(struct request_queue *q, busy_tag_iter_fn *fn,
+void blk_mq_queue_tag_busy_iter(struct request_queue *q, blk_mq_rq_iter_fn *fn,
 		void *priv);
-void blk_mq_all_tag_iter(struct blk_mq_tags *tags, busy_tag_iter_fn *fn,
+void blk_mq_all_tag_iter(struct blk_mq_tags *tags, blk_mq_rq_iter_fn *fn,
 		void *priv);
 
 static inline struct sbq_wait_state *bt_wait_ptr(struct sbitmap_queue *bt,
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index b25d12545f46..3467cacb281c 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -549,7 +549,7 @@ struct blk_mq_queue_data {
 	bool last;
 };
 
-typedef bool (busy_tag_iter_fn)(struct request *, void *);
+typedef bool (blk_mq_rq_iter_fn)(struct request *, void *);
 
 /**
  * struct blk_mq_ops - Callback functions that implements block driver
@@ -926,7 +926,7 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async);
 void blk_mq_run_hw_queues(struct request_queue *q, bool async);
 void blk_mq_delay_run_hw_queues(struct request_queue *q, unsigned long msecs);
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
-		busy_tag_iter_fn *fn, void *priv);
+		blk_mq_rq_iter_fn *fn, void *priv);
 void blk_mq_tagset_wait_completed_request(struct blk_mq_tag_set *tagset);
 void blk_mq_freeze_queue_nomemsave(struct request_queue *q);
 void blk_mq_unfreeze_queue_nomemrestore(struct request_queue *q);

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/5] block: Introduce __blk_mq_tagset_iter()
  2025-11-17 22:51 [PATCH v2 0/5] Increase SCSI IOPS Bart Van Assche
  2025-11-17 22:52 ` [PATCH v2 1/5] block: Rename busy_tag_iter_fn into blk_mq_rq_iter_fn Bart Van Assche
@ 2025-11-17 22:52 ` Bart Van Assche
  2025-11-17 22:52 ` [PATCH v2 3/5] block: Introduce blk_mq_tagset_iter() Bart Van Assche
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Bart Van Assche @ 2025-11-17 22:52 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, linux-block, John Garry, Hannes Reinecke,
	Bart Van Assche, Jens Axboe, Christoph Hellwig, Ming Lei

Prepare for introducing a second caller of __blk_mq_tagset_iter(). No
functionality has been changed.

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-mq-tag.c | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 8a61c481015e..f169beeded64 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -419,6 +419,24 @@ void blk_mq_all_tag_iter(struct blk_mq_tags *tags, blk_mq_rq_iter_fn *fn,
 	__blk_mq_all_tag_iter(tags, fn, priv, BT_TAG_ITER_STATIC_RQS);
 }
 
+static void __blk_mq_tagset_iter(struct blk_mq_tag_set *tagset,
+			blk_mq_rq_iter_fn *fn, void *priv, unsigned long flags)
+{
+	int i, nr_tags, srcu_idx;
+
+	srcu_idx = srcu_read_lock(&tagset->tags_srcu);
+
+	nr_tags = blk_mq_is_shared_tags(tagset->flags) ? 1 :
+		tagset->nr_hw_queues;
+
+	for (i = 0; i < nr_tags; i++) {
+		if (tagset->tags && tagset->tags[i])
+			__blk_mq_all_tag_iter(tagset->tags[i], fn, priv,
+					      flags);
+	}
+	srcu_read_unlock(&tagset->tags_srcu, srcu_idx);
+}
+
 /**
  * blk_mq_tagset_busy_iter - iterate over all started requests in a tag set
  * @tagset:	Tag set to iterate over.
@@ -434,19 +452,7 @@ void blk_mq_all_tag_iter(struct blk_mq_tags *tags, blk_mq_rq_iter_fn *fn,
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		blk_mq_rq_iter_fn *fn, void *priv)
 {
-	unsigned int flags = tagset->flags;
-	int i, nr_tags, srcu_idx;
-
-	srcu_idx = srcu_read_lock(&tagset->tags_srcu);
-
-	nr_tags = blk_mq_is_shared_tags(flags) ? 1 : tagset->nr_hw_queues;
-
-	for (i = 0; i < nr_tags; i++) {
-		if (tagset->tags && tagset->tags[i])
-			__blk_mq_all_tag_iter(tagset->tags[i], fn, priv,
-					      BT_TAG_ITER_STARTED);
-	}
-	srcu_read_unlock(&tagset->tags_srcu, srcu_idx);
+	__blk_mq_tagset_iter(tagset, fn, priv, BT_TAG_ITER_STARTED);
 }
 EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
 

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 3/5] block: Introduce blk_mq_tagset_iter()
  2025-11-17 22:51 [PATCH v2 0/5] Increase SCSI IOPS Bart Van Assche
  2025-11-17 22:52 ` [PATCH v2 1/5] block: Rename busy_tag_iter_fn into blk_mq_rq_iter_fn Bart Van Assche
  2025-11-17 22:52 ` [PATCH v2 2/5] block: Introduce __blk_mq_tagset_iter() Bart Van Assche
@ 2025-11-17 22:52 ` Bart Van Assche
  2025-11-17 22:52 ` [PATCH v2 4/5] scsi: core: Generalize scsi_device_busy() Bart Van Assche
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 12+ messages in thread
From: Bart Van Assche @ 2025-11-17 22:52 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, linux-block, John Garry, Hannes Reinecke,
	Bart Van Assche, Jens Axboe, Christoph Hellwig, Ming Lei

Support iterating over all requests in a tag set, including requests
that have not yet been started. A later patch will call this function
from scsi_device_busy().

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-mq-tag.c     | 19 +++++++++++++++++++
 include/linux/blk-mq.h |  2 ++
 2 files changed, 21 insertions(+)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index f169beeded64..f277ed7e7743 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -456,6 +456,25 @@ void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 }
 EXPORT_SYMBOL(blk_mq_tagset_busy_iter);
 
+/**
+ * blk_mq_tagset_iter - iterate over all requests in a tag set
+ * @tagset:	Tag set to iterate over.
+ * @fn:		Pointer to the function that will be called for each request.
+ *		@fn will be called as follows: @fn(rq, @priv) where rq is a
+ *		pointer to a request. Return true to continue iterating tags,
+ *		false to stop.
+ * @priv:	Will be passed as second argument to @fn.
+ *
+ * We grab one request reference before calling @fn and release it after
+ * @fn returns.
+ */
+void blk_mq_tagset_iter(struct blk_mq_tag_set *tagset, blk_mq_rq_iter_fn *fn,
+			void *priv)
+{
+	__blk_mq_tagset_iter(tagset, fn, priv, 0);
+}
+EXPORT_SYMBOL(blk_mq_tagset_iter);
+
 static bool blk_mq_tagset_count_completed_rqs(struct request *rq, void *data)
 {
 	unsigned *count = data;
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 3467cacb281c..20a22c1cd067 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -927,6 +927,8 @@ void blk_mq_run_hw_queues(struct request_queue *q, bool async);
 void blk_mq_delay_run_hw_queues(struct request_queue *q, unsigned long msecs);
 void blk_mq_tagset_busy_iter(struct blk_mq_tag_set *tagset,
 		blk_mq_rq_iter_fn *fn, void *priv);
+void blk_mq_tagset_iter(struct blk_mq_tag_set *tagset, blk_mq_rq_iter_fn *fn,
+		void *priv);
 void blk_mq_tagset_wait_completed_request(struct blk_mq_tag_set *tagset);
 void blk_mq_freeze_queue_nomemsave(struct request_queue *q);
 void blk_mq_unfreeze_queue_nomemrestore(struct request_queue *q);

^ permalink raw reply related	[flat|nested] 12+ messages in thread
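
The relationship between blk_mq_tagset_busy_iter() and the new
blk_mq_tagset_iter() from patches 2/5 and 3/5 can be sketched in userspace C.
This is a simplified model, not kernel code: every model_* and MODEL_* name is
a hypothetical stand-in, and the request array replaces the sbitmap-based tag
lookup. It only illustrates that passing the started-only flag filters out
requests that are allocated but not yet started, while flags == 0 visits them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel types. */
enum model_rq_state { MODEL_RQ_FREE, MODEL_RQ_ALLOCATED, MODEL_RQ_STARTED };

struct model_request {
	enum model_rq_state state;
};

/* Same shape as blk_mq_rq_iter_fn: return true to keep iterating. */
typedef bool (model_rq_iter_fn)(struct model_request *, void *);

#define MODEL_ITER_STARTED 0x1 /* plays the role of BT_TAG_ITER_STARTED */

/* Visit every non-free request; with MODEL_ITER_STARTED, skip unstarted ones. */
void model_tagset_iter(struct model_request *rqs, size_t nr,
		       model_rq_iter_fn *fn, void *priv, unsigned int flags)
{
	assert(fn != NULL);
	for (size_t i = 0; i < nr; i++) {
		if (rqs[i].state == MODEL_RQ_FREE)
			continue;
		if ((flags & MODEL_ITER_STARTED) &&
		    rqs[i].state != MODEL_RQ_STARTED)
			continue;
		if (!fn(&rqs[i], priv))
			break;
	}
}

/* Callback that counts each visited request. */
bool model_count_rq(struct model_request *rq, void *priv)
{
	(void)rq;
	++*(unsigned int *)priv;
	return true;
}

/* Analogue of blk_mq_tagset_busy_iter(): started requests only. */
unsigned int model_count_started(struct model_request *rqs, size_t nr)
{
	unsigned int count = 0;

	model_tagset_iter(rqs, nr, model_count_rq, &count, MODEL_ITER_STARTED);
	return count;
}

/* Analogue of blk_mq_tagset_iter(): flags == 0, all allocated requests. */
unsigned int model_count_allocated(struct model_request *rqs, size_t nr)
{
	unsigned int count = 0;

	model_tagset_iter(rqs, nr, model_count_rq, &count, 0);
	return count;
}
```

In this model the two wrappers differ only in the flags argument, which
mirrors how both kernel functions forward to __blk_mq_tagset_iter().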

* [PATCH v2 4/5] scsi: core: Generalize scsi_device_busy()
  2025-11-17 22:51 [PATCH v2 0/5] Increase SCSI IOPS Bart Van Assche
                   ` (2 preceding siblings ...)
  2025-11-17 22:52 ` [PATCH v2 3/5] block: Introduce blk_mq_tagset_iter() Bart Van Assche
@ 2025-11-17 22:52 ` Bart Van Assche
  2025-11-17 22:52 ` [PATCH v2 5/5] scsi: core: Improve IOPS in case of host-wide tags Bart Van Assche
  2025-11-21 21:17 ` [PATCH v2 0/5] Increase SCSI IOPS Bart Van Assche
  5 siblings, 0 replies; 12+ messages in thread
From: Bart Van Assche @ 2025-11-17 22:52 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, linux-block, John Garry, Hannes Reinecke,
	Bart Van Assche, Jens Axboe, Christoph Hellwig, Ming Lei,
	James E.J. Bottomley

Instead of only handling sdev->budget_map.map != NULL, also handle
sdev->budget_map.map == NULL. This patch prepares for supporting logical
units without a budget map (sdev->budget_map.map == NULL).

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_lib.c    | 38 ++++++++++++++++++++++++++++++++++++++
 include/scsi/scsi_device.h |  5 +----
 2 files changed, 39 insertions(+), 4 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 51ad2ad07e43..ddc51472b5eb 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -446,6 +446,44 @@ static void scsi_single_lun_run(struct scsi_device *current_sdev)
 	spin_unlock_irqrestore(shost->host_lock, flags);
 }
 
+struct sdev_cmds_allocated_data {
+	const struct scsi_device *sdev;
+	int count;
+};
+
+static bool scsi_device_check_allocated(struct request *rq, void *data)
+{
+	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);
+	struct sdev_cmds_allocated_data *sifd = data;
+
+	if (cmd->device == sifd->sdev)
+		sifd->count++;
+
+	return true;
+}
+
+/**
+ * scsi_device_busy() - Number of commands allocated for a SCSI device
+ * @sdev: SCSI device.
+ *
+ * Note: There is a subtle difference between this function and
+ * scsi_host_busy(). scsi_host_busy() counts the number of commands that have
+ * been started. This function counts the number of commands that have been
+ * allocated. At least the UFS driver depends on this function counting commands
+ * that have already been allocated but that have not yet been started.
+ */
+int scsi_device_busy(const struct scsi_device *sdev)
+{
+	struct sdev_cmds_allocated_data sifd = { .sdev = sdev };
+	struct blk_mq_tag_set *set = &sdev->host->tag_set;
+
+	if (sdev->budget_map.map)
+		return sbitmap_weight(&sdev->budget_map);
+	blk_mq_tagset_iter(set, scsi_device_check_allocated, &sifd);
+	return sifd.count;
+}
+EXPORT_SYMBOL(scsi_device_busy);
+
 static inline bool scsi_device_is_busy(struct scsi_device *sdev)
 {
 	if (scsi_device_busy(sdev) >= sdev->queue_depth)
diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
index d62265d12cfe..661f0a8e4de6 100644
--- a/include/scsi/scsi_device.h
+++ b/include/scsi/scsi_device.h
@@ -713,10 +713,7 @@ static inline int scsi_device_supports_vpd(struct scsi_device *sdev)
 	return 0;
 }
 
-static inline int scsi_device_busy(struct scsi_device *sdev)
-{
-	return sbitmap_weight(&sdev->budget_map);
-}
+int scsi_device_busy(const struct scsi_device *sdev);
 
 /* Macros to access the UNIT ATTENTION counters */
 #define scsi_get_ua_new_media_ctr(sdev) \

^ permalink raw reply related	[flat|nested] 12+ messages in thread
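
The two paths of the generalized scsi_device_busy() in patch 4/5 can be
sketched as a userspace model. All model_* names below are hypothetical
stand-ins: has_budget_map and budget_bits replace the sbitmap, and a plain
array of commands replaces the host tag set. The sketch assumes a free tag is
represented by a NULL device pointer, which is not how the kernel tracks free
tags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scsi_device. */
struct model_sdev {
	bool has_budget_map;      /* models sdev->budget_map.map != NULL */
	unsigned int budget_bits; /* models sbitmap_weight(&sdev->budget_map) */
};

/* Hypothetical stand-in for struct scsi_cmnd. */
struct model_cmd {
	const struct model_sdev *device; /* NULL: tag not allocated (assumption) */
};

/*
 * With a budget map, return its weight. Without one, walk all host tags and
 * count the commands that belong to @sdev.
 */
unsigned int model_device_busy(const struct model_sdev *sdev,
			       const struct model_cmd *cmds, size_t nr_tags)
{
	unsigned int count = 0;

	assert(sdev != NULL);
	if (sdev->has_budget_map)
		return sdev->budget_bits;
	for (size_t i = 0; i < nr_tags; i++)
		if (cmds[i].device == sdev)
			count++;
	return count;
}
```

Note that in this model the fallback path is linear in the number of host
tags, whereas the budget-map path only reads the bitmap weight.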

* [PATCH v2 5/5] scsi: core: Improve IOPS in case of host-wide tags
  2025-11-17 22:51 [PATCH v2 0/5] Increase SCSI IOPS Bart Van Assche
                   ` (3 preceding siblings ...)
  2025-11-17 22:52 ` [PATCH v2 4/5] scsi: core: Generalize scsi_device_busy() Bart Van Assche
@ 2025-11-17 22:52 ` Bart Van Assche
  2025-11-21 21:17 ` [PATCH v2 0/5] Increase SCSI IOPS Bart Van Assche
  5 siblings, 0 replies; 12+ messages in thread
From: Bart Van Assche @ 2025-11-17 22:52 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, linux-block, John Garry, Hannes Reinecke,
	Bart Van Assche, Jens Axboe, Christoph Hellwig, Ming Lei,
	James E.J. Bottomley

The SCSI core uses the budget map to restrict the number of commands
that are in flight per logical unit. That limit check can be left out if
host->cmd_per_lun >= host->can_queue and if either the host tag set is
shared across all hardware queues or there is only one hardware queue.
Since scsi_mq_get_budget() shows up in all CPU profiles for fast SCSI
devices, do not allocate a budget map if cmd_per_lun >= can_queue and if
either the host tag set is shared across all hardware queues or there is
only one hardware queue.

For the following test this patch increases IOPS by 5%:

modprobe scsi_debug delay=0 no_rwlock=1 host_max_queue=192 submit_queues=$(nproc)

fio --bs=4096 --disable_clat=1 --disable_slat=1 --group_reporting=1 \
  --gtod_reduce=1 --invalidate=1 --ioengine=io_uring --ioscheduler=none \
  --norandommap --runtime=60 --rw=randread --thread --time_based=1 \
  --buffered=0 --numjobs=1 --iodepth=192 --iodepth_batch=24 --name=/dev/sda \
  --filename=/dev/sda

Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Hannes Reinecke <hare@suse.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi.c      |  3 ++-
 drivers/scsi/scsi_scan.c | 18 +++++++++++++++++-
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 76cdad063f7b..3daa32c9e790 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -229,7 +229,8 @@ int scsi_change_queue_depth(struct scsi_device *sdev, int depth)
 	if (sdev->request_queue)
 		blk_set_queue_depth(sdev->request_queue, depth);
 
-	sbitmap_resize(&sdev->budget_map, sdev->queue_depth);
+	if (sdev->budget_map.map)
+		sbitmap_resize(&sdev->budget_map, sdev->queue_depth);
 
 	return sdev->queue_depth;
 }
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 7acbfcfc2172..99b82e28f292 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -215,9 +215,17 @@ static void scsi_unlock_floptical(struct scsi_device *sdev,
 			 SCSI_TIMEOUT, 3, NULL);
 }
 
+static bool scsi_needs_budget_map(struct Scsi_Host *shost, unsigned int depth)
+{
+	if (shost->host_tagset || shost->tag_set.nr_hw_queues == 1)
+		return depth < shost->can_queue;
+	return true;
+}
+
 static int scsi_realloc_sdev_budget_map(struct scsi_device *sdev,
 					unsigned int depth)
 {
+	struct Scsi_Host *shost = sdev->host;
 	int new_shift = sbitmap_calculate_shift(depth);
 	bool need_alloc = !sdev->budget_map.map;
 	bool need_free = false;
@@ -225,6 +233,13 @@ static int scsi_realloc_sdev_budget_map(struct scsi_device *sdev,
 	int ret;
 	struct sbitmap sb_backup;
 
+	if (!scsi_needs_budget_map(shost, depth)) {
+		memflags = blk_mq_freeze_queue(sdev->request_queue);
+		sbitmap_free(&sdev->budget_map);
+		blk_mq_unfreeze_queue(sdev->request_queue, memflags);
+		return 0;
+	}
+
 	depth = min_t(unsigned int, depth, scsi_device_max_queue_depth(sdev));
 
 	/*
@@ -1120,7 +1135,8 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result,
 	scsi_cdl_check(sdev);
 
 	sdev->max_queue_depth = sdev->queue_depth;
-	WARN_ON_ONCE(sdev->max_queue_depth > sdev->budget_map.depth);
+	WARN_ON_ONCE(sdev->budget_map.map &&
+		     sdev->max_queue_depth > sdev->budget_map.depth);
 
 	/*
 	 * Ok, the device is now all set up, we can

^ permalink raw reply related	[flat|nested] 12+ messages in thread
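
The decision made by scsi_needs_budget_map() in patch 5/5 can be restated as
a small userspace model. struct model_host and model_needs_budget_map() are
hypothetical names; only the three fields the decision depends on are
modelled. The reasoning, as far as it can be read from the commit message, is
that a single tag space of can_queue tags already caps the number of
allocated commands, so a per-LUN limit of can_queue or more can never be
reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the Scsi_Host fields the decision uses. */
struct model_host {
	bool host_tagset;          /* tags shared across all hardware queues */
	unsigned int nr_hw_queues; /* models shost->tag_set.nr_hw_queues */
	unsigned int can_queue;    /* host-wide command limit */
};

/*
 * A budget map is only needed if the per-LUN depth can actually be exceeded:
 * either the depth is below can_queue, or each hardware queue has its own
 * tag space so the tag allocator does not enforce a host-wide cap.
 */
bool model_needs_budget_map(const struct model_host *shost, unsigned int depth)
{
	assert(shost != NULL);
	if (shost->host_tagset || shost->nr_hw_queues == 1)
		return depth < shost->can_queue;
	return true;
}
```

For the scsi_debug test above (host_max_queue=192), this model predicts that
no budget map is needed once the device queue depth is at least 192.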

* Re: [PATCH v2 0/5] Increase SCSI IOPS
  2025-11-17 22:51 [PATCH v2 0/5] Increase SCSI IOPS Bart Van Assche
                   ` (4 preceding siblings ...)
  2025-11-17 22:52 ` [PATCH v2 5/5] scsi: core: Improve IOPS in case of host-wide tags Bart Van Assche
@ 2025-11-21 21:17 ` Bart Van Assche
  5 siblings, 0 replies; 12+ messages in thread
From: Bart Van Assche @ 2025-11-21 21:17 UTC (permalink / raw)
  To: Martin K . Petersen; +Cc: linux-scsi, linux-block, John Garry, Hannes Reinecke

On 11/17/25 2:51 PM, Bart Van Assche wrote:
> This patch series increases scsi_debug IOPS by 5% on my test setup by disabling
> SCSI budget management if it is not needed.
(replying to my own email)

The kernel test robot reported that this patch series introduces a hang
during LUN scanning for ATA devices. I have been able to reproduce this
hang in a VM. The root cause has been identified and a fix is under
test. I will post a new version of this patch series after testing has
finished.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 0/5] Increase SCSI IOPS
@ 2025-11-24 18:21 Bart Van Assche
  2025-11-25  8:20 ` Niklas Cassel
  2025-11-25  9:08 ` John Garry
  0 siblings, 2 replies; 12+ messages in thread
From: Bart Van Assche @ 2025-11-24 18:21 UTC (permalink / raw)
  To: Martin K . Petersen
  Cc: linux-scsi, linux-block, John Garry, Hannes Reinecke,
	Bart Van Assche

Hi Martin,

This patch series increases scsi_debug IOPS by 5% on my test setup by disabling
SCSI budget management if it is not needed. This patch series improves the
performance of many SCSI LLDs, including the UFS and ATA drivers.

Please consider this patch series for the next merge window.

Thanks,

Bart.

Changes compared to v1:
 - Fixed a hang during LUN scanning for ATA devices.

Bart Van Assche (5):
  block: Introduce __blk_mq_tagset_iter()
  block: Introduce blk_mq_tagset_iter()
  libata: Stop using cmd->budget_token
  scsi: core: Generalize scsi_device_busy()
  scsi: core: Improve IOPS in case of host-wide tags

 block/blk-mq-tag.c         | 51 ++++++++++++++++++++++++++++----------
 drivers/ata/libata-scsi.c  | 18 +++++---------
 drivers/scsi/scsi.c        |  6 ++---
 drivers/scsi/scsi_lib.c    | 38 ++++++++++++++++++++++++++++
 drivers/scsi/scsi_scan.c   | 18 +++++++++++++-
 include/linux/blk-mq.h     |  2 ++
 include/scsi/scsi_device.h |  5 +---
 7 files changed, 104 insertions(+), 34 deletions(-)


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/5] Increase SCSI IOPS
  2025-11-24 18:21 Bart Van Assche
@ 2025-11-25  8:20 ` Niklas Cassel
  2025-11-25 16:45   ` Bart Van Assche
  2025-11-25  9:08 ` John Garry
  1 sibling, 1 reply; 12+ messages in thread
From: Niklas Cassel @ 2025-11-25  8:20 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, linux-scsi, linux-block, John Garry,
	Hannes Reinecke

On Mon, Nov 24, 2025 at 10:21:55AM -0800, Bart Van Assche wrote:
> Hi Martin,
> 
> This patch series increases scsi_debug IOPS by 5% on my test setup by disabling
> SCSI budget management if it is not needed. This patch series improves the
> performance of many SCSI LLDs, including the UFS and ATA drivers.
> 
> Please consider this patch series for the next merge window.

Hello Bart,

The subject is:
[PATCH v2 0/5] Increase SCSI IOPS

AFAICT, you already sent a v2 series a few days ago:
https://lore.kernel.org/linux-scsi/20251117225205.2024479-1-bvanassche@acm.org/

I assume that you simply forgot to increase the version count.

If you respin, perhaps label it as v4, to make things less confusing.


Kind regards,
Niklas

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/5] Increase SCSI IOPS
  2025-11-24 18:21 Bart Van Assche
  2025-11-25  8:20 ` Niklas Cassel
@ 2025-11-25  9:08 ` John Garry
  2025-11-25 16:51   ` Bart Van Assche
  1 sibling, 1 reply; 12+ messages in thread
From: John Garry @ 2025-11-25  9:08 UTC (permalink / raw)
  To: Bart Van Assche, Martin K . Petersen
  Cc: linux-scsi, linux-block, Hannes Reinecke, hch, dlemoal, cassel

On 24/11/2025 18:21, Bart Van Assche wrote:
> Hi Martin,
> 
> This patch series increases scsi_debug IOPS by 5% on my test setup by disabling
> SCSI budget management if it is not needed.

Performance results from scsi_debug are not a real acid test.

> This patch series improves the
> performance of many SCSI LLDs, including the UFS and ATA drivers.

Please provide results from real HW / real scenarios.

> 
> Please consider this patch series for the next merge window.
> 
> Thanks,
> 
> Bart.
> 
> Changes compared to v1:
>   - Fixed a hang during LUN scanning for ATA devices.
> 
> Bart Van Assche (5):
>    block: Introduce __blk_mq_tagset_iter()
>    block: Introduce blk_mq_tagset_iter()
>    libata: Stop using cmd->budget_token
>    scsi: core: Generalize scsi_device_busy()
>    scsi: core: Improve IOPS in case of host-wide tags
> 
>   block/blk-mq-tag.c         | 51 ++++++++++++++++++++++++++++----------
>   drivers/ata/libata-scsi.c  | 18 +++++---------
>   drivers/scsi/scsi.c        |  6 ++---
>   drivers/scsi/scsi_lib.c    | 38 ++++++++++++++++++++++++++++
>   drivers/scsi/scsi_scan.c   | 18 +++++++++++++-
>   include/linux/blk-mq.h     |  2 ++
>   include/scsi/scsi_device.h |  5 +---
>   7 files changed, 104 insertions(+), 34 deletions(-)
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/5] Increase SCSI IOPS
  2025-11-25  8:20 ` Niklas Cassel
@ 2025-11-25 16:45   ` Bart Van Assche
  0 siblings, 0 replies; 12+ messages in thread
From: Bart Van Assche @ 2025-11-25 16:45 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Martin K . Petersen, linux-scsi, linux-block, John Garry,
	Hannes Reinecke

On 11/25/25 1:20 AM, Niklas Cassel wrote:
> The subject is:
> [PATCH v2 0/5] Increase SCSI IOPS
> 
> AFAICT, you already sent a v2 series a few days ago:
> https://lore.kernel.org/linux-scsi/20251117225205.2024479-1-bvanassche@acm.org/
> 
> I assume that you simply forgot to increase the version count.

Correct.

> If you respin, perhaps label it as v4, to make things less confusing.

Sure, I will do that.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 0/5] Increase SCSI IOPS
  2025-11-25  9:08 ` John Garry
@ 2025-11-25 16:51   ` Bart Van Assche
  0 siblings, 0 replies; 12+ messages in thread
From: Bart Van Assche @ 2025-11-25 16:51 UTC (permalink / raw)
  To: John Garry, Martin K . Petersen
  Cc: linux-scsi, linux-block, Hannes Reinecke, hch, dlemoal, cassel

On 11/25/25 2:08 AM, John Garry wrote:
> Please provide results from real HW / real scenarios.

These have already been provided before. From
https://lore.kernel.org/linux-scsi/20250910213254.1215318-4-bvanassche@acm.org/:
"On my UFS 4 test setup this patch improves IOPS by 1% and reduces the
time spent in scsi_mq_get_budget() from 0.22% to 0.01%." In that test
I/O was submitted from a single CPU core.

UFS 5 devices are expected to become available soon (2026). UFS 5
devices are expected to support up to one million IOPS. That is
double what the fastest UFS 4 devices support. Hence my interest in
improving the performance of the SCSI core.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 12+ messages in thread
