* [PATCH v2 0/3] Improve performance for BLK_MQ_F_BLOCKING drivers
@ 2023-07-19 18:22 Bart Van Assche
  2023-07-19 18:22 ` [PATCH v2 1/3] scsi: Inline scsi_kick_queue() Bart Van Assche
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Bart Van Assche @ 2023-07-19 18:22 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	Bart Van Assche
Hi Jens,
Request queues are run asynchronously if BLK_MQ_F_BLOCKING has been set. This
results in suboptimal performance. This patch series improves performance if
BLK_MQ_F_BLOCKING has been set by running request queues synchronously
whenever possible.
Please consider this patch series for the next merge window.
Thanks,
Bart.
Changes compared to v1:
- Fixed support for the combination BLK_MQ_F_BLOCKING + BLIST_SINGLELUN.
Bart Van Assche (3):
  scsi: Inline scsi_kick_queue()
  scsi: Remove a blk_mq_run_hw_queues() call
  block: Improve performance for BLK_MQ_F_BLOCKING drivers
 block/blk-mq.c          | 17 +++++++++++------
 drivers/scsi/scsi_lib.c | 11 +++--------
 2 files changed, 14 insertions(+), 14 deletions(-)
^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2 1/3] scsi: Inline scsi_kick_queue()
  2023-07-19 18:22 [PATCH v2 0/3] Improve performance for BLK_MQ_F_BLOCKING drivers Bart Van Assche
@ 2023-07-19 18:22 ` Bart Van Assche
  2023-07-20  5:39   ` Christoph Hellwig
  2023-07-19 18:22 ` [PATCH v2 2/3] scsi: Remove a blk_mq_run_hw_queues() call Bart Van Assche
  2023-07-19 18:22 ` [PATCH v2 3/3] block: Improve performance for BLK_MQ_F_BLOCKING drivers Bart Van Assche
  2 siblings, 1 reply; 9+ messages in thread
From: Bart Van Assche @ 2023-07-19 18:22 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	Bart Van Assche, James E.J. Bottomley

scsi_kick_queue() is too short to keep it as a separate function. Hence
inline it. This patch prepares for modifying the second argument passed
to blk_mq_run_hw_queues().

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_lib.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ad9afae49544..414d29eef968 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -300,11 +300,6 @@ void scsi_device_unbusy(struct scsi_device *sdev, struct scsi_cmnd *cmd)
 	cmd->budget_token = -1;
 }
 
-static void scsi_kick_queue(struct request_queue *q)
-{
-	blk_mq_run_hw_queues(q, false);
-}
-
 /*
  * Kick the queue of SCSI device @sdev if @sdev != current_sdev. Called with
  * interrupts disabled.
@@ -340,7 +335,7 @@ static void scsi_single_lun_run(struct scsi_device *current_sdev)
 	 * but in most cases, we will be first. Ideally, each LU on the
 	 * target would get some limited time or requests on the target.
 	 */
-	scsi_kick_queue(current_sdev->request_queue);
+	blk_mq_run_hw_queues(current_sdev->request_queue, false);
 
 	spin_lock_irqsave(shost->host_lock, flags);
 	if (!starget->starget_sdev_user)
@@ -427,7 +422,7 @@ static void scsi_starved_list_run(struct Scsi_Host *shost)
 			continue;
 		spin_unlock_irqrestore(shost->host_lock, flags);
 
-		scsi_kick_queue(slq);
+		blk_mq_run_hw_queues(slq, false);
 		blk_put_queue(slq);
 
 		spin_lock_irqsave(shost->host_lock, flags);
* Re: [PATCH v2 1/3] scsi: Inline scsi_kick_queue()
  2023-07-19 18:22 ` [PATCH v2 1/3] scsi: Inline scsi_kick_queue() Bart Van Assche
@ 2023-07-20  5:39   ` Christoph Hellwig
  0 siblings, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2023-07-20  5:39 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Martin K . Petersen,
	James E.J. Bottomley

On Wed, Jul 19, 2023 at 11:22:40AM -0700, Bart Van Assche wrote:
> scsi_kick_queue() is too short to keep it as a separate function. Hence
> inline it. This patch prepares for modifying the second argument passed
> to blk_mq_run_hw_queues().

I wouldn't say "too short", as short functions can be useful too. But
this one is indeed quite pointless, so the change looks good, even if
the commit log could be improved a bit:

Reviewed-by: Christoph Hellwig <hch@lst.de>
* [PATCH v2 2/3] scsi: Remove a blk_mq_run_hw_queues() call
  2023-07-19 18:22 [PATCH v2 0/3] Improve performance for BLK_MQ_F_BLOCKING drivers Bart Van Assche
  2023-07-19 18:22 ` [PATCH v2 1/3] scsi: Inline scsi_kick_queue() Bart Van Assche
@ 2023-07-19 18:22 ` Bart Van Assche
  2023-07-20  5:40   ` Christoph Hellwig
  2023-07-19 18:22 ` [PATCH v2 3/3] block: Improve performance for BLK_MQ_F_BLOCKING drivers Bart Van Assche
  2 siblings, 1 reply; 9+ messages in thread
From: Bart Van Assche @ 2023-07-19 18:22 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	Bart Van Assche, James E.J. Bottomley

blk_mq_kick_requeue_list() calls blk_mq_run_hw_queues() asynchronously.
Leave out the direct blk_mq_run_hw_queues() call. This patch causes
scsi_run_queue() to call blk_mq_run_hw_queues() asynchronously instead
of synchronously. Since scsi_run_queue() is not called from the hot I/O
submission path, this patch does not affect the hot path.

Cc: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 drivers/scsi/scsi_lib.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 414d29eef968..7043ca0f4da9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -448,7 +448,6 @@ static void scsi_run_queue(struct request_queue *q)
 		scsi_starved_list_run(sdev->host);
 
 	blk_mq_kick_requeue_list(q);
-	blk_mq_run_hw_queues(q, false);
 }
 
 void scsi_requeue_run_queue(struct work_struct *work)
* Re: [PATCH v2 2/3] scsi: Remove a blk_mq_run_hw_queues() call
  2023-07-19 18:22 ` [PATCH v2 2/3] scsi: Remove a blk_mq_run_hw_queues() call Bart Van Assche
@ 2023-07-20  5:40   ` Christoph Hellwig
  0 siblings, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2023-07-20  5:40 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Martin K . Petersen,
	James E.J. Bottomley

On Wed, Jul 19, 2023 at 11:22:41AM -0700, Bart Van Assche wrote:
> blk_mq_kick_requeue_list() calls blk_mq_run_hw_queues() asynchronously.
> Leave out the direct blk_mq_run_hw_queues() call. This patch causes
> scsi_run_queue() to call blk_mq_run_hw_queues() asynchronously instead
> of synchronously. Since scsi_run_queue() is not called from the hot I/O
> submission path, this patch does not affect the hot path.

I think this looks good, but a comment in the code that we now rely on
blk_mq_kick_requeue_list() to run the queue for us might be useful, as
that is a little counter-intuitive.
* [PATCH v2 3/3] block: Improve performance for BLK_MQ_F_BLOCKING drivers
  2023-07-19 18:22 [PATCH v2 0/3] Improve performance for BLK_MQ_F_BLOCKING drivers Bart Van Assche
  2023-07-19 18:22 ` [PATCH v2 1/3] scsi: Inline scsi_kick_queue() Bart Van Assche
  2023-07-19 18:22 ` [PATCH v2 2/3] scsi: Remove a blk_mq_run_hw_queues() call Bart Van Assche
@ 2023-07-19 18:22 ` Bart Van Assche
  2023-07-20  5:54   ` Christoph Hellwig
  2 siblings, 1 reply; 9+ messages in thread
From: Bart Van Assche @ 2023-07-19 18:22 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Christoph Hellwig, Martin K . Petersen,
	Bart Van Assche, Ming Lei, James E.J. Bottomley

blk_mq_run_queue() runs the queue asynchronously if BLK_MQ_F_BLOCKING
has been set. This is suboptimal since running the queue asynchronously
is slower than running the queue synchronously. This patch modifies
blk_mq_run_queue() as follows if BLK_MQ_F_BLOCKING has been set:
- Run the queue synchronously if it is allowed to sleep.
- Run the queue asynchronously if it is not allowed to sleep.
Additionally, blk_mq_run_hw_queue(hctx, false) calls are modified into
blk_mq_run_hw_queue(hctx, hctx->flags & BLK_MQ_F_BLOCKING) if the caller
may be invoked from atomic context.

The following caller chains have been reviewed:

blk_mq_run_hw_queue(hctx, false)
  blk_mq_get_tag()	/* may sleep, hence the functions it calls may also sleep */
  blk_execute_rq_nowait()
    nvme_*()		/* the NVMe driver does not set BLK_MQ_F_BLOCKING */
    scsi_eh_lock_door()	/* may sleep */
    sg_common_write()	/* implements an ioctl and hence may sleep */
    st_scsi_execute()	/* may sleep */
    pscsi_execute_cmd()	/* may sleep */
    ufshpb_execute_umap_req()	/* may sleep */
    ufshbp_execute_map_req()	/* may sleep */
  blk_execute_rq()	/* may sleep */
blk_mq_run_hw_queues(q, async=false)
  blk_freeze_queue_start()	/* may sleep */
  blk_mq_requeue_work()		/* may sleep */
  scsi_kick_queue()
    scsi_requeue_run_queue()	/* may sleep */
    scsi_run_host_queues()
      scsi_ioctl_reset()	/* may sleep */
blk_mq_insert_requests(hctx, ctx, list, run_queue_async=false)
  blk_mq_dispatch_plug_list(plug, from_sched=false)
    blk_mq_flush_plug_list(plug, from_schedule=false)
      __blk_flush_plug(plug, from_schedule=false)
      blk_add_rq_to_plug()
        blk_execute_rq_nowait()	/* see above */
        blk_mq_submit_bio()	/* may sleep if REQ_NOWAIT has not been set */
blk_mq_plug_issue_direct()
  blk_mq_flush_plug_list()	/* see above */
blk_mq_dispatch_plug_list(plug, from_sched=false)
  blk_mq_flush_plug_list()	/* see above */
blk_mq_try_issue_directly()
  blk_mq_submit_bio()		/* may sleep if REQ_NOWAIT has not been set */
blk_mq_try_issue_list_directly(hctx, list)
  blk_mq_insert_requests()	/* see above */

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-mq.c          | 17 +++++++++++------
 drivers/scsi/scsi_lib.c |  3 ++-
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5504719b970d..d5ab0bd8b472 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1289,7 +1289,8 @@ static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
  *
  * Description:
  *    Insert a fully prepared request at the back of the I/O scheduler queue
- *    for execution.  Don't wait for completion.
+ *    for execution.  Don't wait for completion. May sleep if BLK_MQ_F_BLOCKING
+ *    has been set.
  *
  * Note:
  *    This function will invoke @done directly if the queue is dead.
@@ -2213,6 +2214,8 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 	 */
 	WARN_ON_ONCE(!async && in_interrupt());
 
+	might_sleep_if(!async && hctx->flags & BLK_MQ_F_BLOCKING);
+
 	/*
 	 * When queue is quiesced, we may be switching io scheduler, or
 	 * updating nr_hw_queues, or other things, and we can't run queue
@@ -2228,8 +2231,7 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 	if (!need_run)
 		return;
 
-	if (async || (hctx->flags & BLK_MQ_F_BLOCKING) ||
-	    !cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask)) {
+	if (async || !cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask)) {
 		blk_mq_delay_run_hw_queue(hctx, 0);
 		return;
 	}
@@ -2364,7 +2366,7 @@ void blk_mq_start_hw_queue(struct blk_mq_hw_ctx *hctx)
 {
 	clear_bit(BLK_MQ_S_STOPPED, &hctx->state);
 
-	blk_mq_run_hw_queue(hctx, false);
+	blk_mq_run_hw_queue(hctx, hctx->flags & BLK_MQ_F_BLOCKING);
 }
 EXPORT_SYMBOL(blk_mq_start_hw_queue);
 
@@ -2394,7 +2396,8 @@ void blk_mq_start_stopped_hw_queues(struct request_queue *q, bool async)
 	unsigned long i;
 
 	queue_for_each_hw_ctx(q, hctx, i)
-		blk_mq_start_stopped_hw_queue(hctx, async);
+		blk_mq_start_stopped_hw_queue(hctx, async ||
+					      (hctx->flags & BLK_MQ_F_BLOCKING));
 }
 EXPORT_SYMBOL(blk_mq_start_stopped_hw_queues);
 
@@ -2452,6 +2455,8 @@ static void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx,
 	list_for_each_entry(rq, list, queuelist) {
 		BUG_ON(rq->mq_ctx != ctx);
 		trace_block_rq_insert(rq);
+		if (rq->cmd_flags & REQ_NOWAIT)
+			run_queue_async = true;
 	}
 
 	spin_lock(&ctx->lock);
@@ -2612,7 +2617,7 @@ static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	if ((rq->rq_flags & RQF_USE_SCHED) || !blk_mq_get_budget_and_tag(rq)) {
 		blk_mq_insert_request(rq, 0);
 
-		blk_mq_run_hw_queue(hctx, false);
+		blk_mq_run_hw_queue(hctx, rq->cmd_flags & REQ_NOWAIT);
 		return;
 	}
 
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 7043ca0f4da9..84fb0feb047f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -335,7 +335,8 @@ static void scsi_single_lun_run(struct scsi_device *current_sdev)
 	 * but in most cases, we will be first. Ideally, each LU on the
 	 * target would get some limited time or requests on the target.
 	 */
-	blk_mq_run_hw_queues(current_sdev->request_queue, false);
+	blk_mq_run_hw_queues(current_sdev->request_queue,
+			     shost->queuecommand_may_block);
 
 	spin_lock_irqsave(shost->host_lock, flags);
 	if (!starget->starget_sdev_user)
* Re: [PATCH v2 3/3] block: Improve performance for BLK_MQ_F_BLOCKING drivers
  2023-07-19 18:22 ` [PATCH v2 3/3] block: Improve performance for BLK_MQ_F_BLOCKING drivers Bart Van Assche
@ 2023-07-20  5:54   ` Christoph Hellwig
  2023-07-20 15:44     ` Bart Van Assche
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2023-07-20  5:54 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Christoph Hellwig, Martin K . Petersen,
	Ming Lei, James E.J. Bottomley

On Wed, Jul 19, 2023 at 11:22:42AM -0700, Bart Van Assche wrote:
> blk_mq_run_queue() runs the queue asynchronously if BLK_MQ_F_BLOCKING
> has been set.

Maybe something like:

blk_mq_run_queue() always runs the queue asynchronously if
BLK_MQ_F_BLOCKING is set on the tag_set.

> + * for execution. Don't wait for completion. May sleep if BLK_MQ_F_BLOCKING
> + * has been set.
>  *
>  * Note:
>  *    This function will invoke @done directly if the queue is dead.
> @@ -2213,6 +2214,8 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
>  	 */
>  	WARN_ON_ONCE(!async && in_interrupt());
>  
> +	might_sleep_if(!async && hctx->flags & BLK_MQ_F_BLOCKING);

These are some odd and very complex calling conventions. I suspect most
!BLK_MQ_F_BLOCKING callers could also deal with the might sleep if not
async, and that would give us a much easier to audit change, as we could
remove the WARN_ON_ONCE above and just do a:

might_sleep_if(!async);

In fact this might be a good time to split up blk_mq_run_hw_queue into
blk_mq_run_hw_queue and blk_mq_run_hw_queue_async, do away with the
bool, and have crystal clear calling conventions.

If we really need !async callers that can't sleep we can add a specific
blk_mq_run_hw_queue_atomic.
* Re: [PATCH v2 3/3] block: Improve performance for BLK_MQ_F_BLOCKING drivers
  2023-07-20  5:54   ` Christoph Hellwig
@ 2023-07-20 15:44     ` Bart Van Assche
  2023-07-21  6:31       ` Christoph Hellwig
  0 siblings, 1 reply; 9+ messages in thread
From: Bart Van Assche @ 2023-07-20 15:44 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Martin K . Petersen, Ming Lei,
	James E.J. Bottomley

On 7/19/23 22:54, Christoph Hellwig wrote:
> On Wed, Jul 19, 2023 at 11:22:42AM -0700, Bart Van Assche wrote:
>> + * for execution. Don't wait for completion. May sleep if BLK_MQ_F_BLOCKING
>> + * has been set.
>>  *
>>  * Note:
>>  *    This function will invoke @done directly if the queue is dead.
>> @@ -2213,6 +2214,8 @@ void blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
>>  	 */
>>  	WARN_ON_ONCE(!async && in_interrupt());
>>  
>> +	might_sleep_if(!async && hctx->flags & BLK_MQ_F_BLOCKING);
> 
> These are some odd and very complex calling conventions. I suspect most
> !BLK_MQ_F_BLOCKING callers could also deal with the might sleep if not
> async, and that would give us a much easier to audit change, as we could
> remove the WARN_ON_ONCE above and just do a:
> 
> might_sleep_if(!async);
> 
> In fact this might be a good time to split up blk_mq_run_hw_queue into
> blk_mq_run_hw_queue and blk_mq_run_hw_queue_async, do away with the
> bool, and have crystal clear calling conventions.
> 
> If we really need !async callers that can't sleep we can add a specific
> blk_mq_run_hw_queue_atomic.

Hi Christoph,

blk_mq_run_hw_queue(hctx, false) is called from inside the block layer
with an SRCU read lock held if BLK_MQ_F_BLOCKING has been set and with
an RCU read lock held if BLK_MQ_F_BLOCKING has not been set. So I'm not
sure whether it is possible to simplify the above might_sleep_if()
statement.

From block/blk-mq.h:

/* run the code block in @dispatch_ops with rcu/srcu read lock held */
#define __blk_mq_run_dispatch_ops(q, check_sleep, dispatch_ops)	\
do {								\
	if ((q)->tag_set->flags & BLK_MQ_F_BLOCKING) {		\
		struct blk_mq_tag_set *__tag_set = (q)->tag_set; \
		int srcu_idx;					\
								\
		might_sleep_if(check_sleep);			\
		srcu_idx = srcu_read_lock(__tag_set->srcu);	\
		(dispatch_ops);					\
		srcu_read_unlock(__tag_set->srcu, srcu_idx);	\
	} else {						\
		rcu_read_lock();				\
		(dispatch_ops);					\
		rcu_read_unlock();				\
	}							\
} while (0)

Thanks,

Bart.
* Re: [PATCH v2 3/3] block: Improve performance for BLK_MQ_F_BLOCKING drivers
  2023-07-20 15:44     ` Bart Van Assche
@ 2023-07-21  6:31       ` Christoph Hellwig
  0 siblings, 0 replies; 9+ messages in thread
From: Christoph Hellwig @ 2023-07-21  6:31 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Jens Axboe, linux-block, Martin K . Petersen,
	Ming Lei, James E.J. Bottomley

On Thu, Jul 20, 2023 at 08:44:52AM -0700, Bart Van Assche wrote:
> blk_mq_run_hw_queue(hctx, false) is called from inside the block layer
> with an SRCU read lock held if BLK_MQ_F_BLOCKING has been set and with
> an RCU read lock held if BLK_MQ_F_BLOCKING has not been set. So I'm not
> sure whether it is possible to simplify the above might_sleep_if()
> statement. From block/blk-mq.h:

Ok, it looks like we can't go down that way then.
end of thread, other threads:[~2023-07-21  6:31 UTC | newest]

Thread overview: 9+ messages (links below jump to the message on this page):
2023-07-19 18:22 [PATCH v2 0/3] Improve performance for BLK_MQ_F_BLOCKING drivers Bart Van Assche
2023-07-19 18:22 ` [PATCH v2 1/3] scsi: Inline scsi_kick_queue() Bart Van Assche
2023-07-20  5:39   ` Christoph Hellwig
2023-07-19 18:22 ` [PATCH v2 2/3] scsi: Remove a blk_mq_run_hw_queues() call Bart Van Assche
2023-07-20  5:40   ` Christoph Hellwig
2023-07-19 18:22 ` [PATCH v2 3/3] block: Improve performance for BLK_MQ_F_BLOCKING drivers Bart Van Assche
2023-07-20  5:54   ` Christoph Hellwig
2023-07-20 15:44     ` Bart Van Assche
2023-07-21  6:31       ` Christoph Hellwig