* [PATCH 1/2] SCSI: run idle hctx after delay in scsi_mq_get_budget()
2017-10-17 5:04 [PATCH 0/2] block/SCSI MQ: two RESTART related patches Ming Lei
@ 2017-10-17 5:04 ` Ming Lei
2017-10-17 5:04 ` [PATCH 2/2] blk-mq: don't handle TAG_SHARED in restart Ming Lei
2017-10-17 5:12 ` [PATCH 0/2] block/SCSI MQ: two RESTART related patches Ming Lei
2 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2017-10-17 5:04 UTC (permalink / raw)
To: Jens Axboe, linux-block, Christoph Hellwig
Cc: Bart Van Assche, linux-kernel, linux-scsi, Omar Sandoval,
John Garry, Ming Lei
If there isn't any outstanding request in this queue, both
blk-mq's RESTART and SCSI's builtin RESTART can't work,
so we have to deal with this case by running this queue
after delay.
Fixes: d04b6d97d0a1(scsi: implement .get_budget and .put_budget for blk-mq)
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/scsi/scsi_lib.c | 13 ++++++++++---
1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 6f10afaca25b..08c495364b28 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1959,6 +1959,14 @@ static void scsi_mq_put_budget(struct blk_mq_hw_ctx *hctx)
put_device(&sdev->sdev_gendev);
}
+static void scsi_mq_run_idle_hctx(struct scsi_device *sdev,
+ struct blk_mq_hw_ctx *hctx)
+{
+ if (atomic_read(&sdev->device_busy) == 0 &&
+ !scsi_device_blocked(sdev))
+ blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
+}
+
static blk_status_t scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
{
struct request_queue *q = hctx->queue;
@@ -1989,6 +1997,7 @@ static blk_status_t scsi_mq_get_budget(struct blk_mq_hw_ctx *hctx)
out_put_device:
put_device(&sdev->sdev_gendev);
out:
+ scsi_mq_run_idle_hctx(sdev, hctx);
return BLK_STS_RESOURCE;
}
@@ -2039,9 +2048,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
case BLK_STS_OK:
break;
case BLK_STS_RESOURCE:
- if (atomic_read(&sdev->device_busy) == 0 &&
- !scsi_device_blocked(sdev))
- blk_mq_delay_run_hw_queue(hctx, SCSI_QUEUE_DELAY);
+ scsi_mq_run_idle_hctx(sdev, hctx);
break;
default:
/*
--
2.9.5
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH 2/2] blk-mq: don't handle TAG_SHARED in restart
2017-10-17 5:04 [PATCH 0/2] block/SCSI MQ: two RESTART related patches Ming Lei
2017-10-17 5:04 ` [PATCH 1/2] SCSI: run idle hctx after delay in scsi_mq_get_budget() Ming Lei
@ 2017-10-17 5:04 ` Ming Lei
2017-10-17 5:12 ` [PATCH 0/2] block/SCSI MQ: two RESTART related patches Ming Lei
2 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2017-10-17 5:04 UTC (permalink / raw)
To: Jens Axboe, linux-block, Christoph Hellwig
Cc: Bart Van Assche, linux-kernel, linux-scsi, Omar Sandoval,
John Garry, Ming Lei
Now restart is used in the following cases, and TAG_SHARED is for
SCSI only.
1) .get_budget() returns BLK_STS_RESOURCE
- if resource in target/host level isn't satistifed, this SCSI device
will be added in shost->starved_list, and the whole queue will be rerun
(via SCSI's built-in RESTART) in scsi_end_request() after any request
initiated from this host/targe is completed. Forget to mention, host level
resource is always respected by blk-mq before running .queue_rq().
- the same is true if resource in the queue level isn't satisfied.
- if there isn't outstanding request on this queue, then SCSI's RESTART
can't work(blk-mq's can't work too), and the queue will be run after
SCSI_QUEUE_DELAY, and finally all starved sdevs will be handled by SCSI's
RESTART when this request is finished
2) scsi_dispatch_cmd() returns BLK_STS_RESOURCE
- if there isn't onprogressing request on this queue, the queue
will be run after SCSI_QUEUE_DELAY
- otherwise, SCSI's RESTART covers the rerun.
3) blk_mq_get_driver_tag() failed
- BLK_MQ_S_TAG_WAITING covers the cross-queue RESTART for driver
allocation.
In one word, SCSI's built-in RESTART is enough to cover itself.
So we don't need to pay special attention to TAG_SHARED wrt. restart.
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
block/blk-mq-sched.c | 78 +++-------------------------------------------------
1 file changed, 4 insertions(+), 74 deletions(-)
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index df8581bb0a37..daab27feb653 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -68,25 +68,17 @@ static void blk_mq_sched_mark_restart_hctx(struct blk_mq_hw_ctx *hctx)
set_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
}
-static bool blk_mq_sched_restart_hctx(struct blk_mq_hw_ctx *hctx)
+void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx)
{
if (!test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state))
- return false;
-
- if (hctx->flags & BLK_MQ_F_TAG_SHARED) {
- struct request_queue *q = hctx->queue;
+ return;
- if (test_and_clear_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state))
- atomic_dec(&q->shared_hctx_restart);
- } else
- clear_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
+ clear_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
if (blk_mq_hctx_has_pending(hctx)) {
blk_mq_run_hw_queue(hctx, true);
- return true;
+ return;
}
-
- return false;
}
/* return true if hctx need to run again */
@@ -385,68 +377,6 @@ static bool blk_mq_sched_bypass_insert(struct blk_mq_hw_ctx *hctx,
return true;
}
-/**
- * list_for_each_entry_rcu_rr - iterate in a round-robin fashion over rcu list
- * @pos: loop cursor.
- * @skip: the list element that will not be examined. Iteration starts at
- * @skip->next.
- * @head: head of the list to examine. This list must have at least one
- * element, namely @skip.
- * @member: name of the list_head structure within typeof(*pos).
- */
-#define list_for_each_entry_rcu_rr(pos, skip, head, member) \
- for ((pos) = (skip); \
- (pos = (pos)->member.next != (head) ? list_entry_rcu( \
- (pos)->member.next, typeof(*pos), member) : \
- list_entry_rcu((pos)->member.next->next, typeof(*pos), member)), \
- (pos) != (skip); )
-
-/*
- * Called after a driver tag has been freed to check whether a hctx needs to
- * be restarted. Restarts @hctx if its tag set is not shared. Restarts hardware
- * queues in a round-robin fashion if the tag set of @hctx is shared with other
- * hardware queues.
- */
-void blk_mq_sched_restart(struct blk_mq_hw_ctx *const hctx)
-{
- struct blk_mq_tags *const tags = hctx->tags;
- struct blk_mq_tag_set *const set = hctx->queue->tag_set;
- struct request_queue *const queue = hctx->queue, *q;
- struct blk_mq_hw_ctx *hctx2;
- unsigned int i, j;
-
- if (set->flags & BLK_MQ_F_TAG_SHARED) {
- /*
- * If this is 0, then we know that no hardware queues
- * have RESTART marked. We're done.
- */
- if (!atomic_read(&queue->shared_hctx_restart))
- return;
-
- rcu_read_lock();
- list_for_each_entry_rcu_rr(q, queue, &set->tag_list,
- tag_set_list) {
- queue_for_each_hw_ctx(q, hctx2, i)
- if (hctx2->tags == tags &&
- blk_mq_sched_restart_hctx(hctx2))
- goto done;
- }
- j = hctx->queue_num + 1;
- for (i = 0; i < queue->nr_hw_queues; i++, j++) {
- if (j == queue->nr_hw_queues)
- j = 0;
- hctx2 = queue->queue_hw_ctx[j];
- if (hctx2->tags == tags &&
- blk_mq_sched_restart_hctx(hctx2))
- break;
- }
-done:
- rcu_read_unlock();
- } else {
- blk_mq_sched_restart_hctx(hctx);
- }
-}
-
/*
* Add flush/fua to the queue. If we fail getting a driver tag, then
* punt to the requeue list. Requeue will re-invoke us from a context
--
2.9.5
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH 0/2] block/SCSI MQ: two RESTART related patches
2017-10-17 5:04 [PATCH 0/2] block/SCSI MQ: two RESTART related patches Ming Lei
2017-10-17 5:04 ` [PATCH 1/2] SCSI: run idle hctx after delay in scsi_mq_get_budget() Ming Lei
2017-10-17 5:04 ` [PATCH 2/2] blk-mq: don't handle TAG_SHARED in restart Ming Lei
@ 2017-10-17 5:12 ` Ming Lei
2017-10-17 15:47 ` John Garry
2 siblings, 1 reply; 6+ messages in thread
From: Ming Lei @ 2017-10-17 5:12 UTC (permalink / raw)
To: Jens Axboe, linux-block, Christoph Hellwig
Cc: Bart Van Assche, linux-kernel, linux-scsi, Omar Sandoval,
John Garry
On Tue, Oct 17, 2017 at 01:04:16PM +0800, Ming Lei wrote:
> Hi Jens,
>
> The 1st patch runs idle hctx after dealy in scsi_mq_get_budget(),
> so that we can keep same behaviour with before, and it can be
> thought as a fix.
>
> The 2nd patch cleans up RESTART, and removes handling for TAG_SHARED
> from current blk-mq's RESTART mechanism because SCSI_MQ can covers its
> restart by itself, so that no need to handle TAG_SHARED in blk-mq
> RESTART. And >20% IOPS boost is observed in my rand read test over
> scsi_debug.
>
> John, please test this two patches and see if it may improve your SAS
> IO performance, and you can find the two patches in the following branch:
>
> https://github.com/ming1/linux/commits/blk_mq_improve_restart_V1
Forget to mention, you need to either pull the above branch directly
or apply the two patches against for-next branch of Jens' block
tree:
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git #for-next
--
Ming
^ permalink raw reply [flat|nested] 6+ messages in thread