From: ming.lei@redhat.com (Ming Lei)
Subject: [PATCH V6 1/9] blk-mq: grab .q_usage_counter when queuing request from plug code path
Date: Wed, 17 Apr 2019 11:44:02 +0800 [thread overview]
Message-ID: <20190417034410.31957-2-ming.lei@redhat.com> (raw)
In-Reply-To: <20190417034410.31957-1-ming.lei@redhat.com>
Just like aio/io_uring, we need to grab 2 refcount for queuing one
request, one is for submission, another is for completion.
If the request isn't queued from plug code path, the refcount grabbed
in generic_make_request() serves for submission. In theroy, this
refcount should have been released after the sumission(async run queue)
is done. blk_freeze_queue() works with blk_sync_queue() together
for avoiding race between cleanup queue and IO submission, given async
run queue activities are canceled because hctx->run_work is scheduled with
the refcount held, so it is fine to not hold the refcount when
running the run queue work function for dispatch IO.
However, if request is staggered into plug list, and finally queued
from plug code path, the refcount in submission side is actually missed.
And we may start to run queue after queue is removed because the queue's
kobject refcount isn't guaranteed to be grabbed in flushing plug list
context, then kernel oops is triggered, see the following race:
blk_mq_flush_plug_list():
blk_mq_sched_insert_requests()
insert requests to sw queue or scheduler queue
blk_mq_run_hw_queue
Because of concurrent run queue, all requests inserted above may be
completed before calling the above blk_mq_run_hw_queue. Then queue can
be freed during the above blk_mq_run_hw_queue().
Fixes the issue by grab .q_usage_counter before calling
blk_mq_sched_insert_requests() in blk_mq_flush_plug_list(). This way is
safe because the queue is absolutely alive before inserting request.
Cc: Dongli Zhang <dongli.zhang at oracle.com>
Cc: James Smart <james.smart at broadcom.com>
Cc: Bart Van Assche <bart.vanassche at wdc.com>
Cc: linux-scsi at vger.kernel.org,
Cc: Martin K . Petersen <martin.petersen at oracle.com>,
Cc: Christoph Hellwig <hch at lst.de>,
Cc: James E . J . Bottomley <jejb at linux.vnet.ibm.com>,
Cc: jianchao wang <jianchao.w.wang at oracle.com>
Reviewed-by: Bart Van Assche <bvanassche at acm.org>
Reviewed-by: Johannes Thumshirn <jthumshirn at suse.de>
Reviewed-by: Hannes Reinecke <hare at suse.com>
Signed-off-by: Ming Lei <ming.lei at redhat.com>
---
block/blk-mq.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9516304a38ee..ef5a16a2d6fb 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1736,9 +1736,12 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
if (rq->mq_hctx != this_hctx || rq->mq_ctx != this_ctx) {
if (this_hctx) {
trace_block_unplug(this_q, depth, !from_schedule);
+
+ percpu_ref_get(&this_q->q_usage_counter);
blk_mq_sched_insert_requests(this_hctx, this_ctx,
&rq_list,
from_schedule);
+ percpu_ref_put(&this_q->q_usage_counter);
}
this_q = rq->q;
@@ -1757,8 +1760,11 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
*/
if (this_hctx) {
trace_block_unplug(this_q, depth, !from_schedule);
+
+ percpu_ref_get(&this_q->q_usage_counter);
blk_mq_sched_insert_requests(this_hctx, this_ctx, &rq_list,
from_schedule);
+ percpu_ref_put(&this_q->q_usage_counter);
}
}
--
2.9.5
next prev parent reply other threads:[~2019-04-17 3:44 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-17 3:44 [PATCH V6 0/9] blk-mq: fix races related with freeing queue Ming Lei
2019-04-17 3:44 ` Ming Lei [this message]
2019-04-17 3:44 ` [PATCH V6 2/9] blk-mq: move cancel of requeue_work into blk_mq_release Ming Lei
2019-04-17 12:00 ` Hannes Reinecke
2019-04-17 3:44 ` [PATCH V6 3/9] blk-mq: free hw queue's resource in hctx's release handler Ming Lei
2019-04-17 12:02 ` Hannes Reinecke
2019-04-17 3:44 ` [PATCH V6 4/9] blk-mq: move all hctx alloction & initialization into __blk_mq_alloc_and_init_hctx Ming Lei
2019-04-17 12:03 ` Hannes Reinecke
2019-04-17 3:44 ` [PATCH V6 5/9] blk-mq: split blk_mq_alloc_and_init_hctx into two parts Ming Lei
2019-04-17 3:44 ` [PATCH V6 6/9] blk-mq: always free hctx after request queue is freed Ming Lei
2019-04-17 12:08 ` Hannes Reinecke
2019-04-17 12:59 ` Ming Lei
2019-04-22 3:30 ` Ming Lei
2019-04-23 11:19 ` Hannes Reinecke
2019-04-23 13:30 ` Ming Lei
2019-04-23 14:07 ` Hannes Reinecke
2019-04-24 1:12 ` Ming Lei
2019-04-24 1:45 ` Ming Lei
2019-04-24 5:55 ` Hannes Reinecke
2019-04-17 3:44 ` [PATCH V6 7/9] blk-mq: move cancel of hctx->run_work into blk_mq_hw_sysfs_release Ming Lei
2019-04-17 3:44 ` [PATCH V6 8/9] block: don't drain in-progress dispatch in blk_cleanup_queue() Ming Lei
2019-04-17 3:44 ` [PATCH V6 9/9] nvme: hold request queue's refcount in ns's whole lifetime Ming Lei
2019-04-17 12:10 ` Hannes Reinecke
2019-04-17 15:55 ` Keith Busch
2019-04-17 17:22 ` [PATCH V6 0/9] blk-mq: fix races related with freeing queue James Smart
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190417034410.31957-2-ming.lei@redhat.com \
--to=ming.lei@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox