public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH V2 0/2] blk-mq/nvme: cancel request synchronously
@ 2019-03-27  8:51 Ming Lei
  2019-03-27  8:51 ` [PATCH V2 1/2] blk-mq: introduce blk_mq_complete_request_sync() Ming Lei
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Ming Lei @ 2019-03-27  8:51 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Ming Lei

Hi,

This patchset introduces blk_mq_complete_request_sync() for canceling
request synchronously in error handler context, then one race between
completing request remotely and destroying contoller/queues can be fixed.

V2:
	- export via EXPORT_SYMBOL_GPL
	- minor commit log change


Ming Lei (2):
  blk-mq: introduce blk_mq_complete_request_sync()
  nvme: cancel request synchronously

 block/blk-mq.c           | 20 ++++++++++++++++----
 drivers/nvme/host/core.c |  2 +-
 include/linux/blk-mq.h   |  1 +
 3 files changed, 18 insertions(+), 5 deletions(-)

-- 
2.9.5


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH V2 1/2] blk-mq: introduce blk_mq_complete_request_sync()
  2019-03-27  8:51 [PATCH V2 0/2] blk-mq/nvme: cancel request synchronously Ming Lei
@ 2019-03-27  8:51 ` Ming Lei
  2019-03-27 13:32   ` Keith Busch
  2019-03-27  8:51 ` [PATCH V2 2/2] nvme: cancel request synchronously Ming Lei
  2019-04-02  3:38 ` [PATCH V2 0/2] blk-mq/nvme: " Ming Lei
  2 siblings, 1 reply; 6+ messages in thread
From: Ming Lei @ 2019-03-27  8:51 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Keith Busch, Sagi Grimberg,
	Bart Van Assche, James Smart, Christoph Hellwig, linux-nvme

In NVMe's error handler, follows the typical steps of tearing down
hardware for recovering controller:

1) stop blk_mq hw queues
2) stop the real hw queues
3) cancel in-flight requests via
	blk_mq_tagset_busy_iter(tags, cancel_request, ...)
cancel_request():
	mark the request as abort
	blk_mq_complete_request(req);
4) destroy real hw queues

However, there may be race between #3 and #4, because blk_mq_complete_request()
may run q->mq_ops->complete(rq) remotelly and asynchronously, and
->complete(rq) may be run after #4.

This patch introduces blk_mq_complete_request_sync() for fixing the
above race.

Cc: Keith Busch <kbusch@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: James Smart <james.smart@broadcom.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-mq.c         | 20 ++++++++++++++++----
 include/linux/blk-mq.h |  1 +
 2 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index a9c181603cbd..bc3524428b96 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -569,7 +569,7 @@ static void __blk_mq_complete_request_remote(void *data)
 	q->mq_ops->complete(rq);
 }
 
-static void __blk_mq_complete_request(struct request *rq)
+static void __blk_mq_complete_request(struct request *rq, bool sync)
 {
 	struct blk_mq_ctx *ctx = rq->mq_ctx;
 	struct request_queue *q = rq->q;
@@ -586,7 +586,7 @@ static void __blk_mq_complete_request(struct request *rq)
 	 * So complete IO reqeust in softirq context in case of single queue
 	 * for not degrading IO performance by irqsoff latency.
 	 */
-	if (q->nr_hw_queues == 1) {
+	if (q->nr_hw_queues == 1 && !sync) {
 		__blk_complete_request(rq);
 		return;
 	}
@@ -594,8 +594,11 @@ static void __blk_mq_complete_request(struct request *rq)
 	/*
 	 * For a polled request, always complete locallly, it's pointless
 	 * to redirect the completion.
+	 *
+	 * If driver requires to complete the request synchronously,
+	 * complete it locally, and it is usually done in error handler.
 	 */
-	if ((rq->cmd_flags & REQ_HIPRI) ||
+	if ((rq->cmd_flags & REQ_HIPRI) || sync ||
 	    !test_bit(QUEUE_FLAG_SAME_COMP, &q->queue_flags)) {
 		q->mq_ops->complete(rq);
 		return;
@@ -648,11 +651,20 @@ bool blk_mq_complete_request(struct request *rq)
 {
 	if (unlikely(blk_should_fake_timeout(rq->q)))
 		return false;
-	__blk_mq_complete_request(rq);
+	__blk_mq_complete_request(rq, false);
 	return true;
 }
 EXPORT_SYMBOL(blk_mq_complete_request);
 
+bool blk_mq_complete_request_sync(struct request *rq)
+{
+	if (unlikely(blk_should_fake_timeout(rq->q)))
+		return false;
+	__blk_mq_complete_request(rq, true);
+	return true;
+}
+EXPORT_SYMBOL_GPL(blk_mq_complete_request_sync);
+
 int blk_mq_request_started(struct request *rq)
 {
 	return blk_mq_rq_state(rq) != MQ_RQ_IDLE;
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index b0c814bcc7e3..6a514e5136f4 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -305,6 +305,7 @@ void blk_mq_add_to_requeue_list(struct request *rq, bool at_head,
 void blk_mq_kick_requeue_list(struct request_queue *q);
 void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs);
 bool blk_mq_complete_request(struct request *rq);
+bool blk_mq_complete_request_sync(struct request *rq);
 bool blk_mq_bio_list_merge(struct request_queue *q, struct list_head *list,
 			   struct bio *bio);
 bool blk_mq_queue_stopped(struct request_queue *q);
-- 
2.9.5


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH V2 2/2] nvme: cancel request synchronously
  2019-03-27  8:51 [PATCH V2 0/2] blk-mq/nvme: cancel request synchronously Ming Lei
  2019-03-27  8:51 ` [PATCH V2 1/2] blk-mq: introduce blk_mq_complete_request_sync() Ming Lei
@ 2019-03-27  8:51 ` Ming Lei
  2019-04-02  3:38 ` [PATCH V2 0/2] blk-mq/nvme: " Ming Lei
  2 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2019-03-27  8:51 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Ming Lei, Keith Busch, Sagi Grimberg,
	Bart Van Assche, James Smart, Christoph Hellwig, linux-nvme

nvme_cancel_request() is used in error handler, and it is always
reliable to cancel request synchronously, and avoids possible race
in which request may be completed after real hw queue is destroyed.

One issue is reported by our customer on NVMe RDMA, in which freed ib
queue pair may be used in nvme_rdma_complete_rq().

Cc: Keith Busch <kbusch@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: James Smart <james.smart@broadcom.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/nvme/host/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 470601980794..2c43e12b70af 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -288,7 +288,7 @@ bool nvme_cancel_request(struct request *req, void *data, bool reserved)
 				"Cancelling I/O %d", req->tag);
 
 	nvme_req(req)->status = NVME_SC_ABORT_REQ;
-	blk_mq_complete_request(req);
+	blk_mq_complete_request_sync(req);
 	return true;
 }
 EXPORT_SYMBOL_GPL(nvme_cancel_request);
-- 
2.9.5


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH V2 1/2] blk-mq: introduce blk_mq_complete_request_sync()
  2019-03-27  8:51 ` [PATCH V2 1/2] blk-mq: introduce blk_mq_complete_request_sync() Ming Lei
@ 2019-03-27 13:32   ` Keith Busch
  2019-03-28  0:50     ` Ming Lei
  0 siblings, 1 reply; 6+ messages in thread
From: Keith Busch @ 2019-03-27 13:32 UTC (permalink / raw)
  To: Ming Lei
  Cc: Jens Axboe, linux-block, Sagi Grimberg, Bart Van Assche,
	James Smart, Christoph Hellwig, linux-nvme

On Wed, Mar 27, 2019 at 04:51:13PM +0800, Ming Lei wrote:
> @@ -594,8 +594,11 @@ static void __blk_mq_complete_request(struct request *rq)
>  	/*
>  	 * For a polled request, always complete locallly, it's pointless
>  	 * to redirect the completion.
> +	 *
> +	 * If driver requires to complete the request synchronously,
> +	 * complete it locally, and it is usually done in error handler.
>  	 */
> -	if ((rq->cmd_flags & REQ_HIPRI) ||
> +	if ((rq->cmd_flags & REQ_HIPRI) || sync ||
>  	    !test_bit(QUEUE_FLAG_SAME_COMP, &q->queue_flags)) {
>  		q->mq_ops->complete(rq);
>  		return;

I think this is fine and agree this is fixing a problem, but just
considering an alternative: would it be okay to just temporarily clear
QUEUE_FLAG_SAME_COMP in the necessary error handling that wants a sync
completion so that we're not introducing an additional parameter?

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH V2 1/2] blk-mq: introduce blk_mq_complete_request_sync()
  2019-03-27 13:32   ` Keith Busch
@ 2019-03-28  0:50     ` Ming Lei
  0 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2019-03-28  0:50 UTC (permalink / raw)
  To: Keith Busch
  Cc: Jens Axboe, linux-block, Sagi Grimberg, Bart Van Assche,
	James Smart, Christoph Hellwig, linux-nvme

On Wed, Mar 27, 2019 at 07:32:45AM -0600, Keith Busch wrote:
> On Wed, Mar 27, 2019 at 04:51:13PM +0800, Ming Lei wrote:
> > @@ -594,8 +594,11 @@ static void __blk_mq_complete_request(struct request *rq)
> >  	/*
> >  	 * For a polled request, always complete locallly, it's pointless
> >  	 * to redirect the completion.
> > +	 *
> > +	 * If driver requires to complete the request synchronously,
> > +	 * complete it locally, and it is usually done in error handler.
> >  	 */
> > -	if ((rq->cmd_flags & REQ_HIPRI) ||
> > +	if ((rq->cmd_flags & REQ_HIPRI) || sync ||
> >  	    !test_bit(QUEUE_FLAG_SAME_COMP, &q->queue_flags)) {
> >  		q->mq_ops->complete(rq);
> >  		return;
> 
> I think this is fine and agree this is fixing a problem, but just
> considering an alternative: would it be okay to just temporarily clear
> QUEUE_FLAG_SAME_COMP in the necessary error handling that wants a sync
> completion so that we're not introducing an additional parameter?

But the flag can be changed by sysfs too.

Thanks,
Ming

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH V2 0/2] blk-mq/nvme: cancel request synchronously
  2019-03-27  8:51 [PATCH V2 0/2] blk-mq/nvme: cancel request synchronously Ming Lei
  2019-03-27  8:51 ` [PATCH V2 1/2] blk-mq: introduce blk_mq_complete_request_sync() Ming Lei
  2019-03-27  8:51 ` [PATCH V2 2/2] nvme: cancel request synchronously Ming Lei
@ 2019-04-02  3:38 ` Ming Lei
  2 siblings, 0 replies; 6+ messages in thread
From: Ming Lei @ 2019-04-02  3:38 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block

On Wed, Mar 27, 2019 at 04:51:12PM +0800, Ming Lei wrote:
> Hi,
> 
> This patchset introduces blk_mq_complete_request_sync() for canceling
> request synchronously in error handler context, then one race between
> completing request remotely and destroying contoller/queues can be fixed.
> 
> V2:
> 	- export via EXPORT_SYMBOL_GPL
> 	- minor commit log change
> 
> 
> Ming Lei (2):
>   blk-mq: introduce blk_mq_complete_request_sync()
>   nvme: cancel request synchronously
> 
>  block/blk-mq.c           | 20 ++++++++++++++++----
>  drivers/nvme/host/core.c |  2 +-
>  include/linux/blk-mq.h   |  1 +
>  3 files changed, 18 insertions(+), 5 deletions(-)

Hi Jens,

Gentle ping on this fix...

Thanks,
Ming

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-04-02  3:38 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-03-27  8:51 [PATCH V2 0/2] blk-mq/nvme: cancel request synchronously Ming Lei
2019-03-27  8:51 ` [PATCH V2 1/2] blk-mq: introduce blk_mq_complete_request_sync() Ming Lei
2019-03-27 13:32   ` Keith Busch
2019-03-28  0:50     ` Ming Lei
2019-03-27  8:51 ` [PATCH V2 2/2] nvme: cancel request synchronously Ming Lei
2019-04-02  3:38 ` [PATCH V2 0/2] blk-mq/nvme: " Ming Lei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox