* [PATCH V4 1/2] blk-mq: introduce blk_mq_complete_request_sync()
2019-04-08 22:31 [PATCH V4 0/2] blk-mq/nvme: cancel request synchronously Ming Lei
@ 2019-04-08 22:31 ` Ming Lei
2019-04-09 9:48 ` Christoph Hellwig
2019-04-08 22:31 ` [PATCH V4 2/2] nvme: cancel request synchronously Ming Lei
` (2 subsequent siblings)
3 siblings, 1 reply; 7+ messages in thread
From: Ming Lei @ 2019-04-08 22:31 UTC
To: Jens Axboe
Cc: linux-block, Ming Lei, Keith Busch, Sagi Grimberg,
Bart Van Assche, James Smart, Christoph Hellwig, linux-nvme
NVMe's error handler follows these typical steps to tear down the
hardware when recovering a controller:

1) stop blk_mq hw queues
2) stop the real hw queues
3) cancel in-flight requests via
	blk_mq_tagset_busy_iter(tags, cancel_request, ...)
   cancel_request():
	mark the request as aborted
	blk_mq_complete_request(req);
4) destroy real hw queues

However, there may be a race between #3 and #4, because blk_mq_complete_request()
may run q->mq_ops->complete(rq) remotely and asynchronously, so
->complete(rq) may run after #4.

This patch introduces blk_mq_complete_request_sync(), which marks the
request complete and calls ->complete(rq) directly in the caller's
context, to fix the above race.
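
As a rough illustration only, the ordering problem between #3 and #4 can
be modelled in user space. This is a sketch, not kernel code: every
model_* name is hypothetical, and the one-slot "pending" pointer merely
stands in for a completion that has been handed to a remote CPU.

```c
/* User-space model of the teardown race; all model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_hwq {
	bool destroyed;
};

struct model_rq {
	struct model_hwq *hwq;
	bool completed;
	bool completed_after_destroy;
};

/* Stand-in for q->mq_ops->complete(rq). */
static void model_complete(struct model_rq *rq)
{
	rq->completed = true;
	/* In a real driver, touching a destroyed queue is a use-after-free. */
	rq->completed_after_destroy = rq->hwq->destroyed;
}

/* One-slot stand-in for a completion deferred to a remote CPU. */
static struct model_rq *pending;

/* Models blk_mq_complete_request(): the handler may run later. */
static void model_complete_async(struct model_rq *rq)
{
	assert(pending == NULL);
	pending = rq;
}

/* Models blk_mq_complete_request_sync(): the handler runs right now. */
static void model_complete_sync(struct model_rq *rq)
{
	model_complete(rq);
}

/* The "remote CPU" finally runs whatever was deferred. */
static void model_run_deferred(void)
{
	if (pending) {
		model_complete(pending);
		pending = NULL;
	}
}

/* Returns true if the completion handler ran after the hw queue was destroyed. */
bool model_teardown_races(bool sync)
{
	struct model_hwq hwq = { .destroyed = false };
	struct model_rq rq = { .hwq = &hwq };

	/* step 3: cancel the in-flight request */
	if (sync)
		model_complete_sync(&rq);
	else
		model_complete_async(&rq);

	/* step 4: destroy the real hw queue */
	hwq.destroyed = true;

	model_run_deferred();
	return rq.completed_after_destroy;
}
```

In this model, model_teardown_races(false) reports a completion after
destruction, while model_teardown_races(true) does not, because the
handler has already finished by the time step 4 starts.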
Cc: Keith Busch <kbusch@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: James Smart <james.smart@broadcom.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
block/blk-mq.c | 7 +++++++
include/linux/blk-mq.h | 1 +
2 files changed, 8 insertions(+)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a9354835cf51..9516304a38ee 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -654,6 +654,13 @@ bool blk_mq_complete_request(struct request *rq)
 }
 EXPORT_SYMBOL(blk_mq_complete_request);
 
+void blk_mq_complete_request_sync(struct request *rq)
+{
+	WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
+	rq->q->mq_ops->complete(rq);
+}
+EXPORT_SYMBOL_GPL(blk_mq_complete_request_sync);
+
 int blk_mq_request_started(struct request *rq)
 {
 	return blk_mq_rq_state(rq) != MQ_RQ_IDLE;
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index cb2aa7ecafff..db29928de467 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -302,6 +302,7 @@ void blk_mq_requeue_request(struct request *rq, bool kick_requeue_list);
 void blk_mq_kick_requeue_list(struct request_queue *q);
 void blk_mq_delay_kick_requeue_list(struct request_queue *q, unsigned long msecs);
 bool blk_mq_complete_request(struct request *rq);
+void blk_mq_complete_request_sync(struct request *rq);
 bool blk_mq_bio_list_merge(struct request_queue *q, struct list_head *list,
 			   struct bio *bio);
 bool blk_mq_queue_stopped(struct request_queue *q);
--
2.9.5
* [PATCH V4 2/2] nvme: cancel request synchronously
2019-04-08 22:31 [PATCH V4 0/2] blk-mq/nvme: cancel request synchronously Ming Lei
2019-04-08 22:31 ` [PATCH V4 1/2] blk-mq: introduce blk_mq_complete_request_sync() Ming Lei
@ 2019-04-08 22:31 ` Ming Lei
2019-04-10 15:04 ` [PATCH V4 0/2] blk-mq/nvme: " Ming Lei
2019-04-10 15:57 ` Jens Axboe
3 siblings, 0 replies; 7+ messages in thread
From: Ming Lei @ 2019-04-08 22:31 UTC
To: Jens Axboe
Cc: linux-block, Ming Lei, Keith Busch, Sagi Grimberg,
Bart Van Assche, James Smart, Christoph Hellwig, linux-nvme
nvme_cancel_request() is used in the error handler, where it is always
reliable to cancel a request synchronously. Doing so avoids a possible
race in which a request is completed after the real hw queue has been
destroyed.

One such issue was reported by our customer on NVMe RDMA, in which a
freed ib queue pair may be used in nvme_rdma_complete_rq().
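
For illustration, the cancel path can be sketched as a small user-space
model. It is not the driver code: the model_* and MODEL_* names are
hypothetical, MODEL_SC_ABORT is a placeholder rather than the real
NVME_SC_ABORT_REQ value, and the in_flight flag stands in for the
request state that blk-mq tracks.

```c
/* User-space model only; every model_* / MODEL_* name is hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NR_REQS	4
#define MODEL_SC_ABORT	1	/* placeholder, not the real NVME_SC_ABORT_REQ value */

struct model_req {
	bool in_flight;
	int status;
};

typedef bool (*model_busy_fn)(struct model_req *req, void *data);

/* Stand-in for blk_mq_tagset_busy_iter(): visit each in-flight request. */
static void model_busy_iter(struct model_req *reqs, size_t nr,
			    model_busy_fn fn, void *data)
{
	for (size_t i = 0; i < nr; i++) {
		if (!reqs[i].in_flight)
			continue;
		if (!fn(&reqs[i], data))	/* false stops the iteration */
			break;
	}
}

/* Stand-in for nvme_cancel_request() with synchronous completion. */
static bool model_cancel_request(struct model_req *req, void *data)
{
	int *cancelled = data;

	req->status = MODEL_SC_ABORT;
	req->in_flight = false;	/* completed before the callback returns */
	(*cancelled)++;
	return true;
}

/* Stand-in for freeing the ib queue pair: only safe with nothing in flight. */
static void model_free_qp(const struct model_req *reqs, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		assert(!reqs[i].in_flight);
}

/* Cancels the two in-flight requests out of four; returns how many were cancelled. */
int model_cancel_then_free(void)
{
	struct model_req reqs[MODEL_NR_REQS] = {
		{ .in_flight = true }, { 0 }, { .in_flight = true }, { 0 },
	};
	int cancelled = 0;

	model_busy_iter(reqs, MODEL_NR_REQS, model_cancel_request, &cancelled);
	model_free_qp(reqs, MODEL_NR_REQS);
	return cancelled;
}
```

Because each callback finishes the request before returning, nothing is
left in flight when the iterator returns, so the assertion in
model_free_qp() holds.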
Cc: Keith Busch <kbusch@kernel.org>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: James Smart <james.smart@broadcom.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: linux-nvme@lists.infradead.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
drivers/nvme/host/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 470601980794..2c43e12b70af 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -288,7 +288,7 @@ bool nvme_cancel_request(struct request *req, void *data, bool reserved)
 				"Cancelling I/O %d", req->tag);
 
 	nvme_req(req)->status = NVME_SC_ABORT_REQ;
-	blk_mq_complete_request(req);
+	blk_mq_complete_request_sync(req);
 	return true;
 }
 EXPORT_SYMBOL_GPL(nvme_cancel_request);
--
2.9.5
* Re: [PATCH V4 0/2] blk-mq/nvme: cancel request synchronously
2019-04-08 22:31 [PATCH V4 0/2] blk-mq/nvme: cancel request synchronously Ming Lei
2019-04-08 22:31 ` [PATCH V4 1/2] blk-mq: introduce blk_mq_complete_request_sync() Ming Lei
2019-04-08 22:31 ` [PATCH V4 2/2] nvme: cancel request synchronously Ming Lei
@ 2019-04-10 15:04 ` Ming Lei
2019-04-10 15:30 ` Keith Busch
2019-04-10 15:57 ` Jens Axboe
3 siblings, 1 reply; 7+ messages in thread
From: Ming Lei @ 2019-04-10 15:04 UTC
To: Jens Axboe
Cc: linux-block, Keith Busch, Sagi Grimberg, Bart Van Assche,
James Smart, Christoph Hellwig, linux-nvme
On Tue, Apr 09, 2019 at 06:31:20AM +0800, Ming Lei wrote:
> Hi,
>
> This patchset introduces blk_mq_complete_request_sync() for canceling
> requests synchronously in error handler context, so that a race between
> completing a request remotely and destroying the controller/queues can be fixed.
>
>
> V4:
> - drop return value
> - don't handle fake timeout
>
> V3:
> - avoid extra cost to blk_mq_complete_request
>
> V2:
> - export via EXPORT_SYMBOL_GPL
> - minor commit log change
>
> Ming Lei (2):
> blk-mq: introduce blk_mq_complete_request_sync()
> nvme: cancel request synchronously
>
> block/blk-mq.c | 7 +++++++
> drivers/nvme/host/core.c | 2 +-
> include/linux/blk-mq.h | 1 +
> 3 files changed, 9 insertions(+), 1 deletion(-)
>
> Cc: Keith Busch <kbusch@kernel.org>
> Cc: Sagi Grimberg <sagi@grimberg.me>
> Cc: Bart Van Assche <bvanassche@acm.org>
> Cc: James Smart <james.smart@broadcom.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: linux-nvme@lists.infradead.org
>
Hi Jens,
These two patches have been posted for a while. Any chance of getting them
into v5.1?
Thanks,
Ming
* Re: [PATCH V4 0/2] blk-mq/nvme: cancel request synchronously
2019-04-10 15:04 ` [PATCH V4 0/2] blk-mq/nvme: " Ming Lei
@ 2019-04-10 15:30 ` Keith Busch
0 siblings, 0 replies; 7+ messages in thread
From: Keith Busch @ 2019-04-10 15:30 UTC
To: Ming Lei
Cc: Jens Axboe, linux-block, Sagi Grimberg, Bart Van Assche,
James Smart, Christoph Hellwig, linux-nvme
On Wed, Apr 10, 2019 at 11:04:38PM +0800, Ming Lei wrote:
> On Tue, Apr 09, 2019 at 06:31:20AM +0800, Ming Lei wrote:
> > Hi,
> >
> > This patchset introduces blk_mq_complete_request_sync() for canceling
> > requests synchronously in error handler context, so that a race between
> > completing a request remotely and destroying the controller/queues can be fixed.
> >
> >
> > V4:
> > - drop return value
> > - don't handle fake timeout
> >
> > V3:
> > - avoid extra cost to blk_mq_complete_request
> >
> > V2:
> > - export via EXPORT_SYMBOL_GPL
> > - minor commit log change
> >
> > Ming Lei (2):
> > blk-mq: introduce blk_mq_complete_request_sync()
> > nvme: cancel request synchronously
> >
> > block/blk-mq.c | 7 +++++++
> > drivers/nvme/host/core.c | 2 +-
> > include/linux/blk-mq.h | 1 +
> > 3 files changed, 9 insertions(+), 1 deletion(-)
> >
> > Cc: Keith Busch <kbusch@kernel.org>
> > Cc: Sagi Grimberg <sagi@grimberg.me>
> > Cc: Bart Van Assche <bvanassche@acm.org>
> > Cc: James Smart <james.smart@broadcom.com>
> > Cc: Christoph Hellwig <hch@lst.de>
> > Cc: linux-nvme@lists.infradead.org
> >
>
> Hi Jens,
>
> These two patches have been posted for a while. Any chance of getting them
> into v5.1?
FWIW, series looks good to me too.
Reviewed-by: Keith Busch <keith.busch@intel.com>
* Re: [PATCH V4 0/2] blk-mq/nvme: cancel request synchronously
2019-04-08 22:31 [PATCH V4 0/2] blk-mq/nvme: cancel request synchronously Ming Lei
` (2 preceding siblings ...)
2019-04-10 15:04 ` [PATCH V4 0/2] blk-mq/nvme: " Ming Lei
@ 2019-04-10 15:57 ` Jens Axboe
3 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2019-04-10 15:57 UTC
To: Ming Lei
Cc: linux-block, Keith Busch, Sagi Grimberg, Bart Van Assche,
James Smart, Christoph Hellwig, linux-nvme
On 4/8/19 4:31 PM, Ming Lei wrote:
> Hi,
>
> This patchset introduces blk_mq_complete_request_sync() for canceling
> requests synchronously in error handler context, so that a race between
> completing a request remotely and destroying the controller/queues can be fixed.
Applied for 5.1, thanks.
--
Jens Axboe