[PATCHv2 1/2] blk-mq: export setting request completion state

linux-nvme.lists.infradead.org archive mirror
 help / color / mirror / Atom feed

* [PATCHv2 1/2] blk-mq: export setting request completion state
@ 2018-07-23 14:37 Keith Busch
  2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Keith Busch @ 2018-07-23 14:37 UTC (permalink / raw)


This is preparing for drivers that want to directly alter the state of
their requests. No functional change here.

Signed-off-by: Keith Busch <keith.busch at intel.com>
---
v1 -> v2:

  Reversed the return value: 'true' for success instead of failure.

  Document the caller's responsibilities after a successful state transtion.

 block/blk-mq.c         |  4 +---
 include/linux/blk-mq.h | 14 ++++++++++++++
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 22326612a5d3..8f01cd7fd182 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -558,10 +558,8 @@ static void __blk_mq_complete_request(struct request *rq)
 	bool shared = false;
 	int cpu;
 
-	if (cmpxchg(&rq->state, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE) !=
-			MQ_RQ_IN_FLIGHT)
+	if (!blk_mq_mark_complete(rq))
 		return;
-
 	if (rq->internal_tag != -1)
 		blk_mq_sched_completed_request(rq);
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 9b0fd11ce89a..4d41ed16314c 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -289,6 +289,20 @@ void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues);
 
 void blk_mq_quiesce_queue_nowait(struct request_queue *q);
 
+/**
+ * blk_mq_mark_complete() - Set request state to complete
+ * @rq: request to set to complete state
+ *
+ * Returns true if request state was successfully set to complete. If
+ * successful, the caller is responsibile for seeing this request is ended, as
+ * blk_mq_complete_request will not work again.
+ */
+static inline bool blk_mq_mark_complete(struct request *rq)
+{
+	return cmpxchg(&rq->state, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE) ==
+			MQ_RQ_IN_FLIGHT;
+}
+
 /*
  * Driver command data is immediately after the request. So subtract request
  * size to get back to the original request, add request size to get the PDU.
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-23 14:37 [PATCHv2 1/2] blk-mq: export setting request completion state Keith Busch
@ 2018-07-23 14:37 ` Keith Busch
  2018-07-24  7:56   ` Christoph Hellwig
                     ` (2 more replies)
  2018-07-24  7:56 ` [PATCHv2 1/2] blk-mq: export setting request completion state Christoph Hellwig
  2018-07-24 20:42 ` Jens Axboe
  2 siblings, 3 replies; 11+ messages in thread
From: Keith Busch @ 2018-07-23 14:37 UTC (permalink / raw)


The scsi block layer requires requests claimed by the error handling be
completed by the error handler. A previous commit allowed completions
to proceed for blk-mq, breaking that assumption.

This patch prevents completions that may race with the timeout handler
by marking the state to complete, restoring the previous behavior.

Fixes: 12f5b931 ("blk-mq: Remove generation seqeunce")
Signed-off-by: Keith Busch <keith.busch at intel.com>
---
v1 -> v2:

  Document why this is necessary in code comments.

  Update to API's changed return value

 drivers/scsi/scsi_error.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 8932ae81a15a..2715cdaa669c 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -296,6 +296,20 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 		rtn = host->hostt->eh_timed_out(scmd);
 
 	if (rtn == BLK_EH_DONE) {
+		/*
+		 * For blk-mq, we must set the request state to complete now
+		 * before sending the request to the scsi error handler. This
+		 * will prevent a use-after-free in the event the LLD manages
+		 * to complete the request before the error handler finishes
+		 * processing this timed out request.
+		 *
+		 * If the request was already completed, then the LLD beat the
+		 * time out handler from transferring the request to the scsi
+		 * error handler. In that case we can return immediately as no
+		 * further action is required.
+		 */
+		if (req->q->mq_ops && !blk_mq_mark_complete(req))
+			return rtn;
 		if (scsi_abort_command(scmd) != SUCCESS) {
 			set_host_byte(scmd, DID_TIME_OUT);
 			scsi_eh_scmd_add(scmd);
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCHv2 1/2] blk-mq: export setting request completion state
  2018-07-23 14:37 [PATCHv2 1/2] blk-mq: export setting request completion state Keith Busch
  2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
@ 2018-07-24  7:56 ` Christoph Hellwig
  2018-07-24 20:42 ` Jens Axboe
  2 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2018-07-24  7:56 UTC (permalink / raw)


Looks good,

Reviewed-by: Christoph Hellwig <hch at lst.de>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
@ 2018-07-24  7:56   ` Christoph Hellwig
  2018-07-24 22:46   ` Bart Van Assche
  2018-07-25 15:52   ` Bart Van Assche
  2 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2018-07-24  7:56 UTC (permalink / raw)


On Mon, Jul 23, 2018@08:37:51AM -0600, Keith Busch wrote:
> The scsi block layer requires requests claimed by the error handling be
> completed by the error handler. A previous commit allowed completions
> to proceed for blk-mq, breaking that assumption.
> 
> This patch prevents completions that may race with the timeout handler
> by marking the state to complete, restoring the previous behavior.
> 
> Fixes: 12f5b931 ("blk-mq: Remove generation seqeunce")
> Signed-off-by: Keith Busch <keith.busch at intel.com>

Looks good enough (reluctantly) for this merge window:

Reviewed-by: Christoph Hellwig <hch at lst.de>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCHv2 1/2] blk-mq: export setting request completion state
  2018-07-23 14:37 [PATCHv2 1/2] blk-mq: export setting request completion state Keith Busch
  2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
  2018-07-24  7:56 ` [PATCHv2 1/2] blk-mq: export setting request completion state Christoph Hellwig
@ 2018-07-24 20:42 ` Jens Axboe
  2 siblings, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2018-07-24 20:42 UTC (permalink / raw)


On 7/23/18 7:37 AM, Keith Busch wrote:
> This is preparing for drivers that want to directly alter the state of
> their requests. No functional change here.

Added this (and 2/2) for 4.18...

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
  2018-07-24  7:56   ` Christoph Hellwig
@ 2018-07-24 22:46   ` Bart Van Assche
  2018-07-25  1:15     ` Keith Busch
  2018-07-25 15:52   ` Bart Van Assche
  2 siblings, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2018-07-24 22:46 UTC (permalink / raw)


On 07/23/18 07:37, Keith Busch wrote:
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 8932ae81a15a..2715cdaa669c 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -296,6 +296,20 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
>   		rtn = host->hostt->eh_timed_out(scmd);
>   
>   	if (rtn == BLK_EH_DONE) {
> +		/*
> +		 * For blk-mq, we must set the request state to complete now
> +		 * before sending the request to the scsi error handler. This
> +		 * will prevent a use-after-free in the event the LLD manages
> +		 * to complete the request before the error handler finishes
> +		 * processing this timed out request.
> +		 *
> +		 * If the request was already completed, then the LLD beat the
> +		 * time out handler from transferring the request to the scsi
> +		 * error handler. In that case we can return immediately as no
> +		 * further action is required.
> +		 */
> +		if (req->q->mq_ops && !blk_mq_mark_complete(req))
> +			return rtn;
>   		if (scsi_abort_command(scmd) != SUCCESS) {
>   			set_host_byte(scmd, DID_TIME_OUT);
>   			scsi_eh_scmd_add(scmd);

This change looks incomplete to me. I think the following scenario is 
not handled properly by the above patch:
- host->hostt->eh_timed_out() gets called.
- The request "req" completes from another context while
   host->hostt->eh_timed_out() is in progress.
- host->hostt->eh_timed_out() calls scsi_finish_command().

I think that scenario will lead to a double completion.

Bart.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-24 22:46   ` Bart Van Assche
@ 2018-07-25  1:15     ` Keith Busch
  2018-07-25  1:56       ` Douglas Gilbert
  0 siblings, 1 reply; 11+ messages in thread
From: Keith Busch @ 2018-07-25  1:15 UTC (permalink / raw)


On Tue, Jul 24, 2018@03:46:17PM -0700, Bart Van Assche wrote:
> On 07/23/18 07:37, Keith Busch wrote:
> > +		if (req->q->mq_ops && !blk_mq_mark_complete(req))
> > +			return rtn;
> >   		if (scsi_abort_command(scmd) != SUCCESS) {
> >   			set_host_byte(scmd, DID_TIME_OUT);
> >   			scsi_eh_scmd_add(scmd);
> 
> This change looks incomplete to me. I think the following scenario is not
> handled properly by the above patch:
> - host->hostt->eh_timed_out() gets called.
> - The request "req" completes from another context while
>   host->hostt->eh_timed_out() is in progress.
> - host->hostt->eh_timed_out() calls scsi_finish_command().
> 
> I think that scenario will lead to a double completion.

A bit of a moot point, isn't it? Not a single scsi lld directly calls
scsi_finish_command() from anywhere, much less through eh_timed_out().

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-25  1:15     ` Keith Busch
@ 2018-07-25  1:56       ` Douglas Gilbert
  2018-07-25  2:48         ` Keith Busch
  0 siblings, 1 reply; 11+ messages in thread
From: Douglas Gilbert @ 2018-07-25  1:56 UTC (permalink / raw)

On 2018-07-24 09:15 PM, Keith Busch wrote:
> On Tue, Jul 24, 2018@03:46:17PM -0700, Bart Van Assche wrote:
>> On 07/23/18 07:37, Keith Busch wrote:
>>> +		if (req->q->mq_ops && !blk_mq_mark_complete(req))
>>> +			return rtn;
>>>    		if (scsi_abort_command(scmd) != SUCCESS) {
>>>    			set_host_byte(scmd, DID_TIME_OUT);
>>>    			scsi_eh_scmd_add(scmd);
>>
>> This change looks incomplete to me. I think the following scenario is not
>> handled properly by the above patch:
>> - host->hostt->eh_timed_out() gets called.
>> - The request "req" completes from another context while
>>    host->hostt->eh_timed_out() is in progress.
>> - host->hostt->eh_timed_out() calls scsi_finish_command().
>>
>> I think that scenario will lead to a double completion.
> 
> A bit of a moot point, isn't it? Not a single scsi lld directly calls
> scsi_finish_command() from anywhere, much less through eh_timed_out().

" * scsi_finish_command - cleanup and pass command back to upper layer"

That is from the comment block above the definition of scsi_finish_command().
To me that means the mid-level has already got the response from the LLD
(directly via a queuecommand() return, or asynchronously via the scsi_done()
callback) and this function will call the "upper layer" which will be
the one of the sd, sr, st, sg, ses, etc drivers that originated the command
for that UL-driver's completion.

In USB Type C speak, the SCSI mid level has two interfaces, one downward
facing (toward LLDs) and one upward facing (toward the drivers listed above).
No LLD should be calling scsi_finish_command() IMO.

Doug Gilbert

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-25  1:56       ` Douglas Gilbert
@ 2018-07-25  2:48         ` Keith Busch
  0 siblings, 0 replies; 11+ messages in thread
From: Keith Busch @ 2018-07-25  2:48 UTC (permalink / raw)


On Tue, Jul 24, 2018@09:56:35PM -0400, dgilbert@interlog.com wrote:
> No LLD should be calling scsi_finish_command() IMO.

Agreed. I don't even think scsi's error handler should directly call it
either, but detangling that may take some time, if ever.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
  2018-07-24  7:56   ` Christoph Hellwig
  2018-07-24 22:46   ` Bart Van Assche
@ 2018-07-25 15:52   ` Bart Van Assche
  2018-07-25 16:48     ` Keith Busch
  2 siblings, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2018-07-25 15:52 UTC (permalink / raw)


On Mon, 2018-07-23@08:37 -0600, Keith Busch wrote:
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 8932ae81a15a..2715cdaa669c 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -296,6 +296,20 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
>  		rtn = host->hostt->eh_timed_out(scmd);
>  
>  	if (rtn == BLK_EH_DONE) {
> +		/*
> +		 * For blk-mq, we must set the request state to complete now
> +		 * before sending the request to the scsi error handler. This
> +		 * will prevent a use-after-free in the event the LLD manages
> +		 * to complete the request before the error handler finishes
> +		 * processing this timed out request.
> +		 *
> +		 * If the request was already completed, then the LLD beat the
> +		 * time out handler from transferring the request to the scsi
> +		 * error handler. In that case we can return immediately as no
> +		 * further action is required.
> +		 */
> +		if (req->q->mq_ops && !blk_mq_mark_complete(req))
> +			return rtn;
>  		if (scsi_abort_command(scmd) != SUCCESS) {
>  			set_host_byte(scmd, DID_TIME_OUT);
>  			scsi_eh_scmd_add(scmd);

Hello Keith,

What will happen if a completion occurs after scsi_times_out() has started and
before or during the host->hostt->eh_timed_out()? Can that cause a use-after-free
in .eh_timed_out()? Can that cause .eh_timed_out() to return BLK_EH_RESET_TIMER
when it should return BLK_EH_DONE? Can that cause blk_mq_rq_timed_out() to call
blk_add_timer() when that function shouldn't be called?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-25 15:52   ` Bart Van Assche
@ 2018-07-25 16:48     ` Keith Busch
  0 siblings, 0 replies; 11+ messages in thread
From: Keith Busch @ 2018-07-25 16:48 UTC (permalink / raw)


On Wed, Jul 25, 2018@03:52:17PM +0000, Bart Van Assche wrote:
> On Mon, 2018-07-23@08:37 -0600, Keith Busch wrote:
> > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > index 8932ae81a15a..2715cdaa669c 100644
> > --- a/drivers/scsi/scsi_error.c
> > +++ b/drivers/scsi/scsi_error.c
> > @@ -296,6 +296,20 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
> >  		rtn = host->hostt->eh_timed_out(scmd);
> >  
> >  	if (rtn == BLK_EH_DONE) {
> > +		/*
> > +		 * For blk-mq, we must set the request state to complete now
> > +		 * before sending the request to the scsi error handler. This
> > +		 * will prevent a use-after-free in the event the LLD manages
> > +		 * to complete the request before the error handler finishes
> > +		 * processing this timed out request.
> > +		 *
> > +		 * If the request was already completed, then the LLD beat the
> > +		 * time out handler from transferring the request to the scsi
> > +		 * error handler. In that case we can return immediately as no
> > +		 * further action is required.
> > +		 */
> > +		if (req->q->mq_ops && !blk_mq_mark_complete(req))
> > +			return rtn;
> >  		if (scsi_abort_command(scmd) != SUCCESS) {
> >  			set_host_byte(scmd, DID_TIME_OUT);
> >  			scsi_eh_scmd_add(scmd);
> 
> Hello Keith,
> 
> What will happen if a completion occurs after scsi_times_out() has started and
> before or during the host->hostt->eh_timed_out()? Can that cause a use-after-free
> in .eh_timed_out()? Can that cause .eh_timed_out() to return BLK_EH_RESET_TIMER
> when it should return BLK_EH_DONE? Can that cause blk_mq_rq_timed_out() to call
> blk_add_timer() when that function shouldn't be called?

That's what the request's refcount protects. The whole point was that
driver returning RESET_TIMER doesn't lose the completion. In the worst
case scenario, the blk-mq timeout work spends some CPU cycles re-arming
a timer that it didn't need to.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-07-25 16:48 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-07-23 14:37 [PATCHv2 1/2] blk-mq: export setting request completion state Keith Busch
2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
2018-07-24  7:56   ` Christoph Hellwig
2018-07-24 22:46   ` Bart Van Assche
2018-07-25  1:15     ` Keith Busch
2018-07-25  1:56       ` Douglas Gilbert
2018-07-25  2:48         ` Keith Busch
2018-07-25 15:52   ` Bart Van Assche
2018-07-25 16:48     ` Keith Busch
2018-07-24  7:56 ` [PATCHv2 1/2] blk-mq: export setting request completion state Christoph Hellwig
2018-07-24 20:42 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).