public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCHv2 1/2] blk-mq: export setting request completion state
@ 2018-07-23 14:37 Keith Busch
  2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Keith Busch @ 2018-07-23 14:37 UTC (permalink / raw)
  To: linux-block, linux-scsi
  Cc: linux-nvme, Christoph Hellwig, Jens Axboe, Jianchao Wang,
	Bart Van Assche, Keith Busch

This is preparing for drivers that want to directly alter the state of
their requests. No functional change here.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
v1 -> v2:

  Reversed the return value: 'true' for success instead of failure.

  Document the caller's responsibilities after a successful state transtion.

 block/blk-mq.c         |  4 +---
 include/linux/blk-mq.h | 14 ++++++++++++++
 2 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 22326612a5d3..8f01cd7fd182 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -558,10 +558,8 @@ static void __blk_mq_complete_request(struct request *rq)
 	bool shared = false;
 	int cpu;
 
-	if (cmpxchg(&rq->state, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE) !=
-			MQ_RQ_IN_FLIGHT)
+	if (!blk_mq_mark_complete(rq))
 		return;
-
 	if (rq->internal_tag != -1)
 		blk_mq_sched_completed_request(rq);
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 9b0fd11ce89a..4d41ed16314c 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -289,6 +289,20 @@ void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues);
 
 void blk_mq_quiesce_queue_nowait(struct request_queue *q);
 
+/**
+ * blk_mq_mark_complete() - Set request state to complete
+ * @rq: request to set to complete state
+ *
+ * Returns true if request state was successfully set to complete. If
+ * successful, the caller is responsibile for seeing this request is ended, as
+ * blk_mq_complete_request will not work again.
+ */
+static inline bool blk_mq_mark_complete(struct request *rq)
+{
+	return cmpxchg(&rq->state, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE) ==
+			MQ_RQ_IN_FLIGHT;
+}
+
 /*
  * Driver command data is immediately after the request. So subtract request
  * size to get back to the original request, add request size to get the PDU.
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-23 14:37 [PATCHv2 1/2] blk-mq: export setting request completion state Keith Busch
@ 2018-07-23 14:37 ` Keith Busch
  2018-07-24  7:56   ` Christoph Hellwig
                     ` (2 more replies)
  2018-07-24  7:56 ` [PATCHv2 1/2] blk-mq: export setting request completion state Christoph Hellwig
  2018-07-24 20:42 ` Jens Axboe
  2 siblings, 3 replies; 11+ messages in thread
From: Keith Busch @ 2018-07-23 14:37 UTC (permalink / raw)
  To: linux-block, linux-scsi
  Cc: linux-nvme, Christoph Hellwig, Jens Axboe, Jianchao Wang,
	Bart Van Assche, Keith Busch

The scsi block layer requires requests claimed by the error handling be
completed by the error handler. A previous commit allowed completions
to proceed for blk-mq, breaking that assumption.

This patch prevents completions that may race with the timeout handler
by marking the state to complete, restoring the previous behavior.

Fixes: 12f5b931 ("blk-mq: Remove generation seqeunce")
Signed-off-by: Keith Busch <keith.busch@intel.com>
---
v1 -> v2:

  Document why this is necessary in code comments.

  Update to API's changed return value

 drivers/scsi/scsi_error.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 8932ae81a15a..2715cdaa669c 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -296,6 +296,20 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 		rtn = host->hostt->eh_timed_out(scmd);
 
 	if (rtn == BLK_EH_DONE) {
+		/*
+		 * For blk-mq, we must set the request state to complete now
+		 * before sending the request to the scsi error handler. This
+		 * will prevent a use-after-free in the event the LLD manages
+		 * to complete the request before the error handler finishes
+		 * processing this timed out request.
+		 *
+		 * If the request was already completed, then the LLD beat the
+		 * time out handler from transferring the request to the scsi
+		 * error handler. In that case we can return immediately as no
+		 * further action is required.
+		 */
+		if (req->q->mq_ops && !blk_mq_mark_complete(req))
+			return rtn;
 		if (scsi_abort_command(scmd) != SUCCESS) {
 			set_host_byte(scmd, DID_TIME_OUT);
 			scsi_eh_scmd_add(scmd);
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 1/2] blk-mq: export setting request completion state
  2018-07-23 14:37 [PATCHv2 1/2] blk-mq: export setting request completion state Keith Busch
  2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
@ 2018-07-24  7:56 ` Christoph Hellwig
  2018-07-24 20:42 ` Jens Axboe
  2 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2018-07-24  7:56 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-block, linux-scsi, linux-nvme, Christoph Hellwig,
	Jens Axboe, Jianchao Wang, Bart Van Assche

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
@ 2018-07-24  7:56   ` Christoph Hellwig
  2018-07-24 22:46   ` Bart Van Assche
  2018-07-25 15:52   ` Bart Van Assche
  2 siblings, 0 replies; 11+ messages in thread
From: Christoph Hellwig @ 2018-07-24  7:56 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-block, linux-scsi, linux-nvme, Christoph Hellwig,
	Jens Axboe, Jianchao Wang, Bart Van Assche

On Mon, Jul 23, 2018 at 08:37:51AM -0600, Keith Busch wrote:
> The scsi block layer requires requests claimed by the error handling be
> completed by the error handler. A previous commit allowed completions
> to proceed for blk-mq, breaking that assumption.
> 
> This patch prevents completions that may race with the timeout handler
> by marking the state to complete, restoring the previous behavior.
> 
> Fixes: 12f5b931 ("blk-mq: Remove generation seqeunce")
> Signed-off-by: Keith Busch <keith.busch@intel.com>

Looks good enough (reluctantly) for this merge window:

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 1/2] blk-mq: export setting request completion state
  2018-07-23 14:37 [PATCHv2 1/2] blk-mq: export setting request completion state Keith Busch
  2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
  2018-07-24  7:56 ` [PATCHv2 1/2] blk-mq: export setting request completion state Christoph Hellwig
@ 2018-07-24 20:42 ` Jens Axboe
  2 siblings, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2018-07-24 20:42 UTC (permalink / raw)
  To: Keith Busch, linux-block, linux-scsi
  Cc: linux-nvme, Christoph Hellwig, Jianchao Wang, Bart Van Assche

On 7/23/18 7:37 AM, Keith Busch wrote:
> This is preparing for drivers that want to directly alter the state of
> their requests. No functional change here.

Added this (and 2/2) for 4.18...

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
  2018-07-24  7:56   ` Christoph Hellwig
@ 2018-07-24 22:46   ` Bart Van Assche
  2018-07-25  1:15     ` Keith Busch
  2018-07-25 15:52   ` Bart Van Assche
  2 siblings, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2018-07-24 22:46 UTC (permalink / raw)
  To: Keith Busch, linux-block, linux-scsi
  Cc: Jens Axboe, linux-nvme, Jianchao Wang, Christoph Hellwig

On 07/23/18 07:37, Keith Busch wrote:
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 8932ae81a15a..2715cdaa669c 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -296,6 +296,20 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
>   		rtn = host->hostt->eh_timed_out(scmd);
>   
>   	if (rtn == BLK_EH_DONE) {
> +		/*
> +		 * For blk-mq, we must set the request state to complete now
> +		 * before sending the request to the scsi error handler. This
> +		 * will prevent a use-after-free in the event the LLD manages
> +		 * to complete the request before the error handler finishes
> +		 * processing this timed out request.
> +		 *
> +		 * If the request was already completed, then the LLD beat the
> +		 * time out handler from transferring the request to the scsi
> +		 * error handler. In that case we can return immediately as no
> +		 * further action is required.
> +		 */
> +		if (req->q->mq_ops && !blk_mq_mark_complete(req))
> +			return rtn;
>   		if (scsi_abort_command(scmd) != SUCCESS) {
>   			set_host_byte(scmd, DID_TIME_OUT);
>   			scsi_eh_scmd_add(scmd);

This change looks incomplete to me. I think the following scenario is 
not handled properly by the above patch:
- host->hostt->eh_timed_out() gets called.
- The request "req" completes from another context while
   host->hostt->eh_timed_out() is in progress.
- host->hostt->eh_timed_out() calls scsi_finish_command().

I think that scenario will lead to a double completion.

Bart.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-24 22:46   ` Bart Van Assche
@ 2018-07-25  1:15     ` Keith Busch
  2018-07-25  1:56       ` Douglas Gilbert
  0 siblings, 1 reply; 11+ messages in thread
From: Keith Busch @ 2018-07-25  1:15 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Keith Busch, linux-block, linux-scsi, Jens Axboe, Jianchao Wang,
	linux-nvme, Christoph Hellwig

On Tue, Jul 24, 2018 at 03:46:17PM -0700, Bart Van Assche wrote:
> On 07/23/18 07:37, Keith Busch wrote:
> > +		if (req->q->mq_ops && !blk_mq_mark_complete(req))
> > +			return rtn;
> >   		if (scsi_abort_command(scmd) != SUCCESS) {
> >   			set_host_byte(scmd, DID_TIME_OUT);
> >   			scsi_eh_scmd_add(scmd);
> 
> This change looks incomplete to me. I think the following scenario is not
> handled properly by the above patch:
> - host->hostt->eh_timed_out() gets called.
> - The request "req" completes from another context while
>   host->hostt->eh_timed_out() is in progress.
> - host->hostt->eh_timed_out() calls scsi_finish_command().
> 
> I think that scenario will lead to a double completion.

A bit of a moot point, isn't it? Not a single scsi lld directly calls
scsi_finish_command() from anywhere, much less through eh_timed_out().

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-25  1:15     ` Keith Busch
@ 2018-07-25  1:56       ` Douglas Gilbert
  2018-07-25  2:48         ` Keith Busch
  0 siblings, 1 reply; 11+ messages in thread
From: Douglas Gilbert @ 2018-07-25  1:56 UTC (permalink / raw)
  To: Keith Busch, Bart Van Assche
  Cc: Keith Busch, linux-block, linux-scsi, Jens Axboe, Jianchao Wang,
	linux-nvme, Christoph Hellwig

On 2018-07-24 09:15 PM, Keith Busch wrote:
> On Tue, Jul 24, 2018 at 03:46:17PM -0700, Bart Van Assche wrote:
>> On 07/23/18 07:37, Keith Busch wrote:
>>> +		if (req->q->mq_ops && !blk_mq_mark_complete(req))
>>> +			return rtn;
>>>    		if (scsi_abort_command(scmd) != SUCCESS) {
>>>    			set_host_byte(scmd, DID_TIME_OUT);
>>>    			scsi_eh_scmd_add(scmd);
>>
>> This change looks incomplete to me. I think the following scenario is not
>> handled properly by the above patch:
>> - host->hostt->eh_timed_out() gets called.
>> - The request "req" completes from another context while
>>    host->hostt->eh_timed_out() is in progress.
>> - host->hostt->eh_timed_out() calls scsi_finish_command().
>>
>> I think that scenario will lead to a double completion.
> 
> A bit of a moot point, isn't it? Not a single scsi lld directly calls
> scsi_finish_command() from anywhere, much less through eh_timed_out().

" * scsi_finish_command - cleanup and pass command back to upper layer"

That is from the comment block above the definition of scsi_finish_command().
To me that means the mid-level has already got the response from the LLD
(directly via a queuecommand() return, or asynchronously via the scsi_done()
callback) and this function will call the "upper layer" which will be
the one of the sd, sr, st, sg, ses, etc drivers that originated the command
for that UL-driver's completion.

In USB Type C speak, the SCSI mid level has two interfaces, one downward
facing (toward LLDs) and one upward facing (toward the drivers listed above).
No LLD should be calling scsi_finish_command() IMO.

Doug Gilbert

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-25  1:56       ` Douglas Gilbert
@ 2018-07-25  2:48         ` Keith Busch
  0 siblings, 0 replies; 11+ messages in thread
From: Keith Busch @ 2018-07-25  2:48 UTC (permalink / raw)
  To: dgilbert
  Cc: Bart Van Assche, Keith Busch, linux-block, linux-scsi, Jens Axboe,
	Jianchao Wang, linux-nvme, Christoph Hellwig

On Tue, Jul 24, 2018 at 09:56:35PM -0400, dgilbert@interlog.com wrote:
> No LLD should be calling scsi_finish_command() IMO.

Agreed. I don't even think scsi's error handler should directly call it
either, but detangling that may take some time, if ever.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
  2018-07-24  7:56   ` Christoph Hellwig
  2018-07-24 22:46   ` Bart Van Assche
@ 2018-07-25 15:52   ` Bart Van Assche
  2018-07-25 16:48     ` Keith Busch
  2 siblings, 1 reply; 11+ messages in thread
From: Bart Van Assche @ 2018-07-25 15:52 UTC (permalink / raw)
  To: linux-scsi@vger.kernel.org, keith.busch@intel.com,
	linux-block@vger.kernel.org
  Cc: hch@lst.de, linux-nvme@lists.infradead.org, axboe@kernel.dk,
	jianchao.w.wang@oracle.com

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-7", Size: 2024 bytes --]

On Mon, 2018-07-23 at 08:37 -0600, Keith Busch wrote:
+AD4- diff --git a/drivers/scsi/scsi+AF8-error.c b/drivers/scsi/scsi+AF8-er=
ror.c
+AD4- index 8932ae81a15a..2715cdaa669c 100644
+AD4- --- a/drivers/scsi/scsi+AF8-error.c
+AD4- +-+-+- b/drivers/scsi/scsi+AF8-error.c
+AD4- +AEAAQA- -296,6 +-296,20 +AEAAQA- enum blk+AF8-eh+AF8-timer+AF8-retur=
n scsi+AF8-times+AF8-out(struct request +ACo-req)
+AD4-  		rtn +AD0- host-+AD4-hostt-+AD4-eh+AF8-timed+AF8-out(scmd)+ADs-
+AD4- =20
+AD4-  	if (rtn +AD0APQ- BLK+AF8-EH+AF8-DONE) +AHs-
+AD4- +-		/+ACo-
+AD4- +-		 +ACo- For blk-mq, we must set the request state to complete now
+AD4- +-		 +ACo- before sending the request to the scsi error handler. This
+AD4- +-		 +ACo- will prevent a use-after-free in the event the LLD manages
+AD4- +-		 +ACo- to complete the request before the error handler finishes
+AD4- +-		 +ACo- processing this timed out request.
+AD4- +-		 +ACo-
+AD4- +-		 +ACo- If the request was already completed, then the LLD beat th=
e
+AD4- +-		 +ACo- time out handler from transferring the request to the scsi
+AD4- +-		 +ACo- error handler. In that case we can return immediately as n=
o
+AD4- +-		 +ACo- further action is required.
+AD4- +-		 +ACo-/
+AD4- +-		if (req-+AD4-q-+AD4-mq+AF8-ops +ACYAJg- +ACE-blk+AF8-mq+AF8-mark+=
AF8-complete(req))
+AD4- +-			return rtn+ADs-
+AD4-  		if (scsi+AF8-abort+AF8-command(scmd) +ACEAPQ- SUCCESS) +AHs-
+AD4-  			set+AF8-host+AF8-byte(scmd, DID+AF8-TIME+AF8-OUT)+ADs-
+AD4-  			scsi+AF8-eh+AF8-scmd+AF8-add(scmd)+ADs-

Hello Keith,

What will happen if a completion occurs after scsi+AF8-times+AF8-out() has =
started and
before or during the host-+AD4-hostt-+AD4-eh+AF8-timed+AF8-out()? Can that =
cause a use-after-free
in .eh+AF8-timed+AF8-out()? Can that cause .eh+AF8-timed+AF8-out() to retur=
n BLK+AF8-EH+AF8-RESET+AF8-TIMER
when it should return BLK+AF8-EH+AF8-DONE? Can that cause blk+AF8-mq+AF8-rq=
+AF8-timed+AF8-out() to call
blk+AF8-add+AF8-timer() when that function shouldn't be called?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCHv2 2/2] scsi: set timed out out mq requests to complete
  2018-07-25 15:52   ` Bart Van Assche
@ 2018-07-25 16:48     ` Keith Busch
  0 siblings, 0 replies; 11+ messages in thread
From: Keith Busch @ 2018-07-25 16:48 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi@vger.kernel.org, keith.busch@intel.com,
	linux-block@vger.kernel.org, axboe@kernel.dk,
	jianchao.w.wang@oracle.com, hch@lst.de,
	linux-nvme@lists.infradead.org

On Wed, Jul 25, 2018 at 03:52:17PM +0000, Bart Van Assche wrote:
> On Mon, 2018-07-23 at 08:37 -0600, Keith Busch wrote:
> > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > index 8932ae81a15a..2715cdaa669c 100644
> > --- a/drivers/scsi/scsi_error.c
> > +++ b/drivers/scsi/scsi_error.c
> > @@ -296,6 +296,20 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
> >  		rtn = host->hostt->eh_timed_out(scmd);
> >  
> >  	if (rtn == BLK_EH_DONE) {
> > +		/*
> > +		 * For blk-mq, we must set the request state to complete now
> > +		 * before sending the request to the scsi error handler. This
> > +		 * will prevent a use-after-free in the event the LLD manages
> > +		 * to complete the request before the error handler finishes
> > +		 * processing this timed out request.
> > +		 *
> > +		 * If the request was already completed, then the LLD beat the
> > +		 * time out handler from transferring the request to the scsi
> > +		 * error handler. In that case we can return immediately as no
> > +		 * further action is required.
> > +		 */
> > +		if (req->q->mq_ops && !blk_mq_mark_complete(req))
> > +			return rtn;
> >  		if (scsi_abort_command(scmd) != SUCCESS) {
> >  			set_host_byte(scmd, DID_TIME_OUT);
> >  			scsi_eh_scmd_add(scmd);
> 
> Hello Keith,
> 
> What will happen if a completion occurs after scsi_times_out() has started and
> before or during the host->hostt->eh_timed_out()? Can that cause a use-after-free
> in .eh_timed_out()? Can that cause .eh_timed_out() to return BLK_EH_RESET_TIMER
> when it should return BLK_EH_DONE? Can that cause blk_mq_rq_timed_out() to call
> blk_add_timer() when that function shouldn't be called?

That's what the request's refcount protects. The whole point was that
driver returning RESET_TIMER doesn't lose the completion. In the worst
case scenario, the blk-mq timeout work spends some CPU cycles re-arming
a timer that it didn't need to.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-07-25 16:48 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-07-23 14:37 [PATCHv2 1/2] blk-mq: export setting request completion state Keith Busch
2018-07-23 14:37 ` [PATCHv2 2/2] scsi: set timed out out mq requests to complete Keith Busch
2018-07-24  7:56   ` Christoph Hellwig
2018-07-24 22:46   ` Bart Van Assche
2018-07-25  1:15     ` Keith Busch
2018-07-25  1:56       ` Douglas Gilbert
2018-07-25  2:48         ` Keith Busch
2018-07-25 15:52   ` Bart Van Assche
2018-07-25 16:48     ` Keith Busch
2018-07-24  7:56 ` [PATCHv2 1/2] blk-mq: export setting request completion state Christoph Hellwig
2018-07-24 20:42 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox