[PATCH 1/2] scsi: Do not rely on blk-mq for double completions

public inbox for linux-block@vger.kernel.org

* [PATCH 1/2] scsi: Do not rely on blk-mq for double completions
@ 2018-11-13 18:57 Keith Busch
  2018-11-13 18:57 ` [PATCH 2/2] blk-mq: Simplify request completion state Keith Busch
  2018-11-13 19:20 ` [PATCH 1/2] scsi: Do not rely on blk-mq for double completions Jens Axboe
  0 siblings, 2 replies; 5+ messages in thread
From: Keith Busch @ 2018-11-13 18:57 UTC (permalink / raw)
  To: linux-scsi, linux-block; +Cc: Keith Busch

The scsi timeout error handling had been directly updating the request
state to prevent a natural completion and error handling from completing
the same request twice. Fix this layering violation by having scsi
control the fate of its commands with scsi-owned flags rather than
blk-mq's request state.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 drivers/scsi/scsi_error.c | 17 +++--------------
 drivers/scsi/scsi_lib.c   | 11 +++++++++++
 include/scsi/scsi_cmnd.h  |  5 ++++-
 3 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index dd338a8cd275..6156f45c2c80 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -204,6 +204,9 @@ scsi_abort_command(struct scsi_cmnd *scmd)
 		shost->last_reset = jiffies;
 	spin_unlock_irqrestore(shost->host_lock, flags);
 
+	if (test_and_set_bit(__SCMD_COMPLETE, &scmd->flags))
+		return SUCCESS;
+
 	scmd->eh_eflags |= SCSI_EH_ABORT_SCHEDULED;
 	SCSI_LOG_ERROR_RECOVERY(3,
 		scmd_printk(KERN_INFO, scmd, "abort scheduled\n"));
@@ -296,20 +299,6 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 		rtn = host->hostt->eh_timed_out(scmd);
 
 	if (rtn == BLK_EH_DONE) {
-		/*
-		 * For blk-mq, we must set the request state to complete now
-		 * before sending the request to the scsi error handler. This
-		 * will prevent a use-after-free in the event the LLD manages
-		 * to complete the request before the error handler finishes
-		 * processing this timed out request.
-		 *
-		 * If the request was already completed, then the LLD beat the
-		 * time out handler from transferring the request to the scsi
-		 * error handler. In that case we can return immediately as no
-		 * further action is required.
-		 */
-		if (!blk_mq_mark_complete(req))
-			return rtn;
 		if (scsi_abort_command(scmd) != SUCCESS) {
 			set_host_byte(scmd, DID_TIME_OUT);
 			scsi_eh_scmd_add(scmd);
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 61babcb269ab..c680171ca201 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1635,8 +1635,18 @@ static blk_status_t scsi_mq_prep_fn(struct request *req)
 
 static void scsi_mq_done(struct scsi_cmnd *cmd)
 {
+	if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
+		return;
 	trace_scsi_dispatch_cmd_done(cmd);
 	blk_mq_complete_request(cmd->request);
+
+#ifdef CONFIG_FAIL_IO_TIMEOUT
+	/*
+	 * Clearing complete here serves only to allow the desired recovery to
+	 * escalate on blk_rq_should_fake_timeout()'s error injection.
+	 */
+	clear_bit(__SCMD_COMPLETE, &cmd->flags);
+#endif
 }
 
 static void scsi_mq_put_budget(struct blk_mq_hw_ctx *hctx)
@@ -1701,6 +1711,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 			goto out_dec_host_busy;
 		req->rq_flags |= RQF_DONTPREP;
 	} else {
+		cmd->flags &= ~SCMD_COMPLETE;
 		blk_mq_start_request(req);
 	}
 
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index d6fd2aba0380..ded7c7194a28 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -58,6 +58,9 @@ struct scsi_pointer {
 #define SCMD_TAGGED		(1 << 0)
 #define SCMD_UNCHECKED_ISA_DMA	(1 << 1)
 #define SCMD_INITIALIZED	(1 << 2)
+
+#define __SCMD_COMPLETE		3
+#define SCMD_COMPLETE		(1 << __SCMD_COMPLETE)
 /* flags preserved across unprep / reprep */
 #define SCMD_PRESERVED_FLAGS	(SCMD_UNCHECKED_ISA_DMA | SCMD_INITIALIZED)
 
@@ -144,7 +147,7 @@ struct scsi_cmnd {
 					 * to be at an address < 16Mb). */
 
 	int result;		/* Status code from lower level driver */
-	int flags;		/* Command flags */
+	unsigned long flags;	/* Command flags */
 
 	unsigned char tag;	/* SCSI-II queued command tag */
 };
-- 
2.14.4
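
For readers following the thread, here is a minimal userspace sketch of the
ownership idea in this patch, assuming C11 atomics in place of the kernel's
test_and_set_bit(). The names model_cmd, model_init(), model_test_and_set(),
model_done() and model_timeout_claims() are hypothetical stand-ins, not kernel
API. Whichever path sets the bit first owns the command; the other path sees
the bit already set and backs off.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for SCMD_COMPLETE (bit 3 in the patch). */
#define MODEL_CMD_COMPLETE (1UL << 3)

struct model_cmd {
	atomic_ulong flags;	/* plays the role of scsi_cmnd.flags */
	int completions;	/* how many times the command was ended */
};

static void model_init(struct model_cmd *cmd)
{
	atomic_init(&cmd->flags, 0);
	cmd->completions = 0;
}

/* Userspace analogue of test_and_set_bit(): returns the old bit value. */
static bool model_test_and_set(struct model_cmd *cmd)
{
	unsigned long old = atomic_fetch_or(&cmd->flags, MODEL_CMD_COMPLETE);

	return (old & MODEL_CMD_COMPLETE) != 0;
}

/* Normal completion path, modelled on scsi_mq_done(). */
static void model_done(struct model_cmd *cmd)
{
	if (model_test_and_set(cmd))
		return;		/* the timeout path already owns the command */
	cmd->completions++;
}

/*
 * Timeout path, modelled on the check added to scsi_abort_command().
 * Returns true if the timeout handler won ownership of the command.
 */
static bool model_timeout_claims(struct model_cmd *cmd)
{
	return !model_test_and_set(cmd);
}
```

Calling model_done() and then model_timeout_claims() on the same command
leaves the timeout path with nothing to do, and the reverse order leaves the
completion path with nothing to do. Because the claim is one atomic
read-modify-write, the same holds when the two paths run concurrently.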


* [PATCH 2/2] blk-mq: Simplify request completion state
  2018-11-13 18:57 [PATCH 1/2] scsi: Do not rely on blk-mq for double completions Keith Busch
@ 2018-11-13 18:57 ` Keith Busch
  2018-11-13 19:20 ` [PATCH 1/2] scsi: Do not rely on blk-mq for double completions Jens Axboe
  1 sibling, 0 replies; 5+ messages in thread
From: Keith Busch @ 2018-11-13 18:57 UTC (permalink / raw)
  To: linux-scsi, linux-block; +Cc: Keith Busch

There are no more users relying on blk-mq request states to prevent
double completions, so replace the relatively expensive cmpxchg operation
with WRITE_ONCE.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 block/blk-mq.c         |  4 +---
 include/linux/blk-mq.h | 14 --------------
 2 files changed, 1 insertion(+), 17 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 03b1af0151ca..6c546a021803 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -568,9 +568,7 @@ static void __blk_mq_complete_request(struct request *rq)
 	bool shared = false;
 	int cpu;
 
-	if (!blk_mq_mark_complete(rq))
-		return;
-
+	WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
 	/*
 	 * Most of single queue controllers, there is only one irq vector
 	 * for handling IO completion, and the only irq's affinity is set
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index e32e9293e5a0..1a857c6e17fa 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -237,20 +237,6 @@ void blk_mq_quiesce_queue_nowait(struct request_queue *q);
 
 unsigned int blk_mq_rq_cpu(struct request *rq);
 
-/**
- * blk_mq_mark_complete() - Set request state to complete
- * @rq: request to set to complete state
- *
- * Returns true if request state was successfully set to complete. If
- * successful, the caller is responsibile for seeing this request is ended, as
- * blk_mq_complete_request will not work again.
- */
-static inline bool blk_mq_mark_complete(struct request *rq)
-{
-	return cmpxchg(&rq->state, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE) ==
-			MQ_RQ_IN_FLIGHT;
-}
-
 /*
  * Driver command data is immediately after the request. So subtract request
  * size to get back to the original request, add request size to get the PDU.
-- 
2.14.4
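
As a rough illustration of what this patch trades away, the sketch below
contrasts the two ways of marking a request complete, assuming C11 atomics as
a userspace approximation of cmpxchg() and WRITE_ONCE(). The enum and function
names are hypothetical. The compare-and-exchange version reports whether the
caller won the IN_FLIGHT to COMPLETE transition; the plain store reports
nothing, so it is only safe once callers no longer depend on that answer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical mirror of the request states used by blk-mq. */
enum model_state { MODEL_IDLE, MODEL_IN_FLIGHT, MODEL_COMPLETE };

/*
 * Analogue of the removed blk_mq_mark_complete(): succeeds only when the
 * state was IN_FLIGHT, so exactly one caller can win the transition.
 */
static bool model_mark_complete_cmpxchg(atomic_int *state)
{
	int expected = MODEL_IN_FLIGHT;

	return atomic_compare_exchange_strong(state, &expected,
					      MODEL_COMPLETE);
}

/*
 * Approximation of WRITE_ONCE(rq->state, MQ_RQ_COMPLETE): an unconditional
 * store that gives the caller no information about the previous state.
 */
static void model_mark_complete_store(atomic_int *state)
{
	atomic_store_explicit(state, MODEL_COMPLETE, memory_order_relaxed);
}
```

A second call to model_mark_complete_cmpxchg() on the same state fails, which
is the double-completion guard that patch 1/2 moves into scsi.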


* Re: [PATCH 1/2] scsi: Do not rely on blk-mq for double completions
  2018-11-13 18:57 [PATCH 1/2] scsi: Do not rely on blk-mq for double completions Keith Busch
  2018-11-13 18:57 ` [PATCH 2/2] blk-mq: Simplify request completion state Keith Busch
@ 2018-11-13 19:20 ` Jens Axboe
  2018-11-13 19:45   ` Keith Busch
  1 sibling, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2018-11-13 19:20 UTC (permalink / raw)
  To: Keith Busch, linux-scsi, linux-block

On 11/13/18 11:57 AM, Keith Busch wrote:
> The scsi timeout error handling had been directly updating the request
> state to prevent a natural completion and error handling from completing
> the same request twice. Fix this layering violation by having scsi
> control the fate of its commands with scsi-owned flags rather than
> blk-mq's request state.
> 
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 61babcb269ab..c680171ca201 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1635,8 +1635,18 @@ static blk_status_t scsi_mq_prep_fn(struct request *req)
>  
>  static void scsi_mq_done(struct scsi_cmnd *cmd)
>  {
> +	if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
> +		return;
>  	trace_scsi_dispatch_cmd_done(cmd);
>  	blk_mq_complete_request(cmd->request);
> +
> +#ifdef CONFIG_FAIL_IO_TIMEOUT
> +	/*
> +	 * Clearing complete here serves only to allow the desired recovery to
> +	 * escalate on blk_rq_should_fake_timeout()'s error injection.
> +	 */
> +	clear_bit(__SCMD_COMPLETE, &cmd->flags);
> +#endif
>  }

We could have this be:

static void scsi_mq_done(struct scsi_cmnd *cmd)
{
	if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
		return;
	trace_scsi_dispatch_cmd_done(cmd);

	if (blk_mq_complete_request(cmd->request)) {
		/*
		 * Clearing complete here serves only to allow the
		 * desired recovery to escalate on
		 * blk_rq_should_fake_timeout()'s error injection.
		 */
		clear_bit(__SCMD_COMPLETE, &cmd->flags);
	}
}

with

bool blk_mq_complete_request(struct request *rq)
{
	if (unlikely(blk_should_fake_timeout(rq->q)))
		return true;
	__blk_mq_complete_request(rq);
	return false;
}

and not have this CONFIG_FAIL_IO_TIMEOUT dependency, but that'd be a bit
more expensive.

Was going to suggest a request flag, but the request is gone at this
point. So that won't really work...

I'm with your solution as well, fwiw.

-- 
Jens Axboe


* Re: [PATCH 1/2] scsi: Do not rely on blk-mq for double completions
  2018-11-13 19:20 ` [PATCH 1/2] scsi: Do not rely on blk-mq for double completions Jens Axboe
@ 2018-11-13 19:45   ` Keith Busch
  2018-11-14  4:35     ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2018-11-13 19:45 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-scsi, linux-block

On Tue, Nov 13, 2018 at 12:20:46PM -0700, Jens Axboe wrote:
> On 11/13/18 11:57 AM, Keith Busch wrote:
> >  static void scsi_mq_done(struct scsi_cmnd *cmd)
> >  {
> > +	if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
> > +		return;
> >  	trace_scsi_dispatch_cmd_done(cmd);
> >  	blk_mq_complete_request(cmd->request);
> > +
> > +#ifdef CONFIG_FAIL_IO_TIMEOUT
> > +	/*
> > +	 * Clearing complete here serves only to allow the desired recovery to
> > +	 * escalate on blk_rq_should_fake_timeout()'s error injection.
> > +	 */
> > +	clear_bit(__SCMD_COMPLETE, &cmd->flags);
> > +#endif
> >  }
> 
> We could have this be:
> 
> static void scsi_mq_done(struct scsi_cmnd *cmd)
> {
> 	if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
> 		return;
>  	trace_scsi_dispatch_cmd_done(cmd);
> 
>  	if (blk_mq_complete_request(cmd->request)) {
> 		/*
> 		 * Clearing complete here serves only to allow the
> 		 * desired recovery to escalate on
> 		 * blk_rq_should_fake_timeout()'s error injection.
> 		 */
> 		clear_bit(__SCMD_COMPLETE, &cmd->flags);
> 	}
> }
> 
> with
> 
> bool blk_mq_complete_request(struct request *rq)
> {
> 	if (unlikely(blk_should_fake_timeout(rq->q)))
> 		return true;
> 	__blk_mq_complete_request(rq);
> 	return false;
> }
> 
> and not have this CONFIG_FAIL_IO_TIMEOUT dependency, but that'd be a bit
> more expensive.

I was trying to avoid every cost no matter how negligible (those are the
only types of costs left as far as I can see), but I think your proposal
might actually be necessary: if a timeout wasn't faked, clearing the
completion flag unconditionally might have a problem with a real timeout
racing with the real completion. :(
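
The concern above can be walked through with a small single-threaded model,
assuming C11 atomics in place of the kernel bit operations. All names here
(model_cmd, model_done(), model_timeout(), and the timeout_faked and
clear_always parameters) are hypothetical and exist only for illustration.
The model replays one interleaving: a real completion finishes, then a late
timeout handler runs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for SCMD_COMPLETE (bit 3 in the patch). */
#define MODEL_CMD_COMPLETE (1UL << 3)

struct model_cmd {
	atomic_ulong flags;	/* plays the role of scsi_cmnd.flags */
	int completions;	/* real completions delivered */
	int aborts;		/* times the timeout path scheduled an abort */
};

static void model_init(struct model_cmd *cmd)
{
	atomic_init(&cmd->flags, 0);
	cmd->completions = 0;
	cmd->aborts = 0;
}

/*
 * Completion path. timeout_faked models error injection swallowing the
 * completion; clear_always models the unconditional clear_bit() in v1,
 * while !clear_always models clearing only when the timeout was faked.
 */
static void model_done(struct model_cmd *cmd, bool timeout_faked,
		       bool clear_always)
{
	if (atomic_fetch_or(&cmd->flags, MODEL_CMD_COMPLETE) &
	    MODEL_CMD_COMPLETE)
		return;
	if (!timeout_faked)
		cmd->completions++;
	if (clear_always || timeout_faked)
		atomic_fetch_and(&cmd->flags, ~MODEL_CMD_COMPLETE);
}

/* Timeout path: schedules an abort only if it wins the flag. */
static void model_timeout(struct model_cmd *cmd)
{
	if (atomic_fetch_or(&cmd->flags, MODEL_CMD_COMPLETE) &
	    MODEL_CMD_COMPLETE)
		return;
	cmd->aborts++;
}
```

With clear_always set, a real completion followed by model_timeout() ends
with both a completion and an abort recorded for the same command, which is
the double handling the flag was meant to prevent. Clearing only in the faked
case keeps the abort count at zero for real completions while still letting
injected timeouts escalate.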

* Re: [PATCH 1/2] scsi: Do not rely on blk-mq for double completions
  2018-11-13 19:45   ` Keith Busch
@ 2018-11-14  4:35     ` Jens Axboe
  0 siblings, 0 replies; 5+ messages in thread
From: Jens Axboe @ 2018-11-14  4:35 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-scsi, linux-block

On 11/13/18 12:45 PM, Keith Busch wrote:
> On Tue, Nov 13, 2018 at 12:20:46PM -0700, Jens Axboe wrote:
>> On 11/13/18 11:57 AM, Keith Busch wrote:
>>>  static void scsi_mq_done(struct scsi_cmnd *cmd)
>>>  {
>>> +	if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
>>> +		return;
>>>  	trace_scsi_dispatch_cmd_done(cmd);
>>>  	blk_mq_complete_request(cmd->request);
>>> +
>>> +#ifdef CONFIG_FAIL_IO_TIMEOUT
>>> +	/*
>>> +	 * Clearing complete here serves only to allow the desired recovery to
>>> +	 * escalate on blk_rq_should_fake_timeout()'s error injection.
>>> +	 */
>>> +	clear_bit(__SCMD_COMPLETE, &cmd->flags);
>>> +#endif
>>>  }
>>
>> We could have this be:
>>
>> static void scsi_mq_done(struct scsi_cmnd *cmd)
>> {
>> 	if (test_and_set_bit(__SCMD_COMPLETE, &cmd->flags))
>> 		return;
>>  	trace_scsi_dispatch_cmd_done(cmd);
>>
>>  	if (blk_mq_complete_request(cmd->request)) {
>> 		/*
>> 		 * Clearing complete here serves only to allow the
>> 		 * desired recovery to escalate on
>> 		 * blk_rq_should_fake_timeout()'s error injection.
>> 		 */
>> 		clear_bit(__SCMD_COMPLETE, &cmd->flags);
>> 	}
>> }
>>
>> with
>>
>> bool blk_mq_complete_request(struct request *rq)
>> {
>> 	if (unlikely(blk_should_fake_timeout(rq->q)))
>> 		return true;
>> 	__blk_mq_complete_request(rq);
>> 	return false;
>> }
>>
>> and not have this CONFIG_FAIL_IO_TIMEOUT dependency, but that'd be a bit
>> more expensive.
> 
> I was trying to avoid every cost no matter how negligible (those are the
> only types of costs left as far as I can see), but I think your proposal
> might actually be necessary: if a timeout wasn't faked, clearing the
> completion flag unconditionally might have a problem with a real timeout
> racing with the real completion. :(

Yep, I think you are right... No way around it then. Are you going to
resend it with the fix?

-- 
Jens Axboe

