linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] blk-mq: export setting request completion state
@ 2018-07-19 21:26 Keith Busch
  2018-07-19 21:26 ` [PATCH 2/2] scsi: set timed out out mq requests to complete Keith Busch
  2018-07-23  8:09 ` [PATCH 1/2] blk-mq: export setting request completion state Christoph Hellwig
  0 siblings, 2 replies; 18+ messages in thread
From: Keith Busch @ 2018-07-19 21:26 UTC (permalink / raw)
  To: linux-block, linux-scsi
  Cc: linux-nvme, Christoph Hellwig, Jens Axboe, Jianchao Wang,
	Bart Van Assche, Keith Busch

This is preparing for drivers that want to directly alter the state of
their requests. No functional change here.

Signed-off-by: Keith Busch <keith.busch@intel.com>
---
 block/blk-mq.c         | 4 +---
 include/linux/blk-mq.h | 9 +++++++++
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 22326612a5d3..f50559718b71 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -558,10 +558,8 @@ static void __blk_mq_complete_request(struct request *rq)
 	bool shared = false;
 	int cpu;
 
-	if (cmpxchg(&rq->state, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE) !=
-			MQ_RQ_IN_FLIGHT)
+	if (blk_mq_mark_complete(rq))
 		return;
-
 	if (rq->internal_tag != -1)
 		blk_mq_sched_completed_request(rq);
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 9b0fd11ce89a..0ae115e95476 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -289,6 +289,15 @@ void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues);
 
 void blk_mq_quiesce_queue_nowait(struct request_queue *q);
 
+/*
+ * Returns true if request was not in flight.
+ */
+static inline bool blk_mq_mark_complete(struct request *rq)
+{
+	return (cmpxchg(&rq->state, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE) !=
+			MQ_RQ_IN_FLIGHT);
+}
+
 /*
  * Driver command data is immediately after the request. So subtract request
  * size to get back to the original request, add request size to get the PDU.
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-19 21:26 [PATCH 1/2] blk-mq: export setting request completion state Keith Busch
@ 2018-07-19 21:26 ` Keith Busch
  2018-07-20  6:52   ` Johannes Thumshirn
                     ` (2 more replies)
  2018-07-23  8:09 ` [PATCH 1/2] blk-mq: export setting request completion state Christoph Hellwig
  1 sibling, 3 replies; 18+ messages in thread
From: Keith Busch @ 2018-07-19 21:26 UTC (permalink / raw)
  To: linux-block, linux-scsi
  Cc: linux-nvme, Christoph Hellwig, Jens Axboe, Jianchao Wang,
	Bart Van Assche, Keith Busch

The scsi block layer requires requests claimed be the error handling be
completed by the error handler. A previous commit allowed completions
to proceed for blk-mq, breaking that assumption.

This patch prevents completions that may race with the timeout handler
by marking the state to complete, restoring the previous behavior.

Fixes: 12f5b931 ("blk-mq: Remove generation seqeunce")
Signed-off-by: Keith Busch <keith.busch@intel.com>
---

Tested using Jianchao's abort injection to scsi_debug described here:

  https://lore.kernel.org/lkml/a68ad043-26a1-d3d8-2009-504ba4230e0f@oracle.com/

 drivers/scsi/scsi_error.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 8932ae81a15a..86ee10b2c775 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -286,6 +286,9 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 	enum blk_eh_timer_return rtn = BLK_EH_DONE;
 	struct Scsi_Host *host = scmd->device->host;
 
+	if (req->q->mq_ops && blk_mq_mark_complete(req))
+		return rtn;
+
 	trace_scsi_dispatch_cmd_timeout(scmd);
 	scsi_log_completion(scmd, TIMEOUT_ERROR);
 
@@ -300,7 +303,8 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 			set_host_byte(scmd, DID_TIME_OUT);
 			scsi_eh_scmd_add(scmd);
 		}
-	}
+	} else if (req->q->mq_ops)
+		WRITE_ONCE(req->state, MQ_RQ_IN_FLIGHT);
 
 	return rtn;
 }
-- 
2.14.4

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-19 21:26 ` [PATCH 2/2] scsi: set timed out out mq requests to complete Keith Busch
@ 2018-07-20  6:52   ` Johannes Thumshirn
  2018-07-20 14:05     ` Keith Busch
  2018-07-20 14:41   ` Christoph Hellwig
  2018-07-20 15:12   ` Bart Van Assche
  2 siblings, 1 reply; 18+ messages in thread
From: Johannes Thumshirn @ 2018-07-20  6:52 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-block, linux-scsi, linux-nvme, Christoph Hellwig,
	Jens Axboe, Jianchao Wang, Bart Van Assche

On Thu, Jul 19, 2018 at 03:26:18PM -0600, Keith Busch wrote:
> The scsi block layer requires requests claimed be the error handling be
> completed by the error handler. A previous commit allowed completions
> to proceed for blk-mq, breaking that assumption.

Ahm the first sentence is a bit hard to parse, sorry.
Did you mean something like:

"The scsi block later requires requests claimed by the error handling
to be completed by the error handler"?

-- 
Johannes Thumshirn                                          Storage
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 N�rnberg
GF: Felix Imend�rffer, Jane Smithard, Graham Norton
HRB 21284 (AG N�rnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-20  6:52   ` Johannes Thumshirn
@ 2018-07-20 14:05     ` Keith Busch
  0 siblings, 0 replies; 18+ messages in thread
From: Keith Busch @ 2018-07-20 14:05 UTC (permalink / raw)
  To: Johannes Thumshirn
  Cc: Keith Busch, Jens Axboe, linux-scsi, linux-nvme, linux-block,
	Jianchao Wang, Bart Van Assche, Christoph Hellwig

On Fri, Jul 20, 2018 at 08:52:58AM +0200, Johannes Thumshirn wrote:
> On Thu, Jul 19, 2018 at 03:26:18PM -0600, Keith Busch wrote:
> > The scsi block layer requires requests claimed be the error handling be
> > completed by the error handler. A previous commit allowed completions
> > to proceed for blk-mq, breaking that assumption.
> 
> Ahm the first sentence is a bit hard to parse, sorry.
> Did you mean something like:
> 
> "The scsi block later requires requests claimed by the error handling
> to be completed by the error handler"?

Ugh, sorry. Yes, you are correct on the intended phrasing.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-19 21:26 ` [PATCH 2/2] scsi: set timed out out mq requests to complete Keith Busch
  2018-07-20  6:52   ` Johannes Thumshirn
@ 2018-07-20 14:41   ` Christoph Hellwig
  2018-07-20 14:50     ` Keith Busch
  2018-07-20 15:12   ` Bart Van Assche
  2 siblings, 1 reply; 18+ messages in thread
From: Christoph Hellwig @ 2018-07-20 14:41 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-block, linux-scsi, Jens Axboe, linux-nvme, Jianchao Wang,
	Bart Van Assche, Christoph Hellwig

Has this been tested?  Especially with the blk_abort_request corner
case?

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-20 14:41   ` Christoph Hellwig
@ 2018-07-20 14:50     ` Keith Busch
  0 siblings, 0 replies; 18+ messages in thread
From: Keith Busch @ 2018-07-20 14:50 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, Jens Axboe, linux-scsi, linux-nvme, linux-block,
	Jianchao Wang, Bart Van Assche, Christoph Hellwig

On Fri, Jul 20, 2018 at 07:41:46AM -0700, Christoph Hellwig wrote:
> Has this been tested?  Especially with the blk_abort_request corner
> case?

Yes, that's what Jianchao's modification to scsi_debug used.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-19 21:26 ` [PATCH 2/2] scsi: set timed out out mq requests to complete Keith Busch
  2018-07-20  6:52   ` Johannes Thumshirn
  2018-07-20 14:41   ` Christoph Hellwig
@ 2018-07-20 15:12   ` Bart Van Assche
  2018-07-20 15:56     ` Keith Busch
  2 siblings, 1 reply; 18+ messages in thread
From: Bart Van Assche @ 2018-07-20 15:12 UTC (permalink / raw)
  To: linux-scsi@vger.kernel.org, keith.busch@intel.com,
	linux-block@vger.kernel.org
  Cc: hch@lst.de, linux-nvme@lists.infradead.org, axboe@kernel.dk,
	jianchao.w.wang@oracle.com

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-7", Size: 1589 bytes --]

On Thu, 2018-07-19 at 15:26 -0600, Keith Busch wrote:
+AD4- diff --git a/drivers/scsi/scsi+AF8-error.c b/drivers/scsi/scsi+AF8-er=
ror.c
+AD4- index 8932ae81a15a..86ee10b2c775 100644
+AD4- --- a/drivers/scsi/scsi+AF8-error.c
+AD4- +-+-+- b/drivers/scsi/scsi+AF8-error.c
+AD4- +AEAAQA- -286,6 +-286,9 +AEAAQA- enum blk+AF8-eh+AF8-timer+AF8-return=
 scsi+AF8-times+AF8-out(struct request +ACo-req)
+AD4-  	enum blk+AF8-eh+AF8-timer+AF8-return rtn +AD0- BLK+AF8-EH+AF8-DONE+=
ADs-
+AD4-  	struct Scsi+AF8-Host +ACo-host +AD0- scmd-+AD4-device-+AD4-host+ADs=
-
+AD4- =20
+AD4- +-	if (req-+AD4-q-+AD4-mq+AF8-ops +ACYAJg- blk+AF8-mq+AF8-mark+AF8-co=
mplete(req))
+AD4- +-		return rtn+ADs-
+AD4- +-
+AD4-  	trace+AF8-scsi+AF8-dispatch+AF8-cmd+AF8-timeout(scmd)+ADs-
+AD4-  	scsi+AF8-log+AF8-completion(scmd, TIMEOUT+AF8-ERROR)+ADs-
+AD4- =20
+AD4- +AEAAQA- -300,7 +-303,8 +AEAAQA- enum blk+AF8-eh+AF8-timer+AF8-return=
 scsi+AF8-times+AF8-out(struct request +ACo-req)
+AD4-  			set+AF8-host+AF8-byte(scmd, DID+AF8-TIME+AF8-OUT)+ADs-
+AD4-  			scsi+AF8-eh+AF8-scmd+AF8-add(scmd)+ADs-
+AD4-  		+AH0-
+AD4- -	+AH0-
+AD4- +-	+AH0- else if (req-+AD4-q-+AD4-mq+AF8-ops)
+AD4- +-		WRITE+AF8-ONCE(req-+AD4-state, MQ+AF8-RQ+AF8-IN+AF8-FLIGHT)+ADs-
+AD4- =20
+AD4-  	return rtn+ADs-
+AD4-  +AH0-

Modifying the completion state and req-+AD4-state from the SCSI core are la=
yering
violations. Have you considered to move the above changes into blk+AF8-mq+A=
F8-rq+AF8-timed+AF8-out()?
An additional benefit of that approach is that the req-+AD4-q-+AD4-mq+AF8-o=
ps checks can be
left out.

Thanks,

Bart.=

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-20 15:12   ` Bart Van Assche
@ 2018-07-20 15:56     ` Keith Busch
  2018-07-20 16:03       ` Bart Van Assche
  0 siblings, 1 reply; 18+ messages in thread
From: Keith Busch @ 2018-07-20 15:56 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi@vger.kernel.org, keith.busch@intel.com,
	linux-block@vger.kernel.org, axboe@kernel.dk,
	jianchao.w.wang@oracle.com, hch@lst.de,
	linux-nvme@lists.infradead.org

On Fri, Jul 20, 2018 at 03:12:59PM +0000, Bart Van Assche wrote:
> On Thu, 2018-07-19 at 15:26 -0600, Keith Busch wrote:
> > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > index 8932ae81a15a..86ee10b2c775 100644
> > --- a/drivers/scsi/scsi_error.c
> > +++ b/drivers/scsi/scsi_error.c
> > @@ -286,6 +286,9 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
> >  	enum blk_eh_timer_return rtn = BLK_EH_DONE;
> >  	struct Scsi_Host *host = scmd->device->host;
> >  
> > +	if (req->q->mq_ops && blk_mq_mark_complete(req))
> > +		return rtn;
> > +
> >  	trace_scsi_dispatch_cmd_timeout(scmd);
> >  	scsi_log_completion(scmd, TIMEOUT_ERROR);
> >  
> > @@ -300,7 +303,8 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
> >  			set_host_byte(scmd, DID_TIME_OUT);
> >  			scsi_eh_scmd_add(scmd);
> >  		}
> > -	}
> > +	} else if (req->q->mq_ops)
> > +		WRITE_ONCE(req->state, MQ_RQ_IN_FLIGHT);
> >  
> >  	return rtn;
> >  }
> 
> Modifying the completion state and req->state from the SCSI core are layering
> violations. Have you considered to move the above changes into blk_mq_rq_timed_out()?
> An additional benefit of that approach is that the req->q->mq_ops checks can be
> left out.

SCSI is the only block driver that wants this behavior. Moving it back
to generic where it used to be breaks other block drivers.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-20 15:56     ` Keith Busch
@ 2018-07-20 16:03       ` Bart Van Assche
  2018-07-20 16:12         ` Keith Busch
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Van Assche @ 2018-07-20 16:03 UTC (permalink / raw)
  To: keith.busch@linux.intel.com
  Cc: linux-scsi@vger.kernel.org, hch@lst.de, keith.busch@intel.com,
	linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	axboe@kernel.dk, jianchao.w.wang@oracle.com

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-7", Size: 2351 bytes --]

On Fri, 2018-07-20 at 09:56 -0600, Keith Busch wrote:
+AD4- On Fri, Jul 20, 2018 at 03:12:59PM +-0000, Bart Van Assche wrote:
+AD4- +AD4- On Thu, 2018-07-19 at 15:26 -0600, Keith Busch wrote:
+AD4- +AD4- +AD4- diff --git a/drivers/scsi/scsi+AF8-error.c b/drivers/scsi=
/scsi+AF8-error.c
+AD4- +AD4- +AD4- index 8932ae81a15a..86ee10b2c775 100644
+AD4- +AD4- +AD4- --- a/drivers/scsi/scsi+AF8-error.c
+AD4- +AD4- +AD4- +-+-+- b/drivers/scsi/scsi+AF8-error.c
+AD4- +AD4- +AD4- +AEAAQA- -286,6 +-286,9 +AEAAQA- enum blk+AF8-eh+AF8-time=
r+AF8-return scsi+AF8-times+AF8-out(struct request +ACo-req)
+AD4- +AD4- +AD4-  	enum blk+AF8-eh+AF8-timer+AF8-return rtn +AD0- BLK+AF8-=
EH+AF8-DONE+ADs-
+AD4- +AD4- +AD4-  	struct Scsi+AF8-Host +ACo-host +AD0- scmd-+AD4-device-+=
AD4-host+ADs-
+AD4- +AD4- +AD4- =20
+AD4- +AD4- +AD4- +-	if (req-+AD4-q-+AD4-mq+AF8-ops +ACYAJg- blk+AF8-mq+AF8=
-mark+AF8-complete(req))
+AD4- +AD4- +AD4- +-		return rtn+ADs-
+AD4- +AD4- +AD4- +-
+AD4- +AD4- +AD4-  	trace+AF8-scsi+AF8-dispatch+AF8-cmd+AF8-timeout(scmd)+A=
Ds-
+AD4- +AD4- +AD4-  	scsi+AF8-log+AF8-completion(scmd, TIMEOUT+AF8-ERROR)+AD=
s-
+AD4- +AD4- +AD4- =20
+AD4- +AD4- +AD4- +AEAAQA- -300,7 +-303,8 +AEAAQA- enum blk+AF8-eh+AF8-time=
r+AF8-return scsi+AF8-times+AF8-out(struct request +ACo-req)
+AD4- +AD4- +AD4-  			set+AF8-host+AF8-byte(scmd, DID+AF8-TIME+AF8-OUT)+ADs=
-
+AD4- +AD4- +AD4-  			scsi+AF8-eh+AF8-scmd+AF8-add(scmd)+ADs-
+AD4- +AD4- +AD4-  		+AH0-
+AD4- +AD4- +AD4- -	+AH0-
+AD4- +AD4- +AD4- +-	+AH0- else if (req-+AD4-q-+AD4-mq+AF8-ops)
+AD4- +AD4- +AD4- +-		WRITE+AF8-ONCE(req-+AD4-state, MQ+AF8-RQ+AF8-IN+AF8-F=
LIGHT)+ADs-
+AD4- +AD4- +AD4- =20
+AD4- +AD4- +AD4-  	return rtn+ADs-
+AD4- +AD4- +AD4-  +AH0-
+AD4- +AD4-=20
+AD4- +AD4- Modifying the completion state and req-+AD4-state from the SCSI=
 core are layering
+AD4- +AD4- violations. Have you considered to move the above changes into =
blk+AF8-mq+AF8-rq+AF8-timed+AF8-out()?
+AD4- +AD4- An additional benefit of that approach is that the req-+AD4-q-+=
AD4-mq+AF8-ops checks can be
+AD4- +AD4- left out.
+AD4-=20
+AD4- SCSI is the only block driver that wants this behavior. Moving it bac=
k
+AD4- to generic where it used to be breaks other block drivers.

That's new to me. What would break in the NVMe driver if the above change w=
ould be
present in the block layer?

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-20 16:03       ` Bart Van Assche
@ 2018-07-20 16:12         ` Keith Busch
  2018-07-20 16:20           ` Bart Van Assche
  0 siblings, 1 reply; 18+ messages in thread
From: Keith Busch @ 2018-07-20 16:12 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: axboe@kernel.dk, linux-block@vger.kernel.org,
	linux-scsi@vger.kernel.org, linux-nvme@lists.infradead.org,
	keith.busch@intel.com, jianchao.w.wang@oracle.com, hch@lst.de

On Fri, Jul 20, 2018 at 04:03:18PM +0000, Bart Van Assche wrote:
> On Fri, 2018-07-20 at 09:56 -0600, Keith Busch wrote:
> > SCSI is the only block driver that wants this behavior. Moving it back
> > to generic where it used to be breaks other block drivers.
> 
> That's new to me. What would break in the NVMe driver if the above change would be
> present in the block layer?

This is what causes the block layer to lose completions, and most drivers
don't want the kernel to lose their completions.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-20 16:12         ` Keith Busch
@ 2018-07-20 16:20           ` Bart Van Assche
  2018-07-20 16:23             ` Keith Busch
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Van Assche @ 2018-07-20 16:20 UTC (permalink / raw)
  To: keith.busch@linux.intel.com
  Cc: linux-scsi@vger.kernel.org, hch@lst.de, keith.busch@intel.com,
	linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	axboe@kernel.dk, jianchao.w.wang@oracle.com

On Fri, 2018-07-20 at 10:12 -0600, Keith Busch wrote:
> On Fri, Jul 20, 2018 at 04:03:18PM +0000, Bart Van Assche wrote:
> > On Fri, 2018-07-20 at 09:56 -0600, Keith Busch wrote:
> > > SCSI is the only block driver that wants this behavior. M=
oving it back
> > > to generic where it used to be breaks other block drivers=
.
> >=20
> > That's new to me. What would break in the NVMe driver if the ab=
ove change would be
> > present in the block layer?
>=20
> This is what causes the block layer to lose completions, and most dri=
vers
> don't want the kernel to lose their completions.

Hello Keith,

Have you considered to introduce a fourth state for block layer requests to
avoid that completions that occur while a timeout handler is in progress ge=
t
lost? That would avoid that completions get lost not only for the NVMe driv=
er
but also for SCSI drivers. See e.g. the MQ_RQ_TIMED_OUT state i=
n
https://www.mail-archive.com/linux-block@vger.kernel.org/msg22196.html

Thanks,

Bart.=

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-20 16:20           ` Bart Van Assche
@ 2018-07-20 16:23             ` Keith Busch
  2018-07-20 16:45               ` Bart Van Assche
  0 siblings, 1 reply; 18+ messages in thread
From: Keith Busch @ 2018-07-20 16:23 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi@vger.kernel.org, hch@lst.de, keith.busch@intel.com,
	linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	axboe@kernel.dk, jianchao.w.wang@oracle.com

On Fri, Jul 20, 2018 at 04:20:01PM +0000, Bart Van Assche wrote:
> On Fri, 2018-07-20 at 10:12 -0600, Keith Busch wrote:
> > On Fri, Jul 20, 2018 at 04:03:18PM +0000, Bart Van Assche wrote:
> > > On Fri, 2018-07-20 at 09:56 -0600, Keith Busch wrote:
> > > > SCSI is the only block driver that wants this behavior. Moving it back
> > > > to generic where it used to be breaks other block drivers.
> > > 
> > > That's new to me. What would break in the NVMe driver if the above change would be
> > > present in the block layer?
> > 
> > This is what causes the block layer to lose completions, and most drivers
> > don't want the kernel to lose their completions.
> 
> Hello Keith,
> 
> Have you considered to introduce a fourth state for block layer requests to
> avoid that completions that occur while a timeout handler is in progress get
> lost? That would avoid that completions get lost not only for the NVMe driver
> but also for SCSI drivers. See e.g. the MQ_RQ_TIMED_OUT state in
> https://www.mail-archive.com/linux-block@vger.kernel.org/msg22196.html

Yes, I've considered that, and I really want to use it, but scsi may
still reference a freed request in scmd_eh_abort_handler that way.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-20 16:23             ` Keith Busch
@ 2018-07-20 16:45               ` Bart Van Assche
  2018-07-20 17:24                 ` Keith Busch
  0 siblings, 1 reply; 18+ messages in thread
From: Bart Van Assche @ 2018-07-20 16:45 UTC (permalink / raw)
  To: keith.busch@linux.intel.com
  Cc: linux-scsi@vger.kernel.org, hch@lst.de, keith.busch@intel.com,
	linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	axboe@kernel.dk, jianchao.w.wang@oracle.com

On Fri, 2018-07-20 at 10:23 -0600, Keith Busch wrote:
> On Fri, Jul 20, 2018 at 04:20:01PM +0000, Bart Van Assche wrote:
> > On Fri, 2018-07-20 at 10:12 -0600, Keith Busch wrote:
> > > On Fri, Jul 20, 2018 at 04:03:18PM +0000, Bart Van Assch=
e wrote:
> > > > On Fri, 2018-07-20 at 09:56 -0600, Keith Busch wrot=
e:
> > > > > SCSI is the only block driver that wants this=
 behavior. Moving it back
> > > > > to generic where it used to be breaks other b=
lock drivers.
> > > >=20
> > > > That's new to me. What would break in the NVMe driv=
er if the above change would be
> > > > present in the block layer?
> > >=20
> > > This is what causes the block layer to lose completions, =
and most drivers
> > > don't want the kernel to lose their completions.
> >=20
> > Hello Keith,
> >=20
> > Have you considered to introduce a fourth state for block layer=
 requests to
> > avoid that completions that occur while a timeout handler is in=
 progress get
> > lost? That would avoid that completions get lost not only for t=
he NVMe driver
> > but also for SCSI drivers. See e.g. the MQ_RQ_TIMED_=
-OUT state in
> > https://www.mail-archive.com/linux-block@vger.kernel.org/ms=
g22196.html
>=20
> Yes, I've considered that, and I really want to use it, but scsi may
> still reference a freed request in scmd_eh_abort_handler =
that way.

I think that's a misunderstanding. If scsi_times_out() queues an ab=
ort
asynchronously then it tells the block layer through its return value that =
the
SCSI core still owns the request and hence that the block layer should igno=
re any
completions that occur until the SCSI core calls scsi_finish_comman=
d(). That
scsi_finish_command() will trigger a call to __blk_mq_=
-end_request(). The
scsi_times_out() return value I was referring to is called BLK_=
EH_DONE today and
was called BLK_EH_NOT_HANDLED in kernel version v4.17.

This also means that I got the BLK_EH_NOT_HANDLED case wrong in=
 "blk-mq: Rework
blk-mq timeout handling again": in that case concurrent a blk_mq=
8-complete_request()
call should be ignored instead of triggering request completion.

Bart.=

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-20 16:45               ` Bart Van Assche
@ 2018-07-20 17:24                 ` Keith Busch
  2018-07-23  8:12                   ` hch
  0 siblings, 1 reply; 18+ messages in thread
From: Keith Busch @ 2018-07-20 17:24 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: linux-scsi@vger.kernel.org, hch@lst.de, keith.busch@intel.com,
	linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	axboe@kernel.dk, jianchao.w.wang@oracle.com

On Fri, Jul 20, 2018 at 04:45:05PM +0000, Bart Van Assche wrote:
> I think that's a misunderstanding. If scsi_times_out() queues an abort
> asynchronously then it tells the block layer through its return value that the
> SCSI core still owns the request and hence that the block layer should ignore any
> completions that occur until the SCSI core calls scsi_finish_command(). That
> scsi_finish_command() will trigger a call to __blk_mq_end_request(). The
> scsi_times_out() return value I was referring to is called BLK_EH_DONE today and
> was called BLK_EH_NOT_HANDLED in kernel version v4.17.
> 
> This also means that I got the BLK_EH_NOT_HANDLED case wrong in "blk-mq: Rework
> blk-mq timeout handling again": in that case concurrent a blk_mq_complete_request()
> call should be ignored instead of triggering request completion.

I definitely think it's worth revisiting that for the longer term.

For near term, I don't want scsi error handling broken for 4.18, but also
not revert the changes that fixed all the other drivers. Restoring the
old behavior that scsi wants isolated to the scsi driver seems like the
lowest touch option.

My patch restores the state that scsi had in 4.17. It still has that
gap that may lose requests forever when the scsi LLD always returns
BLK_EH_RESET_TIMER (see virtio-scsi, for example). That gap existed prior,
so that's not new with my patch. Maybe we can fix that with a slight
modification to my previous patch. It looks like SCSI really wants to
block completions only when it hands off the command to the error handler,
so we don't need to have the inflight -> compete -> inflight transition,
and the following is all that's needed:

---
diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 8932ae81a15a..902c30d3c0ed 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -296,6 +296,8 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
 		rtn = host->hostt->eh_timed_out(scmd);
 
 	if (rtn == BLK_EH_DONE) {
+		if (req->q->mq_ops && blk_mq_mark_complete(req))
+			return rtn;
 		if (scsi_abort_command(scmd) != SUCCESS) {
 			set_host_byte(scmd, DID_TIME_OUT);
 			scsi_eh_scmd_add(scmd);
--

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/2] blk-mq: export setting request completion state
  2018-07-19 21:26 [PATCH 1/2] blk-mq: export setting request completion state Keith Busch
  2018-07-19 21:26 ` [PATCH 2/2] scsi: set timed out out mq requests to complete Keith Busch
@ 2018-07-23  8:09 ` Christoph Hellwig
  1 sibling, 0 replies; 18+ messages in thread
From: Christoph Hellwig @ 2018-07-23  8:09 UTC (permalink / raw)
  To: Keith Busch
  Cc: linux-block, linux-scsi, linux-nvme, Christoph Hellwig,
	Jens Axboe, Jianchao Wang, Bart Van Assche

> +/*
> + * Returns true if request was not in flight.
> + */
> +static inline bool blk_mq_mark_complete(struct request *rq)
> +{
> +	return (cmpxchg(&rq->state, MQ_RQ_IN_FLIGHT, MQ_RQ_COMPLETE) !=
> +			MQ_RQ_IN_FLIGHT);
> +}

This needs a much better comment describing when and how to use it.

Also the outer braces in the return statement are not required.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-20 17:24                 ` Keith Busch
@ 2018-07-23  8:12                   ` hch
  2018-07-23 13:59                     ` Bart Van Assche
  2018-07-23 14:04                     ` Keith Busch
  0 siblings, 2 replies; 18+ messages in thread
From: hch @ 2018-07-23  8:12 UTC (permalink / raw)
  To: Keith Busch
  Cc: Bart Van Assche, linux-scsi@vger.kernel.org, hch@lst.de,
	keith.busch@intel.com, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org, axboe@kernel.dk,
	jianchao.w.wang@oracle.com

On Fri, Jul 20, 2018 at 11:24:45AM -0600, Keith Busch wrote:
> My patch restores the state that scsi had in 4.17. It still has that
> gap that may lose requests forever when the scsi LLD always returns
> BLK_EH_RESET_TIMER (see virtio-scsi, for example). That gap existed prior,
> so that's not new with my patch. Maybe we can fix that with a slight
> modification to my previous patch. It looks like SCSI really wants to
> block completions only when it hands off the command to the error handler,
> so we don't need to have the inflight -> compete -> inflight transition,
> and the following is all that's needed:

Btw, one thing we should do in blk-mq and scsi is to make the time
optional.  If the blk_mq driver doesn't even have a timeout structure
there is no point in timing out requests and enter the timeout handler
ever.  Same for those scsi drivers always returning BLK_EH_RESET_TIMER.

Whether never having timeouts is a good idea is a different discussion,
but as long as we have such drivers we should handle them somewhat sane.

> ---
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 8932ae81a15a..902c30d3c0ed 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -296,6 +296,8 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
>  		rtn = host->hostt->eh_timed_out(scmd);
>  
>  	if (rtn == BLK_EH_DONE) {
> +		if (req->q->mq_ops && blk_mq_mark_complete(req))
> +			return rtn;

This looks pretty sensible to me as a band-aid.  It just needs a very
detailed comment explaining what is going on here.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-23  8:12                   ` hch
@ 2018-07-23 13:59                     ` Bart Van Assche
  2018-07-23 14:04                     ` Keith Busch
  1 sibling, 0 replies; 18+ messages in thread
From: Bart Van Assche @ 2018-07-23 13:59 UTC (permalink / raw)
  To: hch@lst.de, keith.busch@linux.intel.com
  Cc: linux-scsi@vger.kernel.org, keith.busch@intel.com,
	linux-block@vger.kernel.org, linux-nvme@lists.infradead.org,
	axboe@kernel.dk, jianchao.w.wang@oracle.com

On Mon, 2018-07-23 at 10:12 +0200, hch@lst.de wrote:
> Btw, one thing we should do in blk-mq and scsi is to make the time
> optional.  If the blk_mq driver doesn't even have a timeout struc=
ture
> there is no point in timing out requests and enter the timeout handle=
r
> ever.

Are there any blk-mq drivers that do not define a timeout handler and that
use shared tags? I think such drivers need periodic calls to blk_mq_=
-tag_idle().
Do you perhaps want to happen these calls from another context?

Thanks,

Bart.=

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/2] scsi: set timed out out mq requests to complete
  2018-07-23  8:12                   ` hch
  2018-07-23 13:59                     ` Bart Van Assche
@ 2018-07-23 14:04                     ` Keith Busch
  1 sibling, 0 replies; 18+ messages in thread
From: Keith Busch @ 2018-07-23 14:04 UTC (permalink / raw)
  To: hch@lst.de
  Cc: Bart Van Assche, linux-scsi@vger.kernel.org,
	keith.busch@intel.com, linux-block@vger.kernel.org,
	linux-nvme@lists.infradead.org, axboe@kernel.dk,
	jianchao.w.wang@oracle.com

On Mon, Jul 23, 2018 at 10:12:31AM +0200, hch@lst.de wrote:
> > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > index 8932ae81a15a..902c30d3c0ed 100644
> > --- a/drivers/scsi/scsi_error.c
> > +++ b/drivers/scsi/scsi_error.c
> > @@ -296,6 +296,8 @@ enum blk_eh_timer_return scsi_times_out(struct request *req)
> >  		rtn = host->hostt->eh_timed_out(scmd);
> >  
> >  	if (rtn == BLK_EH_DONE) {
> > +		if (req->q->mq_ops && blk_mq_mark_complete(req))
> > +			return rtn;
> 
> This looks pretty sensible to me as a band-aid.  It just needs a very
> detailed comment explaining what is going on here.

Sounds good, v2 will be sent shortly.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2018-07-23 14:04 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-07-19 21:26 [PATCH 1/2] blk-mq: export setting request completion state Keith Busch
2018-07-19 21:26 ` [PATCH 2/2] scsi: set timed out out mq requests to complete Keith Busch
2018-07-20  6:52   ` Johannes Thumshirn
2018-07-20 14:05     ` Keith Busch
2018-07-20 14:41   ` Christoph Hellwig
2018-07-20 14:50     ` Keith Busch
2018-07-20 15:12   ` Bart Van Assche
2018-07-20 15:56     ` Keith Busch
2018-07-20 16:03       ` Bart Van Assche
2018-07-20 16:12         ` Keith Busch
2018-07-20 16:20           ` Bart Van Assche
2018-07-20 16:23             ` Keith Busch
2018-07-20 16:45               ` Bart Van Assche
2018-07-20 17:24                 ` Keith Busch
2018-07-23  8:12                   ` hch
2018-07-23 13:59                     ` Bart Van Assche
2018-07-23 14:04                     ` Keith Busch
2018-07-23  8:09 ` [PATCH 1/2] blk-mq: export setting request completion state Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).