* [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout @ 2014-06-04 16:33 K. Y. Srinivasan 2014-06-04 17:02 ` James Bottomley 2014-07-18 17:00 ` Christoph Hellwig 0 siblings, 2 replies; 25+ messages in thread From: K. Y. Srinivasan @ 2014-06-04 16:33 UTC (permalink / raw) To: gregkh, linux-kernel, devel, ohering, jbottomley, hch, linux-scsi, apw, jasowang Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the basic I/O timeout of the device. Fix this bug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> --- drivers/scsi/sd.c | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index e9689d5..54150b1 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct scsi_device *sdp, struct request *rq) static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct request *rq) { - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; + int timeout = sdp->request_queue->rq_timeout; + + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER); rq->retries = SD_MAX_RETRIES; rq->cmd[0] = SYNCHRONIZE_CACHE; rq->cmd_len = 10; -- 1.7.4.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-06-04 16:33 [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout K. Y. Srinivasan @ 2014-06-04 17:02 ` James Bottomley 2014-06-04 17:15 ` KY Srinivasan 2014-07-18 17:00 ` Christoph Hellwig 1 sibling, 1 reply; 25+ messages in thread From: James Bottomley @ 2014-06-04 17:02 UTC (permalink / raw) To: kys@microsoft.com Cc: linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, ohering@suse.com, linux-kernel@vger.kernel.org, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote: > Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the > FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the > basic I/O timeout of the device. Fix this bug. > > Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> > --- > drivers/scsi/sd.c | 4 +++- > 1 files changed, 3 insertions(+), 1 deletions(-) > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c > index e9689d5..54150b1 100644 > --- a/drivers/scsi/sd.c > +++ b/drivers/scsi/sd.c > @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct scsi_device *sdp, struct request *rq) > > static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct request *rq) > { > - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; > + int timeout = sdp->request_queue->rq_timeout; > + > + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER); Could you share where you found this to be a problem? It looks like a bug in block because all inbound requests being prepared should have a timeout set, so block would be the place to fix it. I can't see how this can happen because we definitely add the timer after the request is prepared in my reading of the block code, unless I'm missing some path in block that violates this. James ^ permalink raw reply [flat|nested] 25+ messages in thread
* RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-06-04 17:02 ` James Bottomley @ 2014-06-04 17:15 ` KY Srinivasan 2014-06-06 1:32 ` Mike Christie 0 siblings, 1 reply; 25+ messages in thread From: KY Srinivasan @ 2014-06-04 17:15 UTC (permalink / raw) To: James Bottomley Cc: linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, ohering@suse.com, linux-kernel@vger.kernel.org, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org > -----Original Message----- > From: James Bottomley [mailto:jbottomley@parallels.com] > Sent: Wednesday, June 4, 2014 10:02 AM > To: KY Srinivasan > Cc: linux-kernel@vger.kernel.org; apw@canonical.com; > devel@linuxdriverproject.org; hch@infradead.org; linux- > scsi@vger.kernel.org; ohering@suse.com; gregkh@linuxfoundation.org; > jasowang@redhat.com > Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT > from the basic I/O timeout > > On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote: > > Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to > > derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this > > patch did not use the basic I/O timeout of the device. Fix this bug. > > > > Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> > > --- > > drivers/scsi/sd.c | 4 +++- > > 1 files changed, 3 insertions(+), 1 deletions(-) > > > > diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index > > e9689d5..54150b1 100644 > > --- a/drivers/scsi/sd.c > > +++ b/drivers/scsi/sd.c > > @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct > > scsi_device *sdp, struct request *rq) > > > > static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct > > request *rq) { > > - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; > > + int timeout = sdp->request_queue->rq_timeout; > > + > > + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER); > > Could you share where you found this to be a problem? It looks like a bug in > block because all inbound requests being prepared should have a timeout > set, so block would be the place to fix it. Perhaps; what I found was that the value in rq->timeout was 0 coming into this function and thus multiplying obviously has no effect. > > I can't see how this can happen because we definitely add the timer after the > request is prepared in my reading of the block code, unless I'm missing some > path in block that violates this. > > James K. Y ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-06-04 17:15 ` KY Srinivasan @ 2014-06-06 1:32 ` Mike Christie 2014-06-06 2:53 ` KY Srinivasan 0 siblings, 1 reply; 25+ messages in thread From: Mike Christie @ 2014-06-06 1:32 UTC (permalink / raw) To: KY Srinivasan Cc: James Bottomley, linux-kernel@vger.kernel.org, apw@canonical.com, devel@linuxdriverproject.org, hch@infradead.org, linux-scsi@vger.kernel.org, ohering@suse.com, gregkh@linuxfoundation.org, jasowang@redhat.com On 06/04/2014 12:15 PM, KY Srinivasan wrote: > > >> -----Original Message----- >> From: James Bottomley [mailto:jbottomley@parallels.com] >> Sent: Wednesday, June 4, 2014 10:02 AM >> To: KY Srinivasan >> Cc: linux-kernel@vger.kernel.org; apw@canonical.com; >> devel@linuxdriverproject.org; hch@infradead.org; linux- >> scsi@vger.kernel.org; ohering@suse.com; gregkh@linuxfoundation.org; >> jasowang@redhat.com >> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT >> from the basic I/O timeout >> >> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote: >>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to >>> derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this >>> patch did not use the basic I/O timeout of the device. Fix this bug. >>> >>> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> >>> --- >>> drivers/scsi/sd.c | 4 +++- >>> 1 files changed, 3 insertions(+), 1 deletions(-) >>> >>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index >>> e9689d5..54150b1 100644 >>> --- a/drivers/scsi/sd.c >>> +++ b/drivers/scsi/sd.c >>> @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct >>> scsi_device *sdp, struct request *rq) >>> >>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct >>> request *rq) { >>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; >>> + int timeout = sdp->request_queue->rq_timeout; >>> + >>> + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER); >> >> Could you share where you found this to be a problem? It looks like a bug in >> block because all inbound requests being prepared should have a timeout >> set, so block would be the place to fix it. > > Perhaps; what I found was that the value in rq->timeout was 0 coming into > this function and thus multiplying obviously has no effect. > I think you are right. We hit this problem because we are doing: scsi_request_fn -> blk_peek_request -> sd_prep_fn -> scsi_setup_flush_cmnd. At this time request->timeout is zero so the multiplication does nothing. See how sd_setup_write_same_cmnd will set the request->timeout at this time. Then in scsi_request_fn we do: scsi_request_fn -> blk_start_request -> blk_add_timer. At this time it will set the request->timeout if something like req block pc users (like scsi_execute() or block/scsi_ioctl.c) or the write same code mentioned above have not set the timeout. ^ permalink raw reply [flat|nested] 25+ messages in thread
* RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-06-06 1:32 ` Mike Christie @ 2014-06-06 2:53 ` KY Srinivasan 2014-06-06 17:18 ` Mike Christie 0 siblings, 1 reply; 25+ messages in thread From: KY Srinivasan @ 2014-06-06 2:53 UTC (permalink / raw) To: Mike Christie Cc: linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, James Bottomley, ohering@suse.com, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org > -----Original Message----- > From: Mike Christie [mailto:michaelc@cs.wisc.edu] > Sent: Thursday, June 5, 2014 6:33 PM > To: KY Srinivasan > Cc: James Bottomley; linux-kernel@vger.kernel.org; apw@canonical.com; > devel@linuxdriverproject.org; hch@infradead.org; linux- > scsi@vger.kernel.org; ohering@suse.com; gregkh@linuxfoundation.org; > jasowang@redhat.com > Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT > from the basic I/O timeout > > On 06/04/2014 12:15 PM, KY Srinivasan wrote: > > > > > >> -----Original Message----- > >> From: James Bottomley [mailto:jbottomley@parallels.com] > >> Sent: Wednesday, June 4, 2014 10:02 AM > >> To: KY Srinivasan > >> Cc: linux-kernel@vger.kernel.org; apw@canonical.com; > >> devel@linuxdriverproject.org; hch@infradead.org; linux- > >> scsi@vger.kernel.org; ohering@suse.com; gregkh@linuxfoundation.org; > >> jasowang@redhat.com > >> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the > >> FLUSH_TIMEOUT from the basic I/O timeout > >> > >> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote: > >>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to > >>> derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this > >>> patch did not use the basic I/O timeout of the device. Fix this bug. > >>> > >>> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> > >>> --- > >>> drivers/scsi/sd.c | 4 +++- > >>> 1 files changed, 3 insertions(+), 1 deletions(-) > >>> > >>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index > >>> e9689d5..54150b1 100644 > >>> --- a/drivers/scsi/sd.c > >>> +++ b/drivers/scsi/sd.c > >>> @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct > >>> scsi_device *sdp, struct request *rq) > >>> > >>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct > >>> request *rq) { > >>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; > >>> + int timeout = sdp->request_queue->rq_timeout; > >>> + > >>> + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER); > >> > >> Could you share where you found this to be a problem? It looks like > >> a bug in block because all inbound requests being prepared should > >> have a timeout set, so block would be the place to fix it. > > > > Perhaps; what I found was that the value in rq->timeout was 0 coming > > into this function and thus multiplying obviously has no effect. > > > > I think you are right. We hit this problem because we are doing: > > scsi_request_fn -> blk_peek_request -> sd_prep_fn -> > scsi_setup_flush_cmnd. > > At this time request->timeout is zero so the multiplication does nothing. See > how sd_setup_write_same_cmnd will set the request->timeout at this time. > > Then in scsi_request_fn we do: > > scsi_request_fn -> blk_start_request -> blk_add_timer. > > At this time it will set the request->timeout if something like req block pc > users (like scsi_execute() or block/scsi_ioctl.c) or the write same code > mentioned above have not set the timeout. I don't think this is a recent change. Prior to this commit, we were setting the timeout value in this function; it just happened to be a different constant unrelated to the I/O timeout. K. Y > > ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-06-06 2:53 ` KY Srinivasan @ 2014-06-06 17:18 ` Mike Christie 2014-06-06 17:52 ` James Bottomley 0 siblings, 1 reply; 25+ messages in thread From: Mike Christie @ 2014-06-06 17:18 UTC (permalink / raw) To: KY Srinivasan Cc: linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, James Bottomley, ohering@suse.com, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org On 6/5/14, 9:53 PM, KY Srinivasan wrote: > > >> -----Original Message----- >> From: Mike Christie [mailto:michaelc@cs.wisc.edu] >> Sent: Thursday, June 5, 2014 6:33 PM >> To: KY Srinivasan >> Cc: James Bottomley; linux-kernel@vger.kernel.org; apw@canonical.com; >> devel@linuxdriverproject.org; hch@infradead.org; linux- >> scsi@vger.kernel.org; ohering@suse.com; gregkh@linuxfoundation.org; >> jasowang@redhat.com >> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT >> from the basic I/O timeout >> >> On 06/04/2014 12:15 PM, KY Srinivasan wrote: >>> >>> >>>> -----Original Message----- >>>> From: James Bottomley [mailto:jbottomley@parallels.com] >>>> Sent: Wednesday, June 4, 2014 10:02 AM >>>> To: KY Srinivasan >>>> Cc: linux-kernel@vger.kernel.org; apw@canonical.com; >>>> devel@linuxdriverproject.org; hch@infradead.org; linux- >>>> scsi@vger.kernel.org; ohering@suse.com; gregkh@linuxfoundation.org; >>>> jasowang@redhat.com >>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the >>>> FLUSH_TIMEOUT from the basic I/O timeout >>>> >>>> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote: >>>>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to >>>>> derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this >>>>> patch did not use the basic I/O timeout of the device. Fix this bug. >>>>> >>>>> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> >>>>> --- >>>>> drivers/scsi/sd.c | 4 +++- >>>>> 1 files changed, 3 insertions(+), 1 deletions(-) >>>>> >>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index >>>>> e9689d5..54150b1 100644 >>>>> --- a/drivers/scsi/sd.c >>>>> +++ b/drivers/scsi/sd.c >>>>> @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct >>>>> scsi_device *sdp, struct request *rq) >>>>> >>>>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct >>>>> request *rq) { >>>>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; >>>>> + int timeout = sdp->request_queue->rq_timeout; >>>>> + >>>>> + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER); >>>> >>>> Could you share where you found this to be a problem? It looks like >>>> a bug in block because all inbound requests being prepared should >>>> have a timeout set, so block would be the place to fix it. >>> >>> Perhaps; what I found was that the value in rq->timeout was 0 coming >>> into this function and thus multiplying obviously has no effect. >>> >> >> I think you are right. We hit this problem because we are doing: >> >> scsi_request_fn -> blk_peek_request -> sd_prep_fn -> >> scsi_setup_flush_cmnd. >> >> At this time request->timeout is zero so the multiplication does nothing. See >> how sd_setup_write_same_cmnd will set the request->timeout at this time. >> >> Then in scsi_request_fn we do: >> >> scsi_request_fn -> blk_start_request -> blk_add_timer. >> >> At this time it will set the request->timeout if something like req block pc >> users (like scsi_execute() or block/scsi_ioctl.c) or the write same code >> mentioned above have not set the timeout. > > I don't think this is a recent change. Prior to this commit, we were setting the timeout > value in this function; it just happened to be a different constant unrelated to the I/O > timeout. > Yeah, it looks like when 7e660100d85af860e7ad763202fff717adcdaacd was merged we were supposed to initialize it like in your patch in this thread. I guess we could do your patch in this thread, or if we want the block layer to initialize the timeout before the prep_fn callout is called then we would need to have the blk-flush.c code to that when it sets up the request. If we do the latter, do we want the discard and write same code to initialize the request's timeout before the prep_fn callout is called too? ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-06-06 17:18 ` Mike Christie @ 2014-06-06 17:52 ` James Bottomley 2014-06-06 18:22 ` Jens Axboe 0 siblings, 1 reply; 25+ messages in thread From: James Bottomley @ 2014-06-06 17:52 UTC (permalink / raw) To: michaelc@cs.wisc.edu Cc: linux-kernel@vger.kernel.org, hch@infradead.org, devel@linuxdriverproject.org, apw@canonical.com, kys@microsoft.com, axboe@kernel.dk, linux-scsi@vger.kernel.org, ohering@suse.com, gregkh@linuxfoundation.org, jasowang@redhat.com On Fri, 2014-06-06 at 12:18 -0500, Mike Christie wrote: > On 6/5/14, 9:53 PM, KY Srinivasan wrote: > > > > > >> -----Original Message----- > >> From: Mike Christie [mailto:michaelc@cs.wisc.edu] > >> Sent: Thursday, June 5, 2014 6:33 PM > >> To: KY Srinivasan > >> Cc: James Bottomley; linux-kernel@vger.kernel.org; apw@canonical.com; > >> devel@linuxdriverproject.org; hch@infradead.org; linux- > >> scsi@vger.kernel.org; ohering@suse.com; gregkh@linuxfoundation.org; > >> jasowang@redhat.com > >> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT > >> from the basic I/O timeout > >> > >> On 06/04/2014 12:15 PM, KY Srinivasan wrote: > >>> > >>> > >>>> -----Original Message----- > >>>> From: James Bottomley [mailto:jbottomley@parallels.com] > >>>> Sent: Wednesday, June 4, 2014 10:02 AM > >>>> To: KY Srinivasan > >>>> Cc: linux-kernel@vger.kernel.org; apw@canonical.com; > >>>> devel@linuxdriverproject.org; hch@infradead.org; linux- > >>>> scsi@vger.kernel.org; ohering@suse.com; gregkh@linuxfoundation.org; > >>>> jasowang@redhat.com > >>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the > >>>> FLUSH_TIMEOUT from the basic I/O timeout > >>>> > >>>> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote: > >>>>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to > >>>>> derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this > >>>>> patch did not use the basic I/O timeout of the device. Fix this bug. > >>>>> > >>>>> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> > >>>>> --- > >>>>> drivers/scsi/sd.c | 4 +++- > >>>>> 1 files changed, 3 insertions(+), 1 deletions(-) > >>>>> > >>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index > >>>>> e9689d5..54150b1 100644 > >>>>> --- a/drivers/scsi/sd.c > >>>>> +++ b/drivers/scsi/sd.c > >>>>> @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct > >>>>> scsi_device *sdp, struct request *rq) > >>>>> > >>>>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct > >>>>> request *rq) { > >>>>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; > >>>>> + int timeout = sdp->request_queue->rq_timeout; > >>>>> + > >>>>> + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER); > >>>> > >>>> Could you share where you found this to be a problem? It looks like > >>>> a bug in block because all inbound requests being prepared should > >>>> have a timeout set, so block would be the place to fix it. > >>> > >>> Perhaps; what I found was that the value in rq->timeout was 0 coming > >>> into this function and thus multiplying obviously has no effect. > >>> > >> > >> I think you are right. We hit this problem because we are doing: > >> > >> scsi_request_fn -> blk_peek_request -> sd_prep_fn -> > >> scsi_setup_flush_cmnd. > >> > >> At this time request->timeout is zero so the multiplication does nothing. See > >> how sd_setup_write_same_cmnd will set the request->timeout at this time. > >> > >> Then in scsi_request_fn we do: > >> > >> scsi_request_fn -> blk_start_request -> blk_add_timer. > >> > >> At this time it will set the request->timeout if something like req block pc > >> users (like scsi_execute() or block/scsi_ioctl.c) or the write same code > >> mentioned above have not set the timeout. > > > > I don't think this is a recent change. Prior to this commit, we were setting the timeout > > value in this function; it just happened to be a different constant unrelated to the I/O > > timeout. > > > > Yeah, it looks like when 7e660100d85af860e7ad763202fff717adcdaacd was > merged we were supposed to initialize it like in your patch in this thread. > > I guess we could do your patch in this thread, or if we want the block > layer to initialize the timeout before the prep_fn callout is called > then we would need to have the blk-flush.c code to that when it sets up > the request. If we do the latter, do we want the discard and write same > code to initialize the request's timeout before the prep_fn callout is > called too? I looked through the call chain; it seems to be intentional behaviour on the part of block. Just from an mq point of view, it would make better code if we unconditionally initialised rq->timeout early and allowed prep to modify it and then dumped the if(!req->timeout) in blk_add_timer(), but it's a marginal if condition that would compile to a conditional store on sensible architectures, so losing the conditional probably isn't worth worrying about. Cc'd Jens for his opinion with the block patch James --- diff --git a/block/blk-core.c b/block/blk-core.c index a0e3096..cad6b2a 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -111,6 +111,7 @@ void blk_rq_init(struct request_queue *q, struct request *rq) rq->cmd = rq->__cmd; rq->cmd_len = BLK_MAX_CDB; rq->tag = -1; + rq->timeout = q->rq_timeout; rq->start_time = jiffies; set_start_time_ns(rq); rq->part = NULL; diff --git a/block/blk-timeout.c b/block/blk-timeout.c index d96f706..9063ade 100644 --- a/block/blk-timeout.c +++ b/block/blk-timeout.c @@ -180,13 +180,6 @@ void __blk_add_timer(struct request *req, struct list_head *timeout_list) BUG_ON(!list_empty(&req->timeout_list)); - /* - * Some LLDs, like scsi, peek at the timeout to prevent a - * command from being retried forever. - */ - if (!req->timeout) - req->timeout = q->rq_timeout; - req->deadline = jiffies + req->timeout; if (timeout_list) list_add_tail(&req->timeout_list, timeout_list); ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-06-06 17:52 ` James Bottomley @ 2014-06-06 18:22 ` Jens Axboe 2014-06-20 21:36 ` KY Srinivasan 0 siblings, 1 reply; 25+ messages in thread From: Jens Axboe @ 2014-06-06 18:22 UTC (permalink / raw) To: James Bottomley, michaelc@cs.wisc.edu Cc: linux-kernel@vger.kernel.org, hch@infradead.org, devel@linuxdriverproject.org, apw@canonical.com, kys@microsoft.com, linux-scsi@vger.kernel.org, ohering@suse.com, gregkh@linuxfoundation.org, jasowang@redhat.com On 2014-06-06 11:52, James Bottomley wrote: > On Fri, 2014-06-06 at 12:18 -0500, Mike Christie wrote: >> On 6/5/14, 9:53 PM, KY Srinivasan wrote: >>> >>> >>>> -----Original Message----- >>>> From: Mike Christie [mailto:michaelc@cs.wisc.edu] >>>> Sent: Thursday, June 5, 2014 6:33 PM >>>> To: KY Srinivasan >>>> Cc: James Bottomley; linux-kernel@vger.kernel.org; apw@canonical.com; >>>> devel@linuxdriverproject.org; hch@infradead.org; linux- >>>> scsi@vger.kernel.org; ohering@suse.com; gregkh@linuxfoundation.org; >>>> jasowang@redhat.com >>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT >>>> from the basic I/O timeout >>>> >>>> On 06/04/2014 12:15 PM, KY Srinivasan wrote: >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: James Bottomley [mailto:jbottomley@parallels.com] >>>>>> Sent: Wednesday, June 4, 2014 10:02 AM >>>>>> To: KY Srinivasan >>>>>> Cc: linux-kernel@vger.kernel.org; apw@canonical.com; >>>>>> devel@linuxdriverproject.org; hch@infradead.org; linux- >>>>>> scsi@vger.kernel.org; ohering@suse.com; gregkh@linuxfoundation.org; >>>>>> jasowang@redhat.com >>>>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the >>>>>> FLUSH_TIMEOUT from the basic I/O timeout >>>>>> >>>>>> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote: >>>>>>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to >>>>>>> derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this >>>>>>> patch did not use the basic I/O timeout of the device. Fix this bug. >>>>>>> >>>>>>> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> >>>>>>> --- >>>>>>> drivers/scsi/sd.c | 4 +++- >>>>>>> 1 files changed, 3 insertions(+), 1 deletions(-) >>>>>>> >>>>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index >>>>>>> e9689d5..54150b1 100644 >>>>>>> --- a/drivers/scsi/sd.c >>>>>>> +++ b/drivers/scsi/sd.c >>>>>>> @@ -832,7 +832,9 @@ static int sd_setup_write_same_cmnd(struct >>>>>>> scsi_device *sdp, struct request *rq) >>>>>>> >>>>>>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, struct >>>>>>> request *rq) { >>>>>>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; >>>>>>> + int timeout = sdp->request_queue->rq_timeout; >>>>>>> + >>>>>>> + rq->timeout = (timeout * SD_FLUSH_TIMEOUT_MULTIPLIER); >>>>>> >>>>>> Could you share where you found this to be a problem? It looks like >>>>>> a bug in block because all inbound requests being prepared should >>>>>> have a timeout set, so block would be the place to fix it. >>>>> >>>>> Perhaps; what I found was that the value in rq->timeout was 0 coming >>>>> into this function and thus multiplying obviously has no effect. >>>>> >>>> >>>> I think you are right. We hit this problem because we are doing: >>>> >>>> scsi_request_fn -> blk_peek_request -> sd_prep_fn -> >>>> scsi_setup_flush_cmnd. >>>> >>>> At this time request->timeout is zero so the multiplication does nothing. See >>>> how sd_setup_write_same_cmnd will set the request->timeout at this time. >>>> >>>> Then in scsi_request_fn we do: >>>> >>>> scsi_request_fn -> blk_start_request -> blk_add_timer. >>>> >>>> At this time it will set the request->timeout if something like req block pc >>>> users (like scsi_execute() or block/scsi_ioctl.c) or the write same code >>>> mentioned above have not set the timeout. >>> >>> I don't think this is a recent change. Prior to this commit, we were setting the timeout >>> value in this function; it just happened to be a different constant unrelated to the I/O >>> timeout. >>> >> >> Yeah, it looks like when 7e660100d85af860e7ad763202fff717adcdaacd was >> merged we were supposed to initialize it like in your patch in this thread. >> >> I guess we could do your patch in this thread, or if we want the block >> layer to initialize the timeout before the prep_fn callout is called >> then we would need to have the blk-flush.c code to that when it sets up >> the request. If we do the latter, do we want the discard and write same >> code to initialize the request's timeout before the prep_fn callout is >> called too? > > I looked through the call chain; it seems to be intentional behaviour on > the part of block. Just from an mq point of view, it would make better > code if we unconditionally initialised rq->timeout early and allowed > prep to modify it and then dumped the if(!req->timeout) in > blk_add_timer(), but it's a marginal if condition that would compile to > a conditional store on sensible architectures, so losing the conditional > probably isn't worth worrying about. > > Cc'd Jens for his opinion with the block patch I just committed this one earlier today: http://git.kernel.dk/?p=linux-block.git;a=commit;h=f6be4fb4bcb396fc3b1c134b7863351972de081f since I ran into the same thing on nvme. Either approach is fine with me, as they both allow override of the timeout before insertion. But we've always done the rq->timeout = 0 init, so I think we should just reinstate that behavior. -- Jens Axboe ^ permalink raw reply [flat|nested] 25+ messages in thread
* RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-06-06 18:22 ` Jens Axboe @ 2014-06-20 21:36 ` KY Srinivasan 2014-07-17 23:53 ` KY Srinivasan 0 siblings, 1 reply; 25+ messages in thread From: KY Srinivasan @ 2014-06-20 21:36 UTC (permalink / raw) To: Jens Axboe, James Bottomley, michaelc@cs.wisc.edu Cc: linux-kernel@vger.kernel.org, hch@infradead.org, devel@linuxdriverproject.org, apw@canonical.com, linux-scsi@vger.kernel.org, ohering@suse.com, gregkh@linuxfoundation.org, jasowang@redhat.com > -----Original Message----- > From: Jens Axboe [mailto:axboe@kernel.dk] > Sent: Friday, June 6, 2014 11:23 AM > To: James Bottomley; michaelc@cs.wisc.edu > Cc: linux-kernel@vger.kernel.org; hch@infradead.org; > devel@linuxdriverproject.org; apw@canonical.com; KY Srinivasan; linux- > scsi@vger.kernel.org; ohering@suse.com; gregkh@linuxfoundation.org; > jasowang@redhat.com > Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT > from the basic I/O timeout > > On 2014-06-06 11:52, James Bottomley wrote: > > On Fri, 2014-06-06 at 12:18 -0500, Mike Christie wrote: > >> On 6/5/14, 9:53 PM, KY Srinivasan wrote: > >>> > >>> > >>>> -----Original Message----- > >>>> From: Mike Christie [mailto:michaelc@cs.wisc.edu] > >>>> Sent: Thursday, June 5, 2014 6:33 PM > >>>> To: KY Srinivasan > >>>> Cc: James Bottomley; linux-kernel@vger.kernel.org; > >>>> apw@canonical.com; devel@linuxdriverproject.org; > hch@infradead.org; > >>>> linux- scsi@vger.kernel.org; ohering@suse.com; > >>>> gregkh@linuxfoundation.org; jasowang@redhat.com > >>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the > >>>> FLUSH_TIMEOUT from the basic I/O timeout > >>>> > >>>> On 06/04/2014 12:15 PM, KY Srinivasan wrote: > >>>>> > >>>>> > >>>>>> -----Original Message----- > >>>>>> From: James Bottomley [mailto:jbottomley@parallels.com] > >>>>>> Sent: Wednesday, June 4, 2014 10:02 AM > >>>>>> To: KY Srinivasan > >>>>>> Cc: linux-kernel@vger.kernel.org; apw@canonical.com; > >>>>>> devel@linuxdriverproject.org; hch@infradead.org; linux- > >>>>>> scsi@vger.kernel.org; ohering@suse.com; > >>>>>> gregkh@linuxfoundation.org; jasowang@redhat.com > >>>>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the > >>>>>> FLUSH_TIMEOUT from the basic I/O timeout > >>>>>> > >>>>>> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote: > >>>>>>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added > code > >>>>>>> to derive the FLUSH_TIMEOUT from the basic I/O timeout. > However, > >>>>>>> this patch did not use the basic I/O timeout of the device. Fix this > bug. > >>>>>>> > >>>>>>> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> > >>>>>>> --- > >>>>>>> drivers/scsi/sd.c | 4 +++- > >>>>>>> 1 files changed, 3 insertions(+), 1 deletions(-) > >>>>>>> > >>>>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index > >>>>>>> e9689d5..54150b1 100644 > >>>>>>> --- a/drivers/scsi/sd.c > >>>>>>> +++ b/drivers/scsi/sd.c > >>>>>>> @@ -832,7 +832,9 @@ static int > sd_setup_write_same_cmnd(struct > >>>>>>> scsi_device *sdp, struct request *rq) > >>>>>>> > >>>>>>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, > >>>>>>> struct request *rq) { > >>>>>>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; > >>>>>>> + int timeout = sdp->request_queue->rq_timeout; > >>>>>>> + > >>>>>>> + rq->timeout = (timeout * > SD_FLUSH_TIMEOUT_MULTIPLIER); > >>>>>> > >>>>>> Could you share where you found this to be a problem? It looks > >>>>>> like a bug in block because all inbound requests being prepared > >>>>>> should have a timeout set, so block would be the place to fix it. > >>>>> > >>>>> Perhaps; what I found was that the value in rq->timeout was 0 > >>>>> coming into this function and thus multiplying obviously has no effect. > >>>>> > >>>> > >>>> I think you are right. We hit this problem because we are doing: > >>>> > >>>> scsi_request_fn -> blk_peek_request -> sd_prep_fn -> > >>>> scsi_setup_flush_cmnd. > >>>> > >>>> At this time request->timeout is zero so the multiplication does > >>>> nothing. See how sd_setup_write_same_cmnd will set the request- > >timeout at this time. > >>>> > >>>> Then in scsi_request_fn we do: > >>>> > >>>> scsi_request_fn -> blk_start_request -> blk_add_timer. > >>>> > >>>> At this time it will set the request->timeout if something like req > >>>> block pc users (like scsi_execute() or block/scsi_ioctl.c) or the > >>>> write same code mentioned above have not set the timeout. > >>> > >>> I don't think this is a recent change. Prior to this commit, we were > >>> setting the timeout value in this function; it just happened to be a > >>> different constant unrelated to the I/O timeout. > >>> > >> > >> Yeah, it looks like when 7e660100d85af860e7ad763202fff717adcdaacd was > >> merged we were supposed to initialize it like in your patch in this thread. > >> > >> I guess we could do your patch in this thread, or if we want the > >> block layer to initialize the timeout before the prep_fn callout is > >> called then we would need to have the blk-flush.c code to that when > >> it sets up the request. If we do the latter, do we want the discard > >> and write same code to initialize the request's timeout before the > >> prep_fn callout is called too? > > > > I looked through the call chain; it seems to be intentional behaviour > > on the part of block. Just from an mq point of view, it would make > > better code if we unconditionally initialised rq->timeout early and > > allowed prep to modify it and then dumped the if(!req->timeout) in > > blk_add_timer(), but it's a marginal if condition that would compile > > to a conditional store on sensible architectures, so losing the > > conditional probably isn't worth worrying about. > > > > Cc'd Jens for his opinion with the block patch > > I just committed this one earlier today: > > http://git.kernel.dk/?p=linux- > block.git;a=commit;h=f6be4fb4bcb396fc3b1c134b7863351972de081f > > since I ran into the same thing on nvme. Either approach is fine with me, as > they both allow override of the timeout before insertion. But we've always > done the rq->timeout = 0 init, so I think we should just reinstate that > behavior. James, How is this being fixed now. Regards, K. Y ^ permalink raw reply [flat|nested] 25+ messages in thread
* RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-06-20 21:36 ` KY Srinivasan @ 2014-07-17 23:53 ` KY Srinivasan 2014-07-18 0:51 ` Elliott, Robert (Server Storage) 2014-07-18 15:10 ` Christoph Hellwig (hch@infradead.org) 0 siblings, 2 replies; 25+ messages in thread From: KY Srinivasan @ 2014-07-17 23:53 UTC (permalink / raw) To: KY Srinivasan, Jens Axboe, James Bottomley, michaelc@cs.wisc.edu Cc: linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, ohering@suse.com, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org > -----Original Message----- > From: driverdev-devel-bounces@linuxdriverproject.org [mailto:driverdev- > devel-bounces@linuxdriverproject.org] On Behalf Of KY Srinivasan > Sent: Friday, June 20, 2014 2:37 PM > To: Jens Axboe; James Bottomley; michaelc@cs.wisc.edu > Cc: linux-scsi@vger.kernel.org; gregkh@linuxfoundation.org; > jasowang@redhat.com; linux-kernel@vger.kernel.org; ohering@suse.com; > hch@infradead.org; apw@canonical.com; devel@linuxdriverproject.org > Subject: RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT > from the basic I/O timeout > > > > > -----Original Message----- > > From: Jens Axboe [mailto:axboe@kernel.dk] > > Sent: Friday, June 6, 2014 11:23 AM > > To: James Bottomley; michaelc@cs.wisc.edu > > Cc: linux-kernel@vger.kernel.org; hch@infradead.org; > > devel@linuxdriverproject.org; apw@canonical.com; KY Srinivasan; linux- > > scsi@vger.kernel.org; ohering@suse.com; gregkh@linuxfoundation.org; > > jasowang@redhat.com > > Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the > > FLUSH_TIMEOUT from the basic I/O timeout > > > > On 2014-06-06 11:52, James Bottomley wrote: > > > On Fri, 2014-06-06 at 12:18 -0500, Mike Christie wrote: > > >> On 6/5/14, 9:53 PM, KY Srinivasan wrote: > > >>> > > >>> > > >>>> -----Original Message----- > > >>>> From: Mike Christie [mailto:michaelc@cs.wisc.edu] > > >>>> Sent: Thursday, June 5, 2014 6:33 PM > > >>>> To: KY Srinivasan > > >>>> Cc: James Bottomley; linux-kernel@vger.kernel.org; > > >>>> apw@canonical.com; devel@linuxdriverproject.org; > > hch@infradead.org; > > >>>> linux- scsi@vger.kernel.org; ohering@suse.com; > > >>>> gregkh@linuxfoundation.org; jasowang@redhat.com > > >>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the > > >>>> FLUSH_TIMEOUT from the basic I/O timeout > > >>>> > > >>>> On 06/04/2014 12:15 PM, KY Srinivasan wrote: > > >>>>> > > >>>>> > > >>>>>> -----Original Message----- > > >>>>>> From: James Bottomley [mailto:jbottomley@parallels.com] > > >>>>>> Sent: Wednesday, June 4, 2014 10:02 AM > > >>>>>> To: KY Srinivasan > > >>>>>> Cc: linux-kernel@vger.kernel.org; apw@canonical.com; > > >>>>>> devel@linuxdriverproject.org; hch@infradead.org; linux- > > >>>>>> scsi@vger.kernel.org; ohering@suse.com; > > >>>>>> gregkh@linuxfoundation.org; jasowang@redhat.com > > >>>>>> Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the > > >>>>>> FLUSH_TIMEOUT from the basic I/O timeout > > >>>>>> > > >>>>>> On Wed, 2014-06-04 at 09:33 -0700, K. Y. Srinivasan wrote: > > >>>>>>> Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added > > code > > >>>>>>> to derive the FLUSH_TIMEOUT from the basic I/O timeout. > > However, > > >>>>>>> this patch did not use the basic I/O timeout of the device. > > >>>>>>> Fix this > > bug. > > >>>>>>> > > >>>>>>> Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> > > >>>>>>> --- > > >>>>>>> drivers/scsi/sd.c | 4 +++- > > >>>>>>> 1 files changed, 3 insertions(+), 1 deletions(-) > > >>>>>>> > > >>>>>>> diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index > > >>>>>>> e9689d5..54150b1 100644 > > >>>>>>> --- a/drivers/scsi/sd.c > > >>>>>>> +++ b/drivers/scsi/sd.c > > >>>>>>> @@ -832,7 +832,9 @@ static int > > sd_setup_write_same_cmnd(struct > > >>>>>>> scsi_device *sdp, struct request *rq) > > >>>>>>> > > >>>>>>> static int scsi_setup_flush_cmnd(struct scsi_device *sdp, > > >>>>>>> struct request *rq) { > > >>>>>>> - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; > > >>>>>>> + int timeout = sdp->request_queue->rq_timeout; > > >>>>>>> + > > >>>>>>> + rq->timeout = (timeout * > > SD_FLUSH_TIMEOUT_MULTIPLIER); > > >>>>>> > > >>>>>> Could you share where you found this to be a problem? It looks > > >>>>>> like a bug in block because all inbound requests being prepared > > >>>>>> should have a timeout set, so block would be the place to fix it. > > >>>>> > > >>>>> Perhaps; what I found was that the value in rq->timeout was 0 > > >>>>> coming into this function and thus multiplying obviously has no > effect. > > >>>>> > > >>>> > > >>>> I think you are right. We hit this problem because we are doing: > > >>>> > > >>>> scsi_request_fn -> blk_peek_request -> sd_prep_fn -> > > >>>> scsi_setup_flush_cmnd. > > >>>> > > >>>> At this time request->timeout is zero so the multiplication does > > >>>> nothing. See how sd_setup_write_same_cmnd will set the request- > > >timeout at this time. > > >>>> > > >>>> Then in scsi_request_fn we do: > > >>>> > > >>>> scsi_request_fn -> blk_start_request -> blk_add_timer. > > >>>> > > >>>> At this time it will set the request->timeout if something like > > >>>> req block pc users (like scsi_execute() or block/scsi_ioctl.c) or > > >>>> the write same code mentioned above have not set the timeout. > > >>> > > >>> I don't think this is a recent change. Prior to this commit, we > > >>> were setting the timeout value in this function; it just happened > > >>> to be a different constant unrelated to the I/O timeout. > > >>> > > >> > > >> Yeah, it looks like when 7e660100d85af860e7ad763202fff717adcdaacd > > >> was merged we were supposed to initialize it like in your patch in this > thread. > > >> > > >> I guess we could do your patch in this thread, or if we want the > > >> block layer to initialize the timeout before the prep_fn callout is > > >> called then we would need to have the blk-flush.c code to that when > > >> it sets up the request. If we do the latter, do we want the discard > > >> and write same code to initialize the request's timeout before the > > >> prep_fn callout is called too? > > > > > > I looked through the call chain; it seems to be intentional > > > behaviour on the part of block. Just from an mq point of view, it > > > would make better code if we unconditionally initialised rq->timeout > > > early and allowed prep to modify it and then dumped the > > > if(!req->timeout) in blk_add_timer(), but it's a marginal if > > > condition that would compile to a conditional store on sensible > > > architectures, so losing the conditional probably isn't worth worrying > about. > > > > > > Cc'd Jens for his opinion with the block patch > > > > I just committed this one earlier today: > > > > http://git.kernel.dk/?p=linux- > > block.git;a=commit;h=f6be4fb4bcb396fc3b1c134b7863351972de081f > > > > since I ran into the same thing on nvme. Either approach is fine with > > me, as they both allow override of the timeout before insertion. But > > we've always done the rq->timeout = 0 init, so I think we should just > > reinstate that behavior. > > James, > > How is this being fixed now. > > Regards, > > K. Y I still see this problem. There was talk of fixing it elsewhere. Regards, K. Y > > _______________________________________________ > devel mailing list > devel@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ^ permalink raw reply [flat|nested] 25+ messages in thread
* RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-17 23:53 ` KY Srinivasan @ 2014-07-18 0:51 ` Elliott, Robert (Server Storage) 2014-07-18 15:11 ` Christoph Hellwig (hch@infradead.org) 2014-07-18 15:39 ` James Bottomley 2014-07-18 15:10 ` Christoph Hellwig (hch@infradead.org) 1 sibling, 2 replies; 25+ messages in thread From: Elliott, Robert (Server Storage) @ 2014-07-18 0:51 UTC (permalink / raw) To: KY Srinivasan, Jens Axboe, James Bottomley, michaelc@cs.wisc.edu Cc: linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, ohering@suse.com, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org In sd_sync_cache: rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; Regardless of the baseline for the multiplication, a magic number of 2 is too arbitrary. That might work for an individual drive, but could be far too short for a RAID controller that runs into worst case error handling for the drives to which it is flushing (e.g., if its cache is volatile and the drives all have recoverable errors during writes). That time goes up with a bigger cache and with more fragmented write data in the cache requiring more individual WRITE commands. A better value would be the Recommended Command Timeout field value reported in the REPORT SUPPORTED OPERATION CODES command, if reported by the device server. That is supposed to account for the worst case. For cases where that is not reported, exposing the multiplier in sysfs would let the timeout for simple devices be set to smaller values than complex devices. Also, in both sd_setup_flush_cmnd and sd_sync_cache: cmd->cmnd[0] = SYNCHRONIZE_CACHE; cmd->cmd_len = 10; SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported. --- Rob Elliott HP Server Storage ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-18 0:51 ` Elliott, Robert (Server Storage) @ 2014-07-18 15:11 ` Christoph Hellwig (hch@infradead.org) 2014-07-18 15:39 ` James Bottomley 1 sibling, 0 replies; 25+ messages in thread From: Christoph Hellwig (hch@infradead.org) @ 2014-07-18 15:11 UTC (permalink / raw) To: Elliott, Robert (Server Storage) Cc: Jens Axboe, michaelc@cs.wisc.edu, linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, James Bottomley, ohering@suse.com, Christoph Hellwig (hch@infradead.org), apw@canonical.com, devel@linuxdriverproject.org On Fri, Jul 18, 2014 at 12:51:06AM +0000, Elliott, Robert (Server Storage) wrote: > SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE > CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported. I gues you mean (16) for the last occurance? What's the benefit of using SYNCHRONIZE CACHE (16) if we don't pass a LBA range? ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-18 0:51 ` Elliott, Robert (Server Storage) 2014-07-18 15:11 ` Christoph Hellwig (hch@infradead.org) @ 2014-07-18 15:39 ` James Bottomley 2014-07-18 17:17 ` Elliott, Robert (Server Storage) 1 sibling, 1 reply; 25+ messages in thread From: James Bottomley @ 2014-07-18 15:39 UTC (permalink / raw) To: Elliott@hp.com Cc: axboe@kernel.dk, michaelc@cs.wisc.edu, linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, ohering@suse.com, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org On Fri, 2014-07-18 at 00:51 +0000, Elliott, Robert (Server Storage) wrote: > In sd_sync_cache: > rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; > > Regardless of the baseline for the multiplication, a magic > number of 2 is too arbitrary. That might work for an > individual drive, but could be far too short for a RAID > controller that runs into worst case error handling for > the drives to which it is flushing (e.g., if its cache > is volatile and the drives all have recoverable errors > during writes). That time goes up with a bigger cache and > with more fragmented write data in the cache requiring > more individual WRITE commands. RAID devices with gigabytes of cache are usually battery backed and lie about their cache type (they usually claim write through). This behaviour is exactly what we want because the flush is used to enforce write barriers for filesystem consistency, so we only want to flush volatile caches. The rare case you cite, where the RAID device is volatile and having difficulty flushing, we do want a failure because otherwise the filesystem will become unusable and the system will live lock ... barriers are sent down frequently under normal filesystem operation. The SCSI standards committees have been struggling with defining and detecting the difference between volatile and non-volatile cache and the heuristics we'd have to use to avoid annoying USB devices with detecting this don't look pretty. We always zero the SYNC_NV bit anyway, so even if the devices stopped lying, we'd be safe. > A better value would be the Recommended Command Timeout field > value reported in the REPORT SUPPORTED OPERATION CODES command, > if reported by the device server. That is supposed to account > for the worst case. > > For cases where that is not reported, exposing the multiplier > in sysfs would let the timeout for simple devices be set to > smaller values than complex devices. > > Also, in both sd_setup_flush_cmnd and sd_sync_cache: > cmd->cmnd[0] = SYNCHRONIZE_CACHE; > cmd->cmd_len = 10; > > SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE > CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported. For what reason. We usually go for the safe alternatives, which is 10 byte commands because they have the widest testing and greatest level of support. We don't do range flushes currently, so there doesn't seem to be a practical different. If we did support range flushes, we'd likely only use SC(16) on >2TB devices. James ^ permalink raw reply [flat|nested] 25+ messages in thread
* RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-18 15:39 ` James Bottomley @ 2014-07-18 17:17 ` Elliott, Robert (Server Storage) 2014-07-18 17:41 ` James Bottomley 0 siblings, 1 reply; 25+ messages in thread From: Elliott, Robert (Server Storage) @ 2014-07-18 17:17 UTC (permalink / raw) To: James Bottomley Cc: axboe@kernel.dk, michaelc@cs.wisc.edu, linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, ohering@suse.com, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org > From: James Bottomley [mailto:jbottomley@parallels.com] > > On Fri, 2014-07-18 at 00:51 +0000, Elliott, Robert (Server Storage) > wrote: ... > > > > Also, in both sd_setup_flush_cmnd and sd_sync_cache: > > cmd->cmnd[0] = SYNCHRONIZE_CACHE; > > cmd->cmd_len = 10; > > > > SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE > > CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported. (sorry - meant "unless ... 16 is not supported") > For what reason. We usually go for the safe alternatives, which is 10 > byte commands because they have the widest testing and greatest level of > support. We don't do range flushes currently, so there doesn't seem to > be a practical different. If we did support range flushes, we'd likely > only use SC(16) on >2TB devices. > > James A goal of the simplified SCSI feature set idea is to drop all the short CDBs that have larger, more capable equivalents - don't carry READ 6/10/12/16 and SYNCHRONIZE CACHE 10/16, just keep the 16-byte versions. With modern serial IU-based protocols, short CDBs don't save any transfer time. This will simplify design and testing on both initiator and target sides. Competing command sets like NVMe rightly point out that SCSI has too much legacy baggage - all you need for IO is one READ, one WRITE, and one FLUSH command. That's why SBC-3 ended up with these warning notes for all the non-16 byte CDBs: NOTE 15 - Migration from the SYNCHRONIZE CACHE (10) command to the SYNCHRONIZE CACHE (16) command is recommended for all implementations. If the LBA field in SYNCHRONIZE CACHE went obsolete, then maybe SYNCHRONIZE CACHE (10) would be kept instead of (16), but that field is still present. So, (16) is the likely survivor. --- Rob Elliott HP Server Storage ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-18 17:17 ` Elliott, Robert (Server Storage) @ 2014-07-18 17:41 ` James Bottomley 2014-07-18 18:16 ` Douglas Gilbert 0 siblings, 1 reply; 25+ messages in thread From: James Bottomley @ 2014-07-18 17:41 UTC (permalink / raw) To: Elliott@hp.com Cc: axboe@kernel.dk, michaelc@cs.wisc.edu, linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, ohering@suse.com, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org On Fri, 2014-07-18 at 17:17 +0000, Elliott, Robert (Server Storage) wrote: > > > > From: James Bottomley [mailto:jbottomley@parallels.com] > > > > On Fri, 2014-07-18 at 00:51 +0000, Elliott, Robert (Server Storage) > > wrote: > ... > > > > > > Also, in both sd_setup_flush_cmnd and sd_sync_cache: > > > cmd->cmnd[0] = SYNCHRONIZE_CACHE; > > > cmd->cmd_len = 10; > > > > > > SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE > > > CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported. > > (sorry - meant "unless ... 16 is not supported") Yes, I guessed that? > > For what reason. We usually go for the safe alternatives, which is 10 > > byte commands because they have the widest testing and greatest level of > > support. We don't do range flushes currently, so there doesn't seem to > > be a practical different. If we did support range flushes, we'd likely > > only use SC(16) on >2TB devices. > > > > James > > A goal of the simplified SCSI feature set idea is to drop all the > short CDBs that have larger, more capable equivalents - don't carry > READ 6/10/12/16 and SYNCHRONIZE CACHE 10/16, just keep the 16-byte > versions. With modern serial IU-based protocols, short CDBs don't > save any transfer time. This will simplify design and testing on > both initiator and target sides. Competing command sets like NVMe > rightly point out that SCSI has too much legacy baggage - all you > need for IO is one READ, one WRITE, and one FLUSH command. But that's not relevant to us. This is the problem of practical vs standards approaches. We have to support older and buggy devices. Most small USB storage sticks die if they see 16 byte CDB commands because their interpreters. The more "legacy baggage" the standards committee dumps, the greater the number of heuristics OSs have to have to cope with the plethora of odd devices. > That's why SBC-3 ended up with these warning notes for all the > non-16 byte CDBs: > > NOTE 15 - Migration from the SYNCHRONIZE CACHE (10) command to > the SYNCHRONIZE CACHE (16) command is recommended for all > implementations. > > If the LBA field in SYNCHRONIZE CACHE went obsolete, then maybe > SYNCHRONIZE CACHE (10) would be kept instead of (16), but that > field is still present. So, (16) is the likely survivor. OK, but look at it from our point of view: The reply above justifies why we prefer 10 byte CDBs over 16. If the standards body ever did remove SC(10) completely, then we'd have to have yet another heuristic to recognise devices that don't support SC(10), but until that day, using SC(10) alone works in all cases, so is the better path for the OS. If you could, please get the standards body to recognise that the more command churn they introduce (in the name of rationalisation or whatever), the more problems they introduce for Operating Systems and the more likelihood (because of different people reading different revisions of standards) that we end up with compliance bugs in devices. James ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-18 17:41 ` James Bottomley @ 2014-07-18 18:16 ` Douglas Gilbert 0 siblings, 0 replies; 25+ messages in thread From: Douglas Gilbert @ 2014-07-18 18:16 UTC (permalink / raw) To: James Bottomley, Elliott@hp.com Cc: linux-kernel@vger.kernel.org, hch@infradead.org, devel@linuxdriverproject.org, apw@canonical.com, michaelc@cs.wisc.edu, kys@microsoft.com, axboe@kernel.dk, linux-scsi@vger.kernel.org, ohering@suse.com, gregkh@linuxfoundation.org, jasowang@redhat.com On 14-07-18 07:41 AM, James Bottomley wrote: > On Fri, 2014-07-18 at 17:17 +0000, Elliott, Robert (Server Storage) > wrote: >> >> >>> From: James Bottomley [mailto:jbottomley@parallels.com] >>> >>> On Fri, 2014-07-18 at 00:51 +0000, Elliott, Robert (Server Storage) >>> wrote: >> ... >>>> >>>> Also, in both sd_setup_flush_cmnd and sd_sync_cache: >>>> cmd->cmnd[0] = SYNCHRONIZE_CACHE; >>>> cmd->cmd_len = 10; >>>> >>>> SYNCHRONIZE CACHE (16) should be favored over SYNCHRONIZE >>>> CACHE (10) unless SYNCHRONIZE CACHE (10) is not supported. >> >> (sorry - meant "unless ... 16 is not supported") > > Yes, I guessed that? > >>> For what reason. We usually go for the safe alternatives, which is 10 >>> byte commands because they have the widest testing and greatest level of >>> support. We don't do range flushes currently, so there doesn't seem to >>> be a practical different. If we did support range flushes, we'd likely >>> only use SC(16) on >2TB devices. >>> >>> James >> >> A goal of the simplified SCSI feature set idea is to drop all the >> short CDBs that have larger, more capable equivalents - don't carry >> READ 6/10/12/16 and SYNCHRONIZE CACHE 10/16, just keep the 16-byte >> versions. With modern serial IU-based protocols, short CDBs don't >> save any transfer time. This will simplify design and testing on >> both initiator and target sides. Competing command sets like NVMe >> rightly point out that SCSI has too much legacy baggage - all you >> need for IO is one READ, one WRITE, and one FLUSH command. > > But that's not relevant to us. This is the problem of practical vs > standards approaches. We have to support older and buggy devices. Most > small USB storage sticks die if they see 16 byte CDB commands because > their interpreters. The more "legacy baggage" the standards committee > dumps, the greater the number of heuristics OSs have to have to cope > with the plethora of odd devices. > >> That's why SBC-3 ended up with these warning notes for all the >> non-16 byte CDBs: >> >> NOTE 15 - Migration from the SYNCHRONIZE CACHE (10) command to >> the SYNCHRONIZE CACHE (16) command is recommended for all >> implementations. >> >> If the LBA field in SYNCHRONIZE CACHE went obsolete, then maybe >> SYNCHRONIZE CACHE (10) would be kept instead of (16), but that >> field is still present. So, (16) is the likely survivor. > > OK, but look at it from our point of view: The reply above justifies > why we prefer 10 byte CDBs over 16. If the standards body ever did > remove SC(10) completely, then we'd have to have yet another heuristic > to recognise devices that don't support SC(10), but until that day, > using SC(10) alone works in all cases, so is the better path for the OS. > > If you could, please get the standards body to recognise that the more > command churn they introduce (in the name of rationalisation or > whatever), the more problems they introduce for Operating Systems and > the more likelihood (because of different people reading different > revisions of standards) that we end up with compliance bugs in devices. From the term: "feature sets" I'm guessing T10 will follow what T13 does and have something like a VPD page with descriptors of feature sets supported. Each set has mandatory and optional commands, perhaps a similar categorization of mode and VPD pages as well. Such a "clean slate" for SCSI would make it simpler in the future, at least for what to put on the fast path. Perhaps some legacy support could be pushed to the user space. For many technical areas "legacy" is a derogatory term, but not necessarily for storage! Doug Gilbert ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-17 23:53 ` KY Srinivasan 2014-07-18 0:51 ` Elliott, Robert (Server Storage) @ 2014-07-18 15:10 ` Christoph Hellwig (hch@infradead.org) 2014-07-18 16:44 ` KY Srinivasan 1 sibling, 1 reply; 25+ messages in thread From: Christoph Hellwig (hch@infradead.org) @ 2014-07-18 15:10 UTC (permalink / raw) To: KY Srinivasan Cc: Jens Axboe, michaelc@cs.wisc.edu, linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, James Bottomley, ohering@suse.com, Christoph Hellwig (hch@infradead.org), apw@canonical.com, devel@linuxdriverproject.org [-- Attachment #1: Type: text/plain, Size: 365 bytes --] On Thu, Jul 17, 2014 at 11:53:33PM +0000, KY Srinivasan wrote: > I still see this problem. There was talk of fixing it elsewhere. Well, what we have right not is entirely broken, given that the block layer doesn't initialize ->timeout on TYPE_FS requeuests. We either need to revert that initial commit or apply something like the attached patch as a quick fix. [-- Attachment #2: 0001-sd-set-a-non-zero-timeout-for-flush-requests.patch --] [-- Type: text/plain, Size: 836 bytes --] >From ecbf154d15f4022676219b9ff90e542d1db64392 Mon Sep 17 00:00:00 2001 From: Christoph Hellwig <hch@lst.de> Date: Fri, 18 Jul 2014 17:11:27 +0200 Subject: sd: set a non-zero timeout for flush requests rq->timeout for TYPE_FS commands needs to be initialized by the driver, so we can't simply multiply it. Signed-off-by: Christoph Hellwig <hch@lst.de> --- drivers/scsi/sd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 377a520..bef4e78 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -880,7 +880,7 @@ static int sd_setup_flush_cmnd(struct scsi_cmnd *cmd) cmd->transfersize = 0; cmd->allowed = SD_MAX_RETRIES; - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; + rq->timeout = SD_TIMEOUT * SD_FLUSH_TIMEOUT_MULTIPLIER; return BLKPREP_OK; } -- 1.9.1 [-- Attachment #3: Type: text/plain, Size: 169 bytes --] _______________________________________________ devel mailing list devel@linuxdriverproject.org http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel ^ permalink raw reply related [flat|nested] 25+ messages in thread
* RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-18 15:10 ` Christoph Hellwig (hch@infradead.org) @ 2014-07-18 16:44 ` KY Srinivasan 2014-07-18 16:57 ` James Bottomley 0 siblings, 1 reply; 25+ messages in thread From: KY Srinivasan @ 2014-07-18 16:44 UTC (permalink / raw) To: Christoph Hellwig (hch@infradead.org) Cc: Jens Axboe, michaelc@cs.wisc.edu, linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, James Bottomley, ohering@suse.com, apw@canonical.com, devel@linuxdriverproject.org > -----Original Message----- > From: Christoph Hellwig (hch@infradead.org) [mailto:hch@infradead.org] > Sent: Friday, July 18, 2014 8:11 AM > To: KY Srinivasan > Cc: Jens Axboe; James Bottomley; michaelc@cs.wisc.edu; Christoph Hellwig > (hch@infradead.org); linux-scsi@vger.kernel.org; > gregkh@linuxfoundation.org; jasowang@redhat.com; linux- > kernel@vger.kernel.org; ohering@suse.com; apw@canonical.com; > devel@linuxdriverproject.org > Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT > from the basic I/O timeout > > On Thu, Jul 17, 2014 at 11:53:33PM +0000, KY Srinivasan wrote: > > I still see this problem. There was talk of fixing it elsewhere. > > Well, what we have right not is entirely broken, given that the block layer > doesn't initialize ->timeout on TYPE_FS requeuests. > > We either need to revert that initial commit or apply something like the > attached patch as a quick fix. I had sent this exact patch sometime back: http://www.spinics.net/lists/linux-scsi/msg75385.html K. Y ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-18 16:44 ` KY Srinivasan @ 2014-07-18 16:57 ` James Bottomley 2014-07-18 17:01 ` hch 2014-07-18 17:05 ` KY Srinivasan 0 siblings, 2 replies; 25+ messages in thread From: James Bottomley @ 2014-07-18 16:57 UTC (permalink / raw) To: kys@microsoft.com Cc: axboe@kernel.dk, michaelc@cs.wisc.edu, linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, ohering@suse.com, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org On Fri, 2014-07-18 at 16:44 +0000, KY Srinivasan wrote: > > > -----Original Message----- > > From: Christoph Hellwig (hch@infradead.org) [mailto:hch@infradead.org] > > Sent: Friday, July 18, 2014 8:11 AM > > To: KY Srinivasan > > Cc: Jens Axboe; James Bottomley; michaelc@cs.wisc.edu; Christoph Hellwig > > (hch@infradead.org); linux-scsi@vger.kernel.org; > > gregkh@linuxfoundation.org; jasowang@redhat.com; linux- > > kernel@vger.kernel.org; ohering@suse.com; apw@canonical.com; > > devel@linuxdriverproject.org > > Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT > > from the basic I/O timeout > > > > On Thu, Jul 17, 2014 at 11:53:33PM +0000, KY Srinivasan wrote: > > > I still see this problem. There was talk of fixing it elsewhere. > > > > Well, what we have right not is entirely broken, given that the block layer > > doesn't initialize ->timeout on TYPE_FS requeuests. > > > > We either need to revert that initial commit or apply something like the > > attached patch as a quick fix. > I had sent this exact patch sometime back: > > http://www.spinics.net/lists/linux-scsi/msg75385.html Actually, no you didn't. The difference is in the derivation of the timeout. Christoph's patch is absolute in terms of SD_TIMEOUT; yours is relative to the queue timeout setting ... I thought there was a reason for preferring the relative version. James ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-18 16:57 ` James Bottomley @ 2014-07-18 17:01 ` hch 2014-07-18 17:05 ` KY Srinivasan 1 sibling, 0 replies; 25+ messages in thread From: hch @ 2014-07-18 17:01 UTC (permalink / raw) To: James Bottomley Cc: axboe@kernel.dk, michaelc@cs.wisc.edu, linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, ohering@suse.com, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org On Fri, Jul 18, 2014 at 04:57:13PM +0000, James Bottomley wrote: > Actually, no you didn't. The difference is in the derivation of the > timeout. Christoph's patch is absolute in terms of SD_TIMEOUT; yours is > relative to the queue timeout setting ... I thought there was a reason > for preferring the relative version. Yes, KYs version is better. It takes the base timeout drivers set on the request queue into account. ^ permalink raw reply [flat|nested] 25+ messages in thread
* RE: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-18 16:57 ` James Bottomley 2014-07-18 17:01 ` hch @ 2014-07-18 17:05 ` KY Srinivasan 2014-07-18 17:12 ` hch 1 sibling, 1 reply; 25+ messages in thread From: KY Srinivasan @ 2014-07-18 17:05 UTC (permalink / raw) To: James Bottomley Cc: axboe@kernel.dk, michaelc@cs.wisc.edu, linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, ohering@suse.com, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org > -----Original Message----- > From: James Bottomley [mailto:jbottomley@parallels.com] > Sent: Friday, July 18, 2014 9:57 AM > To: KY Srinivasan > Cc: linux-kernel@vger.kernel.org; hch@infradead.org; apw@canonical.com; > devel@linuxdriverproject.org; michaelc@cs.wisc.edu; axboe@kernel.dk; > linux-scsi@vger.kernel.org; ohering@suse.com; > gregkh@linuxfoundation.org; jasowang@redhat.com > Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT > from the basic I/O timeout > > On Fri, 2014-07-18 at 16:44 +0000, KY Srinivasan wrote: > > > > > -----Original Message----- > > > From: Christoph Hellwig (hch@infradead.org) > > > [mailto:hch@infradead.org] > > > Sent: Friday, July 18, 2014 8:11 AM > > > To: KY Srinivasan > > > Cc: Jens Axboe; James Bottomley; michaelc@cs.wisc.edu; Christoph > > > Hellwig (hch@infradead.org); linux-scsi@vger.kernel.org; > > > gregkh@linuxfoundation.org; jasowang@redhat.com; linux- > > > kernel@vger.kernel.org; ohering@suse.com; apw@canonical.com; > > > devel@linuxdriverproject.org > > > Subject: Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the > > > FLUSH_TIMEOUT from the basic I/O timeout > > > > > > On Thu, Jul 17, 2014 at 11:53:33PM +0000, KY Srinivasan wrote: > > > > I still see this problem. There was talk of fixing it elsewhere. > > > > > > Well, what we have right not is entirely broken, given that the > > > block layer doesn't initialize ->timeout on TYPE_FS requeuests. > > > > > > We either need to revert that initial commit or apply something like > > > the attached patch as a quick fix. > > I had sent this exact patch sometime back: > > > > http://www.spinics.net/lists/linux-scsi/msg75385.html > > Actually, no you didn't. The difference is in the derivation of the timeout. > Christoph's patch is absolute in terms of SD_TIMEOUT; yours is relative to the > queue timeout setting ... I thought there was a reason for preferring the > relative version. You are right; sorry about that. I think my version is better since we do want to base the flush timeout relative to the basic timeout. K. Y > > James ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-18 17:05 ` KY Srinivasan @ 2014-07-18 17:12 ` hch 2014-07-18 17:15 ` hch 0 siblings, 1 reply; 25+ messages in thread From: hch @ 2014-07-18 17:12 UTC (permalink / raw) To: KY Srinivasan Cc: James Bottomley, linux-kernel@vger.kernel.org, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org, michaelc@cs.wisc.edu, axboe@kernel.dk, linux-scsi@vger.kernel.org, ohering@suse.com, gregkh@linuxfoundation.org, jasowang@redhat.com This is what I plan to put in after it passes basic testing: --- >From bb617c9465b839d70ecbbc69002a20ccf5f935bd Mon Sep 17 00:00:00 2001 From: "K. Y. Srinivasan" <kys@microsoft.com> Date: Fri, 18 Jul 2014 19:12:58 +0200 Subject: sd: fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the basic I/O timeout of the device. Fix this bug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Christoph Hellwig <hch@lst.de> --- drivers/scsi/sd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index bef4e78..9ffb393 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -880,7 +880,7 @@ static int sd_setup_flush_cmnd(struct scsi_cmnd *cmd) cmd->transfersize = 0; cmd->allowed = SD_MAX_RETRIES; - rq->timeout = SD_TIMEOUT * SD_FLUSH_TIMEOUT_MULTIPLIER; + rq->timeout = rq->q->rq_timeout * SD_FLUSH_TIMEOUT_MULTIPLIER; return BLKPREP_OK; } -- 1.9.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-18 17:12 ` hch @ 2014-07-18 17:15 ` hch 0 siblings, 0 replies; 25+ messages in thread From: hch @ 2014-07-18 17:15 UTC (permalink / raw) To: KY Srinivasan Cc: axboe@kernel.dk, michaelc@cs.wisc.edu, linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, James Bottomley, ohering@suse.com, hch@infradead.org, apw@canonical.com, devel@linuxdriverproject.org On Fri, Jul 18, 2014 at 10:12:38AM -0700, hch@infradead.org wrote: > This is what I plan to put in after it passes basic testing: And that one was on top of my previous version. One that applies against core-for-3.17 below: --- >From 8a79783e5f72ec034a724e16c1f46604bd97bf68 Mon Sep 17 00:00:00 2001 From: "K. Y. Srinivasan" <kys@microsoft.com> Date: Fri, 18 Jul 2014 17:11:27 +0200 Subject: sd: fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the basic I/O timeout of the device. Fix this bug. Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Reviewed-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Christoph Hellwig <hch@lst.de> --- drivers/scsi/sd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c index 377a520..9ffb393 100644 --- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -880,7 +880,7 @@ static int sd_setup_flush_cmnd(struct scsi_cmnd *cmd) cmd->transfersize = 0; cmd->allowed = SD_MAX_RETRIES; - rq->timeout *= SD_FLUSH_TIMEOUT_MULTIPLIER; + rq->timeout = rq->q->rq_timeout * SD_FLUSH_TIMEOUT_MULTIPLIER; return BLKPREP_OK; } -- 1.9.1 ^ permalink raw reply related [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-06-04 16:33 [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout K. Y. Srinivasan 2014-06-04 17:02 ` James Bottomley @ 2014-07-18 17:00 ` Christoph Hellwig 2014-07-18 17:03 ` James Bottomley 1 sibling, 1 reply; 25+ messages in thread From: Christoph Hellwig @ 2014-07-18 17:00 UTC (permalink / raw) To: K. Y. Srinivasan Cc: linux-scsi, gregkh, jasowang, ohering, jbottomley, linux-kernel, hch, apw, devel On Wed, Jun 04, 2014 at 09:33:43AM -0700, K. Y. Srinivasan wrote: > Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the > FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the > basic I/O timeout of the device. Fix this bug. > > Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> Looks good to me, Reviewed-by: Christoph Hellwig <hch@lst.de> (it will need some changes to apply to my tree, but I'm happy to do that if I can get another review). James, does this look fine to you? ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout 2014-07-18 17:00 ` Christoph Hellwig @ 2014-07-18 17:03 ` James Bottomley 0 siblings, 0 replies; 25+ messages in thread From: James Bottomley @ 2014-07-18 17:03 UTC (permalink / raw) To: hch@infradead.org Cc: linux-scsi@vger.kernel.org, gregkh@linuxfoundation.org, jasowang@redhat.com, linux-kernel@vger.kernel.org, ohering@suse.com, apw@canonical.com, devel@linuxdriverproject.org On Fri, 2014-07-18 at 10:00 -0700, Christoph Hellwig wrote: > On Wed, Jun 04, 2014 at 09:33:43AM -0700, K. Y. Srinivasan wrote: > > Commit ID: 7e660100d85af860e7ad763202fff717adcdaacd added code to derive the > > FLUSH_TIMEOUT from the basic I/O timeout. However, this patch did not use the > > basic I/O timeout of the device. Fix this bug. > > > > Signed-off-by: K. Y. Srinivasan <kys@microsoft.com> > > Looks good to me, > > Reviewed-by: Christoph Hellwig <hch@lst.de> > > (it will need some changes to apply to my tree, but I'm happy to do that > if I can get another review). > > James, does this look fine to you? Yes: Reviewed-by: James Bottomley <JBottomley@Parallels.com> James ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2014-07-18 18:16 UTC | newest] Thread overview: 25+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-06-04 16:33 [PATCH 1/1] [SCSI] Fix a bug in deriving the FLUSH_TIMEOUT from the basic I/O timeout K. Y. Srinivasan 2014-06-04 17:02 ` James Bottomley 2014-06-04 17:15 ` KY Srinivasan 2014-06-06 1:32 ` Mike Christie 2014-06-06 2:53 ` KY Srinivasan 2014-06-06 17:18 ` Mike Christie 2014-06-06 17:52 ` James Bottomley 2014-06-06 18:22 ` Jens Axboe 2014-06-20 21:36 ` KY Srinivasan 2014-07-17 23:53 ` KY Srinivasan 2014-07-18 0:51 ` Elliott, Robert (Server Storage) 2014-07-18 15:11 ` Christoph Hellwig (hch@infradead.org) 2014-07-18 15:39 ` James Bottomley 2014-07-18 17:17 ` Elliott, Robert (Server Storage) 2014-07-18 17:41 ` James Bottomley 2014-07-18 18:16 ` Douglas Gilbert 2014-07-18 15:10 ` Christoph Hellwig (hch@infradead.org) 2014-07-18 16:44 ` KY Srinivasan 2014-07-18 16:57 ` James Bottomley 2014-07-18 17:01 ` hch 2014-07-18 17:05 ` KY Srinivasan 2014-07-18 17:12 ` hch 2014-07-18 17:15 ` hch 2014-07-18 17:00 ` Christoph Hellwig 2014-07-18 17:03 ` James Bottomley
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).