* [PATCH] block: add global timeout to requests
@ 2009-03-18 20:21 James Bottomley
2009-03-24 11:42 ` Jens Axboe
0 siblings, 1 reply; 4+ messages in thread
From: James Bottomley @ 2009-03-18 20:21 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-scsi
We can eliminate the SCSI command timed out check entirely if the block
layer does this for us. The way to do this in block is to check how
long the request has been outstanding if a requeue is requested and
ending it if we've gone over retries * timeout.
This will also eliminate many cases in SCSI where we evade the command
timeout for various reasons (like initial success converted to requeue)
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
---
Index: linux-2.6/block/blk-core.c
===================================================================
--- linux-2.6.orig/block/blk-core.c 2009-02-19 16:42:00.140427205 -0500
+++ linux-2.6/block/blk-core.c 2009-03-18 10:36:10.511526815 -0400
@@ -937,6 +937,8 @@ EXPORT_SYMBOL(blk_start_queueing);
*/
void blk_requeue_request(struct request_queue *q, struct request *rq)
{
+ unsigned long wait_for = (rq->retries + 1) * rq->timeout;
+
blk_delete_timer(rq);
blk_clear_rq_complete(rq);
trace_block_rq_requeue(q, rq);
@@ -944,7 +946,13 @@ void blk_requeue_request(struct request_
if (blk_rq_tagged(rq))
blk_queue_end_tag(q, rq);
- elv_requeue_request(q, rq);
+ if (time_before(rq->start_time + wait_for, jiffies)) {
+ printk(KERN_ERR "%s: timing out command, waited %lus\n",
+ rq->rq_disk ? rq->rq_disk->disk_name : "?",
+ wait_for/HZ);
+ blk_end_request(rq, -EIO, blk_rq_bytes(rq));
+ } else
+ elv_requeue_request(q, rq);
}
EXPORT_SYMBOL(blk_requeue_request);
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] block: add global timeout to requests
2009-03-18 20:21 [PATCH] block: add global timeout to requests James Bottomley
@ 2009-03-24 11:42 ` Jens Axboe
2009-03-25 7:19 ` Kiyoshi Ueda
0 siblings, 1 reply; 4+ messages in thread
From: Jens Axboe @ 2009-03-24 11:42 UTC (permalink / raw)
To: James Bottomley; +Cc: linux-scsi
On Wed, Mar 18 2009, James Bottomley wrote:
> We can eliminate the SCSI command timed out check entirely if the block
> layer does this for us. The way to do this in block is to check how
> long the request has been outstanding if a requeue is requested and
> ending it if we've gone over retries * timeout.
>
> This will also eliminate many cases in SCSI where we evade the command
> timeout for various reasons (like initial success converted to requeue)
>
> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
>
> ---
>
> Index: linux-2.6/block/blk-core.c
> ===================================================================
> --- linux-2.6.orig/block/blk-core.c 2009-02-19 16:42:00.140427205 -0500
> +++ linux-2.6/block/blk-core.c 2009-03-18 10:36:10.511526815 -0400
> @@ -937,6 +937,8 @@ EXPORT_SYMBOL(blk_start_queueing);
> */
> void blk_requeue_request(struct request_queue *q, struct request *rq)
> {
> + unsigned long wait_for = (rq->retries + 1) * rq->timeout;
> +
> blk_delete_timer(rq);
> blk_clear_rq_complete(rq);
> trace_block_rq_requeue(q, rq);
> @@ -944,7 +946,13 @@ void blk_requeue_request(struct request_
> if (blk_rq_tagged(rq))
> blk_queue_end_tag(q, rq);
>
> - elv_requeue_request(q, rq);
> + if (time_before(rq->start_time + wait_for, jiffies)) {
> + printk(KERN_ERR "%s: timing out command, waited %lus\n",
> + rq->rq_disk ? rq->rq_disk->disk_name : "?",
> + wait_for/HZ);
> + blk_end_request(rq, -EIO, blk_rq_bytes(rq));
> + } else
> + elv_requeue_request(q, rq);
> }
> EXPORT_SYMBOL(blk_requeue_request);
Applied, though I think a
time_after(jiffies, time)
is usually much clearer, since we are checking for an expired event (not
an early one) :-)
--
Jens Axboe
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] block: add global timeout to requests
2009-03-24 11:42 ` Jens Axboe
@ 2009-03-25 7:19 ` Kiyoshi Ueda
2009-03-25 7:36 ` Jens Axboe
0 siblings, 1 reply; 4+ messages in thread
From: Kiyoshi Ueda @ 2009-03-25 7:19 UTC (permalink / raw)
To: Jens Axboe, James Bottomley; +Cc: linux-scsi
Hi Jens, James,
On 2009/03/24 20:42 +0900, Jens Axboe wrote:
> On Wed, Mar 18 2009, James Bottomley wrote:
>> We can eliminate the SCSI command timed out check entirely if the block
>> layer does this for us. The way to do this in block is to check how
>> long the request has been outstanding if a requeue is requested and
>> ending it if we've gone over retries * timeout.
>>
>> This will also eliminate many cases in SCSI where we evade the command
>> timeout for various reasons (like initial success converted to requeue)
>>
>> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
>>
>> ---
>>
>> Index: linux-2.6/block/blk-core.c
>> ===================================================================
>> --- linux-2.6.orig/block/blk-core.c 2009-02-19 16:42:00.140427205 -0500
>> +++ linux-2.6/block/blk-core.c 2009-03-18 10:36:10.511526815 -0400
>> @@ -937,6 +937,8 @@ EXPORT_SYMBOL(blk_start_queueing);
>> */
>> void blk_requeue_request(struct request_queue *q, struct request *rq)
>> {
>> + unsigned long wait_for = (rq->retries + 1) * rq->timeout;
>> +
>> blk_delete_timer(rq);
>> blk_clear_rq_complete(rq);
>> trace_block_rq_requeue(q, rq);
>> @@ -944,7 +946,13 @@ void blk_requeue_request(struct request_
>> if (blk_rq_tagged(rq))
>> blk_queue_end_tag(q, rq);
>>
>> - elv_requeue_request(q, rq);
>> + if (time_before(rq->start_time + wait_for, jiffies)) {
>> + printk(KERN_ERR "%s: timing out command, waited %lus\n",
>> + rq->rq_disk ? rq->rq_disk->disk_name : "?",
>> + wait_for/HZ);
>> + blk_end_request(rq, -EIO, blk_rq_bytes(rq));
>> + } else
>> + elv_requeue_request(q, rq);
>> }
>> EXPORT_SYMBOL(blk_requeue_request);
>
> Applied, though I think a
>
> time_after(jiffies, time)
>
> is usually much clearer, since we are checking for an expired event (not
> an early one) :-)
I have the following concerns on this patch.
o What if the driver doesn't use the generic timeout handling?
This change means that all block drivers must appropriately set
rq->retries and rq->timeout to use blk_requeue_request(),
although there was no such rule.
So some regressions may be caused in drivers which are not using
rq->retries and rq->timeout for unconditional retry.
(I also need some works in the request-based dm patch for this.)
"!q->rq_timed_out_fn" check in addition to the "time_before()" check
may resolve this issue?
o What if the driver doesn't set rq->retries but has its own logic
to decide when to give up retries?
This change means that drivers which don't use rq->retries
must appropriately modify rq->timeout everytime they call
blk_requeue_request() once timeout fires.
o Deadlock will occur on the blk_end_request() call, because
blk_requeue_request() is called with queue_lock held.
__blk_end_request() needs to be used at least, or other ways are
needed (e.g. throw the request to softirq.)
Thanks,
Kiyoshi Ueda
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] block: add global timeout to requests
2009-03-25 7:19 ` Kiyoshi Ueda
@ 2009-03-25 7:36 ` Jens Axboe
0 siblings, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2009-03-25 7:36 UTC (permalink / raw)
To: Kiyoshi Ueda; +Cc: James Bottomley, linux-scsi
On Wed, Mar 25 2009, Kiyoshi Ueda wrote:
> Hi Jens, James,
>
> On 2009/03/24 20:42 +0900, Jens Axboe wrote:
> > On Wed, Mar 18 2009, James Bottomley wrote:
> >> We can eliminate the SCSI command timed out check entirely if the block
> >> layer does this for us. The way to do this in block is to check how
> >> long the request has been outstanding if a requeue is requested and
> >> ending it if we've gone over retries * timeout.
> >>
> >> This will also eliminate many cases in SCSI where we evade the command
> >> timeout for various reasons (like initial success converted to requeue)
> >>
> >> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
> >>
> >> ---
> >>
> >> Index: linux-2.6/block/blk-core.c
> >> ===================================================================
> >> --- linux-2.6.orig/block/blk-core.c 2009-02-19 16:42:00.140427205 -0500
> >> +++ linux-2.6/block/blk-core.c 2009-03-18 10:36:10.511526815 -0400
> >> @@ -937,6 +937,8 @@ EXPORT_SYMBOL(blk_start_queueing);
> >> */
> >> void blk_requeue_request(struct request_queue *q, struct request *rq)
> >> {
> >> + unsigned long wait_for = (rq->retries + 1) * rq->timeout;
> >> +
> >> blk_delete_timer(rq);
> >> blk_clear_rq_complete(rq);
> >> trace_block_rq_requeue(q, rq);
> >> @@ -944,7 +946,13 @@ void blk_requeue_request(struct request_
> >> if (blk_rq_tagged(rq))
> >> blk_queue_end_tag(q, rq);
> >>
> >> - elv_requeue_request(q, rq);
> >> + if (time_before(rq->start_time + wait_for, jiffies)) {
> >> + printk(KERN_ERR "%s: timing out command, waited %lus\n",
> >> + rq->rq_disk ? rq->rq_disk->disk_name : "?",
> >> + wait_for/HZ);
> >> + blk_end_request(rq, -EIO, blk_rq_bytes(rq));
> >> + } else
> >> + elv_requeue_request(q, rq);
> >> }
> >> EXPORT_SYMBOL(blk_requeue_request);
> >
> > Applied, though I think a
> >
> > time_after(jiffies, time)
> >
> > is usually much clearer, since we are checking for an expired event (not
> > an early one) :-)
>
> I have the following concerns on this patch.
>
> o What if the driver doesn't use the generic timeout handling?
> This change means that all block drivers must appropriately set
> rq->retries and rq->timeout to use blk_requeue_request(),
> although there was no such rule.
> So some regressions may be caused in drivers which are not using
> rq->retries and rq->timeout for unconditional retry.
> (I also need some works in the request-based dm patch for this.)
>
> "!q->rq_timed_out_fn" check in addition to the "time_before()" check
> may resolve this issue?
>
> o What if the driver doesn't set rq->retries but has its own logic
> to decide when to give up retries?
> This change means that drivers which don't use rq->retries
> must appropriately modify rq->timeout everytime they call
> blk_requeue_request() once timeout fires.
>
> o Deadlock will occur on the blk_end_request() call, because
> blk_requeue_request() is called with queue_lock held.
> __blk_end_request() needs to be used at least, or other ways are
> needed (e.g. throw the request to softirq.)
Thanks for the review, it is indeed deadlocky and the role with drivers
not using the generic timeout needs to be revisited. I've yanked the
patch for now.
--
Jens Axboe
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2009-03-25 7:36 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-03-18 20:21 [PATCH] block: add global timeout to requests James Bottomley
2009-03-24 11:42 ` Jens Axboe
2009-03-25 7:19 ` Kiyoshi Ueda
2009-03-25 7:36 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).