Re: [PATCH] libata error handling fixes (ATAPI)

public inbox for linux-scsi@vger.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH] libata error handling fixes (ATAPI)
       [not found]     ` <4379B28E.9070708@gmail.com>
@ 2005-11-15 11:02       ` Jeff Garzik
  2005-11-15 12:00         ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Jeff Garzik @ 2005-11-15 11:02 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-ide, lkml, SCSI Mailing List

Tejun Heo wrote:
> My RFC was written *after* the patches to ease the reviewing/integration 
> process.  Well, that was the intention.  I understand your point but I 
> think that EH needs some big changes, so it might be beneficial to 

Agreed, libata EH still needs a lot of work.  ATA device error handling 
could better, ATA bus and PCI bus errors need to be handled a lot 
better, etc.

> establish some consensus such that other developers can do stuff that 
> can be accepted.
> 
>>
>> The current feeling is that we should move away from complex 
>> dependencies on the SCSI EH layer, and do all our own error handling 
>> (perhaps in ata_wq).  This will allow use of libata as a block driver.
>>
> 
> For departure of libata from SCSI, I was thinking more of another more 
> generic block device framework in which libata can live in.  And I 
> thought that it was reasonable to assume that the framework would supply 
> a EH mechanism which supports queue stalling/draining and separate 
> thread.  So, my EH patches tried to make the same environment for libata 

A big reason why libata uses the SCSI layer is infrastructure like this. 
  It would certainly be nice to see timeouts and EH at the block layer. 
  The block layer itself already supports queue stalling/draining.

> so that it doesn't have to be changed drastically later.  All those 
> glueing codes were separated in one or two functions.
> 
> The point I'm trying to make here is not that my EH design should be 
> accepted or anything but that without established consensus it's very 
> difficult for contributing developers like me to develop substantial 
> part for libata.
> 
> One more thing that bothers me is that even with splitted patches and 
> detailed document, I couldn't get a discussion started on the mailing 
> list.  No discussion, no consensus.  I think we need to improve things 
> on this front.

Honestly there isn't a ton of discussion beforehand about anything in 
libata :)  It's mainly just like Linux itself...  no roadmap, its more 
just about what patches are accepted.  The consensus is the code.  I 
have a general idea of where I want libata to go, in my mind, but that 
picture is very rough and fuzzy :)  I let evolution sort out the details.

Speaking specifically, changes to libata's error handling should have 
very practical, observable effects.  Error handling code paths are 
travelled so infrequently by users, that changes for the general purpose 
of "improve something ugly" are not interesting.  Changes which address 
problems people are seeing in the field are top priority:

* kicking the device with SRST, if it continues to signal BSY.  If that 
fails, do a hard reset with SATA COMRESET or <some hardware-specific 
method>.  Note we really really really want to prefer SRST to other 
methods of reset, as SRST is much nicer to the device, and gives the 
device the opportunity to flush its writeback cache.

Note that some hardware may prefer that you use a hardware-specific 
method of issuing SRST, so that it can reset internal state machines 
along with actually issuing the SRST to the device.  This probably means 
a new protocol, ATA_PROT_SRST (which aligns nicely with SCSI SAT spec).

* retrying rather than cancelling commands that fail due to pci/dma/crc 
error.  some hardware (PCI IDE BMDMA) indicates DMA/PCI errors via 
timeout.  Other hardware is nicer, and indicates such via interrupt.

This may involve exporting scsi_eh_finish_cmd() and 
scsi_eh_flush_done_q(), and un-exporting scsi_finish_command().  Hence 
my reluctance to expand the use of scsi_finish_command(), which is 
really inadequate for our purposes.

This is also necessary for NCQ error handling.

* better use of SError.  some vendor drivers read+write SError on every 
disk transaction, but I think that's overkill.  Reading SError to check 
for errors, against a stored mask, might not be a bad idea.

* longer term:  add logic to "downshift" UDMA mode and/or SATA speed, if 
CRC/DMA errors persist.

I'm much less inclined to take patches which don't make progress towards 
any of these IMO critical areas of EH importance.

	Jeff

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-15 11:02       ` [PATCH] libata error handling fixes (ATAPI) Jeff Garzik
@ 2005-11-15 12:00         ` Jens Axboe
  2005-11-15 18:25           ` Mike Christie
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2005-11-15 12:00 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Tejun Heo, linux-ide, lkml, SCSI Mailing List

On Tue, Nov 15 2005, Jeff Garzik wrote:
> >For departure of libata from SCSI, I was thinking more of another more 
> >generic block device framework in which libata can live in.  And I 
> >thought that it was reasonable to assume that the framework would supply 
> >a EH mechanism which supports queue stalling/draining and separate 
> >thread.  So, my EH patches tried to make the same environment for libata 
> 
> A big reason why libata uses the SCSI layer is infrastructure like this. 
>  It would certainly be nice to see timeouts and EH at the block layer. 
>  The block layer itself already supports queue stalling/draining.

I have a pretty simple plan for this:

- Add a timer to struct request. It already has a timeout field for
  SG_IO originated requests, we could easily utilize this in general.
  I'm not sure how the querying of timeout would happen so far, it would
  probably require a q->set_rq_timeout() hook to ask the low level
  driver to set/return rq->timeout for a given request.

- Add a timeout hook to struct request_queue that would get invoked from
  the timeout handler. Something along the lines of:

        - Timeout on a request happens. Freeze the queue and use
          kblockd to take the actual timeout into process context, where
          we call the queue ->rq_timeout() hook. Unfreeze/reschedule
          queue operations based on what the ->rq_timeout() hook tells
          us.

That is generic enough to be able to arm the timeout automatically from
->elevator_activate_req_fn() and dearm it when it completes or gets
deactivated. It should also be possible to implement the SCSI error
handling on top of that.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-15 12:00         ` Jens Axboe
@ 2005-11-15 18:25           ` Mike Christie
  2005-11-15 18:41             ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Mike Christie @ 2005-11-15 18:25 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Jeff Garzik, Tejun Heo, linux-ide, lkml, SCSI Mailing List

Jens Axboe wrote:
> On Tue, Nov 15 2005, Jeff Garzik wrote:
> 
>>>For departure of libata from SCSI, I was thinking more of another more 
>>>generic block device framework in which libata can live in.  And I 
>>>thought that it was reasonable to assume that the framework would supply 
>>>a EH mechanism which supports queue stalling/draining and separate 
>>>thread.  So, my EH patches tried to make the same environment for libata 
>>
>>A big reason why libata uses the SCSI layer is infrastructure like this. 
>> It would certainly be nice to see timeouts and EH at the block layer. 
>> The block layer itself already supports queue stalling/draining.
> 
> 
> I have a pretty simple plan for this:
> 
> - Add a timer to struct request. It already has a timeout field for
>   SG_IO originated requests, we could easily utilize this in general.
>   I'm not sure how the querying of timeout would happen so far, it would
>   probably require a q->set_rq_timeout() hook to ask the low level
>   driver to set/return rq->timeout for a given request.
> 
> - Add a timeout hook to struct request_queue that would get invoked from
>   the timeout handler. Something along the lines of:
> 
>         - Timeout on a request happens. Freeze the queue and use
>           kblockd to take the actual timeout into process context, where
>           we call the queue ->rq_timeout() hook. Unfreeze/reschedule
>           queue operations based on what the ->rq_timeout() hook tells
>           us.
> 
> That is generic enough to be able to arm the timeout automatically from
> ->elevator_activate_req_fn() and dearm it when it completes or gets
> deactivated. It should also be possible to implement the SCSI error
> handling on top of that.
> 

To disable the timeout would you then have scsi_done call a block layer 
function to disarm it then follow the current flow where or do you think 
it would be nice to move the scsi softirq code up to block layer. So 
scsi_done would call a block layer function that would disarm the timer, 
add the request to a block layer softirq list (a list like scsi-ml's 
scsi_done_q), and then in the block layer softirq function it could call 
a request_queue callout which for scsi-ml's device queue would call 
scsi_decide_disposition and return if it wanted the request requeued or 
how many sectors completed or to kick off the eh. I had stated on this 
for my block layer multipath driver, but can seperate the patches if 
this would be useful.

Would ide benefit from running from a softirq and would it be able to 
use such a thing?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-15 18:25           ` Mike Christie
@ 2005-11-15 18:41             ` Jens Axboe
  2005-11-16 12:40               ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2005-11-15 18:41 UTC (permalink / raw)
  To: Mike Christie; +Cc: Jeff Garzik, Tejun Heo, linux-ide, lkml, SCSI Mailing List

On Tue, Nov 15 2005, Mike Christie wrote:
> Jens Axboe wrote:
> >On Tue, Nov 15 2005, Jeff Garzik wrote:
> >
> >>>For departure of libata from SCSI, I was thinking more of another more 
> >>>generic block device framework in which libata can live in.  And I 
> >>>thought that it was reasonable to assume that the framework would supply 
> >>>a EH mechanism which supports queue stalling/draining and separate 
> >>>thread.  So, my EH patches tried to make the same environment for libata 
> >>
> >>A big reason why libata uses the SCSI layer is infrastructure like this. 
> >>It would certainly be nice to see timeouts and EH at the block layer. 
> >>The block layer itself already supports queue stalling/draining.
> >
> >
> >I have a pretty simple plan for this:
> >
> >- Add a timer to struct request. It already has a timeout field for
> >  SG_IO originated requests, we could easily utilize this in general.
> >  I'm not sure how the querying of timeout would happen so far, it would
> >  probably require a q->set_rq_timeout() hook to ask the low level
> >  driver to set/return rq->timeout for a given request.
> >
> >- Add a timeout hook to struct request_queue that would get invoked from
> >  the timeout handler. Something along the lines of:
> >
> >        - Timeout on a request happens. Freeze the queue and use
> >          kblockd to take the actual timeout into process context, where
> >          we call the queue ->rq_timeout() hook. Unfreeze/reschedule
> >          queue operations based on what the ->rq_timeout() hook tells
> >          us.
> >
> >That is generic enough to be able to arm the timeout automatically from
> >->elevator_activate_req_fn() and dearm it when it completes or gets
> >deactivated. It should also be possible to implement the SCSI error
> >handling on top of that.
> >
> 
> To disable the timeout would you then have scsi_done call a block layer 
> function to disarm it then follow the current flow where or do you think 
> it would be nice to move the scsi softirq code up to block layer. So 
> scsi_done would call a block layer function that would disarm the timer, 
> add the request to a block layer softirq list (a list like scsi-ml's 
> scsi_done_q), and then in the block layer softirq function it could call 
> a request_queue callout which for scsi-ml's device queue would call 
> scsi_decide_disposition and return if it wanted the request requeued or 
> how many sectors completed or to kick off the eh. I had stated on this 
> for my block layer multipath driver, but can seperate the patches if 
> this would be useful.

Yeah, that was part of my plan as well. I did post such a patch a year
or so ago, in a thread about decreasing ide completion latencies.

> Would ide benefit from running from a softirq and would it be able to 
> use such a thing?

It's generally useful as it allows lock free completion from the irq
path, so that's goodness.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-15 18:41             ` Jens Axboe
@ 2005-11-16 12:40               ` Jens Axboe
  2005-11-16 12:56                 ` Jeff Garzik
                                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Jens Axboe @ 2005-11-16 12:40 UTC (permalink / raw)
  To: Mike Christie; +Cc: Jeff Garzik, Tejun Heo, linux-ide, lkml, SCSI Mailing List

On Tue, Nov 15 2005, Jens Axboe wrote:
> On Tue, Nov 15 2005, Mike Christie wrote:
> > Jens Axboe wrote:
> > >On Tue, Nov 15 2005, Jeff Garzik wrote:
> > >
> > >>>For departure of libata from SCSI, I was thinking more of another more 
> > >>>generic block device framework in which libata can live in.  And I 
> > >>>thought that it was reasonable to assume that the framework would supply 
> > >>>a EH mechanism which supports queue stalling/draining and separate 
> > >>>thread.  So, my EH patches tried to make the same environment for libata 
> > >>
> > >>A big reason why libata uses the SCSI layer is infrastructure like this. 
> > >>It would certainly be nice to see timeouts and EH at the block layer. 
> > >>The block layer itself already supports queue stalling/draining.
> > >
> > >
> > >I have a pretty simple plan for this:
> > >
> > >- Add a timer to struct request. It already has a timeout field for
> > >  SG_IO originated requests, we could easily utilize this in general.
> > >  I'm not sure how the querying of timeout would happen so far, it would
> > >  probably require a q->set_rq_timeout() hook to ask the low level
> > >  driver to set/return rq->timeout for a given request.
> > >
> > >- Add a timeout hook to struct request_queue that would get invoked from
> > >  the timeout handler. Something along the lines of:
> > >
> > >        - Timeout on a request happens. Freeze the queue and use
> > >          kblockd to take the actual timeout into process context, where
> > >          we call the queue ->rq_timeout() hook. Unfreeze/reschedule
> > >          queue operations based on what the ->rq_timeout() hook tells
> > >          us.
> > >
> > >That is generic enough to be able to arm the timeout automatically from
> > >->elevator_activate_req_fn() and dearm it when it completes or gets
> > >deactivated. It should also be possible to implement the SCSI error
> > >handling on top of that.
> > >
> > 
> > To disable the timeout would you then have scsi_done call a block layer 
> > function to disarm it then follow the current flow where or do you think 
> > it would be nice to move the scsi softirq code up to block layer. So 
> > scsi_done would call a block layer function that would disarm the timer, 
> > add the request to a block layer softirq list (a list like scsi-ml's 
> > scsi_done_q), and then in the block layer softirq function it could call 
> > a request_queue callout which for scsi-ml's device queue would call 
> > scsi_decide_disposition and return if it wanted the request requeued or 
> > how many sectors completed or to kick off the eh. I had stated on this 
> > for my block layer multipath driver, but can seperate the patches if 
> > this would be useful.
> 
> Yeah, that was part of my plan as well. I did post such a patch a year
> or so ago, in a thread about decreasing ide completion latencies.
> 
> > Would ide benefit from running from a softirq and would it be able to 
> > use such a thing?
> 
> It's generally useful as it allows lock free completion from the irq
> path, so that's goodness.

I updated that patch, and converted IDE and SCSI to use it. See the
results here:

http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq

The main change from the version posted last october is killing the
'slightly' overdesigned completion queue hashing.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 12:40               ` Jens Axboe
@ 2005-11-16 12:56                 ` Jeff Garzik
  2005-11-16 13:13                   ` Jens Axboe
  2005-11-16 13:31                 ` Jeff Garzik
  2005-11-16 15:04                 ` Bartlomiej Zolnierkiewicz
  2 siblings, 1 reply; 24+ messages in thread
From: Jeff Garzik @ 2005-11-16 12:56 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Mike Christie, Tejun Heo, linux-ide, lkml, SCSI Mailing List

Jens Axboe wrote:
> I updated that patch, and converted IDE and SCSI to use it. See the
> results here:
> 
> http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq
> 
> The main change from the version posted last october is killing the
> 'slightly' overdesigned completion queue hashing.

Nifty, I like.  Comments:

* use of spin_lock_irq() in all completion paths now makes me nervous.

* certainly it's what SCSI does now, but is a softirq really necessary? 
  Using a tasklet would kill all that per-cpu code, and notifier.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 12:56                 ` Jeff Garzik
@ 2005-11-16 13:13                   ` Jens Axboe
  2005-11-16 13:23                     ` Jeff Garzik
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2005-11-16 13:13 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Mike Christie, Tejun Heo, linux-ide, lkml, SCSI Mailing List

On Wed, Nov 16 2005, Jeff Garzik wrote:
> Jens Axboe wrote:
> >I updated that patch, and converted IDE and SCSI to use it. See the
> >results here:
> >
> >http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq
> >
> >The main change from the version posted last october is killing the
> >'slightly' overdesigned completion queue hashing.
> 
> Nifty, I like.  Comments:
> 
> * use of spin_lock_irq() in all completion paths now makes me nervous.

Should be fine from the paths originating from blk_done_softirq(), as we
know interrupts are enabled in the first place. But generally I agree,
whenever in doubt always always use the irq saving variants.

> * certainly it's what SCSI does now, but is a softirq really necessary? 
>  Using a tasklet would kill all that per-cpu code, and notifier.

It would work fine with a tasklet of course, but it's going to generate
a _lot_ of traffic on io busy systems so I felt a dedicated softirq was
the way to go.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 13:13                   ` Jens Axboe
@ 2005-11-16 13:23                     ` Jeff Garzik
  0 siblings, 0 replies; 24+ messages in thread
From: Jeff Garzik @ 2005-11-16 13:23 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Mike Christie, Tejun Heo, linux-ide, lkml, SCSI Mailing List

Jens Axboe wrote:
> On Wed, Nov 16 2005, Jeff Garzik wrote:
> 
>>Jens Axboe wrote:
>>
>>>I updated that patch, and converted IDE and SCSI to use it. See the
>>>results here:
>>>
>>>http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq
>>>
>>>The main change from the version posted last october is killing the
>>>'slightly' overdesigned completion queue hashing.
>>
>>Nifty, I like.  Comments:
>>
>>* use of spin_lock_irq() in all completion paths now makes me nervous.
> 
> 
> Should be fine from the paths originating from blk_done_softirq(), as we
> know interrupts are enabled in the first place. But generally I agree,
> whenever in doubt always always use the irq saving variants.
> 
> 
>>* certainly it's what SCSI does now, but is a softirq really necessary? 
>> Using a tasklet would kill all that per-cpu code, and notifier.
> 
> 
> It would work fine with a tasklet of course, but it's going to generate
> a _lot_ of traffic on io busy systems so I felt a dedicated softirq was
> the way to go.

fair enough.  ACK.

	Jeff




^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 12:40               ` Jens Axboe
  2005-11-16 12:56                 ` Jeff Garzik
@ 2005-11-16 13:31                 ` Jeff Garzik
  2005-11-16 13:47                   ` Jens Axboe
  2005-11-16 15:04                 ` Bartlomiej Zolnierkiewicz
  2 siblings, 1 reply; 24+ messages in thread
From: Jeff Garzik @ 2005-11-16 13:31 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Mike Christie, Tejun Heo, linux-ide, lkml, SCSI Mailing List

Jens Axboe wrote:
> I updated that patch, and converted IDE and SCSI to use it. See the
> results here:
> 
> http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq
> 
> The main change from the version posted last october is killing the
> 'slightly' overdesigned completion queue hashing.

Oh yeah:  you shouldn't need to make scsi_retry_command() exported. 
It's private to scsi, to just publish it in scsi_priv.h.

Tangent:  As part of my libata work, I'm thinking about un-exporting 
scsi_finish_command(), and instead exporting scsi_eh_flush_done_q() and 
scsi_eh_finish_cmd().

	Jeff



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 13:31                 ` Jeff Garzik
@ 2005-11-16 13:47                   ` Jens Axboe
  0 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2005-11-16 13:47 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Mike Christie, Tejun Heo, linux-ide, lkml, SCSI Mailing List

On Wed, Nov 16 2005, Jeff Garzik wrote:
> Jens Axboe wrote:
> >I updated that patch, and converted IDE and SCSI to use it. See the
> >results here:
> >
> >http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq
> >
> >The main change from the version posted last october is killing the
> >'slightly' overdesigned completion queue hashing.
> 
> Oh yeah:  you shouldn't need to make scsi_retry_command() exported. 
> It's private to scsi, to just publish it in scsi_priv.h.

Oh right, I'll make that change. Thanks!

> Tangent:  As part of my libata work, I'm thinking about un-exporting 
> scsi_finish_command(), and instead exporting scsi_eh_flush_done_q() and 
> scsi_eh_finish_cmd().

Sounds ok.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 12:40               ` Jens Axboe
  2005-11-16 12:56                 ` Jeff Garzik
  2005-11-16 13:31                 ` Jeff Garzik
@ 2005-11-16 15:04                 ` Bartlomiej Zolnierkiewicz
  2005-11-16 15:31                   ` Jens Axboe
  2 siblings, 1 reply; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2005-11-16 15:04 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Mike Christie, Jeff Garzik, Tejun Heo, linux-ide, lkml,
	SCSI Mailing List

On 11/16/05, Jens Axboe <axboe@suse.de> wrote:

> I updated that patch, and converted IDE and SCSI to use it. See the
> results here:
>
> http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq

I like it but:

* "we know it's either an FS or PC request" assumption in
  ide_softirq_done() is really wrong
* same with "uptodate = rq->errors"
* blk_complete_request() is called with ide_lock hold but
  ide_softirq_done() also takes ide_lock - is this correct?

"There's still room for improvement, as __ide_end_request() really
could drop the lock after getting HWGROUP->rq (why does it need to
hold it in the first place? If ->rq access isn't serialized, we are
screwed anyways)."

ide_preempt?  and yes we are screwed...

Bartlomiej

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 15:04                 ` Bartlomiej Zolnierkiewicz
@ 2005-11-16 15:31                   ` Jens Axboe
  2005-11-16 16:06                     ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2005-11-16 15:31 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Mike Christie, Jeff Garzik, Tejun Heo, linux-ide, lkml,
	SCSI Mailing List

On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> 
> > I updated that patch, and converted IDE and SCSI to use it. See the
> > results here:
> >
> > http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq
> 
> I like it but:
> 
> * "we know it's either an FS or PC request" assumption in
>   ide_softirq_done() is really wrong

It used to be correct :-)

No, the problem is that I changed the partial stuff to allow the
deletion/putting of the request to work for every type of request. But
it definitely needs some more looking into.

> * same with "uptodate = rq->errors"

Yeah. In general, ->errors needs to be streamlined. It's a huge mess
right now and it's making generic code really hard to do because every
driver does their own weird thing with it.

I'd like for IDE to really do stuff the error in ->errors, and push the
retry or whatever counting into a ->retries instead. Then we can honor
the simple rule of, rq->errors:

< 0, it contains an -Exxxx value
== 0, no errors, uptodate
> 0, not uptodate, no specific error info. Usually 1.

> * blk_complete_request() is called with ide_lock hold but
>   ide_softirq_done() also takes ide_lock - is this correct?

blk_complete_request() need not be called with the lock held, in fact it
would be best if it wasn't (no point in holding the lock). But right now
it is in ide, because of the below. ide_softirq_done() always needs to
grab the lock. There are no recursion problems there, ide_softirq_done()
is called out-of-order from the actual completion call.

> "There's still room for improvement, as __ide_end_request() really
> could drop the lock after getting HWGROUP->rq (why does it need to
> hold it in the first place? If ->rq access isn't serialized, we are
> screwed anyways)."
> 
> ide_preempt?  and yes we are screwed...

Irk it's nasty, since it basically means we have to hold ide_lock over
the entire functions looking at hwgroup->rq.

It's ok for __ide_end_request() to be entered with the ide_lock held,
the costly affair is usually completing the request. Which now happens
outside of the lock.

We could split the completion path in two - if we know this call will
end the request completely, we can drop the lock and call into the
blk_complete_request() stuff for free. We know we need to clear
hwgroup->rq anyways, so we can do it up front and complete the request
'privately'. If we are not fully completing the request, keep the lock
and do the partial completion. The bonus here is that the first case
will be the often taken path by far (always with DMA and no errors), the
other cases are not interesting from a performance perspective.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 15:31                   ` Jens Axboe
@ 2005-11-16 16:06                     ` Bartlomiej Zolnierkiewicz
  2005-11-16 17:10                       ` Jens Axboe
  2005-11-19 10:55                       ` Christoph Hellwig
  0 siblings, 2 replies; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2005-11-16 16:06 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Mike Christie, Jeff Garzik, Tejun Heo, linux-ide, lkml,
	SCSI Mailing List

On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> >
> > > I updated that patch, and converted IDE and SCSI to use it. See the
> > > results here:
> > >
> > > http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq
> >
> > I like it but:
> >
> > * "we know it's either an FS or PC request" assumption in
> >   ide_softirq_done() is really wrong
>
> It used to be correct :-)

Sorry but it has been always like that,
other requests also pass through ide_end_request()
(which of course needs fixing).

> No, the problem is that I changed the partial stuff to allow the
> deletion/putting of the request to work for every type of request. But
> it definitely needs some more looking into.
>
> > * same with "uptodate = rq->errors"
>
> Yeah. In general, ->errors needs to be streamlined. It's a huge mess
> right now and it's making generic code really hard to do because every
> driver does their own weird thing with it.

Agreed.

> I'd like for IDE to really do stuff the error in ->errors, and push the
> retry or whatever counting into a ->retries instead. Then we can honor
> the simple rule of, rq->errors:
>
> < 0, it contains an -Exxxx value
> == 0, no errors, uptodate
> > 0, not uptodate, no specific error info. Usually 1.

This is a very good idea.

> > * blk_complete_request() is called with ide_lock hold but
> >   ide_softirq_done() also takes ide_lock - is this correct?
>
> blk_complete_request() need not be called with the lock held, in fact it
> would be best if it wasn't (no point in holding the lock). But right now
> it is in ide, because of the below. ide_softirq_done() always needs to
> grab the lock. There are no recursion problems there, ide_softirq_done()
> is called out-of-order from the actual completion call.
>
> > "There's still room for improvement, as __ide_end_request() really
> > could drop the lock after getting HWGROUP->rq (why does it need to
> > hold it in the first place? If ->rq access isn't serialized, we are
> > screwed anyways)."
> >
> > ide_preempt?  and yes we are screwed...
>
> Irk it's nasty, since it basically means we have to hold ide_lock over
> the entire functions looking at hwgroup->rq.
>
> It's ok for __ide_end_request() to be entered with the ide_lock held,
> the costly affair is usually completing the request. Which now happens
> outside of the lock.

We should get rid of ide_preempt later.

This will also allow us to remove ide_do_drive_cmd()
and use blk_execute_rq() exclusively.

> We could split the completion path in two - if we know this call will
> end the request completely, we can drop the lock and call into the
> blk_complete_request() stuff for free. We know we need to clear
> hwgroup->rq anyways, so we can do it up front and complete the request
> 'privately'. If we are not fully completing the request, keep the lock
> and do the partial completion. The bonus here is that the first case
> will be the often taken path by far (always with DMA and no errors), the
> other cases are not interesting from a performance perspective.

Sounds good.

Bartlomiej

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 16:06                     ` Bartlomiej Zolnierkiewicz
@ 2005-11-16 17:10                       ` Jens Axboe
  2005-11-16 19:11                         ` Bartlomiej Zolnierkiewicz
  2005-11-19 10:55                       ` Christoph Hellwig
  1 sibling, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2005-11-16 17:10 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Mike Christie, Jeff Garzik, Tejun Heo, linux-ide, lkml,
	SCSI Mailing List

On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > > On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > >
> > > > I updated that patch, and converted IDE and SCSI to use it. See the
> > > > results here:
> > > >
> > > > http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq
> > >
> > > I like it but:
> > >
> > > * "we know it's either an FS or PC request" assumption in
> > >   ide_softirq_done() is really wrong
> >
> > It used to be correct :-)
> 
> Sorry but it has been always like that,
> other requests also pass through ide_end_request()
> (which of course needs fixing).

You misunderstand, for calls to blk_complete_request() it wasn't true
initially since it always obyed rq_all_done() (which returns 0 for
non-fs and non-pc requests).

> > Irk it's nasty, since it basically means we have to hold ide_lock over
> > the entire functions looking at hwgroup->rq.
> >
> > It's ok for __ide_end_request() to be entered with the ide_lock held,
> > the costly affair is usually completing the request. Which now happens
> > outside of the lock.
> 
> We should get rid of ide_preempt later.
> 
> This will also allow us to remove ide_do_drive_cmd()
> and use blk_execute_rq() exclusively.

That would be very nice! Reminds me that there might still be a race
with head insertion and REQ_STARTED request in front in the block layer,
that needs inspection. But killing ide_do_drive_cmd() would be very
nice.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 17:10                       ` Jens Axboe
@ 2005-11-16 19:11                         ` Bartlomiej Zolnierkiewicz
  2005-11-16 19:22                           ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2005-11-16 19:11 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Mike Christie, Jeff Garzik, Tejun Heo, linux-ide, lkml,
	SCSI Mailing List

On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > > On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > > > On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > > >
> > > > > I updated that patch, and converted IDE and SCSI to use it. See the
> > > > > results here:
> > > > >
> > > > > http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq
> > > >
> > > > I like it but:
> > > >
> > > > * "we know it's either an FS or PC request" assumption in
> > > >   ide_softirq_done() is really wrong
> > >
> > > It used to be correct :-)
> >
> > Sorry but it has been always like that,
> > other requests also pass through ide_end_request()
> > (which of course needs fixing).
>
> You misunderstand, for calls to blk_complete_request() it wasn't true
> initially since it always obyed rq_all_done() (which returns 0 for
> non-fs and non-pc requests).

from blk_complete_request() [ the only user of rq_all_done() ]:

+ /*
+ * for partial completions, fall back to normal end io handling.
+ */
+ if (unlikely(!partial_ok && !rq_all_done(req, nbytes)))
+         if (end_that_request_chunk(req, uptodate, nbytes))
+                 return 1;

We still will end up with using ide_softirq_done() for !rq_all_done()
case (non FS/PC request) because majority of them (all?) don't use
partial completions.

Bartlomiej

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 19:11                         ` Bartlomiej Zolnierkiewicz
@ 2005-11-16 19:22                           ` Jens Axboe
  2005-11-16 19:45                             ` Jens Axboe
  2005-11-16 19:46                             ` Bartlomiej Zolnierkiewicz
  0 siblings, 2 replies; 24+ messages in thread
From: Jens Axboe @ 2005-11-16 19:22 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Mike Christie, Jeff Garzik, Tejun Heo, linux-ide, lkml,
	SCSI Mailing List

On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > > On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > > > On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > > > > On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > > > >
> > > > > > I updated that patch, and converted IDE and SCSI to use it. See the
> > > > > > results here:
> > > > > >
> > > > > > http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq
> > > > >
> > > > > I like it but:
> > > > >
> > > > > * "we know it's either an FS or PC request" assumption in
> > > > >   ide_softirq_done() is really wrong
> > > >
> > > > It used to be correct :-)
> > >
> > > Sorry but it has been always like that,
> > > other requests also pass through ide_end_request()
> > > (which of course needs fixing).
> >
> > You misunderstand, for calls to blk_complete_request() it wasn't true
> > initially since it always obyed rq_all_done() (which returns 0 for
> > non-fs and non-pc requests).
> 
> from blk_complete_request() [ the only user of rq_all_done() ]:
> 
> + /*
> + * for partial completions, fall back to normal end io handling.
> + */
> + if (unlikely(!partial_ok && !rq_all_done(req, nbytes)))
> +         if (end_that_request_chunk(req, uptodate, nbytes))
> +                 return 1;
> 
> We still will end up with using ide_softirq_done() for !rq_all_done()
> case (non FS/PC request) because majority of them (all?) don't use
> partial completions.

Yes, that's what it looks like now... Note I wrote "wasn't", it used to
look like this:

        if (!rq_all_done(req, nbytes)) {
                end_that_request_chunk(..);
                return;
        }

which of course didn't work, so it was changed to the above which then
broke the assumption of what type of requests we expect to see in
ide_softirq_done(). We can't generically handle this case, so it's
probably best to just add this logic to __ide_end_request() - it's just
another case for _not_ using the blk_complete_request() path, just like
the partial case.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 19:22                           ` Jens Axboe
@ 2005-11-16 19:45                             ` Jens Axboe
  2005-11-16 19:46                             ` Bartlomiej Zolnierkiewicz
  1 sibling, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2005-11-16 19:45 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Mike Christie, Jeff Garzik, Tejun Heo, linux-ide, lkml,
	SCSI Mailing List

On Wed, Nov 16 2005, Jens Axboe wrote:
> which of course didn't work, so it was changed to the above which then
> broke the assumption of what type of requests we expect to see in
> ide_softirq_done(). We can't generically handle this case, so it's
> probably best to just add this logic to __ide_end_request() - it's just
> another case for _not_ using the blk_complete_request() path, just like
> the partial case.

How about something like this? It requires blk_complete_request() calls
to be full completions of the request (or what is left of it). It cleans
up the calls a lot and fixes the wrong comment for IDE.

diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 82b7290..8e6dd9f 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -3223,33 +3223,20 @@ static struct notifier_block __devinitda
 /**
  * blk_complete_request - end I/O on a request
  * @req:      the request being processed
- * @uptodate: 1 for success, 0 for I/O error, < 0 for specific error
- * @nbytes:   number of bytes to end I/O on
  *
  * Description:
- *     Ends I/O on a number of sectors attached to @req, and sets it up
- *     for the next range of segments (if any) in the cluster. This variant
- *     completes the request out-of-order through a softirq handler. The user
- *     must have registered a completion callback through
- *     blk_queue_softirq_fn() at queue init time.
- *
- * Return:
- *     0 - we are done with this request, call end_that_request_last()
- *     1 - still buffers pending for this request
+ *     Ends all I/O on a request. It does not handle partial completions,
+ *     unless the driver actually implements this in its completionc callback
+ *     through requeueing. Theh actual completion happens out-of-order,
+ *     through a softirq handler. The user must have registered a completion
+ *     callback through blk_queue_softirq_done().
  **/
-int blk_complete_request(struct request *req, int uptodate, unsigned int nbytes,
-			 int partial_ok)
+
+void blk_complete_request(struct request *req)
 {
 	struct list_head *cpu_list;
 	unsigned long flags;
 
-	/*
-	 * for partial completions, fall back to normal end io handling.
-	 */
-	if (unlikely(!partial_ok && !rq_all_done(req, nbytes)))
-		if (end_that_request_chunk(req, uptodate, nbytes))
-			return 1;
-
 	BUG_ON(!req->q->softirq_done_fn);
 		
 	local_irq_save(flags);
@@ -3259,7 +3246,6 @@ int blk_complete_request(struct request 
 	raise_softirq_irqoff(BLOCK_SOFTIRQ);
 
 	local_irq_restore(flags);
-	return 0;
 }
 
 EXPORT_SYMBOL(blk_complete_request);
diff --git a/drivers/ide/ide-io.c b/drivers/ide/ide-io.c
index f114106..1129bfb 100644
--- a/drivers/ide/ide-io.c
+++ b/drivers/ide/ide-io.c
@@ -58,22 +58,9 @@
 void ide_softirq_done(struct request *rq)
 {
 	request_queue_t *q = rq->q;
-	int nbytes, uptodate = 1;
 
 	add_disk_randomness(rq->rq_disk);
-
-	/*
-	 * we know it's either an FS or PC request
-	 */
-	if (blk_fs_request(rq))
-		nbytes = rq->hard_nr_sectors << 9;
-	else
-		nbytes = rq->data_len;
-
-	if (rq->errors < 0)
-		uptodate = rq->errors;
-
-	end_that_request_chunk(rq, uptodate, nbytes);
+	end_that_request_chunk(rq, rq->errors, rq->data_len);
 
 	spin_lock_irq(q->queue_lock);
 	end_that_request_last(rq);
@@ -83,6 +70,9 @@ void ide_softirq_done(struct request *rq
 int __ide_end_request(ide_drive_t *drive, struct request *rq, int uptodate,
 		      int nr_sectors)
 {
+	unsigned int nbytes;
+	int ret = 1;
+
 	BUG_ON(!(rq->flags & REQ_STARTED));
 
 	/*
@@ -104,13 +94,30 @@ int __ide_end_request(ide_drive_t *drive
 		HWGROUP(drive)->hwif->ide_dma_on(drive);
 	}
 
-	if (!blk_complete_request(rq, uptodate, nr_sectors << 9, 0)) {
+	/*
+	 * For partial completions (or non fs/pc requests), use the regular
+	 * direct completion path.
+	 */
+	nbytes = nr_sectors << 9;
+	if (rq_all_done(rq, nbytes)) {
+		rq->errors = uptodate;
+		rq->data_len = nbytes;
+		blk_complete_request(rq);
+		ret = 0;
+	} else {
+		if (!end_that_request_first(rq, uptodate, nr_sectors)) {
+			add_disk_randomness(rq->rq_disk);
+			end_that_request_last(rq);
+			ret = 0;
+		}
+	}
+
+	if (!ret) {
 		blkdev_dequeue_request(rq);
 		HWGROUP(drive)->rq = NULL;
-		return 0;
 	}
 
-	return 1;
+	return ret;
 }
 EXPORT_SYMBOL(__ide_end_request);
 
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 4044ba4..bf902cf 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -751,6 +751,8 @@ static void scsi_done(struct scsi_cmnd *
  * isn't running --- used by scsi_times_out */
 void __scsi_done(struct scsi_cmnd *cmd)
 {
+	struct request *rq = cmd->request;
+
 	/*
 	 * Set the serial numbers back to zero
 	 */
@@ -760,14 +762,14 @@ void __scsi_done(struct scsi_cmnd *cmd)
 	if (cmd->result)
 		atomic_inc(&cmd->device->ioerr_cnt);
 
-	BUG_ON(!cmd->request);
+	BUG_ON(!rq);
 
 	/*
 	 * The uptodate/nbytes values don't matter, as we allow partial
 	 * completes and thus will check this in the softirq callback
 	 */
-	cmd->request->completion_data = cmd;
-	blk_complete_request(cmd->request, 0, 0, 1);
+	rq->completion_data = cmd;
+	blk_complete_request(rq);
 }
 
 /*
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 588b9cc..3ff7c3f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -613,7 +613,7 @@ extern int end_that_request_first(struct
 extern int end_that_request_chunk(struct request *, int, int);
 extern void end_that_request_last(struct request *);
 extern void end_request(struct request *req, int uptodate);
-extern int blk_complete_request(struct request *, int, unsigned int, int);
+extern void blk_complete_request(struct request *);
 
 static inline int rq_all_done(struct request *rq, unsigned int nr_bytes)
 {

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 19:22                           ` Jens Axboe
  2005-11-16 19:45                             ` Jens Axboe
@ 2005-11-16 19:46                             ` Bartlomiej Zolnierkiewicz
  2005-11-16 19:55                               ` Bartlomiej Zolnierkiewicz
  1 sibling, 1 reply; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2005-11-16 19:46 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Mike Christie, Jeff Garzik, Tejun Heo, linux-ide, lkml,
	SCSI Mailing List

On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > > On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > > > On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > > > > On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > > > > > On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > > > > >
> > > > > > > I updated that patch, and converted IDE and SCSI to use it. See the
> > > > > > > results here:
> > > > > > >
> > > > > > > http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq
> > > > > >
> > > > > > I like it but:
> > > > > >
> > > > > > * "we know it's either an FS or PC request" assumption in
> > > > > >   ide_softirq_done() is really wrong
> > > > >
> > > > > It used to be correct :-)
> > > >
> > > > Sorry but it has been always like that,
> > > > other requests also pass through ide_end_request()
> > > > (which of course needs fixing).
> > >
> > > You misunderstand, for calls to blk_complete_request() it wasn't true
> > > initially since it always obyed rq_all_done() (which returns 0 for
> > > non-fs and non-pc requests).
> >
> > from blk_complete_request() [ the only user of rq_all_done() ]:
> >
> > + /*
> > + * for partial completions, fall back to normal end io handling.
> > + */
> > + if (unlikely(!partial_ok && !rq_all_done(req, nbytes)))
> > +         if (end_that_request_chunk(req, uptodate, nbytes))
> > +                 return 1;
> >
> > We still will end up with using ide_softirq_done() for !rq_all_done()
> > case (non FS/PC request) because majority of them (all?) don't use
> > partial completions.
>
> Yes, that's what it looks like now... Note I wrote "wasn't", it used to
> look like this:
>
>         if (!rq_all_done(req, nbytes)) {
>                 end_that_request_chunk(..);
>                 return;
>         }
>
> which of course didn't work, so it was changed to the above which then
> broke the assumption of what type of requests we expect to see in
> ide_softirq_done(). We can't generically handle this case, so it's
> probably best to just add this logic to __ide_end_request() - it's just
> another case for _not_ using the blk_complete_request() path, just like
> the partial case.

Sounds better but I honestly think that you simply cannot obtain
reliable nr_sectors to complete for FS/PC requests just from the
request type.  Two examples are: failed disk flush requests and
cd noretry requests (both are of FS type).

IMO the best way to fix it is to actually move more (not less!) of
the logic from driver->end_request() paths to ide_softirq_done().

Cheers,
Bartlomiej

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 19:46                             ` Bartlomiej Zolnierkiewicz
@ 2005-11-16 19:55                               ` Bartlomiej Zolnierkiewicz
  2005-11-16 20:02                                 ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2005-11-16 19:55 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Mike Christie, Jeff Garzik, Tejun Heo, linux-ide, lkml,
	SCSI Mailing List

On 11/16/05, Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> wrote:
> On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > > On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > > > On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > > > > On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > > > > > On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > > > > > > On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > > > > > >
> > > > > > > > I updated that patch, and converted IDE and SCSI to use it. See the
> > > > > > > > results here:
> > > > > > > >
> > > > > > > > http://brick.kernel.dk/git/?p=linux-2.6-block.git;a=shortlog;h=blk-softirq
> > > > > > >
> > > > > > > I like it but:
> > > > > > >
> > > > > > > * "we know it's either an FS or PC request" assumption in
> > > > > > >   ide_softirq_done() is really wrong
> > > > > >
> > > > > > It used to be correct :-)
> > > > >
> > > > > Sorry but it has been always like that,
> > > > > other requests also pass through ide_end_request()
> > > > > (which of course needs fixing).
> > > >
> > > > You misunderstand, for calls to blk_complete_request() it wasn't true
> > > > initially since it always obyed rq_all_done() (which returns 0 for
> > > > non-fs and non-pc requests).
> > >
> > > from blk_complete_request() [ the only user of rq_all_done() ]:
> > >
> > > + /*
> > > + * for partial completions, fall back to normal end io handling.
> > > + */
> > > + if (unlikely(!partial_ok && !rq_all_done(req, nbytes)))
> > > +         if (end_that_request_chunk(req, uptodate, nbytes))
> > > +                 return 1;
> > >
> > > We still will end up with using ide_softirq_done() for !rq_all_done()
> > > case (non FS/PC request) because majority of them (all?) don't use
> > > partial completions.
> >
> > Yes, that's what it looks like now... Note I wrote "wasn't", it used to
> > look like this:
> >
> >         if (!rq_all_done(req, nbytes)) {
> >                 end_that_request_chunk(..);
> >                 return;
> >         }
> >
> > which of course didn't work, so it was changed to the above which then
> > broke the assumption of what type of requests we expect to see in
> > ide_softirq_done(). We can't generically handle this case, so it's
> > probably best to just add this logic to __ide_end_request() - it's just
> > another case for _not_ using the blk_complete_request() path, just like
> > the partial case.
>
> Sounds better but I honestly think that you simply cannot obtain
> reliable nr_sectors to complete for FS/PC requests just from the
> request type.  Two examples are: failed disk flush requests and
> cd noretry requests (both are of FS type).

first example is bad :-)

> IMO the best way to fix it is to actually move more (not less!) of
> the logic from driver->end_request() paths to ide_softirq_done().

Your latest patch is also a good way to fix it
(now the only thing left is rq->errors/rq->retries discussed earlier).

Bartlomiej

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 19:55                               ` Bartlomiej Zolnierkiewicz
@ 2005-11-16 20:02                                 ` Jens Axboe
  2005-11-16 20:23                                   ` Bartlomiej Zolnierkiewicz
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2005-11-16 20:02 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Mike Christie, Jeff Garzik, Tejun Heo, linux-ide, lkml,
	SCSI Mailing List

On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > > which of course didn't work, so it was changed to the above which then
> > > broke the assumption of what type of requests we expect to see in
> > > ide_softirq_done(). We can't generically handle this case, so it's
> > > probably best to just add this logic to __ide_end_request() - it's just
> > > another case for _not_ using the blk_complete_request() path, just like
> > > the partial case.
> >
> > Sounds better but I honestly think that you simply cannot obtain
> > reliable nr_sectors to complete for FS/PC requests just from the
> > request type.  Two examples are: failed disk flush requests and
> > cd noretry requests (both are of FS type).
> 
> first example is bad :-)

Both your examples are wrong - a flush request is non-fs/pc, and noretry
requests doesn't impact the type of the request at from this POV.

> > IMO the best way to fix it is to actually move more (not less!) of
> > the logic from driver->end_request() paths to ide_softirq_done().
> 
> Your latest patch is also a good way to fix it
> (now the only thing left is rq->errors/rq->retries discussed earlier).

Yeah, that is still pending... I updated the patch series so it's a
clean 1-2-3-4 step series now, there instead of this little additions.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 20:02                                 ` Jens Axboe
@ 2005-11-16 20:23                                   ` Bartlomiej Zolnierkiewicz
  2005-11-16 20:40                                     ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2005-11-16 20:23 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Mike Christie, Jeff Garzik, Tejun Heo, linux-ide, lkml,
	SCSI Mailing List

On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > > > which of course didn't work, so it was changed to the above which then
> > > > broke the assumption of what type of requests we expect to see in
> > > > ide_softirq_done(). We can't generically handle this case, so it's
> > > > probably best to just add this logic to __ide_end_request() - it's just
> > > > another case for _not_ using the blk_complete_request() path, just like
> > > > the partial case.
> > >
> > > Sounds better but I honestly think that you simply cannot obtain
> > > reliable nr_sectors to complete for FS/PC requests just from the
> > > request type.  Two examples are: failed disk flush requests and
> > > cd noretry requests (both are of FS type).
> >
> > first example is bad :-)
>
> Both your examples are wrong - a flush request is non-fs/pc, and noretry
> requests doesn't impact the type of the request at from this POV.

I meant doing partial completions of the real request from the
->end_flush() and noretry doesn't impact the type but it does
impact the nr_sectors to complete.  However none of these
matters any longer with your latest patch which simplifies things
a lot.

> > > IMO the best way to fix it is to actually move more (not less!) of
> > > the logic from driver->end_request() paths to ide_softirq_done().
> >
> > Your latest patch is also a good way to fix it
> > (now the only thing left is rq->errors/rq->retries discussed earlier).
>
> Yeah, that is still pending... I updated the patch series so it's a
> clean 1-2-3-4 step series now, there instead of this little additions.

Excellent!

Bartlomiej

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 20:23                                   ` Bartlomiej Zolnierkiewicz
@ 2005-11-16 20:40                                     ` Jens Axboe
  0 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2005-11-16 20:40 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Mike Christie, Jeff Garzik, Tejun Heo, linux-ide, lkml,
	SCSI Mailing List

On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> On 11/16/05, Jens Axboe <axboe@suse.de> wrote:
> > On Wed, Nov 16 2005, Bartlomiej Zolnierkiewicz wrote:
> > > > > which of course didn't work, so it was changed to the above which then
> > > > > broke the assumption of what type of requests we expect to see in
> > > > > ide_softirq_done(). We can't generically handle this case, so it's
> > > > > probably best to just add this logic to __ide_end_request() - it's just
> > > > > another case for _not_ using the blk_complete_request() path, just like
> > > > > the partial case.
> > > >
> > > > Sounds better but I honestly think that you simply cannot obtain
> > > > reliable nr_sectors to complete for FS/PC requests just from the
> > > > request type.  Two examples are: failed disk flush requests and
> > > > cd noretry requests (both are of FS type).
> > >
> > > first example is bad :-)
> >
> > Both your examples are wrong - a flush request is non-fs/pc, and noretry
> > requests doesn't impact the type of the request at from this POV.
> 
> I meant doing partial completions of the real request from the
> ->end_flush() and noretry doesn't impact the type but it does
> impact the nr_sectors to complete.  However none of these
> matters any longer with your latest patch which simplifies things
> a lot.

Ah, partial completions due to hitting an error in the flush. Yeah that
would have broken, you are right. And yes, that should work now, it's
the way it should have been written from the get-go.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-16 16:06                     ` Bartlomiej Zolnierkiewicz
  2005-11-16 17:10                       ` Jens Axboe
@ 2005-11-19 10:55                       ` Christoph Hellwig
  2005-11-19 13:27                         ` Jens Axboe
  1 sibling, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2005-11-19 10:55 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: Jens Axboe, Mike Christie, Jeff Garzik, Tejun Heo, linux-ide,
	lkml, SCSI Mailing List

On Wed, Nov 16, 2005 at 05:06:45PM +0100, Bartlomiej Zolnierkiewicz wrote:
> This will also allow us to remove ide_do_drive_cmd()
> and use blk_execute_rq() exclusively.

While we're talking about moving things to generic request-based stuff,
any chance one of you could look over to convert ide-cd internal request
submissions to BLOCK_PC?  It's the last user of REQ_PC.  After that changing
the CD layer to submit BLOCK_PC commands directly instead of the
->packet_command hook would be a logical next step.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] libata error handling fixes (ATAPI)
  2005-11-19 10:55                       ` Christoph Hellwig
@ 2005-11-19 13:27                         ` Jens Axboe
  0 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2005-11-19 13:27 UTC (permalink / raw)
  To: Christoph Hellwig, Bartlomiej Zolnierkiewicz, Mike Christie,
	Jeff Garzik, Tejun Heo, linux-ide, lkml, SCSI Mailing List

On Sat, Nov 19 2005, Christoph Hellwig wrote:
> On Wed, Nov 16, 2005 at 05:06:45PM +0100, Bartlomiej Zolnierkiewicz wrote:
> > This will also allow us to remove ide_do_drive_cmd()
> > and use blk_execute_rq() exclusively.
> 
> While we're talking about moving things to generic request-based stuff,
> any chance one of you could look over to convert ide-cd internal request
> submissions to BLOCK_PC?  It's the last user of REQ_PC.  After that changing
> the CD layer to submit BLOCK_PC commands directly instead of the
> ->packet_command hook would be a logical next step.

Yup, that's clearly the way to go.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2005-11-19 13:27 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20051114195717.GA24373@havoc.gtf.org>
     [not found] ` <20051115074148.GA17459@htj.dyndns.org>
     [not found]   ` <4379AA5B.1060900@pobox.com>
     [not found]     ` <4379B28E.9070708@gmail.com>
2005-11-15 11:02       ` [PATCH] libata error handling fixes (ATAPI) Jeff Garzik
2005-11-15 12:00         ` Jens Axboe
2005-11-15 18:25           ` Mike Christie
2005-11-15 18:41             ` Jens Axboe
2005-11-16 12:40               ` Jens Axboe
2005-11-16 12:56                 ` Jeff Garzik
2005-11-16 13:13                   ` Jens Axboe
2005-11-16 13:23                     ` Jeff Garzik
2005-11-16 13:31                 ` Jeff Garzik
2005-11-16 13:47                   ` Jens Axboe
2005-11-16 15:04                 ` Bartlomiej Zolnierkiewicz
2005-11-16 15:31                   ` Jens Axboe
2005-11-16 16:06                     ` Bartlomiej Zolnierkiewicz
2005-11-16 17:10                       ` Jens Axboe
2005-11-16 19:11                         ` Bartlomiej Zolnierkiewicz
2005-11-16 19:22                           ` Jens Axboe
2005-11-16 19:45                             ` Jens Axboe
2005-11-16 19:46                             ` Bartlomiej Zolnierkiewicz
2005-11-16 19:55                               ` Bartlomiej Zolnierkiewicz
2005-11-16 20:02                                 ` Jens Axboe
2005-11-16 20:23                                   ` Bartlomiej Zolnierkiewicz
2005-11-16 20:40                                     ` Jens Axboe
2005-11-19 10:55                       ` Christoph Hellwig
2005-11-19 13:27                         ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox