* [PATCH libata-dev-2.6:upstream 00/02] libata: race fixes
@ 2005-08-20 14:15 Tejun Heo
2005-08-20 14:15 ` [PATCH libata-dev-2.6:upstream 01/02] libata: implement ata_poll_qc_complete and use it in polling functions Tejun Heo
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Tejun Heo @ 2005-08-20 14:15 UTC (permalink / raw)
To: jgarzik, albertcc; +Cc: linux-ide
Hello, Jeff & Albert.
This is another patchset fixing races in libata.
The first patch implements ata_poll_qc_complete which graps host
lock, turns irq back on, completes qc and release host lock. All
command completion in polling tasks are replaced with this function.
The effect is...
* If an interrupt from other ports or a spurious interrupt occurs
inbetween ata_irq_on and ata_qc_complete, it leads to double
completion of the qc. This race is fixed.
* atapi_packet_task() now reenables IRQ on error exit.
* ata_qc_complete() is now always called with host lock.
The second patch adds locking to ata_scsi_error such that
->eng_timeout is called after ata_qc_complete is complete.
Previously, EH and the latter part of ata_qc_complete could run side
by side.
Albert, I think, this patch should kill the race you tried to fix by
moving ATA_QCFLAG_ACTIVE clearing above ->complete_fn invocation.
[ Start of patch descriptions ]
01_libata_add-ata_poll_qc_complete.patch
: implement ata_poll_qc_complete and use it in polling functions
Previously, libata polling functions turned irq back on and
completed qc commands without holding host lock. This creates
a race condition between the polling task and interrupts from
other ports on the same host set or spurious interrupt from
itself.
This patch implements ata_poll_qc_complete which enables irq
and completes qc atomically and convert all polling functions.
Note: Unlike other functions, atapi_packet_task() didn't use
to turn irq back on on error exits. This patch makes it use
ata_poll_qc_complete, so irq is now turned back on on error
exits.
Note: With this change, ALL invocations of ata_qc_complete()
are now done under host_set lock.
02_libata_add-locking-to-ata_scsi_error.patch
: add host_set locking to ata_scsi_error()
SCSI EH can start before ata_qc_complete is completely
complete. so, latter part of ata_qc_complete can run
side-by-side with ->eng_timeout(), interfering EH.
This patch makes ata_scsi_error() to grab and release host_set
lock before invoking ->eng_timeout().
Note: host_failed decrementing and eh_cmd_q banging are moved
above ->eng_timeout() invocation such that they're done while
holding the lock.
[ End of patch descriptions ]
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 9+ messages in thread* Re: [PATCH libata-dev-2.6:upstream 01/02] libata: implement ata_poll_qc_complete and use it in polling functions 2005-08-20 14:15 [PATCH libata-dev-2.6:upstream 00/02] libata: race fixes Tejun Heo @ 2005-08-20 14:15 ` Tejun Heo 2005-08-21 19:53 ` Jeff Garzik 2005-08-20 14:15 ` [PATCH libata-dev-2.6:upstream 02/02] libata: add host_set locking to ata_scsi_error() Tejun Heo 2005-08-21 20:08 ` a word about ATA interrupts, and SATA (was Re: [PATCH libata-dev-2.6:upstream 00/02] libata: race fixes) Jeff Garzik 2 siblings, 1 reply; 9+ messages in thread From: Tejun Heo @ 2005-08-20 14:15 UTC (permalink / raw) To: jgarzik, albertcc; +Cc: linux-ide 01_libata_add-ata_poll_qc_complete.patch Previously, libata polling functions turned irq back on and completed qc commands without holding host lock. This creates a race condition between the polling task and interrupts from other ports on the same host set or spurious interrupt from itself. This patch implements ata_poll_qc_complete which enables irq and completes qc atomically and convert all polling functions. Note: Unlike other functions, atapi_packet_task() didn't use to turn irq back on on error exits. This patch makes it use ata_poll_qc_complete, so irq is now turned back on on error exits. Note: With this change, ALL invocations of ata_qc_complete() are now done under host_set lock. Signed-off-by: Tejun Heo <htejun@gmail.com> libata-core.c | 32 ++++++++++++++++++++++---------- 1 files changed, 22 insertions(+), 10 deletions(-) Index: work/drivers/scsi/libata-core.c =================================================================== --- work.orig/drivers/scsi/libata-core.c 2005-08-20 23:13:40.000000000 +0900 +++ work/drivers/scsi/libata-core.c 2005-08-20 23:13:40.000000000 +0900 @@ -2402,6 +2402,24 @@ static int ata_sg_setup(struct ata_queue } /** + * ata_poll_qc_complete - turn irq back on and finish qc + * @qc: Command to complete + * @drv_stat: ATA status register content + * + * LOCKING: + * None. (grabs host lock) + */ +void ata_poll_qc_complete(struct ata_queued_cmd *qc, u8 drv_stat) +{ + struct ata_port *ap = qc->ap; + + spin_lock_irq(&ap->host_set->lock); + ata_irq_on(ap); + ata_qc_complete(qc, drv_stat); + spin_unlock_irq(&ap->host_set->lock); +} + +/** * ata_pio_poll - * @ap: * @@ -2492,9 +2510,7 @@ static void ata_pio_complete (struct ata ap->pio_task_state = PIO_ST_IDLE; - ata_irq_on(ap); - - ata_qc_complete(qc, drv_stat); + ata_poll_qc_complete(qc, drv_stat); } @@ -2717,9 +2733,7 @@ static void ata_pio_block(struct ata_por if ((status & ATA_DRQ) == 0) { ap->pio_task_state = PIO_ST_IDLE; - ata_irq_on(ap); - - ata_qc_complete(qc, status); + ata_poll_qc_complete(qc, status); return; } @@ -2749,9 +2763,7 @@ static void ata_pio_error(struct ata_por ap->pio_task_state = PIO_ST_IDLE; - ata_irq_on(ap); - - ata_qc_complete(qc, drv_stat | ATA_ERR); + ata_poll_qc_complete(qc, drv_stat | ATA_ERR); } static void ata_pio_task(void *_data) @@ -3656,7 +3668,7 @@ static void atapi_packet_task(void *_dat return; err_out: - ata_qc_complete(qc, ATA_ERR); + ata_poll_qc_complete(qc, ATA_ERR); } ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH libata-dev-2.6:upstream 01/02] libata: implement ata_poll_qc_complete and use it in polling functions 2005-08-20 14:15 ` [PATCH libata-dev-2.6:upstream 01/02] libata: implement ata_poll_qc_complete and use it in polling functions Tejun Heo @ 2005-08-21 19:53 ` Jeff Garzik 2005-08-22 8:12 ` Tejun Heo 0 siblings, 1 reply; 9+ messages in thread From: Jeff Garzik @ 2005-08-21 19:53 UTC (permalink / raw) To: Tejun Heo; +Cc: albertcc, linux-ide Tejun Heo wrote: > 01_libata_add-ata_poll_qc_complete.patch > > Previously, libata polling functions turned irq back on and > completed qc commands without holding host lock. This creates > a race condition between the polling task and interrupts from > other ports on the same host set or spurious interrupt from > itself. > > This patch implements ata_poll_qc_complete which enables irq > and completes qc atomically and convert all polling functions. > > Note: Unlike other functions, atapi_packet_task() didn't use > to turn irq back on on error exits. This patch makes it use > ata_poll_qc_complete, so irq is now turned back on on error > exits. > > Note: With this change, ALL invocations of ata_qc_complete() > are now done under host_set lock. > > Signed-off-by: Tejun Heo <htejun@gmail.com> This one looks OK to me. Given that I rejected the other three, would you still like me to apply just this one? Jeff ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH libata-dev-2.6:upstream 01/02] libata: implement ata_poll_qc_complete and use it in polling functions 2005-08-21 19:53 ` Jeff Garzik @ 2005-08-22 8:12 ` Tejun Heo 2005-08-23 5:06 ` Jeff Garzik 0 siblings, 1 reply; 9+ messages in thread From: Tejun Heo @ 2005-08-22 8:12 UTC (permalink / raw) To: Jeff Garzik; +Cc: albertcc, linux-ide On Sun, Aug 21, 2005 at 03:53:58PM -0400, Jeff Garzik wrote: > Tejun Heo wrote: > >01_libata_add-ata_poll_qc_complete.patch > > > > Previously, libata polling functions turned irq back on and > > completed qc commands without holding host lock. This creates > > a race condition between the polling task and interrupts from > > other ports on the same host set or spurious interrupt from > > itself. > > > > This patch implements ata_poll_qc_complete which enables irq > > and completes qc atomically and convert all polling functions. > > > > Note: Unlike other functions, atapi_packet_task() didn't use > > to turn irq back on on error exits. This patch makes it use > > ata_poll_qc_complete, so irq is now turned back on on error > > exits. > > > > Note: With this change, ALL invocations of ata_qc_complete() > > are now done under host_set lock. > > > >Signed-off-by: Tejun Heo <htejun@gmail.com> > > This one looks OK to me. Given that I rejected the other three, would > you still like me to apply just this one? > Greetings, Jeff. (Hello is getting a bit old today :-) Here's updated version which also clears ATA_FLAG_NOINTR. -- [PATCH libata-dev-2.6:upstream] implement ata_poll_qc_complete and use it in polling functions Previously, libata polling functions turned irq back on and completed qc commands without holding host lock. This creates a race condition between the polling task and interrupts from other ports on the same host set or spurious interrupt from itself. This patch implements ata_poll_qc_complete which enables irq and completes qc atomically and convert all polling functions. Note: atapi_packet_task() didn't use to turn irq back on or clear ATA_FLAG_NOINTR on error exits. This patch makes it use ata_poll_qc_complete which does both. Note: With this change, ALL invocations of ata_qc_complete() are now done under host_set lock. Signed-off-by: Tejun Heo <htejun@gmail.com> diff --git a/drivers/scsi/libata-core.c b/drivers/scsi/libata-core.c --- a/drivers/scsi/libata-core.c +++ b/drivers/scsi/libata-core.c @@ -2402,6 +2402,26 @@ static int ata_sg_setup(struct ata_queue } /** + * ata_poll_qc_complete - turn irq back on and finish qc + * @qc: Command to complete + * @drv_stat: ATA status register content + * + * LOCKING: + * None. (grabs host lock) + */ + +void ata_poll_qc_complete(struct ata_queued_cmd *qc, u8 drv_stat) +{ + struct ata_port *ap = qc->ap; + + spin_lock_irq(&ap->host_set->lock); + ap->flags &= ~ATA_FLAG_NOINTR; + ata_irq_on(ap); + ata_qc_complete(qc, drv_stat); + spin_unlock_irq(&ap->host_set->lock); +} + +/** * ata_pio_poll - * @ap: * @@ -2492,9 +2512,7 @@ static void ata_pio_complete (struct ata ap->pio_task_state = PIO_ST_IDLE; - ata_irq_on(ap); - - ata_qc_complete(qc, drv_stat); + ata_poll_qc_complete(qc, drv_stat); } @@ -2717,9 +2735,7 @@ static void ata_pio_block(struct ata_por if ((status & ATA_DRQ) == 0) { ap->pio_task_state = PIO_ST_IDLE; - ata_irq_on(ap); - - ata_qc_complete(qc, status); + ata_poll_qc_complete(qc, status); return; } @@ -2749,9 +2765,7 @@ static void ata_pio_error(struct ata_por ap->pio_task_state = PIO_ST_IDLE; - ata_irq_on(ap); - - ata_qc_complete(qc, drv_stat | ATA_ERR); + ata_poll_qc_complete(qc, drv_stat | ATA_ERR); } static void ata_pio_task(void *_data) @@ -3659,7 +3673,7 @@ static void atapi_packet_task(void *_dat return; err_out: - ata_qc_complete(qc, ATA_ERR); + ata_poll_qc_complete(qc, ATA_ERR); } ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH libata-dev-2.6:upstream 01/02] libata: implement ata_poll_qc_complete and use it in polling functions 2005-08-22 8:12 ` Tejun Heo @ 2005-08-23 5:06 ` Jeff Garzik 0 siblings, 0 replies; 9+ messages in thread From: Jeff Garzik @ 2005-08-23 5:06 UTC (permalink / raw) To: Tejun Heo; +Cc: albertcc, linux-ide applied ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH libata-dev-2.6:upstream 02/02] libata: add host_set locking to ata_scsi_error() 2005-08-20 14:15 [PATCH libata-dev-2.6:upstream 00/02] libata: race fixes Tejun Heo 2005-08-20 14:15 ` [PATCH libata-dev-2.6:upstream 01/02] libata: implement ata_poll_qc_complete and use it in polling functions Tejun Heo @ 2005-08-20 14:15 ` Tejun Heo 2005-08-21 19:52 ` Jeff Garzik 2005-08-21 20:08 ` a word about ATA interrupts, and SATA (was Re: [PATCH libata-dev-2.6:upstream 00/02] libata: race fixes) Jeff Garzik 2 siblings, 1 reply; 9+ messages in thread From: Tejun Heo @ 2005-08-20 14:15 UTC (permalink / raw) To: jgarzik, albertcc; +Cc: linux-ide 02_libata_add-locking-to-ata_scsi_error.patch SCSI EH can start before ata_qc_complete is completely complete. so, latter part of ata_qc_complete can run side-by-side with ->eng_timeout(), interfering EH. This patch makes ata_scsi_error() to grab and release host_set lock before invoking ->eng_timeout(). Note: host_failed decrementing and eh_cmd_q banging are moved above ->eng_timeout() invocation such that they're done while holding the lock. Signed-off-by: Tejun Heo <htejun@gmail.com> libata-scsi.c | 12 +++++++++--- 1 files changed, 9 insertions(+), 3 deletions(-) Index: work/drivers/scsi/libata-scsi.c =================================================================== --- work.orig/drivers/scsi/libata-scsi.c 2005-08-20 22:34:02.000000000 +0900 +++ work/drivers/scsi/libata-scsi.c 2005-08-20 23:13:40.000000000 +0900 @@ -373,12 +373,14 @@ int ata_scsi_slave_config(struct scsi_de int ata_scsi_error(struct Scsi_Host *host) { - struct ata_port *ap; + struct ata_port *ap = (struct ata_port *) &host->hostdata[0]; DPRINTK("ENTER\n"); - ap = (struct ata_port *) &host->hostdata[0]; - ap->ops->eng_timeout(ap); + /* The following locking ensures that we enter eng_timeout + * after ata_qc_complete is complete. + */ + spin_lock_irq(&ap->host_set->lock); /* TODO: this is per-command; when queueing is supported * this code will either change or move to a more @@ -387,6 +389,10 @@ int ata_scsi_error(struct Scsi_Host *hos host->host_failed--; INIT_LIST_HEAD(&host->eh_cmd_q); + spin_unlock_irq(&ap->host_set->lock); + + ap->ops->eng_timeout(ap); + DPRINTK("EXIT\n"); return 0; } ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH libata-dev-2.6:upstream 02/02] libata: add host_set locking to ata_scsi_error() 2005-08-20 14:15 ` [PATCH libata-dev-2.6:upstream 02/02] libata: add host_set locking to ata_scsi_error() Tejun Heo @ 2005-08-21 19:52 ` Jeff Garzik 2005-08-22 8:23 ` Tejun Heo 0 siblings, 1 reply; 9+ messages in thread From: Jeff Garzik @ 2005-08-21 19:52 UTC (permalink / raw) To: Tejun Heo; +Cc: albertcc, linux-ide Tejun Heo wrote: > 02_libata_add-locking-to-ata_scsi_error.patch > > SCSI EH can start before ata_qc_complete is completely > complete. so, latter part of ata_qc_complete can run > side-by-side with ->eng_timeout(), interfering EH. > > This patch makes ata_scsi_error() to grab and release host_set > lock before invoking ->eng_timeout(). > > Note: host_failed decrementing and eh_cmd_q banging are moved > above ->eng_timeout() invocation such that they're done while > holding the lock. I think it's better to follow a path that sets a "EH owns this port now" flag inside spinlock, spin_lock_irqsave() ap->in_eh = 1; spin_unlock_irqrestore() and then check that flag in ata_qc_complete() if (qc->ap->in_eh) return; /* do nothing else with command */ so that for the case of a late interrupt, the interrupt handler should acknowledge the interrupt, but not actually touch the ata_queued_cmd state. Jeff ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH libata-dev-2.6:upstream 02/02] libata: add host_set locking to ata_scsi_error() 2005-08-21 19:52 ` Jeff Garzik @ 2005-08-22 8:23 ` Tejun Heo 0 siblings, 0 replies; 9+ messages in thread From: Tejun Heo @ 2005-08-22 8:23 UTC (permalink / raw) To: Jeff Garzik; +Cc: albertcc, linux-ide On Sun, Aug 21, 2005 at 03:52:27PM -0400, Jeff Garzik wrote: > Tejun Heo wrote: > >02_libata_add-locking-to-ata_scsi_error.patch > > > > SCSI EH can start before ata_qc_complete is completely > > complete. so, latter part of ata_qc_complete can run > > side-by-side with ->eng_timeout(), interfering EH. > > > > This patch makes ata_scsi_error() to grab and release host_set > > lock before invoking ->eng_timeout(). > > > > Note: host_failed decrementing and eh_cmd_q banging are moved > > above ->eng_timeout() invocation such that they're done while > > holding the lock. > > I think it's better to follow a path that sets a "EH owns this port now" > flag inside spinlock, > > spin_lock_irqsave() > ap->in_eh = 1; > spin_unlock_irqrestore() > > and then check that flag in ata_qc_complete() > > if (qc->ap->in_eh) > return; /* do nothing else with command */ > > so that for the case of a late interrupt, the interrupt handler should > acknowledge the interrupt, but not actually touch the ata_queued_cmd state. Hi! I think above is larger subject than what this patch deals with and should be dealt with when we revamp EH (implementing in_eh will certainly involve other changes, I think). So, I think, for short term, either Albert's patch[1][2] or above one should go in to plug ATA_QCFLAG_ACTIVE race. I think if Albert's patch goes in, above patch is not necessary, as the only race we currently have is ATAPI EH case and clearing ATA_QCFLAG_ACTIVE is the only thing we do after ->complete_fn() in such cases. Thanks. [1] http://marc.theaimsgroup.com/?l=linux-ide&m=112417360223374&w=2 [2] Well, this referencing thing is neat. ;-) -- tejun ^ permalink raw reply [flat|nested] 9+ messages in thread
* a word about ATA interrupts, and SATA (was Re: [PATCH libata-dev-2.6:upstream 00/02] libata: race fixes) 2005-08-20 14:15 [PATCH libata-dev-2.6:upstream 00/02] libata: race fixes Tejun Heo 2005-08-20 14:15 ` [PATCH libata-dev-2.6:upstream 01/02] libata: implement ata_poll_qc_complete and use it in polling functions Tejun Heo 2005-08-20 14:15 ` [PATCH libata-dev-2.6:upstream 02/02] libata: add host_set locking to ata_scsi_error() Tejun Heo @ 2005-08-21 20:08 ` Jeff Garzik 2 siblings, 0 replies; 9+ messages in thread From: Jeff Garzik @ 2005-08-21 20:08 UTC (permalink / raw) To: Tejun Heo, linux-ide, Jens Axboe, Alan Cox, Mark Lord; +Cc: albertcc BTW, here is food for thought: In first generation SATA controllers, the host controller silicon emulates a set of PATA taskfile registers. When you follow the ATA command protocols (host state machines), handling ATA commands in the specified manner, the host controller silicon is processing that stuff behind your back, silently converting your actions into other actions: * storing data in a H2D Register FIS buffer, for eventually sending to the device * storing data received from a D2H Register FIS or SDB FIS into the host controller taskfile registers. * handling PIO Setup and Data FISes, for PIO data xfer and PACKET command protocol * handling FIS's associated with DMA data transfer I've never seen any documentation for any host controller, that describes how the emulation is implemented in silicon. Probably nobody outside of the original VHDL designers for each chip know that. All we know, we must figure out from inference: we pretend a host chip is PATA, and the host chip turns our actions into a SATA FIS conversion. Because of this, we can discern several unique SATA quirks: (a) writing most ATA registers simply involves storing a data value into a buffer on the host controller (b) writing to Device Control or Command registers causes a H2D Register FIS to be sent. (c) [most important to this thread] the ATA INTRQ is signalled by setting the 'I' bit in certain FIS's. It is no longer a pure indication of the INTRQ signal, as it was in PATA. Due to the ramifications of (c), polling in general is possible but not the best idea. libata currently uses polling for the IDENTIFY DEVICE command, and I have long been thinking about turning off polling for SATA controllers. Some SATA controllers, such as Vitesse/Intel, simply do not support the ATA nIEN bit. It is emulated in the driver. Other controllers may not honor the nIEN bit correctly (due to bad emulation logic in silicon, probably). Just factor this info into your decisions, when playing around with the interrupt delivery code. Jeff ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2005-08-23 5:06 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2005-08-20 14:15 [PATCH libata-dev-2.6:upstream 00/02] libata: race fixes Tejun Heo 2005-08-20 14:15 ` [PATCH libata-dev-2.6:upstream 01/02] libata: implement ata_poll_qc_complete and use it in polling functions Tejun Heo 2005-08-21 19:53 ` Jeff Garzik 2005-08-22 8:12 ` Tejun Heo 2005-08-23 5:06 ` Jeff Garzik 2005-08-20 14:15 ` [PATCH libata-dev-2.6:upstream 02/02] libata: add host_set locking to ata_scsi_error() Tejun Heo 2005-08-21 19:52 ` Jeff Garzik 2005-08-22 8:23 ` Tejun Heo 2005-08-21 20:08 ` a word about ATA interrupts, and SATA (was Re: [PATCH libata-dev-2.6:upstream 00/02] libata: race fixes) Jeff Garzik
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).