linux-ide.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tejun Heo <htejun@gmail.com>
To: Jeff Garzik <jgarzik@pobox.com>
Cc: albertcc@tw.ibm.com, linux-ide@vger.kernel.org
Subject: Re: [PATCH libata-dev-2.6:upstream 00/02] atapi: packet task vs. intr race fix
Date: Mon, 22 Aug 2005 05:17:02 +0900	[thread overview]
Message-ID: <4308E13E.6070104@gmail.com> (raw)
In-Reply-To: <4308DA57.9000303@pobox.com>


  Hello, Jeff.

Jeff Garzik wrote:
> Tejun Heo wrote:
> 
>>  Hello, Jeff and Albert.
>>
>>  This patchset fixes the following race.
>> (port A has ATA device and B ATAPI).
>>
>> port B    : ata_issue_prot() (ATAPI_NODATA)
>> port B    : packet_task scheduled
>> port A    : ata_issue_prot() (DMA)
>> intr    : ata_interrupt()
>> port A    : ata_host_intr -> this is all good & dandy
>> port B    : ata_host_intr -> finishes ATAPI cmd w/ error (request sense)
>>
>>  This is where the race is, we're polling for port B's qc, we must not
>> mess with it from interrupt.  Now, port B has dangling packet_task
>> which will race w/ whatever will run on port B.
>>
>>  The problem is that we don't always protect polled ports from
>> interrupts with ata_qc_set_polling() and for non-DMA ATA commands
>> there's no way to discern if actually an IRQ has occurred (this
>> sucks), so we end up finishing the other port's command.
>>
>>  This condition occurs quite often if both port A and B are busy.  The
>> most common result is assertion failure in atapi_packet_task
>> (assert(qc->flags & ATA_QCFLAG_ACTIVE)), but depending on timing and
>> on SMP, weirder things could happen, I think.
>>
>>  Note that for ATAPI_DMA, interrupt from the other port won't mess
>> with a polled command as we can tell that it's not ours with
>> bmdma_status, but, if spurious interrupt occurs on the port, the
>> packet_task will go dangling.  That's why ATAPI_DMA also needs
>> protection.  The baseline is that all polled qc's need to be protected
>> with ata_qc_set_polling() until polling task is done with the command.
>>
>>
>> [ Start of patch descriptions ]
> 
> 
> While this is something to look into, the supplied patches are 
> definitely not the way we want to go.  We need to follow the state 
> diagram for the PACKET command protocol.  ATA4 [1] diagrams are a good 
> thing to read, since they include mention of the behavior of certain 
> ATAPI devices that can send an interrupt between taskfile-out and 
> write-cdb steps in the sequence.
> 
> In patch #2, you're making ata_irq_on() way too heavy.  In patch #1, 
> calling ata_qc_set_polling() for non-polled commands is a hack.

  Have you received patches in the reverse order?  #1 changes 
ata_irq_on() and #2 adds ata_qc_set_polling().

  Hmmm, only a call to ata_chk_status() is added to ata_irq_on(), which 
I think is needed regardless of other changes, and ata_wait_idle() is 
removed.  Does that make the function heavy?

> 
> The better solution is to track the PACKET command protocol state much 
> more closely, so that the code _knows_ when it should and shouldn't be 
> getting an interrupt.
> 
> This is required anyway because, as mentioned in another email, an ATA 
> device might assert INTRQ for certain events, such as non-data commands, 
> where the controller is not required to assert the BMDMA IRQ bit.
> 
> I _suspect_ that many host controllers cause the BMDMA IRQ bit to track 
> the ATA INTRQ status precisely, but this theory has not been validated.
> 
>     Jeff
> 
> 
> [1] http://www.t13.org/project/d1153r18-ATA-ATAPI-4.pdf

  Without interrupt pending information from BMDMA bit for non-DMA 
commands (which I don't think we can use for non-DMA cmds as we'll never 
be sure if all controllers behave that way), the problem is that for 
many SATA controllers, more than one ports share single interrupt line. 
   And without interrupt pending bit, shared interrupt means a lot of 
spurious interrupts making it impossible to know when to expect interrupts.

  IDE driver deals with this by having only one command active per 
interrupt, but SATA doesn't have such scheme yet.  And I don't know if 
such a scheme is desirable at all.  Maybe just continuing to poll 
non-DMA commands (which isn't much a burden anyway) and keeping DMA 
commands fast is a better approach.

  So, the thing is that if we run non-DMA commands and other commands at 
the same time on ports which share interrupts, we'll get spurious 
interrupts very often regardless of ATA/ATAPI state machine.  Remember 
many spurious interrupts reports/fixes regarding ATAPI devices?  I 
suspect many of them are not actual spurious interrupts but interrupts 
from the other device sharing the host_set.

  So, how should we do here?  To follow ATA/ATAPI state machine, we need 
to implement exclusion among ports sharing an interrupt.  Is this the 
way to go?  Arggggggg...  Lack of interrupt pending bit is such a pain 
in the ass.  :-(

-- 
tejun

  reply	other threads:[~2005-08-21 21:12 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-08-20  9:17 [PATCH libata-dev-2.6:upstream 00/02] atapi: packet task vs. intr race fix Tejun Heo
2005-08-20  9:17 ` [PATCH libata-dev-2.6:upstream 01/02] atapi: update ata_irq_on Tejun Heo
2005-08-20  9:17 ` [PATCH libata-dev-2.6:upstream 02/02] atapi: fix atapi_packet_task vs. ata_host_intr race Tejun Heo
2005-08-21 19:47 ` [PATCH libata-dev-2.6:upstream 00/02] atapi: packet task vs. intr race fix Jeff Garzik
2005-08-21 20:17   ` Tejun Heo [this message]
2005-08-21 20:23     ` Jeff Garzik
2005-08-21 20:40       ` Tejun Heo
2005-08-21 21:24         ` Jeff Garzik
2005-08-21 21:38           ` Tejun Heo
2005-08-21 21:59             ` Jeff Garzik
2005-08-21 23:51               ` Tejun Heo
2005-08-22  0:27                 ` Jeff Garzik
2005-08-22  3:57                   ` Tejun Heo
2005-08-22  4:01                     ` Jeff Garzik
2005-08-22  3:59                   ` [PATCH libata:upstream] fix atapi_packet_task vs. intr race Tejun Heo
2005-08-22  4:15                     ` Jeff Garzik
2005-08-22  5:59                       ` [PATCH libata:upstream] fix atapi_packet_task vs. intr race (take 2) Tejun Heo
2005-08-22  6:54                         ` Jeff Garzik
2005-08-22  7:06                           ` Tejun Heo
2005-08-23  5:06                         ` Jeff Garzik
2005-08-21 20:25     ` [PATCH libata-dev-2.6:upstream 00/02] atapi: packet task vs. intr race fix Jeff Garzik
2005-08-22 14:48     ` Mark Lord

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4308E13E.6070104@gmail.com \
    --to=htejun@gmail.com \
    --cc=albertcc@tw.ibm.com \
    --cc=jgarzik@pobox.com \
    --cc=linux-ide@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).