From mboxrd@z Thu Jan 1 00:00:00 1970 From: Tejun Heo Subject: Re: [PATCH libata-dev-2.6:upstream 00/02] atapi: packet task vs. intr race fix Date: Mon, 22 Aug 2005 06:38:24 +0900 Message-ID: <4308F450.4050402@gmail.com> References: <20050820091651.486821E0@htj.dyndns.org> <4308DA57.9000303@pobox.com> <4308E13E.6070104@gmail.com> <4308E2C9.6040605@pobox.com> <4308E6B1.6030001@gmail.com> <4308F114.9020608@pobox.com> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Return-path: Received: from zproxy.gmail.com ([64.233.162.202]:40320 "EHLO zproxy.gmail.com") by vger.kernel.org with ESMTP id S1751149AbVHUVid (ORCPT ); Sun, 21 Aug 2005 17:38:33 -0400 Received: by zproxy.gmail.com with SMTP id r28so632609nza for ; Sun, 21 Aug 2005 14:38:32 -0700 (PDT) In-Reply-To: <4308F114.9020608@pobox.com> Sender: linux-ide-owner@vger.kernel.org List-Id: linux-ide@vger.kernel.org To: Jeff Garzik Cc: albertcc@tw.ibm.com, linux-ide@vger.kernel.org Howdy, Jeff. Jeff Garzik wrote: > Tejun Heo wrote: > >> Jeff Garzik wrote: >> >>> Tejun Heo wrote: >>> >>>> >>>> Hello, Jeff. >>>> >>>> Jeff Garzik wrote: >>>> >>>>> Tejun Heo wrote: >>>>> >>>>>> Hello, Jeff and Albert. >>>>>> >>>>>> This patchset fixes the following race. >>>>>> (port A has ATA device and B ATAPI). >>>>>> >>>>>> port B : ata_issue_prot() (ATAPI_NODATA) >>>>>> port B : packet_task scheduled >>>>>> port A : ata_issue_prot() (DMA) >>>>>> intr : ata_interrupt() >>>>>> port A : ata_host_intr -> this is all good & dandy >>>>>> port B : ata_host_intr -> finishes ATAPI cmd w/ error (request >>>>>> sense) >>>>>> >>>>>> This is where the race is, we're polling for port B's qc, we must >>>>>> not >>>>>> mess with it from interrupt. Now, port B has dangling packet_task >>>>>> which will race w/ whatever will run on port B. >>>>>> >>>>>> The problem is that we don't always protect polled ports from >>>>>> interrupts with ata_qc_set_polling() and for non-DMA ATA commands >>>>>> there's no way to discern if actually an IRQ has occurred (this >>>>>> sucks), so we end up finishing the other port's command. >>>>>> >>>>>> This condition occurs quite often if both port A and B are busy. >>>>>> The >>>>>> most common result is assertion failure in atapi_packet_task >>>>>> (assert(qc->flags & ATA_QCFLAG_ACTIVE)), but depending on timing and >>>>>> on SMP, weirder things could happen, I think. >>>>>> >>>>>> Note that for ATAPI_DMA, interrupt from the other port won't mess >>>>>> with a polled command as we can tell that it's not ours with >>>>>> bmdma_status, but, if spurious interrupt occurs on the port, the >>>>>> packet_task will go dangling. That's why ATAPI_DMA also needs >>>>>> protection. The baseline is that all polled qc's need to be >>>>>> protected >>>>>> with ata_qc_set_polling() until polling task is done with the >>>>>> command. >>>>>> >>>>>> >>>>>> [ Start of patch descriptions ] >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> While this is something to look into, the supplied patches are >>>>> definitely not the way we want to go. We need to follow the state >>>>> diagram for the PACKET command protocol. ATA4 [1] diagrams are a >>>>> good thing to read, since they include mention of the behavior of >>>>> certain ATAPI devices that can send an interrupt between >>>>> taskfile-out and write-cdb steps in the sequence. >>>>> >>>>> In patch #2, you're making ata_irq_on() way too heavy. In patch >>>>> #1, calling ata_qc_set_polling() for non-polled commands is a hack. >>>> >>>> >>>> >>>> >>>> >>>> Have you received patches in the reverse order? #1 changes >>>> ata_irq_on() and #2 adds ata_qc_set_polling(). >>>> >>>> Hmmm, only a call to ata_chk_status() is added to ata_irq_on(), >>>> which I think is needed regardless of other changes, and >>>> ata_wait_idle() is removed. Does that make the function heavy? >>>> >>>>> >>>>> The better solution is to track the PACKET command protocol state >>>>> much more closely, so that the code _knows_ when it should and >>>>> shouldn't be getting an interrupt. >>>>> >>>>> This is required anyway because, as mentioned in another email, an >>>>> ATA device might assert INTRQ for certain events, such as non-data >>>>> commands, where the controller is not required to assert the BMDMA >>>>> IRQ bit. >>>>> >>>>> I _suspect_ that many host controllers cause the BMDMA IRQ bit to >>>>> track the ATA INTRQ status precisely, but this theory has not been >>>>> validated. >>>>> >>>>> Jeff >>>>> >>>>> >>>>> [1] http://www.t13.org/project/d1153r18-ATA-ATAPI-4.pdf >>>> >>>> >>>> >>>> >>>> >>>> Without interrupt pending information from BMDMA bit for non-DMA >>>> commands (which I don't think we can use for non-DMA cmds as we'll >>>> never be sure if all controllers behave that way), the problem is >>>> that for many SATA controllers, more than one ports share single >>>> interrupt line. And without interrupt pending bit, shared >>>> interrupt means a lot of spurious interrupts making it impossible to >>>> know when to expect interrupts. >>> >>> >>> >>> >>> Incorrect... this is why I keep harping on the "ATA host state >>> machine." >>> >>> You rely on proper implementation to know when to expect interrupts. >>> Read the state diagrams, they tell you precisely when an interrupt >>> may be expected. By definition, any other interrupt is probably a >>> PCI shared interrupt, or a hardware or software bug. >>> >> >> I think I'm familiar with ATA host state machine. I don't think I'm >> currently saying things because I don't know about it. Maybe I'm >> missing something here. Please consider the following scenario. >> >> 1. non-DMA command issued. >> 2. device does something >> 3. another device sharing the IRQ line raises interrupt >> 4. D2H FIS received (BSY off, DRDY on, INTR on) >> 5. interrupt handler kicks in for #3 >> 6. SATA controller raises interrupt >> 7. SATA handler invoked from #5 checks ATA_STATUS and determines that >> completes the command (we don't know if it's our interrupt or not) >> 8. interrupt handler from #5 returns. >> 9. interrupt handler for #6. >> 10. nobody cared. >> >> Did I miss something? > > > This gets into interrupt controller hardware, how interrupts are > asserted and delivered, and similar not-IDE-specific things. > > If you clear the condition (by checking ATA Status) that caused the > interrupt hardware to raise an interrupt, during #5 and #7, then the > interrupt condition is cleared in the interrupt controller hardware, and > the kernel should not invoke the interrupt handler again. > Ah... yes right. Gotta be level interrupt to be shared. > >> I think the problem is whether or not we're expecting interrupts, >> we'll get interrupts that are't ours and we have no way to tell if >> they're ours or not. No? > > > You can tell just by following normal interrupt assertion rules, and by > the "if it's nor ours it's either somebody else's interrupt, or > out-of-spec hardware" > > In addition, (slight tangent) > * if DMA is active, you can check the DMA status bits to see if you have > an interrupt condition to handle > * if DMA is not active, you can check the AltStatus register, to test > the state without actually clearing anything > > Though beware, on rare PATA controllers (old Macs?) the AltStatus > register doesn't exist. > Now I see what you're saying. We should be able to know/clear interrupt by reading ATA_STATUS. So, not having explicit interrupt pending bit can be compensated by following changes in ATA_STATUS (so the state machine), right? Thanks a lot for putting up with my ignorance. So, how about a port status bit to tell interrupt handler whether or not we are expecting an interrupt currently? That would solve the race from I tried to fix in the first patchset. If that's okay with you, I'll redo all four patches accordingly. Again, thanks a lot. :-) -- tejun