From: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
To: Trent Piepho <tpiepho@impinj.com>,
"marc.zyngier@arm.com" <marc.zyngier@arm.com>,
"lorenzo.pieralisi@arm.com" <lorenzo.pieralisi@arm.com>,
"gustavo.pimentel@synopsys.com" <gustavo.pimentel@synopsys.com>
Cc: "faiz_abbas@ti.com" <faiz_abbas@ti.com>,
"jingoohan1@gmail.com" <jingoohan1@gmail.com>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
"vigneshr@ti.com" <vigneshr@ti.com>,
"stable@vger.kernel.org" <stable@vger.kernel.org>,
"bhelgaas@google.com" <bhelgaas@google.com>,
"joao.pinto@synopsys.com" <joao.pinto@synopsys.com>
Subject: Re: [PATCH] PCI: dwc: Fix interrupt race in when handling MSI
Date: Mon, 12 Nov 2018 23:45:42 +0000
Message-ID: <4806d92c-697c-2a92-d843-37effd00b20a@synopsys.com>
In-Reply-To: <1541710298.30311.349.camel@impinj.com>

On 08/11/2018 20:51, Trent Piepho wrote:
> On Thu, 2018-11-08 at 11:46 +0000, Gustavo Pimentel wrote:
>> On 07/11/2018 18:32, Trent Piepho wrote:
>>> On Wed, 2018-11-07 at 12:57 +0000, Gustavo Pimentel wrote:
>>>> On 06/11/2018 16:00, Marc Zyngier wrote:
>>>>> On 06/11/18 14:53, Lorenzo Pieralisi wrote:
>>>>>> On Sat, Oct 27, 2018 at 12:00:57AM +0000, Trent Piepho wrote:
>>>>>>>
>>>>>>> This gives the following race scenario:
>>>>>>>
>>>>>>> 1. An MSI is received by, and the status bit for the MSI is set in, the
>>>>>>> DWC PCI-e controller.
>>>>>>> 2. dw_handle_msi_irq() calls a driver's registered interrupt handler
>>>>>>> for the MSI received.
>>>>>>> 3. At some point, the interrupt handler must decide, correctly, that
>>>>>>> there is no more work to do and return.
>>>>>>> 4. The hardware generates a new MSI. As the MSI's status bit is still
>>>>>>> set, this new MSI is ignored.
>>>>>>> 5. dw_handle_msi_irq() unsets the MSI status bit.
>>>>>>>
>>>>>>> The MSI received at point 4 will never be acted upon. It occurred after
>>>>>>> the driver had finished checking the hardware status for interrupt
>>>>>>> conditions to act on. Since the MSI status was masked, it does not
>>>>>>> generate a new IRQ, neither when it was received nor when the MSI is
>>>>>>> unmasked.
>>>>>>>
>>>> This status register indicates whether or not there is an MSI interrupt on
>>>> that controller [0..7] to be handled.
>>>
>>> While the status for an MSI is set, no new interrupt will be triggered
>>
>> Yes
>>
>>> if another identical MSI is received, correct?
>>
>> You cannot receive another identical MSI till you acknowledge the current one
>> (This is ensured by the PCI protocol, I guess).
>
> I don't believe this is a requirement of PCI. We designed our hardware
> to not send another MSI until our hardware's interrupt status register
> is read, but we didn't have to do that.
>
>>>> In theory, we should clear the interrupt flag only after the interrupt has
>>>> actually been handled (which can take some time in the worst-case scenario).
>>>
>>> But see above, there is a race if a new MSI arrives while still masked.
>>> I can see no possible way to solve this in software that does not
>>> involve unmasking the MSI before calling the handler. To leave the
>>> interrupt masked while calling the handler requires the hardware to
>>> queue an interrupt that arrives while masked. We have no docs, but the
>>> designware controller doesn't appear to do this in practice.
>>
>> See my reply to Marc about the interrupt masking. Like you said, the solution
>> probably involves using the interrupt mask/unmask register instead of the
>> interrupt enable/disable register.
>>
>> Can you do a quick test, since you can easily reproduce the issue? Can you
>> change the register offset in both functions, dw_pci_bottom_mask() and
>> dw_pci_bottom_unmask()?
>>
>> Basically, replace the PCIE_MSI_INTR0_ENABLE register with PCIE_MSI_INTR0_MASK.
>
> Of course MSIs still need to be enabled to work at all, which happens
> once when the driver using the MSI registers a handler. Masking can be
> done via mask register after that.
>

Correct. I was asking to switch only the functions mentioned, which are called
after dw_pcie_setup_rc() has enabled the interrupts.
> It is not so easy for me to test on the newest kernel, as imx7d does
> not work due to yet more bugs. I have to port a set of patches to each
> new kernel. A set that does not shrink due to holdups like this.

OK, I have to try to replicate this lost-interrupt scenario so that I can do
something about it. Until now this has never happened.

>
> I understand the new flow would look like this (assume dw controller
> MSI interrupt output signal is connected to one of the ARM GIC
> interrupt lines, there could be different or more controllers above the
> dwc of course (but usually aren't)):
>
> 1. MSI arrives, status bit is set in dwc, interrupt raised to GIC.
> 2. dwc handler runs
> 3. dwc handler sees status bit is set for a(n) MSI(s)
> 4. dwc handler sets mask for those MSIs
> 5. dwc handler clears status bit
> 6. dwc handler runs driver handler for the received MSI
> 7. ** a new MSI arrives, racing with 6 **
> 8. status bit becomes set again, but no interrupt is raised due to mask
> 9. dwc handler unmasks MSI, which raises the interrupt to GIC because
> of new MSI received in 7.
> 10. The original GIC interrupt is EOI'ed.
> 11. The interrupt for the dwc is re-raised by the GIC due to 9, and we
> go back to 2.
>
> It is very important that 5 be done before 6. Less so 4 before 5, but
> reversing the order there would allow re-raising even if the 2nd MSI
> arrived before the driver handler ran, which is not necessary.
>
> I do not see a race in this design and it appears correct to me. But,
> I also do not think there is any immediate improvement due to extra
> steps of masking and unmasking the MSI.
>
> The reason is that the GIC interrupt above the dwc is non-reentrant.
> It remains masked (aka active[1]) during this entire process (1 to 10).
> This means every MSI is effectively already masked. So masking the
> active MSI(s) a 2nd time gains nothing besides preventing some extra
> edges for a masked interrupt going to the ARM GIC.
>
> In theory, if the GIC interrupt handler was reentrant, then on receipt
> of a new MSI we could re-enter the dwc handler on a different CPU and
> run the new MSI (a different MSI!) at the same time as the original MSI
> handler is still running.
>
> The difference here is that by unmasking the interrupt in the GIC
> before the dwc handler is finished, masking an individual MSI in the
> dwc is no longer a 2nd redundant masking.
>
>
> [1] When I say masked in GIC, I mean the interrupt is in the "active"
> or "active and pending" states. In these states the interrupt will not
> be raised to the CPU and can be considered masked.
>