From: "Jan Beulich" <JBeulich@novell.com>
To: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: "Xen-devel@lists.xensource.com" <Xen-devel@lists.xensource.com>,
Tom Kopec <tek@acm.org>,
Daniel Stodden <daniel.stodden@citrix.com>
Subject: Re: [GIT PULL] Fix lost interrupt race in Xen event channels
Date: Fri, 27 Aug 2010 09:56:43 +0100 [thread overview]
Message-ID: <4C7799EB020000780001276F@vpn.id2.novell.com> (raw)
In-Reply-To: <4C769736.4050409@goop.org>
>>> On 26.08.10 at 18:32, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> On 08/25/2010 11:46 PM, Jan Beulich wrote:
>> >>> On 25.08.10 at 19:54, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>>> Note that this patch is specifically for upstream Xen, which doesn't
>>> have any pirq support in it at present.
>> I understand that, but saw that you had paralleling changes to the
>> pirq handling in your Dom0 tree.
>>
>>> However, I did consider using fasteoi, but I couldn't see how to make
>>> it work. The problem is that it only does a single call into the
>>> irq_chip for EOI after calling the interrupt handler, but there is no
>>> call beforehand to ack the interrupt (which means clear the event flag
>>> in our case). This leads to a race where an event can be lost after the
>>> interrupt handler has returned, but before the event flag has been
>>> cleared (because Xen won't set pending or call the upcall function if
>>> the event is already set). I guess I could pre-clear the event in the
>>> upcall function, but I'm not sure that's any better.
>> That's precisely what we're doing.
>
> You mean pre-clearing the event? OK.
>
> But aren't you still subject to the bug the switch to handle_edge_irq fixed?
>
> With handle_fasteoi_irq:
>
> cpu A cpu B
> get event
mask and clear event
> set INPROGRESS
> call action
> :
> :
> <migrate event channel to B>
> : get event
Cannot happen, event is masked (i.e. all that would happen is
that the event occurrence would be logged evtchn_pending).
> : INPROGRESS set? -> EOI, return
> :
> action returns
> clear INPROGRESS
> EOI
unmask event, checking for whether the event got re-bound (and
doing the unmask through a hypercall if necessary), thus re-raising
the event in any case
> The event arriving on B is lost, and there's no record made of it ever
> having arrived. How do you avoid this?
>
> With handle_edge_irq, the second event will set PENDING in the irq_desc,
> and a loop will keep running on cpu A until PENDING is clear, so nothing
> is ever lost.
Actually, considering that you mask and unmask just like we do, I
cannot even see why you would have run into above scenario
with handle_level_irq(). Where is the window that I'm missing?
>>> In the dom0 kernels, I followed the example of the IOAPIC irq_chip
>>> implementation, which does the hardware EOI in the ack call at the start
>>> of handle_edge_irq, can did the EOI hypercall (when necessary) there. I
>>> welcome a reviewer's eye on this though.
>> This I didn't actually notice so far.
>>
>> That doesn't look right, at least in combination with ->mask() being
>> wired to disable_pirq(), which is empty (and btw., if the latter was
>> right, you should also wire ->mask_ack() to disable_pirq() to avoid
>> a pointless indirect call).
>>
>> But even with ->mask() actually masking the IRQ I'm not certain
>> this is right. At the very least it'll make Xen see a potential
>> second instance of the same IRQ much earlier than you will
>> really be able to handle it. This is particularly bad for level
>> triggered ones, as Xen will see them right again after you
>> passed it the EOI notification - as a result there'll be twice as
>> many interrupts seen by Xen on the respective lines.
>>
>> The native I/O APIC can validly do this as ->ack() only gets
>> called for edge triggered interrupts (which is why ->eoi() is
>> wired to ack_apic_level()).
>
> Yes. The code as-is works OK, but I haven't checked to see if Xen it
> taking many spurious interrupts. It probably helps that my test machine
> has MSI for almost everything.
>
> But does that mean the pirq code needs to have different ack/eoi
> behaviour depending on the triggering of the ioapic interrupt?
If you want to continue to use handle_edge_irq(), I think you will.
With handle_fasteoi_irq(), you would leverage Xen's handling of
edge/level, and wouldn't need to make any distinction.
Jan
next prev parent reply other threads:[~2010-08-27 8:56 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-08-24 21:35 [GIT PULL] Fix lost interrupt race in Xen event channels Jeremy Fitzhardinge
2010-08-25 7:52 ` [Xen-devel] " Jan Beulich
2010-08-25 10:04 ` Daniel Stodden
2010-08-25 11:24 ` Jan Beulich
2010-08-25 17:54 ` Jeremy Fitzhardinge
2010-08-26 6:46 ` Jan Beulich
2010-08-26 16:32 ` Jeremy Fitzhardinge
2010-08-27 8:56 ` Jan Beulich [this message]
2010-08-27 20:43 ` Daniel Stodden
2010-08-27 21:49 ` Daniel Stodden
2010-08-27 23:43 ` Jeremy Fitzhardinge
2010-08-30 8:03 ` Jan Beulich
2010-08-30 8:43 ` Jan Beulich
2010-08-30 8:48 ` Keir Fraser
2010-08-30 9:06 ` Jan Beulich
2010-08-30 9:15 ` Keir Fraser
2010-08-30 9:22 ` Jan Beulich
2010-08-30 16:36 ` Jeremy Fitzhardinge
2010-08-31 6:38 ` Jan Beulich
2010-09-03 18:46 ` Using handle_fasteoi_irq for pirqs Jeremy Fitzhardinge
2010-09-06 7:58 ` Jan Beulich
2010-09-07 1:53 ` Jeremy Fitzhardinge
2010-09-07 6:58 ` Jan Beulich
2010-09-07 8:02 ` Jeremy Fitzhardinge
2010-09-07 8:58 ` Jan Beulich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4C7799EB020000780001276F@vpn.id2.novell.com \
--to=jbeulich@novell.com \
--cc=Xen-devel@lists.xensource.com \
--cc=daniel.stodden@citrix.com \
--cc=jeremy@goop.org \
--cc=tek@acm.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).