From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Daniel Stodden <daniel.stodden@citrix.com>
Cc: "Xen-devel@lists.xensource.com" <Xen-devel@lists.xensource.com>,
Tom Kopec <tek@acm.org>, Jan Beulich <JBeulich@novell.com>
Subject: Re: [GIT PULL] Fix lost interrupt race in Xen event channels
Date: Fri, 27 Aug 2010 16:43:44 -0700 [thread overview]
Message-ID: <4C784DB0.3000103@goop.org> (raw)
In-Reply-To: <1282941781.26797.386.camel@agari.van.xensource.com>
On 08/27/2010 01:43 PM, Daniel Stodden wrote:
> On Fri, 2010-08-27 at 04:56 -0400, Jan Beulich wrote:
>>>>> On 26.08.10 at 18:32, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>>> On 08/25/2010 11:46 PM, Jan Beulich wrote:
>>>> >>> On 25.08.10 at 19:54, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>>>>> Note that this patch is specifically for upstream Xen, which doesn't
>>>>> have any pirq support in it at present.
>>>> I understand that, but saw that you had paralleling changes to the
>>>> pirq handling in your Dom0 tree.
>>>>
>>>>> However, I did consider using fasteoi, but I couldn't see how to make
>>>>> it work. The problem is that it only does a single call into the
>>>>> irq_chip for EOI after calling the interrupt handler, but there is no
>>>>> call beforehand to ack the interrupt (which means clear the event flag
>>>>> in our case). This leads to a race where an event can be lost after the
>>>>> interrupt handler has returned, but before the event flag has been
>>>>> cleared (because Xen won't set pending or call the upcall function if
>>>>> the event is already set). I guess I could pre-clear the event in the
>>>>> upcall function, but I'm not sure that's any better.
>>>> That's precisely what we're doing.
>>> You mean pre-clearing the event? OK.
>>>
>>> But aren't you still subject to the bug the switch to handle_edge_irq fixed?
>>>
>>> With handle_fasteoi_irq:
>>>
>>> cpu A cpu B
>>> get event
>> mask and clear event
> Argh. Right, I guess that's my fault, I was the one who came up with the
> PENDING theory, but indeed I failed to see the event masking bits.
>
> However, please read on.
>
>>> set INPROGRESS
>>> call action
>>> :
>>> :
>>> <migrate event channel to B>
>>> : get event
>> Cannot happen, event is masked (i.e. all that would happen is
>> that the event occurrence would be logged evtchn_pending).
>>
>>> : INPROGRESS set? -> EOI, return
>>> :
>>> action returns
>>> clear INPROGRESS
>>> EOI
>> unmask event, checking for whether the event got re-bound (and
>> doing the unmask through a hypercall if necessary), thus re-raising
>> the event in any case
> Yes. I agree. So let's come up with a new theory. Right now I'm still
> looking at xen/next. Correct me if I'm mistaken:
>
> mask_ack_pirq will:
> 1. chip->mask
> 2. chip->ack
>
> Where chip->ack will:
> 1. move_native_irq
> 2. clear_evtchn.
>
> Now if you look into move_native_irq, it will:
> 1. chip->mask (gratuitous)
> 2. move
> 3. chip->unmask (aiiiiiie).
>
> That explains why edge_irq still fixed the problem.
Good point. I guess the simplest fix in that case would have been to
use move_masked_irq()...
The current fix is not wrong, so we can leave it as-is upstream for now.
But I think I will try Jan's idea about masking/clearing in the event
upcall then using fasteoi as the handler.
J
next prev parent reply other threads:[~2010-08-27 23:43 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-08-24 21:35 [GIT PULL] Fix lost interrupt race in Xen event channels Jeremy Fitzhardinge
2010-08-24 21:35 ` Jeremy Fitzhardinge
2010-08-25 7:52 ` [Xen-devel] " Jan Beulich
2010-08-25 10:04 ` Daniel Stodden
2010-08-25 11:24 ` Jan Beulich
2010-08-25 17:54 ` Jeremy Fitzhardinge
2010-08-25 17:54 ` Jeremy Fitzhardinge
2010-08-26 6:46 ` Jan Beulich
2010-08-26 16:32 ` Jeremy Fitzhardinge
2010-08-27 8:56 ` Jan Beulich
2010-08-27 20:43 ` Daniel Stodden
2010-08-27 21:49 ` Daniel Stodden
2010-08-27 23:43 ` Jeremy Fitzhardinge [this message]
2010-08-30 8:03 ` Jan Beulich
2010-08-30 8:43 ` Jan Beulich
2010-08-30 8:48 ` Keir Fraser
2010-08-30 9:06 ` Jan Beulich
2010-08-30 9:15 ` Keir Fraser
2010-08-30 9:22 ` Jan Beulich
2010-08-30 16:36 ` Jeremy Fitzhardinge
2010-08-31 6:38 ` Jan Beulich
2010-09-03 18:46 ` Using handle_fasteoi_irq for pirqs Jeremy Fitzhardinge
2010-09-06 7:58 ` Jan Beulich
2010-09-07 1:53 ` Jeremy Fitzhardinge
2010-09-07 6:58 ` Jan Beulich
2010-09-07 8:02 ` Jeremy Fitzhardinge
2010-09-07 8:58 ` Jan Beulich
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4C784DB0.3000103@goop.org \
--to=jeremy@goop.org \
--cc=JBeulich@novell.com \
--cc=Xen-devel@lists.xensource.com \
--cc=daniel.stodden@citrix.com \
--cc=tek@acm.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.