Re: [GIT PULL] Fix lost interrupt race in Xen event channels

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Daniel Stodden <daniel.stodden@citrix.com>
Cc: "Xen-devel@lists.xensource.com" <Xen-devel@lists.xensource.com>,
	Tom Kopec <tek@acm.org>, Jan Beulich <JBeulich@novell.com>
Subject: Re: [GIT PULL] Fix lost interrupt race in Xen	 event channels
Date: Fri, 27 Aug 2010 16:43:44 -0700	[thread overview]
Message-ID: <4C784DB0.3000103@goop.org> (raw)
In-Reply-To: <1282941781.26797.386.camel@agari.van.xensource.com>

 On 08/27/2010 01:43 PM, Daniel Stodden wrote:
> On Fri, 2010-08-27 at 04:56 -0400, Jan Beulich wrote:
>>>>> On 26.08.10 at 18:32, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>>> On 08/25/2010 11:46 PM, Jan Beulich wrote:
>>>>  >>> On 25.08.10 at 19:54, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
>>>>> Note that this patch is specifically for upstream Xen, which doesn't
>>>>> have any pirq support in it at present.
>>>> I understand that, but saw that you had paralleling changes to the
>>>> pirq handling in your Dom0 tree.
>>>>
>>>>> However,  I did consider using fasteoi, but I couldn't see how to make
>>>>> it work.  The problem is that it only does a single call into the
>>>>> irq_chip for EOI after calling the interrupt handler, but there is no
>>>>> call beforehand to ack the interrupt (which means clear the event flag
>>>>> in our case).  This leads to a race where an event can be lost after the
>>>>> interrupt handler has returned, but before the event flag has been
>>>>> cleared (because Xen won't set pending or call the upcall function if
>>>>> the event is already set).  I guess I could pre-clear the event in the
>>>>> upcall function, but I'm not sure that's any better.
>>>> That's precisely what we're doing.
>>> You mean pre-clearing the event?  OK.
>>>
>>> But aren't you still subject to the bug the switch to handle_edge_irq fixed?
>>>
>>> With handle_fasteoi_irq:
>>>
>>> cpu A			cpu B
>>> get event
>> mask and clear event
> Argh. Right, I guess that's my fault, I was the one who came up with the
> PENDING theory, but indeed I failed to see the event masking bits.
>
> However, please read on.
>
>>> set INPROGRESS
>>> call action
>>>    :
>>>    :
>>> <migrate event channel to B>
>>>    :			get event
>> Cannot happen, event is masked (i.e. all that would happen is
>> that the event occurrence would be logged evtchn_pending).
>>
>>>    :			INPROGRESS set? -> EOI, return
>>>    :
>>> action returns
>>> clear INPROGRESS
>>> EOI
>> unmask event, checking for whether the event got re-bound (and
>> doing the unmask through a hypercall if necessary), thus re-raising
>> the event in any case
> Yes. I agree. So let's come up with a new theory. Right now I'm still
> looking at xen/next. Correct me if I'm mistaken:
>
> mask_ack_pirq will:
>  1. chip->mask
>  2. chip->ack
>
> Where chip->ack will:
>  1. move_native_irq
>  2. clear_evtchn.
>
> Now if you look into move_native_irq, it will:
>  1. chip->mask (gratuitous)
>  2. move
>  3. chip->unmask (aiiiiiie).
>
> That explains why edge_irq still fixed the problem.

Good point.  I guess the simplest fix in that case would have been to
use move_masked_irq()...

The current fix is not wrong, so we can leave it as-is upstream for now.

But I think I will try Jan's idea about masking/clearing in the event
upcall then using fasteoi as the handler.

    J

next prev parent reply	other threads:[~2010-08-27 23:43 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-08-24 21:35 [GIT PULL] Fix lost interrupt race in Xen event channels Jeremy Fitzhardinge
2010-08-25  7:52 ` [Xen-devel] " Jan Beulich
2010-08-25 10:04   ` Daniel Stodden
2010-08-25 11:24     ` Jan Beulich
2010-08-25 17:54   ` Jeremy Fitzhardinge
2010-08-26  6:46     ` Jan Beulich
2010-08-26 16:32       ` Jeremy Fitzhardinge
2010-08-27  8:56         ` Jan Beulich
2010-08-27 20:43           ` Daniel Stodden
2010-08-27 21:49             ` Daniel Stodden
2010-08-27 23:43             ` Jeremy Fitzhardinge [this message]
2010-08-30  8:03             ` Jan Beulich
2010-08-30  8:43               ` Jan Beulich
2010-08-30  8:48                 ` Keir Fraser
2010-08-30  9:06                   ` Jan Beulich
2010-08-30  9:15                     ` Keir Fraser
2010-08-30  9:22                       ` Jan Beulich
2010-08-30 16:36               ` Jeremy Fitzhardinge
2010-08-31  6:38                 ` Jan Beulich
2010-09-03 18:46   ` Using handle_fasteoi_irq for pirqs Jeremy Fitzhardinge
2010-09-06  7:58     ` Jan Beulich
2010-09-07  1:53       ` Jeremy Fitzhardinge
2010-09-07  6:58         ` Jan Beulich
2010-09-07  8:02           ` Jeremy Fitzhardinge
2010-09-07  8:58             ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C784DB0.3000103@goop.org \
    --to=jeremy@goop.org \
    --cc=JBeulich@novell.com \
    --cc=Xen-devel@lists.xensource.com \
    --cc=daniel.stodden@citrix.com \
    --cc=tek@acm.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).