xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Tim Deegan <tim@xen.org>
Cc: George Dunlap <george.dunlap@eu.citrix.com>,
	Xen-devel <xen-devel@lists.xen.org>, KeirFraser <keir@xen.org>,
	Jan Beulich <JBeulich@suse.com>,
	roger.pau@citrix.com
Subject: Re: [Patch] x86/HVM: Fix RTC interrupt modelling
Date: Tue, 11 Feb 2014 14:52:25 +0000	[thread overview]
Message-ID: <52FA3929.70100@citrix.com> (raw)
In-Reply-To: <20140211141034.GC10482@deinos.phlegethon.org>

On 11/02/14 14:10, Tim Deegan wrote:
> At 13:59 +0000 on 11 Feb (1392123546), Andrew Cooper wrote:
>> On 11/02/14 13:15, Tim Deegan wrote:
>>> At 12:50 +0000 on 11 Feb (1392119457), Jan Beulich wrote:
>>>>>>> On 11.02.14 at 13:11, Tim Deegan <tim@xen.org> wrote:
>>>>> At 09:15 +0000 on 11 Feb (1392106520), Jan Beulich wrote:
>>>>>>>>> On 10.02.14 at 18:21, Tim Deegan <tim@xen.org> wrote:
>>>>>>> That is the main change of this cset:  we go back to driving
>>>>>>> the interrupt from the vpt code and fixing up the RTC state after vpt
>>>>>>> tells us it's injected an interrupt.
>>>>>> And that's what is wrong imo, as it doesn't allow driving PF correctly
>>>>>> when !PIE.
>>>>> Oh, I see -- the current code doesn't turn the vpt off when !PIE.  Can
>>>>> you remember why not?  Have I forgotten some wrinkle or race here?
>>>> Because an OS could inspect PF without setting PIE.
>>> Ugh. :( 
>>>
>>>>>>> Yeah, this has nothing to do with the bug being fixed here.  The old
>>>>>>> REG_C read was operating correctly, but on the return-to-guest path:
>>>>>>>  - vpt sees another RTC interrupt is due and calls RTC code
>>>>>>>  - RTC code sees REG_C clear, sets PF|IRQF and asserts the line
>>>>>>>  - vlapic code sees the last interrupt is still in the ISR and does
>>>>>>>    nothing;
>>>>>>>  - we return to the guest having set IRQF but not consumed a timer
>>>>>>>    event, so vpt stste is the same
>>>>>>>  - the guest sees the old REG_C, with PF|IRQF set, and re-reads, 
>>>>>>>    waiting for a read of 0.
>>>>>>>  - repeat forever.
>>>>>> Which would call for a flag suppressing the setting of PF|IRQF
>>>>>> until the timer event got consumed. Possibly with some safety
>>>>>> belt for this to not get deferred indefinitely (albeit if the interrupt
>>>>>> doesn't get injected for extended periods of time, the guest
>>>>>> would presumably have more severe problems than these flags
>>>>>> not getting updated as expected).
>>>>> That's pretty much what we're doing here -- the pt_intr_post callback
>>>>> sets PF|IRQF when the interrupt is injected.
>>>> Right, except you do this be reverting other stuff rather than
>>>> adding the missing functionality on top.
>>> Absolutely -- because once we went back to having PF set only when the
>>> interrupt was injected, it seemed better to reduce the amount of
>>> special-case plumbing for RTC than to add yet more.
>>>
>>> But for the case of an OS polling for PF with PIE clear, I guess we
>>> might need to keep all the current special cases.  Was that a known
>>> observed bug or a theoretical one?  I can't see a way of handling
>>> both that case and the w2k3 case.
>>>
>>> Either we always set PF when the tick happens, even if the interrupt
>>> is masked (which breaks w2k3) or we don't set it until we can deliver
>>> the interrupt (which breaks pollers).
>> This doesn't break w2k3.  Setting PF when a tick happens (or should
>> happen for !PIE) is the correct thing to do.
>>
>> The bug is that we see an interrupt pending and set PF when we
>> shouldn't
> We _are_ setting PF when the tick happens; it's just that because of
> no-missed-ticks mode the tick happens before w2k3 has finished
> handling the last one.  At that point, anything we do breaks w2k3 in
> some way -- either we leave the tick pending until the interrupt is
> actually delivered (which leads to the hang) or we consume the tick
> even though the interrupt will be lost (which causes clock drift).
>
> Tim.

No - we are setting PF on every vmentry, not every tick.

* pt_update_irq() finds the timer pending and decides to inject an
interrupt.  This sets REG_C.PF
* {svm,vmx}_intr_assist() bails early because it can't actually inject
the interrupt.
* pt_intr_post() doesn't run, which doesn't update the PT state, yet
because pt_update_irq() thought it was injecting an interrupt, it didn't
run its faked-up pt_intr_post()
* Next VMentry finds an erroneously pending tick and decides to inject
an interrupt.

If w2k3 were to repeatedly read REG_C alone without writing to the index
register in-between, it would observe PF being alternately set and clear.


w2k3 would still work perfectly fine if we only set PF when actually
injecting the interrupt, which is why the patch at the root of this
thread fixes the observed hang.  However, Jans comment about the
behaviour of the PF bit when !PIE is quite correct.

Therefore, for !PIE, PF must be updated ahead of time, but for PIE, it
must be updated when the interrupt is actually injected.

~Andrew

  reply	other threads:[~2014-02-11 14:52 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-10 11:17 [Patch] x86/HVM: Fix RTC interrupt modelling Andrew Cooper
2014-02-10 12:19 ` Tim Deegan
2014-02-10 15:17 ` Roger Pau Monné
2014-02-10 15:33 ` Keir Fraser
2014-02-10 16:34 ` Jan Beulich
2014-02-10 17:13   ` Andrew Cooper
2014-02-11  9:28     ` Jan Beulich
2014-02-10 17:21   ` Tim Deegan
2014-02-11  9:15     ` Jan Beulich
2014-02-11 12:11       ` Tim Deegan
2014-02-11 12:50         ` Jan Beulich
2014-02-11 13:15           ` Tim Deegan
2014-02-11 13:31             ` Jan Beulich
2014-02-11 13:57               ` Tim Deegan
2014-02-11 14:35                 ` Jan Beulich
2014-02-11 13:59             ` Andrew Cooper
2014-02-11 14:10               ` Tim Deegan
2014-02-11 14:52                 ` Andrew Cooper [this message]
2014-02-10 17:46   ` George Dunlap
2014-02-10 18:43     ` Andrew Cooper
2014-02-11  8:47       ` Tim Deegan
2014-02-11  9:02       ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=52FA3929.70100@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=george.dunlap@eu.citrix.com \
    --cc=keir@xen.org \
    --cc=roger.pau@citrix.com \
    --cc=tim@xen.org \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).