qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Jan Kiszka <jan.kiszka@siemens.com>
To: Matthew Ogilvie <mmogilvi_qemu@miniinfo.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	Avi Kivity <avi@redhat.com>,
	"Maciej W. Rozycki" <macro@linux-mips.org>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] [PATCH 1/2] KVM: fix i8259 interrupt high to low transition logic
Date: Thu, 13 Sep 2012 15:55:51 +0200	[thread overview]
Message-ID: <5051E5E7.8080705@siemens.com> (raw)
In-Reply-To: <20120913054950.GA4717@comcast.net>

On 2012-09-13 07:49, Matthew Ogilvie wrote:
> On Wed, Sep 12, 2012 at 10:57:57AM +0200, Jan Kiszka wrote:
>> On 2012-09-12 10:51, Avi Kivity wrote:
>>> On 09/12/2012 11:48 AM, Jan Kiszka wrote:
>>>> On 2012-09-12 10:01, Avi Kivity wrote:
>>>>> On 09/10/2012 04:29 AM, Matthew Ogilvie wrote:
>>>>>> Intel's definition of "edge triggered" means: "asserted with a
>>>>>> low-to-high transition at the time an interrupt is registered
>>>>>> and then kept high until the interrupt is served via one of the
>>>>>> EOI mechanisms or goes away unhandled."
>>>>>>
>>>>>> So the only difference between edge triggered and level triggered
>>>>>> is in the leading edge, with no difference in the trailing edge.
>>>>>>
>>>>>> This bug manifested itself when the guest was Microport UNIX
>>>>>> System V/386 v2.1 (ca. 1987), because it would sometimes mask
>>>>>> off IRQ14 in the slave IMR after it had already been asserted.
>>>>>> The master would still try to deliver an interrupt even though
>>>>>> IRQ2 had dropped again, resulting in a spurious interupt
>>>>>> (IRQ15) and a panicked UNIX kernel.
>>>>>> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
>>>>>> index adba28f..5cbba99 100644
>>>>>> --- a/arch/x86/kvm/i8254.c
>>>>>> +++ b/arch/x86/kvm/i8254.c
>>>>>> @@ -302,8 +302,12 @@ static void pit_do_work(struct kthread_work *work)
>>>>>>  	}
>>>>>>  	spin_unlock(&ps->inject_lock);
>>>>>>  	if (inject) {
>>>>>> -		kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
>>>>>> +		/* Clear previous interrupt, then create a rising
>>>>>> +		 * edge to request another interupt, and leave it at
>>>>>> +		 * level=1 until time to inject another one.
>>>>>> +		 */
>>>>>>  		kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 0);
>>>>>> +		kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
>>>>>>  
>>>>>>  		/*
>>>>>
>>>>> I thought I understood this, now I'm not sure.  How can this be correct?
>>>>>  Real hardware doesn't act like this.
>>>>>
>>>>> What if the PIT is disabled after this?  You're injecting a spurious
>>>>> interrupt then.
>>>>
>>>> Yes, the PIT has to raise the output as long as specified, i.e.
>>>> according to the datasheet. That's important now due to the corrections
>>>> to the PIC. We can then carefully check if there is room for
>>>> simplifications / optimizations. I also cannot imagine that the above
>>>> already fulfills these requirements.
>>>>
>>>> And if the PIT is disabled by the HPET, we need to clear the output
>>>> explicitly as we inject the IRQ#0 under a different source ID than
>>>> userspace HPET does (which will logically take over IRQ#0 control). The
>>>> kernel would otherwise OR both sources to an incorrect result.
>>>>
>>>
>>> I guess we need to double the hrtimer rate then in order to generate a
>>> square wave.  It's getting ridiculous how accurate our model needs to be.
>>
>> I would suggest to solve this for the userspace model first, ensure that
>> it works properly in all modes, maybe optimize it, and then decide how
>> to map all this on kernel space. As long as we have two models, we can
>> also make use of them.
> 
> Thoughts about the 8254 PIT:
> 
> First, this summary of (real) 8254 PIT behavior seems fairly
> good, as far it goes:
> 
> On Tue, Sep 04, 2012 at 07:27:38PM +0100, Maciej W. Rozycki wrote:
>>  * The 8254 PIT is normally configured in mode 2 or 3 in the PC/AT
>>    architecture.  In the former its output is high (active) all the time
>>    except from one (last) clock cycle.  In the latter a wave that has a
>>    duty cycle close or equal to 0.5 (depending on whether the divider is
>>    odd or even) is produced, so no short pulses either.  I don't remember
>>    the other four modes -- have a look at the datasheet if interested, but
>>    I reckon they're not really compatible with the wiring anyway, e.g. the
>>    gate is hardwired enabled.
> 
> I've also just skimmed parts of the 8254 section of "The Indispensable PC
> Hardware Book", by Hans-Peter Messmer, Copyright 1994 Addison-Wesley,
> although I probably ought to read it more carefully.

http://download.intel.com/design/archives/periphrl/docs/23124406.pdf
should be the primary reference - as long as it leaves no open questions.

> 
> Under "normal" conditions, the 8254 part of the patch above should be
> indistinguishable from previous behavior.  The 8259's IRR will
> still show up as 1 until the interrupt is actually serviced,
> and no new interrupt will be serviced after one is serviced until
> another edge is injected via the high-low-high transition of the new
> code.  (Unless the guest resets the 8259 or maybe messes with IMR,
> but real hardware would generate extra interrupts in such cases as
> well.)
> 
> The new code sounds much closer to mode 2 described by
> Maciej, compared to the old code - except the duty cycle is
> effectively 100 percent instead of 99.[some number of 9's] percent.
> 
> -----------------
> But there might be some concerns in abnormal conditions:
> 
>    * If some guest is actually depending on a 50 percent duty cycle
>      (maybe some kind of polling rather than interrupts), I would
>      expect it to be just as broken before this patch as after,
>      unless it is really weird (handles continuous 1's more
>      gracefully than continuous 0's).
> 
>         According to the my book, mode 3 isn't normally
>      used to create interrupts: "The generated square-wave signal
>      can be used, for example, to trasmit data via serial interfaces.
>      The PIT then operates as a baud rate generator.  In the PC,
>      counters 1 and 2 are operated in mode 3 to drive memory refresh
>      and the speaker, respectively." (page 369)
> 
>         I wouldn't be inclined to worry about the 50 percent duty
>      cycle too much unless someone comes up with a real guest OS
>      that depends on it.
> 
>    * To be correct there are probably some cases when the 8254 should
>      force the IRQ0 line when the guest is setting up the 8254.  Based on
>      a very quick reading of some of the 8254 section of my book:
> 
>       * Note that modes 0, 2, 3 and 4 look usable with a hard-wired GATE=1,
>         but not modes 1 or 5.
>       * Maybe force IRQ0=0 when the 8254 is disabled (however that is done;
>         I haven't found it yet)?
>       * Force IRQ0=0 when starting the timer in "mode 0" (one-shot).
>       * Modes 2, 3, and 4 all apparently start with IRQ0=1.  I guess
>         they generate an interrupt when first enabled?
>          * Mode 2 (periodic) has a 99 percent duty cycle (high).
>          * Mode 3 (periodic) a 50 percent duty cycle (see above).
>          * Mode 4 (one-shot) is distinguished from mode 0 in that
>            it generates both a high-low and low-high transition when
>            it expires, instead of just a low-high.
> 
>    * I don't know anything about the HPET.  I didn't even realize
>      it re-uses IRQ0.

In modern chipsets, the HPET can drive IRQ0 instead of the PIT. A muxer
selects the input for that purpose. And we disable the timer of the PIT
in that case to avoid needless background activity. See e.g.
pit_irq_control().

> 
>    * If you back-migrate from after this change to before this change,
>      maybe there is a risk that it will lose one timer interrupt?
>      Forward migration shouldn't have an issue (the first trailing
>      edge in the new code becomes a nop).  Perhaps hack it to encourage
>      extra interrupts in some migration cases rather than lost interrupts
>      in others?  Some kind of subsection thing like was being
>      discussed for the 8259 earlier?  Or:
> 
> Also, how big of a concern is a very rare gained or lost IRQ0
> actually?  Under normal conditions, I would expect this to at most
> cause a one time clock drift in the guest OS of a fraction of
> a second.  If that only happens when rebooting or migrating the
> guest...
> 
> On the other hand, lost or gained interrupts might be a more
> serious concern (especially if lost) if the 8254 is operating
> in a one-shot mode (0 or 4): Something in the guest doesn't
> stop (hangs) if not canceled by the interrupt.
> 
> 
> Can anyone confirm or contradict any of this?  Other thoughts?

See my other reply: Until we really understand the impact of some
optimization / simplification, we are forced to do it accurately.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

  parent reply	other threads:[~2012-09-13 13:56 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-09-10  1:29 [Qemu-devel] [PATCH 1/2] KVM: fix i8259 interrupt high to low transition logic Matthew Ogilvie
2012-09-10  1:29 ` [Qemu-devel] [PATCH 2/2] KVM: i8259: refactor pic_set_irq level logic Matthew Ogilvie
2012-09-11  0:49 ` [Qemu-devel] [PATCH 1/2] KVM: fix i8259 interrupt high to low transition logic Maciej W. Rozycki
2012-09-11  4:54   ` Matthew Ogilvie
2012-09-11 11:53     ` Maciej W. Rozycki
2012-09-11  9:04   ` Jan Kiszka
2012-09-12  8:01 ` Avi Kivity
2012-09-12  8:48   ` Jan Kiszka
2012-09-12  8:51     ` Avi Kivity
2012-09-12  8:57       ` Jan Kiszka
2012-09-12  9:02         ` Avi Kivity
2012-09-13  5:49         ` Matthew Ogilvie
2012-09-13 13:41           ` Maciej W. Rozycki
2012-09-13 13:49             ` Jan Kiszka
2012-09-13 13:55           ` Jan Kiszka [this message]
2012-09-13 15:48             ` Maciej W. Rozycki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5051E5E7.8080705@siemens.com \
    --to=jan.kiszka@siemens.com \
    --cc=avi@redhat.com \
    --cc=kvm@vger.kernel.org \
    --cc=macro@linux-mips.org \
    --cc=mmogilvi_qemu@miniinfo.net \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).