From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jan Kiszka <jan.kiszka@siemens.com>
Subject: Re: [Qemu-devel] [PATCH 1/2] KVM: fix i8259 interrupt high to low
 transition logic
Date: Wed, 12 Sep 2012 10:57:57 +0200
Message-ID: <50504E95.9040709@siemens.com>
References: <1347240563-6212-1-git-send-email-mmogilvi_qemu@miniinfo.net> <50504151.40704@redhat.com> <50504C69.60703@siemens.com> <50504D2E.2080802@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Cc: Matthew Ogilvie <mmogilvi_qemu@miniinfo.net>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Maciej W. Rozycki" <macro@linux-mips.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>
To: Avi Kivity <avi@redhat.com>
Return-path: <kvm-owner@vger.kernel.org>
Received: from goliath.siemens.de ([192.35.17.28]:24979 "EHLO
	goliath.siemens.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751572Ab2ILI6I (ORCPT <rfc822;kvm@vger.kernel.org>);
	Wed, 12 Sep 2012 04:58:08 -0400
In-Reply-To: <50504D2E.2080802@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID: <kvm.vger.kernel.org>

On 2012-09-12 10:51, Avi Kivity wrote:
> On 09/12/2012 11:48 AM, Jan Kiszka wrote:
>> On 2012-09-12 10:01, Avi Kivity wrote:
>>> On 09/10/2012 04:29 AM, Matthew Ogilvie wrote:
>>>> Intel's definition of "edge triggered" means: "asserted with a
>>>> low-to-high transition at the time an interrupt is registered
>>>> and then kept high until the interrupt is served via one of the
>>>> EOI mechanisms or goes away unhandled."
>>>>
>>>> So the only difference between edge triggered and level triggered
>>>> is in the leading edge, with no difference in the trailing edge.
>>>>
>>>> This bug manifested itself when the guest was Microport UNIX
>>>> System V/386 v2.1 (ca. 1987), because it would sometimes mask
>>>> off IRQ14 in the slave IMR after it had already been asserted.
>>>> The master would still try to deliver an interrupt even though
>>>> IRQ2 had dropped again, resulting in a spurious interupt
>>>> (IRQ15) and a panicked UNIX kernel.
>>>> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
>>>> index adba28f..5cbba99 100644
>>>> --- a/arch/x86/kvm/i8254.c
>>>> +++ b/arch/x86/kvm/i8254.c
>>>> @@ -302,8 +302,12 @@ static void pit_do_work(struct kthread_work *work)
>>>>  	}
>>>>  	spin_unlock(&ps->inject_lock);
>>>>  	if (inject) {
>>>> -		kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
>>>> +		/* Clear previous interrupt, then create a rising
>>>> +		 * edge to request another interupt, and leave it at
>>>> +		 * level=1 until time to inject another one.
>>>> +		 */
>>>>  		kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 0);
>>>> +		kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
>>>>  
>>>>  		/*
>>>
>>> I thought I understood this, now I'm not sure.  How can this be correct?
>>>  Real hardware doesn't act like this.
>>>
>>> What if the PIT is disabled after this?  You're injecting a spurious
>>> interrupt then.
>>
>> Yes, the PIT has to raise the output as long as specified, i.e.
>> according to the datasheet. That's important now due to the corrections
>> to the PIC. We can then carefully check if there is room for
>> simplifications / optimizations. I also cannot imagine that the above
>> already fulfills these requirements.
>>
>> And if the PIT is disabled by the HPET, we need to clear the output
>> explicitly as we inject the IRQ#0 under a different source ID than
>> userspace HPET does (which will logically take over IRQ#0 control). The
>> kernel would otherwise OR both sources to an incorrect result.
>>
> 
> I guess we need to double the hrtimer rate then in order to generate a
> square wave.  It's getting ridiculous how accurate our model needs to be.

I would suggest to solve this for the userspace model first, ensure that
it works properly in all modes, maybe optimize it, and then decide how
to map all this on kernel space. As long as we have two models, we can
also make use of them.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux

From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([208.118.235.92]:51456)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jan.kiszka@siemens.com>) id 1TBilv-0005W0-C8
	for qemu-devel@nongnu.org; Wed, 12 Sep 2012 04:58:12 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <jan.kiszka@siemens.com>) id 1TBilo-0000r7-6n
	for qemu-devel@nongnu.org; Wed, 12 Sep 2012 04:58:11 -0400
Received: from goliath.siemens.de ([192.35.17.28]:19190)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <jan.kiszka@siemens.com>) id 1TBiln-0000qx-T7
	for qemu-devel@nongnu.org; Wed, 12 Sep 2012 04:58:04 -0400
Message-ID: <50504E95.9040709@siemens.com>
Date: Wed, 12 Sep 2012 10:57:57 +0200
From: Jan Kiszka <jan.kiszka@siemens.com>
MIME-Version: 1.0
References: <1347240563-6212-1-git-send-email-mmogilvi_qemu@miniinfo.net>
	<50504151.40704@redhat.com> <50504C69.60703@siemens.com>
	<50504D2E.2080802@redhat.com>
In-Reply-To: <50504D2E.2080802@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH 1/2] KVM: fix i8259 interrupt high to low
 transition logic
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Avi Kivity <avi@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>, "kvm@vger.kernel.org" <kvm@vger.kernel.org>, Matthew Ogilvie <mmogilvi_qemu@miniinfo.net>, "Maciej W. Rozycki" <macro@linux-mips.org>, "qemu-devel@nongnu.org" <qemu-devel@nongnu.org>

On 2012-09-12 10:51, Avi Kivity wrote:
> On 09/12/2012 11:48 AM, Jan Kiszka wrote:
>> On 2012-09-12 10:01, Avi Kivity wrote:
>>> On 09/10/2012 04:29 AM, Matthew Ogilvie wrote:
>>>> Intel's definition of "edge triggered" means: "asserted with a
>>>> low-to-high transition at the time an interrupt is registered
>>>> and then kept high until the interrupt is served via one of the
>>>> EOI mechanisms or goes away unhandled."
>>>>
>>>> So the only difference between edge triggered and level triggered
>>>> is in the leading edge, with no difference in the trailing edge.
>>>>
>>>> This bug manifested itself when the guest was Microport UNIX
>>>> System V/386 v2.1 (ca. 1987), because it would sometimes mask
>>>> off IRQ14 in the slave IMR after it had already been asserted.
>>>> The master would still try to deliver an interrupt even though
>>>> IRQ2 had dropped again, resulting in a spurious interupt
>>>> (IRQ15) and a panicked UNIX kernel.
>>>> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
>>>> index adba28f..5cbba99 100644
>>>> --- a/arch/x86/kvm/i8254.c
>>>> +++ b/arch/x86/kvm/i8254.c
>>>> @@ -302,8 +302,12 @@ static void pit_do_work(struct kthread_work *work)
>>>>  	}
>>>>  	spin_unlock(&ps->inject_lock);
>>>>  	if (inject) {
>>>> -		kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
>>>> +		/* Clear previous interrupt, then create a rising
>>>> +		 * edge to request another interupt, and leave it at
>>>> +		 * level=1 until time to inject another one.
>>>> +		 */
>>>>  		kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 0);
>>>> +		kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
>>>>  
>>>>  		/*
>>>
>>> I thought I understood this, now I'm not sure.  How can this be correct?
>>>  Real hardware doesn't act like this.
>>>
>>> What if the PIT is disabled after this?  You're injecting a spurious
>>> interrupt then.
>>
>> Yes, the PIT has to raise the output as long as specified, i.e.
>> according to the datasheet. That's important now due to the corrections
>> to the PIC. We can then carefully check if there is room for
>> simplifications / optimizations. I also cannot imagine that the above
>> already fulfills these requirements.
>>
>> And if the PIT is disabled by the HPET, we need to clear the output
>> explicitly as we inject the IRQ#0 under a different source ID than
>> userspace HPET does (which will logically take over IRQ#0 control). The
>> kernel would otherwise OR both sources to an incorrect result.
>>
> 
> I guess we need to double the hrtimer rate then in order to generate a
> square wave.  It's getting ridiculous how accurate our model needs to be.

I would suggest to solve this for the userspace model first, ensure that
it works properly in all modes, maybe optimize it, and then decide how
to map all this on kernel space. As long as we have two models, we can
also make use of them.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SDP-DE
Corporate Competence Center Embedded Linux