From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jan Kiszka
Subject: Re: [PATCH 1/3] KVM: x86: Relax accept conditions of kvm_apic_accept_pic_intr
Date: Fri, 17 Oct 2008 19:35:01 +0200
Message-ID: <48F8CCC5.8060502@web.de>
References: <20081015142748.385784583@mchn012c.ww002.siemens.net> <20081015142748.606503565@mchn012c.ww002.siemens.net> <200810171311.11309.sheng@linux.intel.com> <48F8488E.9070700@siemens.com> <20081017163530.GA20831@yukikaze>
In-Reply-To: <20081017163530.GA20831@yukikaze>
To: Sheng Yang
Cc: Jan Kiszka, kvm@vger.kernel.org, avi@redhat.com, jiajun.xu@intel.com

Sheng Yang wrote:
> On Fri, Oct 17, 2008 at 10:10:54AM +0200, Jan Kiszka wrote:
>> Sheng Yang wrote:
>>> On Wednesday 15 October 2008 22:27:49 Jan Kiszka wrote:
>>>> Aligning in-kernel kvm_apic_accept_pic_intr with its user space mate,
>>>> this patch relaxes the conditions under which PIC IRQs are accepted
>>>> by LVT0. This reflects reality and allows to reuse the service for the
>>>> NMI watchdog use case.
>>>>
>>>> Signed-off-by: Jan Kiszka
>>>> ---
>>>>  arch/x86/kvm/lapic.c |   13 ++++---------
>>>>  1 file changed, 4 insertions(+), 9 deletions(-)
>>>>
>>>> Index: b/arch/x86/kvm/lapic.c
>>>> ===================================================================
>>>> --- a/arch/x86/kvm/lapic.c
>>>> +++ b/arch/x86/kvm/lapic.c
>>>> @@ -1072,16 +1072,11 @@ int kvm_apic_has_interrupt(struct kvm_vc
>>>>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
>>>>  {
>>>>  	u32 lvt0 = apic_get_reg(vcpu->arch.apic, APIC_LVT0);
>>>> -	int r = 0;
>>>>
>>>> -	if (vcpu->vcpu_id == 0) {
>>>> -		if (!apic_hw_enabled(vcpu->arch.apic))
>>>> -			r = 1;
>>>> -		if ((lvt0 & APIC_LVT_MASKED) == 0 &&
>>>> -		    GET_APIC_DELIVERY_MODE(lvt0) == APIC_MODE_EXTINT)
>>>> -			r = 1;
>>>> -	}
>>>> -	return r;
>>>> +	if (!apic_hw_enabled(vcpu->arch.apic) ||
>>>> +	    (lvt0 & APIC_LVT_MASKED) == 0)
>>>> +		return 1;
>>>> +	return 0;
>>>>  }
>>>>
>>>>  void kvm_inject_apic_timer_irqs(struct kvm_vcpu *vcpu)
>>>>
>>> (sorry for late review...)
>>>
>>> Thanks to find out the root cause of BSOD!
>>>
>>> But I am a little concern about this change. As you know, PIC only connect to
>>> cpu0. So I think it's not proper to make it generic.
>> I don't think so - and if it were true, qemu would have a bug then, see
>> its corresponding code.
>
> You can refer to Intel MP spec, virtual wire mode. Google
> "MP spec" can find it.

Ah, good reference.

>
> Normally PIC is only used in BSP boot up for SMP guest(PIC can't afford SMP,
> otherwise we won't need IOAPIC/LAPIC). After that, it should be disabled.
> And virtual wire mode works with APIC_MODE_EXTINT on LVT0 of BSP lapic, so
> that's why you see
>
> GET_APIC_DELIVERY_MODE(lvt0) == APIC_MODE_EXTINT
>
> KVM follow virtual wire mode exactly.

According to my understanding of the spec, the virtual wire mode means
that the PIC signal is delivered via LVT0, and thus can be received by
_all_ CPUs in the system. However, usually only the BSP enables LVT0 and
thus receives the IRQ. When Linux switches to NMI watchdog mode 1, it
also unmasks LVT0 on the other CPUs (and reprograms all of them to
deliver NMIs instead of EXTINTs).

Then there is also the "PIC Mode", i.e. direct delivery to the BSP, and
only to the BSP. That mode is obviously what the current
kvm_apic_accept_pic_intr implementation targets. But so far I find no
indication in the spec that both modes cannot exist in the same system.
I also fail to understand how one could switch between the two modes
(via software).

>
> For QEmu, it just check if lapic LVT0 is masked, and don't check vcpu0.
> That's indeed a little problematic, for it's not that sufficient to
> determine if it's programmed as virtual wire mode and used for deliver
> interrupts from PIC. Well, in most condition, it can work. But maybe
> it's not clean in logic.
>
> For NMI watchdog here, we use a little more tricky way other than normal
> PIC/LAPIC interaction. IIRC, NMI watchdog don't mask PIC after enable
> IOAPIC, it also don't mask LVT0 of every LAPIC. It use physical connection
> of PIT to PIC then to LAPIC LVT0 to send NMI. Program LVT0 to NMI, then
> every PIT interrupt would go through PIC, arrive at LVT0, trig a NMI.
>
> So I think the key problem for Windows is, they don't need it, but we send
> the NMIs. We send the NMI when LVT0 is masked. Base on this, I think your
> optimize patch also can resolve this issue? It's already including necessary
> judgment. We will try it next week.

The key problem for Windows was most probably not NMI, but the fact that
we forwarded _any_ PIC IRQ (emulating virtual wire mode) without
checking for the LAPICs' mask state.

OK, this requires a few more thoughts and a bit more reading.
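
For reference, the two acceptance policies at issue can be compared side
by side in a small user-space C sketch. This is not kernel code: struct
toy_vcpu, accept_bsp_only(), accept_relaxed() and policy_self_test() are
hypothetical names invented for illustration. Only the LVT bit layout
(mask bit 16, delivery mode in bits 8-10, ExtINT = 7, NMI = 4) is meant
to mirror the local APIC register format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* LVT register layout of the local APIC (mask bit and delivery mode). */
#define LVT_MASKED      (1u << 16)
#define LVT_MODE_SHIFT  8
#define LVT_MODE_MASK   0x7u
#define MODE_NMI        0x4u
#define MODE_EXTINT     0x7u

/* Hypothetical stand-in for the few vcpu fields the check looks at. */
struct toy_vcpu {
	int      id;               /* 0 is the BSP */
	bool     apic_hw_enabled;  /* local APIC enabled in hardware */
	uint32_t lvt0;             /* raw LVT0 register value */
};

static unsigned int lvt_mode(uint32_t lvt)
{
	return (lvt >> LVT_MODE_SHIFT) & LVT_MODE_MASK;
}

/* Pre-patch policy: BSP only; APIC disabled, or LVT0 unmasked with ExtINT. */
static bool accept_bsp_only(const struct toy_vcpu *v)
{
	if (v->id != 0)
		return false;
	if (!v->apic_hw_enabled)
		return true;
	return !(v->lvt0 & LVT_MASKED) && lvt_mode(v->lvt0) == MODE_EXTINT;
}

/* Patched policy: any vcpu; APIC disabled, or LVT0 unmasked (any mode). */
static bool accept_relaxed(const struct toy_vcpu *v)
{
	return !v->apic_hw_enabled || !(v->lvt0 & LVT_MASKED);
}

/* Walks through the cases discussed in this thread. */
void policy_self_test(void)
{
	/* Secondary CPU, LVT0 unmasked and set to NMI (watchdog case). */
	struct toy_vcpu ap_nmi = { 1, true, MODE_NMI << LVT_MODE_SHIFT };
	/* BSP with LVT0 masked: neither policy should forward the IRQ. */
	struct toy_vcpu bsp_masked = { 0, true,
		LVT_MASKED | (MODE_EXTINT << LVT_MODE_SHIFT) };

	assert(!accept_bsp_only(&ap_nmi) && accept_relaxed(&ap_nmi));
	assert(!accept_bsp_only(&bsp_masked) && !accept_relaxed(&bsp_masked));
}
```

Under these assumptions, a secondary CPU whose LVT0 is unmasked and
programmed for NMI delivery is rejected by the BSP-only policy but
accepted by the relaxed one, while a masked LVT0 on an enabled APIC is
rejected by both.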

Jan