From mboxrd@z Thu Jan 1 00:00:00 1970 From: Paolo Bonzini Subject: Re: [PATCH 14/21] KVM: x86: Software disabled APIC should still deliver NMIs Date: Wed, 26 Nov 2014 19:00:10 +0100 Message-ID: <5476152A.7030205@redhat.com> References: <1414922101-17626-1-git-send-email-namit@cs.technion.ac.il> <1414922101-17626-15-git-send-email-namit@cs.technion.ac.il> <545A1852.1000603@redhat.com> <5D182FD6-0365-4471-AD28-C587A624D151@gmail.com> <545B409E.1020500@redhat.com> <20141106164534.GA15407@potion.brq.redhat.com> <546618F2.2060108@redhat.com> <02C5C22B-2809-438B-BD42-8C235F1E5CEE@gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: QUOTED-PRINTABLE Cc: kvm list To: Nadav Amit , =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= Return-path: Received: from mx1.redhat.com ([209.132.183.28]:60599 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752568AbaKZSAO (ORCPT ); Wed, 26 Nov 2014 13:00:14 -0500 In-Reply-To: <02C5C22B-2809-438B-BD42-8C235F1E5CEE@gmail.com> Sender: kvm-owner@vger.kernel.org List-ID: On 26/11/2014 18:01, Nadav Amit wrote: > Paolo Bonzini wrote: >=20 >> >> >> On 06/11/2014 17:45, Radim Kr=C4=8Dm=C3=A1=C5=99 wrote: >>> 2014-11-06 10:34+0100, Paolo Bonzini: >>>> On 05/11/2014 21:45, Nadav Amit wrote: >>>>> If I understand the SDM correctly, in such scenario (all APICs ar= e >>>>> software disabled) the mode is left as the default - flat mode (s= ee >>> >>> APIC doesn't have any global mode (it is just KVM's simplification)= , so >>> when a message lands on the system bus, it just compares MDA with L= DR >>> and DFR ... >>> >>>>> section 10.6.2.2 "Logical Destination Mode=E2=80=9D): "All proces= sors that >>>>> have their APIC software enabled (using the spurious vector >>>>> enable/disable bit) must have their DFRs (Destination Format >>>>> Registers) programmed identically. The default mode for DFR is fl= at >>>>> mode.=E2=80=9D >>> >>> I think the "default mode" points to reset state, which is flat DFR= ; >>> and it is mentioned only because of the following sentence >>> If you are using cluster mode, DFRs must be programmed before the = APIC >>> is software enabled. >>> >>>> That's not what either Bochs or QEMU do, though. (Though in the c= ase of >>>> Bochs I cannot find the place where reception of IPIs is prevented= for >>>> software-disabled APICs, so I'm not sure how much to trust it in t= his case). >>>> >>>> I'm not sure why software-disabled APICs could have different DFRs= , if >>>> the APICs can receive NMI IPIs. I'll ask Intel. >>> >>> When changing the mode, we can't switch DFR synchronously, so it ha= s to >>> happen and NMI may be needed (watchdog?) before APIC configuration. >>> Explicit statement might have been a hint to be _extra_ careful whe= n >>> using logical destination for INIT, NMI, ... I wonder what they'll = say. >>> >>> Anyway, Paolo's patch seems to be in the right direction, I'd modif= y it >>> a bit though: >>> >>> LDR=3D0 isn't addressable in any logical mode, so we can ignore API= Cs that >>> don't set it and decide the mode by the last nonzero one. >>> This works in a situation, where one part is configured for cluster= and >>> the rest is still in reset state. >>> >>> (It gets harder if we allow nonzero LDRs with different DFR ... >>> we'd need to split our logical map to handle it.) >>> >>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c >>> index 758f838..6da303e1 100644 >>> --- a/arch/x86/kvm/lapic.c >>> +++ b/arch/x86/kvm/lapic.c >>> @@ -148,10 +148,6 @@ static void recalculate_apic_map(struct kvm *k= vm) >>> goto out; >>> >>> new->ldr_bits =3D 8; >>> - /* flat mode is default */ >>> - new->cid_shift =3D 8; >>> - new->cid_mask =3D 0; >>> - new->lid_mask =3D 0xff; >>> new->broadcast =3D APIC_BROADCAST; >>> >>> kvm_for_each_vcpu(i, vcpu, kvm) { >>> @@ -166,7 +162,7 @@ static void recalculate_apic_map(struct kvm *kv= m) >>> new->cid_mask =3D (1 << KVM_X2APIC_CID_BITS) - 1; >>> new->lid_mask =3D 0xffff; >>> new->broadcast =3D X2APIC_BROADCAST; >>> - } else if (kvm_apic_hw_enabled(apic)) { >>> + } else if (kvm_apic_get_reg(apic, APIC_LDR)) { >>> if (kvm_apic_get_reg(apic, APIC_DFR) =3D=3D >>> APIC_DFR_CLUSTER) { >>> new->cid_shift =3D 4; >> >> I merged this patch and Nadav=E2=80=99s. >=20 > Sorry for the late and long reply, but I got an issue with the new ve= rsion > (and my previous version as well). Indeed, the SDM states that DFR sh= ould > be the same for enabled CPUs, and that the BIOS should get all CPUs i= n > either xAPIC or x2APIC. Yet, there is nothing that says all CPUs need= to be > in xAPIC/x2APIC mode. >=20 > In my tests (which pass on bare-metal), I got a scenario in which som= e CPUs > are in xAPIC mode, the BSP changed (which is currently not handled co= rrectly > by KVM) and the BSP has x2APIC enabled. All the core APICs are > software-enabled. The expected behaviour is that the CPUs with x2APIC > enabled would work in x2APIC mode. >=20 > I think such a transitory scenario is possible on real-systems as wel= l, > perhaps during CPU hot-plug. It appears the previous version (before = all of > our changes) handled it better. I presume the most efficient way is t= o start > determining the APIC logical mode from the BSP, and if it is disabled= , > traverse the rest of the CPUs until finding the first one with APIC e= nabled. > Yet, I have not finished doing and checking the BSP fix and other dep= endent > INIT signal handling fixes. >=20 > In the meanwhile, would you be ok with restoring some of the previous > behaviour - i.e., x2APIC is enabled if any CPU turned it on (regardle= ss to > whether APIC is software enabled), otherwise use the configuration of= the > last enabled APIC? >=20 > -- >8 =E2=80=94 > Subject: [PATCH] KVM: x86: Traverse all CPUs during recalculate_apic_= map >=20 > --- > arch/x86/kvm/lapic.c | 20 +++++++++++--------- > 1 file changed, 11 insertions(+), 9 deletions(-) >=20 > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c > index 9c90d31..6dc2be6 100644 > --- a/arch/x86/kvm/lapic.c > +++ b/arch/x86/kvm/lapic.c > @@ -139,6 +139,7 @@ static void recalculate_apic_map(struct kvm *kvm) > struct kvm_apic_map *new, *old =3D NULL; > struct kvm_vcpu *vcpu; > int i; > + bool any_enabled =3D false; > =20 > new =3D kzalloc(sizeof(struct kvm_apic_map), GFP_KERNEL); > =20 > @@ -160,13 +161,21 @@ static void recalculate_apic_map(struct kvm *kv= m) > if (!kvm_apic_present(vcpu)) > continue; > =20 > + /* > + * All APICs DFRs have to be configured the same mode by an OS. > + * We take advatage of this while building logical id lookup > + * table. After reset APICs are in software disabled mode, so if > + * we find apic with different setting we assume this is the mode > + * OS wants all apics to be in; build lookup table accordingly. > + */ > if (apic_x2apic_mode(apic)) { > new->ldr_bits =3D 32; > new->cid_shift =3D 16; > new->cid_mask =3D (1 << KVM_X2APIC_CID_BITS) - 1; > new->lid_mask =3D 0xffff; > new->broadcast =3D X2APIC_BROADCAST; > - } else if (kvm_apic_get_reg(apic, APIC_LDR)) { > + break; > + } else if (!any_enabled && kvm_apic_get_reg(apic, APIC_LDR)) { > if (kvm_apic_get_reg(apic, APIC_DFR) =3D=3D > APIC_DFR_CLUSTER) { > new->cid_shift =3D 4; > @@ -179,15 +188,8 @@ static void recalculate_apic_map(struct kvm *kvm= ) > } > } > =20 > - /* > - * All APICs have to be configured in the same mode by an OS. > - * We take advatage of this while building logical id loockup > - * table. After reset APICs are in software disabled mode, so if > - * we find apic with different setting we assume this is the mode > - * OS wants all apics to be in; build lookup table accordingly. > - */ > if (kvm_apic_sw_enabled(apic)) > - break; > + any_enabled =3D true; > } > =20 > kvm_for_each_vcpu(i, vcpu, kvm) { >=20 Yes, this looks good. Paolo