From mboxrd@z Thu Jan 1 00:00:00 1970
From: Paolo Bonzini
Subject: Re: [PATCH] KVM: x86: Add lowest-priority support for vt-d posted-interrupts
Date: Tue, 17 Nov 2015 10:41:25 +0100
Message-ID: <564AF645.5010506@redhat.com>
References: <1447037208-75615-1-git-send-email-feng.wu@intel.com> <20151116190314.GA12245@potion.brq.redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
To: Radim Krčmář, Feng Wu
Return-path:
In-Reply-To: <20151116190314.GA12245@potion.brq.redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On 16/11/2015 20:03, Radim Krčmář wrote:
> 2015-11-09 10:46+0800, Feng Wu:
>> Use vector-hashing to handle lowest-priority interrupts for
>> posted-interrupts. As an example, modern Intel CPUs use this
>> method to handle lowest-priority interrupts.
>
> (I don't think it's a good idea that the algorithm differs from non-PI
>  lowest priority delivery.  I'd make them both vector-hashing, which
>  would be "fun" to explain to people expecting round robin ...)

Yup, I would make it a module option.

Thanks very much Radim for helping with the review.

Paolo

>> Signed-off-by: Feng Wu
>> ---
>> diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
>> +/*
>> + * This routine handles lowest-priority interrupts using vector-hashing
>> + * mechanism. As an example, modern Intel CPUs use this method to handle
>> + * lowest-priority interrupts.
>> + *
>> + * Here is the details about the vector-hashing mechanism:
>> + * 1. For lowest-priority interrupts, store all the possible destination
>> + *    vCPUs in an array.
>> + * 2. Use "guest vector % max number of destination vCPUs" to find the right
>> + *    destination vCPU in the array for the lowest-priority interrupt.
>> + */
>
> (Is Skylake i7-6700 a modern Intel CPU?
>  I didn't manage to get hashing ...
>  all interrupts always went to the
>  lowest APIC ID in the set :/
>  Is there a simple way to verify the algorithm?)
>
>> +struct kvm_vcpu *kvm_intr_vector_hashing_dest(struct kvm *kvm,
>> +					      struct kvm_lapic_irq *irq)
>> +
>> +{
>> +	unsigned long dest_vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)];
>> +	unsigned int dest_vcpus = 0;
>> +	struct kvm_vcpu *vcpu;
>> +	unsigned int i, mod, idx = 0;
>> +
>> +	vcpu = kvm_intr_vector_hashing_dest_fast(kvm, irq);
>> +	if (vcpu)
>> +		return vcpu;
>
> I think the rest of this function shouldn't be implemented:
>  - Shorthands are only for IPIs and hence don't need to be handled,
>  - Lowest priority physical broadcast is not supported,
>  - Lowest priority cluster logical broadcast is not supported,
>  - No point in optimizing mixed xAPIC and x2APIC mode,
>  - The rest is handled by kvm_intr_vector_hashing_dest_fast().
>    (Even lowest priority flat logical "broadcast".)
>  - We do the work twice when vcpu == NULL means that there is no
>    matching destination.
>
> Is there a valid case that can be resolved by going through all vcpus?
>
>> +
>> +	memset(dest_vcpu_bitmap, 0, sizeof(dest_vcpu_bitmap));
>> +
>> +	kvm_for_each_vcpu(i, vcpu, kvm) {
>> +		if (!kvm_apic_present(vcpu))
>> +			continue;
>> +
>> +		if (!kvm_apic_match_dest(vcpu, NULL, irq->shorthand,
>> +					irq->dest_id, irq->dest_mode))
>> +			continue;
>> +
>> +		__set_bit(vcpu->vcpu_id, dest_vcpu_bitmap);
>> +		dest_vcpus++;
>> +	}
>> +
>> +	if (dest_vcpus == 0)
>> +		return NULL;
>> +
>> +	mod = irq->vector % dest_vcpus;
>> +
>> +	for (i = 0; i <= mod; i++) {
>> +		idx = find_next_bit(dest_vcpu_bitmap, KVM_MAX_VCPUS, idx) + 1;
>> +		BUG_ON(idx >= KVM_MAX_VCPUS);
>> +	}
>> +
>> +	return kvm_get_vcpu(kvm, idx - 1);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_intr_vector_hashing_dest);
>> +
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> @@ -816,6 +816,63 @@ out:
>> +struct kvm_vcpu *kvm_intr_vector_hashing_dest_fast(struct kvm *kvm,
>> +				struct kvm_lapic_irq *irq)
>
> We now have three very similar functions :(
>
>   kvm_irq_delivery_to_apic_fast
>   kvm_intr_is_single_vcpu_fast
>   kvm_intr_vector_hashing_dest_fast
>
> By utilizing the gcc optimizer, they can be merged without introducing
> many instructions to the hot path, kvm_irq_delivery_to_apic_fast.
> (I would eventually do it, so you can save time by ignoring this.)
>
> Thanks.
>
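
For readers following the thread, the vector-hashing rule from the quoted
patch comment (guest vector modulo the number of candidate destinations) can
be sketched in plain C outside the kernel. This is only an illustration under
stated assumptions, not the patch's code: MAX_DESTS, mark_dest(), is_dest()
and pick_hashed_dest() are hypothetical names, and destinations are modelled
as bit positions in a bitmap rather than as vCPU structures.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical bound on the number of destinations (not a KVM constant). */
#define MAX_DESTS 256u
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define MAP_WORDS ((MAX_DESTS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Mark destination 'bit' as a candidate for the interrupt. */
static void mark_dest(unsigned long *map, unsigned int bit)
{
	map[bit / BITS_PER_WORD] |= 1UL << (bit % BITS_PER_WORD);
}

/* Return nonzero if destination 'bit' is a candidate. */
static int is_dest(const unsigned long *map, unsigned int bit)
{
	return (int)((map[bit / BITS_PER_WORD] >> (bit % BITS_PER_WORD)) & 1UL);
}

/*
 * Pick the (vector % count)-th candidate, counting set bits from bit 0.
 * Returns the chosen bit position, or -1 if there are no candidates.
 */
static int pick_hashed_dest(const unsigned long *map, unsigned int nbits,
			    unsigned int vector)
{
	unsigned int count = 0, target, i;

	/* First pass: count the candidate destinations. */
	for (i = 0; i < nbits; i++)
		if (is_dest(map, i))
			count++;

	if (count == 0)
		return -1;

	target = vector % count;

	/* Second pass: walk the candidates until 'target' have been skipped. */
	for (i = 0; i < nbits; i++) {
		if (!is_dest(map, i))
			continue;
		if (target == 0)
			return (int)i;
		target--;
	}

	return -1; /* not reached when count > 0 */
}
```

With destinations 1, 4 and 6 marked, vector 0x31 (49) gives 49 % 3 = 1, so
the second candidate, bit 4, is selected. The same vector always lands on the
same destination while the candidate set is unchanged, which is the behaviour
that differs from round robin.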