From mboxrd@z Thu Jan  1 00:00:00 1970
From: Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH] KVM: x86: Add lowest-priority support for vt-d
 posted-interrupts
Date: Tue, 17 Nov 2015 10:41:25 +0100
Message-ID: <564AF645.5010506@redhat.com>
References: <1447037208-75615-1-git-send-email-feng.wu@intel.com>
 <20151116190314.GA12245@potion.brq.redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: QUOTED-PRINTABLE
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
To: =?UTF-8?B?UmFkaW0gS3LEjW3DocWZ?= <rkrcmar@redhat.com>,
	Feng Wu <feng.wu@intel.com>
Return-path: <linux-kernel-owner@vger.kernel.org>
In-Reply-To: <20151116190314.GA12245@potion.brq.redhat.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org


On 16/11/2015 20:03, Radim Krčmář wrote:
> 2015-11-09 10:46+0800, Feng Wu:
>> Use vector-hashing to handle lowest-priority interrupts for
>> posted-interrupts. As an example, modern Intel CPUs use this
>> method to handle lowest-priority interrupts.
>
> (I don't think it's a good idea that the algorithm differs from non-PI
>  lowest priority delivery.  I'd make them both vector-hashing, which
>  would be "fun" to explain to people expecting round robin ...)

Yup, I would make it a module option.  Thanks very much Radim for
helping with the review.

Paolo

>> Signed-off-by: Feng Wu <feng.wu@intel.com>
>> ---
>> diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
>> +/*
>> + * This routine handles lowest-priority interrupts using vector-hashing
>> + * mechanism. As an example, modern Intel CPUs use this method to handle
>> + * lowest-priority interrupts.
>> + *
>> + * Here is the details about the vector-hashing mechanism:
>> + * 1. For lowest-priority interrupts, store all the possible destination
>> + *    vCPUs in an array.
>> + * 2. Use "guest vector % max number of destination vCPUs" to find the right
>> + *    destination vCPU in the array for the lowest-priority interrupt.
>> + */
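
The two steps in the comment above can be modelled in user space. This is
only a sketch under assumptions: the destination set is a 64-bit mask where
bit n stands for vCPU n, and vector_hash_dest() and
vector_hash_self_check() are hypothetical names that exist neither in the
patch nor in KVM.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the vector-hashing rule from the patch comment.
 * dest_mask: bit n set means vCPU n is a possible destination (assumption).
 * Returns the chosen vCPU number, or -1 if there is no destination.
 */
int vector_hash_dest(uint64_t dest_mask, unsigned int vector)
{
	unsigned int count = 0, skip, i;

	/* Step 1: count the possible destination vCPUs. */
	for (i = 0; i < 64; i++)
		if (dest_mask & (UINT64_C(1) << i))
			count++;

	if (count == 0)
		return -1;

	/* Step 2: "vector % number of destinations" indexes the candidates. */
	skip = vector % count;
	for (i = 0; i < 64; i++) {
		if (!(dest_mask & (UINT64_C(1) << i)))
			continue;
		if (skip-- == 0)
			return (int)i;
	}
	return -1;
}

/* Small usage example: destinations are vCPUs 1, 4 and 6 (mask 0x52). */
void vector_hash_self_check(void)
{
	assert(vector_hash_dest(0x52, 0x30) == 1);	/* 48 % 3 == 0 */
	assert(vector_hash_dest(0x52, 0x31) == 4);	/* 49 % 3 == 1 */
	assert(vector_hash_dest(0x52, 0x32) == 6);	/* 50 % 3 == 2 */
	assert(vector_hash_dest(0, 0x31) == -1);	/* empty set */
}
```

With destinations {1, 4, 6}, vector 0x31 (49) gives 49 % 3 = 1, so the
second candidate, vCPU 4, is selected. The result depends only on the vector
and the destination set, which is what makes it differ from round robin.
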
>
> (Is Skylake i7-6700 a modern Intel CPU?
>  I didn't manage to get hashing ... all interrupts always went to the
>  lowest APIC ID in the set :/
>  Is there a simple way to verify the algorithm?)
>
>> +struct kvm_vcpu *kvm_intr_vector_hashing_dest(struct kvm *kvm,
>> +					      struct kvm_lapic_irq *irq)
>> +
>> +{
>> +	unsigned long dest_vcpu_bitmap[BITS_TO_LONGS(KVM_MAX_VCPUS)];
>> +	unsigned int dest_vcpus = 0;
>> +	struct kvm_vcpu *vcpu;
>> +	unsigned int i, mod, idx = 0;
>> +
>> +	vcpu = kvm_intr_vector_hashing_dest_fast(kvm, irq);
>> +	if (vcpu)
>> +		return vcpu;
>
> I think the rest of this function shouldn't be implemented:
>  - Shorthands are only for IPIs and hence don't need to be handled,
>  - Lowest priority physical broadcast is not supported,
>  - Lowest priority cluster logical broadcast is not supported,
>  - No point in optimizing mixed xAPIC and x2APIC mode,
>  - The rest is handled by kvm_intr_vector_hashing_dest_fast().
>    (Even lowest priority flat logical "broadcast".)
>  - We do the work twice when vcpu == NULL means that there is no
>    matching destination.
>
> Is there a valid case that can be resolved by going through all vcpus?
>
>> +
>> +	memset(dest_vcpu_bitmap, 0, sizeof(dest_vcpu_bitmap));
>> +
>> +	kvm_for_each_vcpu(i, vcpu, kvm) {
>> +		if (!kvm_apic_present(vcpu))
>> +			continue;
>> +
>> +		if (!kvm_apic_match_dest(vcpu, NULL, irq->shorthand,
>> +					irq->dest_id, irq->dest_mode))
>> +			continue;
>> +
>> +		__set_bit(vcpu->vcpu_id, dest_vcpu_bitmap);
>> +		dest_vcpus++;
>> +	}
>> +
>> +	if (dest_vcpus == 0)
>> +		return NULL;
>> +
>> +	mod = irq->vector % dest_vcpus;
>> +
>> +	for (i = 0; i <= mod; i++) {
>> +		idx = find_next_bit(dest_vcpu_bitmap, KVM_MAX_VCPUS, idx) + 1;
>> +		BUG_ON(idx >= KVM_MAX_VCPUS);
>> +	}
>> +
>> +	return kvm_get_vcpu(kvm, idx - 1);
>> +}
>> +EXPORT_SYMBOL_GPL(kvm_intr_vector_hashing_dest);
>> +
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> @@ -816,6 +816,63 @@ out:
>> +struct kvm_vcpu *kvm_intr_vector_hashing_dest_fast(struct kvm *kvm,
>> +						   struct kvm_lapic_irq *irq)
>
> We now have three very similar functions :(
>
>   kvm_irq_delivery_to_apic_fast
>   kvm_intr_is_single_vcpu_fast
>   kvm_intr_vector_hashing_dest_fast
>
> By utilizing the gcc optimizer, they can be merged without introducing
> many instructions to the hot path, kvm_irq_delivery_to_apic_fast.
> (I would eventually do it, so you can save time by ignoring this.)
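
One way such a merge is often done, shown here purely as an assumption since
the quoted text does not say how, is a single always-inline walker that
takes a compile-time-constant mode, with thin wrappers per use case. After
inlining, gcc can drop the branches a given wrapper never takes. All names
below (dest_walk(), dest_count(), dest_single(), dest_hash(),
dest_walk_self_check()) are hypothetical, and the destination set is again
modelled as a plain bitmask rather than KVM's APIC map.

```c
#include <assert.h>

enum dest_mode { DEST_COUNT, DEST_SINGLE, DEST_HASH };

/*
 * Shared body. "mode" is a constant at every call site, so once this is
 * inlined the compiler can keep only the matching switch arm.
 */
static inline __attribute__((always_inline))
int dest_walk(unsigned int mask, unsigned int vector, enum dest_mode mode)
{
	int n = __builtin_popcount(mask);	/* number of destinations */
	unsigned int k;

	switch (mode) {
	case DEST_COUNT:
		return n;
	case DEST_SINGLE:
		/* Exactly one destination: return it, otherwise fail. */
		return n == 1 ? __builtin_ctz(mask) : -1;
	case DEST_HASH:
		if (n == 0)
			return -1;
		/* Skip (vector % n) candidates by clearing low set bits. */
		for (k = vector % n; k > 0; k--)
			mask &= mask - 1;
		return __builtin_ctz(mask);
	}
	return -1;
}

/* Thin wrappers: each one specializes dest_walk() for a single mode. */
int dest_count(unsigned int mask) { return dest_walk(mask, 0, DEST_COUNT); }
int dest_single(unsigned int mask) { return dest_walk(mask, 0, DEST_SINGLE); }
int dest_hash(unsigned int mask, unsigned int vector)
{
	return dest_walk(mask, vector, DEST_HASH);
}

/* Small usage example with destinations 1, 4 and 6 (mask 0x52). */
void dest_walk_self_check(void)
{
	assert(dest_count(0x52) == 3);
	assert(dest_single(0x52) == -1);
	assert(dest_single(0x10) == 4);
	assert(dest_hash(0x52, 0x31) == 4);	/* 49 % 3 == 1 */
}
```

Whether the hot path really stays unchanged would have to be checked in the
generated code for the actual functions; the sketch only shows the shape of
the approach.
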
>
> Thanks.
>