From: Yang Zhang <yang.zhang.wz@gmail.com>
To: "Wu, Feng" <feng.wu@intel.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"rkrcmar@redhat.com" <rkrcmar@redhat.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 2/2] KVM: x86: Add lowest-priority support for vt-d posted-interrupts
Date: Tue, 22 Dec 2015 14:42:06 +0800
Message-ID: <5678F0BE.2020409@gmail.com>
In-Reply-To: <E959C4978C3B6342920538CF579893F00AF06251@SHSMSX104.ccr.corp.intel.com>

On 2015/12/22 12:36, Wu, Feng wrote:
>
>
>> -----Original Message-----
>> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com]
>> Sent: Monday, December 21, 2015 10:01 AM
>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com;
>> rkrcmar@redhat.com
>> Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH v2 2/2] KVM: x86: Add lowest-priority support for vt-d
>> posted-interrupts
>>
>> On 2015/12/21 9:55, Wu, Feng wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
>>>> owner@vger.kernel.org] On Behalf Of Yang Zhang
>>>> Sent: Monday, December 21, 2015 9:50 AM
>>>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com;
>>>> rkrcmar@redhat.com
>>>> Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org
>>>> Subject: Re: [PATCH v2 2/2] KVM: x86: Add lowest-priority support for vt-d
>>>> posted-interrupts
>>>>
>>>> On 2015/12/16 9:37, Feng Wu wrote:
>>>>> Use vector-hashing to deliver lowest-priority interrupts for
>>>>> VT-d posted-interrupts.
>>>>>
>>>>> Signed-off-by: Feng Wu <feng.wu@intel.com>
>>>>> ---
>>>>>     arch/x86/kvm/lapic.c | 67
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>     arch/x86/kvm/lapic.h |  2 ++
>>>>>     arch/x86/kvm/vmx.c   | 12 ++++++++--
>>>>>     3 files changed, 79 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>>> index e29001f..d4f2c8f 100644
>>>>> --- a/arch/x86/kvm/lapic.c
>>>>> +++ b/arch/x86/kvm/lapic.c
>>>>> @@ -854,6 +854,73 @@ out:
>>>>>     }
>>>>>
>>>>>     /*
>>>>> + * This routine handles lowest-priority interrupts using vector-hashing
>>>>> + * mechanism. As an example, modern Intel CPUs use this method to
>> handle
>>>>> + * lowest-priority interrupts.
>>>>> + *
>>>>> + * Here is the details about the vector-hashing mechanism:
>>>>> + * 1. For lowest-priority interrupts, store all the possible destination
>>>>> + *    vCPUs in an array.
>>>>> + * 2. Use "guest vector % max number of destination vCPUs" to find the
>> right
>>>>> + *    destination vCPU in the array for the lowest-priority interrupt.
>>>>> + */
>>>>> +struct kvm_vcpu *kvm_intr_vector_hashing_dest(struct kvm *kvm,
>>>>> +					      struct kvm_lapic_irq *irq)
>>>>> +{
>>>>> +	struct kvm_apic_map *map;
>>>>> +	struct kvm_vcpu *vcpu = NULL;
>>>>> +
>>>>> +	if (irq->shorthand)
>>>>> +		return NULL;
>>>>> +
>>>>> +	rcu_read_lock();
>>>>> +	map = rcu_dereference(kvm->arch.apic_map);
>>>>> +
>>>>> +	if (!map)
>>>>> +		goto out;
>>>>> +
>>>>> +	if ((irq->dest_mode != APIC_DEST_PHYSICAL) &&
>>>>> +			kvm_lowest_prio_delivery(irq)) {
>>>>> +		u16 cid;
>>>>> +		int i, idx = 0;
>>>>> +		unsigned long bitmap = 1;
>>>>> +		unsigned int dest_vcpus = 0;
>>>>> +		struct kvm_lapic **dst = NULL;
>>>>> +
>>>>> +
>>>>> +		if (!kvm_apic_logical_map_valid(map))
>>>>> +			goto out;
>>>>> +
>>>>> +		apic_logical_id(map, irq->dest_id, &cid, (u16 *)&bitmap);
>>>>> +
>>>>> +		if (cid >= ARRAY_SIZE(map->logical_map))
>>>>> +			goto out;
>>>>> +
>>>>> +		dst = map->logical_map[cid];
>>>>> +
>>>>> +		for_each_set_bit(i, &bitmap, 16) {
>>>>> +			if (!dst[i] || !kvm_lapic_enabled(dst[i]->vcpu)) {
>>>>> +				clear_bit(i, &bitmap);
>>>>> +				continue;
>>>>> +			}
>>>>> +		}
>>>>> +
>>>>> +		dest_vcpus = hweight16(bitmap);
>>>>> +
>>>>> +		if (dest_vcpus != 0) {
>>>>> +			idx = kvm_vector_2_index(irq->vector, dest_vcpus,
>>>>> +						 &bitmap, 16);
>>>>> +			vcpu = dst[idx-1]->vcpu;
>>>>> +		}
>>>>> +	}
>>>>> +
>>>>> +out:
>>>>> +	rcu_read_unlock();
>>>>> +	return vcpu;
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(kvm_intr_vector_hashing_dest);
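
To make the two-step description in the comment above concrete, here is a
minimal user-space sketch of the vector-hashing selection. It is only an
illustration: pick_dest_by_vector_hash() is a hypothetical name, it is not
the kernel's kvm_vector_2_index(), and it assumes the candidate vCPUs are
given as a 16-bit mask and that the result is a 0-based bit position.

```c
#include <stdint.h>

/*
 * Hypothetical helper (an assumption for illustration only).
 * Step 1: count the candidate destinations in the 16-bit mask.
 * Step 2: use "vector % number of candidates" to pick the n-th set bit.
 * Returns the 0-based bit position of the chosen candidate, or -1 if the
 * mask is empty.
 */
static int pick_dest_by_vector_hash(uint8_t vector, uint16_t bitmap)
{
	int dest_vcpus = 0;
	int target;
	int i;

	for (i = 0; i < 16; i++)
		if (bitmap & (1u << i))
			dest_vcpus++;

	if (dest_vcpus == 0)
		return -1;

	target = vector % dest_vcpus;

	for (i = 0; i < 16; i++) {
		if (!(bitmap & (1u << i)))
			continue;
		if (target-- == 0)
			return i;
	}

	return -1; /* not reached when dest_vcpus > 0 */
}
```

With candidates at bits 1, 2 and 4 (mask 0x0016), vector 49 gives
49 % 3 = 1, so the second candidate (bit 2) is chosen.
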
>>>>> +
>>>>> +/*
>>>>>      * Add a pending IRQ into lapic.
>>>>>      * Return 1 if successfully added and 0 if discarded.
>>>>>      */
>>>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>>>> index 6890ef0..52bffce 100644
>>>>> --- a/arch/x86/kvm/lapic.h
>>>>> +++ b/arch/x86/kvm/lapic.h
>>>>> @@ -172,4 +172,6 @@ bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm,
>>>> struct kvm_lapic_irq *irq,
>>>>>     			struct kvm_vcpu **dest_vcpu);
>>>>>     int kvm_vector_2_index(u32 vector, u32 dest_vcpus,
>>>>>     		       const unsigned long *bitmap, u32 bitmap_size);
>>>>> +struct kvm_vcpu *kvm_intr_vector_hashing_dest(struct kvm *kvm,
>>>>> +					      struct kvm_lapic_irq *irq);
>>>>>     #endif
>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>>> index 5eb56ed..3f89189 100644
>>>>> --- a/arch/x86/kvm/vmx.c
>>>>> +++ b/arch/x86/kvm/vmx.c
>>>>> @@ -10702,8 +10702,16 @@ static int vmx_update_pi_irte(struct kvm
>> *kvm,
>>>> unsigned int host_irq,
>>>>>     		 */
>>>>>
>>>>>     		kvm_set_msi_irq(e, &irq);
>>>>> -		if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu))
>>>>> -			continue;
>>>>> +
>>>>> +		if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) {
>>>>> +			if (!kvm_vector_hashing_enabled() ||
>>>>> +					irq.delivery_mode !=
>>>> APIC_DM_LOWEST)
>>>>> +				continue;
>>>>> +
>>>>> +			vcpu = kvm_intr_vector_hashing_dest(kvm, &irq);
>>>>> +			if (!vcpu)
>>>>> +				continue;
>>>>> +		}
>>>>
>>>> I am a little confused with the 'continue'. If the destination is not
>>>> single vcpu, shouldn't we rollback to use non-PI mode?
>>>
>>> Here is the logic:
>>> - If it is single destination, we will use PI no matter it is fixed or lowest-priority.
>>> - If it is not single destination:
>>> 	a) It is fixed, we will use non-PI
>>> 	b) It is lowest-priority and vector-hashing is enabled, we will use PI
>>> 	c) otherwise, use non-PI
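
The logic listed above can be written down as a small predicate. This is
only a sketch under assumptions: the enum and the function name are
hypothetical and do not exist in the patch, and it leaves out the extra
check in the patch that the hashing lookup actually returned a vCPU.

```c
#include <stdbool.h>

/* Hypothetical names, used only to model the three cases above. */
enum delivery {
	DELIVERY_FIXED,
	DELIVERY_LOWEST_PRIO,
};

/* Returns true when posted mode (PI) is used, false for non-PI. */
static bool use_posted_interrupt(bool single_dest, enum delivery mode,
				 bool vector_hashing)
{
	/* Single destination: PI for both fixed and lowest-priority. */
	if (single_dest)
		return true;

	/* Not single: PI only for lowest-priority with vector-hashing. */
	return mode == DELIVERY_LOWEST_PRIO && vector_hashing;
}
```
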
>>
>> If it was previously single-destination and then changes to non-single mode,
>> can the current code cover this case?
>
> In my test, before setting the irq affinity (changing from a single vcpu to non-single vcpu
> in this case), the guest masks the interrupt first, so before getting here the IRTE
> has already been changed back to remapped mode (when the guest masks the MSI-X,
> we change back to remapped mode), hence nothing is needed here.
>
> Digging into the linux code (guest) a bit more, I found that if interrupt remapping
> is not enabled in the guest (IR is not supported for guest anyway), it will always
> mask the MSI/MSIx before setting the irq affinity. So the code should work
> well currently.

We should not rely on the guest's behavior. At the code level, this needs to be fixed.

>
> However, for robustness, I think explicitly changing IRTE back to remapped
> mode for the 'continue' case should be a good idea.

This is what I am looking for.
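
As a rough illustration of the behavior I mean, here is a toy user-space
model. All names in it (struct route, update_route(), the irte_mode enum)
are hypothetical and are not kernel APIs; the point is only that the
"cannot post" path explicitly resets the entry instead of skipping it.

```c
#include <stdbool.h>

/* Hypothetical model of one interrupt route and its IRTE mode. */
enum irte_mode {
	IRTE_REMAPPED,
	IRTE_POSTED,
};

struct route {
	enum irte_mode mode;
	int dest_vcpu;	/* -1 when the route is not posted */
};

static void update_route(struct route *r, bool can_post, int dest_vcpu)
{
	if (!can_post) {
		/*
		 * A bare 'continue' here would leave a previously posted
		 * route untouched. Reset it to remapped mode explicitly.
		 */
		r->mode = IRTE_REMAPPED;
		r->dest_vcpu = -1;
		return;
	}

	r->mode = IRTE_POSTED;
	r->dest_vcpu = dest_vcpu;
}
```

A route that was posted while it had a single destination ends up in
remapped mode after an update that cannot be posted, without depending on
the guest having masked the interrupt first.
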

>
> Radim, Paolo, what are your opinions on this? Any comments are
> appreciated! Thanks a lot!
>
> Thanks,
> Feng
>


-- 
best regards
yang
