From: Yang Zhang <yang.zhang.wz@gmail.com>
To: "Wu, Feng" <feng.wu@intel.com>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"rkrcmar@redhat.com" <rkrcmar@redhat.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 2/2] KVM: x86: Add lowest-priority support for vt-d posted-interrupts
Date: Tue, 22 Dec 2015 14:42:06 +0800
Message-ID: <5678F0BE.2020409@gmail.com>
In-Reply-To: <E959C4978C3B6342920538CF579893F00AF06251@SHSMSX104.ccr.corp.intel.com>
On 2015/12/22 12:36, Wu, Feng wrote:
>
>
>> -----Original Message-----
>> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com]
>> Sent: Monday, December 21, 2015 10:01 AM
>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com;
>> rkrcmar@redhat.com
>> Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org
>> Subject: Re: [PATCH v2 2/2] KVM: x86: Add lowest-priority support for vt-d
>> posted-interrupts
>>
>> On 2015/12/21 9:55, Wu, Feng wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: linux-kernel-owner@vger.kernel.org [mailto:linux-kernel-
>>>> owner@vger.kernel.org] On Behalf Of Yang Zhang
>>>> Sent: Monday, December 21, 2015 9:50 AM
>>>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com;
>>>> rkrcmar@redhat.com
>>>> Cc: kvm@vger.kernel.org; linux-kernel@vger.kernel.org
>>>> Subject: Re: [PATCH v2 2/2] KVM: x86: Add lowest-priority support for vt-d
>>>> posted-interrupts
>>>>
>>>> On 2015/12/16 9:37, Feng Wu wrote:
>>>>> Use vector-hashing to deliver lowest-priority interrupts for
>>>>> VT-d posted-interrupts.
>>>>>
>>>>> Signed-off-by: Feng Wu <feng.wu@intel.com>
>>>>> ---
>>>>> arch/x86/kvm/lapic.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>> arch/x86/kvm/lapic.h | 2 ++
>>>>> arch/x86/kvm/vmx.c | 12 ++++++++--
>>>>> 3 files changed, 79 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>>> index e29001f..d4f2c8f 100644
>>>>> --- a/arch/x86/kvm/lapic.c
>>>>> +++ b/arch/x86/kvm/lapic.c
>>>>> @@ -854,6 +854,73 @@ out:
>>>>> }
>>>>>
>>>>> /*
>>>>> + * This routine handles lowest-priority interrupts using vector-hashing
>>>>> + * mechanism. As an example, modern Intel CPUs use this method to handle
>>>>> + * lowest-priority interrupts.
>>>>> + *
>>>>> + * Here is the details about the vector-hashing mechanism:
>>>>> + * 1. For lowest-priority interrupts, store all the possible destination
>>>>> + * vCPUs in an array.
>>>>> + * 2. Use "guest vector % max number of destination vCPUs" to find the right
>>>>> + * destination vCPU in the array for the lowest-priority interrupt.
>>>>> + */
>>>>> +struct kvm_vcpu *kvm_intr_vector_hashing_dest(struct kvm *kvm,
>>>>> + struct kvm_lapic_irq *irq)
>>>>> +{
>>>>> + struct kvm_apic_map *map;
>>>>> + struct kvm_vcpu *vcpu = NULL;
>>>>> +
>>>>> + if (irq->shorthand)
>>>>> + return NULL;
>>>>> +
>>>>> + rcu_read_lock();
>>>>> + map = rcu_dereference(kvm->arch.apic_map);
>>>>> +
>>>>> + if (!map)
>>>>> + goto out;
>>>>> +
>>>>> + if ((irq->dest_mode != APIC_DEST_PHYSICAL) &&
>>>>> + kvm_lowest_prio_delivery(irq)) {
>>>>> + u16 cid;
>>>>> + int i, idx = 0;
>>>>> + unsigned long bitmap = 1;
>>>>> + unsigned int dest_vcpus = 0;
>>>>> + struct kvm_lapic **dst = NULL;
>>>>> +
>>>>> +
>>>>> + if (!kvm_apic_logical_map_valid(map))
>>>>> + goto out;
>>>>> +
>>>>> + apic_logical_id(map, irq->dest_id, &cid, (u16 *)&bitmap);
>>>>> +
>>>>> + if (cid >= ARRAY_SIZE(map->logical_map))
>>>>> + goto out;
>>>>> +
>>>>> + dst = map->logical_map[cid];
>>>>> +
>>>>> + for_each_set_bit(i, &bitmap, 16) {
>>>>> + if (!dst[i] && !kvm_lapic_enabled(dst[i]->vcpu)) {
>>>>> + clear_bit(i, &bitmap);
>>>>> + continue;
>>>>> + }
>>>>> + }
>>>>> +
>>>>> + dest_vcpus = hweight16(bitmap);
>>>>> +
>>>>> + if (dest_vcpus != 0) {
>>>>> + idx = kvm_vector_2_index(irq->vector, dest_vcpus,
>>>>> + &bitmap, 16);
>>>>> + vcpu = dst[idx-1]->vcpu;
>>>>> + }
>>>>> + }
>>>>> +
>>>>> +out:
>>>>> + rcu_read_unlock();
>>>>> + return vcpu;
>>>>> +}
>>>>> +EXPORT_SYMBOL_GPL(kvm_intr_vector_hashing_dest);
>>>>> +
>>>>> +/*
>>>>> * Add a pending IRQ into lapic.
>>>>> * Return 1 if successfully added and 0 if discarded.
>>>>> */
>>>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>>>> index 6890ef0..52bffce 100644
>>>>> --- a/arch/x86/kvm/lapic.h
>>>>> +++ b/arch/x86/kvm/lapic.h
>>>>> @@ -172,4 +172,6 @@ bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
>>>>> struct kvm_vcpu **dest_vcpu);
>>>>> int kvm_vector_2_index(u32 vector, u32 dest_vcpus,
>>>>> const unsigned long *bitmap, u32 bitmap_size);
>>>>> +struct kvm_vcpu *kvm_intr_vector_hashing_dest(struct kvm *kvm,
>>>>> + struct kvm_lapic_irq *irq);
>>>>> #endif
>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>>> index 5eb56ed..3f89189 100644
>>>>> --- a/arch/x86/kvm/vmx.c
>>>>> +++ b/arch/x86/kvm/vmx.c
>>>>> @@ -10702,8 +10702,16 @@ static int vmx_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
>>>>> */
>>>>>
>>>>> kvm_set_msi_irq(e, &irq);
>>>>> - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu))
>>>>> - continue;
>>>>> +
>>>>> + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) {
>>>>> + if (!kvm_vector_hashing_enabled() ||
>>>>> + irq.delivery_mode != APIC_DM_LOWEST)
>>>>> + continue;
>>>>> +
>>>>> + vcpu = kvm_intr_vector_hashing_dest(kvm, &irq);
>>>>> + if (!vcpu)
>>>>> + continue;
>>>>> + }
>>>>
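For reference, the selection rule described in the patch comment ("guest vector % number of destination vCPUs") can be sketched in plain userspace C. This is only an illustration under my own assumptions, not the kernel code: vector_hash_pick() and popcount16() are hypothetical names, and the 16-bit candidate mask simply mirrors the 16-entry logical map used in the patch.

```c
#include <stdint.h>

/* Count set bits in a 16-bit mask (a stand-in for the kernel's hweight16()). */
static unsigned int popcount16(uint16_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= (uint16_t)(mask - 1);	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/*
 * Hypothetical helper: pick one destination out of the candidate mask.
 * Returns the bit position of the chosen candidate, or -1 if the mask
 * is empty. The candidate chosen is the (vector % count)-th set bit,
 * counting from bit 0.
 */
static int vector_hash_pick(uint8_t vector, uint16_t candidates)
{
	unsigned int count = popcount16(candidates);
	unsigned int target, seen = 0;
	int bit;

	if (count == 0)
		return -1;

	target = vector % count;

	for (bit = 0; bit < 16; bit++) {
		if (!(candidates & (1u << bit)))
			continue;
		if (seen == target)
			return bit;
		seen++;
	}
	return -1;	/* not reached when count > 0 */
}
```

With candidates at bits 1, 2 and 4 (mask 0x0016), three consecutive vectors land on three different candidates, which is the load-spreading effect the hashing is meant to give.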
>>>> I am a little confused with the 'continue'. If the destination is not
>>>> single vcpu, shouldn't we rollback to use non-PI mode?
>>>
>>> Here is the logic:
>>> - If it is single destination, we will use PI no matter it is fixed or lowest-priority.
>>> - If it is not single destination:
>>> a) It is fixed, we will use non-PI
>>> b) It is lowest-priority and vector-hashing is enabled, we will use PI
>>> c) otherwise, use non-PI
>>
>> If it is single destination previously, then change to no-single mode.
>> Can current code cover this case?
>
> In my test, before setting irq affinity (change single vcpu to non-single vcpu
> in this case), the guest will mask the interrupt first, so before getting here, IRTE
> has been changed back to remapped mode already(when guest masks the MSIx,
> we will change back to remapped mode), hence nothing needed here.
>
> Digging into the linux code (guest) a bit more, I found that if interrupt remapping
> is not enabled in the guest (IR is not supported for guest anyway), it will always
> mask the MSI/MSIx before setting the irq affinity. So the code should work
> well currently.
We should not rely on the guest's behavior. At the code level, this needs to be fixed.
>
> However, for robustness, I think explicitly changing IRTE back to remapped
> mode for the 'continue' case should be a good idea.
This is what I am looking for.
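
To make the point concrete, here is a rough sketch of the behavior I have in mind. It is not the actual vmx_update_pi_irte() code; the types and the names choose_irte_mode() and update_irte() are made up for illustration. The idea is that every update writes the mode explicitly, so a route that used to qualify for posting is put back into remapped mode without depending on the guest masking the interrupt first.

```c
#include <stdbool.h>

/* Hypothetical model of the two IRTE modes discussed in this thread. */
enum irte_mode { IRTE_REMAPPED, IRTE_POSTED };
enum delivery_mode { DM_FIXED, DM_LOWEST_PRIO };

/* Hypothetical summary of one MSI route. */
struct route {
	bool single_dest;		/* resolves to exactly one vCPU */
	enum delivery_mode delivery;
};

/* Decision logic as described earlier in the thread. */
static enum irte_mode choose_irte_mode(const struct route *r,
				       bool vector_hashing_enabled)
{
	if (r->single_dest)
		return IRTE_POSTED;	/* fixed or lowest-priority */
	if (r->delivery == DM_LOWEST_PRIO && vector_hashing_enabled)
		return IRTE_POSTED;	/* destination picked by hashing */
	return IRTE_REMAPPED;		/* everything else: non-PI */
}

/*
 * Always write the chosen mode instead of skipping the entry, so a
 * previously posted IRTE is explicitly reset to remapped mode.
 */
static void update_irte(enum irte_mode *irte, const struct route *r,
			bool vector_hashing_enabled)
{
	*irte = choose_irte_mode(r, vector_hashing_enabled);
}
```

The real code also has to handle the case where no destination vCPU is found at all; in this sketch that would fall under the remapped path as well.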
>
> Radim, Paolo, what are your opinions about this? Any comments are
> appreciated! Thanks a lot!
>
> Thanks,
> Feng
>
--
best regards
yang