From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: Roman Kagan <rkagan@virtuozzo.com>
Cc: kvm@vger.kernel.org, x86@kernel.org,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Radim Krčmář" <rkrcmar@redhat.com>,
"K. Y. Srinivasan" <kys@microsoft.com>,
"Haiyang Zhang" <haiyangz@microsoft.com>,
"Stephen Hemminger" <sthemmin@microsoft.com>,
"Michael Kelley (EOSG)" <Michael.H.Kelley@microsoft.com>,
"Mohammed Gamal" <mmorsy@redhat.com>,
"Cathy Avery" <cavery@redhat.com>,
"Wanpeng Li" <wanpeng.li@hotmail.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/5] KVM: x86: hyperv: introduce vp_index_to_vcpu_idx mapping
Date: Fri, 29 Jun 2018 13:37:44 +0200
Message-ID: <87tvplddrr.fsf@vitty.brq.redhat.com>
In-Reply-To: <20180629111227.GB15656@rkaganb.sw.ru> (Roman Kagan's message of "Fri, 29 Jun 2018 14:12:28 +0300")

Roman Kagan <rkagan@virtuozzo.com> writes:

> On Fri, Jun 29, 2018 at 12:26:23PM +0200, Vitaly Kuznetsov wrote:
>> Roman Kagan <rkagan@virtuozzo.com> writes:
>>
>> > On Thu, Jun 28, 2018 at 03:53:10PM +0200, Vitaly Kuznetsov wrote:
>> >> While it is easy to get VP index from vCPU index the reverse task is hard.
>> >> Basically, to solve it we have to walk all vCPUs checking if their VP index
>> >> matches. For hypercalls like HvFlushVirtualAddress{List,Space}* and the
>> >> upcoming HvSendSyntheticClusterIpi* where a single CPU may be specified in
>> >> the whole set this is obviously sub-optimal.
>> >>
>> >> As VP index can be set to anything <= U32_MAX by userspace, using a plain
>> >> [0..MAX_VP_INDEX] array is not a viable option. Use a condensed sorted
>> >> array with logarithmic search complexity instead. Use RCU to make read
>> >> access as fast as possible and maintain atomicity of updates.
>> >
>> > Quoting TLFS 5.0C section 7.8.1:
>> >
>> >> Virtual processors are identified by using an index (VP index). The
>> >> maximum number of virtual processors per partition supported by the
>> >> current implementation of the hypervisor can be obtained through CPUID
>> >> leaf 0x40000005. A virtual processor index must be less than the
>> >> maximum number of virtual processors per partition.
>> >
>> > so this is a dense index, and VP_INDEX >= KVM_MAX_VCPUS is invalid. I
>> > think we're better off enforcing this in kvm_hv_set_msr and keep the
>> > translation simple. If the algorithm in get_vcpu_by_vpidx is not good
>> > enough (and yes it can be made to return NULL early on vpidx >=
>> > KVM_MAX_VCPUS instead of taking the slow path) then a simple index array
>> > of KVM_MAX_VCPUS entries should certainly do.
>>
>> Sure, we can use pre-allocated [0..KVM_MAX_VCPUS] array instead and put
>> limits on what userspace can assign VP_INDEX to. However, while thinking
>> about it I decided to go with the more complex condensed array approach
>> because the tendency is for KVM_MAX_VCPUS to grow and we will be
>> pre-allocating more and more memory for no particular reason (so I think
>> even 'struct kvm_vcpu *vcpus[KVM_MAX_VCPUS]' in 'struct kvm' will need
>> to be converted to something else eventually).
>
> We're talking of kilobytes here. I guess this is going to be the least
> of the scalability problems.

Yes, kilobytes but per-VM.

>
>> Anyway, I'm flexible and if you think we should go this way now I'll do
>> this in v3. We can re-think this when we later decide to raise
>> KVM_MAX_VCPUS significantly.
>
> Although there's no strict requirement for that I think every sensible
> userspace will allocate VP_INDEX linearly resulting in it being equal to
> KVM's vcpu index. So we've yet to see a case where get_vcpu_by_vpidx
> doesn't take the fast path. If it ever starts appearing in the profiles
> we may consider optimizing it but ATM I don't even think introducing
> the translation array is justified.

It was Radim who suggested it in the first place :-)

The problem we're trying to solve here is: with PV TLB flush and IPI we
need to walk through the supplied list of VP_INDEXes and get VCPU
ids. Usually they match. But in case they don't we'll fall back to a full
scan for every VP_INDEX in the supplied list. Now let's say we have 128
CPUs. We'll need to perform up to 128 * 128 extra comparisons on every
hypercall. Not good. So instead of using get_vcpu_by_vpidx() I opted for
walking the whole VCPU list and checking if the VCPU's VP_INDEX is in the
supplied set. This way we end up with 128 comparisons in the example
above (worst case scenario). However, we lose in simple scenarios like
when only 1 VP_INDEX is specified in the set: we'll still need to walk the
whole list. So having the translation array (one way or another) is IMO
justified.
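
To make the trade-off above concrete, here is a minimal userspace sketch
in C, not the code from the patch. It contrasts the full-scan fallback
with a lookup in a condensed array sorted by VP_INDEX. All names
(vp_map_entry, vp_map_build, lookup_linear, lookup_sorted, NR_VCPUS) are
hypothetical and chosen for illustration only, and the RCU protection
mentioned in the patch description is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_VCPUS 128	/* example size from the text, not KVM_MAX_VCPUS */

/* Hypothetical map entry: one per vCPU, array kept sorted by vp_index. */
struct vp_map_entry {
	uint32_t vp_index;
	int vcpu_idx;
};

/* Three-way comparison on vp_index, usable by qsort() and bsearch(). */
static int vp_map_cmp(const void *a, const void *b)
{
	const struct vp_map_entry *x = a, *y = b;

	return (x->vp_index > y->vp_index) - (x->vp_index < y->vp_index);
}

/* Full-scan fallback: O(n) comparisons for every VP_INDEX looked up. */
static int lookup_linear(const uint32_t *vp_index_of, size_t n,
			 uint32_t vp_index)
{
	for (size_t i = 0; i < n; i++)
		if (vp_index_of[i] == vp_index)
			return (int)i;
	return -1;
}

/* Build the condensed sorted array from the per-vCPU VP indexes. */
static void vp_map_build(struct vp_map_entry *map,
			 const uint32_t *vp_index_of, size_t n)
{
	assert(n <= NR_VCPUS);
	for (size_t i = 0; i < n; i++) {
		map[i].vp_index = vp_index_of[i];
		map[i].vcpu_idx = (int)i;
	}
	qsort(map, n, sizeof(*map), vp_map_cmp);
}

/* Translation array lookup: binary search, O(log n) per VP_INDEX. */
static int lookup_sorted(const struct vp_map_entry *map, size_t n,
			 uint32_t vp_index)
{
	struct vp_map_entry key = { .vp_index = vp_index, .vcpu_idx = -1 };
	const struct vp_map_entry *hit;

	hit = bsearch(&key, map, n, sizeof(*map), vp_map_cmp);
	return hit ? hit->vcpu_idx : -1;
}
```

With 128 vCPUs, translating a full set through lookup_linear() costs up
to 128 * 128 = 16384 comparisons, while lookup_sorted() needs only a
handful per entry, and a set with a single VP_INDEX no longer forces a
walk over every vCPU. The array holds one entry per existing vCPU, so
its size does not depend on how large a VP_INDEX userspace picks.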
--
Vitaly