From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
roy.hopkins@suse.com, thomas.lendacky@amd.com,
ashish.kalra@amd.com, michael.roth@amd.com, jroedel@suse.de,
nsaenz@amazon.com, anelkz@amazon.de,
James.Bottomley@hansenpartnership.com
Subject: Re: [PATCH 10/29] KVM: share statistics for same vCPU id on different planes
Date: Fri, 6 Jun 2025 09:23:09 -0700 [thread overview]
Message-ID: <aEMV7awKTSXEkLqu@google.com> (raw)
In-Reply-To: <20250401161106.790710-11-pbonzini@redhat.com>
On Tue, Apr 01, 2025, Paolo Bonzini wrote:
> Statistics are protected by vcpu->mutex; because KVM_RUN takes the
> plane-0 vCPU mutex, there is no race on applying statistics for all
> planes to the plane-0 kvm_vcpu struct.
>
> This saves the burden on the kernel of implementing the binary stats
> interface for vCPU plane file descriptors, and on userspace of gathering
> info from multiple planes. The disadvantage is a slight loss of
> information, and an extra pointer dereference when updating stats.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> arch/arm64/kvm/arm.c | 2 +-
> arch/arm64/kvm/handle_exit.c | 6 +--
> arch/arm64/kvm/hyp/nvhe/gen-hyprel.c | 4 +-
> arch/arm64/kvm/mmio.c | 4 +-
> arch/loongarch/kvm/exit.c | 8 ++--
> arch/loongarch/kvm/vcpu.c | 2 +-
> arch/mips/kvm/emulate.c | 2 +-
> arch/mips/kvm/mips.c | 30 +++++++-------
> arch/mips/kvm/vz.c | 18 ++++-----
> arch/powerpc/kvm/book3s.c | 2 +-
> arch/powerpc/kvm/book3s_hv.c | 46 ++++++++++-----------
> arch/powerpc/kvm/book3s_hv_rm_xics.c | 8 ++--
> arch/powerpc/kvm/book3s_pr.c | 22 +++++-----
> arch/powerpc/kvm/book3s_pr_papr.c | 2 +-
> arch/powerpc/kvm/powerpc.c | 4 +-
> arch/powerpc/kvm/timing.h | 28 ++++++-------
> arch/riscv/kvm/vcpu.c | 2 +-
> arch/riscv/kvm/vcpu_exit.c | 10 ++---
> arch/riscv/kvm/vcpu_insn.c | 16 ++++----
> arch/riscv/kvm/vcpu_sbi.c | 2 +-
> arch/riscv/kvm/vcpu_sbi_hsm.c | 2 +-
> arch/s390/kvm/diag.c | 18 ++++-----
> arch/s390/kvm/intercept.c | 20 +++++-----
> arch/s390/kvm/interrupt.c | 48 +++++++++++-----------
> arch/s390/kvm/kvm-s390.c | 8 ++--
> arch/s390/kvm/priv.c | 60 ++++++++++++++--------------
> arch/s390/kvm/sigp.c | 50 +++++++++++------------
> arch/s390/kvm/vsie.c | 2 +-
> arch/x86/kvm/debugfs.c | 2 +-
> arch/x86/kvm/hyperv.c | 4 +-
> arch/x86/kvm/kvm_cache_regs.h | 4 +-
> arch/x86/kvm/mmu/mmu.c | 18 ++++-----
> arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
> arch/x86/kvm/svm/sev.c | 2 +-
> arch/x86/kvm/svm/svm.c | 18 ++++-----
> arch/x86/kvm/vmx/tdx.c | 8 ++--
> arch/x86/kvm/vmx/vmx.c | 20 +++++-----
> arch/x86/kvm/x86.c | 40 +++++++++----------
> include/linux/kvm_host.h | 5 ++-
> virt/kvm/kvm_main.c | 19 ++++-----
> 40 files changed, 285 insertions(+), 283 deletions(-)
...
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index dbca418d64f5..d2e0c0e8ff17 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -393,7 +393,8 @@ struct kvm_vcpu {
> bool ready;
> bool scheduled_out;
> struct kvm_vcpu_arch arch;
> - struct kvm_vcpu_stat stat;
> + struct kvm_vcpu_stat *stat;
> + struct kvm_vcpu_stat __stat;
Rather than special case invidiual fields, I think we should give kvm_vcpu the
same treatment as "struct kvm", and have kvm_vcpu represent the overall vCPU,
with an array of planes to hold the sub-vCPUs.
Having "kvm_vcpu" represent a plane, while "kvm" represents the overall VM, is
conceptually messy. And more importantly, I think the approach taken here will
be nigh impossible to maintain, and will have quite a bit of baggage. E.g. planes1+
will be filled with dead memory, and we also risk goofs where KVM could access
__stat in a plane1+ vCPU.
Documenting which fields are plane0-only, i.e. per-vCPU, via comments isn't
sustainable, whereas a hard split via structures will naturally what fields are
scope to the overall vCPU, versus what is per-plane, and will force us to more
explicitly audit the code. E.g. ____srcu_idx (and thus srcu_depth) is something
that I think should be shared by all planes. Ditto for preempt_notifier, vcpu_id,
vcpu_idx, pid, etc.
Aha! And to prove my point, this series breaks legacy signal handling, because
sigset_active and sigset are accessed using the plane1+ vCPU in kvm_vcpu_ioctl_run_plane(),
but KVM_SET_SIGNAL_MASK is only allowed to operated on plane0. And I definitely
don't think the answer is to let KVM_SET_SIGNAL_MASK operate on plane1+, because
forcing userspace to duplicate the sigmal masks to all planes is pointless.
Yeeeaaap. pid and pid_lock are also broken. As is vmx_hwapic_isr_update()
and kvm_sched_out()'s usage of wants_to_run. And guest_debug.
Long term, I just don't see this approach as being maintainable. We're pretty
much guaranteed to end up with bugs where KVM operates on the wrong kvm_vcpu
structure due to lack of explicit isolation in code. And those bugs are going
to absolutely brutal to debug (or even notice). E.g. failure to set "preempted"
on planes 1+ will mostly manifest as subtle performance issues.
Oof. And that would force us to document that duplicating cpuid and cpu_caps to
planes1+ is actually necessary, due to dynamic CPUID features (ugh). Though FWIW,
we could dodge that by special casing dynamic features, which isn't a bad idea
irrespective of planes.
Somewhat of a side topic: unless we need/want to explicitly support concurrent
GET/SET on planes of a vCPU, I think we should make vcpu->mutex per-vCPU, not
per-plane, so that there's zero chance of having bugs due to thinking that holding
vcpu->mutex provides protection against a race.
Extracing fields to a separate kvm_vcpu_plane will obviously require a *lot* more
churn, but I think in the long run it will be less work in total, because we won't
spend as much time chasing down bugs.
Very little per-plane state is in "struct kvm_vcpu", so I think we can do the big
conversion on a per-arch basis via a small amount of #ifdefs, i.e. not be force to
immediatedly convert all architectures to a kvm_vcpu vs. kvm_vcpu_plane world.
next prev parent reply other threads:[~2025-06-06 16:23 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-01 16:10 [RFC PATCH 00/29] KVM: VM planes Paolo Bonzini
2025-04-01 16:10 ` [PATCH 01/29] Documentation: kvm: introduce "VM plane" concept Paolo Bonzini
2025-04-21 18:43 ` Tom Lendacky
2025-04-01 16:10 ` [PATCH 02/29] KVM: API definitions for plane userspace exit Paolo Bonzini
2025-06-04 0:10 ` Sean Christopherson
2025-04-01 16:10 ` [PATCH 03/29] KVM: add plane info to structs Paolo Bonzini
2025-04-21 18:57 ` Tom Lendacky
2025-04-21 19:04 ` Tom Lendacky
2025-04-01 16:10 ` [PATCH 04/29] KVM: introduce struct kvm_arch_plane Paolo Bonzini
2025-04-01 16:10 ` [PATCH 05/29] KVM: add plane support to KVM_SIGNAL_MSI Paolo Bonzini
2025-04-01 16:10 ` [PATCH 06/29] KVM: move mem_attr_array to kvm_plane Paolo Bonzini
2025-06-06 22:50 ` Sean Christopherson
2025-04-01 16:10 ` [PATCH 07/29] KVM: do not use online_vcpus to test vCPU validity Paolo Bonzini
2025-06-05 22:45 ` Sean Christopherson
2025-06-06 13:49 ` Sean Christopherson
2025-04-01 16:10 ` [PATCH 08/29] KVM: move vcpu_array to struct kvm_plane Paolo Bonzini
2025-04-01 16:10 ` [PATCH 09/29] KVM: implement plane file descriptors ioctl and creation Paolo Bonzini
2025-04-21 20:32 ` Tom Lendacky
2025-04-01 16:10 ` [PATCH 10/29] KVM: share statistics for same vCPU id on different planes Paolo Bonzini
2025-06-06 16:23 ` Sean Christopherson [this message]
2025-06-06 16:32 ` Paolo Bonzini
2025-04-01 16:10 ` [PATCH 11/29] KVM: anticipate allocation of dirty ring Paolo Bonzini
2025-04-01 16:10 ` [PATCH 12/29] KVM: share dirty ring for same vCPU id on different planes Paolo Bonzini
2025-04-21 21:51 ` Tom Lendacky
2025-04-01 16:10 ` [PATCH 13/29] KVM: implement vCPU creation for extra planes Paolo Bonzini
2025-04-21 22:08 ` Tom Lendacky
2025-06-05 22:47 ` Sean Christopherson
2025-04-01 16:10 ` [PATCH 14/29] KVM: pass plane to kvm_arch_vcpu_create Paolo Bonzini
2025-04-01 16:10 ` [PATCH 15/29] KVM: x86: pass vcpu to kvm_pv_send_ipi() Paolo Bonzini
2025-04-01 16:10 ` [PATCH 16/29] KVM: x86: split "if" in __kvm_set_or_clear_apicv_inhibit Paolo Bonzini
2025-04-01 16:10 ` [PATCH 17/29] KVM: x86: block creating irqchip if planes are active Paolo Bonzini
2025-04-01 16:10 ` [PATCH 18/29] KVM: x86: track APICv inhibits per plane Paolo Bonzini
2025-04-01 16:10 ` [PATCH 19/29] KVM: x86: move APIC map to kvm_arch_plane Paolo Bonzini
2025-04-01 16:10 ` [PATCH 20/29] KVM: x86: add planes support for interrupt delivery Paolo Bonzini
2025-06-06 16:30 ` Sean Christopherson
2025-06-06 16:38 ` Paolo Bonzini
2025-04-01 16:10 ` [PATCH 21/29] KVM: x86: add infrastructure to share FPU across planes Paolo Bonzini
2025-04-01 16:10 ` [PATCH 22/29] KVM: x86: implement initial plane support Paolo Bonzini
2025-04-01 16:11 ` [PATCH 23/29] KVM: x86: extract kvm_post_set_cpuid Paolo Bonzini
2025-04-01 16:11 ` [PATCH 24/29] KVM: x86: initialize CPUID for non-default planes Paolo Bonzini
2025-04-01 16:11 ` [PATCH 25/29] KVM: x86: handle interrupt priorities for planes Paolo Bonzini
2025-04-01 16:11 ` [PATCH 26/29] KVM: x86: enable up to 16 planes Paolo Bonzini
2025-06-06 22:41 ` Sean Christopherson
2025-04-01 16:11 ` [PATCH 27/29] selftests: kvm: introduce basic test for VM planes Paolo Bonzini
2025-04-01 16:11 ` [PATCH 28/29] selftests: kvm: add plane infrastructure Paolo Bonzini
2025-04-01 16:11 ` [PATCH 29/29] selftests: kvm: add x86-specific plane test Paolo Bonzini
2025-04-01 16:16 ` [RFC PATCH 00/29] KVM: VM planes Sean Christopherson
2025-06-06 16:42 ` Tom Lendacky
2025-08-07 12:34 ` Vaishali Thakkar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aEMV7awKTSXEkLqu@google.com \
--to=seanjc@google.com \
--cc=James.Bottomley@hansenpartnership.com \
--cc=anelkz@amazon.de \
--cc=ashish.kalra@amd.com \
--cc=jroedel@suse.de \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=michael.roth@amd.com \
--cc=nsaenz@amazon.com \
--cc=pbonzini@redhat.com \
--cc=roy.hopkins@suse.com \
--cc=thomas.lendacky@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.