From: Sean Christopherson <seanjc@google.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
roy.hopkins@suse.com, thomas.lendacky@amd.com,
ashish.kalra@amd.com, michael.roth@amd.com, jroedel@suse.de,
nsaenz@amazon.com, anelkz@amazon.de,
James.Bottomley@hansenpartnership.com
Subject: Re: [PATCH 10/29] KVM: share statistics for same vCPU id on different planes
Date: Fri, 6 Jun 2025 09:23:09 -0700 [thread overview]
Message-ID: <aEMV7awKTSXEkLqu@google.com> (raw)
In-Reply-To: <20250401161106.790710-11-pbonzini@redhat.com>
On Tue, Apr 01, 2025, Paolo Bonzini wrote:
> Statistics are protected by vcpu->mutex; because KVM_RUN takes the
> plane-0 vCPU mutex, there is no race on applying statistics for all
> planes to the plane-0 kvm_vcpu struct.
>
> This saves the burden on the kernel of implementing the binary stats
> interface for vCPU plane file descriptors, and on userspace of gathering
> info from multiple planes. The disadvantage is a slight loss of
> information, and an extra pointer dereference when updating stats.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
> arch/arm64/kvm/arm.c | 2 +-
> arch/arm64/kvm/handle_exit.c | 6 +--
> arch/arm64/kvm/hyp/nvhe/gen-hyprel.c | 4 +-
> arch/arm64/kvm/mmio.c | 4 +-
> arch/loongarch/kvm/exit.c | 8 ++--
> arch/loongarch/kvm/vcpu.c | 2 +-
> arch/mips/kvm/emulate.c | 2 +-
> arch/mips/kvm/mips.c | 30 +++++++-------
> arch/mips/kvm/vz.c | 18 ++++-----
> arch/powerpc/kvm/book3s.c | 2 +-
> arch/powerpc/kvm/book3s_hv.c | 46 ++++++++++-----------
> arch/powerpc/kvm/book3s_hv_rm_xics.c | 8 ++--
> arch/powerpc/kvm/book3s_pr.c | 22 +++++-----
> arch/powerpc/kvm/book3s_pr_papr.c | 2 +-
> arch/powerpc/kvm/powerpc.c | 4 +-
> arch/powerpc/kvm/timing.h | 28 ++++++-------
> arch/riscv/kvm/vcpu.c | 2 +-
> arch/riscv/kvm/vcpu_exit.c | 10 ++---
> arch/riscv/kvm/vcpu_insn.c | 16 ++++----
> arch/riscv/kvm/vcpu_sbi.c | 2 +-
> arch/riscv/kvm/vcpu_sbi_hsm.c | 2 +-
> arch/s390/kvm/diag.c | 18 ++++-----
> arch/s390/kvm/intercept.c | 20 +++++-----
> arch/s390/kvm/interrupt.c | 48 +++++++++++-----------
> arch/s390/kvm/kvm-s390.c | 8 ++--
> arch/s390/kvm/priv.c | 60 ++++++++++++++--------------
> arch/s390/kvm/sigp.c | 50 +++++++++++------------
> arch/s390/kvm/vsie.c | 2 +-
> arch/x86/kvm/debugfs.c | 2 +-
> arch/x86/kvm/hyperv.c | 4 +-
> arch/x86/kvm/kvm_cache_regs.h | 4 +-
> arch/x86/kvm/mmu/mmu.c | 18 ++++-----
> arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
> arch/x86/kvm/svm/sev.c | 2 +-
> arch/x86/kvm/svm/svm.c | 18 ++++-----
> arch/x86/kvm/vmx/tdx.c | 8 ++--
> arch/x86/kvm/vmx/vmx.c | 20 +++++-----
> arch/x86/kvm/x86.c | 40 +++++++++----------
> include/linux/kvm_host.h | 5 ++-
> virt/kvm/kvm_main.c | 19 ++++-----
> 40 files changed, 285 insertions(+), 283 deletions(-)
...
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index dbca418d64f5..d2e0c0e8ff17 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -393,7 +393,8 @@ struct kvm_vcpu {
> bool ready;
> bool scheduled_out;
> struct kvm_vcpu_arch arch;
> - struct kvm_vcpu_stat stat;
> + struct kvm_vcpu_stat *stat;
> + struct kvm_vcpu_stat __stat;
Rather than special case invidiual fields, I think we should give kvm_vcpu the
same treatment as "struct kvm", and have kvm_vcpu represent the overall vCPU,
with an array of planes to hold the sub-vCPUs.
Having "kvm_vcpu" represent a plane, while "kvm" represents the overall VM, is
conceptually messy. And more importantly, I think the approach taken here will
be nigh impossible to maintain, and will have quite a bit of baggage. E.g. planes1+
will be filled with dead memory, and we also risk goofs where KVM could access
__stat in a plane1+ vCPU.
Documenting which fields are plane0-only, i.e. per-vCPU, via comments isn't
sustainable, whereas a hard split via structures will naturally what fields are
scope to the overall vCPU, versus what is per-plane, and will force us to more
explicitly audit the code. E.g. ____srcu_idx (and thus srcu_depth) is something
that I think should be shared by all planes. Ditto for preempt_notifier, vcpu_id,
vcpu_idx, pid, etc.
Aha! And to prove my point, this series breaks legacy signal handling, because
sigset_active and sigset are accessed using the plane1+ vCPU in kvm_vcpu_ioctl_run_plane(),
but KVM_SET_SIGNAL_MASK is only allowed to operated on plane0. And I definitely
don't think the answer is to let KVM_SET_SIGNAL_MASK operate on plane1+, because
forcing userspace to duplicate the sigmal masks to all planes is pointless.
Yeeeaaap. pid and pid_lock are also broken. As is vmx_hwapic_isr_update()
and kvm_sched_out()'s usage of wants_to_run. And guest_debug.
Long term, I just don't see this approach as being maintainable. We're pretty
much guaranteed to end up with bugs where KVM operates on the wrong kvm_vcpu
structure due to lack of explicit isolation in code. And those bugs are going
to absolutely brutal to debug (or even notice). E.g. failure to set "preempted"
on planes 1+ will mostly manifest as subtle performance issues.
Oof. And that would force us to document that duplicating cpuid and cpu_caps to
planes1+ is actually necessary, due to dynamic CPUID features (ugh). Though FWIW,
we could dodge that by special casing dynamic features, which isn't a bad idea
irrespective of planes.
Somewhat of a side topic: unless we need/want to explicitly support concurrent
GET/SET on planes of a vCPU, I think we should make vcpu->mutex per-vCPU, not
per-plane, so that there's zero chance of having bugs due to thinking that holding
vcpu->mutex provides protection against a race.
Extracing fields to a separate kvm_vcpu_plane will obviously require a *lot* more
churn, but I think in the long run it will be less work in total, because we won't
spend as much time chasing down bugs.
Very little per-plane state is in "struct kvm_vcpu", so I think we can do the big
conversion on a per-arch basis via a small amount of #ifdefs, i.e. not be force to
immediatedly convert all architectures to a kvm_vcpu vs. kvm_vcpu_plane world.
next prev parent reply other threads:[~2025-06-06 16:23 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-01 16:10 [RFC PATCH 00/29] KVM: VM planes Paolo Bonzini
2025-04-01 16:10 ` [PATCH 01/29] Documentation: kvm: introduce "VM plane" concept Paolo Bonzini
2025-04-21 18:43 ` Tom Lendacky
2025-04-01 16:10 ` [PATCH 02/29] KVM: API definitions for plane userspace exit Paolo Bonzini
2025-06-04 0:10 ` Sean Christopherson
2025-04-01 16:10 ` [PATCH 03/29] KVM: add plane info to structs Paolo Bonzini
2025-04-21 18:57 ` Tom Lendacky
2025-04-21 19:04 ` Tom Lendacky
2025-04-01 16:10 ` [PATCH 04/29] KVM: introduce struct kvm_arch_plane Paolo Bonzini
2025-04-01 16:10 ` [PATCH 05/29] KVM: add plane support to KVM_SIGNAL_MSI Paolo Bonzini
2025-04-01 16:10 ` [PATCH 06/29] KVM: move mem_attr_array to kvm_plane Paolo Bonzini
2025-06-06 22:50 ` Sean Christopherson
2025-04-01 16:10 ` [PATCH 07/29] KVM: do not use online_vcpus to test vCPU validity Paolo Bonzini
2025-06-05 22:45 ` Sean Christopherson
2025-06-06 13:49 ` Sean Christopherson
2025-04-01 16:10 ` [PATCH 08/29] KVM: move vcpu_array to struct kvm_plane Paolo Bonzini
2025-04-01 16:10 ` [PATCH 09/29] KVM: implement plane file descriptors ioctl and creation Paolo Bonzini
2025-04-21 20:32 ` Tom Lendacky
2025-04-01 16:10 ` [PATCH 10/29] KVM: share statistics for same vCPU id on different planes Paolo Bonzini
2025-06-06 16:23 ` Sean Christopherson [this message]
2025-06-06 16:32 ` Paolo Bonzini
2025-04-01 16:10 ` [PATCH 11/29] KVM: anticipate allocation of dirty ring Paolo Bonzini
2025-04-01 16:10 ` [PATCH 12/29] KVM: share dirty ring for same vCPU id on different planes Paolo Bonzini
2025-04-21 21:51 ` Tom Lendacky
2025-04-01 16:10 ` [PATCH 13/29] KVM: implement vCPU creation for extra planes Paolo Bonzini
2025-04-21 22:08 ` Tom Lendacky
2025-06-05 22:47 ` Sean Christopherson
2025-04-01 16:10 ` [PATCH 14/29] KVM: pass plane to kvm_arch_vcpu_create Paolo Bonzini
2025-04-01 16:10 ` [PATCH 15/29] KVM: x86: pass vcpu to kvm_pv_send_ipi() Paolo Bonzini
2025-04-01 16:10 ` [PATCH 16/29] KVM: x86: split "if" in __kvm_set_or_clear_apicv_inhibit Paolo Bonzini
2025-04-01 16:10 ` [PATCH 17/29] KVM: x86: block creating irqchip if planes are active Paolo Bonzini
2025-04-01 16:10 ` [PATCH 18/29] KVM: x86: track APICv inhibits per plane Paolo Bonzini
2025-04-01 16:10 ` [PATCH 19/29] KVM: x86: move APIC map to kvm_arch_plane Paolo Bonzini
2025-04-01 16:10 ` [PATCH 20/29] KVM: x86: add planes support for interrupt delivery Paolo Bonzini
2025-06-06 16:30 ` Sean Christopherson
2025-06-06 16:38 ` Paolo Bonzini
2025-04-01 16:10 ` [PATCH 21/29] KVM: x86: add infrastructure to share FPU across planes Paolo Bonzini
2025-04-01 16:10 ` [PATCH 22/29] KVM: x86: implement initial plane support Paolo Bonzini
2025-04-01 16:11 ` [PATCH 23/29] KVM: x86: extract kvm_post_set_cpuid Paolo Bonzini
2025-04-01 16:11 ` [PATCH 24/29] KVM: x86: initialize CPUID for non-default planes Paolo Bonzini
2025-04-01 16:11 ` [PATCH 25/29] KVM: x86: handle interrupt priorities for planes Paolo Bonzini
2025-04-01 16:11 ` [PATCH 26/29] KVM: x86: enable up to 16 planes Paolo Bonzini
2025-06-06 22:41 ` Sean Christopherson
2025-04-01 16:11 ` [PATCH 27/29] selftests: kvm: introduce basic test for VM planes Paolo Bonzini
2025-04-01 16:11 ` [PATCH 28/29] selftests: kvm: add plane infrastructure Paolo Bonzini
2025-04-01 16:11 ` [PATCH 29/29] selftests: kvm: add x86-specific plane test Paolo Bonzini
2025-04-01 16:16 ` [RFC PATCH 00/29] KVM: VM planes Sean Christopherson
2025-06-06 16:42 ` Tom Lendacky
2025-08-07 12:34 ` Vaishali Thakkar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aEMV7awKTSXEkLqu@google.com \
--to=seanjc@google.com \
--cc=James.Bottomley@hansenpartnership.com \
--cc=anelkz@amazon.de \
--cc=ashish.kalra@amd.com \
--cc=jroedel@suse.de \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=michael.roth@amd.com \
--cc=nsaenz@amazon.com \
--cc=pbonzini@redhat.com \
--cc=roy.hopkins@suse.com \
--cc=thomas.lendacky@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox