From: Marc Zyngier <maz@kernel.org>
To: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Cc: Oliver Upton <oupton@kernel.org>, Joey Gouly <joey.gouly@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Kees Cook <kees@kernel.org>,
"Gustavo A. R. Silva" <gustavoars@kernel.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Jonathan Corbet <corbet@lwn.net>, Shuah Khan <shuah@kernel.org>,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org,
devel@daynix.com, kvm@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v7 3/4] KVM: arm64: PMU: Introduce FIXED_COUNTERS_ONLY
Date: Mon, 20 Apr 2026 14:53:15 +0100 [thread overview]
Message-ID: <86ldeh20xg.wl-maz@kernel.org> (raw)
In-Reply-To: <ad44c69e-2f99-4f31-81b4-faae52eea080@rsg.ci.i.u-tokyo.ac.jp>
On Mon, 20 Apr 2026 13:07:15 +0100,
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> wrote:
>
> On 2026/04/20 18:51, Marc Zyngier wrote:
> > On Mon, 20 Apr 2026 09:36:16 +0100,
> > Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> wrote:
> >>
> >> On 2026/04/20 2:19, Marc Zyngier wrote:
> >>> On Sat, 18 Apr 2026 09:14:25 +0100,
> >>> Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> wrote:
> >>>>
> >>>> On a heterogeneous arm64 system, KVM's PMU emulation is based on the
> >>>> features of a single host PMU instance. When a vCPU is migrated to a
> >>>> pCPU with an incompatible PMU, counters such as PMCCNTR_EL0 stop
> >>>> incrementing.
> >>>>
> >>>> Although this behavior is permitted by the architecture, Windows does
> >>>> not handle it gracefully and may crash with a division-by-zero error.
> >>>>
> >>>> The current workaround requires VMMs to pin vCPUs to a set of pCPUs
> >>>> that share a compatible PMU. This is difficult to implement correctly in
> >>>> QEMU/libvirt, where pinning occurs after vCPU initialization, and it
> >>>> also restricts the guest to a subset of available pCPUs.
> >>>>
> >>>> Introduce the KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY attribute to
> >>>> create a "fixed-counters-only" PMU. When set, KVM exposes a PMU that is
> >>>> compatible with all pCPUs but that does not support programmable
> >>>> event counters which may have different feature sets on different PMUs.
> >>>>
> >>>> This allows Windows guests to run reliably on heterogeneous systems
> >>>> without crashing, even without vCPU pinning, and enables VMMs to
> >>>> schedule vCPUs across all available pCPUs, making full use of the host
> >>>> hardware.
> >>>>
> >>>> Much like KVM_ARM_VCPU_PMU_V3_IRQ and other read-write attributes, this
> >>>> attribute provides a getter that facilitates kernel and userspace
> >>>> debugging/testing.
> >>>
> >>> OK, so that's the sales pitch. But how is it implemented? I would like
> >>> to be able to read a high-level description of the implementation
> >>> trade-offs.
> >>
> >> Implementation-wise it is very trivial. Essentially the following
> >> addition in kvm_arm_pmu_v3_get_attr() is the entire implementation:
> >> + case KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY:
> >> + if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY,
> >> &vcpu->kvm->arch.flags))
> >> + return 0;
> >>
> >> Both its functionality and code complexity is trivial. So we can argue that:
> >> - the functionality is too trivial to be useful or
> >> - the interface/implementation complexity is so trivial that it does not
> >> incur maintenance burden
> >>
> >> In this case the selftest uses the getter so I was more inclined to
> >> have it, but adding one just for the selftest sounds too ad-hoc, so
> >> here I looked into other attributes to ensure that it was not
> >> introducing inconsistency with existing interfaces.
> >>
> >> As the result, I found there are other read-write attributes; in fact
> >> there are more read-write attributes than write-only ones.
> >
> > You're completely missing the point. I'm referring to the whole of the
> > commit message, which is more of a marketing slide than a technical
> > description.
>
> In terms of implementation, the obvious tradeoff is that it adds more
> code to implement the feature. One thing to note is that
> kvm_vcpu_load_pmu() is added and is called each time a vCPU migrates
> across pCPUs. The heavy part, making the KVM_REQ_RELOAD_PMU request,
> only happens when the feature is enabled.
Well, that's what I want to see. The repeated blurb about Windows
being broken is cover letter material, but not fir for a commit
message.
[...]
> >>> "for the first time" gives the impression that it will work if you try
> >>> again. I'd rather we say that "This feature is incompatible with the
> >>> existence of a PMU event filter".
> >>
> >> The following sequence will work:
> >> 1. Set KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY
> >> 2. Set KVM_ARM_VCPU_PMU_V3_FILTER
> >> 3. Set KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY
> >>
> >> This is to make the behavior conistent with KVM_ARM_VCPU_PMU_V3_SET_PMU.
> >
> > I don't think this is correct. Filtering is completely at odds with
> > this patch, and I don't want to have to reason about the combination.
>
> kvm_arm_pmu_v3_set_pmu() has the following condition:
>
> if (kvm_vm_has_ran_once(kvm) ||
> (kvm->arch.pmu_filter && kvm->arch.arm_pmu != arm_pmu)) {
> ret = -EBUSY;
> break;
> }
>
> kvm_arm_pmu_v3_set_pmu_fixed_counters_only() has the corresponding
> condition for consistency:
>
> if (kvm_vm_has_ran_once(kvm) ||
> (kvm->arch.pmu_filter &&
> !test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY,
> &kvm->arch.flags)))
> return -EBUSY;
>
> We can of course kill the PMU event filter for
> FIXED_COUNTERS_ONLY. The filter is effectively no-op with
> FIXED_COUNTERS_ONLY and I don't think that consistency matters much.
We shouldn't allow weird combinations in the UAPI. Since it makes no
sense to have both fixed-function *and* filters, we should make them
mutually exclusive.
[...]
> >> I expect migration will be handled with the conventional register
> >> getters and setters, but please share if you have a concern.
> >
> > At the very least I want to see some documentation explaining that.
>
> What kind of documentation do you expect?
A description of what counters are exposed by this feature, and what
architectural features they are dependent on.
> If we change kvm_vcpu_load_pmu() to avoid for_each_set_bit(), there
> would be a good chance to forget updating it when mechanically
> updating existing for_each_set_bit() instances, so it is a candidate
> for documentation. But I don't have a good idea where to place it
> either.
The moment we introduce FEAT_PMUv3_ICNTR, the whole PMUv3 emulation
will catch fire anyway, as we already limit ourselves to 32 bits for
reset and nesting switch.
So this will be a major redesign anyway. If you are really worried,
leave a comment in there.
M.
--
Without deviation from the norm, progress is not possible.
next prev parent reply other threads:[~2026-04-20 13:53 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-18 8:14 [PATCH v7 0/4] KVM: arm64: PMU: Use multiple host PMUs Akihiko Odaki
2026-04-18 8:14 ` [PATCH v7 1/4] KVM: arm64: PMU: Add kvm_pmu_enabled_counter_mask() Akihiko Odaki
2026-04-19 14:13 ` Marc Zyngier
2026-04-20 5:27 ` Akihiko Odaki
2026-04-18 8:14 ` [PATCH v7 2/4] KVM: arm64: PMU: Protect the list of PMUs with RCU Akihiko Odaki
2026-04-19 14:34 ` Marc Zyngier
2026-04-20 6:21 ` Akihiko Odaki
2026-04-20 7:01 ` Marc Zyngier
2026-04-20 7:17 ` Akihiko Odaki
2026-04-18 8:14 ` [PATCH v7 3/4] KVM: arm64: PMU: Introduce FIXED_COUNTERS_ONLY Akihiko Odaki
2026-04-19 17:19 ` Marc Zyngier
2026-04-20 8:36 ` Akihiko Odaki
2026-04-20 9:51 ` Marc Zyngier
2026-04-20 12:07 ` Akihiko Odaki
2026-04-20 13:53 ` Marc Zyngier [this message]
2026-04-22 7:02 ` Akihiko Odaki
2026-04-18 8:14 ` [PATCH v7 4/4] KVM: arm64: selftests: Test PMU_V3_FIXED_COUNTERS_ONLY Akihiko Odaki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=86ldeh20xg.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=catalin.marinas@arm.com \
--cc=corbet@lwn.net \
--cc=devel@daynix.com \
--cc=gustavoars@kernel.org \
--cc=joey.gouly@arm.com \
--cc=kees@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-hardening@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=odaki@rsg.ci.i.u-tokyo.ac.jp \
--cc=oupton@kernel.org \
--cc=pbonzini@redhat.com \
--cc=shuah@kernel.org \
--cc=suzuki.poulose@arm.com \
--cc=will@kernel.org \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.