From: Marc Zyngier <maz@kernel.org>
To: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Cc: Oliver Upton <oupton@kernel.org>, Joey Gouly <joey.gouly@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Kees Cook <kees@kernel.org>,
"Gustavo A. R. Silva" <gustavoars@kernel.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Jonathan Corbet <corbet@lwn.net>, Shuah Khan <shuah@kernel.org>,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org,
devel@daynix.com, kvm@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v7 3/4] KVM: arm64: PMU: Introduce FIXED_COUNTERS_ONLY
Date: Mon, 20 Apr 2026 14:53:15 +0100 [thread overview]
Message-ID: <86ldeh20xg.wl-maz@kernel.org> (raw)
In-Reply-To: <ad44c69e-2f99-4f31-81b4-faae52eea080@rsg.ci.i.u-tokyo.ac.jp>
On Mon, 20 Apr 2026 13:07:15 +0100,
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> wrote:
>
> On 2026/04/20 18:51, Marc Zyngier wrote:
> > On Mon, 20 Apr 2026 09:36:16 +0100,
> > Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> wrote:
> >>
> >> On 2026/04/20 2:19, Marc Zyngier wrote:
> >>> On Sat, 18 Apr 2026 09:14:25 +0100,
> >>> Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> wrote:
> >>>>
> >>>> On a heterogeneous arm64 system, KVM's PMU emulation is based on the
> >>>> features of a single host PMU instance. When a vCPU is migrated to a
> >>>> pCPU with an incompatible PMU, counters such as PMCCNTR_EL0 stop
> >>>> incrementing.
> >>>>
> >>>> Although this behavior is permitted by the architecture, Windows does
> >>>> not handle it gracefully and may crash with a division-by-zero error.
> >>>>
> >>>> The current workaround requires VMMs to pin vCPUs to a set of pCPUs
> >>>> that share a compatible PMU. This is difficult to implement correctly in
> >>>> QEMU/libvirt, where pinning occurs after vCPU initialization, and it
> >>>> also restricts the guest to a subset of available pCPUs.
> >>>>
> >>>> Introduce the KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY attribute to
> >>>> create a "fixed-counters-only" PMU. When set, KVM exposes a PMU that is
> >>>> compatible with all pCPUs but that does not support programmable
> >>>> event counters which may have different feature sets on different PMUs.
> >>>>
> >>>> This allows Windows guests to run reliably on heterogeneous systems
> >>>> without crashing, even without vCPU pinning, and enables VMMs to
> >>>> schedule vCPUs across all available pCPUs, making full use of the host
> >>>> hardware.
> >>>>
> >>>> Much like KVM_ARM_VCPU_PMU_V3_IRQ and other read-write attributes, this
> >>>> attribute provides a getter that facilitates kernel and userspace
> >>>> debugging/testing.
> >>>
> >>> OK, so that's the sales pitch. But how is it implemented? I would like
> >>> to be able to read a high-level description of the implementation
> >>> trade-offs.
> >>
> >> Implementation-wise it is very trivial. Essentially the following
> >> addition in kvm_arm_pmu_v3_get_attr() is the entire implementation:
> >> + case KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY:
> >> + if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY,
> >> &vcpu->kvm->arch.flags))
> >> + return 0;
> >>
> >> Both its functionality and code complexity is trivial. So we can argue that:
> >> - the functionality is too trivial to be useful or
> >> - the interface/implementation complexity is so trivial that it does not
> >> incur maintenance burden
> >>
> >> In this case the selftest uses the getter so I was more inclined to
> >> have it, but adding one just for the selftest sounds too ad-hoc, so
> >> here I looked into other attributes to ensure that it was not
> >> introducing inconsistency with existing interfaces.
> >>
> >> As the result, I found there are other read-write attributes; in fact
> >> there are more read-write attributes than write-only ones.
> >
> > You're completely missing the point. I'm referring to the whole of the
> > commit message, which is more of a marketing slide than a technical
> > description.
>
> In terms of implementation, the obvious tradeoff is that it adds more
> code to implement the feature. One thing to note is that
> kvm_vcpu_load_pmu() is added and is called each time a vCPU migrates
> across pCPUs. The heavy part, making the KVM_REQ_RELOAD_PMU request,
> only happens when the feature is enabled.
Well, that's what I want to see. The repeated blurb about Windows
being broken is cover letter material, but not fir for a commit
message.
[...]
> >>> "for the first time" gives the impression that it will work if you try
> >>> again. I'd rather we say that "This feature is incompatible with the
> >>> existence of a PMU event filter".
> >>
> >> The following sequence will work:
> >> 1. Set KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY
> >> 2. Set KVM_ARM_VCPU_PMU_V3_FILTER
> >> 3. Set KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY
> >>
> >> This is to make the behavior conistent with KVM_ARM_VCPU_PMU_V3_SET_PMU.
> >
> > I don't think this is correct. Filtering is completely at odds with
> > this patch, and I don't want to have to reason about the combination.
>
> kvm_arm_pmu_v3_set_pmu() has the following condition:
>
> if (kvm_vm_has_ran_once(kvm) ||
> (kvm->arch.pmu_filter && kvm->arch.arm_pmu != arm_pmu)) {
> ret = -EBUSY;
> break;
> }
>
> kvm_arm_pmu_v3_set_pmu_fixed_counters_only() has the corresponding
> condition for consistency:
>
> if (kvm_vm_has_ran_once(kvm) ||
> (kvm->arch.pmu_filter &&
> !test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY,
> &kvm->arch.flags)))
> return -EBUSY;
>
> We can of course kill the PMU event filter for
> FIXED_COUNTERS_ONLY. The filter is effectively no-op with
> FIXED_COUNTERS_ONLY and I don't think that consistency matters much.
We shouldn't allow weird combinations in the UAPI. Since it makes no
sense to have both fixed-function *and* filters, we should make them
mutually exclusive.
[...]
> >> I expect migration will be handled with the conventional register
> >> getters and setters, but please share if you have a concern.
> >
> > At the very least I want to see some documentation explaining that.
>
> What kind of documentation do you expect?
A description of what counters are exposed by this feature, and what
architectural features they are dependent on.
> If we change kvm_vcpu_load_pmu() to avoid for_each_set_bit(), there
> would be a good chance to forget updating it when mechanically
> updating existing for_each_set_bit() instances, so it is a candidate
> for documentation. But I don't have a good idea where to place it
> either.
The moment we introduce FEAT_PMUv3_ICNTR, the whole PMUv3 emulation
will catch fire anyway, as we already limit ourselves to 32 bits for
reset and nesting switch.
So this will be a major redesign anyway. If you are really worried,
leave a comment in there.
M.
--
Without deviation from the norm, progress is not possible.
next prev parent reply other threads:[~2026-04-20 13:53 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-18 8:14 [PATCH v7 0/4] KVM: arm64: PMU: Use multiple host PMUs Akihiko Odaki
2026-04-18 8:14 ` [PATCH v7 1/4] KVM: arm64: PMU: Add kvm_pmu_enabled_counter_mask() Akihiko Odaki
2026-04-19 14:13 ` Marc Zyngier
2026-04-20 5:27 ` Akihiko Odaki
2026-04-18 8:14 ` [PATCH v7 2/4] KVM: arm64: PMU: Protect the list of PMUs with RCU Akihiko Odaki
2026-04-19 14:34 ` Marc Zyngier
2026-04-20 6:21 ` Akihiko Odaki
2026-04-20 7:01 ` Marc Zyngier
2026-04-20 7:17 ` Akihiko Odaki
2026-04-18 8:14 ` [PATCH v7 3/4] KVM: arm64: PMU: Introduce FIXED_COUNTERS_ONLY Akihiko Odaki
2026-04-19 17:19 ` Marc Zyngier
2026-04-20 8:36 ` Akihiko Odaki
2026-04-20 9:51 ` Marc Zyngier
2026-04-20 12:07 ` Akihiko Odaki
2026-04-20 13:53 ` Marc Zyngier [this message]
2026-04-18 8:14 ` [PATCH v7 4/4] KVM: arm64: selftests: Test PMU_V3_FIXED_COUNTERS_ONLY Akihiko Odaki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=86ldeh20xg.wl-maz@kernel.org \
--to=maz@kernel.org \
--cc=catalin.marinas@arm.com \
--cc=corbet@lwn.net \
--cc=devel@daynix.com \
--cc=gustavoars@kernel.org \
--cc=joey.gouly@arm.com \
--cc=kees@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-hardening@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=odaki@rsg.ci.i.u-tokyo.ac.jp \
--cc=oupton@kernel.org \
--cc=pbonzini@redhat.com \
--cc=shuah@kernel.org \
--cc=suzuki.poulose@arm.com \
--cc=will@kernel.org \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox