public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Marc Zyngier <maz@kernel.org>
To: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Cc: Oliver Upton <oupton@kernel.org>, Joey Gouly <joey.gouly@arm.com>,
	Suzuki K Poulose <suzuki.poulose@arm.com>,
	Zenghui Yu <yuzenghui@huawei.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Kees Cook <kees@kernel.org>,
	"Gustavo A. R. Silva" <gustavoars@kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Jonathan Corbet <corbet@lwn.net>, Shuah Khan <shuah@kernel.org>,
	linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
	linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org,
	devel@daynix.com, kvm@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v7 3/4] KVM: arm64: PMU: Introduce FIXED_COUNTERS_ONLY
Date: Mon, 20 Apr 2026 14:53:15 +0100	[thread overview]
Message-ID: <86ldeh20xg.wl-maz@kernel.org> (raw)
In-Reply-To: <ad44c69e-2f99-4f31-81b4-faae52eea080@rsg.ci.i.u-tokyo.ac.jp>

On Mon, 20 Apr 2026 13:07:15 +0100,
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> wrote:
> 
> On 2026/04/20 18:51, Marc Zyngier wrote:
> > On Mon, 20 Apr 2026 09:36:16 +0100,
> > Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> wrote:
> >> 
> >> On 2026/04/20 2:19, Marc Zyngier wrote:
> >>> On Sat, 18 Apr 2026 09:14:25 +0100,
> >>> Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> wrote:
> >>>> 
> >>>> On a heterogeneous arm64 system, KVM's PMU emulation is based on the
> >>>> features of a single host PMU instance. When a vCPU is migrated to a
> >>>> pCPU with an incompatible PMU, counters such as PMCCNTR_EL0 stop
> >>>> incrementing.
> >>>> 
> >>>> Although this behavior is permitted by the architecture, Windows does
> >>>> not handle it gracefully and may crash with a division-by-zero error.
> >>>> 
> >>>> The current workaround requires VMMs to pin vCPUs to a set of pCPUs
> >>>> that share a compatible PMU. This is difficult to implement correctly in
> >>>> QEMU/libvirt, where pinning occurs after vCPU initialization, and it
> >>>> also restricts the guest to a subset of available pCPUs.
> >>>> 
> >>>> Introduce the KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY attribute to
> >>>> create a "fixed-counters-only" PMU. When set, KVM exposes a PMU that is
> >>>> compatible with all pCPUs but that does not support programmable
> >>>> event counters which may have different feature sets on different PMUs.
> >>>> 
> >>>> This allows Windows guests to run reliably on heterogeneous systems
> >>>> without crashing, even without vCPU pinning, and enables VMMs to
> >>>> schedule vCPUs across all available pCPUs, making full use of the host
> >>>> hardware.
> >>>> 
> >>>> Much like KVM_ARM_VCPU_PMU_V3_IRQ and other read-write attributes, this
> >>>> attribute provides a getter that facilitates kernel and userspace
> >>>> debugging/testing.
> >>> 
> >>> OK, so that's the sales pitch. But how is it implemented? I would like
> >>> to be able to read a high-level description of the implementation
> >>> trade-offs.
> >> 
> >> Implementation-wise it is very trivial. Essentially the following
> >> addition in kvm_arm_pmu_v3_get_attr() is the entire implementation:
> >> +	case KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY:
> >> +		if (test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY,
> >> &vcpu->kvm->arch.flags))
> >> +			return 0;
> >> 
> >> Both its functionality and code complexity is trivial. So we can argue that:
> >> - the functionality is too trivial to be useful or
> >> - the interface/implementation complexity is so trivial that it does not
> >>    incur maintenance burden
> >> 
> >> In this case the selftest uses the getter so I was more inclined to
> >> have it, but adding one just for the selftest sounds too ad-hoc, so
> >> here I looked into other attributes to ensure that it was not
> >> introducing inconsistency with existing interfaces.
> >> 
> >> As the result, I found there are other read-write attributes; in fact
> >> there are more read-write attributes than write-only ones.
> > 
> > You're completely missing the point. I'm referring to the whole of the
> > commit message, which is more of a marketing slide than a technical
> > description.
> 
> In terms of implementation, the obvious tradeoff is that it adds more
> code to implement the feature. One thing to note is that
> kvm_vcpu_load_pmu() is added and is called each time a vCPU migrates
> across pCPUs. The heavy part, making the KVM_REQ_RELOAD_PMU request,
> only happens when the feature is enabled.

Well, that's what I want to see. The repeated blurb about Windows
being broken is cover letter material, but not fir for a commit
message.

[...]

> >>> "for the first time" gives the impression that it will work if you try
> >>> again. I'd rather we say that "This feature is incompatible with the
> >>> existence of a PMU event filter".
> >> 
> >> The following sequence will work:
> >> 1. Set KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY
> >> 2. Set KVM_ARM_VCPU_PMU_V3_FILTER
> >> 3. Set KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY
> >> 
> >> This is to make the behavior conistent with KVM_ARM_VCPU_PMU_V3_SET_PMU.
> > 
> > I don't think this is correct. Filtering is completely at odds with
> > this patch, and I don't want to have to reason about the combination.
> 
> kvm_arm_pmu_v3_set_pmu() has the following condition:
> 
> if (kvm_vm_has_ran_once(kvm) ||
>     (kvm->arch.pmu_filter && kvm->arch.arm_pmu != arm_pmu)) {
> 	ret = -EBUSY;
> 	break;
> }
> 
> kvm_arm_pmu_v3_set_pmu_fixed_counters_only() has the corresponding
> condition for consistency:
> 
> if (kvm_vm_has_ran_once(kvm) ||
>     (kvm->arch.pmu_filter &&
>      !test_bit(KVM_ARCH_FLAG_PMU_V3_FIXED_COUNTERS_ONLY,
> 	       &kvm->arch.flags)))
> 	return -EBUSY;
> 
> We can of course kill the PMU event filter for
> FIXED_COUNTERS_ONLY. The filter is effectively no-op with
> FIXED_COUNTERS_ONLY and I don't think that consistency matters much.

We shouldn't allow weird combinations in the UAPI. Since it makes no
sense to have both fixed-function *and* filters, we should make them
mutually exclusive.

[...]

> >> I expect migration will be handled with the conventional register
> >> getters and setters, but please share if you have a concern.
> > 
> > At the very least I want to see some documentation explaining that.
> 
> What kind of documentation do you expect?

A description of what counters are exposed by this feature, and what
architectural features they are dependent on.

> If we change kvm_vcpu_load_pmu() to avoid for_each_set_bit(), there
> would be a good chance to forget updating it when mechanically
> updating existing for_each_set_bit() instances, so it is a candidate
> for documentation. But I don't have a good idea where to place it
> either.

The moment we introduce FEAT_PMUv3_ICNTR, the whole PMUv3 emulation
will catch fire anyway, as we already limit ourselves to 32 bits for
reset and nesting switch.

So this will be a major redesign anyway. If you are really worried,
leave a comment in there.

	M.

-- 
Without deviation from the norm, progress is not possible.

  reply	other threads:[~2026-04-20 13:53 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-18  8:14 [PATCH v7 0/4] KVM: arm64: PMU: Use multiple host PMUs Akihiko Odaki
2026-04-18  8:14 ` [PATCH v7 1/4] KVM: arm64: PMU: Add kvm_pmu_enabled_counter_mask() Akihiko Odaki
2026-04-19 14:13   ` Marc Zyngier
2026-04-20  5:27     ` Akihiko Odaki
2026-04-18  8:14 ` [PATCH v7 2/4] KVM: arm64: PMU: Protect the list of PMUs with RCU Akihiko Odaki
2026-04-19 14:34   ` Marc Zyngier
2026-04-20  6:21     ` Akihiko Odaki
2026-04-20  7:01       ` Marc Zyngier
2026-04-20  7:17         ` Akihiko Odaki
2026-04-18  8:14 ` [PATCH v7 3/4] KVM: arm64: PMU: Introduce FIXED_COUNTERS_ONLY Akihiko Odaki
2026-04-19 17:19   ` Marc Zyngier
2026-04-20  8:36     ` Akihiko Odaki
2026-04-20  9:51       ` Marc Zyngier
2026-04-20 12:07         ` Akihiko Odaki
2026-04-20 13:53           ` Marc Zyngier [this message]
2026-04-18  8:14 ` [PATCH v7 4/4] KVM: arm64: selftests: Test PMU_V3_FIXED_COUNTERS_ONLY Akihiko Odaki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86ldeh20xg.wl-maz@kernel.org \
    --to=maz@kernel.org \
    --cc=catalin.marinas@arm.com \
    --cc=corbet@lwn.net \
    --cc=devel@daynix.com \
    --cc=gustavoars@kernel.org \
    --cc=joey.gouly@arm.com \
    --cc=kees@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kvmarm@lists.linux.dev \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=odaki@rsg.ci.i.u-tokyo.ac.jp \
    --cc=oupton@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=shuah@kernel.org \
    --cc=suzuki.poulose@arm.com \
    --cc=will@kernel.org \
    --cc=yuzenghui@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox