From: Oliver Upton <oliver.upton@linux.dev>
To: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
Cc: Marc Zyngier <maz@kernel.org>, Joey Gouly <joey.gouly@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Kees Cook <kees@kernel.org>,
"Gustavo A. R. Silva" <gustavoars@kernel.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Jonathan Corbet <corbet@lwn.net>, Shuah Khan <shuah@kernel.org>,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org,
devel@daynix.com, kvm@vger.kernel.org, linux-doc@vger.kernel.org,
linux-kselftest@vger.kernel.org
Subject: Re: [PATCH RFC v2 1/2] KVM: arm64: PMU: Introduce KVM_ARM_VCPU_PMU_V3_COMPOSITION
Date: Wed, 6 Aug 2025 10:20:55 -0700 [thread overview]
Message-ID: <aJOO99xUrhsrvLwl@linux.dev> (raw)
In-Reply-To: <20250806-hybrid-v2-1-0661aec3af8c@rsg.ci.i.u-tokyo.ac.jp>
Hi Akihiko,
This is an unreasonably large patch that needs to be broken down into
smaller patches, ideally one functional change per patch. We need this
even for an RFC for the sake of reviews.
On Wed, Aug 06, 2025 at 06:09:54PM +0900, Akihiko Odaki wrote:
> +static u64 kvm_pmu_get_pmc_value(struct kvm_vcpu *vcpu, u8 idx)
> {
> - struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
> + struct kvm_pmc *pmc = *kvm_vcpu_idx_to_pmc(vcpu, idx);
> u64 counter, reg, enabled, running;
> + unsigned int i;
>
> - reg = counter_index_to_reg(pmc->idx);
> + reg = counter_index_to_reg(idx);
> counter = __vcpu_sys_reg(vcpu, reg);
>
> /*
> * The real counter value is equal to the value of counter register plus
> * the value perf event counts.
> */
> - if (pmc->perf_event)
> - counter += perf_event_read_value(pmc->perf_event, &enabled,
> - &running);
> + if (pmc)
> + for (i = 0; i < pmc->nr_perf_events; i++)
> + counter += perf_event_read_value(pmc->perf_events[i],
> + &enabled, &running);
I'm concerned that this array of events concept you're introducing is
going to be error-prone. An approach that reallocates a new PMU event in
the case of a vCPU migrating to a new PMU implementation would be
desirable.
> +static void reset_sample_period(struct perf_event *perf_event)
> +{
> + struct kvm_pmc **pmc = perf_event->overflow_handler_context;
> + struct kvm_vcpu *vcpu = kvm_pmc_to_vcpu(pmc);
> + struct arm_pmu *cpu_pmu = to_arm_pmu(perf_event->pmu);
> + u64 period;
> +
> + cpu_pmu->pmu.stop(perf_event, PERF_EF_UPDATE);
> +
> + /*
> + * Reset the sample period to the architectural limit,
> + * i.e. the point where the counter overflows.
> + */
> + period = compute_period(pmc, kvm_pmu_get_pmc_value(vcpu, (*pmc)->idx));
> +
> + local64_set(&perf_event->hw.period_left, 0);
> + perf_event->attr.sample_period = period;
> + perf_event->hw.sample_period = period;
> +
> + cpu_pmu->pmu.start(perf_event, PERF_EF_RELOAD);
> +}
No, we can't start calling into the internal driver interfaces. The fact
that we have a pointer to the PMU is an ugly hack and shouldn't be used
like this.
> @@ -725,8 +729,8 @@ static void kvm_pmu_create_perf_event(struct kvm_pmc *pmc)
> attr.type = arm_pmu->pmu.type;
> attr.size = sizeof(attr);
> attr.pinned = 1;
> - attr.disabled = !kvm_pmu_counter_is_enabled(pmc);
> - attr.exclude_user = !kvm_pmc_counts_at_el0(pmc);
> + attr.disabled = !kvm_pmu_counter_is_enabled(vcpu, (*pmc)->idx);
> + attr.exclude_user = !kvm_pmc_counts_at_el0(vcpu, (*pmc)->idx);
> attr.exclude_hv = 1; /* Don't count EL2 events */
> attr.exclude_host = 1; /* Don't count host events */
> attr.config = eventsel;
Can we just special-case the fixed CPU cycle counter to use
PERF_TYPE_HARDWARE / PERF_COUNT_HW_CPU_CYCLES? That _should_ have the
intended effect of opening an event on the PMU for this CPU.
> + /*
> + * If we have a filter in place and that the event isn't allowed, do
> + * not install a perf event either.
> + */
> + if (vcpu->kvm->arch.pmu_filter &&
> + !test_bit(eventsel, vcpu->kvm->arch.pmu_filter))
> + return;
> +
> + if (arm_pmu) {
> + *pmc = kvm_pmu_alloc_pmc(idx, 1);
> + if (!*pmc)
> + goto err;
> +
> + kvm_pmu_create_perf_event(pmc, arm_pmu, eventsel);
> + } else {
> + guard(mutex)(&arm_pmus_lock);
This is a system-wide lock, the need for which is eliminated if you go
for the reallocation approach I mention.
> +static int kvm_arm_pmu_v3_set_pmu_composition(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> + struct arm_pmu_entry *entry;
> + struct arm_pmu *arm_pmu;
> +
> + lockdep_assert_held(&kvm->arch.config_lock);
> +
> + if (kvm_vm_has_ran_once(kvm) ||
> + (kvm->arch.pmu_filter && !kvm->arch.nr_composed_host_pmus))
> + return -EBUSY;
I'm not sure there's much value in preventing the user from configuring
the PMU event filter. Even in the case of the fixed CPU cycle counter we
allow userspace to filter the event.
It is much more important to have mutual exclusion between this UAPI and
userspace explicitly selecting a PMU implementation.
> @@ -1223,6 +1328,8 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
>
> return kvm_arm_pmu_v3_set_nr_counters(vcpu, n);
> }
> + case KVM_ARM_VCPU_PMU_V3_COMPOSITION:
> + return kvm_arm_pmu_v3_set_pmu_composition(vcpu);
I'd prefer naming this something like 'KVM_ARM_VCPU_PMU_V3_FIXED_COUNTERS_ONLY'.
We will have the fixed instruction counter eventually which is another
event we could potentially provide system-wide.
Thanks,
Oliver
next prev parent reply other threads:[~2025-08-06 17:21 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-06 9:09 [PATCH RFC v2 0/2] KVM: arm64: PMU: Use multiple host PMUs Akihiko Odaki
2025-08-06 9:09 ` [PATCH RFC v2 1/2] KVM: arm64: PMU: Introduce KVM_ARM_VCPU_PMU_V3_COMPOSITION Akihiko Odaki
2025-08-06 17:20 ` Oliver Upton [this message]
2025-08-06 18:24 ` Akihiko Odaki
2025-08-07 2:03 ` Oliver Upton
2025-08-07 14:06 ` Akihiko Odaki
2025-08-08 23:08 ` Oliver Upton
2025-08-09 6:15 ` Akihiko Odaki
2025-08-06 9:09 ` [PATCH RFC v2 2/2] KVM: arm64: selftests: Test guest PMUv3 composition Akihiko Odaki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aJOO99xUrhsrvLwl@linux.dev \
--to=oliver.upton@linux.dev \
--cc=catalin.marinas@arm.com \
--cc=corbet@lwn.net \
--cc=devel@daynix.com \
--cc=gustavoars@kernel.org \
--cc=joey.gouly@arm.com \
--cc=kees@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-hardening@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=maz@kernel.org \
--cc=odaki@rsg.ci.i.u-tokyo.ac.jp \
--cc=pbonzini@redhat.com \
--cc=shuah@kernel.org \
--cc=suzuki.poulose@arm.com \
--cc=will@kernel.org \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).