From: Colton Lewis <coltonlewis@google.com>
To: kvm@vger.kernel.org
Cc: Alexandru Elisei <alexandru.elisei@arm.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Jonathan Corbet <corbet@lwn.net>,
Russell King <linux@armlinux.org.uk>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>, Marc Zyngier <maz@kernel.org>,
Oliver Upton <oliver.upton@linux.dev>,
Mingwei Zhang <mizhang@google.com>,
Joey Gouly <joey.gouly@arm.com>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Zenghui Yu <yuzenghui@huawei.com>,
Mark Rutland <mark.rutland@arm.com>,
Shuah Khan <shuah@kernel.org>,
Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>,
James Clark <james.clark@linaro.org>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
linux-perf-users@vger.kernel.org,
linux-kselftest@vger.kernel.org,
Colton Lewis <coltonlewis@google.com>
Subject: [PATCH v7 00/20] ARM64 PMU Partitioning
Date: Mon, 4 May 2026 21:17:53 +0000 [thread overview]
Message-ID: <20260504211813.1804997-1-coltonlewis@google.com> (raw)
This series creates a new PMU scheme on ARM, a partitioned PMU that
allows reserving a subset of counters for more direct guest access,
significantly reducing overhead. More details, including performance
benchmarks, can be read in the v1 cover letter linked below.
An overview of what this series accomplishes was presented at KVM
Forum 2025. Slides [1] and video [2] are linked below.
After a few false starts, meeting with Will Deacon and Mark Rutland to
discuss implementation ideas, and a few more false starts, I finally
have an implementation of dynamic counter reservation that works
without disrupting host perf too much. Now the host only loses access
to the guest counters when a vCPU resides on the CPU.
The key was creating perf_pmu_resched_update, which behaves exactly
like perf_pmu_resched except it takes a callback to call in between
when the perf events are scheduled out and when they are scheduled
back in. That allows us to update the PMU's available counters when we
know they are not currently in use without needing to expose private
perf core functions and triple check they are not being called in a
way that violates existing assumptions.
Because this introduces a possibility of perf reschedule during vCPU
load, I've optimized to only do that operation if there are host
events occupying the intended guest counters at the time of the load.
The kernel command line parameter for the driver still exists, but now
only defines an upper limit of counters the guest might use rather
than taking those counters from the host permanently.
v7:
* Implement dynamic counter reservation as described above. One side
effect is the PMUv3 driver now needs much fewer changes to enforce
the boundary.
* Move register accesses out of fast path for non-FGT hardware. The
performance impact was negligible and this moves bloat out of the
fast path and allows a more reliable design with more code sharing.
* Make PMCCNTR a special case in the context swap again because trying
to access it with PMXEVCNTR is undefined.
* Fix a bug where kvm_pmu_guest_counter_mask was using & instead of |.
* Re-expose the dedicated instruction counter to the host since it was
decided the guest will not own it.
* Change the global armv8pmu_reserved_host_counters to
armv8pmu_is_partitoned because it was only used in boolean checks.
* Fix typo in vcpu attribute commit so the spelling of the flag in the
commit message matches the code.
* Rebase to v7.0-rc7
v6:
https://lore.kernel.org/kvmarm/20260209221414.2169465-1-coltonlewis@google.com/
v5:
https://lore.kernel.org/kvmarm/20251209205121.1871534-1-coltonlewis@google.com/
v4:
https://lore.kernel.org/kvmarm/20250714225917.1396543-1-coltonlewis@google.com/
v3:
https://lore.kernel.org/kvm/20250626200459.1153955-1-coltonlewis@google.com/
v2:
https://lore.kernel.org/kvm/20250620221326.1261128-1-coltonlewis@google.com/
v1:
https://lore.kernel.org/kvm/20250602192702.2125115-1-coltonlewis@google.com/
[1] https://gitlab.com/qemu-project/kvm-forum/-/raw/main/_attachments/2025/Optimizing__itvHkhc.pdf
[2] https://www.youtube.com/watch?v=YRzZ8jMIA6M&list=PLW3ep1uCIRfxwmllXTOA2txfDWN6vUOHp&index=9
Colton Lewis (19):
arm64: cpufeature: Add cpucap for HPMN0
KVM: arm64: Reorganize PMU functions
perf: arm_pmuv3: Generalize counter bitmasks
perf: arm_pmuv3: Check cntr_mask before using pmccntr
perf: arm_pmuv3: Add method to partition the PMU
KVM: arm64: Set up FGT for Partitioned PMU
KVM: arm64: Add Partitioned PMU register trap handlers
KVM: arm64: Set up MDCR_EL2 to handle a Partitioned PMU
KVM: arm64: Context swap Partitioned PMU guest registers
KVM: arm64: Enforce PMU event filter at vcpu_load()
perf: Add perf_pmu_resched_update()
KVM: arm64: Apply dynamic guest counter reservations
KVM: arm64: Implement lazy PMU context swaps
perf: arm_pmuv3: Handle IRQs for Partitioned PMU guest counters
KVM: arm64: Detect overflows for the Partitioned PMU
KVM: arm64: Add vCPU device attr to partition the PMU
KVM: selftests: Add find_bit to KVM library
KVM: arm64: selftests: Add test case for Partitioned PMU
KVM: arm64: selftests: Relax testing for exceptions when partitioned
Marc Zyngier (1):
KVM: arm64: Reorganize PMU includes
arch/arm/include/asm/arm_pmuv3.h | 18 +
arch/arm64/include/asm/arm_pmuv3.h | 12 +-
arch/arm64/include/asm/kvm_host.h | 17 +-
arch/arm64/include/asm/kvm_types.h | 6 +-
arch/arm64/include/uapi/asm/kvm.h | 2 +
arch/arm64/kernel/cpufeature.c | 8 +
arch/arm64/kvm/Makefile | 2 +-
arch/arm64/kvm/arm.c | 2 +
arch/arm64/kvm/config.c | 41 +-
arch/arm64/kvm/debug.c | 31 +-
arch/arm64/kvm/pmu-direct.c | 494 ++++++++++++
arch/arm64/kvm/pmu-emul.c | 674 +----------------
arch/arm64/kvm/pmu.c | 701 ++++++++++++++++++
arch/arm64/kvm/sys_regs.c | 250 ++++++-
arch/arm64/tools/cpucaps | 1 +
arch/arm64/tools/sysreg | 6 +-
drivers/perf/arm_pmuv3.c | 111 ++-
include/kvm/arm_pmu.h | 110 +++
include/linux/perf/arm_pmu.h | 3 +
include/linux/perf/arm_pmuv3.h | 14 +-
include/linux/perf_event.h | 3 +
kernel/events/core.c | 28 +-
tools/testing/selftests/kvm/Makefile.kvm | 1 +
.../selftests/kvm/arm64/vpmu_counter_access.c | 112 ++-
tools/testing/selftests/kvm/lib/find_bit.c | 1 +
25 files changed, 1861 insertions(+), 787 deletions(-)
create mode 100644 arch/arm64/kvm/pmu-direct.c
create mode 100644 tools/testing/selftests/kvm/lib/find_bit.c
base-commit: 591cd656a1bf5ea94a222af5ef2ee76df029c1d2
--
2.54.0.545.g6539524ca2-goog
next reply other threads:[~2026-05-04 21:18 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-04 21:17 Colton Lewis [this message]
2026-05-04 21:17 ` [PATCH v7 01/20] arm64: cpufeature: Add cpucap for HPMN0 Colton Lewis
2026-05-04 21:17 ` [PATCH v7 02/20] KVM: arm64: Reorganize PMU includes Colton Lewis
2026-05-04 21:17 ` [PATCH v7 03/20] KVM: arm64: Reorganize PMU functions Colton Lewis
2026-05-04 21:17 ` [PATCH v7 04/20] perf: arm_pmuv3: Generalize counter bitmasks Colton Lewis
2026-05-04 21:17 ` [PATCH v7 05/20] perf: arm_pmuv3: Check cntr_mask before using pmccntr Colton Lewis
2026-05-04 21:17 ` [PATCH v7 06/20] perf: arm_pmuv3: Add method to partition the PMU Colton Lewis
2026-05-04 21:18 ` [PATCH v7 07/20] KVM: arm64: Set up FGT for Partitioned PMU Colton Lewis
2026-05-04 21:18 ` [PATCH v7 08/20] KVM: arm64: Add Partitioned PMU register trap handlers Colton Lewis
2026-05-04 21:18 ` [PATCH v7 09/20] KVM: arm64: Set up MDCR_EL2 to handle a Partitioned PMU Colton Lewis
2026-05-04 21:18 ` [PATCH v7 10/20] KVM: arm64: Context swap Partitioned PMU guest registers Colton Lewis
2026-05-04 21:18 ` [PATCH v7 11/20] KVM: arm64: Enforce PMU event filter at vcpu_load() Colton Lewis
2026-05-04 21:18 ` [PATCH v7 12/20] perf: Add perf_pmu_resched_update() Colton Lewis
2026-05-04 21:18 ` [PATCH v7 13/20] KVM: arm64: Apply dynamic guest counter reservations Colton Lewis
2026-05-04 21:18 ` [PATCH v7 14/20] KVM: arm64: Implement lazy PMU context swaps Colton Lewis
2026-05-04 21:18 ` [PATCH v7 15/20] perf: arm_pmuv3: Handle IRQs for Partitioned PMU guest counters Colton Lewis
2026-05-04 21:18 ` [PATCH v7 16/20] KVM: arm64: Detect overflows for the Partitioned PMU Colton Lewis
2026-05-04 21:18 ` [PATCH v7 17/20] KVM: arm64: Add vCPU device attr to partition the PMU Colton Lewis
2026-05-04 21:18 ` [PATCH v7 18/20] KVM: selftests: Add find_bit to KVM library Colton Lewis
2026-05-04 21:18 ` [PATCH v7 19/20] KVM: arm64: selftests: Add test case for Partitioned PMU Colton Lewis
2026-05-04 21:18 ` [PATCH v7 20/20] KVM: arm64: selftests: Relax testing for exceptions when partitioned Colton Lewis
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260504211813.1804997-1-coltonlewis@google.com \
--to=coltonlewis@google.com \
--cc=alexandru.elisei@arm.com \
--cc=catalin.marinas@arm.com \
--cc=corbet@lwn.net \
--cc=gankulkarni@os.amperecomputing.com \
--cc=james.clark@linaro.org \
--cc=joey.gouly@arm.com \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-perf-users@vger.kernel.org \
--cc=linux@armlinux.org.uk \
--cc=mark.rutland@arm.com \
--cc=maz@kernel.org \
--cc=mizhang@google.com \
--cc=oliver.upton@linux.dev \
--cc=pbonzini@redhat.com \
--cc=shuah@kernel.org \
--cc=suzuki.poulose@arm.com \
--cc=will@kernel.org \
--cc=yuzenghui@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox