* [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code
@ 2026-02-03 21:43 Ben Horgan
2026-02-03 21:43 ` [PATCH v4 01/41] arm64/sysreg: Add MPAMSM_EL1 register Ben Horgan
` (42 more replies)
0 siblings, 43 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc
This new version of the mpam missing pieces series has a few significant
changes in the mpam driver part of the series. The heuristics for deciding
if features should be exposed are tightened. This is to fix some
inaccuracies and avoid overcommitting before needed - shout if this changes
anything on your platform. The final patch adds documentation which
explains which features you should expect. The ABMC emulation is dropped
for the moment as supporting it for MPAM requires resctrl changes to avoid
breaking the ABI. The default 5% gap for min_bw is dropped in favour of a
simple default (kept for grace). The series is based on x86/resctrl [1] as
resctrl has telemetry patches queued which change the arch interface.
Fixes that are in 6.19-rc8 are dropped from the series but
b9f5c38e4af1 ("arm_mpam: Use non-atomic bitops when modifying feature bitmap")
is required to avoid an alignment fault in the kunit tests.
Thank you for all the testing and reviewing so far! It all helps.
Changelogs in patches
From James' cover letter:
This is the missing piece to make MPAM usable via resctrl in user-space. This has
shed its debugfs code and the read/write 'event configuration' for the monitors
to make the series smaller.
This adds the arch code and KVM support first. I anticipate the whole thing
going via arm64, but if it goes via tip instead, then an immutable branch with those
patches should be easy to do.
Generally the resctrl glue code works by picking what MPAM features it can expose
from the MPAM driver, then configuring the structs that back the resctrl helpers.
If your platform is sufficiently Xeon shaped, you should be able to get L2/L3 CPOR
bitmaps exposed via resctrl. CSU counters work if they are on/after the L3. MBWU
counters are considerably more hairy, and depend on heuristics around the topology,
and a bunch of stuff trying to emulate ABMC.
If it didn't pick what you wanted it to, please share the debug messages produced
when enabling dynamic debug and booting with:
| dyndbg="file mpam_resctrl.c +pl"
I've not found a platform that can test all the behaviours around the monitors,
so this is where I'd expect the most bugs.
The MPAM spec that describes all the system and MMIO registers can be found here:
https://developer.arm.com/documentation/ddi0598/db/?lang=en
(Ignore the 'RETIRED' warning - that is just Arm moving the documentation around.
This document has the best overview)
Based on:
[1] git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/cache
(To include telemetry code which changes the resctrl arch interface)
The series can be retrieved from:
https://gitlab.arm.com/linux-arm/linux-bh.git mpam_resctrl_glue_v4
(Final commit is a fix already in 6.19-rc8)
v3 can be found at:
https://lore.kernel.org/linux-arm-kernel/20260112165914.4086692-1-ben.horgan@arm.com/
v2 can be found at:
https://lore.kernel.org/linux-arm-kernel/20251219181147.3404071-1-ben.horgan@arm.com/
rfc can be found at:
https://lore.kernel.org/linux-arm-kernel/20251205215901.17772-1-james.morse@arm.com/
Ben Horgan (10):
arm64/sysreg: Add MPAMSM_EL1 register
KVM: arm64: Preserve host MPAM configuration when changing traps
KVM: arm64: Make MPAMSM_EL1 accesses UNDEF
arm64: mpam: Drop the CONFIG_EXPERT restriction
arm64: mpam: Initialise and context switch the MPAMSM_EL1 register
KVM: arm64: Use kernel-space partid configuration for hypercalls
arm_mpam: resctrl: Add rmid index helpers
arm_mpam: resctrl: Add kunit test for rmid idx conversions
arm_mpam: resctrl: Wait for cacheinfo to be ready
arm64: mpam: Add initial MPAM documentation
Dave Martin (2):
arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats
arm_mpam: resctrl: Add kunit test for control format conversions
James Morse (25):
arm64: mpam: Context switch the MPAM registers
arm64: mpam: Re-initialise MPAM regs when CPU comes online
arm64: mpam: Advertise the CPUs MPAM limits to the driver
arm64: mpam: Add cpu_pm notifier to restore MPAM sysregs
arm64: mpam: Add helpers to change a task or cpu's MPAM PARTID/PMG
values
KVM: arm64: Force guest EL1 to use user-space's partid configuration
arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation
arm_mpam: resctrl: Sort the order of the domain lists
arm_mpam: resctrl: Pick the caches we will use as resctrl resources
arm_mpam: resctrl: Implement resctrl_arch_reset_all_ctrls()
arm_mpam: resctrl: Add resctrl_arch_get_config()
arm_mpam: resctrl: Implement helpers to update configuration
arm_mpam: resctrl: Add plumbing against arm64 task and cpu hooks
arm_mpam: resctrl: Add CDP emulation
arm_mpam: resctrl: Add support for 'MB' resource
arm_mpam: resctrl: Add support for csu counters
arm_mpam: resctrl: Pick classes for use as mbm counters
arm_mpam: resctrl: Pre-allocate free running monitors
arm_mpam: resctrl: Allow resctrl to allocate monitors
arm_mpam: resctrl: Add resctrl_arch_rmid_read() and
resctrl_arch_reset_rmid()
arm_mpam: resctrl: Update the rmid reallocation limit
arm_mpam: resctrl: Add empty definitions for assorted resctrl
functions
arm64: mpam: Select ARCH_HAS_CPU_RESCTRL
arm_mpam: resctrl: Call resctrl_init() on platforms that can support
resctrl
arm_mpam: Quirk CMN-650's CSU NRDY behaviour
Shanker Donthineni (4):
arm_mpam: Add quirk framework
arm_mpam: Add workaround for T241-MPAM-1
arm_mpam: Add workaround for T241-MPAM-4
arm_mpam: Add workaround for T241-MPAM-6
Documentation/arch/arm64/index.rst | 1 +
Documentation/arch/arm64/mpam.rst | 93 +
Documentation/arch/arm64/silicon-errata.rst | 9 +
arch/arm64/Kconfig | 6 +-
arch/arm64/include/asm/el2_setup.h | 3 +-
arch/arm64/include/asm/mpam.h | 96 +
arch/arm64/include/asm/resctrl.h | 2 +
arch/arm64/include/asm/thread_info.h | 3 +
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/cpufeature.c | 21 +-
arch/arm64/kernel/mpam.c | 62 +
arch/arm64/kernel/process.c | 7 +
arch/arm64/kvm/hyp/include/hyp/switch.h | 12 +-
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 9 +
arch/arm64/kvm/hyp/vhe/sysreg-sr.c | 13 +
arch/arm64/kvm/sys_regs.c | 2 +
arch/arm64/tools/sysreg | 8 +
drivers/resctrl/Kconfig | 9 +-
drivers/resctrl/Makefile | 1 +
drivers/resctrl/mpam_devices.c | 257 ++-
drivers/resctrl/mpam_internal.h | 105 +-
drivers/resctrl/mpam_resctrl.c | 1861 +++++++++++++++++++
drivers/resctrl/test_mpam_resctrl.c | 364 ++++
include/linux/arm_mpam.h | 32 +
24 files changed, 2949 insertions(+), 28 deletions(-)
create mode 100644 Documentation/arch/arm64/mpam.rst
create mode 100644 arch/arm64/include/asm/mpam.h
create mode 100644 arch/arm64/include/asm/resctrl.h
create mode 100644 arch/arm64/kernel/mpam.c
create mode 100644 drivers/resctrl/mpam_resctrl.c
create mode 100644 drivers/resctrl/test_mpam_resctrl.c
--
2.43.0
* [PATCH v4 01/41] arm64/sysreg: Add MPAMSM_EL1 register
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 02/41] KVM: arm64: Preserve host MPAM configuration when changing traps Ben Horgan
` (41 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
The MPAMSM_EL1 register determines the MPAM configuration for an SMCU. Add
the register definition.
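
For readers who want to see the layout as code, here is a minimal user-space sketch of packing and unpacking the two fields. The bit ranges (PMG_D at 47:40, PARTID_D at 31:16) are copied from the sysreg definition in this patch; the `MPAMSM_*` macro names and the `mpamsm_*()` helpers are hypothetical and exist only for this illustration. Kernel code would use the generated sysreg definitions instead.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions as given by the sysreg definition in this patch. */
#define MPAMSM_PARTID_D_SHIFT	16
#define MPAMSM_PARTID_D_MASK	(UINT64_C(0xffff) << MPAMSM_PARTID_D_SHIFT) /* 31:16 */
#define MPAMSM_PMG_D_SHIFT	40
#define MPAMSM_PMG_D_MASK	(UINT64_C(0xff) << MPAMSM_PMG_D_SHIFT)      /* 47:40 */

/* The two fields must not share bits; everything else is Res0. */
static_assert((MPAMSM_PARTID_D_MASK & MPAMSM_PMG_D_MASK) == 0,
	      "MPAMSM_EL1 fields overlap");

/* Hypothetical helper: build a register value, leaving Res0 bits clear. */
static inline uint64_t mpamsm_pack(uint16_t partid_d, uint8_t pmg_d)
{
	return ((uint64_t)partid_d << MPAMSM_PARTID_D_SHIFT) |
	       ((uint64_t)pmg_d << MPAMSM_PMG_D_SHIFT);
}

/* Hypothetical helper: extract PARTID_D from a register value. */
static inline uint16_t mpamsm_partid_d(uint64_t regval)
{
	return (uint16_t)((regval & MPAMSM_PARTID_D_MASK) >> MPAMSM_PARTID_D_SHIFT);
}

/* Hypothetical helper: extract PMG_D from a register value. */
static inline uint8_t mpamsm_pmg_d(uint64_t regval)
{
	return (uint8_t)((regval & MPAMSM_PMG_D_MASK) >> MPAMSM_PMG_D_SHIFT);
}
```

Note that, unlike the register defined just before it in the sysreg file, this one has no PARTID_I or PMG_I fields: bits 15:0 and 39:32 are Res0.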
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
arch/arm64/tools/sysreg | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/arm64/tools/sysreg b/arch/arm64/tools/sysreg
index 8921b51866d6..afbb55c9b038 100644
--- a/arch/arm64/tools/sysreg
+++ b/arch/arm64/tools/sysreg
@@ -5052,6 +5052,14 @@ Field 31:16 PARTID_D
Field 15:0 PARTID_I
EndSysreg
+Sysreg MPAMSM_EL1 3 0 10 5 3
+Res0 63:48
+Field 47:40 PMG_D
+Res0 39:32
+Field 31:16 PARTID_D
+Res0 15:0
+EndSysreg
+
Sysreg ISR_EL1 3 0 12 1 0
Res0 63:11
Field 10 IS
--
2.43.0
* [PATCH v4 02/41] KVM: arm64: Preserve host MPAM configuration when changing traps
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
2026-02-03 21:43 ` [PATCH v4 01/41] arm64/sysreg: Add MPAMSM_EL1 register Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 03/41] KVM: arm64: Make MPAMSM_EL1 accesses UNDEF Ben Horgan
` (40 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
When KVM enables or disables MPAM traps to EL2 it clears all other bits in
MPAM2_EL2. Notably, it clears the partition ids (PARTIDs) and performance
monitoring groups (PMGs). Avoid changing these bits in anticipation of
adding support for MPAM in the kernel. Otherwise, on a VHE system with the
host running at EL2 where MPAM2_EL2 and MPAM1_EL1 access the same register,
any attempt to use MPAM to monitor or partition resources for kernel space
would be foiled by running a KVM guest. Additionally, MPAM2_EL2.EnMPAMSM is
always set to 0 which causes MPAMSM_EL1 to always trap. Keep EnMPAMSM set
to 1 when not in a guest so that the kernel can use MPAMSM_EL1.
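
The change from a plain write to a clear/set update can be modelled in user-space as below. This is only a sketch: the `EX_*` bit positions are placeholders chosen for the example and are not taken from this patch, the TIDR handling is left out, and `clear_set()` models the read-modify-write as `(old & ~clr) | set` rather than reproducing the kernel macro.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions for this sketch only; not taken from the patch. */
#define EX_TRAPMPAM1EL1	(UINT64_C(1) << 48)
#define EX_TRAPMPAM0EL1	(UINT64_C(1) << 49)
#define EX_ENMPAMSM	(UINT64_C(1) << 50)

/* The bits that are set on guest entry and the bit that is cleared differ. */
static_assert(((EX_TRAPMPAM0EL1 | EX_TRAPMPAM1EL1) & EX_ENMPAMSM) == 0,
	      "clear and set masks overlap");

/* Model of a read-modify-write: bits outside clr and set are preserved. */
static inline uint64_t clear_set(uint64_t old, uint64_t clr, uint64_t set)
{
	return (old & ~clr) | set;
}

/* Entering a guest: trap the MPAM registers, disable MPAMSM_EL1 access. */
static inline uint64_t ex_activate_traps(uint64_t mpam2)
{
	return clear_set(mpam2, EX_ENMPAMSM, EX_TRAPMPAM0EL1 | EX_TRAPMPAM1EL1);
}

/* Leaving a guest: drop the traps, re-enable MPAMSM_EL1 for the host. */
static inline uint64_t ex_deactivate_traps(uint64_t mpam2)
{
	return clear_set(mpam2, EX_TRAPMPAM0EL1 | EX_TRAPMPAM1EL1, EX_ENMPAMSM);
}
```

With the old code the equivalent of `ex_deactivate_traps()` returned 0 regardless of its input, which is what discarded the host's PARTID and PMG fields on a VHE system.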
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
arch/arm64/kvm/hyp/include/hyp/switch.h | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index c5d5e5b86eaf..63195275a8b8 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -269,7 +269,8 @@ static inline void __deactivate_traps_hfgxtr(struct kvm_vcpu *vcpu)
static inline void __activate_traps_mpam(struct kvm_vcpu *vcpu)
{
- u64 r = MPAM2_EL2_TRAPMPAM0EL1 | MPAM2_EL2_TRAPMPAM1EL1;
+ u64 clr = MPAM2_EL2_EnMPAMSM;
+ u64 set = MPAM2_EL2_TRAPMPAM0EL1 | MPAM2_EL2_TRAPMPAM1EL1;
if (!system_supports_mpam())
return;
@@ -279,18 +280,21 @@ static inline void __activate_traps_mpam(struct kvm_vcpu *vcpu)
write_sysreg_s(MPAMHCR_EL2_TRAP_MPAMIDR_EL1, SYS_MPAMHCR_EL2);
} else {
/* From v1.1 TIDR can trap MPAMIDR, set it unconditionally */
- r |= MPAM2_EL2_TIDR;
+ set |= MPAM2_EL2_TIDR;
}
- write_sysreg_s(r, SYS_MPAM2_EL2);
+ sysreg_clear_set_s(SYS_MPAM2_EL2, clr, set);
}
static inline void __deactivate_traps_mpam(void)
{
+ u64 clr = MPAM2_EL2_TRAPMPAM0EL1 | MPAM2_EL2_TRAPMPAM1EL1 | MPAM2_EL2_TIDR;
+ u64 set = MPAM2_EL2_EnMPAMSM;
+
if (!system_supports_mpam())
return;
- write_sysreg_s(0, SYS_MPAM2_EL2);
+ sysreg_clear_set_s(SYS_MPAM2_EL2, clr, set);
if (system_supports_mpam_hcr())
write_sysreg_s(MPAMHCR_HOST_FLAGS, SYS_MPAMHCR_EL2);
--
2.43.0
* [PATCH v4 03/41] KVM: arm64: Make MPAMSM_EL1 accesses UNDEF
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
2026-02-03 21:43 ` [PATCH v4 01/41] arm64/sysreg: Add MPAMSM_EL1 register Ben Horgan
2026-02-03 21:43 ` [PATCH v4 02/41] KVM: arm64: Preserve host MPAM configuration when changing traps Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 04/41] arm64: mpam: Context switch the MPAM registers Ben Horgan
` (39 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
MPAMSM_EL1 controls the MPAM labeling for an SMCU (Streaming Mode
Compute Unit). As there is no MPAM support in KVM, make sure MPAMSM_EL1
accesses trigger an UNDEF.
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v2:
Remove paragraph from commit on allowed range of values
---
arch/arm64/kvm/sys_regs.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index c8fd7c6a12a1..72654ab984ee 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -3373,6 +3373,8 @@ static const struct sys_reg_desc sys_reg_descs[] = {
{ SYS_DESC(SYS_MPAM1_EL1), undef_access },
{ SYS_DESC(SYS_MPAM0_EL1), undef_access },
+ { SYS_DESC(SYS_MPAMSM_EL1), undef_access },
+
{ SYS_DESC(SYS_VBAR_EL1), access_rw, reset_val, VBAR_EL1, 0 },
{ SYS_DESC(SYS_DISR_EL1), NULL, reset_val, DISR_EL1, 0 },
--
2.43.0
* [PATCH v4 04/41] arm64: mpam: Context switch the MPAM registers
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (2 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 03/41] KVM: arm64: Make MPAMSM_EL1 accesses UNDEF Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-05 16:16 ` Catalin Marinas
2026-02-03 21:43 ` [PATCH v4 05/41] arm64: mpam: Re-initialise MPAM regs when CPU comes online Ben Horgan
` (38 subsequent siblings)
42 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
MPAM allows traffic in the SoC to be labeled by the OS. These labels are
used to apply policy in caches and bandwidth regulators, and to monitor
traffic in the SoC. The label is made up of a PARTID and PMG value. The x86
equivalent calls these CLOSID and RMID, but they don't map precisely.
MPAM has two CPU system registers that are used to hold the PARTID and PMG
values that traffic generated at each exception level will use. These can
be set per-task by the resctrl file system. (resctrl is the de facto
interface for controlling this stuff).
Add a helper to switch this.
struct task_struct's separate CLOSID and RMID fields are insufficient to
implement resctrl using MPAM, as resctrl can change the PARTID (CLOSID) and
PMG (sort of like the RMID) separately. On x86, the rmid is an independent
number, so a race that writes a mismatched closid and rmid into hardware is
benign. On arm64, the pmg bits extend the partid (i.e. partid-5 has a
pmg-0 that is not the same as partid-6's pmg-0). In
this case, mismatching the values will 'dirty' a pmg value that resctrl
believes is clean, and is not tracking with its 'limbo' code.
To avoid this, the partid and pmg are always read and written as a
pair. This requires a new u64 field. struct task_struct already has two
u32 fields, rmid and closid, for the x86 case, but as separate fields they
can't be read or written as a pair. Add the new field, mpam_partid_pmg, to struct thread_info
to avoid adding more architecture specific code to struct task_struct.
Always use READ_ONCE()/WRITE_ONCE() when accessing this field.
Resctrl allows a per-cpu 'default' value to be set. This overrides the
values when scheduling a task in the default control-group, which has
PARTID 0. The way 'code data prioritisation' gets emulated means the
register value for the default group needs to be a variable.
The current system register value is kept in a per-cpu variable to avoid
writing to the system register if the value isn't going to change. Writes
to this register may reset the hardware state for regulating bandwidth.
Finally, there is no reason to context switch these registers unless there
is a driver changing the values in struct task_struct. Hide the whole thing
behind a static key. This also allows the driver to disable MPAM in
response to errors reported by hardware. Move the existing static key to
belong to the arch code, as in the future the MPAM driver may become a
loadable module.
All this should depend on whether there is an MPAM driver, so hide it behind
CONFIG_ARM64_MPAM.
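
The decision logic described above (static key, per-cpu default substitution, and skipping the register write when nothing changes) can be modelled in user-space roughly as follows. All `ex_*` names are hypothetical stand-ins for the kernel state named in this patch, and the system register write is replaced by a counter so the effect of the cached value is visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for one CPU's state; field names are hypothetical. */
struct ex_cpu {
	uint64_t cpu_default;		/* models per-cpu arm64_mpam_default */
	uint64_t cur;			/* models per-cpu arm64_mpam_current */
	unsigned int sysreg_writes;	/* counts simulated register writes */
};

static bool ex_mpam_enabled;		/* models the mpam_enabled static key */
static uint64_t ex_global_default;	/* models arm64_mpam_global_default */

/* task_regval models the combined partid/pmg value read from thread_info. */
static inline void ex_thread_switch(struct ex_cpu *cpu, uint64_t task_regval)
{
	uint64_t regval = task_regval;

	assert(cpu);

	/* Nothing to do until a driver has enabled MPAM. */
	if (!ex_mpam_enabled)
		return;

	/* Tasks in the default group use this CPU's default value instead. */
	if (regval == ex_global_default)
		regval = cpu->cpu_default;

	/* Skip the write if the hardware already holds this value. */
	if (cpu->cur == regval)
		return;

	cpu->sysreg_writes++;		/* stands in for the sysreg writes */
	cpu->cur = regval;
}
```

Because the partid and pmg travel together in one 64-bit value, a concurrent update from resctrl is seen either entirely or not at all, which is the property the commit message relies on.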
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
CC: Amit Singh Tomar <amitsinght@marvell.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
CONFIG_MPAM -> CONFIG_ARM64_MPAM in commit message
Remove extra DECLARE_STATIC_KEY_FALSE
Function name in comment, __mpam_sched_in() -> mpam_thread_switch()
Remove unused headers
Expand comment (Jonathan)
Changes since v2:
Tidy up ifdefs
Changes since v3:
Always set MPAMEN for MPAM1_EL1 rather than relying on it being read only.
---
arch/arm64/Kconfig | 2 +
arch/arm64/include/asm/mpam.h | 67 ++++++++++++++++++++++++++++
arch/arm64/include/asm/thread_info.h | 3 ++
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/mpam.c | 13 ++++++
arch/arm64/kernel/process.c | 7 +++
drivers/resctrl/mpam_devices.c | 2 -
drivers/resctrl/mpam_internal.h | 4 +-
8 files changed, 95 insertions(+), 4 deletions(-)
create mode 100644 arch/arm64/include/asm/mpam.h
create mode 100644 arch/arm64/kernel/mpam.c
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 93173f0a09c7..cdcc5b76a110 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2049,6 +2049,8 @@ config ARM64_MPAM
MPAM is exposed to user-space via the resctrl pseudo filesystem.
+ This option enables the extra context switch code.
+
endmenu # "ARMv8.4 architectural features"
menu "ARMv8.5 architectural features"
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h
new file mode 100644
index 000000000000..0747e0526927
--- /dev/null
+++ b/arch/arm64/include/asm/mpam.h
@@ -0,0 +1,67 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2025 Arm Ltd. */
+
+#ifndef __ASM__MPAM_H
+#define __ASM__MPAM_H
+
+#include <linux/jump_label.h>
+#include <linux/percpu.h>
+#include <linux/sched.h>
+
+#include <asm/sysreg.h>
+
+DECLARE_STATIC_KEY_FALSE(mpam_enabled);
+DECLARE_PER_CPU(u64, arm64_mpam_default);
+DECLARE_PER_CPU(u64, arm64_mpam_current);
+
+/*
+ * The value of the MPAM0_EL1 sysreg when a task is in resctrl's default group.
+ * This is used by the context switch code to use the resctrl CPU property
+ * instead. The value is modified when CDP is enabled/disabled by mounting
+ * the resctrl filesystem.
+ */
+extern u64 arm64_mpam_global_default;
+
+/*
+ * The resctrl filesystem writes to the partid/pmg values for threads and CPUs,
+ * which may race with reads in mpam_thread_switch(). Ensure only one of the old
+ * or new values are used. Particular care should be taken with the pmg field as
+ * mpam_thread_switch() may read a partid and pmg that don't match, causing this
+ * value to be stored with cache allocations, despite being considered 'free' by
+ * resctrl.
+ */
+#ifdef CONFIG_ARM64_MPAM
+static inline u64 mpam_get_regval(struct task_struct *tsk)
+{
+ return READ_ONCE(task_thread_info(tsk)->mpam_partid_pmg);
+}
+
+static inline void mpam_thread_switch(struct task_struct *tsk)
+{
+ u64 oldregval;
+ int cpu = smp_processor_id();
+ u64 regval = mpam_get_regval(tsk);
+
+ if (!static_branch_likely(&mpam_enabled))
+ return;
+
+ if (regval == READ_ONCE(arm64_mpam_global_default))
+ regval = READ_ONCE(per_cpu(arm64_mpam_default, cpu));
+
+ oldregval = READ_ONCE(per_cpu(arm64_mpam_current, cpu));
+ if (oldregval == regval)
+ return;
+
+ write_sysreg_s(regval | MPAM1_EL1_MPAMEN, SYS_MPAM1_EL1);
+ isb();
+
+ /* Synchronising the EL0 write is left until the ERET to EL0 */
+ write_sysreg_s(regval, SYS_MPAM0_EL1);
+
+ WRITE_ONCE(per_cpu(arm64_mpam_current, cpu), regval);
+}
+#else
+static inline void mpam_thread_switch(struct task_struct *tsk) {}
+#endif /* CONFIG_ARM64_MPAM */
+
+#endif /* __ASM__MPAM_H */
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index a803b887b0b4..fc801a26ff9e 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -41,6 +41,9 @@ struct thread_info {
#ifdef CONFIG_SHADOW_CALL_STACK
void *scs_base;
void *scs_sp;
+#endif
+#ifdef CONFIG_ARM64_MPAM
+ u64 mpam_partid_pmg;
#endif
u32 cpu;
};
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 76f32e424065..15979f366519 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -67,6 +67,7 @@ obj-$(CONFIG_CRASH_DUMP) += crash_dump.o
obj-$(CONFIG_VMCORE_INFO) += vmcore_info.o
obj-$(CONFIG_ARM_SDE_INTERFACE) += sdei.o
obj-$(CONFIG_ARM64_PTR_AUTH) += pointer_auth.o
+obj-$(CONFIG_ARM64_MPAM) += mpam.o
obj-$(CONFIG_ARM64_MTE) += mte.o
obj-y += vdso-wrap.o
obj-$(CONFIG_COMPAT_VDSO) += vdso32-wrap.o
diff --git a/arch/arm64/kernel/mpam.c b/arch/arm64/kernel/mpam.c
new file mode 100644
index 000000000000..9866d2ca0faa
--- /dev/null
+++ b/arch/arm64/kernel/mpam.c
@@ -0,0 +1,13 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2025 Arm Ltd. */
+
+#include <asm/mpam.h>
+
+#include <linux/jump_label.h>
+#include <linux/percpu.h>
+
+DEFINE_STATIC_KEY_FALSE(mpam_enabled);
+DEFINE_PER_CPU(u64, arm64_mpam_default);
+DEFINE_PER_CPU(u64, arm64_mpam_current);
+
+u64 arm64_mpam_global_default;
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 489554931231..47698955fa1e 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -51,6 +51,7 @@
#include <asm/fpsimd.h>
#include <asm/gcs.h>
#include <asm/mmu_context.h>
+#include <asm/mpam.h>
#include <asm/mte.h>
#include <asm/processor.h>
#include <asm/pointer_auth.h>
@@ -738,6 +739,12 @@ struct task_struct *__switch_to(struct task_struct *prev,
if (prev->thread.sctlr_user != next->thread.sctlr_user)
update_sctlr_el1(next->thread.sctlr_user);
+ /*
+ * MPAM thread switch happens after the DSB to ensure prev's accesses
+ * use prev's MPAM settings.
+ */
+ mpam_thread_switch(next);
+
/* the actual thread switch */
last = cpu_switch_to(prev, next);
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index b495d5291868..860181266b15 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -29,8 +29,6 @@
#include "mpam_internal.h"
-DEFINE_STATIC_KEY_FALSE(mpam_enabled); /* This moves to arch code */
-
/*
* mpam_list_lock protects the SRCU lists when writing. Once the
* mpam_enabled key is enabled these lists are read-only,
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index e79c3c47259c..8983dbe715c2 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -17,12 +17,12 @@
#include <linux/srcu.h>
#include <linux/types.h>
+#include <asm/mpam.h>
+
#define MPAM_MSC_MAX_NUM_RIS 16
struct platform_device;
-DECLARE_STATIC_KEY_FALSE(mpam_enabled);
-
#ifdef CONFIG_MPAM_KUNIT_TEST
#define PACKED_FOR_KUNIT __packed
#else
--
2.43.0
* [PATCH v4 05/41] arm64: mpam: Re-initialise MPAM regs when CPU comes online
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (3 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 04/41] arm64: mpam: Context switch the MPAM registers Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-05 16:20 ` Catalin Marinas
2026-02-03 21:43 ` [PATCH v4 06/41] arm64: mpam: Drop the CONFIG_EXPERT restriction Ben Horgan
` (37 subsequent siblings)
42 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
Now that the MPAM system registers are expected to have values that change,
reprogram them based on the previous value when a CPU is brought online.
Previously MPAM's 'default PARTID' of 0 was always used for MPAM in
kernel-space as this is the PARTID that hardware guarantees to
reset. Because there are a limited number of PARTIDs, this value is exposed
to user-space, meaning resctrl changes to the resctrl default group would
also affect kernel threads. Instead, use the task's PARTID value for
kernel work on behalf of user-space too. The default of 0 is kept for both
user-space and kernel-space when MPAM is not enabled.
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
CONFIG_MPAM -> CONFIG_ARM64_MPAM
Check mpam_enabled
Comment about relying on ERET for synchronisation
Update commit message
Changes since v3:
Always set MPAM1_EL1.MPAMEN rather than relying on it being read only
---
arch/arm64/kernel/cpufeature.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index c840a93b9ef9..343018c6159f 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -86,6 +86,7 @@
#include <asm/kvm_host.h>
#include <asm/mmu.h>
#include <asm/mmu_context.h>
+#include <asm/mpam.h>
#include <asm/mte.h>
#include <asm/hypervisor.h>
#include <asm/processor.h>
@@ -2483,13 +2484,17 @@ test_has_mpam(const struct arm64_cpu_capabilities *entry, int scope)
static void
cpu_enable_mpam(const struct arm64_cpu_capabilities *entry)
{
- /*
- * Access by the kernel (at EL1) should use the reserved PARTID
- * which is configured unrestricted. This avoids priority-inversion
- * where latency sensitive tasks have to wait for a task that has
- * been throttled to release the lock.
- */
- write_sysreg_s(0, SYS_MPAM1_EL1);
+ int cpu = smp_processor_id();
+ u64 regval = 0;
+
+ if (IS_ENABLED(CONFIG_ARM64_MPAM) && static_branch_likely(&mpam_enabled))
+ regval = READ_ONCE(per_cpu(arm64_mpam_current, cpu));
+
+ write_sysreg_s(regval | MPAM1_EL1_MPAMEN, SYS_MPAM1_EL1);
+ isb();
+
+ /* Synchronising the EL0 write is left until the ERET to EL0 */
+ write_sysreg_s(regval, SYS_MPAM0_EL1);
}
static bool
--
2.43.0
* [PATCH v4 06/41] arm64: mpam: Drop the CONFIG_EXPERT restriction
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (4 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 05/41] arm64: mpam: Re-initialise MPAM regs when CPU comes online Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-05 14:08 ` Jonathan Cameron
2026-02-05 16:21 ` Catalin Marinas
2026-02-03 21:43 ` [PATCH v4 07/41] arm64: mpam: Advertise the CPUs MPAM limits to the driver Ben Horgan
` (36 subsequent siblings)
42 siblings, 2 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc
In anticipation of MPAM being useful, remove the CONFIG_EXPERT restriction.
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
arch/arm64/Kconfig | 2 +-
drivers/resctrl/Kconfig | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index cdcc5b76a110..0e8fa195580b 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2026,7 +2026,7 @@ config ARM64_TLB_RANGE
config ARM64_MPAM
bool "Enable support for MPAM"
- select ARM64_MPAM_DRIVER if EXPERT # does nothing yet
+ select ARM64_MPAM_DRIVER
select ACPI_MPAM if ACPI
help
Memory System Resource Partitioning and Monitoring (MPAM) is an
diff --git a/drivers/resctrl/Kconfig b/drivers/resctrl/Kconfig
index c808e0470394..c34e059c6e41 100644
--- a/drivers/resctrl/Kconfig
+++ b/drivers/resctrl/Kconfig
@@ -1,6 +1,6 @@
menuconfig ARM64_MPAM_DRIVER
bool "MPAM driver"
- depends on ARM64 && ARM64_MPAM && EXPERT
+ depends on ARM64 && ARM64_MPAM
help
Memory System Resource Partitioning and Monitoring (MPAM) driver for
System IP, e.g. caches and memory controllers.
--
2.43.0
* [PATCH v4 07/41] arm64: mpam: Advertise the CPUs MPAM limits to the driver
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (5 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 06/41] arm64: mpam: Drop the CONFIG_EXPERT restriction Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 08/41] arm64: mpam: Add cpu_pm notifier to restore MPAM sysregs Ben Horgan
` (35 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
Requestors need to populate the MPAM fields for any traffic they send on
the interconnect. For the CPUs these values are taken from the
corresponding MPAMy_ELx register. Each requestor may have a limit on the
largest PARTID or PMG value that can be used. The MPAM driver has to
determine the system-wide minimum supported PARTID and PMG values.
To do this, the driver needs to be told what each requestor's limit is.
CPUs are special, but this infrastructure is also needed for the SMMU and
GIC ITS. Call the helper to tell the MPAM driver what the CPUs can do.
The return value can be ignored by the arch code as it runs well before the
MPAM driver starts probing.
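As an illustration of the "system-wide minimum" described above, here is a minimal user-space sketch. It is not the driver's implementation: `struct limits`, `system_limits` and `register_requestor()` are hypothetical names, and only the idea that each requestor can narrow a shared minimum is taken from the text.

```c
#include <stdint.h>

/* Hypothetical model: the largest PARTID/PMG that every requestor can use. */
struct limits {
	uint16_t partid_max;
	uint8_t pmg_max;
};

/* Start from the widest values the field types allow. */
static struct limits system_limits = { UINT16_MAX, UINT8_MAX };

/* Each requestor (CPU, SMMU, ...) can only narrow the system-wide limits. */
static void register_requestor(uint16_t partid_max, uint8_t pmg_max)
{
	if (partid_max < system_limits.partid_max)
		system_limits.partid_max = partid_max;
	if (pmg_max < system_limits.pmg_max)
		system_limits.pmg_max = pmg_max;
}
```

With this model, registering a requestor limited to (63, 3) and another limited to (255, 1) leaves the system-wide limits at PARTID 63 and PMG 1.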
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
arch/arm64/kernel/mpam.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/arch/arm64/kernel/mpam.c b/arch/arm64/kernel/mpam.c
index 9866d2ca0faa..e6feff2324ac 100644
--- a/arch/arm64/kernel/mpam.c
+++ b/arch/arm64/kernel/mpam.c
@@ -3,6 +3,7 @@
#include <asm/mpam.h>
+#include <linux/arm_mpam.h>
#include <linux/jump_label.h>
#include <linux/percpu.h>
@@ -11,3 +12,14 @@ DEFINE_PER_CPU(u64, arm64_mpam_default);
DEFINE_PER_CPU(u64, arm64_mpam_current);
u64 arm64_mpam_global_default;
+
+static int __init arm64_mpam_register_cpus(void)
+{
+ u64 mpamidr = read_sanitised_ftr_reg(SYS_MPAMIDR_EL1);
+ u16 partid_max = FIELD_GET(MPAMIDR_EL1_PARTID_MAX, mpamidr);
+ u8 pmg_max = FIELD_GET(MPAMIDR_EL1_PMG_MAX, mpamidr);
+
+ return mpam_register_requestor(partid_max, pmg_max);
+}
+/* Must occur before mpam_msc_driver_init() from subsys_initcall() */
+arch_initcall(arm64_mpam_register_cpus)
--
2.43.0
* [PATCH v4 08/41] arm64: mpam: Add cpu_pm notifier to restore MPAM sysregs
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (6 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 07/41] arm64: mpam: Advertise the CPUs MPAM limits to the driver Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-05 16:54 ` Catalin Marinas
2026-02-03 21:43 ` [PATCH v4 09/41] arm64: mpam: Initialise and context switch the MPAMSM_EL1 register Ben Horgan
` (34 subsequent siblings)
42 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
The MPAM system registers will be lost if the CPU is reset during PSCI's
CPU_SUSPEND.
Add a PM notifier to restore them.
mpam_thread_switch(current) can't be used as this won't make any changes if
the in-memory copy says the register already has the correct value. In
reality the system register is UNKNOWN out of reset.
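The reasoning above can be sketched in user-space C. This is a simplified model, not the kernel code: `hw_reg`, `cached`, `switch_to()` and `restore_after_reset()` are hypothetical stand-ins for the system register, its in-memory copy, the context-switch path and the PM notifier.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hardware register and its in-memory copy. */
static uint64_t hw_reg;	/* contents are lost across a CPU reset */
static uint64_t cached;	/* what software believes the register holds */

/* Context-switch style update: skip the write when the cache matches. */
static bool switch_to(uint64_t want)
{
	if (cached == want)
		return false;	/* no register write is issued */
	hw_reg = want;
	cached = want;
	return true;
}

/* Resume style update: the register is stale, so always rewrite it. */
static void restore_after_reset(void)
{
	hw_reg = cached;
}
```

If the register is clobbered while the cached copy still matches the wanted value, `switch_to()` leaves the stale value in place; only the unconditional restore repairs it.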
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v3:
Always set MPAM1_EL1.MPAMEN rather than relying on it being read only
Bail out early if mpam not supported (Gavin)
---
arch/arm64/kernel/mpam.c | 33 +++++++++++++++++++++++++++++++++
1 file changed, 33 insertions(+)
diff --git a/arch/arm64/kernel/mpam.c b/arch/arm64/kernel/mpam.c
index e6feff2324ac..48ec0ffd5999 100644
--- a/arch/arm64/kernel/mpam.c
+++ b/arch/arm64/kernel/mpam.c
@@ -4,6 +4,7 @@
#include <asm/mpam.h>
#include <linux/arm_mpam.h>
+#include <linux/cpu_pm.h>
#include <linux/jump_label.h>
#include <linux/percpu.h>
@@ -13,12 +14,44 @@ DEFINE_PER_CPU(u64, arm64_mpam_current);
u64 arm64_mpam_global_default;
+static int mpam_pm_notifier(struct notifier_block *self,
+ unsigned long cmd, void *v)
+{
+ u64 regval;
+ int cpu = smp_processor_id();
+
+ switch (cmd) {
+ case CPU_PM_EXIT:
+ /*
+ * Don't use mpam_thread_switch() as the system register
+ * value has changed under our feet.
+ */
+ regval = READ_ONCE(per_cpu(arm64_mpam_current, cpu));
+ write_sysreg_s(regval | MPAM1_EL1_MPAMEN, SYS_MPAM1_EL1);
+ isb();
+
+ write_sysreg_s(regval, SYS_MPAM0_EL1);
+
+ return NOTIFY_OK;
+ default:
+ return NOTIFY_DONE;
+ }
+}
+
+static struct notifier_block mpam_pm_nb = {
+ .notifier_call = mpam_pm_notifier,
+};
+
static int __init arm64_mpam_register_cpus(void)
{
u64 mpamidr = read_sanitised_ftr_reg(SYS_MPAMIDR_EL1);
u16 partid_max = FIELD_GET(MPAMIDR_EL1_PARTID_MAX, mpamidr);
u8 pmg_max = FIELD_GET(MPAMIDR_EL1_PMG_MAX, mpamidr);
+ if (!system_supports_mpam())
+ return 0;
+
+ cpu_pm_register_notifier(&mpam_pm_nb);
return mpam_register_requestor(partid_max, pmg_max);
}
/* Must occur before mpam_msc_driver_init() from subsys_initcall() */
--
2.43.0
* [PATCH v4 09/41] arm64: mpam: Initialise and context switch the MPAMSM_EL1 register
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (7 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 08/41] arm64: mpam: Add cpu_pm notifier to restore MPAM sysregs Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-05 16:55 ` Catalin Marinas
2026-02-03 21:43 ` [PATCH v4 10/41] arm64: mpam: Add helpers to change a task or cpu's MPAM PARTID/PMG values Ben Horgan
` (33 subsequent siblings)
42 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
The MPAMSM_EL1 register sets the MPAM labels, PMG and PARTID, for loads and stores
generated by a shared SMCU. Disable the traps so the kernel can use it and
set it to the same configuration as the per-EL cpu MPAM configuration.
If an SMCU is not shared with other cpus then it is implementation
defined whether the configuration from MPAMSM_EL1 is used or that from
the appropriate MPAMy_ELx. As we set the same PMG_D and PARTID_D
configuration for MPAM0_EL1, MPAM1_EL1 and MPAMSM_EL1, the resulting
configuration is the same regardless.
The range of valid configurations for the PARTID and PMG in MPAMSM_EL1 is
not currently specified in the Arm Architecture Reference Manual but the
architect has confirmed that it is intended to be the same as that for the
cpu configuration in the MPAMy_ELx registers.
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v2:
Mention PMG_D and PARTID_D specifically in the commit message
Add paragraph in commit message on range of MPAMSM_EL1 fields
Changes since v3:
Use cpus_have_cap() in cpu_enable_mpam()
add {}
---
arch/arm64/include/asm/el2_setup.h | 3 ++-
arch/arm64/include/asm/mpam.h | 2 ++
arch/arm64/kernel/cpufeature.c | 2 ++
arch/arm64/kernel/mpam.c | 4 ++++
4 files changed, 10 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/el2_setup.h b/arch/arm64/include/asm/el2_setup.h
index cacd20df1786..d37984c09799 100644
--- a/arch/arm64/include/asm/el2_setup.h
+++ b/arch/arm64/include/asm/el2_setup.h
@@ -504,7 +504,8 @@
check_override id_aa64pfr0, ID_AA64PFR0_EL1_MPAM_SHIFT, .Linit_mpam_\@, .Lskip_mpam_\@, x1, x2
.Linit_mpam_\@:
- msr_s SYS_MPAM2_EL2, xzr // use the default partition
+ mov x0, #MPAM2_EL2_EnMPAMSM_MASK
+ msr_s SYS_MPAM2_EL2, x0 // use the default partition,
// and disable lower traps
mrs_s x0, SYS_MPAMIDR_EL1
tbz x0, #MPAMIDR_EL1_HAS_HCR_SHIFT, .Lskip_mpam_\@ // skip if no MPAMHCR reg
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h
index 0747e0526927..6bccbfdccb87 100644
--- a/arch/arm64/include/asm/mpam.h
+++ b/arch/arm64/include/asm/mpam.h
@@ -53,6 +53,8 @@ static inline void mpam_thread_switch(struct task_struct *tsk)
return;
write_sysreg_s(regval | MPAM1_EL1_MPAMEN, SYS_MPAM1_EL1);
+ if (system_supports_sme())
+ write_sysreg_s(regval & (MPAMSM_EL1_PARTID_D | MPAMSM_EL1_PMG_D), SYS_MPAMSM_EL1);
isb();
/* Synchronising the EL0 write is left until the ERET to EL0 */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 343018c6159f..45c16c19b3df 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -2491,6 +2491,8 @@ cpu_enable_mpam(const struct arm64_cpu_capabilities *entry)
regval = READ_ONCE(per_cpu(arm64_mpam_current, cpu));
write_sysreg_s(regval | MPAM1_EL1_MPAMEN, SYS_MPAM1_EL1);
+ if (cpus_have_cap(ARM64_SME))
+ write_sysreg_s(regval & (MPAMSM_EL1_PARTID_D | MPAMSM_EL1_PMG_D), SYS_MPAMSM_EL1);
isb();
/* Synchronising the EL0 write is left until the ERET to EL0 */
diff --git a/arch/arm64/kernel/mpam.c b/arch/arm64/kernel/mpam.c
index 48ec0ffd5999..3a490de4fa12 100644
--- a/arch/arm64/kernel/mpam.c
+++ b/arch/arm64/kernel/mpam.c
@@ -28,6 +28,10 @@ static int mpam_pm_notifier(struct notifier_block *self,
*/
regval = READ_ONCE(per_cpu(arm64_mpam_current, cpu));
write_sysreg_s(regval | MPAM1_EL1_MPAMEN, SYS_MPAM1_EL1);
+ if (system_supports_sme()) {
+ write_sysreg_s(regval & (MPAMSM_EL1_PARTID_D | MPAMSM_EL1_PMG_D),
+ SYS_MPAMSM_EL1);
+ }
isb();
write_sysreg_s(regval, SYS_MPAM0_EL1);
--
2.43.0
* [PATCH v4 10/41] arm64: mpam: Add helpers to change a task or cpu's MPAM PARTID/PMG values
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (8 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 09/41] arm64: mpam: Initialise and context switch the MPAMSM_EL1 register Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-05 16:56 ` Catalin Marinas
2026-02-03 21:43 ` [PATCH v4 11/41] KVM: arm64: Force guest EL1 to use user-space's partid configuration Ben Horgan
` (32 subsequent siblings)
42 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan, Dave Martin
From: James Morse <james.morse@arm.com>
Care must be taken when modifying the PARTID and PMG of a task in any
per-task structure as writing these values may race with the task being
scheduled in, and reading the modified values.
Add helpers to set the task properties, and the CPU default value. These
use WRITE_ONCE() that pairs with the READ_ONCE() in mpam_get_regval() to
avoid causing torn values.
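A rough user-space analogue of this pattern is sketched below. It packs all four fields into one 64-bit word and publishes it with a single store, so a concurrent reader sees either the old or the new combination. The field offsets are assumptions made for illustration, the function names are hypothetical, and C11 relaxed atomics stand in for WRITE_ONCE()/READ_ONCE(), which they resemble but do not exactly match.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Field offsets assumed for illustration only. */
#define PARTID_I_SHIFT	0
#define PARTID_D_SHIFT	16
#define PMG_I_SHIFT	32
#define PMG_D_SHIFT	40

/* Hypothetical per-task storage for the packed PARTID/PMG value. */
static _Atomic uint64_t task_partid_pmg;

/* Combine the four fields into a single register-shaped value. */
static uint64_t pack_regval(uint16_t partid_d, uint16_t partid_i,
			    uint8_t pmg_d, uint8_t pmg_i)
{
	return ((uint64_t)partid_d << PARTID_D_SHIFT) |
	       ((uint64_t)partid_i << PARTID_I_SHIFT) |
	       ((uint64_t)pmg_d << PMG_D_SHIFT) |
	       ((uint64_t)pmg_i << PMG_I_SHIFT);
}

/* Writer: one store of the whole word, never field by field. */
static void set_task_partid_pmg(uint16_t partid_d, uint16_t partid_i,
				uint8_t pmg_d, uint8_t pmg_i)
{
	atomic_store_explicit(&task_partid_pmg,
			      pack_regval(partid_d, partid_i, pmg_d, pmg_i),
			      memory_order_relaxed);
}

/* Reader: one load of the whole word. */
static uint64_t get_regval(void)
{
	return atomic_load_explicit(&task_partid_pmg, memory_order_relaxed);
}
```

Updating the fields one at a time would let a reader observe, for example, a new PARTID paired with an old PMG, which is the torn value the helpers are meant to rule out.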
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
CC: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
Keep comment attached to mpam_get_regval()
Add internal helper, __mpam_regval() (Jonathan)
Changes since v3:
Remove extra CONFIG_ARM64_MPAM guarding
Extend CONFIG_ARM64_MPAM guarding
---
arch/arm64/include/asm/mpam.h | 28 +++++++++++++++++++++++++++-
1 file changed, 27 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h
index 6bccbfdccb87..05aa71200f61 100644
--- a/arch/arm64/include/asm/mpam.h
+++ b/arch/arm64/include/asm/mpam.h
@@ -4,6 +4,7 @@
#ifndef __ASM__MPAM_H
#define __ASM__MPAM_H
+#include <linux/bitfield.h>
#include <linux/jump_label.h>
#include <linux/percpu.h>
#include <linux/sched.h>
@@ -22,6 +23,23 @@ DECLARE_PER_CPU(u64, arm64_mpam_current);
*/
extern u64 arm64_mpam_global_default;
+#ifdef CONFIG_ARM64_MPAM
+static inline u64 __mpam_regval(u16 partid_d, u16 partid_i, u8 pmg_d, u8 pmg_i)
+{
+ return FIELD_PREP(MPAM0_EL1_PARTID_D, partid_d) |
+ FIELD_PREP(MPAM0_EL1_PARTID_I, partid_i) |
+ FIELD_PREP(MPAM0_EL1_PMG_D, pmg_d) |
+ FIELD_PREP(MPAM0_EL1_PMG_I, pmg_i);
+}
+
+static inline void mpam_set_cpu_defaults(int cpu, u16 partid_d, u16 partid_i,
+ u8 pmg_d, u8 pmg_i)
+{
+ u64 default_val = __mpam_regval(partid_d, partid_i, pmg_d, pmg_i);
+
+ WRITE_ONCE(per_cpu(arm64_mpam_default, cpu), default_val);
+}
+
/*
* The resctrl filesystem writes to the partid/pmg values for threads and CPUs,
* which may race with reads in mpam_thread_switch(). Ensure only one of the old
@@ -30,12 +48,20 @@ extern u64 arm64_mpam_global_default;
* value to be stored with cache allocations, despite being considered 'free' by
* resctrl.
*/
-#ifdef CONFIG_ARM64_MPAM
static inline u64 mpam_get_regval(struct task_struct *tsk)
{
return READ_ONCE(task_thread_info(tsk)->mpam_partid_pmg);
}
+static inline void mpam_set_task_partid_pmg(struct task_struct *tsk,
+ u16 partid_d, u16 partid_i,
+ u8 pmg_d, u8 pmg_i)
+{
+ u64 regval = __mpam_regval(partid_d, partid_i, pmg_d, pmg_i);
+
+ WRITE_ONCE(task_thread_info(tsk)->mpam_partid_pmg, regval);
+}
+
static inline void mpam_thread_switch(struct task_struct *tsk)
{
u64 oldregval;
--
2.43.0
* [PATCH v4 11/41] KVM: arm64: Force guest EL1 to use user-space's partid configuration
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (9 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 10/41] arm64: mpam: Add helpers to change a task or cpu's MPAM PARTID/PMG values Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 12/41] KVM: arm64: Use kernel-space partid configuration for hypercalls Ben Horgan
` (31 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
While we trap the guest's attempts to read/write the MPAM control
registers, the hardware continues to use them. Guest-EL0 uses KVM's
user-space's configuration, as the value is left in the register, and
guest-EL1 uses either the host kernel's configuration, or in the case of
VHE, the UNKNOWN reset value of MPAM1_EL1.
We want to force the guest-EL1 to use KVM's user-space's MPAM
configuration. On nVHE rely on MPAM0_EL1 and MPAM1_EL1 always being
programmed the same and on VHE copy MPAM0_EL1 into the guest's
MPAM1_EL1. There is no need to restore as this is out of context once TGE
is set.
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
Drop the unneeded __mpam_guest_load() in nVHE and the MPAM1_EL1 save/restore
Defer EL2 handling until next patch
Changes since v2:
Use mask (Oliver)
---
arch/arm64/kvm/hyp/vhe/sysreg-sr.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
index f28c6cf4fe1b..9fb8e6628611 100644
--- a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
@@ -183,6 +183,18 @@ void sysreg_restore_guest_state_vhe(struct kvm_cpu_context *ctxt)
}
NOKPROBE_SYMBOL(sysreg_restore_guest_state_vhe);
+/*
+ * The _EL0 value was written by the host's context switch and belongs to the
+ * VMM. Copy this into the guest's _EL1 register.
+ */
+static inline void __mpam_guest_load(void)
+{
+ u64 mask = MPAM0_EL1_PARTID_D | MPAM0_EL1_PARTID_I | MPAM0_EL1_PMG_D | MPAM0_EL1_PMG_I;
+
+ if (system_supports_mpam())
+ write_sysreg_el1(read_sysreg_s(SYS_MPAM0_EL1) & mask, SYS_MPAM1);
+}
+
/**
* __vcpu_load_switch_sysregs - Load guest system registers to the physical CPU
*
@@ -222,6 +234,7 @@ void __vcpu_load_switch_sysregs(struct kvm_vcpu *vcpu)
*/
__sysreg32_restore_state(vcpu);
__sysreg_restore_user_state(guest_ctxt);
+ __mpam_guest_load();
if (unlikely(is_hyp_ctxt(vcpu))) {
__sysreg_restore_vel2_state(vcpu);
--
2.43.0
* [PATCH v4 12/41] KVM: arm64: Use kernel-space partid configuration for hypercalls
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (10 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 11/41] KVM: arm64: Force guest EL1 to use user-space's partid configuration Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 13/41] arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation Ben Horgan
` (30 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
On nVHE systems whether or not MPAM is enabled, EL2 continues to use
partid-0 for hypercalls, even when the host may have configured its kernel
threads to use a different partid. Partid 0 may have been assigned to
another task. Copy the EL1 MPAM register to EL2. This ensures hypercalls use the
same partid as the kernel thread does on the host.
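The copy described above amounts to masking the label fields out of the EL1 value and setting the enable bit in the EL2 value. A minimal user-space sketch follows; the bit positions (labels in bits [47:0], enable in bit 63) are assumptions made for illustration, and `LABEL_MASK`, `MPAMEN` and `el2_regval_from_el1()` are hypothetical names rather than the kernel's definitions.

```c
#include <stdint.h>

/* PARTID_I/D and PMG_I/D, assumed here to occupy bits [47:0]. */
#define LABEL_MASK	UINT64_C(0x0000ffffffffffff)
/* Enable bit, assumed here to be bit 63. */
#define MPAMEN		(UINT64_C(1) << 63)

/* Build the EL2 value: keep only the labels from EL1 and force the enable bit. */
static uint64_t el2_regval_from_el1(uint64_t mpam1_el1)
{
	return MPAMEN | (mpam1_el1 & LABEL_MASK);
}
```

Masking first means any control bits set in the EL1 value are not carried over to EL2 by accident.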
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v2:
Use mask
Use read_sysreg_el1 to cope with hvhe
Changes since v3:
Set MPAM2_EL2.MPAMEN to 1 as we rely on that before and after
---
arch/arm64/kvm/hyp/nvhe/hyp-main.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index a7c689152f68..b25a5ddb9cf0 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -635,6 +635,15 @@ static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
unsigned long hcall_min = 0;
hcall_t hfn;
+ if (system_supports_mpam()) {
+ u64 mask = MPAM1_EL1_PARTID_D | MPAM1_EL1_PARTID_I |
+ MPAM1_EL1_PMG_D | MPAM1_EL1_PMG_I;
+ u64 val = MPAM2_EL2_MPAMEN | (read_sysreg_el1(SYS_MPAM1) & mask);
+
+ write_sysreg_s(val, SYS_MPAM2_EL2);
+ isb();
+ }
+
/*
* If pKVM has been initialised then reject any calls to the
* early "privileged" hypercalls. Note that we cannot reject
--
2.43.0
* [PATCH v4 13/41] arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (11 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 12/41] KVM: arm64: Use kernel-space partid configuration for hypercalls Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-10 22:57 ` Reinette Chatre
2026-02-03 21:43 ` [PATCH v4 14/41] arm_mpam: resctrl: Sort the order of the domain lists Ben Horgan
` (29 subsequent siblings)
42 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
resctrl has its own data structures to describe its resources. We can't use
these directly as we play tricks with the 'MBA' resource, picking the MPAM
controls or monitors that best apply. We may export the same component as
both L3 and MBA.
Add mpam_resctrl_exports[] as the array of class->resctrl mappings we are
exporting, and add the cpuhp hooks that allocate and free the resctrl
domain structures.
While we're here, plumb in a few other obvious things.
CONFIG_ARM_CPU_RESCTRL is used to allow this code to be built even though
it can't yet be linked against resctrl.
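The domain lifetime managed by the cpuhp hooks can be modelled in a few lines of user-space C. This is a deliberately simplified sketch: `struct domain`, `domain_online_cpu()` and `domain_offline_cpu()` are hypothetical, the CPU mask is a plain 64-bit word, and the locking and RCU list handling are left out entirely.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified domain: a bitmask of online CPUs (at most 64). */
struct domain {
	uint64_t cpu_mask;
};

/*
 * Add a CPU to the domain, allocating the domain for the first CPU.
 * Returns NULL if the allocation fails.
 */
static struct domain *domain_online_cpu(struct domain *dom, unsigned int cpu)
{
	if (!dom) {
		dom = calloc(1, sizeof(*dom));
		if (!dom)
			return NULL;
	}
	dom->cpu_mask |= UINT64_C(1) << cpu;
	return dom;
}

/*
 * Remove a CPU from the domain. Once the last CPU has gone the domain
 * is freed and NULL is returned.
 */
static struct domain *domain_offline_cpu(struct domain *dom, unsigned int cpu)
{
	dom->cpu_mask &= ~(UINT64_C(1) << cpu);
	if (dom->cpu_mask)
		return dom;
	free(dom);
	return NULL;
}
```

The first CPU to come online creates the domain, later CPUs only set their bit, and the domain is released when the mask becomes empty.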
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
Domain list is an rcu list
Add synchronize_rcu() to free the deleted element
Code flow simplification (Jonathan)
Changes since v2:
Iterate over mpam_resctrl_dom directly (Jonathan)
Code flow clarification
Comment tidying
Remove power of 2 check as no longer creates holes in rmid indices
Remove unused type argument
add macro helper for_each_mpam_resctrl_control
Changes since v3:
Add and use mpam_resctrl_online_domain_hdr()
mpam_resctrl_alloc_domain() error paths (Reinette)
rebase on x86/cache changes rdt_mon_domain becomes rdt_l3_mon_domain etc
---
drivers/resctrl/Makefile | 1 +
drivers/resctrl/mpam_devices.c | 12 ++
drivers/resctrl/mpam_internal.h | 23 ++-
drivers/resctrl/mpam_resctrl.c | 343 ++++++++++++++++++++++++++++++++
include/linux/arm_mpam.h | 3 +
5 files changed, 381 insertions(+), 1 deletion(-)
create mode 100644 drivers/resctrl/mpam_resctrl.c
diff --git a/drivers/resctrl/Makefile b/drivers/resctrl/Makefile
index 898199dcf80d..40beaf999582 100644
--- a/drivers/resctrl/Makefile
+++ b/drivers/resctrl/Makefile
@@ -1,4 +1,5 @@
obj-$(CONFIG_ARM64_MPAM_DRIVER) += mpam.o
mpam-y += mpam_devices.o
+mpam-$(CONFIG_ARM_CPU_RESCTRL) += mpam_resctrl.o
ccflags-$(CONFIG_ARM64_MPAM_DRIVER_DEBUG) += -DDEBUG
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 860181266b15..b81d5c7f44ca 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -1628,6 +1628,9 @@ static int mpam_cpu_online(unsigned int cpu)
mpam_reprogram_msc(msc);
}
+ if (mpam_is_enabled())
+ return mpam_resctrl_online_cpu(cpu);
+
return 0;
}
@@ -1671,6 +1674,9 @@ static int mpam_cpu_offline(unsigned int cpu)
{
struct mpam_msc *msc;
+ if (mpam_is_enabled())
+ mpam_resctrl_offline_cpu(cpu);
+
guard(srcu)(&mpam_srcu);
list_for_each_entry_srcu(msc, &mpam_all_msc, all_msc_list,
srcu_read_lock_held(&mpam_srcu)) {
@@ -2517,6 +2523,12 @@ static void mpam_enable_once(void)
mutex_unlock(&mpam_list_lock);
cpus_read_unlock();
+ if (!err) {
+ err = mpam_resctrl_setup();
+ if (err)
+ pr_err("Failed to initialise resctrl: %d\n", err);
+ }
+
if (err) {
mpam_disable_reason = "Failed to enable.";
schedule_work(&mpam_broken_work);
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index 8983dbe715c2..a9d89d9e18a8 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -12,7 +12,7 @@
#include <linux/jump_label.h>
#include <linux/llist.h>
#include <linux/mutex.h>
-#include <linux/srcu.h>
+#include <linux/resctrl.h>
#include <linux/spinlock.h>
#include <linux/srcu.h>
#include <linux/types.h>
@@ -334,6 +334,17 @@ struct mpam_msc_ris {
struct mpam_garbage garbage;
};
+struct mpam_resctrl_dom {
+ struct mpam_component *ctrl_comp;
+ struct rdt_ctrl_domain resctrl_ctrl_dom;
+ struct rdt_l3_mon_domain resctrl_mon_dom;
+};
+
+struct mpam_resctrl_res {
+ struct mpam_class *class;
+ struct rdt_resource resctrl_res;
+};
+
static inline int mpam_alloc_csu_mon(struct mpam_class *class)
{
struct mpam_props *cprops = &class->props;
@@ -388,6 +399,16 @@ void mpam_msmon_reset_mbwu(struct mpam_component *comp, struct mon_cfg *ctx);
int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
cpumask_t *affinity);
+#ifdef CONFIG_RESCTRL_FS
+int mpam_resctrl_setup(void);
+int mpam_resctrl_online_cpu(unsigned int cpu);
+void mpam_resctrl_offline_cpu(unsigned int cpu);
+#else
+static inline int mpam_resctrl_setup(void) { return 0; }
+static inline int mpam_resctrl_online_cpu(unsigned int cpu) { return 0; }
+static inline void mpam_resctrl_offline_cpu(unsigned int cpu) { }
+#endif /* CONFIG_RESCTRL_FS */
+
/*
* MPAM MSCs have the following register layout. See:
* Arm Memory System Resource Partitioning and Monitoring (MPAM) System
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
new file mode 100644
index 000000000000..4c2248c92955
--- /dev/null
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -0,0 +1,343 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2025 Arm Ltd.
+
+#define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__
+
+#include <linux/arm_mpam.h>
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
+#include <linux/errno.h>
+#include <linux/list.h>
+#include <linux/printk.h>
+#include <linux/rculist.h>
+#include <linux/resctrl.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+
+#include <asm/mpam.h>
+
+#include "mpam_internal.h"
+
+/*
+ * The classes we've picked to map to resctrl resources, wrapped
+ * in with their resctrl structure.
+ * Class pointer may be NULL.
+ */
+static struct mpam_resctrl_res mpam_resctrl_controls[RDT_NUM_RESOURCES];
+
+#define for_each_mpam_resctrl_control(res, rid) \
+ for (rid = 0, res = &mpam_resctrl_controls[rid]; \
+ rid < RDT_NUM_RESOURCES; \
+ rid++, res = &mpam_resctrl_controls[rid])
+
+/* The lock for modifying resctrl's domain lists from cpuhp callbacks. */
+static DEFINE_MUTEX(domain_list_lock);
+
+static bool exposed_alloc_capable;
+static bool exposed_mon_capable;
+
+bool resctrl_arch_alloc_capable(void)
+{
+ return exposed_alloc_capable;
+}
+
+bool resctrl_arch_mon_capable(void)
+{
+ return exposed_mon_capable;
+}
+
+/*
+ * MSC may raise an error interrupt if it sees an out of range partid/pmg,
+ * and go on to truncate the value. Regardless of what the hardware supports,
+ * only the system wide safe value is safe to use.
+ */
+u32 resctrl_arch_get_num_closid(struct rdt_resource *ignored)
+{
+ return mpam_partid_max + 1;
+}
+
+struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
+{
+ if (l >= RDT_NUM_RESOURCES)
+ return NULL;
+
+ return &mpam_resctrl_controls[l].resctrl_res;
+}
+
+static int mpam_resctrl_control_init(struct mpam_resctrl_res *res)
+{
+ /* TODO: initialise the resctrl resources */
+
+ return 0;
+}
+
+static int mpam_resctrl_pick_domain_id(int cpu, struct mpam_component *comp)
+{
+ struct mpam_class *class = comp->class;
+
+ if (class->type == MPAM_CLASS_CACHE)
+ return comp->comp_id;
+
+ /* TODO: repaint domain ids to match the L3 domain ids */
+ /* Otherwise, expose the ID used by the firmware table code. */
+ return comp->comp_id;
+}
+
+static void mpam_resctrl_domain_hdr_init(int cpu, struct mpam_component *comp,
+ struct rdt_domain_hdr *hdr)
+{
+ lockdep_assert_cpus_held();
+
+ INIT_LIST_HEAD(&hdr->list);
+ hdr->id = mpam_resctrl_pick_domain_id(cpu, comp);
+ cpumask_set_cpu(cpu, &hdr->cpu_mask);
+}
+
+static void mpam_resctrl_online_domain_hdr(unsigned int cpu,
+ struct rdt_domain_hdr *hdr)
+{
+ lockdep_assert_cpus_held();
+
+ cpumask_set_cpu(cpu, &hdr->cpu_mask);
+}
+
+/**
+ * mpam_resctrl_offline_domain_hdr() - Update the domain header to remove a CPU.
+ * @cpu: The CPU to remove from the domain.
+ * @hdr: The domain's header.
+ *
+ * Removes @cpu from the header mask. If this was the last CPU in the domain,
+ * the domain header is removed from its parent list and true is returned,
+ * indicating the parent structure can be freed.
+ * If there are other CPUs in the domain, returns false.
+ */
+static bool mpam_resctrl_offline_domain_hdr(unsigned int cpu,
+ struct rdt_domain_hdr *hdr)
+{
+ lockdep_assert_held(&domain_list_lock);
+
+ cpumask_clear_cpu(cpu, &hdr->cpu_mask);
+ if (cpumask_empty(&hdr->cpu_mask)) {
+ list_del_rcu(&hdr->list);
+ synchronize_rcu();
+ return true;
+ }
+
+ return false;
+}
+
+static struct mpam_resctrl_dom *
+mpam_resctrl_alloc_domain(unsigned int cpu, struct mpam_resctrl_res *res)
+{
+ int err;
+ struct mpam_resctrl_dom *dom;
+ struct rdt_l3_mon_domain *mon_d;
+ struct rdt_ctrl_domain *ctrl_d;
+ struct mpam_class *class = res->class;
+ struct mpam_component *comp_iter, *ctrl_comp;
+ struct rdt_resource *r = &res->resctrl_res;
+
+ lockdep_assert_held(&domain_list_lock);
+
+ ctrl_comp = NULL;
+ guard(srcu)(&mpam_srcu);
+ list_for_each_entry_srcu(comp_iter, &class->components, class_list,
+ srcu_read_lock_held(&mpam_srcu)) {
+ if (cpumask_test_cpu(cpu, &comp_iter->affinity)) {
+ ctrl_comp = comp_iter;
+ break;
+ }
+ }
+
+ /* class has no component for this CPU */
+ if (WARN_ON_ONCE(!ctrl_comp))
+ return ERR_PTR(-EINVAL);
+
+ dom = kzalloc_node(sizeof(*dom), GFP_KERNEL, cpu_to_node(cpu));
+ if (!dom)
+ return ERR_PTR(-ENOMEM);
+
+ if (exposed_alloc_capable) {
+ dom->ctrl_comp = ctrl_comp;
+
+ ctrl_d = &dom->resctrl_ctrl_dom;
+ mpam_resctrl_domain_hdr_init(cpu, ctrl_comp, &ctrl_d->hdr);
+ ctrl_d->hdr.type = RESCTRL_CTRL_DOMAIN;
+ err = resctrl_online_ctrl_domain(r, ctrl_d);
+ if (err)
+ goto free_domain;
+
+ /* TODO: this list should be sorted */
+ list_add_tail_rcu(&ctrl_d->hdr.list, &r->ctrl_domains);
+ } else {
+ pr_debug("Skipped control domain online - no controls\n");
+ }
+
+ if (exposed_mon_capable) {
+ mon_d = &dom->resctrl_mon_dom;
+ mpam_resctrl_domain_hdr_init(cpu, any_mon_comp, &mon_d->hdr);
+ mon_d->hdr.type = RESCTRL_MON_DOMAIN;
+ err = resctrl_online_mon_domain(r, &mon_d->hdr);
+ if (err)
+ goto offline_ctrl_domain;
+
+ /* TODO: this list should be sorted */
+ list_add_tail_rcu(&mon_d->hdr.list, &r->mon_domains);
+ } else {
+ pr_debug("Skipped monitor domain online - no monitors\n");
+ }
+
+ return dom;
+
+offline_ctrl_domain:
+ if (exposed_alloc_capable) {
+ mpam_resctrl_offline_domain_hdr(cpu, &ctrl_d->hdr);
+ resctrl_offline_ctrl_domain(r, ctrl_d);
+ }
+free_domain:
+ kfree(dom);
+ dom = ERR_PTR(err);
+
+ return dom;
+}
+
+static struct mpam_resctrl_dom *
+mpam_resctrl_get_domain_from_cpu(int cpu, struct mpam_resctrl_res *res)
+{
+ struct mpam_resctrl_dom *dom;
+ struct rdt_resource *r = &res->resctrl_res;
+
+ lockdep_assert_cpus_held();
+
+ list_for_each_entry_rcu(dom, &r->ctrl_domains, resctrl_ctrl_dom.hdr.list) {
+ if (cpumask_test_cpu(cpu, &dom->ctrl_comp->affinity))
+ return dom;
+ }
+
+ return NULL;
+}
+
+int mpam_resctrl_online_cpu(unsigned int cpu)
+{
+ struct mpam_resctrl_res *res;
+ enum resctrl_res_level rid;
+
+ guard(mutex)(&domain_list_lock);
+ for_each_mpam_resctrl_control(res, rid) {
+ struct mpam_resctrl_dom *dom;
+
+ if (!res->class)
+ continue; // dummy resource
+
+ dom = mpam_resctrl_get_domain_from_cpu(cpu, res);
+ if (!dom) {
+ dom = mpam_resctrl_alloc_domain(cpu, res);
+ } else {
+ if (exposed_alloc_capable) {
+ struct rdt_ctrl_domain *ctrl_d = &dom->resctrl_ctrl_dom;
+
+ mpam_resctrl_online_domain_hdr(cpu, &ctrl_d->hdr);
+ }
+ if (exposed_mon_capable) {
+ struct rdt_l3_mon_domain *mon_d = &dom->resctrl_mon_dom;
+
+ mpam_resctrl_online_domain_hdr(cpu, &mon_d->hdr);
+ }
+ }
+ if (IS_ERR(dom))
+ return PTR_ERR(dom);
+ }
+
+ resctrl_online_cpu(cpu);
+
+ return 0;
+}
+
+void mpam_resctrl_offline_cpu(unsigned int cpu)
+{
+ struct mpam_resctrl_res *res;
+ enum resctrl_res_level rid;
+
+ resctrl_offline_cpu(cpu);
+
+ guard(mutex)(&domain_list_lock);
+ for_each_mpam_resctrl_control(res, rid) {
+ struct mpam_resctrl_dom *dom;
+ struct rdt_l3_mon_domain *mon_d;
+ struct rdt_ctrl_domain *ctrl_d;
+ bool ctrl_dom_empty, mon_dom_empty;
+
+ if (!res->class)
+ continue; // dummy resource
+
+ dom = mpam_resctrl_get_domain_from_cpu(cpu, res);
+ if (WARN_ON_ONCE(!dom))
+ continue;
+
+ if (exposed_alloc_capable) {
+ ctrl_d = &dom->resctrl_ctrl_dom;
+ ctrl_dom_empty = mpam_resctrl_offline_domain_hdr(cpu, &ctrl_d->hdr);
+ if (ctrl_dom_empty)
+ resctrl_offline_ctrl_domain(&res->resctrl_res, ctrl_d);
+ } else {
+ ctrl_dom_empty = true;
+ }
+
+ if (exposed_mon_capable) {
+ mon_d = &dom->resctrl_mon_dom;
+ mon_dom_empty = mpam_resctrl_offline_domain_hdr(cpu, &mon_d->hdr);
+ if (mon_dom_empty)
+ resctrl_offline_mon_domain(&res->resctrl_res, &mon_d->hdr);
+ } else {
+ mon_dom_empty = true;
+ }
+
+ if (ctrl_dom_empty && mon_dom_empty)
+ kfree(dom);
+ }
+}
+
+int mpam_resctrl_setup(void)
+{
+ int err = 0;
+ struct mpam_resctrl_res *res;
+ enum resctrl_res_level rid;
+
+ cpus_read_lock();
+ for_each_mpam_resctrl_control(res, rid) {
+ INIT_LIST_HEAD_RCU(&res->resctrl_res.ctrl_domains);
+ INIT_LIST_HEAD_RCU(&res->resctrl_res.mon_domains);
+ res->resctrl_res.rid = rid;
+ }
+
+ /* TODO: pick MPAM classes to map to resctrl resources */
+
+ /* Initialise the resctrl structures from the classes */
+ for_each_mpam_resctrl_control(res, rid) {
+ if (!res->class)
+ continue; // dummy resource
+
+ err = mpam_resctrl_control_init(res);
+ if (err) {
+ pr_debug("Failed to initialise rid %u\n", rid);
+ break;
+ }
+ }
+ cpus_read_unlock();
+
+ if (err) {
+ pr_debug("Internal error %d - resctrl not supported\n", err);
+ return err;
+ }
+
+ if (!exposed_alloc_capable && !exposed_mon_capable) {
+ pr_debug("No alloc(%u) or monitor(%u) found - resctrl not supported\n",
+ exposed_alloc_capable, exposed_mon_capable);
+ return -EOPNOTSUPP;
+ }
+
+ /* TODO: call resctrl_init() */
+
+ return 0;
+}
diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h
index 7f00c5285a32..2c7d1413a401 100644
--- a/include/linux/arm_mpam.h
+++ b/include/linux/arm_mpam.h
@@ -49,6 +49,9 @@ static inline int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
}
#endif
+bool resctrl_arch_alloc_capable(void);
+bool resctrl_arch_mon_capable(void);
+
/**
* mpam_register_requestor() - Register a requestor with the MPAM driver
* @partid_max: The maximum PARTID value the requestor can generate.
--
2.43.0
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [PATCH v4 14/41] arm_mpam: resctrl: Sort the order of the domain lists
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (12 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 13/41] arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 15/41] arm_mpam: resctrl: Pick the caches we will use as resctrl resources Ben Horgan
` (28 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
resctrl documents that the domains appear in numeric order in the schemata
file. This means a little more work is needed when bringing a domain
online.
Add the support for this, using resctrl_find_domain() to find the point to
insert in the list.
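For illustration only, the ordering rule can be sketched outside the kernel.
This is a minimal sketch, assuming a simplified singly linked list in place
of the RCU-protected list_head the driver uses. struct dom_hdr and
dom_insert_sorted() are hypothetical names, not part of this patch or of
resctrl:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a domain header: just an id and a link. */
struct dom_hdr {
	int id;
	struct dom_hdr *next;
};

/*
 * Insert new_dom so that the list stays in ascending id order.
 * Returns 0 on success, or -1 if an entry with the same id already
 * exists (the patch treats a duplicate as a bug and warns).
 */
static int dom_insert_sorted(struct dom_hdr **head, struct dom_hdr *new_dom)
{
	struct dom_hdr **pos = head;

	assert(head && new_dom);

	/* Walk to the first entry whose id is not smaller than the new id. */
	while (*pos && (*pos)->id < new_dom->id)
		pos = &(*pos)->next;

	/* A matching id means the domain is already on the list. */
	if (*pos && (*pos)->id == new_dom->id)
		return -1;

	/* Link the new entry in front of the first larger id. */
	new_dom->next = *pos;
	*pos = new_dom;
	return 0;
}
```

Inserting ids 2, 0 and 1 in that order leaves the list as 0, 1, 2, which is
the order user-space would then see in the schemata file.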
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
drivers/resctrl/mpam_resctrl.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 4c2248c92955..90cdb5fdd3d6 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -127,6 +127,21 @@ static bool mpam_resctrl_offline_domain_hdr(unsigned int cpu,
return false;
}
+static void mpam_resctrl_domain_insert(struct list_head *list,
+ struct rdt_domain_hdr *new)
+{
+ struct rdt_domain_hdr *err;
+ struct list_head *pos = NULL;
+
+ lockdep_assert_held(&domain_list_lock);
+
+ err = resctrl_find_domain(list, new->id, &pos);
+ if (WARN_ON_ONCE(err))
+ return;
+
+ list_add_tail_rcu(&new->list, pos);
+}
+
static struct mpam_resctrl_dom *
mpam_resctrl_alloc_domain(unsigned int cpu, struct mpam_resctrl_res *res)
{
@@ -168,8 +183,7 @@ mpam_resctrl_alloc_domain(unsigned int cpu, struct mpam_resctrl_res *res)
if (err)
goto free_domain;
- /* TODO: this list should be sorted */
- list_add_tail_rcu(&ctrl_d->hdr.list, &r->ctrl_domains);
+ mpam_resctrl_domain_insert(&r->ctrl_domains, &ctrl_d->hdr);
} else {
pr_debug("Skipped control domain online - no controls\n");
}
@@ -182,8 +196,7 @@ mpam_resctrl_alloc_domain(unsigned int cpu, struct mpam_resctrl_res *res)
if (err)
goto offline_ctrl_domain;
- /* TODO: this list should be sorted */
- list_add_tail_rcu(&mon_d->hdr.list, &r->mon_domains);
+ mpam_resctrl_domain_insert(&r->mon_domains, &mon_d->hdr);
} else {
pr_debug("Skipped monitor domain online - no monitors\n");
}
--
2.43.0
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [PATCH v4 15/41] arm_mpam: resctrl: Pick the caches we will use as resctrl resources
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (13 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 14/41] arm_mpam: resctrl: Sort the order of the domain lists Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-10 23:39 ` Reinette Chatre
2026-02-03 21:43 ` [PATCH v4 16/41] arm_mpam: resctrl: Implement resctrl_arch_reset_all_ctrls() Ben Horgan
` (27 subsequent siblings)
42 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
Systems with MPAM support may have a variety of control types at any point
of their system layout. We can only expose certain types of control, and
only if they exist at particular locations.
Start with the well-known caches. These have to be depth 2 or 3 and support
MPAM's cache portion bitmap controls, with no more portions than fit in
resctrl's 32-bit bitmap.
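As a rough illustration of the selection rule, the checks can be collapsed
into one predicate. This is a sketch under stated assumptions: struct
cache_class and its fields are hypothetical simplifications of the driver's
class and property structures, and the "covers all CPUs" flag stands in for
the comparison against cpu_possible_mask made in the diff below:

```c
#include <stdbool.h>

/* Hypothetical, simplified description of an MPAM class. */
enum class_type { CLASS_CACHE, CLASS_MEMORY };

struct cache_class {
	enum class_type type;
	unsigned int level;	/* cache depth, e.g. 2 for L2 */
	bool has_cpor;		/* cache portion bitmap controls present */
	unsigned int cpbm_wd;	/* number of portions in the bitmap */
	bool covers_all_cpus;	/* class affinity spans every possible CPU */
};

/* Returns true if the class could back a resctrl L2 or L3 resource. */
static bool cache_class_is_usable(const struct cache_class *c)
{
	if (c->type != CLASS_CACHE)
		return false;

	if (c->level != 2 && c->level != 3)
		return false;

	/* resctrl stores bitmap configurations in a u32. */
	if (!c->has_cpor || c->cpbm_wd > 32)
		return false;

	return c->covers_all_cpus;
}
```

A 16-portion L3 that every CPU can reach passes. The same cache at depth 1,
or with a 64-bit portion bitmap, is skipped.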
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
Jonathan:
Remove brackets
Compress debug message
Use temp var, r
Changes since v2:
Return -EINVAL in mpam_resctrl_control_init() for unknown rid
---
drivers/resctrl/mpam_resctrl.c | 90 +++++++++++++++++++++++++++++++++-
1 file changed, 88 insertions(+), 2 deletions(-)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 90cdb5fdd3d6..a59d1659fe12 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -65,9 +65,94 @@ struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
return &mpam_resctrl_controls[l].resctrl_res;
}
+static bool cache_has_usable_cpor(struct mpam_class *class)
+{
+ struct mpam_props *cprops = &class->props;
+
+ if (!mpam_has_feature(mpam_feat_cpor_part, cprops))
+ return false;
+
+ /* resctrl uses u32 for all bitmap configurations */
+ return class->props.cpbm_wd <= 32;
+}
+
+/* Test whether we can export MPAM_CLASS_CACHE:{2,3}? */
+static void mpam_resctrl_pick_caches(void)
+{
+ struct mpam_class *class;
+ struct mpam_resctrl_res *res;
+
+ lockdep_assert_cpus_held();
+
+ guard(srcu)(&mpam_srcu);
+ list_for_each_entry_srcu(class, &mpam_classes, classes_list,
+ srcu_read_lock_held(&mpam_srcu)) {
+ if (class->type != MPAM_CLASS_CACHE) {
+ pr_debug("class %u is not a cache\n", class->level);
+ continue;
+ }
+
+ if (class->level != 2 && class->level != 3) {
+ pr_debug("class %u is not L2 or L3\n", class->level);
+ continue;
+ }
+
+ if (!cache_has_usable_cpor(class)) {
+ pr_debug("class %u cache misses CPOR\n", class->level);
+ continue;
+ }
+
+ if (!cpumask_equal(&class->affinity, cpu_possible_mask)) {
+ pr_debug("class %u has missing CPUs, mask %*pb != %*pb\n", class->level,
+ cpumask_pr_args(&class->affinity),
+ cpumask_pr_args(cpu_possible_mask));
+ continue;
+ }
+
+ if (class->level == 2)
+ res = &mpam_resctrl_controls[RDT_RESOURCE_L2];
+ else
+ res = &mpam_resctrl_controls[RDT_RESOURCE_L3];
+ res->class = class;
+ exposed_alloc_capable = true;
+ }
+}
+
static int mpam_resctrl_control_init(struct mpam_resctrl_res *res)
{
- /* TODO: initialise the resctrl resources */
+ struct mpam_class *class = res->class;
+ struct rdt_resource *r = &res->resctrl_res;
+
+ switch (r->rid) {
+ case RDT_RESOURCE_L2:
+ case RDT_RESOURCE_L3:
+ r->alloc_capable = true;
+ r->schema_fmt = RESCTRL_SCHEMA_BITMAP;
+ r->cache.arch_has_sparse_bitmasks = true;
+
+ r->cache.cbm_len = class->props.cpbm_wd;
+ /* mpam_devices will reject empty bitmaps */
+ r->cache.min_cbm_bits = 1;
+
+ if (r->rid == RDT_RESOURCE_L2) {
+ r->name = "L2";
+ r->ctrl_scope = RESCTRL_L2_CACHE;
+ } else {
+ r->name = "L3";
+ r->ctrl_scope = RESCTRL_L3_CACHE;
+ }
+
+ /*
+ * Which bits are shared with other ...things...
+ * Unknown devices use partid-0 which uses all the bitmap
+ * fields. Until we configure the SMMU and GIC not to do this
+ * 'all the bits' is the correct answer here.
+ */
+ r->cache.shareable_bits = resctrl_get_default_ctrl(r);
+ break;
+ default:
+ return -EINVAL;
+ }
return 0;
}
@@ -324,7 +409,8 @@ int mpam_resctrl_setup(void)
res->resctrl_res.rid = rid;
}
- /* TODO: pick MPAM classes to map to resctrl resources */
+ /* Find some classes to use for controls */
+ mpam_resctrl_pick_caches();
/* Initialise the resctrl structures from the classes */
for_each_mpam_resctrl_control(res, rid) {
--
2.43.0
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [PATCH v4 16/41] arm_mpam: resctrl: Implement resctrl_arch_reset_all_ctrls()
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (14 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 15/41] arm_mpam: resctrl: Pick the caches we will use as resctrl resources Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-13 3:32 ` Zeng Heng
2026-02-03 21:43 ` [PATCH v4 17/41] arm_mpam: resctrl: Add resctrl_arch_get_config() Ben Horgan
` (26 subsequent siblings)
42 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
We already have a helper for resetting an mpam class and component. Hook
it up to resctrl_arch_reset_all_ctrls() and the domain offline path.
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Cc: Zeng Heng <zengheng4@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v2:
Don't expose unlocked reset
Changes since v3:
Don't use or expose mpam_reset_component_locked()
---
drivers/resctrl/mpam_devices.c | 2 +-
drivers/resctrl/mpam_internal.h | 3 +++
drivers/resctrl/mpam_resctrl.c | 13 +++++++++++++
3 files changed, 17 insertions(+), 1 deletion(-)
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index b81d5c7f44ca..bf99c37a80a7 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -2568,7 +2568,7 @@ static void mpam_reset_component_locked(struct mpam_component *comp)
}
}
-static void mpam_reset_class_locked(struct mpam_class *class)
+void mpam_reset_class_locked(struct mpam_class *class)
{
struct mpam_component *comp;
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index a9d89d9e18a8..21ade1620147 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -389,6 +389,9 @@ extern u8 mpam_pmg_max;
void mpam_enable(struct work_struct *work);
void mpam_disable(struct work_struct *work);
+/* Reset all the RIS in a class under cpus_read_lock() */
+void mpam_reset_class_locked(struct mpam_class *class);
+
int mpam_apply_config(struct mpam_component *comp, u16 partid,
struct mpam_config *cfg);
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index a59d1659fe12..7fc0b515cfa4 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -169,6 +169,19 @@ static int mpam_resctrl_pick_domain_id(int cpu, struct mpam_component *comp)
return comp->comp_id;
}
+void resctrl_arch_reset_all_ctrls(struct rdt_resource *r)
+{
+ struct mpam_resctrl_res *res;
+
+ lockdep_assert_cpus_held();
+
+ if (!mpam_is_enabled())
+ return;
+
+ res = container_of(r, struct mpam_resctrl_res, resctrl_res);
+ mpam_reset_class_locked(res->class);
+}
+
static void mpam_resctrl_domain_hdr_init(int cpu, struct mpam_component *comp,
struct rdt_domain_hdr *hdr)
{
--
2.43.0
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [PATCH v4 17/41] arm_mpam: resctrl: Add resctrl_arch_get_config()
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (15 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 16/41] arm_mpam: resctrl: Implement resctrl_arch_reset_all_ctrls() Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 18/41] arm_mpam: resctrl: Implement helpers to update configuration Ben Horgan
` (25 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
Implement resctrl_arch_get_config() by testing the live configuration for a
CPOR bitmap. For any other configuration type return the default.
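The fallback behaviour can be sketched in isolation. This is a minimal
sketch, assuming the default control for a bitmap resource is "all portions
set". struct part_cfg, default_ctrl() and get_cache_config() are
hypothetical names used only for this example:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-PARTID configuration: a flag plus the portion bitmap. */
struct part_cfg {
	bool cpor_set;		/* has a CPOR bitmap been configured? */
	uint32_t cpbm;		/* the configured portion bitmap */
};

/* Assumed default for a bitmap control: every portion enabled. */
static uint32_t default_ctrl(unsigned int cbm_len)
{
	return cbm_len >= 32 ? UINT32_MAX : (UINT32_C(1) << cbm_len) - 1;
}

/*
 * Return the live bitmap for partid if one has been configured,
 * otherwise fall back to the default.
 */
static uint32_t get_cache_config(const struct part_cfg *cfg,
				 uint32_t num_partid, uint32_t partid,
				 unsigned int cbm_len)
{
	if (partid >= num_partid || !cfg[partid].cpor_set)
		return default_ctrl(cbm_len);

	return cfg[partid].cpbm;
}
```

With an 8-portion cache, a PARTID that was never configured reads back as
0xff, and one configured with 0x0f reads back as 0x0f.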
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
drivers/resctrl/mpam_resctrl.c | 43 ++++++++++++++++++++++++++++++++++
1 file changed, 43 insertions(+)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 7fc0b515cfa4..ecf00386edca 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -169,6 +169,49 @@ static int mpam_resctrl_pick_domain_id(int cpu, struct mpam_component *comp)
return comp->comp_id;
}
+u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
+ u32 closid, enum resctrl_conf_type type)
+{
+ u32 partid;
+ struct mpam_config *cfg;
+ struct mpam_props *cprops;
+ struct mpam_resctrl_res *res;
+ struct mpam_resctrl_dom *dom;
+ enum mpam_device_features configured_by;
+
+ lockdep_assert_cpus_held();
+
+ if (!mpam_is_enabled())
+ return resctrl_get_default_ctrl(r);
+
+ res = container_of(r, struct mpam_resctrl_res, resctrl_res);
+ dom = container_of(d, struct mpam_resctrl_dom, resctrl_ctrl_dom);
+ cprops = &res->class->props;
+
+ partid = resctrl_get_config_index(closid, type);
+ cfg = &dom->ctrl_comp->cfg[partid];
+
+ switch (r->rid) {
+ case RDT_RESOURCE_L2:
+ case RDT_RESOURCE_L3:
+ configured_by = mpam_feat_cpor_part;
+ break;
+ default:
+ return resctrl_get_default_ctrl(r);
+ }
+
+ if (!r->alloc_capable || partid >= resctrl_arch_get_num_closid(r) ||
+ !mpam_has_feature(configured_by, cfg))
+ return resctrl_get_default_ctrl(r);
+
+ switch (configured_by) {
+ case mpam_feat_cpor_part:
+ return cfg->cpbm;
+ default:
+ return resctrl_get_default_ctrl(r);
+ }
+}
+
void resctrl_arch_reset_all_ctrls(struct rdt_resource *r)
{
struct mpam_resctrl_res *res;
--
2.43.0
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [PATCH v4 18/41] arm_mpam: resctrl: Implement helpers to update configuration
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (16 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 17/41] arm_mpam: resctrl: Add resctrl_arch_get_config() Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-14 10:39 ` Zeng Heng
2026-02-03 21:43 ` [PATCH v4 19/41] arm_mpam: resctrl: Add plumbing against arm64 task and cpu hooks Ben Horgan
` (24 subsequent siblings)
42 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
resctrl has two helpers for updating the configuration.
resctrl_arch_update_one() updates a single value, and is used by the
software-controller to apply feedback to the bandwidth controls; it has to
be called on one of the CPUs in the resctrl:domain.
resctrl_arch_update_domains() copies multiple staged configurations; it can
be called from anywhere.
Both helpers should apply any changes to the underlying hardware.
Implement resctrl_arch_update_domains() to use
resctrl_arch_update_one(). Neither needs to be called on a specific CPU as
the mpam driver will send IPIs as needed.
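The relationship between the two helpers can be sketched as a loop over
staged values. This is a sketch under assumptions: the types below are
hypothetical simplifications of resctrl's staged configuration, and
update_one() only records the value instead of programming hardware:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical configuration types, mirroring none/code/data. */
enum stage_type { STAGE_NONE, STAGE_CODE, STAGE_DATA, STAGE_NUM_TYPES };

struct staged_cfg {
	bool have_new_ctrl;	/* is there a pending value? */
	uint32_t new_ctrl;	/* the pending value */
};

struct domain {
	struct staged_cfg staged[STAGE_NUM_TYPES];
	uint32_t applied[STAGE_NUM_TYPES];	/* stand-in for hardware state */
};

/* Apply one value. An empty bitmap is rejected in this sketch. */
static int update_one(struct domain *d, enum stage_type t, uint32_t val)
{
	if (val == 0)
		return -1;

	d->applied[t] = val;
	return 0;
}

/* Apply every staged value in every domain, stopping at the first error. */
static int update_domains(struct domain *doms, size_t ndoms)
{
	for (size_t i = 0; i < ndoms; i++) {
		for (int t = 0; t < STAGE_NUM_TYPES; t++) {
			const struct staged_cfg *s = &doms[i].staged[t];
			int err;

			if (!s->have_new_ctrl)
				continue;

			err = update_one(&doms[i], (enum stage_type)t,
					 s->new_ctrl);
			if (err)
				return err;
		}
	}

	return 0;
}
```

Entries without a pending value are left untouched, so a caller only pays
for the configurations that actually changed.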
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
list_for_each_entry -> list_for_each_entry_rcu
return 0
Restrict scope of local variables
Changes since v2:
whitespace fix
---
drivers/resctrl/mpam_resctrl.c | 70 ++++++++++++++++++++++++++++++++++
1 file changed, 70 insertions(+)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index ecf00386edca..48d047510089 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -212,6 +212,76 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
}
}
+int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
+ u32 closid, enum resctrl_conf_type t, u32 cfg_val)
+{
+ u32 partid;
+ struct mpam_config cfg;
+ struct mpam_props *cprops;
+ struct mpam_resctrl_res *res;
+ struct mpam_resctrl_dom *dom;
+
+ lockdep_assert_cpus_held();
+ lockdep_assert_irqs_enabled();
+
+ /*
+ * No need to check the CPU as mpam_apply_config() doesn't care, and
+ * resctrl_arch_update_domains() relies on this.
+ */
+ res = container_of(r, struct mpam_resctrl_res, resctrl_res);
+ dom = container_of(d, struct mpam_resctrl_dom, resctrl_ctrl_dom);
+ cprops = &res->class->props;
+
+ partid = resctrl_get_config_index(closid, t);
+ if (!r->alloc_capable || partid >= resctrl_arch_get_num_closid(r)) {
+ pr_debug("Not alloc capable or computed PARTID out of range\n");
+ return -EINVAL;
+ }
+
+ /*
+ * Copy the current config to avoid clearing other resources when the
+ * same component is exposed multiple times through resctrl.
+ */
+ cfg = dom->ctrl_comp->cfg[partid];
+
+ switch (r->rid) {
+ case RDT_RESOURCE_L2:
+ case RDT_RESOURCE_L3:
+ cfg.cpbm = cfg_val;
+ mpam_set_feature(mpam_feat_cpor_part, &cfg);
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return mpam_apply_config(dom->ctrl_comp, partid, &cfg);
+}
+
+int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
+{
+ int err;
+ struct rdt_ctrl_domain *d;
+
+ lockdep_assert_cpus_held();
+ lockdep_assert_irqs_enabled();
+
+ list_for_each_entry_rcu(d, &r->ctrl_domains, hdr.list) {
+ for (enum resctrl_conf_type t = 0; t < CDP_NUM_TYPES; t++) {
+ struct resctrl_staged_config *cfg = &d->staged_config[t];
+
+ if (!cfg->have_new_ctrl)
+ continue;
+
+ err = resctrl_arch_update_one(r, d, closid, t,
+ cfg->new_ctrl);
+ if (err)
+ return err;
+ }
+ }
+
+ return 0;
+}
+
void resctrl_arch_reset_all_ctrls(struct rdt_resource *r)
{
struct mpam_resctrl_res *res;
--
2.43.0
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [PATCH v4 19/41] arm_mpam: resctrl: Add plumbing against arm64 task and cpu hooks
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (17 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 18/41] arm_mpam: resctrl: Implement helpers to update configuration Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 20/41] arm_mpam: resctrl: Add CDP emulation Ben Horgan
` (23 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
arm64 provides helpers for changing a task's and a cpu's mpam partid/pmg
values.
These are used to back a number of resctrl_arch_ functions. Connect them
up.
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v2:
apostrophes in commit message
---
drivers/resctrl/mpam_resctrl.c | 58 ++++++++++++++++++++++++++++++++++
include/linux/arm_mpam.h | 5 +++
2 files changed, 63 insertions(+)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 48d047510089..cd52ca279651 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -8,6 +8,7 @@
#include <linux/cpu.h>
#include <linux/cpumask.h>
#include <linux/errno.h>
+#include <linux/limits.h>
#include <linux/list.h>
#include <linux/printk.h>
#include <linux/rculist.h>
@@ -37,6 +38,8 @@ static DEFINE_MUTEX(domain_list_lock);
static bool exposed_alloc_capable;
static bool exposed_mon_capable;
+static bool cdp_enabled;
+
bool resctrl_arch_alloc_capable(void)
{
return exposed_alloc_capable;
@@ -57,6 +60,61 @@ u32 resctrl_arch_get_num_closid(struct rdt_resource *ignored)
return mpam_partid_max + 1;
}
+void resctrl_arch_sched_in(struct task_struct *tsk)
+{
+ lockdep_assert_preemption_disabled();
+
+ mpam_thread_switch(tsk);
+}
+
+void resctrl_arch_set_cpu_default_closid_rmid(int cpu, u32 closid, u32 rmid)
+{
+ WARN_ON_ONCE(closid > U16_MAX);
+ WARN_ON_ONCE(rmid > U8_MAX);
+
+ if (!cdp_enabled) {
+ mpam_set_cpu_defaults(cpu, closid, closid, rmid, rmid);
+ } else {
+ /*
+ * When CDP is enabled, resctrl halves the closid range and we
+ * use odd/even partid for one closid.
+ */
+ u32 partid_d = resctrl_get_config_index(closid, CDP_DATA);
+ u32 partid_i = resctrl_get_config_index(closid, CDP_CODE);
+
+ mpam_set_cpu_defaults(cpu, partid_d, partid_i, rmid, rmid);
+ }
+}
+
+void resctrl_arch_sync_cpu_closid_rmid(void *info)
+{
+ struct resctrl_cpu_defaults *r = info;
+
+ lockdep_assert_preemption_disabled();
+
+ if (r) {
+ resctrl_arch_set_cpu_default_closid_rmid(smp_processor_id(),
+ r->closid, r->rmid);
+ }
+
+ resctrl_arch_sched_in(current);
+}
+
+void resctrl_arch_set_closid_rmid(struct task_struct *tsk, u32 closid, u32 rmid)
+{
+ WARN_ON_ONCE(closid > U16_MAX);
+ WARN_ON_ONCE(rmid > U8_MAX);
+
+ if (!cdp_enabled) {
+ mpam_set_task_partid_pmg(tsk, closid, closid, rmid, rmid);
+ } else {
+ u32 partid_d = resctrl_get_config_index(closid, CDP_DATA);
+ u32 partid_i = resctrl_get_config_index(closid, CDP_CODE);
+
+ mpam_set_task_partid_pmg(tsk, partid_d, partid_i, rmid, rmid);
+ }
+}
+
struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
{
if (l >= RDT_NUM_RESOURCES)
diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h
index 2c7d1413a401..5a78299ec464 100644
--- a/include/linux/arm_mpam.h
+++ b/include/linux/arm_mpam.h
@@ -52,6 +52,11 @@ static inline int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
bool resctrl_arch_alloc_capable(void);
bool resctrl_arch_mon_capable(void);
+void resctrl_arch_set_cpu_default_closid(int cpu, u32 closid);
+void resctrl_arch_set_closid_rmid(struct task_struct *tsk, u32 closid, u32 rmid);
+void resctrl_arch_set_cpu_default_closid_rmid(int cpu, u32 closid, u32 rmid);
+void resctrl_arch_sched_in(struct task_struct *tsk);
+
/**
* mpam_register_requestor() - Register a requestor with the MPAM driver
* @partid_max: The maximum PARTID value the requestor can generate.
--
2.43.0
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [PATCH v4 20/41] arm_mpam: resctrl: Add CDP emulation
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (18 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 19/41] arm_mpam: resctrl: Add plumbing against arm64 task and cpu hooks Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-09 1:16 ` Fenghua Yu
2026-02-03 21:43 ` [PATCH v4 21/41] arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats Ben Horgan
` (22 subsequent siblings)
42 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan, Dave Martin
From: James Morse <james.morse@arm.com>
Intel RDT's CDP feature allows the cache to use a different control value
depending on whether the access was an instruction fetch or a data
access. MPAM's equivalent feature is the other way up: the CPU assigns a
different partid label to traffic depending on whether it was an instruction
fetch or a data access, which causes the cache to use a different control
value based solely on the partid.
MPAM can emulate CDP, with the side effect that the alternative partid is
seen by all MSC; it can't be enabled per-MSC.
Add the resctrl hooks to turn this on or off. Add the helpers that match a
closid against a task, which need to be aware that the value written to
hardware is not the same as the one resctrl is using.
Update the 'arm64_mpam_global_default' variable the arch code uses during
context switch to know when the per-cpu value should be used instead. Also,
update these per-cpu values and sync the resulting mpam partid/pmg
configuration to hardware.
Awkwardly, the MB controls don't implement CDP. To emulate this, the MPAM
equivalent needs programming twice by the resctrl glue, as resctrl expects
the bandwidth controls to be applied independently for both data and
instruction-fetch.
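To make the closid to partid relationship concrete, here is a small sketch.
It assumes the usual resctrl convention that, with CDP on, the data
configuration of a closid sits at index 2 * closid and the code
configuration at 2 * closid + 1. config_index() and closid_to_partids() are
hypothetical helpers written for this example and are not the kernel's
functions:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical configuration types for this example. */
enum cdp_type { CDP_TYPE_NONE, CDP_TYPE_CODE, CDP_TYPE_DATA };

/* Assumed mapping from (closid, type) to a configuration index. */
static uint32_t config_index(uint32_t closid, enum cdp_type t)
{
	switch (t) {
	case CDP_TYPE_CODE:
		return closid * 2 + 1;
	case CDP_TYPE_DATA:
		return closid * 2;
	default:
		return closid;
	}
}

/* The pair of PARTIDs a CPU would use for data and instruction traffic. */
struct partid_pair {
	uint32_t data;
	uint32_t instr;
};

static struct partid_pair closid_to_partids(uint32_t closid, bool cdp_enabled)
{
	struct partid_pair p;

	if (!cdp_enabled) {
		/* Without CDP both traffic types carry the same PARTID. */
		p.data = closid;
		p.instr = closid;
	} else {
		/* With CDP one closid consumes an even/odd PARTID pair. */
		p.data = config_index(closid, CDP_TYPE_DATA);
		p.instr = config_index(closid, CDP_TYPE_CODE);
	}

	return p;
}
```

Under this assumption closid 0 needs PARTIDs 0 and 1 once CDP is enabled,
which is why the patch refuses to enable CDP when only one PARTID exists.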
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
CC: Dave Martin <Dave.Martin@arm.com>
CC: Amit Singh Tomar <amitsinght@marvell.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
Fail cdp initialisation if there is only one partid
Correct data/code confusion
Changes since v2:
Don't include unused header
Changes since v3:
Update the per-cpu values and sync to h/w
---
arch/arm64/include/asm/mpam.h | 1 +
drivers/resctrl/mpam_resctrl.c | 117 +++++++++++++++++++++++++++++++++
include/linux/arm_mpam.h | 2 +
3 files changed, 120 insertions(+)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h
index 05aa71200f61..70d396e7b6da 100644
--- a/arch/arm64/include/asm/mpam.h
+++ b/arch/arm64/include/asm/mpam.h
@@ -4,6 +4,7 @@
#ifndef __ASM__MPAM_H
#define __ASM__MPAM_H
+#include <linux/arm_mpam.h>
#include <linux/bitfield.h>
#include <linux/jump_label.h>
#include <linux/percpu.h>
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index cd52ca279651..12017264530a 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -38,6 +38,10 @@ static DEFINE_MUTEX(domain_list_lock);
static bool exposed_alloc_capable;
static bool exposed_mon_capable;
+/*
+ * MPAM emulates CDP by setting different PARTID in the I/D fields of MPAM0_EL1.
+ * This applies globally to all traffic the CPU generates.
+ */
static bool cdp_enabled;
bool resctrl_arch_alloc_capable(void)
@@ -50,6 +54,72 @@ bool resctrl_arch_mon_capable(void)
return exposed_mon_capable;
}
+bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level rid)
+{
+ switch (rid) {
+ case RDT_RESOURCE_L2:
+ case RDT_RESOURCE_L3:
+ return cdp_enabled;
+ case RDT_RESOURCE_MBA:
+ default:
+ /*
+ * x86's MBA control doesn't support CDP, so user-space doesn't
+ * expect it.
+ */
+ return false;
+ }
+}
+
+/**
+ * resctrl_reset_task_closids() - Reset the PARTID/PMG values for all tasks.
+ *
+ * At boot, all existing tasks use partid zero for D and I.
+ * To enable/disable CDP emulation, all these tasks need relabelling.
+ */
+static void resctrl_reset_task_closids(void)
+{
+ struct task_struct *p, *t;
+
+ read_lock(&tasklist_lock);
+ for_each_process_thread(p, t) {
+ resctrl_arch_set_closid_rmid(t, RESCTRL_RESERVED_CLOSID,
+ RESCTRL_RESERVED_RMID);
+ }
+ read_unlock(&tasklist_lock);
+}
+
+int resctrl_arch_set_cdp_enabled(enum resctrl_res_level ignored, bool enable)
+{
+ u32 partid_i = RESCTRL_RESERVED_CLOSID, partid_d = RESCTRL_RESERVED_CLOSID;
+ int cpu;
+
+ cdp_enabled = enable;
+
+ if (enable) {
+ if (mpam_partid_max < 1)
+ return -EINVAL;
+
+ partid_d = resctrl_get_config_index(RESCTRL_RESERVED_CLOSID, CDP_DATA);
+ partid_i = resctrl_get_config_index(RESCTRL_RESERVED_CLOSID, CDP_CODE);
+ }
+
+ mpam_set_task_partid_pmg(current, partid_d, partid_i, 0, 0);
+ WRITE_ONCE(arm64_mpam_global_default, mpam_get_regval(current));
+
+ resctrl_reset_task_closids();
+
+ for_each_possible_cpu(cpu)
+ mpam_set_cpu_defaults(cpu, partid_d, partid_i, 0, 0);
+ on_each_cpu(resctrl_arch_sync_cpu_closid_rmid, NULL, 1);
+
+ return 0;
+}
+
+static bool mpam_resctrl_hide_cdp(enum resctrl_res_level rid)
+{
+ return cdp_enabled && !resctrl_arch_get_cdp_enabled(rid);
+}
+
/*
* MSC may raise an error interrupt if it sees an out or range partid/pmg,
* and go on to truncate the value. Regardless of what the hardware supports,
@@ -115,6 +185,30 @@ void resctrl_arch_set_closid_rmid(struct task_struct *tsk, u32 closid, u32 rmid)
}
}
+bool resctrl_arch_match_closid(struct task_struct *tsk, u32 closid)
+{
+ u64 regval = mpam_get_regval(tsk);
+ u32 tsk_closid = FIELD_GET(MPAM0_EL1_PARTID_D, regval);
+
+ if (cdp_enabled)
+ tsk_closid >>= 1;
+
+ return tsk_closid == closid;
+}
+
+/* The task's pmg is not unique, the partid must be considered too */
+bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 closid, u32 rmid)
+{
+ u64 regval = mpam_get_regval(tsk);
+ u32 tsk_closid = FIELD_GET(MPAM0_EL1_PARTID_D, regval);
+ u32 tsk_rmid = FIELD_GET(MPAM0_EL1_PMG_D, regval);
+
+ if (cdp_enabled)
+ tsk_closid >>= 1;
+
+ return (tsk_closid == closid) && (tsk_rmid == rmid);
+}
+
struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
{
if (l >= RDT_NUM_RESOURCES)
@@ -246,6 +340,14 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
dom = container_of(d, struct mpam_resctrl_dom, resctrl_ctrl_dom);
cprops = &res->class->props;
+ /*
+ * When CDP is enabled, but the resource doesn't support it,
+ * the control is cloned across both partids.
+ * Pick one arbitrarily to read:
+ */
+ if (mpam_resctrl_hide_cdp(r->rid))
+ type = CDP_DATA;
+
partid = resctrl_get_config_index(closid, type);
cfg = &dom->ctrl_comp->cfg[partid];
@@ -273,6 +375,7 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
u32 closid, enum resctrl_conf_type t, u32 cfg_val)
{
+ int err;
u32 partid;
struct mpam_config cfg;
struct mpam_props *cprops;
@@ -312,6 +415,20 @@ int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
return -EINVAL;
}
+ /*
+ * When CDP is enabled, but the resource doesn't support it, we need to
+ * apply the same configuration to the other partid.
+ */
+ if (mpam_resctrl_hide_cdp(r->rid)) {
+ partid = resctrl_get_config_index(closid, CDP_CODE);
+ err = mpam_apply_config(dom->ctrl_comp, partid, &cfg);
+ if (err)
+ return err;
+
+ partid = resctrl_get_config_index(closid, CDP_DATA);
+ return mpam_apply_config(dom->ctrl_comp, partid, &cfg);
+ }
+
return mpam_apply_config(dom->ctrl_comp, partid, &cfg);
}
diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h
index 5a78299ec464..d329b1dc148b 100644
--- a/include/linux/arm_mpam.h
+++ b/include/linux/arm_mpam.h
@@ -56,6 +56,8 @@ void resctrl_arch_set_cpu_default_closid(int cpu, u32 closid);
void resctrl_arch_set_closid_rmid(struct task_struct *tsk, u32 closid, u32 rmid);
void resctrl_arch_set_cpu_default_closid_rmid(int cpu, u32 closid, u32 rmid);
void resctrl_arch_sched_in(struct task_struct *tsk);
+bool resctrl_arch_match_closid(struct task_struct *tsk, u32 closid);
+bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 closid, u32 rmid);
/**
* mpam_register_requestor() - Register a requestor with the MPAM driver
--
2.43.0
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [PATCH v4 21/41] arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (19 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 20/41] arm_mpam: resctrl: Add CDP emulation Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 22/41] arm_mpam: resctrl: Add kunit test for control format conversions Ben Horgan
` (21 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Dave Martin, Shaopeng Tan
From: Dave Martin <Dave.Martin@arm.com>
MPAM uses fixed-point formats for some hardware controls. Resctrl
provides the bandwidth controls as a percentage. Add helpers to convert
between these.
Ensure bwa_wd is at most 16 to make it clear higher values have no meaning.
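For readers who want to try the arithmetic outside the kernel, here is a user-space sketch of the two conversions. It mirrors the helpers added by this patch, but takes the field width as a plain argument instead of a struct mpam_props pointer; div_round_closest() is a hypothetical local substitute for the kernel's DIV_ROUND_CLOSEST() and only handles unsigned operands.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_MBA_BW 100u	/* resctrl expresses bandwidth as a percentage */

/* Unsigned round-to-nearest division (stand-in for DIV_ROUND_CLOSEST()). */
static uint32_t div_round_closest(uint32_t n, uint32_t d)
{
	return (n + d / 2) / d;
}

/* Percentage to a 16-bit field in which only the top 'wd' bits are valid. */
static uint16_t percent_to_mbw_max(uint8_t pc, unsigned int wd)
{
	uint32_t val = pc;

	assert(wd >= 1 && wd <= 16);

	val <<= wd;				/* scale to (1 << wd) bands */
	val = div_round_closest(val, MAX_MBA_BW);
	val = (val > 1 ? val : 1) - 1;		/* band N has upper bound N + 1 */
	val <<= 16 - wd;			/* valid bits are leftmost */

	return (uint16_t)val;
}

/* Nearest percentage to the upper bound of the band selected by mbw_max. */
static uint32_t mbw_max_to_percent(uint16_t mbw_max, unsigned int wd)
{
	uint32_t val = mbw_max;

	assert(wd >= 1 && wd <= 16);

	val >>= 16 - wd;			/* drop the ignored low bits */
	val += 1;				/* upper bound of the band */
	val *= MAX_MBA_BW;

	return div_round_closest(val, 1u << wd);
}
```

With an 8-bit field, 50% becomes band 127 (0x7f00 once left-aligned), and 0x7f00 converts back to 50%. With a 16-bit field, 100% becomes 0xffff.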
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v2:
Ensure bwa_wd is at most 16 (moved from patch 40: arm_mpam: Generate a
configuration for min controls)
Expand comments
---
drivers/resctrl/mpam_devices.c | 7 +++++
drivers/resctrl/mpam_resctrl.c | 51 ++++++++++++++++++++++++++++++++++
2 files changed, 58 insertions(+)
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index bf99c37a80a7..965384c667f6 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -713,6 +713,13 @@ static void mpam_ris_hw_probe(struct mpam_msc_ris *ris)
mpam_set_feature(mpam_feat_mbw_part, props);
props->bwa_wd = FIELD_GET(MPAMF_MBW_IDR_BWA_WD, mbw_features);
+
+ /*
+ * The BWA_WD field can represent 0-63, but the control fields it
+ * describes have a maximum of 16 bits.
+ */
+ props->bwa_wd = min(props->bwa_wd, 16);
+
if (props->bwa_wd && FIELD_GET(MPAMF_MBW_IDR_HAS_MAX, mbw_features))
mpam_set_feature(mpam_feat_mbw_max, props);
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 12017264530a..ece46e0d2ab3 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -10,6 +10,7 @@
#include <linux/errno.h>
#include <linux/limits.h>
#include <linux/list.h>
+#include <linux/math.h>
#include <linux/printk.h>
#include <linux/rculist.h>
#include <linux/resctrl.h>
@@ -228,6 +229,56 @@ static bool cache_has_usable_cpor(struct mpam_class *class)
return class->props.cpbm_wd <= 32;
}
+/*
+ * Each fixed-point hardware value architecturally represents a range
+ * of values: the full range 0% - 100% is split contiguously into
+ * (1 << cprops->bwa_wd) equal bands.
+ *
+ * Although the BWA_WD field has 6 bits, the maximum valid value is 16
+ * as it reports the width of fields that are at most 16 bits. When
+ * fewer than 16 bits are valid the least significant bits are
+ * ignored. The implied binary point is kept between bits 15 and 16 and
+ * so the valid bits are leftmost.
+ *
+ * See ARM IHI0099B.a "MPAM system component specification", Section 9.3,
+ * "The fixed-point fractional format" for more information.
+ *
+ * Find the nearest percentage value to the upper bound of the selected band:
+ */
+static u32 mbw_max_to_percent(u16 mbw_max, struct mpam_props *cprops)
+{
+ u32 val = mbw_max;
+
+ val >>= 16 - cprops->bwa_wd;
+ val += 1;
+ val *= MAX_MBA_BW;
+ val = DIV_ROUND_CLOSEST(val, 1 << cprops->bwa_wd);
+
+ return val;
+}
+
+/*
+ * Find the band whose upper bound is closest to the specified percentage.
+ *
+ * A round-to-nearest policy is followed here as a balanced compromise
+ * between unexpected under-commit of the resource (where the total of
+ * a set of resource allocations after conversion is less than the
+ * expected total, due to rounding of the individual converted
+ * percentages) and over-commit (where the total of the converted
+ * allocations is greater than expected).
+ */
+static u16 percent_to_mbw_max(u8 pc, struct mpam_props *cprops)
+{
+ u32 val = pc;
+
+ val <<= cprops->bwa_wd;
+ val = DIV_ROUND_CLOSEST(val, MAX_MBA_BW);
+ val = max(val, 1) - 1;
+ val <<= 16 - cprops->bwa_wd;
+
+ return val;
+}
+
/* Test whether we can export MPAM_CLASS_CACHE:{2,3}? */
static void mpam_resctrl_pick_caches(void)
{
--
2.43.0
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [PATCH v4 22/41] arm_mpam: resctrl: Add kunit test for control format conversions
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (20 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 21/41] arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 23/41] arm_mpam: resctrl: Add rmid index helpers Ben Horgan
` (20 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Dave Martin, Shaopeng Tan
From: Dave Martin <Dave.Martin@arm.com>
resctrl specifies the format of the control schemes, and these don't match
the hardware.
Some of the conversions are a bit hairy - add some kunit tests.
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
[morse: squashed enough of Dave's fixes in here that it's his patch now!]
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v2:
Include additional values from the latest spec
---
drivers/resctrl/mpam_resctrl.c | 4 +
drivers/resctrl/test_mpam_resctrl.c | 315 ++++++++++++++++++++++++++++
2 files changed, 319 insertions(+)
create mode 100644 drivers/resctrl/test_mpam_resctrl.c
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index ece46e0d2ab3..183428e2d38c 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -792,3 +792,7 @@ int mpam_resctrl_setup(void)
return 0;
}
+
+#ifdef CONFIG_MPAM_KUNIT_TEST
+#include "test_mpam_resctrl.c"
+#endif
diff --git a/drivers/resctrl/test_mpam_resctrl.c b/drivers/resctrl/test_mpam_resctrl.c
new file mode 100644
index 000000000000..b93d6ad87e43
--- /dev/null
+++ b/drivers/resctrl/test_mpam_resctrl.c
@@ -0,0 +1,315 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2025 Arm Ltd.
+/* This file is intended to be included into mpam_resctrl.c */
+
+#include <kunit/test.h>
+#include <linux/array_size.h>
+#include <linux/bits.h>
+#include <linux/math.h>
+#include <linux/sprintf.h>
+
+struct percent_value_case {
+ u8 pc;
+ u8 width;
+ u16 value;
+};
+
+/*
+ * Mysterious inscriptions taken from the union of ARM DDI 0598D.b,
+ * "Arm Architecture Reference Manual Supplement - Memory System
+ * Resource Partitioning and Monitoring (MPAM), for A-profile
+ * architecture", Section 9.8, "About the fixed-point fractional
+ * format" (exact percentage entries only) and ARM IHI0099B.a
+ * "MPAM system component specification", Section 9.3,
+ * "The fixed-point fractional format":
+ */
+static const struct percent_value_case percent_value_cases[] = {
+ /* Architectural cases: */
+ { 1, 8, 1 }, { 1, 12, 0x27 }, { 1, 16, 0x28e },
+ { 25, 8, 0x3f }, { 25, 12, 0x3ff }, { 25, 16, 0x3fff },
+ { 33, 8, 0x53 }, { 33, 12, 0x546 }, { 33, 16, 0x5479 },
+ { 35, 8, 0x58 }, { 35, 12, 0x598 }, { 35, 16, 0x5998 },
+ { 45, 8, 0x72 }, { 45, 12, 0x732 }, { 45, 16, 0x7332 },
+ { 50, 8, 0x7f }, { 50, 12, 0x7ff }, { 50, 16, 0x7fff },
+ { 52, 8, 0x84 }, { 52, 12, 0x850 }, { 52, 16, 0x851d },
+ { 55, 8, 0x8b }, { 55, 12, 0x8cb }, { 55, 16, 0x8ccb },
+ { 58, 8, 0x93 }, { 58, 12, 0x946 }, { 58, 16, 0x9479 },
+ { 75, 8, 0xbf }, { 75, 12, 0xbff }, { 75, 16, 0xbfff },
+ { 80, 8, 0xcb }, { 80, 12, 0xccb }, { 80, 16, 0xcccb },
+ { 88, 8, 0xe0 }, { 88, 12, 0xe13 }, { 88, 16, 0xe146 },
+ { 95, 8, 0xf2 }, { 95, 12, 0xf32 }, { 95, 16, 0xf332 },
+ { 100, 8, 0xff }, { 100, 12, 0xfff }, { 100, 16, 0xffff },
+};
+
+static void test_percent_value_desc(const struct percent_value_case *param,
+ char *desc)
+{
+ snprintf(desc, KUNIT_PARAM_DESC_SIZE,
+ "pc=%d, width=%d, value=0x%.*x\n",
+ param->pc, param->width,
+ DIV_ROUND_UP(param->width, 4), param->value);
+}
+
+KUNIT_ARRAY_PARAM(test_percent_value, percent_value_cases,
+ test_percent_value_desc);
+
+struct percent_value_test_info {
+ u32 pc; /* result of value-to-percent conversion */
+ u32 value; /* result of percent-to-value conversion */
+ u32 max_value; /* maximum raw value allowed by test params */
+ unsigned int shift; /* promotes raw testcase value to 16 bits */
+};
+
+/*
+ * Convert a reference percentage to a fixed-point MAX value and
+ * vice-versa, based on param (not test->param_value!)
+ */
+static void __prepare_percent_value_test(struct kunit *test,
+ struct percent_value_test_info *res,
+ const struct percent_value_case *param)
+{
+ struct mpam_props fake_props = { };
+
+ /* Reject bogus test parameters that would break the tests: */
+ KUNIT_ASSERT_GE(test, param->width, 1);
+ KUNIT_ASSERT_LE(test, param->width, 16);
+ KUNIT_ASSERT_LT(test, param->value, 1 << param->width);
+
+ mpam_set_feature(mpam_feat_mbw_max, &fake_props);
+ fake_props.bwa_wd = param->width;
+
+ res->shift = 16 - param->width;
+ res->max_value = GENMASK_U32(param->width - 1, 0);
+ res->value = percent_to_mbw_max(param->pc, &fake_props);
+ res->pc = mbw_max_to_percent(param->value << res->shift, &fake_props);
+}
+
+static void test_get_mba_granularity(struct kunit *test)
+{
+ int ret;
+ struct mpam_props fake_props = { };
+
+ /* Use MBW_MAX */
+ mpam_set_feature(mpam_feat_mbw_max, &fake_props);
+
+ fake_props.bwa_wd = 0;
+ KUNIT_EXPECT_FALSE(test, mba_class_use_mbw_max(&fake_props));
+
+ fake_props.bwa_wd = 1;
+ KUNIT_EXPECT_TRUE(test, mba_class_use_mbw_max(&fake_props));
+
+ /* Architectural maximum: */
+ fake_props.bwa_wd = 16;
+ KUNIT_EXPECT_TRUE(test, mba_class_use_mbw_max(&fake_props));
+
+ /* No usable control... */
+ fake_props.bwa_wd = 0;
+ ret = get_mba_granularity(&fake_props);
+ KUNIT_EXPECT_EQ(test, ret, 0);
+
+ fake_props.bwa_wd = 1;
+ ret = get_mba_granularity(&fake_props);
+ KUNIT_EXPECT_EQ(test, ret, 50); /* DIV_ROUND_UP(100, 1 << 1)% = 50% */
+
+ fake_props.bwa_wd = 2;
+ ret = get_mba_granularity(&fake_props);
+ KUNIT_EXPECT_EQ(test, ret, 25); /* DIV_ROUND_UP(100, 1 << 2)% = 25% */
+
+ fake_props.bwa_wd = 3;
+ ret = get_mba_granularity(&fake_props);
+ KUNIT_EXPECT_EQ(test, ret, 13); /* DIV_ROUND_UP(100, 1 << 3)% = 13% */
+
+ fake_props.bwa_wd = 6;
+ ret = get_mba_granularity(&fake_props);
+ KUNIT_EXPECT_EQ(test, ret, 2); /* DIV_ROUND_UP(100, 1 << 6)% = 2% */
+
+ fake_props.bwa_wd = 7;
+ ret = get_mba_granularity(&fake_props);
+ KUNIT_EXPECT_EQ(test, ret, 1); /* DIV_ROUND_UP(100, 1 << 7)% = 1% */
+
+ /* Granularity saturates at 1% */
+ fake_props.bwa_wd = 16; /* architectural maximum */
+ ret = get_mba_granularity(&fake_props);
+ KUNIT_EXPECT_EQ(test, ret, 1); /* DIV_ROUND_UP(100, 1 << 16)% = 1% */
+}
+
+static void test_mbw_max_to_percent(struct kunit *test)
+{
+ const struct percent_value_case *param = test->param_value;
+ struct percent_value_test_info res;
+
+ /*
+ * Since the reference values in percent_value_cases[] all
+ * correspond to exact percentages, round-to-nearest will
+ * always give the exact percentage back when the MPAM max
+ * value has precision of 0.5% or finer. (Always true for the
+ * reference data, since they all specify 8 bits or more of
+ * precision.)
+ *
+ * So, keep it simple and demand an exact match:
+ */
+ __prepare_percent_value_test(test, &res, param);
+ KUNIT_EXPECT_EQ(test, res.pc, param->pc);
+}
+
+static void test_percent_to_mbw_max(struct kunit *test)
+{
+ const struct percent_value_case *param = test->param_value;
+ struct percent_value_test_info res;
+
+ __prepare_percent_value_test(test, &res, param);
+
+ KUNIT_EXPECT_GE(test, res.value, param->value << res.shift);
+ KUNIT_EXPECT_LE(test, res.value, (param->value + 1) << res.shift);
+ KUNIT_EXPECT_LE(test, res.value, res.max_value << res.shift);
+
+ /* No flexibility allowed for 0% and 100%! */
+
+ if (param->pc == 0)
+ KUNIT_EXPECT_EQ(test, res.value, 0);
+
+ if (param->pc == 100)
+ KUNIT_EXPECT_EQ(test, res.value, res.max_value << res.shift);
+}
+
+static const void *test_all_bwa_wd_gen_params(struct kunit *test, const void *prev,
+ char *desc)
+{
+ uintptr_t param = (uintptr_t)prev;
+
+ if (param > 15)
+ return NULL;
+
+ param++;
+
+ snprintf(desc, KUNIT_PARAM_DESC_SIZE, "wd=%u\n", (unsigned int)param);
+
+ return (void *)param;
+}
+
+static unsigned int test_get_bwa_wd(struct kunit *test)
+{
+ uintptr_t param = (uintptr_t)test->param_value;
+
+ KUNIT_ASSERT_GE(test, param, 1);
+ KUNIT_ASSERT_LE(test, param, 16);
+
+ return param;
+}
+
+static void test_mbw_max_to_percent_limits(struct kunit *test)
+{
+ struct mpam_props fake_props = {0};
+ u32 max_value;
+
+ mpam_set_feature(mpam_feat_mbw_max, &fake_props);
+ fake_props.bwa_wd = test_get_bwa_wd(test);
+ max_value = GENMASK(15, 16 - fake_props.bwa_wd);
+
+ KUNIT_EXPECT_EQ(test, mbw_max_to_percent(max_value, &fake_props),
+ MAX_MBA_BW);
+ KUNIT_EXPECT_EQ(test, mbw_max_to_percent(0, &fake_props),
+ get_mba_min(&fake_props));
+
+ /*
+ * Rounding policy dependent 0% sanity-check:
+ * With round-to-nearest, the minimum mbw_max value really
+ * should map to 0% if there are at least 200 steps.
+ * (100 steps may be enough for some other rounding policies.)
+ */
+ if (fake_props.bwa_wd >= 8)
+ KUNIT_EXPECT_EQ(test, mbw_max_to_percent(0, &fake_props), 0);
+
+ if (fake_props.bwa_wd < 8 &&
+ mbw_max_to_percent(0, &fake_props) == 0)
+ kunit_warn(test, "wd=%d: Testsuite/driver Rounding policy mismatch?",
+ fake_props.bwa_wd);
+}
+
+/*
+ * Check that converting a percentage to mbw_max and back again (or, as
+ * appropriate, vice-versa) always restores the original value:
+ */
+static void test_percent_max_roundtrip_stability(struct kunit *test)
+{
+ struct mpam_props fake_props = {0};
+ unsigned int shift;
+ u32 pc, max, pc2, max2;
+
+ mpam_set_feature(mpam_feat_mbw_max, &fake_props);
+ fake_props.bwa_wd = test_get_bwa_wd(test);
+ shift = 16 - fake_props.bwa_wd;
+
+ /*
+ * Converting a valid value from the coarser scale to the finer
+ * scale and back again must yield the original value:
+ */
+ if (fake_props.bwa_wd >= 7) {
+ /* More than 100 steps: only test exact pc values: */
+ for (pc = get_mba_min(&fake_props); pc <= MAX_MBA_BW; pc++) {
+ max = percent_to_mbw_max(pc, &fake_props);
+ pc2 = mbw_max_to_percent(max, &fake_props);
+ KUNIT_EXPECT_EQ(test, pc2, pc);
+ }
+ } else {
+ /* Fewer than 100 steps: only test exact mbw_max values: */
+ for (max = 0; max < 1 << 16; max += 1 << shift) {
+ pc = mbw_max_to_percent(max, &fake_props);
+ max2 = percent_to_mbw_max(pc, &fake_props);
+ KUNIT_EXPECT_EQ(test, max2, max);
+ }
+ }
+}
+
+static void test_percent_to_max_rounding(struct kunit *test)
+{
+ const struct percent_value_case *param = test->param_value;
+ unsigned int num_rounded_up = 0, total = 0;
+ struct percent_value_test_info res;
+
+ for (param = percent_value_cases, total = 0;
+ param < &percent_value_cases[ARRAY_SIZE(percent_value_cases)];
+ param++, total++) {
+ __prepare_percent_value_test(test, &res, param);
+ if (res.value > param->value << res.shift)
+ num_rounded_up++;
+ }
+
+ /*
+ * The MPAM driver applies a round-to-nearest policy, whereas a
+ * round-down policy seems to have been applied in the
+ * reference table from which the test vectors were selected.
+ *
+ * For a large and well-distributed suite of test vectors,
+ * about half should be rounded up and half down compared with
+ * the reference table. The actual test vectors are few in
+ * number and probably not very well distributed however, so
+ * tolerate a round-up rate of between 1/4 and 3/4 before
+ * crying foul:
+ */
+
+ kunit_info(test, "Round-up rate: %u%% (%u/%u)\n",
+ DIV_ROUND_CLOSEST(num_rounded_up * 100, total),
+ num_rounded_up, total);
+
+ KUNIT_EXPECT_GE(test, 4 * num_rounded_up, 1 * total);
+ KUNIT_EXPECT_LE(test, 4 * num_rounded_up, 3 * total);
+}
+
+static struct kunit_case mpam_resctrl_test_cases[] = {
+ KUNIT_CASE(test_get_mba_granularity),
+ KUNIT_CASE_PARAM(test_mbw_max_to_percent, test_percent_value_gen_params),
+ KUNIT_CASE_PARAM(test_percent_to_mbw_max, test_percent_value_gen_params),
+ KUNIT_CASE_PARAM(test_mbw_max_to_percent_limits, test_all_bwa_wd_gen_params),
+ KUNIT_CASE(test_percent_to_max_rounding),
+ KUNIT_CASE_PARAM(test_percent_max_roundtrip_stability,
+ test_all_bwa_wd_gen_params),
+ {}
+};
+
+static struct kunit_suite mpam_resctrl_test_suite = {
+ .name = "mpam_resctrl_test_suite",
+ .test_cases = mpam_resctrl_test_cases,
+};
+
+kunit_test_suites(&mpam_resctrl_test_suite);
--
2.43.0
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [PATCH v4 23/41] arm_mpam: resctrl: Add rmid index helpers
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (21 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 22/41] arm_mpam: resctrl: Add kunit test for control format conversions Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 24/41] arm_mpam: resctrl: Add kunit test for rmid idx conversions Ben Horgan
` (19 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
Because MPAM's pmg aren't identical to RDT's rmid, resctrl handles some
data structures by index. This allows x86 to map indexes to RMID, and MPAM
to map them to partid-and-pmg.
Add the helpers to do this.
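The mapping can be tried in isolation with the sketch below. It follows the same multiply-and-add arithmetic as the helpers in this patch, but passes the limits as arguments rather than reading the mpam_partid_max and mpam_pmg_max globals, and returns the decoded pair in a struct. Those two interface choices are assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical return type; the kernel helper uses two output pointers. */
struct closid_rmid {
	uint32_t closid;
	uint32_t rmid;
};

/* Total number of indexes: one per (partid, pmg) pair. */
static uint32_t num_rmid_idx(uint32_t partid_max, uint32_t pmg_max)
{
	return (pmg_max + 1) * (partid_max + 1);
}

/* Each closid owns a contiguous run of (pmg_max + 1) indexes. */
static uint32_t rmid_idx_encode(uint32_t pmg_max, uint32_t closid, uint32_t rmid)
{
	assert(rmid <= pmg_max);

	return closid * (pmg_max + 1) + rmid;
}

static struct closid_rmid rmid_idx_decode(uint32_t pmg_max, uint32_t idx)
{
	struct closid_rmid out = {
		.closid = idx / (pmg_max + 1),
		.rmid = idx % (pmg_max + 1),
	};

	return out;
}
```

For example, with partid_max = 3 and pmg_max = 4 there are 20 indexes, and (closid 2, rmid 3) encodes to index 13, which decodes back to the same pair.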
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Suggested-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
Use ~0U instead of ~0 in lhs of left shift
Changes since v2:
Drop changes signed-off-by as reworked patch
Use multiply and add rather than shift to avoid holes
---
drivers/resctrl/mpam_resctrl.c | 16 ++++++++++++++++
include/linux/arm_mpam.h | 3 +++
2 files changed, 19 insertions(+)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 183428e2d38c..e3a87464f0ac 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -131,6 +131,22 @@ u32 resctrl_arch_get_num_closid(struct rdt_resource *ignored)
return mpam_partid_max + 1;
}
+u32 resctrl_arch_system_num_rmid_idx(void)
+{
+ return (mpam_pmg_max + 1) * (mpam_partid_max + 1);
+}
+
+u32 resctrl_arch_rmid_idx_encode(u32 closid, u32 rmid)
+{
+ return closid * (mpam_pmg_max + 1) + rmid;
+}
+
+void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid)
+{
+ *closid = idx / (mpam_pmg_max + 1);
+ *rmid = idx % (mpam_pmg_max + 1);
+}
+
void resctrl_arch_sched_in(struct task_struct *tsk)
{
lockdep_assert_preemption_disabled();
diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h
index d329b1dc148b..7d23c90f077d 100644
--- a/include/linux/arm_mpam.h
+++ b/include/linux/arm_mpam.h
@@ -58,6 +58,9 @@ void resctrl_arch_set_cpu_default_closid_rmid(int cpu, u32 closid, u32 rmid);
void resctrl_arch_sched_in(struct task_struct *tsk);
bool resctrl_arch_match_closid(struct task_struct *tsk, u32 closid);
bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 closid, u32 rmid);
+u32 resctrl_arch_rmid_idx_encode(u32 closid, u32 rmid);
+void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid);
+u32 resctrl_arch_system_num_rmid_idx(void);
/**
* mpam_register_requestor() - Register a requestor with the MPAM driver
--
2.43.0
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [PATCH v4 24/41] arm_mpam: resctrl: Add kunit test for rmid idx conversions
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (22 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 23/41] arm_mpam: resctrl: Add rmid index helpers Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 25/41] arm_mpam: resctrl: Wait for cacheinfo to be ready Ben Horgan
` (18 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
As MPAM's pmg are scoped by partid and RDT's rmid are global, the
resctrl mapping to an index needs to differ.
Add some tests for the MPAM rmid mapping.
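The "no holes" property that the test checks can be illustrated by comparing the size of the index space under the two encodings. The shift-based variant below is an assumed reconstruction of what such an encoding would look like, where each partid gets a power-of-two sized slot; it is not taken from an earlier version of the series. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to represent values 0..max (0 when max is 0). */
static unsigned int bits_for(uint32_t max)
{
	unsigned int bits = 0;

	while (bits < 32 && (max >> bits))
		bits++;

	return bits;
}

/* Shift-based encoding: (partid << bits) | pmg spans this many indexes. */
static uint64_t shift_idx_space(uint32_t partid_max, uint32_t pmg_max)
{
	return ((uint64_t)partid_max + 1) << bits_for(pmg_max);
}

/* Multiply-and-add encoding: exactly one index per (partid, pmg) pair. */
static uint64_t mul_idx_space(uint32_t partid_max, uint32_t pmg_max)
{
	return ((uint64_t)partid_max + 1) * ((uint64_t)pmg_max + 1);
}

/* Indexes that the shift-based encoding reserves but can never produce. */
static uint64_t shift_idx_holes(uint32_t partid_max, uint32_t pmg_max)
{
	uint64_t shifted = shift_idx_space(partid_max, pmg_max);
	uint64_t packed = mul_idx_space(partid_max, pmg_max);

	assert(shifted >= packed);

	return shifted - packed;
}
```

With max_partid = 1 and max_pmg = 4 (one of the test vectors), a shift-based encoding would span 16 indexes for only 10 valid pairs, leaving 6 holes. When max_pmg + 1 is a power of two, such as max_pmg = 0xFF, the two encodings span the same range.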
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
drivers/resctrl/test_mpam_resctrl.c | 49 +++++++++++++++++++++++++++++
1 file changed, 49 insertions(+)
diff --git a/drivers/resctrl/test_mpam_resctrl.c b/drivers/resctrl/test_mpam_resctrl.c
index b93d6ad87e43..a20da161d965 100644
--- a/drivers/resctrl/test_mpam_resctrl.c
+++ b/drivers/resctrl/test_mpam_resctrl.c
@@ -296,6 +296,54 @@ static void test_percent_to_max_rounding(struct kunit *test)
KUNIT_EXPECT_LE(test, 4 * num_rounded_up, 3 * total);
}
+struct rmid_idx_case {
+ u32 max_partid;
+ u32 max_pmg;
+};
+
+static const struct rmid_idx_case rmid_idx_cases[] = {
+ {0, 0}, {1, 4}, {3, 1}, {5, 9}, {4, 4}, {100, 11}, {0xFFFF, 0xFF},
+};
+
+static void test_rmid_idx_desc(const struct rmid_idx_case *param, char *desc)
+{
+ snprintf(desc, KUNIT_PARAM_DESC_SIZE, "max_partid=%d, max_pmg=%d\n",
+ param->max_partid, param->max_pmg);
+}
+
+KUNIT_ARRAY_PARAM(test_rmid_idx, rmid_idx_cases, test_rmid_idx_desc);
+
+static void test_rmid_idx_encoding(struct kunit *test)
+{
+ u32 orig_mpam_partid_max = mpam_partid_max;
+ u32 orig_mpam_pmg_max = mpam_pmg_max;
+ const struct rmid_idx_case *param = test->param_value;
+ u32 idx, num_idx, count = 0;
+
+ mpam_partid_max = param->max_partid;
+ mpam_pmg_max = param->max_pmg;
+
+ for (u32 partid = 0; partid <= mpam_partid_max; partid++) {
+ for (u32 pmg = 0; pmg <= mpam_pmg_max; pmg++) {
+ u32 partid_out, pmg_out;
+
+ idx = resctrl_arch_rmid_idx_encode(partid, pmg);
+ /* Confirm there are no holes in the rmid idx range */
+ KUNIT_EXPECT_EQ(test, count, idx);
+ count++;
+ resctrl_arch_rmid_idx_decode(idx, &partid_out, &pmg_out);
+ KUNIT_EXPECT_EQ(test, pmg, pmg_out);
+ KUNIT_EXPECT_EQ(test, partid, partid_out);
+ }
+ }
+ num_idx = resctrl_arch_system_num_rmid_idx();
+ KUNIT_EXPECT_EQ(test, idx + 1, num_idx);
+
+ /* Restore global variables that were messed with */
+ mpam_partid_max = orig_mpam_partid_max;
+ mpam_pmg_max = orig_mpam_pmg_max;
+}
+
static struct kunit_case mpam_resctrl_test_cases[] = {
KUNIT_CASE(test_get_mba_granularity),
KUNIT_CASE_PARAM(test_mbw_max_to_percent, test_percent_value_gen_params),
@@ -304,6 +352,7 @@ static struct kunit_case mpam_resctrl_test_cases[] = {
KUNIT_CASE(test_percent_to_max_rounding),
KUNIT_CASE_PARAM(test_percent_max_roundtrip_stability,
test_all_bwa_wd_gen_params),
+ KUNIT_CASE_PARAM(test_rmid_idx_encoding, test_rmid_idx_gen_params),
{}
};
--
2.43.0
^ permalink raw reply related [flat|nested] 89+ messages in thread
* [PATCH v4 25/41] arm_mpam: resctrl: Wait for cacheinfo to be ready
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (23 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 24/41] arm_mpam: resctrl: Add kunit test for rmid idx conversions Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource Ben Horgan
` (17 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
In order to calculate the rmid realloc threshold the size of the cache
needs to be known. Cache domains will also be named after the cache id. So
that this information can be extracted from cacheinfo we need to wait for
it to be ready. The cacheinfo information is populated in device_initcall()
so we wait for that.
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Signed-off-by: James Morse <james.morse@arm.com>
[horgan: split out from another patch]
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
This is moved into its own patch to allow all uses of cacheinfo to be
valid when they are introduced.
---
drivers/resctrl/mpam_resctrl.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index e3a87464f0ac..25950e667077 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -16,6 +16,7 @@
#include <linux/resctrl.h>
#include <linux/slab.h>
#include <linux/types.h>
+#include <linux/wait.h>
#include <asm/mpam.h>
@@ -45,6 +46,13 @@ static bool exposed_mon_capable;
*/
static bool cdp_enabled;
+/*
+ * We use cacheinfo to discover the size of the caches and their id. cacheinfo
+ * populates this from a device_initcall(). mpam_resctrl_setup() must wait.
+ */
+static bool cacheinfo_ready;
+static DECLARE_WAIT_QUEUE_HEAD(wait_cacheinfo_ready);
+
bool resctrl_arch_alloc_capable(void)
{
return exposed_alloc_capable;
@@ -770,6 +778,8 @@ int mpam_resctrl_setup(void)
struct mpam_resctrl_res *res;
enum resctrl_res_level rid;
+ wait_event(wait_cacheinfo_ready, cacheinfo_ready);
+
cpus_read_lock();
for_each_mpam_resctrl_control(res, rid) {
INIT_LIST_HEAD_RCU(&res->resctrl_res.ctrl_domains);
@@ -809,6 +819,15 @@ int mpam_resctrl_setup(void)
return 0;
}
+static int __init __cacheinfo_ready(void)
+{
+ cacheinfo_ready = true;
+ wake_up(&wait_cacheinfo_ready);
+
+ return 0;
+}
+device_initcall_sync(__cacheinfo_ready);
+
#ifdef CONFIG_MPAM_KUNIT_TEST
#include "test_mpam_resctrl.c"
#endif
--
2.43.0
* [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (24 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 25/41] arm_mpam: resctrl: Wait for cacheinfo to be ready Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-05 16:50 ` Jonathan Cameron
2026-02-10 6:20 ` Shaopeng Tan (Fujitsu)
2026-02-03 21:43 ` [PATCH v4 27/41] arm_mpam: resctrl: Add support for csu counters Ben Horgan
` (16 subsequent siblings)
42 siblings, 2 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Dave Martin
From: James Morse <james.morse@arm.com>
resctrl supports 'MB', as a percentage throttling of traffic from the
L3. This is the control that mba_sc uses, so ideally the class chosen
should be as close as possible to the counters used for mbm_total. If
there is a single L3 and the topology of the memory matches then the
traffic at the memory controller will be equivalent to that at egress of
the L3. If these conditions are met allow the memory class to back MB.
MB's percentage control should be backed either with the fixed point
fraction MBW_MAX or bandwidth portion bitmaps. The bandwidth portion
bitmaps are not used as it's tricky to pick which bits to use to avoid
contention, and it may be possible to expose them as something other than
a percentage in the future.
CC: Zeng Heng <zengheng4@huawei.com>
Co-developed-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v2:
Code flow change
Commit message 'or'
Changes since v3:
initialise tmp_cpumask
update commit message
check the traffic matches l3
update comment on candidate_class update, only mbm_total
drop tags due to rework
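Illustration only, not part of the patch: a minimal userspace sketch of the
arithmetic in get_mba_granularity(). It assumes MAX_MBA_BW is 100, which is
consistent with the "1 bit is 50%, 2 is 25%" comment in the diff, and that
the fraction is at most 16 bits wide. SKETCH_MAX_MBA_BW,
SKETCH_DIV_ROUND_UP() and sketch_mba_granularity() are names made up for
this note.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: resctrl's MAX_MBA_BW is 100 (percent). */
#define SKETCH_MAX_MBA_BW 100u

/* Userspace stand-in for the kernel's DIV_ROUND_UP(). */
#define SKETCH_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Worst-case percentage change per implemented step of an MBW_MAX
 * control with bwa_wd fractional bits. Returns 0 when no bits are
 * implemented, mirroring the "no usable control" case.
 */
static uint32_t sketch_mba_granularity(unsigned int bwa_wd)
{
	if (bwa_wd == 0)
		return 0;

	/* Assumption: the fixed-point fraction is at most 16 bits wide. */
	assert(bwa_wd <= 16);

	return SKETCH_DIV_ROUND_UP(SKETCH_MAX_MBA_BW, 1u << bwa_wd);
}
```

With 3 implemented bits each step is 12.5% wide, which rounds up to 13. Any
width of 7 bits or more gives a granularity of 1.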
---
drivers/resctrl/mpam_resctrl.c | 275 ++++++++++++++++++++++++++++++++-
1 file changed, 274 insertions(+), 1 deletion(-)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 25950e667077..4cca3694978d 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -253,6 +253,33 @@ static bool cache_has_usable_cpor(struct mpam_class *class)
return class->props.cpbm_wd <= 32;
}
+static bool mba_class_use_mbw_max(struct mpam_props *cprops)
+{
+ return (mpam_has_feature(mpam_feat_mbw_max, cprops) &&
+ cprops->bwa_wd);
+}
+
+static bool class_has_usable_mba(struct mpam_props *cprops)
+{
+ return mba_class_use_mbw_max(cprops);
+}
+
+/*
+ * Calculate the worst-case percentage change from each implemented step
+ * in the control.
+ */
+static u32 get_mba_granularity(struct mpam_props *cprops)
+{
+ if (!mba_class_use_mbw_max(cprops))
+ return 0;
+
+ /*
+ * bwa_wd is the number of bits implemented in the 0.xxx
+ * fixed point fraction. 1 bit is 50%, 2 is 25% etc.
+ */
+ return DIV_ROUND_UP(MAX_MBA_BW, 1 << cprops->bwa_wd);
+}
+
/*
* Each fixed-point hardware value architecturally represents a range
* of values: the full range 0% - 100% is split contiguously into
@@ -303,6 +330,153 @@ static u16 percent_to_mbw_max(u8 pc, struct mpam_props *cprops)
return val;
}
+static u32 get_mba_min(struct mpam_props *cprops)
+{
+ if (!mba_class_use_mbw_max(cprops)) {
+ WARN_ON_ONCE(1);
+ return 0;
+ }
+
+ return mbw_max_to_percent(0, cprops);
+}
+
+/* Find the L3 cache that has affinity with this CPU */
+static int find_l3_equivalent_bitmask(int cpu, cpumask_var_t tmp_cpumask)
+{
+ u32 cache_id = get_cpu_cacheinfo_id(cpu, 3);
+
+ lockdep_assert_cpus_held();
+
+ return mpam_get_cpumask_from_cache_id(cache_id, 3, tmp_cpumask);
+}
+
+/*
+ * topology_matches_l3() - Is the provided class the same shape as L3
+ * @victim: The class we'd like to pretend is L3.
+ *
+ * resctrl expects all the world's a Xeon, and all counters are on the
+ * L3. We allow some mapping counters on other classes. This requires
+ * that the CPU->domain mapping is the same kind of shape.
+ *
+ * Using cacheinfo directly would make this work even if resctrl can't
+ * use the L3 - but cacheinfo can't tell us anything about offline CPUs.
+ * Using the L3 resctrl domain list also depends on CPUs being online.
+ * Using the mpam_class we picked for L3 so we can use its domain list
+ * assumes that there are MPAM controls on the L3.
+ * Instead, this path eventually uses the mpam_get_cpumask_from_cache_id()
+ * helper which can tell us about offline CPUs ... but getting the cache_id
+ * to start with relies on at least one CPU per L3 cache being online at
+ * boot.
+ *
+ * Walk the victim component list and compare the affinity mask with the
+ * corresponding L3. The topology matches if each victim:component's affinity
+ * mask is the same as the CPU's corresponding L3's. These lists/masks are
+ * computed from firmware tables so don't change at runtime.
+ */
+static bool topology_matches_l3(struct mpam_class *victim)
+{
+ int cpu, err;
+ struct mpam_component *victim_iter;
+ cpumask_var_t __free(free_cpumask_var) tmp_cpumask = CPUMASK_VAR_NULL;
+
+ lockdep_assert_cpus_held();
+
+ if (!alloc_cpumask_var(&tmp_cpumask, GFP_KERNEL))
+ return false;
+
+ guard(srcu)(&mpam_srcu);
+ list_for_each_entry_srcu(victim_iter, &victim->components, class_list,
+ srcu_read_lock_held(&mpam_srcu)) {
+ if (cpumask_empty(&victim_iter->affinity)) {
+ pr_debug("class %u has CPU-less component %u - can't match L3!\n",
+ victim->level, victim_iter->comp_id);
+ return false;
+ }
+
+ cpu = cpumask_any_and(&victim_iter->affinity, cpu_online_mask);
+ if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
+ return false;
+
+ cpumask_clear(tmp_cpumask);
+ err = find_l3_equivalent_bitmask(cpu, tmp_cpumask);
+ if (err) {
+ pr_debug("Failed to find L3's equivalent component to class %u component %u\n",
+ victim->level, victim_iter->comp_id);
+ return false;
+ }
+
+ /* Any differing bits in the affinity mask? */
+ if (!cpumask_equal(tmp_cpumask, &victim_iter->affinity)) {
+ pr_debug("class %u component %u has mismatched CPU mask with L3 equivalent\n"
+ "L3:%*pbl != victim:%*pbl\n",
+ victim->level, victim_iter->comp_id,
+ cpumask_pr_args(tmp_cpumask),
+ cpumask_pr_args(&victim_iter->affinity));
+
+ return false;
+ }
+ }
+
+ return true;
+}
+
+/*
+ * Test if the traffic for a class matches that at egress from the L3. For
+ * MSC at memory controllers this is only possible if there is a single L3
+ * as otherwise the counters at the memory can include bandwidth from the
+ * non-local L3.
+ */
+static bool traffic_matches_l3(struct mpam_class *class) {
+ int err, cpu;
+ cpumask_var_t __free(free_cpumask_var) tmp_cpumask = CPUMASK_VAR_NULL;
+
+ lockdep_assert_cpus_held();
+
+ if (class->type == MPAM_CLASS_CACHE && class->level == 3)
+ return true;
+
+ if (class->type == MPAM_CLASS_CACHE && class->level != 3) {
+ pr_debug("class %u is a different cache from L3\n", class->level);
+ return false;
+ }
+
+ if (class->type != MPAM_CLASS_MEMORY) {
+ pr_debug("class %u is neither of type cache or memory\n", class->level);
+ return false;
+ }
+
+ if (!alloc_cpumask_var(&tmp_cpumask, GFP_KERNEL)) {
+ pr_debug("cpumask allocation failed\n");
+ return false;
+ }
+
+ if (class->type != MPAM_CLASS_MEMORY) {
+ pr_debug("class %u is neither of type cache or memory\n",
+ class->level);
+ return false;
+ }
+
+ cpu = cpumask_any_and(&class->affinity, cpu_online_mask);
+ err = find_l3_equivalent_bitmask(cpu, tmp_cpumask);
+ if (err) {
+ pr_debug("Failed to find L3 downstream to cpu %d\n", cpu);
+ return false;
+ }
+
+ if (!cpumask_equal(tmp_cpumask, cpu_possible_mask)) {
+ pr_debug("There is more than one L3\n");
+ return false;
+ }
+
+ /* Be strict; the traffic might stop in the intermediate cache. */
+ if (get_cpu_cacheinfo_id(cpu, 4) != -1) {
+ pr_debug("L3 isn't the last level of cache\n");
+ return false;
+ }
+
+ return true;
+}
+
/* Test whether we can export MPAM_CLASS_CACHE:{2,3}? */
static void mpam_resctrl_pick_caches(void)
{
@@ -345,9 +519,69 @@ static void mpam_resctrl_pick_caches(void)
}
}
+static void mpam_resctrl_pick_mba(void)
+{
+ struct mpam_class *class, *candidate_class = NULL;
+ struct mpam_resctrl_res *res;
+
+ lockdep_assert_cpus_held();
+
+ guard(srcu)(&mpam_srcu);
+ list_for_each_entry_srcu(class, &mpam_classes, classes_list,
+ srcu_read_lock_held(&mpam_srcu)) {
+ struct mpam_props *cprops = &class->props;
+
+ if (class->level != 3 && class->type == MPAM_CLASS_CACHE) {
+ pr_debug("class %u is a cache but not the L3\n", class->level);
+ continue;
+ }
+
+ if (!class_has_usable_mba(cprops)) {
+ pr_debug("class %u has no bandwidth control\n",
+ class->level);
+ continue;
+ }
+
+ if (!cpumask_equal(&class->affinity, cpu_possible_mask)) {
+ pr_debug("class %u has missing CPUs\n", class->level);
+ continue;
+ }
+
+ if (!topology_matches_l3(class)) {
+ pr_debug("class %u topology doesn't match L3\n",
+ class->level);
+ continue;
+ }
+
+ if (!traffic_matches_l3(class)) {
+ pr_debug("class %u traffic doesn't match L3 egress\n",
+ class->level);
+ continue;
+ }
+
+ /*
+ * Pick a resource to be MBA that is as close as possible to
+ * the L3. mbm_total counts the bandwidth leaving the L3
+ * cache and MBA should correspond as closely as possible
+ * for proper operation of mba_sc.
+ */
+ if (!candidate_class || class->level < candidate_class->level)
+ candidate_class = class;
+ }
+
+ if (candidate_class) {
+ pr_debug("selected class %u to back MBA\n",
+ candidate_class->level);
+ res = &mpam_resctrl_controls[RDT_RESOURCE_MBA];
+ res->class = candidate_class;
+ exposed_alloc_capable = true;
+ }
+}
+
static int mpam_resctrl_control_init(struct mpam_resctrl_res *res)
{
struct mpam_class *class = res->class;
+ struct mpam_props *cprops = &class->props;
struct rdt_resource *r = &res->resctrl_res;
switch (r->rid) {
@@ -377,6 +611,19 @@ static int mpam_resctrl_control_init(struct mpam_resctrl_res *res)
*/
r->cache.shareable_bits = resctrl_get_default_ctrl(r);
break;
+ case RDT_RESOURCE_MBA:
+ r->alloc_capable = true;
+ r->schema_fmt = RESCTRL_SCHEMA_RANGE;
+ r->ctrl_scope = RESCTRL_L3_CACHE;
+
+ r->membw.delay_linear = true;
+ r->membw.throttle_mode = THREAD_THROTTLE_UNDEFINED;
+ r->membw.min_bw = get_mba_min(cprops);
+ r->membw.max_bw = MAX_MBA_BW;
+ r->membw.bw_gran = get_mba_granularity(cprops);
+
+ r->name = "MB";
+ break;
default:
return -EINVAL;
}
@@ -391,7 +638,17 @@ static int mpam_resctrl_pick_domain_id(int cpu, struct mpam_component *comp)
if (class->type == MPAM_CLASS_CACHE)
return comp->comp_id;
- /* TODO: repaint domain ids to match the L3 domain ids */
+ if (topology_matches_l3(class)) {
+ /* Use the corresponding L3 component ID as the domain ID */
+ int id = get_cpu_cacheinfo_id(cpu, 3);
+
+ /* Implies topology_matches_l3() made a mistake */
+ if (WARN_ON_ONCE(id == -1))
+ return comp->comp_id;
+
+ return id;
+ }
+
/* Otherwise, expose the ID used by the firmware table code. */
return comp->comp_id;
}
@@ -431,6 +688,12 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
case RDT_RESOURCE_L3:
configured_by = mpam_feat_cpor_part;
break;
+ case RDT_RESOURCE_MBA:
+ if (mpam_has_feature(mpam_feat_mbw_max, cprops)) {
+ configured_by = mpam_feat_mbw_max;
+ break;
+ }
+ fallthrough;
default:
return resctrl_get_default_ctrl(r);
}
@@ -442,6 +705,8 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
switch (configured_by) {
case mpam_feat_cpor_part:
return cfg->cpbm;
+ case mpam_feat_mbw_max:
+ return mbw_max_to_percent(cfg->mbw_max, cprops);
default:
return resctrl_get_default_ctrl(r);
}
@@ -486,6 +751,13 @@ int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
cfg.cpbm = cfg_val;
mpam_set_feature(mpam_feat_cpor_part, &cfg);
break;
+ case RDT_RESOURCE_MBA:
+ if (mpam_has_feature(mpam_feat_mbw_max, cprops)) {
+ cfg.mbw_max = percent_to_mbw_max(cfg_val, cprops);
+ mpam_set_feature(mpam_feat_mbw_max, &cfg);
+ break;
+ }
+ fallthrough;
default:
return -EINVAL;
}
@@ -789,6 +1061,7 @@ int mpam_resctrl_setup(void)
/* Find some classes to use for controls */
mpam_resctrl_pick_caches();
+ mpam_resctrl_pick_mba();
/* Initialise the resctrl structures from the classes */
for_each_mpam_resctrl_control(res, rid) {
--
2.43.0
* [PATCH v4 27/41] arm_mpam: resctrl: Add support for csu counters
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (25 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-05 16:55 ` Jonathan Cameron
2026-02-03 21:43 ` [PATCH v4 28/41] arm_mpam: resctrl: Pick classes for use as mbm counters Ben Horgan
` (15 subsequent siblings)
42 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc
From: James Morse <james.morse@arm.com>
resctrl exposes a counter via a file named llc_occupancy. This isn't really
a counter as its value goes up and down; it is a snapshot of the cache
storage usage monitor.
Add some picking code which will only find an L3. The resctrl counter
file is called llc_occupancy but we don't check it is the last one as
it is already identified as L3.
Signed-off-by: James Morse <james.morse@arm.com>
Co-developed-by: Dave Martin <dave.martin@arm.com>
Signed-off-by: Dave Martin <dave.martin@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
Allow csu counters however many partid or pmg there are
else if -> if
reduce scope of local variables
drop has_csu
Changes since v2:
return -> break so works for mbwu in later patch
add for_each_mpam_resctrl_mon
return error from mpam_resctrl_monitor_init(). It may fail when abmc
allocation is introduced in a later patch.
Squashed in patch from Dave Martin:
https://lore.kernel.org/lkml/20250820131621.54983-1-Dave.Martin@arm.com/
Changes since v3:
resctrl_enable_mon_event() signature update
Restrict the events considered
num-rmid update
Use raw_smp_processor_id()
Tighten heuristics:
Make sure it is the L3
Please shout if this means the counters aren't exposed on any platforms
Drop tags due to change in policy/rework
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
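Illustration only, not part of the patch: a simplified userspace sketch of
the filter that mpam_resctrl_pick_counters() and cache_has_usable_csu()
apply before a class can back llc_occupancy. The struct and function names
below are hypothetical stand-ins for the driver's types, and the affinity
and feature checks are reduced to plain booleans.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the driver's types. */
enum sketch_class_type { SKETCH_CLASS_CACHE, SKETCH_CLASS_MEMORY };

struct sketch_class {
	enum sketch_class_type type;
	unsigned int level;        /* cache level, e.g. 3 for the L3 */
	bool covers_all_cpus;      /* affinity equals cpu_possible_mask */
	bool has_csu;              /* mpam_feat_msmon_csu is present */
	unsigned int num_csu_mon;  /* number of CSU monitors */
};

/* Would this class be picked to back llc_occupancy? */
static bool sketch_backs_llc_occupancy(const struct sketch_class *c)
{
	/* A cache that is not the L3 is never considered. */
	if (c->type == SKETCH_CLASS_CACHE && c->level != 3)
		return false;

	/* The class must cover every possible CPU. */
	if (!c->covers_all_cpus)
		return false;

	/* One CSU monitor is enough because the value settles. */
	if (!c->has_csu || c->num_csu_mon == 0)
		return false;

	/* CSU counters only make sense on a cache. */
	return c->type == SKETCH_CLASS_CACHE;
}
```

Under these assumptions only an L3 cache class with at least one CSU
monitor passes; a memory class is rejected even if it advertises CSU.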
---
drivers/resctrl/mpam_internal.h | 6 ++
drivers/resctrl/mpam_resctrl.c | 176 +++++++++++++++++++++++++++++++-
2 files changed, 177 insertions(+), 5 deletions(-)
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index 21ade1620147..58b883fe9d30 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -345,6 +345,12 @@ struct mpam_resctrl_res {
struct rdt_resource resctrl_res;
};
+struct mpam_resctrl_mon {
+ struct mpam_class *class;
+
+ /* per-class data that resctrl needs will live here */
+};
+
static inline int mpam_alloc_csu_mon(struct mpam_class *class)
{
struct mpam_props *cprops = &class->props;
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 4cca3694978d..638fdca7caea 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -37,6 +37,23 @@ static struct mpam_resctrl_res mpam_resctrl_controls[RDT_NUM_RESOURCES];
/* The lock for modifying resctrl's domain lists from cpuhp callbacks. */
static DEFINE_MUTEX(domain_list_lock);
+/*
+ * The classes we've picked to map to resctrl events.
+ * Resctrl believes all the world's a Xeon, and these are all on the L3. This
+ * array lets us find the actual class backing the event counters. e.g.
+ * the only memory bandwidth counters may be on the memory controller, but to
+ * make use of them, we pretend they are on L3. Restrict the events considered
+ * to those supported by MPAM.
+ * Class pointer may be NULL.
+ */
+#define MPAM_MAX_EVENT QOS_L3_MBM_TOTAL_EVENT_ID
+static struct mpam_resctrl_mon mpam_resctrl_counters[MPAM_MAX_EVENT + 1];
+
+#define for_each_mpam_resctrl_mon(mon, eventid) \
+ for (eventid = QOS_FIRST_EVENT, mon = &mpam_resctrl_counters[eventid]; \
+ eventid <= MPAM_MAX_EVENT; \
+ eventid++, mon = &mpam_resctrl_counters[eventid])
+
static bool exposed_alloc_capable;
static bool exposed_mon_capable;
@@ -264,6 +281,28 @@ static bool class_has_usable_mba(struct mpam_props *cprops)
return mba_class_use_mbw_max(cprops);
}
+static bool cache_has_usable_csu(struct mpam_class *class)
+{
+ struct mpam_props *cprops;
+
+ if (!class)
+ return false;
+
+ cprops = &class->props;
+
+ if (!mpam_has_feature(mpam_feat_msmon_csu, cprops))
+ return false;
+
+ /*
+ * CSU counters settle on the value, so we can get away with
+ * having only one.
+ */
+ if (!cprops->num_csu_mon)
+ return false;
+
+ return true;
+}
+
/*
* Calculate the worst-case percentage change from each implemented step
* in the control.
@@ -578,6 +617,65 @@ static void mpam_resctrl_pick_mba(void)
}
}
+static void counter_update_class(enum resctrl_event_id evt_id,
+ struct mpam_class *class)
+{
+ struct mpam_class *existing_class = mpam_resctrl_counters[evt_id].class;
+
+ if (existing_class) {
+ if (class->level == 3) {
+ pr_debug("Existing class is L3 - L3 wins\n");
+ return;
+ }
+
+ if (existing_class->level < class->level) {
+ pr_debug("Existing class is closer to L3, %u versus %u - closer is better\n",
+ existing_class->level, class->level);
+ return;
+ }
+ }
+
+ mpam_resctrl_counters[evt_id].class = class;
+ exposed_mon_capable = true;
+}
+
+static void mpam_resctrl_pick_counters(void)
+{
+ struct mpam_class *class;
+
+ lockdep_assert_cpus_held();
+
+ guard(srcu)(&mpam_srcu);
+ list_for_each_entry_srcu(class, &mpam_classes, classes_list,
+ srcu_read_lock_held(&mpam_srcu)) {
+ /* The name of the resource is L3... */
+ if (class->type == MPAM_CLASS_CACHE && class->level != 3) {
+ pr_debug("class %u is a cache but not the L3\n", class->level);
+ continue;
+ }
+
+ if (!cpumask_equal(&class->affinity, cpu_possible_mask)) {
+ pr_debug("class %u does not cover all CPUs\n",
+ class->level);
+ continue;
+ }
+
+ if (cache_has_usable_csu(class)) {
+ pr_debug("class %u has usable CSU\n",
+ class->level);
+
+ /* CSU counters only make sense on a cache. */
+ switch (class->type) {
+ case MPAM_CLASS_CACHE:
+ counter_update_class(QOS_L3_OCCUP_EVENT_ID, class);
+ break;
+ default:
+ break;
+ }
+ }
+ }
+}
+
static int mpam_resctrl_control_init(struct mpam_resctrl_res *res)
{
struct mpam_class *class = res->class;
@@ -653,6 +751,57 @@ static int mpam_resctrl_pick_domain_id(int cpu, struct mpam_component *comp)
return comp->comp_id;
}
+static int mpam_resctrl_monitor_init(struct mpam_resctrl_mon *mon,
+ enum resctrl_event_id type)
+{
+ struct mpam_resctrl_res *res = &mpam_resctrl_controls[RDT_RESOURCE_L3];
+ struct rdt_resource *l3 = &res->resctrl_res;
+
+ lockdep_assert_cpus_held();
+
+ /*
+ * There also needs to be an L3 cache present.
+ * The check just requires any online CPU and it can't go offline as we
+ * hold the cpu lock.
+ */
+ if (get_cpu_cacheinfo_id(raw_smp_processor_id(), 3) == -1)
+ return 0;
+
+ /*
+ * If there are no MPAM resources on L3, force it into existence.
+ * topology_matches_l3() already ensures this looks like the L3.
+ * The domain-ids will be fixed up by mpam_resctrl_domain_hdr_init().
+ */
+ if (!res->class) {
+ pr_warn_once("Faking L3 MSC to enable counters.\n");
+ res->class = mpam_resctrl_counters[type].class;
+ }
+
+ /* Called multiple times, once per event type */
+ if (exposed_mon_capable) {
+ l3->mon_capable = true;
+
+ /* Setting name is necessary on monitor only platforms */
+ l3->name = "L3";
+ l3->mon_scope = RESCTRL_L3_CACHE;
+
+ resctrl_enable_mon_event(type, false, 0, NULL);
+
+ /*
+ * num-rmid is the upper bound for the number of monitoring
+ * groups that can exist simultaneously, including the
+ * default monitoring group for each control group. Hence,
+ * advertise the whole rmid_idx space even though each
+ * control group has its own pmg/rmid space. Unfortunately,
+ * this does mean userspace needs to know the architecture
+ * to correctly interpret this value.
+ */
+ l3->mon.num_rmid = resctrl_arch_system_num_rmid_idx();
+ }
+
+ return 0;
+}
+
u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
u32 closid, enum resctrl_conf_type type)
{
@@ -1049,6 +1198,8 @@ int mpam_resctrl_setup(void)
int err = 0;
struct mpam_resctrl_res *res;
enum resctrl_res_level rid;
+ struct mpam_resctrl_mon *mon;
+ enum resctrl_event_id eventid;
wait_event(wait_cacheinfo_ready, cacheinfo_ready);
@@ -1071,16 +1222,26 @@ int mpam_resctrl_setup(void)
err = mpam_resctrl_control_init(res);
if (err) {
pr_debug("Failed to initialise rid %u\n", rid);
- break;
+ goto internal_error;
}
}
- cpus_read_unlock();
- if (err) {
- pr_debug("Internal error %d - resctrl not supported\n", err);
- return err;
+ /* Find some classes to use for monitors */
+ mpam_resctrl_pick_counters();
+
+ for_each_mpam_resctrl_mon(mon, eventid) {
+ if (!mon->class)
+ continue; // dummy resource
+
+ err = mpam_resctrl_monitor_init(mon, eventid);
+ if (err) {
+ pr_debug("Failed to initialise event %u\n", eventid);
+ goto internal_error;
+ }
}
+ cpus_read_unlock();
+
if (!exposed_alloc_capable && !exposed_mon_capable) {
pr_debug("No alloc(%u) or monitor(%u) found - resctrl not supported\n",
exposed_alloc_capable, exposed_mon_capable);
@@ -1090,6 +1251,11 @@ int mpam_resctrl_setup(void)
/* TODO: call resctrl_init() */
return 0;
+
+internal_error:
+ cpus_read_unlock();
+ pr_debug("Internal error %d - resctrl not supported\n", err);
+ return err;
}
static int __init __cacheinfo_ready(void)
--
2.43.0
* [PATCH v4 28/41] arm_mpam: resctrl: Pick classes for use as mbm counters
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (26 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 27/41] arm_mpam: resctrl: Add support for csu counters Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-05 16:58 ` Jonathan Cameron
2026-02-03 21:43 ` [PATCH v4 29/41] arm_mpam: resctrl: Pre-allocate free running monitors Ben Horgan
` (14 subsequent siblings)
42 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc
From: James Morse <james.morse@arm.com>
resctrl has two types of counters, NUMA-local and global. MPAM can only
count global traffic, using MSCs either at the L3 cache or in the memory
controllers. When global and local equate to the same thing, continue to
just call it global.
Because the class or component backing the event may not be 'the L3', it is
necessary for mpam_resctrl_get_domain_from_cpu() to search the monitor
domains too. This matters the most for 'monitor only' systems, where 'the
L3' control domains may be empty, and the ctrl_comp pointer NULL.
resctrl expects there to be enough monitors for every possible control and
monitor group to have one. Such a system gets called 'free running' as the
monitors can be programmed once and left running. Any other platform will
need to emulate ABMC.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
drop has_mbwu
Changes since v2:
Iterate over mpam_resctrl_dom directly (Jonathan)
Use for_each_mpam_resctrl_mon
Changes since v3:
Don't continue if mon not found to avoid NULL pointer deref
use int for cache_id in mpam_resctrl_alloc_domain()
Update commit message
Take traffic into account
Only use mbm_total.
Drop tags due to rework
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
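Illustration only, not part of the patch: a minimal userspace sketch of the
check in __mpam_monitors_free_running(). The size of the rmid_idx space is
passed in as a parameter rather than read from
resctrl_arch_system_num_rmid_idx(), and sketch_monitors_free_running() is a
name made up for this note.

```c
#include <stdint.h>

/*
 * Returns the number of monitors a free-running system would need
 * (one per rmid_idx), or 0 if the class has too few MBWU monitors
 * to give every control/monitor group its own.
 */
static uint32_t sketch_monitors_free_running(uint16_t num_mbwu_mon,
					     uint32_t num_rmid_idx)
{
	if (num_mbwu_mon >= num_rmid_idx)
		return num_rmid_idx;

	return 0;
}
```

For example, a class with 64 monitors and a 32-entry rmid_idx space is
usable, whereas a class with 16 monitors for the same space is not and
would need the ABMC emulation mentioned in the commit message.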
---
drivers/resctrl/mpam_internal.h | 8 +++
drivers/resctrl/mpam_resctrl.c | 122 +++++++++++++++++++++++++++++++-
2 files changed, 129 insertions(+), 1 deletion(-)
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index 58b883fe9d30..bab6eea60dae 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -336,6 +336,14 @@ struct mpam_msc_ris {
struct mpam_resctrl_dom {
struct mpam_component *ctrl_comp;
+
+ /*
+ * There is no single mon_comp because different events may be backed
+ * by different class/components. mon_comp is indexed by the event
+ * number.
+ */
+ struct mpam_component *mon_comp[QOS_NUM_EVENTS];
+
struct rdt_ctrl_domain resctrl_ctrl_dom;
struct rdt_l3_mon_domain resctrl_mon_dom;
};
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 638fdca7caea..c96c434c9454 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -70,6 +70,14 @@ static bool cdp_enabled;
static bool cacheinfo_ready;
static DECLARE_WAIT_QUEUE_HEAD(wait_cacheinfo_ready);
+/* Whether this num_mbwu_mon could result in a free_running system */
+static int __mpam_monitors_free_running(u16 num_mbwu_mon)
+{
+ if (num_mbwu_mon >= resctrl_arch_system_num_rmid_idx())
+ return resctrl_arch_system_num_rmid_idx();
+ return 0;
+}
+
bool resctrl_arch_alloc_capable(void)
{
return exposed_alloc_capable;
@@ -303,6 +311,26 @@ static bool cache_has_usable_csu(struct mpam_class *class)
return true;
}
+static bool class_has_usable_mbwu(struct mpam_class *class)
+{
+ struct mpam_props *cprops = &class->props;
+
+ if (!mpam_has_feature(mpam_feat_msmon_mbwu, cprops))
+ return false;
+
+ /*
+ * resctrl expects the bandwidth counters to be free running,
+ * which means we need as many monitors as resctrl has
+ * control/monitor groups.
+ */
+ if (__mpam_monitors_free_running(cprops->num_mbwu_mon)) {
+ pr_debug("monitors usable in free-running mode\n");
+ return true;
+ }
+
+ return false;
+}
+
/*
* Calculate the worst-case percentage change from each implemented step
* in the control.
@@ -673,6 +701,22 @@ static void mpam_resctrl_pick_counters(void)
break;
}
}
+
+ if (class_has_usable_mbwu(class) &&
+ topology_matches_l3(class) &&
+ traffic_matches_l3(class)) {
+ pr_debug("class %u has usable MBWU, and matches L3 topology and traffic\n",
+ class->level);
+
+ /*
+ * We can't distinguish traffic by destination so
+ * we don't know if it's staying on the same NUMA
+ * node. Hence, we can't calculate mbm_local except
+ * when we only have one L3 and it's equivalent to
+ * mbm_total and so always use mbm_total.
+ */
+ counter_update_class(QOS_L3_MBM_TOTAL_EVENT_ID, class);
+ }
}
}
@@ -1024,6 +1068,20 @@ static void mpam_resctrl_domain_insert(struct list_head *list,
list_add_tail_rcu(&new->list, pos);
}
+static struct mpam_component *find_component(struct mpam_class *class, int cpu)
+{
+ struct mpam_component *comp;
+
+ guard(srcu)(&mpam_srcu);
+ list_for_each_entry_srcu(comp, &class->components, class_list,
+ srcu_read_lock_held(&mpam_srcu)) {
+ if (cpumask_test_cpu(cpu, &comp->affinity))
+ return comp;
+ }
+
+ return NULL;
+}
+
static struct mpam_resctrl_dom *
mpam_resctrl_alloc_domain(unsigned int cpu, struct mpam_resctrl_res *res)
{
@@ -1071,6 +1129,35 @@ mpam_resctrl_alloc_domain(unsigned int cpu, struct mpam_resctrl_res *res)
}
if (exposed_mon_capable) {
+ struct mpam_component *any_mon_comp = NULL;
+ struct mpam_resctrl_mon *mon;
+ enum resctrl_event_id eventid;
+
+ /*
+ * Even if the monitor domain is backed by a different
+ * component, the L3 component IDs need to be used... only
+ * there may be no ctrl_comp for the L3.
+ * Search each event's class list for a component with
+ * overlapping CPUs and set up the dom->mon_comp array.
+ */
+
+ for_each_mpam_resctrl_mon(mon, eventid) {
+ struct mpam_component *mon_comp;
+
+ if (!mon->class)
+ continue; // dummy resource
+
+ mon_comp = find_component(mon->class, cpu);
+ dom->mon_comp[eventid] = mon_comp;
+ if (mon_comp)
+ any_mon_comp = mon_comp;
+ }
+ if (!any_mon_comp) {
+ WARN_ON_ONCE(1);
+ err = -EFAULT;
+ goto offline_ctrl_domain;
+ }
+
mon_d = &dom->resctrl_mon_dom;
mpam_resctrl_domain_hdr_init(cpu, any_mon_comp, &mon_d->hdr);
mon_d->hdr.type = RESCTRL_MON_DOMAIN;
@@ -1097,6 +1184,35 @@ mpam_resctrl_alloc_domain(unsigned int cpu, struct mpam_resctrl_res *res)
return dom;
}
+/*
+ * We know all the monitors are associated with the L3, even if there are no
+ * controls and therefore no control component. Find the cache-id for the CPU
+ * and use that to search for existing resctrl domains.
+ * This relies on mpam_resctrl_pick_domain_id() using the L3 cache-id
+ * for anything that is not a cache.
+ */
+static struct mpam_resctrl_dom *mpam_resctrl_get_mon_domain_from_cpu(int cpu)
+{
+ int cache_id;
+ struct mpam_resctrl_dom *dom;
+ struct mpam_resctrl_res *l3 = &mpam_resctrl_controls[RDT_RESOURCE_L3];
+
+ lockdep_assert_cpus_held();
+
+ if (!l3->class)
+ return NULL;
+ cache_id = get_cpu_cacheinfo_id(cpu, 3);
+ if (cache_id < 0)
+ return NULL;
+
+ list_for_each_entry_rcu(dom, &l3->resctrl_res.mon_domains, resctrl_mon_dom.hdr.list) {
+ if (dom->resctrl_mon_dom.hdr.id == cache_id)
+ return dom;
+ }
+
+ return NULL;
+}
+
static struct mpam_resctrl_dom *
mpam_resctrl_get_domain_from_cpu(int cpu, struct mpam_resctrl_res *res)
{
@@ -1110,7 +1226,11 @@ mpam_resctrl_get_domain_from_cpu(int cpu, struct mpam_resctrl_res *res)
return dom;
}
- return NULL;
+ if (r->rid != RDT_RESOURCE_L3)
+ return NULL;
+
+ /* Search the mon domain list too - needed on monitor only platforms. */
+ return mpam_resctrl_get_mon_domain_from_cpu(cpu);
}
int mpam_resctrl_online_cpu(unsigned int cpu)
--
2.43.0
* [PATCH v4 29/41] arm_mpam: resctrl: Pre-allocate free running monitors
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (27 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 28/41] arm_mpam: resctrl: Pick classes for use as mbm counters Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 30/41] arm_mpam: resctrl: Allow resctrl to allocate monitors Ben Horgan
` (13 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
When there are enough monitors, the resctrl mbm local and total files can
be exposed. These need all the monitors that resctrl may use to be
allocated up front.
Add helpers to do this.
If a different candidate class is discovered, the old array should be
freed and the allocated monitors returned to the driver.
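The all-or-nothing allocation described above can be sketched in user-space
C. This is only an illustration: the toy_* names and the fixed-size pool are
hypothetical stand-ins, not the driver's mpam_alloc_mbwu_mon() and
mpam_free_mbwu_mon() helpers, and the sketch assumes that -1 marks an unused
slot, as the memset() in the patch suggests.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the driver's monitor pool, not the MPAM API. */
#define TOY_POOL_SIZE 8
static unsigned char toy_pool_used[TOY_POOL_SIZE];

/* Returns a free monitor index, or -1 when the pool is exhausted. */
static int toy_alloc_mon(void)
{
	for (int i = 0; i < TOY_POOL_SIZE; i++) {
		if (!toy_pool_used[i]) {
			toy_pool_used[i] = 1;
			return i;
		}
	}
	return -1;
}

static void toy_free_mon(int mon)
{
	toy_pool_used[mon] = 0;
}

/* Number of monitors currently handed out, for inspection. */
static int toy_pool_in_use(void)
{
	int used = 0;

	for (int i = 0; i < TOY_POOL_SIZE; i++)
		used += toy_pool_used[i];
	return used;
}

/* Return every monitor recorded in the array and mark its slot unused. */
static void toy_free_all(int *array, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (array[i] < 0)
			continue;
		toy_free_mon(array[i]);
		array[i] = -1;
	}
}

/*
 * Allocate n monitors up front. On any failure, roll back so that the
 * caller gets either a fully populated array or NULL.
 */
static int *toy_alloc_array(size_t n)
{
	int *array = malloc(n * sizeof(*array));

	if (!array)
		return NULL;

	/* All bytes 0xff: every slot reads back as -1, meaning "unused". */
	memset(array, -1, n * sizeof(*array));

	for (size_t i = 0; i < n; i++) {
		int mon = toy_alloc_mon();

		if (mon < 0) {
			toy_free_all(array, n);
			free(array);
			return NULL;
		}
		array[i] = mon;
	}
	return array;
}
```

Filling the array with -1 before allocating anything is what lets the
rollback walk the whole array safely after a partial failure.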
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v2:
Code flow tidying (Jonathan)
---
drivers/resctrl/mpam_internal.h | 3 +-
drivers/resctrl/mpam_resctrl.c | 81 ++++++++++++++++++++++++++++++++-
2 files changed, 81 insertions(+), 3 deletions(-)
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index bab6eea60dae..5f4ac4fabc0d 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -356,7 +356,8 @@ struct mpam_resctrl_res {
struct mpam_resctrl_mon {
struct mpam_class *class;
- /* per-class data that resctrl needs will live here */
+ /* Array of allocated MBWU monitors, indexed by (closid, rmid). */
+ int *mbwu_idx_to_mon;
};
static inline int mpam_alloc_csu_mon(struct mpam_class *class)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index c96c434c9454..8c57fd48e560 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -645,10 +645,58 @@ static void mpam_resctrl_pick_mba(void)
}
}
+static void __free_mbwu_mon(struct mpam_class *class, int *array,
+ u16 num_mbwu_mon)
+{
+ for (int i = 0; i < num_mbwu_mon; i++) {
+ if (array[i] < 0)
+ continue;
+
+ mpam_free_mbwu_mon(class, array[i]);
+ array[i] = ~0;
+ }
+}
+
+static int __alloc_mbwu_mon(struct mpam_class *class, int *array,
+ u16 num_mbwu_mon)
+{
+ for (int i = 0; i < num_mbwu_mon; i++) {
+ int mbwu_mon = mpam_alloc_mbwu_mon(class);
+
+ if (mbwu_mon < 0) {
+ __free_mbwu_mon(class, array, num_mbwu_mon);
+ return mbwu_mon;
+ }
+ array[i] = mbwu_mon;
+ }
+
+ return 0;
+}
+
+static int *__alloc_mbwu_array(struct mpam_class *class, u16 num_mbwu_mon)
+{
+ int err;
+ size_t array_size = num_mbwu_mon * sizeof(int);
+ int *array __free(kfree) = kmalloc(array_size, GFP_KERNEL);
+
+ if (!array)
+ return ERR_PTR(-ENOMEM);
+
+ memset(array, -1, array_size);
+
+ err = __alloc_mbwu_mon(class, array, num_mbwu_mon);
+ if (err)
+ return ERR_PTR(err);
+ return_ptr(array);
+}
+
static void counter_update_class(enum resctrl_event_id evt_id,
struct mpam_class *class)
{
- struct mpam_class *existing_class = mpam_resctrl_counters[evt_id].class;
+ struct mpam_resctrl_mon *mon = &mpam_resctrl_counters[evt_id];
+ struct mpam_class *existing_class = mon->class;
+ u16 num_mbwu_mon = class->props.num_mbwu_mon;
+ int *new_array, *existing_array = mon->mbwu_idx_to_mon;
if (existing_class) {
if (class->level == 3) {
@@ -663,8 +711,37 @@ static void counter_update_class(enum resctrl_event_id evt_id,
}
}
- mpam_resctrl_counters[evt_id].class = class;
+ pr_debug("Updating event %u to use class %u\n", evt_id, class->level);
+
+ /* Might not need all the monitors */
+ num_mbwu_mon = __mpam_monitors_free_running(num_mbwu_mon);
+
+ if (evt_id != QOS_L3_OCCUP_EVENT_ID && num_mbwu_mon) {
+ /*
+ * This is the pre-allocated free-running monitors path. It always
+ * allocates one monitor per PARTID * PMG.
+ */
+ WARN_ON_ONCE(num_mbwu_mon != resctrl_arch_system_num_rmid_idx());
+
+ new_array = __alloc_mbwu_array(class, num_mbwu_mon);
+ if (IS_ERR(new_array)) {
+ pr_debug("Failed to allocate MBWU array\n");
+ return;
+ }
+ mon->mbwu_idx_to_mon = new_array;
+
+ if (existing_array) {
+ pr_debug("Releasing previous class %u's monitors\n",
+ existing_class->level);
+ __free_mbwu_mon(existing_class, existing_array, num_mbwu_mon);
+ kfree(existing_array);
+ }
+ } else if (evt_id != QOS_L3_OCCUP_EVENT_ID) {
+ pr_debug("Not pre-allocating free-running counters\n");
+ }
+
exposed_mon_capable = true;
+ mon->class = class;
}
static void mpam_resctrl_pick_counters(void)
--
2.43.0
* [PATCH v4 30/41] arm_mpam: resctrl: Allow resctrl to allocate monitors
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (28 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 29/41] arm_mpam: resctrl: Pre-allocate free running monitors Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 31/41] arm_mpam: resctrl: Add resctrl_arch_rmid_read() and resctrl_arch_reset_rmid() Ben Horgan
` (12 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
When resctrl wants to read a domain's 'QOS_L3_OCCUP', it needs to allocate
a monitor on the corresponding resource. Monitors are allocated by class
instead of component.
If there are enough MBM monitors, they will be pre-allocated and
free-running.
Add helpers to allocate a CSU monitor. These helpers return an out of range
value for MBM counters.
Allocating a monitor context is expected to block until hardware resources
become available. This only makes sense for QOS_L3_OCCUP as unallocated MBM
counters are losing data.
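As a rough illustration of the out of range value, the sketch below
classifies a 32-bit 'mon' value in user-space C. The toy_* names are
hypothetical. The assumptions are that real monitor indices fit in 16 bits,
and that errno-like values occupy the top 4K of the 32-bit range, following
the comment this patch adds next to USE_PRE_ALLOCATED.

```c
#include <stdint.h>

/* Assumed to mirror USE_PRE_ALLOCATED: one past the largest 16-bit index. */
#define TOY_USE_PRE_ALLOCATED	((uint32_t)UINT16_MAX + 1)
/* Assumed boundary for errno-like values stored in a u32: (u32)-4096. */
#define TOY_ERRNO_FLOOR		((uint32_t)-4096)

enum toy_mon_kind {
	TOY_MON_REAL,		/* an index into the hardware monitors */
	TOY_MON_PRE_ALLOCATED,	/* look the monitor up in a per-class array */
	TOY_MON_ERROR,		/* a negative errno stored in an unsigned */
	TOY_MON_INVALID,	/* anything else */
};

static enum toy_mon_kind toy_classify_mon(uint32_t mon)
{
	if (mon <= UINT16_MAX)
		return TOY_MON_REAL;
	if (mon == TOY_USE_PRE_ALLOCATED)
		return TOY_MON_PRE_ALLOCATED;
	if (mon >= TOY_ERRNO_FLOOR)
		return TOY_MON_ERROR;
	return TOY_MON_INVALID;
}
```

Because the sentinel sits between the two ranges, a 16-bit field cannot hold
it, which is presumably why struct mon_cfg's mon member is widened to u32 in
this patch.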
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
USE_RMID_IDX -> USE_PRE_ALLOCATED in comment
Remove unnecessary arch_mon_ctx = NULL
Changes since v2:
Add include of resctrl_types.h as dropped from earlier patch
Changes since v3:
Don't mention ABMC in commit message
---
drivers/resctrl/mpam_internal.h | 14 ++++++-
drivers/resctrl/mpam_resctrl.c | 67 +++++++++++++++++++++++++++++++++
include/linux/arm_mpam.h | 5 +++
3 files changed, 85 insertions(+), 1 deletion(-)
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index 5f4ac4fabc0d..549faa9df0f3 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -29,6 +29,14 @@ struct platform_device;
#define PACKED_FOR_KUNIT
#endif
+/*
+ * These 'mon' values must not alias an actual monitor, so must be larger than
+ * U16_MAX, but not be confused with an errno value, so smaller than
+ * (u32)-SZ_4K.
+ * USE_PRE_ALLOCATED is used to avoid confusion with an actual monitor.
+ */
+#define USE_PRE_ALLOCATED (U16_MAX + 1)
+
static inline bool mpam_is_enabled(void)
{
return static_branch_likely(&mpam_enabled);
@@ -212,7 +220,11 @@ enum mon_filter_options {
};
struct mon_cfg {
- u16 mon;
+ /*
+ * mon must be large enough to hold out of range values like
+ * USE_PRE_ALLOCATED
+ */
+ u32 mon;
u8 pmg;
bool match_pmg;
bool csu_exclude_clean;
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 8c57fd48e560..e2c534a68898 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -22,6 +22,8 @@
#include "mpam_internal.h"
+DECLARE_WAIT_QUEUE_HEAD(resctrl_mon_ctx_waiters);
+
/*
* The classes we've picked to map to resctrl resources, wrapped
* in with their resctrl structure.
@@ -267,6 +269,71 @@ struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
return &mpam_resctrl_controls[l].resctrl_res;
}
+static int resctrl_arch_mon_ctx_alloc_no_wait(enum resctrl_event_id evtid)
+{
+ struct mpam_resctrl_mon *mon = &mpam_resctrl_counters[evtid];
+
+ if (!mon->class)
+ return -EINVAL;
+
+ switch (evtid) {
+ case QOS_L3_OCCUP_EVENT_ID:
+ /* With CDP, one monitor gets used for both code/data reads */
+ return mpam_alloc_csu_mon(mon->class);
+ case QOS_L3_MBM_LOCAL_EVENT_ID:
+ case QOS_L3_MBM_TOTAL_EVENT_ID:
+ return USE_PRE_ALLOCATED;
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r,
+ enum resctrl_event_id evtid)
+{
+ DEFINE_WAIT(wait);
+ int *ret;
+
+ ret = kmalloc(sizeof(*ret), GFP_KERNEL);
+ if (!ret)
+ return ERR_PTR(-ENOMEM);
+
+ do {
+ prepare_to_wait(&resctrl_mon_ctx_waiters, &wait,
+ TASK_INTERRUPTIBLE);
+ *ret = resctrl_arch_mon_ctx_alloc_no_wait(evtid);
+ if (*ret == -ENOSPC)
+ schedule();
+ } while (*ret == -ENOSPC && !signal_pending(current));
+ finish_wait(&resctrl_mon_ctx_waiters, &wait);
+
+ return ret;
+}
+
+static void resctrl_arch_mon_ctx_free_no_wait(enum resctrl_event_id evtid,
+ u32 mon_idx)
+{
+ struct mpam_resctrl_mon *mon = &mpam_resctrl_counters[evtid];
+
+ if (!mon->class)
+ return;
+
+ if (evtid == QOS_L3_OCCUP_EVENT_ID)
+ mpam_free_csu_mon(mon->class, mon_idx);
+
+ wake_up(&resctrl_mon_ctx_waiters);
+}
+
+void resctrl_arch_mon_ctx_free(struct rdt_resource *r,
+ enum resctrl_event_id evtid, void *arch_mon_ctx)
+{
+ u32 mon_idx = *(u32 *)arch_mon_ctx;
+
+ kfree(arch_mon_ctx);
+
+ resctrl_arch_mon_ctx_free_no_wait(evtid, mon_idx);
+}
+
static bool cache_has_usable_cpor(struct mpam_class *class)
{
struct mpam_props *cprops = &class->props;
diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h
index 7d23c90f077d..e1461e32af75 100644
--- a/include/linux/arm_mpam.h
+++ b/include/linux/arm_mpam.h
@@ -5,6 +5,7 @@
#define __LINUX_ARM_MPAM_H
#include <linux/acpi.h>
+#include <linux/resctrl_types.h>
#include <linux/types.h>
struct mpam_msc;
@@ -62,6 +63,10 @@ u32 resctrl_arch_rmid_idx_encode(u32 closid, u32 rmid);
void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid);
u32 resctrl_arch_system_num_rmid_idx(void);
+struct rdt_resource;
+void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r, enum resctrl_event_id evtid);
+void resctrl_arch_mon_ctx_free(struct rdt_resource *r, enum resctrl_event_id evtid, void *ctx);
+
/**
* mpam_register_requestor() - Register a requestor with the MPAM driver
* @partid_max: The maximum PARTID value the requestor can generate.
--
2.43.0
* [PATCH v4 31/41] arm_mpam: resctrl: Add resctrl_arch_rmid_read() and resctrl_arch_reset_rmid()
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (29 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 30/41] arm_mpam: resctrl: Allow resctrl to allocate monitors Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 32/41] arm_mpam: resctrl: Update the rmid reallocation limit Ben Horgan
` (11 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
resctrl uses resctrl_arch_rmid_read() to read counters. CDP emulation means
the counter may need reading in three different ways. The same goes for
reset.
The helpers behind the resctrl_arch_ functions will be re-used for the ABMC
equivalent functions.
Add the rounding helper for checking monitor values while we're here.
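The three ways of reading (no CDP, CDP code, CDP data) can be sketched in
user-space C as below. The toy_* names and the counter table are
hypothetical. The closid-to-index mapping (data at 2 * closid, code at
2 * closid + 1) is an assumption about what resctrl_get_config_index() does,
not something stated in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

enum toy_conf_type { TOY_CDP_NONE, TOY_CDP_CODE, TOY_CDP_DATA };

/* Hypothetical per-PARTID counters standing in for hardware monitors. */
#define TOY_NUM_PARTID 8
static uint64_t toy_counter[TOY_NUM_PARTID];

/* Assumed mapping: with CDP, each closid owns two adjacent PARTIDs. */
static uint32_t toy_config_index(uint32_t closid, enum toy_conf_type type)
{
	switch (type) {
	case TOY_CDP_CODE:
		return closid * 2 + 1;
	case TOY_CDP_DATA:
		return closid * 2;
	case TOY_CDP_NONE:
	default:
		return closid;
	}
}

/* Read the single counter selected by (closid, type). */
static int toy_read_one(uint32_t closid, enum toy_conf_type type,
			uint64_t *val)
{
	uint32_t partid = toy_config_index(closid, type);

	if (partid >= TOY_NUM_PARTID)
		return -1;

	*val = toy_counter[partid];
	return 0;
}

/* With CDP enabled, the value for a closid is code plus data. */
static int toy_read_cdp_safe(bool cdp_enabled, uint32_t closid,
			     uint64_t *val)
{
	uint64_t code_val = 0, data_val = 0;
	int err;

	if (!cdp_enabled)
		return toy_read_one(closid, TOY_CDP_NONE, val);

	err = toy_read_one(closid, TOY_CDP_CODE, &code_val);
	if (err)
		return err;

	err = toy_read_one(closid, TOY_CDP_DATA, &data_val);
	if (err)
		return err;

	*val = code_val + data_val;
	return 0;
}
```

Reset follows the same shape: one reset when CDP is off, one each for the
code and data halves when it is on.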
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
cfg initialisation style
code flow at end of read_mon_cdp_safe()
Changes since v2:
Whitespace changes
Changes since v3:
Update function signatures
Remove abmc check
---
drivers/resctrl/mpam_resctrl.c | 152 +++++++++++++++++++++++++++++++++
include/linux/arm_mpam.h | 5 ++
2 files changed, 157 insertions(+)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index e2c534a68898..e8102cc74de8 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -334,6 +334,158 @@ void resctrl_arch_mon_ctx_free(struct rdt_resource *r,
resctrl_arch_mon_ctx_free_no_wait(evtid, mon_idx);
}
+static int __read_mon(struct mpam_resctrl_mon *mon, struct mpam_component *mon_comp,
+ enum mpam_device_features mon_type,
+ int mon_idx,
+ enum resctrl_conf_type cdp_type, u32 closid, u32 rmid, u64 *val)
+{
+ struct mon_cfg cfg;
+
+ if (!mpam_is_enabled())
+ return -EINVAL;
+
+ /* Shift closid to account for CDP */
+ closid = resctrl_get_config_index(closid, cdp_type);
+
+ if (mon_idx == USE_PRE_ALLOCATED) {
+ int mbwu_idx = resctrl_arch_rmid_idx_encode(closid, rmid);
+
+ mon_idx = mon->mbwu_idx_to_mon[mbwu_idx];
+ if (mon_idx == -1)
+ return -EINVAL;
+ }
+
+ if (irqs_disabled()) {
+ /* Check if we can access this domain without an IPI */
+ return -EIO;
+ }
+
+ cfg = (struct mon_cfg) {
+ .mon = mon_idx,
+ .match_pmg = true,
+ .partid = closid,
+ .pmg = rmid,
+ };
+
+ return mpam_msmon_read(mon_comp, &cfg, mon_type, val);
+}
+
+static int read_mon_cdp_safe(struct mpam_resctrl_mon *mon, struct mpam_component *mon_comp,
+ enum mpam_device_features mon_type,
+ int mon_idx, u32 closid, u32 rmid, u64 *val)
+{
+ if (cdp_enabled) {
+ u64 code_val = 0, data_val = 0;
+ int err;
+
+ err = __read_mon(mon, mon_comp, mon_type, mon_idx,
+ CDP_CODE, closid, rmid, &code_val);
+ if (err)
+ return err;
+
+ err = __read_mon(mon, mon_comp, mon_type, mon_idx,
+ CDP_DATA, closid, rmid, &data_val);
+ if (err)
+ return err;
+
+ *val += code_val + data_val;
+ return 0;
+ }
+
+ return __read_mon(mon, mon_comp, mon_type, mon_idx,
+ CDP_NONE, closid, rmid, val);
+}
+
+/* MBWU when not in ABMC mode, and CSU counters. */
+int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain_hdr *hdr,
+ u32 closid, u32 rmid, enum resctrl_event_id eventid,
+ void *arch_priv, u64 *val, void *arch_mon_ctx)
+{
+ struct mpam_resctrl_dom *l3_dom;
+ struct mpam_component *mon_comp;
+ u32 mon_idx = *(u32 *)arch_mon_ctx;
+ enum mpam_device_features mon_type;
+ struct mpam_resctrl_mon *mon = &mpam_resctrl_counters[eventid];
+
+ resctrl_arch_rmid_read_context_check();
+
+ if (eventid >= QOS_NUM_EVENTS || !mon->class)
+ return -EINVAL;
+
+ l3_dom = container_of(hdr, struct mpam_resctrl_dom, resctrl_mon_dom.hdr);
+ mon_comp = l3_dom->mon_comp[eventid];
+
+ switch (eventid) {
+ case QOS_L3_OCCUP_EVENT_ID:
+ mon_type = mpam_feat_msmon_csu;
+ break;
+ case QOS_L3_MBM_LOCAL_EVENT_ID:
+ case QOS_L3_MBM_TOTAL_EVENT_ID:
+ mon_type = mpam_feat_msmon_mbwu;
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ return read_mon_cdp_safe(mon, mon_comp, mon_type, mon_idx,
+ closid, rmid, val);
+}
+
+static void __reset_mon(struct mpam_resctrl_mon *mon, struct mpam_component *mon_comp,
+ int mon_idx,
+ enum resctrl_conf_type cdp_type, u32 closid, u32 rmid)
+{
+ struct mon_cfg cfg = { };
+
+ if (!mpam_is_enabled())
+ return;
+
+ /* Shift closid to account for CDP */
+ closid = resctrl_get_config_index(closid, cdp_type);
+
+ if (mon_idx == USE_PRE_ALLOCATED) {
+ int mbwu_idx = resctrl_arch_rmid_idx_encode(closid, rmid);
+ mon_idx = mon->mbwu_idx_to_mon[mbwu_idx];
+ }
+
+ if (mon_idx == -1)
+ return;
+ cfg.mon = mon_idx;
+ mpam_msmon_reset_mbwu(mon_comp, &cfg);
+}
+
+static void reset_mon_cdp_safe(struct mpam_resctrl_mon *mon, struct mpam_component *mon_comp,
+ int mon_idx, u32 closid, u32 rmid)
+{
+ if (cdp_enabled) {
+ __reset_mon(mon, mon_comp, mon_idx, CDP_CODE, closid, rmid);
+ __reset_mon(mon, mon_comp, mon_idx, CDP_DATA, closid, rmid);
+ } else {
+ __reset_mon(mon, mon_comp, mon_idx, CDP_NONE, closid, rmid);
+ }
+}
+
+/* Called via IPI. Call with read_cpus_lock() held. */
+void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_l3_mon_domain *d,
+ u32 closid, u32 rmid, enum resctrl_event_id eventid)
+{
+ struct mpam_resctrl_dom *l3_dom;
+ struct mpam_component *mon_comp;
+ struct mpam_resctrl_mon *mon = &mpam_resctrl_counters[eventid];
+
+ if (!mpam_is_enabled())
+ return;
+
+ /* Only MBWU counters are relevant, and for supported event types. */
+ if (eventid == QOS_L3_OCCUP_EVENT_ID || !mon->class)
+ return;
+
+ l3_dom = container_of(d, struct mpam_resctrl_dom, resctrl_mon_dom);
+ mon_comp = l3_dom->mon_comp[eventid];
+
+ reset_mon_cdp_safe(mon, mon_comp, USE_PRE_ALLOCATED, closid, rmid);
+}
+
static bool cache_has_usable_cpor(struct mpam_class *class)
{
struct mpam_props *cprops = &class->props;
diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h
index e1461e32af75..86d5e326d2bd 100644
--- a/include/linux/arm_mpam.h
+++ b/include/linux/arm_mpam.h
@@ -67,6 +67,11 @@ struct rdt_resource;
void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r, enum resctrl_event_id evtid);
void resctrl_arch_mon_ctx_free(struct rdt_resource *r, enum resctrl_event_id evtid, void *ctx);
+static inline unsigned int resctrl_arch_round_mon_val(unsigned int val)
+{
+ return val;
+}
+
/**
* mpam_register_requestor() - Register a requestor with the MPAM driver
* @partid_max: The maximum PARTID value the requestor can generate.
--
2.43.0
* [PATCH v4 32/41] arm_mpam: resctrl: Update the rmid reallocation limit
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (30 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 31/41] arm_mpam: resctrl: Add resctrl_arch_rmid_read() and resctrl_arch_reset_rmid() Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 33/41] arm_mpam: resctrl: Add empty definitions for assorted resctrl functions Ben Horgan
` (10 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
resctrl's limbo code needs to be told when the data left in a cache is
small enough for the partid+pmg value to be re-allocated.
x86 uses the cache size divided by the number of rmid users the cache may
have. Do the same, but for the smallest cache, and with the number of
partid-and-pmg users.
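As a worked example of the calculation, the user-space sketch below tracks
the smallest cache seen so far and derives the threshold from it. The
toy_limits structure and function name are hypothetical; in the patch the
two values live in resctrl_rmid_realloc_limit and
resctrl_rmid_realloc_threshold, and the divisor comes from
resctrl_arch_system_num_rmid_idx().

```c
#include <stdint.h>

/* Hypothetical container for the two resctrl globals. */
struct toy_limits {
	uint32_t realloc_limit;		/* size of the smallest cache seen */
	uint32_t realloc_threshold;	/* limit / number of partid-pmg users */
};

/*
 * Only shrink the limit: the threshold is derived from the smallest
 * cache, so a larger cache seen later must not overwrite it.
 */
static void toy_update_limits(struct toy_limits *lim, uint32_t cache_size,
			      uint32_t num_idx)
{
	if (!cache_size || !num_idx)
		return;

	if (!lim->realloc_limit || cache_size < lim->realloc_limit) {
		lim->realloc_limit = cache_size;
		lim->realloc_threshold = cache_size / num_idx;
	}
}
```

For a 32 MiB cache and 256 partid-and-pmg combinations this gives a threshold
of 128 KiB. A later 8 MiB cache lowers it to 32 KiB, and a 16 MiB cache seen
after that leaves it unchanged.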
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v2:
Move waiting for cache info into its own patch
Changes since v3:
Move check class is csu higher (just kept to document intent)
continue -> break
to squash update rmid limits
use raw_smp_processor_id()
---
drivers/resctrl/mpam_resctrl.c | 39 ++++++++++++++++++++++++++++++++++
1 file changed, 39 insertions(+)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index e8102cc74de8..e2b1afca5f01 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -486,6 +486,42 @@ void resctrl_arch_reset_rmid(struct rdt_resource *r, struct rdt_l3_mon_domain *d
reset_mon_cdp_safe(mon, mon_comp, USE_PRE_ALLOCATED, closid, rmid);
}
+/*
+ * The rmid realloc threshold should be for the smallest cache exposed to
+ * resctrl.
+ */
+static int update_rmid_limits(struct mpam_class *class)
+{
+ u32 num_unique_pmg = resctrl_arch_system_num_rmid_idx();
+ struct mpam_props *cprops = &class->props;
+ struct cacheinfo *ci;
+
+ lockdep_assert_cpus_held();
+
+ if (!mpam_has_feature(mpam_feat_msmon_csu, cprops))
+ return 0;
+
+ /*
+ * Assume cache levels are the same size for all CPUs...
+ * The check just requires any online CPU and it can't go offline as we
+ * hold the cpu lock.
+ */
+ ci = get_cpu_cacheinfo_level(raw_smp_processor_id(), class->level);
+ if (!ci || ci->size == 0) {
+ pr_debug("Could not read cache size for class %u\n",
+ class->level);
+ return -EINVAL;
+ }
+
+ if (!resctrl_rmid_realloc_limit ||
+ ci->size < resctrl_rmid_realloc_limit) {
+ resctrl_rmid_realloc_limit = ci->size;
+ resctrl_rmid_realloc_threshold = ci->size / num_unique_pmg;
+ }
+
+ return 0;
+}
+
static bool cache_has_usable_cpor(struct mpam_class *class)
{
struct mpam_props *cprops = &class->props;
@@ -991,6 +1027,9 @@ static void mpam_resctrl_pick_counters(void)
/* CSU counters only make sense on a cache. */
switch (class->type) {
case MPAM_CLASS_CACHE:
+ if (update_rmid_limits(class))
+ break;
+
counter_update_class(QOS_L3_OCCUP_EVENT_ID, class);
break;
default:
--
2.43.0
* [PATCH v4 33/41] arm_mpam: resctrl: Add empty definitions for assorted resctrl functions
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (31 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 32/41] arm_mpam: resctrl: Update the rmid reallocation limit Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 34/41] arm64: mpam: Select ARCH_HAS_CPU_RESCTRL Ben Horgan
` (9 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
A few resctrl features and hooks need to be provided, but aren't needed or
supported on MPAM platforms.
resctrl has individual hooks to separately enable and disable the
closid/partid and rmid/pmg context switching code. For MPAM this is all the
same thing, as the value in struct task_struct is used to cache the value
that should be written to hardware. arm64's context switching code is
enabled once MPAM is usable, but doesn't touch the hardware unless the
value has changed.
For now event configuration is not supported, and can be turned off by
returning 'false' from resctrl_arch_is_evt_configurable().
The new io_alloc feature is not supported either. Always return false from
the 'enabled' query helper to indicate this, and fail any attempt to enable it.
Add this, and empty definitions for the other hooks.
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v3:
Add resctrl_arch_pre_mount() {}
resctrl_arch_reset_rmid_all() signature update
add stubs for abmc
keep empty definitions together
---
drivers/resctrl/mpam_resctrl.c | 60 ++++++++++++++++++++++++++++++++++
include/linux/arm_mpam.h | 9 +++++
2 files changed, 69 insertions(+)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index e2b1afca5f01..db1aba14b41d 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -90,6 +90,66 @@ bool resctrl_arch_mon_capable(void)
return exposed_mon_capable;
}
+bool resctrl_arch_is_evt_configurable(enum resctrl_event_id evt)
+{
+ return false;
+}
+
+void resctrl_arch_mon_event_config_read(void *info)
+{
+}
+
+void resctrl_arch_mon_event_config_write(void *info)
+{
+}
+
+void resctrl_arch_reset_rmid_all(struct rdt_resource *r, struct rdt_l3_mon_domain *d)
+{
+}
+
+void resctrl_arch_reset_cntr(struct rdt_resource *r, struct rdt_l3_mon_domain *d,
+ u32 closid, u32 rmid, int cntr_id,
+ enum resctrl_event_id eventid)
+{
+}
+
+void resctrl_arch_config_cntr(struct rdt_resource *r, struct rdt_l3_mon_domain *d,
+ enum resctrl_event_id evtid, u32 rmid, u32 closid,
+ u32 cntr_id, bool assign)
+{
+}
+
+int resctrl_arch_cntr_read(struct rdt_resource *r, struct rdt_l3_mon_domain *d,
+ u32 unused, u32 rmid, int cntr_id,
+ enum resctrl_event_id eventid, u64 *val)
+{
+ return -EOPNOTSUPP;
+}
+
+bool resctrl_arch_mbm_cntr_assign_enabled(struct rdt_resource *r)
+{
+ return false;
+}
+
+int resctrl_arch_mbm_cntr_assign_set(struct rdt_resource *r, bool enable)
+{
+ return -EINVAL;
+}
+
+int resctrl_arch_io_alloc_enable(struct rdt_resource *r, bool enable)
+{
+ return -EOPNOTSUPP;
+}
+
+bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r)
+{
+ return false;
+}
+
+void resctrl_arch_pre_mount(void)
+{
+}
+
bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level rid)
{
switch (rid) {
diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h
index 86d5e326d2bd..f92a36187a52 100644
--- a/include/linux/arm_mpam.h
+++ b/include/linux/arm_mpam.h
@@ -67,6 +67,15 @@ struct rdt_resource;
void *resctrl_arch_mon_ctx_alloc(struct rdt_resource *r, enum resctrl_event_id evtid);
void resctrl_arch_mon_ctx_free(struct rdt_resource *r, enum resctrl_event_id evtid, void *ctx);
+/*
+ * The CPU configuration for MPAM is cheap to write, and is only written if it
+ * has changed. No need for fine grained enables.
+ */
+static inline void resctrl_arch_enable_mon(void) { }
+static inline void resctrl_arch_disable_mon(void) { }
+static inline void resctrl_arch_enable_alloc(void) { }
+static inline void resctrl_arch_disable_alloc(void) { }
+
static inline unsigned int resctrl_arch_round_mon_val(unsigned int val)
{
return val;
--
2.43.0
* [PATCH v4 34/41] arm64: mpam: Select ARCH_HAS_CPU_RESCTRL
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (32 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 33/41] arm_mpam: resctrl: Add empty definitions for assorted resctrl functions Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 35/41] arm_mpam: resctrl: Call resctrl_init() on platforms that can support resctrl Ben Horgan
` (8 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
Enough MPAM support is present to enable ARCH_HAS_CPU_RESCTRL. Let it
rip^Wlink!
ARCH_HAS_CPU_RESCTRL indicates resctrl can be enabled. It is enabled by the
arch code simply because it has 'arch' in its name.
This removes ARM_CPU_RESCTRL as a mimic of X86_CPU_RESCTRL. While here,
move the ACPI dependency to the driver's Kconfig file.
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
arch/arm64/Kconfig | 2 +-
arch/arm64/include/asm/resctrl.h | 2 ++
drivers/resctrl/Kconfig | 7 +++++++
drivers/resctrl/Makefile | 2 +-
4 files changed, 11 insertions(+), 2 deletions(-)
create mode 100644 arch/arm64/include/asm/resctrl.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 0e8fa195580b..baeecb88771d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2027,7 +2027,7 @@ config ARM64_TLB_RANGE
config ARM64_MPAM
bool "Enable support for MPAM"
select ARM64_MPAM_DRIVER
- select ACPI_MPAM if ACPI
+ select ARCH_HAS_CPU_RESCTRL
help
Memory System Resource Partitioning and Monitoring (MPAM) is an
optional extension to the Arm architecture that allows each
diff --git a/arch/arm64/include/asm/resctrl.h b/arch/arm64/include/asm/resctrl.h
new file mode 100644
index 000000000000..b506e95cf6e3
--- /dev/null
+++ b/arch/arm64/include/asm/resctrl.h
@@ -0,0 +1,2 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#include <linux/arm_mpam.h>
diff --git a/drivers/resctrl/Kconfig b/drivers/resctrl/Kconfig
index c34e059c6e41..672abea3b03c 100644
--- a/drivers/resctrl/Kconfig
+++ b/drivers/resctrl/Kconfig
@@ -1,6 +1,7 @@
menuconfig ARM64_MPAM_DRIVER
bool "MPAM driver"
depends on ARM64 && ARM64_MPAM
+ select ACPI_MPAM if ACPI
help
Memory System Resource Partitioning and Monitoring (MPAM) driver for
System IP, e.g. caches and memory controllers.
@@ -22,3 +23,9 @@ config MPAM_KUNIT_TEST
If unsure, say N.
endif
+
+config ARM64_MPAM_RESCTRL_FS
+ bool
+ default y if ARM64_MPAM_DRIVER && RESCTRL_FS
+ select RESCTRL_RMID_DEPENDS_ON_CLOSID
+ select RESCTRL_ASSIGN_FIXED
diff --git a/drivers/resctrl/Makefile b/drivers/resctrl/Makefile
index 40beaf999582..4f6d0e81f9b8 100644
--- a/drivers/resctrl/Makefile
+++ b/drivers/resctrl/Makefile
@@ -1,5 +1,5 @@
obj-$(CONFIG_ARM64_MPAM_DRIVER) += mpam.o
mpam-y += mpam_devices.o
-mpam-$(CONFIG_ARM_CPU_RESCTRL) += mpam_resctrl.o
+mpam-$(CONFIG_ARM64_MPAM_RESCTRL_FS) += mpam_resctrl.o
ccflags-$(CONFIG_ARM64_MPAM_DRIVER_DEBUG) += -DDEBUG
--
2.43.0
* [PATCH v4 35/41] arm_mpam: resctrl: Call resctrl_init() on platforms that can support resctrl
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (33 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 34/41] arm64: mpam: Select ARCH_HAS_CPU_RESCTRL Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 36/41] arm_mpam: Add quirk framework Ben Horgan
` (7 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: James Morse <james.morse@arm.com>
Now that MPAM links against resctrl, call resctrl_init() to register the
filesystem and set up resctrl's structures.
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Tested-by: Peter Newman <peternewman@google.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v2:
Use for_each_mpam...
error path tidying
Changes since v3:
Don't consider abmc in teardown
---
drivers/resctrl/mpam_devices.c | 32 ++++++++++++--
drivers/resctrl/mpam_internal.h | 4 ++
drivers/resctrl/mpam_resctrl.c | 76 ++++++++++++++++++++++++++++++++-
3 files changed, 107 insertions(+), 5 deletions(-)
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 965384c667f6..6c81821a9a04 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -73,6 +73,14 @@ static DECLARE_WORK(mpam_broken_work, &mpam_disable);
/* When mpam is disabled, the printed reason to aid debugging */
static char *mpam_disable_reason;
+/*
+ * Whether resctrl has been setup. Used by cpuhp in preference to
+ * mpam_is_enabled(). The disable call after an error interrupt makes
+ * mpam_is_enabled() false before the cpuhp callbacks are made.
+ * Reads/writes should hold mpam_cpuhp_state_lock, (or be cpuhp callbacks).
+ */
+static bool mpam_resctrl_enabled;
+
/*
* An MSC is a physical container for controls and monitors, each identified by
* their RIS index. These share a base-address, interrupts and some MMIO
@@ -1635,7 +1643,7 @@ static int mpam_cpu_online(unsigned int cpu)
mpam_reprogram_msc(msc);
}
- if (mpam_is_enabled())
+ if (mpam_resctrl_enabled)
return mpam_resctrl_online_cpu(cpu);
return 0;
@@ -1681,7 +1689,7 @@ static int mpam_cpu_offline(unsigned int cpu)
{
struct mpam_msc *msc;
- if (mpam_is_enabled())
+ if (mpam_resctrl_enabled)
mpam_resctrl_offline_cpu(cpu);
guard(srcu)(&mpam_srcu);
@@ -2543,6 +2551,7 @@ static void mpam_enable_once(void)
}
static_branch_enable(&mpam_enabled);
+ mpam_resctrl_enabled = true;
mpam_register_cpuhp_callbacks(mpam_cpu_online, mpam_cpu_offline,
"mpam:online");
@@ -2602,24 +2611,39 @@ static void mpam_reset_class(struct mpam_class *class)
void mpam_disable(struct work_struct *ignored)
{
int idx;
+ bool do_resctrl_exit;
struct mpam_class *class;
struct mpam_msc *msc, *tmp;
+ if (mpam_is_enabled())
+ static_branch_disable(&mpam_enabled);
+
mutex_lock(&mpam_cpuhp_state_lock);
if (mpam_cpuhp_state) {
cpuhp_remove_state(mpam_cpuhp_state);
mpam_cpuhp_state = 0;
}
+
+ /*
+ * Removing the cpuhp state called mpam_cpu_offline() and told resctrl
+ * all the CPUs are offline.
+ */
+ do_resctrl_exit = mpam_resctrl_enabled;
+ mpam_resctrl_enabled = false;
mutex_unlock(&mpam_cpuhp_state_lock);
- static_branch_disable(&mpam_enabled);
+ if (do_resctrl_exit)
+ mpam_resctrl_exit();
mpam_unregister_irqs();
idx = srcu_read_lock(&mpam_srcu);
list_for_each_entry_srcu(class, &mpam_classes, classes_list,
- srcu_read_lock_held(&mpam_srcu))
+ srcu_read_lock_held(&mpam_srcu)) {
mpam_reset_class(class);
+ if (do_resctrl_exit)
+ mpam_resctrl_teardown_class(class);
+ }
srcu_read_unlock(&mpam_srcu, idx);
mutex_lock(&mpam_list_lock);
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index 549faa9df0f3..05561d447938 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -431,12 +431,16 @@ int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
#ifdef CONFIG_RESCTRL_FS
int mpam_resctrl_setup(void);
+void mpam_resctrl_exit(void);
int mpam_resctrl_online_cpu(unsigned int cpu);
void mpam_resctrl_offline_cpu(unsigned int cpu);
+void mpam_resctrl_teardown_class(struct mpam_class *class);
#else
static inline int mpam_resctrl_setup(void) { return 0; }
+static inline void mpam_resctrl_exit(void) { }
static inline int mpam_resctrl_online_cpu(unsigned int cpu) { return 0; }
static inline void mpam_resctrl_offline_cpu(unsigned int cpu) { }
+static inline void mpam_resctrl_teardown_class(struct mpam_class *class) { }
#endif /* CONFIG_RESCTRL_FS */
/*
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index db1aba14b41d..91e46503c9a5 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -72,6 +72,12 @@ static bool cdp_enabled;
static bool cacheinfo_ready;
static DECLARE_WAIT_QUEUE_HEAD(wait_cacheinfo_ready);
+/*
+ * If resctrl_init() succeeded, resctrl_exit() can be used to remove support
+ * for the filesystem in the event of an error.
+ */
+static bool resctrl_enabled;
+
/* Whether this num_mbw_mon could result in a free_running system */
static int __mpam_monitors_free_running(u16 num_mbwu_mon)
{
@@ -333,6 +339,9 @@ static int resctrl_arch_mon_ctx_alloc_no_wait(enum resctrl_event_id evtid)
{
struct mpam_resctrl_mon *mon = &mpam_resctrl_counters[evtid];
+ if (!mpam_is_enabled())
+ return -EINVAL;
+
if (!mon->class)
return -EINVAL;
@@ -375,6 +384,9 @@ static void resctrl_arch_mon_ctx_free_no_wait(enum resctrl_event_id evtid,
{
struct mpam_resctrl_mon *mon = &mpam_resctrl_counters[evtid];
+ if (!mpam_is_enabled())
+ return;
+
if (!mon->class)
return;
@@ -469,6 +481,9 @@ int resctrl_arch_rmid_read(struct rdt_resource *r, struct rdt_domain_hdr *hdr,
resctrl_arch_rmid_read_context_check();
+ if (!mpam_is_enabled())
+ return -EINVAL;
+
if (eventid >= QOS_NUM_EVENTS || !mon->class)
return -EINVAL;
@@ -1313,6 +1328,9 @@ int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
lockdep_assert_cpus_held();
lockdep_assert_irqs_enabled();
+ if (!mpam_is_enabled())
+ return -EINVAL;
+
/*
* No need to check the CPU as mpam_apply_config() doesn't care, and
* resctrl_arch_update_domains() relies on this.
@@ -1375,6 +1393,9 @@ int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
lockdep_assert_cpus_held();
lockdep_assert_irqs_enabled();
+ if (!mpam_is_enabled())
+ return -EINVAL;
+
list_for_each_entry_rcu(d, &r->ctrl_domains, hdr.list) {
for (enum resctrl_conf_type t = 0; t < CDP_NUM_TYPES; t++) {
struct resctrl_staged_config *cfg = &d->staged_config[t];
@@ -1763,7 +1784,11 @@ int mpam_resctrl_setup(void)
return -EOPNOTSUPP;
}
- /* TODO: call resctrl_init() */
+ err = resctrl_init();
+ if (err)
+ return err;
+
+ WRITE_ONCE(resctrl_enabled, true);
return 0;
@@ -1773,6 +1798,55 @@ int mpam_resctrl_setup(void)
return err;
}
+void mpam_resctrl_exit(void)
+{
+ if (!READ_ONCE(resctrl_enabled))
+ return;
+
+ WRITE_ONCE(resctrl_enabled, false);
+ resctrl_exit();
+}
+
+static void mpam_resctrl_teardown_mon(struct mpam_resctrl_mon *mon, struct mpam_class *class)
+{
+ u32 num_mbwu_mon = resctrl_arch_system_num_rmid_idx();
+
+ if (!mon->mbwu_idx_to_mon)
+ return;
+
+ __free_mbwu_mon(class, mon->mbwu_idx_to_mon, num_mbwu_mon);
+ mon->mbwu_idx_to_mon = NULL;
+}
+
+/*
+ * The driver is detaching an MSC from this class, if resctrl was using it,
+ * pull on resctrl_exit().
+ */
+void mpam_resctrl_teardown_class(struct mpam_class *class)
+{
+ struct mpam_resctrl_res *res;
+ enum resctrl_res_level rid;
+ struct mpam_resctrl_mon *mon;
+ enum resctrl_event_id eventid;
+
+ might_sleep();
+
+ for_each_mpam_resctrl_control(res, rid) {
+ if (res->class == class) {
+ res->class = NULL;
+ break;
+ }
+ }
+ for_each_mpam_resctrl_mon(mon, eventid) {
+ if (mon->class == class) {
+ mon->class = NULL;
+
+ mpam_resctrl_teardown_mon(mon, class);
+ break;
+ }
+ }
+}
+
static int __init __cacheinfo_ready(void)
{
cacheinfo_ready = true;
--
2.43.0
* [PATCH v4 36/41] arm_mpam: Add quirk framework
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: Shanker Donthineni <sdonthineni@nvidia.com>
The MPAM specification includes the MPAMF_IIDR, which serves to uniquely
identify the MSC implementation through a combination of implementer
details, product ID, variant, and revision. Certain hardware issues/errata
can be resolved using software workarounds.
Introduce a quirk framework to allow workarounds to be enabled based on the
MPAMF_IIDR value.
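As an illustration of the matching described above, here is a minimal
userspace sketch of a sentinel-terminated quirk table keyed on a masked
IIDR value. The demo_* names, the implementer-only second entry and the
field positions used by the IIDR_* macros are assumptions made for this
example, not taken from the patch. The optional init() callback is left
out for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed MPAMF_IIDR field layout, for illustration only */
#define IIDR_IMPLEMENTER(x)	((uint32_t)(x) & 0xfffu)
#define IIDR_REVISION(x)	(((uint32_t)(x) & 0xfu) << 12)
#define IIDR_VARIANT(x)		(((uint32_t)(x) & 0xfu) << 16)
#define IIDR_PRODUCTID(x)	(((uint32_t)(x) & 0xfffu) << 20)

/* Hypothetical workaround bits */
enum { DEMO_QUIRK_A, DEMO_QUIRK_B };

struct demo_quirk {
	uint32_t iidr;		/* value expected after masking */
	uint32_t iidr_mask;	/* zero terminates the table */
	unsigned int workaround;
};

struct demo_msc {
	uint32_t iidr;		/* stashed at probe time */
	uint16_t quirks;	/* one bit per enabled workaround */
};

static const struct demo_quirk demo_quirks[] = {
	/* Exact match on product, variant, revision and implementer */
	{ IIDR_PRODUCTID(0x241) | IIDR_IMPLEMENTER(0x36b), 0xffffffffu,
	  DEMO_QUIRK_A },
	/* Hypothetical: match every part from one implementer */
	{ IIDR_IMPLEMENTER(0x36b), IIDR_IMPLEMENTER(0xfff), DEMO_QUIRK_B },
	{ 0, 0, 0 }		/* Sentinel */
};

/* Walk the table and set a quirk bit for every entry that matches */
static void demo_enable_quirks(struct demo_msc *msc)
{
	const struct demo_quirk *q;

	for (q = demo_quirks; q->iidr_mask; q++) {
		if (q->iidr != (msc->iidr & q->iidr_mask))
			continue;

		msc->quirks |= (uint16_t)(1u << q->workaround);
	}
}
```

A table entry with a narrower mask lets one workaround cover a family of
parts, while a full mask pins it to a single product/variant/revision.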
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Co-developed-by: James Morse <james.morse@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes by James:
Stash the IIDR so this doesn't need an IPI, enable quirks only
once, move the description to the callback so it can be pr_once()d, add an
enum of workarounds for popular errata. Add macros for making lists of
product/revision/vendor half readable
Changes since rfc:
remove trailing commas in last element of enums
Make mpam_enable_quirks() in charge of mpam_set_quirk() even if there
is an enable.
Changes since v3:
Brackets in macro
---
drivers/resctrl/mpam_devices.c | 32 ++++++++++++++++++++++++++++++++
drivers/resctrl/mpam_internal.h | 25 +++++++++++++++++++++++++
2 files changed, 57 insertions(+)
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 6c81821a9a04..4c5b9810c41f 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -630,6 +630,30 @@ static struct mpam_msc_ris *mpam_get_or_create_ris(struct mpam_msc *msc,
return ERR_PTR(-ENOENT);
}
+static const struct mpam_quirk mpam_quirks[] = {
+ { NULL } /* Sentinel */
+};
+
+static void mpam_enable_quirks(struct mpam_msc *msc)
+{
+ const struct mpam_quirk *quirk;
+
+ for (quirk = &mpam_quirks[0]; quirk->iidr_mask; quirk++) {
+ int err = 0;
+
+ if (quirk->iidr != (msc->iidr & quirk->iidr_mask))
+ continue;
+
+ if (quirk->init)
+ err = quirk->init(msc, quirk);
+
+ if (err)
+ continue;
+
+ mpam_set_quirk(quirk->workaround, msc);
+ }
+}
+
/*
* IHI009A.a has this nugget: "If a monitor does not support automatic behaviour
* of NRDY, software can use this bit for any purpose" - so hardware might not
@@ -864,8 +888,11 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
/* Grab an IDR value to find out how many RIS there are */
mutex_lock(&msc->part_sel_lock);
idr = mpam_msc_read_idr(msc);
+ msc->iidr = mpam_read_partsel_reg(msc, IIDR);
mutex_unlock(&msc->part_sel_lock);
+ mpam_enable_quirks(msc);
+
msc->ris_max = FIELD_GET(MPAMF_IDR_RIS_MAX, idr);
/* Use these values so partid/pmg always starts with a valid value */
@@ -1988,6 +2015,7 @@ static bool mpam_has_cmax_wd_feature(struct mpam_props *props)
* resulting safe value must be compatible with both. When merging values in
* the tree, all the aliasing resources must be handled first.
* On mismatch, parent is modified.
+ * Quirks on an MSC will apply to all MSC in that class.
*/
static void __props_mismatch(struct mpam_props *parent,
struct mpam_props *child, bool alias)
@@ -2107,6 +2135,7 @@ static void __props_mismatch(struct mpam_props *parent,
* nobble the class feature, as we can't configure all the resources.
* e.g. The L3 cache is composed of two resources with 13 and 17 portion
* bitmaps respectively.
+ * Quirks on an MSC will apply to all MSC in that class.
*/
static void
__class_props_mismatch(struct mpam_class *class, struct mpam_vmsc *vmsc)
@@ -2120,6 +2149,9 @@ __class_props_mismatch(struct mpam_class *class, struct mpam_vmsc *vmsc)
dev_dbg(dev, "Merging features for class:0x%lx &= vmsc:0x%lx\n",
(long)cprops->features, (long)vprops->features);
+ /* Merge quirks */
+ class->quirks |= vmsc->msc->quirks;
+
/* Take the safe value for any common features */
__props_mismatch(cprops, vprops, false);
}
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index 05561d447938..b161d13d7877 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -85,6 +85,8 @@ struct mpam_msc {
u8 pmg_max;
unsigned long ris_idxs;
u32 ris_max;
+ u32 iidr;
+ u16 quirks;
/*
* error_irq_lock is taken when registering/unregistering the error
@@ -212,6 +214,28 @@ struct mpam_props {
#define mpam_set_feature(_feat, x) set_bit(_feat, (x)->features)
#define mpam_clear_feature(_feat, x) clear_bit(_feat, (x)->features)
+/* Workaround bits for msc->quirks */
+enum mpam_device_quirks {
+ MPAM_QUIRK_LAST
+};
+
+#define mpam_has_quirk(_quirk, x) ((1 << (_quirk) & (x)->quirks))
+#define mpam_set_quirk(_quirk, x) ((x)->quirks |= (1 << (_quirk)))
+
+struct mpam_quirk {
+ int (*init)(struct mpam_msc *msc, const struct mpam_quirk *quirk);
+
+ u32 iidr;
+ u32 iidr_mask;
+
+ enum mpam_device_quirks workaround;
+};
+
+#define MPAM_IIDR_MATCH_ONE (FIELD_PREP_CONST(MPAMF_IIDR_PRODUCTID, 0xfff) | \
+ FIELD_PREP_CONST(MPAMF_IIDR_VARIANT, 0xf) | \
+ FIELD_PREP_CONST(MPAMF_IIDR_REVISION, 0xf) | \
+ FIELD_PREP_CONST(MPAMF_IIDR_IMPLEMENTER, 0xfff))
+
/* The values for MSMON_CFG_MBWU_FLT.RWBW */
enum mon_filter_options {
COUNT_BOTH = 0,
@@ -255,6 +279,7 @@ struct mpam_class {
struct mpam_props props;
u32 nrdy_usec;
+ u16 quirks;
u8 level;
enum mpam_class_types type;
--
2.43.0
* [PATCH v4 37/41] arm_mpam: Add workaround for T241-MPAM-1
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: Shanker Donthineni <sdonthineni@nvidia.com>
On NVIDIA T241 (erratum T241-MPAM-1), the MPAM bandwidth partitioning
controls are not configured correctly: the hardware retains the default
configuration register values, which generally means bandwidth remains
unprovisioned.
To address the issue, follow the steps below after updating the MBW_MIN
and/or MBW_MAX registers.
- Perform 64b reads from all 12 bridge MPAM shadow registers at offsets
(0x360048 + slice*0x10000 + partid*8). These registers are read-only.
- Repeat the reads until all 12 shadow register values match.
pr_warn_once() if the values fail to match within 1000 iterations.
- Perform 64b writes with the value 0x0 to the two spare registers at
offsets 0x1b0000 and 0x1c0000.
In the hardware, writes to the MPAMCFG_MBW_MAX and MPAMCFG_MBW_MIN
registers are transformed into broadcast writes to the 12 shadow
registers. The final two writes to the spare registers cause a final rank
of downstream micro-architectural MPAM registers to be updated from the
shadow copies.
The intervening loop to read the 12 shadow registers helps avoid a race
condition where writes to the spare registers occur before all shadow
registers have been updated.
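The sequence above can be sketched in userspace against a simulated
device. The register offsets and the 12-slice count come from the
description. The accessor callbacks, struct fake_dev and its "lag"
behaviour are hypothetical stand-ins for real MMIO and exist only to
exercise the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NSLICES		12u
#define SPARE_REG0_OFF	0x1b0000u
#define SPARE_REG1_OFF	0x1c0000u

/* Byte offset of the shadow register for a given slice and PARTID */
static uint32_t shadow_reg_off(unsigned int slice, unsigned int partid)
{
	return 0x360048u + slice * 0x10000u + partid * 8u;
}

/* Hypothetical accessors standing in for 64-bit MMIO reads and writes */
typedef uint64_t (*read64_fn)(void *ctx, uint32_t off);
typedef void (*write64_fn)(void *ctx, uint32_t off, uint64_t val);

/* Returns true if all shadow registers agreed within max_tries passes */
static bool t241_sync_sketch(void *ctx, read64_fn rd, write64_fn wr,
			     unsigned int partid, unsigned int max_tries)
{
	bool settled = false;

	for (unsigned int i = 0; i < max_tries && !settled; i++) {
		uint64_t first = rd(ctx, shadow_reg_off(0, partid));
		unsigned int s;

		for (s = 1; s < NSLICES; s++) {
			if (rd(ctx, shadow_reg_off(s, partid)) != first)
				break;
		}
		settled = (s == NSLICES);
	}

	/* The spare registers are written even if the reads never agreed */
	wr(ctx, SPARE_REG0_OFF, 0);
	wr(ctx, SPARE_REG1_OFF, 0);

	return settled;
}

/* Simulated device: the last slice is stale for 'lag' reads */
struct fake_dev {
	uint64_t value;
	unsigned int lag;
	unsigned int spare_writes;
};

static uint64_t fake_read(void *ctx, uint32_t off)
{
	struct fake_dev *dev = ctx;
	unsigned int slice = (off - 0x360048u) / 0x10000u;

	if (slice == NSLICES - 1 && dev->lag) {
		dev->lag--;
		return ~dev->value;	/* stale contents */
	}
	return dev->value;
}

static void fake_write(void *ctx, uint32_t off, uint64_t val)
{
	struct fake_dev *dev = ctx;

	(void)val;
	if (off == SPARE_REG0_OFF || off == SPARE_REG1_OFF)
		dev->spare_writes++;
}
```

With a fake_dev whose lag is small, t241_sync_sketch() returns true after
a few passes; with a lag larger than max_tries it returns false, which is
the case the patch reports through pr_warn_once().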
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes from James:
Merged the min/max update into a single
mpam_quirk_post_config_change() helper. Stashed the t241_id in the msc
instead of carrying the physical address around. Test the msc quirk bit
instead of a static key.
Changes since rfc:
MPAM_IIDR_NVIDIA_T421 -> MPAM_IIDR_NVIDIA_T241
return err from init
Be specific about the errata in the init name,
mpam_enable_quirk_nvidia_t241 -> mpam_enable_quirk_nvidia_t241_1
Changes since v3:
parentheses
---
Documentation/arch/arm64/silicon-errata.rst | 2 +
drivers/resctrl/mpam_devices.c | 88 +++++++++++++++++++++
drivers/resctrl/mpam_internal.h | 9 +++
3 files changed, 99 insertions(+)
diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst
index a7ec57060f64..4e86b85fe3d6 100644
--- a/Documentation/arch/arm64/silicon-errata.rst
+++ b/Documentation/arch/arm64/silicon-errata.rst
@@ -246,6 +246,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| NVIDIA | T241 GICv3/4.x | T241-FABRIC-4 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
+| NVIDIA | T241 MPAM | T241-MPAM-1 | N/A |
++----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 |
+----------------+-----------------+-----------------+-----------------------------+
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 4c5b9810c41f..76c8cfcef3c0 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -29,6 +29,16 @@
#include "mpam_internal.h"
+/* Values for the T241 errata workaround */
+#define T241_CHIPS_MAX 4
+#define T241_CHIP_NSLICES 12
+#define T241_SPARE_REG0_OFF 0x1b0000
+#define T241_SPARE_REG1_OFF 0x1c0000
+#define T241_CHIP_ID(phys) FIELD_GET(GENMASK_ULL(44, 43), phys)
+#define T241_SHADOW_REG_OFF(sidx, pid) (0x360048 + (sidx) * 0x10000 + (pid) * 8)
+#define SMCCC_SOC_ID_T241 0x036b0241
+static void __iomem *t241_scratch_regs[T241_CHIPS_MAX];
+
/*
* mpam_list_lock protects the SRCU lists when writing. Once the
* mpam_enabled key is enabled these lists are read-only,
@@ -630,7 +640,45 @@ static struct mpam_msc_ris *mpam_get_or_create_ris(struct mpam_msc *msc,
return ERR_PTR(-ENOENT);
}
+static int mpam_enable_quirk_nvidia_t241_1(struct mpam_msc *msc,
+ const struct mpam_quirk *quirk)
+{
+ s32 soc_id = arm_smccc_get_soc_id_version();
+ struct resource *r;
+ phys_addr_t phys;
+
+ /*
+ * A mapping to a device other than the MSC is needed, check
+ * SOC_ID is NVIDIA T241 chip (036b:0241)
+ */
+ if (soc_id < 0 || soc_id != SMCCC_SOC_ID_T241)
+ return -EINVAL;
+
+ r = platform_get_resource(msc->pdev, IORESOURCE_MEM, 0);
+ if (!r)
+ return -EINVAL;
+
+ /* Find the internal registers base addr from the CHIP ID */
+ msc->t241_id = T241_CHIP_ID(r->start);
+ phys = FIELD_PREP(GENMASK_ULL(45, 44), msc->t241_id) | 0x19000000ULL;
+
+ t241_scratch_regs[msc->t241_id] = ioremap(phys, SZ_8M);
+ if (WARN_ON_ONCE(!t241_scratch_regs[msc->t241_id]))
+ return -EINVAL;
+
+ pr_info_once("Enabled workaround for NVIDIA T241 erratum T241-MPAM-1\n");
+
+ return 0;
+}
+
static const struct mpam_quirk mpam_quirks[] = {
+ {
+ /* NVIDIA t241 erratum T241-MPAM-1 */
+ .init = mpam_enable_quirk_nvidia_t241_1,
+ .iidr = MPAM_IIDR_NVIDIA_T241,
+ .iidr_mask = MPAM_IIDR_MATCH_ONE,
+ .workaround = T241_SCRUB_SHADOW_REGS,
+ },
{ NULL } /* Sentinel */
};
@@ -1378,6 +1426,44 @@ static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd)
__mpam_write_reg(msc, reg, bm);
}
+static void mpam_apply_t241_erratum(struct mpam_msc_ris *ris, u16 partid)
+{
+ int sidx, i, lcount = 1000;
+ void __iomem *regs;
+ u64 val0, val;
+
+ regs = t241_scratch_regs[ris->vmsc->msc->t241_id];
+
+ for (i = 0; i < lcount; i++) {
+ /* Read the shadow register at index 0 */
+ val0 = readq_relaxed(regs + T241_SHADOW_REG_OFF(0, partid));
+
+ /* Check if all the shadow registers have the same value */
+ for (sidx = 1; sidx < T241_CHIP_NSLICES; sidx++) {
+ val = readq_relaxed(regs +
+ T241_SHADOW_REG_OFF(sidx, partid));
+ if (val != val0)
+ break;
+ }
+ if (sidx == T241_CHIP_NSLICES)
+ break;
+ }
+
+ if (i == lcount)
+ pr_warn_once("t241: inconsistent values in shadow regs");
+
+ /* Write a value zero to spare registers to take effect of MBW conf */
+ writeq_relaxed(0, regs + T241_SPARE_REG0_OFF);
+ writeq_relaxed(0, regs + T241_SPARE_REG1_OFF);
+}
+
+static void mpam_quirk_post_config_change(struct mpam_msc_ris *ris, u16 partid,
+ struct mpam_config *cfg)
+{
+ if (mpam_has_quirk(T241_SCRUB_SHADOW_REGS, ris->vmsc->msc))
+ mpam_apply_t241_erratum(ris, partid);
+}
+
/* Called via IPI. Call while holding an SRCU reference */
static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
struct mpam_config *cfg)
@@ -1461,6 +1547,8 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
mpam_write_partsel_reg(msc, PRI, pri_val);
}
+ mpam_quirk_post_config_change(ris, partid, cfg);
+
mutex_unlock(&msc->part_sel_lock);
}
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index b161d13d7877..6b832f2100d9 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -130,6 +130,9 @@ struct mpam_msc {
void __iomem *mapped_hwpage;
size_t mapped_hwpage_sz;
+ /* Values only used on some platforms for quirks */
+ u32 t241_id;
+
struct mpam_garbage garbage;
};
@@ -216,6 +219,7 @@ struct mpam_props {
/* Workaround bits for msc->quirks */
enum mpam_device_quirks {
+ T241_SCRUB_SHADOW_REGS,
MPAM_QUIRK_LAST
};
@@ -236,6 +240,11 @@ struct mpam_quirk {
FIELD_PREP_CONST(MPAMF_IIDR_REVISION, 0xf) | \
FIELD_PREP_CONST(MPAMF_IIDR_IMPLEMENTER, 0xfff))
+#define MPAM_IIDR_NVIDIA_T241 (FIELD_PREP_CONST(MPAMF_IIDR_PRODUCTID, 0x241) | \
+ FIELD_PREP_CONST(MPAMF_IIDR_VARIANT, 0) | \
+ FIELD_PREP_CONST(MPAMF_IIDR_REVISION, 0) | \
+ FIELD_PREP_CONST(MPAMF_IIDR_IMPLEMENTER, 0x36b))
+
/* The values for MSMON_CFG_MBWU_FLT.RWBW */
enum mon_filter_options {
COUNT_BOTH = 0,
--
2.43.0
* [PATCH v4 38/41] arm_mpam: Add workaround for T241-MPAM-4
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: Shanker Donthineni <sdonthineni@nvidia.com>
In the T241 implementation of memory-bandwidth partitioning, in the absence
of contention for bandwidth, the minimum bandwidth setting can affect the
amount of achieved bandwidth. Specifically, the achieved bandwidth in the
absence of contention can settle to any value between the values of
MPAMCFG_MBW_MIN and MPAMCFG_MBW_MAX. Also, if MPAMCFG_MBW_MIN is set to
zero (below 0.78125%), once a core enters a throttled state, it will never
leave that state.
The first issue is not a concern if the MPAM software allows
MPAMCFG_MBW_MIN to be programmed through the sysfs interface. This patch
programs MBW_MIN=1 (0.78125%) whenever MPAMCFG_MBW_MIN=0 would otherwise
be programmed.
In the scenario where resctrl doesn't support the MBW_MIN interface via
sysfs, to achieve bandwidth closer to MBW_MAX in the absence of contention,
software should configure a relatively narrow gap between MBW_MIN and
MBW_MAX. The recommendation is to use a 5% gap to mitigate the problem.
Clear the MBW_MIN feature from the class to ensure we don't
accidentally change behaviour when resctrl adds support for an MBW_MIN
interface.
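To make the arithmetic concrete, here is a small userspace sketch of how
the MBW_MIN value could be derived from MBW_MAX under this workaround. It
treats the registers as 16-bit fixed-point fractions with bwa_wd
implemented bits. The value 0xffff for the maximum MBW_MAX setting and the
helper names are assumptions, and the sketch assumes an MBW_MAX value has
been configured.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed full-scale value of the 16-bit MBW_MAX field */
#define MBW_MAX_MAX	0xffffu

/* Smallest non-zero value the hardware can represent with bwa_wd bits */
static uint16_t min_granule(unsigned int bwa_wd)
{
	assert(bwa_wd >= 1 && bwa_wd <= 16);

	/* The low (16 - bwa_wd) bits are RES0, so one step is 1 << res0 */
	return (uint16_t)(1u << (16 - bwa_wd));
}

/* Place MBW_MIN roughly 5% of full scale below MBW_MAX */
static uint16_t min_from_max(uint16_t mbw_max)
{
	uint16_t delta = (uint16_t)((5u * MBW_MAX_MAX) / 100u - 1u);

	return mbw_max > delta ? (uint16_t)(mbw_max - delta) : 0;
}

/* Never return zero: clamp to the smallest representable step */
static uint16_t t241_mbw_min(uint16_t mbw_max, unsigned int bwa_wd)
{
	uint16_t floor = min_granule(bwa_wd);
	uint16_t val = min_from_max(mbw_max);

	return val < floor ? floor : val;
}
```

For a 7-bit field, min_granule(7) is 512 out of 65536, which is the
0.78125% quoted above. A small MBW_MAX such as 1000 falls inside the 5%
gap, so the result is clamped to that floor rather than zero.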
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
[ morse: Added as second quirk, adapted to use the new intermediate values
in mpam_extend_config() ]
Changes since rfc:
MPAM_IIDR_NVIDIA_T421 -> MPAM_IIDR_NVIDIA_T241
Handling when reset_mbw_min is set
Changes since v3:
Move the 5% gap policy back here
Clear mbw_min feature in class
---
Documentation/arch/arm64/silicon-errata.rst | 2 +
drivers/resctrl/mpam_devices.c | 50 +++++++++++++++++++--
drivers/resctrl/mpam_internal.h | 1 +
3 files changed, 50 insertions(+), 3 deletions(-)
diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst
index 4e86b85fe3d6..b18bc704d4a1 100644
--- a/Documentation/arch/arm64/silicon-errata.rst
+++ b/Documentation/arch/arm64/silicon-errata.rst
@@ -248,6 +248,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| NVIDIA | T241 MPAM | T241-MPAM-1 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
+| NVIDIA | T241 MPAM | T241-MPAM-4 | N/A |
++----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 |
+----------------+-----------------+-----------------+-----------------------------+
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 76c8cfcef3c0..37a105d95d66 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -679,6 +679,12 @@ static const struct mpam_quirk mpam_quirks[] = {
.iidr_mask = MPAM_IIDR_MATCH_ONE,
.workaround = T241_SCRUB_SHADOW_REGS,
},
+ {
+ /* NVIDIA t241 erratum T241-MPAM-4 */
+ .iidr = MPAM_IIDR_NVIDIA_T241,
+ .iidr_mask = MPAM_IIDR_MATCH_ONE,
+ .workaround = T241_FORCE_MBW_MIN_TO_ONE,
+ },
{ NULL } /* Sentinel */
};
@@ -1464,6 +1470,31 @@ static void mpam_quirk_post_config_change(struct mpam_msc_ris *ris, u16 partid,
mpam_apply_t241_erratum(ris, partid);
}
+static u16 mpam_wa_t241_force_mbw_min_to_one(struct mpam_props *props)
+{
+ u16 max_hw_value, min_hw_granule, res0_bits;
+
+ res0_bits = 16 - props->bwa_wd;
+ max_hw_value = ((1 << props->bwa_wd) - 1) << res0_bits;
+ min_hw_granule = ~max_hw_value;
+
+ return min_hw_granule + 1;
+}
+
+static u16 mpam_wa_t241_calc_min_from_max(struct mpam_config *cfg)
+{
+ u16 val = 0;
+
+ if (mpam_has_feature(mpam_feat_mbw_max, cfg)) {
+ u16 delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;
+
+ if (cfg->mbw_max > delta)
+ val = cfg->mbw_max - delta;
+ }
+
+ return val;
+}
+
/* Called via IPI. Call while holding an SRCU reference */
static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
struct mpam_config *cfg)
@@ -1506,9 +1537,19 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
mpam_write_partsel_reg(msc, MBW_PBM, cfg->mbw_pbm);
}
- if (mpam_has_feature(mpam_feat_mbw_min, rprops) &&
- mpam_has_feature(mpam_feat_mbw_min, cfg))
- mpam_write_partsel_reg(msc, MBW_MIN, 0);
+ if (mpam_has_feature(mpam_feat_mbw_min, rprops)) {
+ u16 val = 0;
+
+ if (mpam_has_quirk(T241_FORCE_MBW_MIN_TO_ONE, msc)) {
+ u16 min = mpam_wa_t241_force_mbw_min_to_one(rprops);
+
+ val = mpam_wa_t241_calc_min_from_max(cfg);
+ if (val < min)
+ val = min;
+ }
+
+ mpam_write_partsel_reg(msc, MBW_MIN, val);
+ }
if (mpam_has_feature(mpam_feat_mbw_max, rprops) &&
mpam_has_feature(mpam_feat_mbw_max, cfg)) {
@@ -2304,6 +2345,9 @@ static void mpam_enable_merge_class_features(struct mpam_component *comp)
list_for_each_entry(vmsc, &comp->vmsc, comp_list)
__class_props_mismatch(class, vmsc);
+
+ if (mpam_has_quirk(T241_FORCE_MBW_MIN_TO_ONE, class))
+ mpam_clear_feature(mpam_feat_mbw_min, &class->props);
}
/*
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index 6b832f2100d9..f6bf462058d9 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -220,6 +220,7 @@ struct mpam_props {
/* Workaround bits for msc->quirks */
enum mpam_device_quirks {
T241_SCRUB_SHADOW_REGS,
+ T241_FORCE_MBW_MIN_TO_ONE,
MPAM_QUIRK_LAST
};
--
2.43.0
* [PATCH v4 39/41] arm_mpam: Add workaround for T241-MPAM-6
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc, Shaopeng Tan
From: Shanker Donthineni <sdonthineni@nvidia.com>
The registers MSMON_MBWU_L and MSMON_MBWU return the number of requests
rather than the number of bytes transferred.
Bandwidth resource monitoring is performed at the last level cache, where
each request arrives at 64-byte granularity. The current implementation
returns the number of transactions received at the last level cache but
does not provide the value in bytes. Scaling by 64 gives an accurate byte
count to match the MPAM specification for the MSMON_MBWU and MSMON_MBWU_L
registers. This patch fixes the issue by reporting the actual number of
bytes instead of the number of transactions from __ris_msmon_read().
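The scaling can be sketched as below. This is a simplified userspace
model, not the driver code: the function name is hypothetical and the
counter width is passed in as a parameter. It shows why the overflow
correction has to be scaled by the same factor as the raw count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Each request seen at the last level cache moves 64 bytes */
#define T241_REQUEST_BYTES	64u

/*
 * Convert a raw request count into bytes and fold in the running
 * overflow correction. counter_bits is the width of the hardware
 * counter; one wrap of the counter is worth 2^counter_bits requests.
 */
static uint64_t mbwu_bytes(uint64_t raw, bool overflowed,
			   unsigned int counter_bits, uint64_t *correction)
{
	/* Keep the scaled wrap value within 64 bits */
	assert(counter_bits <= 57);

	if (overflowed)
		*correction += (UINT64_C(1) << counter_bits) *
			       T241_REQUEST_BYTES;

	return raw * T241_REQUEST_BYTES + *correction;
}
```

If only the raw value were scaled, every counter wrap would under-report
by a factor of 64, so the two multiplications have to stay in step.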
Tested-by: Gavin Shan <gshan@redhat.com>
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since rfc:
MPAM_IIDR_NVIDIA_T421 -> MPAM_IIDR_NVIDIA_T241
Don't apply workaround to MSMON_MBWU_LWD
---
Documentation/arch/arm64/silicon-errata.rst | 2 ++
drivers/resctrl/mpam_devices.c | 26 +++++++++++++++++++--
drivers/resctrl/mpam_internal.h | 1 +
3 files changed, 27 insertions(+), 2 deletions(-)
diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst
index b18bc704d4a1..e810b2a8f40e 100644
--- a/Documentation/arch/arm64/silicon-errata.rst
+++ b/Documentation/arch/arm64/silicon-errata.rst
@@ -250,6 +250,8 @@ stable kernels.
+----------------+-----------------+-----------------+-----------------------------+
| NVIDIA | T241 MPAM | T241-MPAM-4 | N/A |
+----------------+-----------------+-----------------+-----------------------------+
+| NVIDIA | T241 MPAM | T241-MPAM-6 | N/A |
++----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
| Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 |
+----------------+-----------------+-----------------+-----------------------------+
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 37a105d95d66..51cf29bda66e 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -685,6 +685,12 @@ static const struct mpam_quirk mpam_quirks[] = {
.iidr_mask = MPAM_IIDR_MATCH_ONE,
.workaround = T241_FORCE_MBW_MIN_TO_ONE,
},
+ {
+ /* NVIDIA t241 erratum T241-MPAM-6 */
+ .iidr = MPAM_IIDR_NVIDIA_T241,
+ .iidr_mask = MPAM_IIDR_MATCH_ONE,
+ .workaround = T241_MBW_COUNTER_SCALE_64,
+ },
{ NULL } /* Sentinel */
};
@@ -1146,7 +1152,7 @@ static void write_msmon_ctl_flt_vals(struct mon_read *m, u32 ctl_val,
}
}
-static u64 mpam_msmon_overflow_val(enum mpam_device_features type)
+static u64 __mpam_msmon_overflow_val(enum mpam_device_features type)
{
/* TODO: implement scaling counters */
switch (type) {
@@ -1161,6 +1167,18 @@ static u64 mpam_msmon_overflow_val(enum mpam_device_features type)
}
}
+static u64 mpam_msmon_overflow_val(enum mpam_device_features type,
+ struct mpam_msc *msc)
+{
+ u64 overflow_val = __mpam_msmon_overflow_val(type);
+
+ if (mpam_has_quirk(T241_MBW_COUNTER_SCALE_64, msc) &&
+ type != mpam_feat_msmon_mbwu_63counter)
+ overflow_val *= 64;
+
+ return overflow_val;
+}
+
static void __ris_msmon_read(void *arg)
{
u64 now;
@@ -1251,13 +1269,17 @@ static void __ris_msmon_read(void *arg)
now = FIELD_GET(MSMON___VALUE, now);
}
+ if (mpam_has_quirk(T241_MBW_COUNTER_SCALE_64, msc) &&
+ m->type != mpam_feat_msmon_mbwu_63counter)
+ now *= 64;
+
if (nrdy)
break;
mbwu_state = &ris->mbwu_state[ctx->mon];
if (overflow)
- mbwu_state->correction += mpam_msmon_overflow_val(m->type);
+ mbwu_state->correction += mpam_msmon_overflow_val(m->type, msc);
/*
* Include bandwidth consumed before the last hardware reset and
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index f6bf462058d9..7a3a46b94913 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -221,6 +221,7 @@ struct mpam_props {
enum mpam_device_quirks {
T241_SCRUB_SHADOW_REGS,
T241_FORCE_MBW_MIN_TO_ONE,
+ T241_MBW_COUNTER_SCALE_64,
MPAM_QUIRK_LAST
};
--
2.43.0
* [PATCH v4 40/41] arm_mpam: Quirk CMN-650's CSU NRDY behaviour
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (38 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 39/41] arm_mpam: Add workaround for T241-MPAM-6 Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 41/41] arm64: mpam: Add initial MPAM documentation Ben Horgan
` (2 subsequent siblings)
42 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc
From: James Morse <james.morse@arm.com>
CMN-650 is afflicted with an erratum where the CSU NRDY bit never clears.
This tells us the monitor never finishes scanning the cache. The erratum
document says to wait the maximum time, then ignore the field.
Add a flag to indicate whether this is the final attempt to read the
counter, and when this quirk is applied, ignore the NRDY field.
This means accesses to this counter will always retry, even if the counter
was previously programmed to the same values.
The counter value is not expected to be stable; it drifts up and down with
each allocation and eviction. The CSU register provides the value for a
point in time.
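
A minimal sketch of the decision, in userspace C and under stated assumptions:
the sketch_* names are made up for the example, and the register layout (NRDY
in bit 31, value in the bits below it) is assumed here rather than taken from
this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for the sketch: NRDY in bit 31, value in bits 30:0. */
#define SKETCH_NRDY	(UINT32_C(1) << 31)
#define SKETCH_VALUE	(SKETCH_NRDY - 1)

static_assert((SKETCH_NRDY & SKETCH_VALUE) == 0, "fields must not overlap");

struct sketch_csu_read {
	uint32_t value;
	bool nrdy;
};

/*
 * Decode a CSU monitor register. With the quirk, NRDY is only ignored once
 * the caller has already waited the maximum time; an earlier read still
 * reports not-ready so that the caller retries.
 */
static struct sketch_csu_read sketch_decode_csu(uint32_t reg,
						bool ignore_nrdy_quirk,
						bool waited_timeout)
{
	struct sketch_csu_read r = {
		.value = reg & SKETCH_VALUE,
		.nrdy = (reg & SKETCH_NRDY) != 0,
	};

	if (ignore_nrdy_quirk && waited_timeout)
		r.nrdy = false;

	return r;
}
```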
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes since v3:
parentheses in macro
---
Documentation/arch/arm64/silicon-errata.rst | 3 +++
drivers/resctrl/mpam_devices.c | 12 ++++++++++++
drivers/resctrl/mpam_internal.h | 6 ++++++
3 files changed, 21 insertions(+)
diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst
index e810b2a8f40e..3667650036fb 100644
--- a/Documentation/arch/arm64/silicon-errata.rst
+++ b/Documentation/arch/arm64/silicon-errata.rst
@@ -213,6 +213,9 @@ stable kernels.
| ARM | GIC-700 | #2941627 | ARM64_ERRATUM_2941627 |
+----------------+-----------------+-----------------+-----------------------------+
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | CMN-650 | #3642720 | N/A |
++----------------+-----------------+-----------------+-----------------------------+
++----------------+-----------------+-----------------+-----------------------------+
| Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_845719 |
+----------------+-----------------+-----------------+-----------------------------+
| Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_843419 |
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 51cf29bda66e..460ea98a1c92 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -691,6 +691,12 @@ static const struct mpam_quirk mpam_quirks[] = {
.iidr_mask = MPAM_IIDR_MATCH_ONE,
.workaround = T241_MBW_COUNTER_SCALE_64,
},
+ {
+ /* ARM CMN-650 CSU erratum 3642720 */
+ .iidr = MPAM_IIDR_ARM_CMN_650,
+ .iidr_mask = MPAM_IIDR_MATCH_ONE,
+ .workaround = IGNORE_CSU_NRDY,
+ },
{ NULL } /* Sentinel */
};
@@ -1003,6 +1009,7 @@ struct mon_read {
enum mpam_device_features type;
u64 *val;
int err;
+ bool waited_timeout;
};
static bool mpam_ris_has_mbwu_long_counter(struct mpam_msc_ris *ris)
@@ -1249,6 +1256,10 @@ static void __ris_msmon_read(void *arg)
if (mpam_has_feature(mpam_feat_msmon_csu_hw_nrdy, rprops))
nrdy = now & MSMON___NRDY;
now = FIELD_GET(MSMON___VALUE, now);
+
+ if (mpam_has_quirk(IGNORE_CSU_NRDY, msc) && m->waited_timeout)
+ nrdy = false;
+
break;
case mpam_feat_msmon_mbwu_31counter:
case mpam_feat_msmon_mbwu_44counter:
@@ -1386,6 +1397,7 @@ int mpam_msmon_read(struct mpam_component *comp, struct mon_cfg *ctx,
.ctx = ctx,
.type = type,
.val = val,
+ .waited_timeout = true,
};
*val = 0;
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index 7a3a46b94913..b6d083a73fe2 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -222,6 +222,7 @@ enum mpam_device_quirks {
T241_SCRUB_SHADOW_REGS,
T241_FORCE_MBW_MIN_TO_ONE,
T241_MBW_COUNTER_SCALE_64,
+ IGNORE_CSU_NRDY,
MPAM_QUIRK_LAST
};
@@ -247,6 +248,11 @@ struct mpam_quirk {
FIELD_PREP_CONST(MPAMF_IIDR_REVISION, 0) | \
FIELD_PREP_CONST(MPAMF_IIDR_IMPLEMENTER, 0x36b))
+#define MPAM_IIDR_ARM_CMN_650 (FIELD_PREP_CONST(MPAMF_IIDR_PRODUCTID, 0) | \
+ FIELD_PREP_CONST(MPAMF_IIDR_VARIANT, 0) | \
+ FIELD_PREP_CONST(MPAMF_IIDR_REVISION, 0) | \
+ FIELD_PREP_CONST(MPAMF_IIDR_IMPLEMENTER, 0x43b))
+
/* The values for MSMON_CFG_MBWU_FLT.RWBW */
enum mon_filter_options {
COUNT_BOTH = 0,
--
2.43.0
* [PATCH v4 41/41] arm64: mpam: Add initial MPAM documentation
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (39 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 40/41] arm_mpam: Quirk CMN-650's CSU NRDY behaviour Ben Horgan
@ 2026-02-03 21:43 ` Ben Horgan
2026-02-05 16:57 ` Catalin Marinas
2026-02-05 17:05 ` Jonathan Cameron
2026-02-09 8:25 ` [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Shaopeng Tan (Fujitsu)
2026-02-14 9:40 ` Zeng Heng
42 siblings, 2 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-03 21:43 UTC (permalink / raw)
To: ben.horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, zengheng4,
linux-doc
MPAM (Memory Partitioning and Monitoring) is now exposed to user-space via
resctrl. Add some documentation so the user knows what features to expect.
Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
---
Changes by Ben:
Some tidying, update for current heuristics
---
Documentation/arch/arm64/index.rst | 1 +
Documentation/arch/arm64/mpam.rst | 93 ++++++++++++++++++++++++++++++
2 files changed, 94 insertions(+)
create mode 100644 Documentation/arch/arm64/mpam.rst
diff --git a/Documentation/arch/arm64/index.rst b/Documentation/arch/arm64/index.rst
index 6a012c98bdcd..189fa760dade 100644
--- a/Documentation/arch/arm64/index.rst
+++ b/Documentation/arch/arm64/index.rst
@@ -23,6 +23,7 @@ ARM64 Architecture
memory
memory-tagging-extension
mops
+ mpam
perf
pointer-authentication
ptdump
diff --git a/Documentation/arch/arm64/mpam.rst b/Documentation/arch/arm64/mpam.rst
new file mode 100644
index 000000000000..0769bccff25e
--- /dev/null
+++ b/Documentation/arch/arm64/mpam.rst
@@ -0,0 +1,93 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====
+MPAM
+====
+
+What is MPAM
+============
+ MPAM (Memory Partitioning and Monitoring) is a feature in the CPUs and memory
+ system components such as the caches or memory controllers that allows memory
+ traffic to be labelled, partitioned and monitored.
+
+ Traffic is labelled by the CPU, based on the control or monitor group the
+ current task is assigned to using resctrl. Partitioning policy can be set
+ using the schemata file in resctrl, and monitor values read via resctrl.
+ See Documentation/filesystems/resctrl.rst for more details.
+
+ This allows tasks that share memory system resources such as caches, so-called
+ noisy neighbours, to be isolated from each other according to the partitioning
+ policy.
+
+Supported Platforms
+===================
+ Use of this feature requires CPU support, support in the memory system
+ components, and a description from firmware of where the MPAM device controls
+ are in the MMIO address space (e.g. the 'MPAM' ACPI table).
+
+ The MMIO device that provides MPAM controls/monitors for a memory system
+ component is called a memory system component (MSC).
+
+ Because the user interface to MPAM is via resctrl, only MPAM features that are
+ compatible with resctrl can be exposed to user-space.
+
+ MSC are considered as a group based on the topology. MSC that correspond with
+ the L3 cache are considered together; it is not possible to mix MSC between L2
+ and L3 to 'cover' a resctrl schema.
+
+ The supported features are:
+ * Cache portion bitmap controls (CPOR) on the L2 or L3 caches. To expose
+ CPOR at L2 or L3, every CPU must have a corresponding CPU cache at this
+ level that also supports the feature. Mismatched big/little platforms are
+ not supported as resctrl's controls would then also depend on task
+ placement.
+
+ * Memory bandwidth maximum controls (MBW_MAX) on or after the L3 cache.
+ resctrl uses the L3 cache-id to identify where the memory bandwidth
+ control is applied. For this reason the platform must have an L3 cache
+ with cache-id's supplied by firmware. (It doesn't need to support MPAM.)
+
+ To be exported as the 'MB' schema, the topology of the group of MSC chosen
+ must match the topology of the L3 cache so that the cache-id's can be
+ repainted. For example: Platforms with Memory bandwidth maximum controls
+ on CPU-less NUMA nodes cannot expose the 'MB' schema to resctrl as these
+ nodes do not have a corresponding L3 cache. If the memory bandwidth
+ control is on the memory rather than the L3 then there must be a single
+ global L3 as otherwise it is unknown which L3 the traffic came from.
+
+ When the MPAM driver finds multiple groups of MSC it can use for the 'MB'
+ schema, it prefers the group closest to the L3 cache.
+
+ * Cache Storage Usage (CSU) counters can expose the 'llc_occupancy' provided
+ there is at least one CSU monitor on each MSC that makes up the L3 group.
+ Exposing CSU counters from other caches or devices is not supported.
+
+ * Memory Bandwidth Usage (MBWU) on or after the L3 cache. resctrl uses the
+ L3 cache-id to identify where the memory bandwidth is measured. For this
+ reason the platform must have an L3 cache with cache-id's supplied by
+ firmware. (It doesn't need to support MPAM.)
+
+ Memory bandwidth monitoring makes use of MBWU monitors in each MSC that
+ makes up the L3 group. If there are more monitors than the maximum number
+ of control and monitor groups, these will be allocated and configured at
+ boot. Otherwise, the monitors will not be usable as they are required to
+ be free running. If the memory bandwidth monitoring is on the memory
+ rather than the L3 then there must be a single global L3 as otherwise it
+ is unknown which L3 the traffic came from.
+
+ To expose 'mbm_total_bytes', the topology of the group of MSC chosen must
+ match the topology of the L3 cache so that the cache-id's can be
+ repainted. For example: Platforms with Memory bandwidth monitors on
+ CPU-less NUMA nodes cannot expose 'mbm_total_bytes' as these nodes do not
+ have a corresponding L3 cache. 'mbm_local_bytes' is not exposed as MPAM
+ cannot distinguish local traffic from global traffic.
+
+Feature emulation
+=================
+ MPAM will emulate the Code Data Prioritisation (CDP) feature on all platforms.
+
+Reporting Bugs
+==============
+ If you are not seeing the counters or controls you expect please share the
+ debug messages produced when enabling dynamic debug and booting with:
+ dyndbg="file mpam_resctrl.c +pl"
--
2.43.0
* Re: [PATCH v4 06/41] arm64: mpam: Drop the CONFIG_EXPERT restriction
2026-02-03 21:43 ` [PATCH v4 06/41] arm64: mpam: Drop the CONFIG_EXPERT restriction Ben Horgan
@ 2026-02-05 14:08 ` Jonathan Cameron
2026-02-05 16:21 ` Catalin Marinas
1 sibling, 0 replies; 89+ messages in thread
From: Jonathan Cameron @ 2026-02-05 14:08 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc
On Tue, 3 Feb 2026 21:43:07 +0000
Ben Horgan <ben.horgan@arm.com> wrote:
> In anticipation of MPAM being useful remove the CONFIG_EXPERT restriction.
>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
* Re: [PATCH v4 04/41] arm64: mpam: Context switch the MPAM registers
2026-02-03 21:43 ` [PATCH v4 04/41] arm64: mpam: Context switch the MPAM registers Ben Horgan
@ 2026-02-05 16:16 ` Catalin Marinas
0 siblings, 0 replies; 89+ messages in thread
From: Catalin Marinas @ 2026-02-05 16:16 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
On Tue, Feb 03, 2026 at 09:43:05PM +0000, Ben Horgan wrote:
> From: James Morse <james.morse@arm.com>
>
> MPAM allows traffic in the SoC to be labeled by the OS, these labels are
> used to apply policy in caches and bandwidth regulators, and to monitor
> traffic in the SoC. The label is made up of a PARTID and PMG value. The x86
> equivalent calls these CLOSID and RMID, but they don't map precisely.
>
> MPAM has two CPU system registers that are used to hold the PARTID and PMG
> values that traffic generated at each exception level will use. These can
> be set per-task by the resctrl file system. (resctrl is the de facto
> interface for controlling this stuff).
>
> Add a helper to switch this.
>
> struct task_struct's separate CLOSID and RMID fields are insufficient to
> implement resctrl using MPAM, as resctrl can change the PARTID (CLOSID) and
> PMG (sort of like the RMID) separately. On x86, the rmid is an independent
> number, so a race that writes a mismatched closid and rmid into hardware is
> benign. On arm64, the pmg bits extend the partid.
> (i.e. partid-5 has a pmg-0 that is not the same as partid-6's pmg-0). In
> this case, mismatching the values will 'dirty' a pmg value that resctrl
> believes is clean, and is not tracking with its 'limbo' code.
>
> To avoid this, the partid and pmg are always read and written as a
> pair. This requires a new u64 field. In struct task_struct there are two
> u32, rmid and closid for the x86 case, but as we can't use them here do
> something else. Add this new field, mpam_partid_pmg, to struct thread_info
> to avoid adding more architecture specific code to struct task_struct.
> Always use READ_ONCE()/WRITE_ONCE() when accessing this field.
>
> Resctrl allows a per-cpu 'default' value to be set, this overrides the
> values when scheduling a task in the default control-group, which has
> PARTID 0. The way 'code data prioritisation' gets emulated means the
> register value for the default group needs to be a variable.
>
> The current system register value is kept in a per-cpu variable to avoid
> writing to the system register if the value isn't going to change. Writes
> to this register may reset the hardware state for regulating bandwidth.
>
> Finally, there is no reason to context switch these registers unless there
> is a driver changing the values in struct task_struct. Hide the whole thing
> behind a static key. This also allows the driver to disable MPAM in
> response to errors reported by hardware. Move the existing static key to
> belong to the arch code, as in the future the MPAM driver may become a
> loadable module.
>
> All this should depend on whether there is an MPAM driver, hide it behind
> CONFIG_ARM64_MPAM.
>
> Tested-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Tested-by: Peter Newman <peternewman@google.com>
> CC: Amit Singh Tomar <amitsinght@marvell.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
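
To illustrate the "always read and written as a pair" point from the commit
message, here is a minimal userspace sketch. Everything in it is an
assumption for the example: the sketch_* names, the bit layout of the combined
value, and the use of relaxed C11 atomics as a stand-in for
READ_ONCE()/WRITE_ONCE(). It is not the layout the patch uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the single field added to struct thread_info. */
struct sketch_thread_info {
	_Atomic uint64_t partid_pmg;
};

/* Hypothetical layout: PARTID in bits 15:0, PMG in bits 23:16. */
static inline uint64_t sketch_pack(uint16_t partid, uint8_t pmg)
{
	return ((uint64_t)pmg << 16) | partid;
}

/* One store publishes both values, so a reader never sees a mixed pair. */
static inline void sketch_set(struct sketch_thread_info *ti,
			      uint16_t partid, uint8_t pmg)
{
	atomic_store_explicit(&ti->partid_pmg, sketch_pack(partid, pmg),
			      memory_order_relaxed);
}

/* One load fetches both values; they are split only after the read. */
static inline void sketch_get(struct sketch_thread_info *ti,
			      uint16_t *partid, uint8_t *pmg)
{
	uint64_t v;

	assert(partid && pmg);

	v = atomic_load_explicit(&ti->partid_pmg, memory_order_relaxed);
	*partid = (uint16_t)(v & 0xffff);
	*pmg = (uint8_t)((v >> 16) & 0xff);
}
```

With two separate fields, a reader could observe the new PARTID alongside the
old PMG, which is the mismatch the commit message describes as dirtying a PMG
that resctrl believes is clean.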
* Re: [PATCH v4 05/41] arm64: mpam: Re-initialise MPAM regs when CPU comes online
2026-02-03 21:43 ` [PATCH v4 05/41] arm64: mpam: Re-initialise MPAM regs when CPU comes online Ben Horgan
@ 2026-02-05 16:20 ` Catalin Marinas
0 siblings, 0 replies; 89+ messages in thread
From: Catalin Marinas @ 2026-02-05 16:20 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
On Tue, Feb 03, 2026 at 09:43:06PM +0000, Ben Horgan wrote:
> From: James Morse <james.morse@arm.com>
>
> Now that the MPAM system registers are expected to have values that change,
> reprogram them based on the previous value when a CPU is brought online.
>
> Previously MPAM's 'default PARTID' of 0 was always used for MPAM in
> kernel-space as this is the PARTID that hardware guarantees to
> reset. Because there are a limited number of PARTID, this value is exposed
> to user-space, meaning resctrl changes to the resctrl default group would
> also affect kernel threads. Instead, use the task's PARTID value for
> kernel work on behalf of user-space too. The default of 0 is kept for both
> user-space and kernel-space when MPAM is not enabled.
>
> Tested-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
* Re: [PATCH v4 06/41] arm64: mpam: Drop the CONFIG_EXPERT restriction
2026-02-03 21:43 ` [PATCH v4 06/41] arm64: mpam: Drop the CONFIG_EXPERT restriction Ben Horgan
2026-02-05 14:08 ` Jonathan Cameron
@ 2026-02-05 16:21 ` Catalin Marinas
1 sibling, 0 replies; 89+ messages in thread
From: Catalin Marinas @ 2026-02-05 16:21 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc
On Tue, Feb 03, 2026 at 09:43:07PM +0000, Ben Horgan wrote:
> In anticipation of MPAM being useful remove the CONFIG_EXPERT restriction.
>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
* Re: [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource
2026-02-03 21:43 ` [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource Ben Horgan
@ 2026-02-05 16:50 ` Jonathan Cameron
2026-02-13 7:38 ` Zeng Heng
2026-02-18 16:40 ` Ben Horgan
2026-02-10 6:20 ` Shaopeng Tan (Fujitsu)
1 sibling, 2 replies; 89+ messages in thread
From: Jonathan Cameron @ 2026-02-05 16:50 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc
On Tue, 3 Feb 2026 21:43:27 +0000
Ben Horgan <ben.horgan@arm.com> wrote:
> From: James Morse <james.morse@arm.com>
>
> resctrl supports 'MB', as a percentage throttling of traffic from the
> L3. This is the control that mba_sc uses, so ideally the class chosen
> should be as close as possible to the counters used for mbm_total. If
> there is a single L3 and the topology of the memory matches then the
> traffic at the memory controller will be equivalent to that at egress of
> the L3. If these conditions are met allow the memory class to back MB.
>
> MB's percentage control should be backed either with the fixed point
> fraction MBW_MAX or bandwidth portion bitmaps. The bandwidth portion
> bitmaps are not used as it's tricky to pick which bits to use to avoid
> contention, and it may be possible to expose them as something other than a
> percentage in the future.
I'm very curious to know whether this heuristic is actually useful on many
systems or whether many / most of them fail this 'shape' heuristic.
Otherwise, just comments on the placement of __free related declarations.
See the guidance in cleanup.h for that.
With those moved,
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>
> CC: Zeng Heng <zengheng4@huawei.com>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> ---
> Changes since v2:
> Code flow change
> Commit message 'or'
>
> Changes since v3:
> initialise tmp_cpumask
> update commit message
> check the traffic matches l3
> update comment on candidate_class update, only mbm_total
> drop tags due to rework
> ---
> drivers/resctrl/mpam_resctrl.c | 275 ++++++++++++++++++++++++++++++++-
> 1 file changed, 274 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
> index 25950e667077..4cca3694978d 100644
> --- a/drivers/resctrl/mpam_resctrl.c
> +++ b/drivers/resctrl/mpam_resctrl.c
> +/*
> + * topology_matches_l3() - Is the provided class the same shape as L3
> + * @victim: The class we'd like to pretend is L3.
> + *
> + * resctrl expects all the world's a Xeon, and all counters are on the
> + * L3. We allow some mapping counters on other classes. This requires
> + * that the CPU->domain mapping is the same kind of shape.
> + *
> + * Using cacheinfo directly would make this work even if resctrl can't
> + * use the L3 - but cacheinfo can't tell us anything about offline CPUs.
> + * Using the L3 resctrl domain list also depends on CPUs being online.
> + * Using the mpam_class we picked for L3 so we can use its domain list
> + * assumes that there are MPAM controls on the L3.
> + * Instead, this path eventually uses the mpam_get_cpumask_from_cache_id()
> + * helper which can tell us about offline CPUs ... but getting the cache_id
> + * to start with relies on at least one CPU per L3 cache being online at
> + * boot.
> + *
> + * Walk the victim component list and compare the affinity mask with the
> + * corresponding L3. The topology matches if each victim:component's affinity
> + * mask is the same as the CPU's corresponding L3's. These lists/masks are
> + * computed from firmware tables so don't change at runtime.
> + */
> +static bool topology_matches_l3(struct mpam_class *victim)
> +{
> + int cpu, err;
> + struct mpam_component *victim_iter;
> + cpumask_var_t __free(free_cpumask_var) tmp_cpumask = CPUMASK_VAR_NULL;
Same as below. Move it down right next to the alloc.
> +
> + lockdep_assert_cpus_held();
> +
> + if (!alloc_cpumask_var(&tmp_cpumask, GFP_KERNEL))
> + return false;
> +
> + guard(srcu)(&mpam_srcu);
> + list_for_each_entry_srcu(victim_iter, &victim->components, class_list,
> + srcu_read_lock_held(&mpam_srcu)) {
> + if (cpumask_empty(&victim_iter->affinity)) {
> + pr_debug("class %u has CPU-less component %u - can't match L3!\n",
> + victim->level, victim_iter->comp_id);
> + return false;
> + }
> +
> + cpu = cpumask_any_and(&victim_iter->affinity, cpu_online_mask);
> + if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
> + return false;
> +
> + cpumask_clear(tmp_cpumask);
> + err = find_l3_equivalent_bitmask(cpu, tmp_cpumask);
> + if (err) {
> + pr_debug("Failed to find L3's equivalent component to class %u component %u\n",
> + victim->level, victim_iter->comp_id);
> + return false;
> + }
> +
> + /* Any differing bits in the affinity mask? */
> + if (!cpumask_equal(tmp_cpumask, &victim_iter->affinity)) {
> + pr_debug("class %u component %u has Mismatched CPU mask with L3 equivalent\n"
> + "L3:%*pbl != victim:%*pbl\n",
> + victim->level, victim_iter->comp_id,
> + cpumask_pr_args(tmp_cpumask),
> + cpumask_pr_args(&victim_iter->affinity));
> +
> + return false;
> + }
> + }
> +
> + return true;
> +}
> +
> +/*
> + * Test if the traffic for a class matches that at egress from the L3. For
> + * MSC at memory controllers this is only possible if there is a single L3
> + * as otherwise the counters at the memory can include bandwidth from the
> + * non-local L3.
> + */
> +static bool traffic_matches_l3(struct mpam_class *class) {
> + int err, cpu;
> + cpumask_var_t __free(free_cpumask_var) tmp_cpumask = CPUMASK_VAR_NULL;
> +
> + lockdep_assert_cpus_held();
> +
> + if (class->type == MPAM_CLASS_CACHE && class->level == 3)
> + return true;
> +
> + if (class->type == MPAM_CLASS_CACHE && class->level != 3) {
> + pr_debug("class %u is a different cache from L3\n", class->level);
> + return false;
> + }
> +
> + if (class->type != MPAM_CLASS_MEMORY) {
> + pr_debug("class %u is neither of type cache or memory\n", class->level);
> + return false;
> + }
> +
I would suggest following the guidance in cleanup.h to put the declaration of
the destructor and constructor together. That would mean bringing the
declaration of tmp_cpumask down here. The 'set it to NULL at the top' pattern
got some firm pushback from Linus a while back.
> + if (!alloc_cpumask_var(&tmp_cpumask, GFP_KERNEL)) {
> + pr_debug("cpumask allocation failed\n");
> + return false;
> + }
> +
> + if (class->type != MPAM_CLASS_MEMORY) {
> + pr_debug("class %u is neither of type cache or memory\n",
> + class->level);
> + return false;
> + }
> +
> + cpu = cpumask_any_and(&class->affinity, cpu_online_mask);
> + err = find_l3_equivalent_bitmask(cpu, tmp_cpumask);
> + if (err) {
> + pr_debug("Failed to find L3 downstream to cpu %d\n", cpu);
> + return false;
> + }
> +
> + if (!cpumask_equal(tmp_cpumask, cpu_possible_mask)) {
> + pr_debug("There is more than one L3\n");
> + return false;
> + }
> +
> + /* Be strict; the traffic might stop in the intermediate cache. */
> + if (get_cpu_cacheinfo_id(cpu, 4) != -1) {
> + pr_debug("L3 isn't the last level of cache\n");
> + return false;
> + }
> +
> + return true;
> +}
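
The placement being asked for can be sketched in userspace C. This is only an
approximation under stated assumptions: the sketch_* names are made up, and
the compiler's cleanup attribute stands in for the kernel's __free() helper,
with calloc()/free() in place of alloc_cpumask_var()/free_cpumask_var().

```c
#include <stdbool.h>
#include <stdlib.h>

static int sketch_frees;	/* counts destructor calls, for illustration */

/* Destructor: runs automatically when the annotated variable leaves scope. */
static void sketch_free_mask(unsigned long **p)
{
	if (*p) {
		free(*p);
		sketch_frees++;
	}
}

#define sketch_free	__attribute__((cleanup(sketch_free_mask)))

/* Out-parameter allocator, shaped like alloc_cpumask_var(). */
static bool sketch_alloc_mask(unsigned long **p)
{
	*p = calloc(1, sizeof(**p));
	return *p != NULL;
}

static bool sketch_check(bool early_exit)
{
	/* Early returns come first; nothing is allocated or tracked yet. */
	if (early_exit)
		return false;

	/* Declaration sits directly above the allocation that fills it. */
	unsigned long *mask sketch_free = NULL;
	if (!sketch_alloc_mask(&mask))
		return false;

	mask[0] = 1;
	return mask[0] == 1;
}
```

Keeping the declaration next to the allocation means the early-return paths
above it never involve the cleanup variable at all, and a reader sees the
constructor and destructor in one place.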
* Re: [PATCH v4 08/41] arm64: mpam: Add cpu_pm notifier to restore MPAM sysregs
2026-02-03 21:43 ` [PATCH v4 08/41] arm64: mpam: Add cpu_pm notifier to restore MPAM sysregs Ben Horgan
@ 2026-02-05 16:54 ` Catalin Marinas
0 siblings, 0 replies; 89+ messages in thread
From: Catalin Marinas @ 2026-02-05 16:54 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
On Tue, Feb 03, 2026 at 09:43:09PM +0000, Ben Horgan wrote:
> From: James Morse <james.morse@arm.com>
>
> The MPAM system registers will be lost if the CPU is reset during PSCI's
> CPU_SUSPEND.
>
> Add a PM notifier to restore them.
>
> mpam_thread_switch(current) can't be used as this won't make any changes if
> the in-memory copy says the register already has the correct value. In
> reality the system register is UNKNOWN out of reset.
>
> Tested-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
* Re: [PATCH v4 27/41] arm_mpam: resctrl: Add support for csu counters
2026-02-03 21:43 ` [PATCH v4 27/41] arm_mpam: resctrl: Add support for csu counters Ben Horgan
@ 2026-02-05 16:55 ` Jonathan Cameron
0 siblings, 0 replies; 89+ messages in thread
From: Jonathan Cameron @ 2026-02-05 16:55 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc
On Tue, 3 Feb 2026 21:43:28 +0000
Ben Horgan <ben.horgan@arm.com> wrote:
> From: James Morse <james.morse@arm.com>
>
> resctrl exposes a counter via a file named llc_occupancy. This isn't really
> a counter as its value goes up and down, this is a snapshot of the cache
> storage usage monitor.
>
> Add some picking code which will only find an L3. The resctrl counter
> file is called llc_occupancy but we don't check it is the last one as
> it is already identified as L3.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Co-developed-by: Dave Martin <dave.martin@arm.com>
> Signed-off-by: Dave Martin <dave.martin@arm.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> ---
> Changes since rfc:
> Allow csu counters however many partid or pmg there are
> else if -> if
> reduce scope of local variables
> drop has_csu
>
> Changes since v2:
> return -> break so works for mbwu in later patch
> add for_each_mpam_resctrl_mon
> return error from mpam_resctrl_monitor_init(). It may fail when is abmc
> allocation introduced in a later patch.
> Squashed in patch from Dave Martin:
> https://lore.kernel.org/lkml/20250820131621.54983-1-Dave.Martin@arm.com/
>
> Changes since v3:
> resctrl_enable_mon_event() signature update
> Restrict the events considered
> num-rmid update
> Use raw_smp_processor_id()
> Tighten heuristics:
> Make sure it is the L3
> Please shout if this means the counters aren't exposed on any platforms
I'm guessing that you mean on platforms where they were exposed under the previous
version of the heuristic? I'll leave Zeng Heng to comment on that for our platforms.
> Drop tags due to change in policy/rework
>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Doesn't really matter as it's the bit after --- , but what's a SoB doing here?
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 09/41] arm64: mpam: Initialise and context switch the MPAMSM_EL1 register
2026-02-03 21:43 ` [PATCH v4 09/41] arm64: mpam: Initialise and context switch the MPAMSM_EL1 register Ben Horgan
@ 2026-02-05 16:55 ` Catalin Marinas
0 siblings, 0 replies; 89+ messages in thread
From: Catalin Marinas @ 2026-02-05 16:55 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
On Tue, Feb 03, 2026 at 09:43:10PM +0000, Ben Horgan wrote:
> The MPAMSM_EL1 sets the MPAM labels, PMG and PARTID, for loads and stores
> generated by a shared SMCU. Disable the traps so the kernel can use it and
> set it to the same configuration as the per-EL cpu MPAM configuration.
>
> If an SMCU is not shared with other cpus then it is implementation
> defined whether the configuration from MPAMSM_EL1 is used or that from
> the appropriate MPAMy_ELx. As we set the same, PMG_D and PARTID_D,
> configuration for MPAM0_EL1, MPAM1_EL1 and MPAMSM_EL1 the resulting
> configuration is the same regardless.
>
> The range of valid configurations for the PARTID and PMG in MPAMSM_EL1 is
> not currently specified in Arm Architectural Reference Manual but the
> architect has confirmed that it is intended to be the same as that for the
> cpu configuration in the MPAMy_ELx registers.
>
> Tested-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 10/41] arm64: mpam: Add helpers to change a task or cpu's MPAM PARTID/PMG values
2026-02-03 21:43 ` [PATCH v4 10/41] arm64: mpam: Add helpers to change a task or cpu's MPAM PARTID/PMG values Ben Horgan
@ 2026-02-05 16:56 ` Catalin Marinas
0 siblings, 0 replies; 89+ messages in thread
From: Catalin Marinas @ 2026-02-05 16:56 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
On Tue, Feb 03, 2026 at 09:43:11PM +0000, Ben Horgan wrote:
> From: James Morse <james.morse@arm.com>
>
> Care must be taken when modifying the PARTID and PMG of a task in any
> per-task structure as writing these values may race with the task being
> scheduled in, and reading the modified values.
>
> Add helpers to set the task properties, and the CPU default value. These
> use WRITE_ONCE() that pairs with the READ_ONCE() in mpam_get_regval() to
> avoid causing torn values.
>
> Tested-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Tested-by: Peter Newman <peternewman@google.com>
> CC: Dave Martin <Dave.Martin@arm.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
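
For readers following along, a minimal user-space sketch of the idea in the
commit message: keep PARTID and PMG in one word so that a single store and a
single load can never observe a half-updated pair. This is not the kernel
code. The struct, the field layout (PARTID in bits 0..15, PMG in bits 16..23)
and the helper names are assumptions for illustration, and C11 relaxed atomics
stand in for WRITE_ONCE()/READ_ONCE().

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-task slot: PARTID in bits 0..15, PMG in bits 16..23. */
struct task_mpam {
	_Atomic uint32_t partid_pmg;
};

/* Writer side: one store publishes both fields together. */
static void task_mpam_set(struct task_mpam *t, uint16_t partid, uint8_t pmg)
{
	uint32_t v = (uint32_t)partid | ((uint32_t)pmg << 16);

	atomic_store_explicit(&t->partid_pmg, v, memory_order_relaxed);
}

/* Reader side: one load, so PARTID and PMG always come from the same write. */
static void task_mpam_get(struct task_mpam *t, uint16_t *partid, uint8_t *pmg)
{
	uint32_t v = atomic_load_explicit(&t->partid_pmg, memory_order_relaxed);

	*partid = (uint16_t)(v & 0xffff);
	*pmg = (uint8_t)((v >> 16) & 0xff);
}
```

The point of the packing is that a reader racing with a writer sees either the
old pair or the new pair, never a mix of the two.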
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 41/41] arm64: mpam: Add initial MPAM documentation
2026-02-03 21:43 ` [PATCH v4 41/41] arm64: mpam: Add initial MPAM documentation Ben Horgan
@ 2026-02-05 16:57 ` Catalin Marinas
2026-02-05 17:05 ` Jonathan Cameron
1 sibling, 0 replies; 89+ messages in thread
From: Catalin Marinas @ 2026-02-05 16:57 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc
On Tue, Feb 03, 2026 at 09:43:42PM +0000, Ben Horgan wrote:
> MPAM (Memory Partitioning and Monitoring) is now exposed to user-space via
> resctrl. Add some documentation so the user knows what features to expect.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 28/41] arm_mpam: resctrl: Pick classes for use as mbm counters
2026-02-03 21:43 ` [PATCH v4 28/41] arm_mpam: resctrl: Pick classes for use as mbm counters Ben Horgan
@ 2026-02-05 16:58 ` Jonathan Cameron
0 siblings, 0 replies; 89+ messages in thread
From: Jonathan Cameron @ 2026-02-05 16:58 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc
On Tue, 3 Feb 2026 21:43:29 +0000
Ben Horgan <ben.horgan@arm.com> wrote:
> From: James Morse <james.morse@arm.com>
>
> resctrl has two types of counters, NUMA-local and global. MPAM can only
> count global either using MSC at the L3 cache or in the memory controllers.
> When global and local equate to the same thing continue just to call it
> global.
>
> Because the class or component backing the event may not be 'the L3', it is
> necessary for mpam_resctrl_get_domain_from_cpu() to search the monitor
> domains too. This matters the most for 'monitor only' systems, where 'the
> L3' control domains may be empty, and the ctrl_comp pointer NULL.
>
> resctrl expects there to be enough monitors for every possible control and
> monitor group to have one. Such a system gets called 'free running' as the
> monitors can be programmed once and left running. Any other platform will
> need to emulate ABMC.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Seems fine to me.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
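
As a rough illustration of the 'free running' condition described in the
commit message, the check could look something like the sketch below. It
assumes that "every possible control and monitor group" can be approximated
by the number of (PARTID, PMG) combinations; the helper name and its
parameters are hypothetical and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when there is one monitor for every
 * (PARTID, PMG) pair, so each can be programmed once and left running.
 */
static bool mon_can_free_run(uint32_t num_monitors, uint32_t num_partid,
			     uint32_t num_pmg)
{
	/* Widen before multiplying so large counts cannot overflow. */
	uint64_t needed = (uint64_t)num_partid * num_pmg;

	return needed != 0 && num_monitors >= needed;
}
```

A platform that fails a check of this kind is the case the commit message
says would need ABMC emulation.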
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 41/41] arm64: mpam: Add initial MPAM documentation
2026-02-03 21:43 ` [PATCH v4 41/41] arm64: mpam: Add initial MPAM documentation Ben Horgan
2026-02-05 16:57 ` Catalin Marinas
@ 2026-02-05 17:05 ` Jonathan Cameron
2026-02-18 17:02 ` Ben Horgan
1 sibling, 1 reply; 89+ messages in thread
From: Jonathan Cameron @ 2026-02-05 17:05 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc
On Tue, 3 Feb 2026 21:43:42 +0000
Ben Horgan <ben.horgan@arm.com> wrote:
> MPAM (Memory Partitioning and Monitoring) is now exposed to user-space via
> resctrl. Add some documentation so the user knows what features to expect.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> ---
> Changes by Ben:
> Some tidying, update for current heuristics
> ---
> Documentation/arch/arm64/index.rst | 1 +
> Documentation/arch/arm64/mpam.rst | 93 ++++++++++++++++++++++++++++++
> 2 files changed, 94 insertions(+)
> create mode 100644 Documentation/arch/arm64/mpam.rst
>
> diff --git a/Documentation/arch/arm64/index.rst b/Documentation/arch/arm64/index.rst
> index 6a012c98bdcd..189fa760dade 100644
> --- a/Documentation/arch/arm64/index.rst
> +++ b/Documentation/arch/arm64/index.rst
> @@ -23,6 +23,7 @@ ARM64 Architecture
> memory
> memory-tagging-extension
> mops
> + mpam
> perf
> pointer-authentication
> ptdump
> diff --git a/Documentation/arch/arm64/mpam.rst b/Documentation/arch/arm64/mpam.rst
> new file mode 100644
> index 000000000000..0769bccff25e
> --- /dev/null
> +++ b/Documentation/arch/arm64/mpam.rst
> @@ -0,0 +1,93 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +====
> +MPAM
> +====
> +
> +What is MPAM
> +============
> + MPAM (Memory Partitioning and Monitoring) is a feature in the CPUs and memory
I've not seen this style of indenting much in rst. I checked a few
files in this directory and it's not used in the ones I randomly picked.
It's also not what the kernel-documentation.rst file suggests is standard
formatting.
Other than that the content looks fine to me.
Jonathan
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 20/41] arm_mpam: resctrl: Add CDP emulation
2026-02-03 21:43 ` [PATCH v4 20/41] arm_mpam: resctrl: Add CDP emulation Ben Horgan
@ 2026-02-09 1:16 ` Fenghua Yu
2026-02-09 15:36 ` Ben Horgan
0 siblings, 1 reply; 89+ messages in thread
From: Fenghua Yu @ 2026-02-09 1:16 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, gshan, james.morse, jonathan.cameron, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
Hi, Ben,
On 2/3/26 13:43, Ben Horgan wrote:
> From: James Morse <james.morse@arm.com>
>
> Intel RDT's CDP feature allows the cache to use a different control value
> depending on whether the accesses was for instruction fetch or a data
> access. MPAM's equivalent feature is the other way up: the CPU assigns a
> different partid label to traffic depending on whether it was instruction
> fetch or a data access, which causes the cache to use a different control
> value based solely on the partid.
>
> MPAM can emulate CDP, with the side effect that the alternative partid is
> seen by all MSC, it can't be enabled per-MSC.
>
> Add the resctrl hooks to turn this on or off. Add the helpers that match a
> closid against a task, which need to be aware that the value written to
> hardware is not the same as the one resctrl is using.
>
> Update the 'arm64_mpam_global_default' variable the arch code uses during
> context switch to know when the per-cpu value should be used instead. Also,
> update these per-cpu values and sync the resulting mpam partid/pmg
> configuration to hardware.
>
> Awkwardly, the MB controls don't implement CDP. To emulate this, the MPAM
> equivalent needs programming twice by the resctrl glue, as resctrl expects
> the bandwidth controls to be applied independently for both data and
> instruction-fetch.
>
> Tested-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Tested-by: Peter Newman <peternewman@google.com>
> CC: Dave Martin <Dave.Martin@arm.com>
> CC: Amit Singh Tomar <amitsinght@marvell.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> ---
> Changes since rfc:
> Fail cdp initialisation if there is only one partid
> Correct data/code confusion
>
> Changes since v2:
> Don't include unused header
>
> Changes since v3:
> Update the per-cpu values and sync to h/w
> ---
> arch/arm64/include/asm/mpam.h | 1 +
> drivers/resctrl/mpam_resctrl.c | 117 +++++++++++++++++++++++++++++++++
> include/linux/arm_mpam.h | 2 +
> 3 files changed, 120 insertions(+)
>
> diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h
> index 05aa71200f61..70d396e7b6da 100644
> --- a/arch/arm64/include/asm/mpam.h
> +++ b/arch/arm64/include/asm/mpam.h
> @@ -4,6 +4,7 @@
> #ifndef __ASM__MPAM_H
> #define __ASM__MPAM_H
>
> +#include <linux/arm_mpam.h>
> #include <linux/bitfield.h>
> #include <linux/jump_label.h>
> #include <linux/percpu.h>
> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
> index cd52ca279651..12017264530a 100644
> --- a/drivers/resctrl/mpam_resctrl.c
> +++ b/drivers/resctrl/mpam_resctrl.c
> @@ -38,6 +38,10 @@ static DEFINE_MUTEX(domain_list_lock);
> static bool exposed_alloc_capable;
> static bool exposed_mon_capable;
>
> +/*
> + * MPAM emulates CDP by setting different PARTID in the I/D fields of MPAM0_EL1.
> + * This applies globally to all traffic the CPU generates.
> + */
> static bool cdp_enabled;
>
> bool resctrl_arch_alloc_capable(void)
> @@ -50,6 +54,72 @@ bool resctrl_arch_mon_capable(void)
> return exposed_mon_capable;
> }
>
> +bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level rid)
> +{
> + switch (rid) {
> + case RDT_RESOURCE_L2:
> + case RDT_RESOURCE_L3:
> + return cdp_enabled;
> + case RDT_RESOURCE_MBA:
> + default:
> + /*
> + * x86's MBA control doesn't support CDP, so user-space doesn't
s/x86's/ARM's/
Thanks.
-Fenghua
[SNIP]
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (40 preceding siblings ...)
2026-02-03 21:43 ` [PATCH v4 41/41] arm64: mpam: Add initial MPAM documentation Ben Horgan
@ 2026-02-09 8:25 ` Shaopeng Tan (Fujitsu)
2026-02-09 10:04 ` Ben Horgan
2026-02-14 9:40 ` Zeng Heng
42 siblings, 1 reply; 89+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2026-02-09 8:25 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght@marvell.com, baisheng.gao@unisoc.com,
baolin.wang@linux.alibaba.com, carl@os.amperecomputing.com,
dave.martin@arm.com, david@kernel.org, dfustini@baylibre.com,
fenghuay@nvidia.com, gshan@redhat.com, james.morse@arm.com,
jonathan.cameron@huawei.com, kobak@nvidia.com,
lcherian@marvell.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, peternewman@google.com,
punit.agrawal@oss.qualcomm.com, quic_jiles@quicinc.com,
reinette.chatre@intel.com, rohit.mathew@arm.com,
scott@os.amperecomputing.com, sdonthineni@nvidia.com,
xhao@linux.alibaba.com, catalin.marinas@arm.com, will@kernel.org,
corbet@lwn.net, maz@kernel.org, oupton@kernel.org,
joey.gouly@arm.com, suzuki.poulose@arm.com,
kvmarm@lists.linux.dev, zengheng4@huawei.com,
linux-doc@vger.kernel.org
Hello Ben,
> This new version of the mpam missing pieces series has a few significant
> changes in the mpam driver part of the series. The heuristics for deciding
> if features should be exposed are tightened. This is to fix some
> inaccuracies and avoid overcommitting before needed - shout if this changes
> anything on your platform. The final patch adds documentation which
> explains which features you should expect. The ABMC emulation is dropped
> for the moment as it requires resctrl changes to support for MPAM without
> breaking the abi. The default 5% gap for min_bw is dropped in favour of a
> simple default (kept for grace). The series is based on x86/resctrl [1] as
> resctrl has telemetry patches queued which change the arch interface.
Could you please elaborate on why fs/resctrl changes are required to support only the counter assignment part of ABMC?
Currently, many SoC chips have an insufficient number of memory bandwidth monitors.
We would be grateful if you could support the counter assignment part of ABMC.
Best regards,
Shaopeng TAN
> Fixes that are in 6.19-rc8 are dropped from the series but
> b9f5c38e4af1 ("arm_mpam: Use non-atomic bitops when modifying feature bitmap")
> is required to avoid an alignment fault in the kunit tests.
>
> Thank you for all the testing and reviewing so far! It all helps.
>
> Changelogs in patches
>
> From James' cover letter:
>
> This is the missing piece to make MPAM usable resctrl in user-space. This has
> shed its debugfs code and the read/write 'event configuration' for the monitors
> to make the series smaller.
>
> This adds the arch code and KVM support first. I anticipate the whole thing
> going via arm64, but if it goes via tip instead, then an immutable branch with those
> patches should be easy to do.
>
> Generally the resctrl glue code works by picking what MPAM features it can expose
> from the MPAM driver, then configuring the structs that back the resctrl helpers.
> If your platform is sufficiently Xeon shaped, you should be able to get L2/L3 CPOR
> bitmaps exposed via resctrl. CSU counters work if they are on/after the L3. MBWU
> counters are considerably more hairy, and depend on heuristics around the topology,
> and a bunch of stuff trying to emulate ABMC.
> If it didn't pick what you wanted it to, please share the debug messages produced
> when enabling dynamic debug and booting with:
> | dyndbg="file mpam_resctrl.c +pl"
>
> I've not found a platform that can test all the behaviours around the monitors,
> so this is where I'd expect the most bugs.
>
> The MPAM spec that describes all the system and MMIO registers can be found here:
> https://developer.arm.com/documentation/ddi0598/db/?lang=en
> (Ignore the 'RETIRED' warning - that is just arm moving the documentation around.
> This document has the best overview)
>
> Based on:
> [1] git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/cache
> (To include telemetry code which changes the resctrl arch interface)
>
> The series can be retrieved from:
> https://gitlab.arm.com/linux-arm/linux-bh.git mpam_resctrl_glue_v4
> (Final commit is a fix already in 6.19-rc8)
>
> v3 can be found at:
> https://lore.kernel.org/linux-arm-kernel/20260112165914.4086692-1-ben.horgan@arm.com/
>
> v2 can be found at:
> https://lore.kernel.org/linux-arm-kernel/20251219181147.3404071-1-ben.horgan@arm.com/
>
> rfc can be found at:
> https://lore.kernel.org/linux-arm-kernel/20251205215901.17772-1-james.morse@arm.com/
>
>
> Ben Horgan (10):
> arm64/sysreg: Add MPAMSM_EL1 register
> KVM: arm64: Preserve host MPAM configuration when changing traps
> KVM: arm64: Make MPAMSM_EL1 accesses UNDEF
> arm64: mpam: Drop the CONFIG_EXPERT restriction
> arm64: mpam: Initialise and context switch the MPAMSM_EL1 register
> KVM: arm64: Use kernel-space partid configuration for hypercalls
> arm_mpam: resctrl: Add rmid index helpers
> arm_mpam: resctrl: Add kunit test for rmid idx conversions
> arm_mpam: resctrl: Wait for cacheinfo to be ready
> arm64: mpam: Add initial MPAM documentation
>
> Dave Martin (2):
> arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats
> arm_mpam: resctrl: Add kunit test for control format conversions
>
> James Morse (25):
> arm64: mpam: Context switch the MPAM registers
> arm64: mpam: Re-initialise MPAM regs when CPU comes online
> arm64: mpam: Advertise the CPUs MPAM limits to the driver
> arm64: mpam: Add cpu_pm notifier to restore MPAM sysregs
> arm64: mpam: Add helpers to change a task or cpu's MPAM PARTID/PMG
> values
> KVM: arm64: Force guest EL1 to use user-space's partid configuration
> arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation
> arm_mpam: resctrl: Sort the order of the domain lists
> arm_mpam: resctrl: Pick the caches we will use as resctrl resources
> arm_mpam: resctrl: Implement resctrl_arch_reset_all_ctrls()
> arm_mpam: resctrl: Add resctrl_arch_get_config()
> arm_mpam: resctrl: Implement helpers to update configuration
> arm_mpam: resctrl: Add plumbing against arm64 task and cpu hooks
> arm_mpam: resctrl: Add CDP emulation
> arm_mpam: resctrl: Add support for 'MB' resource
> arm_mpam: resctrl: Add support for csu counters
> arm_mpam: resctrl: Pick classes for use as mbm counters
> arm_mpam: resctrl: Pre-allocate free running monitors
> arm_mpam: resctrl: Allow resctrl to allocate monitors
> arm_mpam: resctrl: Add resctrl_arch_rmid_read() and
> resctrl_arch_reset_rmid()
> arm_mpam: resctrl: Update the rmid reallocation limit
> arm_mpam: resctrl: Add empty definitions for assorted resctrl
> functions
> arm64: mpam: Select ARCH_HAS_CPU_RESCTRL
> arm_mpam: resctrl: Call resctrl_init() on platforms that can support
> resctrl
> arm_mpam: Quirk CMN-650's CSU NRDY behaviour
>
> Shanker Donthineni (4):
> arm_mpam: Add quirk framework
> arm_mpam: Add workaround for T241-MPAM-1
> arm_mpam: Add workaround for T241-MPAM-4
> arm_mpam: Add workaround for T241-MPAM-6
>
> Documentation/arch/arm64/index.rst | 1 +
> Documentation/arch/arm64/mpam.rst | 93 +
> Documentation/arch/arm64/silicon-errata.rst | 9 +
> arch/arm64/Kconfig | 6 +-
> arch/arm64/include/asm/el2_setup.h | 3 +-
> arch/arm64/include/asm/mpam.h | 96 +
> arch/arm64/include/asm/resctrl.h | 2 +
> arch/arm64/include/asm/thread_info.h | 3 +
> arch/arm64/kernel/Makefile | 1 +
> arch/arm64/kernel/cpufeature.c | 21 +-
> arch/arm64/kernel/mpam.c | 62 +
> arch/arm64/kernel/process.c | 7 +
> arch/arm64/kvm/hyp/include/hyp/switch.h | 12 +-
> arch/arm64/kvm/hyp/nvhe/hyp-main.c | 9 +
> arch/arm64/kvm/hyp/vhe/sysreg-sr.c | 13 +
> arch/arm64/kvm/sys_regs.c | 2 +
> arch/arm64/tools/sysreg | 8 +
> drivers/resctrl/Kconfig | 9 +-
> drivers/resctrl/Makefile | 1 +
> drivers/resctrl/mpam_devices.c | 257 ++-
> drivers/resctrl/mpam_internal.h | 105 +-
> drivers/resctrl/mpam_resctrl.c | 1861 +++++++++++++++++++
> drivers/resctrl/test_mpam_resctrl.c | 364 ++++
> include/linux/arm_mpam.h | 32 +
> 24 files changed, 2949 insertions(+), 28 deletions(-)
> create mode 100644 Documentation/arch/arm64/mpam.rst
> create mode 100644 arch/arm64/include/asm/mpam.h
> create mode 100644 arch/arm64/include/asm/resctrl.h
> create mode 100644 arch/arm64/kernel/mpam.c
> create mode 100644 drivers/resctrl/mpam_resctrl.c
> create mode 100644 drivers/resctrl/test_mpam_resctrl.c
>
> --
> 2.43.0
>
>
>
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code
2026-02-09 8:25 ` [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Shaopeng Tan (Fujitsu)
@ 2026-02-09 10:04 ` Ben Horgan
2026-02-12 14:51 ` Ben Horgan
0 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-09 10:04 UTC (permalink / raw)
To: Shaopeng Tan (Fujitsu)
Cc: amitsinght@marvell.com, baisheng.gao@unisoc.com,
baolin.wang@linux.alibaba.com, carl@os.amperecomputing.com,
dave.martin@arm.com, david@kernel.org, dfustini@baylibre.com,
fenghuay@nvidia.com, gshan@redhat.com, james.morse@arm.com,
jonathan.cameron@huawei.com, kobak@nvidia.com,
lcherian@marvell.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, peternewman@google.com,
punit.agrawal@oss.qualcomm.com, quic_jiles@quicinc.com,
reinette.chatre@intel.com, rohit.mathew@arm.com,
scott@os.amperecomputing.com, sdonthineni@nvidia.com,
xhao@linux.alibaba.com, catalin.marinas@arm.com, will@kernel.org,
corbet@lwn.net, maz@kernel.org, oupton@kernel.org,
joey.gouly@arm.com, suzuki.poulose@arm.com,
kvmarm@lists.linux.dev, zengheng4@huawei.com,
linux-doc@vger.kernel.org
Hi Shaopeng,
On 2/9/26 08:25, Shaopeng Tan (Fujitsu) wrote:
> Hello Ben,
>
>> This new version of the mpam missing pieces series has a few significant
>> changes in the mpam driver part of the series. The heuristics for deciding
>> if features should be exposed are tightened. This is to fix some
>> inaccuracies and avoid overcommitting before needed - shout if this changes
>> anything on your platform. The final patch adds documentation which
>> explains which features you should expect. The ABMC emulation is dropped
>> for the moment as it requires resctrl changes to support for MPAM without
>> breaking the abi. The default 5% gap for min_bw is dropped in favour of a
>> simple default (kept for grace). The series is based on x86/resctrl [1] as
>> resctrl has telemetry patches queued which change the arch interface.
>
> Could you please elaborate on why fs/resctrl changes are required to support only the counter assignment part of ABMC?
> Currently, many SoC chips have an insufficient number of memory bandwidth monitors.
Sure. When the counter assignment mode is 'mbm_event', resctrl assumes the mbm events are configurable.
The 'event_filter' files at
info/L3_MON/event_configs/<event>/event_filter
are used to display and set this configuration.
In MPAM, event configuration is not supported, so showing a readable/writable 'event_filter' file is
misleading to the user; it needs to be hidden for MPAM support.
Just to give you a flavour of the change, here's a hack to show the correct thing for MPAM:
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -2338,6 +2338,9 @@ static int resctrl_mkdir_event_configs(struct rdt_resource *r, struct kernfs_nod
if (ret)
goto out;
+ if (!resctrl_arch_is_evt_configurable(mevt->evtid))
+ continue;
+
> We would be grateful if you could support the counter assignment part of ABMC.
It is not a big change in resctrl, but I thought it best not to gate the rest of this series on
an additional change in another subsystem. I am currently looking into this and hope to get the
patches on the list early in the next cycle.
>
> Best regards,
> Shaopeng TAN
>
>
Thanks,
Ben
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 20/41] arm_mpam: resctrl: Add CDP emulation
2026-02-09 1:16 ` Fenghua Yu
@ 2026-02-09 15:36 ` Ben Horgan
2026-02-11 10:50 ` Ben Horgan
0 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-09 15:36 UTC (permalink / raw)
To: Fenghua Yu
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, gshan, james.morse, jonathan.cameron, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
Hi Fenghua,
On 2/9/26 01:16, Fenghua Yu wrote:
> Hi, Ben,
>
> On 2/3/26 13:43, Ben Horgan wrote:
>> From: James Morse <james.morse@arm.com>
>>
>> Intel RDT's CDP feature allows the cache to use a different control value
>> depending on whether the accesses was for instruction fetch or a data
>> access. MPAM's equivalent feature is the other way up: the CPU assigns a
>> different partid label to traffic depending on whether it was instruction
>> fetch or a data access, which causes the cache to use a different control
>> value based solely on the partid.
>>
>> MPAM can emulate CDP, with the side effect that the alternative partid is
>> seen by all MSC, it can't be enabled per-MSC.
>>
>> Add the resctrl hooks to turn this on or off. Add the helpers that
>> match a
>> closid against a task, which need to be aware that the value written to
>> hardware is not the same as the one resctrl is using.
>>
>> Update the 'arm64_mpam_global_default' variable the arch code uses during
>> context switch to know when the per-cpu value should be used instead.
>> Also,
>> update these per-cpu values and sync the resulting mpam partid/pmg
>> configuration to hardware.
>>
>> Awkwardly, the MB controls don't implement CDP. To emulate this, the MPAM
>> equivalent needs programming twice by the resctrl glue, as resctrl
>> expects
>> the bandwidth controls to be applied independently for both data and
>> instruction-fetch.
>>
>> Tested-by: Gavin Shan <gshan@redhat.com>
>> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
>> Tested-by: Peter Newman <peternewman@google.com>
>> CC: Dave Martin <Dave.Martin@arm.com>
>> CC: Amit Singh Tomar <amitsinght@marvell.com>
>> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>> Signed-off-by: James Morse <james.morse@arm.com>
>> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
>> ---
>> Changes since rfc:
>> Fail cdp initialisation if there is only one partid
>> Correct data/code confusion
>>
>> Changes since v2:
>> Don't include unused header
>>
>> Changes since v3:
>> Update the per-cpu values and sync to h/w
>> ---
>> arch/arm64/include/asm/mpam.h | 1 +
>> drivers/resctrl/mpam_resctrl.c | 117 +++++++++++++++++++++++++++++++++
>> include/linux/arm_mpam.h | 2 +
>> 3 files changed, 120 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/
>> mpam.h
>> index 05aa71200f61..70d396e7b6da 100644
>> --- a/arch/arm64/include/asm/mpam.h
>> +++ b/arch/arm64/include/asm/mpam.h
>> @@ -4,6 +4,7 @@
>> #ifndef __ASM__MPAM_H
>> #define __ASM__MPAM_H
>> +#include <linux/arm_mpam.h>
>> #include <linux/bitfield.h>
>> #include <linux/jump_label.h>
>> #include <linux/percpu.h>
>> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/
>> mpam_resctrl.c
>> index cd52ca279651..12017264530a 100644
>> --- a/drivers/resctrl/mpam_resctrl.c
>> +++ b/drivers/resctrl/mpam_resctrl.c
>> @@ -38,6 +38,10 @@ static DEFINE_MUTEX(domain_list_lock);
>> static bool exposed_alloc_capable;
>> static bool exposed_mon_capable;
>> +/*
>> + * MPAM emulates CDP by setting different PARTID in the I/D fields of
>> MPAM0_EL1.
>> + * This applies globally to all traffic the CPU generates.
>> + */
>> static bool cdp_enabled;
>> bool resctrl_arch_alloc_capable(void)
>> @@ -50,6 +54,72 @@ bool resctrl_arch_mon_capable(void)
>> return exposed_mon_capable;
>> }
>> +bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level rid)
>> +{
>> + switch (rid) {
>> + case RDT_RESOURCE_L2:
>> + case RDT_RESOURCE_L3:
>> + return cdp_enabled;
>> + case RDT_RESOURCE_MBA:
>> + default:
>> + /*
>> + * x86's MBA control doesn't support CDP, so user-space doesn't
>
> s/x86's/ARM's/
In CPUs supporting MPAM the instruction/data distinction is made at the
CPU so doesn't depend on the specific control. The point this comment is
trying to make is that as x86 doesn't support CDP on MBA, resctrl, which
was initially x86 specific, expected CDP not to be supported on MBA and
hence MPAM/ARM64 has to match this behaviour. Therefore, the MPAM driver
doesn't support CDP on MBA either. In essence, the MPAM driver emulates
the x86 CDP behaviour. Having said that, this comment relies on the
reader knowing this historical context, and so I'll update it to not
reference x86 and just mention that it is the expectation of the resctrl
interface.
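
To make that concrete, here is a small standalone sketch of the emulation as
described in the commit message. It is not the driver code: the enum names and
helpers are hypothetical, and the even/odd PARTID layout (data at closid * 2,
code at closid * 2 + 1) is an assumption modelled on how resctrl indexes CDP
configurations.

```c
#include <stdbool.h>
#include <stdint.h>

enum conf_type { CONF_NONE, CONF_CODE, CONF_DATA };	/* hypothetical */
enum res_id { RES_L2, RES_L3, RES_MB };			/* hypothetical */

static bool cdp_enabled;

/* Map a resctrl closid to the PARTID the CPU would carry for this access type. */
static uint32_t closid_to_partid(uint32_t closid, enum conf_type type)
{
	if (!cdp_enabled || type == CONF_NONE)
		return closid;
	return closid * 2 + (type == CONF_CODE ? 1 : 0);
}

/* Only the cache resources report CDP; MB reports it as off, as resctrl expects. */
static bool res_cdp_enabled(enum res_id rid)
{
	return (rid == RES_L2 || rid == RES_L3) && cdp_enabled;
}

/*
 * Fill 'out' with the PARTIDs that must be programmed to apply one
 * control for 'closid' on resource 'rid'. Returns how many were written.
 */
static int partids_to_program(enum res_id rid, uint32_t closid,
			      enum conf_type type, uint32_t out[2])
{
	if (cdp_enabled && !res_cdp_enabled(rid)) {
		/* One MB value, but two labels in flight: program it twice. */
		out[0] = closid_to_partid(closid, CONF_DATA);
		out[1] = closid_to_partid(closid, CONF_CODE);
		return 2;
	}
	out[0] = closid_to_partid(closid, type);
	return 1;
}
```

With CDP on, a cache control touches one PARTID per code/data configuration,
while a single MB value has to be written for both PARTIDs, because the CPU
labels instruction and data traffic separately regardless of which control is
being set.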
>
> Thanks.
>
> -Fenghua
>
> [SNIP]
Thanks,
Ben
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource
2026-02-03 21:43 ` [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource Ben Horgan
2026-02-05 16:50 ` Jonathan Cameron
@ 2026-02-10 6:20 ` Shaopeng Tan (Fujitsu)
2026-02-18 16:42 ` Ben Horgan
1 sibling, 1 reply; 89+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2026-02-10 6:20 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght@marvell.com, baisheng.gao@unisoc.com,
baolin.wang@linux.alibaba.com, carl@os.amperecomputing.com,
dave.martin@arm.com, david@kernel.org, dfustini@baylibre.com,
fenghuay@nvidia.com, gshan@redhat.com, james.morse@arm.com,
jonathan.cameron@huawei.com, kobak@nvidia.com,
lcherian@marvell.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, peternewman@google.com,
punit.agrawal@oss.qualcomm.com, quic_jiles@quicinc.com,
reinette.chatre@intel.com, rohit.mathew@arm.com,
scott@os.amperecomputing.com, sdonthineni@nvidia.com,
xhao@linux.alibaba.com, catalin.marinas@arm.com, will@kernel.org,
corbet@lwn.net, maz@kernel.org, oupton@kernel.org,
joey.gouly@arm.com, suzuki.poulose@arm.com,
kvmarm@lists.linux.dev, zengheng4@huawei.com,
linux-doc@vger.kernel.org, dave.martin@arm.com
Hello Ben,
> From: James Morse <james.morse@arm.com>
>
> resctrl supports 'MB', as a percentage throttling of traffic from the
> L3. This is the control that mba_sc uses, so ideally the class chosen
> should be as close as possible to the counters used for mbm_total. If
> there is a single L3 and the topology of the memory matches then the
> traffic at the memory controller will be equivalent to that at egress of
> the L3. If these conditions are met allow the memory class to back MB.
>
> MB's percentage control should be backed either with the fixed point
> fraction MBW_MAX or bandwidth portion bitmaps. The bandwidth portion
> bitmaps is not used as its tricky to pick which bits to use to avoid
> contention, and may be possible to expose this as something other than a
> percentage in the future.
>
> CC: Zeng Heng <zengheng4@huawei.com>
> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> ---
> Changes since v2:
> Code flow change
> Commit message 'or'
>
> Changes since v3:
> initialise tmp_cpumask
> update commit message
> check the traffic matches l3
> update comment on candidate_class update, only mbm_total
> drop tags due to rework
> ---
> drivers/resctrl/mpam_resctrl.c | 275 ++++++++++++++++++++++++++++++++-
> 1 file changed, 274 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
> index 25950e667077..4cca3694978d 100644
> --- a/drivers/resctrl/mpam_resctrl.c
> +++ b/drivers/resctrl/mpam_resctrl.c
> @@ -253,6 +253,33 @@ static bool cache_has_usable_cpor(struct mpam_class *class)
> return class->props.cpbm_wd <= 32;
> }
>
> +static bool mba_class_use_mbw_max(struct mpam_props *cprops)
> +{
> + return (mpam_has_feature(mpam_feat_mbw_max, cprops) &&
> + cprops->bwa_wd);
> +}
> +
> +static bool class_has_usable_mba(struct mpam_props *cprops)
> +{
> + return mba_class_use_mbw_max(cprops);
> +}
> +
> +/*
> + * Calculate the worst-case percentage change from each implemented step
> + * in the control.
> + */
> +static u32 get_mba_granularity(struct mpam_props *cprops)
> +{
> + if (!mba_class_use_mbw_max(cprops))
> + return 0;
> +
> + /*
> + * bwa_wd is the number of bits implemented in the 0.xxx
> + * fixed point fraction. 1 bit is 50%, 2 is 25% etc.
> + */
> + return DIV_ROUND_UP(MAX_MBA_BW, 1 << cprops->bwa_wd);
> +}
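The granularity arithmetic above can be checked in isolation. This is a minimal standalone sketch, not the driver code: it assumes MAX_MBA_BW is 100 (percent) and re-declares DIV_ROUND_UP locally, and mba_granularity() is a hypothetical name that takes bwa_wd directly instead of a struct mpam_props.

```c
#include <stdint.h>

/* Assumption: resctrl's MAX_MBA_BW is 100, i.e. a percentage. */
#define MAX_MBA_BW 100u
/* Local copy of the usual round-up division helper. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/*
 * Hypothetical stand-alone version of the arithmetic in
 * get_mba_granularity(): bwa_wd is the number of implemented bits in
 * the 0.xxx fixed-point fraction, so each step is 100% / 2^bwa_wd,
 * rounded up to a whole percent.
 */
static uint32_t mba_granularity(unsigned int bwa_wd)
{
	if (bwa_wd == 0)
		return 0;	/* no usable MBW_MAX fraction bits */

	return DIV_ROUND_UP(MAX_MBA_BW, 1u << bwa_wd);
}
```

With these assumptions, 1 bit gives 50, 2 bits give 25, 3 bits give 13 (12.5 rounded up), and anything from 7 bits upwards reports a granularity of 1.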
> +
> /*
> * Each fixed-point hardware value architecturally represents a range
> * of values: the full range 0% - 100% is split contiguously into
> @@ -303,6 +330,153 @@ static u16 percent_to_mbw_max(u8 pc, struct mpam_props *cprops)
> return val;
> }
>
> +static u32 get_mba_min(struct mpam_props *cprops)
> +{
> + if (!mba_class_use_mbw_max(cprops)) {
> + WARN_ON_ONCE(1);
> + return 0;
> + }
> +
> + return mbw_max_to_percent(0, cprops);
> +}
> +
> +/* Find the L3 cache that has affinity with this CPU */
> +static int find_l3_equivalent_bitmask(int cpu, cpumask_var_t tmp_cpumask)
> +{
> + u32 cache_id = get_cpu_cacheinfo_id(cpu, 3);
> +
> + lockdep_assert_cpus_held();
> +
> + return mpam_get_cpumask_from_cache_id(cache_id, 3, tmp_cpumask);
> +}
> +
> +/*
> + * topology_matches_l3() - Is the provided class the same shape as L3
> + * @victim: The class we'd like to pretend is L3.
> + *
> + * resctrl expects all the world's a Xeon, and all counters are on the
> + * L3. We allow some mapping counters on other classes. This requires
> + * that the CPU->domain mapping is the same kind of shape.
> + *
> + * Using cacheinfo directly would make this work even if resctrl can't
> + * use the L3 - but cacheinfo can't tell us anything about offline CPUs.
> + * Using the L3 resctrl domain list also depends on CPUs being online.
> + * Using the mpam_class we picked for L3 so we can use its domain list
> + * assumes that there are MPAM controls on the L3.
> + * Instead, this path eventually uses the mpam_get_cpumask_from_cache_id()
> + * helper which can tell us about offline CPUs ... but getting the cache_id
> + * to start with relies on at least one CPU per L3 cache being online at
> + * boot.
> + *
> + * Walk the victim component list and compare the affinity mask with the
> + * corresponding L3. The topology matches if each victim:component's affinity
> + * mask is the same as the CPU's corresponding L3's. These lists/masks are
> + * computed from firmware tables so don't change at runtime.
> + */
> +static bool topology_matches_l3(struct mpam_class *victim)
> +{
> + int cpu, err;
> + struct mpam_component *victim_iter;
> + cpumask_var_t __free(free_cpumask_var) tmp_cpumask = CPUMASK_VAR_NULL;
> +
> + lockdep_assert_cpus_held();
> +
> + if (!alloc_cpumask_var(&tmp_cpumask, GFP_KERNEL))
> + return false;
> +
> + guard(srcu)(&mpam_srcu);
> + list_for_each_entry_srcu(victim_iter, &victim->components, class_list,
> + srcu_read_lock_held(&mpam_srcu)) {
> + if (cpumask_empty(&victim_iter->affinity)) {
> + pr_debug("class %u has CPU-less component %u - can't match L3!\n",
> + victim->level, victim_iter->comp_id);
> + return false;
> + }
> +
> + cpu = cpumask_any_and(&victim_iter->affinity, cpu_online_mask);
> + if (WARN_ON_ONCE(cpu >= nr_cpu_ids))
> + return false;
> +
> + cpumask_clear(tmp_cpumask);
> + err = find_l3_equivalent_bitmask(cpu, tmp_cpumask);
> + if (err) {
> + pr_debug("Failed to find L3's equivalent component to class %u component %u\n",
> + victim->level, victim_iter->comp_id);
> + return false;
> + }
> +
> + /* Any differing bits in the affinity mask? */
> + if (!cpumask_equal(tmp_cpumask, &victim_iter->affinity)) {
> + pr_debug("class %u component %u has Mismatched CPU mask with L3 equivalent\n"
> + "L3:%*pbl != victim:%*pbl\n",
> + victim->level, victim_iter->comp_id,
> + cpumask_pr_args(tmp_cpumask),
> + cpumask_pr_args(&victim_iter->affinity));
> +
> + return false;
> + }
> + }
> +
> + return true;
> +}
> +
> +/*
> + * Test if the traffic for a class matches that at egress from the L3. For
> + * MSC at memory controllers this is only possible if there is a single L3
> + * as otherwise the counters at the memory can include bandwidth from the
> + * non-local L3.
> + */
> +static bool traffic_matches_l3(struct mpam_class *class) {
An error reported by checkpatch.pl is as follows.
ERROR: open brace '{' following function definitions go on the next line
#826: FILE: drivers/resctrl/mpam_resctrl.c:826:
+static bool traffic_matches_l3(struct mpam_class *class) {
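For reference, a minimal sketch of the layout checkpatch expects, with the opening brace of the function body on its own line. The types and the demo_is_l3_cache() helper are hypothetical stand-ins for illustration only, they are not the driver's definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's class types. */
enum demo_class_type { DEMO_CLASS_CACHE, DEMO_CLASS_MEMORY };

struct demo_class {
	enum demo_class_type type;
	unsigned int level;
};

/* Kernel style: the function's opening brace goes on the next line. */
static bool demo_is_l3_cache(const struct demo_class *class)
{
	return class->type == DEMO_CLASS_CACHE && class->level == 3;
}
```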
Best regards,
Shaopeng TAN
> + int err, cpu;
> + cpumask_var_t __free(free_cpumask_var) tmp_cpumask = CPUMASK_VAR_NULL;
> +
> + lockdep_assert_cpus_held();
> +
> + if (class->type == MPAM_CLASS_CACHE && class->level == 3)
> + return true;
> +
> + if (class->type == MPAM_CLASS_CACHE && class->level != 3) {
> + pr_debug("class %u is a different cache from L3\n", class->level);
> + return false;
> + }
> +
> + if (class->type != MPAM_CLASS_MEMORY) {
> + pr_debug("class %u is neither of type cache or memory\n", class->level);
> + return false;
> + }
> +
> + if (!alloc_cpumask_var(&tmp_cpumask, GFP_KERNEL)) {
> + pr_debug("cpumask allocation failed\n");
> + return false;
> + }
> +
> + if (class->type != MPAM_CLASS_MEMORY) {
> + pr_debug("class %u is neither of type cache or memory\n",
> + class->level);
> + return false;
> + }
> +
> + cpu = cpumask_any_and(&class->affinity, cpu_online_mask);
> + err = find_l3_equivalent_bitmask(cpu, tmp_cpumask);
> + if (err) {
> + pr_debug("Failed to find L3 downstream to cpu %d\n", cpu);
> + return false;
> + }
> +
> + if (!cpumask_equal(tmp_cpumask, cpu_possible_mask)) {
> + pr_debug("There is more than one L3\n");
> + return false;
> + }
> +
> + /* Be strict; the traffic might stop in the intermediate cache. */
> + if (get_cpu_cacheinfo_id(cpu, 4) != -1) {
> + pr_debug("L3 isn't the last level of cache\n");
> + return false;
> + }
> +
> + return true;
> +}
> +
> /* Test whether we can export MPAM_CLASS_CACHE:{2,3}? */
> static void mpam_resctrl_pick_caches(void)
> {
> @@ -345,9 +519,69 @@ static void mpam_resctrl_pick_caches(void)
> }
> }
>
> +static void mpam_resctrl_pick_mba(void)
> +{
> + struct mpam_class *class, *candidate_class = NULL;
> + struct mpam_resctrl_res *res;
> +
> + lockdep_assert_cpus_held();
> +
> + guard(srcu)(&mpam_srcu);
> + list_for_each_entry_srcu(class, &mpam_classes, classes_list,
> + srcu_read_lock_held(&mpam_srcu)) {
> + struct mpam_props *cprops = &class->props;
> +
> + if (class->level != 3 && class->type == MPAM_CLASS_CACHE) {
> + pr_debug("class %u is a cache but not the L3\n", class->level);
> + continue;
> + }
> +
> + if (!class_has_usable_mba(cprops)) {
> + pr_debug("class %u has no bandwidth control\n",
> + class->level);
> + continue;
> + }
> +
> + if (!cpumask_equal(&class->affinity, cpu_possible_mask)) {
> + pr_debug("class %u has missing CPUs\n", class->level);
> + continue;
> + }
> +
> + if (!topology_matches_l3(class)) {
> + pr_debug("class %u topology doesn't match L3\n",
> + class->level);
> + continue;
> + }
> +
> + if (!traffic_matches_l3(class)) {
> + pr_debug("class %u traffic doesn't match L3 egress\n",
> + class->level);
> + continue;
> + }
> +
> + /*
> + * Pick a resource to be MBA that as close as possible to
> + * the L3. mbm_total counts the bandwidth leaving the L3
> + * cache and MBA should correspond as closely as possible
> + * for proper operation of mba_sc.
> + */
> + if (!candidate_class || class->level < candidate_class->level)
> + candidate_class = class;
> + }
> +
> + if (candidate_class) {
> + pr_debug("selected class %u to back MBA\n",
> + candidate_class->level);
> + res = &mpam_resctrl_controls[RDT_RESOURCE_MBA];
> + res->class = candidate_class;
> + exposed_alloc_capable = true;
> + }
> +}
> +
> static int mpam_resctrl_control_init(struct mpam_resctrl_res *res)
> {
> struct mpam_class *class = res->class;
> + struct mpam_props *cprops = &class->props;
> struct rdt_resource *r = &res->resctrl_res;
>
> switch (r->rid) {
> @@ -377,6 +611,19 @@ static int mpam_resctrl_control_init(struct mpam_resctrl_res *res)
> */
> r->cache.shareable_bits = resctrl_get_default_ctrl(r);
> break;
> + case RDT_RESOURCE_MBA:
> + r->alloc_capable = true;
> + r->schema_fmt = RESCTRL_SCHEMA_RANGE;
> + r->ctrl_scope = RESCTRL_L3_CACHE;
> +
> + r->membw.delay_linear = true;
> + r->membw.throttle_mode = THREAD_THROTTLE_UNDEFINED;
> + r->membw.min_bw = get_mba_min(cprops);
> + r->membw.max_bw = MAX_MBA_BW;
> + r->membw.bw_gran = get_mba_granularity(cprops);
> +
> + r->name = "MB";
> + break;
> default:
> return -EINVAL;
> }
> @@ -391,7 +638,17 @@ static int mpam_resctrl_pick_domain_id(int cpu, struct mpam_component *comp)
> if (class->type == MPAM_CLASS_CACHE)
> return comp->comp_id;
>
> - /* TODO: repaint domain ids to match the L3 domain ids */
> + if (topology_matches_l3(class)) {
> + /* Use the corresponding L3 component ID as the domain ID */
> + int id = get_cpu_cacheinfo_id(cpu, 3);
> +
> + /* Implies topology_matches_l3() made a mistake */
> + if (WARN_ON_ONCE(id == -1))
> + return comp->comp_id;
> +
> + return id;
> + }
> +
> /* Otherwise, expose the ID used by the firmware table code. */
> return comp->comp_id;
> }
> @@ -431,6 +688,12 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
> case RDT_RESOURCE_L3:
> configured_by = mpam_feat_cpor_part;
> break;
> + case RDT_RESOURCE_MBA:
> + if (mpam_has_feature(mpam_feat_mbw_max, cprops)) {
> + configured_by = mpam_feat_mbw_max;
> + break;
> + }
> + fallthrough;
> default:
> return resctrl_get_default_ctrl(r);
> }
> @@ -442,6 +705,8 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
> switch (configured_by) {
> case mpam_feat_cpor_part:
> return cfg->cpbm;
> + case mpam_feat_mbw_max:
> + return mbw_max_to_percent(cfg->mbw_max, cprops);
> default:
> return resctrl_get_default_ctrl(r);
> }
> @@ -486,6 +751,13 @@ int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
> cfg.cpbm = cfg_val;
> mpam_set_feature(mpam_feat_cpor_part, &cfg);
> break;
> + case RDT_RESOURCE_MBA:
> + if (mpam_has_feature(mpam_feat_mbw_max, cprops)) {
> + cfg.mbw_max = percent_to_mbw_max(cfg_val, cprops);
> + mpam_set_feature(mpam_feat_mbw_max, &cfg);
> + break;
> + }
> + fallthrough;
> default:
> return -EINVAL;
> }
> @@ -789,6 +1061,7 @@ int mpam_resctrl_setup(void)
>
> /* Find some classes to use for controls */
> mpam_resctrl_pick_caches();
> + mpam_resctrl_pick_mba();
>
> /* Initialise the resctrl structures from the classes */
> for_each_mpam_resctrl_control(res, rid) {
> --
> 2.43.0
>
>
>
* Re: [PATCH v4 13/41] arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation
2026-02-03 21:43 ` [PATCH v4 13/41] arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation Ben Horgan
@ 2026-02-10 22:57 ` Reinette Chatre
2026-02-11 15:36 ` Ben Horgan
0 siblings, 1 reply; 89+ messages in thread
From: Reinette Chatre @ 2026-02-10 22:57 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
Hi Ben,
On 2/3/26 1:43 PM, Ben Horgan wrote:
...
> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
> new file mode 100644
> index 000000000000..4c2248c92955
> --- /dev/null
> +++ b/drivers/resctrl/mpam_resctrl.c
> @@ -0,0 +1,343 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (C) 2025 Arm Ltd.
> +
> +#define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__
> +
> +#include <linux/arm_mpam.h>
> +#include <linux/cacheinfo.h>
> +#include <linux/cpu.h>
> +#include <linux/cpumask.h>
> +#include <linux/errno.h>
> +#include <linux/list.h>
> +#include <linux/printk.h>
> +#include <linux/rculist.h>
> +#include <linux/resctrl.h>
> +#include <linux/slab.h>
> +#include <linux/types.h>
> +
> +#include <asm/mpam.h>
> +
> +#include "mpam_internal.h"
> +
> +/*
> + * The classes we've picked to map to resctrl resources, wrapped
> + * in with their resctrl structure.
> + * Class pointer may be NULL.
> + */
> +static struct mpam_resctrl_res mpam_resctrl_controls[RDT_NUM_RESOURCES];
> +
> +#define for_each_mpam_resctrl_control(res, rid) \
> + for (rid = 0, res = &mpam_resctrl_controls[rid]; \
> + rid < RDT_NUM_RESOURCES; \
> + rid++, res = &mpam_resctrl_controls[rid])
> +
> +/* The lock for modifying resctrl's domain lists from cpuhp callbacks. */
> +static DEFINE_MUTEX(domain_list_lock);
> +
> +static bool exposed_alloc_capable;
> +static bool exposed_mon_capable;
> +
> +bool resctrl_arch_alloc_capable(void)
> +{
> + return exposed_alloc_capable;
> +}
> +
> +bool resctrl_arch_mon_capable(void)
> +{
> + return exposed_mon_capable;
> +}
> +
> +/*
> + * MSC may raise an error interrupt if it sees an out or range partid/pmg,
> + * and go on to truncate the value. Regardless of what the hardware supports,
> + * only the system wide safe value is safe to use.
> + */
> +u32 resctrl_arch_get_num_closid(struct rdt_resource *ignored)
> +{
> + return mpam_partid_max + 1;
> +}
> +
> +struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
> +{
> + if (l >= RDT_NUM_RESOURCES)
> + return NULL;
> +
> + return &mpam_resctrl_controls[l].resctrl_res;
> +}
> +
> +static int mpam_resctrl_control_init(struct mpam_resctrl_res *res)
> +{
> + /* TODO: initialise the resctrl resources */
> +
> + return 0;
> +}
> +
> +static int mpam_resctrl_pick_domain_id(int cpu, struct mpam_component *comp)
> +{
> + struct mpam_class *class = comp->class;
> +
> + if (class->type == MPAM_CLASS_CACHE)
> + return comp->comp_id;
> +
> + /* TODO: repaint domain ids to match the L3 domain ids */
> + /* Otherwise, expose the ID used by the firmware table code. */
> + return comp->comp_id;
> +}
> +
> +static void mpam_resctrl_domain_hdr_init(int cpu, struct mpam_component *comp,
> + struct rdt_domain_hdr *hdr)
> +{
> + lockdep_assert_cpus_held();
> +
> + INIT_LIST_HEAD(&hdr->list);
> + hdr->id = mpam_resctrl_pick_domain_id(cpu, comp);
> + cpumask_set_cpu(cpu, &hdr->cpu_mask);
One addition via the resctrl telemetry enabling is a new rdt_domain_hdr::rid
used for some additional checks on the header.
https://lore.kernel.org/all/20251217172121.12030-2-tony.luck@intel.com/
I may be missing something here though since the additional checking that this
new field supports should have complained loudly ... unless this was tested with
only the L3 resource that happens to be 0.
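A minimal sketch of why a missing rid could go unnoticed, assuming the header is zero-initialised and that the L3 resource id is 0. All types and helpers here (demo_domain_hdr, demo_hdr_is_valid() and friends) are simplified, hypothetical stand-ins and not the resctrl definitions.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical resource ids; only "L3 is 0" is taken from the thread. */
enum demo_res_level { DEMO_RESOURCE_L3 = 0, DEMO_RESOURCE_L2, DEMO_RESOURCE_MBA };

struct demo_domain_hdr {
	int id;
	enum demo_res_level rid;	/* resource this domain belongs to */
};

/* Mirrors the quoted hunk: the id is set but the resource id is not. */
static void demo_hdr_init_without_rid(struct demo_domain_hdr *hdr, int id)
{
	memset(hdr, 0, sizeof(*hdr));
	hdr->id = id;
}

/* Same initialisation, but also records which resource owns the domain. */
static void demo_hdr_init(struct demo_domain_hdr *hdr, int id,
			  enum demo_res_level rid)
{
	demo_hdr_init_without_rid(hdr, id);
	hdr->rid = rid;
}

/* Simplified form of a "does this header belong to this resource" check. */
static bool demo_hdr_is_valid(const struct demo_domain_hdr *hdr,
			      enum demo_res_level rid)
{
	return hdr->rid == rid;
}
```

In this model a header that never had rid set still passes the check for L3, and only fails once a non-zero resource such as MBA is checked.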
...
> +static struct mpam_resctrl_dom *
> +mpam_resctrl_get_domain_from_cpu(int cpu, struct mpam_resctrl_res *res)
> +{
> + struct mpam_resctrl_dom *dom;
> + struct rdt_resource *r = &res->resctrl_res;
> +
> + lockdep_assert_cpus_held();
> +
> + list_for_each_entry_rcu(dom, &r->ctrl_domains, resctrl_ctrl_dom.hdr.list) {
> + if (cpumask_test_cpu(cpu, &dom->ctrl_comp->affinity))
> + return dom;
> + }
I notice here that only the ctrl_domains list is searched ...
> +
> + return NULL;
> +}
> +
> +int mpam_resctrl_online_cpu(unsigned int cpu)
> +{
> + struct mpam_resctrl_res *res;
> + enum resctrl_res_level rid;
> +
> + guard(mutex)(&domain_list_lock);
> + for_each_mpam_resctrl_control(res, rid) {
> + struct mpam_resctrl_dom *dom;
> +
> + if (!res->class)
> + continue; // dummy_resource;
> +
> + dom = mpam_resctrl_get_domain_from_cpu(cpu, res);
Consider a system that only supports monitoring (exposed_alloc_capable == false,
exposed_mon_capable == true). Since mpam_resctrl_get_domain_from_cpu() only
searches control domains, it looks to me as though dom will always be NULL
here?
> + if (!dom) {
> + dom = mpam_resctrl_alloc_domain(cpu, res);
Would this (on hypothetical exposed_alloc_capable == false, exposed_mon_capable == true system)
then cause a new domain to be allocated for each CPU with a single CPU in its cpumask
instead of allocating a single monitoring domain with multiple CPUs in its mask?
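The scenario can be modelled with a small toy, sketched here under the assumption that the allocation path only links a new domain into the control list when the resource is alloc-capable (the real mpam_resctrl_alloc_domain() is not shown in this hunk). Every name below is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_DOMS 8

/* Hypothetical toy domain: one per component, with a CPU bitmask. */
struct demo_dom {
	int comp_id;
	unsigned int cpu_mask;
};

struct demo_res {
	bool alloc_capable, mon_capable;
	struct demo_dom *ctrl_list[DEMO_MAX_DOMS];
	struct demo_dom *mon_list[DEMO_MAX_DOMS];
	size_t nr_ctrl, nr_mon;
	struct demo_dom storage[DEMO_MAX_DOMS];
	size_t nr_alloc;
};

static struct demo_dom *demo_find(struct demo_dom **list, size_t n, int comp_id)
{
	for (size_t i = 0; i < n; i++)
		if (list[i]->comp_id == comp_id)
			return list[i];
	return NULL;
}

/* Search the control list, optionally falling back to the monitor list. */
static struct demo_dom *demo_lookup(struct demo_res *r, int comp_id,
				    bool search_mon)
{
	struct demo_dom *dom = demo_find(r->ctrl_list, r->nr_ctrl, comp_id);

	if (!dom && search_mon)
		dom = demo_find(r->mon_list, r->nr_mon, comp_id);
	return dom;
}

/*
 * Online nr_cpus CPUs that all share component 0 and return how many
 * domains ended up being allocated.
 */
static size_t demo_online_cpus(struct demo_res *r, unsigned int nr_cpus,
			       bool search_mon)
{
	if (nr_cpus > DEMO_MAX_DOMS)
		nr_cpus = DEMO_MAX_DOMS;

	for (unsigned int cpu = 0; cpu < nr_cpus; cpu++) {
		struct demo_dom *dom = demo_lookup(r, 0, search_mon);

		if (!dom) {
			dom = &r->storage[r->nr_alloc++];
			dom->comp_id = 0;
			dom->cpu_mask = 0;
			if (r->alloc_capable)
				r->ctrl_list[r->nr_ctrl++] = dom;
			if (r->mon_capable)
				r->mon_list[r->nr_mon++] = dom;
		}
		dom->cpu_mask |= 1u << cpu;
	}
	return r->nr_alloc;
}
```

In this model a monitor-only resource with a control-list-only lookup ends up with one single-CPU domain per CPU, while searching the monitor list as well yields one shared domain.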
> + } else {
> + if (exposed_alloc_capable) {
> + struct rdt_ctrl_domain *ctrl_d = &dom->resctrl_ctrl_dom;
> +
> + mpam_resctrl_online_domain_hdr(cpu, &ctrl_d->hdr);
> + }
> + if (exposed_mon_capable) {
> + struct rdt_l3_mon_domain *mon_d = &dom->resctrl_mon_dom;
> +
> + mpam_resctrl_online_domain_hdr(cpu, &mon_d->hdr);
> + }
> + }
> + if (IS_ERR(dom))
> + return PTR_ERR(dom);
> + }
> +
> + resctrl_online_cpu(cpu);
> +
> + return 0;
> +}
> +
> +void mpam_resctrl_offline_cpu(unsigned int cpu)
> +{
> + struct mpam_resctrl_res *res;
> + enum resctrl_res_level rid;
> +
> + resctrl_offline_cpu(cpu);
> +
> + guard(mutex)(&domain_list_lock);
> + for_each_mpam_resctrl_control(res, rid) {
> + struct mpam_resctrl_dom *dom;
> + struct rdt_l3_mon_domain *mon_d;
> + struct rdt_ctrl_domain *ctrl_d;
> + bool ctrl_dom_empty, mon_dom_empty;
> +
> + if (!res->class)
> + continue; // dummy resource
> +
> + dom = mpam_resctrl_get_domain_from_cpu(cpu, res);
> + if (WARN_ON_ONCE(!dom))
Similar to above ... it looks to me as though this WARN may always be
encountered on a system that only supports monitoring.
> + continue;
> +
> + if (exposed_alloc_capable) {
> + ctrl_d = &dom->resctrl_ctrl_dom;
> + ctrl_dom_empty = mpam_resctrl_offline_domain_hdr(cpu, &ctrl_d->hdr);
> + if (ctrl_dom_empty)
> + resctrl_offline_ctrl_domain(&res->resctrl_res, ctrl_d);
> + } else {
> + ctrl_dom_empty = true;
> + }
> +
> + if (exposed_mon_capable) {
> + mon_d = &dom->resctrl_mon_dom;
> + mon_dom_empty = mpam_resctrl_offline_domain_hdr(cpu, &mon_d->hdr);
> + if (mon_dom_empty)
> + resctrl_offline_mon_domain(&res->resctrl_res, &mon_d->hdr);
> + } else {
> + mon_dom_empty = true;
> + }
> +
> + if (ctrl_dom_empty && mon_dom_empty)
> + kfree(dom);
> + }
> +}
> +
Reinette
* Re: [PATCH v4 15/41] arm_mpam: resctrl: Pick the caches we will use as resctrl resources
2026-02-03 21:43 ` [PATCH v4 15/41] arm_mpam: resctrl: Pick the caches we will use as resctrl resources Ben Horgan
@ 2026-02-10 23:39 ` Reinette Chatre
2026-02-11 11:05 ` Ben Horgan
0 siblings, 1 reply; 89+ messages in thread
From: Reinette Chatre @ 2026-02-10 23:39 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
Hi Ben,
On 2/3/26 1:43 PM, Ben Horgan wrote:
...
> +
> static int mpam_resctrl_control_init(struct mpam_resctrl_res *res)
> {
> - /* TODO: initialise the resctrl resources */
> + struct mpam_class *class = res->class;
> + struct rdt_resource *r = &res->resctrl_res;
> +
> + switch (r->rid) {
> + case RDT_RESOURCE_L2:
> + case RDT_RESOURCE_L3:
> + r->alloc_capable = true;
> + r->schema_fmt = RESCTRL_SCHEMA_BITMAP;
> + r->cache.arch_has_sparse_bitmasks = true;
> +
> + r->cache.cbm_len = class->props.cpbm_wd;
> + /* mpam_devices will reject empty bitmaps */
> + r->cache.min_cbm_bits = 1;
> +
> + if (r->rid == RDT_RESOURCE_L2) {
> + r->name = "L2";
This code is fine but highlights that resctrl fs should not require the
arch to do this, since this name is used as part of the user interface.
> + r->ctrl_scope = RESCTRL_L2_CACHE;
> + } else {
> + r->name = "L3";
> + r->ctrl_scope = RESCTRL_L3_CACHE;
> + }
> +
> + /*
> + * Which bits are shared with other ...things...
> + * Unknown devices use partid-0 which uses all the bitmap
> + * fields. Until we configured the SMMU and GIC not to do this
> + * 'all the bits' is the correct answer here.
> + */
> + r->cache.shareable_bits = resctrl_get_default_ctrl(r);
I would like to recommend one style change to set r->alloc_capable as the final
setting. This is the setting that informs resctrl fs that this resource needs
attention. The reason I recommend this to be done last is if/when a future
change adds some configuration here that may fail then it should not fail with
r->alloc_capable as true while partially initialized.
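A minimal sketch of the suggested ordering, using a hypothetical demo_resource structure and a made-up fallible step (demo_configure()) to stand in for any future configuration that may fail. None of these names come from resctrl.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, cut-down resource description. */
struct demo_resource {
	bool alloc_capable;
	unsigned int cbm_len;
	unsigned int min_cbm_bits;
	const char *name;
};

/* Made-up configuration step that can fail. */
static int demo_configure(struct demo_resource *r, unsigned int cpbm_wd)
{
	if (cpbm_wd == 0 || cpbm_wd > 32)
		return -EINVAL;

	r->cbm_len = cpbm_wd;
	r->min_cbm_bits = 1;
	return 0;
}

static int demo_control_init(struct demo_resource *r, unsigned int cpbm_wd)
{
	int err;

	r->name = "L3";

	err = demo_configure(r, cpbm_wd);
	if (err)
		return err;	/* alloc_capable has not been set yet */

	/* Publish the resource only after everything else succeeded. */
	r->alloc_capable = true;
	return 0;
}
```

A failed initialisation then leaves alloc_capable false, so the resource is never advertised in a half-initialised state.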
> + break;
> + default:
> + return -EINVAL;
> + }
>
> return 0;
> }
> @@ -324,7 +409,8 @@ int mpam_resctrl_setup(void)
> res->resctrl_res.rid = rid;
> }
>
> - /* TODO: pick MPAM classes to map to resctrl resources */
> + /* Find some classes to use for controls */
> + mpam_resctrl_pick_caches();
>
> /* Initialise the resctrl structures from the classes */
> for_each_mpam_resctrl_control(res, rid) {
Reinette
* Re: [PATCH v4 20/41] arm_mpam: resctrl: Add CDP emulation
2026-02-09 15:36 ` Ben Horgan
@ 2026-02-11 10:50 ` Ben Horgan
0 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-11 10:50 UTC (permalink / raw)
To: Fenghua Yu
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, gshan, james.morse, jonathan.cameron, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
On Mon, Feb 09, 2026 at 03:36:32PM +0000, Ben Horgan wrote:
> Hi Fenghua,
>
> On 2/9/26 01:16, Fenghua Yu wrote:
> > Hi, Ben,
> >
> > On 2/3/26 13:43, Ben Horgan wrote:
> >> From: James Morse <james.morse@arm.com>
> >>
> >> Intel RDT's CDP feature allows the cache to use a different control value
> >> depending on whether the accesses was for instruction fetch or a data
> >> access. MPAM's equivalent feature is the other way up: the CPU assigns a
> >> different partid label to traffic depending on whether it was instruction
> >> fetch or a data access, which causes the cache to use a different control
> >> value based solely on the partid.
> >>
> >> MPAM can emulate CDP, with the side effect that the alternative partid is
> >> seen by all MSC, it can't be enabled per-MSC.
> >>
> >> Add the resctrl hooks to turn this on or off. Add the helpers that
> >> match a
> >> closid against a task, which need to be aware that the value written to
> >> hardware is not the same as the one resctrl is using.
> >>
> >> Update the 'arm64_mpam_global_default' variable the arch code uses during
> >> context switch to know when the per-cpu value should be used instead.
> >> Also,
> >> update these per-cpu values and sync the resulting mpam partid/pmg
> >> configuration to hardware.
> >>
> >> Awkwardly, the MB controls don't implement CDP. To emulate this, the MPAM
> >> equivalent needs programming twice by the resctrl glue, as resctrl
> >> expects
> >> the bandwidth controls to be applied independently for both data and
> >> instruction-fetch.
> >>
> >> Tested-by: Gavin Shan <gshan@redhat.com>
> >> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> >> Tested-by: Peter Newman <peternewman@google.com>
> >> CC: Dave Martin <Dave.Martin@arm.com>
> >> CC: Amit Singh Tomar <amitsinght@marvell.com>
> >> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> >> Signed-off-by: James Morse <james.morse@arm.com>
> >> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> >> ---
> >> Changes since rfc:
> >> Fail cdp initialisation if there is only one partid
> >> Correct data/code confusion
> >>
> >> Changes since v2:
> >> Don't include unused header
> >>
> >> Changes since v3:
> >> Update the per-cpu values and sync to h/w
> >> ---
> >> arch/arm64/include/asm/mpam.h | 1 +
> >> drivers/resctrl/mpam_resctrl.c | 117 +++++++++++++++++++++++++++++++++
> >> include/linux/arm_mpam.h | 2 +
> >> 3 files changed, 120 insertions(+)
> >>
> >> diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/
> >> mpam.h
> >> index 05aa71200f61..70d396e7b6da 100644
> >> --- a/arch/arm64/include/asm/mpam.h
> >> +++ b/arch/arm64/include/asm/mpam.h
> >> @@ -4,6 +4,7 @@
> >> #ifndef __ASM__MPAM_H
> >> #define __ASM__MPAM_H
> >> +#include <linux/arm_mpam.h>
> >> #include <linux/bitfield.h>
> >> #include <linux/jump_label.h>
> >> #include <linux/percpu.h>
> >> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/
> >> mpam_resctrl.c
> >> index cd52ca279651..12017264530a 100644
> >> --- a/drivers/resctrl/mpam_resctrl.c
> >> +++ b/drivers/resctrl/mpam_resctrl.c
> >> @@ -38,6 +38,10 @@ static DEFINE_MUTEX(domain_list_lock);
> >> static bool exposed_alloc_capable;
> >> static bool exposed_mon_capable;
> >> +/*
> >> + * MPAM emulates CDP by setting different PARTID in the I/D fields of
> >> MPAM0_EL1.
> >> + * This applies globally to all traffic the CPU generates.
> >> + */
> >> static bool cdp_enabled;
> >> bool resctrl_arch_alloc_capable(void)
> >> @@ -50,6 +54,72 @@ bool resctrl_arch_mon_capable(void)
> >> return exposed_mon_capable;
> >> }
> >> +bool resctrl_arch_get_cdp_enabled(enum resctrl_res_level rid)
> >> +{
> >> + switch (rid) {
> >> + case RDT_RESOURCE_L2:
> >> + case RDT_RESOURCE_L3:
> >> + return cdp_enabled;
> >> + case RDT_RESOURCE_MBA:
> >> + default:
> >> + /*
> >> + * x86's MBA control doesn't support CDP, so user-space doesn't
> >
> > s/x86's/ARM's/
>
> In CPUs supporting MPAM the instruction/data distinction is made at the
> CPU so doesn't depend on the specific control. The point this comment is
> trying to make is that as x86 doesn't support CDP on MBA, resctrl, which
> was initially x86 specific, expected CDP not to be supported on MBA and
> hence MPAM/ARM64 has to match this behaviour. Therefore, the MPAM driver
> doesn't support CDP on MBA either. In essence, the MPAM driver emulates
> the x86 CDP behaviour. Having said that, this comment relies on the
> reader knowing this historical context, and so I'll update it to not
> reference x86 and just mention that it is the expectation of the resctrl
> interface.
Looking a bit deeper into CDP I notice a couple of issues here that will require
a code change. The resctrl mount options for L2 and L3 CDP are separate and so
they need to be considered separately here too. Secondly, as 'CDP' in MPAM is
controlled at the CPU interface rather than the component it needs to be hidden
for the resources that it's not enabled for. (In MPAM, when 'CDP' is enabled
PARTID_D and PARTID_I are set to different values in MPAMx_ELy and they take the
same value when disabled.) As mbw_max is a per-partid maximum, setting the same
configuration for 2 partids is not the same as setting the value for a single
partid. Furthermore, there is no way to pretend convincingly that the 2 partids
are a single partid. Hence, in MPAM the MBA resource can't hide CDP and needs
to be disabled when MBA is enabled.
In the future, a resctrl mount option to enable CDP for the MBA resource could
be considered. Some more thought is needed here though as it's not obvious how
this would work with the software controller and would likely not work on other
architectures.
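The per-partid point can be illustrated with a small sketch. It assumes, for illustration only, that with CDP each closid maps to a data/code pair of partids (2 * closid and 2 * closid + 1) and that the same mbw_max percentage is programmed for both. The helper names are hypothetical.

```c
#include <stdint.h>

enum demo_conf_type { DEMO_CDP_NONE, DEMO_CDP_CODE, DEMO_CDP_DATA };

/* Assumed mapping: with CDP each closid owns a data/code partid pair. */
static uint32_t demo_partid(uint32_t closid, enum demo_conf_type type)
{
	switch (type) {
	case DEMO_CDP_CODE:
		return closid * 2 + 1;
	case DEMO_CDP_DATA:
		return closid * 2;
	default:
		return closid;
	}
}

/*
 * Worst-case combined bandwidth (in percent) a group could consume if
 * the same per-partid cap is applied to every partid the group uses.
 */
static uint32_t demo_worst_case_bw(uint32_t cap_percent, int cdp_enabled)
{
	uint32_t nr_partids = cdp_enabled ? 2 : 1;
	uint32_t total = cap_percent * nr_partids;

	return total > 100 ? 100 : total;
}
```

Under these assumptions a 30% cap bounds the group at 30% without CDP, but instruction and data traffic can each reach 30% with CDP, so the group as a whole is only bounded at 60%.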
>
> >
> > Thanks.
> >
> > -Fenghua
> >
> > [SNIP]
>
> Thanks,
>
> Ben
>
>
* Re: [PATCH v4 15/41] arm_mpam: resctrl: Pick the caches we will use as resctrl resources
2026-02-10 23:39 ` Reinette Chatre
@ 2026-02-11 11:05 ` Ben Horgan
2026-02-12 16:22 ` Reinette Chatre
0 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-11 11:05 UTC (permalink / raw)
To: Reinette Chatre
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
Hi Reinette,
On Tue, Feb 10, 2026 at 03:39:51PM -0800, Reinette Chatre wrote:
> Hi Ben,
>
> On 2/3/26 1:43 PM, Ben Horgan wrote:
> ...
> > +
> > static int mpam_resctrl_control_init(struct mpam_resctrl_res *res)
> > {
> > - /* TODO: initialise the resctrl resources */
> > + struct mpam_class *class = res->class;
> > + struct rdt_resource *r = &res->resctrl_res;
> > +
> > + switch (r->rid) {
> > + case RDT_RESOURCE_L2:
> > + case RDT_RESOURCE_L3:
> > + r->alloc_capable = true;
> > + r->schema_fmt = RESCTRL_SCHEMA_BITMAP;
> > + r->cache.arch_has_sparse_bitmasks = true;
> > +
> > + r->cache.cbm_len = class->props.cpbm_wd;
> > + /* mpam_devices will reject empty bitmaps */
> > + r->cache.min_cbm_bits = 1;
> > +
> > + if (r->rid == RDT_RESOURCE_L2) {
> > + r->name = "L2";
>
> This code is fine but highlights that resctrl fs should not let the
> arch need to do this since this name is used as part of user interface.
Yes, not ideal but I don't think it's urgent to tidy this up.
>
> > + r->ctrl_scope = RESCTRL_L2_CACHE;
> > + } else {
> > + r->name = "L3";
> > + r->ctrl_scope = RESCTRL_L3_CACHE;
> > + }
> > +
> > + /*
> > + * Which bits are shared with other ...things...
> > + * Unknown devices use partid-0 which uses all the bitmap
> > + * fields. Until we configured the SMMU and GIC not to do this
> > + * 'all the bits' is the correct answer here.
> > + */
> > + r->cache.shareable_bits = resctrl_get_default_ctrl(r);
>
> I would like to recommend one style change to set r->alloc_capable as the final
> setting. This is the setting that informs resctrl fs that this resource needs
> attention. The reason I recommend this to be done last is if/when a future
> change adds some configuration here that may fail then it should not fail with
> r->alloc_capable as true while partially initialized.
Makes sense. This will make the code more robust.
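A minimal user-space sketch of that ordering, not the actual driver change. The
struct is a simplified stand-in for struct rdt_resource, and
sketch_extra_setup() is a hypothetical configuration step that can fail. Every
other field is filled in first, and alloc_capable is only set once nothing
else can go wrong:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the cache-related fields of struct rdt_resource. */
struct sketch_resource {
	bool alloc_capable;
	unsigned int cbm_len;
	unsigned int min_cbm_bits;
	const char *name;
};

/* Hypothetical later configuration step that may fail. */
static int sketch_extra_setup(unsigned int cbm_len)
{
	return cbm_len ? 0 : -EINVAL;
}

static int sketch_control_init(struct sketch_resource *r, unsigned int cpbm_wd)
{
	int err;

	assert(r != NULL);

	r->cbm_len = cpbm_wd;
	r->min_cbm_bits = 1;
	r->name = "L3";

	err = sketch_extra_setup(r->cbm_len);
	if (err)
		return err;	/* alloc_capable is still false here */

	/* Publish the resource only once it is fully initialised. */
	r->alloc_capable = true;
	return 0;
}
```

With this shape, a failing setup step returns an error while the resource still
reads as not alloc capable, so a caller never sees a half-initialised resource
flagged as usable.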
>
> > + break;
> > + default:
> > + return -EINVAL;
> > + }
> >
> > return 0;
> > }
> > @@ -324,7 +409,8 @@ int mpam_resctrl_setup(void)
> > res->resctrl_res.rid = rid;
> > }
> >
> > - /* TODO: pick MPAM classes to map to resctrl resources */
> > + /* Find some classes to use for controls */
> > + mpam_resctrl_pick_caches();
> >
> > /* Initialise the resctrl structures from the classes */
> > for_each_mpam_resctrl_control(res, rid) {
>
> Reinette
Thanks,
Ben
* Re: [PATCH v4 13/41] arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation
2026-02-10 22:57 ` Reinette Chatre
@ 2026-02-11 15:36 ` Ben Horgan
0 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-11 15:36 UTC (permalink / raw)
To: Reinette Chatre
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
Hi Reinette,
On Tue, Feb 10, 2026 at 02:57:29PM -0800, Reinette Chatre wrote:
> Hi Ben,
>
> On 2/3/26 1:43 PM, Ben Horgan wrote:
> ...
> > diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
> > new file mode 100644
> > index 000000000000..4c2248c92955
> > --- /dev/null
> > +++ b/drivers/resctrl/mpam_resctrl.c
> > @@ -0,0 +1,343 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +// Copyright (C) 2025 Arm Ltd.
> > +
> > +#define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__
> > +
> > +#include <linux/arm_mpam.h>
> > +#include <linux/cacheinfo.h>
> > +#include <linux/cpu.h>
> > +#include <linux/cpumask.h>
> > +#include <linux/errno.h>
> > +#include <linux/list.h>
> > +#include <linux/printk.h>
> > +#include <linux/rculist.h>
> > +#include <linux/resctrl.h>
> > +#include <linux/slab.h>
> > +#include <linux/types.h>
> > +
> > +#include <asm/mpam.h>
> > +
> > +#include "mpam_internal.h"
> > +
> > +/*
> > + * The classes we've picked to map to resctrl resources, wrapped
> > + * in with their resctrl structure.
> > + * Class pointer may be NULL.
> > + */
> > +static struct mpam_resctrl_res mpam_resctrl_controls[RDT_NUM_RESOURCES];
> > +
> > +#define for_each_mpam_resctrl_control(res, rid) \
> > + for (rid = 0, res = &mpam_resctrl_controls[rid]; \
> > + rid < RDT_NUM_RESOURCES; \
> > + rid++, res = &mpam_resctrl_controls[rid])
> > +
> > +/* The lock for modifying resctrl's domain lists from cpuhp callbacks. */
> > +static DEFINE_MUTEX(domain_list_lock);
> > +
> > +static bool exposed_alloc_capable;
> > +static bool exposed_mon_capable;
> > +
> > +bool resctrl_arch_alloc_capable(void)
> > +{
> > + return exposed_alloc_capable;
> > +}
> > +
> > +bool resctrl_arch_mon_capable(void)
> > +{
> > + return exposed_mon_capable;
> > +}
> > +
> > +/*
> > + * MSC may raise an error interrupt if it sees an out or range partid/pmg,
> > + * and go on to truncate the value. Regardless of what the hardware supports,
> > + * only the system wide safe value is safe to use.
> > + */
> > +u32 resctrl_arch_get_num_closid(struct rdt_resource *ignored)
> > +{
> > + return mpam_partid_max + 1;
> > +}
> > +
> > +struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
> > +{
> > + if (l >= RDT_NUM_RESOURCES)
> > + return NULL;
> > +
> > + return &mpam_resctrl_controls[l].resctrl_res;
> > +}
> > +
> > +static int mpam_resctrl_control_init(struct mpam_resctrl_res *res)
> > +{
> > + /* TODO: initialise the resctrl resources */
> > +
> > + return 0;
> > +}
> > +
> > +static int mpam_resctrl_pick_domain_id(int cpu, struct mpam_component *comp)
> > +{
> > + struct mpam_class *class = comp->class;
> > +
> > + if (class->type == MPAM_CLASS_CACHE)
> > + return comp->comp_id;
> > +
> > + /* TODO: repaint domain ids to match the L3 domain ids */
> > + /* Otherwise, expose the ID used by the firmware table code. */
> > + return comp->comp_id;
> > +}
> > +
> > +static void mpam_resctrl_domain_hdr_init(int cpu, struct mpam_component *comp,
> > + struct rdt_domain_hdr *hdr)
> > +{
> > + lockdep_assert_cpus_held();
> > +
> > + INIT_LIST_HEAD(&hdr->list);
> > + hdr->id = mpam_resctrl_pick_domain_id(cpu, comp);
> > + cpumask_set_cpu(cpu, &hdr->cpu_mask);
>
> One addition via the resctrl telemetry enabling is a new rdt_domain_hdr::rid
> used for some additional checks on the header.
> https://lore.kernel.org/all/20251217172121.12030-2-tony.luck@intel.com/
> I may be missing something here though since the additional checking that this
> new field supports should have complained loudly ... unless this was tested with
> only the L3 resource that happens to be 0.
Hmmm, thanks for pointing this out. I think I've been getting away with this
because, in the resctrl common code, the 'rid' check only happens for monitors,
which we do keep on the L3 resource, and the control check is only in the x86
code. I'll look into this a bit more and update for this change.
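To illustrate why a missing 'rid' initialisation could go unnoticed, here is a
small user-space model. The enum, struct and helper names are stand-ins and not
the resctrl definitions. It assumes, as the comment above suggests, that the L3
resource has the value 0, so a zero-initialised header happens to match it:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for enum resctrl_res_level; L3 is assumed to be 0. */
enum sketch_res_level { SKETCH_RES_L3, SKETCH_RES_L2, SKETCH_RES_MBA };

/* Stand-in for the parts of struct rdt_domain_hdr that matter here. */
struct sketch_domain_hdr {
	int id;
	enum sketch_res_level rid;
};

/* Model of a header sanity check that compares the stored rid. */
static bool sketch_hdr_is_valid(const struct sketch_domain_hdr *hdr,
				enum sketch_res_level rid)
{
	return hdr != NULL && hdr->rid == rid;
}

/* Header initialisation that records the owning resource explicitly. */
static void sketch_hdr_init(struct sketch_domain_hdr *hdr, int id,
			    enum sketch_res_level rid)
{
	assert(hdr != NULL);
	hdr->id = id;
	hdr->rid = rid;
}
```

In this model a zeroed header passes the check for L3 and fails it for L2 until
sketch_hdr_init() is called with the right resource.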
>
> ...
>
> > +static struct mpam_resctrl_dom *
> > +mpam_resctrl_get_domain_from_cpu(int cpu, struct mpam_resctrl_res *res)
> > +{
> > + struct mpam_resctrl_dom *dom;
> > + struct rdt_resource *r = &res->resctrl_res;
> > +
> > + lockdep_assert_cpus_held();
> > +
> > + list_for_each_entry_rcu(dom, &r->ctrl_domains, resctrl_ctrl_dom.hdr.list) {
> > + if (cpumask_test_cpu(cpu, &dom->ctrl_comp->affinity))
> > + return dom;
> > + }
>
> I notice here that that only the ctrl_domains list is searched ...
The monitor searching is added in:
[PATCH v4 28/41] arm_mpam: resctrl: Pick classes for use as mbm counters
This does seem like a bad code split, even more so as the csu counters are
added before the mbm counters.
>
> > +
> > + return NULL;
> > +}
> > +
> > +int mpam_resctrl_online_cpu(unsigned int cpu)
> > +{
> > + struct mpam_resctrl_res *res;
> > + enum resctrl_res_level rid;
> > +
> > + guard(mutex)(&domain_list_lock);
> > + for_each_mpam_resctrl_control(res, rid) {
> > + struct mpam_resctrl_dom *dom;
> > +
> > + if (!res->class)
> > + continue; // dummy_resource;
> > +
> > + dom = mpam_resctrl_get_domain_from_cpu(cpu, res);
>
> Consider a system that only supports monitoring (exposed_alloc_capable == false,
> exposed_mon_capable == true). Since mpam_resctrl_get_domain_from_cpu() only
> searches control domains then it looks to me as though dom will always be false
> here?
>
> > + if (!dom) {
> > + dom = mpam_resctrl_alloc_domain(cpu, res);
>
> Would this (on hypothetical exposed_alloc_capable == false, exposed_mon_capable == true system)
> then cause a new domain to be allocated for each CPU with a single CPU in its cpumask
> instead of allocating a single monitoring domain with multiple CPUs in its mask?
>
> > + } else {
> > + if (exposed_alloc_capable) {
> > + struct rdt_ctrl_domain *ctrl_d = &dom->resctrl_ctrl_dom;
> > +
> > + mpam_resctrl_online_domain_hdr(cpu, &ctrl_d->hdr);
> > + }
> > + if (exposed_mon_capable) {
> > + struct rdt_l3_mon_domain *mon_d = &dom->resctrl_mon_dom;
> > +
> > + mpam_resctrl_online_domain_hdr(cpu, &mon_d->hdr);
> > + }
> > + }
> > + if (IS_ERR(dom))
> > + return PTR_ERR(dom);
> > + }
> > +
> > + resctrl_online_cpu(cpu);
> > +
> > + return 0;
> > +}
> > +
> > +void mpam_resctrl_offline_cpu(unsigned int cpu)
> > +{
> > + struct mpam_resctrl_res *res;
> > + enum resctrl_res_level rid;
> > +
> > + resctrl_offline_cpu(cpu);
> > +
> > + guard(mutex)(&domain_list_lock);
> > + for_each_mpam_resctrl_control(res, rid) {
> > + struct mpam_resctrl_dom *dom;
> > + struct rdt_l3_mon_domain *mon_d;
> > + struct rdt_ctrl_domain *ctrl_d;
> > + bool ctrl_dom_empty, mon_dom_empty;
> > +
> > + if (!res->class)
> > + continue; // dummy resource
> > +
> > + dom = mpam_resctrl_get_domain_from_cpu(cpu, res);
> > + if (WARN_ON_ONCE(!dom))
>
> Similar to above ... it looks to me as though this WARN may always be
> encountered on a system that only supports monitoring.
I think monitor-only systems (exposed_alloc_capable == false,
exposed_mon_capable == true) are handled properly from
[PATCH v4 28/41] arm_mpam: resctrl: Pick classes for use as mbm counters
onwards in the series.
I'll look into making the division into commits better.
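The monitor-only concern can be modelled in user space. This is only a sketch
with made-up types (sketch_dom, sketch_res) and bitmask CPU sets, not the
driver's data structures. It shows how a lookup that only sees control domains
would allocate one domain per CPU on a monitor-only system, while a lookup that
also sees monitor domains allocates one shared domain:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_DOMS 8

struct sketch_dom {
	unsigned long affinity;	/* CPUs that share this component */
	unsigned long cpu_mask;	/* CPUs currently online in this domain */
	bool on_ctrl_list;
	bool on_mon_list;
};

struct sketch_res {
	struct sketch_dom doms[SKETCH_MAX_DOMS];
	size_t nr_doms;
};

/* Find the domain covering @cpu, optionally also searching monitor domains. */
static struct sketch_dom *sketch_find_dom(struct sketch_res *res, int cpu,
					  bool search_mon_list)
{
	for (size_t i = 0; i < res->nr_doms; i++) {
		struct sketch_dom *d = &res->doms[i];
		bool visible = d->on_ctrl_list ||
			       (search_mon_list && d->on_mon_list);

		if (visible && (d->affinity & (1UL << cpu)))
			return d;
	}
	return NULL;
}

/* Bring CPUs 0..nr_cpus-1 online; return how many domains were allocated. */
static size_t sketch_online_cpus(struct sketch_res *res, int nr_cpus,
				 unsigned long affinity, bool alloc_capable,
				 bool mon_capable, bool search_mon_list)
{
	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		struct sketch_dom *d = sketch_find_dom(res, cpu, search_mon_list);

		if (!d) {
			assert(res->nr_doms < SKETCH_MAX_DOMS);
			d = &res->doms[res->nr_doms++];
			d->affinity = affinity;
			d->cpu_mask = 0;
			d->on_ctrl_list = alloc_capable;
			d->on_mon_list = mon_capable;
		}
		d->cpu_mask |= 1UL << cpu;
	}
	return res->nr_doms;
}
```

For four CPUs sharing one component, the control-only search ends up with four
single-CPU domains in this model, while searching both lists ends up with one
domain whose mask covers all four CPUs.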
>
> > + continue;
> > +
> > + if (exposed_alloc_capable) {
> > + ctrl_d = &dom->resctrl_ctrl_dom;
> > + ctrl_dom_empty = mpam_resctrl_offline_domain_hdr(cpu, &ctrl_d->hdr);
> > + if (ctrl_dom_empty)
> > + resctrl_offline_ctrl_domain(&res->resctrl_res, ctrl_d);
> > + } else {
> > + ctrl_dom_empty = true;
> > + }
> > +
> > + if (exposed_mon_capable) {
> > + mon_d = &dom->resctrl_mon_dom;
> > + mon_dom_empty = mpam_resctrl_offline_domain_hdr(cpu, &mon_d->hdr);
> > + if (mon_dom_empty)
> > + resctrl_offline_mon_domain(&res->resctrl_res, &mon_d->hdr);
> > + } else {
> > + mon_dom_empty = true;
> > + }
> > +
> > + if (ctrl_dom_empty && mon_dom_empty)
> > + kfree(dom);
> > + }
> > +}
> > +
>
> Reinette
>
>
Thanks,
Ben
* Re: [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code
2026-02-09 10:04 ` Ben Horgan
@ 2026-02-12 14:51 ` Ben Horgan
2026-02-13 7:18 ` Shaopeng Tan (Fujitsu)
0 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-12 14:51 UTC (permalink / raw)
To: Shaopeng Tan (Fujitsu)
Cc: amitsinght@marvell.com, baisheng.gao@unisoc.com,
baolin.wang@linux.alibaba.com, carl@os.amperecomputing.com,
dave.martin@arm.com, david@kernel.org, dfustini@baylibre.com,
fenghuay@nvidia.com, gshan@redhat.com, james.morse@arm.com,
jonathan.cameron@huawei.com, kobak@nvidia.com,
lcherian@marvell.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, peternewman@google.com,
punit.agrawal@oss.qualcomm.com, quic_jiles@quicinc.com,
reinette.chatre@intel.com, rohit.mathew@arm.com,
scott@os.amperecomputing.com, sdonthineni@nvidia.com,
xhao@linux.alibaba.com, catalin.marinas@arm.com, will@kernel.org,
corbet@lwn.net, maz@kernel.org, oupton@kernel.org,
joey.gouly@arm.com, suzuki.poulose@arm.com,
kvmarm@lists.linux.dev, zengheng4@huawei.com,
linux-doc@vger.kernel.org
Hi Shaopeng,
On 2/9/26 10:04, Ben Horgan wrote:
> Hi Shaopeng,
>
> On 2/9/26 08:25, Shaopeng Tan (Fujitsu) wrote:
>> Hello Ben,
>>
>>> This new version of the mpam missing pieces series has a few significant
>>> changes in the mpam driver part of the series. The heuristics for deciding
>>> if features should be exposed are tightened. This is to fix some
>>> inaccuracies and avoid overcommitting before needed - shout if this changes
>>> anything on your platform. The final patch adds documentation which
>>> explains which features you should expect. The ABMC emulation is dropped
>>> for the moment as it requires resctrl changes to support for MPAM without
>>> breaking the abi. The default 5% gap for min_bw is dropped in favour of a
>>> simple default (kept for grace). The series is based on x86/resctrl [1] as
>>> resctrl has telemetry patches queued which change the arch interface.
>>
>> Could you please elaborate on why fs/resctrl changes are required to support only the counter assignment part of ABMC?
>> Currently, many SoC chips have an insufficient number of memory bandwidth monitors.
>
> Sure. When the counter assignment mode is 'mbm_event; resctrl assumes the mbm events are configurable.
> The 'event_filter' files at
> info/L3_MON/event_configs/<event>/event_filter
> are used to display and set this configuration.
>
> In MPAM event configuration is not supported and so showing a read/writable 'event_filter' file is
> misleading to the user and needs to be hidden for MPAM support.
>
> Just to give you a flavour of the change, here's a hack to show the correct thing for MPAM:
>
> --- a/fs/resctrl/rdtgroup.c
> +++ b/fs/resctrl/rdtgroup.c
> @@ -2338,6 +2338,9 @@ static int resctrl_mkdir_event_configs(struct rdt_resource *r, struct kernfs_nod
> if (ret)
> goto out;
>
> + if (!resctrl_arch_is_evt_configurable(mevt->evtid))
> + continue;
> +
>
>
>> We would be grateful if you could support the counter assignment part of ABMC.
>
> It is not a big change in resctrl but I thought it best to not gate the rest of this series on
> an additional change in another subsystem. I am current looking into this and hope to get the
> patches on the list early in the next cycle.
There is another small change that will be required in resctrl to
support ABMC with MPAM. Counter assignment means it can't be guaranteed
that each CTRL_MON group has a dedicated memory bandwidth counter, so the
software controller (mbaMBps mount option) won't work, but the mount
won't fail. AMD doesn't hit this problem as its MBA is non-linear. I was
hoping to repurpose the delay_linear flag to just mean that the software
controller isn't supported, but resctrl displays this information to the
user in the 'delay_linear' file and we don't want to mislead.
>
>>
>> Best regards,
>> Shaopeng TAN
>>
>>
> Thanks,
>
> Ben
>
>
Thanks,
Ben
* Re: [PATCH v4 15/41] arm_mpam: resctrl: Pick the caches we will use as resctrl resources
2026-02-11 11:05 ` Ben Horgan
@ 2026-02-12 16:22 ` Reinette Chatre
0 siblings, 0 replies; 89+ messages in thread
From: Reinette Chatre @ 2026-02-12 16:22 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc,
Shaopeng Tan
Hi Ben,
On 2/11/26 3:05 AM, Ben Horgan wrote:
> Hi Reinette,
>
> On Tue, Feb 10, 2026 at 03:39:51PM -0800, Reinette Chatre wrote:
>> Hi Ben,
>>
>> On 2/3/26 1:43 PM, Ben Horgan wrote:
>> ...
>>> +
>>> static int mpam_resctrl_control_init(struct mpam_resctrl_res *res)
>>> {
>>> - /* TODO: initialise the resctrl resources */
>>> + struct mpam_class *class = res->class;
>>> + struct rdt_resource *r = &res->resctrl_res;
>>> +
>>> + switch (r->rid) {
>>> + case RDT_RESOURCE_L2:
>>> + case RDT_RESOURCE_L3:
>>> + r->alloc_capable = true;
>>> + r->schema_fmt = RESCTRL_SCHEMA_BITMAP;
>>> + r->cache.arch_has_sparse_bitmasks = true;
>>> +
>>> + r->cache.cbm_len = class->props.cpbm_wd;
>>> + /* mpam_devices will reject empty bitmaps */
>>> + r->cache.min_cbm_bits = 1;
>>> +
>>> + if (r->rid == RDT_RESOURCE_L2) {
>>> + r->name = "L2";
>>
>> This code is fine but highlights that resctrl fs should not let the
>> arch need to do this since this name is used as part of user interface.
>
> Yes, not ideal but I don't think it's urgent to tidy this up.
I agree. This code is fine. I just shared an observation.
Reinette
* Re: [PATCH v4 16/41] arm_mpam: resctrl: Implement resctrl_arch_reset_all_ctrls()
2026-02-03 21:43 ` [PATCH v4 16/41] arm_mpam: resctrl: Implement resctrl_arch_reset_all_ctrls() Ben Horgan
@ 2026-02-13 3:32 ` Zeng Heng
0 siblings, 0 replies; 89+ messages in thread
From: Zeng Heng @ 2026-02-13 3:32 UTC (permalink / raw)
To: Ben Horgan, james.morse
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, jonathan.cameron, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, linux-doc, Shaopeng Tan
On 2026/2/4 5:43, Ben Horgan wrote:
> From: James Morse <james.morse@arm.com>
>
> We already have a helper for resetting an mpam class and component. Hook
> it up to resctrl_arch_reset_all_ctrls() and the domain offline path.
>
> Tested-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Cc: Zeng Heng <zengheng4@huawei.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> ---
> Changes since v2:
> Don't expose unlocked reset
>
> Changes since v3:
> Don't use or expose mpam_reset_component_locked()
I have reviewed this v4 patch and confirm the removal of the redundant
mpam_reset_component_locked() calls.
+ Reviewed-by: Zeng Heng <zengheng4@huawei.com>
However, I noticed another existing issue regarding how the in_reset_state
flag is set, which will be discussed in a separate patch.
* Re: [PATCH v4 38/41] arm_mpam: Add workaround for T241-MPAM-4
2026-02-03 21:43 ` [PATCH v4 38/41] arm_mpam: Add workaround for T241-MPAM-4 Ben Horgan
@ 2026-02-13 7:02 ` Shaopeng Tan (Fujitsu)
2026-02-14 1:29 ` Zeng Heng
0 siblings, 1 reply; 89+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2026-02-13 7:02 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght@marvell.com, baisheng.gao@unisoc.com,
baolin.wang@linux.alibaba.com, carl@os.amperecomputing.com,
dave.martin@arm.com, david@kernel.org, dfustini@baylibre.com,
fenghuay@nvidia.com, gshan@redhat.com, james.morse@arm.com,
jonathan.cameron@huawei.com, kobak@nvidia.com,
lcherian@marvell.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, peternewman@google.com,
punit.agrawal@oss.qualcomm.com, quic_jiles@quicinc.com,
reinette.chatre@intel.com, rohit.mathew@arm.com,
scott@os.amperecomputing.com, sdonthineni@nvidia.com,
xhao@linux.alibaba.com, catalin.marinas@arm.com, will@kernel.org,
corbet@lwn.net, maz@kernel.org, oupton@kernel.org,
joey.gouly@arm.com, suzuki.poulose@arm.com,
kvmarm@lists.linux.dev, zengheng4@huawei.com,
linux-doc@vger.kernel.org
Hello Ben, Fenghua
> From: Shanker Donthineni <sdonthineni@nvidia.com>
>
> In the T241 implementation of memory-bandwidth partitioning, in the absence
> of contention for bandwidth, the minimum bandwidth setting can affect the
> amount of achieved bandwidth. Specifically, the achieved bandwidth in the
> absence of contention can settle to any value between the values of
> MPAMCFG_MBW_MIN and MPAMCFG_MBW_MAX. Also, if MPAMCFG_MBW_MIN is set
> zero (below 0.78125%), once a core enters a throttled state, it will never
> leave that state.
>
> The first issue is not a concern if the MPAM software allows to program
> MPAMCFG_MBW_MIN through the sysfs interface. This patch ensures program
> MBW_MIN=1 (0.78125%) whenever MPAMCFG_MBW_MIN=0 is programmed.
>
> In the scenario where the resctrl doesn't support the MBW_MIN interface via
> sysfs, to achieve bandwidth closer to MBW_MAX in the absence of contention,
> software should configure a relatively narrow gap between MBW_MIN and
> MBW_MAX. The recommendation is to use a 5% gap to mitigate the problem.
I have a question regarding the MBW_MIN values.
Are there any cases where the sum of all MBW_MIN values from different control groups exceeds 100%?
And if so, is it acceptable for it to exceed 100%?
Best regards,
Shaopeng TAN
> Clear the feature MBW_MIN feature from the class to ensure we don't
> accidentally change behaviour when resctrl adds support for a MBW_MIN
> interface.
>
> Tested-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> ---
> [ morse: Added as second quirk, adapted to use the new intermediate values
> in mpam_extend_config() ]
>
> Changes since rfc:
> MPAM_IIDR_NVIDIA_T421 -> MPAM_IIDR_NVIDIA_T241
> Handling when reset_mbw_min is set
>
> Changes since v3:
> Move the 5% gap policy back here
> Clear mbw_min feature in class
> ---
> Documentation/arch/arm64/silicon-errata.rst | 2 +
> drivers/resctrl/mpam_devices.c | 50 +++++++++++++++++++--
> drivers/resctrl/mpam_internal.h | 1 +
> 3 files changed, 50 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/arch/arm64/silicon-errata.rst b/Documentation/arch/arm64/silicon-errata.rst
> index 4e86b85fe3d6..b18bc704d4a1 100644
> --- a/Documentation/arch/arm64/silicon-errata.rst
> +++ b/Documentation/arch/arm64/silicon-errata.rst
> @@ -248,6 +248,8 @@ stable kernels.
> +----------------+-----------------+-----------------+-----------------------------+
> | NVIDIA | T241 MPAM | T241-MPAM-1 | N/A |
> +----------------+-----------------+-----------------+-----------------------------+
> +| NVIDIA | T241 MPAM | T241-MPAM-4 | N/A |
> ++----------------+-----------------+-----------------+-----------------------------+
> +----------------+-----------------+-----------------+-----------------------------+
> | Freescale/NXP | LS2080A/LS1043A | A-008585 | FSL_ERRATUM_A008585 |
> +----------------+-----------------+-----------------+-----------------------------+
> diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
> index 76c8cfcef3c0..37a105d95d66 100644
> --- a/drivers/resctrl/mpam_devices.c
> +++ b/drivers/resctrl/mpam_devices.c
> @@ -679,6 +679,12 @@ static const struct mpam_quirk mpam_quirks[] = {
> .iidr_mask = MPAM_IIDR_MATCH_ONE,
> .workaround = T241_SCRUB_SHADOW_REGS,
> },
> + {
> + /* NVIDIA t241 erratum T241-MPAM-4 */
> + .iidr = MPAM_IIDR_NVIDIA_T241,
> + .iidr_mask = MPAM_IIDR_MATCH_ONE,
> + .workaround = T241_FORCE_MBW_MIN_TO_ONE,
> + },
> { NULL } /* Sentinel */
> };
>
> @@ -1464,6 +1470,31 @@ static void mpam_quirk_post_config_change(struct mpam_msc_ris *ris, u16 partid,
> mpam_apply_t241_erratum(ris, partid);
> }
>
> +static u16 mpam_wa_t241_force_mbw_min_to_one(struct mpam_props *props)
> +{
> + u16 max_hw_value, min_hw_granule, res0_bits;
> +
> + res0_bits = 16 - props->bwa_wd;
> + max_hw_value = ((1 << props->bwa_wd) - 1) << res0_bits;
> + min_hw_granule = ~max_hw_value;
> +
> + return min_hw_granule + 1;
> +}
> +
> +static u16 mpam_wa_t241_calc_min_from_max(struct mpam_config *cfg)
> +{
> + u16 val = 0;
> +
> + if (mpam_has_feature(mpam_feat_mbw_max, cfg)) {
> + u16 delta = ((5 * MPAMCFG_MBW_MAX_MAX) / 100) - 1;
> +
> + if (cfg->mbw_max > delta)
> + val = cfg->mbw_max - delta;
> + }
> +
> + return val;
> +}
> +
> /* Called via IPI. Call while holding an SRCU reference */
> static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
> struct mpam_config *cfg)
> @@ -1506,9 +1537,19 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
> mpam_write_partsel_reg(msc, MBW_PBM, cfg->mbw_pbm);
> }
>
> - if (mpam_has_feature(mpam_feat_mbw_min, rprops) &&
> - mpam_has_feature(mpam_feat_mbw_min, cfg))
> - mpam_write_partsel_reg(msc, MBW_MIN, 0);
> + if (mpam_has_feature(mpam_feat_mbw_min, rprops)) {
> + u16 val = 0;
> +
> + if (mpam_has_quirk(T241_FORCE_MBW_MIN_TO_ONE, msc)) {
> + u16 min = mpam_wa_t241_force_mbw_min_to_one(rprops);
> +
> + val = mpam_wa_t241_calc_min_from_max(cfg);
> + if (val < min)
> + val = min;
> + }
> +
> + mpam_write_partsel_reg(msc, MBW_MIN, val);
> + }
>
> if (mpam_has_feature(mpam_feat_mbw_max, rprops) &&
> mpam_has_feature(mpam_feat_mbw_max, cfg)) {
> @@ -2304,6 +2345,9 @@ static void mpam_enable_merge_class_features(struct mpam_component *comp)
>
> list_for_each_entry(vmsc, &comp->vmsc, comp_list)
> __class_props_mismatch(class, vmsc);
> +
> + if (mpam_has_quirk(T241_FORCE_MBW_MIN_TO_ONE, class))
> + mpam_clear_feature(mpam_feat_mbw_min, &class->props);
> }
>
> /*
> diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
> index 6b832f2100d9..f6bf462058d9 100644
> --- a/drivers/resctrl/mpam_internal.h
> +++ b/drivers/resctrl/mpam_internal.h
> @@ -220,6 +220,7 @@ struct mpam_props {
> /* Workaround bits for msc->quirks */
> enum mpam_device_quirks {
> T241_SCRUB_SHADOW_REGS,
> + T241_FORCE_MBW_MIN_TO_ONE,
> MPAM_QUIRK_LAST
> };
>
> --
> 2.43.0
>
>
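For reference, the arithmetic in the quoted mpam_wa_t241_force_mbw_min_to_one()
and mpam_wa_t241_calc_min_from_max() can be reproduced in user space. This is a
sketch under assumptions: the MBW fields are taken to be 16-bit fixed point,
MPAMCFG_MBW_MAX_MAX is assumed to be 0xFFFF (the define is not shown in the
quoted hunk), the mbw_max feature is assumed to be set in the config, and the
sketch_* names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MBW_MAX_MAX 0xFFFFu	/* assumed value of MPAMCFG_MBW_MAX_MAX */

/* Smallest non-zero MBW_MIN the hardware can represent for a given bwa_wd. */
static uint16_t sketch_min_granule(unsigned int bwa_wd)
{
	assert(bwa_wd >= 1 && bwa_wd <= 16);

	unsigned int res0_bits = 16 - bwa_wd;
	uint16_t max_hw_value = (uint16_t)(((1u << bwa_wd) - 1) << res0_bits);
	uint16_t min_hw_granule = (uint16_t)~max_hw_value;

	return (uint16_t)(min_hw_granule + 1);
}

/* MBW_MIN placed roughly 5% of full scale below MBW_MAX, or 0 if too small. */
static uint16_t sketch_min_from_max(uint16_t mbw_max)
{
	uint16_t delta = (uint16_t)((5 * SKETCH_MBW_MAX_MAX) / 100 - 1);

	return mbw_max > delta ? (uint16_t)(mbw_max - delta) : 0;
}

/* Value the workaround would program: never below one hardware granule. */
static uint16_t sketch_t241_mbw_min(uint16_t mbw_max, unsigned int bwa_wd)
{
	uint16_t min = sketch_min_granule(bwa_wd);
	uint16_t val = sketch_min_from_max(mbw_max);

	return val < min ? min : val;
}
```

With bwa_wd = 7 (chosen only as an example width), sketch_min_granule() gives
0x200, which is 1/128 of full scale and matches the 0.78125% in the commit
message. Under the same assumptions, a group left at MBW_MAX = 0xFFFF gets a
minimum of 62260, roughly 95% of full scale, so the programmed minimums of two
such groups would sum to more than 100%.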
* Re: [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code
2026-02-12 14:51 ` Ben Horgan
@ 2026-02-13 7:18 ` Shaopeng Tan (Fujitsu)
0 siblings, 0 replies; 89+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2026-02-13 7:18 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght@marvell.com, baisheng.gao@unisoc.com,
baolin.wang@linux.alibaba.com, carl@os.amperecomputing.com,
dave.martin@arm.com, david@kernel.org, dfustini@baylibre.com,
fenghuay@nvidia.com, gshan@redhat.com, james.morse@arm.com,
jonathan.cameron@huawei.com, kobak@nvidia.com,
lcherian@marvell.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, peternewman@google.com,
punit.agrawal@oss.qualcomm.com, quic_jiles@quicinc.com,
reinette.chatre@intel.com, rohit.mathew@arm.com,
scott@os.amperecomputing.com, sdonthineni@nvidia.com,
xhao@linux.alibaba.com, catalin.marinas@arm.com, will@kernel.org,
corbet@lwn.net, maz@kernel.org, oupton@kernel.org,
joey.gouly@arm.com, suzuki.poulose@arm.com,
kvmarm@lists.linux.dev, zengheng4@huawei.com,
linux-doc@vger.kernel.org
Hello Ben,
> Hi Shaopeng,
>
> On 2/9/26 10:04, Ben Horgan wrote:
> > Hi Shaopeng,
> >
> > On 2/9/26 08:25, Shaopeng Tan (Fujitsu) wrote:
> >> Hello Ben,
> >>
> >>> This new version of the mpam missing pieces series has a few significant
> >>> changes in the mpam driver part of the series. The heuristics for deciding
> >>> if features should be exposed are tightened. This is to fix some
> >>> inaccuracies and avoid overcommitting before needed - shout if this changes
> >>> anything on your platform. The final patch adds documentation which
> >>> explains which features you should expect. The ABMC emulation is dropped
> >>> for the moment as it requires resctrl changes to support for MPAM without
> >>> breaking the abi. The default 5% gap for min_bw is dropped in favour of a
> >>> simple default (kept for grace). The series is based on x86/resctrl [1] as
> >>> resctrl has telemetry patches queued which change the arch interface.
> >>
> >> Could you please elaborate on why fs/resctrl changes are required to support only the counter assignment part of ABMC?
> >> Currently, many SoC chips have an insufficient number of memory bandwidth monitors.
> >
> > Sure. When the counter assignment mode is 'mbm_event; resctrl assumes the mbm events are configurable.
> > The 'event_filter' files at
> > info/L3_MON/event_configs/<event>/event_filter
> > are used to display and set this configuration.
> >
> > In MPAM event configuration is not supported and so showing a read/writable 'event_filter' file is
> > misleading to the user and needs to be hidden for MPAM support.
> >
> > Just to give you a flavour of the change, here's a hack to show the correct thing for MPAM:
> >
> > --- a/fs/resctrl/rdtgroup.c
> > +++ b/fs/resctrl/rdtgroup.c
> > @@ -2338,6 +2338,9 @@ static int resctrl_mkdir_event_configs(struct rdt_resource *r, struct kernfs_nod
> > if (ret)
> > goto out;
> >
> > + if (!resctrl_arch_is_evt_configurable(mevt->evtid))
> > + continue;
> > +
> >
> >
> >> We would be grateful if you could support the counter assignment part of ABMC.
> >
> > It is not a big change in resctrl but I thought it best to not gate the rest of this series on
> > an additional change in another subsystem. I am current looking into this and hope to get the
> > patches on the list early in the next cycle.
>
> There is another small change that will be required in resctrl to
> support ABMC with MPAM. As counter assignment means that it can't be
> guaranteed that each CTRL_MON group has a dedicated memory bandwidth
> counter the software controller (mbaMBps mount option) won't work but
> the mount won't fail. AMD doesn't hit this problem as it's MBA is
> non-linear. I was hoping to repurpose the delay_linear flag to just mean
> the software controller isn't supported but resctrl displays this
> information to the user in the 'delay_linear' file and we don't want to
> mislead.
>
Thank you for your detailed explanation.
I retested this patch series and there is no problem.
I reviewed the source code, and it looks fine to me.
Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Best regards,
Shaopeng TAN
* Re: [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource
2026-02-05 16:50 ` Jonathan Cameron
@ 2026-02-13 7:38 ` Zeng Heng
2026-02-16 13:54 ` Ben Horgan
2026-02-18 16:40 ` Ben Horgan
1 sibling, 1 reply; 89+ messages in thread
From: Zeng Heng @ 2026-02-13 7:38 UTC (permalink / raw)
To: Ben Horgan, james.morse, Jonathan Cameron
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, kobak, lcherian, linux-arm-kernel,
linux-kernel, peternewman, punit.agrawal, quic_jiles,
reinette.chatre, rohit.mathew, scott, sdonthineni, tan.shaopeng,
xhao, catalin.marinas, will, corbet, maz, oupton, joey.gouly,
suzuki.poulose, kvmarm, linux-doc, Kefeng Wang
Hi Ben,
On 2026/2/6 0:50, Jonathan Cameron wrote:
> On Tue, 3 Feb 2026 21:43:27 +0000
> Ben Horgan <ben.horgan@arm.com> wrote:
>
>> From: James Morse <james.morse@arm.com>
>>
>> resctrl supports 'MB', as a percentage throttling of traffic from the
>> L3. This is the control that mba_sc uses, so ideally the class chosen
>> should be as close as possible to the counters used for mbm_total. If
>> there is a single L3 and the topology of the memory matches then the
>> traffic at the memory controller will be equivalent to that at egress of
>> the L3. If these conditions are met allow the memory class to back MB.
>>
>> MB's percentage control should be backed either with the fixed point
>> fraction MBW_MAX or bandwidth portion bitmaps. The bandwidth portion
>> bitmaps is not used as its tricky to pick which bits to use to avoid
>> contention, and may be possible to expose this as something other than a
>> percentage in the future.
>
> I'm very curious to know whether this heuristic is actually useful on many
> systems or whether many / most of them fail this 'shape' heuristic.
>
The current MPAM driver has restrictions that limit MB control support.
For example, on some systems, the MPAM memory class MSCs are not located
at the L3 cache but rather at the memory controller (e.g., Mata). In
such cases, MB control and mbm_total bandwidth monitoring cannot be
enabled.
I'm unsure whether the community would welcome and be willing to review
a patch series supporting such systems. Of course, the changes would
involve minor refactoring in the resctrl common layer.
Any feedback would be greatly appreciated.
Best Regards,
Zeng Heng
* Re: [PATCH v4 38/41] arm_mpam: Add workaround for T241-MPAM-4
2026-02-13 7:02 ` Shaopeng Tan (Fujitsu)
@ 2026-02-14 1:29 ` Zeng Heng
2026-02-20 2:30 ` Shaopeng Tan (Fujitsu)
0 siblings, 1 reply; 89+ messages in thread
From: Zeng Heng @ 2026-02-14 1:29 UTC (permalink / raw)
To: Shaopeng Tan (Fujitsu), Ben Horgan
Cc: amitsinght@marvell.com, baisheng.gao@unisoc.com,
baolin.wang@linux.alibaba.com, carl@os.amperecomputing.com,
dave.martin@arm.com, david@kernel.org, dfustini@baylibre.com,
fenghuay@nvidia.com, gshan@redhat.com, james.morse@arm.com,
jonathan.cameron@huawei.com, kobak@nvidia.com,
lcherian@marvell.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, peternewman@google.com,
punit.agrawal@oss.qualcomm.com, quic_jiles@quicinc.com,
reinette.chatre@intel.com, rohit.mathew@arm.com,
scott@os.amperecomputing.com, sdonthineni@nvidia.com,
xhao@linux.alibaba.com, catalin.marinas@arm.com, will@kernel.org,
corbet@lwn.net, maz@kernel.org, oupton@kernel.org,
joey.gouly@arm.com, suzuki.poulose@arm.com,
kvmarm@lists.linux.dev, linux-doc@vger.kernel.org, Kefeng Wang
Hi Shaopeng,
On 2026/2/13 15:02, Shaopeng Tan (Fujitsu) wrote:
> Hello Ben, Fenghua
>
>> From: Shanker Donthineni <sdonthineni@nvidia.com>
>>
>> In the T241 implementation of memory-bandwidth partitioning, in the absence
>> of contention for bandwidth, the minimum bandwidth setting can affect the
>> amount of achieved bandwidth. Specifically, the achieved bandwidth in the
>> absence of contention can settle to any value between the values of
>> MPAMCFG_MBW_MIN and MPAMCFG_MBW_MAX. Also, if MPAMCFG_MBW_MIN is set
>> zero (below 0.78125%), once a core enters a throttled state, it will never
>> leave that state.
>>
>> The first issue is not a concern if the MPAM software allows to program
>> MPAMCFG_MBW_MIN through the sysfs interface. This patch ensures program
>> MBW_MIN=1 (0.78125%) whenever MPAMCFG_MBW_MIN=0 is programmed.
>>
>> In the scenario where the resctrl doesn't support the MBW_MIN interface via
>> sysfs, to achieve bandwidth closer to MBW_MAX in the absence of contention,
>> software should configure a relatively narrow gap between MBW_MIN and
>> MBW_MAX. The recommendation is to use a 5% gap to mitigate the problem.
>
> I have a question regarding the MBW_MIN values.
> Are there any cases where the sum of all MBW_MIN values from different control groups exceeds 100%?
> And if so, is it acceptable for it to exceed 100%?
>
> Best regards,
> Shaopeng TAN
>
Per the ARM MPAM architecture specification: "A PARTID that has used
less than MIN is given preferential access to bandwidth."
MBW_MIN is not a guaranteed bandwidth allocation. Instead, it serves as
a priority threshold: when a PARTID's memory bandwidth usage falls below
the configured MBW_MIN value, its priority for memory bandwidth access
is elevated.
Therefore, it is acceptable for the sum of MBW_MIN values across
different control groups to exceed 100%. There is no requirement
for these values to add up to 100% or less.
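To make the arithmetic concrete, here is a minimal sketch (not driver code). It assumes the 0.78125% figure quoted above is one step of a 7-bit fixed-point fraction (100/128), which is only an inference from the number itself. The names step_percent, effective_mbw_min and the group names are hypothetical.

```python
# Illustrative only: the names and the 7-bit width are assumptions, not
# taken from the MPAM driver or the T241 documentation.

def step_percent(width: int) -> float:
    """Bandwidth represented by one step of a `width`-bit fraction, in percent."""
    return 100.0 / (1 << width)

def effective_mbw_min(requested: int) -> int:
    """Mimic the workaround described in the patch: a request of 0 becomes 1."""
    return max(requested, 1)

# Per-group MBW_MIN percentages may add up to more than 100, because MIN
# is a priority threshold rather than a reservation.
group_min_percent = {"grp_a": 40, "grp_b": 40, "grp_c": 40}
total_min_percent = sum(group_min_percent.values())
```

With these example values the total is 120%, which is still a valid configuration under the reading of the specification given above.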
Hope this clarifies the behavior.
Best regards,
Zeng Heng
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
` (41 preceding siblings ...)
2026-02-09 8:25 ` [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Shaopeng Tan (Fujitsu)
@ 2026-02-14 9:40 ` Zeng Heng
2026-02-16 12:22 ` Ben Horgan
42 siblings, 1 reply; 89+ messages in thread
From: Zeng Heng @ 2026-02-14 9:40 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, linux-doc,
Kefeng Wang
Hi Ben,
On 2026/2/4 5:43, Ben Horgan wrote:
> This new version of the mpam missing pieces series has a few significant
> changes in the mpam driver part of the series. The heuristics for deciding
> if features should be exposed are tightened. This is to fix some
> inaccuracies and avoid overcommitting before needed - shout if this changes
> anything on your platform. The final patch adds documentation which
> explains which features you should expect. The ABMC emulation is dropped
> for the moment as it requires resctrl changes to support for MPAM without
> breaking the abi. The default 5% gap for min_bw is dropped in favour of a
> simple default (kept for grace). The series is based on x86/resctrl [1] as
> resctrl has telemetry patches queued which change the arch interface.
>
> Fixes that are in 6.19-rc8 are dropped from the series but
> b9f5c38e4af1 ("arm_mpam: Use non-atomic bitops when modifying feature bitmap")
> is required to avoid an alignment fault in the kunit tests.
>
> Thank you for all the testing and reviewing so far! It all helps.
>
> Changelogs in patches
>
> From James' cover letter:
>
> This is the missing piece to make MPAM usable resctrl in user-space. This has
> shed its debugfs code and the read/write 'event configuration' for the monitors
> to make the series smaller.
>
> This adds the arch code and KVM support first. I anticipate the whole thing
> going via arm64, but if it goes via tip instead, then an immutable branch with those
> patches should be easy to do.
>
> Generally the resctrl glue code works by picking what MPAM features it can expose
> from the MPAM driver, then configuring the structs that back the resctrl helpers.
> If your platform is sufficiently Xeon shaped, you should be able to get L2/L3 CPOR
> bitmaps exposed via resctrl. CSU counters work if they are on/after the L3. MBWU
> counters are considerably more hairy, and depend on heuristics around the topology,
> and a bunch of stuff trying to emulate ABMC.
> If it didn't pick what you wanted it to, please share the debug messages produced
> when enabling dynamic debug and booting with:
> | dyndbg="file mpam_resctrl.c +pl"
>
> I've not found a platform that can test all the behaviours around the monitors,
> so this is where I'd expect the most bugs.
>
> The MPAM spec that describes all the system and MMIO registers can be found here:
> https://developer.arm.com/documentation/ddi0598/db/?lang=en
> (Ignore the 'RETIRED' warning - that is just Arm moving the documentation around.
> This document has the best overview)
>
> Based on:
> [1] git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/cache
> (To include telemetry code which changes the resctrl arch interface)
>
> The series can be retrieved from:
> https://gitlab.arm.com/linux-arm/linux-bh.git mpam_resctrl_glue_v4
> (Final commit is a fix already in 6.19-rc8)
>
> v3 can be found at:
> https://lore.kernel.org/linux-arm-kernel/20260112165914.4086692-1-ben.horgan@arm.com/
>
> v2 can be found at:
> https://lore.kernel.org/linux-arm-kernel/20251219181147.3404071-1-ben.horgan@arm.com/
>
> rfc can be found at:
> https://lore.kernel.org/linux-arm-kernel/20251205215901.17772-1-james.morse@arm.com/
>
>
I've tested the MPAM functionality on my local Kunpeng platform. Here's
a summary of the results:
Features enabled and verified:
* L2 and L3 CPBM
* L3 CSU
* L2 and L3 CDP
All enabled features passed functional testing as expected.
+ Tested-by: Zeng Heng <zengheng4@huawei.com>
Features not enabled:
1. MATA MBMAX partition and MBWU monitor.
   Reason: These do not meet the driver's current topology
   expectations for MB support, hence they were not initialized.
   This behavior is expected.
2. L2 CSU and MBWU monitors.
   Reason: The current MPAM driver does not support L2-related
   functionality yet.
+ Tested-by: Zeng Heng <zengheng4@huawei.com>
Detailed test logs are as follows:
Boot logs:
[root@localhost ~]# dmesg | grep -i mpam
[ 0.000000] ACPI: MPAM 0x000000007FF35018 003024 (v01 HISI HIP12
00000000 HISI 20151124)
[ 9.509852] mpam_msc mpam_msc.64: Merging features for
vmsc:0xffff0800973cf5a0 |= ris:0xffff08009757ee90
[ 9.509859] mpam_msc mpam_msc.254: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff080097628520
[ 9.509860] mpam:__props_mismatch:
mpam_has_feature(mpam_feat_cpor_part, parent) = 1
[ 9.509864] mpam:__props_mismatch:
mpam_has_feature(mpam_feat_cpor_part, child) = 0
[ 9.509866] mpam:__props_mismatch: parent->cpbm_wd = 8
[ 9.509869] mpam:__props_mismatch: child->cpbm_wd = 0
[ 9.509871] mpam:__props_mismatch: alias = 0
[ 9.509873] mpam:__props_mismatch: cleared cpor_part
[ 9.509876] mpam:__props_mismatch: took the min num_csu_mon
[ 9.509878] mpam:__props_mismatch: took the min num_mbwu_mon
[ 9.509881] mpam_msc mpam_msc.252: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800976284a0
[ 9.509884] mpam_msc mpam_msc.250: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff080097628420
[ 9.509887] mpam_msc mpam_msc.248: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800976283a0
[ 9.509889] mpam_msc mpam_msc.246: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff080097628320
[ 9.509892] mpam_msc mpam_msc.244: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800976282a0
[ 9.509894] mpam_msc mpam_msc.242: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff080097628220
[ 9.509897] mpam_msc mpam_msc.240: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800976281a0
[ 9.509900] mpam_msc mpam_msc.238: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff080097628120
[ 9.509902] mpam_msc mpam_msc.236: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800976280a0
[ 9.509905] mpam_msc mpam_msc.234: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff080097628020
[ 9.509907] mpam_msc mpam_msc.232: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975affa0
[ 9.509910] mpam_msc mpam_msc.230: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aff20
[ 9.509913] mpam_msc mpam_msc.228: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975afea0
[ 9.509915] mpam_msc mpam_msc.226: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975afe20
[ 9.509918] mpam_msc mpam_msc.224: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975afda0
[ 9.509920] mpam_msc mpam_msc.222: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975afd20
[ 9.509923] mpam_msc mpam_msc.220: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975afca0
[ 9.509925] mpam_msc mpam_msc.218: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975afc20
[ 9.509928] mpam_msc mpam_msc.216: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975afba0
[ 9.509931] mpam_msc mpam_msc.214: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975afb20
[ 9.509933] mpam_msc mpam_msc.212: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975afaa0
[ 9.509936] mpam_msc mpam_msc.210: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975afa20
[ 9.509938] mpam_msc mpam_msc.208: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af9a0
[ 9.509941] mpam_msc mpam_msc.206: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af920
[ 9.509943] mpam_msc mpam_msc.204: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af8a0
[ 9.509946] mpam_msc mpam_msc.202: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af820
[ 9.509949] mpam_msc mpam_msc.200: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af7a0
[ 9.509951] mpam_msc mpam_msc.198: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af720
[ 9.509954] mpam_msc mpam_msc.196: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af6a0
[ 9.509956] mpam_msc mpam_msc.194: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af620
[ 9.509959] mpam_msc mpam_msc.192: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af5a0
[ 9.509962] mpam_msc mpam_msc.190: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af520
[ 9.509964] mpam_msc mpam_msc.188: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af4a0
[ 9.509967] mpam_msc mpam_msc.186: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af420
[ 9.509969] mpam_msc mpam_msc.184: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af3a0
[ 9.509972] mpam_msc mpam_msc.182: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af320
[ 9.509974] mpam_msc mpam_msc.180: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af2a0
[ 9.509977] mpam_msc mpam_msc.178: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af220
[ 9.509980] mpam_msc mpam_msc.176: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af1a0
[ 9.509982] mpam_msc mpam_msc.174: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af120
[ 9.509985] mpam_msc mpam_msc.172: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af0a0
[ 9.509987] mpam_msc mpam_msc.170: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975af020
[ 9.509990] mpam_msc mpam_msc.168: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aefa0
[ 9.509993] mpam_msc mpam_msc.166: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aef20
[ 9.509995] mpam_msc mpam_msc.164: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aeea0
[ 9.509998] mpam_msc mpam_msc.162: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aee20
[ 9.510000] mpam_msc mpam_msc.160: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aeda0
[ 9.510003] mpam_msc mpam_msc.158: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aed20
[ 9.510005] mpam_msc mpam_msc.156: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aeca0
[ 9.510008] mpam_msc mpam_msc.154: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aec20
[ 9.510010] mpam_msc mpam_msc.152: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aeba0
[ 9.510013] mpam_msc mpam_msc.150: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aeb20
[ 9.510016] mpam_msc mpam_msc.148: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aeaa0
[ 9.510018] mpam_msc mpam_msc.146: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975aea20
[ 9.510021] mpam_msc mpam_msc.144: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae9a0
[ 9.510023] mpam_msc mpam_msc.142: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae920
[ 9.510026] mpam_msc mpam_msc.140: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae8a0
[ 9.510029] mpam_msc mpam_msc.138: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae820
[ 9.510031] mpam_msc mpam_msc.136: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae7a0
[ 9.510034] mpam_msc mpam_msc.134: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae720
[ 9.510036] mpam_msc mpam_msc.132: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae6a0
[ 9.510039] mpam_msc mpam_msc.130: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae620
[ 9.510041] mpam_msc mpam_msc.128: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae5a0
[ 9.510044] mpam_msc mpam_msc.126: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae520
[ 9.510047] mpam_msc mpam_msc.124: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae4a0
[ 9.510049] mpam_msc mpam_msc.122: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae420
[ 9.510052] mpam_msc mpam_msc.120: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae3a0
[ 9.510054] mpam_msc mpam_msc.118: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae2a0
[ 9.510057] mpam_msc mpam_msc.116: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae220
[ 9.510060] mpam_msc mpam_msc.114: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae1a0
[ 9.510062] mpam_msc mpam_msc.112: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae120
[ 9.510065] mpam_msc mpam_msc.110: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae0a0
[ 9.510067] mpam_msc mpam_msc.108: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800975ae020
[ 9.510070] mpam_msc mpam_msc.106: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff08009729c720
[ 9.510073] mpam_msc mpam_msc.104: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cffa0
[ 9.510075] mpam_msc mpam_msc.102: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cff20
[ 9.510078] mpam_msc mpam_msc.100: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cfea0
[ 9.510080] mpam_msc mpam_msc.98: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cfe20
[ 9.510083] mpam_msc mpam_msc.96: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cfda0
[ 9.510086] mpam_msc mpam_msc.94: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cfd20
[ 9.510088] mpam_msc mpam_msc.92: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cfca0
[ 9.510091] mpam_msc mpam_msc.90: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cfc20
[ 9.510094] mpam_msc mpam_msc.88: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cfba0
[ 9.510096] mpam_msc mpam_msc.86: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cfb20
[ 9.510099] mpam_msc mpam_msc.84: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cfaa0
[ 9.510102] mpam_msc mpam_msc.82: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cfa20
[ 9.510104] mpam_msc mpam_msc.80: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cf9a0
[ 9.510107] mpam_msc mpam_msc.78: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cf920
[ 9.510109] mpam_msc mpam_msc.76: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cf8a0
[ 9.510112] mpam_msc mpam_msc.74: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cf820
[ 9.510115] mpam_msc mpam_msc.72: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cf7a0
[ 9.510117] mpam_msc mpam_msc.70: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cf720
[ 9.510120] mpam_msc mpam_msc.68: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cf6a0
[ 9.510123] mpam_msc mpam_msc.66: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cf620
[ 9.510125] mpam_msc mpam_msc.64: Merging features for
class:0xffff08009736fe50 &= vmsc:0xffff0800973cf5a0
[ 9.510129] mpam_msc mpam_msc.62: Merging features for
vmsc:0xffff0800973cf520 |= ris:0xffff08009757ea90
[ 9.510132] mpam_msc mpam_msc.60: Merging features for
vmsc:0xffff0800973cf4a0 |= ris:0xffff08009757e690
[ 9.510133] mpam_msc mpam_msc.58: Merging features for
vmsc:0xffff0800973cf420 |= ris:0xffff08009757e290
[ 9.510136] mpam_msc mpam_msc.56: Merging features for
vmsc:0xffff0800973cf3a0 |= ris:0xffff08009757de90
[ 9.510139] mpam_msc mpam_msc.54: Merging features for
vmsc:0xffff0800973cf320 |= ris:0xffff08009757da90
[ 9.510141] mpam_msc mpam_msc.52: Merging features for
vmsc:0xffff0800973cf2a0 |= ris:0xffff08009757d690
[ 9.510144] mpam_msc mpam_msc.50: Merging features for
vmsc:0xffff0800973cf220 |= ris:0xffff08009757d290
[ 9.510146] mpam_msc mpam_msc.48: Merging features for
vmsc:0xffff0800973cf1a0 |= ris:0xffff08009757ce90
[ 9.510149] mpam_msc mpam_msc.46: Merging features for
vmsc:0xffff0800973cf120 |= ris:0xffff08009757ca90
[ 9.510152] mpam_msc mpam_msc.44: Merging features for
vmsc:0xffff0800973cf0a0 |= ris:0xffff08009757c690
[ 9.510154] mpam_msc mpam_msc.42: Merging features for
vmsc:0xffff0800973cf020 |= ris:0xffff08009757c290
[ 9.510157] mpam_msc mpam_msc.40: Merging features for
vmsc:0xffff0800973cefa0 |= ris:0xffff08009757be90
[ 9.510160] mpam_msc mpam_msc.38: Merging features for
vmsc:0xffff0800973cef20 |= ris:0xffff08009757ba90
[ 9.510162] mpam_msc mpam_msc.36: Merging features for
vmsc:0xffff0800973ceea0 |= ris:0xffff08009757b690
[ 9.510166] mpam_msc mpam_msc.34: Merging features for
vmsc:0xffff0800973cee20 |= ris:0xffff08009757b290
[ 9.510167] mpam_msc mpam_msc.32: Merging features for
vmsc:0xffff0800973ceda0 |= ris:0xffff08009757ae90
[ 9.510170] mpam_msc mpam_msc.30: Merging features for
vmsc:0xffff0800973ced20 |= ris:0xffff08009757aa90
[ 9.510173] mpam_msc mpam_msc.28: Merging features for
vmsc:0xffff0800973ceca0 |= ris:0xffff08009757a690
[ 9.510175] mpam_msc mpam_msc.26: Merging features for
vmsc:0xffff0800973cec20 |= ris:0xffff08009757a290
[ 9.510178] mpam_msc mpam_msc.24: Merging features for
vmsc:0xffff0800973ceba0 |= ris:0xffff080097579e90
[ 9.510181] mpam_msc mpam_msc.22: Merging features for
vmsc:0xffff0800973ceb20 |= ris:0xffff080097579a90
[ 9.510183] mpam_msc mpam_msc.20: Merging features for
vmsc:0xffff0800973ceaa0 |= ris:0xffff080097579690
[ 9.510186] mpam_msc mpam_msc.18: Merging features for
vmsc:0xffff0800973cea20 |= ris:0xffff080097579290
[ 9.510189] mpam_msc mpam_msc.16: Merging features for
vmsc:0xffff0800973ce9a0 |= ris:0xffff080097578e90
[ 9.510191] mpam_msc mpam_msc.62: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cf520
[ 9.510195] mpam_msc mpam_msc.60: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cf4a0
[ 9.510196] mpam_msc mpam_msc.58: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cf420
[ 9.510199] mpam_msc mpam_msc.56: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cf3a0
[ 9.510202] mpam_msc mpam_msc.54: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cf320
[ 9.510204] mpam_msc mpam_msc.52: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cf2a0
[ 9.510207] mpam_msc mpam_msc.50: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cf220
[ 9.510209] mpam_msc mpam_msc.48: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cf1a0
[ 9.510212] mpam_msc mpam_msc.46: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cf120
[ 9.510214] mpam_msc mpam_msc.44: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cf0a0
[ 9.510217] mpam_msc mpam_msc.42: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cf020
[ 9.510219] mpam_msc mpam_msc.40: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cefa0
[ 9.510222] mpam_msc mpam_msc.38: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cef20
[ 9.510224] mpam_msc mpam_msc.36: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973ceea0
[ 9.510227] mpam_msc mpam_msc.34: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cee20
[ 9.510230] mpam_msc mpam_msc.32: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973ceda0
[ 9.510232] mpam_msc mpam_msc.30: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973ced20
[ 9.510235] mpam_msc mpam_msc.28: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973ceca0
[ 9.510237] mpam_msc mpam_msc.26: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cec20
[ 9.510240] mpam_msc mpam_msc.24: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973ceba0
[ 9.510242] mpam_msc mpam_msc.22: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973ceb20
[ 9.510245] mpam_msc mpam_msc.20: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973ceaa0
[ 9.510247] mpam_msc mpam_msc.18: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973cea20
[ 9.510250] mpam_msc mpam_msc.16: Merging features for
class:0xffff08009736cc50 &= vmsc:0xffff0800973ce9a0
[ 9.510254] mpam_msc mpam_msc.14: Merging features for
vmsc:0xffff0800973ce920 |= ris:0xffff080097578a90
[ 9.510255] mpam_msc mpam_msc.12: Merging features for
vmsc:0xffff0800973ce8a0 |= ris:0xffff080097578690
[ 9.510258] mpam_msc mpam_msc.10: Merging features for
vmsc:0xffff0800973ce820 |= ris:0xffff080097578290
[ 9.510260] mpam_msc mpam_msc.8: Merging features for
vmsc:0xffff0800973ce7a0 |= ris:0xffff080097417e90
[ 9.510263] mpam_msc mpam_msc.6: Merging features for
vmsc:0xffff0800973ce720 |= ris:0xffff080097417a90
[ 9.510266] mpam_msc mpam_msc.4: Merging features for
vmsc:0xffff0800973ce6a0 |= ris:0xffff080097417690
[ 9.510268] mpam_msc mpam_msc.2: Merging features for
vmsc:0xffff0800973ce620 |= ris:0xffff080097417290
[ 9.510271] mpam_msc mpam_msc.0: Merging features for
vmsc:0xffff0800973ce5a0 |= ris:0xffff080097416e90
[ 9.510274] mpam_msc mpam_msc.14: Merging features for
class:0xffff08009735bb50 &= vmsc:0xffff0800973ce920
[ 9.510276] mpam_msc mpam_msc.12: Merging features for
class:0xffff08009735bb50 &= vmsc:0xffff0800973ce8a0
[ 9.510279] mpam_msc mpam_msc.10: Merging features for
class:0xffff08009735bb50 &= vmsc:0xffff0800973ce820
[ 9.510281] mpam_msc mpam_msc.8: Merging features for
class:0xffff08009735bb50 &= vmsc:0xffff0800973ce7a0
[ 9.510284] mpam_msc mpam_msc.6: Merging features for
class:0xffff08009735bb50 &= vmsc:0xffff0800973ce720
[ 9.510287] mpam_msc mpam_msc.4: Merging features for
class:0xffff08009735bb50 &= vmsc:0xffff0800973ce6a0
[ 9.510289] mpam_msc mpam_msc.2: Merging features for
class:0xffff08009735bb50 &= vmsc:0xffff0800973ce620
[ 9.510292] mpam_msc mpam_msc.0: Merging features for
class:0xffff08009735bb50 &= vmsc:0xffff0800973ce5a0
[ 10.978496] mpam:mpam_resctrl_pick_caches: class 2 cache misses CPOR
[ 10.978497] mpam:mpam_resctrl_pick_caches: class 255 is not a cache
[ 10.980470] mpam:mpam_resctrl_pick_mba: class 2 is before L3
[ 10.980472] mpam:mpam_resctrl_pick_mba: class 3 has no bandwidth control
[ 10.997406] mpam:topology_matches_l3: class 255 component 0 has
Mismatched CPU mask with L3 equivalent
[ 10.997411] mpam:mpam_resctrl_pick_mba: class 255 topology doesn't
match L3
[ 10.997415] mpam:mpam_resctrl_pick_counters: class 2 is before L3
[ 11.024109] mpam:topology_matches_l3: class 3 component 276 has
Mismatched CPU mask with L3 equivalent
[ 11.024114] mpam:class_has_usable_mbwu: monitors usable in
free-running mode
[ 11.063882] mpam:topology_matches_l3: class 255 component 0 has
Mismatched CPU mask with L3 equivalent
[ 11.113183] mpam:mpam_resctrl_alloc_domain: Skipped monitor domain
online - no monitors
[ 11.113189] MPAM enabled with 32 PARTIDs and 4 PMGs
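One plausible reading of the "Mismatched CPU mask with L3 equivalent" messages in the boot log above is that each component of a class must cover exactly the same CPUs as some L3 cache domain. The sketch below illustrates that check only. The data model (plain CPU sets keyed by component id) is hypothetical and this is not the driver's topology_matches_l3() implementation.

```python
# Illustrative sketch only; the data model is hypothetical and this is not
# the kernel's topology_matches_l3().

def topology_matches_l3(component_cpus, l3_domain_cpus):
    """Return True if every component's CPU set equals some L3 domain's CPU set.

    component_cpus: dict mapping component id -> set of CPU numbers
    l3_domain_cpus: list of CPU sets, one per L3 cache domain
    """
    l3_masks = {frozenset(cpus) for cpus in l3_domain_cpus}
    return all(frozenset(cpus) in l3_masks for cpus in component_cpus.values())

# Two L3 domains of four CPUs each (made-up topology).
l3_domains = [{0, 1, 2, 3}, {4, 5, 6, 7}]

# A class with one component per L3 domain matches.
per_l3_class = {0: {0, 1, 2, 3}, 1: {4, 5, 6, 7}}

# A class with a single component spanning all CPUs does not match.
system_wide_class = {0: {0, 1, 2, 3, 4, 5, 6, 7}}
```

Under this reading, a memory-side class with a single system-wide component (like class 255 component 0 in the log) would fail the check whenever there is more than one L3 domain.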
Kunit test logs:
[ 31.253697] KTAP version 1
[ 31.253698] 1..2
[ 31.258263] KTAP version 1
[ 31.258265] # Subtest: mpam_devices_test_suite
[ 31.258267] # module: mpam
[ 31.258268] 1..3
[ 31.258775] ok 1 test_mpam_reset_msc_bitmap
[ 31.259558] ok 2 test_mpam_enable_merge_features
[ 31.260259] ok 3 test__props_mismatch
[ 31.260261] # mpam_devices_test_suite: pass:3 fail:0 skip:0 total:3
[ 31.260263] # Totals: pass:3 fail:0 skip:0 total:3
[ 31.260265] ok 1 mpam_devices_test_suite
[ 31.260267] KTAP version 1
[ 31.260268] # Subtest: mpam_resctrl_test_suite
[ 31.260269] # module: mpam
[ 31.260271] 1..7
[ 31.260965] ok 1 test_get_mba_granularity
[ 31.260968] KTAP version 1
[ 31.260969] # Subtest: test_mbw_max_to_percent
[ 31.261372] ok 1 pc=1, width=8, value=0x01
[ 31.261794] ok 2 pc=1, width=12, value=0x027
[ 31.262081] ok 3 pc=1, width=16, value=0x028e
[ 31.262183] ok 4 pc=25, width=8, value=0x3f
[ 31.262287] ok 5 pc=25, width=12, value=0x3ff
[ 31.262388] ok 6 pc=25, width=16, value=0x3fff
[ 31.262489] ok 7 pc=33, width=8, value=0x53
[ 31.262608] ok 8 pc=33, width=12, value=0x546
[ 31.262860] ok 9 pc=33, width=16, value=0x5479
[ 31.263113] ok 10 pc=35, width=8, value=0x58
[ 31.263491] ok 11 pc=35, width=12, value=0x598
[ 31.263872] ok 12 pc=35, width=16, value=0x5998
[ 31.264249] ok 13 pc=45, width=8, value=0x72
[ 31.264352] ok 14 pc=45, width=12, value=0x732
[ 31.264455] ok 15 pc=45, width=16, value=0x7332
[ 31.264559] ok 16 pc=50, width=8, value=0x7f
[ 31.264661] ok 17 pc=50, width=12, value=0x7ff
[ 31.264764] ok 18 pc=50, width=16, value=0x7fff
[ 31.264872] ok 19 pc=52, width=8, value=0x84
[ 31.264978] ok 20 pc=52, width=12, value=0x850
[ 31.265082] ok 21 pc=52, width=16, value=0x851d
[ 31.265190] ok 22 pc=55, width=8, value=0x8b
[ 31.265297] ok 23 pc=55, width=12, value=0x8cb
[ 31.265403] ok 24 pc=55, width=16, value=0x8ccb
[ 31.265507] ok 25 pc=58, width=8, value=0x93
[ 31.265609] ok 26 pc=58, width=12, value=0x946
[ 31.265714] ok 27 pc=58, width=16, value=0x9479
[ 31.265817] ok 28 pc=75, width=8, value=0xbf
[ 31.265918] ok 29 pc=75, width=12, value=0xbff
[ 31.266020] ok 30 pc=75, width=16, value=0xbfff
[ 31.266120] ok 31 pc=80, width=8, value=0xcb
[ 31.266220] ok 32 pc=80, width=12, value=0xccb
[ 31.266322] ok 33 pc=80, width=16, value=0xcccb
[ 31.266425] ok 34 pc=88, width=8, value=0xe0
[ 31.266533] ok 35 pc=88, width=12, value=0xe13
[ 31.266637] ok 36 pc=88, width=16, value=0xe146
[ 31.266758] ok 37 pc=95, width=8, value=0xf2
[ 31.267150] ok 38 pc=95, width=12, value=0xf32
[ 31.267535] ok 39 pc=95, width=16, value=0xf332
[ 31.267918] ok 40 pc=100, width=8, value=0xff
[ 31.268160] ok 41 pc=100, width=12, value=0xfff
[ 31.268264] ok 42 pc=100, width=16, value=0xffff
[ 31.268266] # test_mbw_max_to_percent: pass:42 fail:0 skip:0 total:42
[ 31.268268] ok 2 test_mbw_max_to_percent
[ 31.268270] KTAP version 1
[ 31.268271] # Subtest: test_percent_to_mbw_max
[ 31.268376] ok 1 pc=1, width=8, value=0x01
[ 31.268483] ok 2 pc=1, width=12, value=0x027
[ 31.268595] ok 3 pc=1, width=16, value=0x028e
[ 31.268701] ok 4 pc=25, width=8, value=0x3f
[ 31.268806] ok 5 pc=25, width=12, value=0x3ff
[ 31.268915] ok 6 pc=25, width=16, value=0x3fff
[ 31.269022] ok 7 pc=33, width=8, value=0x53
[ 31.269129] ok 8 pc=33, width=12, value=0x546
[ 31.269237] ok 9 pc=33, width=16, value=0x5479
[ 31.269342] ok 10 pc=35, width=8, value=0x58
[ 31.269446] ok 11 pc=35, width=12, value=0x598
[ 31.269551] ok 12 pc=35, width=16, value=0x5998
[ 31.269658] ok 13 pc=45, width=8, value=0x72
[ 31.269764] ok 14 pc=45, width=12, value=0x732
[ 31.269868] ok 15 pc=45, width=16, value=0x7332
[ 31.269975] ok 16 pc=50, width=8, value=0x7f
[ 31.270081] ok 17 pc=50, width=12, value=0x7ff
[ 31.270185] ok 18 pc=50, width=16, value=0x7fff
[ 31.270287] ok 19 pc=52, width=8, value=0x84
[ 31.270388] ok 20 pc=52, width=12, value=0x850
[ 31.270494] ok 21 pc=52, width=16, value=0x851d
[ 31.270606] ok 22 pc=55, width=8, value=0x8b
[ 31.271004] ok 23 pc=55, width=12, value=0x8cb
[ 31.271387] ok 24 pc=55, width=16, value=0x8ccb
[ 31.271770] ok 25 pc=58, width=8, value=0x93
[ 31.272151] ok 26 pc=58, width=12, value=0x946
[ 31.272260] ok 27 pc=58, width=16, value=0x9479
[ 31.272366] ok 28 pc=75, width=8, value=0xbf
[ 31.272472] ok 29 pc=75, width=12, value=0xbff
[ 31.272580] ok 30 pc=75, width=16, value=0xbfff
[ 31.272686] ok 31 pc=80, width=8, value=0xcb
[ 31.272790] ok 32 pc=80, width=12, value=0xccb
[ 31.272895] ok 33 pc=80, width=16, value=0xcccb
[ 31.273000] ok 34 pc=88, width=8, value=0xe0
[ 31.273106] ok 35 pc=88, width=12, value=0xe13
[ 31.273209] ok 36 pc=88, width=16, value=0xe146
[ 31.273318] ok 37 pc=95, width=8, value=0xf2
[ 31.273424] ok 38 pc=95, width=12, value=0xf32
[ 31.273528] ok 39 pc=95, width=16, value=0xf332
[ 31.273635] ok 40 pc=100, width=8, value=0xff
[ 31.273742] ok 41 pc=100, width=12, value=0xfff
[ 31.273847] ok 42 pc=100, width=16, value=0xffff
[ 31.273849] # test_percent_to_mbw_max: pass:42 fail:0 skip:0 total:42
[ 31.273850] ok 3 test_percent_to_mbw_max
[ 31.273852] KTAP version 1
[ 31.273853] # Subtest: test_mbw_max_to_percent_limits
[ 31.273957] ok 1 wd=1
[ 31.274064] ok 2 wd=2
[ 31.274171] ok 3 wd=3
[ 31.274276] ok 4 wd=4
[ 31.274381] ok 5 wd=5
[ 31.274485] ok 6 wd=6
[ 31.274603] ok 7 wd=7
[ 31.274710] ok 8 wd=8
[ 31.274974] ok 9 wd=9
[ 31.275362] ok 10 wd=10
[ 31.275746] ok 11 wd=11
[ 31.276122] ok 12 wd=12
[ 31.276230] ok 13 wd=13
[ 31.276335] ok 14 wd=14
[ 31.276444] ok 15 wd=15
[ 31.276551] ok 16 wd=16
[ 31.276553] # test_mbw_max_to_percent_limits: pass:16 fail:0
skip:0 total:16
[ 31.276554] ok 4 test_mbw_max_to_percent_limits
[ 31.276605] # test_percent_to_max_rounding: Round-up rate: 43%
(18/42)
[ 31.276668] ok 5 test_percent_to_max_rounding
[ 31.276671] KTAP version 1
[ 31.276672] # Subtest: test_percent_max_roundtrip_stability
[ 31.276776] ok 1 wd=1
[ 31.276883] ok 2 wd=2
[ 31.276988] ok 3 wd=3
[ 31.277096] ok 4 wd=4
[ 31.277202] ok 5 wd=5
[ 31.277309] ok 6 wd=6
[ 31.277416] ok 7 wd=7
[ 31.277524] ok 8 wd=8
[ 31.277629] ok 9 wd=9
[ 31.277737] ok 10 wd=10
[ 31.277843] ok 11 wd=11
[ 31.277948] ok 12 wd=12
[ 31.278061] ok 13 wd=13
[ 31.278167] ok 14 wd=14
[ 31.278273] ok 15 wd=15
[ 31.278380] ok 16 wd=16
[ 31.278381] # test_percent_max_roundtrip_stability: pass:16 fail:0 skip:0 total:16
[ 31.278383] ok 6 test_percent_max_roundtrip_stability
[ 31.278385] KTAP version 1
[ 31.278386] # Subtest: test_rmid_idx_encoding
[ 31.278490] ok 1 max_partid=0, max_pmg=0
[ 31.278604] ok 2 max_partid=1, max_pmg=4
[ 31.279008] ok 3 max_partid=3, max_pmg=1
[ 31.279394] ok 4 max_partid=5, max_pmg=9
[ 31.279777] ok 5 max_partid=4, max_pmg=4
[ 31.280167] ok 6 max_partid=100, max_pmg=11
[ 31.358979] ok 7 max_partid=65535, max_pmg=255
[ 31.358985] # test_rmid_idx_encoding: pass:7 fail:0 skip:0 total:7
[ 31.358987] ok 7 test_rmid_idx_encoding
[ 31.358989] # mpam_resctrl_test_suite: pass:7 fail:0 skip:0 total:7
[ 31.358990] # Totals: pass:125 fail:0 skip:0 total:125
[ 31.358992] ok 2 mpam_resctrl_test_suite
------
Best regards,
Zeng Heng
* Re: [PATCH v4 18/41] arm_mpam: resctrl: Implement helpers to update configuration
2026-02-03 21:43 ` [PATCH v4 18/41] arm_mpam: resctrl: Implement helpers to update configuration Ben Horgan
@ 2026-02-14 10:39 ` Zeng Heng
2026-02-16 14:23 ` Ben Horgan
0 siblings, 1 reply; 89+ messages in thread
From: Zeng Heng @ 2026-02-14 10:39 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, linux-doc,
Shaopeng Tan
Hi Ben,
On 2026/2/4 5:43, Ben Horgan wrote:
> From: James Morse <james.morse@arm.com>
>
> resctrl has two helpers for updating the configuration.
> resctrl_arch_update_one() updates a single value, and is used by the
> software-controller to apply feedback to the bandwidth controls, it has to
> be called on one of the CPUs in the resctrl:domain.
>
> resctrl_arch_update_domains() copies multiple staged configurations, it can
> be called from anywhere.
>
> Both helpers should update any changes to the underlying hardware.
>
> Implement resctrl_arch_update_domains() to use
> resctrl_arch_update_one(). Neither need to be called on a specific CPU as
> the mpam driver will send IPIs as needed.
>
> Tested-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
> Tested-by: Peter Newman <peternewman@google.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> ---
> Changes since rfc:
> list_for_each_entry -> list_for_each_entry_rcu
> return 0
> Restrict scope of local variables
>
> Changes since v2:
> whitespace fix
> ---
> drivers/resctrl/mpam_resctrl.c | 70 ++++++++++++++++++++++++++++++++++
> 1 file changed, 70 insertions(+)
>
> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
> index ecf00386edca..48d047510089 100644
> --- a/drivers/resctrl/mpam_resctrl.c
> +++ b/drivers/resctrl/mpam_resctrl.c
> @@ -212,6 +212,76 @@ u32 resctrl_arch_get_config(struct rdt_resource *r, struct rdt_ctrl_domain *d,
> }
> }
>
> +int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
> + u32 closid, enum resctrl_conf_type t, u32 cfg_val)
> +{
> + u32 partid;
> + struct mpam_config cfg;
> + struct mpam_props *cprops;
> + struct mpam_resctrl_res *res;
> + struct mpam_resctrl_dom *dom;
> +
> + lockdep_assert_cpus_held();
> + lockdep_assert_irqs_enabled();
> +
> + /*
> + * No need to check the CPU as mpam_apply_config() doesn't care, and
> + * resctrl_arch_update_domains() relies on this.
> + */
> + res = container_of(r, struct mpam_resctrl_res, resctrl_res);
> + dom = container_of(d, struct mpam_resctrl_dom, resctrl_ctrl_dom);
> + cprops = &res->class->props;
> +
> + partid = resctrl_get_config_index(closid, t);
As a caveat, I must admit I cannot verify this feedback on my local
Kunpeng environment, since MB functionality is not yet supported by the
driver there. However, after careful consideration, I believe it is
worth bringing up for discussion.
Regarding the MB configuration flow, the partid conversion should
include the mpam_resctrl_hide_cdp() condition check. Here's the
rationale:
After resctrl parses a schemata update, the MB configuration is set via
parse_bw() or rdtgroup_init_mba(), which store the updated configuration
in dom->staged_config[CDP_NONE]. If the MB configuration update then
uses t = CDP_NONE directly, MB ends up with the wrong partid and
therefore the wrong cfg[partid].
The fix would look something like:
- partid = resctrl_get_config_index(closid, t);
+ if (mpam_resctrl_hide_cdp(r->rid))
+ /* The configuration of CDP_DATA is same as the CDP_CODE one. */
+ partid = resctrl_get_config_index(closid, CDP_DATA);
+ else
+ partid = resctrl_get_config_index(closid, t);
Similarly, resctrl_arch_get_config() requires the same treatment to
ensure consistency.
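To put concrete numbers on the concern, here is a minimal user-space
sketch, not driver code. It assumes resctrl_get_config_index() maps
CDP_DATA to 2 * closid and CDP_CODE to 2 * closid + 1, and returns closid
unchanged for CDP_NONE. The names config_index(), pick_partid(),
partid_selfcheck() and the hide_cdp flag are hypothetical stand-ins; the
flag plays the role of mpam_resctrl_hide_cdp(r->rid) returning true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copy of the three configuration types used by resctrl. */
enum conf_type { CDP_NONE, CDP_CODE, CDP_DATA };

/*
 * Assumed model of resctrl_get_config_index(): with CDP, each closid
 * owns a pair of indices (data = 2 * closid, code = 2 * closid + 1).
 */
static uint32_t config_index(uint32_t closid, enum conf_type t)
{
	switch (t) {
	case CDP_CODE:
		return closid * 2 + 1;
	case CDP_DATA:
		return closid * 2;
	case CDP_NONE:
	default:
		return closid;
	}
}

/*
 * Hypothetical version of the proposed selection: when CDP is hidden
 * for this resource, always use the CDP_DATA index of the closid.
 */
static uint32_t pick_partid(uint32_t closid, enum conf_type t, bool hide_cdp)
{
	if (hide_cdp)
		return config_index(closid, CDP_DATA);
	return config_index(closid, t);
}

/* Walk through closid 1 with and without the hide-CDP check. */
void partid_selfcheck(void)
{
	/* Without the check, CDP_NONE passes closid 1 through as index 1... */
	assert(pick_partid(1, CDP_NONE, false) == 1);
	/* ...which in the paired layout is closid 0's code index. */
	assert(config_index(0, CDP_CODE) == 1);
	/* With the check, closid 1 lands on its own data index. */
	assert(pick_partid(1, CDP_NONE, true) == 2);
}
```

Under these assumptions, the unpatched path for closid 1 selects index 1,
which belongs to closid 0's code configuration, while the proposed path
selects index 2, closid 1's own data slot.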
Best regards,
and happy Lunar New Year!
Zeng Heng
> + if (!r->alloc_capable || partid >= resctrl_arch_get_num_closid(r)) {
> + pr_debug("Not alloc capable or computed PARTID out of range\n");
> + return -EINVAL;
> + }
> +
> + /*
> + * Copy the current config to avoid clearing other resources when the
> + * same component is exposed multiple times through resctrl.
> + */
> + cfg = dom->ctrl_comp->cfg[partid];
> +
> + switch (r->rid) {
> + case RDT_RESOURCE_L2:
> + case RDT_RESOURCE_L3:
> + cfg.cpbm = cfg_val;
> + mpam_set_feature(mpam_feat_cpor_part, &cfg);
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + return mpam_apply_config(dom->ctrl_comp, partid, &cfg);
> +}
> +
> +int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
> +{
> + int err;
> + struct rdt_ctrl_domain *d;
> +
> + lockdep_assert_cpus_held();
> + lockdep_assert_irqs_enabled();
> +
> + list_for_each_entry_rcu(d, &r->ctrl_domains, hdr.list) {
> + for (enum resctrl_conf_type t = 0; t < CDP_NUM_TYPES; t++) {
> + struct resctrl_staged_config *cfg = &d->staged_config[t];
> +
> + if (!cfg->have_new_ctrl)
> + continue;
> +
> + err = resctrl_arch_update_one(r, d, closid, t,
> + cfg->new_ctrl);
> + if (err)
> + return err;
> + }
> + }
> +
> + return 0;
> +}
* Re: [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code
2026-02-14 9:40 ` Zeng Heng
@ 2026-02-16 12:22 ` Ben Horgan
2026-02-24 11:03 ` Zeng Heng
0 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-16 12:22 UTC (permalink / raw)
To: Zeng Heng
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, linux-doc,
Kefeng Wang
Hi Zeng,
On 2/14/26 09:40, Zeng Heng wrote:
> Hi Ben,
>
> On 2026/2/4 5:43, Ben Horgan wrote:
[...]
>>
>> Based on:
>> [1] git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/cache
>> (To include telemetry code which changes the resctrl arch interface)
>>
>> The series can be retrieved from:
>> https://gitlab.arm.com/linux-arm/linux-bh.git mpam_resctrl_glue_v4
>> (Final commit is a fix already in 6.19-rc8)
>>
[...]
>>
>
> I've tested the MPAM functionality on my local Kunpeng platform. Here's
> a summary of the results:
Thank you very much for your testing and detailed report.
>
> Features enabled and verified:
> * L2 and L3 CPBM
> * L3 CSU
> * L2 and L3 CDP
> All enabled features passed functional testing as expected.
>
> + Tested-by: Zeng Heng <zengheng4@huawei.com>
>
> Features not enabled:
> 1. MATA MBMAX partition and MBWU monitor.
What's MATA here? Just memory allocation or something more specific?
> Reason: These do not meet the driver's current topology
> expectations for MB support, hence they were not initialized.
> This behavior is expected.
Is this because you have more than 1 L3 cache?
>
> 2. L2 CSU and MBWU monitors.
> Reason: The current MPAM driver does not support L2-related
> functionality yet.
Thanks for letting us know you have these. But, yes, unfortunately
monitoring is only supported on the L3 at the moment.
>
> + Tested-by: Zeng Heng <zengheng4@huawei.com>
>
>
> Detailed test logs are as follows:
I'm confused by these logs as it looks like you aren't getting any
monitors but you verified the L3 CSU. Also, it looks like cpor (cpbm) is
disabled (at least partially) but you verified L2 and L3 CPBM. Is this
across different machines?
>
> Boot logs:
> [root@localhost ~]# dmesg | grep -i mpam
> [ 0.000000] ACPI: MPAM 0x000000007FF35018 003024 (v01 HISI HIP12
> 00000000 HISI 20151124)
> [ 9.509852] mpam_msc mpam_msc.64: Merging features for
> vmsc:0xffff0800973cf5a0 |= ris:0xffff08009757ee90
> [ 9.509859] mpam_msc mpam_msc.254: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff080097628520
> [ 9.509860] mpam:__props_mismatch:
> mpam_has_feature(mpam_feat_cpor_part, parent) = 1
> [ 9.509864] mpam:__props_mismatch:
> mpam_has_feature(mpam_feat_cpor_part, child) = 0
> [ 9.509866] mpam:__props_mismatch: parent->cpbm_wd = 8
> [ 9.509869] mpam:__props_mismatch: child->cpbm_wd = 0
> [ 9.509871] mpam:__props_mismatch: alias = 0
> [ 9.509873] mpam:__props_mismatch: cleared cpor_part
cpor (partially) disabled?
> [ 9.509876] mpam:__props_mismatch: took the min num_csu_mon
> [ 9.509878] mpam:__props_mismatch: took the min num_mbwu_mon
> [ 9.509881] mpam_msc mpam_msc.252: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800976284a0
> [ 9.509884] mpam_msc mpam_msc.250: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff080097628420
> [ 9.509887] mpam_msc mpam_msc.248: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800976283a0
> [ 9.509889] mpam_msc mpam_msc.246: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff080097628320
> [ 9.509892] mpam_msc mpam_msc.244: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800976282a0
> [ 9.509894] mpam_msc mpam_msc.242: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff080097628220
> [ 9.509897] mpam_msc mpam_msc.240: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800976281a0
> [ 9.509900] mpam_msc mpam_msc.238: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff080097628120
> [ 9.509902] mpam_msc mpam_msc.236: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800976280a0
> [ 9.509905] mpam_msc mpam_msc.234: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff080097628020
> [ 9.509907] mpam_msc mpam_msc.232: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975affa0
> [ 9.509910] mpam_msc mpam_msc.230: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aff20
> [ 9.509913] mpam_msc mpam_msc.228: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975afea0
> [ 9.509915] mpam_msc mpam_msc.226: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975afe20
> [ 9.509918] mpam_msc mpam_msc.224: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975afda0
> [ 9.509920] mpam_msc mpam_msc.222: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975afd20
> [ 9.509923] mpam_msc mpam_msc.220: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975afca0
> [ 9.509925] mpam_msc mpam_msc.218: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975afc20
> [ 9.509928] mpam_msc mpam_msc.216: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975afba0
> [ 9.509931] mpam_msc mpam_msc.214: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975afb20
> [ 9.509933] mpam_msc mpam_msc.212: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975afaa0
> [ 9.509936] mpam_msc mpam_msc.210: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975afa20
> [ 9.509938] mpam_msc mpam_msc.208: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af9a0
> [ 9.509941] mpam_msc mpam_msc.206: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af920
> [ 9.509943] mpam_msc mpam_msc.204: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af8a0
> [ 9.509946] mpam_msc mpam_msc.202: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af820
> [ 9.509949] mpam_msc mpam_msc.200: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af7a0
> [ 9.509951] mpam_msc mpam_msc.198: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af720
> [ 9.509954] mpam_msc mpam_msc.196: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af6a0
> [ 9.509956] mpam_msc mpam_msc.194: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af620
> [ 9.509959] mpam_msc mpam_msc.192: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af5a0
> [ 9.509962] mpam_msc mpam_msc.190: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af520
> [ 9.509964] mpam_msc mpam_msc.188: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af4a0
> [ 9.509967] mpam_msc mpam_msc.186: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af420
> [ 9.509969] mpam_msc mpam_msc.184: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af3a0
> [ 9.509972] mpam_msc mpam_msc.182: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af320
> [ 9.509974] mpam_msc mpam_msc.180: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af2a0
> [ 9.509977] mpam_msc mpam_msc.178: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af220
> [ 9.509980] mpam_msc mpam_msc.176: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af1a0
> [ 9.509982] mpam_msc mpam_msc.174: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af120
> [ 9.509985] mpam_msc mpam_msc.172: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af0a0
> [ 9.509987] mpam_msc mpam_msc.170: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975af020
> [ 9.509990] mpam_msc mpam_msc.168: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aefa0
> [ 9.509993] mpam_msc mpam_msc.166: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aef20
> [ 9.509995] mpam_msc mpam_msc.164: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aeea0
> [ 9.509998] mpam_msc mpam_msc.162: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aee20
> [ 9.510000] mpam_msc mpam_msc.160: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aeda0
> [ 9.510003] mpam_msc mpam_msc.158: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aed20
> [ 9.510005] mpam_msc mpam_msc.156: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aeca0
> [ 9.510008] mpam_msc mpam_msc.154: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aec20
> [ 9.510010] mpam_msc mpam_msc.152: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aeba0
> [ 9.510013] mpam_msc mpam_msc.150: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aeb20
> [ 9.510016] mpam_msc mpam_msc.148: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aeaa0
> [ 9.510018] mpam_msc mpam_msc.146: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975aea20
> [ 9.510021] mpam_msc mpam_msc.144: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae9a0
> [ 9.510023] mpam_msc mpam_msc.142: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae920
> [ 9.510026] mpam_msc mpam_msc.140: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae8a0
> [ 9.510029] mpam_msc mpam_msc.138: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae820
> [ 9.510031] mpam_msc mpam_msc.136: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae7a0
> [ 9.510034] mpam_msc mpam_msc.134: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae720
> [ 9.510036] mpam_msc mpam_msc.132: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae6a0
> [ 9.510039] mpam_msc mpam_msc.130: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae620
> [ 9.510041] mpam_msc mpam_msc.128: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae5a0
> [ 9.510044] mpam_msc mpam_msc.126: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae520
> [ 9.510047] mpam_msc mpam_msc.124: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae4a0
> [ 9.510049] mpam_msc mpam_msc.122: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae420
> [ 9.510052] mpam_msc mpam_msc.120: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae3a0
> [ 9.510054] mpam_msc mpam_msc.118: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae2a0
> [ 9.510057] mpam_msc mpam_msc.116: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae220
> [ 9.510060] mpam_msc mpam_msc.114: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae1a0
> [ 9.510062] mpam_msc mpam_msc.112: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae120
> [ 9.510065] mpam_msc mpam_msc.110: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae0a0
> [ 9.510067] mpam_msc mpam_msc.108: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800975ae020
> [ 9.510070] mpam_msc mpam_msc.106: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff08009729c720
> [ 9.510073] mpam_msc mpam_msc.104: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cffa0
> [ 9.510075] mpam_msc mpam_msc.102: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cff20
> [ 9.510078] mpam_msc mpam_msc.100: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cfea0
> [ 9.510080] mpam_msc mpam_msc.98: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cfe20
> [ 9.510083] mpam_msc mpam_msc.96: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cfda0
> [ 9.510086] mpam_msc mpam_msc.94: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cfd20
> [ 9.510088] mpam_msc mpam_msc.92: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cfca0
> [ 9.510091] mpam_msc mpam_msc.90: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cfc20
> [ 9.510094] mpam_msc mpam_msc.88: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cfba0
> [ 9.510096] mpam_msc mpam_msc.86: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cfb20
> [ 9.510099] mpam_msc mpam_msc.84: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cfaa0
> [ 9.510102] mpam_msc mpam_msc.82: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cfa20
> [ 9.510104] mpam_msc mpam_msc.80: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cf9a0
> [ 9.510107] mpam_msc mpam_msc.78: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cf920
> [ 9.510109] mpam_msc mpam_msc.76: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cf8a0
> [ 9.510112] mpam_msc mpam_msc.74: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cf820
> [ 9.510115] mpam_msc mpam_msc.72: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cf7a0
> [ 9.510117] mpam_msc mpam_msc.70: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cf720
> [ 9.510120] mpam_msc mpam_msc.68: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cf6a0
> [ 9.510123] mpam_msc mpam_msc.66: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cf620
> [ 9.510125] mpam_msc mpam_msc.64: Merging features for
> class:0xffff08009736fe50 &= vmsc:0xffff0800973cf5a0
> [ 9.510129] mpam_msc mpam_msc.62: Merging features for
> vmsc:0xffff0800973cf520 |= ris:0xffff08009757ea90
> [ 9.510132] mpam_msc mpam_msc.60: Merging features for
> vmsc:0xffff0800973cf4a0 |= ris:0xffff08009757e690
> [ 9.510133] mpam_msc mpam_msc.58: Merging features for
> vmsc:0xffff0800973cf420 |= ris:0xffff08009757e290
> [ 9.510136] mpam_msc mpam_msc.56: Merging features for
> vmsc:0xffff0800973cf3a0 |= ris:0xffff08009757de90
> [ 9.510139] mpam_msc mpam_msc.54: Merging features for
> vmsc:0xffff0800973cf320 |= ris:0xffff08009757da90
> [ 9.510141] mpam_msc mpam_msc.52: Merging features for
> vmsc:0xffff0800973cf2a0 |= ris:0xffff08009757d690
> [ 9.510144] mpam_msc mpam_msc.50: Merging features for
> vmsc:0xffff0800973cf220 |= ris:0xffff08009757d290
> [ 9.510146] mpam_msc mpam_msc.48: Merging features for
> vmsc:0xffff0800973cf1a0 |= ris:0xffff08009757ce90
> [ 9.510149] mpam_msc mpam_msc.46: Merging features for
> vmsc:0xffff0800973cf120 |= ris:0xffff08009757ca90
> [ 9.510152] mpam_msc mpam_msc.44: Merging features for
> vmsc:0xffff0800973cf0a0 |= ris:0xffff08009757c690
> [ 9.510154] mpam_msc mpam_msc.42: Merging features for
> vmsc:0xffff0800973cf020 |= ris:0xffff08009757c290
> [ 9.510157] mpam_msc mpam_msc.40: Merging features for
> vmsc:0xffff0800973cefa0 |= ris:0xffff08009757be90
> [ 9.510160] mpam_msc mpam_msc.38: Merging features for
> vmsc:0xffff0800973cef20 |= ris:0xffff08009757ba90
> [ 9.510162] mpam_msc mpam_msc.36: Merging features for
> vmsc:0xffff0800973ceea0 |= ris:0xffff08009757b690
> [ 9.510166] mpam_msc mpam_msc.34: Merging features for
> vmsc:0xffff0800973cee20 |= ris:0xffff08009757b290
> [ 9.510167] mpam_msc mpam_msc.32: Merging features for
> vmsc:0xffff0800973ceda0 |= ris:0xffff08009757ae90
> [ 9.510170] mpam_msc mpam_msc.30: Merging features for
> vmsc:0xffff0800973ced20 |= ris:0xffff08009757aa90
> [ 9.510173] mpam_msc mpam_msc.28: Merging features for
> vmsc:0xffff0800973ceca0 |= ris:0xffff08009757a690
> [ 9.510175] mpam_msc mpam_msc.26: Merging features for
> vmsc:0xffff0800973cec20 |= ris:0xffff08009757a290
> [ 9.510178] mpam_msc mpam_msc.24: Merging features for
> vmsc:0xffff0800973ceba0 |= ris:0xffff080097579e90
> [ 9.510181] mpam_msc mpam_msc.22: Merging features for
> vmsc:0xffff0800973ceb20 |= ris:0xffff080097579a90
> [ 9.510183] mpam_msc mpam_msc.20: Merging features for
> vmsc:0xffff0800973ceaa0 |= ris:0xffff080097579690
> [ 9.510186] mpam_msc mpam_msc.18: Merging features for
> vmsc:0xffff0800973cea20 |= ris:0xffff080097579290
> [ 9.510189] mpam_msc mpam_msc.16: Merging features for
> vmsc:0xffff0800973ce9a0 |= ris:0xffff080097578e90
> [ 9.510191] mpam_msc mpam_msc.62: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cf520
> [ 9.510195] mpam_msc mpam_msc.60: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cf4a0
> [ 9.510196] mpam_msc mpam_msc.58: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cf420
> [ 9.510199] mpam_msc mpam_msc.56: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cf3a0
> [ 9.510202] mpam_msc mpam_msc.54: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cf320
> [ 9.510204] mpam_msc mpam_msc.52: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cf2a0
> [ 9.510207] mpam_msc mpam_msc.50: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cf220
> [ 9.510209] mpam_msc mpam_msc.48: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cf1a0
> [ 9.510212] mpam_msc mpam_msc.46: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cf120
> [ 9.510214] mpam_msc mpam_msc.44: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cf0a0
> [ 9.510217] mpam_msc mpam_msc.42: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cf020
> [ 9.510219] mpam_msc mpam_msc.40: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cefa0
> [ 9.510222] mpam_msc mpam_msc.38: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cef20
> [ 9.510224] mpam_msc mpam_msc.36: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973ceea0
> [ 9.510227] mpam_msc mpam_msc.34: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cee20
> [ 9.510230] mpam_msc mpam_msc.32: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973ceda0
> [ 9.510232] mpam_msc mpam_msc.30: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973ced20
> [ 9.510235] mpam_msc mpam_msc.28: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973ceca0
> [ 9.510237] mpam_msc mpam_msc.26: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cec20
> [ 9.510240] mpam_msc mpam_msc.24: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973ceba0
> [ 9.510242] mpam_msc mpam_msc.22: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973ceb20
> [ 9.510245] mpam_msc mpam_msc.20: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973ceaa0
> [ 9.510247] mpam_msc mpam_msc.18: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973cea20
> [ 9.510250] mpam_msc mpam_msc.16: Merging features for
> class:0xffff08009736cc50 &= vmsc:0xffff0800973ce9a0
> [ 9.510254] mpam_msc mpam_msc.14: Merging features for
> vmsc:0xffff0800973ce920 |= ris:0xffff080097578a90
> [ 9.510255] mpam_msc mpam_msc.12: Merging features for
> vmsc:0xffff0800973ce8a0 |= ris:0xffff080097578690
> [ 9.510258] mpam_msc mpam_msc.10: Merging features for
> vmsc:0xffff0800973ce820 |= ris:0xffff080097578290
> [ 9.510260] mpam_msc mpam_msc.8: Merging features for
> vmsc:0xffff0800973ce7a0 |= ris:0xffff080097417e90
> [ 9.510263] mpam_msc mpam_msc.6: Merging features for
> vmsc:0xffff0800973ce720 |= ris:0xffff080097417a90
> [ 9.510266] mpam_msc mpam_msc.4: Merging features for
> vmsc:0xffff0800973ce6a0 |= ris:0xffff080097417690
> [ 9.510268] mpam_msc mpam_msc.2: Merging features for
> vmsc:0xffff0800973ce620 |= ris:0xffff080097417290
> [ 9.510271] mpam_msc mpam_msc.0: Merging features for
> vmsc:0xffff0800973ce5a0 |= ris:0xffff080097416e90
> [ 9.510274] mpam_msc mpam_msc.14: Merging features for
> class:0xffff08009735bb50 &= vmsc:0xffff0800973ce920
> [ 9.510276] mpam_msc mpam_msc.12: Merging features for
> class:0xffff08009735bb50 &= vmsc:0xffff0800973ce8a0
> [ 9.510279] mpam_msc mpam_msc.10: Merging features for
> class:0xffff08009735bb50 &= vmsc:0xffff0800973ce820
> [ 9.510281] mpam_msc mpam_msc.8: Merging features for
> class:0xffff08009735bb50 &= vmsc:0xffff0800973ce7a0
> [ 9.510284] mpam_msc mpam_msc.6: Merging features for
> class:0xffff08009735bb50 &= vmsc:0xffff0800973ce720
> [ 9.510287] mpam_msc mpam_msc.4: Merging features for
> class:0xffff08009735bb50 &= vmsc:0xffff0800973ce6a0
> [ 9.510289] mpam_msc mpam_msc.2: Merging features for
> class:0xffff08009735bb50 &= vmsc:0xffff0800973ce620
> [ 9.510292] mpam_msc mpam_msc.0: Merging features for
> class:0xffff08009735bb50 &= vmsc:0xffff0800973ce5a0
> [ 10.978496] mpam:mpam_resctrl_pick_caches: class 2 cache misses CPOR
No L2 cpor?
> [ 10.978497] mpam:mpam_resctrl_pick_caches: class 255 is not a cache
> [ 10.980470] mpam:mpam_resctrl_pick_mba: class 2 is before L3
> [ 10.980472] mpam:mpam_resctrl_pick_mba: class 3 has no bandwidth control
> [ 10.997406] mpam:topology_matches_l3: class 255 component 0 has
> Mismatched CPU mask with L3 equivalent
> [ 10.997411] mpam:mpam_resctrl_pick_mba: class 255 topology doesn't
> match L3
> [ 10.997415] mpam:mpam_resctrl_pick_counters: class 2 is before L3
> [ 11.024109] mpam:topology_matches_l3: class 3 component 276 has
> Mismatched CPU mask with L3 equivalent
> [ 11.024114] mpam:class_has_usable_mbwu: monitors usable in free-
> running mode
mbwu enabled?
> [ 11.063882] mpam:topology_matches_l3: class 255 component 0 has
> Mismatched CPU mask with L3 equivalent
> [ 11.113183] mpam:mpam_resctrl_alloc_domain: Skipped monitor domain
> online - no monitors
> [ 11.113189] MPAM enabled with 32 PARTIDs and 4 PMGs
>
>
> Kunit test logs:
> [ 31.253697] KTAP version 1
> [ 31.253698] 1..2
> [ 31.258263] KTAP version 1
> [ 31.258265] # Subtest: mpam_devices_test_suite
> [ 31.258267] # module: mpam
> [ 31.258268] 1..3
> [ 31.258775] ok 1 test_mpam_reset_msc_bitmap
> [ 31.259558] ok 2 test_mpam_enable_merge_features
> [ 31.260259] ok 3 test__props_mismatch
> [ 31.260261] # mpam_devices_test_suite: pass:3 fail:0 skip:0 total:3
> [ 31.260263] # Totals: pass:3 fail:0 skip:0 total:3
> [ 31.260265] ok 1 mpam_devices_test_suite
> [ 31.260267] KTAP version 1
> [ 31.260268] # Subtest: mpam_resctrl_test_suite
> [ 31.260269] # module: mpam
> [ 31.260271] 1..7
> [ 31.260965] ok 1 test_get_mba_granularity
> [ 31.260968] KTAP version 1
> [ 31.260969] # Subtest: test_mbw_max_to_percent
> [ 31.261372] ok 1 pc=1, width=8, value=0x01
> [ 31.261794] ok 2 pc=1, width=12, value=0x027
> [ 31.262081] ok 3 pc=1, width=16, value=0x028e
> [ 31.262183] ok 4 pc=25, width=8, value=0x3f
> [ 31.262287] ok 5 pc=25, width=12, value=0x3ff
> [ 31.262388] ok 6 pc=25, width=16, value=0x3fff
> [ 31.262489] ok 7 pc=33, width=8, value=0x53
> [ 31.262608] ok 8 pc=33, width=12, value=0x546
> [ 31.262860] ok 9 pc=33, width=16, value=0x5479
> [ 31.263113] ok 10 pc=35, width=8, value=0x58
> [ 31.263491] ok 11 pc=35, width=12, value=0x598
> [ 31.263872] ok 12 pc=35, width=16, value=0x5998
> [ 31.264249] ok 13 pc=45, width=8, value=0x72
> [ 31.264352] ok 14 pc=45, width=12, value=0x732
> [ 31.264455] ok 15 pc=45, width=16, value=0x7332
> [ 31.264559] ok 16 pc=50, width=8, value=0x7f
> [ 31.264661] ok 17 pc=50, width=12, value=0x7ff
> [ 31.264764] ok 18 pc=50, width=16, value=0x7fff
> [ 31.264872] ok 19 pc=52, width=8, value=0x84
> [ 31.264978] ok 20 pc=52, width=12, value=0x850
> [ 31.265082] ok 21 pc=52, width=16, value=0x851d
> [ 31.265190] ok 22 pc=55, width=8, value=0x8b
> [ 31.265297] ok 23 pc=55, width=12, value=0x8cb
> [ 31.265403] ok 24 pc=55, width=16, value=0x8ccb
> [ 31.265507] ok 25 pc=58, width=8, value=0x93
> [ 31.265609] ok 26 pc=58, width=12, value=0x946
> [ 31.265714] ok 27 pc=58, width=16, value=0x9479
> [ 31.265817] ok 28 pc=75, width=8, value=0xbf
> [ 31.265918] ok 29 pc=75, width=12, value=0xbff
> [ 31.266020] ok 30 pc=75, width=16, value=0xbfff
> [ 31.266120] ok 31 pc=80, width=8, value=0xcb
> [ 31.266220] ok 32 pc=80, width=12, value=0xccb
> [ 31.266322] ok 33 pc=80, width=16, value=0xcccb
> [ 31.266425] ok 34 pc=88, width=8, value=0xe0
> [ 31.266533] ok 35 pc=88, width=12, value=0xe13
> [ 31.266637] ok 36 pc=88, width=16, value=0xe146
> [ 31.266758] ok 37 pc=95, width=8, value=0xf2
> [ 31.267150] ok 38 pc=95, width=12, value=0xf32
> [ 31.267535] ok 39 pc=95, width=16, value=0xf332
> [ 31.267918] ok 40 pc=100, width=8, value=0xff
> [ 31.268160] ok 41 pc=100, width=12, value=0xfff
> [ 31.268264] ok 42 pc=100, width=16, value=0xffff
> [ 31.268266] # test_mbw_max_to_percent: pass:42 fail:0 skip:0
> total:42
> [ 31.268268] ok 2 test_mbw_max_to_percent
> [ 31.268270] KTAP version 1
> [ 31.268271] # Subtest: test_percent_to_mbw_max
> [ 31.268376] ok 1 pc=1, width=8, value=0x01
> [ 31.268483] ok 2 pc=1, width=12, value=0x027
> [ 31.268595] ok 3 pc=1, width=16, value=0x028e
> [ 31.268701] ok 4 pc=25, width=8, value=0x3f
> [ 31.268806] ok 5 pc=25, width=12, value=0x3ff
> [ 31.268915] ok 6 pc=25, width=16, value=0x3fff
> [ 31.269022] ok 7 pc=33, width=8, value=0x53
> [ 31.269129] ok 8 pc=33, width=12, value=0x546
> [ 31.269237] ok 9 pc=33, width=16, value=0x5479
> [ 31.269342] ok 10 pc=35, width=8, value=0x58
> [ 31.269446] ok 11 pc=35, width=12, value=0x598
> [ 31.269551] ok 12 pc=35, width=16, value=0x5998
> [ 31.269658] ok 13 pc=45, width=8, value=0x72
> [ 31.269764] ok 14 pc=45, width=12, value=0x732
> [ 31.269868] ok 15 pc=45, width=16, value=0x7332
> [ 31.269975] ok 16 pc=50, width=8, value=0x7f
> [ 31.270081] ok 17 pc=50, width=12, value=0x7ff
> [ 31.270185] ok 18 pc=50, width=16, value=0x7fff
> [ 31.270287] ok 19 pc=52, width=8, value=0x84
> [ 31.270388] ok 20 pc=52, width=12, value=0x850
> [ 31.270494] ok 21 pc=52, width=16, value=0x851d
> [ 31.270606] ok 22 pc=55, width=8, value=0x8b
> [ 31.271004] ok 23 pc=55, width=12, value=0x8cb
> [ 31.271387] ok 24 pc=55, width=16, value=0x8ccb
> [ 31.271770] ok 25 pc=58, width=8, value=0x93
> [ 31.272151] ok 26 pc=58, width=12, value=0x946
> [ 31.272260] ok 27 pc=58, width=16, value=0x9479
> [ 31.272366] ok 28 pc=75, width=8, value=0xbf
> [ 31.272472] ok 29 pc=75, width=12, value=0xbff
> [ 31.272580] ok 30 pc=75, width=16, value=0xbfff
> [ 31.272686] ok 31 pc=80, width=8, value=0xcb
> [ 31.272790] ok 32 pc=80, width=12, value=0xccb
> [ 31.272895] ok 33 pc=80, width=16, value=0xcccb
> [ 31.273000] ok 34 pc=88, width=8, value=0xe0
> [ 31.273106] ok 35 pc=88, width=12, value=0xe13
> [ 31.273209] ok 36 pc=88, width=16, value=0xe146
> [ 31.273318] ok 37 pc=95, width=8, value=0xf2
> [ 31.273424] ok 38 pc=95, width=12, value=0xf32
> [ 31.273528] ok 39 pc=95, width=16, value=0xf332
> [ 31.273635] ok 40 pc=100, width=8, value=0xff
> [ 31.273742] ok 41 pc=100, width=12, value=0xfff
> [ 31.273847] ok 42 pc=100, width=16, value=0xffff
> [ 31.273849] # test_percent_to_mbw_max: pass:42 fail:0 skip:0 total:42
> [ 31.273850] ok 3 test_percent_to_mbw_max
> [ 31.273852] KTAP version 1
> [ 31.273853] # Subtest: test_mbw_max_to_percent_limits
> [ 31.273957] ok 1 wd=1
> [ 31.274064] ok 2 wd=2
> [ 31.274171] ok 3 wd=3
> [ 31.274276] ok 4 wd=4
> [ 31.274381] ok 5 wd=5
> [ 31.274485] ok 6 wd=6
> [ 31.274603] ok 7 wd=7
> [ 31.274710] ok 8 wd=8
> [ 31.274974] ok 9 wd=9
> [ 31.275362] ok 10 wd=10
> [ 31.275746] ok 11 wd=11
> [ 31.276122] ok 12 wd=12
> [ 31.276230] ok 13 wd=13
> [ 31.276335] ok 14 wd=14
> [ 31.276444] ok 15 wd=15
> [ 31.276551] ok 16 wd=16
> [ 31.276553] # test_mbw_max_to_percent_limits: pass:16 fail:0 skip:0 total:16
> [ 31.276554] ok 4 test_mbw_max_to_percent_limits
> [ 31.276605] # test_percent_to_max_rounding: Round-up rate: 43% (18/42)
> [ 31.276668] ok 5 test_percent_to_max_rounding
> [ 31.276671] KTAP version 1
> [ 31.276672] # Subtest: test_percent_max_roundtrip_stability
> [ 31.276776] ok 1 wd=1
> [ 31.276883] ok 2 wd=2
> [ 31.276988] ok 3 wd=3
> [ 31.277096] ok 4 wd=4
> [ 31.277202] ok 5 wd=5
> [ 31.277309] ok 6 wd=6
> [ 31.277416] ok 7 wd=7
> [ 31.277524] ok 8 wd=8
> [ 31.277629] ok 9 wd=9
> [ 31.277737] ok 10 wd=10
> [ 31.277843] ok 11 wd=11
> [ 31.277948] ok 12 wd=12
> [ 31.278061] ok 13 wd=13
> [ 31.278167] ok 14 wd=14
> [ 31.278273] ok 15 wd=15
> [ 31.278380] ok 16 wd=16
> [ 31.278381] # test_percent_max_roundtrip_stability: pass:16 fail:0 skip:0 total:16
> [ 31.278383] ok 6 test_percent_max_roundtrip_stability
> [ 31.278385] KTAP version 1
> [ 31.278386] # Subtest: test_rmid_idx_encoding
> [ 31.278490] ok 1 max_partid=0, max_pmg=0
> [ 31.278604] ok 2 max_partid=1, max_pmg=4
> [ 31.279008] ok 3 max_partid=3, max_pmg=1
> [ 31.279394] ok 4 max_partid=5, max_pmg=9
> [ 31.279777] ok 5 max_partid=4, max_pmg=4
> [ 31.280167] ok 6 max_partid=100, max_pmg=11
> [ 31.358979] ok 7 max_partid=65535, max_pmg=255
> [ 31.358985] # test_rmid_idx_encoding: pass:7 fail:0 skip:0 total:7
> [ 31.358987] ok 7 test_rmid_idx_encoding
> [ 31.358989] # mpam_resctrl_test_suite: pass:7 fail:0 skip:0 total:7
> [ 31.358990] # Totals: pass:125 fail:0 skip:0 total:125
> [ 31.358992] ok 2 mpam_resctrl_test_suite
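For reference, all 42 values logged for test_percent_to_mbw_max are
consistent with floor(pc * 2^width / 100) - 1. Below is a minimal sketch
of that mapping. The formula is inferred from the log only, not taken
from the driver source, and sketch_percent_to_mbw_max() is a hypothetical
name, so the real helper may round or clamp differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in, reverse-engineered from the logged values only:
 * scale the percentage to a 'width'-bit fixed-point fraction, truncate,
 * then subtract one so that 100% maps to all-ones rather than 2^width.
 */
static uint16_t sketch_percent_to_mbw_max(unsigned int pc, unsigned int width)
{
	uint32_t val;

	assert(width >= 1 && width <= 16);
	if (pc > 100)
		pc = 100;

	val = ((uint32_t)pc << width) / 100;	/* truncating division */
	if (val)
		val--;				/* never underflow below zero */

	return (uint16_t)val;
}
```

With this sketch, pc=50 at width=8 gives 0x7f and pc=1 at width=12 gives
0x027, matching the corresponding lines of the log.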
>
>
> ------
>
> Best regards,
> Zeng Heng
Thanks,
Ben
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource
2026-02-13 7:38 ` Zeng Heng
@ 2026-02-16 13:54 ` Ben Horgan
2026-02-18 16:22 ` Ben Horgan
0 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-16 13:54 UTC (permalink / raw)
To: Zeng Heng, james.morse, Jonathan Cameron
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, kobak, lcherian, linux-arm-kernel,
linux-kernel, peternewman, punit.agrawal, quic_jiles,
reinette.chatre, rohit.mathew, scott, sdonthineni, tan.shaopeng,
xhao, catalin.marinas, will, corbet, maz, oupton, joey.gouly,
suzuki.poulose, kvmarm, linux-doc, Kefeng Wang
Hi Zeng,
On 2/13/26 07:38, Zeng Heng wrote:
> Hi Ben,
>
> On 2026/2/6 0:50, Jonathan Cameron wrote:
>> On Tue, 3 Feb 2026 21:43:27 +0000
>> Ben Horgan <ben.horgan@arm.com> wrote:
>>
>>> From: James Morse <james.morse@arm.com>
>>>
>>> resctrl supports 'MB', as a percentage throttling of traffic from the
>>> L3. This is the control that mba_sc uses, so ideally the class chosen
>>> should be as close as possible to the counters used for mbm_total. If
>>> there is a single L3 and the topology of the memory matches then the
>>> traffic at the memory controller will be equivalent to that at egress of
>>> the L3. If these conditions are met allow the memory class to back MB.
>>>
>>> MB's percentage control should be backed either with the fixed point
>>> fraction MBW_MAX or bandwidth portion bitmaps. The bandwidth portion
>>> bitmaps are not used as it's tricky to pick which bits to use to avoid
>>> contention, and may be possible to expose this as something other than a
>>> percentage in the future.
>>
>> I'm very curious to know whether this heuristic is actually useful on
>> many
>> systems or whether many / most of them fail this 'shape' heuristic.
>>
>
> The current MPAM driver has restrictions that limit MB control support.
> For example, on some systems, the MPAM memory class MSCs are not located
> at the L3 cache but rather at the memory controller (e.g., Mata). In
> such cases, MB control and mbm_total bandwidth monitoring cannot be
> enabled.
>
> I'm unsure whether the community would welcome and be willing to review
> a patch series supporting such systems. Of course, the changes would
> involve minor refactoring in the resctrl common layer.
Having MSC at the memory controllers is quite common and it would be
good for the mpam driver and resctrl to support this. My current
priority is the initial MPAM support; I'll look at this and other extra
features later, but I'd be happy to help progress support in this area
through review and discussion. There is some discussion about adding new
schema at [1] and we should make sure this is consistent with monitors
too. James has some out-of-tree patches from before that discussion at [2].
[1] https://lore.kernel.org/lkml/aPtfMFfLV1l%2FRB0L@e133380.arm.com/
[2] git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git
mpam/snapshot+extras/v6.18-rc1
>
> Any feedback would be greatly appreciated.
>
>
> Best Regards,
> Zeng Heng
Thanks,
Ben
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 18/41] arm_mpam: resctrl: Implement helpers to update configuration
2026-02-14 10:39 ` Zeng Heng
@ 2026-02-16 14:23 ` Ben Horgan
2026-02-25 6:39 ` Zeng Heng
0 siblings, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-16 14:23 UTC (permalink / raw)
To: Zeng Heng
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, linux-doc,
Shaopeng Tan
Hi Zeng,
On 2/14/26 10:39, Zeng Heng wrote:
> Hi Ben,
>
> On 2026/2/4 5:43, Ben Horgan wrote:
>> From: James Morse <james.morse@arm.com>
>>
>> resctrl has two helpers for updating the configuration.
>> resctrl_arch_update_one() updates a single value, and is used by the
>> software-controller to apply feedback to the bandwidth controls, it
>> has to
>> be called on one of the CPUs in the resctrl:domain.
>>
>> resctrl_arch_update_domains() copies multiple staged configurations,
>> it can
>> be called from anywhere.
>>
>> Both helpers should update any changes to the underlying hardware.
>>
>> Implement resctrl_arch_update_domains() to use
>> resctrl_arch_update_one(). Neither need to be called on a specific CPU as
>> the mpam driver will send IPIs as needed.
>>
>> Tested-by: Gavin Shan <gshan@redhat.com>
>> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
>> Tested-by: Peter Newman <peternewman@google.com>
>> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>> Signed-off-by: James Morse <james.morse@arm.com>
>> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
>> ---
>> Changes since rfc:
>> list_for_each_entry -> list_for_each_entry_rcu
>> return 0
>> Restrict scope of local variables
>>
>> Changes since v2:
>> whitespace fix
>> ---
>> drivers/resctrl/mpam_resctrl.c | 70 ++++++++++++++++++++++++++++++++++
>> 1 file changed, 70 insertions(+)
>>
>> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/
>> mpam_resctrl.c
>> index ecf00386edca..48d047510089 100644
>> --- a/drivers/resctrl/mpam_resctrl.c
>> +++ b/drivers/resctrl/mpam_resctrl.c
>> @@ -212,6 +212,76 @@ u32 resctrl_arch_get_config(struct rdt_resource
>> *r, struct rdt_ctrl_domain *d,
>> }
>> }
>> +int resctrl_arch_update_one(struct rdt_resource *r, struct
>> rdt_ctrl_domain *d,
>> + u32 closid, enum resctrl_conf_type t, u32 cfg_val)
>> +{
>> + u32 partid;
>> + struct mpam_config cfg;
>> + struct mpam_props *cprops;
>> + struct mpam_resctrl_res *res;
>> + struct mpam_resctrl_dom *dom;
>> +
>> + lockdep_assert_cpus_held();
>> + lockdep_assert_irqs_enabled();
>> +
>> + /*
>> + * No need to check the CPU as mpam_apply_config() doesn't care, and
>> + * resctrl_arch_update_domains() relies on this.
>> + */
>> + res = container_of(r, struct mpam_resctrl_res, resctrl_res);
>> + dom = container_of(d, struct mpam_resctrl_dom, resctrl_ctrl_dom);
>> + cprops = &res->class->props;
>> +
>> + partid = resctrl_get_config_index(closid, t);
>
>
> As a victim, I must admit I cannot verify this feedback on my local
> Kunpeng environment since MB functionality is not yet supported by the
> driver. However, after careful consideration, I believe this is worth
> bringing up for discussion.
Thank you for thinking and finding problems beyond your platform.
>
> Regarding the MB configuration flow, the partid conversion should
> include the mpam_resctrl_hide_cdp() condition check. Here's the
> rationale:
>
> After resctrl parsing schemata update, MB configuration is set via
> parse_bw() or rdtgroup_init_mba(), which stores the updated
> configuration in dom->staged_config[CDP_NONE]. If the MB configuration
> update directly uses t = CDP_NONE, it would result in MB obtaining the
> wrong partid and cfg[partid].
>
> The specific fix would be like:
>
> - partid = resctrl_get_config_index(closid, t);
> + if (mpam_resctrl_hide_cdp(r->rid))
> + /* The configuration of CDP_DATA is same as the CDP_CODE one. */
> + partid = resctrl_get_config_index(closid, CDP_DATA);
> + else
> + partid = resctrl_get_config_index(closid, t);
The CDP emulation support is added later in the series, in patch 20 (Add
CDP emulation). However, I think you have spotted an actual problem. With
hidden CDP, the cfg is first looked up with
resctrl_get_config_index(closid, t) where t is CDP_NONE, but the setting
is then applied using CDP_CODE and CDP_DATA.
CDP in general is proving to be quite tricky.
>
> Similarly, in resctrl_arch_get_config() requires the same treatment to
> ensure consistency.
I don't see the problem here but maybe I'm missing something?
Isn't this handled by:
/*
* When CDP is enabled, but the resource doesn't support it,
* the control is cloned across both partids.
* Pick one at random to read:
*/
if (mpam_resctrl_hide_cdp(r->rid))
type = CDP_DATA;
I think we could do something similar in resctrl_arch_update_one().
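To make the index mismatch concrete, here is a minimal user-space sketch.
All sketch_* names are hypothetical stand-ins, and the closid-to-index
mapping is assumed to mirror resctrl_get_config_index() (CDP_DATA at
closid * 2, CDP_CODE at closid * 2 + 1). It only shows which partid is
picked for the lookup; it does not model applying the config to both
partids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of enum resctrl_conf_type. */
enum sketch_conf_type { SKETCH_CDP_NONE, SKETCH_CDP_CODE, SKETCH_CDP_DATA };

/* Assumed to match resctrl_get_config_index(): two indices per closid. */
static uint32_t sketch_config_index(uint32_t closid, enum sketch_conf_type t)
{
	assert(closid < (1u << 16));	/* PARTIDs are at most 16 bits wide */

	switch (t) {
	case SKETCH_CDP_CODE:
		return closid * 2 + 1;
	case SKETCH_CDP_DATA:
		return closid * 2;
	default:
		return closid;
	}
}

/*
 * When CDP is enabled but hidden for this resource, the control is cloned
 * across both partids, so read via CDP_DATA instead of the caller's type.
 */
static uint32_t sketch_pick_partid(uint32_t closid, enum sketch_conf_type t,
				   bool hide_cdp)
{
	if (hide_cdp)
		t = SKETCH_CDP_DATA;

	return sketch_config_index(closid, t);
}
```

Under these assumptions, closid 3 with CDP_NONE and no hiding yields index
3, which is also the CDP_CODE index of closid 1. With hiding, the same
call yields 6, one of the two partids that actually belong to closid 3.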
>
>
> Best regards,
> and happy Lunar New Year!
>
> Zeng Heng
>
>
>> + if (!r->alloc_capable || partid >= resctrl_arch_get_num_closid(r)) {
>> + pr_debug("Not alloc capable or computed PARTID out of range\n");
>> + return -EINVAL;
>> + }
>> +
>> + /*
>> + * Copy the current config to avoid clearing other resources when
>> the
>> + * same component is exposed multiple times through resctrl.
>> + */
>> + cfg = dom->ctrl_comp->cfg[partid];
>> +
>> + switch (r->rid) {
>> + case RDT_RESOURCE_L2:
>> + case RDT_RESOURCE_L3:
>> + cfg.cpbm = cfg_val;
>> + mpam_set_feature(mpam_feat_cpor_part, &cfg);
>> + break;
>> + default:
>> + return -EINVAL;
>> + }
>> +
>> + return mpam_apply_config(dom->ctrl_comp, partid, &cfg);
>> +}
>> +
>> +int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
>> +{
>> + int err;
>> + struct rdt_ctrl_domain *d;
>> +
>> + lockdep_assert_cpus_held();
>> + lockdep_assert_irqs_enabled();
>> +
>> + list_for_each_entry_rcu(d, &r->ctrl_domains, hdr.list) {
>> + for (enum resctrl_conf_type t = 0; t < CDP_NUM_TYPES; t++) {
>> + struct resctrl_staged_config *cfg = &d->staged_config[t];
>> +
>> + if (!cfg->have_new_ctrl)
>> + continue;
>> +
>> + err = resctrl_arch_update_one(r, d, closid, t,
>> + cfg->new_ctrl);
>> + if (err)
>> + return err;
>> + }
>> + }
>> +
>> + return 0;
>> +}
Thanks,
Ben
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource
2026-02-16 13:54 ` Ben Horgan
@ 2026-02-18 16:22 ` Ben Horgan
2026-02-18 17:17 ` Reinette Chatre
2026-02-25 8:08 ` Zeng Heng
0 siblings, 2 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-18 16:22 UTC (permalink / raw)
To: Zeng Heng, james.morse, Jonathan Cameron
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, kobak, lcherian, linux-arm-kernel,
linux-kernel, peternewman, punit.agrawal, quic_jiles,
reinette.chatre, rohit.mathew, scott, sdonthineni, tan.shaopeng,
xhao, catalin.marinas, will, corbet, maz, oupton, joey.gouly,
suzuki.poulose, kvmarm, linux-doc, Kefeng Wang
Hi Fenghua, Zeng,
On 2/16/26 13:54, Ben Horgan wrote:
> Hi Zeng,
>
> On 2/13/26 07:38, Zeng Heng wrote:
>> Hi Ben,
>>
>> On 2026/2/6 0:50, Jonathan Cameron wrote:
>>> On Tue, 3 Feb 2026 21:43:27 +0000
>>> Ben Horgan <ben.horgan@arm.com> wrote:
>>>
>>>> From: James Morse <james.morse@arm.com>
>>>>
>>>> resctrl supports 'MB', as a percentage throttling of traffic from the
>>>> L3. This is the control that mba_sc uses, so ideally the class chosen
>>>> should be as close as possible to the counters used for mbm_total. If
>>>> there is a single L3 and the topology of the memory matches then the
>>>> traffic at the memory controller will be equivalent to that at egress of
>>>> the L3. If these conditions are met allow the memory class to back MB.
>>>>
>>>> MB's percentage control should be backed either with the fixed point
>>>> fraction MBW_MAX or bandwidth portion bitmaps. The bandwidth portion
>>>> bitmaps are not used as it's tricky to pick which bits to use to avoid
>>>> contention, and may be possible to expose this as something other than a
>>>> percentage in the future.
>>>
>>> I'm very curious to know whether this heuristic is actually useful on
>>> many
>>> systems or whether many / most of them fail this 'shape' heuristic.
>>>
>>
>> The current MPAM driver has restrictions that limit MB control support.
>> For example, on some systems, the MPAM memory class MSCs are not located
>> at the L3 cache but rather at the memory controller (e.g., Mata). In
>> such cases, MB control and mbm_total bandwidth monitoring cannot be
>> enabled.
>>
>> I'm unsure whether the community would welcome and be willing to review
>> a patch series supporting such systems. Of course, the changes would
>> involve minor refactoring in the resctrl common layer.
>
> Having MSC at the memory controllers is quite common and it would be
> good for the mpam driver and resctrl to support this. My current
> priority is the initial MPAM support and look at this and other extra
> features later but I'd be happy to help progress support in this area
> through review and discussion. There is some discussion about adding new
> schema at [1] and we should make sure this is consistent with monitors
> too. James has some out-of-tree patches from before that discussion at [2].
>
> [1] https://lore.kernel.org/lkml/aPtfMFfLV1l%2FRB0L@e133380.arm.com/
> [2] git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git
> mpam/snapshot+extras/v6.18-rc1
Fenghua gave a talk at LPC on supporting cpu-less numa nodes in resctrl
so is likely to have some patches/ideas around measuring bandwidth at
memory controllers.
>
>>
>> Any feedback would be greatly appreciated.
>>
>>
>> Best Regards,
>> Zeng Heng
>
> Thanks,
>
> Ben
>
>
Thanks,
Ben
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource
2026-02-05 16:50 ` Jonathan Cameron
2026-02-13 7:38 ` Zeng Heng
@ 2026-02-18 16:40 ` Ben Horgan
1 sibling, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-18 16:40 UTC (permalink / raw)
To: Jonathan Cameron
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc
Hi Jonathan,
On 2/5/26 16:50, Jonathan Cameron wrote:
> On Tue, 3 Feb 2026 21:43:27 +0000
> Ben Horgan <ben.horgan@arm.com> wrote:
>
>> From: James Morse <james.morse@arm.com>
>>
>> resctrl supports 'MB', as a percentage throttling of traffic from the
>> L3. This is the control that mba_sc uses, so ideally the class chosen
>> should be as close as possible to the counters used for mbm_total. If
>> there is a single L3 and the topology of the memory matches then the
>> traffic at the memory controller will be equivalent to that at egress of
>> the L3. If these conditions are met allow the memory class to back MB.
>>
>> MB's percentage control should be backed either with the fixed point
>> fraction MBW_MAX or bandwidth portion bitmaps. The bandwidth portion
>> bitmaps are not used as it's tricky to pick which bits to use to avoid
>> contention, and may be possible to expose this as something other than a
>> percentage in the future.
>
> I'm very curious to know whether this heuristic is actually useful on many
> systems or whether many / most of them fail this 'shape' heuristic.
>
> Otherwise, just comments on the placement of __free related declarations.
> See the guidance in cleanup.h for that.
>
> With those moved,
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>
>
>>
>> CC: Zeng Heng <zengheng4@huawei.com>
>> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
>> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
>> Signed-off-by: James Morse <james.morse@arm.com>
>> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
>> ---
>> Changes since v2:
>> Code flow change
>> Commit message 'or'
>>
>> Changes since v3:
>> initialise tmp_cpumask
>> update commit message
>> check the traffic matches l3
>> update comment on candidate_class update, only mbm_total
>> drop tags due to rework
>> ---
>> drivers/resctrl/mpam_resctrl.c | 275 ++++++++++++++++++++++++++++++++-
>> 1 file changed, 274 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
>> index 25950e667077..4cca3694978d 100644
>> --- a/drivers/resctrl/mpam_resctrl.c
>> +++ b/drivers/resctrl/mpam_resctrl.c
>
>> +/*
>> + * topology_matches_l3() - Is the provided class the same shape as L3
>> + * @victim: The class we'd like to pretend is L3.
>> + *
>> + * resctrl expects all the world's a Xeon, and all counters are on the
>> + * L3. We allow some mapping counters on other classes. This requires
>> + * that the CPU->domain mapping is the same kind of shape.
>> + *
>> + * Using cacheinfo directly would make this work even if resctrl can't
>> + * use the L3 - but cacheinfo can't tell us anything about offline CPUs.
>> + * Using the L3 resctrl domain list also depends on CPUs being online.
>> + * Using the mpam_class we picked for L3 so we can use its domain list
>> + * assumes that there are MPAM controls on the L3.
>> + * Instead, this path eventually uses the mpam_get_cpumask_from_cache_id()
>> + * helper which can tell us about offline CPUs ... but getting the cache_id
>> + * to start with relies on at least one CPU per L3 cache being online at
>> + * boot.
>> + *
>> + * Walk the victim component list and compare the affinity mask with the
>> + * corresponding L3. The topology matches if each victim:component's affinity
>> + * mask is the same as the CPU's corresponding L3's. These lists/masks are
>> + * computed from firmware tables so don't change at runtime.
>> + */
>> +static bool topology_matches_l3(struct mpam_class *victim)
>> +{
>> + int cpu, err;
>> + struct mpam_component *victim_iter;
>> + cpumask_var_t __free(free_cpumask_var) tmp_cpumask = CPUMASK_VAR_NULL;
>
> Same as below. Move it down right next to the alloc.
I've moved both.
Thanks,
Ben
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource
2026-02-10 6:20 ` Shaopeng Tan (Fujitsu)
@ 2026-02-18 16:42 ` Ben Horgan
0 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-18 16:42 UTC (permalink / raw)
To: Shaopeng Tan (Fujitsu)
Cc: amitsinght@marvell.com, baisheng.gao@unisoc.com,
baolin.wang@linux.alibaba.com, carl@os.amperecomputing.com,
dave.martin@arm.com, david@kernel.org, dfustini@baylibre.com,
fenghuay@nvidia.com, gshan@redhat.com, james.morse@arm.com,
jonathan.cameron@huawei.com, kobak@nvidia.com,
lcherian@marvell.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, peternewman@google.com,
punit.agrawal@oss.qualcomm.com, quic_jiles@quicinc.com,
reinette.chatre@intel.com, rohit.mathew@arm.com,
scott@os.amperecomputing.com, sdonthineni@nvidia.com,
xhao@linux.alibaba.com, catalin.marinas@arm.com, will@kernel.org,
corbet@lwn.net, maz@kernel.org, oupton@kernel.org,
joey.gouly@arm.com, suzuki.poulose@arm.com,
kvmarm@lists.linux.dev, zengheng4@huawei.com,
linux-doc@vger.kernel.org
Hi Shaopeng,
On 2/10/26 06:20, Shaopeng Tan (Fujitsu) wrote:
> Hello Ben,
>
>> From: James Morse <james.morse@arm.com>
>>
>> resctrl supports 'MB', as a percentage throttling of traffic from the
>> L3. This is the control that mba_sc uses, so ideally the class chosen
>> should be as close as possible to the counters used for mbm_total. If
>> there is a single L3 and the topology of the memory matches then the
>> traffic at the memory controller will be equivalent to that at egress of
>> the L3. If these conditions are met allow the memory class to back MB.
>>
>> MB's percentage control should be backed either with the fixed point
>> fraction MBW_MAX or bandwidth portion bitmaps. The bandwidth portion
>> bitmaps are not used as it's tricky to pick which bits to use to avoid
>> contention, and may be possible to expose this as something other than a
>> percentage in the future.
>>
>> CC: Zeng Heng <zengheng4@huawei.com>
>> Co-developed-by: Dave Martin <Dave.Martin@arm.com>
>> Signed-off-by: Dave Martin <Dave.Martin@arm.com>
>> Signed-off-by: James Morse <james.morse@arm.com>
>> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
>> ---
>> Changes since v2:
>> Code flow change
>> Commit message 'or'
>>
>> Changes since v3:
>> initialise tmp_cpumask
>> update commit message
>> check the traffic matches l3
>> update comment on candidate_class update, only mbm_total
>> drop tags due to rework
>> ---
>> +/*
>> + * Test if the traffic for a class matches that at egress from the L3. For
>> + * MSC at memory controllers this is only possible if there is a single L3
>> + * as otherwise the counters at the memory can include bandwidth from the
>> + * non-local L3.
>> + */
>> +static bool traffic_matches_l3(struct mpam_class *class) {
>
> An error reported by checkpatch.pl is as follows.
>
> ERROR: open brace '{' following function definitions go on the next line
> #826: FILE: drivers/resctrl/mpam_resctrl.c:826:
> +static bool traffic_matches_l3(struct mpam_class *class) {
Not sure how I let that slip through. Fixed now.
>
>
> Best regards,
> Shaopeng TAN
Thanks,
Ben
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 41/41] arm64: mpam: Add initial MPAM documentation
2026-02-05 17:05 ` Jonathan Cameron
@ 2026-02-18 17:02 ` Ben Horgan
0 siblings, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-18 17:02 UTC (permalink / raw)
To: Jonathan Cameron
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, kobak, lcherian,
linux-arm-kernel, linux-kernel, peternewman, punit.agrawal,
quic_jiles, reinette.chatre, rohit.mathew, scott, sdonthineni,
tan.shaopeng, xhao, catalin.marinas, will, corbet, maz, oupton,
joey.gouly, suzuki.poulose, kvmarm, zengheng4, linux-doc
Hi Jonathan,
On 2/5/26 17:05, Jonathan Cameron wrote:
> On Tue, 3 Feb 2026 21:43:42 +0000
> Ben Horgan <ben.horgan@arm.com> wrote:
>
>> MPAM (Memory Partitioning and Monitoring) is now exposed to user-space via
>> resctrl. Add some documentation so the user knows what features to expect.
>>
>> Signed-off-by: James Morse <james.morse@arm.com>
>> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
>> ---
>> Changes by Ben:
>> Some tidying, update for current heuristics
>> ---
>> Documentation/arch/arm64/index.rst | 1 +
>> Documentation/arch/arm64/mpam.rst | 93 ++++++++++++++++++++++++++++++
>> 2 files changed, 94 insertions(+)
>> create mode 100644 Documentation/arch/arm64/mpam.rst
>>
>> diff --git a/Documentation/arch/arm64/index.rst b/Documentation/arch/arm64/index.rst
>> index 6a012c98bdcd..189fa760dade 100644
>> --- a/Documentation/arch/arm64/index.rst
>> +++ b/Documentation/arch/arm64/index.rst
>> @@ -23,6 +23,7 @@ ARM64 Architecture
>> memory
>> memory-tagging-extension
>> mops
>> + mpam
>> perf
>> pointer-authentication
>> ptdump
>> diff --git a/Documentation/arch/arm64/mpam.rst b/Documentation/arch/arm64/mpam.rst
>> new file mode 100644
>> index 000000000000..0769bccff25e
>> --- /dev/null
>> +++ b/Documentation/arch/arm64/mpam.rst
>> @@ -0,0 +1,93 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +====
>> +MPAM
>> +====
>> +
>> +What is MPAM
>> +============
>
>
>> + MPAM (Memory Partitioning and Monitoring) is a feature in the CPUs and memory
>
> I've not seen this style of indenting much in rst. I checked a few
> files in this directory and it's not used in the ones I randomly picked.
> + it's not what the kernel-documentation.rst file suggests is standard
> formatting.
I seem to have needlessly invented a new style. I've now removed lots of
white space. It should match the rest now.
>
> Other than that the content looks fine to me.
>
> Jonathan
Thanks,
Ben
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource
2026-02-18 16:22 ` Ben Horgan
@ 2026-02-18 17:17 ` Reinette Chatre
2026-02-25 8:08 ` Zeng Heng
1 sibling, 0 replies; 89+ messages in thread
From: Reinette Chatre @ 2026-02-18 17:17 UTC (permalink / raw)
To: Ben Horgan, Zeng Heng, james.morse, Jonathan Cameron
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, kobak, lcherian, linux-arm-kernel,
linux-kernel, peternewman, punit.agrawal, quic_jiles,
rohit.mathew, scott, sdonthineni, tan.shaopeng, xhao,
catalin.marinas, will, corbet, maz, oupton, joey.gouly,
suzuki.poulose, kvmarm, linux-doc, Kefeng Wang
On 2/18/26 8:22 AM, Ben Horgan wrote:
>
> Fenghua gave a talk at LPC on supporting cpu-less numa nodes in resctrl
> so is likely to have some patches/ideas around measuring bandwidth at
> memory controllers.
One idea on how to accommodate this from resctrl side:
https://lore.kernel.org/lkml/fb1e2686-237b-4536-acd6-15159abafcba@intel.com/
Reinette
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 38/41] arm_mpam: Add workaround for T241-MPAM-4
2026-02-14 1:29 ` Zeng Heng
@ 2026-02-20 2:30 ` Shaopeng Tan (Fujitsu)
0 siblings, 0 replies; 89+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2026-02-20 2:30 UTC (permalink / raw)
To: Zeng Heng, Ben Horgan
Cc: amitsinght@marvell.com, baisheng.gao@unisoc.com,
baolin.wang@linux.alibaba.com, carl@os.amperecomputing.com,
dave.martin@arm.com, david@kernel.org, dfustini@baylibre.com,
fenghuay@nvidia.com, gshan@redhat.com, james.morse@arm.com,
jonathan.cameron@huawei.com, kobak@nvidia.com,
lcherian@marvell.com, linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org, peternewman@google.com,
punit.agrawal@oss.qualcomm.com, quic_jiles@quicinc.com,
reinette.chatre@intel.com, rohit.mathew@arm.com,
scott@os.amperecomputing.com, sdonthineni@nvidia.com,
xhao@linux.alibaba.com, catalin.marinas@arm.com, will@kernel.org,
corbet@lwn.net, maz@kernel.org, oupton@kernel.org,
joey.gouly@arm.com, suzuki.poulose@arm.com,
kvmarm@lists.linux.dev, linux-doc@vger.kernel.org, Kefeng Wang
Hi Zengheng
> Hi Shaopeng,
>
> On 2026/2/13 15:02, Shaopeng Tan (Fujitsu) wrote:
> > Hello Ben, Fenghua
> >
> >> From: Shanker Donthineni <sdonthineni@nvidia.com>
> >>
> >> In the T241 implementation of memory-bandwidth partitioning, in the absence
> >> of contention for bandwidth, the minimum bandwidth setting can affect the
> >> amount of achieved bandwidth. Specifically, the achieved bandwidth in the
> >> absence of contention can settle to any value between the values of
> >> MPAMCFG_MBW_MIN and MPAMCFG_MBW_MAX. Also, if MPAMCFG_MBW_MIN is set
> >> to zero (below 0.78125%), once a core enters a throttled state, it will
> >> never leave that state.
> >>
> >> The first issue is not a concern if the MPAM software allows programming
> >> MPAMCFG_MBW_MIN through the sysfs interface. This patch ensures that
> >> MBW_MIN=1 (0.78125%) is programmed whenever MPAMCFG_MBW_MIN=0 is requested.
> >>
> >> In the scenario where the resctrl doesn't support the MBW_MIN interface via
> >> sysfs, to achieve bandwidth closer to MBW_MAX in the absence of contention,
> >> software should configure a relatively narrow gap between MBW_MIN and
> >> MBW_MAX. The recommendation is to use a 5% gap to mitigate the problem.
> >
> > I have a question regarding the MBW_MIN values.
> > Are there any cases where the sum of all MBW_MIN values from different control groups exceeds 100%?
> > And if so, is it acceptable for it to exceed 100%?
> >
> > Best regards,
> > Shaopeng TAN
> >
>
>
> Per the ARM MPAM architecture specification: "A PARTID that has used
> less than MIN is given preferential access to bandwidth."
>
> MBW_MIN is not a guaranteed bandwidth allocation. Instead, it serves as
> a priority threshold: when a partid's memory bandwidth usage falls below
> the configured MBW_MIN value, its priority for memory bandwidth access
> is elevated.
>
> Therefore, it is acceptable for the sum of MBW_MIN values across
> different control groups to exceed 100%. There is no requirement
> for these values to add up to 100% or less.
>
> Hope this clarifies the behavior.
I see, thank you.
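For reference, a minimal sketch of the clamp described in the quoted
commit message, under stated assumptions: the field is taken to be a
7-bit fixed-point fraction because 1/128 is exactly the 0.78125% quoted
above, and the sketch_* names and SKETCH_MBW_MIN_BITS are hypothetical,
not the driver's. Nothing here checks the sum across control groups,
which fits the explanation that MBW_MIN is a priority threshold rather
than a reservation.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 7 fractional bits, so one step is 1/128 = 0.78125%. */
#define SKETCH_MBW_MIN_BITS	7u

/* Never program zero: a throttled core would otherwise stay throttled. */
static uint32_t sketch_t241_fixup_mbw_min(uint32_t mbw_min)
{
	assert(mbw_min < (1u << SKETCH_MBW_MIN_BITS));

	return mbw_min ? mbw_min : 1;
}

/* Convert the fraction to units of 1/100000 of a percent, exactly. */
static uint32_t sketch_mbw_min_to_e5_percent(uint32_t mbw_min)
{
	assert(mbw_min < (1u << SKETCH_MBW_MIN_BITS));

	return mbw_min * 10000000u / (1u << SKETCH_MBW_MIN_BITS);
}
```

A requested value of 0 becomes 1, which converts to 78125, that is
0.78125%; non-zero requests pass through unchanged.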
Best regards,
Shaopeng TAN
^ permalink raw reply [flat|nested] 89+ messages in thread
* Re: [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code
2026-02-16 12:22 ` Ben Horgan
@ 2026-02-24 11:03 ` Zeng Heng
2026-02-24 14:19 ` Ben Horgan
0 siblings, 1 reply; 89+ messages in thread
From: Zeng Heng @ 2026-02-24 11:03 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, linux-doc,
Kefeng Wang
Hi Ben,
On 2026/2/16 20:22, Ben Horgan wrote:
> Hi Zeng,
>
> On 2/14/26 09:40, Zeng Heng wrote:
>> Hi Ben,
>>
>> On 2026/2/4 5:43, Ben Horgan wrote:
> [...]
>>>
>>> Based on:
>>> [1] git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/cache
>>> (To include telemetry code which changes the resctrl arch interface)
>>>
>>> The series can be retrieved from:
>>> https://gitlab.arm.com/linux-arm/linux-bh.git mpam_resctrl_glue_v4
>>> (Final commit is a fix already in 6.19-rc8)
>>>
> [...]
>>>
>>
>> I've tested the MPAM functionality on my local Kunpeng platform. Here's
>> a summary of the results:
>
> Thank you very much for your testing and detailed report.
>
>>
>> Features enabled and verified:
>> * L2 and L3 CPBM
>> * L3 CSU
>> * L2 and L3 CDP
>> All enabled features passed functional testing as expected.
>>
>> + Tested-by: Zeng Heng <zengheng4@huawei.com>
>>
>> Features not enabled:
>> 1. MATA MBMAX partition and MBWU monitor.
>
> What's MATA here? Just memory allocation or something more specific?
>
The MATA is the agent for the Broadway MESI coherence protocol: it
maintains data coherence among multiple L3 caches and between I/O and the
L3 caches.
As the connection module between the SoC and the memory system, the MATA
interfaces with the DMC on the DDR die. It provides the system with DDR
access paths, delivering high-bandwidth, low-latency DDR read/write
access.
On the Kunpeng chip, the MB related MSC resides in this module rather
than being directly located on the L3 cache controller.
>> Reason: These do not meet the driver's current topology
>> expectations for MB support, hence they were not initialized.
>> This behavior is expected.
>
> Is this because you have more than 1 L3 cache?
Yes, the Kunpeng platform has more than one L3 cache.
However, MB is not supported here because the current driver does not
recognize the MATA, and on Kunpeng both the MBM and MBA functionality
reside in the MATA MSC, as mentioned above.
Relevant logs are as follows:
[ 10.997406] mpam:topology_matches_l3: class 255 component 0 has
Mismatched CPU mask with L3 equivalent
[ 10.997411] mpam:mpam_resctrl_pick_mba: class 255 topology doesn't
match L3
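To illustrate the check that fails in these logs: as I understand it, each
component's CPU mask has to be identical to the CPU mask of an L3 cache. A
rough sketch, assuming at most 64 CPUs so that a mask fits in a uint64_t; the
struct and function names are hypothetical and this is not the driver's
topology_matches_l3() implementation:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an MPAM component: just its CPU affinity. */
struct sketch_component {
	uint64_t cpu_mask;
};

/*
 * Return true only if every component's CPU mask is identical to the
 * CPU mask of some L3 cache.
 */
static inline bool sketch_topology_matches_l3(const struct sketch_component *comps,
					      size_t n_comps,
					      const uint64_t *l3_masks,
					      size_t n_l3)
{
	for (size_t i = 0; i < n_comps; i++) {
		bool found = false;

		for (size_t j = 0; j < n_l3; j++) {
			if (comps[i].cpu_mask == l3_masks[j]) {
				found = true;
				break;
			}
		}
		if (!found)
			return false;	/* no L3 with the same CPU mask */
	}
	return true;
}
```

A component whose mask spans the CPUs of two L3 caches (for example 0xff
against L3 masks 0x0f and 0xf0) fails this check, which would produce the
kind of mismatch reported for class 255.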
>
>>
>> 2. L2 CSU and MBWU monitors.
>> Reason: The current MPAM driver does not support L2-related
>> functionality yet.
>
> Thanks for letting us know you have these. But, yes, unfortunately
> monitoring is only supported on the L3 at the moment.
>
>>
>> + Tested-by: Zeng Heng <zengheng4@huawei.com>
>>
>>
>> Detailed test logs are as follows:
>
> I'm confused by these logs as it looks like you aren't getting any
> monitors but you verified the L3 CSU. Also, it looks like cpor (cpbm) is
> disabled (at least partially) but you verified L2 and L3 CPBM. Is this
> across different machines?
>
The logs are, of course, all from the same machine.
Since the L3 CSU/CPBM resides on the L3 cache, it can be correctly
recognized and run smoothly.
In fact, not only was L2 successfully mounted, but all L2 MSC CPBMs were
also correctly recognized. My suspicion is that the L2
class->components were incorrectly mounted to an unknown object, and I
believe this is related to the monitoring capabilities (CSU and MBWU)
of the Kunpeng L2. The root cause is still being investigated.
Resctrl mounting status:
# cat schemata
L2:4=ff;7=ff;10=ff;13=ff;16=ff;19=ff;22=ff;25=ff;29=ff;32=ff;35=ff;
38=ff;41=ff;44=ff;47=ff;50=ff;54=ff;57=ff;60=ff;63=ff;66=ff;69=ff;72=ff;
75=ff;79=ff;82=ff;85=ff;88=ff;91=ff;94=ff;97=ff;100=ff;104=ff;107=ff;
110=ff;113=ff;116=ff;119=ff;122=ff;125=ff;129=ff;132=ff;135=ff;138=ff;
141=ff;144=ff;147=ff;150=ff;154=ff;157=ff;160=ff;163=ff;166=ff;169=ff;
172=ff;175=ff;179=ff;182=ff;185=ff;188=ff;191=ff;194=ff;197=ff;200=ff;
204=ff;207=ff;210=ff;213=ff;216=ff;219=ff;222=ff;225=ff;229=ff;232=ff;
235=ff;238=ff;241=ff;244=ff;247=ff;250=ff;254=ff;257=ff;260=ff;263=ff;
266=ff;269=ff;272=ff;275=ff;279=ff;282=ff;285=ff;288=ff;291=ff;294=ff;
297=ff;300=ff
L3:1=1ffff;26=1ffff;51=1ffff;76=1ffff;101=1ffff;126=1ffff;151=1ffff;
176=1ffff;201=1ffff;226=1ffff;251=1ffff;276=1ffff
# ls mon_data/*/*
mon_data/mon_L3_01/llc_occupancy mon_data/mon_L3_151/llc_occupancy
mon_data/mon_L3_226/llc_occupancy mon_data/mon_L3_276/llc_occupancy
mon_data/mon_L3_101/llc_occupancy mon_data/mon_L3_176/llc_occupancy
mon_data/mon_L3_251/llc_occupancy mon_data/mon_L3_51/llc_occupancy
mon_data/mon_L3_126/llc_occupancy mon_data/mon_L3_201/llc_occupancy
mon_data/mon_L3_26/llc_occupancy mon_data/mon_L3_76/llc_occupancy
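As a quick sanity check on the schemata above, the entries can be counted and
compared against the full bitmap (ff for L2, 1ffff for L3). A minimal sketch,
assuming each resource has been read as a single unwrapped line; the helper
name is hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: count the "id=mask" entries on one schemata line
 * such as "L3:1=1ffff;26=1ffff". Sets *all_match to 0 if any mask differs
 * from `expect`. Returns the entry count, or -1 on a malformed line.
 */
static inline int sketch_count_domains(const char *line, unsigned long expect,
				       int *all_match)
{
	const char *p = strchr(line, ':');
	int n = 0;

	*all_match = 1;
	if (!p)
		return -1;
	p++;

	while (*p && *p != '\n') {
		char *end;
		unsigned long mask;

		(void)strtoul(p, &end, 10);	/* domain id is decimal */
		if (end == p || *end != '=')
			return -1;
		p = end + 1;

		mask = strtoul(p, &end, 16);	/* bitmap is hexadecimal */
		if (end == p)
			return -1;
		if (mask != expect)
			*all_match = 0;
		n++;

		if (*end == ';')
			p = end + 1;
		else if (*end == '\0' || *end == '\n')
			break;
		else
			return -1;
	}
	return n;
}
```

On the output above this should report 96 entries for L2 and 12 for L3, all
carrying the full mask, i.e. eight L2 domains per L3 domain.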
>>
>> Boot logs:
>> [root@localhost ~]# dmesg | grep -i mpam
>> [ 0.000000] ACPI: MPAM 0x000000007FF35018 003024 (v01 HISI HIP12
>> 00000000 HISI 20151124)
>> [ 9.509852] mpam_msc mpam_msc.64: Merging features for
>> vmsc:0xffff0800973cf5a0 |= ris:0xffff08009757ee90
>> [ 9.509859] mpam_msc mpam_msc.254: Merging features for
>> class:0xffff08009736fe50 &= vmsc:0xffff080097628520
>> [ 9.509860] mpam:__props_mismatch:
>> mpam_has_feature(mpam_feat_cpor_part, parent) = 1
>> [ 9.509864] mpam:__props_mismatch:
>> mpam_has_feature(mpam_feat_cpor_part, child) = 0
>> [ 9.509866] mpam:__props_mismatch: parent->cpbm_wd = 8
>> [ 9.509869] mpam:__props_mismatch: child->cpbm_wd = 0
>> [ 9.509871] mpam:__props_mismatch: alias = 0
>> [ 9.509873] mpam:__props_mismatch: cleared cpor_part
>
> cpor (partially) disabled?
>
>> [ 9.509876] mpam:__props_mismatch: took the min num_csu_mon
>> [ 9.509878] mpam:__props_mismatch: took the min num_mbwu_mon
>> [ 9.509881] mpam_msc mpam_msc.252: Merging features for
[...]
>> [ 10.978496] mpam:mpam_resctrl_pick_caches: class 2 cache misses CPOR
>
> No L2 cpor?
>
>> [ 10.978497] mpam:mpam_resctrl_pick_caches: class 255 is not a cache
>> [ 10.980470] mpam:mpam_resctrl_pick_mba: class 2 is before L3
>> [ 10.980472] mpam:mpam_resctrl_pick_mba: class 3 has no bandwidth control
>> [ 10.997406] mpam:topology_matches_l3: class 255 component 0 has
>> Mismatched CPU mask with L3 equivalent
>> [ 10.997411] mpam:mpam_resctrl_pick_mba: class 255 topology doesn't
>> match L3
>> [ 10.997415] mpam:mpam_resctrl_pick_counters: class 2 is before L3
>> [ 11.024109] mpam:topology_matches_l3: class 3 component 276 has
>> Mismatched CPU mask with L3 equivalent
>> [ 11.024114] mpam:class_has_usable_mbwu: monitors usable in free-
>> running mode
>
> mbwu enabled?
The fact that the number of monitors merely satisfies the conditions for
free-running mode does not imply that the MBWU functionality can be
successfully mounted. The specific reasons are explained above.
>
>> [ 11.063882] mpam:topology_matches_l3: class 255 component 0 has
>> Mismatched CPU mask with L3 equivalent
>> [ 11.113183] mpam:mpam_resctrl_alloc_domain: Skipped monitor domain
>> online - no monitors
>> [ 11.113189] MPAM enabled with 32 PARTIDs and 4 PMGs
>>
>>
Sorry for the late reply; this is my first day back from a long
vacation.
Best regards,
Zeng Heng
* Re: [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code
2026-02-24 11:03 ` Zeng Heng
@ 2026-02-24 14:19 ` Ben Horgan
2026-02-24 15:27 ` Ben Horgan
2026-02-24 17:53 ` Ben Horgan
0 siblings, 2 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-24 14:19 UTC (permalink / raw)
To: Zeng Heng
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, linux-doc,
Kefeng Wang
Hi Zeng,
On 2/24/26 11:03, Zeng Heng wrote:
> Hi Ben,
>
> On 2026/2/16 20:22, Ben Horgan wrote:
>> Hi Zeng,
>>
>> On 2/14/26 09:40, Zeng Heng wrote:
>>> Hi Ben,
>>>
>>> On 2026/2/4 5:43, Ben Horgan wrote:
>> [...]
>>>>
>>>> Based on:
>>>> [1] git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/cache
>>>> (To include telemetry code which changes the resctrl arch interface)
>>>>
>>>> The series can be retrieved from:
>>>> https://gitlab.arm.com/linux-arm/linux-bh.git mpam_resctrl_glue_v4
>>>> (Final commit is a fix already in 6.19-rc8)
>>>>
>> [...]
>>>>
>>>
>>> I've tested the MPAM functionality on my local Kunpeng platform. Here's
>>> a summary of the results:
>>
>> Thank you very much for your testing and detailed report.
>>
>>>
>>> Features enabled and verified:
>>> * L2 and L3 CPBM
>>> * L3 CSU
>>> * L2 and L3 CDP
>>> All enabled features passed functional testing as expected.
>>>
>>> + Tested-by: Zeng Heng <zengheng4@huawei.com>
>>>
>>> Features not enabled:
>>> 1. MATA MBMAX partition and MBWU monitor.
>>
>> What's MATA here? Just memory allocation or something more specific?
>>
>
> The MATA serves as the agent for the Broadway MESI coherence protocol
> among multiple L3 caches or between I/O and L3 caches. It maintains data
> coherence among multiple L3 caches or between I/O and L3 caches.
>
> As the connection module between the SoC and the memory system, the MATA
> interfaces with the DMC on the DDR die. It provides the system with DDR
> access paths, delivering high-bandwidth, low-latency DDR read/write
> access.
>
> On the Kunpeng chip, the MB related MSC resides in this module rather
> than being directly located on the L3 cache controller.
>
>>> Reason: These do not meet the driver's current topology
>>> expectations for MB support, hence they were not initialized.
>>> This behavior is expected.
>>
>> Is this because you have more than 1 L3 cache?
>
> Yes, Kunpeng platform has more than 1 L3 cache.
>
> However, the reason it is not supported here is that the current driver
> does not support MATA recognition, while both Kunpeng MBM and MBA
> functionalities reside in the MATA MSC side as mentioned above,
> resulting in the inability to provide support.
>
> Relevant logs are as follows:
>
> [ 10.997406] mpam:topology_matches_l3: class 255 component 0 has
> Mismatched CPU mask with L3 equivalent
> [ 10.997411] mpam:mpam_resctrl_pick_mba: class 255 topology doesn't
> match L3
>
>>
>>>
>>> 2. L2 CSU and MBWU monitors.
>>> Reason: The current MPAM driver does not support L2-related
>>> functionality yet.
>>
>> Thanks for letting us know you have these. But, yes, unfortunately
>> monitoring is only supported on the L3 at the moment.
>>
>>>
>>> + Tested-by: Zeng Heng <zengheng4@huawei.com>
>>>
>>>
>>> Detailed test logs are as follows:
>>
>> I'm confused by these logs as it looks like you aren't getting any
>> monitors but you verified the L3 CSU. Also, it looks like cpor (cpbm) is
>> disabled (at least partially) but you verified L2 and L3 CPBM. Is this
>> across different machines?
>>
>
> The logs were of course tested on the same machine.
OK, and the same run? Just asking because I can't yet see how this all
fits together.
>
> Since the L3 CSU/CPBM resides on the L3 cache, it can be correctly
> recognized and run smoothly.
>
> In fact, not only was L2 successfully mounted, but all L2 MSC CPBMs were
> also correctly recognized. The suspicion here is that the L2
> class->components were incorrectly mounted to an unknown object, which
> is believed to be related to the monitoring capabilities (CSU and MBWU)
> of Kunpeng L2. The root cause is still being investigated.
I'll try and mock up a system with L2 monitoring and cpbm to see if the
driver behaves in the same way.
However, I still can't understand how you get CPOR on L2 after seeing
"mpam:mpam_resctrl_pick_caches: class 2 cache misses CPOR", as that
should be the only place the class is set for the L2 cache, and the class
guards the call to mpam_control_init() in mpam_resctrl_setup().
mpam_monitor_init() comes later, so I can't see how it could have an effect.
Is the driver corrupting something or writing to mpam_resctrl_controls
with the wrong index? Clutching at straws here.
>
> Resctrl mounting status:
>
> # cat schemata
> L2:4=ff;7=ff;10=ff;13=ff;16=ff;19=ff;22=ff;25=ff;29=ff;32=ff;35=ff;
> 38=ff;41=ff;44=ff;47=ff;50=ff;54=ff;57=ff;60=ff;63=ff;66=ff;69=ff;72=ff;
> 75=ff;79=ff;82=ff;85=ff;88=ff;91=ff;94=ff;97=ff;100=ff;104=ff;107=ff;
> 110=ff;113=ff;116=ff;119=ff;122=ff;125=ff;129=ff;132=ff;135=ff;138=ff;
> 141=ff;144=ff;147=ff;150=ff;154=ff;157=ff;160=ff;163=ff;166=ff;169=ff;
> 172=ff;175=ff;179=ff;182=ff;185=ff;188=ff;191=ff;194=ff;197=ff;200=ff;
> 204=ff;207=ff;210=ff;213=ff;216=ff;219=ff;222=ff;225=ff;229=ff;232=ff;
> 235=ff;238=ff;241=ff;244=ff;247=ff;250=ff;254=ff;257=ff;260=ff;263=ff;
> 266=ff;269=ff;272=ff;275=ff;279=ff;282=ff;285=ff;288=ff;291=ff;294=ff;
> 297=ff;300=ff
> L3:1=1ffff;26=1ffff;51=1ffff;76=1ffff;101=1ffff;126=1ffff;151=1ffff;
> 176=1ffff;201=1ffff;226=1ffff;251=1ffff;276=1ffff
>
> # ls mon_data/*/*
> mon_data/mon_L3_01/llc_occupancy mon_data/mon_L3_151/llc_occupancy
> mon_data/mon_L3_226/llc_occupancy mon_data/mon_L3_276/llc_occupancy
> mon_data/mon_L3_101/llc_occupancy mon_data/mon_L3_176/llc_occupancy
> mon_data/mon_L3_251/llc_occupancy mon_data/mon_L3_51/llc_occupancy
> mon_data/mon_L3_126/llc_occupancy mon_data/mon_L3_201/llc_occupancy
> mon_data/mon_L3_26/llc_occupancy mon_data/mon_L3_76/llc_occupancy
Thanks for the extra details. These are as expected, right?
>
>>>
>>> Boot logs:
>>> [root@localhost ~]# dmesg | grep -i mpam
>>> [ 0.000000] ACPI: MPAM 0x000000007FF35018 003024 (v01 HISI HIP12
>>> 00000000 HISI 20151124)
>>> [ 9.509852] mpam_msc mpam_msc.64: Merging features for
>>> vmsc:0xffff0800973cf5a0 |= ris:0xffff08009757ee90
>>> [ 9.509859] mpam_msc mpam_msc.254: Merging features for
>>> class:0xffff08009736fe50 &= vmsc:0xffff080097628520
>>> [ 9.509860] mpam:__props_mismatch:
>>> mpam_has_feature(mpam_feat_cpor_part, parent) = 1
>>> [ 9.509864] mpam:__props_mismatch:
>>> mpam_has_feature(mpam_feat_cpor_part, child) = 0
>>> [ 9.509866] mpam:__props_mismatch: parent->cpbm_wd = 8
>>> [ 9.509869] mpam:__props_mismatch: child->cpbm_wd = 0
>>> [ 9.509871] mpam:__props_mismatch: alias = 0
>>> [ 9.509873] mpam:__props_mismatch: cleared cpor_part
Do you know where this mismatch is happening? Is it expected?
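For what it's worth, the sequence above reads to me like the following
simplified merge, where one child without CPOR clears it for the whole class.
This is only a sketch of the alias = 0 case with hypothetical names and bit
values, not the driver's __props_mismatch():

```c
#include <stdint.h>

#define SKETCH_FEAT_CPOR_PART	(1u << 0)	/* hypothetical bit assignment */

/* Hypothetical, cut-down property set for a class or vMSC. */
struct sketch_props {
	uint32_t features;
	uint16_t cpbm_wd;
	uint16_t num_csu_mon;
	uint16_t num_mbwu_mon;
};

/* Fold one child's properties into the parent (alias = 0 case only). */
static inline void sketch_merge_child(struct sketch_props *parent,
				      const struct sketch_props *child)
{
	int parent_has = !!(parent->features & SKETCH_FEAT_CPOR_PART);
	int child_has = !!(child->features & SKETCH_FEAT_CPOR_PART);

	/* CPOR survives only if both sides have it with the same width. */
	if (parent_has && (!child_has || parent->cpbm_wd != child->cpbm_wd)) {
		parent->features &= ~SKETCH_FEAT_CPOR_PART;
		parent->cpbm_wd = 0;
	}

	/* Monitor counts take the minimum of the two. */
	if (child->num_csu_mon < parent->num_csu_mon)
		parent->num_csu_mon = child->num_csu_mon;
	if (child->num_mbwu_mon < parent->num_mbwu_mon)
		parent->num_mbwu_mon = child->num_mbwu_mon;
}
```

With a parent of width 8 and a child of width 0, as in the log, the sketch
clears CPOR on the parent, so knowing which vMSC is the child here would
narrow things down.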
>>
>> cpor (partially) disabled?
>>
>>> [ 9.509876] mpam:__props_mismatch: took the min num_csu_mon
>>> [ 9.509878] mpam:__props_mismatch: took the min num_mbwu_mon
>>> [ 9.509881] mpam_msc mpam_msc.252: Merging features for
>
> [...]
>
>>> [ 10.978496] mpam:mpam_resctrl_pick_caches: class 2 cache misses CPOR
>>
>> No L2 cpor?
This particularly confuses me...
>>
>>> [ 10.978497] mpam:mpam_resctrl_pick_caches: class 255 is not a cache
>>> [ 10.980470] mpam:mpam_resctrl_pick_mba: class 2 is before L3
>>> [ 10.980472] mpam:mpam_resctrl_pick_mba: class 3 has no bandwidth
>>> control
>>> [ 10.997406] mpam:topology_matches_l3: class 255 component 0 has
>>> Mismatched CPU mask with L3 equivalent
>>> [ 10.997411] mpam:mpam_resctrl_pick_mba: class 255 topology doesn't
>>> match L3
>>> [ 10.997415] mpam:mpam_resctrl_pick_counters: class 2 is before L3
>>> [ 11.024109] mpam:topology_matches_l3: class 3 component 276 has
>>> Mismatched CPU mask with L3 equivalent
>>> [ 11.024114] mpam:class_has_usable_mbwu: monitors usable in free-
>>> running mode
>>
>> mbwu enabled?
>
> The fact that the number of monitors merely satisfies the conditions for
> free-running mode does not imply that the MBWU functionality can be
> successfully mounted. The specific reasons are explained above.
True
>
>>
>>> [ 11.063882] mpam:topology_matches_l3: class 255 component 0 has
>>> Mismatched CPU mask with L3 equivalent
>>> [ 11.113183] mpam:mpam_resctrl_alloc_domain: Skipped monitor domain
>>> online - no monitors
>>> [ 11.113189] MPAM enabled with 32 PARTIDs and 4 PMGs
>>>
>>>
>
> Sorry for the late reply. And this is my first day back from a long
> vacation.
No problem. Hope you had a good holiday.
>
>
> Best regards,
> Zeng Heng
Thanks,
Ben
* Re: [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code
2026-02-24 14:19 ` Ben Horgan
@ 2026-02-24 15:27 ` Ben Horgan
2026-02-24 17:53 ` Ben Horgan
1 sibling, 0 replies; 89+ messages in thread
From: Ben Horgan @ 2026-02-24 15:27 UTC (permalink / raw)
To: Zeng Heng
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, linux-doc,
Kefeng Wang
Hi Zeng,
On 2/24/26 14:19, Ben Horgan wrote:
> Hi Zeng,
>
> On 2/24/26 11:03, Zeng Heng wrote:
>> Hi Ben,
>>
>> On 2026/2/16 20:22, Ben Horgan wrote:
>>> Hi Zeng,
>>>
>>> On 2/14/26 09:40, Zeng Heng wrote:
>>>> Hi Ben,
>>>>
>>>> On 2026/2/4 5:43, Ben Horgan wrote:
>>> [...]
>>>>>
>>>>> Based on:
>>>>> [1] git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/cache
>>>>> (To include telemetry code which changes the resctrl arch interface)
>>>>>
>>>>> The series can be retrieved from:
>>>>> https://gitlab.arm.com/linux-arm/linux-bh.git mpam_resctrl_glue_v4
>>>>> (Final commit is a fix already in 6.19-rc8)
[...]
>>>> [ 10.978497] mpam:mpam_resctrl_pick_caches: class 255 is not a cache
>>>> [ 10.980470] mpam:mpam_resctrl_pick_mba: class 2 is before L3
This debug message was removed in v4 of the series.
v3 has it: https://lore.kernel.org/linux-arm-kernel/20260112165914.4086692-28-ben.horgan@arm.com/
v4 doesn't: https://lore.kernel.org/linux-arm-kernel/20260203214342.584712-27-ben.horgan@arm.com/
Do you know which version of the series you were running?
>>>> [ 10.980472] mpam:mpam_resctrl_pick_mba: class 3 has no bandwidth
>>>> control
>>>> [ 10.997406] mpam:topology_matches_l3: class 255 component 0 has
>>>> Mismatched CPU mask with L3 equivalent
>>>> [ 10.997411] mpam:mpam_resctrl_pick_mba: class 255 topology doesn't
>>>> match L3
>>>> [ 10.997415] mpam:mpam_resctrl_pick_counters: class 2 is before L3
>>>> [ 11.024109] mpam:topology_matches_l3: class 3 component 276 has
>>>> Mismatched CPU mask with L3 equivalent
>>>> [ 11.024114] mpam:class_has_usable_mbwu: monitors usable in free-
>>>> running mode
>>>
>>> mbwu enabled?
>>
>> The fact that the number of monitors merely satisfies the conditions for
>> free-running mode does not imply that the MBWU functionality can be
>> successfully mounted. The specific reasons are explained above.
>
> True
>
>>
>>>
>>>> [ 11.063882] mpam:topology_matches_l3: class 255 component 0 has
>>>> Mismatched CPU mask with L3 equivalent
>>>> [ 11.113183] mpam:mpam_resctrl_alloc_domain: Skipped monitor domain
>>>> online - no monitors
>>>> [ 11.113189] MPAM enabled with 32 PARTIDs and 4 PMGs
>>>>
>>>>
>>
>> Sorry for the late reply. And this is my first day back from a long
>> vacation.
>
> No problem. Hope you had a good holiday.
>
>>
>>
>> Best regards,
>> Zeng Heng
>
> Thanks,
>
> Ben
>
>
Thanks,
Ben
* Re: [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code
2026-02-24 14:19 ` Ben Horgan
2026-02-24 15:27 ` Ben Horgan
@ 2026-02-24 17:53 ` Ben Horgan
2026-02-26 7:17 ` Zeng Heng
1 sibling, 1 reply; 89+ messages in thread
From: Ben Horgan @ 2026-02-24 17:53 UTC (permalink / raw)
To: Zeng Heng
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, linux-doc,
Kefeng Wang
Hi Zeng,
On 2/24/26 14:19, Ben Horgan wrote:
> Hi Zeng,
>
> On 2/24/26 11:03, Zeng Heng wrote:
>> Hi Ben,
>>
>> On 2026/2/16 20:22, Ben Horgan wrote:
>>> Hi Zeng,
>>>
>>> On 2/14/26 09:40, Zeng Heng wrote:
>>>> Hi Ben,
>>>>
>>>> On 2026/2/4 5:43, Ben Horgan wrote:
>>> [...]
>>>>>
>>>>> Based on:
>>>>> [1] git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/cache
>>>>> (To include telemetry code which changes the resctrl arch interface)
>>>>>
>>>>> The series can be retrieved from:
>>>>> https://gitlab.arm.com/linux-arm/linux-bh.git mpam_resctrl_glue_v4
>>>>> (Final commit is a fix already in 6.19-rc8)
>>>>>
>>> [...]
>>>>>
>>>>
>>>> I've tested the MPAM functionality on my local Kunpeng platform. Here's
>>>> a summary of the results:
>>>
>>> Thank you very much for your testing and detailed report.
>>>
>>>>
>>>> Features enabled and verified:
>>>> * L2 and L3 CPBM
>>>> * L3 CSU
>>>> * L2 and L3 CDP
>>>> All enabled features passed functional testing as expected.
>>>>
>>>> + Tested-by: Zeng Heng <zengheng4@huawei.com>
>>>>
>>>> Features not enabled:
>>>> 1. MATA MBMAX partition and MBWU monitor.
>>>
>>> What's MATA here? Just memory allocation or something more specific?
>>>
>>
>> The MATA serves as the agent for the Broadway MESI coherence protocol
>> among multiple L3 caches or between I/O and L3 caches. It maintains data
>> coherence among multiple L3 caches or between I/O and L3 caches.
>>
>> As the connection module between the SoC and the memory system, the MATA
>> interfaces with the DMC on the DDR die. It provides the system with DDR
>> access paths, delivering high-bandwidth, low-latency DDR read/write
>> access.
>>
>> On the Kunpeng chip, the MB related MSC resides in this module rather
>> than being directly located on the L3 cache controller.
>>
>>>> Reason: These do not meet the driver's current topology
>>>> expectations for MB support, hence they were not initialized.
>>>> This behavior is expected.
>>>
>>> Is this because you have more than 1 L3 cache?
>>
>> Yes, Kunpeng platform has more than 1 L3 cache.
>>
>> However, the reason it is not supported here is that the current driver
>> does not support MATA recognition, while both Kunpeng MBM and MBA
>> functionalities reside in the MATA MSC side as mentioned above,
>> resulting in the inability to provide support.
>>
>> Relevant logs are as follows:
>>
>> [ 10.997406] mpam:topology_matches_l3: class 255 component 0 has
>> Mismatched CPU mask with L3 equivalent
>> [ 10.997411] mpam:mpam_resctrl_pick_mba: class 255 topology doesn't
>> match L3
>>
>>>
>>>>
>>>> 2. L2 CSU and MBWU monitors.
>>>> Reason: The current MPAM driver does not support L2-related
>>>> functionality yet.
>>>
>>> Thanks for letting us know you have these. But, yes, unfortunately
>>> monitoring is only supported on the L3 at the moment.
>>>
>>>>
>>>> + Tested-by: Zeng Heng <zengheng4@huawei.com>
>>>>
>>>>
>>>> Detailed test logs are as follows:
>>>
>>> I'm confused by these logs as it looks like you aren't getting any
>>> monitors but you verified the L3 CSU. Also, it looks like cpor (cpbm) is
>>> disabled (at least partially) but you verified L2 and L3 CPBM. Is this
>>> across different machines?
>>>
>>
>> The logs were of course tested on the same machine.
>
> Ok and the same run? Just asking because I can't yet see how this all
> goes together.
>
>>
>> Since the L3 CSU/CPBM resides on the L3 cache, it can be correctly
>> recognized and run smoothly.
>>
>> In fact, not only was L2 successfully mounted, but all L2 MSC CPBMs were
>> also correctly recognized. The suspicion here is that the L2
>> class->components were incorrectly mounted to an unknown object, which
>> is believed to be related to the monitoring capabilities (CSU and MBWU)
>> of Kunpeng L2. The root cause is still being investigated.
>
> I'll try and mock up a system with L2 monitoring and cpbm to see if the
> driver behaves in the same way.
>
> However, I still can't understand how you get CPOR on L2 after seeing
> "mpam:mpam_resctrl_pick_caches: class 2 cache misses CPOR" as that
> should be the only place class is set for the l2 cache which guards the
> call to mpam_control_init() in mpam_resctrl_setup(). The
> mpam_monitor_init() comes later so I can't see how that effects.
>
> Is the driver corrupting something or writing to mpam_resctrl_controls
> with the wrong index? Clutching at straws here.
>
>>
>> Resctrl mounting status:
>>
>> # cat schemata
>> L2:4=ff;7=ff;10=ff;13=ff;16=ff;19=ff;22=ff;25=ff;29=ff;32=ff;35=ff;
>> 38=ff;41=ff;44=ff;47=ff;50=ff;54=ff;57=ff;60=ff;63=ff;66=ff;69=ff;72=ff;
>> 75=ff;79=ff;82=ff;85=ff;88=ff;91=ff;94=ff;97=ff;100=ff;104=ff;107=ff;
>> 110=ff;113=ff;116=ff;119=ff;122=ff;125=ff;129=ff;132=ff;135=ff;138=ff;
>> 141=ff;144=ff;147=ff;150=ff;154=ff;157=ff;160=ff;163=ff;166=ff;169=ff;
>> 172=ff;175=ff;179=ff;182=ff;185=ff;188=ff;191=ff;194=ff;197=ff;200=ff;
>> 204=ff;207=ff;210=ff;213=ff;216=ff;219=ff;222=ff;225=ff;229=ff;232=ff;
>> 235=ff;238=ff;241=ff;244=ff;247=ff;250=ff;254=ff;257=ff;260=ff;263=ff;
>> 266=ff;269=ff;272=ff;275=ff;279=ff;282=ff;285=ff;288=ff;291=ff;294=ff;
>> 297=ff;300=ff
>> L3:1=1ffff;26=1ffff;51=1ffff;76=1ffff;101=1ffff;126=1ffff;151=1ffff;
>> 176=1ffff;201=1ffff;226=1ffff;251=1ffff;276=1ffff
>>
>> # ls mon_data/*/*
>> mon_data/mon_L3_01/llc_occupancy mon_data/mon_L3_151/llc_occupancy
>> mon_data/mon_L3_226/llc_occupancy mon_data/mon_L3_276/llc_occupancy
>> mon_data/mon_L3_101/llc_occupancy mon_data/mon_L3_176/llc_occupancy
>> mon_data/mon_L3_251/llc_occupancy mon_data/mon_L3_51/llc_occupancy
>> mon_data/mon_L3_126/llc_occupancy mon_data/mon_L3_201/llc_occupancy
>> mon_data/mon_L3_26/llc_occupancy mon_data/mon_L3_76/llc_occupancy
>
> Thanks for the extra details. These are as expected, right?
>
>>
>>>>
>>>> Boot logs:
>>>> [root@localhost ~]# dmesg | grep -i mpam
>>>> [ 0.000000] ACPI: MPAM 0x000000007FF35018 003024 (v01 HISI HIP12
>>>> 00000000 HISI 20151124)
>>>> [ 9.509852] mpam_msc mpam_msc.64: Merging features for
>>>> vmsc:0xffff0800973cf5a0 |= ris:0xffff08009757ee90
>>>> [ 9.509859] mpam_msc mpam_msc.254: Merging features for
>>>> class:0xffff08009736fe50 &= vmsc:0xffff080097628520
>>>> [ 9.509860] mpam:__props_mismatch:
>>>> mpam_has_feature(mpam_feat_cpor_part, parent) = 1
>>>> [ 9.509864] mpam:__props_mismatch:
>>>> mpam_has_feature(mpam_feat_cpor_part, child) = 0
>>>> [ 9.509866] mpam:__props_mismatch: parent->cpbm_wd = 8
>>>> [ 9.509869] mpam:__props_mismatch: child->cpbm_wd = 0
>>>> [ 9.509871] mpam:__props_mismatch: alias = 0
>>>> [ 9.509873] mpam:__props_mismatch: cleared cpor_part
>
> Do you know where this mismatch is happening? Is it expected?
I have added some of James' debug patches at:
https://gitlab.arm.com/linux-arm/linux-bh/-/tree/mpam_resctrl_glue_v5_debugfs?ref_type=heads
These are on top of a v5 of the series. They apply to older series
as well.
The debugfs should then contain details of all the MSCs, which
should help with debugging.
It would be great if you could share that information, either on the list or
privately if needed.
You can snapshot the mpam debugfs using:
find /sys/kernel/debug/mpam -type f -exec sh -c 'echo "$1"; cat "$1"' sh {} \; > mpam_debugfs.txt
>
>>>
>>> cpor (partially) disabled?
>>>
>>>> [ 9.509876] mpam:__props_mismatch: took the min num_csu_mon
>>>> [ 9.509878] mpam:__props_mismatch: took the min num_mbwu_mon
>>>> [ 9.509881] mpam_msc mpam_msc.252: Merging features for
>>
>> [...]
>>
>>>> [ 10.978496] mpam:mpam_resctrl_pick_caches: class 2 cache misses CPOR
>>>
>>> No L2 cpor?
>
> This particularly confuses me...
>
>>>
>>>> [ 10.978497] mpam:mpam_resctrl_pick_caches: class 255 is not a cache
>>>> [ 10.980470] mpam:mpam_resctrl_pick_mba: class 2 is before L3
>>>> [ 10.980472] mpam:mpam_resctrl_pick_mba: class 3 has no bandwidth
>>>> control
>>>> [ 10.997406] mpam:topology_matches_l3: class 255 component 0 has
>>>> Mismatched CPU mask with L3 equivalent
>>>> [ 10.997411] mpam:mpam_resctrl_pick_mba: class 255 topology doesn't
>>>> match L3
>>>> [ 10.997415] mpam:mpam_resctrl_pick_counters: class 2 is before L3
>>>> [ 11.024109] mpam:topology_matches_l3: class 3 component 276 has
>>>> Mismatched CPU mask with L3 equivalent
>>>> [ 11.024114] mpam:class_has_usable_mbwu: monitors usable in free-
>>>> running mode
>>>
>>> mbwu enabled?
>>
>> The fact that the number of monitors merely satisfies the conditions for
>> free-running mode does not imply that the MBWU functionality can be
>> successfully mounted. The specific reasons are explained above.
>
> True
>
>>
>>>
>>>> [ 11.063882] mpam:topology_matches_l3: class 255 component 0 has
>>>> Mismatched CPU mask with L3 equivalent
>>>> [ 11.113183] mpam:mpam_resctrl_alloc_domain: Skipped monitor domain
>>>> online - no monitors
>>>> [ 11.113189] MPAM enabled with 32 PARTIDs and 4 PMGs
>>>>
>>>>
>>
>> Sorry for the late reply. And this is my first day back from a long
>> vacation.
>
> No problem. Hope you had a good holiday.
>
>>
>>
>> Best regards,
>> Zeng Heng
>
> Thanks,
>
> Ben
>
>
Thanks,
Ben
* Re: [PATCH v4 18/41] arm_mpam: resctrl: Implement helpers to update configuration
2026-02-16 14:23 ` Ben Horgan
@ 2026-02-25 6:39 ` Zeng Heng
0 siblings, 0 replies; 89+ messages in thread
From: Zeng Heng @ 2026-02-25 6:39 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, linux-doc,
Shaopeng Tan
Hi Ben,
On 2026/2/16 22:23, Ben Horgan wrote:
> Hi Zeng,
>
> On 2/14/26 10:39, Zeng Heng wrote:
>> Hi Ben,
>>
>> On 2026/2/4 5:43, Ben Horgan wrote:
>>> From: James Morse <james.morse@arm.com>
>>>
>>> resctrl has two helpers for updating the configuration.
>>> resctrl_arch_update_one() updates a single value, and is used by the
>>> software-controller to apply feedback to the bandwidth controls, it
>>> has to
>>> be called on one of the CPUs in the resctrl:domain.
>>>
>>> resctrl_arch_update_domains() copies multiple staged configurations,
>>> it can
>>> be called from anywhere.
>>>
>>> Both helpers should update any changes to the underlying hardware.
>>>
>>> Implement resctrl_arch_update_domains() to use
>>> resctrl_arch_update_one(). Neither need to be called on a specific CPU as
>>> the mpam driver will send IPIs as needed.
>>>
>>> Tested-by: Gavin Shan <gshan@redhat.com>
>>> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
>>> Tested-by: Peter Newman <peternewman@google.com>
>>> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>>> Signed-off-by: James Morse <james.morse@arm.com>
>>> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
>>> ---
>>> Changes since rfc:
>>> list_for_each_entry -> list_for_each_entry_rcu
>>> return 0
>>> Restrict scope of local variables
>>>
>>> Changes since v2:
>>> whitespace fix
>>> ---
>>> drivers/resctrl/mpam_resctrl.c | 70 ++++++++++++++++++++++++++++++++++
>>> 1 file changed, 70 insertions(+)
>>>
>>> diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/
>>> mpam_resctrl.c
>>> index ecf00386edca..48d047510089 100644
>>> --- a/drivers/resctrl/mpam_resctrl.c
>>> +++ b/drivers/resctrl/mpam_resctrl.c
>>> @@ -212,6 +212,76 @@ u32 resctrl_arch_get_config(struct rdt_resource
>>> *r, struct rdt_ctrl_domain *d,
>>> }
>>> }
>>> +int resctrl_arch_update_one(struct rdt_resource *r, struct
>>> rdt_ctrl_domain *d,
>>> + u32 closid, enum resctrl_conf_type t, u32 cfg_val)
>>> +{
>>> + u32 partid;
>>> + struct mpam_config cfg;
>>> + struct mpam_props *cprops;
>>> + struct mpam_resctrl_res *res;
>>> + struct mpam_resctrl_dom *dom;
>>> +
>>> + lockdep_assert_cpus_held();
>>> + lockdep_assert_irqs_enabled();
>>> +
>>> + /*
>>> + * No need to check the CPU as mpam_apply_config() doesn't care, and
>>> + * resctrl_arch_update_domains() relies on this.
>>> + */
>>> + res = container_of(r, struct mpam_resctrl_res, resctrl_res);
>>> + dom = container_of(d, struct mpam_resctrl_dom, resctrl_ctrl_dom);
>>> + cprops = &res->class->props;
>>> +
>>> + partid = resctrl_get_config_index(closid, t);
>>
>>
>> As a victim, I must admit I cannot verify this feedback on my local
>> Kunpeng environment since MB functionality is not yet supported by the
>> driver. However, after careful consideration, I believe this is worth
>> bringing up for discussion.
>
> Thank you for thinking and finding problems beyond your platform.
>
>>
>> Regarding the MB configuration flow, the partid conversion should
>> include the mpam_resctrl_hide_cdp() condition check. Here's the
>> rationale:
>>
>> After resctrl parses a schemata update, the MB configuration is set via
>> parse_bw() or rdtgroup_init_mba(), which stores the updated
>> configuration in dom->staged_config[CDP_NONE]. If the MB configuration
>> update directly uses t = CDP_NONE, it would result in MB obtaining the
>> wrong partid and cfg[partid].
>>
>> The specific fix would be like:
>>
>> - partid = resctrl_get_config_index(closid, t);
>> + if (mpam_resctrl_hide_cdp(r->rid))
>> + /* The configuration of CDP_DATA is same as the CDP_CODE one. */
>> + partid = resctrl_get_config_index(closid, CDP_DATA);
>> + else
>> + partid = resctrl_get_config_index(closid, t);
>
> The CDP emulation support is added later in the series in patch 20, Add
> CDP emulation. However, I think you have spotted an actual problem. With
> hidden CDP the cfg is first found with
>
> resctrl_get_config_index(closid, t) when CDP_NONE
> but then the setting does use CDP_CODE and CDP_DATA.
>
> CDP in general is proving to be quite tricky.
>
>>
>> Similarly, resctrl_arch_get_config() requires the same treatment to
>> ensure consistency.
>
> I don't see the problem here but maybe I'm missing something?
>
> Isn't this handled by:
>
> /*
> * When CDP is enabled, but the resource doesn't support it,
> * the control is cloned across both partids.
> * Pick one at random to read:
> */
> if (mpam_resctrl_hide_cdp(r->rid))
> type = CDP_DATA;
>
> I think we could do similar in resctrl_arch_update_one()
>
>>
>>
Yes, I noticed that handling has already been added to
resctrl_arch_get_config(), and I have updated the comments on v5
accordingly.
Thanks,
Zeng Heng
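
Editorial sketch of the index mismatch discussed above, as a standalone
userspace model rather than the driver code. The switch in
resctrl_get_config_index() follows the helper of that name in
include/linux/resctrl.h as best I recall it, so treat the exact index
arithmetic as an assumption. pick_partid() and its hide_cdp flag are
hypothetical stand-ins for the mpam_resctrl_hide_cdp(r->rid) check proposed
for resctrl_arch_update_one().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum resctrl_conf_type { CDP_NONE, CDP_CODE, CDP_DATA };

/*
 * Userspace model of the resctrl helper (assumed layout): with CDP each
 * closid owns two config slots, data at 2n and code at 2n + 1.
 */
static uint32_t resctrl_get_config_index(uint32_t closid,
					 enum resctrl_conf_type t)
{
	switch (t) {
	case CDP_CODE:
		return closid * 2 + 1;
	case CDP_DATA:
		return closid * 2;
	case CDP_NONE:
	default:
		return closid;
	}
}

/*
 * Hypothetical helper: hide_cdp stands in for mpam_resctrl_hide_cdp(r->rid).
 * When CDP is enabled but hidden for this resource, the control is cloned
 * across both partids, so look it up through the CDP_DATA slot.
 */
static uint32_t pick_partid(uint32_t closid, enum resctrl_conf_type t,
			    bool hide_cdp)
{
	if (hide_cdp)
		t = CDP_DATA;
	return resctrl_get_config_index(closid, t);
}

/* Self-check: the staged MB config arrives with t == CDP_NONE. */
static void cdp_index_selfcheck(void)
{
	/* Without the check, closid 3 maps to index 3 ... */
	assert(resctrl_get_config_index(3, CDP_NONE) == 3);
	/* ... but with hidden CDP the cloned control lives at 6 (and 7). */
	assert(pick_partid(3, CDP_NONE, true) == 6);
	assert(pick_partid(3, CDP_NONE, false) == 3);
}
```

With hidden CDP, looking up closid 3 as CDP_NONE lands on index 3, which in
this model is the code slot of closid 1. That is the "wrong partid and
cfg[partid]" case Zeng describes, and it is why the lookup in
resctrl_arch_update_one() would want the same CDP_DATA override that
resctrl_arch_get_config() already has.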
* Re: [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource
2026-02-18 16:22 ` Ben Horgan
2026-02-18 17:17 ` Reinette Chatre
@ 2026-02-25 8:08 ` Zeng Heng
1 sibling, 0 replies; 89+ messages in thread
From: Zeng Heng @ 2026-02-25 8:08 UTC (permalink / raw)
To: Ben Horgan, james.morse, Reinette Chatre
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, kobak, lcherian, linux-arm-kernel,
linux-kernel, peternewman, punit.agrawal, quic_jiles,
reinette.chatre, rohit.mathew, scott, sdonthineni, tan.shaopeng,
xhao, catalin.marinas, will, corbet, maz, oupton, joey.gouly,
suzuki.poulose, kvmarm, linux-doc, Kefeng Wang, Jonathan Cameron
On 2026/2/19 0:22, Ben Horgan wrote:
> Hi Fenghua, Zeng,
>
> On 2/16/26 13:54, Ben Horgan wrote:
>> Hi Zeng,
>>
>> On 2/13/26 07:38, Zeng Heng wrote:
>>> Hi Ben,
>>>
>>> On 2026/2/6 0:50, Jonathan Cameron wrote:
>>>> On Tue, 3 Feb 2026 21:43:27 +0000
>>>> Ben Horgan <ben.horgan@arm.com> wrote:
>>>>
>>>>> From: James Morse <james.morse@arm.com>
>>>>>
>>>>> resctrl supports 'MB', as a percentage throttling of traffic from the
>>>>> L3. This is the control that mba_sc uses, so ideally the class chosen
>>>>> should be as close as possible to the counters used for mbm_total. If
>>>>> there is a single L3 and the topology of the memory matches then the
>>>>> traffic at the memory controller will be equivalent to that at egress of
>>>>> the L3. If these conditions are met allow the memory class to back MB.
>>>>>
>>>>> MB's percentage control should be backed either with the fixed point
>>>>> fraction MBW_MAX or bandwidth portion bitmaps. The bandwidth portion
>>>>> bitmaps are not used as it's tricky to pick which bits to use to avoid
>>>>> contention, and it may be possible to expose this as something other than a
>>>>> percentage in the future.
>>>>
>>>> I'm very curious to know whether this heuristic is actually useful on
>>>> many
>>>> systems or whether many / most of them fail this 'shape' heuristic.
>>>>
>>>
>>> The current MPAM driver has restrictions that limit MB control support.
>>> For example, on some systems, the MPAM memory class MSCs are not located
>>> at the L3 cache but rather at the memory controller (e.g., Mata). In
>>> such cases, MB control and mbm_total bandwidth monitoring cannot be
>>> enabled.
>>>
>>> I'm unsure whether the community would welcome and be willing to review
>>> a patch series supporting such systems. Of course, the changes would
>>> involve minor refactoring in the resctrl common layer.
>>
>> Having MSC at the memory controllers is quite common and it would be
>> good for the mpam driver and resctrl to support this. My current
>> priority is the initial MPAM support and look at this and other extra
>> features later but I'd be happy to help progress support in this area
>> through review and discussion. There is some discussion about adding new
>> schema at [1] and we should make sure this is consistent with monitors
>> too. James has some out of tree patches from before that discussion at [2].
>>
>> [1] https://lore.kernel.org/lkml/aPtfMFfLV1l%2FRB0L@e133380.arm.com/
>> [2] git://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git
>> mpam/snapshot+extras/v6.18-rc1
>
Thank you for the information.
> Fenghua gave a talk at LPC on supporting cpu-less numa nodes in resctrl
> so is likely to have some patches/ideas around measuring bandwidth at
> memory controllers.
>
I also listened to the LPC presentation on this subject. The Kunpeng
memory controller is per NUMA node, but those nodes are not CPU-less.
The interface that the Kunpeng memory controller MSC solution aims to
provide is somewhat similar to the approach suggested by Reinette:
"
One idea on how to accommodate this from resctrl side:
https://lore.kernel.org/lkml/fb1e2686-237b-4536-acd6-15159abafcba@intel.com/
mon_data
├── mon_L3_00 <== monitoring data at scope L3
│ ├── llc_occupancy
│ ├── mbm_local_bytes
│ └── mbm_total_bytes
├── mon_L3_01 <== monitoring data at scope L3
│ ├── llc_occupancy
│ ├── mbm_local_bytes
│ └── mbm_total_bytes
├── mon_NODE_00 <== monitoring data at scope NODE
│ └── mbm_total_bytes
└── mon_NODE_01 <== monitoring data at scope NODE
└── mbm_total_bytes
"
Best regards,
Zeng Heng
* Re: [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code
2026-02-24 17:53 ` Ben Horgan
@ 2026-02-26 7:17 ` Zeng Heng
0 siblings, 0 replies; 89+ messages in thread
From: Zeng Heng @ 2026-02-26 7:17 UTC (permalink / raw)
To: Ben Horgan
Cc: amitsinght, baisheng.gao, baolin.wang, carl, dave.martin, david,
dfustini, fenghuay, gshan, james.morse, jonathan.cameron, kobak,
lcherian, linux-arm-kernel, linux-kernel, peternewman,
punit.agrawal, quic_jiles, reinette.chatre, rohit.mathew, scott,
sdonthineni, tan.shaopeng, xhao, catalin.marinas, will, corbet,
maz, oupton, joey.gouly, suzuki.poulose, kvmarm, linux-doc,
Kefeng Wang
On 2026/2/25 1:53, Ben Horgan wrote:
> Hi Zeng,
>
>>>
>>> Resctrl mounting status:
>>>
>>> # cat schemata
>>> L2:4=ff;7=ff;10=ff;13=ff;16=ff;19=ff;22=ff;25=ff;29=ff;32=ff;35=ff;
>>> 38=ff;41=ff;44=ff;47=ff;50=ff;54=ff;57=ff;60=ff;63=ff;66=ff;69=ff;72=ff;
>>> 75=ff;79=ff;82=ff;85=ff;88=ff;91=ff;94=ff;97=ff;100=ff;104=ff;107=ff;
>>> 110=ff;113=ff;116=ff;119=ff;122=ff;125=ff;129=ff;132=ff;135=ff;138=ff;
>>> 141=ff;144=ff;147=ff;150=ff;154=ff;157=ff;160=ff;163=ff;166=ff;169=ff;
>>> 172=ff;175=ff;179=ff;182=ff;185=ff;188=ff;191=ff;194=ff;197=ff;200=ff;
>>> 204=ff;207=ff;210=ff;213=ff;216=ff;219=ff;222=ff;225=ff;229=ff;232=ff;
>>> 235=ff;238=ff;241=ff;244=ff;247=ff;250=ff;254=ff;257=ff;260=ff;263=ff;
>>> 266=ff;269=ff;272=ff;275=ff;279=ff;282=ff;285=ff;288=ff;291=ff;294=ff;
>>> 297=ff;300=ff
>>> L3:1=1ffff;26=1ffff;51=1ffff;76=1ffff;101=1ffff;126=1ffff;151=1ffff;
>>> 176=1ffff;201=1ffff;226=1ffff;251=1ffff;276=1ffff
>>>
>>> # ls mon_data/*/*
>>> mon_data/mon_L3_01/llc_occupancy mon_data/mon_L3_151/llc_occupancy
>>> mon_data/mon_L3_226/llc_occupancy mon_data/mon_L3_276/llc_occupancy
>>> mon_data/mon_L3_101/llc_occupancy mon_data/mon_L3_176/llc_occupancy
>>> mon_data/mon_L3_251/llc_occupancy mon_data/mon_L3_51/llc_occupancy
>>> mon_data/mon_L3_126/llc_occupancy mon_data/mon_L3_201/llc_occupancy
>>> mon_data/mon_L3_26/llc_occupancy mon_data/mon_L3_76/llc_occupancy
>>
>> Thanks for the extra details. These are as expected, right?
Yes, both the resctrl mount functionality and the control domain count
are as expected.
>>
>>>
>>>>>
>>>>> Boot logs:
>>>>> [root@localhost ~]# dmesg | grep -i mpam
>>>>> [ 0.000000] ACPI: MPAM 0x000000007FF35018 003024 (v01 HISI HIP12
>>>>> 00000000 HISI 20151124)
>>>>> [ 9.509852] mpam_msc mpam_msc.64: Merging features for
>>>>> vmsc:0xffff0800973cf5a0 |= ris:0xffff08009757ee90
>>>>> [ 9.509859] mpam_msc mpam_msc.254: Merging features for
>>>>> class:0xffff08009736fe50 &= vmsc:0xffff080097628520
>>>>> [ 9.509860] mpam:__props_mismatch:
>>>>> mpam_has_feature(mpam_feat_cpor_part, parent) = 1
>>>>> [ 9.509864] mpam:__props_mismatch:
>>>>> mpam_has_feature(mpam_feat_cpor_part, child) = 0
>>>>> [ 9.509866] mpam:__props_mismatch: parent->cpbm_wd = 8
>>>>> [ 9.509869] mpam:__props_mismatch: child->cpbm_wd = 0
>>>>> [ 9.509871] mpam:__props_mismatch: alias = 0
>>>>> [ 9.509873] mpam:__props_mismatch: cleared cpor_part
>>
>> Do you know where this mismatch is happening? Is it expected?
I've rerun tests against the latest v5 and no longer see the
"__props_mismatch" messages. I'll post the newest boot log on the v5
thread soon.
My suspicion is that the MSC initialization refactoring in the MPAM
driver between v3 and v5 inadvertently fixed the L2 initialization error
printing.
>
> I have added some of James' debug patches at:
> https://gitlab.arm.com/linux-arm/linux-bh/-/tree/mpam_resctrl_glue_v5_debugfs?ref_type=heads
>
> These are on top of a v5 of the series. They apply to older series
> as well.
>
> The debugfs should then contain details of all the MSC which
> should help debugging.
>
> It would be great if you were able to share that information, either on list or
> privately if needed.
>
> You can snapshot the mpam debugfs using:
>
> find /sys/kernel/debug/mpam -type f -exec sh -c 'echo {}; cat {}' \; > mpam_debugfs.txt
I've enabled MPAM debugfs locally on v5 and dumped the logs. The file is
over 100k, which would trigger corporate email security scans if sent
externally.
I've reviewed it briefly and the contents look as expected.
Best regards,
Zeng Heng
Thread overview: 89+ messages
2026-02-03 21:43 [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Ben Horgan
2026-02-03 21:43 ` [PATCH v4 01/41] arm64/sysreg: Add MPAMSM_EL1 register Ben Horgan
2026-02-03 21:43 ` [PATCH v4 02/41] KVM: arm64: Preserve host MPAM configuration when changing traps Ben Horgan
2026-02-03 21:43 ` [PATCH v4 03/41] KVM: arm64: Make MPAMSM_EL1 accesses UNDEF Ben Horgan
2026-02-03 21:43 ` [PATCH v4 04/41] arm64: mpam: Context switch the MPAM registers Ben Horgan
2026-02-05 16:16 ` Catalin Marinas
2026-02-03 21:43 ` [PATCH v4 05/41] arm64: mpam: Re-initialise MPAM regs when CPU comes online Ben Horgan
2026-02-05 16:20 ` Catalin Marinas
2026-02-03 21:43 ` [PATCH v4 06/41] arm64: mpam: Drop the CONFIG_EXPERT restriction Ben Horgan
2026-02-05 14:08 ` Jonathan Cameron
2026-02-05 16:21 ` Catalin Marinas
2026-02-03 21:43 ` [PATCH v4 07/41] arm64: mpam: Advertise the CPUs MPAM limits to the driver Ben Horgan
2026-02-03 21:43 ` [PATCH v4 08/41] arm64: mpam: Add cpu_pm notifier to restore MPAM sysregs Ben Horgan
2026-02-05 16:54 ` Catalin Marinas
2026-02-03 21:43 ` [PATCH v4 09/41] arm64: mpam: Initialise and context switch the MPAMSM_EL1 register Ben Horgan
2026-02-05 16:55 ` Catalin Marinas
2026-02-03 21:43 ` [PATCH v4 10/41] arm64: mpam: Add helpers to change a task or cpu's MPAM PARTID/PMG values Ben Horgan
2026-02-05 16:56 ` Catalin Marinas
2026-02-03 21:43 ` [PATCH v4 11/41] KVM: arm64: Force guest EL1 to use user-space's partid configuration Ben Horgan
2026-02-03 21:43 ` [PATCH v4 12/41] KVM: arm64: Use kernel-space partid configuration for hypercalls Ben Horgan
2026-02-03 21:43 ` [PATCH v4 13/41] arm_mpam: resctrl: Add boilerplate cpuhp and domain allocation Ben Horgan
2026-02-10 22:57 ` Reinette Chatre
2026-02-11 15:36 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 14/41] arm_mpam: resctrl: Sort the order of the domain lists Ben Horgan
2026-02-03 21:43 ` [PATCH v4 15/41] arm_mpam: resctrl: Pick the caches we will use as resctrl resources Ben Horgan
2026-02-10 23:39 ` Reinette Chatre
2026-02-11 11:05 ` Ben Horgan
2026-02-12 16:22 ` Reinette Chatre
2026-02-03 21:43 ` [PATCH v4 16/41] arm_mpam: resctrl: Implement resctrl_arch_reset_all_ctrls() Ben Horgan
2026-02-13 3:32 ` Zeng Heng
2026-02-03 21:43 ` [PATCH v4 17/41] arm_mpam: resctrl: Add resctrl_arch_get_config() Ben Horgan
2026-02-03 21:43 ` [PATCH v4 18/41] arm_mpam: resctrl: Implement helpers to update configuration Ben Horgan
2026-02-14 10:39 ` Zeng Heng
2026-02-16 14:23 ` Ben Horgan
2026-02-25 6:39 ` Zeng Heng
2026-02-03 21:43 ` [PATCH v4 19/41] arm_mpam: resctrl: Add plumbing against arm64 task and cpu hooks Ben Horgan
2026-02-03 21:43 ` [PATCH v4 20/41] arm_mpam: resctrl: Add CDP emulation Ben Horgan
2026-02-09 1:16 ` Fenghua Yu
2026-02-09 15:36 ` Ben Horgan
2026-02-11 10:50 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 21/41] arm_mpam: resctrl: Convert to/from MPAMs fixed-point formats Ben Horgan
2026-02-03 21:43 ` [PATCH v4 22/41] arm_mpam: resctrl: Add kunit test for control format conversions Ben Horgan
2026-02-03 21:43 ` [PATCH v4 23/41] arm_mpam: resctrl: Add rmid index helpers Ben Horgan
2026-02-03 21:43 ` [PATCH v4 24/41] arm_mpam: resctrl: Add kunit test for rmid idx conversions Ben Horgan
2026-02-03 21:43 ` [PATCH v4 25/41] arm_mpam: resctrl: Wait for cacheinfo to be ready Ben Horgan
2026-02-03 21:43 ` [PATCH v4 26/41] arm_mpam: resctrl: Add support for 'MB' resource Ben Horgan
2026-02-05 16:50 ` Jonathan Cameron
2026-02-13 7:38 ` Zeng Heng
2026-02-16 13:54 ` Ben Horgan
2026-02-18 16:22 ` Ben Horgan
2026-02-18 17:17 ` Reinette Chatre
2026-02-25 8:08 ` Zeng Heng
2026-02-18 16:40 ` Ben Horgan
2026-02-10 6:20 ` Shaopeng Tan (Fujitsu)
2026-02-18 16:42 ` Ben Horgan
2026-02-03 21:43 ` [PATCH v4 27/41] arm_mpam: resctrl: Add support for csu counters Ben Horgan
2026-02-05 16:55 ` Jonathan Cameron
2026-02-03 21:43 ` [PATCH v4 28/41] arm_mpam: resctrl: Pick classes for use as mbm counters Ben Horgan
2026-02-05 16:58 ` Jonathan Cameron
2026-02-03 21:43 ` [PATCH v4 29/41] arm_mpam: resctrl: Pre-allocate free running monitors Ben Horgan
2026-02-03 21:43 ` [PATCH v4 30/41] arm_mpam: resctrl: Allow resctrl to allocate monitors Ben Horgan
2026-02-03 21:43 ` [PATCH v4 31/41] arm_mpam: resctrl: Add resctrl_arch_rmid_read() and resctrl_arch_reset_rmid() Ben Horgan
2026-02-03 21:43 ` [PATCH v4 32/41] arm_mpam: resctrl: Update the rmid reallocation limit Ben Horgan
2026-02-03 21:43 ` [PATCH v4 33/41] arm_mpam: resctrl: Add empty definitions for assorted resctrl functions Ben Horgan
2026-02-03 21:43 ` [PATCH v4 34/41] arm64: mpam: Select ARCH_HAS_CPU_RESCTRL Ben Horgan
2026-02-03 21:43 ` [PATCH v4 35/41] arm_mpam: resctrl: Call resctrl_init() on platforms that can support resctrl Ben Horgan
2026-02-03 21:43 ` [PATCH v4 36/41] arm_mpam: Add quirk framework Ben Horgan
2026-02-03 21:43 ` [PATCH v4 37/41] arm_mpam: Add workaround for T241-MPAM-1 Ben Horgan
2026-02-03 21:43 ` [PATCH v4 38/41] arm_mpam: Add workaround for T241-MPAM-4 Ben Horgan
2026-02-13 7:02 ` Shaopeng Tan (Fujitsu)
2026-02-14 1:29 ` Zeng Heng
2026-02-20 2:30 ` Shaopeng Tan (Fujitsu)
2026-02-03 21:43 ` [PATCH v4 39/41] arm_mpam: Add workaround for T241-MPAM-6 Ben Horgan
2026-02-03 21:43 ` [PATCH v4 40/41] arm_mpam: Quirk CMN-650's CSU NRDY behaviour Ben Horgan
2026-02-03 21:43 ` [PATCH v4 41/41] arm64: mpam: Add initial MPAM documentation Ben Horgan
2026-02-05 16:57 ` Catalin Marinas
2026-02-05 17:05 ` Jonathan Cameron
2026-02-18 17:02 ` Ben Horgan
2026-02-09 8:25 ` [PATCH v4 00/41] arm_mpam: Add KVM/arm64 and resctrl glue code Shaopeng Tan (Fujitsu)
2026-02-09 10:04 ` Ben Horgan
2026-02-12 14:51 ` Ben Horgan
2026-02-13 7:18 ` Shaopeng Tan (Fujitsu)
2026-02-14 9:40 ` Zeng Heng
2026-02-16 12:22 ` Ben Horgan
2026-02-24 11:03 ` Zeng Heng
2026-02-24 14:19 ` Ben Horgan
2026-02-24 15:27 ` Ben Horgan
2026-02-24 17:53 ` Ben Horgan
2026-02-26 7:17 ` Zeng Heng