* Re: [GIT PULL] firmware: arm_ffa: Fix for v7.1
From: Sudeep Holla @ 2026-04-13 8:32 UTC (permalink / raw)
To: Arnd Bergmann; +Cc: Krzysztof Kozlowski, arm, SoC Team, Sudeep Holla, ALKML
In-Reply-To: <45bc09d9-33e0-48c4-92a3-9bf8e64ef80a@app.fastmail.com>
On Mon, Apr 13, 2026 at 08:23:58AM +0200, Arnd Bergmann wrote:
> On Sat, Apr 11, 2026, at 19:35, Sudeep Holla wrote:
> > On Sat, Apr 11, 2026 at 10:49:50AM +0200, Krzysztof Kozlowski wrote:
> >> On Tue, Apr 07, 2026 at 11:08:39AM +0100, Sudeep Holla wrote:
> >> > ----------------------------------------------------------------
> >> > Arm FF-A fix for v7.1
> >> >
> >> > Use the page aligned backing allocation size when computing the RXTX_MAP
> >> > page count. This fixes FF-A RX/TX buffer registration on kernels built
> >> > with 16K/64K PAGE_SIZE, where alloc_pages_exact() backs the buffer with a
> >> > larger aligned span than the discovered minimum buffer size.
> >>
> >> Can we avoid per-driver trees or pulls? You do maintain also ARM SCMI
> >> firmware driver, so this could be sent together? I think you also use
> >> the same Git tree, right?
> >
> > Sure, I can put all of the firmware drivers I maintain together. I had
> > for some reason assumed individual PR is preferred.
>
> To me, that's a function of how complex the changes are and how
> you describe them in the changelog text: If you have a lot of changes
> for the merge window, having one branch per firmware type probably
> works best, or even multiple ones if you have a series that implements
> something new and a number of random changes to existing code.
>
> If you have only a handful of bugfixes across multiple firmware
> subsystems, a single 'firmware fixes' is less work for all of
> us, with no loss of readability in the git history.
>
Understood, will try to follow something along these lines in the future.
--
Regards,
Sudeep
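
For readers following the fix described in the quoted pull request, the page-count arithmetic can be sketched roughly as below. This is a standalone userspace model, not the driver code: `FFA_UNIT_SIZE`, `page_align()` and `rxtx_page_count()` are hypothetical names, and the 4K unit for the RXTX_MAP page count is an assumption that is not stated in this thread.

```c
#include <stddef.h>

/* Assumed unit for the RXTX_MAP page count (4K); not stated in the thread. */
#define FFA_UNIT_SIZE 4096UL

/* Round size up to a multiple of page_size (page_size must be a power of two). */
static size_t page_align(size_t size, size_t page_size)
{
	return (size + page_size - 1) & ~(page_size - 1);
}

/*
 * Derive the page count from the page-aligned backing span rather than
 * from the discovered minimum buffer size.
 */
static size_t rxtx_page_count(size_t min_buf_size, size_t page_size)
{
	return page_align(min_buf_size, page_size) / FFA_UNIT_SIZE;
}
```

Under these assumptions, a 4K minimum buffer on a 64K PAGE_SIZE kernel is described as 16 units instead of 1, which matches the aligned span that actually backs the buffer.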
* [PING] Re: [PATCH v7 0/2] arm64: dts/defconfig: enable BST C1200 eMMC
From: Albert Yang @ 2026-04-13 8:34 UTC (permalink / raw)
To: krzk, arnd
Cc: krzk+dt, robh, conor+dt, gordon.ge, bst-upstream,
linux-arm-kernel, devicetree, linux-kernel, yangzh0906
In-Reply-To: <20260310091211.4171307-1-yangzh0906@thundersoft.com>
Hi Krzysztof, Arnd, Rob, and Conor,
Gentle ping for this v7 series posted on 2026-03-10:
https://lore.kernel.org/lkml/20260310091211.4171307-1-yangzh0906@thundersoft.com/
This series only contains the remaining DTS + defconfig parts for BST C1200 eMMC:
- 1/2 arm64: dts: bst: enable eMMC controller in C1200 CDCU1.0 board
- 2/2 arm64: defconfig: enable BST SDHCI controller
The MMC driver-side patches were already applied in mmc-next, so this series is for
arm64/DT review and merge path.
No functional code changes since v7. If preferred, I can send a rebase/refresh (v8)
on top of current mainline immediately.
Could you please help review and let me know if any changes are still needed?
Thanks for your time.
Best regards,
Albert
* Re: [PATCH V3 7/7] arm64/hw_breakpoint: Enable FEAT_Debugv8p9
From: Mark Rutland @ 2026-04-13 8:48 UTC (permalink / raw)
To: Rob Herring
Cc: Anshuman Khandual, linux-arm-kernel, linux-kernel,
Jonathan Corbet, Marc Zyngier, Oliver Upton, James Morse,
Suzuki K Poulose, Catalin Marinas, Will Deacon, Mark Brown,
kvmarm
In-Reply-To: <CAL_Jsq+=3CKrUDcdgSHFaSGgvnh9ZxTGEBTRLkwrYgzQ=frOdg@mail.gmail.com>
On Fri, Apr 10, 2026 at 02:55:55PM -0500, Rob Herring wrote:
> On Thu, Apr 9, 2026 at 5:52 AM Mark Rutland <mark.rutland@arm.com> wrote:
> > Both breakpoint and watchpoint exceptions are synchronous, meaning that
> > they can only be taken from the specific instruction that triggered
> > them. However, updates to the watchpoint control registers *do* need a
> > context synchronization event before they're guaranteed to take effect.
> >
> > For a sequence:
> >
> > // Initially:
> > // - MDSCR, MDCR, DAIF.D permit debug exceptions at CurrentEL
> > // - No watchpoints enabled
> >
> > 0x000: LDR <val>, [<addr>]
> > 0x004: MSR DBGWVR<n>_EL1, <addr>
> > 0x008: MSR DBGWCR<n>_EL1, <configure_and_enable>
> > 0x010: LDR <val>, [<addr>]
> > 0x014: ISB
> > 0x018: LDR <val>, [<addr>]
> >
> > ... we know:
> >
> > (a) The LDR at 0x000 *cannot* trigger the watchpoint.
>
> Why not?
Because:
(a) As noted above, both breakpoint and watchpoint exceptions are
synchronous. They can only be taken from the specific *instruction*
that triggered them, and the exception must be consistent with no
subsequent instructions having been executed.
In ARM DDI 0487 M.a.a, see:
- Section D1.4 ("Exceptions"), and in particular RFQHGR, RTNVSL.
- Section D2.8 ("Breakpoint exceptions")
- Section D2.9 ("Watchpoint exceptions")
(b) Generally, writes to system registers cannot affect earlier
instructions.
In ARM DDI 0487 M.a.a, see: Section D24.1.2.2 ("Synchronization
requirements for AArch64 System registers").
Note in particular the statement: "Direct writes to System registers
are not allowed to affect any instructions appearing in program
order before the direct write."
> Can't the LDR complete after the MSR?
The memory effects of the LDR at 0x000 could complete after the MSRs at
0x004 and 0x008 are executed.
However, any watchpoints/breakpoint triggered by the LDR at 0x000 must
be taken *before* the next instruction (i.e. the MSR at 0x004) can be
architecturally executed, and any such exception must be consistent with
the PE state prior to that next instruction being executed.
Note that this doesn't forbid the PE from speculating the MSR and other
subsequent instructions; it just has to maintain program order (so later
instructions can't affect earlier instructions), and must be able to
discard anything that was speculated past taking an exception.
> Is ordering ensured between those? Surely the watchpoint triggers on
> completion of the load and that wouldn't stall the PE from doing the
> MSR(s)?
Hopefully the above covers these concerns?
>
> > (b) The LDR at 0x010 *might* trigger the watchpoint.
> > (c) The LDR at 0x018 *must* trigger the watchpoint.
> >
> > For C code, we can enforce this order with barrier(), e.g.
> >
> > val = *addr;
> > barrier();
> > write_sysreg(addr, DBGWVR<n>_EL1);
> > write_sysreg(configure_and_enable, DBGWCR<n>_EL1);
> > isb();
> >
> > ... where the compiler cannot re-order the memory access (or
> > write_sysreg(), or isb()) across the barrier(), and as isb() has a
> > memory clobber, the same is true for isb().
> >
> > Likewise, for the inverse sequence:
> >
> > // Initially:
> > // - MDSCR, MDCR, DAIF.D permit debug exceptions at CurrentEL
> > // - Watchpoint configured and enabled for <addr>
> >
> > 0x100: LDR <val>, [<addr>]
> > 0x104: MSR DBGWCR<n>_EL1, <disable>
> > 0x108: LDR <val>, [<addr>]
> > 0x110: ISB
> > 0x114: LDR <val>, [<addr>]
> >
> > ... we know:
> >
> > (a) The LDR at 0x100 *must* trigger the watchpoint.
> > (b) The LDR at 0x108 *might* trigger the watchpoint.
> > (c) The LDR at 0x114 *cannot* trigger the watchpoint.
> >
> > > Any guidance on the flavor of dsb here? (And is there any guarantee
> > > that the access is visible to the watchpoint h/w after a dsb
> > > completes?)
> >
> > Hopefully the above was sufficient?
> >
> > As mentioned above, I think we have a latent issue where we can take a
> > breakpoint or watchpoint under arch_uninstall_hw_breakpoint(), where we
> > have:
> >
> > arch_uninstall_hw_breakpoint(bp) {
> > ...
> > hw_breakpoint_control(bp, HW_BREAKPOINT_UNINSTALL) {
> > ...
> > hw_breakpoint_slot_setup(slots, max_slots, bp, HW_BREAKPOINT_UNINSTALL) {
> > ...
> > *slot = NULL;
> > ...
> > }
> > ...
> > write_wb_reg(ctrl_reg, i, 0) {
> > ...
> > write_sysreg(0, ...);
> > isb();
> > ...
> > }
> > }
> > }
> >
> > The HW breakpoint/watchpoint associated with 'bp' could be triggered
> > between setting '*slot' to NULL and the ISB. If that happens, then
> > do_breakpoint() won't find 'bp', and will return *without* disabling the
> > HW breakpoint or attempting to step.
> >
> > If that first exception was taken *before* the MSR in write_sysreg(),
> > then nothing has changed, and the breakpoint/watchpoint will then be
> > taken again ad infinitum.
> >
> > If that first exception was taken *after* the MSR in write_sysreg(), the
> > context synchronization provided by exception entry/return will prevent
> > it from being taken again.
> >
> > Building v6.19 and testing (with pseudo-NMI enabled):
> >
> > | # grep write_wb_reg /proc/kallsyms
> > | ffff80008004b980 t write_wb_reg
> > | # ./perf-6.19 stat -a -C 1 -e 'mem:0xffff80008004b980/4:xk' true
>
> I'll give it a try with v4, but that should be prevented with my
> changes to exclude kprobe regions.
That'll work for breakpoints specifically, but not for watchpoints, and
I think for that we either have to dynamically disable
watchpoint/breakpoint exceptions (e.g. via DAIF.D), or write the code to
be race-aware. I strongly suspect that the latter will be simpler.
I think we can solve that with some wider cleanup, e.g. having
arch_uninstall_hw_breakpoint() disable the breakpoint/watchpoint
*before* setting its slot to NULL, but we'll need to take some care
(e.g. to save/restore MDSELR).
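
To make the ordering argument concrete, here is a toy userspace model of the two teardown orders. It is only a sketch: `struct toy_cpu`, `toy_unhandled_window()` and the two `toy_uninstall_*()` helpers are hypothetical names, the `hw_enabled` flag stands in for the control register write plus ISB, and nothing here touches real debug registers or MDSELR.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_bp {
	int id;
};

struct toy_cpu {
	struct toy_bp *slot;	/* what the exception handler would look up */
	bool hw_enabled;	/* models the enable bit in the control register */
};

/* True if the hardware could still fire while the handler would find no owner. */
static bool toy_unhandled_window(const struct toy_cpu *cpu)
{
	return cpu->hw_enabled && cpu->slot == NULL;
}

/* Current order: clear the slot, then disable. Reports whether a window existed. */
static bool toy_uninstall_slot_first(struct toy_cpu *cpu)
{
	bool window;

	cpu->slot = NULL;
	window = toy_unhandled_window(cpu);
	cpu->hw_enabled = false;	/* MSR + ISB in the real code */
	return window;
}

/* Suggested order: disable first, then clear the slot. */
static bool toy_uninstall_disable_first(struct toy_cpu *cpu)
{
	bool window;

	cpu->hw_enabled = false;	/* MSR + ISB in the real code */
	window = toy_unhandled_window(cpu);
	cpu->slot = NULL;
	return window;
}
```

In this model only the slot-first order ever passes through a state where the hardware is armed but the slot is empty, which is the state in which the handler cannot find 'bp'.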
I strongly suspect that we can defer implementing support for EMBWE
until we've done a more general cleanup, but I'll need to go do some
reading first.
Mark.
* Re: [PATCH v10 16/20] coresight: Add PM callbacks for sink device
From: Leo Yan @ 2026-04-13 8:48 UTC (permalink / raw)
To: Jie Gan
Cc: Suzuki K Poulose, Mike Leach, James Clark, Yeoreum Yun,
Mark Rutland, Will Deacon, Yabin Cui, Keita Morisaki,
Yuanfang Zhang, Greg Kroah-Hartman, Alexander Shishkin,
Tamas Petz, Thomas Gleixner, Peter Zijlstra, coresight,
linux-arm-kernel
In-Reply-To: <227b77b9-5232-4cff-b26a-458477e9eb32@oss.qualcomm.com>
On Mon, Apr 13, 2026 at 01:45:50PM +0800, Jie Gan wrote:
[...]
> > @@ -1787,15 +1808,32 @@ static int coresight_pm_save(struct coresight_path *path)
> > to = list_prev_entry(coresight_path_last_node(path), link);
> > coresight_disable_path_from_to(path, from, to);
> > + ret = coresight_pm_device_save(coresight_get_sink(path));
> > + if (ret)
> > + goto sink_failed;
> > +
> > return 0;
> > +
> > +sink_failed:
> > + if (!coresight_enable_path_from_to(path, coresight_get_mode(source),
> > + from, to))
> > + coresight_pm_device_restore(source);
>
> I have gone through the message history and have a question about this
> point:
>
> how can we handle the scenario if coresight_enable_path_from_to failed? It
> means we are never calling coresight_pm_device_restore for the ETM and
> leaving the ETM with OS lock state until CPU reset?
From a design perspective, if any failure occurs in the idle flow, the
priority is to avoid further mess, especially partial enable/disable
sequences that could lead to lockups.
The case you mentioned is a typical risk: if part of the path between the
source and the sink fails to be enabled, it is unsafe to go on and enable
the source anyway. We rely on the per-CPU flag "percpu_pm_failed" to
disable idle states, because if ETE/TRBE fails to be disabled and the CPU
is then turned off, this might also cause a lockup.
> Consider we are calling etm4_disable_hw with OS lock:
> etm4_disable_hw -> etm4_disable_trace_unit -> etm4x_wait_status (may timeout
> here?)
This is expected. I don't want to introduce a _recovery_ mechanism for
CPU PM failures, which would be complex and over-engineered. The CPU PM
notifier is low-level code, and in my experience, PM issues can be easily
observed once CPU idle is enabled and should be resolved during the
development phase.
In many cases PM issues are caused not by CoreSight drivers but by
other modules (e.g., clock or regulator drivers). The log "Failed in
coresight PM save ..." alerts developers to such bugs. As said,
percpu_pm_failed is used as a last resort to prevent the platform from
locking up if there is a PM bug.
Thanks,
Leo
* [PATCH 0/3] arm64/virt: Add Arm CCA measurement register support
From: Sami Mujawar @ 2026-04-13 8:49 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, jgg, thuth, Suzuki.Poulose, steven.price,
gshan, YeoReum.Yun, Sami Mujawar
This series adds support for Arm Confidential Compute Architecture (CCA)
measurement registers in the Linux kernel, enabling guest Realms to
access, extend, and expose measurement values for attestation and runtime
integrity tracking.
The Realm Management Monitor (RMM) defines a set of measurement registers
consisting of a Realm Initial Measurement (RIM) and a number of Realm
Extensible Measurements (REMs). This series introduces the necessary
infrastructure to interact with these registers via the RSI interface
and exposes them to userspace through the TSM measurement framework.
At a high level, the series includes:
- Helper interfaces for reading and extending measurement
registers via RSI
- Definitions for Realm hash algorithms as defined by the
RMM specification
- Integration with the TSM measurement subsystem and sysfs
exposure for userspace visibility and interaction
After applying this series, measurement registers are exposed under:
/sys/devices/virtual/misc/arm_cca_guest/measurements/
Where:
- rim is read-only (initial measurement)
- rem[0-3] are read/write (extensible measurements)
- The hash algorithm reflects the Realm configuration
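
As a rough illustration of how userspace might consume these files, the helper below reads one measurement into a buffer. It is a sketch under assumptions: `read_measurement()` is a hypothetical name, and it assumes the attribute returns the raw digest bytes when read, which this cover letter does not spell out.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Read up to buf_len bytes from a measurement file.
 * Returns the number of bytes read, or -1 if the file cannot be opened.
 */
static long read_measurement(const char *path, unsigned char *buf,
			     size_t buf_len)
{
	FILE *f = fopen(path, "rb");
	size_t n;

	if (!f)
		return -1;

	n = fread(buf, 1, buf_len, f);
	fclose(f);
	return (long)n;
}
```

A caller would pass a path under the measurements/ directory above together with a 64-byte buffer, which is large enough for either supported digest size.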
Patch summary:
1. arm64: rsi: Add helpers for Arm CCA measurement registers
- Introduces RSI helper APIs to read and extend RIM/REM registers
2. arm64: rsi: Add realm hash algorithm defines
- Adds definitions for SHA-256 and SHA-512 identifiers returned
by the RMM
3. virt: arm-cca-guest: Add support for measurement registers
- Integrates with TSM measurement framework
- Implements measurement register refresh and extend operations
- Exposes registers via sysfs using a misc device
- Dynamically configures hash algorithm and digest size per Realm
This enables a consistent mechanism for attestation-related measurements
in Arm CCA guests and aligns with the kernel TSM measurement abstraction.
Feedback is very welcome.
Signed-off-by: Sami Mujawar <sami.mujawar@arm.com>
Sami Mujawar (3):
arm64: rsi: Add helpers for Arm CCA measurement register operations
arm64: rsi: Add realm hash algorithm defines
virt: arm-cca-guest: Add support for measurement registers
.../sysfs-devices-virtual-misc-arm_cca_guest | 38 +++
arch/arm64/include/asm/rsi_cmds.h | 105 ++++++-
arch/arm64/include/asm/rsi_smc.h | 7 +
drivers/virt/coco/arm-cca-guest/Kconfig | 1 +
.../virt/coco/arm-cca-guest/arm-cca-guest.c | 296 +++++++++++++++++-
5 files changed, 442 insertions(+), 5 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-devices-virtual-misc-arm_cca_guest
--
SAMI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
* [PATCH 1/3] arm64: rsi: Add helpers for Arm CCA measurement register operations
From: Sami Mujawar @ 2026-04-13 8:49 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, jgg, thuth, Suzuki.Poulose, steven.price,
gshan, YeoReum.Yun, Sami Mujawar
In-Reply-To: <20260413084957.327661-1-sami.mujawar@arm.com>
Add static inline helper functions to support reading the Realm
Initial Measurement (RIM) and reading/extending the Realm
Extensible Measurement (REM) registers.
The indices of the Arm CCA measurement registers, as defined by
the Realm Management Monitor specification, are as follows:
Index Register
0 RIM
1 - 4 REM[0 - 3]
The rsi_measurement_extend() function allows extending REM[0–3]
registers with a caller-provided digest (up to 64 bytes).
Index 0 (RIM) is read-only and cannot be extended.
The rsi_measurement_read() function allows reading measurement
values from RIM (index 0) or REM[0–3] (indices 1–4). The returned
digest is expected to be 64 bytes.
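
The argument checks described above can be modelled in isolation as follows. This is only a userspace sketch that mirrors the bounds in this patch; the `IDX_*` macros, `MAX_DIGEST_BYTES`, `extend_args_ok()` and `read_args_ok()` are hypothetical names and are not part of the proposed header.

```c
#include <stdbool.h>
#include <stddef.h>

#define IDX_RIM			0u	/* read-only */
#define IDX_REM0		1u
#define IDX_REM3		4u
#define MAX_DIGEST_BYTES	64u

/* Extend is valid only for REM[0-3] (indices 1-4) with a 1..64 byte digest. */
static bool extend_args_ok(unsigned int idx, size_t digest_size)
{
	return idx >= IDX_REM0 && idx <= IDX_REM3 &&
	       digest_size > 0 && digest_size <= MAX_DIGEST_BYTES;
}

/* Read is valid for RIM (index 0) and REM[0-3], with a 1..64 byte buffer. */
static bool read_args_ok(unsigned int idx, size_t digest_size)
{
	return idx <= IDX_REM3 &&
	       digest_size > 0 && digest_size <= MAX_DIGEST_BYTES;
}
```

The only asymmetry between the two checks is index 0: RIM may be read but never extended.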
Signed-off-by: Sami Mujawar <sami.mujawar@arm.com>
---
arch/arm64/include/asm/rsi_cmds.h | 105 +++++++++++++++++++++++++++++-
1 file changed, 104 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/rsi_cmds.h b/arch/arm64/include/asm/rsi_cmds.h
index 2c8763876dfb..cfd9bff88147 100644
--- a/arch/arm64/include/asm/rsi_cmds.h
+++ b/arch/arm64/include/asm/rsi_cmds.h
@@ -1,6 +1,6 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
- * Copyright (C) 2023 ARM Ltd.
+ * Copyright (C) 2023 - 2025 ARM Ltd.
*/
#ifndef __ASM_RSI_CMDS_H
@@ -15,6 +15,26 @@
#define RSI_GRANULE_SHIFT 12
#define RSI_GRANULE_SIZE (_AC(1, UL) << RSI_GRANULE_SHIFT)
+/*
+ * Maximum measurement data size in bytes.
+ * According to the RMM Specification, the width of the RmmRealmMeasurement type
+ * is 512 bits.
+ */
+#define RSI_MAX_MEASUREMENT_DATA_SIZE_BYTES 64
+
+/*
+ * Indices for the Realm Initial Measurement register (RIM) and the Realm
+ * Extensible Measurement registers (REMs).
+ * According to the RMM Specification, Realm attributes of a Realm include
+ * an array of measurement values. The first entry in this array is a RIM.
+ * The remaining entries in this array are REMs.
+ */
+#define RSI_INDEX_RIM 0
+#define RSI_INDEX_REM0 1
+#define RSI_INDEX_REM1 2
+#define RSI_INDEX_REM2 3
+#define RSI_INDEX_REM3 4
+
enum ripas {
RSI_RIPAS_EMPTY = 0,
RSI_RIPAS_RAM = 1,
@@ -159,4 +179,87 @@ static inline unsigned long rsi_attestation_token_continue(phys_addr_t granule,
return res.a0;
}
+/**
+ * rsi_measurement_extend - Extend the measurement value to the Realm Extensible
+ * Measurement (REM).
+ *
+ * @idx: Index of the REM register.
+ * Where:
+ * Index Register
+ * 1 - 4 REM[0-3]
+ * @digest: The digest data to be extended.
+ * @digest_size: Size of the digest data in bytes.
+ *
+ * Returns:
+ * On success, returns RSI_SUCCESS.
+ * Otherwise, -EINVAL
+ */
+static inline unsigned long rsi_measurement_extend(u32 idx,
+ const u8 *digest,
+ unsigned long digest_size)
+{
+ struct arm_smccc_1_2_regs regs = { 0 };
+
+ /*
+ * Index 0 is for RIM (which is Read Only), while
+ * REM[0-3] are indexed from 1 - 4.
+ * The digest size can be at the most 64 bytes.
+ */
+ if (!digest || idx < RSI_INDEX_REM0 || idx > RSI_INDEX_REM3 ||
+ digest_size == 0 || digest_size > RSI_MAX_MEASUREMENT_DATA_SIZE_BYTES)
+ return -EINVAL;
+
+ regs.a0 = SMC_RSI_MEASUREMENT_EXTEND;
+ regs.a1 = idx;
+ regs.a2 = digest_size;
+ memcpy(&regs.a3, digest, digest_size);
+ arm_smccc_1_2_smc(&regs, &regs);
+
+ if (regs.a0 != RSI_SUCCESS)
+ return -EINVAL;
+
+ return regs.a0;
+}
+
+/**
+ * rsi_measurement_read - Read the measurement value from the Realm Initial
+ * Measurement (RIM) or the Realm Extensible Measurement (REM) register.
+ *
+ * @idx: Index of the RIM or REM register.
+ * Where:
+ * Index Register
+ * 0 RIM
+ * 1 - 4 REM[0-3]
+ * @digest: The digest data to be returned.
+ * @digest_size: Size of the digest data buffer in bytes.
+ *
+ * Returns:
+ * On success, returns RSI_SUCCESS.
+ * Otherwise, -EINVAL
+ */
+static inline unsigned long rsi_measurement_read(u32 idx,
+ u8 *digest,
+ unsigned long digest_size)
+{
+ struct arm_smccc_1_2_regs regs = { 0 };
+
+ /*
+ * The digest size can be at most 64 bytes; if less than 64 bytes,
+ * it is zero padded.
+ */
+ if (!digest || idx > RSI_INDEX_REM3 ||
+ digest_size == 0 || digest_size > RSI_MAX_MEASUREMENT_DATA_SIZE_BYTES)
+ return -EINVAL;
+
+ regs.a0 = SMC_RSI_MEASUREMENT_READ;
+ regs.a1 = idx;
+ arm_smccc_1_2_smc(&regs, &regs);
+
+ if (regs.a0 != RSI_SUCCESS)
+ return -EINVAL;
+
+ memcpy(digest, &regs.a1, digest_size);
+ return regs.a0;
+}
+
#endif /* __ASM_RSI_CMDS_H */
--
SAMI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
* [PATCH 2/3] arm64: rsi: Add realm hash algorithm defines
From: Sami Mujawar @ 2026-04-13 8:49 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, jgg, thuth, Suzuki.Poulose, steven.price,
gshan, YeoReum.Yun, Sami Mujawar
In-Reply-To: <20260413084957.327661-1-sami.mujawar@arm.com>
Add macro definitions for the hash algorithm identifiers, as
specified in the Realm Management Monitor (RMM) specification.
Signed-off-by: Sami Mujawar <sami.mujawar@arm.com>
---
arch/arm64/include/asm/rsi_smc.h | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/arch/arm64/include/asm/rsi_smc.h b/arch/arm64/include/asm/rsi_smc.h
index e19253f96c94..dbcb98a6d5c9 100644
--- a/arch/arm64/include/asm/rsi_smc.h
+++ b/arch/arm64/include/asm/rsi_smc.h
@@ -144,6 +144,13 @@ struct realm_config {
#endif /* __ASSEMBLER__ */
+/*
+ * The RSI definition of the Hash Algorithm (as specified by the Secure
+ * Hash Standard) returned in the realm_config data structure.
+ */
+#define RSI_HASH_SHA_256 0
+#define RSI_HASH_SHA_512 1
+
/*
* Read configuration for the current Realm.
*
--
SAMI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
* [PATCH 3/3] virt: arm-cca-guest: Add support for measurement registers
From: Sami Mujawar @ 2026-04-13 8:49 UTC (permalink / raw)
To: linux-arm-kernel, linux-kernel
Cc: catalin.marinas, will, jgg, thuth, Suzuki.Poulose, steven.price,
gshan, YeoReum.Yun, Sami Mujawar
In-Reply-To: <20260413084957.327661-1-sami.mujawar@arm.com>
Add support for Arm CCA measurement registers (MRs), enabling attestation
and runtime integrity tracking from guest Realms.
This implementation registers a measurement configuration with the TSM
framework and exposes measurement register values via sysfs using a
misc device. The supported registers include the Realm Initial
Measurement (RIM) and four Runtime Extensible Measurement Registers
(REM0–REM3), each using SHA-256 or SHA-512 depending on Realm
configuration.
The measurement registers are located under the following sysfs node:
/sys/devices/virtual/misc/arm_cca_guest/measurements/
-rw-r--r-- 1 0 0 64 Jul 21 11:46 rem0:sha512
-rw-r--r-- 1 0 0 64 Jul 21 11:46 rem1:sha512
-rw-r--r-- 1 0 0 64 Jul 21 11:46 rem2:sha512
-rw-r--r-- 1 0 0 64 Jul 21 11:46 rem3:sha512
-r--r--r-- 1 0 0 64 Jul 21 11:46 rim:sha512
As seen above, the attributes for the REMs are 'rw', indicating they can
be read or extended, while the attribute for RIM is 'r', indicating
that it can only be read and not extended.
The sysfs node suffix for the measurement register (i.e. ':sha512')
indicates the hash algorithm used is sha512. This also reflects
that the Realm was launched with SHA512 as the measurement algorithm.
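
The relationship between the Realm hash algorithm, the sysfs suffix and the digest size can be sketched as a small lookup. This is a userspace illustration only: `struct hash_info`, `hash_info_for()` and the `HASH_ID_*` macros are hypothetical names, with the identifier values 0 and 1 taken from patch 2/3 of this series.

```c
#include <stddef.h>

#define HASH_ID_SHA_256	0	/* value of RSI_HASH_SHA_256 in patch 2/3 */
#define HASH_ID_SHA_512	1	/* value of RSI_HASH_SHA_512 in patch 2/3 */

struct hash_info {
	const char *suffix;	/* text after ':' in the sysfs node name */
	size_t digest_size;	/* digest length in bytes */
};

/* Returns 0 and fills *out for a known identifier, or -1 otherwise. */
static int hash_info_for(int id, struct hash_info *out)
{
	switch (id) {
	case HASH_ID_SHA_256:
		out->suffix = "sha256";
		out->digest_size = 32;
		return 0;
	case HASH_ID_SHA_512:
		out->suffix = "sha512";
		out->digest_size = 64;
		return 0;
	default:
		return -1;
	}
}
```

For the listing above, the identifier for SHA-512 yields the "sha512" suffix and the 64-byte file size shown for each node.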
Signed-off-by: Sami Mujawar <sami.mujawar@arm.com>
---
.../sysfs-devices-virtual-misc-arm_cca_guest | 38 +++
drivers/virt/coco/arm-cca-guest/Kconfig | 1 +
.../virt/coco/arm-cca-guest/arm-cca-guest.c | 296 +++++++++++++++++-
3 files changed, 331 insertions(+), 4 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-devices-virtual-misc-arm_cca_guest
diff --git a/Documentation/ABI/testing/sysfs-devices-virtual-misc-arm_cca_guest b/Documentation/ABI/testing/sysfs-devices-virtual-misc-arm_cca_guest
new file mode 100644
index 000000000000..878dc54e48f8
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-devices-virtual-misc-arm_cca_guest
@@ -0,0 +1,38 @@
+What: /sys/devices/virtual/misc/arm_cca_guest/measurements/MRNAME[:HASH]
+Date: July, 2025
+KernelVersion: v6.16
+Contact: linux-coco@lists.linux.dev
+Description:
+ Value of an Arm CCA Realm measurement register (MR). The optional
+ suffix :HASH is to represent the hash algorithms associated with
+ the MRs. See below for a complete list of Arm CCA Realm MRs exposed
+ via sysfs. Refer to the Arm Realm Management Monitor (RMM)
+ Specification for more information on the Realm Measurement registers.
+
+ The Arm Realm Management Monitor Specification can be found at:
+ https://developer.arm.com/documentation/den0137/latest/
+
+ See also:
+ https://docs.kernel.org/driver-api/coco/measurement-registers.html
+
+What: /sys/devices/virtual/misc/arm_cca_guest/measurements/rim:[sha256|sha512]
+Date: July, 2025
+KernelVersion: v6.16
+Contact: linux-coco@lists.linux.dev
+Description:
+ (RO) RIM - [32|64]-byte immutable storage typically used to represent
+ the Realm Initial Measurement (RIM) which is the measurement of
+ the configuration and contents of a Realm at the time of activation.
+
+What: /sys/devices/virtual/misc/arm_cca_guest/measurements/rem[0123]:[sha256|sha512]
+Date: July, 2025
+KernelVersion: v6.16
+Contact: linux-coco@lists.linux.dev
+Description:
+ (RW) REM[0123] - 4 Run-Time extendable Measurement Registers that
+ represent the Realm Extensible Measurement (REM) registers which
+ can be extended during the lifetime of a Realm.
+ Read from any of these returns the current value of the corresponding
+ REM. Write extends the written buffer to the REM. All writes must start
+ at offset 0 and be maximum 64 bytes in size. Attempting to write more
+ than 64 bytes will result in EINVAL returned by the write() syscall.
diff --git a/drivers/virt/coco/arm-cca-guest/Kconfig b/drivers/virt/coco/arm-cca-guest/Kconfig
index 3f0f013f03f1..62fcc6b16843 100644
--- a/drivers/virt/coco/arm-cca-guest/Kconfig
+++ b/drivers/virt/coco/arm-cca-guest/Kconfig
@@ -2,6 +2,7 @@ config ARM_CCA_GUEST
tristate "Arm CCA Guest driver"
depends on ARM64
select TSM_REPORTS
+ select TSM_MEASUREMENTS
help
The driver provides userspace interface to request and
attestation report from the Realm Management Monitor(RMM).
diff --git a/drivers/virt/coco/arm-cca-guest/arm-cca-guest.c b/drivers/virt/coco/arm-cca-guest/arm-cca-guest.c
index 0c9ea24a200c..2b5c5fa01cb3 100644
--- a/drivers/virt/coco/arm-cca-guest/arm-cca-guest.c
+++ b/drivers/virt/coco/arm-cca-guest/arm-cca-guest.c
@@ -1,18 +1,286 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
- * Copyright (C) 2023 ARM Ltd.
+ * Copyright (C) 2023 - 2025 ARM Ltd.
*/
#include <linux/arm-smccc.h>
#include <linux/cc_platform.h>
#include <linux/kernel.h>
+#include <linux/miscdevice.h>
#include <linux/mod_devicetable.h>
#include <linux/module.h>
#include <linux/smp.h>
#include <linux/tsm.h>
+#include <linux/tsm-mr.h>
#include <linux/types.h>
#include <asm/rsi.h>
+#include <crypto/hash.h>
+
+/* MR buffer */
+static u8 *arm_cca_mr_buf;
+
+/**
+ * arm_cca_mrs - ARM CCA measurement register set.
+ *
+ * Defines a static array of measurement registers used by the ARM
+ * Confidential Compute Architecture (CCA). These registers are used
+ * for attestation and runtime integrity tracking.
+ *
+ * Register types:
+ * - rim: Realm initial measurement register (RIM)
+ * - rem0–rem3: Runtime extensible measurement registers (REMs)
+ */
+static struct tsm_measurement_register arm_cca_mrs[] = {
+ { TSM_MR_(rim, SHA256) | TSM_MR_F_READABLE },
+ { TSM_MR_(rem0, SHA256) | TSM_MR_F_RTMR },
+ { TSM_MR_(rem1, SHA256) | TSM_MR_F_RTMR },
+ { TSM_MR_(rem2, SHA256) | TSM_MR_F_RTMR },
+ { TSM_MR_(rem3, SHA256) | TSM_MR_F_RTMR }
+};
+
+/**
+ * arm_cca_mr_refresh - Refresh measurement registers for ARM CCA.
+ *
+ * @tm: Pointer to a struct tsm_measurements containing measurement registers.
+ *
+ * Iterates through all measurement registers in @tm and refreshes those
+ * marked with TSM_MR_F_LIVE or TSM_MR_F_READABLE by invoking
+ * rsi_measurement_read() for each.
+ *
+ * Return: 0 on success, or -EINVAL if @tm is NULL or a read operation fails.
+ */
+static int arm_cca_mr_refresh(const struct tsm_measurements *tm)
+{
+ int retval;
+ int index = 0;
+ const struct tsm_measurement_register *mr;
+
+ if (!tm)
+ return -EINVAL;
+
+ while (index < tm->nr_mrs) {
+ mr = &tm->mrs[index];
+
+ /* Skip if the MR is not Live or Readable. */
+ if ((mr->mr_flags & (TSM_MR_F_LIVE | TSM_MR_F_READABLE)) != 0) {
+ retval = rsi_measurement_read(index,
+ mr->mr_value,
+ mr->mr_size);
+ if (retval != 0)
+ return -EINVAL;
+ }
+
+ index++;
+ }
+
+ return 0;
+}
+
+/**
+ * arm_cca_mr_extend - Extend a measurement register with new data.
+ *
+ * @tm: Pointer to the tsm_measurements structure containing measurement
+ * registers.
+ * @mr: Pointer to the specific measurement register to extend.
+ * @data: Pointer to the data to be used for extension.
+ *
+ * This function extends a measurement register with new input data.
+ *
+ * Return: 0 on success, or a negative error code (e.g., -EINVAL for invalid
+ * arguments).
+ */
+static int arm_cca_mr_extend(const struct tsm_measurements *tm,
+ const struct tsm_measurement_register *mr,
+ const u8 *data)
+{
+ if (!tm || !mr || !data)
+ return -EINVAL;
+
+ return rsi_measurement_extend((mr - tm->mrs), data, mr->mr_size);
+}
+
+/**
+ * arm_cca_measurements - ARM CCA measurement configuration instance.
+ *
+ * This defines the measurement set and behavior for the ARM
+ * Confidential Compute Architecture, enabling measurements
+ * for attestation and runtime validation.
+ */
+static struct tsm_measurements arm_cca_measurements = {
+ .mrs = arm_cca_mrs,
+ .nr_mrs = ARRAY_SIZE(arm_cca_mrs),
+ .refresh = arm_cca_mr_refresh,
+ .write = arm_cca_mr_extend,
+};
+
+/**
+ * arm_cca_attr_groups - Attribute groups for the arm_cca_misc_dev miscellaneous
+ * device.
+ *
+ */
+static const struct attribute_group *arm_cca_attr_groups[] = {
+ NULL, /* measurements */
+ NULL
+};
+
+/**
+ * arm_cca_misc_dev - Miscellaneous device for ARM CCA functionality.
+ *
+ */
+static struct miscdevice arm_cca_misc_dev = {
+ .name = KBUILD_MODNAME,
+ .minor = MISC_DYNAMIC_MINOR,
+ .groups = arm_cca_attr_groups,
+};
+
+/**
+ * arm_cca_get_hash_algorithm - Get the hash algorithm and digest size for
+ * a Realm.
+ *
+ * @hash_algo: Pointer to an int to receive the internal hash algorithm ID
+ * (e.g., HASH_ALGO_SHA256 or HASH_ALGO_SHA512).
+ * @digest_size: Pointer to an int to receive the digest size in bytes
+ * (e.g., SHA256_DIGEST_SIZE or SHA512_DIGEST_SIZE).
+ *
+ * This function retrieves the hash algorithm used in a Realm's configuration
+ * by invoking the `rsi_get_realm_config()` interface.
+ *
+ * Return:
+ * * %0 - Success. The hash algorithm and digest size are returned.
+ * * %-ENOMEM - Memory allocation failed.
+ * * %-EINVAL - Configuration fetch failed or algorithm is unsupported.
+ *
+ */
+static int arm_cca_get_hash_algorithm(int *hash_algo, int *digest_size)
+{
+ int ret = 0;
+ unsigned long result;
+ struct realm_config *cfg = NULL;
+
+ cfg = alloc_pages_exact(sizeof(*cfg), GFP_KERNEL);
+ if (!cfg)
+ return -ENOMEM;
+
+ result = rsi_get_realm_config(cfg);
+ if (result != RSI_SUCCESS) {
+ ret = -EINVAL;
+ goto exit_free_realm_config;
+ }
+
+ switch (cfg->hash_algo) {
+ case RSI_HASH_SHA_512:
+ *hash_algo = HASH_ALGO_SHA512;
+ *digest_size = SHA512_DIGEST_SIZE;
+ break;
+ case RSI_HASH_SHA_256:
+ *hash_algo = HASH_ALGO_SHA256;
+ *digest_size = SHA256_DIGEST_SIZE;
+ break;
+ default:
+ /* Unknown/unsupported algorithm. */
+ ret = -EINVAL;
+ break;
+ }
+
+exit_free_realm_config:
+ free_pages_exact(cfg, RSI_GRANULE_SIZE);
+ return ret;
+}
+
+/**
+ * arm_cca_mr_init - Initialize ARM CCA measurement register infrastructure.
+ *
+ * This function sets up the internal data structures for handling ARM CCA
+ * measurement registers (MRs) and creates a sysfs attribute group. It also
+ * registers a miscellaneous device for exposing the Arm CCA measurement
+ * registers to userspace.
+ *
+ * Return:
+ * * %0 - On success.
+ * * %-ENOMEM - if memory allocation fails.
+ * * %-EINVAL - On hash algorithm retrieval or attribute group creation
+ * failure.
+ */
+static int arm_cca_mr_init(void)
+{
+ const struct attribute_group *g;
+ int ret;
+ int hash_algo;
+ int digest_size;
+ int digest_buf_size;
+
+ /* Retrieve the hash algorithm and digest size. */
+ ret = arm_cca_get_hash_algorithm(&hash_algo, &digest_size);
+ if (ret)
+ return ret;
+
+ /*
+ * Allocate a single contiguous buffer to hold the digest values
+ * for all MRs.
+ */
+ digest_buf_size = ARRAY_SIZE(arm_cca_mrs) * digest_size;
+ u8 *digest_buf __free(kfree) = kzalloc(digest_buf_size, GFP_KERNEL);
+ if (!digest_buf)
+ return -ENOMEM;
+
+ arm_cca_mr_buf = digest_buf;
+
+ /* Initialise the mr_value storage and the mr_size. */
+ for (size_t i = 0; i < ARRAY_SIZE(arm_cca_mrs); ++i) {
+ arm_cca_mrs[i].mr_value = digest_buf + (digest_size * i);
+ arm_cca_mrs[i].mr_size = digest_size;
+ arm_cca_mrs[i].mr_hash = hash_algo;
+ }
+
+ /* Read the measurement registers. */
+ ret = arm_cca_mr_refresh(&arm_cca_measurements);
+ if (ret)
+ return ret;
+
+ /*
+ * Create a sysfs attribute group to expose the measurements
+ * to userspace.
+ */
+ g = tsm_mr_create_attribute_group(&arm_cca_measurements);
+ if (IS_ERR(g))
+ return PTR_ERR(g);
+
+ /* Initialise the attribute group before registering the misc device. */
+ arm_cca_attr_groups[0] = g;
+
+ /*
+ * Register a miscellaneous device for exposing
+ * the Arm CCA measurement registers to userspace.
+ */
+ ret = misc_register(&arm_cca_misc_dev);
+ if (ret < 0) {
+ tsm_mr_free_attribute_group(g);
+ return ret;
+ }
+
+ arm_cca_mr_buf = no_free_ptr(digest_buf);
+
+ return 0;
+}
+
+/**
+ * arm_cca_mr_cleanup - Unregister sysfs attribute group and free the
+ * measurement digest buffer region.
+ *
+ * @mr_grp: Pointer to the sysfs attribute group.
+ *
+ * This function performs cleanup for the Arm CCA measurement registers (MRs).
+ *
+ * The function should be called during the teardown or cleanup phase
+ * to ensure proper resource deallocation.
+ */
+static void arm_cca_mr_cleanup(const struct attribute_group *mr_grp)
+{
+ misc_deregister(&arm_cca_misc_dev);
+ tsm_mr_free_attribute_group(mr_grp);
+ kfree(arm_cca_mr_buf);
+}
/**
* struct arm_cca_token_info - a descriptor for the token buffer.
@@ -188,12 +456,16 @@ static const struct tsm_report_ops arm_cca_tsm_ops = {
/**
* arm_cca_guest_init - Register with the Trusted Security Module (TSM)
- * interface.
+ * interface and also register a miscellaneous device used for exposing
+ * the Arm CCA measurement registers to userspace.
*
* Return:
* * %0 - Registered successfully with the TSM interface.
* * %-ENODEV - The execution context is not an Arm Realm.
* * %-EBUSY - Already registered.
+ * * %-ENOMEM - If memory allocation fails.
+ * * %-EINVAL - On hash algorithm retrieval or attribute group creation
+ * failure.
*/
static int __init arm_cca_guest_init(void)
{
@@ -202,9 +474,22 @@ static int __init arm_cca_guest_init(void)
if (!is_realm_world())
return -ENODEV;
+ ret = arm_cca_mr_init();
+ if (ret < 0) {
+ pr_err("Error %d initialising MRs\n", ret);
+ return ret;
+ }
+
ret = tsm_report_register(&arm_cca_tsm_ops, NULL);
- if (ret < 0)
+ if (ret < 0) {
pr_err("Error %d registering with TSM\n", ret);
+ goto cleanup_mr;
+ }
+
+ return ret;
+
+cleanup_mr:
+ arm_cca_mr_cleanup(arm_cca_attr_groups[0]);
return ret;
}
@@ -212,11 +497,14 @@ module_init(arm_cca_guest_init);
/**
* arm_cca_guest_exit - unregister with the Trusted Security Module (TSM)
- * interface.
+ * interface and deregister the miscellaneous device used for exposing the
+ * Arm CCA measurement registers to userspace.
+ *
*/
static void __exit arm_cca_guest_exit(void)
{
tsm_report_unregister(&arm_cca_tsm_ops);
+ arm_cca_mr_cleanup(arm_cca_attr_groups[0]);
}
module_exit(arm_cca_guest_exit);
--
SAMI:{C3F47F37-75D8-414A-A8BA-3980EC8A46D7}
^ permalink raw reply related
* [PATCH v8 next 09/10] fs/resctrl: Wire up rmid expansion and reclaim functions
From: Zeng Heng @ 2026-04-13 8:54 UTC (permalink / raw)
To: ben.horgan, Dave.Martin, james.morse, tan.shaopeng,
reinette.chatre, fenghuay, tglx, will, hpa, bp, babu.moger,
dave.hansen, mingo, tony.luck, gshan, catalin.marinas
Cc: linux-arm-kernel, x86, linux-kernel, wangkefeng.wang
In-Reply-To: <20260413085405.1166412-1-zengheng4@huawei.com>
The previous patch implemented resctrl_arch_rmid_expand() and
resctrl_arch_rmid_reclaim() for ARM MPAM. This patch integrates
these architecture-specific functions into the generic resctrl layer.
Refactor resctrl_find_free_rmid() to support dynamic rmid expansion.
If no free rmid is available for the current closid, attempt to expand
via resctrl_arch_rmid_expand(). On success, retry the rmid allocation.
As this capability is architecture-specific, x86 maintains its existing
behavior by returning -ENOSPC when rmid resources are exhausted.
Additionally, invoke resctrl_arch_rmid_reclaim() when rmids are released
to enable architecture-specific resource cleanup.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
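Not part of the commit: a minimal userspace sketch of the allocate, expand,
retry flow described above. The slot pool, slot_limit and the demo_* helpers
are hypothetical stand-ins for the rmid free list and for
resctrl_arch_rmid_expand(); they only illustrate the control flow, not the
real data structures.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_POOL_SIZE 4

/* Hypothetical pool: only the first slot_limit slots are usable. */
static bool slot_in_use[DEMO_POOL_SIZE];
static int slot_limit = 2;

/* Analogue of __resctrl_find_free_rmid(): claim a free slot or fail. */
static int demo_find_free(void)
{
	for (int i = 0; i < slot_limit; i++) {
		if (!slot_in_use[i]) {
			slot_in_use[i] = true;
			return i;
		}
	}
	return -ENOSPC;
}

/* Analogue of resctrl_arch_rmid_expand(): grow the usable range by one. */
static int demo_expand(void)
{
	if (slot_limit >= DEMO_POOL_SIZE)
		return -ENOSPC;
	slot_limit++;
	return 0;
}

/* Analogue of resctrl_find_free_rmid(): expand once on exhaustion, then retry. */
static int demo_alloc(void)
{
	int idx = demo_find_free();

	if (idx != -ENOSPC)
		return idx;
	if (demo_expand() < 0)
		return -ENOSPC;
	return demo_find_free();
}
```

An architecture that cannot expand, like the x86 stub in this patch, would
correspond to demo_expand() returning -ENOSPC unconditionally, in which case
demo_alloc() behaves exactly like the plain search.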
arch/x86/include/asm/resctrl.h | 7 +++++++
fs/resctrl/monitor.c | 32 ++++++++++++++++++++++++++++++--
2 files changed, 37 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/resctrl.h b/arch/x86/include/asm/resctrl.h
index 575f8408a9e7..7f8c8d84f2a0 100644
--- a/arch/x86/include/asm/resctrl.h
+++ b/arch/x86/include/asm/resctrl.h
@@ -167,6 +167,13 @@ static inline void resctrl_arch_sched_in(struct task_struct *tsk)
__resctrl_sched_in(tsk);
}
+static inline int resctrl_arch_rmid_expand(u32 closid)
+{
+ return -ENOSPC;
+}
+
+static inline void resctrl_arch_rmid_reclaim(u32 closid, u32 rmid) {}
+
static inline void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid)
{
*rmid = idx;
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 7473c43600cf..3ac86995278e 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -122,6 +122,8 @@ static void limbo_release_entry(struct rmid_entry *entry)
if (IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID))
closid_num_dirty_rmid[entry->closid]--;
+
+ resctrl_arch_rmid_reclaim(entry->closid, entry->rmid);
}
/*
@@ -197,7 +199,7 @@ bool has_busy_rmid(struct rdt_l3_mon_domain *d)
return find_first_bit(d->rmid_busy_llc, idx_limit) != idx_limit;
}
-static struct rmid_entry *resctrl_find_free_rmid(u32 closid)
+static struct rmid_entry *__resctrl_find_free_rmid(u32 closid)
{
struct rmid_entry *itr;
u32 itr_idx, cmp_idx;
@@ -214,7 +216,12 @@ static struct rmid_entry *resctrl_find_free_rmid(u32 closid)
* very first entry will be returned.
*/
itr_idx = resctrl_arch_rmid_idx_encode(itr->closid, itr->rmid);
+ if (itr_idx == U32_MAX)
+ continue;
+
cmp_idx = resctrl_arch_rmid_idx_encode(closid, itr->rmid);
+ if (cmp_idx == U32_MAX)
+ continue;
if (itr_idx == cmp_idx)
return itr;
@@ -223,6 +230,25 @@ static struct rmid_entry *resctrl_find_free_rmid(u32 closid)
return ERR_PTR(-ENOSPC);
}
+static struct rmid_entry *resctrl_find_free_rmid(u32 closid)
+{
+ struct rmid_entry *err;
+ int ret;
+
+ err = __resctrl_find_free_rmid(closid);
+ if (err == ERR_PTR(-ENOSPC)) {
+ ret = resctrl_arch_rmid_expand(closid);
+ if (ret < 0)
+ /* Out of rmid */
+ goto out;
+
+ /* Try it again */
+ return __resctrl_find_free_rmid(closid);
+ }
+out:
+ return err;
+}
+
/**
* resctrl_find_cleanest_closid() - Find a CLOSID where all the associated
* RMID are clean, or the CLOSID that has
@@ -342,8 +368,10 @@ void free_rmid(u32 closid, u32 rmid)
if (resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID))
add_rmid_to_limbo(entry);
- else
+ else {
list_add_tail(&entry->list, &rmid_free_lru);
+ resctrl_arch_rmid_reclaim(closid, rmid);
+ }
}
bool rmid_is_occupied(u32 closid, u32 rmid)
--
2.25.1
^ permalink raw reply related
* [PATCH v8 next 00/10] arm_mpam: Introduce Narrow-PARTID feature
From: Zeng Heng @ 2026-04-13 8:53 UTC (permalink / raw)
To: ben.horgan, Dave.Martin, james.morse, tan.shaopeng,
reinette.chatre, fenghuay, tglx, will, hpa, bp, babu.moger,
dave.hansen, mingo, tony.luck, gshan, catalin.marinas
Cc: linux-arm-kernel, x86, linux-kernel, wangkefeng.wang
Background
==========
On x86, resctrl allows creating up to num_rmids monitoring groups
under a parent control group. However, ARM64 MPAM is currently limited by
the PMG (Performance Monitoring Group) count, which is typically much
smaller than the theoretical RMID limit. This creates a significant
scalability gap: users expecting fine-grained per-process or per-thread
monitoring quickly exhaust the PMG space, even when plenty of reqPARTIDs
remain available.
The Narrow-PARTID feature, defined in the ARM MPAM architecture,
addresses this by associating reqPARTIDs with intPARTIDs through a
programmable many-to-one mapping. This allows the kernel to present more
logical monitoring contexts.
Design Overview
===============
The implementation extends the RMID encoding to carry reqPARTID
information:
RMID = reqPARTID * NUM_PMG + PMG
In this patchset, a monitoring group is uniquely identified by the
combination of reqPARTID and PMG. The closid is represented by intPARTID,
which is exactly the original PARTID.
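As an illustration only, the RMID layout above can be sketched in plain C.
The demo_* names and struct demo_mon_id are hypothetical and do not appear
in the series; num_pmg stands for NUM_PMG, the number of PMG values.

```c
#include <stdint.h>

/* Hypothetical pair identifying one monitoring group. */
struct demo_mon_id {
	uint32_t reqpartid;
	uint32_t pmg;
};

/* RMID = reqPARTID * NUM_PMG + PMG */
static uint32_t demo_rmid_encode(uint32_t reqpartid, uint32_t pmg,
				 uint32_t num_pmg)
{
	return reqpartid * num_pmg + pmg;
}

/* Inverse of demo_rmid_encode(): split an RMID into its two components. */
static struct demo_mon_id demo_rmid_decode(uint32_t rmid, uint32_t num_pmg)
{
	struct demo_mon_id id = {
		.reqpartid = rmid / num_pmg,
		.pmg = rmid % num_pmg,
	};

	return id;
}
```

With four PMG values, reqPARTID 5 and PMG 3 encode to RMID 23, and RMID 23
decodes back to the same pair.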
For systems with homogeneous MSCs (all supporting Narrow-PARTID), the
driver exposes the full reqPARTID range directly. For heterogeneous
systems where some MSCs lack Narrow-PARTID support, the driver utilizes
PARTIDs beyond the intPARTID range as reqPARTIDs to expand monitoring
capability. The sole exception is when an MSC type that lacks
Narrow-PARTID support relies on a percentage-based control mechanism,
which prevents the use of PARTIDs as reqPARTIDs.
Capability Improvements
=======================
--------------------------------------------------------------------------
The maximum        | Sub-monitoring groups             | System-wide
number of          | under a control group             | monitoring groups
--------------------------------------------------------------------------
Without reqPARTID  | PMG                               | intPARTID * PMG
--------------------------------------------------------------------------
reqPARTID          |                                   |
static allocation  | (reqPARTID // intPARTID) * PMG    | reqPARTID * PMG
--------------------------------------------------------------------------
reqPARTID          |                                   |
dynamic allocation | (reqPARTID - intPARTID + 1) * PMG | reqPARTID * PMG
--------------------------------------------------------------------------
Note: The number of intPARTIDs can be capped via the boot parameter
mpam.intpartid_max. Under MPAM, reqPARTID count is always greater than
or equal to intPARTID count.
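A small worked example of the table above, as a sketch. The helper names
are hypothetical, and the arguments are assumed to be counts (number of
reqPARTIDs, intPARTIDs and PMGs) rather than maximum index values.

```c
#include <stdint.h>

/* Sub-monitoring groups under one control group, without reqPARTID. */
static uint32_t demo_sub_groups_none(uint32_t num_pmg)
{
	return num_pmg;
}

/* Static allocation: reqPARTIDs are split evenly across intPARTIDs. */
static uint32_t demo_sub_groups_static(uint32_t num_req, uint32_t num_int,
				       uint32_t num_pmg)
{
	return (num_req / num_int) * num_pmg;
}

/* Dynamic allocation: one group may take every spare reqPARTID. */
static uint32_t demo_sub_groups_dynamic(uint32_t num_req, uint32_t num_int,
					uint32_t num_pmg)
{
	return (num_req - num_int + 1) * num_pmg;
}

/* System-wide total: pass intPARTID or reqPARTID count as num_partid. */
static uint32_t demo_system_groups(uint32_t num_partid, uint32_t num_pmg)
{
	return num_partid * num_pmg;
}
```

For a hypothetical system with 64 reqPARTIDs, 16 intPARTIDs and 4 PMGs, a
single control group can hold 4, 16 or 196 sub-monitoring groups
respectively, and the system-wide total grows from 64 to 256.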
Series Structure
================
Patch 1: Fix pre-existing out-of-range PARTID issue between mount sessions.
Patches 2-6: Implement static reqPARTID allocation.
Patches 7-10: Implement dynamic reqPARTID allocation.
Changes
=======
Compared with v7:
- Add boot parameter to limit mpam_intpartid_max.
- Update the Narrow-PARTID enablement condition checks.
- Add default group detection in mpam_thread_switch().
- Correct patch series revision tag for consistency.
Compared with v6:
- Add dynamic reqPARTID allocation implementation.
- Add Patch 1 to fix pre-existing out-of-range PARTID issue.
- Drop original patch 4 which has been merged into the baseline.
Compared with v5:
- Redefine the RMID information.
- Refactor the resctrl_arch_rmid_idx_decode() and
resctrl_arch_rmid_idx_encode().
- Simplify closid_rmid2reqpartid() to rmid2reqpartid() and replace it
accordingly.
Compared with RFC-v4:
- Rebase the patch set on the v6.14-rc1 branch.
Compared with RFC-v3:
- Add limitation of the Narrow-PARTID feature (See Patch 2).
- Remove redundant reqpartid2closid() and reqpartid_pmg2rmid().
- Refactor closid_rmid2reqpartid() partially.
- Merge the PARTID conversion-related patches into a single patch for
bisectability.
- Skip adaptation of resctrl_arch_set_rmid() which is going to be
removed.
Compared with RFC-v2:
- Refactor closid/rmid pair translation.
- Simplify the logic of synchronize configuration.
- Remove reqPARTID source bitmap.
Compared with RFC-v1:
- Rebase this patch set on latest MPAM driver of the v6.12-rc1 branch.
Previous Versions
=================
v7: https://lore.kernel.org/all/20260317132141.1272506-1-zengheng4@huawei.com/
v6: https://lore.kernel.org/all/20250222112448.2438586-1-zengheng4@huawei.com/
v5: https://lore.kernel.org/all/20250217031852.2014939-1-zengheng4@huawei.com/
RFC-v4: https://lore.kernel.org/all/20250104101224.873926-1-zengheng4@huawei.com/
RFC-v3: https://lore.kernel.org/all/20241207092136.2488426-1-zengheng4@huawei.com/
RFC-v2: https://lore.kernel.org/all/20241119135104.595630-1-zengheng4@huawei.com/
RFC-v1: https://lore.kernel.org/all/20241114135037.918470-1-zengheng4@huawei.com/
---
Zeng Heng (10):
fs/resctrl: Fix MPAM Partid parsing errors by preserving CDP state
during umount
arm_mpam: Add intPARTID and reqPARTID support for Narrow-PARTID
feature
arm_mpam: Disable reqPARTID expansion when Narrow-PARTID is
unavailable
arm_mpam: Refactor rmid to reqPARTID/PMG mapping
arm_mpam: Propagate control group config to sub-monitoring groups
arm_mpam: Add boot parameter to limit mpam_intpartid_max
fs/resctrl: Add rmid_entry state helpers
arm_mpam: Implement dynamic reqPARTID allocation for monitoring groups
fs/resctrl: Wire up rmid expansion and reclaim functions
arm_mpam: Add mpam_sync_config() for dynamic rmid expansion
arch/arm64/include/asm/mpam.h | 12 +-
arch/x86/include/asm/resctrl.h | 7 +
drivers/resctrl/mpam_devices.c | 103 ++++++++---
drivers/resctrl/mpam_internal.h | 6 +-
drivers/resctrl/mpam_resctrl.c | 294 ++++++++++++++++++++++++++++----
fs/resctrl/monitor.c | 50 +++++-
fs/resctrl/rdtgroup.c | 24 ++-
include/linux/arm_mpam.h | 17 ++
include/linux/resctrl.h | 21 +++
9 files changed, 469 insertions(+), 65 deletions(-)
--
2.25.1
^ permalink raw reply
* [PATCH v8 next 10/10] arm_mpam: Add mpam_sync_config() for dynamic rmid expansion
From: Zeng Heng @ 2026-04-13 8:54 UTC (permalink / raw)
To: ben.horgan, Dave.Martin, james.morse, tan.shaopeng,
reinette.chatre, fenghuay, tglx, will, hpa, bp, babu.moger,
dave.hansen, mingo, tony.luck, gshan, catalin.marinas
Cc: linux-arm-kernel, x86, linux-kernel, wangkefeng.wang
In-Reply-To: <20260413085405.1166412-1-zengheng4@huawei.com>
Add mpam_sync_config() to synchronize configuration when dynamically
expanding rmid resources. When binding a new reqpartid to a control group,
the driver either maps the reqpartid to the corresponding intpartid or,
on MSCs without the Narrow-PARTID feature, applies the control group's
existing configuration to the new partid.
Extend mpam_apply_config() with the 'sync' work mode:
* Sync mode: mpam_sync_config() calls this to apply existing
configuration without updating config.
* Update (non-sync) mode: resctrl_arch_update_one() calls this to
compare, update, and apply configuration. This mode retains the
original behavior.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
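Not part of the commit: a simplified userspace model of the two work modes.
The arrays, sizes and demo_* helpers are hypothetical; a single int stands
in for struct mpam_config, and the update path here programs only the
intPARTID itself rather than every associated reqPARTID.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_NUM_INTPARTID 4
#define DEMO_NUM_PARTID    8

static int stored_cfg[DEMO_NUM_INTPARTID];	/* software copy, per intPARTID */
static int hw_cfg[DEMO_NUM_PARTID];		/* what the "hardware" holds */
static int hw_writes;				/* number of programming operations */

static unsigned int demo_req2int(unsigned int reqpartid)
{
	return reqpartid % DEMO_NUM_INTPARTID;
}

static void demo_program(unsigned int partid, int value)
{
	hw_cfg[partid] = value;
	hw_writes++;
}

static int demo_apply_config(unsigned int partid, int cfg, bool sync)
{
	if (partid >= DEMO_NUM_PARTID)
		return -EINVAL;

	if (sync) {
		/* Sync mode: replay the owner's stored config, change nothing. */
		demo_program(partid, stored_cfg[demo_req2int(partid)]);
		return 0;
	}

	/* Update mode: partid must be an intPARTID. */
	if (partid >= DEMO_NUM_INTPARTID)
		return -EINVAL;
	if (stored_cfg[partid] == cfg)
		return 0;	/* nothing changed, skip the hardware write */

	stored_cfg[partid] = cfg;
	demo_program(partid, cfg);
	return 0;
}
```

In this model, updating intPARTID 1 writes once, repeating the same update
writes nothing, and syncing reqPARTID 5 copies the stored value of
intPARTID 1 without touching stored_cfg[].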
drivers/resctrl/mpam_devices.c | 23 ++++++++++++++++++-----
drivers/resctrl/mpam_internal.h | 2 +-
drivers/resctrl/mpam_resctrl.c | 29 ++++++++++++++++++++++++++---
3 files changed, 45 insertions(+), 9 deletions(-)
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index cf94b45b4f9e..47345b8add3c 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -1762,6 +1762,7 @@ struct mpam_write_config_arg {
struct mpam_msc_ris *ris;
struct mpam_component *comp;
u16 partid;
+ bool sync;
};
static int __write_config(void *arg)
@@ -1770,6 +1771,15 @@ static int __write_config(void *arg)
struct mpam_write_config_arg *c = arg;
u32 reqpartid;
+ if (c->sync) {
+ /* c->partid should be within the range of reqPARTIDs */
+ WARN_ON_ONCE(c->partid < closid_num);
+
+ mpam_reprogram_ris_partid(c->ris, c->partid,
+ &c->comp->cfg[req2intpartid(c->partid)]);
+ return 0;
+ }
+
/* c->partid should be within the range of intPARTIDs */
WARN_ON_ONCE(c->partid >= closid_num);
@@ -2942,7 +2952,7 @@ static bool mpam_update_config(struct mpam_config *cfg,
}
int mpam_apply_config(struct mpam_component *comp, u16 partid,
- struct mpam_config *cfg)
+ struct mpam_config *cfg, bool sync)
{
struct mpam_write_config_arg arg;
struct mpam_msc_ris *ris;
@@ -2951,14 +2961,17 @@ int mpam_apply_config(struct mpam_component *comp, u16 partid,
lockdep_assert_cpus_held();
- /* Don't pass in the current config! */
- WARN_ON_ONCE(&comp->cfg[partid] == cfg);
+ if (!sync) {
+ /* The partid is within the range of intPARTIDs */
+ WARN_ON_ONCE(partid >= resctrl_arch_get_num_closid(NULL));
- if (!mpam_update_config(&comp->cfg[partid], cfg))
- return 0;
+ if (!mpam_update_config(&comp->cfg[partid], cfg))
+ return 0;
+ }
arg.comp = comp;
arg.partid = partid;
+ arg.sync = sync;
guard(srcu)(&mpam_srcu);
list_for_each_entry_srcu(vmsc, &comp->vmsc, comp_list,
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index 16ce968344d0..90c8f8d16c79 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -464,7 +464,7 @@ void mpam_disable(struct work_struct *work);
void mpam_reset_class_locked(struct mpam_class *class);
int mpam_apply_config(struct mpam_component *comp, u16 partid,
- struct mpam_config *cfg);
+ struct mpam_config *cfg, bool sync);
int mpam_msmon_read(struct mpam_component *comp, struct mon_cfg *ctx,
enum mpam_device_features, u64 *val);
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 2762462d80e5..5a97a7f18670 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -1329,15 +1329,15 @@ int resctrl_arch_update_one(struct rdt_resource *r, struct rdt_ctrl_domain *d,
*/
if (mpam_resctrl_hide_cdp(r->rid)) {
partid = resctrl_get_config_index(closid, CDP_CODE);
- err = mpam_apply_config(dom->ctrl_comp, partid, &cfg);
+ err = mpam_apply_config(dom->ctrl_comp, partid, &cfg, false);
if (err)
return err;
partid = resctrl_get_config_index(closid, CDP_DATA);
- return mpam_apply_config(dom->ctrl_comp, partid, &cfg);
+ return mpam_apply_config(dom->ctrl_comp, partid, &cfg, false);
}
- return mpam_apply_config(dom->ctrl_comp, partid, &cfg);
+ return mpam_apply_config(dom->ctrl_comp, partid, &cfg, false);
}
int resctrl_arch_update_domains(struct rdt_resource *r, u32 closid)
@@ -1718,6 +1718,23 @@ static void update_rmid_entries_for_reqpartid(u32 reqpartid)
rmid_entry_reassign_closid(closid, req_pmg2rmid(reqpartid, pmg));
}
+static int mpam_sync_config(u32 reqpartid)
+{
+ struct mpam_component *comp;
+ struct mpam_class *class;
+ int err;
+
+ list_for_each_entry(class, &mpam_classes, classes_list) {
+ list_for_each_entry(comp, &class->components, class_list) {
+ err = mpam_apply_config(comp, reqpartid, NULL, true);
+ if (err)
+ return err;
+ }
+ }
+
+ return 0;
+}
+
int resctrl_arch_rmid_expand(u32 closid)
{
int i;
@@ -1729,14 +1746,20 @@ int resctrl_arch_rmid_expand(u32 closid)
if (reqpartid_map[i] >= resctrl_arch_get_num_closid(NULL)) {
if (cdp_enabled) {
reqpartid_map[i] = resctrl_get_config_index(closid, CDP_DATA);
+ mpam_sync_config(i);
+
/*
* Reqpartids are always allocated in
* pairs, never out-of-bounds access.
*/
reqpartid_map[i + 1] = resctrl_get_config_index(closid, CDP_CODE);
+ mpam_sync_config(i + 1);
+
} else {
reqpartid_map[i] = resctrl_get_config_index(closid, CDP_NONE);
+ mpam_sync_config(i);
}
+
update_rmid_entries_for_reqpartid(i);
return i;
}
--
2.25.1
^ permalink raw reply related
* [PATCH v8 next 01/10] fs/resctrl: Fix MPAM Partid parsing errors by preserving CDP state during umount
From: Zeng Heng @ 2026-04-13 8:53 UTC (permalink / raw)
To: ben.horgan, Dave.Martin, james.morse, tan.shaopeng,
reinette.chatre, fenghuay, tglx, will, hpa, bp, babu.moger,
dave.hansen, mingo, tony.luck, gshan, catalin.marinas
Cc: linux-arm-kernel, x86, linux-kernel, wangkefeng.wang
In-Reply-To: <20260413085405.1166412-1-zengheng4@huawei.com>
This patch fixes a pre-existing issue in the resctrl filesystem teardown
sequence where premature clearing of cdp_enabled could lead to MPAM Partid
parsing errors.
The closid-to-partid conversion logic inherently depends on the global
cdp_enabled state. However, rdt_disable_ctx() clears this flag early in
the umount path, while later free_rmid() operations still reference it.
This creates a window where partid parsing operates with inconsistent CDP
state, potentially causing monitor reads to use the wrong partid mapping.
Additionally, rmid_entry structures remaining in limbo between mount
sessions may trigger partid out-of-range errors, leading to MPAM fault
interrupts and subsequent MPAM disablement.
Reorder rdt_kill_sb() to delay rdt_disable_ctx() until after
rmdir_all_sub() and resctrl_fs_teardown() complete. This ensures
all rmid-related operations finish with correct CDP state.
Introduce rdt_flush_limbo() to flush and cancel limbo work before the
filesystem teardown completes. An alternative approach would be to cancel
limbo work on umount and restart it on remount with a rebuilt bitmap.
However, this would require substantial changes in the resctrl layer to
handle CDP state transitions across mount sessions, which is beyond the
scope of the reqpartid feature work this patchset focuses on. The current
fix addresses the immediate correctness issue with minimal churn.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
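Not part of the commit: a small sketch of why the conversion depends on the
CDP flag. It assumes the even/odd PARTID layout used with CDP (data on the
even PARTID, code on the odd one); the demo_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

enum demo_conf_type { DEMO_CDP_NONE, DEMO_CDP_CODE, DEMO_CDP_DATA };

/* Assumed layout: with CDP, one closid owns an even/odd PARTID pair. */
static uint32_t demo_config_index(uint32_t closid, enum demo_conf_type type)
{
	switch (type) {
	case DEMO_CDP_CODE:
		return closid * 2 + 1;
	case DEMO_CDP_DATA:
		return closid * 2;
	default:
		return closid;
	}
}

/* PARTID used for the data side of closid, as seen with a given CDP flag. */
static uint32_t demo_data_partid(uint32_t closid, bool cdp_enabled)
{
	return demo_config_index(closid,
				 cdp_enabled ? DEMO_CDP_DATA : DEMO_CDP_NONE);
}
```

A group created with CDP enabled and closid 3 uses PARTID 6 for data. If the
flag is cleared before the group's rmids are released, the same closid
resolves to PARTID 3, which is the mismatch the reordering avoids.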
fs/resctrl/rdtgroup.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)
diff --git a/fs/resctrl/rdtgroup.c b/fs/resctrl/rdtgroup.c
index 5dfdaa6f9d8f..69966589713f 100644
--- a/fs/resctrl/rdtgroup.c
+++ b/fs/resctrl/rdtgroup.c
@@ -3169,6 +3169,25 @@ static void resctrl_fs_teardown(void)
rdtgroup_destroy_root();
}
+static void rdt_flush_limbo(void)
+{
+ struct rdt_resource *r = resctrl_arch_get_resource(RDT_RESOURCE_L3);
+ struct rdt_l3_mon_domain *d;
+
+ if (!IS_ENABLED(CONFIG_RESCTRL_RMID_DEPENDS_ON_CLOSID))
+ return;
+
+ if (!resctrl_is_mon_event_enabled(QOS_L3_OCCUP_EVENT_ID))
+ return;
+
+ list_for_each_entry(d, &r->mon_domains, hdr.list) {
+ if (has_busy_rmid(d)) {
+ __check_limbo(d, true);
+ cancel_delayed_work(&d->cqm_limbo);
+ }
+ }
+}
+
static void rdt_kill_sb(struct super_block *sb)
{
struct rdt_resource *r;
@@ -3176,13 +3195,14 @@ static void rdt_kill_sb(struct super_block *sb)
cpus_read_lock();
mutex_lock(&rdtgroup_mutex);
- rdt_disable_ctx();
-
/* Put everything back to default values. */
for_each_alloc_capable_rdt_resource(r)
resctrl_arch_reset_all_ctrls(r);
resctrl_fs_teardown();
+ rdt_flush_limbo();
+ rdt_disable_ctx();
+
if (resctrl_arch_alloc_capable())
resctrl_arch_disable_alloc();
if (resctrl_arch_mon_capable())
--
2.25.1
^ permalink raw reply related
* [PATCH v8 next 05/10] arm_mpam: Propagate control group config to sub-monitoring groups
From: Zeng Heng @ 2026-04-13 8:54 UTC (permalink / raw)
To: ben.horgan, Dave.Martin, james.morse, tan.shaopeng,
reinette.chatre, fenghuay, tglx, will, hpa, bp, babu.moger,
dave.hansen, mingo, tony.luck, gshan, catalin.marinas
Cc: linux-arm-kernel, x86, linux-kernel, wangkefeng.wang
In-Reply-To: <20260413085405.1166412-1-zengheng4@huawei.com>
With the Narrow-PARTID feature, each control group is assigned multiple
(req)PARTIDs to expand monitoring capacity. When a control group's
configuration is updated, all associated sub-monitoring groups (each
identified by a unique reqPARTID) should be synchronized.
In __write_config(), iterate over all reqPARTIDs belonging to the
control group and propagate the configuration to each sub-monitoring
group:
* For MSCs supporting Narrow-PARTID, establish the reqPARTID to
intPARTID mapping.
* For MSCs without Narrow-PARTID support, synchronize the configuration
to new PARTIDs directly.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
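Not part of the commit: a sketch of how the reqPARTIDs of one control group
are enumerated under static allocation. The demo_* names are hypothetical,
and the sketch assumes the number of closids equals mpam_intpartid_max + 1
(that is, CDP is disabled).

```c
#include <stdint.h>

/* Counterpart of get_num_reqpartid_per_intpartid() in this patch. */
static uint32_t demo_reqs_per_int(uint32_t partid_max, uint32_t intpartid_max)
{
	return (partid_max + 1) / (intpartid_max + 1);
}

/*
 * The idx-th reqPARTID belonging to intpartid, following the
 * "req_idx * closid_num + partid" stride used in __write_config().
 */
static uint32_t demo_nth_reqpartid(uint32_t intpartid, uint32_t idx,
				   uint32_t num_intpartid)
{
	return idx * num_intpartid + intpartid;
}
```

With partid_max = 15 and intpartid_max = 3, each intPARTID owns four
reqPARTIDs, and intPARTID 1 maps to reqPARTIDs 1, 5, 9 and 13.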
drivers/resctrl/mpam_devices.c | 33 +++++++++++++++++++++++++++------
1 file changed, 27 insertions(+), 6 deletions(-)
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index cf067bf5092e..8fbc8f9f9688 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -1547,6 +1547,7 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
{
u32 pri_val = 0;
u16 cmax = MPAMCFG_CMAX_CMAX;
+ u16 intpartid = req2intpartid(partid);
struct mpam_msc *msc = ris->vmsc->msc;
struct mpam_props *rprops = &ris->props;
u16 dspri = GENMASK(rprops->dspri_wd, 0);
@@ -1556,15 +1557,17 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
__mpam_part_sel(ris->ris_idx, partid, msc);
if (mpam_has_feature(mpam_feat_partid_nrw, rprops)) {
- /* Update the intpartid mapping */
mpam_write_partsel_reg(msc, INTPARTID,
- MPAMCFG_INTPARTID_INTERNAL | partid);
+ MPAMCFG_INTPARTID_INTERNAL | intpartid);
/*
- * Then switch to the 'internal' partid to update the
- * configuration.
+ * Mapping from reqpartid to intpartid already established.
+ * Sub-monitoring groups share the parent's configuration.
*/
- __mpam_intpart_sel(ris->ris_idx, partid, msc);
+ if (partid != intpartid)
+ goto out;
+
+ __mpam_intpart_sel(ris->ris_idx, intpartid, msc);
}
if (mpam_has_feature(mpam_feat_cpor_part, rprops)) {
@@ -1631,6 +1634,7 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
mpam_quirk_post_config_change(ris, partid, cfg);
+out:
mutex_unlock(&msc->part_sel_lock);
}
@@ -1755,11 +1759,28 @@ struct mpam_write_config_arg {
u16 partid;
};
+static u32 get_num_reqpartid_per_intpartid(void)
+{
+ return (mpam_partid_max + 1) / (mpam_intpartid_max + 1);
+}
+
static int __write_config(void *arg)
{
+ int closid_num = resctrl_arch_get_num_closid(NULL);
struct mpam_write_config_arg *c = arg;
+ u32 reqpartid, req_idx;
- mpam_reprogram_ris_partid(c->ris, c->partid, &c->comp->cfg[c->partid]);
+ /* c->partid should be within the range of intPARTIDs */
+ WARN_ON_ONCE(c->partid >= closid_num);
+
+ /* Synchronize the configuration to each sub-monitoring group. */
+ for (req_idx = 0; req_idx < get_num_reqpartid_per_intpartid();
+ req_idx++) {
+ reqpartid = req_idx * closid_num + c->partid;
+
+ mpam_reprogram_ris_partid(c->ris, reqpartid,
+ &c->comp->cfg[c->partid]);
+ }
return 0;
}
--
2.25.1
^ permalink raw reply related
* [PATCH v8 next 04/10] arm_mpam: Refactor rmid to reqPARTID/PMG mapping
From: Zeng Heng @ 2026-04-13 8:53 UTC (permalink / raw)
To: ben.horgan, Dave.Martin, james.morse, tan.shaopeng,
reinette.chatre, fenghuay, tglx, will, hpa, bp, babu.moger,
dave.hansen, mingo, tony.luck, gshan, catalin.marinas
Cc: linux-arm-kernel, x86, linux-kernel, wangkefeng.wang
In-Reply-To: <20260413085405.1166412-1-zengheng4@huawei.com>
The Narrow-PARTID feature allows the MPAM driver to statically or
dynamically allocate request PARTIDs (reqPARTIDs) to internal
PARTIDs (intPARTIDs). This enables expanding the number of monitoring
groups beyond the hardware PMG limit.
For systems with mixed MSCs (Memory System Components), MSCs that do
not support Narrow-PARTID use PARTIDs exceeding the minimum number of
intPARTIDs as reqPARTIDs to expand monitoring groups.
Expand RMID to include reqPARTID information:
RMID = reqPARTID * NUM_PMG + PMG
To maintain compatibility with the existing resctrl layer, reqPARTIDs
are allocated statically with a linear mapping to intPARTIDs via
req2intpartid().
Mapping relationships (n = intPARTID count, m = reqPARTIDs per intPARTID):
P - Partition group
M - Monitoring group
Group  closid  rmid.reqPARTID  MSCs w/ Narrow-PARTID  MSCs w/o Narrow-PARTID
P1     0                       intPARTID_1            PARTID_1
M1_1   0       0               ├── reqPARTID_1_1      ├── PARTID_1
M1_2   0       0+n             ├── reqPARTID_1_2      ├── PARTID_1_2
M1_3   0       0+n*2           ├── reqPARTID_1_3      ├── PARTID_1_3
...                            ├── ...                ├── ...
M1_m   0       0+n*(m-1)       └── reqPARTID_1_m      └── PARTID_1_m
P2     1                       intPARTID_2            PARTID_2
M2_1   1       1               ├── reqPARTID_2_1      ├── PARTID_2
M2_2   1       1+n             ├── reqPARTID_2_2      ├── PARTID_2_2
M2_3   1       1+n*2           ├── reqPARTID_2_3      ├── PARTID_2_3
...                            ├── ...                ├── ...
M2_m   1       1+n*(m-1)       └── reqPARTID_2_m      └── PARTID_2_m
Pn     n-1                     intPARTID_n            PARTID_n
Mn_1   n-1     n-1             ├── reqPARTID_n_1      ├── PARTID_n
Mn_2   n-1     n-1+n           ├── reqPARTID_n_2      ├── PARTID_n_2
Mn_3   n-1     n-1+n*2         ├── reqPARTID_n_3      ├── PARTID_n_3
...                            ├── ...                ├── ...
Mn_m   n-1     n*m-1           └── reqPARTID_n_m      └── PARTID_n_m
Refactor the glue layer between resctrl abstractions (rmid) and MPAM
hardware registers (reqPARTID/PMG) to support Narrow-PARTID. The resctrl
layer uses rmid2reqpartid() and rmid2pmg() to extract components from
rmid. The closid-to-intPARTID translation remains unchanged via
resctrl_get_config_index().
Since Narrow-PARTID is a monitoring enhancement, reqPARTID is only
used in monitoring paths while configuration paths maintain the original
semantics of closid.
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
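Not part of the commit: a userspace sketch of the closid check performed by
the reworked resctrl_arch_rmid_idx_encode(). The constants and demo_* names
are hypothetical, and CDP is assumed to be disabled so that closid equals
intPARTID.

```c
#include <stdint.h>

#define DEMO_NUM_PMG       4u	/* stands for mpam_pmg_max + 1 */
#define DEMO_NUM_INTPARTID 4u	/* stands for mpam_intpartid_max + 1 */

static uint32_t demo_rmid2reqpartid(uint32_t rmid)
{
	return rmid / DEMO_NUM_PMG;
}

static uint32_t demo_req2intpartid(uint32_t reqpartid)
{
	return reqpartid % DEMO_NUM_INTPARTID;
}

/* Return rmid if it belongs to closid, otherwise UINT32_MAX as a marker. */
static uint32_t demo_rmid_idx_encode(uint32_t closid, uint32_t rmid)
{
	if (demo_req2intpartid(demo_rmid2reqpartid(rmid)) != closid)
		return UINT32_MAX;

	return rmid;
}
```

Here rmid 23 carries reqPARTID 5, which maps to intPARTID 1, so encoding it
for closid 1 succeeds while encoding it for closid 2 yields the invalid
marker and the entry is skipped by the free-rmid search.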
arch/arm64/include/asm/mpam.h | 12 +++-
drivers/resctrl/mpam_resctrl.c | 101 +++++++++++++++++++++++----------
2 files changed, 82 insertions(+), 31 deletions(-)
diff --git a/arch/arm64/include/asm/mpam.h b/arch/arm64/include/asm/mpam.h
index 70d396e7b6da..40d60b0ab0fd 100644
--- a/arch/arm64/include/asm/mpam.h
+++ b/arch/arm64/include/asm/mpam.h
@@ -63,6 +63,15 @@ static inline void mpam_set_task_partid_pmg(struct task_struct *tsk,
WRITE_ONCE(task_thread_info(tsk)->mpam_partid_pmg, regval);
}
+u16 req2intpartid(u16 reqpartid);
+
+static inline u16 mpam_get_regval_partid(u64 regval)
+{
+ u16 reqpartid = (regval & MPAMSM_EL1_PARTID_D_MASK) >> MPAMSM_EL1_PARTID_D_SHIFT;
+
+ return req2intpartid(reqpartid);
+}
+
static inline void mpam_thread_switch(struct task_struct *tsk)
{
u64 oldregval;
@@ -72,7 +81,8 @@ static inline void mpam_thread_switch(struct task_struct *tsk)
if (!static_branch_likely(&mpam_enabled))
return;
- if (regval == READ_ONCE(arm64_mpam_global_default))
+ if (regval == READ_ONCE(arm64_mpam_global_default) ||
+ !mpam_get_regval_partid(regval))
regval = READ_ONCE(per_cpu(arm64_mpam_default, cpu));
oldregval = READ_ONCE(per_cpu(arm64_mpam_current, cpu));
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 56859f354efa..9d0a7a4dffd1 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -303,15 +303,60 @@ u32 resctrl_arch_system_num_rmid_idx(void)
return (mpam_pmg_max + 1) * get_num_reqpartid();
}
+static u16 rmid2reqpartid(u32 rmid)
+{
+ rmid /= (mpam_pmg_max + 1);
+
+ if (cdp_enabled)
+ return resctrl_get_config_index(rmid, CDP_DATA);
+
+ return resctrl_get_config_index(rmid, CDP_NONE);
+}
+
+static u8 rmid2pmg(u32 rmid)
+{
+ return rmid % (mpam_pmg_max + 1);
+}
+
+u16 req2intpartid(u16 reqpartid)
+{
+ return reqpartid % (mpam_intpartid_max + 1);
+}
+
+/*
+ * To avoid the reuse of rmid across multiple control groups, check
+ * the incoming closid to prevent rmid from being reallocated by
+ * resctrl_find_free_rmid().
+ *
+ * If the closid and rmid do not match upon inspection, immediately
+ * return an invalid rmid. A valid rmid must not exceed 24 bits.
+ */
u32 resctrl_arch_rmid_idx_encode(u32 closid, u32 rmid)
{
- return closid * (mpam_pmg_max + 1) + rmid;
+ u32 reqpartid = rmid2reqpartid(rmid);
+ u32 intpartid = req2intpartid(reqpartid);
+
+ if (cdp_enabled)
+ intpartid >>= 1;
+
+ if (closid != intpartid)
+ return U32_MAX;
+
+ return rmid;
}
void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid)
{
- *closid = idx / (mpam_pmg_max + 1);
- *rmid = idx % (mpam_pmg_max + 1);
+ u32 reqpartid = rmid2reqpartid(idx);
+ u32 intpartid = req2intpartid(reqpartid);
+
+ if (rmid)
+ *rmid = idx;
+ if (closid) {
+ if (cdp_enabled)
+ intpartid >>= 1;
+ *closid = intpartid;
+ }
}
void resctrl_arch_sched_in(struct task_struct *tsk)
@@ -323,21 +368,17 @@ void resctrl_arch_sched_in(struct task_struct *tsk)
void resctrl_arch_set_cpu_default_closid_rmid(int cpu, u32 closid, u32 rmid)
{
- WARN_ON_ONCE(closid > U16_MAX);
- WARN_ON_ONCE(rmid > U8_MAX);
+ u32 reqpartid = rmid2reqpartid(rmid);
+ u8 pmg = rmid2pmg(rmid);
- if (!cdp_enabled) {
- mpam_set_cpu_defaults(cpu, closid, closid, rmid, rmid);
- } else {
+ if (!cdp_enabled)
+ mpam_set_cpu_defaults(cpu, reqpartid, reqpartid, pmg, pmg);
+ else
/*
* When CDP is enabled, resctrl halves the closid range and we
* use odd/even partid for one closid.
*/
- u32 partid_d = resctrl_get_config_index(closid, CDP_DATA);
- u32 partid_i = resctrl_get_config_index(closid, CDP_CODE);
-
- mpam_set_cpu_defaults(cpu, partid_d, partid_i, rmid, rmid);
- }
+ mpam_set_cpu_defaults(cpu, reqpartid, reqpartid + 1, pmg, pmg);
}
void resctrl_arch_sync_cpu_closid_rmid(void *info)
@@ -356,17 +397,16 @@ void resctrl_arch_sync_cpu_closid_rmid(void *info)
void resctrl_arch_set_closid_rmid(struct task_struct *tsk, u32 closid, u32 rmid)
{
- WARN_ON_ONCE(closid > U16_MAX);
- WARN_ON_ONCE(rmid > U8_MAX);
+ u32 reqpartid = rmid2reqpartid(rmid);
+ u8 pmg = rmid2pmg(rmid);
- if (!cdp_enabled) {
- mpam_set_task_partid_pmg(tsk, closid, closid, rmid, rmid);
- } else {
- u32 partid_d = resctrl_get_config_index(closid, CDP_DATA);
- u32 partid_i = resctrl_get_config_index(closid, CDP_CODE);
+ WARN_ON_ONCE(reqpartid > U16_MAX);
+ WARN_ON_ONCE(pmg > U8_MAX);
- mpam_set_task_partid_pmg(tsk, partid_d, partid_i, rmid, rmid);
- }
+ if (!cdp_enabled)
+ mpam_set_task_partid_pmg(tsk, reqpartid, reqpartid, pmg, pmg);
+ else
+ mpam_set_task_partid_pmg(tsk, reqpartid, reqpartid + 1, pmg, pmg);
}
bool resctrl_arch_match_closid(struct task_struct *tsk, u32 closid)
@@ -374,6 +414,8 @@ bool resctrl_arch_match_closid(struct task_struct *tsk, u32 closid)
u64 regval = mpam_get_regval(tsk);
u32 tsk_closid = FIELD_GET(MPAM0_EL1_PARTID_D, regval);
+ tsk_closid = req2intpartid(tsk_closid);
+
if (cdp_enabled)
tsk_closid >>= 1;
@@ -384,13 +426,11 @@ bool resctrl_arch_match_closid(struct task_struct *tsk, u32 closid)
bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 closid, u32 rmid)
{
u64 regval = mpam_get_regval(tsk);
- u32 tsk_closid = FIELD_GET(MPAM0_EL1_PARTID_D, regval);
- u32 tsk_rmid = FIELD_GET(MPAM0_EL1_PMG_D, regval);
-
- if (cdp_enabled)
- tsk_closid >>= 1;
+ u32 tsk_partid = FIELD_GET(MPAM0_EL1_PARTID_D, regval);
+ u32 tsk_pmg = FIELD_GET(MPAM0_EL1_PMG_D, regval);
- return (tsk_closid == closid) && (tsk_rmid == rmid);
+ return (tsk_partid == rmid2reqpartid(rmid)) &&
+ (tsk_pmg == rmid2pmg(rmid));
}
struct rdt_resource *resctrl_arch_get_resource(enum resctrl_res_level l)
@@ -478,6 +518,7 @@ static int __read_mon(struct mpam_resctrl_mon *mon, struct mpam_component *mon_c
enum resctrl_conf_type cdp_type, u32 closid, u32 rmid, u64 *val)
{
struct mon_cfg cfg;
+ u32 reqpartid = rmid2reqpartid(rmid);
if (!mpam_is_enabled())
return -EINVAL;
@@ -493,8 +534,8 @@ static int __read_mon(struct mpam_resctrl_mon *mon, struct mpam_component *mon_c
cfg = (struct mon_cfg) {
.mon = mon_idx,
.match_pmg = true,
- .partid = closid,
- .pmg = rmid,
+ .partid = (cdp_type == CDP_CODE) ? reqpartid + 1 : reqpartid,
+ .pmg = rmid2pmg(rmid),
};
return mpam_msmon_read(mon_comp, &cfg, mon_type, val);
--
2.25.1
^ permalink raw reply related
* [PATCH v8 next 07/10] fs/resctrl: Add rmid_entry state helpers
From: Zeng Heng @ 2026-04-13 8:54 UTC (permalink / raw)
To: ben.horgan, Dave.Martin, james.morse, tan.shaopeng,
reinette.chatre, fenghuay, tglx, will, hpa, bp, babu.moger,
dave.hansen, mingo, tony.luck, gshan, catalin.marinas
Cc: linux-arm-kernel, x86, linux-kernel, wangkefeng.wang
In-Reply-To: <20260413085405.1166412-1-zengheng4@huawei.com>
Introduce helper functions for rmid_entry management, in preparation
for upcoming patches supporting dynamic monitoring group allocation:
* rmid_is_occupied(): Query whether an RMID entry is currently allocated
by checking if its list node has been removed from the free list.
* rmid_entry_reassign_closid(): Update the closid associated with an RMID
entry.
Fix list node initialization in alloc_rmid() and setup_rmid_lru_list() by
using list_del_init() instead of list_del(). This ensures that the
list_empty() check in rmid_is_occupied() works correctly instead of
reading LIST_POISON values.
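As an illustration only, the difference between the two deletion helpers
can be sketched in userspace. The names below (list_node, node_del(),
node_del_init(), entry_is_occupied()) are hypothetical stand-ins and not
kernel API; NULL stands in for the LIST_POISON pointers that the kernel's
list_del() stores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void node_init(struct list_node *n)
{
	n->next = n;
	n->prev = n;
}

/* A node (or list head) is "empty" when it points at itself. */
static bool node_empty(const struct list_node *n)
{
	return n->next == n;
}

static void node_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Like list_del(): unlink and leave the node's own pointers unusable. */
static void node_del(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->next = NULL;	/* the kernel stores poison values here */
	n->prev = NULL;
}

/* Like list_del_init(): unlink and make the node self-linked again. */
static void node_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

/* Hypothetical analogue of rmid_is_occupied() for a single entry. */
static bool entry_is_occupied(const struct list_node *entry)
{
	/* Would trip for a node removed with node_del(). */
	assert(entry->next != NULL);
	return node_empty(entry);
}
```

With node_del_init() an entry taken off the free list reports as
occupied, while an entry still on the free list does not; after
node_del() the emptiness test is not meaningful at all.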
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
fs/resctrl/monitor.c | 18 ++++++++++++++++--
include/linux/resctrl.h | 21 +++++++++++++++++++++
2 files changed, 37 insertions(+), 2 deletions(-)
diff --git a/fs/resctrl/monitor.c b/fs/resctrl/monitor.c
index 9fd901c78dc6..7473c43600cf 100644
--- a/fs/resctrl/monitor.c
+++ b/fs/resctrl/monitor.c
@@ -286,7 +286,7 @@ int alloc_rmid(u32 closid)
if (IS_ERR(entry))
return PTR_ERR(entry);
- list_del(&entry->list);
+ list_del_init(&entry->list);
return entry->rmid;
}
@@ -346,6 +346,20 @@ void free_rmid(u32 closid, u32 rmid)
list_add_tail(&entry->list, &rmid_free_lru);
}
+bool rmid_is_occupied(u32 closid, u32 rmid)
+{
+ u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid);
+
+ return list_empty(&rmid_ptrs[idx].list);
+}
+
+void rmid_entry_reassign_closid(u32 closid, u32 rmid)
+{
+ u32 idx = resctrl_arch_rmid_idx_encode(closid, rmid);
+
+ rmid_ptrs[idx].closid = closid;
+}
+
static struct mbm_state *get_mbm_state(struct rdt_l3_mon_domain *d, u32 closid,
u32 rmid, enum resctrl_event_id evtid)
{
@@ -945,7 +959,7 @@ int setup_rmid_lru_list(void)
idx = resctrl_arch_rmid_idx_encode(RESCTRL_RESERVED_CLOSID,
RESCTRL_RESERVED_RMID);
entry = __rmid_entry(idx);
- list_del(&entry->list);
+ list_del_init(&entry->list);
return 0;
}
diff --git a/include/linux/resctrl.h b/include/linux/resctrl.h
index 006e57fd7ca5..b636e7250c20 100644
--- a/include/linux/resctrl.h
+++ b/include/linux/resctrl.h
@@ -702,6 +702,27 @@ bool resctrl_arch_get_io_alloc_enabled(struct rdt_resource *r);
extern unsigned int resctrl_rmid_realloc_threshold;
extern unsigned int resctrl_rmid_realloc_limit;
+/**
+ * rmid_is_occupied() - Check whether the specified rmid has been
+ * allocated.
+ * @closid: Specify the closid that matches the rmid.
+ * @rmid: Specify the rmid entry to check status.
+ *
+ * This function checks if the rmid_entry is currently allocated by testing
+ * whether its list node is empty (removed from the free list).
+ *
+ * Return:
+ * True if the specified rmid is still in use.
+ */
+bool rmid_is_occupied(u32 closid, u32 rmid);
+
+/**
+ * rmid_entry_reassign_closid() - Update the closid field of a rmid_entry.
+ * @closid: Specify the reassigned closid.
+ * @rmid: Specify the rmid entry to update closid.
+ */
+void rmid_entry_reassign_closid(u32 closid, u32 rmid);
+
int resctrl_init(void);
void resctrl_exit(void);
--
2.25.1
^ permalink raw reply related
* [PATCH v8 next 03/10] arm_mpam: Disable reqPARTID expansion when Narrow-PARTID is unavailable
From: Zeng Heng @ 2026-04-13 8:53 UTC (permalink / raw)
To: ben.horgan, Dave.Martin, james.morse, tan.shaopeng,
reinette.chatre, fenghuay, tglx, will, hpa, bp, babu.moger,
dave.hansen, mingo, tony.luck, gshan, catalin.marinas
Cc: linux-arm-kernel, x86, linux-kernel, wangkefeng.wang
In-Reply-To: <20260413085405.1166412-1-zengheng4@huawei.com>
MPAM supports heterogeneous systems where some types of MSCs implement
Narrow-PARTID while others do not. However, when an MSC uses
percentage-based throttling (non-bitmap partition control) and lacks
Narrow-PARTID support, resctrl cannot correctly apply control group
configurations across multiple PARTIDs.
To enable free assignment of multiple reqPARTIDs to resource control
groups, every MSC used by resctrl must either implement Narrow-PARTID,
allowing explicit PARTID remapping, or have only stateless resource
controls (non-percentage-based), such that splitting a control group
across multiple PARTIDs does not affect behavior.
The detection occurs at initialization time on the first call to
get_num_reqpartid() from update_rmid_limits(). This call is guaranteed
to occur after mpam_resctrl_pick_{mba,caches}() have set up the
resource classes, ensuring the necessary properties are available
for the Narrow-PARTID capability check.
When an MSC with percentage-based control lacks Narrow-PARTID support,
get_num_reqpartid() falls back to returning the number of intPARTIDs,
effectively disabling the reqPARTID expansion for monitoring groups.
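The fallback rule described above can be sketched in userspace as
follows. This is a simplified model, not the driver code: struct
ctrl_class and num_reqpartid() are hypothetical, and the three booleans
stand in for the class pointer check and the mpam_has_feature() tests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one resctrl control class. */
struct ctrl_class {
	bool present;		/* a class was picked for this resource */
	bool has_partid_nrw;	/* its MSCs implement Narrow-PARTID */
	bool has_pct_ctrl;	/* it has max/min style (percentage) controls */
};

/*
 * Return the number of reqPARTIDs usable for monitoring: the full
 * PARTID range, unless some class has percentage-based controls
 * without Narrow-PARTID, in which case fall back to the intPARTIDs.
 */
static unsigned int num_reqpartid(const struct ctrl_class *classes, size_t n,
				  unsigned int partid_max,
				  unsigned int intpartid_max)
{
	size_t i;

	assert(intpartid_max <= partid_max);

	for (i = 0; i < n; i++) {
		const struct ctrl_class *c = &classes[i];

		if (!c->present || c->has_partid_nrw)
			continue;

		if (c->has_pct_ctrl)
			return intpartid_max + 1;
	}

	return partid_max + 1;
}
```

For example, with 256 PARTIDs and 32 intPARTIDs, a bitmap-only cache
class without Narrow-PARTID keeps all 256 reqPARTIDs, whereas a
bandwidth class with max/min controls and no Narrow-PARTID reduces the
result to 32.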
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
drivers/resctrl/mpam_resctrl.c | 43 +++++++++++++++++++++++++++++++++-
1 file changed, 42 insertions(+), 1 deletion(-)
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 5f4364c8101a..56859f354efa 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -257,9 +257,50 @@ u32 resctrl_arch_get_num_closid(struct rdt_resource *ignored)
return mpam_intpartid_max + 1;
}
+/*
+ * Determine the effective number of PARTIDs available for resctrl.
+ *
+ * This function performs a one-time check to determine if Narrow-PARTID
+ * can be used. It must be called after mpam_resctrl_pick_{mba,caches}()
+ * have initialized the resource classes, as class properties are used
+ * to detect Narrow-PARTID support.
+ *
+ * The first call occurs in update_rmid_limits(), ensuring the
+ * prerequisite initialization is complete.
+ */
+static u32 get_num_reqpartid(void)
+{
+ struct mpam_resctrl_res *res;
+ struct mpam_props *cprops;
+ static bool first = true;
+ int rid;
+
+ if (first) {
+ for_each_mpam_resctrl_control(res, rid) {
+ if (!res->class)
+ continue;
+
+ cprops = &res->class->props;
+ if (mpam_has_feature(mpam_feat_partid_nrw, cprops))
+ continue;
+
+ if (mpam_has_feature(mpam_feat_mbw_max, cprops) ||
+ mpam_has_feature(mpam_feat_mbw_min, cprops) ||
+ mpam_has_feature(mpam_feat_cmax_cmax, cprops) ||
+ mpam_has_feature(mpam_feat_cmax_cmin, cprops)) {
+ mpam_partid_max = mpam_intpartid_max;
+ break;
+ }
+ }
+ }
+
+ first = false;
+ return mpam_partid_max + 1;
+}
+
u32 resctrl_arch_system_num_rmid_idx(void)
{
- return (mpam_pmg_max + 1) * (mpam_partid_max + 1);
+ return (mpam_pmg_max + 1) * get_num_reqpartid();
}
u32 resctrl_arch_rmid_idx_encode(u32 closid, u32 rmid)
--
2.25.1
^ permalink raw reply related
* [PATCH v8 next 08/10] arm_mpam: Implement dynamic reqPARTID allocation for monitoring groups
From: Zeng Heng @ 2026-04-13 8:54 UTC (permalink / raw)
To: ben.horgan, Dave.Martin, james.morse, tan.shaopeng,
reinette.chatre, fenghuay, tglx, will, hpa, bp, babu.moger,
dave.hansen, mingo, tony.luck, gshan, catalin.marinas
Cc: linux-arm-kernel, x86, linux-kernel, wangkefeng.wang
In-Reply-To: <20260413085405.1166412-1-zengheng4@huawei.com>
Replace static reqPARTID allocation with dynamic binding to maximize
the monitoring group utilization. Static allocation wastes resources when
control groups create fewer sub-groups than the pre-allocated limit.
Add a lookup table (reqpartid_map) to dynamically bind reqPARTIDs to
control groups needing extended monitoring capacity:
* resctrl_arch_rmid_expand(): Find and bind a free reqPARTID to the
specified closid when creating monitoring groups.
* resctrl_arch_rmid_reclaim(): Unbind the reqPARTID once all monitoring
groups (PMGs) under it are freed, making it available for reuse.
Update conversion helpers for dynamic mapping:
* req2intpartid() switches to a lookup table for dynamic allocation.
* Add partid2closid() and req_pmg2rmid() helpers.
Refactor __write_config() to iterate over all reqPARTIDs that match
by intPARTID, removing the fixed per-closid slot assumption.
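A minimal userspace sketch of the binding scheme, assuming CDP is
disabled and ignoring the PMG occupancy check. The table size, the
constants and the map_*() helpers are hypothetical; the only convention
taken from the patch is that an identity entry (map[i] == i) above the
intPARTID range marks reqPARTID i as free.

```c
#include <assert.h>

#define NUM_CLOSID	4u	/* hypothetical: intPARTIDs 0..3 */
#define NUM_REQPARTID	8u	/* hypothetical: reqPARTIDs 0..7 */

/* reqPARTID -> intPARTID lookup table. */
static unsigned int reqpartid_map_sketch[NUM_REQPARTID];

static void map_init(void)
{
	unsigned int i;

	/* Identity mapping: nothing above the intPARTID range is bound. */
	for (i = 0; i < NUM_REQPARTID; i++)
		reqpartid_map_sketch[i] = i;
}

static unsigned int map_lookup(unsigned int reqpartid)
{
	assert(reqpartid < NUM_REQPARTID);
	return reqpartid_map_sketch[reqpartid];
}

/* Bind the first free reqPARTID to closid; return it, or -1 if none. */
static int map_expand(unsigned int closid)
{
	unsigned int i;

	assert(closid < NUM_CLOSID);

	for (i = NUM_CLOSID; i < NUM_REQPARTID; i++) {
		/* An entry outside the intPARTID range means "free". */
		if (reqpartid_map_sketch[i] >= NUM_CLOSID) {
			reqpartid_map_sketch[i] = closid;
			return (int)i;
		}
	}

	return -1;
}

/* Unbind a reqPARTID; the intPARTID range itself is never unbound. */
static void map_reclaim(unsigned int reqpartid)
{
	assert(reqpartid < NUM_REQPARTID);

	if (reqpartid < NUM_CLOSID)
		return;

	reqpartid_map_sketch[reqpartid] = reqpartid;
}
```

Because free slots are found by scanning, a control group that creates
many monitoring groups can take several reqPARTIDs while another takes
none, which is the utilization gain over a fixed per-closid split.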
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
drivers/resctrl/mpam_devices.c | 21 ++---
drivers/resctrl/mpam_internal.h | 2 +
drivers/resctrl/mpam_resctrl.c | 141 +++++++++++++++++++++++++++++---
include/linux/arm_mpam.h | 17 ++++
4 files changed, 157 insertions(+), 24 deletions(-)
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 2aeff798a865..cf94b45b4f9e 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -1764,27 +1764,24 @@ struct mpam_write_config_arg {
u16 partid;
};
-static u32 get_num_reqpartid_per_intpartid(void)
-{
- return (mpam_partid_max + 1) / (mpam_intpartid_max + 1);
-}
-
static int __write_config(void *arg)
{
int closid_num = resctrl_arch_get_num_closid(NULL);
struct mpam_write_config_arg *c = arg;
- u32 reqpartid, req_idx;
+ u32 reqpartid;
/* c->partid should be within the range of intPARTIDs */
WARN_ON_ONCE(c->partid >= closid_num);
- /* Synchronize the configuration to each sub-monitoring group. */
- for (req_idx = 0; req_idx < get_num_reqpartid_per_intpartid();
- req_idx++) {
- reqpartid = req_idx * closid_num + c->partid;
+ mpam_reprogram_ris_partid(c->ris, c->partid,
+ &c->comp->cfg[c->partid]);
- mpam_reprogram_ris_partid(c->ris, reqpartid,
- &c->comp->cfg[c->partid]);
+ /* Synchronize the configuration to each sub-monitoring group. */
+ for (reqpartid = closid_num;
+ reqpartid < get_num_reqpartid(); reqpartid++) {
+ if (req2intpartid(reqpartid) == c->partid)
+ mpam_reprogram_ris_partid(c->ris, reqpartid,
+ &c->comp->cfg[c->partid]);
}
return 0;
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index 790a90a5ccd9..16ce968344d0 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -473,6 +473,8 @@ void mpam_msmon_reset_mbwu(struct mpam_component *comp, struct mon_cfg *ctx);
int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
cpumask_t *affinity);
+u32 get_num_reqpartid(void);
+
#ifdef CONFIG_RESCTRL_FS
int mpam_resctrl_setup(void);
void mpam_resctrl_exit(void);
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 9d0a7a4dffd1..2762462d80e5 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -268,7 +268,7 @@ u32 resctrl_arch_get_num_closid(struct rdt_resource *ignored)
* The first call occurs in update_rmid_limits(), ensuring the
* prerequisite initialization is complete.
*/
-static u32 get_num_reqpartid(void)
+u32 get_num_reqpartid(void)
{
struct mpam_resctrl_res *res;
struct mpam_props *cprops;
@@ -318,9 +318,34 @@ static u8 rmid2pmg(u32 rmid)
return rmid % (mpam_pmg_max + 1);
}
+static u32 req_pmg2rmid(u32 reqpartid, u8 pmg)
+{
+ if (cdp_enabled)
+ reqpartid >>= 1;
+
+ return reqpartid * (mpam_pmg_max + 1) + pmg;
+}
+
+static u32 *reqpartid_map;
+
u16 req2intpartid(u16 reqpartid)
{
- return reqpartid % (mpam_intpartid_max + 1);
+ /*
+ * Return intPARTIDs directly, without consulting reqpartid_map, so that
+ * mpam_reset_ris() cannot dereference a NULL pointer.
+ */
+ if (reqpartid < (mpam_intpartid_max + 1))
+ return reqpartid;
+
+ return reqpartid_map[reqpartid];
+}
+
+static u32 partid2closid(u32 partid)
+{
+ if (cdp_enabled)
+ partid >>= 1;
+
+ return partid;
}
/*
@@ -334,12 +359,12 @@ u16 req2intpartid(u16 reqpartid)
u32 resctrl_arch_rmid_idx_encode(u32 closid, u32 rmid)
{
u32 reqpartid = rmid2reqpartid(rmid);
- u32 intpartid = req2intpartid(reqpartid);
- if (cdp_enabled)
- intpartid >>= 1;
+ /* With CDP enabled, filter out invalid rmid entries */
+ if (reqpartid >= get_num_reqpartid())
+ return U32_MAX;
- if (closid != intpartid)
+ if (closid != partid2closid(req2intpartid(reqpartid)))
return U32_MAX;
return rmid;
@@ -352,11 +377,9 @@ void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid)
if (rmid)
*rmid = idx;
- if (closid) {
- if (cdp_enabled)
- intpartid >>= 1;
- *closid = intpartid;
- }
+
+ if (closid)
+ *closid = partid2closid(intpartid);
}
void resctrl_arch_sched_in(struct task_struct *tsk)
@@ -1665,6 +1688,93 @@ void mpam_resctrl_offline_cpu(unsigned int cpu)
}
}
+static int reqpartid_init(void)
+{
+ int req_num, idx;
+
+ req_num = get_num_reqpartid();
+ reqpartid_map = kcalloc(req_num, sizeof(u32), GFP_KERNEL);
+ if (!reqpartid_map)
+ return -ENOMEM;
+
+ for (idx = 0; idx < req_num; idx++)
+ reqpartid_map[idx] = idx;
+
+ return 0;
+}
+
+static void reqpartid_exit(void)
+{
+ kfree(reqpartid_map);
+}
+
+static void update_rmid_entries_for_reqpartid(u32 reqpartid)
+{
+ int pmg;
+ u32 intpartid = reqpartid_map[reqpartid];
+ u32 closid = partid2closid(intpartid);
+
+ for (pmg = 0; pmg <= mpam_pmg_max; pmg++)
+ rmid_entry_reassign_closid(closid, req_pmg2rmid(reqpartid, pmg));
+}
+
+int resctrl_arch_rmid_expand(u32 closid)
+{
+ int i;
+
+ for (i = resctrl_arch_get_num_closid(NULL);
+ i < get_num_reqpartid(); i++) {
+
+ /* Here means the reqpartid 'i' is free. */
+ if (reqpartid_map[i] >= resctrl_arch_get_num_closid(NULL)) {
+ if (cdp_enabled) {
+ reqpartid_map[i] = resctrl_get_config_index(closid, CDP_DATA);
+ /*
+ * Reqpartids are always allocated in
+ * pairs, never out-of-bounds access.
+ */
+ reqpartid_map[i + 1] = resctrl_get_config_index(closid, CDP_CODE);
+ } else {
+ reqpartid_map[i] = resctrl_get_config_index(closid, CDP_NONE);
+ }
+ update_rmid_entries_for_reqpartid(i);
+ return i;
+ }
+ }
+
+ return -ENOSPC;
+}
+
+void resctrl_arch_rmid_reclaim(u32 closid, u32 rmid)
+{
+ int pmg;
+ u32 intpartid;
+ int reqpartid = rmid2reqpartid(rmid);
+
+ if (reqpartid < resctrl_arch_get_num_closid(NULL))
+ return;
+
+ if (cdp_enabled)
+ intpartid = resctrl_get_config_index(closid, CDP_DATA);
+ else
+ intpartid = resctrl_get_config_index(closid, CDP_NONE);
+
+ WARN_ON_ONCE(intpartid != req2intpartid(reqpartid));
+
+ for (pmg = 0; pmg <= mpam_pmg_max; pmg++) {
+ if (rmid_is_occupied(closid, req_pmg2rmid(reqpartid, pmg)))
+ break;
+ }
+
+ if (pmg > mpam_pmg_max) {
+ reqpartid_map[reqpartid] = reqpartid;
+ if (cdp_enabled)
+ reqpartid_map[reqpartid + 1] = reqpartid + 1;
+
+ update_rmid_entries_for_reqpartid(reqpartid);
+ }
+}
+
int mpam_resctrl_setup(void)
{
int err = 0;
@@ -1720,10 +1830,16 @@ int mpam_resctrl_setup(void)
return -EOPNOTSUPP;
}
- err = resctrl_init();
+ err = reqpartid_init();
if (err)
return err;
+ err = resctrl_init();
+ if (err) {
+ reqpartid_exit();
+ return err;
+ }
+
WRITE_ONCE(resctrl_enabled, true);
return 0;
@@ -1741,6 +1857,7 @@ void mpam_resctrl_exit(void)
WRITE_ONCE(resctrl_enabled, false);
resctrl_exit();
+ reqpartid_exit();
}
/*
diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h
index f92a36187a52..d45422965907 100644
--- a/include/linux/arm_mpam.h
+++ b/include/linux/arm_mpam.h
@@ -59,6 +59,23 @@ void resctrl_arch_set_cpu_default_closid_rmid(int cpu, u32 closid, u32 rmid);
void resctrl_arch_sched_in(struct task_struct *tsk);
bool resctrl_arch_match_closid(struct task_struct *tsk, u32 closid);
bool resctrl_arch_match_rmid(struct task_struct *tsk, u32 closid, u32 rmid);
+
+/**
+ * resctrl_arch_rmid_expand() - Expand the RMID resources for the specified closid.
+ * @closid: closid that matches the rmid.
+ *
+ * Return:
+ * The newly bound reqPARTID on success, or -ENOSPC if none is free.
+ */
+int resctrl_arch_rmid_expand(u32 closid);
+
+/**
+ * resctrl_arch_rmid_reclaim() - Reclaim the rmid resources for the specified closid.
+ * @closid: closid that matches the rmid.
+ * @rmid: Reclaim the rmid specified.
+ */
+void resctrl_arch_rmid_reclaim(u32 closid, u32 rmid);
+
u32 resctrl_arch_rmid_idx_encode(u32 closid, u32 rmid);
void resctrl_arch_rmid_idx_decode(u32 idx, u32 *closid, u32 *rmid);
u32 resctrl_arch_system_num_rmid_idx(void);
--
2.25.1
^ permalink raw reply related
* [PATCH v8 next 06/10] arm_mpam: Add boot parameter to limit mpam_intpartid_max
From: Zeng Heng @ 2026-04-13 8:54 UTC (permalink / raw)
To: ben.horgan, Dave.Martin, james.morse, tan.shaopeng,
reinette.chatre, fenghuay, tglx, will, hpa, bp, babu.moger,
dave.hansen, mingo, tony.luck, gshan, catalin.marinas
Cc: linux-arm-kernel, x86, linux-kernel, wangkefeng.wang
In-Reply-To: <20260413085405.1166412-1-zengheng4@huawei.com>
Add a new boot parameter "intpartid_max" to allow system administrators
to limit the number of internal PARTIDs used by the MPAM driver for
resource control groups. This provides flexibility to configure the
trade-off between:
* Number of resource control groups (CLOSIDs, limited by intpartid_max)
* Number of monitoring groups (RMIDs, limited by reqpartid and intpartid)
By default, the driver uses all available intPARTIDs for control groups.
With this parameter, users are able to reserve internal PARTIDs to create
additional sub-monitoring groups (provided that the narrow PARTID feature
is successfully enabled).
Example usage:
mpam.intpartid_max=7
The value is the maximum internal PARTID, i.e. the number of internal
PARTIDs minus one (the example above allows 8), and is applied as an
upper limit at initialization time.
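As a rough sketch of how the limit combines with the hardware value:
the helper names below are hypothetical, and only the min() step and
the "+ 1" conversion to a CLOSID count are taken from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Clamp the hardware-discovered maximum intPARTID by the boot
 * parameter. The default of USHRT_MAX leaves the hardware value
 * untouched.
 */
static uint16_t apply_intpartid_limit(uint16_t hw_intpartid_max,
				      uint16_t boot_intpartid_max)
{
	/* The parameter setter in the patch rejects values below 1. */
	assert(boot_intpartid_max >= 1);

	return hw_intpartid_max < boot_intpartid_max ?
	       hw_intpartid_max : boot_intpartid_max;
}

/* Number of control groups (CLOSIDs) for a given maximum intPARTID. */
static unsigned int num_closids(uint16_t intpartid_max)
{
	return (unsigned int)intpartid_max + 1;
}
```

With 32 hardware intPARTIDs (maximum 31), mpam.intpartid_max=7 would
leave 8 control groups; a parameter larger than the hardware maximum
has no effect.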
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
drivers/resctrl/mpam_devices.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 8fbc8f9f9688..2aeff798a865 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -3,6 +3,9 @@
#define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__
+#undef KBUILD_MODNAME
+#define KBUILD_MODNAME "mpam"
+
#include <linux/acpi.h>
#include <linux/atomic.h>
#include <linux/arm_mpam.h>
@@ -18,6 +21,7 @@
#include <linux/irq.h>
#include <linux/irqdesc.h>
#include <linux/list.h>
+#include <linux/moduleparam.h>
#include <linux/lockdep.h>
#include <linux/mutex.h>
#include <linux/platform_device.h>
@@ -65,6 +69,7 @@ static DEFINE_MUTEX(mpam_cpuhp_state_lock);
u16 mpam_partid_max;
u16 mpam_intpartid_max;
u8 mpam_pmg_max;
+static u16 bootparam_intpartid_max = USHRT_MAX;
static bool partid_max_init, partid_max_published;
static DEFINE_SPINLOCK(partid_max_lock);
@@ -2725,6 +2730,7 @@ static void mpam_enable_once(void)
* longer change.
*/
spin_lock(&partid_max_lock);
+ mpam_intpartid_max = min(mpam_intpartid_max, bootparam_intpartid_max);
partid_max_published = true;
spin_unlock(&partid_max_lock);
@@ -2779,6 +2785,20 @@ static void mpam_enable_once(void)
mpam_partid_max + 1, mpam_intpartid_max + 1, mpam_pmg_max + 1);
}
+static int mpam_intpartid_max_set(const char *val,
+ const struct kernel_param *kp)
+{
+ return param_set_uint_minmax(val, kp, 1, USHRT_MAX);
+}
+
+static const struct kernel_param_ops mpam_intpartid_max_ops = {
+ .set = mpam_intpartid_max_set,
+ .get = param_get_uint,
+};
+
+arch_param_cb(intpartid_max, &mpam_intpartid_max_ops,
+ &bootparam_intpartid_max, 0444);
+
static void mpam_reset_component_locked(struct mpam_component *comp)
{
struct mpam_vmsc *vmsc;
--
2.25.1
^ permalink raw reply related
* [PATCH v8 next 02/10] arm_mpam: Add intPARTID and reqPARTID support for Narrow-PARTID feature
From: Zeng Heng @ 2026-04-13 8:53 UTC (permalink / raw)
To: ben.horgan, Dave.Martin, james.morse, tan.shaopeng,
reinette.chatre, fenghuay, tglx, will, hpa, bp, babu.moger,
dave.hansen, mingo, tony.luck, gshan, catalin.marinas
Cc: linux-arm-kernel, x86, linux-kernel, wangkefeng.wang
In-Reply-To: <20260413085405.1166412-1-zengheng4@huawei.com>
Introduce Narrow-PARTID (partid_nrw) feature support, which enables
many-to-one mapping of request PARTIDs (reqPARTID) to internal PARTIDs
(intPARTID). This expands monitoring capability by allowing a single
control group to track more task types through multiple reqPARTIDs
per intPARTID, bypassing the PMG limit to some extent.
intPARTID: Internal PARTID used for control group configuration.
Configurations are synchronized to all reqPARTIDs mapped to the same
intPARTID. Count is indicated by MPAMF_PARTID_NRW_IDR.INTPARTID_MAX, or
defaults to PARTID count if Narrow-PARTID is unsupported.
reqPARTID: Request PARTID used to expand monitoring groups. Enables
a single control group to monitor more task types by mapping multiple
reqPARTIDs to one intPARTID, overcoming the PMG count limitation.
For systems with homogeneous MSCs (all supporting Narrow-PARTID), the
driver exposes the full reqPARTID range directly. For heterogeneous
systems where some MSCs lack Narrow-PARTID support, the driver utilizes
PARTIDs beyond the intPARTID range as reqPARTIDs to expand monitoring
capacity.
So, the numbers of control groups and monitoring groups are calculated as:
n = min(intPARTID, PARTID) /* the number of control groups */
l = min(reqPARTID, PARTID) /* the number of monitoring groups */
m = l // n /* monitoring groups per control group */
Where:
intPARTID: intPARTIDs on Narrow-PARTID-capable MSCs
reqPARTID: reqPARTIDs on Narrow-PARTID-capable MSCs
PARTID: PARTIDs on non-Narrow-PARTID-capable MSCs
Example: L3 cache (256 PARTIDs, without Narrow-PARTID feature) +
MATA (32 intPARTIDs, 256 reqPARTIDs):
n = min( 32, 256) = 32 intPARTIDs
l = min(256, 256) = 256 reqPARTIDs
m = 256 / 32 = 8 reqPARTIDs per intPARTID
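The same arithmetic, written out as a small C sketch. The structure and
function names are hypothetical; the inputs are counts, not maximum
values.

```c
#include <assert.h>

struct group_counts {
	unsigned int ctrl_groups;	/* n */
	unsigned int mon_partids;	/* l */
	unsigned int per_ctrl_group;	/* m */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * intpartids/reqpartids: counts on Narrow-PARTID-capable MSCs.
 * partids: count on MSCs without Narrow-PARTID.
 */
static struct group_counts count_groups(unsigned int intpartids,
					unsigned int reqpartids,
					unsigned int partids)
{
	struct group_counts c;

	assert(intpartids > 0 && partids > 0);

	c.ctrl_groups = min_uint(intpartids, partids);
	c.mon_partids = min_uint(reqpartids, partids);
	c.per_ctrl_group = c.mon_partids / c.ctrl_groups;	/* integer division */

	return c;
}
```

Calling count_groups(32, 256, 256) reproduces the example: 32 control
groups, 256 reqPARTIDs, 8 reqPARTIDs per intPARTID.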
Implementation notes:
* Handle mixed MSC systems (some support Narrow-PARTID, some don't) by
taking the minimum number of intPARTIDs across all MSCs.
* resctrl_arch_get_num_closid() now returns the number of intPARTIDs
(was PARTID).
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
---
drivers/resctrl/mpam_devices.c | 30 ++++++++++++++++++++----------
drivers/resctrl/mpam_internal.h | 2 ++
drivers/resctrl/mpam_resctrl.c | 4 ++--
3 files changed, 24 insertions(+), 12 deletions(-)
diff --git a/drivers/resctrl/mpam_devices.c b/drivers/resctrl/mpam_devices.c
index 41b14344b16f..cf067bf5092e 100644
--- a/drivers/resctrl/mpam_devices.c
+++ b/drivers/resctrl/mpam_devices.c
@@ -63,6 +63,7 @@ static DEFINE_MUTEX(mpam_cpuhp_state_lock);
* Generating traffic outside this range will result in screaming interrupts.
*/
u16 mpam_partid_max;
+u16 mpam_intpartid_max;
u8 mpam_pmg_max;
static bool partid_max_init, partid_max_published;
static DEFINE_SPINLOCK(partid_max_lock);
@@ -288,10 +289,12 @@ int mpam_register_requestor(u16 partid_max, u8 pmg_max)
{
guard(spinlock)(&partid_max_lock);
if (!partid_max_init) {
+ mpam_intpartid_max = partid_max;
mpam_partid_max = partid_max;
mpam_pmg_max = pmg_max;
partid_max_init = true;
} else if (!partid_max_published) {
+ mpam_intpartid_max = min(mpam_intpartid_max, partid_max);
mpam_partid_max = min(mpam_partid_max, partid_max);
mpam_pmg_max = min(mpam_pmg_max, pmg_max);
} else {
@@ -931,7 +934,9 @@ static void mpam_ris_hw_probe(struct mpam_msc_ris *ris)
u16 partid_max = FIELD_GET(MPAMF_PARTID_NRW_IDR_INTPARTID_MAX, nrwidr);
mpam_set_feature(mpam_feat_partid_nrw, props);
- msc->partid_max = min(msc->partid_max, partid_max);
+ msc->intpartid_max = min(msc->partid_max, partid_max);
+ } else {
+ msc->intpartid_max = msc->partid_max;
}
}
@@ -995,6 +1000,7 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
spin_lock(&partid_max_lock);
mpam_partid_max = min(mpam_partid_max, msc->partid_max);
+ mpam_intpartid_max = min(mpam_intpartid_max, msc->intpartid_max);
mpam_pmg_max = min(mpam_pmg_max, msc->pmg_max);
spin_unlock(&partid_max_lock);
@@ -1711,7 +1717,7 @@ static int mpam_reset_ris(void *arg)
return 0;
spin_lock(&partid_max_lock);
- partid_max = mpam_partid_max;
+ partid_max = mpam_intpartid_max;
spin_unlock(&partid_max_lock);
for (partid = 0; partid <= partid_max; partid++)
mpam_reprogram_ris_partid(ris, partid, &reset_cfg);
@@ -1767,7 +1773,7 @@ static void mpam_reprogram_msc(struct mpam_msc *msc)
struct mpam_write_config_arg arg;
/*
- * No lock for mpam_partid_max as partid_max_published has been
+ * No lock for mpam_intpartid_max as partid_max_published has been
* set by mpam_enabled(), so the values can no longer change.
*/
mpam_assert_partid_sizes_fixed();
@@ -1784,7 +1790,7 @@ static void mpam_reprogram_msc(struct mpam_msc *msc)
arg.comp = ris->vmsc->comp;
arg.ris = ris;
reset = true;
- for (partid = 0; partid <= mpam_partid_max; partid++) {
+ for (partid = 0; partid <= mpam_intpartid_max; partid++) {
cfg = &ris->vmsc->comp->cfg[partid];
if (!bitmap_empty(cfg->features, MPAM_FEATURE_LAST))
reset = false;
@@ -2607,7 +2613,7 @@ static void mpam_reset_component_cfg(struct mpam_component *comp)
if (!comp->cfg)
return;
- for (i = 0; i <= mpam_partid_max; i++) {
+ for (i = 0; i <= mpam_intpartid_max; i++) {
comp->cfg[i] = (struct mpam_config) {};
if (cprops->cpbm_wd)
comp->cfg[i].cpbm = GENMASK(cprops->cpbm_wd - 1, 0);
@@ -2627,7 +2633,7 @@ static int __allocate_component_cfg(struct mpam_component *comp)
if (comp->cfg)
return 0;
- comp->cfg = kzalloc_objs(*comp->cfg, mpam_partid_max + 1);
+ comp->cfg = kzalloc_objs(*comp->cfg, mpam_intpartid_max + 1);
if (!comp->cfg)
return -ENOMEM;
@@ -2694,7 +2700,7 @@ static void mpam_enable_once(void)
int err;
/*
- * Once the cpuhp callbacks have been changed, mpam_partid_max can no
+ * Once the cpuhp callbacks have been changed, mpam_intpartid_max can no
* longer change.
*/
spin_lock(&partid_max_lock);
@@ -2743,9 +2749,13 @@ static void mpam_enable_once(void)
mpam_register_cpuhp_callbacks(mpam_cpu_online, mpam_cpu_offline,
"mpam:online");
- /* Use printk() to avoid the pr_fmt adding the function name. */
- printk(KERN_INFO "MPAM enabled with %u PARTIDs and %u PMGs\n",
- mpam_partid_max + 1, mpam_pmg_max + 1);
+ if (mpam_partid_max == mpam_intpartid_max)
+ /* Use printk() to avoid the pr_fmt adding the function name. */
+ printk(KERN_INFO "MPAM enabled with %u PARTIDs and %u PMGs\n",
+ mpam_partid_max + 1, mpam_pmg_max + 1);
+ else
+ printk(KERN_INFO "MPAM enabled with %u reqPARTIDs, %u intPARTIDs and %u PMGs\n",
+ mpam_partid_max + 1, mpam_intpartid_max + 1, mpam_pmg_max + 1);
}
static void mpam_reset_component_locked(struct mpam_component *comp)
diff --git a/drivers/resctrl/mpam_internal.h b/drivers/resctrl/mpam_internal.h
index 1914aefdcba9..790a90a5ccd9 100644
--- a/drivers/resctrl/mpam_internal.h
+++ b/drivers/resctrl/mpam_internal.h
@@ -81,6 +81,7 @@ struct mpam_msc {
*/
struct mutex probe_lock;
bool probed;
+ u16 intpartid_max;
u16 partid_max;
u8 pmg_max;
unsigned long ris_idxs;
@@ -452,6 +453,7 @@ extern struct list_head mpam_classes;
/* System wide partid/pmg values */
extern u16 mpam_partid_max;
+extern u16 mpam_intpartid_max;
extern u8 mpam_pmg_max;
/* Scheduled work callback to enable mpam once all MSC have been probed */
diff --git a/drivers/resctrl/mpam_resctrl.c b/drivers/resctrl/mpam_resctrl.c
index 161fb8905e28..5f4364c8101a 100644
--- a/drivers/resctrl/mpam_resctrl.c
+++ b/drivers/resctrl/mpam_resctrl.c
@@ -223,7 +223,7 @@ int resctrl_arch_set_cdp_enabled(enum resctrl_res_level rid, bool enable)
mpam_resctrl_controls[RDT_RESOURCE_MBA].resctrl_res.alloc_capable = false;
if (enable) {
- if (mpam_partid_max < 1)
+ if (mpam_intpartid_max < 1)
return -EINVAL;
partid_d = resctrl_get_config_index(RESCTRL_RESERVED_CLOSID, CDP_DATA);
@@ -254,7 +254,7 @@ static bool mpam_resctrl_hide_cdp(enum resctrl_res_level rid)
*/
u32 resctrl_arch_get_num_closid(struct rdt_resource *ignored)
{
- return mpam_partid_max + 1;
+ return mpam_intpartid_max + 1;
}
u32 resctrl_arch_system_num_rmid_idx(void)
--
2.25.1
^ permalink raw reply related
* [PATCH] arm64: dts: imx8ulp-evk: Correct Type-C int GPIO flags
From: Krzysztof Kozlowski @ 2026-04-13 9:07 UTC (permalink / raw)
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Frank Li,
Sascha Hauer, Pengutronix Kernel Team, Fabio Estevam, Shawn Guo,
Xu Yang, devicetree, imx, linux-arm-kernel, linux-kernel
Cc: Krzysztof Kozlowski, stable
IRQ_TYPE_xxx flags are not correct in the context of GPIO flags.
These are simple defines so they could be used in DTS but they will not
have the same meaning: IRQ_TYPE_EDGE_FALLING = 2 = GPIO_SINGLE_ENDED.
Correct the Type-C int-gpios to use proper flags, assuming the author of
the code wanted similar logical behavior:
IRQ_TYPE_EDGE_FALLING => GPIO_ACTIVE_LOW
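To make the mismatch concrete, here is a small C sketch that decodes
the third int-gpios cell as GPIO flags. The numeric values are the ones
from the dt-bindings headers (irq.h and gpio.h); the two helper
functions are hypothetical and only illustrate how a GPIO consumer
would read the bits.

```c
#include <assert.h>
#include <stdbool.h>

/* From include/dt-bindings/interrupt-controller/irq.h */
#define IRQ_TYPE_EDGE_FALLING	2

/* From include/dt-bindings/gpio/gpio.h */
#define GPIO_ACTIVE_HIGH	0
#define GPIO_ACTIVE_LOW		1
#define GPIO_SINGLE_ENDED	2

/* The IRQ flag aliases an unrelated GPIO flag bit. */
_Static_assert(IRQ_TYPE_EDGE_FALLING == GPIO_SINGLE_ENDED,
	       "IRQ_TYPE_EDGE_FALLING shares its value with GPIO_SINGLE_ENDED");

/* Hypothetical helpers: interpret a flags cell as GPIO flags. */
static bool flags_active_low(unsigned int flags)
{
	return (flags & GPIO_ACTIVE_LOW) != 0;
}

static bool flags_single_ended(unsigned int flags)
{
	return (flags & GPIO_SINGLE_ENDED) != 0;
}
```

Read as GPIO flags, IRQ_TYPE_EDGE_FALLING therefore describes an
active-high, single-ended line, while GPIO_ACTIVE_LOW describes an
active-low, push-pull one.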
Fixes: c4b4593ecb0b ("arm64: dts: imx8ulp-evk: enable usb nodes and add ptn5150 nodes")
Cc: <stable@vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com>
---
arch/arm64/boot/dts/freescale/imx8ulp-evk.dts | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/boot/dts/freescale/imx8ulp-evk.dts b/arch/arm64/boot/dts/freescale/imx8ulp-evk.dts
index 290a49bea2f7..5dea66c1e7aa 100644
--- a/arch/arm64/boot/dts/freescale/imx8ulp-evk.dts
+++ b/arch/arm64/boot/dts/freescale/imx8ulp-evk.dts
@@ -166,7 +166,7 @@ &lpi2c7 {
ptn5150_1: typec@1d {
compatible = "nxp,ptn5150";
reg = <0x1d>;
- int-gpios = <&gpiof 3 IRQ_TYPE_EDGE_FALLING>;
+ int-gpios = <&gpiof 3 GPIO_ACTIVE_LOW>;
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_typec1>;
status = "disabled";
@@ -182,7 +182,7 @@ pcal6408: gpio@21 {
ptn5150_2: typec@3d {
compatible = "nxp,ptn5150";
reg = <0x3d>;
- int-gpios = <&gpiof 5 IRQ_TYPE_EDGE_FALLING>;
+ int-gpios = <&gpiof 5 GPIO_ACTIVE_LOW>;
pinctrl-names = "default";
pinctrl-0 = <&pinctrl_typec2>;
status = "disabled";
--
2.51.0
^ permalink raw reply related
* Re: [PING] Re: [PATCH v7 0/2] arm64: dts/defconfig: enable BST C1200 eMMC
From: Krzysztof Kozlowski @ 2026-04-13 9:09 UTC (permalink / raw)
To: Albert Yang, arnd
Cc: krzk+dt, robh, conor+dt, gordon.ge, bst-upstream,
linux-arm-kernel, devicetree, linux-kernel
In-Reply-To: <20260413083441.1758438-1-yangzh0906@thundersoft.com>
On 13/04/2026 10:34, Albert Yang wrote:
> Hi Krzysztof, Arnd, Rob, and Conor,
>
> Gentle ping for this v7 series posted on 2026-03-10:
> https://lore.kernel.org/lkml/20260310091211.4171307-1-yangzh0906@thundersoft.com/
That's a BST patch, so pinging SoC and DT maintainers won't help you.
You need to work with BST maintainers.
Additionally, don't ping during merge window.
Best regards,
Krzysztof
^ permalink raw reply
* Re: [PATCH 2/4] soc: amlogic: clk-measure: Add A1 and T7 support
From: Krzysztof Kozlowski @ 2026-04-13 9:10 UTC (permalink / raw)
To: Jian Hu, Neil Armstrong, Jerome Brunet, Kevin Hilman,
Michael Turquette, Martin Blumenstingl, robh+dt, Rob Herring
Cc: devicetree, linux-amlogic, linux-kernel, linux-arm-kernel
In-Reply-To: <274d2abd-05b9-4dbd-b962-ff70044b8d07@amlogic.com>
On 13/04/2026 10:21, Jian Hu wrote:
>
> On 4/12/2026 5:55 PM, Krzysztof Kozlowski wrote:
>> [ EXTERNAL EMAIL ]
>>
>> On 10/04/2026 12:03, Jian Hu wrote:
>>> Add support for the A1 and T7 SoC family in amlogic clk measure.
>>>
>>> Signed-off-by: Jian Hu <jian.hu@amlogic.com>
>>> ---
>>> drivers/soc/amlogic/meson-clk-measure.c | 272 ++++++++++++++++++++++++
>>> 1 file changed, 272 insertions(+)
>>>
>>> diff --git a/drivers/soc/amlogic/meson-clk-measure.c b/drivers/soc/amlogic/meson-clk-measure.c
>>> index d862e30a244e..083524671b76 100644
>>> --- a/drivers/soc/amlogic/meson-clk-measure.c
>>> +++ b/drivers/soc/amlogic/meson-clk-measure.c
>>> @@ -787,6 +787,258 @@ static const struct meson_msr_id clk_msr_s4[] = {
>>>
>>> };
>>>
>>> +static struct meson_msr_id clk_msr_a1[] = {
>> And existing code uses what sort of array? Seems you send us obsolete or
>> downstream code.
>
>
> Thanks for your review.
>
>
> I have checked the previous Amlogic SoC's commits. Such as Amlogic AXG,
> G12A, C3, S4.
>
> Each clk_msr_xx array is added after the previous SoC's array, so they
> are ordered by submission date rather than alphabetically.
>
> So I place A1 and T7 after S4 accordingly.
>
>
> The A1 clock controller driver was already supported in
> https://lore.kernel.org/all/20230523135351.19133-7-ddrokosov@sberdevices.ru/
>
> It is also present in the mainline kernel:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/clk/meson/Kconfig#n113
>
>
> This clock measure IP measures the frequencies of the internal clock
> paths, and the A1 clock controller driver is already supported.
>
> Since the corresponding clock measure driver does not support A1 yet,
> this patch adds the A1 clk msr table here.
No, what qualifiers or keywords are used for existing arrays? IOW,
please investigate and understand why you are doing this very differently
from the existing code. Maybe it is because you sent us downstream code,
so you replicated all the other downstream issues.
Best regards,
Krzysztof
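For reference, the qualifier in question can be illustrated with a minimal standalone sketch. The hunk context quoted above shows the existing S4 table declared as `static const struct meson_msr_id clk_msr_s4[]`, while the new A1 table is declared without `const`. The struct layout, table entries and lookup helper below are simplified assumptions for illustration only, not the driver's actual definitions.

```c
#include <stddef.h>

/* Simplified stand-in for the driver's struct; the field layout is an assumption. */
struct msr_id_sketch {
	unsigned int id;
	const char *name;
};

/* Read-only table: 'static const' allows placement in read-only data. */
static const struct msr_id_sketch clk_msr_example[] = {
	{ 0, "example_clk_a" },	/* hypothetical entries */
	{ 1, "example_clk_b" },
	{ 5, "example_clk_c" },
};

#define EXAMPLE_COUNT (sizeof(clk_msr_example) / sizeof(clk_msr_example[0]))

/* Look up an entry by id; returns NULL when the id is not in the table. */
static const struct msr_id_sketch *msr_lookup(unsigned int id)
{
	size_t i;

	for (i = 0; i < EXAMPLE_COUNT; i++) {
		if (clk_msr_example[i].id == id)
			return &clk_msr_example[i];
	}
	return NULL;
}
```

Because the table is never modified at run time, callers only ever need a pointer to const data, which is why the lookup helper returns `const struct msr_id_sketch *`.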
* Re: [patch 07/38] treewide: Consolidate cycles_t
From: Ojaswin Mujoo @ 2026-04-13 9:15 UTC (permalink / raw)
To: Thomas Gleixner
Cc: LKML, Arnd Bergmann, x86, Lu Baolu, iommu, Michael Grzeschik,
netdev, linux-wireless, Herbert Xu, linux-crypto, Vlastimil Babka,
linux-mm, David Woodhouse, Bernie Thompson, linux-fbdev,
Theodore Tso, linux-ext4, Andrew Morton, Uladzislau Rezki,
Marco Elver, Dmitry Vyukov, kasan-dev, Andrey Ryabinin,
Thomas Sailer, linux-hams, Jason A. Donenfeld, Richard Henderson,
linux-alpha, Russell King, linux-arm-kernel, Catalin Marinas,
Huacai Chen, loongarch, Geert Uytterhoeven, linux-m68k,
Dinh Nguyen, Jonas Bonn, linux-openrisc, Helge Deller,
linux-parisc, Michael Ellerman, linuxppc-dev, Paul Walmsley,
linux-riscv, Heiko Carstens, linux-s390, David S. Miller,
sparclinux
In-Reply-To: <20260410120318.045532623@kernel.org>
On Fri, Apr 10, 2026 at 02:19:03PM +0200, Thomas Gleixner wrote:
> Most architectures define cycles_t as unsigned long except:
>
> - x86 requires it to be 64-bit independent of the 32-bit/64-bit build.
>
> - parisc and mips define it as unsigned int
>
> parisc has no real reason to do so as there are only a few usage sites
> which either expand it to a 64-bit value or utilize only the lower
> 32 bits.
>
> mips has no real requirement either.
>
> Move the typedef to types.h and provide a config switch to enforce the
> 64-bit type for x86.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> ---
> arch/Kconfig | 4 ++++
> arch/alpha/include/asm/timex.h | 3 ---
> arch/arm/include/asm/timex.h | 1 -
> arch/loongarch/include/asm/timex.h | 2 --
> arch/m68k/include/asm/timex.h | 2 --
> arch/mips/include/asm/timex.h | 2 --
> arch/nios2/include/asm/timex.h | 2 --
> arch/parisc/include/asm/timex.h | 2 --
> arch/powerpc/include/asm/timex.h | 4 +---
> arch/riscv/include/asm/timex.h | 2 --
> arch/s390/include/asm/timex.h | 2 --
> arch/sparc/include/asm/timex_64.h | 1 -
> arch/x86/Kconfig | 1 +
> arch/x86/include/asm/tsc.h | 2 --
> include/asm-generic/timex.h | 1 -
> include/linux/types.h | 6 ++++++
> 16 files changed, 12 insertions(+), 25 deletions(-)
>
<...>
> --- a/arch/powerpc/include/asm/timex.h
> +++ b/arch/powerpc/include/asm/timex.h
> @@ -11,9 +11,7 @@
> #include <asm/cputable.h>
> #include <asm/vdso/timebase.h>
>
> -typedef unsigned long cycles_t;
> -
> -static inline cycles_t get_cycles(void)
> +ostatic inline cycles_t get_cycles(void)
Hi Thomas, I'm in the middle of testing this series on powerpc. In the
meantime I noticed that there's probably a small typo here (although this
is fixed later).
Regards,
ojaswin
> {
> return mftb();
> }
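The consolidation described in the quoted changelog can be sketched in plain C roughly as follows. The macro name `EXAMPLE_ARCH_WANTS_64BIT_CYCLES` is a placeholder for the config switch, whose real name is not shown in this excerpt, and `get_cycles_sketch()` is a hypothetical stand-in for an architecture's counter read.

```c
#include <stdint.h>

/*
 * Placeholder for the Kconfig switch mentioned in the changelog; the real
 * symbol name is not visible in this excerpt.
 */
#ifdef EXAMPLE_ARCH_WANTS_64BIT_CYCLES
typedef uint64_t cycles_t;	/* x86-style: 64-bit on both 32-bit and 64-bit builds */
#else
typedef unsigned long cycles_t;	/* default used by most architectures */
#endif

/* Hypothetical stand-in for an architecture's counter read. */
static inline cycles_t get_cycles_sketch(uint64_t raw_counter)
{
	return (cycles_t)raw_counter;
}
```

With a single shared typedef, per-architecture headers only need to provide the counter read itself, which matches the shape of the powerpc hunk quoted above.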
* Re: [PATCH v2 3/4] KVM: arm64: sefltests: Add basic NV selftest
From: Wei-Lin Chang @ 2026-04-13 9:18 UTC (permalink / raw)
To: Itaru Kitayama
Cc: linux-arm-kernel, kvmarm, kvm, linux-kselftest, linux-kernel,
Marc Zyngier, Oliver Upton, Joey Gouly, Suzuki K Poulose,
Zenghui Yu, Catalin Marinas, Will Deacon, Paolo Bonzini,
Shuah Khan
In-Reply-To: <adwofV7yf71OdMFk@sm-arm-grace07>
Hi Itaru,
On Mon, Apr 13, 2026 at 08:19:25AM +0900, Itaru Kitayama wrote:
> On Sun, Apr 12, 2026 at 03:22:15PM +0100, Wei-Lin Chang wrote:
> > This selftest simply starts an L1, which starts its own guest (L2). L2
> > runs without stage-1 and stage-2 translations and issues an HVC to jump
> > back to L1.
>
> How do you disable both the nested guest (L2)'s MMU and stage 2
> translations?
Guest stage-2 is disabled by not setting HCR_EL2.VM in prepare_hyp(),
and stage-1 is disabled by not writing to SCTLR_EL12 in init_vcpu(),
effectively using the default value set by L0. However, since SCTLR_EL1
has many architecturally UNKNOWN bits (including SCTLR_EL1.M), it would
probably be better to write an explicit value before running L2.
Thanks,
Wei-Lin Chang
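As a small illustration of the two enable bits discussed here: SCTLR_EL1.M and HCR_EL2.VM are both bit 0 of their respective registers. The helper names below are hypothetical and are not part of the selftest; the sketch only shows how an explicit value with the stage-1 MMU off could be derived instead of relying on whatever was left in the register.

```c
#include <stdbool.h>
#include <stdint.h>

#define SCTLR_EL1_M	(UINT64_C(1) << 0)	/* stage-1 MMU enable */
#define HCR_EL2_VM	(UINT64_C(1) << 0)	/* stage-2 translation enable */

/* True when the given SCTLR_EL1 value has the stage-1 MMU enabled. */
static inline bool stage1_enabled(uint64_t sctlr_el1)
{
	return (sctlr_el1 & SCTLR_EL1_M) != 0;
}

/* True when the given HCR_EL2 value has stage-2 translation enabled. */
static inline bool stage2_enabled(uint64_t hcr_el2)
{
	return (hcr_el2 & HCR_EL2_VM) != 0;
}

/* Return the SCTLR_EL1 value with only the M bit cleared. */
static inline uint64_t sctlr_with_mmu_off(uint64_t sctlr_el1)
{
	return sctlr_el1 & ~SCTLR_EL1_M;
}
```

In a test like this one, the result of something like `sctlr_with_mmu_off()` would be the value written before entering L2, so that the state of SCTLR_EL1.M no longer depends on an UNKNOWN reset value.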
>
> Itaru.
>
> >
> > Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> > ---
> > tools/testing/selftests/kvm/Makefile.kvm | 1 +
> > .../selftests/kvm/arm64/hello_nested.c | 103 ++++++++++++++++++
> > 2 files changed, 104 insertions(+)
> > create mode 100644 tools/testing/selftests/kvm/arm64/hello_nested.c
> >
> > diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
> > index 3dc3e39f7025..e8c108e0c487 100644
> > --- a/tools/testing/selftests/kvm/Makefile.kvm
> > +++ b/tools/testing/selftests/kvm/Makefile.kvm
> > @@ -168,6 +168,7 @@ TEST_GEN_PROGS_arm64 += arm64/arch_timer_edge_cases
> > TEST_GEN_PROGS_arm64 += arm64/at
> > TEST_GEN_PROGS_arm64 += arm64/debug-exceptions
> > TEST_GEN_PROGS_arm64 += arm64/hello_el2
> > +TEST_GEN_PROGS_arm64 += arm64/hello_nested
> > TEST_GEN_PROGS_arm64 += arm64/host_sve
> > TEST_GEN_PROGS_arm64 += arm64/hypercalls
> > TEST_GEN_PROGS_arm64 += arm64/external_aborts
> > diff --git a/tools/testing/selftests/kvm/arm64/hello_nested.c b/tools/testing/selftests/kvm/arm64/hello_nested.c
> > new file mode 100644
> > index 000000000000..97387e4697b3
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/arm64/hello_nested.c
> > @@ -0,0 +1,103 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * hello_nested - Go from vEL2 to EL1 then back
> > + */
> > +
> > +#include "nested.h"
> > +#include "processor.h"
> > +#include "test_util.h"
> > +#include "ucall.h"
> > +
> > +#define XLATE2GPA (0xABCD)
> > +#define L2STACKSZ (0x100)
> > +
> > +/*
> > + * TPIDR_EL2 is used to store vcpu id, so save and restore it.
> > + */
> > +static vm_paddr_t ucall_translate_to_gpa(void *gva)
> > +{
> > + vm_paddr_t gpa;
> > + u64 vcpu_id = read_sysreg(tpidr_el2);
> > +
> > + GUEST_SYNC2(XLATE2GPA, gva);
> > +
> > + /* get the result from userspace */
> > + gpa = read_sysreg(tpidr_el2);
> > +
> > + write_sysreg(vcpu_id, tpidr_el2);
> > +
> > + return gpa;
> > +}
> > +
> > +static void l2_guest_code(void)
> > +{
> > + do_hvc();
> > +}
> > +
> > +static void guest_code(void)
> > +{
> > + struct vcpu vcpu;
> > + struct hyp_data hyp_data;
> > + int ret;
> > + vm_paddr_t l2_pc, l2_stack_top;
> > + /* force 16-byte alignment for the stack pointer */
> > + u8 l2_stack[L2STACKSZ] __attribute__((aligned(16)));
> > +
> > + GUEST_ASSERT_EQ(get_current_el(), 2);
> > + GUEST_PRINTF("vEL2 entry\n");
> > +
> > + l2_pc = ucall_translate_to_gpa(l2_guest_code);
> > + l2_stack_top = ucall_translate_to_gpa(&l2_stack[L2STACKSZ]);
> > +
> > + init_vcpu(&vcpu, l2_pc, l2_stack_top);
> > + prepare_hyp();
> > +
> > + ret = run_l2(&vcpu, &hyp_data);
> > + GUEST_ASSERT_EQ(ret, ARM_EXCEPTION_TRAP);
> > + GUEST_DONE();
> > +}
> > +
> > +int main(void)
> > +{
> > + struct kvm_vcpu_init init;
> > + struct kvm_vcpu *vcpu;
> > + struct kvm_vm *vm;
> > + struct ucall uc;
> > + vm_paddr_t gpa;
> > +
> > + TEST_REQUIRE(kvm_check_cap(KVM_CAP_ARM_EL2));
> > + vm = vm_create(1);
> > +
> > + kvm_get_default_vcpu_target(vm, &init);
> > + init.features[0] |= BIT(KVM_ARM_VCPU_HAS_EL2);
> > + vcpu = aarch64_vcpu_add(vm, 0, &init, guest_code);
> > + kvm_arch_vm_finalize_vcpus(vm);
> > +
> > + while (true) {
> > + vcpu_run(vcpu);
> > +
> > + switch (get_ucall(vcpu, &uc)) {
> > + case UCALL_SYNC:
> > + if (uc.args[0] == XLATE2GPA) {
> > + gpa = addr_gva2gpa(vm, (vm_vaddr_t)uc.args[1]);
> > + vcpu_set_reg(vcpu, KVM_ARM64_SYS_REG(SYS_TPIDR_EL2), gpa);
> > + }
> > + break;
> > + case UCALL_PRINTF:
> > + pr_info("%s", uc.buffer);
> > + break;
> > + case UCALL_DONE:
> > + pr_info("DONE!\n");
> > + goto end;
> > + case UCALL_ABORT:
> > + REPORT_GUEST_ASSERT(uc);
> > + fallthrough;
> > + default:
> > + TEST_FAIL("Unhandled ucall: %ld\n", uc.cmd);
> > + }
> > + }
> > +
> > +end:
> > + kvm_vm_free(vm);
> > + return 0;
> > +}
> > --
> > 2.43.0
> >
* [PATCH] ARM: dts: bcm4709: fix bus range assignment
From: Arnd Bergmann @ 2026-04-13 9:21 UTC (permalink / raw)
To: Florian Fainelli, Hauke Mehrtens, Rafał Miłecki,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Rosen Penev
Cc: soc, Arnd Bergmann, Broadcom internal kernel review list,
linux-arm-kernel, devicetree, linux-kernel
From: Arnd Bergmann <arnd@arndb.de>
The Netgear R8000 dts file limits the bus range for the first host
bridge to exclude bus 0, but the two devices on the first bus are
explicitly assigned to bus 0, causing a build time warning:
/home/arnd/arm-soc/arch/arm/boot/dts/broadcom/bcm4709-netgear-r8000.dts:142.3-27: Warning (pci_device_bus_num): /axi@18000000/pcie@13000/pcie@0/pcie@0,0/pcie@1,0:bus-range: PCI bus number 0 out of range, expected (1 - 255)
/home/arnd/arm-soc/arch/arm/boot/dts/broadcom/bcm4709-netgear-r8000.dts:142.3-27: Warning (pci_device_bus_num): /axi@18000000/pcie@13000/pcie@0/pcie@0,0/pcie@2,0:bus-range: PCI bus number 0 out of range, expected (1 - 255)
I could not find any reason why this is done in the first place, but
this can be easily addressed by reassigning the two devices to
bus 1, or by dropping the bus-range property in order to allow
secondary bus 0 to be assigned.
Assuming the bus-range is intentional, fix this by moving the
devices to the first valid secondary bus number.
Fixes: 893faf67438c ("ARM: dts: BCM5301X: add root pcie bridges")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
arch/arm/boot/dts/broadcom/bcm4709-netgear-r8000.dts | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/boot/dts/broadcom/bcm4709-netgear-r8000.dts b/arch/arm/boot/dts/broadcom/bcm4709-netgear-r8000.dts
index d170c71cbd76..355be5014943 100644
--- a/arch/arm/boot/dts/broadcom/bcm4709-netgear-r8000.dts
+++ b/arch/arm/boot/dts/broadcom/bcm4709-netgear-r8000.dts
@@ -147,7 +147,7 @@ pcie@0,0 {
pcie@1,0 {
device_type = "pci";
- reg = <0x800 0 0 0 0>;
+ reg = <0x10800 0 0 0 0>;
#address-cells = <3>;
#size-cells = <2>;
@@ -162,7 +162,7 @@ wifi@0,0 {
pcie@2,0 {
device_type = "pci";
- reg = <0x1000 0 0 0 0>;
+ reg = <0x11000 0 0 0 0>;
#address-cells = <3>;
#size-cells = <2>;
--
2.39.5
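To see why the new `reg` values move the devices to bus 1, it helps to decode the first cell (phys.hi) of a PCI `reg` entry: in the standard PCI device tree binding, the bus number sits in bits 23:16, the device number in bits 15:11 and the function number in bits 10:8. The following is a minimal sketch of that decoding; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Bus/device/function fields of the first (phys.hi) cell of a PCI 'reg' entry. */
struct pci_dt_addr {
	unsigned int bus;
	unsigned int dev;
	unsigned int fn;
};

/* Extract bus (bits 23:16), device (bits 15:11) and function (bits 10:8). */
static struct pci_dt_addr pci_dt_decode(uint32_t phys_hi)
{
	struct pci_dt_addr a = {
		.bus = (phys_hi >> 16) & 0xff,
		.dev = (phys_hi >> 11) & 0x1f,
		.fn  = (phys_hi >> 8) & 0x7,
	};

	return a;
}
```

Decoding the values from the patch, 0x800 and 0x1000 are devices 1 and 2 on bus 0, while 0x10800 and 0x11000 are the same devices on bus 1, which falls inside the expected range (1 - 255) reported by the warning.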