* [RFC PATCH 00/36] arm_mpam: Add basic mpam driver
@ 2025-07-11 18:36 James Morse
2025-07-11 18:36 ` [RFC PATCH 01/36] cacheinfo: Set cache 'id' based on DT data James Morse
` (36 more replies)
0 siblings, 37 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
Hello,
This is just enough MPAM driver for the ACPI and DT pre-requisites.
It doesn't contain any of the resctrl code, meaning you can't actually drive it
from user-space yet.
This is the initial group of patches that allows the resctrl code to be built
on top. Including that will increase the number of trees that may need to
coordinate, so breaking it up makes sense.
The locking looks very strange - but is influenced by the 'mpam-fb' firmware
interface specification that is still alpha. That thing needs to wait for an
interrupt after every system register write, which significantly impacts the
driver. Some features just won't work, e.g. reading the monitor registers via
perf.
The aim is to not have to make invasive changes to the locking to support the
firmware interface, hence it looks strange from day-1.
I've not found a platform that can test all the behaviours around the monitors,
so this is where I'd expect the most bugs.
It's unclear where in the tree this should be put. It affects memory bandwidth
and cache allocation, but doesn't (yet) interact with perf. The main interaction
is with resctrl in fs/resctrl - but there will be no filesystem code in here.
It's also likely there will be other in-kernel users. (in-kernel MSC emulation by
KVM being an obvious example).
(I'm not a fan of drivers/resctrl or drivers/mpam - it's not the sort of thing
that justifies being a 'subsystem'.)
For now, I've put this under drivers/platform/arm64. Other ideas welcome.
The first three patches are currently a series on the list; the PPTT stuff
has previously been posted - this is where the users of those helpers appear.
The MPAM spec that describes all the system and MMIO registers can be found
here:
https://developer.arm.com/documentation/ddi0598/db/?lang=en
(Ignore the 'RETIRED' warning - that is just Arm moving the documentation
around. This document has the best overview.)
This series is based on v6.16-rc4, and can be retrieved from:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/driver/rfc
The rest of the driver can be found here:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/snapshot/v6.16-rc4
What is MPAM? Set your time-machine to 2020:
https://lore.kernel.org/lkml/20201030161120.227225-1-james.morse@arm.com/
Bugs welcome,
Thanks,
James Morse (31):
cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for
cache-id
arm64: cacheinfo: Provide helper to compress MPIDR value into u32
cacheinfo: Expose the code to generate a cache-id from a device_node
ACPI / PPTT: Add a helper to fill a cpumask from a processor container
ACPI / PPTT: Stop acpi_count_levels() expecting callers to clear
levels
ACPI / PPTT: Find cache level by cache-id
ACPI / PPTT: Add a helper to fill a cpumask from a cache_id
arm64: kconfig: Add Kconfig entry for MPAM
ACPI / MPAM: Parse the MPAM table
platform: arm64: Move ec devices to an ec subdirectory
arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate
arm_mpam: Add the class and component structures for ris firmware
described
arm_mpam: Add MPAM MSC register layout definitions
arm_mpam: Add cpuhp callbacks to probe MSC hardware
arm_mpam: Probe MSCs to find the supported partid/pmg values
arm_mpam: Add helpers for managing the locking around the mon_sel
registers
arm_mpam: Probe the hardware features resctrl supports
arm_mpam: Merge supported features during mpam_enable() into
mpam_class
arm_mpam: Reset MSC controls from cpu hp callbacks
arm_mpam: Add a helper to touch an MSC from any CPU
arm_mpam: Extend reset logic to allow devices to be reset any time
arm_mpam: Register and enable IRQs
arm_mpam: Use a static key to indicate when mpam is enabled
arm_mpam: Allow configuration to be applied and restored during cpu
online
arm_mpam: Probe and reset the rest of the features
arm_mpam: Add helpers to allocate monitors
arm_mpam: Add mpam_msmon_read() to read monitor value
arm_mpam: Track bandwidth counter state for overflow and power
management
arm_mpam: Add helper to reset saved mbwu state
arm_mpam: Add kunit test for bitmap reset
arm_mpam: Add kunit tests for props_mismatch()
Rob Herring (2):
cacheinfo: Set cache 'id' based on DT data
dt-bindings: arm: Add MPAM MSC binding
Rohit Mathew (2):
arm_mpam: Probe for long/lwd mbwu counters
arm_mpam: Use long MBWU counters if supported
Shanker Donthineni (1):
arm_mpam: Add support for memory controller MSC on DT platforms
.../devicetree/bindings/arm/arm,mpam-msc.yaml | 227 ++
MAINTAINERS | 6 +-
arch/arm64/Kconfig | 19 +
arch/arm64/include/asm/cache.h | 17 +
drivers/acpi/arm64/Kconfig | 3 +
drivers/acpi/arm64/Makefile | 1 +
drivers/acpi/arm64/mpam.c | 365 +++
drivers/acpi/pptt.c | 240 +-
drivers/acpi/tables.c | 2 +-
drivers/base/cacheinfo.c | 57 +
drivers/platform/arm64/Kconfig | 73 +-
drivers/platform/arm64/Makefile | 10 +-
drivers/platform/arm64/ec/Kconfig | 73 +
drivers/platform/arm64/ec/Makefile | 10 +
.../platform/arm64/{ => ec}/acer-aspire1-ec.c | 0
.../arm64/{ => ec}/huawei-gaokun-ec.c | 0
.../arm64/{ => ec}/lenovo-yoga-c630.c | 0
drivers/platform/arm64/mpam/Kconfig | 23 +
drivers/platform/arm64/mpam/Makefile | 4 +
drivers/platform/arm64/mpam/mpam_devices.c | 2910 +++++++++++++++++
drivers/platform/arm64/mpam/mpam_internal.h | 697 ++++
.../platform/arm64/mpam/test_mpam_devices.c | 390 +++
include/linux/acpi.h | 17 +
include/linux/arm_mpam.h | 56 +
include/linux/cacheinfo.h | 1 +
25 files changed, 5117 insertions(+), 84 deletions(-)
create mode 100644 Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml
create mode 100644 drivers/acpi/arm64/mpam.c
create mode 100644 drivers/platform/arm64/ec/Kconfig
create mode 100644 drivers/platform/arm64/ec/Makefile
rename drivers/platform/arm64/{ => ec}/acer-aspire1-ec.c (100%)
rename drivers/platform/arm64/{ => ec}/huawei-gaokun-ec.c (100%)
rename drivers/platform/arm64/{ => ec}/lenovo-yoga-c630.c (100%)
create mode 100644 drivers/platform/arm64/mpam/Kconfig
create mode 100644 drivers/platform/arm64/mpam/Makefile
create mode 100644 drivers/platform/arm64/mpam/mpam_devices.c
create mode 100644 drivers/platform/arm64/mpam/mpam_internal.h
create mode 100644 drivers/platform/arm64/mpam/test_mpam_devices.c
create mode 100644 include/linux/arm_mpam.h
--
2.39.5
* [RFC PATCH 01/36] cacheinfo: Set cache 'id' based on DT data
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 02/36] cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id James Morse
` (35 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Greg Kroah-Hartman, Rafael J. Wysocki, James Morse,
Jonathan Cameron, Gavin Shan
From: Rob Herring <robh@kernel.org>
Use the minimum CPU h/w id of the CPUs associated with the cache for the
cache 'id'. This will provide a stable id value for a given system. As
we need to check all possible CPUs, we can't use the shared_cpu_map
which is just online CPUs. As there's not a cache to CPUs mapping in DT,
we have to walk all CPU nodes and then walk cache levels.
The cache_id exposed to user-space has historically been 32 bits, and it is
too late to change. This value is parsed into a u32 by user-space
libraries such as libvirt:
https://github.com/libvirt/libvirt/blob/master/src/util/virresctrl.c#L1588
Give up on assigning cache-ids if a CPU h/w id greater than 32 bits
is found.
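For illustration only, a minimal stand-alone sketch of the rule described
above (the hw id values are invented; the in-kernel implementation is in the
diff below):

#include <stdint.h>
#include <stdio.h>

/* Return the cache-id, or UINT64_MAX if any hw id needs more than 32 bits */
static uint64_t pick_cache_id(const uint64_t *hwids, int nr)
{
	uint64_t min_id = UINT32_MAX;

	for (int i = 0; i < nr; i++) {
		if (hwids[i] >> 32)
			return UINT64_MAX;	/* would truncate: give up */
		if (hwids[i] < min_id)
			min_id = hwids[i];
	}

	return min_id;
}

int main(void)
{
	/* Four CPUs sharing one cache: the id is the smallest hw id, 0x100 */
	uint64_t hwids[] = { 0x103, 0x100, 0x101, 0x102 };

	printf("cache-id: %#llx\n",
	       (unsigned long long)pick_cache_id(hwids, 4));
	return 0;
}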
match_cache_node() does not make use of the __free() cleanup helpers
because of_find_next_cache_node(prev) does not drop a reference to prev,
and it's too easy to accidentally drop the reference on cpu, which belongs
to for_each_of_cpu_node().
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
[ ben: converted to use the __free cleanup idiom ]
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
[ morse: Add checks to give up if a value larger than 32 bits is seen. ]
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
---
Use as a 32-bit value has also been seen in DPDK patches here:
http://inbox.dpdk.org/dev/20241021015246.304431-2-wathsala.vithanage@arm.com/
Changes since v2:
* Removed broken use of cleanup in the match helper
Changes since v1:
* Remove the second loop in favour of a helper.
---
drivers/base/cacheinfo.c | 45 ++++++++++++++++++++++++++++++++++++++++
1 file changed, 45 insertions(+)
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index cf0d455209d7..4e2f60c85e74 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -8,6 +8,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/acpi.h>
+#include <linux/bitfield.h>
#include <linux/bitops.h>
#include <linux/cacheinfo.h>
#include <linux/compiler.h>
@@ -183,6 +184,49 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
return of_property_read_bool(np, "cache-unified");
}
+static bool match_cache_node(struct device_node *cpu,
+ const struct device_node *cache_node)
+{
+ struct device_node *prev, *cache = of_find_next_cache_node(cpu);
+
+ while (cache) {
+ if (cache == cache_node) {
+ of_node_put(cache);
+ return true;
+ }
+
+ prev = cache;
+ cache = of_find_next_cache_node(cache);
+ of_node_put(prev);
+ }
+
+ return false;
+}
+
+static void cache_of_set_id(struct cacheinfo *this_leaf,
+ struct device_node *cache_node)
+{
+ struct device_node *cpu;
+ u32 min_id = ~0;
+
+ for_each_of_cpu_node(cpu) {
+ u64 id = of_get_cpu_hwid(cpu, 0);
+
+ if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
+ of_node_put(cpu);
+ return;
+ }
+
+ if (match_cache_node(cpu, cache_node))
+ min_id = min(min_id, id);
+ }
+
+ if (min_id != ~0) {
+ this_leaf->id = min_id;
+ this_leaf->attributes |= CACHE_ID;
+ }
+}
+
static void cache_of_set_props(struct cacheinfo *this_leaf,
struct device_node *np)
{
@@ -198,6 +242,7 @@ static void cache_of_set_props(struct cacheinfo *this_leaf,
cache_get_line_size(this_leaf, np);
cache_nr_sets(this_leaf, np);
cache_associativity(this_leaf);
+ cache_of_set_id(this_leaf, np);
}
static int cache_setup_of_node(unsigned int cpu)
--
2.39.5
* [RFC PATCH 02/36] cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
2025-07-11 18:36 ` [RFC PATCH 01/36] cacheinfo: Set cache 'id' based on DT data James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 03/36] arm64: cacheinfo: Provide helper to compress MPIDR value into u32 James Morse
` (34 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse, Jonathan Cameron, Gavin Shan
Filesystems like resctrl use the cache-id exposed via sysfs to identify
groups of CPUs. The value is also used for PCIe cache steering tags. On
DT platforms cache-id is not something that is described in the
device-tree, but instead generated from the smallest CPU h/w id of the
CPUs associated with that cache.
CPU h/w ids may be larger than 32 bits.
Add a hook to allow architectures to compress the value from the devicetree
into 32 bits. Returning the same value is always safe as cache_of_set_id()
will stop if a value larger than 32 bits is seen.
For example, on arm64 the value is the MPIDR affinity register, which only
has 32 bits of affinity data, but spread across the 64-bit field. An
arch-specific bit swizzle gives a 32 bit value.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
---
drivers/base/cacheinfo.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 4e2f60c85e74..613410705a47 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -203,6 +203,10 @@ static bool match_cache_node(struct device_node *cpu,
return false;
}
+#ifndef arch_compact_of_hwid
+#define arch_compact_of_hwid(_x) (_x)
+#endif
+
static void cache_of_set_id(struct cacheinfo *this_leaf,
struct device_node *cache_node)
{
@@ -212,6 +216,7 @@ static void cache_of_set_id(struct cacheinfo *this_leaf,
for_each_of_cpu_node(cpu) {
u64 id = of_get_cpu_hwid(cpu, 0);
+ id = arch_compact_of_hwid(id);
if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
of_node_put(cpu);
return;
--
2.39.5
* [RFC PATCH 03/36] arm64: cacheinfo: Provide helper to compress MPIDR value into u32
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
2025-07-11 18:36 ` [RFC PATCH 01/36] cacheinfo: Set cache 'id' based on DT data James Morse
2025-07-11 18:36 ` [RFC PATCH 02/36] cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 04/36] cacheinfo: Expose the code to generate a cache-id from a device_node James Morse
` (33 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse, Jonathan Cameron, Gavin Shan
Filesystems like resctrl use the cache-id exposed via sysfs to identify
groups of CPUs. The value is also used for PCIe cache steering tags. On
DT platforms cache-id is not something that is described in the
device-tree, but instead generated from the smallest MPIDR of the CPUs
associated with that cache. The cache-id exposed to user-space has
historically been 32 bits.
MPIDR values may be larger than 32 bits.
MPIDR only has 32 bits worth of affinity data, but the aff3 field lives
above 32 bits. The corresponding lower bits are masked out by
MPIDR_HWID_BITMASK and contain an SMT flag and Uni-Processor flag.
Swizzle the aff3 field into the bottom 32 bits and use that.
In case more affinity fields are added in the future, the upper RES0
area should be checked. Returning a value greater than 32 bits from
this helper will cause the caller to give up on allocating cache-ids.
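As a rough worked example (values invented, written as stand-alone C rather
than kernel code), the swizzle maps an MPIDR with a populated aff3 field into
a value that fits in 32 bits:

#include <stdint.h>
#include <stdio.h>

/* Mirrors the arch_compact_of_hwid() logic in the diff below */
static uint64_t compact_mpidr(uint64_t id)
{
	uint64_t aff3 = (id >> 32) & 0xff;	/* MPIDR_AFFINITY_LEVEL(id, 3) */

	/* Bits [63:40] should be RES0; if not, force the caller to give up */
	if (id >> 40)
		return id;

	/* The flag bits [31:24] are dropped, aff3 takes their place */
	return (aff3 << 24) | (id & 0xffffff);
}

int main(void)
{
	/* Aff3=0x81, Aff1=0x02, Aff0=0x00 -> compacts to 0x81000200 */
	printf("%#llx\n",
	       (unsigned long long)compact_mpidr(0x0000008101000200ULL));
	return 0;
}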
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
---
Changes since v1:
* Removal of unrelated changes.
* Added a comment about how the RES0 bit safety net works.
---
arch/arm64/include/asm/cache.h | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
index 99cd6546e72e..09963004ceea 100644
--- a/arch/arm64/include/asm/cache.h
+++ b/arch/arm64/include/asm/cache.h
@@ -87,6 +87,23 @@ int cache_line_size(void);
#define dma_get_cache_alignment cache_line_size
+/* Compress a u64 MPIDR value into 32 bits. */
+static inline u64 arch_compact_of_hwid(u64 id)
+{
+ u64 aff3 = MPIDR_AFFINITY_LEVEL(id, 3);
+
+ /*
+ * These bits are expected to be RES0. If not, return a value with
+ * the upper 32 bits set to force the caller to give up on 32 bit
+ * cache ids.
+ */
+ if (FIELD_GET(GENMASK_ULL(63, 40), id))
+ return id;
+
+ return (aff3 << 24) | FIELD_GET(GENMASK_ULL(23, 0), id);
+}
+#define arch_compact_of_hwid arch_compact_of_hwid
+
/*
* Read the effective value of CTR_EL0.
*
--
2.39.5
* [RFC PATCH 04/36] cacheinfo: Expose the code to generate a cache-id from a device_node
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (2 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 03/36] arm64: cacheinfo: Provide helper to compress MPIDR value into u32 James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-14 11:40 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container James Morse
` (32 subsequent siblings)
36 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
The MPAM driver identifies caches by id for use with resctrl. It
needs to know the cache-id when probing, but the value isn't set
in cacheinfo until device_initcall().
Expose the code that generates the cache-id. The parts of the MPAM
driver that run early can use this to set up the resctrl structures
before cacheinfo is ready in device_initcall().
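For example, an early caller (such as the MPAM driver's probe path) might use
the helper roughly as below. The wrapper and its name are illustrative; only
cache_of_calculate_id() comes from this patch:

#include <linux/cacheinfo.h>
#include <linux/errno.h>
#include <linux/of.h>

/* Illustrative wrapper: resolve a 32-bit cache-id before cacheinfo is ready */
static int example_cache_id(struct device_node *cache_np, u32 *cache_id)
{
	unsigned long id = cache_of_calculate_id(cache_np);

	/* ~0UL means no 32-bit id could be generated for this cache */
	if (id == ~0UL)
		return -ENOENT;

	*cache_id = id;
	return 0;
}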
Signed-off-by: James Morse <james.morse@arm.com>
---
Changes since v1:
* Renamed cache_of_get_id() cache_of_calculate_id().
---
drivers/base/cacheinfo.c | 17 ++++++++++++-----
include/linux/cacheinfo.h | 1 +
2 files changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 613410705a47..0fdd6358ee73 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -207,8 +207,7 @@ static bool match_cache_node(struct device_node *cpu,
#define arch_compact_of_hwid(_x) (_x)
#endif
-static void cache_of_set_id(struct cacheinfo *this_leaf,
- struct device_node *cache_node)
+unsigned long cache_of_calculate_id(struct device_node *cache_node)
{
struct device_node *cpu;
u32 min_id = ~0;
@@ -219,15 +218,23 @@ static void cache_of_set_id(struct cacheinfo *this_leaf,
id = arch_compact_of_hwid(id);
if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
of_node_put(cpu);
- return;
+ return ~0UL;
}
if (match_cache_node(cpu, cache_node))
min_id = min(min_id, id);
}
- if (min_id != ~0) {
- this_leaf->id = min_id;
+ return min_id;
+}
+
+static void cache_of_set_id(struct cacheinfo *this_leaf,
+ struct device_node *cache_node)
+{
+ unsigned long id = cache_of_calculate_id(cache_node);
+
+ if (id != ~0UL) {
+ this_leaf->id = id;
this_leaf->attributes |= CACHE_ID;
}
}
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index c8f4f0a0b874..2dcbb69139e9 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -112,6 +112,7 @@ int acpi_get_cache_info(unsigned int cpu,
#endif
const struct attribute_group *cache_get_priv_group(struct cacheinfo *this_leaf);
+unsigned long cache_of_calculate_id(struct device_node *np);
/*
* Get the cacheinfo structure for the cache associated with @cpu at
--
2.39.5
* [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (3 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 04/36] cacheinfo: Expose the code to generate a cache-id from a device_node James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-17 7:58 ` Shaopeng Tan (Fujitsu)
` (2 more replies)
2025-07-11 18:36 ` [RFC PATCH 06/36] ACPI / PPTT: Stop acpi_count_levels() expecting callers to clear levels James Morse
` (31 subsequent siblings)
36 siblings, 3 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse, Sudeep Holla
The PPTT describes CPUs and caches, as well as processor containers.
The ACPI table for MPAM describes the set of CPUs that can access an MSC
with the UID of a processor container.
Add a helper to find the processor container by its id, then walk
the possible CPUs to fill a cpumask with the CPUs that have this
processor container as a parent.
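A hedged sketch of how a caller might use the new helper; everything except
acpi_pptt_get_cpus_from_container() is illustrative:

#include <linux/acpi.h>
#include <linux/cpumask.h>
#include <linux/printk.h>
#include <linux/slab.h>

/* Illustrative: print the CPUs below a processor container UID */
static void example_container_cpus(u32 container_uid)
{
	cpumask_var_t cpus;

	if (!zalloc_cpumask_var(&cpus, GFP_KERNEL))
		return;

	if (!acpi_pptt_get_cpus_from_container(container_uid, cpus))
		pr_info("container %u: CPUs %*pbl\n", container_uid,
			cpumask_pr_args(cpus));

	free_cpumask_var(cpus);
}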
CC: Dave Martin <dave.martin@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/acpi/pptt.c | 93 ++++++++++++++++++++++++++++++++++++++++++++
include/linux/acpi.h | 6 +++
2 files changed, 99 insertions(+)
diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 54676e3d82dd..13619b1b821b 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -298,6 +298,99 @@ static struct acpi_pptt_processor *acpi_find_processor_node(struct acpi_table_he
return NULL;
}
+/**
+ * acpi_pptt_get_child_cpus() - Find all the CPUs below a PPTT processor node
+ * @table_hdr: A reference to the PPTT table.
+ * @parent_node: A pointer to the processor node in the @table_hdr.
+ * @cpus: A cpumask to fill with the CPUs below @parent_node.
+ *
+ * Walks up the PPTT from every possible CPU to find if the provided
+ * @parent_node is a parent of this CPU.
+ */
+static void acpi_pptt_get_child_cpus(struct acpi_table_header *table_hdr,
+ struct acpi_pptt_processor *parent_node,
+ cpumask_t *cpus)
+{
+ struct acpi_pptt_processor *cpu_node;
+ u32 acpi_id;
+ int cpu;
+
+ cpumask_clear(cpus);
+
+ for_each_possible_cpu(cpu) {
+ acpi_id = get_acpi_id_for_cpu(cpu);
+ cpu_node = acpi_find_processor_node(table_hdr, acpi_id);
+
+ while (cpu_node) {
+ if (cpu_node == parent_node) {
+ cpumask_set_cpu(cpu, cpus);
+ break;
+ }
+ cpu_node = fetch_pptt_node(table_hdr, cpu_node->parent);
+ }
+ }
+}
+
+/**
+ * acpi_pptt_get_cpus_from_container() - Populate a cpumask with all CPUs in a
+ * processor container
+ * @acpi_cpu_id: The UID of the processor container.
+ * @cpus: The resulting CPU mask.
+ *
+ * Find the specified Processor Container, and fill @cpus with all the cpus
+ * below it.
+ *
+ * Not all 'Processor' entries in the PPTT are either a CPU or a Processor
+ * Container, they may exist purely to describe a Private resource. CPUs
+ * have to be leaves, so a Processor Container is a non-leaf that has the
+ * 'ACPI Processor ID valid' flag set.
+ *
+ * Return: 0 for a complete walk, or an error if the mask is incomplete.
+ */
+int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus)
+{
+ struct acpi_pptt_processor *cpu_node;
+ struct acpi_table_header *table_hdr;
+ struct acpi_subtable_header *entry;
+ bool leaf_flag, has_leaf_flag = false;
+ unsigned long table_end;
+ acpi_status status;
+ u32 proc_sz;
+ int ret = 0;
+
+ cpumask_clear(cpus);
+
+ status = acpi_get_table(ACPI_SIG_PPTT, 0, &table_hdr);
+ if (ACPI_FAILURE(status))
+ return 0;
+
+ if (table_hdr->revision > 1)
+ has_leaf_flag = true;
+
+ table_end = (unsigned long)table_hdr + table_hdr->length;
+ entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr,
+ sizeof(struct acpi_table_pptt));
+ proc_sz = sizeof(struct acpi_pptt_processor);
+ while ((unsigned long)entry + proc_sz <= table_end) {
+ cpu_node = (struct acpi_pptt_processor *)entry;
+ if (entry->type == ACPI_PPTT_TYPE_PROCESSOR &&
+ cpu_node->flags & ACPI_PPTT_ACPI_PROCESSOR_ID_VALID) {
+ leaf_flag = cpu_node->flags & ACPI_PPTT_ACPI_LEAF_NODE;
+ if ((has_leaf_flag && !leaf_flag) ||
+ (!has_leaf_flag && !acpi_pptt_leaf_node(table_hdr, cpu_node))) {
+ if (cpu_node->acpi_processor_id == acpi_cpu_id)
+ acpi_pptt_get_child_cpus(table_hdr, cpu_node, cpus);
+ }
+ }
+ entry = ACPI_ADD_PTR(struct acpi_subtable_header, entry,
+ entry->length);
+ }
+
+ acpi_put_table(table_hdr);
+
+ return ret;
+}
+
static u8 acpi_cache_type(enum cache_type type)
{
switch (type) {
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index f102c0fe3431..8c3165c2b083 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1541,6 +1541,7 @@ int find_acpi_cpu_topology(unsigned int cpu, int level);
int find_acpi_cpu_topology_cluster(unsigned int cpu);
int find_acpi_cpu_topology_package(unsigned int cpu);
int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
+int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus);
#else
static inline int acpi_pptt_cpu_is_thread(unsigned int cpu)
{
@@ -1562,6 +1563,11 @@ static inline int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
{
return -EINVAL;
}
+static inline int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id,
+ cpumask_t *cpus)
+{
+ return -EINVAL;
+}
#endif
void acpi_arch_init(void);
--
2.39.5
* [RFC PATCH 06/36] ACPI / PPTT: Stop acpi_count_levels() expecting callers to clear levels
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (4 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-16 15:51 ` Jonathan Cameron
2025-07-11 18:36 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id James Morse
` (30 subsequent siblings)
36 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse, Sudeep Holla
acpi_count_levels() passes the number of levels back via a pointer argument.
It also passes this to acpi_find_cache_level() as the starting_level, and
preserves this value as it walks up the cpu_node tree counting the levels.
The only caller, acpi_get_cache_info(), happens to have already initialised
levels to zero, which acpi_count_levels() depends on to get the correct
result.
Explicitly zero the levels variable, so the count always starts at zero.
This saves any additional callers having to work out they need to do this.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
---
drivers/acpi/pptt.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 13619b1b821b..13ca2eee3b98 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -183,7 +183,7 @@ acpi_find_cache_level(struct acpi_table_header *table_hdr,
* @cpu_node: processor node we wish to count caches for
* @levels: Number of levels if success.
* @split_levels: Number of split cache levels (data/instruction) if
- * success. Can by NULL.
+ * success. Can be NULL.
*
* Given a processor node containing a processing unit, walk into it and count
* how many levels exist solely for it, and then walk up each level until we hit
@@ -196,6 +196,8 @@ static void acpi_count_levels(struct acpi_table_header *table_hdr,
struct acpi_pptt_processor *cpu_node,
unsigned int *levels, unsigned int *split_levels)
{
+ *levels = 0;
+
do {
acpi_find_cache_level(table_hdr, cpu_node, levels, split_levels, 0, 0);
cpu_node = fetch_pptt_node(table_hdr, cpu_node->parent);
--
2.39.5
* [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (5 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 06/36] ACPI / PPTT: Stop acpi_count_levels() expecting callers to clear levels James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-14 11:42 ` Ben Horgan
2025-07-16 16:21 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-idUIRE Jonathan Cameron
2025-07-11 18:36 ` [RFC PATCH 08/36] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id James Morse
` (29 subsequent siblings)
36 siblings, 2 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse, Sudeep Holla
The MPAM table identifies caches by id. The MPAM driver also wants to know
the cache level to determine if the platform is of the shape that can be
managed via resctrl. Cacheinfo has this information, but only for CPUs that
are online.
Waiting for all CPUs to come online is a problem for platforms where
CPUs are brought online late by user-space.
Add a helper that walks every possible cache, until it finds the one
identified by cache-id, and then returns the level.
acpi_count_levels() expects its levels parameter to be initialised to
zero as it passes it to acpi_find_cache_level() as starting_level.
The existing callers do this. Document it.
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
---
drivers/acpi/pptt.c | 73 ++++++++++++++++++++++++++++++++++++++++++++
include/linux/acpi.h | 5 +++
2 files changed, 78 insertions(+)
diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index 13ca2eee3b98..f53748a5df19 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -912,3 +912,76 @@ int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
ACPI_PPTT_ACPI_IDENTICAL);
}
+
+/**
+ * find_acpi_cache_level_from_id() - Get the level of the specified cache
+ * @cache_id: The id field of the unified cache
+ *
+ * Determine the level relative to any CPU for the unified cache identified by
+ * cache_id. This allows the property to be found even if the CPUs are offline.
+ *
+ * The returned level can be used to group unified caches that are peers.
+ *
+ * The PPTT table must be rev 3 or later,
+ *
+ * If one CPU's L2 is shared with another as L3, this function will return
+ * an unpredictable value.
+ *
+ * Return: -ENOENT if the PPTT doesn't exist, or the cache cannot be found.
+ * Otherwise returns a value which represents the level of the specified cache.
+ */
+int find_acpi_cache_level_from_id(u32 cache_id)
+{
+ u32 acpi_cpu_id;
+ acpi_status status;
+ int level, cpu, num_levels;
+ struct acpi_pptt_cache *cache;
+ struct acpi_table_header *table;
+ struct acpi_pptt_cache_v1 *cache_v1;
+ struct acpi_pptt_processor *cpu_node;
+
+ status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
+ if (ACPI_FAILURE(status)) {
+ acpi_pptt_warn_missing();
+ return -ENOENT;
+ }
+
+ if (table->revision < 3) {
+ acpi_put_table(table);
+ return -ENOENT;
+ }
+
+ /*
+ * If we found the cache first, we'd still need to walk from each CPU
+ * to find the level...
+ */
+ for_each_possible_cpu(cpu) {
+ acpi_cpu_id = get_acpi_id_for_cpu(cpu);
+ cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
+ if (!cpu_node)
+ break;
+ acpi_count_levels(table, cpu_node, &num_levels, NULL);
+
+ /* Start at 1 for L1 */
+ for (level = 1; level <= num_levels; level++) {
+ cache = acpi_find_cache_node(table, acpi_cpu_id,
+ ACPI_PPTT_CACHE_TYPE_UNIFIED,
+ level, &cpu_node);
+ if (!cache)
+ continue;
+
+ cache_v1 = ACPI_ADD_PTR(struct acpi_pptt_cache_v1,
+ cache,
+ sizeof(struct acpi_pptt_cache));
+
+ if (cache->flags & ACPI_PPTT_CACHE_ID_VALID &&
+ cache_v1->cache_id == cache_id) {
+ acpi_put_table(table);
+ return level;
+ }
+ }
+ }
+
+ acpi_put_table(table);
+ return -ENOENT;
+}
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 8c3165c2b083..82947f6d2a43 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1542,6 +1542,7 @@ int find_acpi_cpu_topology_cluster(unsigned int cpu);
int find_acpi_cpu_topology_package(unsigned int cpu);
int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus);
+int find_acpi_cache_level_from_id(u32 cache_id);
#else
static inline int acpi_pptt_cpu_is_thread(unsigned int cpu)
{
@@ -1568,6 +1569,10 @@ static inline int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id,
{
return -EINVAL;
}
+static inline int find_acpi_cache_level_from_id(u32 cache_id)
+{
+ return -EINVAL;
+}
#endif
void acpi_arch_init(void);
--
2.39.5
* [RFC PATCH 08/36] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (6 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-16 16:24 ` Jonathan Cameron
2025-07-11 18:36 ` [RFC PATCH 09/36] arm64: kconfig: Add Kconfig entry for MPAM James Morse
` (28 subsequent siblings)
36 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse, Rohit Mathew, Sudeep Holla
MPAM identifies CPUs by the cache_id in the PPTT cache structure.
The driver needs to know which CPUs are associated with the cache. The
CPUs may not all be online, so cacheinfo does not have the information.
Add a helper to pull this information out of the PPTT.
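Taken together with the helper from the previous patch, a caller could
resolve both the level and the affinity of a cache identified only by its
PPTT cache-id. The surrounding function is illustrative, not part of the
series:

#include <linux/acpi.h>
#include <linux/cpumask.h>

/* Illustrative: describe a cache from its PPTT cache-id, even if offline */
static int example_describe_cache(u32 cache_id, cpumask_t *affinity, int *level)
{
	*level = find_acpi_cache_level_from_id(cache_id);
	if (*level < 0)
		return *level;

	return acpi_pptt_get_cpumask_from_cache_id(cache_id, affinity);
}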
CC: Rohit Mathew <Rohit.Mathew@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
---
drivers/acpi/pptt.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
include/linux/acpi.h | 6 ++++
2 files changed, 76 insertions(+)
diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
index f53748a5df19..81f7ac18c023 100644
--- a/drivers/acpi/pptt.c
+++ b/drivers/acpi/pptt.c
@@ -985,3 +985,73 @@ int find_acpi_cache_level_from_id(u32 cache_id)
acpi_put_table(table);
return -ENOENT;
}
+
+/**
+ * acpi_pptt_get_cpumask_from_cache_id() - Get the cpus associated with the
+ * specified cache
+ * @cache_id: The id field of the unified cache
+ * @cpus: Where to build the cpumask
+ *
+ * Determine which CPUs are below this cache in the PPTT. This allows the property
+ * to be found even if the CPUs are offline.
+ *
+ * The PPTT table must be rev 3 or later,
+ *
+ * Return: -ENOENT if the PPTT doesn't exist, or the cache cannot be found.
+ * Otherwise returns 0 and sets the cpus in the provided cpumask.
+ */
+int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id, cpumask_t *cpus)
+{
+ u32 acpi_cpu_id;
+ acpi_status status;
+ int level, cpu, num_levels;
+ struct acpi_pptt_cache *cache;
+ struct acpi_table_header *table;
+ struct acpi_pptt_cache_v1 *cache_v1;
+ struct acpi_pptt_processor *cpu_node;
+
+ cpumask_clear(cpus);
+
+ status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
+ if (ACPI_FAILURE(status)) {
+ acpi_pptt_warn_missing();
+ return -ENOENT;
+ }
+
+ if (table->revision < 3) {
+ acpi_put_table(table);
+ return -ENOENT;
+ }
+
+ /*
+ * If we found the cache first, we'd still need to walk from each cpu.
+ */
+ for_each_possible_cpu(cpu) {
+ acpi_cpu_id = get_acpi_id_for_cpu(cpu);
+ cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
+ if (!cpu_node)
+ break;
+ acpi_count_levels(table, cpu_node, &num_levels, NULL);
+
+ /* Start at 1 for L1 */
+ for (level = 1; level <= num_levels; level++) {
+ cache = acpi_find_cache_node(table, acpi_cpu_id,
+ ACPI_PPTT_CACHE_TYPE_UNIFIED,
+ level, &cpu_node);
+ if (!cache)
+ continue;
+
+ cache_v1 = ACPI_ADD_PTR(struct acpi_pptt_cache_v1,
+ cache,
+ sizeof(struct acpi_pptt_cache));
+
+ if (cache->flags & ACPI_PPTT_CACHE_ID_VALID &&
+ cache_v1->cache_id == cache_id) {
+ cpumask_set_cpu(cpu, cpus);
+ }
+ }
+ }
+
+ acpi_put_table(table);
+ return 0;
+}
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 82947f6d2a43..61ac3d1de1e8 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1543,6 +1543,7 @@ int find_acpi_cpu_topology_package(unsigned int cpu);
int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus);
int find_acpi_cache_level_from_id(u32 cache_id);
+int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id, cpumask_t *cpus);
#else
static inline int acpi_pptt_cpu_is_thread(unsigned int cpu)
{
@@ -1573,6 +1574,11 @@ static inline int find_acpi_cache_level_from_id(u32 cache_id)
{
return -EINVAL;
}
+static inline int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id,
+ cpumask_t *cpus)
+{
+ return -EINVAL;
+}
#endif
void acpi_arch_init(void);
--
2.39.5
* [RFC PATCH 09/36] arm64: kconfig: Add Kconfig entry for MPAM
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (7 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 08/36] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-16 16:26 ` Jonathan Cameron
2025-07-11 18:36 ` [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table James Morse
` (27 subsequent siblings)
36 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
The bulk of the MPAM driver lives outside the arch code because it
largely manages MMIO devices that generate interrupts. The driver
needs a Kconfig symbol to enable it. As MPAM is only found on arm64
platforms, that is where the Kconfig option makes the most sense.
This Kconfig option will later be used by the arch code to enable
or disable the MPAM context-switch code, and to register the CPUs'
properties with the MPAM driver.
Signed-off-by: James Morse <james.morse@arm.com>
---
arch/arm64/Kconfig | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 55fc331af337..5f08214537d0 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2058,6 +2058,23 @@ config ARM64_TLB_RANGE
ARMv8.4-TLBI provides TLBI invalidation instruction that apply to a
range of input addresses.
+config ARM64_MPAM
+ bool "Enable support for MPAM"
+ help
+ Memory Partitioning and Monitoring is an optional extension
+ that allows the CPUs to mark load and store transactions with
+ labels for partition-id and performance-monitoring-group.
+ System components, such as the caches, can use the partition-id
+ to apply a performance policy. MPAM monitors can use the
+ partition-id and performance-monitoring-group to measure the
+ cache occupancy or data throughput.
+
+ Use of this extension requires CPU support, support in the
+ memory system components (MSC), and a description from firmware
+ of where the MSC are in the address space.
+
+ MPAM is exposed to user-space via the resctrl pseudo filesystem.
+
endmenu # "ARMv8.4 architectural features"
menu "ARMv8.5 architectural features"
--
2.39.5
* [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (8 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 09/36] arm64: kconfig: Add Kconfig entry for MPAM James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-16 17:07 ` Jonathan Cameron
2025-07-24 10:50 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 11/36] dt-bindings: arm: Add MPAM MSC binding James Morse
` (26 subsequent siblings)
36 siblings, 2 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
Add code to parse the arm64 specific MPAM table, looking up the cache
level from the PPTT and feeding the end result into the MPAM driver.
CC: Carl Worth <carl@os.amperecomputing.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
arch/arm64/Kconfig | 1 +
drivers/acpi/arm64/Kconfig | 3 +
drivers/acpi/arm64/Makefile | 1 +
drivers/acpi/arm64/mpam.c | 365 ++++++++++++++++++++++++++++++++++++
drivers/acpi/tables.c | 2 +-
include/linux/arm_mpam.h | 46 +++++
6 files changed, 417 insertions(+), 1 deletion(-)
create mode 100644 drivers/acpi/arm64/mpam.c
create mode 100644 include/linux/arm_mpam.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 5f08214537d0..ad9a49a39e41 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2060,6 +2060,7 @@ config ARM64_TLB_RANGE
config ARM64_MPAM
bool "Enable support for MPAM"
+ select ACPI_MPAM if ACPI
help
Memory Partitioning and Monitoring is an optional extension
that allows the CPUs to mark load and store transactions with
diff --git a/drivers/acpi/arm64/Kconfig b/drivers/acpi/arm64/Kconfig
index b3ed6212244c..f2fd79f22e7d 100644
--- a/drivers/acpi/arm64/Kconfig
+++ b/drivers/acpi/arm64/Kconfig
@@ -21,3 +21,6 @@ config ACPI_AGDI
config ACPI_APMT
bool
+
+config ACPI_MPAM
+ bool
diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
index 05ecde9eaabe..27b872249baa 100644
--- a/drivers/acpi/arm64/Makefile
+++ b/drivers/acpi/arm64/Makefile
@@ -6,5 +6,6 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
obj-$(CONFIG_ACPI_IORT) += iort.o
obj-$(CONFIG_ACPI_PROCESSOR_IDLE) += cpuidle.o
obj-$(CONFIG_ARM_AMBA) += amba.o
+obj-$(CONFIG_ACPI_MPAM) += mpam.o
obj-y += dma.o init.o
obj-y += thermal_cpufreq.o
diff --git a/drivers/acpi/arm64/mpam.c b/drivers/acpi/arm64/mpam.c
new file mode 100644
index 000000000000..f4791bac9a2a
--- /dev/null
+++ b/drivers/acpi/arm64/mpam.c
@@ -0,0 +1,365 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2025 Arm Ltd.
+
+/* Parse the MPAM ACPI table feeding the discovered nodes into the driver */
+
+#define pr_fmt(fmt) "ACPI MPAM: " fmt
+
+#include <linux/acpi.h>
+#include <linux/arm_mpam.h>
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
+#include <linux/platform_device.h>
+
+#include <acpi/processor.h>
+
+/* Flags for acpi_table_mpam_msc.*_interrupt_flags */
+#define ACPI_MPAM_MSC_IRQ_MODE_EDGE 1
+#define ACPI_MPAM_MSC_IRQ_TYPE_MASK (3 << 1)
+#define ACPI_MPAM_MSC_IRQ_TYPE_WIRED 0
+#define ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER BIT(3)
+#define ACPI_MPAM_MSC_IRQ_AFFINITY_VALID BIT(4)
+
+static bool frob_irq(struct platform_device *pdev, int intid, u32 flags,
+ int *irq, u32 processor_container_uid)
+{
+ int sense;
+
+ if (!intid)
+ return false;
+
+ /* 0 in this field indicates a wired interrupt */
+ if (flags & ACPI_MPAM_MSC_IRQ_TYPE_MASK)
+ return false;
+
+ if (flags & ACPI_MPAM_MSC_IRQ_MODE_EDGE)
+ sense = ACPI_EDGE_SENSITIVE;
+ else
+ sense = ACPI_LEVEL_SENSITIVE;
+
+ /*
+ * If the GSI is in the GIC's PPI range, try and create a partitioned
+ * percpu interrupt.
+ */
+ if (16 <= intid && intid < 32 && processor_container_uid != ~0) {
+ pr_err_once("Partitioned interrupts not supported\n");
+ return false;
+ }
+
+ *irq = acpi_register_gsi(&pdev->dev, intid, sense, ACPI_ACTIVE_HIGH);
+ if (*irq <= 0) {
+ pr_err_once("Failed to register interrupt 0x%x with ACPI\n",
+ intid);
+ return false;
+ }
+
+ return true;
+}
+
+static void acpi_mpam_parse_irqs(struct platform_device *pdev,
+ struct acpi_mpam_msc_node *tbl_msc,
+ struct resource *res, int *res_idx)
+{
+ u32 flags, aff = ~0;
+ int irq;
+
+ flags = tbl_msc->overflow_interrupt_flags;
+ if (flags & ACPI_MPAM_MSC_IRQ_AFFINITY_VALID &&
+ flags & ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER)
+ aff = tbl_msc->overflow_interrupt_affinity;
+ if (frob_irq(pdev, tbl_msc->overflow_interrupt, flags, &irq, aff)) {
+ res[*res_idx].start = irq;
+ res[*res_idx].end = irq;
+ res[*res_idx].flags = IORESOURCE_IRQ;
+ res[*res_idx].name = "overflow";
+
+ (*res_idx)++;
+ }
+
+ flags = tbl_msc->error_interrupt_flags;
+ if (flags & ACPI_MPAM_MSC_IRQ_AFFINITY_VALID &&
+ flags & ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER)
+ aff = tbl_msc->error_interrupt_affinity;
+ else
+ aff = ~0;
+ if (frob_irq(pdev, tbl_msc->error_interrupt, flags, &irq, aff)) {
+ res[*res_idx].start = irq;
+ res[*res_idx].end = irq;
+ res[*res_idx].flags = IORESOURCE_IRQ;
+ res[*res_idx].name = "error";
+
+ (*res_idx)++;
+ }
+}
+
+static int acpi_mpam_parse_resource(struct mpam_msc *msc,
+ struct acpi_mpam_resource_node *res)
+{
+ int level, nid;
+ u32 cache_id;
+
+ switch (res->locator_type) {
+ case ACPI_MPAM_LOCATION_TYPE_PROCESSOR_CACHE:
+ cache_id = res->locator.cache_locator.cache_reference;
+ level = find_acpi_cache_level_from_id(cache_id);
+ if (level < 0) {
+ pr_err_once("Bad level (%u) for cache with id %u\n", level, cache_id);
+ return -EINVAL;
+ }
+ return mpam_ris_create(msc, res->ris_index, MPAM_CLASS_CACHE,
+ level, cache_id);
+ case ACPI_MPAM_LOCATION_TYPE_MEMORY:
+ nid = pxm_to_node(res->locator.memory_locator.proximity_domain);
+ if (nid == NUMA_NO_NODE)
+ nid = 0;
+ return mpam_ris_create(msc, res->ris_index, MPAM_CLASS_MEMORY,
+ 255, nid);
+ default:
+ /* These get discovered later and treated as unknown */
+ return 0;
+ }
+}
+
+int acpi_mpam_parse_resources(struct mpam_msc *msc,
+ struct acpi_mpam_msc_node *tbl_msc)
+{
+ int i, err;
+ struct acpi_mpam_resource_node *resources;
+
+ resources = (struct acpi_mpam_resource_node *)(tbl_msc + 1);
+ for (i = 0; i < tbl_msc->num_resource_nodes; i++) {
+ err = acpi_mpam_parse_resource(msc, &resources[i]);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static bool __init parse_msc_pm_link(struct acpi_mpam_msc_node *tbl_msc,
+ struct platform_device *pdev,
+ u32 *acpi_id)
+{
+ bool acpi_id_valid = false;
+ struct acpi_device *buddy;
+ char hid[16], uid[16];
+ int err;
+
+ memset(&hid, 0, sizeof(hid));
+ memcpy(hid, &tbl_msc->hardware_id_linked_device,
+ sizeof(tbl_msc->hardware_id_linked_device));
+
+ if (!strcmp(hid, ACPI_PROCESSOR_CONTAINER_HID)) {
+ *acpi_id = tbl_msc->instance_id_linked_device;
+ acpi_id_valid = true;
+ }
+
+ err = snprintf(uid, sizeof(uid), "%u",
+ tbl_msc->instance_id_linked_device);
+ if (err < 0 || err >= sizeof(uid))
+ return acpi_id_valid;
+
+ buddy = acpi_dev_get_first_match_dev(hid, uid, -1);
+ if (buddy)
+ device_link_add(&pdev->dev, &buddy->dev, DL_FLAG_STATELESS);
+
+ return acpi_id_valid;
+}
+
+static int decode_interface_type(struct acpi_mpam_msc_node *tbl_msc,
+ enum mpam_msc_iface *iface)
+{
+ switch (tbl_msc->interface_type) {
+ case 0:
+ *iface = MPAM_IFACE_MMIO;
+ return 0;
+ case 0xa:
+ *iface = MPAM_IFACE_PCC;
+ return 0;
+ default:
+ return -EINVAL;
+ }
+}
+
+static int __init _parse_table(struct acpi_table_header *table)
+{
+ char *table_end, *table_offset = (char *)(table + 1);
+ struct property_entry props[4]; /* needs a sentinel */
+ struct acpi_mpam_msc_node *tbl_msc;
+ int next_res, next_prop, err = 0;
+ struct acpi_device *companion;
+ struct platform_device *pdev;
+ enum mpam_msc_iface iface;
+ struct resource res[3];
+ char uid[16];
+ u32 acpi_id;
+
+ table_end = (char *)table + table->length;
+
+ while (table_offset < table_end) {
+ tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
+ table_offset += tbl_msc->length;
+
+ /*
+ * If any of the reserved fields are set, make no attempt to
+ * parse the msc structure. This will prevent the driver from
+ * probing all the MSC, meaning it can't discover the system
+ * wide supported partid and pmg ranges. This avoids whatever
+ * this MSC is truncating the partids and creating a screaming
+ * error interrupt.
+ */
+ if (tbl_msc->reserved || tbl_msc->reserved1 || tbl_msc->reserved2)
+ continue;
+
+ if (decode_interface_type(tbl_msc, &iface))
+ continue;
+
+ next_res = 0;
+ next_prop = 0;
+ memset(res, 0, sizeof(res));
+ memset(props, 0, sizeof(props));
+
+ pdev = platform_device_alloc("mpam_msc", tbl_msc->identifier);
+ if (IS_ERR(pdev)) {
+ err = PTR_ERR(pdev);
+ break;
+ }
+
+ if (tbl_msc->length < sizeof(*tbl_msc)) {
+ err = -EINVAL;
+ break;
+ }
+
+ /* Some power management is described in the namespace: */
+ err = snprintf(uid, sizeof(uid), "%u", tbl_msc->identifier);
+ if (err > 0 && err < sizeof(uid)) {
+ companion = acpi_dev_get_first_match_dev("ARMHAA5C", uid, -1);
+ if (companion)
+ ACPI_COMPANION_SET(&pdev->dev, companion);
+ }
+
+ if (iface == MPAM_IFACE_MMIO) {
+ res[next_res].name = "MPAM:MSC";
+ res[next_res].start = tbl_msc->base_address;
+ res[next_res].end = tbl_msc->base_address + tbl_msc->mmio_size - 1;
+ res[next_res].flags = IORESOURCE_MEM;
+ next_res++;
+ } else if (iface == MPAM_IFACE_PCC) {
+ props[next_prop++] = PROPERTY_ENTRY_U32("pcc-channel",
+ tbl_msc->base_address);
+ next_prop++;
+ }
+
+ acpi_mpam_parse_irqs(pdev, tbl_msc, res, &next_res);
+ err = platform_device_add_resources(pdev, res, next_res);
+ if (err)
+ break;
+
+ props[next_prop++] = PROPERTY_ENTRY_U32("arm,not-ready-us",
+ tbl_msc->max_nrdy_usec);
+
+ /*
+ * The MSC's CPU affinity is described via its linked power
+ * management device, but only if it points at a Processor or
+ * Processor Container.
+ */
+ if (parse_msc_pm_link(tbl_msc, pdev, &acpi_id)) {
+ props[next_prop++] = PROPERTY_ENTRY_U32("cpu_affinity",
+ acpi_id);
+ }
+
+ err = device_create_managed_software_node(&pdev->dev, props,
+ NULL);
+ if (err)
+ break;
+
+ /* Come back later if you want the RIS too */
+ err = platform_device_add_data(pdev, tbl_msc, tbl_msc->length);
+ if (err)
+ break;
+
+ platform_device_add(pdev);
+ }
+
+ if (err)
+ platform_device_put(pdev);
+
+ return err;
+}
+
+static struct acpi_table_header *get_table(void)
+{
+ struct acpi_table_header *table;
+ acpi_status status;
+
+ if (acpi_disabled || !system_supports_mpam())
+ return NULL;
+
+ status = acpi_get_table(ACPI_SIG_MPAM, 0, &table);
+ if (ACPI_FAILURE(status))
+ return NULL;
+
+ if (table->revision != 1)
+ return NULL;
+
+ return table;
+}
+
+static int __init acpi_mpam_parse(void)
+{
+ struct acpi_table_header *mpam;
+ int err;
+
+ mpam = get_table();
+ if (!mpam)
+ return 0;
+
+ err = _parse_table(mpam);
+ acpi_put_table(mpam);
+
+ return err;
+}
+
+static int _count_msc(struct acpi_table_header *table)
+{
+ char *table_end, *table_offset = (char *)(table + 1);
+ struct acpi_mpam_msc_node *tbl_msc;
+ int ret = 0;
+
+ tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
+ table_end = (char *)table + table->length;
+
+ while (table_offset < table_end) {
+ if (tbl_msc->length < sizeof(*tbl_msc))
+ return -EINVAL;
+
+ ret++;
+
+ table_offset += tbl_msc->length;
+ tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
+ }
+
+ return ret;
+}
+
+int acpi_mpam_count_msc(void)
+{
+ struct acpi_table_header *mpam;
+ int ret;
+
+ mpam = get_table();
+ if (!mpam)
+ return 0;
+
+ ret = _count_msc(mpam);
+ acpi_put_table(mpam);
+
+ return ret;
+}
+
+/*
+ * Call after ACPI devices have been created, which happens behind acpi_scan_init()
+ * called from subsys_initcall(). PCC requires the mailbox driver, which is
+ * initialised from postcore_initcall().
+ */
+subsys_initcall_sync(acpi_mpam_parse);
diff --git a/drivers/acpi/tables.c b/drivers/acpi/tables.c
index fa9bb8c8ce95..835e3795ede3 100644
--- a/drivers/acpi/tables.c
+++ b/drivers/acpi/tables.c
@@ -408,7 +408,7 @@ static const char table_sigs[][ACPI_NAMESEG_SIZE] __nonstring_array __initconst
ACPI_SIG_PSDT, ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT,
ACPI_SIG_IORT, ACPI_SIG_NFIT, ACPI_SIG_HMAT, ACPI_SIG_PPTT,
ACPI_SIG_NHLT, ACPI_SIG_AEST, ACPI_SIG_CEDT, ACPI_SIG_AGDI,
- ACPI_SIG_NBFT };
+ ACPI_SIG_NBFT, ACPI_SIG_MPAM };
#define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h
new file mode 100644
index 000000000000..0edefa6ba019
--- /dev/null
+++ b/include/linux/arm_mpam.h
@@ -0,0 +1,46 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2025 Arm Ltd. */
+
+#ifndef __LINUX_ARM_MPAM_H
+#define __LINUX_ARM_MPAM_H
+
+#include <linux/acpi.h>
+#include <linux/types.h>
+
+struct mpam_msc;
+
+enum mpam_msc_iface {
+ MPAM_IFACE_MMIO, /* a real MPAM MSC */
+ MPAM_IFACE_PCC, /* a fake MPAM MSC */
+};
+
+enum mpam_class_types {
+ MPAM_CLASS_CACHE, /* Well known caches, e.g. L2 */
+ MPAM_CLASS_MEMORY, /* Main memory */
+ MPAM_CLASS_UNKNOWN, /* Everything else, e.g. SMMU */
+};
+
+#ifdef CONFIG_ACPI_MPAM
+/* Parse the ACPI description of resources entries for this MSC. */
+int acpi_mpam_parse_resources(struct mpam_msc *msc,
+ struct acpi_mpam_msc_node *tbl_msc);
+
+int acpi_mpam_count_msc(void);
+#else
+static inline int acpi_mpam_parse_resources(struct mpam_msc *msc,
+ struct acpi_mpam_msc_node *tbl_msc)
+{
+ return -EINVAL;
+}
+
+static inline int acpi_mpam_count_msc(void) { return -EINVAL; }
+#endif
+
+static inline int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
+ enum mpam_class_types type, u8 class_id,
+ int component_id)
+{
+ return -EINVAL;
+}
+
+#endif /* __LINUX_ARM_MPAM_H */
--
2.39.5
* [RFC PATCH 11/36] dt-bindings: arm: Add MPAM MSC binding
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (9 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 21:43 ` Rob Herring
2025-07-11 18:36 ` [RFC PATCH 12/36] platform: arm64: Move ec devices to an ec subdirectory James Morse
` (25 subsequent siblings)
36 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
From: Rob Herring <robh@kernel.org>
The binding is designed around the assumption that an MSC will be a
sub-block of something else such as a memory controller, cache controller,
or IOMMU. However, it's certainly possible a design does not have that
association or has a mixture of both, so the binding illustrates how we can
support that with RIS child nodes.
A key part of MPAM is that we need to know about all of the MSCs in the system
before it can be enabled. This drives the need for the genericish
'arm,mpam-msc' compatible. Though we can't assume an MSC is accessible
until a h/w specific driver potentially enables the h/w.
Cc: James Morse <james.morse@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: James Morse <james.morse@arm.com>
---
.../devicetree/bindings/arm/arm,mpam-msc.yaml | 227 ++++++++++++++++++
1 file changed, 227 insertions(+)
create mode 100644 Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml
diff --git a/Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml b/Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml
new file mode 100644
index 000000000000..9d542ecb1a7d
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml
@@ -0,0 +1,227 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/arm/arm,mpam-msc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Arm Memory System Resource Partitioning and Monitoring (MPAM)
+
+description: |
+ The Arm MPAM specification can be found here:
+
+ https://developer.arm.com/documentation/ddi0598/latest
+
+maintainers:
+ - Rob Herring <robh@kernel.org>
+
+properties:
+ compatible:
+ items:
+ - const: arm,mpam-msc # Further details are discoverable
+ - const: arm,mpam-memory-controller-msc
+
+ reg:
+ maxItems: 1
+ description: A memory region containing registers as defined in the MPAM
+ specification.
+
+ interrupts:
+ minItems: 1
+ items:
+ - description: error (optional)
+ - description: overflow (optional, only for monitoring)
+
+ interrupt-names:
+ oneOf:
+ - items:
+ - enum: [ error, overflow ]
+ - items:
+ - const: error
+ - const: overflow
+
+ arm,not-ready-us:
+ description: The maximum time in microseconds for monitoring data to be
+ accurate after a settings change. For more information, see the
+ Not-Ready (NRDY) bit description in the MPAM specification.
+
+ numa-node-id: true # see NUMA binding
+
+ '#address-cells':
+ const: 1
+
+ '#size-cells':
+ const: 0
+
+patternProperties:
+ '^ris@[0-9a-f]$':
+ type: object
+ additionalProperties: false
+ description: |
+ RIS nodes for each RIS in an MSC. These nodes are required for each RIS
+ implementing known MPAM controls.
+
+ properties:
+ compatible:
+ enum:
+ # Bulk storage for cache
+ - arm,mpam-cache
+ # Memory bandwidth
+ - arm,mpam-memory
+
+ reg:
+ minimum: 0
+ maximum: 0xf
+
+ cpus:
+ $ref: '/schemas/types.yaml#/definitions/phandle-array'
+ description:
+ Phandle(s) to the CPU node(s) this RIS belongs to. By default, the parent
+ device's affinity is used.
+
+ arm,mpam-device:
+ $ref: '/schemas/types.yaml#/definitions/phandle'
+ description:
+ By default, the MPAM enabled device associated with a RIS is the MSC's
+ parent node. It is possible for each RIS to be associated with different
+ devices in which case 'arm,mpam-device' should be used.
+
+ required:
+ - compatible
+ - reg
+
+required:
+ - compatible
+ - reg
+
+dependencies:
+ interrupts: [ interrupt-names ]
+
+additionalProperties: false
+
+examples:
+ - |
+ /*
+ cpus {
+ cpu@0 {
+ next-level-cache = <&L2_0>;
+ };
+ cpu@100 {
+ next-level-cache = <&L2_1>;
+ };
+ };
+ */
+ L2_0: cache-controller-0 {
+ compatible = "cache";
+ cache-level = <2>;
+ cache-unified;
+ next-level-cache = <&L3>;
+
+ };
+
+ L2_1: cache-controller-1 {
+ compatible = "cache";
+ cache-level = <2>;
+ cache-unified;
+ next-level-cache = <&L3>;
+
+ };
+
+ L3: cache-controller@30000000 {
+ compatible = "arm,dsu-l3-cache", "cache";
+ cache-level = <3>;
+ cache-unified;
+
+ ranges = <0x0 0x30000000 0x800000>;
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ msc@10000 {
+ compatible = "arm,mpam-msc";
+
+ /* CPU affinity implied by the parent cache node */
+ reg = <0x10000 0x2000>;
+ interrupts = <1>, <2>;
+ interrupt-names = "error", "overflow";
+ arm,not-ready-us = <1>;
+ };
+ };
+
+ mem: memory-controller@20000 {
+ compatible = "foo,a-memory-controller";
+ reg = <0x20000 0x1000>;
+
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges;
+
+ msc@21000 {
+ compatible = "arm,mpam-memory-controller-msc", "arm,mpam-msc";
+ reg = <0x21000 0x1000>;
+ interrupts = <3>;
+ interrupt-names = "error";
+ arm,not-ready-us = <1>;
+ numa-node-id = <1>;
+ };
+ };
+
+ iommu@40000 {
+ reg = <0x40000 0x1000>;
+
+ ranges;
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ msc@41000 {
+ compatible = "arm,mpam-msc";
+ reg = <0 0x1000>;
+ interrupts = <5>, <6>;
+ interrupt-names = "error", "overflow";
+ arm,not-ready-us = <1>;
+
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ ris@2 {
+ compatible = "arm,mpam-cache";
+ reg = <0>;
+ // TODO: How to map to device(s)?
+ };
+ };
+ };
+
+ msc@80000 {
+ compatible = "foo,a-standalone-msc";
+ reg = <0x80000 0x1000>;
+
+ clocks = <&clks 123>;
+
+ ranges;
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ msc@10000 {
+ compatible = "arm,mpam-msc";
+
+ reg = <0x10000 0x2000>;
+ interrupts = <7>;
+ interrupt-names = "overflow";
+ arm,not-ready-us = <1>;
+
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ ris@0 {
+ compatible = "arm,mpam-cache";
+ reg = <0>;
+ arm,mpam-device = <&L2_0>;
+ };
+
+ ris@1 {
+ compatible = "arm,mpam-memory";
+ reg = <1>;
+ arm,mpam-device = <&mem>;
+ };
+ };
+ };
+
+...
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 12/36] platform: arm64: Move ec devices to an ec subdirectory
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (10 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 11/36] dt-bindings: arm: Add MPAM MSC binding James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-21 16:32 ` Jonathan Cameron
2025-07-24 10:56 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 13/36] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate James Morse
` (24 subsequent siblings)
36 siblings, 2 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
commit 363c8aea257 ("platform: Add ARM64 platform directory") added a
subdirectory for arm64 platform devices, but claims that all such
devices must be 'EC like'.
The arm64 MPAM driver manages an MMIO interface that appears in memory
controllers, caches, IOMMUs and connection points on the interconnect.
It doesn't fit into any existing subsystem.
It would be convenient to use this subdirectory for drivers for other
arm64 platform devices which aren't closely coupled to the architecture
code and don't fit into any existing subsystem.
Move the existing code and maintainer entries to be under
drivers/platform/arm64/ec. The MPAM driver will be added under
drivers/platform/arm64/mpam.
Signed-off-by: James Morse <james.morse@arm.com>
---
MAINTAINERS | 6 +-
drivers/platform/arm64/Kconfig | 72 +-----------------
drivers/platform/arm64/Makefile | 9 +--
drivers/platform/arm64/ec/Kconfig | 73 +++++++++++++++++++
drivers/platform/arm64/ec/Makefile | 10 +++
.../platform/arm64/{ => ec}/acer-aspire1-ec.c | 0
.../arm64/{ => ec}/huawei-gaokun-ec.c | 0
.../arm64/{ => ec}/lenovo-yoga-c630.c | 0
8 files changed, 88 insertions(+), 82 deletions(-)
create mode 100644 drivers/platform/arm64/ec/Kconfig
create mode 100644 drivers/platform/arm64/ec/Makefile
rename drivers/platform/arm64/{ => ec}/acer-aspire1-ec.c (100%)
rename drivers/platform/arm64/{ => ec}/huawei-gaokun-ec.c (100%)
rename drivers/platform/arm64/{ => ec}/lenovo-yoga-c630.c (100%)
diff --git a/MAINTAINERS b/MAINTAINERS
index 4bac4ea21b64..bea01d413666 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -3549,15 +3549,15 @@ S: Maintained
F: arch/arm64/boot/Makefile
F: scripts/make_fit.py
-ARM64 PLATFORM DRIVERS
-M: Hans de Goede <hansg@kernel.org>
+ARM64 EC PLATFORM DRIVERS
+M: Hans de Goede <hdegoede@redhat.com>
M: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
R: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
L: platform-driver-x86@vger.kernel.org
S: Maintained
Q: https://patchwork.kernel.org/project/platform-driver-x86/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git
-F: drivers/platform/arm64/
+F: drivers/platform/arm64/ec
ARM64 PORT (AARCH64 ARCHITECTURE)
M: Catalin Marinas <catalin.marinas@arm.com>
diff --git a/drivers/platform/arm64/Kconfig b/drivers/platform/arm64/Kconfig
index 06288aebc559..1eb8ab0855e5 100644
--- a/drivers/platform/arm64/Kconfig
+++ b/drivers/platform/arm64/Kconfig
@@ -1,73 +1,3 @@
# SPDX-License-Identifier: GPL-2.0-only
-#
-# EC-like Drivers for aarch64 based devices.
-#
-menuconfig ARM64_PLATFORM_DEVICES
- bool "ARM64 Platform-Specific Device Drivers"
- depends on ARM64 || COMPILE_TEST
- default ARM64
- help
- Say Y here to get to see options for platform-specific device drivers
- for arm64 based devices, primarily EC-like device drivers.
- This option alone does not add any kernel code.
-
- If you say N, all options in this submenu will be skipped and disabled.
-
-if ARM64_PLATFORM_DEVICES
-
-config EC_ACER_ASPIRE1
- tristate "Acer Aspire 1 Embedded Controller driver"
- depends on ARCH_QCOM || COMPILE_TEST
- depends on I2C
- depends on DRM
- depends on POWER_SUPPLY
- depends on INPUT
- help
- Say Y here to enable the EC driver for the (Snapdragon-based)
- Acer Aspire 1 laptop. The EC handles battery and charging
- monitoring as well as some misc functions like the lid sensor
- and USB Type-C DP HPD events.
-
- This driver provides battery and AC status support for the mentioned
- laptop where this information is not properly exposed via the
- standard ACPI devices.
-
-config EC_HUAWEI_GAOKUN
- tristate "Huawei Matebook E Go Embedded Controller driver"
- depends on ARCH_QCOM || COMPILE_TEST
- depends on I2C
- depends on INPUT
- depends on HWMON
- select AUXILIARY_BUS
-
- help
- Say Y here to enable the EC driver for the Huawei Matebook E Go
- which is a sc8280xp-based 2-in-1 tablet. The driver handles battery
- (information, charge control) and USB Type-C DP HPD events as well
- as some misc functions like the lid sensor and temperature sensors,
- etc.
-
- This driver provides battery and AC status support for the mentioned
- laptop where this information is not properly exposed via the
- standard ACPI devices.
-
- Say M or Y here to include this support.
-
-config EC_LENOVO_YOGA_C630
- tristate "Lenovo Yoga C630 Embedded Controller driver"
- depends on ARCH_QCOM || COMPILE_TEST
- depends on I2C
- select AUXILIARY_BUS
- help
- Driver for the Embedded Controller in the Qualcomm Snapdragon-based
- Lenovo Yoga C630, which provides battery and power adapter
- information.
-
- This driver provides battery and AC status support for the mentioned
- laptop where this information is not properly exposed via the
- standard ACPI devices.
-
- Say M or Y here to include this support.
-
-endif # ARM64_PLATFORM_DEVICES
+source "drivers/platform/arm64/ec/Kconfig"
diff --git a/drivers/platform/arm64/Makefile b/drivers/platform/arm64/Makefile
index 46a99eba3264..ce840a8cf8cc 100644
--- a/drivers/platform/arm64/Makefile
+++ b/drivers/platform/arm64/Makefile
@@ -1,10 +1,3 @@
# SPDX-License-Identifier: GPL-2.0-only
-#
-# Makefile for linux/drivers/platform/arm64
-#
-# This dir should only include drivers for EC-like devices.
-#
-obj-$(CONFIG_EC_ACER_ASPIRE1) += acer-aspire1-ec.o
-obj-$(CONFIG_EC_HUAWEI_GAOKUN) += huawei-gaokun-ec.o
-obj-$(CONFIG_EC_LENOVO_YOGA_C630) += lenovo-yoga-c630.o
+obj-y += ec/
diff --git a/drivers/platform/arm64/ec/Kconfig b/drivers/platform/arm64/ec/Kconfig
new file mode 100644
index 000000000000..06288aebc559
--- /dev/null
+++ b/drivers/platform/arm64/ec/Kconfig
@@ -0,0 +1,73 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# EC-like Drivers for aarch64 based devices.
+#
+
+menuconfig ARM64_PLATFORM_DEVICES
+ bool "ARM64 Platform-Specific Device Drivers"
+ depends on ARM64 || COMPILE_TEST
+ default ARM64
+ help
+ Say Y here to get to see options for platform-specific device drivers
+ for arm64 based devices, primarily EC-like device drivers.
+ This option alone does not add any kernel code.
+
+ If you say N, all options in this submenu will be skipped and disabled.
+
+if ARM64_PLATFORM_DEVICES
+
+config EC_ACER_ASPIRE1
+ tristate "Acer Aspire 1 Embedded Controller driver"
+ depends on ARCH_QCOM || COMPILE_TEST
+ depends on I2C
+ depends on DRM
+ depends on POWER_SUPPLY
+ depends on INPUT
+ help
+ Say Y here to enable the EC driver for the (Snapdragon-based)
+ Acer Aspire 1 laptop. The EC handles battery and charging
+ monitoring as well as some misc functions like the lid sensor
+ and USB Type-C DP HPD events.
+
+ This driver provides battery and AC status support for the mentioned
+ laptop where this information is not properly exposed via the
+ standard ACPI devices.
+
+config EC_HUAWEI_GAOKUN
+ tristate "Huawei Matebook E Go Embedded Controller driver"
+ depends on ARCH_QCOM || COMPILE_TEST
+ depends on I2C
+ depends on INPUT
+ depends on HWMON
+ select AUXILIARY_BUS
+
+ help
+ Say Y here to enable the EC driver for the Huawei Matebook E Go
+ which is a sc8280xp-based 2-in-1 tablet. The driver handles battery
+ (information, charge control) and USB Type-C DP HPD events as well
+ as some misc functions like the lid sensor and temperature sensors,
+ etc.
+
+ This driver provides battery and AC status support for the mentioned
+ laptop where this information is not properly exposed via the
+ standard ACPI devices.
+
+ Say M or Y here to include this support.
+
+config EC_LENOVO_YOGA_C630
+ tristate "Lenovo Yoga C630 Embedded Controller driver"
+ depends on ARCH_QCOM || COMPILE_TEST
+ depends on I2C
+ select AUXILIARY_BUS
+ help
+ Driver for the Embedded Controller in the Qualcomm Snapdragon-based
+ Lenovo Yoga C630, which provides battery and power adapter
+ information.
+
+ This driver provides battery and AC status support for the mentioned
+ laptop where this information is not properly exposed via the
+ standard ACPI devices.
+
+ Say M or Y here to include this support.
+
+endif # ARM64_PLATFORM_DEVICES
diff --git a/drivers/platform/arm64/ec/Makefile b/drivers/platform/arm64/ec/Makefile
new file mode 100644
index 000000000000..b3a7c4096f08
--- /dev/null
+++ b/drivers/platform/arm64/ec/Makefile
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Makefile for linux/drivers/platform/arm64/ec
+#
+# This dir should only include drivers for EC-like devices.
+#
+
+obj-$(CONFIG_EC_ACER_ASPIRE1) += acer-aspire1-ec.o
+obj-$(CONFIG_EC_HUAWEI_GAOKUN) += huawei-gaokun-ec.o
+obj-$(CONFIG_EC_LENOVO_YOGA_C630) += lenovo-yoga-c630.o
diff --git a/drivers/platform/arm64/acer-aspire1-ec.c b/drivers/platform/arm64/ec/acer-aspire1-ec.c
similarity index 100%
rename from drivers/platform/arm64/acer-aspire1-ec.c
rename to drivers/platform/arm64/ec/acer-aspire1-ec.c
diff --git a/drivers/platform/arm64/huawei-gaokun-ec.c b/drivers/platform/arm64/ec/huawei-gaokun-ec.c
similarity index 100%
rename from drivers/platform/arm64/huawei-gaokun-ec.c
rename to drivers/platform/arm64/ec/huawei-gaokun-ec.c
diff --git a/drivers/platform/arm64/lenovo-yoga-c630.c b/drivers/platform/arm64/ec/lenovo-yoga-c630.c
similarity index 100%
rename from drivers/platform/arm64/lenovo-yoga-c630.c
rename to drivers/platform/arm64/ec/lenovo-yoga-c630.c
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 13/36] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (11 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 12/36] platform: arm64: Move ec devices to an ec subdirectory James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-24 11:02 ` Ben Horgan
2025-07-24 12:09 ` Catalin Marinas
2025-07-11 18:36 ` [RFC PATCH 14/36] arm_mpam: Add support for memory controller MSC on DT platforms James Morse
` (23 subsequent siblings)
36 siblings, 2 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
Probing MPAM is convoluted. MSCs that are integrated with CPUs may
only be accessible from those CPUs, and those CPUs may not be online.
Touching the hardware early is pointless, as MPAM can't be used until
the system-wide common values for num_partid and num_pmg have been
discovered.
Start with driver probe/remove and mapping the MSC.
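As a sketch of what the accessibility restriction means in practice
(illustrative only, not part of this patch: the example_* helpers are made
up, msc->accessibility and msc->mapped_hwpage are the fields added below,
and smp_call_function_any() is the existing kernel primitive from
<linux/smp.h>):

/* in mpam_devices.c, once the MSC has been mapped */
static void __example_peek_msc(void *arg)
{
	struct mpam_msc *msc = arg;

	/* Runs on a CPU in msc->accessibility, so the MMIO page is reachable */
	pr_debug("msc %d first word: 0x%x\n", msc->id,
		 readl_relaxed(msc->mapped_hwpage));
}

static void example_peek_msc(struct mpam_msc *msc)
{
	/* Pick any CPU the MSC is accessible from and read from there */
	smp_call_function_any(&msc->accessibility, __example_peek_msc,
			      msc, true);
}

The rest of the driver is expected to funnel hardware access through
helpers of roughly this shape.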
CC: Carl Worth <carl@os.amperecomputing.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
arch/arm64/Kconfig | 1 +
drivers/platform/arm64/Kconfig | 1 +
drivers/platform/arm64/Makefile | 1 +
drivers/platform/arm64/mpam/Kconfig | 10 +
drivers/platform/arm64/mpam/Makefile | 4 +
drivers/platform/arm64/mpam/mpam_devices.c | 336 ++++++++++++++++++++
drivers/platform/arm64/mpam/mpam_internal.h | 62 ++++
7 files changed, 415 insertions(+)
create mode 100644 drivers/platform/arm64/mpam/Kconfig
create mode 100644 drivers/platform/arm64/mpam/Makefile
create mode 100644 drivers/platform/arm64/mpam/mpam_devices.c
create mode 100644 drivers/platform/arm64/mpam/mpam_internal.h
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index ad9a49a39e41..8abce7f4eb1e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2060,6 +2060,7 @@ config ARM64_TLB_RANGE
config ARM64_MPAM
bool "Enable support for MPAM"
+ select ARM64_MPAM_DRIVER
select ACPI_MPAM if ACPI
help
Memory Partitioning and Monitoring is an optional extension
diff --git a/drivers/platform/arm64/Kconfig b/drivers/platform/arm64/Kconfig
index 1eb8ab0855e5..16a927cf6ea2 100644
--- a/drivers/platform/arm64/Kconfig
+++ b/drivers/platform/arm64/Kconfig
@@ -1,3 +1,4 @@
# SPDX-License-Identifier: GPL-2.0-only
source "drivers/platform/arm64/ec/Kconfig"
+source "drivers/platform/arm64/mpam/Kconfig"
diff --git a/drivers/platform/arm64/Makefile b/drivers/platform/arm64/Makefile
index ce840a8cf8cc..c6ec3bc6a100 100644
--- a/drivers/platform/arm64/Makefile
+++ b/drivers/platform/arm64/Makefile
@@ -1,3 +1,4 @@
# SPDX-License-Identifier: GPL-2.0-only
obj-y += ec/
+obj-y += mpam/
diff --git a/drivers/platform/arm64/mpam/Kconfig b/drivers/platform/arm64/mpam/Kconfig
new file mode 100644
index 000000000000..b63495d7da87
--- /dev/null
+++ b/drivers/platform/arm64/mpam/Kconfig
@@ -0,0 +1,10 @@
+# Confusingly, this is everything but the CPU bits of MPAM. CPU here means
+# CPU resources, not containers or cgroups etc.
+config ARM_CPU_RESCTRL
+ bool
+ depends on ARM64
+
+config ARM64_MPAM_DRIVER_DEBUG
+ bool "Enable debug messages from the MPAM driver."
+ help
+ Say yes here to enable debug messages from the MPAM driver.
diff --git a/drivers/platform/arm64/mpam/Makefile b/drivers/platform/arm64/mpam/Makefile
new file mode 100644
index 000000000000..4255975c7724
--- /dev/null
+++ b/drivers/platform/arm64/mpam/Makefile
@@ -0,0 +1,4 @@
+obj-$(CONFIG_ARM64_MPAM) += mpam.o
+mpam-y += mpam_devices.o
+
+cflags-$(CONFIG_ARM64_MPAM_DRIVER_DEBUG) += -DDEBUG
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
new file mode 100644
index 000000000000..5b886ba54ba8
--- /dev/null
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -0,0 +1,336 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2025 Arm Ltd.
+
+#define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__
+
+#include <linux/acpi.h>
+#include <linux/arm_mpam.h>
+#include <linux/cacheinfo.h>
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/errno.h>
+#include <linux/gfp.h>
+#include <linux/list.h>
+#include <linux/lockdep.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+#include <linux/printk.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/srcu.h>
+#include <linux/types.h>
+
+#include <acpi/pcc.h>
+
+#include "mpam_internal.h"
+
+/*
+ * mpam_list_lock protects the SRCU lists when writing. Once the
+ * mpam_enabled key is enabled these lists are read-only,
+ * unless the error interrupt disables the driver.
+ */
+static DEFINE_MUTEX(mpam_list_lock);
+static LIST_HEAD(mpam_all_msc);
+
+static struct srcu_struct mpam_srcu;
+
+/* MPAM isn't available until all the MSC have been probed. */
+static u32 mpam_num_msc;
+
+static void mpam_discovery_complete(void)
+{
+ pr_err("Discovered all MSC\n");
+}
+
+static int mpam_dt_count_msc(void)
+{
+ int count = 0;
+ struct device_node *np;
+
+ for_each_compatible_node(np, NULL, "arm,mpam-msc")
+ count++;
+
+ return count;
+}
+
+static int mpam_dt_parse_resource(struct mpam_msc *msc, struct device_node *np,
+ u32 ris_idx)
+{
+ int err = 0;
+ u32 level = 0;
+ unsigned long cache_id;
+ struct device_node *cache;
+
+ do {
+ if (of_device_is_compatible(np, "arm,mpam-cache")) {
+ cache = of_parse_phandle(np, "arm,mpam-device", 0);
+ if (!cache) {
+ pr_err("Failed to read phandle\n");
+ break;
+ }
+ } else if (of_device_is_compatible(np->parent, "cache")) {
+ cache = of_node_get(np->parent);
+ } else {
+ /* For now, only caches are supported */
+ cache = NULL;
+ break;
+ }
+
+ err = of_property_read_u32(cache, "cache-level", &level);
+ if (err) {
+ pr_err("Failed to read cache-level\n");
+ break;
+ }
+
+ cache_id = cache_of_calculate_id(cache);
+ if (cache_id == ~0UL) {
+ err = -ENOENT;
+ break;
+ }
+
+ err = mpam_ris_create(msc, ris_idx, MPAM_CLASS_CACHE, level,
+ cache_id);
+ } while (0);
+ of_node_put(cache);
+
+ return err;
+}
+
+static int mpam_dt_parse_resources(struct mpam_msc *msc, void *ignored)
+{
+ int err = 0, num_ris = 0;
+ const u32 *ris_idx_p;
+ struct device_node *iter, *np;
+
+ np = msc->pdev->dev.of_node;
+ for_each_child_of_node(np, iter) {
+ ris_idx_p = of_get_property(iter, "reg", NULL);
+ if (ris_idx_p) {
+ num_ris++;
+ err = mpam_dt_parse_resource(msc, iter, *ris_idx_p);
+ if (err) {
+ of_node_put(iter);
+ return err;
+ }
+ }
+ }
+
+ if (!num_ris)
+ err = mpam_dt_parse_resource(msc, np, 0);
+
+ return err;
+}
+
+/*
+ * An MSC can control traffic from a set of CPUs, but may only be accessible
+ * from a (hopefully wider) set of CPUs. The common reason for this is power
+ * management. If all the CPUs in a cluster are in PSCI:CPU_SUSPEND, the
+ * corresponding cache may also be powered off. By making accesses from
+ * one of those CPUs, we ensure this isn't the case.
+ */
+static int update_msc_accessibility(struct mpam_msc *msc)
+{
+ struct device_node *parent;
+ u32 affinity_id;
+ int err;
+
+ if (!acpi_disabled) {
+ err = device_property_read_u32(&msc->pdev->dev, "cpu_affinity",
+ &affinity_id);
+ if (err) {
+ cpumask_copy(&msc->accessibility, cpu_possible_mask);
+ err = 0;
+ } else {
+ err = acpi_pptt_get_cpus_from_container(affinity_id,
+ &msc->accessibility);
+ }
+
+ return err;
+ }
+
+ /* This depends on the path to of_node */
+ parent = of_get_parent(msc->pdev->dev.of_node);
+ if (parent == of_root) {
+ cpumask_copy(&msc->accessibility, cpu_possible_mask);
+ err = 0;
+ } else {
+ err = -EINVAL;
+ pr_err("Cannot determine accessibility of MSC: %s\n",
+ dev_name(&msc->pdev->dev));
+ }
+ of_node_put(parent);
+
+ return err;
+}
+
+static int fw_num_msc;
+
+static void mpam_pcc_rx_callback(struct mbox_client *cl, void *msg)
+{
+ /* TODO: wake up tasks blocked on this MSC's PCC channel */
+}
+
+static void mpam_msc_drv_remove(struct platform_device *pdev)
+{
+ struct mpam_msc *msc = platform_get_drvdata(pdev);
+
+ if (!msc)
+ return;
+
+ mutex_lock(&mpam_list_lock);
+ mpam_num_msc--;
+ platform_set_drvdata(pdev, NULL);
+ list_del_rcu(&msc->glbl_list);
+ synchronize_srcu(&mpam_srcu);
+ devm_kfree(&pdev->dev, msc);
+ mutex_unlock(&mpam_list_lock);
+}
+
+static int mpam_msc_drv_probe(struct platform_device *pdev)
+{
+ int err;
+ struct mpam_msc *msc;
+ struct resource *msc_res;
+ void *plat_data = pdev->dev.platform_data;
+
+ mutex_lock(&mpam_list_lock);
+ do {
+ msc = devm_kzalloc(&pdev->dev, sizeof(*msc), GFP_KERNEL);
+ if (!msc) {
+ err = -ENOMEM;
+ break;
+ }
+
+ mutex_init(&msc->probe_lock);
+ mutex_init(&msc->part_sel_lock);
+ mutex_init(&msc->outer_mon_sel_lock);
+ raw_spin_lock_init(&msc->inner_mon_sel_lock);
+ msc->id = mpam_num_msc++;
+ msc->pdev = pdev;
+ INIT_LIST_HEAD_RCU(&msc->glbl_list);
+ INIT_LIST_HEAD_RCU(&msc->ris);
+
+ err = update_msc_accessibility(msc);
+ if (err)
+ break;
+ if (cpumask_empty(&msc->accessibility)) {
+ pr_err_once("msc:%u is not accessible from any CPU!",
+ msc->id);
+ err = -EINVAL;
+ break;
+ }
+
+ if (device_property_read_u32(&pdev->dev, "pcc-channel",
+ &msc->pcc_subspace_id))
+ msc->iface = MPAM_IFACE_MMIO;
+ else
+ msc->iface = MPAM_IFACE_PCC;
+
+ if (msc->iface == MPAM_IFACE_MMIO) {
+ void __iomem *io;
+
+ io = devm_platform_get_and_ioremap_resource(pdev, 0,
+ &msc_res);
+ if (IS_ERR(io)) {
+ pr_err("Failed to map MSC base address\n");
+ err = PTR_ERR(io);
+ break;
+ }
+ msc->mapped_hwpage_sz = msc_res->end - msc_res->start;
+ msc->mapped_hwpage = io;
+ } else if (msc->iface == MPAM_IFACE_PCC) {
+ msc->pcc_cl.dev = &pdev->dev;
+ msc->pcc_cl.rx_callback = mpam_pcc_rx_callback;
+ msc->pcc_cl.tx_block = false;
+ msc->pcc_cl.tx_tout = 1000; /* 1s */
+ msc->pcc_cl.knows_txdone = false;
+
+ msc->pcc_chan = pcc_mbox_request_channel(&msc->pcc_cl,
+ msc->pcc_subspace_id);
+ if (IS_ERR(msc->pcc_chan)) {
+ pr_err("Failed to request MSC PCC channel\n");
+ err = PTR_ERR(msc->pcc_chan);
+ break;
+ }
+ }
+
+ list_add_rcu(&msc->glbl_list, &mpam_all_msc);
+ platform_set_drvdata(pdev, msc);
+ } while (0);
+ mutex_unlock(&mpam_list_lock);
+
+ if (!err) {
+ /* Create RIS entries described by firmware */
+ if (!acpi_disabled)
+ err = acpi_mpam_parse_resources(msc, plat_data);
+ else
+ err = mpam_dt_parse_resources(msc, plat_data);
+ }
+
+ if (!err && fw_num_msc == mpam_num_msc)
+ mpam_discovery_complete();
+
+ if (err && msc)
+ mpam_msc_drv_remove(pdev);
+
+ return err;
+}
+
+static const struct of_device_id mpam_of_match[] = {
+ { .compatible = "arm,mpam-msc", },
+ {},
+};
+MODULE_DEVICE_TABLE(of, mpam_of_match);
+
+static struct platform_driver mpam_msc_driver = {
+ .driver = {
+ .name = "mpam_msc",
+ .of_match_table = of_match_ptr(mpam_of_match),
+ },
+ .probe = mpam_msc_drv_probe,
+ .remove = mpam_msc_drv_remove,
+};
+
+/*
+ * MSCs that are hidden under caches are not created as platform devices
+ * as there is no cache driver. Caches are also special-cased in
+ * update_msc_accessibility().
+ */
+static void mpam_dt_create_foundling_msc(void)
+{
+ int err;
+ struct device_node *cache;
+
+ for_each_compatible_node(cache, NULL, "cache") {
+ err = of_platform_populate(cache, mpam_of_match, NULL, NULL);
+ if (err)
+ pr_err("Failed to create MSC devices under caches\n");
+ }
+}
+
+static int __init mpam_msc_driver_init(void)
+{
+ if (!system_supports_mpam())
+ return -EOPNOTSUPP;
+
+ init_srcu_struct(&mpam_srcu);
+
+ if (!acpi_disabled)
+ fw_num_msc = acpi_mpam_count_msc();
+ else
+ fw_num_msc = mpam_dt_count_msc();
+
+ if (fw_num_msc <= 0) {
+ pr_err("No MSC devices found in firmware\n");
+ return -EINVAL;
+ }
+
+ if (acpi_disabled)
+ mpam_dt_create_foundling_msc();
+
+ return platform_driver_register(&mpam_msc_driver);
+}
+subsys_initcall(mpam_msc_driver_init);
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
new file mode 100644
index 000000000000..07e0f240eaca
--- /dev/null
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+// Copyright (C) 2024 Arm Ltd.
+
+#ifndef MPAM_INTERNAL_H
+#define MPAM_INTERNAL_H
+
+#include <linux/arm_mpam.h>
+#include <linux/cpumask.h>
+#include <linux/io.h>
+#include <linux/mailbox_client.h>
+#include <linux/mutex.h>
+#include <linux/resctrl.h>
+#include <linux/sizes.h>
+
+struct mpam_msc {
+ /* member of mpam_all_msc */
+ struct list_head glbl_list;
+
+ int id;
+ struct platform_device *pdev;
+
+ /* Not modified after mpam_is_enabled() becomes true */
+ enum mpam_msc_iface iface;
+ u32 pcc_subspace_id;
+ struct mbox_client pcc_cl;
+ struct pcc_mbox_chan *pcc_chan;
+ u32 nrdy_usec;
+ cpumask_t accessibility;
+
+ /*
+ * probe_lock is only taken during discovery. After discovery these
+ * properties become read-only and the lists are protected by SRCU.
+ */
+ struct mutex probe_lock;
+ unsigned long ris_idxs[128 / BITS_PER_LONG];
+ u32 ris_max;
+
+ /* mpam_msc_ris of this component */
+ struct list_head ris;
+
+ /*
+ * part_sel_lock protects access to the MSC hardware registers that are
+ * affected by MPAMCFG_PART_SEL. (including the ID registers that vary
+ * by RIS).
+ * If needed, take msc->lock first.
+ */
+ struct mutex part_sel_lock;
+
+ /*
+ * mon_sel_lock protects access to the MSC hardware registers that are
+ * affected by MPAMCFG_MON_SEL.
+ * If needed, take msc->lock first.
+ */
+ struct mutex outer_mon_sel_lock;
+ raw_spinlock_t inner_mon_sel_lock;
+ unsigned long inner_mon_sel_flags;
+
+ void __iomem *mapped_hwpage;
+ size_t mapped_hwpage_sz;
+};
+
+#endif /* MPAM_INTERNAL_H */
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 14/36] arm_mpam: Add support for memory controller MSC on DT platforms
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (12 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 13/36] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 15/36] arm_mpam: Add the class and component structures for ris firmware described James Morse
` (22 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
From: Shanker Donthineni <sdonthineni@nvidia.com>
The device-tree binding has two examples of MSCs associated with
memory controllers. Add support to discover the component_id
from the device-tree and create 'memory' RIS.
Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
[ morse: split out of a bigger patch, added affinity piece ]
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 67 +++++++++++++++-------
1 file changed, 47 insertions(+), 20 deletions(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 5b886ba54ba8..f5abd5f0d41a 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -60,41 +60,63 @@ static int mpam_dt_parse_resource(struct mpam_msc *msc, struct device_node *np,
u32 ris_idx)
{
int err = 0;
- u32 level = 0;
- unsigned long cache_id;
- struct device_node *cache;
+ u32 class_id = 0, component_id = 0;
+ struct device_node *cache = NULL, *memory = NULL;
+ enum mpam_class_types type = MPAM_CLASS_UNKNOWN;
do {
+ /* What kind of MSC is this? */
if (of_device_is_compatible(np, "arm,mpam-cache")) {
cache = of_parse_phandle(np, "arm,mpam-device", 0);
if (!cache) {
pr_err("Failed to read phandle\n");
break;
}
+ type = MPAM_CLASS_CACHE;
} else if (of_device_is_compatible(np->parent, "cache")) {
cache = of_node_get(np->parent);
+ type = MPAM_CLASS_CACHE;
+ } else if (of_device_is_compatible(np, "arm,mpam-memory")) {
+ memory = of_parse_phandle(np, "arm,mpam-device", 0);
+ if (!memory) {
+ pr_err("Failed to read phandle\n");
+ break;
+ }
+ type = MPAM_CLASS_MEMORY;
+ } else if (of_device_is_compatible(np, "arm,mpam-memory-controller-msc")) {
+ memory = of_node_get(np->parent);
+ type = MPAM_CLASS_MEMORY;
} else {
- /* For now, only caches are supported */
- cache = NULL;
+ /*
+ * For now, only caches and memory controllers are
+ * supported.
+ */
break;
}
- err = of_property_read_u32(cache, "cache-level", &level);
- if (err) {
- pr_err("Failed to read cache-level\n");
- break;
- }
-
- cache_id = cache_of_calculate_id(cache);
- if (cache_id == ~0UL) {
- err = -ENOENT;
- break;
+ /* Determine the class and component ids, based on type. */
+ if (type == MPAM_CLASS_CACHE) {
+ err = of_property_read_u32(cache, "cache-level", &class_id);
+ if (err) {
+ pr_err("Failed to read cache-level\n");
+ break;
+ }
+ component_id = cache_of_calculate_id(cache);
+ if (component_id == ~0UL) {
+ err = -ENOENT;
+ break;
+ }
+ } else if (type == MPAM_CLASS_MEMORY) {
+ err = of_node_to_nid(np);
+ component_id = (err == NUMA_NO_NODE) ? 0 : err;
+ class_id = 255;
}
- err = mpam_ris_create(msc, ris_idx, MPAM_CLASS_CACHE, level,
- cache_id);
+ err = mpam_ris_create(msc, ris_idx, type, class_id,
+ component_id);
} while (0);
of_node_put(cache);
+ of_node_put(memory);
return err;
}
@@ -157,9 +179,14 @@ static int update_msc_accessibility(struct mpam_msc *msc)
cpumask_copy(&msc->accessibility, cpu_possible_mask);
err = 0;
} else {
- err = -EINVAL;
- pr_err("Cannot determine accessibility of MSC: %s\n",
- dev_name(&msc->pdev->dev));
+ if (of_device_is_compatible(parent, "memory")) {
+ cpumask_copy(&msc->accessibility, cpu_possible_mask);
+ err = 0;
+ } else {
+ err = -EINVAL;
+ pr_err("Cannot determine accessibility of MSC: %s\n",
+ dev_name(&msc->pdev->dev));
+ }
}
of_node_put(parent);
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 15/36] arm_mpam: Add the class and component structures for ris firmware described
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (13 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 14/36] arm_mpam: Add support for memory controller MSC on DT platforms James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 16/36] arm_mpam: Add MPAM MSC register layout definitions James Morse
` (21 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
An MSC is a container of resources, each identified by its RIS index.
Some RIS are described by firmware to provide their position in the system.
Others are discovered when the driver probes the hardware.
To configure a resource it needs to be found by its class, e.g. 'L2'.
There are two kinds of grouping. A class is a set of components, which
are visible to user-space as there are likely to be multiple instances
of the L2 cache (e.g. one per cluster or package).
A struct mpam_component is a set of struct mpam_vmsc. A vMSC groups the
RIS in an MSC that control the same logical piece of hardware (e.g. the L2).
This allows hardware implementations where two controls are presented
as different RIS; re-combining these RIS allows their feature bits to
be or-ed. This structure is not visible outside mpam_devices.c.
A struct mpam_vmsc is in turn a set of struct mpam_msc_ris, which are not
visible, as each L2 cache may be composed of individual slices that need
to be configured identically because the hardware is not able to
distribute the configuration.
Add support for creating and destroying these structures.
A gfp is passed as the structures may need creating when a new RIS entry
is discovered while probing the MSC.
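For reference, a sketch of how the resulting hierarchy can be walked by a
reader (illustrative only, not part of this patch: the example_* function
is made up; mpam_classes, mpam_srcu and the list members are those added
below, and list_for_each_entry_srcu()/srcu_read_lock_held() are the
standard helpers from <linux/rculist.h> and <linux/srcu.h>):

/* in mpam_devices.c, after the structures below exist */
static void example_walk_classes(void)
{
	int idx;
	struct mpam_class *class;
	struct mpam_component *comp;
	struct mpam_vmsc *vmsc;
	struct mpam_msc_ris *ris;

	idx = srcu_read_lock(&mpam_srcu);
	list_for_each_entry_srcu(class, &mpam_classes, classes_list,
				 srcu_read_lock_held(&mpam_srcu)) {
		list_for_each_entry_srcu(comp, &class->components, class_list,
					 srcu_read_lock_held(&mpam_srcu)) {
			list_for_each_entry_srcu(vmsc, &comp->vmsc, comp_list,
						 srcu_read_lock_held(&mpam_srcu)) {
				list_for_each_entry_srcu(ris, &vmsc->ris, vmsc_list,
							 srcu_read_lock_held(&mpam_srcu)) {
					pr_debug("type %d level %u comp %u msc %d ris %u\n",
						 class->type, class->level,
						 comp->comp_id, vmsc->msc->id,
						 ris->ris_idx);
				}
			}
		}
	}
	srcu_read_unlock(&mpam_srcu, idx);
}

Writers take mpam_list_lock instead, as described in the comments added to
mpam_devices.c.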
CC: Ben Horgan <ben.horgan@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 490 +++++++++++++++++++-
drivers/platform/arm64/mpam/mpam_internal.h | 91 ++++
include/linux/arm_mpam.h | 8 +-
3 files changed, 576 insertions(+), 13 deletions(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index f5abd5f0d41a..0d6d5180903b 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -20,7 +20,6 @@
#include <linux/printk.h>
#include <linux/slab.h>
#include <linux/spinlock.h>
-#include <linux/srcu.h>
#include <linux/types.h>
#include <acpi/pcc.h>
@@ -35,11 +34,485 @@
static DEFINE_MUTEX(mpam_list_lock);
static LIST_HEAD(mpam_all_msc);
-static struct srcu_struct mpam_srcu;
+struct srcu_struct mpam_srcu;
/* MPAM isn't available until all the MSC have been probed. */
static u32 mpam_num_msc;
+/*
+ * An MSC is a physical container for controls and monitors, each identified by
+ * their RIS index. These share a base-address, interrupts and some MMIO
+ * registers. A vMSC is a virtual container for RIS in an MSC that control or
+ * monitor the same thing. Members of a vMSC are all RIS in the same MSC, but
+ * not all RIS in an MSC share a vMSC.
+ * Components are a group of vMSC that control or monitor the same thing but
+ * are from different MSC, so have different base-address, interrupts etc.
+ * Classes are the set of components of the same type.
+ *
+ * The features of a vMSC are the union of the RIS it contains.
+ * The features of a Class and Component are the common subset of the vMSC
+ * they contain.
+ *
+ * e.g. The system cache may have bandwidth controls on multiple interfaces,
+ * for regulating traffic from devices independently of traffic from CPUs.
+ * If these are two RIS in one MSC, they will be treated as controlling
+ * different things, and will not share a vMSC/component/class.
+ *
+ * e.g. The L2 may have one MSC and two RIS, one for cache-controls another
+ * for bandwidth. These two RIS are members of the same vMSC.
+ *
+ * e.g. The set of RIS that make up the L2 are grouped as a component. These
+ * are sometimes termed slices. They should be configured the same, as if there
+ * were only one.
+ *
+ * e.g. The SoC probably has more than one L2, each attached to a distinct set
+ * of CPUs. All the L2 components are grouped as a class.
+ *
+ * When creating an MSC, struct mpam_msc is added to the mpam_all_msc list,
+ * then linked via struct mpam_msc_ris to a vmsc, component and class.
+ * The same MSC may exist under different class->component->vmsc paths, but the
+ * RIS index will be unique.
+ */
+LIST_HEAD(mpam_classes);
+
+/* List of all objects that can be free()d after synchronise_srcu() */
+static LLIST_HEAD(mpam_garbage);
+
+#define init_garbage(x) init_llist_node(&(x)->garbage.llist)
+
+static struct mpam_vmsc *
+mpam_vmsc_alloc(struct mpam_component *comp, struct mpam_msc *msc, gfp_t gfp)
+{
+ struct mpam_vmsc *vmsc;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ vmsc = kzalloc(sizeof(*vmsc), gfp);
+ if (!vmsc)
+ return ERR_PTR(-ENOMEM);
+ init_garbage(vmsc);
+
+ INIT_LIST_HEAD_RCU(&vmsc->ris);
+ INIT_LIST_HEAD_RCU(&vmsc->comp_list);
+ vmsc->comp = comp;
+ vmsc->msc = msc;
+
+ list_add_rcu(&vmsc->comp_list, &comp->vmsc);
+
+ return vmsc;
+}
+
+static struct mpam_vmsc *mpam_vmsc_get(struct mpam_component *comp,
+ struct mpam_msc *msc, bool alloc,
+ gfp_t gfp)
+{
+ struct mpam_vmsc *vmsc;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ list_for_each_entry(vmsc, &comp->vmsc, comp_list) {
+ if (vmsc->msc->id == msc->id)
+ return vmsc;
+ }
+
+ if (!alloc)
+ return ERR_PTR(-ENOENT);
+
+ return mpam_vmsc_alloc(comp, msc, gfp);
+}
+
+static struct mpam_component *
+mpam_component_alloc(struct mpam_class *class, int id, gfp_t gfp)
+{
+ struct mpam_component *comp;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ comp = kzalloc(sizeof(*comp), gfp);
+ if (!comp)
+ return ERR_PTR(-ENOMEM);
+ init_garbage(comp);
+
+ comp->comp_id = id;
+ INIT_LIST_HEAD_RCU(&comp->vmsc);
+ /* affinity is updated when ris are added */
+ INIT_LIST_HEAD_RCU(&comp->class_list);
+ comp->class = class;
+
+ list_add_rcu(&comp->class_list, &class->components);
+
+ return comp;
+}
+
+static struct mpam_component *
+mpam_component_get(struct mpam_class *class, int id, bool alloc, gfp_t gfp)
+{
+ struct mpam_component *comp;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ list_for_each_entry(comp, &class->components, class_list) {
+ if (comp->comp_id == id)
+ return comp;
+ }
+
+ if (!alloc)
+ return ERR_PTR(-ENOENT);
+
+ return mpam_component_alloc(class, id, gfp);
+}
+
+static struct mpam_class *
+mpam_class_alloc(u8 level_idx, enum mpam_class_types type, gfp_t gfp)
+{
+ struct mpam_class *class;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ class = kzalloc(sizeof(*class), gfp);
+ if (!class)
+ return ERR_PTR(-ENOMEM);
+ init_garbage(class);
+
+ INIT_LIST_HEAD_RCU(&class->components);
+ /* affinity is updated when ris are added */
+ class->level = level_idx;
+ class->type = type;
+ INIT_LIST_HEAD_RCU(&class->classes_list);
+
+ list_add_rcu(&class->classes_list, &mpam_classes);
+
+ return class;
+}
+
+static struct mpam_class *
+mpam_class_get(u8 level_idx, enum mpam_class_types type, bool alloc, gfp_t gfp)
+{
+ bool found = false;
+ struct mpam_class *class;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ list_for_each_entry(class, &mpam_classes, classes_list) {
+ if (class->type == type && class->level == level_idx) {
+ found = true;
+ break;
+ }
+ }
+
+ if (found)
+ return class;
+
+ if (!alloc)
+ return ERR_PTR(-ENOENT);
+
+ return mpam_class_alloc(level_idx, type, gfp);
+}
+
+#define add_to_garbage(x) \
+do { \
+ __typeof__(x) _x = x; \
+ (_x)->garbage.to_free = (_x); \
+ llist_add(&(_x)->garbage.llist, &mpam_garbage); \
+} while (0)
+
+static void mpam_class_destroy(struct mpam_class *class)
+{
+ lockdep_assert_held(&mpam_list_lock);
+
+ list_del_rcu(&class->classes_list);
+ add_to_garbage(class);
+}
+
+static void mpam_comp_destroy(struct mpam_component *comp)
+{
+ struct mpam_class *class = comp->class;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ list_del_rcu(&comp->class_list);
+ add_to_garbage(comp);
+
+ if (list_empty(&class->components))
+ mpam_class_destroy(class);
+}
+
+static void mpam_vmsc_destroy(struct mpam_vmsc *vmsc)
+{
+ struct mpam_component *comp = vmsc->comp;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ list_del_rcu(&vmsc->comp_list);
+ add_to_garbage(vmsc);
+
+ if (list_empty(&comp->vmsc))
+ mpam_comp_destroy(comp);
+}
+
+static void mpam_ris_destroy(struct mpam_msc_ris *ris)
+{
+ struct mpam_vmsc *vmsc = ris->vmsc;
+ struct mpam_msc *msc = vmsc->msc;
+ struct platform_device *pdev = msc->pdev;
+ struct mpam_component *comp = vmsc->comp;
+ struct mpam_class *class = comp->class;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ cpumask_andnot(&comp->affinity, &comp->affinity, &ris->affinity);
+ cpumask_andnot(&class->affinity, &class->affinity, &ris->affinity);
+ clear_bit(ris->ris_idx, msc->ris_idxs);
+ list_del_rcu(&ris->vmsc_list);
+ list_del_rcu(&ris->msc_list);
+ add_to_garbage(ris);
+ ris->garbage.pdev = pdev;
+
+ if (list_empty(&vmsc->ris))
+ mpam_vmsc_destroy(vmsc);
+}
+
+/*
+ * There are two ways of reaching a struct mpam_msc_ris. Via the
+ * class->component->vmsc->ris, or via the msc.
+ * When destroying the msc, the other side needs unlinking and cleaning up too.
+ */
+static void mpam_msc_destroy(struct mpam_msc *msc)
+{
+ struct platform_device *pdev = msc->pdev;
+ struct mpam_msc_ris *ris, *tmp;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ list_del_rcu(&msc->glbl_list);
+ platform_set_drvdata(pdev, NULL);
+
+ list_for_each_entry_safe(ris, tmp, &msc->ris, msc_list)
+ mpam_ris_destroy(ris);
+
+ add_to_garbage(msc);
+ msc->garbage.pdev = pdev;
+}
+
+static void mpam_free_garbage(void)
+{
+ struct mpam_garbage *iter, *tmp;
+ struct llist_node *to_free = llist_del_all(&mpam_garbage);
+
+ if (!to_free)
+ return;
+
+ synchronize_srcu(&mpam_srcu);
+
+ llist_for_each_entry_safe(iter, tmp, to_free, llist) {
+ if (iter->pdev)
+ devm_kfree(&iter->pdev->dev, iter->to_free);
+ else
+ kfree(iter->to_free);
+ }
+}
+
+/* Called recursively to walk the list of caches from a particular CPU */
+static void __mpam_get_cpumask_from_cache_id(int cpu, struct device_node *cache_node,
+ unsigned long cache_id,
+ u32 cache_level,
+ cpumask_t *affinity)
+{
+ int err;
+ u32 iter_level;
+ unsigned long iter_cache_id;
+ struct device_node *iter_node __free(device_node) = of_find_next_cache_node(cache_node);
+
+ if (!iter_node) {
+ pr_err("cpu %u next_cache_node returned NULL\n", cpu);
+ return;
+ }
+
+ err = of_property_read_u32(iter_node, "cache-level", &iter_level);
+ if (err)
+ return;
+
+ /*
+ * get_cpu_cacheinfo_id() isn't ready until sometime
+ * during device_initcall(). Use cache_of_calculate_id().
+ */
+ iter_cache_id = cache_of_calculate_id(iter_node);
+ if (iter_cache_id == ~0UL)
+ return;
+
+ if (iter_level == cache_level && iter_cache_id == cache_id)
+ cpumask_set_cpu(cpu, affinity);
+
+ __mpam_get_cpumask_from_cache_id(cpu, iter_node, cache_id, cache_level,
+ affinity);
+}
+
+/*
+ * The cacheinfo structures are only populated when CPUs are online.
+ * This helper walks the device tree to include offline CPUs too.
+ */
+int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
+ cpumask_t *affinity)
+{
+ int cpu;
+
+ if (!acpi_disabled)
+ return acpi_pptt_get_cpumask_from_cache_id(cache_id, affinity);
+
+ for_each_possible_cpu(cpu) {
+ struct device_node *cpu_node __free(device_node) = of_get_cpu_node(cpu, NULL);
+ if (!cpu_node) {
+ pr_err("Failed to find cpu%d device node\n", cpu);
+ return -ENOENT;
+ }
+
+ __mpam_get_cpumask_from_cache_id(cpu, cpu_node, cache_id,
+ cache_level, affinity);
+ continue;
+ }
+
+ return 0;
+}
+
+/*
+ * cpumask_of_node() only knows about online CPUs. This can't tell us whether
+ * a class is represented on all possible CPUs.
+ */
+static void get_cpumask_from_node_id(u32 node_id, cpumask_t *affinity)
+{
+ int cpu;
+
+ for_each_possible_cpu(cpu) {
+ if (node_id == cpu_to_node(cpu))
+ cpumask_set_cpu(cpu, affinity);
+ }
+}
+
+static int get_cpumask_from_cache(struct device_node *cache,
+ cpumask_t *affinity)
+{
+ int err;
+ u32 cache_level;
+ unsigned long cache_id;
+
+ err = of_property_read_u32(cache, "cache-level", &cache_level);
+ if (err) {
+ pr_err("Failed to read cache-level from cache node\n");
+ return -ENOENT;
+ }
+
+ cache_id = cache_of_calculate_id(cache);
+ if (cache_id == ~0UL) {
+ pr_err("Failed to calculate cache-id from cache node\n");
+ return -ENOENT;
+ }
+
+ return mpam_get_cpumask_from_cache_id(cache_id, cache_level, affinity);
+}
+
+static int mpam_ris_get_affinity(struct mpam_msc *msc, cpumask_t *affinity,
+ enum mpam_class_types type,
+ struct mpam_class *class,
+ struct mpam_component *comp)
+{
+ int err;
+
+ switch (type) {
+ case MPAM_CLASS_CACHE:
+ err = mpam_get_cpumask_from_cache_id(comp->comp_id, class->level,
+ affinity);
+ if (err)
+ return err;
+
+ if (cpumask_empty(affinity))
+ pr_warn_once("%s no CPUs associated with cache node",
+ dev_name(&msc->pdev->dev));
+
+ break;
+ case MPAM_CLASS_MEMORY:
+ get_cpumask_from_node_id(comp->comp_id, affinity);
+ /* affinity may be empty for CPU-less memory nodes */
+ break;
+ case MPAM_CLASS_UNKNOWN:
+ return 0;
+ }
+
+ cpumask_and(affinity, affinity, &msc->accessibility);
+
+ return 0;
+}
+
+static int mpam_ris_create_locked(struct mpam_msc *msc, u8 ris_idx,
+ enum mpam_class_types type, u8 class_id,
+ int component_id, gfp_t gfp)
+{
+ int err;
+ struct mpam_vmsc *vmsc;
+ struct mpam_msc_ris *ris;
+ struct mpam_class *class;
+ struct mpam_component *comp;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ if (test_and_set_bit(ris_idx, msc->ris_idxs))
+ return -EBUSY;
+
+ ris = devm_kzalloc(&msc->pdev->dev, sizeof(*ris), gfp);
+ if (!ris)
+ return -ENOMEM;
+ init_garbage(ris);
+
+ class = mpam_class_get(class_id, type, true, gfp);
+ if (IS_ERR(class))
+ return PTR_ERR(class);
+
+ comp = mpam_component_get(class, component_id, true, gfp);
+ if (IS_ERR(comp)) {
+ if (list_empty(&class->components))
+ mpam_class_destroy(class);
+ return PTR_ERR(comp);
+ }
+
+ vmsc = mpam_vmsc_get(comp, msc, true, gfp);
+ if (IS_ERR(vmsc)) {
+ if (list_empty(&comp->vmsc))
+ mpam_comp_destroy(comp);
+ return PTR_ERR(vmsc);
+ }
+
+ err = mpam_ris_get_affinity(msc, &ris->affinity, type, class, comp);
+ if (err) {
+ if (list_empty(&vmsc->ris))
+ mpam_vmsc_destroy(vmsc);
+ return err;
+ }
+
+ ris->ris_idx = ris_idx;
+ INIT_LIST_HEAD_RCU(&ris->vmsc_list);
+ ris->vmsc = vmsc;
+
+ cpumask_or(&comp->affinity, &comp->affinity, &ris->affinity);
+ cpumask_or(&class->affinity, &class->affinity, &ris->affinity);
+ list_add_rcu(&ris->vmsc_list, &vmsc->ris);
+
+ return 0;
+}
+
+int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
+ enum mpam_class_types type, u8 class_id, int component_id)
+{
+ int err;
+
+ mutex_lock(&mpam_list_lock);
+ err = mpam_ris_create_locked(msc, ris_idx, type, class_id,
+ component_id, GFP_KERNEL);
+ mutex_unlock(&mpam_list_lock);
+ if (err)
+ mpam_free_garbage();
+
+ return err;
+}
+
static void mpam_discovery_complete(void)
{
pr_err("Discovered all MSC\n");
@@ -179,7 +652,10 @@ static int update_msc_accessibility(struct mpam_msc *msc)
cpumask_copy(&msc->accessibility, cpu_possible_mask);
err = 0;
} else {
- if (of_device_is_compatible(parent, "memory")) {
+ if (of_device_is_compatible(parent, "cache")) {
+ err = get_cpumask_from_cache(parent,
+ &msc->accessibility);
+ } else if (of_device_is_compatible(parent, "memory")) {
cpumask_copy(&msc->accessibility, cpu_possible_mask);
err = 0;
} else {
@@ -209,11 +685,10 @@ static void mpam_msc_drv_remove(struct platform_device *pdev)
mutex_lock(&mpam_list_lock);
mpam_num_msc--;
- platform_set_drvdata(pdev, NULL);
- list_del_rcu(&msc->glbl_list);
- synchronize_srcu(&mpam_srcu);
- devm_kfree(&pdev->dev, msc);
+ mpam_msc_destroy(msc);
mutex_unlock(&mpam_list_lock);
+
+ mpam_free_garbage();
}
static int mpam_msc_drv_probe(struct platform_device *pdev)
@@ -230,6 +705,7 @@ static int mpam_msc_drv_probe(struct platform_device *pdev)
err = -ENOMEM;
break;
}
+ init_garbage(msc);
mutex_init(&msc->probe_lock);
mutex_init(&msc->part_sel_lock);
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index 07e0f240eaca..d49bb884b433 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -7,10 +7,27 @@
#include <linux/arm_mpam.h>
#include <linux/cpumask.h>
#include <linux/io.h>
+#include <linux/llist.h>
#include <linux/mailbox_client.h>
#include <linux/mutex.h>
#include <linux/resctrl.h>
#include <linux/sizes.h>
+#include <linux/srcu.h>
+
+/*
+ * Structures protected by SRCU may not be freed for a surprising amount of
+ * time (especially if perf is running). To ensure the MPAM error interrupt can
+ * tear down all the structures, build a list of objects that can be garbage
+ * collected once synchronize_srcu() has returned.
+ * If pdev is non-NULL, use devm_kfree().
+ */
+struct mpam_garbage {
+ /* member of mpam_garbage */
+ struct llist_node llist;
+
+ void *to_free;
+ struct platform_device *pdev;
+};
struct mpam_msc {
/* member of mpam_all_msc */
@@ -57,6 +74,80 @@ struct mpam_msc {
void __iomem *mapped_hwpage;
size_t mapped_hwpage_sz;
+
+ struct mpam_garbage garbage;
};
+struct mpam_class {
+ /* mpam_components in this class */
+ struct list_head components;
+
+ cpumask_t affinity;
+
+ u8 level;
+ enum mpam_class_types type;
+
+ /* member of mpam_classes */
+ struct list_head classes_list;
+
+ struct mpam_garbage garbage;
+};
+
+struct mpam_component {
+ u32 comp_id;
+
+ /* mpam_vmsc in this component */
+ struct list_head vmsc;
+
+ cpumask_t affinity;
+
+ /* member of mpam_class:components */
+ struct list_head class_list;
+
+ /* parent: */
+ struct mpam_class *class;
+
+ struct mpam_garbage garbage;
+};
+
+struct mpam_vmsc {
+ /* member of mpam_component:vmsc_list */
+ struct list_head comp_list;
+
+ /* mpam_msc_ris in this vmsc */
+ struct list_head ris;
+
+ /* All RIS in this vMSC are members of this MSC */
+ struct mpam_msc *msc;
+
+ /* parent: */
+ struct mpam_component *comp;
+
+ struct mpam_garbage garbage;
+};
+
+struct mpam_msc_ris {
+ u8 ris_idx;
+
+ cpumask_t affinity;
+
+ /* member of mpam_vmsc:ris */
+ struct list_head vmsc_list;
+
+ /* member of mpam_msc:ris */
+ struct list_head msc_list;
+
+ /* parent: */
+ struct mpam_vmsc *vmsc;
+
+ struct mpam_garbage garbage;
+};
+
+/* List of all classes - protected by srcu */
+extern struct srcu_struct mpam_srcu;
+extern struct list_head mpam_classes;
+
+int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
+ cpumask_t *affinity);
+
#endif /* MPAM_INTERNAL_H */
diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h
index 0edefa6ba019..406a77be68cb 100644
--- a/include/linux/arm_mpam.h
+++ b/include/linux/arm_mpam.h
@@ -36,11 +36,7 @@ static inline int acpi_mpam_parse_resources(struct mpam_msc *msc,
static inline int acpi_mpam_count_msc(void) { return -EINVAL; }
#endif
-static inline int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
- enum mpam_class_types type, u8 class_id,
- int component_id)
-{
- return -EINVAL;
-}
+int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
+ enum mpam_class_types type, u8 class_id, int component_id);
#endif /* __LINUX_ARM_MPAM_H */
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 16/36] arm_mpam: Add MPAM MSC register layout definitions
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (14 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 15/36] arm_mpam: Add the class and component structures for ris firmware described James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-17 1:04 ` Shaopeng Tan (Fujitsu)
2025-07-24 14:02 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware James Morse
` (20 subsequent siblings)
36 siblings, 2 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
Memory Partitioning and Monitoring (MPAM) has memory mapped devices
(MSCs) with an identity/configuration page.
Add the definitions for these registers as offsets within the page(s).
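As an illustration of how these definitions are meant to be consumed (a
sketch only, not part of this patch: the example_decode_idr() helper is
made up, msc->mapped_hwpage comes from the earlier probe patch, FIELD_GET()
is the standard <linux/bitfield.h> helper, and the 64-bit read assumes the
extended form of MPAMF_IDR):

/* in mpam_devices.c, with <linux/bitfield.h> included */
static void example_decode_idr(struct mpam_msc *msc)
{
	u64 idr = readq_relaxed(msc->mapped_hwpage + MPAMF_IDR);
	u16 partid_max = FIELD_GET(MPAMF_IDR_PARTID_MAX, idr);
	u8 pmg_max = FIELD_GET(MPAMF_IDR_PMG_MAX, idr);
	u8 ris_max = 0;

	if (idr & MPAMF_IDR_HAS_RIS)
		ris_max = FIELD_GET(MPAMF_IDR_RIS_MAX, idr);

	pr_debug("msc %d: partid_max %u pmg_max %u ris_max %u\n",
		 msc->id, partid_max, pmg_max, ris_max);
}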
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_internal.h | 268 ++++++++++++++++++++
1 file changed, 268 insertions(+)
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index d49bb884b433..9110c171d9d2 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -150,4 +150,272 @@ extern struct list_head mpam_classes;
int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
cpumask_t *affinity);
+/*
+ * MPAM MSCs have the following register layout. See:
+ * Arm Architecture Reference Manual Supplement - Memory System Resource
+ * Partitioning and Monitoring (MPAM), for Armv8-A. DDI 0598A.a
+ */
+#define MPAM_ARCHITECTURE_V1 0x10
+
+/* Memory mapped control pages: */
+/* ID Register offsets in the memory mapped page */
+#define MPAMF_IDR 0x0000 /* features id register */
+#define MPAMF_MSMON_IDR 0x0080 /* performance monitoring features */
+#define MPAMF_IMPL_IDR 0x0028 /* imp-def partitioning */
+#define MPAMF_CPOR_IDR 0x0030 /* cache-portion partitioning */
+#define MPAMF_CCAP_IDR 0x0038 /* cache-capacity partitioning */
+#define MPAMF_MBW_IDR 0x0040 /* mem-bw partitioning */
+#define MPAMF_PRI_IDR 0x0048 /* priority partitioning */
+#define MPAMF_CSUMON_IDR 0x0088 /* cache-usage monitor */
+#define MPAMF_MBWUMON_IDR 0x0090 /* mem-bw usage monitor */
+#define MPAMF_PARTID_NRW_IDR 0x0050 /* partid-narrowing */
+#define MPAMF_IIDR 0x0018 /* implementer id register */
+#define MPAMF_AIDR 0x0020 /* architectural id register */
+
+/* Configuration and Status Register offsets in the memory mapped page */
+#define MPAMCFG_PART_SEL 0x0100 /* partid to configure: */
+#define MPAMCFG_CPBM 0x1000 /* cache-portion config */
+#define MPAMCFG_CMAX 0x0108 /* cache-capacity config */
+#define MPAMCFG_CMIN 0x0110 /* cache-capacity config */
+#define MPAMCFG_MBW_MIN 0x0200 /* min mem-bw config */
+#define MPAMCFG_MBW_MAX 0x0208 /* max mem-bw config */
+#define MPAMCFG_MBW_WINWD 0x0220 /* mem-bw accounting window config */
+#define MPAMCFG_MBW_PBM 0x2000 /* mem-bw portion bitmap config */
+#define MPAMCFG_PRI 0x0400 /* priority partitioning config */
+#define MPAMCFG_MBW_PROP 0x0500 /* mem-bw stride config */
+#define MPAMCFG_INTPARTID 0x0600 /* partid-narrowing config */
+
+#define MSMON_CFG_MON_SEL 0x0800 /* monitor selector */
+#define MSMON_CFG_CSU_FLT 0x0810 /* cache-usage monitor filter */
+#define MSMON_CFG_CSU_CTL 0x0818 /* cache-usage monitor config */
+#define MSMON_CFG_MBWU_FLT 0x0820 /* mem-bw monitor filter */
+#define MSMON_CFG_MBWU_CTL 0x0828 /* mem-bw monitor config */
+#define MSMON_CSU 0x0840 /* current cache-usage */
+#define MSMON_CSU_CAPTURE 0x0848 /* last cache-usage value captured */
+#define MSMON_MBWU 0x0860 /* current mem-bw usage value */
+#define MSMON_MBWU_CAPTURE 0x0868 /* last mem-bw value captured */
+#define MSMON_CAPT_EVNT 0x0808 /* signal a capture event */
+#define MPAMF_ESR 0x00F8 /* error status register */
+#define MPAMF_ECR 0x00F0 /* error control register */
+
+/* MPAMF_IDR - MPAM features ID register */
+#define MPAMF_IDR_PARTID_MAX GENMASK(15, 0)
+#define MPAMF_IDR_PMG_MAX GENMASK(23, 16)
+#define MPAMF_IDR_HAS_CCAP_PART BIT(24)
+#define MPAMF_IDR_HAS_CPOR_PART BIT(25)
+#define MPAMF_IDR_HAS_MBW_PART BIT(26)
+#define MPAMF_IDR_HAS_PRI_PART BIT(27)
+#define MPAMF_IDR_HAS_EXT BIT(28)
+#define MPAMF_IDR_HAS_IMPL_IDR BIT(29)
+#define MPAMF_IDR_HAS_MSMON BIT(30)
+#define MPAMF_IDR_HAS_PARTID_NRW BIT(31)
+#define MPAMF_IDR_HAS_RIS BIT(32)
+#define MPAMF_IDR_HAS_EXT_ESR BIT(38)
+#define MPAMF_IDR_HAS_ESR BIT(39)
+#define MPAMF_IDR_RIS_MAX GENMASK(59, 56)
+
+/* MPAMF_MSMON_IDR - MPAM performance monitoring ID register */
+#define MPAMF_MSMON_IDR_MSMON_CSU BIT(16)
+#define MPAMF_MSMON_IDR_MSMON_MBWU BIT(17)
+#define MPAMF_MSMON_IDR_HAS_LOCAL_CAPT_EVNT BIT(31)
+
+/* MPAMF_CPOR_IDR - MPAM features cache portion partitioning ID register */
+#define MPAMF_CPOR_IDR_CPBM_WD GENMASK(15, 0)
+
+/* MPAMF_CCAP_IDR - MPAM features cache capacity partitioning ID register */
+#define MPAMF_CCAP_IDR_HAS_CMAX_SOFTLIM BIT(31)
+#define MPAMF_CCAP_IDR_NO_CMAX BIT(30)
+#define MPAMF_CCAP_IDR_HAS_CMIN BIT(29)
+#define MPAMF_CCAP_IDR_HAS_CASSOC BIT(28)
+#define MPAMF_CCAP_IDR_CASSOC_WD GENMASK(12, 8)
+#define MPAMF_CCAP_IDR_CMAX_WD GENMASK(5, 0)
+
+/* MPAMF_MBW_IDR - MPAM features memory bandwidth partitioning ID register */
+#define MPAMF_MBW_IDR_BWA_WD GENMASK(5, 0)
+#define MPAMF_MBW_IDR_HAS_MIN BIT(10)
+#define MPAMF_MBW_IDR_HAS_MAX BIT(11)
+#define MPAMF_MBW_IDR_HAS_PBM BIT(12)
+#define MPAMF_MBW_IDR_HAS_PROP BIT(13)
+#define MPAMF_MBW_IDR_WINDWR BIT(14)
+#define MPAMF_MBW_IDR_BWPBM_WD GENMASK(28, 16)
+
+/* MPAMF_PRI_IDR - MPAM features priority partitioning ID register */
+#define MPAMF_PRI_IDR_HAS_INTPRI BIT(0)
+#define MPAMF_PRI_IDR_INTPRI_0_IS_LOW BIT(1)
+#define MPAMF_PRI_IDR_INTPRI_WD GENMASK(9, 4)
+#define MPAMF_PRI_IDR_HAS_DSPRI BIT(16)
+#define MPAMF_PRI_IDR_DSPRI_0_IS_LOW BIT(17)
+#define MPAMF_PRI_IDR_DSPRI_WD GENMASK(25, 20)
+
+/* MPAMF_CSUMON_IDR - MPAM cache storage usage monitor ID register */
+#define MPAMF_CSUMON_IDR_NUM_MON GENMASK(15, 0)
+#define MPAMF_CSUMON_IDR_HAS_OFLOW_CAPT BIT(24)
+#define MPAMF_CSUMON_IDR_HAS_CEVNT_OFLW BIT(25)
+#define MPAMF_CSUMON_IDR_HAS_OFSR BIT(26)
+#define MPAMF_CSUMON_IDR_HAS_OFLOW_LNKG BIT(27)
+#define MPAMF_CSUMON_IDR_HAS_XCL BIT(29)
+#define MPAMF_CSUMON_IDR_CSU_RO BIT(30)
+#define MPAMF_CSUMON_IDR_HAS_CAPTURE BIT(31)
+
+/* MPAMF_MBWUMON_IDR - MPAM memory bandwidth usage monitor ID register */
+#define MPAMF_MBWUMON_IDR_NUM_MON GENMASK(15, 0)
+#define MPAMF_MBWUMON_IDR_HAS_RWBW BIT(28)
+#define MPAMF_MBWUMON_IDR_LWD BIT(29)
+#define MPAMF_MBWUMON_IDR_HAS_LONG BIT(30)
+#define MPAMF_MBWUMON_IDR_HAS_CAPTURE BIT(31)
+
+/* MPAMF_PARTID_NRW_IDR - MPAM PARTID narrowing ID register */
+#define MPAMF_PARTID_NRW_IDR_INTPARTID_MAX GENMASK(15, 0)
+
+/* MPAMF_IIDR - MPAM implementation ID register */
+#define MPAMF_IIDR_PRODUCTID GENMASK(31, 20)
+#define MPAMF_IIDR_PRODUCTID_SHIFT 20
+#define MPAMF_IIDR_VARIANT GENMASK(19, 16)
+#define MPAMF_IIDR_VARIANT_SHIFT 16
+#define MPAMF_IIDR_REVISON GENMASK(15, 12)
+#define MPAMF_IIDR_REVISON_SHIFT 12
+#define MPAMF_IIDR_IMPLEMENTER GENMASK(11, 0)
+#define MPAMF_IIDR_IMPLEMENTER_SHIFT 0
+
+/* MPAMF_AIDR - MPAM architecture ID register */
+#define MPAMF_AIDR_ARCH_MAJOR_REV GENMASK(7, 4)
+#define MPAMF_AIDR_ARCH_MINOR_REV GENMASK(3, 0)
+
+/* MPAMCFG_PART_SEL - MPAM partition configuration selection register */
+#define MPAMCFG_PART_SEL_PARTID_SEL GENMASK(15, 0)
+#define MPAMCFG_PART_SEL_INTERNAL BIT(16)
+#define MPAMCFG_PART_SEL_RIS GENMASK(27, 24)
+
+/* MPAMCFG_CMAX - MPAM cache capacity configuration register */
+#define MPAMCFG_CMAX_SOFTLIM BIT(31)
+#define MPAMCFG_CMAX_CMAX GENMASK(15, 0)
+
+/* MPAMCFG_CMIN - MPAM cache capacity configuration register */
+#define MPAMCFG_CMIN_CMIN GENMASK(15, 0)
+
+/*
+ * MPAMCFG_MBW_MIN - MPAM memory minimum bandwidth partitioning configuration
+ * register
+ */
+#define MPAMCFG_MBW_MIN_MIN GENMASK(15, 0)
+
+/*
+ * MPAMCFG_MBW_MAX - MPAM memory maximum bandwidth partitioning configuration
+ * register
+ */
+#define MPAMCFG_MBW_MAX_MAX GENMASK(15, 0)
+#define MPAMCFG_MBW_MAX_HARDLIM BIT(31)
+
+/*
+ * MPAMCFG_MBW_WINWD - MPAM memory bandwidth partitioning window width
+ * register
+ */
+#define MPAMCFG_MBW_WINWD_US_FRAC GENMASK(7, 0)
+#define MPAMCFG_MBW_WINWD_US_INT GENMASK(23, 8)
+
+/* MPAMCFG_PRI - MPAM priority partitioning configuration register */
+#define MPAMCFG_PRI_INTPRI GENMASK(15, 0)
+#define MPAMCFG_PRI_DSPRI GENMASK(31, 16)
+
+/*
+ * MPAMCFG_MBW_PROP - Memory bandwidth proportional stride partitioning
+ * configuration register
+ */
+#define MPAMCFG_MBW_PROP_STRIDEM1 GENMASK(15, 0)
+#define MPAMCFG_MBW_PROP_EN BIT(31)
+
+/*
+ * MPAMCFG_INTPARTID - MPAM internal partition narrowing configuration register
+ */
+#define MPAMCFG_INTPARTID_INTPARTID GENMASK(15, 0)
+#define MPAMCFG_INTPARTID_INTERNAL BIT(16)
+
+/* MSMON_CFG_MON_SEL - Memory system performance monitor selection register */
+#define MSMON_CFG_MON_SEL_MON_SEL GENMASK(15, 0)
+#define MSMON_CFG_MON_SEL_RIS GENMASK(27, 24)
+
+/* MPAMF_ESR - MPAM Error Status Register */
+#define MPAMF_ESR_PARTID_OR_MON GENMASK(15, 0)
+#define MPAMF_ESR_PMG GENMASK(23, 16)
+#define MPAMF_ESR_ERRCODE GENMASK(27, 24)
+#define MPAMF_ESR_OVRWR BIT(31)
+#define MPAMF_ESR_RIS GENMASK(35, 32)
+
+/* MPAMF_ECR - MPAM Error Control Register */
+#define MPAMF_ECR_INTEN BIT(0)
+
+/* Error conditions in accessing memory mapped registers */
+#define MPAM_ERRCODE_NONE 0
+#define MPAM_ERRCODE_PARTID_SEL_RANGE 1
+#define MPAM_ERRCODE_REQ_PARTID_RANGE 2
+#define MPAM_ERRCODE_MSMONCFG_ID_RANGE 3
+#define MPAM_ERRCODE_REQ_PMG_RANGE 4
+#define MPAM_ERRCODE_MONITOR_RANGE 5
+#define MPAM_ERRCODE_INTPARTID_RANGE 6
+#define MPAM_ERRCODE_UNEXPECTED_INTERNAL 7
+
+/*
+ * MSMON_CFG_CSU_FLT - Memory system performance monitor configure cache storage
+ * usage monitor filter register
+ */
+#define MSMON_CFG_CSU_FLT_PARTID GENMASK(15, 0)
+#define MSMON_CFG_CSU_FLT_PMG GENMASK(23, 16)
+
+/*
+ * MSMON_CFG_CSU_CTL - Memory system performance monitor configure cache storage
+ * usage monitor control register
+ * MSMON_CFG_MBWU_CTL - Memory system performance monitor configure memory
+ * bandwidth usage monitor control register
+ */
+#define MSMON_CFG_x_CTL_TYPE GENMASK(7, 0)
+#define MSMON_CFG_x_CTL_OFLOW_STATUS_L BIT(15)
+#define MSMON_CFG_x_CTL_MATCH_PARTID BIT(16)
+#define MSMON_CFG_x_CTL_MATCH_PMG BIT(17)
+#define MSMON_CFG_x_CTL_SCLEN BIT(19)
+#define MSMON_CFG_x_CTL_SUBTYPE GENMASK(23, 20)
+#define MSMON_CFG_x_CTL_OFLOW_FRZ BIT(24)
+#define MSMON_CFG_x_CTL_OFLOW_INTR BIT(25)
+#define MSMON_CFG_x_CTL_OFLOW_STATUS BIT(26)
+#define MSMON_CFG_x_CTL_CAPT_RESET BIT(27)
+#define MSMON_CFG_x_CTL_CAPT_EVNT GENMASK(30, 28)
+#define MSMON_CFG_x_CTL_EN BIT(31)
+
+#define MSMON_CFG_MBWU_CTL_TYPE_MBWU 0x42
+#define MSMON_CFG_MBWU_CTL_TYPE_CSU 0x43
+
+#define MSMON_CFG_MBWU_CTL_SUBTYPE_NONE 0
+#define MSMON_CFG_MBWU_CTL_SUBTYPE_READ 1
+#define MSMON_CFG_MBWU_CTL_SUBTYPE_WRITE 2
+#define MSMON_CFG_MBWU_CTL_SUBTYPE_BOTH 3
+
+#define MSMON_CFG_MBWU_CTL_SUBTYPE_MAX 3
+#define MSMON_CFG_MBWU_CTL_SUBTYPE_MASK 0x3
+
+/*
+ * MSMON_CFG_MBWU_FLT - Memory system performance monitor configure memory
+ * bandwidth usage monitor filter register
+ */
+#define MSMON_CFG_MBWU_FLT_PARTID GENMASK(15, 0)
+#define MSMON_CFG_MBWU_FLT_PMG GENMASK(23, 16)
+#define MSMON_CFG_MBWU_FLT_RWBW GENMASK(31, 30)
+
+/*
+ * MSMON_CSU - Memory system performance monitor cache storage usage monitor
+ * register
+ * MSMON_CSU_CAPTURE - Memory system performance monitor cache storage usage
+ * capture register
+ * MSMON_MBWU - Memory system performance monitor memory bandwidth usage
+ * monitor register
+ * MSMON_MBWU_CAPTURE - Memory system performance monitor memory bandwidth usage
+ * capture register
+ */
+#define MSMON___VALUE GENMASK(30, 0)
+#define MSMON___NRDY BIT(31)
+#define MSMON_MBWU_L_VALUE GENMASK(62, 0)
+/*
+ * MSMON_CAPT_EVNT - Memory system performance monitoring capture event
+ * generation register
+ */
+#define MSMON_CAPT_EVNT_NOW BIT(0)
+
#endif /* MPAM_INTERNAL_H */
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (15 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 16/36] arm_mpam: Add MPAM MSC register layout definitions James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-24 14:13 ` Ben Horgan
` (2 more replies)
2025-07-11 18:36 ` [RFC PATCH 18/36] arm_mpam: Probe MSCs to find the supported partid/pmg values James Morse
` (19 subsequent siblings)
36 siblings, 3 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
Because an MSC can only be accessed from the CPUs in its cpu-affinity
set, we need to be running on one of those CPUs to probe the MSC
hardware.
Do this work from the cpuhp callback. Probing the hardware only happens
before MPAM is enabled: walk all the MSCs and probe those we can reach
that haven't already been probed.
Later, once MPAM is enabled, this cpuhp callback will be replaced by
one that avoids the global list.
Enabling a static key would also take the cpuhp lock, so that can't be
done from the cpuhp callback. Instead, whenever a new MSC has been
probed, schedule work to test whether all the MSCs have now been probed.
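A minimal sketch of this pattern (illustrative only; the example_* names
are stand-ins, not functions from this series): the discovery-time online
callback probes what it can reach, then defers the "is everything probed?"
check to a workqueue, because removing the cpuhp state (or enabling a
static key) from inside the callback would deadlock on the cpuhp lock.

#include <linux/cpuhotplug.h>
#include <linux/workqueue.h>

static int example_cpuhp_state;

/* stand-ins for the real probe/tracking logic */
static bool example_probe_reachable_from(unsigned int cpu) { return true; }
static bool example_all_probed(void) { return true; }

static void example_enable(struct work_struct *work)
{
        if (!example_all_probed())
                return;

        /* Safe here: not running under the cpuhp lock */
        cpuhp_remove_state(example_cpuhp_state);
}
static DECLARE_WORK(example_enable_work, example_enable);

static int example_online(unsigned int cpu)
{
        if (example_probe_reachable_from(cpu))
                schedule_work(&example_enable_work);
        return 0;
}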
CC: Lecopzer Chen <lecopzerc@nvidia.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 149 +++++++++++++++++++-
drivers/platform/arm64/mpam/mpam_internal.h | 8 +-
2 files changed, 152 insertions(+), 5 deletions(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 0d6d5180903b..89434ae3efa6 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -4,6 +4,7 @@
#define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__
#include <linux/acpi.h>
+#include <linux/atomic.h>
#include <linux/arm_mpam.h>
#include <linux/cacheinfo.h>
#include <linux/cpu.h>
@@ -21,6 +22,7 @@
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/types.h>
+#include <linux/workqueue.h>
#include <acpi/pcc.h>
@@ -39,6 +41,16 @@ struct srcu_struct mpam_srcu;
/* MPAM isn't available until all the MSC have been probed. */
static u32 mpam_num_msc;
+static int mpam_cpuhp_state;
+static DEFINE_MUTEX(mpam_cpuhp_state_lock);
+
+/*
+ * mpam is enabled once all devices have been probed from CPU online callbacks,
+ * scheduled via this work_struct. If access to an MSC depends on a CPU that
+ * was not brought online at boot, this can happen surprisingly late.
+ */
+static DECLARE_WORK(mpam_enable_work, &mpam_enable);
+
/*
* An MSC is a physical container for controls and monitors, each identified by
* their RIS index. These share a base-address, interrupts and some MMIO
@@ -78,6 +90,22 @@ LIST_HEAD(mpam_classes);
/* List of all objects that can be free()d after synchronise_srcu() */
static LLIST_HEAD(mpam_garbage);
+static u32 __mpam_read_reg(struct mpam_msc *msc, u16 reg)
+{
+ WARN_ON_ONCE(reg > msc->mapped_hwpage_sz);
+ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &msc->accessibility));
+
+ return readl_relaxed(msc->mapped_hwpage + reg);
+}
+
+static inline u32 _mpam_read_partsel_reg(struct mpam_msc *msc, u16 reg)
+{
+ lockdep_assert_held_once(&msc->part_sel_lock);
+ return __mpam_read_reg(msc, reg);
+}
+
+#define mpam_read_partsel_reg(msc, reg) _mpam_read_partsel_reg(msc, MPAMF_##reg)
+
#define init_garbage(x) init_llist_node(&(x)->garbage.llist)
static struct mpam_vmsc *
@@ -513,9 +541,84 @@ int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
return err;
}
-static void mpam_discovery_complete(void)
+static int mpam_msc_hw_probe(struct mpam_msc *msc)
{
- pr_err("Discovered all MSC\n");
+ u64 idr;
+ int err;
+
+ lockdep_assert_held(&msc->probe_lock);
+
+ mutex_lock(&msc->part_sel_lock);
+ idr = mpam_read_partsel_reg(msc, AIDR);
+ if ((idr & MPAMF_AIDR_ARCH_MAJOR_REV) != MPAM_ARCHITECTURE_V1) {
+ pr_err_once("%s does not match MPAM architecture v1.0\n",
+ dev_name(&msc->pdev->dev));
+ err = -EIO;
+ } else {
+ msc->probed = true;
+ err = 0;
+ }
+ mutex_unlock(&msc->part_sel_lock);
+
+ return err;
+}
+
+static int mpam_cpu_online(unsigned int cpu)
+{
+ return 0;
+}
+
+/* Before mpam is enabled, try to probe new MSC */
+static int mpam_discovery_cpu_online(unsigned int cpu)
+{
+ int err = 0;
+ struct mpam_msc *msc;
+ bool new_device_probed = false;
+
+ mutex_lock(&mpam_list_lock);
+ list_for_each_entry(msc, &mpam_all_msc, glbl_list) {
+ if (!cpumask_test_cpu(cpu, &msc->accessibility))
+ continue;
+
+ mutex_lock(&msc->probe_lock);
+ if (!msc->probed)
+ err = mpam_msc_hw_probe(msc);
+ mutex_unlock(&msc->probe_lock);
+
+ if (!err)
+ new_device_probed = true;
+ else
+ break; // mpam_broken
+ }
+ mutex_unlock(&mpam_list_lock);
+
+ if (new_device_probed && !err)
+ schedule_work(&mpam_enable_work);
+
+ return err;
+}
+
+static int mpam_cpu_offline(unsigned int cpu)
+{
+ return 0;
+}
+
+static void mpam_register_cpuhp_callbacks(int (*online)(unsigned int online),
+ int (*offline)(unsigned int offline))
+{
+ mutex_lock(&mpam_cpuhp_state_lock);
+ if (mpam_cpuhp_state) {
+ cpuhp_remove_state(mpam_cpuhp_state);
+ mpam_cpuhp_state = 0;
+ }
+
+ mpam_cpuhp_state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mpam:online",
+ online, offline);
+ if (mpam_cpuhp_state <= 0) {
+ pr_err("Failed to register cpuhp callbacks");
+ mpam_cpuhp_state = 0;
+ }
+ mutex_unlock(&mpam_cpuhp_state_lock);
}
static int mpam_dt_count_msc(void)
@@ -774,7 +877,7 @@ static int mpam_msc_drv_probe(struct platform_device *pdev)
}
if (!err && fw_num_msc == mpam_num_msc)
- mpam_discovery_complete();
+ mpam_register_cpuhp_callbacks(&mpam_discovery_cpu_online, NULL);
if (err && msc)
mpam_msc_drv_remove(pdev);
@@ -797,6 +900,46 @@ static struct platform_driver mpam_msc_driver = {
.remove = mpam_msc_drv_remove,
};
+static void mpam_enable_once(void)
+{
+ mutex_lock(&mpam_cpuhp_state_lock);
+ cpuhp_remove_state(mpam_cpuhp_state);
+ mpam_cpuhp_state = 0;
+ mutex_unlock(&mpam_cpuhp_state_lock);
+
+ mpam_register_cpuhp_callbacks(mpam_cpu_online, mpam_cpu_offline);
+
+ pr_info("MPAM enabled\n");
+}
+
+/*
+ * Enable mpam once all devices have been probed.
+ * Scheduled by mpam_discovery_cpu_online() once all devices have been created.
+ * Also scheduled when new devices are probed when new CPUs come online.
+ */
+void mpam_enable(struct work_struct *work)
+{
+ static atomic_t once;
+ struct mpam_msc *msc;
+ bool all_devices_probed = true;
+
+ /* Have we probed all the hw devices? */
+ mutex_lock(&mpam_list_lock);
+ list_for_each_entry(msc, &mpam_all_msc, glbl_list) {
+ mutex_lock(&msc->probe_lock);
+ if (!msc->probed)
+ all_devices_probed = false;
+ mutex_unlock(&msc->probe_lock);
+
+ if (!all_devices_probed)
+ break;
+ }
+ mutex_unlock(&mpam_list_lock);
+
+ if (all_devices_probed && !atomic_fetch_inc(&once))
+ mpam_enable_once();
+}
+
/*
* MSC that are hidden under caches are not created as platform devices
* as there is no cache driver. Caches are also special-cased in
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index 9110c171d9d2..f56e69ff8397 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -49,6 +49,7 @@ struct mpam_msc {
* properties become read-only and the lists are protected by SRCU.
*/
struct mutex probe_lock;
+ bool probed;
unsigned long ris_idxs[128 / BITS_PER_LONG];
u32 ris_max;
@@ -59,14 +60,14 @@ struct mpam_msc {
* part_sel_lock protects access to the MSC hardware registers that are
* affected by MPAMCFG_PART_SEL. (including the ID registers that vary
* by RIS).
- * If needed, take msc->lock first.
+ * If needed, take msc->probe_lock first.
*/
struct mutex part_sel_lock;
/*
* mon_sel_lock protects access to the MSC hardware registers that are
* affeted by MPAMCFG_MON_SEL.
- * If needed, take msc->lock first.
+ * If needed, take msc->probe_lock first.
*/
struct mutex outer_mon_sel_lock;
raw_spinlock_t inner_mon_sel_lock;
@@ -147,6 +148,9 @@ struct mpam_msc_ris {
extern struct srcu_struct mpam_srcu;
extern struct list_head mpam_classes;
+/* Scheduled work callback to enable mpam once all MSC have been probed */
+void mpam_enable(struct work_struct *work);
+
int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
cpumask_t *affinity);
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 18/36] arm_mpam: Probe MSCs to find the supported partid/pmg values
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (16 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 19/36] arm_mpam: Add helpers for managing the locking around the mon_sel registers James Morse
` (18 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
CPUs can generate traffic with a range of PARTID and PMG values,
but each MSC may have its own maximum size for these fields.
Before MPAM can be used, the driver needs to probe each RIS on
each MSC, to find the system-wide smallest value that can be used.
While doing this, RIS entries that firmware didn't describe are created
under MPAM_CLASS_UNKNOWN.
While we're here, implement the mpam_register_requestor() call
for the arch code. Future callers of this will tell us about the
SMMU and ITS.
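As an illustration only (the example_* function and the numbers are made
up, not from this series), a future requestor such as the SMMU driver
might register its limits like this:

#include <linux/arm_mpam.h>

static int example_requestor_init(void)
{
        /* This requestor can generate PARTIDs 0..63 and PMGs 0..3 */
        int err = mpam_register_requestor(63, 3);

        if (err)
                return err;     /* too late: a wider range was already published */

        return 0;
}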
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 172 +++++++++++++++++++-
drivers/platform/arm64/mpam/mpam_internal.h | 6 +
include/linux/arm_mpam.h | 14 ++
3 files changed, 185 insertions(+), 7 deletions(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 89434ae3efa6..8646fb85ad09 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -6,6 +6,7 @@
#include <linux/acpi.h>
#include <linux/atomic.h>
#include <linux/arm_mpam.h>
+#include <linux/bitfield.h>
#include <linux/cacheinfo.h>
#include <linux/cpu.h>
#include <linux/cpumask.h>
@@ -44,6 +45,15 @@ static u32 mpam_num_msc;
static int mpam_cpuhp_state;
static DEFINE_MUTEX(mpam_cpuhp_state_lock);
+/*
+ * The smallest common values for any CPU or MSC in the system.
+ * Generating traffic outside this range will result in screaming interrupts.
+ */
+u16 mpam_partid_max;
+u8 mpam_pmg_max;
+static bool partid_max_init, partid_max_published;
+static DEFINE_SPINLOCK(partid_max_lock);
+
/*
* mpam is enabled once all devices have been probed from CPU online callbacks,
* scheduled via this work_struct. If access to an MSC depends on a CPU that
@@ -106,6 +116,74 @@ static inline u32 _mpam_read_partsel_reg(struct mpam_msc *msc, u16 reg)
#define mpam_read_partsel_reg(msc, reg) _mpam_read_partsel_reg(msc, MPAMF_##reg)
+static void __mpam_write_reg(struct mpam_msc *msc, u16 reg, u32 val)
+{
+ WARN_ON_ONCE(reg + sizeof(u32) > msc->mapped_hwpage_sz);
+ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &msc->accessibility));
+
+ writel_relaxed(val, msc->mapped_hwpage + reg);
+}
+
+static inline void _mpam_write_partsel_reg(struct mpam_msc *msc, u16 reg, u32 val)
+{
+ lockdep_assert_held_once(&msc->part_sel_lock);
+ __mpam_write_reg(msc, reg, val);
+}
+#define mpam_write_partsel_reg(msc, reg, val) _mpam_write_partsel_reg(msc, MPAMCFG_##reg, val)
+
+static u64 mpam_msc_read_idr(struct mpam_msc *msc)
+{
+ u64 idr_high = 0, idr_low;
+
+ lockdep_assert_held(&msc->part_sel_lock);
+
+ idr_low = mpam_read_partsel_reg(msc, IDR);
+ if (FIELD_GET(MPAMF_IDR_HAS_EXT, idr_low))
+ idr_high = mpam_read_partsel_reg(msc, IDR + 4);
+
+ return (idr_high << 32) | idr_low;
+}
+
+static void __mpam_part_sel_raw(u32 partsel, struct mpam_msc *msc)
+{
+ lockdep_assert_held(&msc->part_sel_lock);
+
+ mpam_write_partsel_reg(msc, PART_SEL, partsel);
+}
+
+static void __mpam_part_sel(u8 ris_idx, u16 partid, struct mpam_msc *msc)
+{
+ u32 partsel = FIELD_PREP(MPAMCFG_PART_SEL_RIS, ris_idx) |
+ FIELD_PREP(MPAMCFG_PART_SEL_PARTID_SEL, partid);
+
+ __mpam_part_sel_raw(partsel, msc);
+}
+
+int mpam_register_requestor(u16 partid_max, u8 pmg_max)
+{
+ int err = 0;
+
+ lockdep_assert_irqs_enabled();
+
+ spin_lock(&partid_max_lock);
+ if (!partid_max_init) {
+ mpam_partid_max = partid_max;
+ mpam_pmg_max = pmg_max;
+ partid_max_init = true;
+ } else if (!partid_max_published) {
+ mpam_partid_max = min(mpam_partid_max, partid_max);
+ mpam_pmg_max = min(mpam_pmg_max, pmg_max);
+ } else {
+ /* New requestors can't lower the values */
+ if (partid_max < mpam_partid_max || pmg_max < mpam_pmg_max)
+ err = -EBUSY;
+ }
+ spin_unlock(&partid_max_lock);
+
+ return err;
+}
+EXPORT_SYMBOL(mpam_register_requestor);
+
#define init_garbage(x) init_llist_node(&(x)->garbage.llist)
static struct mpam_vmsc *
@@ -522,6 +600,7 @@ static int mpam_ris_create_locked(struct mpam_msc *msc, u8 ris_idx,
cpumask_or(&comp->affinity, &comp->affinity, &ris->affinity);
cpumask_or(&class->affinity, &class->affinity, &ris->affinity);
list_add_rcu(&ris->vmsc_list, &vmsc->ris);
+ list_add_rcu(&ris->msc_list, &msc->ris);
return 0;
}
@@ -541,10 +620,37 @@ int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
return err;
}
+static struct mpam_msc_ris *mpam_get_or_create_ris(struct mpam_msc *msc,
+ u8 ris_idx)
+{
+ int err;
+ struct mpam_msc_ris *ris, *found = ERR_PTR(-ENOENT);
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ if (!test_bit(ris_idx, msc->ris_idxs)) {
+ err = mpam_ris_create_locked(msc, ris_idx, MPAM_CLASS_UNKNOWN,
+ 0, 0, GFP_ATOMIC);
+ if (err)
+ return ERR_PTR(err);
+ }
+
+ list_for_each_entry(ris, &msc->ris, msc_list) {
+ if (ris->ris_idx == ris_idx) {
+ found = ris;
+ break;
+ }
+ }
+
+ return found;
+}
+
static int mpam_msc_hw_probe(struct mpam_msc *msc)
{
u64 idr;
- int err;
+ u16 partid_max;
+ u8 ris_idx, pmg_max;
+ struct mpam_msc_ris *ris;
lockdep_assert_held(&msc->probe_lock);
@@ -553,14 +659,42 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
if ((idr & MPAMF_AIDR_ARCH_MAJOR_REV) != MPAM_ARCHITECTURE_V1) {
pr_err_once("%s does not match MPAM architecture v1.0\n",
dev_name(&msc->pdev->dev));
- err = -EIO;
- } else {
- msc->probed = true;
- err = 0;
+ mutex_unlock(&msc->part_sel_lock);
+ return -EIO;
}
+
+ idr = mpam_msc_read_idr(msc);
mutex_unlock(&msc->part_sel_lock);
+ msc->ris_max = FIELD_GET(MPAMF_IDR_RIS_MAX, idr);
- return err;
+ /* Use these values so partid/pmg always starts with a valid value */
+ msc->partid_max = FIELD_GET(MPAMF_IDR_PARTID_MAX, idr);
+ msc->pmg_max = FIELD_GET(MPAMF_IDR_PMG_MAX, idr);
+
+ for (ris_idx = 0; ris_idx <= msc->ris_max; ris_idx++) {
+ mutex_lock(&msc->part_sel_lock);
+ __mpam_part_sel(ris_idx, 0, msc);
+ idr = mpam_msc_read_idr(msc);
+ mutex_unlock(&msc->part_sel_lock);
+
+ partid_max = FIELD_GET(MPAMF_IDR_PARTID_MAX, idr);
+ pmg_max = FIELD_GET(MPAMF_IDR_PMG_MAX, idr);
+ msc->partid_max = min(msc->partid_max, partid_max);
+ msc->pmg_max = min(msc->pmg_max, pmg_max);
+
+ ris = mpam_get_or_create_ris(msc, ris_idx);
+ if (IS_ERR(ris))
+ return PTR_ERR(ris);
+ }
+
+ spin_lock(&partid_max_lock);
+ mpam_partid_max = min(mpam_partid_max, msc->partid_max);
+ mpam_pmg_max = min(mpam_pmg_max, msc->pmg_max);
+ spin_unlock(&partid_max_lock);
+
+ msc->probed = true;
+
+ return 0;
}
static int mpam_cpu_online(unsigned int cpu)
@@ -907,9 +1041,18 @@ static void mpam_enable_once(void)
mpam_cpuhp_state = 0;
mutex_unlock(&mpam_cpuhp_state_lock);
+ /*
+ * Once the cpuhp callbacks have been changed, mpam_partid_max can no
+ * longer change.
+ */
+ spin_lock(&partid_max_lock);
+ partid_max_published = true;
+ spin_unlock(&partid_max_lock);
+
mpam_register_cpuhp_callbacks(mpam_cpu_online, mpam_cpu_offline);
- pr_info("MPAM enabled\n");
+ printk(KERN_INFO "MPAM enabled with %u partid and %u pmg\n",
+ mpam_partid_max + 1, mpam_pmg_max + 1);
}
/*
@@ -959,11 +1102,25 @@ static void mpam_dt_create_foundling_msc(void)
static int __init mpam_msc_driver_init(void)
{
+ bool mpam_not_available = false;
+
if (!system_supports_mpam())
return -EOPNOTSUPP;
init_srcu_struct(&mpam_srcu);
+ /*
+ * If the MPAM CPU interface is not implemented, or reserved by
+ * firmware, there is no point touching the rest of the hardware.
+ */
+ spin_lock(&partid_max_lock);
+ if (!partid_max_init || (!mpam_partid_max && !mpam_pmg_max))
+ mpam_not_available = true;
+ spin_unlock(&partid_max_lock);
+
+ if (mpam_not_available)
+ return 0;
+
if (!acpi_disabled)
fw_num_msc = acpi_mpam_count_msc();
else
@@ -979,4 +1136,5 @@ static int __init mpam_msc_driver_init(void)
return platform_driver_register(&mpam_msc_driver);
}
+/* Must occur after arm64_mpam_register_cpus() from arch_initcall() */
subsys_initcall(mpam_msc_driver_init);
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index f56e69ff8397..eb5cc6775d54 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -50,6 +50,8 @@ struct mpam_msc {
*/
struct mutex probe_lock;
bool probed;
+ u16 partid_max;
+ u8 pmg_max;
unsigned long ris_idxs[128 / BITS_PER_LONG];
u32 ris_max;
@@ -148,6 +150,10 @@ struct mpam_msc_ris {
extern struct srcu_struct mpam_srcu;
extern struct list_head mpam_classes;
+/* System wide partid/pmg values */
+extern u16 mpam_partid_max;
+extern u8 mpam_pmg_max;
+
/* Scheduled work callback to enable mpam once all MSC have been probed */
void mpam_enable(struct work_struct *work);
diff --git a/include/linux/arm_mpam.h b/include/linux/arm_mpam.h
index 406a77be68cb..8af93794c7a2 100644
--- a/include/linux/arm_mpam.h
+++ b/include/linux/arm_mpam.h
@@ -39,4 +39,18 @@ static inline int acpi_mpam_count_msc(void) { return -EINVAL; }
int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
enum mpam_class_types type, u8 class_id, int component_id);
+/**
+ * mpam_register_requestor() - Register a requestor with the MPAM driver
+ * @partid_max: The maximum PARTID value the requestor can generate.
+ * @pmg_max: The maximum PMG value the requestor can generate.
+ *
+ * Registers a requestor with the MPAM driver to ensure the chosen system-wide
+ * minimum PARTID and PMG values will allow the requestor's features to be used.
+ *
+ * Returns an error if the registration is too late, and a larger PARTID/PMG
+ * value has been advertised to user-space. In this case the requestor should
+ * not use its MPAM features. Returns 0 on success.
+ */
+int mpam_register_requestor(u16 partid_max, u8 pmg_max);
+
#endif /* __LINUX_ARM_MPAM_H */
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 19/36] arm_mpam: Add helpers for managing the locking around the mon_sel registers
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (17 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 18/36] arm_mpam: Probe MSCs to find the supported partid/pmg values James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 20/36] arm_mpam: Probe the hardware features resctrl supports James Morse
` (17 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
The MSC MON_SEL register needs to be accessed from hardirq context by the
PMU drivers, making an irqsave spinlock the obvious lock to protect these
registers. On systems with SCMI mailboxes the accesses themselves must be
able to sleep, meaning a mutex must be used instead.
Clearly these two can't exist at the same time.
Add helpers for the MON_SEL locking. The outer lock must be taken in a
pre-emptible context before the inner lock can be taken. On systems with
SCMI mailboxes, where the MON_SEL accesses must sleep, the inner lock
will fail to be 'taken' if the caller is unable to sleep. This allows
the PMU driver to fail gracefully without having to check the interface
type of each MSC.
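A sketch of the intended calling pattern (illustrative, not code from this
patch; real users may take the outer lock on one CPU and the inner lock
from an IPI on another - this shows only the simplest nesting): the outer
lock is taken in a preemptible context; the inner lock either disables
interrupts (MMIO MSC) or refuses to be taken when the caller cannot sleep
(firmware-backed MSC).

#include "mpam_internal.h"

static bool example_touch_monitors(struct mpam_msc *msc)
{
        bool ok = false;

        mpam_mon_sel_outer_lock(msc);
        if (mpam_mon_sel_inner_lock(msc)) {
                /* MSMON_* registers may be accessed here */
                ok = true;
                mpam_mon_sel_inner_unlock(msc);
        }
        mpam_mon_sel_outer_unlock(msc);

        return ok;      /* false: inner lock refused, e.g. SCMI MSC and we cannot sleep */
}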
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_internal.h | 57 ++++++++++++++++++++-
1 file changed, 56 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index eb5cc6775d54..42a454d5f914 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -68,10 +68,19 @@ struct mpam_msc {
/*
* mon_sel_lock protects access to the MSC hardware registers that are
- * affeted by MPAMCFG_MON_SEL.
+ * affected by MPAMCFG_MON_SEL, and the mbwu_state.
+ * Both the 'inner' and 'outer' must be taken.
+ * For real MMIO MSC, the outer lock is unnecessary - but keeps the
+ * code common with:
+ * Firmware backed MSC need to sleep when accessing the MSC, which
+ * means some code-paths will always fail. For these MSC the outer
+ * lock is providing the protection, and the inner lock fails to
+ * be taken if the task is unable to sleep.
+ *
* If needed, take msc->probe_lock first.
*/
struct mutex outer_mon_sel_lock;
+ bool outer_lock_held;
raw_spinlock_t inner_mon_sel_lock;
unsigned long inner_mon_sel_flags;
@@ -81,6 +90,52 @@ struct mpam_msc {
struct mpam_garbage garbage;
};
+static inline bool __must_check mpam_mon_sel_inner_lock(struct mpam_msc *msc)
+{
+ /*
+ * The outer lock may be taken by a CPU that then issues an IPI to run
+ * a helper that takes the inner lock. lockdep can't help us here.
+ */
+ WARN_ON_ONCE(!msc->outer_lock_held);
+
+ if (msc->iface == MPAM_IFACE_MMIO) {
+ raw_spin_lock_irqsave(&msc->inner_mon_sel_lock, msc->inner_mon_sel_flags);
+ return true;
+ }
+
+ /* Accesses must fail if we are not pre-emptible */
+ return !!preemptible();
+}
+
+static inline void mpam_mon_sel_inner_unlock(struct mpam_msc *msc)
+{
+ WARN_ON_ONCE(!msc->outer_lock_held);
+
+ if (msc->iface == MPAM_IFACE_MMIO)
+ raw_spin_unlock_irqrestore(&msc->inner_mon_sel_lock, msc->inner_mon_sel_flags);
+}
+
+static inline void mpam_mon_sel_outer_lock(struct mpam_msc *msc)
+{
+ mutex_lock(&msc->outer_mon_sel_lock);
+ msc->outer_lock_held = true;
+}
+
+static inline void mpam_mon_sel_outer_unlock(struct mpam_msc *msc)
+{
+ msc->outer_lock_held = false;
+ mutex_unlock(&msc->outer_mon_sel_lock);
+}
+
+static inline void mpam_mon_sel_lock_held(struct mpam_msc *msc)
+{
+ WARN_ON_ONCE(!msc->outer_lock_held);
+ if (msc->iface == MPAM_IFACE_MMIO)
+ lockdep_assert_held_once(&msc->inner_mon_sel_lock);
+ else
+ lockdep_assert_preemption_enabled();
+}
+
struct mpam_class {
/* mpam_components in this class */
struct list_head components;
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 20/36] arm_mpam: Probe the hardware features resctrl supports
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (18 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 19/36] arm_mpam: Add helpers for managing the locking around the mon_sel registers James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-24 15:08 ` Ben Horgan
2025-07-28 8:56 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 21/36] arm_mpam: Merge supported features during mpam_enable() into mpam_class James Morse
` (16 subsequent siblings)
36 siblings, 2 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse, Dave Martin
Expand the probing support with the control and monitor types
we can use with resctrl.
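As a hedged illustration of how the probed properties are meant to be
consumed (the example_* helper is not part of the series), a later user
such as the resctrl glue would test the feature bitmap and the associated
width together:

#include "mpam_internal.h"

static bool example_supports_cache_portions(struct mpam_props *props)
{
        /* Cache portion bitmap control with at least one usable bit */
        return mpam_has_feature(mpam_feat_cpor_part, props) && props->cpbm_wd;
}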
CC: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 154 +++++++++++++++++++-
drivers/platform/arm64/mpam/mpam_internal.h | 53 +++++++
2 files changed, 206 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 8646fb85ad09..61911831ab39 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -102,7 +102,7 @@ static LLIST_HEAD(mpam_garbage);
static u32 __mpam_read_reg(struct mpam_msc *msc, u16 reg)
{
- WARN_ON_ONCE(reg > msc->mapped_hwpage_sz);
+ WARN_ON_ONCE(reg + sizeof(u32) > msc->mapped_hwpage_sz);
WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &msc->accessibility));
return readl_relaxed(msc->mapped_hwpage + reg);
@@ -131,6 +131,20 @@ static inline void _mpam_write_partsel_reg(struct mpam_msc *msc, u16 reg, u32 va
}
#define mpam_write_partsel_reg(msc, reg, val) _mpam_write_partsel_reg(msc, MPAMCFG_##reg, val)
+static inline u32 _mpam_read_monsel_reg(struct mpam_msc *msc, u16 reg)
+{
+ mpam_mon_sel_lock_held(msc);
+ return __mpam_read_reg(msc, reg);
+}
+#define mpam_read_monsel_reg(msc, reg) _mpam_read_monsel_reg(msc, MSMON_##reg)
+
+static inline void _mpam_write_monsel_reg(struct mpam_msc *msc, u16 reg, u32 val)
+{
+ mpam_mon_sel_lock_held(msc);
+ __mpam_write_reg(msc, reg, val);
+}
+#define mpam_write_monsel_reg(msc, reg, val) _mpam_write_monsel_reg(msc, MSMON_##reg, val)
+
static u64 mpam_msc_read_idr(struct mpam_msc *msc)
{
u64 idr_high = 0, idr_low;
@@ -645,6 +659,137 @@ static struct mpam_msc_ris *mpam_get_or_create_ris(struct mpam_msc *msc,
return found;
}
+/*
+ * IHI009A.a has this nugget: "If a monitor does not support automatic behaviour
+ * of NRDY, software can use this bit for any purpose" - so hardware might not
+ * implement this - but it isn't RES0.
+ *
+ * Try and see what values stick in this bit. If we can write either value,
+ * it's probably not implemented by hardware.
+ */
+#define mpam_ris_hw_probe_hw_nrdy(_ris, _mon_reg, _result) \
+do { \
+ u32 now; \
+ u64 mon_sel; \
+ bool can_set, can_clear; \
+ struct mpam_msc *_msc = _ris->vmsc->msc; \
+ \
+ if (WARN_ON_ONCE(!mpam_mon_sel_inner_lock(_msc))) { \
+ _result = false; \
+ break; \
+ } \
+ mon_sel = FIELD_PREP(MSMON_CFG_MON_SEL_MON_SEL, 0) | \
+ FIELD_PREP(MSMON_CFG_MON_SEL_RIS, _ris->ris_idx); \
+ mpam_write_monsel_reg(_msc, CFG_MON_SEL, mon_sel); \
+ \
+ mpam_write_monsel_reg(_msc, _mon_reg, MSMON___NRDY); \
+ now = mpam_read_monsel_reg(_msc, _mon_reg); \
+ can_set = now & MSMON___NRDY; \
+ \
+ mpam_write_monsel_reg(_msc, _mon_reg, 0); \
+ now = mpam_read_monsel_reg(_msc, _mon_reg); \
+ can_clear = !(now & MSMON___NRDY); \
+ mpam_mon_sel_inner_unlock(_msc); \
+ \
+ _result = (!can_set || !can_clear); \
+} while (0)
+
+static void mpam_ris_hw_probe(struct mpam_msc_ris *ris)
+{
+ int err;
+ struct mpam_msc *msc = ris->vmsc->msc;
+ struct mpam_props *props = &ris->props;
+
+ lockdep_assert_held(&msc->probe_lock);
+ lockdep_assert_held(&msc->part_sel_lock);
+
+ /* Cache Portion partitioning */
+ if (FIELD_GET(MPAMF_IDR_HAS_CPOR_PART, ris->idr)) {
+ u32 cpor_features = mpam_read_partsel_reg(msc, CPOR_IDR);
+
+ props->cpbm_wd = FIELD_GET(MPAMF_CPOR_IDR_CPBM_WD, cpor_features);
+ if (props->cpbm_wd)
+ mpam_set_feature(mpam_feat_cpor_part, props);
+ }
+
+ /* Memory bandwidth partitioning */
+ if (FIELD_GET(MPAMF_IDR_HAS_MBW_PART, ris->idr)) {
+ u32 mbw_features = mpam_read_partsel_reg(msc, MBW_IDR);
+
+ /* portion bitmap resolution */
+ props->mbw_pbm_bits = FIELD_GET(MPAMF_MBW_IDR_BWPBM_WD, mbw_features);
+ if (props->mbw_pbm_bits &&
+ FIELD_GET(MPAMF_MBW_IDR_HAS_PBM, mbw_features))
+ mpam_set_feature(mpam_feat_mbw_part, props);
+
+ props->bwa_wd = FIELD_GET(MPAMF_MBW_IDR_BWA_WD, mbw_features);
+ if (props->bwa_wd && FIELD_GET(MPAMF_MBW_IDR_HAS_MAX, mbw_features))
+ mpam_set_feature(mpam_feat_mbw_max, props);
+ }
+
+ /* Performance Monitoring */
+ if (FIELD_GET(MPAMF_IDR_HAS_MSMON, ris->idr)) {
+ u32 msmon_features = mpam_read_partsel_reg(msc, MSMON_IDR);
+
+ /*
+ * If the firmware max-nrdy-us property is missing, the
+ * CSU counters can't be used. Should we wait forever?
+ */
+ err = device_property_read_u32(&msc->pdev->dev,
+ "arm,not-ready-us",
+ &msc->nrdy_usec);
+
+ if (FIELD_GET(MPAMF_MSMON_IDR_MSMON_CSU, msmon_features)) {
+ u32 csumonidr;
+
+ csumonidr = mpam_read_partsel_reg(msc, CSUMON_IDR);
+ props->num_csu_mon = FIELD_GET(MPAMF_CSUMON_IDR_NUM_MON, csumonidr);
+ if (props->num_csu_mon) {
+ bool hw_managed;
+
+ mpam_set_feature(mpam_feat_msmon_csu, props);
+
+ /* Is NRDY hardware managed? */
+ mpam_mon_sel_outer_lock(msc);
+ mpam_ris_hw_probe_hw_nrdy(ris, CSU, hw_managed);
+ mpam_mon_sel_outer_unlock(msc);
+ if (hw_managed)
+ mpam_set_feature(mpam_feat_msmon_csu_hw_nrdy, props);
+ }
+
+ /*
+ * Accept the missing firmware property if NRDY appears
+ * un-implemented.
+ */
+ if (err && mpam_has_feature(mpam_feat_msmon_csu_hw_nrdy, props))
+ pr_err_once("Counters are not usable because not-ready timeout was not provided by firmware.");
+ }
+ if (FIELD_GET(MPAMF_MSMON_IDR_MSMON_MBWU, msmon_features)) {
+ bool hw_managed;
+ u32 mbwumonidr = mpam_read_partsel_reg(msc, MBWUMON_IDR);
+
+ props->num_mbwu_mon = FIELD_GET(MPAMF_MBWUMON_IDR_NUM_MON, mbwumonidr);
+ if (props->num_mbwu_mon)
+ mpam_set_feature(mpam_feat_msmon_mbwu, props);
+
+ if (FIELD_GET(MPAMF_MBWUMON_IDR_HAS_RWBW, mbwumonidr))
+ mpam_set_feature(mpam_feat_msmon_mbwu_rwbw, props);
+
+ /* Is NRDY hardware managed? */
+ mpam_mon_sel_outer_lock(msc);
+ mpam_ris_hw_probe_hw_nrdy(ris, MBWU, hw_managed);
+ mpam_mon_sel_outer_unlock(msc);
+ if (hw_managed)
+ mpam_set_feature(mpam_feat_msmon_mbwu_hw_nrdy, props);
+
+ /*
+ * Don't warn about any missing firmware property for
+ * MBWU NRDY - it doesn't make any sense!
+ */
+ }
+ }
+}
+
static int mpam_msc_hw_probe(struct mpam_msc *msc)
{
u64 idr;
@@ -665,6 +810,7 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
idr = mpam_msc_read_idr(msc);
mutex_unlock(&msc->part_sel_lock);
+
msc->ris_max = FIELD_GET(MPAMF_IDR_RIS_MAX, idr);
/* Use these values so partid/pmg always starts with a valid value */
@@ -685,6 +831,12 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
ris = mpam_get_or_create_ris(msc, ris_idx);
if (IS_ERR(ris))
return PTR_ERR(ris);
+ ris->idr = idr;
+
+ mutex_lock(&msc->part_sel_lock);
+ __mpam_part_sel(ris_idx, 0, msc);
+ mpam_ris_hw_probe(ris);
+ mutex_unlock(&msc->part_sel_lock);
}
spin_lock(&partid_max_lock);
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index 42a454d5f914..ae6fd1f62cc4 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -136,6 +136,55 @@ static inline void mpam_mon_sel_lock_held(struct mpam_msc *msc)
lockdep_assert_preemption_enabled();
}
+/*
+ * When we compact the supported features, we don't care what they are.
+ * Storing them as a bitmap makes life easy.
+ */
+typedef u16 mpam_features_t;
+
+/* Bits for mpam_features_t */
+enum mpam_device_features {
+ mpam_feat_ccap_part = 0,
+ mpam_feat_cpor_part,
+ mpam_feat_mbw_part,
+ mpam_feat_mbw_min,
+ mpam_feat_mbw_max,
+ mpam_feat_mbw_prop,
+ mpam_feat_msmon,
+ mpam_feat_msmon_csu,
+ mpam_feat_msmon_csu_capture,
+ mpam_feat_msmon_csu_hw_nrdy,
+ mpam_feat_msmon_mbwu,
+ mpam_feat_msmon_mbwu_capture,
+ mpam_feat_msmon_mbwu_rwbw,
+ mpam_feat_msmon_mbwu_hw_nrdy,
+ mpam_feat_msmon_capt,
+ MPAM_FEATURE_LAST,
+};
+#define MPAM_ALL_FEATURES ((1 << MPAM_FEATURE_LAST) - 1)
+
+struct mpam_props {
+ mpam_features_t features;
+
+ u16 cpbm_wd;
+ u16 mbw_pbm_bits;
+ u16 bwa_wd;
+ u16 num_csu_mon;
+ u16 num_mbwu_mon;
+};
+
+static inline bool mpam_has_feature(enum mpam_device_features feat,
+ struct mpam_props *props)
+{
+ return (1 << feat) & props->features;
+}
+
+static inline void mpam_set_feature(enum mpam_device_features feat,
+ struct mpam_props *props)
+{
+ props->features |= (1 << feat);
+}
+
struct mpam_class {
/* mpam_components in this class */
struct list_head components;
@@ -175,6 +224,8 @@ struct mpam_vmsc {
/* mpam_msc_ris in this vmsc */
struct list_head ris;
+ struct mpam_props props;
+
/* All RIS in this vMSC are members of this MSC */
struct mpam_msc *msc;
@@ -186,6 +237,8 @@ struct mpam_vmsc {
struct mpam_msc_ris {
u8 ris_idx;
+ u64 idr;
+ struct mpam_props props;
cpumask_t affinity;
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 21/36] arm_mpam: Merge supported features during mpam_enable() into mpam_class
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (19 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 20/36] arm_mpam: Probe the hardware features resctrl supports James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-28 9:15 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 22/36] arm_mpam: Reset MSC controls from cpu hp callbacks James Morse
` (15 subsequent siblings)
36 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
To make a decision about whether to expose an mpam class as
a resctrl resource we need to know its overall supported
features and properties.
Once we've probed all the resources, we can walk the tree
and produce overall values by merging the bitmaps. This
eliminates features that are only supported by some of the
MSCs that make up a component or class.
If bitmap properties are mismatched within a component we
cannot support the mismatched feature.
Care has to be taken as a vMSC may hold mismatched RIS.
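A worked example of the two merge rules described above (illustrative
only; 'a' and 'b' stand for feature bitmaps as stored in mpam_props):
RIS that alias the same resource contribute the union of their features
to the vMSC, while a class keeps only the intersection of its vMSCs'
features, since a feature missing from one slice can't be offered for the
whole class.

static u16 example_merge_features(u16 a, u16 b, bool alias)
{
        /* aliasing: either copy can apply the control, so take the union */
        if (alias)
                return a | b;

        /* non-aliasing: a feature must exist everywhere to be usable */
        return a & b;
}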
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 215 ++++++++++++++++++++
drivers/platform/arm64/mpam/mpam_internal.h | 8 +
2 files changed, 223 insertions(+)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 61911831ab39..7b042a35405a 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -1186,8 +1186,223 @@ static struct platform_driver mpam_msc_driver = {
.remove = mpam_msc_drv_remove,
};
+/* Any of these features mean the BWA_WD field is valid. */
+static bool mpam_has_bwa_wd_feature(struct mpam_props *props)
+{
+ if (mpam_has_feature(mpam_feat_mbw_min, props))
+ return true;
+ if (mpam_has_feature(mpam_feat_mbw_max, props))
+ return true;
+ if (mpam_has_feature(mpam_feat_mbw_prop, props))
+ return true;
+ return false;
+}
+
+#define MISMATCHED_HELPER(parent, child, helper, field, alias) \
+ helper(parent) && \
+ ((helper(child) && (parent)->field != (child)->field) || \
+ (!helper(child) && !(alias)))
+
+#define MISMATCHED_FEAT(parent, child, feat, field, alias) \
+ mpam_has_feature((feat), (parent)) && \
+ ((mpam_has_feature((feat), (child)) && (parent)->field != (child)->field) || \
+ (!mpam_has_feature((feat), (child)) && !(alias)))
+
+#define CAN_MERGE_FEAT(parent, child, feat, alias) \
+ (alias) && !mpam_has_feature((feat), (parent)) && \
+ mpam_has_feature((feat), (child))
+
+/*
+ * Combine two props fields.
+ * If this is for controls that alias the same resource, it is safe to just
+ * copy the values over. If two aliasing controls implement the same scheme
+ * a safe value must be picked.
+ * For non-aliasing controls, these control different resources, and the
+ * resulting safe value must be compatible with both. When merging values in
+ * the tree, all the aliasing resources must be handled first.
+ * On mismatch, parent is modified.
+ */
+static void __props_mismatch(struct mpam_props *parent,
+ struct mpam_props *child, bool alias)
+{
+ if (CAN_MERGE_FEAT(parent, child, mpam_feat_cpor_part, alias)) {
+ parent->cpbm_wd = child->cpbm_wd;
+ } else if (MISMATCHED_FEAT(parent, child, mpam_feat_cpor_part,
+ cpbm_wd, alias)) {
+ pr_debug("%s cleared cpor_part\n", __func__);
+ mpam_clear_feature(mpam_feat_cpor_part, &parent->features);
+ parent->cpbm_wd = 0;
+ }
+
+ if (CAN_MERGE_FEAT(parent, child, mpam_feat_mbw_part, alias)) {
+ parent->mbw_pbm_bits = child->mbw_pbm_bits;
+ } else if (MISMATCHED_FEAT(parent, child, mpam_feat_mbw_part,
+ mbw_pbm_bits, alias)) {
+ pr_debug("%s cleared mbw_part\n", __func__);
+ mpam_clear_feature(mpam_feat_mbw_part, &parent->features);
+ parent->mbw_pbm_bits = 0;
+ }
+
+ /* bwa_wd is a count of bits, fewer bits means less precision */
+ if (alias && !mpam_has_bwa_wd_feature(parent) && mpam_has_bwa_wd_feature(child)) {
+ parent->bwa_wd = child->bwa_wd;
+ } else if (MISMATCHED_HELPER(parent, child, mpam_has_bwa_wd_feature,
+ bwa_wd, alias)) {
+ pr_debug("%s took the min bwa_wd\n", __func__);
+ parent->bwa_wd = min(parent->bwa_wd, child->bwa_wd);
+ }
+
+ /* For num properties, take the minimum */
+ if (CAN_MERGE_FEAT(parent, child, mpam_feat_msmon_csu, alias)) {
+ parent->num_csu_mon = child->num_csu_mon;
+ } else if (MISMATCHED_FEAT(parent, child, mpam_feat_msmon_csu,
+ num_csu_mon, alias)) {
+ pr_debug("%s took the min num_csu_mon\n", __func__);
+ parent->num_csu_mon = min(parent->num_csu_mon, child->num_csu_mon);
+ }
+
+ if (CAN_MERGE_FEAT(parent, child, mpam_feat_msmon_mbwu, alias)) {
+ parent->num_mbwu_mon = child->num_mbwu_mon;
+ } else if (MISMATCHED_FEAT(parent, child, mpam_feat_msmon_mbwu,
+ num_mbwu_mon, alias)) {
+ pr_debug("%s took the min num_mbwu_mon\n", __func__);
+ parent->num_mbwu_mon = min(parent->num_mbwu_mon, child->num_mbwu_mon);
+ }
+
+ if (alias) {
+ /* Merge features for aliased resources */
+ parent->features |= child->features;
+ } else {
+ /* Clear missing features for non aliasing */
+ parent->features &= child->features;
+ }
+}
+
+/*
+ * If a vmsc doesn't match class feature/configuration, do the right thing(tm).
+ * For 'num' properties we can just take the minimum.
+ * For properties where the mismatched unused bits would make a difference, we
+ * nobble the class feature, as we can't configure all the resources.
+ * e.g. The L3 cache is composed of two resources with 13 and 17 portion
+ * bitmaps respectively.
+ */
+static void
+__class_props_mismatch(struct mpam_class *class, struct mpam_vmsc *vmsc)
+{
+ struct mpam_props *cprops = &class->props;
+ struct mpam_props *vprops = &vmsc->props;
+
+ lockdep_assert_held(&mpam_list_lock); /* we modify class */
+
+ pr_debug("%s: Merging features for class:0x%lx &= vmsc:0x%lx\n",
+ dev_name(&vmsc->msc->pdev->dev),
+ (long)cprops->features, (long)vprops->features);
+
+ /* Take the safe value for any common features */
+ __props_mismatch(cprops, vprops, false);
+}
+
+static void
+__vmsc_props_mismatch(struct mpam_vmsc *vmsc, struct mpam_msc_ris *ris)
+{
+ struct mpam_props *rprops = &ris->props;
+ struct mpam_props *vprops = &vmsc->props;
+
+ lockdep_assert_held(&mpam_list_lock); /* we modify vmsc */
+
+ pr_debug("%s: Merging features for vmsc:0x%lx |= ris:0x%lx\n",
+ dev_name(&vmsc->msc->pdev->dev),
+ (long)vprops->features, (long)rprops->features);
+
+ /*
+ * Merge mismatched features - Copy any features that aren't common,
+ * but take the safe value for any common features.
+ */
+ __props_mismatch(vprops, rprops, true);
+}
+
+/*
+ * Copy the first component's first vMSC's properties and features to the
+ * class. __class_props_mismatch() will remove conflicts.
+ * It is not possible to have a class with no components, or a component with
+ * no resources. The vMSC properties have already been built.
+ */
+static void mpam_enable_init_class_features(struct mpam_class *class)
+{
+ struct mpam_vmsc *vmsc;
+ struct mpam_component *comp;
+
+ comp = list_first_entry_or_null(&class->components,
+ struct mpam_component, class_list);
+ if (WARN_ON(!comp))
+ return;
+
+ vmsc = list_first_entry_or_null(&comp->vmsc,
+ struct mpam_vmsc, comp_list);
+ if (WARN_ON(!vmsc))
+ return;
+
+ class->props = vmsc->props;
+}
+
+static void mpam_enable_merge_vmsc_features(struct mpam_component *comp)
+{
+ struct mpam_vmsc *vmsc;
+ struct mpam_msc_ris *ris;
+ struct mpam_class *class = comp->class;
+
+ list_for_each_entry(vmsc, &comp->vmsc, comp_list) {
+ list_for_each_entry(ris, &vmsc->ris, vmsc_list) {
+ __vmsc_props_mismatch(vmsc, ris);
+ class->nrdy_usec = max(class->nrdy_usec,
+ vmsc->msc->nrdy_usec);
+ }
+ }
+}
+
+static void mpam_enable_merge_class_features(struct mpam_component *comp)
+{
+ struct mpam_vmsc *vmsc;
+ struct mpam_class *class = comp->class;
+
+ list_for_each_entry(vmsc, &comp->vmsc, comp_list)
+ __class_props_mismatch(class, vmsc);
+}
+
+/*
+ * Merge all the common resource features into class.
+ * vmsc features are bitwise-or'd together, this must be done first.
+ * Next the class features are the bitwise-and of all the vmsc features.
+ * Other features are the min/max as appropriate.
+ *
+ * To avoid walking the whole tree twice, the class->nrdy_usec property is
+ * updated when working with the vmsc as it is a max(), and doesn't need
+ * initialising first.
+ */
+static void mpam_enable_merge_features(struct list_head *all_classes_list)
+{
+ struct mpam_class *class;
+ struct mpam_component *comp;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ list_for_each_entry(class, all_classes_list, classes_list) {
+ list_for_each_entry(comp, &class->components, class_list)
+ mpam_enable_merge_vmsc_features(comp);
+
+ mpam_enable_init_class_features(class);
+
+ list_for_each_entry(comp, &class->components, class_list)
+ mpam_enable_merge_class_features(comp);
+ }
+}
+
static void mpam_enable_once(void)
{
+ mutex_lock(&mpam_list_lock);
+ mpam_enable_merge_features(&mpam_classes);
+ mutex_unlock(&mpam_list_lock);
+
mutex_lock(&mpam_cpuhp_state_lock);
cpuhp_remove_state(mpam_cpuhp_state);
mpam_cpuhp_state = 0;
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index ae6fd1f62cc4..be56234b84b4 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -185,12 +185,20 @@ static inline void mpam_set_feature(enum mpam_device_features feat,
props->features |= (1 << feat);
}
+static inline void mpam_clear_feature(enum mpam_device_features feat,
+ mpam_features_t *supported)
+{
+ *supported &= ~(1 << feat);
+}
+
struct mpam_class {
/* mpam_components in this class */
struct list_head components;
cpumask_t affinity;
+ struct mpam_props props;
+ u32 nrdy_usec;
u8 level;
enum mpam_class_types type;
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 22/36] arm_mpam: Reset MSC controls from cpu hp callbacks
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (20 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 21/36] arm_mpam: Merge supported features during mpam_enable() into mpam_class James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-28 9:49 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 23/36] arm_mpam: Add a helper to touch an MSC from any CPU James Morse
` (14 subsequent siblings)
36 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse, Rohit Mathew
When a CPU comes online, it may bring a newly accessible MSC with
it. Only the default partid has its value reset by hardware, and
even then the MSC might not have been reset since its configuration
was previously dirtied, e.g. over kexec.
Any in-use partid must have its configuration restored, or reset.
In-use partids may be held in caches and evicted later.
MSCs are also reset when CPUs are taken offline, to cover cases where
firmware doesn't reset the MSC over a reboot using UEFI, or over kexec
where there is no firmware involvement.
If the configuration for a RIS has not been touched since it was
brought online, it does not need resetting again.
To reset, write the maximum values for all discovered controls.
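A worked example (illustrative, not code from the patch) of the last-word
arithmetic used when a portion bitmap of 'wd' bits is reset to all-ones:
full 32-bit words are written as ~0, and the final, possibly partial, word
covers bits [(wd - 1) % 32 : 0].

#include <linux/bits.h>

static u32 example_last_word_mask(u16 wd)       /* wd: bitmap width, must be non-zero */
{
        u32 msb = (wd - 1) % 32;

        return GENMASK(msb, 0); /* wd==17 -> 0x0001ffff, wd==32 -> 0xffffffff */
}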
CC: Rohit Mathew <Rohit.Mathew@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 124 +++++++++++++++++++-
drivers/platform/arm64/mpam/mpam_internal.h | 8 ++
2 files changed, 131 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 7b042a35405a..d014dbe0ab96 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -7,6 +7,7 @@
#include <linux/atomic.h>
#include <linux/arm_mpam.h>
#include <linux/bitfield.h>
+#include <linux/bitmap.h>
#include <linux/cacheinfo.h>
#include <linux/cpu.h>
#include <linux/cpumask.h>
@@ -849,8 +850,116 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
return 0;
}
+static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd)
+{
+ u32 num_words, msb;
+ u32 bm = ~0;
+ int i;
+
+ lockdep_assert_held(&msc->part_sel_lock);
+
+ if (wd == 0)
+ return;
+
+ /*
+ * Write all ~0 to all but the last 32bit-word, which may
+ * have fewer bits...
+ */
+ num_words = DIV_ROUND_UP(wd, 32);
+ for (i = 0; i < num_words - 1; i++, reg += sizeof(bm))
+ __mpam_write_reg(msc, reg, bm);
+
+ /*
+ * ....and then the last (maybe) partial 32bit word. When wd is a
+ * multiple of 32, msb should be 31 to write a full 32bit word.
+ */
+ msb = (wd - 1) % 32;
+ bm = GENMASK(msb, 0);
+ if (bm)
+ __mpam_write_reg(msc, reg, bm);
+}
+
+static void mpam_reset_ris_partid(struct mpam_msc_ris *ris, u16 partid)
+{
+ u16 bwa_fract = MPAMCFG_MBW_MAX_MAX;
+ struct mpam_msc *msc = ris->vmsc->msc;
+ struct mpam_props *rprops = &ris->props;
+
+ mpam_assert_srcu_read_lock_held();
+
+ mutex_lock(&msc->part_sel_lock);
+ __mpam_part_sel(ris->ris_idx, partid, msc);
+
+ if (mpam_has_feature(mpam_feat_cpor_part, rprops))
+ mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM, rprops->cpbm_wd);
+
+ if (mpam_has_feature(mpam_feat_mbw_part, rprops))
+ mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM, rprops->mbw_pbm_bits);
+
+ if (mpam_has_feature(mpam_feat_mbw_min, rprops))
+ mpam_write_partsel_reg(msc, MBW_MIN, 0);
+
+ if (mpam_has_feature(mpam_feat_mbw_max, rprops))
+ mpam_write_partsel_reg(msc, MBW_MAX, bwa_fract);
+
+ if (mpam_has_feature(mpam_feat_mbw_prop, rprops))
+ mpam_write_partsel_reg(msc, MBW_PROP, bwa_fract);
+ mutex_unlock(&msc->part_sel_lock);
+}
+
+static void mpam_reset_ris(struct mpam_msc_ris *ris)
+{
+ u16 partid, partid_max;
+
+ mpam_assert_srcu_read_lock_held();
+
+ if (ris->in_reset_state)
+ return;
+
+ spin_lock(&partid_max_lock);
+ partid_max = mpam_partid_max;
+ spin_unlock(&partid_max_lock);
+ for (partid = 0; partid < partid_max; partid++)
+ mpam_reset_ris_partid(ris, partid);
+}
+
+static void mpam_reset_msc(struct mpam_msc *msc, bool online)
+{
+ int idx;
+ struct mpam_msc_ris *ris;
+
+ mpam_assert_srcu_read_lock_held();
+
+ mpam_mon_sel_outer_lock(msc);
+ idx = srcu_read_lock(&mpam_srcu);
+ list_for_each_entry_srcu(ris, &msc->ris, msc_list, srcu_read_lock_held(&mpam_srcu)) {
+ mpam_reset_ris(ris);
+
+ /*
+ * Set in_reset_state when coming online. The reset state
+ * for non-zero partid may be lost while the CPUs are offline.
+ */
+ ris->in_reset_state = online;
+ }
+ srcu_read_unlock(&mpam_srcu, idx);
+ mpam_mon_sel_outer_unlock(msc);
+}
+
static int mpam_cpu_online(unsigned int cpu)
{
+ int idx;
+ struct mpam_msc *msc;
+
+ idx = srcu_read_lock(&mpam_srcu);
+ list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list, srcu_read_lock_held(&mpam_srcu)) {
+ if (!cpumask_test_cpu(cpu, &msc->accessibility))
+ continue;
+
+ if (atomic_fetch_inc(&msc->online_refs) == 0)
+ mpam_reset_msc(msc, true);
+ }
+ srcu_read_unlock(&mpam_srcu, idx);
+
return 0;
}
@@ -886,6 +995,19 @@ static int mpam_discovery_cpu_online(unsigned int cpu)
static int mpam_cpu_offline(unsigned int cpu)
{
+ int idx;
+ struct mpam_msc *msc;
+
+ idx = srcu_read_lock(&mpam_srcu);
+ list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list, srcu_read_lock_held(&mpam_srcu)) {
+ if (!cpumask_test_cpu(cpu, &msc->accessibility))
+ continue;
+
+ if (atomic_dec_and_test(&msc->online_refs))
+ mpam_reset_msc(msc, false);
+ }
+ srcu_read_unlock(&mpam_srcu, idx);
+
return 0;
}
@@ -1419,7 +1541,7 @@ static void mpam_enable_once(void)
mpam_register_cpuhp_callbacks(mpam_cpu_online, mpam_cpu_offline);
printk(KERN_INFO "MPAM enabled with %u partid and %u pmg\n",
- mpam_partid_max + 1, mpam_pmg_max + 1);
+ READ_ONCE(mpam_partid_max) + 1, mpam_pmg_max + 1);
}
/*
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index be56234b84b4..f3cc88136524 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -5,6 +5,7 @@
#define MPAM_INTERNAL_H
#include <linux/arm_mpam.h>
+#include <linux/atomic.h>
#include <linux/cpumask.h>
#include <linux/io.h>
#include <linux/llist.h>
@@ -43,6 +44,7 @@ struct mpam_msc {
struct pcc_mbox_chan *pcc_chan;
u32 nrdy_usec;
cpumask_t accessibility;
+ atomic_t online_refs;
/*
* probe_lock is only take during discovery. After discovery these
@@ -247,6 +249,7 @@ struct mpam_msc_ris {
u8 ris_idx;
u64 idr;
struct mpam_props props;
+ bool in_reset_state;
cpumask_t affinity;
@@ -266,6 +269,11 @@ struct mpam_msc_ris {
extern struct srcu_struct mpam_srcu;
extern struct list_head mpam_classes;
+static inline void mpam_assert_srcu_read_lock_held(void)
+{
+ WARN_ON_ONCE(!srcu_read_lock_held((&mpam_srcu)));
+}
+
/* System wide partid/pmg values */
extern u16 mpam_partid_max;
extern u8 mpam_pmg_max;
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 23/36] arm_mpam: Add a helper to touch an MSC from any CPU
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (21 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 22/36] arm_mpam: Reset MSC controls from cpu hp callbacks James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 24/36] arm_mpam: Extend reset logic to allow devices to be reset any time James Morse
` (13 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
Resetting RIS entries from the cpuhp callback is easy, as the callback
occurs on a CPU that can access the MSC. This won't be true for any
other caller that wants to reset or configure an MSC.
Add a helper that schedules the provided function if necessary.
Prevent the cpuhp callbacks from changing the MSC state by taking the
cpuhp lock.
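For illustration, a caller outside the cpuhp callbacks is expected to look
roughly like this (a sketch assuming 'msc' and 'ris' are already known and
valid, error handling omitted):

  cpus_read_lock();                       /* hold off the cpuhp callbacks */
  idx = srcu_read_lock(&mpam_srcu);       /* keep the MSC/RIS lists stable */
  err = mpam_touch_msc(msc, mpam_reset_ris, ris);
  srcu_read_unlock(&mpam_srcu, idx);
  cpus_read_unlock();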
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 37 ++++++++++++++++++++--
1 file changed, 34 insertions(+), 3 deletions(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index d014dbe0ab96..2e32e54cc081 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -907,20 +907,51 @@ static void mpam_reset_ris_partid(struct mpam_msc_ris *ris, u16 partid)
mutex_unlock(&msc->part_sel_lock);
}
-static void mpam_reset_ris(struct mpam_msc_ris *ris)
+/*
+ * Called via smp_call_on_cpu() to prevent migration, while still being
+ * pre-emptible.
+ */
+static int mpam_reset_ris(void *arg)
{
u16 partid, partid_max;
+ struct mpam_msc_ris *ris = arg;
mpam_assert_srcu_read_lock_held();
if (ris->in_reset_state)
- return;
+ return 0;
spin_lock(&partid_max_lock);
partid_max = mpam_partid_max;
spin_unlock(&partid_max_lock);
for (partid = 0; partid < partid_max; partid++)
mpam_reset_ris_partid(ris, partid);
+
+ return 0;
+}
+
+/*
+ * Get the preferred CPU for this MSC. If it is accessible from this CPU,
+ * this CPU is preferred. This can be preempted/migrated, it will only result
+ * in more work.
+ */
+static int mpam_get_msc_preferred_cpu(struct mpam_msc *msc)
+{
+ int cpu = raw_smp_processor_id();
+
+ if (cpumask_test_cpu(cpu, &msc->accessibility))
+ return cpu;
+
+ return cpumask_first_and(&msc->accessibility, cpu_online_mask);
+}
+
+static int mpam_touch_msc(struct mpam_msc *msc, int (*fn)(void *a), void *arg)
+{
+ lockdep_assert_irqs_enabled();
+ lockdep_assert_cpus_held();
+ mpam_assert_srcu_read_lock_held();
+
+ return smp_call_on_cpu(mpam_get_msc_preferred_cpu(msc), fn, arg, true);
}
static void mpam_reset_msc(struct mpam_msc *msc, bool online)
@@ -933,7 +964,7 @@ static void mpam_reset_msc(struct mpam_msc *msc, bool online)
mpam_mon_sel_outer_lock(msc);
idx = srcu_read_lock(&mpam_srcu);
list_for_each_entry_srcu(ris, &msc->ris, msc_list, srcu_read_lock_held(&mpam_srcu)) {
- mpam_reset_ris(ris);
+ mpam_touch_msc(msc, &mpam_reset_ris, ris);
/*
* Set in_reset_state when coming online. The reset state
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 24/36] arm_mpam: Extend reset logic to allow devices to be reset any time
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (22 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 23/36] arm_mpam: Add a helper to touch an MSC from any CPU James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-28 10:22 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 25/36] arm_mpam: Register and enable IRQs James Morse
` (12 subsequent siblings)
36 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
cpuhp callbacks aren't the only time the MSC configuration may need to
be reset. Resctrl has an API call to reset a class.
If an MPAM error interrupt arrives it indicates the driver has
misprogrammed an MSC. The safest thing to do is reset all the MSCs
and disable MPAM.
Add a helper to reset RIS via their class. Call this from mpam_disable(),
which can be scheduled from the error interrupt handler.
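In outline, the intended call-flow is (a sketch; the work item that invokes
mpam_disable() from the error interrupt is added by a later patch):

  /* error interrupt handler (later patch) */
  schedule_work(&mpam_broken_work);       /* runs mpam_disable() in process context */

  /* mpam_disable() */
  list_for_each_entry_srcu(class, &mpam_classes, classes_list, ...)
          mpam_reset_class(class);        /* takes cpus_read_lock() internally */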
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 62 ++++++++++++++++++++-
drivers/platform/arm64/mpam/mpam_internal.h | 1 +
2 files changed, 61 insertions(+), 2 deletions(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 2e32e54cc081..145535cd4732 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -916,8 +916,6 @@ static int mpam_reset_ris(void *arg)
u16 partid, partid_max;
struct mpam_msc_ris *ris = arg;
- mpam_assert_srcu_read_lock_held();
-
if (ris->in_reset_state)
return 0;
@@ -1575,6 +1573,66 @@ static void mpam_enable_once(void)
READ_ONCE(mpam_partid_max) + 1, mpam_pmg_max + 1);
}
+static void mpam_reset_component_locked(struct mpam_component *comp)
+{
+ int idx;
+ struct mpam_msc *msc;
+ struct mpam_vmsc *vmsc;
+ struct mpam_msc_ris *ris;
+
+ might_sleep();
+ lockdep_assert_cpus_held();
+
+ idx = srcu_read_lock(&mpam_srcu);
+ list_for_each_entry_rcu(vmsc, &comp->vmsc, comp_list) {
+ msc = vmsc->msc;
+
+ list_for_each_entry_rcu(ris, &vmsc->ris, vmsc_list) {
+ if (!ris->in_reset_state)
+ mpam_touch_msc(msc, mpam_reset_ris, ris);
+ ris->in_reset_state = true;
+ }
+ }
+ srcu_read_unlock(&mpam_srcu, idx);
+}
+
+static void mpam_reset_class_locked(struct mpam_class *class)
+{
+ int idx;
+ struct mpam_component *comp;
+
+ lockdep_assert_cpus_held();
+
+ idx = srcu_read_lock(&mpam_srcu);
+ list_for_each_entry_rcu(comp, &class->components, class_list)
+ mpam_reset_component_locked(comp);
+ srcu_read_unlock(&mpam_srcu, idx);
+}
+
+static void mpam_reset_class(struct mpam_class *class)
+{
+ cpus_read_lock();
+ mpam_reset_class_locked(class);
+ cpus_read_unlock();
+}
+
+/*
+ * Called in response to an error IRQ.
+ * All of MPAMs errors indicate a software bug, restore any modified
+ * controls to their reset values.
+ */
+void mpam_disable(void)
+{
+ int idx;
+ struct mpam_class *class;
+
+ idx = srcu_read_lock(&mpam_srcu);
+ list_for_each_entry_srcu(class, &mpam_classes, classes_list,
+ srcu_read_lock_held(&mpam_srcu))
+ mpam_reset_class(class);
+ srcu_read_unlock(&mpam_srcu, idx);
+}
+
/*
* Enable mpam once all devices have been probed.
* Scheduled by mpam_discovery_cpu_online() once all devices have been created.
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index f3cc88136524..de05eece0a31 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -280,6 +280,7 @@ extern u8 mpam_pmg_max;
/* Scheduled work callback to enable mpam once all MSC have been probed */
void mpam_enable(struct work_struct *work);
+void mpam_disable(void);
int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
cpumask_t *affinity);
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 25/36] arm_mpam: Register and enable IRQs
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (23 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 24/36] arm_mpam: Extend reset logic to allow devices to be reset any time James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-16 7:31 ` Shaopeng Tan (Fujitsu)
` (4 more replies)
2025-07-11 18:36 ` [RFC PATCH 26/36] arm_mpam: Use a static key to indicate when mpam is enabled James Morse
` (11 subsequent siblings)
36 siblings, 5 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
Register and enable error IRQs. All the MPAM error interrupts indicate a
software bug, e.g. out of range partid. If the error interrupt is ever
signalled, attempt to disable MPAM.
Only the irq handler accesses the ESR register, so no locking is needed.
The work to disable MPAM after an error needs to happen at process
context, use a threaded interrupt.
There is no support for percpu threaded interrupts, so for now the
work is scheduled from the irq handler.
Enabling the IRQs in the MSC may involve cross calling to a CPU that
can access the MSC.
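Condensed, the two registration paths added below behave like this (a sketch
of the code in this patch, not additional behaviour):

  /* shared (SPI) case: defer the teardown to a threaded handler */
  err = devm_request_threaded_irq(&msc->pdev->dev, irq,
                                  &mpam_spi_handler,     /* hard irq: read and clear ESR */
                                  &mpam_disable_thread,  /* thread: disable MPAM */
                                  IRQF_SHARED, "mpam:msc:error", msc);

  /* percpu (PPI) case: no threaded variant, so the hard handler does */
  mpam_disable_msc_ecr(msc);
  schedule_work(&mpam_broken_work);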
CC: Rohit Mathew <rohit.mathew@arm.com>
Tested-by: Rohit Mathew <rohit.mathew@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 304 +++++++++++++++++++-
drivers/platform/arm64/mpam/mpam_internal.h | 9 +-
2 files changed, 307 insertions(+), 6 deletions(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 145535cd4732..af19cc25d16e 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -14,6 +14,9 @@
#include <linux/device.h>
#include <linux/errno.h>
#include <linux/gfp.h>
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/irqdesc.h>
#include <linux/list.h>
#include <linux/lockdep.h>
#include <linux/mutex.h>
@@ -62,6 +65,12 @@ static DEFINE_SPINLOCK(partid_max_lock);
*/
static DECLARE_WORK(mpam_enable_work, &mpam_enable);
+/*
+ * All mpam error interrupts indicate a software bug. On receipt, disable the
+ * driver.
+ */
+static DECLARE_WORK(mpam_broken_work, &mpam_disable);
+
/*
* An MSC is a physical container for controls and monitors, each identified by
* their RIS index. These share a base-address, interrupts and some MMIO
@@ -159,6 +168,24 @@ static u64 mpam_msc_read_idr(struct mpam_msc *msc)
return (idr_high << 32) | idr_low;
}
+static void mpam_msc_zero_esr(struct mpam_msc *msc)
+{
+ __mpam_write_reg(msc, MPAMF_ESR, 0);
+ if (msc->has_extd_esr)
+ __mpam_write_reg(msc, MPAMF_ESR + 4, 0);
+}
+
+static u64 mpam_msc_read_esr(struct mpam_msc *msc)
+{
+ u64 esr_high = 0, esr_low;
+
+ esr_low = __mpam_read_reg(msc, MPAMF_ESR);
+ if (msc->has_extd_esr)
+ esr_high = __mpam_read_reg(msc, MPAMF_ESR + 4);
+
+ return (esr_high << 32) | esr_low;
+}
+
static void __mpam_part_sel_raw(u32 partsel, struct mpam_msc *msc)
{
lockdep_assert_held(&msc->part_sel_lock);
@@ -405,12 +432,12 @@ static void mpam_msc_destroy(struct mpam_msc *msc)
lockdep_assert_held(&mpam_list_lock);
- list_del_rcu(&msc->glbl_list);
- platform_set_drvdata(pdev, NULL);
-
list_for_each_entry_safe(ris, tmp, &msc->ris, msc_list)
mpam_ris_destroy(ris);
+ list_del_rcu(&msc->glbl_list);
+ platform_set_drvdata(pdev, NULL);
+
add_to_garbage(msc);
msc->garbage.pdev = pdev;
}
@@ -828,6 +855,7 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
pmg_max = FIELD_GET(MPAMF_IDR_PMG_MAX, idr);
msc->partid_max = min(msc->partid_max, partid_max);
msc->pmg_max = min(msc->pmg_max, pmg_max);
+ msc->has_extd_esr = FIELD_GET(MPAMF_IDR_HAS_EXT_ESR, idr);
ris = mpam_get_or_create_ris(msc, ris_idx);
if (IS_ERR(ris))
@@ -974,6 +1002,13 @@ static void mpam_reset_msc(struct mpam_msc *msc, bool online)
mpam_mon_sel_outer_unlock(msc);
}
+static void _enable_percpu_irq(void *_irq)
+{
+ int *irq = _irq;
+
+ enable_percpu_irq(*irq, IRQ_TYPE_NONE);
+}
+
static int mpam_cpu_online(unsigned int cpu)
{
int idx;
@@ -984,6 +1019,9 @@ static int mpam_cpu_online(unsigned int cpu)
if (!cpumask_test_cpu(cpu, &msc->accessibility))
continue;
+ if (msc->reenable_error_ppi)
+ _enable_percpu_irq(&msc->reenable_error_ppi);
+
if (atomic_fetch_inc(&msc->online_refs) == 0)
mpam_reset_msc(msc, true);
}
@@ -1032,6 +1070,9 @@ static int mpam_cpu_offline(unsigned int cpu)
if (!cpumask_test_cpu(cpu, &msc->accessibility))
continue;
+ if (msc->reenable_error_ppi)
+ disable_percpu_irq(msc->reenable_error_ppi);
+
if (atomic_dec_and_test(&msc->online_refs))
mpam_reset_msc(msc, false);
}
@@ -1058,6 +1099,51 @@ static void mpam_register_cpuhp_callbacks(int (*online)(unsigned int online),
mutex_unlock(&mpam_cpuhp_state_lock);
}
+static int __setup_ppi(struct mpam_msc *msc)
+{
+ int cpu;
+
+ msc->error_dev_id = alloc_percpu_gfp(struct mpam_msc *, GFP_KERNEL);
+ if (!msc->error_dev_id)
+ return -ENOMEM;
+
+ for_each_cpu(cpu, &msc->accessibility) {
+ struct mpam_msc *empty = *per_cpu_ptr(msc->error_dev_id, cpu);
+
+ if (empty) {
+ pr_err_once("%s shares PPI with %s!\n",
+ dev_name(&msc->pdev->dev),
+ dev_name(&empty->pdev->dev));
+ return -EBUSY;
+ }
+ *per_cpu_ptr(msc->error_dev_id, cpu) = msc;
+ }
+
+ return 0;
+}
+
+static int mpam_msc_setup_error_irq(struct mpam_msc *msc)
+{
+ int irq;
+
+ irq = platform_get_irq_byname_optional(msc->pdev, "error");
+ if (irq <= 0)
+ return 0;
+
+ /* Allocate and initialise the percpu device pointer for PPI */
+ if (irq_is_percpu(irq))
+ return __setup_ppi(msc);
+
+ /* sanity check: shared interrupts can be routed anywhere? */
+ if (!cpumask_equal(&msc->accessibility, cpu_possible_mask)) {
+ pr_err_once("msc:%u is a private resource with a shared error interrupt",
+ msc->id);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static int mpam_dt_count_msc(void)
{
int count = 0;
@@ -1266,6 +1352,10 @@ static int mpam_msc_drv_probe(struct platform_device *pdev)
break;
}
+ err = mpam_msc_setup_error_irq(msc);
+ if (err)
+ break;
+
if (device_property_read_u32(&pdev->dev, "pcc-channel",
&msc->pcc_subspace_id))
msc->iface = MPAM_IFACE_MMIO;
@@ -1548,11 +1638,193 @@ static void mpam_enable_merge_features(struct list_head *all_classes_list)
}
}
+static char *mpam_errcode_names[16] = {
+ [0] = "No error",
+ [1] = "PARTID_SEL_Range",
+ [2] = "Req_PARTID_Range",
+ [3] = "MSMONCFG_ID_RANGE",
+ [4] = "Req_PMG_Range",
+ [5] = "Monitor_Range",
+ [6] = "intPARTID_Range",
+ [7] = "Unexpected_INTERNAL",
+ [8] = "Undefined_RIS_PART_SEL",
+ [9] = "RIS_No_Control",
+ [10] = "Undefined_RIS_MON_SEL",
+ [11] = "RIS_No_Monitor",
+ [12 ... 15] = "Reserved"
+};
+
+static int mpam_enable_msc_ecr(void *_msc)
+{
+ struct mpam_msc *msc = _msc;
+
+ __mpam_write_reg(msc, MPAMF_ECR, 1);
+
+ return 0;
+}
+
+static int mpam_disable_msc_ecr(void *_msc)
+{
+ struct mpam_msc *msc = _msc;
+
+ __mpam_write_reg(msc, MPAMF_ECR, 0);
+
+ return 0;
+}
+
+static irqreturn_t __mpam_irq_handler(int irq, struct mpam_msc *msc)
+{
+ u64 reg;
+ u16 partid;
+ u8 errcode, pmg, ris;
+
+ if (WARN_ON_ONCE(!msc) ||
+ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(),
+ &msc->accessibility)))
+ return IRQ_NONE;
+
+ reg = mpam_msc_read_esr(msc);
+
+ errcode = FIELD_GET(MPAMF_ESR_ERRCODE, reg);
+ if (!errcode)
+ return IRQ_NONE;
+
+ /* Clear level triggered irq */
+ mpam_msc_zero_esr(msc);
+
+ partid = FIELD_GET(MPAMF_ESR_PARTID_OR_MON, reg);
+ pmg = FIELD_GET(MPAMF_ESR_PMG, reg);
+ ris = FIELD_GET(MPAMF_ESR_RIS, reg);
+
+ pr_err("error irq from msc:%u '%s', partid:%u, pmg: %u, ris: %u\n",
+ msc->id, mpam_errcode_names[errcode], partid, pmg, ris);
+
+ if (irq_is_percpu(irq)) {
+ mpam_disable_msc_ecr(msc);
+ schedule_work(&mpam_broken_work);
+ return IRQ_HANDLED;
+ }
+
+ return IRQ_WAKE_THREAD;
+}
+
+static irqreturn_t mpam_ppi_handler(int irq, void *dev_id)
+{
+ struct mpam_msc *msc = *(struct mpam_msc **)dev_id;
+
+ return __mpam_irq_handler(irq, msc);
+}
+
+static irqreturn_t mpam_spi_handler(int irq, void *dev_id)
+{
+ struct mpam_msc *msc = dev_id;
+
+ return __mpam_irq_handler(irq, msc);
+}
+
+static irqreturn_t mpam_disable_thread(int irq, void *dev_id);
+
+static int mpam_register_irqs(void)
+{
+ int err, irq, idx;
+ struct mpam_msc *msc;
+
+ lockdep_assert_cpus_held();
+
+ idx = srcu_read_lock(&mpam_srcu);
+ list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list, srcu_read_lock_held(&mpam_srcu)) {
+ irq = platform_get_irq_byname_optional(msc->pdev, "error");
+ if (irq <= 0)
+ continue;
+
+ /* The MPAM spec says the interrupt can be SPI, PPI or LPI */
+ /* We anticipate sharing the interrupt with other MSCs */
+ if (irq_is_percpu(irq)) {
+ err = request_percpu_irq(irq, &mpam_ppi_handler,
+ "mpam:msc:error",
+ msc->error_dev_id);
+ if (err)
+ return err;
+
+ msc->reenable_error_ppi = irq;
+ smp_call_function_many(&msc->accessibility,
+ &_enable_percpu_irq, &irq,
+ true);
+ } else {
+ err = devm_request_threaded_irq(&msc->pdev->dev, irq,
+ &mpam_spi_handler,
+ &mpam_disable_thread,
+ IRQF_SHARED,
+ "mpam:msc:error", msc);
+ if (err)
+ return err;
+ }
+
+ msc->error_irq_requested = true;
+ mpam_touch_msc(msc, mpam_enable_msc_ecr, msc);
+ msc->error_irq_hw_enabled = true;
+ }
+ srcu_read_unlock(&mpam_srcu, idx);
+
+ return 0;
+}
+
+static void mpam_unregister_irqs(void)
+{
+ int irq, idx;
+ struct mpam_msc *msc;
+
+ cpus_read_lock();
+ /* take the lock as free_irq() can sleep */
+ idx = srcu_read_lock(&mpam_srcu);
+ list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list, srcu_read_lock_held(&mpam_srcu)) {
+ irq = platform_get_irq_byname_optional(msc->pdev, "error");
+ if (irq <= 0)
+ continue;
+
+ if (msc->error_irq_hw_enabled) {
+ mpam_touch_msc(msc, mpam_disable_msc_ecr, msc);
+ msc->error_irq_hw_enabled = false;
+ }
+
+ if (msc->error_irq_requested) {
+ if (irq_is_percpu(irq)) {
+ msc->reenable_error_ppi = 0;
+ free_percpu_irq(irq, msc->error_dev_id);
+ } else {
+ devm_free_irq(&msc->pdev->dev, irq, msc);
+ }
+ msc->error_irq_requested = false;
+ }
+ }
+ srcu_read_unlock(&mpam_srcu, idx);
+ cpus_read_unlock();
+}
+
static void mpam_enable_once(void)
{
+ int err;
+
+ /*
+ * If all the MSC have been probed, enabling the IRQs happens next.
+ * That involves cross-calling to a CPU that can reach the MSC, and
+ * the locks must be taken in this order:
+ */
+ cpus_read_lock();
mutex_lock(&mpam_list_lock);
mpam_enable_merge_features(&mpam_classes);
+
+ err = mpam_register_irqs();
+ if (err)
+ pr_warn("Failed to register irqs: %d\n", err);
+
mutex_unlock(&mpam_list_lock);
+ cpus_read_unlock();
+
+ if (err) {
+ schedule_work(&mpam_broken_work);
+ return;
+ }
mutex_lock(&mpam_cpuhp_state_lock);
cpuhp_remove_state(mpam_cpuhp_state);
@@ -1621,16 +1893,39 @@ static void mpam_reset_class(struct mpam_class *class)
* All of MPAMs errors indicate a software bug, restore any modified
* controls to their reset values.
*/
-void mpam_disable(void)
+static irqreturn_t mpam_disable_thread(int irq, void *dev_id)
{
int idx;
struct mpam_class *class;
+ struct mpam_msc *msc, *tmp;
+
+ mutex_lock(&mpam_cpuhp_state_lock);
+ if (mpam_cpuhp_state) {
+ cpuhp_remove_state(mpam_cpuhp_state);
+ mpam_cpuhp_state = 0;
+ }
+ mutex_unlock(&mpam_cpuhp_state_lock);
+
+ mpam_unregister_irqs();
idx = srcu_read_lock(&mpam_srcu);
list_for_each_entry_srcu(class, &mpam_classes, classes_list,
srcu_read_lock_held(&mpam_srcu))
mpam_reset_class(class);
srcu_read_unlock(&mpam_srcu, idx);
+
+ mutex_lock(&mpam_list_lock);
+ list_for_each_entry_safe(msc, tmp, &mpam_all_msc, glbl_list)
+ mpam_msc_destroy(msc);
+ mutex_unlock(&mpam_list_lock);
+ mpam_free_garbage();
+
+ return IRQ_HANDLED;
+}
+
+void mpam_disable(struct work_struct *ignored)
+{
+ mpam_disable_thread(0, NULL);
}
/*
@@ -1644,7 +1939,6 @@ void mpam_enable(struct work_struct *work)
struct mpam_msc *msc;
bool all_devices_probed = true;
- /* Have we probed all the hw devices? */
mutex_lock(&mpam_list_lock);
list_for_each_entry(msc, &mpam_all_msc, glbl_list) {
mutex_lock(&msc->probe_lock);
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index de05eece0a31..e1c6a2676b54 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -44,6 +44,11 @@ struct mpam_msc {
struct pcc_mbox_chan *pcc_chan;
u32 nrdy_usec;
cpumask_t accessibility;
+ bool has_extd_esr;
+
+ int reenable_error_ppi;
+ struct mpam_msc * __percpu *error_dev_id;
+
atomic_t online_refs;
/*
@@ -52,6 +57,8 @@ struct mpam_msc {
*/
struct mutex probe_lock;
bool probed;
+ bool error_irq_requested;
+ bool error_irq_hw_enabled;
u16 partid_max;
u8 pmg_max;
unsigned long ris_idxs[128 / BITS_PER_LONG];
@@ -280,7 +287,7 @@ extern u8 mpam_pmg_max;
/* Scheduled work callback to enable mpam once all MSC have been probed */
void mpam_enable(struct work_struct *work);
-void mpam_disable(void);
+void mpam_disable(struct work_struct *work);
int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
cpumask_t *affinity);
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 26/36] arm_mpam: Use a static key to indicate when mpam is enabled
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (24 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 25/36] arm_mpam: Register and enable IRQs James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online James Morse
` (10 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
Once all the MSCs have been probed, the system-wide usable number of
PARTIDs is known and the configuration arrays can be allocated.
After this point, checking whether all the MSCs have been probed is
pointless, and the cpuhp callbacks should restore the configuration
instead of just resetting the MSC.
Add a static key to enable this behaviour. This will also allow MPAM
to be disabled in response to an error, and the architecture code to
enable/disable the context switch of the MPAM system registers.
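On hot paths the key is expected to be used like this (a minimal sketch; the
__mpam_sched_in() hook named here is hypothetical and arrives with the arch
code later):

  if (mpam_is_enabled())          /* static_branch_likely(&mpam_enabled) */
          __mpam_sched_in(tsk);   /* hypothetical context-switch hook */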
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 8 ++++++++
drivers/platform/arm64/mpam/mpam_internal.h | 8 ++++++++
2 files changed, 16 insertions(+)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index af19cc25d16e..bb3695eb84e9 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -33,6 +33,8 @@
#include "mpam_internal.h"
+DEFINE_STATIC_KEY_FALSE(mpam_enabled); /* TODO: move to arch code */
+
/*
* mpam_list_lock protects the SRCU lists when writing. Once the
* mpam_enabled key is enabled these lists are read-only,
@@ -1037,6 +1039,9 @@ static int mpam_discovery_cpu_online(unsigned int cpu)
struct mpam_msc *msc;
bool new_device_probed = false;
+ if (mpam_is_enabled())
+ return 0;
+
mutex_lock(&mpam_list_lock);
list_for_each_entry(msc, &mpam_all_msc, glbl_list) {
if (!cpumask_test_cpu(cpu, &msc->accessibility))
@@ -1839,6 +1844,7 @@ static void mpam_enable_once(void)
partid_max_published = true;
spin_unlock(&partid_max_lock);
+ static_branch_enable(&mpam_enabled);
mpam_register_cpuhp_callbacks(mpam_cpu_online, mpam_cpu_offline);
printk(KERN_INFO "MPAM enabled with %u partid and %u pmg\n",
@@ -1906,6 +1912,8 @@ static irqreturn_t mpam_disable_thread(int irq, void *dev_id)
}
mutex_unlock(&mpam_cpuhp_state_lock);
+ static_branch_disable(&mpam_enabled);
+
mpam_unregister_irqs();
idx = srcu_read_lock(&mpam_srcu);
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index e1c6a2676b54..1a24424b48df 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -8,6 +8,7 @@
#include <linux/atomic.h>
#include <linux/cpumask.h>
#include <linux/io.h>
+#include <linux/jump_label.h>
#include <linux/llist.h>
#include <linux/mailbox_client.h>
#include <linux/mutex.h>
@@ -15,6 +16,13 @@
#include <linux/sizes.h>
#include <linux/srcu.h>
+DECLARE_STATIC_KEY_FALSE(mpam_enabled);
+
+static inline bool mpam_is_enabled(void)
+{
+ return static_branch_likely(&mpam_enabled);
+}
+
/*
* Structures protected by SRCU may not be freed for a surprising amount of
* time (especially if perf is running). To ensure the MPAM error interrupt can
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (25 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 26/36] arm_mpam: Use a static key to indicate when mpam is enabled James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-16 6:49 ` Shaopeng Tan (Fujitsu)
` (2 more replies)
2025-07-11 18:36 ` [RFC PATCH 28/36] arm_mpam: Probe and reset the rest of the features James Morse
` (9 subsequent siblings)
36 siblings, 3 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse, Dave Martin
When CPUs come online, the original configuration should be restored.
Once the maximum partid is known, allocate a configuration array for
each component, and reprogram each RIS configuration from this.
The MPAM spec describes how multiple controls can interact. To prevent
this happening by accident, always reset controls that don't have a
valid configuration. This allows the same helper to be used for
configuration and reset.
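A sketch of how a future caller (e.g. resctrl) is expected to apply a
configuration with this patch's helper; the values and the 'comp'/'partid'
variables are illustrative:

  struct mpam_config cfg = { 0 };

  cfg.cpbm = 0x3f;                                /* example cache portion bitmap */
  cfg.features |= (1 << mpam_feat_cpor_part);     /* mark which fields are valid */

  cpus_read_lock();
  err = mpam_apply_config(comp, partid, &cfg);
  cpus_read_unlock();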
CC: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 236 ++++++++++++++++++--
drivers/platform/arm64/mpam/mpam_internal.h | 26 ++-
2 files changed, 234 insertions(+), 28 deletions(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index bb3695eb84e9..f3ecfda265d2 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -374,12 +374,16 @@ static void mpam_class_destroy(struct mpam_class *class)
add_to_garbage(class);
}
+static void __destroy_component_cfg(struct mpam_component *comp);
+
static void mpam_comp_destroy(struct mpam_component *comp)
{
struct mpam_class *class = comp->class;
lockdep_assert_held(&mpam_list_lock);
+ __destroy_component_cfg(comp);
+
list_del_rcu(&comp->class_list);
add_to_garbage(comp);
@@ -909,51 +913,90 @@ static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd)
__mpam_write_reg(msc, reg, bm);
}
-static void mpam_reset_ris_partid(struct mpam_msc_ris *ris, u16 partid)
+/* Called via IPI. Call while holding an SRCU reference */
+static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
+ struct mpam_config *cfg)
{
u16 bwa_fract = MPAMCFG_MBW_MAX_MAX;
struct mpam_msc *msc = ris->vmsc->msc;
struct mpam_props *rprops = &ris->props;
- mpam_assert_srcu_read_lock_held();
-
mutex_lock(&msc->part_sel_lock);
__mpam_part_sel(ris->ris_idx, partid, msc);
- if (mpam_has_feature(mpam_feat_cpor_part, rprops))
- mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM, rprops->cpbm_wd);
+ if (mpam_has_feature(mpam_feat_cpor_part, rprops)) {
+ if (mpam_has_feature(mpam_feat_cpor_part, cfg))
+ mpam_write_partsel_reg(msc, CPBM, cfg->cpbm);
+ else
+ mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM,
+ rprops->cpbm_wd);
+ }
- if (mpam_has_feature(mpam_feat_mbw_part, rprops))
- mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM, rprops->mbw_pbm_bits);
+ if (mpam_has_feature(mpam_feat_mbw_part, rprops)) {
+ if (mpam_has_feature(mpam_feat_mbw_part, cfg))
+ mpam_write_partsel_reg(msc, MBW_PBM, cfg->mbw_pbm);
+ else
+ mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM,
+ rprops->mbw_pbm_bits);
+ }
if (mpam_has_feature(mpam_feat_mbw_min, rprops))
mpam_write_partsel_reg(msc, MBW_MIN, 0);
- if (mpam_has_feature(mpam_feat_mbw_max, rprops))
- mpam_write_partsel_reg(msc, MBW_MAX, bwa_fract);
+ if (mpam_has_feature(mpam_feat_mbw_max, rprops)) {
+ if (mpam_has_feature(mpam_feat_mbw_max, cfg))
+ mpam_write_partsel_reg(msc, MBW_MAX, cfg->mbw_max);
+ else
+ mpam_write_partsel_reg(msc, MBW_MAX, bwa_fract);
+ }
if (mpam_has_feature(mpam_feat_mbw_prop, rprops))
mpam_write_partsel_reg(msc, MBW_PROP, bwa_fract);
mutex_unlock(&msc->part_sel_lock);
}
+struct reprogram_ris {
+ struct mpam_msc_ris *ris;
+ struct mpam_config *cfg;
+};
+
+/* Call with MSC lock held */
+static int mpam_reprogram_ris(void *_arg)
+{
+ u16 partid, partid_max;
+ struct reprogram_ris *arg = _arg;
+ struct mpam_msc_ris *ris = arg->ris;
+ struct mpam_config *cfg = arg->cfg;
+
+ if (ris->in_reset_state)
+ return 0;
+
+ spin_lock(&partid_max_lock);
+ partid_max = mpam_partid_max;
+ spin_unlock(&partid_max_lock);
+ for (partid = 0; partid <= partid_max; partid++)
+ mpam_reprogram_ris_partid(ris, partid, cfg);
+
+ return 0;
+}
+
/*
* Called via smp_call_on_cpu() to prevent migration, while still being
* pre-emptible.
*/
static int mpam_reset_ris(void *arg)
{
- u16 partid, partid_max;
struct mpam_msc_ris *ris = arg;
+ struct reprogram_ris reprogram_arg;
+ struct mpam_config empty_cfg = { 0 };
if (ris->in_reset_state)
return 0;
- spin_lock(&partid_max_lock);
- partid_max = mpam_partid_max;
- spin_unlock(&partid_max_lock);
- for (partid = 0; partid < partid_max; partid++)
- mpam_reset_ris_partid(ris, partid);
+ reprogram_arg.ris = ris;
+ reprogram_arg.cfg = &empty_cfg;
+
+ mpam_reprogram_ris(&reprogram_arg);
return 0;
}
@@ -984,13 +1027,11 @@ static int mpam_touch_msc(struct mpam_msc *msc, int (*fn)(void *a), void *arg)
static void mpam_reset_msc(struct mpam_msc *msc, bool online)
{
- int idx;
struct mpam_msc_ris *ris;
mpam_assert_srcu_read_lock_held();
mpam_mon_sel_outer_lock(msc);
- idx = srcu_read_lock(&mpam_srcu);
list_for_each_entry_srcu(ris, &msc->ris, msc_list, srcu_read_lock_held(&mpam_srcu)) {
mpam_touch_msc(msc, &mpam_reset_ris, ris);
@@ -1000,10 +1041,38 @@ static void mpam_reset_msc(struct mpam_msc *msc, bool online)
*/
ris->in_reset_state = online;
}
- srcu_read_unlock(&mpam_srcu, idx);
mpam_mon_sel_outer_unlock(msc);
}
+static void mpam_reprogram_msc(struct mpam_msc *msc)
+{
+ int idx;
+ u16 partid;
+ bool reset;
+ struct mpam_config *cfg;
+ struct mpam_msc_ris *ris;
+
+ idx = srcu_read_lock(&mpam_srcu);
+ list_for_each_entry_rcu(ris, &msc->ris, msc_list) {
+ if (!mpam_is_enabled() && !ris->in_reset_state) {
+ mpam_touch_msc(msc, &mpam_reset_ris, ris);
+ ris->in_reset_state = true;
+ continue;
+ }
+
+ reset = true;
+ for (partid = 0; partid <= mpam_partid_max; partid++) {
+ cfg = &ris->vmsc->comp->cfg[partid];
+ if (cfg->features)
+ reset = false;
+
+ mpam_reprogram_ris_partid(ris, partid, cfg);
+ }
+ ris->in_reset_state = reset;
+ }
+ srcu_read_unlock(&mpam_srcu, idx);
+}
+
static void _enable_percpu_irq(void *_irq)
{
int *irq = _irq;
@@ -1025,7 +1094,7 @@ static int mpam_cpu_online(unsigned int cpu)
_enable_percpu_irq(&msc->reenable_error_ppi);
if (atomic_fetch_inc(&msc->online_refs) == 0)
- mpam_reset_msc(msc, true);
+ mpam_reprogram_msc(msc);
}
srcu_read_unlock(&mpam_srcu, idx);
@@ -1806,6 +1875,43 @@ static void mpam_unregister_irqs(void)
cpus_read_unlock();
}
+static void __destroy_component_cfg(struct mpam_component *comp)
+{
+ add_to_garbage(comp->cfg);
+}
+
+static int __allocate_component_cfg(struct mpam_component *comp)
+{
+ if (comp->cfg)
+ return 0;
+
+ comp->cfg = kcalloc(mpam_partid_max + 1, sizeof(*comp->cfg), GFP_KERNEL);
+ if (!comp->cfg)
+ return -ENOMEM;
+ init_garbage(comp->cfg);
+
+ return 0;
+}
+
+static int mpam_allocate_config(void)
+{
+ int err = 0;
+ struct mpam_class *class;
+ struct mpam_component *comp;
+
+ lockdep_assert_held(&mpam_list_lock);
+
+ list_for_each_entry(class, &mpam_classes, classes_list) {
+ list_for_each_entry(comp, &class->components, class_list) {
+ err = __allocate_component_cfg(comp);
+ if (err)
+ return err;
+ }
+ }
+
+ return 0;
+}
+
static void mpam_enable_once(void)
{
int err;
@@ -1817,12 +1923,21 @@ static void mpam_enable_once(void)
*/
cpus_read_lock();
mutex_lock(&mpam_list_lock);
- mpam_enable_merge_features(&mpam_classes);
+ do {
+ mpam_enable_merge_features(&mpam_classes);
- err = mpam_register_irqs();
- if (err)
- pr_warn("Failed to register irqs: %d\n", err);
+ err = mpam_allocate_config();
+ if (err) {
+ pr_err("Failed to allocate configuration arrays.\n");
+ break;
+ }
+ err = mpam_register_irqs();
+ if (err) {
+ pr_warn("Failed to register irqs: %d\n", err);
+ break;
+ }
+ } while (0);
mutex_unlock(&mpam_list_lock);
cpus_read_unlock();
@@ -1861,6 +1976,8 @@ static void mpam_reset_component_locked(struct mpam_component *comp)
might_sleep();
lockdep_assert_cpus_held();
+ memset(comp->cfg, 0, (mpam_partid_max + 1) * sizeof(*comp->cfg));
+
idx = srcu_read_lock(&mpam_srcu);
list_for_each_entry_rcu(vmsc, &comp->vmsc, comp_list) {
msc = vmsc->msc;
@@ -1963,6 +2080,79 @@ void mpam_enable(struct work_struct *work)
mpam_enable_once();
}
+struct mpam_write_config_arg {
+ struct mpam_msc_ris *ris;
+ struct mpam_component *comp;
+ u16 partid;
+};
+
+static int __write_config(void *arg)
+{
+ struct mpam_write_config_arg *c = arg;
+
+ mpam_reprogram_ris_partid(c->ris, c->partid, &c->comp->cfg[c->partid]);
+
+ return 0;
+}
+
+#define maybe_update_config(cfg, feature, newcfg, member, changes) do { \
+ if (mpam_has_feature(feature, newcfg) && \
+ (newcfg)->member != (cfg)->member) { \
+ (cfg)->member = (newcfg)->member; \
+ cfg->features |= (1 << feature); \
+ \
+ (changes) |= (1 << feature); \
+ } \
+} while (0)
+
+static mpam_features_t mpam_update_config(struct mpam_config *cfg,
+ const struct mpam_config *newcfg)
+{
+ mpam_features_t changes = 0;
+
+ maybe_update_config(cfg, mpam_feat_cpor_part, newcfg, cpbm, changes);
+ maybe_update_config(cfg, mpam_feat_mbw_part, newcfg, mbw_pbm, changes);
+ maybe_update_config(cfg, mpam_feat_mbw_max, newcfg, mbw_max, changes);
+
+ return changes;
+}
+
+/* TODO: split into write_config/sync_config */
+/* TODO: add config_dirty bitmap to drive sync_config */
+int mpam_apply_config(struct mpam_component *comp, u16 partid,
+ struct mpam_config *cfg)
+{
+ struct mpam_write_config_arg arg;
+ struct mpam_msc_ris *ris;
+ struct mpam_vmsc *vmsc;
+ struct mpam_msc *msc;
+ int idx;
+
+ lockdep_assert_cpus_held();
+
+ /* Don't pass in the current config! */
+ WARN_ON_ONCE(&comp->cfg[partid] == cfg);
+
+ if (!mpam_update_config(&comp->cfg[partid], cfg))
+ return 0;
+
+ arg.comp = comp;
+ arg.partid = partid;
+
+ idx = srcu_read_lock(&mpam_srcu);
+ list_for_each_entry_rcu(vmsc, &comp->vmsc, comp_list) {
+ msc = vmsc->msc;
+
+ list_for_each_entry_rcu(ris, &vmsc->ris, vmsc_list) {
+ arg.ris = ris;
+ mpam_touch_msc(msc, __write_config, &arg);
+ }
+ }
+ srcu_read_unlock(&mpam_srcu, idx);
+
+ return 0;
+}
+
/*
* MSC that are hidden under caches are not created as platform devices
* as there is no cache driver. Caches are also special-cased in
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index 1a24424b48df..029ec89f56f2 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -190,11 +190,7 @@ struct mpam_props {
u16 num_mbwu_mon;
};
-static inline bool mpam_has_feature(enum mpam_device_features feat,
- struct mpam_props *props)
-{
- return (1 << feat) & props->features;
-}
+#define mpam_has_feature(_feat, x) ((1 << (_feat)) & (x)->features)
static inline void mpam_set_feature(enum mpam_device_features feat,
struct mpam_props *props)
@@ -225,6 +221,17 @@ struct mpam_class {
struct mpam_garbage garbage;
};
+struct mpam_config {
+ /* Which configuration values are valid. 0 is used for reset */
+ mpam_features_t features;
+
+ u32 cpbm;
+ u32 mbw_pbm;
+ u16 mbw_max;
+
+ struct mpam_garbage garbage;
+};
+
struct mpam_component {
u32 comp_id;
@@ -233,6 +240,12 @@ struct mpam_component {
cpumask_t affinity;
+ /*
+ * Array of configuration values, indexed by partid.
+ * Read from cpuhp callbacks, hold the cpuhp lock when writing.
+ */
+ struct mpam_config *cfg;
+
/* member of mpam_class:components */
struct list_head class_list;
@@ -297,6 +310,9 @@ extern u8 mpam_pmg_max;
void mpam_enable(struct work_struct *work);
void mpam_disable(struct work_struct *work);
+int mpam_apply_config(struct mpam_component *comp, u16 partid,
+ struct mpam_config *cfg);
+
int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
cpumask_t *affinity);
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 28/36] arm_mpam: Probe and reset the rest of the features
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (26 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 29/36] arm_mpam: Add helpers to allocate monitors James Morse
` (8 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse, Rohit Mathew, Dave Martin
MPAM supports more features than are going to be exposed to resctrl.
For partids other than 0, the reset values of these controls aren't
known.
Discover the rest of the features so they can be reset to avoid any
side effects when resctrl is in use.
PARTID narrowing allows an MSC/RIS to implement less configuration
storage than the usable PARTID space. If this feature is found on a
class of device we are likely to use, reduce partid_max to make it
usable. This allows a PARTID to be mapped to itself.
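As a worked example of the narrowing above (illustrative numbers): if the
system-wide PARTID range is 0..511 but a cache MSC only implements 32
intPARTIDs, then after probe

  msc->partid_max = min(511, 31) = 31

so the usable range shrinks to 0..31 and each PARTID can be mapped to the
intPARTID of the same value, as mpam_reprogram_ris_partid() does below.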
CC: Rohit Mathew <Rohit.Mathew@arm.com>
CC: Zeng Heng <zengheng4@huawei.com>
CC: Dave Martin <Dave.Martin@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 175 ++++++++++++++++++++
drivers/platform/arm64/mpam/mpam_internal.h | 16 +-
2 files changed, 189 insertions(+), 2 deletions(-)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index f3ecfda265d2..2cf081e7c4b2 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -203,6 +203,15 @@ static void __mpam_part_sel(u8 ris_idx, u16 partid, struct mpam_msc *msc)
__mpam_part_sel_raw(partsel, msc);
}
+static void __mpam_intpart_sel(u8 ris_idx, u16 intpartid, struct mpam_msc *msc)
+{
+ u32 partsel = FIELD_PREP(MPAMCFG_PART_SEL_RIS, ris_idx) |
+ FIELD_PREP(MPAMCFG_PART_SEL_PARTID_SEL, intpartid) |
+ MPAMCFG_PART_SEL_INTERNAL;
+
+ __mpam_part_sel_raw(partsel, msc);
+}
+
int mpam_register_requestor(u16 partid_max, u8 pmg_max)
{
int err = 0;
@@ -733,10 +742,35 @@ static void mpam_ris_hw_probe(struct mpam_msc_ris *ris)
int err;
struct mpam_msc *msc = ris->vmsc->msc;
struct mpam_props *props = &ris->props;
+ struct mpam_class *class = ris->vmsc->comp->class;
lockdep_assert_held(&msc->probe_lock);
lockdep_assert_held(&msc->part_sel_lock);
+ /* Cache Capacity Partitioning */
+ if (FIELD_GET(MPAMF_IDR_HAS_CCAP_PART, ris->idr)) {
+ u32 ccap_features = mpam_read_partsel_reg(msc, CCAP_IDR);
+
+ props->cmax_wd = FIELD_GET(MPAMF_CCAP_IDR_CMAX_WD, ccap_features);
+ if (props->cmax_wd &&
+ FIELD_GET(MPAMF_CCAP_IDR_HAS_CMAX_SOFTLIM, ccap_features))
+ mpam_set_feature(mpam_feat_cmax_softlim, props);
+
+ if (props->cmax_wd &&
+ !FIELD_GET(MPAMF_CCAP_IDR_NO_CMAX, ccap_features))
+ mpam_set_feature(mpam_feat_cmax_cmax, props);
+
+ if (props->cmax_wd &&
+ FIELD_GET(MPAMF_CCAP_IDR_HAS_CMIN, ccap_features))
+ mpam_set_feature(mpam_feat_cmax_cmin, props);
+
+ props->cassoc_wd = FIELD_GET(MPAMF_CCAP_IDR_CASSOC_WD, ccap_features);
+
+ if (props->cassoc_wd &&
+ FIELD_GET(MPAMF_CCAP_IDR_HAS_CASSOC, ccap_features))
+ mpam_set_feature(mpam_feat_cmax_cassoc, props);
+ }
+
/* Cache Portion partitioning */
if (FIELD_GET(MPAMF_IDR_HAS_CPOR_PART, ris->idr)) {
u32 cpor_features = mpam_read_partsel_reg(msc, CPOR_IDR);
@@ -759,6 +793,31 @@ static void mpam_ris_hw_probe(struct mpam_msc_ris *ris)
props->bwa_wd = FIELD_GET(MPAMF_MBW_IDR_BWA_WD, mbw_features);
if (props->bwa_wd && FIELD_GET(MPAMF_MBW_IDR_HAS_MAX, mbw_features))
mpam_set_feature(mpam_feat_mbw_max, props);
+
+ if (props->bwa_wd && FIELD_GET(MPAMF_MBW_IDR_HAS_MIN, mbw_features))
+ mpam_set_feature(mpam_feat_mbw_min, props);
+
+ if (props->bwa_wd && FIELD_GET(MPAMF_MBW_IDR_HAS_PROP, mbw_features))
+ mpam_set_feature(mpam_feat_mbw_prop, props);
+ }
+
+ /* Priority partitioning */
+ if (FIELD_GET(MPAMF_IDR_HAS_PRI_PART, ris->idr)) {
+ u32 pri_features = mpam_read_partsel_reg(msc, PRI_IDR);
+
+ props->intpri_wd = FIELD_GET(MPAMF_PRI_IDR_INTPRI_WD, pri_features);
+ if (props->intpri_wd && FIELD_GET(MPAMF_PRI_IDR_HAS_INTPRI, pri_features)) {
+ mpam_set_feature(mpam_feat_intpri_part, props);
+ if (FIELD_GET(MPAMF_PRI_IDR_INTPRI_0_IS_LOW, pri_features))
+ mpam_set_feature(mpam_feat_intpri_part_0_low, props);
+ }
+
+ props->dspri_wd = FIELD_GET(MPAMF_PRI_IDR_DSPRI_WD, pri_features);
+ if (props->dspri_wd && FIELD_GET(MPAMF_PRI_IDR_HAS_DSPRI, pri_features)) {
+ mpam_set_feature(mpam_feat_dspri_part, props);
+ if (FIELD_GET(MPAMF_PRI_IDR_DSPRI_0_IS_LOW, pri_features))
+ mpam_set_feature(mpam_feat_dspri_part_0_low, props);
+ }
}
/* Performance Monitoring */
@@ -822,6 +881,21 @@ static void mpam_ris_hw_probe(struct mpam_msc_ris *ris)
*/
}
}
+
+ /*
+ * RIS with PARTID narrowing don't have enough storage for one
+ * configuration per PARTID. If these are in a class we could use,
+ * reduce the supported partid_max to match the number of intpartid.
+ * If the class is unknown, just ignore it.
+ */
+ if (FIELD_GET(MPAMF_IDR_HAS_PARTID_NRW, ris->idr) &&
+ class->type != MPAM_CLASS_UNKNOWN) {
+ u32 nrwidr = mpam_read_partsel_reg(msc, PARTID_NRW_IDR);
+ u16 partid_max = FIELD_GET(MPAMF_PARTID_NRW_IDR_INTPARTID_MAX, nrwidr);
+
+ mpam_set_feature(mpam_feat_partid_nrw, props);
+ msc->partid_max = min(msc->partid_max, partid_max);
+ }
}
static int mpam_msc_hw_probe(struct mpam_msc *msc)
@@ -917,13 +991,29 @@ static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd)
static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
struct mpam_config *cfg)
{
+ u32 pri_val = 0;
+ u16 cmax = MPAMCFG_CMAX_CMAX;
u16 bwa_fract = MPAMCFG_MBW_MAX_MAX;
struct mpam_msc *msc = ris->vmsc->msc;
struct mpam_props *rprops = &ris->props;
+ u16 dspri = GENMASK(rprops->dspri_wd, 0);
+ u16 intpri = GENMASK(rprops->intpri_wd, 0);
mutex_lock(&msc->part_sel_lock);
__mpam_part_sel(ris->ris_idx, partid, msc);
+ if (mpam_has_feature(mpam_feat_partid_nrw, rprops)) {
+ /* Update the intpartid mapping */
+ mpam_write_partsel_reg(msc, INTPARTID,
+ MPAMCFG_INTPARTID_INTERNAL | partid);
+
+ /*
+ * Then switch to the 'internal' partid to update the
+ * configuration.
+ */
+ __mpam_intpart_sel(ris->ris_idx, partid, msc);
+ }
+
if (mpam_has_feature(mpam_feat_cpor_part, rprops)) {
if (mpam_has_feature(mpam_feat_cpor_part, cfg))
mpam_write_partsel_reg(msc, CPBM, cfg->cpbm);
@@ -952,6 +1042,29 @@ static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
if (mpam_has_feature(mpam_feat_mbw_prop, rprops))
mpam_write_partsel_reg(msc, MBW_PROP, bwa_fract);
+
+ if (mpam_has_feature(mpam_feat_cmax_cmax, rprops))
+ mpam_write_partsel_reg(msc, CMAX, cmax);
+
+ if (mpam_has_feature(mpam_feat_cmax_cmin, rprops))
+ mpam_write_partsel_reg(msc, CMIN, 0);
+
+ if (mpam_has_feature(mpam_feat_intpri_part, rprops) ||
+ mpam_has_feature(mpam_feat_dspri_part, rprops)) {
+ /* aces high? */
+ if (!mpam_has_feature(mpam_feat_intpri_part_0_low, rprops))
+ intpri = 0;
+ if (!mpam_has_feature(mpam_feat_dspri_part_0_low, rprops))
+ dspri = 0;
+
+ if (mpam_has_feature(mpam_feat_intpri_part, rprops))
+ pri_val |= FIELD_PREP(MPAMCFG_PRI_INTPRI, intpri);
+ if (mpam_has_feature(mpam_feat_dspri_part, rprops))
+ pri_val |= FIELD_PREP(MPAMCFG_PRI_DSPRI, dspri);
+
+ mpam_write_partsel_reg(msc, PRI, pri_val);
+ }
+
mutex_unlock(&msc->part_sel_lock);
}
@@ -1513,6 +1626,16 @@ static bool mpam_has_bwa_wd_feature(struct mpam_props *props)
return false;
}
+/* Any of these features mean the CMAX_WD field is valid. */
+static bool mpam_has_cmax_wd_feature(struct mpam_props *props)
+{
+ if (mpam_has_feature(mpam_feat_cmax_cmax, props))
+ return true;
+ if (mpam_has_feature(mpam_feat_cmax_cmin, props))
+ return true;
+ return false;
+}
+
#define MISMATCHED_HELPER(parent, child, helper, field, alias) \
helper(parent) && \
((helper(child) && (parent)->field != (child)->field) || \
@@ -1567,6 +1690,23 @@ static void __props_mismatch(struct mpam_props *parent,
parent->bwa_wd = min(parent->bwa_wd, child->bwa_wd);
}
+ if (alias && !mpam_has_cmax_wd_feature(parent) && mpam_has_cmax_wd_feature(child)) {
+ parent->cmax_wd = child->cmax_wd;
+ } else if (MISMATCHED_HELPER(parent, child, mpam_has_cmax_wd_feature,
+ cmax_wd, alias)) {
+ pr_debug("%s took the min cmax_wd\n", __func__);
+ parent->cmax_wd = min(parent->cmax_wd, child->cmax_wd);
+ }
+
+ if (CAN_MERGE_FEAT(parent, child, mpam_feat_cmax_cassoc, alias)) {
+ parent->cassoc_wd = child->cassoc_wd;
+ } else if (MISMATCHED_FEAT(parent, child, mpam_feat_cmax_cassoc,
+ cassoc_wd, alias)) {
+ pr_debug("%s cleared cassoc_wd\n", __func__);
+ mpam_clear_feature(mpam_feat_cmax_cassoc, &parent->features);
+ parent->cassoc_wd = 0;
+ }
+
/* For num properties, take the minimum */
if (CAN_MERGE_FEAT(parent, child, mpam_feat_msmon_csu, alias)) {
parent->num_csu_mon = child->num_csu_mon;
@@ -1584,6 +1724,41 @@ static void __props_mismatch(struct mpam_props *parent,
parent->num_mbwu_mon = min(parent->num_mbwu_mon, child->num_mbwu_mon);
}
+ if (CAN_MERGE_FEAT(parent, child, mpam_feat_intpri_part, alias)) {
+ parent->intpri_wd = child->intpri_wd;
+ } else if (MISMATCHED_FEAT(parent, child, mpam_feat_intpri_part,
+ intpri_wd, alias)) {
+ pr_debug("%s took the min intpri_wd\n", __func__);
+ parent->intpri_wd = min(parent->intpri_wd, child->intpri_wd);
+ }
+
+ if (CAN_MERGE_FEAT(parent, child, mpam_feat_dspri_part, alias)) {
+ parent->dspri_wd = child->dspri_wd;
+ } else if (MISMATCHED_FEAT(parent, child, mpam_feat_dspri_part,
+ dspri_wd, alias)) {
+ pr_debug("%s took the min dspri_wd\n", __func__);
+ parent->dspri_wd = min(parent->dspri_wd, child->dspri_wd);
+ }
+
+ /* TODO: alias support for these two */
+ /* {int,ds}pri may not have differing 0-low behaviour */
+ if (mpam_has_feature(mpam_feat_intpri_part, parent) &&
+ (!mpam_has_feature(mpam_feat_intpri_part, child) ||
+ mpam_has_feature(mpam_feat_intpri_part_0_low, parent) !=
+ mpam_has_feature(mpam_feat_intpri_part_0_low, child))) {
+ pr_debug("%s cleared intpri_part\n", __func__);
+ mpam_clear_feature(mpam_feat_intpri_part, &parent->features);
+ mpam_clear_feature(mpam_feat_intpri_part_0_low, &parent->features);
+ }
+ if (mpam_has_feature(mpam_feat_dspri_part, parent) &&
+ (!mpam_has_feature(mpam_feat_dspri_part, child) ||
+ mpam_has_feature(mpam_feat_dspri_part_0_low, parent) !=
+ mpam_has_feature(mpam_feat_dspri_part_0_low, child))) {
+ pr_debug("%s cleared dspri_part\n", __func__);
+ mpam_clear_feature(mpam_feat_dspri_part, &parent->features);
+ mpam_clear_feature(mpam_feat_dspri_part_0_low, &parent->features);
+ }
+
if (alias) {
/* Merge features for aliased resources */
parent->features |= child->features;
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index 029ec89f56f2..1586d6bd12f0 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -157,16 +157,23 @@ static inline void mpam_mon_sel_lock_held(struct mpam_msc *msc)
* When we compact the supported features, we don't care what they are.
* Storing them as a bitmap makes life easy.
*/
-typedef u16 mpam_features_t;
+typedef u32 mpam_features_t;
/* Bits for mpam_features_t */
enum mpam_device_features {
- mpam_feat_ccap_part = 0,
+ mpam_feat_cmax_softlim,
+ mpam_feat_cmax_cmax,
+ mpam_feat_cmax_cmin,
+ mpam_feat_cmax_cassoc,
mpam_feat_cpor_part,
mpam_feat_mbw_part,
mpam_feat_mbw_min,
mpam_feat_mbw_max,
mpam_feat_mbw_prop,
+ mpam_feat_intpri_part,
+ mpam_feat_intpri_part_0_low,
+ mpam_feat_dspri_part,
+ mpam_feat_dspri_part_0_low,
mpam_feat_msmon,
mpam_feat_msmon_csu,
mpam_feat_msmon_csu_capture,
@@ -176,6 +183,7 @@ enum mpam_device_features {
mpam_feat_msmon_mbwu_rwbw,
mpam_feat_msmon_mbwu_hw_nrdy,
mpam_feat_msmon_capt,
+ mpam_feat_partid_nrw,
MPAM_FEATURE_LAST,
};
#define MPAM_ALL_FEATURES ((1 << MPAM_FEATURE_LAST) - 1)
@@ -186,6 +194,10 @@ struct mpam_props {
u16 cpbm_wd;
u16 mbw_pbm_bits;
u16 bwa_wd;
+ u16 cmax_wd;
+ u16 cassoc_wd;
+ u16 intpri_wd;
+ u16 dspri_wd;
u16 num_csu_mon;
u16 num_mbwu_mon;
};
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 29/36] arm_mpam: Add helpers to allocate monitors
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (27 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 28/36] arm_mpam: Probe and reset the rest of the features James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 30/36] arm_mpam: Add mpam_msmon_read() to read monitor value James Morse
` (7 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
MPAM's MSC support a number of monitors, each of which supports
bandwidth counters, or cache-storage-utilisation counters. To use
a counter, a monitor needs to be configured. Add helpers to allocate
and free CSU or MBWU monitors.
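Expected usage is roughly (a sketch, error handling condensed):

  int mon = mpam_alloc_csu_mon(class);

  if (mon < 0)
          return mon;             /* -EOPNOTSUPP if CSU monitors aren't supported */

  /* ... configure and read the monitor (later patches) ... */

  mpam_free_csu_mon(class, mon);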
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 2 ++
drivers/platform/arm64/mpam/mpam_internal.h | 35 +++++++++++++++++++++
2 files changed, 37 insertions(+)
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 2cf081e7c4b2..b11503d8ef1b 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -338,6 +338,8 @@ mpam_class_alloc(u8 level_idx, enum mpam_class_types type, gfp_t gfp)
class->level = level_idx;
class->type = type;
INIT_LIST_HEAD_RCU(&class->classes_list);
+ ida_init(&class->ida_csu_mon);
+ ida_init(&class->ida_mbwu_mon);
list_add_rcu(&class->classes_list, &mpam_classes);
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index 1586d6bd12f0..aca91f7dfbf6 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -230,6 +230,9 @@ struct mpam_class {
/* member of mpam_classes */
struct list_head classes_list;
+ struct ida ida_csu_mon;
+ struct ida ida_mbwu_mon;
+
struct mpam_garbage garbage;
};
@@ -305,6 +308,38 @@ struct mpam_msc_ris {
struct mpam_garbage garbage;
};
+static inline int mpam_alloc_csu_mon(struct mpam_class *class)
+{
+ struct mpam_props *cprops = &class->props;
+
+ if (!mpam_has_feature(mpam_feat_msmon_csu, cprops))
+ return -EOPNOTSUPP;
+
+ return ida_alloc_range(&class->ida_csu_mon, 0, cprops->num_csu_mon - 1,
+ GFP_KERNEL);
+}
+
+static inline void mpam_free_csu_mon(struct mpam_class *class, int csu_mon)
+{
+ ida_free(&class->ida_csu_mon, csu_mon);
+}
+
+static inline int mpam_alloc_mbwu_mon(struct mpam_class *class)
+{
+ struct mpam_props *cprops = &class->props;
+
+ if (!mpam_has_feature(mpam_feat_msmon_mbwu, cprops))
+ return -EOPNOTSUPP;
+
+ return ida_alloc_range(&class->ida_mbwu_mon, 0,
+ cprops->num_mbwu_mon - 1, GFP_KERNEL);
+}
+
+static inline void mpam_free_mbwu_mon(struct mpam_class *class, int mbwu_mon)
+{
+ ida_free(&class->ida_mbwu_mon, mbwu_mon);
+}
+
/* List of all classes - protected by srcu*/
extern struct srcu_struct mpam_srcu;
extern struct list_head mpam_classes;
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 30/36] arm_mpam: Add mpam_msmon_read() to read monitor value
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (28 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 29/36] arm_mpam: Add helpers to allocate monitors James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-28 13:02 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 31/36] arm_mpam: Track bandwidth counter state for overflow and power management James Morse
` (6 subsequent siblings)
36 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
Reading a monitor involves configuring what you want to monitor, and
then reading the value. Components made up of multiple MSCs may need values
from each MSC. MSCs may take time to configure, returning 'not ready'.
The maximum 'not ready' time should have been provided by firmware.
Add mpam_msmon_read() to hide all this. If (one of) the MSC returns
not ready, then wait the full timeout value before trying again.
CC: Shanker Donthineni <sdonthineni@nvidia.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 222 ++++++++++++++++++++
drivers/platform/arm64/mpam/mpam_internal.h | 18 ++
2 files changed, 240 insertions(+)
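For context, a caller is expected to look something like the sketch below (hypothetical, not part of this series); mpam_msmon_read() hides the per-MSC iteration and the not-ready retry described above.

/* Hypothetical caller, e.g. a future resctrl 'llc_occupancy' read. */
static int example_read_csu(struct mpam_component *comp, u32 partid, u8 pmg,
			    u16 mon, u64 *val)
{
	struct mon_cfg cfg = {
		.mon = mon,		/* from mpam_alloc_csu_mon() */
		.partid = partid,
		.pmg = pmg,
		.match_pmg = true,
	};

	/* May sleep for up to class->nrdy_usec if an MSC reports not-ready. */
	return mpam_msmon_read(comp, &cfg, mpam_feat_msmon_csu, val);
}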
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index b11503d8ef1b..7d2d2929b292 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -960,6 +960,228 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
return 0;
}
+struct mon_read {
+ struct mpam_msc_ris *ris;
+ struct mon_cfg *ctx;
+ enum mpam_device_features type;
+ u64 *val;
+ int err;
+};
+
+static void gen_msmon_ctl_flt_vals(struct mon_read *m, u32 *ctl_val,
+ u32 *flt_val)
+{
+ struct mon_cfg *ctx = m->ctx;
+
+ switch (m->type) {
+ case mpam_feat_msmon_csu:
+ *ctl_val = MSMON_CFG_MBWU_CTL_TYPE_CSU;
+ break;
+ case mpam_feat_msmon_mbwu:
+ *ctl_val = MSMON_CFG_MBWU_CTL_TYPE_MBWU;
+ break;
+ default:
+ return;
+ }
+
+ /*
+ * For CSU counters it's implementation-defined what happens when not
+ * filtering by partid.
+ */
+ *ctl_val |= MSMON_CFG_x_CTL_MATCH_PARTID;
+
+ *flt_val = FIELD_PREP(MSMON_CFG_MBWU_FLT_PARTID, ctx->partid);
+ if (m->ctx->match_pmg) {
+ *ctl_val |= MSMON_CFG_x_CTL_MATCH_PMG;
+ *flt_val |= FIELD_PREP(MSMON_CFG_MBWU_FLT_PMG, ctx->pmg);
+ }
+
+ if (mpam_has_feature(mpam_feat_msmon_mbwu_rwbw, &m->ris->props))
+ *flt_val |= FIELD_PREP(MSMON_CFG_MBWU_FLT_RWBW, ctx->opts);
+}
+
+static void read_msmon_ctl_flt_vals(struct mon_read *m, u32 *ctl_val,
+ u32 *flt_val)
+{
+ struct mpam_msc *msc = m->ris->vmsc->msc;
+
+ switch (m->type) {
+ case mpam_feat_msmon_csu:
+ *ctl_val = mpam_read_monsel_reg(msc, CFG_CSU_CTL);
+ *flt_val = mpam_read_monsel_reg(msc, CFG_CSU_FLT);
+ break;
+ case mpam_feat_msmon_mbwu:
+ *ctl_val = mpam_read_monsel_reg(msc, CFG_MBWU_CTL);
+ *flt_val = mpam_read_monsel_reg(msc, CFG_MBWU_FLT);
+ break;
+ default:
+ return;
+ }
+}
+
+/* Remove values set by the hardware to prevent apparent mismatches. */
+static void clean_msmon_ctl_val(u32 *cur_ctl)
+{
+ *cur_ctl &= ~MSMON_CFG_x_CTL_OFLOW_STATUS;
+}
+
+static void write_msmon_ctl_flt_vals(struct mon_read *m, u32 ctl_val,
+ u32 flt_val)
+{
+ struct mpam_msc *msc = m->ris->vmsc->msc;
+
+ /*
+ * Write the ctl_val with the enable bit cleared, reset the counter,
+ * then enable counter.
+ */
+ switch (m->type) {
+ case mpam_feat_msmon_csu:
+ mpam_write_monsel_reg(msc, CFG_CSU_FLT, flt_val);
+ mpam_write_monsel_reg(msc, CFG_CSU_CTL, ctl_val);
+ mpam_write_monsel_reg(msc, CSU, 0);
+ mpam_write_monsel_reg(msc, CFG_CSU_CTL, ctl_val | MSMON_CFG_x_CTL_EN);
+ break;
+ case mpam_feat_msmon_mbwu:
+ mpam_write_monsel_reg(msc, CFG_MBWU_FLT, flt_val);
+ mpam_write_monsel_reg(msc, CFG_MBWU_CTL, ctl_val);
+ mpam_write_monsel_reg(msc, MBWU, 0);
+ mpam_write_monsel_reg(msc, CFG_MBWU_CTL, ctl_val | MSMON_CFG_x_CTL_EN);
+ break;
+ default:
+ return;
+ }
+}
+
+/* Call with MSC lock held */
+static void __ris_msmon_read(void *arg)
+{
+ u64 now;
+ bool nrdy = false;
+ struct mon_read *m = arg;
+ struct mon_cfg *ctx = m->ctx;
+ struct mpam_msc_ris *ris = m->ris;
+ struct mpam_props *rprops = &ris->props;
+ struct mpam_msc *msc = m->ris->vmsc->msc;
+ u32 mon_sel, ctl_val, flt_val, cur_ctl, cur_flt;
+
+ if (!mpam_mon_sel_inner_lock(msc)) {
+ m->err = -EIO;
+ return;
+ }
+ mon_sel = FIELD_PREP(MSMON_CFG_MON_SEL_MON_SEL, ctx->mon) |
+ FIELD_PREP(MSMON_CFG_MON_SEL_RIS, ris->ris_idx);
+ mpam_write_monsel_reg(msc, CFG_MON_SEL, mon_sel);
+
+ /*
+ * Read the existing configuration to avoid re-writing the same values.
+ * This saves waiting for 'nrdy' on subsequent reads.
+ */
+ read_msmon_ctl_flt_vals(m, &cur_ctl, &cur_flt);
+ clean_msmon_ctl_val(&cur_ctl);
+ gen_msmon_ctl_flt_vals(m, &ctl_val, &flt_val);
+ if (cur_flt != flt_val || cur_ctl != (ctl_val | MSMON_CFG_x_CTL_EN))
+ write_msmon_ctl_flt_vals(m, ctl_val, flt_val);
+
+ switch (m->type) {
+ case mpam_feat_msmon_csu:
+ now = mpam_read_monsel_reg(msc, CSU);
+ if (mpam_has_feature(mpam_feat_msmon_csu_hw_nrdy, rprops))
+ nrdy = now & MSMON___NRDY;
+ break;
+ case mpam_feat_msmon_mbwu:
+ now = mpam_read_monsel_reg(msc, MBWU);
+ if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops))
+ nrdy = now & MSMON___NRDY;
+ break;
+ default:
+ m->err = -EINVAL;
+ break;
+ }
+ mpam_mon_sel_inner_unlock(msc);
+
+ if (nrdy) {
+ m->err = -EBUSY;
+ return;
+ }
+
+ now = FIELD_GET(MSMON___VALUE, now);
+ *m->val += now;
+}
+
+static int _msmon_read(struct mpam_component *comp, struct mon_read *arg)
+{
+ int err = 0, idx;
+ struct mpam_msc *msc;
+ struct mpam_vmsc *vmsc;
+ struct mpam_msc_ris *ris;
+
+ idx = srcu_read_lock(&mpam_srcu);
+ list_for_each_entry_rcu(vmsc, &comp->vmsc, comp_list) {
+ msc = vmsc->msc;
+
+ mpam_mon_sel_outer_lock(msc);
+ list_for_each_entry_rcu(ris, &vmsc->ris, vmsc_list) {
+ arg->ris = ris;
+
+ err = smp_call_function_any(&msc->accessibility,
+ __ris_msmon_read, arg,
+ true);
+ if (!err && arg->err)
+ err = arg->err;
+ if (err)
+ break;
+ }
+ mpam_mon_sel_outer_unlock(msc);
+ if (err)
+ break;
+ }
+ srcu_read_unlock(&mpam_srcu, idx);
+
+ return err;
+}
+
+int mpam_msmon_read(struct mpam_component *comp, struct mon_cfg *ctx,
+ enum mpam_device_features type, u64 *val)
+{
+ int err;
+ struct mon_read arg;
+ u64 wait_jiffies = 0;
+ struct mpam_props *cprops = &comp->class->props;
+
+ might_sleep();
+
+ if (!mpam_is_enabled())
+ return -EIO;
+
+ if (!mpam_has_feature(type, cprops))
+ return -EOPNOTSUPP;
+
+ memset(&arg, 0, sizeof(arg));
+ arg.ctx = ctx;
+ arg.type = type;
+ arg.val = val;
+ *val = 0;
+
+ err = _msmon_read(comp, &arg);
+ if (err == -EBUSY && comp->class->nrdy_usec)
+ wait_jiffies = usecs_to_jiffies(comp->class->nrdy_usec);
+
+ while (wait_jiffies)
+ wait_jiffies = schedule_timeout_uninterruptible(wait_jiffies);
+
+ if (err == -EBUSY) {
+ memset(&arg, 0, sizeof(arg));
+ arg.ctx = ctx;
+ arg.type = type;
+ arg.val = val;
+ *val = 0;
+
+ err = _msmon_read(comp, &arg);
+ }
+
+ return err;
+}
+
static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd)
{
u32 num_words, msb;
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index aca91f7dfbf6..4aabef96fb7a 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -308,6 +308,21 @@ struct mpam_msc_ris {
struct mpam_garbage garbage;
};
+/* The values for MSMON_CFG_MBWU_FLT.RWBW */
+enum mon_filter_options {
+ COUNT_BOTH = 0,
+ COUNT_WRITE = 1,
+ COUNT_READ = 2,
+};
+
+struct mon_cfg {
+ u16 mon;
+ u8 pmg;
+ bool match_pmg;
+ u32 partid;
+ enum mon_filter_options opts;
+};
+
static inline int mpam_alloc_csu_mon(struct mpam_class *class)
{
struct mpam_props *cprops = &class->props;
@@ -360,6 +375,9 @@ void mpam_disable(struct work_struct *work);
int mpam_apply_config(struct mpam_component *comp, u16 partid,
struct mpam_config *cfg);
+int mpam_msmon_read(struct mpam_component *comp, struct mon_cfg *ctx,
+ enum mpam_device_features, u64 *val);
+
int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
cpumask_t *affinity);
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 31/36] arm_mpam: Track bandwidth counter state for overflow and power management
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (29 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 30/36] arm_mpam: Add mpam_msmon_read() to read monitor value James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 32/36] arm_mpam: Probe for long/lwd mbwu counters James Morse
` (5 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
Bandwidth counters need to run continuously to correctly reflect the
bandwidth.
The value read may be lower than the previous value read in the case
of overflow and when the hardware is reset due to CPU hotplug.
Add struct mbwu_state to track the bandwidth counter to allow overflow
and power management to be handled.
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 164 +++++++++++++++++++-
drivers/platform/arm64/mpam/mpam_internal.h | 54 +++++--
2 files changed, 201 insertions(+), 17 deletions(-)
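As a worked example of the 'correction' value (numbers made up), the driver's reported value stays monotonic across an MSC reset because the raw counter is folded into the correction before the hardware is zeroed:

	/*
	 * read 1:  raw = 1000, correction = 0     -> reported 1000
	 * [CPUs go offline: mpam_save_mbwu_state() adds the raw value
	 *  (1000) to 'correction' and the hardware counter is zeroed]
	 * read 2:  raw = 200,  correction = 1000  -> reported 1200
	 */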
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 7d2d2929b292..5da2666e9ee1 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -991,6 +991,7 @@ static void gen_msmon_ctl_flt_vals(struct mon_read *m, u32 *ctl_val,
*ctl_val |= MSMON_CFG_x_CTL_MATCH_PARTID;
*flt_val = FIELD_PREP(MSMON_CFG_MBWU_FLT_PARTID, ctx->partid);
+ *flt_val |= FIELD_PREP(MSMON_CFG_MBWU_FLT_RWBW, ctx->opts);
if (m->ctx->match_pmg) {
*ctl_val |= MSMON_CFG_x_CTL_MATCH_PMG;
*flt_val |= FIELD_PREP(MSMON_CFG_MBWU_FLT_PMG, ctx->pmg);
@@ -1028,6 +1029,7 @@ static void clean_msmon_ctl_val(u32 *cur_ctl)
static void write_msmon_ctl_flt_vals(struct mon_read *m, u32 ctl_val,
u32 flt_val)
{
+ struct msmon_mbwu_state *mbwu_state;
struct mpam_msc *msc = m->ris->vmsc->msc;
/*
@@ -1046,20 +1048,32 @@ static void write_msmon_ctl_flt_vals(struct mon_read *m, u32 ctl_val,
mpam_write_monsel_reg(msc, CFG_MBWU_CTL, ctl_val);
mpam_write_monsel_reg(msc, MBWU, 0);
mpam_write_monsel_reg(msc, CFG_MBWU_CTL, ctl_val | MSMON_CFG_x_CTL_EN);
+
+ mbwu_state = &m->ris->mbwu_state[m->ctx->mon];
+ if (mbwu_state)
+ mbwu_state->prev_val = 0;
+
break;
default:
return;
}
}
+static u64 mpam_msmon_overflow_val(struct mpam_msc_ris *ris)
+{
+ /* TODO: scaling, and long counters */
+ return GENMASK_ULL(30, 0);
+}
+
/* Call with MSC lock held */
static void __ris_msmon_read(void *arg)
{
- u64 now;
bool nrdy = false;
struct mon_read *m = arg;
+ u64 now, overflow_val = 0;
struct mon_cfg *ctx = m->ctx;
struct mpam_msc_ris *ris = m->ris;
+ struct msmon_mbwu_state *mbwu_state;
struct mpam_props *rprops = &ris->props;
struct mpam_msc *msc = m->ris->vmsc->msc;
u32 mon_sel, ctl_val, flt_val, cur_ctl, cur_flt;
@@ -1087,11 +1101,30 @@ static void __ris_msmon_read(void *arg)
now = mpam_read_monsel_reg(msc, CSU);
if (mpam_has_feature(mpam_feat_msmon_csu_hw_nrdy, rprops))
nrdy = now & MSMON___NRDY;
+ now = FIELD_GET(MSMON___VALUE, now);
break;
case mpam_feat_msmon_mbwu:
now = mpam_read_monsel_reg(msc, MBWU);
if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops))
nrdy = now & MSMON___NRDY;
+ now = FIELD_GET(MSMON___VALUE, now);
+
+ if (nrdy)
+ break;
+
+ mbwu_state = &ris->mbwu_state[ctx->mon];
+ if (!mbwu_state)
+ break;
+
+ /* Add any pre-overflow value to mbwu_state->correction */
+ if (mbwu_state->prev_val > now)
+ overflow_val = mpam_msmon_overflow_val(ris) - mbwu_state->prev_val;
+
+ mbwu_state->prev_val = now;
+ mbwu_state->correction += overflow_val;
+
+ /* Include bandwidth consumed before the last hardware reset */
+ now += mbwu_state->correction;
break;
default:
m->err = -EINVAL;
@@ -1104,7 +1137,6 @@ static void __ris_msmon_read(void *arg)
return;
}
- now = FIELD_GET(MSMON___VALUE, now);
*m->val += now;
}
@@ -1317,6 +1349,72 @@ static int mpam_reprogram_ris(void *_arg)
return 0;
}
+/* Call with MSC lock and outer mon_sel lock held */
+static int mpam_restore_mbwu_state(void *_ris)
+{
+ int i;
+ struct mon_read mwbu_arg;
+ struct mpam_msc_ris *ris = _ris;
+ struct mpam_msc *msc = ris->vmsc->msc;
+
+ mpam_mon_sel_outer_lock(msc);
+
+ for (i = 0; i < ris->props.num_mbwu_mon; i++) {
+ if (ris->mbwu_state[i].enabled) {
+ mwbu_arg.ris = ris;
+ mwbu_arg.ctx = &ris->mbwu_state[i].cfg;
+ mwbu_arg.type = mpam_feat_msmon_mbwu;
+
+ __ris_msmon_read(&mwbu_arg);
+ }
+ }
+
+ mpam_mon_sel_outer_unlock(msc);
+
+ return 0;
+}
+
+/* Call with MSC lock and outer mon_sel lock held */
+static int mpam_save_mbwu_state(void *arg)
+{
+ int i;
+ u64 val;
+ struct mon_cfg *cfg;
+ u32 cur_flt, cur_ctl, mon_sel;
+ struct mpam_msc_ris *ris = arg;
+ struct msmon_mbwu_state *mbwu_state;
+ struct mpam_msc *msc = ris->vmsc->msc;
+
+ for (i = 0; i < ris->props.num_mbwu_mon; i++) {
+ mbwu_state = &ris->mbwu_state[i];
+ cfg = &mbwu_state->cfg;
+
+ if (WARN_ON_ONCE(!mpam_mon_sel_inner_lock(msc)))
+ return -EIO;
+
+ mon_sel = FIELD_PREP(MSMON_CFG_MON_SEL_MON_SEL, i) |
+ FIELD_PREP(MSMON_CFG_MON_SEL_RIS, ris->ris_idx);
+ mpam_write_monsel_reg(msc, CFG_MON_SEL, mon_sel);
+
+ cur_flt = mpam_read_monsel_reg(msc, CFG_MBWU_FLT);
+ cur_ctl = mpam_read_monsel_reg(msc, CFG_MBWU_CTL);
+ mpam_write_monsel_reg(msc, CFG_MBWU_CTL, 0);
+
+ val = mpam_read_monsel_reg(msc, MBWU);
+ mpam_write_monsel_reg(msc, MBWU, 0);
+
+ cfg->mon = i;
+ cfg->pmg = FIELD_GET(MSMON_CFG_MBWU_FLT_PMG, cur_flt);
+ cfg->match_pmg = FIELD_GET(MSMON_CFG_x_CTL_MATCH_PMG, cur_ctl);
+ cfg->partid = FIELD_GET(MSMON_CFG_MBWU_FLT_PARTID, cur_flt);
+ mbwu_state->correction += val;
+ mbwu_state->enabled = FIELD_GET(MSMON_CFG_x_CTL_EN, cur_ctl);
+ mpam_mon_sel_inner_unlock(msc);
+ }
+
+ return 0;
+}
+
/*
* Called via smp_call_on_cpu() to prevent migration, while still being
* pre-emptible.
@@ -1377,6 +1475,9 @@ static void mpam_reset_msc(struct mpam_msc *msc, bool online)
* for non-zero partid may be lost while the CPUs are offline.
*/
ris->in_reset_state = online;
+
+ if (mpam_is_enabled() && !online)
+ mpam_touch_msc(msc, &mpam_save_mbwu_state, ris);
}
mpam_mon_sel_outer_unlock(msc);
}
@@ -1406,6 +1507,9 @@ static void mpam_reprogram_msc(struct mpam_msc *msc)
mpam_reprogram_ris_partid(ris, partid, cfg);
}
ris->in_reset_state = reset;
+
+ if (mpam_has_feature(mpam_feat_msmon_mbwu, &ris->props))
+ mpam_touch_msc(msc, &mpam_restore_mbwu_state, ris);
}
srcu_read_unlock(&mpam_srcu, idx);
}
@@ -2276,11 +2380,36 @@ static void mpam_unregister_irqs(void)
static void __destroy_component_cfg(struct mpam_component *comp)
{
+ struct mpam_msc *msc;
+ struct mpam_vmsc *vmsc;
+ struct mpam_msc_ris *ris;
+
+ lockdep_assert_held(&mpam_list_lock);
+
add_to_garbage(comp->cfg);
+ list_for_each_entry(vmsc, &comp->vmsc, comp_list) {
+ msc = vmsc->msc;
+
+ mpam_mon_sel_outer_lock(msc);
+ if (mpam_mon_sel_inner_lock(msc)) {
+ list_for_each_entry(ris, &vmsc->ris, vmsc_list)
+ add_to_garbage(ris->mbwu_state);
+ mpam_mon_sel_inner_unlock(msc);
+ }
+ mpam_mon_sel_outer_unlock(msc);
+ }
}
static int __allocate_component_cfg(struct mpam_component *comp)
{
+ int err = 0;
+ struct mpam_msc *msc;
+ struct mpam_vmsc *vmsc;
+ struct mpam_msc_ris *ris;
+ struct msmon_mbwu_state *mbwu_state;
+
+ lockdep_assert_held(&mpam_list_lock);
+
if (comp->cfg)
return 0;
@@ -2289,6 +2418,37 @@ static int __allocate_component_cfg(struct mpam_component *comp)
return -ENOMEM;
init_garbage(comp->cfg);
+ list_for_each_entry(vmsc, &comp->vmsc, comp_list) {
+ if (!vmsc->props.num_mbwu_mon)
+ continue;
+
+ msc = vmsc->msc;
+ mpam_mon_sel_outer_lock(msc);
+ list_for_each_entry(ris, &vmsc->ris, vmsc_list) {
+ if (!ris->props.num_mbwu_mon)
+ continue;
+
+ mbwu_state = kcalloc(ris->props.num_mbwu_mon,
+ sizeof(*ris->mbwu_state),
+ GFP_KERNEL);
+ if (!mbwu_state) {
+ __destroy_component_cfg(comp);
+ err = -ENOMEM;
+ break;
+ }
+
+ if (mpam_mon_sel_inner_lock(msc)) {
+ init_garbage(mbwu_state);
+ ris->mbwu_state = mbwu_state;
+ mpam_mon_sel_inner_unlock(msc);
+ }
+ }
+ mpam_mon_sel_outer_unlock(msc);
+
+ if (err)
+ break;
+ }
+
return 0;
}
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index 4aabef96fb7a..fc71afce3180 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -270,6 +270,42 @@ struct mpam_component {
struct mpam_garbage garbage;
};
+/* The values for MSMON_CFG_MBWU_FLT.RWBW */
+enum mon_filter_options {
+ COUNT_BOTH = 0,
+ COUNT_WRITE = 1,
+ COUNT_READ = 2,
+};
+
+struct mon_cfg {
+ /* mon is wider than u16 to hold an out of range 'USE_RMID_IDX' */
+ u32 mon;
+ u8 pmg;
+ bool match_pmg;
+ u32 partid;
+ enum mon_filter_options opts;
+};
+
+/*
+ * Changes to enabled and cfg are protected by the msc->lock.
+ * Changes to prev_val and correction are protected by the msc's mon_sel_lock.
+ */
+struct msmon_mbwu_state {
+ bool enabled;
+ struct mon_cfg cfg;
+
+ /* The value last read from the hardware. Used to detect overflow. */
+ u64 prev_val;
+
+ /*
+ * The value to add to the new reading to account for power management,
+ * and shifts to trigger the overflow interrupt.
+ */
+ u64 correction;
+
+ struct mpam_garbage garbage;
+};
+
struct mpam_vmsc {
/* member of mpam_component:vmsc_list */
struct list_head comp_list;
@@ -305,24 +341,12 @@ struct mpam_msc_ris {
/* parent: */
struct mpam_vmsc *vmsc;
+ /* msmon mbwu configuration is preserved over reset */
+ struct msmon_mbwu_state *mbwu_state;
+
struct mpam_garbage garbage;
};
-/* The values for MSMON_CFG_MBWU_FLT.RWBW */
-enum mon_filter_options {
- COUNT_BOTH = 0,
- COUNT_WRITE = 1,
- COUNT_READ = 2,
-};
-
-struct mon_cfg {
- u16 mon;
- u8 pmg;
- bool match_pmg;
- u32 partid;
- enum mon_filter_options opts;
-};
-
static inline int mpam_alloc_csu_mon(struct mpam_class *class)
{
struct mpam_props *cprops = &class->props;
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 32/36] arm_mpam: Probe for long/lwd mbwu counters
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (30 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 31/36] arm_mpam: Track bandwidth counter state for overflow and power management James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 33/36] arm_mpam: Use long MBWU counters if supported James Morse
` (4 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
From: Rohit Mathew <rohit.mathew@arm.com>
MPAM v0.1 and versions above v1.0 support an optional long counter for
memory bandwidth monitoring. The MPAMF_MBWUMON_IDR register has fields
indicating support for long counters. As of now, a 44 bit counter
represented by HAS_LONG field (bit 30) and a 63 bit counter represented
by LWD (bit 29) can be optionally integrated. Probe for these counters
and set corresponding feature bits if any of these counters are present.
Signed-off-by: Rohit Mathew <rohit.mathew@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 23 ++++++++++++++++++++-
drivers/platform/arm64/mpam/mpam_internal.h | 8 +++++++
2 files changed, 30 insertions(+), 1 deletion(-)
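To summarise the decode implemented below (field names from MPAMF_MBWUMON_IDR):

  HAS_LONG (bit 30) clear           : only the 31 bit MSMON_MBWU counter
  HAS_LONG set, LWD (bit 29) clear  : 44 bit counter, mpam_feat_msmon_mbwu_44counter
  HAS_LONG set, LWD set             : 63 bit counter, mpam_feat_msmon_mbwu_63counter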
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 5da2666e9ee1..774137a124f8 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -860,7 +860,7 @@ static void mpam_ris_hw_probe(struct mpam_msc_ris *ris)
pr_err_once("Counters are not usable because not-ready timeout was not provided by firmware.");
}
if (FIELD_GET(MPAMF_MSMON_IDR_MSMON_MBWU, msmon_features)) {
- bool hw_managed;
+ bool has_long, hw_managed;
u32 mbwumonidr = mpam_read_partsel_reg(msc, MBWUMON_IDR);
props->num_mbwu_mon = FIELD_GET(MPAMF_MBWUMON_IDR_NUM_MON, mbwumonidr);
@@ -870,6 +870,27 @@ static void mpam_ris_hw_probe(struct mpam_msc_ris *ris)
if (FIELD_GET(MPAMF_MBWUMON_IDR_HAS_RWBW, mbwumonidr))
mpam_set_feature(mpam_feat_msmon_mbwu_rwbw, props);
+ /*
+ * Treat long counter and its extension, lwd as mutually
+ * exclusive feature bits. Though these are dependent
+ * fields at the implementation level, there would never
+ * be a need for mpam_feat_msmon_mbwu_44counter (long
+ * counter) and mpam_feat_msmon_mbwu_63counter (lwd)
+ * bits to be set together.
+ *
+ * mpam_feat_msmon_mbwu isn't treated as an exclusive
+ * bit as this feature bit would be used as the "front
+ * facing feature bit" for any checks related to mbwu
+ * monitors.
+ */
+ has_long = FIELD_GET(MPAMF_MBWUMON_IDR_HAS_LONG, mbwumonidr);
+ if (props->num_mbwu_mon && has_long) {
+ if (FIELD_GET(MPAMF_MBWUMON_IDR_LWD, mbwumonidr))
+ mpam_set_feature(mpam_feat_msmon_mbwu_63counter, props);
+ else
+ mpam_set_feature(mpam_feat_msmon_mbwu_44counter, props);
+ }
+
/* Is NRDY hardware managed? */
mpam_mon_sel_outer_lock(msc);
mpam_ris_hw_probe_hw_nrdy(ris, MBWU, hw_managed);
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index fc71afce3180..fc705801c1b6 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -178,7 +178,15 @@ enum mpam_device_features {
mpam_feat_msmon_csu,
mpam_feat_msmon_csu_capture,
mpam_feat_msmon_csu_hw_nrdy,
+
+ /*
+ * Having mpam_feat_msmon_mbwu set doesn't mean the regular 31 bit MBWU
+ * counter would be used. The exact counter used is decided based on the
+ * status of mpam_feat_msmon_mbwu_44counter/63counter as well.
+ */
mpam_feat_msmon_mbwu,
+ mpam_feat_msmon_mbwu_44counter,
+ mpam_feat_msmon_mbwu_63counter,
mpam_feat_msmon_mbwu_capture,
mpam_feat_msmon_mbwu_rwbw,
mpam_feat_msmon_mbwu_hw_nrdy,
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 33/36] arm_mpam: Use long MBWU counters if supported
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (31 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 32/36] arm_mpam: Probe for long/lwd mbwu counters James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-28 13:46 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 34/36] arm_mpam: Add helper to reset saved mbwu state James Morse
` (3 subsequent siblings)
36 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
From: Rohit Mathew <rohit.mathew@arm.com>
If the 44 bit (long) or 63 bit (LWD) counter is detected when probing
the RIS, use the long/LWD counter instead of the regular 31 bit MBWU
counter.
Only 32 bit accesses to the MSC are required to be supported by the
spec, but these registers are 64 bits. The lower half may overflow
into the higher half between two 32 bit reads. To avoid this, use
a helper that reads the top half twice to check whether it changed.
Signed-off-by: Rohit Mathew <rohit.mathew@arm.com>
[morse: merged multiple patches from Rohit]
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 89 ++++++++++++++++++---
drivers/platform/arm64/mpam/mpam_internal.h | 8 +-
2 files changed, 86 insertions(+), 11 deletions(-)
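The helper added below uses the usual high/low/high sequence for reading a 64-bit counter over a 32-bit bus. A generic sketch of the technique follows (not the driver's helper, which additionally bounds the number of retries and reports not-ready if the value never stabilises):

#include <linux/io.h>

static u64 example_read_u64_via_u32(void __iomem *reg)
{
	u32 lo;
	u64 hi1, hi2;

	hi2 = readl(reg + 4);			/* high word */
	do {
		hi1 = hi2;
		lo = readl(reg);		/* low word */
		hi2 = readl(reg + 4);		/* high word again */
	} while (hi1 != hi2);			/* it moved: low word may have wrapped */

	return (hi1 << 32) | lo;
}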
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 774137a124f8..ace69ac2d0ee 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -989,6 +989,48 @@ struct mon_read {
int err;
};
+static bool mpam_ris_has_mbwu_long_counter(struct mpam_msc_ris *ris)
+{
+ return (mpam_has_feature(mpam_feat_msmon_mbwu_63counter, &ris->props) ||
+ mpam_has_feature(mpam_feat_msmon_mbwu_44counter, &ris->props));
+}
+
+static u64 mpam_msc_read_mbwu_l(struct mpam_msc *msc)
+{
+ int retry = 3;
+ u32 mbwu_l_low;
+ u64 mbwu_l_high1, mbwu_l_high2;
+
+ mpam_mon_sel_lock_held(msc);
+
+ WARN_ON_ONCE((MSMON_MBWU_L + sizeof(u64)) > msc->mapped_hwpage_sz);
+ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &msc->accessibility));
+
+ mbwu_l_high2 = __mpam_read_reg(msc, MSMON_MBWU_L + 4);
+ do {
+ mbwu_l_high1 = mbwu_l_high2;
+ mbwu_l_low = __mpam_read_reg(msc, MSMON_MBWU_L);
+ mbwu_l_high2 = __mpam_read_reg(msc, MSMON_MBWU_L + 4);
+
+ retry--;
+ } while (mbwu_l_high1 != mbwu_l_high2 && retry > 0);
+
+ if (mbwu_l_high1 == mbwu_l_high2)
+ return (mbwu_l_high1 << 32) | mbwu_l_low;
+ return MSMON___NRDY_L;
+}
+
+static void mpam_msc_zero_mbwu_l(struct mpam_msc *msc)
+{
+ mpam_mon_sel_lock_held(msc);
+
+ WARN_ON_ONCE((MSMON_MBWU_L + sizeof(u64)) > msc->mapped_hwpage_sz);
+ WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &msc->accessibility));
+
+ __mpam_write_reg(msc, MSMON_MBWU_L, 0);
+ __mpam_write_reg(msc, MSMON_MBWU_L + 4, 0);
+}
+
static void gen_msmon_ctl_flt_vals(struct mon_read *m, u32 *ctl_val,
u32 *flt_val)
{
@@ -1045,6 +1087,7 @@ static void read_msmon_ctl_flt_vals(struct mon_read *m, u32 *ctl_val,
static void clean_msmon_ctl_val(u32 *cur_ctl)
{
*cur_ctl &= ~MSMON_CFG_x_CTL_OFLOW_STATUS;
+ *cur_ctl &= ~MSMON_CFG_x_CTL_OFLOW_STATUS_L;
}
static void write_msmon_ctl_flt_vals(struct mon_read *m, u32 ctl_val,
@@ -1067,7 +1110,11 @@ static void write_msmon_ctl_flt_vals(struct mon_read *m, u32 ctl_val,
case mpam_feat_msmon_mbwu:
mpam_write_monsel_reg(msc, CFG_MBWU_FLT, flt_val);
mpam_write_monsel_reg(msc, CFG_MBWU_CTL, ctl_val);
- mpam_write_monsel_reg(msc, MBWU, 0);
+ if (mpam_ris_has_mbwu_long_counter(m->ris))
+ mpam_msc_zero_mbwu_l(m->ris->vmsc->msc);
+ else
+ mpam_write_monsel_reg(msc, MBWU, 0);
+
mpam_write_monsel_reg(msc, CFG_MBWU_CTL, ctl_val | MSMON_CFG_x_CTL_EN);
mbwu_state = &m->ris->mbwu_state[m->ctx->mon];
@@ -1082,8 +1129,13 @@ static void write_msmon_ctl_flt_vals(struct mon_read *m, u32 ctl_val,
static u64 mpam_msmon_overflow_val(struct mpam_msc_ris *ris)
{
- /* TODO: scaling, and long counters */
- return GENMASK_ULL(30, 0);
+ /* TODO: implement scaling counters */
+ if (mpam_has_feature(mpam_feat_msmon_mbwu_63counter, &ris->props))
+ return GENMASK_ULL(62, 0);
+ else if (mpam_has_feature(mpam_feat_msmon_mbwu_44counter, &ris->props))
+ return GENMASK_ULL(43, 0);
+ else
+ return GENMASK_ULL(30, 0);
}
/* Call with MSC lock held */
@@ -1125,10 +1177,24 @@ static void __ris_msmon_read(void *arg)
now = FIELD_GET(MSMON___VALUE, now);
break;
case mpam_feat_msmon_mbwu:
- now = mpam_read_monsel_reg(msc, MBWU);
- if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops))
- nrdy = now & MSMON___NRDY;
- now = FIELD_GET(MSMON___VALUE, now);
+ /*
+ * If long or lwd counters are supported, use them, else revert
+ * to the 32 bit counter.
+ */
+ if (mpam_ris_has_mbwu_long_counter(ris)) {
+ now = mpam_msc_read_mbwu_l(msc);
+ if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops))
+ nrdy = now & MSMON___NRDY_L;
+ if (mpam_has_feature(mpam_feat_msmon_mbwu_63counter, rprops))
+ now = FIELD_GET(MSMON___LWD_VALUE, now);
+ else
+ now = FIELD_GET(MSMON___L_VALUE, now);
+ } else {
+ now = mpam_read_monsel_reg(msc, MBWU);
+ if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops))
+ nrdy = now & MSMON___NRDY;
+ now = FIELD_GET(MSMON___VALUE, now);
+ }
if (nrdy)
break;
@@ -1421,8 +1487,13 @@ static int mpam_save_mbwu_state(void *arg)
cur_ctl = mpam_read_monsel_reg(msc, CFG_MBWU_CTL);
mpam_write_monsel_reg(msc, CFG_MBWU_CTL, 0);
- val = mpam_read_monsel_reg(msc, MBWU);
- mpam_write_monsel_reg(msc, MBWU, 0);
+ if (mpam_ris_has_mbwu_long_counter(ris)) {
+ val = mpam_msc_read_mbwu_l(msc);
+ mpam_msc_zero_mbwu_l(msc);
+ } else {
+ val = mpam_read_monsel_reg(msc, MBWU);
+ mpam_write_monsel_reg(msc, MBWU, 0);
+ }
cfg->mon = i;
cfg->pmg = FIELD_GET(MSMON_CFG_MBWU_FLT_PMG, cur_flt);
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index fc705801c1b6..4553616f2f67 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -178,7 +178,6 @@ enum mpam_device_features {
mpam_feat_msmon_csu,
mpam_feat_msmon_csu_capture,
mpam_feat_msmon_csu_hw_nrdy,
-
/*
* Having mpam_feat_msmon_mbwu set doesn't mean the regular 31 bit MBWU
* counter would be used. The exact counter used is decided based on the
@@ -457,6 +456,8 @@ int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
#define MSMON_CSU_CAPTURE 0x0848 /* last cache-usage value captured */
#define MSMON_MBWU 0x0860 /* current mem-bw usage value */
#define MSMON_MBWU_CAPTURE 0x0868 /* last mem-bw value captured */
+#define MSMON_MBWU_L 0x0880 /* current long mem-bw usage value */
+#define MSMON_MBWU_CAPTURE_L 0x0890 /* last long mem-bw value captured */
#define MSMON_CAPT_EVNT 0x0808 /* signal a capture event */
#define MPAMF_ESR 0x00F8 /* error status register */
#define MPAMF_ECR 0x00F0 /* error control register */
@@ -674,7 +675,10 @@ int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
*/
#define MSMON___VALUE GENMASK(30, 0)
#define MSMON___NRDY BIT(31)
-#define MSMON_MBWU_L_VALUE GENMASK(62, 0)
+#define MSMON___NRDY_L BIT(63)
+#define MSMON___L_VALUE GENMASK(43, 0)
+#define MSMON___LWD_VALUE GENMASK(62, 0)
+
/*
* MSMON_CAPT_EVNT - Memory system performance monitoring capture event
* generation register
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 34/36] arm_mpam: Add helper to reset saved mbwu state
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (32 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 33/36] arm_mpam: Use long MBWU counters if supported James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 35/36] arm_mpam: Add kunit test for bitmap reset James Morse
` (2 subsequent siblings)
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
resctrl expects to reset the bandwidth counters when the filesystem
is mounted.
To allow this, add a helper that clears the saved mbwu state. Instead
of cross-calling to each CPU that can access the component's MSCs to
write to the counter, set a flag that causes it to be zeroed on the
next read. This is easily done by forcing a configuration update.
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_devices.c | 49 ++++++++++++++++++++-
drivers/platform/arm64/mpam/mpam_internal.h | 5 ++-
2 files changed, 51 insertions(+), 3 deletions(-)
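The expected use (hypothetical, the resctrl code is not part of this series) is from the filesystem mount path, once per monitor:

/* Hypothetical caller on the resctrl mount path. */
static void example_reset_mbm(struct mpam_component *comp, u32 mon)
{
	struct mon_cfg cfg = {
		.mon = mon,	/* which saved monitor state to clear */
	};

	/* Clears the saved correction and marks the counter to be
	 * reprogrammed (and so zeroed) on the next read. */
	mpam_msmon_reset_mbwu(comp, &cfg);
}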
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index ace69ac2d0ee..470a3709670e 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -1142,9 +1142,11 @@ static u64 mpam_msmon_overflow_val(struct mpam_msc_ris *ris)
static void __ris_msmon_read(void *arg)
{
bool nrdy = false;
+ bool config_mismatch;
struct mon_read *m = arg;
u64 now, overflow_val = 0;
struct mon_cfg *ctx = m->ctx;
+ bool reset_on_next_read = false;
struct mpam_msc_ris *ris = m->ris;
struct msmon_mbwu_state *mbwu_state;
struct mpam_props *rprops = &ris->props;
@@ -1159,6 +1161,14 @@ static void __ris_msmon_read(void *arg)
FIELD_PREP(MSMON_CFG_MON_SEL_RIS, ris->ris_idx);
mpam_write_monsel_reg(msc, CFG_MON_SEL, mon_sel);
+ if (m->type == mpam_feat_msmon_mbwu) {
+ mbwu_state = &ris->mbwu_state[ctx->mon];
+ if (mbwu_state) {
+ reset_on_next_read = mbwu_state->reset_on_next_read;
+ mbwu_state->reset_on_next_read = false;
+ }
+ }
+
/*
* Read the existing configuration to avoid re-writing the same values.
* This saves waiting for 'nrdy' on subsequent reads.
@@ -1166,7 +1176,10 @@ static void __ris_msmon_read(void *arg)
read_msmon_ctl_flt_vals(m, &cur_ctl, &cur_flt);
clean_msmon_ctl_val(&cur_ctl);
gen_msmon_ctl_flt_vals(m, &ctl_val, &flt_val);
- if (cur_flt != flt_val || cur_ctl != (ctl_val | MSMON_CFG_x_CTL_EN))
+ config_mismatch = cur_flt != flt_val ||
+ cur_ctl != (ctl_val | MSMON_CFG_x_CTL_EN);
+
+ if (config_mismatch || reset_on_next_read)
write_msmon_ctl_flt_vals(m, ctl_val, flt_val);
switch (m->type) {
@@ -1199,7 +1212,6 @@ static void __ris_msmon_read(void *arg)
if (nrdy)
break;
- mbwu_state = &ris->mbwu_state[ctx->mon];
if (!mbwu_state)
break;
@@ -1301,6 +1313,39 @@ int mpam_msmon_read(struct mpam_component *comp, struct mon_cfg *ctx,
return err;
}
+void mpam_msmon_reset_mbwu(struct mpam_component *comp, struct mon_cfg *ctx)
+{
+ int idx;
+ struct mpam_msc *msc;
+ struct mpam_vmsc *vmsc;
+ struct mpam_msc_ris *ris;
+
+ if (!mpam_is_enabled())
+ return;
+
+ idx = srcu_read_lock(&mpam_srcu);
+ list_for_each_entry_rcu(vmsc, &comp->vmsc, comp_list) {
+ if (!mpam_has_feature(mpam_feat_msmon_mbwu, &vmsc->props))
+ continue;
+
+ msc = vmsc->msc;
+ mpam_mon_sel_outer_lock(msc);
+ list_for_each_entry_rcu(ris, &vmsc->ris, vmsc_list) {
+ if (!mpam_has_feature(mpam_feat_msmon_mbwu, &ris->props))
+ continue;
+
+ if (WARN_ON_ONCE(!mpam_mon_sel_inner_lock(msc)))
+ continue;
+
+ ris->mbwu_state[ctx->mon].correction = 0;
+ ris->mbwu_state[ctx->mon].reset_on_next_read = true;
+ mpam_mon_sel_inner_unlock(msc);
+ }
+ mpam_mon_sel_outer_unlock(msc);
+ }
+ srcu_read_unlock(&mpam_srcu, idx);
+}
+
static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd)
{
u32 num_words, msb;
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index 4553616f2f67..76b83bda3d37 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -295,10 +295,12 @@ struct mon_cfg {
/*
* Changes to enabled and cfg are protected by the msc->lock.
- * Changes to prev_val and correction are protected by the msc's mon_sel_lock.
+ * Changes to reset_on_next_read, prev_val and correction are protected by the
+ * msc's mon_sel_lock.
*/
struct msmon_mbwu_state {
bool enabled;
+ bool reset_on_next_read;
struct mon_cfg cfg;
/* The value last read from the hardware. Used to detect overflow. */
@@ -408,6 +410,7 @@ int mpam_apply_config(struct mpam_component *comp, u16 partid,
int mpam_msmon_read(struct mpam_component *comp, struct mon_cfg *ctx,
enum mpam_device_features, u64 *val);
+void mpam_msmon_reset_mbwu(struct mpam_component *comp, struct mon_cfg *ctx);
int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
cpumask_t *affinity);
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 35/36] arm_mpam: Add kunit test for bitmap reset
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (33 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 34/36] arm_mpam: Add helper to reset saved mbwu state James Morse
@ 2025-07-11 18:36 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 36/36] arm_mpam: Add kunit tests for props_mismatch() James Morse
2025-08-01 16:09 ` [RFC PATCH 00/36] arm_mpam: Add basic mpam driver Jonathan Cameron
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse, Jonathan Cameron
The bitmap reset code has been a source of bugs. Add a unit test.
This currently has to be built in, as the rest of the driver is
builtin.
Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/Kconfig | 13 ++++
drivers/platform/arm64/mpam/mpam_devices.c | 4 ++
.../platform/arm64/mpam/test_mpam_devices.c | 68 +++++++++++++++++++
3 files changed, 85 insertions(+)
create mode 100644 drivers/platform/arm64/mpam/test_mpam_devices.c
diff --git a/drivers/platform/arm64/mpam/Kconfig b/drivers/platform/arm64/mpam/Kconfig
index b63495d7da87..861e4b946ddc 100644
--- a/drivers/platform/arm64/mpam/Kconfig
+++ b/drivers/platform/arm64/mpam/Kconfig
@@ -4,7 +4,20 @@ config ARM_CPU_RESCTRL
bool
depends on ARM64
+menu "ARM64 MPAM driver options"
+
config ARM64_MPAM_DRIVER_DEBUG
bool "Enable debug messages from the MPAM driver."
help
Say yes here to enable debug messages from the MPAM driver.
+
+config MPAM_KUNIT_TEST
+ bool "KUnit tests for MPAM driver " if !KUNIT_ALL_TESTS
+ depends on KUNIT=y
+ default KUNIT_ALL_TESTS
+ help
+ Enable this option to run tests in the MPAM driver.
+
+ If unsure, say N.
+
+endmenu
diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
index 470a3709670e..a7f301b0da84 100644
--- a/drivers/platform/arm64/mpam/mpam_devices.c
+++ b/drivers/platform/arm64/mpam/mpam_devices.c
@@ -2904,3 +2904,7 @@ static int __init mpam_msc_driver_init(void)
}
/* Must occur after arm64_mpam_register_cpus() from arch_initcall() */
subsys_initcall(mpam_msc_driver_init);
+
+#ifdef CONFIG_MPAM_KUNIT_TEST
+#include "test_mpam_devices.c"
+#endif
diff --git a/drivers/platform/arm64/mpam/test_mpam_devices.c b/drivers/platform/arm64/mpam/test_mpam_devices.c
new file mode 100644
index 000000000000..8e9d6c88171c
--- /dev/null
+++ b/drivers/platform/arm64/mpam/test_mpam_devices.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2024 Arm Ltd.
+/* This file is intended to be included into mpam_devices.c */
+
+#include <kunit/test.h>
+
+static void test_mpam_reset_msc_bitmap(struct kunit *test)
+{
+ char *buf = kunit_kzalloc(test, SZ_16K, GFP_KERNEL);
+ struct mpam_msc fake_msc;
+ u32 *test_result;
+
+ if (!buf)
+ return;
+
+ fake_msc.mapped_hwpage = buf;
+ fake_msc.mapped_hwpage_sz = SZ_16K;
+ cpumask_copy(&fake_msc.accessibility, cpu_possible_mask);
+
+ mutex_init(&fake_msc.part_sel_lock);
+ mutex_lock(&fake_msc.part_sel_lock);
+
+ test_result = (u32 *)(buf + MPAMCFG_CPBM);
+
+ mpam_reset_msc_bitmap(&fake_msc, MPAMCFG_CPBM, 0);
+ KUNIT_EXPECT_EQ(test, test_result[0], 0);
+ KUNIT_EXPECT_EQ(test, test_result[1], 0);
+ test_result[0] = 0;
+ test_result[1] = 0;
+
+ mpam_reset_msc_bitmap(&fake_msc, MPAMCFG_CPBM, 1);
+ KUNIT_EXPECT_EQ(test, test_result[0], 1);
+ KUNIT_EXPECT_EQ(test, test_result[1], 0);
+ test_result[0] = 0;
+ test_result[1] = 0;
+
+ mpam_reset_msc_bitmap(&fake_msc, MPAMCFG_CPBM, 16);
+ KUNIT_EXPECT_EQ(test, test_result[0], 0xffff);
+ KUNIT_EXPECT_EQ(test, test_result[1], 0);
+ test_result[0] = 0;
+ test_result[1] = 0;
+
+ mpam_reset_msc_bitmap(&fake_msc, MPAMCFG_CPBM, 32);
+ KUNIT_EXPECT_EQ(test, test_result[0], 0xffffffff);
+ KUNIT_EXPECT_EQ(test, test_result[1], 0);
+ test_result[0] = 0;
+ test_result[1] = 0;
+
+ mpam_reset_msc_bitmap(&fake_msc, MPAMCFG_CPBM, 33);
+ KUNIT_EXPECT_EQ(test, test_result[0], 0xffffffff);
+ KUNIT_EXPECT_EQ(test, test_result[1], 1);
+ test_result[0] = 0;
+ test_result[1] = 0;
+
+ mutex_unlock(&fake_msc.part_sel_lock);
+}
+
+static struct kunit_case mpam_devices_test_cases[] = {
+ KUNIT_CASE(test_mpam_reset_msc_bitmap),
+ {}
+};
+
+static struct kunit_suite mpam_devices_test_suite = {
+ .name = "mpam_devices_test_suite",
+ .test_cases = mpam_devices_test_cases,
+};
+
+kunit_test_suites(&mpam_devices_test_suite);
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* [RFC PATCH 36/36] arm_mpam: Add kunit tests for props_mismatch()
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (34 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 35/36] arm_mpam: Add kunit test for bitmap reset James Morse
@ 2025-07-11 18:36 ` James Morse
2025-08-01 16:09 ` [RFC PATCH 00/36] arm_mpam: Add basic mpam driver Jonathan Cameron
36 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-11 18:36 UTC (permalink / raw)
To: linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
James Morse
When features are mismatched between MSCs, the way they are combined
into the class determines whether resctrl can support this SoC.
Add some tests to illustrate the sort of mismatch that is expected to
work, and the sort where the feature must be removed.
Signed-off-by: James Morse <james.morse@arm.com>
---
drivers/platform/arm64/mpam/mpam_internal.h | 8 +-
.../platform/arm64/mpam/test_mpam_devices.c | 322 ++++++++++++++++++
2 files changed, 329 insertions(+), 1 deletion(-)
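The merge rules the tests below exercise can be paraphrased as follows (a summary of the expectations, not driver code):
- RIS within a single MSC may expose different control types; the class keeps the union.
- Different MSCs (or components) acting on the same resource must agree on a feature; a feature missing from one of them is dropped from the class.
- Where all of them support a feature but with different widths, the class keeps the smallest width.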
diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
index 76b83bda3d37..08ed9bbf1f5e 100644
--- a/drivers/platform/arm64/mpam/mpam_internal.h
+++ b/drivers/platform/arm64/mpam/mpam_internal.h
@@ -18,6 +18,12 @@
DECLARE_STATIC_KEY_FALSE(mpam_enabled);
+#ifdef CONFIG_MPAM_KUNIT_TEST
+#define PACKED_FOR_KUNIT __packed
+#else
+#define PACKED_FOR_KUNIT
+#endif
+
static inline bool mpam_is_enabled(void)
{
return static_branch_likely(&mpam_enabled);
@@ -207,7 +213,7 @@ struct mpam_props {
u16 dspri_wd;
u16 num_csu_mon;
u16 num_mbwu_mon;
-};
+} PACKED_FOR_KUNIT;
#define mpam_has_feature(_feat, x) ((1 << (_feat)) & (x)->features)
diff --git a/drivers/platform/arm64/mpam/test_mpam_devices.c b/drivers/platform/arm64/mpam/test_mpam_devices.c
index 8e9d6c88171c..ef39696e7ff8 100644
--- a/drivers/platform/arm64/mpam/test_mpam_devices.c
+++ b/drivers/platform/arm64/mpam/test_mpam_devices.c
@@ -4,6 +4,326 @@
#include <kunit/test.h>
+/*
+ * This test catches fields that aren't being sanitised - but can't tell you
+ * which one...
+ */
+static void test__props_mismatch(struct kunit *test)
+{
+ struct mpam_props parent = { 0 };
+ struct mpam_props child;
+
+ memset(&child, 0xff, sizeof(child));
+ __props_mismatch(&parent, &child, false);
+
+ memset(&child, 0, sizeof(child));
+ KUNIT_EXPECT_EQ(test, memcmp(&parent, &child, sizeof(child)), 0);
+
+ memset(&child, 0xff, sizeof(child));
+ __props_mismatch(&parent, &child, true);
+
+ KUNIT_EXPECT_EQ(test, memcmp(&parent, &child, sizeof(child)), 0);
+}
+
+static void test_mpam_enable_merge_features(struct kunit *test)
+{
+ /* o/` How deep is your stack? o/` */
+ struct list_head fake_classes_list;
+ struct mpam_class fake_class = { 0 };
+ struct mpam_component fake_comp1 = { 0 };
+ struct mpam_component fake_comp2 = { 0 };
+ struct mpam_vmsc fake_vmsc1 = { 0 };
+ struct mpam_vmsc fake_vmsc2 = { 0 };
+ struct mpam_msc fake_msc1 = { 0 };
+ struct mpam_msc fake_msc2 = { 0 };
+ struct mpam_msc_ris fake_ris1 = { 0 };
+ struct mpam_msc_ris fake_ris2 = { 0 };
+ struct platform_device fake_pdev = { 0 };
+
+#define RESET_FAKE_HIEARCHY() do { \
+ INIT_LIST_HEAD(&fake_classes_list); \
+ \
+ memset(&fake_class, 0, sizeof(fake_class)); \
+ fake_class.level = 3; \
+ fake_class.type = MPAM_CLASS_CACHE; \
+ INIT_LIST_HEAD_RCU(&fake_class.components); \
+ INIT_LIST_HEAD(&fake_class.classes_list); \
+ \
+ memset(&fake_comp1, 0, sizeof(fake_comp1)); \
+ memset(&fake_comp2, 0, sizeof(fake_comp2)); \
+ fake_comp1.comp_id = 1; \
+ fake_comp2.comp_id = 2; \
+ INIT_LIST_HEAD(&fake_comp1.vmsc); \
+ INIT_LIST_HEAD(&fake_comp1.class_list); \
+ INIT_LIST_HEAD(&fake_comp2.vmsc); \
+ INIT_LIST_HEAD(&fake_comp2.class_list); \
+ \
+ memset(&fake_vmsc1, 0, sizeof(fake_vmsc1)); \
+ memset(&fake_vmsc2, 0, sizeof(fake_vmsc2)); \
+ INIT_LIST_HEAD(&fake_vmsc1.ris); \
+ INIT_LIST_HEAD(&fake_vmsc1.comp_list); \
+ fake_vmsc1.msc = &fake_msc1; \
+ INIT_LIST_HEAD(&fake_vmsc2.ris); \
+ INIT_LIST_HEAD(&fake_vmsc2.comp_list); \
+ fake_vmsc2.msc = &fake_msc2; \
+ \
+ memset(&fake_ris1, 0, sizeof(fake_ris1)); \
+ memset(&fake_ris2, 0, sizeof(fake_ris2)); \
+ fake_ris1.ris_idx = 1; \
+ INIT_LIST_HEAD(&fake_ris1.msc_list); \
+ fake_ris2.ris_idx = 2; \
+ INIT_LIST_HEAD(&fake_ris2.msc_list); \
+ \
+ fake_msc1.pdev = &fake_pdev; \
+ fake_msc2.pdev = &fake_pdev; \
+ \
+ list_add(&fake_class.classes_list, &fake_classes_list); \
+} while (0)
+
+ RESET_FAKE_HIEARCHY();
+
+ mutex_lock(&mpam_list_lock);
+
+ /* One Class+Comp, two RIS in one vMSC with common features */
+ fake_comp1.class = &fake_class;
+ list_add(&fake_comp1.class_list, &fake_class.components);
+ fake_comp2.class = NULL;
+ fake_vmsc1.comp = &fake_comp1;
+ list_add(&fake_vmsc1.comp_list, &fake_comp1.vmsc);
+ fake_vmsc2.comp = NULL;
+ fake_ris1.vmsc = &fake_vmsc1;
+ list_add(&fake_ris1.vmsc_list, &fake_vmsc1.ris);
+ fake_ris2.vmsc = &fake_vmsc1;
+ list_add(&fake_ris2.vmsc_list, &fake_vmsc1.ris);
+
+ mpam_set_feature(mpam_feat_cpor_part, &fake_ris1.props);
+ mpam_set_feature(mpam_feat_cpor_part, &fake_ris2.props);
+ fake_ris1.props.cpbm_wd = 4;
+ fake_ris2.props.cpbm_wd = 4;
+
+ mpam_enable_merge_features(&fake_classes_list);
+
+ KUNIT_EXPECT_TRUE(test, mpam_has_feature(mpam_feat_cpor_part, &fake_class.props));
+ KUNIT_EXPECT_EQ(test, fake_class.props.cpbm_wd, 4);
+
+ RESET_FAKE_HIEARCHY();
+
+ /* One Class+Comp, two RIS in one vMSC with non-overlapping features */
+ fake_comp1.class = &fake_class;
+ list_add(&fake_comp1.class_list, &fake_class.components);
+ fake_comp2.class = NULL;
+ fake_vmsc1.comp = &fake_comp1;
+ list_add(&fake_vmsc1.comp_list, &fake_comp1.vmsc);
+ fake_vmsc2.comp = NULL;
+ fake_ris1.vmsc = &fake_vmsc1;
+ list_add(&fake_ris1.vmsc_list, &fake_vmsc1.ris);
+ fake_ris2.vmsc = &fake_vmsc1;
+ list_add(&fake_ris2.vmsc_list, &fake_vmsc1.ris);
+
+ mpam_set_feature(mpam_feat_cpor_part, &fake_ris1.props);
+ mpam_set_feature(mpam_feat_cmax_cmin, &fake_ris2.props);
+ fake_ris1.props.cpbm_wd = 4;
+ fake_ris2.props.cmax_wd = 4;
+
+ mpam_enable_merge_features(&fake_classes_list);
+
+ /* Multiple RIS within one MSC controlling the same resource can be mismatched */
+ KUNIT_EXPECT_TRUE(test, mpam_has_feature(mpam_feat_cpor_part, &fake_class.props));
+ KUNIT_EXPECT_TRUE(test, mpam_has_feature(mpam_feat_cmax_cmin, &fake_class.props));
+ KUNIT_EXPECT_TRUE(test, mpam_has_feature(mpam_feat_cmax_cmin, &fake_vmsc1.props));
+ KUNIT_EXPECT_EQ(test, fake_class.props.cpbm_wd, 4);
+ KUNIT_EXPECT_EQ(test, fake_vmsc1.props.cmax_wd, 4);
+ KUNIT_EXPECT_EQ(test, fake_class.props.cmax_wd, 4);
+
+ RESET_FAKE_HIEARCHY();
+
+ /* One Class+Comp, two MSC with overlapping features */
+ fake_comp1.class = &fake_class;
+ list_add(&fake_comp1.class_list, &fake_class.components);
+ fake_comp2.class = NULL;
+ fake_vmsc1.comp = &fake_comp1;
+ list_add(&fake_vmsc1.comp_list, &fake_comp1.vmsc);
+ fake_vmsc2.comp = &fake_comp1;
+ list_add(&fake_vmsc2.comp_list, &fake_comp1.vmsc);
+ fake_ris1.vmsc = &fake_vmsc1;
+ list_add(&fake_ris1.vmsc_list, &fake_vmsc1.ris);
+ fake_ris2.vmsc = &fake_vmsc2;
+ list_add(&fake_ris2.vmsc_list, &fake_vmsc2.ris);
+
+ mpam_set_feature(mpam_feat_cpor_part, &fake_ris1.props);
+ mpam_set_feature(mpam_feat_cpor_part, &fake_ris2.props);
+ fake_ris1.props.cpbm_wd = 4;
+ fake_ris2.props.cpbm_wd = 4;
+
+ mpam_enable_merge_features(&fake_classes_list);
+
+ KUNIT_EXPECT_TRUE(test, mpam_has_feature(mpam_feat_cpor_part, &fake_class.props));
+ KUNIT_EXPECT_EQ(test, fake_class.props.cpbm_wd, 4);
+
+ RESET_FAKE_HIEARCHY();
+
+ /* One Class+Comp, two MSC with non-overlapping features */
+ fake_comp1.class = &fake_class;
+ list_add(&fake_comp1.class_list, &fake_class.components);
+ fake_comp2.class = NULL;
+ fake_vmsc1.comp = &fake_comp1;
+ list_add(&fake_vmsc1.comp_list, &fake_comp1.vmsc);
+ fake_vmsc2.comp = &fake_comp1;
+ list_add(&fake_vmsc2.comp_list, &fake_comp1.vmsc);
+ fake_ris1.vmsc = &fake_vmsc1;
+ list_add(&fake_ris1.vmsc_list, &fake_vmsc1.ris);
+ fake_ris2.vmsc = &fake_vmsc2;
+ list_add(&fake_ris2.vmsc_list, &fake_vmsc2.ris);
+
+ mpam_set_feature(mpam_feat_cpor_part, &fake_ris1.props);
+ mpam_set_feature(mpam_feat_cmax_cmin, &fake_ris2.props);
+ fake_ris1.props.cpbm_wd = 4;
+ fake_ris2.props.cmax_wd = 4;
+
+ mpam_enable_merge_features(&fake_classes_list);
+
+ /*
+ * Multiple RIS in different MSCs can't control the same resource;
+ * mismatched features can not be supported.
+ */
+ KUNIT_EXPECT_FALSE(test, mpam_has_feature(mpam_feat_cpor_part, &fake_class.props));
+ KUNIT_EXPECT_FALSE(test, mpam_has_feature(mpam_feat_cmax_cmin, &fake_class.props));
+ KUNIT_EXPECT_EQ(test, fake_class.props.cpbm_wd, 0);
+ KUNIT_EXPECT_EQ(test, fake_class.props.cmax_wd, 0);
+
+ RESET_FAKE_HIEARCHY();
+
+ /* One Class+Comp, two MSC with incompatible overlapping features */
+ fake_comp1.class = &fake_class;
+ list_add(&fake_comp1.class_list, &fake_class.components);
+ fake_comp2.class = NULL;
+ fake_vmsc1.comp = &fake_comp1;
+ list_add(&fake_vmsc1.comp_list, &fake_comp1.vmsc);
+ fake_vmsc2.comp = &fake_comp1;
+ list_add(&fake_vmsc2.comp_list, &fake_comp1.vmsc);
+ fake_ris1.vmsc = &fake_vmsc1;
+ list_add(&fake_ris1.vmsc_list, &fake_vmsc1.ris);
+ fake_ris2.vmsc = &fake_vmsc2;
+ list_add(&fake_ris2.vmsc_list, &fake_vmsc2.ris);
+
+ mpam_set_feature(mpam_feat_cpor_part, &fake_ris1.props);
+ mpam_set_feature(mpam_feat_cpor_part, &fake_ris2.props);
+ mpam_set_feature(mpam_feat_mbw_part, &fake_ris1.props);
+ mpam_set_feature(mpam_feat_mbw_part, &fake_ris2.props);
+ fake_ris1.props.cpbm_wd = 5;
+ fake_ris2.props.cpbm_wd = 3;
+ fake_ris1.props.mbw_pbm_bits = 5;
+ fake_ris2.props.mbw_pbm_bits = 3;
+
+ mpam_enable_merge_features(&fake_classes_list);
+
+ /*
+ * Multiple RIS in different MSCs can't control the same resource;
+ * mismatched features can not be supported.
+ */
+ KUNIT_EXPECT_FALSE(test, mpam_has_feature(mpam_feat_cpor_part, &fake_class.props));
+ KUNIT_EXPECT_FALSE(test, mpam_has_feature(mpam_feat_mbw_part, &fake_class.props));
+ KUNIT_EXPECT_EQ(test, fake_class.props.cpbm_wd, 0);
+ KUNIT_EXPECT_EQ(test, fake_class.props.mbw_pbm_bits, 0);
+
+ RESET_FAKE_HIEARCHY();
+
+ /* One Class+Comp, two MSC with overlapping features that need tweaking */
+ fake_comp1.class = &fake_class;
+ list_add(&fake_comp1.class_list, &fake_class.components);
+ fake_comp2.class = NULL;
+ fake_vmsc1.comp = &fake_comp1;
+ list_add(&fake_vmsc1.comp_list, &fake_comp1.vmsc);
+ fake_vmsc2.comp = &fake_comp1;
+ list_add(&fake_vmsc2.comp_list, &fake_comp1.vmsc);
+ fake_ris1.vmsc = &fake_vmsc1;
+ list_add(&fake_ris1.vmsc_list, &fake_vmsc1.ris);
+ fake_ris2.vmsc = &fake_vmsc2;
+ list_add(&fake_ris2.vmsc_list, &fake_vmsc2.ris);
+
+ mpam_set_feature(mpam_feat_mbw_min, &fake_ris1.props);
+ mpam_set_feature(mpam_feat_mbw_min, &fake_ris2.props);
+ mpam_set_feature(mpam_feat_cmax_cmax, &fake_ris1.props);
+ mpam_set_feature(mpam_feat_cmax_cmax, &fake_ris2.props);
+ fake_ris1.props.bwa_wd = 5;
+ fake_ris2.props.bwa_wd = 3;
+ fake_ris1.props.cmax_wd = 5;
+ fake_ris2.props.cmax_wd = 3;
+
+ mpam_enable_merge_features(&fake_classes_list);
+
+ /*
+ * Where the RIS in different MSCs support the same feature with
+ * different widths, the class keeps the smallest width.
+ */
+ KUNIT_EXPECT_TRUE(test, mpam_has_feature(mpam_feat_mbw_min, &fake_class.props));
+ KUNIT_EXPECT_TRUE(test, mpam_has_feature(mpam_feat_cmax_cmax, &fake_class.props));
+ KUNIT_EXPECT_EQ(test, fake_class.props.bwa_wd, 3);
+ KUNIT_EXPECT_EQ(test, fake_class.props.cmax_wd, 3);
+
+ RESET_FAKE_HIEARCHY();
+
+ /* One Class Two Comp with overlapping features */
+ fake_comp1.class = &fake_class;
+ list_add(&fake_comp1.class_list, &fake_class.components);
+ fake_comp2.class = &fake_class;
+ list_add(&fake_comp2.class_list, &fake_class.components);
+ fake_vmsc1.comp = &fake_comp1;
+ list_add(&fake_vmsc1.comp_list, &fake_comp1.vmsc);
+ fake_vmsc2.comp = &fake_comp2;
+ list_add(&fake_vmsc2.comp_list, &fake_comp2.vmsc);
+ fake_ris1.vmsc = &fake_vmsc1;
+ list_add(&fake_ris1.vmsc_list, &fake_vmsc1.ris);
+ fake_ris2.vmsc = &fake_vmsc2;
+ list_add(&fake_ris2.vmsc_list, &fake_vmsc2.ris);
+
+ mpam_set_feature(mpam_feat_cpor_part, &fake_ris1.props);
+ mpam_set_feature(mpam_feat_cpor_part, &fake_ris2.props);
+ fake_ris1.props.cpbm_wd = 4;
+ fake_ris2.props.cpbm_wd = 4;
+
+ mpam_enable_merge_features(&fake_classes_list);
+
+ KUNIT_EXPECT_TRUE(test, mpam_has_feature(mpam_feat_cpor_part, &fake_class.props));
+ KUNIT_EXPECT_EQ(test, fake_class.props.cpbm_wd, 4);
+
+ RESET_FAKE_HIEARCHY();
+
+ /* One Class Two Comp with non-overlapping features */
+ fake_comp1.class = &fake_class;
+ list_add(&fake_comp1.class_list, &fake_class.components);
+ fake_comp2.class = &fake_class;
+ list_add(&fake_comp2.class_list, &fake_class.components);
+ fake_vmsc1.comp = &fake_comp1;
+ list_add(&fake_vmsc1.comp_list, &fake_comp1.vmsc);
+ fake_vmsc2.comp = &fake_comp2;
+ list_add(&fake_vmsc2.comp_list, &fake_comp2.vmsc);
+ fake_ris1.vmsc = &fake_vmsc1;
+ list_add(&fake_ris1.vmsc_list, &fake_vmsc1.ris);
+ fake_ris2.vmsc = &fake_vmsc2;
+ list_add(&fake_ris2.vmsc_list, &fake_vmsc2.ris);
+
+ mpam_set_feature(mpam_feat_cpor_part, &fake_ris1.props);
+ mpam_set_feature(mpam_feat_cmax_cmin, &fake_ris2.props);
+ fake_ris1.props.cpbm_wd = 4;
+ fake_ris2.props.cmax_wd = 4;
+
+ mpam_enable_merge_features(&fake_classes_list);
+
+ /*
+ * Multiple components can't control the same resource; mismatched
+ * features can not be supported.
+ */
+ KUNIT_EXPECT_FALSE(test, mpam_has_feature(mpam_feat_cpor_part, &fake_class.props));
+ KUNIT_EXPECT_FALSE(test, mpam_has_feature(mpam_feat_cmax_cmin, &fake_class.props));
+ KUNIT_EXPECT_EQ(test, fake_class.props.cpbm_wd, 0);
+ KUNIT_EXPECT_EQ(test, fake_class.props.cmax_wd, 0);
+
+ mutex_unlock(&mpam_list_lock);
+
+#undef RESET_FAKE_HIEARCHY
+}
+
static void test_mpam_reset_msc_bitmap(struct kunit *test)
{
char *buf = kunit_kzalloc(test, SZ_16K, GFP_KERNEL);
@@ -57,6 +377,8 @@ static void test_mpam_reset_msc_bitmap(struct kunit *test)
static struct kunit_case mpam_devices_test_cases[] = {
KUNIT_CASE(test_mpam_reset_msc_bitmap),
+ KUNIT_CASE(test_mpam_enable_merge_features),
+ KUNIT_CASE(test__props_mismatch),
{}
};
--
2.39.5
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 11/36] dt-bindings: arm: Add MPAM MSC binding
2025-07-11 18:36 ` [RFC PATCH 11/36] dt-bindings: arm: Add MPAM MSC binding James Morse
@ 2025-07-11 21:43 ` Rob Herring
2025-08-05 17:08 ` James Morse
0 siblings, 1 reply; 117+ messages in thread
From: Rob Herring @ 2025-07-11 21:43 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Ben Horgan, Rohit Mathew,
Shanker Donthineni, Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi, D Scott Phillips OS, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Koba Ko
On Fri, Jul 11, 2025 at 06:36:23PM +0000, James Morse wrote:
> From: Rob Herring <robh@kernel.org>
>
> The binding is designed around the assumption that an MSC will be a
> sub-block of something else such as a memory controller, cache controller,
> or IOMMU. However, it's certainly possible a design does not have that
> association or has a mixture of both, so the binding illustrates how we can
> support that with RIS child nodes.
>
> A key part of MPAM is we need to know about all of the MSCs in the system
> before it can be enabled. This drives the need for the genericish
> 'arm,mpam-msc' compatible. Though we can't assume an MSC is accessible
> until a h/w specific driver potentially enables the h/w.
>
> Cc: James Morse <james.morse@arm.com>
> Signed-off-by: Rob Herring <robh@kernel.org>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> .../devicetree/bindings/arm/arm,mpam-msc.yaml | 227 ++++++++++++++++++
> 1 file changed, 227 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml
Is there any DT based h/w using this? I'm not aware of any. I would
prefer not merging this until there is. I have little insight whether
these genericish compatibles will be sufficient, but I have lots of
experience to say they won't be. I would also suspect that if anyone has
started using this, they've just extended/modified it however they
wanted and no feedback got to me.
> diff --git a/Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml b/Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml
> new file mode 100644
> index 000000000000..9d542ecb1a7d
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml
> @@ -0,0 +1,227 @@
> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/arm/arm,mpam-msc.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Arm Memory System Resource Partitioning and Monitoring (MPAM)
> +
> +description: |
> + The Arm MPAM specification can be found here:
> +
> + https://developer.arm.com/documentation/ddi0598/latest
> +
> +maintainers:
> + - Rob Herring <robh@kernel.org>
> +
> +properties:
> + compatible:
> + items:
> + - const: arm,mpam-msc # Further details are discoverable
> + - const: arm,mpam-memory-controller-msc
> +
> + reg:
> + maxItems: 1
> + description: A memory region containing registers as defined in the MPAM
> + specification.
> +
> + interrupts:
> + minItems: 1
> + items:
> + - description: error (optional)
> + - description: overflow (optional, only for monitoring)
> +
> + interrupt-names:
> + oneOf:
> + - items:
> + - enum: [ error, overflow ]
> + - items:
> + - const: error
> + - const: overflow
> +
> + arm,not-ready-us:
> + description: The maximum time in microseconds for monitoring data to be
> + accurate after a settings change. For more information, see the
> + Not-Ready (NRDY) bit description in the MPAM specification.
> +
> + numa-node-id: true # see NUMA binding
> +
> + '#address-cells':
> + const: 1
> +
> + '#size-cells':
> + const: 0
> +
> +patternProperties:
> + '^ris@[0-9a-f]$':
> + type: object
> + additionalProperties: false
> + description: |
'|' can be dropped.
> + RIS nodes for each RIS in an MSC. These nodes are required for each RIS
> + implementing known MPAM controls
> +
> + properties:
> + compatible:
> + enum:
> + # Bulk storage for cache
> + - arm,mpam-cache
> + # Memory bandwidth
> + - arm,mpam-memory
> +
> + reg:
> + minimum: 0
> + maximum: 0xf
> +
> + cpus:
> + $ref: '/schemas/types.yaml#/definitions/phandle-array'
Don't need the type. It's in the core schemas now.
> + description:
> + Phandle(s) to the CPU node(s) this RIS belongs to. By default, the parent
> + device's affinity is used.
> +
> + arm,mpam-device:
> + $ref: '/schemas/types.yaml#/definitions/phandle'
Don't need quotes. This should be a warning, but no testing happened
because the DT list and maintainers weren't CCed.
> + description:
> + By default, the MPAM enabled device associated with a RIS is the MSC's
> + parent node. It is possible for each RIS to be associated with different
> + devices in which case 'arm,mpam-device' should be used.
> +
> + required:
> + - compatible
> + - reg
> +
> +required:
> + - compatible
> + - reg
> +
> +dependencies:
> + interrupts: [ interrupt-names ]
> +
> +additionalProperties: false
> +
> +examples:
> + - |
> + /*
> + cpus {
> + cpu@0 {
> + next-level-cache = <&L2_0>;
> + };
> + cpu@100 {
> + next-level-cache = <&L2_1>;
> + };
> + };
> + */
> + L2_0: cache-controller-0 {
> + compatible = "cache";
> + cache-level = <2>;
> + cache-unified;
> + next-level-cache = <&L3>;
> +
> + };
> +
> + L2_1: cache-controller-1 {
> + compatible = "cache";
> + cache-level = <2>;
> + cache-unified;
> + next-level-cache = <&L3>;
> +
> + };
All the above should be dropped. Not part of this binding.
> +
> + L3: cache-controller@30000000 {
> + compatible = "arm,dsu-l3-cache", "cache";
Pretty sure this is a warning because that compatible doesn't exist.
> + cache-level = <3>;
> + cache-unified;
> +
> + ranges = <0x0 0x30000000 0x800000>;
> + #address-cells = <1>;
> + #size-cells = <1>;
> +
> + msc@10000 {
> + compatible = "arm,mpam-msc";
> +
> + /* CPU affinity implied by parent cache node's */
> + reg = <0x10000 0x2000>;
> + interrupts = <1>, <2>;
> + interrupt-names = "error", "overflow";
> + arm,not-ready-us = <1>;
> + };
> + };
> +
> + mem: memory-controller@20000 {
> + compatible = "foo,a-memory-controller";
> + reg = <0x20000 0x1000>;
> +
> + #address-cells = <1>;
> + #size-cells = <1>;
> + ranges;
> +
> + msc@21000 {
> + compatible = "arm,mpam-memory-controller-msc", "arm,mpam-msc";
> + reg = <0x21000 0x1000>;
> + interrupts = <3>;
> + interrupt-names = "error";
> + arm,not-ready-us = <1>;
> + numa-node-id = <1>;
> + };
> + };
> +
> + iommu@40000 {
> + reg = <0x40000 0x1000>;
> +
> + ranges;
> + #address-cells = <1>;
> + #size-cells = <1>;
> +
> + msc@41000 {
> + compatible = "arm,mpam-msc";
> + reg = <0 0x1000>;
> + interrupts = <5>, <6>;
> + interrupt-names = "error", "overflow";
> + arm,not-ready-us = <1>;
> +
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + ris@2 {
> + compatible = "arm,mpam-cache";
> + reg = <0>;
> + // TODO: How to map to device(s)?
> + };
> + };
> + };
> +
> + msc@80000 {
> + compatible = "foo,a-standalone-msc";
> + reg = <0x80000 0x1000>;
> +
> + clocks = <&clks 123>;
> +
> + ranges;
> + #address-cells = <1>;
> + #size-cells = <1>;
> +
> + msc@10000 {
> + compatible = "arm,mpam-msc";
> +
> + reg = <0x10000 0x2000>;
> + interrupts = <7>;
> + interrupt-names = "overflow";
> + arm,not-ready-us = <1>;
> +
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + ris@0 {
> + compatible = "arm,mpam-cache";
> + reg = <0>;
> + arm,mpam-device = <&L2_0>;
> + };
> +
> + ris@1 {
> + compatible = "arm,mpam-memory";
> + reg = <1>;
> + arm,mpam-device = <&mem>;
> + };
> + };
> + };
> +
> +...
> --
> 2.39.5
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 04/36] cacheinfo: Expose the code to generate a cache-id from a device_node
2025-07-11 18:36 ` [RFC PATCH 04/36] cacheinfo: Expose the code to generate a cache-id from a device_node James Morse
@ 2025-07-14 11:40 ` Ben Horgan
2025-07-25 17:08 ` James Morse
0 siblings, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-14 11:40 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 7/11/25 19:36, James Morse wrote:
> The MPAM driver identifies caches by id for use with resctrl. It
> needs to know the cache-id when probe-ing, but the value isn't set
> in cacheinfo until device_initcall().
>
> Expose the code that generates the cache-id. The parts of the MPAM
> driver that run early can use this to set up the resctrl structures
> before cacheinfo is ready in device_initcall().
>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Changes since v1:
> * Renamed cache_of_get_id() cache_of_calculate_id().
> ---
> drivers/base/cacheinfo.c | 17 ++++++++++++-----
> include/linux/cacheinfo.h | 1 +
> 2 files changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index 613410705a47..0fdd6358ee73 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -207,8 +207,7 @@ static bool match_cache_node(struct device_node *cpu,
> #define arch_compact_of_hwid(_x) (_x)
> #endif
>
> -static void cache_of_set_id(struct cacheinfo *this_leaf,
> - struct device_node *cache_node)
> +unsigned long cache_of_calculate_id(struct device_node *cache_node)
> {
> struct device_node *cpu;
> u32 min_id = ~0;
> @@ -219,15 +218,23 @@ static void cache_of_set_id(struct cacheinfo *this_leaf,
> id = arch_compact_of_hwid(id);
> if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
> of_node_put(cpu);
> - return;
> + return ~0UL;
> }
>
> if (match_cache_node(cpu, cache_node))
> min_id = min(min_id, id);
> }
>
> - if (min_id != ~0) {
> - this_leaf->id = min_id;
> + return min_id;
Looks like some 32bit/64bit confusion. Don't we want to return ~0UL if
min_id == ~0?
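For example, something like this (untested, just to illustrate what I mean):

	if (min_id == ~0)
		return ~0UL;

	return min_id;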
> +}
> +
> +static void cache_of_set_id(struct cacheinfo *this_leaf,
> + struct device_node *cache_node)
> +{
> + unsigned long id = cache_of_calculate_id(cache_node);
> +
> + if (id != ~0UL) {
> + this_leaf->id = id;
> this_leaf->attributes |= CACHE_ID;
> }
> }
> diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
> index c8f4f0a0b874..2dcbb69139e9 100644
> --- a/include/linux/cacheinfo.h
> +++ b/include/linux/cacheinfo.h
> @@ -112,6 +112,7 @@ int acpi_get_cache_info(unsigned int cpu,
> #endif
>
> const struct attribute_group *cache_get_priv_group(struct cacheinfo *this_leaf);
> +unsigned long cache_of_calculate_id(struct device_node *np);
>
> /*
> * Get the cacheinfo structure for the cache associated with @cpu at
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id
2025-07-11 18:36 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id James Morse
@ 2025-07-14 11:42 ` Ben Horgan
2025-08-05 17:06 ` James Morse
2025-07-16 16:21 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id Jonathan Cameron
1 sibling, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-14 11:42 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Sudeep Holla
Hi James,
On 7/11/25 19:36, James Morse wrote:
> The MPAM table identifies caches by id. The MPAM driver also wants to know
> the cache level to determine if the platform is of the shape that can be
> managed via resctrl. Cacheinfo has this information, but only for CPUs that
> are online.
>
> Waiting for all CPUs to come online is a problem for platforms where
> CPUs are brought online late by user-space.
>
> Add a helper that walks every possible cache, until it finds the one
> identified by cache-id, then return the level.
>
> acpi_count_levels() expects its levels parameter to be initialised to
> zero as it passes it to acpi_find_cache_level() as starting_level.
> The existing callers do this. Document it.
This paragraph is stale. You dealt with this in the previous commit.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
> drivers/acpi/pptt.c | 73 ++++++++++++++++++++++++++++++++++++++++++++
> include/linux/acpi.h | 5 +++
> 2 files changed, 78 insertions(+)
>
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index 13ca2eee3b98..f53748a5df19 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -912,3 +912,76 @@ int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
> return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
> ACPI_PPTT_ACPI_IDENTICAL);
> }
> +
> +/**
> + * find_acpi_cache_level_from_id() - Get the level of the specified cache
> + * @cache_id: The id field of the unified cache
> + *
> + * Determine the level relative to any CPU for the unified cache identified by
> + * cache_id. This allows the property to be found even if the CPUs are offline.
> + *
> + * The returned level can be used to group unified caches that are peers.
> + *
> + * The PPTT table must be rev 3 or later,
> + *
> + * If one CPUs L2 is shared with another as L3, this function will return
> + * an unpredictable value.
> + *
> + * Return: -ENOENT if the PPTT doesn't exist, or the cache cannot be found.
> + * Otherwise returns a value which represents the level of the specified cache.
> + */
> +int find_acpi_cache_level_from_id(u32 cache_id)
> +{
> + u32 acpi_cpu_id;
> + acpi_status status;
> + int level, cpu, num_levels;
> + struct acpi_pptt_cache *cache;
> + struct acpi_table_header *table;
> + struct acpi_pptt_cache_v1 *cache_v1;
> + struct acpi_pptt_processor *cpu_node;
> +
> + status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
> + if (ACPI_FAILURE(status)) {
> + acpi_pptt_warn_missing();
> + return -ENOENT;
> + }
> +
> + if (table->revision < 3) {
> + acpi_put_table(table);
> + return -ENOENT;
> + }
> +
> + /*
> + * If we found the cache first, we'd still need to walk from each CPU
> + * to find the level...
> + */
> + for_each_possible_cpu(cpu) {
> + acpi_cpu_id = get_acpi_id_for_cpu(cpu);
> + cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
> + if (!cpu_node)
> + break;
> + acpi_count_levels(table, cpu_node, &num_levels, NULL);
> +
> + /* Start at 1 for L1 */
> + for (level = 1; level <= num_levels; level++) {
> + cache = acpi_find_cache_node(table, acpi_cpu_id,
> + ACPI_PPTT_CACHE_TYPE_UNIFIED,
> + level, &cpu_node);
> + if (!cache)
> + continue;
> +
> + cache_v1 = ACPI_ADD_PTR(struct acpi_pptt_cache_v1,
> + cache,
> + sizeof(struct acpi_pptt_cache));
> +
> + if (cache->flags & ACPI_PPTT_CACHE_ID_VALID &&
> + cache_v1->cache_id == cache_id) {
> + acpi_put_table(table);
> + return level;
> + }
> + }
> + }
> +
> + acpi_put_table(table);
> + return -ENOENT;
> +}
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index 8c3165c2b083..82947f6d2a43 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -1542,6 +1542,7 @@ int find_acpi_cpu_topology_cluster(unsigned int cpu);
> int find_acpi_cpu_topology_package(unsigned int cpu);
> int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
> int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus);
> +int find_acpi_cache_level_from_id(u32 cache_id);
> #else
> static inline int acpi_pptt_cpu_is_thread(unsigned int cpu)
> {
> @@ -1568,6 +1569,10 @@ static inline int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id,
> {
> return -EINVAL;
> }
> +static inline int find_acpi_cache_level_from_id(u32 cache_id)
> +{
> + return -EINVAL;
> +}
> #endif
>
> void acpi_arch_init(void);
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* RE: [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online
2025-07-11 18:36 ` [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online James Morse
@ 2025-07-16 6:49 ` Shaopeng Tan (Fujitsu)
2025-08-08 7:13 ` James Morse
2025-07-28 11:59 ` Ben Horgan
2025-08-04 16:39 ` Fenghua Yu
2 siblings, 1 reply; 117+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2025-07-16 6:49 UTC (permalink / raw)
To: 'James Morse', linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Dave Martin
Hello James,
> When CPUs come online the original configuration should be restored.
> Once the maximum partid is known, allocate a configuration array for each
> component, and reprogram each RIS configuration from this.
>
> The MPAM spec describes how multiple controls can interact. To prevent this
> happening by accident, always reset controls that don't have a valid
> configuration. This allows the same helper to be used for configuration and
> reset.
>
> CC: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 236
> ++++++++++++++++++--
> drivers/platform/arm64/mpam/mpam_internal.h | 26 ++-
> 2 files changed, 234 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c
> b/drivers/platform/arm64/mpam/mpam_devices.c
> index bb3695eb84e9..f3ecfda265d2 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -374,12 +374,16 @@ static void mpam_class_destroy(struct mpam_class
> *class)
> add_to_garbage(class);
> }
>
> +static void __destroy_component_cfg(struct mpam_component *comp);
> +
> static void mpam_comp_destroy(struct mpam_component *comp) {
> struct mpam_class *class = comp->class;
>
> lockdep_assert_held(&mpam_list_lock);
>
> + __destroy_component_cfg(comp);
> +
> list_del_rcu(&comp->class_list);
> add_to_garbage(comp);
>
> @@ -909,51 +913,90 @@ static void mpam_reset_msc_bitmap(struct
> mpam_msc *msc, u16 reg, u16 wd)
> __mpam_write_reg(msc, reg, bm);
> }
>
> -static void mpam_reset_ris_partid(struct mpam_msc_ris *ris, u16 partid)
> +/* Called via IPI. Call while holding an SRCU reference */ static void
> +mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
> + struct mpam_config *cfg)
> {
> u16 bwa_fract = MPAMCFG_MBW_MAX_MAX;
> struct mpam_msc *msc = ris->vmsc->msc;
> struct mpam_props *rprops = &ris->props;
>
> - mpam_assert_srcu_read_lock_held();
> -
> mutex_lock(&msc->part_sel_lock);
> __mpam_part_sel(ris->ris_idx, partid, msc);
>
> - if (mpam_has_feature(mpam_feat_cpor_part, rprops))
> - mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM,
> rprops->cpbm_wd);
> + if (mpam_has_feature(mpam_feat_cpor_part, rprops)) {
> + if (mpam_has_feature(mpam_feat_cpor_part, cfg))
> + mpam_write_partsel_reg(msc, CPBM, cfg->cpbm);
> + else
> + mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM,
> + rprops->cpbm_wd);
> + }
>
> - if (mpam_has_feature(mpam_feat_mbw_part, rprops))
> - mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM,
> rprops->mbw_pbm_bits);
> + if (mpam_has_feature(mpam_feat_mbw_part, rprops)) {
> + if (mpam_has_feature(mpam_feat_mbw_part, cfg))
> + mpam_write_partsel_reg(msc, MBW_PBM,
> cfg->mbw_pbm);
> + else
> + mpam_reset_msc_bitmap(msc,
> MPAMCFG_MBW_PBM,
> + rprops->mbw_pbm_bits);
> + }
>
> if (mpam_has_feature(mpam_feat_mbw_min, rprops))
> mpam_write_partsel_reg(msc, MBW_MIN, 0);
>
> - if (mpam_has_feature(mpam_feat_mbw_max, rprops))
> - mpam_write_partsel_reg(msc, MBW_MAX, bwa_fract);
> + if (mpam_has_feature(mpam_feat_mbw_max, rprops)) {
> + if (mpam_has_feature(mpam_feat_mbw_max, cfg))
> + mpam_write_partsel_reg(msc, MBW_MAX,
> cfg->mbw_max);
> + else
> + mpam_write_partsel_reg(msc, MBW_MAX,
> bwa_fract);
> + }
This writes 0 to MPAMCFG_MBW_MAX.HARDLIM. On chips where HARDLIM is set to 1 by
default, that default will be overwritten.
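For example, something like the following (untested, and assuming the driver has
a MPAMCFG_MBW_MAX_HARDLIM field definition and a mpam_read_partsel_reg() helper)
would preserve whatever the bit is set to out of reset:

	u32 hardlim = mpam_read_partsel_reg(msc, MBW_MAX) &
		      MPAMCFG_MBW_MAX_HARDLIM;

	mpam_write_partsel_reg(msc, MBW_MAX, hardlim | bwa_fract);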
Best regards,
Shaopeng TAN
> if (mpam_has_feature(mpam_feat_mbw_prop, rprops))
> mpam_write_partsel_reg(msc, MBW_PROP, bwa_fract);
> mutex_unlock(&msc->part_sel_lock);
> }
>
> +struct reprogram_ris {
> + struct mpam_msc_ris *ris;
> + struct mpam_config *cfg;
> +};
> +
> +/* Call with MSC lock held */
> +static int mpam_reprogram_ris(void *_arg) {
> + u16 partid, partid_max;
> + struct reprogram_ris *arg = _arg;
> + struct mpam_msc_ris *ris = arg->ris;
> + struct mpam_config *cfg = arg->cfg;
> +
> + if (ris->in_reset_state)
> + return 0;
> +
> + spin_lock(&partid_max_lock);
> + partid_max = mpam_partid_max;
> + spin_unlock(&partid_max_lock);
> + for (partid = 0; partid <= partid_max; partid++)
> + mpam_reprogram_ris_partid(ris, partid, cfg);
> +
> + return 0;
> +}
> +
> /*
> * Called via smp_call_on_cpu() to prevent migration, while still being
> * pre-emptible.
> */
> static int mpam_reset_ris(void *arg)
> {
> - u16 partid, partid_max;
> struct mpam_msc_ris *ris = arg;
> + struct reprogram_ris reprogram_arg;
> + struct mpam_config empty_cfg = { 0 };
>
> if (ris->in_reset_state)
> return 0;
>
> - spin_lock(&partid_max_lock);
> - partid_max = mpam_partid_max;
> - spin_unlock(&partid_max_lock);
> - for (partid = 0; partid < partid_max; partid++)
> - mpam_reset_ris_partid(ris, partid);
> + reprogram_arg.ris = ris;
> + reprogram_arg.cfg = &empty_cfg;
> +
> + mpam_reprogram_ris(&reprogram_arg);
>
> return 0;
> }
> @@ -984,13 +1027,11 @@ static int mpam_touch_msc(struct mpam_msc
> *msc, int (*fn)(void *a), void *arg)
>
> static void mpam_reset_msc(struct mpam_msc *msc, bool online) {
> - int idx;
> struct mpam_msc_ris *ris;
>
> mpam_assert_srcu_read_lock_held();
>
> mpam_mon_sel_outer_lock(msc);
> - idx = srcu_read_lock(&mpam_srcu);
> list_for_each_entry_srcu(ris, &msc->ris, msc_list,
> srcu_read_lock_held(&mpam_srcu)) {
> mpam_touch_msc(msc, &mpam_reset_ris, ris);
>
> @@ -1000,10 +1041,38 @@ static void mpam_reset_msc(struct mpam_msc
> *msc, bool online)
> */
> ris->in_reset_state = online;
> }
> - srcu_read_unlock(&mpam_srcu, idx);
> mpam_mon_sel_outer_unlock(msc);
> }
>
> +static void mpam_reprogram_msc(struct mpam_msc *msc) {
> + int idx;
> + u16 partid;
> + bool reset;
> + struct mpam_config *cfg;
> + struct mpam_msc_ris *ris;
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_rcu(ris, &msc->ris, msc_list) {
> + if (!mpam_is_enabled() && !ris->in_reset_state) {
> + mpam_touch_msc(msc, &mpam_reset_ris, ris);
> + ris->in_reset_state = true;
> + continue;
> + }
> +
> + reset = true;
> + for (partid = 0; partid <= mpam_partid_max; partid++) {
> + cfg = &ris->vmsc->comp->cfg[partid];
> + if (cfg->features)
> + reset = false;
> +
> + mpam_reprogram_ris_partid(ris, partid, cfg);
> + }
> + ris->in_reset_state = reset;
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +}
> +
> static void _enable_percpu_irq(void *_irq) {
> int *irq = _irq;
> @@ -1025,7 +1094,7 @@ static int mpam_cpu_online(unsigned int cpu)
> _enable_percpu_irq(&msc->reenable_error_ppi);
>
> if (atomic_fetch_inc(&msc->online_refs) == 0)
> - mpam_reset_msc(msc, true);
> + mpam_reprogram_msc(msc);
> }
> srcu_read_unlock(&mpam_srcu, idx);
>
> @@ -1806,6 +1875,43 @@ static void mpam_unregister_irqs(void)
> cpus_read_unlock();
> }
>
> +static void __destroy_component_cfg(struct mpam_component *comp) {
> + add_to_garbage(comp->cfg);
> +}
> +
> +static int __allocate_component_cfg(struct mpam_component *comp) {
> + if (comp->cfg)
> + return 0;
> +
> + comp->cfg = kcalloc(mpam_partid_max + 1, sizeof(*comp->cfg),
> GFP_KERNEL);
> + if (!comp->cfg)
> + return -ENOMEM;
> + init_garbage(comp->cfg);
> +
> + return 0;
> +}
> +
> +static int mpam_allocate_config(void)
> +{
> + int err = 0;
> + struct mpam_class *class;
> + struct mpam_component *comp;
> +
> + lockdep_assert_held(&mpam_list_lock);
> +
> + list_for_each_entry(class, &mpam_classes, classes_list) {
> + list_for_each_entry(comp, &class->components, class_list) {
> + err = __allocate_component_cfg(comp);
> + if (err)
> + return err;
> + }
> + }
> +
> + return 0;
> +}
> +
> static void mpam_enable_once(void)
> {
> int err;
> @@ -1817,12 +1923,21 @@ static void mpam_enable_once(void)
> */
> cpus_read_lock();
> mutex_lock(&mpam_list_lock);
> - mpam_enable_merge_features(&mpam_classes);
> + do {
> + mpam_enable_merge_features(&mpam_classes);
>
> - err = mpam_register_irqs();
> - if (err)
> - pr_warn("Failed to register irqs: %d\n", err);
> + err = mpam_allocate_config();
> + if (err) {
> + pr_err("Failed to allocate configuration arrays.\n");
> + break;
> + }
>
> + err = mpam_register_irqs();
> + if (err) {
> + pr_warn("Failed to register irqs: %d\n", err);
> + break;
> + }
> + } while (0);
> mutex_unlock(&mpam_list_lock);
> cpus_read_unlock();
>
> @@ -1861,6 +1976,8 @@ static void mpam_reset_component_locked(struct
> mpam_component *comp)
> might_sleep();
> lockdep_assert_cpus_held();
>
> + memset(comp->cfg, 0, (mpam_partid_max * sizeof(*comp->cfg)));
> +
> idx = srcu_read_lock(&mpam_srcu);
> list_for_each_entry_rcu(vmsc, &comp->vmsc, comp_list) {
> msc = vmsc->msc;
> @@ -1963,6 +2080,79 @@ void mpam_enable(struct work_struct *work)
> mpam_enable_once();
> }
>
> +struct mpam_write_config_arg {
> + struct mpam_msc_ris *ris;
> + struct mpam_component *comp;
> + u16 partid;
> +};
> +
> +static int __write_config(void *arg)
> +{
> + struct mpam_write_config_arg *c = arg;
> +
> + mpam_reprogram_ris_partid(c->ris, c->partid,
> +&c->comp->cfg[c->partid]);
> +
> + return 0;
> +}
> +
> +#define maybe_update_config(cfg, feature, newcfg, member, changes) do { \
> + if (mpam_has_feature(feature, newcfg) && \
> + (newcfg)->member != (cfg)->member) { \
> + (cfg)->member = (newcfg)->member; \
> + cfg->features |= (1 << feature); \
> + \
> + (changes) |= (1 << feature); \
> + } \
> +} while (0)
> +
> +static mpam_features_t mpam_update_config(struct mpam_config *cfg,
> + const struct mpam_config
> *newcfg) {
> + mpam_features_t changes = 0;
> +
> + maybe_update_config(cfg, mpam_feat_cpor_part, newcfg, cpbm,
> changes);
> + maybe_update_config(cfg, mpam_feat_mbw_part, newcfg,
> mbw_pbm, changes);
> + maybe_update_config(cfg, mpam_feat_mbw_max, newcfg, mbw_max,
> changes);
> +
> + return changes;
> +}
> +
> +/* TODO: split into write_config/sync_config */
> +/* TODO: add config_dirty bitmap to drive sync_config */ int
> +mpam_apply_config(struct mpam_component *comp, u16 partid,
> + struct mpam_config *cfg)
> +{
> + struct mpam_write_config_arg arg;
> + struct mpam_msc_ris *ris;
> + struct mpam_vmsc *vmsc;
> + struct mpam_msc *msc;
> + int idx;
> +
> + lockdep_assert_cpus_held();
> +
> + /* Don't pass in the current config! */
> + WARN_ON_ONCE(&comp->cfg[partid] == cfg);
> +
> + if (!mpam_update_config(&comp->cfg[partid], cfg))
> + return 0;
> +
> + arg.comp = comp;
> + arg.partid = partid;
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_rcu(vmsc, &comp->vmsc, comp_list) {
> + msc = vmsc->msc;
> +
> + list_for_each_entry_rcu(ris, &vmsc->ris, vmsc_list) {
> + arg.ris = ris;
> + mpam_touch_msc(msc, __write_config, &arg);
> + }
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +
> + return 0;
> +}
> +
> /*
> * MSC that are hidden under caches are not created as platform devices
> * as there is no cache driver. Caches are also special-cased in diff --git
> a/drivers/platform/arm64/mpam/mpam_internal.h
> b/drivers/platform/arm64/mpam/mpam_internal.h
> index 1a24424b48df..029ec89f56f2 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -190,11 +190,7 @@ struct mpam_props {
> u16 num_mbwu_mon;
> };
>
> -static inline bool mpam_has_feature(enum mpam_device_features feat,
> - struct mpam_props *props)
> -{
> - return (1 << feat) & props->features;
> -}
> +#define mpam_has_feature(_feat, x) ((1 << (_feat)) & (x)->features)
>
> static inline void mpam_set_feature(enum mpam_device_features feat,
> struct mpam_props *props)
> @@ -225,6 +221,17 @@ struct mpam_class {
> struct mpam_garbage garbage;
> };
>
> +struct mpam_config {
> + /* Which configuration values are valid. 0 is used for reset */
> + mpam_features_t features;
> +
> + u32 cpbm;
> + u32 mbw_pbm;
> + u16 mbw_max;
> +
> + struct mpam_garbage garbage;
> +};
> +
> struct mpam_component {
> u32 comp_id;
>
> @@ -233,6 +240,12 @@ struct mpam_component {
>
> cpumask_t affinity;
>
> + /*
> + * Array of configuration values, indexed by partid.
> + * Read from cpuhp callbacks, hold the cpuhp lock when writing.
> + */
> + struct mpam_config *cfg;
> +
> /* member of mpam_class:components */
> struct list_head class_list;
>
> @@ -297,6 +310,9 @@ extern u8 mpam_pmg_max; void mpam_enable(struct
> work_struct *work); void mpam_disable(struct work_struct *work);
>
> +int mpam_apply_config(struct mpam_component *comp, u16 partid,
> + struct mpam_config *cfg);
> +
> int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32
> cache_level,
> cpumask_t *affinity);
>
> --
> 2.39.5
^ permalink raw reply [flat|nested] 117+ messages in thread
* RE: [RFC PATCH 25/36] arm_mpam: Register and enable IRQs
2025-07-11 18:36 ` [RFC PATCH 25/36] arm_mpam: Register and enable IRQs James Morse
@ 2025-07-16 7:31 ` Shaopeng Tan (Fujitsu)
2025-08-08 7:08 ` James Morse
2025-07-17 1:08 ` Shaopeng Tan (Fujitsu)
` (3 subsequent siblings)
4 siblings, 1 reply; 117+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2025-07-16 7:31 UTC (permalink / raw)
To: 'James Morse', linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko
Hello James,
> Register and enable error IRQs. All the MPAM error interrupts indicate a
> software bug, e.g. out of range partid. If the error interrupt is ever signalled,
> attempt to disable MPAM.
>
> Only the irq handler accesses the ESR register, so no locking is needed.
> The work to disable MPAM after an error needs to happen at process context,
> use a threaded interrupt.
>
> There is no support for percpu threaded interrupts, for now schedule the work
> to be done from the irq handler.
>
> Enabling the IRQs in the MSC may involve cross calling to a CPU that can
> access the MSC.
>
> CC: Rohit Mathew <rohit.mathew@arm.com>
> Tested-by: Rohit Mathew <rohit.mathew@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 304
> +++++++++++++++++++-
> drivers/platform/arm64/mpam/mpam_internal.h | 9 +-
> 2 files changed, 307 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c
> b/drivers/platform/arm64/mpam/mpam_devices.c
> index 145535cd4732..af19cc25d16e 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -14,6 +14,9 @@
> #include <linux/device.h>
> #include <linux/errno.h>
> #include <linux/gfp.h>
> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
> +#include <linux/irqdesc.h>
> #include <linux/list.h>
> #include <linux/lockdep.h>
> #include <linux/mutex.h>
> @@ -62,6 +65,12 @@ static DEFINE_SPINLOCK(partid_max_lock);
> */
> static DECLARE_WORK(mpam_enable_work, &mpam_enable);
>
> +/*
> + * All mpam error interrupts indicate a software bug. On receipt,
> +disable the
> + * driver.
> + */
> +static DECLARE_WORK(mpam_broken_work, &mpam_disable);
> +
> /*
> * An MSC is a physical container for controls and monitors, each identified by
> * their RIS index. These share a base-address, interrupts and some MMIO
> @@ -159,6 +168,24 @@ static u64 mpam_msc_read_idr(struct mpam_msc
> *msc)
> return (idr_high << 32) | idr_low;
> }
>
> +static void mpam_msc_zero_esr(struct mpam_msc *msc) {
> + __mpam_write_reg(msc, MPAMF_ESR, 0);
> + if (msc->has_extd_esr)
> + __mpam_write_reg(msc, MPAMF_ESR + 4, 0); }
> +
> +static u64 mpam_msc_read_esr(struct mpam_msc *msc) {
> + u64 esr_high = 0, esr_low;
> +
> + esr_low = __mpam_read_reg(msc, MPAMF_ESR);
> + if (msc->has_extd_esr)
> + esr_high = __mpam_read_reg(msc, MPAMF_ESR + 4);
> +
> + return (esr_high << 32) | esr_low;
> +}
> +
> static void __mpam_part_sel_raw(u32 partsel, struct mpam_msc *msc) {
> lockdep_assert_held(&msc->part_sel_lock);
> @@ -405,12 +432,12 @@ static void mpam_msc_destroy(struct mpam_msc
> *msc)
>
> lockdep_assert_held(&mpam_list_lock);
>
> - list_del_rcu(&msc->glbl_list);
> - platform_set_drvdata(pdev, NULL);
> -
> list_for_each_entry_safe(ris, tmp, &msc->ris, msc_list)
> mpam_ris_destroy(ris);
>
> + list_del_rcu(&msc->glbl_list);
> + platform_set_drvdata(pdev, NULL);
> +
> add_to_garbage(msc);
> msc->garbage.pdev = pdev;
> }
> @@ -828,6 +855,7 @@ static int mpam_msc_hw_probe(struct mpam_msc
> *msc)
> pmg_max = FIELD_GET(MPAMF_IDR_PMG_MAX, idr);
> msc->partid_max = min(msc->partid_max, partid_max);
> msc->pmg_max = min(msc->pmg_max, pmg_max);
> + msc->has_extd_esr =
> FIELD_GET(MPAMF_IDR_HAS_EXT_ESR, idr);
>
> ris = mpam_get_or_create_ris(msc, ris_idx);
> if (IS_ERR(ris))
> @@ -974,6 +1002,13 @@ static void mpam_reset_msc(struct mpam_msc
> *msc, bool online)
> mpam_mon_sel_outer_unlock(msc);
> }
>
> +static void _enable_percpu_irq(void *_irq) {
> + int *irq = _irq;
> +
> + enable_percpu_irq(*irq, IRQ_TYPE_NONE); }
> +
> static int mpam_cpu_online(unsigned int cpu) {
> int idx;
> @@ -984,6 +1019,9 @@ static int mpam_cpu_online(unsigned int cpu)
> if (!cpumask_test_cpu(cpu, &msc->accessibility))
> continue;
>
> + if (msc->reenable_error_ppi)
> + _enable_percpu_irq(&msc->reenable_error_ppi);
> +
> if (atomic_fetch_inc(&msc->online_refs) == 0)
> mpam_reset_msc(msc, true);
> }
> @@ -1032,6 +1070,9 @@ static int mpam_cpu_offline(unsigned int cpu)
> if (!cpumask_test_cpu(cpu, &msc->accessibility))
> continue;
>
> + if (msc->reenable_error_ppi)
> + disable_percpu_irq(msc->reenable_error_ppi);
> +
> if (atomic_dec_and_test(&msc->online_refs))
> mpam_reset_msc(msc, false);
> }
> @@ -1058,6 +1099,51 @@ static void mpam_register_cpuhp_callbacks(int
> (*online)(unsigned int online),
> mutex_unlock(&mpam_cpuhp_state_lock);
> }
>
> +static int __setup_ppi(struct mpam_msc *msc) {
> + int cpu;
> +
> + msc->error_dev_id = alloc_percpu_gfp(struct mpam_msc *,
> GFP_KERNEL);
> + if (!msc->error_dev_id)
> + return -ENOMEM;
> +
> + for_each_cpu(cpu, &msc->accessibility) {
> + struct mpam_msc *empty = *per_cpu_ptr(msc->error_dev_id,
> cpu);
> +
> + if (empty) {
> + pr_err_once("%s shares PPI with %s!\n",
> + dev_name(&msc->pdev->dev),
> + dev_name(&empty->pdev->dev));
> + return -EBUSY;
> + }
> + *per_cpu_ptr(msc->error_dev_id, cpu) = msc;
> + }
> +
> + return 0;
> +}
> +
> +static int mpam_msc_setup_error_irq(struct mpam_msc *msc) {
> + int irq;
> +
> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
> + if (irq <= 0)
> + return 0;
> +
> + /* Allocate and initialise the percpu device pointer for PPI */
> + if (irq_is_percpu(irq))
> + return __setup_ppi(msc);
> +
> + /* sanity check: shared interrupts can be routed anywhere? */
> + if (!cpumask_equal(&msc->accessibility, cpu_possible_mask)) {
> + pr_err_once("msc:%u is a private resource with a shared error
> interrupt",
> + msc->id);
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> static int mpam_dt_count_msc(void)
> {
> int count = 0;
> @@ -1266,6 +1352,10 @@ static int mpam_msc_drv_probe(struct
> platform_device *pdev)
> break;
> }
>
> + err = mpam_msc_setup_error_irq(msc);
> + if (err)
> + break;
> +
> if (device_property_read_u32(&pdev->dev, "pcc-channel",
> &msc->pcc_subspace_id))
> msc->iface = MPAM_IFACE_MMIO;
> @@ -1548,11 +1638,193 @@ static void mpam_enable_merge_features(struct
> list_head *all_classes_list)
> }
> }
>
> +static char *mpam_errcode_names[16] = {
> + [0] = "No error",
> + [1] = "PARTID_SEL_Range",
> + [2] = "Req_PARTID_Range",
> + [3] = "MSMONCFG_ID_RANGE",
> + [4] = "Req_PMG_Range",
> + [5] = "Monitor_Range",
> + [6] = "intPARTID_Range",
> + [7] = "Unexpected_INTERNAL",
> + [8] = "Undefined_RIS_PART_SEL",
> + [9] = "RIS_No_Control",
> + [10] = "Undefined_RIS_MON_SEL",
> + [11] = "RIS_No_Monitor",
> + [12 ... 15] = "Reserved"
> +};
> +
> +static int mpam_enable_msc_ecr(void *_msc) {
> + struct mpam_msc *msc = _msc;
> +
> + __mpam_write_reg(msc, MPAMF_ECR, 1);
> +
> + return 0;
> +}
> +
> +static int mpam_disable_msc_ecr(void *_msc) {
> + struct mpam_msc *msc = _msc;
> +
> + __mpam_write_reg(msc, MPAMF_ECR, 0);
> +
> + return 0;
> +}
> +
> +static irqreturn_t __mpam_irq_handler(int irq, struct mpam_msc *msc) {
> + u64 reg;
> + u16 partid;
> + u8 errcode, pmg, ris;
> +
> + if (WARN_ON_ONCE(!msc) ||
> + WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(),
> + &msc->accessibility)))
> + return IRQ_NONE;
> +
> + reg = mpam_msc_read_esr(msc);
> +
> + errcode = FIELD_GET(MPAMF_ESR_ERRCODE, reg);
> + if (!errcode)
> + return IRQ_NONE;
> +
> + /* Clear level triggered irq */
> + mpam_msc_zero_esr(msc);
> +
> + partid = FIELD_GET(MPAMF_ESR_PARTID_OR_MON, reg);
> + pmg = FIELD_GET(MPAMF_ESR_PMG, reg);
> + ris = FIELD_GET(MPAMF_ESR_PMG, reg);
MPAMF_ESR_RIS?
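i.e. presumably this was meant to read:

	ris = FIELD_GET(MPAMF_ESR_RIS, reg);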
Best regards,
Shaopeng TAN
> + pr_err("error irq from msc:%u '%s', partid:%u, pmg: %u, ris: %u\n",
> + msc->id, mpam_errcode_names[errcode], partid, pmg, ris);
> +
> + if (irq_is_percpu(irq)) {
> + mpam_disable_msc_ecr(msc);
> + schedule_work(&mpam_broken_work);
> + return IRQ_HANDLED;
> + }
> +
> + return IRQ_WAKE_THREAD;
> +}
> +
> +static irqreturn_t mpam_ppi_handler(int irq, void *dev_id) {
> + struct mpam_msc *msc = *(struct mpam_msc **)dev_id;
> +
> + return __mpam_irq_handler(irq, msc);
> +}
> +
> +static irqreturn_t mpam_spi_handler(int irq, void *dev_id) {
> + struct mpam_msc *msc = dev_id;
> +
> + return __mpam_irq_handler(irq, msc);
> +}
> +
> +static irqreturn_t mpam_disable_thread(int irq, void *dev_id);
> +
> +static int mpam_register_irqs(void)
> +{
> + int err, irq, idx;
> + struct mpam_msc *msc;
> +
> + lockdep_assert_cpus_held();
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list,
> srcu_read_lock_held(&mpam_srcu)) {
> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
> + if (irq <= 0)
> + continue;
> +
> + /* The MPAM spec says the interrupt can be SPI, PPI or LPI */
> + /* We anticipate sharing the interrupt with other MSCs */
> + if (irq_is_percpu(irq)) {
> + err = request_percpu_irq(irq, &mpam_ppi_handler,
> + "mpam:msc:error",
> + msc->error_dev_id);
> + if (err)
> + return err;
> +
> + msc->reenable_error_ppi = irq;
> + smp_call_function_many(&msc->accessibility,
> + &_enable_percpu_irq, &irq,
> + true);
> + } else {
> + err =
> devm_request_threaded_irq(&msc->pdev->dev, irq,
> +
> &mpam_spi_handler,
> +
> &mpam_disable_thread,
> + IRQF_SHARED,
> + "mpam:msc:error",
> msc);
> + if (err)
> + return err;
> + }
> +
> + msc->error_irq_requested = true;
> + mpam_touch_msc(msc, mpam_enable_msc_ecr, msc);
> + msc->error_irq_hw_enabled = true;
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +
> + return 0;
> +}
> +
> +static void mpam_unregister_irqs(void)
> +{
> + int irq, idx;
> + struct mpam_msc *msc;
> +
> + cpus_read_lock();
> + /* take the lock as free_irq() can sleep */
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list,
> srcu_read_lock_held(&mpam_srcu)) {
> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
> + if (irq <= 0)
> + continue;
> +
> + if (msc->error_irq_hw_enabled) {
> + mpam_touch_msc(msc, mpam_disable_msc_ecr,
> msc);
> + msc->error_irq_hw_enabled = false;
> + }
> +
> + if (msc->error_irq_requested) {
> + if (irq_is_percpu(irq)) {
> + msc->reenable_error_ppi = 0;
> + free_percpu_irq(irq, msc->error_dev_id);
> + } else {
> + devm_free_irq(&msc->pdev->dev, irq, msc);
> + }
> + msc->error_irq_requested = false;
> + }
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> + cpus_read_unlock();
> +}
> +
> static void mpam_enable_once(void)
> {
> + int err;
> +
> + /*
> + * If all the MSC have been probed, enabling the IRQs happens next.
> + * That involves cross-calling to a CPU that can reach the MSC, and
> + * the locks must be taken in this order:
> + */
> + cpus_read_lock();
> mutex_lock(&mpam_list_lock);
> mpam_enable_merge_features(&mpam_classes);
> +
> + err = mpam_register_irqs();
> + if (err)
> + pr_warn("Failed to register irqs: %d\n", err);
> +
> mutex_unlock(&mpam_list_lock);
> + cpus_read_unlock();
> +
> + if (err) {
> + schedule_work(&mpam_broken_work);
> + return;
> + }
>
> mutex_lock(&mpam_cpuhp_state_lock);
> cpuhp_remove_state(mpam_cpuhp_state);
> @@ -1621,16 +1893,39 @@ static void mpam_reset_class(struct mpam_class
> *class)
> * All of MPAMs errors indicate a software bug, restore any modified
> * controls to their reset values.
> */
> -void mpam_disable(void)
> +static irqreturn_t mpam_disable_thread(int irq, void *dev_id)
> {
> int idx;
> struct mpam_class *class;
> + struct mpam_msc *msc, *tmp;
> +
> + mutex_lock(&mpam_cpuhp_state_lock);
> + if (mpam_cpuhp_state) {
> + cpuhp_remove_state(mpam_cpuhp_state);
> + mpam_cpuhp_state = 0;
> + }
> + mutex_unlock(&mpam_cpuhp_state_lock);
> +
> + mpam_unregister_irqs();
>
> idx = srcu_read_lock(&mpam_srcu);
> list_for_each_entry_srcu(class, &mpam_classes, classes_list,
> srcu_read_lock_held(&mpam_srcu))
> mpam_reset_class(class);
> srcu_read_unlock(&mpam_srcu, idx);
> +
> + mutex_lock(&mpam_list_lock);
> + list_for_each_entry_safe(msc, tmp, &mpam_all_msc, glbl_list)
> + mpam_msc_destroy(msc);
> + mutex_unlock(&mpam_list_lock);
> + mpam_free_garbage();
> +
> + return IRQ_HANDLED;
> +}
> +
> +void mpam_disable(struct work_struct *ignored) {
> + mpam_disable_thread(0, NULL);
> }
>
> /*
> @@ -1644,7 +1939,6 @@ void mpam_enable(struct work_struct *work)
> struct mpam_msc *msc;
> bool all_devices_probed = true;
>
> - /* Have we probed all the hw devices? */
> mutex_lock(&mpam_list_lock);
> list_for_each_entry(msc, &mpam_all_msc, glbl_list) {
> mutex_lock(&msc->probe_lock);
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h
> b/drivers/platform/arm64/mpam/mpam_internal.h
> index de05eece0a31..e1c6a2676b54 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -44,6 +44,11 @@ struct mpam_msc {
> struct pcc_mbox_chan *pcc_chan;
> u32 nrdy_usec;
> cpumask_t accessibility;
> + bool has_extd_esr;
> +
> + int reenable_error_ppi;
> + struct mpam_msc * __percpu *error_dev_id;
> +
> atomic_t online_refs;
>
> /*
> @@ -52,6 +57,8 @@ struct mpam_msc {
> */
> struct mutex probe_lock;
> bool probed;
> + bool error_irq_requested;
> + bool error_irq_hw_enabled;
> u16 partid_max;
> u8 pmg_max;
> unsigned long ris_idxs[128 / BITS_PER_LONG];
> @@ -280,7 +287,7 @@ extern u8 mpam_pmg_max;
>
> /* Scheduled work callback to enable mpam once all MSC have been probed
> */ void mpam_enable(struct work_struct *work); -void mpam_disable(void);
> +void mpam_disable(struct work_struct *work);
>
> int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32
> cache_level,
> cpumask_t *affinity);
> --
> 2.39.5
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 06/36] ACPI / PPTT: Stop acpi_count_levels() expecting callers to clear levels
2025-07-11 18:36 ` [RFC PATCH 06/36] ACPI / PPTT: Stop acpi_count_levels() expecting callers to clear levels James Morse
@ 2025-07-16 15:51 ` Jonathan Cameron
2025-07-25 17:05 ` James Morse
0 siblings, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-07-16 15:51 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Sudeep Holla
On Fri, 11 Jul 2025 18:36:18 +0000
James Morse <james.morse@arm.com> wrote:
> acpi_count_levels() passes the number of levels back via a pointer argument.
> It also passes this to acpi_find_cache_level() as the starting_level, and
> preserves this value as it walks up the cpu_node tree counting the levels.
>
> The only caller acpi_get_cache_info() happens to have already initialised
> levels to zero, which acpi_count_levels() depends on to get the correct
> result.
>
> Explicitly zero the levels variable, so the count always starts at zero.
> This saves any additional callers having to work out they need to do this.
Hi James,
This is all a bit fiddly as we now end up with that initialized in various
different places. Perhaps simpler to have acpi_count_levels() return the
number of levels rather than void. Then return number of levels rather
than 0 on success from acpi_get_cache_info(). Negative error codes used
for failure just like now.
That would leave only a local variable in acpi_count_levels being
initialized to 0 and passed to acpi_find_cache_level() before being
returned when the loop terminates.
I think that sequence then makes it such that we can't fail to
initialize it without the compiler noticing and screaming.
Requires a few changes from if (ret) to if (ret < 0) at callers
of acpi_get_cache_info() but looks simple (says the person who
hasn't actually coded it!)
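Roughly this shape, perhaps (still untested!):

	static int acpi_count_levels(struct acpi_table_header *table_hdr,
				     struct acpi_pptt_processor *cpu_node,
				     unsigned int *split_levels)
	{
		unsigned int levels = 0;

		do {
			acpi_find_cache_level(table_hdr, cpu_node, &levels,
					      split_levels, 0, 0);
			cpu_node = fetch_pptt_node(table_hdr, cpu_node->parent);
		} while (cpu_node);

		return levels;
	}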
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
> drivers/acpi/pptt.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index 13619b1b821b..13ca2eee3b98 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -183,7 +183,7 @@ acpi_find_cache_level(struct acpi_table_header *table_hdr,
> * @cpu_node: processor node we wish to count caches for
> * @levels: Number of levels if success.
> * @split_levels: Number of split cache levels (data/instruction) if
> - * success. Can by NULL.
> + * success. Can be NULL.
Grumpy reviewer hat. Unrelated cleanup - good to have but not in this patch where
it's a distraction.
> *
> * Given a processor node containing a processing unit, walk into it and count
> * how many levels exist solely for it, and then walk up each level until we hit
> @@ -196,6 +196,8 @@ static void acpi_count_levels(struct acpi_table_header *table_hdr,
> struct acpi_pptt_processor *cpu_node,
> unsigned int *levels, unsigned int *split_levels)
> {
> + *levels = 0;
> +
> do {
> acpi_find_cache_level(table_hdr, cpu_node, levels, split_levels, 0, 0);
> cpu_node = fetch_pptt_node(table_hdr, cpu_node->parent);
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id
2025-07-11 18:36 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id James Morse
2025-07-14 11:42 ` Ben Horgan
@ 2025-07-16 16:21 ` Jonathan Cameron
2025-08-05 17:06 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id James Morse
1 sibling, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-07-16 16:21 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Sudeep Holla
On Fri, 11 Jul 2025 18:36:19 +0000
James Morse <james.morse@arm.com> wrote:
> The MPAM table identifies caches by id. The MPAM driver also wants to know
> the cache level to determine if the platform is of the shape that can be
> managed via resctrl. Cacheinfo has this information, but only for CPUs that
> are online.
>
> Waiting for all CPUs to come online is a problem for platforms where
> CPUs are brought online late by user-space.
>
> Add a helper that walks every possible cache, until it finds the one
> identified by cache-id, then return the level.
>
> acpi_count_levels() expects its levels parameter to be initialised to
> zero as it passes it to acpi_find_cache_level() as starting_level.
> The existing callers do this. Document it.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
A few suggestions inline. Mostly driven by the number of missing table
puts I've seen in ACPI code. You don't have any missing here but with a
bit of restructuring you can make that easy to see.
> ---
> drivers/acpi/pptt.c | 73 ++++++++++++++++++++++++++++++++++++++++++++
> include/linux/acpi.h | 5 +++
> 2 files changed, 78 insertions(+)
>
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index 13ca2eee3b98..f53748a5df19 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -912,3 +912,76 @@ int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
> return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
> ACPI_PPTT_ACPI_IDENTICAL);
> }
> +
> +/**
> + * find_acpi_cache_level_from_id() - Get the level of the specified cache
> + * @cache_id: The id field of the unified cache
> + *
> + * Determine the level relative to any CPU for the unified cache identified by
> + * cache_id. This allows the property to be found even if the CPUs are offline.
> + *
> + * The returned level can be used to group unified caches that are peers.
> + *
> + * The PPTT table must be rev 3 or later,
> + *
> + * If one CPUs L2 is shared with another as L3, this function will return
> + * an unpredictable value.
> + *
> + * Return: -ENOENT if the PPTT doesn't exist, or the cache cannot be found.
> + * Otherwise returns a value which represents the level of the specified cache.
> + */
> +int find_acpi_cache_level_from_id(u32 cache_id)
> +{
> + u32 acpi_cpu_id;
> + acpi_status status;
> + int level, cpu, num_levels;
> + struct acpi_pptt_cache *cache;
> + struct acpi_table_header *table;
> + struct acpi_pptt_cache_v1 *cache_v1;
> + struct acpi_pptt_processor *cpu_node;
> +
> + status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
> + if (ACPI_FAILURE(status)) {
> + acpi_pptt_warn_missing();
> + return -ENOENT;
> + }
> +
> + if (table->revision < 3) {
Maybe a unified exit path given all paths need to do
acpi_put_table() and return either error or level.
Or maybe it's time for some cleanup.h magic for acpi tables. I've
been thinking about it for a while and mostly stuck on the name ;)
(simpler suggestion follows)
static struct acpi_table_header *acpi_get_table_ret(char *signature, u32 instance)
{
	struct acpi_table_header *table;
	int status = acpi_get_table(signature, instance, &table);

	if (ACPI_FAILURE(status))
		return ERR_PTR(-ENOENT);

	return table;
}

DEFINE_FREE(acpi_table, struct acpi_table_header *, if (!IS_ERR(_T)) acpi_put_table(_T))
Finally, in here and loads of other places we avoid the chance of missing an acpi_put_table
and generally simplify the code a little.
int find_acpi_cache_level_from_id(u32 cache_id)
{
	u32 acpi_cpu_id;
	int level, cpu, num_levels;
	struct acpi_pptt_cache *cache;
	struct acpi_pptt_cache_v1 *cache_v1;
	struct acpi_pptt_processor *cpu_node;
	struct acpi_table_header *table __free(acpi_table) =
		acpi_get_table_ret(ACPI_SIG_PPTT, 0);

	if (IS_ERR(table))
		return PTR_ERR(table);

	if (table->revision < 3)
		return -ENOENT;

	/*
	 * If we found the cache first, we'd still need to walk from each CPU
	 * to find the level...
	 */
	for_each_possible_cpu(cpu) {
		acpi_cpu_id = get_acpi_id_for_cpu(cpu);
		cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
		if (!cpu_node)
			return -ENOENT;
		acpi_count_levels(table, cpu_node, &num_levels, NULL);

		/* Start at 1 for L1 */
		for (level = 1; level <= num_levels; level++) {
			cache = acpi_find_cache_node(table, acpi_cpu_id,
						     ACPI_PPTT_CACHE_TYPE_UNIFIED,
						     level, &cpu_node);
			if (!cache)
				continue;

			cache_v1 = ACPI_ADD_PTR(struct acpi_pptt_cache_v1,
						cache,
						sizeof(struct acpi_pptt_cache));

			if (cache->flags & ACPI_PPTT_CACHE_ID_VALID &&
			    cache_v1->cache_id == cache_id)
				return level;
		}
	}

	return -ENOENT;
}
A less 'fun' alternative is to pull some code out as a helper so that the get and
put sit near each other with no conditionals to confuse things.
static int __find_acpi_cache_level_from_id(u32 cache_id,
					   struct acpi_table_header *table)
{
	u32 acpi_cpu_id;
	int level, cpu, num_levels;
	struct acpi_pptt_cache *cache;
	struct acpi_pptt_cache_v1 *cache_v1;
	struct acpi_pptt_processor *cpu_node;

	if (table->revision < 3)
		return -ENOENT;

	/*
	 * If we found the cache first, we'd still need to walk from each CPU
	 * to find the level...
	 */
	for_each_possible_cpu(cpu) {
		acpi_cpu_id = get_acpi_id_for_cpu(cpu);
		cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
		if (!cpu_node)
			return -ENOENT;
		acpi_count_levels(table, cpu_node, &num_levels, NULL);

		/* Start at 1 for L1 */
		for (level = 1; level <= num_levels; level++) {
			cache = acpi_find_cache_node(table, acpi_cpu_id,
						     ACPI_PPTT_CACHE_TYPE_UNIFIED,
						     level, &cpu_node);
			if (!cache)
				continue;

			cache_v1 = ACPI_ADD_PTR(struct acpi_pptt_cache_v1,
						cache,
						sizeof(struct acpi_pptt_cache));

			if (cache->flags & ACPI_PPTT_CACHE_ID_VALID &&
			    cache_v1->cache_id == cache_id)
				return level;
		}
	}

	return -ENOENT;
}

int find_acpi_cache_level_from_id(u32 cache_id)
{
	int ret;
	acpi_status status;
	struct acpi_table_header *table;

	status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
	if (ACPI_FAILURE(status)) {
		acpi_pptt_warn_missing();
		return -ENOENT;
	}

	ret = __find_acpi_cache_level_from_id(cache_id, table);
	acpi_put_table(table);

	return ret;
}
> + acpi_put_table(table);
> + return -ENOENT;
> + }
> +
> + /*
> + * If we found the cache first, we'd still need to walk from each CPU
> + * to find the level...
> + */
> + for_each_possible_cpu(cpu) {
> + acpi_cpu_id = get_acpi_id_for_cpu(cpu);
> + cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
> + if (!cpu_node)
> + break;
> + acpi_count_levels(table, cpu_node, &num_levels, NULL);
> +
> + /* Start at 1 for L1 */
> + for (level = 1; level <= num_levels; level++) {
> + cache = acpi_find_cache_node(table, acpi_cpu_id,
> + ACPI_PPTT_CACHE_TYPE_UNIFIED,
> + level, &cpu_node);
> + if (!cache)
> + continue;
> +
> + cache_v1 = ACPI_ADD_PTR(struct acpi_pptt_cache_v1,
> + cache,
> + sizeof(struct acpi_pptt_cache));
> +
> + if (cache->flags & ACPI_PPTT_CACHE_ID_VALID &&
> + cache_v1->cache_id == cache_id) {
> + acpi_put_table(table);
> + return level;
> + }
> + }
> + }
> +
> + acpi_put_table(table);
> + return -ENOENT;
> +}
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index 8c3165c2b083..82947f6d2a43 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -1542,6 +1542,7 @@ int find_acpi_cpu_topology_cluster(unsigned int cpu);
> int find_acpi_cpu_topology_package(unsigned int cpu);
> int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
> int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus);
> +int find_acpi_cache_level_from_id(u32 cache_id);
> #else
> static inline int acpi_pptt_cpu_is_thread(unsigned int cpu)
> {
> @@ -1568,6 +1569,10 @@ static inline int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id,
> {
> return -EINVAL;
> }
> +static inline int find_acpi_cache_level_from_id(u32 cache_id)
> +{
> + return -EINVAL;
> +}
> #endif
>
> void acpi_arch_init(void);
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 08/36] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id
2025-07-11 18:36 ` [RFC PATCH 08/36] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id James Morse
@ 2025-07-16 16:24 ` Jonathan Cameron
2025-08-05 17:06 ` James Morse
0 siblings, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-07-16 16:24 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Sudeep Holla
On Fri, 11 Jul 2025 18:36:20 +0000
James Morse <james.morse@arm.com> wrote:
> MPAM identifies CPUs by the cache_id in the PPTT cache structure.
>
> The driver needs to know which CPUs are associated with the cache,
> the CPUs may not all be online, so cacheinfo does not have the
> information.
>
> Add a helper to pull this information out of the PPTT.
>
> CC: Rohit Mathew <Rohit.Mathew@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
> ---
> drivers/acpi/pptt.c | 70 ++++++++++++++++++++++++++++++++++++++++++++
> include/linux/acpi.h | 6 ++++
> 2 files changed, 76 insertions(+)
>
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index f53748a5df19..81f7ac18c023 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -985,3 +985,73 @@ int find_acpi_cache_level_from_id(u32 cache_id)
> acpi_put_table(table);
> return -ENOENT;
> }
> +
> +/**
> + * acpi_pptt_get_cpumask_from_cache_id() - Get the cpus associated with the
> + * specified cache
> + * @cache_id: The id field of the unified cache
> + * @cpus: Where to build the cpumask
> + *
> + * Determine which CPUs are below this cache in the PPTT. This allows the property
> + * to be found even if the CPUs are offline.
> + *
> + * The PPTT table must be rev 3 or later,
> + *
> + * Return: -ENOENT if the PPTT doesn't exist, or the cache cannot be found.
> + * Otherwise returns 0 and sets the cpus in the provided cpumask.
> + */
> +int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id, cpumask_t *cpus)
> +{
> + u32 acpi_cpu_id;
> + acpi_status status;
> + int level, cpu, num_levels;
> + struct acpi_pptt_cache *cache;
> + struct acpi_table_header *table;
> + struct acpi_pptt_cache_v1 *cache_v1;
> + struct acpi_pptt_processor *cpu_node;
> +
> + cpumask_clear(cpus);
> +
> + status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
Similar suggestions to the previous patch apply here as well.
> + if (ACPI_FAILURE(status)) {
> + acpi_pptt_warn_missing();
> + return -ENOENT;
> + }
> +
> + if (table->revision < 3) {
> + acpi_put_table(table);
> + return -ENOENT;
> + }
> +
> + /*
> + * If we found the cache first, we'd still need to walk from each cpu.
> + */
> + for_each_possible_cpu(cpu) {
> + acpi_cpu_id = get_acpi_id_for_cpu(cpu);
> + cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
> + if (!cpu_node)
> + break;
> + acpi_count_levels(table, cpu_node, &num_levels, NULL);
> +
> + /* Start at 1 for L1 */
> + for (level = 1; level <= num_levels; level++) {
> + cache = acpi_find_cache_node(table, acpi_cpu_id,
> + ACPI_PPTT_CACHE_TYPE_UNIFIED,
> + level, &cpu_node);
> + if (!cache)
> + continue;
> +
> + cache_v1 = ACPI_ADD_PTR(struct acpi_pptt_cache_v1,
> + cache,
> + sizeof(struct acpi_pptt_cache));
> +
> + if (cache->flags & ACPI_PPTT_CACHE_ID_VALID &&
> + cache_v1->cache_id == cache_id) {
> + cpumask_set_cpu(cpu, cpus);
> + }
Unnecessary {}. Fine to keep them if you add something else here later.
> + }
> + }
> +
> + acpi_put_table(table);
> + return 0;
> +}
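As an aside, a minimal sketch of how a caller might use the new helper; only
acpi_pptt_get_cpumask_from_cache_id() comes from this patch, the wrapper and
its name are made up for illustration:

#include <linux/acpi.h>
#include <linux/cpumask.h>
#include <linux/printk.h>

/* Hypothetical caller: which CPUs sit below the cache named by @cache_id? */
static int example_msc_affinity(u32 cache_id, cpumask_t *affinity)
{
        int err;

        err = acpi_pptt_get_cpumask_from_cache_id(cache_id, affinity);
        if (err)
                pr_warn("No PPTT description for cache-id 0x%x\n", cache_id);

        return err;
}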
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index 82947f6d2a43..61ac3d1de1e8 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -1543,6 +1543,7 @@ int find_acpi_cpu_topology_package(unsigned int cpu);
> int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
> int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus);
> int find_acpi_cache_level_from_id(u32 cache_id);
> +int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id, cpumask_t *cpus);
> #else
> static inline int acpi_pptt_cpu_is_thread(unsigned int cpu)
> {
> @@ -1573,6 +1574,11 @@ static inline int find_acpi_cache_level_from_id(u32 cache_id)
> {
> return -EINVAL;
> }
> +static inline int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id,
> + cpumask_t *cpus)
> +{
> + return -EINVAL;
> +}
> #endif
>
> void acpi_arch_init(void);
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 09/36] arm64: kconfig: Add Kconfig entry for MPAM
2025-07-11 18:36 ` [RFC PATCH 09/36] arm64: kconfig: Add Kconfig entry for MPAM James Morse
@ 2025-07-16 16:26 ` Jonathan Cameron
0 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2025-07-16 16:26 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
On Fri, 11 Jul 2025 18:36:21 +0000
James Morse <james.morse@arm.com> wrote:
> The bulk of the MPAM driver lives outside the arch code because it
> largely manages MMIO devices that generate interrupts. The driver
> needs a Kconfig symbol to enable it; as MPAM is only found on arm64
> platforms, that is where the Kconfig option makes the most sense.
>
> This Kconfig option will later be used by the arch code to enable
> or disable the MPAM context-switch code, and to register the CPUs'
> properties with the MPAM driver.
>
> Signed-off-by: James Morse <james.morse@arm.com>
Seems like a reasonable help text, so FWIW
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> ---
> arch/arm64/Kconfig | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 55fc331af337..5f08214537d0 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2058,6 +2058,23 @@ config ARM64_TLB_RANGE
> ARMv8.4-TLBI provides TLBI invalidation instruction that apply to a
> range of input addresses.
>
> +config ARM64_MPAM
> + bool "Enable support for MPAM"
> + help
> + Memory Partitioning and Monitoring is an optional extension
> + that allows the CPUs to mark load and store transactions with
> + labels for partition-id and performance-monitoring-group.
> + System components, such as the caches, can use the partition-id
> + to apply a performance policy. MPAM monitors can use the
> + partition-id and performance-monitoring-group to measure the
> + cache occupancy or data throughput.
> +
> + Use of this extension requires CPU support, support in the
> + memory system components (MSC), and a description from firmware
> + of where the MSC are in the address space.
> +
> + MPAM is exposed to user-space via the resctrl pseudo filesystem.
> +
> endmenu # "ARMv8.4 architectural features"
>
> menu "ARMv8.5 architectural features"
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table
2025-07-11 18:36 ` [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table James Morse
@ 2025-07-16 17:07 ` Jonathan Cameron
2025-07-23 16:39 ` Ben Horgan
` (2 more replies)
2025-07-24 10:50 ` Ben Horgan
1 sibling, 3 replies; 117+ messages in thread
From: Jonathan Cameron @ 2025-07-16 17:07 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
On Fri, 11 Jul 2025 18:36:22 +0000
James Morse <james.morse@arm.com> wrote:
> Add code to parse the arm64 specific MPAM table, looking up the cache
> level from the PPTT and feeding the end result into the MPAM driver.
Throw in a link to the spec perhaps? Particularly useful to know which
version this was written against when reviewing it.
>
> CC: Carl Worth <carl@os.amperecomputing.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> arch/arm64/Kconfig | 1 +
> drivers/acpi/arm64/Kconfig | 3 +
> drivers/acpi/arm64/Makefile | 1 +
> drivers/acpi/arm64/mpam.c | 365 ++++++++++++++++++++++++++++++++++++
> drivers/acpi/tables.c | 2 +-
> include/linux/arm_mpam.h | 46 +++++
> 6 files changed, 417 insertions(+), 1 deletion(-)
> create mode 100644 drivers/acpi/arm64/mpam.c
> create mode 100644 include/linux/arm_mpam.h
>
> diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
> index 05ecde9eaabe..27b872249baa 100644
> --- a/drivers/acpi/arm64/Makefile
> +++ b/drivers/acpi/arm64/Makefile
> @@ -6,5 +6,6 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
> obj-$(CONFIG_ACPI_IORT) += iort.o
> obj-$(CONFIG_ACPI_PROCESSOR_IDLE) += cpuidle.o
> obj-$(CONFIG_ARM_AMBA) += amba.o
> +obj-$(CONFIG_ACPI_MPAM) += mpam.o
Keep it with the ACPI ones? There doesn't seem to be much order in here
though, so maybe there is logic behind putting it here that I'm missing.
> obj-y += dma.o init.o
> obj-y += thermal_cpufreq.o
> diff --git a/drivers/acpi/arm64/mpam.c b/drivers/acpi/arm64/mpam.c
> new file mode 100644
> index 000000000000..f4791bac9a2a
> --- /dev/null
> +++ b/drivers/acpi/arm64/mpam.c
> @@ -0,0 +1,365 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (C) 2025 Arm Ltd.
> +
> +/* Parse the MPAM ACPI table feeding the discovered nodes into the driver */
> +
> +#define pr_fmt(fmt) "ACPI MPAM: " fmt
> +
> +#include <linux/acpi.h>
> +#include <linux/arm_mpam.h>
> +#include <linux/cpu.h>
> +#include <linux/cpumask.h>
> +#include <linux/platform_device.h>
> +
> +#include <acpi/processor.h>
> +
> +/* Flags for acpi_table_mpam_msc.*_interrupt_flags */
References.. I'm looking at 3.0-alpha table 5 to check this.
I can see why you might be reluctant to point at an alpha if that
is what you are using ;)
> +#define ACPI_MPAM_MSC_IRQ_MODE_EDGE 1
> +#define ACPI_MPAM_MSC_IRQ_TYPE_MASK (3 << 1)
GENMASK(2, 1) would be my preference for how to do masks in new code.
> +#define ACPI_MPAM_MSC_IRQ_TYPE_WIRED 0
> +#define ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER BIT(3)
> +#define ACPI_MPAM_MSC_IRQ_AFFINITY_VALID BIT(4)
> +
> +static bool frob_irq(struct platform_device *pdev, int intid, u32 flags,
> + int *irq, u32 processor_container_uid)
> +{
> + int sense;
> +
> + if (!intid)
> + return false;
> +
> + /* 0 in this field indicates a wired interrupt */
> + if (flags & ACPI_MPAM_MSC_IRQ_TYPE_MASK)
I'd prefer more explicit code (and probably no comment)
if (FIELD_GET(ACPI_MPAM_MSC_IRQ_TYPE_MASK, flags) !=
    ACPI_MPAM_MSC_IRQ_TYPE_WIRED)
return false;
> + return false;
> +
> + if (flags & ACPI_MPAM_MSC_IRQ_MODE_EDGE)
> + sense = ACPI_EDGE_SENSITIVE;
> + else
> + sense = ACPI_LEVEL_SENSITIVE;
If the spec is supposed to be using standard ACPI_* types for this field
(I don't think the connection is explicitly documented though) then
	sense = FIELD_GET(ACPI_MPAM_MSC_IRQ_MODE_MASK, flags);
assuming a change to define the mask and rely on the ACPI defs for the values.
This one is entirely up to you.
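To make that concrete, a sketch of how the decode could look; the mask
definitions are assumptions following the GENMASK()/BIT() suggestions above
(not what the patch currently defines), and note FIELD_GET() takes the mask
as its first argument:

#include <linux/acpi.h>
#include <linux/bitfield.h>
#include <linux/bits.h>

/* Assumed redefinitions of the flag fields, per the comments above */
#define ACPI_MPAM_MSC_IRQ_MODE_MASK     BIT(0)
#define ACPI_MPAM_MSC_IRQ_TYPE_MASK     GENMASK(2, 1)
#define ACPI_MPAM_MSC_IRQ_TYPE_WIRED    0

static bool example_decode_irq_flags(u32 flags, int *sense)
{
        /* Only wired interrupts are handled here */
        if (FIELD_GET(ACPI_MPAM_MSC_IRQ_TYPE_MASK, flags) !=
            ACPI_MPAM_MSC_IRQ_TYPE_WIRED)
                return false;

        *sense = FIELD_GET(ACPI_MPAM_MSC_IRQ_MODE_MASK, flags) ?
                 ACPI_EDGE_SENSITIVE : ACPI_LEVEL_SENSITIVE;

        return true;
}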
> +
> + /*
> + * If the GSI is in the GIC's PPI range, try and create a partitioned
> + * percpu interrupt.
> + */
> + if (16 <= intid && intid < 32 && processor_container_uid != ~0) {
> + pr_err_once("Partitioned interrupts not supported\n");
> + return false;
> + }
> +
> + *irq = acpi_register_gsi(&pdev->dev, intid, sense, ACPI_ACTIVE_HIGH);
> + if (*irq <= 0) {
> + pr_err_once("Failed to register interrupt 0x%x with ACPI\n",
> + intid);
> + return false;
> + }
> +
> + return true;
> +}
> +
> +static void acpi_mpam_parse_irqs(struct platform_device *pdev,
> + struct acpi_mpam_msc_node *tbl_msc,
> + struct resource *res, int *res_idx)
> +{
> + u32 flags, aff = ~0;
> + int irq;
> +
> + flags = tbl_msc->overflow_interrupt_flags;
> + if (flags & ACPI_MPAM_MSC_IRQ_AFFINITY_VALID &&
> + flags & ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER)
> + aff = tbl_msc->overflow_interrupt_affinity;
Just to make the two cases look the same I'd do
else
aff = ~0;
here as well and not initialize above. It's not quite worth using
a helper function for these two identical blocks but it's close.
> + if (frob_irq(pdev, tbl_msc->overflow_interrupt, flags, &irq, aff)) {
> + res[*res_idx].start = irq;
> + res[*res_idx].end = irq;
> + res[*res_idx].flags = IORESOURCE_IRQ;
> + res[*res_idx].name = "overflow";
res[*res_idx] = DEFINE_RES_IRQ_NAMED(irq, "overflow");
> +
> + (*res_idx)++;
Can roll this in as well.
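i.e. something like the below, assuming the surrounding code stays as quoted
(DEFINE_RES_IRQ_NAMED() takes the IRQ and the name, the size is implicitly one):

        if (frob_irq(pdev, tbl_msc->overflow_interrupt, flags, &irq, aff))
                res[(*res_idx)++] = DEFINE_RES_IRQ_NAMED(irq, "overflow");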
> + }
> +
> + flags = tbl_msc->error_interrupt_flags;
> + if (flags & ACPI_MPAM_MSC_IRQ_AFFINITY_VALID &&
> + flags & ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER)
> + aff = tbl_msc->error_interrupt_affinity;
> + else
> + aff = ~0;
> + if (frob_irq(pdev, tbl_msc->error_interrupt, flags, &irq, aff)) {
> + res[*res_idx].start = irq;
> + res[*res_idx].end = irq;
> + res[*res_idx].flags = IORESOURCE_IRQ;
> + res[*res_idx].name = "error";
Similar to above.
> +
> + (*res_idx)++;
> + }
> +}
> +
> +static bool __init parse_msc_pm_link(struct acpi_mpam_msc_node *tbl_msc,
> + struct platform_device *pdev,
> + u32 *acpi_id)
> +{
> + bool acpi_id_valid = false;
> + struct acpi_device *buddy;
> + char hid[16], uid[16];
> + int err;
> +
> + memset(&hid, 0, sizeof(hid));
> + memcpy(hid, &tbl_msc->hardware_id_linked_device,
> + sizeof(tbl_msc->hardware_id_linked_device));
> +
> + if (!strcmp(hid, ACPI_PROCESSOR_CONTAINER_HID)) {
> + *acpi_id = tbl_msc->instance_id_linked_device;
> + acpi_id_valid = true;
> + }
> +
> + err = snprintf(uid, sizeof(uid), "%u",
> + tbl_msc->instance_id_linked_device);
> + if (err < 0 || err >= sizeof(uid))
Does snprintf() ever return < 0 ? It's documented as returning
number of chars printed (without the NULL) so that can only be 0 or
greater.
Can it return >= sizeof(uid) ? Looks odd.
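For reference, a short sketch of the semantics in question: the kernel's
snprintf() returns the length the formatted string would have needed, so
truncation shows up as a return value >= sizeof(buf), never a negative value.
'instance_id' below is a stand-in for tbl_msc->instance_id_linked_device:

        char uid[16];
        int len;

        len = snprintf(uid, sizeof(uid), "%u", instance_id);
        if (len >= sizeof(uid))         /* truncated; can't happen for a u32 */
                return acpi_id_valid;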
> + return acpi_id_valid;
> +
> + buddy = acpi_dev_get_first_match_dev(hid, uid, -1);
> + if (buddy)
> + device_link_add(&pdev->dev, &buddy->dev, DL_FLAG_STATELESS);
> +
> + return acpi_id_valid;
> +}
> +static int __init _parse_table(struct acpi_table_header *table)
> +{
> + char *table_end, *table_offset = (char *)(table + 1);
> + struct property_entry props[4]; /* needs a sentinel */
> + struct acpi_mpam_msc_node *tbl_msc;
> + int next_res, next_prop, err = 0;
> + struct acpi_device *companion;
> + struct platform_device *pdev;
> + enum mpam_msc_iface iface;
> + struct resource res[3];
> + char uid[16];
> + u32 acpi_id;
> +
> + table_end = (char *)table + table->length;
> +
> + while (table_offset < table_end) {
> + tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
> + table_offset += tbl_msc->length;
> +
> + /*
> + * If any of the reserved fields are set, make no attempt to
> + * parse the msc structure. This will prevent the driver from
> + * probing all the MSC, meaning it can't discover the system
> + * wide supported partid and pmg ranges. This avoids whatever
> + * this MSC is truncating the partids and creating a screaming
> + * error interrupt.
> + */
> + if (tbl_msc->reserved || tbl_msc->reserved1 || tbl_msc->reserved2)
> + continue;
> +
> + if (decode_interface_type(tbl_msc, &iface))
> + continue;
> +
> + next_res = 0;
> + next_prop = 0;
> + memset(res, 0, sizeof(res));
> + memset(props, 0, sizeof(props));
> +
> + pdev = platform_device_alloc("mpam_msc", tbl_msc->identifier);
> + if (IS_ERR(pdev)) {
returns NULL in at least some error cases (probably all, I'm just too lazy to check)
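i.e. a sketch of the check this implies, since platform_device_alloc()
reports failure with NULL rather than an ERR_PTR():

                pdev = platform_device_alloc("mpam_msc", tbl_msc->identifier);
                if (!pdev) {
                        err = -ENOMEM;
                        break;
                }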
> + err = PTR_ERR(pdev);
> + break;
> + }
> +
> + if (tbl_msc->length < sizeof(*tbl_msc)) {
> + err = -EINVAL;
> + break;
> + }
> +
> + /* Some power management is described in the namespace: */
> + err = snprintf(uid, sizeof(uid), "%u", tbl_msc->identifier);
> + if (err > 0 && err < sizeof(uid)) {
> + companion = acpi_dev_get_first_match_dev("ARMHAA5C", uid, -1);
> + if (companion)
> + ACPI_COMPANION_SET(&pdev->dev, companion);
> + }
> +
> + if (iface == MPAM_IFACE_MMIO) {
> + res[next_res].name = "MPAM:MSC";
> + res[next_res].start = tbl_msc->base_address;
> + res[next_res].end = tbl_msc->base_address + tbl_msc->mmio_size - 1;
> + res[next_res].flags = IORESOURCE_MEM;
> + next_res++;
DEFINE_RES_MEM_NAMED()?
> + } else if (iface == MPAM_IFACE_PCC) {
> + props[next_prop++] = PROPERTY_ENTRY_U32("pcc-channel",
> + tbl_msc->base_address);
> + next_prop++;
> + }
> +
> + acpi_mpam_parse_irqs(pdev, tbl_msc, res, &next_res);
> + err = platform_device_add_resources(pdev, res, next_res);
> + if (err)
> + break;
> +
> + props[next_prop++] = PROPERTY_ENTRY_U32("arm,not-ready-us",
> + tbl_msc->max_nrdy_usec);
> +
> + /*
> + * The MSC's CPU affinity is described via its linked power
> + * management device, but only if it points at a Processor or
> + * Processor Container.
> + */
> + if (parse_msc_pm_link(tbl_msc, pdev, &acpi_id)) {
> + props[next_prop++] = PROPERTY_ENTRY_U32("cpu_affinity",
> + acpi_id);
> + }
> +
> + err = device_create_managed_software_node(&pdev->dev, props,
> + NULL);
> + if (err)
> + break;
> +
> + /* Come back later if you want the RIS too */
> + err = platform_device_add_data(pdev, tbl_msc, tbl_msc->length);
> + if (err)
> + break;
> +
> + platform_device_add(pdev);
Can fail.
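e.g. a sketch of propagating the failure through the existing error path
(which already calls platform_device_put() when err is set):

                err = platform_device_add(pdev);
                if (err)
                        break;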
> + }
> +
> + if (err)
> + platform_device_put(pdev);
> +
> + return err;
> +}
> +
> +static struct acpi_table_header *get_table(void)
> +{
> + struct acpi_table_header *table;
> + acpi_status status;
> +
> + if (acpi_disabled || !system_supports_mpam())
> + return NULL;
> +
> + status = acpi_get_table(ACPI_SIG_MPAM, 0, &table);
> + if (ACPI_FAILURE(status))
> + return NULL;
> +
> + if (table->revision != 1)
> + return NULL;
> +
> + return table;
> +}
> +
> +static int __init acpi_mpam_parse(void)
> +{
> + struct acpi_table_header *mpam;
> + int err;
> +
> + mpam = get_table();
> + if (!mpam)
> + return 0;
Just what I was suggesting for the PPTT case in earlier patches. Nice :)
> +
> + err = _parse_table(mpam);
> + acpi_put_table(mpam);
> +
> + return err;
> +}
> +
> +static int _count_msc(struct acpi_table_header *table)
> +{
> + char *table_end, *table_offset = (char *)(table + 1);
> + struct acpi_mpam_msc_node *tbl_msc;
> + int ret = 0;
Call it count as it only ever contains the count?
> +
> + tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
> + table_end = (char *)table + table->length;
> +
> + while (table_offset < table_end) {
> + if (tbl_msc->length < sizeof(*tbl_msc))
> + return -EINVAL;
> +
> + ret++;
count++ would feel more natural here.
> +
> + table_offset += tbl_msc->length;
> + tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
> + }
> +
> + return ret;
> +}
That's all I have time for today. Will get to the rest of the series soonish.
Jonathan
^ permalink raw reply [flat|nested] 117+ messages in thread
* RE: [RFC PATCH 16/36] arm_mpam: Add MPAM MSC register layout definitions
2025-07-11 18:36 ` [RFC PATCH 16/36] arm_mpam: Add MPAM MSC register layout definitions James Morse
@ 2025-07-17 1:04 ` Shaopeng Tan (Fujitsu)
2025-08-06 18:04 ` James Morse
2025-07-24 14:02 ` Ben Horgan
1 sibling, 1 reply; 117+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2025-07-17 1:04 UTC (permalink / raw)
To: 'James Morse', linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko
Hello James,
> Memory Partitioning and Monitoring (MPAM) has memory mapped devices
> (MSCs) with an identity/configuration page.
>
> Add the definitions for these registers as offset within the page(s).
>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_internal.h | 268 ++++++++++++++++++++
> 1 file changed, 268 insertions(+)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
> index d49bb884b433..9110c171d9d2 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -150,4 +150,272 @@ extern struct list_head mpam_classes;
> int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
> 			    cpumask_t *affinity);
>
> +/*
> + * MPAM MSCs have the following register layout. See:
> + * Arm Architecture Reference Manual Supplement - Memory System Resource
> + * Partitioning and Monitoring (MPAM), for Armv8-A. DDI 0598A.a
> + */
> +#define MPAM_ARCHITECTURE_V1 0x10
> +
> +/* Memory mapped control pages: */
> +/* ID Register offsets in the memory mapped page */
> +#define MPAMF_IDR 0x0000 /* features id register */
> +#define MPAMF_MSMON_IDR 0x0080 /* performance monitoring features */
> +#define MPAMF_IMPL_IDR 0x0028 /* imp-def partitioning */
> +#define MPAMF_CPOR_IDR 0x0030 /* cache-portion partitioning */
> +#define MPAMF_CCAP_IDR 0x0038 /* cache-capacity partitioning */
> +#define MPAMF_MBW_IDR 0x0040 /* mem-bw partitioning */
> +#define MPAMF_PRI_IDR 0x0048 /* priority partitioning */
> +#define MPAMF_CSUMON_IDR 0x0088 /* cache-usage monitor */
> +#define MPAMF_MBWUMON_IDR 0x0090 /* mem-bw usage monitor */
> +#define MPAMF_PARTID_NRW_IDR 0x0050 /* partid-narrowing */
> +#define MPAMF_IIDR 0x0018 /* implementer id register */
> +#define MPAMF_AIDR 0x0020 /* architectural id register */
> +
> +/* Configuration and Status Register offsets in the memory mapped page */
> +#define MPAMCFG_PART_SEL 0x0100 /* partid to configure: */
> +#define MPAMCFG_CPBM 0x1000 /* cache-portion config */
> +#define MPAMCFG_CMAX 0x0108 /* cache-capacity config */
> +#define MPAMCFG_CMIN 0x0110 /* cache-capacity config */
> +#define MPAMCFG_MBW_MIN 0x0200 /* min mem-bw config */
> +#define MPAMCFG_MBW_MAX 0x0208 /* max mem-bw config */
> +#define MPAMCFG_MBW_WINWD 0x0220 /* mem-bw accounting window config */
> +#define MPAMCFG_MBW_PBM 0x2000 /* mem-bw portion bitmap config */
> +#define MPAMCFG_PRI 0x0400 /* priority partitioning config */
> +#define MPAMCFG_MBW_PROP 0x0500 /* mem-bw stride config */
> +#define MPAMCFG_INTPARTID 0x0600 /* partid-narrowing config */
> +
> +#define MSMON_CFG_MON_SEL 0x0800 /* monitor selector */
> +#define MSMON_CFG_CSU_FLT 0x0810 /* cache-usage monitor filter */
> +#define MSMON_CFG_CSU_CTL 0x0818 /* cache-usage monitor config */
> +#define MSMON_CFG_MBWU_FLT 0x0820 /* mem-bw monitor filter */
> +#define MSMON_CFG_MBWU_CTL 0x0828 /* mem-bw monitor config */
> +#define MSMON_CSU 0x0840 /* current cache-usage */
> +#define MSMON_CSU_CAPTURE 0x0848 /* last cache-usage value captured */
> +#define MSMON_MBWU 0x0860 /* current mem-bw usage value */
> +#define MSMON_MBWU_CAPTURE 0x0868 /* last mem-bw value captured */
> +#define MSMON_CAPT_EVNT 0x0808 /* signal a capture event */
> +#define MPAMF_ESR 0x00F8 /* error status register */
> +#define MPAMF_ECR 0x00F0 /* error control register */
> +
> +/* MPAMF_IDR - MPAM features ID register */
> +#define MPAMF_IDR_PARTID_MAX GENMASK(15, 0)
> +#define MPAMF_IDR_PMG_MAX GENMASK(23, 16)
> +#define MPAMF_IDR_HAS_CCAP_PART BIT(24)
> +#define MPAMF_IDR_HAS_CPOR_PART BIT(25)
> +#define MPAMF_IDR_HAS_MBW_PART BIT(26)
> +#define MPAMF_IDR_HAS_PRI_PART BIT(27)
> +#define MPAMF_IDR_HAS_EXT BIT(28)
> +#define MPAMF_IDR_HAS_IMPL_IDR BIT(29)
> +#define MPAMF_IDR_HAS_MSMON BIT(30)
> +#define MPAMF_IDR_HAS_PARTID_NRW BIT(31)
> +#define MPAMF_IDR_HAS_RIS BIT(32)
> +#define MPAMF_IDR_HAS_EXT_ESR BIT(38)
> +#define MPAMF_IDR_HAS_ESR BIT(39)
> +#define MPAMF_IDR_RIS_MAX GENMASK(59, 56)
> +
> +/* MPAMF_MSMON_IDR - MPAM performance monitoring ID register */
> +#define MPAMF_MSMON_IDR_MSMON_CSU BIT(16)
> +#define MPAMF_MSMON_IDR_MSMON_MBWU BIT(17)
> +#define MPAMF_MSMON_IDR_HAS_LOCAL_CAPT_EVNT BIT(31)
> +
> +/* MPAMF_CPOR_IDR - MPAM features cache portion partitioning ID register */
> +#define MPAMF_CPOR_IDR_CPBM_WD GENMASK(15, 0)
> +
> +/* MPAMF_CCAP_IDR - MPAM features cache capacity partitioning ID register */
> +#define MPAMF_CCAP_IDR_HAS_CMAX_SOFTLIM BIT(31)
> +#define MPAMF_CCAP_IDR_NO_CMAX BIT(30)
> +#define MPAMF_CCAP_IDR_HAS_CMIN BIT(29)
> +#define MPAMF_CCAP_IDR_HAS_CASSOC BIT(28)
> +#define MPAMF_CCAP_IDR_CASSOC_WD GENMASK(12, 8)
> +#define MPAMF_CCAP_IDR_CMAX_WD GENMASK(5, 0)
> +
> +/* MPAMF_MBW_IDR - MPAM features memory bandwidth partitioning ID register */
> +#define MPAMF_MBW_IDR_BWA_WD GENMASK(5, 0)
> +#define MPAMF_MBW_IDR_HAS_MIN BIT(10)
> +#define MPAMF_MBW_IDR_HAS_MAX BIT(11)
> +#define MPAMF_MBW_IDR_HAS_PBM BIT(12)
> +#define MPAMF_MBW_IDR_HAS_PROP BIT(13)
> +#define MPAMF_MBW_IDR_WINDWR BIT(14)
> +#define MPAMF_MBW_IDR_BWPBM_WD GENMASK(28, 16)
> +
> +/* MPAMF_PRI_IDR - MPAM features priority partitioning ID register */
> +#define MPAMF_PRI_IDR_HAS_INTPRI BIT(0)
> +#define MPAMF_PRI_IDR_INTPRI_0_IS_LOW BIT(1)
> +#define MPAMF_PRI_IDR_INTPRI_WD GENMASK(9, 4)
> +#define MPAMF_PRI_IDR_HAS_DSPRI BIT(16)
> +#define MPAMF_PRI_IDR_DSPRI_0_IS_LOW BIT(17)
> +#define MPAMF_PRI_IDR_DSPRI_WD GENMASK(25, 20)
> +
> +/* MPAMF_CSUMON_IDR - MPAM cache storage usage monitor ID register
> */
> +#define MPAMF_CSUMON_IDR_NUM_MON GENMASK(15, 0)
> +#define MPAMF_CSUMON_IDR_HAS_OFLOW_CAPT BIT(24)
> +#define MPAMF_CSUMON_IDR_HAS_CEVNT_OFLW BIT(25)
> +#define MPAMF_CSUMON_IDR_HAS_OFSR BIT(26)
> +#define MPAMF_CSUMON_IDR_HAS_OFLOW_LNKG BIT(27)
> +#define MPAMF_CSUMON_IDR_HAS_XCL BIT(29)
> +#define MPAMF_CSUMON_IDR_CSU_RO BIT(30)
> +#define MPAMF_CSUMON_IDR_HAS_CAPTURE BIT(31)
> +
> +/* MPAMF_MBWUMON_IDR - MPAM memory bandwidth usage monitor ID
> register */
> +#define MPAMF_MBWUMON_IDR_NUM_MON GENMASK(15, 0)
> +#define MPAMF_MBWUMON_IDR_HAS_RWBW BIT(28)
> +#define MPAMF_MBWUMON_IDR_LWD BIT(29)
> +#define MPAMF_MBWUMON_IDR_HAS_LONG BIT(30)
> +#define MPAMF_MBWUMON_IDR_HAS_CAPTURE BIT(31)
> +
> +/* MPAMF_PARTID_NRW_IDR - MPAM PARTID narrowing ID register */
> +#define MPAMF_PARTID_NRW_IDR_INTPARTID_MAX GENMASK(15, 0)
> +
> +/* MPAMF_IIDR - MPAM implementation ID register */
> +#define MPAMF_IIDR_PRODUCTID GENMASK(31, 20)
> +#define MPAMF_IIDR_PRODUCTID_SHIFT 20
> +#define MPAMF_IIDR_VARIANT GENMASK(19, 16)
> +#define MPAMF_IIDR_VARIANT_SHIFT 16
> +#define MPAMF_IIDR_REVISON GENMASK(15, 12)
> +#define MPAMF_IIDR_REVISON_SHIFT 12
> +#define MPAMF_IIDR_IMPLEMENTER GENMASK(11, 0)
> +#define MPAMF_IIDR_IMPLEMENTER_SHIFT 0
> +
> +/* MPAMF_AIDR - MPAM architecture ID register */
> +#define MPAMF_AIDR_ARCH_MAJOR_REV GENMASK(7, 4)
> +#define MPAMF_AIDR_ARCH_MINOR_REV GENMASK(3, 0)
> +
> +/* MPAMCFG_PART_SEL - MPAM partition configuration selection register
> */
> +#define MPAMCFG_PART_SEL_PARTID_SEL GENMASK(15, 0)
> +#define MPAMCFG_PART_SEL_INTERNAL BIT(16)
> +#define MPAMCFG_PART_SEL_RIS GENMASK(27, 24)
> +
> +/* MPAMCFG_CMAX - MPAM cache capacity configuration register */
> +#define MPAMCFG_CMAX_SOFTLIM BIT(31)
> +#define MPAMCFG_CMAX_CMAX GENMASK(15, 0)
> +
> +/* MPAMCFG_CMIN - MPAM cache capacity configuration register */
> +#define MPAMCFG_CMIN_CMIN GENMASK(15, 0)
> +
> +/*
> + * MPAMCFG_MBW_MIN - MPAM memory minimum bandwidth partitioning
> configuration
> + * register
> + */
> +#define MPAMCFG_MBW_MIN_MIN GENMASK(15, 0)
> +
> +/*
> + * MPAMCFG_MBW_MAX - MPAM memory maximum bandwidth
> partitioning configuration
> + * register
> + */
> +#define MPAMCFG_MBW_MAX_MAX GENMASK(15, 0)
> +#define MPAMCFG_MBW_MAX_HARDLIM BIT(31)
> +
> +/*
> + * MPAMCFG_MBW_WINWD - MPAM memory bandwidth partitioning
> window width
> + * register
> + */
> +#define MPAMCFG_MBW_WINWD_US_FRAC GENMASK(7, 0)
> +#define MPAMCFG_MBW_WINWD_US_INT GENMASK(23, 8)
> +
> +/* MPAMCFG_PRI - MPAM priority partitioning configuration register */
> +#define MPAMCFG_PRI_INTPRI GENMASK(15, 0)
> +#define MPAMCFG_PRI_DSPRI GENMASK(31, 16)
> +
> +/*
> + * MPAMCFG_MBW_PROP - Memory bandwidth proportional stride
> partitioning
> + * configuration register
> + */
> +#define MPAMCFG_MBW_PROP_STRIDEM1 GENMASK(15, 0)
> +#define MPAMCFG_MBW_PROP_EN BIT(31)
> +
> +/*
> + * MPAMCFG_INTPARTID - MPAM internal partition narrowing configuration
> + * register
> + */
> +#define MPAMCFG_INTPARTID_INTPARTID GENMASK(15, 0)
> +#define MPAMCFG_INTPARTID_INTERNAL BIT(16)
> +
> +/* MSMON_CFG_MON_SEL - Memory system performance monitor
> selection register */
> +#define MSMON_CFG_MON_SEL_MON_SEL GENMASK(15, 0)
> +#define MSMON_CFG_MON_SEL_RIS GENMASK(27, 24)
> +
> +/* MPAMF_ESR - MPAM Error Status Register */
> +#define MPAMF_ESR_PARTID_OR_MON GENMASK(15, 0)
> +#define MPAMF_ESR_PMG GENMASK(23, 16)
> +#define MPAMF_ESR_ERRCODE GENMASK(27, 24)
> +#define MPAMF_ESR_OVRWR BIT(31)
> +#define MPAMF_ESR_RIS GENMASK(35, 32)
> +
> +/* MPAMF_ECR - MPAM Error Control Register */
> +#define MPAMF_ECR_INTEN BIT(0)
> +
> +/* Error conditions in accessing memory mapped registers */
> +#define MPAM_ERRCODE_NONE 0
> +#define MPAM_ERRCODE_PARTID_SEL_RANGE 1
> +#define MPAM_ERRCODE_REQ_PARTID_RANGE 2
> +#define MPAM_ERRCODE_MSMONCFG_ID_RANGE 3
> +#define MPAM_ERRCODE_REQ_PMG_RANGE 4
> +#define MPAM_ERRCODE_MONITOR_RANGE 5
> +#define MPAM_ERRCODE_INTPARTID_RANGE 6
> +#define MPAM_ERRCODE_UNEXPECTED_INTERNAL 7
> +
> +/*
> + * MSMON_CFG_CSU_FLT - Memory system performance monitor configure
> cache storage
> + * usage monitor filter register
> + */
> +#define MSMON_CFG_CSU_FLT_PARTID GENMASK(15, 0)
> +#define MSMON_CFG_CSU_FLT_PMG GENMASK(23, 16)
> +
> +/*
> + * MSMON_CFG_CSU_CTL - Memory system performance monitor configure
> cache storage
> + * usage monitor control register
> + * MSMON_CFG_MBWU_CTL - Memory system performance monitor
> configure memory
> + * bandwidth usage monitor control register
> + */
> +#define MSMON_CFG_x_CTL_TYPE GENMASK(7, 0)
> +#define MSMON_CFG_x_CTL_OFLOW_STATUS_L BIT(15)
> +#define MSMON_CFG_x_CTL_MATCH_PARTID BIT(16)
> +#define MSMON_CFG_x_CTL_MATCH_PMG BIT(17)
> +#define MSMON_CFG_x_CTL_SCLEN BIT(19)
> +#define MSMON_CFG_x_CTL_SUBTYPE GENMASK(23, 20)
> +#define MSMON_CFG_x_CTL_OFLOW_FRZ BIT(24)
> +#define MSMON_CFG_x_CTL_OFLOW_INTR BIT(25)
> +#define MSMON_CFG_x_CTL_OFLOW_STATUS BIT(26)
> +#define MSMON_CFG_x_CTL_CAPT_RESET BIT(27)
> +#define MSMON_CFG_x_CTL_CAPT_EVNT GENMASK(30, 28)
> +#define MSMON_CFG_x_CTL_EN BIT(31)
> +
> +#define MSMON_CFG_MBWU_CTL_TYPE_MBWU 0x42
> +#define MSMON_CFG_MBWU_CTL_TYPE_CSU 0x43
MSMON_CFG_CSU_CTL_TYPE_CSU?
Best regards,
Shaopeng TAN
> +
> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_NONE 0
> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_READ 1
> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_WRITE 2
> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_BOTH 3
> +
> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_MAX 3
> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_MASK 0x3
> +
> +/*
> + * MSMON_CFG_MBWU_FLT - Memory system performance monitor
> configure memory
> + * bandwidth usage monitor filter register
> + */
> +#define MSMON_CFG_MBWU_FLT_PARTID GENMASK(15, 0)
> +#define MSMON_CFG_MBWU_FLT_PMG GENMASK(23, 16)
> +#define MSMON_CFG_MBWU_FLT_RWBW GENMASK(31, 30)
> +
> +/*
> + * MSMON_CSU - Memory system performance monitor cache storage usage
> monitor
> + * register
> + * MSMON_CSU_CAPTURE - Memory system performance monitor cache
> storage usage
> + * capture register
> + * MSMON_MBWU - Memory system performance monitor memory
> bandwidth usage
> + * monitor register
> + * MSMON_MBWU_CAPTURE - Memory system performance monitor
> memory bandwidth usage
> + * capture register
> + */
> +#define MSMON___VALUE GENMASK(30, 0)
> +#define MSMON___NRDY BIT(31)
> +#define MSMON_MBWU_L_VALUE GENMASK(62, 0)
> +/*
> + * MSMON_CAPT_EVNT - Memory system performance monitoring capture
> event
> + * generation register
> + */
> +#define MSMON_CAPT_EVNT_NOW BIT(0)
> +
> #endif /* MPAM_INTERNAL_H */
> --
> 2.39.5
^ permalink raw reply [flat|nested] 117+ messages in thread
* RE: [RFC PATCH 25/36] arm_mpam: Register and enable IRQs
2025-07-11 18:36 ` [RFC PATCH 25/36] arm_mpam: Register and enable IRQs James Morse
2025-07-16 7:31 ` Shaopeng Tan (Fujitsu)
@ 2025-07-17 1:08 ` Shaopeng Tan (Fujitsu)
2025-08-08 7:07 ` James Morse
2025-07-22 15:06 ` Jonathan Cameron
` (2 subsequent siblings)
4 siblings, 1 reply; 117+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2025-07-17 1:08 UTC (permalink / raw)
To: 'James Morse', linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko
Hello James,
>
> Register and enable error IRQs. All the MPAM error interrupts indicate a
> software bug, e.g. out of range partid. If the error interrupt is ever signalled,
> attempt to disable MPAM.
>
> Only the irq handler accesses the ESR register, so no locking is needed.
> The work to disable MPAM after an error needs to happen at process context,
> use a threaded interrupt.
>
> There is no support for percpu threaded interrupts, for now schedule the work
> to be done from the irq handler.
>
> Enabling the IRQs in the MSC may involve cross calling to a CPU that can
> access the MSC.
>
> CC: Rohit Mathew <rohit.mathew@arm.com>
> Tested-by: Rohit Mathew <rohit.mathew@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 304
> +++++++++++++++++++-
> drivers/platform/arm64/mpam/mpam_internal.h | 9 +-
> 2 files changed, 307 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c
> b/drivers/platform/arm64/mpam/mpam_devices.c
> index 145535cd4732..af19cc25d16e 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -14,6 +14,9 @@
> #include <linux/device.h>
> #include <linux/errno.h>
> #include <linux/gfp.h>
> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
> +#include <linux/irqdesc.h>
> #include <linux/list.h>
> #include <linux/lockdep.h>
> #include <linux/mutex.h>
> @@ -62,6 +65,12 @@ static DEFINE_SPINLOCK(partid_max_lock);
> */
> static DECLARE_WORK(mpam_enable_work, &mpam_enable);
>
> +/*
> + * All mpam error interrupts indicate a software bug. On receipt,
> +disable the
> + * driver.
> + */
> +static DECLARE_WORK(mpam_broken_work, &mpam_disable);
> +
> /*
> * An MSC is a physical container for controls and monitors, each identified by
> * their RIS index. These share a base-address, interrupts and some MMIO
> @@ -159,6 +168,24 @@ static u64 mpam_msc_read_idr(struct mpam_msc
> *msc)
> return (idr_high << 32) | idr_low;
> }
>
> +static void mpam_msc_zero_esr(struct mpam_msc *msc) {
> + __mpam_write_reg(msc, MPAMF_ESR, 0);
> + if (msc->has_extd_esr)
> + __mpam_write_reg(msc, MPAMF_ESR + 4, 0); }
> +
> +static u64 mpam_msc_read_esr(struct mpam_msc *msc) {
> + u64 esr_high = 0, esr_low;
> +
> + esr_low = __mpam_read_reg(msc, MPAMF_ESR);
> + if (msc->has_extd_esr)
> + esr_high = __mpam_read_reg(msc, MPAMF_ESR + 4);
> +
> + return (esr_high << 32) | esr_low;
> +}
> +
> static void __mpam_part_sel_raw(u32 partsel, struct mpam_msc *msc) {
> lockdep_assert_held(&msc->part_sel_lock);
> @@ -405,12 +432,12 @@ static void mpam_msc_destroy(struct mpam_msc
> *msc)
>
> lockdep_assert_held(&mpam_list_lock);
>
> - list_del_rcu(&msc->glbl_list);
> - platform_set_drvdata(pdev, NULL);
> -
> list_for_each_entry_safe(ris, tmp, &msc->ris, msc_list)
> mpam_ris_destroy(ris);
>
> + list_del_rcu(&msc->glbl_list);
> + platform_set_drvdata(pdev, NULL);
> +
> add_to_garbage(msc);
> msc->garbage.pdev = pdev;
> }
> @@ -828,6 +855,7 @@ static int mpam_msc_hw_probe(struct mpam_msc
> *msc)
> pmg_max = FIELD_GET(MPAMF_IDR_PMG_MAX, idr);
> msc->partid_max = min(msc->partid_max, partid_max);
> msc->pmg_max = min(msc->pmg_max, pmg_max);
> + msc->has_extd_esr =
> FIELD_GET(MPAMF_IDR_HAS_EXT_ESR, idr);
>
> ris = mpam_get_or_create_ris(msc, ris_idx);
> if (IS_ERR(ris))
> @@ -974,6 +1002,13 @@ static void mpam_reset_msc(struct mpam_msc
> *msc, bool online)
> mpam_mon_sel_outer_unlock(msc);
> }
>
> +static void _enable_percpu_irq(void *_irq) {
> + int *irq = _irq;
> +
> + enable_percpu_irq(*irq, IRQ_TYPE_NONE); }
> +
> static int mpam_cpu_online(unsigned int cpu) {
> int idx;
> @@ -984,6 +1019,9 @@ static int mpam_cpu_online(unsigned int cpu)
> if (!cpumask_test_cpu(cpu, &msc->accessibility))
> continue;
>
> + if (msc->reenable_error_ppi)
> + _enable_percpu_irq(&msc->reenable_error_ppi);
> +
> if (atomic_fetch_inc(&msc->online_refs) == 0)
> mpam_reset_msc(msc, true);
> }
> @@ -1032,6 +1070,9 @@ static int mpam_cpu_offline(unsigned int cpu)
> if (!cpumask_test_cpu(cpu, &msc->accessibility))
> continue;
>
> + if (msc->reenable_error_ppi)
> + disable_percpu_irq(msc->reenable_error_ppi);
> +
> if (atomic_dec_and_test(&msc->online_refs))
> mpam_reset_msc(msc, false);
> }
> @@ -1058,6 +1099,51 @@ static void mpam_register_cpuhp_callbacks(int
> (*online)(unsigned int online),
> mutex_unlock(&mpam_cpuhp_state_lock);
> }
>
> +static int __setup_ppi(struct mpam_msc *msc) {
> + int cpu;
> +
> + msc->error_dev_id = alloc_percpu_gfp(struct mpam_msc *,
> GFP_KERNEL);
> + if (!msc->error_dev_id)
> + return -ENOMEM;
> +
> + for_each_cpu(cpu, &msc->accessibility) {
> + struct mpam_msc *empty = *per_cpu_ptr(msc->error_dev_id,
> cpu);
> +
> + if (empty) {
> + pr_err_once("%s shares PPI with %s!\n",
> + dev_name(&msc->pdev->dev),
> + dev_name(&empty->pdev->dev));
> + return -EBUSY;
> + }
> + *per_cpu_ptr(msc->error_dev_id, cpu) = msc;
> + }
> +
> + return 0;
> +}
> +
> +static int mpam_msc_setup_error_irq(struct mpam_msc *msc) {
> + int irq;
> +
> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
> + if (irq <= 0)
> + return 0;
> +
> + /* Allocate and initialise the percpu device pointer for PPI */
> + if (irq_is_percpu(irq))
> + return __setup_ppi(msc);
> +
> + /* sanity check: shared interrupts can be routed anywhere? */
> + if (!cpumask_equal(&msc->accessibility, cpu_possible_mask)) {
> + pr_err_once("msc:%u is a private resource with a shared error
> interrupt",
> + msc->id);
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> static int mpam_dt_count_msc(void)
> {
> int count = 0;
> @@ -1266,6 +1352,10 @@ static int mpam_msc_drv_probe(struct
> platform_device *pdev)
> break;
> }
>
> + err = mpam_msc_setup_error_irq(msc);
> + if (err)
> + break;
> +
> if (device_property_read_u32(&pdev->dev, "pcc-channel",
> &msc->pcc_subspace_id))
> msc->iface = MPAM_IFACE_MMIO;
> @@ -1548,11 +1638,193 @@ static void mpam_enable_merge_features(struct
> list_head *all_classes_list)
> }
> }
>
> +static char *mpam_errcode_names[16] = {
> + [0] = "No error",
> + [1] = "PARTID_SEL_Range",
> + [2] = "Req_PARTID_Range",
> + [3] = "MSMONCFG_ID_RANGE",
> + [4] = "Req_PMG_Range",
> + [5] = "Monitor_Range",
> + [6] = "intPARTID_Range",
> + [7] = "Unexpected_INTERNAL",
> + [8] = "Undefined_RIS_PART_SEL",
> + [9] = "RIS_No_Control",
> + [10] = "Undefined_RIS_MON_SEL",
> + [11] = "RIS_No_Monitor",
> + [12 ... 15] = "Reserved"
> +};
> +
> +static int mpam_enable_msc_ecr(void *_msc) {
> + struct mpam_msc *msc = _msc;
> +
> + __mpam_write_reg(msc, MPAMF_ECR, 1);
> +
> + return 0;
> +}
> +
> +static int mpam_disable_msc_ecr(void *_msc) {
> + struct mpam_msc *msc = _msc;
> +
> + __mpam_write_reg(msc, MPAMF_ECR, 0);
> +
> + return 0;
> +}
> +
> +static irqreturn_t __mpam_irq_handler(int irq, struct mpam_msc *msc) {
> + u64 reg;
> + u16 partid;
> + u8 errcode, pmg, ris;
> +
> + if (WARN_ON_ONCE(!msc) ||
> + WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(),
> + &msc->accessibility)))
> + return IRQ_NONE;
> +
> + reg = mpam_msc_read_esr(msc);
> +
> + errcode = FIELD_GET(MPAMF_ESR_ERRCODE, reg);
> + if (!errcode)
> + return IRQ_NONE;
In general, I think there is no problem.
However, the reset value of MPAMF_ESR_ERRCODE may not be 0 on some chips.
It would be better to zero the register when the MPAM driver is loaded.
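One way that could look, sketched against the helpers in this patch; calling
mpam_msc_zero_esr() from mpam_enable_msc_ecr() is only one option, doing it
at probe time would work too:

static int mpam_enable_msc_ecr(void *_msc)
{
        struct mpam_msc *msc = _msc;

        /* Assumption: discard any stale or power-on MPAMF_ESR contents first */
        mpam_msc_zero_esr(msc);
        __mpam_write_reg(msc, MPAMF_ECR, 1);

        return 0;
}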
Best regards,
Shaopeng TAN
> + /* Clear level triggered irq */
> + mpam_msc_zero_esr(msc);
> +
> + partid = FIELD_GET(MPAMF_ESR_PARTID_OR_MON, reg);
> + pmg = FIELD_GET(MPAMF_ESR_PMG, reg);
> + ris = FIELD_GET(MPAMF_ESR_PMG, reg);
> +
> + pr_err("error irq from msc:%u '%s', partid:%u, pmg: %u, ris: %u\n",
> + msc->id, mpam_errcode_names[errcode], partid, pmg, ris);
> +
> + if (irq_is_percpu(irq)) {
> + mpam_disable_msc_ecr(msc);
> + schedule_work(&mpam_broken_work);
> + return IRQ_HANDLED;
> + }
> +
> + return IRQ_WAKE_THREAD;
> +}
> +
> +static irqreturn_t mpam_ppi_handler(int irq, void *dev_id) {
> + struct mpam_msc *msc = *(struct mpam_msc **)dev_id;
> +
> + return __mpam_irq_handler(irq, msc);
> +}
> +
> +static irqreturn_t mpam_spi_handler(int irq, void *dev_id) {
> + struct mpam_msc *msc = dev_id;
> +
> + return __mpam_irq_handler(irq, msc);
> +}
> +
> +static irqreturn_t mpam_disable_thread(int irq, void *dev_id);
> +
> +static int mpam_register_irqs(void)
> +{
> + int err, irq, idx;
> + struct mpam_msc *msc;
> +
> + lockdep_assert_cpus_held();
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list,
> srcu_read_lock_held(&mpam_srcu)) {
> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
> + if (irq <= 0)
> + continue;
> +
> + /* The MPAM spec says the interrupt can be SPI, PPI or LPI */
> + /* We anticipate sharing the interrupt with other MSCs */
> + if (irq_is_percpu(irq)) {
> + err = request_percpu_irq(irq, &mpam_ppi_handler,
> + "mpam:msc:error",
> + msc->error_dev_id);
> + if (err)
> + return err;
> +
> + msc->reenable_error_ppi = irq;
> + smp_call_function_many(&msc->accessibility,
> + &_enable_percpu_irq, &irq,
> + true);
> + } else {
> + err =
> devm_request_threaded_irq(&msc->pdev->dev, irq,
> +
> &mpam_spi_handler,
> +
> &mpam_disable_thread,
> + IRQF_SHARED,
> + "mpam:msc:error",
> msc);
> + if (err)
> + return err;
> + }
> +
> + msc->error_irq_requested = true;
> + mpam_touch_msc(msc, mpam_enable_msc_ecr, msc);
> + msc->error_irq_hw_enabled = true;
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +
> + return 0;
> +}
> +
> +static void mpam_unregister_irqs(void)
> +{
> + int irq, idx;
> + struct mpam_msc *msc;
> +
> + cpus_read_lock();
> + /* take the lock as free_irq() can sleep */
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list,
> srcu_read_lock_held(&mpam_srcu)) {
> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
> + if (irq <= 0)
> + continue;
> +
> + if (msc->error_irq_hw_enabled) {
> + mpam_touch_msc(msc, mpam_disable_msc_ecr,
> msc);
> + msc->error_irq_hw_enabled = false;
> + }
> +
> + if (msc->error_irq_requested) {
> + if (irq_is_percpu(irq)) {
> + msc->reenable_error_ppi = 0;
> + free_percpu_irq(irq, msc->error_dev_id);
> + } else {
> + devm_free_irq(&msc->pdev->dev, irq, msc);
> + }
> + msc->error_irq_requested = false;
> + }
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> + cpus_read_unlock();
> +}
> +
> static void mpam_enable_once(void)
> {
> + int err;
> +
> + /*
> + * If all the MSC have been probed, enabling the IRQs happens next.
> + * That involves cross-calling to a CPU that can reach the MSC, and
> + * the locks must be taken in this order:
> + */
> + cpus_read_lock();
> mutex_lock(&mpam_list_lock);
> mpam_enable_merge_features(&mpam_classes);
> +
> + err = mpam_register_irqs();
> + if (err)
> + pr_warn("Failed to register irqs: %d\n", err);
> +
> mutex_unlock(&mpam_list_lock);
> + cpus_read_unlock();
> +
> + if (err) {
> + schedule_work(&mpam_broken_work);
> + return;
> + }
>
> mutex_lock(&mpam_cpuhp_state_lock);
> cpuhp_remove_state(mpam_cpuhp_state);
> @@ -1621,16 +1893,39 @@ static void mpam_reset_class(struct mpam_class
> *class)
> * All of MPAMs errors indicate a software bug, restore any modified
> * controls to their reset values.
> */
> -void mpam_disable(void)
> +static irqreturn_t mpam_disable_thread(int irq, void *dev_id)
> {
> int idx;
> struct mpam_class *class;
> + struct mpam_msc *msc, *tmp;
> +
> + mutex_lock(&mpam_cpuhp_state_lock);
> + if (mpam_cpuhp_state) {
> + cpuhp_remove_state(mpam_cpuhp_state);
> + mpam_cpuhp_state = 0;
> + }
> + mutex_unlock(&mpam_cpuhp_state_lock);
> +
> + mpam_unregister_irqs();
>
> idx = srcu_read_lock(&mpam_srcu);
> list_for_each_entry_srcu(class, &mpam_classes, classes_list,
> srcu_read_lock_held(&mpam_srcu))
> mpam_reset_class(class);
> srcu_read_unlock(&mpam_srcu, idx);
> +
> + mutex_lock(&mpam_list_lock);
> + list_for_each_entry_safe(msc, tmp, &mpam_all_msc, glbl_list)
> + mpam_msc_destroy(msc);
> + mutex_unlock(&mpam_list_lock);
> + mpam_free_garbage();
> +
> + return IRQ_HANDLED;
> +}
> +
> +void mpam_disable(struct work_struct *ignored) {
> + mpam_disable_thread(0, NULL);
> }
>
> /*
> @@ -1644,7 +1939,6 @@ void mpam_enable(struct work_struct *work)
> struct mpam_msc *msc;
> bool all_devices_probed = true;
>
> - /* Have we probed all the hw devices? */
> mutex_lock(&mpam_list_lock);
> list_for_each_entry(msc, &mpam_all_msc, glbl_list) {
> mutex_lock(&msc->probe_lock);
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h
> b/drivers/platform/arm64/mpam/mpam_internal.h
> index de05eece0a31..e1c6a2676b54 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -44,6 +44,11 @@ struct mpam_msc {
> struct pcc_mbox_chan *pcc_chan;
> u32 nrdy_usec;
> cpumask_t accessibility;
> + bool has_extd_esr;
> +
> + int reenable_error_ppi;
> + struct mpam_msc * __percpu *error_dev_id;
> +
> atomic_t online_refs;
>
> /*
> @@ -52,6 +57,8 @@ struct mpam_msc {
> */
> struct mutex probe_lock;
> bool probed;
> + bool error_irq_requested;
> + bool error_irq_hw_enabled;
> u16 partid_max;
> u8 pmg_max;
> unsigned long ris_idxs[128 / BITS_PER_LONG];
> @@ -280,7 +287,7 @@ extern u8 mpam_pmg_max;
>
> /* Scheduled work callback to enable mpam once all MSC have been probed
> */ void mpam_enable(struct work_struct *work); -void mpam_disable(void);
> +void mpam_disable(struct work_struct *work);
>
> int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32
> cache_level,
> cpumask_t *affinity);
> --
> 2.39.5
^ permalink raw reply [flat|nested] 117+ messages in thread
* RE: [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container
2025-07-11 18:36 ` [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container James Morse
@ 2025-07-17 7:58 ` Shaopeng Tan (Fujitsu)
2025-07-25 17:06 ` James Morse
2025-07-22 14:28 ` Jonathan Cameron
2025-07-23 14:42 ` Ben Horgan
2 siblings, 1 reply; 117+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2025-07-17 7:58 UTC (permalink / raw)
To: 'James Morse', linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Sudeep Holla
Hello James,
> The PPTT describes CPUs and caches, as well as processor containers.
> The ACPI table for MPAM describes the set of CPUs that can access an MSC
> with the UID of a processor container.
>
> Add a helper to find the processor container by its id, then walk the possible
> CPUs to fill a cpumask with the CPUs that have this processor container as a
> parent.
>
> CC: Dave Martin <dave.martin@arm.com>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/acpi/pptt.c | 93
> ++++++++++++++++++++++++++++++++++++++++++++
> include/linux/acpi.h | 6 +++
> 2 files changed, 99 insertions(+)
>
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c index
> 54676e3d82dd..13619b1b821b 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -298,6 +298,99 @@ static struct acpi_pptt_processor
> *acpi_find_processor_node(struct acpi_table_he
> return NULL;
> }
>
> +/**
> + * acpi_pptt_get_child_cpus() - Find all the CPUs below a PPTT processor
> node
> + * @table_hdr: A reference to the PPTT table.
> + * @parent_node: A pointer to the processor node in the @table_hdr.
> + * @cpus: A cpumask to fill with the CPUs below @parent_node.
> + *
> + * Walks up the PPTT from every possible CPU to find if the provided
> + * @parent_node is a parent of this CPU.
> + */
> +static void acpi_pptt_get_child_cpus(struct acpi_table_header *table_hdr,
> + struct acpi_pptt_processor
> *parent_node,
> + cpumask_t *cpus)
> +{
> + struct acpi_pptt_processor *cpu_node;
> + u32 acpi_id;
> + int cpu;
> +
> + cpumask_clear(cpus);
> +
> + for_each_possible_cpu(cpu) {
> + acpi_id = get_acpi_id_for_cpu(cpu);
> + cpu_node = acpi_find_processor_node(table_hdr, acpi_id);
> +
> + while (cpu_node) {
> + if (cpu_node == parent_node) {
> + cpumask_set_cpu(cpu, cpus);
> + break;
> + }
> + cpu_node = fetch_pptt_node(table_hdr,
> cpu_node->parent);
> + }
> + }
> +}
> +
> +/**
> + * acpi_pptt_get_cpus_from_container() - Populate a cpumask with all CPUs
> in a
> + * processor containers
> + * @acpi_cpu_id: The UID of the processor container.
> + * @cpus The resulting CPU mask.
> + *
> + * Find the specified Processor Container, and fill @cpus with all the
> +cpus
> + * below it.
> + *
> + * Not all 'Processor' entries in the PPTT are either a CPU or a
> +Processor
> + * Container, they may exist purely to describe a Private resource.
> +CPUs
> + * have to be leaves, so a Processor Container is a non-leaf that has
> +the
> + * 'ACPI Processor ID valid' flag set.
> + *
> + * Return: 0 for a complete walk, or an error if the mask is incomplete.
> + */
> +int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus)
> +{
> + struct acpi_pptt_processor *cpu_node;
> + struct acpi_table_header *table_hdr;
> + struct acpi_subtable_header *entry;
> + bool leaf_flag, has_leaf_flag = false;
> + unsigned long table_end;
> + acpi_status status;
> + u32 proc_sz;
> + int ret = 0;
> +
> + cpumask_clear(cpus);
> +
> + status = acpi_get_table(ACPI_SIG_PPTT, 0, &table_hdr);
> + if (ACPI_FAILURE(status))
> + return 0;
If the PPTT table cannot be retrieved, should -ENODEV be returned?
> + if (table_hdr->revision > 1)
> + has_leaf_flag = true;
> +
> + table_end = (unsigned long)table_hdr + table_hdr->length;
> + entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr,
> + sizeof(struct acpi_table_pptt));
> + proc_sz = sizeof(struct acpi_pptt_processor);
> + while ((unsigned long)entry + proc_sz <= table_end) {
> + cpu_node = (struct acpi_pptt_processor *)entry;
> + if (entry->type == ACPI_PPTT_TYPE_PROCESSOR &&
> + cpu_node->flags &
> ACPI_PPTT_ACPI_PROCESSOR_ID_VALID) {
> + leaf_flag = cpu_node->flags &
> ACPI_PPTT_ACPI_LEAF_NODE;
> + if ((has_leaf_flag && !leaf_flag) ||
> + (!has_leaf_flag
> && !acpi_pptt_leaf_node(table_hdr, cpu_node))) {
> + if (cpu_node->acpi_processor_id ==
> acpi_cpu_id)
> + acpi_pptt_get_child_cpus(table_hdr,
> cpu_node, cpus);
> + }
> + }
> + entry = ACPI_ADD_PTR(struct acpi_subtable_header, entry,
> + entry->length);
> + }
> +
> + acpi_put_table(table_hdr);
> +
> + return ret;
Only 0 is returned here.
There is no action to be taken when the mask is incomplete.
Best regards,
Shaopeng TAN
> +}
> +
> static u8 acpi_cache_type(enum cache_type type) {
> switch (type) {
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h index
> f102c0fe3431..8c3165c2b083 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -1541,6 +1541,7 @@ int find_acpi_cpu_topology(unsigned int cpu, int
> level); int find_acpi_cpu_topology_cluster(unsigned int cpu); int
> find_acpi_cpu_topology_package(unsigned int cpu); int
> find_acpi_cpu_topology_hetero_id(unsigned int cpu);
> +int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t
> +*cpus);
> #else
> static inline int acpi_pptt_cpu_is_thread(unsigned int cpu) { @@ -1562,6
> +1563,11 @@ static inline int find_acpi_cpu_topology_hetero_id(unsigned int
> cpu) {
> return -EINVAL;
> }
> +static inline int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id,
> + cpumask_t *cpus)
> +{
> + return -EINVAL;
> +}
> #endif
>
> void acpi_arch_init(void);
> --
> 2.39.5
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 12/36] platform: arm64: Move ec devices to an ec subdirectory
2025-07-11 18:36 ` [RFC PATCH 12/36] platform: arm64: Move ec devices to an ec subdirectory James Morse
@ 2025-07-21 16:32 ` Jonathan Cameron
2025-08-06 18:03 ` James Morse
2025-07-24 10:56 ` Ben Horgan
1 sibling, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-07-21 16:32 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
On Fri, 11 Jul 2025 18:36:24 +0000
James Morse <james.morse@arm.com> wrote:
> commit 363c8aea257 "platform: Add ARM64 platform directory" added a
> subdirectory for arm64 platform devices, but claims that all such
> devices must be 'EC like'.
>
> The arm64 MPAM driver manages an MMIO interface that appears in memory
> controllers, caches, IOMMU and connection points on the interconnect.
> It doesn't fit into any existing subsystem.
>
> It would be convenient to use this subdirectory for drivers for other
> arm64 platform devices which aren't closely coupled to the architecture
> code and don't fit into any existing subsystem.
>
> Move the existing code and maintainer entries to be under
> drivers/platform/arm64/ec. The MPAM driver will be added under
> drivers/platform/arm64/mpam.
>
> Signed-off-by: James Morse <james.morse@arm.com>
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 4bac4ea21b64..bea01d413666 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3549,15 +3549,15 @@ S: Maintained
> F: arch/arm64/boot/Makefile
> F: scripts/make_fit.py
>
> -ARM64 PLATFORM DRIVERS
> -M: Hans de Goede <hansg@kernel.org>
> +ARM64 EC PLATFORM DRIVERS
> +M: Hans de Goede <hdegoede@redhat.com>
Smells like a rebase error, as Hans' email address changed
to the kernel.org one in the 6.16 cycle.
> M: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> R: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
> L: platform-driver-x86@vger.kernel.org
> S: Maintained
> Q: https://patchwork.kernel.org/project/platform-driver-x86/list/
> T: git git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git
> -F: drivers/platform/arm64/
> +F: drivers/platform/arm64/ec
Other than that looks sensible to me but obviously needs tags from Hans or Ilpo.
Jonathan
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container
2025-07-11 18:36 ` [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container James Morse
2025-07-17 7:58 ` Shaopeng Tan (Fujitsu)
@ 2025-07-22 14:28 ` Jonathan Cameron
2025-07-25 17:05 ` James Morse
2025-07-23 14:42 ` Ben Horgan
2 siblings, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-07-22 14:28 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Sudeep Holla
On Fri, 11 Jul 2025 18:36:17 +0000
James Morse <james.morse@arm.com> wrote:
> The PPTT describes CPUs and caches, as well as processor containers.
> The ACPI table for MPAM describes the set of CPUs that can access an MSC
> with the UID of a processor container.
>
> Add a helper to find the processor container by its id, then walk
> the possible CPUs to fill a cpumask with the CPUs that have this
> processor container as a parent.
>
> CC: Dave Martin <dave.martin@arm.com>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> +/**
> + * acpi_pptt_get_cpus_from_container() - Populate a cpumask with all CPUs in a
> + * processor containers
> + * @acpi_cpu_id: The UID of the processor container.
> + * @cpus The resulting CPU mask.
Missing colon.
Found with a W=1 build (it triggers a kernel-doc warning).
> + *
> + * Find the specified Processor Container, and fill @cpus with all the cpus
> + * below it.
> + *
> + * Not all 'Processor' entries in the PPTT are either a CPU or a Processor
> + * Container, they may exist purely to describe a Private resource. CPUs
> + * have to be leaves, so a Processor Container is a non-leaf that has the
> + * 'ACPI Processor ID valid' flag set.
> + *
> + * Return: 0 for a complete walk, or an error if the mask is incomplete.
> + */
> +int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus)
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 25/36] arm_mpam: Register and enable IRQs
2025-07-11 18:36 ` [RFC PATCH 25/36] arm_mpam: Register and enable IRQs James Morse
2025-07-16 7:31 ` Shaopeng Tan (Fujitsu)
2025-07-17 1:08 ` Shaopeng Tan (Fujitsu)
@ 2025-07-22 15:06 ` Jonathan Cameron
2025-08-08 7:11 ` James Morse
2025-07-28 10:49 ` Ben Horgan
2025-08-04 16:53 ` Fenghua Yu
4 siblings, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-07-22 15:06 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
On Fri, 11 Jul 2025 18:36:37 +0000
James Morse <james.morse@arm.com> wrote:
> Register and enable error IRQs. All the MPAM error interrupts indicate a
> software bug, e.g. out of range partid. If the error interrupt is ever
> signalled, attempt to disable MPAM.
>
> Only the irq handler accesses the ESR register, so no locking is needed.
> The work to disable MPAM after an error needs to happen at process
> context, use a threaded interrupt.
>
> There is no support for percpu threaded interrupts, for now schedule
> the work to be done from the irq handler.
>
> Enabling the IRQs in the MSC may involve cross calling to a CPU that
> can access the MSC.
>
> CC: Rohit Mathew <rohit.mathew@arm.com>
> Tested-by: Rohit Mathew <rohit.mathew@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
Sparse gives an imbalance warning in mpam_register_irqs()
> +static int mpam_register_irqs(void)
> +{
> + int err, irq, idx;
> + struct mpam_msc *msc;
> +
> + lockdep_assert_cpus_held();
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list, srcu_read_lock_held(&mpam_srcu)) {
> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
> + if (irq <= 0)
> + continue;
> +
> + /* The MPAM spec says the interrupt can be SPI, PPI or LPI */
> + /* We anticipate sharing the interrupt with other MSCs */
> + if (irq_is_percpu(irq)) {
> + err = request_percpu_irq(irq, &mpam_ppi_handler,
> + "mpam:msc:error",
> + msc->error_dev_id);
> + if (err)
> + return err;
Looks like the srcu_read_lock is still held.
There is a DEFINE_LOCK_GUARD_1() in srcu.h so you can do
guard(srcu)(&mpam_srcu);
I think and not worry about releasing it in errors or the good path.
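A sketch of that shape, with the per-MSC IRQ-request body elided behind a
made-up example_request_one_irq() helper; the SRCU read lock is then dropped
automatically on both the early return and the normal exit:

#include <linux/cleanup.h>
#include <linux/srcu.h>

static int mpam_register_irqs(void)
{
        struct mpam_msc *msc;
        int err;

        lockdep_assert_cpus_held();

        guard(srcu)(&mpam_srcu);
        list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list,
                                 srcu_read_lock_held(&mpam_srcu)) {
                err = example_request_one_irq(msc);     /* placeholder */
                if (err)
                        return err;     /* no explicit srcu_read_unlock() needed */
        }

        return 0;
}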
> +
> + msc->reenable_error_ppi = irq;
> + smp_call_function_many(&msc->accessibility,
> + &_enable_percpu_irq, &irq,
> + true);
> + } else {
> + err = devm_request_threaded_irq(&msc->pdev->dev, irq,
> + &mpam_spi_handler,
> + &mpam_disable_thread,
> + IRQF_SHARED,
> + "mpam:msc:error", msc);
> + if (err)
> + return err;
> + }
> +
> + msc->error_irq_requested = true;
> + mpam_touch_msc(msc, mpam_enable_msc_ecr, msc);
> + msc->error_irq_hw_enabled = true;
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +
> + return 0;
> +}
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container
2025-07-11 18:36 ` [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container James Morse
2025-07-17 7:58 ` Shaopeng Tan (Fujitsu)
2025-07-22 14:28 ` Jonathan Cameron
@ 2025-07-23 14:42 ` Ben Horgan
2025-07-25 17:05 ` James Morse
2 siblings, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-23 14:42 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Sudeep Holla
Hi James,
On 7/11/25 19:36, James Morse wrote:
> The PPTT describes CPUs and caches, as well as processor containers.
> The ACPI table for MPAM describes the set of CPUs that can access an MSC
> with the UID of a processor container.
>
> Add a helper to find the processor container by its id, then walk
> the possible CPUs to fill a cpumask with the CPUs that have this
> processor container as a parent.
>
> CC: Dave Martin <dave.martin@arm.com>
> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/acpi/pptt.c | 93 ++++++++++++++++++++++++++++++++++++++++++++
> include/linux/acpi.h | 6 +++
> 2 files changed, 99 insertions(+)
>
> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
> index 54676e3d82dd..13619b1b821b 100644
> --- a/drivers/acpi/pptt.c
> +++ b/drivers/acpi/pptt.c
> @@ -298,6 +298,99 @@ static struct acpi_pptt_processor *acpi_find_processor_node(struct acpi_table_he
> return NULL;
> }
>
> +/**
> + * acpi_pptt_get_child_cpus() - Find all the CPUs below a PPTT processor node
> + * @table_hdr: A reference to the PPTT table.
> + * @parent_node: A pointer to the processor node in the @table_hdr.
> + * @cpus: A cpumask to fill with the CPUs below @parent_node.
> + *
> + * Walks up the PPTT from every possible CPU to find if the provided
> + * @parent_node is a parent of this CPU.
> + */
> +static void acpi_pptt_get_child_cpus(struct acpi_table_header *table_hdr,
> + struct acpi_pptt_processor *parent_node,
> + cpumask_t *cpus)
> +{
> + struct acpi_pptt_processor *cpu_node;
> + u32 acpi_id;
> + int cpu;
> +
> + cpumask_clear(cpus);
> +
> + for_each_possible_cpu(cpu) {
> + acpi_id = get_acpi_id_for_cpu(cpu);
> + cpu_node = acpi_find_processor_node(table_hdr, acpi_id);
> +
> + while (cpu_node) {
> + if (cpu_node == parent_node) {
> + cpumask_set_cpu(cpu, cpus);
> + break;
> + }
> + cpu_node = fetch_pptt_node(table_hdr, cpu_node->parent);
> + }
> + }
> +}
> +
> +/**
> + * acpi_pptt_get_cpus_from_container() - Populate a cpumask with all CPUs in a
> + * processor containers
> + * @acpi_cpu_id: The UID of the processor container.
> + * @cpus The resulting CPU mask.
> + *
> + * Find the specified Processor Container, and fill @cpus with all the cpus
> + * below it.
> + *
> + * Not all 'Processor' entries in the PPTT are either a CPU or a Processor
> + * Container, they may exist purely to describe a Private resource. CPUs
> + * have to be leaves, so a Processor Container is a non-leaf that has the
> + * 'ACPI Processor ID valid' flag set.
> + *
> + * Return: 0 for a complete walk, or an error if the mask is incomplete.
> + */
> +int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus)
> +{
> + struct acpi_pptt_processor *cpu_node;
> + struct acpi_table_header *table_hdr;
> + struct acpi_subtable_header *entry;
> + bool leaf_flag, has_leaf_flag = false;
> + unsigned long table_end;
> + acpi_status status;
> + u32 proc_sz;
> + int ret = 0;
> +
> + cpumask_clear(cpus);
> +
> + status = acpi_get_table(ACPI_SIG_PPTT, 0, &table_hdr);
> + if (ACPI_FAILURE(status))
> + return 0;
> +
> + if (table_hdr->revision > 1)
> + has_leaf_flag = true;
> +
> + table_end = (unsigned long)table_hdr + table_hdr->length;
> + entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr,
> + sizeof(struct acpi_table_pptt));
> + proc_sz = sizeof(struct acpi_pptt_processor);
> + while ((unsigned long)entry + proc_sz <= table_end) {
> + cpu_node = (struct acpi_pptt_processor *)entry;
> + if (entry->type == ACPI_PPTT_TYPE_PROCESSOR &&
> + cpu_node->flags & ACPI_PPTT_ACPI_PROCESSOR_ID_VALID) {
> + leaf_flag = cpu_node->flags & ACPI_PPTT_ACPI_LEAF_NODE;
> + if ((has_leaf_flag && !leaf_flag) ||
> + (!has_leaf_flag && !acpi_pptt_leaf_node(table_hdr, cpu_node))) {
> + if (cpu_node->acpi_processor_id == acpi_cpu_id)
> + acpi_pptt_get_child_cpus(table_hdr, cpu_node, cpus);
> + }
acpi_pptt_leaf_node() returns early based on the leaf flag so you can
just rely on that here; remove has_leaf_flag and the corresponding extra
logic.
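i.e. something like (untested sketch):

	if (entry->type == ACPI_PPTT_TYPE_PROCESSOR &&
	    cpu_node->flags & ACPI_PPTT_ACPI_PROCESSOR_ID_VALID &&
	    !acpi_pptt_leaf_node(table_hdr, cpu_node) &&
	    cpu_node->acpi_processor_id == acpi_cpu_id)
		acpi_pptt_get_child_cpus(table_hdr, cpu_node, cpus);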
> + }
> + entry = ACPI_ADD_PTR(struct acpi_subtable_header, entry,
> + entry->length);
> + }
> +
> + acpi_put_table(table_hdr);
> +
> + return ret;
> +}
> +
> static u8 acpi_cache_type(enum cache_type type)
> {
> switch (type) {
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index f102c0fe3431..8c3165c2b083 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -1541,6 +1541,7 @@ int find_acpi_cpu_topology(unsigned int cpu, int level);
> int find_acpi_cpu_topology_cluster(unsigned int cpu);
> int find_acpi_cpu_topology_package(unsigned int cpu);
> int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
> +int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus);
> #else
> static inline int acpi_pptt_cpu_is_thread(unsigned int cpu)
> {
> @@ -1562,6 +1563,11 @@ static inline int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
> {
> return -EINVAL;
> }
> +static inline int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id,
> + cpumask_t *cpus)
> +{
> + return -EINVAL;
> +}
> #endif
>
> void acpi_arch_init(void);
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table
2025-07-16 17:07 ` Jonathan Cameron
@ 2025-07-23 16:39 ` Ben Horgan
2025-08-05 17:07 ` James Morse
2025-07-28 10:08 ` Jonathan Cameron
2025-08-05 17:07 ` James Morse
2 siblings, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-23 16:39 UTC (permalink / raw)
To: Jonathan Cameron, James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Rohit Mathew,
Shanker Donthineni, Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi, D Scott Phillips OS, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Koba Ko
Hi James, Jonathan,
On 7/16/25 18:07, Jonathan Cameron wrote:
> On Fri, 11 Jul 2025 18:36:22 +0000
> James Morse <james.morse@arm.com> wrote:
>
>> Add code to parse the arm64 specific MPAM table, looking up the cache
>> level from the PPTT and feeding the end result into the MPAM driver.
>
> Throw in a link to the spec perhaps? Particularly useful to know which
> version this was written against when reviewing it.
As I comment below, this code checks that the table revision is 1, so we
can assume it was written against version 2 of the spec. As of Monday
there is a new version hot off the press,
https://developer.arm.com/documentation/den0065/3-0bet/?lang=en which
introduces an "MMIO size" field to allow for disabled nodes. This should
be considered here to avoid advertising MSCs that aren't present.
>
>>
>> CC: Carl Worth <carl@os.amperecomputing.com>
>> Signed-off-by: James Morse <james.morse@arm.com>
>> ---
>> arch/arm64/Kconfig | 1 +
>> drivers/acpi/arm64/Kconfig | 3 +
>> drivers/acpi/arm64/Makefile | 1 +
>> drivers/acpi/arm64/mpam.c | 365 ++++++++++++++++++++++++++++++++++++
>> drivers/acpi/tables.c | 2 +-
>> include/linux/arm_mpam.h | 46 +++++
>> 6 files changed, 417 insertions(+), 1 deletion(-)
>> create mode 100644 drivers/acpi/arm64/mpam.c
>> create mode 100644 include/linux/arm_mpam.h
>>
>
>> diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
>> index 05ecde9eaabe..27b872249baa 100644
>> --- a/drivers/acpi/arm64/Makefile
>> +++ b/drivers/acpi/arm64/Makefile
>> @@ -6,5 +6,6 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
>> obj-$(CONFIG_ACPI_IORT) += iort.o
>> obj-$(CONFIG_ACPI_PROCESSOR_IDLE) += cpuidle.o
>> obj-$(CONFIG_ARM_AMBA) += amba.o
>> +obj-$(CONFIG_ACPI_MPAM) += mpam.o
>
> Keep it with the ACPI ones? There doesn't seem to be a lot of order in here
> though, so maybe there is some logic behind putting it here that I'm missing.
>
>> obj-y += dma.o init.o
>> obj-y += thermal_cpufreq.o
>> diff --git a/drivers/acpi/arm64/mpam.c b/drivers/acpi/arm64/mpam.c
>> new file mode 100644
>> index 000000000000..f4791bac9a2a
>> --- /dev/null
>> +++ b/drivers/acpi/arm64/mpam.c
>> @@ -0,0 +1,365 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (C) 2025 Arm Ltd.
>> +
>> +/* Parse the MPAM ACPI table feeding the discovered nodes into the driver */
>> +
>> +#define pr_fmt(fmt) "ACPI MPAM: " fmt
>> +
>> +#include <linux/acpi.h>
>> +#include <linux/arm_mpam.h>
>> +#include <linux/cpu.h>
>> +#include <linux/cpumask.h>
>> +#include <linux/platform_device.h>
>> +
>> +#include <acpi/processor.h>
>> +
>> +/* Flags for acpi_table_mpam_msc.*_interrupt_flags */
>
> References.. I'm looking at 3.0-alpha table 5 to check this.
> I can see why you might be reluctant to point at an alpha if that
> is what you are using ;)
>
>
>> +#define ACPI_MPAM_MSC_IRQ_MODE_EDGE 1
>> +#define ACPI_MPAM_MSC_IRQ_TYPE_MASK (3 << 1)
>
> GENMASK(2, 1) would be my preference for how to do masks in new code.
>
>> +#define ACPI_MPAM_MSC_IRQ_TYPE_WIRED 0
>> +#define ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER BIT(3)
>> +#define ACPI_MPAM_MSC_IRQ_AFFINITY_VALID BIT(4)
>> +
>> +static bool frob_irq(struct platform_device *pdev, int intid, u32 flags,
>> + int *irq, u32 processor_container_uid)
>> +{
>> + int sense;
>> +
>> + if (!intid)
>> + return false;
>> +
>> + /* 0 in this field indicates a wired interrupt */
>> + if (flags & ACPI_MPAM_MSC_IRQ_TYPE_MASK)
> I'd prefer more explicit code (and probably no comment)
>
> if (FIELD_GET(ACPI_MPAM_MSC_IRQ_TYPE_MASK, flags) !=
> ACPI_MPAM_MSC_IRQ_TYPE_WIRED)
> return false;
>
>> + return false;
>> +
>> + if (flags & ACPI_MPAM_MSC_IRQ_MODE_EDGE)
>> + sense = ACPI_EDGE_SENSITIVE;
>> + else
>> + sense = ACPI_LEVEL_SENSITIVE;
>
> If the spec is supposed to be using standard ACPI_* types for this field
> (I don't think the connection is explicitly documented though) then
>
> sense = FIELD_GET(ACPI_MPAM_MSC_IRQ_MODE_MASK, flags);
> Assuming a change to define the mask and rely on the ACPI defs for the values
>
> This one is entirely up to you.
>
>> +
>> + /*
>> + * If the GSI is in the GIC's PPI range, try and create a partitioned
>> + * percpu interrupt.
>> + */
>> + if (16 <= intid && intid < 32 && processor_container_uid != ~0) {
>> + pr_err_once("Partitioned interrupts not supported\n");
>> + return false;
>> + }
>> +
>> + *irq = acpi_register_gsi(&pdev->dev, intid, sense, ACPI_ACTIVE_HIGH);
>> + if (*irq <= 0) {
>> + pr_err_once("Failed to register interrupt 0x%x with ACPI\n",
>> + intid);
>> + return false;
>> + }
>> +
>> + return true;
>> +}
>> +
>> +static void acpi_mpam_parse_irqs(struct platform_device *pdev,
>> + struct acpi_mpam_msc_node *tbl_msc,
>> + struct resource *res, int *res_idx)
>> +{
>> + u32 flags, aff = ~0;
>> + int irq;
>> +
>> + flags = tbl_msc->overflow_interrupt_flags;
>> + if (flags & ACPI_MPAM_MSC_IRQ_AFFINITY_VALID &&
>> + flags & ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER)
>> + aff = tbl_msc->overflow_interrupt_affinity;
> Just to make the two cases look the same I'd do
>
> else
> aff = ~0;
>
> here as well and not initialize above. It's not quite worth using
> a helper function for these two identical blocks but it's close.
>
>> + if (frob_irq(pdev, tbl_msc->overflow_interrupt, flags, &irq, aff)) {
>> + res[*res_idx].start = irq;
>> + res[*res_idx].end = irq;
>> + res[*res_idx].flags = IORESOURCE_IRQ;
>> + res[*res_idx].name = "overflow";
>
> res[*res_idx] = DEFINE_RES_IRQ_NAMED(irq, "overflow");
>> +
>> + (*res_idx)++;
> Can roll this in as well.
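i.e. the two lines combined (sketch):

	res[(*res_idx)++] = DEFINE_RES_IRQ_NAMED(irq, "overflow");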
>
>> + }
>> +
>> + flags = tbl_msc->error_interrupt_flags;
>> + if (flags & ACPI_MPAM_MSC_IRQ_AFFINITY_VALID &&
>> + flags & ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER)
>> + aff = tbl_msc->error_interrupt_affinity;
>> + else
>> + aff = ~0;
>> + if (frob_irq(pdev, tbl_msc->error_interrupt, flags, &irq, aff)) {
>> + res[*res_idx].start = irq;
>> + res[*res_idx].end = irq;
>> + res[*res_idx].flags = IORESOURCE_IRQ;
>> + res[*res_idx].name = "error";
>
> Similar to above.
>
>> +
>> + (*res_idx)++;
>> + }
>> +}
>> +
>
>
>> +static bool __init parse_msc_pm_link(struct acpi_mpam_msc_node *tbl_msc,
>> + struct platform_device *pdev,
>> + u32 *acpi_id)
>> +{
>> + bool acpi_id_valid = false;
>> + struct acpi_device *buddy;
>> + char hid[16], uid[16];
>> + int err;
>> +
>> + memset(&hid, 0, sizeof(hid));
>> + memcpy(hid, &tbl_msc->hardware_id_linked_device,
>> + sizeof(tbl_msc->hardware_id_linked_device));
>> +
>> + if (!strcmp(hid, ACPI_PROCESSOR_CONTAINER_HID)) {
>> + *acpi_id = tbl_msc->instance_id_linked_device;
>> + acpi_id_valid = true;
>> + }
>> +
>> + err = snprintf(uid, sizeof(uid), "%u",
>> + tbl_msc->instance_id_linked_device);
>> + if (err < 0 || err >= sizeof(uid))
>
> Does snprintf() ever return < 0 ? It's documented as returning
> number of chars printed (without the NULL) so that can only be 0 or
> greater.
>
> Can it return >= sizeof(uid) ? Looks odd.
>
> + return acpi_id_valid;
>> +
>> + buddy = acpi_dev_get_first_match_dev(hid, uid, -1);
>> + if (buddy)
>> + device_link_add(&pdev->dev, &buddy->dev, DL_FLAG_STATELESS);
>> +
>> + return acpi_id_valid;
>> +}
>
>> +static int __init _parse_table(struct acpi_table_header *table)
>> +{
>> + char *table_end, *table_offset = (char *)(table + 1);
>> + struct property_entry props[4]; /* needs a sentinel */
>> + struct acpi_mpam_msc_node *tbl_msc;
>> + int next_res, next_prop, err = 0;
>> + struct acpi_device *companion;
>> + struct platform_device *pdev;
>> + enum mpam_msc_iface iface;
>> + struct resource res[3];
>> + char uid[16];
>> + u32 acpi_id;
>> +
>> + table_end = (char *)table + table->length;
>> +
>> + while (table_offset < table_end) {
>> + tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
>> + table_offset += tbl_msc->length;
>> +
>> + /*
>> + * If any of the reserved fields are set, make no attempt to
>> + * parse the msc structure. This will prevent the driver from
>> + * probing all the MSC, meaning it can't discover the system
>> + * wide supported partid and pmg ranges. This avoids whatever
>> + * this MSC is truncating the partids and creating a screaming
>> + * error interrupt.
>> + */
>> + if (tbl_msc->reserved || tbl_msc->reserved1 || tbl_msc->reserved2)
>> + continue;
>> +
>> + if (decode_interface_type(tbl_msc, &iface))
>> + continue;
>> +
>> + next_res = 0;
>> + next_prop = 0;
>> + memset(res, 0, sizeof(res));
>> + memset(props, 0, sizeof(props));
>> +
>> + pdev = platform_device_alloc("mpam_msc", tbl_msc->identifier);
>> + if (IS_ERR(pdev)) {
>
> returns NULL in at least some error cases (probably all, I'm just too lazy to check)
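e.g. (sketch):

	pdev = platform_device_alloc("mpam_msc", tbl_msc->identifier);
	if (!pdev) {
		err = -ENOMEM;
		break;
	}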
>
>> + err = PTR_ERR(pdev);
>> + break;
>> + }
>> +
>> + if (tbl_msc->length < sizeof(*tbl_msc)) {
>> + err = -EINVAL;
>> + break;
>> + }
>> +
>> + /* Some power management is described in the namespace: */
>> + err = snprintf(uid, sizeof(uid), "%u", tbl_msc->identifier);
>> + if (err > 0 && err < sizeof(uid)) {
>> + companion = acpi_dev_get_first_match_dev("ARMHAA5C", uid, -1);
>> + if (companion)
>> + ACPI_COMPANION_SET(&pdev->dev, companion);
>> + }
>> +
>> + if (iface == MPAM_IFACE_MMIO) {
>> + res[next_res].name = "MPAM:MSC";
>> + res[next_res].start = tbl_msc->base_address;
>> + res[next_res].end = tbl_msc->base_address + tbl_msc->mmio_size - 1;
>> + res[next_res].flags = IORESOURCE_MEM;
>> + next_res++;
>
> DEFINE_RES_MEM_NAMED()?
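e.g. (sketch):

	res[next_res++] = DEFINE_RES_MEM_NAMED(tbl_msc->base_address,
					       tbl_msc->mmio_size,
					       "MPAM:MSC");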
>
>> + } else if (iface == MPAM_IFACE_PCC) {
>> + props[next_prop++] = PROPERTY_ENTRY_U32("pcc-channel",
>> + tbl_msc->base_address);
>> + next_prop++;
>> + }
>> +
>> + acpi_mpam_parse_irqs(pdev, tbl_msc, res, &next_res);
>> + err = platform_device_add_resources(pdev, res, next_res);
>> + if (err)
>> + break;
>> +
>> + props[next_prop++] = PROPERTY_ENTRY_U32("arm,not-ready-us",
>> + tbl_msc->max_nrdy_usec);
>> +
>> + /*
>> + * The MSC's CPU affinity is described via its linked power
>> + * management device, but only if it points at a Processor or
>> + * Processor Container.
>> + */
>> + if (parse_msc_pm_link(tbl_msc, pdev, &acpi_id)) {
>> + props[next_prop++] = PROPERTY_ENTRY_U32("cpu_affinity",
>> + acpi_id);
>> + }
>> +
>> + err = device_create_managed_software_node(&pdev->dev, props,
>> + NULL);
>> + if (err)
>> + break;
>> +
>> + /* Come back later if you want the RIS too */
>> + err = platform_device_add_data(pdev, tbl_msc, tbl_msc->length);
>> + if (err)
>> + break;
>> +
>> + platform_device_add(pdev);
>
> Can fail.
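i.e. (sketch):

	err = platform_device_add(pdev);
	if (err)
		break;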
>
>> + }
>> +
>> + if (err)
>> + platform_device_put(pdev);
>> +
>> + return err;
>> +}
>> +
>> +static struct acpi_table_header *get_table(void)
>> +{
>> + struct acpi_table_header *table;
>> + acpi_status status;
>> +
>> + if (acpi_disabled || !system_supports_mpam())
>> + return NULL;
>> +
>> + status = acpi_get_table(ACPI_SIG_MPAM, 0, &table);
>> + if (ACPI_FAILURE(status))
>> + return NULL;
>> +
>> + if (table->revision != 1)
>> + return NULL;
Indicates that this was written against version 2 of the spec.
>> +
>> + return table;
>> +}
>> +
>> +static int __init acpi_mpam_parse(void)
>> +{
>> + struct acpi_table_header *mpam;
>> + int err;
>> +
>> + mpam = get_table();
>> + if (!mpam)
>> + return 0;
>
> Just what I was suggesting for the PPTT case in early patches. Nice :)
>
>> +
>> + err = _parse_table(mpam);
>> + acpi_put_table(mpam);
>> +
>> + return err;
>> +}
>> +
>> +static int _count_msc(struct acpi_table_header *table)
>> +{
>> + char *table_end, *table_offset = (char *)(table + 1);
>> + struct acpi_mpam_msc_node *tbl_msc;
>> + int ret = 0;
>
> Call it count as it only ever contains the count?
>
>> +
>> + tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
>> + table_end = (char *)table + table->length;
>> +
>> + while (table_offset < table_end) {
>> + if (tbl_msc->length < sizeof(*tbl_msc))
>> + return -EINVAL;
>> +
>> + ret++;
>
> count++ would feel more natural here.
>
>> +
>> + table_offset += tbl_msc->length;
>> + tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
>> + }
>> +
>> + return ret;
>> +}
>
> That's all I have time for today. Will get to the rest of the series soonish.
>
> Jonathan
>
>
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table
2025-07-11 18:36 ` [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table James Morse
2025-07-16 17:07 ` Jonathan Cameron
@ 2025-07-24 10:50 ` Ben Horgan
2025-08-05 17:08 ` James Morse
1 sibling, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-24 10:50 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 11/07/2025 19:36, James Morse wrote:
> Add code to parse the arm64 specific MPAM table, looking up the cache
> level from the PPTT and feeding the end result into the MPAM driver.
>
> CC: Carl Worth <carl@os.amperecomputing.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> arch/arm64/Kconfig | 1 +
> drivers/acpi/arm64/Kconfig | 3 +
> drivers/acpi/arm64/Makefile | 1 +
> drivers/acpi/arm64/mpam.c | 365 ++++++++++++++++++++++++++++++++++++
> drivers/acpi/tables.c | 2 +-
> include/linux/arm_mpam.h | 46 +++++
> 6 files changed, 417 insertions(+), 1 deletion(-)
> create mode 100644 drivers/acpi/arm64/mpam.c
> create mode 100644 include/linux/arm_mpam.h
[snip]
> diff --git a/drivers/acpi/arm64/mpam.c b/drivers/acpi/arm64/mpam.c
> new file mode 100644
> index 000000000000..f4791bac9a2a
> --- /dev/null
> +++ b/drivers/acpi/arm64/mpam.c
> @@ -0,0 +1,365 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (C) 2025 Arm Ltd.
> +
> +/* Parse the MPAM ACPI table feeding the discovered nodes into the driver */
> +
> +#define pr_fmt(fmt) "ACPI MPAM: " fmt
> +
> +#include <linux/acpi.h>
> +#include <linux/arm_mpam.h>
> +#include <linux/cpu.h>
> +#include <linux/cpumask.h>
> +#include <linux/platform_device.h>
> +
> +#include <acpi/processor.h>
> +
> +/* Flags for acpi_table_mpam_msc.*_interrupt_flags */
> +#define ACPI_MPAM_MSC_IRQ_MODE_EDGE 1
> +#define ACPI_MPAM_MSC_IRQ_TYPE_MASK (3 << 1)
> +#define ACPI_MPAM_MSC_IRQ_TYPE_WIRED 0
> +#define ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER BIT(3)
> +#define ACPI_MPAM_MSC_IRQ_AFFINITY_VALID BIT(4)
> +
> +static bool frob_irq(struct platform_device *pdev, int intid, u32 flags,
> + int *irq, u32 processor_container_uid)
> +{
> + int sense;
> +
> + if (!intid)
> + return false;
> +
> + /* 0 in this field indicates a wired interrupt */
> + if (flags & ACPI_MPAM_MSC_IRQ_TYPE_MASK)
> + return false;
> +
> + if (flags & ACPI_MPAM_MSC_IRQ_MODE_EDGE)
> + sense = ACPI_EDGE_SENSITIVE;
> + else
> + sense = ACPI_LEVEL_SENSITIVE;
> +
> + /*
> + * If the GSI is in the GIC's PPI range, try and create a partitioned
> + * percpu interrupt.
> + */
> + if (16 <= intid && intid < 32 && processor_container_uid != ~0) {
> + pr_err_once("Partitioned interrupts not supported\n");
> + return false;
> + }
> +
> + *irq = acpi_register_gsi(&pdev->dev, intid, sense, ACPI_ACTIVE_HIGH);
> + if (*irq <= 0) {
> + pr_err_once("Failed to register interrupt 0x%x with ACPI\n",
> + intid);
> + return false;
> + }
> +
> + return true;
> +}
> +
> +static void acpi_mpam_parse_irqs(struct platform_device *pdev,
> + struct acpi_mpam_msc_node *tbl_msc,
> + struct resource *res, int *res_idx)
> +{
> + u32 flags, aff = ~0;
> + int irq;
> +
> + flags = tbl_msc->overflow_interrupt_flags;
> + if (flags & ACPI_MPAM_MSC_IRQ_AFFINITY_VALID &&
> + flags & ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER)
> + aff = tbl_msc->overflow_interrupt_affinity;
> + if (frob_irq(pdev, tbl_msc->overflow_interrupt, flags, &irq, aff)) {
> + res[*res_idx].start = irq;
> + res[*res_idx].end = irq;
> + res[*res_idx].flags = IORESOURCE_IRQ;
> + res[*res_idx].name = "overflow";
> +
> + (*res_idx)++;
> + }
> +
> + flags = tbl_msc->error_interrupt_flags;
> + if (flags & ACPI_MPAM_MSC_IRQ_AFFINITY_VALID &&
> + flags & ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER)
> + aff = tbl_msc->error_interrupt_affinity;
> + else
> + aff = ~0;
> + if (frob_irq(pdev, tbl_msc->error_interrupt, flags, &irq, aff)) {
> + res[*res_idx].start = irq;
> + res[*res_idx].end = irq;
> + res[*res_idx].flags = IORESOURCE_IRQ;
> + res[*res_idx].name = "error";
> +
> + (*res_idx)++;
> + }
> +}
> +
> +static int acpi_mpam_parse_resource(struct mpam_msc *msc,
> + struct acpi_mpam_resource_node *res)
> +{
> + int level, nid;
> + u32 cache_id;
> +
> + switch (res->locator_type) {
> + case ACPI_MPAM_LOCATION_TYPE_PROCESSOR_CACHE:
> + cache_id = res->locator.cache_locator.cache_reference;
> + level = find_acpi_cache_level_from_id(cache_id);
> + if (level < 0) {
> + pr_err_once("Bad level (%u) for cache with id %u\n", level, cache_id);
> + return -EINVAL;
Nit: More robust to check for level <= 0.
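i.e. (sketch; %d rather than %u as level is signed):

	if (level <= 0) {
		pr_err_once("Bad level (%d) for cache with id %u\n",
			    level, cache_id);
		return -EINVAL;
	}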
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 12/36] platform: arm64: Move ec devices to an ec subdirectory
2025-07-11 18:36 ` [RFC PATCH 12/36] platform: arm64: Move ec devices to an ec subdirectory James Morse
2025-07-21 16:32 ` Jonathan Cameron
@ 2025-07-24 10:56 ` Ben Horgan
2025-08-06 18:03 ` James Morse
1 sibling, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-24 10:56 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 11/07/2025 19:36, James Morse wrote:
> commit 363c8aea257 "platform: Add ARM64 platform directory" added a
> subdirectory for arm64 platform devices, but claims that all such
> devices must be 'EC like'.
>
> The arm64 MPAM driver manages an MMIO interface that appears in memory
> controllers, caches, IOMMU and connection points on the interconnect.
> It doesn't fit into any existing subsystem.
>
> It would be convenient to use this subdirectory for drivers for other
> arm64 platform devices which aren't closely coupled to the architecture
> code and don't fit into any existing subsystem.
>
> Move the existing code and maintainer entries to be under
> drivers/platform/arm64/ec. The MPAM driver will be added under
> drivers/platform/arm64/mpam.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> MAINTAINERS | 6 +-
> drivers/platform/arm64/Kconfig | 72 +-----------------
> drivers/platform/arm64/Makefile | 9 +--
> drivers/platform/arm64/ec/Kconfig | 73 +++++++++++++++++++
> drivers/platform/arm64/ec/Makefile | 10 +++
> .../platform/arm64/{ => ec}/acer-aspire1-ec.c | 0
> .../arm64/{ => ec}/huawei-gaokun-ec.c | 0
> .../arm64/{ => ec}/lenovo-yoga-c630.c | 0
> 8 files changed, 88 insertions(+), 82 deletions(-)
> create mode 100644 drivers/platform/arm64/ec/Kconfig
> create mode 100644 drivers/platform/arm64/ec/Makefile
> rename drivers/platform/arm64/{ => ec}/acer-aspire1-ec.c (100%)
> rename drivers/platform/arm64/{ => ec}/huawei-gaokun-ec.c (100%)
> rename drivers/platform/arm64/{ => ec}/lenovo-yoga-c630.c (100%)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 4bac4ea21b64..bea01d413666 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -3549,15 +3549,15 @@ S: Maintained
> F: arch/arm64/boot/Makefile
> F: scripts/make_fit.py
>
> -ARM64 PLATFORM DRIVERS
> -M: Hans de Goede <hansg@kernel.org>
> +ARM64 EC PLATFORM DRIVERS
> +M: Hans de Goede <hdegoede@redhat.com>
> M: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> R: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
> L: platform-driver-x86@vger.kernel.org
> S: Maintained
> Q: https://patchwork.kernel.org/project/platform-driver-x86/list/
> T: git git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git
> -F: drivers/platform/arm64/
> +F: drivers/platform/arm64/ec
>
> ARM64 PORT (AARCH64 ARCHITECTURE)
> M: Catalin Marinas <catalin.marinas@arm.com>
> diff --git a/drivers/platform/arm64/Kconfig b/drivers/platform/arm64/Kconfig
> index 06288aebc559..1eb8ab0855e5 100644
> --- a/drivers/platform/arm64/Kconfig
> +++ b/drivers/platform/arm64/Kconfig
> @@ -1,73 +1,3 @@
> # SPDX-License-Identifier: GPL-2.0-only
> -#
> -# EC-like Drivers for aarch64 based devices.
> -#
>
> -menuconfig ARM64_PLATFORM_DEVICES
> - bool "ARM64 Platform-Specific Device Drivers"
> - depends on ARM64 || COMPILE_TEST
> - default ARM64
> - help
> - Say Y here to get to see options for platform-specific device drivers
> - for arm64 based devices, primarily EC-like device drivers.
> - This option alone does not add any kernel code.
> -
> - If you say N, all options in this submenu will be skipped and disabled.
> -
> -if ARM64_PLATFORM_DEVICES
> -
> -config EC_ACER_ASPIRE1
> - tristate "Acer Aspire 1 Embedded Controller driver"
> - depends on ARCH_QCOM || COMPILE_TEST
> - depends on I2C
> - depends on DRM
> - depends on POWER_SUPPLY
> - depends on INPUT
> - help
> - Say Y here to enable the EC driver for the (Snapdragon-based)
> - Acer Aspire 1 laptop. The EC handles battery and charging
> - monitoring as well as some misc functions like the lid sensor
> - and USB Type-C DP HPD events.
> -
> - This driver provides battery and AC status support for the mentioned
> - laptop where this information is not properly exposed via the
> - standard ACPI devices.
> -
> -config EC_HUAWEI_GAOKUN
> - tristate "Huawei Matebook E Go Embedded Controller driver"
> - depends on ARCH_QCOM || COMPILE_TEST
> - depends on I2C
> - depends on INPUT
> - depends on HWMON
> - select AUXILIARY_BUS
> -
> - help
> - Say Y here to enable the EC driver for the Huawei Matebook E Go
> - which is a sc8280xp-based 2-in-1 tablet. The driver handles battery
> - (information, charge control) and USB Type-C DP HPD events as well
> - as some misc functions like the lid sensor and temperature sensors,
> - etc.
> -
> - This driver provides battery and AC status support for the mentioned
> - laptop where this information is not properly exposed via the
> - standard ACPI devices.
> -
> - Say M or Y here to include this support.
> -
> -config EC_LENOVO_YOGA_C630
> - tristate "Lenovo Yoga C630 Embedded Controller driver"
> - depends on ARCH_QCOM || COMPILE_TEST
> - depends on I2C
> - select AUXILIARY_BUS
> - help
> - Driver for the Embedded Controller in the Qualcomm Snapdragon-based
> - Lenovo Yoga C630, which provides battery and power adapter
> - information.
> -
> - This driver provides battery and AC status support for the mentioned
> - laptop where this information is not properly exposed via the
> - standard ACPI devices.
> -
> - Say M or Y here to include this support.
> -
> -endif # ARM64_PLATFORM_DEVICES
> +source "drivers/platform/arm64/ec/Kconfig"
> diff --git a/drivers/platform/arm64/Makefile b/drivers/platform/arm64/Makefile
> index 46a99eba3264..ce840a8cf8cc 100644
> --- a/drivers/platform/arm64/Makefile
> +++ b/drivers/platform/arm64/Makefile
> @@ -1,10 +1,3 @@
> # SPDX-License-Identifier: GPL-2.0-only
> -#
> -# Makefile for linux/drivers/platform/arm64
> -#
> -# This dir should only include drivers for EC-like devices.
> -#
>
> -obj-$(CONFIG_EC_ACER_ASPIRE1) += acer-aspire1-ec.o
> -obj-$(CONFIG_EC_HUAWEI_GAOKUN) += huawei-gaokun-ec.o
> -obj-$(CONFIG_EC_LENOVO_YOGA_C630) += lenovo-yoga-c630.o
> +obj-y += ec/
> diff --git a/drivers/platform/arm64/ec/Kconfig b/drivers/platform/arm64/ec/Kconfig
> new file mode 100644
> index 000000000000..06288aebc559
> --- /dev/null
> +++ b/drivers/platform/arm64/ec/Kconfig
> @@ -0,0 +1,73 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# EC-like Drivers for aarch64 based devices.
> +#
> +
> +menuconfig ARM64_PLATFORM_DEVICES
> + bool "ARM64 Platform-Specific Device Drivers"
> + depends on ARM64 || COMPILE_TEST
> + default ARM64
> + help
> + Say Y here to get to see options for platform-specific device drivers
> + for arm64 based devices, primarily EC-like device drivers.
> + This option alone does not add any kernel code.
> +
> + If you say N, all options in this submenu will be skipped and disabled.
> +
> +if ARM64_PLATFORM_DEVICES
Shouldn't this be kept in the directory above? By the description this
would be expected to apply to all drivers in drivers/platform/arm64.
> +
> +config EC_ACER_ASPIRE1
> + tristate "Acer Aspire 1 Embedded Controller driver"
> + depends on ARCH_QCOM || COMPILE_TEST
> + depends on I2C
> + depends on DRM
> + depends on POWER_SUPPLY
> + depends on INPUT
> + help
> + Say Y here to enable the EC driver for the (Snapdragon-based)
> + Acer Aspire 1 laptop. The EC handles battery and charging
> + monitoring as well as some misc functions like the lid sensor
> + and USB Type-C DP HPD events.
> +
> + This driver provides battery and AC status support for the mentioned
> + laptop where this information is not properly exposed via the
> + standard ACPI devices.
> +
> +config EC_HUAWEI_GAOKUN
> + tristate "Huawei Matebook E Go Embedded Controller driver"
> + depends on ARCH_QCOM || COMPILE_TEST
> + depends on I2C
> + depends on INPUT
> + depends on HWMON
> + select AUXILIARY_BUS
> +
> + help
> + Say Y here to enable the EC driver for the Huawei Matebook E Go
> + which is a sc8280xp-based 2-in-1 tablet. The driver handles battery
> + (information, charge control) and USB Type-C DP HPD events as well
> + as some misc functions like the lid sensor and temperature sensors,
> + etc.
> +
> + This driver provides battery and AC status support for the mentioned
> + laptop where this information is not properly exposed via the
> + standard ACPI devices.
> +
> + Say M or Y here to include this support.
> +
> +config EC_LENOVO_YOGA_C630
> + tristate "Lenovo Yoga C630 Embedded Controller driver"
> + depends on ARCH_QCOM || COMPILE_TEST
> + depends on I2C
> + select AUXILIARY_BUS
> + help
> + Driver for the Embedded Controller in the Qualcomm Snapdragon-based
> + Lenovo Yoga C630, which provides battery and power adapter
> + information.
> +
> + This driver provides battery and AC status support for the mentioned
> + laptop where this information is not properly exposed via the
> + standard ACPI devices.
> +
> + Say M or Y here to include this support.
> +
> +endif # ARM64_PLATFORM_DEVICES
> diff --git a/drivers/platform/arm64/ec/Makefile b/drivers/platform/arm64/ec/Makefile
> new file mode 100644
> index 000000000000..b3a7c4096f08
> --- /dev/null
> +++ b/drivers/platform/arm64/ec/Makefile
> @@ -0,0 +1,10 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# Makefile for linux/drivers/platform/arm64/ec
> +#
> +# This dir should only include drivers for EC-like devices.
> +#
> +
> +obj-$(CONFIG_EC_ACER_ASPIRE1) += acer-aspire1-ec.o
> +obj-$(CONFIG_EC_HUAWEI_GAOKUN) += huawei-gaokun-ec.o
> +obj-$(CONFIG_EC_LENOVO_YOGA_C630) += lenovo-yoga-c630.o
> diff --git a/drivers/platform/arm64/acer-aspire1-ec.c b/drivers/platform/arm64/ec/acer-aspire1-ec.c
> similarity index 100%
> rename from drivers/platform/arm64/acer-aspire1-ec.c
> rename to drivers/platform/arm64/ec/acer-aspire1-ec.c
> diff --git a/drivers/platform/arm64/huawei-gaokun-ec.c b/drivers/platform/arm64/ec/huawei-gaokun-ec.c
> similarity index 100%
> rename from drivers/platform/arm64/huawei-gaokun-ec.c
> rename to drivers/platform/arm64/ec/huawei-gaokun-ec.c
> diff --git a/drivers/platform/arm64/lenovo-yoga-c630.c b/drivers/platform/arm64/ec/lenovo-yoga-c630.c
> similarity index 100%
> rename from drivers/platform/arm64/lenovo-yoga-c630.c
> rename to drivers/platform/arm64/ec/lenovo-yoga-c630.c
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 13/36] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate
2025-07-11 18:36 ` [RFC PATCH 13/36] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate James Morse
@ 2025-07-24 11:02 ` Ben Horgan
2025-08-06 18:03 ` James Morse
2025-07-24 12:09 ` Catalin Marinas
1 sibling, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-24 11:02 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 11/07/2025 19:36, James Morse wrote:
> Probing MPAM is convoluted. MSCs that are integrated with a CPU may
> only be accessible from those CPUs, and they may not be online.
> Touching the hardware early is pointless as MPAM can't be used until
> the system-wide common values for num_partid and num_pmg have been
> discovered.
>
> Start with driver probe/remove and mapping the MSC.
>
> CC: Carl Worth <carl@os.amperecomputing.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> arch/arm64/Kconfig | 1 +
> drivers/platform/arm64/Kconfig | 1 +
> drivers/platform/arm64/Makefile | 1 +
> drivers/platform/arm64/mpam/Kconfig | 10 +
> drivers/platform/arm64/mpam/Makefile | 4 +
> drivers/platform/arm64/mpam/mpam_devices.c | 336 ++++++++++++++++++++
> drivers/platform/arm64/mpam/mpam_internal.h | 62 ++++
> 7 files changed, 415 insertions(+)
> create mode 100644 drivers/platform/arm64/mpam/Kconfig
> create mode 100644 drivers/platform/arm64/mpam/Makefile
> create mode 100644 drivers/platform/arm64/mpam/mpam_devices.c
> create mode 100644 drivers/platform/arm64/mpam/mpam_internal.h
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index ad9a49a39e41..8abce7f4eb1e 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2060,6 +2060,7 @@ config ARM64_TLB_RANGE
>
> config ARM64_MPAM
> bool "Enable support for MPAM"
> + select ARM64_MPAM_DRIVER
> select ACPI_MPAM if ACPI
> help
> Memory Partitioning and Monitoring is an optional extension
> diff --git a/drivers/platform/arm64/Kconfig b/drivers/platform/arm64/Kconfig
> index 1eb8ab0855e5..16a927cf6ea2 100644
> --- a/drivers/platform/arm64/Kconfig
> +++ b/drivers/platform/arm64/Kconfig
> @@ -1,3 +1,4 @@
> # SPDX-License-Identifier: GPL-2.0-only
>
> source "drivers/platform/arm64/ec/Kconfig"
> +source "drivers/platform/arm64/mpam/Kconfig"
> diff --git a/drivers/platform/arm64/Makefile b/drivers/platform/arm64/Makefile
> index ce840a8cf8cc..c6ec3bc6a100 100644
> --- a/drivers/platform/arm64/Makefile
> +++ b/drivers/platform/arm64/Makefile
> @@ -1,3 +1,4 @@
> # SPDX-License-Identifier: GPL-2.0-only
>
> obj-y += ec/
> +obj-y += mpam/
> diff --git a/drivers/platform/arm64/mpam/Kconfig b/drivers/platform/arm64/mpam/Kconfig
> new file mode 100644
> index 000000000000..b63495d7da87
> --- /dev/null
> +++ b/drivers/platform/arm64/mpam/Kconfig
> @@ -0,0 +1,10 @@
> +# Confusingly, this is everything but the CPU bits of MPAM. CPU here means
> +# CPU resources, not containers or cgroups etc.
> +config ARM_CPU_RESCTRL
> + bool
> + depends on ARM64
> +
> +config ARM64_MPAM_DRIVER_DEBUG
> + bool "Enable debug messages from the MPAM driver."
> + help
> + Say yes here to enable debug messages from the MPAM driver.
> diff --git a/drivers/platform/arm64/mpam/Makefile b/drivers/platform/arm64/mpam/Makefile
> new file mode 100644
> index 000000000000..4255975c7724
> --- /dev/null
> +++ b/drivers/platform/arm64/mpam/Makefile
> @@ -0,0 +1,4 @@
> +obj-$(CONFIG_ARM64_MPAM) += mpam.o
> +mpam-y += mpam_devices.o
> +
> +cflags-$(CONFIG_ARM64_MPAM_DRIVER_DEBUG) += -DDEBUG
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> new file mode 100644
> index 000000000000..5b886ba54ba8
> --- /dev/null
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -0,0 +1,336 @@
> +// SPDX-License-Identifier: GPL-2.0
> +// Copyright (C) 2025 Arm Ltd.
> +
> +#define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__
> +
> +#include <linux/acpi.h>
> +#include <linux/arm_mpam.h>
> +#include <linux/cacheinfo.h>
> +#include <linux/cpu.h>
> +#include <linux/cpumask.h>
> +#include <linux/device.h>
> +#include <linux/errno.h>
> +#include <linux/gfp.h>
> +#include <linux/list.h>
> +#include <linux/lockdep.h>
> +#include <linux/mutex.h>
> +#include <linux/of.h>
> +#include <linux/of_platform.h>
> +#include <linux/platform_device.h>
> +#include <linux/printk.h>
> +#include <linux/slab.h>
> +#include <linux/spinlock.h>
> +#include <linux/srcu.h>
> +#include <linux/types.h>
> +
> +#include <acpi/pcc.h>
> +
> +#include "mpam_internal.h"
> +
> +/*
> + * mpam_list_lock protects the SRCU lists when writing. Once the
> + * mpam_enabled key is enabled these lists are read-only,
> + * unless the error interrupt disables the driver.
> + */
> +static DEFINE_MUTEX(mpam_list_lock);
> +static LIST_HEAD(mpam_all_msc);
> +
> +static struct srcu_struct mpam_srcu;
> +
> +/* MPAM isn't available until all the MSC have been probed. */
> +static u32 mpam_num_msc;
> +
> +static void mpam_discovery_complete(void)
> +{
> + pr_err("Discovered all MSC\n");
> +}
> +
> +static int mpam_dt_count_msc(void)
> +{
> + int count = 0;
> + struct device_node *np;
> +
> + for_each_compatible_node(np, NULL, "arm,mpam-msc")
This will count even 'status = "disabled"' nodes. Add a check for that:
	if (of_device_is_available(np))
> + count++;
> +
> + return count;
> +}
> +
> +static int mpam_dt_parse_resource(struct mpam_msc *msc, struct device_node *np,
> + u32 ris_idx)
> +{
> + int err = 0;
> + u32 level = 0;
> + unsigned long cache_id;
> + struct device_node *cache;
> +
> + do {
> + if (of_device_is_compatible(np, "arm,mpam-cache")) {
> + cache = of_parse_phandle(np, "arm,mpam-device", 0);
> + if (!cache) {
> + pr_err("Failed to read phandle\n");
> + break;
> + }
> + } else if (of_device_is_compatible(np->parent, "cache")) {
> + cache = of_node_get(np->parent);
> + } else {
> + /* For now, only caches are supported */
> + cache = NULL;
> + break;
> + }
> +
> + err = of_property_read_u32(cache, "cache-level", &level);
> + if (err) {
> + pr_err("Failed to read cache-level\n");
> + break;
> + }
> +
> + cache_id = cache_of_calculate_id(cache);
> + if (cache_id == ~0UL) {
> + err = -ENOENT;
> + break;
> + }
> +
> + err = mpam_ris_create(msc, ris_idx, MPAM_CLASS_CACHE, level,
> + cache_id);
> + } while (0);
> + of_node_put(cache);
> +
> + return err;
> +}
> +
> +static int mpam_dt_parse_resources(struct mpam_msc *msc, void *ignored)
> +{
> + int err, num_ris = 0;
> + const u32 *ris_idx_p;
> + struct device_node *iter, *np;
> +
> + np = msc->pdev->dev.of_node;
> + for_each_child_of_node(np, iter) {
> + ris_idx_p = of_get_property(iter, "reg", NULL);
> + if (ris_idx_p) {
> + num_ris++;
> + err = mpam_dt_parse_resource(msc, iter, *ris_idx_p);
> + if (err) {
> + of_node_put(iter);
> + return err;
> + }
> + }
> + }
> +
> + if (!num_ris)
> + mpam_dt_parse_resource(msc, np, 0);
> +
> + return err;
> +}
> +
> +/*
> + * An MSC can control traffic from a set of CPUs, but may only be accessible
> + * from a (hopefully wider) set of CPUs. The common reason for this is power
> + * management. If all the CPUs in a cluster are in PSCI:CPU_SUSPEND, the
> + * the corresponding cache may also be powered off. By making accesses from
> + * one of those CPUs, we ensure this isn't the case.
> + */
> +static int update_msc_accessibility(struct mpam_msc *msc)
> +{
> + struct device_node *parent;
> + u32 affinity_id;
> + int err;
> +
> + if (!acpi_disabled) {
> + err = device_property_read_u32(&msc->pdev->dev, "cpu_affinity",
> + &affinity_id);
> + if (err) {
> + cpumask_copy(&msc->accessibility, cpu_possible_mask);
> + err = 0;
> + } else {
> + err = acpi_pptt_get_cpus_from_container(affinity_id,
> + &msc->accessibility);
> + }
> +
> + return err;
> + }
> +
> + /* This depends on the path to of_node */
> + parent = of_get_parent(msc->pdev->dev.of_node);
> + if (parent == of_root) {
> + cpumask_copy(&msc->accessibility, cpu_possible_mask);
> + err = 0;
> + } else {
> + err = -EINVAL;
> + pr_err("Cannot determine accessibility of MSC: %s\n",
> + dev_name(&msc->pdev->dev));
> + }
> + of_node_put(parent);
> +
> + return err;
> +}
> +
> +static int fw_num_msc;
> +
> +static void mpam_pcc_rx_callback(struct mbox_client *cl, void *msg)
> +{
> + /* TODO: wake up tasks blocked on this MSC's PCC channel */
> +}
> +
> +static void mpam_msc_drv_remove(struct platform_device *pdev)
> +{
> + struct mpam_msc *msc = platform_get_drvdata(pdev);
> +
> + if (!msc)
> + return;
> +
> + mutex_lock(&mpam_list_lock);
> + mpam_num_msc--;
> + platform_set_drvdata(pdev, NULL);
> + list_del_rcu(&msc->glbl_list);
> + synchronize_srcu(&mpam_srcu);
> + devm_kfree(&pdev->dev, msc);
> + mutex_unlock(&mpam_list_lock);
> +}
> +
> +static int mpam_msc_drv_probe(struct platform_device *pdev)
> +{
> + int err;
> + struct mpam_msc *msc;
> + struct resource *msc_res;
> + void *plat_data = pdev->dev.platform_data;
> +
> + mutex_lock(&mpam_list_lock);
> + do {
> + msc = devm_kzalloc(&pdev->dev, sizeof(*msc), GFP_KERNEL);
> + if (!msc) {
> + err = -ENOMEM;
> + break;
> + }
> +
> + mutex_init(&msc->probe_lock);
> + mutex_init(&msc->part_sel_lock);
> + mutex_init(&msc->outer_mon_sel_lock);
> + raw_spin_lock_init(&msc->inner_mon_sel_lock);
> + msc->id = mpam_num_msc++;
> + msc->pdev = pdev;
> + INIT_LIST_HEAD_RCU(&msc->glbl_list);
> + INIT_LIST_HEAD_RCU(&msc->ris);
> +
> + err = update_msc_accessibility(msc);
> + if (err)
> + break;
> + if (cpumask_empty(&msc->accessibility)) {
> + pr_err_once("msc:%u is not accessible from any CPU!",
> + msc->id);
> + err = -EINVAL;
> + break;
> + }
> +
> + if (device_property_read_u32(&pdev->dev, "pcc-channel",
> + &msc->pcc_subspace_id))
> + msc->iface = MPAM_IFACE_MMIO;
> + else
> + msc->iface = MPAM_IFACE_PCC;
> +
> + if (msc->iface == MPAM_IFACE_MMIO) {
> + void __iomem *io;
> +
> + io = devm_platform_get_and_ioremap_resource(pdev, 0,
> + &msc_res);
> + if (IS_ERR(io)) {
> + pr_err("Failed to map MSC base address\n");
> + err = PTR_ERR(io);
> + break;
> + }
> + msc->mapped_hwpage_sz = msc_res->end - msc_res->start;
> + msc->mapped_hwpage = io;
> + } else if (msc->iface == MPAM_IFACE_PCC) {
> + msc->pcc_cl.dev = &pdev->dev;
> + msc->pcc_cl.rx_callback = mpam_pcc_rx_callback;
> + msc->pcc_cl.tx_block = false;
> + msc->pcc_cl.tx_tout = 1000; /* 1s */
> + msc->pcc_cl.knows_txdone = false;
> +
> + msc->pcc_chan = pcc_mbox_request_channel(&msc->pcc_cl,
> + msc->pcc_subspace_id);
> + if (IS_ERR(msc->pcc_chan)) {
> + pr_err("Failed to request MSC PCC channel\n");
> + err = PTR_ERR(msc->pcc_chan);
> + break;
> + }
> + }
> +
> + list_add_rcu(&msc->glbl_list, &mpam_all_msc);
> + platform_set_drvdata(pdev, msc);
> + } while (0);
> + mutex_unlock(&mpam_list_lock);
> +
> + if (!err) {
> + /* Create RIS entries described by firmware */
> + if (!acpi_disabled)
> + err = acpi_mpam_parse_resources(msc, plat_data);
> + else
> + err = mpam_dt_parse_resources(msc, plat_data);
> + }
> +
> + if (!err && fw_num_msc == mpam_num_msc)
> + mpam_discovery_complete();
> +
> + if (err && msc)
> + mpam_msc_drv_remove(pdev);
> +
> + return err;
> +}
> +
> +static const struct of_device_id mpam_of_match[] = {
> + { .compatible = "arm,mpam-msc", },
> + {},
> +};
> +MODULE_DEVICE_TABLE(of, mpam_of_match);
> +
> +static struct platform_driver mpam_msc_driver = {
> + .driver = {
> + .name = "mpam_msc",
> + .of_match_table = of_match_ptr(mpam_of_match),
> + },
> + .probe = mpam_msc_drv_probe,
> + .remove = mpam_msc_drv_remove,
> +};
> +
> +/*
> + * MSC that are hidden under caches are not created as platform devices
> + * as there is no cache driver. Caches are also special-cased in
> + * update_msc_accessibility().
> + */
> +static void mpam_dt_create_foundling_msc(void)
> +{
> + int err;
> + struct device_node *cache;
> +
> + for_each_compatible_node(cache, NULL, "cache") {
> + err = of_platform_populate(cache, mpam_of_match, NULL, NULL);
> + if (err)
> + pr_err("Failed to create MSC devices under caches\n");
> + }
> +}
> +
> +static int __init mpam_msc_driver_init(void)
> +{
> + if (!system_supports_mpam())
> + return -EOPNOTSUPP;
> +
> + init_srcu_struct(&mpam_srcu);
> +
> + if (!acpi_disabled)
> + fw_num_msc = acpi_mpam_count_msc();
> + else
> + fw_num_msc = mpam_dt_count_msc();
> +
> + if (fw_num_msc <= 0) {
> + pr_err("No MSC devices found in firmware\n");
> + return -EINVAL;
> + }
> +
> + if (acpi_disabled)
> + mpam_dt_create_foundling_msc();
> +
> + return platform_driver_register(&mpam_msc_driver);
> +}
> +subsys_initcall(mpam_msc_driver_init);
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
> new file mode 100644
> index 000000000000..07e0f240eaca
> --- /dev/null
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -0,0 +1,62 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +// Copyright (C) 2024 Arm Ltd.
> +
> +#ifndef MPAM_INTERNAL_H
> +#define MPAM_INTERNAL_H
> +
> +#include <linux/arm_mpam.h>
> +#include <linux/cpumask.h>
> +#include <linux/io.h>
> +#include <linux/mailbox_client.h>
> +#include <linux/mutex.h>
> +#include <linux/resctrl.h>
> +#include <linux/sizes.h>
> +
> +struct mpam_msc {
> + /* member of mpam_all_msc */
> + struct list_head glbl_list;
> +
> + int id;
> + struct platform_device *pdev;
> +
> + /* Not modified after mpam_is_enabled() becomes true */
> + enum mpam_msc_iface iface;
> + u32 pcc_subspace_id;
> + struct mbox_client pcc_cl;
> + struct pcc_mbox_chan *pcc_chan;
> + u32 nrdy_usec;
> + cpumask_t accessibility;
> +
> + /*
> + * probe_lock is only take during discovery. After discovery these
> + * properties become read-only and the lists are protected by SRCU.
> + */
> + struct mutex probe_lock;
> + unsigned long ris_idxs[128 / BITS_PER_LONG];
> + u32 ris_max;
> +
> + /* mpam_msc_ris of this component */
> + struct list_head ris;
> +
> + /*
> + * part_sel_lock protects access to the MSC hardware registers that are
> + * affected by MPAMCFG_PART_SEL. (including the ID registers that vary
> + * by RIS).
> + * If needed, take msc->lock first.
> + */
> + struct mutex part_sel_lock;
> +
> + /*
> + * mon_sel_lock protects access to the MSC hardware registers that are
> + * affeted by MPAMCFG_MON_SEL.
> + * If needed, take msc->lock first.
> + */
> + struct mutex outer_mon_sel_lock;
> + raw_spinlock_t inner_mon_sel_lock;
> + unsigned long inner_mon_sel_flags;
> +
> + void __iomem *mapped_hwpage;
> + size_t mapped_hwpage_sz;
> +};
> +
> +#endif /* MPAM_INTERNAL_H */
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 13/36] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate
2025-07-11 18:36 ` [RFC PATCH 13/36] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate James Morse
2025-07-24 11:02 ` Ben Horgan
@ 2025-07-24 12:09 ` Catalin Marinas
2025-08-06 18:04 ` James Morse
1 sibling, 1 reply; 117+ messages in thread
From: Catalin Marinas @ 2025-07-24 12:09 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
On Fri, Jul 11, 2025 at 06:36:25PM +0000, James Morse wrote:
> Probing MPAM is convoluted. MSCs that are integrated with a CPU may
> only be accessible from those CPUs, and they may not be online.
> Touching the hardware early is pointless as MPAM can't be used until
> the system-wide common values for num_partid and num_pmg have been
> discovered.
>
> Start with driver probe/remove and mapping the MSC.
>
> CC: Carl Worth <carl@os.amperecomputing.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> arch/arm64/Kconfig | 1 +
> drivers/platform/arm64/Kconfig | 1 +
> drivers/platform/arm64/Makefile | 1 +
> drivers/platform/arm64/mpam/Kconfig | 10 +
> drivers/platform/arm64/mpam/Makefile | 4 +
> drivers/platform/arm64/mpam/mpam_devices.c | 336 ++++++++++++++++++++
> drivers/platform/arm64/mpam/mpam_internal.h | 62 ++++
> 7 files changed, 415 insertions(+)
> create mode 100644 drivers/platform/arm64/mpam/Kconfig
> create mode 100644 drivers/platform/arm64/mpam/Makefile
> create mode 100644 drivers/platform/arm64/mpam/mpam_devices.c
> create mode 100644 drivers/platform/arm64/mpam/mpam_internal.h
Bikeshedding: why not drivers/resctrl to match fs/resctrl? We wouldn't
need the previous patch either to move the arm64 platform drivers.
I'm not an expert on resctrl but the MPAM code looks more like a backend
for the resctrl support, so it makes more sense to do as we did for
other drivers like irqchip, iommu. You can create drivers/resctrl/arm64
if you want to keep them grouped.
--
Catalin
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 16/36] arm_mpam: Add MPAM MSC register layout definitions
2025-07-11 18:36 ` [RFC PATCH 16/36] arm_mpam: Add MPAM MSC register layout definitions James Morse
2025-07-17 1:04 ` Shaopeng Tan (Fujitsu)
@ 2025-07-24 14:02 ` Ben Horgan
2025-08-06 18:05 ` James Morse
1 sibling, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-24 14:02 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
Nit: The file uses a mixture of tabs and spaces.
On 11/07/2025 19:36, James Morse wrote:
> Memory Partitioning and Monitoring (MPAM) has memory mapped devices
> (MSCs) with an identity/configuration page.
>
> Add the definitions for these registers as offset within the page(s).
>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_internal.h | 268 ++++++++++++++++++++
> 1 file changed, 268 insertions(+)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
> index d49bb884b433..9110c171d9d2 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -150,4 +150,272 @@ extern struct list_head mpam_classes;
> int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
> cpumask_t *affinity);
>
> +/*
> + * MPAM MSCs have the following register layout. See:
> + * Arm Architecture Reference Manual Supplement - Memory System Resource
> + * Partitioning and Monitoring (MPAM), for Armv8-A. DDI 0598A.a
I've been checking this against
https://developer.arm.com/documentation/ihi0099/latest/ as that looks to
be the current document, although hopefully the contents are
non-contradictory.
> + */
> +#define MPAM_ARCHITECTURE_V1 0x10
> +
> +/* Memory mapped control pages: */
> +/* ID Register offsets in the memory mapped page */
> +#define MPAMF_IDR 0x0000 /* features id register */
> +#define MPAMF_MSMON_IDR 0x0080 /* performance monitoring features */
> +#define MPAMF_IMPL_IDR 0x0028 /* imp-def partitioning */
> +#define MPAMF_CPOR_IDR 0x0030 /* cache-portion partitioning */
> +#define MPAMF_CCAP_IDR 0x0038 /* cache-capacity partitioning */
> +#define MPAMF_MBW_IDR 0x0040 /* mem-bw partitioning */
> +#define MPAMF_PRI_IDR 0x0048 /* priority partitioning */
> +#define MPAMF_CSUMON_IDR 0x0088 /* cache-usage monitor */
> +#define MPAMF_MBWUMON_IDR 0x0090 /* mem-bw usage monitor */
> +#define MPAMF_PARTID_NRW_IDR 0x0050 /* partid-narrowing */
> +#define MPAMF_IIDR 0x0018 /* implementer id register */
> +#define MPAMF_AIDR 0x0020 /* architectural id register */
> +
> +/* Configuration and Status Register offsets in the memory mapped page */
> +#define MPAMCFG_PART_SEL 0x0100 /* partid to configure: */
> +#define MPAMCFG_CPBM 0x1000 /* cache-portion config */
> +#define MPAMCFG_CMAX 0x0108 /* cache-capacity config */
> +#define MPAMCFG_CMIN 0x0110 /* cache-capacity config */
> +#define MPAMCFG_MBW_MIN 0x0200 /* min mem-bw config */
> +#define MPAMCFG_MBW_MAX 0x0208 /* max mem-bw config */
> +#define MPAMCFG_MBW_WINWD 0x0220 /* mem-bw accounting window config */
> +#define MPAMCFG_MBW_PBM 0x2000 /* mem-bw portion bitmap config */
> +#define MPAMCFG_PRI 0x0400 /* priority partitioning config */
> +#define MPAMCFG_MBW_PROP 0x0500 /* mem-bw stride config */
> +#define MPAMCFG_INTPARTID 0x0600 /* partid-narrowing config */
> +
> +#define MSMON_CFG_MON_SEL 0x0800 /* monitor selector */
> +#define MSMON_CFG_CSU_FLT 0x0810 /* cache-usage monitor filter */
> +#define MSMON_CFG_CSU_CTL 0x0818 /* cache-usage monitor config */
> +#define MSMON_CFG_MBWU_FLT 0x0820 /* mem-bw monitor filter */
> +#define MSMON_CFG_MBWU_CTL 0x0828 /* mem-bw monitor config */
> +#define MSMON_CSU 0x0840 /* current cache-usage */
> +#define MSMON_CSU_CAPTURE 0x0848 /* last cache-usage value captured */
> +#define MSMON_MBWU 0x0860 /* current mem-bw usage value */
> +#define MSMON_MBWU_CAPTURE 0x0868 /* last mem-bw value captured */
> +#define MSMON_CAPT_EVNT 0x0808 /* signal a capture event */
> +#define MPAMF_ESR 0x00F8 /* error status register */
> +#define MPAMF_ECR 0x00F0 /* error control register */
> +
> +/* MPAMF_IDR - MPAM features ID register */
> +#define MPAMF_IDR_PARTID_MAX GENMASK(15, 0)
> +#define MPAMF_IDR_PMG_MAX GENMASK(23, 16)
> +#define MPAMF_IDR_HAS_CCAP_PART BIT(24)
> +#define MPAMF_IDR_HAS_CPOR_PART BIT(25)
> +#define MPAMF_IDR_HAS_MBW_PART BIT(26)
> +#define MPAMF_IDR_HAS_PRI_PART BIT(27)
> +#define MPAMF_IDR_HAS_EXT BIT(28)
MPAMF_IDR_EXT. The field name is ext rather than has_ext.
> +#define MPAMF_IDR_HAS_IMPL_IDR BIT(29)
> +#define MPAMF_IDR_HAS_MSMON BIT(30)
> +#define MPAMF_IDR_HAS_PARTID_NRW BIT(31)
> +#define MPAMF_IDR_HAS_RIS BIT(32)
> +#define MPAMF_IDR_HAS_EXT_ESR BIT(38)
MPAMF_IDR_HAS_EXTD_ESR. Missing D.
> +#define MPAMF_IDR_HAS_ESR BIT(39)
> +#define MPAMF_IDR_RIS_MAX GENMASK(59, 56)
> +
> +/* MPAMF_MSMON_IDR - MPAM performance monitoring ID register */
> +#define MPAMF_MSMON_IDR_MSMON_CSU BIT(16)
> +#define MPAMF_MSMON_IDR_MSMON_MBWU BIT(17)
> +#define MPAMF_MSMON_IDR_HAS_LOCAL_CAPT_EVNT BIT(31)
> +
> +/* MPAMF_CPOR_IDR - MPAM features cache portion partitioning ID register */
> +#define MPAMF_CPOR_IDR_CPBM_WD GENMASK(15, 0)
> +
> +/* MPAMF_CCAP_IDR - MPAM features cache capacity partitioning ID register */
> +#define MPAMF_CCAP_IDR_HAS_CMAX_SOFTLIM BIT(31)
> +#define MPAMF_CCAP_IDR_NO_CMAX BIT(30)
> +#define MPAMF_CCAP_IDR_HAS_CMIN BIT(29)
> +#define MPAMF_CCAP_IDR_HAS_CASSOC BIT(28)
> +#define MPAMF_CCAP_IDR_CASSOC_WD GENMASK(12, 8)
> +#define MPAMF_CCAP_IDR_CMAX_WD GENMASK(5, 0)
nit: Field ordering differs from the other registers.
> +
> +/* MPAMF_MBW_IDR - MPAM features memory bandwidth partitioning ID register */
> +#define MPAMF_MBW_IDR_BWA_WD GENMASK(5, 0)
> +#define MPAMF_MBW_IDR_HAS_MIN BIT(10)
> +#define MPAMF_MBW_IDR_HAS_MAX BIT(11)
> +#define MPAMF_MBW_IDR_HAS_PBM BIT(12)
> +#define MPAMF_MBW_IDR_HAS_PROP BIT(13)
> +#define MPAMF_MBW_IDR_WINDWR BIT(14)
> +#define MPAMF_MBW_IDR_BWPBM_WD GENMASK(28, 16)
> +
> +/* MPAMF_PRI_IDR - MPAM features priority partitioning ID register */
> +#define MPAMF_PRI_IDR_HAS_INTPRI BIT(0)
> +#define MPAMF_PRI_IDR_INTPRI_0_IS_LOW BIT(1)
> +#define MPAMF_PRI_IDR_INTPRI_WD GENMASK(9, 4)
> +#define MPAMF_PRI_IDR_HAS_DSPRI BIT(16)
> +#define MPAMF_PRI_IDR_DSPRI_0_IS_LOW BIT(17)
> +#define MPAMF_PRI_IDR_DSPRI_WD GENMASK(25, 20)
> +
> +/* MPAMF_CSUMON_IDR - MPAM cache storage usage monitor ID register */
> +#define MPAMF_CSUMON_IDR_NUM_MON GENMASK(15, 0)
> +#define MPAMF_CSUMON_IDR_HAS_OFLOW_CAPT BIT(24)
> +#define MPAMF_CSUMON_IDR_HAS_CEVNT_OFLW BIT(25)
> +#define MPAMF_CSUMON_IDR_HAS_OFSR BIT(26)
> +#define MPAMF_CSUMON_IDR_HAS_OFLOW_LNKG BIT(27)
> +#define MPAMF_CSUMON_IDR_HAS_XCL BIT(29)
> +#define MPAMF_CSUMON_IDR_CSU_RO BIT(30)
> +#define MPAMF_CSUMON_IDR_HAS_CAPTURE BIT(31)
> +
> +/* MPAMF_MBWUMON_IDR - MPAM memory bandwidth usage monitor ID register */
> +#define MPAMF_MBWUMON_IDR_NUM_MON GENMASK(15, 0)
> +#define MPAMF_MBWUMON_IDR_HAS_RWBW BIT(28)
> +#define MPAMF_MBWUMON_IDR_LWD BIT(29)
> +#define MPAMF_MBWUMON_IDR_HAS_LONG BIT(30)
> +#define MPAMF_MBWUMON_IDR_HAS_CAPTURE BIT(31)
> +
> +/* MPAMF_PARTID_NRW_IDR - MPAM PARTID narrowing ID register */
> +#define MPAMF_PARTID_NRW_IDR_INTPARTID_MAX GENMASK(15, 0)
> +
> +/* MPAMF_IIDR - MPAM implementation ID register */
> +#define MPAMF_IIDR_PRODUCTID GENMASK(31, 20)
> +#define MPAMF_IIDR_PRODUCTID_SHIFT 20
> +#define MPAMF_IIDR_VARIANT GENMASK(19, 16)
> +#define MPAMF_IIDR_VARIANT_SHIFT 16
> +#define MPAMF_IIDR_REVISON GENMASK(15, 12)
> +#define MPAMF_IIDR_REVISON_SHIFT 12
> +#define MPAMF_IIDR_IMPLEMENTER GENMASK(11, 0)
> +#define MPAMF_IIDR_IMPLEMENTER_SHIFT 0
> +
> +/* MPAMF_AIDR - MPAM architecture ID register */
> +#define MPAMF_AIDR_ARCH_MAJOR_REV GENMASK(7, 4)
> +#define MPAMF_AIDR_ARCH_MINOR_REV GENMASK(3, 0)
> +
> +/* MPAMCFG_PART_SEL - MPAM partition configuration selection register */
> +#define MPAMCFG_PART_SEL_PARTID_SEL GENMASK(15, 0)
> +#define MPAMCFG_PART_SEL_INTERNAL BIT(16)
> +#define MPAMCFG_PART_SEL_RIS GENMASK(27, 24)
> +
> +/* MPAMCFG_CMAX - MPAM cache capacity configuration register */
> +#define MPAMCFG_CMAX_SOFTLIM BIT(31)
> +#define MPAMCFG_CMAX_CMAX GENMASK(15, 0)
> +
> +/* MPAMCFG_CMIN - MPAM cache capacity configuration register */
> +#define MPAMCFG_CMIN_CMIN GENMASK(15, 0)
> +
> +/*
> + * MPAMCFG_MBW_MIN - MPAM memory minimum bandwidth partitioning configuration
> + * register
> + */
> +#define MPAMCFG_MBW_MIN_MIN GENMASK(15, 0)
> +
> +/*
> + * MPAMCFG_MBW_MAX - MPAM memory maximum bandwidth partitioning configuration
> + * register
> + */
> +#define MPAMCFG_MBW_MAX_MAX GENMASK(15, 0)
> +#define MPAMCFG_MBW_MAX_HARDLIM BIT(31)
> +
> +/*
> + * MPAMCFG_MBW_WINWD - MPAM memory bandwidth partitioning window width
> + * register
> + */
> +#define MPAMCFG_MBW_WINWD_US_FRAC GENMASK(7, 0)
> +#define MPAMCFG_MBW_WINWD_US_INT GENMASK(23, 8)
> +
> +/* MPAMCFG_PRI - MPAM priority partitioning configuration register */
> +#define MPAMCFG_PRI_INTPRI GENMASK(15, 0)
> +#define MPAMCFG_PRI_DSPRI GENMASK(31, 16)
> +
> +/*
> + * MPAMCFG_MBW_PROP - Memory bandwidth proportional stride partitioning
> + * configuration register
> + */
> +#define MPAMCFG_MBW_PROP_STRIDEM1 GENMASK(15, 0)
> +#define MPAMCFG_MBW_PROP_EN BIT(31)
> +
> +/*
> + * MPAMCFG_INTPARTID - MPAM internal partition narrowing configuration register
> + */
> +#define MPAMCFG_INTPARTID_INTPARTID GENMASK(15, 0)
> +#define MPAMCFG_INTPARTID_INTERNAL BIT(16)
> +
> +/* MSMON_CFG_MON_SEL - Memory system performance monitor selection register */
> +#define MSMON_CFG_MON_SEL_MON_SEL GENMASK(15, 0)
> +#define MSMON_CFG_MON_SEL_RIS GENMASK(27, 24)
> +
> +/* MPAMF_ESR - MPAM Error Status Register */
> +#define MPAMF_ESR_PARTID_OR_MON GENMASK(15, 0)
Probably a better name but PARTID_MON is in the specification.
> +#define MPAMF_ESR_PMG GENMASK(23, 16)
> +#define MPAMF_ESR_ERRCODE GENMASK(27, 24)
> +#define MPAMF_ESR_OVRWR BIT(31)
> +#define MPAMF_ESR_RIS GENMASK(35, 32)
> +
> +/* MPAMF_ECR - MPAM Error Control Register */
> +#define MPAMF_ECR_INTEN BIT(0)
> +
> +/* Error conditions in accessing memory mapped registers */
> +#define MPAM_ERRCODE_NONE 0
> +#define MPAM_ERRCODE_PARTID_SEL_RANGE 1
> +#define MPAM_ERRCODE_REQ_PARTID_RANGE 2
> +#define MPAM_ERRCODE_MSMONCFG_ID_RANGE 3
> +#define MPAM_ERRCODE_REQ_PMG_RANGE 4
> +#define MPAM_ERRCODE_MONITOR_RANGE 5
> +#define MPAM_ERRCODE_INTPARTID_RANGE 6
> +#define MPAM_ERRCODE_UNEXPECTED_INTERNAL 7
> +
> +/*
> + * MSMON_CFG_CSU_FLT - Memory system performance monitor configure cache storage
> + * usage monitor filter register
> + */
> +#define MSMON_CFG_CSU_FLT_PARTID GENMASK(15, 0)
> +#define MSMON_CFG_CSU_FLT_PMG GENMASK(23, 16)
> +
> +/*
> + * MSMON_CFG_CSU_CTL - Memory system performance monitor configure cache storage
> + * usage monitor control register
> + * MSMON_CFG_MBWU_CTL - Memory system performance monitor configure memory
> + * bandwidth usage monitor control register
> + */
> +#define MSMON_CFG_x_CTL_TYPE GENMASK(7, 0)
> +#define MSMON_CFG_x_CTL_OFLOW_STATUS_L BIT(15)
No OFLOW_STATUS_L for csu.
> +#define MSMON_CFG_x_CTL_MATCH_PARTID BIT(16)
> +#define MSMON_CFG_x_CTL_MATCH_PMG BIT(17)
> +#define MSMON_CFG_x_CTL_SCLEN BIT(19)
> +#define MSMON_CFG_x_CTL_SUBTYPE GENMASK(23, 20)
Should be GENMASK(22, 20).
> +#define MSMON_CFG_x_CTL_OFLOW_FRZ BIT(24)
> +#define MSMON_CFG_x_CTL_OFLOW_INTR BIT(25)
> +#define MSMON_CFG_x_CTL_OFLOW_STATUS BIT(26)
> +#define MSMON_CFG_x_CTL_CAPT_RESET BIT(27)
> +#define MSMON_CFG_x_CTL_CAPT_EVNT GENMASK(30, 28)
> +#define MSMON_CFG_x_CTL_EN BIT(31)
> +
> +#define MSMON_CFG_MBWU_CTL_TYPE_MBWU 0x42
> +#define MSMON_CFG_MBWU_CTL_TYPE_CSU 0x43
> +
> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_NONE 0
> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_READ 1
> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_WRITE 2
> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_BOTH 3
I'm not sure where these come from? SUBTYPE is marked unused in the
spec. Remove?
> +
> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_MAX 3
> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_MASK 0x3
Remove for same reason.
> +
> +/*
> + * MSMON_CFG_MBWU_FLT - Memory system performance monitor configure memory
> + * bandwidth usage monitor filter register
> + */
> +#define MSMON_CFG_MBWU_FLT_PARTID GENMASK(15, 0)
> +#define MSMON_CFG_MBWU_FLT_PMG GENMASK(23, 16)
> +#define MSMON_CFG_MBWU_FLT_RWBW GENMASK(31, 30)
> +
> +/*
> + * MSMON_CSU - Memory system performance monitor cache storage usage monitor
> + * register
> + * MSMON_CSU_CAPTURE - Memory system performance monitor cache storage usage
> + * capture register
> + * MSMON_MBWU - Memory system performance monitor memory bandwidth usage
> + * monitor register
> + * MSMON_MBWU_CAPTURE - Memory system performance monitor memory bandwidth usage
> + * capture register
> + */
> +#define MSMON___VALUE GENMASK(30, 0)
> +#define MSMON___NRDY BIT(31)
> +#define MSMON_MBWU_L_VALUE GENMASK(62, 0)
This gets renamed in the series. I think all the register layout
definitions can be added in this commit.
> +/*
> + * MSMON_CAPT_EVNT - Memory system performance monitoring capture event
> + * generation register
> + */
> +#define MSMON_CAPT_EVNT_NOW BIT(0)
> +
> #endif /* MPAM_INTERNAL_H */
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware
2025-07-11 18:36 ` [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware James Morse
@ 2025-07-24 14:13 ` Ben Horgan
2025-08-06 18:07 ` James Morse
2025-07-29 6:11 ` Baisheng Gao
2025-08-05 8:46 ` Jonathan Cameron
2 siblings, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-24 14:13 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 11/07/2025 19:36, James Morse wrote:
> Because an MSC can only be accessed from the CPUs in its cpu-affinity
> set, we need to be running on one of those CPUs to probe the MSC
> hardware.
>
> Do this work in the cpuhp callback. Probing the hardware will only
> happen before MPAM is enabled; walk all the MSCs and probe those we can
> reach that haven't already been probed.
>
> Later once MPAM is enabled, this cpuhp callback will be replaced by
> one that avoids the global list.
>
> Enabling a static key will also take the cpuhp lock, so can't be done
> from the cpuhp callback. Whenever a new MSC has been probed schedule
> work to test if all the MSCs have now been probed.
>
> CC: Lecopzer Chen <lecopzerc@nvidia.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 149 +++++++++++++++++++-
> drivers/platform/arm64/mpam/mpam_internal.h | 8 +-
> 2 files changed, 152 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> index 0d6d5180903b..89434ae3efa6 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -4,6 +4,7 @@
> #define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__
>
> #include <linux/acpi.h>
> +#include <linux/atomic.h>
> #include <linux/arm_mpam.h>
> #include <linux/cacheinfo.h>
> #include <linux/cpu.h>
> @@ -21,6 +22,7 @@
> #include <linux/slab.h>
> #include <linux/spinlock.h>
> #include <linux/types.h>
> +#include <linux/workqueue.h>
>
> #include <acpi/pcc.h>
>
> @@ -39,6 +41,16 @@ struct srcu_struct mpam_srcu;
> /* MPAM isn't available until all the MSC have been probed. */
> static u32 mpam_num_msc;
>
> +static int mpam_cpuhp_state;
> +static DEFINE_MUTEX(mpam_cpuhp_state_lock);
> +
> +/*
> + * mpam is enabled once all devices have been probed from CPU online callbacks,
> + * scheduled via this work_struct. If access to an MSC depends on a CPU that
> + * was not brought online at boot, this can happen surprisingly late.
> + */
> +static DECLARE_WORK(mpam_enable_work, &mpam_enable);
> +
> /*
> * An MSC is a physical container for controls and monitors, each identified by
> * their RIS index. These share a base-address, interrupts and some MMIO
> @@ -78,6 +90,22 @@ LIST_HEAD(mpam_classes);
> /* List of all objects that can be free()d after synchronise_srcu() */
> static LLIST_HEAD(mpam_garbage);
>
> +static u32 __mpam_read_reg(struct mpam_msc *msc, u16 reg)
> +{
> + WARN_ON_ONCE(reg > msc->mapped_hwpage_sz);
> + WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &msc->accessibility));
> +
> + return readl_relaxed(msc->mapped_hwpage + reg);
> +}
> +
> +static inline u32 _mpam_read_partsel_reg(struct mpam_msc *msc, u16 reg)
> +{
> + lockdep_assert_held_once(&msc->part_sel_lock);
> + return __mpam_read_reg(msc, reg);
> +}
> +
> +#define mpam_read_partsel_reg(msc, reg) _mpam_read_partsel_reg(msc, MPAMF_##reg)
> +
> #define init_garbage(x) init_llist_node(&(x)->garbage.llist)
>
> static struct mpam_vmsc *
> @@ -513,9 +541,84 @@ int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
> return err;
> }
>
> -static void mpam_discovery_complete(void)
> +static int mpam_msc_hw_probe(struct mpam_msc *msc)
> {
> - pr_err("Discovered all MSC\n");
> + u64 idr;
> + int err;
> +
> + lockdep_assert_held(&msc->probe_lock);
> +
> + mutex_lock(&msc->part_sel_lock);
> + idr = mpam_read_partsel_reg(msc, AIDR);
> + if ((idr & MPAMF_AIDR_ARCH_MAJOR_REV) != MPAM_ARCHITECTURE_V1) {
> + pr_err_once("%s does not match MPAM architecture v1.0\n",
> + dev_name(&msc->pdev->dev));
The error message need only mention the major revision. You've added
support for v1.1 and v1.0.
> + err = -EIO;
> + } else {
> + msc->probed = true;
> + err = 0;
> + }
> + mutex_unlock(&msc->part_sel_lock);
> +
> + return err;
> +}
[snip]
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 20/36] arm_mpam: Probe the hardware features resctrl supports
2025-07-11 18:36 ` [RFC PATCH 20/36] arm_mpam: Probe the hardware features resctrl supports James Morse
@ 2025-07-24 15:08 ` Ben Horgan
2025-07-28 16:16 ` Jonathan Cameron
2025-08-07 18:26 ` James Morse
2025-07-28 8:56 ` Ben Horgan
1 sibling, 2 replies; 117+ messages in thread
From: Ben Horgan @ 2025-07-24 15:08 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 7/11/25 19:36, James Morse wrote:
> Expand the probing support with the control and monitor types
> we can use with resctrl.
>
> CC: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 154 +++++++++++++++++++-
> drivers/platform/arm64/mpam/mpam_internal.h | 53 +++++++
> 2 files changed, 206 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> index 8646fb85ad09..61911831ab39 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -102,7 +102,7 @@ static LLIST_HEAD(mpam_garbage);
>
> static u32 __mpam_read_reg(struct mpam_msc *msc, u16 reg)
> {
> - WARN_ON_ONCE(reg > msc->mapped_hwpage_sz);
> + WARN_ON_ONCE(reg + sizeof(u32) > msc->mapped_hwpage_sz);
> WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &msc->accessibility));
>
> return readl_relaxed(msc->mapped_hwpage + reg);
> @@ -131,6 +131,20 @@ static inline void _mpam_write_partsel_reg(struct mpam_msc *msc, u16 reg, u32 va
> }
> #define mpam_write_partsel_reg(msc, reg, val) _mpam_write_partsel_reg(msc, MPAMCFG_##reg, val)
>
> +static inline u32 _mpam_read_monsel_reg(struct mpam_msc *msc, u16 reg)
> +{
> + mpam_mon_sel_lock_held(msc);
> + return __mpam_read_reg(msc, reg);
> +}
> +#define mpam_read_monsel_reg(msc, reg) _mpam_read_monsel_reg(msc, MSMON_##reg)
> +
> +static inline void _mpam_write_monsel_reg(struct mpam_msc *msc, u16 reg, u32 val)
> +{
> + mpam_mon_sel_lock_held(msc);
> + __mpam_write_reg(msc, reg, val);
> +}
> +#define mpam_write_monsel_reg(msc, reg, val) _mpam_write_monsel_reg(msc, MSMON_##reg, val)
> +
> static u64 mpam_msc_read_idr(struct mpam_msc *msc)
> {
> u64 idr_high = 0, idr_low;
> @@ -645,6 +659,137 @@ static struct mpam_msc_ris *mpam_get_or_create_ris(struct mpam_msc *msc,
> return found;
> }
>
> +/*
> + * IHI009A.a has this nugget: "If a monitor does not support automatic behaviour
> + * of NRDY, software can use this bit for any purpose" - so hardware might not
> + * implement this - but it isn't RES0.
> + *
> + * Try and see what values stick in this bit. If we can write either value,
> + * it's probably not implemented by hardware.
> + */
> +#define mpam_ris_hw_probe_hw_nrdy(_ris, _mon_reg, _result) \
> +do { \
> + u32 now; \
> + u64 mon_sel; \
> + bool can_set, can_clear; \
> + struct mpam_msc *_msc = _ris->vmsc->msc; \
> + \
> + if (WARN_ON_ONCE(!mpam_mon_sel_inner_lock(_msc))) { \
> + _result = false; \
> + break; \
> + } \
> + mon_sel = FIELD_PREP(MSMON_CFG_MON_SEL_MON_SEL, 0) | \
> + FIELD_PREP(MSMON_CFG_MON_SEL_RIS, _ris->ris_idx); \
> + mpam_write_monsel_reg(_msc, CFG_MON_SEL, mon_sel); \
> + \
> + mpam_write_monsel_reg(_msc, _mon_reg, MSMON___NRDY); \
> + now = mpam_read_monsel_reg(_msc, _mon_reg); \
> + can_set = now & MSMON___NRDY; \
> + \
> + mpam_write_monsel_reg(_msc, _mon_reg, 0); \
> + now = mpam_read_monsel_reg(_msc, _mon_reg); \
> + can_clear = !(now & MSMON___NRDY); \
> + mpam_mon_sel_inner_unlock(_msc); \
> + \
> + _result = (!can_set || !can_clear); \
> +} while (0)
It is a bit surprising that something that looks like a function
modifies a boolean passed by value. Consider continuing the pattern you
have above:
#define mpam_ris_hw_probe_hw_nrdy(_ris, _mon_reg, _result) \
	_mpam_ris_hw_probe_hw_nrdy(_ris, MSMON_##_mon_reg, _result)
with signature:
void _mpam_ris_hw_probe_hw_nrdy(struct mpam_msc *msc, u16 reg, bool
*hw_managed);
and using the _mpam functions from the new _mpam_ris_hw_probe_hw_nrdy().
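Roughly this sort of shape, perhaps (untested sketch, not the posted code:
it takes the ris rather than the msc so it can pick up ris_idx, and it
reuses the _mpam_*_monsel_reg() helpers and MSMON_* field macros quoted
above):

static void _mpam_ris_hw_probe_hw_nrdy(struct mpam_msc_ris *ris, u16 mon_reg,
					bool *hw_managed)
{
	u32 now;
	u64 mon_sel;
	bool can_set, can_clear;
	struct mpam_msc *msc = ris->vmsc->msc;

	*hw_managed = false;
	if (WARN_ON_ONCE(!mpam_mon_sel_inner_lock(msc)))
		return;

	/* Point MON_SEL at monitor 0 of this RIS */
	mon_sel = FIELD_PREP(MSMON_CFG_MON_SEL_MON_SEL, 0) |
		  FIELD_PREP(MSMON_CFG_MON_SEL_RIS, ris->ris_idx);
	_mpam_write_monsel_reg(msc, MSMON_CFG_MON_SEL, mon_sel);

	/* Try to set, then clear, the NRDY bit */
	_mpam_write_monsel_reg(msc, mon_reg, MSMON___NRDY);
	now = _mpam_read_monsel_reg(msc, mon_reg);
	can_set = now & MSMON___NRDY;

	_mpam_write_monsel_reg(msc, mon_reg, 0);
	now = _mpam_read_monsel_reg(msc, mon_reg);
	can_clear = !(now & MSMON___NRDY);
	mpam_mon_sel_inner_unlock(msc);

	/* If software can write both values, NRDY is probably not hw managed */
	*hw_managed = !can_set || !can_clear;
}

#define mpam_ris_hw_probe_hw_nrdy(_ris, _mon_reg, _result) \
	_mpam_ris_hw_probe_hw_nrdy(_ris, MSMON_##_mon_reg, &(_result))

The existing callers in mpam_ris_hw_probe() wouldn't need to change.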
> +
> +static void mpam_ris_hw_probe(struct mpam_msc_ris *ris)
> +{
> + int err;
> + struct mpam_msc *msc = ris->vmsc->msc;
> + struct mpam_props *props = &ris->props;
> +
> + lockdep_assert_held(&msc->probe_lock);
> + lockdep_assert_held(&msc->part_sel_lock);
> +
> + /* Cache Portion partitioning */
> + if (FIELD_GET(MPAMF_IDR_HAS_CPOR_PART, ris->idr)) {
> + u32 cpor_features = mpam_read_partsel_reg(msc, CPOR_IDR);
> +
> + props->cpbm_wd = FIELD_GET(MPAMF_CPOR_IDR_CPBM_WD, cpor_features);
> + if (props->cpbm_wd)
> + mpam_set_feature(mpam_feat_cpor_part, props);
> + }
> +
> + /* Memory bandwidth partitioning */
> + if (FIELD_GET(MPAMF_IDR_HAS_MBW_PART, ris->idr)) {
> + u32 mbw_features = mpam_read_partsel_reg(msc, MBW_IDR);
> +
> + /* portion bitmap resolution */
> + props->mbw_pbm_bits = FIELD_GET(MPAMF_MBW_IDR_BWPBM_WD, mbw_features);
> + if (props->mbw_pbm_bits &&
> + FIELD_GET(MPAMF_MBW_IDR_HAS_PBM, mbw_features))
> + mpam_set_feature(mpam_feat_mbw_part, props);
> +
> + props->bwa_wd = FIELD_GET(MPAMF_MBW_IDR_BWA_WD, mbw_features);
> + if (props->bwa_wd && FIELD_GET(MPAMF_MBW_IDR_HAS_MAX, mbw_features))
> + mpam_set_feature(mpam_feat_mbw_max, props);
> + }
> +
> + /* Performance Monitoring */
> + if (FIELD_GET(MPAMF_IDR_HAS_MSMON, ris->idr)) {
> + u32 msmon_features = mpam_read_partsel_reg(msc, MSMON_IDR);
> +
> + /*
> + * If the firmware max-nrdy-us property is missing, the
> + * CSU counters can't be used. Should we wait forever?
> + */
> + err = device_property_read_u32(&msc->pdev->dev,
> + "arm,not-ready-us",
> + &msc->nrdy_usec);
> +
> + if (FIELD_GET(MPAMF_MSMON_IDR_MSMON_CSU, msmon_features)) {
> + u32 csumonidr;
> +
> + csumonidr = mpam_read_partsel_reg(msc, CSUMON_IDR);
> + props->num_csu_mon = FIELD_GET(MPAMF_CSUMON_IDR_NUM_MON, csumonidr);
> + if (props->num_csu_mon) {
> + bool hw_managed;
> +
> + mpam_set_feature(mpam_feat_msmon_csu, props);
> +
> + /* Is NRDY hardware managed? */
> + mpam_mon_sel_outer_lock(msc);
> + mpam_ris_hw_probe_hw_nrdy(ris, CSU, hw_managed);
> + mpam_mon_sel_outer_unlock(msc);
> + if (hw_managed)
> + mpam_set_feature(mpam_feat_msmon_csu_hw_nrdy, props);
> + }
> +
> + /*
> + * Accept the missing firmware property if NRDY appears
> + * un-implemented.
> + */
> + if (err && mpam_has_feature(mpam_feat_msmon_csu_hw_nrdy, props))
> + pr_err_once("Counters are not usable because not-ready timeout was not provided by firmware.");
> + }
> + if (FIELD_GET(MPAMF_MSMON_IDR_MSMON_MBWU, msmon_features)) {
> + bool hw_managed;
> + u32 mbwumonidr = mpam_read_partsel_reg(msc, MBWUMON_IDR);
> +
> + props->num_mbwu_mon = FIELD_GET(MPAMF_MBWUMON_IDR_NUM_MON, mbwumonidr);
> + if (props->num_mbwu_mon)
> + mpam_set_feature(mpam_feat_msmon_mbwu, props);
> +
> + if (FIELD_GET(MPAMF_MBWUMON_IDR_HAS_RWBW, mbwumonidr))
> + mpam_set_feature(mpam_feat_msmon_mbwu_rwbw, props);
> +
> + /* Is NRDY hardware managed? */
> + mpam_mon_sel_outer_lock(msc);
> + mpam_ris_hw_probe_hw_nrdy(ris, MBWU, hw_managed);
> + mpam_mon_sel_outer_unlock(msc);
> + if (hw_managed)
> + mpam_set_feature(mpam_feat_msmon_mbwu_hw_nrdy, props);
> +
> + /*
> + * Don't warn about any missing firmware property for
> + * MBWU NRDY - it doesn't make any sense!
> + */
> + }
> + }
> +}
> +
> static int mpam_msc_hw_probe(struct mpam_msc *msc)
> {
> u64 idr;
> @@ -665,6 +810,7 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
>
> idr = mpam_msc_read_idr(msc);
> mutex_unlock(&msc->part_sel_lock);
> +
> msc->ris_max = FIELD_GET(MPAMF_IDR_RIS_MAX, idr);
>
> /* Use these values so partid/pmg always starts with a valid value */
> @@ -685,6 +831,12 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
> ris = mpam_get_or_create_ris(msc, ris_idx);
> if (IS_ERR(ris))
> return PTR_ERR(ris);
> + ris->idr = idr;
> +
> + mutex_lock(&msc->part_sel_lock);
> + __mpam_part_sel(ris_idx, 0, msc);
> + mpam_ris_hw_probe(ris);
> + mutex_unlock(&msc->part_sel_lock);
> }
>
> spin_lock(&partid_max_lock);
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
> index 42a454d5f914..ae6fd1f62cc4 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -136,6 +136,55 @@ static inline void mpam_mon_sel_lock_held(struct mpam_msc *msc)
> lockdep_assert_preemption_enabled();
> }
>
> +/*
> + * When we compact the supported features, we don't care what they are.
> + * Storing them as a bitmap makes life easy.
> + */
> +typedef u16 mpam_features_t;
> +
> +/* Bits for mpam_features_t */
> +enum mpam_device_features {
> + mpam_feat_ccap_part = 0,
> + mpam_feat_cpor_part,
> + mpam_feat_mbw_part,
> + mpam_feat_mbw_min,
> + mpam_feat_mbw_max,
> + mpam_feat_mbw_prop,
> + mpam_feat_msmon,
> + mpam_feat_msmon_csu,
> + mpam_feat_msmon_csu_capture,
> + mpam_feat_msmon_csu_hw_nrdy,
> + mpam_feat_msmon_mbwu,
> + mpam_feat_msmon_mbwu_capture,
> + mpam_feat_msmon_mbwu_rwbw,
> + mpam_feat_msmon_mbwu_hw_nrdy,
> + mpam_feat_msmon_capt,
> + MPAM_FEATURE_LAST,
> +};
> +#define MPAM_ALL_FEATURES ((1 << MPAM_FEATURE_LAST) - 1)
> +
> +struct mpam_props {
> + mpam_features_t features;
> +
> + u16 cpbm_wd;
> + u16 mbw_pbm_bits;
> + u16 bwa_wd;
> + u16 num_csu_mon;
> + u16 num_mbwu_mon;
> +};
> +
> +static inline bool mpam_has_feature(enum mpam_device_features feat,
> + struct mpam_props *props)
> +{
> + return (1 << feat) & props->features;
> +}
> +
> +static inline void mpam_set_feature(enum mpam_device_features feat,
> + struct mpam_props *props)
> +{
> + props->features |= (1 << feat);
> +}
> +
> struct mpam_class {
> /* mpam_components in this class */
> struct list_head components;
> @@ -175,6 +224,8 @@ struct mpam_vmsc {
> /* mpam_msc_ris in this vmsc */
> struct list_head ris;
>
> + struct mpam_props props;
> +
> /* All RIS in this vMSC are members of this MSC */
> struct mpam_msc *msc;
>
> @@ -186,6 +237,8 @@ struct mpam_vmsc {
>
> struct mpam_msc_ris {
> u8 ris_idx;
> + u64 idr;
> + struct mpam_props props;
>
> cpumask_t affinity;
>
--
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container
2025-07-23 14:42 ` Ben Horgan
@ 2025-07-25 17:05 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-25 17:05 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Sudeep Holla
Hi Ben,
On 23/07/2025 15:42, Ben Horgan wrote:
> On 7/11/25 19:36, James Morse wrote:
>> The PPTT describes CPUs and caches, as well as processor containers.
>> The ACPI table for MPAM describes the set of CPUs that can access an MSC
>> with the UID of a processor container.
>>
>> Add a helper to find the processor container by its id, then walk
>> the possible CPUs to fill a cpumask with the CPUs that have this
>> processor container as a parent.
>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>> index 54676e3d82dd..13619b1b821b 100644
>> --- a/drivers/acpi/pptt.c
>> +++ b/drivers/acpi/pptt.c
>> @@ -298,6 +298,99 @@ static struct acpi_pptt_processor *acpi_find_processor_node(struct
>> +/**
>> + * acpi_pptt_get_cpus_from_container() - Populate a cpumask with all CPUs in a
>> + * processor containers
>> + * @acpi_cpu_id: The UID of the processor container.
>> + * @cpus The resulting CPU mask.
>> + *
>> + * Find the specified Processor Container, and fill @cpus with all the cpus
>> + * below it.
>> + *
>> + * Not all 'Processor' entries in the PPTT are either a CPU or a Processor
>> + * Container, they may exist purely to describe a Private resource. CPUs
>> + * have to be leaves, so a Processor Container is a non-leaf that has the
>> + * 'ACPI Processor ID valid' flag set.
>> + *
>> + * Return: 0 for a complete walk, or an error if the mask is incomplete.
>> + */
>> +int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus)
>> +{
>> + struct acpi_pptt_processor *cpu_node;
>> + struct acpi_table_header *table_hdr;
>> + struct acpi_subtable_header *entry;
>> + bool leaf_flag, has_leaf_flag = false;
>> + unsigned long table_end;
>> + acpi_status status;
>> + u32 proc_sz;
>> + int ret = 0;
>> +
>> + cpumask_clear(cpus);
>> +
>> + status = acpi_get_table(ACPI_SIG_PPTT, 0, &table_hdr);
>> + if (ACPI_FAILURE(status))
>> + return 0;
>> +
>> + if (table_hdr->revision > 1)
>> + has_leaf_flag = true;
>> +
>> + table_end = (unsigned long)table_hdr + table_hdr->length;
>> + entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr,
>> + sizeof(struct acpi_table_pptt));
>> + proc_sz = sizeof(struct acpi_pptt_processor);
>> + while ((unsigned long)entry + proc_sz <= table_end) {
>> + cpu_node = (struct acpi_pptt_processor *)entry;
>> + if (entry->type == ACPI_PPTT_TYPE_PROCESSOR &&
>> + cpu_node->flags & ACPI_PPTT_ACPI_PROCESSOR_ID_VALID) {
>> + leaf_flag = cpu_node->flags & ACPI_PPTT_ACPI_LEAF_NODE;
>> + if ((has_leaf_flag && !leaf_flag) ||
>> + (!has_leaf_flag && !acpi_pptt_leaf_node(table_hdr, cpu_node))) {
>> + if (cpu_node->acpi_processor_id == acpi_cpu_id)
>> + acpi_pptt_get_child_cpus(table_hdr, cpu_node, cpus);
>> + }
> acpi_pptt_leaf_node() returns early based on the leaf flag so you can just rely on that
> here; remove has_leaf_flag and the corresponding extra logic.
Aha! I was only doing this to try and avoid that extra descent of the tree. I missed that
it's already taken into account.
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 06/36] ACPI / PPTT: Stop acpi_count_levels() expecting callers to clear levels
2025-07-16 15:51 ` Jonathan Cameron
@ 2025-07-25 17:05 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-25 17:05 UTC (permalink / raw)
To: Jonathan Cameron
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Sudeep Holla
Hi Jonathan,
On 16/07/2025 16:51, Jonathan Cameron wrote:
> On Fri, 11 Jul 2025 18:36:18 +0000
> James Morse <james.morse@arm.com> wrote:
>
>> acpi_count_levels() passes the number of levels back via a pointer argument.
>> It also passes this to acpi_find_cache_level() as the starting_level, and
>> preserves this value as it walks up the cpu_node tree counting the levels.
>>
>> The only caller acpi_get_cache_info() happens to have already initialised
>> levels to zero, which acpi_count_levels() depends on to get the correct
>> result.
>>
>> Explicitly zero the levels variable, so the count always starts at zero.
>> This saves any additional callers having to work out they need to do this.
> This is all a bit fiddly as we now end up with that initialized in various
> different places.
I've debugged this one a few times (turns out I'm forgetful) ... I figured doing this
was better than adding a comment to warn others.
As it's static, I figured it was something the compiler can optimise out if there is a
duplicate. (I couldn't find any initialisation I could remove because of this.)
> Perhaps simpler to have acpi_count_levels() return the
> number of levels rather than void. Then return number of levels rather
> than 0 on success from acpi_get_cache_info(). Negative error codes used
> for failure just like now.
>
> That would leave only a local variable in acpi_count_levels being
> initialized to 0 and passed to acpi_find_cache_level() before being
> returned when the loop terminates.
>
> I think that sequence then makes it such that we can't fail to
> initialize it at without the compiler noticing and screaming.
>
> Requires a few changes from if (ret) to if (ret < 0) at callers
> of acpi_get_cache_info() but looks simple (says the person who
> hasn't actually coded it!)
Breaking the symmetry between levels and split_levels is an argument against this.
I think within pptt.c this is fine, because 'levels' is used internally as
'starting_level', and the expectation that it was initialised to zero is a nasty surprise.
But exposing that from acpi_get_cache_info() looks stranger - and would need to touch
users in cacheinfo, arm64, riscv.
I've updated acpi_count_levels() to look as you describe - that at least makes it harder
to miss this in future. (not sure whether it saves anything)
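i.e. something of this shape (just a sketch of the interface - the walk
itself is unchanged):

static int acpi_count_levels(struct acpi_table_header *table_hdr,
			     struct acpi_pptt_processor *cpu_node,
			     unsigned int *split_levels)
{
	unsigned int levels = 0;

	/* ... the existing walk up the cpu_node tree, counting into 'levels' ... */

	return levels;
}

acpi_get_cache_info() then assigns the return value to *levels, so its own
signature (and the callers in cacheinfo, arm64 and riscv) are untouched.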
>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>> index 13619b1b821b..13ca2eee3b98 100644
>> --- a/drivers/acpi/pptt.c
>> +++ b/drivers/acpi/pptt.c
>> @@ -183,7 +183,7 @@ acpi_find_cache_level(struct acpi_table_header *table_hdr,
>> * @cpu_node: processor node we wish to count caches for
>> * @levels: Number of levels if success.
>> * @split_levels: Number of split cache levels (data/instruction) if
>> - * success. Can by NULL.
>> + * success. Can be NULL.
>
> Grumpy reviewer hat. Unrelated cleanup up - good to have but not in this patch where
> it's a distraction.
I was hoping diff would keep it as one hunk. Happy to leave it tyopd!
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container
2025-07-22 14:28 ` Jonathan Cameron
@ 2025-07-25 17:05 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-25 17:05 UTC (permalink / raw)
To: Jonathan Cameron
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Sudeep Holla
Hi Jonathan,
On 22/07/2025 15:28, Jonathan Cameron wrote:
> On Fri, 11 Jul 2025 18:36:17 +0000
> James Morse <james.morse@arm.com> wrote:
>
>> The PPTT describes CPUs and caches, as well as processor containers.
>> The ACPI table for MPAM describes the set of CPUs that can access an MSC
>> with the UID of a processor container.
>>
>> Add a helper to find the processor container by its id, then walk
>> the possible CPUs to fill a cpumask with the CPUs that have this
>> processor container as a parent.
>> +/**
>> + * acpi_pptt_get_cpus_from_container() - Populate a cpumask with all CPUs in a
>> + * processor containers
>> + * @acpi_cpu_id: The UID of the processor container.
>> + * @cpus The resulting CPU mask.
> Missing colon.
>
> From a W=1 build (and hence kernel-doc warning).
Thanks!
W=1 spews so much output I tend to rely on the kbuild robot to report if I'm making it worse!
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container
2025-07-17 7:58 ` Shaopeng Tan (Fujitsu)
@ 2025-07-25 17:06 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-07-25 17:06 UTC (permalink / raw)
To: Shaopeng Tan (Fujitsu), linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko, Sudeep Holla
Hi Shaopeng,
On 17/07/2025 08:58, Shaopeng Tan (Fujitsu) wrote:
> Hello James,
>
>> The PPTT describes CPUs and caches, as well as processor containers.
>> The ACPI table for MPAM describes the set of CPUs that can access an MSC
>> with the UID of a processor container.
>>
>> Add a helper to find the processor container by its id, then walk the possible
>> CPUs to fill a cpumask with the CPUs that have this processor container as a
>> parent.
>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c index
>> 54676e3d82dd..13619b1b821b 100644
>> --- a/drivers/acpi/pptt.c
>> +++ b/drivers/acpi/pptt.c
>> @@ -298,6 +298,99 @@ static struct acpi_pptt_processor
>> +/**
>> + * acpi_pptt_get_cpus_from_container() - Populate a cpumask with all CPUs
>> in a
>> + * processor containers
>> + * @acpi_cpu_id: The UID of the processor container.
>> + * @cpus The resulting CPU mask.
>> + *
>> + * Find the specified Processor Container, and fill @cpus with all the
>> +cpus
>> + * below it.
>> + *
>> + * Not all 'Processor' entries in the PPTT are either a CPU or a
>> +Processor
>> + * Container, they may exist purely to describe a Private resource.
>> +CPUs
>> + * have to be leaves, so a Processor Container is a non-leaf that has
>> +the
>> + * 'ACPI Processor ID valid' flag set.
>> + *
>> + * Return: 0 for a complete walk, or an error if the mask is incomplete.
>> + */
>> +int acpi_pptt_get_cpus_from_container(u32 acpi_cpu_id, cpumask_t *cpus)
>> +{
>> + struct acpi_pptt_processor *cpu_node;
>> + struct acpi_table_header *table_hdr;
>> + struct acpi_subtable_header *entry;
>> + bool leaf_flag, has_leaf_flag = false;
>> + unsigned long table_end;
>> + acpi_status status;
>> + u32 proc_sz;
>> + int ret = 0;
>> +
>> + cpumask_clear(cpus);
>> +
>> + status = acpi_get_table(ACPI_SIG_PPTT, 0, &table_hdr);
>> + if (ACPI_FAILURE(status))
>> + return 0;
> If pptt table cannot be got, should -ENODEV be returned?
In general it's not an error for the PPTT to be missing; there are plenty of platforms
where that is the case. I think in this case the caller has to be working with some
information that means there has to be a PPTT, so this isn't an error that needs handling.
In MPAM's case, the ACPI table references things in the PPTT; if that table were missing,
the platform description would be unusable. I don't think this is something we need to help
debug - just ensure we don't cause a panic() that would make it harder to debug!
>> + if (table_hdr->revision > 1)
>> + has_leaf_flag = true;
>> +
>> + table_end = (unsigned long)table_hdr + table_hdr->length;
>> + entry = ACPI_ADD_PTR(struct acpi_subtable_header, table_hdr,
>> + sizeof(struct acpi_table_pptt));
>> + proc_sz = sizeof(struct acpi_pptt_processor);
>> + while ((unsigned long)entry + proc_sz <= table_end) {
>> + cpu_node = (struct acpi_pptt_processor *)entry;
>> + if (entry->type == ACPI_PPTT_TYPE_PROCESSOR &&
>> + cpu_node->flags &
>> ACPI_PPTT_ACPI_PROCESSOR_ID_VALID) {
>> + leaf_flag = cpu_node->flags &
>> ACPI_PPTT_ACPI_LEAF_NODE;
>> + if ((has_leaf_flag && !leaf_flag) ||
>> + (!has_leaf_flag
>> && !acpi_pptt_leaf_node(table_hdr, cpu_node))) {
>> + if (cpu_node->acpi_processor_id ==
>> acpi_cpu_id)
>> + acpi_pptt_get_child_cpus(table_hdr,
>> cpu_node, cpus);
>> + }
>> + }
>> + entry = ACPI_ADD_PTR(struct acpi_subtable_header, entry,
>> + entry->length);
>> + }
>> +
>> + acpi_put_table(table_hdr);
>> +
>> + return ret;
> Only 0 is returned here.
Good spot! I think this allocated memory in the past; it can probably be made 'void',
which will make your point above easier too.
> There is no action to be taken when the mask is incomplete.
I don't think there needs to be. General callers should be using cacheinfo for this
information. This only exists as the MPAM driver needs to know about the topology of the
system before 'all' the CPUs are online. (which could be never).
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 04/36] cacheinfo: Expose the code to generate a cache-id from a device_node
2025-07-14 11:40 ` Ben Horgan
@ 2025-07-25 17:08 ` James Morse
2025-07-28 8:37 ` Ben Horgan
0 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-07-25 17:08 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 14/07/2025 12:40, Ben Horgan wrote:
> On 7/11/25 19:36, James Morse wrote:
>> The MPAM driver identifies caches by id for use with resctrl. It
>> needs to know the cache-id when probe-ing, but the value isn't set
>> in cacheinfo until device_initcall().
>>
>> Expose the code that generates the cache-id. The parts of the MPAM
>> driver that run early can use this to set up the resctrl structures
>> before cacheinfo is ready in device_initcall().
>> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
>> index 613410705a47..0fdd6358ee73 100644
>> --- a/drivers/base/cacheinfo.c
>> +++ b/drivers/base/cacheinfo.c
>> @@ -207,8 +207,7 @@ static bool match_cache_node(struct device_node *cpu,
>> #define arch_compact_of_hwid(_x) (_x)
>> #endif
>> -static void cache_of_set_id(struct cacheinfo *this_leaf,
>> - struct device_node *cache_node)
>> +unsigned long cache_of_calculate_id(struct device_node *cache_node)
>> {
>> struct device_node *cpu;
>> u32 min_id = ~0;
>> @@ -219,15 +218,23 @@ static void cache_of_set_id(struct cacheinfo *this_leaf,
>> id = arch_compact_of_hwid(id);
>> if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
>> of_node_put(cpu);
>> - return;
>> + return ~0UL;
>> }
>> if (match_cache_node(cpu, cache_node))
>> min_id = min(min_id, id);
>> }
>> - if (min_id != ~0) {
>> - this_leaf->id = min_id;
>> + return min_id;
> Looks like some 32bit/64bit confusion. Don't we want to return ~0UL if min_id == ~0?
Certainly some confusion - yup, because cache_of_calculate_id() needs to return something
that is out of range and (u32)-1 might be valid...
I think changing min_id to be defined as:
| unsigned long min_id = ~0UL;
fixes this - any trip round the loop that doesn't match anything will eventually return ~0UL.
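i.e. (sketch only, keeping the surrounding loop as it is today - the min()
needs a cast once min_id is unsigned long, which is safe as the top 32 bits
were already checked just above):

unsigned long cache_of_calculate_id(struct device_node *cache_node)
{
	struct device_node *cpu;
	unsigned long min_id = ~0UL;	/* out of range for any 32-bit id */

	for_each_of_cpu_node(cpu) {
		u64 id = of_get_cpu_hwid(cpu, 0);

		id = arch_compact_of_hwid(id);
		if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
			of_node_put(cpu);
			return ~0UL;
		}
		if (match_cache_node(cpu, cache_node))
			min_id = min(min_id, (unsigned long)id);
	}

	return min_id;
}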
Thanks! - I always get the 'UL' suffixes wrong.
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 04/36] cacheinfo: Expose the code to generate a cache-id from a device_node
2025-07-25 17:08 ` James Morse
@ 2025-07-28 8:37 ` Ben Horgan
0 siblings, 0 replies; 117+ messages in thread
From: Ben Horgan @ 2025-07-28 8:37 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 7/25/25 18:08, James Morse wrote:
> Hi Ben,
>
> On 14/07/2025 12:40, Ben Horgan wrote:
>> On 7/11/25 19:36, James Morse wrote:
>>> The MPAM driver identifies caches by id for use with resctrl. It
>>> needs to know the cache-id when probe-ing, but the value isn't set
>>> in cacheinfo until device_initcall().
>>>
>>> Expose the code that generates the cache-id. The parts of the MPAM
>>> driver that run early can use this to set up the resctrl structures
>>> before cacheinfo is ready in device_initcall().
>
>>> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
>>> index 613410705a47..0fdd6358ee73 100644
>>> --- a/drivers/base/cacheinfo.c
>>> +++ b/drivers/base/cacheinfo.c
>>> @@ -207,8 +207,7 @@ static bool match_cache_node(struct device_node *cpu,
>>> #define arch_compact_of_hwid(_x) (_x)
>>> #endif
>>> -static void cache_of_set_id(struct cacheinfo *this_leaf,
>>> - struct device_node *cache_node)
>>> +unsigned long cache_of_calculate_id(struct device_node *cache_node)
>>> {
>>> struct device_node *cpu;
>>> u32 min_id = ~0;
>>> @@ -219,15 +218,23 @@ static void cache_of_set_id(struct cacheinfo *this_leaf,
>>> id = arch_compact_of_hwid(id);
>>> if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
>>> of_node_put(cpu);
>>> - return;
>>> + return ~0UL;
>>> }
>>> if (match_cache_node(cpu, cache_node))
>>> min_id = min(min_id, id);
>>> }
>>> - if (min_id != ~0) {
>>> - this_leaf->id = min_id;
>>> + return min_id;
>
>> Looks like some 32bit/64bit confusion. Don't we want to return ~0UL if min_id == ~0?
>
> Certainly some confusion - yup, because cache_of_calculate_id() needs to return something
> that is out of range and (u32)-1 might be valid...
>
> I think changing min_id to be defined as:
> | unsigned long min_id = ~0UL;
>
> fixes this - any trip round the loop that doesn't match anything will eventually return ~0UL.
Yes, that would work.
>
>
> Thanks! - I always get the 'UL' suffixes wrong.
>
> James
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 20/36] arm_mpam: Probe the hardware features resctrl supports
2025-07-11 18:36 ` [RFC PATCH 20/36] arm_mpam: Probe the hardware features resctrl supports James Morse
2025-07-24 15:08 ` Ben Horgan
@ 2025-07-28 8:56 ` Ben Horgan
2025-08-08 7:20 ` James Morse
1 sibling, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-28 8:56 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 7/11/25 19:36, James Morse wrote:
> Expand the probing support with the control and monitor types
> we can use with resctrl.
>
> CC: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 154 +++++++++++++++++++-
> drivers/platform/arm64/mpam/mpam_internal.h | 53 +++++++
> 2 files changed, 206 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> index 8646fb85ad09..61911831ab39 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -102,7 +102,7 @@ static LLIST_HEAD(mpam_garbage);
>
> static u32 __mpam_read_reg(struct mpam_msc *msc, u16 reg)
> {
> - WARN_ON_ONCE(reg > msc->mapped_hwpage_sz);
> + WARN_ON_ONCE(reg + sizeof(u32) > msc->mapped_hwpage_sz);
> WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &msc->accessibility));
>
> return readl_relaxed(msc->mapped_hwpage + reg);
> @@ -131,6 +131,20 @@ static inline void _mpam_write_partsel_reg(struct mpam_msc *msc, u16 reg, u32 va
> }
> #define mpam_write_partsel_reg(msc, reg, val) _mpam_write_partsel_reg(msc, MPAMCFG_##reg, val)
>
> +static inline u32 _mpam_read_monsel_reg(struct mpam_msc *msc, u16 reg)
> +{
> + mpam_mon_sel_lock_held(msc);
> + return __mpam_read_reg(msc, reg);
> +}
> +#define mpam_read_monsel_reg(msc, reg) _mpam_read_monsel_reg(msc, MSMON_##reg)
> +
> +static inline void _mpam_write_monsel_reg(struct mpam_msc *msc, u16 reg, u32 val)
> +{
> + mpam_mon_sel_lock_held(msc);
> + __mpam_write_reg(msc, reg, val);
> +}
> +#define mpam_write_monsel_reg(msc, reg, val) _mpam_write_monsel_reg(msc, MSMON_##reg, val)
> +
> static u64 mpam_msc_read_idr(struct mpam_msc *msc)
> {
> u64 idr_high = 0, idr_low;
> @@ -645,6 +659,137 @@ static struct mpam_msc_ris *mpam_get_or_create_ris(struct mpam_msc *msc,
> return found;
> }
>
> +/*
> + * IHI009A.a has this nugget: "If a monitor does not support automatic behaviour
> + * of NRDY, software can use this bit for any purpose" - so hardware might not
> + * implement this - but it isn't RES0.
> + *
> + * Try and see what values stick in this bit. If we can write either value,
> + * it's probably not implemented by hardware.
> + */
> +#define mpam_ris_hw_probe_hw_nrdy(_ris, _mon_reg, _result) \
> +do { \
> + u32 now; \
> + u64 mon_sel; \
> + bool can_set, can_clear; \
> + struct mpam_msc *_msc = _ris->vmsc->msc; \
> + \
> + if (WARN_ON_ONCE(!mpam_mon_sel_inner_lock(_msc))) { \
> + _result = false; \
> + break; \
> + } \
> + mon_sel = FIELD_PREP(MSMON_CFG_MON_SEL_MON_SEL, 0) | \
> + FIELD_PREP(MSMON_CFG_MON_SEL_RIS, _ris->ris_idx); \
> + mpam_write_monsel_reg(_msc, CFG_MON_SEL, mon_sel); \
> + \
> + mpam_write_monsel_reg(_msc, _mon_reg, MSMON___NRDY); \
> + now = mpam_read_monsel_reg(_msc, _mon_reg); \
> + can_set = now & MSMON___NRDY; \
> + \
> + mpam_write_monsel_reg(_msc, _mon_reg, 0); \
> + now = mpam_read_monsel_reg(_msc, _mon_reg); \
> + can_clear = !(now & MSMON___NRDY); \
> + mpam_mon_sel_inner_unlock(_msc); \
> + \
> + _result = (!can_set || !can_clear); \
> +} while (0)
> +
> +static void mpam_ris_hw_probe(struct mpam_msc_ris *ris)
> +{
> + int err;
> + struct mpam_msc *msc = ris->vmsc->msc;
> + struct mpam_props *props = &ris->props;
> +
> + lockdep_assert_held(&msc->probe_lock);
> + lockdep_assert_held(&msc->part_sel_lock);
> +
> + /* Cache Portion partitioning */
> + if (FIELD_GET(MPAMF_IDR_HAS_CPOR_PART, ris->idr)) {
> + u32 cpor_features = mpam_read_partsel_reg(msc, CPOR_IDR);
> +
> + props->cpbm_wd = FIELD_GET(MPAMF_CPOR_IDR_CPBM_WD, cpor_features);
> + if (props->cpbm_wd)
> + mpam_set_feature(mpam_feat_cpor_part, props);
> + }
> +
> + /* Memory bandwidth partitioning */
> + if (FIELD_GET(MPAMF_IDR_HAS_MBW_PART, ris->idr)) {
> + u32 mbw_features = mpam_read_partsel_reg(msc, MBW_IDR);
> +
> + /* portion bitmap resolution */
> + props->mbw_pbm_bits = FIELD_GET(MPAMF_MBW_IDR_BWPBM_WD, mbw_features);
> + if (props->mbw_pbm_bits &&
> + FIELD_GET(MPAMF_MBW_IDR_HAS_PBM, mbw_features))
> + mpam_set_feature(mpam_feat_mbw_part, props);
> +
> + props->bwa_wd = FIELD_GET(MPAMF_MBW_IDR_BWA_WD, mbw_features);
> + if (props->bwa_wd && FIELD_GET(MPAMF_MBW_IDR_HAS_MAX, mbw_features))
> + mpam_set_feature(mpam_feat_mbw_max, props);
> + }
> +
> + /* Performance Monitoring */
> + if (FIELD_GET(MPAMF_IDR_HAS_MSMON, ris->idr)) {
> + u32 msmon_features = mpam_read_partsel_reg(msc, MSMON_IDR);
> +
> + /*
> + * If the firmware max-nrdy-us property is missing, the
> + * CSU counters can't be used. Should we wait forever?
> + */
> + err = device_property_read_u32(&msc->pdev->dev,
> + "arm,not-ready-us",
> + &msc->nrdy_usec);
> +
> + if (FIELD_GET(MPAMF_MSMON_IDR_MSMON_CSU, msmon_features)) {
> + u32 csumonidr;
> +
> + csumonidr = mpam_read_partsel_reg(msc, CSUMON_IDR);
> + props->num_csu_mon = FIELD_GET(MPAMF_CSUMON_IDR_NUM_MON, csumonidr);
> + if (props->num_csu_mon) {
> + bool hw_managed;
> +
> + mpam_set_feature(mpam_feat_msmon_csu, props);
> +
> + /* Is NRDY hardware managed? */
> + mpam_mon_sel_outer_lock(msc);
> + mpam_ris_hw_probe_hw_nrdy(ris, CSU, hw_managed);
> + mpam_mon_sel_outer_unlock(msc);
> + if (hw_managed)
> + mpam_set_feature(mpam_feat_msmon_csu_hw_nrdy, props);
> + }
> +
> + /*
> + * Accept the missing firmware property if NRDY appears
> + * un-implemented.
> + */
> + if (err && mpam_has_feature(mpam_feat_msmon_csu_hw_nrdy, props))
> + pr_err_once("Counters are not usable because not-ready timeout was not provided by firmware.");
> + }
> + if (FIELD_GET(MPAMF_MSMON_IDR_MSMON_MBWU, msmon_features)) {
> + bool hw_managed;
> + u32 mbwumonidr = mpam_read_partsel_reg(msc, MBWUMON_IDR);
> +
> + props->num_mbwu_mon = FIELD_GET(MPAMF_MBWUMON_IDR_NUM_MON, mbwumonidr);
> + if (props->num_mbwu_mon)
> + mpam_set_feature(mpam_feat_msmon_mbwu, props);
> +
> + if (FIELD_GET(MPAMF_MBWUMON_IDR_HAS_RWBW, mbwumonidr))
> + mpam_set_feature(mpam_feat_msmon_mbwu_rwbw, props);
> +
> + /* Is NRDY hardware managed? */
> + mpam_mon_sel_outer_lock(msc);
> + mpam_ris_hw_probe_hw_nrdy(ris, MBWU, hw_managed);
> + mpam_mon_sel_outer_unlock(msc);
> + if (hw_managed)
> + mpam_set_feature(mpam_feat_msmon_mbwu_hw_nrdy, props);
> +
> + /*
> + * Don't warn about any missing firmware property for
> + * MBWU NRDY - it doesn't make any sense!
> + */
> + }
> + }
> +}
> +
> static int mpam_msc_hw_probe(struct mpam_msc *msc)
> {
> u64 idr;
> @@ -665,6 +810,7 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
>
> idr = mpam_msc_read_idr(msc);
> mutex_unlock(&msc->part_sel_lock);
> +
> msc->ris_max = FIELD_GET(MPAMF_IDR_RIS_MAX, idr);
>
> /* Use these values so partid/pmg always starts with a valid value */
> @@ -685,6 +831,12 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
> ris = mpam_get_or_create_ris(msc, ris_idx);
> if (IS_ERR(ris))
> return PTR_ERR(ris);
> + ris->idr = idr;
> +
> + mutex_lock(&msc->part_sel_lock);
> + __mpam_part_sel(ris_idx, 0, msc);
> + mpam_ris_hw_probe(ris);
> + mutex_unlock(&msc->part_sel_lock);
> }
>
> spin_lock(&partid_max_lock);
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
> index 42a454d5f914..ae6fd1f62cc4 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -136,6 +136,55 @@ static inline void mpam_mon_sel_lock_held(struct mpam_msc *msc)
> lockdep_assert_preemption_enabled();
> }
>
> +/*
> + * When we compact the supported features, we don't care what they are.
> + * Storing them as a bitmap makes life easy.
> + */
> +typedef u16 mpam_features_t;
> +
> +/* Bits for mpam_features_t */
> +enum mpam_device_features {
> + mpam_feat_ccap_part = 0,
> + mpam_feat_cpor_part,
> + mpam_feat_mbw_part,
> + mpam_feat_mbw_min,
> + mpam_feat_mbw_max,
> + mpam_feat_mbw_prop,
> + mpam_feat_msmon,
> + mpam_feat_msmon_csu,
> + mpam_feat_msmon_csu_capture,
> + mpam_feat_msmon_csu_hw_nrdy,
> + mpam_feat_msmon_mbwu,
> + mpam_feat_msmon_mbwu_capture,
> + mpam_feat_msmon_mbwu_rwbw,
> + mpam_feat_msmon_mbwu_hw_nrdy,
> + mpam_feat_msmon_capt,
> + MPAM_FEATURE_LAST,
> +};
> +#define MPAM_ALL_FEATURES ((1 << MPAM_FEATURE_LAST) - 1)
Consider a static assert to check the type is big enough.
static_assert(BITS_PER_TYPE(mpam_features_t) >= MPAM_FEATURE_LAST);
> +
> +struct mpam_props {
> + mpam_features_t features;
> +
> + u16 cpbm_wd;
> + u16 mbw_pbm_bits;
> + u16 bwa_wd;
> + u16 num_csu_mon;
> + u16 num_mbwu_mon;
> +};
> +
> +static inline bool mpam_has_feature(enum mpam_device_features feat,
> + struct mpam_props *props)
> +{
> + return (1 << feat) & props->features;
> +}
> +
> +static inline void mpam_set_feature(enum mpam_device_features feat,
> + struct mpam_props *props)
> +{
> + props->features |= (1 << feat);
> +}
> +
> struct mpam_class {
> /* mpam_components in this class */
> struct list_head components;
> @@ -175,6 +224,8 @@ struct mpam_vmsc {
> /* mpam_msc_ris in this vmsc */
> struct list_head ris;
>
> + struct mpam_props props;
> +
> /* All RIS in this vMSC are members of this MSC */
> struct mpam_msc *msc;
>
> @@ -186,6 +237,8 @@ struct mpam_vmsc {
>
> struct mpam_msc_ris {
> u8 ris_idx;
> + u64 idr;
> + struct mpam_props props;
>
> cpumask_t affinity;
>
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 21/36] arm_mpam: Merge supported features during mpam_enable() into mpam_class
2025-07-11 18:36 ` [RFC PATCH 21/36] arm_mpam: Merge supported features during mpam_enable() into mpam_class James Morse
@ 2025-07-28 9:15 ` Ben Horgan
0 siblings, 0 replies; 117+ messages in thread
From: Ben Horgan @ 2025-07-28 9:15 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 7/11/25 19:36, James Morse wrote:
> To make a decision about whether to expose an mpam class as
> a resctrl resource we need to know its overall supported
> features and properties.
>
> Once we've probed all the resources, we can walk the tree
> and produced overall values by merging the bitmaps. This
nit: s/produced/produce/
> eliminates features that are only supported by some MSC
> that make up a component or class.
>
> If bitmap properties are mismatched within a component we
> cannot support the mismatched feature.
>
> Care has to be taken as vMSC may hold mismatched RIS.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 215 ++++++++++++++++++++
> drivers/platform/arm64/mpam/mpam_internal.h | 8 +
> 2 files changed, 223 insertions(+)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> index 61911831ab39..7b042a35405a 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -1186,8 +1186,223 @@ static struct platform_driver mpam_msc_driver = {
> .remove = mpam_msc_drv_remove,
> };
>
> +/* Any of these features mean the BWA_WD field is valid. */
> +static bool mpam_has_bwa_wd_feature(struct mpam_props *props)
> +{
> + if (mpam_has_feature(mpam_feat_mbw_min, props))
> + return true;
> + if (mpam_has_feature(mpam_feat_mbw_max, props))
> + return true;
> + if (mpam_has_feature(mpam_feat_mbw_prop, props))
> + return true;
> + return false;
> +}
> +
> +#define MISMATCHED_HELPER(parent, child, helper, field, alias) \
> + helper(parent) && \
> + ((helper(child) && (parent)->field != (child)->field) || \
> + (!helper(child) && !(alias)))
> +
> +#define MISMATCHED_FEAT(parent, child, feat, field, alias) \
> + mpam_has_feature((feat), (parent)) && \
> + ((mpam_has_feature((feat), (child)) && (parent)->field != (child)->field) || \
> + (!mpam_has_feature((feat), (child)) && !(alias)))
> +
> +#define CAN_MERGE_FEAT(parent, child, feat, alias) \
> + (alias) && !mpam_has_feature((feat), (parent)) && \
> + mpam_has_feature((feat), (child))
> +
> +/*
> + * Combime two props fields.
nit: s/combime/combine/
> + * If this is for controls that alias the same resource, it is safe to just
> + * copy the values over. If two aliasing controls implement the same scheme
> + * a safe value must be picked.
> + * For non-aliasing controls, these control different resources, and the
> + * resulting safe value must be compatible with both. When merging values in
> + * the tree, all the aliasing resources must be handled first.
> + * On mismatch, parent is modified.
> + */
> +static void __props_mismatch(struct mpam_props *parent,
> + struct mpam_props *child, bool alias)
> +{
> + if (CAN_MERGE_FEAT(parent, child, mpam_feat_cpor_part, alias)) {
> + parent->cpbm_wd = child->cpbm_wd;
> + } else if (MISMATCHED_FEAT(parent, child, mpam_feat_cpor_part,
> + cpbm_wd, alias)) {
> + pr_debug("%s cleared cpor_part\n", __func__);
> + mpam_clear_feature(mpam_feat_cpor_part, &parent->features);
> + parent->cpbm_wd = 0;
> + }
> +
> + if (CAN_MERGE_FEAT(parent, child, mpam_feat_mbw_part, alias)) {
> + parent->mbw_pbm_bits = child->mbw_pbm_bits;
> + } else if (MISMATCHED_FEAT(parent, child, mpam_feat_mbw_part,
> + mbw_pbm_bits, alias)) {
> + pr_debug("%s cleared mbw_part\n", __func__);
> + mpam_clear_feature(mpam_feat_mbw_part, &parent->features);
> + parent->mbw_pbm_bits = 0;
> + }
> +
> + /* bwa_wd is a count of bits, fewer bits means less precision */
> + if (alias && !mpam_has_bwa_wd_feature(parent) && mpam_has_bwa_wd_feature(child)) {
> + parent->bwa_wd = child->bwa_wd;
> + } else if (MISMATCHED_HELPER(parent, child, mpam_has_bwa_wd_feature,
> + bwa_wd, alias)) {
> + pr_debug("%s took the min bwa_wd\n", __func__);
> + parent->bwa_wd = min(parent->bwa_wd, child->bwa_wd);
> + }
> +
> + /* For num properties, take the minimum */
> + if (CAN_MERGE_FEAT(parent, child, mpam_feat_msmon_csu, alias)) {
> + parent->num_csu_mon = child->num_csu_mon;
> + } else if (MISMATCHED_FEAT(parent, child, mpam_feat_msmon_csu,
> + num_csu_mon, alias)) {
> + pr_debug("%s took the min num_csu_mon\n", __func__);
> + parent->num_csu_mon = min(parent->num_csu_mon, child->num_csu_mon);
> + }
> +
> + if (CAN_MERGE_FEAT(parent, child, mpam_feat_msmon_mbwu, alias)) {
> + parent->num_mbwu_mon = child->num_mbwu_mon;
> + } else if (MISMATCHED_FEAT(parent, child, mpam_feat_msmon_mbwu,
> + num_mbwu_mon, alias)) {
> + pr_debug("%s took the min num_mbwu_mon\n", __func__);
> + parent->num_mbwu_mon = min(parent->num_mbwu_mon, child->num_mbwu_mon);
> + }
> +
> + if (alias) {
> + /* Merge features for aliased resources */
> + parent->features |= child->features;
> + } else {
> + /* Clear missing features for non aliasing */
> + parent->features &= child->features;
> + }
> +}
> +
> +/*
> + * If a vmsc doesn't match class feature/configuration, do the right thing(tm).
> + * For 'num' properties we can just take the minimum.
> + * For properties where the mismatched unused bits would make a difference, we
> + * nobble the class feature, as we can't configure all the resources.
> + * e.g. The L3 cache is composed of two resources with 13 and 17 portion
> + * bitmaps respectively.
> + */
> +static void
> +__class_props_mismatch(struct mpam_class *class, struct mpam_vmsc *vmsc)
> +{
> + struct mpam_props *cprops = &class->props;
> + struct mpam_props *vprops = &vmsc->props;
> +
> + lockdep_assert_held(&mpam_list_lock); /* we modify class */
> +
> + pr_debug("%s: Merging features for class:0x%lx &= vmsc:0x%lx\n",
> + dev_name(&vmsc->msc->pdev->dev),
> + (long)cprops->features, (long)vprops->features);
> +
> + /* Take the safe value for any common features */
> + __props_mismatch(cprops, vprops, false);
> +}
> +
> +static void
> +__vmsc_props_mismatch(struct mpam_vmsc *vmsc, struct mpam_msc_ris *ris)
> +{
> + struct mpam_props *rprops = &ris->props;
> + struct mpam_props *vprops = &vmsc->props;
> +
> + lockdep_assert_held(&mpam_list_lock); /* we modify vmsc */
> +
> + pr_debug("%s: Merging features for vmsc:0x%lx |= ris:0x%lx\n",
> + dev_name(&vmsc->msc->pdev->dev),
> + (long)vprops->features, (long)rprops->features);
> +
> + /*
> + * Merge mismatched features - Copy any features that aren't common,
> + * but take the safe value for any common features.
> + */
> + __props_mismatch(vprops, rprops, true);
> +}
> +
> +/*
> + * Copy the first component's first vMSC's properties and features to the
> + * class. __class_props_mismatch() will remove conflicts.
> + * It is not possible to have a class with no components, or a component with
> + * no resources. The vMSC properties have already been built.
> + */
> +static void mpam_enable_init_class_features(struct mpam_class *class)
> +{
> + struct mpam_vmsc *vmsc;
> + struct mpam_component *comp;
> +
> + comp = list_first_entry_or_null(&class->components,
> + struct mpam_component, class_list);
> + if (WARN_ON(!comp))
> + return;
> +
> + vmsc = list_first_entry_or_null(&comp->vmsc,
> + struct mpam_vmsc, comp_list);
> + if (WARN_ON(!vmsc))
> + return;
> +
> + class->props = vmsc->props;
> +}
> +
> +static void mpam_enable_merge_vmsc_features(struct mpam_component *comp)
> +{
> + struct mpam_vmsc *vmsc;
> + struct mpam_msc_ris *ris;
> + struct mpam_class *class = comp->class;
> +
> + list_for_each_entry(vmsc, &comp->vmsc, comp_list) {
> + list_for_each_entry(ris, &vmsc->ris, vmsc_list) {
> + __vmsc_props_mismatch(vmsc, ris);
> + class->nrdy_usec = max(class->nrdy_usec,
> + vmsc->msc->nrdy_usec);
> + }
> + }
> +}
> +
> +static void mpam_enable_merge_class_features(struct mpam_component *comp)
> +{
> + struct mpam_vmsc *vmsc;
> + struct mpam_class *class = comp->class;
> +
> + list_for_each_entry(vmsc, &comp->vmsc, comp_list)
> + __class_props_mismatch(class, vmsc);
> +}
> +
> +/*
> + * Merge all the common resource features into class.
> + * vmsc features are bitwise-or'd together, this must be done first.
> + * Next the class features are the bitwise-and of all the vmsc features.
> + * Other features are the min/max as appropriate.
> + *
> + * To avoid walking the whole tree twice, the class->nrdy_usec property is
> + * updated when working with the vmsc as it is a max(), and doesn't need
> + * initialising first.
> + */
> +static void mpam_enable_merge_features(struct list_head *all_classes_list)
> +{
> + struct mpam_class *class;
> + struct mpam_component *comp;
> +
> + lockdep_assert_held(&mpam_list_lock);
> +
> + list_for_each_entry(class, all_classes_list, classes_list) {
> + list_for_each_entry(comp, &class->components, class_list)
> + mpam_enable_merge_vmsc_features(comp);
> +
> + mpam_enable_init_class_features(class);
> +
> + list_for_each_entry(comp, &class->components, class_list)
> + mpam_enable_merge_class_features(comp);
> + }
> +}
> +
> static void mpam_enable_once(void)
> {
> + mutex_lock(&mpam_list_lock);
> + mpam_enable_merge_features(&mpam_classes);
> + mutex_unlock(&mpam_list_lock);
> +
> mutex_lock(&mpam_cpuhp_state_lock);
> cpuhp_remove_state(mpam_cpuhp_state);
> mpam_cpuhp_state = 0;
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
> index ae6fd1f62cc4..be56234b84b4 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -185,12 +185,20 @@ static inline void mpam_set_feature(enum mpam_device_features feat,
> props->features |= (1 << feat);
> }
>
> +static inline void mpam_clear_feature(enum mpam_device_features feat,
> + mpam_features_t *supported)
> +{
> + *supported &= ~(1 << feat);
> +}
> +
> struct mpam_class {
> /* mpam_components in this class */
> struct list_head components;
>
> cpumask_t affinity;
>
> + struct mpam_props props;
> + u32 nrdy_usec;
> u8 level;
> enum mpam_class_types type;
>
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 22/36] arm_mpam: Reset MSC controls from cpu hp callbacks
2025-07-11 18:36 ` [RFC PATCH 22/36] arm_mpam: Reset MSC controls from cpu hp callbacks James Morse
@ 2025-07-28 9:49 ` Ben Horgan
2025-08-08 7:05 ` James Morse
0 siblings, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-28 9:49 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 7/11/25 19:36, James Morse wrote:
> When a CPU comes online, it may bring a newly accessible MSC with
> it. Only the default partid has its value reset by hardware, and
> even then the MSC might not have been reset since its config was
> previously dirtied, e.g. by kexec.
>
> Any in-use partid must have its configuration restored, or reset.
> In-use partids may be held in caches and evicted later.
>
> MSC are also reset when CPUs are taken offline to cover cases where
> firmware doesn't reset the MSC over reboot using UEFI, or kexec
> where there is no firmware involvement.
>
> If the configuration for a RIS has not been touched since it was
> brought online, it does not need resetting again.
>
> To reset, write the maximum values for all discovered controls.
>
> CC: Rohit Mathew <Rohit.Mathew@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 124 +++++++++++++++++++-
> drivers/platform/arm64/mpam/mpam_internal.h | 8 ++
> 2 files changed, 131 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> index 7b042a35405a..d014dbe0ab96 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -7,6 +7,7 @@
> #include <linux/atomic.h>
> #include <linux/arm_mpam.h>
> #include <linux/bitfield.h>
> +#include <linux/bitmap.h>
> #include <linux/cacheinfo.h>
> #include <linux/cpu.h>
> #include <linux/cpumask.h>
> @@ -849,8 +850,116 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
> return 0;
> }
>
> +static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd)
> +{
> + u32 num_words, msb;
> + u32 bm = ~0;
> + int i;
> +
> + lockdep_assert_held(&msc->part_sel_lock);
> +
> + if (wd == 0)
> + return;
> +
> + /*
> + * Write all ~0 to all but the last 32bit-word, which may
> + * have fewer bits...
> + */
> + num_words = DIV_ROUND_UP(wd, 32);
> + for (i = 0; i < num_words - 1; i++, reg += sizeof(bm))
> + __mpam_write_reg(msc, reg, bm);
> +
> + /*
> + * ....and then the last (maybe) partial 32bit word. When wd is a
> + * multiple of 32, msb should be 31 to write a full 32bit word.
> + */
> + msb = (wd - 1) % 32;
> + bm = GENMASK(msb, 0);
> + if (bm)
> + __mpam_write_reg(msc, reg, bm);
Drop the 'if': bit 0 is always part of the mask, so bm can never be zero here (sketch after the hunk).
> +}
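In case it helps, an untested sketch of that simplification; since GENMASK(msb, 0)
always has bit 0 set, the final write can be unconditional:

	/* last (maybe partial) 32bit word; GENMASK(msb, 0) is never zero */
	msb = (wd - 1) % 32;
	__mpam_write_reg(msc, reg, GENMASK(msb, 0));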
> +
> +static void mpam_reset_ris_partid(struct mpam_msc_ris *ris, u16 partid)
> +{
> + u16 bwa_fract = MPAMCFG_MBW_MAX_MAX;
> + struct mpam_msc *msc = ris->vmsc->msc;
> + struct mpam_props *rprops = &ris->props;
> +
> + mpam_assert_srcu_read_lock_held();
> +
> + mutex_lock(&msc->part_sel_lock);
> + __mpam_part_sel(ris->ris_idx, partid, msc);
> +
> + if (mpam_has_feature(mpam_feat_cpor_part, rprops))
> + mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM, rprops->cpbm_wd);
> +
> + if (mpam_has_feature(mpam_feat_mbw_part, rprops))
> + mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM, rprops->mbw_pbm_bits);
> +
> + if (mpam_has_feature(mpam_feat_mbw_min, rprops))
> + mpam_write_partsel_reg(msc, MBW_MIN, 0);
> +
> + if (mpam_has_feature(mpam_feat_mbw_max, rprops))
> + mpam_write_partsel_reg(msc, MBW_MAX, bwa_fract);
> +
> + if (mpam_has_feature(mpam_feat_mbw_prop, rprops))
> + mpam_write_partsel_reg(msc, MBW_PROP, bwa_fract);
> + mutex_unlock(&msc->part_sel_lock);
> +}
> +
> +static void mpam_reset_ris(struct mpam_msc_ris *ris)
> +{
> + u16 partid, partid_max;
> +
> + mpam_assert_srcu_read_lock_held();
> +
> + if (ris->in_reset_state)
> + return;
> +
> + spin_lock(&partid_max_lock);
> + partid_max = mpam_partid_max;
> + spin_unlock(&partid_max_lock);
> + for (partid = 0; partid < partid_max; partid++)
> + mpam_reset_ris_partid(ris, partid);
> +}
> +
> +static void mpam_reset_msc(struct mpam_msc *msc, bool online)
> +{
> + int idx;
> + struct mpam_msc_ris *ris;
> +
> + mpam_assert_srcu_read_lock_held();
> +
> + mpam_mon_sel_outer_lock(msc);
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_srcu(ris, &msc->ris, msc_list, srcu_read_lock_held(&mpam_srcu)) {
> + mpam_reset_ris(ris);
> +
> + /*
> + * Set in_reset_state when coming online. The reset state
> + * for non-zero partid may be lost while the CPUs are offline.
> + */
> + ris->in_reset_state = online;
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> + mpam_mon_sel_outer_unlock(msc);
> +}
> +
> static int mpam_cpu_online(unsigned int cpu)
> {
> + int idx;
> + struct mpam_msc *msc;
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list, srcu_read_lock_held(&mpam_srcu)) {
> + if (!cpumask_test_cpu(cpu, &msc->accessibility))
> + continue;
> +
> + if (atomic_fetch_inc(&msc->online_refs) == 0)
> + mpam_reset_msc(msc, true);
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +
> return 0;
> }
>
> @@ -886,6 +995,19 @@ static int mpam_discovery_cpu_online(unsigned int cpu)
>
> static int mpam_cpu_offline(unsigned int cpu)
> {
> + int idx;
> + struct mpam_msc *msc;
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list, srcu_read_lock_held(&mpam_srcu)) {
> + if (!cpumask_test_cpu(cpu, &msc->accessibility))
> + continue;
> +
> + if (atomic_dec_and_test(&msc->online_refs))
> + mpam_reset_msc(msc, false);
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +
> return 0;
> }
>
> @@ -1419,7 +1541,7 @@ static void mpam_enable_once(void)
> mpam_register_cpuhp_callbacks(mpam_cpu_online, mpam_cpu_offline);
>
> printk(KERN_INFO "MPAM enabled with %u partid and %u pmg\n",
> - mpam_partid_max + 1, mpam_pmg_max + 1);
> + READ_ONCE(mpam_partid_max) + 1, mpam_pmg_max + 1);
Belongs in 'arm_mpam: Probe MSCs to find the supported partid/pmg values'.
> }
>
> /*
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
> index be56234b84b4..f3cc88136524 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -5,6 +5,7 @@
> #define MPAM_INTERNAL_H
>
> #include <linux/arm_mpam.h>
> +#include <linux/atomic.h>
> #include <linux/cpumask.h>
> #include <linux/io.h>
> #include <linux/llist.h>
> @@ -43,6 +44,7 @@ struct mpam_msc {
> struct pcc_mbox_chan *pcc_chan;
> u32 nrdy_usec;
> cpumask_t accessibility;
> + atomic_t online_refs;
>
> /*
> * probe_lock is only take during discovery. After discovery these
> @@ -247,6 +249,7 @@ struct mpam_msc_ris {
> u8 ris_idx;
> u64 idr;
> struct mpam_props props;
> + bool in_reset_state;
>
> cpumask_t affinity;
>
> @@ -266,6 +269,11 @@ struct mpam_msc_ris {
> extern struct srcu_struct mpam_srcu;
> extern struct list_head mpam_classes;
>
> +static inline void mpam_assert_srcu_read_lock_held(void)
> +{
> + WARN_ON_ONCE(!srcu_read_lock_held((&mpam_srcu)));
> +}
> +
> /* System wide partid/pmg values */
> extern u16 mpam_partid_max;
> extern u8 mpam_pmg_max;
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table
2025-07-16 17:07 ` Jonathan Cameron
2025-07-23 16:39 ` Ben Horgan
@ 2025-07-28 10:08 ` Jonathan Cameron
2025-08-05 17:08 ` James Morse
2025-08-05 17:07 ` James Morse
2 siblings, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-07-28 10:08 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
> > +static struct acpi_table_header *get_table(void)
> > +{
> > + struct acpi_table_header *table;
> > + acpi_status status;
> > +
> > + if (acpi_disabled || !system_supports_mpam())
> > + return NULL;
> > +
> > + status = acpi_get_table(ACPI_SIG_MPAM, 0, &table);
> > + if (ACPI_FAILURE(status))
> > + return NULL;
> > +
> > + if (table->revision != 1)
Missing an acpi_put_table() here (see the sketch after the quoted function).
I'm messing around with ACQUIRE() that is queued in the CXL tree
for the coming merge window and noticed this.
https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.17/cleanup-acquire
Interestingly this is a new corner case where we want conditional-locking style
handling combined with return_ptr() style cleanup. Maybe too much of a niche
to bother with infrastructure.
Worth noting though that one layer up it is probably worth something like:
DEFINE_FREE(acpi_table_mpam, struct acpi_table_header *, if (_T) acpi_put_table(_T));
That enables nice clean code like:
static int __init acpi_mpam_parse(void)
{
	struct acpi_table_header *mpam __free(acpi_table_mpam) = get_table();

	if (!mpam)
		return 0;

	return _parse_table(mpam);
}
This series was big enough that I'm spinning a single 'suggested changes'
patch on top of it that includes stuff like this. Might take another day or so.
Jonathan
> > + return NULL;
> > +
> > + return table;
> > +}
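For the record, an untested sketch of the fix for the missing put, on top of the
quoted get_table():

	if (table->revision != 1) {
		acpi_put_table(table);
		return NULL;
	}

	return table;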
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 24/36] arm_mpam: Extend reset logic to allow devices to be reset any time
2025-07-11 18:36 ` [RFC PATCH 24/36] arm_mpam: Extend reset logic to allow devices to be reset any time James Morse
@ 2025-07-28 10:22 ` Ben Horgan
2025-08-08 7:07 ` James Morse
0 siblings, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-28 10:22 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 7/11/25 19:36, James Morse wrote:
> cpuhp callbacks aren't the only time the MSC configuration may need to
> be reset. Resctrl has an API call to reset a class.
> If an MPAM error interrupt arrives it indicates the driver has
> misprogrammed an MSC. The safest thing to do is reset all the MSCs
> and disable MPAM.
>
> Add a helper to reset RIS via their class. Call this from mpam_disable(),
> which can be scheduled from the error interrupt handler.
>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 62 ++++++++++++++++++++-
> drivers/platform/arm64/mpam/mpam_internal.h | 1 +
> 2 files changed, 61 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> index 2e32e54cc081..145535cd4732 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -916,8 +916,6 @@ static int mpam_reset_ris(void *arg)
> u16 partid, partid_max;
> struct mpam_msc_ris *ris = arg;
>
> - mpam_assert_srcu_read_lock_held();
> -
> if (ris->in_reset_state)
> return 0;
>
> @@ -1575,6 +1573,66 @@ static void mpam_enable_once(void)
> READ_ONCE(mpam_partid_max) + 1, mpam_pmg_max + 1);
> }
>
> +static void mpam_reset_component_locked(struct mpam_component *comp)
> +{
> + int idx;
> + struct mpam_msc *msc;
> + struct mpam_vmsc *vmsc;
> + struct mpam_msc_ris *ris;
> +
> + might_sleep();
> + lockdep_assert_cpus_held();
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_rcu(vmsc, &comp->vmsc, comp_list) {
> + msc = vmsc->msc;
> +
> + list_for_each_entry_rcu(ris, &vmsc->ris, vmsc_list) {
> + if (!ris->in_reset_state)
> + mpam_touch_msc(msc, mpam_reset_ris, ris);
> + ris->in_reset_state = true;
> + }
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +}
> +
> +static void mpam_reset_class_locked(struct mpam_class *class)
> +{
> + int idx;
> + struct mpam_component *comp;
> +
> + lockdep_assert_cpus_held();
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_rcu(comp, &class->components, class_list)
> + mpam_reset_component_locked(comp);
> + srcu_read_unlock(&mpam_srcu, idx);
> +}
> +
> +static void mpam_reset_class(struct mpam_class *class)
> +{
> + cpus_read_lock();
> + mpam_reset_class_locked(class);
> + cpus_read_unlock();
> +}
> +
> +/*
> + * Called in response to an error IRQ.
> + * All of MPAMs errors indicate a software bug, restore any modified
> + * controls to their reset values.
> + */
> +void mpam_disable(void)
> +{
> + int idx;
> + struct mpam_class *class;
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_srcu(class, &mpam_classes, classes_list,
> + srcu_read_lock_held(&mpam_srcu))
> + mpam_reset_class(class);
> + srcu_read_unlock(&mpam_srcu, idx);
> +}
Consider moving to the next patch where you introduce interrupt support.
> +
> /*
> * Enable mpam once all devices have been probed.
> * Scheduled by mpam_discovery_cpu_online() once all devices have been created.
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
> index f3cc88136524..de05eece0a31 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -280,6 +280,7 @@ extern u8 mpam_pmg_max;
>
> /* Scheduled work callback to enable mpam once all MSC have been probed */
> void mpam_enable(struct work_struct *work);
> +void mpam_disable(void);
>
> int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
> cpumask_t *affinity);
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 25/36] arm_mpam: Register and enable IRQs
2025-07-11 18:36 ` [RFC PATCH 25/36] arm_mpam: Register and enable IRQs James Morse
` (2 preceding siblings ...)
2025-07-22 15:06 ` Jonathan Cameron
@ 2025-07-28 10:49 ` Ben Horgan
2025-08-08 7:11 ` James Morse
2025-08-04 16:53 ` Fenghua Yu
4 siblings, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-28 10:49 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 7/11/25 19:36, James Morse wrote:
> Register and enable error IRQs. All the MPAM error interrupts indicate a
> software bug, e.g. out of range partid. If the error interrupt is ever
> signalled, attempt to disable MPAM.
>
> Only the irq handler accesses the ESR register, so no locking is needed.
> The work to disable MPAM after an error needs to happen at process
> context, use a threaded interrupt.
>
> There is no support for percpu threaded interrupts, for now schedule
> the work to be done from the irq handler.
>
> Enabling the IRQs in the MSC may involve cross calling to a CPU that
> can access the MSC.
>
> CC: Rohit Mathew <rohit.mathew@arm.com>
> Tested-by: Rohit Mathew <rohit.mathew@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 304 +++++++++++++++++++-
> drivers/platform/arm64/mpam/mpam_internal.h | 9 +-
> 2 files changed, 307 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> index 145535cd4732..af19cc25d16e 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -14,6 +14,9 @@
> #include <linux/device.h>
> #include <linux/errno.h>
> #include <linux/gfp.h>
> +#include <linux/interrupt.h>
> +#include <linux/irq.h>
> +#include <linux/irqdesc.h>
> #include <linux/list.h>
> #include <linux/lockdep.h>
> #include <linux/mutex.h>
> @@ -62,6 +65,12 @@ static DEFINE_SPINLOCK(partid_max_lock);
> */
> static DECLARE_WORK(mpam_enable_work, &mpam_enable);
>
> +/*
> + * All mpam error interrupts indicate a software bug. On receipt, disable the
> + * driver.
> + */
> +static DECLARE_WORK(mpam_broken_work, &mpam_disable);
> +
> /*
> * An MSC is a physical container for controls and monitors, each identified by
> * their RIS index. These share a base-address, interrupts and some MMIO
> @@ -159,6 +168,24 @@ static u64 mpam_msc_read_idr(struct mpam_msc *msc)
> return (idr_high << 32) | idr_low;
> }
>
> +static void mpam_msc_zero_esr(struct mpam_msc *msc)
> +{
> + __mpam_write_reg(msc, MPAMF_ESR, 0);
> + if (msc->has_extd_esr)
> + __mpam_write_reg(msc, MPAMF_ESR + 4, 0);
> +}
> +
> +static u64 mpam_msc_read_esr(struct mpam_msc *msc)
> +{
> + u64 esr_high = 0, esr_low;
> +
> + esr_low = __mpam_read_reg(msc, MPAMF_ESR);
> + if (msc->has_extd_esr)
> + esr_high = __mpam_read_reg(msc, MPAMF_ESR + 4);
> +
> + return (esr_high << 32) | esr_low;
> +}
> +
> static void __mpam_part_sel_raw(u32 partsel, struct mpam_msc *msc)
> {
> lockdep_assert_held(&msc->part_sel_lock);
> @@ -405,12 +432,12 @@ static void mpam_msc_destroy(struct mpam_msc *msc)
>
> lockdep_assert_held(&mpam_list_lock);
>
> - list_del_rcu(&msc->glbl_list);
> - platform_set_drvdata(pdev, NULL);
> -
> list_for_each_entry_safe(ris, tmp, &msc->ris, msc_list)
> mpam_ris_destroy(ris);
>
> + list_del_rcu(&msc->glbl_list);
> + platform_set_drvdata(pdev, NULL);
> +
> add_to_garbage(msc);
> msc->garbage.pdev = pdev;
> }
> @@ -828,6 +855,7 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
> pmg_max = FIELD_GET(MPAMF_IDR_PMG_MAX, idr);
> msc->partid_max = min(msc->partid_max, partid_max);
> msc->pmg_max = min(msc->pmg_max, pmg_max);
> + msc->has_extd_esr = FIELD_GET(MPAMF_IDR_HAS_EXT_ESR, idr);
>
> ris = mpam_get_or_create_ris(msc, ris_idx);
> if (IS_ERR(ris))
> @@ -974,6 +1002,13 @@ static void mpam_reset_msc(struct mpam_msc *msc, bool online)
> mpam_mon_sel_outer_unlock(msc);
> }
>
> +static void _enable_percpu_irq(void *_irq)
> +{
> + int *irq = _irq;
> +
> + enable_percpu_irq(*irq, IRQ_TYPE_NONE);
> +}
> +
> static int mpam_cpu_online(unsigned int cpu)
> {
> int idx;
> @@ -984,6 +1019,9 @@ static int mpam_cpu_online(unsigned int cpu)
> if (!cpumask_test_cpu(cpu, &msc->accessibility))
> continue;
>
> + if (msc->reenable_error_ppi)
> + _enable_percpu_irq(&msc->reenable_error_ppi);
> +
> if (atomic_fetch_inc(&msc->online_refs) == 0)
> mpam_reset_msc(msc, true);
> }
> @@ -1032,6 +1070,9 @@ static int mpam_cpu_offline(unsigned int cpu)
> if (!cpumask_test_cpu(cpu, &msc->accessibility))
> continue;
>
> + if (msc->reenable_error_ppi)
> + disable_percpu_irq(msc->reenable_error_ppi);
> +
> if (atomic_dec_and_test(&msc->online_refs))
> mpam_reset_msc(msc, false);
> }
> @@ -1058,6 +1099,51 @@ static void mpam_register_cpuhp_callbacks(int (*online)(unsigned int online),
> mutex_unlock(&mpam_cpuhp_state_lock);
> }
>
> +static int __setup_ppi(struct mpam_msc *msc)
> +{
> + int cpu;
> +
> + msc->error_dev_id = alloc_percpu_gfp(struct mpam_msc *, GFP_KERNEL);
> + if (!msc->error_dev_id)
> + return -ENOMEM;
> +
> + for_each_cpu(cpu, &msc->accessibility) {
> + struct mpam_msc *empty = *per_cpu_ptr(msc->error_dev_id, cpu);
> +
> + if (empty) {
> + pr_err_once("%s shares PPI with %s!\n",
> + dev_name(&msc->pdev->dev),
> + dev_name(&empty->pdev->dev));
> + return -EBUSY;
> + }
> + *per_cpu_ptr(msc->error_dev_id, cpu) = msc;
> + }
> +
> + return 0;
> +}
> +
> +static int mpam_msc_setup_error_irq(struct mpam_msc *msc)
> +{
> + int irq;
> +
> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
> + if (irq <= 0)
> + return 0;
> +
> + /* Allocate and initialise the percpu device pointer for PPI */
> + if (irq_is_percpu(irq))
> + return __setup_ppi(msc);
> +
> + /* sanity check: shared interrupts can be routed anywhere? */
> + if (!cpumask_equal(&msc->accessibility, cpu_possible_mask)) {
> + pr_err_once("msc:%u is a private resource with a shared error interrupt",
> + msc->id);
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> static int mpam_dt_count_msc(void)
> {
> int count = 0;
> @@ -1266,6 +1352,10 @@ static int mpam_msc_drv_probe(struct platform_device *pdev)
> break;
> }
>
> + err = mpam_msc_setup_error_irq(msc);
> + if (err)
> + break;
> +
> if (device_property_read_u32(&pdev->dev, "pcc-channel",
> &msc->pcc_subspace_id))
> msc->iface = MPAM_IFACE_MMIO;
> @@ -1548,11 +1638,193 @@ static void mpam_enable_merge_features(struct list_head *all_classes_list)
> }
> }
>
> +static char *mpam_errcode_names[16] = {
> + [0] = "No error",
> + [1] = "PARTID_SEL_Range",
> + [2] = "Req_PARTID_Range",
> + [3] = "MSMONCFG_ID_RANGE",
> + [4] = "Req_PMG_Range",
> + [5] = "Monitor_Range",
> + [6] = "intPARTID_Range",
> + [7] = "Unexpected_INTERNAL",
> + [8] = "Undefined_RIS_PART_SEL",
> + [9] = "RIS_No_Control",
> + [10] = "Undefined_RIS_MON_SEL",
> + [11] = "RIS_No_Monitor",
> + [12 ... 15] = "Reserved"
> +};
> +
> +static int mpam_enable_msc_ecr(void *_msc)
> +{
> + struct mpam_msc *msc = _msc;
> +
> + __mpam_write_reg(msc, MPAMF_ECR, 1);
You can use MPAMF_ECR_INTEN here instead of the literal 1 (see the sketch after this hunk).
> +
> + return 0;
> +}
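i.e. something like the below (untested, assuming MPAMF_ECR_INTEN is the existing
bit 0 define for MPAMF_ECR.INTEN):

	__mpam_write_reg(msc, MPAMF_ECR, MPAMF_ECR_INTEN);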
> +
> +static int mpam_disable_msc_ecr(void *_msc)
> +{
> + struct mpam_msc *msc = _msc;
> +
> + __mpam_write_reg(msc, MPAMF_ECR, 0);
> +
> + return 0;
> +}
> +
> +static irqreturn_t __mpam_irq_handler(int irq, struct mpam_msc *msc)
> +{
> + u64 reg;
> + u16 partid;
> + u8 errcode, pmg, ris;
> +
> + if (WARN_ON_ONCE(!msc) ||
> + WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(),
> + &msc->accessibility)))
> + return IRQ_NONE;
> +
> + reg = mpam_msc_read_esr(msc);
> +
> + errcode = FIELD_GET(MPAMF_ESR_ERRCODE, reg);
> + if (!errcode)
> + return IRQ_NONE;
> +
> + /* Clear level triggered irq */
> + mpam_msc_zero_esr(msc);
> +
> + partid = FIELD_GET(MPAMF_ESR_PARTID_OR_MON, reg);
> + pmg = FIELD_GET(MPAMF_ESR_PMG, reg);
> + ris = FIELD_GET(MPAMF_ESR_PMG, reg);
> +
> + pr_err("error irq from msc:%u '%s', partid:%u, pmg: %u, ris: %u\n",
> + msc->id, mpam_errcode_names[errcode], partid, pmg, ris);
> +
> + if (irq_is_percpu(irq)) {
> + mpam_disable_msc_ecr(msc);
> + schedule_work(&mpam_broken_work);
> + return IRQ_HANDLED;
> + }
> +
> + return IRQ_WAKE_THREAD;
> +}
> +
> +static irqreturn_t mpam_ppi_handler(int irq, void *dev_id)
> +{
> + struct mpam_msc *msc = *(struct mpam_msc **)dev_id;
> +
> + return __mpam_irq_handler(irq, msc);
> +}
> +
> +static irqreturn_t mpam_spi_handler(int irq, void *dev_id)
> +{
> + struct mpam_msc *msc = dev_id;
> +
> + return __mpam_irq_handler(irq, msc);
> +}
> +
> +static irqreturn_t mpam_disable_thread(int irq, void *dev_id);
> +
> +static int mpam_register_irqs(void)
> +{
> + int err, irq, idx;
> + struct mpam_msc *msc;
> +
> + lockdep_assert_cpus_held();
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list, srcu_read_lock_held(&mpam_srcu)) {
> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
> + if (irq <= 0)
> + continue;
> +
> + /* The MPAM spec says the interrupt can be SPI, PPI or LPI */
> + /* We anticipate sharing the interrupt with other MSCs */
> + if (irq_is_percpu(irq)) {
> + err = request_percpu_irq(irq, &mpam_ppi_handler,
> + "mpam:msc:error",
> + msc->error_dev_id);
> + if (err)
> + return err;
> +
> + msc->reenable_error_ppi = irq;
> + smp_call_function_many(&msc->accessibility,
> + &_enable_percpu_irq, &irq,
> + true);
> + } else {
> + err = devm_request_threaded_irq(&msc->pdev->dev, irq,
> + &mpam_spi_handler,
> + &mpam_disable_thread,
> + IRQF_SHARED,
> + "mpam:msc:error", msc);
> + if (err)
> + return err;
> + }
> +
> + msc->error_irq_requested = true;
> + mpam_touch_msc(msc, mpam_enable_msc_ecr, msc);
> + msc->error_irq_hw_enabled = true;
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +
> + return 0;
> +}
> +
> +static void mpam_unregister_irqs(void)
> +{
> + int irq, idx;
> + struct mpam_msc *msc;
> +
> + cpus_read_lock();
> + /* take the lock as free_irq() can sleep */
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list, srcu_read_lock_held(&mpam_srcu)) {
> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
> + if (irq <= 0)
> + continue;
> +
> + if (msc->error_irq_hw_enabled) {
> + mpam_touch_msc(msc, mpam_disable_msc_ecr, msc);
> + msc->error_irq_hw_enabled = false;
> + }
> +
> + if (msc->error_irq_requested) {
> + if (irq_is_percpu(irq)) {
> + msc->reenable_error_ppi = 0;
> + free_percpu_irq(irq, msc->error_dev_id);
> + } else {
> + devm_free_irq(&msc->pdev->dev, irq, msc);
> + }
> + msc->error_irq_requested = false;
> + }
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> + cpus_read_unlock();
> +}
> +
> static void mpam_enable_once(void)
> {
> + int err;
> +
> + /*
> + * If all the MSC have been probed, enabling the IRQs happens next.
> + * That involves cross-calling to a CPU that can reach the MSC, and
> + * the locks must be taken in this order:
> + */
> + cpus_read_lock();
> mutex_lock(&mpam_list_lock);
> mpam_enable_merge_features(&mpam_classes);
> +
> + err = mpam_register_irqs();
> + if (err)
> + pr_warn("Failed to register irqs: %d\n", err);
> +
> mutex_unlock(&mpam_list_lock);
> + cpus_read_unlock();
> +
> + if (err) {
> + schedule_work(&mpam_broken_work);
> + return;
> + }
>
> mutex_lock(&mpam_cpuhp_state_lock);
> cpuhp_remove_state(mpam_cpuhp_state);
> @@ -1621,16 +1893,39 @@ static void mpam_reset_class(struct mpam_class *class)
> * All of MPAMs errors indicate a software bug, restore any modified
> * controls to their reset values.
> */
> -void mpam_disable(void)
> +static irqreturn_t mpam_disable_thread(int irq, void *dev_id)
> {
> int idx;
> struct mpam_class *class;
> + struct mpam_msc *msc, *tmp;
> +
> + mutex_lock(&mpam_cpuhp_state_lock);
> + if (mpam_cpuhp_state) {
> + cpuhp_remove_state(mpam_cpuhp_state);
> + mpam_cpuhp_state = 0;
> + }
> + mutex_unlock(&mpam_cpuhp_state_lock);
> +
> + mpam_unregister_irqs();
>
> idx = srcu_read_lock(&mpam_srcu);
> list_for_each_entry_srcu(class, &mpam_classes, classes_list,
> srcu_read_lock_held(&mpam_srcu))
> mpam_reset_class(class);
> srcu_read_unlock(&mpam_srcu, idx);
> +
> + mutex_lock(&mpam_list_lock);
> + list_for_each_entry_safe(msc, tmp, &mpam_all_msc, glbl_list)
> + mpam_msc_destroy(msc);
> + mutex_unlock(&mpam_list_lock);
> + mpam_free_garbage();
> +
> + return IRQ_HANDLED;
> +}
> +
> +void mpam_disable(struct work_struct *ignored)
> +{
> + mpam_disable_thread(0, NULL);
> }
>
> /*
> @@ -1644,7 +1939,6 @@ void mpam_enable(struct work_struct *work)
> struct mpam_msc *msc;
> bool all_devices_probed = true;
>
> - /* Have we probed all the hw devices? */
Stray change. Keep the comment or remove it in the patch that introduced it.
> mutex_lock(&mpam_list_lock);
> list_for_each_entry(msc, &mpam_all_msc, glbl_list) {
> mutex_lock(&msc->probe_lock);
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
> index de05eece0a31..e1c6a2676b54 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -44,6 +44,11 @@ struct mpam_msc {
> struct pcc_mbox_chan *pcc_chan;
> u32 nrdy_usec;
> cpumask_t accessibility;
> + bool has_extd_esr;
> +
> + int reenable_error_ppi;
> + struct mpam_msc * __percpu *error_dev_id;
> +
> atomic_t online_refs;
>
> /*
> @@ -52,6 +57,8 @@ struct mpam_msc {
> */
> struct mutex probe_lock;
> bool probed;
> + bool error_irq_requested;
> + bool error_irq_hw_enabled;
> u16 partid_max;
> u8 pmg_max;
> unsigned long ris_idxs[128 / BITS_PER_LONG];
> @@ -280,7 +287,7 @@ extern u8 mpam_pmg_max;
>
> /* Scheduled work callback to enable mpam once all MSC have been probed */
> void mpam_enable(struct work_struct *work);
> -void mpam_disable(void);
> +void mpam_disable(struct work_struct *work);
>
> int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
> cpumask_t *affinity);
--
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online
2025-07-11 18:36 ` [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online James Morse
2025-07-16 6:49 ` Shaopeng Tan (Fujitsu)
@ 2025-07-28 11:59 ` Ben Horgan
2025-07-28 15:34 ` Dave Martin
2025-08-08 7:14 ` James Morse
2025-08-04 16:39 ` Fenghua Yu
2 siblings, 2 replies; 117+ messages in thread
From: Ben Horgan @ 2025-07-28 11:59 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 7/11/25 19:36, James Morse wrote:
> When CPUs come online the original configuration should be restored.
> Once the maximum partid is known, allocate a configuration array for
> each component, and reprogram each RIS configuration from this.
>
> The MPAM spec describes how multiple controls can interact. To prevent
> this happening by accident, always reset controls that don't have a
> valid configuration. This allows the same helper to be used for
> configuration and reset.
>
> CC: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 236 ++++++++++++++++++--
> drivers/platform/arm64/mpam/mpam_internal.h | 26 ++-
> 2 files changed, 234 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> index bb3695eb84e9..f3ecfda265d2 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -374,12 +374,16 @@ static void mpam_class_destroy(struct mpam_class *class)
> add_to_garbage(class);
> }
>
> +static void __destroy_component_cfg(struct mpam_component *comp);
> +
> static void mpam_comp_destroy(struct mpam_component *comp)
> {
> struct mpam_class *class = comp->class;
>
> lockdep_assert_held(&mpam_list_lock);
>
> + __destroy_component_cfg(comp);
> +
> list_del_rcu(&comp->class_list);
> add_to_garbage(comp);
>
> @@ -909,51 +913,90 @@ static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd)
> __mpam_write_reg(msc, reg, bm);
> }
>
> -static void mpam_reset_ris_partid(struct mpam_msc_ris *ris, u16 partid)
> +/* Called via IPI. Call while holding an SRCU reference */
> +static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
> + struct mpam_config *cfg)
> {
> u16 bwa_fract = MPAMCFG_MBW_MAX_MAX;
> struct mpam_msc *msc = ris->vmsc->msc;
> struct mpam_props *rprops = &ris->props;
>
> - mpam_assert_srcu_read_lock_held();
> -
> mutex_lock(&msc->part_sel_lock);
> __mpam_part_sel(ris->ris_idx, partid, msc);
>
> - if (mpam_has_feature(mpam_feat_cpor_part, rprops))
> - mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM, rprops->cpbm_wd);
> + if (mpam_has_feature(mpam_feat_cpor_part, rprops)) {
> + if (mpam_has_feature(mpam_feat_cpor_part, cfg))
> + mpam_write_partsel_reg(msc, CPBM, cfg->cpbm);
> + else
> + mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM,
> + rprops->cpbm_wd);
> + }
>
> - if (mpam_has_feature(mpam_feat_mbw_part, rprops))
> - mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM, rprops->mbw_pbm_bits);
> + if (mpam_has_feature(mpam_feat_mbw_part, rprops)) {
> + if (mpam_has_feature(mpam_feat_mbw_part, cfg))
> + mpam_write_partsel_reg(msc, MBW_PBM, cfg->mbw_pbm);
> + else
> + mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM,
> + rprops->mbw_pbm_bits);
> + }
>
> if (mpam_has_feature(mpam_feat_mbw_min, rprops))
> mpam_write_partsel_reg(msc, MBW_MIN, 0);
>
> - if (mpam_has_feature(mpam_feat_mbw_max, rprops))
> - mpam_write_partsel_reg(msc, MBW_MAX, bwa_fract);
> + if (mpam_has_feature(mpam_feat_mbw_max, rprops)) {
> + if (mpam_has_feature(mpam_feat_mbw_max, cfg))
> + mpam_write_partsel_reg(msc, MBW_MAX, cfg->mbw_max);
> + else
> + mpam_write_partsel_reg(msc, MBW_MAX, bwa_fract);
> + }
>
> if (mpam_has_feature(mpam_feat_mbw_prop, rprops))
> mpam_write_partsel_reg(msc, MBW_PROP, bwa_fract);
> mutex_unlock(&msc->part_sel_lock);
> }
>
> +struct reprogram_ris {
> + struct mpam_msc_ris *ris;
> + struct mpam_config *cfg;
> +};
> +
> +/* Call with MSC lock held */
> +static int mpam_reprogram_ris(void *_arg)
> +{
> + u16 partid, partid_max;
> + struct reprogram_ris *arg = _arg;
> + struct mpam_msc_ris *ris = arg->ris;
> + struct mpam_config *cfg = arg->cfg;
> +
> + if (ris->in_reset_state)
> + return 0;
> +
> + spin_lock(&partid_max_lock);
> + partid_max = mpam_partid_max;
> + spin_unlock(&partid_max_lock);
> + for (partid = 0; partid <= partid_max; partid++)
> + mpam_reprogram_ris_partid(ris, partid, cfg);
> +
> + return 0;
> +}
> +
> /*
> * Called via smp_call_on_cpu() to prevent migration, while still being
> * pre-emptible.
> */
> static int mpam_reset_ris(void *arg)
> {
> - u16 partid, partid_max;
> struct mpam_msc_ris *ris = arg;
> + struct reprogram_ris reprogram_arg;
> + struct mpam_config empty_cfg = { 0 };
>
> if (ris->in_reset_state)
> return 0;
>
> - spin_lock(&partid_max_lock);
> - partid_max = mpam_partid_max;
> - spin_unlock(&partid_max_lock);
> - for (partid = 0; partid < partid_max; partid++)
> - mpam_reset_ris_partid(ris, partid);
> + reprogram_arg.ris = ris;
> + reprogram_arg.cfg = &empty_cfg;
> +
> + mpam_reprogram_ris(&reprogram_arg);
>
> return 0;
> }
> @@ -984,13 +1027,11 @@ static int mpam_touch_msc(struct mpam_msc *msc, int (*fn)(void *a), void *arg)
>
> static void mpam_reset_msc(struct mpam_msc *msc, bool online)
> {
> - int idx;
> struct mpam_msc_ris *ris;
>
> mpam_assert_srcu_read_lock_held();
>
> mpam_mon_sel_outer_lock(msc);
> - idx = srcu_read_lock(&mpam_srcu);
> list_for_each_entry_srcu(ris, &msc->ris, msc_list, srcu_read_lock_held(&mpam_srcu)) {
> mpam_touch_msc(msc, &mpam_reset_ris, ris);
>
> @@ -1000,10 +1041,38 @@ static void mpam_reset_msc(struct mpam_msc *msc, bool online)
> */
> ris->in_reset_state = online;
> }
> - srcu_read_unlock(&mpam_srcu, idx);
> mpam_mon_sel_outer_unlock(msc);
> }
>
> +static void mpam_reprogram_msc(struct mpam_msc *msc)
> +{
> + int idx;
> + u16 partid;
> + bool reset;
> + struct mpam_config *cfg;
> + struct mpam_msc_ris *ris;
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_rcu(ris, &msc->ris, msc_list) {
> + if (!mpam_is_enabled() && !ris->in_reset_state) {
> + mpam_touch_msc(msc, &mpam_reset_ris, ris);
> + ris->in_reset_state = true;
> + continue;
> + }
> +
> + reset = true;
> + for (partid = 0; partid <= mpam_partid_max; partid++) {
Do we need to consider 'partid_max_lock' here? (A sketch of what that could look like follows the hunk.)
> + cfg = &ris->vmsc->comp->cfg[partid];
> + if (cfg->features)
> + reset = false;
> +
> + mpam_reprogram_ris_partid(ris, partid, cfg);
> + }
> + ris->in_reset_state = reset;
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +}
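For illustration, an untested sketch of taking that lock, mirroring the snapshot
pattern mpam_reprogram_ris() already uses:

	u16 partid_max;

	spin_lock(&partid_max_lock);
	partid_max = mpam_partid_max;
	spin_unlock(&partid_max_lock);

	for (partid = 0; partid <= partid_max; partid++) {
		/* existing loop body unchanged */
	}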
> +
> static void _enable_percpu_irq(void *_irq)
> {
> int *irq = _irq;
> @@ -1025,7 +1094,7 @@ static int mpam_cpu_online(unsigned int cpu)
> _enable_percpu_irq(&msc->reenable_error_ppi);
>
> if (atomic_fetch_inc(&msc->online_refs) == 0)
> - mpam_reset_msc(msc, true);
> + mpam_reprogram_msc(msc);
> }
> srcu_read_unlock(&mpam_srcu, idx);
>
> @@ -1806,6 +1875,43 @@ static void mpam_unregister_irqs(void)
> cpus_read_unlock();
> }
>
> +static void __destroy_component_cfg(struct mpam_component *comp)
> +{
> + add_to_garbage(comp->cfg);
> +}
> +
> +static int __allocate_component_cfg(struct mpam_component *comp)
> +{
> + if (comp->cfg)
> + return 0;
> +
> + comp->cfg = kcalloc(mpam_partid_max + 1, sizeof(*comp->cfg), GFP_KERNEL);
And here?
> + if (!comp->cfg)
> + return -ENOMEM;
> + init_garbage(comp->cfg);
> +
> + return 0;
> +}
> +
> +static int mpam_allocate_config(void)
> +{
> + int err = 0;
> + struct mpam_class *class;
> + struct mpam_component *comp;
> +
> + lockdep_assert_held(&mpam_list_lock);
> +
> + list_for_each_entry(class, &mpam_classes, classes_list) {
> + list_for_each_entry(comp, &class->components, class_list) {
> + err = __allocate_component_cfg(comp);
> + if (err)
> + return err;
> + }
> + }
> +
> + return 0;
> +}
> +
> static void mpam_enable_once(void)
> {
> int err;
> @@ -1817,12 +1923,21 @@ static void mpam_enable_once(void)
> */
> cpus_read_lock();
> mutex_lock(&mpam_list_lock);
> - mpam_enable_merge_features(&mpam_classes);
> + do {
> + mpam_enable_merge_features(&mpam_classes);
>
> - err = mpam_register_irqs();
> - if (err)
> - pr_warn("Failed to register irqs: %d\n", err);
> + err = mpam_allocate_config();
> + if (err) {
> + pr_err("Failed to allocate configuration arrays.\n");
> + break;
> + }
>
> + err = mpam_register_irqs();
> + if (err) {
> + pr_warn("Failed to register irqs: %d\n", err);
> + break;
> + }
> + } while (0);
> mutex_unlock(&mpam_list_lock);
> cpus_read_unlock();
>
> @@ -1861,6 +1976,8 @@ static void mpam_reset_component_locked(struct mpam_component *comp)
> might_sleep();
> lockdep_assert_cpus_held();
>
> + memset(comp->cfg, 0, (mpam_partid_max * sizeof(*comp->cfg)));
And here?
> +
> idx = srcu_read_lock(&mpam_srcu);
> list_for_each_entry_rcu(vmsc, &comp->vmsc, comp_list) {
> msc = vmsc->msc;
> @@ -1963,6 +2080,79 @@ void mpam_enable(struct work_struct *work)
> mpam_enable_once();
> }
>
> +struct mpam_write_config_arg {
> + struct mpam_msc_ris *ris;
> + struct mpam_component *comp;
> + u16 partid;
> +};
> +
> +static int __write_config(void *arg)
> +{
> + struct mpam_write_config_arg *c = arg;
> +
> + mpam_reprogram_ris_partid(c->ris, c->partid, &c->comp->cfg[c->partid]);
> +
> + return 0;
> +}
> +
> +#define maybe_update_config(cfg, feature, newcfg, member, changes) do { \
> + if (mpam_has_feature(feature, newcfg) && \
> + (newcfg)->member != (cfg)->member) { \
> + (cfg)->member = (newcfg)->member; \
> + cfg->features |= (1 << feature); \
> + \
> + (changes) |= (1 << feature); \
> + } \
> +} while (0)
> +
> +static mpam_features_t mpam_update_config(struct mpam_config *cfg,
> + const struct mpam_config *newcfg)
> +{
> + mpam_features_t changes = 0;
> +
> + maybe_update_config(cfg, mpam_feat_cpor_part, newcfg, cpbm, changes);
> + maybe_update_config(cfg, mpam_feat_mbw_part, newcfg, mbw_pbm, changes);
> + maybe_update_config(cfg, mpam_feat_mbw_max, newcfg, mbw_max, changes);
> +
> + return changes;
> +}
> +
> +/* TODO: split into write_config/sync_config */
> +/* TODO: add config_dirty bitmap to drive sync_config */
> +int mpam_apply_config(struct mpam_component *comp, u16 partid,
> + struct mpam_config *cfg)
> +{
> + struct mpam_write_config_arg arg;
> + struct mpam_msc_ris *ris;
> + struct mpam_vmsc *vmsc;
> + struct mpam_msc *msc;
> + int idx;
> +
> + lockdep_assert_cpus_held();
> +
> + /* Don't pass in the current config! */
> + WARN_ON_ONCE(&comp->cfg[partid] == cfg);
> +
> + if (!mpam_update_config(&comp->cfg[partid], cfg))
> + return 0;
> +
> + arg.comp = comp;
> + arg.partid = partid;
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_rcu(vmsc, &comp->vmsc, comp_list) {
> + msc = vmsc->msc;
> +
> + list_for_each_entry_rcu(ris, &vmsc->ris, vmsc_list) {
> + arg.ris = ris;
> + mpam_touch_msc(msc, __write_config, &arg);
> + }
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +
> + return 0;
> +}
> +
> /*
> * MSC that are hidden under caches are not created as platform devices
> * as there is no cache driver. Caches are also special-cased in
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
> index 1a24424b48df..029ec89f56f2 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -190,11 +190,7 @@ struct mpam_props {
> u16 num_mbwu_mon;
> };
>
> -static inline bool mpam_has_feature(enum mpam_device_features feat,
> - struct mpam_props *props)
> -{
> - return (1 << feat) & props->features;
> -}
> +#define mpam_has_feature(_feat, x) ((1 << (_feat)) & (x)->features)
>
> static inline void mpam_set_feature(enum mpam_device_features feat,
> struct mpam_props *props)
> @@ -225,6 +221,17 @@ struct mpam_class {
> struct mpam_garbage garbage;
> };
>
> +struct mpam_config {
> + /* Which configuration values are valid. 0 is used for reset */
> + mpam_features_t features;
> +
> + u32 cpbm;
> + u32 mbw_pbm;
> + u16 mbw_max;
> +
> + struct mpam_garbage garbage;
> +};
> +
> struct mpam_component {
> u32 comp_id;
>
> @@ -233,6 +240,12 @@ struct mpam_component {
>
> cpumask_t affinity;
>
> + /*
> + * Array of configuration values, indexed by partid.
> + * Read from cpuhp callbacks, hold the cpuhp lock when writing.
> + */
> + struct mpam_config *cfg;
> +
> /* member of mpam_class:components */
> struct list_head class_list;
>
> @@ -297,6 +310,9 @@ extern u8 mpam_pmg_max;
> void mpam_enable(struct work_struct *work);
> void mpam_disable(struct work_struct *work);
>
> +int mpam_apply_config(struct mpam_component *comp, u16 partid,
> + struct mpam_config *cfg);
> +
> int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
> cpumask_t *affinity);
>
--
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 30/36] arm_mpam: Add mpam_msmon_read() to read monitor value
2025-07-11 18:36 ` [RFC PATCH 30/36] arm_mpam: Add mpam_msmon_read() to read monitor value James Morse
@ 2025-07-28 13:02 ` Ben Horgan
0 siblings, 0 replies; 117+ messages in thread
From: Ben Horgan @ 2025-07-28 13:02 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 7/11/25 19:36, James Morse wrote:
> Reaing a monitor involves configuring what you want to monitor, and
nit: s/Reaing/Reading/
> reading the value. Components made up of multiple MSC may need values
> from each MSC. MSCs may take time to configure, returning 'not ready'.
> The maximum 'not ready' time should have been provided by firmware.
>
> Add mpam_msmon_read() to hide all this. If (one of) the MSC returns
> not ready, then wait the full timeout value before trying again.
>
> CC: Shanker Donthineni <sdonthineni@nvidia.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 222 ++++++++++++++++++++
> drivers/platform/arm64/mpam/mpam_internal.h | 18 ++
> 2 files changed, 240 insertions(+)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> index b11503d8ef1b..7d2d2929b292 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -960,6 +960,228 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
> return 0;
> }
>
> +struct mon_read {
> + struct mpam_msc_ris *ris;
> + struct mon_cfg *ctx;
> + enum mpam_device_features type;
> + u64 *val;
> + int err;
> +};
> +
> +static void gen_msmon_ctl_flt_vals(struct mon_read *m, u32 *ctl_val,
> + u32 *flt_val)
> +{
> + struct mon_cfg *ctx = m->ctx;
> +
> + switch (m->type) {
> + case mpam_feat_msmon_csu:
> + *ctl_val = MSMON_CFG_MBWU_CTL_TYPE_CSU;
> + break;
> + case mpam_feat_msmon_mbwu:
> + *ctl_val = MSMON_CFG_MBWU_CTL_TYPE_MBWU;
> + break;
> + default:
> + return;
> + }
> +
> + /*
> + * For CSU counters its implementation-defined what happens when not
> + * filtering by partid.
> + */
> + *ctl_val |= MSMON_CFG_x_CTL_MATCH_PARTID;
> +
> + *flt_val = FIELD_PREP(MSMON_CFG_MBWU_FLT_PARTID, ctx->partid);
> + if (m->ctx->match_pmg) {
> + *ctl_val |= MSMON_CFG_x_CTL_MATCH_PMG;
> + *flt_val |= FIELD_PREP(MSMON_CFG_MBWU_FLT_PMG, ctx->pmg);
> + }
> +
> + if (mpam_has_feature(mpam_feat_msmon_mbwu_rwbw, &m->ris->props))
> + *flt_val |= FIELD_PREP(MSMON_CFG_MBWU_FLT_RWBW, ctx->opts);
> +}
> +
> +static void read_msmon_ctl_flt_vals(struct mon_read *m, u32 *ctl_val,
> + u32 *flt_val)
> +{
> + struct mpam_msc *msc = m->ris->vmsc->msc;
> +
> + switch (m->type) {
> + case mpam_feat_msmon_csu:
> + *ctl_val = mpam_read_monsel_reg(msc, CFG_CSU_CTL);
> + *flt_val = mpam_read_monsel_reg(msc, CFG_CSU_FLT);
> + break;
> + case mpam_feat_msmon_mbwu:
> + *ctl_val = mpam_read_monsel_reg(msc, CFG_MBWU_CTL);
> + *flt_val = mpam_read_monsel_reg(msc, CFG_MBWU_FLT);
> + break;
> + default:
> + return;
> + }
> +}
> +
> +/* Remove values set by the hardware to prevent apparent mismatches. */
> +static void clean_msmon_ctl_val(u32 *cur_ctl)
> +{
> + *cur_ctl &= ~MSMON_CFG_x_CTL_OFLOW_STATUS;
> +}
> +
> +static void write_msmon_ctl_flt_vals(struct mon_read *m, u32 ctl_val,
> + u32 flt_val)
> +{
> + struct mpam_msc *msc = m->ris->vmsc->msc;
> +
> + /*
> + * Write the ctl_val with the enable bit cleared, reset the counter,
> + * then enable counter.
> + */
> + switch (m->type) {
> + case mpam_feat_msmon_csu:
> + mpam_write_monsel_reg(msc, CFG_CSU_FLT, flt_val);
> + mpam_write_monsel_reg(msc, CFG_CSU_CTL, ctl_val);
> + mpam_write_monsel_reg(msc, CSU, 0);
> + mpam_write_monsel_reg(msc, CFG_CSU_CTL, ctl_val | MSMON_CFG_x_CTL_EN);
> + break;
> + case mpam_feat_msmon_mbwu:
> + mpam_write_monsel_reg(msc, CFG_MBWU_FLT, flt_val);
> + mpam_write_monsel_reg(msc, CFG_MBWU_CTL, ctl_val);
> + mpam_write_monsel_reg(msc, MBWU, 0);
> + mpam_write_monsel_reg(msc, CFG_MBWU_CTL, ctl_val | MSMON_CFG_x_CTL_EN);
> + break;
> + default:
> + return;
> + }
> +}
> +
> +/* Call with MSC lock held */
> +static void __ris_msmon_read(void *arg)
> +{
> + u64 now;
> + bool nrdy = false;
> + struct mon_read *m = arg;
> + struct mon_cfg *ctx = m->ctx;
> + struct mpam_msc_ris *ris = m->ris;
> + struct mpam_props *rprops = &ris->props;
> + struct mpam_msc *msc = m->ris->vmsc->msc;
> + u32 mon_sel, ctl_val, flt_val, cur_ctl, cur_flt;
> +
> + if (!mpam_mon_sel_inner_lock(msc)) {
> + m->err = -EIO;
> + return;
> + }
> + mon_sel = FIELD_PREP(MSMON_CFG_MON_SEL_MON_SEL, ctx->mon) |
> + FIELD_PREP(MSMON_CFG_MON_SEL_RIS, ris->ris_idx);
> + mpam_write_monsel_reg(msc, CFG_MON_SEL, mon_sel);
> +
> + /*
> + * Read the existing configuration to avoid re-writing the same values.
> + * This saves waiting for 'nrdy' on subsequent reads.
> + */
> + read_msmon_ctl_flt_vals(m, &cur_ctl, &cur_flt);
> + clean_msmon_ctl_val(&cur_ctl);
> + gen_msmon_ctl_flt_vals(m, &ctl_val, &flt_val);
> + if (cur_flt != flt_val || cur_ctl != (ctl_val | MSMON_CFG_x_CTL_EN))
> + write_msmon_ctl_flt_vals(m, ctl_val, flt_val);
> +
> + switch (m->type) {
> + case mpam_feat_msmon_csu:
> + now = mpam_read_monsel_reg(msc, CSU);
> + if (mpam_has_feature(mpam_feat_msmon_csu_hw_nrdy, rprops))
> + nrdy = now & MSMON___NRDY;
> + break;
> + case mpam_feat_msmon_mbwu:
> + now = mpam_read_monsel_reg(msc, MBWU);
> + if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops))
> + nrdy = now & MSMON___NRDY;
> + break;
> + default:
> + m->err = -EINVAL;
> + break;
> + }
> + mpam_mon_sel_inner_unlock(msc);
> +
> + if (nrdy) {
> + m->err = -EBUSY;
> + return;
> + }
> +
> + now = FIELD_GET(MSMON___VALUE, now);
> + *m->val += now;
> +}
> +
> +static int _msmon_read(struct mpam_component *comp, struct mon_read *arg)
> +{
> + int err, idx;
> + struct mpam_msc *msc;
> + struct mpam_vmsc *vmsc;
> + struct mpam_msc_ris *ris;
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_rcu(vmsc, &comp->vmsc, comp_list) {
> + msc = vmsc->msc;
> +
> + mpam_mon_sel_outer_lock(msc);
> + list_for_each_entry_rcu(ris, &vmsc->ris, vmsc_list) {
> + arg->ris = ris;
> +
> + err = smp_call_function_any(&msc->accessibility,
> + __ris_msmon_read, arg,
> + true);
> + if (!err && arg->err)
> + err = arg->err;
> + if (err)
> + break;
> + }
> + mpam_mon_sel_outer_unlock(msc);
> + if (err)
> + break;
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +
> + return err;
> +}
> +
> +int mpam_msmon_read(struct mpam_component *comp, struct mon_cfg *ctx,
> + enum mpam_device_features type, u64 *val)
> +{
> + int err;
> + struct mon_read arg;
> + u64 wait_jiffies = 0;
> + struct mpam_props *cprops = &comp->class->props;
> +
> + might_sleep();
> +
> + if (!mpam_is_enabled())
> + return -EIO;
> +
> + if (!mpam_has_feature(type, cprops))
> + return -EOPNOTSUPP;
> +
> + memset(&arg, 0, sizeof(arg));
> + arg.ctx = ctx;
> + arg.type = type;
> + arg.val = val;
> + *val = 0;
> +
> + err = _msmon_read(comp, &arg);
> + if (err == -EBUSY && comp->class->nrdy_usec)
> + wait_jiffies = usecs_to_jiffies(comp->class->nrdy_usec);
> +
> + while (wait_jiffies)
> + wait_jiffies = schedule_timeout_uninterruptible(wait_jiffies);
> +
> + if (err == -EBUSY) {
> + memset(&arg, 0, sizeof(arg));
> + arg.ctx = ctx;
> + arg.type = type;
> + arg.val = val;
> + *val = 0;
> +
> + err = _msmon_read(comp, &arg);
> + }
> +
> + return err;
> +}
> +
> static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd)
> {
> u32 num_words, msb;
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
> index aca91f7dfbf6..4aabef96fb7a 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -308,6 +308,21 @@ struct mpam_msc_ris {
> struct mpam_garbage garbage;
> };
>
> +/* The values for MSMON_CFG_MBWU_FLT.RWBW */
> +enum mon_filter_options {
> + COUNT_BOTH = 0,
> + COUNT_WRITE = 1,
> + COUNT_READ = 2,
> +};
> +
> +struct mon_cfg {
> + u16 mon;
> + u8 pmg;
> + bool match_pmg;
> + u32 partid;
> + enum mon_filter_options opts;
> +};
> +
> static inline int mpam_alloc_csu_mon(struct mpam_class *class)
> {
> struct mpam_props *cprops = &class->props;
> @@ -360,6 +375,9 @@ void mpam_disable(struct work_struct *work);
> int mpam_apply_config(struct mpam_component *comp, u16 partid,
> struct mpam_config *cfg);
>
> +int mpam_msmon_read(struct mpam_component *comp, struct mon_cfg *ctx,
> + enum mpam_device_features, u64 *val);
> +
> int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
> cpumask_t *affinity);
>
--
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 33/36] arm_mpam: Use long MBWU counters if supported
2025-07-11 18:36 ` [RFC PATCH 33/36] arm_mpam: Use long MBWU counters if supported James Morse
@ 2025-07-28 13:46 ` Ben Horgan
2025-08-08 7:19 ` James Morse
0 siblings, 1 reply; 117+ messages in thread
From: Ben Horgan @ 2025-07-28 13:46 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi James,
On 7/11/25 19:36, James Morse wrote:
> From: Rohit Mathew <rohit.mathew@arm.com>
>
> If the 44 bit (long) or 63 bit (LWD) counters are detected on probing
> the RIS, use long/LWD counter instead of the regular 31 bit mbwu
> counter.
>
> Only 32bit accesses to the MSC are required to be supported by the
> spec, but these registers are 64bits. The lower half may overflow
> into the higher half between two 32bit reads. To avoid this, use
> a helper that reads the top half twice to check for overflow.
Slightly misleading as it may be read up to 4 times.
>
> Signed-off-by: Rohit Mathew <rohit.mathew@arm.com>
> [morse: merged multiple patches from Rohit]
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 89 ++++++++++++++++++---
> drivers/platform/arm64/mpam/mpam_internal.h | 8 +-
> 2 files changed, 86 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> index 774137a124f8..ace69ac2d0ee 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -989,6 +989,48 @@ struct mon_read {
> int err;
> };
>
> +static bool mpam_ris_has_mbwu_long_counter(struct mpam_msc_ris *ris)
> +{
> + return (mpam_has_feature(mpam_feat_msmon_mbwu_63counter, &ris->props) ||
> + mpam_has_feature(mpam_feat_msmon_mbwu_44counter, &ris->props));
> +}
> +
> +static u64 mpam_msc_read_mbwu_l(struct mpam_msc *msc)
> +{
> + int retry = 3;
> + u32 mbwu_l_low;
> + u64 mbwu_l_high1, mbwu_l_high2;
> +
> + mpam_mon_sel_lock_held(msc);
> +
> + WARN_ON_ONCE((MSMON_MBWU_L + sizeof(u64)) > msc->mapped_hwpage_sz);
> + WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &msc->accessibility));
> +
> + mbwu_l_high2 = __mpam_read_reg(msc, MSMON_MBWU_L + 4);
> + do {
> + mbwu_l_high1 = mbwu_l_high2;
> + mbwu_l_low = __mpam_read_reg(msc, MSMON_MBWU_L);
> + mbwu_l_high2 = __mpam_read_reg(msc, MSMON_MBWU_L + 4);
> +
> + retry--;
> + } while (mbwu_l_high1 != mbwu_l_high2 && retry > 0);
> +
> + if (mbwu_l_high1 == mbwu_l_high2)
> + return (mbwu_l_high1 << 32) | mbwu_l_low;
> + return MSMON___NRDY_L;
> +}
> +
> +static void mpam_msc_zero_mbwu_l(struct mpam_msc *msc)
> +{
> + mpam_mon_sel_lock_held(msc);
> +
> + WARN_ON_ONCE((MSMON_MBWU_L + sizeof(u64)) > msc->mapped_hwpage_sz);
> + WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &msc->accessibility));
> +
> + __mpam_write_reg(msc, MSMON_MBWU_L, 0);
> + __mpam_write_reg(msc, MSMON_MBWU_L + 4, 0);
> +}
> +
> static void gen_msmon_ctl_flt_vals(struct mon_read *m, u32 *ctl_val,
> u32 *flt_val)
> {
> @@ -1045,6 +1087,7 @@ static void read_msmon_ctl_flt_vals(struct mon_read *m, u32 *ctl_val,
> static void clean_msmon_ctl_val(u32 *cur_ctl)
> {
> *cur_ctl &= ~MSMON_CFG_x_CTL_OFLOW_STATUS;
> + *cur_ctl &= ~MSMON_CFG_x_CTL_OFLOW_STATUS_L;
> }
>
> static void write_msmon_ctl_flt_vals(struct mon_read *m, u32 ctl_val,
> @@ -1067,7 +1110,11 @@ static void write_msmon_ctl_flt_vals(struct mon_read *m, u32 ctl_val,
> case mpam_feat_msmon_mbwu:
> mpam_write_monsel_reg(msc, CFG_MBWU_FLT, flt_val);
> mpam_write_monsel_reg(msc, CFG_MBWU_CTL, ctl_val);
> - mpam_write_monsel_reg(msc, MBWU, 0);
> + if (mpam_ris_has_mbwu_long_counter(m->ris))
> + mpam_msc_zero_mbwu_l(m->ris->vmsc->msc);
> + else
> + mpam_write_monsel_reg(msc, MBWU, 0);
> +
> mpam_write_monsel_reg(msc, CFG_MBWU_CTL, ctl_val | MSMON_CFG_x_CTL_EN);
>
> mbwu_state = &m->ris->mbwu_state[m->ctx->mon];
> @@ -1082,8 +1129,13 @@ static void write_msmon_ctl_flt_vals(struct mon_read *m, u32 ctl_val,
>
> static u64 mpam_msmon_overflow_val(struct mpam_msc_ris *ris)
> {
> - /* TODO: scaling, and long counters */
> - return GENMASK_ULL(30, 0);
> + /* TODO: implement scaling counters */
> + if (mpam_has_feature(mpam_feat_msmon_mbwu_63counter, &ris->props))
> + return GENMASK_ULL(62, 0);
> + else if (mpam_has_feature(mpam_feat_msmon_mbwu_44counter, &ris->props))
> + return GENMASK_ULL(43, 0);
> + else
> + return GENMASK_ULL(30, 0);
> }
>
> /* Call with MSC lock held */
> @@ -1125,10 +1177,24 @@ static void __ris_msmon_read(void *arg)
> now = FIELD_GET(MSMON___VALUE, now);
> break;
> case mpam_feat_msmon_mbwu:
> - now = mpam_read_monsel_reg(msc, MBWU);
> - if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops))
> - nrdy = now & MSMON___NRDY;
> - now = FIELD_GET(MSMON___VALUE, now);
> + /*
> + * If long or lwd counters are supported, use them, else revert
> + * to the 32 bit counter.
> + */
32 bit counter -> 31 bit counter
> + if (mpam_ris_has_mbwu_long_counter(ris)) {
> + now = mpam_msc_read_mbwu_l(msc);
> + if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops))
> + nrdy = now & MSMON___NRDY_L;
> + if (mpam_has_feature(mpam_feat_msmon_mbwu_63counter, rprops))
> + now = FIELD_GET(MSMON___LWD_VALUE, now);
> + else
> + now = FIELD_GET(MSMON___L_VALUE, now);
> + } else {
> + now = mpam_read_monsel_reg(msc, MBWU);
> + if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops))
> + nrdy = now & MSMON___NRDY;
> + now = FIELD_GET(MSMON___VALUE, now);
> + }
>
> if (nrdy)
> break;
> @@ -1421,8 +1487,13 @@ static int mpam_save_mbwu_state(void *arg)
> cur_ctl = mpam_read_monsel_reg(msc, CFG_MBWU_CTL);
> mpam_write_monsel_reg(msc, CFG_MBWU_CTL, 0);
>
> - val = mpam_read_monsel_reg(msc, MBWU);
> - mpam_write_monsel_reg(msc, MBWU, 0);
> + if (mpam_ris_has_mbwu_long_counter(ris)) {
> + val = mpam_msc_read_mbwu_l(msc);
> + mpam_msc_zero_mbwu_l(msc);
> + } else {
> + val = mpam_read_monsel_reg(msc, MBWU);
> + mpam_write_monsel_reg(msc, MBWU, 0);
> + }
>
> cfg->mon = i;
> cfg->pmg = FIELD_GET(MSMON_CFG_MBWU_FLT_PMG, cur_flt);
> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
> index fc705801c1b6..4553616f2f67 100644
> --- a/drivers/platform/arm64/mpam/mpam_internal.h
> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
> @@ -178,7 +178,6 @@ enum mpam_device_features {
> mpam_feat_msmon_csu,
> mpam_feat_msmon_csu_capture,
> mpam_feat_msmon_csu_hw_nrdy,
> -
> /*
> * Having mpam_feat_msmon_mbwu set doesn't mean the regular 31 bit MBWU
> * counter would be used. The exact counter used is decided based on the
> @@ -457,6 +456,8 @@ int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
> #define MSMON_CSU_CAPTURE 0x0848 /* last cache-usage value captured */
> #define MSMON_MBWU 0x0860 /* current mem-bw usage value */
> #define MSMON_MBWU_CAPTURE 0x0868 /* last mem-bw value captured */
> +#define MSMON_MBWU_L 0x0880 /* current long mem-bw usage value */
> +#define MSMON_MBWU_CAPTURE_L 0x0890 /* last long mem-bw value captured */
> #define MSMON_CAPT_EVNT 0x0808 /* signal a capture event */
> #define MPAMF_ESR 0x00F8 /* error status register */
> #define MPAMF_ECR 0x00F0 /* error control register */
> @@ -674,7 +675,10 @@ int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
> */
> #define MSMON___VALUE GENMASK(30, 0)
> #define MSMON___NRDY BIT(31)
> -#define MSMON_MBWU_L_VALUE GENMASK(62, 0)
> +#define MSMON___NRDY_L BIT(63)
> +#define MSMON___L_VALUE GENMASK(43, 0)
> +#define MSMON___LWD_VALUE GENMASK(62, 0)
> +
As mentioned on an earlier patch, these could be added with all the
other register definitions.
> /*
> * MSMON_CAPT_EVNT - Memory system performance monitoring capture event
> * generation register
--
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online
2025-07-28 11:59 ` Ben Horgan
@ 2025-07-28 15:34 ` Dave Martin
2025-08-08 7:16 ` James Morse
2025-08-08 7:14 ` James Morse
1 sibling, 1 reply; 117+ messages in thread
From: Dave Martin @ 2025-07-28 15:34 UTC (permalink / raw)
To: Ben Horgan
Cc: James Morse, linux-kernel, linux-arm-kernel, Rob Herring,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Koba Ko
Hi,
On Mon, Jul 28, 2025 at 12:59:12PM +0100, Ben Horgan wrote:
> Hi James,
>
> On 7/11/25 19:36, James Morse wrote:
> > When CPUs come online the original configuration should be restored.
> > Once the maximum partid is known, allocate an configuration array for
> > each component, and reprogram each RIS configuration from this.
> >
> > The MPAM spec describes how multiple controls can interact. To prevent
> > this happening by accident, always reset controls that don't have a
> > valid configuration. This allows the same helper to be used for
> > configuration and reset.
> >
> > CC: Dave Martin <Dave.Martin@arm.com>
> > Signed-off-by: James Morse <james.morse@arm.com>
> > ---
> > drivers/platform/arm64/mpam/mpam_devices.c | 236 ++++++++++++++++++--
> > drivers/platform/arm64/mpam/mpam_internal.h | 26 ++-
> > 2 files changed, 234 insertions(+), 28 deletions(-)
> >
> > diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> > index bb3695eb84e9..f3ecfda265d2 100644
> > --- a/drivers/platform/arm64/mpam/mpam_devices.c
> > +++ b/drivers/platform/arm64/mpam/mpam_devices.c
[...]
> > @@ -1000,10 +1041,38 @@ static void mpam_reset_msc(struct mpam_msc *msc, bool online)
[...]
> > +static void mpam_reprogram_msc(struct mpam_msc *msc)
> > +{
> > + int idx;
> > + u16 partid;
> > + bool reset;
> > + struct mpam_config *cfg;
> > + struct mpam_msc_ris *ris;
> > +
> > + idx = srcu_read_lock(&mpam_srcu);
> > + list_for_each_entry_rcu(ris, &msc->ris, msc_list) {
> > + if (!mpam_is_enabled() && !ris->in_reset_state) {
> > + mpam_touch_msc(msc, &mpam_reset_ris, ris);
> > + ris->in_reset_state = true;
> > + continue;
> > + }
> > +
> > + reset = true;
> > + for (partid = 0; partid <= mpam_partid_max; partid++) {
> Do we need to consider 'partid_max_lock' here?
Just throwing in my 2¢, since I'd dug into this a bit previously:
Here, we are resetting an MSC or re-onlining a CPU. Either way, I
think that this only happens after the initial probing phase is
complete.
mpam_enable_once() is ordered with respect to the task that did the
final unlock of partid_max_lock during probing, by means of the
schedule_work() call. (See <linux/workqueue.h>.)
Taking the hotplug lock and installing mpam_cpu_online() for CPU
hotplug probably brings a sufficient guarantee also (though I've not
dug into it).
This function doesn't seem to be called during the probing phase (via
mpam_discovery_cpu_online()), so there shouldn't be any racing updates
to the global variables here.
> > + cfg = &ris->vmsc->comp->cfg[partid];
> > + if (cfg->features)
> > + reset = false;
> > +
> > + mpam_reprogram_ris_partid(ris, partid, cfg);
> > + }
> > + ris->in_reset_state = reset;
> > + }
> > + srcu_read_unlock(&mpam_srcu, idx);
> > +}
[...]
> > @@ -1806,6 +1875,43 @@ static void mpam_unregister_irqs(void)
[...]
> > +static int __allocate_component_cfg(struct mpam_component *comp)
> > +{
> > + if (comp->cfg)
> > + return 0;
> > +
> > + comp->cfg = kcalloc(mpam_partid_max + 1, sizeof(*comp->cfg), GFP_KERNEL);
> And here?
Similarly, this runs only in the mpam_enable_once() call.
[...]
> > @@ -1861,6 +1976,8 @@ static void mpam_reset_component_locked(struct mpam_component *comp)
> > might_sleep();
> > lockdep_assert_cpus_held();
> > + memset(comp->cfg, 0, (mpam_partid_max * sizeof(*comp->cfg)));
> And here?
Similarly to mpam_reset_msc(), I think this probably only runs from
mpam_enable_once() or mpam_cpu_online().
I think most or all of the existing reads of the affected globals from
within mpam_resctrl.c are also callbacks from resctrl_init(), which
again executes during mpam_enable_once() (though I won't promise I
haven't missed one or two).
Once resctrl has fired up, I believe that the MPAM driver basically
trusts the IDs coming in from resctrl, and doesn't need to range-check
them against the global parameters again.
[...]
> Thanks,
>
> Ben
I consciously haven't done all the homework on this.
Although it may look like the globals are read all over the place after
probing, I think this actually only happens during resctrl initialisation
(which is basically single-threaded).
The only place where they are read after probing and without mediation
via resctrl is on the CPU hotplug path.
Adding locking would ensure that an unstable value is never read, but
this is not sufficient by itself to ensure that the _final_ value of a
variable is read (for some definition of "final"). And, if there is a
well-defined notion of final value and there is sufficient
synchronisation to ensure that this is the value read by a particular
read, then by construction an unstable value cannot be read.
I think that this kind of pattern is not that uncommon in the kernel,
though it is a bit painful to reason about.
Cheers
---Dave
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 20/36] arm_mpam: Probe the hardware features resctrl supports
2025-07-24 15:08 ` Ben Horgan
@ 2025-07-28 16:16 ` Jonathan Cameron
2025-08-07 18:26 ` James Morse
1 sibling, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2025-07-28 16:16 UTC (permalink / raw)
To: Ben Horgan
Cc: James Morse, linux-kernel, linux-arm-kernel, Rob Herring,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
> > }
> >
> > +/*
> > + * IHI009A.a has this nugget: "If a monitor does not support automatic behaviour
> > + * of NRDY, software can use this bit for any purpose" - so hardware might not
> > + * implement this - but it isn't RES0.
> > + *
> > + * Try and see what values stick in this bit. If we can write either value,
> > + * its probably not implemented by hardware.
> > + */
> > +#define mpam_ris_hw_probe_hw_nrdy(_ris, _mon_reg, _result) \
> > +do { \
> > + u32 now; \
> > + u64 mon_sel; \
> > + bool can_set, can_clear; \
> > + struct mpam_msc *_msc = _ris->vmsc->msc; \
> > + \
> > + if (WARN_ON_ONCE(!mpam_mon_sel_inner_lock(_msc))) { \
> > + _result = false; \
> > + break; \
> > + } \
> > + mon_sel = FIELD_PREP(MSMON_CFG_MON_SEL_MON_SEL, 0) | \
> > + FIELD_PREP(MSMON_CFG_MON_SEL_RIS, _ris->ris_idx); \
> > + mpam_write_monsel_reg(_msc, CFG_MON_SEL, mon_sel); \
> > + \
> > + mpam_write_monsel_reg(_msc, _mon_reg, MSMON___NRDY); \
> > + now = mpam_read_monsel_reg(_msc, _mon_reg); \
> > + can_set = now & MSMON___NRDY; \
> > + \
> > + mpam_write_monsel_reg(_msc, _mon_reg, 0); \
> > + now = mpam_read_monsel_reg(_msc, _mon_reg); \
> > + can_clear = !(now & MSMON___NRDY); \
> > + mpam_mon_sel_inner_unlock(_msc); \
> > + \
> > + _result = (!can_set || !can_clear); \
> > +} while (0)
> It is a bit surprising that something that looks like a function
> modifies a boolean passed by value. Consider continuing the pattern you
> have above:
> #define mpam_ris_hw_probe_hw_nrdy(_ris, _mon_reg, _result)
> _mpam_ris_hw_probe_hw_nrdy(_ris, MSMON##_mon_reg, _result)
>
> with signature:
> void _mpam_ris_hw_probe_hw_nrdy(struct mpam_msc *msc, u16 reg, bool
> *hw_managed);
>
> and using the _mpam functions from the new _mpam_ris_hw_probe_hw_nrdy().
>
Agreed that this is ugly. Only a tiny bit of macro stuff is actually going on here.
I'd make it a function.
If you really want to construct MSMON_CSU etc then wrap that helper with
a macro that builds reg from the name.
I might have missed something in converting this.
The version I have has some other changes though, so it's not trivial to post here :(
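To illustrate the shape only (not my actual version; the signature follows Ben's comment,
taking the RIS so MON_SEL can still be programmed, and the _mpam_*_monsel_reg() accessors
are assumed to exist by analogy with _mpam_read_partsel_reg() - treat the details as guesses):

static void _mpam_ris_hw_probe_hw_nrdy(struct mpam_msc_ris *ris, u16 mon_reg,
				       bool *hw_managed)
{
	u32 now;
	u64 mon_sel;
	bool can_set, can_clear;
	struct mpam_msc *msc = ris->vmsc->msc;

	if (WARN_ON_ONCE(!mpam_mon_sel_inner_lock(msc))) {
		*hw_managed = false;
		return;
	}

	mon_sel = FIELD_PREP(MSMON_CFG_MON_SEL_MON_SEL, 0) |
		  FIELD_PREP(MSMON_CFG_MON_SEL_RIS, ris->ris_idx);
	_mpam_write_monsel_reg(msc, MSMON_CFG_MON_SEL, mon_sel);

	/* See whether NRDY sticks when written as 1... */
	_mpam_write_monsel_reg(msc, mon_reg, MSMON___NRDY);
	now = _mpam_read_monsel_reg(msc, mon_reg);
	can_set = now & MSMON___NRDY;

	/* ...and whether it sticks when written as 0 */
	_mpam_write_monsel_reg(msc, mon_reg, 0);
	now = _mpam_read_monsel_reg(msc, mon_reg);
	can_clear = !(now & MSMON___NRDY);
	mpam_mon_sel_inner_unlock(msc);

	/* If software can't freely write both values, hardware owns NRDY */
	*hw_managed = !can_set || !can_clear;
}

/* Optional wrapper that builds the register offset from its name */
#define mpam_ris_hw_probe_hw_nrdy(_ris, _mon_reg, _result)	\
	_mpam_ris_hw_probe_hw_nrdy(_ris, MSMON_##_mon_reg, &(_result))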
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware
2025-07-11 18:36 ` [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware James Morse
2025-07-24 14:13 ` Ben Horgan
@ 2025-07-29 6:11 ` Baisheng Gao
2025-08-06 18:07 ` James Morse
2025-08-05 8:46 ` Jonathan Cameron
2 siblings, 1 reply; 117+ messages in thread
From: Baisheng Gao @ 2025-07-29 6:11 UTC (permalink / raw)
To: james.morse
Cc: amitsinght, baolin.wang, ben.horgan, bobo.shaobowang, carl,
dave.martin, david, dfustini, kobak, lcherian, lecopzerc,
linux-arm-kernel, linux-kernel, peternewman, quic_jiles, rex.nie,
robh, rohit.mathew, scott, sdonthineni, shameerali.kolothum.thodi,
tan.shaopeng, xhao, zengheng4, hao_hao.wang
Hi James,
> Because an MSC can only by accessed from the CPUs in its cpu-affinity
> set we need to be running on one of those CPUs to probe the MSC
> hardware.
>
> Do this work in the cpuhp callback. Probing the hardware will only
> happen before MPAM is enabled, walk all the MSCs and probe those we can
> reach that haven't already been probed.
>
> Later once MPAM is enabled, this cpuhp callback will be replaced by
> one that avoids the global list.
>
> Enabling a static key will also take the cpuhp lock, so can't be done
> from the cpuhp callback. Whenever a new MSC has been probed schedule
> work to test if all the MSCs have now been probed.
>
> CC: Lecopzer Chen <lecopzerc@nvidia.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 149 +++++++++++++++++++-
> drivers/platform/arm64/mpam/mpam_internal.h | 8 +-
> 2 files changed, 152 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> index 0d6d5180903b..89434ae3efa6 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -4,6 +4,7 @@
> #define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__
>
> #include <linux/acpi.h>
> +#include <linux/atomic.h>
> #include <linux/arm_mpam.h>
> #include <linux/cacheinfo.h>
> #include <linux/cpu.h>
> @@ -21,6 +22,7 @@
> #include <linux/slab.h>
> #include <linux/spinlock.h>
> #include <linux/types.h>
> +#include <linux/workqueue.h>
>
> #include <acpi/pcc.h>
>
> @@ -39,6 +41,16 @@ struct srcu_struct mpam_srcu;
> /* MPAM isn't available until all the MSC have been probed. */
> static u32 mpam_num_msc;
>
> +static int mpam_cpuhp_state;
> +static DEFINE_MUTEX(mpam_cpuhp_state_lock);
> +
> +/*
> + * mpam is enabled once all devices have been probed from CPU online callbacks,
> + * scheduled via this work_struct. If access to an MSC depends on a CPU that
> + * was not brought online at boot, this can happen surprisingly late.
> + */
> +static DECLARE_WORK(mpam_enable_work, &mpam_enable);
> +
> /*
> * An MSC is a physical container for controls and monitors, each identified by
> * their RIS index. These share a base-address, interrupts and some MMIO
> @@ -78,6 +90,22 @@ LIST_HEAD(mpam_classes);
> /* List of all objects that can be free()d after synchronise_srcu() */
> static LLIST_HEAD(mpam_garbage);
>
> +static u32 __mpam_read_reg(struct mpam_msc *msc, u16 reg)
> +{
> + WARN_ON_ONCE(reg > msc->mapped_hwpage_sz);
> + WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(), &msc->accessibility));
> +
> + return readl_relaxed(msc->mapped_hwpage + reg);
> +}
> +
> +static inline u32 _mpam_read_partsel_reg(struct mpam_msc *msc, u16 reg)
> +{
> + lockdep_assert_held_once(&msc->part_sel_lock);
> + return __mpam_read_reg(msc, reg);
> +}
> +
> +#define mpam_read_partsel_reg(msc, reg) _mpam_read_partsel_reg(msc, MPAMF_##reg)
> +
> #define init_garbage(x) init_llist_node(&(x)->garbage.llist)
>
> static struct mpam_vmsc *
> @@ -513,9 +541,84 @@ int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
> return err;
> }
>
> -static void mpam_discovery_complete(void)
> +static int mpam_msc_hw_probe(struct mpam_msc *msc)
> {
> - pr_err("Discovered all MSC\n");
> + u64 idr;
> + int err;
> +
> + lockdep_assert_held(&msc->probe_lock);
> +
> + mutex_lock(&msc->part_sel_lock);
> + idr = mpam_read_partsel_reg(msc, AIDR);
> + if ((idr & MPAMF_AIDR_ARCH_MAJOR_REV) != MPAM_ARCHITECTURE_V1) {
> + pr_err_once("%s does not match MPAM architecture v1.0\n",
> + dev_name(&msc->pdev->dev));
> + err = -EIO;
> + } else {
> + msc->probed = true;
> + err = 0;
> + }
> + mutex_unlock(&msc->part_sel_lock);
> +
> + return err;
> +}
> +
> +static int mpam_cpu_online(unsigned int cpu)
> +{
> + return 0;
> +}
> +
> +/* Before mpam is enabled, try to probe new MSC */
> +static int mpam_discovery_cpu_online(unsigned int cpu)
> +{
> + int err = 0;
> + struct mpam_msc *msc;
> + bool new_device_probed = false;
> +
> + mutex_lock(&mpam_list_lock);
> + list_for_each_entry(msc, &mpam_all_msc, glbl_list) {
> + if (!cpumask_test_cpu(cpu, &msc->accessibility))
> + continue;
> +
> + mutex_lock(&msc->probe_lock);
> + if (!msc->probed)
> + err = mpam_msc_hw_probe(msc);
> + mutex_unlock(&msc->probe_lock);
> +
> + if (!err)
> + new_device_probed = true;
> + else
> + break; // mpam_broken
> + }
> + mutex_unlock(&mpam_list_lock);
> +
> + if (new_device_probed && !err)
> + schedule_work(&mpam_enable_work);
> +
> + return err;
> +}
> +
> +static int mpam_cpu_offline(unsigned int cpu)
> +{
> + return 0;
> +}
> +
> +static void mpam_register_cpuhp_callbacks(int (*online)(unsigned int online),
> + int (*offline)(unsigned int offline))
> +{
> + mutex_lock(&mpam_cpuhp_state_lock);
> + if (mpam_cpuhp_state) {
> + cpuhp_remove_state(mpam_cpuhp_state);
> + mpam_cpuhp_state = 0;
> + }
> +
> + mpam_cpuhp_state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mpam:online",
> + online, offline);
> + if (mpam_cpuhp_state <= 0) {
> + pr_err("Failed to register cpuhp callbacks");
> + mpam_cpuhp_state = 0;
> + }
> + mutex_unlock(&mpam_cpuhp_state_lock);
> }
>
> static int mpam_dt_count_msc(void)
> @@ -774,7 +877,7 @@ static int mpam_msc_drv_probe(struct platform_device *pdev)
> }
>
> if (!err && fw_num_msc == mpam_num_msc)
> - mpam_discovery_complete();
> + mpam_register_cpuhp_callbacks(&mpam_discovery_cpu_online, NULL);
>
> if (err && msc)
> mpam_msc_drv_remove(pdev);
> @@ -797,6 +900,46 @@ static struct platform_driver mpam_msc_driver = {
> .remove = mpam_msc_drv_remove,
> };
>
> +static void mpam_enable_once(void)
> +{
> + mutex_lock(&mpam_cpuhp_state_lock);
> + cpuhp_remove_state(mpam_cpuhp_state);
> + mpam_cpuhp_state = 0;
> + mutex_unlock(&mpam_cpuhp_state_lock);
Could the above 4 lines be deleted? The mpam_cpuhp_state will already be
removed first in mpam_register_cpuhp_callbacks() if it isn't 0 (see the
sketch after the quoted function).
> +
> + mpam_register_cpuhp_callbacks(mpam_cpu_online, mpam_cpu_offline);
> +
> + pr_info("MPAM enabled\n");
> +}
> +
[snip]
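That is, the function would reduce to something like this (a sketch of the
suggestion only, based on the quoted hunk):

static void mpam_enable_once(void)
{
	mpam_register_cpuhp_callbacks(mpam_cpu_online, mpam_cpu_offline);

	pr_info("MPAM enabled\n");
}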
Regards,
Baisheng
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 00/36] arm_mpam: Add basic mpam driver
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
` (35 preceding siblings ...)
2025-07-11 18:36 ` [RFC PATCH 36/36] arm_mpam: Add kunit tests for props_mismatch() James Morse
@ 2025-08-01 16:09 ` Jonathan Cameron
2025-08-08 7:23 ` James Morse
36 siblings, 1 reply; 117+ messages in thread
From: Jonathan Cameron @ 2025-08-01 16:09 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
On Fri, 11 Jul 2025 18:36:12 +0000
James Morse <james.morse@arm.com> wrote:
> Hello,
>
> This is just enough MPAM driver for the ACPI and DT pre-requisites.
> It doesn't contain any of the resctrl code, meaning you can't actually drive it
> from user-space yet.
>
> This is the initial group of patches that allows the resctrl code to be built
> on top. Including that will increase the number of trees that may need to
> coordinate, so breaking it up make sense.
>
> The locking looks very strange - but is influenced by the 'mpam-fb' firmware
> interface specification that is still alpha. That thing needs to wait for an
> interrupt after every system register write, which significantly impacts the
> driver. Some features just won't work, e.g. reading the monitor registers via
> perf.
> The aim is to not have to make invasive changes to the locking to support the
> firmware interface, hence it looks strange from day-1.
>
> I've not found a platform that can test all the behaviours around the monitors,
> so this is where I'd expect the most bugs.
>
> It's unclear where in the tree this should be put. It affects memory bandwidth
> and cache allocation, but doesn't (yet) interact with perf. The main interaction
> is with resctrl in fs/resctrl - but there will be no filesystem code in here.
> Its also likely there will be other in-kernel users. (in-kernel MSC emulation by
> KVM being an obvious example).
> (I'm not a fan of drivers/resctrl or drivers/mpam - its not the sort of thing
> that justifies being a 'subsystem'.)
>
> For now, I've put this under drivers/platform/arm64. Other ideas welcome.
>
> The first three patches are currently a series on the list, the PPTT stuff
> has previously been posted - this is where the users of those helpers appear.
>
Hi James,
Whilst I get that this is minimal, I was a bit surprised that it doesn't
contain enough to have the driver actually bind to the platform devices.
I think that needs the CPU hotplug handler to register a requester, so
about another 4 arch patches from your tree. Maybe you can shuffle
things around to help with that.
That makes this a pain to test in isolation.
Given the desire to poke the corners, I'm rebasing the old QEMU emulation and
will poke it some more. Now that we are getting close to upstream kernel support,
maybe I'll even clean that up for potential upstream QEMU.
For bonus points I 'could' hook it up to the cache simulator and actually
generate real 'counts' but that's probably more for fun than because it's
useful. Fake numbers are a lot cheaper to get.
Jonathan
>
> The MPAM spec that describes all the system and MMIO registers can be found
> here:
> https://developer.arm.com/documentation/ddi0598/db/?lang=en
> (Ignored the 'RETIRED' warning - that is just arm moving the documentation
> around. This document has the best overview)
>
> This series is based on v6.16-rc4, and can be retrieved from:
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/driver/rfc
>
> The rest of the driver can be found here:
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/snapshot/v6.16-rc4
>
> What is MPAM? Set your time-machine to 2020:
> https://lore.kernel.org/lkml/20201030161120.227225-1-james.morse@arm.com/
>
>
> Bugs welcome,
> Thanks,
>
> James Morse (31):
> cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for
> cache-id
> arm64: cacheinfo: Provide helper to compress MPIDR value into u32
> cacheinfo: Expose the code to generate a cache-id from a device_node
> ACPI / PPTT: Add a helper to fill a cpumask from a processor container
> ACPI / PPTT: Stop acpi_count_levels() expecting callers to clear
> levels
> ACPI / PPTT: Find cache level by cache-id
> ACPI / PPTT: Add a helper to fill a cpumask from a cache_id
> arm64: kconfig: Add Kconfig entry for MPAM
> ACPI / MPAM: Parse the MPAM table
> platform: arm64: Move ec devices to an ec subdirectory
> arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate
> arm_mpam: Add the class and component structures for ris firmware
> described
> arm_mpam: Add MPAM MSC register layout definitions
> arm_mpam: Add cpuhp callbacks to probe MSC hardware
> arm_mpam: Probe MSCs to find the supported partid/pmg values
> arm_mpam: Add helpers for managing the locking around the mon_sel
> registers
> arm_mpam: Probe the hardware features resctrl supports
> arm_mpam: Merge supported features during mpam_enable() into
> mpam_class
> arm_mpam: Reset MSC controls from cpu hp callbacks
> arm_mpam: Add a helper to touch an MSC from any CPU
> arm_mpam: Extend reset logic to allow devices to be reset any time
> arm_mpam: Register and enable IRQs
> arm_mpam: Use a static key to indicate when mpam is enabled
> arm_mpam: Allow configuration to be applied and restored during cpu
> online
> arm_mpam: Probe and reset the rest of the features
> arm_mpam: Add helpers to allocate monitors
> arm_mpam: Add mpam_msmon_read() to read monitor value
> arm_mpam: Track bandwidth counter state for overflow and power
> management
> arm_mpam: Add helper to reset saved mbwu state
> arm_mpam: Add kunit test for bitmap reset
> arm_mpam: Add kunit tests for props_mismatch()
>
> Rob Herring (2):
> cacheinfo: Set cache 'id' based on DT data
> dt-bindings: arm: Add MPAM MSC binding
>
> Rohit Mathew (2):
> arm_mpam: Probe for long/lwd mbwu counters
> arm_mpam: Use long MBWU counters if supported
>
> Shanker Donthineni (1):
> arm_mpam: Add support for memory controller MSC on DT platforms
>
> .../devicetree/bindings/arm/arm,mpam-msc.yaml | 227 ++
> MAINTAINERS | 6 +-
> arch/arm64/Kconfig | 19 +
> arch/arm64/include/asm/cache.h | 17 +
> drivers/acpi/arm64/Kconfig | 3 +
> drivers/acpi/arm64/Makefile | 1 +
> drivers/acpi/arm64/mpam.c | 365 +++
> drivers/acpi/pptt.c | 240 +-
> drivers/acpi/tables.c | 2 +-
> drivers/base/cacheinfo.c | 57 +
> drivers/platform/arm64/Kconfig | 73 +-
> drivers/platform/arm64/Makefile | 10 +-
> drivers/platform/arm64/ec/Kconfig | 73 +
> drivers/platform/arm64/ec/Makefile | 10 +
> .../platform/arm64/{ => ec}/acer-aspire1-ec.c | 0
> .../arm64/{ => ec}/huawei-gaokun-ec.c | 0
> .../arm64/{ => ec}/lenovo-yoga-c630.c | 0
> drivers/platform/arm64/mpam/Kconfig | 23 +
> drivers/platform/arm64/mpam/Makefile | 4 +
> drivers/platform/arm64/mpam/mpam_devices.c | 2910 +++++++++++++++++
> drivers/platform/arm64/mpam/mpam_internal.h | 697 ++++
> .../platform/arm64/mpam/test_mpam_devices.c | 390 +++
> include/linux/acpi.h | 17 +
> include/linux/arm_mpam.h | 56 +
> include/linux/cacheinfo.h | 1 +
> 25 files changed, 5117 insertions(+), 84 deletions(-)
> create mode 100644 Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml
> create mode 100644 drivers/acpi/arm64/mpam.c
> create mode 100644 drivers/platform/arm64/ec/Kconfig
> create mode 100644 drivers/platform/arm64/ec/Makefile
> rename drivers/platform/arm64/{ => ec}/acer-aspire1-ec.c (100%)
> rename drivers/platform/arm64/{ => ec}/huawei-gaokun-ec.c (100%)
> rename drivers/platform/arm64/{ => ec}/lenovo-yoga-c630.c (100%)
> create mode 100644 drivers/platform/arm64/mpam/Kconfig
> create mode 100644 drivers/platform/arm64/mpam/Makefile
> create mode 100644 drivers/platform/arm64/mpam/mpam_devices.c
> create mode 100644 drivers/platform/arm64/mpam/mpam_internal.h
> create mode 100644 drivers/platform/arm64/mpam/test_mpam_devices.c
> create mode 100644 include/linux/arm_mpam.h
>
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online
2025-07-11 18:36 ` [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online James Morse
2025-07-16 6:49 ` Shaopeng Tan (Fujitsu)
2025-07-28 11:59 ` Ben Horgan
@ 2025-08-04 16:39 ` Fenghua Yu
2025-08-08 7:17 ` James Morse
2 siblings, 1 reply; 117+ messages in thread
From: Fenghua Yu @ 2025-08-04 16:39 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi, James,
On 7/11/25 11:36, James Morse wrote:
> When CPUs come online the original configuration should be restored.
> Once the maximum partid is known, allocate an configuration array for
> each component, and reprogram each RIS configuration from this.
>
> The MPAM spec describes how multiple controls can interact. To prevent
> this happening by accident, always reset controls that don't have a
> valid configuration. This allows the same helper to be used for
> configuration and reset.
>
> CC: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> drivers/platform/arm64/mpam/mpam_devices.c | 236 ++++++++++++++++++--
> drivers/platform/arm64/mpam/mpam_internal.h | 26 ++-
> 2 files changed, 234 insertions(+), 28 deletions(-)
>
> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
> index bb3695eb84e9..f3ecfda265d2 100644
> --- a/drivers/platform/arm64/mpam/mpam_devices.c
> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
> @@ -374,12 +374,16 @@ static void mpam_class_destroy(struct mpam_class *class)
> add_to_garbage(class);
> }
>
> +static void __destroy_component_cfg(struct mpam_component *comp);
> +
> static void mpam_comp_destroy(struct mpam_component *comp)
> {
> struct mpam_class *class = comp->class;
>
> lockdep_assert_held(&mpam_list_lock);
>
> + __destroy_component_cfg(comp);
> +
> list_del_rcu(&comp->class_list);
> add_to_garbage(comp);
>
> @@ -909,51 +913,90 @@ static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd)
> __mpam_write_reg(msc, reg, bm);
> }
>
> -static void mpam_reset_ris_partid(struct mpam_msc_ris *ris, u16 partid)
> +/* Called via IPI. Call while holding an SRCU reference */
> +static void mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
> + struct mpam_config *cfg)
> {
> u16 bwa_fract = MPAMCFG_MBW_MAX_MAX;
> struct mpam_msc *msc = ris->vmsc->msc;
> struct mpam_props *rprops = &ris->props;
>
> - mpam_assert_srcu_read_lock_held();
> -
> mutex_lock(&msc->part_sel_lock);
> __mpam_part_sel(ris->ris_idx, partid, msc);
>
> - if (mpam_has_feature(mpam_feat_cpor_part, rprops))
> - mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM, rprops->cpbm_wd);
> + if (mpam_has_feature(mpam_feat_cpor_part, rprops)) {
> + if (mpam_has_feature(mpam_feat_cpor_part, cfg))
> + mpam_write_partsel_reg(msc, CPBM, cfg->cpbm);
> + else
> + mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM,
> + rprops->cpbm_wd);
> + }
>
> - if (mpam_has_feature(mpam_feat_mbw_part, rprops))
> - mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM, rprops->mbw_pbm_bits);
> + if (mpam_has_feature(mpam_feat_mbw_part, rprops)) {
> + if (mpam_has_feature(mpam_feat_mbw_part, cfg))
> + mpam_write_partsel_reg(msc, MBW_PBM, cfg->mbw_pbm);
> + else
> + mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM,
> + rprops->mbw_pbm_bits);
> + }
>
> if (mpam_has_feature(mpam_feat_mbw_min, rprops))
> mpam_write_partsel_reg(msc, MBW_MIN, 0);
>
> - if (mpam_has_feature(mpam_feat_mbw_max, rprops))
> - mpam_write_partsel_reg(msc, MBW_MAX, bwa_fract);
> + if (mpam_has_feature(mpam_feat_mbw_max, rprops)) {
> + if (mpam_has_feature(mpam_feat_mbw_max, cfg))
> + mpam_write_partsel_reg(msc, MBW_MAX, cfg->mbw_max);
> + else
> + mpam_write_partsel_reg(msc, MBW_MAX, bwa_fract);
> + }
>
> if (mpam_has_feature(mpam_feat_mbw_prop, rprops))
> mpam_write_partsel_reg(msc, MBW_PROP, bwa_fract);
> mutex_unlock(&msc->part_sel_lock);
> }
>
> +struct reprogram_ris {
> + struct mpam_msc_ris *ris;
> + struct mpam_config *cfg;
> +};
> +
> +/* Call with MSC lock held */
> +static int mpam_reprogram_ris(void *_arg)
> +{
> + u16 partid, partid_max;
> + struct reprogram_ris *arg = _arg;
> + struct mpam_msc_ris *ris = arg->ris;
> + struct mpam_config *cfg = arg->cfg;
> +
> + if (ris->in_reset_state)
> + return 0;
> +
> + spin_lock(&partid_max_lock);
> + partid_max = mpam_partid_max;
partid_max is not used after the assignment.
> + spin_unlock(&partid_max_lock);
It doesn't make sense to lock-protect a local variable, partid_max, which is
not used anyway.
[SNIP]
Thanks.
-Fenghua
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 25/36] arm_mpam: Register and enable IRQs
2025-07-11 18:36 ` [RFC PATCH 25/36] arm_mpam: Register and enable IRQs James Morse
` (3 preceding siblings ...)
2025-07-28 10:49 ` Ben Horgan
@ 2025-08-04 16:53 ` Fenghua Yu
2025-08-08 7:12 ` James Morse
4 siblings, 1 reply; 117+ messages in thread
From: Fenghua Yu @ 2025-08-04 16:53 UTC (permalink / raw)
To: James Morse, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi, James,
On 7/11/25 11:36, James Morse wrote:
> Register and enable error IRQs. All the MPAM error interrupts indicate a
> software bug, e.g. out of range partid. If the error interrupt is ever
> signalled, attempt to disable MPAM.
>
> Only the irq handler accesses the ESR register, so no locking is needed.
> The work to disable MPAM after an error needs to happen at process
> context, use a threaded interrupt.
>
> There is no support for percpu threaded interrupts, for now schedule
> the work to be done from the irq handler.
>
> Enabling the IRQs in the MSC may involve cross calling to a CPU that
> can access the MSC.
>
> CC: Rohit Mathew <rohit.mathew@arm.com>
> Tested-by: Rohit Mathew <rohit.mathew@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
[SNIP]
> +static int mpam_register_irqs(void)
> +{
> + int err, irq, idx;
> + struct mpam_msc *msc;
> +
> + lockdep_assert_cpus_held();
> +
> + idx = srcu_read_lock(&mpam_srcu);
> + list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list, srcu_read_lock_held(&mpam_srcu)) {
> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
> + if (irq <= 0)
> + continue;
> +
> + /* The MPAM spec says the interrupt can be SPI, PPI or LPI */
> + /* We anticipate sharing the interrupt with other MSCs */
> + if (irq_is_percpu(irq)) {
> + err = request_percpu_irq(irq, &mpam_ppi_handler,
> + "mpam:msc:error",
> + msc->error_dev_id);
> + if (err)
> + return err;
But right now mpam_srcu is still read-locked. It needs to be unlocked before
returning (see the sketch below, after the quoted code).
> +
> + msc->reenable_error_ppi = irq;
> + smp_call_function_many(&msc->accessibility,
> + &_enable_percpu_irq, &irq,
> + true);
> + } else {
> + err = devm_request_threaded_irq(&msc->pdev->dev, irq,
> + &mpam_spi_handler,
> + &mpam_disable_thread,
> + IRQF_SHARED,
> + "mpam:msc:error", msc);
> + if (err)
> + return err;
Ditto.
> + }
> +
> + msc->error_irq_requested = true;
> + mpam_touch_msc(msc, mpam_enable_msc_ecr, msc);
> + msc->error_irq_hw_enabled = true;
> + }
> + srcu_read_unlock(&mpam_srcu, idx);
> +
> + return 0;
> +}
> +
[SNIP]
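To make the suggestion concrete, one possible shape for the fix (just a sketch
against the quoted code above, not a tested patch): initialise err and break
out of the walk on failure, so the srcu_read_unlock() at the end also runs on
the error path.

static int mpam_register_irqs(void)
{
	int err = 0, irq, idx;
	struct mpam_msc *msc;

	lockdep_assert_cpus_held();

	idx = srcu_read_lock(&mpam_srcu);
	list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list,
				 srcu_read_lock_held(&mpam_srcu)) {
		irq = platform_get_irq_byname_optional(msc->pdev, "error");
		if (irq <= 0)
			continue;

		/* The MPAM spec says the interrupt can be SPI, PPI or LPI */
		/* We anticipate sharing the interrupt with other MSCs */
		if (irq_is_percpu(irq)) {
			err = request_percpu_irq(irq, &mpam_ppi_handler,
						 "mpam:msc:error",
						 msc->error_dev_id);
			if (err)
				break;	/* drop the SRCU read lock below */

			msc->reenable_error_ppi = irq;
			smp_call_function_many(&msc->accessibility,
					       &_enable_percpu_irq, &irq,
					       true);
		} else {
			err = devm_request_threaded_irq(&msc->pdev->dev, irq,
							&mpam_spi_handler,
							&mpam_disable_thread,
							IRQF_SHARED,
							"mpam:msc:error", msc);
			if (err)
				break;	/* drop the SRCU read lock below */
		}

		msc->error_irq_requested = true;
		mpam_touch_msc(msc, mpam_enable_msc_ecr, msc);
		msc->error_irq_hw_enabled = true;
	}
	srcu_read_unlock(&mpam_srcu, idx);

	return err;
}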
Thanks.
-Fenghua
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware
2025-07-11 18:36 ` [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware James Morse
2025-07-24 14:13 ` Ben Horgan
2025-07-29 6:11 ` Baisheng Gao
@ 2025-08-05 8:46 ` Jonathan Cameron
2 siblings, 0 replies; 117+ messages in thread
From: Jonathan Cameron @ 2025-08-05 8:46 UTC (permalink / raw)
To: James Morse
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
On Fri, 11 Jul 2025 18:36:29 +0000
James Morse <james.morse@arm.com> wrote:
> Because an MSC can only by accessed from the CPUs in its cpu-affinity
> set we need to be running on one of those CPUs to probe the MSC
> hardware.
>
> Do this work in the cpuhp callback. Probing the hardware will only
> happen before MPAM is enabled, walk all the MSCs and probe those we can
> reach that haven't already been probed.
>
> Later once MPAM is enabled, this cpuhp callback will be replaced by
> one that avoids the global list.
>
> Enabling a static key will also take the cpuhp lock, so can't be done
> from the cpuhp callback. Whenever a new MSC has been probed schedule
> work to test if all the MSCs have now been probed.
>
> CC: Lecopzer Chen <lecopzerc@nvidia.com>
> Signed-off-by: James Morse <james.morse@arm.com>
Hi James,
One trivial thing noticed whilst testing..
> @@ -513,9 +541,84 @@ int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
> return err;
> }
>
> -static void mpam_discovery_complete(void)
> +static int mpam_msc_hw_probe(struct mpam_msc *msc)
> {
> - pr_err("Discovered all MSC\n");
> + u64 idr;
> + int err;
> +
> + lockdep_assert_held(&msc->probe_lock);
> +
> + mutex_lock(&msc->part_sel_lock);
> + idr = mpam_read_partsel_reg(msc, AIDR);
> + if ((idr & MPAMF_AIDR_ARCH_MAJOR_REV) != MPAM_ARCHITECTURE_V1) {
> + pr_err_once("%s does not match MPAM architecture v1.0\n",
You are only checking MAJOR REV, which is probably the right thing to do,
but in that case maybe change the message to say v1.x?
> + dev_name(&msc->pdev->dev));
> + err = -EIO;
> + } else {
> + msc->probed = true;
> + err = 0;
> + }
> + mutex_unlock(&msc->part_sel_lock);
> +
> + return err;
> +}
> +
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id
2025-07-14 11:42 ` Ben Horgan
@ 2025-08-05 17:06 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-05 17:06 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko,
Sudeep Holla
Hi Ben,
On 14/07/2025 12:42, Ben Horgan wrote:
> On 7/11/25 19:36, James Morse wrote:
>> The MPAM table identifies caches by id. The MPAM driver also wants to know
>> the cache level to determine if the platform is of the shape that can be
>> managed via resctrl. Cacheinfo has this information, but only for CPUs that
>> are online.
>>
>> Waiting for all CPUs to come online is a problem for platforms where
>> CPUs are brought online late by user-space.
>>
>> Add a helper that walks every possible cache, until it finds the one
>> identified by cache-id, then return the level.
>>
>> acpi_count_levels() expects its levels parameter to be initialised to
>> zero as it passes it to acpi_find_cache_level() as starting_level.
>> The existing callers do this. Document it.
> This paragraph is stale. You dealt with this in the previous commit.
Fixed, thanks.
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id
2025-07-16 16:21 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id Jonathan Cameron
@ 2025-08-05 17:06 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-05 17:06 UTC (permalink / raw)
To: Jonathan Cameron
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Sudeep Holla
Hi Jonathan,
On 16/07/2025 17:21, Jonathan Cameron wrote:
> On Fri, 11 Jul 2025 18:36:19 +0000
> James Morse <james.morse@arm.com> wrote:
>
>> The MPAM table identifies caches by id. The MPAM driver also wants to know
>> the cache level to determine if the platform is of the shape that can be
>> managed via resctrl. Cacheinfo has this information, but only for CPUs that
>> are online.
>>
>> Waiting for all CPUs to come online is a problem for platforms where
>> CPUs are brought online late by user-space.
>>
>> Add a helper that walks every possible cache, until it finds the one
>> identified by cache-id, then return the level.
>>
>> acpi_count_levels() expects its levels parameter to be initialised to
>> zero as it passes it to acpi_find_cache_level() as starting_level.
>> The existing callers do this. Document it.
> A few suggestions inline. Mostly driven by the number of missing table
> puts I've seen in ACPI code. You don't have any missing here but with a
> bit of restructuring you can make that easy to see.
Sounds good,
>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>> index 13ca2eee3b98..f53748a5df19 100644
>> --- a/drivers/acpi/pptt.c
>> +++ b/drivers/acpi/pptt.c
>> @@ -912,3 +912,76 @@ int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
>> return find_acpi_cpu_topology_tag(cpu, PPTT_ABORT_PACKAGE,
>> ACPI_PPTT_ACPI_IDENTICAL);
>> }
>> +
>> +/**
>> + * find_acpi_cache_level_from_id() - Get the level of the specified cache
>> + * @cache_id: The id field of the unified cache
>> + *
>> + * Determine the level relative to any CPU for the unified cache identified by
>> + * cache_id. This allows the property to be found even if the CPUs are offline.
>> + *
>> + * The returned level can be used to group unified caches that are peers.
>> + *
>> + * The PPTT table must be rev 3 or later,
>> + *
>> + * If one CPUs L2 is shared with another as L3, this function will return
>> + * an unpredictable value.
>> + *
>> + * Return: -ENOENT if the PPTT doesn't exist, or the cache cannot be found.
>> + * Otherwise returns a value which represents the level of the specified cache.
>> + */
>> +int find_acpi_cache_level_from_id(u32 cache_id)
>> +{
>> + u32 acpi_cpu_id;
>> + acpi_status status;
>> + int level, cpu, num_levels;
>> + struct acpi_pptt_cache *cache;
>> + struct acpi_table_header *table;
>> + struct acpi_pptt_cache_v1 *cache_v1;
>> + struct acpi_pptt_processor *cpu_node;
>> +
>> + status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
>> + if (ACPI_FAILURE(status)) {
>> + acpi_pptt_warn_missing();
>> + return -ENOENT;
>> + }
>> +
>> + if (table->revision < 3) {
> Maybe a unified exit path given all paths need to do
> acpi_put_table() and return either error or level.
>
> Or maybe it's time for some cleanup.h magic for acpi tables. I've
> been thinking about it for a while and mostly stuck on the name ;)
(Isn't that the hard bit?)
> (simpler suggestion follows)
>
> static struct acpi_table_header *acpi_get_table_ret(char *signature, u32 instance)
> {
> struct acpi_table_header *table;
> int status = acpi_get_table(signature, instance, &table);
>
> if (ACPI_FAILURE(status))
> return ERR_PTR(-ENOENT);
> return table;
> }
>
> DEFINE_FREE(acpi_table, struct acpi_table_header *, if (!IS_ERR(_T)) acpi_put_table(_T))
>
Finally, in here and in loads of other places, we avoid the chance of missing an acpi_put_table()
and generally simplify the code a little.
>
> int find_acpi_cache_level_from_id(u32 cache_id)
> {
> u32 acpi_cpu_id;
> acpi_status status;
> int level, cpu, num_levels;
> struct acpi_pptt_cache *cache;
> struct acpi_pptt_cache_v1 *cache_v1;
> struct acpi_pptt_processor *cpu_node;
>
>
> struct acpi_table_header *table __free(acpi_table) =
> acpi_get_table_ret(ACPI_SIG_PPTT, 0);
>
> if (IS_ERR(table))
> return PTR_ERR(table);
>
> if (table->revision < 3)
> return -ENOENT;
>
> /*
> * If we found the cache first, we'd still need to walk from each CPU
> * to find the level...
> */
> for_each_possible_cpu(cpu) {
> acpi_cpu_id = get_acpi_id_for_cpu(cpu);
> cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
> if (!cpu_node)
> return -ENOENT;
> acpi_count_levels(table, cpu_node, &num_levels, NULL);
>
> /* Start at 1 for L1 */
> for (level = 1; level <= num_levels; level++) {
> cache = acpi_find_cache_node(table, acpi_cpu_id,
> ACPI_PPTT_CACHE_TYPE_UNIFIED,
> level, &cpu_node);
> if (!cache)
> continue;
>
> cache_v1 = ACPI_ADD_PTR(struct acpi_pptt_cache_v1,
> cache,
> sizeof(struct acpi_pptt_cache));
>
> if (cache->flags & ACPI_PPTT_CACHE_ID_VALID &&
> cache_v1->cache_id == cache_id) {
> return level;
> }
> }
> }
> return -ENOENT;
> }
>
>
A less 'fun' alternative is to pull some code out as a helper so that the get and put
sit near each other, with no conditionals to confuse things.
I still find the cleanup stuff slightly sickening ... so let's use it some more.
Added to linux/acpi.h to make it easier to use elsewhere. I think the earlier patches in
this series are simple enough in this area that it's not worth changing them...
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 08/36] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id
2025-07-16 16:24 ` Jonathan Cameron
@ 2025-08-05 17:06 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-05 17:06 UTC (permalink / raw)
To: Jonathan Cameron
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko, Sudeep Holla
Hi Jonathan,
On 16/07/2025 17:24, Jonathan Cameron wrote:
> On Fri, 11 Jul 2025 18:36:20 +0000
> James Morse <james.morse@arm.com> wrote:
>
>> MPAM identifies CPUs by the cache_id in the PPTT cache structure.
>>
>> The driver needs to know which CPUs are associated with the cache,
>> the CPUs may not all be online, so cacheinfo does not have the
>> information.
>>
>> Add a helper to pull this information out of the PPTT.
>> diff --git a/drivers/acpi/pptt.c b/drivers/acpi/pptt.c
>> index f53748a5df19..81f7ac18c023 100644
>> --- a/drivers/acpi/pptt.c
>> +++ b/drivers/acpi/pptt.c
>> @@ -985,3 +985,73 @@ int find_acpi_cache_level_from_id(u32 cache_id)
>> +int acpi_pptt_get_cpumask_from_cache_id(u32 cache_id, cpumask_t *cpus)
>> +{
>> + u32 acpi_cpu_id;
>> + acpi_status status;
>> + int level, cpu, num_levels;
>> + struct acpi_pptt_cache *cache;
>> + struct acpi_table_header *table;
>> + struct acpi_pptt_cache_v1 *cache_v1;
>> + struct acpi_pptt_processor *cpu_node;
>> +
>> + cpumask_clear(cpus);
>> +
>> + status = acpi_get_table(ACPI_SIG_PPTT, 0, &table);
> Similar suggestions to previous patch apply here as well.
Done!
>
>> + if (ACPI_FAILURE(status)) {
>> + acpi_pptt_warn_missing();
>> + return -ENOENT;
>> + }
>> +
>> + if (table->revision < 3) {
>> + acpi_put_table(table);
>> + return -ENOENT;
>> + }
>> +
>> + /*
>> + * If we found the cache first, we'd still need to walk from each cpu.
>> + */
>> + for_each_possible_cpu(cpu) {
>> + acpi_cpu_id = get_acpi_id_for_cpu(cpu);
>> + cpu_node = acpi_find_processor_node(table, acpi_cpu_id);
>> + if (!cpu_node)
>> + break;
>> + acpi_count_levels(table, cpu_node, &num_levels, NULL);
>> +
>> + /* Start at 1 for L1 */
>> + for (level = 1; level <= num_levels; level++) {
>> + cache = acpi_find_cache_node(table, acpi_cpu_id,
>> + ACPI_PPTT_CACHE_TYPE_UNIFIED,
>> + level, &cpu_node);
>> + if (!cache)
>> + continue;
>> +
>> + cache_v1 = ACPI_ADD_PTR(struct acpi_pptt_cache_v1,
>> + cache,
>> + sizeof(struct acpi_pptt_cache));
>> +
>> + if (cache->flags & ACPI_PPTT_CACHE_ID_VALID &&
>> + cache_v1->cache_id == cache_id) {
>> + cpumask_set_cpu(cpu, cpus);
>> + }
> Unnecessary {}. Fine to keep them if you add something else here later.
The condition being broken over multiple lines de-rails my C parsing abilities... 'Fixed'.
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table
2025-07-16 17:07 ` Jonathan Cameron
2025-07-23 16:39 ` Ben Horgan
2025-07-28 10:08 ` Jonathan Cameron
@ 2025-08-05 17:07 ` James Morse
2 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-05 17:07 UTC (permalink / raw)
To: Jonathan Cameron
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
Hi Jonathan,
On 16/07/2025 18:07, Jonathan Cameron wrote:
> On Fri, 11 Jul 2025 18:36:22 +0000
> James Morse <james.morse@arm.com> wrote:
>
>> Add code to parse the arm64 specific MPAM table, looking up the cache
>> level from the PPTT and feeding the end result into the MPAM driver.
> Throw in a link to the spec perhaps? Particularly useful to know which
> version this was written against when reviewing it.
Will do. Ben has already pointed out it wasn't written against the latest version...
>> diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
>> index 05ecde9eaabe..27b872249baa 100644
>> --- a/drivers/acpi/arm64/Makefile
>> +++ b/drivers/acpi/arm64/Makefile
>> @@ -6,5 +6,6 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
>> obj-$(CONFIG_ACPI_IORT) += iort.o
>> obj-$(CONFIG_ACPI_PROCESSOR_IDLE) += cpuidle.o
>> obj-$(CONFIG_ARM_AMBA) += amba.o
>> +obj-$(CONFIG_ACPI_MPAM) += mpam.o
>
> Keep it with the ACPI ones?
Sure,
> There doesn't seem to be a lot of order in here
> though so I guess maybe there is logic behind putting it here I'm missing.
merge conflicts over many years always put it at the bottom of the file.
I at least kept the conditional ones together.
Moving it up lets the table 'drivers' appear together in alphabetical order.
>> diff --git a/drivers/acpi/arm64/mpam.c b/drivers/acpi/arm64/mpam.c
>> new file mode 100644
>> index 000000000000..f4791bac9a2a
>> --- /dev/null
>> +++ b/drivers/acpi/arm64/mpam.c
>> @@ -0,0 +1,365 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +// Copyright (C) 2025 Arm Ltd.
>> +
>> +/* Parse the MPAM ACPI table feeding the discovered nodes into the driver */
>> +
>> +#define pr_fmt(fmt) "ACPI MPAM: " fmt
>> +
>> +#include <linux/acpi.h>
>> +#include <linux/arm_mpam.h>
>> +#include <linux/cpu.h>
>> +#include <linux/cpumask.h>
>> +#include <linux/platform_device.h>
>> +
>> +#include <acpi/processor.h>
>> +
>> +/* Flags for acpi_table_mpam_msc.*_interrupt_flags */
> References.. I'm looking at 3.0-alpha table 5 to check this.
> I can see why you might be reluctant to point at an alpha if that
> is what you are using ;)
I did this against the released(?) version 2.0. (aka table revision 1).
I'll add references based on the v3 beta ... it looks like that defines the
mmio-size=0 behaviour and the pcc stuff. The mmio-size is harmless - we'd
need to handle that as an error anyway. I don't want to touch the pcc thing
until there is a real platform that needs it, and the spec is finished...
e.g.
| * See 2.1.1 Interrupt Flags, Table 5, of DEN0065B_MPAM_ACPI_3.0-bet.
>> +#define ACPI_MPAM_MSC_IRQ_MODE_EDGE 1
>> +#define ACPI_MPAM_MSC_IRQ_TYPE_MASK (3 << 1)
> GENMASK(3, 2) would be my preference for how to do masks in new code.
GENMASK(2, 1), but yes.
>> +#define ACPI_MPAM_MSC_IRQ_TYPE_WIRED 0
>> +#define ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER BIT(3)
>> +#define ACPI_MPAM_MSC_IRQ_AFFINITY_VALID BIT(4)
>> +
>> +static bool frob_irq(struct platform_device *pdev, int intid, u32 flags,
>> + int *irq, u32 processor_container_uid)
>> +{
>> + int sense;
>> +
>> + if (!intid)
>> + return false;
>> +
>> + /* 0 in this field indicates a wired interrupt */
>> + if (flags & ACPI_MPAM_MSC_IRQ_TYPE_MASK)
> I'd prefer more explicit code (and probably no comment)
>
> if (FIELD_GET(flags, ACPI_MPAM_MSC_IRQ_TYPE_MASK) !=
> ACPI_MPAM_MSC_IRQ_TYPE_WIRED)
> return false;
Sure,
>> + return false;
>> +
>> + if (flags & ACPI_MPAM_MSC_IRQ_MODE_EDGE)
>> + sense = ACPI_EDGE_SENSITIVE;
>> + else
>> + sense = ACPI_LEVEL_SENSITIVE;
>
> If the spec is supposed to be using standard ACPI_* types for this field
> (I don't think the connection is explicitly documented though) then
Sent as feedback on the spec. (I didn't realise those were standard!)
> sense = FIELD_GET(flags, ACPI_MPAM_MSC_IRQ_MODE_MASK);
> Assuming a change to define the mask and rely on the ACPI defs for the values
>
> This one is entirely up to you.
>> +
>> + /*
>> + * If the GSI is in the GIC's PPI range, try and create a partitioned
>> + * percpu interrupt.
>> + */
>> + if (16 <= intid && intid < 32 && processor_container_uid != ~0) {
>> + pr_err_once("Partitioned interrupts not supported\n");
>> + return false;
>> + }
>> +
>> + *irq = acpi_register_gsi(&pdev->dev, intid, sense, ACPI_ACTIVE_HIGH);
>> + if (*irq <= 0) {
>> + pr_err_once("Failed to register interrupt 0x%x with ACPI\n",
>> + intid);
>> + return false;
>> + }
>> +
>> + return true;
>> +}
>> +
>> +static void acpi_mpam_parse_irqs(struct platform_device *pdev,
>> + struct acpi_mpam_msc_node *tbl_msc,
>> + struct resource *res, int *res_idx)
>> +{
>> + u32 flags, aff = ~0;
>> + int irq;
>> +
>> + flags = tbl_msc->overflow_interrupt_flags;
>> + if (flags & ACPI_MPAM_MSC_IRQ_AFFINITY_VALID &&
>> + flags & ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER)
>> + aff = tbl_msc->overflow_interrupt_affinity;
> Just to make the two cases look the same I'd do
>
> else
> aff = ~0;
>
> here as well and not initialize above. It's not quite worth using
> a helper function for these two identical blocks but it's close.
>
>> + if (frob_irq(pdev, tbl_msc->overflow_interrupt, flags, &irq, aff)) {
>> + res[*res_idx].start = irq;
>> + res[*res_idx].end = irq;
>> + res[*res_idx].flags = IORESOURCE_IRQ;
>> + res[*res_idx].name = "overflow";
>
> res[*res_idx] = DEFINE_RES_IRQ_NAMED(irq, "overflow");
Handy, not seen that before.
>> +
>> + (*res_idx)++;
> Can roll this in as well.
>> + }
>> +
>> + flags = tbl_msc->error_interrupt_flags;
>> + if (flags & ACPI_MPAM_MSC_IRQ_AFFINITY_VALID &&
>> + flags & ACPI_MPAM_MSC_IRQ_AFFINITY_PROCESSOR_CONTAINER)
>> + aff = tbl_msc->error_interrupt_affinity;
>> + else
>> + aff = ~0;
>> + if (frob_irq(pdev, tbl_msc->error_interrupt, flags, &irq, aff)) {
>> + res[*res_idx].start = irq;
>> + res[*res_idx].end = irq;
>> + res[*res_idx].flags = IORESOURCE_IRQ;
>> + res[*res_idx].name = "error";
>
> Similar to above.
Yup,
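e.g. (a sketch with the index increment rolled in too):
| if (frob_irq(pdev, tbl_msc->error_interrupt, flags, &irq, aff))
| 	res[(*res_idx)++] = DEFINE_RES_IRQ_NAMED(irq, "error");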
>> +
>> + (*res_idx)++;
>> + }
>> +}
>> +
>
>
>> +static bool __init parse_msc_pm_link(struct acpi_mpam_msc_node *tbl_msc,
>> + struct platform_device *pdev,
>> + u32 *acpi_id)
>> +{
>> + bool acpi_id_valid = false;
>> + struct acpi_device *buddy;
>> + char hid[16], uid[16];
>> + int err;
>> +
>> + memset(&hid, 0, sizeof(hid));
>> + memcpy(hid, &tbl_msc->hardware_id_linked_device,
>> + sizeof(tbl_msc->hardware_id_linked_device));
>> +
>> + if (!strcmp(hid, ACPI_PROCESSOR_CONTAINER_HID)) {
>> + *acpi_id = tbl_msc->instance_id_linked_device;
>> + acpi_id_valid = true;
>> + }
>> +
>> + err = snprintf(uid, sizeof(uid), "%u",
>> + tbl_msc->instance_id_linked_device);
>> + if (err < 0 || err >= sizeof(uid))
> Does snprintf() ever return < 0 ? It's documented as returning
> number of chars printed (without the NULL) so that can only be 0 or
> greater.
That looks like paranoia around string parsing in C, and snprintf() returning an int.
I've removed the first half,
> Can it return >= sizeof(uid) ? Looks odd.
More paranoia, it should be impossible given the arguments, but the documentation has:
| If the return is greater than or equal to @size, the resulting string is truncated.
If the string is truncated, there is no reason to feed it into acpi_dev_get_first_match_dev().
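i.e. something like:
| err = snprintf(uid, sizeof(uid), "%u", tbl_msc->instance_id_linked_device);
| if (err >= sizeof(uid))
| 	return acpi_id_valid;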
> + return acpi_id_valid;
>> +
>> + buddy = acpi_dev_get_first_match_dev(hid, uid, -1);
>> + if (buddy)
>> + device_link_add(&pdev->dev, &buddy->dev, DL_FLAG_STATELESS);
>> +
>> + return acpi_id_valid;
>> +}
>
>> +static int __init _parse_table(struct acpi_table_header *table)
>> +{
>> + char *table_end, *table_offset = (char *)(table + 1);
>> + struct property_entry props[4]; /* needs a sentinel */
>> + struct acpi_mpam_msc_node *tbl_msc;
>> + int next_res, next_prop, err = 0;
>> + struct acpi_device *companion;
>> + struct platform_device *pdev;
>> + enum mpam_msc_iface iface;
>> + struct resource res[3];
>> + char uid[16];
>> + u32 acpi_id;
>> +
>> + table_end = (char *)table + table->length;
>> +
>> + while (table_offset < table_end) {
>> + tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
>> + table_offset += tbl_msc->length;
>> +
>> + /*
>> + * If any of the reserved fields are set, make no attempt to
>> + * parse the msc structure. This will prevent the driver from
>> + * probing all the MSC, meaning it can't discover the system
>> + * wide supported partid and pmg ranges. This avoids whatever
>> + * this MSC is truncating the partids and creating a screaming
>> + * error interrupt.
>> + */
>> + if (tbl_msc->reserved || tbl_msc->reserved1 || tbl_msc->reserved2)
>> + continue;
>> +
>> + if (decode_interface_type(tbl_msc, &iface))
>> + continue;
>> +
>> + next_res = 0;
>> + next_prop = 0;
>> + memset(res, 0, sizeof(res));
>> + memset(props, 0, sizeof(props));
>> +
>> + pdev = platform_device_alloc("mpam_msc", tbl_msc->identifier);
>> + if (IS_ERR(pdev)) {
> returns NULL in at least some error cases (probably all, I'm just to lazy to check)
So it does ... Fixed.
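i.e.:
| pdev = platform_device_alloc("mpam_msc", tbl_msc->identifier);
| if (!pdev) {
| 	err = -ENOMEM;
| 	break;
| }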
>> + err = PTR_ERR(pdev);
>> + break;
>> + }
>> +
>> + if (tbl_msc->length < sizeof(*tbl_msc)) {
>> + err = -EINVAL;
>> + break;
>> + }
>> +
>> + /* Some power management is described in the namespace: */
>> + err = snprintf(uid, sizeof(uid), "%u", tbl_msc->identifier);
>> + if (err > 0 && err < sizeof(uid)) {
>> + companion = acpi_dev_get_first_match_dev("ARMHAA5C", uid, -1);
>> + if (companion)
>> + ACPI_COMPANION_SET(&pdev->dev, companion);
>> + }
>> +
>> + if (iface == MPAM_IFACE_MMIO) {
>> + res[next_res].name = "MPAM:MSC";
>> + res[next_res].start = tbl_msc->base_address;
>> + res[next_res].end = tbl_msc->base_address + tbl_msc->mmio_size - 1;
>> + res[next_res].flags = IORESOURCE_MEM;
>> + next_res++;
> DEFINE_RES_MEM_NAMED()?
Done,
>> + } else if (iface == MPAM_IFACE_PCC) {
>> + props[next_prop++] = PROPERTY_ENTRY_U32("pcc-channel",
>> + tbl_msc->base_address);
>> + next_prop++;
>> + }
>> +
>> + acpi_mpam_parse_irqs(pdev, tbl_msc, res, &next_res);
>> + err = platform_device_add_resources(pdev, res, next_res);
>> + if (err)
>> + break;
>> +
>> + props[next_prop++] = PROPERTY_ENTRY_U32("arm,not-ready-us",
>> + tbl_msc->max_nrdy_usec);
>> +
>> + /*
>> + * The MSC's CPU affinity is described via its linked power
>> + * management device, but only if it points at a Processor or
>> + * Processor Container.
>> + */
>> + if (parse_msc_pm_link(tbl_msc, pdev, &acpi_id)) {
>> + props[next_prop++] = PROPERTY_ENTRY_U32("cpu_affinity",
>> + acpi_id);
>> + }
>> +
>> + err = device_create_managed_software_node(&pdev->dev, props,
>> + NULL);
>> + if (err)
>> + break;
>> +
>> + /* Come back later if you want the RIS too */
>> + err = platform_device_add_data(pdev, tbl_msc, tbl_msc->length);
>> + if (err)
>> + break;
>> +
>> + platform_device_add(pdev);
> Can fail.
Fixed,
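i.e.:
| err = platform_device_add(pdev);
| if (err)
| 	break;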
>> + }
>> +
>> + if (err)
>> + platform_device_put(pdev);
>> +
>> + return err;
>> +}
>> +static int _count_msc(struct acpi_table_header *table)
>> +{
>> + char *table_end, *table_offset = (char *)(table + 1);
>> + struct acpi_mpam_msc_node *tbl_msc;
>> + int ret = 0;
> Call it count as it only ever contains the count?
Sure,
>> +
>> + tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
>> + table_end = (char *)table + table->length;
>> +
>> + while (table_offset < table_end) {
>> + if (tbl_msc->length < sizeof(*tbl_msc))
>> + return -EINVAL;
>> +
>> + ret++;
>
> count++ would feel more natural here.
>
>> +
>> + table_offset += tbl_msc->length;
>> + tbl_msc = (struct acpi_mpam_msc_node *)table_offset;
>> + }
>> +
>> + return ret;
>> +}
> That's all I have time for today. Will get to the rest of the series soonish.
Thanks for taking a look!
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table
2025-07-23 16:39 ` Ben Horgan
@ 2025-08-05 17:07 ` James Morse
2025-08-15 9:33 ` Ben Horgan
0 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-08-05 17:07 UTC (permalink / raw)
To: Ben Horgan, Jonathan Cameron
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Rohit Mathew,
Shanker Donthineni, Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi, D Scott Phillips OS, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Koba Ko
Hi Ben,
On 23/07/2025 17:39, Ben Horgan wrote:
> On 7/16/25 18:07, Jonathan Cameron wrote:
>> On Fri, 11 Jul 2025 18:36:22 +0000
>> James Morse <james.morse@arm.com> wrote:
>>
>>> Add code to parse the arm64 specific MPAM table, looking up the cache
>>> level from the PPTT and feeding the end result into the MPAM driver.
>>
>> Throw in a link to the spec perhaps? Particularly useful to know which
>> version this was written against when reviewing it.
> As I comment below this code checks the table revision is 1 and so we can assume it was
> written against version 2 of the spec. As of Monday, there is a new version hot off the
> press,
> https://developer.arm.com/documentation/den0065/3-0bet/?lang=en which introduces an "MMIO
> size" field to allow for disabled nodes. This should be considered here to avoid
> advertising msc that aren't present.
Sure. Bit of an unfortunate race with the spec people there!
Added as:
--------------------%<--------------------
diff --git a/drivers/acpi/arm64/mpam.c b/drivers/acpi/arm64/mpam.c
index 9ff5a6df9f1b..d8c6224a76f8 100644
--- a/drivers/acpi/arm64/mpam.c
+++ b/drivers/acpi/arm64/mpam.c
@@ -202,6 +202,9 @@ static int __init _parse_table(struct acpi_table_header *table)
if (tbl_msc->reserved || tbl_msc->reserved1 || tbl_msc->reserved2)
continue;
+ if (!tbl_msc->mmio_size)
+ continue;
+
if (decode_interface_type(tbl_msc, &iface))
continue;
@@ -290,7 +293,7 @@ static struct acpi_table_header *get_table(void)
if (ACPI_FAILURE(status))
return NULL;
- if (table->revision != 1)
+ if (table->revision < 1)
return NULL;
return table;
@@ -321,6 +324,9 @@ static int _count_msc(struct acpi_table_header *table)
table_end = (char *)table + table->length;
while (table_offset < table_end) {
+ if (!tbl_msc->mmio_size)
+ continue;
+
if (tbl_msc->length < sizeof(*tbl_msc))
return -EINVAL;
--------------------%<--------------------
Amusingly, PCC also defines mmio_size==0 as disabled, so _count_msc() doesn't need to know
what kind of thing this is. In principle they could change this as it's beta, but a zero
sized MSC should probably be treated as an error anyway.
Thanks,
James
^ permalink raw reply related [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table
2025-07-28 10:08 ` Jonathan Cameron
@ 2025-08-05 17:08 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-05 17:08 UTC (permalink / raw)
To: Jonathan Cameron
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
Hi Jonathan,
On 28/07/2025 11:08, Jonathan Cameron wrote:
>>> +static struct acpi_table_header *get_table(void)
>>> +{
>>> + struct acpi_table_header *table;
>>> + acpi_status status;
>>> +
>>> + if (acpi_disabled || !system_supports_mpam())
>>> + return NULL;
>>> +
>>> + status = acpi_get_table(ACPI_SIG_MPAM, 0, &table);
>>> + if (ACPI_FAILURE(status))
>>> + return NULL;
>>> +
>>> + if (table->revision != 1)
>
> Missing an acpi_put_table()
Oops,
> I'm messing around with ACQUIRE() that is queued in the CXL tree
> for the coming merge window and noticed this.
> https://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl.git/log/?h=for-6.17/cleanup-acquire
(not more C++!)
> Interestingly this is a new corner case where we want conditional locking
> style handling but with return_ptr() style handling. Maybe too much of a niche
> to bother with infrastructure.
>
> Worth noting though that one layer up it is probably worth something like:
>
> DEFINE_FREE(acpi_table_mpam, struct acpi_table_header *, if (_T) acpi_put_table(_T));
>
> That enables nice clean code like:
>
>
> static int __init acpi_mpam_parse(void)
> {
> struct acpi_table_header *mpam __free(acpi_table_mpam) = get_table();
>
> if (!mpam)
> return 0;
>
> return _parse_table(mpam);
> }
I've got bits of that from your PPTT suggestions. I ended up folding the get_table()
helper in here.
count_msc() gets the same treatment and the cleanup thing lets _count_msc() be folded into it.
Thanks,
James
> This series was big enough that I'm spinning a single 'suggested changes'
> patch on top of it that includes stuff like this. Might take another day or so.
>
> Jonathan
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table
2025-07-24 10:50 ` Ben Horgan
@ 2025-08-05 17:08 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-05 17:08 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 24/07/2025 11:50, Ben Horgan wrote:
> On 11/07/2025 19:36, James Morse wrote:
>> Add code to parse the arm64 specific MPAM table, looking up the cache
>> level from the PPTT and feeding the end result into the MPAM driver.
>> diff --git a/drivers/acpi/arm64/mpam.c b/drivers/acpi/arm64/mpam.c
>> new file mode 100644
>> index 000000000000..f4791bac9a2a
>> --- /dev/null
>> +++ b/drivers/acpi/arm64/mpam.c
>> @@ -0,0 +1,365 @@
>> +static int acpi_mpam_parse_resource(struct mpam_msc *msc,
>> + struct acpi_mpam_resource_node *res)
>> +{
>> + int level, nid;
>> + u32 cache_id;
>> +
>> + switch (res->locator_type) {
>> + case ACPI_MPAM_LOCATION_TYPE_PROCESSOR_CACHE:
>> + cache_id = res->locator.cache_locator.cache_reference;
>> + level = find_acpi_cache_level_from_id(cache_id);
>> + if (level < 0) {
>> + pr_err_once("Bad level (%u) for cache with id %u\n", level, cache_id);
>> + return -EINVAL;
> Nit: More robust to check for level <= 0.
Sure, that can probably happen!
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 11/36] dt-bindings: arm: Add MPAM MSC binding
2025-07-11 21:43 ` Rob Herring
@ 2025-08-05 17:08 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-05 17:08 UTC (permalink / raw)
To: Rob Herring
Cc: linux-kernel, linux-arm-kernel, Ben Horgan, Rohit Mathew,
Shanker Donthineni, Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi, D Scott Phillips OS, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Koba Ko
Hi Rob,
On 11/07/2025 22:43, Rob Herring wrote:
> On Fri, Jul 11, 2025 at 06:36:23PM +0000, James Morse wrote:
>> From: Rob Herring <robh@kernel.org>
>>
>> The binding is designed around the assumption that an MSC will be a
>> sub-block of something else such as a memory controller, cache controller,
>> or IOMMU. However, it's certainly possible a design does not have that
>> association or has a mixture of both, so the binding illustrates how we can
>> support that with RIS child nodes.
>>
>> A key part of MPAM is we need to know about all of the MSCs in the system
>> before it can be enabled. This drives the need for the genericish
>> 'arm,mpam-msc' compatible. Though we can't assume an MSC is accessible
>> until a h/w specific driver potentially enables the h/w.
> Is there any DT based h/w using this? I'm not aware of any.
I'm told there is. I'll let them come out of the woodwork to confirm it ...
> I would
> prefer not merging this until there is. I have little insight whether
> these genericish compatibles will be sufficient, but I have lots of
> experience to say they won't be. I would also suspect that if anyone has
> started using this, they've just extended/modified it however they
> wanted and no feedback got to me.
Sure, what are you looking for here, Reviewed-by tags from someone with a hardware
platform that is going to ship with DT?
>> diff --git a/Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml b/Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml
>> new file mode 100644
>> index 000000000000..9d542ecb1a7d
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/arm/arm,mpam-msc.yaml
>> @@ -0,0 +1,227 @@
>> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/arm/arm,mpam-msc.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Arm Memory System Resource Partitioning and Monitoring (MPAM)
>> +
>> +description: |
>> + The Arm MPAM specification can be found here:
>> +
>> + https://developer.arm.com/documentation/ddi0598/latest
>> +
>> +maintainers:
>> + - Rob Herring <robh@kernel.org>
>> +
>> +properties:
>> + compatible:
>> + items:
>> + - const: arm,mpam-msc # Further details are discoverable
>> + - const: arm,mpam-memory-controller-msc
>> +
>> + reg:
>> + maxItems: 1
>> + description: A memory region containing registers as defined in the MPAM
>> + specification.
>> +
>> + interrupts:
>> + minItems: 1
>> + items:
>> + - description: error (optional)
>> + - description: overflow (optional, only for monitoring)
>> +
>> + interrupt-names:
>> + oneOf:
>> + - items:
>> + - enum: [ error, overflow ]
>> + - items:
>> + - const: error
>> + - const: overflow
>> +
>> + arm,not-ready-us:
>> + description: The maximum time in microseconds for monitoring data to be
>> + accurate after a settings change. For more information, see the
>> + Not-Ready (NRDY) bit description in the MPAM specification.
>> +
>> + numa-node-id: true # see NUMA binding
>> +
>> + '#address-cells':
>> + const: 1
>> +
>> + '#size-cells':
>> + const: 0
>> +
>> +patternProperties:
>> + '^ris@[0-9a-f]$':
>> + type: object
>> + additionalProperties: false
>> + description: |
>
> '|' can be dropped.
>
>> + RIS nodes for each RIS in an MSC. These nodes are required for each RIS
>> + implementing known MPAM controls
>> +
>> + properties:
>> + compatible:
>> + enum:
>> + # Bulk storage for cache
>> + - arm,mpam-cache
>> + # Memory bandwidth
>> + - arm,mpam-memory
>> +
>> + reg:
>> + minimum: 0
>> + maximum: 0xf
>> +
>> + cpus:
>> + $ref: '/schemas/types.yaml#/definitions/phandle-array'
> Don't need the type. It's in the core schemas now.
(I assume the type is that '$ref' line)
>> + description:
>> + Phandle(s) to the CPU node(s) this RIS belongs to. By default, the parent
>> + device's affinity is used.
>> +
>> + arm,mpam-device:
>> + $ref: '/schemas/types.yaml#/definitions/phandle'
>
> Don't need quotes. This should be a warning, but no testing happened
> because the DT list and maintainers weren't CCed.
Yup, it's an RFC - I only CC'd the folk who have previously expressed an interest for the
first pass. (git send-email put you on CC!)
>> + description:
>> + By default, the MPAM enabled device associated with a RIS is the MSC's
>> + parent node. It is possible for each RIS to be associated with different
>> + devices in which case 'arm,mpam-device' should be used.
>> +
>> + required:
>> + - compatible
>> + - reg
>> +
>> +required:
>> + - compatible
>> + - reg
>> +
>> +dependencies:
>> + interrupts: [ interrupt-names ]
>> +
>> +additionalProperties: false
>> +
>> +examples:
>> + - |
>> + /*
>> + cpus {
>> + cpu@0 {
>> + next-level-cache = <&L2_0>;
>> + };
>> + cpu@100 {
>> + next-level-cache = <&L2_1>;
>> + };
>> + };
>> + */
>> + L2_0: cache-controller-0 {
>> + compatible = "cache";
>> + cache-level = <2>;
>> + cache-unified;
>> + next-level-cache = <&L3>;
>> +
>> + };
>> +
>> + L2_1: cache-controller-1 {
>> + compatible = "cache";
>> + cache-level = <2>;
>> + cache-unified;
>> + next-level-cache = <&L3>;
>> +
>> + };
>
> All the above should be dropped. Not part of this binding.
>
>> +
>> + L3: cache-controller@30000000 {
>> + compatible = "arm,dsu-l3-cache", "cache";
>
> Pretty sure this is a warning because that compatible doesn't exist.
Not sure what to do with this. I see plenty of DT in the tree with 'cache', and you've got
'foo,a-memory-controller' below ...
>> + cache-level = <3>;
>> + cache-unified;
>> +
>> + ranges = <0x0 0x30000000 0x800000>;
>> + #address-cells = <1>;
>> + #size-cells = <1>;
>> +
>> + msc@10000 {
>> + compatible = "arm,mpam-msc";
>> +
>> + /* CPU affinity implied by parent cache node's */
>> + reg = <0x10000 0x2000>;
>> + interrupts = <1>, <2>;
>> + interrupt-names = "error", "overflow";
>> + arm,not-ready-us = <1>;
>> + };
>> + };
>> +
>> + mem: memory-controller@20000 {
>> + compatible = "foo,a-memory-controller";
>> + reg = <0x20000 0x1000>;
>> +
>> + #address-cells = <1>;
>> + #size-cells = <1>;
>> + ranges;
>> +
>> + msc@21000 {
>> + compatible = "arm,mpam-memory-controller-msc", "arm,mpam-msc";
>> + reg = <0x21000 0x1000>;
>> + interrupts = <3>;
>> + interrupt-names = "error";
>> + arm,not-ready-us = <1>;
>> + numa-node-id = <1>;
>> + };
>> + };
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 12/36] platform: arm64: Move ec devices to an ec subdirectory
2025-07-21 16:32 ` Jonathan Cameron
@ 2025-08-06 18:03 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-06 18:03 UTC (permalink / raw)
To: Jonathan Cameron
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
Hi Jonathan,
On 21/07/2025 17:32, Jonathan Cameron wrote:
> On Fri, 11 Jul 2025 18:36:24 +0000
> James Morse <james.morse@arm.com> wrote:
>
>> commit 363c8aea257 "platform: Add ARM64 platform directory" added a
>> subdirectory for arm64 platform devices, but claims that all such
>> devices must be 'EC like'.
>>
>> The arm64 MPAM driver manages an MMIO interface that appears in memory
>> controllers, caches, IOMMU and connection points on the interconnect.
>> It doesn't fit into any existing subsystem.
>>
>> It would be convenient to use this subdirectory for drivers for other
>> arm64 platform devices which aren't closely coupled to the architecture
>> code and don't fit into any existing subsystem.
>>
>> Move the existing code and maintainer entries to be under
>> drivers/platform/arm64/ec. The MPAM driver will be added under
>> drivers/platform/arm64/mpam.
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 4bac4ea21b64..bea01d413666 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -3549,15 +3549,15 @@ S: Maintained
>> F: arch/arm64/boot/Makefile
>> F: scripts/make_fit.py
>>
>> -ARM64 PLATFORM DRIVERS
>> -M: Hans de Goede <hansg@kernel.org>
>> +ARM64 EC PLATFORM DRIVERS
>> +M: Hans de Goede <hdegoede@redhat.com>
>
> Smells like a rebase error as Hans' email address chagned
> to the kernel.org one in the 6.16 cycle.
Bother - yes, that's exactly what happened.
>> M: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>> R: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
>> L: platform-driver-x86@vger.kernel.org
>> S: Maintained
>> Q: https://patchwork.kernel.org/project/platform-driver-x86/list/
>> T: git git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git
>> -F: drivers/platform/arm64/
>> +F: drivers/platform/arm64/ec
> Other than that looks sensible to me but obviously needs tags from Hans or Ilpo.
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 12/36] platform: arm64: Move ec devices to an ec subdirectory
2025-07-24 10:56 ` Ben Horgan
@ 2025-08-06 18:03 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-06 18:03 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 24/07/2025 11:56, Ben Horgan wrote:
> On 11/07/2025 19:36, James Morse wrote:
>> commit 363c8aea257 "platform: Add ARM64 platform directory" added a
>> subdirectory for arm64 platform devices, but claims that all such
>> devices must be 'EC like'.
>>
>> The arm64 MPAM driver manages an MMIO interface that appears in memory
>> controllers, caches, IOMMU and connection points on the interconnect.
>> It doesn't fit into any existing subsystem.
>>
>> It would be convenient to use this subdirectory for drivers for other
>> arm64 platform devices which aren't closely coupled to the architecture
>> code and don't fit into any existing subsystem.
>>
>> Move the existing code and maintainer entries to be under
>> drivers/platform/arm64/ec. The MPAM driver will be added under
>> drivers/platform/arm64/mpam.
>> diff --git a/drivers/platform/arm64/ec/Kconfig b/drivers/platform/arm64/ec/Kconfig
>> new file mode 100644
>> index 000000000000..06288aebc559
>> --- /dev/null
>> +++ b/drivers/platform/arm64/ec/Kconfig
>> @@ -0,0 +1,73 @@
>> +# SPDX-License-Identifier: GPL-2.0-only
>> +#
>> +# EC-like Drivers for aarch64 based devices.
>> +#
>> +
>> +menuconfig ARM64_PLATFORM_DEVICES
>> + bool "ARM64 Platform-Specific Device Drivers"
>> + depends on ARM64 || COMPILE_TEST
>> + default ARM64
>> + help
>> + Say Y here to get to see options for platform-specific device drivers
>> + for arm64 based devices, primarily EC-like device drivers.
>> + This option alone does not add any kernel code.
>> +
>> + If you say N, all options in this submenu will be skipped and disabled.
>> +
>> +if ARM64_PLATFORM_DEVICES
> Shouldn't this be kept in the directory above? By the description this would be expected
> to apply to all drivers in drivers/platfrom/arm64.
Doing that makes any MPAM options appear under 'Platform-Specific Device Drivers' too. I
didn't do that as MPAM isn't specific to one platform, but doesn't fit under any of the
bus or high level groups under drivers. (I briefly toyed with drivers/perf - as there
isn't a drivers/qos).
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 13/36] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate
2025-07-24 11:02 ` Ben Horgan
@ 2025-08-06 18:03 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-06 18:03 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 24/07/2025 12:02, Ben Horgan wrote:
> On 11/07/2025 19:36, James Morse wrote:
>> Probing MPAM is convoluted. MSCs that are integrated with a CPU may
>> only be accessible from those CPUs, and they may not be online.
>> Touching the hardware early is pointless as MPAM can't be used until
>> the system-wide common values for num_partid and num_pmg have been
>> discovered.
>>
>> Start with driver probe/remove and mapping the MSC.
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
>> new file mode 100644
>> index 000000000000..5b886ba54ba8
>> --- /dev/null
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
>> @@ -0,0 +1,336 @@
>> +static int mpam_dt_count_msc(void)
>> +{
>> + int count = 0;
>> + struct device_node *np;
>> +
>> + for_each_compatible_node(np, NULL, "arm,mpam-msc")
> This will count even 'status = "disabled"' nodes. Add a check for that.
>
> 	if (of_device_is_available(np))
>> +		count++;
Good spot, fixed - thanks.
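i.e.:
| for_each_compatible_node(np, NULL, "arm,mpam-msc") {
| 	if (of_device_is_available(np))
| 		count++;
| }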
Thanks,
James
>> +
>> + return count;
>> +}
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 13/36] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate
2025-07-24 12:09 ` Catalin Marinas
@ 2025-08-06 18:04 ` James Morse
2025-08-07 17:50 ` Drew Fustini
0 siblings, 1 reply; 117+ messages in thread
From: James Morse @ 2025-08-06 18:04 UTC (permalink / raw)
To: Catalin Marinas
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
Hi Catalin,
On 24/07/2025 13:09, Catalin Marinas wrote:
> On Fri, Jul 11, 2025 at 06:36:25PM +0000, James Morse wrote:
>> Probing MPAM is convoluted. MSCs that are integrated with a CPU may
>> only be accessible from those CPUs, and they may not be online.
>> Touching the hardware early is pointless as MPAM can't be used until
>> the system-wide common values for num_partid and num_pmg have been
>> discovered.
>>
>> Start with driver probe/remove and mapping the MSC.
>> arch/arm64/Kconfig | 1 +
>> drivers/platform/arm64/Kconfig | 1 +
>> drivers/platform/arm64/Makefile | 1 +
>> drivers/platform/arm64/mpam/Kconfig | 10 +
>> drivers/platform/arm64/mpam/Makefile | 4 +
>> drivers/platform/arm64/mpam/mpam_devices.c | 336 ++++++++++++++++++++
>> drivers/platform/arm64/mpam/mpam_internal.h | 62 ++++
>> 7 files changed, 415 insertions(+)
>> create mode 100644 drivers/platform/arm64/mpam/Kconfig
>> create mode 100644 drivers/platform/arm64/mpam/Makefile
>> create mode 100644 drivers/platform/arm64/mpam/mpam_devices.c
>> create mode 100644 drivers/platform/arm64/mpam/mpam_internal.h
> Bikeshedding: why not drivers/resctrl to match fs/resctrl? We wouldn't
> need the previous patch either to move the arm64 platform drivers.
Initially because I don't see any other architecture having an MMIO interface to this
stuff, and didn't want a 'top level' driver directory for a single driver. But, re-reading
RISC-V's CBQRI[0], it turns out that theirs is memory mapped...
> I'm not an expert on resctrl but the MPAM code looks more like a backend
> for the resctrl support, so it makes more sense to do as we did for
> other drivers like irqchip, iommu.
Only because there are many irqchip or iommu. I'm not a fan of drivers/mpam, but
drivers/resctrl would suit RISC-V too. (I'll check with Drew)
Thanks,
James
[0] https://patchew.org/linux/20230419111111.477118-1-dfustini@baylibre.com/
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 16/36] arm_mpam: Add MPAM MSC register layout definitions
2025-07-17 1:04 ` Shaopeng Tan (Fujitsu)
@ 2025-08-06 18:04 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-06 18:04 UTC (permalink / raw)
To: Shaopeng Tan (Fujitsu), linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko
Hi Shaopeng,
On 17/07/2025 02:04, Shaopeng Tan (Fujitsu) wrote:
>> Memory Partitioning and Monitoring (MPAM) has memory mapped devices
>> (MSCs) with an identity/configuration page.
>>
>> Add the definitions for these registers as offset within the page(s).
>> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
>> index d49bb884b433..9110c171d9d2 100644
>> --- a/drivers/platform/arm64/mpam/mpam_internal.h
>> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
>> @@ -150,4 +150,272 @@ extern struct list_head mpam_classes;
>> +/*
>> + * MSMON_CFG_CSU_CTL - Memory system performance monitor configure cache storage
>> + * usage monitor control register
>> + * MSMON_CFG_MBWU_CTL - Memory system performance monitor configure memory
>> + * bandwidth usage monitor control register
>> + */
>> +#define MSMON_CFG_x_CTL_TYPE GENMASK(7, 0)
>> +#define MSMON_CFG_x_CTL_OFLOW_STATUS_L BIT(15)
>> +#define MSMON_CFG_x_CTL_MATCH_PARTID BIT(16)
>> +#define MSMON_CFG_x_CTL_MATCH_PMG BIT(17)
>> +#define MSMON_CFG_x_CTL_SCLEN BIT(19)
>> +#define MSMON_CFG_x_CTL_SUBTYPE GENMASK(23, 20)
>> +#define MSMON_CFG_x_CTL_OFLOW_FRZ BIT(24)
>> +#define MSMON_CFG_x_CTL_OFLOW_INTR BIT(25)
>> +#define MSMON_CFG_x_CTL_OFLOW_STATUS BIT(26)
>> +#define MSMON_CFG_x_CTL_CAPT_RESET BIT(27)
>> +#define MSMON_CFG_x_CTL_CAPT_EVNT GENMASK(30, 28)
>> +#define MSMON_CFG_x_CTL_EN BIT(31)
>> +
>> +#define MSMON_CFG_MBWU_CTL_TYPE_MBWU 0x42
>> +#define MSMON_CFG_MBWU_CTL_TYPE_CSU 0x43
> MSMON_CFG_CSU_CTL_TYPE_CSU?
Yup, copy-and-paste error. Thanks!
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 16/36] arm_mpam: Add MPAM MSC register layout definitions
2025-07-24 14:02 ` Ben Horgan
@ 2025-08-06 18:05 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-06 18:05 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 24/07/2025 15:02, Ben Horgan wrote:
> On 11/07/2025 19:36, James Morse wrote:
>> Memory Partitioning and Monitoring (MPAM) has memory mapped devices
>> (MSCs) with an identity/configuration page.
>>
>> Add the definitions for these registers as offset within the page(s).
>> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/mpam_internal.h
>> index d49bb884b433..9110c171d9d2 100644
>> --- a/drivers/platform/arm64/mpam/mpam_internal.h
>> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
>> @@ -150,4 +150,272 @@ extern struct list_head mpam_classes;
>> int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32 cache_level,
>> cpumask_t *affinity);
>> +/*
>> + * MPAM MSCs have the following register layout. See:
>> + * Arm Architecture Reference Manual Supplement - Memory System Resource
>> + * Partitioning and Monitoring (MPAM), for Armv8-A. DDI 0598A.a
> I've been checking this against https://developer.arm.com/documentation/ihi0099/latest/ as
> that looks to be the current document although hopefully the contents are non-
> contradictory.
Yeah, A.a was the first release, they then split that document up and the CPU half got
consumed by the arm-arm. (pacman, but in pdf form).
I've updated this comment to point to the more modern 'system component specification',
aka "all the mmio stuff".
>> +/* MPAMF_IDR - MPAM features ID register */
>> +#define MPAMF_IDR_PARTID_MAX GENMASK(15, 0)
>> +#define MPAMF_IDR_PMG_MAX GENMASK(23, 16)
>> +#define MPAMF_IDR_HAS_CCAP_PART BIT(24)
>> +#define MPAMF_IDR_HAS_CPOR_PART BIT(25)
>> +#define MPAMF_IDR_HAS_MBW_PART BIT(26)
>> +#define MPAMF_IDR_HAS_PRI_PART BIT(27)
>> +#define MPAMF_IDR_HAS_EXT BIT(28)
> MPAMF_IDR_EXT. The field name is ext rather than has_ext.
Making it the odd one out!
I'll change this in the hope that one day this sort of thing can be generated.
>> +#define MPAMF_IDR_HAS_IMPL_IDR BIT(29)
>> +#define MPAMF_IDR_HAS_MSMON BIT(30)
>> +#define MPAMF_IDR_HAS_PARTID_NRW BIT(31)
>> +#define MPAMF_IDR_HAS_RIS BIT(32)
>> +#define MPAMF_IDR_HAS_EXT_ESR BIT(38)
> MPAMF_IDR_HAS_EXTD_ESR. Missing D.
Fixed.
>> +#define MPAMF_IDR_HAS_ESR BIT(39)
>> +#define MPAMF_IDR_RIS_MAX GENMASK(59, 56)
>> +
>> +/* MPAMF_MSMON_IDR - MPAM performance monitoring ID register */
>> +#define MPAMF_MSMON_IDR_MSMON_CSU BIT(16)
>> +#define MPAMF_MSMON_IDR_MSMON_MBWU BIT(17)
>> +#define MPAMF_MSMON_IDR_HAS_LOCAL_CAPT_EVNT BIT(31)
>> +
>> +/* MPAMF_CPOR_IDR - MPAM features cache portion partitioning ID register */
>> +#define MPAMF_CPOR_IDR_CPBM_WD GENMASK(15, 0)
>> +
>> +/* MPAMF_CCAP_IDR - MPAM features cache capacity partitioning ID register */
>> +#define MPAMF_CCAP_IDR_HAS_CMAX_SOFTLIM BIT(31)
>> +#define MPAMF_CCAP_IDR_NO_CMAX BIT(30)
>> +#define MPAMF_CCAP_IDR_HAS_CMIN BIT(29)
>> +#define MPAMF_CCAP_IDR_HAS_CASSOC BIT(28)
>> +#define MPAMF_CCAP_IDR_CASSOC_WD GENMASK(12, 8)
>> +#define MPAMF_CCAP_IDR_CMAX_WD GENMASK(5, 0)
> nit: Field ordering differs from the other registers.
Fixed,
[..]
>> +/* MPAMF_ESR - MPAM Error Status Register */
>> +#define MPAMF_ESR_PARTID_OR_MON GENMASK(15, 0)
> Probably a better name, but PARTID_MON is in the specification.
Fixed,
[..]
>> +/*
>> + * MSMON_CFG_CSU_CTL - Memory system performance monitor configure cache storage
>> + * usage monitor control register
>> + * MSMON_CFG_MBWU_CTL - Memory system performance monitor configure memory
>> + * bandwidth usage monitor control register
>> + */
>> +#define MSMON_CFG_x_CTL_TYPE GENMASK(7, 0)
>> +#define MSMON_CFG_x_CTL_OFLOW_STATUS_L BIT(15)
> No OFLOW_STATUS_L for csu.
You're suggesting there shouldn't be an 'x' in the middle? Sure.
Overflow is nonsense for the CSU 'counters' as they don't count up, so can't overflow.
(and yet they have an overflow status bit!)
>> +#define MSMON_CFG_x_CTL_MATCH_PARTID BIT(16)
>> +#define MSMON_CFG_x_CTL_MATCH_PMG BIT(17)
>> +#define MSMON_CFG_x_CTL_SCLEN BIT(19)
>> +#define MSMON_CFG_x_CTL_SUBTYPE GENMASK(23, 20)
> GENMASK(22, 20)
>> +#define MSMON_CFG_x_CTL_OFLOW_FRZ BIT(24)
>> +#define MSMON_CFG_x_CTL_OFLOW_INTR BIT(25)
(Are you using Outlook?)
>> +#define MSMON_CFG_x_CTL_OFLOW_STATUS BIT(26)
>> +#define MSMON_CFG_x_CTL_CAPT_RESET BIT(27)
>> +#define MSMON_CFG_x_CTL_CAPT_EVNT GENMASK(30, 28)
>> +#define MSMON_CFG_x_CTL_EN BIT(31)
>> +
>> +#define MSMON_CFG_MBWU_CTL_TYPE_MBWU 0x42
>> +#define MSMON_CFG_MBWU_CTL_TYPE_CSU 0x43
>> +
>> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_NONE 0
>> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_READ 1
>> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_WRITE 2
>> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_BOTH 3
> I'm not sure where these come from? SUBTYPE is marked unused in the spec. Remove?
>> +
Looks like an earlier version of RWBW feature. I'll rip these out.
>> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_MAX 3
>> +#define MSMON_CFG_MBWU_CTL_SUBTYPE_MASK 0x3
> Remove for same reason.
>> +
>> +/*
>> + * MSMON_CFG_MBWU_FLT - Memory system performance monitor configure memory
>> + * bandwidth usage monitor filter register
>> + */
>> +#define MSMON_CFG_MBWU_FLT_PARTID GENMASK(15, 0)
>> +#define MSMON_CFG_MBWU_FLT_PMG GENMASK(23, 16)
>> +#define MSMON_CFG_MBWU_FLT_RWBW GENMASK(31, 30)
>> +
>> +/*
>> + * MSMON_CSU - Memory system performance monitor cache storage usage monitor
>> + * register
>> + * MSMON_CSU_CAPTURE - Memory system performance monitor cache storage usage
>> + * capture register
>> + * MSMON_MBWU - Memory system performance monitor memory bandwidth usage
>> + * monitor register
>> + * MSMON_MBWU_CAPTURE - Memory system performance monitor memory bandwidth usage
>> + * capture register
>> + */
>> +#define MSMON___VALUE GENMASK(30, 0)
>> +#define MSMON___NRDY BIT(31)
>> +#define MSMON_MBWU_L_VALUE GENMASK(62, 0)
> This gets renamed in the series. I think all register layout definitions can be added in
> this commit.
Yup, I've pulled that hunk in. (and spotted there is another)
Adding them all here was the plan. (and there will be a few that don't get used)
Thanks for going through all those!
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware
2025-07-24 14:13 ` Ben Horgan
@ 2025-08-06 18:07 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-06 18:07 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 24/07/2025 15:13, Ben Horgan wrote:
> On 11/07/2025 19:36, James Morse wrote:
>> Because an MSC can only be accessed from the CPUs in its cpu-affinity
>> set we need to be running on one of those CPUs to probe the MSC
>> hardware.
>>
>> Do this work in the cpuhp callback. Probing the hardware will only
>> happen before MPAM is enabled, walk all the MSCs and probe those we can
>> reach that haven't already been probed.
>>
>> Later once MPAM is enabled, this cpuhp callback will be replaced by
>> one that avoids the global list.
>>
>> Enabling a static key will also take the cpuhp lock, so can't be done
>> from the cpuhp callback. Whenever a new MSC has been probed schedule
>> work to test if all the MSCs have now been probed.
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
>> index 0d6d5180903b..89434ae3efa6 100644
>> --- a/drivers/platform/arm64/mpam/mpam_devices.c
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
>> @@ -513,9 +541,84 @@ int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
>> return err;
>> }
>> -static void mpam_discovery_complete(void)
>> +static int mpam_msc_hw_probe(struct mpam_msc *msc)
>> {
>> - pr_err("Discovered all MSC\n");
>> + u64 idr;
>> + int err;
>> +
>> + lockdep_assert_held(&msc->probe_lock);
>> +
>> + mutex_lock(&msc->part_sel_lock);
>> + idr = mpam_read_partsel_reg(msc, AIDR);
>> + if ((idr & MPAMF_AIDR_ARCH_MAJOR_REV) != MPAM_ARCHITECTURE_V1) {
>> + pr_err_once("%s does not match MPAM architecture v1.0\n",
>> + dev_name(&msc->pdev->dev));
> The error message need only mention the major revision. You've added support for v1.1 and
> v1.0.
Makes sense,
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware
2025-07-29 6:11 ` Baisheng Gao
@ 2025-08-06 18:07 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-06 18:07 UTC (permalink / raw)
To: Baisheng Gao
Cc: amitsinght, baolin.wang, ben.horgan, bobo.shaobowang, carl,
dave.martin, david, dfustini, kobak, lcherian, lecopzerc,
linux-arm-kernel, linux-kernel, peternewman, quic_jiles, rex.nie,
robh, rohit.mathew, scott, sdonthineni, shameerali.kolothum.thodi,
tan.shaopeng, xhao, zengheng4, hao_hao.wang
Hi Baisheng,
On 29/07/2025 07:11, Baisheng Gao wrote:
>> Because an MSC can only be accessed from the CPUs in its cpu-affinity
>> set we need to be running on one of those CPUs to probe the MSC
>> hardware.
>>
>> Do this work in the cpuhp callback. Probing the hardware will only
>> happen before MPAM is enabled, walk all the MSCs and probe those we can
>> reach that haven't already been probed.
>>
>> Later once MPAM is enabled, this cpuhp callback will be replaced by
>> one that avoids the global list.
>>
>> Enabling a static key will also take the cpuhp lock, so can't be done
>> from the cpuhp callback. Whenever a new MSC has been probed schedule
>> work to test if all the MSCs have now been probed.
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
>> index 0d6d5180903b..89434ae3efa6 100644
>> --- a/drivers/platform/arm64/mpam/mpam_devices.c
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
>> @@ -513,9 +541,84 @@ int mpam_ris_create(struct mpam_msc *msc, u8 ris_idx,
>> +static void mpam_register_cpuhp_callbacks(int (*online)(unsigned int online),
>> + int (*offline)(unsigned int offline))
>> +{
>> + mutex_lock(&mpam_cpuhp_state_lock);
>> + if (mpam_cpuhp_state) {
>> + cpuhp_remove_state(mpam_cpuhp_state);
>> + mpam_cpuhp_state = 0;
>> + }
>> +
>> + mpam_cpuhp_state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "mpam:online",
>> + online, offline);
>> + if (mpam_cpuhp_state <= 0) {
>> + pr_err("Failed to register cpuhp callbacks");
>> + mpam_cpuhp_state = 0;
>> + }
>> + mutex_unlock(&mpam_cpuhp_state_lock);
>> }
>>
>> static int mpam_dt_count_msc(void)
>> @@ -797,6 +900,46 @@ static struct platform_driver mpam_msc_driver = {
>> +static void mpam_enable_once(void)
>> +{
>> + mutex_lock(&mpam_cpuhp_state_lock);
>> + cpuhp_remove_state(mpam_cpuhp_state);
>> + mpam_cpuhp_state = 0;
>> + mutex_unlock(&mpam_cpuhp_state_lock);
> Deleting the above 4 lines?
> The mpam_cpuhp_state will be removed firstly in mpam_register_cpuhp_callbacks
> if the mpam_cpuhp_state isn't 0.
Yup - this is a pointless appendage because of the way the code evolved!
Thanks for catching it
James
>> +
>> + mpam_register_cpuhp_callbacks(mpam_cpu_online, mpam_cpu_offline);
>> +
>> + pr_info("MPAM enabled\n");
>> +}
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 13/36] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate
2025-08-06 18:04 ` James Morse
@ 2025-08-07 17:50 ` Drew Fustini
0 siblings, 0 replies; 117+ messages in thread
From: Drew Fustini @ 2025-08-07 17:50 UTC (permalink / raw)
To: James Morse
Cc: Catalin Marinas, linux-kernel, linux-arm-kernel, Rob Herring,
Ben Horgan, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
On Wed, Aug 06, 2025 at 07:04:09PM +0100, James Morse wrote:
> Hi Catalin,
>
> On 24/07/2025 13:09, Catalin Marinas wrote:
> > On Fri, Jul 11, 2025 at 06:36:25PM +0000, James Morse wrote:
> >> Probing MPAM is convoluted. MSCs that are integrated with a CPU may
> >> only be accessible from those CPUs, and they may not be online.
> >> Touching the hardware early is pointless as MPAM can't be used until
> >> the system-wide common values for num_partid and num_pmg have been
> >> discovered.
> >>
> >> Start with driver probe/remove and mapping the MSC.
>
> >> arch/arm64/Kconfig | 1 +
> >> drivers/platform/arm64/Kconfig | 1 +
> >> drivers/platform/arm64/Makefile | 1 +
> >> drivers/platform/arm64/mpam/Kconfig | 10 +
> >> drivers/platform/arm64/mpam/Makefile | 4 +
> >> drivers/platform/arm64/mpam/mpam_devices.c | 336 ++++++++++++++++++++
> >> drivers/platform/arm64/mpam/mpam_internal.h | 62 ++++
> >> 7 files changed, 415 insertions(+)
> >> create mode 100644 drivers/platform/arm64/mpam/Kconfig
> >> create mode 100644 drivers/platform/arm64/mpam/Makefile
> >> create mode 100644 drivers/platform/arm64/mpam/mpam_devices.c
> >> create mode 100644 drivers/platform/arm64/mpam/mpam_internal.h
>
> > Bikeshedding: why not drivers/resctrl to match fs/resctrl? We wouldn't
> > need the previous patch either to move the arm64 platform drivers.
>
> Initially because I don't see any other architecture having an MMIO interface to this
> stuff, and didn't want a 'top level' driver directory for a single driver. But, re-reading
> RISC-Vs CBQRI[0] it turns out that theirs is memory mapped...
Yeah, all the cpus (e.g. harts) can access all the registers of the QoS
controllers per the CBQRI spec [1].
The memory map for the example SoC in the proof-of-concept [2]:
Base addr Size
0x4820000 4KB Cluster 0 L2 cache controller
0x4821000 4KB Cluster 1 L2 cache controller
0x4828000 4KB Memory controller 0
0x4829000 4KB Memory controller 1
0x482a000 4KB Memory controller 2
0x482b000 4KB Shared LLC cache controller
> > I'm not an expert on resctrl but the MPAM code looks more like a backend
> > for the resctrl support, so it makes more sense to do as we did for
> > other drivers like irqchip, iommu.
>
> Only because there are many irqchip or iommu. I'm not a fan of drivers/mpam, but
> drivers/resctrl would suit RISC-V too. (I'll check with Drew)
I think that is reasonable. In the proof-of-concept, I had the following
structure, but I think there is a lot of room for improvement.
arch/riscv/kernel/qos/qos_resctrl.c
Implementation of the register interface described in the CBQRI spec
along with the resctrl implementation. I should probably break this up
into separate files for the CBQRI operations and the resctrl interface.
drivers/soc/foobar/foobar_cbqri_cache.c
DT-based driver for SoC cache controller that implements CBQRI
drivers/soc/foobar/foobar_cbqri_memory.c
DT-based driver for SoC memory controller that implements CBQRI
With all the great upstream progress, I've been meaning to rebase the
RISC-V CBQRI support and post an RFC as its been a really long time.
There is no public silicon yet that implements CBQRI but I think the
possibility is getting closer. I've also been working on integrating
ACPI support [3] using the new RQSC table, and I've been meaning to post
an RFC for that too.
Thanks,
Drew
[1] https://github.com/riscv-non-isa/riscv-cbqri/releases/download/v1.0/riscv-cbqri.pdf
[2] https://lore.kernel.org/linux-riscv/20230419111111.477118-1-dfustini@baylibre.com/
[3] https://lf-rise.atlassian.net/wiki/spaces/HOME/pages/433291272/ACPI+RQSC+Proof+of+Concept
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 20/36] arm_mpam: Probe the hardware features resctrl supports
2025-07-24 15:08 ` Ben Horgan
2025-07-28 16:16 ` Jonathan Cameron
@ 2025-08-07 18:26 ` James Morse
1 sibling, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-07 18:26 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 24/07/2025 16:08, Ben Horgan wrote:
> On 7/11/25 19:36, James Morse wrote:
>> Expand the probing support with the control and monitor types
>> we can use with resctrl.
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
>> index 8646fb85ad09..61911831ab39 100644
>> --- a/drivers/platform/arm64/mpam/mpam_devices.c
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
>> @@ -645,6 +659,137 @@ static struct mpam_msc_ris *mpam_get_or_create_ris(struct mpam_msc
>> *msc,
>> return found;
>> }
>> +/*
>> + * IHI009A.a has this nugget: "If a monitor does not support automatic behaviour
>> + * of NRDY, software can use this bit for any purpose" - so hardware might not
>> + * implement this - but it isn't RES0.
>> + *
>> + * Try and see what values stick in this bit. If we can write either value,
>> + * its probably not implemented by hardware.
>> + */
>> +#define mpam_ris_hw_probe_hw_nrdy(_ris, _mon_reg, _result) \
>> +do { \
>> + u32 now; \
>> + u64 mon_sel; \
>> + bool can_set, can_clear; \
>> + struct mpam_msc *_msc = _ris->vmsc->msc; \
>> + \
>> + if (WARN_ON_ONCE(!mpam_mon_sel_inner_lock(_msc))) { \
>> + _result = false; \
>> + break; \
>> + } \
>> + mon_sel = FIELD_PREP(MSMON_CFG_MON_SEL_MON_SEL, 0) | \
>> + FIELD_PREP(MSMON_CFG_MON_SEL_RIS, _ris->ris_idx); \
>> + mpam_write_monsel_reg(_msc, CFG_MON_SEL, mon_sel); \
>> + \
>> + mpam_write_monsel_reg(_msc, _mon_reg, MSMON___NRDY); \
>> + now = mpam_read_monsel_reg(_msc, _mon_reg); \
>> + can_set = now & MSMON___NRDY; \
>> + \
>> + mpam_write_monsel_reg(_msc, _mon_reg, 0); \
>> + now = mpam_read_monsel_reg(_msc, _mon_reg); \
>> + can_clear = !(now & MSMON___NRDY); \
>> + mpam_mon_sel_inner_unlock(_msc); \
>> + \
>> + _result = (!can_set || !can_clear); \
>> +} while (0)
> It is a bit surprising that something that looks like a function modifies a boolean passed
> by value.
> Consider continuing the pattern you have above:
> #define mpam_ris_hw_probe_hw_nrdy(_ris, _mon_reg, _result)
> _mpam_ris_hw_probe_hw_nrdy(_ris, MSMON##_mon_reg, _result)
>
> with signature:
> void _mpam_ris_hw_probe_hw_nrdy(struct mpam_msc *msc, u16 reg, bool *hw_managed);
>
> and using the _mpam functions from the new _mpam_ris_hw_probe_hw_nrdy().
With that, it may as well return the result.
Done.
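For reference, roughly what that looks like (untested sketch - the '_mpam_{read,write}_monsel_reg()' names are stand-ins for whatever the non-token-pasting accessors end up being called):

#define mpam_ris_hw_probe_hw_nrdy(_ris, _mon_reg) \
	__mpam_ris_hw_probe_hw_nrdy(_ris, MSMON_##_mon_reg)

static bool __mpam_ris_hw_probe_hw_nrdy(struct mpam_msc_ris *ris, u16 mon_reg)
{
	u32 now;
	u64 mon_sel;
	bool can_set, can_clear;
	struct mpam_msc *msc = ris->vmsc->msc;

	if (WARN_ON_ONCE(!mpam_mon_sel_inner_lock(msc)))
		return false;

	mon_sel = FIELD_PREP(MSMON_CFG_MON_SEL_MON_SEL, 0) |
		  FIELD_PREP(MSMON_CFG_MON_SEL_RIS, ris->ris_idx);
	_mpam_write_monsel_reg(msc, MSMON_CFG_MON_SEL, mon_sel);

	/* If both values stick, hardware isn't managing NRDY */
	_mpam_write_monsel_reg(msc, mon_reg, MSMON___NRDY);
	now = _mpam_read_monsel_reg(msc, mon_reg);
	can_set = now & MSMON___NRDY;

	_mpam_write_monsel_reg(msc, mon_reg, 0);
	now = _mpam_read_monsel_reg(msc, mon_reg);
	can_clear = !(now & MSMON___NRDY);
	mpam_mon_sel_inner_unlock(msc);

	return !can_set || !can_clear;
}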
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 22/36] arm_mpam: Reset MSC controls from cpu hp callbacks
2025-07-28 9:49 ` Ben Horgan
@ 2025-08-08 7:05 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:05 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 28/07/2025 10:49, Ben Horgan wrote:
> On 7/11/25 19:36, James Morse wrote:
>> When a CPU comes online, it may bring a newly accessible MSC with
>> it. Only the default partid has its value reset by hardware, and
>> even then the MSC might not have been reset since its config was
>> previously dirtied. e.g. Kexec.
>>
>> Any in-use partid must have its configuration restored, or reset.
>> In-use partids may be held in caches and evicted later.
>>
>> MSC are also reset when CPUs are taken offline to cover cases where
>> firmware doesn't reset the MSC over reboot using UEFI, or kexec
>> where there is no firmware involvement.
>>
>> If the configuration for a RIS has not been touched since it was
>> brought online, it does not need resetting again.
>>
>> To reset, write the maximum values for all discovered controls.
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
>> index 7b042a35405a..d014dbe0ab96 100644
>> --- a/drivers/platform/arm64/mpam/mpam_devices.c
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
>> @@ -849,8 +850,116 @@ static int mpam_msc_hw_probe(struct mpam_msc *msc)
>> return 0;
>> }
>> +static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg, u16 wd)
>> +{
>> + u32 num_words, msb;
>> + u32 bm = ~0;
>> + int i;
>> +
>> + lockdep_assert_held(&msc->part_sel_lock);
>> +
>> + if (wd == 0)
>> + return;
>> +
>> + /*
>> + * Write all ~0 to all but the last 32bit-word, which may
>> + * have fewer bits...
>> + */
>> + num_words = DIV_ROUND_UP(wd, 32);
>> + for (i = 0; i < num_words - 1; i++, reg += sizeof(bm))
>> + __mpam_write_reg(msc, reg, bm);
>> +
>> + /*
>> + * ....and then the last (maybe) partial 32bit word. When wd is a
>> + * multiple of 32, msb should be 31 to write a full 32bit word.
>> + */
>> + msb = (wd - 1) % 32;
>> + bm = GENMASK(msb, 0);
>> + if (bm)
>> + __mpam_write_reg(msc, reg, bm);
> Drop the 'if' as the 0 bit will always be part of the mask.
Yup, this was a harmless leftover of the versions that tried to optionally write the left over bits.
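i.e. the tail becomes just:
| msb = (wd - 1) % 32;
| __mpam_write_reg(msc, reg, GENMASK(msb, 0));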
>> @@ -1419,7 +1541,7 @@ static void mpam_enable_once(void)
>> mpam_register_cpuhp_callbacks(mpam_cpu_online, mpam_cpu_offline);
>> printk(KERN_INFO "MPAM enabled with %u partid and %u pmg\n",
>> - mpam_partid_max + 1, mpam_pmg_max + 1);
>> + READ_ONCE(mpam_partid_max) + 1, mpam_pmg_max + 1);
>
> Belongs in 'arm_mpam: Probe MSCs to find the supported partid/pmg values'.
That value is now protected by a lock, and can't change at this point, so it no longer needs the READ_ONCE() anyway.
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 24/36] arm_mpam: Extend reset logic to allow devices to be reset any time
2025-07-28 10:22 ` Ben Horgan
@ 2025-08-08 7:07 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:07 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 28/07/2025 11:22, Ben Horgan wrote:
> On 7/11/25 19:36, James Morse wrote:
>> cpuhp callbacks aren't the only time the MSC configuration may need to
>> be reset. Resctrl has an API call to reset a class.
>> If an MPAM error interrupt arrives it indicates the driver has
>> misprogrammed an MSC. The safest thing to do is reset all the MSCs
>> and disable MPAM.
>>
>> Add a helper to reset RIS via their class. Call this from mpam_disable(),
>> which can be scheduled from the error interrupt handler.
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/mpam_devices.c
>> index 2e32e54cc081..145535cd4732 100644
>> --- a/drivers/platform/arm64/mpam/mpam_devices.c
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
>> +/*
>> + * Called in response to an error IRQ.
>> + * All of MPAMs errors indicate a software bug, restore any modified
>> + * controls to their reset values.
>> + */
>> +void mpam_disable(void)
>> +{
>> + int idx;
>> + struct mpam_class *class;
>> +
>> + idx = srcu_read_lock(&mpam_srcu);
>> + list_for_each_entry_srcu(class, &mpam_classes, classes_list,
>> + srcu_read_lock_held(&mpam_srcu))
>> + mpam_reset_class(class);
>> + srcu_read_unlock(&mpam_srcu, idx);
>> +}
> Consider moving to the next patch where you introduce interrupt support.
I pulled these changes out of that patch to try and make it simpler!
Doing that would leave a bunch of static functions unused.
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 25/36] arm_mpam: Register and enable IRQs
2025-07-17 1:08 ` Shaopeng Tan (Fujitsu)
@ 2025-08-08 7:07 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:07 UTC (permalink / raw)
To: Shaopeng Tan (Fujitsu), linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko
Hi Shaopeng,
On 17/07/2025 02:08, Shaopeng Tan (Fujitsu) wrote:
>> Register and enable error IRQs. All the MPAM error interrupts indicate a
>> software bug, e.g. out of range partid. If the error interrupt is ever signalled,
>> attempt to disable MPAM.
>>
>> Only the irq handler accesses the ESR register, so no locking is needed.
>> The work to disable MPAM after an error needs to happen at process context,
>> use a threaded interrupt.
>>
>> There is no support for percpu threaded interrupts, for now schedule the work
>> to be done from the irq handler.
>>
>> Enabling the IRQs in the MSC may involve cross calling to a CPU that can
>> access the MSC.
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c
>> b/drivers/platform/arm64/mpam/mpam_devices.c
>> index 145535cd4732..af19cc25d16e 100644
>> --- a/drivers/platform/arm64/mpam/mpam_devices.c
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
>> +static irqreturn_t __mpam_irq_handler(int irq, struct mpam_msc *msc) {
>> + u64 reg;
>> + u16 partid;
>> + u8 errcode, pmg, ris;
>> +
>> + if (WARN_ON_ONCE(!msc) ||
>> + WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(),
>> + &msc->accessibility)))
>> + return IRQ_NONE;
>> +
>> + reg = mpam_msc_read_esr(msc);
>> +
>> + errcode = FIELD_GET(MPAMF_ESR_ERRCODE, reg);
>> + if (!errcode)
>> + return IRQ_NONE;
> In general, I think there is no problem.
> However, the initial value of MPAMF_ESR_ERRCODE may not be 0 on some chips.
> It is better to initialize when loading the MPAM driver
Hmm, the architecture doesn't say - but if this weren't true, you'd get spurious fatal errors.
More interesting is if the bootloader (unlikely) or a previous kexec'd kernel (feasible) were using
MPAM and triggered an error. We'd end up disabling the driver when there was no bug...
I'll add something to the hw_probe() path.
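Something along these lines - a sketch only, the exact hook point in hw_probe() is
an assumption, but mpam_msc_zero_esr() is the same helper the irq handler already uses:
--------------------%<--------------------
/* Sketch: discard any error latched before Linux owned the MSC. */
static void mpam_msc_discard_stale_errors(struct mpam_msc *msc)
{
	/*
	 * The architecture doesn't define MPAMF_ESR's reset value, and a
	 * bootloader or a previously kexec'd kernel may have left an error
	 * latched. Clear it before the error IRQ is registered so the first
	 * interrupt really does indicate one of our bugs.
	 */
	mpam_msc_zero_esr(msc);
}
--------------------%<--------------------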
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 25/36] arm_mpam: Register and enable IRQs
2025-07-16 7:31 ` Shaopeng Tan (Fujitsu)
@ 2025-08-08 7:08 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:08 UTC (permalink / raw)
To: Shaopeng Tan (Fujitsu), linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko
Hi Shaopeng,
On 16/07/2025 08:31, Shaopeng Tan (Fujitsu) wrote:
>> Register and enable error IRQs. All the MPAM error interrupts indicate a
>> software bug, e.g. out of range partid. If the error interrupt is ever signalled,
>> attempt to disable MPAM.
>>
>> Only the irq handler accesses the ESR register, so no locking is needed.
>> The work to disable MPAM after an error needs to happen at process context,
>> use a threaded interrupt.
>>
>> There is no support for percpu threaded interrupts, for now schedule the work
>> to be done from the irq handler.
>>
>> Enabling the IRQs in the MSC may involve cross calling to a CPU that can
>> access the MSC.
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c
>> b/drivers/platform/arm64/mpam/mpam_devices.c
>> index 145535cd4732..af19cc25d16e 100644
>> --- a/drivers/platform/arm64/mpam/mpam_devices.c
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
>> +static irqreturn_t __mpam_irq_handler(int irq, struct mpam_msc *msc) {
>> + u64 reg;
>> + u16 partid;
>> + u8 errcode, pmg, ris;
>> +
>> + if (WARN_ON_ONCE(!msc) ||
>> + WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(),
>> + &msc->accessibility)))
>> + return IRQ_NONE;
>> +
>> + reg = mpam_msc_read_esr(msc);
>> +
>> + errcode = FIELD_GET(MPAMF_ESR_ERRCODE, reg);
>> + if (!errcode)
>> + return IRQ_NONE;
>> +
>> + /* Clear level triggered irq */
>> + mpam_msc_zero_esr(msc);
>> +
>> + partid = FIELD_GET(MPAMF_ESR_PARTID_OR_MON, reg);
>> + pmg = FIELD_GET(MPAMF_ESR_PMG, reg);
>> + ris = FIELD_GET(MPAMF_ESR_PMG, reg);
> MPAMF_ESR_RIS?
Yup, thanks for catching that!
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 25/36] arm_mpam: Register and enable IRQs
2025-07-22 15:06 ` Jonathan Cameron
@ 2025-08-08 7:11 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:11 UTC (permalink / raw)
To: Jonathan Cameron
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
Hi Jonathan,
On 22/07/2025 16:06, Jonathan Cameron wrote:
> On Fri, 11 Jul 2025 18:36:37 +0000
> James Morse <james.morse@arm.com> wrote:
>> Register and enable error IRQs. All the MPAM error interrupts indicate a
>> software bug, e.g. out of range partid. If the error interrupt is ever
>> signalled, attempt to disable MPAM.
>>
>> Only the irq handler accesses the ESR register, so no locking is needed.
>> The work to disable MPAM after an error needs to happen at process
>> context, use a threaded interrupt.
>>
>> There is no support for percpu threaded interrupts, for now schedule
>> the work to be done from the irq handler.
>>
>> Enabling the IRQs in the MSC may involve cross calling to a CPU that
>> can access the MSC.
> Sparse gives an imbalance warning in mpam_register_irqs()
>> +static int mpam_register_irqs(void)
>> +{
>> + int err, irq, idx;
>> + struct mpam_msc *msc;
>> +
>> + lockdep_assert_cpus_held();
>> +
>> + idx = srcu_read_lock(&mpam_srcu);
>> + list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list, srcu_read_lock_held(&mpam_srcu)) {
>> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
>> + if (irq <= 0)
>> + continue;
>> +
>> + /* The MPAM spec says the interrupt can be SPI, PPI or LPI */
>> + /* We anticipate sharing the interrupt with other MSCs */
>> + if (irq_is_percpu(irq)) {
>> + err = request_percpu_irq(irq, &mpam_ppi_handler,
>> + "mpam:msc:error",
>> + msc->error_dev_id);
>> + if (err)
>> + return err;
> Looks like the srcu_read_lock is still held.
Oops,
> There is a DEFINE_LOCK_GUARD_1() in srcu.h so you can do
>
> guard(srcu)(&mpam_srcu, idx);
>
> I think and not worry about releasing it in errors or the good path.
Sure ... but having the compiler choose when to release locks makes me nervous!
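For comparison, a sketch of how the loop could look with the scoped guard, assuming
the srcu guard defined in include/linux/srcu.h (the guard keeps the index itself, so
there is nothing to release on the error paths):
--------------------%<--------------------
static int mpam_register_irqs(void)
{
	struct mpam_msc *msc;
	int err, irq;

	lockdep_assert_cpus_held();

	guard(srcu)(&mpam_srcu);
	list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list,
				 srcu_read_lock_held(&mpam_srcu)) {
		irq = platform_get_irq_byname_optional(msc->pdev, "error");
		if (irq <= 0)
			continue;

		if (irq_is_percpu(irq)) {
			err = request_percpu_irq(irq, &mpam_ppi_handler,
						 "mpam:msc:error",
						 msc->error_dev_id);
			if (err)
				return err; /* guard drops the SRCU read lock */
			/* rest of the per-cpu enable elided */
		}
		/* shared SPI/LPI case elided */
	}

	return 0;
}
--------------------%<--------------------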
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 25/36] arm_mpam: Register and enable IRQs
2025-07-28 10:49 ` Ben Horgan
@ 2025-08-08 7:11 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:11 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 28/07/2025 11:49, Ben Horgan wrote:
> On 7/11/25 19:36, James Morse wrote:
>> Register and enable error IRQs. All the MPAM error interrupts indicate a
>> software bug, e.g. out of range partid. If the error interrupt is ever
>> signalled, attempt to disable MPAM.
>>
>> Only the irq handler accesses the ESR register, so no locking is needed.
>> The work to disable MPAM after an error needs to happen at process
>> context, use a threaded interrupt.
>>
>> There is no support for percpu threaded interrupts, for now schedule
>> the work to be done from the irq handler.
>>
>> Enabling the IRQs in the MSC may involve cross calling to a CPU that
>> can access the MSC.
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/
>> mpam_devices.c
>> index 145535cd4732..af19cc25d16e 100644
>> --- a/drivers/platform/arm64/mpam/mpam_devices.c
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
>> @@ -1548,11 +1638,193 @@ static void mpam_enable_merge_features(struct list_head
>> +static int mpam_enable_msc_ecr(void *_msc)
>> +{
>> + struct mpam_msc *msc = _msc;
>> +
>> + __mpam_write_reg(msc, MPAMF_ECR, 1);
> You can use MPAMF_ECR_INTEN.
Sure,
>> +
>> + return 0;
>> +}
>> @@ -1644,7 +1939,6 @@ void mpam_enable(struct work_struct *work)
>> struct mpam_msc *msc;
>> bool all_devices_probed = true;
>> - /* Have we probed all the hw devices? */
> Stray change. Keep the comment or remove it in the patch that introduced it.
Fixed.
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 25/36] arm_mpam: Register and enable IRQs
2025-08-04 16:53 ` Fenghua Yu
@ 2025-08-08 7:12 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:12 UTC (permalink / raw)
To: Fenghua Yu, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Fenghua,
On 04/08/2025 17:53, Fenghua Yu wrote:
> On 7/11/25 11:36, James Morse wrote:
>> Register and enable error IRQs. All the MPAM error interrupts indicate a
>> software bug, e.g. out of range partid. If the error interrupt is ever
>> signalled, attempt to disable MPAM.
>>
>> Only the irq handler accesses the ESR register, so no locking is needed.
>> The work to disable MPAM after an error needs to happen at process
>> context, use a threaded interrupt.
>>
>> There is no support for percpu threaded interrupts, for now schedule
>> the work to be done from the irq handler.
>>
>> Enabling the IRQs in the MSC may involve cross calling to a CPU that
>> can access the MSC.
>> +static int mpam_register_irqs(void)
>> +{
>> + int err, irq, idx;
>> + struct mpam_msc *msc;
>> +
>> + lockdep_assert_cpus_held();
>> +
>> + idx = srcu_read_lock(&mpam_srcu);
>> + list_for_each_entry_srcu(msc, &mpam_all_msc, glbl_list,
>> srcu_read_lock_held(&mpam_srcu)) {
>> + irq = platform_get_irq_byname_optional(msc->pdev, "error");
>> + if (irq <= 0)
>> + continue;
>> +
>> + /* The MPAM spec says the interrupt can be SPI, PPI or LPI */
>> + /* We anticipate sharing the interrupt with other MSCs */
>> + if (irq_is_percpu(irq)) {
>> + err = request_percpu_irq(irq, &mpam_ppi_handler,
>> + "mpam:msc:error",
>> + msc->error_dev_id);
>> + if (err)
>> + return err;
> But right now mpam_srcu is still being locked. Need to unlock it before return.
Yup, Jonathan's srcu guard runes solve that in a future-proof way.
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online
2025-07-16 6:49 ` Shaopeng Tan (Fujitsu)
@ 2025-08-08 7:13 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:13 UTC (permalink / raw)
To: Shaopeng Tan (Fujitsu), linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi@huawei.com, D Scott Phillips OS,
lcherian@marvell.com, bobo.shaobowang@huawei.com,
baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
peternewman@google.com, dfustini@baylibre.com,
amitsinght@marvell.com, David Hildenbrand, Rex Nie, Dave Martin,
Koba Ko
Hi Shaopeng,
On 16/07/2025 07:49, Shaopeng Tan (Fujitsu) wrote:
>> When CPUs come online the original configuration should be restored.
>> Once the maximum partid is known, allocate an configuration array for each
>> component, and reprogram each RIS configuration from this.
>>
>> The MPAM spec describes how multiple controls can interact. To prevent this
>> happening by accident, always reset controls that don't have a valid
>> configuration. This allows the same helper to be used for configuration and
>> reset.
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c
>> b/drivers/platform/arm64/mpam/mpam_devices.c
>> index bb3695eb84e9..f3ecfda265d2 100644
>> --- a/drivers/platform/arm64/mpam/mpam_devices.c
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
>> @@ -909,51 +913,90 @@ static void mpam_reset_msc_bitmap(struct
>> mpam_msc *msc, u16 reg, u16 wd)
>> __mpam_write_reg(msc, reg, bm);
>> }
>>
>> -static void mpam_reset_ris_partid(struct mpam_msc_ris *ris, u16 partid)
>> +/* Called via IPI. Call while holding an SRCU reference */ static void
>> +mpam_reprogram_ris_partid(struct mpam_msc_ris *ris, u16 partid,
>> + struct mpam_config *cfg)
>> {
>> u16 bwa_fract = MPAMCFG_MBW_MAX_MAX;
>> struct mpam_msc *msc = ris->vmsc->msc;
>> struct mpam_props *rprops = &ris->props;
>>
>> - mpam_assert_srcu_read_lock_held();
>> -
>> mutex_lock(&msc->part_sel_lock);
>> __mpam_part_sel(ris->ris_idx, partid, msc);
>>
>> - if (mpam_has_feature(mpam_feat_cpor_part, rprops))
>> - mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM,
>> rprops->cpbm_wd);
>> + if (mpam_has_feature(mpam_feat_cpor_part, rprops)) {
>> + if (mpam_has_feature(mpam_feat_cpor_part, cfg))
>> + mpam_write_partsel_reg(msc, CPBM, cfg->cpbm);
>> + else
>> + mpam_reset_msc_bitmap(msc, MPAMCFG_CPBM,
>> + rprops->cpbm_wd);
>> + }
>>
>> - if (mpam_has_feature(mpam_feat_mbw_part, rprops))
>> - mpam_reset_msc_bitmap(msc, MPAMCFG_MBW_PBM,
>> rprops->mbw_pbm_bits);
>> + if (mpam_has_feature(mpam_feat_mbw_part, rprops)) {
>> + if (mpam_has_feature(mpam_feat_mbw_part, cfg))
>> + mpam_write_partsel_reg(msc, MBW_PBM,
>> cfg->mbw_pbm);
>> + else
>> + mpam_reset_msc_bitmap(msc,
>> MPAMCFG_MBW_PBM,
>> + rprops->mbw_pbm_bits);
>> + }
>>
>> if (mpam_has_feature(mpam_feat_mbw_min, rprops))
>> mpam_write_partsel_reg(msc, MBW_MIN, 0);
>>
>> - if (mpam_has_feature(mpam_feat_mbw_max, rprops))
>> - mpam_write_partsel_reg(msc, MBW_MAX, bwa_fract);
>> + if (mpam_has_feature(mpam_feat_mbw_max, rprops)) {
>> + if (mpam_has_feature(mpam_feat_mbw_max, cfg))
>> + mpam_write_partsel_reg(msc, MBW_MAX,
>> cfg->mbw_max);
>> + else
>> + mpam_write_partsel_reg(msc, MBW_MAX,
>> bwa_fract);
>> + }
> 0 was written to MPAMCFG_MBW_MAX. [HARDLIM].
> Depending on the chip, if [HARDLIM] is set to 1 by default, it will be overwritten.
Hardlimit was shoe-horned into the architecture as a backward-compatible thing. It's not too
surprising it gets stamped on here - it was previously RES0. Again, the architecture doesn't
say what the reset value of the register is - generally you can't rely on bits being preserved
if the OS doesn't know what they are.
For the full picture - we don't have a way of getting a hard-limit hint from resctrl. Currently
it's just ignored as the behaviour will then be 'the same' on platforms that do or don't have
hardlimit. If we add something like that to the user-space interface - then we can plumb it in.
Enabling it on platforms that have it now will make that a murky picture as the 'old' behaviour
would need to be preserved.
Yes, mpam_devices.c should give its callers a way of setting hardlim - but today resctrl can't.
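If a hard-limit hint ever arrives from user-space, the MBW_MAX write could carry it
along these lines - a sketch only: neither the cfg flag nor the HARDLIM define exist
in this series (HARDLIM is bit 31 of MPAMCFG_MBW_MAX in the spec):
--------------------%<--------------------
	if (mpam_has_feature(mpam_feat_mbw_max, rprops)) {
		u32 val = bwa_fract;

		if (mpam_has_feature(mpam_feat_mbw_max, cfg))
			val = cfg->mbw_max;
		if (cfg->mbw_max_hardlim)		/* hypothetical flag */
			val |= MPAMCFG_MBW_MAX_HARDLIM;	/* hypothetical define */
		mpam_write_partsel_reg(msc, MBW_MAX, val);
	}
--------------------%<--------------------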
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online
2025-07-28 11:59 ` Ben Horgan
2025-07-28 15:34 ` Dave Martin
@ 2025-08-08 7:14 ` James Morse
1 sibling, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:14 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 28/07/2025 12:59, Ben Horgan wrote:
> On 7/11/25 19:36, James Morse wrote:
>> When CPUs come online the original configuration should be restored.
>> Once the maximum partid is known, allocate an configuration array for
>> each component, and reprogram each RIS configuration from this.
>>
>> The MPAM spec describes how multiple controls can interact. To prevent
>> this happening by accident, always reset controls that don't have a
>> valid configuration. This allows the same helper to be used for
>> configuration and reset.
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/
>> mpam_devices.c
>> index bb3695eb84e9..f3ecfda265d2 100644
>> --- a/drivers/platform/arm64/mpam/mpam_devices.c
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
>> @@ -1000,10 +1041,38 @@ static void mpam_reset_msc(struct mpam_msc *msc, bool online)
>> */
>> ris->in_reset_state = online;
>> }
>> - srcu_read_unlock(&mpam_srcu, idx);
>> mpam_mon_sel_outer_unlock(msc);
>> }
>> +static void mpam_reprogram_msc(struct mpam_msc *msc)
>> +{
>> + int idx;
>> + u16 partid;
>> + bool reset;
>> + struct mpam_config *cfg;
>> + struct mpam_msc_ris *ris;
>> +
>> + idx = srcu_read_lock(&mpam_srcu);
>> + list_for_each_entry_rcu(ris, &msc->ris, msc_list) {
>> + if (!mpam_is_enabled() && !ris->in_reset_state) {
>> + mpam_touch_msc(msc, &mpam_reset_ris, ris);
>> + ris->in_reset_state = true;
>> + continue;
>> + }
>> +
>> + reset = true;
>> + for (partid = 0; partid <= mpam_partid_max; partid++) {
> Do we need to consider 'partid_max_lock' here?
The lock is only needed while those parameters can change, due to a race with mpam_register_requestor(). Once mpam_enabled()
has been called, the values can't change, so the lock is redundant.
In this case, it's relying on mpam_cpu_online() only ever running as the cpuhp callback once partid_max_published has been set.
I'll add a comment.
>> + cfg = &ris->vmsc->comp->cfg[partid];
>> + if (cfg->features)
>> + reset = false;
>> +
>> + mpam_reprogram_ris_partid(ris, partid, cfg);
>> + }
>> + ris->in_reset_state = reset;
>> + }
>> + srcu_read_unlock(&mpam_srcu, idx);
>> +}
>> +
>> static void _enable_percpu_irq(void *_irq)
>> {
>> int *irq = _irq;
> @@ -1806,6 +1875,43 @@ static void mpam_unregister_irqs(void)
>> cpus_read_unlock();
>> }
>> +static void __destroy_component_cfg(struct mpam_component *comp)
>> +{
>> + add_to_garbage(comp->cfg);
>> +}
>> +
>> +static int __allocate_component_cfg(struct mpam_component *comp)
>> +{
>> + if (comp->cfg)
>> + return 0;
>> +
>> + comp->cfg = kcalloc(mpam_partid_max + 1, sizeof(*comp->cfg), GFP_KERNEL);
> And here?
Aha, that is a good one - the configuration should be allocated after the partid values are fixed. Currently it's done by
the same function, but not in the right order.
>> + if (!comp->cfg)
>> + return -ENOMEM;
>> + init_garbage(comp->cfg);
>> +
>> + return 0;
>> +}
>> +
>> +static int mpam_allocate_config(void)
>> +{
>> + int err = 0;
>> + struct mpam_class *class;
>> + struct mpam_component *comp;
>> +
>> + lockdep_assert_held(&mpam_list_lock);
>> +
>> + list_for_each_entry(class, &mpam_classes, classes_list) {
>> + list_for_each_entry(comp, &class->components, class_list) {
>> + err = __allocate_component_cfg(comp);
>> + if (err)
>> + return err;
>> + }
>> + }
>> +
>> + return 0;
>> +}
>> +
>> static void mpam_enable_once(void)
>> {
>> int err;
>> @@ -1817,12 +1923,21 @@ static void mpam_enable_once(void)
>> */
>> cpus_read_lock();
>> mutex_lock(&mpam_list_lock);
>> - mpam_enable_merge_features(&mpam_classes);
>> + do {
>> + mpam_enable_merge_features(&mpam_classes);
>> - err = mpam_register_irqs();
>> - if (err)
>> - pr_warn("Failed to register irqs: %d\n", err);
>> + err = mpam_allocate_config();
>> + if (err) {
>> + pr_err("Failed to allocate configuration arrays.\n");
>> + break;
>> + }
>> + err = mpam_register_irqs();
>> + if (err) {
>> + pr_warn("Failed to register irqs: %d\n", err);
>> + break;
>> + }
>> + } while (0);
>> mutex_unlock(&mpam_list_lock);
>> cpus_read_unlock();
>> @@ -1861,6 +1976,8 @@ static void mpam_reset_component_locked(struct mpam_component
>> *comp)
>> might_sleep();
>> lockdep_assert_cpus_held();
>> + memset(comp->cfg, 0, (mpam_partid_max * sizeof(*comp->cfg)));
> And here?
Same story, and the same bug - the disable path can be called and reach this before the partid size has been fixed. The
code that enables interrupts should pull that earlier in mpam_enable_once(), which would simplify all of these.
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online
2025-07-28 15:34 ` Dave Martin
@ 2025-08-08 7:16 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:16 UTC (permalink / raw)
To: Dave Martin, Ben Horgan
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Rohit Mathew,
Shanker Donthineni, Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi, D Scott Phillips OS, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Koba Ko
Hi Dave, Ben,
On 28/07/2025 16:34, Dave Martin wrote:
> Although it may look like the globals are read all over the place after
> probing, I think this actually only happens during resctrl initialisation
> (which is basically single-threaded).
>
> The only place where they are read after probing and without mediation
> via resctrl is on the CPU hotplug path.
(and the mpam driver gets the first go at cpuhp, then it calls into resctrl).
> Adding locking would ensure that an unstable value is never read, but
> this is not sufficient by itself to ensure that the _final_ value of a
> variable is read (for some definition of "final"). And, if there is a
> well-defined notion of final value and there is sufficient
> synchronisation to ensure that this is the value read by a particular
> read, then by construction an unstable value cannot be read.
>
>
> I think that this kind of pattern is not that uncommon in the kernel,
> though it is a bit painful to reason about.
As it's sparked some discussion, I've added a mpam_assert_partid_sizes_fixed() that
documents this, and will trigger a WARN_ON_ONCE() if these things are observed as
happening in the wrong order.
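Roughly this shape - a sketch, the only name taken from the driver is
partid_max_published:
--------------------%<--------------------
/* Sketch: catch anything using the partid/pmg sizes before they are fixed. */
static void mpam_assert_partid_sizes_fixed(void)
{
	WARN_ON_ONCE(!partid_max_published);
}
--------------------%<--------------------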
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online
2025-08-04 16:39 ` Fenghua Yu
@ 2025-08-08 7:17 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:17 UTC (permalink / raw)
To: Fenghua Yu, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Ben Horgan, Rohit Mathew, Shanker Donthineni,
Zeng Heng, Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Fenghua,
On 04/08/2025 17:39, Fenghua Yu wrote:
> On 7/11/25 11:36, James Morse wrote:
>> When CPUs come online the original configuration should be restored.
>> Once the maximum partid is known, allocate an configuration array for
>> each component, and reprogram each RIS configuration from this.
>>
>> The MPAM spec describes how multiple controls can interact. To prevent
>> this happening by accident, always reset controls that don't have a
>> valid configuration. This allows the same helper to be used for
>> configuration and reset.
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/
>> mpam_devices.c
>> index bb3695eb84e9..f3ecfda265d2 100644
>> --- a/drivers/platform/arm64/mpam/mpam_devices.c
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c>> @@ -909,51 +913,90 @@ static void mpam_reset_msc_bitmap(struct mpam_msc *msc, u16 reg,
>> +/* Call with MSC lock held */
>> +static int mpam_reprogram_ris(void *_arg)
>> +{
>> + u16 partid, partid_max;
>> + struct reprogram_ris *arg = _arg;
>> + struct mpam_msc_ris *ris = arg->ris;
>> + struct mpam_config *cfg = arg->cfg;
>> +
>> + if (ris->in_reset_state)
>> + return 0;
>> +
>> + spin_lock(&partid_max_lock);
>> + partid_max = mpam_partid_max;
> partid_max is not used after the assignment.
>> + spin_unlock(&partid_max_lock);
>
> Doesn't make sense to lock protect a local variable partid_max which is not used any way.
>
> [SNIP]
Because you cut the user out:
| for (partid = 0; partid <= partid_max; partid++)
| mpam_reprogram_ris_partid(ris, partid, cfg);
|
| return 0;
| }
mpam_reprogram_ris() needs to snapshot the value because it can be called via mpam_reset_msc() -
which does run before mpam_enable_once(). This can race with mpam_register_requestor(), but the
race can only reduce partid_max; all the MSCs are guaranteed to support at least the original value.
Taking the lock is so you don't get a torn value which is larger than the original value.
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 33/36] arm_mpam: Use long MBWU counters if supported
2025-07-28 13:46 ` Ben Horgan
@ 2025-08-08 7:19 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:19 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 28/07/2025 14:46, Ben Horgan wrote:
> On 7/11/25 19:36, James Morse wrote:
>> From: Rohit Mathew <rohit.mathew@arm.com>
>>
>> If the 44 bit (long) or 63 bit (LWD) counters are detected on probing
>> the RIS, use long/LWD counter instead of the regular 31 bit mbwu
>> counter.
>>
>> Only 32bit accesses to the MSC are required to be supported by the
>> spec, but these registers are 64bits. The lower half may overflow
>> into the higher half between two 32bit reads. To avoid this, use
>> a helper that reads the top half twice to check for overflow.
> Slightly misleading as it may be read up to 4 times.
Meh - it's referring to the high/low/high pattern. Sure, if it fails you go round the whole
thing again. I'll change it to 'read multiple times to check for overflow'.
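The pattern being described, roughly - a sketch that assumes a 32-bit read accessor
(__mpam_read_reg(), mirroring the existing __mpam_write_reg()) and uses MSMON_MBWU_L
as the offset of the long counter:
--------------------%<--------------------
static u64 mpam_msc_read_mbwu_l(struct mpam_msc *msc)
{
	u32 hi_old, hi_new, lo;

	/* high/low/high: retry if the top half moved between the 32-bit reads */
	hi_new = __mpam_read_reg(msc, MSMON_MBWU_L + 4);
	do {
		hi_old = hi_new;
		lo = __mpam_read_reg(msc, MSMON_MBWU_L);
		hi_new = __mpam_read_reg(msc, MSMON_MBWU_L + 4);
	} while (hi_old != hi_new);

	return ((u64)hi_new << 32) | lo;
}
--------------------%<--------------------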
>> diff --git a/drivers/platform/arm64/mpam/mpam_devices.c b/drivers/platform/arm64/mpam/
>> mpam_devices.c
>> index 774137a124f8..ace69ac2d0ee 100644
>> --- a/drivers/platform/arm64/mpam/mpam_devices.c
>> +++ b/drivers/platform/arm64/mpam/mpam_devices.c
>> @@ -1125,10 +1177,24 @@ static void __ris_msmon_read(void *arg)
>> now = FIELD_GET(MSMON___VALUE, now);
>> break;
>> case mpam_feat_msmon_mbwu:
>> - now = mpam_read_monsel_reg(msc, MBWU);
>> - if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops))
>> - nrdy = now & MSMON___NRDY;
>> - now = FIELD_GET(MSMON___VALUE, now);
>> + /*
>> + * If long or lwd counters are supported, use them, else revert
>> + * to the 32 bit counter.
>> + */
> 32 bit counter -> 31 bit counter
Sure,
>> + if (mpam_ris_has_mbwu_long_counter(ris)) {
>> + now = mpam_msc_read_mbwu_l(msc);
>> + if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops))
>> + nrdy = now & MSMON___NRDY_L;
>> + if (mpam_has_feature(mpam_feat_msmon_mbwu_63counter, rprops))
>> + now = FIELD_GET(MSMON___LWD_VALUE, now);
>> + else
>> + now = FIELD_GET(MSMON___L_VALUE, now);
>> + } else {
>> + now = mpam_read_monsel_reg(msc, MBWU);
>> + if (mpam_has_feature(mpam_feat_msmon_mbwu_hw_nrdy, rprops))
>> + nrdy = now & MSMON___NRDY;
>> + now = FIELD_GET(MSMON___VALUE, now);
>> + }
>> if (nrdy)
>> break;
>> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/
>> mpam_internal.h
>> index fc705801c1b6..4553616f2f67 100644
>> --- a/drivers/platform/arm64/mpam/mpam_internal.h
>> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
>> @@ -674,7 +675,10 @@ int mpam_get_cpumask_from_cache_id(unsigned long cache_id, u32
>> cache_level,
>> */
>> #define MSMON___VALUE GENMASK(30, 0)
>> #define MSMON___NRDY BIT(31)
>> -#define MSMON_MBWU_L_VALUE GENMASK(62, 0)
>> +#define MSMON___NRDY_L BIT(63)
>> +#define MSMON___L_VALUE GENMASK(43, 0)
>> +#define MSMON___LWD_VALUE GENMASK(62, 0)
>> +
> As mentioned on an earlier patch. These could be added with all the other register
> definition.
Yup,
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 20/36] arm_mpam: Probe the hardware features resctrl supports
2025-07-28 8:56 ` Ben Horgan
@ 2025-08-08 7:20 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:20 UTC (permalink / raw)
To: Ben Horgan, linux-kernel, linux-arm-kernel
Cc: Rob Herring, Rohit Mathew, Shanker Donthineni, Zeng Heng,
Lecopzer Chen, Carl Worth, shameerali.kolothum.thodi,
D Scott Phillips OS, lcherian, bobo.shaobowang, tan.shaopeng,
baolin.wang, Jamie Iles, Xin Hao, peternewman, dfustini,
amitsinght, David Hildenbrand, Rex Nie, Dave Martin, Koba Ko
Hi Ben,
On 28/07/2025 09:56, Ben Horgan wrote:
> On 7/11/25 19:36, James Morse wrote:
>> Expand the probing support with the control and monitor types
>> we can use with resctrl.
>> diff --git a/drivers/platform/arm64/mpam/mpam_internal.h b/drivers/platform/arm64/mpam/
>> mpam_internal.h
>> index 42a454d5f914..ae6fd1f62cc4 100644
>> --- a/drivers/platform/arm64/mpam/mpam_internal.h
>> +++ b/drivers/platform/arm64/mpam/mpam_internal.h
>> @@ -136,6 +136,55 @@ static inline void mpam_mon_sel_lock_held(struct mpam_msc *msc)
>> lockdep_assert_preemption_enabled();
>> }
>> +/*
>> + * When we compact the supported features, we don't care what they are.
>> + * Storing them as a bitmap makes life easy.
>> + */
>> +typedef u16 mpam_features_t;
>> +
>> +/* Bits for mpam_features_t */
>> +enum mpam_device_features {
>> + mpam_feat_ccap_part = 0,
>> + mpam_feat_cpor_part,
>> + mpam_feat_mbw_part,
>> + mpam_feat_mbw_min,
>> + mpam_feat_mbw_max,
>> + mpam_feat_mbw_prop,
>> + mpam_feat_msmon,
>> + mpam_feat_msmon_csu,
>> + mpam_feat_msmon_csu_capture,
>> + mpam_feat_msmon_csu_hw_nrdy,
>> + mpam_feat_msmon_mbwu,
>> + mpam_feat_msmon_mbwu_capture,
>> + mpam_feat_msmon_mbwu_rwbw,
>> + mpam_feat_msmon_mbwu_hw_nrdy,
>> + mpam_feat_msmon_capt,
>> + MPAM_FEATURE_LAST,
>> +};
>> +#define MPAM_ALL_FEATURES ((1 << MPAM_FEATURE_LAST) - 1)
>
> Consider a static assert to check the type is big enough.
>
> static_assert(BITS_PER_TYPE(mpam_features_t) >= MPAM_FEATURE_LAST);
Fancy.
There used to be an uglier one - not sure what happened to it!
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 00/36] arm_mpam: Add basic mpam driver
2025-08-01 16:09 ` [RFC PATCH 00/36] arm_mpam: Add basic mpam driver Jonathan Cameron
@ 2025-08-08 7:23 ` James Morse
0 siblings, 0 replies; 117+ messages in thread
From: James Morse @ 2025-08-08 7:23 UTC (permalink / raw)
To: Jonathan Cameron
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Ben Horgan,
Rohit Mathew, Shanker Donthineni, Zeng Heng, Lecopzer Chen,
Carl Worth, shameerali.kolothum.thodi, D Scott Phillips OS,
lcherian, bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles,
Xin Hao, peternewman, dfustini, amitsinght, David Hildenbrand,
Rex Nie, Dave Martin, Koba Ko
Hi Jonathan,
On 01/08/2025 17:09, Jonathan Cameron wrote:
> On Fri, 11 Jul 2025 18:36:12 +0000
> James Morse <james.morse@arm.com> wrote:
>> This is just enough MPAM driver for the ACPI and DT pre-requisites.
>> It doesn't contain any of the resctrl code, meaning you can't actually drive it
>> from user-space yet.
[..]
> Whilst I get that this is minimal, I was a bit surprised that it doesn't
> contain enough to have the driver actually bind to the platform devices
> I think that needs the CPU hotplug handler to register a requester.
> So about another 4 arch patches from your tree. Maybe you can shuffle
> things around to help with that.
Ah, I hadn't spotted that. The register-requestor code should only serve to
reduce the available PARTID - just in case the CPUs support less than the
cache hierarchy.
It's likely the 'system_supports_mpam()' check that prevents the driver being
registered. I'll move that into the arm64 patches - it's needed so any id
register overrides in the CPU knock out the driver too.
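i.e. something like this in the driver's init path - a sketch, the function and
driver struct names are placeholders:
--------------------%<--------------------
static int __init mpam_msc_driver_init(void)
{
	/* An id register override clears the cpufeature, knocking the driver out too. */
	if (!system_supports_mpam())
		return -EOPNOTSUPP;

	return platform_driver_register(&mpam_msc_driver);
}
--------------------%<--------------------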
> That makes this a pain to test in isolation.
>
> Given desire to poke the corners, I'm rebasing the old QEMU emulation and
> will poke it some more. Now we are getting close to upstream kernel support
> maybe I'll even clean that up for potential upstream QEMU.
>
> For bonus points I 'could' hook it up to the cache simulator and actually
> generate real 'counts' but that's probably more for fun than because it's
> useful. Fake numbers are a lot cheaper to get.
I've not found a good source of fake numbers to use. Ideally it would just
increment whenever the task was scheduled - but I've found it hard to hack
that up in Linux.
I fall back to reading the counters instead of the MPAM registers ... but it's
hard to test for overflow or double counting with that.
Thanks,
James
^ permalink raw reply [flat|nested] 117+ messages in thread
* Re: [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table
2025-08-05 17:07 ` James Morse
@ 2025-08-15 9:33 ` Ben Horgan
0 siblings, 0 replies; 117+ messages in thread
From: Ben Horgan @ 2025-08-15 9:33 UTC (permalink / raw)
To: James Morse, Jonathan Cameron
Cc: linux-kernel, linux-arm-kernel, Rob Herring, Rohit Mathew,
Shanker Donthineni, Zeng Heng, Lecopzer Chen, Carl Worth,
shameerali.kolothum.thodi, D Scott Phillips OS, lcherian,
bobo.shaobowang, tan.shaopeng, baolin.wang, Jamie Iles, Xin Hao,
peternewman, dfustini, amitsinght, David Hildenbrand, Rex Nie,
Dave Martin, Koba Ko
Hi James,
On 8/5/25 18:07, James Morse wrote:
> Hi Ben,
>
> On 23/07/2025 17:39, Ben Horgan wrote:
>> On 7/16/25 18:07, Jonathan Cameron wrote:
>>> On Fri, 11 Jul 2025 18:36:22 +0000
>>> James Morse <james.morse@arm.com> wrote:
>>>
>>>> Add code to parse the arm64 specific MPAM table, looking up the cache
>>>> level from the PPTT and feeding the end result into the MPAM driver.
>>>
>>> Throw in a link to the spec perhaps? Particularly useful to know which
>>> version this was written against when reviewing it.
>
>> As I comment below this code checks the table revision is 1 and so we can assume it was
>> written against version 2 of the spec. As of Monday, there is a new version hot off the
>> press,
>> https://developer.arm.com/documentation/den0065/3-0bet/?lang=en which introduces an "MMIO
>> size" field to allow for disabled nodes. This should be considered here to avoid
>> advertising msc that aren't present.
>
> Sure. Bit of an unfortunate race with the spec people there!
>
> Added as:
> --------------------%<--------------------
> diff --git a/drivers/acpi/arm64/mpam.c b/drivers/acpi/arm64/mpam.c
> index 9ff5a6df9f1b..d8c6224a76f8 100644
> --- a/drivers/acpi/arm64/mpam.c
> +++ b/drivers/acpi/arm64/mpam.c
> @@ -202,6 +202,9 @@ static int __init _parse_table(struct acpi_table_header *table)
> if (tbl_msc->reserved || tbl_msc->reserved1 || tbl_msc->reserved2)
> continue;
>
> + if (!tbl_msc->mmio_size)
> + continue;
> +
> if (decode_interface_type(tbl_msc, &iface))
> continue;
>
> @@ -290,7 +293,7 @@ static struct acpi_table_header *get_table(void)
> if (ACPI_FAILURE(status))
> return NULL;
>
> - if (table->revision != 1)
> + if (table->revision < 1)
> return NULL;
>
> return table;
> @@ -321,6 +324,9 @@ static int _count_msc(struct acpi_table_header *table)
> table_end = (char *)table + table->length;
>
> while (table_offset < table_end) {
> + if (!tbl_msc->mmio_size)
> + continue;
> +
> if (tbl_msc->length < sizeof(*tbl_msc))
> return -EINVAL;
> --------------------%<--------------------
This seems fine as long as any later table revisions are guaranteed to
be backwards compatible.
>
> Amusingly, PCC also defines mmio_size==0 as disabled, so _count_msc() doesn't need to know
> what kind of thing this is. In principle they could change this as it's beta, but a
> zero-sized MSC should probably be treated as an error anyway.
>
>
> Thanks,
>
> James
Thanks,
Ben
^ permalink raw reply [flat|nested] 117+ messages in thread
Thread overview: 117+ messages
2025-07-11 18:36 [RFC PATCH 00/36] arm_mpam: Add basic mpam driver James Morse
2025-07-11 18:36 ` [RFC PATCH 01/36] cacheinfo: Set cache 'id' based on DT data James Morse
2025-07-11 18:36 ` [RFC PATCH 02/36] cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id James Morse
2025-07-11 18:36 ` [RFC PATCH 03/36] arm64: cacheinfo: Provide helper to compress MPIDR value into u32 James Morse
2025-07-11 18:36 ` [RFC PATCH 04/36] cacheinfo: Expose the code to generate a cache-id from a device_node James Morse
2025-07-14 11:40 ` Ben Horgan
2025-07-25 17:08 ` James Morse
2025-07-28 8:37 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 05/36] ACPI / PPTT: Add a helper to fill a cpumask from a processor container James Morse
2025-07-17 7:58 ` Shaopeng Tan (Fujitsu)
2025-07-25 17:06 ` James Morse
2025-07-22 14:28 ` Jonathan Cameron
2025-07-25 17:05 ` James Morse
2025-07-23 14:42 ` Ben Horgan
2025-07-25 17:05 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 06/36] ACPI / PPTT: Stop acpi_count_levels() expecting callers to clear levels James Morse
2025-07-16 15:51 ` Jonathan Cameron
2025-07-25 17:05 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id James Morse
2025-07-14 11:42 ` Ben Horgan
2025-08-05 17:06 ` James Morse
2025-07-16 16:21 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-idUIRE Jonathan Cameron
2025-08-05 17:06 ` [RFC PATCH 07/36] ACPI / PPTT: Find cache level by cache-id James Morse
2025-07-11 18:36 ` [RFC PATCH 08/36] ACPI / PPTT: Add a helper to fill a cpumask from a cache_id James Morse
2025-07-16 16:24 ` Jonathan Cameron
2025-08-05 17:06 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 09/36] arm64: kconfig: Add Kconfig entry for MPAM James Morse
2025-07-16 16:26 ` Jonathan Cameron
2025-07-11 18:36 ` [RFC PATCH 10/36] ACPI / MPAM: Parse the MPAM table James Morse
2025-07-16 17:07 ` Jonathan Cameron
2025-07-23 16:39 ` Ben Horgan
2025-08-05 17:07 ` James Morse
2025-08-15 9:33 ` Ben Horgan
2025-07-28 10:08 ` Jonathan Cameron
2025-08-05 17:08 ` James Morse
2025-08-05 17:07 ` James Morse
2025-07-24 10:50 ` Ben Horgan
2025-08-05 17:08 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 11/36] dt-bindings: arm: Add MPAM MSC binding James Morse
2025-07-11 21:43 ` Rob Herring
2025-08-05 17:08 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 12/36] platform: arm64: Move ec devices to an ec subdirectory James Morse
2025-07-21 16:32 ` Jonathan Cameron
2025-08-06 18:03 ` James Morse
2025-07-24 10:56 ` Ben Horgan
2025-08-06 18:03 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 13/36] arm_mpam: Add probe/remove for mpam msc driver and kbuild boiler plate James Morse
2025-07-24 11:02 ` Ben Horgan
2025-08-06 18:03 ` James Morse
2025-07-24 12:09 ` Catalin Marinas
2025-08-06 18:04 ` James Morse
2025-08-07 17:50 ` Drew Fustini
2025-07-11 18:36 ` [RFC PATCH 14/36] arm_mpam: Add support for memory controller MSC on DT platforms James Morse
2025-07-11 18:36 ` [RFC PATCH 15/36] arm_mpam: Add the class and component structures for ris firmware described James Morse
2025-07-11 18:36 ` [RFC PATCH 16/36] arm_mpam: Add MPAM MSC register layout definitions James Morse
2025-07-17 1:04 ` Shaopeng Tan (Fujitsu)
2025-08-06 18:04 ` James Morse
2025-07-24 14:02 ` Ben Horgan
2025-08-06 18:05 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 17/36] arm_mpam: Add cpuhp callbacks to probe MSC hardware James Morse
2025-07-24 14:13 ` Ben Horgan
2025-08-06 18:07 ` James Morse
2025-07-29 6:11 ` Baisheng Gao
2025-08-06 18:07 ` James Morse
2025-08-05 8:46 ` Jonathan Cameron
2025-07-11 18:36 ` [RFC PATCH 18/36] arm_mpam: Probe MSCs to find the supported partid/pmg values James Morse
2025-07-11 18:36 ` [RFC PATCH 19/36] arm_mpam: Add helpers for managing the locking around the mon_sel registers James Morse
2025-07-11 18:36 ` [RFC PATCH 20/36] arm_mpam: Probe the hardware features resctrl supports James Morse
2025-07-24 15:08 ` Ben Horgan
2025-07-28 16:16 ` Jonathan Cameron
2025-08-07 18:26 ` James Morse
2025-07-28 8:56 ` Ben Horgan
2025-08-08 7:20 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 21/36] arm_mpam: Merge supported features during mpam_enable() into mpam_class James Morse
2025-07-28 9:15 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 22/36] arm_mpam: Reset MSC controls from cpu hp callbacks James Morse
2025-07-28 9:49 ` Ben Horgan
2025-08-08 7:05 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 23/36] arm_mpam: Add a helper to touch an MSC from any CPU James Morse
2025-07-11 18:36 ` [RFC PATCH 24/36] arm_mpam: Extend reset logic to allow devices to be reset any time James Morse
2025-07-28 10:22 ` Ben Horgan
2025-08-08 7:07 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 25/36] arm_mpam: Register and enable IRQs James Morse
2025-07-16 7:31 ` Shaopeng Tan (Fujitsu)
2025-08-08 7:08 ` James Morse
2025-07-17 1:08 ` Shaopeng Tan (Fujitsu)
2025-08-08 7:07 ` James Morse
2025-07-22 15:06 ` Jonathan Cameron
2025-08-08 7:11 ` James Morse
2025-07-28 10:49 ` Ben Horgan
2025-08-08 7:11 ` James Morse
2025-08-04 16:53 ` Fenghua Yu
2025-08-08 7:12 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 26/36] arm_mpam: Use a static key to indicate when mpam is enabled James Morse
2025-07-11 18:36 ` [RFC PATCH 27/36] arm_mpam: Allow configuration to be applied and restored during cpu online James Morse
2025-07-16 6:49 ` Shaopeng Tan (Fujitsu)
2025-08-08 7:13 ` James Morse
2025-07-28 11:59 ` Ben Horgan
2025-07-28 15:34 ` Dave Martin
2025-08-08 7:16 ` James Morse
2025-08-08 7:14 ` James Morse
2025-08-04 16:39 ` Fenghua Yu
2025-08-08 7:17 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 28/36] arm_mpam: Probe and reset the rest of the features James Morse
2025-07-11 18:36 ` [RFC PATCH 29/36] arm_mpam: Add helpers to allocate monitors James Morse
2025-07-11 18:36 ` [RFC PATCH 30/36] arm_mpam: Add mpam_msmon_read() to read monitor value James Morse
2025-07-28 13:02 ` Ben Horgan
2025-07-11 18:36 ` [RFC PATCH 31/36] arm_mpam: Track bandwidth counter state for overflow and power management James Morse
2025-07-11 18:36 ` [RFC PATCH 32/36] arm_mpam: Probe for long/lwd mbwu counters James Morse
2025-07-11 18:36 ` [RFC PATCH 33/36] arm_mpam: Use long MBWU counters if supported James Morse
2025-07-28 13:46 ` Ben Horgan
2025-08-08 7:19 ` James Morse
2025-07-11 18:36 ` [RFC PATCH 34/36] arm_mpam: Add helper to reset saved mbwu state James Morse
2025-07-11 18:36 ` [RFC PATCH 35/36] arm_mpam: Add kunit test for bitmap reset James Morse
2025-07-11 18:36 ` [RFC PATCH 36/36] arm_mpam: Add kunit tests for props_mismatch() James Morse
2025-08-01 16:09 ` [RFC PATCH 00/36] arm_mpam: Add basic mpam driver Jonathan Cameron
2025-08-08 7:23 ` James Morse