[PATCH v2 0/3] cacheinfo: Set cache 'id' based on DT data

linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed

* [PATCH v2 0/3] cacheinfo: Set cache 'id' based on DT data
@ 2025-07-04 17:38 James Morse
  2025-07-04 17:38 ` [PATCH v2 1/3] " James Morse
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: James Morse @ 2025-07-04 17:38 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, sudeep.holla, Rob Herring,
	Ben Horgan, Jonathan Cameron, Catalin Marinas, WillDeaconwill,
	James Morse

Changes since v1?
 * Pushed some patches over the (very near) horizon of sharing the MPAM
   driver at you.
 * Added libvirt note to patch-1's commit message.
 * Removed the second loop in patch-1 to replace with a helper

An open question from v1 is whether it would be preferable to use an
index into the DT of the CPU nodes instead of the hardware id. This would
save an arch specific swizzle - but the numbers would change if the DT
were changed. This scheme isn't sensitive to the order of DT nodes.

---

This series adds support for cache-ids to device-tree systems.
These values are exposed to user-space via
/sys/devices/system/cpu/cpuX/cache/indexY/id
and are used to identify caches and their associated CPUs by kernel interfaces
such as resctrl.

Resctrl anticipates cache-ids are unique for a given cache level, but may
be sparse. See Documentation/filesystems/resctrl.rst's "Cache IDs" section.

Another user is PCIe's cache-steering hints, where an id provided by the
hardware would be needed. Today this expects a platform specific ACPI hook
the program that value into the PCIe root port registers. If DT platforms
are ever supported, it will likely need a kernel driver to convert the
user-space cache-id to whatever hardware value is needed.

This series generates a 32bit cache-id from the lowest CPU hardware id
of the CPU's associated with that cache. On arm64, CPU hardware ids may
be greater than 32bits, but can be swizzled into 32bits. An architecture
hook is provided to allow the architecture to swizzle the values into 32bits.

This series is based on v6.16-rc4, and can be retreived from:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/cacheinfo/v2

The MPAM driver that makes use of these can be found here:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git mpam/snapshot/v6.16-rc1

What is MPAM? Set your time-machine to 2020:
https://lore.kernel.org/lkml/20201030161120.227225-1-james.morse@arm.com/

[v1] lore.kernel.org/r/20250613130356.8080-1-james.morse@arm.com

Bugs welcome,
Thanks,

James Morse (2):
  cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for
    cache-id
  arm64: cacheinfo: Provide helper to compress MPIDR value into u32

Rob Herring (1):
  cacheinfo: Set cache 'id' based on DT data

 arch/arm64/include/asm/cache.h | 17 ++++++++++++++
 drivers/base/cacheinfo.c       | 43 ++++++++++++++++++++++++++++++++++
 2 files changed, 60 insertions(+)

-- 
2.39.5

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/3] cacheinfo: Set cache 'id' based on DT data
  2025-07-04 17:38 [PATCH v2 0/3] cacheinfo: Set cache 'id' based on DT data James Morse
@ 2025-07-04 17:38 ` James Morse
  2025-07-07  9:34   ` Jonathan Cameron
                     ` (2 more replies)
  2025-07-04 17:38 ` [PATCH v2 2/3] cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id James Morse
  2025-07-04 17:38 ` [PATCH v2 3/3] arm64: cacheinfo: Provide helper to compress MPIDR value into u32 James Morse
  2 siblings, 3 replies; 13+ messages in thread
From: James Morse @ 2025-07-04 17:38 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, sudeep.holla, Rob Herring,
	Ben Horgan, Jonathan Cameron, Catalin Marinas, WillDeaconwill,
	James Morse

From: Rob Herring <robh@kernel.org>

Use the minimum CPU h/w id of the CPUs associated with the cache for the
cache 'id'. This will provide a stable id value for a given system. As
we need to check all possible CPUs, we can't use the shared_cpu_map
which is just online CPUs. As there's not a cache to CPUs mapping in DT,
we have to walk all CPU nodes and then walk cache levels.

The cache_id exposed to user-space has historically been 32 bits, and
is too late to change. This value is parsed into a u32 by user-space
libraries such as libvirt:
https://github.com/libvirt/libvirt/blob/master/src/util/virresctrl.c#L1588

Give up on assigning cache-id's if a CPU h/w id greater than 32 bits
is found.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
[ ben: converted to use the __free cleanup idiom ]
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
[ morse: Add checks to give up if a value larger than 32 bits is seen. ]
Signed-off-by: James Morse <james.morse@arm.com>
---
Use as a 32bit value has also been seen in DPDK patches here:
http://inbox.dpdk.org/dev/20241021015246.304431-2-wathsala.vithanage@arm.com/

Changes since v1:
 * Remove the second loop in favour of a helper.
 
An open question from v1 is whether it would be preferable to use an
index into the DT of the CPU nodes instead of the hardware id. This would
save an arch specific swizzle - but the numbers would change if the DT
were changed. This scheme isn't sensitive to the order of DT nodes.

---
 drivers/base/cacheinfo.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index cf0d455209d7..df593da0d5f7 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -8,6 +8,7 @@
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include <linux/acpi.h>
+#include <linux/bitfield.h>
 #include <linux/bitops.h>
 #include <linux/cacheinfo.h>
 #include <linux/compiler.h>
@@ -183,6 +184,42 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
 	return of_property_read_bool(np, "cache-unified");
 }
 
+static bool match_cache_node(struct device_node *cpu,
+			     const struct device_node *cache_node)
+{
+	for (struct device_node *cache __free(device_node) = of_find_next_cache_node(cpu);
+	     cache != NULL; cache = of_find_next_cache_node(cache)) {
+		if (cache == cache_node)
+			return true;
+	}
+
+	return false;
+}
+
+static void cache_of_set_id(struct cacheinfo *this_leaf,
+			    struct device_node *cache_node)
+{
+	struct device_node *cpu;
+	u32 min_id = ~0;
+
+	for_each_of_cpu_node(cpu) {
+		u64 id = of_get_cpu_hwid(cpu, 0);
+
+		if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
+			of_node_put(cpu);
+			return;
+		}
+
+		if (match_cache_node(cpu, cache_node))
+			min_id = min(min_id, id);
+	}
+
+	if (min_id != ~0) {
+		this_leaf->id = min_id;
+		this_leaf->attributes |= CACHE_ID;
+	}
+}
+
 static void cache_of_set_props(struct cacheinfo *this_leaf,
 			       struct device_node *np)
 {
@@ -198,6 +235,7 @@ static void cache_of_set_props(struct cacheinfo *this_leaf,
 	cache_get_line_size(this_leaf, np);
 	cache_nr_sets(this_leaf, np);
 	cache_associativity(this_leaf);
+	cache_of_set_id(this_leaf, np);
 }
 
 static int cache_setup_of_node(unsigned int cpu)
-- 
2.39.5



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/3] cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id
  2025-07-04 17:38 [PATCH v2 0/3] cacheinfo: Set cache 'id' based on DT data James Morse
  2025-07-04 17:38 ` [PATCH v2 1/3] " James Morse
@ 2025-07-04 17:38 ` James Morse
  2025-07-11  4:42   ` Gavin Shan
  2025-07-04 17:38 ` [PATCH v2 3/3] arm64: cacheinfo: Provide helper to compress MPIDR value into u32 James Morse
  2 siblings, 1 reply; 13+ messages in thread
From: James Morse @ 2025-07-04 17:38 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, sudeep.holla, Rob Herring,
	Ben Horgan, Jonathan Cameron, Catalin Marinas, WillDeaconwill,
	James Morse

Filesystems like resctrl use the cache-id exposed via sysfs to identify
groups of CPUs. The value is also used for PCIe cache steering tags. On
DT platforms cache-id is not something that is described in the
device-tree, but instead generated from the smallest CPU h/w id of the
CPUs associated with that cache.

CPU h/w ids may be larger than 32 bits.

Add a hook to allow architectures to compress the value from the devicetree
into 32 bits. Returning the same value is always safe as cache_of_set_id()
will stop if a value larger than 32 bits is seen.

For example, on arm64 the value is the MPIDR affinity register, which only
has 32 bits of affinity data, but spread accross the 64 bit field. An
arch-specific bit swizzle gives a 32 bit value.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
---
 drivers/base/cacheinfo.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index df593da0d5f7..25d028f7a986 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -196,6 +196,10 @@ static bool match_cache_node(struct device_node *cpu,
 	return false;
 }

+#ifndef arch_compact_of_hwid
+#define arch_compact_of_hwid(_x)	(_x)
+#endif
+
 static void cache_of_set_id(struct cacheinfo *this_leaf,
 			    struct device_node *cache_node)
 {
@@ -205,6 +209,7 @@ static void cache_of_set_id(struct cacheinfo *this_leaf,
 	for_each_of_cpu_node(cpu) {
 		u64 id = of_get_cpu_hwid(cpu, 0);

+		id = arch_compact_of_hwid(id);
 		if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
 			of_node_put(cpu);
 			return;
-- 
2.39.5

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 3/3] arm64: cacheinfo: Provide helper to compress MPIDR value into u32
  2025-07-04 17:38 [PATCH v2 0/3] cacheinfo: Set cache 'id' based on DT data James Morse
  2025-07-04 17:38 ` [PATCH v2 1/3] " James Morse
  2025-07-04 17:38 ` [PATCH v2 2/3] cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id James Morse
@ 2025-07-04 17:38 ` James Morse
  2025-07-11  4:43   ` Gavin Shan
  2 siblings, 1 reply; 13+ messages in thread
From: James Morse @ 2025-07-04 17:38 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, sudeep.holla, Rob Herring,
	Ben Horgan, Jonathan Cameron, Catalin Marinas, WillDeaconwill,
	James Morse

Filesystems like resctrl use the cache-id exposed via sysfs to identify
groups of CPUs. The value is also used for PCIe cache steering tags. On
DT platforms cache-id is not something that is described in the
device-tree, but instead generated from the smallest MPIDR of the CPUs
associated with that cache. The cache-id exposed to user-space has
historically been 32 bits.

MPIDR values may be larger than 32 bits.

MPIDR only has 32 bits worth of affinity data, but the aff3 field lives
above 32bits. The corresponding lower bits are masked out by
MPIDR_HWID_BITMASK and contain an SMT flag and Uni-Processor flag.

Swizzzle the aff3 field into the bottom 32 bits and using that.

In case more affinity fields are added in the future, the upper RES0
area should be checked. Returning a value greater than 32 bits from
this helper will cause the caller to give up on allocating cache-ids.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
---
Changes since v1:
 * Removal of unrelated changes.
 * Added a comment about how the RES0 bit safety net works.
---
 arch/arm64/include/asm/cache.h | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
index 99cd6546e72e..09963004ceea 100644
--- a/arch/arm64/include/asm/cache.h
+++ b/arch/arm64/include/asm/cache.h
@@ -87,6 +87,23 @@ int cache_line_size(void);

 #define dma_get_cache_alignment	cache_line_size

+/* Compress a u64 MPIDR value into 32 bits. */
+static inline u64 arch_compact_of_hwid(u64 id)
+{
+	u64 aff3 = MPIDR_AFFINITY_LEVEL(id, 3);
+
+	/*
+	 * These bits are expected to be RES0. If not, return a value with
+	 * the upper 32 bits set to force the caller to give up on 32 bit
+	 * cache ids.
+	 */
+	if (FIELD_GET(GENMASK_ULL(63, 40), id))
+		return id;
+
+	return (aff3 << 24) | FIELD_GET(GENMASK_ULL(23, 0), id);
+}
+#define arch_compact_of_hwid	arch_compact_of_hwid
+
 /*
  * Read the effective value of CTR_EL0.
  *
-- 
2.39.5

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] cacheinfo: Set cache 'id' based on DT data
  2025-07-04 17:38 ` [PATCH v2 1/3] " James Morse
@ 2025-07-07  9:34   ` Jonathan Cameron
  2025-07-07 10:27   ` Ben Horgan
  2025-07-11  4:41   ` Gavin Shan
  2 siblings, 0 replies; 13+ messages in thread
From: Jonathan Cameron @ 2025-07-07  9:34 UTC (permalink / raw)
  To: James Morse
  Cc: linux-kernel, linux-arm-kernel, Greg Kroah-Hartman,
	Rafael J . Wysocki, sudeep.holla, Rob Herring, Ben Horgan,
	Catalin Marinas, WillDeaconwill

On Fri, 4 Jul 2025 17:38:24 +0000
James Morse <james.morse@arm.com> wrote:

> From: Rob Herring <robh@kernel.org>
> 
> Use the minimum CPU h/w id of the CPUs associated with the cache for the
> cache 'id'. This will provide a stable id value for a given system. As
> we need to check all possible CPUs, we can't use the shared_cpu_map
> which is just online CPUs. As there's not a cache to CPUs mapping in DT,
> we have to walk all CPU nodes and then walk cache levels.
> 
> The cache_id exposed to user-space has historically been 32 bits, and
> is too late to change. This value is parsed into a u32 by user-space
> libraries such as libvirt:
> https://github.com/libvirt/libvirt/blob/master/src/util/virresctrl.c#L1588
> 
> Give up on assigning cache-id's if a CPU h/w id greater than 32 bits
> is found.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Signed-off-by: Rob Herring <robh@kernel.org>
> [ ben: converted to use the __free cleanup idiom ]
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> [ morse: Add checks to give up if a value larger than 32 bits is seen. ]
> Signed-off-by: James Morse <james.morse@arm.com>

Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] cacheinfo: Set cache 'id' based on DT data
  2025-07-04 17:38 ` [PATCH v2 1/3] " James Morse
  2025-07-07  9:34   ` Jonathan Cameron
@ 2025-07-07 10:27   ` Ben Horgan
  2025-07-07 12:32     ` Jonathan Cameron
  2025-07-11  4:41   ` Gavin Shan
  2 siblings, 1 reply; 13+ messages in thread
From: Ben Horgan @ 2025-07-07 10:27 UTC (permalink / raw)
  To: James Morse, linux-kernel, linux-arm-kernel
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, sudeep.holla, Rob Herring,
	Jonathan Cameron, Catalin Marinas, WillDeaconwill

Hi James,

On 7/4/25 18:38, James Morse wrote:
> From: Rob Herring <robh@kernel.org>
> 
> Use the minimum CPU h/w id of the CPUs associated with the cache for the
> cache 'id'. This will provide a stable id value for a given system. As
> we need to check all possible CPUs, we can't use the shared_cpu_map
> which is just online CPUs. As there's not a cache to CPUs mapping in DT,
> we have to walk all CPU nodes and then walk cache levels.
> 
> The cache_id exposed to user-space has historically been 32 bits, and
> is too late to change. This value is parsed into a u32 by user-space
> libraries such as libvirt:
> https://github.com/libvirt/libvirt/blob/master/src/util/virresctrl.c#L1588
> 
> Give up on assigning cache-id's if a CPU h/w id greater than 32 bits
> is found.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Signed-off-by: Rob Herring <robh@kernel.org>
> [ ben: converted to use the __free cleanup idiom ]
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> [ morse: Add checks to give up if a value larger than 32 bits is seen. ]
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Use as a 32bit value has also been seen in DPDK patches here:
> http://inbox.dpdk.org/dev/20241021015246.304431-2-wathsala.vithanage@arm.com/
> 
> Changes since v1:
>   * Remove the second loop in favour of a helper.
>   
> An open question from v1 is whether it would be preferable to use an
> index into the DT of the CPU nodes instead of the hardware id. This would
> save an arch specific swizzle - but the numbers would change if the DT
> were changed. This scheme isn't sensitive to the order of DT nodes.
> 
> ---
>   drivers/base/cacheinfo.c | 38 ++++++++++++++++++++++++++++++++++++++
>   1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> index cf0d455209d7..df593da0d5f7 100644
> --- a/drivers/base/cacheinfo.c
> +++ b/drivers/base/cacheinfo.c
> @@ -8,6 +8,7 @@
>   #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>   
>   #include <linux/acpi.h>
> +#include <linux/bitfield.h>
>   #include <linux/bitops.h>
>   #include <linux/cacheinfo.h>
>   #include <linux/compiler.h>
> @@ -183,6 +184,42 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
>   	return of_property_read_bool(np, "cache-unified");
>   }
>   
> +static bool match_cache_node(struct device_node *cpu,
> +			     const struct device_node *cache_node)
> +{
> +	for (struct device_node *cache __free(device_node) = of_find_next_cache_node(cpu);
Looks like the creation of this helper function has upset the 
device_node reference counting. This first __free(device_node) will only 
cause of_node_put() to be called in the case of the early return from 
the loop. You've dropped the second __free(device_node) which accounts 
for 'cache' changing on each iteration.
> +	     cache != NULL; cache = of_find_next_cache_node(cache)) {
> +		if (cache == cache_node)
> +			return true;
> +	}
> +
> +	return false;
> +}
> +
> +static void cache_of_set_id(struct cacheinfo *this_leaf,
> +			    struct device_node *cache_node)
> +{
> +	struct device_node *cpu;
> +	u32 min_id = ~0;
> +
> +	for_each_of_cpu_node(cpu) {
> +		u64 id = of_get_cpu_hwid(cpu, 0);
> +
> +		if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
> +			of_node_put(cpu);
> +			return;
> +		}
> +
> +		if (match_cache_node(cpu, cache_node))
> +			min_id = min(min_id, id);
> +	}
> +
> +	if (min_id != ~0) {
> +		this_leaf->id = min_id;
> +		this_leaf->attributes |= CACHE_ID;
> +	}
> +}
> +
>   static void cache_of_set_props(struct cacheinfo *this_leaf,
>   			       struct device_node *np)
>   {
> @@ -198,6 +235,7 @@ static void cache_of_set_props(struct cacheinfo *this_leaf,
>   	cache_get_line_size(this_leaf, np);
>   	cache_nr_sets(this_leaf, np);
>   	cache_associativity(this_leaf);
> +	cache_of_set_id(this_leaf, np);
>   }
>   
>   static int cache_setup_of_node(unsigned int cpu)


Thanks,

Ben



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] cacheinfo: Set cache 'id' based on DT data
  2025-07-07 10:27   ` Ben Horgan
@ 2025-07-07 12:32     ` Jonathan Cameron
  2025-07-10 11:15       ` James Morse
  0 siblings, 1 reply; 13+ messages in thread
From: Jonathan Cameron @ 2025-07-07 12:32 UTC (permalink / raw)
  To: Ben Horgan
  Cc: James Morse, linux-kernel, linux-arm-kernel, Greg Kroah-Hartman,
	Rafael J . Wysocki, sudeep.holla, Rob Herring, Catalin Marinas,
	WillDeaconwill

On Mon, 7 Jul 2025 11:27:06 +0100
Ben Horgan <ben.horgan@arm.com> wrote:

> Hi James,
> 
> On 7/4/25 18:38, James Morse wrote:
> > From: Rob Herring <robh@kernel.org>
> > 
> > Use the minimum CPU h/w id of the CPUs associated with the cache for the
> > cache 'id'. This will provide a stable id value for a given system. As
> > we need to check all possible CPUs, we can't use the shared_cpu_map
> > which is just online CPUs. As there's not a cache to CPUs mapping in DT,
> > we have to walk all CPU nodes and then walk cache levels.
> > 
> > The cache_id exposed to user-space has historically been 32 bits, and
> > is too late to change. This value is parsed into a u32 by user-space
> > libraries such as libvirt:
> > https://github.com/libvirt/libvirt/blob/master/src/util/virresctrl.c#L1588
> > 
> > Give up on assigning cache-id's if a CPU h/w id greater than 32 bits
> > is found.
> > 
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> > Signed-off-by: Rob Herring <robh@kernel.org>
> > [ ben: converted to use the __free cleanup idiom ]
> > Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> > [ morse: Add checks to give up if a value larger than 32 bits is seen. ]
> > Signed-off-by: James Morse <james.morse@arm.com>
> > ---
> > Use as a 32bit value has also been seen in DPDK patches here:
> > http://inbox.dpdk.org/dev/20241021015246.304431-2-wathsala.vithanage@arm.com/
> > 
> > Changes since v1:
> >   * Remove the second loop in favour of a helper.
> >   
> > An open question from v1 is whether it would be preferable to use an
> > index into the DT of the CPU nodes instead of the hardware id. This would
> > save an arch specific swizzle - but the numbers would change if the DT
> > were changed. This scheme isn't sensitive to the order of DT nodes.
> > 
> > ---
> >   drivers/base/cacheinfo.c | 38 ++++++++++++++++++++++++++++++++++++++
> >   1 file changed, 38 insertions(+)
> > 
> > diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> > index cf0d455209d7..df593da0d5f7 100644
> > --- a/drivers/base/cacheinfo.c
> > +++ b/drivers/base/cacheinfo.c
> > @@ -8,6 +8,7 @@
> >   #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
> >   
> >   #include <linux/acpi.h>
> > +#include <linux/bitfield.h>
> >   #include <linux/bitops.h>
> >   #include <linux/cacheinfo.h>
> >   #include <linux/compiler.h>
> > @@ -183,6 +184,42 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
> >   	return of_property_read_bool(np, "cache-unified");
> >   }
> >   
> > +static bool match_cache_node(struct device_node *cpu,
> > +			     const struct device_node *cache_node)
> > +{
> > +	for (struct device_node *cache __free(device_node) = of_find_next_cache_node(cpu);  
> Looks like the creation of this helper function has upset the 
> device_node reference counting. This first __free(device_node) will only 
> cause of_node_put() to be called in the case of the early return from 
> the loop. You've dropped the second __free(device_node) which accounts 
> for 'cache' changing on each iteration.

Good catch - this behaves differently from many of the of_get_next* type
helpers in that it doesn't drop the reference to the previous iteration
within the call.

Maybe it should?

I checked a few of the call sites and some would be simplified if it did
others would need some more complex restructuring but might benefit as
well.

> > +	     cache != NULL; cache = of_find_next_cache_node(cache)) {
> > +		if (cache == cache_node)
> > +			return true;
> > +	}
> > +
> > +	return false;
> > +}
> > +
> > +static void cache_of_set_id(struct cacheinfo *this_leaf,
> > +			    struct device_node *cache_node)
> > +{
> > +	struct device_node *cpu;
> > +	u32 min_id = ~0;
> > +
> > +	for_each_of_cpu_node(cpu) {
> > +		u64 id = of_get_cpu_hwid(cpu, 0);
> > +
> > +		if (FIELD_GET(GENMASK_ULL(63, 32), id)) {
> > +			of_node_put(cpu);
> > +			return;
> > +		}
> > +
> > +		if (match_cache_node(cpu, cache_node))
> > +			min_id = min(min_id, id);
> > +	}
> > +
> > +	if (min_id != ~0) {
> > +		this_leaf->id = min_id;
> > +		this_leaf->attributes |= CACHE_ID;
> > +	}
> > +}
> > +
> >   static void cache_of_set_props(struct cacheinfo *this_leaf,
> >   			       struct device_node *np)
> >   {
> > @@ -198,6 +235,7 @@ static void cache_of_set_props(struct cacheinfo *this_leaf,
> >   	cache_get_line_size(this_leaf, np);
> >   	cache_nr_sets(this_leaf, np);
> >   	cache_associativity(this_leaf);
> > +	cache_of_set_id(this_leaf, np);
> >   }
> >   
> >   static int cache_setup_of_node(unsigned int cpu)  
> 
> 
> Thanks,
> 
> Ben
> 
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] cacheinfo: Set cache 'id' based on DT data
  2025-07-07 12:32     ` Jonathan Cameron
@ 2025-07-10 11:15       ` James Morse
  2025-07-10 11:24         ` Ben Horgan
  0 siblings, 1 reply; 13+ messages in thread
From: James Morse @ 2025-07-10 11:15 UTC (permalink / raw)
  To: Jonathan Cameron, Ben Horgan
  Cc: linux-kernel, linux-arm-kernel, Greg Kroah-Hartman,
	Rafael J . Wysocki, sudeep.holla, Rob Herring, Catalin Marinas,
	WillDeaconwill

Hi Ben, Jonathan,

On 07/07/2025 13:32, Jonathan Cameron wrote:
> On Mon, 7 Jul 2025 11:27:06 +0100
> Ben Horgan <ben.horgan@arm.com> wrote:
>> On 7/4/25 18:38, James Morse wrote:
>>> From: Rob Herring <robh@kernel.org>
>>> Use the minimum CPU h/w id of the CPUs associated with the cache for the
>>> cache 'id'. This will provide a stable id value for a given system. As
>>> we need to check all possible CPUs, we can't use the shared_cpu_map
>>> which is just online CPUs. As there's not a cache to CPUs mapping in DT,
>>> we have to walk all CPU nodes and then walk cache levels.
>>>
>>> The cache_id exposed to user-space has historically been 32 bits, and
>>> is too late to change. This value is parsed into a u32 by user-space
>>> libraries such as libvirt:
>>> https://github.com/libvirt/libvirt/blob/master/src/util/virresctrl.c#L1588
>>>
>>> Give up on assigning cache-id's if a CPU h/w id greater than 32 bits
>>> is found.

>>> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
>>> index cf0d455209d7..df593da0d5f7 100644
>>> --- a/drivers/base/cacheinfo.c
>>> +++ b/drivers/base/cacheinfo.c
>>> @@ -183,6 +184,42 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
>>>   	return of_property_read_bool(np, "cache-unified");
>>>   }
>>>   
>>> +static bool match_cache_node(struct device_node *cpu,
>>> +			     const struct device_node *cache_node)
>>> +{
>>> +	for (struct device_node *cache __free(device_node) = of_find_next_cache_node(cpu);  
>> Looks like the creation of this helper function has upset the 
>> device_node reference counting. This first __free(device_node) will only 
>> cause of_node_put() to be called in the case of the early return from 
>> the loop. You've dropped the second __free(device_node) which accounts 
>> for 'cache' changing on each iteration.

Heh, I just took this hunk verbatim. Fixing this up with the __free() magic is tricky as
the existing patterns all drop the reference to cpu, which we don't want to do here. I
think at this point the __free() magic is just making this harder to understand. How about
the old fashioned way:

| static bool match_cache_node(struct device_node *cpu,
|                              const struct device_node *cache_node)
| {
|         struct device_node *prev, *cache = of_find_next_cache_node(cpu);
|
|         while (cache) {
|                 if (cache == cache_node) {
|                         of_node_put(cache);
|                         return true;
|                 }
|
|                 prev = cache;
|                 cache = of_find_next_cache_node(cache);
|                 of_node_put(prev);
|         }
|
|         return false;
| }


> Good catch - this behaves differently from many of the of_get_next* type
> helpers in that it doesn't drop the reference to the previous iteration
> within the call.
> 
> Maybe it should?
> 
> I checked a few of the call sites and some would be simplified if it did
> others would need some more complex restructuring but might benefit as
> well.

If it did, we'd end up dropping the reference to cpu on the way in, which
of_get_next_cpu_node() in for_each_of_cpu_node() was expecting to do.


Thanks,

James


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] cacheinfo: Set cache 'id' based on DT data
  2025-07-10 11:15       ` James Morse
@ 2025-07-10 11:24         ` Ben Horgan
  2025-07-11 14:21           ` Jonathan Cameron
  0 siblings, 1 reply; 13+ messages in thread
From: Ben Horgan @ 2025-07-10 11:24 UTC (permalink / raw)
  To: James Morse, Jonathan Cameron
  Cc: linux-kernel, linux-arm-kernel, Greg Kroah-Hartman,
	Rafael J . Wysocki, sudeep.holla, Rob Herring, Catalin Marinas,
	WillDeaconwill

Hi James and Jonathan,

On 7/10/25 12:15, James Morse wrote:
> Hi Ben, Jonathan,
> 
> On 07/07/2025 13:32, Jonathan Cameron wrote:
>> On Mon, 7 Jul 2025 11:27:06 +0100
>> Ben Horgan <ben.horgan@arm.com> wrote:
>>> On 7/4/25 18:38, James Morse wrote:
>>>> From: Rob Herring <robh@kernel.org>
>>>> Use the minimum CPU h/w id of the CPUs associated with the cache for the
>>>> cache 'id'. This will provide a stable id value for a given system. As
>>>> we need to check all possible CPUs, we can't use the shared_cpu_map
>>>> which is just online CPUs. As there's not a cache to CPUs mapping in DT,
>>>> we have to walk all CPU nodes and then walk cache levels.
>>>>
>>>> The cache_id exposed to user-space has historically been 32 bits, and
>>>> is too late to change. This value is parsed into a u32 by user-space
>>>> libraries such as libvirt:
>>>> https://github.com/libvirt/libvirt/blob/master/src/util/virresctrl.c#L1588
>>>>
>>>> Give up on assigning cache-id's if a CPU h/w id greater than 32 bits
>>>> is found.
> 
>>>> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
>>>> index cf0d455209d7..df593da0d5f7 100644
>>>> --- a/drivers/base/cacheinfo.c
>>>> +++ b/drivers/base/cacheinfo.c
>>>> @@ -183,6 +184,42 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
>>>>    	return of_property_read_bool(np, "cache-unified");
>>>>    }
>>>>    
>>>> +static bool match_cache_node(struct device_node *cpu,
>>>> +			     const struct device_node *cache_node)
>>>> +{
>>>> +	for (struct device_node *cache __free(device_node) = of_find_next_cache_node(cpu);
>>> Looks like the creation of this helper function has upset the
>>> device_node reference counting. This first __free(device_node) will only
>>> cause of_node_put() to be called in the case of the early return from
>>> the loop. You've dropped the second __free(device_node) which accounts
>>> for 'cache' changing on each iteration.
> 
> Heh, I just took this hunk verbatim. Fixing this up with the __free() magic is tricky as
> the existing patterns all drop the reference to cpu, which we don't want to do here. I
> think at this point the __free() magic is just making this harder to understand. How about
> the old fashioned way:
> 
> | static bool match_cache_node(struct device_node *cpu,
> |                              const struct device_node *cache_node)
> | {
> |         struct device_node *prev, *cache = of_find_next_cache_node(cpu);
> |
> |         while (cache) {
> |                 if (cache == cache_node) {
> |                         of_node_put(cache);
> |                         return true;
> |                 }
> |
> |                 prev = cache;
> |                 cache = of_find_next_cache_node(cache);
> |                 of_node_put(prev);
> |         }
> |
> |         return false;
> | }
Ok with me.
> 
> 
>> Good catch - this behaves differently from many of the of_get_next* type
>> helpers in that it doesn't drop the reference to the previous iteration
>> within the call.
>>
>> Maybe it should?
>>
>> I checked a few of the call sites and some would be simplified if it did
>> others would need some more complex restructuring but might benefit as
>> well.
> 
> If it did, we'd end up dropping the reference to cpu on the way in, which
> of_get_next_cpu_node() in for_each_of_cpu_node() was expecting to do.

Yes, I think the blurring of the lines between a cpu node and cache node 
is at least partially to blame for the confusion here.
> 
> 
> Thanks,
> 
> James

Thanks,

Ben



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] cacheinfo: Set cache 'id' based on DT data
  2025-07-04 17:38 ` [PATCH v2 1/3] " James Morse
  2025-07-07  9:34   ` Jonathan Cameron
  2025-07-07 10:27   ` Ben Horgan
@ 2025-07-11  4:41   ` Gavin Shan
  2 siblings, 0 replies; 13+ messages in thread
From: Gavin Shan @ 2025-07-11  4:41 UTC (permalink / raw)
  To: James Morse, linux-kernel, linux-arm-kernel
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, sudeep.holla, Rob Herring,
	Ben Horgan, Jonathan Cameron, Catalin Marinas, WillDeaconwill

On 7/5/25 3:38 AM, James Morse wrote:
> From: Rob Herring <robh@kernel.org>
> 
> Use the minimum CPU h/w id of the CPUs associated with the cache for the
> cache 'id'. This will provide a stable id value for a given system. As
> we need to check all possible CPUs, we can't use the shared_cpu_map
> which is just online CPUs. As there's not a cache to CPUs mapping in DT,
> we have to walk all CPU nodes and then walk cache levels.
> 
> The cache_id exposed to user-space has historically been 32 bits, and
> is too late to change. This value is parsed into a u32 by user-space
> libraries such as libvirt:
> https://github.com/libvirt/libvirt/blob/master/src/util/virresctrl.c#L1588
> 
> Give up on assigning cache-id's if a CPU h/w id greater than 32 bits
> is found.
> 
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Signed-off-by: Rob Herring <robh@kernel.org>
> [ ben: converted to use the __free cleanup idiom ]
> Signed-off-by: Ben Horgan <ben.horgan@arm.com>
> [ morse: Add checks to give up if a value larger than 32 bits is seen. ]
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> Use as a 32bit value has also been seen in DPDK patches here:
> http://inbox.dpdk.org/dev/20241021015246.304431-2-wathsala.vithanage@arm.com/
> 
> Changes since v1:
>   * Remove the second loop in favour of a helper.
>   
> An open question from v1 is whether it would be preferable to use an
> index into the DT of the CPU nodes instead of the hardware id. This would
> save an arch specific swizzle - but the numbers would change if the DT
> were changed. This scheme isn't sensitive to the order of DT nodes.
> 
> ---
>   drivers/base/cacheinfo.c | 38 ++++++++++++++++++++++++++++++++++++++
>   1 file changed, 38 insertions(+)
> 

With Ben Horgan's concern addressed, LGTM:

Reviewed-by: Gavin Shan <gshan@redha.com>



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/3] cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id
  2025-07-04 17:38 ` [PATCH v2 2/3] cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id James Morse
@ 2025-07-11  4:42   ` Gavin Shan
  0 siblings, 0 replies; 13+ messages in thread
From: Gavin Shan @ 2025-07-11  4:42 UTC (permalink / raw)
  To: James Morse, linux-kernel, linux-arm-kernel
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, sudeep.holla, Rob Herring,
	Ben Horgan, Jonathan Cameron, Catalin Marinas, WillDeaconwill

On 7/5/25 3:38 AM, James Morse wrote:
> Filesystems like resctrl use the cache-id exposed via sysfs to identify
> groups of CPUs. The value is also used for PCIe cache steering tags. On
> DT platforms cache-id is not something that is described in the
> device-tree, but instead generated from the smallest CPU h/w id of the
> CPUs associated with that cache.
> 
> CPU h/w ids may be larger than 32 bits.
> 
> Add a hook to allow architectures to compress the value from the devicetree
> into 32 bits. Returning the same value is always safe as cache_of_set_id()
> will stop if a value larger than 32 bits is seen.
> 
> For example, on arm64 the value is the MPIDR affinity register, which only
> has 32 bits of affinity data, but spread accross the 64 bit field. An
> arch-specific bit swizzle gives a 32 bit value.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> ---
>   drivers/base/cacheinfo.c | 5 +++++
>   1 file changed, 5 insertions(+)
> 
Reviewed-by: Gavin Shan <gshan@redha.com>



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 3/3] arm64: cacheinfo: Provide helper to compress MPIDR value into u32
  2025-07-04 17:38 ` [PATCH v2 3/3] arm64: cacheinfo: Provide helper to compress MPIDR value into u32 James Morse
@ 2025-07-11  4:43   ` Gavin Shan
  0 siblings, 0 replies; 13+ messages in thread
From: Gavin Shan @ 2025-07-11  4:43 UTC (permalink / raw)
  To: James Morse, linux-kernel, linux-arm-kernel
  Cc: Greg Kroah-Hartman, Rafael J . Wysocki, sudeep.holla, Rob Herring,
	Ben Horgan, Jonathan Cameron, Catalin Marinas, WillDeaconwill

On 7/5/25 3:38 AM, James Morse wrote:
> Filesystems like resctrl use the cache-id exposed via sysfs to identify
> groups of CPUs. The value is also used for PCIe cache steering tags. On
> DT platforms cache-id is not something that is described in the
> device-tree, but instead generated from the smallest MPIDR of the CPUs
> associated with that cache. The cache-id exposed to user-space has
> historically been 32 bits.
> 
> MPIDR values may be larger than 32 bits.
> 
> MPIDR only has 32 bits worth of affinity data, but the aff3 field lives
> above 32bits. The corresponding lower bits are masked out by
> MPIDR_HWID_BITMASK and contain an SMT flag and Uni-Processor flag.
> 
> Swizzzle the aff3 field into the bottom 32 bits and using that.
> 
> In case more affinity fields are added in the future, the upper RES0
> area should be checked. Returning a value greater than 32 bits from
> this helper will cause the caller to give up on allocating cache-ids.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> ---
> Changes since v1:
>   * Removal of unrelated changes.
>   * Added a comment about how the RES0 bit safety net works.
> ---
>   arch/arm64/include/asm/cache.h | 17 +++++++++++++++++
>   1 file changed, 17 insertions(+)
> 

Reviewed-by: Gavin Shan <gshan@redha.com>



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/3] cacheinfo: Set cache 'id' based on DT data
  2025-07-10 11:24         ` Ben Horgan
@ 2025-07-11 14:21           ` Jonathan Cameron
  0 siblings, 0 replies; 13+ messages in thread
From: Jonathan Cameron @ 2025-07-11 14:21 UTC (permalink / raw)
  To: Ben Horgan
  Cc: James Morse, linux-kernel, linux-arm-kernel, Greg Kroah-Hartman,
	Rafael J . Wysocki, sudeep.holla, Rob Herring, Catalin Marinas,
	WillDeaconwill

On Thu, 10 Jul 2025 12:24:01 +0100
Ben Horgan <ben.horgan@arm.com> wrote:

> Hi James and Jonathan,
> 
> On 7/10/25 12:15, James Morse wrote:
> > Hi Ben, Jonathan,
> > 
> > On 07/07/2025 13:32, Jonathan Cameron wrote:  
> >> On Mon, 7 Jul 2025 11:27:06 +0100
> >> Ben Horgan <ben.horgan@arm.com> wrote:  
> >>> On 7/4/25 18:38, James Morse wrote:  
> >>>> From: Rob Herring <robh@kernel.org>
> >>>> Use the minimum CPU h/w id of the CPUs associated with the cache for the
> >>>> cache 'id'. This will provide a stable id value for a given system. As
> >>>> we need to check all possible CPUs, we can't use the shared_cpu_map
> >>>> which is just online CPUs. As there's not a cache to CPUs mapping in DT,
> >>>> we have to walk all CPU nodes and then walk cache levels.
> >>>>
> >>>> The cache_id exposed to user-space has historically been 32 bits, and
> >>>> is too late to change. This value is parsed into a u32 by user-space
> >>>> libraries such as libvirt:
> >>>> https://github.com/libvirt/libvirt/blob/master/src/util/virresctrl.c#L1588
> >>>>
> >>>> Give up on assigning cache-id's if a CPU h/w id greater than 32 bits
> >>>> is found.  
> >   
> >>>> diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
> >>>> index cf0d455209d7..df593da0d5f7 100644
> >>>> --- a/drivers/base/cacheinfo.c
> >>>> +++ b/drivers/base/cacheinfo.c
> >>>> @@ -183,6 +184,42 @@ static bool cache_node_is_unified(struct cacheinfo *this_leaf,
> >>>>    	return of_property_read_bool(np, "cache-unified");
> >>>>    }
> >>>>    
> >>>> +static bool match_cache_node(struct device_node *cpu,
> >>>> +			     const struct device_node *cache_node)
> >>>> +{
> >>>> +	for (struct device_node *cache __free(device_node) = of_find_next_cache_node(cpu);  
> >>> Looks like the creation of this helper function has upset the
> >>> device_node reference counting. This first __free(device_node) will only
> >>> cause of_node_put() to be called in the case of the early return from
> >>> the loop. You've dropped the second __free(device_node) which accounts
> >>> for 'cache' changing on each iteration.  
> > 
> > Heh, I just took this hunk verbatim. Fixing this up with the __free() magic is tricky as
> > the existing patterns all drop the reference to cpu, which we don't want to do here. I
> > think at this point the __free() magic is just making this harder to understand. How about
> > the old fashioned way:
> > 
> > | static bool match_cache_node(struct device_node *cpu,
> > |                              const struct device_node *cache_node)
> > | {
> > |         struct device_node *prev, *cache = of_find_next_cache_node(cpu);
> > |
> > |         while (cache) {
> > |                 if (cache == cache_node) {
> > |                         of_node_put(cache);
> > |                         return true;
> > |                 }
> > |
> > |                 prev = cache;
> > |                 cache = of_find_next_cache_node(cache);
> > |                 of_node_put(prev);
> > |         }
> > |
> > |         return false;
> > | }  
> Ok with me.
Agreed. 

> > 
> >   
> >> Good catch - this behaves differently from many of the of_get_next* type
> >> helpers in that it doesn't drop the reference to the previous iteration
> >> within the call.
> >>
> >> Maybe it should?
> >>
> >> I checked a few of the call sites and some would be simplified if it did
> >> others would need some more complex restructuring but might benefit as
> >> well.  
> > 
> > If it did, we'd end up dropping the reference to cpu on the way in, which
> > of_get_next_cpu_node() in for_each_of_cpu_node() was expecting to do.  
> 
> Yes, I think the blurring of the lines between a cpu node and cache node 
> is at least partially to blame for the confusion here.
Yes.  That is more than a little ugly!

> > 
> > 
> > Thanks,
> > 
> > James  
> 
> Thanks,
> 
> Ben
> 
> 



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-07-11 14:24 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-04 17:38 [PATCH v2 0/3] cacheinfo: Set cache 'id' based on DT data James Morse
2025-07-04 17:38 ` [PATCH v2 1/3] " James Morse
2025-07-07  9:34   ` Jonathan Cameron
2025-07-07 10:27   ` Ben Horgan
2025-07-07 12:32     ` Jonathan Cameron
2025-07-10 11:15       ` James Morse
2025-07-10 11:24         ` Ben Horgan
2025-07-11 14:21           ` Jonathan Cameron
2025-07-11  4:41   ` Gavin Shan
2025-07-04 17:38 ` [PATCH v2 2/3] cacheinfo: Add arch hook to compress CPU h/w id into 32 bits for cache-id James Morse
2025-07-11  4:42   ` Gavin Shan
2025-07-04 17:38 ` [PATCH v2 3/3] arm64: cacheinfo: Provide helper to compress MPIDR value into u32 James Morse
2025-07-11  4:43   ` Gavin Shan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).