Linux Kernel Selftest development
* [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests on ARM
@ 2026-01-23  4:40 Shaopeng Tan
  2026-01-23  4:40 ` [RFC PATCH 1/5] kselftests/resctrl: Detect the ARM architecture Shaopeng Tan
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Shaopeng Tan @ 2026-01-23  4:40 UTC (permalink / raw)
  To: fenghuay, reinette.chatre, ben.horgan, james.morse, shuah
  Cc: linux-kselftest, linux-kernel, linux-arm-kernel, tan.shaopeng

Hello Fenghua, Reinette, Ben, James, and to whom it may concern,

The MPAM driver is nearing upstream merge,
but resctrl_test doesn't work on the Arm architecture.
I'm actively working on a series to support the CAT/NONCONT_CAT tests on Arm.
(Support for the MBM/MBA tests will be considered in the future.)

While I've modified the resctrl_test code to enable CAT on Arm,
the CAT test is failing in the NVIDIA Grace environment.
(I don't have access to any other environment.)
Am I misunderstanding the CAT tests, or is there something specific
about Grace that I'm overlooking? Any advice would be greatly appreciated.

First of all,
when running CAT on Grace, I observed that cache limiting is working as expected.
I verified this by checking "sudo cat /sys/fs/resctrl/c1/mon_data/mon_L3_*/llc_occupancy".
Furthermore, I noticed that benchmark execution times varied directly with the limited cache size.

I reused the existing Intel CAT test methodology,
which involves collecting cache miss counts via perf_event during a benchmark task and then
verifying a correlation between the cache limit value and these miss counts.
https://lore.kernel.org/lkml/20231215150515.36983-23-ilpo.jarvinen@linux.intel.com/#r

I'm aware that the specific cache miss numbers and CAT's impact can
differ significantly depending on the microarchitecture or SoC.
For Arm, we need to establish an appropriate minimum difference in LLC
misses between a test with an (n+1)-bit CBM and a test with an n-bit CBM.

However, my experiments with Grace showed that even when I significantly
varied the cache span size, the average LLC miss counts remained nearly unchanged.
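
For reference, the mask walk and cache span values that appear in the log in this message can be reproduced with a short sketch. The 12-bit full mask (0xfff) and the 64-byte cache line are assumptions inferred from the logged numbers (119537664 / 64 / 155648 = 12), and the helper names only mirror what the selftest appears to use; this is not the selftest code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Number of set bits in a capacity bitmask (CBM). */
static unsigned int count_bits(unsigned long mask)
{
	unsigned int n = 0;

	for (; mask; mask &= mask - 1)
		n++;
	return n;
}

/* Bytes of cache covered by portion_mask, proportional to its share of full_mask. */
static size_t cache_portion_size(size_t cache_size, unsigned long portion_mask,
				 unsigned long full_mask)
{
	assert(full_mask != 0);
	return cache_size * count_bits(portion_mask) / count_bits(full_mask);
}

/* Drop the top bit of a contiguous mask: 0x3f -> 0x1f -> ... -> 0x1 -> 0. */
static unsigned long next_mask(unsigned long mask)
{
	return mask & (mask >> 1);
}

/* Print the test mask, the complementary mask and the span for each step. */
static void print_mask_walk(size_t cache_size, unsigned long full_mask,
			    unsigned long start_mask)
{
	unsigned long m;

	for (m = start_mask; m; m = next_mask(m))
		printf("test=%lx other=%lx span=%zu lines\n", m, full_mask & ~m,
		       cache_portion_size(cache_size, m, full_mask) / 64);
}
```

Calling print_mask_walk(119537664, 0xfff, 0x3f) walks the same pairs as the schemata writes in the log (fc0/3f, fe0/1f, and so on), with spans shrinking by 155648 lines per bit.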

Detailed test results are as follows:

# # Starting L3_CAT test ...
# # Mounting resctrl to "/sys/fs/resctrl"
# # Cache size :119537664
# # Writing benchmark parameters to resctrl FS
# # Write schema "L3:1=fc0" to resctrl FS
# # Write schema "L3:1=3f" to resctrl FS
# # Write schema "L3:1=fe0" to resctrl FS
# # Write schema "L3:1=1f" to resctrl FS
# # Write schema "L3:1=ff0" to resctrl FS
# # Write schema "L3:1=f" to resctrl FS
# # Write schema "L3:1=ff8" to resctrl FS
# # Write schema "L3:1=7" to resctrl FS
# # Write schema "L3:1=ffc" to resctrl FS
# # Write schema "L3:1=3" to resctrl FS
# # Write schema "L3:1=ffe" to resctrl FS
# # Write schema "L3:1=1" to resctrl FS
# # Checking for pass/fail
# # Number of bits: 6
# # Average LLC val: 1609252
# # Cache span (lines): 933888
# # Fail: Check cache miss rate changed more than 4.0%
# # Percent diff=-0.0
# # Number of bits: 5
# # Average LLC val: 1609038
# # Cache span (lines): 778240
# # Fail: Check cache miss rate changed more than 3.0%
# # Percent diff=0.7
# # Number of bits: 4
# # Average LLC val: 1620802
# # Cache span (lines): 622592
# # Fail: Check cache miss rate changed more than 2.0%
# # Percent diff=1.1
# # Number of bits: 3
# # Average LLC val: 1639214
# # Cache span (lines): 466944
# # Fail: Check cache miss rate changed more than 1.0%
# # Percent diff=0.9
# # Number of bits: 2
# # Average LLC val: 1653470
# # Cache span (lines): 311296
# # Pass: Check cache miss rate changed more than 0.0%
# # Percent diff=1.0
# # Number of bits: 1
# # Average LLC val: 1669618
# # Cache span (lines): 155648
# not ok 4 L3_CAT: test
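
The percent differences in the log above can be reproduced from the logged averages. The numbers suggest that each Pass/Fail line compares the block printed after it with the block printed before it, and that the threshold is 1% per bit times (bits - 1) of the smaller mask. Both points are inferred from the log, so the sketch below is an illustration under those assumptions rather than the selftest's exact check.

```c
#include <assert.h>

/* Relative change of the average miss count, in percent. */
static double percent_diff(long long prev_avg, long long cur_avg)
{
	assert(prev_avg != 0);
	return 100.0 * (double)(cur_avg - prev_avg) / (double)prev_avg;
}

/*
 * Non-zero if the step to a CBM with cur_bits bits shows a large enough
 * increase in misses. Assumption: 1% per bit, scaled by (cur_bits - 1).
 */
static int step_passes(long long prev_avg, long long cur_avg, int cur_bits)
{
	double min_diff = 1.0 * (cur_bits - 1);

	return percent_diff(prev_avg, cur_avg) >= min_diff;
}
```

For example, going from 5 bits (1609038) to 4 bits (1620802) gives roughly 0.7%, well short of the 3.0% the log reports as required for that step.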

Additionally, even with a fixed alloc buffer size (span = 119537664),
the Average LLC value remains nearly unchanged regardless of the limited cache size.
Furthermore, it appears that PERF_COUNT_HW_CACHE_MISSES is mapped to
ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL in "./drivers/perf/arm_pmuv3.c".
To work around this, I tried switching the perf_event measurement event to
ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD,
ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL,
and ARMV8_PMUV3_PERFCTR_L3D_CACHE_LMISS_RD.
However, the Average LLC value still remains nearly unchanged.

My modifications to resctrl_test (for context):

diff --git a/tools/testing/selftests/resctrl/cache.c
b/tools/testing/selftests/resctrl/cache.c
index 9a4a6c52b14c..9f00680039c6 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -8,7 +8,8 @@ char llc_occup_path[1024];
 void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config)
 {
        memset(pea, 0, sizeof(*pea));
-       pea->type = PERF_TYPE_HARDWARE;
+       //pea->type = PERF_TYPE_HARDWARE;
+       pea->type = PERF_TYPE_RAW;
        pea->size = sizeof(*pea);
        pea->read_format = PERF_FORMAT_GROUP;
        pea->exclude_kernel = 1;
diff --git a/tools/testing/selftests/resctrl/cat_test.c
b/tools/testing/selftests/resctrl/cat_test.c
index 58b1590695d1..3ecf22fa1983 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -8,6 +8,7 @@
  *    Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>,
  *    Fenghua Yu <fenghua.yu@intel.com>
  */
+#include "perf/arm_pmuv3.h"
 #include "resctrl.h"
 #include <unistd.h>

@@ -181,7 +182,11 @@ static int cat_test(const struct resctrl_test *test,
        if (ret)
                goto reset_affinity;

        perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
+       //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_L3D_CACHE);
+       //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD);
+       //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL);
+       //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_L3D_CACHE_LMISS_RD);
        perf_event_initialize_read_format(&pe_read);
        pe_fd = perf_open(&pea, bm_pid, uparams->cpu);
        if (pe_fd < 0) {
@@ -276,6 +281,7 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
        };
        param.mask = long_mask;
        span = cache_portion_size(cache_total_size, start_mask, full_cache_mask);
+       //span = 119537664; //L3 cache size of my machine

        remove(param.filename);

Any insights or suggestions would be greatly appreciated.

Best regards,
Shaopeng TAN

---
Shaopeng Tan (5):
  kselftests/resctrl: Detect the ARM architecture
  kselftests/resctrl: enable noncont_cat for MPAM
  kselftests/resctrl: remove unnecessary exclude_idle
  kselftests/resctrl: set shareable_mask to zero if all bits are shared
    between software and hardware
  kselftests/resctrl: Add support for CAT test on ARM

 tools/testing/selftests/resctrl/cache.c         | 1 -
 tools/testing/selftests/resctrl/cat_test.c      | 5 +++--
 tools/testing/selftests/resctrl/fill_buf.c      | 4 ++++
 tools/testing/selftests/resctrl/resctrl.h       | 1 +
 tools/testing/selftests/resctrl/resctrl_tests.c | 7 +++++++
 tools/testing/selftests/resctrl/resctrlfs.c     | 2 ++
 6 files changed, 17 insertions(+), 3 deletions(-)

-- 
2.47.3



* [RFC PATCH 1/5] kselftests/resctrl: Detect the ARM architecture
  2026-01-23  4:40 [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests on ARM Shaopeng Tan
@ 2026-01-23  4:40 ` Shaopeng Tan
  2026-02-17 17:49   ` Reinette Chatre
  2026-01-23  4:40 ` [RFC PATCH 2/5] kselftests/resctrl: enable noncont_cat for MPAM Shaopeng Tan
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Shaopeng Tan @ 2026-01-23  4:40 UTC (permalink / raw)
  To: fenghuay, reinette.chatre, ben.horgan, james.morse, shuah
  Cc: linux-kselftest, linux-kernel, linux-arm-kernel, tan.shaopeng

The resctrl test is not enabled for MPAM (ARM Memory System Resource
Partitioning and Monitoring).
Add processing to detect the ARM architecture.

Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
---
 tools/testing/selftests/resctrl/resctrl.h       | 1 +
 tools/testing/selftests/resctrl/resctrl_tests.c | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 3c51bdac2dfa..492d2a1c4033 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -38,6 +38,7 @@
  */
 #define ARCH_INTEL     1
 #define ARCH_AMD       2
+#define ARCH_ARM       3
 
 #define END_OF_TESTS	1
 
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 5154ffd821c4..662968d38eca 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -8,6 +8,7 @@
  *    Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>,
  *    Fenghua Yu <fenghua.yu@intel.com>
  */
+#include <sys/utsname.h>
 #include "resctrl.h"
 
 /* Volatile memory sink to prevent compiler optimizations */
@@ -26,6 +27,7 @@ static struct resctrl_test *resctrl_tests[] = {
 static int detect_vendor(void)
 {
 	FILE *inf = fopen("/proc/cpuinfo", "r");
+	struct utsname system_info;
 	int vendor_id = 0;
 	char *s = NULL;
 	char *res;
@@ -42,6 +44,11 @@ static int detect_vendor(void)
 		vendor_id = ARCH_INTEL;
 	else if (s && !strcmp(s, ": AuthenticAMD\n"))
 		vendor_id = ARCH_AMD;
+	else {
+		uname(&system_info);
+		if (strstr(system_info.machine, "aarch64") != NULL)
+			vendor_id = ARCH_ARM;
+	}
 
 	fclose(inf);
 	free(res);
-- 
2.47.3
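
A minimal sketch of the same detection idea with the uname() return value checked, which the hunk above does not do. The helper names are hypothetical, and matching the "aarch64" prefix (so that "aarch64_be" is also accepted) is an assumption about what the test wants.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <sys/utsname.h>

/* True if a uname machine string names a 64-bit Arm system. */
static bool machine_is_arm64(const char *machine)
{
	/* Prefix match so that "aarch64_be" is accepted as well. */
	return machine && strncmp(machine, "aarch64", 7) == 0;
}

/* True if the running kernel reports a 64-bit Arm machine. */
static bool running_on_arm64(void)
{
	struct utsname info;

	/* uname() can fail; treat that as "not detected". */
	if (uname(&info) < 0)
		return false;
	return machine_is_arm64(info.machine);
}
```

Splitting the string check from the system call keeps the matching rule testable on any build host.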



* [RFC PATCH 2/5] kselftests/resctrl: enable noncont_cat for MPAM
  2026-01-23  4:40 [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests on ARM Shaopeng Tan
  2026-01-23  4:40 ` [RFC PATCH 1/5] kselftests/resctrl: Detect the ARM architecture Shaopeng Tan
@ 2026-01-23  4:40 ` Shaopeng Tan
  2026-02-17 17:52   ` Reinette Chatre
  2026-01-23  4:40 ` [RFC PATCH 3/5] kselftests/resctrl: remove unnecessary exclude_idle Shaopeng Tan
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Shaopeng Tan @ 2026-01-23  4:40 UTC (permalink / raw)
  To: fenghuay, reinette.chatre, ben.horgan, james.morse, shuah
  Cc: linux-kselftest, linux-kernel, linux-arm-kernel, tan.shaopeng

Arm (MPAM driver) also supports non-contiguous CBM,
so enable noncont_cat for Arm.

Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
---
 tools/testing/selftests/resctrl/cat_test.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 94cfdba5308d..e1b30ab4cef5 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -291,7 +291,8 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
 static bool arch_supports_noncont_cat(const struct resctrl_test *test)
 {
 	/* AMD always supports non-contiguous CBM. */
-	if (get_vendor() == ARCH_AMD)
+	/* ARM(MPAM driver) also supports non-contiguous CBM. */
+	if (get_vendor() == ARCH_AMD || get_vendor() == ARCH_ARM)
 		return true;
 
 #if defined(__i386__) || defined(__x86_64__) /* arch */
-- 
2.47.3



* [RFC PATCH 3/5] kselftests/resctrl: remove unnecessary exclude_idle
  2026-01-23  4:40 [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests on ARM Shaopeng Tan
  2026-01-23  4:40 ` [RFC PATCH 1/5] kselftests/resctrl: Detect the ARM architecture Shaopeng Tan
  2026-01-23  4:40 ` [RFC PATCH 2/5] kselftests/resctrl: enable noncont_cat for MPAM Shaopeng Tan
@ 2026-01-23  4:40 ` Shaopeng Tan
  2026-02-17 17:52   ` Reinette Chatre
  2026-01-23  4:40 ` [RFC PATCH 4/5] kselftests/resctrl: set shareable_mask to zero if all bits are shared between software and hardware Shaopeng Tan
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Shaopeng Tan @ 2026-01-23  4:40 UTC (permalink / raw)
  To: fenghuay, reinette.chatre, ben.horgan, james.morse, shuah
  Cc: linux-kselftest, linux-kernel, linux-arm-kernel, tan.shaopeng

The Linux manual states regarding exclude_idle: "While you can currently
enable this for any event type, it is ignored for all but software events."
Also, it appears exclude_idle is not supported on Arm.
Therefore, remove it.

Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
---
 tools/testing/selftests/resctrl/cache.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index 1ff1104e6575..9a4a6c52b14c 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -13,7 +13,6 @@ void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config)
 	pea->read_format = PERF_FORMAT_GROUP;
 	pea->exclude_kernel = 1;
 	pea->exclude_hv = 1;
-	pea->exclude_idle = 1;
 	pea->exclude_callchain_kernel = 1;
 	pea->inherit = 1;
 	pea->exclude_guest = 1;
-- 
2.47.3



* [RFC PATCH 4/5] kselftests/resctrl: set shareable_mask to zero if all bits are shared between software and hardware
  2026-01-23  4:40 [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests on ARM Shaopeng Tan
                   ` (2 preceding siblings ...)
  2026-01-23  4:40 ` [RFC PATCH 3/5] kselftests/resctrl: remove unnecessary exclude_idle Shaopeng Tan
@ 2026-01-23  4:40 ` Shaopeng Tan
  2026-02-17 17:52   ` Reinette Chatre
  2026-01-23  4:40 ` [RFC PATCH 5/5] kselftests/resctrl: Add support for CAT test on ARM Shaopeng Tan
  2026-01-27 20:40 ` [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests " Ben Horgan
  5 siblings, 1 reply; 15+ messages in thread
From: Shaopeng Tan @ 2026-01-23  4:40 UTC (permalink / raw)
  To: fenghuay, reinette.chatre, ben.horgan, james.morse, shuah
  Cc: linux-kselftest, linux-kernel, linux-arm-kernel, tan.shaopeng

When all bits are shared between software and hardware, the CAT test cannot run.

In the case of the MPAM driver, even if all bits are shared between
hardware and software, they can be used as if software-exclusive.

To enable CAT, if all bits are shared between hardware and software,
set shareable_mask to zero.

Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
---
 tools/testing/selftests/resctrl/resctrlfs.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 195f04c4d158..4b9ee803a112 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -495,6 +495,8 @@ int get_mask_no_shareable(const char *cache_type, unsigned long *mask)
 		return -1;
 	if (get_shareable_mask(cache_type, &shareable_mask) < 0)
 		return -1;
+	if (full_mask == shareable_mask)
+		shareable_mask = 0;
 
 	len = count_contiguous_bits(full_mask & ~shareable_mask, &start);
 	if (!len)
-- 
2.47.3



* [RFC PATCH 5/5] kselftests/resctrl: Add support for CAT test on ARM
  2026-01-23  4:40 [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests on ARM Shaopeng Tan
                   ` (3 preceding siblings ...)
  2026-01-23  4:40 ` [RFC PATCH 4/5] kselftests/resctrl: set shareable_mask to zero if all bits are shared between software and hardware Shaopeng Tan
@ 2026-01-23  4:40 ` Shaopeng Tan
  2026-01-27 20:47   ` Ben Horgan
  2026-01-27 20:40 ` [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests " Ben Horgan
  5 siblings, 1 reply; 15+ messages in thread
From: Shaopeng Tan @ 2026-01-23  4:40 UTC (permalink / raw)
  To: fenghuay, reinette.chatre, ben.horgan, james.morse, shuah
  Cc: linux-kselftest, linux-kernel, linux-arm-kernel, tan.shaopeng

Currently, the CAT test is limited to Intel architectures.
Add cache cleaning and enable result checking for Arm architectures.

Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
---
 tools/testing/selftests/resctrl/cat_test.c | 2 +-
 tools/testing/selftests/resctrl/fill_buf.c | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index e1b30ab4cef5..58b1590695d1 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -113,7 +113,7 @@ static int check_results(struct resctrl_val_param *param, const char *cache_type
 		ret = show_results_info(sum_llc_perf_miss, bits,
 					alloc_size / 64,
 					MIN_DIFF_PERCENT_PER_BIT * (bits - 1),
-					runs, get_vendor() == ARCH_INTEL,
+					runs, (get_vendor() == ARCH_INTEL || get_vendor() == ARCH_ARM),
 					&prev_avg_llc_val);
 		if (ret)
 			fail = 1;
diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
index 19a01a52dc1a..dbbf80d22f42 100644
--- a/tools/testing/selftests/resctrl/fill_buf.c
+++ b/tools/testing/selftests/resctrl/fill_buf.c
@@ -35,6 +35,10 @@ static void cl_flush(void *p)
 #if defined(__i386) || defined(__x86_64)
 	asm volatile("clflush (%0)\n\t"
 		     : : "r"(p) : "memory");
+#elif defined(__aarch64__)
+	__asm__ __volatile__("dc civac, %0\n\t"
+		     : : "r" (p) : "memory");
+
 #endif
 }
 
-- 
2.47.3



* Re: [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests on ARM
  2026-01-23  4:40 [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests on ARM Shaopeng Tan
                   ` (4 preceding siblings ...)
  2026-01-23  4:40 ` [RFC PATCH 5/5] kselftests/resctrl: Add support for CAT test on ARM Shaopeng Tan
@ 2026-01-27 20:40 ` Ben Horgan
  2026-03-02  7:26   ` Shaopeng Tan (Fujitsu)
  5 siblings, 1 reply; 15+ messages in thread
From: Ben Horgan @ 2026-01-27 20:40 UTC (permalink / raw)
  To: Shaopeng Tan, fenghuay, reinette.chatre, james.morse, shuah
  Cc: linux-kselftest, linux-kernel, linux-arm-kernel

Hi Shaopeng,

On 1/23/26 04:40, Shaopeng Tan wrote:
> Hello Fenghua, Reinette, Ben, James, and to whom it may concern,
> 
> The MPAM driver is nearing upstream merge,
> but resctrl_test doesn't work on the Arm architecture.
> I'm actively working on a series to support CAT/NONCONT_CAT tests for the Arm. 
> (Support for MBM/MBA tests will be considered in the future.)

Great :) Having MPAM support in the resctrl kselftests will be good.

> 
> While I've modified the resctrl_test code to enable CAT on Arm,
> CAT test is failing in the NVIDIA Grace environment. 
> (I don't have any other environments.)
> Am I misunderstanding the CAT tests, or is there something specific
> about Grace that I'm overlooking? Any advice would be greatly appreciated.

IIUC the L3 cache is in the nvidia interconnect and so changing the
cache portion bitmap would correlate with events from the nvidia
interconnect pmu. However, I don't think you are using events from the
interconnect.
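
As a rough illustration of what using an interconnect PMU would involve: a perf event on a non-CPU PMU is selected by the PMU's dynamic type number, which the kernel exports in sysfs. The sketch below only reads that number. The PMU name is a caller-supplied placeholder (the name of the relevant PMU on Grace is not stated in this thread), and the function name is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read the perf type number of a named PMU from sysfs.
 * Returns the type (to be placed in perf_event_attr.type), or -1 if the
 * PMU does not exist or the file cannot be parsed.
 */
static int read_pmu_type(const char *pmu_name)
{
	char path[256];
	FILE *fp;
	int type = -1;
	int n;

	n = snprintf(path, sizeof(path),
		     "/sys/bus/event_source/devices/%s/type", pmu_name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	fp = fopen(path, "r");
	if (!fp)
		return -1;
	if (fscanf(fp, "%d", &type) != 1)
		type = -1;
	fclose(fp);
	return type;
}
```

Uncore-style PMUs generally count system-wide on a given CPU rather than per task, so the existing per-task perf_open() call in the test would probably need adjusting as well.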

> 
> First of all,
> when running CAT on Grace, I observed that cache limiting is working as expected.
> I verified this by checking "sudo cat /sys/fs/resctrl/c1/mon_data/mon_L3_*/llc_occupancy".
> Furthermore, I noticed that benchmark execution times varied directly with the limited cache size.

Good to know.

> 
> I reused the existing Intel CAT test methodology,
> that involves collecting cache miss counts via perf_event during a benchmark task and then
> verifying a correlation between the cache limit value and these miss counts.
> https://lore.kernel.org/lkml/20231215150515.36983-23-ilpo.jarvinen@linux.intel.com/#r
> 
> I'm aware that the specific cache miss numbers and CAT's impact can
> differ significantly depending on the microarchitecture or SoC.
> For Arm, we need to establish an appropriate minimum difference in LLC
> misses between a test with n+1 bits CBM to the test with n bits.
> 
> However, my experiments with Grace showed that even when I significantly
> varied the cache span size, the average LLC miss counts remained nearly unchanged.
> 
> Detailed test results as follows:
> 
> # # Starting L3_CAT test ...
> # # Mounting resctrl to "/sys/fs/resctrl"
> # # Cache size :119537664
> # # Writing benchmark parameters to resctrl FS
> # # Write schema "L3:1=fc0" to resctrl FS
> # # Write schema "L3:1=3f" to resctrl FS
> # # Write schema "L3:1=fe0" to resctrl FS
> # # Write schema "L3:1=1f" to resctrl FS
> # # Write schema "L3:1=ff0" to resctrl FS
> # # Write schema "L3:1=f" to resctrl FS
> # # Write schema "L3:1=ff8" to resctrl FS
> # # Write schema "L3:1=7" to resctrl FS
> # # Write schema "L3:1=ffc" to resctrl FS
> # # Write schema "L3:1=3" to resctrl FS
> # # Write schema "L3:1=ffe" to resctrl FS
> # # Write schema "L3:1=1" to resctrl FS
> # # Checking for pass/fail
> # # Number of bits: 6
> # # Average LLC val: 1609252
> # # Cache span (lines): 933888
> # # Fail: Check cache miss rate changed more than 4.0%
> # # Percent diff=-0.0
> # # Number of bits: 5
> # # Average LLC val: 1609038
> # # Cache span (lines): 778240
> # # Fail: Check cache miss rate changed more than 3.0%
> # # Percent diff=0.7
> # # Number of bits: 4
> # # Average LLC val: 1620802
> # # Cache span (lines): 622592
> # # Fail: Check cache miss rate changed more than 2.0%
> # # Percent diff=1.1
> # # Number of bits: 3
> # # Average LLC val: 1639214
> # # Cache span (lines): 466944
> # # Fail: Check cache miss rate changed more than 1.0%
> # # Percent diff=0.9
> # # Number of bits: 2
> # # Average LLC val: 1653470
> # # Cache span (lines): 311296
> # # Pass: Check cache miss rate changed more than 0.0%
> # # Percent diff=1.0
> # # Number of bits: 1
> # # Average LLC val: 1669618
> # # Cache span (lines): 155648
> # not ok 4 L3_CAT: test
> 
> Additionally, even with a fixed alloc buffer size(span = 119537664),
> the Average LLC value remains nearly unchanged regardless of the limited cache size.
> Furthermore, it appears that ARMV8_PMUV3_PERFCTR_L1D_CACHE_REFILL is
> mapped to PERF_COUNT_HW_CACHE_MISSES in "./drivers/perf/arm_pmuv3.c",
> to counteract this, I attempted to use the perf_event measurement event
> to ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD,
> ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL,
> and ARMV8_PMUV3_PERFCTR_L3D_CACHE_LMISS_RD,
> however, the Average LLC value still remains nearly unchanged.

I think these are from the neoverse_v2 rather than the interconnect.

> 
> My modifications to resctrl_test (for context):
> 
> diff --git a/tools/testing/selftests/resctrl/cache.c
> b/tools/testing/selftests/resctrl/cache.c
> index 9a4a6c52b14c..9f00680039c6 100644
> --- a/tools/testing/selftests/resctrl/cache.c
> +++ b/tools/testing/selftests/resctrl/cache.c
> @@ -8,7 +8,8 @@ char llc_occup_path[1024];
>  void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config)
>  {
>         memset(pea, 0, sizeof(*pea));
> -       pea->type = PERF_TYPE_HARDWARE;
> +       //pea->type = PERF_TYPE_HARDWARE;
> +       pea->type = PERF_TYPE_RAW;
>         pea->size = sizeof(*pea);
>         pea->read_format = PERF_FORMAT_GROUP;
>         pea->exclude_kernel = 1;
> diff --git a/tools/testing/selftests/resctrl/cat_test.c
> b/tools/testing/selftests/resctrl/cat_test.c
> index 58b1590695d1..3ecf22fa1983 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -8,6 +8,7 @@
>   *    Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>,
>   *    Fenghua Yu <fenghua.yu@intel.com>
>   */
> +#include "perf/arm_pmuv3.h"
>  #include "resctrl.h"
>  #include <unistd.h>
> 
> @@ -181,7 +182,11 @@ static int cat_test(const struct resctrl_test *test,
>         if (ret)
>                 goto reset_affinity;
> 
>         perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
> +       //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_L3D_CACHE);
> +       //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_LL_CACHE_MISS_RD);
> +       //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_L3D_CACHE_REFILL);
> +       //perf_event_attr_initialize(&pea, ARMV8_PMUV3_PERFCTR_L3D_CACHE_LMISS_RD);
>         perf_event_initialize_read_format(&pe_read);
>         pe_fd = perf_open(&pea, bm_pid, uparams->cpu);
>         if (pe_fd < 0) {
> @@ -276,6 +281,7 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
>         };
>         param.mask = long_mask;
>         span = cache_portion_size(cache_total_size, start_mask, full_cache_mask);
> +       //span = 119537664; //L3 cache size of my machine
> 
>         remove(param.filename);
> 
> Any insights or suggestions would be greatly appreciated.
> 
> Best regards,
> Shaopeng TAN
> 
> ---
> Shaopeng Tan (5):
>   kselftests/resctrl: Detect the ARM architecture
>   kselftests/resctrl: enable noncont_cat for MPAM
>   kselftests/resctrl: remove unnecessary exclude_idle
>   kselftests/resctrl: set shareable_mask to zero if all bits are shared
>     between software and hardware
>   kselftests/resctrl: Add support for CAT test on ARM
> 
>  tools/testing/selftests/resctrl/cache.c         | 1 -
>  tools/testing/selftests/resctrl/cat_test.c      | 5 +++--
>  tools/testing/selftests/resctrl/fill_buf.c      | 4 ++++
>  tools/testing/selftests/resctrl/resctrl.h       | 1 +
>  tools/testing/selftests/resctrl/resctrl_tests.c | 7 +++++++
>  tools/testing/selftests/resctrl/resctrlfs.c     | 2 ++
>  6 files changed, 17 insertions(+), 3 deletions(-)
> 

Thanks,

Ben



* Re: [RFC PATCH 5/5] kselftests/resctrl: Add support for CAT test on ARM
  2026-01-23  4:40 ` [RFC PATCH 5/5] kselftests/resctrl: Add support for CAT test on ARM Shaopeng Tan
@ 2026-01-27 20:47   ` Ben Horgan
  0 siblings, 0 replies; 15+ messages in thread
From: Ben Horgan @ 2026-01-27 20:47 UTC (permalink / raw)
  To: Shaopeng Tan, fenghuay, reinette.chatre, james.morse, shuah
  Cc: linux-kselftest, linux-kernel, linux-arm-kernel

Hi Shaopeng,

On 1/23/26 04:40, Shaopeng Tan wrote:
> Currently, CAT test is limited to Intel architectures.
> Add cache cleaning and enable result checking for Arm architectures.
> 
> Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
> ---
>  tools/testing/selftests/resctrl/cat_test.c | 2 +-
>  tools/testing/selftests/resctrl/fill_buf.c | 4 ++++
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index e1b30ab4cef5..58b1590695d1 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -113,7 +113,7 @@ static int check_results(struct resctrl_val_param *param, const char *cache_type
>  		ret = show_results_info(sum_llc_perf_miss, bits,
>  					alloc_size / 64,
>  					MIN_DIFF_PERCENT_PER_BIT * (bits - 1),
> -					runs, get_vendor() == ARCH_INTEL,
> +					runs, (get_vendor() == ARCH_INTEL || get_vendor() == ARCH_ARM),
>  					&prev_avg_llc_val);
>  		if (ret)
>  			fail = 1;
> diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
> index 19a01a52dc1a..dbbf80d22f42 100644
> --- a/tools/testing/selftests/resctrl/fill_buf.c
> +++ b/tools/testing/selftests/resctrl/fill_buf.c
> @@ -35,6 +35,10 @@ static void cl_flush(void *p)
>  #if defined(__i386) || defined(__x86_64)
>  	asm volatile("clflush (%0)\n\t"
>  		     : : "r"(p) : "memory");
> +#elif defined(__aarch64__)
> +	__asm__ __volatile__("dc civac, %0\n\t"
> +		     : : "r" (p) : "memory");
> +

This is only guaranteed to clean and invalidate to the point of
coherence, PoC. On Grace I expect this is L3/slc and so the cache line
there in L3/slc is likely not invalidated or pushed to DRAM.

The dsb() for synchronization is missing for aarch64 in sb().
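
A sketch of what the two helpers might look like with a barrier added for aarch64. The choice of "dsb sy" is an assumption (a narrower domain may be sufficient), the 64-byte stride is assumed, and flush_buffer() is a hypothetical wrapper. As noted above, "dc civac" only guarantees cleaning and invalidating to the PoC, so this sketch does not address eviction from L3/slc on Grace.

```c
#include <assert.h>

/* Flush one cache line containing p. */
static void cl_flush(void *p)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("clflush (%0)\n\t" : : "r"(p) : "memory");
#elif defined(__aarch64__)
	/* Clean and invalidate by VA, to the Point of Coherency only. */
	__asm__ __volatile__("dc civac, %0\n\t" : : "r"(p) : "memory");
#else
	(void)p;
#endif
}

/* Wait for preceding flushes and stores to complete. */
static void sb(void)
{
#if defined(__i386__) || defined(__x86_64__)
	__asm__ __volatile__("sfence\n\t" : : : "memory");
#elif defined(__aarch64__)
	/* Assumption: a full-system DSB is an acceptable barrier here. */
	__asm__ __volatile__("dsb sy\n\t" : : : "memory");
#endif
}

/* Flush a buffer line by line (assumed 64-byte lines), then synchronize. */
static void flush_buffer(unsigned char *buf, unsigned long size)
{
	unsigned long off;

	for (off = 0; off < size; off += 64)
		cl_flush(buf + off);
	sb();
}
```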

>  #endif
>  }
>  

Thanks,

Ben



* Re: [RFC PATCH 1/5] kselftests/resctrl: Detect the ARM architecture
  2026-01-23  4:40 ` [RFC PATCH 1/5] kselftests/resctrl: Detect the ARM architecture Shaopeng Tan
@ 2026-02-17 17:49   ` Reinette Chatre
  0 siblings, 0 replies; 15+ messages in thread
From: Reinette Chatre @ 2026-02-17 17:49 UTC (permalink / raw)
  To: Shaopeng Tan, fenghuay, ben.horgan, james.morse, shuah
  Cc: linux-kselftest, linux-kernel, linux-arm-kernel

Hi Shaopeng,

On 1/22/26 8:40 PM, Shaopeng Tan wrote:
> The resctrl test is not enabled for MPAM (ARM Memory System Resource
> Partitioning and Monitoring)
> Add processing to detect the ARM architecture.
> 
> Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
> ---
>  tools/testing/selftests/resctrl/resctrl.h       | 1 +
>  tools/testing/selftests/resctrl/resctrl_tests.c | 7 +++++++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index 3c51bdac2dfa..492d2a1c4033 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -38,6 +38,7 @@
>   */
>  #define ARCH_INTEL     1
>  #define ARCH_AMD       2
> +#define ARCH_ARM       3

Please see recent enhancement in this area:
4f4f01cc333e ("selftests/resctrl: Define CPU vendor IDs as bits to match usage")

Reinette


* Re: [RFC PATCH 2/5] kselftests/resctrl: enable noncont_cat for MPAM
  2026-01-23  4:40 ` [RFC PATCH 2/5] kselftests/resctrl: enable noncont_cat for MPAM Shaopeng Tan
@ 2026-02-17 17:52   ` Reinette Chatre
  0 siblings, 0 replies; 15+ messages in thread
From: Reinette Chatre @ 2026-02-17 17:52 UTC (permalink / raw)
  To: Shaopeng Tan, fenghuay, ben.horgan, james.morse, shuah
  Cc: linux-kselftest, linux-kernel, linux-arm-kernel

Hi Shaopeng,

On 1/22/26 8:40 PM, Shaopeng Tan wrote:
> Arm(MPAM driver) also supports non-contiguous CBM.
> So enable noncont_cat for Arm.
> 
> Signed-off-by: Shaopeng Tan <tan.shaopeng@fujitsu.com>
> ---
>  tools/testing/selftests/resctrl/cat_test.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index 94cfdba5308d..e1b30ab4cef5 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -291,7 +291,8 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
>  static bool arch_supports_noncont_cat(const struct resctrl_test *test)
>  {
>  	/* AMD always supports non-contiguous CBM. */
> -	if (get_vendor() == ARCH_AMD)
> +	/* ARM(MPAM driver) also supports non-contiguous CBM. */
> +	if (get_vendor() == ARCH_AMD || get_vendor() == ARCH_ARM)
>  		return true;

As an enhancement could you please use a local variable instead of calling
get_vendor() twice? An example of such can be seen in:
86063a2568b8 ("selftests/resctrl: Fix non-contiguous CBM check for Hygon")
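
A sketch of the suggested shape, reading the vendor once into a local variable. It assumes vendor IDs are defined as bits, as the commit title quoted in the reply to patch 1/5 suggests; the bit values, the stand-in get_vendor() and the function name are hypothetical and not taken from either referenced commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values; the real definitions may differ. */
#define ARCH_INTEL	(1U << 0)
#define ARCH_AMD	(1U << 1)
#define ARCH_ARM	(1U << 2)

/* Stand-in for the selftest's vendor detection. */
static unsigned int detected_vendor = ARCH_ARM;

static unsigned int get_vendor(void)
{
	return detected_vendor;
}

/* True for vendors assumed to always support non-contiguous CBM. */
static bool vendor_always_supports_noncont_cbm(void)
{
	unsigned int vendor = get_vendor();	/* read once, test against a bit set */

	return (vendor & (ARCH_AMD | ARCH_ARM)) != 0;
}
```

With bit-valued IDs, adding another vendor to the early return becomes a one-token change to the mask.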

Reinette


* Re: [RFC PATCH 3/5] kselftests/resctrl: remove unnecessary exclude_idle
  2026-01-23  4:40 ` [RFC PATCH 3/5] kselftests/resctrl: remove unnecessary exclude_idle Shaopeng Tan
@ 2026-02-17 17:52   ` Reinette Chatre
  2026-03-02  7:35     ` Shaopeng Tan (Fujitsu)
  0 siblings, 1 reply; 15+ messages in thread
From: Reinette Chatre @ 2026-02-17 17:52 UTC (permalink / raw)
  To: Shaopeng Tan, fenghuay, ben.horgan, james.morse, shuah
  Cc: linux-kselftest, linux-kernel, linux-arm-kernel

Hi Shaopeng,

On 1/22/26 8:40 PM, Shaopeng Tan wrote:
> The Linux manual states regarding exclude_idle: "While you can currently
> enable this for any event type, it is ignored for all but software events."
> Also, it appears exclude_idle is not supported on Arm.

Just to confirm, does "not supported on Arm" imply that perf_event_open() fails when
exclude_idle is 1? Thus encountering:
	EPERM  Returned  on  many  (but  not  all)  architectures when an unsupported
		exclude_hv, exclude_idle, exclude_user, or exclude_kernel setting is
		specified.

Reinette

* Re: [RFC PATCH 4/5] kselftests/resctrl: set shareable_mask to zero if all bits are shared between software and hardware
  2026-01-23  4:40 ` [RFC PATCH 4/5] kselftests/resctrl: set shareable_mask to zero if all bits are shared between software and hardware Shaopeng Tan
@ 2026-02-17 17:52   ` Reinette Chatre
  0 siblings, 0 replies; 15+ messages in thread
From: Reinette Chatre @ 2026-02-17 17:52 UTC (permalink / raw)
  To: Shaopeng Tan, fenghuay, ben.horgan, james.morse, shuah
  Cc: linux-kselftest, linux-kernel, linux-arm-kernel

Hi Shaopeng,

On 1/22/26 8:40 PM, Shaopeng Tan wrote:
> When all bits are shared between software and hardware, CAT test can not run.
> 
> In the case of MPAM driver, even if all bits are shared between
> hardware and software, they can be used as if software-exclusive.

How can "software-exclusive" be guaranteed in a run of the CAT test? If some
hardware happens to allocate into the cache while the CAT test runs then the
test is more likely to fail without there actually being a problem with 
resctrl.

> 
> To enable CAT, if all bits are shared between hardware and software,
> set shareable_mask to zero.

Please make this architecture-specific addition apply only to MPAM, and
accompany it with a comment.
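
One possible shape for an MPAM-only adjustment is sketched below. The helper
name, the vendor ID values and the parameters are hypothetical and are not
taken from the patch. The sketch only shows how the special case could be
confined to Arm and documented; it does not address the question above of
whether exclusive use can be guaranteed.

```c
/* Placeholder vendor IDs; the real values come from the selftest headers. */
#define ARCH_INTEL 1
#define ARCH_AMD   2
#define ARCH_ARM   4	/* hypothetical value, for illustration only */

/* Hypothetical helper: returns the shareable mask the CAT test should use. */
unsigned long effective_shareable_mask(int vendor_id,
				       unsigned long full_cache_mask,
				       unsigned long shareable_mask)
{
	/*
	 * MPAM only: when every bit is reported as shared with hardware,
	 * treat the bits as usable by software, since otherwise no bits
	 * are left for the test. Other architectures are left untouched.
	 */
	if (vendor_id == ARCH_ARM && shareable_mask == full_cache_mask)
		return 0;

	return shareable_mask;
}
```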

Reinette


* Re: [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests on ARM
  2026-01-27 20:40 ` [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests " Ben Horgan
@ 2026-03-02  7:26   ` Shaopeng Tan (Fujitsu)
  2026-03-03 17:10     ` Ben Horgan
  0 siblings, 1 reply; 15+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2026-03-02  7:26 UTC (permalink / raw)
  To: Ben Horgan, fenghuay@nvidia.com, reinette.chatre@intel.com,
	james.morse@arm.com, shuah@kernel.org
  Cc: linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org

Hello Ben,

Thank you for your reply. 
 
I've made the fixes and re-run the tests on Grace, as you advised.
I appreciate your feedback.

> This is only guaranteed to clean and invalidate to the point of
> coherence, PoC. On Grace I expect this is L3/slc and so the cache line
> there in L3/slc is likely not invalidated or pushed to DRAM.
> The dsb() for synchronization is missing for aarch64 in sb().

I added dsb() for synchronization for aarch64 as shown below.
 
@@ -27,6 +30,8 @@ static void sb(void)
 #if defined(__i386) || defined(__x86_64)
        asm volatile("sfence\n\t"
                     : : : "memory");
+#elif defined(__aarch64__)
+       __asm__ __volatile__("dsb sy\n\t" ::: "memory");
 #endif
 }
 
> IIUC the L3 cache is in the nvidia interconnect and so changing the
> cache portion bitmap would correlate with events from the nvidia
> interconnect pmu. However, I don't think you are using events from the
> interconnect.

I used the NVIDIA event "nvidia_scf_pmu/scf_cache_refill/".
 
After the above fixes, the running results are as follows: 
$ sudo ./resctrl_tests -t cat
TAP version 13
# Pass: Check kernel supports resctrl filesystem
# Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists
# resctrl filesystem not mounted
1..3
# Starting L3_CAT test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Cache size :119537664
# Writing benchmark parameters to resctrl FS
# Write schema "L3:1=fc0" to resctrl FS
# Write schema "L3:1=3f" to resctrl FS
# Write schema "L3:1=fe0" to resctrl FS
# Write schema "L3:1=1f" to resctrl FS
# Write schema "L3:1=ff0" to resctrl FS
# Write schema "L3:1=f" to resctrl FS
# Write schema "L3:1=ff8" to resctrl FS
# Write schema "L3:1=7" to resctrl FS
# Write schema "L3:1=ffc" to resctrl FS
# Write schema "L3:1=3" to resctrl FS
# Write schema "L3:1=ffe" to resctrl FS
# Write schema "L3:1=1" to resctrl FS
# Checking for pass/fail
# Number of bits: 6
# Average LLC val: 0
# Cache span (lines): 933888
# Number of bits: 5
# Average LLC val: 0
# Cache span (lines): 778240
# Number of bits: 4
# Average LLC val: 0
# Cache span (lines): 622592
# Number of bits: 3
# Average LLC val: 0
# Cache span (lines): 466944
# Number of bits: 2
# Average LLC val: 0
# Cache span (lines): 311296
# Number of bits: 1
# Average LLC val: 0
# Cache span (lines): 155648
ok 1 L3_CAT: test
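
The "Cache span (lines)" values in the log above are consistent with simple
arithmetic: the cache size is scaled by the fraction of mask bits in use and
then divided by the cache line size. The sketch below reproduces them. The
12-bit full mask (0xfff, inferred from the schemata pairs such as fc0 and 3f)
and the 64-byte line size are assumptions read off the log, and the function
names are illustrative.

```c
/* Count the set bits in a capacity bitmask. */
unsigned int count_bits(unsigned long mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Number of cache lines covered by portion_mask out of full_mask. */
unsigned long long span_lines(unsigned long long cache_size,
			      unsigned long portion_mask,
			      unsigned long full_mask,
			      unsigned int line_size)
{
	unsigned int total_bits = count_bits(full_mask);
	unsigned long long bytes;

	if (!total_bits || !line_size)
		return 0;

	bytes = cache_size * count_bits(portion_mask) / total_bits;
	return bytes / line_size;
}
```

With cache_size = 119537664, one bit of a 12-bit mask covers 9961472 bytes,
or 155648 lines of 64 bytes, and six bits cover 933888 lines, matching the
log.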

The result of the nvidia_scf_pmu/scf_cache_refill event is 0. 
I have tried various changes to the perf_event_open() parameters, such as type, read_format, PID, etc.
Although non-zero results were obtained for some parameter combinations, the expected results were not achieved in any scenario.
Are there any special requirements for the perf_event_open() parameters on Grace or on the Arm architecture?

The perf_event_open() parameters used when collecting the above results are as follows:
perf_event_open({type=PERF_TYPE_RAW, size=0x88 /* PERF_ATTR_SIZE_??? */, config=0xf1, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_GROUP, disabled=1, inherit=1, exclude_kernel=1, exclude_hv=1, precise_ip=0 /* arbitrary skid */, exclude_guest=1, exclude_callchain_kernel=1, ...}, 68508, 1, -1, PERF_FLAG_FD_CLOEXEC) = 3
Could you please give us your opinion?
 
Also, since this kselftest is for all Arm chips, we need an event common to all chips.
Do you have any ideas on what event we should collect?

Best regards,
Shaopeng TAN

* Re: [RFC PATCH 3/5] kselftests/resctrl: remove unnecessary exclude_idle
  2026-02-17 17:52   ` Reinette Chatre
@ 2026-03-02  7:35     ` Shaopeng Tan (Fujitsu)
  0 siblings, 0 replies; 15+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2026-03-02  7:35 UTC (permalink / raw)
  To: Reinette Chatre, fenghuay@nvidia.com, ben.horgan@arm.com,
	james.morse@arm.com, shuah@kernel.org
  Cc: linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org

Hello Reinette,
 
Thank you for your comments.
 
> Just to confirm, does "not supported on Arm" imply that perf_event_open() fails when
> exclude_idle is 1? Thus encountering:
>         EPERM  Returned  on  many  (but  not  all)  architectures when an unsupported
>                 exclude_hv, exclude_idle, exclude_user, or exclude_kernel setting is
>                 specified.

perf_event_open() fails and returns EOPNOTSUPP.

perf_event_open({type=PERF_TYPE_RAW, size=0x88 /* PERF_ATTR_SIZE_??? */, config=0x3, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_GROUP, disabled=1, inherit=1, exclude_kernel=1, exclude_hv=1, exclude_idle=1, precise_ip=0 /* arbitrary skid */, exclude_guest=1, exclude_callchain_kernel=1, ...}, 14260, 1, -1, PERF_FLAG_FD_CLOEXEC) = -1 EOPNOTSUPP
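
For reference, the behaviour in the trace above can be handled either by
dropping exclude_idle, as this patch does, or by retrying without it. The
sketch below shows the retry variant. It is a hypothetical alternative, not
what the series implements. The helper names are illustrative, and the
stand-in opener only mimics the EOPNOTSUPP result reported here so that the
logic can be exercised without Arm hardware.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/perf_event.h>

/* Signature shared by the real syscall wrapper and any stand-in. */
typedef long (*perf_open_fn)(struct perf_event_attr *attr, pid_t pid,
			     int cpu, int group_fd, unsigned long flags);

/* Thin wrapper: the C library provides no perf_event_open() function. */
long sys_perf_event_open(struct perf_event_attr *attr, pid_t pid,
			 int cpu, int group_fd, unsigned long flags)
{
	return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/*
 * Hypothetical helper: try with exclude_idle set, and retry without it
 * only when the request is rejected with EOPNOTSUPP.
 */
long open_counter_retry_idle(perf_open_fn open_fn,
			     struct perf_event_attr *attr,
			     pid_t pid, int cpu)
{
	long fd;

	attr->exclude_idle = 1;
	fd = open_fn(attr, pid, cpu, -1, PERF_FLAG_FD_CLOEXEC);
	if (fd < 0 && errno == EOPNOTSUPP) {
		attr->exclude_idle = 0;
		fd = open_fn(attr, pid, cpu, -1, PERF_FLAG_FD_CLOEXEC);
	}
	return fd;
}

/* Stand-in that rejects exclude_idle the way the trace above shows. */
long fake_open_no_exclude_idle(struct perf_event_attr *attr, pid_t pid,
			       int cpu, int group_fd, unsigned long flags)
{
	(void)pid;
	(void)cpu;
	(void)group_fd;
	(void)flags;

	if (attr->exclude_idle) {
		errno = EOPNOTSUPP;
		return -1;
	}
	return 3;	/* pretend file descriptor */
}
```

On a real system, sys_perf_event_open would be passed as open_fn.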

Other suggestions will be addressed in the next patch series. Thank you.
 
Best regards,
Shaopeng TAN


* Re: [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests on ARM
  2026-03-02  7:26   ` Shaopeng Tan (Fujitsu)
@ 2026-03-03 17:10     ` Ben Horgan
  0 siblings, 0 replies; 15+ messages in thread
From: Ben Horgan @ 2026-03-03 17:10 UTC (permalink / raw)
  To: Shaopeng Tan (Fujitsu), fenghuay@nvidia.com,
	reinette.chatre@intel.com, james.morse@arm.com, shuah@kernel.org
  Cc: linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org

Hi Shaopeng,

On 3/2/26 07:26, Shaopeng Tan (Fujitsu) wrote:
> Hello Ben,
> 
> Thank you for your reply. 
>  
> I've made the fixes and re-run the tests on Grace, as you advised.
> I appreciate your feedback.
> 
>> This is only guaranteed to clean and invalidate to the point of
>> coherence, PoC. On Grace I expect this is L3/slc and so the cache line
>> there in L3/slc is likely not invalidated or pushed to DRAM.
>> The dsb() for synchronization is missing for aarch64 in sb().
> 
> I added dsb() for synchronization for aarch64 as shown below.
>  
> @@ -27,6 +30,8 @@ static void sb(void)
>  #if defined(__i386) || defined(__x86_64)
>         asm volatile("sfence\n\t"
>                      : : : "memory");
> +#elif defined(__aarch64__)
> +       __asm__ __volatile__("dsb sy\n\t" ::: "memory");
>  #endif
>  }

Sorry if I wasn't clear. The dsb() is required to synchronize the clean
and invalidate operation, but that operation has no requirement to clean
and invalidate the L3/slc, since that is the PoC. So it probably only
cleans and invalidates up to L2.
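
To make the two pieces concrete, here is a sketch of a flush loop followed by
a barrier. The sb() body follows the hunk quoted above. The names cl_flush()
and mem_flush(), the use of "dc civac" on aarch64 and the 64-byte line size
are assumptions for illustration. As noted, the Arm operation is only
guaranteed to reach the Point of Coherency, so on a system where the L3/slc
is the PoC the line may remain there.

```c
#include <stddef.h>

#define CL_SIZE 64	/* assumed cache line size in bytes */

/* Flush one cache line containing address p. */
static inline void cl_flush(void *p)
{
#if defined(__i386) || defined(__x86_64)
	__asm__ __volatile__("clflush (%0)\n\t" : : "r"(p) : "memory");
#elif defined(__aarch64__)
	/* Clean and invalidate by virtual address to the Point of Coherency. */
	__asm__ __volatile__("dc civac, %0\n\t" : : "r"(p) : "memory");
#else
	(void)p;
#endif
}

/* Barrier that waits for the preceding maintenance operations. */
static inline void sb(void)
{
#if defined(__i386) || defined(__x86_64)
	__asm__ __volatile__("sfence\n\t" : : : "memory");
#elif defined(__aarch64__)
	__asm__ __volatile__("dsb sy\n\t" : : : "memory");
#endif
}

/* Flush every line of the buffer, then synchronize once at the end. */
void mem_flush(unsigned char *buf, size_t buf_size)
{
	size_t i;

	for (i = 0; i < buf_size; i += CL_SIZE)
		cl_flush(&buf[i]);

	sb();
}
```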

>  
>> IIUC the L3 cache is in the nvidia interconnect and so changing the
>> cache portion bitmap would correlate with events from the nvidia
>> interconnect pmu. However, I don't think you are using events from the
>> interconnect.
> 
> I used the NVIDIA event "nvidia_scf_pmu/scf_cache_refill/".
>  
> After the above fixes, the running results are as follows: 
> $ sudo ./resctrl_tests -t cat
> TAP version 13
> # Pass: Check kernel supports resctrl filesystem
> # Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists
> # resctrl filesystem not mounted
> 1..3
> # Starting L3_CAT test ...
> # Mounting resctrl to "/sys/fs/resctrl"
> # Cache size :119537664
> # Writing benchmark parameters to resctrl FS
> # Write schema "L3:1=fc0" to resctrl FS
> # Write schema "L3:1=3f" to resctrl FS
> # Write schema "L3:1=fe0" to resctrl FS
> # Write schema "L3:1=1f" to resctrl FS
> # Write schema "L3:1=ff0" to resctrl FS
> # Write schema "L3:1=f" to resctrl FS
> # Write schema "L3:1=ff8" to resctrl FS
> # Write schema "L3:1=7" to resctrl FS
> # Write schema "L3:1=ffc" to resctrl FS
> # Write schema "L3:1=3" to resctrl FS
> # Write schema "L3:1=ffe" to resctrl FS
> # Write schema "L3:1=1" to resctrl FS
> # Checking for pass/fail
> # Number of bits: 6
> # Average LLC val: 0
> # Cache span (lines): 933888
> # Number of bits: 5
> # Average LLC val: 0
> # Cache span (lines): 778240
> # Number of bits: 4
> # Average LLC val: 0
> # Cache span (lines): 622592
> # Number of bits: 3
> # Average LLC val: 0
> # Cache span (lines): 466944
> # Number of bits: 2
> # Average LLC val: 0
> # Cache span (lines): 311296
> # Number of bits: 1
> # Average LLC val: 0
> # Cache span (lines): 155648
> ok 1 L3_CAT: test
> 
> The result of the nvidia_scf_pmu/scf_cache_refill event is 0. 
> I have tried various changes to the perf_event_open() parameters, such as type, read_format, PID, etc.
> Although non-zero results were obtained for some parameter combinations, the expected results were not achieved in any scenario. 

Could this be because the clean and invalidate doesn't affect the slc/L3?

> Are there any special requirements for the perf_event_open() parameters on Grace or on the Arm architecture?

I'm not sure.
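
One detail that may be worth checking: the perf_event_open(2) man page
describes PMUs exported under /sys/bus/event_source/devices, where the
integer in the PMU's "type" file is used as attr.type. The trace quoted
below uses type=PERF_TYPE_RAW instead. Whether this accounts for the zero
counts on Grace is not established in this thread. The sketch below only
shows the lookup; the function name is illustrative, and the exact PMU
instance name on a given system is an assumption that should be taken from
the directory listing.

```c
#include <stdio.h>

/*
 * Read the dynamic PMU type for a named PMU from sysfs.
 * Returns the type value, or -1 if the PMU is not present or unreadable.
 */
int read_pmu_type(const char *pmu_name)
{
	char path[256];
	FILE *fp;
	int type;
	int n;

	n = snprintf(path, sizeof(path),
		     "/sys/bus/event_source/devices/%s/type", pmu_name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	fp = fopen(path, "r");
	if (!fp)
		return -1;

	if (fscanf(fp, "%d", &type) != 1)
		type = -1;

	fclose(fp);
	return type;
}
```

The returned value would be assigned to attr.type before calling
perf_event_open(), with attr.config carrying the event number.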

> 
> The perf_event_open() parameters used when collecting the above results are as follows:
> perf_event_open({type=PERF_TYPE_RAW, size=0x88 /* PERF_ATTR_SIZE_??? */, config=0xf1, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_GROUP, disabled=1, inherit=1, exclude_kernel=1, exclude_hv=1, precise_ip=0 /* arbitrary skid */, exclude_guest=1, exclude_callchain_kernel=1, ...}, 68508, 1, -1, PERF_FLAG_FD_CLOEXEC) = 3
> Could you please give us your opinion?
>  
> Also, since this kselftest is for all Arm chips, we need an event common to all chips.
> Do you have any ideas on what event we should collect?

I don't think there is any common event. Perhaps the event to test
against could be passed to the test as an input?
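
As a rough illustration of taking the event as an input, the sketch below
parses a string of the form "<pmu>:<config>" into a PMU name and a numeric
config. The format, the function name and the example PMU instance name are
all hypothetical; the selftest has no such option in this thread.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<pmu>:<config>", e.g. "some_pmu:0xf1".
 * Returns 0 on success and -1 on malformed input.
 */
int parse_event_arg(const char *arg, char *pmu, size_t pmu_len,
		    unsigned long long *config)
{
	const char *sep = strchr(arg, ':');
	size_t name_len;
	char *end;

	/* Require a non-empty name and a non-empty config. */
	if (!sep || sep == arg || sep[1] == '\0')
		return -1;

	name_len = (size_t)(sep - arg);
	if (name_len >= pmu_len)
		return -1;

	/* Base 0 accepts decimal, octal and 0x-prefixed hexadecimal. */
	errno = 0;
	*config = strtoull(sep + 1, &end, 0);
	if (errno || *end != '\0')
		return -1;

	memcpy(pmu, arg, name_len);
	pmu[name_len] = '\0';
	return 0;
}
```

The parsed name could then be used to look up the PMU type in sysfs, and the
config placed in attr.config.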

> 
> Best regards,
> Shaopeng TAN

Thanks,

Ben


Thread overview: 15+ messages
2026-01-23  4:40 [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests on ARM Shaopeng Tan
2026-01-23  4:40 ` [RFC PATCH 1/5] kselftests/resctrl: Detect the ARM architecture Shaopeng Tan
2026-02-17 17:49   ` Reinette Chatre
2026-01-23  4:40 ` [RFC PATCH 2/5] kselftests/resctrl: enable noncont_cat for MPAM Shaopeng Tan
2026-02-17 17:52   ` Reinette Chatre
2026-01-23  4:40 ` [RFC PATCH 3/5] kselftests/resctrl: remove unnecessary exclude_idle Shaopeng Tan
2026-02-17 17:52   ` Reinette Chatre
2026-03-02  7:35     ` Shaopeng Tan (Fujitsu)
2026-01-23  4:40 ` [RFC PATCH 4/5] kselftests/resctrl: set shareable_mask to zero if all bits are shared between software and hardware Shaopeng Tan
2026-02-17 17:52   ` Reinette Chatre
2026-01-23  4:40 ` [RFC PATCH 5/5] kselftests/resctrl: Add support for CAT test on ARM Shaopeng Tan
2026-01-27 20:47   ` Ben Horgan
2026-01-27 20:40 ` [RFC PATCH 0/5] kselftest/resctrl: Enable CAT and NONCONT_CAT tests " Ben Horgan
2026-03-02  7:26   ` Shaopeng Tan (Fujitsu)
2026-03-03 17:10     ` Ben Horgan
