* [PATCH v2 1/9] selftests/resctrl: Improve accuracy of cache occupancy test
2026-03-04 0:19 [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Reinette Chatre
@ 2026-03-04 0:19 ` Reinette Chatre
2026-03-06 9:47 ` Ilpo Järvinen
2026-03-04 0:19 ` [PATCH v2 2/9] selftests/resctrl: Do not store iMC counter value in counter config structure Reinette Chatre
` (8 subsequent siblings)
9 siblings, 1 reply; 20+ messages in thread
From: Reinette Chatre @ 2026-03-04 0:19 UTC (permalink / raw)
To: shuah, Dave.Martin, james.morse, tony.luck, babu.moger,
ilpo.jarvinen
Cc: fenghuay, peternewman, zide.chen, dapeng1.mi, ben.horgan,
yu.c.chen, reinette.chatre, linux-kselftest, linux-kernel,
patches
Dave Martin reported inconsistent CMT test failures. In one experiment
the first run of the CMT test failed because of too large (24%) difference
between measured and achievable cache occupancy while the second run passed
with an acceptable 4% difference.
The CMT test is susceptible to interference from the rest of the system.
This can be demonstrated with a utility like stress-ng by running the CMT
test while introducing cache misses using:
stress-ng --matrix-3d 0 --matrix-3d-zyx
Below shows an example of the CMT test failing because of a significant
difference between measured and achievable cache occupancy when run with
interference:
# Starting CMT test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Cache size :56623104
# Writing benchmark parameters to resctrl FS
# Benchmark PID: 3275
# Checking for pass/fail
# Fail: Check cache miss rate within 15%
# Percent diff=97
# Number of bits: 5
# Average LLC val: 501350
# Cache span (bytes): 23592960
not ok 1 CMT: test
The CMT test creates a new control group that is also capable of monitoring
and assigns the workload to it. The workload allocates a buffer that by
default fills a portion of the L3 and keeps reading from the buffer,
measuring the L3 occupancy at intervals. The test passes if the workload's
L3 occupancy is within 15% of the buffer size.
By not adjusting any capacity bitmasks the workload shares the cache with
the rest of the system. Any other task that may be running could evict
the workload's data from the cache causing it to have low cache occupancy.
Reduce interference from the rest of the system by ensuring that the
workload's control group uses the capacity bitmask found in the user
parameters for L3 and that the rest of the system can only allocate into
the inverse of the workload's L3 cache portion. Other tasks can thus no
longer evict the workload's data from L3.
Take the L2 cache into account to further improve test accuracy.
By default the buffer size is the same as the L3 portion that the workload
can allocate into. This buffer size does not take into account that some
of the workload's data may land in L2/L1. Address this in two ways:
- Reduce the amount of L2 cache the workload can allocate into to the
minimum on systems that support L2 cache allocation.
- Increase the buffer size to accommodate data that may be allocated into
the L2 cache. Use a buffer size double the L3 portion to keep using the
L3 portion size as goal for L3 occupancy while taking into account that
some of the data may be in L2.
With the above adjustments the CMT test is more consistent. Repeating the
CMT test while generating interference with stress-ng on a sample
system after applying the fixes shows significant improvement in test
accuracy:
# Starting CMT test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Cache size :56623104
# Writing benchmark parameters to resctrl FS
# Write schema "L3:0=fe0" to resctrl FS
# Write schema "L3:0=1f" to resctrl FS
# Benchmark PID: 3223
# Checking for pass/fail
# Pass: Check cache miss rate within 15%
# Percent diff=3
# Number of bits: 5
# Average LLC val: 22811443
# Cache span (bytes): 23592960
ok 1 CMT: test
Reported-by: Dave Martin <Dave.Martin@arm.com>
Closes: https://lore.kernel.org/lkml/aO+7MeSMV29VdbQs@e133380.arm.com/
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since v1:
- Fix typo in changelog: "data my be in L2" -> "data may be in L2".
---
tools/testing/selftests/resctrl/cmt_test.c | 35 ++++++++++++++++---
tools/testing/selftests/resctrl/mba_test.c | 4 ++-
tools/testing/selftests/resctrl/mbm_test.c | 4 ++-
tools/testing/selftests/resctrl/resctrl.h | 4 ++-
tools/testing/selftests/resctrl/resctrl_val.c | 2 +-
5 files changed, 41 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index d09e693dc739..44e9938dfafd 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -19,12 +19,39 @@
#define CON_MON_LCC_OCCUP_PATH \
"%s/%s/mon_data/mon_L3_%02d/llc_occupancy"
-static int cmt_init(const struct resctrl_val_param *param, int domain_id)
+/*
+ * Initialize capacity bitmasks (CBMs) for control group being tested,
+ * default resource group to prevent its tasks from interfering with test,
+ * and L2 resource of control group to minimize allocations into L2 if
+ * possible to better predict L3 occupancy.
+ */
+static int cmt_init(const struct resctrl_test *test,
+ const struct user_params *uparams,
+ const struct resctrl_val_param *param, int domain_id)
{
+ unsigned long long_mask;
+ char schemata[64];
+ int ret;
+
sprintf(llc_occup_path, CON_MON_LCC_OCCUP_PATH, RESCTRL_PATH,
param->ctrlgrp, domain_id);
- return 0;
+ ret = get_full_cbm(test->resource, &long_mask);
+ if (ret)
+ return ret;
+
+ snprintf(schemata, sizeof(schemata), "%lx", ~param->mask & long_mask);
+ ret = write_schemata("", schemata, uparams->cpu, test->resource);
+ if (ret)
+ return ret;
+
+ snprintf(schemata, sizeof(schemata), "%lx", param->mask);
+ ret = write_schemata(param->ctrlgrp, schemata, uparams->cpu, test->resource);
+
+ if (!ret && !strcmp(test->resource, "L3") && resctrl_resource_exists("L2"))
+ ret = write_schemata(param->ctrlgrp, "0x1", uparams->cpu, "L2");
+
+ return ret;
}
static int cmt_setup(const struct resctrl_test *test,
@@ -153,11 +180,11 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param
span = cache_portion_size(cache_total_size, param.mask, long_mask);
if (uparams->fill_buf) {
- fill_buf.buf_size = span;
+ fill_buf.buf_size = span * 2;
fill_buf.memflush = uparams->fill_buf->memflush;
param.fill_buf = &fill_buf;
} else if (!uparams->benchmark_cmd[0]) {
- fill_buf.buf_size = span;
+ fill_buf.buf_size = span * 2;
fill_buf.memflush = true;
param.fill_buf = &fill_buf;
}
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index c7e9adc0368f..cd4c715b7ffd 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -17,7 +17,9 @@
#define ALLOCATION_MIN 10
#define ALLOCATION_STEP 10
-static int mba_init(const struct resctrl_val_param *param, int domain_id)
+static int mba_init(const struct resctrl_test *test,
+ const struct user_params *uparams,
+ const struct resctrl_val_param *param, int domain_id)
{
int ret;
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 84d8bc250539..58201f844740 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -83,7 +83,9 @@ static int check_results(size_t span)
return ret;
}
-static int mbm_init(const struct resctrl_val_param *param, int domain_id)
+static int mbm_init(const struct resctrl_test *test,
+ const struct user_params *uparams,
+ const struct resctrl_val_param *param, int domain_id)
{
int ret;
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index afe635b6e48d..c72045c74ac4 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -135,7 +135,9 @@ struct resctrl_val_param {
char filename[64];
unsigned long mask;
int num_of_runs;
- int (*init)(const struct resctrl_val_param *param,
+ int (*init)(const struct resctrl_test *test,
+ const struct user_params *uparams,
+ const struct resctrl_val_param *param,
int domain_id);
int (*setup)(const struct resctrl_test *test,
const struct user_params *uparams,
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index 7c08e936572d..a5a8badb83d4 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -569,7 +569,7 @@ int resctrl_val(const struct resctrl_test *test,
goto reset_affinity;
if (param->init) {
- ret = param->init(param, domain_id);
+ ret = param->init(test, uparams, param, domain_id);
if (ret)
goto reset_affinity;
}
--
2.50.1
* Re: [PATCH v2 1/9] selftests/resctrl: Improve accuracy of cache occupancy test
2026-03-04 0:19 ` [PATCH v2 1/9] selftests/resctrl: Improve accuracy of cache occupancy test Reinette Chatre
@ 2026-03-06 9:47 ` Ilpo Järvinen
2026-03-06 19:24 ` Reinette Chatre
0 siblings, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2026-03-06 9:47 UTC (permalink / raw)
To: Reinette Chatre
Cc: shuah, Dave.Martin, james.morse, tony.luck, babu.moger, fenghuay,
peternewman, zide.chen, dapeng1.mi, ben.horgan, yu.c.chen,
linux-kselftest, LKML, patches
On Tue, 3 Mar 2026, Reinette Chatre wrote:
> Dave Martin reported inconsistent CMT test failures. In one experiment
> the first run of the CMT test failed because of too large (24%) difference
> between measured and achievable cache occupancy while the second run passed
> with an acceptable 4% difference.
>
> The CMT test is susceptible to interference from the rest of the system.
> This can be demonstrated with a utility like stress-ng by running the CMT
> test while introducing cache misses using:
>
> stress-ng --matrix-3d 0 --matrix-3d-zyx
>
> Below shows an example of the CMT test failing because of a significant
> difference between measured and achievable cache occupancy when run with
> interference:
> # Starting CMT test ...
> # Mounting resctrl to "/sys/fs/resctrl"
> # Cache size :56623104
> # Writing benchmark parameters to resctrl FS
> # Benchmark PID: 3275
> # Checking for pass/fail
> # Fail: Check cache miss rate within 15%
> # Percent diff=97
> # Number of bits: 5
> # Average LLC val: 501350
> # Cache span (bytes): 23592960
> not ok 1 CMT: test
>
> The CMT test creates a new control group that is also capable of monitoring
> and assigns the workload to it. The workload allocates a buffer that by
> default fills a portion of the L3 and keeps reading from the buffer,
> measuring the L3 occupancy at intervals. The test passes if the workload's
> L3 occupancy is within 15% of the buffer size.
>
> By not adjusting any capacity bitmasks the workload shares the cache with
> the rest of the system. Any other task that may be running could evict
> the workload's data from the cache causing it to have low cache occupancy.
>
> Reduce interference from the rest of the system by ensuring that the
> workload's control group uses the capacity bitmask found in the user
> parameters for L3 and that the rest of the system can only allocate into
> the inverse of the workload's L3 cache portion. Other tasks can thus no
> longer evict the workload's data from L3.
>
> Take the L2 cache into account to further improve test accuracy.
> By default the buffer size is the same as the L3 portion that the workload
> can allocate into. This buffer size does not take into account that some
> of the workload's data may land in L2/L1. Address this in two ways:
> - Reduce the amount of L2 cache the workload can allocate into to the
"into to the" sounds wrong.
> minimum on systems that support L2 cache allocation.
> - Increase the buffer size to accommodate data that may be allocated into
> the L2 cache. Use a buffer size double the L3 portion to keep using the
> L3 portion size as goal for L3 occupancy while taking into account that
> some of the data may be in L2.
To me this control over L2 looks logically pretty separate from the
inverse of L3 portion control, so it would seem to logically belong in a
separate change.
> With the above adjustments the CMT test is more consistent. Repeating the
> CMT test while generating interference with stress-ng on a sample
> system after applying the fixes shows significant improvement in test
> accuracy:
>
> # Starting CMT test ...
> # Mounting resctrl to "/sys/fs/resctrl"
> # Cache size :56623104
> # Writing benchmark parameters to resctrl FS
> # Write schema "L3:0=fe0" to resctrl FS
> # Write schema "L3:0=1f" to resctrl FS
> # Benchmark PID: 3223
> # Checking for pass/fail
> # Pass: Check cache miss rate within 15%
> # Percent diff=3
> # Number of bits: 5
> # Average LLC val: 22811443
> # Cache span (bytes): 23592960
> ok 1 CMT: test
>
> Reported-by: Dave Martin <Dave.Martin@arm.com>
> Closes: https://lore.kernel.org/lkml/aO+7MeSMV29VdbQs@e133380.arm.com/
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> Changes since v1:
> - Fix typo in changelog: "data my be in L2" -> "data may be in L2".
> ---
> tools/testing/selftests/resctrl/cmt_test.c | 35 ++++++++++++++++---
> tools/testing/selftests/resctrl/mba_test.c | 4 ++-
> tools/testing/selftests/resctrl/mbm_test.c | 4 ++-
> tools/testing/selftests/resctrl/resctrl.h | 4 ++-
> tools/testing/selftests/resctrl/resctrl_val.c | 2 +-
> 5 files changed, 41 insertions(+), 8 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
> index d09e693dc739..44e9938dfafd 100644
> --- a/tools/testing/selftests/resctrl/cmt_test.c
> +++ b/tools/testing/selftests/resctrl/cmt_test.c
> @@ -19,12 +19,39 @@
> #define CON_MON_LCC_OCCUP_PATH \
> "%s/%s/mon_data/mon_L3_%02d/llc_occupancy"
>
> -static int cmt_init(const struct resctrl_val_param *param, int domain_id)
> +/*
> + * Initialize capacity bitmasks (CBMs) for control group being tested,
> + * default resource group to prevent its tasks from interfering with test,
> + * and L2 resource of control group to minimize allocations into L2 if
> + * possible to better predict L3 occupancy.
> + */
> +static int cmt_init(const struct resctrl_test *test,
> + const struct user_params *uparams,
> + const struct resctrl_val_param *param, int domain_id)
> {
> + unsigned long long_mask;
> + char schemata[64];
> + int ret;
> +
> sprintf(llc_occup_path, CON_MON_LCC_OCCUP_PATH, RESCTRL_PATH,
> param->ctrlgrp, domain_id);
>
> - return 0;
> + ret = get_full_cbm(test->resource, &long_mask);
> + if (ret)
> + return ret;
> +
> + snprintf(schemata, sizeof(schemata), "%lx", ~param->mask & long_mask);
I don't know why this variable is called "long_mask"; the type seems
pretty unrelated to its use. Perhaps change it to e.g. full_mask?
Seems otherwise fine AFAICT.
--
i.
> + ret = write_schemata("", schemata, uparams->cpu, test->resource);
> + if (ret)
> + return ret;
> +
> + snprintf(schemata, sizeof(schemata), "%lx", param->mask);
> + ret = write_schemata(param->ctrlgrp, schemata, uparams->cpu, test->resource);
> +
> + if (!ret && !strcmp(test->resource, "L3") && resctrl_resource_exists("L2"))
> + ret = write_schemata(param->ctrlgrp, "0x1", uparams->cpu, "L2");
> +
> + return ret;
> }
>
> static int cmt_setup(const struct resctrl_test *test,
> @@ -153,11 +180,11 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param
> span = cache_portion_size(cache_total_size, param.mask, long_mask);
>
> if (uparams->fill_buf) {
> - fill_buf.buf_size = span;
> + fill_buf.buf_size = span * 2;
> fill_buf.memflush = uparams->fill_buf->memflush;
> param.fill_buf = &fill_buf;
> } else if (!uparams->benchmark_cmd[0]) {
> - fill_buf.buf_size = span;
> + fill_buf.buf_size = span * 2;
> fill_buf.memflush = true;
> param.fill_buf = &fill_buf;
> }
> diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
> index c7e9adc0368f..cd4c715b7ffd 100644
> --- a/tools/testing/selftests/resctrl/mba_test.c
> +++ b/tools/testing/selftests/resctrl/mba_test.c
> @@ -17,7 +17,9 @@
> #define ALLOCATION_MIN 10
> #define ALLOCATION_STEP 10
>
> -static int mba_init(const struct resctrl_val_param *param, int domain_id)
> +static int mba_init(const struct resctrl_test *test,
> + const struct user_params *uparams,
> + const struct resctrl_val_param *param, int domain_id)
> {
> int ret;
>
> diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
> index 84d8bc250539..58201f844740 100644
> --- a/tools/testing/selftests/resctrl/mbm_test.c
> +++ b/tools/testing/selftests/resctrl/mbm_test.c
> @@ -83,7 +83,9 @@ static int check_results(size_t span)
> return ret;
> }
>
> -static int mbm_init(const struct resctrl_val_param *param, int domain_id)
> +static int mbm_init(const struct resctrl_test *test,
> + const struct user_params *uparams,
> + const struct resctrl_val_param *param, int domain_id)
> {
> int ret;
>
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index afe635b6e48d..c72045c74ac4 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -135,7 +135,9 @@ struct resctrl_val_param {
> char filename[64];
> unsigned long mask;
> int num_of_runs;
> - int (*init)(const struct resctrl_val_param *param,
> + int (*init)(const struct resctrl_test *test,
> + const struct user_params *uparams,
> + const struct resctrl_val_param *param,
> int domain_id);
> int (*setup)(const struct resctrl_test *test,
> const struct user_params *uparams,
> diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
> index 7c08e936572d..a5a8badb83d4 100644
> --- a/tools/testing/selftests/resctrl/resctrl_val.c
> +++ b/tools/testing/selftests/resctrl/resctrl_val.c
> @@ -569,7 +569,7 @@ int resctrl_val(const struct resctrl_test *test,
> goto reset_affinity;
>
> if (param->init) {
> - ret = param->init(param, domain_id);
> + ret = param->init(test, uparams, param, domain_id);
> if (ret)
> goto reset_affinity;
> }
>
* Re: [PATCH v2 1/9] selftests/resctrl: Improve accuracy of cache occupancy test
2026-03-06 9:47 ` Ilpo Järvinen
@ 2026-03-06 19:24 ` Reinette Chatre
2026-03-09 7:44 ` Ilpo Järvinen
0 siblings, 1 reply; 20+ messages in thread
From: Reinette Chatre @ 2026-03-06 19:24 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: shuah, Dave.Martin, james.morse, tony.luck, babu.moger, fenghuay,
peternewman, zide.chen, dapeng1.mi, ben.horgan, yu.c.chen,
linux-kselftest, LKML, patches
Hi Ilpo,
On 3/6/26 1:47 AM, Ilpo Järvinen wrote:
> On Tue, 3 Mar 2026, Reinette Chatre wrote:
>
>> Dave Martin reported inconsistent CMT test failures. In one experiment
>> the first run of the CMT test failed because of too large (24%) difference
>> between measured and achievable cache occupancy while the second run passed
>> with an acceptable 4% difference.
>>
>> The CMT test is susceptible to interference from the rest of the system.
>> This can be demonstrated with a utility like stress-ng by running the CMT
>> test while introducing cache misses using:
>>
>> stress-ng --matrix-3d 0 --matrix-3d-zyx
>>
>> Below shows an example of the CMT test failing because of a significant
>> difference between measured and achievable cache occupancy when run with
>> interference:
>> # Starting CMT test ...
>> # Mounting resctrl to "/sys/fs/resctrl"
>> # Cache size :56623104
>> # Writing benchmark parameters to resctrl FS
>> # Benchmark PID: 3275
>> # Checking for pass/fail
>> # Fail: Check cache miss rate within 15%
>> # Percent diff=97
>> # Number of bits: 5
>> # Average LLC val: 501350
>> # Cache span (bytes): 23592960
>> not ok 1 CMT: test
>>
>> The CMT test creates a new control group that is also capable of monitoring
>> and assigns the workload to it. The workload allocates a buffer that by
>> default fills a portion of the L3 and keeps reading from the buffer,
>> measuring the L3 occupancy at intervals. The test passes if the workload's
>> L3 occupancy is within 15% of the buffer size.
>>
>> By not adjusting any capacity bitmasks the workload shares the cache with
>> the rest of the system. Any other task that may be running could evict
>> the workload's data from the cache causing it to have low cache occupancy.
>>
>> Reduce interference from the rest of the system by ensuring that the
>> workload's control group uses the capacity bitmask found in the user
>> parameters for L3 and that the rest of the system can only allocate into
>> the inverse of the workload's L3 cache portion. Other tasks can thus no
>> longer evict the workload's data from L3.
>>
>> Take the L2 cache into account to further improve test accuracy.
>> By default the buffer size is the same as the L3 portion that the workload
>> can allocate into. This buffer size does not take into account that some
>> of the workload's data may land in L2/L1. Address this in two ways:
>> - Reduce the amount of L2 cache the workload can allocate into to the
>
> "into to the" sounds wrong.
How about:
"Reduce the workload's L2 cache allocation to the minimum on systems that
support L2 cache allocation."
>
>> minimum on systems that support L2 cache allocation.
>> - Increase the buffer size to accommodate data that may be allocated into
>> the L2 cache. Use a buffer size double the L3 portion to keep using the
>> L3 portion size as goal for L3 occupancy while taking into account that
>> some of the data may be in L2.
>
> To me this control over L2 looks logically pretty separate from the
> inverse of L3 portion control, so it would seem to logically belong in a
> separate change.
Sure. Both adjustments are needed to get the results described below, so I'll
move all descriptions surrounding the before/after performance numbers to the
cover letter. Looks like partial fixes cannot use the Closes tag, so instead of
Closes I'll use the Link tag in both patches to point to the original report.
>
>> With the above adjustments the CMT test is more consistent. Repeating the
>> CMT test while generating interference with stress-ng on a sample
>> system after applying the fixes shows significant improvement in test
>> accuracy:
>>
>> # Starting CMT test ...
>> # Mounting resctrl to "/sys/fs/resctrl"
>> # Cache size :56623104
>> # Writing benchmark parameters to resctrl FS
>> # Write schema "L3:0=fe0" to resctrl FS
>> # Write schema "L3:0=1f" to resctrl FS
>> # Benchmark PID: 3223
>> # Checking for pass/fail
>> # Pass: Check cache miss rate within 15%
>> # Percent diff=3
>> # Number of bits: 5
>> # Average LLC val: 22811443
>> # Cache span (bytes): 23592960
>> ok 1 CMT: test
>>
>> Reported-by: Dave Martin <Dave.Martin@arm.com>
>> Closes: https://lore.kernel.org/lkml/aO+7MeSMV29VdbQs@e133380.arm.com/
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
>> ---
>> Changes since v1:
>> - Fix typo in changelog: "data my be in L2" -> "data may be in L2".
>> ---
>> tools/testing/selftests/resctrl/cmt_test.c | 35 ++++++++++++++++---
>> tools/testing/selftests/resctrl/mba_test.c | 4 ++-
>> tools/testing/selftests/resctrl/mbm_test.c | 4 ++-
>> tools/testing/selftests/resctrl/resctrl.h | 4 ++-
>> tools/testing/selftests/resctrl/resctrl_val.c | 2 +-
>> 5 files changed, 41 insertions(+), 8 deletions(-)
>>
>> diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
>> index d09e693dc739..44e9938dfafd 100644
>> --- a/tools/testing/selftests/resctrl/cmt_test.c
>> +++ b/tools/testing/selftests/resctrl/cmt_test.c
>> @@ -19,12 +19,39 @@
>> #define CON_MON_LCC_OCCUP_PATH \
>> "%s/%s/mon_data/mon_L3_%02d/llc_occupancy"
>>
>> -static int cmt_init(const struct resctrl_val_param *param, int domain_id)
>> +/*
>> + * Initialize capacity bitmasks (CBMs) for control group being tested,
>> + * default resource group to prevent its tasks from interfering with test,
>> + * and L2 resource of control group to minimize allocations into L2 if
>> + * possible to better predict L3 occupancy.
>> + */
>> +static int cmt_init(const struct resctrl_test *test,
>> + const struct user_params *uparams,
>> + const struct resctrl_val_param *param, int domain_id)
>> {
>> + unsigned long long_mask;
>> + char schemata[64];
>> + int ret;
>> +
>> sprintf(llc_occup_path, CON_MON_LCC_OCCUP_PATH, RESCTRL_PATH,
>> param->ctrlgrp, domain_id);
>>
>> - return 0;
>> + ret = get_full_cbm(test->resource, &long_mask);
>> + if (ret)
>> + return ret;
>> +
>> + snprintf(schemata, sizeof(schemata), "%lx", ~param->mask & long_mask);
>
> I don't know why this variable is called "long_mask"; the type seems
> pretty unrelated to its use. Perhaps change it to e.g. full_mask?
It is called long_mask to be consistent with the same value used elsewhere
in this file. This is not required. I can rename to full_mask.
>
> Seems otherwise fine AFAICT.
>
Thank you very much for taking a look.
Reinette
* Re: [PATCH v2 1/9] selftests/resctrl: Improve accuracy of cache occupancy test
2026-03-06 19:24 ` Reinette Chatre
@ 2026-03-09 7:44 ` Ilpo Järvinen
0 siblings, 0 replies; 20+ messages in thread
From: Ilpo Järvinen @ 2026-03-09 7:44 UTC (permalink / raw)
To: Reinette Chatre
Cc: shuah, Dave.Martin, james.morse, tony.luck, babu.moger, fenghuay,
peternewman, zide.chen, dapeng1.mi, ben.horgan, yu.c.chen,
linux-kselftest, LKML, patches
On Fri, 6 Mar 2026, Reinette Chatre wrote:
> On 3/6/26 1:47 AM, Ilpo Järvinen wrote:
> > On Tue, 3 Mar 2026, Reinette Chatre wrote:
> >
> >> Dave Martin reported inconsistent CMT test failures. In one experiment
> >> the first run of the CMT test failed because of too large (24%) difference
> >> between measured and achievable cache occupancy while the second run passed
> >> with an acceptable 4% difference.
> >>
> >> The CMT test is susceptible to interference from the rest of the system.
> >> This can be demonstrated with a utility like stress-ng by running the CMT
> >> test while introducing cache misses using:
> >>
> >> stress-ng --matrix-3d 0 --matrix-3d-zyx
> >>
> >> Below shows an example of the CMT test failing because of a significant
> >> difference between measured and achievable cache occupancy when run with
> >> interference:
> >> # Starting CMT test ...
> >> # Mounting resctrl to "/sys/fs/resctrl"
> >> # Cache size :56623104
> >> # Writing benchmark parameters to resctrl FS
> >> # Benchmark PID: 3275
> >> # Checking for pass/fail
> >> # Fail: Check cache miss rate within 15%
> >> # Percent diff=97
> >> # Number of bits: 5
> >> # Average LLC val: 501350
> >> # Cache span (bytes): 23592960
> >> not ok 1 CMT: test
> >>
> >> The CMT test creates a new control group that is also capable of monitoring
> >> and assigns the workload to it. The workload allocates a buffer that by
> >> default fills a portion of the L3 and keeps reading from the buffer,
> >> measuring the L3 occupancy at intervals. The test passes if the workload's
> >> L3 occupancy is within 15% of the buffer size.
> >>
> >> By not adjusting any capacity bitmasks the workload shares the cache with
> >> the rest of the system. Any other task that may be running could evict
> >> the workload's data from the cache causing it to have low cache occupancy.
> >>
> >> Reduce interference from the rest of the system by ensuring that the
> >> workload's control group uses the capacity bitmask found in the user
> >> parameters for L3 and that the rest of the system can only allocate into
> >> the inverse of the workload's L3 cache portion. Other tasks can thus no
> >> longer evict the workload's data from L3.
> >>
> >> Take the L2 cache into account to further improve test accuracy.
> >> By default the buffer size is the same as the L3 portion that the workload
> >> can allocate into. This buffer size does not take into account that some
> >> of the workload's data may land in L2/L1. Address this in two ways:
> >> - Reduce the amount of L2 cache the workload can allocate into to the
> >
> > "into to the" sounds wrong.
>
> How about:
> "Reduce the workload's L2 cache allocation to the minimum on systems that
> support L2 cache allocation."
Works for me.
--
i.
* [PATCH v2 2/9] selftests/resctrl: Do not store iMC counter value in counter config structure
2026-03-04 0:19 [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Reinette Chatre
2026-03-04 0:19 ` [PATCH v2 1/9] selftests/resctrl: Improve accuracy of cache occupancy test Reinette Chatre
@ 2026-03-04 0:19 ` Reinette Chatre
2026-03-06 9:51 ` Ilpo Järvinen
2026-03-04 0:19 ` [PATCH v2 3/9] selftests/resctrl: Prepare for parsing multiple events per iMC Reinette Chatre
` (7 subsequent siblings)
9 siblings, 1 reply; 20+ messages in thread
From: Reinette Chatre @ 2026-03-04 0:19 UTC (permalink / raw)
To: shuah, Dave.Martin, james.morse, tony.luck, babu.moger,
ilpo.jarvinen
Cc: fenghuay, peternewman, zide.chen, dapeng1.mi, ben.horgan,
yu.c.chen, reinette.chatre, linux-kselftest, linux-kernel,
patches
The MBM and MBA tests compare MBM memory bandwidth measurements against
the memory bandwidth event values obtained from each memory controller's
PMU. The memory bandwidth event settings are discovered from the memory
controller details found in /sys/bus/event_source/devices/uncore_imc_N and
stored in struct imc_counter_config.
In addition to event settings struct imc_counter_config contains
imc_counter_config::return_value in which the associated event value is
stored on every read.
The event value is consumed and immediately recorded at regular intervals.
The stored value is never consumed afterwards, making its storage as part
of event configuration unnecessary.
Remove the return_value member from struct imc_counter_config. Instead
just use a local variable during event reading.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
tools/testing/selftests/resctrl/resctrl_val.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index a5a8badb83d4..2cc22f61a1f8 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -32,7 +32,6 @@ struct imc_counter_config {
__u64 event;
__u64 umask;
struct perf_event_attr pe;
- struct membw_read_format return_value;
int fd;
};
@@ -312,23 +311,23 @@ static int get_read_mem_bw_imc(float *bw_imc)
* Take overflow into consideration before calculating total bandwidth.
*/
for (imc = 0; imc < imcs; imc++) {
+ struct membw_read_format return_value;
struct imc_counter_config *r =
&imc_counters_config[imc];
- if (read(r->fd, &r->return_value,
- sizeof(struct membw_read_format)) == -1) {
+ if (read(r->fd, &return_value, sizeof(return_value)) == -1) {
ksft_perror("Couldn't get read bandwidth through iMC");
return -1;
}
- __u64 r_time_enabled = r->return_value.time_enabled;
- __u64 r_time_running = r->return_value.time_running;
+ __u64 r_time_enabled = return_value.time_enabled;
+ __u64 r_time_running = return_value.time_running;
if (r_time_enabled != r_time_running)
of_mul_read = (float)r_time_enabled /
(float)r_time_running;
- reads += r->return_value.value * of_mul_read * SCALE;
+ reads += return_value.value * of_mul_read * SCALE;
}
*bw_imc = reads;
--
2.50.1
* Re: [PATCH v2 2/9] selftests/resctrl: Do not store iMC counter value in counter config structure
2026-03-04 0:19 ` [PATCH v2 2/9] selftests/resctrl: Do not store iMC counter value in counter config structure Reinette Chatre
@ 2026-03-06 9:51 ` Ilpo Järvinen
2026-03-06 19:25 ` Reinette Chatre
0 siblings, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2026-03-06 9:51 UTC (permalink / raw)
To: Reinette Chatre
Cc: shuah, Dave.Martin, james.morse, tony.luck, babu.moger, fenghuay,
peternewman, zide.chen, dapeng1.mi, ben.horgan, yu.c.chen,
linux-kselftest, LKML, patches
On Tue, 3 Mar 2026, Reinette Chatre wrote:
> The MBM and MBA tests compare MBM memory bandwidth measurements against
> the memory bandwidth event values obtained from each memory controller's
> PMU. The memory bandwidth event settings are discovered from the memory
> controller details found in /sys/bus/event_source/devices/uncore_imc_N and
> stored in struct imc_counter_config.
>
> In addition to event settings struct imc_counter_config contains
> imc_counter_config::return_value in which the associated event value is
> stored on every read.
>
> The event value is consumed and immediately recorded at regular intervals.
> The stored value is never consumed afterwards, making its storage as part
> of event configuration unnecessary.
>
> Remove the return_value member from struct imc_counter_config. Instead
> just use a local variable for use during event reading.
>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> tools/testing/selftests/resctrl/resctrl_val.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
> index a5a8badb83d4..2cc22f61a1f8 100644
> --- a/tools/testing/selftests/resctrl/resctrl_val.c
> +++ b/tools/testing/selftests/resctrl/resctrl_val.c
> @@ -32,7 +32,6 @@ struct imc_counter_config {
> __u64 event;
> __u64 umask;
> struct perf_event_attr pe;
> - struct membw_read_format return_value;
> int fd;
> };
>
> @@ -312,23 +311,23 @@ static int get_read_mem_bw_imc(float *bw_imc)
> * Take overflow into consideration before calculating total bandwidth.
> */
> for (imc = 0; imc < imcs; imc++) {
> + struct membw_read_format return_value;
> struct imc_counter_config *r =
> &imc_counters_config[imc];
>
> - if (read(r->fd, &r->return_value,
> - sizeof(struct membw_read_format)) == -1) {
> + if (read(r->fd, &return_value, sizeof(return_value)) == -1) {
> ksft_perror("Couldn't get read bandwidth through iMC");
> return -1;
> }
>
> - __u64 r_time_enabled = r->return_value.time_enabled;
> - __u64 r_time_running = r->return_value.time_running;
> + __u64 r_time_enabled = return_value.time_enabled;
> + __u64 r_time_running = return_value.time_running;
>
> if (r_time_enabled != r_time_running)
> of_mul_read = (float)r_time_enabled /
> (float)r_time_running;
>
> - reads += r->return_value.value * of_mul_read * SCALE;
> + reads += return_value.value * of_mul_read * SCALE;
> }
This looks mostly okay though here too I don't like the variable name.
Something like "measurement" would tell what it is much better than overly
vague "return_value".
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
--
i.
* Re: [PATCH v2 2/9] selftests/resctrl: Do not store iMC counter value in counter config structure
2026-03-06 9:51 ` Ilpo Järvinen
@ 2026-03-06 19:25 ` Reinette Chatre
0 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2026-03-06 19:25 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: shuah, Dave.Martin, james.morse, tony.luck, babu.moger, fenghuay,
peternewman, zide.chen, dapeng1.mi, ben.horgan, yu.c.chen,
linux-kselftest, LKML, patches
Hi Ilpo,
On 3/6/26 1:51 AM, Ilpo Järvinen wrote:
> On Tue, 3 Mar 2026, Reinette Chatre wrote:
>> @@ -312,23 +311,23 @@ static int get_read_mem_bw_imc(float *bw_imc)
>> * Take overflow into consideration before calculating total bandwidth.
>> */
>> for (imc = 0; imc < imcs; imc++) {
>> + struct membw_read_format return_value;
>> struct imc_counter_config *r =
>> &imc_counters_config[imc];
>>
>> - if (read(r->fd, &r->return_value,
>> - sizeof(struct membw_read_format)) == -1) {
>> + if (read(r->fd, &return_value, sizeof(return_value)) == -1) {
>> ksft_perror("Couldn't get read bandwidth through iMC");
>> return -1;
>> }
>>
>> - __u64 r_time_enabled = r->return_value.time_enabled;
>> - __u64 r_time_running = r->return_value.time_running;
>> + __u64 r_time_enabled = return_value.time_enabled;
>> + __u64 r_time_running = return_value.time_running;
>>
>> if (r_time_enabled != r_time_running)
>> of_mul_read = (float)r_time_enabled /
>> (float)r_time_running;
>>
>> - reads += r->return_value.value * of_mul_read * SCALE;
>> + reads += return_value.value * of_mul_read * SCALE;
>> }
>
> This looks mostly okay though here too I don't like the variable name.
> Something like "measurement" would tell what it is much better than overly
> vague "return_value".
I agree. Will change to "measurement".
>
> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>
Thank you very much.
Reinette
* [PATCH v2 3/9] selftests/resctrl: Prepare for parsing multiple events per iMC
2026-03-04 0:19 [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Reinette Chatre
2026-03-04 0:19 ` [PATCH v2 1/9] selftests/resctrl: Improve accuracy of cache occupancy test Reinette Chatre
2026-03-04 0:19 ` [PATCH v2 2/9] selftests/resctrl: Do not store iMC counter value in counter config structure Reinette Chatre
@ 2026-03-04 0:19 ` Reinette Chatre
2026-03-04 0:19 ` [PATCH v2 4/9] selftests/resctrl: Support multiple events associated with iMC Reinette Chatre
` (6 subsequent siblings)
9 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2026-03-04 0:19 UTC (permalink / raw)
To: shuah, Dave.Martin, james.morse, tony.luck, babu.moger,
ilpo.jarvinen
Cc: fenghuay, peternewman, zide.chen, dapeng1.mi, ben.horgan,
yu.c.chen, reinette.chatre, linux-kselftest, linux-kernel,
patches
The events needed to read memory bandwidth are discovered by iterating
over every memory controller (iMC) within /sys/bus/event_source/devices.
Each iMC's PMU is assumed to have one event to measure read memory
bandwidth that is represented by the sysfs cas_count_read file. The event's
configuration is read from "cas_count_read" and stored as an element of
imc_counters_config[] by read_from_imc_dir(), which receives as argument
the array index at which to store the configuration.
It is possible that an iMC's PMU may have more than one event that should
be used to measure memory bandwidth.
Change the semantics so that read_from_imc_dir() receives a pointer to the
index instead of the index itself. This enables read_from_imc_dir() to
store configurations for more than one event by incrementing the index into
imc_counters_config[] itself.
Ensure that the same type is consistently used for the index as it is
passed around during counter configuration.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
---
Changes since v1:
- Add Zide Chen's RB tag.
---
tools/testing/selftests/resctrl/resctrl_val.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index 2cc22f61a1f8..25c8101631e0 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -73,7 +73,7 @@ static void read_mem_bw_ioctl_perf_event_ioc_disable(int i)
* @cas_count_cfg: Config
* @count: iMC number
*/
-static void get_read_event_and_umask(char *cas_count_cfg, int count)
+static void get_read_event_and_umask(char *cas_count_cfg, unsigned int count)
{
char *token[MAX_TOKENS];
int i = 0;
@@ -110,7 +110,7 @@ static int open_perf_read_event(int i, int cpu_no)
}
/* Get type and config of an iMC counter's read event. */
-static int read_from_imc_dir(char *imc_dir, int count)
+static int read_from_imc_dir(char *imc_dir, unsigned int *count)
{
char cas_count_cfg[1024], imc_counter_cfg[1024], imc_counter_type[1024];
FILE *fp;
@@ -123,7 +123,7 @@ static int read_from_imc_dir(char *imc_dir, int count)
return -1;
}
- if (fscanf(fp, "%u", &imc_counters_config[count].type) <= 0) {
+ if (fscanf(fp, "%u", &imc_counters_config[*count].type) <= 0) {
ksft_perror("Could not get iMC type");
fclose(fp);
@@ -147,7 +147,8 @@ static int read_from_imc_dir(char *imc_dir, int count)
}
fclose(fp);
- get_read_event_and_umask(cas_count_cfg, count);
+ get_read_event_and_umask(cas_count_cfg, *count);
+ *count += 1;
return 0;
}
@@ -196,13 +197,12 @@ static int num_of_imcs(void)
if (temp[0] >= '0' && temp[0] <= '9') {
sprintf(imc_dir, "%s/%s/", DYN_PMU_PATH,
ep->d_name);
- ret = read_from_imc_dir(imc_dir, count);
+ ret = read_from_imc_dir(imc_dir, &count);
if (ret) {
closedir(dp);
return ret;
}
- count++;
}
}
closedir(dp);
--
2.50.1
* [PATCH v2 4/9] selftests/resctrl: Support multiple events associated with iMC
2026-03-04 0:19 [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Reinette Chatre
` (2 preceding siblings ...)
2026-03-04 0:19 ` [PATCH v2 3/9] selftests/resctrl: Prepare for parsing multiple events per iMC Reinette Chatre
@ 2026-03-04 0:19 ` Reinette Chatre
2026-03-06 10:18 ` Ilpo Järvinen
2026-03-04 0:19 ` [PATCH v2 5/9] selftests/resctrl: Increase size of buffer used in MBM and MBA tests Reinette Chatre
` (5 subsequent siblings)
9 siblings, 1 reply; 20+ messages in thread
From: Reinette Chatre @ 2026-03-04 0:19 UTC (permalink / raw)
To: shuah, Dave.Martin, james.morse, tony.luck, babu.moger,
ilpo.jarvinen
Cc: fenghuay, peternewman, zide.chen, dapeng1.mi, ben.horgan,
yu.c.chen, reinette.chatre, linux-kselftest, linux-kernel,
patches
The resctrl selftests discover needed parameters to perf_event_open() via
sysfs. The PMU associated with every memory controller (iMC) is discovered
via the /sys/bus/event_source/devices/uncore_imc_N/type file while
the read memory bandwidth event type and umask is discovered via
/sys/bus/event_source/devices/uncore_imc_N/events/cas_count_read.
Newer systems may have multiple events that expose read memory bandwidth.
For example,
/sys/bus/event_source/devices/uncore_imc_N/events/cas_count_read_sch0
/sys/bus/event_source/devices/uncore_imc_N/events/cas_count_read_sch1
Support parsing of iMC PMU properties when the PMU may have multiple events
to measure read memory bandwidth. The PMU only needs to be discovered once.
Split the parsing of event details from actual PMU discovery in order to
loop over all events associated with the PMU. Match all events with the
cas_count_read prefix instead of requiring there to be one file with that
name.
Make the parsing code more robust. With strings passed around to create
needed paths, use snprintf() instead of sprintf() to ensure there is
always enough space to create the path. Ensure there is enough room in
imc_counters_config[] before attempting to add an entry.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Zide Chen <zide.chen@intel.com>
---
Changes since v1:
- Add Zide Chen's RB tag.
---
tools/testing/selftests/resctrl/resctrl_val.c | 112 ++++++++++++++----
1 file changed, 90 insertions(+), 22 deletions(-)
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index 25c8101631e0..7aae0cc5aee9 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -11,10 +11,10 @@
#include "resctrl.h"
#define UNCORE_IMC "uncore_imc"
-#define READ_FILE_NAME "events/cas_count_read"
+#define READ_FILE_NAME "cas_count_read"
#define DYN_PMU_PATH "/sys/bus/event_source/devices"
#define SCALE 0.00006103515625
-#define MAX_IMCS 20
+#define MAX_IMCS 40
#define MAX_TOKENS 5
#define CON_MBM_LOCAL_BYTES_PATH \
@@ -109,21 +109,102 @@ static int open_perf_read_event(int i, int cpu_no)
return 0;
}
+static int parse_imc_read_bw_events(char *imc_dir, unsigned int type,
+ unsigned int *count)
+{
+ char imc_events[1024], imc_counter_cfg[1024], cas_count_cfg[1024];
+ unsigned int org_count = *count;
+ struct dirent *ep;
+ int path_len;
+ int ret = -1;
+ FILE *fp;
+ DIR *dp;
+
+ path_len = snprintf(imc_events, sizeof(imc_events), "%sevents", imc_dir);
+ if (path_len >= sizeof(imc_events)) {
+ ksft_print_msg("Unable to create path to %sevents\n", imc_dir);
+ return -1;
+ }
+ dp = opendir(imc_events);
+ if (dp) {
+ while ((ep = readdir(dp))) {
+ /*
+ * Parse all event files with READ_FILE_NAME
+ * prefix that contain the event number and umask.
+ * Skip files containing "." that contain unused
+ * properties of event.
+ */
+ if (!strstr(ep->d_name, READ_FILE_NAME) ||
+ strchr(ep->d_name, '.'))
+ continue;
+
+ path_len = snprintf(imc_counter_cfg, sizeof(imc_counter_cfg),
+ "%s/%s", imc_events, ep->d_name);
+ if (path_len >= sizeof(imc_counter_cfg)) {
+ ksft_print_msg("Unable to create path to %s/%s\n",
+ imc_events, ep->d_name);
+ goto out_close;
+ }
+ fp = fopen(imc_counter_cfg, "r");
+ if (!fp) {
+ ksft_perror("Failed to open iMC config file");
+ goto out_close;
+ }
+ if (fscanf(fp, "%1023s", cas_count_cfg) <= 0) {
+ ksft_perror("Could not get iMC cas count read");
+ fclose(fp);
+ goto out_close;
+ }
+ fclose(fp);
+ if (*count >= MAX_IMCS) {
+ ksft_print_msg("Maximum iMC count exceeded\n");
+ goto out_close;
+ }
+
+ imc_counters_config[*count].type = type;
+ get_read_event_and_umask(cas_count_cfg, *count);
+ /* Do not fail after incrementing *count. */
+ *count += 1;
+ }
+ if (*count == org_count) {
+ ksft_print_msg("Unable to find events in %s\n", imc_events);
+ goto out_close;
+ }
+ } else {
+ ksft_perror("Unable to open PMU events directory");
+ goto out;
+ }
+ ret = 0;
+out_close:
+ closedir(dp);
+out:
+ return ret;
+}
+
/* Get type and config of an iMC counter's read event. */
static int read_from_imc_dir(char *imc_dir, unsigned int *count)
{
- char cas_count_cfg[1024], imc_counter_cfg[1024], imc_counter_type[1024];
+ char imc_counter_type[1024];
+ unsigned int type;
+ int path_len;
FILE *fp;
+ int ret;
/* Get type of iMC counter */
- sprintf(imc_counter_type, "%s%s", imc_dir, "type");
+ path_len = snprintf(imc_counter_type, sizeof(imc_counter_type),
+ "%s%s", imc_dir, "type");
+ if (path_len >= sizeof(imc_counter_type)) {
+ ksft_print_msg("Unable to create path to %s%s\n",
+ imc_dir, "type");
+ return -1;
+ }
fp = fopen(imc_counter_type, "r");
if (!fp) {
ksft_perror("Failed to open iMC counter type file");
return -1;
}
- if (fscanf(fp, "%u", &imc_counters_config[*count].type) <= 0) {
+ if (fscanf(fp, "%u", &type) <= 0) {
ksft_perror("Could not get iMC type");
fclose(fp);
@@ -131,24 +212,11 @@ static int read_from_imc_dir(char *imc_dir, unsigned int *count)
}
fclose(fp);
- /* Get read config */
- sprintf(imc_counter_cfg, "%s%s", imc_dir, READ_FILE_NAME);
- fp = fopen(imc_counter_cfg, "r");
- if (!fp) {
- ksft_perror("Failed to open iMC config file");
-
- return -1;
- }
- if (fscanf(fp, "%1023s", cas_count_cfg) <= 0) {
- ksft_perror("Could not get iMC cas count read");
- fclose(fp);
-
- return -1;
+ ret = parse_imc_read_bw_events(imc_dir, type, count);
+ if (ret) {
+ ksft_print_msg("Unable to parse bandwidth event and umask\n");
+ return ret;
}
- fclose(fp);
-
- get_read_event_and_umask(cas_count_cfg, *count);
- *count += 1;
return 0;
}
--
2.50.1
* Re: [PATCH v2 4/9] selftests/resctrl: Support multiple events associated with iMC
2026-03-04 0:19 ` [PATCH v2 4/9] selftests/resctrl: Support multiple events associated with iMC Reinette Chatre
@ 2026-03-06 10:18 ` Ilpo Järvinen
2026-03-06 19:25 ` Reinette Chatre
0 siblings, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2026-03-06 10:18 UTC (permalink / raw)
To: Reinette Chatre
Cc: shuah, Dave.Martin, james.morse, tony.luck, babu.moger, fenghuay,
peternewman, zide.chen, dapeng1.mi, ben.horgan, yu.c.chen,
linux-kselftest, LKML, patches
On Tue, 3 Mar 2026, Reinette Chatre wrote:
> The resctrl selftests discover needed parameters to perf_event_open() via
> sysfs. The PMU associated with every memory controller (iMC) is discovered
> via the /sys/bus/event_source/devices/uncore_imc_N/type file while
> the read memory bandwidth event type and umask is discovered via
> /sys/bus/event_source/devices/uncore_imc_N/events/cas_count_read.
>
> Newer systems may have multiple events that expose read memory bandwidth.
> For example,
> /sys/bus/event_source/devices/uncore_imc_N/events/cas_count_read_sch0
> /sys/bus/event_source/devices/uncore_imc_N/events/cas_count_read_sch1
>
> Support parsing of iMC PMU properties when the PMU may have multiple events
> to measure read memory bandwidth. The PMU only needs to be discovered once.
> Split the parsing of event details from actual PMU discovery in order to
> loop over all events associated with the PMU. Match all events with the
> cas_count_read prefix instead of requiring there to be one file with that
> name.
>
> Make the parsing code more robust. With strings passed around to create
> needed paths, use snprintf() instead of sprintf() to ensure there is
> always enough space to create the path. Ensure there is enough room in
> imc_counters_config[] before attempting to add an entry.
>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> Reviewed-by: Zide Chen <zide.chen@intel.com>
> ---
> Changes since v1:
> - Add Zide Chen's RB tag.
> ---
> tools/testing/selftests/resctrl/resctrl_val.c | 112 ++++++++++++++----
> 1 file changed, 90 insertions(+), 22 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
> index 25c8101631e0..7aae0cc5aee9 100644
> --- a/tools/testing/selftests/resctrl/resctrl_val.c
> +++ b/tools/testing/selftests/resctrl/resctrl_val.c
> @@ -11,10 +11,10 @@
> #include "resctrl.h"
>
> #define UNCORE_IMC "uncore_imc"
> -#define READ_FILE_NAME "events/cas_count_read"
> +#define READ_FILE_NAME "cas_count_read"
> #define DYN_PMU_PATH "/sys/bus/event_source/devices"
> #define SCALE 0.00006103515625
> -#define MAX_IMCS 20
> +#define MAX_IMCS 40
> #define MAX_TOKENS 5
>
> #define CON_MBM_LOCAL_BYTES_PATH \
> @@ -109,21 +109,102 @@ static int open_perf_read_event(int i, int cpu_no)
> return 0;
> }
>
> +static int parse_imc_read_bw_events(char *imc_dir, unsigned int type,
> + unsigned int *count)
> +{
> + char imc_events[1024], imc_counter_cfg[1024], cas_count_cfg[1024];
The first two are paths, right? PATH_MAX should be used instead of the
literals.
> + unsigned int org_count = *count;
orig_count is a less ambiguous name.
> + struct dirent *ep;
> + int path_len;
> + int ret = -1;
> + FILE *fp;
> + DIR *dp;
> +
> + path_len = snprintf(imc_events, sizeof(imc_events), "%sevents", imc_dir);
> + if (path_len >= sizeof(imc_events)) {
> + ksft_print_msg("Unable to create path to %sevents\n", imc_dir);
> + return -1;
> + }
> + dp = opendir(imc_events);
> + if (dp) {
> + while ((ep = readdir(dp))) {
> + /*
> + * Parse all event files with READ_FILE_NAME
> + * prefix that contain the event number and umask.
> + * Skip files containing "." that contain unused
> + * properties of event.
> + */
> + if (!strstr(ep->d_name, READ_FILE_NAME) ||
> + strchr(ep->d_name, '.'))
> + continue;
> +
> + path_len = snprintf(imc_counter_cfg, sizeof(imc_counter_cfg),
> + "%s/%s", imc_events, ep->d_name);
> + if (path_len >= sizeof(imc_counter_cfg)) {
> + ksft_print_msg("Unable to create path to %s/%s\n",
> + imc_events, ep->d_name);
> + goto out_close;
> + }
> + fp = fopen(imc_counter_cfg, "r");
> + if (!fp) {
> + ksft_perror("Failed to open iMC config file");
> + goto out_close;
> + }
> + if (fscanf(fp, "%1023s", cas_count_cfg) <= 0) {
> + ksft_perror("Could not get iMC cas count read");
> + fclose(fp);
> + goto out_close;
> + }
> + fclose(fp);
I'd prefer:
xx = fscanf(...);
fclose(fp);
if (xx) {
...
...But it is up to you ("ret" cannot be used as xx as is).
> + if (*count >= MAX_IMCS) {
> + ksft_print_msg("Maximum iMC count exceeded\n");
> + goto out_close;
> + }
> +
> + imc_counters_config[*count].type = type;
> + get_read_event_and_umask(cas_count_cfg, *count);
> + /* Do not fail after incrementing *count. */
> + *count += 1;
> + }
> + if (*count == org_count) {
> + ksft_print_msg("Unable to find events in %s\n", imc_events);
> + goto out_close;
> + }
> + } else {
> + ksft_perror("Unable to open PMU events directory");
> + goto out;
Reverse the logic (handle error first), it reduces the indentation level
of the loop.
> + }
> + ret = 0;
> +out_close:
> + closedir(dp);
> +out:
> + return ret;
> +}
> +
> /* Get type and config of an iMC counter's read event. */
> static int read_from_imc_dir(char *imc_dir, unsigned int *count)
> {
> - char cas_count_cfg[1024], imc_counter_cfg[1024], imc_counter_type[1024];
> + char imc_counter_type[1024];
> + unsigned int type;
> + int path_len;
> FILE *fp;
> + int ret;
>
> /* Get type of iMC counter */
> - sprintf(imc_counter_type, "%s%s", imc_dir, "type");
> + path_len = snprintf(imc_counter_type, sizeof(imc_counter_type),
> + "%s%s", imc_dir, "type");
> + if (path_len >= sizeof(imc_counter_type)) {
> + ksft_print_msg("Unable to create path to %s%s\n",
> + imc_dir, "type");
> + return -1;
> + }
> fp = fopen(imc_counter_type, "r");
> if (!fp) {
> ksft_perror("Failed to open iMC counter type file");
>
> return -1;
> }
> - if (fscanf(fp, "%u", &imc_counters_config[*count].type) <= 0) {
> + if (fscanf(fp, "%u", &type) <= 0) {
> ksft_perror("Could not get iMC type");
> fclose(fp);
>
> @@ -131,24 +212,11 @@ static int read_from_imc_dir(char *imc_dir, unsigned int *count)
> }
> fclose(fp);
>
> - /* Get read config */
> - sprintf(imc_counter_cfg, "%s%s", imc_dir, READ_FILE_NAME);
> - fp = fopen(imc_counter_cfg, "r");
> - if (!fp) {
> - ksft_perror("Failed to open iMC config file");
> -
> - return -1;
> - }
> - if (fscanf(fp, "%1023s", cas_count_cfg) <= 0) {
> - ksft_perror("Could not get iMC cas count read");
> - fclose(fp);
> -
> - return -1;
> + ret = parse_imc_read_bw_events(imc_dir, type, count);
> + if (ret) {
> + ksft_print_msg("Unable to parse bandwidth event and umask\n");
> + return ret;
> }
> - fclose(fp);
> -
> - get_read_event_and_umask(cas_count_cfg, *count);
> - *count += 1;
>
> return 0;
> }
>
--
i.
* Re: [PATCH v2 4/9] selftests/resctrl: Support multiple events associated with iMC
2026-03-06 10:18 ` Ilpo Järvinen
@ 2026-03-06 19:25 ` Reinette Chatre
0 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2026-03-06 19:25 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: shuah, Dave.Martin, james.morse, tony.luck, babu.moger, fenghuay,
peternewman, zide.chen, dapeng1.mi, ben.horgan, yu.c.chen,
linux-kselftest, LKML, patches
Hi Ilpo,
On 3/6/26 2:18 AM, Ilpo Järvinen wrote:
> On Tue, 3 Mar 2026, Reinette Chatre wrote:
>
>> The resctrl selftests discover needed parameters to perf_event_open() via
>> sysfs. The PMU associated with every memory controller (iMC) is discovered
>> via the /sys/bus/event_source/devices/uncore_imc_N/type file while
>> the read memory bandwidth event type and umask is discovered via
>> /sys/bus/event_source/devices/uncore_imc_N/events/cas_count_read.
>>
>> Newer systems may have multiple events that expose read memory bandwidth.
>> For example,
>> /sys/bus/event_source/devices/uncore_imc_N/events/cas_count_read_sch0
>> /sys/bus/event_source/devices/uncore_imc_N/events/cas_count_read_sch1
>>
>> Support parsing of iMC PMU properties when the PMU may have multiple events
>> to measure read memory bandwidth. The PMU only needs to be discovered once.
>> Split the parsing of event details from actual PMU discovery in order to
>> loop over all events associated with the PMU. Match all events with the
>> cas_count_read prefix instead of requiring there to be one file with that
>> name.
>>
>> Make the parsing code more robust. With strings passed around to create
>> needed paths, use snprintf() instead of sprintf() to ensure there is
>> always enough space to create the path. Ensure there is enough room in
>> imc_counters_config[] before attempting to add an entry.
>>
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
>> Reviewed-by: Zide Chen <zide.chen@intel.com>
>> ---
>> Changes since v1:
>> - Add Zide Chen's RB tag.
>> ---
>> tools/testing/selftests/resctrl/resctrl_val.c | 112 ++++++++++++++----
>> 1 file changed, 90 insertions(+), 22 deletions(-)
>>
>> diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
>> index 25c8101631e0..7aae0cc5aee9 100644
>> --- a/tools/testing/selftests/resctrl/resctrl_val.c
>> +++ b/tools/testing/selftests/resctrl/resctrl_val.c
>> @@ -11,10 +11,10 @@
>> #include "resctrl.h"
>>
>> #define UNCORE_IMC "uncore_imc"
>> -#define READ_FILE_NAME "events/cas_count_read"
>> +#define READ_FILE_NAME "cas_count_read"
>> #define DYN_PMU_PATH "/sys/bus/event_source/devices"
>> #define SCALE 0.00006103515625
>> -#define MAX_IMCS 20
>> +#define MAX_IMCS 40
>> #define MAX_TOKENS 5
>>
>> #define CON_MBM_LOCAL_BYTES_PATH \
>> @@ -109,21 +109,102 @@ static int open_perf_read_event(int i, int cpu_no)
>> return 0;
>> }
>>
>> +static int parse_imc_read_bw_events(char *imc_dir, unsigned int type,
>> + unsigned int *count)
>> +{
>> + char imc_events[1024], imc_counter_cfg[1024], cas_count_cfg[1024];
>
> The first two are paths, right? PATH_MAX should be used instead of the
> literals.
Yes, they are paths. Thanks. Will use PATH_MAX.
>
>> + unsigned int org_count = *count;
>
> orig_count is less ambiguous name.
Sure.
>
>> + struct dirent *ep;
>> + int path_len;
>> + int ret = -1;
>> + FILE *fp;
>> + DIR *dp;
>> +
>> + path_len = snprintf(imc_events, sizeof(imc_events), "%sevents", imc_dir);
>> + if (path_len >= sizeof(imc_events)) {
>> + ksft_print_msg("Unable to create path to %sevents\n", imc_dir);
>> + return -1;
>> + }
>> + dp = opendir(imc_events);
>> + if (dp) {
>> + while ((ep = readdir(dp))) {
>> + /*
>> + * Parse all event files with READ_FILE_NAME
>> + * prefix that contain the event number and umask.
>> + * Skip files containing "." that contain unused
>> + * properties of event.
>> + */
>> + if (!strstr(ep->d_name, READ_FILE_NAME) ||
>> + strchr(ep->d_name, '.'))
>> + continue;
>> +
>> + path_len = snprintf(imc_counter_cfg, sizeof(imc_counter_cfg),
>> + "%s/%s", imc_events, ep->d_name);
>> + if (path_len >= sizeof(imc_counter_cfg)) {
>> + ksft_print_msg("Unable to create path to %s/%s\n",
>> + imc_events, ep->d_name);
>> + goto out_close;
>> + }
>> + fp = fopen(imc_counter_cfg, "r");
>> + if (!fp) {
>> + ksft_perror("Failed to open iMC config file");
>> + goto out_close;
>> + }
>> + if (fscanf(fp, "%1023s", cas_count_cfg) <= 0) {
>> + ksft_perror("Could not get iMC cas count read");
>> + fclose(fp);
>> + goto out_close;
>> + }
>> + fclose(fp);
>
> I'd prefer:
>
> xx = fscanf(...);
> fclose(fp);
> if (xx) {
> ...
>
> ...But it is up to you ("ret" cannot be used as xx as is).
ok. I do not really like to use ret for xx since it generates a bit of churn to
reset on success as well as failure paths after the fscanf(). I can introduce a new
local variable that should help with readability.
>
>> + if (*count >= MAX_IMCS) {
>> + ksft_print_msg("Maximum iMC count exceeded\n");
>> + goto out_close;
>> + }
>> +
>> + imc_counters_config[*count].type = type;
>> + get_read_event_and_umask(cas_count_cfg, *count);
>> + /* Do not fail after incrementing *count. */
>> + *count += 1;
>> + }
>> + if (*count == org_count) {
>> + ksft_print_msg("Unable to find events in %s\n", imc_events);
>> + goto out_close;
>> + }
>> + } else {
>> + ksft_perror("Unable to open PMU events directory");
>> + goto out;
>
> Reverse the logic (handle error first), it reduces the indentation level
> of the loop.
Right. Will do.
>
>> + }
>> + ret = 0;
>> +out_close:
>> + closedir(dp);
>> +out:
>> + return ret;
>> +}
>> +
>> /* Get type and config of an iMC counter's read event. */
>> static int read_from_imc_dir(char *imc_dir, unsigned int *count)
>> {
>> - char cas_count_cfg[1024], imc_counter_cfg[1024], imc_counter_type[1024];
>> + char imc_counter_type[1024];
I'll also change imc_counter_type to PATH_MAX.
Reinette
* [PATCH v2 5/9] selftests/resctrl: Increase size of buffer used in MBM and MBA tests
2026-03-04 0:19 [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Reinette Chatre
` (3 preceding siblings ...)
2026-03-04 0:19 ` [PATCH v2 4/9] selftests/resctrl: Support multiple events associated with iMC Reinette Chatre
@ 2026-03-04 0:19 ` Reinette Chatre
2026-03-04 0:19 ` [PATCH v2 6/9] selftests/resctrl: Raise threshold at which MBM and PMU values are compared Reinette Chatre
` (4 subsequent siblings)
9 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2026-03-04 0:19 UTC (permalink / raw)
To: shuah, Dave.Martin, james.morse, tony.luck, babu.moger,
ilpo.jarvinen
Cc: fenghuay, peternewman, zide.chen, dapeng1.mi, ben.horgan,
yu.c.chen, reinette.chatre, linux-kselftest, linux-kernel,
patches
Errata for Sierra Forest [1] (SRF42) and Granite Rapids [2] (GNR12)
describe the problem that MBM on Intel RDT may overcount memory bandwidth
measurements. The resctrl tests compare memory bandwidth reported by iMC
PMU to that reported by MBM causing the tests to fail on these systems
depending on the settings of the platform related to the errata.
Since the resctrl tests need to run under various conditions, it is not
possible to ensure that system settings are such that MBM will not
overcount.
It has been observed that the overcounting can be controlled via the
buffer size used in the MBM and MBA tests that rely on comparisons
between iMC PMU and MBM measurements.
Running the MBM test on affected platforms with different buffer sizes, it
can be observed that the difference between iMC PMU and MBM counts reduces
as the buffer size increases. After increasing the buffer size to more than
4X the L3 size, the differences between iMC PMU and MBM become
insignificant.
Increase the buffer size used in MBM and MBA tests to 4X L3 size to reduce
possibility of tests failing due to difference in counts reported by iMC
PMU and MBM.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/sierra-forest/xeon-6700-series-processor-with-e-cores-specification-update/errata-details/ # [1]
Link: https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/birch-stream/xeon-6900-6700-6500-series-processors-with-p-cores-specification-update/011US/errata-details/ # [2]
---
tools/testing/selftests/resctrl/fill_buf.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c
index 19a01a52dc1a..b9fa7968cd6e 100644
--- a/tools/testing/selftests/resctrl/fill_buf.c
+++ b/tools/testing/selftests/resctrl/fill_buf.c
@@ -139,6 +139,6 @@ ssize_t get_fill_buf_size(int cpu_no, const char *cache_type)
if (ret)
return ret;
- return cache_total_size * 2 > MINIMUM_SPAN ?
- cache_total_size * 2 : MINIMUM_SPAN;
+ return cache_total_size * 4 > MINIMUM_SPAN ?
+ cache_total_size * 4 : MINIMUM_SPAN;
}
--
2.50.1
* [PATCH v2 6/9] selftests/resctrl: Raise threshold at which MBM and PMU values are compared
2026-03-04 0:19 [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Reinette Chatre
` (4 preceding siblings ...)
2026-03-04 0:19 ` [PATCH v2 5/9] selftests/resctrl: Increase size of buffer used in MBM and MBA tests Reinette Chatre
@ 2026-03-04 0:19 ` Reinette Chatre
2026-03-04 0:19 ` [PATCH v2 7/9] selftests/resctrl: Remove requirement on cache miss rate Reinette Chatre
` (3 subsequent siblings)
9 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2026-03-04 0:19 UTC (permalink / raw)
To: shuah, Dave.Martin, james.morse, tony.luck, babu.moger,
ilpo.jarvinen
Cc: fenghuay, peternewman, zide.chen, dapeng1.mi, ben.horgan,
yu.c.chen, reinette.chatre, linux-kselftest, linux-kernel,
patches
commit 501cfdba0a40 ("selftests/resctrl: Do not compare performance
counters and resctrl at low bandwidth") introduced a threshold under which
memory bandwidth values from MBM and performance counters are not compared.
This is needed because MBM and the PMUs do not have an identical view of
memory bandwidth since PMUs can count all memory traffic while MBM does not
count "overhead" (for example RAS) traffic that cannot be attributed to an
RMID. As a ratio, this difference in the view of memory bandwidth is more
pronounced at low memory bandwidths.
The 750 MiB threshold was chosen arbitrarily after comparisons on different
platforms. Exposed to more platforms after its introduction, this threshold
has proven to be inadequate.
An accurate comparison between performance counters and MBM requires
careful management of system load as well as control of features that
introduce extra memory traffic, for example patrol scrub. This is not
appropriate for the resctrl selftests, which are intended to run on a
variety of systems with various configurations.
Increase the memory bandwidth threshold under which no comparison is made
between performance counters and MBM. Add additional leniency by increasing
the percentage of difference that will be tolerated between these counts.
There is no impact to the validity of the resctrl selftests results as a
measure of resctrl subsystem health.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
tools/testing/selftests/resctrl/mba_test.c | 2 +-
tools/testing/selftests/resctrl/mbm_test.c | 2 +-
tools/testing/selftests/resctrl/resctrl.h | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index cd4c715b7ffd..39cee9898359 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -12,7 +12,7 @@
#define RESULT_FILE_NAME "result_mba"
#define NUM_OF_RUNS 5
-#define MAX_DIFF_PERCENT 8
+#define MAX_DIFF_PERCENT 15
#define ALLOCATION_MAX 100
#define ALLOCATION_MIN 10
#define ALLOCATION_STEP 10
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 58201f844740..6dbbc3b76003 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -11,7 +11,7 @@
#include "resctrl.h"
#define RESULT_FILE_NAME "result_mbm"
-#define MAX_DIFF_PERCENT 8
+#define MAX_DIFF_PERCENT 15
#define NUM_OF_RUNS 5
static int
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index c72045c74ac4..861bf25f2f28 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -55,7 +55,7 @@
* and MBM respectively, for instance generating "overhead" traffic which
* is not counted against any specific RMID.
*/
-#define THROTTLE_THRESHOLD 750
+#define THROTTLE_THRESHOLD 2500
/*
* fill_buf_param: "fill_buf" benchmark parameters
--
2.50.1
* [PATCH v2 7/9] selftests/resctrl: Remove requirement on cache miss rate
2026-03-04 0:19 [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Reinette Chatre
` (5 preceding siblings ...)
2026-03-04 0:19 ` [PATCH v2 6/9] selftests/resctrl: Raise threshold at which MBM and PMU values are compared Reinette Chatre
@ 2026-03-04 0:19 ` Reinette Chatre
2026-03-04 0:19 ` [PATCH v2 8/9] selftests/resctrl: Simplify perf usage in CAT test Reinette Chatre
` (2 subsequent siblings)
9 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2026-03-04 0:19 UTC (permalink / raw)
To: shuah, Dave.Martin, james.morse, tony.luck, babu.moger,
ilpo.jarvinen
Cc: fenghuay, peternewman, zide.chen, dapeng1.mi, ben.horgan,
yu.c.chen, reinette.chatre, linux-kselftest, linux-kernel,
patches
As the CAT test reads the same buffer into different-sized cache portions,
it compares the number of cache misses against an expected percentage
based on the size of the cache portion.
Systems and test conditions vary. The CAT test is a test of resctrl
subsystem health and not a test of the hardware architecture, so it need
not place requirements on the size of the difference in cache misses; it
is enough that the number of cache misses when reading a buffer increases
as the cache portion used for the buffer decreases.
Remove the additional constraint on how big the difference between cache
misses should be as the cache portion size changes. Only test that the
cache misses increase as the cache portion size decreases. This remains
a good sanity check of resctrl subsystem health while reducing the impact
of hardware architectural differences and the various conditions under
which the test may run.
Increase the size difference between cache portions to additionally avoid
any consequences resulting from smaller increments.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
tools/testing/selftests/resctrl/cat_test.c | 33 ++++------------------
1 file changed, 5 insertions(+), 28 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index f00b622c1460..8bc47f06679a 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -14,42 +14,20 @@
#define RESULT_FILE_NAME "result_cat"
#define NUM_OF_RUNS 5
-/*
- * Minimum difference in LLC misses between a test with n+1 bits CBM to the
- * test with n bits is MIN_DIFF_PERCENT_PER_BIT * (n - 1). With e.g. 5 vs 4
- * bits in the CBM mask, the minimum difference must be at least
- * MIN_DIFF_PERCENT_PER_BIT * (4 - 1) = 3 percent.
- *
- * The relationship between number of used CBM bits and difference in LLC
- * misses is not expected to be linear. With a small number of bits, the
- * margin is smaller than with larger number of bits. For selftest purposes,
- * however, linear approach is enough because ultimately only pass/fail
- * decision has to be made and distinction between strong and stronger
- * signal is irrelevant.
- */
-#define MIN_DIFF_PERCENT_PER_BIT 1UL
-
static int show_results_info(__u64 sum_llc_val, int no_of_bits,
unsigned long cache_span,
- unsigned long min_diff_percent,
unsigned long num_of_runs, bool platform,
__s64 *prev_avg_llc_val)
{
__u64 avg_llc_val = 0;
- float avg_diff;
int ret = 0;
avg_llc_val = sum_llc_val / num_of_runs;
if (*prev_avg_llc_val) {
- float delta = (__s64)(avg_llc_val - *prev_avg_llc_val);
-
- avg_diff = delta / *prev_avg_llc_val;
- ret = platform && (avg_diff * 100) < (float)min_diff_percent;
-
- ksft_print_msg("%s Check cache miss rate changed more than %.1f%%\n",
- ret ? "Fail:" : "Pass:", (float)min_diff_percent);
+ ret = platform && (avg_llc_val < *prev_avg_llc_val);
- ksft_print_msg("Percent diff=%.1f\n", avg_diff * 100);
+ ksft_print_msg("%s Check cache miss rate increased\n",
+ ret ? "Fail:" : "Pass:");
}
*prev_avg_llc_val = avg_llc_val;
@@ -58,10 +36,10 @@ static int show_results_info(__u64 sum_llc_val, int no_of_bits,
return ret;
}
-/* Remove the highest bit from CBM */
+/* Remove the two highest bits from CBM */
static unsigned long next_mask(unsigned long current_mask)
{
- return current_mask & (current_mask >> 1);
+ return current_mask & (current_mask >> 2);
}
static int check_results(struct resctrl_val_param *param, const char *cache_type,
@@ -112,7 +90,6 @@ static int check_results(struct resctrl_val_param *param, const char *cache_type
ret = show_results_info(sum_llc_perf_miss, bits,
alloc_size / 64,
- MIN_DIFF_PERCENT_PER_BIT * (bits - 1),
runs, get_vendor() == ARCH_INTEL,
&prev_avg_llc_val);
if (ret)
--
2.50.1
* [PATCH v2 8/9] selftests/resctrl: Simplify perf usage in CAT test
2026-03-04 0:19 [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Reinette Chatre
` (6 preceding siblings ...)
2026-03-04 0:19 ` [PATCH v2 7/9] selftests/resctrl: Remove requirement on cache miss rate Reinette Chatre
@ 2026-03-04 0:19 ` Reinette Chatre
2026-03-04 0:19 ` [PATCH v2 9/9] selftests/resctrl: Reduce L2 impact on " Reinette Chatre
2026-03-04 15:18 ` [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Chen, Yu C
9 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2026-03-04 0:19 UTC (permalink / raw)
To: shuah, Dave.Martin, james.morse, tony.luck, babu.moger,
ilpo.jarvinen
Cc: fenghuay, peternewman, zide.chen, dapeng1.mi, ben.horgan,
yu.c.chen, reinette.chatre, linux-kselftest, linux-kernel,
patches
The CAT test relies on the PERF_COUNT_HW_CACHE_MISSES event to determine if
modifying a cache portion size is successful. This event is configured to
report the data as part of an event group, but no other events are added to
the group.
Remove the unnecessary PERF_FORMAT_GROUP format setting. This eliminates
the need for struct perf_event_read and results in a read() of the
associated file descriptor returning just the one value associated with
the PERF_COUNT_HW_CACHE_MISSES event of interest.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
tools/testing/selftests/resctrl/cache.c | 17 +++++------------
tools/testing/selftests/resctrl/cat_test.c | 4 +---
tools/testing/selftests/resctrl/resctrl.h | 11 +----------
3 files changed, 7 insertions(+), 25 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cache.c b/tools/testing/selftests/resctrl/cache.c
index 1ff1104e6575..03313a5ff905 100644
--- a/tools/testing/selftests/resctrl/cache.c
+++ b/tools/testing/selftests/resctrl/cache.c
@@ -10,7 +10,6 @@ void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config)
memset(pea, 0, sizeof(*pea));
pea->type = PERF_TYPE_HARDWARE;
pea->size = sizeof(*pea);
- pea->read_format = PERF_FORMAT_GROUP;
pea->exclude_kernel = 1;
pea->exclude_hv = 1;
pea->exclude_idle = 1;
@@ -37,19 +36,13 @@ int perf_event_reset_enable(int pe_fd)
return 0;
}
-void perf_event_initialize_read_format(struct perf_event_read *pe_read)
-{
- memset(pe_read, 0, sizeof(*pe_read));
- pe_read->nr = 1;
-}
-
int perf_open(struct perf_event_attr *pea, pid_t pid, int cpu_no)
{
int pe_fd;
pe_fd = perf_event_open(pea, pid, cpu_no, -1, PERF_FLAG_FD_CLOEXEC);
if (pe_fd == -1) {
- ksft_perror("Error opening leader");
+ ksft_perror("Unable to set up performance monitoring");
return -1;
}
@@ -132,9 +125,9 @@ static int print_results_cache(const char *filename, pid_t bm_pid, __u64 llc_val
*
* Return: =0 on success. <0 on failure.
*/
-int perf_event_measure(int pe_fd, struct perf_event_read *pe_read,
- const char *filename, pid_t bm_pid)
+int perf_event_measure(int pe_fd, const char *filename, pid_t bm_pid)
{
+ __u64 value;
int ret;
/* Stop counters after one span to get miss rate */
@@ -142,13 +135,13 @@ int perf_event_measure(int pe_fd, struct perf_event_read *pe_read,
if (ret < 0)
return ret;
- ret = read(pe_fd, pe_read, sizeof(*pe_read));
+ ret = read(pe_fd, &value, sizeof(value));
if (ret == -1) {
ksft_perror("Could not get perf value");
return -1;
}
- return print_results_cache(filename, bm_pid, pe_read->values[0].value);
+ return print_results_cache(filename, bm_pid, value);
}
/*
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 8bc47f06679a..6aac03147d41 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -135,7 +135,6 @@ static int cat_test(const struct resctrl_test *test,
struct resctrl_val_param *param,
size_t span, unsigned long current_mask)
{
- struct perf_event_read pe_read;
struct perf_event_attr pea;
cpu_set_t old_affinity;
unsigned char *buf;
@@ -159,7 +158,6 @@ static int cat_test(const struct resctrl_test *test,
goto reset_affinity;
perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
- perf_event_initialize_read_format(&pe_read);
pe_fd = perf_open(&pea, bm_pid, uparams->cpu);
if (pe_fd < 0) {
ret = -1;
@@ -192,7 +190,7 @@ static int cat_test(const struct resctrl_test *test,
fill_cache_read(buf, span, true);
- ret = perf_event_measure(pe_fd, &pe_read, param->filename, bm_pid);
+ ret = perf_event_measure(pe_fd, param->filename, bm_pid);
if (ret)
goto free_buf;
}
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 861bf25f2f28..e04b313dd7f7 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -148,13 +148,6 @@ struct resctrl_val_param {
struct fill_buf_param *fill_buf;
};
-struct perf_event_read {
- __u64 nr; /* The number of events */
- struct {
- __u64 value; /* The value of the event */
- } values[2];
-};
-
/*
* Memory location that consumes values compiler must not optimize away.
* Volatile ensures writes to this location cannot be optimized away by
@@ -210,11 +203,9 @@ unsigned int count_bits(unsigned long n);
int snc_kernel_support(void);
void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config);
-void perf_event_initialize_read_format(struct perf_event_read *pe_read);
int perf_open(struct perf_event_attr *pea, pid_t pid, int cpu_no);
int perf_event_reset_enable(int pe_fd);
-int perf_event_measure(int pe_fd, struct perf_event_read *pe_read,
- const char *filename, pid_t bm_pid);
+int perf_event_measure(int pe_fd, const char *filename, pid_t bm_pid);
int measure_llc_resctrl(const char *filename, pid_t bm_pid);
void show_cache_info(int no_of_bits, __u64 avg_llc_val, size_t cache_span, bool lines);
--
2.50.1
* [PATCH v2 9/9] selftests/resctrl: Reduce L2 impact on CAT test
2026-03-04 0:19 [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Reinette Chatre
` (7 preceding siblings ...)
2026-03-04 0:19 ` [PATCH v2 8/9] selftests/resctrl: Simplify perf usage in CAT test Reinette Chatre
@ 2026-03-04 0:19 ` Reinette Chatre
2026-03-06 10:35 ` Ilpo Järvinen
2026-03-04 15:18 ` [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Chen, Yu C
9 siblings, 1 reply; 20+ messages in thread
From: Reinette Chatre @ 2026-03-04 0:19 UTC (permalink / raw)
To: shuah, Dave.Martin, james.morse, tony.luck, babu.moger,
ilpo.jarvinen
Cc: fenghuay, peternewman, zide.chen, dapeng1.mi, ben.horgan,
yu.c.chen, reinette.chatre, linux-kselftest, linux-kernel,
patches
The L3 CAT test loads into cache a buffer that is proportional to the L3
size allocated for the workload and measures cache misses when accessing
the buffer as a test of L3 occupancy. When loading the buffer, it can be
assumed that a portion of the buffer will be loaded into the L2 cache and,
depending on cache design, may not be present in L3. It is thus possible
for data not to be in L3 yet also not trigger an L3 cache miss when
accessed.
Reduce the impact of L2 on the L3 CAT test by minimizing, if L2 allocation
is supported, the portion of L2 that the workload can allocate into. This
encourages most of the buffer to be loaded into L3 and supports a better
comparison between buffer size, cache portion, and cache misses when
accessing the buffer.
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
tools/testing/selftests/resctrl/cat_test.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 6aac03147d41..26062684a9f4 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -157,6 +157,12 @@ static int cat_test(const struct resctrl_test *test,
if (ret)
goto reset_affinity;
+ if (!strcmp(test->resource, "L3") && resctrl_resource_exists("L2")) {
+ ret = write_schemata(param->ctrlgrp, "0x1", uparams->cpu, "L2");
+ if (ret)
+ goto reset_affinity;
+ }
+
perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
pe_fd = perf_open(&pea, bm_pid, uparams->cpu);
if (pe_fd < 0) {
--
2.50.1
* Re: [PATCH v2 9/9] selftests/resctrl: Reduce L2 impact on CAT test
2026-03-04 0:19 ` [PATCH v2 9/9] selftests/resctrl: Reduce L2 impact on " Reinette Chatre
@ 2026-03-06 10:35 ` Ilpo Järvinen
2026-03-06 19:26 ` Reinette Chatre
0 siblings, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2026-03-06 10:35 UTC (permalink / raw)
To: Reinette Chatre
Cc: shuah, Dave.Martin, james.morse, tony.luck, babu.moger, fenghuay,
peternewman, zide.chen, dapeng1.mi, ben.horgan, yu.c.chen,
linux-kselftest, LKML, patches
On Tue, 3 Mar 2026, Reinette Chatre wrote:
> The L3 CAT test loads a buffer into cache that is proportional to the L3
> size allocated for the workload and measures cache misses when accessing
> the buffer as a test of L3 occupancy. When loading the buffer it can be
> assumed that a portion of the buffer will be loaded into the L2 cache and
> depending on cache design may not be present in L3. It is thus possible
> for data to not be in L3 but also not trigger an L3 cache miss when
> accessed.
>
> Reduce impact of L2 on the L3 CAT test by, if L2 allocation is supported,
> minimizing the portion of L2 that the workload can allocate into. This
> encourages most of buffer to be loaded into L3 and support better
> comparison between buffer size, cache portion, and cache misses when
> accessing the buffer.
>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> ---
> tools/testing/selftests/resctrl/cat_test.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index 6aac03147d41..26062684a9f4 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -157,6 +157,12 @@ static int cat_test(const struct resctrl_test *test,
> if (ret)
> goto reset_affinity;
>
> + if (!strcmp(test->resource, "L3") && resctrl_resource_exists("L2")) {
> + ret = write_schemata(param->ctrlgrp, "0x1", uparams->cpu, "L2");
> + if (ret)
> + goto reset_affinity;
> + }
This looks similar to what you did in the CMT test. Maybe add a common
function for minimizing L2 cache size so it doesn't have to duplicated to
all L3 related tests.
> +
> perf_event_attr_initialize(&pea, PERF_COUNT_HW_CACHE_MISSES);
> pe_fd = perf_open(&pea, bm_pid, uparams->cpu);
> if (pe_fd < 0) {
>
--
i.
* Re: [PATCH v2 9/9] selftests/resctrl: Reduce L2 impact on CAT test
2026-03-06 10:35 ` Ilpo Järvinen
@ 2026-03-06 19:26 ` Reinette Chatre
0 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2026-03-06 19:26 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: shuah, Dave.Martin, james.morse, tony.luck, babu.moger, fenghuay,
peternewman, zide.chen, dapeng1.mi, ben.horgan, yu.c.chen,
linux-kselftest, LKML, patches
Hi Ilpo,
On 3/6/26 2:35 AM, Ilpo Järvinen wrote:
> On Tue, 3 Mar 2026, Reinette Chatre wrote:
>
>> The L3 CAT test loads a buffer into cache that is proportional to the L3
>> size allocated for the workload and measures cache misses when accessing
>> the buffer as a test of L3 occupancy. When loading the buffer it can be
>> assumed that a portion of the buffer will be loaded into the L2 cache and
>> depending on cache design may not be present in L3. It is thus possible
>> for data to not be in L3 but also not trigger an L3 cache miss when
>> accessed.
>>
>> Reduce impact of L2 on the L3 CAT test by, if L2 allocation is supported,
>> minimizing the portion of L2 that the workload can allocate into. This
>> encourages most of buffer to be loaded into L3 and support better
>> comparison between buffer size, cache portion, and cache misses when
>> accessing the buffer.
>>
>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
>> ---
>> tools/testing/selftests/resctrl/cat_test.c | 6 ++++++
>> 1 file changed, 6 insertions(+)
>>
>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>> index 6aac03147d41..26062684a9f4 100644
>> --- a/tools/testing/selftests/resctrl/cat_test.c
>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>> @@ -157,6 +157,12 @@ static int cat_test(const struct resctrl_test *test,
>> if (ret)
>> goto reset_affinity;
>>
>> + if (!strcmp(test->resource, "L3") && resctrl_resource_exists("L2")) {
>> + ret = write_schemata(param->ctrlgrp, "0x1", uparams->cpu, "L2");
>> + if (ret)
>> + goto reset_affinity;
>> + }
>
> This looks similar to what you did in the CMT test. Maybe add a common
> function for minimizing L2 cache size so it doesn't have to duplicated to
> all L3 related tests.
Sure.
Reinette
* Re: [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms
2026-03-04 0:19 [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Reinette Chatre
` (8 preceding siblings ...)
2026-03-04 0:19 ` [PATCH v2 9/9] selftests/resctrl: Reduce L2 impact on " Reinette Chatre
@ 2026-03-04 15:18 ` Chen, Yu C
9 siblings, 0 replies; 20+ messages in thread
From: Chen, Yu C @ 2026-03-04 15:18 UTC (permalink / raw)
To: Reinette Chatre
Cc: fenghuay, peternewman, zide.chen, dapeng1.mi, ben.horgan,
linux-kselftest, linux-kernel, patches, shuah, Dave.Martin,
james.morse, babu.moger, ilpo.jarvinen, tony.luck
On 3/4/2026 8:19 AM, Reinette Chatre wrote:
> Changes since v1:
> - The new perf interface that resctrl selftests can utilize has been accepted and
> merged into v7.0-rc2. This series can thus now be considered for inclusion.
> For reference,
> commit 6a8a48644c4b ("perf/x86/intel/uncore: Add per-scheduler IMC CAS count events")
> The resctrl selftest changes making use of the new perf interface are backward
> compatible. The selftests do not require a v7.0-rc2 kernel to run but the
> tests can only pass on recent Intel platforms running v7.0-rc2 or later.
> - Combine the two outstanding resctrl selftest submissions into one series
> for easier tracking:
> https://lore.kernel.org/lkml/084e82b5c29d75f16f24af8768d50d39ba0118a5.1769101788.git.reinette.chatre@intel.com/
> https://lore.kernel.org/lkml/cover.1770406608.git.reinette.chatre@intel.com/
> - Fix typo in changelog of "selftests/resctrl: Improve accuracy of cache
> occupancy test": "the data my be in L2" -> "the data may be in L2"
> - Add Zide Chen's RB tags.
>
> Cover letter updated to be accurate wrt perf changes:
>
> The resctrl selftests fail on recent Intel platforms. Intermittent failures
> in the CAT test and permanent failures of MBM and MBA tests on new platforms
> like Sierra Forest and Granite Rapids.
>
> The MBM and MBA resctrl selftests both generate memory traffic and compare the
> memory bandwidth measurements between the iMC PMUs and MBM to determine pass or
> fail. Both these tests are failing on recent platforms like Sierra Forest and
> Granite Rapids that have two events that need to be read and combined
> for a total memory bandwidth count instead of the single event available on
> earlier platforms.
>
> resctrl selftests prefer to obtain event details via sysfs instead of adding
> model specific details on which events to read. Enhancements to perf to expose
> the new event details are available since:
> commit 6a8a48644c4b ("perf/x86/intel/uncore: Add per-scheduler IMC CAS count events")
> This series demonstrates use of the new sysfs interface to perf to
> obtain accurate iMC read memory bandwidth measurements.
>
> An additional issue with all the tests is that these selftests are part
> performance tests and determine pass/fail on performance heuristics selected
> after running the tests on a variety of platforms. When new platforms
> arrive the previous heuristics may cause the tests to fail. These failures are
> not because of an issue with the resctrl subsystem the tests intend to test
> but because of the architectural changes in the new platforms.
>
> Adapt the resctrl tests to not be as sensitive to architectural changes
> while adjusting the remaining heuristics to ensure tests pass on a variety
> of platforms. More details in individual patches.
>
> Tested by running 100 iterations of all tests on Emerald Rapids, Granite
> Rapids, Sapphire Rapids, Ice Lake, Sierra Forest, and Broadwell.
>
Tested on a GNR with SNC3, without this patch I saw MBM/MBA failures.
With this patch applied on v7.0-rc2, I did not see any errors:
sudo ./resctrl_tests
TAP version 13
# Pass: Check kernel supports resctrl filesystem
# Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists
# resctrl filesystem not mounted
# dmesg: [ 16.192737] resctrl: Sub-NUMA Cluster mode detected with 3 nodes per L3 cache
# dmesg: [ 16.287785] resctrl: L3 allocation detected
# dmesg: [ 16.287961] resctrl: L2 allocation detected
# dmesg: [ 16.288093] resctrl: MB allocation detected
# dmesg: [ 16.288186] resctrl: L3 monitoring detected
1..6
# SNC-3 mode discovered.
# Starting MBM test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Writing benchmark parameters to resctrl FS
# Benchmark PID: 5503
# Write schema "MB:0=100" to resctrl FS
# Checking for pass/fail
# Pass: Check MBM diff within 15%
# avg_diff_per: 2%
# Span (MB): 640
# avg_bw_imc: 8815
# avg_bw_resc: 8570
ok 1 MBM: test
# Starting MBA test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Writing benchmark parameters to resctrl FS
# Benchmark PID: 5506
# Write schema "MB:0=10" to resctrl FS
# Write schema "MB:0=20" to resctrl FS
# Write schema "MB:0=30" to resctrl FS
# Write schema "MB:0=40" to resctrl FS
# Write schema "MB:0=50" to resctrl FS
# Write schema "MB:0=60" to resctrl FS
# Write schema "MB:0=70" to resctrl FS
# Write schema "MB:0=80" to resctrl FS
# Write schema "MB:0=90" to resctrl FS
# Write schema "MB:0=100" to resctrl FS
# Results are displayed in (MB)
# Bandwidth below threshold (2500 MiB). Dropping results from MBA schemata 10.
# Bandwidth below threshold (2500 MiB). Dropping results from MBA schemata 20.
# Bandwidth below threshold (2500 MiB). Dropping results from MBA schemata 30.
# Bandwidth below threshold (2500 MiB). Dropping results from MBA schemata 40.
# Bandwidth below threshold (2500 MiB). Dropping results from MBA schemata 50.
# Pass: Check MBA diff within 15% for schemata 60
# avg_diff_per: 6%
# avg_bw_imc: 4669
# avg_bw_resc: 4355
# Pass: Check MBA diff within 15% for schemata 70
# avg_diff_per: 5%
# avg_bw_imc: 5556
# avg_bw_resc: 5231
# Pass: Check MBA diff within 15% for schemata 80
# avg_diff_per: 5%
# avg_bw_imc: 6257
# avg_bw_resc: 5942
# Pass: Check MBA diff within 15% for schemata 90
# avg_diff_per: 4%
# avg_bw_imc: 7126
# avg_bw_resc: 6804
# Pass: Check MBA diff within 15% for schemata 100
# avg_diff_per: 3%
# avg_bw_imc: 9246
# avg_bw_resc: 8901
# Pass: Check schemata change using MBA
ok 2 MBA: test
# Starting CMT test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Cache size :167772160
# Writing benchmark parameters to resctrl FS
# Write schema "L3:0=ffe0" to resctrl FS
# Write schema "L3:0=1f" to resctrl FS
# Write schema "L2:1=0x1" to resctrl FS
# Benchmark PID: 5508
# Checking for pass/fail
# Pass: Check cache miss rate within 15%
# Percent diff=0
# Number of bits: 5
# Average LLC val: 52264960
# Cache span (bytes): 52428800
ok 3 CMT: test
# Starting L3_CAT test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Cache size :167772160
# Writing benchmark parameters to resctrl FS
# Write schema "L2:1=0x1" to resctrl FS
# Write schema "L3:0=3f80" to resctrl FS
# Write schema "L3:0=7f" to resctrl FS
# Write schema "L3:0=3fe0" to resctrl FS
# Write schema "L3:0=1f" to resctrl FS
# Write schema "L3:0=3ff8" to resctrl FS
# Write schema "L3:0=7" to resctrl FS
# Write schema "L3:0=3ffe" to resctrl FS
# Write schema "L3:0=1" to resctrl FS
# Checking for pass/fail
# Number of bits: 7
# Average LLC val: 620490
# Cache span (lines): 1146880
# Pass: Check cache miss rate increased
# Number of bits: 5
# Average LLC val: 1149986
# Cache span (lines): 819200
# Pass: Check cache miss rate increased
# Number of bits: 3
# Average LLC val: 1604363
# Cache span (lines): 491520
# Pass: Check cache miss rate increased
# Number of bits: 1
# Average LLC val: 2285082
# Cache span (lines): 163840
ok 4 L3_CAT: test
# Starting L3_NONCONT_CAT test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Write schema "L3:0=ff" to resctrl FS
# Write schema "L3:0=fc3f" to resctrl FS
ok 5 L3_NONCONT_CAT: test
# Starting L2_NONCONT_CAT test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Write schema "L2:1=ff" to resctrl FS
# Write schema "L2:1=fc3f" to resctrl FS
ok 6 L2_NONCONT_CAT: test
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0
Tested-by: Chen Yu <yu.c.chen@intel.com>
thanks,
Chenyu