From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: Reinette Chatre <reinette.chatre@intel.com>
Cc: shuah@kernel.org, Dave.Martin@arm.com, james.morse@arm.com,
	 tony.luck@intel.com, babu.moger@amd.com, fenghuay@nvidia.com,
	 peternewman@google.com, zide.chen@intel.com,
	dapeng1.mi@linux.intel.com,  ben.horgan@arm.com,
	yu.c.chen@intel.com, jason.zeng@intel.com,
	 linux-kselftest@vger.kernel.org,
	LKML <linux-kernel@vger.kernel.org>,
	 patches@lists.linux.dev
Subject: Re: [PATCH v3 01/10] selftests/resctrl: Improve accuracy of cache occupancy test
Date: Thu, 26 Mar 2026 14:44:52 +0200 (EET)
Message-ID: <7c10d8a4-cf81-aeea-4573-5d22ea39624c@linux.intel.com>
In-Reply-To: <b632cf0bad98f501079748346cb2e1dae120237d.1773432891.git.reinette.chatre@intel.com>

On Fri, 13 Mar 2026, Reinette Chatre wrote:

> Dave Martin reported inconsistent CMT test failures. In one experiment
> the first run of the CMT test failed because of too large (24%) difference
> between measured and achievable cache occupancy while the second run passed
> with an acceptable 4% difference.
> 
> The CMT test is susceptible to interference from the rest of the system.
> This can be demonstrated with a utility like stress-ng by running the CMT
> test while introducing cache misses using:
> 
>    stress-ng --matrix-3d 0 --matrix-3d-zyx
> 
> Below is an example of the CMT test failing because of a significant
> difference between measured and achievable cache occupancy when run with
> interference:
>     # Starting CMT test ...
>     # Mounting resctrl to "/sys/fs/resctrl"
>     # Cache size :335544320
>     # Writing benchmark parameters to resctrl FS
>     # Benchmark PID: 7011
>     # Checking for pass/fail
>     # Fail: Check cache miss rate within 15%
>     # Percent diff=99
>     # Number of bits: 5
>     # Average LLC val: 235929
>     # Cache span (bytes): 83886080
>     not ok 1 CMT: test
> 
> The CMT test creates a new control group that is also capable of monitoring
> and assigns the workload to it. The workload allocates a buffer that by
> default fills a portion of the L3 and keeps reading from the buffer,
> measuring the L3 occupancy at intervals. The test passes if the workload's
> L3 occupancy is within 15% of the buffer size.
> 
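As a rough illustration of the pass criterion described above, the check can be sketched as below. The helper names are hypothetical and the integer truncation is an assumption made to match the "Percent diff" values in the logs quoted in this message; the selftest's own implementation may differ in detail.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: percentage by which the averaged L3 occupancy
 * deviates from the expected cache span, truncated to an integer.
 * Returns -1 if span is not positive.
 */
int occupancy_diff_percent(long long avg_occupancy, long long span)
{
	if (span <= 0)
		return -1;

	return (int)(llabs(avg_occupancy - span) * 100 / span);
}

/*
 * Hypothetical helper: non-zero when the occupancy is within
 * max_diff_percent of the span (15 in the CMT test description).
 */
int occupancy_within_limit(long long avg_occupancy, long long span,
			   int max_diff_percent)
{
	int diff = occupancy_diff_percent(avg_occupancy, span);

	return diff >= 0 && diff <= max_diff_percent;
}
```

With a span of 83886080 bytes, an average occupancy of 235929 gives 99 and fails the 15% limit, while 73269248 gives 12 and passes, in line with the two logs in the changelog.
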
> Because the test does not adjust any capacity bitmasks, the workload
> shares the cache with the rest of the system. Any other task that may be
> running could evict the workload's data from the cache, causing it to
> have low cache occupancy.
> 
> Reduce interference from the rest of the system by ensuring that the
> workload's control group uses the capacity bitmask found in the user
> parameters for L3 and that the rest of the system can only allocate into
> the inverse of the workload's L3 cache portion. Other tasks can thus no
> longer evict the workload's data from L3.
> 
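A minimal sketch of the mask arithmetic behind this, for illustration only. The helper names are hypothetical, the full mask 0xfffff is inferred from the two schemata in the log further down ("fffe0" and "1f"), and the span formula assumes every CBM bit covers an equal share of the cache.

```c
/*
 * Hypothetical helper: the CBM left for the default group, i.e. every
 * bit of full_mask that the tested control group does not use.
 */
unsigned long inverse_cbm(unsigned long test_mask, unsigned long full_mask)
{
	return ~test_mask & full_mask;
}

/* Count the bits set in a capacity bitmask. */
unsigned int cbm_bits(unsigned long mask)
{
	unsigned int n = 0;

	while (mask) {
		n += mask & 1;
		mask >>= 1;
	}
	return n;
}

/*
 * Hypothetical helper: bytes of cache covered by test_mask, assuming
 * each bit of full_mask maps to an equal portion of cache_size.
 * Returns 0 if full_mask is empty.
 */
unsigned long long cbm_span(unsigned long long cache_size,
			    unsigned long test_mask,
			    unsigned long full_mask)
{
	unsigned int total = cbm_bits(full_mask);

	if (!total)
		return 0;

	return cache_size * cbm_bits(test_mask) / total;
}
```

For a workload mask of 0x1f out of 0xfffff this yields 0xfffe0 for the rest of the system, and with a cache size of 335544320 bytes a span of 83886080 bytes, which agrees with the values in the log.
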
> With the above adjustments the CMT test is more consistent. Repeating the
> CMT test while generating interference with stress-ng on a sample
> system after applying the fixes shows significant improvement in test
> accuracy:
> 
>     # Starting CMT test ...
>     # Mounting resctrl to "/sys/fs/resctrl"
>     # Cache size :335544320
>     # Writing benchmark parameters to resctrl FS
>     # Write schema "L3:0=fffe0" to resctrl FS
>     # Write schema "L3:0=1f" to resctrl FS
>     # Benchmark PID: 7089
>     # Checking for pass/fail
>     # Pass: Check cache miss rate within 15%
>     # Percent diff=12
>     # Number of bits: 5
>     # Average LLC val: 73269248
>     # Cache span (bytes): 83886080
>     ok 1 CMT: test
> 
> Reported-by: Dave Martin <Dave.Martin@arm.com>
> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
> Tested-by: Chen Yu <yu.c.chen@intel.com>
> Link: https://lore.kernel.org/lkml/aO+7MeSMV29VdbQs@e133380.arm.com/
> ---
> Changes since v1:
> - Fix typo in changelog: "data my be in L2" -> "data may be in L2".
> 
> Changes since v2:
> - Split patch to separate changes impacting L3 and L2 resource. (Ilpo)
> - Re-run tests after patch split to ensure test impact matches the patch
>   and update changelog with refreshed data.
> - Since fix is now split across two patches: "Closes:" -> "Link:"
> - Rename "long_mask" to "full_mask". (Ilpo)
> - Add Chen Yu's tag.
> ---
>  tools/testing/selftests/resctrl/cmt_test.c    | 26 +++++++++++++++++--
>  tools/testing/selftests/resctrl/mba_test.c    |  4 ++-
>  tools/testing/selftests/resctrl/mbm_test.c    |  4 ++-
>  tools/testing/selftests/resctrl/resctrl.h     |  4 ++-
>  tools/testing/selftests/resctrl/resctrl_val.c |  2 +-
>  5 files changed, 34 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
> index d09e693dc739..7bc6cf49c1c5 100644
> --- a/tools/testing/selftests/resctrl/cmt_test.c
> +++ b/tools/testing/selftests/resctrl/cmt_test.c
> @@ -19,12 +19,34 @@
>  #define CON_MON_LCC_OCCUP_PATH		\
>  	"%s/%s/mon_data/mon_L3_%02d/llc_occupancy"
>  
> -static int cmt_init(const struct resctrl_val_param *param, int domain_id)
> +/*
> + * Initialize capacity bitmasks (CBMs) of:
> + * - control group being tested per test parameters,
> + * - default resource group as inverse of control group being tested to prevent
> + *   other tasks from interfering with test.
> + */
> +static int cmt_init(const struct resctrl_test *test,
> +		    const struct user_params *uparams,
> +		    const struct resctrl_val_param *param, int domain_id)
>  {
> +	unsigned long full_mask;
> +	char schemata[64];
> +	int ret;
> +
>  	sprintf(llc_occup_path, CON_MON_LCC_OCCUP_PATH, RESCTRL_PATH,
>  		param->ctrlgrp, domain_id);
>  
> -	return 0;
> +	ret = get_full_cbm(test->resource, &full_mask);
> +	if (ret)
> +		return ret;
> +
> +	snprintf(schemata, sizeof(schemata), "%lx", ~param->mask & full_mask);
> +	ret = write_schemata("", schemata, uparams->cpu, test->resource);
> +	if (ret)
> +		return ret;
> +
> +	snprintf(schemata, sizeof(schemata), "%lx", param->mask);
> +	return write_schemata(param->ctrlgrp, schemata, uparams->cpu, test->resource);
>  }
>  
>  static int cmt_setup(const struct resctrl_test *test,
> diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
> index c7e9adc0368f..cd4c715b7ffd 100644
> --- a/tools/testing/selftests/resctrl/mba_test.c
> +++ b/tools/testing/selftests/resctrl/mba_test.c
> @@ -17,7 +17,9 @@
>  #define ALLOCATION_MIN		10
>  #define ALLOCATION_STEP		10
>  
> -static int mba_init(const struct resctrl_val_param *param, int domain_id)
> +static int mba_init(const struct resctrl_test *test,
> +		    const struct user_params *uparams,
> +		    const struct resctrl_val_param *param, int domain_id)
>  {
>  	int ret;
>  
> diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
> index 84d8bc250539..58201f844740 100644
> --- a/tools/testing/selftests/resctrl/mbm_test.c
> +++ b/tools/testing/selftests/resctrl/mbm_test.c
> @@ -83,7 +83,9 @@ static int check_results(size_t span)
>  	return ret;
>  }
>  
> -static int mbm_init(const struct resctrl_val_param *param, int domain_id)
> +static int mbm_init(const struct resctrl_test *test,
> +		    const struct user_params *uparams,
> +		    const struct resctrl_val_param *param, int domain_id)
>  {
>  	int ret;
>  
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index afe635b6e48d..c72045c74ac4 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -135,7 +135,9 @@ struct resctrl_val_param {
>  	char			filename[64];
>  	unsigned long		mask;
>  	int			num_of_runs;
> -	int			(*init)(const struct resctrl_val_param *param,
> +	int			(*init)(const struct resctrl_test *test,
> +					const struct user_params *uparams,
> +					const struct resctrl_val_param *param,
>  					int domain_id);
>  	int			(*setup)(const struct resctrl_test *test,
>  					 const struct user_params *uparams,
> diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
> index 7c08e936572d..a5a8badb83d4 100644
> --- a/tools/testing/selftests/resctrl/resctrl_val.c
> +++ b/tools/testing/selftests/resctrl/resctrl_val.c
> @@ -569,7 +569,7 @@ int resctrl_val(const struct resctrl_test *test,
>  		goto reset_affinity;
>  
>  	if (param->init) {
> -		ret = param->init(param, domain_id);
> +		ret = param->init(test, uparams, param, domain_id);
>  		if (ret)
>  			goto reset_affinity;
>  	}
> 

Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>

-- 
 i.
