Re: [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms

From: "Chen, Yu C" <yu.c.chen@intel.com>
To: Reinette Chatre <reinette.chatre@intel.com>
Cc: <fenghuay@nvidia.com>, <peternewman@google.com>,
	<zide.chen@intel.com>, <dapeng1.mi@linux.intel.com>,
	<ben.horgan@arm.com>, <linux-kselftest@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <patches@lists.linux.dev>,
	<shuah@kernel.org>, <Dave.Martin@arm.com>, <james.morse@arm.com>,
	<babu.moger@amd.com>, <ilpo.jarvinen@linux.intel.com>,
	<tony.luck@intel.com>
Subject: Re: [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms
Date: Wed, 4 Mar 2026 23:18:21 +0800
Message-ID: <4b6cd592-ab84-4dcb-ab86-ecb1be3482c7@intel.com>
In-Reply-To: <cover.1772582958.git.reinette.chatre@intel.com>

On 3/4/2026 8:19 AM, Reinette Chatre wrote:
> Changes since v1:
> - The new perf interface that resctrl selftests can utilize has been accepted and
>    merged into v7.0-rc2. This series can thus now be considered for inclusion.
>    For reference,
>    commit 6a8a48644c4b ("perf/x86/intel/uncore: Add per-scheduler IMC CAS count events")
>    The resctrl selftest changes making use of the new perf interface are backward
>    compatible. The selftests do not require a v7.0-rc2 kernel to run but the
>    tests can only pass on recent Intel platforms running v7.0-rc2 or later.
> - Combine the two outstanding resctrl selftest submissions into one series
>    for easier tracking:
>    https://lore.kernel.org/lkml/084e82b5c29d75f16f24af8768d50d39ba0118a5.1769101788.git.reinette.chatre@intel.com/
>    https://lore.kernel.org/lkml/cover.1770406608.git.reinette.chatre@intel.com/
> - Fix typo in changelog of "selftests/resctrl: Improve accuracy of cache
>    occupancy test": "the data my be in L2" -> "the data may be in L2"
> - Add Zide Chen's RB tags.
> 
> Cover letter updated to be accurate wrt perf changes:
> 
> The resctrl selftests fail on recent Intel platforms: intermittent failures
> in the CAT test and permanent failures of the MBM and MBA tests on new
> platforms like Sierra Forest and Granite Rapids.
> 
> The MBM and MBA resctrl selftests both generate memory traffic and compare the
> memory bandwidth measurements between the iMC PMUs and MBM to determine pass or
> fail. Both these tests are failing on recent platforms like Sierra Forest and
> Granite Rapids that have two events that need to be read and combined
> for a total memory bandwidth count instead of the single event available on
> earlier platforms.
> 
> resctrl selftests prefer to obtain event details via sysfs instead of adding
> model specific details on which events to read. Enhancements to perf to expose
> the new event details are available since:
>   commit 6a8a48644c4b ("perf/x86/intel/uncore: Add per-scheduler IMC CAS count events")
> This series demonstrates use of the new sysfs interface to perf to
> obtain accurate iMC read memory bandwidth measurements.
> 
> An additional issue with all the tests is that these selftests are partly
> performance tests and determine pass/fail on performance heuristics selected
> after running the tests on a variety of platforms. When new platforms
> arrive the previous heuristics may cause the tests to fail. These failures are
> not because of an issue with the resctrl subsystem the tests intend to test
> but because of the architectural changes in the new platforms.
> 
> Adapt the resctrl tests to not be as sensitive to architectural changes
> while adjusting the remaining heuristics to ensure tests pass on a variety
> of platforms. More details in individual patches.
> 
> Tested by running 100 iterations of all tests on Emerald Rapids, Granite
> Rapids, Sapphire Rapids, Ice Lake, Sierra Forest, and Broadwell.
> 

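The cover letter's point about reading two events per iMC and combining them into one bandwidth figure can be sketched roughly as below. This is only an illustration, not the selftest code: the event string, scale value and counts are made-up samples in the comma-separated `term=value` form that perf exposes under an event source's `events/` directory in sysfs, and both helper names are hypothetical.

```python
def parse_event_config(text):
    """Parse a perf sysfs event description such as 'event=0x05,umask=0xcf'."""
    terms = {}
    for term in text.strip().split(","):
        name, _, value = term.partition("=")
        # A term without '=' is treated here as a flag with value 1.
        terms[name] = int(value, 0) if value else 1
    return terms


def total_bandwidth_mib(samples):
    """Sum the scaled counts of every event that contributes to the total.

    samples: iterable of (raw_count, scale) pairs, one per event, where
    scale is the multiplier read from that event's '.scale' file.
    """
    return sum(raw * scale for raw, scale in samples)


# Illustrative values only, not taken from real hardware.
cfg = parse_event_config("event=0x05,umask=0xcf\n")
scale = 6.103515625e-5  # 64 bytes per count, expressed in MiB
total = total_bandwidth_mib([(16384, scale), (16384, scale)])
print(cfg, total)
```

On a platform with a single event the list passed to `total_bandwidth_mib()` would simply have one entry, so the same path covers both cases.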
Tested on a GNR system with SNC3. Without this series I saw MBM/MBA failures.
With the series applied on v7.0-rc2, I did not see any errors:
  sudo ./resctrl_tests
TAP version 13
# Pass: Check kernel supports resctrl filesystem
# Pass: Check resctrl mountpoint "/sys/fs/resctrl" exists
# resctrl filesystem not mounted
# dmesg: [   16.192737] resctrl: Sub-NUMA Cluster mode detected with 3 nodes per L3 cache
# dmesg: [   16.287785] resctrl: L3 allocation detected
# dmesg: [   16.287961] resctrl: L2 allocation detected
# dmesg: [   16.288093] resctrl: MB allocation detected
# dmesg: [   16.288186] resctrl: L3 monitoring detected
1..6
# SNC-3 mode discovered.
# Starting MBM test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Writing benchmark parameters to resctrl FS
# Benchmark PID: 5503
# Write schema "MB:0=100" to resctrl FS
# Checking for pass/fail
# Pass: Check MBM diff within 15%
# avg_diff_per: 2%
# Span (MB): 640
# avg_bw_imc: 8815
# avg_bw_resc: 8570
ok 1 MBM: test
# Starting MBA test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Writing benchmark parameters to resctrl FS
# Benchmark PID: 5506
# Write schema "MB:0=10" to resctrl FS
# Write schema "MB:0=20" to resctrl FS
# Write schema "MB:0=30" to resctrl FS
# Write schema "MB:0=40" to resctrl FS
# Write schema "MB:0=50" to resctrl FS
# Write schema "MB:0=60" to resctrl FS
# Write schema "MB:0=70" to resctrl FS
# Write schema "MB:0=80" to resctrl FS
# Write schema "MB:0=90" to resctrl FS
# Write schema "MB:0=100" to resctrl FS
# Results are displayed in (MB)
# Bandwidth below threshold (2500 MiB). Dropping results from MBA schemata 10.
# Bandwidth below threshold (2500 MiB). Dropping results from MBA schemata 20.
# Bandwidth below threshold (2500 MiB). Dropping results from MBA schemata 30.
# Bandwidth below threshold (2500 MiB). Dropping results from MBA schemata 40.
# Bandwidth below threshold (2500 MiB). Dropping results from MBA schemata 50.
# Pass: Check MBA diff within 15% for schemata 60
# avg_diff_per: 6%
# avg_bw_imc: 4669
# avg_bw_resc: 4355
# Pass: Check MBA diff within 15% for schemata 70
# avg_diff_per: 5%
# avg_bw_imc: 5556
# avg_bw_resc: 5231
# Pass: Check MBA diff within 15% for schemata 80
# avg_diff_per: 5%
# avg_bw_imc: 6257
# avg_bw_resc: 5942
# Pass: Check MBA diff within 15% for schemata 90
# avg_diff_per: 4%
# avg_bw_imc: 7126
# avg_bw_resc: 6804
# Pass: Check MBA diff within 15% for schemata 100
# avg_diff_per: 3%
# avg_bw_imc: 9246
# avg_bw_resc: 8901
# Pass: Check schemata change using MBA
ok 2 MBA: test
# Starting CMT test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Cache size :167772160
# Writing benchmark parameters to resctrl FS
# Write schema "L3:0=ffe0" to resctrl FS
# Write schema "L3:0=1f" to resctrl FS
# Write schema "L2:1=0x1" to resctrl FS
# Benchmark PID: 5508
# Checking for pass/fail
# Pass: Check cache miss rate within 15%
# Percent diff=0
# Number of bits: 5
# Average LLC val: 52264960
# Cache span (bytes): 52428800
ok 3 CMT: test
# Starting L3_CAT test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Cache size :167772160
# Writing benchmark parameters to resctrl FS
# Write schema "L2:1=0x1" to resctrl FS
# Write schema "L3:0=3f80" to resctrl FS
# Write schema "L3:0=7f" to resctrl FS
# Write schema "L3:0=3fe0" to resctrl FS
# Write schema "L3:0=1f" to resctrl FS
# Write schema "L3:0=3ff8" to resctrl FS
# Write schema "L3:0=7" to resctrl FS
# Write schema "L3:0=3ffe" to resctrl FS
# Write schema "L3:0=1" to resctrl FS
# Checking for pass/fail
# Number of bits: 7
# Average LLC val: 620490
# Cache span (lines): 1146880
# Pass: Check cache miss rate increased
# Number of bits: 5
# Average LLC val: 1149986
# Cache span (lines): 819200
# Pass: Check cache miss rate increased
# Number of bits: 3
# Average LLC val: 1604363
# Cache span (lines): 491520
# Pass: Check cache miss rate increased
# Number of bits: 1
# Average LLC val: 2285082
# Cache span (lines): 163840
ok 4 L3_CAT: test
# Starting L3_NONCONT_CAT test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Write schema "L3:0=ff" to resctrl FS
# Write schema "L3:0=fc3f" to resctrl FS
ok 5 L3_NONCONT_CAT: test
# Starting L2_NONCONT_CAT test ...
# Mounting resctrl to "/sys/fs/resctrl"
# Write schema "L2:1=ff" to resctrl FS
# Write schema "L2:1=fc3f" to resctrl FS
ok 6 L2_NONCONT_CAT: test
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0
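As a sanity check on the MBM/MBA numbers in the log above, the reported avg_diff_per values can be reproduced from avg_bw_imc and avg_bw_resc. The sketch below assumes the difference is taken relative to the iMC value and truncated to a whole percent, and that a result passes when it does not exceed the 15% limit. Both are assumptions chosen because they match this log, and `diff_percent` is a hypothetical helper name.

```python
def diff_percent(bw_imc, bw_resc):
    """Relative difference against the iMC value, truncated to a whole percent."""
    return abs(bw_resc - bw_imc) * 100 // bw_imc


MAX_DIFF_PERCENT = 15  # limit named in the "within 15%" log lines

# (label, avg_bw_imc, avg_bw_resc, avg_diff_per as reported in the log)
results = [
    ("MBM", 8815, 8570, 2),
    ("MBA 60", 4669, 4355, 6),
    ("MBA 70", 5556, 5231, 5),
    ("MBA 80", 6257, 5942, 5),
    ("MBA 90", 7126, 6804, 4),
    ("MBA 100", 9246, 8901, 3),
]

for label, imc, resc, reported in results:
    pct = diff_percent(imc, resc)
    verdict = "pass" if pct <= MAX_DIFF_PERCENT else "fail"
    print(f"{label}: {pct}% (log reports {reported}%) -> {verdict}")
```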

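The cache span values in the CMT and L3_CAT output are consistent with scaling the reported cache size by the fraction of mask bits in use. The sketch below assumes the full L3 mask is the union of the two masks written by the CMT test (16 bits) and that a cache line is 64 bytes. Neither is stated in the log, and `span_bytes` is a hypothetical helper name.

```python
CACHE_SIZE = 167772160      # "Cache size" from the log, in bytes
FULL_MASK = 0xffe0 | 0x1f   # assumed full mask: union of the CMT test's masks
LINE_SIZE = 64              # assumed cache line size in bytes


def span_bytes(n_bits):
    """Portion of the cache covered by n_bits of the full capacity bitmask."""
    total_bits = bin(FULL_MASK).count("1")
    return CACHE_SIZE * n_bits // total_bits


# CMT test: 5 bits, span reported in bytes
print("CMT span (bytes):", span_bytes(5))

# L3_CAT test: span reported in cache lines
for bits in (7, 5, 3, 1):
    print(f"L3_CAT {bits} bits, span (lines):", span_bytes(bits) // LINE_SIZE)
```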
Tested-by: Chen Yu <yu.c.chen@intel.com>

thanks,
Chenyu

Thread overview: 20+ messages
2026-03-04  0:19 [PATCH v2 0/9] selftests/resctrl: Fixes and improvements focused on Intel platforms Reinette Chatre
2026-03-04  0:19 ` [PATCH v2 1/9] selftests/resctrl: Improve accuracy of cache occupancy test Reinette Chatre
2026-03-06  9:47   ` Ilpo Järvinen
2026-03-06 19:24     ` Reinette Chatre
2026-03-09  7:44       ` Ilpo Järvinen
2026-03-04  0:19 ` [PATCH v2 2/9] selftests/resctrl: Do not store iMC counter value in counter config structure Reinette Chatre
2026-03-06  9:51   ` Ilpo Järvinen
2026-03-06 19:25     ` Reinette Chatre
2026-03-04  0:19 ` [PATCH v2 3/9] selftests/resctrl: Prepare for parsing multiple events per iMC Reinette Chatre
2026-03-04  0:19 ` [PATCH v2 4/9] selftests/resctrl: Support multiple events associated with iMC Reinette Chatre
2026-03-06 10:18   ` Ilpo Järvinen
2026-03-06 19:25     ` Reinette Chatre
2026-03-04  0:19 ` [PATCH v2 5/9] selftests/resctrl: Increase size of buffer used in MBM and MBA tests Reinette Chatre
2026-03-04  0:19 ` [PATCH v2 6/9] selftests/resctrl: Raise threshold at which MBM and PMU values are compared Reinette Chatre
2026-03-04  0:19 ` [PATCH v2 7/9] selftests/resctrl: Remove requirement on cache miss rate Reinette Chatre
2026-03-04  0:19 ` [PATCH v2 8/9] selftests/resctrl: Simplify perf usage in CAT test Reinette Chatre
2026-03-04  0:19 ` [PATCH v2 9/9] selftests/resctrl: Reduce L2 impact on " Reinette Chatre
2026-03-06 10:35   ` Ilpo Järvinen
2026-03-06 19:26     ` Reinette Chatre
2026-03-04 15:18 ` Chen, Yu C [this message]
