Linux Kernel Selftest development
 help / color / mirror / Atom feed
* [PATCH v5 0/2] selftests/resctrl: SNC kernel support discovery
@ 2024-10-29 13:00 Maciej Wieczor-Retman
  2024-10-29 13:00 ` [PATCH v5 1/2] selftests/resctrl: Adjust effective L3 cache size with SNC enabled Maciej Wieczor-Retman
  2024-10-29 13:00 ` [PATCH v5 2/2] selftests/resctrl: Adjust SNC support messages Maciej Wieczor-Retman
  0 siblings, 2 replies; 6+ messages in thread
From: Maciej Wieczor-Retman @ 2024-10-29 13:00 UTC (permalink / raw)
  To: fenghua.yu, reinette.chatre, shuah
  Cc: linux-kselftest, linux-kernel, ilpo.jarvinen, tony.luck

Changes v5:
- Tests are skipped if snc_unreliable was set.
- Moved resctrlfs.c changes from patch 2/2 to 1/2.
- Removed CAT changes since it's not impacted by SNC in the selftest.
- Updated various comments.
- Fixed a bunch of minor issues pointed out in the review.

Changes v4:
- Printing SNC warnings at the start of every test.
- Printing SNC warnings at the end of every relevant test.
- Remove global snc_mode variable, consolidate snc detection functions
  into one.
- Correct minor mistakes.

Changes v3:
- Reworked patch 2.
- Changed minor things in patch 1 like function name and made
  corrections to the patch message.

Changes v2:
- Removed patches 2 and 3 since now this part will be supported by the
  kernel.

Sub-Numa Clustering (SNC) allows splitting CPU cores, caches and memory
into multiple NUMA nodes. When enabled, NUMA-aware applications can
achieve better performance on bigger server platforms.

SNC support in the kernel was merged into x86/cache [1]. With SNC enabled
and kernel support in place all the tests will function normally (aside
from effective cache size). There might be a problem when SNC is enabled
but the system is still using an older kernel version without SNC
support. Currently the only message displayed in that situation is a
guess that SNC might be enabled and is causing issues. That message also
is displayed whenever the test fails on an Intel platform.

Add a mechanism to discover kernel support for SNC which will add more
meaning and certainty to the error message.

Add runtime SNC mode detection and verify how reliable that information
is.

Series was tested on Ice Lake server platforms with SNC disabled, SNC-2
and SNC-4. The tests were also ran with and without kernel support for
SNC.

Series applies cleanly on kselftest/next.

[1] https://lore.kernel.org/all/20240628215619.76401-1-tony.luck@intel.com/

Previous versions of this series:
[v1] https://lore.kernel.org/all/cover.1709721159.git.maciej.wieczor-retman@intel.com/
[v2] https://lore.kernel.org/all/cover.1715769576.git.maciej.wieczor-retman@intel.com/
[v3] https://lore.kernel.org/all/cover.1719842207.git.maciej.wieczor-retman@intel.com/
[v4] https://lore.kernel.org/all/cover.1720774981.git.maciej.wieczor-retman@intel.com/

Maciej Wieczor-Retman (2):
  selftests/resctrl: Adjust effective L3 cache size with SNC enabled
  selftests/resctrl: Adjust SNC support messages

 tools/testing/selftests/resctrl/cmt_test.c    |   8 +-
 tools/testing/selftests/resctrl/mba_test.c    |   8 +-
 tools/testing/selftests/resctrl/mbm_test.c    |  10 +-
 tools/testing/selftests/resctrl/resctrl.h     |   7 +
 .../testing/selftests/resctrl/resctrl_tests.c |   8 +-
 tools/testing/selftests/resctrl/resctrlfs.c   | 132 ++++++++++++++++++
 6 files changed, 166 insertions(+), 7 deletions(-)

-- 
2.46.2


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v5 1/2] selftests/resctrl: Adjust effective L3 cache size with SNC enabled
  2024-10-29 13:00 [PATCH v5 0/2] selftests/resctrl: SNC kernel support discovery Maciej Wieczor-Retman
@ 2024-10-29 13:00 ` Maciej Wieczor-Retman
  2024-11-05 17:08   ` Reinette Chatre
  2024-11-05 23:28   ` Reinette Chatre
  2024-10-29 13:00 ` [PATCH v5 2/2] selftests/resctrl: Adjust SNC support messages Maciej Wieczor-Retman
  1 sibling, 2 replies; 6+ messages in thread
From: Maciej Wieczor-Retman @ 2024-10-29 13:00 UTC (permalink / raw)
  To: fenghua.yu, reinette.chatre, shuah
  Cc: linux-kernel, linux-kselftest, ilpo.jarvinen, tony.luck

Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA
nodes. Systems may support splitting into either two, three or four
nodes.

When SNC mode is enabled the effective amount of L3 cache available
for allocation is divided by the number of nodes per L3.

Detect which SNC mode is active by comparing the number of CPUs
that share a cache with CPU0, with the number of CPUs on node0.

Co-developed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
Changelog v5:
- Set snc_mode to 1 at the start of if(!snc_mode) block.
- Move all resctrlfs.c code from patch 2/2 to this one. (Reinette)
- Fix Co-developed-by tag.
- Update SNC discovery comments for the case where it gets disabled by
  being unreliable.
- Remove exclamation mark from ksft_perror() and add full file path
  instead of "offline CPUs file".
- bit map -> bitmap.
- Remove unnecessary empty line.

Changelog v4:
- Make returned value a static local variable so the function only runs
  the logic once. (Reinette)

Changelog v3:
- Add comparison between present and online cpus to test if the
  calculated SNC mode is credible. (Reinette)
- Added comment to cache size modification to better explain why it is
  needed there. (Reinette)
- Fix facts in patch message. (Reinette)
- Change snc_ways() to snc_nodes_per_l3_cache(). (Reinette)

 tools/testing/selftests/resctrl/resctrl.h     |   4 +
 .../testing/selftests/resctrl/resctrl_tests.c |   8 +-
 tools/testing/selftests/resctrl/resctrlfs.c   | 132 ++++++++++++++++++
 3 files changed, 143 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 2dda56084588..851b37c9c38a 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -11,6 +11,7 @@
 #include <signal.h>
 #include <dirent.h>
 #include <stdbool.h>
+#include <ctype.h>
 #include <sys/stat.h>
 #include <sys/ioctl.h>
 #include <sys/mount.h>
@@ -43,6 +44,8 @@
 
 #define DEFAULT_SPAN		(250 * MB)
 
+#define MAX_SNC		4
+
 /*
  * user_params:		User supplied parameters
  * @cpu:		CPU number to which the benchmark will be bound to
@@ -120,6 +123,7 @@ extern volatile int *value_sink;
 
 extern char llc_occup_path[1024];
 
+int snc_nodes_per_l3_cache(void);
 int get_vendor(void);
 bool check_resctrlfs_support(void);
 int filter_dmesg(void);
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index ecbb7605a981..4b84d6199a36 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -118,11 +118,17 @@ static bool test_vendor_specific_check(const struct resctrl_test *test)
 
 static void run_single_test(const struct resctrl_test *test, const struct user_params *uparams)
 {
-	int ret;
+	int ret, snc_mode;
 
 	if (test->disabled)
 		return;
 
+	snc_mode = snc_nodes_per_l3_cache();
+	if (snc_mode > 1)
+		ksft_print_msg("SNC-%d mode discovered.\n", snc_mode);
+	else if (snc_unreliable)
+		ksft_print_msg("SNC detection unreliable due to offline CPUs. Test results may not be accurate if SNC enabled.\n");
+
 	if (!test_vendor_specific_check(test)) {
 		ksft_test_result_skip("Hardware does not support %s\n", test->name);
 		return;
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index 250c320349a7..d6384f404d95 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -13,6 +13,8 @@
 
 #include "resctrl.h"
 
+int snc_unreliable;
+
 static int find_resctrl_mount(char *buffer)
 {
 	FILE *mounts;
@@ -156,6 +158,93 @@ int get_domain_id(const char *resource, int cpu_no, int *domain_id)
 	return 0;
 }
 
+/*
+ * Count number of CPUs in a /sys bitmap
+ */
+static unsigned int count_sys_bitmap_bits(char *name)
+{
+	FILE *fp = fopen(name, "r");
+	int count = 0, c;
+
+	if (!fp)
+		return 0;
+
+	while ((c = fgetc(fp)) != EOF) {
+		if (!isxdigit(c))
+			continue;
+		switch (c) {
+		case 'f':
+			count++;
+		case '7': case 'b': case 'd': case 'e':
+			count++;
+		case '3': case '5': case '6': case '9': case 'a': case 'c':
+			count++;
+		case '1': case '2': case '4': case '8':
+			count++;
+		}
+	}
+	fclose(fp);
+
+	return count;
+}
+
+static bool cpus_offline_empty(void)
+{
+	char offline_cpus_str[64];
+	FILE *fp;
+
+	fp = fopen("/sys/devices/system/cpu/offline", "r");
+	if (fscanf(fp, "%s", offline_cpus_str) < 0) {
+		if (!errno) {
+			fclose(fp);
+			return 1;
+		}
+		ksft_perror("Could not read /sys/devices/system/cpu/offline");
+	}
+
+	fclose(fp);
+
+	return 0;
+}
+
+/*
+ * Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0.
+ * If some CPUs are offline the numbers may not be exact multiples of each
+ * other. Any offline CPUs on node0 will be also gone from shared_cpu_map of
+ * CPU0 but offline CPUs from other nodes will only make the cache_cpus value
+ * lower. Still try to get the ratio right by preventing the second possibility.
+ */
+int snc_nodes_per_l3_cache(void)
+{
+	int node_cpus, cache_cpus, i;
+	static int snc_mode;
+
+	if (!snc_mode) {
+		snc_mode = 1;
+		if (!cpus_offline_empty()) {
+			ksft_print_msg("Runtime SNC detection unreliable due to offline CPUs.\n");
+			ksft_print_msg("Setting SNC mode to disabled.\n");
+			snc_unreliable = 1;
+			return snc_mode;
+		}
+		node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap");
+		cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
+
+		if (!node_cpus || !cache_cpus) {
+			ksft_print_msg("Could not determine Sub-NUMA Cluster mode.\n");
+			return 1;
+		}
+		for (i = 1; i <= MAX_SNC ; i++) {
+			if (i * node_cpus >= cache_cpus) {
+				snc_mode = i;
+				break;
+			}
+		}
+	}
+
+	return snc_mode;
+}
+
 /*
  * get_cache_size - Get cache size for a specified CPU
  * @cpu_no:	CPU number
@@ -211,6 +300,17 @@ int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size
 			break;
 	}
 
+	/*
+	 * The amount of cache represented by each bit in the masks
+	 * in the schemata file is reduced by a factor equal to SNC
+	 * nodes per L3 cache.
+	 * E.g. on a SNC-2 system with a 100MB L3 cache a test that
+	 * allocates memory from its local SNC node (default behavior
+	 * without using libnuma) will only see 50 MB llc_occupancy
+	 * with a fully populated L3 mask in the schemata file.
+	 */
+	if (cache_num == 3)
+		*cache_size /= snc_nodes_per_l3_cache();
 	return 0;
 }
 
@@ -869,3 +969,35 @@ unsigned int count_bits(unsigned long n)
 
 	return count;
 }
+
+/**
+ * snc_kernel_support - Check for existence of mon_sub_L3_00 file that indicates
+ * SNC resctrl support on the kernel side.
+ *
+ * Return: 0 if not supported, 1 if SNC is disabled or SNC discovery is
+ * unreliable or SNC is both enabled and supported.
+ */
+int snc_kernel_support(void)
+{
+	char node_path[PATH_MAX];
+	struct stat statbuf;
+	int ret;
+
+	ret = snc_nodes_per_l3_cache();
+	/*
+	 * If SNC is disabled then its kernel support isn't important. If SNC
+	 * got disabled because the discovery process was unreliable the
+	 * snc_unreliable variable was set. It can be used to verify the SNC
+	 * discovery reliability elsewhere in the selftest.
+	 */
+	if (ret == 1)
+		return ret;
+
+	snprintf(node_path, sizeof(node_path), "%s/%s/%s", RESCTRL_PATH, "mon_data",
+		 "mon_L3_00/mon_sub_L3_00");
+
+	if (!stat(node_path, &statbuf))
+		return 1;
+
+	return 0;
+}
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v5 2/2] selftests/resctrl: Adjust SNC support messages
  2024-10-29 13:00 [PATCH v5 0/2] selftests/resctrl: SNC kernel support discovery Maciej Wieczor-Retman
  2024-10-29 13:00 ` [PATCH v5 1/2] selftests/resctrl: Adjust effective L3 cache size with SNC enabled Maciej Wieczor-Retman
@ 2024-10-29 13:00 ` Maciej Wieczor-Retman
  2024-11-05 17:08   ` Reinette Chatre
  1 sibling, 1 reply; 6+ messages in thread
From: Maciej Wieczor-Retman @ 2024-10-29 13:00 UTC (permalink / raw)
  To: fenghua.yu, reinette.chatre, shuah
  Cc: linux-kernel, linux-kselftest, ilpo.jarvinen, tony.luck

Resctrl selftest prints a message on test failure that Sub-Numa
Clustering (SNC) could be enabled and points the user to check their BIOS
settings. No actual check is performed before printing that message so
it is not very accurate in pinpointing a problem.

Figuring out if SNC is enabled is only one part of the problem, the
others being whether the detected SNC mode is reliable and whether the
kernel supports SNC in resctrl.

When there is SNC support for kernel's resctrl subsystem and SNC is
enabled then sub node files are created for each node in the resctrlfs.
The sub node files exist in each regular node's L3 monitoring directory.
The reliable path to check for existence of sub node files is
/sys/fs/resctrl/mon_data/mon_L3_00/mon_sub_L3_00.

To check if SNC detection is reliable one can check the
/sys/devices/system/cpu/offline file. If it's empty, it means all cores
are operational and the ratio should be calculated correctly. If it has
any contents, it means the detected SNC mode can't be trusted and should
be disabled.

Add helpers for all operations mentioned above.

Detect SNC mode once and let other tests inherit that information.

Add messages to alert the user when SNC detection could return incorrect
results. Correct old messages to account for kernel support of SNC in
resctrl.

Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
---
Changelog v5:
- Move all resctrlfs.c code from this patch to 1/2. (Reinette)
- Remove kernel support check and error message from CAT since it can't
  be happen.
- Remove snc checks in CAT since snc doesn't affect it here.
- Skip MBM, MBA and CMT tests if snc is unreliable.

Changelog v4:
- Change messages at the end of tests and at the start of
  run_single_test. (Reinette)
- Add messages at the end of CAT since it can also fail due to enabled
  SNC + lack of kernel support.
- Remove snc_mode global variable. (Reinette)
- Fix wrong description of snc_kernel_support(). (Reinette)
- Move call to cpus_offline_empty() into snc_nodes_per_l3_cache() so the
  whole detection flow is in one place as discussed. (Reinette)

Changelog v3:
- Change snc_ways() to snc_nodes_per_l3_cache(). (Reinette)
- Add printing the discovered SNC mode. (Reinette)
- Change method of kernel support discovery from cache sizes to
  existance of sub node files.
- Check if SNC detection is unreliable.
- Move SNC detection to only the first run_single_test() instead on
  error at the end of test runs.
- Add global value to remind user at the end of relevant tests if SNC
  detection was found to be unreliable.
- Redo the patch message after the changes.

Changelog v2:
- Move snc_ways() checks from individual tests into
  snc_kernel_support().
- Write better comment for snc_kernel_support().

 tools/testing/selftests/resctrl/cmt_test.c |  8 ++++++--
 tools/testing/selftests/resctrl/mba_test.c |  8 +++++++-
 tools/testing/selftests/resctrl/mbm_test.c | 10 +++++++---
 tools/testing/selftests/resctrl/resctrl.h  |  3 +++
 4 files changed, 23 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index 0c045080d808..1470bd64d158 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -133,6 +133,10 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param
 	ret = get_cache_size(uparams->cpu, "L3", &cache_total_size);
 	if (ret)
 		return ret;
+
+	if ((get_vendor() == ARCH_INTEL) && snc_unreliable)
+		ksft_exit_skip("Sub-NUMA Clustering could not be detected properly. Skipping...\n");
+
 	ksft_print_msg("Cache size :%lu\n", cache_total_size);
 
 	count_of_bits = count_bits(long_mask);
@@ -175,8 +179,8 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param
 		goto out;
 
 	ret = check_results(&param, span, n);
-	if (ret && (get_vendor() == ARCH_INTEL))
-		ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
+	if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support())
+		ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
 
 out:
 	free(span_str);
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index ab8496a4925b..8f4e198da047 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -170,15 +170,21 @@ static int mba_run_test(const struct resctrl_test *test, const struct user_param
 		.setup		= mba_setup,
 		.measure	= mba_measure,
 	};
-	int ret;
+	int ret, snc_support;
 
 	remove(RESULT_FILE_NAME);
 
+	snc_support = snc_kernel_support();
+	if ((get_vendor() == ARCH_INTEL) && snc_unreliable)
+		ksft_exit_skip("Sub-NUMA Clustering could not be detected properly. Skipping...\n");
+
 	ret = resctrl_val(test, uparams, uparams->benchmark_cmd, &param);
 	if (ret)
 		return ret;
 
 	ret = check_results();
+	if (ret && (get_vendor() == ARCH_INTEL) && !snc_support)
+		ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
 
 	return ret;
 }
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 6b5a3b52d861..a68f70589b91 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -138,17 +138,21 @@ static int mbm_run_test(const struct resctrl_test *test, const struct user_param
 		.setup		= mbm_setup,
 		.measure	= mbm_measure,
 	};
-	int ret;
+	int ret, snc_support;
 
 	remove(RESULT_FILE_NAME);
 
+	snc_support = snc_kernel_support();
+	if ((get_vendor() == ARCH_INTEL) && snc_unreliable)
+		ksft_exit_skip("Sub-NUMA Clustering could not be detected properly. Skipping...\n");
+
 	ret = resctrl_val(test, uparams, uparams->benchmark_cmd, &param);
 	if (ret)
 		return ret;
 
 	ret = check_results(DEFAULT_SPAN);
-	if (ret && (get_vendor() == ARCH_INTEL))
-		ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
+	if (ret && (get_vendor() == ARCH_INTEL) && !snc_support)
+		ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
 
 	return ret;
 }
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index 851b37c9c38a..488bdca01e4f 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -121,6 +121,8 @@ struct perf_event_read {
  */
 extern volatile int *value_sink;
 
+extern int snc_unreliable;
+
 extern char llc_occup_path[1024];
 
 int snc_nodes_per_l3_cache(void);
@@ -167,6 +169,7 @@ void ctrlc_handler(int signum, siginfo_t *info, void *ptr);
 int signal_handler_register(const struct resctrl_test *test);
 void signal_handler_unregister(void);
 unsigned int count_bits(unsigned long n);
+int snc_kernel_support(void);
 
 void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config);
 void perf_event_initialize_read_format(struct perf_event_read *pe_read);
-- 
2.46.2


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 1/2] selftests/resctrl: Adjust effective L3 cache size with SNC enabled
  2024-10-29 13:00 ` [PATCH v5 1/2] selftests/resctrl: Adjust effective L3 cache size with SNC enabled Maciej Wieczor-Retman
@ 2024-11-05 17:08   ` Reinette Chatre
  2024-11-05 23:28   ` Reinette Chatre
  1 sibling, 0 replies; 6+ messages in thread
From: Reinette Chatre @ 2024-11-05 17:08 UTC (permalink / raw)
  To: Maciej Wieczor-Retman, fenghua.yu, shuah
  Cc: linux-kernel, linux-kselftest, ilpo.jarvinen, tony.luck

Hi Maciej,

On 10/29/24 6:00 AM, Maciej Wieczor-Retman wrote:
> Sub-NUMA Cluster divides CPUs sharing an L3 cache into separate NUMA
> nodes. Systems may support splitting into either two, three or four
> nodes.
> 
> When SNC mode is enabled the effective amount of L3 cache available
> for allocation is divided by the number of nodes per L3.
> 
> Detect which SNC mode is active by comparing the number of CPUs
> that share a cache with CPU0, with the number of CPUs on node0.
> 
> Co-developed-by: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Tony Luck <tony.luck@intel.com>
> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
> ---

...

> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index 2dda56084588..851b37c9c38a 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -11,6 +11,7 @@
>  #include <signal.h>
>  #include <dirent.h>
>  #include <stdbool.h>
> +#include <ctype.h>
>  #include <sys/stat.h>
>  #include <sys/ioctl.h>
>  #include <sys/mount.h>
> @@ -43,6 +44,8 @@
>  
>  #define DEFAULT_SPAN		(250 * MB)
>  
> +#define MAX_SNC		4
> +
>  /*
>   * user_params:		User supplied parameters
>   * @cpu:		CPU number to which the benchmark will be bound to
> @@ -120,6 +123,7 @@ extern volatile int *value_sink;
>  
>  extern char llc_occup_path[1024];
>  
> +int snc_nodes_per_l3_cache(void);
>  int get_vendor(void);
>  bool check_resctrlfs_support(void);
>  int filter_dmesg(void);
> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
> index ecbb7605a981..4b84d6199a36 100644
> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> @@ -118,11 +118,17 @@ static bool test_vendor_specific_check(const struct resctrl_test *test)
>  
>  static void run_single_test(const struct resctrl_test *test, const struct user_params *uparams)
>  {
> -	int ret;
> +	int ret, snc_mode;
>  
>  	if (test->disabled)
>  		return;
>  
> +	snc_mode = snc_nodes_per_l3_cache();
> +	if (snc_mode > 1)
> +		ksft_print_msg("SNC-%d mode discovered.\n", snc_mode);
> +	else if (snc_unreliable)
> +		ksft_print_msg("SNC detection unreliable due to offline CPUs. Test results may not be accurate if SNC enabled.\n");

As I understand none of the tests can be considered reliable if SNC detection is unreliable
so skipping the test can be done here in a central spot without duplicating it in each test.

> +
>  	if (!test_vendor_specific_check(test)) {
>  		ksft_test_result_skip("Hardware does not support %s\n", test->name);
>  		return;
> diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
> index 250c320349a7..d6384f404d95 100644
> --- a/tools/testing/selftests/resctrl/resctrlfs.c
> +++ b/tools/testing/selftests/resctrl/resctrlfs.c
> @@ -13,6 +13,8 @@
>  
>  #include "resctrl.h"
>  
> +int snc_unreliable;
> +
>  static int find_resctrl_mount(char *buffer)
>  {
>  	FILE *mounts;
> @@ -156,6 +158,93 @@ int get_domain_id(const char *resource, int cpu_no, int *domain_id)
>  	return 0;
>  }
>  
> +/*
> + * Count number of CPUs in a /sys bitmap
> + */
> +static unsigned int count_sys_bitmap_bits(char *name)
> +{
> +	FILE *fp = fopen(name, "r");
> +	int count = 0, c;
> +
> +	if (!fp)
> +		return 0;
> +
> +	while ((c = fgetc(fp)) != EOF) {
> +		if (!isxdigit(c))
> +			continue;
> +		switch (c) {
> +		case 'f':
> +			count++;
> +		case '7': case 'b': case 'd': case 'e':
> +			count++;
> +		case '3': case '5': case '6': case '9': case 'a': case 'c':
> +			count++;
> +		case '1': case '2': case '4': case '8':
> +			count++;
> +		}
> +	}
> +	fclose(fp);
> +
> +	return count;
> +}
> +
> +static bool cpus_offline_empty(void)
> +{
> +	char offline_cpus_str[64];
> +	FILE *fp;
> +
> +	fp = fopen("/sys/devices/system/cpu/offline", "r");
> +	if (fscanf(fp, "%s", offline_cpus_str) < 0) {
> +		if (!errno) {
> +			fclose(fp);
> +			return 1;
> +		}
> +		ksft_perror("Could not read /sys/devices/system/cpu/offline");
> +	}
> +
> +	fclose(fp);
> +
> +	return 0;
> +}
> +
> +/*
> + * Detect SNC by comparing #CPUs in node0 with #CPUs sharing LLC with CPU0.
> + * If some CPUs are offline the numbers may not be exact multiples of each
> + * other. Any offline CPUs on node0 will be also gone from shared_cpu_map of
> + * CPU0 but offline CPUs from other nodes will only make the cache_cpus value
> + * lower. Still try to get the ratio right by preventing the second possibility.

Similar to v4 this "try to get the ratio right" is unnecessary because of the explicit
check for offline CPUs.

> + */
> +int snc_nodes_per_l3_cache(void)
> +{
> +	int node_cpus, cache_cpus, i;
> +	static int snc_mode;
> +
> +	if (!snc_mode) {
> +		snc_mode = 1;
> +		if (!cpus_offline_empty()) {
> +			ksft_print_msg("Runtime SNC detection unreliable due to offline CPUs.\n");
> +			ksft_print_msg("Setting SNC mode to disabled.\n");
> +			snc_unreliable = 1;
> +			return snc_mode;
> +		}
> +		node_cpus = count_sys_bitmap_bits("/sys/devices/system/node/node0/cpumap");
> +		cache_cpus = count_sys_bitmap_bits("/sys/devices/system/cpu/cpu0/cache/index3/shared_cpu_map");
> +
> +		if (!node_cpus || !cache_cpus) {
> +			ksft_print_msg("Could not determine Sub-NUMA Cluster mode.\n");
> +			return 1;

Unclear why this is hardcoded "1". If this is an error then it should be negative so that
caller can consume as error. If it is intended to represent SNC mode then it should be
"snc_mode".

> +		}
> +		for (i = 1; i <= MAX_SNC ; i++) {
> +			if (i * node_cpus >= cache_cpus) {
> +				snc_mode = i;
> +				break;

As I understand this complication is no longer needed because of earlier cpus_offline_empty()
check.

> +			}
> +		}
> +	}
> +
> +	return snc_mode;
> +}
> +
>  /*
>   * get_cache_size - Get cache size for a specified CPU
>   * @cpu_no:	CPU number
> @@ -211,6 +300,17 @@ int get_cache_size(int cpu_no, const char *cache_type, unsigned long *cache_size
>  			break;
>  	}
>  
> +	/*
> +	 * The amount of cache represented by each bit in the masks
> +	 * in the schemata file is reduced by a factor equal to SNC
> +	 * nodes per L3 cache.
> +	 * E.g. on a SNC-2 system with a 100MB L3 cache a test that
> +	 * allocates memory from its local SNC node (default behavior
> +	 * without using libnuma) will only see 50 MB llc_occupancy
> +	 * with a fully populated L3 mask in the schemata file.
> +	 */
> +	if (cache_num == 3)
> +		*cache_size /= snc_nodes_per_l3_cache();

get_cache_size() is also used by L3 CAT test. L3 CAT test is thus also impacted
if SNC cannot be detected reliably and should be skipped.

>  	return 0;
>  }
>  
> @@ -869,3 +969,35 @@ unsigned int count_bits(unsigned long n)
>  
>  	return count;
>  }
> +
> +/**
> + * snc_kernel_support - Check for existence of mon_sub_L3_00 file that indicates
> + * SNC resctrl support on the kernel side.
> + *
> + * Return: 0 if not supported, 1 if SNC is disabled or SNC discovery is
> + * unreliable or SNC is both enabled and supported.
> + */
> +int snc_kernel_support(void)
> +{
> +	char node_path[PATH_MAX];
> +	struct stat statbuf;
> +	int ret;
> +
> +	ret = snc_nodes_per_l3_cache();
> +	/*
> +	 * If SNC is disabled then its kernel support isn't important. If SNC
> +	 * got disabled because the discovery process was unreliable the
> +	 * snc_unreliable variable was set. It can be used to verify the SNC
> +	 * discovery reliability elsewhere in the selftest.
> +	 */
> +	if (ret == 1)
> +		return ret;
> +
> +	snprintf(node_path, sizeof(node_path), "%s/%s/%s", RESCTRL_PATH, "mon_data",
> +		 "mon_L3_00/mon_sub_L3_00");
> +
> +	if (!stat(node_path, &statbuf))
> +		return 1;
> +
> +	return 0;
> +}

Reinette

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 2/2] selftests/resctrl: Adjust SNC support messages
  2024-10-29 13:00 ` [PATCH v5 2/2] selftests/resctrl: Adjust SNC support messages Maciej Wieczor-Retman
@ 2024-11-05 17:08   ` Reinette Chatre
  0 siblings, 0 replies; 6+ messages in thread
From: Reinette Chatre @ 2024-11-05 17:08 UTC (permalink / raw)
  To: Maciej Wieczor-Retman, fenghua.yu, shuah
  Cc: linux-kernel, linux-kselftest, ilpo.jarvinen, tony.luck

Hi Maciej,

On 10/29/24 6:00 AM, Maciej Wieczor-Retman wrote:
> Resctrl selftest prints a message on test failure that Sub-Numa
> Clustering (SNC) could be enabled and points the user to check their BIOS
> settings. No actual check is performed before printing that message so
> it is not very accurate in pinpointing a problem.
> 
> Figuring out if SNC is enabled is only one part of the problem, the
> others being whether the detected SNC mode is reliable and whether the
> kernel supports SNC in resctrl.

Starting to sound like previous patch ...

> 
> When there is SNC support for kernel's resctrl subsystem and SNC is
> enabled then sub node files are created for each node in the resctrlfs.
> The sub node files exist in each regular node's L3 monitoring directory.
> The reliable path to check for existence of sub node files is
> /sys/fs/resctrl/mon_data/mon_L3_00/mon_sub_L3_00.

... this is about previous patch ...

> 
> To check if SNC detection is reliable one can check the
> /sys/devices/system/cpu/offline file. If it's empty, it means all cores
> are operational and the ratio should be calculated correctly. If it has
> any contents, it means the detected SNC mode can't be trusted and should
> be disabled.

... also about previous patch ...

> 
> Add helpers for all operations mentioned above.

Not done in this patch.

> 
> Detect SNC mode once and let other tests inherit that information.

Not done in this patch.

> 
> Add messages to alert the user when SNC detection could return incorrect
> results. Correct old messages to account for kernel support of SNC in
> resctrl.

Sounds like description of what earlier version of this patch did ...

> 
> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
> ---
> Changelog v5:
> - Move all resctrlfs.c code from this patch to 1/2. (Reinette)

Please update the changelog to match the changes made to the patch.

> - Remove kernel support check and error message from CAT since it can't
>   be happen.
> - Remove snc checks in CAT since snc doesn't affect it here.
> - Skip MBM, MBA and CMT tests if snc is unreliable.
> 
> Changelog v4:
> - Change messages at the end of tests and at the start of
>   run_single_test. (Reinette)
> - Add messages at the end of CAT since it can also fail due to enabled
>   SNC + lack of kernel support.
> - Remove snc_mode global variable. (Reinette)
> - Fix wrong description of snc_kernel_support(). (Reinette)
> - Move call to cpus_offline_empty() into snc_nodes_per_l3_cache() so the
>   whole detection flow is in one place as discussed. (Reinette)
> 
> Changelog v3:
> - Change snc_ways() to snc_nodes_per_l3_cache(). (Reinette)
> - Add printing the discovered SNC mode. (Reinette)
> - Change method of kernel support discovery from cache sizes to
>   existance of sub node files.
> - Check if SNC detection is unreliable.
> - Move SNC detection to only the first run_single_test() instead on
>   error at the end of test runs.
> - Add global value to remind user at the end of relevant tests if SNC
>   detection was found to be unreliable.
> - Redo the patch message after the changes.
> 
> Changelog v2:
> - Move snc_ways() checks from individual tests into
>   snc_kernel_support().
> - Write better comment for snc_kernel_support().
> 
>  tools/testing/selftests/resctrl/cmt_test.c |  8 ++++++--
>  tools/testing/selftests/resctrl/mba_test.c |  8 +++++++-
>  tools/testing/selftests/resctrl/mbm_test.c | 10 +++++++---
>  tools/testing/selftests/resctrl/resctrl.h  |  3 +++
>  4 files changed, 23 insertions(+), 6 deletions(-)
> 
> diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
> index 0c045080d808..1470bd64d158 100644
> --- a/tools/testing/selftests/resctrl/cmt_test.c
> +++ b/tools/testing/selftests/resctrl/cmt_test.c
> @@ -133,6 +133,10 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param
>  	ret = get_cache_size(uparams->cpu, "L3", &cache_total_size);
>  	if (ret)
>  		return ret;
> +
> +	if ((get_vendor() == ARCH_INTEL) && snc_unreliable)
> +		ksft_exit_skip("Sub-NUMA Clustering could not be detected properly. Skipping...\n");
> +

This (inconsistent) duplication can be moved to run_single_test().

>  	ksft_print_msg("Cache size :%lu\n", cache_total_size);
>  
>  	count_of_bits = count_bits(long_mask);
> @@ -175,8 +179,8 @@ static int cmt_run_test(const struct resctrl_test *test, const struct user_param
>  		goto out;
>  
>  	ret = check_results(&param, span, n);
> -	if (ret && (get_vendor() == ARCH_INTEL))
> -		ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
> +	if (ret && (get_vendor() == ARCH_INTEL) && !snc_kernel_support())
> +		ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
>  
>  out:
>  	free(span_str);
> diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
> index ab8496a4925b..8f4e198da047 100644
> --- a/tools/testing/selftests/resctrl/mba_test.c
> +++ b/tools/testing/selftests/resctrl/mba_test.c
> @@ -170,15 +170,21 @@ static int mba_run_test(const struct resctrl_test *test, const struct user_param
>  		.setup		= mba_setup,
>  		.measure	= mba_measure,
>  	};
> -	int ret;
> +	int ret, snc_support;
>  
>  	remove(RESULT_FILE_NAME);
>  
> +	snc_support = snc_kernel_support();
> +	if ((get_vendor() == ARCH_INTEL) && snc_unreliable)
> +		ksft_exit_skip("Sub-NUMA Clustering could not be detected properly. Skipping...\n");
> +
>  	ret = resctrl_val(test, uparams, uparams->benchmark_cmd, &param);
>  	if (ret)
>  		return ret;
>  
>  	ret = check_results();
> +	if (ret && (get_vendor() == ARCH_INTEL) && !snc_support)
> +		ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
>  
>  	return ret;
>  }
> diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
> index 6b5a3b52d861..a68f70589b91 100644
> --- a/tools/testing/selftests/resctrl/mbm_test.c
> +++ b/tools/testing/selftests/resctrl/mbm_test.c
> @@ -138,17 +138,21 @@ static int mbm_run_test(const struct resctrl_test *test, const struct user_param
>  		.setup		= mbm_setup,
>  		.measure	= mbm_measure,
>  	};
> -	int ret;
> +	int ret, snc_support;
>  
>  	remove(RESULT_FILE_NAME);
>  
> +	snc_support = snc_kernel_support();
> +	if ((get_vendor() == ARCH_INTEL) && snc_unreliable)
> +		ksft_exit_skip("Sub-NUMA Clustering could not be detected properly. Skipping...\n");
> +
>  	ret = resctrl_val(test, uparams, uparams->benchmark_cmd, &param);
>  	if (ret)
>  		return ret;
>  
>  	ret = check_results(DEFAULT_SPAN);
> -	if (ret && (get_vendor() == ARCH_INTEL))
> -		ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
> +	if (ret && (get_vendor() == ARCH_INTEL) && !snc_support)
> +		ksft_print_msg("Kernel doesn't support Sub-NUMA Clustering but it is enabled on the system.\n");
>  
>  	return ret;
>  }
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index 851b37c9c38a..488bdca01e4f 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -121,6 +121,8 @@ struct perf_event_read {
>   */
>  extern volatile int *value_sink;
>  
> +extern int snc_unreliable;
> +
>  extern char llc_occup_path[1024];
>  
>  int snc_nodes_per_l3_cache(void);
> @@ -167,6 +169,7 @@ void ctrlc_handler(int signum, siginfo_t *info, void *ptr);
>  int signal_handler_register(const struct resctrl_test *test);
>  void signal_handler_unregister(void);
>  unsigned int count_bits(unsigned long n);
> +int snc_kernel_support(void);
>  
>  void perf_event_attr_initialize(struct perf_event_attr *pea, __u64 config);
>  void perf_event_initialize_read_format(struct perf_event_read *pe_read);

Reinette

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v5 1/2] selftests/resctrl: Adjust effective L3 cache size with SNC enabled
  2024-10-29 13:00 ` [PATCH v5 1/2] selftests/resctrl: Adjust effective L3 cache size with SNC enabled Maciej Wieczor-Retman
  2024-11-05 17:08   ` Reinette Chatre
@ 2024-11-05 23:28   ` Reinette Chatre
  1 sibling, 0 replies; 6+ messages in thread
From: Reinette Chatre @ 2024-11-05 23:28 UTC (permalink / raw)
  To: Maciej Wieczor-Retman, fenghua.yu, shuah
  Cc: linux-kernel, linux-kselftest, ilpo.jarvinen, tony.luck

Hi Maciej,

On 10/29/24 6:00 AM, Maciej Wieczor-Retman wrote:
> diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
> index 2dda56084588..851b37c9c38a 100644
> --- a/tools/testing/selftests/resctrl/resctrl.h
> +++ b/tools/testing/selftests/resctrl/resctrl.h
> @@ -11,6 +11,7 @@
>  #include <signal.h>
>  #include <dirent.h>
>  #include <stdbool.h>
> +#include <ctype.h>
>  #include <sys/stat.h>
>  #include <sys/ioctl.h>
>  #include <sys/mount.h>
> @@ -43,6 +44,8 @@
>  
>  #define DEFAULT_SPAN		(250 * MB)
>  
> +#define MAX_SNC		4
> +

fyi, it seems that 6 is the new max:
https://lore.kernel.org/all/20241031220213.17991-1-tony.luck@intel.com/

Reinette


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-11-05 23:28 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-29 13:00 [PATCH v5 0/2] selftests/resctrl: SNC kernel support discovery Maciej Wieczor-Retman
2024-10-29 13:00 ` [PATCH v5 1/2] selftests/resctrl: Adjust effective L3 cache size with SNC enabled Maciej Wieczor-Retman
2024-11-05 17:08   ` Reinette Chatre
2024-11-05 23:28   ` Reinette Chatre
2024-10-29 13:00 ` [PATCH v5 2/2] selftests/resctrl: Adjust SNC support messages Maciej Wieczor-Retman
2024-11-05 17:08   ` Reinette Chatre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox