* [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests
@ 2023-09-15 15:44 Ilpo Järvinen
2023-09-15 15:44 ` [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal Ilpo Järvinen
` (7 more replies)
0 siblings, 8 replies; 20+ messages in thread
From: Ilpo Järvinen @ 2023-09-15 15:44 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable, Ilpo Järvinen
Fix three issues with resctrl selftests.
The signal handling fix became necessary after the mount/umount fixes.
The other two came up when I ran resctrl selftests across the server
fleet in our lab to validate the upcoming CAT test rewrite (the rewrite
is not part of this series).
These are developed and should apply cleanly at least on top the
benchmark cleanup series (might apply cleanly also w/o the benchmark
series, I didn't test).
v2:
- Include patch to move _GNU_SOURCE to Makefile to allow normal #include
placement
- Rework the signal register/unregister into patch to use helpers
- Fixed incorrect function parameter description
- Use return !!res to avoid confusing implicit boolean conversion
- Improve MBA/MBM success bound patch's changelog
- Tweak Cc: stable dependencies (make it a chain).
Ilpo Järvinen (6):
selftests/resctrl: Extend signal handler coverage to unmount on
receiving signal
selftests/resctrl: Remove duplicate feature check from CMT test
selftests/resctrl: Move _GNU_SOURCE define into Makefile
selftests/resctrl: Refactor feature check to use resource and feature
name
selftests/resctrl: Fix feature checks
selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
tools/testing/selftests/resctrl/Makefile | 2 +-
tools/testing/selftests/resctrl/cat_test.c | 8 --
tools/testing/selftests/resctrl/cmt_test.c | 3 -
tools/testing/selftests/resctrl/mba_test.c | 2 +-
tools/testing/selftests/resctrl/mbm_test.c | 2 +-
tools/testing/selftests/resctrl/resctrl.h | 7 +-
.../testing/selftests/resctrl/resctrl_tests.c | 78 +++++++++++--------
tools/testing/selftests/resctrl/resctrl_val.c | 22 +++---
tools/testing/selftests/resctrl/resctrlfs.c | 69 +++++++---------
9 files changed, 88 insertions(+), 105 deletions(-)
--
2.30.2
^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-15 15:44 [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
@ 2023-09-15 15:44 ` Ilpo Järvinen
2023-09-26 21:38 ` Reinette Chatre
2023-09-15 15:44 ` [PATCH v2 2/6] selftests/resctrl: Remove duplicate feature check from CMT test Ilpo Järvinen
` (6 subsequent siblings)
7 siblings, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2023-09-15 15:44 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable, Ilpo Järvinen
Unmounting resctrl FS has been moved into the per test functions in
resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
SIGTERM, or SIGHUP) is received, the running selftest is aborted by
ctrlc_handler() which then unmounts resctrl fs before exiting. The
current section between signal_handler_register() and
signal_handler_unregister(), however, does not cover the entire
duration when resctrl FS is mounted.
Move signal_handler_register() and signal_handler_unregister() calls
from per test files into resctrl_tests.c to properly unmount resctrl
fs. In order to not add signal_handler_register()/unregister() n times,
create helpers test_prepare() and test_cleanup().
Adjust child process kill() call in ctrlc_handler() to only be invoked
if the child was already forked.
Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: <stable@vger.kernel.org>
---
tools/testing/selftests/resctrl/cat_test.c | 8 ---
.../testing/selftests/resctrl/resctrl_tests.c | 65 +++++++++++--------
tools/testing/selftests/resctrl/resctrl_val.c | 22 +++----
3 files changed, 48 insertions(+), 47 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
index 97b87285ab2a..224ba8544d8a 100644
--- a/tools/testing/selftests/resctrl/cat_test.c
+++ b/tools/testing/selftests/resctrl/cat_test.c
@@ -167,12 +167,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
strcpy(param.filename, RESULT_FILE_NAME1);
param.num_of_runs = 0;
param.cpu_no = sibling_cpu_no;
- } else {
- ret = signal_handler_register();
- if (ret) {
- kill(bm_pid, SIGKILL);
- goto out;
- }
}
remove(param.filename);
@@ -209,10 +203,8 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
}
close(pipefd[0]);
kill(bm_pid, SIGKILL);
- signal_handler_unregister();
}
-out:
cat_test_cleanup();
return ret;
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 823672a20a43..524ba83d7568 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -67,21 +67,41 @@ void tests_cleanup(void)
cat_test_cleanup();
}
-static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
+static int test_prepare()
{
int res;
- ksft_print_msg("Starting MBM BW change ...\n");
+ res = signal_handler_register();
+ if (res)
+ return res;
res = mount_resctrlfs();
if (res) {
+ signal_handler_unregister();
ksft_exit_fail_msg("Failed to mount resctrl FS\n");
- return;
+ return res;
}
+ return 0;
+}
+
+static void test_cleanup()
+{
+ umount_resctrlfs();
+ signal_handler_unregister();
+}
+
+static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
+{
+ int res;
+
+ ksft_print_msg("Starting MBM BW change ...\n");
+
+ if (test_prepare())
+ return;
if (!validate_resctrl_feature_request(MBM_STR) || (get_vendor() != ARCH_INTEL)) {
ksft_test_result_skip("Hardware does not support MBM or MBM is disabled\n");
- goto umount;
+ goto cleanup;
}
res = mbm_bw_change(cpu_no, benchmark_cmd);
@@ -89,8 +109,8 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
if ((get_vendor() == ARCH_INTEL) && res)
ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
-umount:
- umount_resctrlfs();
+cleanup:
+ test_cleanup();
}
static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
@@ -99,22 +119,19 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
ksft_print_msg("Starting MBA Schemata change ...\n");
- res = mount_resctrlfs();
- if (res) {
- ksft_exit_fail_msg("Failed to mount resctrl FS\n");
+ if (test_prepare())
return;
- }
if (!validate_resctrl_feature_request(MBA_STR) || (get_vendor() != ARCH_INTEL)) {
ksft_test_result_skip("Hardware does not support MBA or MBA is disabled\n");
- goto umount;
+ goto cleanup;
}
res = mba_schemata_change(cpu_no, benchmark_cmd);
ksft_test_result(!res, "MBA: schemata change\n");
-umount:
- umount_resctrlfs();
+cleanup:
+ test_cleanup();
}
static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
@@ -123,15 +140,12 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
ksft_print_msg("Starting CMT test ...\n");
- res = mount_resctrlfs();
- if (res) {
- ksft_exit_fail_msg("Failed to mount resctrl FS\n");
+ if (test_prepare())
return;
- }
if (!validate_resctrl_feature_request(CMT_STR)) {
ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n");
- goto umount;
+ goto cleanup;
}
res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
@@ -139,8 +153,8 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
if ((get_vendor() == ARCH_INTEL) && res)
ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
-umount:
- umount_resctrlfs();
+cleanup:
+ test_cleanup();
}
static void run_cat_test(int cpu_no, int no_of_bits)
@@ -149,22 +163,19 @@ static void run_cat_test(int cpu_no, int no_of_bits)
ksft_print_msg("Starting CAT test ...\n");
- res = mount_resctrlfs();
- if (res) {
- ksft_exit_fail_msg("Failed to mount resctrl FS\n");
+ if (test_prepare())
return;
- }
if (!validate_resctrl_feature_request(CAT_STR)) {
ksft_test_result_skip("Hardware does not support CAT or CAT is disabled\n");
- goto umount;
+ goto cleanup;
}
res = cat_perf_miss_val(cpu_no, no_of_bits, "L3");
ksft_test_result(!res, "CAT: test\n");
-umount:
- umount_resctrlfs();
+cleanup:
+ test_cleanup();
}
int main(int argc, char **argv)
diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
index 51963a6f2186..a9fe61133119 100644
--- a/tools/testing/selftests/resctrl/resctrl_val.c
+++ b/tools/testing/selftests/resctrl/resctrl_val.c
@@ -468,7 +468,9 @@ pid_t bm_pid, ppid;
void ctrlc_handler(int signum, siginfo_t *info, void *ptr)
{
- kill(bm_pid, SIGKILL);
+ /* Only kill child after bm_pid is set after fork() */
+ if (bm_pid)
+ kill(bm_pid, SIGKILL);
umount_resctrlfs();
tests_cleanup();
ksft_print_msg("Ending\n\n");
@@ -485,6 +487,8 @@ int signal_handler_register(void)
struct sigaction sigact;
int ret = 0;
+ bm_pid = 0;
+
sigact.sa_sigaction = ctrlc_handler;
sigemptyset(&sigact.sa_mask);
sigact.sa_flags = SA_SIGINFO;
@@ -706,10 +710,6 @@ int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *par
ksft_print_msg("Benchmark PID: %d\n", bm_pid);
- ret = signal_handler_register();
- if (ret)
- goto out;
-
/*
* The cast removes constness but nothing mutates benchmark_cmd within
* the context of this process. At the receiving process, it becomes
@@ -721,19 +721,19 @@ int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *par
/* Taskset benchmark to specified cpu */
ret = taskset_benchmark(bm_pid, param->cpu_no);
if (ret)
- goto unregister;
+ goto out;
/* Write benchmark to specified control&monitoring grp in resctrl FS */
ret = write_bm_pid_to_resctrl(bm_pid, param->ctrlgrp, param->mongrp,
resctrl_val);
if (ret)
- goto unregister;
+ goto out;
if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR)) ||
!strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR))) {
ret = initialize_mem_bw_imc();
if (ret)
- goto unregister;
+ goto out;
initialize_mem_bw_resctrl(param->ctrlgrp, param->mongrp,
param->cpu_no, resctrl_val);
@@ -748,7 +748,7 @@ int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *par
sizeof(pipe_message)) {
perror("# failed reading message from child process");
close(pipefd[0]);
- goto unregister;
+ goto out;
}
}
close(pipefd[0]);
@@ -757,7 +757,7 @@ int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *par
if (sigqueue(bm_pid, SIGUSR1, value) == -1) {
perror("# sigqueue SIGUSR1 to child");
ret = errno;
- goto unregister;
+ goto out;
}
/* Give benchmark enough time to fully run */
@@ -786,8 +786,6 @@ int resctrl_val(const char * const *benchmark_cmd, struct resctrl_val_param *par
}
}
-unregister:
- signal_handler_unregister();
out:
kill(bm_pid, SIGKILL);
--
2.30.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 2/6] selftests/resctrl: Remove duplicate feature check from CMT test
2023-09-15 15:44 [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
2023-09-15 15:44 ` [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal Ilpo Järvinen
@ 2023-09-15 15:44 ` Ilpo Järvinen
2023-09-26 21:39 ` Reinette Chatre
2023-09-15 15:44 ` [PATCH v2 3/6] selftests/resctrl: Move _GNU_SOURCE define into Makefile Ilpo Järvinen
` (5 subsequent siblings)
7 siblings, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2023-09-15 15:44 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable, Ilpo Järvinen
The test runner run_cmt_test() in resctrl_tests.c checks for CMT
feature and does not run cmt_resctrl_val() if CMT is not supported.
Then cmt_resctrl_val() also check is CMT is supported.
Remove the duplicated feature check for CMT from cmt_resctrl_val().
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: <stable@vger.kernel.org>
---
tools/testing/selftests/resctrl/cmt_test.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/tools/testing/selftests/resctrl/cmt_test.c b/tools/testing/selftests/resctrl/cmt_test.c
index cf2f5e92dea6..50bdbce9fba9 100644
--- a/tools/testing/selftests/resctrl/cmt_test.c
+++ b/tools/testing/selftests/resctrl/cmt_test.c
@@ -80,9 +80,6 @@ int cmt_resctrl_val(int cpu_no, int n, const char * const *benchmark_cmd)
size_t span;
int ret, i;
- if (!validate_resctrl_feature_request(CMT_STR))
- return -1;
-
ret = get_cbm_mask("L3", cbm_mask);
if (ret)
return ret;
--
2.30.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 3/6] selftests/resctrl: Move _GNU_SOURCE define into Makefile
2023-09-15 15:44 [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
2023-09-15 15:44 ` [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal Ilpo Järvinen
2023-09-15 15:44 ` [PATCH v2 2/6] selftests/resctrl: Remove duplicate feature check from CMT test Ilpo Järvinen
@ 2023-09-15 15:44 ` Ilpo Järvinen
2023-09-26 21:39 ` Reinette Chatre
2023-09-15 15:44 ` [PATCH v2 4/6] selftests/resctrl: Refactor feature check to use resource and feature name Ilpo Järvinen
` (4 subsequent siblings)
7 siblings, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2023-09-15 15:44 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable, Ilpo Järvinen
Currently, _GNU_SOURCE is defined in resctrl.h. Defining _GNU_SOURCE
has a large impact on what gets defined when including headers either
before or after it. This can result in compile failures if .c file
decides to include a standard header file before resctrl.h.
It is safer to define _GNU_SOURCE in Makefile so it is always defined
regardless of in which order includes are done.
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: <stable@vger.kernel.org>
---
tools/testing/selftests/resctrl/Makefile | 2 +-
tools/testing/selftests/resctrl/resctrl.h | 1 -
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/tools/testing/selftests/resctrl/Makefile b/tools/testing/selftests/resctrl/Makefile
index 5073dbc96125..2deac2031de9 100644
--- a/tools/testing/selftests/resctrl/Makefile
+++ b/tools/testing/selftests/resctrl/Makefile
@@ -1,6 +1,6 @@
# SPDX-License-Identifier: GPL-2.0
-CFLAGS = -g -Wall -O2 -D_FORTIFY_SOURCE=2
+CFLAGS = -g -Wall -O2 -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE
CFLAGS += $(KHDR_INCLUDES)
TEST_GEN_PROGS := resctrl_tests
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index dd07463cdf48..d9b5df95849d 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -1,5 +1,4 @@
/* SPDX-License-Identifier: GPL-2.0 */
-#define _GNU_SOURCE
#ifndef RESCTRL_H
#define RESCTRL_H
#include <stdio.h>
--
2.30.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 4/6] selftests/resctrl: Refactor feature check to use resource and feature name
2023-09-15 15:44 [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
` (2 preceding siblings ...)
2023-09-15 15:44 ` [PATCH v2 3/6] selftests/resctrl: Move _GNU_SOURCE define into Makefile Ilpo Järvinen
@ 2023-09-15 15:44 ` Ilpo Järvinen
2023-09-26 21:40 ` Reinette Chatre
2023-09-15 15:44 ` [PATCH v2 5/6] selftests/resctrl: Fix feature checks Ilpo Järvinen
` (3 subsequent siblings)
7 siblings, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2023-09-15 15:44 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable, Ilpo Järvinen
Feature check in validate_resctrl_feature_request() takes in the test
name string and maps that to what to check per test.
Pass resource and feature names to validate_resctrl_feature_request()
directly rather than deriving them from the test name inside the
function which makes the feature check easier to extend for new test
cases.
Use !! in the return statement to make the boolean conversion more
obvious even if it is not strictly necessary from correctness point of
view (to avoid it looking like the function is returning a freed
pointer).
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # selftests/resctrl: Remove duplicate feature check from CMT test
Cc: <stable@vger.kernel.org> # selftests/resctrl: Move _GNU_SOURCE define into Makefile
---
tools/testing/selftests/resctrl/resctrl.h | 6 +-
.../testing/selftests/resctrl/resctrl_tests.c | 10 +--
tools/testing/selftests/resctrl/resctrlfs.c | 69 ++++++++-----------
3 files changed, 34 insertions(+), 51 deletions(-)
diff --git a/tools/testing/selftests/resctrl/resctrl.h b/tools/testing/selftests/resctrl/resctrl.h
index d9b5df95849d..8578a8b4e145 100644
--- a/tools/testing/selftests/resctrl/resctrl.h
+++ b/tools/testing/selftests/resctrl/resctrl.h
@@ -27,10 +27,6 @@
#define RESCTRL_PATH "/sys/fs/resctrl"
#define PHYS_ID_PATH "/sys/devices/system/cpu/cpu"
#define INFO_PATH "/sys/fs/resctrl/info"
-#define L3_PATH "/sys/fs/resctrl/info/L3"
-#define MB_PATH "/sys/fs/resctrl/info/MB"
-#define L3_MON_PATH "/sys/fs/resctrl/info/L3_MON"
-#define L3_MON_FEATURES_PATH "/sys/fs/resctrl/info/L3_MON/mon_features"
#define ARCH_INTEL 1
#define ARCH_AMD 2
@@ -87,7 +83,7 @@ int get_resource_id(int cpu_no, int *resource_id);
int mount_resctrlfs(void);
int umount_resctrlfs(void);
int validate_bw_report_request(char *bw_report);
-bool validate_resctrl_feature_request(const char *resctrl_val);
+bool validate_resctrl_feature_request(const char *resource, const char *feature);
char *fgrep(FILE *inf, const char *str);
int taskset_benchmark(pid_t bm_pid, int cpu_no);
void run_benchmark(int signum, siginfo_t *info, void *ucontext);
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index 524ba83d7568..b13aee5dffb9 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -99,7 +99,9 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
if (test_prepare())
return;
- if (!validate_resctrl_feature_request(MBM_STR) || (get_vendor() != ARCH_INTEL)) {
+ if (!validate_resctrl_feature_request("L3_MON", "mbm_total_bytes") ||
+ !validate_resctrl_feature_request("L3_MON", "mbm_local_bytes") ||
+ (get_vendor() != ARCH_INTEL)) {
ksft_test_result_skip("Hardware does not support MBM or MBM is disabled\n");
goto cleanup;
}
@@ -122,7 +124,7 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
if (test_prepare())
return;
- if (!validate_resctrl_feature_request(MBA_STR) || (get_vendor() != ARCH_INTEL)) {
+ if (!validate_resctrl_feature_request("MB", NULL) || (get_vendor() != ARCH_INTEL)) {
ksft_test_result_skip("Hardware does not support MBA or MBA is disabled\n");
goto cleanup;
}
@@ -143,7 +145,7 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
if (test_prepare())
return;
- if (!validate_resctrl_feature_request(CMT_STR)) {
+ if (!validate_resctrl_feature_request("L3_MON", "llc_occupancy")) {
ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n");
goto cleanup;
}
@@ -166,7 +168,7 @@ static void run_cat_test(int cpu_no, int no_of_bits)
if (test_prepare())
return;
- if (!validate_resctrl_feature_request(CAT_STR)) {
+ if (!validate_resctrl_feature_request("L3", NULL)) {
ksft_test_result_skip("Hardware does not support CAT or CAT is disabled\n");
goto cleanup;
}
diff --git a/tools/testing/selftests/resctrl/resctrlfs.c b/tools/testing/selftests/resctrl/resctrlfs.c
index bd36ee206602..3a8111362d26 100644
--- a/tools/testing/selftests/resctrl/resctrlfs.c
+++ b/tools/testing/selftests/resctrl/resctrlfs.c
@@ -8,6 +8,8 @@
* Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>,
* Fenghua Yu <fenghua.yu@intel.com>
*/
+#include <limits.h>
+
#include "resctrl.h"
static int find_resctrl_mount(char *buffer)
@@ -604,63 +606,46 @@ char *fgrep(FILE *inf, const char *str)
/*
* validate_resctrl_feature_request - Check if requested feature is valid.
- * @resctrl_val: Requested feature
+ * @resource: Required resource (e.g., MB, L3, L2, L3_MON, etc.)
+ * @feature: Required monitor feature (in mon_features file). Can only be
+ * set for L3_MON. Must be NULL for all other resources.
*
- * Return: True if the feature is supported, else false. False is also
- * returned if resctrl FS is not mounted.
+ * Return: True if the resource/feature is supported, else false. False is
+ * also returned if resctrl FS is not mounted.
*/
-bool validate_resctrl_feature_request(const char *resctrl_val)
+bool validate_resctrl_feature_request(const char *resource, const char *feature)
{
+ char res_path[PATH_MAX];
struct stat statbuf;
- bool found = false;
char *res;
FILE *inf;
int ret;
- if (!resctrl_val)
+ if (!resource)
return false;
ret = find_resctrl_mount(NULL);
if (ret)
return false;
- if (!strncmp(resctrl_val, CAT_STR, sizeof(CAT_STR))) {
- if (!stat(L3_PATH, &statbuf))
- return true;
- } else if (!strncmp(resctrl_val, MBA_STR, sizeof(MBA_STR))) {
- if (!stat(MB_PATH, &statbuf))
- return true;
- } else if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR)) ||
- !strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) {
- if (!stat(L3_MON_PATH, &statbuf)) {
- inf = fopen(L3_MON_FEATURES_PATH, "r");
- if (!inf)
- return false;
-
- if (!strncmp(resctrl_val, CMT_STR, sizeof(CMT_STR))) {
- res = fgrep(inf, "llc_occupancy");
- if (res) {
- found = true;
- free(res);
- }
- }
-
- if (!strncmp(resctrl_val, MBM_STR, sizeof(MBM_STR))) {
- res = fgrep(inf, "mbm_total_bytes");
- if (res) {
- free(res);
- res = fgrep(inf, "mbm_local_bytes");
- if (res) {
- found = true;
- free(res);
- }
- }
- }
- fclose(inf);
- }
- }
+ snprintf(res_path, sizeof(res_path), "%s/%s", INFO_PATH, resource);
+
+ if (stat(res_path, &statbuf))
+ return false;
+
+ if (!feature)
+ return true;
+
+ snprintf(res_path, sizeof(res_path), "%s/%s/mon_features", INFO_PATH, resource);
+ inf = fopen(res_path, "r");
+ if (!inf)
+ return false;
+
+ res = fgrep(inf, feature);
+ free(res);
+ fclose(inf);
- return found;
+ return !!res;
}
int filter_dmesg(void)
--
2.30.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 5/6] selftests/resctrl: Fix feature checks
2023-09-15 15:44 [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
` (3 preceding siblings ...)
2023-09-15 15:44 ` [PATCH v2 4/6] selftests/resctrl: Refactor feature check to use resource and feature name Ilpo Järvinen
@ 2023-09-15 15:44 ` Ilpo Järvinen
2023-09-26 21:41 ` Reinette Chatre
2023-09-15 15:44 ` [PATCH v2 6/6] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests Ilpo Järvinen
` (2 subsequent siblings)
7 siblings, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2023-09-15 15:44 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable, Ilpo Järvinen
The MBA and CMT tests expect support of other features to be able to
run.
When platform only supports MBA but not MBM, MBA test will fail with:
Failed to open total bw file: No such file or directory
When platform only supports CMT but not CAT, CMT test will fail with:
Failed to open bit mask file '/sys/fs/resctrl/info/L3/cbm_mask': No such file or directory
Extend feature checks to cover these two conditions.
Fixes: ee0415681eb6 ("selftests/resctrl: Use resctrl/info for feature detection")
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: <stable@vger.kernel.org> # selftests/resctrl: Refactor feature check to use resource and feature name
---
tools/testing/selftests/resctrl/resctrl_tests.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
index b13aee5dffb9..836dfe5c0b4d 100644
--- a/tools/testing/selftests/resctrl/resctrl_tests.c
+++ b/tools/testing/selftests/resctrl/resctrl_tests.c
@@ -124,7 +124,9 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
if (test_prepare())
return;
- if (!validate_resctrl_feature_request("MB", NULL) || (get_vendor() != ARCH_INTEL)) {
+ if (!validate_resctrl_feature_request("MB", NULL) ||
+ !validate_resctrl_feature_request("L3_MON", "mbm_local_bytes") ||
+ (get_vendor() != ARCH_INTEL)) {
ksft_test_result_skip("Hardware does not support MBA or MBA is disabled\n");
goto cleanup;
}
@@ -145,7 +147,8 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
if (test_prepare())
return;
- if (!validate_resctrl_feature_request("L3_MON", "llc_occupancy")) {
+ if (!validate_resctrl_feature_request("L3_MON", "llc_occupancy") ||
+ !validate_resctrl_feature_request("L3", NULL)) {
ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n");
goto cleanup;
}
--
2.30.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH v2 6/6] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
2023-09-15 15:44 [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
` (4 preceding siblings ...)
2023-09-15 15:44 ` [PATCH v2 5/6] selftests/resctrl: Fix feature checks Ilpo Järvinen
@ 2023-09-15 15:44 ` Ilpo Järvinen
2023-09-26 21:41 ` Reinette Chatre
2023-09-20 10:13 ` [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests Maciej Wieczór-Retman
2023-09-28 8:18 ` Shaopeng Tan (Fujitsu)
7 siblings, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2023-09-15 15:44 UTC (permalink / raw)
To: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable, Ilpo Järvinen
The initial value of 5% chosen for the maximum allowed percentage
difference between resctrl mbm value and IMC mbm value in commit
06bd03a57f8c ("selftests/resctrl: Fix MBA/MBM results reporting
format") was "randomly chosen value" (as admitted by the changelog).
When running tests in our lab across a large number platforms, 5%
difference upper bound for success seems a bit on the low side for the
MBA and MBM tests. Some platforms produce outliers that are slightly
above that, typically 6-7%, which leads MBA/MBM test frequently
failing.
Replace the "randomly chosen value" with a success bound that is based
on those measurements across large number of platforms by relaxing the
MBA/MBM success bound to 8%. The relaxed bound removes the failures due
the frequent outliers.
Fixes: 06bd03a57f8c ("selftests/resctrl: Fix MBA/MBM results reporting format")
Cc: <stable@vger.kernel.org>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
---
tools/testing/selftests/resctrl/mba_test.c | 2 +-
tools/testing/selftests/resctrl/mbm_test.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index cf8284dadcb2..d3bf4368341e 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -12,7 +12,7 @@
#define RESULT_FILE_NAME "result_mba"
#define NUM_OF_RUNS 5
-#define MAX_DIFF_PERCENT 5
+#define MAX_DIFF_PERCENT 8
#define ALLOCATION_MAX 100
#define ALLOCATION_MIN 10
#define ALLOCATION_STEP 10
diff --git a/tools/testing/selftests/resctrl/mbm_test.c b/tools/testing/selftests/resctrl/mbm_test.c
index 1ae131a2e246..d3c0d30c676a 100644
--- a/tools/testing/selftests/resctrl/mbm_test.c
+++ b/tools/testing/selftests/resctrl/mbm_test.c
@@ -11,7 +11,7 @@
#include "resctrl.h"
#define RESULT_FILE_NAME "result_mbm"
-#define MAX_DIFF_PERCENT 5
+#define MAX_DIFF_PERCENT 8
#define NUM_OF_RUNS 5
static int
--
2.30.2
^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests
2023-09-15 15:44 [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
` (5 preceding siblings ...)
2023-09-15 15:44 ` [PATCH v2 6/6] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests Ilpo Järvinen
@ 2023-09-20 10:13 ` Maciej Wieczór-Retman
2023-09-28 8:18 ` Shaopeng Tan (Fujitsu)
7 siblings, 0 replies; 20+ messages in thread
From: Maciej Wieczór-Retman @ 2023-09-20 10:13 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Reinette Chatre, Shuah Khan, Shuah Khan, linux-kselftest, LKML,
Shaopeng Tan, stable
Hi, looks good to me. Also ran it without problems on Intel(R) Xeon(R)
Platinum 8360Y.
Reviewed-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Tested-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
--
Kind regards
Maciej Wieczór-Retman
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-15 15:44 ` [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal Ilpo Järvinen
@ 2023-09-26 21:38 ` Reinette Chatre
2023-09-28 8:10 ` Shaopeng Tan (Fujitsu)
2023-09-28 12:47 ` Ilpo Järvinen
0 siblings, 2 replies; 20+ messages in thread
From: Reinette Chatre @ 2023-09-26 21:38 UTC (permalink / raw)
To: Ilpo Järvinen, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/15/2023 8:44 AM, Ilpo Järvinen wrote:
> Unmounting resctrl FS has been moved into the per test functions in
> resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
> resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
> SIGTERM, or SIGHUP) is received, the running selftest is aborted by
> ctrlc_handler() which then unmounts resctrl fs before exiting. The
> current section between signal_handler_register() and
> signal_handler_unregister(), however, does not cover the entire
> duration when resctrl FS is mounted.
>
> Move signal_handler_register() and signal_handler_unregister() calls
> from per test files into resctrl_tests.c to properly unmount resctrl
> fs. In order to not add signal_handler_register()/unregister() n times,
> create helpers test_prepare() and test_cleanup().
>
> Adjust child process kill() call in ctrlc_handler() to only be invoked
> if the child was already forked.
>
> Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Cc: <stable@vger.kernel.org>
> ---
> tools/testing/selftests/resctrl/cat_test.c | 8 ---
> .../testing/selftests/resctrl/resctrl_tests.c | 65 +++++++++++--------
> tools/testing/selftests/resctrl/resctrl_val.c | 22 +++----
> 3 files changed, 48 insertions(+), 47 deletions(-)
>
> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> index 97b87285ab2a..224ba8544d8a 100644
> --- a/tools/testing/selftests/resctrl/cat_test.c
> +++ b/tools/testing/selftests/resctrl/cat_test.c
> @@ -167,12 +167,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> strcpy(param.filename, RESULT_FILE_NAME1);
> param.num_of_runs = 0;
> param.cpu_no = sibling_cpu_no;
> - } else {
> - ret = signal_handler_register();
> - if (ret) {
> - kill(bm_pid, SIGKILL);
> - goto out;
> - }
> }
>
> remove(param.filename);
> @@ -209,10 +203,8 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> }
> close(pipefd[0]);
> kill(bm_pid, SIGKILL);
> - signal_handler_unregister();
> }
>
> -out:
> cat_test_cleanup();
>
> return ret;
> diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
> index 823672a20a43..524ba83d7568 100644
> --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> @@ -67,21 +67,41 @@ void tests_cleanup(void)
> cat_test_cleanup();
> }
>
> -static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> +static int test_prepare()
> {
> int res;
>
> - ksft_print_msg("Starting MBM BW change ...\n");
> + res = signal_handler_register();
> + if (res)
> + return res;
>
> res = mount_resctrlfs();
> if (res) {
> + signal_handler_unregister();
> ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> - return;
> + return res;
> }
> + return 0;
> +}
> +
> +static void test_cleanup()
> +{
> + umount_resctrlfs();
> + signal_handler_unregister();
> +}
Thank you for adding these.
> +
> +static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> +{
> + int res;
> +
> + ksft_print_msg("Starting MBM BW change ...\n");
> +
> + if (test_prepare())
> + return;
>
I am not sure about this. With this exit the kselftest machinery is not
aware of the test passing or failing. I wonder if there should not rather
be a "goto" here that triggers ksft_test_result()? This needs some more
thought though. First, with this change test_prepare() officially gains
responsibility to determine if a failure is transient (just a single test
fails) or permanent (no use trying any other tests if this fails). For
the former it would then be up to the caller to call ksft_test_result()
and for the latter test_prepare() will call ksft_exit_fail_msg().
Second, that SNC warning may be an inconvenience with a new goto. Here
it may be ok to print that message before the test failure?
> if (!validate_resctrl_feature_request(MBM_STR) || (get_vendor() != ARCH_INTEL)) {
> ksft_test_result_skip("Hardware does not support MBM or MBM is disabled\n");
> - goto umount;
> + goto cleanup;
> }
>
> res = mbm_bw_change(cpu_no, benchmark_cmd);
> @@ -89,8 +109,8 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> if ((get_vendor() == ARCH_INTEL) && res)
> ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
>
> -umount:
> - umount_resctrlfs();
> +cleanup:
> + test_cleanup();
> }
>
> static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> @@ -99,22 +119,19 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
>
> ksft_print_msg("Starting MBA Schemata change ...\n");
>
> - res = mount_resctrlfs();
> - if (res) {
> - ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> + if (test_prepare())
> return;
> - }
>
> if (!validate_resctrl_feature_request(MBA_STR) || (get_vendor() != ARCH_INTEL)) {
> ksft_test_result_skip("Hardware does not support MBA or MBA is disabled\n");
> - goto umount;
> + goto cleanup;
> }
>
> res = mba_schemata_change(cpu_no, benchmark_cmd);
> ksft_test_result(!res, "MBA: schemata change\n");
>
> -umount:
> - umount_resctrlfs();
> +cleanup:
> + test_cleanup();
> }
>
> static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
> @@ -123,15 +140,12 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
>
> ksft_print_msg("Starting CMT test ...\n");
>
> - res = mount_resctrlfs();
> - if (res) {
> - ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> + if (test_prepare())
> return;
> - }
>
> if (!validate_resctrl_feature_request(CMT_STR)) {
> ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n");
> - goto umount;
> + goto cleanup;
> }
>
> res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
> @@ -139,8 +153,8 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
> if ((get_vendor() == ARCH_INTEL) && res)
> ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
>
> -umount:
> - umount_resctrlfs();
> +cleanup:
> + test_cleanup();
> }
>
> static void run_cat_test(int cpu_no, int no_of_bits)
> @@ -149,22 +163,19 @@ static void run_cat_test(int cpu_no, int no_of_bits)
>
> ksft_print_msg("Starting CAT test ...\n");
>
> - res = mount_resctrlfs();
> - if (res) {
> - ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> + if (test_prepare())
> return;
> - }
>
> if (!validate_resctrl_feature_request(CAT_STR)) {
> ksft_test_result_skip("Hardware does not support CAT or CAT is disabled\n");
> - goto umount;
> + goto cleanup;
> }
>
> res = cat_perf_miss_val(cpu_no, no_of_bits, "L3");
> ksft_test_result(!res, "CAT: test\n");
>
> -umount:
> - umount_resctrlfs();
> +cleanup:
> + test_cleanup();
> }
>
> int main(int argc, char **argv)
> diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
> index 51963a6f2186..a9fe61133119 100644
> --- a/tools/testing/selftests/resctrl/resctrl_val.c
> +++ b/tools/testing/selftests/resctrl/resctrl_val.c
> @@ -468,7 +468,9 @@ pid_t bm_pid, ppid;
>
> void ctrlc_handler(int signum, siginfo_t *info, void *ptr)
> {
> - kill(bm_pid, SIGKILL);
> + /* Only kill child after bm_pid is set after fork() */
> + if (bm_pid)
> + kill(bm_pid, SIGKILL);
> umount_resctrlfs();
> tests_cleanup();
> ksft_print_msg("Ending\n\n");
> @@ -485,6 +487,8 @@ int signal_handler_register(void)
> struct sigaction sigact;
> int ret = 0;
>
> + bm_pid = 0;
> +
Since this is an initialization fix in this area ... what
do you think of also initializing sigact? It could just be
a change to
struct sigaction sigact = {};
This will prevent registering a signal handler with
uninitialized sa_flags.
Reinette
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 2/6] selftests/resctrl: Remove duplicate feature check from CMT test
2023-09-15 15:44 ` [PATCH v2 2/6] selftests/resctrl: Remove duplicate feature check from CMT test Ilpo Järvinen
@ 2023-09-26 21:39 ` Reinette Chatre
0 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2023-09-26 21:39 UTC (permalink / raw)
To: Ilpo Järvinen, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/15/2023 8:44 AM, Ilpo Järvinen wrote:
> The test runner run_cmt_test() in resctrl_tests.c checks for CMT
> feature and does not run cmt_resctrl_val() if CMT is not supported.
> Then cmt_resctrl_val() also check is CMT is supported.
>
> Remove the duplicated feature check for CMT from cmt_resctrl_val().
>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Cc: <stable@vger.kernel.org>
Thank you.
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 3/6] selftests/resctrl: Move _GNU_SOURCE define into Makefile
2023-09-15 15:44 ` [PATCH v2 3/6] selftests/resctrl: Move _GNU_SOURCE define into Makefile Ilpo Järvinen
@ 2023-09-26 21:39 ` Reinette Chatre
0 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2023-09-26 21:39 UTC (permalink / raw)
To: Ilpo Järvinen, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/15/2023 8:44 AM, Ilpo Järvinen wrote:
> Currently, _GNU_SOURCE is defined in resctrl.h. Defining _GNU_SOURCE
nitpick: If you do submit a new version, could you please drop the
"Currently".
> has a large impact on what gets defined when including headers either
> before or after it. This can result in compile failures if .c file
> decides to include a standard header file before resctrl.h.
>
> It is safer to define _GNU_SOURCE in Makefile so it is always defined
> regardless of in which order includes are done.
>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Cc: <stable@vger.kernel.org>
Thank you.
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 4/6] selftests/resctrl: Refactor feature check to use resource and feature name
2023-09-15 15:44 ` [PATCH v2 4/6] selftests/resctrl: Refactor feature check to use resource and feature name Ilpo Järvinen
@ 2023-09-26 21:40 ` Reinette Chatre
0 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2023-09-26 21:40 UTC (permalink / raw)
To: Ilpo Järvinen, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/15/2023 8:44 AM, Ilpo Järvinen wrote:
> Feature check in validate_resctrl_feature_request() takes in the test
> name string and maps that to what to check per test.
>
> Pass resource and feature names to validate_resctrl_feature_request()
> directly rather than deriving them from the test name inside the
> function which makes the feature check easier to extend for new test
> cases.
>
> Use !! in the return statement to make the boolean conversion more
> obvious even if it is not strictly necessary from correctness point of
> view (to avoid it looking like the function is returning a freed
> pointer).
>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Cc: <stable@vger.kernel.org> # selftests/resctrl: Remove duplicate feature check from CMT test
> Cc: <stable@vger.kernel.org> # selftests/resctrl: Move _GNU_SOURCE define into Makefile
Thank you.
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 5/6] selftests/resctrl: Fix feature checks
2023-09-15 15:44 ` [PATCH v2 5/6] selftests/resctrl: Fix feature checks Ilpo Järvinen
@ 2023-09-26 21:41 ` Reinette Chatre
0 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2023-09-26 21:41 UTC (permalink / raw)
To: Ilpo Järvinen, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/15/2023 8:44 AM, Ilpo Järvinen wrote:
> The MBA and CMT tests expect support of other features to be able to
> run.
>
> When platform only supports MBA but not MBM, MBA test will fail with:
> Failed to open total bw file: No such file or directory
>
> When platform only supports CMT but not CAT, CMT test will fail with:
> Failed to open bit mask file '/sys/fs/resctrl/info/L3/cbm_mask': No such file or directory
>
> Extend feature checks to cover these two conditions.
It may be helpful to expand on this more to justify why this is
stable material. At this point it sounds as though (from user space
perspective) one error is replaced by another. Perhaps just a snippet
that states that this fix will cause the test to be skipped on these
platforms.
>
> Fixes: ee0415681eb6 ("selftests/resctrl: Use resctrl/info for feature detection")
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Cc: <stable@vger.kernel.org> # selftests/resctrl: Refactor feature check to use resource and feature name
Thank you.
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 6/6] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
2023-09-15 15:44 ` [PATCH v2 6/6] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests Ilpo Järvinen
@ 2023-09-26 21:41 ` Reinette Chatre
0 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2023-09-26 21:41 UTC (permalink / raw)
To: Ilpo Järvinen, Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman
Cc: LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/15/2023 8:44 AM, Ilpo Järvinen wrote:
> The initial value of 5% chosen for the maximum allowed percentage
> difference between resctrl mbm value and IMC mbm value in commit
> 06bd03a57f8c ("selftests/resctrl: Fix MBA/MBM results reporting
> format") was "randomly chosen value" (as admitted by the changelog).
>
> When running tests in our lab across a large number platforms, 5%
> difference upper bound for success seems a bit on the low side for the
> MBA and MBM tests. Some platforms produce outliers that are slightly
> above that, typically 6-7%, which leads MBA/MBM test frequently
> failing.
>
> Replace the "randomly chosen value" with a success bound that is based
> on those measurements across large number of platforms by relaxing the
> MBA/MBM success bound to 8%. The relaxed bound removes the failures due
> the frequent outliers.
>
> Fixes: 06bd03a57f8c ("selftests/resctrl: Fix MBA/MBM results reporting format")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Thank you.
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reinette
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-26 21:38 ` Reinette Chatre
@ 2023-09-28 8:10 ` Shaopeng Tan (Fujitsu)
2023-10-04 17:43 ` Reinette Chatre
2023-09-28 12:47 ` Ilpo Järvinen
1 sibling, 1 reply; 20+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2023-09-28 8:10 UTC (permalink / raw)
To: 'Reinette Chatre', Ilpo Järvinen, Shuah Khan,
Shuah Khan, linux-kselftest@vger.kernel.org,
Maciej Wieczór-Retman
Cc: LKML, stable@vger.kernel.org
Hello
> On 9/15/2023 8:44 AM, Ilpo Järvinen wrote:
> > Unmounting resctrl FS has been moved into the per test functions in
> > resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
> > resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
> > SIGTERM, or SIGHUP) is received, the running selftest is aborted by
> > ctrlc_handler() which then unmounts resctrl fs before exiting. The
> > current section between signal_handler_register() and
> > signal_handler_unregister(), however, does not cover the entire
> > duration when resctrl FS is mounted.
> >
> > Move signal_handler_register() and signal_handler_unregister() calls
> > from per test files into resctrl_tests.c to properly unmount resctrl
> > fs. In order to not add signal_handler_register()/unregister() n
> > times, create helpers test_prepare() and test_cleanup().
> >
> > Adjust child process kill() call in ctrlc_handler() to only be invoked
> > if the child was already forked.
> >
> > Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount
> > to higher level")
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > Cc: <stable@vger.kernel.org>
> > ---
> > tools/testing/selftests/resctrl/cat_test.c | 8 ---
> > .../testing/selftests/resctrl/resctrl_tests.c | 65
> > +++++++++++-------- tools/testing/selftests/resctrl/resctrl_val.c |
> > 22 +++----
> > 3 files changed, 48 insertions(+), 47 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/cat_test.c
> > b/tools/testing/selftests/resctrl/cat_test.c
> > index 97b87285ab2a..224ba8544d8a 100644
> > --- a/tools/testing/selftests/resctrl/cat_test.c
> > +++ b/tools/testing/selftests/resctrl/cat_test.c
> > @@ -167,12 +167,6 @@ int cat_perf_miss_val(int cpu_no, int n, char
> *cache_type)
> > strcpy(param.filename, RESULT_FILE_NAME1);
> > param.num_of_runs = 0;
> > param.cpu_no = sibling_cpu_no;
> > - } else {
> > - ret = signal_handler_register();
> > - if (ret) {
> > - kill(bm_pid, SIGKILL);
> > - goto out;
> > - }
> > }
> >
> > remove(param.filename);
> > @@ -209,10 +203,8 @@ int cat_perf_miss_val(int cpu_no, int n, char
> *cache_type)
> > }
> > close(pipefd[0]);
> > kill(bm_pid, SIGKILL);
> > - signal_handler_unregister();
> > }
> >
> > -out:
> > cat_test_cleanup();
> >
> > return ret;
> > diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c
> > b/tools/testing/selftests/resctrl/resctrl_tests.c
> > index 823672a20a43..524ba83d7568 100644
> > --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> > +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> > @@ -67,21 +67,41 @@ void tests_cleanup(void)
> > cat_test_cleanup();
> > }
> >
> > -static void run_mbm_test(const char * const *benchmark_cmd, int
> > cpu_no)
> > +static int test_prepare()
> > {
> > int res;
> >
> > - ksft_print_msg("Starting MBM BW change ...\n");
> > + res = signal_handler_register();
> > + if (res)
> > + return res;
> >
> > res = mount_resctrlfs();
> > if (res) {
> > + signal_handler_unregister();
> > ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> > - return;
> > + return res;
> > }
> > + return 0;
> > +}
> > +
> > +static void test_cleanup()
> > +{
> > + umount_resctrlfs();
> > + signal_handler_unregister();
> > +}
>
> Thank you for adding these.
>
> > +
> > +static void run_mbm_test(const char * const *benchmark_cmd, int
> > +cpu_no) {
> > + int res;
> > +
> > + ksft_print_msg("Starting MBM BW change ...\n");
> > +
> > + if (test_prepare())
> > + return;
> >
>
> I am not sure about this. With this exit the kselftest machinery is not aware of
> the test passing or failing. I wonder if there should not rather be a "goto" here
> that triggers ksft_test_result()? This needs some more thought though. First,
> with this change test_prepare() officially gains responsibility to determine if a
> failure is transient (just a single test
> fails) or permanent (no use trying any other tests if this fails). For the former it
> would then be up to the caller to call ksft_test_result() and for the latter
> test_prepare() will call ksft_exit_fail_msg().
> Second, that SNC warning may be an inconvenience with a new goto. Here it
> may be ok to print that message before the test failure?
If a failure may be permanent, it may be best to detect it before running all tests, rather than in test_prepare().
Now some detections are completed before running all tests. For example:
273 if (geteuid() != 0)
274 return ksft_exit_skip("Not running as root. Skipping...\n");
275
276 if (!check_resctrlfs_support())
277 return ksft_exit_skip("resctrl FS does not exist. Enable X86_CPU_RESCTRL config option.\n");
278
279 if (umount_resctrlfs())
280 return ksft_exit_skip("resctrl FS unmount failed.\n");
Best regards,
Shaopeng TAN
> > if (!validate_resctrl_feature_request(MBM_STR) || (get_vendor() !=
> ARCH_INTEL)) {
> > ksft_test_result_skip("Hardware does not support MBM or
> MBM is disabled\n");
> > - goto umount;
> > + goto cleanup;
> > }
> >
> > res = mbm_bw_change(cpu_no, benchmark_cmd); @@ -89,8 +109,8
> @@
> > static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> > if ((get_vendor() == ARCH_INTEL) && res)
> > ksft_print_msg("Intel MBM may be inaccurate when
> Sub-NUMA
> > Clustering is enabled. Check BIOS configuration.\n");
> >
> > -umount:
> > - umount_resctrlfs();
> > +cleanup:
> > + test_cleanup();
> > }
> >
> > static void run_mba_test(const char * const *benchmark_cmd, int
> > cpu_no) @@ -99,22 +119,19 @@ static void run_mba_test(const char *
> > const *benchmark_cmd, int cpu_no)
> >
> > ksft_print_msg("Starting MBA Schemata change ...\n");
> >
> > - res = mount_resctrlfs();
> > - if (res) {
> > - ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> > + if (test_prepare())
> > return;
> > - }
> >
> > if (!validate_resctrl_feature_request(MBA_STR) || (get_vendor() !=
> ARCH_INTEL)) {
> > ksft_test_result_skip("Hardware does not support MBA or
> MBA is disabled\n");
> > - goto umount;
> > + goto cleanup;
> > }
> >
> > res = mba_schemata_change(cpu_no, benchmark_cmd);
> > ksft_test_result(!res, "MBA: schemata change\n");
> >
> > -umount:
> > - umount_resctrlfs();
> > +cleanup:
> > + test_cleanup();
> > }
> >
> > static void run_cmt_test(const char * const *benchmark_cmd, int
> > cpu_no) @@ -123,15 +140,12 @@ static void run_cmt_test(const char *
> > const *benchmark_cmd, int cpu_no)
> >
> > ksft_print_msg("Starting CMT test ...\n");
> >
> > - res = mount_resctrlfs();
> > - if (res) {
> > - ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> > + if (test_prepare())
> > return;
> > - }
> >
> > if (!validate_resctrl_feature_request(CMT_STR)) {
> > ksft_test_result_skip("Hardware does not support CMT or
> CMT is disabled\n");
> > - goto umount;
> > + goto cleanup;
> > }
> >
> > res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd); @@ -139,8 +153,8
> @@
> > static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
> > if ((get_vendor() == ARCH_INTEL) && res)
> > ksft_print_msg("Intel CMT may be inaccurate when
> Sub-NUMA
> > Clustering is enabled. Check BIOS configuration.\n");
> >
> > -umount:
> > - umount_resctrlfs();
> > +cleanup:
> > + test_cleanup();
> > }
> >
> > static void run_cat_test(int cpu_no, int no_of_bits) @@ -149,22
> > +163,19 @@ static void run_cat_test(int cpu_no, int no_of_bits)
> >
> > ksft_print_msg("Starting CAT test ...\n");
> >
> > - res = mount_resctrlfs();
> > - if (res) {
> > - ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> > + if (test_prepare())
> > return;
> > - }
> >
> > if (!validate_resctrl_feature_request(CAT_STR)) {
> > ksft_test_result_skip("Hardware does not support CAT or CAT
> is disabled\n");
> > - goto umount;
> > + goto cleanup;
> > }
> >
> > res = cat_perf_miss_val(cpu_no, no_of_bits, "L3");
> > ksft_test_result(!res, "CAT: test\n");
> >
> > -umount:
> > - umount_resctrlfs();
> > +cleanup:
> > + test_cleanup();
> > }
> >
> > int main(int argc, char **argv)
> > diff --git a/tools/testing/selftests/resctrl/resctrl_val.c
> > b/tools/testing/selftests/resctrl/resctrl_val.c
> > index 51963a6f2186..a9fe61133119 100644
> > --- a/tools/testing/selftests/resctrl/resctrl_val.c
> > +++ b/tools/testing/selftests/resctrl/resctrl_val.c
> > @@ -468,7 +468,9 @@ pid_t bm_pid, ppid;
> >
> > void ctrlc_handler(int signum, siginfo_t *info, void *ptr) {
> > - kill(bm_pid, SIGKILL);
> > + /* Only kill child after bm_pid is set after fork() */
> > + if (bm_pid)
> > + kill(bm_pid, SIGKILL);
> > umount_resctrlfs();
> > tests_cleanup();
> > ksft_print_msg("Ending\n\n");
> > @@ -485,6 +487,8 @@ int signal_handler_register(void)
> > struct sigaction sigact;
> > int ret = 0;
> >
> > + bm_pid = 0;
> > +
>
> Since this is an initialization fix in this area ... what do you think of also
> initializing sigact? It could just be a change to
> struct sigaction sigact = {};
>
> This will prevent registering a signal handler with uninitialized sa_flags.
>
> Reinette
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests
2023-09-15 15:44 [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
` (6 preceding siblings ...)
2023-09-20 10:13 ` [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests Maciej Wieczór-Retman
@ 2023-09-28 8:18 ` Shaopeng Tan (Fujitsu)
7 siblings, 0 replies; 20+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2023-09-28 8:18 UTC (permalink / raw)
To: 'Ilpo Järvinen', Reinette Chatre, Shuah Khan,
Shuah Khan, linux-kselftest@vger.kernel.org,
Maciej Wieczór-Retman
Cc: LKML, stable@vger.kernel.org
Hi Ilpo,
I run the following command on Intel(R) Xeon(R) Gold 6254 CPU.
- $ sudo make -C tools/testing/selftests/resctrl run_tests
- tools/testing/selftests/resctrl$ sudo ./resctrl_tests
There is no problem.
<Reviewed-by:tan.shaopeng@jp.fujitsu.com>
<Tested-by:tan.shaopeng@jp.fujitsu.com>
Best regards,
Shaopeng TAN
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-26 21:38 ` Reinette Chatre
2023-09-28 8:10 ` Shaopeng Tan (Fujitsu)
@ 2023-09-28 12:47 ` Ilpo Järvinen
2023-09-28 17:09 ` Reinette Chatre
1 sibling, 1 reply; 20+ messages in thread
From: Ilpo Järvinen @ 2023-09-28 12:47 UTC (permalink / raw)
To: Reinette Chatre
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
[-- Attachment #1: Type: text/plain, Size: 10078 bytes --]
On Tue, 26 Sep 2023, Reinette Chatre wrote:
> On 9/15/2023 8:44 AM, Ilpo Järvinen wrote:
> > Unmounting resctrl FS has been moved into the per test functions in
> > resctrl_tests.c by commit caddc0fbe495 ("selftests/resctrl: Move
> > resctrl FS mount/umount to higher level"). In case a signal (SIGINT,
> > SIGTERM, or SIGHUP) is received, the running selftest is aborted by
> > ctrlc_handler() which then unmounts resctrl fs before exiting. The
> > current section between signal_handler_register() and
> > signal_handler_unregister(), however, does not cover the entire
> > duration when resctrl FS is mounted.
> >
> > Move signal_handler_register() and signal_handler_unregister() calls
> > from per test files into resctrl_tests.c to properly unmount resctrl
> > fs. In order to not add signal_handler_register()/unregister() n times,
> > create helpers test_prepare() and test_cleanup().
> >
> > Adjust child process kill() call in ctrlc_handler() to only be invoked
> > if the child was already forked.
> >
> > Fixes: caddc0fbe495 ("selftests/resctrl: Move resctrl FS mount/umount to higher level")
> > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> > Cc: <stable@vger.kernel.org>
> > ---
> > tools/testing/selftests/resctrl/cat_test.c | 8 ---
> > .../testing/selftests/resctrl/resctrl_tests.c | 65 +++++++++++--------
> > tools/testing/selftests/resctrl/resctrl_val.c | 22 +++----
> > 3 files changed, 48 insertions(+), 47 deletions(-)
> >
> > diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
> > index 97b87285ab2a..224ba8544d8a 100644
> > --- a/tools/testing/selftests/resctrl/cat_test.c
> > +++ b/tools/testing/selftests/resctrl/cat_test.c
> > @@ -167,12 +167,6 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> > strcpy(param.filename, RESULT_FILE_NAME1);
> > param.num_of_runs = 0;
> > param.cpu_no = sibling_cpu_no;
> > - } else {
> > - ret = signal_handler_register();
> > - if (ret) {
> > - kill(bm_pid, SIGKILL);
> > - goto out;
> > - }
> > }
> >
> > remove(param.filename);
> > @@ -209,10 +203,8 @@ int cat_perf_miss_val(int cpu_no, int n, char *cache_type)
> > }
> > close(pipefd[0]);
> > kill(bm_pid, SIGKILL);
> > - signal_handler_unregister();
> > }
> >
> > -out:
> > cat_test_cleanup();
> >
> > return ret;
> > diff --git a/tools/testing/selftests/resctrl/resctrl_tests.c b/tools/testing/selftests/resctrl/resctrl_tests.c
> > index 823672a20a43..524ba83d7568 100644
> > --- a/tools/testing/selftests/resctrl/resctrl_tests.c
> > +++ b/tools/testing/selftests/resctrl/resctrl_tests.c
> > @@ -67,21 +67,41 @@ void tests_cleanup(void)
> > cat_test_cleanup();
> > }
> >
> > -static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> > +static int test_prepare()
> > {
> > int res;
> >
> > - ksft_print_msg("Starting MBM BW change ...\n");
> > + res = signal_handler_register();
> > + if (res)
> > + return res;
> > res = mount_resctrlfs();
> > if (res) {
> > + signal_handler_unregister();
> > ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> > - return;
> > + return res;
> > }
> > + return 0;
> > +}
> > +
> > +static void test_cleanup()
> > +{
> > + umount_resctrlfs();
> > + signal_handler_unregister();
> > +}
>
> Thank you for adding these.
>
> > +
> > +static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> > +{
> > + int res;
> > +
> > + ksft_print_msg("Starting MBM BW change ...\n");
> > +
> > + if (test_prepare())
> > + return;
> >
>
> I am not sure about this. With this exit the kselftest machinery is not
> aware of the test passing or failing. I wonder if there should not rather
> be a "goto" here that triggers ksft_test_result()?
Yes, ksft_test_result() is needed here (I forgot to add it).
> This needs some more
> thought though. First, with this change test_prepare() officially gains
> responsibility to determine if a failure is transient (just a single test
> fails) or permanent (no use trying any other tests if this fails). For
> the former it would then be up to the caller to call ksft_test_result()
> and for the latter test_prepare() will call ksft_exit_fail_msg().
Well, I didn't initially have test_prepare() at all but all this was
within the test functions (which will be consolidated to a single function
by the series that comes after the two series are done + one patch from
Maciej).
I was just trying to do what was done previously but it seems I forgot to
handle the result status on signal reg fail path.
TBH, I wouldn't mind if also the signal reg fail is just up'ed to
ksft_exit_fail_msg(). I don't think it can ever fail with the parameters
given to it so its error handling feels pretty much dead-code (unless some
crazy thing such as apparmor does something out of the blue, I don't know
if apparmor has capability override sigaction() but I've seen apparmor to
create errors that from the surface make no sense whatsoever comparable
to this case).
So basically this discussion is now about what to do with the mount
failing which already does _exit() before this patch (and possibly some
hypotethical, new prepare code after the consolidation work which also
will have some impact and I believe we might actually want to kill
test_prepare() at that point anyway).
> Second, that SNC warning may be an inconvenience with a new goto. Here
> it may be ok to print that message before the test failure?
I don't follow what you're referring to with "that SNC warning". To the
"Intel CMT may be inaccurate ..." one?
> > if (!validate_resctrl_feature_request(MBM_STR) || (get_vendor() != ARCH_INTEL)) {
> > ksft_test_result_skip("Hardware does not support MBM or MBM is disabled\n");
> > - goto umount;
> > + goto cleanup;
> > }
> >
> > res = mbm_bw_change(cpu_no, benchmark_cmd);
> > @@ -89,8 +109,8 @@ static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
> > if ((get_vendor() == ARCH_INTEL) && res)
> > ksft_print_msg("Intel MBM may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
> >
> > -umount:
> > - umount_resctrlfs();
> > +cleanup:
> > + test_cleanup();
> > }
> >
> > static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> > @@ -99,22 +119,19 @@ static void run_mba_test(const char * const *benchmark_cmd, int cpu_no)
> >
> > ksft_print_msg("Starting MBA Schemata change ...\n");
> >
> > - res = mount_resctrlfs();
> > - if (res) {
> > - ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> > + if (test_prepare())
> > return;
> > - }
> >
> > if (!validate_resctrl_feature_request(MBA_STR) || (get_vendor() != ARCH_INTEL)) {
> > ksft_test_result_skip("Hardware does not support MBA or MBA is disabled\n");
> > - goto umount;
> > + goto cleanup;
> > }
> >
> > res = mba_schemata_change(cpu_no, benchmark_cmd);
> > ksft_test_result(!res, "MBA: schemata change\n");
> >
> > -umount:
> > - umount_resctrlfs();
> > +cleanup:
> > + test_cleanup();
> > }
> >
> > static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
> > @@ -123,15 +140,12 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
> >
> > ksft_print_msg("Starting CMT test ...\n");
> >
> > - res = mount_resctrlfs();
> > - if (res) {
> > - ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> > + if (test_prepare())
> > return;
> > - }
> >
> > if (!validate_resctrl_feature_request(CMT_STR)) {
> > ksft_test_result_skip("Hardware does not support CMT or CMT is disabled\n");
> > - goto umount;
> > + goto cleanup;
> > }
> >
> > res = cmt_resctrl_val(cpu_no, 5, benchmark_cmd);
> > @@ -139,8 +153,8 @@ static void run_cmt_test(const char * const *benchmark_cmd, int cpu_no)
> > if ((get_vendor() == ARCH_INTEL) && res)
> > ksft_print_msg("Intel CMT may be inaccurate when Sub-NUMA Clustering is enabled. Check BIOS configuration.\n");
> >
> > -umount:
> > - umount_resctrlfs();
> > +cleanup:
> > + test_cleanup();
> > }
> >
> > static void run_cat_test(int cpu_no, int no_of_bits)
> > @@ -149,22 +163,19 @@ static void run_cat_test(int cpu_no, int no_of_bits)
> >
> > ksft_print_msg("Starting CAT test ...\n");
> >
> > - res = mount_resctrlfs();
> > - if (res) {
> > - ksft_exit_fail_msg("Failed to mount resctrl FS\n");
> > + if (test_prepare())
> > return;
> > - }
> >
> > if (!validate_resctrl_feature_request(CAT_STR)) {
> > ksft_test_result_skip("Hardware does not support CAT or CAT is disabled\n");
> > - goto umount;
> > + goto cleanup;
> > }
> >
> > res = cat_perf_miss_val(cpu_no, no_of_bits, "L3");
> > ksft_test_result(!res, "CAT: test\n");
> >
> > -umount:
> > - umount_resctrlfs();
> > +cleanup:
> > + test_cleanup();
> > }
> >
> > int main(int argc, char **argv)
> > diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
> > index 51963a6f2186..a9fe61133119 100644
> > --- a/tools/testing/selftests/resctrl/resctrl_val.c
> > +++ b/tools/testing/selftests/resctrl/resctrl_val.c
> > @@ -468,7 +468,9 @@ pid_t bm_pid, ppid;
> >
> > void ctrlc_handler(int signum, siginfo_t *info, void *ptr)
> > {
> > - kill(bm_pid, SIGKILL);
> > + /* Only kill child after bm_pid is set after fork() */
> > + if (bm_pid)
> > + kill(bm_pid, SIGKILL);
> > umount_resctrlfs();
> > tests_cleanup();
> > ksft_print_msg("Ending\n\n");
> > @@ -485,6 +487,8 @@ int signal_handler_register(void)
> > struct sigaction sigact;
> > int ret = 0;
> >
> > + bm_pid = 0;
> > +
>
> Since this is an initialization fix in this area ... what
> do you think of also initializing sigact? It could just be
> a change to
> struct sigaction sigact = {};
>
> This will prevent registering a signal handler with
> uninitialized sa_flags.
Nice catch. It seems quite bad bug, I'll add another patch to fix it.
Thanks once again for your reviews! I'll also address the changelog
improvements you mentioned against the other patches.
--
i.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-28 12:47 ` Ilpo Järvinen
@ 2023-09-28 17:09 ` Reinette Chatre
0 siblings, 0 replies; 20+ messages in thread
From: Reinette Chatre @ 2023-09-28 17:09 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Shuah Khan, Shuah Khan, linux-kselftest,
Maciej Wieczór-Retman, LKML, Shaopeng Tan, stable
Hi Ilpo,
On 9/28/2023 5:47 AM, Ilpo Järvinen wrote:
> On Tue, 26 Sep 2023, Reinette Chatre wrote:
>> On 9/15/2023 8:44 AM, Ilpo Järvinen wrote:
...
>>> +
>>> +static void run_mbm_test(const char * const *benchmark_cmd, int cpu_no)
>>> +{
>>> + int res;
>>> +
>>> + ksft_print_msg("Starting MBM BW change ...\n");
>>> +
>>> + if (test_prepare())
>>> + return;
>>>
>>
>> I am not sure about this. With this exit the kselftest machinery is not
>> aware of the test passing or failing. I wonder if there should not rather
>> be a "goto" here that triggers ksft_test_result()?
>
> Yes, ksft_test_result() is needed here (I forgot to add it).
>
>> This needs some more
>> thought though. First, with this change test_prepare() officially gains
>> responsibility to determine if a failure is transient (just a single test
>> fails) or permanent (no use trying any other tests if this fails). For
>> the former it would then be up to the caller to call ksft_test_result()
>> and for the latter test_prepare() will call ksft_exit_fail_msg().
>
> Well, I didn't initially have test_prepare() at all but all this was
> within the test functions (which will be consolidated to a single function
> by the series that comes after the two series are done + one patch from
> Maciej).
>
> I was just trying to do what was done previously but it seems I forgot to
> handle the result status on signal reg fail path.
>
> TBH, I wouldn't mind if also the signal reg fail is just up'ed to
> ksft_exit_fail_msg(). I don't think it can ever fail with the parameters
> given to it so its error handling feels pretty much dead-code (unless some
> crazy thing such as apparmor does something out of the blue, I don't know
> if apparmor has capability override sigaction() but I've seen apparmor to
> create errors that from the surface make no sense whatsoever comparable
> to this case).
>
> So basically this discussion is now about what to do with the mount
> failing which already does _exit() before this patch (and possibly some
> hypotethical, new prepare code after the consolidation work which also
> will have some impact and I believe we might actually want to kill
> test_prepare() at that point anyway).
Having failure during signal handler registration also trigger ksft_exit_fail_msg()
sounds fair to me. I am also ok with keeping the exit when mount fails.
If any future test_prepare() code does not imply a test exit then I hope it would
be obvious that ksft_test_result() needs to be called. Perhaps that can
be accomplished if test_prepare() does not exit the test but instead just
returns an error code (if needed it can use ksft_print_msg() internally for
any details about particular failures) and the caller call ksft_exit_fail_msg()
if test_prepare() fails? With the caller responsible for the ksft_exit_fail_msg()
as well as ksft_test_result() then any new addition may be guided to the
right calls. This considers hypothetical future changes to code that is
being consolidated so surely no strong opinions from my side.
>> Second, that SNC warning may be an inconvenience with a new goto. Here
>> it may be ok to print that message before the test failure?
>
> I don't follow what you're referring to with "that SNC warning". To the
> "Intel CMT may be inaccurate ..." one?
Yes, that is the warning. I envisioned addressing the issue by adding a
goto label right before the ksft_test_result() call within run_cmt_test()
in this case (but also in run_mbm_test()). Doing so would solve the issue
that test counters are incremented on test_prepare() failure but it will
also trigger the message you note and that would be confusing to the user if
the test failure was because of signal handler registration failure.
...
>>> diff --git a/tools/testing/selftests/resctrl/resctrl_val.c b/tools/testing/selftests/resctrl/resctrl_val.c
>>> index 51963a6f2186..a9fe61133119 100644
>>> --- a/tools/testing/selftests/resctrl/resctrl_val.c
>>> +++ b/tools/testing/selftests/resctrl/resctrl_val.c
>>> @@ -468,7 +468,9 @@ pid_t bm_pid, ppid;
>>>
>>> void ctrlc_handler(int signum, siginfo_t *info, void *ptr)
>>> {
>>> - kill(bm_pid, SIGKILL);
>>> + /* Only kill child after bm_pid is set after fork() */
>>> + if (bm_pid)
>>> + kill(bm_pid, SIGKILL);
>>> umount_resctrlfs();
>>> tests_cleanup();
>>> ksft_print_msg("Ending\n\n");
>>> @@ -485,6 +487,8 @@ int signal_handler_register(void)
>>> struct sigaction sigact;
>>> int ret = 0;
>>>
>>> + bm_pid = 0;
>>> +
>>
>> Since this is an initialization fix in this area ... what
>> do you think of also initializing sigact? It could just be
>> a change to
>> struct sigaction sigact = {};
>>
>> This will prevent registering a signal handler with
>> uninitialized sa_flags.
>
> Nice catch. It seems quite bad bug, I'll add another patch to fix it.
>
> Thanks once again for your reviews! I'll also address the changelog
> improvements you mentioned against the other patches.
>
Thanks to you for improving the resctrl selftests so significantly.
This work is very valuable because we use it to measure and gain
confidence in the health of the resctrl subsystem.
Reinette
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-09-28 8:10 ` Shaopeng Tan (Fujitsu)
@ 2023-10-04 17:43 ` Reinette Chatre
2023-10-05 7:53 ` Shaopeng Tan (Fujitsu)
0 siblings, 1 reply; 20+ messages in thread
From: Reinette Chatre @ 2023-10-04 17:43 UTC (permalink / raw)
To: Shaopeng Tan (Fujitsu), Ilpo Järvinen, Shuah Khan,
Shuah Khan, linux-kselftest@vger.kernel.org,
Maciej Wieczór-Retman
Cc: LKML, stable@vger.kernel.org
Hi Shaopeng,
On 9/28/2023 1:10 AM, Shaopeng Tan (Fujitsu) wrote:
>> On 9/15/2023 8:44 AM, Ilpo Järvinen wrote:
...
>>> +static void run_mbm_test(const char * const *benchmark_cmd, int
>>> +cpu_no) {
>>> + int res;
>>> +
>>> + ksft_print_msg("Starting MBM BW change ...\n");
>>> +
>>> + if (test_prepare())
>>> + return;
>>>
>>
>> I am not sure about this. With this exit the kselftest machinery is not aware of
>> the test passing or failing. I wonder if there should not rather be a "goto" here
>> that triggers ksft_test_result()? This needs some more thought though. First,
>> with this change test_prepare() officially gains responsibility to determine if a
>> failure is transient (just a single test
>> fails) or permanent (no use trying any other tests if this fails). For the former it
>> would then be up to the caller to call ksft_test_result() and for the latter
>> test_prepare() will call ksft_exit_fail_msg().
>> Second, that SNC warning may be an inconvenience with a new goto. Here it
>> may be ok to print that message before the test failure?
>
> If a failure may be permanent, it may be best to detect it before running all tests, rather than in test_prepare().
> Now some detections are completed before running all tests. For example:
> 273 if (geteuid() != 0)
> 274 return ksft_exit_skip("Not running as root. Skipping...\n");
> 275
> 276 if (!check_resctrlfs_support())
> 277 return ksft_exit_skip("resctrl FS does not exist. Enable X86_CPU_RESCTRL config option.\n");
> 278
> 279 if (umount_resctrlfs())
> 280 return ksft_exit_skip("resctrl FS unmount failed.\n");
>
You are correct that the tests should aim to detect as early as possible if
no test has a chance of succeeding. This is covered in the checks you mention.
The purpose of test_prepare()/test_cleanup() pair is to perform actions that
should be done for every test. For example, resctrl is mounted before each
test and unmounted after each test. Since these actions are required to be done
for every test it cannot be a single call before all tests are run.
It may be possible to add a test_prepare() directly followed by a test_cleanup()
before any test is run to be more explicit about early detection but that
does not seem necessary considering the checks would be done anyway when the
first test is run. Even when doing so it would not eliminate the need for
test_prepare()/test_cleanup() to form part of every test run and needing to exit
if, for example, a previous test triggered a fault preventing resctrl from
being mounted.
Reinette
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal
2023-10-04 17:43 ` Reinette Chatre
@ 2023-10-05 7:53 ` Shaopeng Tan (Fujitsu)
0 siblings, 0 replies; 20+ messages in thread
From: Shaopeng Tan (Fujitsu) @ 2023-10-05 7:53 UTC (permalink / raw)
To: 'Reinette Chatre', Ilpo Järvinen, Shuah Khan,
Shuah Khan, linux-kselftest@vger.kernel.org,
Maciej Wieczór-Retman
Cc: LKML, stable@vger.kernel.org
Hello Reinette,
> On 9/28/2023 1:10 AM, Shaopeng Tan (Fujitsu) wrote:
> >> On 9/15/2023 8:44 AM, Ilpo Järvinen wrote:
>
> ...
>
> >>> +static void run_mbm_test(const char * const *benchmark_cmd, int
> >>> +cpu_no) {
> >>> + int res;
> >>> +
> >>> + ksft_print_msg("Starting MBM BW change ...\n");
> >>> +
> >>> + if (test_prepare())
> >>> + return;
> >>>
> >>
> >> I am not sure about this. With this exit the kselftest machinery is
> >> not aware of the test passing or failing. I wonder if there should
> >> not rather be a "goto" here that triggers ksft_test_result()? This
> >> needs some more thought though. First, with this change
> >> test_prepare() officially gains responsibility to determine if a
> >> failure is transient (just a single test
> >> fails) or permanent (no use trying any other tests if this fails).
> >> For the former it would then be up to the caller to call
> >> ksft_test_result() and for the latter
> >> test_prepare() will call ksft_exit_fail_msg().
> >> Second, that SNC warning may be an inconvenience with a new goto.
> >> Here it may be ok to print that message before the test failure?
> >
> > If a failure may be permanent, it may be best to detect it before running all
> tests, rather than in test_prepare().
> > Now some detections are completed before running all tests. For example:
> > 273 if (geteuid() != 0)
> > 274 return ksft_exit_skip("Not running as root.
> Skipping...\n");
> > 275
> > 276 if (!check_resctrlfs_support())
> > 277 return ksft_exit_skip("resctrl FS does not exist. Enable
> X86_CPU_RESCTRL config option.\n");
> > 278
> > 279 if (umount_resctrlfs())
> > 280 return ksft_exit_skip("resctrl FS unmount failed.\n");
> >
>
> You are correct that the tests should aim to detect as early as possible if no test
> has a chance of succeeding. This is covered in the checks you mention.
> The purpose of test_prepare()/test_cleanup() pair is to perform actions that
> should be done for every test. For example, resctrl is mounted before each test
> and unmounted after each test. Since these actions are required to be done for
> every test it cannot be a single call before all tests are run.
>
> It may be possible to add a test_prepare() directly followed by a test_cleanup()
> before any test is run to be more explicit about early detection but that does not
> seem necessary considering the checks would be done anyway when the first
> test is run. Even when doing so it would not eliminate the need for
> test_prepare()/test_cleanup() to form part of every test run and needing to exit
> if, for example, a previous test triggered a fault preventing resctrl from being
> mounted.
Thanks for your explanation. I understand
Best regards,
Shaopeng TAN
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2023-10-05 15:45 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-09-15 15:44 [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests Ilpo Järvinen
2023-09-15 15:44 ` [PATCH v2 1/6] selftests/resctrl: Extend signal handler coverage to unmount on receiving signal Ilpo Järvinen
2023-09-26 21:38 ` Reinette Chatre
2023-09-28 8:10 ` Shaopeng Tan (Fujitsu)
2023-10-04 17:43 ` Reinette Chatre
2023-10-05 7:53 ` Shaopeng Tan (Fujitsu)
2023-09-28 12:47 ` Ilpo Järvinen
2023-09-28 17:09 ` Reinette Chatre
2023-09-15 15:44 ` [PATCH v2 2/6] selftests/resctrl: Remove duplicate feature check from CMT test Ilpo Järvinen
2023-09-26 21:39 ` Reinette Chatre
2023-09-15 15:44 ` [PATCH v2 3/6] selftests/resctrl: Move _GNU_SOURCE define into Makefile Ilpo Järvinen
2023-09-26 21:39 ` Reinette Chatre
2023-09-15 15:44 ` [PATCH v2 4/6] selftests/resctrl: Refactor feature check to use resource and feature name Ilpo Järvinen
2023-09-26 21:40 ` Reinette Chatre
2023-09-15 15:44 ` [PATCH v2 5/6] selftests/resctrl: Fix feature checks Ilpo Järvinen
2023-09-26 21:41 ` Reinette Chatre
2023-09-15 15:44 ` [PATCH v2 6/6] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests Ilpo Järvinen
2023-09-26 21:41 ` Reinette Chatre
2023-09-20 10:13 ` [PATCH v2 0/6] selftests/resctrl: Fixes to failing tests Maciej Wieczór-Retman
2023-09-28 8:18 ` Shaopeng Tan (Fujitsu)
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox