* [PATCH v1 0/5] perf bench: Add rcu to the 'bench sync' collection
@ 2025-07-31 13:26 Yuzhuo Jing
2025-07-31 13:26 ` [PATCH v1 1/5] perf bench: Add RCU benchmark using rcuscale kernel module Yuzhuo Jing
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Yuzhuo Jing @ 2025-07-31 13:26 UTC (permalink / raw)
To: Davidlohr Bueso, Paul E . McKenney, Josh Triplett,
Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Boqun Feng,
Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers,
Lai Jiangshan, Zqiang, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang Kan, Yuzhuo Jing, Yuzhuo Jing, Sebastian Andrzej Siewior,
linux-kernel, rcu, linux-perf-users
Add a 'bench sync rcu' benchmark, using the kernel's rcuscale module.
This patch series adds the following features:
* Automatic rcuscale module load/unload and grace-period statistics.
(The statistics feature was derived from
tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuscale.sh.)
(patch 1)
* Simple benchmark specifying a list of parameters supported by
rcuscale. (patch 1)
* A feature to execute a child process and automatically replace
reader/writer thread ID placeholder strings. This allows the child
process to attach to kernel threads to collect performance
statistics. (patch 2)
* Range-based benchmark that enumerates all combinations of parameter
ranges. (patch 3)
* Ratio-based benchmark that scales between two parameters. (patch 4)
Example usages have been added to each patch commit message.
This patch series depends on the new features of an ongoing patch series
that exposes rcuscale module internal states and experiment results
through debugfs. That patch series is also required for programmatic
experiment start/finish controls.
Link: https://lore.kernel.org/lkml/20250730022347.71722-1-yuzhuo@google.com/T/
RFCs:
* This patch series depends on the behavior of the rcuscale kernel
module. In case of interface changes, especially changes to the
aforementioned "experiment results" format, this benchmark may break.
* The tools/testing/selftests/rcutorture suite provides a set of
scripts to run rcuscale, rcutorture, and refscale in KVM, but leaves
out bare-metal testing. This patch series provides direct benchmarking
without KVM indirection. However, the two reside in different
directories. Is there a better way to integrate both suites?
* (Patch 3) What would be a better range format? The current format
is defined as start[:end:step], and is only for integers.
Potentially we may want ranges for non-integers, or relationships
from expressions.
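To make the range-format question concrete, here is a minimal sketch of a
parser for the integer-only start[:end:step] form. It is not the code from
patch 3: struct int_range, parse_int_prefix() and parse_range() are
hypothetical names, and rejecting a non-positive step or a descending range
is an assumption made only for this illustration.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical container for one parsed range; not a type from the patch. */
struct int_range {
	int start;
	int end;
	int step;
};

/* Parse a base-10 int at the start of s; *rest points just past it. */
static bool parse_int_prefix(const char *s, int *out, const char **rest)
{
	char *endptr;
	long v;

	errno = 0;
	v = strtol(s, &endptr, 10);
	if (endptr == s || errno == ERANGE || v < INT_MIN || v > INT_MAX)
		return false;
	*out = (int)v;
	*rest = endptr;
	return true;
}

/*
 * Parse "start" or "start:end:step".
 * A bare "start" becomes a single-value range with step 1.
 * Assumption for this sketch: step must be positive and end >= start.
 */
static bool parse_range(const char *text, struct int_range *r)
{
	const char *p;

	if (!parse_int_prefix(text, &r->start, &p))
		return false;
	if (*p == '\0') {
		r->end = r->start;
		r->step = 1;
		return true;
	}
	if (*p != ':' || !parse_int_prefix(p + 1, &r->end, &p))
		return false;
	if (*p != ':' || !parse_int_prefix(p + 1, &r->step, &p))
		return false;
	/* Reject trailing characters, zero/negative steps, descending ranges. */
	return *p == '\0' && r->step > 0 && r->end >= r->start;
}
```

With this shape, "4" and "1:9:2" parse, while "1:9", "1:9:0" and "1:9:2x"
are rejected. Supporting non-integers or expressions would need a different
value type than the plain int used here.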
The patches are based on an ongoing series. Specifically, the minor
changes in builtin-bench.c may prevent the patches from applying cleanly
to master/HEAD, though sync-rcu.c itself is independent of the lock
benchmarks from the previous series.
Link: https://lore.kernel.org/lkml/20250729022640.3134066-1-yuzhuo@google.com/T/
Link: https://lore.kernel.org/lkml/20250729081256.3433892-1-yuzhuo@google.com/T/
Yuzhuo Jing (5):
perf bench: Add RCU benchmark using rcuscale kernel module
perf bench: Implement subprocess execution for 'sync rcu'
perf bench: Add 'range' mode to 'sync rcu'
perf bench: Add 'ratio' mode to 'sync rcu'
perf bench: Add documentation for 'sync rcu' suite
tools/perf/Documentation/perf-bench.txt | 131 +++
tools/perf/bench/Build | 1 +
tools/perf/bench/bench.h | 1 +
tools/perf/bench/sync-rcu.c | 1319 +++++++++++++++++++++++
tools/perf/builtin-bench.c | 1 +
5 files changed, 1453 insertions(+)
create mode 100644 tools/perf/bench/sync-rcu.c
--
2.50.1.565.gc32cd1483b-goog
* [PATCH v1 1/5] perf bench: Add RCU benchmark using rcuscale kernel module
2025-07-31 13:26 [PATCH v1 0/5] perf bench: Add rcu to the 'bench sync' collection Yuzhuo Jing
@ 2025-07-31 13:26 ` Yuzhuo Jing
2025-07-31 13:26 ` [PATCH v1 2/5] perf bench: Implement subprocess execution for 'sync rcu' Yuzhuo Jing
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Yuzhuo Jing @ 2025-07-31 13:26 UTC (permalink / raw)
To: Davidlohr Bueso, Paul E . McKenney, Josh Triplett,
Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Boqun Feng,
Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers,
Lai Jiangshan, Zqiang, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang Kan, Yuzhuo Jing, Yuzhuo Jing, Sebastian Andrzej Siewior,
linux-kernel, rcu, linux-perf-users
Add 'rcu' to the 'perf bench sync' collection. This benchmark depends
on the rcuscale kernel module, and also on new features in the rcuscale
module that expose control and internal state through debugfs.
This patch adds the basic 'once' mode that runs one combination of
rcuscale parameters. Command usage is defined as:
perf bench sync rcu [options..] [once <gp_type> [<param=value>..]]
* gp_type is one of "sync", "async", "exp"
* Valid params can be found from `modinfo rcuscale`, except that
gp_exp, gp_async, and block_start are managed by the benchmark and
cannot be set by the user.
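As an illustration of the rule above, the following sketch splits one
"param=value" argument and refuses the three benchmark-managed names. It is
a simplified stand-in, not the patch's parse_module_params(): the helper
names is_managed_param() and split_user_param() are hypothetical, and the
check against the modinfo parameter list is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Parameters the benchmark sets itself, per the usage text above. */
static const char *const managed_params[] = {
	"gp_exp", "gp_async", "block_start",
};

static bool is_managed_param(const char *key)
{
	size_t n = sizeof(managed_params) / sizeof(managed_params[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(key, managed_params[i]) == 0)
			return true;
	return false;
}

/*
 * Split "key=value". On success, key is copied into the caller's buffer
 * and *value points at the text after '='. Fails when '=' is missing,
 * when key or value is empty, when key does not fit, or when key is one
 * of the managed parameters.
 */
static bool split_user_param(const char *arg, char *key, size_t key_size,
			     const char **value)
{
	const char *eq = strchr(arg, '=');
	size_t key_len;

	if (!eq || eq == arg || eq[1] == '\0')
		return false;
	key_len = (size_t)(eq - arg);
	if (key_len >= key_size)
		return false;
	memcpy(key, arg, key_len);
	key[key_len] = '\0';
	if (is_managed_param(key))
		return false;
	*value = eq + 1;
	return true;
}
```

For example, "nreaders=4" splits into "nreaders" and "4", whereas
"gp_exp=1", "holdoff" and "=3" are all rejected.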
This benchmark parses the modinfo output to validate the existence of
user-provided parameters. It then executes modprobe to load rcuscale.
Experiment start/finish is controlled through
/sys/kernel/debug/rcuscale/{should_start,test_complete}.
After the experiment finishes, it reads
/sys/kernel/debug/rcuscale/writer_durations and outputs statistics.
The statistics code (print_writer_duration_stats function) is derived
from tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuscale.sh.
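The percentile selection used by the statistics can be sketched in
isolation as follows. This mirrors the rank computation in
print_writer_duration_stats (rank = max(n * pct / 100, 1), then index
rank - 1 of the sorted array), but the function names sort_durations() and
percentile_sorted() are hypothetical and uint64_t stands in for the tools
u64 type.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Ascending comparison for qsort() on 64-bit durations. */
static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/* Sort the durations in place, smallest first. */
static void sort_durations(uint64_t *v, size_t n)
{
	qsort(v, n, sizeof(*v), cmp_u64);
}

/*
 * Return the pct-th percentile of an already sorted, non-empty array.
 * The rank is n * pct / 100 with integer division, clamped to at least 1,
 * so small sample counts still select the first element.
 */
static uint64_t percentile_sorted(const uint64_t *v, size_t n, unsigned int pct)
{
	size_t rank = n * pct / 100;

	if (rank < 1)
		rank = 1;
	return v[rank - 1];
}
```

With five samples {50, 10, 40, 20, 30}, the 50th percentile is 20 and both
the 90th and 99th percentiles are 40, because integer division truncates
the rank to 4 in both cases.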
Example output:
./perf bench sync rcu --hist once exp nreaders=1 nwriters=1
# Running 'sync/rcu' benchmark:
Running experiment with options: gp_exp=1 nreaders=1 nwriters=1
Experiment finished.
Histogram bucket size: 0.1 microseconds
8.8 11
8.9 25
9 27
9.1 13
9.2 6
9.3 3
9.4 2
9.5 2
9.6 3
9.7 1
9.8 1
9.9 1
10 2
10.4 1
10.5 1
11.1 1
6025.6 1
Average grace-period duration: 68.734 microseconds
Minimum grace-period duration: 8.813
50th percentile grace-period duration: 9.044
90th percentile grace-period duration: 9.625
99th percentile grace-period duration: 10.516
Maximum grace-period duration: 6025.679
Signed-off-by: Yuzhuo Jing <yuzhuo@google.com>
---
tools/perf/bench/Build | 1 +
tools/perf/bench/bench.h | 1 +
tools/perf/bench/sync-rcu.c | 816 ++++++++++++++++++++++++++++++++++++
tools/perf/builtin-bench.c | 1 +
4 files changed, 819 insertions(+)
create mode 100644 tools/perf/bench/sync-rcu.c
diff --git a/tools/perf/bench/Build b/tools/perf/bench/Build
index 13558279fa0e..f694f8715cfc 100644
--- a/tools/perf/bench/Build
+++ b/tools/perf/bench/Build
@@ -20,6 +20,7 @@ perf-bench-y += breakpoint.o
perf-bench-y += pmu-scan.o
perf-bench-y += uprobe.o
perf-bench-y += sync.o
+perf-bench-y += sync-rcu.o
perf-bench-y += qspinlock.o
perf-bench-$(CONFIG_X86_64) += mem-memcpy-x86-64-asm.o
diff --git a/tools/perf/bench/bench.h b/tools/perf/bench/bench.h
index 42c0696b05fb..09c5b3af347f 100644
--- a/tools/perf/bench/bench.h
+++ b/tools/perf/bench/bench.h
@@ -24,6 +24,7 @@ int bench_sched_pipe(int argc, const char **argv);
int bench_sched_seccomp_notify(int argc, const char **argv);
int bench_sync_qspinlock(int argc, const char **argv);
int bench_sync_ticket(int argc, const char **argv);
+int bench_sync_rcu(int argc, const char **argv);
int bench_syscall_basic(int argc, const char **argv);
int bench_syscall_getpgid(int argc, const char **argv);
int bench_syscall_fork(int argc, const char **argv);
diff --git a/tools/perf/bench/sync-rcu.c b/tools/perf/bench/sync-rcu.c
new file mode 100644
index 000000000000..ac85841f0b68
--- /dev/null
+++ b/tools/perf/bench/sync-rcu.c
@@ -0,0 +1,816 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * RCU scale benchmark using rcuscale kernel module.
+ *
+ * 2025 Yuzhuo Jing <yuzhuo@google.com>
+ */
+#include <dirent.h>
+#include <err.h>
+#include <errno.h>
+#include <inttypes.h>
+#include <math.h>
+#include <signal.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <subcmd/parse-options.h>
+#include <sys/cdefs.h>
+#include <sys/wait.h>
+
+#include "bench.h"
+
+#define MAX_OPTS 64
+#define MAX_OPTNAME 64
+#define MAX_OPTTYPE 16
+#define MAX_OPTVALUE 128
+
+#define INIT_CAPACITY 1024UL
+
+/* ============================ Global Options ============================ */
+
+static bool dryrun;
+static unsigned int cooldown = 3;
+static bool show_hist;
+static const char *debugfs = "/sys/kernel/debug";
+
+static const struct option bench_rcu_options[] = {
+ OPT_BOOLEAN('n', "dryrun", &dryrun, "Dry run mode"),
+ OPT_UINTEGER('c', "cooldown", &cooldown,
+ "Sleep time between each run (default: 3 seconds)"),
+ OPT_BOOLEAN(0, "hist", &show_hist,
+ "Show histogram of writer durations"),
+ OPT_STRING(0, "debugfs", &debugfs, "path",
+ "Debugfs mount point (default: /sys/kernel/debug)"),
+ OPT_END()
+};
+
+static const char *const bench_rcu_usage[] = {
+ "RCU benchmark using rcuscale kernel module.",
+ "",
+ "perf bench sync rcu [options..]",
+ "perf bench sync rcu [options..] once <gp_type> [<param=value>..]",
+ "",
+ " <gp_type>: The type of grace period to use: sync, async, exp (expedited)",
+ " This sets the gp_exp or gp_async kernel module parameters.",
+ " <param>: Any parameter of the rcuscale kernel module, e.g. holdoff=5.",
+ " Valid options can be found from running `modinfo rcuscale`.",
+ "",
+ "Notes on param:",
+ " This benchmark manages gp_exp and gp_async, and sets block_start=1.",
+ " User cannot override those parameters. This benchmark also sets default",
+ " values writer_no_print=1 and holdoff=3, but users may override those.",
+ " Note that if nwriters=0, the rcuscale kernel module will not exit,",
+ " and the benchmark will sleep indefinitely.",
+ "",
+ "Modes:",
+ " default: Run 'once sync'.",
+ " once: Run benchmark once, with all parameters passed through to the",
+ " kernel rcuscale module.",
+ "",
+ "Examples:",
+ " perf bench sync rcu --hist once exp nreaders=1 nwriters=1 writer_cpu_offset=1",
+ " perf bench sync rcu once",
+ " perf bench sync rcu once sync nreaders=1 nwriters=1 writer_cpu_offset=1",
+ "",
+ "If perf exits abnormally, users need to unload rcuscale by running:",
+ " modprobe -r rcuscale torture",
+ "",
+ "Global options:", // continues to show bench_rcu_options
+ NULL
+};
+
+/* ============================ Runtime Options ============================ */
+
+#define MODPROBE_BASE_COUNT 3
+#define MODPROBE_CMD_MAX (2 * MAX_OPTS + MODPROBE_BASE_COUNT + 1)
+/*
+ * The command line builder for modprobe. The cmd array will be directly
+ * passed to execvp.
+ *
+ * Note: the cmd array does not own the pointers in it. Those argument
+ * pointers could come from:
+ * - string literals (e.g. the "modprobe" and "rcuscale" command name)
+ * - simple_params
+ */
+struct modprobe_cmd {
+ const char *cmd[MODPROBE_CMD_MAX];
+ size_t count;
+};
+
+#define MODPROBE_CMD_INIT \
+ struct modprobe_cmd modprobe_cmd = { \
+ { "modprobe", "rcuscale", "block_start=1", NULL, }, \
+ MODPROBE_BASE_COUNT, \
+ }
+#define MODPROBE_REMOVE_CMD "modprobe -r rcuscale torture"
+
+/*
+ * Generic modprobe parameter definition. This is the storage for an
+ * instantiated module parameter. This may come from parameters directly
+ * given by user, or generated.
+ *
+ * Format must be "key=value".
+ */
+struct modprobe_param {
+ char value[MAX_OPTVALUE];
+};
+
+/*
+ * The storage for simple (i.e. non-range) module parameter strings.
+ */
+static struct modprobe_param simple_params[MAX_OPTS];
+static int simple_params_count;
+
+static bool in_child;
+
+struct durations {
+ u64 *values;
+ size_t count;
+ size_t capacity;
+ u64 sum;
+};
+
+/* ========================== Override parameters ======================= */
+
+/* Non-override module parameter. This may be updated by "ratio" command. */
+struct non_override_param {
+ char name[MAX_OPTNAME];
+};
+static int non_override_params_count = 3;
+static struct non_override_param non_override_params[MAX_OPTS] = {
+ { "block_start" },
+ { "gp_exp" },
+ { "gp_async" },
+};
+
+struct overridable_param {
+ char name[MAX_OPTNAME];
+ bool user_overridden;
+};
+static struct overridable_param overridable_params[] = {
+ { "writer_no_print", false },
+ { "holdoff", false },
+};
+static const int overridable_params_count = ARRAY_SIZE(overridable_params);
+
+/* Valid module parameters parsed from modinfo. */
+struct modinfo_parm {
+ char name[MAX_OPTNAME];
+ char type[MAX_OPTTYPE];
+};
+static struct modinfo_parm modinfo_parms[MAX_OPTS];
+static int modinfo_parms_count;
+
+/* ========================== Cleanup functions ========================== */
+
+static void unload_module(void)
+{
+ if (system(MODPROBE_REMOVE_CMD) != 0)
+ fprintf(stderr, "Failed to unload rcuscale kernel module.\n"
+ "Please manually remove it using `"MODPROBE_REMOVE_CMD"`.\n");
+}
+
+static void cleanup(void)
+{
+ if (in_child)
+ return;
+
+ unload_module();
+}
+
+static void signal_handler(int sig)
+{
+ if (sig)
+ fprintf(stderr, "perf: Signal %d received\n", sig);
+ /* cleanup is registered in atexit */
+ fprintf(stderr, "Cleaning up...\n");
+ exit(1);
+}
+
+static void setup_cleanup(void)
+{
+ atexit(cleanup);
+ signal(SIGINT, signal_handler);
+ signal(SIGTERM, signal_handler);
+ signal(SIGSEGV, signal_handler);
+}
+
+/*
+ * Failure handling. Use this for logical checks, and use "err" for those with
+ * external interactions.
+ */
+#define fail(fmt, ...) \
+do { \
+ fprintf(stderr, "perf: "fmt"\n", ##__VA_ARGS__); \
+ exit(1); \
+} while (0)
+
+/* ============================ Modprobe info ============================ */
+
+/*
+ * Parse modinfo and store the results in modinfo_parms. Used to determine
+ * whether an option is valid as a range parameter.
+ *
+ * The expected format is:
+ * nreaders:Number of RCU reader threads (int)
+ * nwriters:Number of RCU updater threads (int)
+ */
+static void parse_modinfo(void)
+{
+ char *line = NULL;
+ size_t len = 0;
+ FILE *fp;
+
+ fp = popen("modinfo rcuscale -0 -F parm", "r");
+ if (!fp)
+ err(EXIT_FAILURE, "Failed to run modinfo");
+
+ while (getdelim(&line, &len, '\0', fp) != -1) {
+ char *type;
+ char *remaining = NULL;
+ char *name = strtok_r(line, ":", &remaining);
+
+ if (!name)
+ fail("Failed to parse modinfo parameter name");
+
+ type = strrchr(remaining, '(');
+ if (!type)
+ fail("Failed to parse modinfo parameter type");
+ remaining = NULL;
+ type = strtok_r(type + 1, ")", &remaining);
+ if (!type)
+ fail("Failed to parse modinfo parameter type");
+
+ strlcpy(modinfo_parms[modinfo_parms_count].name, name, MAX_OPTNAME);
+ strlcpy(modinfo_parms[modinfo_parms_count].type, type, MAX_OPTTYPE);
+ modinfo_parms_count++;
+ }
+ if (!modinfo_parms_count)
+ fail("Failed to read modinfo");
+
+ free(line);
+ pclose(fp);
+}
+
+/*
+ * Check if the module parameter is an integer.
+ */
+static bool modparm_is_int(const char *name)
+{
+ for (int i = 0; i < modinfo_parms_count; i++) {
+ if (strcmp(modinfo_parms[i].name, name) == 0)
+ return strcmp(modinfo_parms[i].type, "int") == 0;
+ }
+ return false;
+}
+
+/* ============================ Argument parsing ============================ */
+
+/*
+ * Reserve memory for a pointer.
+ *
+ * If current capacity is 0, the minimum capacity is at least INIT_CAPACITY or
+ * min_capacity.
+ *
+ * The current capacity is doubled if it is less than the minimum capacity.
+ */
+static void raw_reserve_size(void **ptr, size_t elemsize,
+ size_t *current_capacity, size_t min_capacity)
+{
+ size_t new_capacity;
+
+ if (min_capacity == 0)
+ min_capacity = INIT_CAPACITY;
+
+ if (*current_capacity >= min_capacity)
+ return;
+
+ new_capacity = *current_capacity ?: min_capacity;
+ while (new_capacity < min_capacity)
+ new_capacity *= 2;
+
+ *ptr = realloc(*ptr, new_capacity * elemsize);
+ if (!*ptr)
+ fail("Failed to allocate memory");
+ *current_capacity = new_capacity;
+}
+#define reserve_size(ptr, current_capacity, min_capacity) \
+ raw_reserve_size((void **)(ptr), sizeof(**(ptr)), (current_capacity), (min_capacity))
+
+static int parse_int(const char *val)
+{
+ char *endptr;
+ long num = strtol(val, &endptr, 10);
+
+ if (*endptr != '\0')
+ fail("Invalid integer format: %s", val);
+
+ return (int)num;
+}
+
+static void simple_params_add(const char *full)
+{
+ if (simple_params_count >= MAX_OPTS)
+ fail("Too many module parameters");
+ strlcpy(simple_params[simple_params_count++].value, full, MAX_OPTVALUE);
+}
+
+static void parse_gp_type(const char *gp_type)
+{
+ if (strcmp(gp_type, "sync") == 0) {
+ // no new option is added
+ } else if (strcmp(gp_type, "async") == 0)
+ simple_params_add("gp_async=1");
+ else if (strcmp(gp_type, "exp") == 0)
+ simple_params_add("gp_exp=1");
+ else
+ fail("Invalid grace period type: %s", gp_type);
+}
+
+/*
+ * Check if the option is already set.
+ */
+static bool param_has_conflict(const char *key)
+{
+ for (int i = 0; i < non_override_params_count; ++i) {
+ if (strcmp(key, non_override_params[i].name) == 0)
+ return true;
+ }
+ for (int i = 0; i < simple_params_count; ++i) {
+ if (strncmp(key, simple_params[i].value, strlen(key)) == 0
+ && simple_params[i].value[strlen(key)] == '=')
+ return true;
+ }
+ /* overridable_params are considered non conflict */
+
+ return false;
+}
+
+static struct overridable_param *overridable_param_get(const char *key)
+{
+ for (int i = 0; i < overridable_params_count; ++i)
+ if (strcmp(overridable_params[i].name, key) == 0)
+ return overridable_params + i;
+ return NULL;
+}
+
+/*
+ * For overridable_params, if user specifies it, set overridden so that it will
+ * not be appended to modprobe cmd.
+ */
+static inline void param_try_set_user_override(const char *key)
+{
+ struct overridable_param *param = overridable_param_get(key);
+
+ if (param)
+ param->user_overridden = true;
+}
+
+/*
+ * Validate basics about the parameter name.
+ *
+ * Note: This is supposed to only be used during parsing user provided
+ * arguments. This will also update the "user_overridden" flag for overridable
+ * parameters.
+ */
+static void check_param_name(const char *name)
+{
+ if (strlen(name) + 1 > MAX_OPTNAME)
+ fail("Module parameter name too long: %s", name);
+ if (param_has_conflict(name))
+ fail("Module parameter \"%s\" has conflict", name);
+ /* Set user overridden if possible */
+ param_try_set_user_override(name);
+
+ for (int i = 0; i < modinfo_parms_count; ++i) {
+ if (strcmp(modinfo_parms[i].name, name) == 0)
+ return;
+ }
+ fail("Module parameter \"%s\" does not exist in modinfo", name);
+}
+
+/*
+ * Parse module parameter. Results are stored in params and range_params.
+ *
+ * If allow_range is false, all params are stored in params, and checks
+ * the format cannot be range.
+ *
+ * If allow_range is true, params that only has one value will be stored in
+ * params, and range ones will be stored in range_params.
+ */
+static void parse_module_params(int argc, const char *argv[])
+{
+ while (argc) {
+ char *saved_ptr = NULL;
+ char *key;
+ char *value;
+ char buf[MAX_OPTVALUE] = "";
+
+ if (strnlen(argv[0], MAX_OPTVALUE) >= MAX_OPTVALUE - 1)
+ fail("Module parameter too long: \"%s\"", argv[0]);
+ strlcpy(buf, argv[0], MAX_OPTVALUE);
+
+ /* Parse keys and values. */
+ key = strtok_r(buf, "=", &saved_ptr);
+ if (!key)
+ fail("Failed to parse module option \"%s\"", argv[0]);
+ check_param_name(key);
+
+ value = strtok_r(NULL, "=", &saved_ptr);
+ if (!value || strlen(value) == 0)
+ fail("Cannot find value for module option \"%s\"", key);
+ if (strlen(value) + 1 > MAX_OPTVALUE)
+ fail("Module parameter value too long: \"%s\"", value);
+
+ /* Ensure integer-typed values are integers; the parsed value is not needed. */
+ if (modparm_is_int(key))
+ parse_int(value);
+
+ simple_params_add(argv[0]);
+
+ argc--;
+ argv++;
+ }
+}
+
+/* ====================== Experiment Result Handling ====================== */
+
+static void durations_add(struct durations *durations, u64 duration)
+{
+ reserve_size(&durations->values, &durations->capacity, durations->count + 1);
+ durations->values[durations->count++] = duration;
+ durations->sum += duration;
+}
+
+/*
+ * Parse writer durations from debugfs and push them to durations array.
+ *
+ * The expected format is writer_id,duration.
+ *
+ * Durations are stored unconverted (in nanoseconds) in the durations array.
+ */
+static struct durations *parse_durations(void)
+{
+ char durations_path[PATH_MAX];
+ FILE *fp = NULL;
+ char *line = NULL;
+ size_t len = 0;
+ ssize_t read;
+ u64 duration;
+ struct durations *durations = calloc(1, sizeof(*durations));
+
+ if (!durations)
+ fail("Failed to allocate memory for durations");
+
+ snprintf(durations_path, sizeof(durations_path), "%s/rcuscale/writer_durations", debugfs);
+
+ fp = fopen(durations_path, "r");
+ if (!fp)
+ err(EXIT_FAILURE, "Failed to open writer_durations");
+
+ while ((read = getline(&line, &len, fp)) != -1) {
+ if (sscanf(line, "%*d,%"SCNu64, &duration) != 1)
+ fail("Failed to parse writer duration. Line: %s", line);
+ durations_add(durations, duration);
+ }
+
+ free(line);
+ fclose(fp);
+
+ return durations;
+}
+
+static void free_durations(struct durations *durations)
+{
+ free(durations->values);
+ free(durations);
+}
+
+/*
+ * Helper function for sorting.
+ */
+static int compare_duration(const void *a, const void *b)
+{
+ u64 aa = *(u64 *)a, bb = *(u64 *)b;
+
+ return aa < bb ? -1 : !!(aa > bb);
+}
+
+/*
+ * Format a ns value as us with trailing zeros trimmed, without automatic
+ * scientific notation like %g.
+ * e.g. 10000 ns -> 10 us
+ * e.g. 10001 ns -> 10.001 us
+ * e.g. 10100 ns -> 10.1 us
+ */
+static char *print_us(char *buf, u64 ns)
+{
+ int len;
+
+ sprintf(buf, "%"PRIu64, ns / 1000);
+ if (ns % 1000 == 0)
+ return buf;
+
+ sprintf(buf + strlen(buf), ".%03"PRIu64, ns % 1000);
+ len = strlen(buf);
+ while (len && buf[len - 1] == '0')
+ buf[--len] = '\0';
+
+ return buf;
+}
+
+/*
+ * Print statistics of writer durations.
+ *
+ * This function is derived from
+ * tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuscale.sh
+ * Note that the values in the durations array are integers in nanoseconds.
+ */
+static void print_writer_duration_stats(const struct durations *d)
+{
+ size_t pct50, pct90, pct99;
+ size_t count;
+ u64 div, last;
+ char ms_us_buf[30];
+ u64 *durations = d->values;
+ size_t durations_count = d->count;
+
+ if (durations_count == 0) {
+ printf("No rcuscale records found.\n");
+ return;
+ }
+
+ qsort(durations, durations_count, sizeof(*durations), compare_duration);
+
+ // Calculate percentiles
+ pct50 = max(durations_count * 50 / 100, 1UL);
+ pct90 = max(durations_count * 90 / 100, 1UL);
+ pct99 = max(durations_count * 99 / 100, 1UL);
+
+#define US_NS 1000
+#define us(ns) print_us(ms_us_buf, (ns))
+
+ if (show_hist) {
+ // Calculate histogram bucket size based on 90th percentile
+ div = pow(10, floor(log10((double)durations[pct90 - 1]) + 0.5)) / 100;
+ if (div <= 0)
+ div = 1;
+ printf("Histogram bucket size: %s microseconds\n", us(div));
+
+ last = durations[0] - 10 * US_NS;
+ count = 0;
+ for (size_t i = 0; i < durations_count; ++i) {
+ u64 current = durations[i] / div * div;
+
+ if (last == current) {
+ count++;
+ } else {
+ if (count > 0)
+ printf("%s %zu\n", us(last), count);
+ count = 1;
+ last = current;
+ }
+ }
+ if (count > 0)
+ printf("%s %zu\n", us(last), count);
+ }
+
+ // Print statistics
+ printf("Average grace-period duration: %s microseconds\n", us(d->sum / durations_count));
+ printf("Minimum grace-period duration: %s\n", us(durations[0]));
+ printf("50th percentile grace-period duration: %s\n", us(durations[pct50 - 1]));
+ printf("90th percentile grace-period duration: %s\n", us(durations[pct90 - 1]));
+ printf("99th percentile grace-period duration: %s\n", us(durations[pct99 - 1]));
+ printf("Maximum grace-period duration: %s\n", us(durations[durations_count - 1]));
+
+#undef US_NS
+#undef us
+}
+
+/* ============================ Experiment Functions ============================ */
+
+/*
+ * Trigger the experiment by writing 1 to should_start.
+ */
+static void start_experiment(void)
+{
+ char path[PATH_MAX];
+ FILE *fp;
+
+ snprintf(path, sizeof(path), "%s/rcuscale/should_start", debugfs);
+
+ fp = fopen(path, "w");
+ if (!fp)
+ err(EXIT_FAILURE, "Failed to open %s", path);
+
+ if (fprintf(fp, "1\n") < 0)
+ err(EXIT_FAILURE, "Failed to write %s", path);
+
+ fclose(fp);
+}
+
+/*
+ * Wait for the experiment to complete by reading test_complete once every
+ * second, until it is not 0.
+ */
+static void wait_experiment(void)
+{
+ char path[PATH_MAX];
+
+ snprintf(path, sizeof(path), "%s/rcuscale/test_complete", debugfs);
+
+ while (true) {
+ int finished;
+ FILE *fp = fopen(path, "r");
+
+ if (!fp)
+ err(EXIT_FAILURE, "Failed to open %s", path);
+
+ if (fscanf(fp, "%d", &finished) != 1) {
+ fclose(fp);
+ err(EXIT_FAILURE, "Failed to read %s", path);
+ }
+
+ fclose(fp);
+
+ if (finished)
+ break;
+
+ sleep(1);
+ }
+}
+
+/*
+ * Run the constructed modprobe command.
+ */
+static void run_modprobe(const struct modprobe_cmd *cmd)
+{
+ int retval;
+ pid_t pid;
+
+ if (dryrun)
+ return;
+
+ pid = fork();
+ if (pid < 0)
+ err(EXIT_FAILURE, "Failed to fork child process");
+
+ if (pid == 0) {
+ execvp(cmd->cmd[0], (char *const *)cmd->cmd);
+ in_child = true;
+ err(EXIT_FAILURE, "Failed to execute modprobe command");
+ }
+ waitpid(pid, &retval, 0);
+ if (retval)
+ fail("modprobe failed, exiting.");
+}
+
+/*
+ * Print modprobe parameters, but skip the base command line, and also skip
+ * those overridable params not overridden by user.
+ */
+static void print_params(const struct modprobe_cmd *cmd)
+{
+ bool printed = false;
+ char keybuf[MAX_OPTNAME];
+ struct overridable_param *param;
+
+ printf("Running experiment with options:");
+ for (int i = MODPROBE_BASE_COUNT; cmd->cmd[i] != NULL; ++i) {
+ if (sscanf(cmd->cmd[i], "%[^=]=", keybuf) != 1)
+ fail("Invalid generated modprobe parameter: %s", cmd->cmd[i]);
+ param = overridable_param_get(keybuf);
+ if (param == NULL || param->user_overridden) {
+ printed = true;
+ printf(" %s", cmd->cmd[i]);
+ }
+ }
+ if (!printed)
+ printf(" (default)\n");
+ else
+ printf("\n");
+}
+
+/*
+ * Core Experiment function
+ */
+static void runonce(const struct modprobe_cmd *modprobe_cmd)
+{
+ struct durations *durations;
+
+ print_params(modprobe_cmd);
+ run_modprobe(modprobe_cmd);
+
+ if (dryrun)
+ return;
+
+ /* Start and wait for experiment */
+ start_experiment();
+ wait_experiment();
+
+ /* Parse writer durations */
+ /* Wait until all kernel threads enter final wait */
+ sleep(1);
+ durations = parse_durations();
+ unload_module();
+
+ printf("Experiment finished.\n");
+
+ /* Print statistics */
+ print_writer_duration_stats(durations);
+ free_durations(durations);
+}
+
+static void modprobe_cmd_add(struct modprobe_cmd *cmd, const char *v)
+{
+ // 2 for NULL and v
+ if (cmd->count + 2 >= MODPROBE_CMD_MAX)
+ fail("Too many module parameters");
+ cmd->cmd[cmd->count] = v;
+ cmd->cmd[++cmd->count] = NULL;
+}
+
+/*
+ * Append parameters that are overridable by users.
+ */
+static void modprobe_cmd_add_overridable(struct modprobe_cmd *cmd)
+{
+ if (!param_has_conflict("writer_no_print"))
+ modprobe_cmd_add(cmd, "writer_no_print=1");
+ if (!param_has_conflict("holdoff"))
+ modprobe_cmd_add(cmd, "holdoff=3");
+}
+
+/*
+ * Collect simple options into modprobe_cmd.
+ */
+static void modprobe_collect_simple_options(struct modprobe_cmd *cmd)
+{
+ for (int i = 0; i < simple_params_count; ++i)
+ modprobe_cmd_add(cmd, simple_params[i].value);
+
+ modprobe_cmd_add_overridable(cmd);
+}
+
+/*
+ * Test once. Does not allow ranges.
+ */
+static void test_once(int argc, const char *argv[])
+{
+ MODPROBE_CMD_INIT;
+
+ parse_module_params(argc, argv);
+
+ modprobe_collect_simple_options(&modprobe_cmd);
+
+ runonce(&modprobe_cmd);
+}
+
+/* ============================= Entry Point ============================== */
+
+int bench_sync_rcu(int argc, const char **argv)
+{
+ void (*cmd)(int argc, const char *argv[]);
+ const char *runmode, *gp_type;
+
+ /* Reset errno to avoid printing irrelevant error string */
+ errno = 0;
+
+ /* Parse global options first. */
+ argc = parse_options(argc, argv, bench_rcu_options, bench_rcu_usage,
+ PARSE_OPT_STOP_AT_NON_OPTION);
+
+ /* The empty case is equivalent to 'once sync'.
+ * Otherwise, at least two positional options are required:
+ * once/range/ratio and sync/async/exp
+ */
+ if (argc == 0) {
+ runmode = "once";
+ gp_type = "sync";
+ } else if (argc < 2) {
+ usage_with_options(bench_rcu_usage, bench_rcu_options);
+ } else {
+ runmode = argv[0];
+ gp_type = argv[1];
+ argc -= 2;
+ argv += 2;
+ }
+
+ if (strcmp(runmode, "once") == 0)
+ cmd = test_once;
+ else
+ usage_with_options(bench_rcu_usage, bench_rcu_options);
+
+ parse_gp_type(gp_type);
+
+ parse_modinfo();
+ if (system(MODPROBE_REMOVE_CMD) != 0)
+ err(EXIT_FAILURE, "Unloading existing rcuscale module failed");
+
+ setup_cleanup();
+
+ cmd(argc, argv);
+
+ return 0;
+}
diff --git a/tools/perf/builtin-bench.c b/tools/perf/builtin-bench.c
index 8d945b846321..9d2e765c7e16 100644
--- a/tools/perf/builtin-bench.c
+++ b/tools/perf/builtin-bench.c
@@ -55,6 +55,7 @@ static struct bench sched_benchmarks[] = {
static struct bench sync_benchmarks[] = {
{ "qspinlock", "Benchmark for queued spinlock", bench_sync_qspinlock },
{ "ticket", "Benchmark for ticket spinlock", bench_sync_ticket },
+ { "rcu", "Benchmark using rcuscale kernel module", bench_sync_rcu },
{ "all", "Run all synchronization benchmarks", NULL },
{ NULL, NULL, NULL }
};
--
2.50.1.565.gc32cd1483b-goog
* [PATCH v1 2/5] perf bench: Implement subprocess execution for 'sync rcu'
2025-07-31 13:26 [PATCH v1 0/5] perf bench: Add rcu to the 'bench sync' collection Yuzhuo Jing
2025-07-31 13:26 ` [PATCH v1 1/5] perf bench: Add RCU benchmark using rcuscale kernel module Yuzhuo Jing
@ 2025-07-31 13:26 ` Yuzhuo Jing
2025-07-31 13:26 ` [PATCH v1 3/5] perf bench: Add 'range' mode to " Yuzhuo Jing
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Yuzhuo Jing @ 2025-07-31 13:26 UTC (permalink / raw)
To: Davidlohr Bueso, Paul E . McKenney, Josh Triplett,
Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Boqun Feng,
Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers,
Lai Jiangshan, Zqiang, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang Kan, Yuzhuo Jing, Yuzhuo Jing, Sebastian Andrzej Siewior,
linux-kernel, rcu, linux-perf-users
Monitoring system state is useful for understanding performance impact.
This patch enables running an external tool during the benchmark. It
provides semantics similar to 'perf record -- perf bench mem', except
that the order is reversed.
Because the benchmark threads are kthreads created by the kernel module,
perf cannot directly attach to them. This patch proposes a method to
execute the attach command from a child process, using command line
substitution.
If any of the command strings contains a "{READER,WRITER,KFREE}_TASKS"
placeholder, it is replaced with the real value upon startup. The
thread ID information comes from
/sys/kernel/debug/rcuscale/{reader,writer,kfree}_tasks.
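The substitution step can be sketched as a plain string replacement over
each argument of the child command. This is an illustrative helper, not the
implementation in this patch: replace_placeholder() is a hypothetical name,
and the thread ID list is assumed to already be available as one
comma-separated string read from debugfs.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a newly allocated copy of arg with every occurrence of name
 * (e.g. "READER_TASKS") replaced by value (e.g. "1682932").
 * Returns NULL on allocation failure or if name is empty.
 * The caller owns the result and must free() it.
 */
static char *replace_placeholder(const char *arg, const char *name,
				 const char *value)
{
	size_t name_len = strlen(name);
	size_t value_len = strlen(value);
	size_t hits = 0;
	const char *p, *hit;
	char *out, *w;

	if (name_len == 0)
		return NULL;

	/* First pass: count matches to size the output buffer. */
	for (p = arg; (hit = strstr(p, name)) != NULL; p = hit + name_len)
		hits++;

	out = malloc(strlen(arg) - hits * name_len + hits * value_len + 1);
	if (!out)
		return NULL;

	/* Second pass: copy the text between matches, then the value. */
	w = out;
	for (p = arg; (hit = strstr(p, name)) != NULL; p = hit + name_len) {
		memcpy(w, p, (size_t)(hit - p));
		w += hit - p;
		memcpy(w, value, value_len);
		w += value_len;
	}
	strcpy(w, p);	/* remaining tail, including the terminator */
	return out;
}
```

Applied twice to the argument "READER_TASKS,WRITER_TASKS", once per
placeholder, this produces "1682932,1682933" for the thread IDs shown in
the example that follows.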
Example usage of running 'perf stat' to attach to kernel threads:
$ ./perf bench sync rcu once sync nreaders=1 nwriters=1 writer_cpu_offset=1 -- \
perf stat -e ipi:ipi_send_cpu,rcu:rcu_grace_period \
-t READER_TASKS,WRITER_TASKS
# Running 'sync/rcu' benchmark:
Running experiment with options: nreaders=1 nwriters=1 writer_cpu_offset=1
Running child command: perf stat -e ipi:ipi_send_cpu,rcu:rcu_grace_period -t 1682932,1682933
Performance counter stats for thread id '1682932,1682933':
20105 ipi:ipi_send_cpu
702 rcu:rcu_grace_period
25.023871111 seconds time elapsed
Experiment finished.
Waiting for child process to exit.
Average grace-period duration: 188128.652 microseconds
Minimum grace-period duration: 9000.221
50th percentile grace-period duration: 217996.932
90th percentile grace-period duration: 218001.019
99th percentile grace-period duration: 218153.558
Maximum grace-period duration: 326999.705
Signed-off-by: Yuzhuo Jing <yuzhuo@google.com>
---
tools/perf/bench/sync-rcu.c | 252 +++++++++++++++++++++++++++++++++++-
1 file changed, 247 insertions(+), 5 deletions(-)
diff --git a/tools/perf/bench/sync-rcu.c b/tools/perf/bench/sync-rcu.c
index ac85841f0b68..934d2416c216 100644
--- a/tools/perf/bench/sync-rcu.c
+++ b/tools/perf/bench/sync-rcu.c
@@ -5,6 +5,7 @@
* 2025 Yuzhuo Jing <yuzhuo@google.com>
*/
#include <dirent.h>
+#include <ctype.h>
#include <err.h>
#include <errno.h>
#include <inttypes.h>
@@ -32,6 +33,7 @@
static bool dryrun;
static unsigned int cooldown = 3;
static bool show_hist;
+static unsigned int child_delay = 1;
static const char *debugfs = "/sys/kernel/debug";
static const struct option bench_rcu_options[] = {
@@ -40,6 +42,8 @@ static const struct option bench_rcu_options[] = {
"Sleep time between each run (default: 3 seconds)"),
OPT_BOOLEAN(0, "hist", &show_hist,
"Show histogram of writer durations"),
+ OPT_UINTEGER(0, "child-delay", &child_delay,
+ "Wait for child startup before starting experiment (default: 1 second)"),
OPT_STRING(0, "debugfs", &debugfs, "path",
"Debugfs mount point (default: /sys/kernel/debug)"),
OPT_END()
@@ -48,13 +52,18 @@ static const struct option bench_rcu_options[] = {
static const char *const bench_rcu_usage[] = {
"RCU benchmark using rcuscale kernel module.",
"",
- "perf bench sync rcu [options..]",
- "perf bench sync rcu [options..] once <gp_type> [<param=value>..]",
+ "perf bench sync rcu [options..] [-- <command>..]",
+ "perf bench sync rcu [options..] once <gp_type> [<param=value>..] [-- <command>..]",
"",
" <gp_type>: The type of grace period to use: sync, async, exp (expedited)",
" This sets the gp_exp or gp_async kernel module parameters.",
" <param>: Any parameter of the rcuscale kernel module, e.g. holdoff=5.",
" Valid options can be found from running `modinfo rcuscale`.",
+ " <command>: A child command to run during the experiment. This is useful",
+ " for running tools that monitor system metrics during the",
+ " experiment. If the command line string contains",
+ " {READER,WRITER,KFREE}_TASKS placeholders, they will be substituted",
+ " with the tasks PIDs, separated by comma.",
"",
"Notes on param:",
" This benchmark manages gp_exp and gp_async, and sets block_start=1.",
@@ -73,6 +82,10 @@ static const char *const bench_rcu_usage[] = {
" perf bench sync rcu once",
" perf bench sync rcu once sync nreaders=1 nwriters=1 writer_cpu_offset=1",
"",
+ " perf bench sync rcu once sync nreaders=1 nwriters=1 writer_cpu_offset=1 -- \\",
+ " perf stat -e ipi:ipi_send_cpu,rcu:rcu_grace_period \\",
+ " -t READER_TASKS,WRITER_TASKS",
+ "",
"In case perf exited abnormally, user need to unload rcuscale by running:",
" modprobe -r rcuscale torture",
"",
@@ -105,6 +118,23 @@ struct modprobe_cmd {
}
#define MODPROBE_REMOVE_CMD "modprobe -r rcuscale torture"
+/*
+ * Generated subprocess command.
+ *
+ * Different from modprobe_cmd, this struct owns the argv array and all
+ * strings in the array. The only exception is child_cmd_template, which
+ * contains the remainder of argv parsing.
+ *
+ * Upon each runonce(), generate_child_command will make a copy of the strings
+ * in child_cmd_template and also substitute placeholders with actual values.
+ */
+struct child_cmd {
+ int argc;
+ char **argv;
+};
+
+static struct child_cmd child_cmd_template;
+
/*
* Generic modprobe parameter definition. This is the storage for an
* instantiated module parameter. This may come from parameters directly
@@ -122,6 +152,7 @@ struct modprobe_param {
static struct modprobe_param simple_params[MAX_OPTS];
static int simple_params_count;
+static pid_t child_pid;
static bool in_child;
struct durations {
@@ -177,6 +208,12 @@ static void cleanup(void)
return;
unload_module();
+
+ if (child_pid) {
+ kill(child_pid, SIGTERM);
+ waitpid(child_pid, NULL, 0);
+ child_pid = 0;
+ }
}
static void signal_handler(int sig)
@@ -407,6 +444,13 @@ static void parse_module_params(int argc, const char *argv[])
char *value;
char buf[MAX_OPTVALUE] = "";
+ /* Handle child command. */
+ if (strcmp(argv[0], "--") == 0) {
+ child_cmd_template.argc = argc - 1;
+ child_cmd_template.argv = (char **)argv + 1;
+ break;
+ }
+
if (strnlen(argv[0], MAX_OPTVALUE) >= MAX_OPTVALUE - 1)
fail("Module parameter too long: \"%s\"", argv[0]);
strlcpy(buf, argv[0], MAX_OPTVALUE);
@@ -434,6 +478,162 @@ static void parse_module_params(int argc, const char *argv[])
}
}
+/* ======================== Child Command Handling ========================= */
+
+/*
+ * Read reader, writer, or kfree tasks from debugfs, and return a comma
+ * separated string.
+ */
+static char *get_tids(const char *debugfs_filename)
+{
+ char path[PATH_MAX];
+ FILE *fp;
+
+ char *tids = calloc(INIT_CAPACITY, sizeof(char));
+ size_t tids_len = 0;
+ size_t tids_capacity = INIT_CAPACITY;
+
+ char *line = NULL;
+ size_t line_buf_size = 0;
+
+ if (!tids)
+ fail("Failed to allocate memory for substitute string");
+
+ snprintf(path, sizeof(path), "%s/rcuscale/%s", debugfs, debugfs_filename);
+
+ fp = fopen(path, "r");
+ if (!fp)
+ err(EXIT_FAILURE, "Failed to open %s", path);
+
+ while (getline(&line, &line_buf_size, fp) != -1) {
+ size_t line_len = strlen(line);
+ bool is_first = (tids_len == 0);
+
+ // trim white space and new line characters
+ while (line_len && isspace(line[line_len - 1]))
+ line[--line_len] = '\0';
+
+ // 2 for NUL-terminator and ","
+ reserve_size(&tids, &tids_capacity, tids_len + line_len + 2);
+ // skip "," for the first value
+ if (!is_first)
+ strlcpy(tids + tids_len, ",", 2);
+ strcat(tids + tids_len, line);
+ tids_len += line_len + !is_first;
+ }
+
+ free(line);
+ fclose(fp);
+
+ return tids;
+}
+
+/*
+ * Replace the placeholder with the actual value. Modifies the given new string.
+ */
+static void replace_child_arg(char **arg, const char *placeholder,
+ const char *debugfs_filename, char **replacement)
+{
+ size_t str_capacity = strlen(*arg) + 1;
+ size_t placeholder_len = strlen(placeholder);
+
+ while (true) {
+ size_t replacement_len;
+ const char *found = strstr(*arg, placeholder);
+ size_t placeholder_off, suffix_off;
+
+ if (found == NULL)
+ return;
+
+ placeholder_off = found - *arg;
+ found = NULL;
+
+ /* Replacement is calculated lazily upon encountering placeholder */
+ if (*replacement == NULL)
+ *replacement = get_tids(debugfs_filename);
+
+ replacement_len = strlen(*replacement);
+
+ reserve_size(arg, &str_capacity,
+ str_capacity - placeholder_len + replacement_len + 1);
+
+ suffix_off = placeholder_off + placeholder_len;
+
+ /* Move: v suffix_off
+ * PREFIX PLACEHOLDER SUFFIX
+ * ^ placeholder_off
+ * To: PREFIX _______ SUFFIX
+ * Or: PREFIX _______________ SUFFIX
+ * ^ placeholder_off+replacement_len
+ */
+ memmove(*arg + placeholder_off + replacement_len,
+ *arg + suffix_off, strlen(*arg + suffix_off) + 1);
+ /* Fill in the replacement */
+ memcpy(*arg + placeholder_off, *replacement, replacement_len);
+ }
+}
+
+/*
+ * Generate child command by replacing {READER,WRITER,KFREE}_TASKS with the actual
+ * values, comma separated. Caller must call free_child_command().
+ */
+static struct child_cmd *generate_child_command(void)
+{
+ char *reader_tasks_string = NULL;
+ char *writer_tasks_string = NULL;
+ char *kfree_tasks_string = NULL;
+ struct child_cmd *cmd = calloc(1, sizeof(*cmd));
+
+ if (!cmd)
+ fail("Failed to allocate memory for child command");
+
+ cmd->argc = child_cmd_template.argc;
+ if (cmd->argc == 0) {
+ cmd->argv = NULL;
+ return cmd;
+ }
+
+ cmd->argv = malloc((cmd->argc + 1) * sizeof(char *));
+ if (!cmd->argv)
+ fail("Failed to allocate memory for child command");
+
+ for (int i = 0; i < cmd->argc; ++i) {
+ char *arg = strdup(child_cmd_template.argv[i]);
+
+ if (!arg)
+ fail("Failed to allocate memory for child command");
+
+ if (dryrun) {
+ cmd->argv[i] = arg;
+ continue;
+ }
+
+ replace_child_arg(&arg, "READER_TASKS", "reader_tasks", &reader_tasks_string);
+ replace_child_arg(&arg, "WRITER_TASKS", "writer_tasks", &writer_tasks_string);
+ replace_child_arg(&arg, "KFREE_TASKS", "kfree_tasks", &kfree_tasks_string);
+
+ cmd->argv[i] = arg;
+ }
+
+ cmd->argv[cmd->argc] = NULL;
+
+ free(reader_tasks_string);
+ free(writer_tasks_string);
+ free(kfree_tasks_string);
+
+ return cmd;
+}
+
+/*
+ * Free the child command.
+ */
+static void free_child_command(struct child_cmd *cmd)
+{
+ for (int i = 0; i < cmd->argc; i++)
+ free(cmd->argv[i]);
+ free(cmd->argv);
+}
+
/* ====================== Experiment Result Handling ====================== */
static void durations_add(struct durations *durations, u64 duration)
@@ -692,18 +892,53 @@ static void print_params(const struct modprobe_cmd *cmd)
printf("\n");
}
+static void print_child_command(const struct child_cmd *cmd)
+{
+ if (cmd->argc == 0)
+ return;
+ printf("Running child command:");
+ for (int i = 0; i < cmd->argc; ++i)
+ printf(" %s", cmd->argv[i]);
+ printf("\n");
+}
+
/*
* Core Experiment function
*/
static void runonce(const struct modprobe_cmd *modprobe_cmd)
{
+ struct child_cmd *child_cmd;
struct durations *durations;
print_params(modprobe_cmd);
run_modprobe(modprobe_cmd);
- if (dryrun)
+ child_cmd = generate_child_command();
+ print_child_command(child_cmd);
+
+ if (dryrun) {
+ free_child_command(child_cmd);
return;
+ }
+
+ if (child_cmd->argc != 0) {
+ // Start command in background
+ child_pid = fork();
+ if (child_pid < 0)
+ err(EXIT_FAILURE, "Failed to fork child process");
+
+ if (child_pid == 0) {
+ execvp(child_cmd->argv[0], child_cmd->argv);
+ in_child = true;
+ err(EXIT_FAILURE, "Failed to execute child command");
+ }
+ // otherwise, parent process
+ }
+ free_child_command(child_cmd);
+ child_cmd = NULL;
+
+ /* Wait for child process to initialize */
+ sleep(child_delay);
/* Start and wait for experiment */
start_experiment();
@@ -717,6 +952,13 @@ static void runonce(const struct modprobe_cmd *modprobe_cmd)
printf("Experiment finished.\n");
+ /* Wait for child to finish */
+ if (child_pid != 0) {
+ printf("Waiting for child process to exit.\n");
+ waitpid(child_pid, NULL, 0);
+ child_pid = 0;
+ }
+
/* Print statistics */
print_writer_duration_stats(durations);
free_durations(durations);
@@ -779,13 +1021,13 @@ int bench_sync_rcu(int argc, const char **argv)
/* Parse global options first. */
argc = parse_options(argc, argv, bench_rcu_options, bench_rcu_usage,
- PARSE_OPT_STOP_AT_NON_OPTION);
+ PARSE_OPT_STOP_AT_NON_OPTION | PARSE_OPT_KEEP_DASHDASH);
/* The empty case is equivalent to 'once sync'.
* Otherwise, at least two positional options are required:
* once/range/ratio and sync/async/exp
*/
- if (argc == 0) {
+ if (argc == 0 || strcmp(argv[0], "--") == 0) {
runmode = "once";
gp_type = "sync";
} else if (argc < 2) {
--
2.50.1.565.gc32cd1483b-goog
* [PATCH v1 3/5] perf bench: Add 'range' mode to 'sync rcu'
2025-07-31 13:26 [PATCH v1 0/5] perf bench: Add rcu to the 'bench sync' collection Yuzhuo Jing
2025-07-31 13:26 ` [PATCH v1 1/5] perf bench: Add RCU benchmark using rcuscale kernel module Yuzhuo Jing
2025-07-31 13:26 ` [PATCH v1 2/5] perf bench: Implement subprocess execution for 'sync rcu' Yuzhuo Jing
@ 2025-07-31 13:26 ` Yuzhuo Jing
2025-07-31 13:26 ` [PATCH v1 4/5] perf bench: Add 'ratio' " Yuzhuo Jing
2025-07-31 13:26 ` [PATCH v1 5/5] perf bench: Add documentation for 'sync rcu' suite Yuzhuo Jing
4 siblings, 0 replies; 6+ messages in thread
From: Yuzhuo Jing @ 2025-07-31 13:26 UTC (permalink / raw)
To: Davidlohr Bueso, Paul E . McKenney, Josh Triplett,
Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Boqun Feng,
Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers,
Lai Jiangshan, Zqiang, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang Kan, Yuzhuo Jing, Yuzhuo Jing, Sebastian Andrzej Siewior,
linux-kernel, rcu, linux-perf-users
Add 'range' mode to test multiple combinations of parameters in
rcuscale. The command format is similar to 'once', but allows
parameters to be specified as 'name=start[:end[:step]]', inclusive
integer ranges. The default step is 1.
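As a rough illustration of the range syntax, the parsing rules can be
sketched in standalone C. The names struct range_sketch and
parse_range_sketch() are hypothetical; the patch uses its own
parse_range(), which is built on strtok_r() and parse_int() and calls
fail() on error instead of returning a status code.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical type for this sketch; the patch defines struct range. */
struct range_sketch {
	int start;
	int end;
	int step;
};

/*
 * Parse "start", "start:end" or "start:end:step" (all non-negative,
 * inclusive). Returns 0 on success and -1 on malformed input.
 */
static int parse_range_sketch(const char *str, struct range_sketch *r)
{
	long v[3] = { 0, 0, 1 };	/* step defaults to 1 */
	int n = 0;
	const char *p = str;

	for (;;) {
		char *end;

		if (n == 3)
			return -1;	/* more than three fields */

		errno = 0;
		v[n] = strtol(p, &end, 10);
		if (end == p || errno == ERANGE || v[n] < 0 || v[n] > INT_MAX)
			return -1;
		n++;

		if (*end == '\0')
			break;
		if (*end != ':')
			return -1;	/* unexpected separator */
		p = end + 1;
	}

	if (n == 1)
		v[1] = v[0];		/* "start" alone is a single value */
	if (v[0] > v[1] || v[2] <= 0)
		return -1;

	r->start = (int)v[0];
	r->end = (int)v[1];
	r->step = (int)v[2];
	return 0;
}
```

Under these rules "1:10:2" parses to start=1, end=10, step=2, while
"1" parses to the single value 1 with step 1.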
This 'range' mode allows multiple parameters to be ranges, and in that
scenario, the benchmark will enumerate all combinations of all ranges.
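The enumeration order can be sketched in standalone C as follows. Note
that this is only an illustration under stated assumptions: the patch
itself walks the ranges recursively (test_range_recursive), whereas this
sketch uses an iterative "odometer" that visits the same combinations in
the same order, with the last range varying fastest. The names
struct int_range, first_combination() and next_combination() are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical inclusive range; assumes 0 <= start <= end and step > 0. */
struct int_range {
	int start;
	int end;
	int step;
};

/* Set 'values' to the first combination: every range at its start. */
static void first_combination(const struct int_range *r, size_t n, int *values)
{
	for (size_t i = 0; i < n; i++)
		values[i] = r[i].start;
}

/*
 * Advance 'values' to the next combination. The last range varies
 * fastest. Returns false once every combination has been visited,
 * at which point 'values' has wrapped back to the first combination.
 * Assumes end + step does not overflow int.
 */
static bool next_combination(const struct int_range *r, size_t n, int *values)
{
	for (size_t i = n; i-- > 0; ) {
		if (values[i] + r[i].step <= r[i].end) {
			values[i] += r[i].step;
			return true;
		}
		values[i] = r[i].start;	/* wrap this digit, carry left */
	}
	return false;
}
```

A caller would run one experiment for the values set by
first_combination(), then keep running experiments while
next_combination() returns true. For the ranges 1:2 and 0:2 this visits
2 x 3 = 6 combinations.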
The example below runs 6 scenarios, covering
[nreaders = 1 or 2] x [writer_cpu_offset = 0 or 1 or 2].
The results show that whether the reader and writer CPU affinities
overlap affects the performance characteristics.
$ ./perf bench sync rcu range exp nreaders=1:2 nwriters=1 writer_cpu_offset=0:2
# Running 'sync/rcu' benchmark:
Running experiment with options: gp_exp=1 nwriters=1 nreaders=1 writer_cpu_offset=0
Experiment finished.
Average grace-period duration: 297.535 microseconds
Minimum grace-period duration: 8.853
50th percentile grace-period duration: 9.044
90th percentile grace-period duration: 9.905
99th percentile grace-period duration: 5724.727
Maximum grace-period duration: 12029.204
Cooling down (3s)..
Running experiment with options: gp_exp=1 nwriters=1 nreaders=1 writer_cpu_offset=1
Experiment finished.
Average grace-period duration: 15.491 microseconds
Minimum grace-period duration: 8.863
50th percentile grace-period duration: 9.354
90th percentile grace-period duration: 21.142
99th percentile grace-period duration: 50.195
Maximum grace-period duration: 319.359
Cooling down (3s)..
Running experiment with options: gp_exp=1 nwriters=1 nreaders=1 writer_cpu_offset=2
Experiment finished.
Average grace-period duration: 21.439 microseconds
Minimum grace-period duration: 11.046
50th percentile grace-period duration: 16.134
90th percentile grace-period duration: 32.819
99th percentile grace-period duration: 53.59
Maximum grace-period duration: 186.71
Cooling down (3s)..
Running experiment with options: gp_exp=1 nwriters=1 nreaders=2 writer_cpu_offset=0
Experiment finished.
Average grace-period duration: 122.448 microseconds
Minimum grace-period duration: 8.934
50th percentile grace-period duration: 9.234
90th percentile grace-period duration: 9.895
99th percentile grace-period duration: 13.31
Maximum grace-period duration: 6024.476
Cooling down (3s)..
Running experiment with options: gp_exp=1 nwriters=1 nreaders=2 writer_cpu_offset=1
Experiment finished.
Average grace-period duration: 68.765 microseconds
Minimum grace-period duration: 8.913
50th percentile grace-period duration: 9.144
90th percentile grace-period duration: 9.384
99th percentile grace-period duration: 10.505
Maximum grace-period duration: 6023.405
Cooling down (3s)..
Running experiment with options: gp_exp=1 nwriters=1 nreaders=2 writer_cpu_offset=2
Experiment finished.
Average grace-period duration: 12.079 microseconds
Minimum grace-period duration: 9.204
50th percentile grace-period duration: 9.344
90th percentile grace-period duration: 11.538
99th percentile grace-period duration: 41.152
Maximum grace-period duration: 78.478
Signed-off-by: Yuzhuo Jing <yuzhuo@google.com>
---
tools/perf/bench/sync-rcu.c | 199 ++++++++++++++++++++++++++++++++++--
1 file changed, 193 insertions(+), 6 deletions(-)
diff --git a/tools/perf/bench/sync-rcu.c b/tools/perf/bench/sync-rcu.c
index 934d2416c216..921520a645ae 100644
--- a/tools/perf/bench/sync-rcu.c
+++ b/tools/perf/bench/sync-rcu.c
@@ -54,6 +54,7 @@ static const char *const bench_rcu_usage[] = {
"",
"perf bench sync rcu [options..] [-- <command>..]",
"perf bench sync rcu [options..] once <gp_type> [<param=value>..] [-- <command>..]",
+ "perf bench sync rcu [options..] range <gp_type> [<param=range>..] [-- <command>..]",
"",
" <gp_type>: The type of grace period to use: sync, async, exp (expedited)",
" This sets the gp_exp or gp_async kernel module parameters.",
@@ -76,11 +77,18 @@ static const char *const bench_rcu_usage[] = {
" default: Run 'once sync'.",
" once: Run benchmark once, with all parameters passed through to the",
" kernel rcuscale module.",
+ " range: Run benchmark multiple times, with parameters as ranges.",
+ " Range format is defined as start[:end[:step]], inclusive, non-negative.",
+ " The benchmark instantiates all combinations of all ranges.",
+ " If a parameter does not specify end, or start=end, it behaves",
+ " the same as 'once' mode. The range parameter types are validated",
+ " against `modinfo rcuscale` to ensure they are integer.",
"",
"Examples:",
" perf bench sync rcu --hist once exp nreaders=1 nwriters=1 writer_cpu_offset=1",
" perf bench sync rcu once",
" perf bench sync rcu once sync nreaders=1 nwriters=1 writer_cpu_offset=1",
+ " perf bench sync rcu range exp nreaders=1:2 nwriters=1 writer_cpu_offset=0:2",
"",
" perf bench sync rcu once sync nreaders=1 nwriters=1 writer_cpu_offset=1 -- \\",
" perf stat -e ipi:ipi_send_cpu,rcu:rcu_grace_period \\",
@@ -105,6 +113,7 @@ static const char *const bench_rcu_usage[] = {
* pointers could come from:
* - string literals (e.g. the "modprobe" and "rcuscale" command name)
* - simple_params
+ * - generated param from ranges
*/
struct modprobe_cmd {
const char *cmd[MODPROBE_CMD_MAX];
@@ -146,6 +155,30 @@ struct modprobe_param {
char value[MAX_OPTVALUE];
};
+/*
+ * Parsed range module parameter. The collected range_params will be
+ * instantiated to actual values, and then collected into modprobe_cmd.
+ *
+ * The range is inclusive.
+ *
+ * Example range: start=1 end=9 step=2 will instantiate values 1, 3, 5, 7, 9.
+ */
+struct range {
+ int start;
+ int end;
+ int step;
+};
+struct range_option {
+ char name[MAX_OPTNAME];
+ struct range range;
+};
+
+/*
+ * The storage of range parameters.
+ */
+static struct range_option range_params[MAX_OPTS];
+static int range_params_count;
+
/*
* The storage for simple (i.e. non-range) module parameter strings.
*/
@@ -346,6 +379,75 @@ static int parse_int(const char *val)
return (int)num;
}
+/*
+ * Parse a range string into a range struct. The range is inclusive.
+ *
+ * The range string is in the format of "start[:end[:step]]".
+ * The default step is 1.
+ *
+ * Example:
+ * "1:10:2" -> start=1, end=10, step=2
+ * "1:10" -> start=1, end=10, step=1
+ * "1" -> start=1, end=1, step=1
+ */
+static int parse_range(struct range *range, const char *str)
+{
+#define MAX_RANGE 5
+
+ char *token;
+ char *saveptr = NULL;
+ int count = 0;
+ int values[MAX_RANGE];
+
+ char *str_copy = strdup(str);
+
+ if (!str_copy)
+ fail("Memory allocation failed");
+
+ // Split by ':'
+ token = strtok_r(str_copy, ":", &saveptr);
+ while (token != NULL && count < MAX_RANGE) {
+ values[count++] = parse_int(token);
+ token = strtok_r(NULL, ":", &saveptr);
+ }
+
+ switch (count) {
+ case 1:
+ range->start = values[0];
+ range->end = values[0];
+ range->step = 1;
+ break;
+ case 2:
+ range->start = values[0];
+ range->end = values[1];
+ range->step = 1;
+ break;
+ case 3:
+ range->start = values[0];
+ range->end = values[1];
+ range->step = values[2];
+ break;
+ default:
+ free(str_copy);
+ fail("Invalid range format: \"%s\"", str);
+ }
+
+ if (range->start < 0 || range->end < 0)
+ fail("Range must be non negative");
+ if (range->start > range->end)
+ fail("Range start must be smaller or equal to end");
+ if (range->step <= 0)
+ fail("Range step must be positive");
+
+ free(str_copy);
+ return 0;
+
+#undef MAX_RANGE
+}
+
+#define param_print_key_value(param, fmt, ...) \
+ snprintf((param)->value, MAX_OPTVALUE, fmt, ##__VA_ARGS__)
+
static void simple_params_add(const char *full)
{
if (simple_params_count >= MAX_OPTS)
@@ -353,6 +455,14 @@ static void simple_params_add(const char *full)
strlcpy(simple_params[simple_params_count++].value, full, MAX_OPTVALUE);
}
+static void range_params_add(const char *name, const struct range *range)
+{
+ if (range_params_count >= MAX_OPTS)
+ fail("Too many module parameters");
+ strlcpy(range_params[range_params_count].name, name, MAX_OPTNAME);
+ range_params[range_params_count++].range = *range;
+}
+
static void parse_gp_type(const char *gp_type)
{
if (strcmp(gp_type, "sync") == 0) {
@@ -379,6 +489,10 @@ static bool param_has_conflict(const char *key)
&& simple_params[i].value[strlen(key)] == '=')
return true;
}
+ for (int i = 0; i < range_params_count; ++i) {
+ if (strcmp(key, range_params[i].name) == 0)
+ return true;
+ }
/* overridable_params are considered non conflict */
return false;
@@ -436,10 +550,12 @@ static void check_param_name(const char *name)
* If allow_range is true, params that only has one value will be stored in
* params, and range ones will be stored in range_params.
*/
-static void parse_module_params(int argc, const char *argv[])
+static void parse_module_params(int argc, const char *argv[], bool allow_range)
{
while (argc) {
char *saved_ptr = NULL;
+ struct range range;
+ bool is_range = false;
char *key;
char *value;
char buf[MAX_OPTVALUE] = "";
@@ -467,11 +583,26 @@ static void parse_module_params(int argc, const char *argv[])
if (strlen(value) + 1 > MAX_OPTVALUE)
fail("Module parameter value too long: \"%s\"", value);
- /* Ensure integer type value are integers, but don't need the value. */
- if (modparm_is_int(key))
- parse_int(value);
+ if (modparm_is_int(key)) {
+ /* Detect range options. */
+ if (allow_range) {
+ parse_range(&range, value);
+ is_range = !(range.start == range.end
+ || range.start + range.step > range.end);
+ } else {
+ /* Ensure integer type value are integers,
+ * but don't need the value.
+ */
+ if (modparm_is_int(key))
+ parse_int(value);
+ }
+ }
- simple_params_add(argv[0]);
+ /* Store the option. */
+ if (is_range)
+ range_params_add(key, &range);
+ else
+ simple_params_add(argv[0]);
argc--;
argv++;
@@ -973,6 +1104,11 @@ static void modprobe_cmd_add(struct modprobe_cmd *cmd, const char *v)
cmd->cmd[++cmd->count] = NULL;
}
+static void modprobe_cmd_pop(struct modprobe_cmd *cmd)
+{
+ cmd->cmd[--cmd->count] = NULL;
+}
+
/*
* Append parameters that are overridable by users.
*/
@@ -1002,13 +1138,62 @@ static void test_once(int argc, const char *argv[])
{
MODPROBE_CMD_INIT;
- parse_module_params(argc, argv);
+ parse_module_params(argc, argv, false);
modprobe_collect_simple_options(&modprobe_cmd);
runonce(&modprobe_cmd);
}
+/*
+ * Recursively generate modprobe options from the range command.
+ *
+ * This will modify the global params storage and
+ * params_count, and also collect new options into modprobe_cmd.
+ */
+static void test_range_recursive(int range_index, struct modprobe_cmd *cmd)
+{
+ struct range range;
+
+ if (range_index >= range_params_count)
+ return runonce(cmd);
+
+ range = range_params[range_index].range;
+
+ for (int i = range.start; i <= range.end; i += range.step) {
+ struct modprobe_param param;
+
+ param_print_key_value(&param, "%s=%d",
+ range_params[range_index].name, i);
+ modprobe_cmd_add(cmd, param.value);
+
+ test_range_recursive(range_index + 1, cmd);
+
+ modprobe_cmd_pop(cmd);
+
+ if (i + range.step <= range.end) {
+ printf("Cooling down (%ds)..\n", cooldown);
+ if (!dryrun)
+ sleep(cooldown);
+ puts("");
+ }
+ }
+}
+
+/*
+ * Test range. Use recursion on all range commands.
+ */
+static void test_range(int argc, const char *argv[])
+{
+ MODPROBE_CMD_INIT;
+
+ parse_module_params(argc, argv, true);
+
+ modprobe_collect_simple_options(&modprobe_cmd);
+
+ test_range_recursive(0, &modprobe_cmd);
+}
+
/* ============================= Entry Point ============================== */
int bench_sync_rcu(int argc, const char **argv)
@@ -1041,6 +1226,8 @@ int bench_sync_rcu(int argc, const char **argv)
if (strcmp(runmode, "once") == 0)
cmd = test_once;
+ else if (strcmp(runmode, "range") == 0)
+ cmd = test_range;
else
usage_with_options(bench_rcu_usage, bench_rcu_options);
--
2.50.1.565.gc32cd1483b-goog
* [PATCH v1 4/5] perf bench: Add 'ratio' mode to 'sync rcu'
2025-07-31 13:26 [PATCH v1 0/5] perf bench: Add rcu to the 'bench sync' collection Yuzhuo Jing
` (2 preceding siblings ...)
2025-07-31 13:26 ` [PATCH v1 3/5] perf bench: Add 'range' mode to " Yuzhuo Jing
@ 2025-07-31 13:26 ` Yuzhuo Jing
2025-07-31 13:26 ` [PATCH v1 5/5] perf bench: Add documentation for 'sync rcu' suite Yuzhuo Jing
4 siblings, 0 replies; 6+ messages in thread
From: Yuzhuo Jing @ 2025-07-31 13:26 UTC (permalink / raw)
To: Davidlohr Bueso, Paul E . McKenney, Josh Triplett,
Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Boqun Feng,
Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers,
Lai Jiangshan, Zqiang, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang Kan, Yuzhuo Jing, Yuzhuo Jing, Sebastian Andrzej Siewior,
linux-kernel, rcu, linux-perf-users
Add a 'ratio' mode to the RCU benchmark. This mode helps investigate
how the ratio between two selected parameters affects performance.
The command is defined as:
ratio <gp_type> <total> <param1_range> <param1_name> <param2_name>
<total> is the sum of <param1> and <param2>.
<param1_range> specifies the range of param1's values, and thus param2's
values can be calculated as 'total-param1'.
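The relationship between the two parameters can be sketched in
standalone C. This is an illustration only; struct ratio_pair and
ratio_pairs() are hypothetical names, and the patch itself builds the
"name=value" strings directly inside test_ratio() rather than filling
an array.

```c
#include <stddef.h>

/* One experiment: param1 + param2 == total. */
struct ratio_pair {
	int param1;
	int param2;
};

/*
 * Fill 'out' with at most 'max' pairs for param1 in start..end
 * (inclusive, stepping by 'step') and param2 = total - param1.
 * Returns the number of pairs written, or 0 if the range is invalid
 * or extends beyond 'total'.
 */
static size_t ratio_pairs(int total, int start, int end, int step,
			  struct ratio_pair *out, size_t max)
{
	size_t n = 0;

	if (step <= 0 || start < 0 || start > end || end > total)
		return 0;

	for (int v = start; v <= end && n < max; v += step) {
		out[n].param1 = v;
		out[n].param2 = total - v;
		n++;
	}

	return n;
}
```

For total=10 and the range 0:10:3, this yields the pairs (0, 10),
(3, 7), (6, 4) and (9, 1).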
Example usage:
$ ./perf bench sync rcu ratio sync 10 0:10:3 nreaders nwriters
# Running 'sync/rcu' benchmark:
Running experiment with options: nreaders=0 nwriters=10
Experiment finished.
Average grace-period duration: 16494.413 microseconds
Minimum grace-period duration: 7994.842
50th percentile grace-period duration: 17999.439
90th percentile grace-period duration: 18001.923
99th percentile grace-period duration: 23999.068
Maximum grace-period duration: 24000.441
Cooling down (3s)..
Running experiment with options: nreaders=3 nwriters=7
Experiment finished.
Average grace-period duration: 140018.793 microseconds
Minimum grace-period duration: 11987.02
50th percentile grace-period duration: 120999.879
90th percentile grace-period duration: 218000.011
99th percentile grace-period duration: 218006.372
Maximum grace-period duration: 219002.024
Cooling down (3s)..
Running experiment with options: nreaders=6 nwriters=4
Experiment finished.
Average grace-period duration: 210481.539 microseconds
Minimum grace-period duration: 5999.579
50th percentile grace-period duration: 217999.902
90th percentile grace-period duration: 218000.529
99th percentile grace-period duration: 218998.809
Maximum grace-period duration: 219000.652
Cooling down (3s)..
Running experiment with options: nreaders=9 nwriters=1
Experiment finished.
Average grace-period duration: 210782.119 microseconds
Minimum grace-period duration: 89997.154
50th percentile grace-period duration: 217999.829
90th percentile grace-period duration: 218001.299
99th percentile grace-period duration: 219003.072
Maximum grace-period duration: 324116.763
Signed-off-by: Yuzhuo Jing <yuzhuo@google.com>
---
tools/perf/bench/sync-rcu.c | 74 +++++++++++++++++++++++++++++++++++++
1 file changed, 74 insertions(+)
diff --git a/tools/perf/bench/sync-rcu.c b/tools/perf/bench/sync-rcu.c
index 921520a645ae..73142fd5be21 100644
--- a/tools/perf/bench/sync-rcu.c
+++ b/tools/perf/bench/sync-rcu.c
@@ -55,6 +55,7 @@ static const char *const bench_rcu_usage[] = {
"perf bench sync rcu [options..] [-- <command>..]",
"perf bench sync rcu [options..] once <gp_type> [<param=value>..] [-- <command>..]",
"perf bench sync rcu [options..] range <gp_type> [<param=range>..] [-- <command>..]",
+ "perf bench sync rcu [options..] ratio <gp_type> <total> <param1_range> <param1_name> <param2_name> [<param=value>..] [-- <command>..]",
"",
" <gp_type>: The type of grace period to use: sync, async, exp (expedited)",
" This sets the gp_exp or gp_async kernel module parameters.",
@@ -83,12 +84,17 @@ static const char *const bench_rcu_usage[] = {
" If a parameter does not specify end, or start=end, it behaves",
" the same as 'once' mode. The range parameter types are validated",
" against `modinfo rcuscale` to ensure they are integer.",
+ " ratio: Run benchmark that changes the ratio between two parameters.",
+ " <total> specifies the sum of param1 and param2, and <param1_range>",
+ " is the range of param1 values. param2 is calculated by total-param1.",
+ " Additional non-range parameters may also be specified.",
"",
"Examples:",
" perf bench sync rcu --hist once exp nreaders=1 nwriters=1 writer_cpu_offset=1",
" perf bench sync rcu once",
" perf bench sync rcu once sync nreaders=1 nwriters=1 writer_cpu_offset=1",
" perf bench sync rcu range exp nreaders=1:2 nwriters=1 writer_cpu_offset=0:2",
+ " perf bench sync rcu ratio sync 10 0:10:3 nreaders nwriters",
"",
" perf bench sync rcu once sync nreaders=1 nwriters=1 writer_cpu_offset=1 -- \\",
" perf stat -e ipi:ipi_send_cpu,rcu:rcu_grace_period \\",
@@ -1194,6 +1200,72 @@ static void test_range(int argc, const char *argv[])
test_range_recursive(0, &modprobe_cmd);
}
+/*
+ * Test ratio. Use loop on two range options.
+ *
+ * Does not allow ranges for other options.
+ *
+ * Example:
+ * perf bench sync rcu ratio sync 10 1:10:2 nreaders nwriters
+ * will run the following experiments:
+ * nreaders=1, nwriters=9
+ * nreaders=3, nwriters=7
+ * nreaders=5, nwriters=5
+ * ...
+ * nreaders=9, nwriters=1
+ */
+static void test_ratio(int argc, const char *argv[])
+{
+ MODPROBE_CMD_INIT;
+
+ int total;
+ struct range option1_range;
+ const char *option1_name;
+ const char *option2_name;
+
+ if (argc < 4)
+ usage_with_options(bench_rcu_usage, bench_rcu_options);
+
+ total = parse_int(argv[0]);
+ parse_range(&option1_range, argv[1]);
+ option1_name = argv[2];
+ option2_name = argv[3];
+
+ check_param_name(option1_name);
+ check_param_name(option2_name);
+
+ if (total < option1_range.start || total < option1_range.end)
+ fail("Total must be greater than or equal to the range boundary");
+
+ parse_module_params(argc - 4, argv + 4, false);
+
+ modprobe_collect_simple_options(&modprobe_cmd);
+
+ for (int i = option1_range.start; i <= option1_range.end; i += option1_range.step) {
+ int j = total - i;
+
+ struct modprobe_param param1, param2;
+
+ param_print_key_value(&param1, "%s=%d", option1_name, i);
+ param_print_key_value(&param2, "%s=%d", option2_name, j);
+
+ modprobe_cmd_add(&modprobe_cmd, param1.value);
+ modprobe_cmd_add(&modprobe_cmd, param2.value);
+
+ runonce(&modprobe_cmd);
+
+ modprobe_cmd_pop(&modprobe_cmd);
+ modprobe_cmd_pop(&modprobe_cmd);
+
+ if (i + option1_range.step <= option1_range.end) {
+ printf("Cooling down (%ds)..\n", cooldown);
+ if (!dryrun)
+ sleep(cooldown);
+ puts("");
+ }
+ }
+}
+
/* ============================= Entry Point ============================== */
int bench_sync_rcu(int argc, const char **argv)
@@ -1228,6 +1300,8 @@ int bench_sync_rcu(int argc, const char **argv)
cmd = test_once;
else if (strcmp(runmode, "range") == 0)
cmd = test_range;
+ else if (strcmp(runmode, "ratio") == 0)
+ cmd = test_ratio;
else
usage_with_options(bench_rcu_usage, bench_rcu_options);
--
2.50.1.565.gc32cd1483b-goog
* [PATCH v1 5/5] perf bench: Add documentation for 'sync rcu' suite
2025-07-31 13:26 [PATCH v1 0/5] perf bench: Add rcu to the 'bench sync' collection Yuzhuo Jing
` (3 preceding siblings ...)
2025-07-31 13:26 ` [PATCH v1 4/5] perf bench: Add 'ratio' " Yuzhuo Jing
@ 2025-07-31 13:26 ` Yuzhuo Jing
4 siblings, 0 replies; 6+ messages in thread
From: Yuzhuo Jing @ 2025-07-31 13:26 UTC (permalink / raw)
To: Davidlohr Bueso, Paul E . McKenney, Josh Triplett,
Frederic Weisbecker, Neeraj Upadhyay, Joel Fernandes, Boqun Feng,
Uladzislau Rezki, Steven Rostedt, Mathieu Desnoyers,
Lai Jiangshan, Zqiang, Peter Zijlstra, Ingo Molnar,
Arnaldo Carvalho de Melo, Namhyung Kim, Mark Rutland,
Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
Liang Kan, Yuzhuo Jing, Yuzhuo Jing, Sebastian Andrzej Siewior,
linux-kernel, rcu, linux-perf-users
Add documentation for 'perf bench sync rcu'. This benchmark has three
modes or subcommands that take positional arguments. In addition,
*kernel* module parameters are directly specified in the form of
"name=value", without "--name". Multiple subsections are thus added to
the 'sync' section for illustration.
Signed-off-by: Yuzhuo Jing <yuzhuo@google.com>
---
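A minimal sketch of how the documented range syntax, start[:end[:step]],
could be parsed. This is not the parser from this series: struct range,
parse_uint(), parse_range() and range_count() are hypothetical names, and
the defaults (end = start, step = 1) as well as the rejection of
end < start and step == 0 are assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical range type; the struct used by the patch may differ. */
struct range {
	unsigned int start;
	unsigned int end;	/* inclusive */
	unsigned int step;
};

/* Parse one non-negative decimal number and advance *p past it. */
static int parse_uint(const char **p, unsigned int *out)
{
	char *end;
	unsigned long v;

	/* Rejects '-', '+', whitespace and the empty string. */
	if (!isdigit((unsigned char)**p))
		return -1;

	errno = 0;
	v = strtoul(*p, &end, 10);
	if (errno || v > UINT_MAX)
		return -1;

	*out = (unsigned int)v;
	*p = end;
	return 0;
}

/*
 * Parse "start[:end[:step]]".
 * Assumed defaults: end = start, step = 1.
 * Returns 0 on success, -1 on malformed input.
 */
static int parse_range(const char *s, struct range *r)
{
	if (parse_uint(&s, &r->start))
		return -1;
	r->end = r->start;
	r->step = 1;

	if (*s == ':') {
		s++;
		if (parse_uint(&s, &r->end))
			return -1;
		if (*s == ':') {
			s++;
			if (parse_uint(&s, &r->step))
				return -1;
		}
	}

	/* Trailing garbage, a zero step or a reversed range is an error. */
	if (*s != '\0' || r->step == 0 || r->end < r->start)
		return -1;
	return 0;
}

/* Number of values the range yields, both endpoints inclusive. */
static unsigned int range_count(const struct range *r)
{
	assert(r->step > 0 && r->end >= r->start);
	return (r->end - r->start) / r->step + 1;
}
```

Under these assumptions, "writer_cpu_offset=0:1" from the example in the
documentation expands to the two values 0 and 1.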
tools/perf/Documentation/perf-bench.txt | 131 ++++++++++++++++++++++++
1 file changed, 131 insertions(+)
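A minimal sketch of the 'ratio' mode arithmetic described in the
documentation below, where param2 is derived as total - param1 for each
param1 value in the range. struct ratio_pair and ratio_pairs() are
hypothetical names, not code from this series, and stopping once param1
exceeds total is an assumption (the real implementation may reject such
input instead).

```c
#include <stddef.h>

/* Hypothetical pair of module parameter values for one experiment. */
struct ratio_pair {
	unsigned int param1;
	unsigned int param2;
};

/*
 * Walk param1 over [start, end] (inclusive) in increments of step and
 * derive param2 = total - param1.  Writes at most max pairs to out and
 * returns how many were written.  Assumption: iteration stops at the
 * first param1 value above total instead of reporting an error.
 */
static size_t ratio_pairs(unsigned int total, unsigned int start,
			  unsigned int end, unsigned int step,
			  struct ratio_pair *out, size_t max)
{
	size_t n = 0;
	unsigned int i = start;

	if (step == 0)
		return 0;

	while (i <= end && i <= total && n < max) {
		out[n].param1 = i;
		out[n].param2 = total - i;
		n++;

		/* Stop before i + step could overflow or pass end. */
		if (step > end - i)
			break;
		i += step;
	}
	return n;
}
```

For example, total=8 with a param1 range of 2:6:2 yields the pairs
(2,6), (4,4) and (6,2).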
diff --git a/tools/perf/Documentation/perf-bench.txt b/tools/perf/Documentation/perf-bench.txt
index 8331bd28b10e..786c6e6880f5 100644
--- a/tools/perf/Documentation/perf-bench.txt
+++ b/tools/perf/Documentation/perf-bench.txt
@@ -49,6 +49,9 @@ SUBSYSTEM
'sched'::
Scheduler and IPC mechanisms.
+'sync'::
+ Synchronization primitives.
+
'syscall'::
System call performance (throughput).
@@ -162,6 +165,134 @@ Example of *pipe*
---------------------
+SUITES FOR 'sync'
+~~~~~~~~~~~~~~~~~
+*rcu*::
+Suite for RCU performance. Depends on the rcuscale kernel module.
+This benchmark has three modes: once, range and ratio. Usage is as follows.
+
+'perf bench sync rcu' [options..] [-- <command>..]
+'perf bench sync rcu' [options..] once <gp_type> [<param=value>..] [-- <command>..]
+'perf bench sync rcu' [options..] range <gp_type> [<param=range>..] [-- <command>..]
+'perf bench sync rcu' [options..] ratio <gp_type> <total> <param1_range> <param1_name> <param2_name> [<param=value>..] [-- <command>..]
+
+Modes for *rcu*
+^^^^^^^^^^^^^^^
+
+default::
+Run 'once sync', i.e. the 'once' mode with the 'sync' grace-period type.
+
+once::
+Run the benchmark once, with all parameters passed through to the rcuscale
+kernel module.
+
+range::
+Run the benchmark multiple times, with parameters given as ranges. A range is
+written start[:end[:step]]; values are non-negative and both ends are
+inclusive. The benchmark runs every combination of values from all ranges.
+
+ratio::
+Run a benchmark that varies the ratio between two parameters. 'total' is the
+sum of 'param1' and 'param2', and 'param1_range' is the range of 'param1'
+values. 'param2' is calculated as 'total - param1'. Additional non-range
+parameters may also be specified.
+
+Positional arguments for *rcu*
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+gp_type::
+The type of grace period to use: sync, async or exp (expedited).
+
+param::
+Any parameter of the rcuscale kernel module, except for "gp_exp",
+"gp_async" and "block_start", which are managed by this benchmark.
+Valid parameters are listed by "modinfo rcuscale".
+
+command::
+A child command to run during the experiment. Any READER_TASKS, WRITER_TASKS
+or KFREE_TASKS placeholder in the command line is replaced with the PIDs of
+the corresponding tasks, separated by commas.
+
+Options for *rcu*
+^^^^^^^^^^^^^^^^^
+-c::
+--cooldown::
+Sleep time between experiments (default: 3 seconds).
+
+-n::
+--dryrun::
+Dry run mode. Do not run experiments, but instead print the parameter
+combinations that would run.
+
+--child-delay=::
+Time to wait for the child to start before starting the experiment (default: 1 second).
+
+--debugfs=::
+Debugfs mount point used to interact with the rcuscale kernel module (default:
+/sys/kernel/debug).
+
+--hist::
+Show the histogram of writer durations.
+
+Example of *rcu*
+^^^^^^^^^^^^^^^^
+
+---------------------
+% perf bench sync rcu once exp nreaders=1 nwriters=1
+# Running 'sync/rcu' benchmark:
+Running experiment with options: gp_exp=1 nreaders=1 nwriters=1
+Experiment finished.
+Average grace-period duration: 124.236 microseconds
+Minimum grace-period duration: 8.783
+50th percentile grace-period duration: 9.033
+90th percentile grace-period duration: 9.665
+99th percentile grace-period duration: 20.911
+Maximum grace-period duration: 6025.167
+
+% perf bench sync rcu range exp nreaders=1 nwriters=1 writer_cpu_offset=0:1 -- \
+ perf stat -e ipi:ipi_send_cpu,rcu:rcu_grace_period \
+ -t READER_TASKS,WRITER_TASKS
+# Running 'sync/rcu' benchmark:
+Running experiment with options: gp_exp=1 nreaders=1 nwriters=1 writer_cpu_offset=0
+Running child command: perf stat -e ipi:ipi_send_cpu,rcu:rcu_grace_period -t 2061441,2061442
+
+ Performance counter stats for thread id '2061441,2061442':
+
+ 2400 ipi:ipi_send_cpu
+ 100 rcu:rcu_grace_period
+
+ 6.006040148 seconds time elapsed
+
+Experiment finished.
+Waiting for child process to exit.
+Average grace-period duration: 301.177 microseconds
+Minimum grace-period duration: 9.064
+50th percentile grace-period duration: 9.394
+90th percentile grace-period duration: 10.977
+99th percentile grace-period duration: 5926.781
+Maximum grace-period duration: 6011.067
+Cooling down (3s)..
+
+Running experiment with options: gp_exp=1 nreaders=1 nwriters=1 writer_cpu_offset=1
+Running child command: perf stat -e ipi:ipi_send_cpu,rcu:rcu_grace_period -t 2061461,2061462
+
+ Performance counter stats for thread id '2061461,2061462':
+
+ 2144 ipi:ipi_send_cpu
+ 201 rcu:rcu_grace_period
+
+ 6.006110747 seconds time elapsed
+
+Experiment finished.
+Waiting for child process to exit.
+Average grace-period duration: 12.23 microseconds
+Minimum grace-period duration: 9.134
+50th percentile grace-period duration: 9.475
+90th percentile grace-period duration: 11.897
+99th percentile grace-period duration: 38.057
+Maximum grace-period duration: 67.19
+---------------------
+
SUITES FOR 'syscall'
~~~~~~~~~~~~~~~~~~
*basic*::
--
2.50.1.565.gc32cd1483b-goog