* [PATCH v5 0/5] perf: Add new option '--workload-config' to set workload sched_policy/prio/cpumask
@ 2023-09-26 4:29 Changbin Du
From: Changbin Du @ 2023-09-26 4:29 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo
Cc: Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Ian Rogers, Adrian Hunter, linux-perf-users, linux-kernel,
changbin.du, Changbin Du
[I still think this is a handy option, so I have made some improvements and am resending it.]
To get consistent benchmarking results, we sometimes need to set the
sched_policy/priority/cpumask of the workload to reduce system noise.
For example, CPU binding is required on big.LITTLE systems.
$ perf stat -- taskset -c 0 ls
However, the events of 'taskset' itself are also counted here. To get a
more accurate result, this should be avoided.
To cut out the middleman, this adds a new option '--workload-config' that
does the same job for the stat and record commands.
--workload-config <[sched_policy=policy][,sched_prio=priority][,cpu-list=list]>
set up the target workload (the <command>) attributes:
sched_policy: other|fifo|rr|batch|idle
sched_prio: scheduling priority for fifo|rr, nice value for other
cpu-list: CPU affinity. e.g. 1-3:5 is processors #1, #2, #3 and #5
For example,
$ sudo perf stat --workload-config sched_policy=fifo,sched_prio=40,cpu-list=0-3:7 -- ls
The above command makes 'ls' run on CPUs #0-#3 and #7 with the fifo
scheduler at real-time priority 40.
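The option is wired into perf record in the same way; a hypothetical
invocation (the workload binary is only illustrative) would be:
$ sudo perf record --workload-config sched_policy=rr,sched_prio=50,cpu-list=2 -- ./my_benchmark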
v5:
o rename '--workload-attr' to '--workload-config' (sounds better).
o switch to a key-value pair style option.
v4:
- add a test case for perf-stat. (suggested by Ian Rogers)
- fix warning found by 0-DAY.
v3:
- replace taskset with --workload-attr option in documents and tests.
v2:
- Use cpu list spec instead of cpu mask number.
- Update documents.
Changbin Du (5):
perf cpumap: Add __perf_cpu_map__new and perf_cpu_map__2_cpuset
perf: util: support string type option for perf_parse_sublevel_options
perf: add new option '--workload-config' to set workload
sched_policy/prio/cpumask
perf: replace taskset with --workload-config option
perf test: add test case for --workload-config option
tools/lib/perf/cpumap.c | 45 +++++++-
tools/lib/perf/include/perf/cpumap.h | 4 +
tools/lib/perf/libperf.map | 2 +
tools/perf/Documentation/intel-hybrid.txt | 2 +-
tools/perf/Documentation/perf-record.txt | 7 ++
tools/perf/Documentation/perf-stat.txt | 8 +-
tools/perf/builtin-record.c | 27 +++++
tools/perf/builtin-stat.c | 19 +++
tools/perf/tests/cpumap.c | 23 ++++
tools/perf/tests/shell/stat.sh | 19 +++
.../tests/shell/stat_bpf_counters_cgrp.sh | 2 +-
tools/perf/tests/shell/test_arm_coresight.sh | 2 +-
tools/perf/tests/shell/test_data_symbol.sh | 2 +-
tools/perf/tests/shell/test_intel_pt.sh | 2 +-
tools/perf/util/evlist.c | 108 ++++++++++++++++++
tools/perf/util/evlist.h | 3 +
tools/perf/util/parse-sublevel-options.c | 12 +-
tools/perf/util/parse-sublevel-options.h | 7 ++
tools/perf/util/target.h | 9 ++
19 files changed, 291 insertions(+), 12 deletions(-)
--
2.25.1
* [PATCH v5 1/5] perf cpumap: Add __perf_cpu_map__new and perf_cpu_map__2_cpuset
@ 2023-09-26 4:29 ` Changbin Du
From: Changbin Du @ 2023-09-26 4:29 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo
Cc: Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Ian Rogers, Adrian Hunter, linux-perf-users, linux-kernel,
changbin.du, Changbin Du
This adds two new APIs which will be used later.
- __perf_cpu_map__new: accepts a caller-specified separator instead of ','.
- perf_cpu_map__2_cpuset: converts a perf_cpu_map to a cpu_set_t.
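For illustration only (not part of this patch), a minimal sketch of how the
two APIs are expected to be used together; the helper name pin_to_cpu_list()
is made up and error handling is trimmed:

  /* Hypothetical helper, illustrative only. */
  #define _GNU_SOURCE
  #include <sched.h>
  #include <perf/cpumap.h>

  static int pin_to_cpu_list(const char *list)
  {
          /* Parse a ':'-separated CPU list such as "0-3:7". */
          struct perf_cpu_map *cpus = __perf_cpu_map__new(list, ':');
          cpu_set_t *set;
          size_t setsize;
          int ret = -1;

          if (!cpus)
                  return -1;

          /* Convert the map into a cpu_set_t usable with sched_setaffinity(). */
          set = perf_cpu_map__2_cpuset(cpus, &setsize);
          if (set) {
                  ret = sched_setaffinity(0, setsize, set);
                  CPU_FREE(set);  /* the caller owns the returned cpu_set_t */
          }
          perf_cpu_map__put(cpus);
          return ret;
  }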
Signed-off-by: Changbin Du <changbin.du@huawei.com>
---
tools/lib/perf/cpumap.c | 45 ++++++++++++++++++++++++++--
tools/lib/perf/include/perf/cpumap.h | 4 +++
tools/lib/perf/libperf.map | 2 ++
tools/perf/tests/cpumap.c | 23 ++++++++++++++
4 files changed, 71 insertions(+), 3 deletions(-)
diff --git a/tools/lib/perf/cpumap.c b/tools/lib/perf/cpumap.c
index 2a5a29217374..23e907078b28 100644
--- a/tools/lib/perf/cpumap.c
+++ b/tools/lib/perf/cpumap.c
@@ -1,4 +1,5 @@
// SPDX-License-Identifier: GPL-2.0-only
+#define _GNU_SOURCE
#include <perf/cpumap.h>
#include <stdlib.h>
#include <linux/refcount.h>
@@ -7,6 +8,7 @@
#include <stdio.h>
#include <string.h>
#include <unistd.h>
+#include <sched.h>
#include <ctype.h>
#include <limits.h>
@@ -201,7 +203,7 @@ static struct perf_cpu_map *cpu_map__read_all_cpu_map(void)
return cpus;
}
-struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
+struct perf_cpu_map *__perf_cpu_map__new(const char *cpu_list, char sep)
{
struct perf_cpu_map *cpus = NULL;
unsigned long start_cpu, end_cpu = 0;
@@ -225,7 +227,7 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
p = NULL;
start_cpu = strtoul(cpu_list, &p, 0);
if (start_cpu >= INT_MAX
- || (*p != '\0' && *p != ',' && *p != '-'))
+ || (*p != '\0' && *p != sep && *p != '-'))
goto invalid;
if (*p == '-') {
@@ -233,7 +235,7 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
p = NULL;
end_cpu = strtoul(cpu_list, &p, 0);
- if (end_cpu >= INT_MAX || (*p != '\0' && *p != ','))
+ if (end_cpu >= INT_MAX || (*p != '\0' && *p != sep))
goto invalid;
if (end_cpu < start_cpu)
@@ -278,6 +280,11 @@ struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
return cpus;
}
+struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list)
+{
+ return __perf_cpu_map__new(cpu_list, ',');
+}
+
static int __perf_cpu_map__nr(const struct perf_cpu_map *cpus)
{
return RC_CHK_ACCESS(cpus)->nr;
@@ -479,3 +486,35 @@ struct perf_cpu_map *perf_cpu_map__intersect(struct perf_cpu_map *orig,
free(tmp_cpus);
return merged;
}
+
+/* The caller is responsible for freeing returned cpu_set_t with CPU_FREE(). */
+cpu_set_t *perf_cpu_map__2_cpuset(struct perf_cpu_map *cpus, size_t *cpuset_size)
+{
+ cpu_set_t *cpusetp;
+ int max_cpu;
+ struct perf_cpu cpu;
+ int idx;
+
+ if (perf_cpu_map__has_any_cpu(cpus))
+ return NULL;
+
+ max_cpu = perf_cpu_map__max(cpus).cpu;
+ if (max_cpu < 0)
+ return NULL;
+
+ cpusetp = CPU_ALLOC(max_cpu + 1);
+ if (cpusetp == NULL)
+ return NULL;
+
+ *cpuset_size = CPU_ALLOC_SIZE(max_cpu + 1);
+ CPU_ZERO_S(*cpuset_size, cpusetp);
+
+ perf_cpu_map__for_each_cpu(cpu, idx, cpus) {
+ if (cpu.cpu == -1)
+ continue;
+
+ CPU_SET_S(cpu.cpu, *cpuset_size, cpusetp);
+ }
+
+ return cpusetp;
+}
diff --git a/tools/lib/perf/include/perf/cpumap.h b/tools/lib/perf/include/perf/cpumap.h
index e38d859a384d..1a0498f92dbe 100644
--- a/tools/lib/perf/include/perf/cpumap.h
+++ b/tools/lib/perf/include/perf/cpumap.h
@@ -3,6 +3,7 @@
#define __LIBPERF_CPUMAP_H
#include <perf/core.h>
+#include <sched.h>
#include <stdio.h>
#include <stdbool.h>
@@ -23,6 +24,7 @@ struct perf_cpu_map;
*/
LIBPERF_API struct perf_cpu_map *perf_cpu_map__dummy_new(void);
LIBPERF_API struct perf_cpu_map *perf_cpu_map__default_new(void);
+LIBPERF_API struct perf_cpu_map *__perf_cpu_map__new(const char *cpu_list, char sep);
LIBPERF_API struct perf_cpu_map *perf_cpu_map__new(const char *cpu_list);
LIBPERF_API struct perf_cpu_map *perf_cpu_map__read(FILE *file);
LIBPERF_API struct perf_cpu_map *perf_cpu_map__get(struct perf_cpu_map *map);
@@ -46,6 +48,8 @@ LIBPERF_API bool perf_cpu_map__equal(const struct perf_cpu_map *lhs,
*/
LIBPERF_API bool perf_cpu_map__has_any_cpu(const struct perf_cpu_map *map);
+LIBPERF_API cpu_set_t *perf_cpu_map__2_cpuset(struct perf_cpu_map *cpus, size_t *cpuset_size);
+
#define perf_cpu_map__for_each_cpu(cpu, idx, cpus) \
for ((idx) = 0, (cpu) = perf_cpu_map__cpu(cpus, idx); \
(idx) < perf_cpu_map__nr(cpus); \
diff --git a/tools/lib/perf/libperf.map b/tools/lib/perf/libperf.map
index 190b56ae923a..fe0946e34471 100644
--- a/tools/lib/perf/libperf.map
+++ b/tools/lib/perf/libperf.map
@@ -5,6 +5,7 @@ LIBPERF_0.0.1 {
perf_cpu_map__default_new;
perf_cpu_map__get;
perf_cpu_map__put;
+ __perf_cpu_map__new;
perf_cpu_map__new;
perf_cpu_map__read;
perf_cpu_map__nr;
@@ -12,6 +13,7 @@ LIBPERF_0.0.1 {
perf_cpu_map__empty;
perf_cpu_map__max;
perf_cpu_map__has;
+ perf_cpu_map__2_cpuset;
perf_thread_map__new_array;
perf_thread_map__new_dummy;
perf_thread_map__set_pid;
diff --git a/tools/perf/tests/cpumap.c b/tools/perf/tests/cpumap.c
index 7730fc2ab40b..ae5e5337ea4f 100644
--- a/tools/perf/tests/cpumap.c
+++ b/tools/perf/tests/cpumap.c
@@ -1,5 +1,6 @@
// SPDX-License-Identifier: GPL-2.0
#include "tests.h"
+#include <sched.h>
#include <stdio.h>
#include "cpumap.h"
#include "event.h"
@@ -247,12 +248,34 @@ static int test__cpu_map_equal(struct test_suite *test __maybe_unused, int subte
return TEST_OK;
}
+static int test__cpu_map_convert(struct test_suite *test __maybe_unused, int subtest __maybe_unused)
+{
+ struct perf_cpu_map *any = perf_cpu_map__dummy_new();
+ struct perf_cpu_map *cpus = perf_cpu_map__new("1-2");
+ cpu_set_t *cpu_set;
+ size_t setsize;
+
+ cpu_set = perf_cpu_map__2_cpuset(any, &setsize);
+ TEST_ASSERT_VAL("not equal", cpu_set == NULL);
+ CPU_FREE(cpu_set);
+
+ cpu_set = perf_cpu_map__2_cpuset(cpus, &setsize);
+ TEST_ASSERT_VAL("cpus", cpu_set != NULL);
+ TEST_ASSERT_VAL("bad cpuset", !CPU_ISSET_S(0, setsize, cpu_set));
+ TEST_ASSERT_VAL("bad cpuset", CPU_ISSET_S(1, setsize, cpu_set));
+ TEST_ASSERT_VAL("bad cpuset", CPU_ISSET_S(2, setsize, cpu_set));
+ CPU_FREE(cpu_set);
+
+ return TEST_OK;
+}
+
static struct test_case tests__cpu_map[] = {
TEST_CASE("Synthesize cpu map", cpu_map_synthesize),
TEST_CASE("Print cpu map", cpu_map_print),
TEST_CASE("Merge cpu map", cpu_map_merge),
TEST_CASE("Intersect cpu map", cpu_map_intersect),
TEST_CASE("Equal cpu map", cpu_map_equal),
+ TEST_CASE("Convert cpu map", cpu_map_convert),
{ .name = NULL, }
};
--
2.25.1
* [PATCH v5 2/5] perf: util: support string type option for perf_parse_sublevel_options
@ 2023-09-26 4:29 ` Changbin Du
From: Changbin Du @ 2023-09-26 4:29 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo
Cc: Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Ian Rogers, Adrian Hunter, linux-perf-users, linux-kernel,
changbin.du, Changbin Du
Add a string type option so that callers can receive non-integer values.
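As an illustration (not part of this patch; the option names and the helper
are made up), a caller mixes the two field types like so, keeping in mind
that string values are strdup()'d by the parser and must be freed:

  #include <stdlib.h>
  #include "util/parse-sublevel-options.h"

  static int parse_example(const char *str)
  {
          char *policy = NULL;
          int prio = -1;
          struct sublevel_option opts[] = {
                  { .name = "policy", .str_ptr   = &policy }, /* string value  */
                  { .name = "prio",   .value_ptr = &prio   }, /* integer value */
                  { .name = NULL, }
          };
          int ret = perf_parse_sublevel_options(str, opts);

          /* e.g. str == "policy=fifo,prio=40" => policy = "fifo", prio = 40 */
          free(policy);   /* the caller frees strdup()'d strings after use */
          return ret;
  }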
Signed-off-by: Changbin Du <changbin.du@huawei.com>
---
tools/perf/util/parse-sublevel-options.c | 12 +++++++++---
tools/perf/util/parse-sublevel-options.h | 7 +++++++
2 files changed, 16 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/parse-sublevel-options.c b/tools/perf/util/parse-sublevel-options.c
index a841d17ffd57..d08a1ccc9616 100644
--- a/tools/perf/util/parse-sublevel-options.c
+++ b/tools/perf/util/parse-sublevel-options.c
@@ -34,10 +34,16 @@ static int parse_one_sublevel_option(const char *str,
return -1;
}
- if (vstr)
- v = atoi(vstr);
+ if (vstr) {
+ /* The value of an option is either an integer or a string. */
+ if (opt->value_ptr) {
+ v = atoi(vstr);
+ *opt->value_ptr = v;
+ } else {
+ *opt->str_ptr = strdup(vstr);
+ }
+ }
- *opt->value_ptr = v;
free(s);
return 0;
}
diff --git a/tools/perf/util/parse-sublevel-options.h b/tools/perf/util/parse-sublevel-options.h
index 578b18ef03bb..d536ebe43b58 100644
--- a/tools/perf/util/parse-sublevel-options.h
+++ b/tools/perf/util/parse-sublevel-options.h
@@ -3,7 +3,14 @@
struct sublevel_option {
const char *name;
+
+ /*
+ * Only one of the fields below can be non-NULL. We simply support
+ * two types: integer and string. For string, the caller is
+ * responsible for freeing allocated memory after use.
+ */
int *value_ptr;
+ char **str_ptr;
};
int perf_parse_sublevel_options(const char *str, struct sublevel_option *opts);
--
2.25.1
* [PATCH v5 3/5] perf: add new option '--workload-config' to set workload sched_policy/prio/cpumask
@ 2023-09-26 4:29 ` Changbin Du
From: Changbin Du @ 2023-09-26 4:29 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo
Cc: Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Ian Rogers, Adrian Hunter, linux-perf-users, linux-kernel,
changbin.du, Changbin Du, kernel test robot
To get consistent benchmarking results, we sometimes need to set the
sched_policy/priority/cpumask of the workload to reduce system noise.
For example, CPU binding is required on big.LITTLE systems.
$ perf stat -- taskset -c 0 ls
However, the events of 'taskset' itself are also counted here. To get a
more accurate result, this should be avoided.
To cut out the middleman, this adds a new option '--workload-config' that
does the same job for the stat and record commands.
--workload-config <[sched_policy=policy][,sched_prio=priority][,cpu-list=list]>
set up the target workload (the <command>) attributes:
sched_policy: other|fifo|rr|batch|idle
sched_prio: scheduling priority for fifo|rr, nice value for other
cpu-list: CPU affinity. e.g. 1-3:5 is processors #1, #2, #3 and #5
For example,
$ sudo perf stat --workload-config sched_policy=fifo,sched_prio=40,cpu-list=0-3:7 -- ls
The above command makes 'ls' run on CPUs #0-#3 and #7 with the fifo
scheduler at real-time priority 40.
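With the default 'other' policy, sched_prio is interpreted as a nice value
instead, so no real-time privileges are needed; for example:
$ perf stat --workload-config sched_policy=other,sched_prio=10 -- ls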
Cc: kernel test robot <yujie.liu@intel.com>
Signed-off-by: Changbin Du <changbin.du@huawei.com>
---
v2: Use cpu list spec instead of cpu mask number.
v3:
o rename '--workload-attr' to '--workload-config'
o transform to key-value style option
---
tools/perf/Documentation/perf-record.txt | 7 ++
tools/perf/Documentation/perf-stat.txt | 6 ++
tools/perf/builtin-record.c | 27 ++++++
tools/perf/builtin-stat.c | 19 ++++
tools/perf/util/evlist.c | 108 +++++++++++++++++++++++
tools/perf/util/evlist.h | 3 +
tools/perf/util/target.h | 9 ++
7 files changed, 179 insertions(+)
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index d5217be012d7..da4692751e17 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -821,6 +821,13 @@ filtered through the mask provided by -C option.
only, as of now. So the applications built without the frame
pointer might see bogus addresses.
+--workload-config <[sched_policy=policy][,sched_prio=priority][,cpu-list=list]>::
+ setup target workload (the <command>) attributes:
+
+ sched_policy: other|fifo|rr|batch|idle
+ sched_prio: scheduling priority for fifo|rr, nice value for other
+ cpu-list: CPU affinity. e.g. 1-3:5 is processors #1, #2, #3 and #5
+
include::intel-hybrid.txt[]
SEE ALSO
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index 8f789fa1242e..b2038f7e236a 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -262,6 +262,12 @@ disable events during measurements:
wait -n ${perf_pid}
exit $?
+--workload-config <[sched_policy=policy][,sched_prio=priority][,cpu-list=list]>::
+ setup target workload (the <command>) attributes:
+
+ sched_policy: other|fifo|rr|batch|idle
+ sched_prio: scheduling priority for fifo|rr, nice value for other
+ cpu-list: CPU affinity. e.g. 1-3:5 is processors #1, #2, #3 and #5
--pre::
--post::
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 34bb31f08bb5..20799a1e60f6 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -3277,6 +3277,17 @@ static int parse_record_synth_option(const struct option *opt,
return 0;
}
+static int record_parse_workload_attr_opt(const struct option *opt,
+ const char *arg,
+ int unset __maybe_unused)
+{
+ struct record_opts *opts = opt->value;
+
+ return evlist__parse_workload_config(arg, &opts->target.workload.sched_policy,
+ &opts->target.workload.sched_priority,
+ &opts->target.workload.cpu_map);
+}
+
/*
* XXX Ideally would be local to cmd_record() and passed to a record__new
* because we need to have access to it in record__exit, that is called
@@ -3297,6 +3308,8 @@ static struct record record = {
.target = {
.uses_mmap = true,
.default_per_cpu = true,
+ .workload.sched_policy = -1,
+ .workload.sched_priority = 0,
},
.mmap_flush = MMAP_FLUSH_DEFAULT,
.nr_threads_synthesize = 1,
@@ -3321,6 +3334,12 @@ static struct record record = {
const char record_callchain_help[] = CALLCHAIN_RECORD_HELP
"\n\t\t\t\tDefault: fp";
+const char record_workload_config_help[] =
+ "setup target workload (the <command>) attributes:\n\n"
+ HELP_PAD "sched_policy: other|fifo|rr|batch|idle\n"
+ HELP_PAD "sched_prio: scheduling priority for fifo|rr, nice value for other\n"
+ HELP_PAD "cpu-list: CPU affinity. e.g. 1-3:5 is processors #1, #2, #3 and #5";
+
static bool dry_run;
static struct parse_events_option_args parse_events_option_args = {
@@ -3535,6 +3554,10 @@ static struct option __record_options[] = {
"write collected trace data into several data files using parallel threads",
record__parse_threads),
OPT_BOOLEAN(0, "off-cpu", &record.off_cpu, "Enable off-cpu analysis"),
+ OPT_CALLBACK(0, "workload-config", &record.opts,
+ "[sched_policy=policy][,sched_prio=priority][,cpu-list=list]",
+ record_workload_config_help,
+ &record_parse_workload_attr_opt),
OPT_END()
};
@@ -4221,6 +4244,10 @@ int cmd_record(int argc, const char **argv)
record__free_thread_masks(rec, rec->nr_threads);
rec->nr_threads = 0;
evlist__close_control(rec->opts.ctl_fd, rec->opts.ctl_fd_ack, &rec->opts.ctl_fd_close);
+ if (rec->opts.target.workload.cpu_map) {
+ perf_cpu_map__put(rec->opts.target.workload.cpu_map);
+ rec->opts.target.workload.cpu_map = NULL;
+ }
return err;
}
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 07b48f6df48e..a7a3a788e7d9 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -108,6 +108,8 @@ static bool all_counters_use_bpf = true;
static struct target target = {
.uid = UINT_MAX,
+ .workload.sched_policy = -1,
+ .workload.sched_priority = 0,
};
#define METRIC_ONLY_LEN 20
@@ -1160,6 +1162,14 @@ static int parse_cache_level(const struct option *opt,
return 0;
}
+static int parse_workload_attr_opt(const struct option *opt __maybe_unused, const char *arg,
+ int unset __maybe_unused)
+{
+ return evlist__parse_workload_config(arg, &target.workload.sched_policy,
+ &target.workload.sched_priority,
+ &target.workload.cpu_map);
+}
+
static struct option stat_options[] = {
OPT_BOOLEAN('T', "transaction", &transaction_run,
"hardware transaction statistics"),
@@ -1220,6 +1230,10 @@ static struct option stat_options[] = {
OPT_BOOLEAN(0, "append", &append_file, "append to the output file"),
OPT_INTEGER(0, "log-fd", &output_fd,
"log output to fd, instead of stderr"),
+ OPT_CALLBACK(0, "workload-config", &stat_config,
+ "[sched_policy=policy][,sched_prio=priority][,cpu-list=list]",
+ record_workload_config_help,
+ &parse_workload_attr_opt),
OPT_STRING(0, "pre", &pre_cmd, "command",
"command to run prior to the measured command"),
OPT_STRING(0, "post", &post_cmd, "command",
@@ -2893,5 +2907,10 @@ int cmd_stat(int argc, const char **argv)
metricgroup__rblist_exit(&stat_config.metric_events);
evlist__close_control(stat_config.ctl_fd, stat_config.ctl_fd_ack, &stat_config.ctl_fd_close);
+ if (target.workload.cpu_map) {
+ perf_cpu_map__put(target.workload.cpu_map);
+ target.workload.cpu_map = NULL;
+ }
+
return status;
}
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index 7ef43f72098e..7ad7a4fed282 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -33,6 +33,7 @@
#include "util/bpf-filter.h"
#include "util/stat.h"
#include "util/util.h"
+#include "util/parse-sublevel-options.h"
#include <signal.h>
#include <unistd.h>
#include <sched.h>
@@ -46,6 +47,7 @@
#include <sys/mman.h>
#include <sys/prctl.h>
#include <sys/timerfd.h>
+#include <sys/resource.h>
#include <linux/bitops.h>
#include <linux/hash.h>
@@ -1398,6 +1400,109 @@ int evlist__open(struct evlist *evlist)
return err;
}
+int evlist__parse_workload_config(const char *str, int *sched_policy, int *sched_priority,
+ struct perf_cpu_map **cpu_map)
+{
+ char *policy_str = NULL;
+ int priority = -1;
+ char *cpu_list = NULL;
+ int ret;
+ struct sublevel_option workload_conf_opts[] = {
+ { .name = "sched_policy", .str_ptr = &policy_str},
+ { .name = "sched_prio", .value_ptr = &priority},
+ { .name = "cpu-list", .str_ptr = &cpu_list},
+ { .name = NULL, }
+ };
+
+ ret = perf_parse_sublevel_options(str, workload_conf_opts);
+ if (ret)
+ return ret;
+
+ /* sched policy, default to 'other'. */
+ if (!policy_str || !strncmp(policy_str, "other", sizeof("other")))
+ *sched_policy = SCHED_OTHER;
+ else if (!strncmp(policy_str, "fifo", sizeof("fifo")))
+ *sched_policy = SCHED_FIFO;
+ else if (!strncmp(policy_str, "rr", sizeof("rr")))
+ *sched_policy = SCHED_RR;
+ else if (!strncmp(policy_str, "batch", sizeof("batch")))
+ *sched_policy = SCHED_BATCH;
+ else if (!strncmp(policy_str, "idle", sizeof("idle")))
+ *sched_policy = SCHED_IDLE;
+ else {
+ pr_err("workload_attr: unknown sched policy %s\n", policy_str);
+ ret = -EINVAL;
+ goto out;
+ }
+
+ /* check sched priority and set default value */
+ if (*sched_policy == SCHED_FIFO || *sched_policy == SCHED_RR) {
+ if (priority == -1)
+ priority = 99; /* default to lowest priority */
+ else if (priority < 1 || priority > 99) {
+ pr_err("workload_attr: invalid priority %d for fifo and rr, allowed [1,99]\n",
+ priority);
+ ret = -EINVAL;
+ goto out;
+ }
+ } else if (*sched_policy == SCHED_OTHER && priority == -1)
+ priority = 0;
+ *sched_priority = priority;
+
+ /* allowed cpu list */
+ *cpu_map = __perf_cpu_map__new(cpu_list, ':');
+ if (!*cpu_map) {
+ pr_err("workload_attr: failed to get cpus map from %s\n", cpu_list);
+ ret = -EINVAL;
+ }
+
+out:
+ free(policy_str);
+ free(cpu_list);
+ return ret;
+}
+
+static int configurate_workload(struct target *target)
+{
+ struct sched_param param;
+ int policy = target->workload.sched_policy;
+ int priority = target->workload.sched_priority;
+
+ if (policy >= 0) {
+ param.sched_priority = (policy == SCHED_FIFO || policy == SCHED_RR) ?
+ priority : 0;
+ if (sched_setscheduler(0, policy, &param) != 0) {
+ pr_err("failed to set the sched policy %d: %s\n", policy, strerror(errno));
+ return -1;
+ }
+
+ if (policy == SCHED_OTHER) {
+ if (setpriority(PRIO_PROCESS, 0, priority) != 0) {
+ pr_err("failed to set the nice value %d: %s\n", priority, strerror(errno));
+ return -1;
+ }
+ }
+ }
+
+ if (target->workload.cpu_map) {
+ size_t cpuset_size = -1;
+ cpu_set_t *cpu_set;
+
+ cpu_set = perf_cpu_map__2_cpuset(target->workload.cpu_map, &cpuset_size);
+ if (!cpu_set)
+ return -1;
+
+ if (sched_setaffinity(0, cpuset_size, cpu_set) != 0) {
+ pr_err("failed to set the sched affinity: %s\n", strerror(errno));
+ CPU_FREE(cpu_set);
+ return -1;
+ }
+ CPU_FREE(cpu_set);
+ }
+
+ return 0;
+}
+
int evlist__prepare_workload(struct evlist *evlist, struct target *target, const char *argv[],
bool pipe_output, void (*exec_error)(int signo, siginfo_t *info, void *ucontext))
{
@@ -1464,6 +1569,9 @@ int evlist__prepare_workload(struct evlist *evlist, struct target *target, const
exit(ret);
}
+ if (configurate_workload(target) != 0)
+ exit(-1);
+
execvp(argv[0], (char **)argv);
if (exec_error) {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 664c6bf7b3e0..540e17d0d9fe 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -15,6 +15,7 @@
#include <pthread.h>
#include <signal.h>
#include <unistd.h>
+#include <sched.h>
struct pollfd;
struct thread_map;
@@ -180,6 +181,8 @@ void evlist__set_id_pos(struct evlist *evlist);
void evlist__config(struct evlist *evlist, struct record_opts *opts, struct callchain_param *callchain);
int record_opts__config(struct record_opts *opts);
+int evlist__parse_workload_config(const char *str, int *sched_policy, int *sched_priority,
+ struct perf_cpu_map **cpu_set);
int evlist__prepare_workload(struct evlist *evlist, struct target *target,
const char *argv[], bool pipe_output,
void (*exec_error)(int signo, siginfo_t *info, void *ucontext));
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index d582cae8e105..78b7e7ab1c7b 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -4,6 +4,7 @@
#include <stdbool.h>
#include <sys/types.h>
+#include <sched.h>
struct target {
const char *pid;
@@ -19,6 +20,12 @@ struct target {
bool use_bpf;
int initial_delay;
const char *attr_map;
+
+ struct {
+ int sched_policy;
+ int sched_priority;
+ struct perf_cpu_map *cpu_map;
+ } workload;
};
enum target_errno {
@@ -103,4 +110,6 @@ static inline bool target__uses_dummy_map(struct target *target)
return use_dummy;
}
+extern const char record_workload_config_help[];
+
#endif /* _PERF_TARGET_H */
--
2.25.1
* [PATCH v5 4/5] perf: replace taskset with --workload-config option
@ 2023-09-26 4:29 ` Changbin Du
From: Changbin Du @ 2023-09-26 4:29 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo
Cc: Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Ian Rogers, Adrian Hunter, linux-perf-users, linux-kernel,
changbin.du, Changbin Du
Replace taskset with our new --workload-config option.
Signed-off-by: Changbin Du <changbin.du@huawei.com>
---
tools/perf/Documentation/intel-hybrid.txt | 2 +-
tools/perf/Documentation/perf-stat.txt | 2 +-
tools/perf/tests/shell/stat_bpf_counters_cgrp.sh | 2 +-
tools/perf/tests/shell/test_arm_coresight.sh | 2 +-
tools/perf/tests/shell/test_data_symbol.sh | 2 +-
tools/perf/tests/shell/test_intel_pt.sh | 2 +-
6 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/perf/Documentation/intel-hybrid.txt b/tools/perf/Documentation/intel-hybrid.txt
index e7a776ad25d7..f2b9d623a656 100644
--- a/tools/perf/Documentation/intel-hybrid.txt
+++ b/tools/perf/Documentation/intel-hybrid.txt
@@ -132,7 +132,7 @@ displayed. The percentage is the event's running time/enabling time.
One example, 'triad_loop' runs on cpu16 (atom core), while we can see the
scaled value for core cycles is 160,444,092 and the percentage is 0.47%.
-perf stat -e cycles \-- taskset -c 16 ./triad_loop
+perf stat -e cycles --workload-config cpu-list=16 \-- ./triad_loop
As previous, two events are created.
diff --git a/tools/perf/Documentation/perf-stat.txt b/tools/perf/Documentation/perf-stat.txt
index b2038f7e236a..fc22d72560e1 100644
--- a/tools/perf/Documentation/perf-stat.txt
+++ b/tools/perf/Documentation/perf-stat.txt
@@ -464,7 +464,7 @@ on workload with changing phases.
To interpret the results it is usually needed to know on which
CPUs the workload runs on. If needed the CPUs can be forced using
-taskset.
+--workload-config option.
--td-level::
Print the top-down statistics that equal the input level. It allows
diff --git a/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh b/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
index e75d0780dc78..c8737cb32a63 100755
--- a/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
+++ b/tools/perf/tests/shell/stat_bpf_counters_cgrp.sh
@@ -60,7 +60,7 @@ check_system_wide_counted()
check_cpu_list_counted()
{
- check_cpu_list_counted_output=$(perf stat -C 0,1 --bpf-counters --for-each-cgroup ${test_cgroups} -e cpu-clock -x, taskset -c 1 sleep 1 2>&1)
+ check_cpu_list_counted_output=$(perf stat -C 0,1 --bpf-counters --for-each-cgroup ${test_cgroups} -e cpu-clock -x, --workload-config cpu-list=1 -- sleep 1 2>&1)
if echo ${check_cpu_list_counted_output} | grep -q -F "<not "; then
echo "Some CPU events are not counted"
if [ "${verbose}" = "1" ]; then
diff --git a/tools/perf/tests/shell/test_arm_coresight.sh b/tools/perf/tests/shell/test_arm_coresight.sh
index f1bf5621160f..e5be16fa7171 100755
--- a/tools/perf/tests/shell/test_arm_coresight.sh
+++ b/tools/perf/tests/shell/test_arm_coresight.sh
@@ -38,7 +38,7 @@ record_touch_file() {
echo "Recording trace (only user mode) with path: CPU$2 => $1"
rm -f $file
perf record -o ${perfdata} -e cs_etm/@$1/u --per-thread \
- -- taskset -c $2 touch $file > /dev/null 2>&1
+ --workload-config cpu-list=$2 -- touch $file > /dev/null 2>&1
}
perf_script_branch_samples() {
diff --git a/tools/perf/tests/shell/test_data_symbol.sh b/tools/perf/tests/shell/test_data_symbol.sh
index 69bb6fe86c50..d2b58c78a18b 100755
--- a/tools/perf/tests/shell/test_data_symbol.sh
+++ b/tools/perf/tests/shell/test_data_symbol.sh
@@ -50,7 +50,7 @@ echo "Recording workload..."
# specific CPU and test in per-CPU mode.
is_amd=$(grep -E -c 'vendor_id.*AuthenticAMD' /proc/cpuinfo)
if (($is_amd >= 1)); then
- perf mem record -o ${PERF_DATA} -C 0 -- taskset -c 0 $TEST_PROGRAM &
+ perf mem record -o ${PERF_DATA} -C 0 --workload-config cpu-list=0 -- $TEST_PROGRAM &
else
perf mem record --all-user -o ${PERF_DATA} -- $TEST_PROGRAM &
fi
diff --git a/tools/perf/tests/shell/test_intel_pt.sh b/tools/perf/tests/shell/test_intel_pt.sh
index 3a8b9bffa022..46f53aece135 100755
--- a/tools/perf/tests/shell/test_intel_pt.sh
+++ b/tools/perf/tests/shell/test_intel_pt.sh
@@ -110,7 +110,7 @@ test_system_wide_side_band()
can_cpu_wide 1 || return $?
# Record on CPU 0 a task running on CPU 1
- perf_record_no_decode -o "${perfdatafile}" -e intel_pt//u -C 0 -- taskset --cpu-list 1 uname
+ perf_record_no_decode -o "${perfdatafile}" -e intel_pt//u -C 0 --workload-config cpu-list=1 -- uname
# Should get MMAP events from CPU 1 because they can be needed to decode
mmap_cnt=$(perf script -i "${perfdatafile}" --no-itrace --show-mmap-events -C 1 2>/dev/null | grep -c MMAP)
--
2.25.1
* [PATCH v5 5/5] perf test: add test case for --workload-config option
@ 2023-09-26 4:29 ` Changbin Du
From: Changbin Du @ 2023-09-26 4:29 UTC (permalink / raw)
To: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo
Cc: Mark Rutland, Alexander Shishkin, Jiri Olsa, Namhyung Kim,
Ian Rogers, Adrian Hunter, linux-perf-users, linux-kernel,
changbin.du, Changbin Du
Ensure that the sched attributes are applied to the workload.
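The new check runs as part of the existing stat shell test; assuming a built
perf binary, something like the following should exercise it (the exact
filter string is illustrative):
$ ./perf test -v stat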
Signed-off-by: Changbin Du <changbin.du@huawei.com>
---
tools/perf/tests/shell/stat.sh | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/tools/perf/tests/shell/stat.sh b/tools/perf/tests/shell/stat.sh
index 3f1e67795490..7b320a25505a 100755
--- a/tools/perf/tests/shell/stat.sh
+++ b/tools/perf/tests/shell/stat.sh
@@ -16,6 +16,24 @@ test_default_stat() {
echo "Basic stat command test [Success]"
}
+test_stat_workload_config() {
+ echo "stat with --workload-config test"
+ if ! perf stat --workload-config cpu-list=1 -- bash -c 'taskset -pc $$' 2>&1 | grep -E -q "current affinity list: 1"
+ then
+ echo "stat with --workload-config test [Failed]"
+ err=1
+ return
+ fi
+
+ if ! perf stat --workload-config sched_policy=other,sched_prio=10 -- bash -c 'ps -o pid,cls,ni,cmd -p $$' 2>&1 | grep -E -q "TS\s+10"
+ then
+ echo "stat with --workload-config test [Failed]"
+ err=1
+ return
+ fi
+ echo "stat with --workload-config test [Success]"
+}
+
test_stat_record_report() {
echo "stat record and report test"
if ! perf stat record -o - true | perf stat report -i - 2>&1 | \
@@ -147,6 +165,7 @@ test_cputype() {
}
test_default_stat
+test_stat_workload_config
test_stat_record_report
test_stat_record_script
test_stat_repeat_weak_groups
--
2.25.1