* [PATCH 0/2] perf sched timehist: Add --show-prio & --prio option
@ 2024-08-06 1:56 Yang Jihong
2024-08-06 1:57 ` [PATCH 1/2] perf sched timehist: Add --show-prio option Yang Jihong
2024-08-06 1:57 ` [PATCH 2/2] perf sched timehist: Add --prio option Yang Jihong
0 siblings, 2 replies; 6+ messages in thread
From: Yang Jihong @ 2024-08-06 1:56 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, linux-perf-users,
linux-kernel
Cc: yangjihong
This patch set adds --show-prio and --prio to show and filter task priorities.
Sometimes may only be interested in the scheduling of certain tasks
(such as RT tasks). Support for analyzing events of tasks with given priority(ies)
only.
Both options are disabled by default, consistent with the original behavior.
Yang Jihong (2):
perf sched timehist: Add --show-prio option
perf sched timehist: Add --prio option
tools/perf/Documentation/perf-sched.txt | 6 +
tools/perf/builtin-sched.c | 163 +++++++++++++++++++++++-
2 files changed, 162 insertions(+), 7 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH 1/2] perf sched timehist: Add --show-prio option
2024-08-06 1:56 [PATCH 0/2] perf sched timehist: Add --show-prio & --prio option Yang Jihong
@ 2024-08-06 1:57 ` Yang Jihong
2024-08-08 18:42 ` Madadi Vineeth Reddy
2024-08-06 1:57 ` [PATCH 2/2] perf sched timehist: Add --prio option Yang Jihong
1 sibling, 1 reply; 6+ messages in thread
From: Yang Jihong @ 2024-08-06 1:57 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, linux-perf-users,
linux-kernel
Cc: yangjihong
The --show-prio option is used to display the priority of task.
It is disabled by default, which is consistent with original behavior.
The display format is xxx (priority does not change during task running)
or xxx->yyy (priority changes during task running)
Testcase:
# perf sched record nice -n 9 true
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 0.497 MB perf.data ]
# perf sched timehist -h
Usage: perf sched timehist [<options>]
-C, --cpu <cpu> list of cpus to profile
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-g, --call-graph Display call chains if present (default on)
-I, --idle-hist Show idle events only
-i, --input <file> input file name
-k, --vmlinux <file> vmlinux pathname
-M, --migrations Show migration events
-n, --next Show next task
-p, --pid <pid[,pid...]>
analyze events only for given process id(s)
-s, --summary Show only syscall summary with statistics
-S, --with-summary Show all syscalls and summary with statistics
-t, --tid <tid[,tid...]>
analyze events only for given thread id(s)
-V, --cpu-visual Add CPU visual
-v, --verbose be more verbose (show symbol address, etc)
-w, --wakeups Show wakeup events
--kallsyms <file>
kallsyms pathname
--max-stack <n> Maximum number of functions to display backtrace.
--show-prio Show task priority
--state Show task state when sched-out
--symfs <directory>
Look for files with symbols relative to this directory
--time <str> Time span for analysis (start,stop)
# perf sched timehist
Samples of sched_switch event do not have callchains.
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
23952.006537 [0000] perf[534] 0.000 0.000 0.000
23952.006593 [0000] migration/0[19] 0.000 0.014 0.056
23952.006899 [0001] perf[534] 0.000 0.000 0.000
23952.006947 [0001] migration/1[22] 0.000 0.015 0.047
23952.007138 [0002] perf[534] 0.000 0.000 0.000
<SNIP>
# perf sched timehist --show-prio
Samples of sched_switch event do not have callchains.
time cpu task name prio wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ -------- --------- --------- ---------
23952.006537 [0000] perf[534] 120 0.000 0.000 0.000
23952.006593 [0000] migration/0[19] 0 0.000 0.014 0.056
23952.006899 [0001] perf[534] 120 0.000 0.000 0.000
<SNIP>
23952.034843 [0003] nice[535] 120->129 0.189 0.024 23.314
<SNIP>
23952.053838 [0005] rcu_preempt[16] 120 3.993 0.000 0.023
23952.053990 [0005] <idle> 120 0.023 0.023 0.152
23952.054137 [0006] <idle> 120 1.427 1.427 17.855
23952.054278 [0007] <idle> 120 0.506 0.506 1.650
Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
---
tools/perf/Documentation/perf-sched.txt | 3 +
tools/perf/builtin-sched.c | 91 +++++++++++++++++++++++--
2 files changed, 87 insertions(+), 7 deletions(-)
diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
index 84d49f9241b1..3efa5c58418d 100644
--- a/tools/perf/Documentation/perf-sched.txt
+++ b/tools/perf/Documentation/perf-sched.txt
@@ -212,6 +212,9 @@ OPTIONS for 'perf sched timehist'
--state::
Show task state when it switched out.
+--show-prio::
+ Show task priority.
+
OPTIONS for 'perf sched replay'
------------------------------
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 8750b5f2d49b..2af9dbdf4ba2 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -228,6 +228,7 @@ struct perf_sched {
bool show_next;
bool show_migrations;
bool show_state;
+ bool show_prio;
u64 skipped_samples;
const char *time_str;
struct perf_time_interval ptime;
@@ -258,6 +259,8 @@ struct thread_runtime {
bool comm_changed;
u64 migrations;
+
+ int prio;
};
/* per event run time data */
@@ -920,6 +923,11 @@ struct sort_dimension {
struct list_head list;
};
+static inline void init_prio(struct thread_runtime *r)
+{
+ r->prio = -1;
+}
+
/*
* handle runtime stats saved per thread
*/
@@ -932,6 +940,7 @@ static struct thread_runtime *thread__init_runtime(struct thread *thread)
return NULL;
init_stats(&r->run_stats);
+ init_prio(r);
thread__set_priv(thread, r);
return r;
@@ -2036,6 +2045,24 @@ static char *timehist_get_commstr(struct thread *thread)
return str;
}
+/* prio field format: xxx or xxx->yyy */
+#define MAX_PRIO_STR_LEN 8
+static char *timehist_get_priostr(struct evsel *evsel,
+ struct thread *thread,
+ struct perf_sample *sample)
+{
+ static char prio_str[16];
+ int prev_prio = (int)evsel__intval(evsel, sample, "prev_prio");
+ struct thread_runtime *tr = thread__priv(thread);
+
+ if (tr->prio != prev_prio && tr->prio != -1)
+ scnprintf(prio_str, sizeof(prio_str), "%d->%d", tr->prio, prev_prio);
+ else
+ scnprintf(prio_str, sizeof(prio_str), "%d", prev_prio);
+
+ return prio_str;
+}
+
static void timehist_header(struct perf_sched *sched)
{
u32 ncpus = sched->max_cpu.cpu + 1;
@@ -2053,8 +2080,14 @@ static void timehist_header(struct perf_sched *sched)
printf(" ");
}
- printf(" %-*s %9s %9s %9s", comm_width,
- "task name", "wait time", "sch delay", "run time");
+ if (sched->show_prio) {
+ printf(" %-*s %-*s %9s %9s %9s",
+ comm_width, "task name", MAX_PRIO_STR_LEN, "prio",
+ "wait time", "sch delay", "run time");
+ } else {
+ printf(" %-*s %9s %9s %9s", comm_width,
+ "task name", "wait time", "sch delay", "run time");
+ }
if (sched->show_state)
printf(" %s", "state");
@@ -2069,8 +2102,14 @@ static void timehist_header(struct perf_sched *sched)
if (sched->show_cpu_visual)
printf(" %*s ", ncpus, "");
- printf(" %-*s %9s %9s %9s", comm_width,
- "[tid/pid]", "(msec)", "(msec)", "(msec)");
+ if (sched->show_prio) {
+ printf(" %-*s %-*s %9s %9s %9s",
+ comm_width, "[tid/pid]", MAX_PRIO_STR_LEN, "",
+ "(msec)", "(msec)", "(msec)");
+ } else {
+ printf(" %-*s %9s %9s %9s", comm_width,
+ "[tid/pid]", "(msec)", "(msec)", "(msec)");
+ }
if (sched->show_state)
printf(" %5s", "");
@@ -2085,9 +2124,15 @@ static void timehist_header(struct perf_sched *sched)
if (sched->show_cpu_visual)
printf(" %.*s ", ncpus, graph_dotted_line);
- printf(" %.*s %.9s %.9s %.9s", comm_width,
- graph_dotted_line, graph_dotted_line, graph_dotted_line,
- graph_dotted_line);
+ if (sched->show_prio) {
+ printf(" %.*s %.*s %.9s %.9s %.9s",
+ comm_width, graph_dotted_line, MAX_PRIO_STR_LEN, graph_dotted_line,
+ graph_dotted_line, graph_dotted_line, graph_dotted_line);
+ } else {
+ printf(" %.*s %.9s %.9s %.9s", comm_width,
+ graph_dotted_line, graph_dotted_line, graph_dotted_line,
+ graph_dotted_line);
+ }
if (sched->show_state)
printf(" %.5s", graph_dotted_line);
@@ -2134,6 +2179,9 @@ static void timehist_print_sample(struct perf_sched *sched,
printf(" %-*s ", comm_width, timehist_get_commstr(thread));
+ if (sched->show_prio)
+ printf(" %-*s ", MAX_PRIO_STR_LEN, timehist_get_priostr(evsel, thread, sample));
+
wait_time = tr->dt_sleep + tr->dt_iowait + tr->dt_preempt;
print_sched_time(wait_time, 6);
@@ -2301,6 +2349,7 @@ static int init_idle_thread(struct thread *thread)
if (itr == NULL)
return -ENOMEM;
+ init_prio(&itr->tr);
init_stats(&itr->tr.run_stats);
callchain_init(&itr->callchain);
callchain_cursor_reset(&itr->cursor);
@@ -2627,6 +2676,30 @@ static int timehist_migrate_task_event(struct perf_tool *tool,
return 0;
}
+static void timehist_update_task_prio(struct evsel *evsel,
+ struct perf_sample *sample,
+ struct machine *machine)
+{
+ struct thread *thread;
+ struct thread_runtime *tr = NULL;
+ const u32 next_pid = evsel__intval(evsel, sample, "next_pid");
+ const u32 next_prio = evsel__intval(evsel, sample, "next_prio");
+
+ if (next_pid == 0)
+ thread = get_idle_thread(sample->cpu);
+ else
+ thread = machine__findnew_thread(machine, -1, next_pid);
+
+ if (thread == NULL)
+ return;
+
+ tr = thread__get_runtime(thread);
+ if (tr == NULL)
+ return;
+
+ tr->prio = next_prio;
+}
+
static int timehist_sched_change_event(struct perf_tool *tool,
union perf_event *event,
struct evsel *evsel,
@@ -2650,6 +2723,9 @@ static int timehist_sched_change_event(struct perf_tool *tool,
goto out;
}
+ if (sched->show_prio)
+ timehist_update_task_prio(evsel, sample, machine);
+
thread = timehist_get_thread(sched, sample, machine, evsel);
if (thread == NULL) {
rc = -1;
@@ -3691,6 +3767,7 @@ int cmd_sched(int argc, const char **argv)
OPT_STRING('t', "tid", &symbol_conf.tid_list_str, "tid[,tid...]",
"analyze events only for given thread id(s)"),
OPT_STRING('C', "cpu", &cpu_list, "cpu", "list of cpus to profile"),
+ OPT_BOOLEAN(0, "show-prio", &sched.show_prio, "Show task priority"),
OPT_PARENT(sched_options)
};
--
2.25.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH 2/2] perf sched timehist: Add --prio option
2024-08-06 1:56 [PATCH 0/2] perf sched timehist: Add --show-prio & --prio option Yang Jihong
2024-08-06 1:57 ` [PATCH 1/2] perf sched timehist: Add --show-prio option Yang Jihong
@ 2024-08-06 1:57 ` Yang Jihong
2024-08-16 17:27 ` Namhyung Kim
1 sibling, 1 reply; 6+ messages in thread
From: Yang Jihong @ 2024-08-06 1:57 UTC (permalink / raw)
To: peterz, mingo, acme, namhyung, mark.rutland, alexander.shishkin,
jolsa, irogers, adrian.hunter, kan.liang, linux-perf-users,
linux-kernel
Cc: yangjihong
The --prio option is used to only show events for the given task priority(ies).
The default is to show events for all priority tasks, which is consistent with
the previous behavior.
Testcase:
# perf sched record nice -n 9 perf bench sched messaging -l 10000
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 3.435 [sec]
[ perf record: Woken up 270 times to write data ]
[ perf record: Captured and wrote 618.688 MB perf.data (5729036 samples) ]
# perf sched timehist -h
Usage: perf sched timehist [<options>]
-C, --cpu <cpu> list of cpus to profile
-D, --dump-raw-trace dump raw trace in ASCII
-f, --force don't complain, do it
-g, --call-graph Display call chains if present (default on)
-I, --idle-hist Show idle events only
-i, --input <file> input file name
-k, --vmlinux <file> vmlinux pathname
-M, --migrations Show migration events
-n, --next Show next task
-p, --pid <pid[,pid...]>
analyze events only for given process id(s)
-s, --summary Show only syscall summary with statistics
-S, --with-summary Show all syscalls and summary with statistics
-t, --tid <tid[,tid...]>
analyze events only for given thread id(s)
-V, --cpu-visual Add CPU visual
-v, --verbose be more verbose (show symbol address, etc)
-w, --wakeups Show wakeup events
--kallsyms <file>
kallsyms pathname
--max-stack <n> Maximum number of functions to display backtrace.
--prio <prio> analyze events only for given task priority(ies)
--show-prio Show task priority
--state Show task state when sched-out
--symfs <directory>
Look for files with symbols relative to this directory
--time <str> Time span for analysis (start,stop)
# perf sched timehist --prio 140
Samples of sched_switch event do not have callchains.
Invalid prio string
# perf sched timehist --show-prio --prio 129
Samples of sched_switch event do not have callchains.
time cpu task name prio wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ -------- --------- --------- ---------
2090450.765421 [0002] sched-messaging[1229618] 129 0.000 0.000 0.029
2090450.765445 [0007] sched-messaging[1229616] 129 0.000 0.062 0.043
2090450.765448 [0014] sched-messaging[1229619] 129 0.000 0.000 0.032
2090450.765478 [0013] sched-messaging[1229617] 129 0.000 0.065 0.048
2090450.765503 [0014] sched-messaging[1229622] 129 0.000 0.000 0.017
2090450.765550 [0002] sched-messaging[1229624] 129 0.000 0.000 0.021
2090450.765562 [0007] sched-messaging[1229621] 129 0.000 0.071 0.028
2090450.765570 [0005] sched-messaging[1229620] 129 0.000 0.064 0.066
2090450.765583 [0001] sched-messaging[1229625] 129 0.000 0.001 0.031
2090450.765595 [0013] sched-messaging[1229623] 129 0.000 0.060 0.028
2090450.765637 [0014] sched-messaging[1229628] 129 0.000 0.000 0.019
2090450.765665 [0007] sched-messaging[1229627] 129 0.000 0.038 0.030
<SNIP>
# perf sched timehist --show-prio --prio 0,120-129
Samples of sched_switch event do not have callchains.
time cpu task name prio wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ -------- --------- --------- ---------
2090450.763231 [0000] perf[1229608] 120 0.000 0.000 0.000
2090450.763235 [0000] migration/0[15] 0 0.000 0.001 0.003
2090450.763263 [0001] perf[1229608] 120 0.000 0.000 0.000
2090450.763268 [0001] migration/1[21] 0 0.000 0.001 0.004
2090450.763302 [0002] perf[1229608] 120 0.000 0.000 0.000
2090450.763309 [0002] migration/2[27] 0 0.000 0.001 0.007
2090450.763338 [0003] perf[1229608] 120 0.000 0.000 0.000
2090450.763343 [0003] migration/3[33] 0 0.000 0.001 0.004
2090450.763459 [0004] perf[1229608] 120 0.000 0.000 0.000
2090450.763469 [0004] migration/4[39] 0 0.000 0.002 0.010
2090450.763496 [0005] perf[1229608] 120 0.000 0.000 0.000
2090450.763501 [0005] migration/5[45] 0 0.000 0.001 0.004
2090450.763613 [0006] perf[1229608] 120 0.000 0.000 0.000
2090450.763622 [0006] migration/6[51] 0 0.000 0.001 0.008
2090450.763652 [0007] perf[1229608] 120 0.000 0.000 0.000
2090450.763660 [0007] migration/7[57] 0 0.000 0.001 0.008
<SNIP>
2090450.765665 [0001] <idle> 120 0.031 0.031 0.081
2090450.765665 [0007] sched-messaging[1229627] 129 0.000 0.038 0.030
2090450.765667 [0000] s1-perf[8235/7168] 120 0.008 0.000 0.004
2090450.765684 [0013] <idle> 120 0.028 0.028 0.088
2090450.765685 [0001] sched-messaging[1229630] 129 0.000 0.001 0.020
2090450.765688 [0000] <idle> 120 0.004 0.004 0.020
2090450.765689 [0002] <idle> 120 0.021 0.021 0.138
2090450.765691 [0005] sched-messaging[1229626] 129 0.000 0.085 0.029
Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
---
tools/perf/Documentation/perf-sched.txt | 3 +
tools/perf/builtin-sched.c | 74 ++++++++++++++++++++++++-
2 files changed, 76 insertions(+), 1 deletion(-)
diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
index 3efa5c58418d..70a9016d5a06 100644
--- a/tools/perf/Documentation/perf-sched.txt
+++ b/tools/perf/Documentation/perf-sched.txt
@@ -215,6 +215,9 @@ OPTIONS for 'perf sched timehist'
--show-prio::
Show task priority.
+--prio::
+ Only show events for given task priority(ies).
+
OPTIONS for 'perf sched replay'
------------------------------
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
index 2af9dbdf4ba2..3981994d9921 100644
--- a/tools/perf/builtin-sched.c
+++ b/tools/perf/builtin-sched.c
@@ -51,6 +51,7 @@
#define COMM_LEN 20
#define SYM_LEN 129
#define MAX_PID 1024000
+#define MAX_PRIO 140
static const char *cpu_list;
static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
@@ -234,6 +235,8 @@ struct perf_sched {
struct perf_time_interval ptime;
struct perf_time_interval hist_time;
volatile bool thread_funcs_exit;
+ const char *prio_str;
+ DECLARE_BITMAP(prio_bitmap, MAX_PRIO);
};
/* per thread run time data */
@@ -2504,12 +2507,33 @@ static bool timehist_skip_sample(struct perf_sched *sched,
struct perf_sample *sample)
{
bool rc = false;
+ int prio = -1;
+ struct thread_runtime *tr = NULL;
if (thread__is_filtered(thread)) {
rc = true;
sched->skipped_samples++;
}
+ if (sched->prio_str) {
+ /*
+ * Because priority may be changed during task execution,
+ * first read priority from prev sched_in event for current task.
+ * If prev sched_in event is not saved, then read priority from
+ * current task sched_out event.
+ */
+ tr = thread__get_runtime(thread);
+ if (tr && tr->prio != -1)
+ prio = tr->prio;
+ else if (evsel__name_is(evsel, "sched:sched_switch"))
+ prio = evsel__intval(evsel, sample, "prev_prio");
+
+ if (prio != -1 && !test_bit(prio, sched->prio_bitmap)) {
+ rc = true;
+ sched->skipped_samples++;
+ }
+ }
+
if (sched->idle_hist) {
if (!evsel__name_is(evsel, "sched:sched_switch"))
rc = true;
@@ -2723,7 +2747,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
goto out;
}
- if (sched->show_prio)
+ if (sched->show_prio || sched->prio_str)
timehist_update_task_prio(evsel, sample, machine);
thread = timehist_get_thread(sched, sample, machine, evsel);
@@ -3142,6 +3166,47 @@ static int timehist_check_attr(struct perf_sched *sched,
return 0;
}
+static int timehist_parse_prio_str(struct perf_sched *sched)
+{
+ char *p;
+ unsigned long start_prio, end_prio;
+ const char *str = sched->prio_str;
+
+ if (!str)
+ return 0;
+
+ while (isdigit(*str)) {
+ p = NULL;
+ start_prio = strtoul(str, &p, 0);
+ if (start_prio >= MAX_PRIO || (*p != '\0' && *p != ',' && *p != '-'))
+ return -1;
+
+ if (*p == '-') {
+ str = ++p;
+ p = NULL;
+ end_prio = strtoul(str, &p, 0);
+
+ if (end_prio >= MAX_PRIO || (*p != '\0' && *p != ','))
+ return -1;
+
+ if (end_prio < start_prio)
+ return -1;
+ } else {
+ end_prio = start_prio;
+ }
+
+ for (; start_prio <= end_prio; start_prio++)
+ __set_bit(start_prio, sched->prio_bitmap);
+
+ if (*p)
+ ++p;
+
+ str = p;
+ }
+
+ return 0;
+}
+
static int perf_sched__timehist(struct perf_sched *sched)
{
struct evsel_str_handler handlers[] = {
@@ -3203,6 +3268,11 @@ static int perf_sched__timehist(struct perf_sched *sched)
if (timehist_check_attr(sched, evlist) != 0)
goto out;
+ if (timehist_parse_prio_str(sched) != 0) {
+ pr_err("Invalid prio string\n");
+ goto out;
+ }
+
setup_pager();
/* prefer sched_waking if it is captured */
@@ -3768,6 +3838,8 @@ int cmd_sched(int argc, const char **argv)
"analyze events only for given thread id(s)"),
OPT_STRING('C', "cpu", &cpu_list, "cpu", "list of cpus to profile"),
OPT_BOOLEAN(0, "show-prio", &sched.show_prio, "Show task priority"),
+ OPT_STRING(0, "prio", &sched.prio_str, "prio",
+ "analyze events only for given task priority(ies)"),
OPT_PARENT(sched_options)
};
--
2.25.1
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH 1/2] perf sched timehist: Add --show-prio option
2024-08-06 1:57 ` [PATCH 1/2] perf sched timehist: Add --show-prio option Yang Jihong
@ 2024-08-08 18:42 ` Madadi Vineeth Reddy
2024-08-09 1:59 ` [External] " Yang Jihong
0 siblings, 1 reply; 6+ messages in thread
From: Madadi Vineeth Reddy @ 2024-08-08 18:42 UTC (permalink / raw)
To: Yang Jihong
Cc: Peter Zijlstra, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
linux-perf-users, linux-kernel, Madadi Vineeth Reddy
Hi Yang Jihong,
On 06/08/24 07:27, Yang Jihong wrote:
> The --show-prio option is used to display the priority of task.
> It is disabled by default, which is consistent with original behavior.
>
> The display format is xxx (priority does not change during task running)
> or xxx->yyy (priority changes during task running)
>
I had done the following steps.
1) Ran 'perf sched record stress-ng --cpu 32 -l 50 --timeout 40s'
2) Changed the nice value of stress-ng while the record was going on.
ps -C stress-ng -o pid=
64195
renice -n 10 -p 64195
3) Then ran 'perf sched timehist --show-prio'
343918.720052 [0015] stress-ng[64195] 120 0.013 0.013 21.719
.
.
343958.797257 [0009] stress-ng[64195] 130 40076.812 0.007 0.393
The output shows the final priority change of stress-ng from 120 to 130,
but the transition format xxx->yyy is not displayed.
Am I missing something?
Thanks and Regards
Madadi Vineeth Reddy
> Testcase:
>
> # perf sched record nice -n 9 true
> [ perf record: Woken up 0 times to write data ]
> [ perf record: Captured and wrote 0.497 MB perf.data ]
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [External] Re: [PATCH 1/2] perf sched timehist: Add --show-prio option
2024-08-08 18:42 ` Madadi Vineeth Reddy
@ 2024-08-09 1:59 ` Yang Jihong
0 siblings, 0 replies; 6+ messages in thread
From: Yang Jihong @ 2024-08-09 1:59 UTC (permalink / raw)
To: vineethr
Cc: Peter Zijlstra, mingo, acme, namhyung, mark.rutland,
alexander.shishkin, jolsa, irogers, adrian.hunter, kan.liang,
linux-perf-users, linux-kernel, Madadi Vineeth Reddy
Hello,
On 8/9/24 02:42, Madadi Vineeth Reddy wrote:
> Hi Yang Jihong,
>
> On 06/08/24 07:27, Yang Jihong wrote:
>> The --show-prio option is used to display the priority of task.
>> It is disabled by default, which is consistent with original behavior.
>>
>> The display format is xxx (priority does not change during task running)
>> or xxx->yyy (priority changes during task running)
>>
>
> I had done the following steps.
> 1) Ran 'perf sched record stress-ng --cpu 32 -l 50 --timeout 40s'
> 2) Changed the nice value of stress-ng while the record was going on.
> ps -C stress-ng -o pid=
> 64195
>
> renice -n 10 -p 64195
>
> 3) Then ran 'perf sched timehist --show-prio'
>
> 343918.720052 [0015] stress-ng[64195] 120 0.013 0.013 21.719
> .
> .
> 343958.797257 [0009] stress-ng[64195] 130 40076.812 0.007 0.393
>
> The output shows the final priority change of stress-ng from 120 to 130,
> but the transition format xxx->yyy is not displayed.
>
> Am I missing something?
>
1. I think it may be because when change the priority of stress-ng, the
stress-ng task had been sched_out and did not run.
2.
sched_switch: A -> B(prio1)
sched_switch: B(prio_2) -> C
The specific details are that:
first process the first sched_switch event, read the priority of task B,
as prio1.
then process the second sched_switch event, read the priority of task B,
as prio2.
Only when change the priority of task B, it is in the running state,
will show prio1->prio2
Thanks,
Yang.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH 2/2] perf sched timehist: Add --prio option
2024-08-06 1:57 ` [PATCH 2/2] perf sched timehist: Add --prio option Yang Jihong
@ 2024-08-16 17:27 ` Namhyung Kim
0 siblings, 0 replies; 6+ messages in thread
From: Namhyung Kim @ 2024-08-16 17:27 UTC (permalink / raw)
To: Yang Jihong
Cc: peterz, mingo, acme, mark.rutland, alexander.shishkin, jolsa,
irogers, adrian.hunter, kan.liang, linux-perf-users, linux-kernel
On Tue, Aug 06, 2024 at 09:57:01AM +0800, Yang Jihong wrote:
> The --prio option is used to only show events for the given task priority(ies).
> The default is to show events for all priority tasks, which is consistent with
> the previous behavior.
>
> Testcase:
> # perf sched record nice -n 9 perf bench sched messaging -l 10000
> # Running 'sched/messaging' benchmark:
> # 20 sender and receiver processes per group
> # 10 groups == 400 processes run
>
> Total time: 3.435 [sec]
> [ perf record: Woken up 270 times to write data ]
> [ perf record: Captured and wrote 618.688 MB perf.data (5729036 samples) ]
>
> # perf sched timehist -h
>
> Usage: perf sched timehist [<options>]
>
> -C, --cpu <cpu> list of cpus to profile
> -D, --dump-raw-trace dump raw trace in ASCII
> -f, --force don't complain, do it
> -g, --call-graph Display call chains if present (default on)
> -I, --idle-hist Show idle events only
> -i, --input <file> input file name
> -k, --vmlinux <file> vmlinux pathname
> -M, --migrations Show migration events
> -n, --next Show next task
> -p, --pid <pid[,pid...]>
> analyze events only for given process id(s)
> -s, --summary Show only syscall summary with statistics
> -S, --with-summary Show all syscalls and summary with statistics
> -t, --tid <tid[,tid...]>
> analyze events only for given thread id(s)
> -V, --cpu-visual Add CPU visual
> -v, --verbose be more verbose (show symbol address, etc)
> -w, --wakeups Show wakeup events
> --kallsyms <file>
> kallsyms pathname
> --max-stack <n> Maximum number of functions to display backtrace.
> --prio <prio> analyze events only for given task priority(ies)
> --show-prio Show task priority
> --state Show task state when sched-out
> --symfs <directory>
> Look for files with symbols relative to this directory
> --time <str> Time span for analysis (start,stop)
>
> # perf sched timehist --prio 140
> Samples of sched_switch event do not have callchains.
> Invalid prio string
>
> # perf sched timehist --show-prio --prio 129
> Samples of sched_switch event do not have callchains.
> time cpu task name prio wait time sch delay run time
> [tid/pid] (msec) (msec) (msec)
> --------------- ------ ------------------------------ -------- --------- --------- ---------
> 2090450.765421 [0002] sched-messaging[1229618] 129 0.000 0.000 0.029
> 2090450.765445 [0007] sched-messaging[1229616] 129 0.000 0.062 0.043
> 2090450.765448 [0014] sched-messaging[1229619] 129 0.000 0.000 0.032
> 2090450.765478 [0013] sched-messaging[1229617] 129 0.000 0.065 0.048
> 2090450.765503 [0014] sched-messaging[1229622] 129 0.000 0.000 0.017
> 2090450.765550 [0002] sched-messaging[1229624] 129 0.000 0.000 0.021
> 2090450.765562 [0007] sched-messaging[1229621] 129 0.000 0.071 0.028
> 2090450.765570 [0005] sched-messaging[1229620] 129 0.000 0.064 0.066
> 2090450.765583 [0001] sched-messaging[1229625] 129 0.000 0.001 0.031
> 2090450.765595 [0013] sched-messaging[1229623] 129 0.000 0.060 0.028
> 2090450.765637 [0014] sched-messaging[1229628] 129 0.000 0.000 0.019
> 2090450.765665 [0007] sched-messaging[1229627] 129 0.000 0.038 0.030
> <SNIP>
>
> # perf sched timehist --show-prio --prio 0,120-129
> Samples of sched_switch event do not have callchains.
> time cpu task name prio wait time sch delay run time
> [tid/pid] (msec) (msec) (msec)
> --------------- ------ ------------------------------ -------- --------- --------- ---------
> 2090450.763231 [0000] perf[1229608] 120 0.000 0.000 0.000
> 2090450.763235 [0000] migration/0[15] 0 0.000 0.001 0.003
> 2090450.763263 [0001] perf[1229608] 120 0.000 0.000 0.000
> 2090450.763268 [0001] migration/1[21] 0 0.000 0.001 0.004
> 2090450.763302 [0002] perf[1229608] 120 0.000 0.000 0.000
> 2090450.763309 [0002] migration/2[27] 0 0.000 0.001 0.007
> 2090450.763338 [0003] perf[1229608] 120 0.000 0.000 0.000
> 2090450.763343 [0003] migration/3[33] 0 0.000 0.001 0.004
> 2090450.763459 [0004] perf[1229608] 120 0.000 0.000 0.000
> 2090450.763469 [0004] migration/4[39] 0 0.000 0.002 0.010
> 2090450.763496 [0005] perf[1229608] 120 0.000 0.000 0.000
> 2090450.763501 [0005] migration/5[45] 0 0.000 0.001 0.004
> 2090450.763613 [0006] perf[1229608] 120 0.000 0.000 0.000
> 2090450.763622 [0006] migration/6[51] 0 0.000 0.001 0.008
> 2090450.763652 [0007] perf[1229608] 120 0.000 0.000 0.000
> 2090450.763660 [0007] migration/7[57] 0 0.000 0.001 0.008
> <SNIP>
> 2090450.765665 [0001] <idle> 120 0.031 0.031 0.081
> 2090450.765665 [0007] sched-messaging[1229627] 129 0.000 0.038 0.030
> 2090450.765667 [0000] s1-perf[8235/7168] 120 0.008 0.000 0.004
> 2090450.765684 [0013] <idle> 120 0.028 0.028 0.088
> 2090450.765685 [0001] sched-messaging[1229630] 129 0.000 0.001 0.020
> 2090450.765688 [0000] <idle> 120 0.004 0.004 0.020
> 2090450.765689 [0002] <idle> 120 0.021 0.021 0.138
> 2090450.765691 [0005] sched-messaging[1229626] 129 0.000 0.085 0.029
>
> Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
> ---
> tools/perf/Documentation/perf-sched.txt | 3 +
> tools/perf/builtin-sched.c | 74 ++++++++++++++++++++++++-
> 2 files changed, 76 insertions(+), 1 deletion(-)
>
> diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
> index 3efa5c58418d..70a9016d5a06 100644
> --- a/tools/perf/Documentation/perf-sched.txt
> +++ b/tools/perf/Documentation/perf-sched.txt
> @@ -215,6 +215,9 @@ OPTIONS for 'perf sched timehist'
> --show-prio::
> Show task priority.
>
> +--prio::
> + Only show events for given task priority(ies).
Please describe how it takes multiple priorities. Like 0,120-129.
Thanks,
Namhyung
> +
> OPTIONS for 'perf sched replay'
> ------------------------------
>
> diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c
> index 2af9dbdf4ba2..3981994d9921 100644
> --- a/tools/perf/builtin-sched.c
> +++ b/tools/perf/builtin-sched.c
> @@ -51,6 +51,7 @@
> #define COMM_LEN 20
> #define SYM_LEN 129
> #define MAX_PID 1024000
> +#define MAX_PRIO 140
>
> static const char *cpu_list;
> static DECLARE_BITMAP(cpu_bitmap, MAX_NR_CPUS);
> @@ -234,6 +235,8 @@ struct perf_sched {
> struct perf_time_interval ptime;
> struct perf_time_interval hist_time;
> volatile bool thread_funcs_exit;
> + const char *prio_str;
> + DECLARE_BITMAP(prio_bitmap, MAX_PRIO);
> };
>
> /* per thread run time data */
> @@ -2504,12 +2507,33 @@ static bool timehist_skip_sample(struct perf_sched *sched,
> struct perf_sample *sample)
> {
> bool rc = false;
> + int prio = -1;
> + struct thread_runtime *tr = NULL;
>
> if (thread__is_filtered(thread)) {
> rc = true;
> sched->skipped_samples++;
> }
>
> + if (sched->prio_str) {
> + /*
> + * Because priority may be changed during task execution,
> + * first read priority from prev sched_in event for current task.
> + * If prev sched_in event is not saved, then read priority from
> + * current task sched_out event.
> + */
> + tr = thread__get_runtime(thread);
> + if (tr && tr->prio != -1)
> + prio = tr->prio;
> + else if (evsel__name_is(evsel, "sched:sched_switch"))
> + prio = evsel__intval(evsel, sample, "prev_prio");
> +
> + if (prio != -1 && !test_bit(prio, sched->prio_bitmap)) {
> + rc = true;
> + sched->skipped_samples++;
> + }
> + }
> +
> if (sched->idle_hist) {
> if (!evsel__name_is(evsel, "sched:sched_switch"))
> rc = true;
> @@ -2723,7 +2747,7 @@ static int timehist_sched_change_event(struct perf_tool *tool,
> goto out;
> }
>
> - if (sched->show_prio)
> + if (sched->show_prio || sched->prio_str)
> timehist_update_task_prio(evsel, sample, machine);
>
> thread = timehist_get_thread(sched, sample, machine, evsel);
> @@ -3142,6 +3166,47 @@ static int timehist_check_attr(struct perf_sched *sched,
> return 0;
> }
>
> +static int timehist_parse_prio_str(struct perf_sched *sched)
> +{
> + char *p;
> + unsigned long start_prio, end_prio;
> + const char *str = sched->prio_str;
> +
> + if (!str)
> + return 0;
> +
> + while (isdigit(*str)) {
> + p = NULL;
> + start_prio = strtoul(str, &p, 0);
> + if (start_prio >= MAX_PRIO || (*p != '\0' && *p != ',' && *p != '-'))
> + return -1;
> +
> + if (*p == '-') {
> + str = ++p;
> + p = NULL;
> + end_prio = strtoul(str, &p, 0);
> +
> + if (end_prio >= MAX_PRIO || (*p != '\0' && *p != ','))
> + return -1;
> +
> + if (end_prio < start_prio)
> + return -1;
> + } else {
> + end_prio = start_prio;
> + }
> +
> + for (; start_prio <= end_prio; start_prio++)
> + __set_bit(start_prio, sched->prio_bitmap);
> +
> + if (*p)
> + ++p;
> +
> + str = p;
> + }
> +
> + return 0;
> +}
> +
> static int perf_sched__timehist(struct perf_sched *sched)
> {
> struct evsel_str_handler handlers[] = {
> @@ -3203,6 +3268,11 @@ static int perf_sched__timehist(struct perf_sched *sched)
> if (timehist_check_attr(sched, evlist) != 0)
> goto out;
>
> + if (timehist_parse_prio_str(sched) != 0) {
> + pr_err("Invalid prio string\n");
> + goto out;
> + }
> +
> setup_pager();
>
> /* prefer sched_waking if it is captured */
> @@ -3768,6 +3838,8 @@ int cmd_sched(int argc, const char **argv)
> "analyze events only for given thread id(s)"),
> OPT_STRING('C', "cpu", &cpu_list, "cpu", "list of cpus to profile"),
> OPT_BOOLEAN(0, "show-prio", &sched.show_prio, "Show task priority"),
> + OPT_STRING(0, "prio", &sched.prio_str, "prio",
> + "analyze events only for given task priority(ies)"),
> OPT_PARENT(sched_options)
> };
>
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-08-16 17:27 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-06 1:56 [PATCH 0/2] perf sched timehist: Add --show-prio & --prio option Yang Jihong
2024-08-06 1:57 ` [PATCH 1/2] perf sched timehist: Add --show-prio option Yang Jihong
2024-08-08 18:42 ` Madadi Vineeth Reddy
2024-08-09 1:59 ` [External] " Yang Jihong
2024-08-06 1:57 ` [PATCH 2/2] perf sched timehist: Add --prio option Yang Jihong
2024-08-16 17:27 ` Namhyung Kim
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).