* [PATCH 1/3] perf sort: Keep output fields in the same level @ 2025-03-07 8:08 Namhyung Kim 2025-03-07 8:08 ` [PATCH 2/3] perf report: Allow hierarchy mode for --children Namhyung Kim ` (3 more replies) 0 siblings, 4 replies; 37+ messages in thread From: Namhyung Kim @ 2025-03-07 8:08 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users, Dmitry Vyukov This is useful for hierarchy output mode where the first level is considered as output fields. We want them in the same level so that it can show only the remaining groups in the hierarchy. Before: $ perf report -s overhead,sample,period,comm,dso -H --stdio ... # Overhead Samples / Period / Command / Shared Object # ................. .......................................... # 100.00% 4035 100.00% 3835883066 100.00% perf 99.37% perf 0.50% ld-linux-x86-64.so.2 0.06% [unknown] 0.04% libc.so.6 0.02% libLLVM-16.so.1 After: $ perf report -s overhead,sample,period,comm,dso -H --stdio ... # Overhead Samples Period Command / Shared Object # ....................................... ....................... # 100.00% 4035 3835883066 perf 99.37% 4005 3811826223 perf 0.50% 19 19210014 ld-linux-x86-64.so.2 0.06% 8 2367089 [unknown] 0.04% 2 1720336 libc.so.6 0.02% 1 759404 libLLVM-16.so.1 Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/perf/util/sort.c | 44 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 44 insertions(+) diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index f08fbc4bf0a2ce29..6b49d64854f5f986 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -3720,6 +3720,34 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok, return -ESRCH; } +/* This should match with sort_dimension__add() above */ +static bool is_hpp_sort_key(const char *key) +{ + unsigned i; + + for (i = 0; i < ARRAY_SIZE(arch_specific_sort_keys); i++) { + if (!strcmp(arch_specific_sort_keys[i], key) && + !arch_support_sort_key(key)) { + return false; + } + } + + for (i = 0; i < ARRAY_SIZE(common_sort_dimensions); i++) { + struct sort_dimension *sd = &common_sort_dimensions[i]; + + if (sd->name && !strncasecmp(key, sd->name, strlen(key))) + return false; + } + + for (i = 0; i < ARRAY_SIZE(hpp_sort_dimensions); i++) { + struct hpp_dimension *hd = &hpp_sort_dimensions[i]; + + if (!strncasecmp(key, hd->name, strlen(key))) + return true; + } + return false; +} + static int setup_sort_list(struct perf_hpp_list *list, char *str, struct evlist *evlist) { @@ -3727,7 +3755,9 @@ static int setup_sort_list(struct perf_hpp_list *list, char *str, int ret = 0; int level = 0; int next_level = 1; + int prev_level = 0; bool in_group = false; + bool prev_was_hpp = false; do { tok = str; @@ -3748,6 +3778,19 @@ static int setup_sort_list(struct perf_hpp_list *list, char *str, } if (*tok) { + if (is_hpp_sort_key(tok)) { + /* keep output (hpp) sort keys in the same level */ + if (prev_was_hpp) { + bool next_same = (level == next_level); + + level = prev_level; + next_level = next_same ? level : level+1; + } + prev_was_hpp = true; + } else { + prev_was_hpp = false; + } + ret = sort_dimension__add(list, tok, evlist, level); if (ret == -EINVAL) { if (!cacheline_size() && !strncasecmp(tok, "dcacheline", strlen(tok))) @@ -3759,6 +3802,7 @@ static int setup_sort_list(struct perf_hpp_list *list, char *str, ui__error("Unknown --sort key: `%s'", tok); break; } + prev_level = level; } level = next_level; -- 2.49.0.rc0.332.g42c0ae87b1-goog ^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 2/3] perf report: Allow hierarchy mode for --children 2025-03-07 8:08 [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim @ 2025-03-07 8:08 ` Namhyung Kim 2025-03-07 8:08 ` [PATCH 3/3] perf report: Disable children column for data type profiling Namhyung Kim ` (2 subsequent siblings) 3 siblings, 0 replies; 37+ messages in thread From: Namhyung Kim @ 2025-03-07 8:08 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users, Dmitry Vyukov It was prohibited because the output fields in the children mode were not handled properly with hierarchy. But we can have the output fields in the same level, it can allow them together. For example, latency mode adds more output fields by default and now they are displayed properly. $ perf record --latency -g -- perf test -w thloop $ perf report -H --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 2K of event 'cycles:Pu' # Event count (approx.): 8266456478 # # Children Latency Overhead Latency Command / Shared Object / Symbol # ........................................... ........................................................ # 0.08% 0.16% 100.00% 100.00% perf 0.08% 0.16% 0.24% 0.47% ld-linux-x86-64.so.2 0.12% 0.24% 0.12% 0.24% [.] _dl_relocate_object 0.08% 0.16% 0.08% 0.16% [.] _dl_lookup_symbol_x 0.03% 0.06% 0.03% 0.06% [.] strcmp 0.00% 0.01% 0.00% 0.01% [.] _dl_start 0.00% 0.00% 0.00% 0.00% [.] _dl_start_user 0.00% 0.00% 0.00% 0.00% [.] _dl_sysdep_start 0.00% 0.00% 0.00% 0.00% [.] _start 0.00% 0.00% 0.00% 0.00% [.] dl_main 0.03% 0.06% 0.03% 0.06% libLLVM-16.so.1 0.03% 0.06% 0.03% 0.06% [.] llvm::StringMapImpl::RehashTable(unsigned int) 0.00% 0.00% 0.00% 0.00% [.] 0x00007f137ccd18e8 0.00% 0.00% 99.66% 99.31% perf 99.66% 99.31% 99.66% 99.31% [.] test_loop | |--49.86%--0x7f137b633d68 | 0x55dbdbbb7d2c ... Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/perf/builtin-report.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c index c1226da6c610258f..fc776e9d7fdfa273 100644 --- a/tools/perf/builtin-report.c +++ b/tools/perf/builtin-report.c @@ -1673,8 +1673,6 @@ int cmd_report(int argc, const char **argv) if (symbol_conf.report_hierarchy) { /* disable incompatible options */ - symbol_conf.cumulate_callchain = false; - if (field_order) { pr_err("Error: --hierarchy and --fields options cannot be used together\n"); parse_options_usage(report_usage, options, "F", 1); -- 2.49.0.rc0.332.g42c0ae87b1-goog ^ permalink raw reply related [flat|nested] 37+ messages in thread
* [PATCH 3/3] perf report: Disable children column for data type profiling 2025-03-07 8:08 [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim 2025-03-07 8:08 ` [PATCH 2/3] perf report: Allow hierarchy mode for --children Namhyung Kim @ 2025-03-07 8:08 ` Namhyung Kim 2025-03-20 0:36 ` [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim 2025-03-21 18:30 ` Namhyung Kim 3 siblings, 0 replies; 37+ messages in thread From: Namhyung Kim @ 2025-03-07 8:08 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users, Dmitry Vyukov I've realized that it doesn't make sense to accumulate the samples to parent in the callchain when data type profiling is enabled. Because it won't have the same data type access in the parent. Otherwise it'd see something like this: $ perf report -s type --stdio -g none # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 2K of event 'cycles:Pu' # Event count (approx.): 8266456478 # # Children Latency Self Latency Data Type # ........ ....... ........ ........ ......... # 698.97% 697.72% 99.80% 99.61% (unknown) 0.09% 0.18% 0.09% 0.18% Elf64_Rela 0.05% 0.10% 0.05% 0.10% unsigned char 0.05% 0.10% 0.05% 0.10% struct exit_function_list 0.00% 0.01% 0.00% 0.01% struct rtld_global Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/perf/builtin-report.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c index fc776e9d7fdfa273..b030ce72e13ea8d1 100644 --- a/tools/perf/builtin-report.c +++ b/tools/perf/builtin-report.c @@ -1719,6 +1719,9 @@ int cmd_report(int argc, const char **argv) report.data_type = true; annotate_opts.annotate_src = false; + /* disable incompatible options */ + symbol_conf.cumulate_callchain = false; + #ifndef HAVE_LIBDW_SUPPORT pr_err("Error: Data type profiling is disabled due to missing DWARF support\n"); goto error; -- 2.49.0.rc0.332.g42c0ae87b1-goog ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [PATCH 1/3] perf sort: Keep output fields in the same level 2025-03-07 8:08 [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim 2025-03-07 8:08 ` [PATCH 2/3] perf report: Allow hierarchy mode for --children Namhyung Kim 2025-03-07 8:08 ` [PATCH 3/3] perf report: Disable children column for data type profiling Namhyung Kim @ 2025-03-20 0:36 ` Namhyung Kim 2025-03-20 9:32 ` Ingo Molnar 2025-03-21 18:30 ` Namhyung Kim 3 siblings, 1 reply; 37+ messages in thread From: Namhyung Kim @ 2025-03-20 0:36 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users, Dmitry Vyukov On Fri, Mar 07, 2025 at 12:08:27AM -0800, Namhyung Kim wrote: > This is useful for hierarchy output mode where the first level is > considered as output fields. We want them in the same level so that it > can show only the remaining groups in the hierarchy. > > Before: > $ perf report -s overhead,sample,period,comm,dso -H --stdio > ... > # Overhead Samples / Period / Command / Shared Object > # ................. .......................................... > # > 100.00% 4035 > 100.00% 3835883066 > 100.00% perf > 99.37% perf > 0.50% ld-linux-x86-64.so.2 > 0.06% [unknown] > 0.04% libc.so.6 > 0.02% libLLVM-16.so.1 > > After: > $ perf report -s overhead,sample,period,comm,dso -H --stdio > ... > # Overhead Samples Period Command / Shared Object > # ....................................... ....................... > # > 100.00% 4035 3835883066 perf > 99.37% 4005 3811826223 perf > 0.50% 19 19210014 ld-linux-x86-64.so.2 > 0.06% 8 2367089 [unknown] > 0.04% 2 1720336 libc.so.6 > 0.02% 1 759404 libLLVM-16.so.1 > > Signed-off-by: Namhyung Kim <namhyung@kernel.org> Ping! Anybody interested in this change? :) Thanks, Namhyung > --- > tools/perf/util/sort.c | 44 ++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 44 insertions(+) > > diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c > index f08fbc4bf0a2ce29..6b49d64854f5f986 100644 > --- a/tools/perf/util/sort.c > +++ b/tools/perf/util/sort.c > @@ -3720,6 +3720,34 @@ int sort_dimension__add(struct perf_hpp_list *list, const char *tok, > return -ESRCH; > } > > +/* This should match with sort_dimension__add() above */ > +static bool is_hpp_sort_key(const char *key) > +{ > + unsigned i; > + > + for (i = 0; i < ARRAY_SIZE(arch_specific_sort_keys); i++) { > + if (!strcmp(arch_specific_sort_keys[i], key) && > + !arch_support_sort_key(key)) { > + return false; > + } > + } > + > + for (i = 0; i < ARRAY_SIZE(common_sort_dimensions); i++) { > + struct sort_dimension *sd = &common_sort_dimensions[i]; > + > + if (sd->name && !strncasecmp(key, sd->name, strlen(key))) > + return false; > + } > + > + for (i = 0; i < ARRAY_SIZE(hpp_sort_dimensions); i++) { > + struct hpp_dimension *hd = &hpp_sort_dimensions[i]; > + > + if (!strncasecmp(key, hd->name, strlen(key))) > + return true; > + } > + return false; > +} > + > static int setup_sort_list(struct perf_hpp_list *list, char *str, > struct evlist *evlist) > { > @@ -3727,7 +3755,9 @@ static int setup_sort_list(struct perf_hpp_list *list, char *str, > int ret = 0; > int level = 0; > int next_level = 1; > + int prev_level = 0; > bool in_group = false; > + bool prev_was_hpp = false; > > do { > tok = str; > @@ -3748,6 +3778,19 @@ static int setup_sort_list(struct perf_hpp_list *list, char *str, > } > > if (*tok) { > + if (is_hpp_sort_key(tok)) { > + /* keep output (hpp) sort keys in the same level */ > + if (prev_was_hpp) { > + bool next_same = (level == next_level); > + > + level = prev_level; > + next_level = next_same ? level : level+1; > + } > + prev_was_hpp = true; > + } else { > + prev_was_hpp = false; > + } > + > ret = sort_dimension__add(list, tok, evlist, level); > if (ret == -EINVAL) { > if (!cacheline_size() && !strncasecmp(tok, "dcacheline", strlen(tok))) > @@ -3759,6 +3802,7 @@ static int setup_sort_list(struct perf_hpp_list *list, char *str, > ui__error("Unknown --sort key: `%s'", tok); > break; > } > + prev_level = level; > } > > level = next_level; > -- > 2.49.0.rc0.332.g42c0ae87b1-goog > ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 1/3] perf sort: Keep output fields in the same level 2025-03-20 0:36 ` [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim @ 2025-03-20 9:32 ` Ingo Molnar 2025-03-20 16:16 ` Namhyung Kim ` (2 more replies) 0 siblings, 3 replies; 37+ messages in thread From: Ingo Molnar @ 2025-03-20 9:32 UTC (permalink / raw) To: Namhyung Kim Cc: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, LKML, linux-perf-users, Dmitry Vyukov * Namhyung Kim <namhyung@kernel.org> wrote: > On Fri, Mar 07, 2025 at 12:08:27AM -0800, Namhyung Kim wrote: > > This is useful for hierarchy output mode where the first level is > > considered as output fields. We want them in the same level so that it > > can show only the remaining groups in the hierarchy. > > > > Before: > > $ perf report -s overhead,sample,period,comm,dso -H --stdio > > ... > > # Overhead Samples / Period / Command / Shared Object > > # ................. .......................................... > > # > > 100.00% 4035 > > 100.00% 3835883066 > > 100.00% perf > > 99.37% perf > > 0.50% ld-linux-x86-64.so.2 > > 0.06% [unknown] > > 0.04% libc.so.6 > > 0.02% libLLVM-16.so.1 > > > > After: > > $ perf report -s overhead,sample,period,comm,dso -H --stdio > > ... > > # Overhead Samples Period Command / Shared Object > > # ....................................... ....................... > > # > > 100.00% 4035 3835883066 perf > > 99.37% 4005 3811826223 perf > > 0.50% 19 19210014 ld-linux-x86-64.so.2 > > 0.06% 8 2367089 [unknown] > > 0.04% 2 1720336 libc.so.6 > > 0.02% 1 759404 libLLVM-16.so.1 > > > > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > > Ping! Anybody interested in this change? :) Oh yes, all such pieces of intelligent organization of textual output of profiling data are worth their weight in gold in my book. :-) Acked-by: Ingo Molnar <mingo@kernel.org> 1) On a related note, does anyone know why perf stat output alignment sucks so much these days: starship:~/tip> perf stat --null --repeat 20 perf stat --null --repeat 3 perf bench sched messaging 2>&1 | grep elapsed 0.11620 +- 0.00327 seconds time elapsed ( +- 2.81% ) 0.120813 +- 0.000570 seconds time elapsed ( +- 0.47% ) 0.122280 +- 0.000443 seconds time elapsed ( +- 0.36% ) 0.119813 +- 0.000752 seconds time elapsed ( +- 0.63% ) 0.12190 +- 0.00134 seconds time elapsed ( +- 1.10% ) 0.119862 +- 0.000542 seconds time elapsed ( +- 0.45% ) 0.120075 +- 0.000608 seconds time elapsed ( +- 0.51% ) 0.120350 +- 0.000273 seconds time elapsed ( +- 0.23% ) 0.12203 +- 0.00114 seconds time elapsed ( +- 0.93% ) 0.12229 +- 0.00114 seconds time elapsed ( +- 0.93% ) 0.12032 +- 0.00115 seconds time elapsed ( +- 0.95% ) 0.121241 +- 0.000463 seconds time elapsed ( +- 0.38% ) 0.119404 +- 0.000333 seconds time elapsed ( +- 0.28% ) 0.119945 +- 0.000766 seconds time elapsed ( +- 0.64% ) 0.121215 +- 0.000879 seconds time elapsed ( +- 0.72% ) 0.12001 +- 0.00109 seconds time elapsed ( +- 0.91% ) 0.12193 +- 0.00182 seconds time elapsed ( +- 1.49% ) 0.119184 +- 0.000794 seconds time elapsed ( +- 0.67% ) 0.120062 +- 0.000439 seconds time elapsed ( +- 0.37% ) 0.120834 +- 0.000760 seconds time elapsed ( +- 0.63% ) 0.369473 +- 0.000992 seconds time elapsed ( +- 0.27% ) ... see how the vertical alignment of the output goes randomly wacko - I presume because there's a trailing zero in the output number and the code for some inexplicable reason decides to shorten it to make the life of developers harder? ;-) 2) It's also incredibly hard to Ctrl-C a 'perf stat --repeat' instance: starship:~/tip> perf stat --null --repeat 20 perf stat --null --repeat 3 perf bench sched messaging # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run ... Ctrl-C # Running 'sched/messaging' benchmark: perf: pollperf: perf: pollperf: pollpollperf: pollperf: pollperf: : Interrupted system call : Interrupted system call poll: Interrupted system call perf: pollperf: : Interrupted system call perf: pollperf: pollpollperf: : Interrupted system call pollperf: pollperf: perf: perf: pollpollpollperf: : Interrupted system call pollperf: poll: Interrupted system call : Interrupted system call : Interrupted system call : Interrupted system call perf: poll: Interrupted system call perf: perf: pollpoll: Interrupted system call : Interrupted system call perf: perf: perf: perf: perf: perf: : Interrupted system call pollpollpollpollpollpoll: Interrupted system call : Interrupted system call : Interrupted system call perf: perf: pollperf: perf: perf: perf: perf: perf: pollperf: : Interrupted system call pollpollpoll: Interrupted system call Note how the perf stat instance actually *hangs*. I have to Ctrl-Z it, and kill -9 %1 it the hard way to clean up: pollpollpoll: Interrupted system call � [1]+ Stopped perf stat --null --repeat 20 perf stat --null --repeat 3 perf bench sched messaging starship:~/tip> kill -9 %1 [1]+ Stopped perf stat --null --repeat 20 perf stat --null --repeat 3 perf bench sched messaging starship:~/tip> kill -9 %1 Does anyone use this thing for actual benchmarking work? ;-) 3) It would also be nice to be able to Ctrl-C out of a 'perf top' instance that freezes the output or so. Or prints a snapshot in ASCII. Anything but what it does currently: it just exits and clears the xterm screen of all useful information... I have to use 'f' (how many people know about that feature?) and copy & paste anything interesting from the screen the hard way. 4) It would also be nice to have an export function to save current 'perf top' profiling data and have it available for 'perf report' et al analysis. Ie. frequently I just see an interesting snapshot and decide that it's good for further analysis, freeze the screen and are left with very few options to keep it for further look and reference. 5) Would anyone be interested in an OpenGL-ish version of perf top, with its own low level shader for font-atlas based text output and vertex based polygon graphics, double buffering, full screen support with very little Xorg interaction for the GX pathway, etc? It should be *far* faster and lower overhead than the current ncurses->xterm->Wayland levels of indirection... and it could open up a new world of on-screen profiling information as well. Basically a very simple self-sustained OpenGL game engine for the key low level graphics primitives of modern GFX hardware. I could whip up a prototype if there's interest. Thanks, Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 1/3] perf sort: Keep output fields in the same level 2025-03-20 9:32 ` Ingo Molnar @ 2025-03-20 16:16 ` Namhyung Kim 2025-03-24 7:28 ` Ingo Molnar 2025-03-25 0:46 ` [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim 2025-03-30 5:54 ` Namhyung Kim 2 siblings, 1 reply; 37+ messages in thread From: Namhyung Kim @ 2025-03-20 16:16 UTC (permalink / raw) To: Ingo Molnar Cc: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, LKML, linux-perf-users, Dmitry Vyukov Hi Ingo, On Thu, Mar 20, 2025 at 10:32:39AM +0100, Ingo Molnar wrote: > > * Namhyung Kim <namhyung@kernel.org> wrote: > > > On Fri, Mar 07, 2025 at 12:08:27AM -0800, Namhyung Kim wrote: > > > This is useful for hierarchy output mode where the first level is > > > considered as output fields. We want them in the same level so that it > > > can show only the remaining groups in the hierarchy. > > > > > > Before: > > > $ perf report -s overhead,sample,period,comm,dso -H --stdio > > > ... > > > # Overhead Samples / Period / Command / Shared Object > > > # ................. .......................................... > > > # > > > 100.00% 4035 > > > 100.00% 3835883066 > > > 100.00% perf > > > 99.37% perf > > > 0.50% ld-linux-x86-64.so.2 > > > 0.06% [unknown] > > > 0.04% libc.so.6 > > > 0.02% libLLVM-16.so.1 > > > > > > After: > > > $ perf report -s overhead,sample,period,comm,dso -H --stdio > > > ... > > > # Overhead Samples Period Command / Shared Object > > > # ....................................... ....................... > > > # > > > 100.00% 4035 3835883066 perf > > > 99.37% 4005 3811826223 perf > > > 0.50% 19 19210014 ld-linux-x86-64.so.2 > > > 0.06% 8 2367089 [unknown] > > > 0.04% 2 1720336 libc.so.6 > > > 0.02% 1 759404 libLLVM-16.so.1 > > > > > > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > > > > Ping! Anybody interested in this change? :) > > Oh yes, all such pieces of intelligent organization of textual output > of profiling data are worth their weight in gold in my book. :-) > > Acked-by: Ingo Molnar <mingo@kernel.org> Thanks a lot! > > 1) > > On a related note, does anyone know why perf stat output alignment > sucks so much these days: > > starship:~/tip> perf stat --null --repeat 20 perf stat --null --repeat 3 perf bench sched messaging 2>&1 | grep elapsed > 0.11620 +- 0.00327 seconds time elapsed ( +- 2.81% ) > 0.120813 +- 0.000570 seconds time elapsed ( +- 0.47% ) > 0.122280 +- 0.000443 seconds time elapsed ( +- 0.36% ) > 0.119813 +- 0.000752 seconds time elapsed ( +- 0.63% ) > 0.12190 +- 0.00134 seconds time elapsed ( +- 1.10% ) > 0.119862 +- 0.000542 seconds time elapsed ( +- 0.45% ) > 0.120075 +- 0.000608 seconds time elapsed ( +- 0.51% ) > 0.120350 +- 0.000273 seconds time elapsed ( +- 0.23% ) > 0.12203 +- 0.00114 seconds time elapsed ( +- 0.93% ) > 0.12229 +- 0.00114 seconds time elapsed ( +- 0.93% ) > 0.12032 +- 0.00115 seconds time elapsed ( +- 0.95% ) > 0.121241 +- 0.000463 seconds time elapsed ( +- 0.38% ) > 0.119404 +- 0.000333 seconds time elapsed ( +- 0.28% ) > 0.119945 +- 0.000766 seconds time elapsed ( +- 0.64% ) > 0.121215 +- 0.000879 seconds time elapsed ( +- 0.72% ) > 0.12001 +- 0.00109 seconds time elapsed ( +- 0.91% ) > 0.12193 +- 0.00182 seconds time elapsed ( +- 1.49% ) > 0.119184 +- 0.000794 seconds time elapsed ( +- 0.67% ) > 0.120062 +- 0.000439 seconds time elapsed ( +- 0.37% ) > 0.120834 +- 0.000760 seconds time elapsed ( +- 0.63% ) > 0.369473 +- 0.000992 seconds time elapsed ( +- 0.27% ) > > ... see how the vertical alignment of the output goes randomly wacko - > I presume because there's a trailing zero in the output number and the > code for some inexplicable reason decides to shorten it to make the > life of developers harder? ;-) It seems it calcuates the precision when printing the result and doesn't care much about the repeatations. > > 2) > > It's also incredibly hard to Ctrl-C a 'perf stat --repeat' instance: > > starship:~/tip> perf stat --null --repeat 20 perf stat --null --repeat 3 perf bench sched messaging > # Running 'sched/messaging' benchmark: > # 20 sender and receiver processes per group > # 10 groups == 400 processes run > > ... > Ctrl-C > > # Running 'sched/messaging' benchmark: > perf: pollperf: perf: pollperf: pollpollperf: pollperf: pollperf: : Interrupted system call > : Interrupted system call > poll: Interrupted system call > perf: pollperf: : Interrupted system call > perf: pollperf: pollpollperf: : Interrupted system call > pollperf: pollperf: perf: perf: pollpollpollperf: : Interrupted system call > pollperf: poll: Interrupted system call > : Interrupted system call > : Interrupted system call > : Interrupted system call > perf: poll: Interrupted system call > perf: perf: pollpoll: Interrupted system call > : Interrupted system call > perf: perf: perf: perf: perf: perf: : Interrupted system call > pollpollpollpollpollpoll: Interrupted system call > : Interrupted system call > : Interrupted system call > perf: perf: pollperf: perf: perf: perf: perf: perf: pollperf: : Interrupted system call > pollpollpoll: Interrupted system call > > Note how the perf stat instance actually *hangs*. I have to Ctrl-Z it, > and kill -9 %1 it the hard way to clean up: > > pollpollpoll: Interrupted system call > � > [1]+ Stopped perf stat --null --repeat 20 perf stat --null --repeat 3 perf bench sched messaging > starship:~/tip> kill -9 %1 > > [1]+ Stopped perf stat --null --repeat 20 perf stat --null --repeat 3 perf bench sched messaging > starship:~/tip> kill -9 %1 > > Does anyone use this thing for actual benchmarking work? ;-) Oh, ok. I'll take a look. > > 3) > > It would also be nice to be able to Ctrl-C out of a 'perf top' instance > that freezes the output or so. Or prints a snapshot in ASCII. Anything > but what it does currently: it just exits and clears the xterm screen > of all useful information... > > I have to use 'f' (how many people know about that feature?) and copy & > paste anything interesting from the screen the hard way. Actually I was not aware of 'f' key. :) I think you can use 'P' to save the current screen to a file. > > 4) > > It would also be nice to have an export function to save current 'perf > top' profiling data and have it available for 'perf report' et al > analysis. Ie. frequently I just see an interesting snapshot and decide > that it's good for further analysis, freeze the screen and are left > with very few options to keep it for further look and reference. Hmm.. interesting. I'll think about it but it seems doable. > > 5) > > Would anyone be interested in an OpenGL-ish version of perf top, with > its own low level shader for font-atlas based text output and vertex > based polygon graphics, double buffering, full screen support with very > little Xorg interaction for the GX pathway, etc? It should be *far* > faster and lower overhead than the current ncurses->xterm->Wayland > levels of indirection... and it could open up a new world of on-screen > profiling information as well. Basically a very simple self-sustained > OpenGL game engine for the key low level graphics primitives of modern > GFX hardware. I could whip up a prototype if there's interest. I'm not familiar with the fancy GFX things but it'd be interesting to see. Not sure how well we can manage it in the future though, as we have unloved GTK2 UI as well. Thanks, Namhyung ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 1/3] perf sort: Keep output fields in the same level 2025-03-20 16:16 ` Namhyung Kim @ 2025-03-24 7:28 ` Ingo Molnar 2025-03-25 0:26 ` Namhyung Kim 2025-04-04 9:41 ` [perf top] annotation doesn't work, libunwind doesn't seem to be working either Ingo Molnar 0 siblings, 2 replies; 37+ messages in thread From: Ingo Molnar @ 2025-03-24 7:28 UTC (permalink / raw) To: Namhyung Kim Cc: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, LKML, linux-perf-users, Dmitry Vyukov * Namhyung Kim <namhyung@kernel.org> wrote: > > 3) > > > > It would also be nice to be able to Ctrl-C out of a 'perf top' instance > > that freezes the output or so. Or prints a snapshot in ASCII. Anything > > but what it does currently: it just exits and clears the xterm screen > > of all useful information... > > > > I have to use 'f' (how many people know about that feature?) and copy & > > paste anything interesting from the screen the hard way. > > Actually I was not aware of 'f' key. :) I think you can use 'P' to save > the current screen to a file. And I was not aware of the 'P' key. :-) Would be nice if it also printed column and meta information, i.e. if it included these lines: Samples: 1M of event 'cycles:P', 4000 Hz, Event count (approx.): 758731362801 lost: 0/0 drop: 0/0 Overhead Shared Object Symbol or so? BTW., the approximate 'event count' is a pretty uninformative value these days, right? Might as well not clutter the screen with it. Thanks, Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 1/3] perf sort: Keep output fields in the same level 2025-03-24 7:28 ` Ingo Molnar @ 2025-03-25 0:26 ` Namhyung Kim 2025-04-04 9:41 ` [perf top] annotation doesn't work, libunwind doesn't seem to be working either Ingo Molnar 1 sibling, 0 replies; 37+ messages in thread From: Namhyung Kim @ 2025-03-25 0:26 UTC (permalink / raw) To: Ingo Molnar Cc: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, LKML, linux-perf-users, Dmitry Vyukov On Mon, Mar 24, 2025 at 08:28:31AM +0100, Ingo Molnar wrote: > > * Namhyung Kim <namhyung@kernel.org> wrote: > > > > 3) > > > > > > It would also be nice to be able to Ctrl-C out of a 'perf top' instance > > > that freezes the output or so. Or prints a snapshot in ASCII. Anything > > > but what it does currently: it just exits and clears the xterm screen > > > of all useful information... > > > > > > I have to use 'f' (how many people know about that feature?) and copy & > > > paste anything interesting from the screen the hard way. > > > > Actually I was not aware of 'f' key. :) I think you can use 'P' to save > > the current screen to a file. > > And I was not aware of the 'P' key. :-) > > Would be nice if it also printed column and meta information, i.e. if > it included these lines: > > Samples: 1M of event 'cycles:P', 4000 Hz, Event count (approx.): 758731362801 lost: 0/0 drop: 0/0 > Overhead Shared Object Symbol > > or so? Yep, maybe we can do the same as the stdio output. # # # Total Lost Samples: 0 # # Samples: 7 of event 'cycles' # Event count (approx.): 3641468 # # Overhead Command Shared Object Symbol # ........ ....... ................ .................................. # > > BTW., the approximate 'event count' is a pretty uninformative value > these days, right? Might as well not clutter the screen with it. Maybe. I don't look at the number so far but also don't feel the need for removal. Thanks, Namhyung ^ permalink raw reply [flat|nested] 37+ messages in thread
* [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-03-24 7:28 ` Ingo Molnar 2025-03-25 0:26 ` Namhyung Kim @ 2025-04-04 9:41 ` Ingo Molnar 2025-04-04 17:28 ` Namhyung Kim 2025-04-07 6:02 ` Howard Chu 1 sibling, 2 replies; 37+ messages in thread From: Ingo Molnar @ 2025-04-04 9:41 UTC (permalink / raw) To: Namhyung Kim Cc: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov BTW., here's a few other 'perf' annoyances I have, if anyone's listening :-) 1) I cannot get the libunwind build-feature failure to go away: ... libcrypto: [ on ] ... libunwind: [ OFF ] ... libcapstone: [ on ] ... llvm-perf: [ on ] I do have libunwind-dev installed: ii libunwind-dev:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - development ii libunwind8:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - runtime but it fails to link: kepler:~/tip/tools/build/feature> cat test-libunwind.make.output /usr/bin/ld: /tmp/cc2BfaNM.o: in function `main': test-libunwind.c:(.text+0x1c): undefined reference to `_Ux86_64_create_addr_space' /usr/bin/ld: test-libunwind.c:(.text+0x44): undefined reference to `_Ux86_64_init_remote' /usr/bin/ld: test-libunwind.c:(.text+0x6b): undefined reference to `_Ux86_64_dwarf_search_unwind_table' collect2: error: ld returned 1 exit status I tried to install the libunwind-19-dev package, I tried to uninstall/reinstall - no combination seems to work. perf is also totally unhelpful about resolving such issues - it used to issue tips about what packages to install, but those tips are not present anymore, for this one at least. 2) More annoyingly, I cannot get source code annotation of the kernel to work - this might be related to the libunwind build failure (or not). Within 'perf top' I go into a function, and I press 's', which is supposed to toggle the source code annotation ... but nothing happens. 'nothing happens' is perhaps the most passive-aggressive reaction a tool can give to a user. I'd prefer a *crash* to ignoring the keypress ... There's no error printed anywhere - just the color of the hexadecimal labels changes, nothing else. 'perf top' won't annotate kernel functions (despite me having a DEBUGINFO kernel package installed and booted), nor will it even annotate its own functions within 'perf', in a similar fashion. It's similar for 'perf report' too, although that's not a surprise, they share much of the codebase. 'perf annotate --stdio' only shows an objdump --disassemble style output, but without any source annotations either. This is a rather frustrating user experience. :-/ Thanks, Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-04 9:41 ` [perf top] annotation doesn't work, libunwind doesn't seem to be working either Ingo Molnar @ 2025-04-04 17:28 ` Namhyung Kim 2025-04-04 18:13 ` Arnaldo Carvalho de Melo 2025-04-05 9:06 ` Ingo Molnar 2025-04-07 6:02 ` Howard Chu 1 sibling, 2 replies; 37+ messages in thread From: Namhyung Kim @ 2025-04-04 17:28 UTC (permalink / raw) To: Ingo Molnar Cc: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov Hi Ingo, On Fri, Apr 04, 2025 at 11:41:46AM +0200, Ingo Molnar wrote: > > BTW., here's a few other 'perf' annoyances I have, if anyone's > listening :-) Thanks for the report. Glad to hear from you. > > > 1) > > I cannot get the libunwind build-feature failure to go away: > > ... libcrypto: [ on ] > ... libunwind: [ OFF ] > ... libcapstone: [ on ] > ... llvm-perf: [ on ] > > I do have libunwind-dev installed: > > ii libunwind-dev:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - development > ii libunwind8:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - runtime > > but it fails to link: > > kepler:~/tip/tools/build/feature> cat test-libunwind.make.output > /usr/bin/ld: /tmp/cc2BfaNM.o: in function `main': > test-libunwind.c:(.text+0x1c): undefined reference to `_Ux86_64_create_addr_space' > /usr/bin/ld: test-libunwind.c:(.text+0x44): undefined reference to `_Ux86_64_init_remote' > /usr/bin/ld: test-libunwind.c:(.text+0x6b): undefined reference to `_Ux86_64_dwarf_search_unwind_table' > collect2: error: ld returned 1 exit status > > I tried to install the libunwind-19-dev package, I tried to > uninstall/reinstall - no combination seems to work. > > perf is also totally unhelpful about resolving such issues - it used to > issue tips about what packages to install, but those tips are not > present anymore, for this one at least. Since v6.13, we switched to disable libunwind by default and used unwinding in libdw. commit 13e17c9ff491 ("perf build: Make libunwind opt-in rather than opt-out") You need to pass LIBUNWIND=1 to build with it. > > 2) > > More annoyingly, I cannot get source code annotation of the kernel to > work - this might be related to the libunwind build failure (or not). I don't think it's related. Anyway do you see the same for user binaries or is it only for the kernel? > > Within 'perf top' I go into a function, and I press 's', which is > supposed to toggle the source code annotation ... but nothing happens. > > 'nothing happens' is perhaps the most passive-aggressive reaction a > tool can give to a user. I'd prefer a *crash* to ignoring the keypress > ... Oh ok. I prefer a warning. :) > > There's no error printed anywhere - just the color of the hexadecimal > labels changes, nothing else. > > 'perf top' won't annotate kernel functions (despite me having a > DEBUGINFO kernel package installed and booted), nor will it even > annotate its own functions within 'perf', in a similar fashion. I see. So the problem is not specific to the kernel. > > It's similar for 'perf report' too, although that's not a surprise, > they share much of the codebase. Yep, of course. > > 'perf annotate --stdio' only shows an objdump --disassemble style > output, but without any source annotations either. > > This is a rather frustrating user experience. :-/ Sorry about that. I suspect that I made a mistake when I was working on the data type profiling. I'll take a look. Thanks, Namhyung ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-04 17:28 ` Namhyung Kim @ 2025-04-04 18:13 ` Arnaldo Carvalho de Melo 2025-04-04 18:25 ` Arnaldo Carvalho de Melo 2025-04-05 9:06 ` Ingo Molnar 1 sibling, 1 reply; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-04 18:13 UTC (permalink / raw) To: Namhyung Kim Cc: Ingo Molnar, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Fri, Apr 04, 2025 at 10:28:06AM -0700, Namhyung Kim wrote: > Hi Ingo, > On Fri, Apr 04, 2025 at 11:41:46AM +0200, Ingo Molnar wrote: > > BTW., here's a few other 'perf' annoyances I have, if anyone's > > listening :-) > Thanks for the report. Glad to hear from you. Ditto, always a pleasure to get reports from power users, and I don't mind if its complaints, its feedback, and that is what we need to hear to fix things and improve the experience. > > 1) > > I cannot get the libunwind build-feature failure to go away: > > ... libcrypto: [ on ] > > ... libunwind: [ OFF ] > > ... libcapstone: [ on ] > > ... llvm-perf: [ on ] > > I do have libunwind-dev installed: > > ii libunwind-dev:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - development > > ii libunwind8:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - runtime > > but it fails to link: > > kepler:~/tip/tools/build/feature> cat test-libunwind.make.output > > /usr/bin/ld: /tmp/cc2BfaNM.o: in function `main': > > test-libunwind.c:(.text+0x1c): undefined reference to `_Ux86_64_create_addr_space' > > /usr/bin/ld: test-libunwind.c:(.text+0x44): undefined reference to `_Ux86_64_init_remote' > > /usr/bin/ld: test-libunwind.c:(.text+0x6b): undefined reference to `_Ux86_64_dwarf_search_unwind_table' > > collect2: error: ld returned 1 exit status > > I tried to install the libunwind-19-dev package, I tried to > > uninstall/reinstall - no combination seems to work. > > perf is also totally unhelpful about resolving such issues - it used to > > issue tips about what packages to install, but those tips are not > > present anymore, for this one at least. > Since v6.13, we switched to disable libunwind by default and used > unwinding in libdw. > commit 13e17c9ff491 ("perf build: Make libunwind opt-in rather than opt-out") > You need to pass LIBUNWIND=1 to build with it. But we should have that spelled out in the output of the build where it appears marked as not being enabled even with the usual devel libraries being available. Its back to change in behaviour that is not clearly marked in the tool output. > > 2) > > More annoyingly, I cannot get source code annotation of the kernel to > > work - this might be related to the libunwind build failure (or not). > I don't think it's related. Anyway do you see the same for user > binaries or is it only for the kernel? Also now we have multiple ways of doing that feature, perhaps there is a bug in enabling it, that is in this description: https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a6e8a58de6294578195447596fb975a9027b4d2c I'm checking... > > Within 'perf top' I go into a function, and I press 's', which is > > supposed to toggle the source code annotation ... but nothing happens. > > 'nothing happens' is perhaps the most passive-aggressive reaction a > > tool can give to a user. I'd prefer a *crash* to ignoring the keypress > > ... > Oh ok. I prefer a warning. :) Agreed. > > There's no error printed anywhere - just the color of the hexadecimal > > labels changes, nothing else. > > 'perf top' won't annotate kernel functions (despite me having a > > DEBUGINFO kernel package installed and booted), nor will it even > > annotate its own functions within 'perf', in a similar fashion. > I see. So the problem is not specific to the kernel. I didn't see it in my tests, but I've been away, travelling, will check, this should be caught by CI systems using 'perf test'. > > It's similar for 'perf report' too, although that's not a surprise, > > they share much of the codebase. > Yep, of course. > > 'perf annotate --stdio' only shows an objdump --disassemble style > > output, but without any source annotations either. > > This is a rather frustrating user experience. :-/ > Sorry about that. I suspect that I made a mistake when I was working on > the data type profiling. I'll take a look. We need to continue improving the 'perf test' infrastructure and get help from people doing CI, perf has way too many features and dependencies, we need to continue testing what is great and pruning dependencies that drag us down while not helping with features used by most people. - Arnaldo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-04 18:13 ` Arnaldo Carvalho de Melo @ 2025-04-04 18:25 ` Arnaldo Carvalho de Melo 2025-04-04 18:40 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-04 18:25 UTC (permalink / raw) To: Namhyung Kim Cc: Ingo Molnar, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Fri, Apr 04, 2025 at 03:13:12PM -0300, Arnaldo Carvalho de Melo wrote: > On Fri, Apr 04, 2025 at 10:28:06AM -0700, Namhyung Kim wrote: > > > I cannot get the libunwind build-feature failure to go away: > > > > ... libcrypto: [ on ] > > > ... libunwind: [ OFF ] > > > ... libcapstone: [ on ] > > > ... llvm-perf: [ on ] > > > > I do have libunwind-dev installed: > > > > ii libunwind-dev:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - development > > > ii libunwind8:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - runtime > > > > but it fails to link: > > > > kepler:~/tip/tools/build/feature> cat test-libunwind.make.output > > > /usr/bin/ld: /tmp/cc2BfaNM.o: in function `main': > > > test-libunwind.c:(.text+0x1c): undefined reference to `_Ux86_64_create_addr_space' > > > /usr/bin/ld: test-libunwind.c:(.text+0x44): undefined reference to `_Ux86_64_init_remote' > > > /usr/bin/ld: test-libunwind.c:(.text+0x6b): undefined reference to `_Ux86_64_dwarf_search_unwind_table' > > > collect2: error: ld returned 1 exit status > > > > I tried to install the libunwind-19-dev package, I tried to > > > uninstall/reinstall - no combination seems to work. > > > > perf is also totally unhelpful about resolving such issues - it used to > > > issue tips about what packages to install, but those tips are not > > > present anymore, for this one at least. > > > Since v6.13, we switched to disable libunwind by default and used > > unwinding in libdw. > > > commit 13e17c9ff491 ("perf build: Make libunwind opt-in rather than opt-out") > > > You need to pass LIBUNWIND=1 to build with it. > > But we should have that spelled out in the output of the build where it > appears marked as not being enabled even with the usual devel libraries > being available. > > Its back to change in behaviour that is not clearly marked in the tool > output. I just tried but realized that I have the libunwind devel files available... so get: Auto-detecting system features: ... libdw: [ on ] ... glibc: [ on ] ... libbfd: [ on ] ... libbfd-buildid: [ on ] ... libelf: [ on ] ... libnuma: [ on ] ... numa_num_possible_cpus: [ on ] ... libperl: [ on ] ... libpython: [ on ] ... libcrypto: [ on ] ... libunwind: [ on ] ... libcapstone: [ on ] ... llvm-perf: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ on ] ... bpf: [ on ] ... libaio: [ on ] ... libzstd: [ on ] Nope, not really, I only have: ⬢ [acme@toolbox perf-tools-next]$ rpm -qa | grep unwind libunwind-1.8.0-3.fc40.x86_64 ⬢ [acme@toolbox perf-tools-next]$ rpm -e libunwind error: Failed dependencies: libunwind.so.8()(64bit) is needed by (installed) gstreamer1-1.24.11-1.fc40.x86_64 ⬢ [acme@toolbox perf-tools-next]$ So this is a bug... There are no references to libunwind tools/build/feature/test-all.c and it is building ok: ⬢ [acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output ⬢ [acme@toolbox perf-tools-next]$ and somehow libunwind devel files are being somehow detected as being present... Investigating... - Arnaldo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-04 18:25 ` Arnaldo Carvalho de Melo @ 2025-04-04 18:40 ` Arnaldo Carvalho de Melo 0 siblings, 0 replies; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-04 18:40 UTC (permalink / raw) To: Namhyung Kim Cc: Ingo Molnar, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Fri, Apr 04, 2025 at 03:26:03PM -0300, Arnaldo Carvalho de Melo wrote: > So this is a bug... > > There are no references to libunwind tools/build/feature/test-all.c and > it is building ok: > > ⬢ [acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output > ⬢ [acme@toolbox perf-tools-next]$ > > and somehow libunwind devel files are being somehow detected as being present... > > Investigating... Cleaning everything and rebuilding we still end up with: ⬢ [acme@toolbox perf-tools-next]$ cat /tmp/build/perf-tools-next/feature/test-all.make.output ⬢ [acme@toolbox perf-tools-next]$ grep unwind /tmp/build/perf-tools-next/FEATURE-DUMP feature-libunwind=1 ⬢ [acme@toolbox perf-tools-next]$ I found a couple of problems with the patches that made libunwind to be opt-in, I'm sending a patch shortly. - Arnaldo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-04 17:28 ` Namhyung Kim 2025-04-04 18:13 ` Arnaldo Carvalho de Melo @ 2025-04-05 9:06 ` Ingo Molnar 2025-04-05 9:09 ` Ingo Molnar 1 sibling, 1 reply; 37+ messages in thread From: Ingo Molnar @ 2025-04-05 9:06 UTC (permalink / raw) To: Namhyung Kim Cc: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov * Namhyung Kim <namhyung@kernel.org> wrote: > > /usr/bin/ld: test-libunwind.c:(.text+0x44): undefined reference to `_Ux86_64_init_remote' > > /usr/bin/ld: test-libunwind.c:(.text+0x6b): undefined reference to `_Ux86_64_dwarf_search_unwind_table' > > collect2: error: ld returned 1 exit status > > > > I tried to install the libunwind-19-dev package, I tried to > > uninstall/reinstall - no combination seems to work. > > > > perf is also totally unhelpful about resolving such issues - it used to > > issue tips about what packages to install, but those tips are not > > present anymore, for this one at least. > > Since v6.13, we switched to disable libunwind by default and used > unwinding in libdw. > > commit 13e17c9ff491 ("perf build: Make libunwind opt-in rather than > opt-out") > > You need to pass LIBUNWIND=1 to build with it. So, users (including me) have absolutely no idea which unwinding library they want to build with: 99% of them would like to have a clean, full-featured perf build with all libraries marked as '[ on ]' in the build log. :-) > > Within 'perf top' I go into a function, and I press 's', which is > > supposed to toggle the source code annotation ... but nothing happens. > > > > 'nothing happens' is perhaps the most passive-aggressive reaction a > > tool can give to a user. I'd prefer a *crash* to ignoring the keypress > > ... > > Oh ok. I prefer a warning. :) There's a zillion other things I'd prefer over the current behavior, what I wanted to say was that *even a crash* is superior to a puzzling 'nothing' :-) Thanks, Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-05 9:06 ` Ingo Molnar @ 2025-04-05 9:09 ` Ingo Molnar 0 siblings, 0 replies; 37+ messages in thread From: Ingo Molnar @ 2025-04-05 9:09 UTC (permalink / raw) To: Namhyung Kim Cc: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov * Ingo Molnar <mingo@kernel.org> wrote: > > You need to pass LIBUNWIND=1 to build with it. > > So, users (including me) have absolutely no idea which unwinding > library they want to build with: 99% of them would like to have a > clean, full-featured perf build with all libraries marked as '[ on ]' > in the build log. :-) Correction: users would like to use the *best* unwinding library, and it would be great if the build process picked that for them, or at least warned them if that ideal state wasn't possible due to missing dependencies. Thanks, Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-04 9:41 ` [perf top] annotation doesn't work, libunwind doesn't seem to be working either Ingo Molnar 2025-04-04 17:28 ` Namhyung Kim @ 2025-04-07 6:02 ` Howard Chu 2025-04-07 16:58 ` Ingo Molnar 1 sibling, 1 reply; 37+ messages in thread From: Howard Chu @ 2025-04-07 6:02 UTC (permalink / raw) To: Ingo Molnar Cc: Namhyung Kim, Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov Hello Ingo, I use Ubuntu 24.10 too and my cpu is also x86_64, so I am taking a look. On Fri, Apr 4, 2025 at 2:42 AM Ingo Molnar <mingo@kernel.org> wrote: > > > BTW., here's a few other 'perf' annoyances I have, if anyone's > listening :-) > > > 1) > > I cannot get the libunwind build-feature failure to go away: > > ... libcrypto: [ on ] > ... libunwind: [ OFF ] > ... libcapstone: [ on ] > ... llvm-perf: [ on ] > > I do have libunwind-dev installed: > > ii libunwind-dev:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - development > ii libunwind8:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - runtime > > but it fails to link: > > kepler:~/tip/tools/build/feature> cat test-libunwind.make.output > /usr/bin/ld: /tmp/cc2BfaNM.o: in function `main': > test-libunwind.c:(.text+0x1c): undefined reference to `_Ux86_64_create_addr_space' > /usr/bin/ld: test-libunwind.c:(.text+0x44): undefined reference to `_Ux86_64_init_remote' > /usr/bin/ld: test-libunwind.c:(.text+0x6b): undefined reference to `_Ux86_64_dwarf_search_unwind_table' > collect2: error: ld returned 1 exit status $ make test-libunwind.bin gcc -MD -Wall -Werror -o test-libunwind.bin test-libunwind.c > test-libunwind.make.output 2>&1 -lelf -llzma make: *** [Makefile:212: test-libunwind.bin] Error 1 $ cat test-libunwind.make.output /usr/bin/ld: /tmp/ccPpmuee.o: in function `main': test-libunwind.c:(.text+0x1c): undefined reference to `_Ux86_64_create_addr_space' /usr/bin/ld: test-libunwind.c:(.text+0x44): undefined reference to `_Ux86_64_init_remote' /usr/bin/ld: test-libunwind.c:(.text+0x6b): undefined reference to `_Ux86_64_dwarf_search_unwind_table' collect2: error: ld returned 1 exit status You mentioned your machine is x86_64, I ran $ make test-libunwind-x86_64.bin gcc -MD -Wall -Werror -o test-libunwind-x86_64.bin test-libunwind-x86_64.c > test-libunwind-x86_64.make.output 2>&1 -lelf -llzma -lunwind-x86_64 $ And it successfully built test-libunwind-x86_64.bin for me. > > I tried to install the libunwind-19-dev package, I tried to > uninstall/reinstall - no combination seems to work. $ apt list --installed | grep libunwind WARNING: apt does not have a stable CLI interface. Use with caution in scripts. libunwind-dev/oracular,now 1.6.2-3.1 amd64 [installed] libunwind8/oracular,now 1.6.2-3.1 amd64 [installed,automatic] Built perf with make clean install, my libunwind is on. ... libunwind: [ on ] But when I ran perf check, $ perf check feature libunwind libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT libunwind is OFF. plot twist: I uninstalled libdw, now libunwind is OFF. ... libdw: [ OFF ] ... glibc: [ on ] ... libbfd: [ on ] ... libbfd-buildid: [ on ] ... libelf: [ on ] ... libnuma: [ on ] ... numa_num_possible_cpus: [ on ] ... libperl: [ on ] ... libpython: [ on ] ... libcrypto: [ on ] ... libunwind: [ OFF ] ... libcapstone: [ on ] ... llvm-perf: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ on ] ... bpf: [ on ] ... libaio: [ on ] ... libzstd: [ on ] So they are somehow tied together. Don't know how to fix it yet. So maybe try installing libdw for a usable perf annotate? Thanks, Howard ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-07 6:02 ` Howard Chu @ 2025-04-07 16:58 ` Ingo Molnar 2025-04-07 17:04 ` Ingo Molnar 2025-04-08 0:54 ` Arnaldo Carvalho de Melo 0 siblings, 2 replies; 37+ messages in thread From: Ingo Molnar @ 2025-04-07 16:58 UTC (permalink / raw) To: Howard Chu Cc: Namhyung Kim, Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov * Howard Chu <howardchu95@gmail.com> wrote: > > I tried to install the libunwind-19-dev package, I tried to > > uninstall/reinstall - no combination seems to work. > > $ apt list --installed | grep libunwind > > WARNING: apt does not have a stable CLI interface. Use with caution in scripts. > > libunwind-dev/oracular,now 1.6.2-3.1 amd64 [installed] > libunwind8/oracular,now 1.6.2-3.1 amd64 [installed,automatic] > > Built perf with make clean install, my libunwind is on. > > ... libunwind: [ on ] I seem to have the exact same build environment: starship:~> apt list --installed | grep libunwind WARNING: apt does not have a stable CLI interface. Use with caution in scripts. libunwind-dev/oracular,now 1.6.2-3.1 amd64 [installed] libunwind8/oracular,now 1.6.2-3.1 amd64 [installed] Yet: Makefile.config:1142: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel Makefile.config:1155: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev Makefile.config:1187: libtracefs is missing. Please install libtracefs-dev/libtracefs-devel Auto-detecting system features: ... libdw: [ on ] ... glibc: [ on ] ... libbfd: [ on ] ... libbfd-buildid: [ on ] ... libelf: [ on ] ... libnuma: [ on ] ... numa_num_possible_cpus: [ on ] ... libperl: [ on ] ... libpython: [ on ] ... libcrypto: [ on ] ... libunwind: [ OFF ] ... libcapstone: [ on ] ... llvm-perf: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ on ] ... bpf: [ on ] ... libaio: [ on ] ... libzstd: [ on ] > > But when I ran perf check, > > $ perf check feature libunwind > libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT > > libunwind is OFF. Same here. Thanks, Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-07 16:58 ` Ingo Molnar @ 2025-04-07 17:04 ` Ingo Molnar 2025-04-08 0:54 ` Arnaldo Carvalho de Melo 1 sibling, 0 replies; 37+ messages in thread From: Ingo Molnar @ 2025-04-07 17:04 UTC (permalink / raw) To: Howard Chu Cc: Namhyung Kim, Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov * Ingo Molnar <mingo@kernel.org> wrote: > Yet: > BTW., on a side quest, more details about these other dependency error messages: > Makefile.config:1142: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel But I have openjdk-8-jdk installed: root@starship:~# apt list --installed | grep openjdk-8 openjdk-8-jdk-headless/oracular-updates,oracular-security,now 8u442-b06~us1-0ubuntu1~24.10 amd64 [installed,automatic] openjdk-8-jdk/oracular-updates,oracular-security,now 8u442-b06~us1-0ubuntu1~24.10 amd64 [installed] openjdk-8-jre-headless/oracular-updates,oracular-security,now 8u442-b06~us1-0ubuntu1~24.10 amd64 [installed,automatic] openjdk-8-jre/oracular-updates,oracular-security,now 8u442-b06~us1-0ubuntu1~24.10 amd64 [installed,automatic] (and a bunch of other OpenJDK packages.) > Makefile.config:1155: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev This one was resolved by the suggestion: apt install libpfm4-dev (yay!) Although it would be nice if it was done via the regular feature checksL: a red 'OFF' there catches my attention - some random message in a 1000 lines long build log doesn't ... > Makefile.config:1187: libtracefs is missing. Please install libtracefs-dev/libtracefs-devel This too got resolved by the suggestion offered: apt install libtracefs-dev Same observation to move this to the 'on/OFF' type of detection applies. Thanks, Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-07 16:58 ` Ingo Molnar 2025-04-07 17:04 ` Ingo Molnar @ 2025-04-08 0:54 ` Arnaldo Carvalho de Melo 2025-04-08 6:16 ` Namhyung Kim 2025-04-08 8:05 ` Ingo Molnar 1 sibling, 2 replies; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-08 0:54 UTC (permalink / raw) To: Ingo Molnar Cc: Howard Chu, Namhyung Kim, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov [-- Attachment #1: Type: text/plain, Size: 6425 bytes --] On Mon, Apr 07, 2025 at 06:58:28PM +0200, Ingo Molnar wrote: > * Howard Chu <howardchu95@gmail.com> wrote: > > > I tried to install the libunwind-19-dev package, I tried to > > > uninstall/reinstall - no combination seems to work. > > > > $ apt list --installed | grep libunwind > > > > WARNING: apt does not have a stable CLI interface. Use with caution in scripts. > > > > libunwind-dev/oracular,now 1.6.2-3.1 amd64 [installed] > > libunwind8/oracular,now 1.6.2-3.1 amd64 [installed,automatic] > > > > Built perf with make clean install, my libunwind is on. > > > > ... libunwind: [ on ] > > I seem to have the exact same build environment: > > starship:~> apt list --installed | grep libunwind > > WARNING: apt does not have a stable CLI interface. Use with caution in scripts. > > libunwind-dev/oracular,now 1.6.2-3.1 amd64 [installed] > libunwind8/oracular,now 1.6.2-3.1 amd64 [installed] > > Yet: > > Makefile.config:1142: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel > Makefile.config:1155: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev > Makefile.config:1187: libtracefs is missing. Please install libtracefs-dev/libtracefs-devel > > Auto-detecting system features: > ... libdw: [ on ] > ... glibc: [ on ] > ... libbfd: [ on ] > ... libbfd-buildid: [ on ] > ... libelf: [ on ] > ... libnuma: [ on ] > ... numa_num_possible_cpus: [ on ] > ... libperl: [ on ] > ... libpython: [ on ] > ... libcrypto: [ on ] > ... libunwind: [ OFF ] > ... libcapstone: [ on ] > ... llvm-perf: [ on ] > ... zlib: [ on ] > ... lzma: [ on ] > ... get_cpuid: [ on ] > ... bpf: [ on ] > ... libaio: [ on ] > ... libzstd: [ on ] > > But when I ran perf check, > > $ perf check feature libunwind > > libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT > > libunwind is OFF. > Same here. So, with a patch I have (attached) here this turns into: ⬢ [acme@toolbox perf-tools-next]$ perf check feature libunwind libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it ) ⬢ [acme@toolbox perf-tools-next]$ Then I follow the suggestion and install libunwind-devel and build with LIBUNWIND=1: $ make -k LIBUNWIND=1 O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin <SNIP> Auto-detecting system features: ... libdw: [ on ] ... glibc: [ on ] ... libbfd: [ on ] ... libbfd-buildid: [ on ] ... libelf: [ on ] ... libnuma: [ on ] ... numa_num_possible_cpus: [ on ] ... libperl: [ on ] ... libpython: [ on ] ... libcrypto: [ on ] ... libunwind: [ on ] ... libcapstone: [ on ] ... llvm-perf: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ on ] ... bpf: [ on ] ... libaio: [ on ] ... libzstd: [ on ] ⬢ [acme@toolbox perf-tools-next]$ ldd ~/bin/perf | grep unwind libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007f434eaea000) libunwind.so.8 => /lib64/libunwind.so.8 (0x00007f434ead0000) ⬢ [acme@toolbox perf-tools-next]$ I need to have that hooked up to tools/build/Makefile.feature, where it would find that tip somewhere, i.e. that "Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it". This all would not happen if when libunwind was turned into an opt-in the default output for it had been removed, so that OFF thing wouldn't appear on the screen. The annotate case is unrelated from what I can see, the preference for the disassembler (libcapstone, libllvm or the old method of parsing objdump output) has to start with the most capable, and the source code part shows that the current order isn't the best, I'll continue the investigation/tests tomorrow and will come up with a series of patches for the problems reported. For reference, from 'perf-config' man page: annotate.*:: These are in control of addresses, jump function, source code in lines of assembly code from a specific program. annotate.disassemblers:: Choose the disassembler to use: "objdump", "llvm", "capstone", if not specified it will first try, if available, the "llvm" one, then, if it fails, "capstone", and finally the original "objdump" based one. Choosing a different one is useful when handling some feature that is known to be best support at some point by one of the options, to compare the output when in doubt about some bug, etc. This can be a list, in order of preference, the first one that works finishes the process. A quick test here with perf top without setting the above ends up with the behaviour that Ingo reported, no source code is shown and 's' doesn't work. Then if I set it to use objdump: root@number:~# perf config annotate.disassemblers=objdump root@number:~# perf config annotate.disassemblers annotate.disassemblers=objdump root@number:~# cat ~/.perfconfig # this file is auto-generated. [annotate] disassemblers = objdump root@number:~# It takes longer to show the annotated output but there is source code and 's' works. - Arnaldo [-- Attachment #2: tip-perf-version-build-options.patch --] [-- Type: text/plain, Size: 4235 bytes --] diff --git a/tools/perf/builtin-check.c b/tools/perf/builtin-check.c index 61a11a9b4e7594bf..e2cfff18de299367 100644 --- a/tools/perf/builtin-check.c +++ b/tools/perf/builtin-check.c @@ -43,7 +43,7 @@ struct feature_status supported_features[] = { FEATURE_STATUS("libpython", HAVE_LIBPYTHON_SUPPORT), FEATURE_STATUS("libslang", HAVE_SLANG_SUPPORT), FEATURE_STATUS("libtraceevent", HAVE_LIBTRACEEVENT), - FEATURE_STATUS("libunwind", HAVE_LIBUNWIND_SUPPORT), + FEATURE_STATUS_TIP("libunwind", HAVE_LIBUNWIND_SUPPORT, "Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it"), FEATURE_STATUS("lzma", HAVE_LZMA_SUPPORT), FEATURE_STATUS("numa_num_possible_cpus", HAVE_LIBNUMA_SUPPORT), FEATURE_STATUS("zlib", HAVE_ZLIB_SUPPORT), @@ -65,22 +65,17 @@ static void on_off_print(const char *status) printf(" ]"); } -/* Helper function to print status of a feature along with name/macro */ -static void status_print(const char *name, const char *macro, - const char *status) +void feature_status__printf(const struct feature_status *feature) { - printf("%22s: ", name); - on_off_print(status); - printf(" # %s\n", macro); -} + printf("%22s: ", feature->name); + on_off_print(feature->is_builtin ? "on" : "OFF"); + printf(" # %s", feature->macro); + + if (!feature->is_builtin && feature->tip) + printf(" ( tip: %s )", feature->tip); -#define STATUS(feature) \ -do { \ - if (feature.is_builtin) \ - status_print(feature.name, feature.macro, "on"); \ - else \ - status_print(feature.name, feature.macro, "OFF"); \ -} while (0) + putchar('\n'); +} /** * check whether "feature" is built-in with perf @@ -95,7 +90,7 @@ static int has_support(const char *feature) if ((strcasecmp(feature, supported_features[i].name) == 0) || (strcasecmp(feature, supported_features[i].macro) == 0)) { if (!quiet) - STATUS(supported_features[i]); + feature_status__printf(&supported_features[i]); return supported_features[i].is_builtin; } } diff --git a/tools/perf/builtin-version.c b/tools/perf/builtin-version.c index e149d96c6dc57c22..10f25c6705b117a2 100644 --- a/tools/perf/builtin-version.c +++ b/tools/perf/builtin-version.c @@ -26,38 +26,10 @@ static const char * const version_usage[] = { NULL }; -static void on_off_print(const char *status) -{ - printf("[ "); - - if (!strcmp(status, "OFF")) - color_fprintf(stdout, PERF_COLOR_RED, "%-3s", status); - else - color_fprintf(stdout, PERF_COLOR_GREEN, "%-3s", status); - - printf(" ]"); -} - -static void status_print(const char *name, const char *macro, - const char *status) -{ - printf("%22s: ", name); - on_off_print(status); - printf(" # %s\n", macro); -} - -#define STATUS(feature) \ -do { \ - if (feature.is_builtin) \ - status_print(feature.name, feature.macro, "on"); \ - else \ - status_print(feature.name, feature.macro, "OFF"); \ -} while (0) - static void library_status(void) { for (int i = 0; supported_features[i].name; ++i) - STATUS(supported_features[i]); + feature_status__printf(&supported_features[i]); } int cmd_version(int argc, const char **argv) diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h index a07e93c5384878d3..51f8f1f26813fb32 100644 --- a/tools/perf/builtin.h +++ b/tools/perf/builtin.h @@ -5,6 +5,7 @@ struct feature_status { const char *name; const char *macro; + const char *tip; int is_builtin; }; @@ -13,7 +14,16 @@ struct feature_status { .macro = #macro_, \ .is_builtin = IS_BUILTIN(macro_) } +#define FEATURE_STATUS_TIP(name_, macro_, tip_) { \ + .name = name_, \ + .macro = #macro_, \ + .tip = tip_, \ + .is_builtin = IS_BUILTIN(macro_) } + extern struct feature_status supported_features[]; + +void feature_status__printf(const struct feature_status *feature); + struct cmdnames; void list_common_cmds_help(void); ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-08 0:54 ` Arnaldo Carvalho de Melo @ 2025-04-08 6:16 ` Namhyung Kim 2025-04-09 3:26 ` Arnaldo Carvalho de Melo 2025-04-08 8:05 ` Ingo Molnar 1 sibling, 1 reply; 37+ messages in thread From: Namhyung Kim @ 2025-04-08 6:16 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Ingo Molnar, Howard Chu, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Mon, Apr 07, 2025 at 09:54:47PM -0300, Arnaldo Carvalho de Melo wrote: > On Mon, Apr 07, 2025 at 06:58:28PM +0200, Ingo Molnar wrote: > > * Howard Chu <howardchu95@gmail.com> wrote: > > > > > I tried to install the libunwind-19-dev package, I tried to > > > > uninstall/reinstall - no combination seems to work. > > > > > > $ apt list --installed | grep libunwind > > > > > > WARNING: apt does not have a stable CLI interface. Use with caution in scripts. > > > > > > libunwind-dev/oracular,now 1.6.2-3.1 amd64 [installed] > > > libunwind8/oracular,now 1.6.2-3.1 amd64 [installed,automatic] > > > > > > Built perf with make clean install, my libunwind is on. > > > > > > ... libunwind: [ on ] > > > > I seem to have the exact same build environment: > > > > starship:~> apt list --installed | grep libunwind > > > > WARNING: apt does not have a stable CLI interface. Use with caution in scripts. > > > > libunwind-dev/oracular,now 1.6.2-3.1 amd64 [installed] > > libunwind8/oracular,now 1.6.2-3.1 amd64 [installed] > > > > Yet: > > > > Makefile.config:1142: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel > > Makefile.config:1155: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev > > Makefile.config:1187: libtracefs is missing. Please install libtracefs-dev/libtracefs-devel > > > > Auto-detecting system features: > > ... libdw: [ on ] > > ... glibc: [ on ] > > ... libbfd: [ on ] > > ... libbfd-buildid: [ on ] > > ... libelf: [ on ] > > ... libnuma: [ on ] > > ... numa_num_possible_cpus: [ on ] > > ... libperl: [ on ] > > ... libpython: [ on ] > > ... libcrypto: [ on ] > > ... libunwind: [ OFF ] > > ... libcapstone: [ on ] > > ... llvm-perf: [ on ] > > ... zlib: [ on ] > > ... lzma: [ on ] > > ... get_cpuid: [ on ] > > ... bpf: [ on ] > > ... libaio: [ on ] > > ... libzstd: [ on ] > > > > But when I ran perf check, > > > > $ perf check feature libunwind > > > libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT > > > > libunwind is OFF. > > > Same here. > > So, with a patch I have (attached) here this turns into: > > ⬢ [acme@toolbox perf-tools-next]$ perf check feature libunwind > libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it ) > ⬢ [acme@toolbox perf-tools-next]$ > > Then I follow the suggestion and install libunwind-devel and build with > LIBUNWIND=1: > > $ make -k LIBUNWIND=1 O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin > <SNIP> > Auto-detecting system features: > ... libdw: [ on ] > ... glibc: [ on ] > ... libbfd: [ on ] > ... libbfd-buildid: [ on ] > ... libelf: [ on ] > ... libnuma: [ on ] > ... numa_num_possible_cpus: [ on ] > ... libperl: [ on ] > ... libpython: [ on ] > ... libcrypto: [ on ] > ... libunwind: [ on ] > ... libcapstone: [ on ] > ... llvm-perf: [ on ] > ... zlib: [ on ] > ... lzma: [ on ] > ... get_cpuid: [ on ] > ... bpf: [ on ] > ... libaio: [ on ] > ... libzstd: [ on ] > > ⬢ [acme@toolbox perf-tools-next]$ ldd ~/bin/perf | grep unwind > libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007f434eaea000) > libunwind.so.8 => /lib64/libunwind.so.8 (0x00007f434ead0000) > ⬢ [acme@toolbox perf-tools-next]$ > > I need to have that hooked up to tools/build/Makefile.feature, where it > would find that tip somewhere, i.e. that "Deprecated, use LIBUNWIND=1 > and install libunwind-dev[el] to build with it". > > This all would not happen if when libunwind was turned into an opt-in > the default output for it had been removed, so that OFF thing wouldn't > appear on the screen. > > The annotate case is unrelated from what I can see, the preference for > the disassembler (libcapstone, libllvm or the old method of parsing > objdump output) has to start with the most capable, and the source code > part shows that the current order isn't the best, I'll continue the > investigation/tests tomorrow and will come up with a series of patches > for the problems reported. > > For reference, from 'perf-config' man page: > > annotate.*:: > These are in control of addresses, jump function, source code > in lines of assembly code from a specific program. > > annotate.disassemblers:: > Choose the disassembler to use: "objdump", "llvm", "capstone", > if not specified it will first try, if available, the "llvm" one, > then, if it fails, "capstone", and finally the original "objdump" > based one. > > Choosing a different one is useful when handling some feature that > is known to be best support at some point by one of the options, > to compare the output when in doubt about some bug, etc. > > This can be a list, in order of preference, the first one that works > finishes the process. > > A quick test here with perf top without setting the above ends up with > the behaviour that Ingo reported, no source code is shown and 's' > doesn't work. Then if I set it to use objdump: > > root@number:~# perf config annotate.disassemblers=objdump > root@number:~# perf config annotate.disassemblers > annotate.disassemblers=objdump > root@number:~# cat ~/.perfconfig > # this file is auto-generated. > [annotate] > disassemblers = objdump > root@number:~# > > It takes longer to show the annotated output but there is source code > and 's' works. Thanks for chasing it down, we miss source line support in the disasm using capstone and libllvm. We can add that later but for now we may refresh the result and force to objdump when source line is requested. Thanks, Namhyung > > - Arnaldo > diff --git a/tools/perf/builtin-check.c b/tools/perf/builtin-check.c > index 61a11a9b4e7594bf..e2cfff18de299367 100644 > --- a/tools/perf/builtin-check.c > +++ b/tools/perf/builtin-check.c > @@ -43,7 +43,7 @@ struct feature_status supported_features[] = { > FEATURE_STATUS("libpython", HAVE_LIBPYTHON_SUPPORT), > FEATURE_STATUS("libslang", HAVE_SLANG_SUPPORT), > FEATURE_STATUS("libtraceevent", HAVE_LIBTRACEEVENT), > - FEATURE_STATUS("libunwind", HAVE_LIBUNWIND_SUPPORT), > + FEATURE_STATUS_TIP("libunwind", HAVE_LIBUNWIND_SUPPORT, "Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it"), > FEATURE_STATUS("lzma", HAVE_LZMA_SUPPORT), > FEATURE_STATUS("numa_num_possible_cpus", HAVE_LIBNUMA_SUPPORT), > FEATURE_STATUS("zlib", HAVE_ZLIB_SUPPORT), > @@ -65,22 +65,17 @@ static void on_off_print(const char *status) > printf(" ]"); > } > > -/* Helper function to print status of a feature along with name/macro */ > -static void status_print(const char *name, const char *macro, > - const char *status) > +void feature_status__printf(const struct feature_status *feature) > { > - printf("%22s: ", name); > - on_off_print(status); > - printf(" # %s\n", macro); > -} > + printf("%22s: ", feature->name); > + on_off_print(feature->is_builtin ? "on" : "OFF"); > + printf(" # %s", feature->macro); > + > + if (!feature->is_builtin && feature->tip) > + printf(" ( tip: %s )", feature->tip); > > -#define STATUS(feature) \ > -do { \ > - if (feature.is_builtin) \ > - status_print(feature.name, feature.macro, "on"); \ > - else \ > - status_print(feature.name, feature.macro, "OFF"); \ > -} while (0) > + putchar('\n'); > +} > > /** > * check whether "feature" is built-in with perf > @@ -95,7 +90,7 @@ static int has_support(const char *feature) > if ((strcasecmp(feature, supported_features[i].name) == 0) || > (strcasecmp(feature, supported_features[i].macro) == 0)) { > if (!quiet) > - STATUS(supported_features[i]); > + feature_status__printf(&supported_features[i]); > return supported_features[i].is_builtin; > } > } > diff --git a/tools/perf/builtin-version.c b/tools/perf/builtin-version.c > index e149d96c6dc57c22..10f25c6705b117a2 100644 > --- a/tools/perf/builtin-version.c > +++ b/tools/perf/builtin-version.c > @@ -26,38 +26,10 @@ static const char * const version_usage[] = { > NULL > }; > > -static void on_off_print(const char *status) > -{ > - printf("[ "); > - > - if (!strcmp(status, "OFF")) > - color_fprintf(stdout, PERF_COLOR_RED, "%-3s", status); > - else > - color_fprintf(stdout, PERF_COLOR_GREEN, "%-3s", status); > - > - printf(" ]"); > -} > - > -static void status_print(const char *name, const char *macro, > - const char *status) > -{ > - printf("%22s: ", name); > - on_off_print(status); > - printf(" # %s\n", macro); > -} > - > -#define STATUS(feature) \ > -do { \ > - if (feature.is_builtin) \ > - status_print(feature.name, feature.macro, "on"); \ > - else \ > - status_print(feature.name, feature.macro, "OFF"); \ > -} while (0) > - > static void library_status(void) > { > for (int i = 0; supported_features[i].name; ++i) > - STATUS(supported_features[i]); > + feature_status__printf(&supported_features[i]); > } > > int cmd_version(int argc, const char **argv) > diff --git a/tools/perf/builtin.h b/tools/perf/builtin.h > index a07e93c5384878d3..51f8f1f26813fb32 100644 > --- a/tools/perf/builtin.h > +++ b/tools/perf/builtin.h > @@ -5,6 +5,7 @@ > struct feature_status { > const char *name; > const char *macro; > + const char *tip; > int is_builtin; > }; > > @@ -13,7 +14,16 @@ struct feature_status { > .macro = #macro_, \ > .is_builtin = IS_BUILTIN(macro_) } > > +#define FEATURE_STATUS_TIP(name_, macro_, tip_) { \ > + .name = name_, \ > + .macro = #macro_, \ > + .tip = tip_, \ > + .is_builtin = IS_BUILTIN(macro_) } > + > extern struct feature_status supported_features[]; > + > +void feature_status__printf(const struct feature_status *feature); > + > struct cmdnames; > > void list_common_cmds_help(void); ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-08 6:16 ` Namhyung Kim @ 2025-04-09 3:26 ` Arnaldo Carvalho de Melo 2025-04-10 20:48 ` Namhyung Kim 0 siblings, 1 reply; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-09 3:26 UTC (permalink / raw) To: Namhyung Kim Cc: Ingo Molnar, Howard Chu, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Mon, Apr 07, 2025 at 11:16:04PM -0700, Namhyung Kim wrote: > On Mon, Apr 07, 2025 at 09:54:47PM -0300, Arnaldo Carvalho de Melo wrote: > > A quick test here with perf top without setting the above ends up with > > the behaviour that Ingo reported, no source code is shown and 's' > > doesn't work. Then if I set it to use objdump: > > root@number:~# perf config annotate.disassemblers=objdump > > root@number:~# perf config annotate.disassemblers > > annotate.disassemblers=objdump > > root@number:~# cat ~/.perfconfig > > # this file is auto-generated. > > [annotate] > > disassemblers = objdump > > root@number:~# > > It takes longer to show the annotated output but there is source code > > and 's' works. > Thanks for chasing it down, we miss source line support in the disasm > using capstone and libllvm. We can add that later but for now we may > refresh the result and force to objdump when source line is requested. That is a good idea, I thought about it as well and will check how to do it on top of what I have at: https://web.git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf-annotate%2bbuild - Arnaldo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-09 3:26 ` Arnaldo Carvalho de Melo @ 2025-04-10 20:48 ` Namhyung Kim 2025-04-10 20:54 ` Ingo Molnar 2025-04-24 12:37 ` Arnaldo Carvalho de Melo 0 siblings, 2 replies; 37+ messages in thread From: Namhyung Kim @ 2025-04-10 20:48 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Ingo Molnar, Howard Chu, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Wed, Apr 09, 2025 at 12:26:56AM -0300, Arnaldo Carvalho de Melo wrote: > On Mon, Apr 07, 2025 at 11:16:04PM -0700, Namhyung Kim wrote: > > On Mon, Apr 07, 2025 at 09:54:47PM -0300, Arnaldo Carvalho de Melo wrote: > > > A quick test here with perf top without setting the above ends up with > > > the behaviour that Ingo reported, no source code is shown and 's' > > > doesn't work. Then if I set it to use objdump: > > > > root@number:~# perf config annotate.disassemblers=objdump > > > root@number:~# perf config annotate.disassemblers > > > annotate.disassemblers=objdump > > > root@number:~# cat ~/.perfconfig > > > # this file is auto-generated. > > > [annotate] > > > disassemblers = objdump > > > root@number:~# > > > > It takes longer to show the annotated output but there is source code > > > and 's' works. > > > Thanks for chasing it down, we miss source line support in the disasm > > using capstone and libllvm. We can add that later but for now we may > > refresh the result and force to objdump when source line is requested. > > That is a good idea, I thought about it as well and will check how to do > it on top of what I have at: > > https://web.git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf-annotate%2bbuild How about this? From ce37ae9008ec7c6ab66ee1b6632a3c51bc0802f3 Mon Sep 17 00:00:00 2001 From: Namhyung Kim <namhyung@kernel.org> Date: Thu, 10 Apr 2025 13:42:16 -0700 Subject: [PATCH] perf annotate: Fix source code annotate with objdump Recently it uses llvm and capstone to speed up annotation or disassembly of instructions. But they don't support source code view yet. Until it fixed, we can force to use objdump for source code annotation. To prevent performance loss, it's disabled by default and turned it on when user requests it in TUI by pressing 's' key. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> --- tools/perf/ui/browsers/annotate.c | 43 +++++++++++++++++++++++++++++-- tools/perf/util/annotate.c | 2 ++ tools/perf/util/annotate.h | 1 + tools/perf/util/disasm.c | 7 +++++ 4 files changed, 51 insertions(+), 2 deletions(-) diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c index ab776b1ed2d5b4ba..fe31069990d77e65 100644 --- a/tools/perf/ui/browsers/annotate.c +++ b/tools/perf/ui/browsers/annotate.c @@ -345,6 +345,23 @@ static void annotate_browser__calc_percent(struct annotate_browser *browser, browser->curr_hot = rb_last(&browser->entries); } +static struct annotation_line *annotate_browser__find_new_asm_line( + struct annotate_browser *browser, + int idx_asm) +{ + struct annotation_line *al; + struct list_head *head = browser->b.entries; + + /* find an annotation line in the new list with the same idx_asm */ + list_for_each_entry(al, head, node) { + if (al->idx_asm == idx_asm) + return al; + } + + /* There are no asm lines */ + return NULL; +} + static struct annotation_line *annotate_browser__find_next_asm_line( struct annotate_browser *browser, struct annotation_line *al) @@ -368,7 +385,8 @@ static struct annotation_line *annotate_browser__find_next_asm_line( return NULL; } -static bool annotate_browser__toggle_source(struct annotate_browser *browser) +static bool annotate_browser__toggle_source(struct annotate_browser *browser, + struct evsel *evsel) { struct annotation *notes = browser__annotation(&browser->b); struct annotation_line *al; @@ -377,6 +395,27 @@ static bool annotate_browser__toggle_source(struct annotate_browser *browser) browser->b.seek(&browser->b, offset, SEEK_CUR); al = list_entry(browser->b.top, struct annotation_line, node); + if (!annotate_opts.annotate_src) + annotate_opts.annotate_src = true; + + if (annotate_opts.hide_src_code && !notes->src->tried_source) { + struct map_symbol *ms = browser->b.priv; + bool orig_entries = notes->src->nr_entries; + int orig_idx_asm = al->idx_asm; + + notes->src->tried_source = true; + + /* annotate again with source code info */ + annotate_opts.hide_src_code = false; + annotated_source__purge(notes->src); + symbol__annotate2(ms, evsel, &browser->arch); + annotate_opts.hide_src_code = true; + + browser->b.entries = ¬es->src->source; + al = annotate_browser__find_new_asm_line(browser, orig_idx_asm); + browser->b.seek(&browser->b, al->idx_asm, SEEK_SET); + } + if (annotate_opts.hide_src_code) { if (al->idx_asm < offset) offset = al->idx; @@ -833,7 +872,7 @@ static int annotate_browser__run(struct annotate_browser *browser, nd = browser->curr_hot; break; case 's': - if (annotate_browser__toggle_source(browser)) + if (annotate_browser__toggle_source(browser, evsel)) ui_helpline__puts(help); annotate__scnprintf_title(hists, title, sizeof(title)); annotate_browser__show(&browser->b, title, help); diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c index 264a212b47df850c..0dd475a744b6dfac 100644 --- a/tools/perf/util/annotate.c +++ b/tools/perf/util/annotate.c @@ -1451,6 +1451,7 @@ void annotated_source__purge(struct annotated_source *as) list_del_init(&al->node); disasm_line__free(disasm_line(al)); } + as->tried_source = false; } static size_t disasm_line__fprintf(struct disasm_line *dl, FILE *fp) @@ -2280,6 +2281,7 @@ void annotation_options__init(void) opt->annotate_src = true; opt->offset_level = ANNOTATION__OFFSET_JUMP_TARGETS; opt->percent_type = PERCENT_PERIOD_LOCAL; + opt->hide_src_code = true; opt->hide_src_code_on_title = true; } diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h index bbb89b32f398b3c9..8b5131d257b01e3e 100644 --- a/tools/perf/util/annotate.h +++ b/tools/perf/util/annotate.h @@ -294,6 +294,7 @@ struct annotated_source { int nr_entries; int nr_asm_entries; int max_jump_sources; + bool tried_source; u64 start; struct { u8 addr; diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c index 8f0eb56c6fc66a96..ff475a239f4b0db4 100644 --- a/tools/perf/util/disasm.c +++ b/tools/perf/util/disasm.c @@ -2284,6 +2284,13 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args) } } + /* FIXME: LLVM and CAPSTONE should support source code */ + if (options->annotate_src && !options->hide_src_code) { + err = symbol__disassemble_objdump(symfs_filename, sym, args); + if (err == 0) + goto out_remove_tmp; + } + err = -1; for (u8 i = 0; i < ARRAY_SIZE(options->disassemblers) && err != 0; i++) { enum perf_disassembler dis = options->disassemblers[i]; -- 2.49.0.604.gff1f9ca942-goog ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-10 20:48 ` Namhyung Kim @ 2025-04-10 20:54 ` Ingo Molnar 2025-04-24 12:37 ` Arnaldo Carvalho de Melo 1 sibling, 0 replies; 37+ messages in thread From: Ingo Molnar @ 2025-04-10 20:54 UTC (permalink / raw) To: Namhyung Kim Cc: Arnaldo Carvalho de Melo, Howard Chu, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov * Namhyung Kim <namhyung@kernel.org> wrote: > On Wed, Apr 09, 2025 at 12:26:56AM -0300, Arnaldo Carvalho de Melo wrote: > > On Mon, Apr 07, 2025 at 11:16:04PM -0700, Namhyung Kim wrote: > > > On Mon, Apr 07, 2025 at 09:54:47PM -0300, Arnaldo Carvalho de Melo wrote: > > > > A quick test here with perf top without setting the above ends up with > > > > the behaviour that Ingo reported, no source code is shown and 's' > > > > doesn't work. Then if I set it to use objdump: > > > > > > root@number:~# perf config annotate.disassemblers=objdump > > > > root@number:~# perf config annotate.disassemblers > > > > annotate.disassemblers=objdump > > > > root@number:~# cat ~/.perfconfig > > > > # this file is auto-generated. > > > > [annotate] > > > > disassemblers = objdump > > > > root@number:~# > > > > > > It takes longer to show the annotated output but there is source code > > > > and 's' works. > > > > > Thanks for chasing it down, we miss source line support in the disasm > > > using capstone and libllvm. We can add that later but for now we may > > > refresh the result and force to objdump when source line is requested. > > > > That is a good idea, I thought about it as well and will check how to do > > it on top of what I have at: > > > > https://web.git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf-annotate%2bbuild > > How about this? > > > From ce37ae9008ec7c6ab66ee1b6632a3c51bc0802f3 Mon Sep 17 00:00:00 2001 > From: Namhyung Kim <namhyung@kernel.org> > Date: Thu, 10 Apr 2025 13:42:16 -0700 > Subject: [PATCH] perf annotate: Fix source code annotate with objdump > > Recently it uses llvm and capstone to speed up annotation or disassembly > of instructions. But they don't support source code view yet. Until it > fixed, we can force to use objdump for source code annotation. > > To prevent performance loss, it's disabled by default and turned it on > when user requests it in TUI by pressing 's' key. That's a good compromise IMO, thank you for implementing it! Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-10 20:48 ` Namhyung Kim 2025-04-10 20:54 ` Ingo Molnar @ 2025-04-24 12:37 ` Arnaldo Carvalho de Melo 1 sibling, 0 replies; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-24 12:37 UTC (permalink / raw) To: Namhyung Kim Cc: Ingo Molnar, Howard Chu, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Thu, Apr 10, 2025 at 01:48:28PM -0700, Namhyung Kim wrote: > On Wed, Apr 09, 2025 at 12:26:56AM -0300, Arnaldo Carvalho de Melo wrote: > > On Mon, Apr 07, 2025 at 11:16:04PM -0700, Namhyung Kim wrote: > > > On Mon, Apr 07, 2025 at 09:54:47PM -0300, Arnaldo Carvalho de Melo wrote: > > > > A quick test here with perf top without setting the above ends up with > > > > the behaviour that Ingo reported, no source code is shown and 's' > > > > doesn't work. Then if I set it to use objdump: > > > > > > root@number:~# perf config annotate.disassemblers=objdump > > > > root@number:~# perf config annotate.disassemblers > > > > annotate.disassemblers=objdump > > > > root@number:~# cat ~/.perfconfig > > > > # this file is auto-generated. > > > > [annotate] > > > > disassemblers = objdump > > > > root@number:~# > > > > > > It takes longer to show the annotated output but there is source code > > > > and 's' works. > > > > > Thanks for chasing it down, we miss source line support in the disasm > > > using capstone and libllvm. We can add that later but for now we may > > > refresh the result and force to objdump when source line is requested. > > > > That is a good idea, I thought about it as well and will check how to do > > it on top of what I have at: > > > > https://web.git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=perf-annotate%2bbuild > > How about this? > > >From ce37ae9008ec7c6ab66ee1b6632a3c51bc0802f3 Mon Sep 17 00:00:00 2001 > From: Namhyung Kim <namhyung@kernel.org> > Date: Thu, 10 Apr 2025 13:42:16 -0700 > Subject: [PATCH] perf annotate: Fix source code annotate with objdump > > Recently it uses llvm and capstone to speed up annotation or disassembly > of instructions. But they don't support source code view yet. Until it > fixed, we can force to use objdump for source code annotation. > > To prevent performance loss, it's disabled by default and turned it on > when user requests it in TUI by pressing 's' key. > > Reported-by: Ingo Molnar <mingo@kernel.org> > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > --- > tools/perf/ui/browsers/annotate.c | 43 +++++++++++++++++++++++++++++-- > tools/perf/util/annotate.c | 2 ++ > tools/perf/util/annotate.h | 1 + > tools/perf/util/disasm.c | 7 +++++ > 4 files changed, 51 insertions(+), 2 deletions(-) > > diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c > index ab776b1ed2d5b4ba..fe31069990d77e65 100644 > --- a/tools/perf/ui/browsers/annotate.c > +++ b/tools/perf/ui/browsers/annotate.c > @@ -345,6 +345,23 @@ static void annotate_browser__calc_percent(struct annotate_browser *browser, > browser->curr_hot = rb_last(&browser->entries); > } > > +static struct annotation_line *annotate_browser__find_new_asm_line( > + struct annotate_browser *browser, > + int idx_asm) > +{ > + struct annotation_line *al; > + struct list_head *head = browser->b.entries; > + > + /* find an annotation line in the new list with the same idx_asm */ > + list_for_each_entry(al, head, node) { > + if (al->idx_asm == idx_asm) > + return al; > + } > + > + /* There are no asm lines */ > + return NULL; > +} > + > static struct annotation_line *annotate_browser__find_next_asm_line( > struct annotate_browser *browser, > struct annotation_line *al) > @@ -368,7 +385,8 @@ static struct annotation_line *annotate_browser__find_next_asm_line( > return NULL; > } > > -static bool annotate_browser__toggle_source(struct annotate_browser *browser) > +static bool annotate_browser__toggle_source(struct annotate_browser *browser, > + struct evsel *evsel) > { > struct annotation *notes = browser__annotation(&browser->b); > struct annotation_line *al; > @@ -377,6 +395,27 @@ static bool annotate_browser__toggle_source(struct annotate_browser *browser) > browser->b.seek(&browser->b, offset, SEEK_CUR); > al = list_entry(browser->b.top, struct annotation_line, node); > > + if (!annotate_opts.annotate_src) > + annotate_opts.annotate_src = true; > + > + if (annotate_opts.hide_src_code && !notes->src->tried_source) { > + struct map_symbol *ms = browser->b.priv; > + bool orig_entries = notes->src->nr_entries; CC /tmp/build/perf-tools-next/ui/browsers/annotate.o AR /tmp/build/perf-tools-next/libpmu-events.a AR /tmp/build/perf-tools-next/libperf-test.a AR /tmp/build/perf-tools-next/libperf-util.a ui/browsers/annotate.c: In function ‘annotate_browser__toggle_source’: ui/browsers/annotate.c:403:22: error: unused variable ‘orig_entries’ [-Werror=unused-variable] 403 | bool orig_entries = notes->src->nr_entries; | ^~~~~~~~~~~~ cc1: all warnings being treated as errors make[5]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:85: /tmp/build/perf-tools-next/ui/browsers/annotate.o] Error 1 make[4]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:142: browsers] Error 2 make[3]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:142: ui] Error 2 make[2]: *** [Makefile.perf:792: /tmp/build/perf-tools-next/perf-ui-in.o] Error 2 make[1]: *** [Makefile.perf:290: sub-make] Error 2 make: *** [Makefile:119: install-bin] Error 2 make: Leaving directory '/home/acme/git/perf-tools-next/tools/perf' ⬢ [acme@toolbox perf-tools-next]$ I'm applying this on top, ok? diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c index fe31069990d77e65..b400ac032dcba9a6 100644 --- a/tools/perf/ui/browsers/annotate.c +++ b/tools/perf/ui/browsers/annotate.c @@ -400,7 +400,6 @@ static bool annotate_browser__toggle_source(struct annotate_browser *browser, if (annotate_opts.hide_src_code && !notes->src->tried_source) { struct map_symbol *ms = browser->b.priv; - bool orig_entries = notes->src->nr_entries; int orig_idx_asm = al->idx_asm; notes->src->tried_source = true; Nah, this part needs checking: + browser->b.entries = ¬es->src->source; + al = annotate_browser__find_new_asm_line(browser, orig_idx_asm); + browser->b.seek(&browser->b, al->idx_asm, SEEK_SET); As annotate_browser__find_new_asm_line() can return NULL. Can you please resubmit with these changes? Thanks, - Arnaldo > + int orig_idx_asm = al->idx_asm; > + > + notes->src->tried_source = true; > + > + /* annotate again with source code info */ > + annotate_opts.hide_src_code = false; > + annotated_source__purge(notes->src); > + symbol__annotate2(ms, evsel, &browser->arch); > + annotate_opts.hide_src_code = true; > + > + browser->b.entries = ¬es->src->source; > + al = annotate_browser__find_new_asm_line(browser, orig_idx_asm); > + browser->b.seek(&browser->b, al->idx_asm, SEEK_SET); > + } > + > if (annotate_opts.hide_src_code) { > if (al->idx_asm < offset) > offset = al->idx; > @@ -833,7 +872,7 @@ static int annotate_browser__run(struct annotate_browser *browser, > nd = browser->curr_hot; > break; > case 's': > - if (annotate_browser__toggle_source(browser)) > + if (annotate_browser__toggle_source(browser, evsel)) > ui_helpline__puts(help); > annotate__scnprintf_title(hists, title, sizeof(title)); > annotate_browser__show(&browser->b, title, help); > diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c > index 264a212b47df850c..0dd475a744b6dfac 100644 > --- a/tools/perf/util/annotate.c > +++ b/tools/perf/util/annotate.c > @@ -1451,6 +1451,7 @@ void annotated_source__purge(struct annotated_source *as) > list_del_init(&al->node); > disasm_line__free(disasm_line(al)); > } > + as->tried_source = false; > } > > static size_t disasm_line__fprintf(struct disasm_line *dl, FILE *fp) > @@ -2280,6 +2281,7 @@ void annotation_options__init(void) > opt->annotate_src = true; > opt->offset_level = ANNOTATION__OFFSET_JUMP_TARGETS; > opt->percent_type = PERCENT_PERIOD_LOCAL; > + opt->hide_src_code = true; > opt->hide_src_code_on_title = true; > } > > diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h > index bbb89b32f398b3c9..8b5131d257b01e3e 100644 > --- a/tools/perf/util/annotate.h > +++ b/tools/perf/util/annotate.h > @@ -294,6 +294,7 @@ struct annotated_source { > int nr_entries; > int nr_asm_entries; > int max_jump_sources; > + bool tried_source; > u64 start; > struct { > u8 addr; > diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c > index 8f0eb56c6fc66a96..ff475a239f4b0db4 100644 > --- a/tools/perf/util/disasm.c > +++ b/tools/perf/util/disasm.c > @@ -2284,6 +2284,13 @@ int symbol__disassemble(struct symbol *sym, struct annotate_args *args) > } > } > > + /* FIXME: LLVM and CAPSTONE should support source code */ > + if (options->annotate_src && !options->hide_src_code) { > + err = symbol__disassemble_objdump(symfs_filename, sym, args); > + if (err == 0) > + goto out_remove_tmp; > + } > + > err = -1; > for (u8 i = 0; i < ARRAY_SIZE(options->disassemblers) && err != 0; i++) { > enum perf_disassembler dis = options->disassemblers[i]; > -- > 2.49.0.604.gff1f9ca942-goog > ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-08 0:54 ` Arnaldo Carvalho de Melo 2025-04-08 6:16 ` Namhyung Kim @ 2025-04-08 8:05 ` Ingo Molnar 2025-04-09 2:23 ` Arnaldo Carvalho de Melo 1 sibling, 1 reply; 37+ messages in thread From: Ingo Molnar @ 2025-04-08 8:05 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Howard Chu, Namhyung Kim, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > So, with a patch I have (attached) here this turns into: > > ⬢ [acme@toolbox perf-tools-next]$ perf check feature libunwind > libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it ) > ⬢ [acme@toolbox perf-tools-next]$ I can confirm that on one system with your fix applied there's no 'OFF' entry anymore: ... libdw: [ on ] ... glibc: [ on ] ... libbfd: [ on ] ... libbfd-buildid: [ on ] ... libelf: [ on ] ... libnuma: [ on ] ... numa_num_possible_cpus: [ on ] ... libperl: [ on ] ... libpython: [ on ] ... libcrypto: [ on ] ... libunwind: [ on ] ... libcapstone: [ on ] ... llvm-perf: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ on ] ... bpf: [ on ] ... libaio: [ on ] ... libzstd: [ on ] But on another, similar Ubuntu system I still get: ... libunwind: [ OFF ] kepler:~/tip> dpkg -l | grep -E 'capstone|unwind|libdw' ii libcapstone-dev:amd64 4.0.2-5.1build1 amd64 lightweight multi-architecture disassembly framework - devel files ii libcapstone4:amd64 4.0.2-5.1build1 amd64 lightweight multi-architecture disassembly framework - library ii libdw-dev:amd64 0.191-2ubuntu0.1 amd64 libdw1t64 development libraries and header files ii libdw1t64:amd64 0.191-2ubuntu0.1 amd64 library that provides access to the DWARF debug information ii libunwind-dev:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - development ii libunwind8:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - runtime And I have your fix applied: kepler:~/tip> git diff --stat tools/perf/builtin-check.c | 27 +++++++++++---------------- tools/perf/builtin-version.c | 30 +----------------------------- tools/perf/builtin.h | 10 ++++++++++ 3 files changed, 22 insertions(+), 45 deletions(-) kepler:~/tip> perf check feature libunwind libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it ) > The annotate case is unrelated from what I can see, the preference > for the disassembler (libcapstone, libllvm or the old method of > parsing objdump output) has to start with the most capable, and the > source code part shows that the current order isn't the best, I'll > continue the investigation/tests tomorrow and will come up with a > series of patches for the problems reported. But I was able to test the annotation behavior with annotate.disassemblers=objdump: > For reference, from 'perf-config' man page: > > annotate.*:: > These are in control of addresses, jump function, source code > in lines of assembly code from a specific program. > > annotate.disassemblers:: > Choose the disassembler to use: "objdump", "llvm", "capstone", > if not specified it will first try, if available, the "llvm" one, > then, if it fails, "capstone", and finally the original "objdump" > based one. > > Choosing a different one is useful when handling some feature that > is known to be best support at some point by one of the options, > to compare the output when in doubt about some bug, etc. > > This can be a list, in order of preference, the first one that works > finishes the process. > > A quick test here with perf top without setting the above ends up with > the behaviour that Ingo reported, no source code is shown and 's' > doesn't work. Then if I set it to use objdump: > > root@number:~# perf config annotate.disassemblers=objdump > root@number:~# perf config annotate.disassemblers > annotate.disassemblers=objdump > root@number:~# cat ~/.perfconfig > # this file is auto-generated. > [annotate] > disassemblers = objdump > root@number:~# > > It takes longer to show the annotated output but there is source code > and 's' works. I can confirm that this works around the annotation bug, and I can also confirm that objdump is pretty slow, 'perf top' just appears to hang for many seconds. :-) BTW., if objdump source-annotation is the best output we have right now, then it would be nice to somehow visually distinguish source code versus disassembly. Right now it's all the same color and flows together on the screen. Maybe we could gray out source code lines or so? Also, the source code appears to lose its horizontal indentation levels: 0.25 sub $0xffffffff8495b8a0,%rax 0.20 sar $0x3,%rax 0.22 imul %rdx,%rax __debug_atomic_inc(lock_class_ops[idx]); 0.59 movslq %eax,%rbx 0.38 incq %gs:-0x7b9433f8(,%rbx,8) class_idx = class - lock_classes; if (depth && !sync) { /* we're holding locks and the new held lock is not a sync */ hlock = curr->held_locks + depth - 1; 0.39 mov oops_in_progress,%edx class_idx = class - lock_classes; 0.26 mov 0xeb8(%r14),%ecx 0.25 mov %ecx,(%rsp) hlock = curr->held_locks + depth - 1; 0.25 test %edx,%edx ↓ jne c3 0.26 cmp $0x2f,%ecx 0.00 ↓ ja fc7 if (!references) references++; if (!hlock->references) hlock->references++; 0.26 c3: mov (%rsp),%ecx 0.26 lea 0x0(,%rcx,8),%r10 0.13 mov %rcx,%rsi 0.26 sub %rcx,%r10 0.26 lea 0xec0(%r14),%rcx 0.13 shl $0x3,%r10 if (!hlock->references) I suspect a lot of the formatting is up to objdump, but we should at least be able to print all out extra 's' output a visually more distinctive fashion? But if we had full control over annotation output, I think a popular annotation output style would be to leave the original source code formatting and indentation. Finally, here's a higher level UI design discussion of current perf top/report behavior: perf top/report IMHO frequently violates core principles of good TUI interfaces: - There's no guaranteed feedback on keypresses: - If I press an unbound key it just ignores it, and I keep wondering whether I mis-typed, misremembered the key binding, whether the keybinding doesn't work, or wether it's a keybinding with 'silent' behavior. - There are keybindings with 'silent' behavior: for example 'P' (print current annotation) doesn't update anything in the status line. A good TUI would do that, and it would also show something visible on repeat keypresses of the same key. - Basically I consider it an UI bug if the tool does not give me visual feedback on *any* keypress. If a user types something, it's very much intended: arrow keys move up and down or enter/exit functions, 'q' will quit, etc. There's *always* expected behavior when a user presses a key - and it's basically an UI bug if that is ignored for any reason. - On long processing steps sometimes there's no progress bar, the tool just 'hangs'. In some cases this is implemented: such as the initial parsing of perf.data. In others it's not: such as the objdump annotation output. (Which is default-off, so this isn't really a bugreport.) - Sometimes there's nonsensical UI flows with non-intuitive outcomes. An example I noticed is the 'zoom in' behavior in perf report/top: - If in the default 'perf report' starting screen of a short profiling session I enter into a function name on the screen, I get these choices: Annotate file_cache_slot::get_next_line(char**, long*) Zoom into cc1 thread Zoom into cc1 DSO (use the 'k' hotkey to zoom directly into the kernel) Browse map details Run scripts for samples of symbol [file_cache_slot::get_next_line(char**, long*)] Run scripts for all samples Switch to another data file in PWD Exit - If I press 'Zoom into cc1 thread', I get into some sort of filtered mode I am unable to exit via intuitive ways. There's no on-screen hint to make it go away, I have to exit the tool and restart it entirely to get back to the main profiling screen again ... which is obviously a suboptimal UI flow. - To make matters worse, the 'h' screen gives me this hint: ESC Zoom out except it doesn't appear to be working, my profiling output is still limited to cc1 entries. - The *only* way I found to zoom out into the general screen again was to press 'm' for the 'context menu' (I found this not because it was named intuitively, but because I trial-error brute forced the UI keybindings), and there it had: Zoom out of cc1 thread - Yay! But very unintuitive. The intuitive step would have been the left arrow key BTW. (with restoring the cursor position to the cc1 entry I pressed 'Enter' on originally), but that doesn't do anything, it gets ignored. Also, a status line printout that we are filtered to cc1, and how to exit that mode, would be nice as well. - [ Sub-bugreport: at higher levels the 'P' keybinding doesn't print the current screen like on the annotation screen, it goes back a level. Guess how I found out. ;-) To be able to quote the above context menu screen, I copy-pasted the ncurses lines, and manually edited out all the non-printable characters unsuitable for emails. ] I genuinely believe that these "basic profiling workflow" details are *FAR* more important to perf usability and perf popularity than pretty much any of the more esoteric hardware features, scalability enhancements and niche tools I see almost daily progress on - and these basic workflow details affect *everyone* who uses perf... Thanks, Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-08 8:05 ` Ingo Molnar @ 2025-04-09 2:23 ` Arnaldo Carvalho de Melo 2025-04-09 12:19 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-09 2:23 UTC (permalink / raw) To: Ingo Molnar Cc: Howard Chu, Namhyung Kim, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Tue, Apr 08, 2025 at 10:05:15AM +0200, Ingo Molnar wrote: > * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > So, with a patch I have (attached) here this turns into: > > ⬢ [acme@toolbox perf-tools-next]$ perf check feature libunwind > > libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it ) > > ⬢ [acme@toolbox perf-tools-next]$ > I can confirm that on one system with your fix applied there's no 'OFF' > entry anymore: > ... libdw: [ on ] > ... glibc: [ on ] > ... libbfd: [ on ] > ... libbfd-buildid: [ on ] > ... libelf: [ on ] > ... libnuma: [ on ] > ... numa_num_possible_cpus: [ on ] > ... libperl: [ on ] > ... libpython: [ on ] > ... libcrypto: [ on ] > ... libunwind: [ on ] > ... libcapstone: [ on ] > ... llvm-perf: [ on ] > ... zlib: [ on ] > ... lzma: [ on ] > ... get_cpuid: [ on ] > ... bpf: [ on ] > ... libaio: [ on ] > ... libzstd: [ on ] > > But on another, similar Ubuntu system I still get: > > ... libunwind: [ OFF ] Ok, I'm not being able to reproduce this, will try again in more systems, but can you please take a look at what I've made available at the perf-annotate+build branch in my tree at: https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf-annotate+build What is there now: ⬢ [acme@toolbox perf-tools-next]$ git log --oneline v6.15-rc1.. 6b4e380deb02de46 (HEAD -> perf-tools-next) perf ui browser annotate: Don't show the source code view status initially befd9928ac3b0914 perf ui browser annotate: Show in the title the source code view toggle 6c557247d907487e perf ui browser map: Provide feedback on unhandled hotkeys dcee76bd1ef37cd5 perf ui browser hists: Provide feedback on unhandled hotkeys 7cd3b61be22b6111 perf ui browser header: Provide feedback on unhandled hotkeys 4226b944d6cfec62 perf ui browser annotate: Provide feedback on unhandled hotkeys 96f43ea8e198b9cd perf ui browser annotate-data: Provide feedback on unhandled hotkeys 1101a930a674936a perf ui browser: Add a warn on unhandled hotkey helper bdd6ce8eb8c2f2f7 perf ui browser: Add key_name() helper 65185f2cf7af5026 tools build: Don't show libbfd build status as it is opt-in 33da8804f7c00056 perf check: Add tip about building with libbfd using BUILD_NONDISTRO=1 da4de82814f54c6c perf build: Warn when libdebuginfod devel files are not available d5f1c3eb842aaafe tools build: Don't show libunwind build status as it is opt-in 6cf8fa4dcce720f9 perf check: Allow showing a tip for opt-in features not built into perf 07f8bc136355d09a perf check: Move the FEATURE_STATUS() macro to its only user source file a3fb01a0d062d9f4 perf check: Share the feature status printing routine with 'perf version' c2bc1a34f2488b7f tools build: Don't set libunwind as available if test-all.c build succeeds ⬢ [acme@toolbox perf-tools-next]$ Please see below for further commentary. > kepler:~/tip> dpkg -l | grep -E 'capstone|unwind|libdw' > ii libcapstone-dev:amd64 4.0.2-5.1build1 amd64 lightweight multi-architecture disassembly framework - devel files > ii libcapstone4:amd64 4.0.2-5.1build1 amd64 lightweight multi-architecture disassembly framework - library > ii libdw-dev:amd64 0.191-2ubuntu0.1 amd64 libdw1t64 development libraries and header files > ii libdw1t64:amd64 0.191-2ubuntu0.1 amd64 library that provides access to the DWARF debug information > ii libunwind-dev:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - development > ii libunwind8:amd64 1.6.2-3.1 amd64 library to determine the call-chain of a program - runtime > > And I have your fix applied: > > kepler:~/tip> git diff --stat > tools/perf/builtin-check.c | 27 +++++++++++---------------- > tools/perf/builtin-version.c | 30 +----------------------------- > tools/perf/builtin.h | 10 ++++++++++ > 3 files changed, 22 insertions(+), 45 deletions(-) > > kepler:~/tip> perf check feature libunwind > libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it ) Tried on a raspberry pi5 16 16gb with: acme@raspberrypi:~/git/perf-tools-next $ head /etc/os-release PRETTY_NAME="Debian GNU/Linux 12 (bookworm)" NAME="Debian GNU/Linux" VERSION_ID="12" VERSION="12 (bookworm)" VERSION_CODENAME=bookworm ID=debian HOME_URL="https://www.debian.org/" SUPPORT_URL="https://www.debian.org/support" BUG_REPORT_URL="https://bugs.debian.org/" acme@raspberrypi:~/git/perf-tools-next $ acme@raspberrypi:~/git/perf-tools-next $ dpkg -l | grep -E 'capstone|unwind|libdw' ii libcapstone-dev:arm64 4.0.2-5 arm64 lightweight multi-architecture disassembly framework - devel files ii libcapstone4:arm64 4.0.2-5 arm64 lightweight multi-architecture disassembly framework - library ii libdw-dev:arm64 0.188-2.1 arm64 libdw1 development libraries and header files ii libdw1:arm64 0.188-2.1 arm64 library that provides access to the DWARF debug information ii libunwind-16:arm64 1:16.0.6-15~deb12u1 arm64 production-quality unwinder ii libunwind-dev:arm64 1.6.2-3 arm64 library to determine the call-chain of a program - development ii libunwind8:arm64 1.6.2-3 arm64 library to determine the call-chain of a program - runtime acme@raspberrypi:~/git/perf-tools-next $ Auto-detecting system features: ... libdw: [ on ] ... glibc: [ on ] ... libelf: [ on ] ... libnuma: [ on ] ... numa_num_possible_cpus: [ on ] ... libperl: [ on ] ... libpython: [ on ] ... libcrypto: [ on ] ... libcapstone: [ on ] ... llvm-perf: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ OFF ] ... bpf: [ on ] ... libaio: [ on ] ... libzstd: [ on ] GEN /tmp/build/perf-tools-next/common-cmds.h PERF_VERSION = 6.15.rc1.g6b4e380deb02 And then: acme@raspberrypi:~/git/perf-tools-next $ perf check feature libunwind libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it ) acme@raspberrypi:~/git/perf-tools-next $ Ok, lemme try following the tip provided: acme@raspberrypi:~/git/perf-tools-next $ make -k CORESIGHT=1 LIBUNWIND=1 O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin <SNIP> Auto-detecting system features: ... libdw: [ on ] ... glibc: [ on ] ... libelf: [ on ] ... libnuma: [ on ] ... numa_num_possible_cpus: [ on ] ... libperl: [ on ] ... libpython: [ on ] ... libcrypto: [ on ] ... libcapstone: [ on ] ... llvm-perf: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ OFF ] ... bpf: [ on ] ... libaio: [ on ] ... libzstd: [ on ] CC /tmp/build/perf-tools-next/libperf/core.o INSTALL libapi_headers INSTALL libsubcmd_headers INSTALL libsymbol_headers <SNIP> CC /tmp/build/perf-tools-next/util/unwind-libunwind.o util/unwind-libunwind-local.c: In function ‘read_unwind_spec_debug_frame’: util/unwind-libunwind-local.c:374:56: error: expected ‘)’ before ‘{’ token 374 | if (dso__data_get_fd(dso, machine, &fd) { | ~ ^~ | ) util/unwind-libunwind-local.c:423:9: error: expected expression before ‘}’ token 423 | } | ^ util/unwind-libunwind-local.c: At top level: util/unwind-libunwind-local.c:198:12: error: ‘elf_section_offset’ defined but not used [-Werror=unused-function] 198 | static u64 elf_section_offset(int fd, const char *name) | ^~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors make[4]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:85: /tmp/build/perf-tools-next/util/unwind-libunwind-local.o] Error 1 make[4]: *** Waiting for unfinished jobs.... make[3]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:142: util] Error 2 make[2]: *** [Makefile.perf:798: /tmp/build/perf-tools-next/perf-util-in.o] Error 2 make[1]: *** [Makefile.perf:290: sub-make] Error 2 make: *** [Makefile:119: install-bin] Error 2 make: Leaving directory '/home/acme/git/perf-tools-next/tools/perf' acme@raspberrypi:~/git/perf-tools-next $ So now I find a bug (I think I fixed this at some point but the patch didn't get merged? Seems simple enough: acme@raspberrypi:~/git/perf-tools-next $ vim tools/perf/util/unwind-libunwind-local.c acme@raspberrypi:~/git/perf-tools-next $ git diff diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c index 9fb2c1343c7f..0b037e7389a0 100644 --- a/tools/perf/util/unwind-libunwind-local.c +++ b/tools/perf/util/unwind-libunwind-local.c @@ -371,7 +371,7 @@ static int read_unwind_spec_debug_frame(struct dso *dso, * has to be pointed by symsrc_filename */ if (ofs == 0) { - if (dso__data_get_fd(dso, machine, &fd) { + if (dso__data_get_fd(dso, machine, &fd)) { ofs = elf_section_offset(fd, ".debug_frame"); dso__data_put_fd(dso); } acme@raspberrypi:~/git/perf-tools-next $ acme@raspberrypi:~/git/perf-tools-next $ perf check feature libunwind libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT acme@raspberrypi:~/git/perf-tools-next $ Seems to work: acme@raspberrypi:~/git/perf-tools-next $ ldd ~/bin/perf | grep -i -E 'capstone|unwind|llvm' libunwind.so.8 => /lib/aarch64-linux-gnu/libunwind.so.8 (0x00007ffec89c0000) libunwind-aarch64.so.8 => /lib/aarch64-linux-gnu/libunwind-aarch64.so.8 (0x00007ffec8960000) libLLVM-14.so.1 => /lib/aarch64-linux-gnu/libLLVM-14.so.1 (0x00007ffec1180000) libcapstone.so.4 => /lib/aarch64-linux-gnu/libcapstone.so.4 (0x00007ffec0740000) acme@raspberrypi:~/git/perf-tools-next$ > > The annotate case is unrelated from what I can see, the preference > > for the disassembler (libcapstone, libllvm or the old method of > > parsing objdump output) has to start with the most capable, and the > > source code part shows that the current order isn't the best, I'll > > continue the investigation/tests tomorrow and will come up with a > > series of patches for the problems reported. > But I was able to test the annotation behavior with > annotate.disassemblers=objdump: > > For reference, from 'perf-config' man page: > > > > annotate.*:: > > These are in control of addresses, jump function, source code > > in lines of assembly code from a specific program. > > > > annotate.disassemblers:: > > Choose the disassembler to use: "objdump", "llvm", "capstone", > > if not specified it will first try, if available, the "llvm" one, > > then, if it fails, "capstone", and finally the original "objdump" > > based one. > > > > Choosing a different one is useful when handling some feature that > > is known to be best support at some point by one of the options, > > to compare the output when in doubt about some bug, etc. > > > > This can be a list, in order of preference, the first one that works > > finishes the process. > > > > A quick test here with perf top without setting the above ends up with > > the behaviour that Ingo reported, no source code is shown and 's' > > doesn't work. Then if I set it to use objdump: > > > > root@number:~# perf config annotate.disassemblers=objdump > > root@number:~# perf config annotate.disassemblers > > annotate.disassemblers=objdump > > root@number:~# cat ~/.perfconfig > > # this file is auto-generated. > > [annotate] > > disassemblers = objdump > > root@number:~# > > > > It takes longer to show the annotated output but there is source code > > and 's' works. > I can confirm that this works around the annotation bug, and I can also > confirm that objdump is pretty slow, 'perf top' just appears to hang > for many seconds. :-) Note, it is possible to do source code annotation with at least the llvm one, I did investigate it but the developer that contributed the llvm didn't had time to work on it, see the analysis in my message: https://lore.kernel.org/all/ZzT6fjDSUlPzUt0U@x1/T/#u I.e. llvm-objdump _has_ it: -S, --source Intermix source code with disassembly I describe in that message how to use it, etc, haven't tried to check if it is faster than using binutils' objdump :-\ You can try with: ⬢ [acme@toolbox perf-tools-next]$ perf annotate -h objdump Usage: perf annotate [<options>] --objdump <path> objdump binary to use for disassembly and annotations ⬢ [acme@toolbox perf-tools-next]$ The message has more details. > BTW., if objdump source-annotation is the best output we have right > now, then it would be nice to somehow visually distinguish source code > versus disassembly. Right now it's all the same color and flows > together on the screen. Maybe we could gray out source code lines or > so? I'll try something like I did a long time ago in an attempt I plan to revisit in showing what parts of functions are inline expansions, see this screen shot from those times: http://oldvger.kernel.org/~acme/perf/inline_annotate/ipt_do_table.png https://web.git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/log/?h=tmp.perf/inline_annotate > Also, the source code appears to lose its horizontal indentation > levels: > > 0.25 sub $0xffffffff8495b8a0,%rax > 0.20 sar $0x3,%rax > 0.22 imul %rdx,%rax > __debug_atomic_inc(lock_class_ops[idx]); > 0.59 movslq %eax,%rbx > 0.38 incq %gs:-0x7b9433f8(,%rbx,8) > > class_idx = class - lock_classes; > > if (depth && !sync) { > /* we're holding locks and the new held lock is not a sync */ > hlock = curr->held_locks + depth - 1; > 0.39 mov oops_in_progress,%edx > class_idx = class - lock_classes; > 0.26 mov 0xeb8(%r14),%ecx > 0.25 mov %ecx,(%rsp) > hlock = curr->held_locks + depth - 1; > 0.25 test %edx,%edx > ↓ jne c3 > 0.26 cmp $0x2f,%ecx > 0.00 ↓ ja fc7 > if (!references) > references++; > > if (!hlock->references) > hlock->references++; > > 0.26 c3: mov (%rsp),%ecx > 0.26 lea 0x0(,%rcx,8),%r10 > 0.13 mov %rcx,%rsi > 0.26 sub %rcx,%r10 > 0.26 lea 0xec0(%r14),%rcx > 0.13 shl $0x3,%r10 > if (!hlock->references) > > I suspect a lot of the formatting is up to objdump, but we should at > least be able to print all out extra 's' output a visually more > distinctive fashion? I think that objdump just keeps whatever indentation is in the original source code, it just picks it from what is in DWARF, just like 'perf probe' does, etc: root@number:~# perf probe -L icmp_rcv | head -20 <icmp_rcv@/root/.cache/debuginfod_client/aa3c82b4a13f9c0e0301bebb20fe958c4db6f362/source-d5d23b89-#usr#src#debug#kernel-6.13.9#linux-6.13.9-100.fc40.x86_64#net#ipv4#icmp.c:0> 0 int icmp_rcv(struct sk_buff *skb) { 2 enum skb_drop_reason reason = SKB_DROP_REASON_NOT_SPECIFIED; struct rtable *rt = skb_rtable(skb); struct net *net = dev_net_rcu(rt->dst.dev); struct icmphdr *icmph; if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb)) { 8 struct sec_path *sp = skb_sec_path(skb); 9 int nh; if (!(sp && sp->xvec[sp->len - 1]->props.flags & XFRM_STATE_ICMP)) { reason = SKB_DROP_REASON_XFRM_POLICY; goto drop; } 17 if (!pskb_may_pull(skb, sizeof(*icmph) + sizeof(struct iphdr))) goto drop; root@number:~# The instruction part is completely parsed out from the objdump -dS output and put into internal representation that then was reused to get the disassembly from libcapstone and libllvm, no parsing from some tool. So I think we can adjust things and get something better, will check. > But if we had full control over annotation output, I think a popular > annotation output style would be to leave the original source code > formatting and indentation. > > Finally, here's a higher level UI design discussion of current perf > top/report behavior: > > perf top/report IMHO frequently violates core principles of good TUI > interfaces: > > - There's no guaranteed feedback on keypresses: > - If I press an unbound key it just ignores it, and I keep > wondering whether I mis-typed, misremembered the key binding, > whether the keybinding doesn't work, or wether it's a keybinding > with 'silent' behavior. Look at the branch above, there are still things do do, but I'm making progrsess. > - There are keybindings with 'silent' behavior: for example 'P' > (print current annotation) doesn't update anything in the status > line. A good TUI would do that, and it would also show something > visible on repeat keypresses of the same key. That is now the behaviour for 's' > - Basically I consider it an UI bug if the tool does not give me > visual feedback on *any* keypress. If a user types something, > it's very much intended: arrow keys move up and down or > enter/exit functions, 'q' will quit, etc. There's *always* > expected behavior when a user presses a key - and it's basically > an UI bug if that is ignored for any reason. More will be done in that direction. > - On long processing steps sometimes there's no progress bar, the > tool just 'hangs'. In some cases this is implemented: such as the > initial parsing of perf.data. In others it's not: such as the > objdump annotation output. (Which is default-off, so this isn't > really a bugreport.) the objdump disassembly should have a progress bar. > - Sometimes there's nonsensical UI flows with non-intuitive outcomes. > An example I noticed is the 'zoom in' behavior in perf report/top: > > - If in the default 'perf report' starting screen of a short > profiling session I enter into a function name on the screen, I > get these choices: > > Annotate file_cache_slot::get_next_line(char**, long*) > Zoom into cc1 thread > Zoom into cc1 DSO (use the 'k' hotkey to zoom directly into the kernel) > Browse map details > Run scripts for samples of symbol [file_cache_slot::get_next_line(char**, long*)] > Run scripts for all samples > Switch to another data file in PWD > Exit > > - If I press 'Zoom into cc1 thread', I get into some sort of > filtered mode I am unable to exit via intuitive ways. There's no > on-screen hint to make it go away, I have to exit the tool and > restart it entirely to get back to the main profiling screen again > ... which is obviously a suboptimal UI flow. > > - To make matters worse, the 'h' screen gives me this hint: > > ESC Zoom out > > except it doesn't appear to be working, my profiling output is > still limited to cc1 entries. I think it worked when I last touched that code, but thanks for the report, I'll ifx it. > - The *only* way I found to zoom out into the general screen again > was to press 'm' for the 'context menu' (I found this not > because it was named intuitively, but because I trial-error > brute forced the UI keybindings), and there it had: > > Zoom out of cc1 thread I think originally zooming out was with the left arrow key, but then IIRC it was reused for scrolling because of modes where lots of columns were needed, like with 'perf c2c report', or something like that, well, here it is: ⬢ [acme@toolbox perf-tools-next]$ git show 31eb4360546b4bd89 commit 31eb4360546b4bd890f349db01295a173c09b0fb Author: Namhyung Kim <namhyung@kernel.org> Date: Tue Oct 13 09:02:01 2015 +0900 perf hists browser: Add 'm' key for context menu display With horizontal scrolling, the left/right arrow keys are used to scroll columns and ENTER/ESC keys are used to enter/exit menu. However if callchain is recorded, the ENTER key is used to toggle callchain expansion so there's no way to display menu. Use 'm' key to display the menu for this case. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1444694521-8136-1-git-send-email-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> > - Yay! But very unintuitive. The intuitive step would have been > the left arrow key BTW. (with restoring the cursor position to > the cc1 entry I pressed 'Enter' on originally), but that doesn't That is how I implemented it initially, yes, over 10 years ago. > do anything, it gets ignored. Also, a status line printout that > we are filtered to cc1, and how to exit that mode, would be nice > as well. I'll work on that > - [ Sub-bugreport: at higher levels the 'P' keybinding doesn't print > the current screen like on the annotation screen, it goes back a > level. Guess how I found out. ;-) That should work as well. > To be able to quote the above context menu screen, I copy-pasted > the ncurses lines, and manually edited out all the non-printable > characters unsuitable for emails. ] Which is a PITA, one more detail to tidy, thanks. > I genuinely believe that these "basic profiling workflow" details are > *FAR* more important to perf usability and perf popularity than pretty > much any of the more esoteric hardware features, scalability > enhancements and niche tools I see almost daily progress on - and these > basic workflow details affect *everyone* who uses perf... Right, agreed, balancing processing tons of patches, addressing tech debt, supporting new features is a balancing act that doesn't always work well. But, again, thanks for your report, I'll try to get all of this sorted out. Meanwhile, if you could please test what I put in that branch and report results, that would be great. - Arnaldo ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-09 2:23 ` Arnaldo Carvalho de Melo @ 2025-04-09 12:19 ` Arnaldo Carvalho de Melo 2025-04-09 15:57 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-09 12:19 UTC (permalink / raw) To: Ingo Molnar Cc: Howard Chu, Namhyung Kim, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Tue, Apr 08, 2025 at 11:23:54PM -0300, Arnaldo Carvalho de Melo wrote: > On Tue, Apr 08, 2025 at 10:05:15AM +0200, Ingo Molnar wrote: > > - Yay! But very unintuitive. The intuitive step would have been > > the left arrow key BTW. (with restoring the cursor position to > > the cc1 entry I pressed 'Enter' on originally), but that doesn't > That is how I implemented it initially, yes, over 10 years ago. I'll experiment with making the Left arrow key, if already on the first column, to work as zoom out. - Arnaldo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-09 12:19 ` Arnaldo Carvalho de Melo @ 2025-04-09 15:57 ` Arnaldo Carvalho de Melo 2025-04-09 19:17 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-09 15:57 UTC (permalink / raw) To: Ingo Molnar Cc: Howard Chu, Namhyung Kim, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Wed, Apr 09, 2025 at 09:19:01AM -0300, Arnaldo Carvalho de Melo wrote: > On Tue, Apr 08, 2025 at 11:23:54PM -0300, Arnaldo Carvalho de Melo wrote: > > On Tue, Apr 08, 2025 at 10:05:15AM +0200, Ingo Molnar wrote: > > > - Yay! But very unintuitive. The intuitive step would have been > > > the left arrow key BTW. (with restoring the cursor position to > > > the cc1 entry I pressed 'Enter' on originally), but that doesn't > > > That is how I implemented it initially, yes, over 10 years ago. > > I'll experiment with making the Left arrow key, if already on the first > column, to work as zoom out. Can you try this, on top of the series or even just this, pressing left should do what we want. diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c index 115aa1af5a0218cd..dc88427b4ae5af4c 100644 --- a/tools/perf/ui/browser.c +++ b/tools/perf/ui/browser.c @@ -459,6 +459,8 @@ int ui_browser__run(struct ui_browser *browser, int delay_secs) goto out; if (browser->horiz_scroll != 0) --browser->horiz_scroll; + else + goto out; break; case K_PGDN: case ' ': ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-09 15:57 ` Arnaldo Carvalho de Melo @ 2025-04-09 19:17 ` Arnaldo Carvalho de Melo 2025-04-09 19:22 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-09 19:17 UTC (permalink / raw) To: Ingo Molnar Cc: Howard Chu, Namhyung Kim, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Wed, Apr 09, 2025 at 12:57:05PM -0300, Arnaldo Carvalho de Melo wrote: > On Wed, Apr 09, 2025 at 09:19:01AM -0300, Arnaldo Carvalho de Melo wrote: > > On Tue, Apr 08, 2025 at 11:23:54PM -0300, Arnaldo Carvalho de Melo wrote: > > > On Tue, Apr 08, 2025 at 10:05:15AM +0200, Ingo Molnar wrote: > > > > - Yay! But very unintuitive. The intuitive step would have been > > > > the left arrow key BTW. (with restoring the cursor position to > > > > the cc1 entry I pressed 'Enter' on originally), but that doesn't > > > > > That is how I implemented it initially, yes, over 10 years ago. > > > > I'll experiment with making the Left arrow key, if already on the first > > column, to work as zoom out. > > Can you try this, on top of the series or even just this, pressing left > should do what we want. This is working for zooming into/out of a DSO, but for threads, which is what you described, it is not working, checking that... - Arnaldo > diff --git a/tools/perf/ui/browser.c b/tools/perf/ui/browser.c > index 115aa1af5a0218cd..dc88427b4ae5af4c 100644 > --- a/tools/perf/ui/browser.c > +++ b/tools/perf/ui/browser.c > @@ -459,6 +459,8 @@ int ui_browser__run(struct ui_browser *browser, int delay_secs) > goto out; > if (browser->horiz_scroll != 0) > --browser->horiz_scroll; > + else > + goto out; > break; > case K_PGDN: > case ' ': ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-09 19:17 ` Arnaldo Carvalho de Melo @ 2025-04-09 19:22 ` Arnaldo Carvalho de Melo 2025-04-09 21:26 ` Ingo Molnar 0 siblings, 1 reply; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-09 19:22 UTC (permalink / raw) To: Ingo Molnar Cc: Howard Chu, Namhyung Kim, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Wed, Apr 09, 2025 at 04:17:11PM -0300, Arnaldo Carvalho de Melo wrote: > On Wed, Apr 09, 2025 at 12:57:05PM -0300, Arnaldo Carvalho de Melo wrote: > > On Wed, Apr 09, 2025 at 09:19:01AM -0300, Arnaldo Carvalho de Melo wrote: > > > On Tue, Apr 08, 2025 at 11:23:54PM -0300, Arnaldo Carvalho de Melo wrote: > > > > On Tue, Apr 08, 2025 at 10:05:15AM +0200, Ingo Molnar wrote: > > > > > - Yay! But very unintuitive. The intuitive step would have been > > > > > the left arrow key BTW. (with restoring the cursor position to > > > > > the cc1 entry I pressed 'Enter' on originally), but that doesn't > > > > > > > That is how I implemented it initially, yes, over 10 years ago. > > > > > > I'll experiment with making the Left arrow key, if already on the first > > > column, to work as zoom out. > > > > Can you try this, on top of the series or even just this, pressing left > > should do what we want. > > This is working for zooming into/out of a DSO, but for threads, which is > what you described, it is not working, checking that... Using 't' to zoom into a thread and then LEFT to zoom out works, ditto for 'd' + LEFT, 'k' (zoom into the kernel) + LEFT its just 'm' + thread or ENTER + Zoom into Thread that isn't working... Not even that, there is something subtle, investigating. But as a suggestion, pressing 't' zooms into a thread, pressing it again zooms out of that thread, rinse repeat. Ditto for 'd' for DSO and 'k' for the kernel DSO, 'S' for a CPU socket (Kan Liang implemented this, but not for 'c' CPU, will try to add that). Anyway, back to the bug. - Arnaldo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-09 19:22 ` Arnaldo Carvalho de Melo @ 2025-04-09 21:26 ` Ingo Molnar 2025-04-10 1:38 ` Arnaldo Carvalho de Melo 0 siblings, 1 reply; 37+ messages in thread From: Ingo Molnar @ 2025-04-09 21:26 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Howard Chu, Namhyung Kim, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > On Wed, Apr 09, 2025 at 04:17:11PM -0300, Arnaldo Carvalho de Melo wrote: > > On Wed, Apr 09, 2025 at 12:57:05PM -0300, Arnaldo Carvalho de Melo wrote: > > > On Wed, Apr 09, 2025 at 09:19:01AM -0300, Arnaldo Carvalho de Melo wrote: > > > > On Tue, Apr 08, 2025 at 11:23:54PM -0300, Arnaldo Carvalho de Melo wrote: > > > > > On Tue, Apr 08, 2025 at 10:05:15AM +0200, Ingo Molnar wrote: > > > > > > - Yay! But very unintuitive. The intuitive step would have been > > > > > > the left arrow key BTW. (with restoring the cursor position to > > > > > > the cc1 entry I pressed 'Enter' on originally), but that doesn't > > > > > > > > > That is how I implemented it initially, yes, over 10 years ago. > > > > > > > > I'll experiment with making the Left arrow key, if already on the first > > > > column, to work as zoom out. > > > > > > Can you try this, on top of the series or even just this, pressing left > > > should do what we want. > > > > This is working for zooming into/out of a DSO, but for threads, which is > > what you described, it is not working, checking that... > > Using 't' to zoom into a thread and then LEFT to zoom out works, ditto > for 'd' + LEFT, 'k' (zoom into the kernel) + LEFT its just 'm' + thread > or ENTER + Zoom into Thread that isn't working... > > Not even that, there is something subtle, investigating. > > But as a suggestion, pressing 't' zooms into a thread, pressing it again > zooms out of that thread, rinse repeat. Indeed, this works it around. It's not *that* easy though: 't' goes into thread view, but has to be pressed another 3(?) times to get back to the previous non-threaded view again. > > Ditto for 'd' for DSO and 'k' for the kernel DSO, 'S' for a CPU socket > (Kan Liang implemented this, but not for 'c' CPU, will try to add that). > > Anyway, back to the bug. Thanks for having a look! :) Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-09 21:26 ` Ingo Molnar @ 2025-04-10 1:38 ` Arnaldo Carvalho de Melo 2025-04-10 6:24 ` Ingo Molnar 0 siblings, 1 reply; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-10 1:38 UTC (permalink / raw) To: Ingo Molnar Cc: Howard Chu, Namhyung Kim, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov On Wed, Apr 09, 2025 at 11:26:10PM +0200, Ingo Molnar wrote: > * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > Using 't' to zoom into a thread and then LEFT to zoom out works, ditto > > for 'd' + LEFT, 'k' (zoom into the kernel) + LEFT its just 'm' + thread > > or ENTER + Zoom into Thread that isn't working... > > Not even that, there is something subtle, investigating. > > But as a suggestion, pressing 't' zooms into a thread, pressing it again > > zooms out of that thread, rinse repeat. > Indeed, this works it around. It's not *that* easy though: 't' goes > into thread view, but has to be pressed another 3(?) times to get back > to the previous non-threaded view again. > > Ditto for 'd' for DSO and 'k' for the kernel DSO, 'S' for a CPU socket > > (Kan Liang implemented this, but not for 'c' CPU, will try to add that). > > Anyway, back to the bug. > Thanks for having a look! :) I guess I found it, another one-liner (well, two if you count removing a comment line) with a long explanation, see below. There is a tidy up patch before this, all is at: https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf-annotate+build So far: ⬢ [acme@toolbox perf-tools-next]$ git log --oneline v6.15-rc1.. 4af32d73e850cb2c (HEAD -> perf-tools-next) perf ui browser hists: Set actions->thread before calling do_zoom_thread() caab12ee523ccf7c perf ui browser hists: Simplify the routines that add entries to the popup menu dd2ca9ca0d76a04b perf ui browser: Accept the left arrow key as a Zoom out if done on the first column 6b4e380deb02de46 perf ui browser annotate: Don't show the source code view status initially befd9928ac3b0914 perf ui browser annotate: Show in the title the source code view toggle 6c557247d907487e perf ui browser map: Provide feedback on unhandled hotkeys dcee76bd1ef37cd5 perf ui browser hists: Provide feedback on unhandled hotkeys 7cd3b61be22b6111 perf ui browser header: Provide feedback on unhandled hotkeys 4226b944d6cfec62 perf ui browser annotate: Provide feedback on unhandled hotkeys 96f43ea8e198b9cd perf ui browser annotate-data: Provide feedback on unhandled hotkeys 1101a930a674936a perf ui browser: Add a warn on unhandled hotkey helper bdd6ce8eb8c2f2f7 perf ui browser: Add key_name() helper 65185f2cf7af5026 tools build: Don't show libbfd build status as it is opt-in 33da8804f7c00056 perf check: Add tip about building with libbfd using BUILD_NONDISTRO=1 da4de82814f54c6c perf build: Warn when libdebuginfod devel files are not available d5f1c3eb842aaafe tools build: Don't show libunwind build status as it is opt-in 6cf8fa4dcce720f9 perf check: Allow showing a tip for opt-in features not built into perf 07f8bc136355d09a perf check: Move the FEATURE_STATUS() macro to its only user source file a3fb01a0d062d9f4 perf check: Share the feature status printing routine with 'perf version' c2bc1a34f2488b7f tools build: Don't set libunwind as available if test-all.c build succeeds ⬢ [acme@toolbox perf-tools-next]$ I'm starting to take a look at: ⬢ [acme@toolbox git]$ head -5 ./llvm-project/llvm/tools/llvm-objdump/SourcePrinter.h //===-- SourcePrinter.h - source interleaving utilities --------*- C++ -*-===// // // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. // See https://llvm.org/LICENSE.txt for license information. // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception ⬢ [acme@toolbox git]$ :-) - Arnaldo From 4af32d73e850cb2c0c1679a0d30a65d8d5f4222d Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo <acme@redhat.com> Date: Wed, 9 Apr 2025 21:58:19 -0300 Subject: [PATCH 1/1] perf ui browser hists: Set actions->thread before calling do_zoom_thread() In 7cecb7fe8388d5c3 ("perf hists: Move sort__has_comm into struct perf_hpp_list") it assumes that act->thread is set prior to calling do_zoom_thread(). This doesn't happen when we use ESC or the Left arrow key to Zoom out of a specific thread, making this operation not to work and we get stuck into the thread zoom. In 6422184b087ff435 ("perf hists browser: Simplify zooming code using pstack_peek()") it says no need to set actions->thread, and at that point that was true, but in 7cecb7fe8388d5c3 a actions->thread == NULL check was added before the zoom out of thread could kick in. We can zoom out using the alternative 't' thread zoom toggle hotkey to finally set actions->thread before calling do_zoom_thread() and zoom out, but lets also fix the ESC/Zoom out of thread case. Fixes: 7cecb7fe8388d5c3 ("perf hists: Move sort__has_comm into struct perf_hpp_list") Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> --- tools/perf/ui/browsers/hists.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c index 0db2a3f06e23cc5a..cf022e92d06b9b28 100644 --- a/tools/perf/ui/browsers/hists.c +++ b/tools/perf/ui/browsers/hists.c @@ -3264,10 +3264,10 @@ static int evsel__hists_browse(struct evsel *evsel, int nr_events, const char *h /* * No need to set actions->dso here since * it's just to remove the current filter. - * Ditto for thread below. */ do_zoom_dso(browser, actions); } else if (top == &browser->hists->thread_filter) { + actions->thread = thread; do_zoom_thread(browser, actions); } else if (top == &browser->hists->socket_filter) { do_zoom_socket(browser, actions); -- 2.48.1 ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-10 1:38 ` Arnaldo Carvalho de Melo @ 2025-04-10 6:24 ` Ingo Molnar 2025-04-10 14:03 ` Fixes for perf build system and TUI browsers was " Arnaldo Carvalho de Melo 0 siblings, 1 reply; 37+ messages in thread From: Ingo Molnar @ 2025-04-10 6:24 UTC (permalink / raw) To: Arnaldo Carvalho de Melo Cc: Howard Chu, Namhyung Kim, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > On Wed, Apr 09, 2025 at 11:26:10PM +0200, Ingo Molnar wrote: > > * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > > Using 't' to zoom into a thread and then LEFT to zoom out works, ditto > > > for 'd' + LEFT, 'k' (zoom into the kernel) + LEFT its just 'm' + thread > > > or ENTER + Zoom into Thread that isn't working... > > > > Not even that, there is something subtle, investigating. > > > > But as a suggestion, pressing 't' zooms into a thread, pressing it again > > > zooms out of that thread, rinse repeat. > > > Indeed, this works it around. It's not *that* easy though: 't' goes > > into thread view, but has to be pressed another 3(?) times to get back > > to the previous non-threaded view again. > > > > Ditto for 'd' for DSO and 'k' for the kernel DSO, 'S' for a CPU socket > > > (Kan Liang implemented this, but not for 'c' CPU, will try to add that). > > > > Anyway, back to the bug. > > > Thanks for having a look! :) > > I guess I found it, another one-liner (well, two if you count removing a > comment line) with a long explanation, see below. > > There is a tidy up patch before this, all is at: > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf-annotate+build Wow, that's quite a lot! So, I can confirm that the build warning is gone - as the libunwind entry is gone: Auto-detecting system features: ... libdw: [ on ] ... glibc: [ on ] ... libelf: [ on ] ... libnuma: [ on ] ... numa_num_possible_cpus: [ on ] ... libperl: [ on ] ... libpython: [ on ] ... libcrypto: [ on ] ... libcapstone: [ on ] ... llvm-perf: [ on ] ... zlib: [ on ] ... lzma: [ on ] ... get_cpuid: [ on ] ... bpf: [ on ] ... libaio: [ on ] ... libzstd: [ on ] And there's this second round of auto-detection: Auto-detecting system features: ... clang-bpf-co-re: [ on ] ... llvm: [ on ] ... libcap: [ on ] ... libbfd: [ on ] which is all-green as well. I still have these dependency warnings: Makefile.config:563: No elfutils/debuginfod.h found, no debuginfo server support, please install elfutils-debuginfod-client-devel or equivalent Makefile.config:1146: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel Makefile.config:1191: libtracefs is missing. Please install libtracefs-dev/libtracefs-devel The libtracefs one was resolved via 'apt install libtracefs-dev', leaving two: Makefile.config:563: No elfutils/debuginfod.h found, no debuginfo server support, please install elfutils-debuginfod-client-devel or equivalent Makefile.config:1146: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel I cannot resolve the openjdk dependency warning, despite installing a plethora of packages: kepler:~/tip> dpkg -l | grep openjdk rc openjdk-14-jre-headless:amd64 14.0.2+12-1 amd64 OpenJDK Java runtime, using Hotspot JIT (headless) ii openjdk-17-jre:amd64 17.0.14+7-1~24.10 amd64 OpenJDK Java runtime, using Hotspot JIT ii openjdk-17-jre-headless:amd64 17.0.14+7-1~24.10 amd64 OpenJDK Java runtime, using Hotspot JIT (headless) rc openjdk-18-jre-headless:amd64 18.0.2+9-2ubuntu1 amd64 OpenJDK Java runtime, using Hotspot JIT (headless) ii openjdk-21-jdk-headless:amd64 21.0.6+7-1~24.10.1 amd64 OpenJDK Development Kit (JDK) (headless) ii openjdk-21-jre:amd64 21.0.6+7-1~24.10.1 amd64 OpenJDK Java runtime, using Hotspot JIT ii openjdk-21-jre-headless:amd64 21.0.6+7-1~24.10.1 amd64 OpenJDK Java runtime, using Hotspot JIT (headless) ii openjdk-8-jdk:amd64 8u442-b06~us1-0ubuntu1~24.10 amd64 OpenJDK Development Kit (JDK) ii openjdk-8-jdk-headless:amd64 8u442-b06~us1-0ubuntu1~24.10 amd64 OpenJDK Development Kit (JDK) (headless) ii openjdk-8-jre:amd64 8u442-b06~us1-0ubuntu1~24.10 amd64 OpenJDK Java runtime, using Hotspot JIT ii openjdk-8-jre-headless:amd64 8u442-b06~us1-0ubuntu1~24.10 amd64 OpenJDK Java runtime, using Hotspot JIT (headless) And I managed to resolve the elfutils/debuginfod.h warning via an apt-file search and installing the libdebuginfod-dev package, the warning message should probably be updated via the patch further below that gives a more specific package name to install. With that resolved, I now have only a single dependency failure left: Makefile.config:1146: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel On the UI front: - I can confirm that the left-arrow now works intuitively wrt. context zooming, thanks for implementing that! - Likewise, the Esc key suggested in the 'h' screen now works intuitively as well. - When pressing an unbound key I now get a helpful message. - One small detail with the unbound key warning I noticed: it now says "'l' key not associated, use 'h'/'?'/F1 to see actions!", but pressing 'h' will only work after dismissing this window. It might be a nice little extra UI tweak to allow 'h/?/F1' to jump to the help window straight away from warning windows, instead of requiring two keys to be pressed? - I also see that 'P' now works in a broader context as well, saving the perf.hist.0 file. - A small observation with the 'P' key too: maybe the status line at the bottom should update with the action performed? I only see a small flash and the "Press '?' for help on key bindings" message is reprinted. Anyway, I'd say that the large majority of my TUI complaints were resolved by your changes, and I had no new problems, so for the changes so far: Tested-by: Ingo Molnar <mingo@kernel.org> Thanks, Ingo =================================> Signed-off-by: Ingo Molnar <mingo@kernel.org> tools/perf/Makefile.config | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index 211eb88e6bd8..9f08a6e96b35 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -560,7 +560,7 @@ ifndef NO_LIBELF CFLAGS += -DHAVE_DEBUGINFOD_SUPPORT EXTLIBS += -ldebuginfod else - $(warning No elfutils/debuginfod.h found, no debuginfo server support, please install elfutils-debuginfod-client-devel or equivalent) + $(warning No elfutils/debuginfod.h found, no debuginfo server support, please install libdebuginfod-dev/elfutils-debuginfod-client-devel or equivalent) endif endif ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Fixes for perf build system and TUI browsers was Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either 2025-04-10 6:24 ` Ingo Molnar @ 2025-04-10 14:03 ` Arnaldo Carvalho de Melo 0 siblings, 0 replies; 37+ messages in thread From: Arnaldo Carvalho de Melo @ 2025-04-10 14:03 UTC (permalink / raw) To: Ingo Molnar Cc: Howard Chu, Namhyung Kim, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, linux-perf-users, Dmitry Vyukov, Michael Petlan, Linux Kernel Mailing List On Thu, Apr 10, 2025 at 08:24:52AM +0200, Ingo Molnar wrote: > * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > On Wed, Apr 09, 2025 at 11:26:10PM +0200, Ingo Molnar wrote: > > > * Arnaldo Carvalho de Melo <acme@kernel.org> wrote: > > > > Anyway, back to the bug. > > > Thanks for having a look! :) > > I guess I found it, another one-liner (well, two if you count removing a > > comment line) with a long explanation, see below. > > There is a tidy up patch before this, all is at: > > https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git perf-annotate+build > Wow, that's quite a lot! You reported various problems that accumulated thru over a decade, which shows how difficult is to regression test TUIs :-\ We have 'perf test' to test various aspects of the tools, with C ones testing APIs, shell ones running the tools and looking if they are generating expected results by grepping output and using other techniques, the perf test shell infrastructure is itself checked using ShellCheck, if installed: ⬢ [acme@toolbox perf-tools-next]$ rpm -q --qf "%{description}\n" ShellCheck The goals of ShellCheck are: * To point out and clarify typical beginner's syntax issues, that causes a shell to give cryptic error messages. * To point out and clarify typical intermediate level semantic problems, that causes a shell to behave strangely and counter-intuitively. * To point out subtle caveats, corner cases and pitfalls, that may cause an advanced user's otherwise working script to fail under future circumstances. ⬢ [acme@toolbox perf-tools-next]$ A big regression test suite that lived/lives inside Red Hat, created/maintained by Michael Petlan (and was continuosly used to test perf) is being merged with it over time. Ian did lots of work on making it parallel, with some tests needing to be run exclusively as it changes global state (perf probe adding stuff, etc). But one area we lack is to test things that interact with the user, like the TUI, something to improve! So I'm rather happy when someone that uses the TUI comes along and points out problems, as this isn't an area that we are automatically testing, it all depends on sheer code review, with corner case bugs living in the codebase for years, sometimes decades... > So, I can confirm that the build warning is gone - as the libunwind > entry is gone: Great! > Auto-detecting system features: > ... libdw: [ on ] > ... glibc: [ on ] > ... libelf: [ on ] > ... libnuma: [ on ] > ... numa_num_possible_cpus: [ on ] > ... libperl: [ on ] > ... libpython: [ on ] > ... libcrypto: [ on ] > ... libcapstone: [ on ] > ... llvm-perf: [ on ] > ... zlib: [ on ] > ... lzma: [ on ] > ... get_cpuid: [ on ] > ... bpf: [ on ] > ... libaio: [ on ] > ... libzstd: [ on ] > > And there's this second round of auto-detection: > Auto-detecting system features: > ... clang-bpf-co-re: [ on ] > ... llvm: [ on ] > ... libcap: [ on ] > ... libbfd: [ on ] > which is all-green as well. > I still have these dependency warnings: > Makefile.config:563: No elfutils/debuginfod.h found, no debuginfo server support, please install elfutils-debuginfod-client-devel or equivalent > Makefile.config:1146: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel > Makefile.config:1191: libtracefs is missing. Please install libtracefs-dev/libtracefs-devel > The libtracefs one was resolved via 'apt install libtracefs-dev', > leaving two: > Makefile.config:563: No elfutils/debuginfod.h found, no debuginfo server support, please install elfutils-debuginfod-client-devel or equivalent > Makefile.config:1146: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel > I cannot resolve the openjdk dependency warning, despite installing a > plethora of packages: This one is a sore spot, I'll try to take a look. > kepler:~/tip> dpkg -l | grep openjdk > rc openjdk-14-jre-headless:amd64 14.0.2+12-1 amd64 OpenJDK Java runtime, using Hotspot JIT (headless) > ii openjdk-17-jre:amd64 17.0.14+7-1~24.10 amd64 OpenJDK Java runtime, using Hotspot JIT > ii openjdk-17-jre-headless:amd64 17.0.14+7-1~24.10 amd64 OpenJDK Java runtime, using Hotspot JIT (headless) > rc openjdk-18-jre-headless:amd64 18.0.2+9-2ubuntu1 amd64 OpenJDK Java runtime, using Hotspot JIT (headless) > ii openjdk-21-jdk-headless:amd64 21.0.6+7-1~24.10.1 amd64 OpenJDK Development Kit (JDK) (headless) > ii openjdk-21-jre:amd64 21.0.6+7-1~24.10.1 amd64 OpenJDK Java runtime, using Hotspot JIT > ii openjdk-21-jre-headless:amd64 21.0.6+7-1~24.10.1 amd64 OpenJDK Java runtime, using Hotspot JIT (headless) > ii openjdk-8-jdk:amd64 8u442-b06~us1-0ubuntu1~24.10 amd64 OpenJDK Development Kit (JDK) > ii openjdk-8-jdk-headless:amd64 8u442-b06~us1-0ubuntu1~24.10 amd64 OpenJDK Development Kit (JDK) (headless) > ii openjdk-8-jre:amd64 8u442-b06~us1-0ubuntu1~24.10 amd64 OpenJDK Java runtime, using Hotspot JIT > ii openjdk-8-jre-headless:amd64 8u442-b06~us1-0ubuntu1~24.10 amd64 OpenJDK Java runtime, using Hotspot JIT (headless) > And I managed to resolve the elfutils/debuginfod.h warning via an > apt-file search and installing the libdebuginfod-dev package, the > warning message should probably be updated via the patch further below > that gives a more specific package name to install. I've folded your patch in the patch I added in this series while addressing the other problems you pointed out with libunwind, ditto for libbfd. > With that resolved, I now have only a single dependency failure left: > Makefile.config:1146: No openjdk development package found, please install JDK package, e.g. openjdk-8-jdk, java-1.8.0-openjdk-devel I'll work on that. > On the UI front: > > - I can confirm that the left-arrow now works intuitively wrt. context > zooming, thanks for implementing that! Great! the good thing was that minor changes were needed. > - Likewise, the Esc key suggested in the 'h' screen now works > intuitively as well. Right, I didn't even try to fix that one explicitely, it was that do_zoom_thread() (that does zoom in and out) not having act->thread set when handling the left arrow, that happens to also handle the ESC case. > - When pressing an unbound key I now get a helpful message. Great. > - One small detail with the unbound key warning I noticed: it now > says "'l' key not associated, use 'h'/'?'/F1 to see actions!", but > pressing 'h' will only work after dismissing this window. It might > be a nice little extra UI tweak to allow 'h/?/F1' to jump to the > help window straight away from warning windows, instead of > requiring two keys to be pressed? Sure, I'll add a patch on top of this series doing that. > - I also see that 'P' now works in a broader context as well, saving > the perf.hist.0 file. I did nothing in this regard, a good fallout from the other changes then. > - A small observation with the 'P' key too: maybe the status line > at the bottom should update with the action performed? I only see > a small flash and the "Press '?' for help on key bindings" > message is reprinted. Right, I think this is a case to use a popup window stating that the "screenshot" was produced and was saved in file xyz.bla > Anyway, I'd say that the large majority of my TUI complaints were > resolved by your changes, and I had no new problems, so for the > changes so far: > Tested-by: Ingo Molnar <mingo@kernel.org> Thanks, I added that to the patches and will use what I have right now to open the perf-tools-next.git tree, so that in addition to this work I start processing the many patches that are out there for 6.16! Please, if you have further comments, just send them my way but do not forget to CC perf's co-maintainer, reviewers and the mailing list, its not always that I can focus on something like this ;-) Cheers, - Arnaldo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 1/3] perf sort: Keep output fields in the same level 2025-03-20 9:32 ` Ingo Molnar 2025-03-20 16:16 ` Namhyung Kim @ 2025-03-25 0:46 ` Namhyung Kim 2025-03-30 5:54 ` Namhyung Kim 2 siblings, 0 replies; 37+ messages in thread From: Namhyung Kim @ 2025-03-25 0:46 UTC (permalink / raw) To: Ingo Molnar Cc: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, LKML, linux-perf-users, Dmitry Vyukov On Thu, Mar 20, 2025 at 10:32:39AM +0100, Ingo Molnar wrote: > 4) > > It would also be nice to have an export function to save current 'perf > top' profiling data and have it available for 'perf report' et al > analysis. Ie. frequently I just see an interesting snapshot and decide > that it's good for further analysis, freeze the screen and are left > with very few options to keep it for further look and reference. I've thought about this but there are some limitations. Basically perf top and perf report aggregates sammples into hist_entry's using the given sort keys. That means other information in the samples are gone already. For perf top, it only has dso and symbol - plus number of samples and the total periods. Other informations like CPU, PID and TIME are not available so it cannot reconstruct samples precisely. So it's only possible to use the same sort keys or a subset of them IMHO. Do you think it'd be useful even with this? Thanks, Namhyung ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 1/3] perf sort: Keep output fields in the same level 2025-03-20 9:32 ` Ingo Molnar 2025-03-20 16:16 ` Namhyung Kim 2025-03-25 0:46 ` [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim @ 2025-03-30 5:54 ` Namhyung Kim 2 siblings, 0 replies; 37+ messages in thread From: Namhyung Kim @ 2025-03-30 5:54 UTC (permalink / raw) To: Ingo Molnar Cc: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Jiri Olsa, Adrian Hunter, Peter Zijlstra, LKML, linux-perf-users, Dmitry Vyukov On Thu, Mar 20, 2025 at 10:32:39AM +0100, Ingo Molnar wrote: > > * Namhyung Kim <namhyung@kernel.org> wrote: > > > On Fri, Mar 07, 2025 at 12:08:27AM -0800, Namhyung Kim wrote: > > > This is useful for hierarchy output mode where the first level is > > > considered as output fields. We want them in the same level so that it > > > can show only the remaining groups in the hierarchy. > > > > > > Before: > > > $ perf report -s overhead,sample,period,comm,dso -H --stdio > > > ... > > > # Overhead Samples / Period / Command / Shared Object > > > # ................. .......................................... > > > # > > > 100.00% 4035 > > > 100.00% 3835883066 > > > 100.00% perf > > > 99.37% perf > > > 0.50% ld-linux-x86-64.so.2 > > > 0.06% [unknown] > > > 0.04% libc.so.6 > > > 0.02% libLLVM-16.so.1 > > > > > > After: > > > $ perf report -s overhead,sample,period,comm,dso -H --stdio > > > ... > > > # Overhead Samples Period Command / Shared Object > > > # ....................................... ....................... > > > # > > > 100.00% 4035 3835883066 perf > > > 99.37% 4005 3811826223 perf > > > 0.50% 19 19210014 ld-linux-x86-64.so.2 > > > 0.06% 8 2367089 [unknown] > > > 0.04% 2 1720336 libc.so.6 > > > 0.02% 1 759404 libLLVM-16.so.1 > > > > > > Signed-off-by: Namhyung Kim <namhyung@kernel.org> > > > > Ping! Anybody interested in this change? :) > > Oh yes, all such pieces of intelligent organization of textual output > of profiling data are worth their weight in gold in my book. :-) > > Acked-by: Ingo Molnar <mingo@kernel.org> > > 1) > > On a related note, does anyone know why perf stat output alignment > sucks so much these days: > > starship:~/tip> perf stat --null --repeat 20 perf stat --null --repeat 3 perf bench sched messaging 2>&1 | grep elapsed > 0.11620 +- 0.00327 seconds time elapsed ( +- 2.81% ) > 0.120813 +- 0.000570 seconds time elapsed ( +- 0.47% ) > 0.122280 +- 0.000443 seconds time elapsed ( +- 0.36% ) > 0.119813 +- 0.000752 seconds time elapsed ( +- 0.63% ) > 0.12190 +- 0.00134 seconds time elapsed ( +- 1.10% ) > 0.119862 +- 0.000542 seconds time elapsed ( +- 0.45% ) > 0.120075 +- 0.000608 seconds time elapsed ( +- 0.51% ) > 0.120350 +- 0.000273 seconds time elapsed ( +- 0.23% ) > 0.12203 +- 0.00114 seconds time elapsed ( +- 0.93% ) > 0.12229 +- 0.00114 seconds time elapsed ( +- 0.93% ) > 0.12032 +- 0.00115 seconds time elapsed ( +- 0.95% ) > 0.121241 +- 0.000463 seconds time elapsed ( +- 0.38% ) > 0.119404 +- 0.000333 seconds time elapsed ( +- 0.28% ) > 0.119945 +- 0.000766 seconds time elapsed ( +- 0.64% ) > 0.121215 +- 0.000879 seconds time elapsed ( +- 0.72% ) > 0.12001 +- 0.00109 seconds time elapsed ( +- 0.91% ) > 0.12193 +- 0.00182 seconds time elapsed ( +- 1.49% ) > 0.119184 +- 0.000794 seconds time elapsed ( +- 0.67% ) > 0.120062 +- 0.000439 seconds time elapsed ( +- 0.37% ) > 0.120834 +- 0.000760 seconds time elapsed ( +- 0.63% ) > 0.369473 +- 0.000992 seconds time elapsed ( +- 0.27% ) > > ... see how the vertical alignment of the output goes randomly wacko - > I presume because there's a trailing zero in the output number and the > code for some inexplicable reason decides to shorten it to make the > life of developers harder? ;-) I found this commit. :) I think it didn't consider nested perf stat with --repeat options. commit bc22de9bcdb2249150fb5b3c48fdc4f6bedd3ad7 Author: Jiri Olsa <jolsa@kernel.org> Date: Mon Apr 23 11:08:20 2018 +0200 perf stat: Display time in precision based on std deviation Ingo suggested to display elapsed time for multirun workload (perf stat -e) with precision based on the precision of the standard deviation. In his own words: > This output is a slightly bit misleading: > Performance counter stats for 'make -j128' (10 runs): > 27.988995256 seconds time elapsed ( +- 0.39% ) > The 9 significant digits in the result, while only 1 is valid, suggests accuracy > where none exists. > It would be better if 'perf stat' would display elapsed time with a precision > adjusted to stddev, it should display at most 2 more significant digits than > the stddev inaccuracy. > I.e. in the above case 0.39% is 0.109, so we only have accuracy for 1 digit, and > so we should only display 3: > 27.988 seconds time elapsed ( +- 0.39% ) Plus a suggestion about the output, which is small enough and connected with the above change that I merged both changes together. > Small output style nit - I think it would be nice if with --repeat the stddev was > also displayed in absolute values, besides percentage: > > 27.988 +- 0.109 seconds time elapsed ( +- 0.39% ) The output is now: Performance counter stats for './perf bench sched pipe' (5 runs): SNIP 13.3667 +- 0.0256 seconds time elapsed ( +- 0.19% ) Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180423090823.32309-7-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> > > 2) > > It's also incredibly hard to Ctrl-C a 'perf stat --repeat' instance: > > starship:~/tip> perf stat --null --repeat 20 perf stat --null --repeat 3 perf bench sched messaging > # Running 'sched/messaging' benchmark: > # 20 sender and receiver processes per group > # 10 groups == 400 processes run > > ... > Ctrl-C > > # Running 'sched/messaging' benchmark: > perf: pollperf: perf: pollperf: pollpollperf: pollperf: pollperf: : Interrupted system call > : Interrupted system call > poll: Interrupted system call > perf: pollperf: : Interrupted system call > perf: pollperf: pollpollperf: : Interrupted system call > pollperf: pollperf: perf: perf: pollpollpollperf: : Interrupted system call > pollperf: poll: Interrupted system call > : Interrupted system call > : Interrupted system call > : Interrupted system call > perf: poll: Interrupted system call > perf: perf: pollpoll: Interrupted system call > : Interrupted system call > perf: perf: perf: perf: perf: perf: : Interrupted system call > pollpollpollpollpollpoll: Interrupted system call > : Interrupted system call > : Interrupted system call > perf: perf: pollperf: perf: perf: perf: perf: perf: pollperf: : Interrupted system call > pollpollpoll: Interrupted system call > > Note how the perf stat instance actually *hangs*. I have to Ctrl-Z it, > and kill -9 %1 it the hard way to clean up: It looks like the problem is in the perf bench sched messaging not in perf stat. If the signal delivered before all groups are created, it may wait for children indefinitely. Thanks, Namhyung > > pollpollpoll: Interrupted system call > � > [1]+ Stopped perf stat --null --repeat 20 perf stat --null --repeat 3 perf bench sched messaging > starship:~/tip> kill -9 %1 > > [1]+ Stopped perf stat --null --repeat 20 perf stat --null --repeat 3 perf bench sched messaging > starship:~/tip> kill -9 %1 > > Does anyone use this thing for actual benchmarking work? ;-) ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH 1/3] perf sort: Keep output fields in the same level 2025-03-07 8:08 [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim ` (2 preceding siblings ...) 2025-03-20 0:36 ` [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim @ 2025-03-21 18:30 ` Namhyung Kim 3 siblings, 0 replies; 37+ messages in thread From: Namhyung Kim @ 2025-03-21 18:30 UTC (permalink / raw) To: Arnaldo Carvalho de Melo, Ian Rogers, Kan Liang, Namhyung Kim Cc: Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML, linux-perf-users, Dmitry Vyukov On Fri, 07 Mar 2025 00:08:27 -0800, Namhyung Kim wrote: > This is useful for hierarchy output mode where the first level is > considered as output fields. We want them in the same level so that it > can show only the remaining groups in the hierarchy. > > Before: > $ perf report -s overhead,sample,period,comm,dso -H --stdio > ... > # Overhead Samples / Period / Command / Shared Object > # ................. .......................................... > # > 100.00% 4035 > 100.00% 3835883066 > 100.00% perf > 99.37% perf > 0.50% ld-linux-x86-64.so.2 > 0.06% [unknown] > 0.04% libc.so.6 > 0.02% libLLVM-16.so.1 > > [...] Applied to perf-tools-next, thanks! Best regards, Namhyung ^ permalink raw reply [flat|nested] 37+ messages in thread
end of thread, other threads:[~2025-04-24 12:37 UTC | newest] Thread overview: 37+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-03-07 8:08 [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim 2025-03-07 8:08 ` [PATCH 2/3] perf report: Allow hierarchy mode for --children Namhyung Kim 2025-03-07 8:08 ` [PATCH 3/3] perf report: Disable children column for data type profiling Namhyung Kim 2025-03-20 0:36 ` [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim 2025-03-20 9:32 ` Ingo Molnar 2025-03-20 16:16 ` Namhyung Kim 2025-03-24 7:28 ` Ingo Molnar 2025-03-25 0:26 ` Namhyung Kim 2025-04-04 9:41 ` [perf top] annotation doesn't work, libunwind doesn't seem to be working either Ingo Molnar 2025-04-04 17:28 ` Namhyung Kim 2025-04-04 18:13 ` Arnaldo Carvalho de Melo 2025-04-04 18:25 ` Arnaldo Carvalho de Melo 2025-04-04 18:40 ` Arnaldo Carvalho de Melo 2025-04-05 9:06 ` Ingo Molnar 2025-04-05 9:09 ` Ingo Molnar 2025-04-07 6:02 ` Howard Chu 2025-04-07 16:58 ` Ingo Molnar 2025-04-07 17:04 ` Ingo Molnar 2025-04-08 0:54 ` Arnaldo Carvalho de Melo 2025-04-08 6:16 ` Namhyung Kim 2025-04-09 3:26 ` Arnaldo Carvalho de Melo 2025-04-10 20:48 ` Namhyung Kim 2025-04-10 20:54 ` Ingo Molnar 2025-04-24 12:37 ` Arnaldo Carvalho de Melo 2025-04-08 8:05 ` Ingo Molnar 2025-04-09 2:23 ` Arnaldo Carvalho de Melo 2025-04-09 12:19 ` Arnaldo Carvalho de Melo 2025-04-09 15:57 ` Arnaldo Carvalho de Melo 2025-04-09 19:17 ` Arnaldo Carvalho de Melo 2025-04-09 19:22 ` Arnaldo Carvalho de Melo 2025-04-09 21:26 ` Ingo Molnar 2025-04-10 1:38 ` Arnaldo Carvalho de Melo 2025-04-10 6:24 ` Ingo Molnar 2025-04-10 14:03 ` Fixes for perf build system and TUI browsers was " Arnaldo Carvalho de Melo 2025-03-25 0:46 ` [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim 2025-03-30 5:54 ` Namhyung Kim 2025-03-21 18:30 ` Namhyung Kim
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).