All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Howard Chu <howardchu95@gmail.com>,
	Namhyung Kim <namhyung@kernel.org>,
	Ian Rogers <irogers@google.com>,
	Kan Liang <kan.liang@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-perf-users@vger.kernel.org,
	Dmitry Vyukov <dvyukov@google.com>
Subject: Re: [perf top] annotation doesn't work, libunwind doesn't seem to be working either
Date: Tue, 8 Apr 2025 10:05:15 +0200	[thread overview]
Message-ID: <Z_TYux5fUg2pW-pF@gmail.com> (raw)
In-Reply-To: <Z_Rz10stoLzBocIO@x1>


* Arnaldo Carvalho de Melo <acme@kernel.org> wrote:

> So, with a patch I have (attached) here this turns into:
> 
> ⬢ [acme@toolbox perf-tools-next]$ perf check feature libunwind
>              libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it )
> ⬢ [acme@toolbox perf-tools-next]$

I can confirm that on one system with your fix applied there's no 'OFF' 
entry anymore:

...                                   libdw: [ on  ]
...                                   glibc: [ on  ]
...                                  libbfd: [ on  ]
...                          libbfd-buildid: [ on  ]
...                                  libelf: [ on  ]
...                                 libnuma: [ on  ]
...                  numa_num_possible_cpus: [ on  ]
...                                 libperl: [ on  ]
...                               libpython: [ on  ]
...                               libcrypto: [ on  ]
...                               libunwind: [ on  ]
...                             libcapstone: [ on  ]
...                               llvm-perf: [ on  ]
...                                    zlib: [ on  ]
...                                    lzma: [ on  ]
...                               get_cpuid: [ on  ]
...                                     bpf: [ on  ]
...                                  libaio: [ on  ]
...                                 libzstd: [ on  ]

But on another, similar Ubuntu system I still get:

...                               libunwind: [ OFF ]

  kepler:~/tip> dpkg -l | grep -E 'capstone|unwind|libdw'
  ii  libcapstone-dev:amd64                                       4.0.2-5.1build1                                     amd64        lightweight multi-architecture disassembly framework - devel files
  ii  libcapstone4:amd64                                          4.0.2-5.1build1                                     amd64        lightweight multi-architecture disassembly framework - library
  ii  libdw-dev:amd64                                             0.191-2ubuntu0.1                                    amd64        libdw1t64 development libraries and header files
  ii  libdw1t64:amd64                                             0.191-2ubuntu0.1                                    amd64        library that provides access to the DWARF debug information
  ii  libunwind-dev:amd64                                         1.6.2-3.1                                           amd64        library to determine the call-chain of a program - development
  ii  libunwind8:amd64                                            1.6.2-3.1                                           amd64        library to determine the call-chain of a program - runtime

And I have your fix applied:

 kepler:~/tip> git diff --stat
  tools/perf/builtin-check.c   | 27 +++++++++++----------------
  tools/perf/builtin-version.c | 30 +-----------------------------
  tools/perf/builtin.h         | 10 ++++++++++
  3 files changed, 22 insertions(+), 45 deletions(-)

  kepler:~/tip> perf check feature libunwind
             libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it )

> The annotate case is unrelated from what I can see, the preference 
> for the disassembler (libcapstone, libllvm or the old method of 
> parsing objdump output) has to start with the most capable, and the 
> source code part shows that the current order isn't the best, I'll 
> continue the investigation/tests tomorrow and will come up with a 
> series of patches for the problems reported.

But I was able to test the annotation behavior with 
annotate.disassemblers=objdump:

> For reference, from 'perf-config' man page:
> 
> annotate.*::
>         These are in control of addresses, jump function, source code
>         in lines of assembly code from a specific program.
> 
>         annotate.disassemblers::
>                 Choose the disassembler to use: "objdump", "llvm",  "capstone",
>                 if not specified it will first try, if available, the "llvm" one,
>                 then, if it fails, "capstone", and finally the original "objdump"
>                 based one.
> 
>                 Choosing a different one is useful when handling some feature that
>                 is known to be best support at some point by one of the options,
>                 to compare the output when in doubt about some bug, etc.
> 
>                 This can be a list, in order of preference, the first one that works
>                 finishes the process.
> 
> A quick test here with perf top without setting the above ends up with
> the behaviour that Ingo reported, no source code is shown and 's'
> doesn't work. Then if I set it to use objdump:
> 
> root@number:~# perf config annotate.disassemblers=objdump
> root@number:~# perf config annotate.disassemblers
> annotate.disassemblers=objdump
> root@number:~# cat ~/.perfconfig 
> # this file is auto-generated.
> [annotate]
> 	disassemblers = objdump
> root@number:~#
> 
> It takes longer to show the annotated output but there is source code
> and 's' works.

I can confirm that this works around the annotation bug, and I can also 
confirm that objdump is pretty slow, 'perf top' just appears to hang 
for many seconds. :-)

BTW., if objdump source-annotation is the best output we have right 
now, then it would be nice to somehow visually distinguish source code 
versus disassembly. Right now it's all the same color and flows 
together on the screen. Maybe we could gray out source code lines or 
so?

Also, the source code appears to lose its horizontal indentation 
levels:

   0.25          sub    $0xffffffff8495b8a0,%rax
   0.20          sar    $0x3,%rax 
   0.22          imul   %rdx,%rax 
               __debug_atomic_inc(lock_class_ops[idx]);
   0.59          movslq %eax,%rbx 
   0.38          incq   %gs:-0x7b9433f8(,%rbx,8)
                 
               class_idx = class - lock_classes;
                 
               if (depth && !sync) {
               /* we're holding locks and the new held lock is not a sync */
               hlock = curr->held_locks + depth - 1;
   0.39          mov    oops_in_progress,%edx
               class_idx = class - lock_classes;
   0.26          mov    0xeb8(%r14),%ecx
   0.25          mov    %ecx,(%rsp)
               hlock = curr->held_locks + depth - 1;
   0.25          test   %edx,%edx 
               ↓ jne    c3        
   0.26          cmp    $0x2f,%ecx
   0.00        ↓ ja     fc7       
               if (!references)
               references++;
                 
               if (!hlock->references)
               hlock->references++;
                 
   0.26    c3:   mov    (%rsp),%ecx
   0.26          lea    0x0(,%rcx,8),%r10
   0.13          mov    %rcx,%rsi 
   0.26          sub    %rcx,%r10 
   0.26          lea    0xec0(%r14),%rcx
   0.13          shl    $0x3,%r10 
               if (!hlock->references)

I suspect a lot of the formatting is up to objdump, but we should at 
least be able to print all out extra 's' output a visually more 
distinctive fashion?

But if we had full control over annotation output, I think a popular 
annotation output style would be to leave the original source code 
formatting and indentation.

Finally, here's a higher level UI design discussion of current perf 
top/report behavior:

perf top/report IMHO frequently violates core principles of good TUI 
interfaces:

  - There's no guaranteed feedback on keypresses:

     - If I press an unbound key it just ignores it, and I keep 
       wondering whether I mis-typed, misremembered the key binding, 
       whether the keybinding doesn't work, or wether it's a keybinding 
       with 'silent' behavior.

     - There are keybindings with 'silent' behavior: for example 'P' 
       (print current annotation) doesn't update anything in the status 
       line. A good TUI would do that, and it would also show something 
       visible on repeat keypresses of the same key.

     - Basically I consider it an UI bug if the tool does not give me 
       visual feedback on *any* keypress. If a user types something, 
       it's very much intended: arrow keys move up and down or 
       enter/exit functions, 'q' will quit, etc. There's *always* 
       expected behavior when a user presses a key - and it's basically 
       an UI bug if that is ignored for any reason.

  - On long processing steps sometimes there's no progress bar, the 
    tool just 'hangs'. In some cases this is implemented: such as the 
    initial parsing of perf.data. In others it's not: such as the 
    objdump annotation output. (Which is default-off, so this isn't 
    really a bugreport.)

  - Sometimes there's nonsensical UI flows with non-intuitive outcomes. 
    An example I noticed is the 'zoom in' behavior in perf report/top:

     - If in the default 'perf report' starting screen of a short 
       profiling session I enter into a function name on the screen, I 
       get these choices:

                Annotate file_cache_slot::get_next_line(char**, long*)
                Zoom into cc1 thread
                Zoom into cc1 DSO (use the 'k' hotkey to zoom directly into the kernel)
                Browse map details
                Run scripts for samples of symbol [file_cache_slot::get_next_line(char**, long*)]
                Run scripts for all samples
                Switch to another data file in PWD
                Exit

     - If I press 'Zoom into cc1 thread', I get into some sort of 
       filtered mode I am unable to exit via intuitive ways. There's no 
       on-screen hint to make it go away, I have to exit the tool and 
       restart it entirely to get back to the main profiling screen again 
       ... which is obviously a suboptimal UI flow.

     - To make matters worse, the 'h' screen gives me this hint:

             ESC           Zoom out

       except it doesn't appear to be working, my profiling output is 
       still limited to cc1 entries.

     - The *only* way I found to zoom out into the general screen again 
       was to press 'm' for the 'context menu' (I found this not 
       because it was named intuitively, but because I trial-error 
       brute forced the UI keybindings), and there it had:

             Zoom out of cc1 thread

     - Yay! But very unintuitive. The intuitive step would have been 
       the left arrow key BTW. (with restoring the cursor position to 
       the cc1 entry I pressed 'Enter' on originally), but that doesn't 
       do anything, it gets ignored. Also, a status line printout that 
       we are filtered to cc1, and how to exit that mode, would be nice 
       as well.

  - [ Sub-bugreport: at higher levels the 'P' keybinding doesn't print 
      the current screen like on the annotation screen, it goes back a 
      level. Guess how I found out. ;-)

      To be able to quote the above context menu screen, I copy-pasted 
      the ncurses lines, and manually edited out all the non-printable 
      characters unsuitable for emails. ]

I genuinely believe that these "basic profiling workflow" details are 
*FAR* more important to perf usability and perf popularity than pretty 
much any of the more esoteric hardware features, scalability 
enhancements and niche tools I see almost daily progress on - and these 
basic workflow details affect *everyone* who uses perf...

Thanks,

	Ingo

  parent reply	other threads:[~2025-04-08  8:05 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-07  8:08 [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim
2025-03-07  8:08 ` [PATCH 2/3] perf report: Allow hierarchy mode for --children Namhyung Kim
2025-03-07  8:08 ` [PATCH 3/3] perf report: Disable children column for data type profiling Namhyung Kim
2025-03-20  0:36 ` [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim
2025-03-20  9:32   ` Ingo Molnar
2025-03-20 16:16     ` Namhyung Kim
2025-03-24  7:28       ` Ingo Molnar
2025-03-25  0:26         ` Namhyung Kim
2025-04-04  9:41         ` [perf top] annotation doesn't work, libunwind doesn't seem to be working either Ingo Molnar
2025-04-04 17:28           ` Namhyung Kim
2025-04-04 18:13             ` Arnaldo Carvalho de Melo
2025-04-04 18:25               ` Arnaldo Carvalho de Melo
2025-04-04 18:40                 ` Arnaldo Carvalho de Melo
2025-04-05  9:06             ` Ingo Molnar
2025-04-05  9:09               ` Ingo Molnar
2025-04-07  6:02           ` Howard Chu
2025-04-07 16:58             ` Ingo Molnar
2025-04-07 17:04               ` Ingo Molnar
2025-04-08  0:54               ` Arnaldo Carvalho de Melo
2025-04-08  6:16                 ` Namhyung Kim
2025-04-09  3:26                   ` Arnaldo Carvalho de Melo
2025-04-10 20:48                     ` Namhyung Kim
2025-04-10 20:54                       ` Ingo Molnar
2025-04-24 12:37                       ` Arnaldo Carvalho de Melo
2025-04-08  8:05                 ` Ingo Molnar [this message]
2025-04-09  2:23                   ` Arnaldo Carvalho de Melo
2025-04-09 12:19                     ` Arnaldo Carvalho de Melo
2025-04-09 15:57                       ` Arnaldo Carvalho de Melo
2025-04-09 19:17                         ` Arnaldo Carvalho de Melo
2025-04-09 19:22                           ` Arnaldo Carvalho de Melo
2025-04-09 21:26                             ` Ingo Molnar
2025-04-10  1:38                               ` Arnaldo Carvalho de Melo
2025-04-10  6:24                                 ` Ingo Molnar
2025-04-10 14:03                                   ` Fixes for perf build system and TUI browsers was " Arnaldo Carvalho de Melo
2025-03-25  0:46     ` [PATCH 1/3] perf sort: Keep output fields in the same level Namhyung Kim
2025-03-30  5:54     ` Namhyung Kim
2025-03-21 18:30 ` Namhyung Kim

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z_TYux5fUg2pW-pF@gmail.com \
    --to=mingo@kernel.org \
    --cc=acme@kernel.org \
    --cc=adrian.hunter@intel.com \
    --cc=dvyukov@google.com \
    --cc=howardchu95@gmail.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=kan.liang@linux.intel.com \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=namhyung@kernel.org \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.