[BUG] perf top reports not being able to resolve kernel symbols

linux-perf-users.vger.kernel.org archive mirror

* [BUG] perf top reports not being able to resolve kernel symbols
@ 2025-01-02 18:41 Arnaldo Carvalho de Melo
  2025-01-02 19:25 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2025-01-02 18:41 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Christophe Leroy, Adrian Hunter, Ian Rogers, James Clark,
	Jiri Olsa, Kan Liang, Linux Kernel Mailing List, linux-perf-users

While investigating a report by Christophe I stumbled on this:

⬢ [acme@toolbox perf-tools-next]$ git bisect bad
77b004f4c5c3c90b20ad61c5fa2ba7d494c1dba1 is the first bad commit
commit 77b004f4c5c3c90b20ad61c5fa2ba7d494c1dba1 (HEAD)
Author: Namhyung Kim <namhyung@kernel.org>
Date:   Thu Sep 12 15:42:08 2024 -0700
    perf symbol: Do not fixup end address of labels

⬢ [acme@toolbox perf-tools-next]$ git tag --contains 77b004f4c5c3c90b20ad61c5fa2ba7d494c1dba1 | grep ^v6
v6.13-rc1
v6.13-rc2
v6.13-rc3
v6.13-rc4
v6.13-rc5
⬢ [acme@toolbox perf-tools-next]$ 
⬢ [acme@toolbox perf-tools-next]$ git merge perf-tools/perf-tools
Already up to date.
⬢ [acme@toolbox perf-tools-next]$ git merge perf-tools/tmp.perf-tools
Already up to date.
⬢ [acme@toolbox perf-tools-next]$

Running just:

# perf top --stdio
   PerfTop:       0 irqs/sec  kernel: 0.0%  exact:  0.0% lost: 0/0 drop: 0/0 [cpu_atom/cycles/P],  (all, 28 CPUs)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Warning:
A vmlinux file was not found.
Kernel samples will not be resolved.
^C^C^C^Z
[1]+  Stopped                 perf top --stdio
root@number:/home/acme/git/pahole# 

But kernel addresses are being used and there is a matching vmlinux:

# pahole --running_kernel_vmlinux
/lib/modules/6.13.0-rc2/build/vmlinux
#

root@number:/home/acme/git/pahole# file /lib/modules/6.13.0-rc2/build/vmlinux
/lib/modules/6.13.0-rc2/build/vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=d6f220b80bb50b35238aec50ed10f98df58849c0, with debug_info, not stripped
root@number:/home/acme/git/pahole# perf buildid-list --kernel
d6f220b80bb50b35238aec50ed10f98df58849c0
root@number:/home/acme/git/pahole# 
root@number:/home/acme/git/pahole# perf buildid-list -h --kernel

 Usage: perf buildid-list [<options>]

    -k, --kernel          Show current kernel build id

root@number:/home/acme/git/pahole#

I'm trying to figure this out now.

- Arnaldo

* Re: [BUG] perf top reports not being able to resolve kernel symbols
  2025-01-02 18:41 [BUG] perf top reports not being able to resolve kernel symbols Arnaldo Carvalho de Melo
@ 2025-01-02 19:25 ` Arnaldo Carvalho de Melo
  2025-01-02 19:51   ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2025-01-02 19:25 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Christophe Leroy, Adrian Hunter, Ian Rogers, James Clark,
	Jiri Olsa, Kan Liang, Linux Kernel Mailing List, linux-perf-users

On Thu, Jan 02, 2025 at 03:41:34PM -0300, Arnaldo Carvalho de Melo wrote:
> I'm trying to figure this out now.

So the original logic seems borked: it's a kernel sample that isn't
getting resolved, but that doesn't mean that other samples are not being
resolved, so we can't say a vmlinux wasn't found, as it _was_:

Thread 31 "perf" hit Breakpoint 2, perf_event__process_sample (tool=0x7fffffff9bd0, event=0x28ef890, evsel=0xf68860, sample=0x7fff867fb470, machine=0xf8e818) at builtin-top.c:813
813				if (symbol_conf.vmlinux_name) {
(gdb) p symbol_conf.vmlinux_name
$8 = 0x0
(gdb) p al
$9 = {thread = 0xfe8cd0, maps = 0xf8ee80, map = 0xf8f120, sym = 0x0, srcline = 0x0, addr = 23076781, level = 107 'k', filtered = 0 '\000', cpumode = 1 '\001', cpu = 0, socket = -1}
(gdb) p al.map
$10 = (struct map *) 0xf8f120
(gdb) p al.map->dso
$11 = (struct dso *) 0xf8ef30
(gdb) p al.map->dso->name 
$12 = 0xf8f0bb "[kernel.kallsyms]"
(gdb) p __map__is_kernel(al.map)
$13 = true
(gdb) p map__has_symbols(al.map)
$14 = true

So looking at that logic I think we should only emit those warnings if
the kernel map has no symbols:

diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 724a7938632126bf..ca3e8eca6610e851 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -809,7 +809,7 @@ static void perf_event__process_sample(const struct perf_tool *tool,
 		 * invalid --vmlinux ;-)
 		 */
 		if (!machine->kptr_restrict_warned && !top->vmlinux_warned &&
-		    __map__is_kernel(al.map) && map__has_symbols(al.map)) {
+		    __map__is_kernel(al.map) && !map__has_symbols(al.map)) {
 			if (symbol_conf.vmlinux_name) {
 				char serr[256];
 

With this in place we reach the warning only when we specify an ELF file
that is not suitable as a vmlinux, be it a mismatching vmlinux or some
other random ELF file. In that case we end up with the kernel map "loaded"
(a load was attempted but the file was found invalid, so there are no
symbols in that map):

   PerfTop:       0 irqs/sec  kernel: 0.0%  exact:  0.0% lost: 0/0 drop: 0/0 [cpu_atom/cycles/P],  (all, 28 CPUs)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Warning:
The /bin/bash file can't be used: Mismatching build id
Kernel samples will not be resolved.

[2]+  Stopped                 perf top --stdio --vmlinux /bin/bash
root@number:~#
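
The one-character change to the warning condition can be sketched in isolation. This is only an illustration: `struct warn_state` and the two `should_warn_*()` helpers are hypothetical stand-ins for the four terms visible in the diff above, not the real perf structures, and any outer conditions in builtin-top.c that the hunk does not show are ignored.

```c
#include <stdbool.h>

/*
 * Hypothetical, simplified view of the state tested in
 * perf_event__process_sample(); field names are illustrative only.
 */
struct warn_state {
	bool kptr_restrict_warned;	/* machine->kptr_restrict_warned */
	bool vmlinux_warned;		/* top->vmlinux_warned */
	bool map_is_kernel;		/* __map__is_kernel(al.map) */
	bool map_has_symbols;		/* map__has_symbols(al.map) */
};

/* Condition before the fix: fires although the kernel map has symbols. */
bool should_warn_old(const struct warn_state *s)
{
	return !s->kptr_restrict_warned && !s->vmlinux_warned &&
	       s->map_is_kernel && s->map_has_symbols;
}

/* Condition after the fix: fires only when the kernel map has no symbols. */
bool should_warn_new(const struct warn_state *s)
{
	return !s->kptr_restrict_warned && !s->vmlinux_warned &&
	       s->map_is_kernel && !s->map_has_symbols;
}
```

With the values seen in the gdb session (kernel map, symbols present, and assuming no warning had been issued yet), the old condition is true and the new one is false, which matches the spurious "A vmlinux file was not found" message going away.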

But why is it that before your commit we seemingly had entries in
the symtab covering addresses that we now can't resolve?

One of these addresses is:

   0.58%  [kernel]                              [k] 0x00000000016001c1

Which...

root@number:~# pahole --running_kernel_vmlinux
/lib/modules/6.13.0-rc2/build/vmlinux
root@number:~#

root@number:~# readelf -sw /lib/modules/6.13.0-rc2/build/vmlinux | grep -B5 -A5 ' 0000000001600'
259227: ffffffff8156e290   262 FUNC    GLOBAL DEFAULT    1 zs_free
259228: ffffffff8183a4d0   269 FUNC    GLOBAL DEFAULT    1 security_inode_g[...]
259229: ffffffff81c8d900   191 FUNC    GLOBAL DEFAULT    1 devres_find
259230: ffffffff812e11c0    16 FUNC    GLOBAL DEFAULT    1 __pfx___probestu[...]
259231: ffffffff81c985a0    16 FUNC    GLOBAL DEFAULT    1 __pfx_pm_qos_sys[...]
259232: 0000000001600000     0 NOTYPE  GLOBAL DEFAULT  ABS text_size
259233: ffffffff81487f10   117 FUNC    GLOBAL DEFAULT    1 shmem_read_folio_gfp
259234: ffffffff81e08540   155 FUNC    GLOBAL DEFAULT    1 __traceiter_smbu[...]
259235: ffffffff811e13a0    16 FUNC    GLOBAL DEFAULT    1 __pfx_thaw_workqueues
259236: ffffffff81b04c70   599 FUNC    GLOBAL DEFAULT    1 acpi_install_method
259237: ffffffff81de7d40    16 FUNC    GLOBAL DEFAULT    1 __pfx_psmouse_se[...]
root@number:~#

There it is: that "text_size" symbol was left with a prev->end equal
to prev->start, and thus 0x00000000016001c1 stops being resolved, which
leads us to that buggy warning.
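
To make the prev->end == prev->start effect concrete, here is a minimal sketch under stated assumptions: `struct sym`, `fixup_end()` and `lookup()` are hypothetical simplifications, the lookup uses a plain half-open range test, and module handling and the last-symbol case of the real symbols__fixup_end() are left out. The `skip_labels` flag stands in for the `prev->type != STT_NOTYPE` test added by commit 77b004f4c5c3.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified symbol; not perf's struct symbol. */
struct sym {
	uint64_t start;
	uint64_t end;		/* exclusive; end == start means "no size" */
	bool is_label;		/* stands in for type == STT_NOTYPE */
	const char *name;
};

/*
 * Give zero-sized symbols an end at the next symbol's start.
 * The array must be sorted by start address.  With skip_labels set,
 * labels keep end == start, as after commit 77b004f4c5c3.
 */
void fixup_end(struct sym *syms, size_t n, bool skip_labels)
{
	for (size_t i = 0; i + 1 < n; i++) {
		struct sym *prev = &syms[i];

		if (prev->end != prev->start)
			continue;
		if (skip_labels && prev->is_label)
			continue;
		prev->end = syms[i + 1].start;
	}
}

/* Return the symbol whose [start, end) range contains addr, or NULL. */
const struct sym *lookup(const struct sym *syms, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= syms[i].start && addr < syms[i].end)
			return &syms[i];
	return NULL;
}
```

Given a zero-sized label at 0x1600000 followed by some later symbol, looking up 0x16001c1 returns NULL when labels are skipped and returns the label once its end has been extended.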

I'll put all this into a patch and send it for review.

Thanks,

- Arnaldo

* Re: [BUG] perf top reports not being able to resolve kernel symbols
  2025-01-02 19:25 ` Arnaldo Carvalho de Melo
@ 2025-01-02 19:51   ` Arnaldo Carvalho de Melo
  2025-01-02 20:58     ` Namhyung Kim
  0 siblings, 1 reply; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2025-01-02 19:51 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Christophe Leroy, Adrian Hunter, Ian Rogers, James Clark,
	Jiri Olsa, Kan Liang, Linux Kernel Mailing List, linux-perf-users

On Thu, Jan 02, 2025 at 04:25:07PM -0300, Arnaldo Carvalho de Melo wrote:
> root@number:~# readelf -sw /lib/modules/6.13.0-rc2/build/vmlinux | grep -B5 -A5 ' 0000000001600'
> 259227: ffffffff8156e290   262 FUNC    GLOBAL DEFAULT    1 zs_free
> 259228: ffffffff8183a4d0   269 FUNC    GLOBAL DEFAULT    1 security_inode_g[...]
> 259229: ffffffff81c8d900   191 FUNC    GLOBAL DEFAULT    1 devres_find
> 259230: ffffffff812e11c0    16 FUNC    GLOBAL DEFAULT    1 __pfx___probestu[...]
> 259231: ffffffff81c985a0    16 FUNC    GLOBAL DEFAULT    1 __pfx_pm_qos_sys[...]
> 259232: 0000000001600000     0 NOTYPE  GLOBAL DEFAULT  ABS text_size
> 259233: ffffffff81487f10   117 FUNC    GLOBAL DEFAULT    1 shmem_read_folio_gfp
> 259234: ffffffff81e08540   155 FUNC    GLOBAL DEFAULT    1 __traceiter_smbu[...]
> 259235: ffffffff811e13a0    16 FUNC    GLOBAL DEFAULT    1 __pfx_thaw_workqueues
> 259236: ffffffff81b04c70   599 FUNC    GLOBAL DEFAULT    1 acpi_install_method
> 259237: ffffffff81de7d40    16 FUNC    GLOBAL DEFAULT    1 __pfx_psmouse_se[...]
> root@number:~#
 
> There it is: that "text_size" symbol was left with a prev->end equal
> to prev->start, and thus 0x00000000016001c1 stops being resolved, which
> leads us to that buggy warning.
 
> I'll put all this into a patch and send it for review,

But looking further, where are those 0x00000000016001c1 addresses coming
from?

(gdb) p /x sample->ip
$10 = 0xffffffffb7401fad
(gdb) p /x al->addr
$11 = 0x1601fad
(gdb) bt
#0  perf_event__process_sample (tool=0x7fffffff9bd0, event=0x1017400, evsel=0xf68860, sample=0x7fff8dffa470, machine=0xf8e818) at builtin-top.c:813
#1  0x0000000000447c5c in deliver_event (qe=0x7fffffff9ee8, qevent=0x1024670) at builtin-top.c:1213
#2  0x0000000000642706 in do_flush (oe=0x7fffffff9ee8, show_progress=false) at util/ordered-events.c:245
#3  0x0000000000642a5d in __ordered_events__flush (oe=0x7fffffff9ee8, how=OE_FLUSH__TOP, timestamp=0) at util/ordered-events.c:324
#4  0x0000000000642b47 in ordered_events__flush (oe=0x7fffffff9ee8, how=OE_FLUSH__TOP) at util/ordered-events.c:342
#5  0x00000000004477e9 in process_thread (arg=0x7fffffff9bd0) at builtin-top.c:1125
#6  0x00007ffff6ea5d97 in start_thread () from /lib64/libc.so.6
#7  0x00007ffff6f29c8c in clone3 () from /lib64/libc.so.6
(gdb)

root@number:~# grep ffffffffb7401f /proc/kallsyms 
ffffffffb7401f09 t repeat_nmi
ffffffffb7401f2e t end_repeat_nmi
ffffffffb7401f81 t nmi_no_fsgsbase
ffffffffb7401f85 t nmi_swapgs
ffffffffb7401f88 t nmi_restore
ffffffffb7401fb0 T entry_SYSCALL32_ignore
ffffffffb7401fd0 T __pfx_clear_bhb_loop
ffffffffb7401fe0 T clear_bhb_loop
root@number:~# 

Looks like nmi_restore...
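
The "looks like nmi_restore" step is a floor lookup: take the last kallsyms entry whose address is not above the sampled ip. A small sketch of that check follows, with the table copied from the grep output above; `struct ksym` and `ksym_floor()` are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct ksym {
	uint64_t addr;
	const char *name;
};

/* Entries from the /proc/kallsyms grep above, sorted by address. */
static const struct ksym ksyms[] = {
	{ 0xffffffffb7401f09ULL, "repeat_nmi" },
	{ 0xffffffffb7401f2eULL, "end_repeat_nmi" },
	{ 0xffffffffb7401f81ULL, "nmi_no_fsgsbase" },
	{ 0xffffffffb7401f85ULL, "nmi_swapgs" },
	{ 0xffffffffb7401f88ULL, "nmi_restore" },
	{ 0xffffffffb7401fb0ULL, "entry_SYSCALL32_ignore" },
	{ 0xffffffffb7401fd0ULL, "__pfx_clear_bhb_loop" },
	{ 0xffffffffb7401fe0ULL, "clear_bhb_loop" },
};

/* Name of the last listed symbol with addr <= ip, or NULL if none. */
const char *ksym_floor(uint64_t ip)
{
	const char *best = NULL;

	for (size_t i = 0; i < sizeof(ksyms) / sizeof(ksyms[0]); i++) {
		if (ksyms[i].addr > ip)
			break;
		best = ksyms[i].name;
	}
	return best;
}
```

For sample->ip = 0xffffffffb7401fad this yields nmi_restore, since the ip lies after nmi_restore (…1f88) and before entry_SYSCALL32_ignore (…1fb0).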

Which is...

   780: ffffffff82401ee8     0 NOTYPE  LOCAL  DEFAULT    1 nested_nmi_out
   781: ffffffff82401ed0     0 NOTYPE  LOCAL  DEFAULT    1 nested_nmi
   782: ffffffff82401eeb     0 NOTYPE  LOCAL  DEFAULT    1 first_nmi
   783: ffffffff82401f81     0 NOTYPE  LOCAL  DEFAULT    1 nmi_no_fsgsbase
   784: ffffffff82401f88     0 NOTYPE  LOCAL  DEFAULT    1 nmi_restore
   785: ffffffff82401f85     0 NOTYPE  LOCAL  DEFAULT    1 nmi_swapgs
   786: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS syscall_64.c
   787: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS common.c
   788: ffffffff810cc2b0    16 FUNC    LOCAL  DEFAULT    1 ia32_emulation_o[...]
   789: ffffffff821e57f0   241 FUNC    LOCAL  DEFAULT    1 __do_fast_syscall_32
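
The readelf addresses differ from the /proc/kallsyms ones by a constant amount. The sketch below computes that difference and maps the sampled ip back into vmlinux address space. That the offset is the KASLR slide is an assumption made here, not something the thread states, and `load_offset()` and `to_vmlinux_addr()` are hypothetical helper names.

```c
#include <stdint.h>

/* Runtime (/proc/kallsyms) address minus link-time (vmlinux) address. */
uint64_t load_offset(uint64_t runtime_addr, uint64_t vmlinux_addr)
{
	return runtime_addr - vmlinux_addr;
}

/* Translate a sampled ip back into vmlinux address space. */
uint64_t to_vmlinux_addr(uint64_t ip, uint64_t offset)
{
	return ip - offset;
}
```

Using nmi_restore from both listings, 0xffffffffb7401f88 - 0xffffffff82401f88 gives 0x35000000, and the same value results for nmi_swapgs and nmi_no_fsgsbase. Subtracting it from sample->ip (0xffffffffb7401fad) gives 0xffffffff82401fad, which sits just past the nmi_restore label at 0xffffffff82401f88 in the vmlinux symbol table.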

So there are symbols that are no longer being resolved but were
resolved before your patch, namely:

arch/x86/entry/entry_64.S

nmi_no_fsgsbase:
        /* EBX == 0 -> invoke SWAPGS */
        testl   %ebx, %ebx
        jnz     nmi_restore

nmi_swapgs:
        swapgs

nmi_restore:
        POP_REGS

- Arnaldo

* Re: [BUG] perf top reports not being able to resolve kernel symbols
  2025-01-02 19:51   ` Arnaldo Carvalho de Melo
@ 2025-01-02 20:58     ` Namhyung Kim
  2025-01-03  1:16       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 7+ messages in thread
From: Namhyung Kim @ 2025-01-02 20:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Christophe Leroy, Adrian Hunter, Ian Rogers, James Clark,
	Jiri Olsa, Kan Liang, Linux Kernel Mailing List, linux-perf-users

Hi Arnaldo,

On Thu, Jan 02, 2025 at 04:51:06PM -0300, Arnaldo Carvalho de Melo wrote:
> So there are symbols that are no longer being resolved but were
> resolved before your patch, namely:
> 
> arch/x86/entry/entry_64.S
> 
> nmi_no_fsgsbase:
>         /* EBX == 0 -> invoke SWAPGS */
>         testl   %ebx, %ebx
>         jnz     nmi_restore
> 
> nmi_swapgs:
>         swapgs
> 
> nmi_restore:
>         POP_REGS
> 

Sorry about that, maybe I should've done this instead.  Can you check
if it works correctly?

Thanks,
Namhyung

---8<---

From 3130ee711d28f6e280d4bf04bdacca094657bb99 Mon Sep 17 00:00:00 2001
From: Namhyung Kim <namhyung@kernel.org>
Date: Thu, 2 Jan 2025 12:32:51 -0800
Subject: [PATCH] perf symbol: Prefer non-label symbols with same address

When there is more than one symbol at the same address, perf needs to
choose which one is better.  choose_best_symbol() didn't check the type
of the symbols.  It's possible to have labels inside other symbols, and
in that case it is better to pick the actual symbol over the label.
To minimize the possible impact on other symbols, I only check NOTYPE
symbols specifically.

  $ readelf -sW vmlinux | grep -e __do_softirq -e __softirqentry_text_start
  105089: ffffffff82000000   814 FUNC    GLOBAL DEFAULT    1 __do_softirq
  111954: ffffffff82000000     0 NOTYPE  GLOBAL DEFAULT    1 __softirqentry_text_start

The commit 77b004f4c5c3c90b tried to do the same by not giving a size
to label symbols, but it seems there are some label-only symbols in asm
code.  Let's restore the original code and choose the right symbol using
the type of the symbols.

Fixes: 77b004f4c5c3c90b ("perf symbol: Do not fixup end address of labels")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/perf/util/symbol.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 0037f11639195dbf..49b08adc6ee34365 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -154,6 +154,13 @@ static int choose_best_symbol(struct symbol *syma, struct symbol *symb)
 	else if ((a == 0) && (b > 0))
 		return SYMBOL_B;
 
+	if (syma->type != symb->type) {
+		if (syma->type == STT_NOTYPE)
+			return SYMBOL_B;
+		if (symb->type == STT_NOTYPE)
+			return SYMBOL_A;
+	}
+
 	/* Prefer a non weak symbol over a weak one */
 	a = syma->binding == STB_WEAK;
 	b = symb->binding == STB_WEAK;
@@ -257,7 +264,7 @@ void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms)
 		 * like in:
 		 *   ffffffffc1937000 T hdmi_driver_init  [snd_hda_codec_hdmi]
 		 */
-		if (prev->end == prev->start && prev->type != STT_NOTYPE) {
+		if (prev->end == prev->start) {
 			const char *prev_mod;
 			const char *curr_mod;
 
-- 
2.47.1.613.gc27f4b7a9f-goog
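
The type-based tie-break added to choose_best_symbol() can be tried out on its own. The following is a sketch of that single hunk only: the `SKETCH_*` constants, `struct sketch_sym` and `prefer_typed()` are hypothetical names, the earlier and later checks in the real function are omitted, and the STT_* values are those defined by the ELF specification.

```c
/* ELF symbol types, with the values given by the ELF specification. */
enum {
	SKETCH_STT_NOTYPE = 0,
	SKETCH_STT_FUNC   = 2,
};

/* Hypothetical result codes; UNDECIDED means "fall through to later checks". */
enum {
	SKETCH_SYMBOL_A  = 0,
	SKETCH_SYMBOL_B  = 1,
	SKETCH_UNDECIDED = 2,
};

/* Hypothetical, simplified symbol carrying only what the hunk looks at. */
struct sketch_sym {
	unsigned char type;
	const char *name;
};

/* When exactly one of the two symbols is a NOTYPE label, pick the other. */
int prefer_typed(const struct sketch_sym *a, const struct sketch_sym *b)
{
	if (a->type != b->type) {
		if (a->type == SKETCH_STT_NOTYPE)
			return SKETCH_SYMBOL_B;
		if (b->type == SKETCH_STT_NOTYPE)
			return SKETCH_SYMBOL_A;
	}
	return SKETCH_UNDECIDED;
}
```

For the pair from the commit message, __do_softirq (FUNC) and __softirqentry_text_start (NOTYPE) at the same address, the function symbol is chosen regardless of argument order. Two symbols of the same type are left to the remaining checks.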


* Re: [BUG] perf top reports not being able to resolve kernel symbols
  2025-01-02 20:58     ` Namhyung Kim
@ 2025-01-03  1:16       ` Arnaldo Carvalho de Melo
  2025-01-03 16:33         ` Arnaldo Carvalho de Melo
  2025-01-09 21:17         ` Arnaldo Carvalho de Melo
  0 siblings, 2 replies; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2025-01-03  1:16 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Christophe Leroy, Adrian Hunter, Ian Rogers, James Clark,
	Jiri Olsa, Kan Liang, Linux Kernel Mailing List, linux-perf-users

On Thu, Jan 02, 2025 at 12:58:54PM -0800, Namhyung Kim wrote:
> On Thu, Jan 02, 2025 at 04:51:06PM -0300, Arnaldo Carvalho de Melo wrote:
> > So there are symbols that are no longer being resolved but were
> > resolved before your patch, namely:

> > arch/x86/entry/entry_64.S

> > nmi_no_fsgsbase:
> >         /* EBX == 0 -> invoke SWAPGS */
> >         testl   %ebx, %ebx
> >         jnz     nmi_restore

> > nmi_swapgs:
> >         swapgs

> > nmi_restore:
> >         POP_REGS
 
> Sorry about that, maybe I should've done this instead.  Can you check
> if it works correctly?

It's late here, but a basic test shows samples being resolved to
nmi_restore. When using the TUI 'perf top' interface and pressing / to
filter for samples resolved to symbols with 'nmi' in their name, several
other such routines appeared on the radar, including:

Samples: 2K of event 'cpu_atom/cycles/P', 4000 Hz, Event count (approx.): 496683332 lost: 0/0 drop: 0/0
Overhead  Shared O  Symbol
   0.05%  [kernel]  [k] ct_nmi_enter
   0.04%  [kernel]  [k] local_touch_nmi
   0.01%  [kernel]  [k] ct_nmi_exit
   0.01%  [kernel]  [k] nmi_restore
   0.00%  [kernel]  [k] nmi_handle

[1]+  Stopped                 perf top
root@number:~#

So:

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>

And preliminarily:

Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Thanks, I'll submit the other patch tomorrow. It no longer needs to go
into the 6.13 window and thus can be added to the perf-tools-next
branch.

- Arnaldo

* Re: [BUG] perf top reports not being able to resolve kernel symbols
  2025-01-03  1:16       ` Arnaldo Carvalho de Melo
@ 2025-01-03 16:33         ` Arnaldo Carvalho de Melo
  2025-01-09 21:17         ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2025-01-03 16:33 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Christophe Leroy, Adrian Hunter, Ian Rogers, James Clark,
	Jiri Olsa, Kan Liang, Linux Kernel Mailing List, linux-perf-users

On Thu, Jan 02, 2025 at 10:16:35PM -0300, Arnaldo Carvalho de Melo wrote:
> So:
> 
> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> And preliminarily:
> 
> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Please also add a:

Closes: https://lore.kernel.org/lkml/Z3buKhcCsZi3_aGb@x1

So that further details are provided about what those asm symbols are.

- Arnaldo
 
> Thanks, I'll submit the other patch, that now doesn't need to go into
> the 6.13 window and thus can be added to the perf-tools-next branch,
> tomorrow.

* Re: [BUG] perf top reports not being able to resolve kernel symbols
  2025-01-03  1:16       ` Arnaldo Carvalho de Melo
  2025-01-03 16:33         ` Arnaldo Carvalho de Melo
@ 2025-01-09 21:17         ` Arnaldo Carvalho de Melo
  1 sibling, 0 replies; 7+ messages in thread
From: Arnaldo Carvalho de Melo @ 2025-01-09 21:17 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Christophe Leroy, Adrian Hunter, Ian Rogers, James Clark,
	Jiri Olsa, Kan Liang, Linux Kernel Mailing List, linux-perf-users

On Thu, Jan 02, 2025 at 10:16:35PM -0300, Arnaldo Carvalho de Melo wrote:
> So:
> 
> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
> 
> And preliminarily:
> 
> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Applied.

- Arnaldo
