linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/1] perf report: fix missing references to maps and map in thread__find_map
@ 2024-06-27 21:36 Casey Chen
  2024-06-27 21:36 ` [PATCH] perf probe: " Casey Chen
  2024-06-27 21:36 ` [PATCH] debug patch for perf report segfault Casey Chen
  0 siblings, 2 replies; 5+ messages in thread
From: Casey Chen @ 2024-06-27 21:36 UTC (permalink / raw)
  To: linux-perf-users, linux-kernel; +Cc: yzhong, Casey Chen

We found a segfault when run perf report on raw perf_data from perf record.
Basically with 0dd5041c9a0e ("perf addr_location: Add init/exit/copy functions"),
thread__find_map() would return with al->maps being NULL when cpumode is 3
(macro PERF_RECORD_MISC_HYPERVISOR). Later dereferencing maps->machine would crash.

The fix is to add back references to struct maps/map in thread__find_map(),
please see 0001-perf-probe-fix-missing-references-to-maps-and-map-in.patch,
which is made and tested on afcd48134c58 ("Merge tag
'mm-hotfixes-stable-2024-06-26-17-28' of
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm")

Debug diffs made on 0dd5041c9a0e is attached as
0002-debug-patch-for-perf-report-segfault.patch

Steps to reproduce and what are observed:
1. Run 'gdb ./perf'
2. Initial gdb output. cpumode is 3 and address of node->ms is 0x55555684aee8.

(gdb) run report --stdio -v -i ~/perf_data
...
add_callchain_ip: i'm here! al.map (nil) al.sym (nil) ip >= PERF_CONTEXT_MAX: 0 cpumode 3 ip 0x8348fb8948535441
callchain_cursor_append addr of node->ms 0x55555684aee8 ms 0x7fffffff9e70 has no maps map (nil) symbol (nil)
Failed to open /tmp/perf-31239.map, continuing without symbols
callchain_cursor_commit: cursor 0x7ffff7a17b98 curr 0x55555684aee0 ms 0x55555684aee8
callchain_cursor_commit: cursor 0x7ffff7a17b98 curr 0x55555684aee0 ms 0x55555684aee8
callchain_cursor_commit: cursor 0x7ffff7a17b98 curr 0x55555684aee0 ms 0x55555684aee8

Program received signal SIGSEGV, Segmentation fault.
fill_callchain_info (al=0x7fffffffa120, node=0x55555684aee0, hide_unresolved=false) at util/callchain.c:1119
1119            struct machine *machine = maps__machine(node->ms.maps);

3. Backtrace from gdb
(gdb) bt
#0  fill_callchain_info (al=0x7fffffffa120, node=0x55555684aee0, hide_unresolved=false) at util/callchain.c:1119
#1  0x00005555556bdbc6 in hist_entry_iter__add (iter=iter@entry=0x7fffffffa160, al=al@entry=0x7fffffffa120, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffffb570) at util/hist.c:1245
#2  0x00005555555cd098 in process_sample_event (tool=0x7fffffffb570, event=0x7fffeb58be78, sample=0x7fffffffa210, evsel=<optimized out>, machine=0x55555585b8e0) at builtin-report.c:332
#3  0x00005555556958e3 in perf_session__deliver_event (session=0x55555585b730, event=0x7fffeb58be78, tool=0x7fffffffb570, file_offset=12656248, file_path=0x55555585ade0 "/root/perf_data/1/ct1/perf_data") at util/session.c:1645
#4  0x000055555569a5c1 in do_flush (show_progress=false, oe=0x555555862268) at util/ordered-events.c:245
#5  __ordered_events__flush (oe=0x555555862268, how=OE_FLUSH__ROUND, timestamp=<optimized out>) at util/ordered-events.c:324
#6  0x0000555555695cf5 in perf_session__process_user_event (session=0x55555585b730, event=0x7fffec19f7f8, file_offset=25319416, file_path=<optimized out>) at util/session.c:1698
#7  0x00005555556962a8 in reader__read_event (rd=rd@entry=0x7fffffffaef0, session=session@entry=0x55555585b730, prog=prog@entry=0x7fffffffaec0) at util/session.c:2371
#8  0x0000555555697e41 in reader__process_events (prog=0x7fffffffaec0, session=0x55555585b730, rd=0x7fffffffaef0) at util/session.c:2420
#9  __perf_session__process_events (session=0x55555585b730) at util/session.c:2467
#10 perf_session__process_events (session=session@entry=0x55555585b730) at util/session.c:2633
#11 0x00005555555cffcd in __cmd_report (rep=0x7fffffffb570) at builtin-report.c:991
#12 cmd_report (argc=<optimized out>, argv=<optimized out>) at builtin-report.c:1711
#13 0x0000555555631250 in run_builtin (p=p@entry=0x555555840d40 <commands+288>, argc=argc@entry=5, argv=argv@entry=0x7fffffffddd0) at perf.c:323
#14 0x00005555555b6179 in handle_internal_command (argv=0x7fffffffddd0, argc=5) at perf.c:377
#15 run_argv (argv=<synthetic pointer>, argcp=<synthetic pointer>) at perf.c:421
#16 main (argc=5, argv=0x7fffffffddd0) at perf.c:537

4. The line of code at fault. We can see that deferencing maps->machine caused segfault.
(gdb) disas
Dump of assembler code for function fill_callchain_info:
   0x0000555555684100 <+0>:     endbr64
   0x0000555555684104 <+4>:     push   %rbp
   0x0000555555684105 <+5>:     mov    %rsp,%rbp
   0x0000555555684108 <+8>:     push   %r14
   0x000055555568410a <+10>:    push   %r13
   0x000055555568410c <+12>:    mov    %rsi,%r13
   0x000055555568410f <+15>:    push   %r12
   0x0000555555684111 <+17>:    mov    %rdi,%r12
   0x0000555555684114 <+20>:    push   %rbx
   0x0000555555684115 <+21>:    mov    %edx,%ebx
   0x0000555555684117 <+23>:    sub    $0x10,%rsp
   0x000055555568411b <+27>:    mov    %fs:0x28,%rax
   0x0000555555684124 <+36>:    mov    %rax,-0x28(%rbp)
   0x0000555555684128 <+40>:    mov    0x8(%rsi),%rax
=> 0x000055555568412c <+44>:    mov    0x40(%rax),%r14

...
(gdb) list *0x000055555568412c
0x55555568412c is in fill_callchain_info (util/maps.h:81).
76              return &RC_CHK_ACCESS(maps)->entries;
77      }
78
79      static inline struct machine *maps__machine(struct maps *maps)
80      {
81              return RC_CHK_ACCESS(maps)->machine;
82      }
83
84      static inline struct rw_semaphore *maps__lock(struct maps *maps)
85      {

(gdb) p/x &(((struct maps *)0)->machine)
$1 = 0x40

5. Data structure in trouble. struct maps from node->ms are empty. The address of node->ms is 0x55555684aee8.
(gdb) p/x *node
$2 = {ip = 0x8348fb8948535441, ms = {maps = 0x0, map = 0x0, sym = 0x0}, srcline = 0x0, valid = 0x0, branch = 0x0, branch_flags = {{value = 0x0, {mispred = 0x0, predicted = 0x0, in_tx = 0x0, abort = 0x0, cycles = 0x0, type = 0x0, spec = 0x0, new_type = 0x0, priv = 0x0,
        reserved = 0x0}}}, branch_from = 0x0, nr_loop_iter = 0x0, iter_cycles = 0x0, next = 0x555556813730}
(gdb) p/x &(node->ms)
$3 = 0x55555684aee8

Casey Chen (1):
  perf probe: fix missing references to maps and map in thread__find_map

 tools/perf/util/event.c | 2 ++
 1 file changed, 2 insertions(+)

-- 
2.45.2


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] perf probe: fix missing references to maps and map in thread__find_map
  2024-06-27 21:36 [PATCH 0/1] perf report: fix missing references to maps and map in thread__find_map Casey Chen
@ 2024-06-27 21:36 ` Casey Chen
  2024-07-02 22:50   ` Namhyung Kim
  2024-06-27 21:36 ` [PATCH] debug patch for perf report segfault Casey Chen
  1 sibling, 1 reply; 5+ messages in thread
From: Casey Chen @ 2024-06-27 21:36 UTC (permalink / raw)
  To: linux-perf-users, linux-kernel; +Cc: yzhong, Casey Chen

With 0dd5041c9a0e ("perf addr_location: Add init/exit/copy functions"),
thread__find_map() would return with al->maps being NULL when cpumode is 3
(macro PERF_RECORD_MISC_HYPERVISOR), later dereferencing on it would crash.
Fix is to add back the reference to maps and map.
---
 tools/perf/util/event.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index f32f9abf6344..259cb137e5bb 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -638,6 +638,8 @@ struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr,
 			!perf_host)
 			al->filtered |= (1 << HIST_FILTER__HOST);
 
+	al->maps = maps__get(maps);
+	al->map = maps__find(maps, al->addr);
 		return NULL;
 	}
 	al->maps = maps__get(maps);
-- 
2.45.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [PATCH] debug patch for perf report segfault
  2024-06-27 21:36 [PATCH 0/1] perf report: fix missing references to maps and map in thread__find_map Casey Chen
  2024-06-27 21:36 ` [PATCH] perf probe: " Casey Chen
@ 2024-06-27 21:36 ` Casey Chen
  1 sibling, 0 replies; 5+ messages in thread
From: Casey Chen @ 2024-06-27 21:36 UTC (permalink / raw)
  To: linux-perf-users, linux-kernel; +Cc: yzhong, Casey Chen

Debug patch to apply on 0dd5041c9a0e ("perf addr_location: Add
init/exit/copy functions")

---
 tools/perf/util/callchain.c | 4 ++++
 tools/perf/util/callchain.h | 9 +++++++++
 tools/perf/util/machine.c   | 8 ++++++--
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index b0dafc758173..2c1d9b8ff8d9 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1067,6 +1067,10 @@ int callchain_cursor_append(struct callchain_cursor *cursor,
 	node->ip = ip;
 	map__zput(node->ms.map);
 	node->ms = *ms;
+    if (ms && !ms->maps) {
+        pr_info("%s addr of node->ms %p ms %p has no maps map %p symbol %p\n",
+                __func__, &node->ms, ms, ms->map, ms->sym);
+    }
 	node->ms.map = map__get(node->ms.map);
 	node->branch = branch;
 	node->nr_loop_iter = nr_loop_iter;
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index d95615daed73..8bba51be33a9 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -6,6 +6,7 @@
 #include <linux/rbtree.h>
 #include "map_symbol.h"
 #include "branch.h"
+#include "debug.h"
 
 struct addr_location;
 struct evsel;
@@ -213,6 +214,14 @@ static inline void callchain_cursor_commit(struct callchain_cursor *cursor)
 {
 	cursor->curr = cursor->first;
 	cursor->pos = 0;
+    if (cursor->curr) {
+       struct callchain_cursor_node *curr = cursor->curr;
+       struct map_symbol *ms = &curr->ms;
+          if (ms && !ms->maps) {
+            pr_info("%s: cursor %p curr %p ms %p\n", __func__, cursor, curr, ms);
+            dump_stack();
+          }
+    }
 }
 
 /* Cursor reading iteration helpers */
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 9fcf357a4d53..e6c90bb6b842 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2357,6 +2357,10 @@ static int add_callchain_ip(struct thread *thread,
 			goto out;
 		}
 		thread__find_symbol(thread, *cpumode, ip, &al);
+        if (!al.maps) {
+            pr_info("%s: i'm here! al.map %p al.sym %p ip >= PERF_CONTEXT_MAX: %d cpumode %d ip 0x%lx\n",
+                    __func__, al.map, al.sym, ip >= PERF_CONTEXT_MAX, *cpumode, ip);
+        }
 	}
 
 	if (al.sym != NULL) {
@@ -2384,8 +2388,8 @@ static int add_callchain_ip(struct thread *thread,
 	ms.map = map__get(al.map);
 	ms.sym = al.sym;
 
-	if (!branch && append_inlines(cursor, &ms, ip) == 0)
-		goto out;
+//	if (!branch && append_inlines(cursor, &ms, ip) == 0)
+//		goto out;
 
 	srcline = callchain_srcline(&ms, al.addr);
 	err = callchain_cursor_append(cursor, ip, &ms,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] perf probe: fix missing references to maps and map in thread__find_map
  2024-06-27 21:36 ` [PATCH] perf probe: " Casey Chen
@ 2024-07-02 22:50   ` Namhyung Kim
  0 siblings, 0 replies; 5+ messages in thread
From: Namhyung Kim @ 2024-07-02 22:50 UTC (permalink / raw)
  To: Casey Chen; +Cc: linux-perf-users, linux-kernel, yzhong

Hello,

I think the subject prefix should be "perf tool:" instead of "perf
probe:" as it's generic and not related to perf probe command.

On Thu, Jun 27, 2024 at 03:36:18PM -0600, Casey Chen wrote:
> With 0dd5041c9a0e ("perf addr_location: Add init/exit/copy functions"),
> thread__find_map() would return with al->maps being NULL when cpumode is 3
> (macro PERF_RECORD_MISC_HYPERVISOR), later dereferencing on it would crash.
> Fix is to add back the reference to maps and map.
> ---
>  tools/perf/util/event.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index f32f9abf6344..259cb137e5bb 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -638,6 +638,8 @@ struct map *thread__find_map(struct thread *thread, u8 cpumode, u64 addr,
>  			!perf_host)
>  			al->filtered |= (1 << HIST_FILTER__HOST);
>  
> +	al->maps = maps__get(maps);
> +	al->map = maps__find(maps, al->addr);

First, can you pleas fix the indentation?

And I'm not sure if it's the right fix.  It returns NULL so I think we
should expect al->map and al->maps being NULL.  Maybe we need to fix the
caller to handle NULL map or maps properly..

Thanks,
Namhyung


>  		return NULL;
>  	}
>  	al->maps = maps__get(maps);
> -- 
> 2.45.2
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] debug patch for perf report segfault
  2024-07-08 23:23 [PATCH 0/1] perf tool: fix handling NULL al->maps returned from thread__find_map Casey Chen
@ 2024-07-08 23:23 ` Casey Chen
  0 siblings, 0 replies; 5+ messages in thread
From: Casey Chen @ 2024-07-08 23:23 UTC (permalink / raw)
  To: linux-perf-users, linux-kernel; +Cc: yzhong, Casey Chen

Debug patch to apply on 0dd5041c9a0e ("perf addr_location: Add
init/exit/copy functions")

---
 tools/perf/util/callchain.c | 4 ++++
 tools/perf/util/callchain.h | 9 +++++++++
 tools/perf/util/machine.c   | 8 ++++++--
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index b0dafc758173..2c1d9b8ff8d9 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -1067,6 +1067,10 @@ int callchain_cursor_append(struct callchain_cursor *cursor,
 	node->ip = ip;
 	map__zput(node->ms.map);
 	node->ms = *ms;
+    if (ms && !ms->maps) {
+        pr_info("%s addr of node->ms %p ms %p has no maps map %p symbol %p\n",
+                __func__, &node->ms, ms, ms->map, ms->sym);
+    }
 	node->ms.map = map__get(node->ms.map);
 	node->branch = branch;
 	node->nr_loop_iter = nr_loop_iter;
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index d95615daed73..8bba51be33a9 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -6,6 +6,7 @@
 #include <linux/rbtree.h>
 #include "map_symbol.h"
 #include "branch.h"
+#include "debug.h"
 
 struct addr_location;
 struct evsel;
@@ -213,6 +214,14 @@ static inline void callchain_cursor_commit(struct callchain_cursor *cursor)
 {
 	cursor->curr = cursor->first;
 	cursor->pos = 0;
+    if (cursor->curr) {
+       struct callchain_cursor_node *curr = cursor->curr;
+       struct map_symbol *ms = &curr->ms;
+          if (ms && !ms->maps) {
+            pr_info("%s: cursor %p curr %p ms %p\n", __func__, cursor, curr, ms);
+            dump_stack();
+          }
+    }
 }
 
 /* Cursor reading iteration helpers */
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 9fcf357a4d53..e6c90bb6b842 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2357,6 +2357,10 @@ static int add_callchain_ip(struct thread *thread,
 			goto out;
 		}
 		thread__find_symbol(thread, *cpumode, ip, &al);
+        if (!al.maps) {
+            pr_info("%s: i'm here! al.map %p al.sym %p ip >= PERF_CONTEXT_MAX: %d cpumode %d ip 0x%lx\n",
+                    __func__, al.map, al.sym, ip >= PERF_CONTEXT_MAX, *cpumode, ip);
+        }
 	}
 
 	if (al.sym != NULL) {
@@ -2384,8 +2388,8 @@ static int add_callchain_ip(struct thread *thread,
 	ms.map = map__get(al.map);
 	ms.sym = al.sym;
 
-	if (!branch && append_inlines(cursor, &ms, ip) == 0)
-		goto out;
+//	if (!branch && append_inlines(cursor, &ms, ip) == 0)
+//		goto out;
 
 	srcline = callchain_srcline(&ms, al.addr);
 	err = callchain_cursor_append(cursor, ip, &ms,
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-07-08 23:23 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-27 21:36 [PATCH 0/1] perf report: fix missing references to maps and map in thread__find_map Casey Chen
2024-06-27 21:36 ` [PATCH] perf probe: " Casey Chen
2024-07-02 22:50   ` Namhyung Kim
2024-06-27 21:36 ` [PATCH] debug patch for perf report segfault Casey Chen
  -- strict thread matches above, loose matches on Subject: below --
2024-07-08 23:23 [PATCH 0/1] perf tool: fix handling NULL al->maps returned from thread__find_map Casey Chen
2024-07-08 23:23 ` [PATCH] debug patch for perf report segfault Casey Chen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).