* [PATCH] decode_stacktrace: Support heuristic caller address search
@ 2026-03-05 5:12 Masami Hiramatsu (Google)
2026-03-05 14:56 ` Matthieu Baerts
2026-03-05 15:51 ` Sasha Levin
0 siblings, 2 replies; 6+ messages in thread
From: Masami Hiramatsu (Google) @ 2026-03-05 5:12 UTC (permalink / raw)
To: Matthieu Baerts, Andrew Morton, Sasha Levin
Cc: Carlos Llamas, Luca Ceresoli, Masami Hiramatsu, linux-kernel
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Add a -c option to decode_stacktrace that searches for the call address.
This tries to decode line info backwards, starting from 1 byte before
the return address, and displays the first line info it finds as
the caller address.
If it tries up to 10bytes before (or the symbol address) and still
can not find it, it gives up and decodes the return address.
With -c option:
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
lockdep_rcu_suspicious (kernel/locking/lockdep.c:6876)
event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:1057)
kernel_clone (include/trace/events/sched.h:396 include/trace/events/sched.h:396 kernel/fork.c:2664)
__x64_sys_clone (kernel/fork.c:2795 kernel/fork.c:2779 kernel/fork.c:2779)
do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
? trace_irq_disable (include/trace/events/preemptirq.h:36)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
Without -c option:
Call Trace:
<TASK>
dump_stack_lvl (lib/dump_stack.c:122)
lockdep_rcu_suspicious (kernel/locking/lockdep.c:6877)
event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:?)
kernel_clone (include/trace/events/sched.h:? include/trace/events/sched.h:396 kernel/fork.c:2664)
__x64_sys_clone (kernel/fork.c:2779)
do_syscall_64 (arch/x86/entry/syscall_64.c:?)
? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
? trace_irq_disable (include/trace/events/preemptirq.h:36)
entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
scripts/decode_stacktrace.sh | 51 ++++++++++++++++++++++++++++++++++++++----
1 file changed, 46 insertions(+), 5 deletions(-)
diff --git a/scripts/decode_stacktrace.sh b/scripts/decode_stacktrace.sh
index 8d01b741de62..78e0810af476 100755
--- a/scripts/decode_stacktrace.sh
+++ b/scripts/decode_stacktrace.sh
@@ -5,9 +5,11 @@
usage() {
echo "Usage:"
- echo " $0 -r <release>"
- echo " $0 [<vmlinux> [<base_path>|auto [<modules_path>]]]"
+ echo " $0 [-c] -r <release>"
+ echo " $0 [-c] [<vmlinux> [<base_path>|auto [<modules_path>]]]"
echo " $0 -h"
+ echo "Options:"
+ echo " -c: Decode heuristically searched call address."
}
# Try to find a Rust demangler
@@ -33,11 +35,17 @@ fi
READELF=${UTIL_PREFIX}readelf${UTIL_SUFFIX}
ADDR2LINE=${UTIL_PREFIX}addr2line${UTIL_SUFFIX}
NM=${UTIL_PREFIX}nm${UTIL_SUFFIX}
+call_search=false
if [[ $1 == "-h" ]] ; then
usage
exit 0
-elif [[ $1 == "-r" ]] ; then
+elif [[ $1 == "-c" ]] ; then
+ call_search=true
+ shift 1
+fi
+
+if [[ $1 == "-r" ]] ; then
vmlinux=""
basepath="auto"
modpath=""
@@ -123,6 +131,28 @@ find_module() {
return 1
}
+UNKNOWN_LINE="??:0"
+
+search_call_site() {
+ # Instead of using the return address, use the nearest line info
+ # address before given address.
+ local return_addr=${2}
+ local max=${3}
+ local i
+
+ for i in $(seq 1 ${max}); do
+ local expr=$((0x$return_addr-$i))
+ local address=$(printf "%x\n" "$expr")
+
+ local code=$(${ADDR2LINE} -i -e "${1}" "$address" 2>/dev/null)
+ local first=${code% *}
+ if [[ "$code" != "" && "$code" != ${UNKNOWN_LINE} && "${first#*:}" != "?" ]]; then
+ echo "$code"
+ break
+ fi
+ done
+}
+
parse_symbol() {
# The structure of symbol at this point is:
# ([name]+[offset]/[total length])
@@ -176,6 +206,9 @@ parse_symbol() {
# Let's start doing the math to get the exact address into the
# symbol. First, strip out the symbol total length.
local expr=${symbol%/*}
+ # Also parse the offset from symbol.
+ local offset=${expr#*+}
+ offset=$((offset))
# Now, replace the symbol name with the base address we found
# before.
@@ -190,7 +223,15 @@ parse_symbol() {
if [[ $aarray_support == true && "${cache[$module,$address]+isset}" == "isset" ]]; then
local code=${cache[$module,$address]}
else
- local code=$(${ADDR2LINE} -i -e "$objfile" "$address" 2>/dev/null)
+ local code
+ if [[ $call_search == true && $offset != 0 ]]; then
+ code=$(search_call_site "$objfile" "$address" "$offset")
+ fi
+
+ if [[ "$code" == "" ]]; then
+ code=$(${ADDR2LINE} -i -e "$objfile" "$address" 2>/dev/null)
+ fi
+
if [[ $aarray_support == true ]]; then
cache[$module,$address]=$code
fi
@@ -199,7 +240,7 @@ parse_symbol() {
# addr2line doesn't return a proper error code if it fails, so
# we detect it using the value it prints so that we could preserve
# the offset/size into the function and bail out
- if [[ $code == "??:0" ]]; then
+ if [[ $code == ${UNKNOWN_LINE} ]]; then
return
fi
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] decode_stacktrace: Support heuristic caller address search
2026-03-05 5:12 [PATCH] decode_stacktrace: Support heuristic caller address search Masami Hiramatsu (Google)
@ 2026-03-05 14:56 ` Matthieu Baerts
2026-03-05 16:11 ` Masami Hiramatsu
2026-03-05 15:51 ` Sasha Levin
1 sibling, 1 reply; 6+ messages in thread
From: Matthieu Baerts @ 2026-03-05 14:56 UTC (permalink / raw)
To: Masami Hiramatsu (Google), Andrew Morton, Sasha Levin
Cc: Carlos Llamas, Luca Ceresoli, linux-kernel
Hi Masami,
On 05/03/2026 06:12, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Add -c option to search call address search to decode_stacktrace.
> This tries to decode line info backwards, starting from 1byte before
> the return address, and displays the first line info it founds as
> the caller address.
> If it tries up to 10bytes before (or the symbol address) and still
> can not find it, it gives up and decodes the return address.
Thank you for this new option!
> With -c option:
> Call Trace:
> <TASK>
> dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
> lockdep_rcu_suspicious (kernel/locking/lockdep.c:6876)
> event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:1057)
> kernel_clone (include/trace/events/sched.h:396 include/trace/events/sched.h:396 kernel/fork.c:2664)
> __x64_sys_clone (kernel/fork.c:2795 kernel/fork.c:2779 kernel/fork.c:2779)
> do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
> ? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
> ? trace_irq_disable (include/trace/events/preemptirq.h:36)
> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
>
>
> Without -c option:
> Call Trace:
> <TASK>
> dump_stack_lvl (lib/dump_stack.c:122)
> lockdep_rcu_suspicious (kernel/locking/lockdep.c:6877)
> event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:?)
> kernel_clone (include/trace/events/sched.h:? include/trace/events/sched.h:396 kernel/fork.c:2664)
> __x64_sys_clone (kernel/fork.c:2779)
> do_syscall_64 (arch/x86/entry/syscall_64.c:?)
> ? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> ? trace_irq_disable (include/trace/events/preemptirq.h:36)
> entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
That's better indeed!
Do we need a new option for that? Could it not be the new default
behaviour? Or are there any downsides with it?
"addr2line" will be called more, but if it is worth it, it is probably
not an issue, or is it?
Cheers,
Matt
--
Sponsored by the NGI0 Core fund.
* Re: [PATCH] decode_stacktrace: Support heuristic caller address search
2026-03-05 5:12 [PATCH] decode_stacktrace: Support heuristic caller address search Masami Hiramatsu (Google)
2026-03-05 14:56 ` Matthieu Baerts
@ 2026-03-05 15:51 ` Sasha Levin
2026-03-05 16:32 ` Masami Hiramatsu
1 sibling, 1 reply; 6+ messages in thread
From: Sasha Levin @ 2026-03-05 15:51 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: Matthieu Baerts, Andrew Morton, Carlos Llamas, Luca Ceresoli,
linux-kernel
On Thu, 5 Mar 2026 14:12:19 +0900, Masami Hiramatsu (Google) wrote:
> Add -c option to search call address search to decode_stacktrace.
> This tries to decode line info backwards, starting from 1byte before
> the return address, and displays the first line info it founds as
> the caller address.
> If it tries up to 10bytes before (or the symbol address) and still
> can not find it, it gives up and decodes the return address.
The commit message says "up to 10bytes" but the code passes $offset
(the function offset from the symbol) as the max iteration count to
search_call_site(). There's no 10-byte cap anywhere in the code?
$offset can easily be hundreds or thousands of bytes into a function.
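A minimal sketch of the missing cap, assuming the 10-byte limit from the commit message is the intended bound. The helper name search_limit and the variable MAX_SEARCH_BYTES are hypothetical and do not appear in the patch:

```shell
#!/bin/bash
# Assumed cap, taken from the "up to 10bytes" wording in the commit message.
MAX_SEARCH_BYTES=10

# Hypothetical helper: print the smaller of the symbol offset and the cap,
# so the backward search never walks past the symbol start nor past the cap.
search_limit() {
	local offset=$1
	local cap=${2:-$MAX_SEARCH_BYTES}

	if [ "$offset" -lt "$cap" ]; then
		echo "$offset"
	else
		echo "$cap"
	fi
}
```

With such a helper, the call site could read something like search_call_site "$objfile" "$address" "$(search_limit "$offset")", which keeps the number of addr2line invocations per frame bounded.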
> +search_call_site() {
> + # Instead of using the return address, use the nearest line info
> + # address before given address.
> + local return_addr=${2}
> + local max=${3}
> + local i
> +
> + for i in $(seq 1 ${max}); do
> + local expr=$((0x$return_addr-$i))
> + local address=$(printf "%x\n" "$expr")
> +
> + local code=$(${ADDR2LINE} -i -e "${1}" "$address" 2>/dev/null)
> + local first=${code% *}
> + if [[ "$code" != "" && "$code" != ${UNKNOWN_LINE} && "${first#*:}" != "?" ]]; then
To also address Matthieu's question about performance: I think this
whole iterative search could be replaced by simply subtracting 1 from
the return address before passing it to addr2line.
DWARF line tables map address *ranges* to source lines, so any address
within the CALL instruction resolves to the correct source line.
return_addr-1 is guaranteed to land inside the CALL instruction (it's
the last byte of it), so a single addr2line call is sufficient.
This is exactly what the kernel itself does in sprint_backtrace()
(kernel/kallsyms.c:570): it passes symbol_offset=-1 to
__sprint_symbol(), which does `address += symbol_offset` before
lookup. GDB, perf, and libunwind all use the same addr-1 trick for
the same reason.
That would make this both correct and free.
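As a rough illustration of the addr-1 approach described above (a sketch, not the eventual patch), the address handed to addr2line could be derived as follows. The helper name call_site_address is hypothetical:

```shell
#!/bin/bash
# Hypothetical helper: given a hex return address (with or without a
# leading 0x), print the hex address of the byte just before it, which
# is the last byte of the preceding CALL instruction.
call_site_address() {
	local return_addr=${1#0x}

	printf "%x\n" $(( 0x$return_addr - 1 ))
}

# Assumed usage inside parse_symbol(), mirroring the existing lookup:
#   code=$(${ADDR2LINE} -i -e "$objfile" "$(call_site_address "$address")" 2>/dev/null)
```

This needs only one addr2line call per frame, the same as the current code without -c.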
> + if [[ "$code" != "" && "$code" != ${UNKNOWN_LINE} && "${first#*:}" != "?" ]]; then
Minor: ${UNKNOWN_LINE} is "??:0" -- when unquoted on the RHS of != inside
[[ ]], the ? characters are interpreted as glob wildcards (each matching
any single character). It happens to work here because ? also matches '?'
itself, but it should be quoted as "${UNKNOWN_LINE}" for correctness.
Same issue on the other != ${UNKNOWN_LINE} below.
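The difference can be shown with a small standalone sketch. The two function names are made up for illustration, and "ab:0" is an artificial input chosen only to show the pattern match:

```shell
#!/bin/bash
UNKNOWN_LINE="??:0"

# Unquoted right-hand side: bash treats it as a pattern, so each '?'
# matches any single character.
is_unknown_unquoted() {
	if [[ $1 == ${UNKNOWN_LINE} ]]; then echo yes; else echo no; fi
}

# Quoted right-hand side: bash compares it as a literal string.
is_unknown_quoted() {
	if [[ $1 == "${UNKNOWN_LINE}" ]]; then echo yes; else echo no; fi
}

is_unknown_unquoted "ab:0"   # pattern match succeeds
is_unknown_quoted "ab:0"     # literal comparison fails
is_unknown_quoted "??:0"     # literal comparison succeeds
```

Both forms agree on the real "??:0" output, which is why the unquoted version appears to work.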
--
Thanks,
Sasha
* Re: [PATCH] decode_stacktrace: Support heuristic caller address search
2026-03-05 14:56 ` Matthieu Baerts
@ 2026-03-05 16:11 ` Masami Hiramatsu
0 siblings, 0 replies; 6+ messages in thread
From: Masami Hiramatsu @ 2026-03-05 16:11 UTC (permalink / raw)
To: Matthieu Baerts
Cc: Andrew Morton, Sasha Levin, Carlos Llamas, Luca Ceresoli,
linux-kernel
On Thu, 5 Mar 2026 15:56:13 +0100
Matthieu Baerts <matttbe@kernel.org> wrote:
> Hi Masami,
>
> On 05/03/2026 06:12, Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > Add -c option to search call address search to decode_stacktrace.
> > This tries to decode line info backwards, starting from 1byte before
> > the return address, and displays the first line info it founds as
> > the caller address.
> > If it tries up to 10bytes before (or the symbol address) and still
> > can not find it, it gives up and decodes the return address.
>
> Thank you for this new option!
>
> > With -c option:
> > Call Trace:
> > <TASK>
> > dump_stack_lvl (lib/dump_stack.c:94 lib/dump_stack.c:120)
> > lockdep_rcu_suspicious (kernel/locking/lockdep.c:6876)
> > event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:1057)
> > kernel_clone (include/trace/events/sched.h:396 include/trace/events/sched.h:396 kernel/fork.c:2664)
> > __x64_sys_clone (kernel/fork.c:2795 kernel/fork.c:2779 kernel/fork.c:2779)
> > do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94)
> > ? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
> > ? trace_irq_disable (include/trace/events/preemptirq.h:36)
> > entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:121)
> >
> >
> > Without -c option:
> > Call Trace:
> > <TASK>
> > dump_stack_lvl (lib/dump_stack.c:122)
> > lockdep_rcu_suspicious (kernel/locking/lockdep.c:6877)
> > event_filter_pid_sched_process_fork (kernel/trace/trace_events.c:?)
> > kernel_clone (include/trace/events/sched.h:? include/trace/events/sched.h:396 kernel/fork.c:2664)
> > __x64_sys_clone (kernel/fork.c:2779)
> > do_syscall_64 (arch/x86/entry/syscall_64.c:?)
> > ? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> > ? trace_irq_disable (include/trace/events/preemptirq.h:36)
> > entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> That's better indeed!
>
> Do we need a new option for that? Could it not be the new default
> behaviour? Or are there any downsides with it?
AFAIK, this may not work well on architectures that have a branch delay
slot (I have not tested this), which execute one more instruction
after the branch instruction before actually branching. In that case, the
return address is not the instruction immediately after the call, since the
delay slot sits in between.
But I think such architectures are not popular anymore, so we can switch the
default behavior, and maybe we can switch it based on architecture.
Thank you,
>
> "addr2line" will be called more, but if it is worth it, it is probably
> not an issue, or is it?
>
> Cheers,
> Matt
> --
> Sponsored by the NGI0 Core fund.
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
* Re: [PATCH] decode_stacktrace: Support heuristic caller address search
2026-03-05 15:51 ` Sasha Levin
@ 2026-03-05 16:32 ` Masami Hiramatsu
2026-03-05 20:38 ` Sasha Levin
0 siblings, 1 reply; 6+ messages in thread
From: Masami Hiramatsu @ 2026-03-05 16:32 UTC (permalink / raw)
To: Sasha Levin
Cc: Matthieu Baerts, Andrew Morton, Carlos Llamas, Luca Ceresoli,
linux-kernel
On Thu, 5 Mar 2026 10:51:47 -0500
Sasha Levin <sashal@kernel.org> wrote:
> On Thu, 5 Mar 2026 14:12:19 +0900, Masami Hiramatsu (Google) wrote:
> > Add -c option to search call address search to decode_stacktrace.
> > This tries to decode line info backwards, starting from 1byte before
> > the return address, and displays the first line info it founds as
> > the caller address.
> > If it tries up to 10bytes before (or the symbol address) and still
> > can not find it, it gives up and decodes the return address.
>
> The commit message says "up to 10bytes" but the code passes $offset
> (the function offset from the symbol) as the max iteration count to
> search_call_site(). There's no 10-byte cap anywhere in the code?
> $offset can easily be hundreds or thousands of bytes into a function.
Ah, sorry. I forgot to set the maximum :(
>
> > +search_call_site() {
> > + # Instead of using the return address, use the nearest line info
> > + # address before given address.
> > + local return_addr=${2}
> > + local max=${3}
> > + local i
> > +
> > + for i in $(seq 1 ${max}); do
> > + local expr=$((0x$return_addr-$i))
> > + local address=$(printf "%x\n" "$expr")
> > +
> > + local code=$(${ADDR2LINE} -i -e "${1}" "$address" 2>/dev/null)
> > + local first=${code% *}
> > + if [[ "$code" != "" && "$code" != ${UNKNOWN_LINE} && "${first#*:}" != "?" ]]; then
>
> To also address Matthieu's question about performance: I think this
> whole iterative search could be replaced by simply subtracting 1 from
> the return address before passing it to addr2line.
>
> DWARF line tables map address *ranges* to source lines, so any address
> within the CALL instruction resolves to the correct source line.
> return_addr-1 is guaranteed to land inside the CALL instruction (it's
> the last byte of it), so a single addr2line call is sufficient.
Ah, got it, OK. I also confirmed "addr-1" works. But if there is no lineinfo
entry for the call instruction, shouldn't we check more instructions before
the call?
>
> This is exactly what the kernel itself does in sprint_backtrace()
> (kernel/kallsyms.c:570): it passes symbol_offset=-1 to
> __sprint_symbol(), which does `address += symbol_offset` before
> lookup. GDB, perf, and libunwind all use the same addr-1 trick for
> the same reason.
OK.
>
> That would make this both correct and free.
>
> > + if [[ "$code" != "" && "$code" != ${UNKNOWN_LINE} && "${first#*:}" != "?" ]]; then
>
> Minor: ${UNKNOWN_LINE} is "??:0" -- when unquoted on the RHS of != inside
> [[ ]], the ? characters are interpreted as glob wildcards (each matching
> any single character). It happens to work here because ? also matches '?'
> itself, but it should be quoted as "${UNKNOWN_LINE}" for correctness.
> Same issue on the other != ${UNKNOWN_LINE} below.
Ah, OK. Let me fix it.
Thanks,
>
> --
> Thanks,
> Sasha
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
* Re: [PATCH] decode_stacktrace: Support heuristic caller address search
2026-03-05 16:32 ` Masami Hiramatsu
@ 2026-03-05 20:38 ` Sasha Levin
0 siblings, 0 replies; 6+ messages in thread
From: Sasha Levin @ 2026-03-05 20:38 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Matthieu Baerts, Andrew Morton, Carlos Llamas, Luca Ceresoli,
linux-kernel
On Fri, Mar 06, 2026 at 01:32:41AM +0900, Masami Hiramatsu wrote:
>On Thu, 5 Mar 2026 10:51:47 -0500
>Sasha Levin <sashal@kernel.org> wrote:
>> DWARF line tables map address *ranges* to source lines, so any address
>> within the CALL instruction resolves to the correct source line.
>> return_addr-1 is guaranteed to land inside the CALL instruction (it's
>> the last byte of it), so a single addr2line call is sufficient.
>
>Ah, got it, OK. I also confirmed "addr-1" works. But if there is no lineinfo
>entry for the call instruction, shouldn't we check more instructions before
>the call?
There's no such thing as "no lineinfo entry for the call instruction" - DWARF
line tables are range-based, not discrete points. Each row covers all addresses
up to the next row, so every address within a function resolves to some source
line. addr-1 lands inside the CALL instruction and will always resolve to the
same line as the CALL itself.
We show "??:0" because the address we passed falls outside of any DWARF
compilation unit altogether.
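To make the range semantics concrete, here is a toy model of such a lookup. It is only a sketch: the table contents, addresses, and the lookup_line name are invented for illustration, and real DWARF line programs are decoded by tools such as addr2line, not like this:

```shell
#!/bin/bash
# Toy line table: each row starts a range that runs until the next row.
# The "end" row closes the sequence. In this made-up layout the CALL
# instruction occupies 0x1008..0x100c and the return address is 0x100d.
LINE_TABLE=(
	"0x1000 fork.c:2660"
	"0x1008 fork.c:2664"
	"0x100d fork.c:2666"
	"0x1020 end"
)

# Print the source line of the last row whose start address is <= $1,
# or "??:0" when the address lies outside the covered range.
lookup_line() {
	local addr=$(( $1 ))
	local row start loc result="??:0"

	for row in "${LINE_TABLE[@]}"; do
		start=${row%% *}
		loc=${row#* }
		if (( addr < start )); then
			break
		fi
		if [[ $loc == "end" ]]; then
			result="??:0"
		else
			result=$loc
		fi
	done
	echo "$result"
}

lookup_line 0x100d   # return address: resolves to the line after the call
lookup_line 0x100c   # return address - 1: resolves to the call's line
lookup_line 0x2000   # outside the table: ??:0
```

In this model every address between the first row and the end marker resolves to some line, and only addresses outside the table fall back to "??:0".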
--
Thanks,
Sasha