From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5F31D3E316F for ; Mon, 27 Apr 2026 17:17:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777310268; cv=none; b=EuLAXjXllUTaC/FhSEJXnP9OaFTLaIUt340jy70j0MVMsYj3Kr3s4MXxKCh4/XgPNjuDgvKJgujfRf/vqKPYRk/nR0uYieOk+s7kLcPl83DkCLEoZt15+cI7178OznCRqgkUD2qE7OPp5tTebLUF+usRpmaUtiLq56cnpwLoMTY= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777310268; c=relaxed/simple; bh=+aVKacKonwpTDcXijyWEMw1FDy1IVHlfw1ZJF0M+N5s=; h=Date:To:From:Subject:Message-Id; b=hh0eVIdLItXMFtSBmSRzOTkNdYs5hVW+1g7CGave36KAD674QAcLmadQywo2y8H49e9yLejMDZ+IRbhJMCvX21jkv1udesKGT5SxxJRXibVDwNwejY4Jm0zm5qlDC24JVCsVKYIKY6Z3EOf/sNmz1AXbxSwzs7UeTe24t70ar+o= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux-foundation.org header.i=@linux-foundation.org header.b=HG7FLKCy; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux-foundation.org header.i=@linux-foundation.org header.b="HG7FLKCy" Received: by smtp.kernel.org (Postfix) with ESMTPSA id DF24FC2BCB4; Mon, 27 Apr 2026 17:17:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linux-foundation.org; s=korg; t=1777310268; bh=+aVKacKonwpTDcXijyWEMw1FDy1IVHlfw1ZJF0M+N5s=; h=Date:To:From:Subject:From; b=HG7FLKCyVHpFBTxU6zXH3f0G8IKPLqmJCzfqN8VyDtP859bhmDMBFGXtqDkgikKWS gwkSIzvp6GJgqnkEGKvFI7uInyPM8lnKRWbI52KpjillTOEzrjEkvta9GKKHEKg2li LPZXGaJ0uvkNOvCP8S/0s+oVLXRs0TXoExSAC2Lw= Date: Mon, 27 Apr 2026 10:17:47 -0700 To: mm-commits@vger.kernel.org,sashal@kernel.org,akpm@linux-foundation.org From: Andrew Morton Subject: [to-be-updated] kallsyms-embed-source-file-line-info-in-kernel-stack-traces.patch removed from -mm tree Message-Id: <20260427171747.DF24FC2BCB4@smtp.kernel.org> Precedence: bulk X-Mailing-List: mm-commits@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: The quilt patch titled Subject: kallsyms: embed source file:line info in kernel stack traces has been removed from the -mm tree. Its filename was kallsyms-embed-source-file-line-info-in-kernel-stack-traces.patch This patch was dropped because an updated version will be issued ------------------------------------------------------ From: Sasha Levin Subject: kallsyms: embed source file:line info in kernel stack traces Date: Sun, 22 Mar 2026 09:15:40 -0400 Patch series "kallsyms: embed source file:line info in kernel stack traces", v4. This series adds CONFIG_KALLSYMS_LINEINFO, which embeds source file:line information directly in the kernel image so that stack traces annotate every frame with the originating source location - no external tools, no debug symbols at runtime, and safe to use in NMI/panic context. Motivation ========== The recent "slowly decommission bugzilla?" thread surfaced a recurring problem: when users encounter kernel crashes they see stack traces like `func+0x1ec/0x240` but have no way to identify which subsystem or maintainer to contact. Richard Weinberger proposed building a database mapping symbols to source files using nm/DWARF. Linus pointed to scripts/decode_stacktrace.sh as the existing solution. But as the discussion progressed, it became clear that decode_stacktrace.sh has significant practical barriers that prevent it from being useful in the common case. Problems with scripts/decode_stacktrace.sh ========================================== - Requires debug symbols: the script needs vmlinux with DWARF debug info. Many distros don't retain debug symbols for older or security kernels, and even when available, asking users to obtain matching debuginfo packages is a significant hurdle. - Requires toolchain: users need addr2line and nm installed. - Version-matching requirement: debug symbols must exactly match the running kernel binary. What this series does ===================== Patch 1: CONFIG_KALLSYMS_LINEINFO At build time, a host tool (scripts/gen_lineinfo) reads DWARF .debug_line from vmlinux, extracts address-to-file:line mappings, and embeds them as sorted lookup tables in .rodata. At runtime, kallsyms_lookup_lineinfo() binary-searches the table and __sprint_symbol() appends "(file:line)" to each stack frame. NMI/panic- safe (no locks, no allocations), KASLR-compatible. Patch 2: CONFIG_KALLSYMS_LINEINFO_MODULES Extends lineinfo to loadable modules. Each .ko gets a .mod_lineinfo section embedded at build time. The module loader picks it up at load time. Same zero-allocation, NMI-safe lookup. Patch 3: delta compression Block-indexed delta-encoding with LEB128 varints, implementing the approach suggested by Juergen Gross in the RFC review. Reduces overhead from ~44 MiB to ~11 MiB (~3.7 bytes/entry), addressing the primary size concern from the RFC. Patch 4: KUnit tests 30 KUnit tests covering the lineinfo lookup paths, delta-decode logic, boundary conditions, and integration with the backtrace formatting APIs. Example output ============== [ 11.206749] dump_stack_lvl+0x5d/0x80 (lib/dump_stack.c:94) [ 11.207403] vpanic+0x36e/0x620 (kernel/panic.c:650) [ 11.209324] panic+0xc9/0xd0 (kernel/panic.c:787) [ 11.213312] sysrq_handle_crash+0x1a/0x20 (drivers/tty/sysrq.c:154) [ 11.214005] __handle_sysrq.cold+0x66/0x256 (drivers/tty/sysrq.c:611) [ 11.214712] write_sysrq_trigger+0x65/0x80 (drivers/tty/sysrq.c:1221) [ 11.215424] proc_reg_write+0x1bd/0x3c0 (fs/proc/inode.c:330) [ 11.216061] vfs_write+0x1c6/0xff0 (fs/read_write.c:686) [ 11.218848] ksys_write+0xfa/0x200 (fs/read_write.c:740) [ 11.222394] do_syscall_64+0xf3/0x690 (arch/x86/entry/syscall_64.c:63) Size impact =========== Measured with a Debian kernel config: - bzImage: +3.6 MiB (14 MiB -> 18 MiB, +26%) - Runtime memory: +5.9 MiB (text+data+bss) - Code overhead: +5.0 KiB (.text, lookup functions only) - Data overhead: +5.9 MiB (.data, lineinfo tables) Lineinfo data breakdown: - lineinfo_data (delta-compressed): 5,728 KiB (97%) - lineinfo_block_addrs: 99 KiB - lineinfo_block_offsets: 99 KiB - lineinfo_filenames: 111 KiB - lineinfo_file_offsets: 17 KiB The ~5.9 MiB is after 2.7x delta compression; uncompressed would be ~16 MiB. This is a fraction of the cost of shipping full DWARF debug info (hundreds of MiB), which distros must store and serve for every kernel version. For distros, maintaining debug symbol repositories is expensive: storage, mirrors, and CDN bandwidth for hundreds of MiB per kernel build add up quickly. A ~5.9 MiB increase in the kernel image itself is a modest cost that eliminates the need for users to find, download, and version-match debuginfo packages just to make a crash report useful. For developers, the file:line annotations appear immediately in crash traces - no post-processing with decode_stacktrace.sh needed. This patch (of 4): Add CONFIG_KALLSYMS_LINEINFO, which embeds a compact address-to-line lookup table in the kernel image so stack traces directly print source file and line number information: root@localhost:~# echo c > /proc/sysrq-trigger [ 11.201987] sysrq: Trigger a crash [ 11.202831] Kernel panic - not syncing: sysrq triggered crash [ 11.206218] Call Trace: [ 11.206501] [ 11.206749] dump_stack_lvl+0x5d/0x80 (lib/dump_stack.c:94) [ 11.207403] vpanic+0x36e/0x620 (kernel/panic.c:650) [ 11.208565] ? __lock_acquire+0x465/0x2240 (kernel/locking/lockdep.c:4674) [ 11.209324] panic+0xc9/0xd0 (kernel/panic.c:787) [ 11.211873] ? find_held_lock+0x2b/0x80 (kernel/locking/lockdep.c:5350) [ 11.212597] ? lock_release+0xd3/0x300 (kernel/locking/lockdep.c:5535) [ 11.213312] sysrq_handle_crash+0x1a/0x20 (drivers/tty/sysrq.c:154) [ 11.214005] __handle_sysrq.cold+0x66/0x256 (drivers/tty/sysrq.c:611) [ 11.214712] write_sysrq_trigger+0x65/0x80 (drivers/tty/sysrq.c:1221) [ 11.215424] proc_reg_write+0x1bd/0x3c0 (fs/proc/inode.c:330) [ 11.216061] vfs_write+0x1c6/0xff0 (fs/read_write.c:686) [ 11.218848] ksys_write+0xfa/0x200 (fs/read_write.c:740) [ 11.222394] do_syscall_64+0xf3/0x690 (arch/x86/entry/syscall_64.c:63) [ 11.223942] entry_SYSCALL_64_after_hwframe+0x77/0x7f (arch/x86/entry/entry_64.S:121) At build time, a new host tool (scripts/gen_lineinfo) reads DWARF .debug_line from vmlinux using libdw (elfutils), extracts all address-to-file:line mappings, and generates an assembly file with sorted parallel arrays (offsets from _text, file IDs, and line numbers). These are linked into vmlinux as .rodata. At runtime, kallsyms_lookup_lineinfo() does a binary search on the table and __sprint_symbol() appends "(file:line)" to each stack frame. The lookup uses offsets from _text so it works with KASLR, requires no locks or allocations, and is safe in any context including panic. The feature requires CONFIG_DEBUG_INFO (for DWARF data) and elfutils (libdw-dev) on the build host. Memory footprint measured with a 1852-option x86_64 config: Table: 4,597,583 entries from 4,841 source files lineinfo_addrs[] 4,597,583 x u32 = 17.5 MiB lineinfo_file_ids[] 4,597,583 x u16 = 8.8 MiB lineinfo_lines[] 4,597,583 x u32 = 17.5 MiB file_offsets + filenames ~ 0.1 MiB Total .rodata increase: ~ 44.0 MiB vmlinux (stripped): 529 MiB -> 573 MiB (+44 MiB / +8.3%) Note: this probably won't be something we roll into "production", but it might be useful for the average user given the relatively low memory footprint, in canary deployments for hyperscalers, or by default for folks who run tests/fuzzing/etc. Disclaimer: this was vibe coded over an afternoon with an AI coding assistant. The .config used for testing is a simple KVM guest configuration for local development and testing. Assisted-by: Claude:claude-opus-4-6 Link: https://lore.kernel.org/20260322131543.971079-1-sashal@kernel.org Link: https://lore.kernel.org/20260322131543.971079-2-sashal@kernel.org Signed-off-by: Sasha Levin Cc: Geert Uytterhoeven Cc: Greg Kroah-Hartman Cc: Helge Deller Cc: James Bottomley Cc: Jonathan Corbet Cc: Juegren Gross Cc: Kees Cook Cc: Laurent Pinchart Cc: Luis Chamberalin Cc: Masahiro Yamada Cc: Nathan Chancellor Cc: Peter Zijlstra Cc: Petr Mladek Cc: Petr Pavlu Cc: Randy Dunlap Cc: Richard Weinberger Cc: Steven Rostedt Cc: Thorsten Leemhuis Signed-off-by: Andrew Morton --- Documentation/admin-guide/index.rst | 1 Documentation/admin-guide/kallsyms-lineinfo.rst | 72 + MAINTAINERS | 6 include/linux/kallsyms.h | 17 init/Kconfig | 20 kernel/kallsyms.c | 56 + kernel/kallsyms_internal.h | 9 scripts/.gitignore | 1 scripts/Makefile | 3 scripts/empty_lineinfo.S | 30 scripts/gen_lineinfo.c | 510 ++++++++++++++ scripts/kallsyms.c | 16 scripts/link-vmlinux.sh | 43 + 13 files changed, 780 insertions(+), 4 deletions(-) --- a/Documentation/admin-guide/index.rst~kallsyms-embed-source-file-line-info-in-kernel-stack-traces +++ a/Documentation/admin-guide/index.rst @@ -73,6 +73,7 @@ problems and bugs in particular. ramoops dynamic-debug-howto init + kallsyms-lineinfo kdump/index perf/index pstore-blk diff --git a/Documentation/admin-guide/kallsyms-lineinfo.rst a/Documentation/admin-guide/kallsyms-lineinfo.rst new file mode 100644 --- /dev/null +++ a/Documentation/admin-guide/kallsyms-lineinfo.rst @@ -0,0 +1,72 @@ +.. SPDX-License-Identifier: GPL-2.0 + +==================================== +Kallsyms Source Line Info (LINEINFO) +==================================== + +Overview +======== + +``CONFIG_KALLSYMS_LINEINFO`` embeds DWARF-derived source file and line number +mappings into the kernel image so that stack traces include +``(file.c:123)`` annotations next to each symbol. This makes it significantly +easier to pinpoint the exact source location during debugging, without needing +to manually cross-reference addresses with ``addr2line``. + +Enabling the Feature +==================== + +Enable the following kernel configuration options:: + + CONFIG_KALLSYMS=y + CONFIG_DEBUG_INFO=y + CONFIG_KALLSYMS_LINEINFO=y + +Build dependency: the host tool ``scripts/gen_lineinfo`` requires ``libdw`` +from elfutils. Install the development package: + +- Debian/Ubuntu: ``apt install libdw-dev`` +- Fedora/RHEL: ``dnf install elfutils-devel`` +- Arch Linux: ``pacman -S elfutils`` + +Example Output +============== + +Without ``CONFIG_KALLSYMS_LINEINFO``:: + + Call Trace: + + dump_stack_lvl+0x5d/0x80 + do_syscall_64+0x82/0x190 + entry_SYSCALL_64_after_hwframe+0x76/0x7e + +With ``CONFIG_KALLSYMS_LINEINFO``:: + + Call Trace: + + dump_stack_lvl+0x5d/0x80 (lib/dump_stack.c:123) + do_syscall_64+0x82/0x190 (arch/x86/entry/common.c:52) + entry_SYSCALL_64_after_hwframe+0x76/0x7e + +Note that assembly routines (such as ``entry_SYSCALL_64_after_hwframe``) are +not annotated because they lack DWARF debug information. + +Memory Overhead +=============== + +The lineinfo tables are stored in ``.rodata`` and typically add approximately +44 MiB to the kernel image for a standard configuration (~4.6 million DWARF +line entries, ~10 bytes per entry after deduplication). + +Known Limitations +================= + +- **vmlinux only**: Only symbols in the core kernel image are annotated. + Module symbols are not covered. +- **4 GiB offset limit**: Address offsets from ``_text`` are stored as 32-bit + values. Entries beyond 4 GiB from ``_text`` are skipped at build time with + a warning. +- **65535 file limit**: Source file IDs are stored as 16-bit values. Builds + with more than 65535 unique source files will fail with an error. +- **No assembly annotations**: Functions implemented in assembly that lack + DWARF ``.debug_line`` data are not annotated. --- a/include/linux/kallsyms.h~kallsyms-embed-source-file-line-info-in-kernel-stack-traces +++ a/include/linux/kallsyms.h @@ -16,10 +16,15 @@ #include #define KSYM_NAME_LEN 512 + +/* Extra space for " (path/to/file.c:12345)" suffix when lineinfo is enabled */ +#define KSYM_LINEINFO_LEN (IS_ENABLED(CONFIG_KALLSYMS_LINEINFO) ? 128 : 0) + #define KSYM_SYMBOL_LEN (sizeof("%s+%#lx/%#lx [%s %s]") + \ (KSYM_NAME_LEN - 1) + \ 2*(BITS_PER_LONG*3/10) + (MODULE_NAME_LEN - 1) + \ - (BUILD_ID_SIZE_MAX * 2) + 1) + (BUILD_ID_SIZE_MAX * 2) + 1 + \ + KSYM_LINEINFO_LEN) struct cred; struct module; @@ -96,6 +101,9 @@ extern int sprint_backtrace_build_id(cha int lookup_symbol_name(unsigned long addr, char *symname); +bool kallsyms_lookup_lineinfo(unsigned long addr, + const char **file, unsigned int *line); + #else /* !CONFIG_KALLSYMS */ static inline unsigned long kallsyms_lookup_name(const char *name) @@ -164,6 +172,13 @@ static inline int kallsyms_on_each_match { return -EOPNOTSUPP; } + +static inline bool kallsyms_lookup_lineinfo(unsigned long addr, + const char **file, + unsigned int *line) +{ + return false; +} #endif /*CONFIG_KALLSYMS*/ static inline void print_ip_sym(const char *loglvl, unsigned long ip) --- a/init/Kconfig~kallsyms-embed-source-file-line-info-in-kernel-stack-traces +++ a/init/Kconfig @@ -2070,6 +2070,26 @@ config KALLSYMS_ALL Say N unless you really need all symbols, or kernel live patching. +config KALLSYMS_LINEINFO + bool "Embed source file:line information in stack traces" + depends on KALLSYMS && DEBUG_INFO + help + Embeds an address-to-source-line mapping table in the kernel + image so that stack traces directly include file:line information, + similar to what scripts/decode_stacktrace.sh provides but without + needing external tools or a vmlinux with debug info at runtime. + + When enabled, stack traces will look like: + + kmem_cache_alloc_noprof+0x60/0x630 (mm/slub.c:3456) + anon_vma_clone+0x2ed/0xcf0 (mm/rmap.c:412) + + This requires elfutils (libdw-dev/elfutils-devel) on the build host. + Adds approximately 44MB to a typical kernel image (10 bytes per + DWARF line-table entry, ~4.6M entries for a typical config). + + If unsure, say N. + # end of the "standard kernel features (expert users)" menu config ARCH_HAS_MEMBARRIER_CALLBACKS --- a/kernel/kallsyms.c~kallsyms-embed-source-file-line-info-in-kernel-stack-traces +++ a/kernel/kallsyms.c @@ -467,6 +467,52 @@ static int append_buildid(char *buffer, #endif /* CONFIG_STACKTRACE_BUILD_ID */ +bool kallsyms_lookup_lineinfo(unsigned long addr, + const char **file, unsigned int *line) +{ + unsigned long long raw_offset; + unsigned int offset, low, high, mid, file_id; + + if (!IS_ENABLED(CONFIG_KALLSYMS_LINEINFO) || !lineinfo_num_entries) + return false; + + /* Compute offset from _text */ + if (addr < (unsigned long)_text) + return false; + + raw_offset = addr - (unsigned long)_text; + if (raw_offset > UINT_MAX) + return false; + offset = (unsigned int)raw_offset; + + /* Binary search for largest entry <= offset */ + low = 0; + high = lineinfo_num_entries; + while (low < high) { + mid = low + (high - low) / 2; + if (lineinfo_addrs[mid] <= offset) + low = mid + 1; + else + high = mid; + } + + if (low == 0) + return false; + low--; + + file_id = lineinfo_file_ids[low]; + *line = lineinfo_lines[low]; + + if (file_id >= lineinfo_num_files) + return false; + + if (lineinfo_file_offsets[file_id] >= lineinfo_filenames_size) + return false; + + *file = &lineinfo_filenames[lineinfo_file_offsets[file_id]]; + return true; +} + /* Look up a kernel symbol and return it in a text buffer. */ static int __sprint_symbol(char *buffer, unsigned long address, int symbol_offset, int add_offset, int add_buildid) @@ -497,6 +543,16 @@ static int __sprint_symbol(char *buffer, len += sprintf(buffer + len, "]"); } + if (IS_ENABLED(CONFIG_KALLSYMS_LINEINFO) && !modname) { + const char *li_file; + unsigned int li_line; + + if (kallsyms_lookup_lineinfo(address, + &li_file, &li_line)) + len += snprintf(buffer + len, KSYM_SYMBOL_LEN - len, + " (%s:%u)", li_file, li_line); + } + return len; } --- a/kernel/kallsyms_internal.h~kallsyms-embed-source-file-line-info-in-kernel-stack-traces +++ a/kernel/kallsyms_internal.h @@ -15,4 +15,13 @@ extern const u16 kallsyms_token_index[]; extern const unsigned int kallsyms_markers[]; extern const u8 kallsyms_seqs_of_names[]; +extern const u32 lineinfo_num_entries; +extern const u32 lineinfo_addrs[]; +extern const u16 lineinfo_file_ids[]; +extern const u32 lineinfo_lines[]; +extern const u32 lineinfo_num_files; +extern const u32 lineinfo_file_offsets[]; +extern const u32 lineinfo_filenames_size; +extern const char lineinfo_filenames[]; + #endif // LINUX_KALLSYMS_INTERNAL_H_ --- a/MAINTAINERS~kallsyms-embed-source-file-line-info-in-kernel-stack-traces +++ a/MAINTAINERS @@ -13793,6 +13793,12 @@ S: Maintained F: Documentation/hwmon/k8temp.rst F: drivers/hwmon/k8temp.c +KALLSYMS LINEINFO +M: Sasha Levin +S: Maintained +F: Documentation/admin-guide/kallsyms-lineinfo.rst +F: scripts/gen_lineinfo.c + KASAN M: Andrey Ryabinin R: Alexander Potapenko diff --git a/scripts/empty_lineinfo.S a/scripts/empty_lineinfo.S new file mode 100644 --- /dev/null +++ a/scripts/empty_lineinfo.S @@ -0,0 +1,30 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright (C) 2026 Sasha Levin + * + * Empty lineinfo stub for the initial vmlinux link. + * The real lineinfo is generated from .tmp_vmlinux1 by gen_lineinfo. + */ + .section .rodata, "a" + .globl lineinfo_num_entries + .balign 4 +lineinfo_num_entries: + .long 0 + .globl lineinfo_num_files + .balign 4 +lineinfo_num_files: + .long 0 + .globl lineinfo_addrs +lineinfo_addrs: + .globl lineinfo_file_ids +lineinfo_file_ids: + .globl lineinfo_lines +lineinfo_lines: + .globl lineinfo_file_offsets +lineinfo_file_offsets: + .globl lineinfo_filenames_size + .balign 4 +lineinfo_filenames_size: + .long 0 + .globl lineinfo_filenames +lineinfo_filenames: diff --git a/scripts/gen_lineinfo.c a/scripts/gen_lineinfo.c new file mode 100644 --- /dev/null +++ a/scripts/gen_lineinfo.c @@ -0,0 +1,510 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * gen_lineinfo.c - Generate address-to-source-line lookup tables from DWARF + * + * Copyright (C) 2026 Sasha Levin + * + * Reads DWARF .debug_line from a vmlinux ELF file and outputs an assembly + * file containing sorted lookup tables that the kernel uses to annotate + * stack traces with source file:line information. + * + * Requires libdw from elfutils. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +static unsigned int skipped_overflow; + +struct line_entry { + unsigned int offset; /* offset from _text */ + unsigned int file_id; + unsigned int line; +}; + +struct file_entry { + char *name; + unsigned int id; + unsigned int str_offset; +}; + +static struct line_entry *entries; +static unsigned int num_entries; +static unsigned int entries_capacity; + +static struct file_entry *files; +static unsigned int num_files; +static unsigned int files_capacity; + +#define FILE_HASH_BITS 13 +#define FILE_HASH_SIZE (1 << FILE_HASH_BITS) + +struct file_hash_entry { + const char *name; + unsigned int id; +}; + +static struct file_hash_entry file_hash[FILE_HASH_SIZE]; + +static unsigned int hash_str(const char *s) +{ + unsigned int h = 5381; + + for (; *s; s++) + h = h * 33 + (unsigned char)*s; + return h & (FILE_HASH_SIZE - 1); +} + +static void add_entry(unsigned int offset, unsigned int file_id, + unsigned int line) +{ + if (num_entries >= entries_capacity) { + entries_capacity = entries_capacity ? entries_capacity * 2 : 65536; + entries = realloc(entries, entries_capacity * sizeof(*entries)); + if (!entries) { + fprintf(stderr, "out of memory\n"); + exit(1); + } + } + entries[num_entries].offset = offset; + entries[num_entries].file_id = file_id; + entries[num_entries].line = line; + num_entries++; +} + +static unsigned int find_or_add_file(const char *name) +{ + unsigned int h = hash_str(name); + + /* Open-addressing lookup with linear probing */ + while (file_hash[h].name) { + if (!strcmp(file_hash[h].name, name)) + return file_hash[h].id; + h = (h + 1) & (FILE_HASH_SIZE - 1); + } + + if (num_files >= 65535) { + fprintf(stderr, + "gen_lineinfo: too many source files (%u > 65535)\n", + num_files); + exit(1); + } + + if (num_files >= files_capacity) { + files_capacity = files_capacity ? files_capacity * 2 : 4096; + files = realloc(files, files_capacity * sizeof(*files)); + if (!files) { + fprintf(stderr, "out of memory\n"); + exit(1); + } + } + files[num_files].name = strdup(name); + files[num_files].id = num_files; + + /* Insert into hash table (points to files[] entry) */ + file_hash[h].name = files[num_files].name; + file_hash[h].id = num_files; + + num_files++; + return num_files - 1; +} + +/* + * Well-known top-level directories in the kernel source tree. + * Used as a fallback to recover relative paths from absolute DWARF paths + * when comp_dir doesn't match (e.g. O= out-of-tree builds where comp_dir + * is the build directory but source paths point into the source tree). + */ +static const char * const kernel_dirs[] = { + "arch/", "block/", "certs/", "crypto/", "drivers/", "fs/", + "include/", "init/", "io_uring/", "ipc/", "kernel/", "lib/", + "mm/", "net/", "rust/", "samples/", "scripts/", "security/", + "sound/", "tools/", "usr/", "virt/", +}; + +/* + * Strip a filename to a kernel-relative path. + * + * For absolute paths, strip the comp_dir prefix (from DWARF) to get + * a kernel-tree-relative path. When that fails (e.g. O= builds where + * comp_dir is the build directory), scan for a well-known kernel + * top-level directory name in the path to recover the relative path. + * Fall back to the basename as a last resort. + * + * For relative paths (common in modules), libdw may produce a bogus + * doubled path like "net/foo/bar.c/net/foo/bar.c" due to ET_REL DWARF + * quirks. Detect and strip such duplicates. + */ +static const char *make_relative(const char *path, const char *comp_dir) +{ + const char *p; + + /* If already relative, use as-is */ + if (path[0] != '/') + return path; + + /* comp_dir from DWARF is the most reliable method */ + if (comp_dir) { + size_t len = strlen(comp_dir); + + if (!strncmp(path, comp_dir, len) && path[len] == '/') { + const char *rel = path + len + 1; + + /* + * If comp_dir pointed to a subdirectory + * (e.g. arch/parisc/kernel) rather than + * the tree root, stripping it leaves a + * bare filename. Fall through to the + * kernel_dirs scan so we recover the full + * relative path instead. + */ + if (strchr(rel, '/')) + return rel; + } + + /* + * comp_dir prefix didn't help — either it didn't match + * or it was too specific and left a bare filename. + * Scan for a known kernel top-level directory component + * to find where the relative path starts. This handles + * O= builds and arches where comp_dir is a subdirectory. + */ + for (p = path + 1; *p; p++) { + if (*(p - 1) == '/') { + for (unsigned int i = 0; i < sizeof(kernel_dirs) / + sizeof(kernel_dirs[0]); i++) { + if (!strncmp(p, kernel_dirs[i], + strlen(kernel_dirs[i]))) + return p; + } + } + } + + /* Fall back to basename */ + p = strrchr(path, '/'); + return p ? p + 1 : path; + } + + /* Fall back to basename */ + p = strrchr(path, '/'); + return p ? p + 1 : path; +} + +static int compare_entries(const void *a, const void *b) +{ + const struct line_entry *ea = a; + const struct line_entry *eb = b; + + if (ea->offset != eb->offset) + return ea->offset < eb->offset ? -1 : 1; + if (ea->file_id != eb->file_id) + return ea->file_id < eb->file_id ? -1 : 1; + if (ea->line != eb->line) + return ea->line < eb->line ? -1 : 1; + return 0; +} + +static unsigned long long find_text_addr(Elf *elf) +{ + size_t nsyms, i; + Elf_Scn *scn = NULL; + GElf_Shdr shdr; + + while ((scn = elf_nextscn(elf, scn)) != NULL) { + Elf_Data *data; + + if (!gelf_getshdr(scn, &shdr)) + continue; + if (shdr.sh_type != SHT_SYMTAB) + continue; + + data = elf_getdata(scn, NULL); + if (!data) + continue; + + nsyms = shdr.sh_size / shdr.sh_entsize; + for (i = 0; i < nsyms; i++) { + GElf_Sym sym; + const char *name; + + if (!gelf_getsym(data, i, &sym)) + continue; + name = elf_strptr(elf, shdr.sh_link, sym.st_name); + if (name && !strcmp(name, "_text")) + return sym.st_value; + } + } + + fprintf(stderr, "Cannot find _text symbol\n"); + exit(1); +} + +static void process_dwarf(Dwarf *dwarf, unsigned long long text_addr) +{ + Dwarf_Off off = 0, next_off; + size_t hdr_size; + + while (dwarf_nextcu(dwarf, off, &next_off, &hdr_size, + NULL, NULL, NULL) == 0) { + Dwarf_Die cudie; + Dwarf_Lines *lines; + size_t nlines; + Dwarf_Attribute attr; + const char *comp_dir = NULL; + + if (!dwarf_offdie(dwarf, off + hdr_size, &cudie)) + goto next; + + if (dwarf_attr(&cudie, DW_AT_comp_dir, &attr)) + comp_dir = dwarf_formstring(&attr); + + if (dwarf_getsrclines(&cudie, &lines, &nlines) != 0) + goto next; + + for (size_t i = 0; i < nlines; i++) { + Dwarf_Line *line = dwarf_onesrcline(lines, i); + Dwarf_Addr addr; + const char *src; + const char *rel; + unsigned int file_id, loffset; + int lineno; + + if (!line) + continue; + + if (dwarf_lineaddr(line, &addr) != 0) + continue; + if (dwarf_lineno(line, &lineno) != 0) + continue; + if (lineno == 0) + continue; + + src = dwarf_linesrc(line, NULL, NULL); + if (!src) + continue; + + if (addr < text_addr) + continue; + + { + unsigned long long raw_offset = addr - text_addr; + + if (raw_offset > UINT_MAX) { + skipped_overflow++; + continue; + } + loffset = (unsigned int)raw_offset; + } + + rel = make_relative(src, comp_dir); + file_id = find_or_add_file(rel); + + add_entry(loffset, file_id, (unsigned int)lineno); + } +next: + off = next_off; + } +} + +static void deduplicate(void) +{ + unsigned int i, j; + + if (num_entries < 2) + return; + + /* Sort by offset, then file_id, then line for stability */ + qsort(entries, num_entries, sizeof(*entries), compare_entries); + + /* + * Remove duplicate entries: + * - Same offset: keep first (deterministic from stable sort keys) + * - Same file:line as previous kept entry: redundant for binary + * search -- any address between them resolves to the earlier one + */ + j = 0; + for (i = 1; i < num_entries; i++) { + if (entries[i].offset == entries[j].offset) + continue; + if (entries[i].file_id == entries[j].file_id && + entries[i].line == entries[j].line) + continue; + j++; + if (j != i) + entries[j] = entries[i]; + } + num_entries = j + 1; +} + +static void compute_file_offsets(void) +{ + unsigned int offset = 0; + + for (unsigned int i = 0; i < num_files; i++) { + files[i].str_offset = offset; + offset += strlen(files[i].name) + 1; + } +} + +static void print_escaped_asciz(const char *s) +{ + printf("\t.asciz \""); + for (; *s; s++) { + if (*s == '"' || *s == '\\') + putchar('\\'); + putchar(*s); + } + printf("\"\n"); +} + +static void output_assembly(void) +{ + printf("/* SPDX-License-Identifier: GPL-2.0 */\n"); + printf("/*\n"); + printf(" * Automatically generated by scripts/gen_lineinfo\n"); + printf(" * Do not edit.\n"); + printf(" */\n\n"); + + printf("\t.section .rodata, \"a\"\n\n"); + + /* Number of entries */ + printf("\t.globl lineinfo_num_entries\n"); + printf("\t.balign 4\n"); + printf("lineinfo_num_entries:\n"); + printf("\t.long %u\n\n", num_entries); + + /* Number of files */ + printf("\t.globl lineinfo_num_files\n"); + printf("\t.balign 4\n"); + printf("lineinfo_num_files:\n"); + printf("\t.long %u\n\n", num_files); + + /* Sorted address offsets from _text */ + printf("\t.globl lineinfo_addrs\n"); + printf("\t.balign 4\n"); + printf("lineinfo_addrs:\n"); + for (unsigned int i = 0; i < num_entries; i++) + printf("\t.long 0x%x\n", entries[i].offset); + printf("\n"); + + /* File IDs, parallel to addrs (u16 -- supports up to 65535 files) */ + printf("\t.globl lineinfo_file_ids\n"); + printf("\t.balign 2\n"); + printf("lineinfo_file_ids:\n"); + for (unsigned int i = 0; i < num_entries; i++) + printf("\t.short %u\n", entries[i].file_id); + printf("\n"); + + /* Line numbers, parallel to addrs */ + printf("\t.globl lineinfo_lines\n"); + printf("\t.balign 4\n"); + printf("lineinfo_lines:\n"); + for (unsigned int i = 0; i < num_entries; i++) + printf("\t.long %u\n", entries[i].line); + printf("\n"); + + /* File string offset table */ + printf("\t.globl lineinfo_file_offsets\n"); + printf("\t.balign 4\n"); + printf("lineinfo_file_offsets:\n"); + for (unsigned int i = 0; i < num_files; i++) + printf("\t.long %u\n", files[i].str_offset); + printf("\n"); + + /* Filenames size */ + { + unsigned int fsize = 0; + + for (unsigned int i = 0; i < num_files; i++) + fsize += strlen(files[i].name) + 1; + printf("\t.globl lineinfo_filenames_size\n"); + printf("\t.balign 4\n"); + printf("lineinfo_filenames_size:\n"); + printf("\t.long %u\n\n", fsize); + } + + /* Concatenated NUL-terminated filenames */ + printf("\t.globl lineinfo_filenames\n"); + printf("lineinfo_filenames:\n"); + for (unsigned int i = 0; i < num_files; i++) + print_escaped_asciz(files[i].name); + printf("\n"); +} + +int main(int argc, char *argv[]) +{ + int fd; + Elf *elf; + Dwarf *dwarf; + unsigned long long text_addr; + + if (argc != 2) { + fprintf(stderr, "Usage: %s \n", argv[0]); + return 1; + } + + fd = open(argv[1], O_RDONLY); + if (fd < 0) { + fprintf(stderr, "Cannot open %s: %s\n", argv[1], + strerror(errno)); + return 1; + } + + elf_version(EV_CURRENT); + elf = elf_begin(fd, ELF_C_READ, NULL); + if (!elf) { + fprintf(stderr, "elf_begin failed: %s\n", + elf_errmsg(elf_errno())); + close(fd); + return 1; + } + + text_addr = find_text_addr(elf); + + dwarf = dwarf_begin_elf(elf, DWARF_C_READ, NULL); + if (!dwarf) { + fprintf(stderr, "dwarf_begin_elf failed: %s\n", + dwarf_errmsg(dwarf_errno())); + fprintf(stderr, "Is %s built with CONFIG_DEBUG_INFO?\n", + argv[1]); + elf_end(elf); + close(fd); + return 1; + } + + process_dwarf(dwarf, text_addr); + + if (skipped_overflow) + fprintf(stderr, + "lineinfo: warning: %u entries skipped (offset > 4 GiB from _text)\n", + skipped_overflow); + + deduplicate(); + compute_file_offsets(); + + fprintf(stderr, "lineinfo: %u entries, %u files\n", + num_entries, num_files); + + output_assembly(); + + dwarf_end(dwarf); + elf_end(elf); + close(fd); + + /* Cleanup */ + free(entries); + for (unsigned int i = 0; i < num_files; i++) + free(files[i].name); + free(files); + + return 0; +} --- a/scripts/.gitignore~kallsyms-embed-source-file-line-info-in-kernel-stack-traces +++ a/scripts/.gitignore @@ -1,5 +1,6 @@ # SPDX-License-Identifier: GPL-2.0-only /asn1_compiler +/gen_lineinfo /gen_packed_field_checks /generate_rust_target /insert-sys-cert --- a/scripts/kallsyms.c~kallsyms-embed-source-file-line-info-in-kernel-stack-traces +++ a/scripts/kallsyms.c @@ -78,6 +78,17 @@ static char *sym_name(const struct sym_e static bool is_ignored_symbol(const char *name, char type) { + /* Ignore lineinfo symbols for kallsyms pass stability */ + static const char * const lineinfo_syms[] = { + "lineinfo_addrs", + "lineinfo_file_ids", + "lineinfo_file_offsets", + "lineinfo_filenames", + "lineinfo_lines", + "lineinfo_num_entries", + "lineinfo_num_files", + }; + if (type == 'u' || type == 'n') return true; @@ -90,6 +101,11 @@ static bool is_ignored_symbol(const char return true; } + for (size_t i = 0; i < ARRAY_SIZE(lineinfo_syms); i++) { + if (!strcmp(name, lineinfo_syms[i])) + return true; + } + return false; } --- a/scripts/link-vmlinux.sh~kallsyms-embed-source-file-line-info-in-kernel-stack-traces +++ a/scripts/link-vmlinux.sh @@ -103,7 +103,7 @@ vmlinux_link() ${ld} ${ldflags} -o ${output} \ ${wl}--whole-archive ${objs} ${wl}--no-whole-archive \ ${wl}--start-group ${libs} ${wl}--end-group \ - ${kallsymso} ${btf_vmlinux_bin_o} ${arch_vmlinux_o} ${ldlibs} + ${kallsymso} ${lineinfo_o} ${btf_vmlinux_bin_o} ${arch_vmlinux_o} ${ldlibs} } # Create ${2}.o file with all symbols from the ${1} object file @@ -129,6 +129,26 @@ kallsyms() kallsymso=${2}.o } +# Generate lineinfo tables from DWARF debug info in a temporary vmlinux. +# ${1} - temporary vmlinux with debug info +# Output: sets lineinfo_o to the generated .o file +gen_lineinfo() +{ + info LINEINFO .tmp_lineinfo.S + if ! scripts/gen_lineinfo "${1}" > .tmp_lineinfo.S; then + echo >&2 "Failed to generate lineinfo from ${1}" + echo >&2 "Try to disable CONFIG_KALLSYMS_LINEINFO" + exit 1 + fi + + info AS .tmp_lineinfo.o + ${CC} ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS} \ + ${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL} \ + -c -o .tmp_lineinfo.o .tmp_lineinfo.S + + lineinfo_o=.tmp_lineinfo.o +} + # Perform kallsyms for the given temporary vmlinux. sysmap_and_kallsyms() { @@ -155,6 +175,7 @@ sorttable() cleanup() { rm -f .btf.* + rm -f .tmp_lineinfo.* rm -f .tmp_vmlinux.nm-sort rm -f System.map rm -f vmlinux @@ -183,6 +204,7 @@ fi btf_vmlinux_bin_o= btfids_vmlinux= kallsymso= +lineinfo_o= strip_debug= generate_map= @@ -198,10 +220,21 @@ if is_enabled CONFIG_KALLSYMS; then kallsyms .tmp_vmlinux0.syms .tmp_vmlinux0.kallsyms fi +if is_enabled CONFIG_KALLSYMS_LINEINFO; then + # Assemble an empty lineinfo stub for the initial link. + # The real lineinfo is generated from .tmp_vmlinux1 by gen_lineinfo. + ${CC} ${NOSTDINC_FLAGS} ${LINUXINCLUDE} ${KBUILD_CPPFLAGS} \ + ${KBUILD_AFLAGS} ${KBUILD_AFLAGS_KERNEL} \ + -c -o .tmp_lineinfo.o "${srctree}/scripts/empty_lineinfo.S" + lineinfo_o=.tmp_lineinfo.o +fi + if is_enabled CONFIG_KALLSYMS || is_enabled CONFIG_DEBUG_INFO_BTF; then - # The kallsyms linking does not need debug symbols, but the BTF does. - if ! is_enabled CONFIG_DEBUG_INFO_BTF; then + # The kallsyms linking does not need debug symbols, but BTF and + # lineinfo generation do. + if ! is_enabled CONFIG_DEBUG_INFO_BTF && + ! is_enabled CONFIG_KALLSYMS_LINEINFO; then strip_debug=1 fi @@ -219,6 +252,10 @@ if is_enabled CONFIG_DEBUG_INFO_BTF; the btfids_vmlinux=.tmp_vmlinux1.BTF_ids fi +if is_enabled CONFIG_KALLSYMS_LINEINFO; then + gen_lineinfo .tmp_vmlinux1 +fi + if is_enabled CONFIG_KALLSYMS; then # kallsyms support --- a/scripts/Makefile~kallsyms-embed-source-file-line-info-in-kernel-stack-traces +++ a/scripts/Makefile @@ -4,6 +4,7 @@ # the kernel for the build process. hostprogs-always-$(CONFIG_KALLSYMS) += kallsyms +hostprogs-always-$(CONFIG_KALLSYMS_LINEINFO) += gen_lineinfo hostprogs-always-$(BUILD_C_RECORDMCOUNT) += recordmcount hostprogs-always-$(CONFIG_BUILDTIME_TABLE_SORT) += sorttable hostprogs-always-$(CONFIG_ASN1) += asn1_compiler @@ -37,6 +38,8 @@ HOSTCFLAGS_asn1_compiler.o = -I$(srctree HOSTCFLAGS_sign-file.o = $(shell $(HOSTPKG_CONFIG) --cflags libcrypto 2> /dev/null) HOSTCFLAGS_sign-file.o += -I$(srctree)/tools/include/uapi/ HOSTLDLIBS_sign-file = $(shell $(HOSTPKG_CONFIG) --libs libcrypto 2> /dev/null || echo -lcrypto) +HOSTCFLAGS_gen_lineinfo.o = $(shell $(HOSTPKG_CONFIG) --cflags libdw 2> /dev/null) +HOSTLDLIBS_gen_lineinfo = $(shell $(HOSTPKG_CONFIG) --libs libdw 2> /dev/null || echo -ldw -lelf -lz) ifdef CONFIG_UNWINDER_ORC ifeq ($(ARCH),x86_64) _ Patches currently in -mm which might be from sashal@kernel.org are get_maintainer-add-json-output-mode.patch kallsyms-extend-lineinfo-to-loadable-modules.patch kallsyms-delta-compress-lineinfo-tables-for-27x-size-reduction.patch kallsyms-add-kunit-tests-for-lineinfo-feature.patch