From: Sasha Levin <sashal@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>,
	Masahiro Yamada <masahiroy@kernel.org>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Richard Weinberger <richard@nod.at>,
	Juergen Gross <jgross@suse.com>,
	Geert Uytterhoeven <geert@linux-m68k.org>,
	James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jonathan Corbet <corbet@lwn.net>,
	Nathan Chancellor <nathan@kernel.org>,
	Nicolas Schier <nsc@kernel.org>, Petr Pavlu <petr.pavlu@suse.com>,
	Daniel Gomez <da.gomez@kernel.org>,
	Greg KH <gregkh@linuxfoundation.org>,
	Petr Mladek <pmladek@suse.com>,
	Steven Rostedt <rostedt@goodmis.org>, Kees Cook <kees@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thorsten Leemhuis <linux@leemhuis.info>,
	Vlastimil Babka <vbabka@kernel.org>,
	linux-kernel@vger.kernel.org, linux-kbuild@vger.kernel.org,
	linux-modules@vger.kernel.org, linux-doc@vger.kernel.org,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 3/3] kallsyms: delta-compress lineinfo tables for ~2.7x size reduction
Date: Tue,  3 Mar 2026 13:21:03 -0500
Message-ID: <20260303182103.3523438-4-sashal@kernel.org>
In-Reply-To: <20260303182103.3523438-1-sashal@kernel.org>

Replace the flat uncompressed parallel arrays (lineinfo_addrs[],
lineinfo_file_ids[], lineinfo_lines[]) with a block-indexed,
delta-encoded, ULEB128 varint compressed format.

The sorted address array has small deltas between consecutive entries
(typically 1-50 bytes), file IDs have high locality (delta often 0,
same file), and line numbers change slowly.  Delta-encoding followed
by ULEB128 varint compression shrinks most values from 2 or 4 bytes
to 1; the signed file-ID and line deltas are zigzag-encoded first so
that small negative deltas also fit in one byte.
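
As a rough illustration of the encoding, here is a standalone userspace
sketch (not the kernel code) that delta-encodes a short run of line
numbers.  The names zz_enc(), zz_dec(), uleb_write(), uleb_read(),
encode_lines() and decode_lines() are made up for this example; the
patch's own helpers live in include/linux/mod_lineinfo.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Zigzag: map signed to unsigned so small magnitudes stay small. */
static uint32_t zz_enc(int32_t v)
{
	return ((uint32_t)v << 1) ^ (v < 0 ? 0xffffffffu : 0u);
}

static int32_t zz_dec(uint32_t v)
{
	return (int32_t)((v >> 1) ^ -(v & 1));
}

/* Write one ULEB128 value; returns bytes written (1..5 for 32 bits). */
static size_t uleb_write(uint8_t *buf, uint32_t value)
{
	size_t len = 0;

	do {
		uint8_t byte = value & 0x7f;

		value >>= 7;
		if (value)
			byte |= 0x80;	/* more bytes follow */
		buf[len++] = byte;
	} while (value);
	return len;
}

/* Read one ULEB128 value from buf[*pos..end); stops early at end. */
static uint32_t uleb_read(const uint8_t *buf, size_t *pos, size_t end)
{
	uint32_t result = 0;
	unsigned int shift = 0;

	while (*pos < end && shift < 32) {
		uint8_t byte = buf[(*pos)++];

		result |= (uint32_t)(byte & 0x7f) << shift;
		if (!(byte & 0x80))
			break;
		shift += 7;
	}
	return result;
}

/* First value stored as-is, the rest as zigzag-encoded deltas. */
static size_t encode_lines(const uint32_t *lines, size_t n, uint8_t *out)
{
	size_t size = 0;

	assert(n > 0);
	size += uleb_write(out + size, lines[0]);
	for (size_t i = 1; i < n; i++) {
		int32_t delta = (int32_t)(lines[i] - lines[i - 1]);

		size += uleb_write(out + size, zz_enc(delta));
	}
	return size;
}

/* Inverse of encode_lines(): rebuild n values from the byte stream. */
static void decode_lines(const uint8_t *in, size_t size,
			 uint32_t *lines, size_t n)
{
	size_t pos = 0;

	assert(n > 0);
	lines[0] = uleb_read(in, &pos, size);
	for (size_t i = 1; i < n; i++)
		lines[i] = lines[i - 1] +
			   (uint32_t)zz_dec(uleb_read(in, &pos, size));
}
```

With the four line numbers 767, 768, 766, 770, the first value takes
two bytes and each delta (+1, -2, +4) takes one, so the run occupies
5 bytes instead of 16 as flat u32 values.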

Entries are grouped into blocks of 64.  A small uncompressed block
index (first addr + byte offset per block) enables O(log(N/64)) binary
search, followed by sequential decode of at most 64 varints within the
matching block.  All decode state lives on the stack -- zero
allocations, still safe for NMI/panic context.
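
The block selection step can be sketched in isolation as follows.
This is a simplified userspace sketch under the assumption that
block_addrs[] is sorted ascending; find_block() and entries_in_block()
are hypothetical names, not functions added by this patch (the patch
open-codes the same logic in kallsyms_lookup_lineinfo() and
module_lookup_lineinfo()).

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_ENTRIES 64u

/*
 * Upper-bound binary search: return the index of the last block whose
 * first address is <= offset, or -1 if offset precedes every block.
 */
static int find_block(const uint32_t *block_addrs, uint32_t num_blocks,
		      uint32_t offset)
{
	uint32_t low = 0, high = num_blocks;

	while (low < high) {
		uint32_t mid = low + (high - low) / 2;

		if (block_addrs[mid] <= offset)
			low = mid + 1;
		else
			high = mid;
	}
	return low ? (int)(low - 1) : -1;
}

/* Every block holds BLOCK_ENTRIES entries; only the last may be short. */
static uint32_t entries_in_block(uint32_t block, uint32_t num_blocks,
				 uint32_t num_entries)
{
	uint32_t remaining;

	assert(block < num_blocks);
	if (block != num_blocks - 1)
		return BLOCK_ENTRIES;
	remaining = num_entries - block * BLOCK_ENTRIES;
	return remaining < BLOCK_ENTRIES ? remaining : BLOCK_ENTRIES;
}
```

After the block is chosen, the decoder walks at most
entries_in_block() entries, which bounds the sequential work per
lookup regardless of table size.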

Measured on a defconfig+debug x86_64 build (3,017,154 entries, 4,822
source files, 47,144 blocks):

  Before (flat arrays):
    lineinfo_addrs[]    12,068,616 bytes (u32 x 3.0M)
    lineinfo_file_ids[]  6,034,308 bytes (u16 x 3.0M)
    lineinfo_lines[]    12,068,616 bytes (u32 x 3.0M)
    Total:              30,171,540 bytes (28.8 MiB, 10.0 bytes/entry)

  After (block-indexed delta + ULEB128):
    lineinfo_block_addrs[]    188,576 bytes (184 KiB)
    lineinfo_block_offsets[]  188,576 bytes (184 KiB)
    lineinfo_data[]        10,926,128 bytes (10.4 MiB)
    Total:                 11,303,280 bytes (10.8 MiB, 3.7 bytes/entry)

  Savings: 18.0 MiB (2.7x reduction)
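
The totals above can be re-derived with a few lines of arithmetic.
The helpers below are illustrative only (flat_bytes(), block_count()
and compressed_bytes() are not part of the patch); the size of
lineinfo_data[] is taken as an input because it depends on the actual
deltas in the build.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_ENTRIES 64u

/* Flat layout: u32 addr + u16 file_id + u32 line per entry. */
static uint64_t flat_bytes(uint64_t entries)
{
	return entries * (4 + 2 + 4);
}

/* Number of blocks, rounding a partial last block up. */
static uint64_t block_count(uint64_t entries)
{
	/* Guard the round-up below against wrap-around. */
	assert(entries <= UINT64_MAX - (BLOCK_ENTRIES - 1));
	return (entries + BLOCK_ENTRIES - 1) / BLOCK_ENTRIES;
}

/* Compressed layout: two u32 index arrays plus the byte stream. */
static uint64_t compressed_bytes(uint64_t entries, uint64_t data_bytes)
{
	return block_count(entries) * (4 + 4) + data_bytes;
}
```

For 3,017,154 entries this gives 30,171,540 bytes flat, 47,144 blocks,
and 11,303,280 bytes compressed when the data stream is 10,926,128
bytes, matching the figures in the table.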

Booted in QEMU and verified with SysRq-l that annotations still work:

  default_idle+0x9/0x10 (arch/x86/kernel/process.c:767)
  default_idle_call+0x6c/0xb0 (kernel/sched/idle.c:122)
  do_idle+0x335/0x490 (kernel/sched/idle.c:191)
  cpu_startup_entry+0x4e/0x60 (kernel/sched/idle.c:429)
  rest_init+0x1aa/0x1b0 (init/main.c:760)

Suggested-by: Juergen Gross <jgross@suse.com>
Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../admin-guide/kallsyms-lineinfo.rst         |   7 +-
 include/linux/mod_lineinfo.h                  | 103 ++++++++--
 init/Kconfig                                  |   8 +-
 kernel/kallsyms.c                             |  91 +++++++--
 kernel/kallsyms_internal.h                    |   7 +-
 kernel/module/kallsyms.c                      | 107 +++++++---
 scripts/gen_lineinfo.c                        | 192 ++++++++++++++----
 scripts/kallsyms.c                            |   7 +-
 scripts/link-vmlinux.sh                       |  16 +-
 9 files changed, 423 insertions(+), 115 deletions(-)

diff --git a/Documentation/admin-guide/kallsyms-lineinfo.rst b/Documentation/admin-guide/kallsyms-lineinfo.rst
index 21450569d5324..fe92c5dde16b3 100644
--- a/Documentation/admin-guide/kallsyms-lineinfo.rst
+++ b/Documentation/admin-guide/kallsyms-lineinfo.rst
@@ -76,10 +76,11 @@ Memory Overhead
 ===============
 
 The vmlinux lineinfo tables are stored in ``.rodata`` and typically add
-approximately 44 MiB to the kernel image for a standard configuration
-(~4.6 million DWARF line entries, ~10 bytes per entry after deduplication).
+approximately 10-15 MiB to the kernel image for a standard configuration
+(~4.6 million DWARF line entries, ~2-3 bytes per entry after delta
+compression).
 
-Per-module lineinfo adds approximately 10 bytes per DWARF line entry to each
+Per-module lineinfo adds approximately 2-3 bytes per DWARF line entry to each
 ``.ko`` file.
 
 Known Limitations
diff --git a/include/linux/mod_lineinfo.h b/include/linux/mod_lineinfo.h
index d62e9608f0f82..ab758acfadceb 100644
--- a/include/linux/mod_lineinfo.h
+++ b/include/linux/mod_lineinfo.h
@@ -8,13 +8,19 @@
  *
  * Section layout (all values in target-native endianness):
  *
- *   struct mod_lineinfo_header     (16 bytes)
- *   u32 addrs[num_entries]         -- offsets from .text base, sorted
- *   u16 file_ids[num_entries]      -- parallel to addrs
- *   <2-byte pad if num_entries is odd>
- *   u32 lines[num_entries]         -- parallel to addrs
+ *   struct mod_lineinfo_header     (24 bytes)
+ *   u32 block_addrs[num_blocks]    -- first addr per block, for binary search
+ *   u32 block_offsets[num_blocks]  -- byte offset into compressed data stream
+ *   u8  data[data_size]            -- ULEB128 delta-compressed entries
  *   u32 file_offsets[num_files]    -- byte offset into filenames[]
  *   char filenames[filenames_size] -- concatenated NUL-terminated strings
+ *
+ * Compressed stream format (per block of LINEINFO_BLOCK_ENTRIES entries):
+ *   Entry 0: file_id (ULEB128), line (ULEB128)
+ *            addr is in block_addrs[]
+ *   Entry 1..N: addr_delta (ULEB128),
+ *               file_id_delta (zigzag-encoded ULEB128),
+ *               line_delta (zigzag-encoded ULEB128)
  */
 #ifndef _LINUX_MOD_LINEINFO_H
 #define _LINUX_MOD_LINEINFO_H
@@ -25,44 +31,107 @@
 #include <stdint.h>
 typedef uint32_t u32;
 typedef uint16_t u16;
+typedef uint8_t  u8;
 #endif
 
+#define LINEINFO_BLOCK_ENTRIES 64
+
 struct mod_lineinfo_header {
 	u32 num_entries;
 	u32 num_files;
 	u32 filenames_size;	/* total bytes of concatenated filenames */
+	u32 num_blocks;
+	u32 data_size;		/* total bytes of compressed data stream */
 	u32 reserved;		/* padding, must be 0 */
 };
 
 /* Offset helpers: compute byte offset from start of section to each array */
 
-static inline u32 mod_lineinfo_addrs_off(void)
+static inline u32 mod_lineinfo_block_addrs_off(void)
 {
 	return sizeof(struct mod_lineinfo_header);
 }
 
-static inline u32 mod_lineinfo_file_ids_off(u32 num_entries)
+static inline u32 mod_lineinfo_block_offsets_off(u32 num_blocks)
 {
-	return mod_lineinfo_addrs_off() + num_entries * sizeof(u32);
+	return mod_lineinfo_block_addrs_off() + num_blocks * sizeof(u32);
 }
 
-static inline u32 mod_lineinfo_lines_off(u32 num_entries)
+static inline u32 mod_lineinfo_data_off(u32 num_blocks)
 {
-	/* u16 file_ids[] may need 2-byte padding to align lines[] to 4 bytes */
-	u32 off = mod_lineinfo_file_ids_off(num_entries) +
-		  num_entries * sizeof(u16);
-	return (off + 3) & ~3u;
+	return mod_lineinfo_block_offsets_off(num_blocks) +
+	       num_blocks * sizeof(u32);
 }
 
-static inline u32 mod_lineinfo_file_offsets_off(u32 num_entries)
+static inline u32 mod_lineinfo_file_offsets_off(u32 num_blocks, u32 data_size)
 {
-	return mod_lineinfo_lines_off(num_entries) + num_entries * sizeof(u32);
+	return mod_lineinfo_data_off(num_blocks) + data_size;
 }
 
-static inline u32 mod_lineinfo_filenames_off(u32 num_entries, u32 num_files)
+static inline u32 mod_lineinfo_filenames_off(u32 num_blocks, u32 data_size,
+					     u32 num_files)
 {
-	return mod_lineinfo_file_offsets_off(num_entries) +
+	return mod_lineinfo_file_offsets_off(num_blocks, data_size) +
 	       num_files * sizeof(u32);
 }
 
+/* Zigzag encoding: map signed to unsigned so small magnitudes are small */
+static inline u32 zigzag_encode(int32_t v)
+{
+	return ((u32)v << 1) ^ (u32)(v >> 31);
+}
+
+static inline int32_t zigzag_decode(u32 v)
+{
+	return (int32_t)((v >> 1) ^ -(v & 1));
+}
+
+/*
+ * Read a ULEB128 varint from a byte stream.
+ * Returns the decoded value and advances *pos past the encoded bytes.
+ * Never reads at or past 'end': a truncated value yields the bits decoded
+ * so far (safe for NMI/panic context -- no crash, just a missed annotation).
+ */
+static inline u32 lineinfo_read_uleb128(const u8 *data, u32 *pos, u32 end)
+{
+	u32 result = 0;
+	unsigned int shift = 0;
+
+	while (*pos < end) {
+		u8 byte = data[*pos];
+		(*pos)++;
+		result |= (u32)(byte & 0x7f) << shift;
+		if (!(byte & 0x80))
+			return result;
+		shift += 7;
+		if (shift >= 32) {
+			/* Malformed -- skip remaining continuation bytes */
+			while (*pos < end && (data[*pos] & 0x80))
+				(*pos)++;
+			if (*pos < end)
+				(*pos)++;
+			return result;
+		}
+	}
+	return result;
+}
+
+/* Write a ULEB128 varint -- build tool only */
+#ifndef __KERNEL__
+static inline unsigned int lineinfo_write_uleb128(u8 *buf, u32 value)
+{
+	unsigned int len = 0;
+
+	do {
+		u8 byte = value & 0x7f;
+
+		value >>= 7;
+		if (value)
+			byte |= 0x80;
+		buf[len++] = byte;
+	} while (value);
+	return len;
+}
+#endif /* !__KERNEL__ */
+
 #endif /* _LINUX_MOD_LINEINFO_H */
diff --git a/init/Kconfig b/init/Kconfig
index bf53275bc405a..6e3795b3dbd62 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2065,8 +2065,9 @@ config KALLSYMS_LINEINFO
 	    anon_vma_clone+0x2ed/0xcf0 (mm/rmap.c:412)
 
 	  This requires elfutils (libdw-dev/elfutils-devel) on the build host.
-	  Adds approximately 44MB to a typical kernel image (10 bytes per
-	  DWARF line-table entry, ~4.6M entries for a typical config).
+	  Adds approximately 10-15MB to a typical kernel image (~2-3 bytes
+	  per entry after delta compression, ~4.6M entries for a typical
+	  config).
 
 	  If unsure, say N.
 
@@ -2079,7 +2080,8 @@ config KALLSYMS_LINEINFO_MODULES
 	  so stack traces from module code include (file.c:123) annotations.
 
 	  Requires elfutils (libdw-dev/elfutils-devel) on the build host.
-	  Increases .ko sizes by approximately 10 bytes per DWARF line entry.
+	  Increases .ko sizes by approximately 2-3 bytes per DWARF line
+	  entry after delta compression.
 
 	  If unsure, say N.
 
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index cea74992e5427..de4aa8fcfd69d 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -468,14 +468,20 @@ static int append_buildid(char *buffer,   const char *modname,
 #endif /* CONFIG_STACKTRACE_BUILD_ID */
 
 #ifdef CONFIG_KALLSYMS_LINEINFO
+#include <linux/mod_lineinfo.h>
+
 bool kallsyms_lookup_lineinfo(unsigned long addr, unsigned long sym_start,
 			      const char **file, unsigned int *line)
 {
 	unsigned long long raw_offset;
-	unsigned int offset, low, high, mid, file_id;
-	unsigned long line_addr;
-
-	if (!lineinfo_num_entries)
+	unsigned int offset, low, high, mid, block;
+	unsigned int cur_addr, cur_file_id, cur_line;
+	unsigned int best_file_id = 0, best_line = 0;
+	unsigned int block_entries, data_end;
+	bool found = false;
+	u32 pos;
+
+	if (!lineinfo_num_entries || !lineinfo_num_blocks)
 		return false;
 
 	/* Compute offset from _text */
@@ -487,12 +493,12 @@ bool kallsyms_lookup_lineinfo(unsigned long addr, unsigned long sym_start,
 		return false;
 	offset = (unsigned int)raw_offset;
 
-	/* Binary search for largest entry <= offset */
+	/* Binary search on block_addrs[] to find the right block */
 	low = 0;
-	high = lineinfo_num_entries;
+	high = lineinfo_num_blocks;
 	while (low < high) {
 		mid = low + (high - low) / 2;
-		if (lineinfo_addrs[mid] <= offset)
+		if (lineinfo_block_addrs[mid] <= offset)
 			low = mid + 1;
 		else
 			high = mid;
@@ -500,25 +506,68 @@ bool kallsyms_lookup_lineinfo(unsigned long addr, unsigned long sym_start,
 
 	if (low == 0)
 		return false;
-	low--;
+	block = low - 1;
 
-	/*
-	 * Validate that the matched lineinfo entry belongs to the same
-	 * symbol.  Without this check, assembly routines or other
-	 * functions lacking DWARF data would inherit the file:line of
-	 * a preceding C function.
-	 */
-	line_addr = (unsigned long)_text + lineinfo_addrs[low];
-	if (line_addr < sym_start)
-		return false;
+	/* How many entries in this block? */
+	block_entries = LINEINFO_BLOCK_ENTRIES;
+	if (block == lineinfo_num_blocks - 1) {
+		unsigned int remaining = lineinfo_num_entries - block * LINEINFO_BLOCK_ENTRIES;
+
+		if (remaining < block_entries)
+			block_entries = remaining;
+	}
+
+	/* Determine end of this block's data in the compressed stream */
+	if (block + 1 < lineinfo_num_blocks)
+		data_end = lineinfo_block_offsets[block + 1];
+	else
+		data_end = UINT_MAX; /* last block: read to end */
+
+	/* Decode entry 0: addr from block_addrs, file_id and line from stream */
+	pos = lineinfo_block_offsets[block];
+	cur_addr = lineinfo_block_addrs[block];
+	cur_file_id = lineinfo_read_uleb128(lineinfo_data, &pos, data_end);
+	cur_line = lineinfo_read_uleb128(lineinfo_data, &pos, data_end);
+
+	/* Check entry 0 */
+	if (cur_addr <= offset &&
+	    (unsigned long)_text + cur_addr >= sym_start) {
+		best_file_id = cur_file_id;
+		best_line = cur_line;
+		found = true;
+	}
 
-	file_id = lineinfo_file_ids[low];
-	*line = lineinfo_lines[low];
+	/* Decode entries 1..N */
+	for (unsigned int i = 1; i < block_entries; i++) {
+		unsigned int addr_delta;
+		int32_t file_delta, line_delta;
+
+		addr_delta = lineinfo_read_uleb128(lineinfo_data, &pos, data_end);
+		file_delta = zigzag_decode(lineinfo_read_uleb128(lineinfo_data, &pos, data_end));
+		line_delta = zigzag_decode(lineinfo_read_uleb128(lineinfo_data, &pos, data_end));
+
+		cur_addr += addr_delta;
+		cur_file_id = (unsigned int)((int32_t)cur_file_id + file_delta);
+		cur_line = (unsigned int)((int32_t)cur_line + line_delta);
+
+		if (cur_addr > offset)
+			break;
+
+		if ((unsigned long)_text + cur_addr >= sym_start) {
+			best_file_id = cur_file_id;
+			best_line = cur_line;
+			found = true;
+		}
+	}
+
+	if (!found)
+		return false;
 
-	if (file_id >= lineinfo_num_files)
+	if (best_file_id >= lineinfo_num_files)
 		return false;
 
-	*file = &lineinfo_filenames[lineinfo_file_offsets[file_id]];
+	*file = &lineinfo_filenames[lineinfo_file_offsets[best_file_id]];
+	*line = best_line;
 	return true;
 }
 #endif /* CONFIG_KALLSYMS_LINEINFO */
diff --git a/kernel/kallsyms_internal.h b/kernel/kallsyms_internal.h
index 868a1d5035212..691be44440395 100644
--- a/kernel/kallsyms_internal.h
+++ b/kernel/kallsyms_internal.h
@@ -17,10 +17,11 @@ extern const u8 kallsyms_seqs_of_names[];
 
 #ifdef CONFIG_KALLSYMS_LINEINFO
 extern const u32 lineinfo_num_entries;
-extern const u32 lineinfo_addrs[];
-extern const u16 lineinfo_file_ids[];
-extern const u32 lineinfo_lines[];
 extern const u32 lineinfo_num_files;
+extern const u32 lineinfo_num_blocks;
+extern const u32 lineinfo_block_addrs[];
+extern const u32 lineinfo_block_offsets[];
+extern const u8  lineinfo_data[];
 extern const u32 lineinfo_file_offsets[];
 extern const char lineinfo_filenames[];
 #endif
diff --git a/kernel/module/kallsyms.c b/kernel/module/kallsyms.c
index 7af414bd65e79..0ead1bb69de4e 100644
--- a/kernel/module/kallsyms.c
+++ b/kernel/module/kallsyms.c
@@ -512,15 +512,19 @@ bool module_lookup_lineinfo(struct module *mod, unsigned long addr,
 {
 	const struct mod_lineinfo_header *hdr;
 	const void *base;
-	const u32 *addrs, *lines, *file_offsets;
-	const u16 *file_ids;
+	const u32 *blk_addrs, *blk_offsets, *file_offsets;
+	const u8 *data;
 	const char *filenames;
-	u32 num_entries, num_files, filenames_size;
+	u32 num_entries, num_files, filenames_size, num_blocks, data_size;
 	unsigned long text_base;
 	unsigned int offset;
 	unsigned long long raw_offset;
-	unsigned int low, high, mid;
-	u16 file_id;
+	unsigned int low, high, mid, block;
+	unsigned int cur_addr, cur_file_id, cur_line;
+	unsigned int best_file_id = 0, best_line = 0;
+	unsigned int block_entries, data_end;
+	bool found = false;
+	u32 pos;
 
 	base = mod->lineinfo_data;
 	if (!base)
@@ -533,20 +537,24 @@ bool module_lookup_lineinfo(struct module *mod, unsigned long addr,
 	num_entries = hdr->num_entries;
 	num_files = hdr->num_files;
 	filenames_size = hdr->filenames_size;
+	num_blocks = hdr->num_blocks;
+	data_size = hdr->data_size;
 
-	if (num_entries == 0)
+	if (num_entries == 0 || num_blocks == 0)
 		return false;
 
 	/* Validate section is large enough for all arrays */
 	if (mod->lineinfo_data_size <
-	    mod_lineinfo_filenames_off(num_entries, num_files) + filenames_size)
+	    mod_lineinfo_filenames_off(num_blocks, data_size, num_files) +
+	    filenames_size)
 		return false;
 
-	addrs = base + mod_lineinfo_addrs_off();
-	file_ids = base + mod_lineinfo_file_ids_off(num_entries);
-	lines = base + mod_lineinfo_lines_off(num_entries);
-	file_offsets = base + mod_lineinfo_file_offsets_off(num_entries);
-	filenames = base + mod_lineinfo_filenames_off(num_entries, num_files);
+	blk_addrs = base + mod_lineinfo_block_addrs_off();
+	blk_offsets = base + mod_lineinfo_block_offsets_off(num_blocks);
+	data = base + mod_lineinfo_data_off(num_blocks);
+	file_offsets = base + mod_lineinfo_file_offsets_off(num_blocks, data_size);
+	filenames = base + mod_lineinfo_filenames_off(num_blocks, data_size,
+						      num_files);
 
 	/* Compute offset from module .text base */
 	text_base = (unsigned long)mod->mem[MOD_TEXT].base;
@@ -558,12 +566,12 @@ bool module_lookup_lineinfo(struct module *mod, unsigned long addr,
 		return false;
 	offset = (unsigned int)raw_offset;
 
-	/* Binary search for largest entry <= offset */
+	/* Binary search on block_addrs[] to find the right block */
 	low = 0;
-	high = num_entries;
+	high = num_blocks;
 	while (low < high) {
 		mid = low + (high - low) / 2;
-		if (addrs[mid] <= offset)
+		if (blk_addrs[mid] <= offset)
 			low = mid + 1;
 		else
 			high = mid;
@@ -571,21 +579,74 @@ bool module_lookup_lineinfo(struct module *mod, unsigned long addr,
 
 	if (low == 0)
 		return false;
-	low--;
+	block = low - 1;
 
-	/* Ensure the matched entry belongs to the same symbol */
-	if (text_base + addrs[low] < sym_start)
+	/* How many entries in this block? */
+	block_entries = LINEINFO_BLOCK_ENTRIES;
+	if (block == num_blocks - 1) {
+		unsigned int remaining = num_entries - block * LINEINFO_BLOCK_ENTRIES;
+
+		if (remaining < block_entries)
+			block_entries = remaining;
+	}
+
+	/* Determine end of this block's data in the compressed stream */
+	if (block + 1 < num_blocks)
+		data_end = blk_offsets[block + 1];
+	else
+		data_end = data_size;
+
+	/* Decode entry 0: addr from block_addrs, file_id and line from stream */
+	pos = blk_offsets[block];
+	if (pos >= data_end)
+		return false;
+
+	cur_addr = blk_addrs[block];
+	cur_file_id = lineinfo_read_uleb128(data, &pos, data_end);
+	cur_line = lineinfo_read_uleb128(data, &pos, data_end);
+
+	/* Check entry 0 */
+	if (cur_addr <= offset &&
+	    text_base + cur_addr >= sym_start) {
+		best_file_id = cur_file_id;
+		best_line = cur_line;
+		found = true;
+	}
+
+	/* Decode entries 1..N */
+	for (unsigned int i = 1; i < block_entries; i++) {
+		unsigned int addr_delta;
+		int32_t file_delta, line_delta;
+
+		addr_delta = lineinfo_read_uleb128(data, &pos, data_end);
+		file_delta = zigzag_decode(lineinfo_read_uleb128(data, &pos, data_end));
+		line_delta = zigzag_decode(lineinfo_read_uleb128(data, &pos, data_end));
+
+		cur_addr += addr_delta;
+		cur_file_id = (unsigned int)((int32_t)cur_file_id + file_delta);
+		cur_line = (unsigned int)((int32_t)cur_line + line_delta);
+
+		if (cur_addr > offset)
+			break;
+
+		if (text_base + cur_addr >= sym_start) {
+			best_file_id = cur_file_id;
+			best_line = cur_line;
+			found = true;
+		}
+	}
+
+	if (!found)
 		return false;
 
-	file_id = file_ids[low];
-	if (file_id >= num_files)
+	if (best_file_id >= num_files)
 		return false;
 
-	if (file_offsets[file_id] >= filenames_size)
+	if (file_offsets[best_file_id] >= filenames_size)
 		return false;
 
-	*file = &filenames[file_offsets[file_id]];
-	*line = lines[low];
+	*file = &filenames[file_offsets[best_file_id]];
+	*line = best_line;
 	return true;
 }
 #endif /* CONFIG_KALLSYMS_LINEINFO_MODULES */
diff --git a/scripts/gen_lineinfo.c b/scripts/gen_lineinfo.c
index 609de59f47ffd..9507ed9bcbe55 100644
--- a/scripts/gen_lineinfo.c
+++ b/scripts/gen_lineinfo.c
@@ -8,6 +8,9 @@
  * file containing sorted lookup tables that the kernel uses to annotate
  * stack traces with source file:line information.
  *
+ * The output uses a block-indexed, delta-encoded, ULEB128-compressed format
+ * for ~2.7x size reduction compared to flat arrays.
+ *
  * Requires libdw from elfutils.
  */
 
@@ -53,6 +56,15 @@ static struct file_entry *files;
 static unsigned int num_files;
 static unsigned int files_capacity;
 
+/* Compressed output */
+static unsigned char *compressed_data;
+static unsigned int compressed_size;
+static unsigned int compressed_capacity;
+
+static unsigned int *block_addrs;
+static unsigned int *block_offsets;
+static unsigned int num_blocks;
+
 #define FILE_HASH_BITS 13
 #define FILE_HASH_SIZE (1 << FILE_HASH_BITS)
 
@@ -352,6 +364,93 @@ static void deduplicate(void)
 	num_entries = j + 1;
 }
 
+static void compressed_ensure(unsigned int need)
+{
+	if (compressed_size + need <= compressed_capacity)
+		return;
+	compressed_capacity = compressed_capacity ? compressed_capacity * 2 : 1024 * 1024;
+	while (compressed_capacity < compressed_size + need)
+		compressed_capacity *= 2;
+	compressed_data = realloc(compressed_data, compressed_capacity);
+	if (!compressed_data) {
+		fprintf(stderr, "out of memory\n");
+		exit(1);
+	}
+}
+
+static void compress_entries(void)
+{
+	unsigned int i, block;
+
+	if (num_entries == 0) {
+		num_blocks = 0;
+		return;
+	}
+
+	num_blocks = (num_entries + LINEINFO_BLOCK_ENTRIES - 1) / LINEINFO_BLOCK_ENTRIES;
+	block_addrs = calloc(num_blocks, sizeof(*block_addrs));
+	block_offsets = calloc(num_blocks, sizeof(*block_offsets));
+	if (!block_addrs || !block_offsets) {
+		fprintf(stderr, "out of memory\n");
+		exit(1);
+	}
+
+	for (block = 0; block < num_blocks; block++) {
+		unsigned int base = block * LINEINFO_BLOCK_ENTRIES;
+		unsigned int count = num_entries - base;
+		unsigned int prev_addr, prev_file_id, prev_line;
+		unsigned char buf[10]; /* max 5 bytes per ULEB128 */
+
+		if (count > LINEINFO_BLOCK_ENTRIES)
+			count = LINEINFO_BLOCK_ENTRIES;
+
+		block_addrs[block] = entries[base].offset;
+		block_offsets[block] = compressed_size;
+
+		/* Entry 0: file_id (ULEB128), line (ULEB128) */
+		compressed_ensure(20);
+		compressed_size += lineinfo_write_uleb128(
+			compressed_data + compressed_size,
+			entries[base].file_id);
+		compressed_size += lineinfo_write_uleb128(
+			compressed_data + compressed_size,
+			entries[base].line);
+
+		prev_addr = entries[base].offset;
+		prev_file_id = entries[base].file_id;
+		prev_line = entries[base].line;
+
+		/* Entries 1..N: deltas */
+		for (i = 1; i < count; i++) {
+			unsigned int idx = base + i;
+			unsigned int addr_delta;
+			int32_t file_delta, line_delta;
+			unsigned int n;
+
+			addr_delta = entries[idx].offset - prev_addr;
+			file_delta = (int32_t)entries[idx].file_id - (int32_t)prev_file_id;
+			line_delta = (int32_t)entries[idx].line - (int32_t)prev_line;
+
+			compressed_ensure(15);
+			n = lineinfo_write_uleb128(buf, addr_delta);
+			memcpy(compressed_data + compressed_size, buf, n);
+			compressed_size += n;
+
+			n = lineinfo_write_uleb128(buf, zigzag_encode(file_delta));
+			memcpy(compressed_data + compressed_size, buf, n);
+			compressed_size += n;
+
+			n = lineinfo_write_uleb128(buf, zigzag_encode(line_delta));
+			memcpy(compressed_data + compressed_size, buf, n);
+			compressed_size += n;
+
+			prev_addr = entries[idx].offset;
+			prev_file_id = entries[idx].file_id;
+			prev_line = entries[idx].line;
+		}
+	}
+}
+
 static void compute_file_offsets(void)
 {
 	unsigned int offset = 0;
@@ -395,28 +494,40 @@ static void output_assembly(void)
 	printf("lineinfo_num_files:\n");
 	printf("\t.long %u\n\n", num_files);
 
-	/* Sorted address offsets from _text */
-	printf("\t.globl lineinfo_addrs\n");
+	/* Number of blocks */
+	printf("\t.globl lineinfo_num_blocks\n");
 	printf("\t.balign 4\n");
-	printf("lineinfo_addrs:\n");
-	for (unsigned int i = 0; i < num_entries; i++)
-		printf("\t.long 0x%x\n", entries[i].offset);
-	printf("\n");
+	printf("lineinfo_num_blocks:\n");
+	printf("\t.long %u\n\n", num_blocks);
 
-	/* File IDs, parallel to addrs (u16 -- supports up to 65535 files) */
-	printf("\t.globl lineinfo_file_ids\n");
-	printf("\t.balign 2\n");
-	printf("lineinfo_file_ids:\n");
-	for (unsigned int i = 0; i < num_entries; i++)
-		printf("\t.short %u\n", entries[i].file_id);
+	/* Block first-addresses for binary search */
+	printf("\t.globl lineinfo_block_addrs\n");
+	printf("\t.balign 4\n");
+	printf("lineinfo_block_addrs:\n");
+	for (unsigned int i = 0; i < num_blocks; i++)
+		printf("\t.long 0x%x\n", block_addrs[i]);
 	printf("\n");
 
-	/* Line numbers, parallel to addrs */
-	printf("\t.globl lineinfo_lines\n");
+	/* Block byte offsets into compressed stream */
+	printf("\t.globl lineinfo_block_offsets\n");
 	printf("\t.balign 4\n");
-	printf("lineinfo_lines:\n");
-	for (unsigned int i = 0; i < num_entries; i++)
-		printf("\t.long %u\n", entries[i].line);
+	printf("lineinfo_block_offsets:\n");
+	for (unsigned int i = 0; i < num_blocks; i++)
+		printf("\t.long %u\n", block_offsets[i]);
+	printf("\n");
+
+	/* Compressed data stream */
+	printf("\t.globl lineinfo_data\n");
+	printf("lineinfo_data:\n");
+	for (unsigned int i = 0; i < compressed_size; i++) {
+		if ((i % 16) == 0)
+			printf("\t.byte ");
+		else
+			printf(",");
+		printf("0x%02x", compressed_data[i]);
+		if ((i % 16) == 15 || i == compressed_size - 1)
+			printf("\n");
+	}
 	printf("\n");
 
 	/* File string offset table */
@@ -450,33 +561,38 @@ static void output_module_assembly(void)
 
 	printf("\t.section .mod_lineinfo, \"a\"\n\n");
 
-	/* Header: num_entries, num_files, filenames_size, reserved */
+	/* Header: num_entries, num_files, filenames_size, num_blocks, data_size, reserved */
 	printf("\t.balign 4\n");
 	printf("\t.long %u\n", num_entries);
 	printf("\t.long %u\n", num_files);
 	printf("\t.long %u\n", filenames_size);
+	printf("\t.long %u\n", num_blocks);
+	printf("\t.long %u\n", compressed_size);
 	printf("\t.long 0\n\n");
 
-	/* addrs[] */
-	for (unsigned int i = 0; i < num_entries; i++)
-		printf("\t.long 0x%x\n", entries[i].offset);
-	if (num_entries)
+	/* block_addrs[] */
+	for (unsigned int i = 0; i < num_blocks; i++)
+		printf("\t.long 0x%x\n", block_addrs[i]);
+	if (num_blocks)
 		printf("\n");
 
-	/* file_ids[] */
-	for (unsigned int i = 0; i < num_entries; i++)
-		printf("\t.short %u\n", entries[i].file_id);
-
-	/* Padding to align lines[] to 4 bytes */
-	if (num_entries & 1)
-		printf("\t.short 0\n");
-	if (num_entries)
+	/* block_offsets[] */
+	for (unsigned int i = 0; i < num_blocks; i++)
+		printf("\t.long %u\n", block_offsets[i]);
+	if (num_blocks)
 		printf("\n");
 
-	/* lines[] */
-	for (unsigned int i = 0; i < num_entries; i++)
-		printf("\t.long %u\n", entries[i].line);
-	if (num_entries)
+	/* compressed data[] */
+	for (unsigned int i = 0; i < compressed_size; i++) {
+		if ((i % 16) == 0)
+			printf("\t.byte ");
+		else
+			printf(",");
+		printf("0x%02x", compressed_data[i]);
+		if ((i % 16) == 15 || i == compressed_size - 1)
+			printf("\n");
+	}
+	if (compressed_size)
 		printf("\n");
 
 	/* file_offsets[] */
@@ -558,10 +674,11 @@ int main(int argc, char *argv[])
 			skipped_overflow);
 
 	deduplicate();
+	compress_entries();
 	compute_file_offsets();
 
-	fprintf(stderr, "lineinfo: %u entries, %u files\n",
-		num_entries, num_files);
+	fprintf(stderr, "lineinfo: %u entries, %u files, %u blocks, %u compressed bytes\n",
+		num_entries, num_files, num_blocks, compressed_size);
 
 	if (module_mode)
 		output_module_assembly();
@@ -577,6 +694,9 @@ int main(int argc, char *argv[])
 	for (unsigned int i = 0; i < num_files; i++)
 		free(files[i].name);
 	free(files);
+	free(compressed_data);
+	free(block_addrs);
+	free(block_offsets);
 
 	return 0;
 }
diff --git a/scripts/kallsyms.c b/scripts/kallsyms.c
index 42662c4fbc6c9..94fbdad3df7c6 100644
--- a/scripts/kallsyms.c
+++ b/scripts/kallsyms.c
@@ -80,11 +80,12 @@ static bool is_ignored_symbol(const char *name, char type)
 {
 	/* Ignore lineinfo symbols for kallsyms pass stability */
 	static const char * const lineinfo_syms[] = {
-		"lineinfo_addrs",
-		"lineinfo_file_ids",
+		"lineinfo_block_addrs",
+		"lineinfo_block_offsets",
+		"lineinfo_data",
 		"lineinfo_file_offsets",
 		"lineinfo_filenames",
-		"lineinfo_lines",
+		"lineinfo_num_blocks",
 		"lineinfo_num_entries",
 		"lineinfo_num_files",
 	};
diff --git a/scripts/link-vmlinux.sh b/scripts/link-vmlinux.sh
index 640209f2e9eb9..3c122cf9b95c5 100755
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@ -235,12 +235,16 @@ lineinfo_num_entries:
 	.balign 4
 lineinfo_num_files:
 	.long 0
-	.globl lineinfo_addrs
-lineinfo_addrs:
-	.globl lineinfo_file_ids
-lineinfo_file_ids:
-	.globl lineinfo_lines
-lineinfo_lines:
+	.globl lineinfo_num_blocks
+	.balign 4
+lineinfo_num_blocks:
+	.long 0
+	.globl lineinfo_block_addrs
+lineinfo_block_addrs:
+	.globl lineinfo_block_offsets
+lineinfo_block_offsets:
+	.globl lineinfo_data
+lineinfo_data:
 	.globl lineinfo_file_offsets
 lineinfo_file_offsets:
 	.globl lineinfo_filenames
-- 
2.51.0


