rust-for-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/15] Implement MODVERSIONS for Rust
@ 2024-06-17 17:58 Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 01/15] tools: Add gendwarfksyms Sami Tolvanen
                   ` (18 more replies)
  0 siblings, 19 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Hi folks,

This series implements CONFIG_MODVERSIONS for Rust, an important
feature for distributions like Android that want to ship Rust
kernel modules, and depend on modversions to help ensure module ABI
compatibility.

There have been earlier proposals [1][2] that would allow Rust
modules to coexist with modversions, but none that actually implement
symbol versioning. Unlike C, Rust source code doesn't have sufficient
information about the final ABI, as the compiler has considerable
freedom in adjusting structure layout for improved performance [3],
for example, which makes using a source code parser like genksyms
a non-starter. Based on Matt's suggestion and previous feedback
from maintainers, this series uses DWARF debugging information for
computing versions. DWARF is an established and relatively stable
format, which includes all the necessary ABI details, and adding a
CONFIG_DEBUG_INFO dependency for Rust symbol versioning seems like a
reasonable trade-off.

The first 12 patches of this series add a small tool for computing
symbol versions from DWARF, called gendwarfksyms. When passed a list
of exported symbols, the tool generates an expanded type string
for each symbol, and computes symbol CRCs similarly to genksyms.
gendwarfksyms is written in C and uses libdw to process DWARF, mainly
because of the existing support for C host tools that use elfutils
(e.g., objtool).

Another compatibility issue is fitting the extremely long mangled
Rust symbol names into struct modversion_info, which can only hold
55-character names (on 64-bit systems). Previous proposals suggested
changing the structure to support longer names, but the conclusion was
that we cannot break userspace tools that parse the version table.

The next two patches of the series implement support for hashed
symbol names in struct modversion_info, where names longer than 55
characters are hashed, and the hash is stored in the name field. To
avoid breaking userspace tools, the binary hash is prefixed with a
null-terminated string containing the name of the hash function. While
userspace tools can later be updated to potentially produce more
useful information about the long symbols, this allows them to
continue working in the meantime.

The final patch allows CONFIG_MODVERSIONS to be selected with Rust,
provided that debugging information is also available.

[1] https://lore.kernel.org/lkml/20230111161155.1349375-1-gary@garyguo.net/
[2] https://lore.kernel.org/rust-for-linux/20231118025748.2778044-1-mmaurer@google.com/
[3] https://lore.kernel.org/rust-for-linux/CAGSQo005hRiUZdeppCifDqG9zFDJRwahpBLE4x7-MyfJscn7tQ@mail.gmail.com/

Sami


Sami Tolvanen (15):
  tools: Add gendwarfksyms
  gendwarfksyms: Add symbol list input handling
  gendwarfksyms: Add CRC calculation
  gendwarfksyms: Expand base_type
  gendwarfksyms: Add a cache
  gendwarfksyms: Expand type modifiers and typedefs
  gendwarfksyms: Add pretty-printing
  gendwarfksyms: Expand subroutine_type
  gendwarfksyms: Expand array_type
  gendwarfksyms: Expand structure types
  gendwarfksyms: Limit structure expansion
  gendwarfksyms: Add inline debugging
  modpost: Add support for hashing long symbol names
  module: Support hashed symbol names when checking modversions
  kbuild: Use gendwarfksyms to generate Rust symbol versions

 Makefile                            |   6 +
 init/Kconfig                        |   2 +-
 kernel/module/version.c             |  38 +-
 rust/Makefile                       |  30 +-
 scripts/mod/Makefile                |   4 +-
 scripts/mod/modpost.c               |  20 +-
 scripts/mod/modpost.h               |  20 +
 scripts/mod/symhash.c               | 327 +++++++++++++
 tools/Makefile                      |  11 +-
 tools/gendwarfksyms/Build           |   5 +
 tools/gendwarfksyms/Makefile        |  49 ++
 tools/gendwarfksyms/cache.c         | 209 +++++++++
 tools/gendwarfksyms/crc32.c         |  69 +++
 tools/gendwarfksyms/crc32.h         |  34 ++
 tools/gendwarfksyms/gendwarfksyms.c | 141 ++++++
 tools/gendwarfksyms/gendwarfksyms.h | 173 +++++++
 tools/gendwarfksyms/symbols.c       | 193 ++++++++
 tools/gendwarfksyms/types.c         | 697 ++++++++++++++++++++++++++++
 18 files changed, 2008 insertions(+), 20 deletions(-)
 create mode 100644 scripts/mod/symhash.c
 create mode 100644 tools/gendwarfksyms/Build
 create mode 100644 tools/gendwarfksyms/Makefile
 create mode 100644 tools/gendwarfksyms/cache.c
 create mode 100644 tools/gendwarfksyms/crc32.c
 create mode 100644 tools/gendwarfksyms/crc32.h
 create mode 100644 tools/gendwarfksyms/gendwarfksyms.c
 create mode 100644 tools/gendwarfksyms/gendwarfksyms.h
 create mode 100644 tools/gendwarfksyms/symbols.c
 create mode 100644 tools/gendwarfksyms/types.c


base-commit: 6ba59ff4227927d3a8530fc2973b80e94b54d58f
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 01/15] tools: Add gendwarfksyms
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 02/15] gendwarfksyms: Add symbol list input handling Sami Tolvanen
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Add a basic DWARF parser, which uses libdw to traverse the debugging
information in an object file and looks for functions and variables.
In follow-up patches, this will be expanded to produce symbol versions
for CONFIG_MODVERSIONS from DWARF. This tool is needed mainly for Rust
kernel module compatibility.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 tools/Makefile                      |  11 +--
 tools/gendwarfksyms/Build           |   2 +
 tools/gendwarfksyms/Makefile        |  49 +++++++++++
 tools/gendwarfksyms/gendwarfksyms.c | 128 ++++++++++++++++++++++++++++
 tools/gendwarfksyms/gendwarfksyms.h |  71 +++++++++++++++
 tools/gendwarfksyms/types.c         |  87 +++++++++++++++++++
 6 files changed, 343 insertions(+), 5 deletions(-)
 create mode 100644 tools/gendwarfksyms/Build
 create mode 100644 tools/gendwarfksyms/Makefile
 create mode 100644 tools/gendwarfksyms/gendwarfksyms.c
 create mode 100644 tools/gendwarfksyms/gendwarfksyms.h
 create mode 100644 tools/gendwarfksyms/types.c

diff --git a/tools/Makefile b/tools/Makefile
index 276f5d0d53a4..60806ff753b7 100644
--- a/tools/Makefile
+++ b/tools/Makefile
@@ -17,6 +17,7 @@ help:
 	@echo '  firewire               - the userspace part of nosy, an IEEE-1394 traffic sniffer'
 	@echo '  firmware               - Firmware tools'
 	@echo '  freefall               - laptop accelerometer program for disk protection'
+	@echo '  gendwarfksyms          - generates symbol versions from DWARF'
 	@echo '  gpio                   - GPIO tools'
 	@echo '  hv                     - tools used when in Hyper-V clients'
 	@echo '  iio                    - IIO tools'
@@ -68,7 +69,7 @@ acpi: FORCE
 cpupower: FORCE
 	$(call descend,power/$@)
 
-counter firewire hv guest bootconfig spi usb virtio mm bpf iio gpio objtool leds wmi pci firmware debugging tracing: FORCE
+counter firewire hv guest bootconfig spi usb virtio mm bpf iio gendwarfksyms gpio objtool leds wmi pci firmware debugging tracing: FORCE
 	$(call descend,$@)
 
 bpf/%: FORCE
@@ -154,8 +155,8 @@ freefall_install:
 kvm_stat_install:
 	$(call descend,kvm/$(@:_install=),install)
 
-install: acpi_install counter_install cpupower_install gpio_install \
-		hv_install firewire_install iio_install \
+install: acpi_install counter_install cpupower_install gendwarfksyms_install \
+		gpio_install hv_install firewire_install iio_install \
 		perf_install selftests_install turbostat_install usb_install \
 		virtio_install mm_install bpf_install x86_energy_perf_policy_install \
 		tmon_install freefall_install objtool_install kvm_stat_install \
@@ -168,7 +169,7 @@ acpi_clean:
 cpupower_clean:
 	$(call descend,power/cpupower,clean)
 
-counter_clean hv_clean firewire_clean bootconfig_clean spi_clean usb_clean virtio_clean mm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean firmware_clean debugging_clean tracing_clean:
+counter_clean hv_clean firewire_clean bootconfig_clean spi_clean usb_clean virtio_clean mm_clean wmi_clean bpf_clean iio_clean gendwarfksyms_clean gpio_clean objtool_clean leds_clean pci_clean firmware_clean debugging_clean tracing_clean:
 	$(call descend,$(@:_clean=),clean)
 
 libapi_clean:
@@ -211,7 +212,7 @@ build_clean:
 clean: acpi_clean counter_clean cpupower_clean hv_clean firewire_clean \
 		perf_clean selftests_clean turbostat_clean bootconfig_clean spi_clean usb_clean virtio_clean \
 		mm_clean bpf_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
-		freefall_clean build_clean libbpf_clean libsubcmd_clean \
+		freefall_clean build_clean libbpf_clean libsubcmd_clean gendwarfksyms_clean \
 		gpio_clean objtool_clean leds_clean wmi_clean pci_clean firmware_clean debugging_clean \
 		intel-speed-select_clean tracing_clean thermal_clean thermometer_clean thermal-engine_clean
 
diff --git a/tools/gendwarfksyms/Build b/tools/gendwarfksyms/Build
new file mode 100644
index 000000000000..805591b6df80
--- /dev/null
+++ b/tools/gendwarfksyms/Build
@@ -0,0 +1,2 @@
+gendwarfksyms-y += gendwarfksyms.o
+gendwarfksyms-y += types.o
diff --git a/tools/gendwarfksyms/Makefile b/tools/gendwarfksyms/Makefile
new file mode 100644
index 000000000000..1138ebd8bd0f
--- /dev/null
+++ b/tools/gendwarfksyms/Makefile
@@ -0,0 +1,49 @@
+# SPDX-License-Identifier: GPL-2.0
+include ../scripts/Makefile.include
+include ../scripts/Makefile.arch
+
+ifeq ($(srctree),)
+srctree := $(patsubst %/,%,$(dir $(CURDIR)))
+srctree := $(patsubst %/,%,$(dir $(srctree)))
+endif
+
+GENDWARFKSYMS    := $(OUTPUT)gendwarfksyms
+GENDWARFKSYMS_IN := $(GENDWARFKSYMS)-in.o
+
+LIBDW_FLAGS := $(shell $(HOSTPKG_CONFIG) libdw --cflags 2>/dev/null)
+LIBDW_LIBS  := $(shell $(HOSTPKG_CONFIG) libdw --libs 2>/dev/null || echo -ldw -lelf)
+
+all: $(GENDWARFKSYMS)
+
+INCLUDES := -I$(srctree)/tools/include
+
+WARNINGS := -Wall -Wno-unused-value
+GENDWARFKSYMS_CFLAGS := -Werror $(WARNINGS) $(KBUILD_HOSTCFLAGS) -g $(INCLUDES) $(LIBDW_FLAGS)
+GENDWARFKSYMS_LDFLAGS := $(LIBDW_LIBS) $(KBUILD_HOSTLDFLAGS)
+
+# Always want host compilation.
+HOST_OVERRIDES := CC="$(HOSTCC)" LD="$(HOSTLD)" AR="$(HOSTAR)"
+
+ifeq ($(V),1)
+  Q =
+else
+  Q = @
+endif
+
+export srctree OUTPUT
+include $(srctree)/tools/build/Makefile.include
+
+$(GENDWARFKSYMS_IN): FORCE
+	$(Q)$(MAKE) $(build)=gendwarfksyms $(HOST_OVERRIDES) CFLAGS="$(GENDWARFKSYMS_CFLAGS)" \
+		LDFLAGS="$(GENDWARFKSYMS_LDFLAGS)"
+
+
+$(GENDWARFKSYMS): $(GENDWARFKSYMS_IN)
+	$(QUIET_LINK)$(HOSTCC) $(GENDWARFKSYMS_IN) $(GENDWARFKSYMS_LDFLAGS) -o $@
+
+clean:
+	$(call QUIET_CLEAN, gendwarfksyms) $(RM) $(GENDWARFKSYMS)
+
+FORCE:
+
+.PHONY: clean FORCE
diff --git a/tools/gendwarfksyms/gendwarfksyms.c b/tools/gendwarfksyms/gendwarfksyms.c
new file mode 100644
index 000000000000..4a2dea307849
--- /dev/null
+++ b/tools/gendwarfksyms/gendwarfksyms.c
@@ -0,0 +1,128 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2024 Google LLC
+ */
+
+#include <fcntl.h>
+#include <errno.h>
+#include <stdarg.h>
+#include <string.h>
+#include "gendwarfksyms.h"
+
+/*
+ * Options
+ */
+
+/* Print type descriptions and debugging output to stderr */
+bool debug;
+
+static const struct {
+	const char *arg;
+	bool *flag;
+} options[] = {
+	{ "--debug", &debug },
+};
+
+static int usage(void)
+{
+	error("usage: gendwarfksyms [options] elf-object-file < symbol-list");
+	return -1;
+}
+
+static int parse_options(int argc, const char **argv, const char **filename)
+{
+	*filename = NULL;
+
+	for (int i = 1; i < argc; i++) {
+		bool found = false;
+
+		for (int j = 0; j < ARRAY_SIZE(options); j++) {
+			if (!strcmp(argv[i], options[j].arg)) {
+				*options[j].flag = true;
+				found = true;
+				break;
+			}
+		}
+
+		if (!found) {
+			if (!*filename)
+				*filename = argv[i];
+			else
+				return -1;
+		}
+	}
+
+	return *filename ? 0 : -1;
+}
+
+static int process_modules(Dwfl_Module *mod, void **userdata, const char *name,
+			   Dwarf_Addr base, void *arg)
+{
+	Dwarf_Addr dwbias;
+	Dwarf_Die cudie;
+	Dwarf_CU *cu = NULL;
+	Dwarf *dbg;
+	int res;
+
+	debug("%s", name);
+	dbg = dwfl_module_getdwarf(mod, &dwbias);
+
+	do {
+		res = dwarf_get_units(dbg, cu, &cu, NULL, NULL, &cudie, NULL);
+		if (res < 0) {
+			error("dwarf_get_units failed: no debugging information?");
+			return -1;
+		} else if (res == 1) {
+			return DWARF_CB_OK; /* No more units */
+		}
+
+		check(process_module(mod, dbg, &cudie));
+	} while (cu);
+
+	return DWARF_CB_OK;
+}
+
+static const Dwfl_Callbacks callbacks = {
+	.section_address = dwfl_offline_section_address,
+	.find_debuginfo = dwfl_standard_find_debuginfo,
+};
+
+int main(int argc, const char **argv)
+{
+	const char *filename = NULL;
+	Dwfl *dwfl;
+	int fd;
+
+	if (parse_options(argc, argv, &filename) < 0)
+		return usage();
+
+	fd = open(filename, O_RDONLY);
+	if (fd == -1) {
+		error("open failed for '%s': %s", filename, strerror(errno));
+		return -1;
+	}
+
+	dwfl = dwfl_begin(&callbacks);
+	if (!dwfl) {
+		error("dwfl_begin failed for '%s': %s", filename,
+		      dwarf_errmsg(-1));
+		return -1;
+	}
+
+	if (!dwfl_report_offline(dwfl, filename, filename, fd)) {
+		error("dwfl_report_offline failed for '%s': %s", filename,
+		      dwarf_errmsg(-1));
+		return -1;
+	}
+
+	dwfl_report_end(dwfl, NULL, NULL);
+
+	if (dwfl_getmodules(dwfl, &process_modules, NULL, 0)) {
+		error("dwfl_getmodules failed for '%s'", filename);
+		return -1;
+	}
+
+	dwfl_end(dwfl);
+
+	return 0;
+}
diff --git a/tools/gendwarfksyms/gendwarfksyms.h b/tools/gendwarfksyms/gendwarfksyms.h
new file mode 100644
index 000000000000..44e94f1d9671
--- /dev/null
+++ b/tools/gendwarfksyms/gendwarfksyms.h
@@ -0,0 +1,71 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2024 Google LLC
+ */
+
+#include <dwarf.h>
+#include <elfutils/libdw.h>
+#include <elfutils/libdwfl.h>
+#include <linux/hashtable.h>
+#include <inttypes.h>
+#include <stdlib.h>
+#include <stdio.h>
+
+#ifndef __GENDWARFKSYMS_H
+#define __GENDWARFKSYMS_H
+
+/*
+ * Options -- in gendwarfksyms.c
+ */
+extern bool debug;
+
+/*
+ * Output helpers
+ */
+#define __PREFIX "gendwarfksyms: "
+#define __println(prefix, format, ...)                                \
+	fprintf(stderr, prefix __PREFIX "%s: " format "\n", __func__, \
+		##__VA_ARGS__)
+
+#define debug(format, ...)                                    \
+	do {                                                  \
+		if (debug)                                    \
+			__println("", format, ##__VA_ARGS__); \
+	} while (0)
+
+#define warn(format, ...) __println("warning: ", format, ##__VA_ARGS__)
+#define error(format, ...) __println("error: ", format, ##__VA_ARGS__)
+
+/*
+ * Error handling helpers
+ */
+#define check(expr)                                             \
+	({                                                      \
+		int __res = expr;                               \
+		if (__res < 0) {                                \
+			error("`%s` failed: %d", #expr, __res); \
+			return __res;                           \
+		}                                               \
+		__res;                                          \
+	})
+
+/*
+ * types.c
+ */
+
+struct state {
+	Dwfl_Module *mod;
+	Dwarf *dbg;
+};
+
+typedef int (*die_callback_t)(struct state *state, Dwarf_Die *die);
+typedef bool (*die_match_callback_t)(Dwarf_Die *die);
+extern bool match_all(Dwarf_Die *die);
+
+extern int process_die_container(struct state *state, Dwarf_Die *die,
+				 die_callback_t func,
+				 die_match_callback_t match);
+
+extern int process_module(Dwfl_Module *mod, Dwarf *dbg, Dwarf_Die *cudie);
+
+#endif /* __GENDWARFKSYMS_H */
diff --git a/tools/gendwarfksyms/types.c b/tools/gendwarfksyms/types.c
new file mode 100644
index 000000000000..2a8e45ae911c
--- /dev/null
+++ b/tools/gendwarfksyms/types.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2024 Google LLC
+ */
+
+#include "gendwarfksyms.h"
+
+/*
+ * Type string processing
+ */
+static int process(struct state *state, const char *s)
+{
+	s = s ?: "<null>";
+
+	if (debug)
+		fputs(s, stderr);
+
+	return 0;
+}
+
+bool match_all(Dwarf_Die *die)
+{
+	return true;
+}
+
+int process_die_container(struct state *state, Dwarf_Die *die,
+			  die_callback_t func, die_match_callback_t match)
+{
+	Dwarf_Die current;
+	int res;
+
+	res = check(dwarf_child(die, &current));
+	while (!res) {
+		if (match(&current))
+			check(func(state, &current));
+		res = check(dwarf_siblingof(&current, &current));
+	}
+
+	return 0;
+}
+
+/*
+ * Symbol processing
+ */
+static int process_subprogram(struct state *state, Dwarf_Die *die)
+{
+	return check(process(state, "subprogram;\n"));
+}
+
+static int process_variable(struct state *state, Dwarf_Die *die)
+{
+	return check(process(state, "variable;\n"));
+}
+
+static int process_exported_symbols(struct state *state, Dwarf_Die *die)
+{
+	int tag = dwarf_tag(die);
+
+	switch (tag) {
+	/* Possible containers of exported symbols */
+	case DW_TAG_namespace:
+	case DW_TAG_class_type:
+	case DW_TAG_structure_type:
+		return check(process_die_container(
+			state, die, process_exported_symbols, match_all));
+
+	/* Possible exported symbols */
+	case DW_TAG_subprogram:
+	case DW_TAG_variable:
+		if (tag == DW_TAG_subprogram)
+			check(process_subprogram(state, die));
+		else
+			check(process_variable(state, die));
+
+		return 0;
+	default:
+		return 0;
+	}
+}
+
+int process_module(Dwfl_Module *mod, Dwarf *dbg, Dwarf_Die *cudie)
+{
+	struct state state = { .mod = mod, .dbg = dbg };
+
+	return check(process_die_container(
+		&state, cudie, process_exported_symbols, match_all));
+}
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 02/15] gendwarfksyms: Add symbol list input handling
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 01/15] tools: Add gendwarfksyms Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 03/15] gendwarfksyms: Add CRC calculation Sami Tolvanen
                   ` (16 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Support passing a list of exported symbols to gendwarfksyms via stdin
and filter non-exported symbols from the output.

The symbol list input has the format 'symbol-address symbol-name' to
allow the parser to discover exported symbols also by address. This
is necessary for aliased symbols, where only one name appears in the
debugging information.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 tools/gendwarfksyms/Build           |   1 +
 tools/gendwarfksyms/gendwarfksyms.c |   2 +
 tools/gendwarfksyms/gendwarfksyms.h |  17 ++++
 tools/gendwarfksyms/symbols.c       | 130 ++++++++++++++++++++++++++++
 tools/gendwarfksyms/types.c         |  70 ++++++++++++++-
 5 files changed, 219 insertions(+), 1 deletion(-)
 create mode 100644 tools/gendwarfksyms/symbols.c

diff --git a/tools/gendwarfksyms/Build b/tools/gendwarfksyms/Build
index 805591b6df80..a83c59bfef8b 100644
--- a/tools/gendwarfksyms/Build
+++ b/tools/gendwarfksyms/Build
@@ -1,2 +1,3 @@
 gendwarfksyms-y += gendwarfksyms.o
+gendwarfksyms-y += symbols.o
 gendwarfksyms-y += types.o
diff --git a/tools/gendwarfksyms/gendwarfksyms.c b/tools/gendwarfksyms/gendwarfksyms.c
index 4a2dea307849..4a1bd9239182 100644
--- a/tools/gendwarfksyms/gendwarfksyms.c
+++ b/tools/gendwarfksyms/gendwarfksyms.c
@@ -96,6 +96,8 @@ int main(int argc, const char **argv)
 	if (parse_options(argc, argv, &filename) < 0)
 		return usage();
 
+	check(symbol_read_list(stdin));
+
 	fd = open(filename, O_RDONLY);
 	if (fd == -1) {
 		error("open failed for '%s': %s", filename, strerror(errno));
diff --git a/tools/gendwarfksyms/gendwarfksyms.h b/tools/gendwarfksyms/gendwarfksyms.h
index 44e94f1d9671..b77855cc94a7 100644
--- a/tools/gendwarfksyms/gendwarfksyms.h
+++ b/tools/gendwarfksyms/gendwarfksyms.h
@@ -49,6 +49,21 @@ extern bool debug;
 		__res;                                          \
 	})
 
+/*
+ * symbols.c
+ */
+
+/* Exported symbol -- matching either the name or the address */
+struct symbol {
+	const char *name;
+	uintptr_t addr;
+	struct hlist_node addr_hash;
+	struct hlist_node name_hash;
+};
+
+extern int symbol_read_list(FILE *file);
+extern struct symbol *symbol_get(uintptr_t addr, const char *name);
+
 /*
  * types.c
  */
@@ -56,6 +71,8 @@ extern bool debug;
 struct state {
 	Dwfl_Module *mod;
 	Dwarf *dbg;
+	struct symbol *sym;
+	Dwarf_Die origin;
 };
 
 typedef int (*die_callback_t)(struct state *state, Dwarf_Die *die);
diff --git a/tools/gendwarfksyms/symbols.c b/tools/gendwarfksyms/symbols.c
new file mode 100644
index 000000000000..2cae61bcede7
--- /dev/null
+++ b/tools/gendwarfksyms/symbols.c
@@ -0,0 +1,130 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2024 Google LLC
+ */
+
+#include <string.h>
+#include <linux/jhash.h>
+#include "gendwarfksyms.h"
+
+/* Hash tables for looking up requested symbols by address and name */
+#define SYMBOL_HASH_BITS 7
+DEFINE_HASHTABLE(symbol_addrs, SYMBOL_HASH_BITS);
+DEFINE_HASHTABLE(symbol_names, SYMBOL_HASH_BITS);
+
+static u32 name_hash(const char *name)
+{
+	return jhash(name, strlen(name), 0);
+}
+
+/* symbol_for_each callback -- return true to stop, false to continue */
+typedef bool (*symbol_callback_t)(struct symbol *, void *arg);
+
+static bool for_each_addr(uintptr_t addr, symbol_callback_t func, void *data)
+{
+	struct symbol *sym;
+	bool found = false;
+
+	if (addr == UINTPTR_MAX)
+		return false;
+
+	hash_for_each_possible(symbol_addrs, sym, addr_hash, addr) {
+		if (sym->addr == addr) {
+			if (func(sym, data))
+				return true;
+			found = true;
+		}
+	}
+
+	return found;
+}
+
+static bool for_each_name(const char *name, symbol_callback_t func, void *data)
+{
+	struct symbol *sym;
+	bool found = false;
+
+	if (!name)
+		return false;
+
+	hash_for_each_possible(symbol_names, sym, name_hash, name_hash(name)) {
+		if (!strcmp(sym->name, name)) {
+			if (func(sym, data))
+				return true;
+			found = true;
+		}
+	}
+
+	return found;
+}
+
+static bool for_each(uintptr_t addr, const char *name, symbol_callback_t func,
+		     void *data)
+{
+	bool found = false;
+
+	if (for_each_addr(addr, func, data))
+		found = true;
+	if (for_each_name(name, func, data))
+		found = true;
+
+	return found;
+}
+
+int symbol_read_list(FILE *file)
+{
+	struct symbol *sym;
+	char *line = NULL;
+	char *name = NULL;
+	uint64_t addr;
+	size_t size = 0;
+
+	while (getline(&line, &size, file) > 0) {
+		if (sscanf(line, "%" PRIx64 " %ms\n", &addr, &name) != 2) {
+			error("malformed input line (expected 'address symbol-name'): %s",
+			      line);
+			return -1;
+		}
+
+		free(line);
+		line = NULL;
+
+		sym = malloc(sizeof(struct symbol));
+		if (!sym) {
+			error("malloc failed");
+			return -1;
+		}
+
+		debug("adding { %lx, \"%s\" }", addr, name);
+
+		sym->addr = (uintptr_t)addr;
+		sym->name = name;
+		name = NULL;
+
+		hash_add(symbol_addrs, &sym->addr_hash, sym->addr);
+		hash_add(symbol_names, &sym->name_hash, name_hash(sym->name));
+	}
+
+	if (line)
+		free(line);
+
+	return 0;
+}
+
+static bool return_symbol(struct symbol *sym, void *arg)
+{
+	struct symbol **res = (struct symbol **)arg;
+
+	*res = sym;
+	return true; /* Stop -- return the first match */
+}
+
+struct symbol *symbol_get(uintptr_t addr, const char *name)
+{
+	struct symbol *sym;
+
+	if (for_each(addr, name, return_symbol, &sym))
+		return sym;
+
+	return NULL;
+}
diff --git a/tools/gendwarfksyms/types.c b/tools/gendwarfksyms/types.c
index 2a8e45ae911c..f1ce7bfcf510 100644
--- a/tools/gendwarfksyms/types.c
+++ b/tools/gendwarfksyms/types.c
@@ -5,6 +5,68 @@
 
 #include "gendwarfksyms.h"
 
+#define DEFINE_GET_ATTR(attr, type)                                    \
+	static bool get_##attr##_attr(Dwarf_Die *die, unsigned int id, \
+				      type *value)                     \
+	{                                                              \
+		Dwarf_Attribute da;                                    \
+		return dwarf_attr(die, id, &da) &&                     \
+		       !dwarf_form##attr(&da, value);                  \
+	}
+
+DEFINE_GET_ATTR(addr, Dwarf_Addr)
+
+static bool get_ref_die_attr(Dwarf_Die *die, unsigned int id, Dwarf_Die *value)
+{
+	Dwarf_Attribute da;
+
+	/* dwarf_formref_die returns a pointer instead of an error value. */
+	return dwarf_attr(die, id, &da) && dwarf_formref_die(&da, value);
+}
+
+static const char *get_name(Dwarf_Die *die)
+{
+	Dwarf_Attribute attr;
+
+	/* rustc uses DW_AT_linkage_name for exported symbols */
+	if (dwarf_attr(die, DW_AT_linkage_name, &attr) ||
+	    dwarf_attr(die, DW_AT_name, &attr)) {
+		return dwarf_formstring(&attr);
+	}
+
+	return NULL;
+}
+
+static Dwarf_Die *get_exported(struct state *state, Dwarf_Die *die)
+{
+	Dwarf_Die *origin = NULL;
+	Dwarf_Word addr = UINTPTR_MAX;
+
+	state->sym = NULL;
+
+	/* If the DIE has an abstract origin, use it for type expansion. */
+	if (get_ref_die_attr(die, DW_AT_abstract_origin, &state->origin))
+		origin = &state->origin;
+
+	/*
+	 * Only one name is emitted for aliased functions, so we must match
+	 * the address too, if available.
+	 */
+	if (get_addr_attr(die, DW_AT_low_pc, &addr) &&
+	    dwfl_module_relocate_address(state->mod, &addr) < 0) {
+		error("dwfl_module_relocate_address failed");
+		return NULL;
+	}
+
+	state->sym = symbol_get(addr, get_name(die));
+
+	/* Look up using the origin name if there are no matches. */
+	if (!state->sym && origin)
+		state->sym = symbol_get(addr, get_name(origin));
+
+	return origin ? origin : die;
+}
+
 /*
  * Type string processing
  */
@@ -40,7 +102,7 @@ int process_die_container(struct state *state, Dwarf_Die *die,
 }
 
 /*
- * Symbol processing
+ * Exported symbol processing
  */
 static int process_subprogram(struct state *state, Dwarf_Die *die)
 {
@@ -67,6 +129,12 @@ static int process_exported_symbols(struct state *state, Dwarf_Die *die)
 	/* Possible exported symbols */
 	case DW_TAG_subprogram:
 	case DW_TAG_variable:
+		die = get_exported(state, die);
+		if (!die || !state->sym)
+			return 0;
+
+		debug("%s (@ %lx)", state->sym->name, state->sym->addr);
+
 		if (tag == DW_TAG_subprogram)
 			check(process_subprogram(state, die));
 		else
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 03/15] gendwarfksyms: Add CRC calculation
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 01/15] tools: Add gendwarfksyms Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 02/15] gendwarfksyms: Add symbol list input handling Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 04/15] gendwarfksyms: Expand base_type Sami Tolvanen
                   ` (15 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Add a basic CRC32 implementation adapted from genksyms, and produce
matching output from type strings.

Example usage:

  $ nm -p --defined-only rust/alloc.o | \
      awk '/ (T|R|D) / {printf "%s %s\n",$1, $3}' | \
      tools/gendwarfksyms/gendwarfksyms rust/alloc.o

Output:

  #SYMVER _RNvNvMs_NtCscg2doA4GLaz_5alloc3vecINtB6_3VecppE11swap_remove13assert_failed 0x985f94dd
  #SYMVER _RNvXs1_NtCscg2doA4GLaz_5alloc11collectionsNtB5_15TryReserveErrorNtNtCs5QSdWC790r4_4core3fmt7Display3fmt 0x985f94dd
  #SYMVER _RNvNvMs_NtCscg2doA4GLaz_5alloc3vecINtB6_3VecppE6remove13assert_failed 0x985f94dd
  ...

Note that type strings are not yet expanded, so the CRCs can only
distinguish between functions and variables for now.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 tools/gendwarfksyms/Build           |  1 +
 tools/gendwarfksyms/crc32.c         | 69 ++++++++++++++++++++++++++++
 tools/gendwarfksyms/crc32.h         | 34 ++++++++++++++
 tools/gendwarfksyms/gendwarfksyms.c |  1 +
 tools/gendwarfksyms/gendwarfksyms.h |  6 +++
 tools/gendwarfksyms/symbols.c       | 71 +++++++++++++++++++++++++++--
 tools/gendwarfksyms/types.c         | 19 ++++++--
 7 files changed, 194 insertions(+), 7 deletions(-)
 create mode 100644 tools/gendwarfksyms/crc32.c
 create mode 100644 tools/gendwarfksyms/crc32.h

diff --git a/tools/gendwarfksyms/Build b/tools/gendwarfksyms/Build
index a83c59bfef8b..518642c9b9de 100644
--- a/tools/gendwarfksyms/Build
+++ b/tools/gendwarfksyms/Build
@@ -1,3 +1,4 @@
 gendwarfksyms-y += gendwarfksyms.o
+gendwarfksyms-y += crc32.o
 gendwarfksyms-y += symbols.o
 gendwarfksyms-y += types.o
diff --git a/tools/gendwarfksyms/crc32.c b/tools/gendwarfksyms/crc32.c
new file mode 100644
index 000000000000..23b328cd74f2
--- /dev/null
+++ b/tools/gendwarfksyms/crc32.c
@@ -0,0 +1,69 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Based on scripts/genksyms/genksyms.c, which has the following
+ * notice:
+ *
+ * Generate kernel symbol version hashes.
+ * Copyright 1996, 1997 Linux International.
+ *
+ * New implementation contributed by Richard Henderson <rth@tamu.edu>
+ * Based on original work by Bjorn Ekwall <bj0rn@blox.se>
+ *
+ * This file was part of the Linux modutils 2.4.22: moved back into the
+ * kernel sources by Rusty Russell/Kai Germaschewski.
+ */
+
+const unsigned int crctab32[] = {
+	0x00000000U, 0x77073096U, 0xee0e612cU, 0x990951baU, 0x076dc419U,
+	0x706af48fU, 0xe963a535U, 0x9e6495a3U, 0x0edb8832U, 0x79dcb8a4U,
+	0xe0d5e91eU, 0x97d2d988U, 0x09b64c2bU, 0x7eb17cbdU, 0xe7b82d07U,
+	0x90bf1d91U, 0x1db71064U, 0x6ab020f2U, 0xf3b97148U, 0x84be41deU,
+	0x1adad47dU, 0x6ddde4ebU, 0xf4d4b551U, 0x83d385c7U, 0x136c9856U,
+	0x646ba8c0U, 0xfd62f97aU, 0x8a65c9ecU, 0x14015c4fU, 0x63066cd9U,
+	0xfa0f3d63U, 0x8d080df5U, 0x3b6e20c8U, 0x4c69105eU, 0xd56041e4U,
+	0xa2677172U, 0x3c03e4d1U, 0x4b04d447U, 0xd20d85fdU, 0xa50ab56bU,
+	0x35b5a8faU, 0x42b2986cU, 0xdbbbc9d6U, 0xacbcf940U, 0x32d86ce3U,
+	0x45df5c75U, 0xdcd60dcfU, 0xabd13d59U, 0x26d930acU, 0x51de003aU,
+	0xc8d75180U, 0xbfd06116U, 0x21b4f4b5U, 0x56b3c423U, 0xcfba9599U,
+	0xb8bda50fU, 0x2802b89eU, 0x5f058808U, 0xc60cd9b2U, 0xb10be924U,
+	0x2f6f7c87U, 0x58684c11U, 0xc1611dabU, 0xb6662d3dU, 0x76dc4190U,
+	0x01db7106U, 0x98d220bcU, 0xefd5102aU, 0x71b18589U, 0x06b6b51fU,
+	0x9fbfe4a5U, 0xe8b8d433U, 0x7807c9a2U, 0x0f00f934U, 0x9609a88eU,
+	0xe10e9818U, 0x7f6a0dbbU, 0x086d3d2dU, 0x91646c97U, 0xe6635c01U,
+	0x6b6b51f4U, 0x1c6c6162U, 0x856530d8U, 0xf262004eU, 0x6c0695edU,
+	0x1b01a57bU, 0x8208f4c1U, 0xf50fc457U, 0x65b0d9c6U, 0x12b7e950U,
+	0x8bbeb8eaU, 0xfcb9887cU, 0x62dd1ddfU, 0x15da2d49U, 0x8cd37cf3U,
+	0xfbd44c65U, 0x4db26158U, 0x3ab551ceU, 0xa3bc0074U, 0xd4bb30e2U,
+	0x4adfa541U, 0x3dd895d7U, 0xa4d1c46dU, 0xd3d6f4fbU, 0x4369e96aU,
+	0x346ed9fcU, 0xad678846U, 0xda60b8d0U, 0x44042d73U, 0x33031de5U,
+	0xaa0a4c5fU, 0xdd0d7cc9U, 0x5005713cU, 0x270241aaU, 0xbe0b1010U,
+	0xc90c2086U, 0x5768b525U, 0x206f85b3U, 0xb966d409U, 0xce61e49fU,
+	0x5edef90eU, 0x29d9c998U, 0xb0d09822U, 0xc7d7a8b4U, 0x59b33d17U,
+	0x2eb40d81U, 0xb7bd5c3bU, 0xc0ba6cadU, 0xedb88320U, 0x9abfb3b6U,
+	0x03b6e20cU, 0x74b1d29aU, 0xead54739U, 0x9dd277afU, 0x04db2615U,
+	0x73dc1683U, 0xe3630b12U, 0x94643b84U, 0x0d6d6a3eU, 0x7a6a5aa8U,
+	0xe40ecf0bU, 0x9309ff9dU, 0x0a00ae27U, 0x7d079eb1U, 0xf00f9344U,
+	0x8708a3d2U, 0x1e01f268U, 0x6906c2feU, 0xf762575dU, 0x806567cbU,
+	0x196c3671U, 0x6e6b06e7U, 0xfed41b76U, 0x89d32be0U, 0x10da7a5aU,
+	0x67dd4accU, 0xf9b9df6fU, 0x8ebeeff9U, 0x17b7be43U, 0x60b08ed5U,
+	0xd6d6a3e8U, 0xa1d1937eU, 0x38d8c2c4U, 0x4fdff252U, 0xd1bb67f1U,
+	0xa6bc5767U, 0x3fb506ddU, 0x48b2364bU, 0xd80d2bdaU, 0xaf0a1b4cU,
+	0x36034af6U, 0x41047a60U, 0xdf60efc3U, 0xa867df55U, 0x316e8eefU,
+	0x4669be79U, 0xcb61b38cU, 0xbc66831aU, 0x256fd2a0U, 0x5268e236U,
+	0xcc0c7795U, 0xbb0b4703U, 0x220216b9U, 0x5505262fU, 0xc5ba3bbeU,
+	0xb2bd0b28U, 0x2bb45a92U, 0x5cb36a04U, 0xc2d7ffa7U, 0xb5d0cf31U,
+	0x2cd99e8bU, 0x5bdeae1dU, 0x9b64c2b0U, 0xec63f226U, 0x756aa39cU,
+	0x026d930aU, 0x9c0906a9U, 0xeb0e363fU, 0x72076785U, 0x05005713U,
+	0x95bf4a82U, 0xe2b87a14U, 0x7bb12baeU, 0x0cb61b38U, 0x92d28e9bU,
+	0xe5d5be0dU, 0x7cdcefb7U, 0x0bdbdf21U, 0x86d3d2d4U, 0xf1d4e242U,
+	0x68ddb3f8U, 0x1fda836eU, 0x81be16cdU, 0xf6b9265bU, 0x6fb077e1U,
+	0x18b74777U, 0x88085ae6U, 0xff0f6a70U, 0x66063bcaU, 0x11010b5cU,
+	0x8f659effU, 0xf862ae69U, 0x616bffd3U, 0x166ccf45U, 0xa00ae278U,
+	0xd70dd2eeU, 0x4e048354U, 0x3903b3c2U, 0xa7672661U, 0xd06016f7U,
+	0x4969474dU, 0x3e6e77dbU, 0xaed16a4aU, 0xd9d65adcU, 0x40df0b66U,
+	0x37d83bf0U, 0xa9bcae53U, 0xdebb9ec5U, 0x47b2cf7fU, 0x30b5ffe9U,
+	0xbdbdf21cU, 0xcabac28aU, 0x53b39330U, 0x24b4a3a6U, 0xbad03605U,
+	0xcdd70693U, 0x54de5729U, 0x23d967bfU, 0xb3667a2eU, 0xc4614ab8U,
+	0x5d681b02U, 0x2a6f2b94U, 0xb40bbe37U, 0xc30c8ea1U, 0x5a05df1bU,
+	0x2d02ef8dU
+};
diff --git a/tools/gendwarfksyms/crc32.h b/tools/gendwarfksyms/crc32.h
new file mode 100644
index 000000000000..89e4454b2a70
--- /dev/null
+++ b/tools/gendwarfksyms/crc32.h
@@ -0,0 +1,34 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Based on scripts/genksyms/genksyms.c, which has the following
+ * notice:
+ *
+ * Generate kernel symbol version hashes.
+ * Copyright 1996, 1997 Linux International.
+ *
+ * New implementation contributed by Richard Henderson <rth@tamu.edu>
+ * Based on original work by Bjorn Ekwall <bj0rn@blox.se>
+ *
+ * This file was part of the Linux modutils 2.4.22: moved back into the
+ * kernel sources by Rusty Russell/Kai Germaschewski.
+ */
+
+#ifndef __CRC32_H
+#define __CRC32_H
+
+extern const unsigned int crctab32[];
+
+static inline unsigned long partial_crc32_one(unsigned char c,
+					      unsigned long crc)
+{
+	return crctab32[(crc ^ c) & 0xff] ^ (crc >> 8);
+}
+
+static inline unsigned long partial_crc32(const char *s, unsigned long crc)
+{
+	while (*s)
+		crc = partial_crc32_one(*s++, crc);
+	return crc;
+}
+
+#endif /* __CRC32_H */
diff --git a/tools/gendwarfksyms/gendwarfksyms.c b/tools/gendwarfksyms/gendwarfksyms.c
index 4a1bd9239182..7938b7440674 100644
--- a/tools/gendwarfksyms/gendwarfksyms.c
+++ b/tools/gendwarfksyms/gendwarfksyms.c
@@ -125,6 +125,7 @@ int main(int argc, const char **argv)
 	}
 
 	dwfl_end(dwfl);
+	symbol_print_versions();
 
 	return 0;
 }
diff --git a/tools/gendwarfksyms/gendwarfksyms.h b/tools/gendwarfksyms/gendwarfksyms.h
index b77855cc94a7..7440f1c73500 100644
--- a/tools/gendwarfksyms/gendwarfksyms.h
+++ b/tools/gendwarfksyms/gendwarfksyms.h
@@ -52,17 +52,22 @@ extern bool debug;
 /*
  * symbols.c
  */
+enum symbol_state { UNPROCESSED, PROCESSED_ADDR, PROCESSED_NAME };
 
 /* Exported symbol -- matching either the name or the address */
 struct symbol {
 	const char *name;
 	uintptr_t addr;
+	enum symbol_state state;
+	unsigned long crc;
 	struct hlist_node addr_hash;
 	struct hlist_node name_hash;
 };
 
+extern int symbol_set_crc(struct symbol *sym, unsigned long crc);
 extern int symbol_read_list(FILE *file);
 extern struct symbol *symbol_get(uintptr_t addr, const char *name);
+extern void symbol_print_versions(void);
 
 /*
  * types.c
@@ -73,6 +78,7 @@ struct state {
 	Dwarf *dbg;
 	struct symbol *sym;
 	Dwarf_Die origin;
+	unsigned long crc;
 };
 
 typedef int (*die_callback_t)(struct state *state, Dwarf_Die *die);
diff --git a/tools/gendwarfksyms/symbols.c b/tools/gendwarfksyms/symbols.c
index 2cae61bcede7..71029635fc83 100644
--- a/tools/gendwarfksyms/symbols.c
+++ b/tools/gendwarfksyms/symbols.c
@@ -18,7 +18,8 @@ static u32 name_hash(const char *name)
 }
 
 /* symbol_for_each callback -- return true to stop, false to continue */
-typedef bool (*symbol_callback_t)(struct symbol *, void *arg);
+typedef bool (*symbol_callback_t)(struct symbol *, enum symbol_state type,
+				  void *arg);
 
 static bool for_each_addr(uintptr_t addr, symbol_callback_t func, void *data)
 {
@@ -30,7 +31,7 @@ static bool for_each_addr(uintptr_t addr, symbol_callback_t func, void *data)
 
 	hash_for_each_possible(symbol_addrs, sym, addr_hash, addr) {
 		if (sym->addr == addr) {
-			if (func(sym, data))
+			if (func(sym, PROCESSED_ADDR, data))
 				return true;
 			found = true;
 		}
@@ -49,7 +50,7 @@ static bool for_each_name(const char *name, symbol_callback_t func, void *data)
 
 	hash_for_each_possible(symbol_names, sym, name_hash, name_hash(name)) {
 		if (!strcmp(sym->name, name)) {
-			if (func(sym, data))
+			if (func(sym, PROCESSED_NAME, data))
 				return true;
 			found = true;
 		}
@@ -71,6 +72,45 @@ static bool for_each(uintptr_t addr, const char *name, symbol_callback_t func,
 	return found;
 }
 
+static bool set_crc(struct symbol *sym, enum symbol_state type, void *data)
+{
+	unsigned long *crc = data;
+
+	/* Prefer an address match if found, otherwise match by name. */
+	if (type == PROCESSED_ADDR) {
+		if (sym->state == PROCESSED_ADDR) {
+			warn("symbol %s (@ %lx) already matched by address (crc %lx vs. %lx)",
+			     sym->name, sym->addr, sym->crc, *crc);
+			return false;
+		}
+		if (sym->state == PROCESSED_NAME && sym->crc != *crc)
+			debug("symbol %s (@ %lx) overriding name match (crc %lx vs. %lx)",
+			      sym->name, sym->addr, sym->crc, *crc);
+	} else if (type == PROCESSED_NAME) {
+		if (sym->state == PROCESSED_ADDR) {
+			if (sym->crc != *crc)
+				debug("symbol %s (@ %lx) ignoring name match (crc %lx vs. %lx)",
+				      sym->name, sym->addr, sym->crc, *crc);
+			return false;
+		}
+		if (sym->state == PROCESSED_NAME) {
+			warn("symbol %s (@ %lx) already matched by name (crc %lx vs. %lx)",
+			     sym->name, sym->addr, sym->crc, *crc);
+			return false;
+		}
+	}
+
+	sym->state = type;
+	sym->crc = *crc;
+
+	return false; /* Continue */
+}
+
+int symbol_set_crc(struct symbol *sym, unsigned long crc)
+{
+	return for_each(sym->addr, sym->name, set_crc, &crc) ? 0 : -1;
+}
+
 int symbol_read_list(FILE *file)
 {
 	struct symbol *sym;
@@ -99,6 +139,8 @@ int symbol_read_list(FILE *file)
 
 		sym->addr = (uintptr_t)addr;
 		sym->name = name;
+		sym->state = UNPROCESSED;
+		sym->crc = 0;
 		name = NULL;
 
 		hash_add(symbol_addrs, &sym->addr_hash, sym->addr);
@@ -111,7 +153,7 @@ int symbol_read_list(FILE *file)
 	return 0;
 }
 
-static bool return_symbol(struct symbol *sym, void *arg)
+static bool return_symbol(struct symbol *sym, enum symbol_state type, void *arg)
 {
 	struct symbol **res = (struct symbol **)arg;
 
@@ -128,3 +170,24 @@ struct symbol *symbol_get(uintptr_t addr, const char *name)
 
 	return NULL;
 }
+
+void symbol_print_versions(void)
+{
+	struct hlist_node *tmp;
+	struct symbol *sym;
+	int i;
+
+	hash_for_each_safe(symbol_addrs, i, tmp, sym, addr_hash) {
+		if (sym->state == UNPROCESSED)
+			warn("no information for symbol %s (@ %lx)", sym->name,
+			     sym->addr);
+
+		printf("#SYMVER %s 0x%08lx\n", sym->name, sym->crc);
+
+		free((void *)sym->name);
+		free(sym);
+	}
+
+	hash_init(symbol_addrs);
+	hash_init(symbol_names);
+}
diff --git a/tools/gendwarfksyms/types.c b/tools/gendwarfksyms/types.c
index f1ce7bfcf510..47ae967d42ee 100644
--- a/tools/gendwarfksyms/types.c
+++ b/tools/gendwarfksyms/types.c
@@ -4,6 +4,7 @@
  */
 
 #include "gendwarfksyms.h"
+#include "crc32.h"
 
 #define DEFINE_GET_ATTR(attr, type)                                    \
 	static bool get_##attr##_attr(Dwarf_Die *die, unsigned int id, \
@@ -68,7 +69,7 @@ static Dwarf_Die *get_exported(struct state *state, Dwarf_Die *die)
 }
 
 /*
- * Type string processing
+ * Type string and CRC processing
  */
 static int process(struct state *state, const char *s)
 {
@@ -77,6 +78,7 @@ static int process(struct state *state, const char *s)
 	if (debug)
 		fputs(s, stderr);
 
+	state->crc = partial_crc32(s, state->crc);
 	return 0;
 }
 
@@ -101,6 +103,11 @@ int process_die_container(struct state *state, Dwarf_Die *die,
 	return 0;
 }
 
+static void state_init(struct state *state)
+{
+	state->crc = 0xffffffff;
+}
+
 /*
  * Exported symbol processing
  */
@@ -130,17 +137,23 @@ static int process_exported_symbols(struct state *state, Dwarf_Die *die)
 	case DW_TAG_subprogram:
 	case DW_TAG_variable:
 		die = get_exported(state, die);
-		if (!die || !state->sym)
+		if (!die || !state->sym || state->sym->state != UNPROCESSED)
 			return 0;
 
+		/*
+		 * For each exported symbol, compute a CRC of the expanded type
+		 * description.
+		 */
 		debug("%s (@ %lx)", state->sym->name, state->sym->addr);
+		state_init(state);
 
 		if (tag == DW_TAG_subprogram)
 			check(process_subprogram(state, die));
 		else
 			check(process_variable(state, die));
 
-		return 0;
+		return check(
+			symbol_set_crc(state->sym, state->crc ^ 0xffffffff));
 	default:
 		return 0;
 	}
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 04/15] gendwarfksyms: Expand base_type
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (2 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 03/15] gendwarfksyms: Add CRC calculation Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 05/15] gendwarfksyms: Add a cache Sami Tolvanen
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Add support for expanding basic DWARF attributes and DW_TAG_base_type
types.

Example usage:

  $ nm -p --defined-only vmlinux.o | \
      awk '/ (T|R|D) / {printf "%s %s\n",$1, $3}' | \
      grep loops_per_jiffy | \
      ./tools/gendwarfksyms/gendwarfksyms --debug vmlinux.o

Output:

  gendwarfksyms: symbol_read_list: adding { 4b0, "loops_per_jiffy" }
  gendwarfksyms: process_modules: vmlinux.o
  gendwarfksyms: process_exported_symbols: loops_per_jiffy (@ 4b0)
  variable base_type unsigned long byte_size(8);
  #SYMVER loops_per_jiffy 0xc694aefc

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 tools/gendwarfksyms/types.c | 109 +++++++++++++++++++++++++++++++++++-
 1 file changed, 108 insertions(+), 1 deletion(-)

diff --git a/tools/gendwarfksyms/types.c b/tools/gendwarfksyms/types.c
index 47ae967d42ee..7508d0fb8b80 100644
--- a/tools/gendwarfksyms/types.c
+++ b/tools/gendwarfksyms/types.c
@@ -16,6 +16,7 @@
 	}
 
 DEFINE_GET_ATTR(addr, Dwarf_Addr)
+DEFINE_GET_ATTR(udata, Dwarf_Word)
 
 static bool get_ref_die_attr(Dwarf_Die *die, unsigned int id, Dwarf_Die *value)
 {
@@ -82,6 +83,73 @@ static int process(struct state *state, const char *s)
 	return 0;
 }
 
+#define MAX_FMT_BUFFER_SIZE 128
+
+static int process_fmt(struct state *state, const char *fmt, ...)
+{
+	char buf[MAX_FMT_BUFFER_SIZE];
+	va_list args;
+	int res;
+
+	va_start(args, fmt);
+
+	res = check(vsnprintf(buf, sizeof(buf), fmt, args));
+	if (res >= MAX_FMT_BUFFER_SIZE - 1) {
+		error("vsnprintf overflow: increase MAX_FMT_BUFFER_SIZE");
+		res = -1;
+	} else {
+		check(process(state, buf));
+	}
+
+	va_end(args);
+	return res;
+}
+
+/* Process a fully qualified name from DWARF scopes */
+static int process_fqn(struct state *state, Dwarf_Die *die)
+{
+	Dwarf_Die *scopes = NULL;
+	const char *name;
+	int res;
+	int i;
+
+	res = check(dwarf_getscopes_die(die, &scopes));
+	if (!res) {
+		name = get_name(die);
+		name = name ?: "<unnamed>";
+		return check(process(state, name));
+	}
+
+	for (i = res - 1; i >= 0; i--) {
+		if (dwarf_tag(&scopes[i]) == DW_TAG_compile_unit)
+			continue;
+
+		name = get_name(&scopes[i]);
+		name = name ?: "<unnamed>";
+		check(process(state, name));
+		if (i > 0)
+			check(process(state, "::"));
+	}
+
+	free(scopes);
+	return 0;
+}
+
+#define DEFINE_PROCESS_UDATA_ATTRIBUTE(attribute)                         \
+	static int process_##attribute##_attr(struct state *state,        \
+					      Dwarf_Die *die)             \
+	{                                                                 \
+		Dwarf_Word value;                                         \
+		if (get_udata_attr(die, DW_AT_##attribute, &value))       \
+			check(process_fmt(state,                          \
+					  " " #attribute "(%" PRIu64 ")", \
+					  value));                        \
+		return 0;                                                 \
+	}
+
+DEFINE_PROCESS_UDATA_ATTRIBUTE(alignment)
+DEFINE_PROCESS_UDATA_ATTRIBUTE(byte_size)
+
 bool match_all(Dwarf_Die *die)
 {
 	return true;
@@ -103,11 +171,48 @@ int process_die_container(struct state *state, Dwarf_Die *die,
 	return 0;
 }
 
+static int process_type(struct state *state, Dwarf_Die *die);
+
+static int process_type_attr(struct state *state, Dwarf_Die *die)
+{
+	Dwarf_Die type;
+
+	if (get_ref_die_attr(die, DW_AT_type, &type))
+		return check(process_type(state, &type));
+
+	/* Compilers can omit DW_AT_type -- print out 'void' to clarify */
+	return check(process(state, "base_type void"));
+}
+
+static int process_base_type(struct state *state, Dwarf_Die *die)
+{
+	check(process(state, "base_type "));
+	check(process_fqn(state, die));
+	check(process_byte_size_attr(state, die));
+	return check(process_alignment_attr(state, die));
+}
+
 static void state_init(struct state *state)
 {
 	state->crc = 0xffffffff;
 }
 
+static int process_type(struct state *state, Dwarf_Die *die)
+{
+	int tag = dwarf_tag(die);
+
+	switch (tag) {
+	case DW_TAG_base_type:
+		check(process_base_type(state, die));
+		break;
+	default:
+		debug("unimplemented type: %x", tag);
+		break;
+	}
+
+	return 0;
+}
+
 /*
  * Exported symbol processing
  */
@@ -118,7 +223,9 @@ static int process_subprogram(struct state *state, Dwarf_Die *die)
 
 static int process_variable(struct state *state, Dwarf_Die *die)
 {
-	return check(process(state, "variable;\n"));
+	check(process(state, "variable "));
+	check(process_type_attr(state, die));
+	return check(process(state, ";\n"));
 }
 
 static int process_exported_symbols(struct state *state, Dwarf_Die *die)
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 05/15] gendwarfksyms: Add a cache
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (3 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 04/15] gendwarfksyms: Expand base_type Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 06/15] gendwarfksyms: Expand type modifiers and typedefs Sami Tolvanen
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Basic types in DWARF repeat frequently and traversing the DIEs using
libdw is relatively slow. Add a simple hashtable based cache for the
processed DIEs.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 tools/gendwarfksyms/Build           |   1 +
 tools/gendwarfksyms/cache.c         | 147 ++++++++++++++++++++++++++++
 tools/gendwarfksyms/gendwarfksyms.c |   4 +
 tools/gendwarfksyms/gendwarfksyms.h |  37 ++++++-
 tools/gendwarfksyms/types.c         | 140 ++++++++++++++++++--------
 5 files changed, 286 insertions(+), 43 deletions(-)
 create mode 100644 tools/gendwarfksyms/cache.c

diff --git a/tools/gendwarfksyms/Build b/tools/gendwarfksyms/Build
index 518642c9b9de..220a4aa9b380 100644
--- a/tools/gendwarfksyms/Build
+++ b/tools/gendwarfksyms/Build
@@ -1,4 +1,5 @@
 gendwarfksyms-y += gendwarfksyms.o
+gendwarfksyms-y += cache.o
 gendwarfksyms-y += crc32.o
 gendwarfksyms-y += symbols.o
 gendwarfksyms-y += types.o
diff --git a/tools/gendwarfksyms/cache.c b/tools/gendwarfksyms/cache.c
new file mode 100644
index 000000000000..9adda113e0b6
--- /dev/null
+++ b/tools/gendwarfksyms/cache.c
@@ -0,0 +1,147 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2024 Google LLC
+ */
+
+#include <string.h>
+#include "gendwarfksyms.h"
+
+#define DIE_HASH_BITS 10
+
+/* die->addr -> struct cached_die */
+DEFINE_HASHTABLE(die_cache, DIE_HASH_BITS);
+
+static unsigned int cache_hits;
+static unsigned int cache_misses;
+
+static int create_die(Dwarf_Die *die, struct cached_die **res)
+{
+	struct cached_die *cd;
+
+	cd = malloc(sizeof(struct cached_die));
+	if (!cd) {
+		error("malloc failed");
+		return -1;
+	}
+
+	cd->state = INCOMPLETE;
+	cd->addr = (uintptr_t)die->addr;
+	cd->list = NULL;
+
+	hash_add(die_cache, &cd->hash, cd->addr);
+	*res = cd;
+	return 0;
+}
+
+int cache_get(Dwarf_Die *die, enum cached_die_state state,
+	      struct cached_die **res)
+{
+	struct cached_die *cd;
+	uintptr_t addr = (uintptr_t)die->addr;
+
+	hash_for_each_possible(die_cache, cd, hash, addr) {
+		if (cd->addr == addr && cd->state == state) {
+			*res = cd;
+			cache_hits++;
+			return 0;
+		}
+	}
+
+	cache_misses++;
+	return check(create_die(die, res));
+}
+
+static void reset_die(struct cached_die *cd)
+{
+	struct cached_item *tmp;
+	struct cached_item *ci = cd->list;
+
+	while (ci) {
+		if (ci->type == STRING)
+			free(ci->data.str);
+
+		tmp = ci->next;
+		free(ci);
+		ci = tmp;
+	}
+
+	cd->state = INCOMPLETE;
+	cd->list = NULL;
+}
+
+void cache_free(void)
+{
+	struct cached_die *cd;
+	struct hlist_node *tmp;
+	int i;
+
+	hash_for_each_safe(die_cache, i, tmp, cd, hash) {
+		reset_die(cd);
+		free(cd);
+	}
+	hash_init(die_cache);
+
+	if ((cache_hits + cache_misses > 0))
+		debug("cache: hits %u, misses %u (hit rate %.02f%%)",
+		      cache_hits, cache_misses,
+		      (100.0f * cache_hits) / (cache_hits + cache_misses));
+}
+
+static int append_item(struct cached_die *cd, struct cached_item **res)
+{
+	struct cached_item *prev;
+	struct cached_item *ci;
+
+	ci = malloc(sizeof(struct cached_item));
+	if (!ci) {
+		error("malloc failed");
+		return -1;
+	}
+
+	ci->type = EMPTY;
+	ci->next = NULL;
+
+	prev = cd->list;
+	while (prev && prev->next)
+		prev = prev->next;
+
+	if (prev)
+		prev->next = ci;
+	else
+		cd->list = ci;
+
+	*res = ci;
+	return 0;
+}
+
+int cache_add_string(struct cached_die *cd, const char *str)
+{
+	struct cached_item *ci;
+
+	if (!cd)
+		return 0;
+
+	check(append_item(cd, &ci));
+
+	ci->data.str = strdup(str);
+	if (!ci->data.str) {
+		error("strdup failed");
+		return -1;
+	}
+
+	ci->type = STRING;
+	return 0;
+}
+
+int cache_add_die(struct cached_die *cd, Dwarf_Die *die)
+{
+	struct cached_item *ci;
+
+	if (!cd)
+		return 0;
+
+	check(append_item(cd, &ci));
+	ci->data.addr = (uintptr_t)die->addr;
+	ci->type = DIE;
+	return 0;
+}
diff --git a/tools/gendwarfksyms/gendwarfksyms.c b/tools/gendwarfksyms/gendwarfksyms.c
index 7938b7440674..38ccaeb72426 100644
--- a/tools/gendwarfksyms/gendwarfksyms.c
+++ b/tools/gendwarfksyms/gendwarfksyms.c
@@ -15,12 +15,15 @@
 
 /* Print type descriptions and debugging output to stderr */
 bool debug;
+/* Don't use caching */
+bool no_cache;
 
 static const struct {
 	const char *arg;
 	bool *flag;
 } options[] = {
 	{ "--debug", &debug },
+	{ "--no-cache", &no_cache },
 };
 
 static int usage(void)
@@ -126,6 +129,7 @@ int main(int argc, const char **argv)
 
 	dwfl_end(dwfl);
 	symbol_print_versions();
+	cache_free();
 
 	return 0;
 }
diff --git a/tools/gendwarfksyms/gendwarfksyms.h b/tools/gendwarfksyms/gendwarfksyms.h
index 7440f1c73500..ea5a9fbda66f 100644
--- a/tools/gendwarfksyms/gendwarfksyms.h
+++ b/tools/gendwarfksyms/gendwarfksyms.h
@@ -18,6 +18,7 @@
  * Options -- in gendwarfksyms.c
  */
 extern bool debug;
+extern bool no_cache;
 
 /*
  * Output helpers
@@ -69,6 +70,35 @@ extern int symbol_read_list(FILE *file);
 extern struct symbol *symbol_get(uintptr_t addr, const char *name);
 extern void symbol_print_versions(void);
 
+/*
+ * cache.c
+ */
+enum cached_item_type { EMPTY, STRING, DIE };
+
+struct cached_item {
+	enum cached_item_type type;
+	union {
+		char *str;
+		uintptr_t addr;
+	} data;
+	struct cached_item *next;
+};
+
+enum cached_die_state { INCOMPLETE, COMPLETE };
+
+struct cached_die {
+	enum cached_die_state state;
+	uintptr_t addr;
+	struct cached_item *list;
+	struct hlist_node hash;
+};
+
+extern int cache_get(Dwarf_Die *die, enum cached_die_state state,
+		     struct cached_die **res);
+extern int cache_add_string(struct cached_die *pd, const char *str);
+extern int cache_add_die(struct cached_die *pd, Dwarf_Die *die);
+extern void cache_free(void);
+
 /*
  * types.c
  */
@@ -81,12 +111,13 @@ struct state {
 	unsigned long crc;
 };
 
-typedef int (*die_callback_t)(struct state *state, Dwarf_Die *die);
+typedef int (*die_callback_t)(struct state *state, struct cached_die *cache,
+			      Dwarf_Die *die);
 typedef bool (*die_match_callback_t)(Dwarf_Die *die);
 extern bool match_all(Dwarf_Die *die);
 
-extern int process_die_container(struct state *state, Dwarf_Die *die,
-				 die_callback_t func,
+extern int process_die_container(struct state *state, struct cached_die *cache,
+				 Dwarf_Die *die, die_callback_t func,
 				 die_match_callback_t match);
 
 extern int process_module(Dwfl_Module *mod, Dwarf *dbg, Dwarf_Die *cudie);
diff --git a/tools/gendwarfksyms/types.c b/tools/gendwarfksyms/types.c
index 7508d0fb8b80..78046c08be23 100644
--- a/tools/gendwarfksyms/types.c
+++ b/tools/gendwarfksyms/types.c
@@ -72,7 +72,7 @@ static Dwarf_Die *get_exported(struct state *state, Dwarf_Die *die)
 /*
  * Type string and CRC processing
  */
-static int process(struct state *state, const char *s)
+static int process(struct state *state, struct cached_die *cache, const char *s)
 {
 	s = s ?: "<null>";
 
@@ -80,12 +80,13 @@ static int process(struct state *state, const char *s)
 		fputs(s, stderr);
 
 	state->crc = partial_crc32(s, state->crc);
-	return 0;
+	return cache_add_string(cache, s);
 }
 
 #define MAX_FMT_BUFFER_SIZE 128
 
-static int process_fmt(struct state *state, const char *fmt, ...)
+static int process_fmt(struct state *state, struct cached_die *cache,
+		       const char *fmt, ...)
 {
 	char buf[MAX_FMT_BUFFER_SIZE];
 	va_list args;
@@ -98,7 +99,7 @@ static int process_fmt(struct state *state, const char *fmt, ...)
 		error("vsnprintf overflow: increase MAX_FMT_BUFFER_SIZE");
 		res = -1;
 	} else {
-		check(process(state, buf));
+		check(process(state, cache, buf));
 	}
 
 	va_end(args);
@@ -106,7 +107,8 @@ static int process_fmt(struct state *state, const char *fmt, ...)
 }
 
 /* Process a fully qualified name from DWARF scopes */
-static int process_fqn(struct state *state, Dwarf_Die *die)
+static int process_fqn(struct state *state, struct cached_die *cache,
+		       Dwarf_Die *die)
 {
 	Dwarf_Die *scopes = NULL;
 	const char *name;
@@ -117,7 +119,7 @@ static int process_fqn(struct state *state, Dwarf_Die *die)
 	if (!res) {
 		name = get_name(die);
 		name = name ?: "<unnamed>";
-		return check(process(state, name));
+		return check(process(state, cache, name));
 	}
 
 	for (i = res - 1; i >= 0; i--) {
@@ -126,25 +128,25 @@ static int process_fqn(struct state *state, Dwarf_Die *die)
 
 		name = get_name(&scopes[i]);
 		name = name ?: "<unnamed>";
-		check(process(state, name));
+		check(process(state, cache, name));
 		if (i > 0)
-			check(process(state, "::"));
+			check(process(state, cache, "::"));
 	}
 
 	free(scopes);
 	return 0;
 }
 
-#define DEFINE_PROCESS_UDATA_ATTRIBUTE(attribute)                         \
-	static int process_##attribute##_attr(struct state *state,        \
-					      Dwarf_Die *die)             \
-	{                                                                 \
-		Dwarf_Word value;                                         \
-		if (get_udata_attr(die, DW_AT_##attribute, &value))       \
-			check(process_fmt(state,                          \
-					  " " #attribute "(%" PRIu64 ")", \
-					  value));                        \
-		return 0;                                                 \
+#define DEFINE_PROCESS_UDATA_ATTRIBUTE(attribute)                              \
+	static int process_##attribute##_attr(                                 \
+		struct state *state, struct cached_die *cache, Dwarf_Die *die) \
+	{                                                                      \
+		Dwarf_Word value;                                              \
+		if (get_udata_attr(die, DW_AT_##attribute, &value))            \
+			check(process_fmt(state, cache,                        \
+					  " " #attribute "(%" PRIu64 ")",      \
+					  value));                             \
+		return 0;                                                      \
 	}
 
 DEFINE_PROCESS_UDATA_ATTRIBUTE(alignment)
@@ -155,8 +157,9 @@ bool match_all(Dwarf_Die *die)
 	return true;
 }
 
-int process_die_container(struct state *state, Dwarf_Die *die,
-			  die_callback_t func, die_match_callback_t match)
+int process_die_container(struct state *state, struct cached_die *cache,
+			  Dwarf_Die *die, die_callback_t func,
+			  die_match_callback_t match)
 {
 	Dwarf_Die current;
 	int res;
@@ -164,32 +167,65 @@ int process_die_container(struct state *state, Dwarf_Die *die,
 	res = check(dwarf_child(die, &current));
 	while (!res) {
 		if (match(&current))
-			check(func(state, &current));
+			check(func(state, cache, &current));
 		res = check(dwarf_siblingof(&current, &current));
 	}
 
 	return 0;
 }
 
-static int process_type(struct state *state, Dwarf_Die *die);
+static int process_type(struct state *state, struct cached_die *parent,
+			Dwarf_Die *die);
 
-static int process_type_attr(struct state *state, Dwarf_Die *die)
+static int process_type_attr(struct state *state, struct cached_die *cache,
+			     Dwarf_Die *die)
 {
 	Dwarf_Die type;
 
 	if (get_ref_die_attr(die, DW_AT_type, &type))
-		return check(process_type(state, &type));
+		return check(process_type(state, cache, &type));
 
 	/* Compilers can omit DW_AT_type -- print out 'void' to clarify */
-	return check(process(state, "base_type void"));
+	return check(process(state, cache, "base_type void"));
+}
+
+static int process_base_type(struct state *state, struct cached_die *cache,
+			     Dwarf_Die *die)
+{
+	check(process(state, cache, "base_type "));
+	check(process_fqn(state, cache, die));
+	check(process_byte_size_attr(state, cache, die));
+	return check(process_alignment_attr(state, cache, die));
 }
 
-static int process_base_type(struct state *state, Dwarf_Die *die)
+static int process_cached(struct state *state, struct cached_die *cache,
+			  Dwarf_Die *die)
 {
-	check(process(state, "base_type "));
-	check(process_fqn(state, die));
-	check(process_byte_size_attr(state, die));
-	return check(process_alignment_attr(state, die));
+	struct cached_item *ci = cache->list;
+	Dwarf_Die child;
+
+	while (ci) {
+		switch (ci->type) {
+		case STRING:
+			check(process(state, NULL, ci->data.str));
+			break;
+		case DIE:
+			if (!dwarf_die_addr_die(state->dbg,
+						(void *)ci->data.addr,
+						&child)) {
+				error("dwarf_die_addr_die failed");
+				return -1;
+			}
+			check(process_type(state, NULL, &child));
+			break;
+		default:
+			error("empty cached_item");
+			return -1;
+		}
+		ci = ci->next;
+	}
+
+	return 0;
 }
 
 static void state_init(struct state *state)
@@ -197,19 +233,41 @@ static void state_init(struct state *state)
 	state->crc = 0xffffffff;
 }
 
-static int process_type(struct state *state, Dwarf_Die *die)
+static int process_type(struct state *state, struct cached_die *parent,
+			Dwarf_Die *die)
 {
+	struct cached_die *cache = NULL;
 	int tag = dwarf_tag(die);
 
+	/*
+	 * If we have the DIE already cached, use it instead of walking
+	 * through DWARF.
+	 */
+	if (!no_cache) {
+		check(cache_get(die, COMPLETE, &cache));
+
+		if (cache->state == COMPLETE) {
+			check(process_cached(state, cache, die));
+			check(cache_add_die(parent, die));
+			return 0;
+		}
+	}
+
 	switch (tag) {
 	case DW_TAG_base_type:
-		check(process_base_type(state, die));
+		check(process_base_type(state, cache, die));
 		break;
 	default:
 		debug("unimplemented type: %x", tag);
 		break;
 	}
 
+	if (!no_cache) {
+		/* Update cache state and append to the parent (if any) */
+		cache->state = COMPLETE;
+		check(cache_add_die(parent, die));
+	}
+
 	return 0;
 }
 
@@ -218,17 +276,18 @@ static int process_type(struct state *state, Dwarf_Die *die)
  */
 static int process_subprogram(struct state *state, Dwarf_Die *die)
 {
-	return check(process(state, "subprogram;\n"));
+	return check(process(state, NULL, "subprogram;\n"));
 }
 
 static int process_variable(struct state *state, Dwarf_Die *die)
 {
-	check(process(state, "variable "));
-	check(process_type_attr(state, die));
-	return check(process(state, ";\n"));
+	check(process(state, NULL, "variable "));
+	check(process_type_attr(state, NULL, die));
+	return check(process(state, NULL, ";\n"));
 }
 
-static int process_exported_symbols(struct state *state, Dwarf_Die *die)
+static int process_exported_symbols(struct state *state,
+				    struct cached_die *cache, Dwarf_Die *die)
 {
 	int tag = dwarf_tag(die);
 
@@ -237,8 +296,9 @@ static int process_exported_symbols(struct state *state, Dwarf_Die *die)
 	case DW_TAG_namespace:
 	case DW_TAG_class_type:
 	case DW_TAG_structure_type:
-		return check(process_die_container(
-			state, die, process_exported_symbols, match_all));
+		return check(process_die_container(state, cache, die,
+						   process_exported_symbols,
+						   match_all));
 
 	/* Possible exported symbols */
 	case DW_TAG_subprogram:
@@ -271,5 +331,5 @@ int process_module(Dwfl_Module *mod, Dwarf *dbg, Dwarf_Die *cudie)
 	struct state state = { .mod = mod, .dbg = dbg };
 
 	return check(process_die_container(
-		&state, cudie, process_exported_symbols, match_all));
+		&state, NULL, cudie, process_exported_symbols, match_all));
 }
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 06/15] gendwarfksyms: Expand type modifiers and typedefs
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (4 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 05/15] gendwarfksyms: Add a cache Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 07/15] gendwarfksyms: Add pretty-printing Sami Tolvanen
                   ` (12 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Add support for expanding DWARF type modifiers, such as pointers,
const values etc., and typedefs. These types all have DW_AT_type
attribute pointing the underlying type, and thus produce similar
output.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 tools/gendwarfksyms/gendwarfksyms.h |  3 ++
 tools/gendwarfksyms/types.c         | 54 +++++++++++++++++++++++++++--
 2 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/tools/gendwarfksyms/gendwarfksyms.h b/tools/gendwarfksyms/gendwarfksyms.h
index ea5a9fbda66f..d5ebfe405445 100644
--- a/tools/gendwarfksyms/gendwarfksyms.h
+++ b/tools/gendwarfksyms/gendwarfksyms.h
@@ -50,6 +50,9 @@ extern bool no_cache;
 		__res;                                          \
 	})
 
+/* Consistent aliases (DW_TAG_<type>_type) for DWARF tags */
+#define DW_TAG_typedef_type DW_TAG_typedef
+
 /*
  * symbols.c
  */
diff --git a/tools/gendwarfksyms/types.c b/tools/gendwarfksyms/types.c
index 78046c08be23..27169c77937f 100644
--- a/tools/gendwarfksyms/types.c
+++ b/tools/gendwarfksyms/types.c
@@ -189,6 +189,38 @@ static int process_type_attr(struct state *state, struct cached_die *cache,
 	return check(process(state, cache, "base_type void"));
 }
 
+/* Container types with DW_AT_type */
+static int __process_type(struct state *state, struct cached_die *cache,
+			  Dwarf_Die *die, const char *type)
+{
+	check(process(state, cache, type));
+	check(process_fqn(state, cache, die));
+	check(process(state, cache, " {"));
+	check(process_type_attr(state, cache, die));
+	check(process(state, cache, "}"));
+	check(process_byte_size_attr(state, cache, die));
+	return check(process_alignment_attr(state, cache, die));
+}
+
+#define DEFINE_PROCESS_TYPE(type)                                              \
+	static int process_##type##_type(                                      \
+		struct state *state, struct cached_die *cache, Dwarf_Die *die) \
+	{                                                                      \
+		return __process_type(state, cache, die, #type "_type ");      \
+	}
+
+DEFINE_PROCESS_TYPE(atomic)
+DEFINE_PROCESS_TYPE(const)
+DEFINE_PROCESS_TYPE(immutable)
+DEFINE_PROCESS_TYPE(packed)
+DEFINE_PROCESS_TYPE(pointer)
+DEFINE_PROCESS_TYPE(reference)
+DEFINE_PROCESS_TYPE(restrict)
+DEFINE_PROCESS_TYPE(rvalue_reference)
+DEFINE_PROCESS_TYPE(shared)
+DEFINE_PROCESS_TYPE(volatile)
+DEFINE_PROCESS_TYPE(typedef)
+
 static int process_base_type(struct state *state, struct cached_die *cache,
 			     Dwarf_Die *die)
 {
@@ -233,6 +265,11 @@ static void state_init(struct state *state)
 	state->crc = 0xffffffff;
 }
 
+#define PROCESS_TYPE(type)                                       \
+	case DW_TAG_##type##_type:                               \
+		check(process_##type##_type(state, cache, die)); \
+		break;
+
 static int process_type(struct state *state, struct cached_die *parent,
 			Dwarf_Die *die)
 {
@@ -254,9 +291,20 @@ static int process_type(struct state *state, struct cached_die *parent,
 	}
 
 	switch (tag) {
-	case DW_TAG_base_type:
-		check(process_base_type(state, cache, die));
-		break;
+	/* Type modifiers */
+	PROCESS_TYPE(atomic)
+	PROCESS_TYPE(const)
+	PROCESS_TYPE(immutable)
+	PROCESS_TYPE(packed)
+	PROCESS_TYPE(pointer)
+	PROCESS_TYPE(reference)
+	PROCESS_TYPE(restrict)
+	PROCESS_TYPE(rvalue_reference)
+	PROCESS_TYPE(shared)
+	PROCESS_TYPE(volatile)
+	/* Other types */
+	PROCESS_TYPE(base)
+	PROCESS_TYPE(typedef)
 	default:
 		debug("unimplemented type: %x", tag);
 		break;
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 07/15] gendwarfksyms: Add pretty-printing
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (5 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 06/15] gendwarfksyms: Expand type modifiers and typedefs Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 08/15] gendwarfksyms: Expand subroutine_type Sami Tolvanen
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Add linebreaks and indentation to --debug output. This will be
particularly useful for more complex structures.

Example output with --debug:

  variable pointer_type <unnamed> {
    base_type char byte_size(1)
  };

And the same with --debug --no-pretty-print:

  variable pointer_type <unnamed> {base_type char byte_size(1)};

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 tools/gendwarfksyms/cache.c         | 13 +++++++++++++
 tools/gendwarfksyms/gendwarfksyms.c |  3 +++
 tools/gendwarfksyms/gendwarfksyms.h |  5 ++++-
 tools/gendwarfksyms/types.c         | 22 ++++++++++++++++++++++
 4 files changed, 42 insertions(+), 1 deletion(-)

diff --git a/tools/gendwarfksyms/cache.c b/tools/gendwarfksyms/cache.c
index 9adda113e0b6..85a2adeb649d 100644
--- a/tools/gendwarfksyms/cache.c
+++ b/tools/gendwarfksyms/cache.c
@@ -133,6 +133,19 @@ int cache_add_string(struct cached_die *cd, const char *str)
 	return 0;
 }
 
+int cache_add_linebreak(struct cached_die *cd, int linebreak)
+{
+	struct cached_item *ci;
+
+	if (!cd)
+		return 0;
+
+	check(append_item(cd, &ci));
+	ci->data.linebreak = linebreak;
+	ci->type = LINEBREAK;
+	return 0;
+}
+
 int cache_add_die(struct cached_die *cd, Dwarf_Die *die)
 {
 	struct cached_item *ci;
diff --git a/tools/gendwarfksyms/gendwarfksyms.c b/tools/gendwarfksyms/gendwarfksyms.c
index 38ccaeb72426..7095f0ecccab 100644
--- a/tools/gendwarfksyms/gendwarfksyms.c
+++ b/tools/gendwarfksyms/gendwarfksyms.c
@@ -17,6 +17,8 @@
 bool debug;
 /* Don't use caching */
 bool no_cache;
+/* Don't pretty-print (with --debug) */
+bool no_pretty_print;
 
 static const struct {
 	const char *arg;
@@ -24,6 +26,7 @@ static const struct {
 } options[] = {
 	{ "--debug", &debug },
 	{ "--no-cache", &no_cache },
+	{ "--no-pretty-print", &no_pretty_print },
 };
 
 static int usage(void)
diff --git a/tools/gendwarfksyms/gendwarfksyms.h b/tools/gendwarfksyms/gendwarfksyms.h
index d5ebfe405445..43eff91e2f2f 100644
--- a/tools/gendwarfksyms/gendwarfksyms.h
+++ b/tools/gendwarfksyms/gendwarfksyms.h
@@ -19,6 +19,7 @@
  */
 extern bool debug;
 extern bool no_cache;
+extern bool no_pretty_print;
 
 /*
  * Output helpers
@@ -76,12 +77,13 @@ extern void symbol_print_versions(void);
 /*
  * cache.c
  */
-enum cached_item_type { EMPTY, STRING, DIE };
+enum cached_item_type { EMPTY, STRING, LINEBREAK, DIE };
 
 struct cached_item {
 	enum cached_item_type type;
 	union {
 		char *str;
+		int linebreak;
 		uintptr_t addr;
 	} data;
 	struct cached_item *next;
@@ -99,6 +101,7 @@ struct cached_die {
 extern int cache_get(Dwarf_Die *die, enum cached_die_state state,
 		     struct cached_die **res);
 extern int cache_add_string(struct cached_die *pd, const char *str);
+extern int cache_add_linebreak(struct cached_die *pd, int linebreak);
 extern int cache_add_die(struct cached_die *pd, Dwarf_Die *die);
 extern void cache_free(void);
 
diff --git a/tools/gendwarfksyms/types.c b/tools/gendwarfksyms/types.c
index 27169c77937f..74b3755c3e16 100644
--- a/tools/gendwarfksyms/types.c
+++ b/tools/gendwarfksyms/types.c
@@ -6,6 +6,17 @@
 #include "gendwarfksyms.h"
 #include "crc32.h"
 
+static bool do_linebreak;
+static int indentation_level;
+
+/* Line breaks and indentation for pretty-printing */
+static int process_linebreak(struct cached_die *cache, int n)
+{
+	indentation_level += n;
+	do_linebreak = true;
+	return check(cache_add_linebreak(cache, n));
+}
+
 #define DEFINE_GET_ATTR(attr, type)                                    \
 	static bool get_##attr##_attr(Dwarf_Die *die, unsigned int id, \
 				      type *value)                     \
@@ -76,6 +87,12 @@ static int process(struct state *state, struct cached_die *cache, const char *s)
 {
 	s = s ?: "<null>";
 
+	if (debug && !no_pretty_print && do_linebreak) {
+		fputs("\n", stderr);
+		for (int i = 0; i < indentation_level; i++)
+			fputs("  ", stderr);
+		do_linebreak = false;
+	}
 	if (debug)
 		fputs(s, stderr);
 
@@ -196,7 +213,9 @@ static int __process_type(struct state *state, struct cached_die *cache,
 	check(process(state, cache, type));
 	check(process_fqn(state, cache, die));
 	check(process(state, cache, " {"));
+	check(process_linebreak(cache, 1));
 	check(process_type_attr(state, cache, die));
+	check(process_linebreak(cache, -1));
 	check(process(state, cache, "}"));
 	check(process_byte_size_attr(state, cache, die));
 	return check(process_alignment_attr(state, cache, die));
@@ -241,6 +260,9 @@ static int process_cached(struct state *state, struct cached_die *cache,
 		case STRING:
 			check(process(state, NULL, ci->data.str));
 			break;
+		case LINEBREAK:
+			check(process_linebreak(NULL, ci->data.linebreak));
+			break;
 		case DIE:
 			if (!dwarf_die_addr_die(state->dbg,
 						(void *)ci->data.addr,
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 08/15] gendwarfksyms: Expand subroutine_type
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (6 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 07/15] gendwarfksyms: Add pretty-printing Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 09/15] gendwarfksyms: Expand array_type Sami Tolvanen
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Add support for expanding DW_TAG_subroutine_type and the parameters
in DW_TAG_formal_parameter. Use also to expand subprograms.

Example output with --debug:

  subprogram(
    formal_parameter base_type usize byte_size(8),
    formal_parameter base_type usize byte_size(8),
  )
  -> base_type void;

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 tools/gendwarfksyms/gendwarfksyms.h |  1 +
 tools/gendwarfksyms/types.c         | 58 ++++++++++++++++++++++++++++-
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/tools/gendwarfksyms/gendwarfksyms.h b/tools/gendwarfksyms/gendwarfksyms.h
index 43eff91e2f2f..03d8a4a039c3 100644
--- a/tools/gendwarfksyms/gendwarfksyms.h
+++ b/tools/gendwarfksyms/gendwarfksyms.h
@@ -52,6 +52,7 @@ extern bool no_pretty_print;
 	})
 
 /* Consistent aliases (DW_TAG_<type>_type) for DWARF tags */
+#define DW_TAG_formal_parameter_type DW_TAG_formal_parameter
 #define DW_TAG_typedef_type DW_TAG_typedef
 
 /*
diff --git a/tools/gendwarfksyms/types.c b/tools/gendwarfksyms/types.c
index 74b3755c3e16..a56aeaa4f3a1 100644
--- a/tools/gendwarfksyms/types.c
+++ b/tools/gendwarfksyms/types.c
@@ -169,6 +169,15 @@ static int process_fqn(struct state *state, struct cached_die *cache,
 DEFINE_PROCESS_UDATA_ATTRIBUTE(alignment)
 DEFINE_PROCESS_UDATA_ATTRIBUTE(byte_size)
 
+/* Match functions -- die_match_callback_t */
+#define DEFINE_MATCH(type)                                     \
+	static bool match_##type##_type(Dwarf_Die *die)        \
+	{                                                      \
+		return dwarf_tag(die) == DW_TAG_##type##_type; \
+	}
+
+DEFINE_MATCH(formal_parameter)
+
 bool match_all(Dwarf_Die *die)
 {
 	return true;
@@ -206,6 +215,25 @@ static int process_type_attr(struct state *state, struct cached_die *cache,
 	return check(process(state, cache, "base_type void"));
 }
 
+/* Comma-separated with DW_AT_type */
+static int __process_list_type(struct state *state, struct cached_die *cache,
+			       Dwarf_Die *die, const char *type)
+{
+	check(process(state, cache, type));
+	check(process_type_attr(state, cache, die));
+	check(process(state, cache, ","));
+	return check(process_linebreak(cache, 0));
+}
+
+#define DEFINE_PROCESS_LIST_TYPE(type)                                         \
+	static int process_##type##_type(                                      \
+		struct state *state, struct cached_die *cache, Dwarf_Die *die) \
+	{                                                                      \
+		return __process_list_type(state, cache, die, #type " ");      \
+	}
+
+DEFINE_PROCESS_LIST_TYPE(formal_parameter)
+
 /* Container types with DW_AT_type */
 static int __process_type(struct state *state, struct cached_die *cache,
 			  Dwarf_Die *die, const char *type)
@@ -240,6 +268,30 @@ DEFINE_PROCESS_TYPE(shared)
 DEFINE_PROCESS_TYPE(volatile)
 DEFINE_PROCESS_TYPE(typedef)
 
+static int __process_subroutine_type(struct state *state,
+				     struct cached_die *cache, Dwarf_Die *die,
+				     const char *type)
+{
+	check(process(state, cache, type));
+	check(process(state, cache, "("));
+	check(process_linebreak(cache, 1));
+	/* Parameters */
+	check(process_die_container(state, cache, die, process_type,
+				    match_formal_parameter_type));
+	check(process_linebreak(cache, -1));
+	check(process(state, cache, ")"));
+	process_linebreak(cache, 0);
+	/* Return type */
+	check(process(state, cache, "-> "));
+	return check(process_type_attr(state, cache, die));
+}
+
+static int process_subroutine_type(struct state *state,
+				   struct cached_die *cache, Dwarf_Die *die)
+{
+	return check(__process_subroutine_type(state, cache, die,
+					       "subroutine_type"));
+}
 static int process_base_type(struct state *state, struct cached_die *cache,
 			     Dwarf_Die *die)
 {
@@ -324,8 +376,11 @@ static int process_type(struct state *state, struct cached_die *parent,
 	PROCESS_TYPE(rvalue_reference)
 	PROCESS_TYPE(shared)
 	PROCESS_TYPE(volatile)
+	/* Subtypes */
+	PROCESS_TYPE(formal_parameter)
 	/* Other types */
 	PROCESS_TYPE(base)
+	PROCESS_TYPE(subroutine)
 	PROCESS_TYPE(typedef)
 	default:
 		debug("unimplemented type: %x", tag);
@@ -346,7 +401,8 @@ static int process_type(struct state *state, struct cached_die *parent,
  */
 static int process_subprogram(struct state *state, Dwarf_Die *die)
 {
-	return check(process(state, NULL, "subprogram;\n"));
+	check(__process_subroutine_type(state, NULL, die, "subprogram"));
+	return check(process(state, NULL, ";\n"));
 }
 
 static int process_variable(struct state *state, Dwarf_Die *die)
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 09/15] gendwarfksyms: Expand array_type
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (7 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 08/15] gendwarfksyms: Expand subroutine_type Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 10/15] gendwarfksyms: Expand structure types Sami Tolvanen
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Add support for expanding DW_TAG_array_type, and the subrange type
indicating array size.

Example output with --debug:

  variable array_type [34] {
    pointer_type <unnamed> {
      const_type <unnamed> {
        base_type char byte_size(1)
      }
    }
  };

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 tools/gendwarfksyms/types.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/tools/gendwarfksyms/types.c b/tools/gendwarfksyms/types.c
index a56aeaa4f3a1..b1b82d166eb8 100644
--- a/tools/gendwarfksyms/types.c
+++ b/tools/gendwarfksyms/types.c
@@ -177,6 +177,7 @@ DEFINE_PROCESS_UDATA_ATTRIBUTE(byte_size)
 	}
 
 DEFINE_MATCH(formal_parameter)
+DEFINE_MATCH(subrange)
 
 bool match_all(Dwarf_Die *die)
 {
@@ -268,6 +269,31 @@ DEFINE_PROCESS_TYPE(shared)
 DEFINE_PROCESS_TYPE(volatile)
 DEFINE_PROCESS_TYPE(typedef)
 
+static int process_subrange_type(struct state *state, struct cached_die *cache,
+				 Dwarf_Die *die)
+{
+	Dwarf_Word count = 0;
+
+	if (get_udata_attr(die, DW_AT_count, &count))
+		return check(process_fmt(state, cache, "[%" PRIu64 "]", count));
+
+	return check(process(state, cache, "[]"));
+}
+
+static int process_array_type(struct state *state, struct cached_die *cache,
+			      Dwarf_Die *die)
+{
+	check(process(state, cache, "array_type "));
+	/* Array size */
+	check(process_die_container(state, cache, die, process_type,
+				    match_subrange_type));
+	check(process(state, cache, " {"));
+	check(process_linebreak(cache, 1));
+	check(process_type_attr(state, cache, die));
+	check(process_linebreak(cache, -1));
+	return check(process(state, cache, "}"));
+}
+
 static int __process_subroutine_type(struct state *state,
 				     struct cached_die *cache, Dwarf_Die *die,
 				     const char *type)
@@ -378,7 +404,9 @@ static int process_type(struct state *state, struct cached_die *parent,
 	PROCESS_TYPE(volatile)
 	/* Subtypes */
 	PROCESS_TYPE(formal_parameter)
+	PROCESS_TYPE(subrange)
 	/* Other types */
+	PROCESS_TYPE(array)
 	PROCESS_TYPE(base)
 	PROCESS_TYPE(subroutine)
 	PROCESS_TYPE(typedef)
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 10/15] gendwarfksyms: Expand structure types
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (8 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 09/15] gendwarfksyms: Expand array_type Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 11/15] gendwarfksyms: Limit structure expansion Sami Tolvanen
                   ` (8 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Recursively expand DWARF structure types, e.g. structs, unions, and
enums. Type strings also encode structure member layout, which allows
us to determine ABI breakages if the compiler decides to reorder
members or otherwise change the layout.

Example output with --debug:

  subprogram(
    formal_parameter pointer_type *mut &str {
      structure_type &str {
        member pointer_type <unnamed> {
          base_type u8 byte_size(1)
        } data_member_location(0),
        member base_type usize byte_size(8) data_member_location(8),
      } byte_size(16) alignment(8)
    },
  )
  -> base_type void;

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 tools/gendwarfksyms/gendwarfksyms.h |   5 ++
 tools/gendwarfksyms/types.c         | 127 +++++++++++++++++++++++++++-
 2 files changed, 130 insertions(+), 2 deletions(-)

diff --git a/tools/gendwarfksyms/gendwarfksyms.h b/tools/gendwarfksyms/gendwarfksyms.h
index 03d8a4a039c3..4646eaf5c85e 100644
--- a/tools/gendwarfksyms/gendwarfksyms.h
+++ b/tools/gendwarfksyms/gendwarfksyms.h
@@ -52,8 +52,13 @@ extern bool no_pretty_print;
 	})
 
 /* Consistent aliases (DW_TAG_<type>_type) for DWARF tags */
+#define DW_TAG_enumerator_type DW_TAG_enumerator
 #define DW_TAG_formal_parameter_type DW_TAG_formal_parameter
+#define DW_TAG_member_type DW_TAG_member
+#define DW_TAG_template_type_parameter_type DW_TAG_template_type_parameter
 #define DW_TAG_typedef_type DW_TAG_typedef
+#define DW_TAG_variant_part_type DW_TAG_variant_part
+#define DW_TAG_variant_type DW_TAG_variant
 
 /*
  * symbols.c
diff --git a/tools/gendwarfksyms/types.c b/tools/gendwarfksyms/types.c
index b1b82d166eb8..fa74e6fc26e3 100644
--- a/tools/gendwarfksyms/types.c
+++ b/tools/gendwarfksyms/types.c
@@ -166,8 +166,10 @@ static int process_fqn(struct state *state, struct cached_die *cache,
 		return 0;                                                      \
 	}
 
+DEFINE_PROCESS_UDATA_ATTRIBUTE(accessibility)
 DEFINE_PROCESS_UDATA_ATTRIBUTE(alignment)
 DEFINE_PROCESS_UDATA_ATTRIBUTE(byte_size)
+DEFINE_PROCESS_UDATA_ATTRIBUTE(data_member_location)
 
 /* Match functions -- die_match_callback_t */
 #define DEFINE_MATCH(type)                                     \
@@ -176,8 +178,11 @@ DEFINE_PROCESS_UDATA_ATTRIBUTE(byte_size)
 		return dwarf_tag(die) == DW_TAG_##type##_type; \
 	}
 
+DEFINE_MATCH(enumerator)
 DEFINE_MATCH(formal_parameter)
+DEFINE_MATCH(member)
 DEFINE_MATCH(subrange)
+DEFINE_MATCH(variant)
 
 bool match_all(Dwarf_Die *die)
 {
@@ -222,6 +227,8 @@ static int __process_list_type(struct state *state, struct cached_die *cache,
 {
 	check(process(state, cache, type));
 	check(process_type_attr(state, cache, die));
+	check(process_accessibility_attr(state, cache, die));
+	check(process_data_member_location_attr(state, cache, die));
 	check(process(state, cache, ","));
 	return check(process_linebreak(cache, 0));
 }
@@ -234,6 +241,7 @@ static int __process_list_type(struct state *state, struct cached_die *cache,
 	}
 
 DEFINE_PROCESS_LIST_TYPE(formal_parameter)
+DEFINE_PROCESS_LIST_TYPE(member)
 
 /* Container types with DW_AT_type */
 static int __process_type(struct state *state, struct cached_die *cache,
@@ -266,6 +274,7 @@ DEFINE_PROCESS_TYPE(reference)
 DEFINE_PROCESS_TYPE(restrict)
 DEFINE_PROCESS_TYPE(rvalue_reference)
 DEFINE_PROCESS_TYPE(shared)
+DEFINE_PROCESS_TYPE(template_type_parameter)
 DEFINE_PROCESS_TYPE(volatile)
 DEFINE_PROCESS_TYPE(typedef)
 
@@ -318,6 +327,110 @@ static int process_subroutine_type(struct state *state,
 	return check(__process_subroutine_type(state, cache, die,
 					       "subroutine_type"));
 }
+
+static int process_variant_type(struct state *state, struct cached_die *cache,
+				Dwarf_Die *die)
+{
+	return check(process_die_container(state, cache, die, process_type,
+					   match_member_type));
+}
+
+static int process_variant_part_type(struct state *state,
+				     struct cached_die *cache, Dwarf_Die *die)
+{
+	check(process(state, cache, "variant_part {"));
+	check(process_linebreak(cache, 1));
+	check(process_die_container(state, cache, die, process_type,
+				    match_variant_type));
+	check(process_linebreak(cache, -1));
+	check(process(state, cache, "},"));
+	return check(process_linebreak(cache, 0));
+}
+
+static int ___process_structure_type(struct state *state,
+				     struct cached_die *cache, Dwarf_Die *die)
+{
+	switch (dwarf_tag(die)) {
+	case DW_TAG_member:
+	case DW_TAG_variant_part:
+		return check(process_type(state, cache, die));
+	case DW_TAG_class_type:
+	case DW_TAG_enumeration_type:
+	case DW_TAG_structure_type:
+	case DW_TAG_template_type_parameter:
+	case DW_TAG_union_type:
+		check(process_type(state, cache, die));
+		check(process(state, cache, ","));
+		return check(process_linebreak(cache, 0));
+	case DW_TAG_subprogram:
+		return 0; /* Skip member functions */
+	default:
+		error("unexpected structure_type child: %x", dwarf_tag(die));
+		return -1;
+	}
+}
+
+static int __process_structure_type(struct state *state,
+				    struct cached_die *cache, Dwarf_Die *die,
+				    const char *type,
+				    die_callback_t process_func,
+				    die_match_callback_t match_func)
+{
+	check(process(state, cache, type));
+	check(process_fqn(state, cache, die));
+	check(process(state, cache, " {"));
+	check(process_linebreak(cache, 1));
+
+	check(process_die_container(state, cache, die, process_func,
+				    match_func));
+
+	check(process_linebreak(cache, -1));
+	check(process(state, cache, "}"));
+
+	check(process_byte_size_attr(state, cache, die));
+	check(process_alignment_attr(state, cache, die));
+
+	return 0;
+}
+
+#define DEFINE_PROCESS_STRUCTURE_TYPE(structure)                               \
+	static int process_##structure##_type(                                 \
+		struct state *state, struct cached_die *cache, Dwarf_Die *die) \
+	{                                                                      \
+		return check(__process_structure_type(                         \
+			state, cache, die, #structure "_type ",                \
+			___process_structure_type, match_all));                \
+	}
+
+DEFINE_PROCESS_STRUCTURE_TYPE(class)
+DEFINE_PROCESS_STRUCTURE_TYPE(structure)
+DEFINE_PROCESS_STRUCTURE_TYPE(union)
+
+static int process_enumerator_type(struct state *state,
+				   struct cached_die *cache, Dwarf_Die *die)
+{
+	Dwarf_Word value;
+
+	check(process(state, cache, "enumerator "));
+	check(process_fqn(state, cache, die));
+
+	if (get_udata_attr(die, DW_AT_const_value, &value)) {
+		check(process(state, cache, " = "));
+		check(process_fmt(state, cache, "%" PRIu64, value));
+	}
+
+	check(process(state, cache, ","));
+	return check(process_linebreak(cache, 0));
+}
+
+static int process_enumeration_type(struct state *state,
+				    struct cached_die *cache, Dwarf_Die *die)
+{
+	return check(__process_structure_type(state, cache, die,
+					      "enumeration_type ", process_type,
+					      match_enumerator_type));
+}
+
 static int process_base_type(struct state *state, struct cached_die *cache,
 			     Dwarf_Die *die)
 {
@@ -402,17 +515,27 @@ static int process_type(struct state *state, struct cached_die *parent,
 	PROCESS_TYPE(rvalue_reference)
 	PROCESS_TYPE(shared)
 	PROCESS_TYPE(volatile)
+	/* Container types */
+	PROCESS_TYPE(class)
+	PROCESS_TYPE(structure)
+	PROCESS_TYPE(union)
+	PROCESS_TYPE(enumeration)
 	/* Subtypes */
+	PROCESS_TYPE(enumerator)
 	PROCESS_TYPE(formal_parameter)
+	PROCESS_TYPE(member)
 	PROCESS_TYPE(subrange)
+	PROCESS_TYPE(template_type_parameter)
+	PROCESS_TYPE(variant)
+	PROCESS_TYPE(variant_part)
 	/* Other types */
 	PROCESS_TYPE(array)
 	PROCESS_TYPE(base)
 	PROCESS_TYPE(subroutine)
 	PROCESS_TYPE(typedef)
 	default:
-		debug("unimplemented type: %x", tag);
-		break;
+		error("unexpected type: %x", tag);
+		return -1;
 	}
 
 	if (!no_cache) {
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 11/15] gendwarfksyms: Limit structure expansion
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (9 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 10/15] gendwarfksyms: Expand structure types Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 12/15] gendwarfksyms: Add inline debugging Sami Tolvanen
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Expand each structure type only once per exported symbol. This
is necessary to support self-referential structures, which would
otherwise result in infinite recursion. Expanding each structure type
just once is enough to catch ABI changes.

For pointers to structure types, limit expansion to three levels
inside the pointer. This should be plenty for catching ABI differences
and stops us from pulling in half the kernel for structs that contain
pointers to large structs like task_struct.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 tools/gendwarfksyms/cache.c         | 49 ++++++++++++++++
 tools/gendwarfksyms/gendwarfksyms.h |  9 ++-
 tools/gendwarfksyms/types.c         | 87 ++++++++++++++++++++++++++---
 3 files changed, 136 insertions(+), 9 deletions(-)

diff --git a/tools/gendwarfksyms/cache.c b/tools/gendwarfksyms/cache.c
index 85a2adeb649d..1d6eb27d799d 100644
--- a/tools/gendwarfksyms/cache.c
+++ b/tools/gendwarfksyms/cache.c
@@ -158,3 +158,52 @@ int cache_add_die(struct cached_die *cd, Dwarf_Die *die)
 	ci->type = DIE;
 	return 0;
 }
+
+/* A list of structure types that were already expanded for the current symbol */
+struct expanded {
+	uintptr_t addr;
+	struct hlist_node hash;
+};
+
+/* die->addr -> struct expanded */
+DEFINE_HASHTABLE(expansion_cache, DIE_HASH_BITS);
+
+int cache_mark_expanded(Dwarf_Die *die)
+{
+	struct expanded *es;
+
+	es = malloc(sizeof(struct expanded));
+	if (!es) {
+		error("malloc failed");
+		return -1;
+	}
+
+	es->addr = (uintptr_t)die->addr;
+	hash_add(expansion_cache, &es->hash, es->addr);
+	return 0;
+}
+
+bool cache_was_expanded(Dwarf_Die *die)
+{
+	struct expanded *es;
+	uintptr_t addr = (uintptr_t)die->addr;
+
+	hash_for_each_possible(expansion_cache, es, hash, addr) {
+		if (es->addr == addr)
+			return true;
+	}
+
+	return false;
+}
+
+void cache_clear_expanded(void)
+{
+	struct hlist_node *tmp;
+	struct expanded *es;
+	int i;
+
+	hash_for_each_safe(expansion_cache, i, tmp, es, hash) {
+		free(es);
+	}
+	hash_init(expansion_cache);
+}
diff --git a/tools/gendwarfksyms/gendwarfksyms.h b/tools/gendwarfksyms/gendwarfksyms.h
index 4646eaf5c85e..568f3727017e 100644
--- a/tools/gendwarfksyms/gendwarfksyms.h
+++ b/tools/gendwarfksyms/gendwarfksyms.h
@@ -95,7 +95,7 @@ struct cached_item {
 	struct cached_item *next;
 };
 
-enum cached_die_state { INCOMPLETE, COMPLETE };
+enum cached_die_state { INCOMPLETE, UNEXPANDED, COMPLETE };
 
 struct cached_die {
 	enum cached_die_state state;
@@ -111,6 +111,10 @@ extern int cache_add_linebreak(struct cached_die *pd, int linebreak);
 extern int cache_add_die(struct cached_die *pd, Dwarf_Die *die);
 extern void cache_free(void);
 
+extern int cache_mark_expanded(Dwarf_Die *die);
+extern bool cache_was_expanded(Dwarf_Die *die);
+extern void cache_clear_expanded(void);
+
 /*
  * types.c
  */
@@ -120,6 +124,9 @@ struct state {
 	Dwarf *dbg;
 	struct symbol *sym;
 	Dwarf_Die origin;
+	unsigned int ptr_expansion_depth;
+	bool in_pointer_type;
+	bool expand;
 	unsigned long crc;
 };
 
diff --git a/tools/gendwarfksyms/types.c b/tools/gendwarfksyms/types.c
index fa74e6fc26e3..da3aa2ad9f57 100644
--- a/tools/gendwarfksyms/types.c
+++ b/tools/gendwarfksyms/types.c
@@ -381,14 +381,21 @@ static int __process_structure_type(struct state *state,
 	check(process(state, cache, " {"));
 	check(process_linebreak(cache, 1));
 
-	check(process_die_container(state, cache, die, process_func,
-				    match_func));
+	if (state->expand) {
+		check(cache_mark_expanded(die));
+		check(process_die_container(state, cache, die, process_func,
+					    match_func));
+	} else {
+		check(process(state, cache, "<unexpanded>"));
+	}
 
 	check(process_linebreak(cache, -1));
 	check(process(state, cache, "}"));
 
-	check(process_byte_size_attr(state, cache, die));
-	check(process_alignment_attr(state, cache, die));
+	if (state->expand) {
+		check(process_byte_size_attr(state, cache, die));
+		check(process_alignment_attr(state, cache, die));
+	}
 
 	return 0;
 }
@@ -475,9 +482,38 @@ static int process_cached(struct state *state, struct cached_die *cache,
 
 static void state_init(struct state *state)
 {
+	state->ptr_expansion_depth = 0;
+	state->in_pointer_type = false;
+	state->expand = true;
 	state->crc = 0xffffffff;
 }
 
+static void state_restore(struct state *state, struct state *saved)
+{
+	state->ptr_expansion_depth = saved->ptr_expansion_depth;
+	state->in_pointer_type = saved->in_pointer_type;
+	state->expand = saved->expand;
+}
+
+static void state_save(struct state *state, struct state *saved)
+{
+	state_restore(saved, state);
+}
+
+static bool is_pointer_type(int tag)
+{
+	return tag == DW_TAG_pointer_type || tag == DW_TAG_reference_type;
+}
+
+static bool is_expanded_type(int tag)
+{
+	return tag == DW_TAG_class_type || tag == DW_TAG_structure_type ||
+	       tag == DW_TAG_union_type || tag == DW_TAG_enumeration_type;
+}
+
+/* The maximum depth for expanding structures in pointers */
+#define MAX_POINTER_EXPANSION_DEPTH 3
+
 #define PROCESS_TYPE(type)                                       \
 	case DW_TAG_##type##_type:                               \
 		check(process_##type##_type(state, cache, die)); \
@@ -486,19 +522,52 @@ static void state_init(struct state *state)
 static int process_type(struct state *state, struct cached_die *parent,
 			Dwarf_Die *die)
 {
+	enum cached_die_state want_state = COMPLETE;
 	struct cached_die *cache = NULL;
+	struct state saved;
 	int tag = dwarf_tag(die);
 
+	state_save(state, &saved);
+
 	/*
-	 * If we have the DIE already cached, use it instead of walking
+	 * Structures and enumeration types are expanded only once per
+	 * exported symbol. This is sufficient for detecting ABI changes
+	 * within the structure.
+	 *
+	 * If the exported symbol contains a pointer to a structure,
+	 * at most MAX_POINTER_EXPANSION_DEPTH levels are expanded into
+	 * the referenced structure.
+	 */
+	state->in_pointer_type = saved.in_pointer_type || is_pointer_type(tag);
+
+	if (state->in_pointer_type &&
+	    state->ptr_expansion_depth >= MAX_POINTER_EXPANSION_DEPTH)
+		state->expand = false;
+	else
+		state->expand = saved.expand && !cache_was_expanded(die);
+
+	/* Keep track of pointer expansion depth */
+	if (state->expand && state->in_pointer_type && is_expanded_type(tag))
+		state->ptr_expansion_depth++;
+
+	/*
+	 * If we have want_state already cached, use it instead of walking
 	 * through DWARF.
 	 */
 	if (!no_cache) {
-		check(cache_get(die, COMPLETE, &cache));
+		if (!state->expand && is_expanded_type(tag))
+			want_state = UNEXPANDED;
+
+		check(cache_get(die, want_state, &cache));
+
+		if (cache->state == want_state) {
+			if (want_state == COMPLETE && is_expanded_type(tag))
+				check(cache_mark_expanded(die));
 
-		if (cache->state == COMPLETE) {
 			check(process_cached(state, cache, die));
 			check(cache_add_die(parent, die));
+
+			state_restore(state, &saved);
 			return 0;
 		}
 	}
@@ -540,10 +609,11 @@ static int process_type(struct state *state, struct cached_die *parent,
 
 	if (!no_cache) {
 		/* Update cache state and append to the parent (if any) */
-		cache->state = COMPLETE;
+		cache->state = want_state;
 		check(cache_add_die(parent, die));
 	}
 
+	state_restore(state, &saved);
 	return 0;
 }
 
@@ -596,6 +666,7 @@ static int process_exported_symbols(struct state *state,
 		else
 			check(process_variable(state, die));
 
+		cache_clear_expanded();
 		return check(
 			symbol_set_crc(state->sym, state->crc ^ 0xffffffff));
 	default:
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 12/15] gendwarfksyms: Add inline debugging
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (10 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 11/15] gendwarfksyms: Limit structure expansion Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 13/15] modpost: Add support for hashing long symbol names Sami Tolvanen
                   ` (6 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Understanding the operation and caching of the tool can be somewhat
challenging without a debugger. Add inline debugging information with
the --inline-debug command, which adds highlighted tags to the --debug
output with information about cache states etc.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 tools/gendwarfksyms/gendwarfksyms.c |  3 +++
 tools/gendwarfksyms/gendwarfksyms.h | 29 +++++++++++++++++++++++++++++
 tools/gendwarfksyms/types.c         | 14 ++++++++++++++
 3 files changed, 46 insertions(+)

diff --git a/tools/gendwarfksyms/gendwarfksyms.c b/tools/gendwarfksyms/gendwarfksyms.c
index 7095f0ecccab..acf4b44238e2 100644
--- a/tools/gendwarfksyms/gendwarfksyms.c
+++ b/tools/gendwarfksyms/gendwarfksyms.c
@@ -15,6 +15,8 @@
 
 /* Print type descriptions and debugging output to stderr */
 bool debug;
+/* Print inline debugging information to stderr (with --debug) */
+bool inline_debug;
 /* Don't use caching */
 bool no_cache;
 /* Don't pretty-print (with --debug) */
@@ -25,6 +27,7 @@ static const struct {
 	bool *flag;
 } options[] = {
 	{ "--debug", &debug },
+	{ "--inline-debug", &inline_debug },
 	{ "--no-cache", &no_cache },
 	{ "--no-pretty-print", &no_pretty_print },
 };
diff --git a/tools/gendwarfksyms/gendwarfksyms.h b/tools/gendwarfksyms/gendwarfksyms.h
index 568f3727017e..34d1be3cfb7f 100644
--- a/tools/gendwarfksyms/gendwarfksyms.h
+++ b/tools/gendwarfksyms/gendwarfksyms.h
@@ -18,6 +18,7 @@
  * Options -- in gendwarfksyms.c
  */
 extern bool debug;
+extern bool inline_debug;
 extern bool no_cache;
 extern bool no_pretty_print;
 
@@ -38,6 +39,18 @@ extern bool no_pretty_print;
 #define warn(format, ...) __println("warning: ", format, ##__VA_ARGS__)
 #define error(format, ...) __println("error: ", format, ##__VA_ARGS__)
 
+#define __inline_debug(color, format, ...)                              \
+	do {                                                            \
+		if (debug && inline_debug)                              \
+			fprintf(stderr,                                 \
+				"\033[" #color "m<" format ">\033[39m", \
+				__VA_ARGS__);                           \
+	} while (0)
+
+#define inline_debug_r(format, ...) __inline_debug(91, format, __VA_ARGS__)
+#define inline_debug_g(format, ...) __inline_debug(92, format, __VA_ARGS__)
+#define inline_debug_b(format, ...) __inline_debug(94, format, __VA_ARGS__)
+
 /*
  * Error handling helpers
  */
@@ -51,6 +64,9 @@ extern bool no_pretty_print;
 		__res;                                          \
 	})
 
+/*
+ * DWARF helpers
+ */
 /* Consistent aliases (DW_TAG_<type>_type) for DWARF tags */
 #define DW_TAG_enumerator_type DW_TAG_enumerator
 #define DW_TAG_formal_parameter_type DW_TAG_formal_parameter
@@ -97,6 +113,19 @@ struct cached_item {
 
 enum cached_die_state { INCOMPLETE, UNEXPANDED, COMPLETE };
 
+static inline const char *cache_state_name(enum cached_die_state state)
+{
+	switch (state) {
+	default:
+	case INCOMPLETE:
+		return "INCOMPLETE";
+	case UNEXPANDED:
+		return "UNEXPANDED";
+	case COMPLETE:
+		return "COMPLETE";
+	}
+}
+
 struct cached_die {
 	enum cached_die_state state;
 	uintptr_t addr;
diff --git a/tools/gendwarfksyms/types.c b/tools/gendwarfksyms/types.c
index da3aa2ad9f57..694b33cdd606 100644
--- a/tools/gendwarfksyms/types.c
+++ b/tools/gendwarfksyms/types.c
@@ -97,6 +97,7 @@ static int process(struct state *state, struct cached_die *cache, const char *s)
 		fputs(s, stderr);
 
 	state->crc = partial_crc32(s, state->crc);
+	inline_debug_r("cache %p string '%s'", cache, s);
 	return cache_add_string(cache, s);
 }
 
@@ -456,6 +457,8 @@ static int process_cached(struct state *state, struct cached_die *cache,
 	while (ci) {
 		switch (ci->type) {
 		case STRING:
+			inline_debug_b("cache %p STRING '%s'", cache,
+				       ci->data.str);
 			check(process(state, NULL, ci->data.str));
 			break;
 		case LINEBREAK:
@@ -468,6 +471,8 @@ static int process_cached(struct state *state, struct cached_die *cache,
 				error("dwarf_die_addr_die failed");
 				return -1;
 			}
+			inline_debug_b("cache %p DIE addr %lx tag %d", cache,
+				       ci->data.addr, dwarf_tag(&child));
 			check(process_type(state, NULL, &child));
 			break;
 		default:
@@ -561,6 +566,9 @@ static int process_type(struct state *state, struct cached_die *parent,
 		check(cache_get(die, want_state, &cache));
 
 		if (cache->state == want_state) {
+			inline_debug_g("cached addr %p tag %d -- %s", die->addr,
+				       tag, cache_state_name(cache->state));
+
 			if (want_state == COMPLETE && is_expanded_type(tag))
 				check(cache_mark_expanded(die));
 
@@ -572,6 +580,9 @@ static int process_type(struct state *state, struct cached_die *parent,
 		}
 	}
 
+	inline_debug_g("addr %p tag %d -- INCOMPLETE -> %s", die->addr, tag,
+		       cache_state_name(want_state));
+
 	switch (tag) {
 	/* Type modifiers */
 	PROCESS_TYPE(atomic)
@@ -608,6 +619,9 @@ static int process_type(struct state *state, struct cached_die *parent,
 	}
 
 	if (!no_cache) {
+		inline_debug_r("parent %p cache %p die addr %p tag %d", parent,
+			       cache, die->addr, tag);
+
 		/* Update cache state and append to the parent (if any) */
 		cache->state = want_state;
 		check(cache_add_die(parent, die));
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 13/15] modpost: Add support for hashing long symbol names
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (11 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 12/15] gendwarfksyms: Add inline debugging Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-18 16:47   ` Masahiro Yamada
  2024-06-17 17:58 ` [PATCH 14/15] module: Support hashed symbol names when checking modversions Sami Tolvanen
                   ` (5 subsequent siblings)
  18 siblings, 1 reply; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Rust frequently has symbol names longer than MODULE_NAME_LEN, because
the full namespace path is encoded into the mangled name. Instead of
modpost failing when running into a long name with CONFIG_MODVERSIONS,
store a hash of the name in struct modversion_info.

To avoid breaking userspace tools that parse the __versions array,
include a null-terminated hash function name at the beginning of the
name string, followed by a binary hash. In order to keep .mod.c files
more human-readable, add a comment with the full symbol name before the
array entry. Example output in rust_minimal.mod.c:

  static const struct modversion_info ____versions[]
  __used __section("__versions") = {
        /* _RNvNtNtCs1cdwasc6FUb_6kernel5print14format_strings4INFO */
        { 0x9ec5442f, "sha256\x00\x56\x96\xf4\x27\xdb\x4a\xbf[...]" },
        { 0x1d6595b1, "_RNvNtCs1cdwasc6FUb_6kernel5print11call_printk" },
        { 0x3c642974, "__rust_dealloc" },
        ...
  };

modprobe output for the resulting module:

  $ modprobe --dump-modversions rust_minimal.ko
  0x9ec5442f	sha256
  0x1d6595b1	_RNvNtCs1cdwasc6FUb_6kernel5print11call_printk
  0x3c642974	__rust_dealloc
  ...

While the output is less useful, the tool continues to work and can be
later updated to produce more helpful output for hashed symbols.

Note that this patch adds a generic SHA-256 implementation to modpost
adapted from the Crypto API, but other hash functions may be used in
future if needed.

Suggested-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 scripts/mod/Makefile  |   4 +-
 scripts/mod/modpost.c |  20 ++-
 scripts/mod/modpost.h |  20 +++
 scripts/mod/symhash.c | 327 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 364 insertions(+), 7 deletions(-)
 create mode 100644 scripts/mod/symhash.c

diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile
index c729bc936bae..ddd59ea9027e 100644
--- a/scripts/mod/Makefile
+++ b/scripts/mod/Makefile
@@ -4,7 +4,7 @@ CFLAGS_REMOVE_empty.o += $(CC_FLAGS_LTO)
 hostprogs-always-y	+= modpost mk_elfconfig
 always-y		+= empty.o
 
-modpost-objs	:= modpost.o file2alias.o sumversion.o symsearch.o
+modpost-objs	:= modpost.o file2alias.o symhash.o sumversion.o symsearch.o
 
 devicetable-offsets-file := devicetable-offsets.h
 
@@ -15,7 +15,7 @@ targets += $(devicetable-offsets-file) devicetable-offsets.s
 
 # dependencies on generated files need to be listed explicitly
 
-$(obj)/modpost.o $(obj)/file2alias.o $(obj)/sumversion.o $(obj)/symsearch.o: $(obj)/elfconfig.h
+$(obj)/modpost.o $(obj)/file2alias.o $(obj)/symhash.o $(obj)/sumversion.o $(obj)/symsearch.o: $(obj)/elfconfig.h
 $(obj)/file2alias.o: $(obj)/$(devicetable-offsets-file)
 
 quiet_cmd_elfconfig = MKELF   $@
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index f48d72d22dc2..2631e3e75a5c 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -1900,7 +1900,10 @@ static void add_exported_symbols(struct buffer *buf, struct module *mod)
  **/
 static void add_versions(struct buffer *b, struct module *mod)
 {
+	char hash[SYMHASH_STR_LEN];
 	struct symbol *s;
+	const char *name;
+	size_t len;
 
 	if (!modversions)
 		return;
@@ -1917,13 +1920,20 @@ static void add_versions(struct buffer *b, struct module *mod)
 				s->name, mod->name);
 			continue;
 		}
-		if (strlen(s->name) >= MODULE_NAME_LEN) {
-			error("too long symbol \"%s\" [%s.ko]\n",
-			      s->name, mod->name);
-			break;
+		len = strlen(s->name);
+		/*
+		 * For symbols with a long name, use the hash format, but include
+		 * the full symbol name as a comment to keep the generated files
+		 * human-readable.
+		 */
+		if (len >= MODULE_NAME_LEN) {
+			buf_printf(b, "\t/* %s */\n", s->name);
+			name = symhash_str(s->name, len, hash);
+		} else {
+			name = s->name;
 		}
 		buf_printf(b, "\t{ %#8x, \"%s\" },\n",
-			   s->crc, s->name);
+			   s->crc, name);
 	}
 
 	buf_printf(b, "};\n");
diff --git a/scripts/mod/modpost.h b/scripts/mod/modpost.h
index ee43c7950636..cd2eb718936b 100644
--- a/scripts/mod/modpost.h
+++ b/scripts/mod/modpost.h
@@ -183,6 +183,26 @@ void handle_moddevtable(struct module *mod, struct elf_info *info,
 			Elf_Sym *sym, const char *symname);
 void add_moddevtable(struct buffer *buf, struct module *mod);
 
+/* symhash.c */
+/*
+ * For symbol names longer than MODULE_NAME_LEN, encode a hash of the
+ * symbol name in version information as follows:
+ *
+ * <hash name>\0<binary hash of the symbol name>
+ *
+ * e.g. as a string in .mod.c files:
+ * "sha256\x00\xNN{32}"
+ *
+ * The string is null terminated after the hash name to avoid breaking
+ * userspace tools that parse the __versions table and don't understand
+ * the format.
+ */
+#define SYMHASH_STR_PREFIX	"sha256\\x00"
+#define SYMHASH_STR_PREFIX_LEN	(sizeof(SYMHASH_STR_PREFIX) - 1)
+#define SYMHASH_STR_LEN		(SYMHASH_STR_PREFIX_LEN + 4*32 + 1)
+
+char *symhash_str(const char *name, size_t len, char hash_str[SYMHASH_STR_LEN]);
+
 /* sumversion.c */
 void get_src_version(const char *modname, char sum[], unsigned sumlen);
 
diff --git a/scripts/mod/symhash.c b/scripts/mod/symhash.c
new file mode 100644
index 000000000000..d0c9cf5f1f6c
--- /dev/null
+++ b/scripts/mod/symhash.c
@@ -0,0 +1,327 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * symhash.c
+ *
+ * Symbol name hashing using a SHA-256 implementation adapted from the
+ * Cryptographic API.
+ */
+#include <byteswap.h>
+#include "modpost.h"
+
+#if HOST_ELFDATA == ELFDATA2MSB
+/* Big endian */
+#define be32_to_cpu(val) (val)
+#define cpu_to_be32(val) (val)
+#define cpu_to_be64(val) (val)
+#else
+/* Little endian */
+#define be32_to_cpu(val) bswap_32(val)
+#define cpu_to_be32(val) bswap_32(val)
+#define cpu_to_be64(val) bswap_64(val)
+#endif
+
+#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
+
+static inline void memzero_explicit(void *s, size_t count)
+{
+	memset(s, 0, count);
+	barrier_data(s);
+}
+
+static inline uint32_t ror32(uint32_t word, unsigned int shift)
+{
+	return (word >> (shift & 31)) | (word << ((-shift) & 31));
+}
+
+/*
+ * include/crypto/sha2.h - Common values for SHA-2 algorithms
+ */
+#define SHA256_DIGEST_SIZE      32
+#define SHA256_BLOCK_SIZE       64
+
+#define SHA256_H0	0x6a09e667UL
+#define SHA256_H1	0xbb67ae85UL
+#define SHA256_H2	0x3c6ef372UL
+#define SHA256_H3	0xa54ff53aUL
+#define SHA256_H4	0x510e527fUL
+#define SHA256_H5	0x9b05688cUL
+#define SHA256_H6	0x1f83d9abUL
+#define SHA256_H7	0x5be0cd19UL
+
+struct sha256_state {
+	uint32_t state[SHA256_DIGEST_SIZE / 4];
+	uint64_t count;
+	uint8_t buf[SHA256_BLOCK_SIZE];
+};
+
+static inline void sha256_init(struct sha256_state *sctx)
+{
+	sctx->state[0] = SHA256_H0;
+	sctx->state[1] = SHA256_H1;
+	sctx->state[2] = SHA256_H2;
+	sctx->state[3] = SHA256_H3;
+	sctx->state[4] = SHA256_H4;
+	sctx->state[5] = SHA256_H5;
+	sctx->state[6] = SHA256_H6;
+	sctx->state[7] = SHA256_H7;
+	sctx->count = 0;
+}
+
+/*
+ * include/crypto/sha256_base.h - core logic for SHA-256 implementations
+ *
+ * Copyright (C) 2015 Linaro Ltd <ard.biesheuvel@linaro.org>
+ */
+typedef void (sha256_block_fn)(struct sha256_state *sst, uint8_t const *src,
+			       int blocks);
+
+static inline int lib_sha256_base_do_update(struct sha256_state *sctx,
+					    const uint8_t *data,
+					    unsigned int len,
+					    sha256_block_fn *block_fn)
+{
+	unsigned int partial = sctx->count % SHA256_BLOCK_SIZE;
+
+	sctx->count += len;
+
+	if ((partial + len) >= SHA256_BLOCK_SIZE) {
+		int blocks;
+
+		if (partial) {
+			int p = SHA256_BLOCK_SIZE - partial;
+
+			memcpy(sctx->buf + partial, data, p);
+			data += p;
+			len -= p;
+
+			block_fn(sctx, sctx->buf, 1);
+		}
+
+		blocks = len / SHA256_BLOCK_SIZE;
+		len %= SHA256_BLOCK_SIZE;
+
+		if (blocks) {
+			block_fn(sctx, data, blocks);
+			data += blocks * SHA256_BLOCK_SIZE;
+		}
+		partial = 0;
+	}
+	if (len)
+		memcpy(sctx->buf + partial, data, len);
+
+	return 0;
+}
+
+static inline int lib_sha256_base_do_finalize(struct sha256_state *sctx,
+					      sha256_block_fn *block_fn)
+{
+	const int bit_offset = SHA256_BLOCK_SIZE - sizeof(uint64_t);
+	uint64_t *bits = (uint64_t *)(sctx->buf + bit_offset);
+	unsigned int partial = sctx->count % SHA256_BLOCK_SIZE;
+
+	sctx->buf[partial++] = 0x80;
+	if (partial > bit_offset) {
+		memset(sctx->buf + partial, 0x0, SHA256_BLOCK_SIZE - partial);
+		partial = 0;
+
+		block_fn(sctx, sctx->buf, 1);
+	}
+
+	memset(sctx->buf + partial, 0x0, bit_offset - partial);
+	*bits = cpu_to_be64(sctx->count << 3);
+	block_fn(sctx, sctx->buf, 1);
+
+	return 0;
+}
+
+static inline int lib_sha256_base_finish(struct sha256_state *sctx, uint8_t *out,
+					 unsigned int digest_size)
+{
+	uint32_t *digest = (uint32_t *)out;
+	int i;
+
+	for (i = 0; digest_size > 0; i++, digest_size -= sizeof(uint32_t))
+		*digest++ = cpu_to_be32(sctx->state[i]);
+
+	memzero_explicit(sctx, sizeof(*sctx));
+	return 0;
+}
+
+/*
+ * lib/crypto/sha256.c
+ *
+ * SHA-256, as specified in
+ * http://csrc.nist.gov/groups/STM/cavp/documents/shs/sha256-384-512.pdf
+ *
+ * SHA-256 code by Jean-Luc Cooke <jlcooke@certainkey.com>.
+ *
+ * Copyright (c) Jean-Luc Cooke <jlcooke@certainkey.com>
+ * Copyright (c) Andrew McDonald <andrew@mcdonald.org.uk>
+ * Copyright (c) 2002 James Morris <jmorris@intercode.com.au>
+ * Copyright (c) 2014 Red Hat Inc.
+ */
+static const uint32_t SHA256_K[] = {
+	0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5,
+	0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5,
+	0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3,
+	0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174,
+	0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc,
+	0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da,
+	0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7,
+	0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967,
+	0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13,
+	0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85,
+	0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3,
+	0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070,
+	0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5,
+	0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3,
+	0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208,
+	0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2,
+};
+
+static inline uint32_t Ch(uint32_t x, uint32_t y, uint32_t z)
+{
+	return z ^ (x & (y ^ z));
+}
+
+static inline uint32_t Maj(uint32_t x, uint32_t y, uint32_t z)
+{
+	return (x & y) | (z & (x | y));
+}
+
+#define e0(x)       (ror32(x, 2) ^ ror32(x, 13) ^ ror32(x, 22))
+#define e1(x)       (ror32(x, 6) ^ ror32(x, 11) ^ ror32(x, 25))
+#define s0(x)       (ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3))
+#define s1(x)       (ror32(x, 17) ^ ror32(x, 19) ^ (x >> 10))
+
+static inline void LOAD_OP(int I, uint32_t *W, const uint8_t *input)
+{
+	W[I] = be32_to_cpu(*((__uint32_t *)input + I));
+}
+
+static inline void BLEND_OP(int I, uint32_t *W)
+{
+	W[I] = s1(W[I-2]) + W[I-7] + s0(W[I-15]) + W[I-16];
+}
+
+#define SHA256_ROUND(i, a, b, c, d, e, f, g, h) do {		\
+	uint32_t t1, t2;						\
+	t1 = h + e1(e) + Ch(e, f, g) + SHA256_K[i] + W[i];	\
+	t2 = e0(a) + Maj(a, b, c);				\
+	d += t1;						\
+	h = t1 + t2;						\
+} while (0)
+
+static void sha256_transform(uint32_t *state, const uint8_t *input, uint32_t *W)
+{
+	uint32_t a, b, c, d, e, f, g, h;
+	int i;
+
+	/* load the input */
+	for (i = 0; i < 16; i += 8) {
+		LOAD_OP(i + 0, W, input);
+		LOAD_OP(i + 1, W, input);
+		LOAD_OP(i + 2, W, input);
+		LOAD_OP(i + 3, W, input);
+		LOAD_OP(i + 4, W, input);
+		LOAD_OP(i + 5, W, input);
+		LOAD_OP(i + 6, W, input);
+		LOAD_OP(i + 7, W, input);
+	}
+
+	/* now blend */
+	for (i = 16; i < 64; i += 8) {
+		BLEND_OP(i + 0, W);
+		BLEND_OP(i + 1, W);
+		BLEND_OP(i + 2, W);
+		BLEND_OP(i + 3, W);
+		BLEND_OP(i + 4, W);
+		BLEND_OP(i + 5, W);
+		BLEND_OP(i + 6, W);
+		BLEND_OP(i + 7, W);
+	}
+
+	/* load the state into our registers */
+	a = state[0];  b = state[1];  c = state[2];  d = state[3];
+	e = state[4];  f = state[5];  g = state[6];  h = state[7];
+
+	/* now iterate */
+	for (i = 0; i < 64; i += 8) {
+		SHA256_ROUND(i + 0, a, b, c, d, e, f, g, h);
+		SHA256_ROUND(i + 1, h, a, b, c, d, e, f, g);
+		SHA256_ROUND(i + 2, g, h, a, b, c, d, e, f);
+		SHA256_ROUND(i + 3, f, g, h, a, b, c, d, e);
+		SHA256_ROUND(i + 4, e, f, g, h, a, b, c, d);
+		SHA256_ROUND(i + 5, d, e, f, g, h, a, b, c);
+		SHA256_ROUND(i + 6, c, d, e, f, g, h, a, b);
+		SHA256_ROUND(i + 7, b, c, d, e, f, g, h, a);
+	}
+
+	state[0] += a; state[1] += b; state[2] += c; state[3] += d;
+	state[4] += e; state[5] += f; state[6] += g; state[7] += h;
+}
+
+static void sha256_transform_blocks(struct sha256_state *sctx,
+				    const uint8_t *input, int blocks)
+{
+	uint32_t W[64];
+
+	do {
+		sha256_transform(sctx->state, input, W);
+		input += SHA256_BLOCK_SIZE;
+	} while (--blocks);
+
+	memzero_explicit(W, sizeof(W));
+}
+
+static void sha256_update(struct sha256_state *sctx, const uint8_t *data, unsigned int len)
+{
+	lib_sha256_base_do_update(sctx, data, len, sha256_transform_blocks);
+}
+
+static void __sha256_final(struct sha256_state *sctx, uint8_t *out, int digest_size)
+{
+	lib_sha256_base_do_finalize(sctx, sha256_transform_blocks);
+	lib_sha256_base_finish(sctx, out, digest_size);
+}
+
+static void sha256_final(struct sha256_state *sctx, uint8_t *out)
+{
+	__sha256_final(sctx, out, 32);
+}
+
+char *symhash_str(const char *name, size_t len, char hash_str[SYMHASH_STR_LEN])
+{
+	static const char hex[] = "0123456789abcdef";
+	uint8_t hash[SHA256_DIGEST_SIZE];
+	struct sha256_state sctx;
+	char *p = hash_str;
+
+	/*
+	 * If the symbol name has an initial dot, dedotify it before hashing to match
+	 * PPC64 behavior in arch/powerpc/kernel/module_64.c.
+	 */
+	if (name[0] == '.') {
+		name++;
+		len--;
+	}
+
+	sha256_init(&sctx);
+	sha256_update(&sctx, (const uint8_t *)name, len);
+	sha256_final(&sctx, hash);
+
+	/* Null-terminated prefix */
+	memcpy(p, SYMHASH_STR_PREFIX, SYMHASH_STR_PREFIX_LEN);
+	p += SYMHASH_STR_PREFIX_LEN;
+
+	/* Binary hash */
+	for (int i = 0; i < SHA256_DIGEST_SIZE; i++) {
+		*p++ = '\\';
+		*p++ = 'x';
+		*p++ = hex[(hash[i] & 0xf0) >> 4];
+		*p++ = hex[hash[i] & 0x0f];
+	}
+
+	hash_str[SYMHASH_STR_LEN - 1] = '\0';
+	return hash_str;
+}
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 14/15] module: Support hashed symbol names when checking modversions
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (12 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 13/15] modpost: Add support for hashing long symbol names Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-17 17:58 ` [PATCH 15/15] kbuild: Use gendwarfksyms to generate Rust symbol versions Sami Tolvanen
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

When checking modversions for symbol names longer than MODULE_NAME_LEN,
look for the hashed symbol name instead. This is needed for Rust modules,
which frequently use symbol names longer than the maximum length
supported by struct modversion_info.

Suggested-by: Matthew Maurer <mmaurer@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 kernel/module/version.c | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/kernel/module/version.c b/kernel/module/version.c
index 53f43ac5a73e..0b33ca085138 100644
--- a/kernel/module/version.c
+++ b/kernel/module/version.c
@@ -8,8 +8,36 @@
 #include <linux/module.h>
 #include <linux/string.h>
 #include <linux/printk.h>
+#include <crypto/sha2.h>
 #include "internal.h"
 
+/*
+ * For symbol names longer than MODULE_NAME_LEN, the version table includes
+ * a hash of the symbol name in the following format:
+ *
+ * <hash name>\0<binary hash of the symbol name>
+ */
+#define SYMHASH_PREFIX		"sha256"
+#define SYMHASH_PREFIX_LEN	sizeof(SYMHASH_PREFIX) /* includes \0 */
+#define SYMHASH_LEN		(SYMHASH_PREFIX_LEN + SHA256_DIGEST_SIZE)
+
+static void symhash(const char *name, size_t len, u8 hash[SYMHASH_LEN])
+{
+	memcpy(hash, SYMHASH_PREFIX, SYMHASH_PREFIX_LEN);
+	sha256(name, len, &hash[SYMHASH_PREFIX_LEN]);
+}
+
+static int symcmp(const char *version_name, const char *name, size_t len,
+		  const u8 *hash)
+{
+	BUILD_BUG_ON(SYMHASH_LEN > MODULE_NAME_LEN);
+
+	if (len >= MODULE_NAME_LEN)
+		return memcmp(version_name, hash, SYMHASH_LEN);
+
+	return strcmp(version_name, name);
+}
+
 int check_version(const struct load_info *info,
 		  const char *symname,
 			 struct module *mod,
@@ -19,6 +47,8 @@ int check_version(const struct load_info *info,
 	unsigned int versindex = info->index.vers;
 	unsigned int i, num_versions;
 	struct modversion_info *versions;
+	u8 hash[SYMHASH_LEN];
+	size_t len;
 
 	/* Exporting module didn't supply crcs?  OK, we're already tainted. */
 	if (!crc)
@@ -32,10 +62,16 @@ int check_version(const struct load_info *info,
 	num_versions = sechdrs[versindex].sh_size
 		/ sizeof(struct modversion_info);
 
+	len = strlen(symname);
+
+	/* For symbols with a long name, use the hash format. */
+	if (len >= MODULE_NAME_LEN)
+		symhash(symname, len, hash);
+
 	for (i = 0; i < num_versions; i++) {
 		u32 crcval;
 
-		if (strcmp(versions[i].name, symname) != 0)
+		if (symcmp(versions[i].name, symname, len, hash) != 0)
 			continue;
 
 		crcval = *crc;
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 15/15] kbuild: Use gendwarfksyms to generate Rust symbol versions
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (13 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 14/15] module: Support hashed symbol names when checking modversions Sami Tolvanen
@ 2024-06-17 17:58 ` Sami Tolvanen
  2024-06-18 16:28 ` [PATCH 00/15] Implement MODVERSIONS for Rust Masahiro Yamada
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-17 17:58 UTC (permalink / raw)
  To: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman
  Cc: Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux,
	Sami Tolvanen

Use gendwarfksyms to generate symbol versions for exported Rust symbols,
and allow CONFIG_MODVERSIONS to be enabled with CONFIG_RUST, assuming the
debugging information needed by gendwarfksyms is also available.

Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
---
 Makefile      |  6 ++++++
 init/Kconfig  |  2 +-
 rust/Makefile | 30 ++++++++++++++++++++++++------
 3 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/Makefile b/Makefile
index 14427547dc1e..0701917c5172 100644
--- a/Makefile
+++ b/Makefile
@@ -1344,6 +1344,12 @@ prepare: tools/bpf/resolve_btfids
 endif
 endif
 
+ifdef CONFIG_MODVERSIONS
+ifdef CONFIG_RUST
+prepare: tools/gendwarfksyms
+endif
+endif
+
 PHONY += resolve_btfids_clean
 
 resolve_btfids_O = $(abspath $(objtree))/tools/bpf/resolve_btfids
diff --git a/init/Kconfig b/init/Kconfig
index 72404c1f2157..ef29f002e932 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1903,7 +1903,7 @@ config RUST
 	depends on HAVE_RUST
 	depends on RUST_IS_AVAILABLE
 	depends on !CFI_CLANG
-	depends on !MODVERSIONS
+	depends on !MODVERSIONS || (DEBUG_INFO && !DEBUG_INFO_REDUCED && !DEBUG_INFO_SPLIT)
 	depends on !GCC_PLUGINS
 	depends on !RANDSTRUCT
 	depends on !DEBUG_INFO_BTF || PAHOLE_HAS_LANG_EXCLUDE
diff --git a/rust/Makefile b/rust/Makefile
index f70d5e244fee..385cdf5dc320 100644
--- a/rust/Makefile
+++ b/rust/Makefile
@@ -356,10 +356,11 @@ $(obj)/bindings/bindings_helpers_generated.rs: private bindgen_target_extra = ;
 $(obj)/bindings/bindings_helpers_generated.rs: $(src)/helpers.c FORCE
 	$(call if_changed_dep,bindgen)
 
+rust_exports = $(NM) -p --defined-only $(1) | awk '/ (T|R|D) / { printf $(2),$(3) }'
+
 quiet_cmd_exports = EXPORTS $@
       cmd_exports = \
-	$(NM) -p --defined-only $< \
-		| awk '/ (T|R|D) / {printf "EXPORT_SYMBOL_RUST_GPL(%s);\n",$$3}' > $@
+	$(call rust_exports,$<,"EXPORT_SYMBOL_RUST_GPL(%s);\n",$$3) > $@
 
 $(obj)/exports_core_generated.h: $(obj)/core.o FORCE
 	$(call if_changed,exports)
@@ -420,12 +421,29 @@ ifneq ($(or $(CONFIG_ARM64),$(and $(CONFIG_RISCV),$(CONFIG_64BIT))),)
 		__ashlti3 __lshrti3
 endif
 
+ifdef CONFIG_MODVERSIONS
+# When module versioning is enabled, calculate symbol versions from the DWARF
+# debugging information. We can't use a source code parser like genksyms,
+# because the source files don't have information about the final structure
+# layout and emitted symbols.
+gendwarfksyms := $(objtree)/tools/gendwarfksyms/gendwarfksyms
+
+cmd_gendwarfksyms = \
+	$(call rust_exports,$@,"%s %s\n",$$1$(comma)$$3) \
+		| $(gendwarfksyms) $@ >> $(dot-target).cmd
+endif
+
+define rule_rustc_library_with_exports
+	$(call cmd_and_fixdep,rustc_library)
+	$(call cmd,gendwarfksyms)
+endef
+
 $(obj)/core.o: private skip_clippy = 1
 $(obj)/core.o: private skip_flags = -Dunreachable_pub
 $(obj)/core.o: private rustc_objcopy = $(foreach sym,$(redirect-intrinsics),--redefine-sym $(sym)=__rust$(sym))
 $(obj)/core.o: private rustc_target_flags = $(core-cfgs)
 $(obj)/core.o: $(RUST_LIB_SRC)/core/src/lib.rs FORCE
-	+$(call if_changed_dep,rustc_library)
+	+$(call if_changed_rule,rustc_library_with_exports)
 ifdef CONFIG_X86_64
 $(obj)/core.o: scripts/target.json
 endif
@@ -438,7 +456,7 @@ $(obj)/alloc.o: private skip_clippy = 1
 $(obj)/alloc.o: private skip_flags = -Dunreachable_pub
 $(obj)/alloc.o: private rustc_target_flags = $(alloc-cfgs)
 $(obj)/alloc.o: $(RUST_LIB_SRC)/alloc/src/lib.rs $(obj)/compiler_builtins.o FORCE
-	+$(call if_changed_dep,rustc_library)
+	+$(call if_changed_rule,rustc_library_with_exports)
 
 $(obj)/build_error.o: $(src)/build_error.rs $(obj)/compiler_builtins.o FORCE
 	+$(call if_changed_dep,rustc_library)
@@ -447,7 +465,7 @@ $(obj)/bindings.o: $(src)/bindings/lib.rs \
     $(obj)/compiler_builtins.o \
     $(obj)/bindings/bindings_generated.rs \
     $(obj)/bindings/bindings_helpers_generated.rs FORCE
-	+$(call if_changed_dep,rustc_library)
+	+$(call if_changed_rule,rustc_library_with_exports)
 
 $(obj)/uapi.o: $(src)/uapi/lib.rs \
     $(obj)/compiler_builtins.o \
@@ -458,6 +476,6 @@ $(obj)/kernel.o: private rustc_target_flags = --extern alloc \
     --extern build_error --extern macros --extern bindings --extern uapi
 $(obj)/kernel.o: $(src)/kernel/lib.rs $(obj)/alloc.o $(obj)/build_error.o \
     $(obj)/libmacros.so $(obj)/bindings.o $(obj)/uapi.o FORCE
-	+$(call if_changed_dep,rustc_library)
+	+$(call if_changed_rule,rustc_library_with_exports)
 
 endif # CONFIG_RUST
-- 
2.45.2.627.g7a2c4fd464-goog


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (14 preceding siblings ...)
  2024-06-17 17:58 ` [PATCH 15/15] kbuild: Use gendwarfksyms to generate Rust symbol versions Sami Tolvanen
@ 2024-06-18 16:28 ` Masahiro Yamada
  2024-06-18 20:05   ` Sami Tolvanen
  2024-06-18 16:44 ` Greg Kroah-Hartman
                   ` (2 subsequent siblings)
  18 siblings, 1 reply; 37+ messages in thread
From: Masahiro Yamada @ 2024-06-18 16:28 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Luis Chamberlain, Miguel Ojeda, Greg Kroah-Hartman,
	Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux

[-- Attachment #1: Type: text/plain, Size: 3006 bytes --]

On Tue, Jun 18, 2024 at 2:58 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> Hi folks,
>
> This series implements CONFIG_MODVERSIONS for Rust, an important
> feature for distributions like Android that want to ship Rust
> kernel modules, and depend on modversions to help ensure module ABI
> compatibility.
>
> There have been earlier proposals [1][2] that would allow Rust
> modules to coexist with modversions, but none that actually implement
> symbol versioning. Unlike C, Rust source code doesn't have sufficient
> information about the final ABI, as the compiler has considerable
> freedom in adjusting structure layout for improved performance [3],
> for example, which makes using a source code parser like genksyms
> a non-starter. Based on Matt's suggestion and previous feedback
> from maintainers, this series uses DWARF debugging information for
> computing versions. DWARF is an established and relatively stable
> format, which includes all the necessary ABI details, and adding a
> CONFIG_DEBUG_INFO dependency for Rust symbol versioning seems like a
> reasonable trade-off.
>
> The first 12 patches of this series add a small tool for computing
> symbol versions from DWARF, called gendwarfksyms. When passed a list
> of exported symbols, the tool generates an expanded type string
> for each symbol, and computes symbol CRCs similarly to genksyms.
> gendwarfksyms is written in C and uses libdw to process DWARF, mainly
> because of the existing support for C host tools that use elfutils
> (e.g., objtool).
>
> Another compatibility issue is fitting the extremely long mangled
> Rust symbol names into struct modversion_info, which can only hold
> 55-character names (on 64-bit systems). Previous proposals suggested
> changing the structure to support longer names, but the conclusion was
> that we cannot break userspace tools that parse the version table.
>
> The next two patches of the series implement support for hashed
> symbol names in struct modversion_info, where names longer than 55
> characters are hashed, and the hash is stored in the name field. To
> avoid breaking userspace tools, the binary hash is prefixed with a
> null-terminated string containing the name of the hash function. While
> userspace tools can later be updated to potentially produce more
> useful information about the long symbols, this allows them to
> continue working in the meantime.
>
> The final patch allows CONFIG_MODVERSIONS to be selected with Rust,
> provided that debugging information is also available.




I am surprised at someone who attempts to add another variant of genksyms.


I am also surprised at the tool being added under the tools/ directory.

At least, we can avoid the latter misfortune, though.


A patch attached (on top of your patch set).

Such a tool can be compiled in scripts/, or even better
in rust/Makefile if it is only used in rust/Makefile.





--
Best Regards
Masahiro Yamada

[-- Attachment #2: 0001-kbuild-move-tools-gendwarfksyms-to-scripts-gendwarfk.patch --]
[-- Type: application/x-patch, Size: 9299 bytes --]

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (15 preceding siblings ...)
  2024-06-18 16:28 ` [PATCH 00/15] Implement MODVERSIONS for Rust Masahiro Yamada
@ 2024-06-18 16:44 ` Greg Kroah-Hartman
  2024-06-18 16:50   ` Masahiro Yamada
  2024-06-18 19:42 ` Luis Chamberlain
  2024-07-10  7:30 ` Petr Pavlu
  18 siblings, 1 reply; 37+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-18 16:44 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda, Matthew Maurer,
	Alex Gaynor, Wedson Almeida Filho, Gary Guo, linux-kbuild,
	linux-kernel, linux-modules, rust-for-linux

On Mon, Jun 17, 2024 at 05:58:19PM +0000, Sami Tolvanen wrote:
> Hi folks,
> 
> This series implements CONFIG_MODVERSIONS for Rust, an important
> feature for distributions like Android that want to ship Rust
> kernel modules, and depend on modversions to help ensure module ABI
> compatibility.
> 
> There have been earlier proposals [1][2] that would allow Rust
> modules to coexist with modversions, but none that actually implement
> symbol versioning. Unlike C, Rust source code doesn't have sufficient
> information about the final ABI, as the compiler has considerable
> freedom in adjusting structure layout for improved performance [3],
> for example, which makes using a source code parser like genksyms
> a non-starter. Based on Matt's suggestion and previous feedback
> from maintainers, this series uses DWARF debugging information for
> computing versions. DWARF is an established and relatively stable
> format, which includes all the necessary ABI details, and adding a
> CONFIG_DEBUG_INFO dependency for Rust symbol versioning seems like a
> reasonable trade-off.
> 
> The first 12 patches of this series add a small tool for computing
> symbol versions from DWARF, called gendwarfksyms. When passed a list
> of exported symbols, the tool generates an expanded type string
> for each symbol, and computes symbol CRCs similarly to genksyms.
> gendwarfksyms is written in C and uses libdw to process DWARF, mainly
> because of the existing support for C host tools that use elfutils
> (e.g., objtool).

That's cool, can the C code be switched to also use this?  That way we
only have one path/code for all of this?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 13/15] modpost: Add support for hashing long symbol names
  2024-06-17 17:58 ` [PATCH 13/15] modpost: Add support for hashing long symbol names Sami Tolvanen
@ 2024-06-18 16:47   ` Masahiro Yamada
  2024-06-18 20:07     ` Sami Tolvanen
  0 siblings, 1 reply; 37+ messages in thread
From: Masahiro Yamada @ 2024-06-18 16:47 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Luis Chamberlain, Miguel Ojeda, Greg Kroah-Hartman,
	Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux

On Tue, Jun 18, 2024 at 2:58 AM Sami Tolvanen <samitolvanen@google.com> wrote:
>
> Rust frequently has symbol names longer than MODULE_NAME_LEN, because
> the full namespace path is encoded into the mangled name. Instead of
> modpost failing when running into a long name with CONFIG_MODVERSIONS,
> store a hash of the name in struct modversion_info.
>
> To avoid breaking userspace tools that parse the __versions array,
> include a null-terminated hash function name at the beginning of the
> name string, followed by a binary hash. In order to keep .mod.c files
> more human-readable, add a comment with the full symbol name before the
> array entry. Example output in rust_minimal.mod.c:
>
>   static const struct modversion_info ____versions[]
>   __used __section("__versions") = {
>         /* _RNvNtNtCs1cdwasc6FUb_6kernel5print14format_strings4INFO */
>         { 0x9ec5442f, "sha256\x00\x56\x96\xf4\x27\xdb\x4a\xbf[...]" },
>         { 0x1d6595b1, "_RNvNtCs1cdwasc6FUb_6kernel5print11call_printk" },
>         { 0x3c642974, "__rust_dealloc" },
>         ...
>   };
>
> modprobe output for the resulting module:
>
>   $ modprobe --dump-modversions rust_minimal.ko
>   0x9ec5442f    sha256
>   0x1d6595b1    _RNvNtCs1cdwasc6FUb_6kernel5print11call_printk
>   0x3c642974    __rust_dealloc
>   ...
>
> While the output is less useful, the tool continues to work and can be
> later updated to produce more helpful output for hashed symbols.
>
> Note that this patch adds a generic SHA-256 implementation to modpost
> adapted from the Crypto API, but other hash functions may be used in
> future if needed.
>
> Suggested-by: Matthew Maurer <mmaurer@google.com>
> Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
> ---
>  scripts/mod/Makefile  |   4 +-
>  scripts/mod/modpost.c |  20 ++-
>  scripts/mod/modpost.h |  20 +++
>  scripts/mod/symhash.c | 327 ++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 364 insertions(+), 7 deletions(-)
>  create mode 100644 scripts/mod/symhash.c
>
> diff --git a/scripts/mod/Makefile b/scripts/mod/Makefile
> index c729bc936bae..ddd59ea9027e 100644
> --- a/scripts/mod/Makefile
> +++ b/scripts/mod/Makefile
> @@ -4,7 +4,7 @@ CFLAGS_REMOVE_empty.o += $(CC_FLAGS_LTO)
>  hostprogs-always-y     += modpost mk_elfconfig
>  always-y               += empty.o
>
> -modpost-objs   := modpost.o file2alias.o sumversion.o symsearch.o
> +modpost-objs   := modpost.o file2alias.o symhash.o sumversion.o symsearch.o
>
>  devicetable-offsets-file := devicetable-offsets.h
>
> @@ -15,7 +15,7 @@ targets += $(devicetable-offsets-file) devicetable-offsets.s
>
>  # dependencies on generated files need to be listed explicitly
>
> -$(obj)/modpost.o $(obj)/file2alias.o $(obj)/sumversion.o $(obj)/symsearch.o: $(obj)/elfconfig.h
> +$(obj)/modpost.o $(obj)/file2alias.o $(obj)/symhash.o $(obj)/sumversion.o $(obj)/symsearch.o: $(obj)/elfconfig.h
>  $(obj)/file2alias.o: $(obj)/$(devicetable-offsets-file)
>
>  quiet_cmd_elfconfig = MKELF   $@
> diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
> index f48d72d22dc2..2631e3e75a5c 100644
> --- a/scripts/mod/modpost.c
> +++ b/scripts/mod/modpost.c
> @@ -1900,7 +1900,10 @@ static void add_exported_symbols(struct buffer *buf, struct module *mod)
>   **/
>  static void add_versions(struct buffer *b, struct module *mod)
>  {
> +       char hash[SYMHASH_STR_LEN];
>         struct symbol *s;
> +       const char *name;
> +       size_t len;
>
>         if (!modversions)
>                 return;
> @@ -1917,13 +1920,20 @@ static void add_versions(struct buffer *b, struct module *mod)
>                                 s->name, mod->name);
>                         continue;
>                 }
> -               if (strlen(s->name) >= MODULE_NAME_LEN) {
> -                       error("too long symbol \"%s\" [%s.ko]\n",
> -                             s->name, mod->name);
> -                       break;
> +               len = strlen(s->name);
> +               /*
> +                * For symbols with a long name, use the hash format, but include
> +                * the full symbol name as a comment to keep the generated files
> +                * human-readable.
> +                */
> +               if (len >= MODULE_NAME_LEN) {
> +                       buf_printf(b, "\t/* %s */\n", s->name);
> +                       name = symhash_str(s->name, len, hash);
> +               } else {
> +                       name = s->name;
>                 }
>                 buf_printf(b, "\t{ %#8x, \"%s\" },\n",
> -                          s->crc, s->name);
> +                          s->crc, name);
>         }
>
>         buf_printf(b, "};\n");
> diff --git a/scripts/mod/modpost.h b/scripts/mod/modpost.h
> index ee43c7950636..cd2eb718936b 100644
> --- a/scripts/mod/modpost.h
> +++ b/scripts/mod/modpost.h
> @@ -183,6 +183,26 @@ void handle_moddevtable(struct module *mod, struct elf_info *info,
>                         Elf_Sym *sym, const char *symname);
>  void add_moddevtable(struct buffer *buf, struct module *mod);
>
> +/* symhash.c */
> +/*
> + * For symbol names longer than MODULE_NAME_LEN, encode a hash of the
> + * symbol name in version information as follows:
> + *
> + * <hash name>\0<binary hash of the symbol name>
> + *
> + * e.g. as a string in .mod.c files:
> + * "sha256\x00\xNN{32}"
> + *
> + * The string is null terminated after the hash name to avoid breaking
> + * userspace tools that parse the __versions table and don't understand
> + * the format.
> + */
> +#define SYMHASH_STR_PREFIX     "sha256\\x00"
> +#define SYMHASH_STR_PREFIX_LEN (sizeof(SYMHASH_STR_PREFIX) - 1)
> +#define SYMHASH_STR_LEN                (SYMHASH_STR_PREFIX_LEN + 4*32 + 1)
> +
> +char *symhash_str(const char *name, size_t len, char hash_str[SYMHASH_STR_LEN]);
> +
>  /* sumversion.c */
>  void get_src_version(const char *modname, char sum[], unsigned sumlen);
>
> diff --git a/scripts/mod/symhash.c b/scripts/mod/symhash.c
> new file mode 100644
> index 000000000000..d0c9cf5f1f6c
> --- /dev/null
> +++ b/scripts/mod/symhash.c
> @@ -0,0 +1,327 @@
> +// SPDX-License-Identifier: GPL-2.0-or-later
> +/*
> + * symhash.c
> + *
> + * Symbol name hashing using a SHA-256 implementation adapted from the
> + * Cryptographic API.
> + */
> +#include <byteswap.h>
> +#include "modpost.h"
> +
> +#if HOST_ELFDATA == ELFDATA2MSB
> +/* Big endian */
> +#define be32_to_cpu(val) (val)
> +#define cpu_to_be32(val) (val)
> +#define cpu_to_be64(val) (val)
> +#else
> +/* Little endian */
> +#define be32_to_cpu(val) bswap_32(val)
> +#define cpu_to_be32(val) bswap_32(val)
> +#define cpu_to_be64(val) bswap_64(val)
> +#endif
> +
> +#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")




I know this is a copy-paste of the kernel space code,
but is barrier_data() also necessary for host programs?





> +
> +static inline void memzero_explicit(void *s, size_t count)
> +{
> +       memset(s, 0, count);
> +       barrier_data(s);
> +}
> +



--
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-06-18 16:44 ` Greg Kroah-Hartman
@ 2024-06-18 16:50   ` Masahiro Yamada
  2024-06-18 17:18     ` Greg Kroah-Hartman
  0 siblings, 1 reply; 37+ messages in thread
From: Masahiro Yamada @ 2024-06-18 16:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Sami Tolvanen, Luis Chamberlain, Miguel Ojeda, Matthew Maurer,
	Alex Gaynor, Wedson Almeida Filho, Gary Guo, linux-kbuild,
	linux-kernel, linux-modules, rust-for-linux

On Wed, Jun 19, 2024 at 1:44 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Mon, Jun 17, 2024 at 05:58:19PM +0000, Sami Tolvanen wrote:
> > Hi folks,
> >
> > This series implements CONFIG_MODVERSIONS for Rust, an important
> > feature for distributions like Android that want to ship Rust
> > kernel modules, and depend on modversions to help ensure module ABI
> > compatibility.
> >
> > There have been earlier proposals [1][2] that would allow Rust
> > modules to coexist with modversions, but none that actually implement
> > symbol versioning. Unlike C, Rust source code doesn't have sufficient
> > information about the final ABI, as the compiler has considerable
> > freedom in adjusting structure layout for improved performance [3],
> > for example, which makes using a source code parser like genksyms
> > a non-starter. Based on Matt's suggestion and previous feedback
> > from maintainers, this series uses DWARF debugging information for
> > computing versions. DWARF is an established and relatively stable
> > format, which includes all the necessary ABI details, and adding a
> > CONFIG_DEBUG_INFO dependency for Rust symbol versioning seems like a
> > reasonable trade-off.
> >
> > The first 12 patches of this series add a small tool for computing
> > symbol versions from DWARF, called gendwarfksyms. When passed a list
> > of exported symbols, the tool generates an expanded type string
> > for each symbol, and computes symbol CRCs similarly to genksyms.
> > gendwarfksyms is written in C and uses libdw to process DWARF, mainly
> > because of the existing support for C host tools that use elfutils
> > (e.g., objtool).
>
> That's cool, can the C code be switched to also use this?  That way we
> only have one path/code for all of this?


As the description says, it requires CONFIG_DEBUG_INFO.
We can strip the debug info from the final vmlinux, but
I guess the build speed will be even slower than the current genksyms.




-- 
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-06-18 16:50   ` Masahiro Yamada
@ 2024-06-18 17:18     ` Greg Kroah-Hartman
  2024-06-18 19:03       ` Masahiro Yamada
  0 siblings, 1 reply; 37+ messages in thread
From: Greg Kroah-Hartman @ 2024-06-18 17:18 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Sami Tolvanen, Luis Chamberlain, Miguel Ojeda, Matthew Maurer,
	Alex Gaynor, Wedson Almeida Filho, Gary Guo, linux-kbuild,
	linux-kernel, linux-modules, rust-for-linux

On Wed, Jun 19, 2024 at 01:50:36AM +0900, Masahiro Yamada wrote:
> On Wed, Jun 19, 2024 at 1:44 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Mon, Jun 17, 2024 at 05:58:19PM +0000, Sami Tolvanen wrote:
> > > Hi folks,
> > >
> > > This series implements CONFIG_MODVERSIONS for Rust, an important
> > > feature for distributions like Android that want to ship Rust
> > > kernel modules, and depend on modversions to help ensure module ABI
> > > compatibility.
> > >
> > > There have been earlier proposals [1][2] that would allow Rust
> > > modules to coexist with modversions, but none that actually implement
> > > symbol versioning. Unlike C, Rust source code doesn't have sufficient
> > > information about the final ABI, as the compiler has considerable
> > > freedom in adjusting structure layout for improved performance [3],
> > > for example, which makes using a source code parser like genksyms
> > > a non-starter. Based on Matt's suggestion and previous feedback
> > > from maintainers, this series uses DWARF debugging information for
> > > computing versions. DWARF is an established and relatively stable
> > > format, which includes all the necessary ABI details, and adding a
> > > CONFIG_DEBUG_INFO dependency for Rust symbol versioning seems like a
> > > reasonable trade-off.
> > >
> > > The first 12 patches of this series add a small tool for computing
> > > symbol versions from DWARF, called gendwarfksyms. When passed a list
> > > of exported symbols, the tool generates an expanded type string
> > > for each symbol, and computes symbol CRCs similarly to genksyms.
> > > gendwarfksyms is written in C and uses libdw to process DWARF, mainly
> > > because of the existing support for C host tools that use elfutils
> > > (e.g., objtool).
> >
> > That's cool, can the C code be switched to also use this?  That way we
> > only have one path/code for all of this?
> 
> 
> As the description says, it requires CONFIG_DEBUG_INFO.
> We can strip the debug info from the final vmlinux, but
> I guess the build speed will be even slower than the current genksyms.

For people who want genksyms (i.e. distros), don't they normally already
enable DEBUG_INFO as well?  The problems of genksyms are well known and
a pain (I speak from experience), so replacing it with info based on
DWARF would be great, I'll gladly trade off the DEBUG_INFO issue for
stablilty!

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-06-18 17:18     ` Greg Kroah-Hartman
@ 2024-06-18 19:03       ` Masahiro Yamada
  2024-06-18 20:19         ` Sami Tolvanen
  0 siblings, 1 reply; 37+ messages in thread
From: Masahiro Yamada @ 2024-06-18 19:03 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Sami Tolvanen, Luis Chamberlain, Miguel Ojeda, Matthew Maurer,
	Alex Gaynor, Wedson Almeida Filho, Gary Guo, linux-kbuild,
	linux-kernel, linux-modules, rust-for-linux

On Wed, Jun 19, 2024 at 2:18 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Wed, Jun 19, 2024 at 01:50:36AM +0900, Masahiro Yamada wrote:
> > On Wed, Jun 19, 2024 at 1:44 AM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Mon, Jun 17, 2024 at 05:58:19PM +0000, Sami Tolvanen wrote:
> > > > Hi folks,
> > > >
> > > > This series implements CONFIG_MODVERSIONS for Rust, an important
> > > > feature for distributions like Android that want to ship Rust
> > > > kernel modules, and depend on modversions to help ensure module ABI
> > > > compatibility.
> > > >
> > > > There have been earlier proposals [1][2] that would allow Rust
> > > > modules to coexist with modversions, but none that actually implement
> > > > symbol versioning. Unlike C, Rust source code doesn't have sufficient
> > > > information about the final ABI, as the compiler has considerable
> > > > freedom in adjusting structure layout for improved performance [3],
> > > > for example, which makes using a source code parser like genksyms
> > > > a non-starter. Based on Matt's suggestion and previous feedback
> > > > from maintainers, this series uses DWARF debugging information for
> > > > computing versions. DWARF is an established and relatively stable
> > > > format, which includes all the necessary ABI details, and adding a
> > > > CONFIG_DEBUG_INFO dependency for Rust symbol versioning seems like a
> > > > reasonable trade-off.
> > > >
> > > > The first 12 patches of this series add a small tool for computing
> > > > symbol versions from DWARF, called gendwarfksyms. When passed a list
> > > > of exported symbols, the tool generates an expanded type string
> > > > for each symbol, and computes symbol CRCs similarly to genksyms.
> > > > gendwarfksyms is written in C and uses libdw to process DWARF, mainly
> > > > because of the existing support for C host tools that use elfutils
> > > > (e.g., objtool).
> > >
> > > That's cool, can the C code be switched to also use this?  That way we
> > > only have one path/code for all of this?
> >
> >
> > As the description says, it requires CONFIG_DEBUG_INFO.
> > We can strip the debug info from the final vmlinux, but
> > I guess the build speed will be even slower than the current genksyms.
>
> For people who want genksyms (i.e. distros), don't they normally already
> enable DEBUG_INFO as well?  The problems of genksyms are well known and
> a pain (I speak from experience), so replacing it with info based on
> DWARF would be great, I'll gladly trade off the DEBUG_INFO issue for
> stablilty!
>
> thanks,
>
> greg k-h
>



I do not think gendwarfksyms is a drop-in replacement,
because it relies on libelf and libdw, which will not
work with LLVM bitcode when CONFIG_LTO_CLANG=y.

His "Let's postpone this until final linking" stuff will
come back?
Then, vmlinux.o is processed to extract the CRC
of all symbols?


In my benchmark, this tool took 3.84 sec just for processing
a single rust/core.o object.

I'd love to see how long it will take to process vmlinux.o

And this occurs even when a single source file is changed
and vmlinux.o is re-linked.




--
Best Regards
Masahiro Yamada

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (16 preceding siblings ...)
  2024-06-18 16:44 ` Greg Kroah-Hartman
@ 2024-06-18 19:42 ` Luis Chamberlain
  2024-06-18 21:19   ` Sami Tolvanen
  2024-07-10  7:30 ` Petr Pavlu
  18 siblings, 1 reply; 37+ messages in thread
From: Luis Chamberlain @ 2024-06-18 19:42 UTC (permalink / raw)
  To: Sami Tolvanen, Kris Van Hees, Steven Rostedt
  Cc: Masahiro Yamada, Miguel Ojeda, Greg Kroah-Hartman, Matthew Maurer,
	Alex Gaynor, Wedson Almeida Filho, Gary Guo, linux-kbuild,
	linux-kernel, linux-modules, rust-for-linux

On Mon, Jun 17, 2024 at 05:58:19PM +0000, Sami Tolvanen wrote:
> Hi folks,
> 
> This series implements CONFIG_MODVERSIONS for Rust, an important
> feature for distributions like Android that want to ship Rust
> kernel modules, and depend on modversions to help ensure module ABI
> compatibility.
> 
> There have been earlier proposals [1][2] that would allow Rust
> modules to coexist with modversions, but none that actually implement
> symbol versioning. Unlike C, Rust source code doesn't have sufficient
> information about the final ABI, as the compiler has considerable
> freedom in adjusting structure layout for improved performance [3],
> for example, which makes using a source code parser like genksyms
> a non-starter. Based on Matt's suggestion and previous feedback
> from maintainers, this series uses DWARF debugging information for
> computing versions. DWARF is an established and relatively stable
> format, which includes all the necessary ABI details, and adding a
> CONFIG_DEBUG_INFO dependency for Rust symbol versioning seems like a
> reasonable trade-off.

OK sure.

> The first 12 patches of this series add a small tool for computing
> symbol versions from DWARF, called gendwarfksyms. When passed a list
> of exported symbols, the tool generates an expanded type string
> for each symbol, and computes symbol CRCs similarly to genksyms.

So this is too word centric Rust, let's think about this generically.
We still ahve a symbol limitation even in the C world then, and this
solution can solve that problem also for other reasons for *whatever*
reason we devise to-come-up-with-in-the-future for augmenting symbols.
Today Rust, tomorrow, who knows.

> gendwarfksyms is written in C and uses libdw to process DWARF, mainly
> because of the existing support for C host tools that use elfutils
> (e.g., objtool).

I agree with Masahiro, that testing this with vmlinux would be eye
opening to what challenges we really have ahead. So, to help with this
let's try to re-think about this  from another perspective.

Yes, old userspace should not break, but you can add yet another option
to let you opt-in to a new world order of how these crc are mapped to
hashed repersentations of the symbols. This would allow us to:

a) Ensure correctness for all users / tools, so that proper plumbing is
   really done. By considering all symbols you increase your scope of
   awareness of anything that could really break.

b) Remove the "Rust" nature about this

c) Rust modules just becomes a *user* of this approach

It gets me wondering, given Kris is also working on allowing traces to
map symbols to module names, how does this fit into that world [0]?

As for a) the reason I'm thinking about having the ability to test a
full real kernel and moules with this is, without that, how are you sure
you have the full scope of the changes needed?

[0] https://lkml.kernel.org/r/20240614171428.968174-3-kris.van.hees@oracle.com

  Luis

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-06-18 16:28 ` [PATCH 00/15] Implement MODVERSIONS for Rust Masahiro Yamada
@ 2024-06-18 20:05   ` Sami Tolvanen
  0 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-18 20:05 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Luis Chamberlain, Miguel Ojeda, Greg Kroah-Hartman,
	Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux

Hi Masahiro,

On Wed, Jun 19, 2024 at 01:28:21AM +0900, Masahiro Yamada wrote:
> I am surprised at someone who attempts to add another variant of genksyms.

The options are rather limited if we want Rust modules that are
compatible with modversions. We either come up with a way to version
Rust symbols or we bypass version checks for them, which is basically
what Matt's earlier patch set did:

https://lore.kernel.org/rust-for-linux/20231118025748.2778044-1-mmaurer@google.com/

If there are better solutions, I would be happy to hear them.

> I am also surprised at the tool being added under the tools/ directory.

I picked this location because that's where objtool lives, and wasn't
aware that this was frowned upon. I'm fine with moving the tool
elsewhere too.

Sami

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 13/15] modpost: Add support for hashing long symbol names
  2024-06-18 16:47   ` Masahiro Yamada
@ 2024-06-18 20:07     ` Sami Tolvanen
  0 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-18 20:07 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Luis Chamberlain, Miguel Ojeda, Greg Kroah-Hartman,
	Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux

On Wed, Jun 19, 2024 at 01:47:13AM +0900, Masahiro Yamada wrote:
> On Tue, Jun 18, 2024 at 2:58 AM Sami Tolvanen <samitolvanen@google.com> wrote:
> >
> > +#define barrier_data(ptr) __asm__ __volatile__("": :"r"(ptr) :"memory")
> 
> 
> 
> 
> I know this is a copy-paste of the kernel space code,
> but is barrier_data() also necessary for host programs?

I tested this with Clang and it worked fine without the barrier, but
with gcc, the code no longer produces correct values. Presumably this is
a load bearing data barrier.

Sami

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-06-18 19:03       ` Masahiro Yamada
@ 2024-06-18 20:19         ` Sami Tolvanen
  0 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-18 20:19 UTC (permalink / raw)
  To: Masahiro Yamada
  Cc: Greg Kroah-Hartman, Luis Chamberlain, Miguel Ojeda,
	Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux

On Wed, Jun 19, 2024 at 04:03:45AM +0900, Masahiro Yamada wrote:
> On Wed, Jun 19, 2024 at 2:18 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Jun 19, 2024 at 01:50:36AM +0900, Masahiro Yamada wrote:
> > > On Wed, Jun 19, 2024 at 1:44 AM Greg Kroah-Hartman
> > > <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > That's cool, can the C code be switched to also use this?  That way we
> > > > only have one path/code for all of this?
> > >
> > >
> > > As the description says, it requires CONFIG_DEBUG_INFO.
> > > We can strip the debug info from the final vmlinux, but
> > > I guess the build speed will be even slower than the current genksyms.
> >
> > For people who want genksyms (i.e. distros), don't they normally already
> > enable DEBUG_INFO as well?  The problems of genksyms are well known and
> > a pain (I speak from experience), so replacing it with info based on
> > DWARF would be great, I'll gladly trade off the DEBUG_INFO issue for
> > stablilty!
> >
> > thanks,
> >
> > greg k-h
> >
> 
> 
> 
> I do not think gendwarfksyms is a drop-in replacement,
> because it relies on libelf and libdw, which will not
> work with LLVM bitcode when CONFIG_LTO_CLANG=y.
> 
> His "Let's postpone this until final linking" stuff will
> come back?
> Then, vmlinux.o is processed to extract the CRC
> of all symbols?

I agree, this won't work with LTO unless we process vmlinux.o.

> In my benchmark, this tool took 3.84 sec just for processing
> a single rust/core.o object.

To be fair, Rust currently exports all globals and core.o has 400
exported symbols as a result. During my brief testing, this tool is
faster than genksyms for normal C code.

> I'd love to see how long it will take to process vmlinux.o

It's obviously going to be quite slow, my defconfig vmlinux.o has
14k exported symbols:

 Performance counter stats for './tools/gendwarfksyms/gendwarfksyms vmlinux.o':

        371,527.67 msec task-clock:u                     #    1.000 CPUs utilized
                 0      context-switches:u               #    0.000 /sec
                 0      cpu-migrations:u                 #    0.000 /sec
           231,554      page-faults:u                    #  623.248 /sec
   <not supported>      cycles:u
   <not supported>      instructions:u
   <not supported>      branches:u
   <not supported>      branch-misses:u

     371.686151684 seconds time elapsed

     370.534637000 seconds user
       0.987825000 seconds sys

The tool is currently single-threaded, so if we really want to go this
route, it could probably be made a bit faster.

> And this occurs even when a single source file is changed
> and vmlinux.o is re-linked.

I suppose anyone using LTO already knows it won't be a quick rebuild
though.

Sami

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-06-18 19:42 ` Luis Chamberlain
@ 2024-06-18 21:19   ` Sami Tolvanen
  2024-06-18 23:32     ` Luis Chamberlain
  0 siblings, 1 reply; 37+ messages in thread
From: Sami Tolvanen @ 2024-06-18 21:19 UTC (permalink / raw)
  To: Luis Chamberlain
  Cc: Kris Van Hees, Steven Rostedt, Masahiro Yamada, Miguel Ojeda,
	Greg Kroah-Hartman, Matthew Maurer, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, linux-kbuild, linux-kernel,
	linux-modules, rust-for-linux

Hi Luis,

On Tue, Jun 18, 2024 at 12:42:51PM -0700, Luis Chamberlain wrote:
> On Mon, Jun 17, 2024 at 05:58:19PM +0000, Sami Tolvanen wrote:
> > The first 12 patches of this series add a small tool for computing
> > symbol versions from DWARF, called gendwarfksyms. When passed a list
> > of exported symbols, the tool generates an expanded type string
> > for each symbol, and computes symbol CRCs similarly to genksyms.
> 
> So this is too word centric Rust, let's think about this generically.
> We still ahve a symbol limitation even in the C world then, and this
> solution can solve that problem also for other reasons for *whatever*
> reason we devise to-come-up-with-in-the-future for augmenting symbols.
> Today Rust, tomorrow, who knows.

If you're referring to the symbol hashing in the __versions table,
that's not specific to Rust. Rust just happens to be the only source of
long symbol names right now.

> > gendwarfksyms is written in C and uses libdw to process DWARF, mainly
> > because of the existing support for C host tools that use elfutils
> > (e.g., objtool).
> 
> I agree with Masahiro, that testing this with vmlinux would be eye
> opening to what challenges we really have ahead. So, to help with this
> let's try to re-think about this  from another perspective.
>
> Yes, old userspace should not break, but you can add yet another option
> to let you opt-in to a new world order of how these crc are mapped to
> hashed repersentations of the symbols. This would allow us to:

We did run into an issue with depmod in our testing, where it needs to
be taught about hashed names to avoid 'unknown symbol' warnings. I'm not
sure if this is acceptable, so I would like to hear feedback about the
viability of the hashing scheme in general.

If old userspace can't have any regressions because of long symbol
names, I suspect we'll have to go back to omitting long symbols from
struct modversion_info and adding a v2 of the __versions table with no
symbol name length limitations. Happy to hear your thoughts.

> a) Ensure correctness for all users / tools, so that proper plumbing is
>    really done. By considering all symbols you increase your scope of
>    awareness of anything that could really break.
> 
> b) Remove the "Rust" nature about this
> 
> c) Rust modules just becomes a *user* of this approach

I believe the only Rust nature here is the last patch that enables
gendwarfksyms only for Rust. Otherwise, there shouldn't be anything
strictly Rust specific.

> It gets me wondering, given Kris is also working on allowing traces to
> map symbols to module names, how does this fit into that world [0]?

AFAIK long symbol names are only a problem for struct modversion_info,
which uses a relatively short name buffer, so I'm hoping other efforts
won't be affected.

> As for a) the reason I'm thinking about having the ability to test a
> full real kernel and moules with this is, without that, how are you sure
> you have the full scope of the changes needed?

Sure, I can look into hooking this up for the entire kernel and seeing
what breaks, in addition the issues Masahiro pointed out, of course.

Sami

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-06-18 21:19   ` Sami Tolvanen
@ 2024-06-18 23:32     ` Luis Chamberlain
  0 siblings, 0 replies; 37+ messages in thread
From: Luis Chamberlain @ 2024-06-18 23:32 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Kris Van Hees, Steven Rostedt, Masahiro Yamada, Miguel Ojeda,
	Greg Kroah-Hartman, Matthew Maurer, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, linux-kbuild, linux-kernel,
	linux-modules, rust-for-linux

On Tue, Jun 18, 2024 at 02:19:47PM -0700, Sami Tolvanen wrote:
> Hi Luis,
> 
> On Tue, Jun 18, 2024 at 12:42:51PM -0700, Luis Chamberlain wrote:
> > a) Ensure correctness for all users / tools, so that proper plumbing is
> >    really done. By considering all symbols you increase your scope of
> >    awareness of anything that could really break.
> > 
> > b) Remove the "Rust" nature about this
> > 
> > c) Rust modules just becomes a *user* of this approach
> 
> I believe the only Rust nature here is the last patch that enables
> gendwarfksyms only for Rust. Otherwise, there shouldn't be anything
> strictly Rust specific.

Right now the check for length is generic, and assumes a hash may exist
without considering that check is moot for non-rust modules. So the
inverse is true, but doesn't provide a solution or proper architecture
for it.

> > It gets me wondering, given Kris is also working on allowing traces to
> > map symbols to module names, how does this fit into that world [0]?
> 
> AFAIK long symbol names are only a problem for struct modversion_info,
> which uses a relatively short name buffer, so I'm hoping other efforts
> won't be affected.

We'll see!

> > As for a) the reason I'm thinking about having the ability to test a
> > full real kernel and moules with this is, without that, how are you sure
> > you have the full scope of the changes needed?
> 
> Sure, I can look into hooking this up for the entire kernel and seeing
> what breaks, in addition the issues Masahiro pointed out, of course.

:) should be fun!

I think once its revised as a "generic" strategy and not a Rust one, the
proper architecture will be considered. Right now, it just looks like a
cheap band aid for Rust.

  Luis

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
                   ` (17 preceding siblings ...)
  2024-06-18 19:42 ` Luis Chamberlain
@ 2024-07-10  7:30 ` Petr Pavlu
  2024-07-15 20:39   ` Sami Tolvanen
  18 siblings, 1 reply; 37+ messages in thread
From: Petr Pavlu @ 2024-07-10  7:30 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Greg Kroah-Hartman, Matthew Maurer, Alex Gaynor,
	Wedson Almeida Filho, Gary Guo, linux-kbuild, linux-kernel,
	linux-modules, rust-for-linux

On 6/17/24 19:58, Sami Tolvanen wrote:
> Hi folks,
> 
> This series implements CONFIG_MODVERSIONS for Rust, an important
> feature for distributions like Android that want to ship Rust
> kernel modules, and depend on modversions to help ensure module ABI
> compatibility.

Thanks for working on this. Below is some feedback with my (open)SUSE
hat on, although it should be quite general.

> There have been earlier proposals [1][2] that would allow Rust
> modules to coexist with modversions, but none that actually implement
> symbol versioning. Unlike C, Rust source code doesn't have sufficient
> information about the final ABI, as the compiler has considerable
> freedom in adjusting structure layout for improved performance [3],
> for example, which makes using a source code parser like genksyms
> a non-starter. Based on Matt's suggestion and previous feedback
> from maintainers, this series uses DWARF debugging information for
> computing versions. DWARF is an established and relatively stable
> format, which includes all the necessary ABI details, and adding a
> CONFIG_DEBUG_INFO dependency for Rust symbol versioning seems like a
> reasonable trade-off.

Using the DWARF data makes sense to me. Distribution kernels are
normally built with debuginfo because one has to be able to debug them
later, unsurprisingly. Besides that, one now typically wants to use BPF
together with built-in BTF data (CONFIG_DEBUG_INFO_BTF), which also
requires building the kernel with debuginfo.

I would however keep in mind that producing all DWARF data has some
cost. From a quick test, an x86_64-defconfig build is ~30% slower for me
with CONFIG_DEBUG_INFO=y. The current genksyms tool allows to have
debuginfo disabled when backporting some patches and consequently have
a quicker feedback whether modversions changed. This option would
disappear with gendwarfksyms but I think it is acceptable.

> 
> The first 12 patches of this series add a small tool for computing
> symbol versions from DWARF, called gendwarfksyms. When passed a list
> of exported symbols, the tool generates an expanded type string
> for each symbol, and computes symbol CRCs similarly to genksyms.
> gendwarfksyms is written in C and uses libdw to process DWARF, mainly
> because of the existing support for C host tools that use elfutils
> (e.g., objtool).

In addition to calculating CRCs of exported symbols, genksyms has other
features which I think are important.

Firstly, the genksyms tool has a human-readable storage format for input
data used in the calculation of symbol CRCs. Setting the make variable
KBUILD_SYMTYPES enables dumping this data and storing it in *.symtypes
files.

When a developer later modifies the kernel and wants to check if some
symbols have changed, they can take these files and feed them as
*.symref back to genksyms. This allows the tool to provide an actual
reason why some symbols have changed, instead of just printing that
their CRCs are different.

Is there any plan to add the same functionality to gendwarfksyms, or do
you envison that people will use libabigail, Symbol-Type Graph, or
another tool for making this type of comparison?

Secondly, when distributions want to maintain stable kABI, they need to
be able to deal with patch backports that add new members to structures.
One common approach is to have placeholders in important structures
which can be later replaced by the new members as needed. __GENKSYMS__
ifdefs are then used at the C source level to hide these kABI-compatible
changes from genksyms.

Gendwarfksyms works on the resulting binary and so using such ifdefs
wouldn't work. Instead, I suspect that what is required is a mechanism
to tell the tool that a given change is ok, probably by allowing to
specify some map from the original definition to the new one.

Is there a plan to implement something like this, or how could it be
addressed?

> Another compatibility issue is fitting the extremely long mangled
> Rust symbol names into struct modversion_info, which can only hold
> 55-character names (on 64-bit systems). Previous proposals suggested
> changing the structure to support longer names, but the conclusion was
> that we cannot break userspace tools that parse the version table.
> 
> The next two patches of the series implement support for hashed
> symbol names in struct modversion_info, where names longer than 55
> characters are hashed, and the hash is stored in the name field. To
> avoid breaking userspace tools, the binary hash is prefixed with a
> null-terminated string containing the name of the hash function. While
> userspace tools can later be updated to potentially produce more
> useful information about the long symbols, this allows them to
> continue working in the meantime.

I think this approach with hashed names is quite complex. I'd personally
be also in favor of having a new section with variable-length strings to
store the names of imported symbols. As yet another alternative, it
should be also possible to refer directly into .symtab/.strtab to avoid
duplicating the names, but I suspect it would be non-trivial to
implement.

Cheers,
Petr

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-07-10  7:30 ` Petr Pavlu
@ 2024-07-15 20:39   ` Sami Tolvanen
  2024-07-16  7:12     ` Greg Kroah-Hartman
  2024-07-22  8:20     ` Petr Pavlu
  0 siblings, 2 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-07-15 20:39 UTC (permalink / raw)
  To: Petr Pavlu, Greg Kroah-Hartman
  Cc: Masahiro Yamada, Luis Chamberlain, Miguel Ojeda, Matthew Maurer,
	Alex Gaynor, Wedson Almeida Filho, Gary Guo, linux-kbuild,
	linux-kernel, linux-modules, rust-for-linux

Hi Petr,

On Wed, Jul 10, 2024 at 7:30 AM Petr Pavlu <petr.pavlu@suse.com> wrote:
>
> On 6/17/24 19:58, Sami Tolvanen wrote:
> > Hi folks,
> >
> > This series implements CONFIG_MODVERSIONS for Rust, an important
> > feature for distributions like Android that want to ship Rust
> > kernel modules, and depend on modversions to help ensure module ABI
> > compatibility.
>
> Thanks for working on this. Below is some feedback with my (open)SUSE
> hat on, although it should be quite general.

Great, thanks for taking a look!

> > There have been earlier proposals [1][2] that would allow Rust
> > modules to coexist with modversions, but none that actually implement
> > symbol versioning. Unlike C, Rust source code doesn't have sufficient
> > information about the final ABI, as the compiler has considerable
> > freedom in adjusting structure layout for improved performance [3],
> > for example, which makes using a source code parser like genksyms
> > a non-starter. Based on Matt's suggestion and previous feedback
> > from maintainers, this series uses DWARF debugging information for
> > computing versions. DWARF is an established and relatively stable
> > format, which includes all the necessary ABI details, and adding a
> > CONFIG_DEBUG_INFO dependency for Rust symbol versioning seems like a
> > reasonable trade-off.
>
> Using the DWARF data makes sense to me. Distribution kernels are
> normally built with debuginfo because one has to be able to debug them
> later, unsurprisingly. Besides that, one now typically wants to use BPF
> together with built-in BTF data (CONFIG_DEBUG_INFO_BTF), which also
> requires building the kernel with debuginfo.
>
> I would however keep in mind that producing all DWARF data has some
> cost. From a quick test, an x86_64-defconfig build is ~30% slower for me
> with CONFIG_DEBUG_INFO=y. The current genksyms tool allows to have
> debuginfo disabled when backporting some patches and consequently have
> a quicker feedback whether modversions changed. This option would
> disappear with gendwarfksyms but I think it is acceptable.

Yes, this matches my benchmarks too. Presumably folks who care about
build times and are not interested in debugging information or Rust
support would prefer genksyms instead. I'm planning to turn this into
a Kconfig choice in the next version.

> > The first 12 patches of this series add a small tool for computing
> > symbol versions from DWARF, called gendwarfksyms. When passed a list
> > of exported symbols, the tool generates an expanded type string
> > for each symbol, and computes symbol CRCs similarly to genksyms.
> > gendwarfksyms is written in C and uses libdw to process DWARF, mainly
> > because of the existing support for C host tools that use elfutils
> > (e.g., objtool).
>
> In addition to calculating CRCs of exported symbols, genksyms has other
> features which I think are important.
>
> Firstly, the genksyms tool has a human-readable storage format for input
> data used in the calculation of symbol CRCs. Setting the make variable
> KBUILD_SYMTYPES enables dumping this data and storing it in *.symtypes
> files.
>
> When a developer later modifies the kernel and wants to check if some
> symbols have changed, they can take these files and feed them as
> *.symref back to genksyms. This allows the tool to provide an actual
> reason why some symbols have changed, instead of just printing that
> their CRCs are different.
>
> Is there any plan to add the same functionality to gendwarfksyms, or do
> you envison that people will use libabigail, Symbol-Type Graph, or
> another tool for making this type of comparison?

gendwarfksyms also uses human-readable input for the CRC calculations,
and it prints out the input strings with the --debug option. I plan to
hook this up to KBUILD_SYMTYPES in v2. It should be convenient enough
to simply compare the pretty-printed output with diff, so I'm not sure
if a built-in comparison option is needed. Any other DWARF analysis
tool can be used to spot the differences too, as you mentioned.

> Secondly, when distributions want to maintain stable kABI, they need to
> be able to deal with patch backports that add new members to structures.
> One common approach is to have placeholders in important structures
> which can be later replaced by the new members as needed. __GENKSYMS__
> ifdefs are then used at the C source level to hide these kABI-compatible
> changes from genksyms.
>
> Gendwarfksyms works on the resulting binary and so using such ifdefs
> wouldn't work. Instead, I suspect that what is required is a mechanism
> to tell the tool that a given change is ok, probably by allowing to
> specify some map from the original definition to the new one.
>
> Is there a plan to implement something like this, or how could it be
> addressed?

That's a great question. Here's what Android uses currently to
maintain a stable kABI, I assume you're doing something similar?

https://android.googlesource.com/kernel/common/+/refs/heads/android15-6.6/include/linux/android_kabi.h

If using unions here is acceptable to everyone, a simple solution
would be to use a known name prefix for the reserved members and teach
gendwarfksyms to only print out the original type for the replaced
ones. For example:

The initial placeholder:

    u8 __kabi_reserved_1[8];

After replacement:

    union {
            u64 new_member;
            struct {
                    u8 __kabi_reserved_1[8];
            };
    }

Here gendwarfksyms would see the __kabi_reserved prefix and only use
u8 [8] for the CRC calculation. Does this sound reasonable?

Greg, I know you've been dealing with this for a long time, any thoughts?

> > Another compatibility issue is fitting the extremely long mangled
> > Rust symbol names into struct modversion_info, which can only hold
> > 55-character names (on 64-bit systems). Previous proposals suggested
> > changing the structure to support longer names, but the conclusion was
> > that we cannot break userspace tools that parse the version table.
> >
> > The next two patches of the series implement support for hashed
> > symbol names in struct modversion_info, where names longer than 55
> > characters are hashed, and the hash is stored in the name field. To
> > avoid breaking userspace tools, the binary hash is prefixed with a
> > null-terminated string containing the name of the hash function. While
> > userspace tools can later be updated to potentially produce more
> > useful information about the long symbols, this allows them to
> > continue working in the meantime.
>
> I think this approach with hashed names is quite complex. I'd personally
> be also in favor of having a new section with variable-length strings to
> store the names of imported symbols. As yet another alternative, it
> should be also possible to refer directly into .symtab/.strtab to avoid
> duplicating the names, but I suspect it would be non-trivial to
> implement.

Thanks for the feedback. I think for the next version we'll look into
reviving the relevant bits of Matt's previous patch series, which
implemented a new section that supports variable-length names:

https://lore.kernel.org/lkml/CAGSQo03R+HYcX2JJDSm7LA8V370s_3hrbTBc2s51Pp9nxWTz5w@mail.gmail.com/

Sami

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-07-15 20:39   ` Sami Tolvanen
@ 2024-07-16  7:12     ` Greg Kroah-Hartman
  2024-07-18 17:04       ` Sami Tolvanen
  2024-07-22  8:20     ` Petr Pavlu
  1 sibling, 1 reply; 37+ messages in thread
From: Greg Kroah-Hartman @ 2024-07-16  7:12 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Petr Pavlu, Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux

On Mon, Jul 15, 2024 at 08:39:59PM +0000, Sami Tolvanen wrote:
> If using unions here is acceptable to everyone, a simple solution
> would be to use a known name prefix for the reserved members and teach
> gendwarfksyms to only print out the original type for the replaced
> ones. For example:
> 
> The initial placeholder:
> 
>     u8 __kabi_reserved_1[8];

Don't use u8, use u64 please, it makes things simpler :)

> After replacement:
> 
>     union {
>             u64 new_member;
>             struct {
>                     u8 __kabi_reserved_1[8];
>             };
>     }

Note, such a thing would only be for the distros that want it, you can
add support for this to the tool, but there is no need for any
__kabi_reserved fields in mainline.

> Here gendwarfksyms would see the __kabi_reserved prefix and only use
> u8 [8] for the CRC calculation. Does this sound reasonable?
> 
> Greg, I know you've been dealing with this for a long time, any thoughts?

It's a good start, yes.  Also watch out for when structures go from
"anonymous" to "fully described" when new #include lines get added to
files.  The current tooling has issues with that, so we need to use
__GENKSYMS__ #ifdef lines in some places to keep crc generation stable.
Don't know if dwarf output would be susceptible to the same issues with
that or not, but you should check.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-07-16  7:12     ` Greg Kroah-Hartman
@ 2024-07-18 17:04       ` Sami Tolvanen
  0 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-07-18 17:04 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Petr Pavlu, Masahiro Yamada, Luis Chamberlain, Miguel Ojeda,
	Matthew Maurer, Alex Gaynor, Wedson Almeida Filho, Gary Guo,
	linux-kbuild, linux-kernel, linux-modules, rust-for-linux

On Tue, Jul 16, 2024 at 12:12 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> > After replacement:
> >
> >     union {
> >             u64 new_member;
> >             struct {
> >                     u8 __kabi_reserved_1[8];
> >             };
> >     }
>
> Note, such a thing would only be for the distros that want it, you can
> add support for this to the tool, but there is no need for any
> __kabi_reserved fields in mainline.

Sure. I assume modversions are primarily useful for distros, so I just
want to make sure this scheme can handle all the common issues as
well.

> > Greg, I know you've been dealing with this for a long time, any thoughts?
>
> It's a good start, yes.  Also watch out for when structures go from
> "anonymous" to "fully described" when new #include lines get added to
> files.  The current tooling has issues with that, so we need to use
> __GENKSYMS__ #ifdef lines in some places to keep crc generation stable.
> Don't know if dwarf output would be susceptible to the same issues with
> that or not, but you should check.

That's a great point. DWARF output has the exact same problem if a
structure declaration later becomes a definition, but I think we can
come up with a solution for this issue too in v2.

Sami

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-07-15 20:39   ` Sami Tolvanen
  2024-07-16  7:12     ` Greg Kroah-Hartman
@ 2024-07-22  8:20     ` Petr Pavlu
  2024-07-26 21:05       ` Sami Tolvanen
  1 sibling, 1 reply; 37+ messages in thread
From: Petr Pavlu @ 2024-07-22  8:20 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Greg Kroah-Hartman, Masahiro Yamada, Luis Chamberlain,
	Miguel Ojeda, Matthew Maurer, Alex Gaynor, Wedson Almeida Filho,
	Gary Guo, linux-kbuild, linux-kernel, linux-modules,
	rust-for-linux

On 7/15/24 22:39, Sami Tolvanen wrote:
> On Wed, Jul 10, 2024 at 7:30 AM Petr Pavlu <petr.pavlu@suse.com> wrote:
>> On 6/17/24 19:58, Sami Tolvanen wrote:
>>> The first 12 patches of this series add a small tool for computing
>>> symbol versions from DWARF, called gendwarfksyms. When passed a list
>>> of exported symbols, the tool generates an expanded type string
>>> for each symbol, and computes symbol CRCs similarly to genksyms.
>>> gendwarfksyms is written in C and uses libdw to process DWARF, mainly
>>> because of the existing support for C host tools that use elfutils
>>> (e.g., objtool).
>>
>> In addition to calculating CRCs of exported symbols, genksyms has other
>> features which I think are important.
>>
>> Firstly, the genksyms tool has a human-readable storage format for input
>> data used in the calculation of symbol CRCs. Setting the make variable
>> KBUILD_SYMTYPES enables dumping this data and storing it in *.symtypes
>> files.
>>
>> When a developer later modifies the kernel and wants to check if some
>> symbols have changed, they can take these files and feed them as
>> *.symref back to genksyms. This allows the tool to provide an actual
>> reason why some symbols have changed, instead of just printing that
>> their CRCs are different.
>>
>> Is there any plan to add the same functionality to gendwarfksyms, or do
>> you envison that people will use libabigail, Symbol-Type Graph, or
>> another tool for making this type of comparison?
> 
> gendwarfksyms also uses human-readable input for the CRC calculations,
> and it prints out the input strings with the --debug option. I plan to
> hook this up to KBUILD_SYMTYPES in v2. It should be convenient enough
> to simply compare the pretty-printed output with diff, so I'm not sure
> if a built-in comparison option is needed. Any other DWARF analysis
> tool can be used to spot the differences too, as you mentioned.

From my perspective, I'm okay if gendwarfksyms doesn't provide
functionality to compare a new object file with its reference symtypes
file.

As mentioned, genksyms has this functionality but I actually think the
way it works is not ideal. Its design is to operate on one compilation
unit at the time. This has the advantage that a comparison of each file
is performed in parallel during the build, simply because of the make
job system. On the other hand, it has two problems.

The first one is that genksyms doesn't provide a comparison of the
kernel as a whole. This means that the tool gives rather scattered and
duplicated output about changed structs in the build log. Ideally, one
would like to see a single compact report about what changed at the end
of the build.

The second problem is the handling of symtypes files. This data is large
and if one wants to store them in a Git repository together with the
kernel source, it is advisable to first compress/consolidate it in some
way. This is trivial because these files typically contain many
duplicates. However, the issue is that to feed the data back to
genksyms, they need to be unpacked during each build which can take some
time.

I think a better approach is to have a tool that can be given
a consolidated symtypes file as one input and can compare it with all
new symtypes files produced during a kernel build. An example of a tool
that takes this approach is the kabi Python script in UEK [1].

A few months ago, I also started working on a tool inspired by this
script. The goal is to have similar functionality but hopefully with
a much faster implementation. Hence, this tool is written in a compiled
language (Rust at the moment) and should also become multi-threaded. I'm
hoping to find some time to make progress on it and make the code
public. It could later be added to the upstream kernel to replace the
comparison functionality implemented by genksyms, if there is interest.

So as mentioned, I'm fine if gendwarfksyms doesn't have this
functionality. However, for distributions that rely on the symtypes
format, I'd be interested in having gendwarfksyms output its dump data
in this format as well.

For example, instead of producing:

gendwarfksyms: process_exported_symbols: _some_mangled_func_name (@ XYZ)
subprogram(
   [formal parameters...]
)
-> structure_type core::result::Result<(), core::fmt::Error> {
   [a description of the structure...]
};

.. the output could be something like this:

S#'core::result::Result<(), core::fmt::Error>' structure_type core::result::Result<(), core::fmt::Error> { [a description of the structure...] }
_some_mangled_func_name subprogram _some_mangled_func_name ( [formal parameters...] ) -> S#'core::result::Result<(), core::fmt::Error>'

>> Secondly, when distributions want to maintain stable kABI, they need to
>> be able to deal with patch backports that add new members to structures.
>> One common approach is to have placeholders in important structures
>> which can be later replaced by the new members as needed. __GENKSYMS__
>> ifdefs are then used at the C source level to hide these kABI-compatible
>> changes from genksyms.
>>
>> Gendwarfksyms works on the resulting binary and so using such ifdefs
>> wouldn't work. Instead, I suspect that what is required is a mechanism
>> to tell the tool that a given change is ok, probably by allowing to
>> specify some map from the original definition to the new one.
>>
>> Is there a plan to implement something like this, or how could it be
>> addressed?
> 
> That's a great question. Here's what Android uses currently to
> maintain a stable kABI, I assume you're doing something similar?

Correct, (open)SUSE kernels have placeholders in likely-to-change
structs which can be used for new members. Or if no placeholder is
present, it might be necessary to place a new member in a gap (padding)
in the struct layout.

> 
> https://android.googlesource.com/kernel/common/+/refs/heads/android15-6.6/include/linux/android_kabi.h
> 
> If using unions here is acceptable to everyone, a simple solution
> would be to use a known name prefix for the reserved members and teach
> gendwarfksyms to only print out the original type for the replaced
> ones. For example:
> 
> The initial placeholder:
> 
>     u8 __kabi_reserved_1[8];
> 
> After replacement:
> 
>     union {
>             u64 new_member;
>             struct {
>                     u8 __kabi_reserved_1[8];
>             };
>     }
> 
> Here gendwarfksyms would see the __kabi_reserved prefix and only use
> u8 [8] for the CRC calculation. Does this sound reasonable?

I like this idea. I think it's good that the necessary kABI information
about an updated member can be expressed at the source code level in
place of the actual change, and it isn't needed to feed additional input
to the tool.

[1] https://github.com/oracle/linux-uek/blob/dbdd7f3611cb03e607e156834497dd2767103530/uek-rpm/tools/kabi

Thanks,
Petr

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-07-22  8:20     ` Petr Pavlu
@ 2024-07-26 21:05       ` Sami Tolvanen
  2024-07-31 20:46         ` Neal Gompa
  2024-08-01 11:22         ` Petr Pavlu
  0 siblings, 2 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-07-26 21:05 UTC (permalink / raw)
  To: Petr Pavlu
  Cc: Greg Kroah-Hartman, Masahiro Yamada, Luis Chamberlain,
	Miguel Ojeda, Matthew Maurer, Alex Gaynor, Wedson Almeida Filho,
	Gary Guo, linux-kbuild, linux-kernel, linux-modules,
	rust-for-linux

Hi Petr,

On Mon, Jul 22, 2024 at 8:20 AM Petr Pavlu <petr.pavlu@suse.com> wrote:
>
> From my perspective, I'm okay if gendwarfksyms doesn't provide
> functionality to compare a new object file with its reference symtypes
> file.
>
> As mentioned, genksyms has this functionality but I actually think the
> way it works is not ideal. Its design is to operate on one compilation
> unit at the time. This has the advantage that a comparison of each file
> is performed in parallel during the build, simply because of the make
> job system. On the other hand, it has two problems.
>
> The first one is that genksyms doesn't provide a comparison of the
> kernel as a whole. This means that the tool gives rather scattered and
> duplicated output about changed structs in the build log. Ideally, one
> would like to see a single compact report about what changed at the end
> of the build.

Sure, that makes sense. Android uses STG for this, which might be
useful to other folks too:

https://android.googlesource.com/platform/external/stg/
https://android.googlesource.com/platform/external/stg/+/refs/heads/main/doc/stgdiff.md#output-formats

> A few months ago, I also started working on a tool inspired by this
> script. The goal is to have similar functionality but hopefully with
> a much faster implementation. Hence, this tool is written in a compiled
> language (Rust at the moment) and should also become multi-threaded. I'm
> hoping to find some time to make progress on it and make the code
> public. It could later be added to the upstream kernel to replace the
> comparison functionality implemented by genksyms, if there is interest.
>
> So as mentioned, I'm fine if gendwarfksyms doesn't have this
> functionality. However, for distributions that rely on the symtypes
> format, I'd be interested in having gendwarfksyms output its dump data
> in this format as well.

We can definitely tweak the output format, but I'm not sure if making
it fully compatible with the genksyms symtypes format is feasible,
especially for Rust code. I also intentionally decided to use DWARF
tag names in the output instead of shorthands like s# etc. to make it
a bit more readable.

> For example, instead of producing:
>
> gendwarfksyms: process_exported_symbols: _some_mangled_func_name (@ XYZ)
> subprogram(
>    [formal parameters...]
> )
> -> structure_type core::result::Result<(), core::fmt::Error> {
>    [a description of the structure...]
> };
>
> .. the output could be something like this:
>
> S#'core::result::Result<(), core::fmt::Error>' structure_type core::result::Result<(), core::fmt::Error> { [a description of the structure...] }
> _some_mangled_func_name subprogram _some_mangled_func_name ( [formal parameters...] ) -> S#'core::result::Result<(), core::fmt::Error>'

This wouldn't be enough to make the output format compatible with
symtypes though. genksyms basically produces a simple key-value pair
database while gendwarfksyms currently outputs the fully expanded type
string for each symbol. If you need the tool to produce a type
database, it might also be worth discussing if we should use a bit
less ad hoc format in that case.

One more thing to note about the current --debug output is that it
directly correlates with the debugging information and thus may not
contain all aliases. For example, the Rust compiler deduplicates
identical function implementations (e.g. Deref::deref and
DerefMut::deref_mut etc.), but only one of the symbol names appears in
DWARF. We use symbol addresses to print out #SYMVERs also for the
aliases, but they don't show up in the debugging output right now.

> > If using unions here is acceptable to everyone, a simple solution
> > would be to use a known name prefix for the reserved members and teach
> > gendwarfksyms to only print out the original type for the replaced
> > ones. For example:
> >
> > The initial placeholder:
> >
> >     u8 __kabi_reserved_1[8];
> >
> > After replacement:
> >
> >     union {
> >             u64 new_member;
> >             struct {
> >                     u8 __kabi_reserved_1[8];
> >             };
> >     }
> >
> > Here gendwarfksyms would see the __kabi_reserved prefix and only use
> > u8 [8] for the CRC calculation. Does this sound reasonable?
>
> I like this idea. I think it's good that the necessary kABI information
> about an updated member can be expressed at the source code level in
> place of the actual change, and it isn't needed to feed additional input
> to the tool.

OK, cool. I agree that being able to specify these details in source
code is much cleaner. I'll add an implementation for this, and for the
definition visibility issue Greg mentioned in v2.

Sami

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-07-26 21:05       ` Sami Tolvanen
@ 2024-07-31 20:46         ` Neal Gompa
  2024-08-01 11:22         ` Petr Pavlu
  1 sibling, 0 replies; 37+ messages in thread
From: Neal Gompa @ 2024-07-31 20:46 UTC (permalink / raw)
  To: Petr Pavlu, Sami Tolvanen
  Cc: Greg Kroah-Hartman, Masahiro Yamada, Luis Chamberlain,
	Miguel Ojeda, Matthew Maurer, Alex Gaynor, Wedson Almeida Filho,
	Gary Guo, linux-kbuild, linux-kernel, linux-modules,
	rust-for-linux, Asahi Linux, Hector Martin, Janne Grunau

On Friday, July 26, 2024 5:05:22 PM EDT Sami Tolvanen wrote:
> Hi Petr,
> 
> On Mon, Jul 22, 2024 at 8:20 AM Petr Pavlu <petr.pavlu@suse.com> wrote:
> > From my perspective, I'm okay if gendwarfksyms doesn't provide
> > functionality to compare a new object file with its reference symtypes
> > file.
> > 
> > As mentioned, genksyms has this functionality but I actually think the
> > way it works is not ideal. Its design is to operate on one compilation
> > unit at the time. This has the advantage that a comparison of each file
> > is performed in parallel during the build, simply because of the make
> > job system. On the other hand, it has two problems.
> > 
> > The first one is that genksyms doesn't provide a comparison of the
> > kernel as a whole. This means that the tool gives rather scattered and
> > duplicated output about changed structs in the build log. Ideally, one
> > would like to see a single compact report about what changed at the end
> > of the build.
> 
> Sure, that makes sense. Android uses STG for this, which might be
> useful to other folks too:
> 
> https://android.googlesource.com/platform/external/stg/
> https://android.googlesource.com/platform/external/stg/+/refs/heads/main/doc
> /stgdiff.md#output-formats
> > A few months ago, I also started working on a tool inspired by this
> > script. The goal is to have similar functionality but hopefully with
> > a much faster implementation. Hence, this tool is written in a compiled
> > language (Rust at the moment) and should also become multi-threaded. I'm
> > hoping to find some time to make progress on it and make the code
> > public. It could later be added to the upstream kernel to replace the
> > comparison functionality implemented by genksyms, if there is interest.
> > 
> > So as mentioned, I'm fine if gendwarfksyms doesn't have this
> > functionality. However, for distributions that rely on the symtypes
> > format, I'd be interested in having gendwarfksyms output its dump data
> > in this format as well.
> 
> We can definitely tweak the output format, but I'm not sure if making
> it fully compatible with the genksyms symtypes format is feasible,
> especially for Rust code. I also intentionally decided to use DWARF
> tag names in the output instead of shorthands like s# etc. to make it
> a bit more readable.
> 
> > For example, instead of producing:
> > 
> > gendwarfksyms: process_exported_symbols: _some_mangled_func_name (@ XYZ)
> > subprogram(
> > 
> >    [formal parameters...]
> > 
> > )
> > -> structure_type core::result::Result<(), core::fmt::Error> {
> > 
> >    [a description of the structure...]
> > 
> > };
> > 
> > .. the output could be something like this:
> > 
> > S#'core::result::Result<(), core::fmt::Error>' structure_type
> > core::result::Result<(), core::fmt::Error> { [a description of the
> > structure...] } _some_mangled_func_name subprogram
> > _some_mangled_func_name ( [formal parameters...] ) ->
> > S#'core::result::Result<(), core::fmt::Error>'
> This wouldn't be enough to make the output format compatible with
> symtypes though. genksyms basically produces a simple key-value pair
> database while gendwarfksyms currently outputs the fully expanded type
> string for each symbol. If you need the tool to produce a type
> database, it might also be worth discussing if we should use a bit
> less ad hoc format in that case.
> 
> One more thing to note about the current --debug output is that it
> directly correlates with the debugging information and thus may not
> contain all aliases. For example, the Rust compiler deduplicates
> identical function implementations (e.g. Deref::deref and
> DerefMut::deref_mut etc.), but only one of the symbol names appears in
> DWARF. We use symbol addresses to print out #SYMVERs also for the
> aliases, but they don't show up in the debugging output right now.
> 
> > > If using unions here is acceptable to everyone, a simple solution
> > > would be to use a known name prefix for the reserved members and teach
> > > gendwarfksyms to only print out the original type for the replaced
> > > ones. For example:
> > > 
> > > The initial placeholder:
> > >     u8 __kabi_reserved_1[8];
> > > 
> > > After replacement:
> > >     union {
> > >     
> > >             u64 new_member;
> > >             struct {
> > >             
> > >                     u8 __kabi_reserved_1[8];
> > >             
> > >             };
> > >     
> > >     }
> > > 
> > > Here gendwarfksyms would see the __kabi_reserved prefix and only use
> > > u8 [8] for the CRC calculation. Does this sound reasonable?
> > 
> > I like this idea. I think it's good that the necessary kABI information
> > about an updated member can be expressed at the source code level in
> > place of the actual change, and it isn't needed to feed additional input
> > to the tool.
> 
> OK, cool. I agree that being able to specify these details in source
> code is much cleaner. I'll add an implementation for this, and for the
> definition visibility issue Greg mentioned in v2.
> 

Could you please add myself, Hector, and Janne (along with the Asahi Linux 
mailing list) to the recipients when you send the v2 patch set? We're also 
interested in this. :)

Thanks in advance!


-- 
真実はいつも一つ!/ Always, there's only one truth!



^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-07-26 21:05       ` Sami Tolvanen
  2024-07-31 20:46         ` Neal Gompa
@ 2024-08-01 11:22         ` Petr Pavlu
  2024-08-01 19:38           ` Sami Tolvanen
  1 sibling, 1 reply; 37+ messages in thread
From: Petr Pavlu @ 2024-08-01 11:22 UTC (permalink / raw)
  To: Sami Tolvanen
  Cc: Greg Kroah-Hartman, Masahiro Yamada, Luis Chamberlain,
	Miguel Ojeda, Matthew Maurer, Alex Gaynor, Wedson Almeida Filho,
	Gary Guo, linux-kbuild, linux-kernel, linux-modules,
	rust-for-linux

On 7/26/24 23:05, Sami Tolvanen wrote:
> On Mon, Jul 22, 2024 at 8:20 AM Petr Pavlu <petr.pavlu@suse.com> wrote:
>>
>> From my perspective, I'm okay if gendwarfksyms doesn't provide
>> functionality to compare a new object file with its reference symtypes
>> file.
>>
>> As mentioned, genksyms has this functionality but I actually think the
>> way it works is not ideal. Its design is to operate on one compilation
>> unit at the time. This has the advantage that a comparison of each file
>> is performed in parallel during the build, simply because of the make
>> job system. On the other hand, it has two problems.
>>
>> The first one is that genksyms doesn't provide a comparison of the
>> kernel as a whole. This means that the tool gives rather scattered and
>> duplicated output about changed structs in the build log. Ideally, one
>> would like to see a single compact report about what changed at the end
>> of the build.
> 
> Sure, that makes sense. Android uses STG for this, which might be
> useful to other folks too:
> 
> https://android.googlesource.com/platform/external/stg/
> https://android.googlesource.com/platform/external/stg/+/refs/heads/main/doc/stgdiff.md#output-formats

STG is an interesting tool. I've played with it a bit last year. To be
frank, I was surprised to see a new tool being proposed by Google to
generate modversion CRCs from DWARF instead of potentially extending
your STG project for this purpose. I'm not sure if it is something that
you folks have considered and evaluated.

>> A few months ago, I also started working on a tool inspired by this
>> script. The goal is to have similar functionality but hopefully with
>> a much faster implementation. Hence, this tool is written in a compiled
>> language (Rust at the moment) and should also become multi-threaded. I'm
>> hoping to find some time to make progress on it and make the code
>> public. It could later be added to the upstream kernel to replace the
>> comparison functionality implemented by genksyms, if there is interest.
>>
>> So as mentioned, I'm fine if gendwarfksyms doesn't have this
>> functionality. However, for distributions that rely on the symtypes
>> format, I'd be interested in having gendwarfksyms output its dump data
>> in this format as well.
> 
> We can definitely tweak the output format, but I'm not sure if making
> it fully compatible with the genksyms symtypes format is feasible,
> especially for Rust code. I also intentionally decided to use DWARF
> tag names in the output instead of shorthands like s# etc. to make it
> a bit more readable.

Sure, it might be necessary to extend the symtypes format a bit, for
example, by allowing spaces in type names. What other problems do you
see?

The example I showed preserves the DWARF tag names in type descriptions.
Cross-references and the target type names use the s# prefix as they
they need to be distinguished from other tokens.

>> For example, instead of producing:
>>
>> gendwarfksyms: process_exported_symbols: _some_mangled_func_name (@ XYZ)
>> subprogram(
>>    [formal parameters...]
>> )
>> -> structure_type core::result::Result<(), core::fmt::Error> {
>>    [a description of the structure...]
>> };
>>
>> .. the output could be something like this:
>>
>> S#'core::result::Result<(), core::fmt::Error>' structure_type core::result::Result<(), core::fmt::Error> { [a description of the structure...] }
>> _some_mangled_func_name subprogram _some_mangled_func_name ( [formal parameters...] ) -> S#'core::result::Result<(), core::fmt::Error>'
> 
> This wouldn't be enough to make the output format compatible with
> symtypes though. genksyms basically produces a simple key-value pair
> database while gendwarfksyms currently outputs the fully expanded type
> string for each symbol. If you need the tool to produce a type
> database, it might also be worth discussing if we should use a bit
> less ad hoc format in that case.

What I think is needed is the ability to compare an updated kernel with
some previous reference and have an output that clearly and accurately
shows why CRCs of some symbols changed. The previous reference should be
possible to store in Git together with the kernel source. It means it
should be ideally some text format and limited in size. This is what
distributions that care about stable kABI do in some form currently.

This functionality would be needed if some distribution wants to
maintain stable Rust kABI (not sure if it is actually feasible), or if
the idea is for gendwarfksyms to be a general tool that could replace
genksyms. I assume for the sake of argument that this is the case.

Gendwarfksyms could implement this functionality on its own, or as
discussed, I believe it could provide a symtypes-like dump and a second
tool could be used to work with this format and for comparing it.

From my point of view, the current --debug format is not suitable for
this purpose because its expanded and unstructured form means it is
bloated and hard to compare with a previous reference.

I'm also not quite yet sold on using separate DWARF tooling, such as
libabigail or STG, to actually understand why gendwarfksyms produced
a different CRC for some symbol. Using these tools makes sense in the
genksyms world, where genksyms operates on the source code level and
this additional tooling can only work on debug data.

With gendwarfksyms working directly with DWARF data, it doesn't seem
appealing to me to first run gendwarfksyms to produce CRCs, compare them
with their reference, and if they are different, use a second tool to
process the same DWARF data again and with some luck hopefully get an
actual answer why the CRCs changed. I'm worried that users might
encounter inaccurate answers if the two tools interpret the input data
differently.

> 
> One more thing to note about the current --debug output is that it
> directly correlates with the debugging information and thus may not
> contain all aliases. For example, the Rust compiler deduplicates
> identical function implementations (e.g. Deref::deref and
> DerefMut::deref_mut etc.), but only one of the symbol names appears in
> DWARF. We use symbol addresses to print out #SYMVERs also for the
> aliases, but they don't show up in the debugging output right now.

Thanks,
Petr

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/15] Implement MODVERSIONS for Rust
  2024-08-01 11:22         ` Petr Pavlu
@ 2024-08-01 19:38           ` Sami Tolvanen
  0 siblings, 0 replies; 37+ messages in thread
From: Sami Tolvanen @ 2024-08-01 19:38 UTC (permalink / raw)
  To: Petr Pavlu
  Cc: Greg Kroah-Hartman, Masahiro Yamada, Luis Chamberlain,
	Miguel Ojeda, Matthew Maurer, Alex Gaynor, Wedson Almeida Filho,
	Gary Guo, linux-kbuild, linux-kernel, linux-modules,
	rust-for-linux

Hi Petr,

On Thu, Aug 1, 2024 at 4:22 AM Petr Pavlu <petr.pavlu@suse.com> wrote:
>
> STG is an interesting tool. I've played with it a bit last year. To be
> frank, I was surprised to see a new tool being proposed by Google to
> generate modversion CRCs from DWARF instead of potentially extending
> your STG project for this purpose. I'm not sure if it is something that
> you folks have considered and evaluated.

Yes. STG is fairly large and written in C++, so it didn't seem like a
great candidate for inclusion in the kernel tree, and it's presumably
not shipped by most distributions, so it's not really ideal as a
dependency either.

> Sure, it might be necessary to extend the symtypes format a bit, for
> example, by allowing spaces in type names. What other problems do you
> see?
>
> The example I showed preserves the DWARF tag names in type descriptions.
> Cross-references and the target type names use the s# prefix as they
> they need to be distinguished from other tokens.

What I meant is that genksyms output is basically preprocessed C with
references scattered in, so if you want something that's fully
backwards compatible, that might not be possible. However, if you're
happy with just the same database structure and don't care about the
exact type description format beyond that, that should be doable, but
like you said, still requires extending the format to support more
complex type names.

> What I think is needed is the ability to compare an updated kernel with
> some previous reference and have an output that clearly and accurately
> shows why CRCs of some symbols changed. The previous reference should be
> possible to store in Git together with the kernel source. It means it
> should be ideally some text format and limited in size. This is what
> distributions that care about stable kABI do in some form currently.
>
> This functionality would be needed if some distribution wants to
> maintain stable Rust kABI (not sure if it is actually feasible), or if
> the idea is for gendwarfksyms to be a general tool that could replace
> genksyms. I assume for the sake of argument that this is the case.
>
> Gendwarfksyms could implement this functionality on its own, or as
> discussed, I believe it could provide a symtypes-like dump and a second
> tool could be used to work with this format and for comparing it.
>
> From my point of view, the current --debug format is not suitable for
> this purpose because its expanded and unstructured form means it is
> bloated and hard to compare with a previous reference.

I do see your point, there's a ton of duplication in the --debug
output as each symbol expands all the types it uses.

> I'm also not quite yet sold on using separate DWARF tooling, such as
> libabigail or STG, to actually understand why gendwarfksyms produced
> a different CRC for some symbol. Using these tools makes sense in the
> genksyms world, where genksyms operates on the source code level and
> this additional tooling can only work on debug data.
>
> With gendwarfksyms working directly with DWARF data, it doesn't seem
> appealing to me to first run gendwarfksyms to produce CRCs, compare them
> with their reference, and if they are different, use a second tool to
> process the same DWARF data again and with some luck hopefully get an
> actual answer why the CRCs changed. I'm worried that users might
> encounter inaccurate answers if the two tools interpret the input data
> differently.

I agree, there might be other DWARF changes that we don't care about,
but that nevertheless make it harder to figure out what caused the
actual CRC to change.

My initial thought was that simply running a diff between the --debug
output would be sufficient, as it would directly tell you what
changed, but I didn't consider that the size of the output dump would
be of concern as well. I'll look into adding a symtypes-style output
format in the next version. Thanks again for the feedback!

Sami

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2024-08-01 19:39 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-17 17:58 [PATCH 00/15] Implement MODVERSIONS for Rust Sami Tolvanen
2024-06-17 17:58 ` [PATCH 01/15] tools: Add gendwarfksyms Sami Tolvanen
2024-06-17 17:58 ` [PATCH 02/15] gendwarfksyms: Add symbol list input handling Sami Tolvanen
2024-06-17 17:58 ` [PATCH 03/15] gendwarfksyms: Add CRC calculation Sami Tolvanen
2024-06-17 17:58 ` [PATCH 04/15] gendwarfksyms: Expand base_type Sami Tolvanen
2024-06-17 17:58 ` [PATCH 05/15] gendwarfksyms: Add a cache Sami Tolvanen
2024-06-17 17:58 ` [PATCH 06/15] gendwarfksyms: Expand type modifiers and typedefs Sami Tolvanen
2024-06-17 17:58 ` [PATCH 07/15] gendwarfksyms: Add pretty-printing Sami Tolvanen
2024-06-17 17:58 ` [PATCH 08/15] gendwarfksyms: Expand subroutine_type Sami Tolvanen
2024-06-17 17:58 ` [PATCH 09/15] gendwarfksyms: Expand array_type Sami Tolvanen
2024-06-17 17:58 ` [PATCH 10/15] gendwarfksyms: Expand structure types Sami Tolvanen
2024-06-17 17:58 ` [PATCH 11/15] gendwarfksyms: Limit structure expansion Sami Tolvanen
2024-06-17 17:58 ` [PATCH 12/15] gendwarfksyms: Add inline debugging Sami Tolvanen
2024-06-17 17:58 ` [PATCH 13/15] modpost: Add support for hashing long symbol names Sami Tolvanen
2024-06-18 16:47   ` Masahiro Yamada
2024-06-18 20:07     ` Sami Tolvanen
2024-06-17 17:58 ` [PATCH 14/15] module: Support hashed symbol names when checking modversions Sami Tolvanen
2024-06-17 17:58 ` [PATCH 15/15] kbuild: Use gendwarfksyms to generate Rust symbol versions Sami Tolvanen
2024-06-18 16:28 ` [PATCH 00/15] Implement MODVERSIONS for Rust Masahiro Yamada
2024-06-18 20:05   ` Sami Tolvanen
2024-06-18 16:44 ` Greg Kroah-Hartman
2024-06-18 16:50   ` Masahiro Yamada
2024-06-18 17:18     ` Greg Kroah-Hartman
2024-06-18 19:03       ` Masahiro Yamada
2024-06-18 20:19         ` Sami Tolvanen
2024-06-18 19:42 ` Luis Chamberlain
2024-06-18 21:19   ` Sami Tolvanen
2024-06-18 23:32     ` Luis Chamberlain
2024-07-10  7:30 ` Petr Pavlu
2024-07-15 20:39   ` Sami Tolvanen
2024-07-16  7:12     ` Greg Kroah-Hartman
2024-07-18 17:04       ` Sami Tolvanen
2024-07-22  8:20     ` Petr Pavlu
2024-07-26 21:05       ` Sami Tolvanen
2024-07-31 20:46         ` Neal Gompa
2024-08-01 11:22         ` Petr Pavlu
2024-08-01 19:38           ` Sami Tolvanen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).