* [PATCH v4][makedumpfile 1/7] Reserve sections for makedumpfile and extensions
2026-03-17 15:07 [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering Tao Liu
@ 2026-03-17 15:07 ` Tao Liu
2026-04-02 23:31 ` Stephen Brennan
2026-04-03 8:10 ` HAGIO KAZUHITO(萩尾 一仁)
2026-03-17 15:07 ` [PATCH v4][makedumpfile 2/7] Implement kernel kallsyms resolving Tao Liu
` (7 subsequent siblings)
8 siblings, 2 replies; 21+ messages in thread
From: Tao Liu @ 2026-03-17 15:07 UTC
To: yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, stephen.s.brennan, Tao Liu
This patch prepares makedumpfile and its extensions for btf/kallsyms
support. Each kernel symbol or type that is needed is recorded in a
dedicated section: .init_ksyms for kallsyms symbols and .init_ktypes
for kernel types. During makedumpfile's kallsyms/btf initialization,
the entries in these sections are resolved. A linker script,
makedumpfile.ld, is introduced for this purpose.
Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
---
Makefile | 2 +-
makedumpfile.ld | 15 +++++++++++++++
2 files changed, 16 insertions(+), 1 deletion(-)
create mode 100644 makedumpfile.ld
diff --git a/Makefile b/Makefile
index 05ab5f2..15a4ba0 100644
--- a/Makefile
+++ b/Makefile
@@ -113,7 +113,7 @@ $(OBJ_ARCH): $(SRC_ARCH)
$(CC) $(CFLAGS_ARCH) -c -o ./$@ $(VPATH)$(@:.o=.c)
makedumpfile: $(SRC_BASE) $(OBJ_PART) $(OBJ_ARCH)
- $(CC) $(CFLAGS) $(LDFLAGS) $(OBJ_PART) $(OBJ_ARCH) -rdynamic -o $@ $< $(LIBS)
+ $(CC) $(CFLAGS) $(LDFLAGS) $(OBJ_PART) $(OBJ_ARCH) -rdynamic -Wl,-T,makedumpfile.ld -o $@ $< $(LIBS)
@sed -e "s/@DATE@/$(DATE)/" \
-e "s/@VERSION@/$(VERSION)/" \
$(VPATH)makedumpfile.8.in > $(VPATH)makedumpfile.8
diff --git a/makedumpfile.ld b/makedumpfile.ld
new file mode 100644
index 0000000..231a162
--- /dev/null
+++ b/makedumpfile.ld
@@ -0,0 +1,15 @@
+SECTIONS
+{
+ .init_ksyms ALIGN(8) : {
+ __start_init_ksyms = .;
+ KEEP(*(.init_ksyms*))
+ __stop_init_ksyms = .;
+ }
+
+ .init_ktypes ALIGN(8) : {
+ __start_init_ktypes = .;
+ KEEP(*(.init_ktypes*))
+ __stop_init_ktypes = .;
+ }
+}
+INSERT AFTER .data;
\ No newline at end of file
--
2.47.0
* Re: [PATCH v4][makedumpfile 1/7] Reserve sections for makedumpfile and extensions
2026-03-17 15:07 ` [PATCH v4][makedumpfile 1/7] Reserve sections for makedumpfile and extensions Tao Liu
@ 2026-04-02 23:31 ` Stephen Brennan
2026-04-03 8:10 ` HAGIO KAZUHITO(萩尾 一仁)
1 sibling, 0 replies; 21+ messages in thread
From: Stephen Brennan @ 2026-04-02 23:31 UTC
To: Tao Liu, yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, Tao Liu
Tao Liu <ltao@redhat.com> writes:
> This patch makes preparation for btf/kallsyms support of
> makedumpfile and extensions. Any needed kernel symbols/types
> will be reserved within a special section, .init_ksyms for
> kallsyms symbols and .init_ktypes for kernel types. During
> makedumpfile kallsyms/btf initialization, those missing info
> will be resolved. A makedumpfile.ld script is introduced for the
> purpose.
>
> Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> Signed-off-by: Tao Liu <ltao@redhat.com>
As I mentioned before, I don't think this patch is strictly necessary.
If we rename the sections to avoid the "." at the start of the names,
then the compiler should automatically provide the start and stop
symbols. But that said, there's nothing wrong with doing it this way,
and it is a bit more explicit, so I see the value in keeping it.
Reviewed-by: Stephen Brennan <stephen.s.brennan@oracle.com>
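For illustration, the alternative mentioned above can be sketched as follows. This is a minimal example, not code from the series: it assumes an ELF target linked with GNU ld or LLD, which define `__start_<name>` and `__stop_<name>` for any section whose name is a valid C identifier. The section name `demo_ksyms`, the type `demo_sym`, and the `DEMO_SYM` macro are hypothetical stand-ins for `.init_ksyms`, `struct ksym_info`, and `INIT_KERN_SYM`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record type, standing in for struct ksym_info. */
struct demo_sym {
	const char *name;
};

/*
 * "demo_ksyms" has no leading dot, so it is a valid C identifier and the
 * linker defines __start_demo_ksyms / __stop_demo_ksyms without any
 * linker script.
 */
#define DEMO_SYM(SYM)							\
	static struct demo_sym demo_##SYM = { #SYM };			\
	static struct demo_sym *demo_ptr_##SYM				\
	__attribute__((section("demo_ksyms"), used)) = &demo_##SYM

DEMO_SYM(init_task);
DEMO_SYM(jiffies);

/* Linker-provided bounds of the pointer array stored in the section. */
extern struct demo_sym *__start_demo_ksyms[];
extern struct demo_sym *__stop_demo_ksyms[];

/* Walk the section and count the registered entries. */
size_t demo_sym_count(void)
{
	size_t n = 0;

	for (struct demo_sym **p = __start_demo_ksyms;
	     p < __stop_demo_ksyms; p++) {
		assert((*p)->name != NULL);
		n++;
	}
	return n;
}
```

With the two registrations above, `demo_sym_count()` walks two pointers. The explicit linker script in this patch gives the same bounds for the dot-prefixed names, which the linker would not generate symbols for on its own.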
> ---
> Makefile | 2 +-
> makedumpfile.ld | 15 +++++++++++++++
> 2 files changed, 16 insertions(+), 1 deletion(-)
> create mode 100644 makedumpfile.ld
>
> diff --git a/Makefile b/Makefile
> index 05ab5f2..15a4ba0 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -113,7 +113,7 @@ $(OBJ_ARCH): $(SRC_ARCH)
> $(CC) $(CFLAGS_ARCH) -c -o ./$@ $(VPATH)$(@:.o=.c)
>
> makedumpfile: $(SRC_BASE) $(OBJ_PART) $(OBJ_ARCH)
> - $(CC) $(CFLAGS) $(LDFLAGS) $(OBJ_PART) $(OBJ_ARCH) -rdynamic -o $@ $< $(LIBS)
> + $(CC) $(CFLAGS) $(LDFLAGS) $(OBJ_PART) $(OBJ_ARCH) -rdynamic -Wl,-T,makedumpfile.ld -o $@ $< $(LIBS)
> @sed -e "s/@DATE@/$(DATE)/" \
> -e "s/@VERSION@/$(VERSION)/" \
> $(VPATH)makedumpfile.8.in > $(VPATH)makedumpfile.8
> diff --git a/makedumpfile.ld b/makedumpfile.ld
> new file mode 100644
> index 0000000..231a162
> --- /dev/null
> +++ b/makedumpfile.ld
> @@ -0,0 +1,15 @@
> +SECTIONS
> +{
> + .init_ksyms ALIGN(8) : {
> + __start_init_ksyms = .;
> + KEEP(*(.init_ksyms*))
> + __stop_init_ksyms = .;
> + }
> +
> + .init_ktypes ALIGN(8) : {
> + __start_init_ktypes = .;
> + KEEP(*(.init_ktypes*))
> + __stop_init_ktypes = .;
> + }
> +}
> +INSERT AFTER .data;
> \ No newline at end of file
> --
> 2.47.0
* Re: [PATCH v4][makedumpfile 1/7] Reserve sections for makedumpfile and extensions
2026-03-17 15:07 ` [PATCH v4][makedumpfile 1/7] Reserve sections for makedumpfile and extensions Tao Liu
2026-04-02 23:31 ` Stephen Brennan
@ 2026-04-03 8:10 ` HAGIO KAZUHITO(萩尾 一仁)
1 sibling, 0 replies; 21+ messages in thread
From: HAGIO KAZUHITO(萩尾 一仁) @ 2026-04-03 8:10 UTC
To: Tao Liu
Cc: stephen.s.brennan@oracle.com,
YAMAZAKI MASAMITSU(山崎 真光),
kexec@lists.infradead.org
On 2026/03/18 0:07, Tao Liu wrote:
> This patch makes preparation for btf/kallsyms support of
> makedumpfile and extensions. Any needed kernel symbols/types
> will be reserved within a special section, .init_ksyms for
> kallsyms symbols and .init_ktypes for kernel types. During
> makedumpfile kallsyms/btf initialization, those missing info
> will be resolved. A makedumpfile.ld script is introduced for the
> purpose.
>
> Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> Signed-off-by: Tao Liu <ltao@redhat.com>
> ---
> Makefile | 2 +-
> makedumpfile.ld | 15 +++++++++++++++
> 2 files changed, 16 insertions(+), 1 deletion(-)
> create mode 100644 makedumpfile.ld
>
> diff --git a/Makefile b/Makefile
> index 05ab5f2..15a4ba0 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -113,7 +113,7 @@ $(OBJ_ARCH): $(SRC_ARCH)
> $(CC) $(CFLAGS_ARCH) -c -o ./$@ $(VPATH)$(@:.o=.c)
>
> makedumpfile: $(SRC_BASE) $(OBJ_PART) $(OBJ_ARCH)
> - $(CC) $(CFLAGS) $(LDFLAGS) $(OBJ_PART) $(OBJ_ARCH) -rdynamic -o $@ $< $(LIBS)
> + $(CC) $(CFLAGS) $(LDFLAGS) $(OBJ_PART) $(OBJ_ARCH) -rdynamic -Wl,-T,makedumpfile.ld -o $@ $< $(LIBS)
> @sed -e "s/@DATE@/$(DATE)/" \
> -e "s/@VERSION@/$(VERSION)/" \
> $(VPATH)makedumpfile.8.in > $(VPATH)makedumpfile.8
> diff --git a/makedumpfile.ld b/makedumpfile.ld
> new file mode 100644
> index 0000000..231a162
> --- /dev/null
> +++ b/makedumpfile.ld
> @@ -0,0 +1,15 @@
> +SECTIONS
> +{
> + .init_ksyms ALIGN(8) : {
> + __start_init_ksyms = .;
> + KEEP(*(.init_ksyms*))
> + __stop_init_ksyms = .;
> + }
> +
> + .init_ktypes ALIGN(8) : {
> + __start_init_ktypes = .;
> + KEEP(*(.init_ktypes*))
> + __stop_init_ktypes = .;
> + }
> +}
> +INSERT AFTER .data;
> \ No newline at end of file
Could you please add a trailing newline to the files that have this line?
Otherwise the command prompt is shown on the same line, like this:
[root@rhel97u makedumpfile]# cat makedumpfile.ld
...
INSERT AFTER .data;[root@rhel97u makedumpfile]#
Thanks,
Kazu
* [PATCH v4][makedumpfile 2/7] Implement kernel kallsyms resolving
2026-03-17 15:07 [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering Tao Liu
2026-03-17 15:07 ` [PATCH v4][makedumpfile 1/7] Reserve sections for makedumpfile and extensions Tao Liu
@ 2026-03-17 15:07 ` Tao Liu
2026-04-02 23:32 ` Stephen Brennan
2026-04-03 8:12 ` HAGIO KAZUHITO(萩尾 一仁)
2026-03-17 15:07 ` [PATCH v4][makedumpfile 3/7] Implement kernel btf resolving Tao Liu
` (6 subsequent siblings)
8 siblings, 2 replies; 21+ messages in thread
From: Tao Liu @ 2026-03-17 15:07 UTC
To: yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, stephen.s.brennan, Tao Liu
This patch parses the kernel's kallsyms data. During parsing, the
.init_ksyms sections of makedumpfile and its extensions are iterated,
so that the kallsyms symbols which belong to vmlinux can be resolved
at that point.
Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
---
Makefile | 2 +-
kallsyms.c | 350 +++++++++++++++++++++++++++++++++++++++++++++++++
kallsyms.h | 91 +++++++++++++
makedumpfile.c | 3 +
makedumpfile.h | 11 ++
5 files changed, 456 insertions(+), 1 deletion(-)
create mode 100644 kallsyms.c
create mode 100644 kallsyms.h
diff --git a/Makefile b/Makefile
index 15a4ba0..a57185e 100644
--- a/Makefile
+++ b/Makefile
@@ -45,7 +45,7 @@ CFLAGS_ARCH += -m32
endif
SRC_BASE = makedumpfile.c makedumpfile.h diskdump_mod.h sadump_mod.h sadump_info.h
-SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c
+SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c
OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c arch/mips64.c arch/loongarch64.c arch/riscv64.c
OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
diff --git a/kallsyms.c b/kallsyms.c
new file mode 100644
index 0000000..f7737cb
--- /dev/null
+++ b/kallsyms.c
@@ -0,0 +1,350 @@
+#include <stdint.h>
+#include <stdbool.h>
+#include <string.h>
+#include "makedumpfile.h"
+#include "kallsyms.h"
+
+static uint32_t *kallsyms_offsets = NULL;
+static uint16_t *kallsyms_token_index = NULL;
+static uint8_t *kallsyms_token_table = NULL;
+static uint8_t *kallsyms_names = NULL;
+static unsigned long kallsyms_relative_base = 0;
+static unsigned int kallsyms_num_syms = 0;
+
+/* makedumpfile & extensions' .init_ksyms section range array */
+static struct section_range **sr = NULL;
+static int sr_len = 0;
+static int sr_cap = 0;
+
+/* Which mod's kallsyms should be inited? */
+static char **mods = NULL;
+static int mods_len = 0;
+static int mods_cap = 0;
+
+INIT_KERN_SYM(_stext);
+
+/*
+ * Utility: add elem to arr, which can auto extend its capacity.
+ * (*arr) is a pointer array, holding pointers of elem
+*/
+bool add_to_arr(void ***arr, int *arr_len, int *arr_cap, void *elem)
+{
+ void *tmp;
+ int new_cap = 0;
+
+ if (*arr == NULL) {
+ *arr_len = 0;
+ new_cap = 4;
+ } else if (*arr_len >= *arr_cap) {
+ new_cap = (*arr_cap) + ((*arr_cap) >> 1);
+ }
+
+ if (new_cap) {
+ tmp = reallocarray(*arr, new_cap, sizeof(void *));
+ if (!tmp)
+ goto no_mem;
+ *arr = tmp;
+ *arr_cap = new_cap;
+ }
+
+ (*arr)[(*arr_len)++] = elem;
+ return true;
+
+no_mem:
+ fprintf(stderr, "%s: Not enough memory!\n", __func__);
+ return false;
+}
+
+/*
+ * Utility: add uniq string to arr, which can auto extend its capacity.
+*/
+bool push_uniq_str(void ***arr, int *arr_len, int *arr_cap, char *str)
+{
+ for (int i = 0; i < (*arr_len); i++) {
+ if (!strcmp((*arr)[i], str))
+ /* String already exists, skip it */
+ return true;
+ }
+ return add_to_arr(arr, arr_len, arr_cap, str);
+}
+
+static bool add_ksym_modname(char *modname)
+{
+ return push_uniq_str((void ***)&mods, &mods_len, &mods_cap, modname);
+}
+
+bool check_ksyms_require_modname(char *modname, int *total)
+{
+ if (total)
+ *total = mods_len;
+ for (int i = 0; i < mods_len; i++) {
+ if (!strcmp(modname, mods[i]))
+ return true;
+ }
+ return false;
+}
+
+static void cleanup_ksyms_modname(void)
+{
+ if (mods) {
+ free(mods);
+ mods = NULL;
+ }
+ mods_len = 0;
+ mods_cap = 0;
+}
+
+/*
+ * Used by makedumpfile and extensions, to register their .init_ksyms section.
+ * so kallsyms can know which module/sym should be inited.
+*/
+REGISTER_SECTION(ksym)
+
+static void cleanup_ksyms_section_range(void)
+{
+ for (int i = 0; i < sr_len; i++) {
+ free(sr[i]);
+ }
+ if (sr) {
+ free(sr);
+ sr = NULL;
+ }
+ sr_len = 0;
+ sr_cap = 0;
+}
+
+static uint64_t absolute_percpu(uint64_t base, int32_t val)
+{
+ if (val >= 0)
+ return (uint64_t)val;
+ else
+ return base - 1 - val;
+}
+
+static uint64_t calc_addr_absolute_percpu(struct ksym_info *p)
+{
+ return absolute_percpu(kallsyms_relative_base, p->value);
+}
+
+static uint64_t calc_addr_relative_base(struct ksym_info *p)
+{
+ return p->value + kallsyms_relative_base;
+}
+
+static uint64_t calc_addr_place_relative(struct ksym_info *p)
+{
+ return SYMBOL(kallsyms_offsets) + p->index * sizeof(uint32_t) +
+ (int32_t)kallsyms_offsets[p->index];
+}
+
+#define BUFLEN 1024
+static bool parse_kernel_kallsyms(void)
+{
+ char buf[BUFLEN];
+ int index = 0, i, j;
+ uint8_t *compressd_data;
+ uint8_t *uncompressd_data;
+ uint16_t len, len_old; /* a 2-byte length needs up to 14 bits */
+ struct ksym_info **p;
+ uint64_t (*calc_addr)(struct ksym_info *);
+ struct ksym_info *stext_p;
+
+ for (i = 0; i < kallsyms_num_syms; i++) {
+ memset(buf, 0, BUFLEN);
+ len = kallsyms_names[index];
+ if (len & 0x80) {
+ index++;
+ len_old = len;
+ len = kallsyms_names[index];
+ if (len & 0x80) {
+ fprintf(stderr, "%s: BUG! Unexpected 3-byte length,"
+ " should be detected in init_kernel_kallsyms()\n",
+ __func__);
+ goto out;
+ }
+ len = (len_old & 0x7F) | (len << 7);
+ }
+ index++;
+
+ compressd_data = &kallsyms_names[index];
+ index += len;
+ while (len--) {
+ uncompressd_data = &kallsyms_token_table[kallsyms_token_index[*compressd_data]];
+ if (strlen(buf) + strlen((char *)uncompressd_data) >= BUFLEN) {
+ goto next_symbol;
+ }
+ strcat(buf, (char *)uncompressd_data);
+ compressd_data++;
+ }
+
+ /* Now check whether this is a symbol we want */
+ for (j = 0; j < sr_len; j++) {
+ for (p = (struct ksym_info **)(sr[j]->start);
+ p < (struct ksym_info **)(sr[j]->stop);
+ p++) {
+ if (!strcmp((*p)->modname, "vmlinux") &&
+ !strcmp((*p)->symname, &buf[1])) {
+ (*p)->value = kallsyms_offsets[i];
+ (*p)->index = i;
+ }
+ }
+ }
+next_symbol: ; /* before C23, a label must be followed by a statement */
+ }
+
+ /* Determine how absolute kallsyms addresses are calculated.
+ *
+ * For a full description of each approach, see:
+ * https://github.com/osandov/drgn/commit/744f36ec3c3f64d7e1323a0037898158698585c4
+ */
+ if (!KERN_SYM_EXIST(_stext)) {
+ fprintf(stderr, "%s: symbol _stext not found!\n", __func__);
+ goto out;
+ }
+
+ stext_p = GET_KERN_SYM_PTR(_stext);
+
+ if (SYMBOL(_stext) == calc_addr_absolute_percpu(stext_p)) {
+ calc_addr = calc_addr_absolute_percpu;
+ } else if (SYMBOL(_stext) == calc_addr_relative_base(stext_p)) {
+ calc_addr = calc_addr_relative_base;
+ } else if (SYMBOL(_stext) == calc_addr_place_relative(stext_p)) {
+ calc_addr = calc_addr_place_relative;
+ } else {
+ fprintf(stderr, "%s: Wrong calculate kallsyms symbol value!\n", __func__);
+ goto out;
+ }
+
+ /* Now do the calc */
+ for (j = 0; j < sr_len; j++) {
+ for (p = (struct ksym_info **)(sr[j]->start);
+ p < (struct ksym_info **)(sr[j]->stop);
+ p++) {
+ if (!strcmp((*p)->modname, "vmlinux") &&
+ SYM_EXIST(*p)) {
+ (*p)->value = calc_addr(*p);
+ }
+ }
+ }
+
+ return true;
+out:
+ return false;
+}
+
+static bool vmcore_info_ready = false;
+
+bool read_vmcoreinfo_kallsyms(void)
+{
+ READ_SYMBOL("kallsyms_names", kallsyms_names);
+ READ_SYMBOL("kallsyms_num_syms", kallsyms_num_syms);
+ READ_SYMBOL("kallsyms_token_table", kallsyms_token_table);
+ READ_SYMBOL("kallsyms_token_index", kallsyms_token_index);
+ READ_SYMBOL("kallsyms_offsets", kallsyms_offsets);
+ READ_SYMBOL("kallsyms_relative_base", kallsyms_relative_base);
+ vmcore_info_ready = true;
+ return true;
+}
+
+/*
+ * Makedumpfile's .init_ksyms section
+*/
+extern struct ksym_info *__start_init_ksyms[];
+extern struct ksym_info *__stop_init_ksyms[];
+
+bool init_kernel_kallsyms(void)
+{
+ const int token_index_size = (UINT8_MAX + 1) * sizeof(uint16_t);
+ uint64_t last_token, len;
+ unsigned char data, data_old;
+ int i;
+ bool ret = false;
+
+ if (vmcore_info_ready == false) {
+ fprintf(stderr, "%s: vmcoreinfo not ready for kallsyms!\n",
+ __func__);
+ return ret;
+ }
+
+ if (!register_ksym_section((char *)__start_init_ksyms,
+ (char *)__stop_init_ksyms))
+ return ret;
+
+ readmem(VADDR, SYMBOL(kallsyms_num_syms), &kallsyms_num_syms,
+ sizeof(kallsyms_num_syms));
+ if (SYMBOL(kallsyms_relative_base) != NOT_FOUND_SYMBOL)
+ readmem(VADDR, SYMBOL(kallsyms_relative_base),
+ &kallsyms_relative_base, sizeof(kallsyms_relative_base));
+
+ kallsyms_offsets = malloc(sizeof(uint32_t) * kallsyms_num_syms);
+ if (!kallsyms_offsets)
+ goto no_mem;
+ readmem(VADDR, SYMBOL(kallsyms_offsets), kallsyms_offsets,
+ kallsyms_num_syms * sizeof(uint32_t));
+
+ kallsyms_token_index = malloc(token_index_size);
+ if (!kallsyms_token_index)
+ goto no_mem;
+ readmem(VADDR, SYMBOL(kallsyms_token_index), kallsyms_token_index,
+ token_index_size);
+
+ last_token = SYMBOL(kallsyms_token_table) + kallsyms_token_index[UINT8_MAX];
+ do {
+ readmem(VADDR, last_token++, &data, 1);
+ } while(data);
+ len = last_token - SYMBOL(kallsyms_token_table);
+ kallsyms_token_table = malloc(len);
+ if (!kallsyms_token_table)
+ goto no_mem;
+ readmem(VADDR, SYMBOL(kallsyms_token_table), kallsyms_token_table, len);
+
+ for (len = 0, i = 0; i < kallsyms_num_syms; i++) {
+ readmem(VADDR, SYMBOL(kallsyms_names) + len, &data, 1);
+ /*
+ * The 2-byte representation was added in commit 73bbb94466fd3
+ * ("kallsyms: support "big" kernel symbols") in v6.1, thus for
+ * v6.1+, they indicate a long symbol, but for kernel versions
+ * prior to v6.1, they might be ambiguous.
+ */
+ if (data & 0x80) {
+ len += 1;
+ data_old = data;
+ readmem(VADDR, SYMBOL(kallsyms_names) + len, &data, 1);
+ if (data & 0x80) {
+ fprintf(stderr, "%s: BUG! Unexpected 3-byte length"
+ " encoding in kallsyms names\n", __func__);
+ goto out;
+ }
+ len += ((data_old & 0x7F) | ((uint64_t)data << 7)) + 1;
+ } else
+ len += data + 1;
+ }
+ kallsyms_names = malloc(len);
+ if (!kallsyms_names)
+ goto no_mem;
+ readmem(VADDR, SYMBOL(kallsyms_names), kallsyms_names, len);
+
+ ret = parse_kernel_kallsyms();
+ goto out;
+
+no_mem:
+ fprintf(stderr, "%s: Not enough memory!\n", __func__);
+out:
+ if (kallsyms_offsets) {
+ free(kallsyms_offsets);
+ kallsyms_offsets = NULL;
+ }
+ if (kallsyms_token_index) {
+ free(kallsyms_token_index);
+ kallsyms_token_index = NULL;
+ }
+ if (kallsyms_token_table) {
+ free(kallsyms_token_table);
+ kallsyms_token_table = NULL;
+ }
+ if (kallsyms_names) {
+ free(kallsyms_names);
+ kallsyms_names = NULL;
+ }
+ return ret;
+}
\ No newline at end of file
diff --git a/kallsyms.h b/kallsyms.h
new file mode 100644
index 0000000..3791284
--- /dev/null
+++ b/kallsyms.h
@@ -0,0 +1,91 @@
+#ifndef _KALLSYMS_H
+#define _KALLSYMS_H
+
+#include <stdint.h>
+#include <stdbool.h>
+
+struct ksym_info {
+ /********in******/
+ char *modname;
+ char *symname;
+ bool sym_required;
+ /********out*****/
+ uint64_t value;
+ int index; // -1 if sym not found
+};
+
+#define QUATE(x) #x
+#define INIT_MOD_SYM_RQD(MOD, SYM, R) \
+ struct ksym_info _##MOD##_##SYM = { \
+ QUATE(MOD), QUATE(SYM), R, 0, -1 \
+ }; \
+ __attribute__((section(".init_ksyms"), used)) \
+ struct ksym_info * _ptr_##MOD##_##SYM = &_##MOD##_##SYM
+
+#define GET_MOD_SYM(MOD, SYM) (_##MOD##_##SYM.value)
+#define GET_MOD_SYM_PTR(MOD, SYM) (&_##MOD##_##SYM)
+#define MOD_SYM_EXIST(MOD, SYM) (_##MOD##_##SYM.index >= 0)
+#define SYM_EXIST(p) ((p)->index >= 0)
+
+#define GET_KERN_SYM(SYM) GET_MOD_SYM(vmlinux, SYM)
+#define GET_KERN_SYM_PTR(SYM) GET_MOD_SYM_PTR(vmlinux, SYM)
+#define KERN_SYM_EXIST(SYM) MOD_SYM_EXIST(vmlinux, SYM)
+
+/*
+ * Required syms will be checked automatically before extension running.
+ * Optional syms should be checked manually at extension runtime.
+ */
+#define INIT_MOD_SYM(MOD, SYM) INIT_MOD_SYM_RQD(MOD, SYM, 1)
+#define INIT_OPT_MOD_SYM(MOD, SYM) INIT_MOD_SYM_RQD(MOD, SYM, 0)
+
+#define INIT_KERN_SYM(SYM) INIT_MOD_SYM(vmlinux, SYM)
+#define INIT_OPT_KERN_SYM(SYM) INIT_OPT_MOD_SYM(vmlinux, SYM)
+
+struct section_range {
+ char *start;
+ char *stop;
+};
+
+#define REGISTER_SECTION(T) \
+bool register_##T##_section(char *start, char *stop) \
+{ \
+ struct section_range *new_sr; \
+ struct T##_info **p; \
+ bool ret = false; \
+ \
+ if (!start || !stop) { \
+ fprintf(stderr, "%s: Invalid section start/stop\n", \
+ __func__); \
+ goto out; \
+ } \
+ \
+ for (p = (struct T##_info **)start; \
+ p < (struct T##_info **)stop; \
+ p++) { \
+ if (!add_##T##_modname((*p)->modname)) \
+ goto out; \
+ } \
+ \
+ new_sr = malloc(sizeof(struct section_range)); \
+ if (!new_sr) { \
+ fprintf(stderr, "%s: Not enough memory!\n", __func__); \
+ goto out; \
+ } \
+ new_sr->start = start; \
+ new_sr->stop = stop; \
+ if (!add_to_arr((void ***)&sr, &sr_len, &sr_cap, new_sr)) { \
+ free(new_sr); \
+ goto out; \
+ } \
+ ret = true; \
+out: \
+ return ret; \
+}
+
+bool add_to_arr(void ***arr, int *arr_len, int *arr_cap, void *elem);
+bool push_uniq_str(void ***arr, int *arr_len, int *arr_cap, char *str);
+bool check_ksyms_require_modname(char *modname, int *total);
+bool register_ksym_section(char *start, char *stop);
+bool read_vmcoreinfo_kallsyms(void);
+bool init_kernel_kallsyms(void);
+#endif /* _KALLSYMS_H */
\ No newline at end of file
diff --git a/makedumpfile.c b/makedumpfile.c
index 12fb0d8..dba3628 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -27,6 +27,7 @@
#include <limits.h>
#include <assert.h>
#include <zlib.h>
+#include "kallsyms.h"
struct symbol_table symbol_table;
struct size_table size_table;
@@ -3105,6 +3106,8 @@ read_vmcoreinfo_from_vmcore(off_t offset, unsigned long size, int flag_xen_hv)
if (!read_vmcoreinfo())
goto out;
}
+ read_vmcoreinfo_kallsyms();
+
close_vmcoreinfo();
ret = TRUE;
diff --git a/makedumpfile.h b/makedumpfile.h
index 134eb7a..0f13743 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -259,6 +259,7 @@ static inline int string_exists(char *s) { return (s ? TRUE : FALSE); }
#define UINT(ADDR) *((unsigned int *)(ADDR))
#define ULONG(ADDR) *((unsigned long *)(ADDR))
#define ULONGLONG(ADDR) *((unsigned long long *)(ADDR))
+#define VOID_PTR(ADDR) *((void **)(ADDR))
/*
@@ -1919,6 +1920,16 @@ struct symbol_table {
* symbols on sparc64 arch
*/
unsigned long long vmemmap_table;
+
+ /*
+ * kallsyms related
+ */
+ unsigned long long kallsyms_names;
+ unsigned long long kallsyms_num_syms;
+ unsigned long long kallsyms_token_table;
+ unsigned long long kallsyms_token_index;
+ unsigned long long kallsyms_offsets;
+ unsigned long long kallsyms_relative_base;
};
struct size_table {
--
2.47.0
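The name decoding performed in parse_kernel_kallsyms() can be sketched in isolation as below. This is an illustrative example under stated assumptions, not the patch's code: the function names `demo_read_len` and `demo_expand` are hypothetical, and the token table in the usage note is a toy one. It shows the 1- or 2-byte length prefix (low 7 bits first) and the token-table expansion, with the decoded length held in a 16-bit variable because a 2-byte prefix can encode values above 255.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read the length prefix of one kallsyms_names entry.
 * Bit 7 clear: the length is that single byte. Bit 7 set: the low
 * 7 bits come from the first byte and the rest from the second.
 * Returns the number of prefix bytes consumed (1 or 2), or 0 if a
 * third length byte would be required.
 */
size_t demo_read_len(const uint8_t *names, uint16_t *len)
{
	assert(names != NULL && len != NULL);

	if (!(names[0] & 0x80)) {
		*len = names[0];
		return 1;
	}
	if (names[1] & 0x80)
		return 0;
	*len = (uint16_t)((names[0] & 0x7F) | (names[1] << 7));
	return 2;
}

/*
 * Expand `len` compressed bytes into `out`. Each byte selects a
 * NUL-terminated token through token_index[]. Returns false if the
 * expanded name (plus terminator) does not fit in `out_size` bytes.
 */
bool demo_expand(const uint8_t *data, uint16_t len,
		 const char *token_table, const uint16_t *token_index,
		 char *out, size_t out_size)
{
	size_t used = 0;

	if (out_size == 0)
		return false;

	while (len--) {
		const char *tok = &token_table[token_index[*data++]];
		size_t n = strlen(tok);

		if (used + n >= out_size)
			return false;
		memcpy(out + used, tok, n);
		used += n;
	}
	out[used] = '\0';
	return true;
}
```

With a toy table holding the tokens "T", "_s" and "text", the entry `{3, 0, 1, 2}` expands to "T_stext". The first character is the symbol type, which is why the patch compares symname against `&buf[1]`.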
* Re: [PATCH v4][makedumpfile 2/7] Implement kernel kallsyms resolving
2026-03-17 15:07 ` [PATCH v4][makedumpfile 2/7] Implement kernel kallsyms resolving Tao Liu
@ 2026-04-02 23:32 ` Stephen Brennan
2026-04-03 8:12 ` HAGIO KAZUHITO(萩尾 一仁)
1 sibling, 0 replies; 21+ messages in thread
From: Stephen Brennan @ 2026-04-02 23:32 UTC
To: Tao Liu, yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, Tao Liu
Tao Liu <ltao@redhat.com> writes:
> This patch will parse kernel's kallsyms data. During the parsing
> process, the .init_ksyms sections of makedumpfile and the
> extensions will be iterated, so the kallsyms symbols which belongs
> to vmlinux can be resolved at this moment.
>
> Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Stephen Brennan <stephen.s.brennan@oracle.com>
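As a reading aid for the address calculation in the quoted patch, here is a minimal standalone sketch of the three candidate encodings and of the probe that picks one by checking which reproduces the known address of _stext. It mirrors the formulas in the patch but is not taken from it: the `demo_*` names and the enum are hypothetical, and the addresses in the usage note are made-up values.

```c
#include <assert.h>
#include <stdint.h>

enum demo_mode {
	DEMO_ABS_PERCPU,	/* non-negative: absolute; negative: base-relative */
	DEMO_REL_BASE,		/* unsigned offset from kallsyms_relative_base */
	DEMO_PLACE_REL,		/* signed offset from the table slot itself */
	DEMO_UNKNOWN
};

uint64_t demo_abs_percpu(uint64_t relative_base, uint32_t raw)
{
	int32_t val = (int32_t)raw;

	if (val >= 0)
		return (uint64_t)val;
	return relative_base - 1 - (uint64_t)(int64_t)val;
}

uint64_t demo_rel_base(uint64_t relative_base, uint32_t raw)
{
	return relative_base + raw;
}

uint64_t demo_place_rel(uint64_t offsets_vaddr, uint64_t index, uint32_t raw)
{
	/* Guard the slot-address multiplication against overflow. */
	assert(index <= UINT64_MAX / sizeof(uint32_t));
	return offsets_vaddr + index * sizeof(uint32_t) +
	       (uint64_t)(int64_t)(int32_t)raw;
}

/*
 * Try each encoding in the same order as the patch and return the first
 * one that reproduces the already-known address of _stext.
 */
enum demo_mode demo_pick_mode(uint64_t stext, uint32_t raw, uint64_t index,
			      uint64_t relative_base, uint64_t offsets_vaddr)
{
	if (stext == demo_abs_percpu(relative_base, raw))
		return DEMO_ABS_PERCPU;
	if (stext == demo_rel_base(relative_base, raw))
		return DEMO_REL_BASE;
	if (stext == demo_place_rel(offsets_vaddr, index, raw))
		return DEMO_PLACE_REL;
	return DEMO_UNKNOWN;
}
```

For example, with _stext at 0xffffffff81000000, a relative base equal to that address and a raw offset of 0, the probe selects the relative-base encoding. A raw value of 0xFEFFFFF8 in slot 2 of a table at 0xffffffff82000000 selects the place-relative one.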
> ---
> Makefile | 2 +-
> kallsyms.c | 350 +++++++++++++++++++++++++++++++++++++++++++++++++
> kallsyms.h | 91 +++++++++++++
> makedumpfile.c | 3 +
> makedumpfile.h | 11 ++
> 5 files changed, 456 insertions(+), 1 deletion(-)
> create mode 100644 kallsyms.c
> create mode 100644 kallsyms.h
>
> diff --git a/Makefile b/Makefile
> index 15a4ba0..a57185e 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -45,7 +45,7 @@ CFLAGS_ARCH += -m32
> endif
>
> SRC_BASE = makedumpfile.c makedumpfile.h diskdump_mod.h sadump_mod.h sadump_info.h
> -SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c
> +SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c
> OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
> SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c arch/mips64.c arch/loongarch64.c arch/riscv64.c
> OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
> diff --git a/kallsyms.c b/kallsyms.c
> new file mode 100644
> index 0000000..f7737cb
> --- /dev/null
> +++ b/kallsyms.c
> @@ -0,0 +1,350 @@
> +#include <stdint.h>
> +#include <stdbool.h>
> +#include <string.h>
> +#include "makedumpfile.h"
> +#include "kallsyms.h"
> +
> +static uint32_t *kallsyms_offsets = NULL;
> +static uint16_t *kallsyms_token_index = NULL;
> +static uint8_t *kallsyms_token_table = NULL;
> +static uint8_t *kallsyms_names = NULL;
> +static unsigned long kallsyms_relative_base = 0;
> +static unsigned int kallsyms_num_syms = 0;
> +
> +/* makedumpfile & extensions' .init_ksyms section range array */
> +static struct section_range **sr = NULL;
> +static int sr_len = 0;
> +static int sr_cap = 0;
> +
> +/* Which mod's kallsyms should be inited? */
> +static char **mods = NULL;
> +static int mods_len = 0;
> +static int mods_cap = 0;
> +
> +INIT_KERN_SYM(_stext);
> +
> +/*
> + * Utility: add elem to arr, which can auto extend its capacity.
> + * (*arr) is a pointer array, holding pointers of elem
> +*/
> +bool add_to_arr(void ***arr, int *arr_len, int *arr_cap, void *elem)
> +{
> + void *tmp;
> + int new_cap = 0;
> +
> + if (*arr == NULL) {
> + *arr_len = 0;
> + new_cap = 4;
> + } else if (*arr_len >= *arr_cap) {
> + new_cap = (*arr_cap) + ((*arr_cap) >> 1);
> + }
> +
> + if (new_cap) {
> + tmp = reallocarray(*arr, new_cap, sizeof(void *));
> + if (!tmp)
> + goto no_mem;
> + *arr = tmp;
> + *arr_cap = new_cap;
> + }
> +
> + (*arr)[(*arr_len)++] = elem;
> + return true;
> +
> +no_mem:
> + fprintf(stderr, "%s: Not enough memory!\n", __func__);
> + return false;
> +}
> +
> +/*
> + * Utility: add uniq string to arr, which can auto extend its capacity.
> +*/
> +bool push_uniq_str(void ***arr, int *arr_len, int *arr_cap, char *str)
> +{
> + for (int i = 0; i < (*arr_len); i++) {
> + if (!strcmp((*arr)[i], str))
> + /* String already exists, skip it */
> + return true;
> + }
> + return add_to_arr(arr, arr_len, arr_cap, str);
> +}
> +
> +static bool add_ksym_modname(char *modname)
> +{
> + return push_uniq_str((void ***)&mods, &mods_len, &mods_cap, modname);
> +}
> +
> +bool check_ksyms_require_modname(char *modname, int *total)
> +{
> + if (total)
> + *total = mods_len;
> + for (int i = 0; i < mods_len; i++) {
> + if (!strcmp(modname, mods[i]))
> + return true;
> + }
> + return false;
> +}
> +
> +static void cleanup_ksyms_modname(void)
> +{
> + if (mods) {
> + free(mods);
> + mods = NULL;
> + }
> + mods_len = 0;
> + mods_cap = 0;
> +}
> +
> +/*
> + * Used by makedumpfile and extensions, to register their .init_ksyms section.
> + * so kallsyms can know which module/sym should be inited.
> +*/
> +REGISTER_SECTION(ksym)
> +
> +static void cleanup_ksyms_section_range(void)
> +{
> + for (int i = 0; i < sr_len; i++) {
> + free(sr[i]);
> + }
> + if (sr) {
> + free(sr);
> + sr = NULL;
> + }
> + sr_len = 0;
> + sr_cap = 0;
> +}
> +
> +static uint64_t absolute_percpu(uint64_t base, int32_t val)
> +{
> + if (val >= 0)
> + return (uint64_t)val;
> + else
> + return base - 1 - val;
> +}
> +
> +static uint64_t calc_addr_absolute_percpu(struct ksym_info *p)
> +{
> + return absolute_percpu(kallsyms_relative_base, p->value);
> +}
> +
> +static uint64_t calc_addr_relative_base(struct ksym_info *p)
> +{
> + return p->value + kallsyms_relative_base;
> +}
> +
> +static uint64_t calc_addr_place_relative(struct ksym_info *p)
> +{
> + return SYMBOL(kallsyms_offsets) + p->index * sizeof(uint32_t) +
> + (int32_t)kallsyms_offsets[p->index];
> +}
> +
> +#define BUFLEN 1024
> +static bool parse_kernel_kallsyms(void)
> +{
> + char buf[BUFLEN];
> + int index = 0, i, j;
> + uint8_t *compressd_data;
> + uint8_t *uncompressd_data;
> + uint8_t len, len_old;
> + struct ksym_info **p;
> + uint64_t (*calc_addr)(struct ksym_info *);
> + struct ksym_info *stext_p;
> +
> + for (i = 0; i < kallsyms_num_syms; i++) {
> + memset(buf, 0, BUFLEN);
> + len = kallsyms_names[index];
> + if (len & 0x80) {
> + index++;
> + len_old = len;
> + len = kallsyms_names[index];
> + if (len & 0x80) {
> + fprintf(stderr, "%s: BUG! Unexpected 3-byte length,"
> + " should be detected in init_kernel_kallsyms()\n",
> + __func__);
> + goto out;
> + }
> + len = (len_old & 0x7F) | (len << 7);
> + }
> + index++;
> +
> + compressd_data = &kallsyms_names[index];
> + index += len;
> + while (len--) {
> + uncompressd_data = &kallsyms_token_table[kallsyms_token_index[*compressd_data]];
> + if (strlen(buf) + strlen((char *)uncompressd_data) >= BUFLEN) {
> + goto next_symbol;
> + }
> + strcat(buf, (char *)uncompressd_data);
> + compressd_data++;
> + }
> +
> + /* Now check if the symbol is we wanted */
> + for (j = 0; j < sr_len; j++) {
> + for (p = (struct ksym_info **)(sr[j]->start);
> + p < (struct ksym_info **)(sr[j]->stop);
> + p++) {
> + if (!strcmp((*p)->modname, "vmlinux") &&
> + !strcmp((*p)->symname, &buf[1])) {
> + (*p)->value = kallsyms_offsets[i];
> + (*p)->index = i;
> + }
> + }
> + }
> +next_symbol:
> + }
> +
> + /* Check the approach for calc absolute kallsyms address
> + *
> + * A complete comment of each approaches please refer to:
> + * https://github.com/osandov/drgn/commit/744f36ec3c3f64d7e1323a0037898158698585c4
> + */
> + if (!KERN_SYM_EXIST(_stext)) {
> + fprintf(stderr, "%s: symbol _stext not found!\n", __func__);
> + goto out;
> + }
> +
> + stext_p = GET_KERN_SYM_PTR(_stext);
> +
> + if (SYMBOL(_stext) == calc_addr_absolute_percpu(stext_p)) {
> + calc_addr = calc_addr_absolute_percpu;
> + } else if (SYMBOL(_stext) == calc_addr_relative_base(stext_p)) {
> + calc_addr = calc_addr_relative_base;
> + } else if (SYMBOL(_stext) == calc_addr_place_relative(stext_p)) {
> + calc_addr = calc_addr_place_relative;
> + } else {
> + fprintf(stderr, "%s: Wrong calculate kallsyms symbol value!\n", __func__);
> + goto out;
> + }
> +
> + /* Now do the calc */
> + for (j = 0; j < sr_len; j++) {
> + for (p = (struct ksym_info **)(sr[j]->start);
> + p < (struct ksym_info **)(sr[j]->stop);
> + p++) {
> + if (!strcmp((*p)->modname, "vmlinux") &&
> + SYM_EXIST(*p)) {
> + (*p)->value = calc_addr(*p);
> + }
> + }
> + }
> +
> + return true;
> +out:
> + return false;
> +}
> +
> +static bool vmcore_info_ready = false;
> +
> +bool read_vmcoreinfo_kallsyms(void)
> +{
> + READ_SYMBOL("kallsyms_names", kallsyms_names);
> + READ_SYMBOL("kallsyms_num_syms", kallsyms_num_syms);
> + READ_SYMBOL("kallsyms_token_table", kallsyms_token_table);
> + READ_SYMBOL("kallsyms_token_index", kallsyms_token_index);
> + READ_SYMBOL("kallsyms_offsets", kallsyms_offsets);
> + READ_SYMBOL("kallsyms_relative_base", kallsyms_relative_base);
> + vmcore_info_ready = true;
> + return true;
> +}
> +
> +/*
> + * Makedumpfile's .init_ksyms section
> +*/
> +extern struct ksym_info *__start_init_ksyms[];
> +extern struct ksym_info *__stop_init_ksyms[];
> +
> +bool init_kernel_kallsyms(void)
> +{
> + const int token_index_size = (UINT8_MAX + 1) * sizeof(uint16_t);
> + uint64_t last_token, len;
> + unsigned char data, data_old;
> + int i;
> + bool ret = false;
> +
> + if (vmcore_info_ready == false) {
> + fprintf(stderr, "%s: vmcoreinfo not ready for kallsyms!\n",
> + __func__);
> + return ret;
> + }
> +
> + if (!register_ksym_section((char *)__start_init_ksyms,
> + (char *)__stop_init_ksyms))
> + return ret;
> +
> + readmem(VADDR, SYMBOL(kallsyms_num_syms), &kallsyms_num_syms,
> + sizeof(kallsyms_num_syms));
> + if (SYMBOL(kallsyms_relative_base) != NOT_FOUND_SYMBOL)
> + readmem(VADDR, SYMBOL(kallsyms_relative_base),
> + &kallsyms_relative_base, sizeof(kallsyms_relative_base));
> +
> + kallsyms_offsets = malloc(sizeof(uint32_t) * kallsyms_num_syms);
> + if (!kallsyms_offsets)
> + goto no_mem;
> + readmem(VADDR, SYMBOL(kallsyms_offsets), kallsyms_offsets,
> + kallsyms_num_syms * sizeof(uint32_t));
> +
> + kallsyms_token_index = malloc(token_index_size);
> + if (!kallsyms_token_index)
> + goto no_mem;
> + readmem(VADDR, SYMBOL(kallsyms_token_index), kallsyms_token_index,
> + token_index_size);
> +
> + last_token = SYMBOL(kallsyms_token_table) + kallsyms_token_index[UINT8_MAX];
> + do {
> + readmem(VADDR, last_token++, &data, 1);
> + } while(data);
> + len = last_token - SYMBOL(kallsyms_token_table);
> + kallsyms_token_table = malloc(len);
> + if (!kallsyms_token_table)
> + goto no_mem;
> + readmem(VADDR, SYMBOL(kallsyms_token_table), kallsyms_token_table, len);
> +
> + for (len = 0, i = 0; i < kallsyms_num_syms; i++) {
> + readmem(VADDR, SYMBOL(kallsyms_names) + len, &data, 1);
> + /*
> + * The 2-byte representation was added in commit 73bbb94466fd3
> + * ("kallsyms: support "big" kernel symbols") in v6.1, thus for
> + * v6.1+, they indicate a long symbol, but for kernel versions
> + * prior to v6.1, they might be ambiguous.
> + */
> + if (data & 0x80) {
> + len += 1;
> + data_old = data;
> + readmem(VADDR, SYMBOL(kallsyms_names) + len, &data, 1);
> + if (data & 0x80) {
> + fprintf(stderr, "%s: BUG! Unexpected 3-byte length"
> + " encoding in kallsyms names\n", __func__);
> + goto out;
> + }
> + data = (data_old & 0x7F) | (data << 7);
> + }
> + len += data + 1;
> + }
> + kallsyms_names = malloc(len);
> + if (!kallsyms_names)
> + goto no_mem;
> + readmem(VADDR, SYMBOL(kallsyms_names), kallsyms_names, len);
> +
> + ret = parse_kernel_kallsyms();
> + goto out;
> +
> +no_mem:
> + fprintf(stderr, "%s: Not enough memory!\n", __func__);
> +out:
> + if (kallsyms_offsets) {
> + free(kallsyms_offsets);
> + kallsyms_offsets = NULL;
> + }
> + if (kallsyms_token_index) {
> + free(kallsyms_token_index);
> + kallsyms_token_index = NULL;
> + }
> + if (kallsyms_token_table) {
> + free(kallsyms_token_table);
> + kallsyms_token_table = NULL;
> + }
> + if (kallsyms_names) {
> + free(kallsyms_names);
> + kallsyms_names = NULL;
> + }
> + return ret;
> +}
> \ No newline at end of file
> diff --git a/kallsyms.h b/kallsyms.h
> new file mode 100644
> index 0000000..3791284
> --- /dev/null
> +++ b/kallsyms.h
> @@ -0,0 +1,91 @@
> +#ifndef _KALLSYMS_H
> +#define _KALLSYMS_H
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +struct ksym_info {
> + /********in******/
> + char *modname;
> + char *symname;
> + bool sym_required;
> + /********out*****/
> + uint64_t value;
> + int index; // -1 if sym not found
> +};
> +
> +#define QUATE(x) #x
> +#define INIT_MOD_SYM_RQD(MOD, SYM, R) \
> + struct ksym_info _##MOD##_##SYM = { \
> + QUATE(MOD), QUATE(SYM), R, 0, -1 \
> + }; \
> + __attribute__((section(".init_ksyms"), used)) \
> + struct ksym_info * _ptr_##MOD##_##SYM = &_##MOD##_##SYM
> +
> +#define GET_MOD_SYM(MOD, SYM) (_##MOD##_##SYM.value)
> +#define GET_MOD_SYM_PTR(MOD, SYM) (&_##MOD##_##SYM)
> +#define MOD_SYM_EXIST(MOD, SYM) (_##MOD##_##SYM.index >= 0)
> +#define SYM_EXIST(p) ((p)->index >= 0)
> +
> +#define GET_KERN_SYM(SYM) GET_MOD_SYM(vmlinux, SYM)
> +#define GET_KERN_SYM_PTR(SYM) GET_MOD_SYM_PTR(vmlinux, SYM)
> +#define KERN_SYM_EXIST(SYM) MOD_SYM_EXIST(vmlinux, SYM)
> +
> +/*
> + * Required syms will be checked automatically before extension running.
> + * Optinal syms should be checked manually at extension runtime.
> + */
> +#define INIT_MOD_SYM(MOD, SYM) INIT_MOD_SYM_RQD(MOD, SYM, 1)
> +#define INIT_OPT_MOD_SYM(MOD, SYM) INIT_MOD_SYM_RQD(MOD, SYM, 0)
> +
> +#define INIT_KERN_SYM(SYM) INIT_MOD_SYM(vmlinux, SYM)
> +#define INIT_OPT_KERN_SYM(SYM) INIT_OPT_MOD_SYM(vmlinux, SYM)
> +
> +struct section_range {
> + char *start;
> + char *stop;
> +};
> +
> +#define REGISTER_SECTION(T) \
> +bool register_##T##_section(char *start, char *stop) \
> +{ \
> + struct section_range *new_sr; \
> + struct T##_info **p; \
> + bool ret = false; \
> + \
> + if (!start || !stop) { \
> + fprintf(stderr, "%s: Invalid section start/stop\n", \
> + __func__); \
> + goto out; \
> + } \
> + \
> + for (p = (struct T##_info **)start; \
> + p < (struct T##_info **)stop; \
> + p++) { \
> + if (!add_##T##_modname((*p)->modname)) \
> + goto out; \
> + } \
> + \
> + new_sr = malloc(sizeof(struct section_range)); \
> + if (!new_sr) { \
> + fprintf(stderr, "%s: Not enough memory!\n", __func__); \
> + goto out; \
> + } \
> + new_sr->start = start; \
> + new_sr->stop = stop; \
> + if (!add_to_arr((void ***)&sr, &sr_len, &sr_cap, new_sr)) { \
> + free(new_sr); \
> + goto out; \
> + } \
> + ret = true; \
> +out: \
> + return ret; \
> +}
> +
> +bool add_to_arr(void ***arr, int *arr_len, int *arr_cap, void *elem);
> +bool push_uniq_str(void ***arr, int *arr_len, int *arr_cap, char *str);
> +bool check_ksyms_require_modname(char *modname, int *total);
> +bool register_ksym_section(char *start, char *stop);
> +bool read_vmcoreinfo_kallsyms(void);
> +bool init_kernel_kallsyms(void);
> +#endif /* _KALLSYMS_H */
> \ No newline at end of file
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 12fb0d8..dba3628 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -27,6 +27,7 @@
> #include <limits.h>
> #include <assert.h>
> #include <zlib.h>
> +#include "kallsyms.h"
>
> struct symbol_table symbol_table;
> struct size_table size_table;
> @@ -3105,6 +3106,8 @@ read_vmcoreinfo_from_vmcore(off_t offset, unsigned long size, int flag_xen_hv)
> if (!read_vmcoreinfo())
> goto out;
> }
> + read_vmcoreinfo_kallsyms();
> +
> close_vmcoreinfo();
>
> ret = TRUE;
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 134eb7a..0f13743 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -259,6 +259,7 @@ static inline int string_exists(char *s) { return (s ? TRUE : FALSE); }
> #define UINT(ADDR) *((unsigned int *)(ADDR))
> #define ULONG(ADDR) *((unsigned long *)(ADDR))
> #define ULONGLONG(ADDR) *((unsigned long long *)(ADDR))
> +#define VOID_PTR(ADDR) *((void **)(ADDR))
>
>
> /*
> @@ -1919,6 +1920,16 @@ struct symbol_table {
> * symbols on sparc64 arch
> */
> unsigned long long vmemmap_table;
> +
> + /*
> + * kallsyms related
> + */
> + unsigned long long kallsyms_names;
> + unsigned long long kallsyms_num_syms;
> + unsigned long long kallsyms_token_table;
> + unsigned long long kallsyms_token_index;
> + unsigned long long kallsyms_offsets;
> + unsigned long long kallsyms_relative_base;
> };
>
> struct size_table {
> --
> 2.47.0
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4][makedumpfile 2/7] Implement kernel kallsyms resolving
2026-03-17 15:07 ` [PATCH v4][makedumpfile 2/7] Implement kernel kallsyms resolving Tao Liu
2026-04-02 23:32 ` Stephen Brennan
@ 2026-04-03 8:12 ` HAGIO KAZUHITO(萩尾 一仁)
1 sibling, 0 replies; 21+ messages in thread
From: HAGIO KAZUHITO(萩尾 一仁) @ 2026-04-03 8:12 UTC (permalink / raw)
To: Tao Liu
Cc: stephen.s.brennan@oracle.com,
YAMAZAKI MASAMITSU(山崎 真光),
kexec@lists.infradead.org
On 2026/03/18 0:07, Tao Liu wrote:
> This patch will parse kernel's kallsyms data. During the parsing
> process, the .init_ksyms sections of makedumpfile and the
> extensions will be iterated, so the kallsyms symbols which belongs
> to vmlinux can be resolved at this moment.
>
> Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> Signed-off-by: Tao Liu <ltao@redhat.com>
> ---
> Makefile | 2 +-
> kallsyms.c | 350 +++++++++++++++++++++++++++++++++++++++++++++++++
> kallsyms.h | 91 +++++++++++++
> makedumpfile.c | 3 +
> makedumpfile.h | 11 ++
> 5 files changed, 456 insertions(+), 1 deletion(-)
> create mode 100644 kallsyms.c
> create mode 100644 kallsyms.h
>
> diff --git a/Makefile b/Makefile
> index 15a4ba0..a57185e 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -45,7 +45,7 @@ CFLAGS_ARCH += -m32
> endif
>
> SRC_BASE = makedumpfile.c makedumpfile.h diskdump_mod.h sadump_mod.h sadump_info.h
> -SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c
> +SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c
> OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
> SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c arch/mips64.c arch/loongarch64.c arch/riscv64.c
> OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
> diff --git a/kallsyms.c b/kallsyms.c
> new file mode 100644
> index 0000000..f7737cb
> --- /dev/null
> +++ b/kallsyms.c
> @@ -0,0 +1,350 @@
> +#include <stdint.h>
> +#include <stdbool.h>
> +#include <string.h>
> +#include "makedumpfile.h"
> +#include "kallsyms.h"
> +
> +static uint32_t *kallsyms_offsets = NULL;
> +static uint16_t *kallsyms_token_index = NULL;
> +static uint8_t *kallsyms_token_table = NULL;
> +static uint8_t *kallsyms_names = NULL;
> +static unsigned long kallsyms_relative_base = 0;
> +static unsigned int kallsyms_num_syms = 0;
> +
> +/* makedumpfile & extensions' .init_ksyms section range array */
> +static struct section_range **sr = NULL;
> +static int sr_len = 0;
> +static int sr_cap = 0;
> +
> +/* Which mod's kallsyms should be inited? */
> +static char **mods = NULL;
> +static int mods_len = 0;
> +static int mods_cap = 0;
> +
> +INIT_KERN_SYM(_stext);
> +
> +/*
> + * Utility: add elem to arr, which can auto extend its capacity.
> + * (*arr) is a pointer array, holding pointers of elem
> +*/
> +bool add_to_arr(void ***arr, int *arr_len, int *arr_cap, void *elem)
> +{
> + void *tmp;
> + int new_cap = 0;
> +
> + if (*arr == NULL) {
> + *arr_len = 0;
> + new_cap = 4;
> + } else if (*arr_len >= *arr_cap) {
> + new_cap = (*arr_cap) + ((*arr_cap) >> 1);
> + }
> +
> + if (new_cap) {
> + tmp = reallocarray(*arr, new_cap, sizeof(void *));
> + if (!tmp)
> + goto no_mem;
> + *arr = tmp;
> + *arr_cap = new_cap;
> + }
> +
> + (*arr)[(*arr_len)++] = elem;
> + return true;
> +
> +no_mem:
> + fprintf(stderr, "%s: Not enough memory!\n", __func__);
Could you use ERRMSG() or MSG() instead of fprintf()/printf() for message
control throughout the patchset?
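A minimal sketch of what message control means here, assuming the goal is that every message honors one global verbosity setting. The names demo_errmsg(), demo_message_level and DEMO_ML_ERROR are hypothetical stand-ins for illustration, not makedumpfile's actual ERRMSG() definition:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-ins; not makedumpfile's actual definitions. */
#define DEMO_ML_ERROR 0x1
int demo_message_level = DEMO_ML_ERROR;

/*
 * Print an error message only when error output is enabled.
 * Returns the number of characters written, or 0 when suppressed.
 */
int demo_errmsg(const char *fmt, ...)
{
	va_list ap;
	int n;

	if (!(demo_message_level & DEMO_ML_ERROR))
		return 0;	/* message suppressed by the global setting */

	va_start(ap, fmt);
	n = vfprintf(stderr, fmt, ap);
	va_end(ap);
	return n;
}
```

A bare fprintf(stderr, ...) call bypasses such a check, so it would still print when the user has asked for quiet output.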
> + return false;
> +}
> +
> +/*
> + * Utility: add uniq string to arr, which can auto extend its capacity.
> +*/
> +bool push_uniq_str(void ***arr, int *arr_len, int *arr_cap, char *str)
> +{
> + for (int i = 0; i < (*arr_len); i++) {
> + if (!strcmp((*arr)[i], str))
> + /* String already exists, skip it */
> + return true;
> + }
> + return add_to_arr(arr, arr_len, arr_cap, str);
> +}
> +
> +static bool add_ksym_modname(char *modname)
> +{
> + return push_uniq_str((void ***)&mods, &mods_len, &mods_cap, modname);
> +}
> +
> +bool check_ksyms_require_modname(char *modname, int *total)
> +{
> + if (total)
> + *total = mods_len;
> + for (int i = 0; i < mods_len; i++) {
> + if (!strcmp(modname, mods[i]))
> + return true;
> + }
> + return false;
> +}
> +
> +static void cleanup_ksyms_modname(void)
> +{
> + if (mods) {
> + free(mods);
> + mods = NULL;
> + }
> + mods_len = 0;
> + mods_cap = 0;
> +}
> +
> +/*
> + * Used by makedumpfile and extensions, to register their .init_ksyms section.
> + * so kallsyms can know which module/sym should be inited.
> +*/
> +REGISTER_SECTION(ksym)
> +
> +static void cleanup_ksyms_section_range(void)
> +{
> + for (int i = 0; i < sr_len; i++) {
> + free(sr[i]);
> + }
> + if (sr) {
> + free(sr);
> + sr = NULL;
> + }
> + sr_len = 0;
> + sr_cap = 0;
> +}
> +
> +static uint64_t absolute_percpu(uint64_t base, int32_t val)
> +{
> + if (val >= 0)
> + return (uint64_t)val;
> + else
> + return base - 1 - val;
> +}
> +
> +static uint64_t calc_addr_absolute_percpu(struct ksym_info *p)
> +{
> + return absolute_percpu(kallsyms_relative_base, p->value);
> +}
> +
> +static uint64_t calc_addr_relative_base(struct ksym_info *p)
> +{
> + return p->value + kallsyms_relative_base;
> +}
> +
> +static uint64_t calc_addr_place_relative(struct ksym_info *p)
> +{
> + return SYMBOL(kallsyms_offsets) + p->index * sizeof(uint32_t) +
> + (int32_t)kallsyms_offsets[p->index];
> +}
> +
> +#define BUFLEN 1024
> +static bool parse_kernel_kallsyms(void)
> +{
> + char buf[BUFLEN];
There already is BUFSIZE, which is the same value.
> + int index = 0, i, j;
> + uint8_t *compressd_data;
> + uint8_t *uncompressd_data;
> + uint8_t len, len_old;
> + struct ksym_info **p;
> + uint64_t (*calc_addr)(struct ksym_info *);
> + struct ksym_info *stext_p;
> +
> + for (i = 0; i < kallsyms_num_syms; i++) {
> + memset(buf, 0, BUFLEN);
> + len = kallsyms_names[index];
> + if (len & 0x80) {
> + index++;
> + len_old = len;
> + len = kallsyms_names[index];
> + if (len & 0x80) {
> + fprintf(stderr, "%s: BUG! Unexpected 3-byte length,"
> + " should be detected in init_kernel_kallsyms()\n",
> + __func__);
> + goto out;
> + }
> + len = (len_old & 0x7F) | (len << 7);
> + }
> + index++;
> +
> + compressd_data = &kallsyms_names[index];
> + index += len;
> + while (len--) {
> + uncompressd_data = &kallsyms_token_table[kallsyms_token_index[*compressd_data]];
> + if (strlen(buf) + strlen((char *)uncompressd_data) >= BUFLEN) {
> + goto next_symbol;
> + }
> + strcat(buf, (char *)uncompressd_data);
> + compressd_data++;
> + }
> +
> + /* Now check if the symbol is we wanted */
> + for (j = 0; j < sr_len; j++) {
> + for (p = (struct ksym_info **)(sr[j]->start);
> + p < (struct ksym_info **)(sr[j]->stop);
> + p++) {
> + if (!strcmp((*p)->modname, "vmlinux") &&
> + !strcmp((*p)->symname, &buf[1])) {
> + (*p)->value = kallsyms_offsets[i];
> + (*p)->index = i;
> + }
> + }
> + }
> +next_symbol:
gcc-8.5.0 rejects this style. How about adding ";" after the label to avoid the error?
kallsyms.c: In function ‘parse_kernel_kallsyms’:
kallsyms.c:193:1: error: label at end of compound statement
next_symbol:
^~~~~~~~~~~
make: *** [Makefile:109: kallsyms.o] Error 1
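For reference, a minimal sketch of the workaround. The function count_non_negative() is made up for illustration and is not part of the patch. Before C23, a label must be followed by a statement, so a label placed right before a closing brace needs an empty statement:

```c
#include <stddef.h>

/* Hypothetical helper: counts the non-negative entries of vals[]. */
int count_non_negative(const int *vals, size_t n)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		if (vals[i] < 0)
			goto next_value;	/* skip this entry */
		count++;
next_value:
		;	/* empty statement: the label now precedes a statement */
	}
	return count;
}
```

Newer compilers may accept the label without the ";", but the empty statement keeps the code valid for older ones such as the gcc-8.5.0 above.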
> + }
> +
> + /* Check the approach for calc absolute kallsyms address
> + *
> + * A complete comment of each approaches please refer to:
> + * https://github.com/osandov/drgn/commit/744f36ec3c3f64d7e1323a0037898158698585c4
> + */
> + if (!KERN_SYM_EXIST(_stext)) {
> + fprintf(stderr, "%s: symbol _stext not found!\n", __func__);
> + goto out;
> + }
> +
> + stext_p = GET_KERN_SYM_PTR(_stext);
> +
> + if (SYMBOL(_stext) == calc_addr_absolute_percpu(stext_p)) {
> + calc_addr = calc_addr_absolute_percpu;
> + } else if (SYMBOL(_stext) == calc_addr_relative_base(stext_p)) {
> + calc_addr = calc_addr_relative_base;
> + } else if (SYMBOL(_stext) == calc_addr_place_relative(stext_p)) {
> + calc_addr = calc_addr_place_relative;
> + } else {
> + fprintf(stderr, "%s: Wrong calculate kallsyms symbol value!\n", __func__);
> + goto out;
> + }
> +
> + /* Now do the calc */
> + for (j = 0; j < sr_len; j++) {
> + for (p = (struct ksym_info **)(sr[j]->start);
> + p < (struct ksym_info **)(sr[j]->stop);
> + p++) {
> + if (!strcmp((*p)->modname, "vmlinux") &&
> + SYM_EXIST(*p)) {
> + (*p)->value = calc_addr(*p);
> + }
> + }
> + }
> +
> + return true;
> +out:
> + return false;
> +}
> +
> +static bool vmcore_info_ready = false;
> +
> +bool read_vmcoreinfo_kallsyms(void)
> +{
> + READ_SYMBOL("kallsyms_names", kallsyms_names);
> + READ_SYMBOL("kallsyms_num_syms", kallsyms_num_syms);
> + READ_SYMBOL("kallsyms_token_table", kallsyms_token_table);
> + READ_SYMBOL("kallsyms_token_index", kallsyms_token_index);
> + READ_SYMBOL("kallsyms_offsets", kallsyms_offsets);
> + READ_SYMBOL("kallsyms_relative_base", kallsyms_relative_base);
> + vmcore_info_ready = true;
> + return true;
> +}
> +
> +/*
> + * Makedumpfile's .init_ksyms section
> +*/
> +extern struct ksym_info *__start_init_ksyms[];
> +extern struct ksym_info *__stop_init_ksyms[];
> +
> +bool init_kernel_kallsyms(void)
> +{
> + const int token_index_size = (UINT8_MAX + 1) * sizeof(uint16_t);
> + uint64_t last_token, len;
> + unsigned char data, data_old;
> + int i;
> + bool ret = false;
> +
> + if (vmcore_info_ready == false) {
> + fprintf(stderr, "%s: vmcoreinfo not ready for kallsyms!\n",
> + __func__);
> + return ret;
> + }
> +
> + if (!register_ksym_section((char *)__start_init_ksyms,
> + (char *)__stop_init_ksyms))
> + return ret;
> +
> + readmem(VADDR, SYMBOL(kallsyms_num_syms), &kallsyms_num_syms,
> + sizeof(kallsyms_num_syms));
> + if (SYMBOL(kallsyms_relative_base) != NOT_FOUND_SYMBOL)
> + readmem(VADDR, SYMBOL(kallsyms_relative_base),
> + &kallsyms_relative_base, sizeof(kallsyms_relative_base));
> +
> + kallsyms_offsets = malloc(sizeof(uint32_t) * kallsyms_num_syms);
> + if (!kallsyms_offsets)
> + goto no_mem;
> + readmem(VADDR, SYMBOL(kallsyms_offsets), kallsyms_offsets,
> + kallsyms_num_syms * sizeof(uint32_t));
> +
> + kallsyms_token_index = malloc(token_index_size);
> + if (!kallsyms_token_index)
> + goto no_mem;
> + readmem(VADDR, SYMBOL(kallsyms_token_index), kallsyms_token_index,
> + token_index_size);
> +
> + last_token = SYMBOL(kallsyms_token_table) + kallsyms_token_index[UINT8_MAX];
> + do {
> + readmem(VADDR, last_token++, &data, 1);
> + } while(data);
> + len = last_token - SYMBOL(kallsyms_token_table);
> + kallsyms_token_table = malloc(len);
> + if (!kallsyms_token_table)
> + goto no_mem;
> + readmem(VADDR, SYMBOL(kallsyms_token_table), kallsyms_token_table, len);
> +
> + for (len = 0, i = 0; i < kallsyms_num_syms; i++) {
> + readmem(VADDR, SYMBOL(kallsyms_names) + len, &data, 1);
> + /*
> + * The 2-byte representation was added in commit 73bbb94466fd3
> + * ("kallsyms: support "big" kernel symbols") in v6.1, thus for
> + * v6.1+, they indicate a long symbol, but for kernel versions
> + * prior to v6.1, they might be ambiguous.
> + */
> + if (data & 0x80) {
> + len += 1;
> + data_old = data;
> + readmem(VADDR, SYMBOL(kallsyms_names) + len, &data, 1);
> + if (data & 0x80) {
> + fprintf(stderr, "%s: BUG! Unexpected 3-byte length"
> + " encoding in kallsyms names\n", __func__);
> + goto out;
> + }
> + data = (data_old & 0x7F) | (data << 7);
> + }
> + len += data + 1;
> + }
> + kallsyms_names = malloc(len);
> + if (!kallsyms_names)
> + goto no_mem;
> + readmem(VADDR, SYMBOL(kallsyms_names), kallsyms_names, len);
> +
> + ret = parse_kernel_kallsyms();
> + goto out;
> +
> +no_mem:
> + fprintf(stderr, "%s: Not enough memory!\n", __func__);
> +out:
> + if (kallsyms_offsets) {
> + free(kallsyms_offsets);
> + kallsyms_offsets = NULL;
> + }
> + if (kallsyms_token_index) {
> + free(kallsyms_token_index);
> + kallsyms_token_index = NULL;
> + }
> + if (kallsyms_token_table) {
> + free(kallsyms_token_table);
> + kallsyms_token_table = NULL;
> + }
> + if (kallsyms_names) {
> + free(kallsyms_names);
> + kallsyms_names = NULL;
> + }
> + return ret;
> +}
> \ No newline at end of file
> diff --git a/kallsyms.h b/kallsyms.h
> new file mode 100644
> index 0000000..3791284
> --- /dev/null
> +++ b/kallsyms.h
> @@ -0,0 +1,91 @@
> +#ifndef _KALLSYMS_H
> +#define _KALLSYMS_H
> +
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +struct ksym_info {
> + /********in******/
> + char *modname;
> + char *symname;
> + bool sym_required;
> + /********out*****/
> + uint64_t value;
> + int index; // -1 if sym not found
> +};
> +
> +#define QUATE(x) #x
Is this spelling intended? I think what this macro does is "quote".
Thanks,
Kazu
> +#define INIT_MOD_SYM_RQD(MOD, SYM, R) \
> + struct ksym_info _##MOD##_##SYM = { \
> + QUATE(MOD), QUATE(SYM), R, 0, -1 \
> + }; \
> + __attribute__((section(".init_ksyms"), used)) \
> + struct ksym_info * _ptr_##MOD##_##SYM = &_##MOD##_##SYM
> +
> +#define GET_MOD_SYM(MOD, SYM) (_##MOD##_##SYM.value)
> +#define GET_MOD_SYM_PTR(MOD, SYM) (&_##MOD##_##SYM)
> +#define MOD_SYM_EXIST(MOD, SYM) (_##MOD##_##SYM.index >= 0)
> +#define SYM_EXIST(p) ((p)->index >= 0)
> +
> +#define GET_KERN_SYM(SYM) GET_MOD_SYM(vmlinux, SYM)
> +#define GET_KERN_SYM_PTR(SYM) GET_MOD_SYM_PTR(vmlinux, SYM)
> +#define KERN_SYM_EXIST(SYM) MOD_SYM_EXIST(vmlinux, SYM)
> +
> +/*
> + * Required syms will be checked automatically before extension running.
> + * Optinal syms should be checked manually at extension runtime.
> + */
> +#define INIT_MOD_SYM(MOD, SYM) INIT_MOD_SYM_RQD(MOD, SYM, 1)
> +#define INIT_OPT_MOD_SYM(MOD, SYM) INIT_MOD_SYM_RQD(MOD, SYM, 0)
> +
> +#define INIT_KERN_SYM(SYM) INIT_MOD_SYM(vmlinux, SYM)
> +#define INIT_OPT_KERN_SYM(SYM) INIT_OPT_MOD_SYM(vmlinux, SYM)
> +
> +struct section_range {
> + char *start;
> + char *stop;
> +};
> +
> +#define REGISTER_SECTION(T) \
> +bool register_##T##_section(char *start, char *stop) \
> +{ \
> + struct section_range *new_sr; \
> + struct T##_info **p; \
> + bool ret = false; \
> + \
> + if (!start || !stop) { \
> + fprintf(stderr, "%s: Invalid section start/stop\n", \
> + __func__); \
> + goto out; \
> + } \
> + \
> + for (p = (struct T##_info **)start; \
> + p < (struct T##_info **)stop; \
> + p++) { \
> + if (!add_##T##_modname((*p)->modname)) \
> + goto out; \
> + } \
> + \
> + new_sr = malloc(sizeof(struct section_range)); \
> + if (!new_sr) { \
> + fprintf(stderr, "%s: Not enough memory!\n", __func__); \
> + goto out; \
> + } \
> + new_sr->start = start; \
> + new_sr->stop = stop; \
> + if (!add_to_arr((void ***)&sr, &sr_len, &sr_cap, new_sr)) { \
> + free(new_sr); \
> + goto out; \
> + } \
> + ret = true; \
> +out: \
> + return ret; \
> +}
> +
> +bool add_to_arr(void ***arr, int *arr_len, int *arr_cap, void *elem);
> +bool push_uniq_str(void ***arr, int *arr_len, int *arr_cap, char *str);
> +bool check_ksyms_require_modname(char *modname, int *total);
> +bool register_ksym_section(char *start, char *stop);
> +bool read_vmcoreinfo_kallsyms(void);
> +bool init_kernel_kallsyms(void);
> +#endif /* _KALLSYMS_H */
> \ No newline at end of file
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 12fb0d8..dba3628 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -27,6 +27,7 @@
> #include <limits.h>
> #include <assert.h>
> #include <zlib.h>
> +#include "kallsyms.h"
>
> struct symbol_table symbol_table;
> struct size_table size_table;
> @@ -3105,6 +3106,8 @@ read_vmcoreinfo_from_vmcore(off_t offset, unsigned long size, int flag_xen_hv)
> if (!read_vmcoreinfo())
> goto out;
> }
> + read_vmcoreinfo_kallsyms();
> +
> close_vmcoreinfo();
>
> ret = TRUE;
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 134eb7a..0f13743 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -259,6 +259,7 @@ static inline int string_exists(char *s) { return (s ? TRUE : FALSE); }
> #define UINT(ADDR) *((unsigned int *)(ADDR))
> #define ULONG(ADDR) *((unsigned long *)(ADDR))
> #define ULONGLONG(ADDR) *((unsigned long long *)(ADDR))
> +#define VOID_PTR(ADDR) *((void **)(ADDR))
>
>
> /*
> @@ -1919,6 +1920,16 @@ struct symbol_table {
> * symbols on sparc64 arch
> */
> unsigned long long vmemmap_table;
> +
> + /*
> + * kallsyms related
> + */
> + unsigned long long kallsyms_names;
> + unsigned long long kallsyms_num_syms;
> + unsigned long long kallsyms_token_table;
> + unsigned long long kallsyms_token_index;
> + unsigned long long kallsyms_offsets;
> + unsigned long long kallsyms_relative_base;
> };
>
> struct size_table {
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v4][makedumpfile 3/7] Implement kernel btf resolving
2026-03-17 15:07 [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering Tao Liu
2026-03-17 15:07 ` [PATCH v4][makedumpfile 1/7] Reserve sections for makedumpfile and extenions Tao Liu
2026-03-17 15:07 ` [PATCH v4][makedumpfile 2/7] Implement kernel kallsyms resolving Tao Liu
@ 2026-03-17 15:07 ` Tao Liu
2026-04-02 23:41 ` Stephen Brennan
2026-04-03 8:13 ` HAGIO KAZUHITO(萩尾 一仁)
2026-03-17 15:07 ` [PATCH v4][makedumpfile 4/7] Implement kernel module's kallsyms resolving Tao Liu
` (5 subsequent siblings)
8 siblings, 2 replies; 21+ messages in thread
From: Tao Liu @ 2026-03-17 15:07 UTC (permalink / raw)
To: yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, stephen.s.brennan, Tao Liu
This patch parses the kernel's btf data using libbpf. The kernel's
btf data is located between the __start_BTF and __stop_BTF symbols, which
are resolved by the kallsyms support of the previous patch. As in the
previous patch, the .init_ktypes sections of makedumpfile and the
extensions are iterated, and any types which belong to vmlinux are
resolved at this time.
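The section-iteration pattern described above can be sketched as follows. This is a simplified model under stated assumptions, not the patch's code: struct demo_sym, DEMO_INIT_SYM() and demo_resolve() are hypothetical names, and the section is called "demo_syms" (no leading dot) so that GNU ld provides the __start_/__stop_ symbols automatically, whereas the patchset uses a linker script for its dot-prefixed sections.

```c
#include <stddef.h>
#include <string.h>

struct demo_sym {
	const char *name;
	int index;	/* -1 until resolved */
};

/* Define an entry and place a pointer to it in the "demo_syms" section. */
#define DEMO_INIT_SYM(SYM) \
	struct demo_sym demo_##SYM = { #SYM, -1 }; \
	__attribute__((section("demo_syms"), used)) \
	struct demo_sym *demo_ptr_##SYM = &demo_##SYM

DEMO_INIT_SYM(_stext);
DEMO_INIT_SYM(jiffies);

/* GNU ld defines these for sections whose name is a valid C identifier. */
extern struct demo_sym *__start_demo_syms[];
extern struct demo_sym *__stop_demo_syms[];

/* Mark every entry called name as resolved; returns the number of matches. */
int demo_resolve(const char *name, int index)
{
	int hits = 0;

	for (struct demo_sym **p = __start_demo_syms; p < __stop_demo_syms; p++) {
		if (strcmp((*p)->name, name) == 0) {
			(*p)->index = index;
			hits++;
		}
	}
	return hits;
}
```

Because each declaration only drops a pointer into the section, the resolver does not need a central table of requested symbols or types; it walks whatever the linker collected.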
Another primary function implemented in this patch is recursively
descending into anonymous structs/unions when one is encountered, in
order to find a member by its name.
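The recursive lookup can be illustrated with a simplified model that does not use libbpf. The types struct demo_type and struct demo_member and the function demo_find_member() are hypothetical; the real code walks BTF members instead, and unlike this sketch it stores its result in a struct ktype_info rather than returning it.

```c
#include <stddef.h>
#include <string.h>

struct demo_type;

struct demo_member {
	const char *name;		/* NULL or "" for an anonymous member */
	unsigned int bit_offset;	/* offset within the enclosing type */
	const struct demo_type *type;	/* non-NULL for struct/union members */
};

struct demo_type {
	const struct demo_member *members;
	size_t nr_members;
};

/*
 * Returns the bit offset of the member called name, measured from the
 * outermost type, or -1 if no such member exists.
 */
long demo_find_member(const struct demo_type *t, const char *name,
		      unsigned int base)
{
	for (size_t i = 0; i < t->nr_members; i++) {
		const struct demo_member *m = &t->members[i];
		unsigned int off = base + m->bit_offset;

		if (m->name && m->name[0]) {
			if (strcmp(m->name, name) == 0)
				return off;
			continue;	/* named member: do not look inside it */
		}
		if (m->type) {
			/* anonymous struct/union: search it with the accumulated offset */
			long found = demo_find_member(m->type, name, off);
			if (found >= 0)
				return found;
		}
	}
	return -1;
}
```

Carrying the accumulated offset down the recursion is what lets a member of a nested anonymous union be reported relative to the outer struct.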
Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
---
Makefile | 4 +-
btf_info.c | 233 +++++++++++++++++++++++++++++++++++++++++++++++++++++
btf_info.h | 90 +++++++++++++++++++++
3 files changed, 325 insertions(+), 2 deletions(-)
create mode 100644 btf_info.c
create mode 100644 btf_info.h
diff --git a/Makefile b/Makefile
index a57185e..320677d 100644
--- a/Makefile
+++ b/Makefile
@@ -45,12 +45,12 @@ CFLAGS_ARCH += -m32
endif
SRC_BASE = makedumpfile.c makedumpfile.h diskdump_mod.h sadump_mod.h sadump_info.h
-SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c
+SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c btf_info.c
OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c arch/mips64.c arch/loongarch64.c arch/riscv64.c
OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
-LIBS = -ldw -lbz2 -ldl -lelf -lz
+LIBS = -ldw -lbz2 -ldl -lelf -lz -lbpf
ifneq ($(LINKTYPE), dynamic)
LIBS := -static $(LIBS) -llzma
endif
diff --git a/btf_info.c b/btf_info.c
new file mode 100644
index 0000000..1cb66e2
--- /dev/null
+++ b/btf_info.c
@@ -0,0 +1,233 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <bpf/btf.h>
+#include <bpf/libbpf_legacy.h>
+#include "makedumpfile.h"
+#include "kallsyms.h"
+#include "btf_info.h"
+
+struct btf_arr_elem {
+ struct btf *btf;
+ char *module;
+};
+
+static struct btf_arr_elem **btf_arr = NULL;
+static int btf_arr_len = 0;
+static int btf_arr_cap = 0;
+
+/* makedumpfile & extensions' .init_ktypes section range array */
+static struct section_range **sr = NULL;
+static int sr_len = 0;
+static int sr_cap = 0;
+
+/* Which mod's btf should be inited? */
+static char **mods = NULL;
+static int mods_len = 0;
+static int mods_cap = 0;
+
+static bool add_ktype_modname(char *modname)
+{
+ return push_uniq_str((void ***)&mods, &mods_len, &mods_cap, modname);
+}
+
+bool check_ktypes_require_modname(char *modname, int *total)
+{
+ if (total)
+ *total = mods_len;
+ for (int i = 0; i < mods_len; i++) {
+ if (!strcmp(modname, mods[i]))
+ return true;
+ }
+ return false;
+}
+
+static void cleanup_ktypes_modname(void)
+{
+ if (mods) {
+ free(mods);
+ mods = NULL;
+ }
+ mods_len = 0;
+ mods_cap = 0;
+}
+
+/*
+ * Used by makedumpfile and extensions, to register their .init_ktypes section,
+ * so btf_info can know which module/type should be inited.
+*/
+REGISTER_SECTION(ktype)
+
+static void cleanup_ktypes_section_range(void)
+{
+ for (int i = 0; i < sr_len; i++) {
+ free(sr[i]);
+ }
+ if (sr) {
+ free(sr);
+ sr = NULL;
+ }
+ sr_len = 0;
+ sr_cap = 0;
+}
+
+static void find_member_recursive(struct btf *btf, int struct_typeid,
+ int base_offset, struct ktype_info *ki)
+{
+ const struct btf_type *st;
+ struct btf_member *bm;
+ int i, vlen;
+
+ struct_typeid = btf__resolve_type(btf, struct_typeid);
+ st = btf__type_by_id(btf, struct_typeid);
+
+ if (!st)
+ return;
+
+ if (BTF_INFO_KIND(st->info) != BTF_KIND_STRUCT &&
+ BTF_INFO_KIND(st->info) != BTF_KIND_UNION)
+ return;
+
+ vlen = BTF_INFO_VLEN(st->info);
+ bm = btf_members(st);
+
+ for (i = 0; i < vlen; i++, bm++) {
+ const char *name = btf__name_by_offset(btf, bm->name_off);
+ int member_bit_offset = btf_member_bit_offset(st, i) + base_offset;
+ int member_typeid = btf__resolve_type(btf, bm->type);
+ const struct btf_type *mt = btf__type_by_id(btf, member_typeid);
+
+ if (name && strcmp(name, ki->member_name) == 0) {
+ ki->member_bit_offset = member_bit_offset;
+ ki->member_bit_sz = btf_member_bitfield_size(st, i);
+ ki->member_size = btf__resolve_size(btf, member_typeid);
+ ki->index = i;
+ return;
+ }
+
+ if (!name || !name[0]) {
+ if (BTF_INFO_KIND(mt->info) == BTF_KIND_STRUCT ||
+ BTF_INFO_KIND(mt->info) == BTF_KIND_UNION) {
+ find_member_recursive(btf, member_typeid,
+ member_bit_offset, ki);
+ }
+ }
+ }
+}
+
+static void get_ktype_info(struct ktype_info *ki, char *mod_to_resolve)
+{
+ int i, j, start_id;
+
+ if (mod_to_resolve != NULL) {
+ if (strcmp(ki->modname, mod_to_resolve) != 0)
+ /* Exit safely */
+ return;
+ }
+
+ for (i = 0; i < btf_arr_len; i++) {
+ if (strcmp(btf_arr[i]->module, ki->modname) != 0)
+ continue;
+ /*
+ * vmlinux(btf_arr[0])'s typeid is 1~vmlinux_type_cnt,
+ * modules(btf_arr[1...])'s typeid is vmlinux_type_cnt~btf__type_cnt
+ */
+ start_id = (i == 0 ? 1 : btf__type_cnt(btf_arr[0]->btf));
+
+ for (j = start_id; j < btf__type_cnt(btf_arr[i]->btf); j++) {
+ const struct btf_type *bt =
+ btf__type_by_id(btf_arr[i]->btf, j);
+ const char *name =
+ btf__name_by_offset(btf_arr[i]->btf, bt->name_off);
+
+ if (name && strcmp(ki->struct_name, name) == 0) {
+ if (ki->member_name != NULL) {
+ /* Retrieve member info */
+ find_member_recursive(btf_arr[i]->btf, j, 0, ki);
+ } else {
+ ki->index = j;
+ }
+ ki->struct_size = btf__resolve_size(btf_arr[i]->btf, j);
+ return;
+ }
+ }
+ }
+}
+
+static bool add_to_btf_arr(struct btf *btf, char *module_name)
+{
+ struct btf_arr_elem *new_p;
+
+ new_p = malloc(sizeof(struct btf_arr_elem));
+ if (!new_p)
+ goto no_mem;
+
+ new_p->btf = btf;
+ new_p->module = module_name;
+
+ return add_to_arr((void ***)&btf_arr, &btf_arr_len, &btf_arr_cap, new_p);
+
+no_mem:
+ fprintf(stderr, "%s: Not enough memory!\n", __func__);
+ return false;
+}
+
+INIT_KERN_SYM(__start_BTF);
+INIT_KERN_SYM(__stop_BTF);
+
+/*
+ * Makedumpfile's .init_ktypes section
+*/
+extern struct ktype_info *__start_init_ktypes[];
+extern struct ktype_info *__stop_init_ktypes[];
+
+bool init_kernel_btf(void)
+{
+ uint64_t size;
+ struct btf *btf;
+ int i;
+ struct ktype_info **p;
+ char *buf = NULL;
+ bool ret = false;
+
+ uint64_t start_btf = GET_KERN_SYM(__start_BTF);
+ uint64_t stop_btf = GET_KERN_SYM(__stop_BTF);
+ if (!KERN_SYM_EXIST(__start_BTF) ||
+ !KERN_SYM_EXIST(__stop_BTF)) {
+ fprintf(stderr, "%s: symbol __start/stop_BTF not found!\n", __func__);
+ goto out;
+ }
+
+ if (!register_ktype_section((char *)__start_init_ktypes,
+ (char *)__stop_init_ktypes))
+ return ret;
+
+ size = stop_btf - start_btf;
+ buf = (char *)malloc(size);
+ if (!buf) {
+ fprintf(stderr, "%s: Not enough memory!\n", __func__);
+ goto out;
+ }
+ readmem(VADDR, start_btf, buf, size);
+ btf = btf__new(buf, size);
+
+ if (libbpf_get_error(btf) != 0 ||
+ add_to_btf_arr(btf, strdup("vmlinux")) == false) {
+ fprintf(stderr, "%s: init vmlinux btf fail\n", __func__);
+ goto out;
+ }
+
+ for (i = 0; i < sr_len; i++) {
+ for (p = (struct ktype_info **)(sr[i]->start);
+ p < (struct ktype_info **)(sr[i]->stop);
+ p++) {
+ get_ktype_info(*p, "vmlinux");
+ }
+ }
+
+ ret = true;
+out:
+ if (buf)
+ free(buf);
+ return ret;
+}
\ No newline at end of file
diff --git a/btf_info.h b/btf_info.h
new file mode 100644
index 0000000..2cf6b07
--- /dev/null
+++ b/btf_info.h
@@ -0,0 +1,90 @@
+#ifndef _BTF_INFO_H
+#define _BTF_INFO_H
+#include <stdint.h>
+#include <stdbool.h>
+
+struct ktype_info {
+ /********in******/
+ char *modname; // Set to search within the module, in case
+ // name conflict of different modules
+ char *struct_name; // Search by struct name
+ char *member_name; // Search by member name
+ bool struct_required : 1;
+ bool member_required : 1;
+ /********out*****/
+ uint32_t member_bit_offset; // member offset in bits
+ uint32_t member_bit_sz; // member width in bits
+ uint32_t member_size; // member size in bytes
+ uint32_t struct_size; // struct size in bytes
+ int index; // -1 if type not found
+};
+
+bool check_ktypes_require_modname(char *modname, int *total);
+bool register_ktype_section(char *start, char *stop);
+bool init_kernel_btf(void);
+
+#define QUATE(x) #x
+#define INIT_MOD_STRUCT_MEMBER_RQD(MOD, S, M, R) \
+ struct ktype_info _##MOD##_##S##_##M = { \
+ QUATE(MOD), QUATE(S), QUATE(M), R, R, 0, 0, 0, 0, -1 \
+ }; \
+ __attribute__((section(".init_ktypes"), used)) \
+ struct ktype_info * _ptr_##MOD##_##S##_##M = &_##MOD##_##S##_##M
+
+/*
+ * Required types are checked automatically before the extension runs.
+ * Optional types should be checked manually at extension runtime.
+ */
+#define INIT_MOD_STRUCT_MEMBER(MOD, S, M) \
+ INIT_MOD_STRUCT_MEMBER_RQD(MOD, S, M, 1)
+#define INIT_OPT_MOD_STRUCT_MEMBER(MOD, S, M) \
+ INIT_MOD_STRUCT_MEMBER_RQD(MOD, S, M, 0)
+
+#define DECLARE_MOD_STRUCT_MEMBER(MOD, S, M) \
+ extern struct ktype_info _##MOD##_##S##_##M
+
+#define GET_MOD_STRUCT_MEMBER_MOFF(MOD, S, M) (_##MOD##_##S##_##M.member_bit_offset)
+#define GET_MOD_STRUCT_MEMBER_MSIZE(MOD, S, M) (_##MOD##_##S##_##M.member_size)
+#define GET_MOD_STRUCT_MEMBER_SSIZE(MOD, S, M) (_##MOD##_##S##_##M.struct_size)
+#define MOD_STRUCT_MEMBER_EXIST(MOD, S, M) (_##MOD##_##S##_##M.index >= 0)
+#define TYPE_EXIST(p) ((p)->index >= 0)
+
+#define INIT_KERN_STRUCT_MEMBER(S, M) \
+ INIT_MOD_STRUCT_MEMBER(vmlinux, S, M)
+#define INIT_OPT_KERN_STRUCT_MEMBER(S, M) \
+ INIT_OPT_MOD_STRUCT_MEMBER(vmlinux, S, M)
+
+#define DECLARE_KERN_STRUCT_MEMBER(S, M) \
+ DECLARE_MOD_STRUCT_MEMBER(vmlinux, S, M)
+
+#define GET_KERN_STRUCT_MEMBER_MOFF(S, M) GET_MOD_STRUCT_MEMBER_MOFF(vmlinux, S, M)
+#define GET_KERN_STRUCT_MEMBER_MSIZE(S, M) GET_MOD_STRUCT_MEMBER_MSIZE(vmlinux, S, M)
+#define GET_KERN_STRUCT_MEMBER_SSIZE(S, M) GET_MOD_STRUCT_MEMBER_SSIZE(vmlinux, S, M)
+#define KERN_STRUCT_MEMBER_EXIST(S, M) MOD_STRUCT_MEMBER_EXIST(vmlinux, S, M)
+
+#define INIT_MOD_STRUCT_RQD(MOD, S, R) \
+ struct ktype_info _##MOD##_##S = { \
+ QUATE(MOD), QUATE(S), 0, R, 0, 0, 0, 0, 0, -1 \
+ }; \
+ __attribute__((section(".init_ktypes"), used)) \
+ struct ktype_info * _ptr_##MOD##_##S = &_##MOD##_##S
+
+#define INIT_MOD_STRUCT(MOD, S) INIT_MOD_STRUCT_RQD(MOD, S, 1)
+#define INIT_OPT_MOD_STRUCT(MOD, S) INIT_MOD_STRUCT_RQD(MOD, S, 0)
+
+#define DECLARE_MOD_STRUCT(MOD, S) \
+ extern struct ktype_info _##MOD##_##S
+
+#define GET_MOD_STRUCT_SSIZE(MOD, S) (_##MOD##_##S.struct_size)
+#define MOD_STRUCT_EXIST(MOD, S) (_##MOD##_##S.index >= 0)
+
+#define INIT_KERN_STRUCT(S) INIT_MOD_STRUCT(vmlinux, S)
+#define INIT_OPT_KERN_STRUCT(S) INIT_OPT_MOD_STRUCT(vmlinux, S)
+
+#define DECLARE_KERN_STRUCT(S) \
+ DECLARE_MOD_STRUCT(vmlinux, S)
+
+#define GET_KERN_STRUCT_SSIZE(S) GET_MOD_STRUCT_SSIZE(vmlinux, S)
+#define KERN_STRUCT_EXIST(S) MOD_STRUCT_EXIST(vmlinux, S)
+
+#endif /* _BTF_INFO_H */
\ No newline at end of file
--
2.47.0
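For readers following the anonymous struct/union handling in
find_member_recursive(), here is a minimal sketch of the same idea. It
is an assumption-laden illustration, not the code from this patch: it
does not use libbpf, and every mock_* name and the small type table
are hypothetical stand-ins for BTF records. The point it shows is that
the bit offset of each anonymous aggregate is carried down as a base
and added to the offsets of the members found inside it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for BTF records; these are not libbpf types. */
enum mock_kind { MOCK_INT, MOCK_STRUCT, MOCK_UNION };

struct mock_member {
	const char *name;	/* NULL or "" marks an anonymous member */
	int type_id;		/* index into mock_types[] */
	int bit_offset;		/* offset inside the enclosing aggregate */
};

struct mock_type {
	enum mock_kind kind;
	int nr_members;
	const struct mock_member *members;
};

/*
 * The table models:
 *
 *   struct outer {
 *           int a;                  bit 0
 *           union {                 bit 32, anonymous
 *                   int b;          bit 0 inside the union
 *                   struct {        bit 0 inside the union, anonymous
 *                           int c;  bit 0
 *                           int d;  bit 32
 *                   };
 *           };
 *   };
 */
static const struct mock_member inner_members[] = {
	{ "c", 0, 0 }, { "d", 0, 32 },
};
static const struct mock_member union_members[] = {
	{ "b", 0, 0 }, { NULL, 1, 0 },
};
static const struct mock_member outer_members[] = {
	{ "a", 0, 0 }, { NULL, 2, 32 },
};
static const struct mock_type mock_types[] = {
	{ MOCK_INT, 0, NULL },			/* id 0: int */
	{ MOCK_STRUCT, 2, inner_members },	/* id 1: anonymous struct */
	{ MOCK_UNION, 2, union_members },	/* id 2: anonymous union */
	{ MOCK_STRUCT, 2, outer_members },	/* id 3: struct outer */
};

/*
 * Return the bit offset of member "want" relative to the outermost
 * aggregate, or -1 if it is not found. Only anonymous members are
 * descended into; named nested structs are left alone.
 */
static int mock_find_member_bits(int type_id, int base_bits, const char *want)
{
	const struct mock_type *t;
	int i;

	assert(type_id >= 0 &&
	       (size_t)type_id < sizeof(mock_types) / sizeof(mock_types[0]));
	t = &mock_types[type_id];

	if (t->kind != MOCK_STRUCT && t->kind != MOCK_UNION)
		return -1;

	for (i = 0; i < t->nr_members; i++) {
		const struct mock_member *m = &t->members[i];
		int off = base_bits + m->bit_offset;

		if (m->name && strcmp(m->name, want) == 0)
			return off;

		if (!m->name || !m->name[0]) {
			int found = mock_find_member_bits(m->type_id, off, want);

			if (found >= 0)
				return found;
		}
	}
	return -1;
}
```

Unlike the patch, this sketch returns the offset instead of storing it
in a ktype_info, and it stops at the first match found inside a nested
anonymous aggregate.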
* Re: [PATCH v4][makedumpfile 3/7] Implement kernel btf resolving
2026-03-17 15:07 ` [PATCH v4][makedumpfile 3/7] Implement kernel btf resolving Tao Liu
@ 2026-04-02 23:41 ` Stephen Brennan
2026-04-03 8:13 ` HAGIO KAZUHITO(萩尾 一仁)
1 sibling, 0 replies; 21+ messages in thread
From: Stephen Brennan @ 2026-04-02 23:41 UTC (permalink / raw)
To: Tao Liu, yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, Tao Liu
Tao Liu <ltao@redhat.com> writes:
> This patch will parse kernel's btf data using libbpf. The kernel's
> btf data is located between __start_BTF and __stop_BTF symbols which
> are resolved by kallsyms of the previous patch. Same as the previous
> one, the .init_ktypes section of makedumpfile and the extensions will
> be iterated, and any types which belongs to vmlinux can be resolved
> at this time.
>
> Another primary function implemented in this patch, is recursively
> diving into anonymous struct/union when encountered any, to find a
> member by given its name.
>
> Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> ---
> Makefile | 4 +-
> btf_info.c | 233 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> btf_info.h | 90 +++++++++++++++++++++
> 3 files changed, 325 insertions(+), 2 deletions(-)
> create mode 100644 btf_info.c
> create mode 100644 btf_info.h
>
> diff --git a/Makefile b/Makefile
> index a57185e..320677d 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -45,12 +45,12 @@ CFLAGS_ARCH += -m32
> endif
>
> SRC_BASE = makedumpfile.c makedumpfile.h diskdump_mod.h sadump_mod.h sadump_info.h
> -SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c
> +SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c btf_info.c
> OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
> SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c arch/mips64.c arch/loongarch64.c arch/riscv64.c
> OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
>
> -LIBS = -ldw -lbz2 -ldl -lelf -lz
> +LIBS = -ldw -lbz2 -ldl -lelf -lz -lbpf
> ifneq ($(LINKTYPE), dynamic)
> LIBS := -static $(LIBS) -llzma
> endif
> diff --git a/btf_info.c b/btf_info.c
> new file mode 100644
> index 0000000..1cb66e2
> --- /dev/null
> +++ b/btf_info.c
> @@ -0,0 +1,233 @@
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <bpf/btf.h>
> +#include <bpf/libbpf_legacy.h>
> +#include "makedumpfile.h"
> +#include "kallsyms.h"
> +#include "btf_info.h"
> +
> +struct btf_arr_elem {
> + struct btf *btf;
> + char *module;
> +};
> +
> +static struct btf_arr_elem **btf_arr = NULL;
> +static int btf_arr_len = 0;
> +static int btf_arr_cap = 0;
> +
> +/* makedumpfile & extensions' .init_ktypes section range array */
> +static struct section_range **sr = NULL;
> +static int sr_len = 0;
> +static int sr_cap = 0;
> +
> +/* Which mod's btf should be inited? */
> +static char **mods = NULL;
> +static int mods_len = 0;
> +static int mods_cap = 0;
> +
> +static bool add_ktype_modname(char *modname)
> +{
> + return push_uniq_str((void ***)&mods, &mods_len, &mods_cap, modname);
> +}
> +
> +bool check_ktypes_require_modname(char *modname, int *total)
> +{
> + if (total)
> + *total = mods_len;
> + for (int i = 0; i < mods_len; i++) {
> + if (!strcmp(modname, mods[i]))
> + return true;
> + }
> + return false;
> +}
> +
> +static void cleanup_ktypes_modname(void)
> +{
> + if (mods) {
> + free(mods);
> + mods = NULL;
> + }
> + mods_len = 0;
> + mods_cap = 0;
> +}
> +
> +/*
> + * Used by makedumpfile and extensions, to register their .init_ktypes section,
> + * so btf_info can know which module/type should be inited.
> +*/
> +REGISTER_SECTION(ktype)
> +
> +static void cleanup_ktypes_section_range(void)
> +{
> + for (int i = 0; i < sr_len; i++) {
> + free(sr[i]);
> + }
> + if (sr) {
> + free(sr);
> + sr = NULL;
> + }
> + sr_len = 0;
> + sr_cap = 0;
> +}
> +
> +static void find_member_recursive(struct btf *btf, int struct_typeid,
> + int base_offset, struct ktype_info *ki)
> +{
> + const struct btf_type *st;
> + struct btf_member *bm;
> + int i, vlen;
> +
> + struct_typeid = btf__resolve_type(btf, struct_typeid);
> + st = btf__type_by_id(btf, struct_typeid);
> +
> + if (!st)
> + return;
> +
> + if (BTF_INFO_KIND(st->info) != BTF_KIND_STRUCT &&
> + BTF_INFO_KIND(st->info) != BTF_KIND_UNION)
> + return;
> +
> + vlen = BTF_INFO_VLEN(st->info);
> + bm = btf_members(st);
> +
> + for (i = 0; i < vlen; i++, bm++) {
> + const char *name = btf__name_by_offset(btf, bm->name_off);
> + int member_bit_offset = btf_member_bit_offset(st, i) + base_offset;
> + int member_typeid = btf__resolve_type(btf, bm->type);
> + const struct btf_type *mt = btf__type_by_id(btf, member_typeid);
> +
> + if (name && strcmp(name, ki->member_name) == 0) {
> + ki->member_bit_offset = member_bit_offset;
> + ki->member_bit_sz = btf_member_bitfield_size(st, i);
> + ki->member_size = btf__resolve_size(btf, member_typeid);
> + ki->index = i;
> + return;
> + }
> +
> + if (!name || !name[0]) {
> + if (BTF_INFO_KIND(mt->info) == BTF_KIND_STRUCT ||
> + BTF_INFO_KIND(mt->info) == BTF_KIND_UNION) {
> + find_member_recursive(btf, member_typeid,
> + member_bit_offset, ki);
> + }
> + }
> + }
> +}
> +
> +static void get_ktype_info(struct ktype_info *ki, char *mod_to_resolve)
> +{
> + int i, j, start_id;
> +
> + if (mod_to_resolve != NULL) {
> + if (strcmp(ki->modname, mod_to_resolve) != 0)
> + /* Exit safely */
> + return;
> + }
> +
> + for (i = 0; i < btf_arr_len; i++) {
> + if (strcmp(btf_arr[i]->module, ki->modname) != 0)
> + continue;
> + /*
> + * vmlinux(btf_arr[0])'s typeid is 1~vmlinux_type_cnt,
> + * modules(btf_arr[1...])'s typeid is vmlinux_type_cnt~btf__type_cnt
> + */
> + start_id = (i == 0 ? 1 : btf__type_cnt(btf_arr[0]->btf));
> +
> + for (j = start_id; j < btf__type_cnt(btf_arr[i]->btf); j++) {
> + const struct btf_type *bt =
> + btf__type_by_id(btf_arr[i]->btf, j);
> + const char *name =
> + btf__name_by_offset(btf_arr[i]->btf, bt->name_off);
> +
> + if (name && strcmp(ki->struct_name, name) == 0) {
> + if (ki->member_name != NULL) {
> + /* Retrieve member info */
> + find_member_recursive(btf_arr[i]->btf, j, 0, ki);
> + } else {
> + ki->index = j;
> + }
> + ki->struct_size = btf__resolve_size(btf_arr[i]->btf, j);
> + return;
> + }
> + }
> + }
> +}
> +
> +static bool add_to_btf_arr(struct btf *btf, char *module_name)
> +{
> + struct btf_arr_elem *new_p;
> +
> + new_p = malloc(sizeof(struct btf_arr_elem));
> + if (!new_p)
> + goto no_mem;
> +
> + new_p->btf = btf;
> + new_p->module = module_name;
> +
> + return add_to_arr((void ***)&btf_arr, &btf_arr_len, &btf_arr_cap, new_p);
> +
> +no_mem:
> + fprintf(stderr, "%s: Not enough memory!\n", __func__);
> + return false;
> +}
> +
> +INIT_KERN_SYM(__start_BTF);
> +INIT_KERN_SYM(__stop_BTF);
> +
> +/*
> + * Makedumpfile's .init_ktypes section
> +*/
> +extern struct ktype_info *__start_init_ktypes[];
> +extern struct ktype_info *__stop_init_ktypes[];
> +
> +bool init_kernel_btf(void)
> +{
> + uint64_t size;
> + struct btf *btf;
> + int i;
> + struct ktype_info **p;
> + char *buf = NULL;
> + bool ret = false;
> +
> + uint64_t start_btf = GET_KERN_SYM(__start_BTF);
> + uint64_t stop_btf = GET_KERN_SYM(__stop_BTF);
> + if (!KERN_SYM_EXIST(__start_BTF) ||
> + !KERN_SYM_EXIST(__stop_BTF)) {
> + fprintf(stderr, "%s: symbol __start/stop_BTF not found!\n", __func__);
> + goto out;
> + }
> +
> + if (!register_ktype_section((char *)__start_init_ktypes,
> + (char *)__stop_init_ktypes))
> + return ret;
> +
> + size = stop_btf - start_btf;
> + buf = (char *)malloc(size);
> + if (!buf) {
> + fprintf(stderr, "%s: Not enough memory!\n", __func__);
> + goto out;
> + }
> + readmem(VADDR, start_btf, buf, size);
> + btf = btf__new(buf, size);
> +
> + if (libbpf_get_error(btf) != 0 ||
> + add_to_btf_arr(btf, strdup("vmlinux")) == false) {
> + fprintf(stderr, "%s: init vmlinux btf fail\n", __func__);
> + goto out;
> + }
> +
> + for (i = 0; i < sr_len; i++) {
> + for (p = (struct ktype_info **)(sr[i]->start);
> + p < (struct ktype_info **)(sr[i]->stop);
> + p++) {
> + get_ktype_info(*p, "vmlinux");
> + }
> + }
> +
> + ret = true;
> +out:
> + if (buf)
> + free(buf);
> + return ret;
> +}
> \ No newline at end of file
> diff --git a/btf_info.h b/btf_info.h
> new file mode 100644
> index 0000000..2cf6b07
> --- /dev/null
> +++ b/btf_info.h
> @@ -0,0 +1,90 @@
> +#ifndef _BTF_INFO_H
> +#define _BTF_INFO_H
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +struct ktype_info {
> + /********in******/
> + char *modname; // Set to search within the module, in case
> + // name conflict of different modules
> + char *struct_name; // Search by struct name
> + char *member_name; // Search by member name
> + bool struct_required : 1;
> + bool member_required : 1;
> + /********out*****/
> + uint32_t member_bit_offset; // member offset in bits
> + uint32_t member_bit_sz; // member width in bits
> + uint32_t member_size; // member size in bytes
> + uint32_t struct_size; // struct size in bytes
> + int index; // -1 if type not found
> +};
> +
> +bool check_ktypes_require_modname(char *modname, int *total);
> +bool register_ktype_section(char *start, char *stop);
> +bool init_kernel_btf(void);
> +
> +#define QUATE(x) #x
> +#define INIT_MOD_STRUCT_MEMBER_RQD(MOD, S, M, R) \
> + struct ktype_info _##MOD##_##S##_##M = { \
> + QUATE(MOD), QUATE(S), QUATE(M), R, R, 0, 0, 0, 0, -1 \
> + }; \
> + __attribute__((section(".init_ktypes"), used)) \
> + struct ktype_info * _ptr_##MOD##_##S##_##M = &_##MOD##_##S##_##M
> +
> +/*
> + * Required types are checked automatically before the extension runs.
> + * Optional types should be checked manually at extension runtime.
> + */
> +#define INIT_MOD_STRUCT_MEMBER(MOD, S, M) \
> + INIT_MOD_STRUCT_MEMBER_RQD(MOD, S, M, 1)
> +#define INIT_OPT_MOD_STRUCT_MEMBER(MOD, S, M) \
> + INIT_MOD_STRUCT_MEMBER_RQD(MOD, S, M, 0)
> +
> +#define DECLARE_MOD_STRUCT_MEMBER(MOD, S, M) \
> + extern struct ktype_info _##MOD##_##S##_##M
> +
> +#define GET_MOD_STRUCT_MEMBER_MOFF(MOD, S, M) (_##MOD##_##S##_##M.member_bit_offset)
> +#define GET_MOD_STRUCT_MEMBER_MSIZE(MOD, S, M) (_##MOD##_##S##_##M.member_size)
> +#define GET_MOD_STRUCT_MEMBER_SSIZE(MOD, S, M) (_##MOD##_##S##_##M.struct_size)
> +#define MOD_STRUCT_MEMBER_EXIST(MOD, S, M) (_##MOD##_##S##_##M.index >= 0)
> +#define TYPE_EXIST(p) ((p)->index >= 0)
> +
> +#define INIT_KERN_STRUCT_MEMBER(S, M) \
> + INIT_MOD_STRUCT_MEMBER(vmlinux, S, M)
> +#define INIT_OPT_KERN_STRUCT_MEMBER(S, M) \
> + INIT_OPT_MOD_STRUCT_MEMBER(vmlinux, S, M)
> +
> +#define DECLARE_KERN_STRUCT_MEMBER(S, M) \
> + DECLARE_MOD_STRUCT_MEMBER(vmlinux, S, M)
> +
> +#define GET_KERN_STRUCT_MEMBER_MOFF(S, M) GET_MOD_STRUCT_MEMBER_MOFF(vmlinux, S, M)
> +#define GET_KERN_STRUCT_MEMBER_MSIZE(S, M) GET_MOD_STRUCT_MEMBER_MSIZE(vmlinux, S, M)
> +#define GET_KERN_STRUCT_MEMBER_SSIZE(S, M) GET_MOD_STRUCT_MEMBER_SSIZE(vmlinux, S, M)
> +#define KERN_STRUCT_MEMBER_EXIST(S, M) MOD_STRUCT_MEMBER_EXIST(vmlinux, S, M)
> +
> +#define INIT_MOD_STRUCT_RQD(MOD, S, R) \
> + struct ktype_info _##MOD##_##S = { \
> + QUATE(MOD), QUATE(S), 0, R, 0, 0, 0, 0, 0, -1 \
> + }; \
> + __attribute__((section(".init_ktypes"), used)) \
> + struct ktype_info * _ptr_##MOD##_##S = &_##MOD##_##S
> +
> +#define INIT_MOD_STRUCT(MOD, S) INIT_MOD_STRUCT_RQD(MOD, S, 1)
> +#define INIT_OPT_MOD_STRUCT(MOD, S) INIT_MOD_STRUCT_RQD(MOD, S, 0)
> +
> +#define DECLARE_MOD_STRUCT(MOD, S) \
> + extern struct ktype_info _##MOD##_##S;
> +
> +#define GET_MOD_STRUCT_SSIZE(MOD, S) (_##MOD##_##S.struct_size)
> +#define MOD_STRUCT_EXIST(MOD, S) (_##MOD##_##S.index >= 0)
> +
> +#define INIT_KERN_STRUCT(S) INIT_MOD_STRUCT(vmlinux, S)
> +#define INIT_OPT_KERN_STRUCT(S) INIT_OPT_MOD_STRUCT(vmlinux, S)
> +
> +#define DECLARE_KERN_STRUCT(S) \
> + DECLARE_MOD_STRUCT(vmlinux, S)
> +
> +#define GET_KERN_STRUCT_SSIZE(S) GET_MOD_STRUCT_SSIZE(vmlinux, S)
> +#define KERN_STRUCT_EXIST(S) MOD_STRUCT_EXIST(vmlinux, S)
> +
> +#endif /* _BTF_INFO_H */
> \ No newline at end of file
> --
> 2.47.0
* Re: [PATCH v4][makedumpfile 3/7] Implement kernel btf resolving
2026-03-17 15:07 ` [PATCH v4][makedumpfile 3/7] Implement kernel btf resolving Tao Liu
2026-04-02 23:41 ` Stephen Brennan
@ 2026-04-03 8:13 ` HAGIO KAZUHITO(萩尾 一仁)
1 sibling, 0 replies; 21+ messages in thread
From: HAGIO KAZUHITO(萩尾 一仁) @ 2026-04-03 8:13 UTC (permalink / raw)
To: Tao Liu
Cc: stephen.s.brennan@oracle.com,
YAMAZAKI MASAMITSU(山崎 真光),
kexec@lists.infradead.org
On 2026/03/18 0:07, Tao Liu wrote:
> This patch will parse kernel's btf data using libbpf. The kernel's
> btf data is located between __start_BTF and __stop_BTF symbols which
> are resolved by kallsyms of the previous patch. Same as the previous
> one, the .init_ktypes section of makedumpfile and the extensions will
> be iterated, and any types which belongs to vmlinux can be resolved
> at this time.
>
> Another primary function implemented in this patch, is recursively
> diving into anonymous struct/union when encountered any, to find a
> member by given its name.
>
> Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> Signed-off-by: Tao Liu <ltao@redhat.com>
> ---
> Makefile | 4 +-
> btf_info.c | 233 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> btf_info.h | 90 +++++++++++++++++++++
> 3 files changed, 325 insertions(+), 2 deletions(-)
> create mode 100644 btf_info.c
> create mode 100644 btf_info.h
>
> diff --git a/Makefile b/Makefile
> index a57185e..320677d 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -45,12 +45,12 @@ CFLAGS_ARCH += -m32
> endif
>
> SRC_BASE = makedumpfile.c makedumpfile.h diskdump_mod.h sadump_mod.h sadump_info.h
> -SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c
> +SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c btf_info.c
> OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
> SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c arch/mips64.c arch/loongarch64.c arch/riscv64.c
> OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
>
> -LIBS = -ldw -lbz2 -ldl -lelf -lz
> +LIBS = -ldw -lbz2 -ldl -lelf -lz -lbpf
Not every distribution will use the extension feature or ship libbpf.
I would also like to build makedumpfile on RHEL8, but that is not
possible because of its libbpf version. Could we introduce a build
option, e.g. "EXTENSION=on", to enable the feature?
> ifneq ($(LINKTYPE), dynamic)
> LIBS := -static $(LIBS) -llzma
> endif
> diff --git a/btf_info.c b/btf_info.c
> new file mode 100644
> index 0000000..1cb66e2
> --- /dev/null
> +++ b/btf_info.c
> @@ -0,0 +1,233 @@
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <bpf/btf.h>
> +#include <bpf/libbpf_legacy.h>
> +#include "makedumpfile.h"
> +#include "kallsyms.h"
> +#include "btf_info.h"
> +
> +struct btf_arr_elem {
> + struct btf *btf;
> + char *module;
> +};
> +
> +static struct btf_arr_elem **btf_arr = NULL;
> +static int btf_arr_len = 0;
> +static int btf_arr_cap = 0;
> +
> +/* makedumpfile & extensions' .init_ktypes section range array */
> +static struct section_range **sr = NULL;
> +static int sr_len = 0;
> +static int sr_cap = 0;
> +
> +/* Which mod's btf should be inited? */
> +static char **mods = NULL;
> +static int mods_len = 0;
> +static int mods_cap = 0;
> +
> +static bool add_ktype_modname(char *modname)
> +{
> + return push_uniq_str((void ***)&mods, &mods_len, &mods_cap, modname);
> +}
> +
> +bool check_ktypes_require_modname(char *modname, int *total)
> +{
> + if (total)
> + *total = mods_len;
> + for (int i = 0; i < mods_len; i++) {
> + if (!strcmp(modname, mods[i]))
> + return true;
> + }
> + return false;
> +}
> +
> +static void cleanup_ktypes_modname(void)
> +{
> + if (mods) {
> + free(mods);
> + mods = NULL;
> + }
> + mods_len = 0;
> + mods_cap = 0;
> +}
> +
> +/*
> + * Used by makedumpfile and extensions, to register their .init_ktypes section,
> + * so btf_info can know which module/type should be inited.
> +*/
> +REGISTER_SECTION(ktype)
> +
> +static void cleanup_ktypes_section_range(void)
> +{
> + for (int i = 0; i < sr_len; i++) {
> + free(sr[i]);
> + }
> + if (sr) {
> + free(sr);
> + sr = NULL;
> + }
> + sr_len = 0;
> + sr_cap = 0;
> +}
> +
> +static void find_member_recursive(struct btf *btf, int struct_typeid,
> + int base_offset, struct ktype_info *ki)
> +{
> + const struct btf_type *st;
> + struct btf_member *bm;
> + int i, vlen;
> +
> + struct_typeid = btf__resolve_type(btf, struct_typeid);
> + st = btf__type_by_id(btf, struct_typeid);
> +
> + if (!st)
> + return;
> +
> + if (BTF_INFO_KIND(st->info) != BTF_KIND_STRUCT &&
> + BTF_INFO_KIND(st->info) != BTF_KIND_UNION)
> + return;
> +
> + vlen = BTF_INFO_VLEN(st->info);
> + bm = btf_members(st);
> +
> + for (i = 0; i < vlen; i++, bm++) {
> + const char *name = btf__name_by_offset(btf, bm->name_off);
> + int member_bit_offset = btf_member_bit_offset(st, i) + base_offset;
> + int member_typeid = btf__resolve_type(btf, bm->type);
> + const struct btf_type *mt = btf__type_by_id(btf, member_typeid);
> +
> + if (name && strcmp(name, ki->member_name) == 0) {
> + ki->member_bit_offset = member_bit_offset;
> + ki->member_bit_sz = btf_member_bitfield_size(st, i);
> + ki->member_size = btf__resolve_size(btf, member_typeid);
> + ki->index = i;
> + return;
> + }
> +
> + if (!name || !name[0]) {
> + if (BTF_INFO_KIND(mt->info) == BTF_KIND_STRUCT ||
> + BTF_INFO_KIND(mt->info) == BTF_KIND_UNION) {
> + find_member_recursive(btf, member_typeid,
> + member_bit_offset, ki);
> + }
> + }
> + }
> +}
> +
> +static void get_ktype_info(struct ktype_info *ki, char *mod_to_resolve)
> +{
> + int i, j, start_id;
> +
> + if (mod_to_resolve != NULL) {
> + if (strcmp(ki->modname, mod_to_resolve) != 0)
> + /* Exit safely */
> + return;
> + }
> +
> + for (i = 0; i < btf_arr_len; i++) {
> + if (strcmp(btf_arr[i]->module, ki->modname) != 0)
> + continue;
> + /*
> + * vmlinux(btf_arr[0])'s typeid is 1~vmlinux_type_cnt,
> + * modules(btf_arr[1...])'s typeid is vmlinux_type_cnt~btf__type_cnt
> + */
> + start_id = (i == 0 ? 1 : btf__type_cnt(btf_arr[0]->btf));
> +
> + for (j = start_id; j < btf__type_cnt(btf_arr[i]->btf); j++) {
> + const struct btf_type *bt =
> + btf__type_by_id(btf_arr[i]->btf, j);
> + const char *name =
> + btf__name_by_offset(btf_arr[i]->btf, bt->name_off);
> +
> + if (name && strcmp(ki->struct_name, name) == 0) {
> + if (ki->member_name != NULL) {
> + /* Retrieve member info */
> + find_member_recursive(btf_arr[i]->btf, j, 0, ki);
> + } else {
> + ki->index = j;
> + }
> + ki->struct_size = btf__resolve_size(btf_arr[i]->btf, j);
> + return;
> + }
> + }
> + }
> +}
> +
> +static bool add_to_btf_arr(struct btf *btf, char *module_name)
> +{
> + struct btf_arr_elem *new_p;
> +
> + new_p = malloc(sizeof(struct btf_arr_elem));
> + if (!new_p)
> + goto no_mem;
> +
> + new_p->btf = btf;
> + new_p->module = module_name;
> +
> + return add_to_arr((void ***)&btf_arr, &btf_arr_len, &btf_arr_cap, new_p);
> +
> +no_mem:
> + fprintf(stderr, "%s: Not enough memory!\n", __func__);
> + return false;
> +}
> +
> +INIT_KERN_SYM(__start_BTF);
> +INIT_KERN_SYM(__stop_BTF);
> +
> +/*
> + * Makedumpfile's .init_ktypes section
> +*/
> +extern struct ktype_info *__start_init_ktypes[];
> +extern struct ktype_info *__stop_init_ktypes[];
> +
> +bool init_kernel_btf(void)
> +{
> + uint64_t size;
> + struct btf *btf;
> + int i;
> + struct ktype_info **p;
> + char *buf = NULL;
> + bool ret = false;
> +
> + uint64_t start_btf = GET_KERN_SYM(__start_BTF);
> + uint64_t stop_btf = GET_KERN_SYM(__stop_BTF);
> + if (!KERN_SYM_EXIST(__start_BTF) ||
> + !KERN_SYM_EXIST(__stop_BTF)) {
> + fprintf(stderr, "%s: symbol __start/stop_BTF not found!\n", __func__);
> + goto out;
> + }
> +
> + if (!register_ktype_section((char *)__start_init_ktypes,
> + (char *)__stop_init_ktypes))
> + return ret;
> +
> + size = stop_btf - start_btf;
> + buf = (char *)malloc(size);
> + if (!buf) {
> + fprintf(stderr, "%s: Not enough memory!\n", __func__);
> + goto out;
> + }
> + readmem(VADDR, start_btf, buf, size);
> + btf = btf__new(buf, size);
> +
> + if (libbpf_get_error(btf) != 0 ||
> + add_to_btf_arr(btf, strdup("vmlinux")) == false) {
> + fprintf(stderr, "%s: init vmlinux btf fail\n", __func__);
> + goto out;
> + }
> +
> + for (i = 0; i < sr_len; i++) {
> + for (p = (struct ktype_info **)(sr[i]->start);
> + p < (struct ktype_info **)(sr[i]->stop);
> + p++) {
> + get_ktype_info(*p, "vmlinux");
> + }
> + }
> +
> + ret = true;
> +out:
> + if (buf)
> + free(buf);
> + return ret;
> +}
> \ No newline at end of file
> diff --git a/btf_info.h b/btf_info.h
> new file mode 100644
> index 0000000..2cf6b07
> --- /dev/null
> +++ b/btf_info.h
> @@ -0,0 +1,90 @@
> +#ifndef _BTF_INFO_H
> +#define _BTF_INFO_H
> +#include <stdint.h>
> +#include <stdbool.h>
> +
> +struct ktype_info {
> + /********in******/
> + char *modname; // Set to search within the module, in case
> + // name conflict of different modules
> + char *struct_name; // Search by struct name
> + char *member_name; // Search by member name
> + bool struct_required : 1;
> + bool member_required : 1;
> + /********out*****/
> + uint32_t member_bit_offset; // member offset in bits
> + uint32_t member_bit_sz; // member width in bits
> + uint32_t member_size; // member size in bytes
> + uint32_t struct_size; // struct size in bytes
> + int index; // -1 if type not found
> +};
> +
> +bool check_ktypes_require_modname(char *modname, int *total);
> +bool register_ktype_section(char *start, char *stop);
> +bool init_kernel_btf(void);
> +
> +#define QUATE(x) #x
> +#define INIT_MOD_STRUCT_MEMBER_RQD(MOD, S, M, R) \
> + struct ktype_info _##MOD##_##S##_##M = { \
> + QUATE(MOD), QUATE(S), QUATE(M), R, R, 0, 0, 0, 0, -1 \
> + }; \
> + __attribute__((section(".init_ktypes"), used)) \
> + struct ktype_info * _ptr_##MOD##_##S##_##M = &_##MOD##_##S##_##M
> +
> +/*
> + * Required types are checked automatically before the extension runs.
> + * Optional types should be checked manually at extension runtime.
> + */
> +#define INIT_MOD_STRUCT_MEMBER(MOD, S, M) \
> + INIT_MOD_STRUCT_MEMBER_RQD(MOD, S, M, 1)
> +#define INIT_OPT_MOD_STRUCT_MEMBER(MOD, S, M) \
> + INIT_MOD_STRUCT_MEMBER_RQD(MOD, S, M, 0)
> +
> +#define DECLARE_MOD_STRUCT_MEMBER(MOD, S, M) \
> + extern struct ktype_info _##MOD##_##S##_##M
> +
> +#define GET_MOD_STRUCT_MEMBER_MOFF(MOD, S, M) (_##MOD##_##S##_##M.member_bit_offset)
> +#define GET_MOD_STRUCT_MEMBER_MSIZE(MOD, S, M) (_##MOD##_##S##_##M.member_size)
> +#define GET_MOD_STRUCT_MEMBER_SSIZE(MOD, S, M) (_##MOD##_##S##_##M.struct_size)
> +#define MOD_STRUCT_MEMBER_EXIST(MOD, S, M) (_##MOD##_##S##_##M.index >= 0)
> +#define TYPE_EXIST(p) ((p)->index >= 0)
> +
> +#define INIT_KERN_STRUCT_MEMBER(S, M) \
> + INIT_MOD_STRUCT_MEMBER(vmlinux, S, M)
> +#define INIT_OPT_KERN_STRUCT_MEMBER(S, M) \
> + INIT_OPT_MOD_STRUCT_MEMBER(vmlinux, S, M)
> +
> +#define DECLARE_KERN_STRUCT_MEMBER(S, M) \
> + DECLARE_MOD_STRUCT_MEMBER(vmlinux, S, M)
> +
> +#define GET_KERN_STRUCT_MEMBER_MOFF(S, M) GET_MOD_STRUCT_MEMBER_MOFF(vmlinux, S, M)
> +#define GET_KERN_STRUCT_MEMBER_MSIZE(S, M) GET_MOD_STRUCT_MEMBER_MSIZE(vmlinux, S, M)
> +#define GET_KERN_STRUCT_MEMBER_SSIZE(S, M) GET_MOD_STRUCT_MEMBER_SSIZE(vmlinux, S, M)
> +#define KERN_STRUCT_MEMBER_EXIST(S, M) MOD_STRUCT_MEMBER_EXIST(vmlinux, S, M)
> +
> +#define INIT_MOD_STRUCT_RQD(MOD, S, R) \
> + struct ktype_info _##MOD##_##S = { \
> + QUATE(MOD), QUATE(S), 0, R, 0, 0, 0, 0, 0, -1 \
> + }; \
> + __attribute__((section(".init_ktypes"), used)) \
> + struct ktype_info * _ptr_##MOD##_##S = &_##MOD##_##S
> +
> +#define INIT_MOD_STRUCT(MOD, S) INIT_MOD_STRUCT_RQD(MOD, S, 1)
> +#define INIT_OPT_MOD_STRUCT(MOD, S) INIT_MOD_STRUCT_RQD(MOD, S, 0)
> +
> +#define DECLARE_MOD_STRUCT(MOD, S) \
> + extern struct ktype_info _##MOD##_##S;
> +
> +#define GET_MOD_STRUCT_SSIZE(MOD, S) (_##MOD##_##S.struct_size)
> +#define MOD_STRUCT_EXIST(MOD, S) (_##MOD##_##S.index >= 0)
> +
> +#define INIT_KERN_STRUCT(S) INIT_MOD_STRUCT(vmlinux, S)
> +#define INIT_OPT_KERN_STRUCT(S) INIT_OPT_MOD_STRUCT(vmlinux, S)
> +
> +#define DECLARE_KERN_STRUCT(S) \
> + DECLARE_MOD_STRUCT(vmlinux, S)
> +
> +#define GET_KERN_STRUCT_SSIZE(S) GET_MOD_STRUCT_SSIZE(vmlinux, S)
> +#define KERN_STRUCT_EXIST(S) MOD_STRUCT_EXIST(vmlinux, S)
I feel these macros are a bit messy: the "KERN" variants multiply the
number of macros. Do we need both the KERN and MOD variants? How about
unifying them and documenting in the manual that "vmlinux" is the
module name to use for the kernel?
Thanks,
Kazu
> +
> +#endif /* _BTF_INFO_H */
> \ No newline at end of file
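As an illustration of the unification suggested above, a single macro
family could take the module name as an ordinary argument, with the
kernel written as vmlinux. This is only a sketch under stated
assumptions and not part of the posted series: struct demo_ktype_info
is a trimmed, hypothetical copy of struct ktype_info, all DEMO_* names
and the kt_ prefix are made up, and the .init_ktypes section attribute
is left out so the snippet builds without the linker script.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Trimmed, hypothetical copy of the patch's struct ktype_info. */
struct demo_ktype_info {
	const char *modname;		/* "vmlinux" selects the kernel */
	const char *struct_name;
	const char *member_name;
	bool required;
	uint32_t member_bit_offset;	/* filled in by the resolver */
	uint32_t member_size;		/* filled in by the resolver */
	int index;			/* -1 until the type is resolved */
};

#define DEMO_STR(x) #x

/* One macro family: there is no separate KERN variant. */
#define DEMO_INIT_STRUCT_MEMBER(MOD, S, M, R)				\
	struct demo_ktype_info kt_##MOD##_##S##_##M = {			\
		DEMO_STR(MOD), DEMO_STR(S), DEMO_STR(M), R, 0, 0, -1	\
	}

#define DEMO_MEMBER_MOFF(MOD, S, M)  (kt_##MOD##_##S##_##M.member_bit_offset)
#define DEMO_MEMBER_EXIST(MOD, S, M) (kt_##MOD##_##S##_##M.index >= 0)

/* A kernel type and a module type are declared the same way. */
DEMO_INIT_STRUCT_MEMBER(vmlinux, list_head, next, true);
DEMO_INIT_STRUCT_MEMBER(demo_mod, demo_struct, field, false);
```

With this shape, callers would write DEMO_MEMBER_EXIST(vmlinux,
list_head, next) where the series currently offers
KERN_STRUCT_MEMBER_EXIST(list_head, next), at the cost of one extra
argument per use.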
* [PATCH v4][makedumpfile 4/7] Implement kernel module's kallsyms resolving
2026-03-17 15:07 [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering Tao Liu
` (2 preceding siblings ...)
2026-03-17 15:07 ` [PATCH v4][makedumpfile 3/7] Implement kernel btf resolving Tao Liu
@ 2026-03-17 15:07 ` Tao Liu
2026-04-02 23:54 ` Stephen Brennan
2026-03-17 15:07 ` [PATCH v4][makedumpfile 5/7] Implement kernel module's btf resolving Tao Liu
` (4 subsequent siblings)
8 siblings, 1 reply; 21+ messages in thread
From: Tao Liu @ 2026-03-17 15:07 UTC (permalink / raw)
To: yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, stephen.s.brennan, Tao Liu
With the kernel's kallsyms and BTF data available, we can resolve any
kernel type and symbol address. We can therefore walk the kernel's
linked list of modules and parse each struct module to get its
kallsyms data. At this point, the module kallsyms symbols declared in
the .init_ksyms section are resolved.
Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
---
kallsyms.c | 125 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
kallsyms.h | 3 ++
2 files changed, 127 insertions(+), 1 deletion(-)
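The list walk described in the commit message can be sketched in
isolation as follows. This is an illustration only, not the code from
the patch: a plain byte array stands in for dump memory, memcpy()
stands in for readmem(), and the struct layout, the offsets and every
mock_* name are assumptions chosen for the example. Offsets are kept
in bits, as BTF reports them, and converted to bytes at the point of
use.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MOCK_MEM_SIZE 256

/* Hypothetical layout, in bits, as a BTF lookup would report it. */
#define MOCK_LIST_NEXT_BITS	0	/* list_head.next */
#define MOCK_MODULE_LIST_BITS	64	/* module.list */
#define MOCK_MODULE_NAME_BITS	192	/* module.name */
#define MOCK_NAME_LEN		8

/* Fully parenthesized so it is safe inside any larger expression. */
#define BITS_TO_BYTES(bits) ((bits) / 8)

/* Stand-in for dump memory; an "address" is an index into this array. */
static uint8_t mock_mem[MOCK_MEM_SIZE];

static void mock_write_u64(uint64_t addr, uint64_t val)
{
	assert(addr + sizeof(val) <= MOCK_MEM_SIZE);
	memcpy(&mock_mem[addr], &val, sizeof(val));
}

static uint64_t mock_read_u64(uint64_t addr)
{
	uint64_t val;

	assert(addr + sizeof(val) <= MOCK_MEM_SIZE);
	memcpy(&val, &mock_mem[addr], sizeof(val));
	return val;
}

static uint64_t mock_next_list(uint64_t list)
{
	return mock_read_u64(list + BITS_TO_BYTES(MOCK_LIST_NEXT_BITS));
}

/* Build: list head at 16, module "ext4" at 64, module "xfs" at 128. */
static void mock_build(void)
{
	const uint64_t head = 16, mod_a = 64, mod_b = 128;
	const uint64_t list_off = BITS_TO_BYTES(MOCK_MODULE_LIST_BITS);
	const uint64_t name_off = BITS_TO_BYTES(MOCK_MODULE_NAME_BITS);

	memset(mock_mem, 0, sizeof(mock_mem));
	mock_write_u64(head, mod_a + list_off);
	mock_write_u64(mod_a + list_off, mod_b + list_off);
	mock_write_u64(mod_b + list_off, head);	/* circular: back to head */
	memcpy(&mock_mem[mod_a + name_off], "ext4", 5);
	memcpy(&mock_mem[mod_b + name_off], "xfs", 4);
}

/*
 * Walk the circular list starting at head. Returns the number of
 * modules visited and stores the address of the module named "want"
 * in *mod_addr, or 0 if no module has that name.
 */
static int mock_find_module(uint64_t head, const char *want, uint64_t *mod_addr)
{
	uint64_t list;
	int count = 0;

	*mod_addr = 0;
	for (list = mock_next_list(head); list != head;
	     list = mock_next_list(list)) {
		/* container_of(): step back from the embedded list_head */
		uint64_t mod = list - BITS_TO_BYTES(MOCK_MODULE_LIST_BITS);
		char name[MOCK_NAME_LEN];

		memcpy(name, &mock_mem[mod + BITS_TO_BYTES(MOCK_MODULE_NAME_BITS)],
		       sizeof(name));
		name[sizeof(name) - 1] = '\0';
		if (strcmp(name, want) == 0)
			*mod_addr = mod;
		count++;
	}
	return count;
}
```

After mock_build(), calling mock_find_module(16, "xfs", &addr) walks
both entries and leaves the address of the second mock module in addr.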
diff --git a/kallsyms.c b/kallsyms.c
index f7737cb..f07b0ee 100644
--- a/kallsyms.c
+++ b/kallsyms.c
@@ -3,6 +3,7 @@
#include <string.h>
#include "makedumpfile.h"
#include "kallsyms.h"
+#include "btf_info.h"
static uint32_t *kallsyms_offsets = NULL;
static uint16_t *kallsyms_token_index = NULL;
@@ -347,4 +348,126 @@ out:
kallsyms_names = NULL;
}
return ret;
-}
\ No newline at end of file
+}
+
+INIT_KERN_SYM(modules);
+
+INIT_KERN_STRUCT_MEMBER(list_head, next);
+INIT_KERN_STRUCT_MEMBER(module, list);
+INIT_KERN_STRUCT_MEMBER(module, name);
+INIT_KERN_STRUCT_MEMBER(module, core_kallsyms);
+INIT_KERN_STRUCT_MEMBER(mod_kallsyms, symtab);
+INIT_KERN_STRUCT_MEMBER(mod_kallsyms, num_symtab);
+INIT_KERN_STRUCT_MEMBER(mod_kallsyms, strtab);
+INIT_KERN_STRUCT_MEMBER(elf64_sym, st_name);
+INIT_KERN_STRUCT_MEMBER(elf64_sym, st_value);
+
+#define MEMBER_OFF(S, M) \
+ (GET_KERN_STRUCT_MEMBER_MOFF(S, M) / 8)
+
+uint64_t next_list(uint64_t list)
+{
+ uint64_t next = 0;
+
+ readmem(VADDR, list + MEMBER_OFF(list_head, next),
+ &next, GET_KERN_STRUCT_MEMBER_MSIZE(list_head, next));
+ return next;
+}
+
+bool init_module_kallsyms(void)
+{
+ uint64_t modules, list, value = 0, symtab = 0, strtab = 0;
+ uint32_t st_name = 0;
+ int num_symtab, i, j;
+ struct ksym_info **p;
+ char symname[512], ch;
+ char *modname = NULL;
+ bool ret = false;
+
+ modules = GET_KERN_SYM(modules);
+ if (!KERN_SYM_EXIST(modules)) {
+ /* Not a failure if no module enabled */
+ ret = true;
+ goto out;
+ }
+
+ if (!KERN_STRUCT_MEMBER_EXIST(list_head, next) ||
+ !KERN_STRUCT_MEMBER_EXIST(module, list) ||
+ !KERN_STRUCT_MEMBER_EXIST(module, name) ||
+ !KERN_STRUCT_MEMBER_EXIST(module, core_kallsyms) ||
+ !KERN_STRUCT_MEMBER_EXIST(mod_kallsyms, symtab) ||
+ !KERN_STRUCT_MEMBER_EXIST(mod_kallsyms, num_symtab) ||
+ !KERN_STRUCT_MEMBER_EXIST(mod_kallsyms, strtab) ||
+ !KERN_STRUCT_MEMBER_EXIST(elf64_sym, st_name) ||
+ !KERN_STRUCT_MEMBER_EXIST(elf64_sym, st_value)) {
+ /* Fail when module enabled but any required types not found */
+ fprintf(stderr, "%s: Missing required module syms/types!\n", __func__);
+ goto out;
+ }
+
+ modname = (char *)malloc(GET_KERN_STRUCT_MEMBER_MSIZE(module, name));
+ if (!modname)
+ goto no_mem;
+
+ for (list = next_list(modules); list != modules; list = next_list(list)) {
+ readmem(VADDR, list - MEMBER_OFF(module, list) +
+ MEMBER_OFF(module, name),
+ modname, GET_KERN_STRUCT_MEMBER_MSIZE(module, name));
+ if (!check_ksyms_require_modname(modname, NULL))
+ continue;
+ readmem(VADDR, list - MEMBER_OFF(module, list) +
+ MEMBER_OFF(module, core_kallsyms) +
+ MEMBER_OFF(mod_kallsyms, num_symtab),
+ &num_symtab, GET_KERN_STRUCT_MEMBER_MSIZE(mod_kallsyms, num_symtab));
+ readmem(VADDR, list - MEMBER_OFF(module, list) +
+ MEMBER_OFF(module, core_kallsyms) +
+ MEMBER_OFF(mod_kallsyms, symtab),
+ &symtab, GET_KERN_STRUCT_MEMBER_MSIZE(mod_kallsyms, symtab));
+ readmem(VADDR, list - MEMBER_OFF(module, list) +
+ MEMBER_OFF(module, core_kallsyms) +
+ MEMBER_OFF(mod_kallsyms, strtab),
+ &strtab, GET_KERN_STRUCT_MEMBER_MSIZE(mod_kallsyms, strtab));
+ for (i = 0; i < num_symtab; i++) {
+ j = 0;
+ readmem(VADDR, symtab + i * GET_KERN_STRUCT_MEMBER_SSIZE(elf64_sym, st_value) +
+ MEMBER_OFF(elf64_sym, st_value),
+ &value, GET_KERN_STRUCT_MEMBER_MSIZE(elf64_sym, st_value));
+ readmem(VADDR, symtab + i * GET_KERN_STRUCT_MEMBER_SSIZE(elf64_sym, st_name) +
+ MEMBER_OFF(elf64_sym, st_name),
+ &st_name, GET_KERN_STRUCT_MEMBER_MSIZE(elf64_sym, st_name));
+ do {
+ readmem(VADDR, strtab + st_name + j++, &ch, 1);
+ } while (ch != '\0');
+ if (j == 1 || j > sizeof(symname))
+ /* Skip empty or too long string */
+ continue;
+ readmem(VADDR, strtab + st_name, symname, j);
+
+ for (j = 0; j < sr_len; j++) {
+ for (p = (struct ksym_info **)(sr[j]->start);
+ p < (struct ksym_info **)(sr[j]->stop);
+ p++) {
+ if (!strcmp((*p)->modname, modname) &&
+ !strcmp((*p)->symname, symname)) {
+ (*p)->value = value;
+ (*p)->index = i;
+ }
+ }
+ }
+ }
+ }
+ ret = true;
+ goto out;
+no_mem:
+ fprintf(stderr, "%s: Not enough memory!\n", __func__);
+out:
+ if (modname)
+ free(modname);
+ return ret;
+}
+
+void cleanup_kallsyms(void)
+{
+ cleanup_ksyms_section_range();
+ cleanup_ksyms_modname();
+}
diff --git a/kallsyms.h b/kallsyms.h
index 3791284..897bcdd 100644
--- a/kallsyms.h
+++ b/kallsyms.h
@@ -88,4 +88,7 @@ bool check_ksyms_require_modname(char *modname, int *total);
bool register_ksym_section(char *start, char *stop);
bool read_vmcoreinfo_kallsyms(void);
bool init_kernel_kallsyms(void);
+uint64_t next_list(uint64_t list);
+bool init_module_kallsyms(void);
+void cleanup_kallsyms(void);
#endif /* _KALLSYMS_H */
\ No newline at end of file
--
2.47.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v4][makedumpfile 4/7] Implement kernel module's kallsyms resolving
2026-03-17 15:07 ` [PATCH v4][makedumpfile 4/7] Implement kernel module's kallsyms resolving Tao Liu
@ 2026-04-02 23:54 ` Stephen Brennan
0 siblings, 0 replies; 21+ messages in thread
From: Stephen Brennan @ 2026-04-02 23:54 UTC (permalink / raw)
To: Tao Liu, yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, Tao Liu
Tao Liu <ltao@redhat.com> writes:
> With kernel's kallsyms and btf ready, we can get any kernel types and
> symbol addresses. So we can iterate kernel modules' linked list, and
> parse each one of kernel module's structure to get its kallsyms data.
> At this time, kernel modules' kallsyms symbol defined within .init_ksyms
> section will be resolved.
>
> Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> ---
> kallsyms.c | 125 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
> kallsyms.h | 3 ++
> 2 files changed, 127 insertions(+), 1 deletion(-)
>
> diff --git a/kallsyms.c b/kallsyms.c
> index f7737cb..f07b0ee 100644
> --- a/kallsyms.c
> +++ b/kallsyms.c
> @@ -3,6 +3,7 @@
> #include <string.h>
> #include "makedumpfile.h"
> #include "kallsyms.h"
> +#include "btf_info.h"
>
> static uint32_t *kallsyms_offsets = NULL;
> static uint16_t *kallsyms_token_index = NULL;
> @@ -347,4 +348,126 @@ out:
> kallsyms_names = NULL;
> }
> return ret;
> -}
> \ No newline at end of file
> +}
> +
> +INIT_KERN_SYM(modules);
> +
> +INIT_KERN_STRUCT_MEMBER(list_head, next);
> +INIT_KERN_STRUCT_MEMBER(module, list);
> +INIT_KERN_STRUCT_MEMBER(module, name);
> +INIT_KERN_STRUCT_MEMBER(module, core_kallsyms);
> +INIT_KERN_STRUCT_MEMBER(mod_kallsyms, symtab);
> +INIT_KERN_STRUCT_MEMBER(mod_kallsyms, num_symtab);
> +INIT_KERN_STRUCT_MEMBER(mod_kallsyms, strtab);
> +INIT_KERN_STRUCT_MEMBER(elf64_sym, st_name);
> +INIT_KERN_STRUCT_MEMBER(elf64_sym, st_value);
> +
> +#define MEMBER_OFF(S, M) \
> + GET_KERN_STRUCT_MEMBER_MOFF(S, M) / 8
> +
> +uint64_t next_list(uint64_t list)
> +{
> + uint64_t next = 0;
> +
> + readmem(VADDR, list + MEMBER_OFF(list_head, next),
> + &next, GET_KERN_STRUCT_MEMBER_MSIZE(list_head, next));
> + return next;
> +}
> +
> +bool init_module_kallsyms(void)
> +{
> + uint64_t modules, list, value = 0, symtab = 0, strtab = 0;
> + uint32_t st_name = 0;
> + int num_symtab, i, j;
> + struct ksym_info **p;
> + char symname[512], ch;
> + char *modname = NULL;
> + bool ret = false;
> +
> + modules = GET_KERN_SYM(modules);
> + if (!KERN_SYM_EXIST(modules)) {
> + /* Not a failure if no module enabled */
> + ret = true;
> + goto out;
> + }
> +
> + if (!KERN_STRUCT_MEMBER_EXIST(list_head, next) ||
> + !KERN_STRUCT_MEMBER_EXIST(module, list) ||
> + !KERN_STRUCT_MEMBER_EXIST(module, name) ||
> + !KERN_STRUCT_MEMBER_EXIST(module, core_kallsyms) ||
> + !KERN_STRUCT_MEMBER_EXIST(mod_kallsyms, symtab) ||
> + !KERN_STRUCT_MEMBER_EXIST(mod_kallsyms, num_symtab) ||
> + !KERN_STRUCT_MEMBER_EXIST(mod_kallsyms, strtab) ||
> + !KERN_STRUCT_MEMBER_EXIST(elf64_sym, st_name) ||
> + !KERN_STRUCT_MEMBER_EXIST(elf64_sym, st_value)) {
> + /* Fail when module enabled but any required types not found */
> + fprintf(stderr, "%s: Missing required module syms/types!", __func__);
> + goto out;
> + }
> +
> + modname = (char *)malloc(GET_KERN_STRUCT_MEMBER_MSIZE(module, name));
> + if (!modname)
> + goto no_mem;
> +
> + for (list = next_list(modules); list != modules; list = next_list(list)) {
> + readmem(VADDR, list - MEMBER_OFF(module, list) +
> + MEMBER_OFF(module, name),
> + modname, GET_KERN_STRUCT_MEMBER_MSIZE(module, name));
> + if (!check_ksyms_require_modname(modname, NULL))
> + continue;
> + readmem(VADDR, list - MEMBER_OFF(module, list) +
> + MEMBER_OFF(module, core_kallsyms) +
> + MEMBER_OFF(mod_kallsyms, num_symtab),
> + &num_symtab, GET_KERN_STRUCT_MEMBER_MSIZE(mod_kallsyms, num_symtab));
> + readmem(VADDR, list - MEMBER_OFF(module, list) +
> + MEMBER_OFF(module, core_kallsyms) +
> + MEMBER_OFF(mod_kallsyms, symtab),
> + &symtab, GET_KERN_STRUCT_MEMBER_MSIZE(mod_kallsyms, symtab));
> + readmem(VADDR, list - MEMBER_OFF(module, list) +
> + MEMBER_OFF(module, core_kallsyms) +
> + MEMBER_OFF(mod_kallsyms, strtab),
> + &strtab, GET_KERN_STRUCT_MEMBER_MSIZE(mod_kallsyms, strtab));
> + for (i = 0; i < num_symtab; i++) {
> + j = 0;
> + readmem(VADDR, symtab + i * GET_KERN_STRUCT_MEMBER_SSIZE(elf64_sym, st_value) +
> + MEMBER_OFF(elf64_sym, st_value),
> + &value, GET_KERN_STRUCT_MEMBER_MSIZE(elf64_sym, st_value));
> + readmem(VADDR, symtab + i * GET_KERN_STRUCT_MEMBER_SSIZE(elf64_sym, st_name) +
> + MEMBER_OFF(elf64_sym, st_name),
> + &st_name, GET_KERN_STRUCT_MEMBER_MSIZE(elf64_sym, st_name));
> + do {
> + readmem(VADDR, strtab + st_name + j++, &ch, 1);
> + } while (ch != '\0');
> + if (j == 1 || j > sizeof(symname))
> + /* Skip empty or too long string */
> + continue;
> + readmem(VADDR, strtab + st_name, symname, j);
> +
> + for (j = 0; j < sr_len; j++) {
> + for (p = (struct ksym_info **)(sr[j]->start);
> + p < (struct ksym_info **)(sr[j]->stop);
> + p++) {
> + if (!strcmp((*p)->modname, modname) &&
> + !strcmp((*p)->symname, symname)) {
> + (*p)->value = value;
> + (*p)->index = i;
> + }
> + }
> + }
> + }
> + }
> + ret = true;
> + goto out;
> +no_mem:
> + fprintf(stderr, "%s: Not enough memory!\n", __func__);
> +out:
> + if (modname)
> + free(modname);
> + return ret;
> +}
> +
> +void cleanup_kallsyms(void)
> +{
> + cleanup_ksyms_section_range();
> + cleanup_ksyms_modname();
> +}
> diff --git a/kallsyms.h b/kallsyms.h
> index 3791284..897bcdd 100644
> --- a/kallsyms.h
> +++ b/kallsyms.h
> @@ -88,4 +88,7 @@ bool check_ksyms_require_modname(char *modname, int *total);
> bool register_ksym_section(char *start, char *stop);
> bool read_vmcoreinfo_kallsyms(void);
> bool init_kernel_kallsyms(void);
> +uint64_t next_list(uint64_t list);
> +bool init_module_kallsyms(void);
> +void cleanup_kallsyms(void);
> #endif /* _KALLSYMS_H */
> \ No newline at end of file
> --
> 2.47.0
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v4][makedumpfile 5/7] Implement kernel module's btf resolving
2026-03-17 15:07 [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering Tao Liu
` (3 preceding siblings ...)
2026-03-17 15:07 ` [PATCH v4][makedumpfile 4/7] Implement kernel module's kallsyms resolving Tao Liu
@ 2026-03-17 15:07 ` Tao Liu
2026-04-02 23:56 ` Stephen Brennan
2026-03-17 15:07 ` [PATCH v4][makedumpfile 6/7] Add makedumpfile extensions support Tao Liu
` (3 subsequent siblings)
8 siblings, 1 reply; 21+ messages in thread
From: Tao Liu @ 2026-03-17 15:07 UTC (permalink / raw)
To: yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, stephen.s.brennan, Tao Liu
Same as the previous patch, with kernel's kallsyms and btf ready,
we can locate and iterate all kernel modules' btf data. So kernel
modules' types specified within .init_ktypes section will be resolved.
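
The list walk relies on the usual kernel pattern: the list node is embedded
in its owning structure, so the owner's address is the node's address minus
the offset of the node member. The following is only a minimal sketch in
ordinary process memory. The names (demo_list_head, demo_module, demo_owner,
demo_count_named) are hypothetical, and offsetof() stands in for the
BTF-resolved offsets and readmem() calls that the patch uses on the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's list_head and module structs. */
struct demo_list_head {
	struct demo_list_head *next;
};

struct demo_module {
	int id;
	struct demo_list_head list;	/* embedded list node */
	char name[16];
};

/* Recover the owning struct from the address of its embedded list node. */
static struct demo_module *demo_owner(struct demo_list_head *node)
{
	return (struct demo_module *)((char *)node -
				      offsetof(struct demo_module, list));
}

/*
 * Walk a circular list whose head is a bare list node and return how
 * many entries carry the given name. The walk stops when it gets back
 * to the head, the same termination test the patch uses.
 */
static int demo_count_named(struct demo_list_head *head, const char *name)
{
	int hits = 0;
	struct demo_list_head *node;

	for (node = head->next; node != head; node = node->next) {
		if (strcmp(demo_owner(node)->name, name) == 0)
			hits++;
	}
	return hits;
}
```

In the patch the same subtraction appears as
"list - MEMBER_OFF(btf_module, list) + MEMBER_OFF(btf_module, btf)", with
every dereference replaced by a readmem() on the dump's virtual address.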
Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
---
btf_info.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
btf_info.h | 2 +
2 files changed, 115 insertions(+), 1 deletion(-)
diff --git a/btf_info.c b/btf_info.c
index 1cb66e2..7682a82 100644
--- a/btf_info.c
+++ b/btf_info.c
@@ -230,4 +230,116 @@ out:
if (buf)
free(buf);
return ret;
-}
\ No newline at end of file
+}
+
+INIT_KERN_SYM(btf_modules);
+
+INIT_KERN_STRUCT_MEMBER(btf_module, list);
+INIT_KERN_STRUCT_MEMBER(btf_module, btf);
+INIT_KERN_STRUCT_MEMBER(btf_module, module);
+DECLARE_KERN_STRUCT_MEMBER(module, name);
+INIT_KERN_STRUCT_MEMBER(btf, data);
+INIT_KERN_STRUCT_MEMBER(btf, data_size);
+
+#define MEMBER_OFF(S, M) \
+ (GET_KERN_STRUCT_MEMBER_MOFF(S, M) / 8)
+
+bool init_module_btf(void)
+{
+ struct btf *btf_mod;
+ uint64_t btf_modules, list;
+ uint64_t btf = 0, data = 0, module = 0;
+ int data_size = 0;
+ bool ret = false;
+ char *btf_buf = NULL;
+ char *modname = NULL;
+ struct ktype_info **p;
+
+ btf_modules = GET_KERN_SYM(btf_modules);
+ if (!KERN_SYM_EXIST(btf_modules))
+ /* Maybe module is not enabled, this is not an error */
+ return true;
+
+ if (!KERN_STRUCT_MEMBER_EXIST(btf_module, list) ||
+ !KERN_STRUCT_MEMBER_EXIST(btf_module, btf) ||
+ !KERN_STRUCT_MEMBER_EXIST(btf_module, module) ||
+ !KERN_STRUCT_MEMBER_EXIST(btf, data) ||
+ !KERN_STRUCT_MEMBER_EXIST(btf, data_size)) {
+ /* Fail when module enabled but any required types not found */
+ fprintf(stderr, "%s: Missing required btf syms/types!\n", __func__);
+ goto out;
+ }
+
+ modname = (char *)malloc(GET_KERN_STRUCT_MEMBER_MSIZE(module, name));
+ if (!modname)
+ goto no_mem;
+
+ for (list = next_list(btf_modules); list != btf_modules; list = next_list(list)) {
+ readmem(VADDR, list - MEMBER_OFF(btf_module, list) +
+ MEMBER_OFF(btf_module, btf),
+ &btf, GET_KERN_STRUCT_MEMBER_MSIZE(btf_module, btf));
+ readmem(VADDR, list - MEMBER_OFF(btf_module, list) +
+ MEMBER_OFF(btf_module, module),
+ &module, GET_KERN_STRUCT_MEMBER_MSIZE(btf_module, module));
+ readmem(VADDR, module + MEMBER_OFF(module, name),
+ modname, GET_KERN_STRUCT_MEMBER_MSIZE(module, name));
+ if (!check_ktypes_require_modname(modname, NULL)) {
+ continue;
+ }
+ readmem(VADDR, btf + MEMBER_OFF(btf, data),
+ &data, GET_KERN_STRUCT_MEMBER_MSIZE(btf, data));
+ readmem(VADDR, btf + MEMBER_OFF(btf, data_size),
+ &data_size, GET_KERN_STRUCT_MEMBER_MSIZE(btf, data_size));
+ btf_buf = (char *)malloc(data_size);
+ if (!btf_buf)
+ goto no_mem;
+ readmem(VADDR, data, btf_buf, data_size);
+ btf_mod = btf__new_split(btf_buf, data_size, btf_arr[0]->btf);
+ free(btf_buf);
+ if (libbpf_get_error(btf_mod) != 0 ||
+ add_to_btf_arr(btf_mod, strdup(modname)) == false) {
+ fprintf(stderr, "%s: init %s btf fail\n", __func__, modname);
+ goto out;
+ }
+ }
+
+ /* OK, we have loaded all needed modules' btf, now resolve the types */
+ for (int i = 0; i < sr_len; i++) {
+ for (p = (struct ktype_info **)(sr[i]->start);
+ p < (struct ktype_info **)(sr[i]->stop);
+ p++)
+ get_ktype_info(*p, NULL);
+ }
+
+ ret = true;
+ goto out;
+
+no_mem:
+ fprintf(stderr, "%s: Not enough memory!\n", __func__);
+out:
+ if (modname)
+ free(modname);
+ return ret;
+}
+
+static void cleanup_btf_arr(void)
+{
+ for (int i = 0; i < btf_arr_len; i++) {
+ free(btf_arr[i]->module);
+ btf__free(btf_arr[i]->btf);
+ free(btf_arr[i]);
+ }
+ if (btf_arr) {
+ free(btf_arr);
+ btf_arr = NULL;
+ }
+ btf_arr_len = 0;
+ btf_arr_cap = 0;
+}
+
+void cleanup_btf(void)
+{
+ cleanup_btf_arr();
+ cleanup_ktypes_section_range();
+ cleanup_ktypes_modname();
+}
diff --git a/btf_info.h b/btf_info.h
index 2cf6b07..b7f6810 100644
--- a/btf_info.h
+++ b/btf_info.h
@@ -22,6 +22,8 @@ struct ktype_info {
bool check_ktypes_require_modname(char *modname, int *total);
bool register_ktype_section(char *start, char *stop);
bool init_kernel_btf(void);
+bool init_module_btf(void);
+void cleanup_btf(void);
#define QUATE(x) #x
#define INIT_MOD_STRUCT_MEMBER_RQD(MOD, S, M, R) \
--
2.47.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v4][makedumpfile 5/7] Implement kernel module's btf resolving
2026-03-17 15:07 ` [PATCH v4][makedumpfile 5/7] Implement kernel module's btf resolving Tao Liu
@ 2026-04-02 23:56 ` Stephen Brennan
0 siblings, 0 replies; 21+ messages in thread
From: Stephen Brennan @ 2026-04-02 23:56 UTC (permalink / raw)
To: Tao Liu, yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, Tao Liu
Tao Liu <ltao@redhat.com> writes:
> Same as the previous patch, with kernel's kallsyms and btf ready,
> we can locate and iterate all kernel modules' btf data. So kernel
> modules' types specified within .init_ksyms section will be resolved.
>
> Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> Signed-off-by: Tao Liu <ltao@redhat.com>
Reviewed-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> ---
> btf_info.c | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
> btf_info.h | 2 +
> 2 files changed, 115 insertions(+), 1 deletion(-)
>
> diff --git a/btf_info.c b/btf_info.c
> index 1cb66e2..7682a82 100644
> --- a/btf_info.c
> +++ b/btf_info.c
> @@ -230,4 +230,116 @@ out:
> if (buf)
> free(buf);
> return ret;
> -}
> \ No newline at end of file
> +}
> +
> +INIT_KERN_SYM(btf_modules);
> +
> +INIT_KERN_STRUCT_MEMBER(btf_module, list);
> +INIT_KERN_STRUCT_MEMBER(btf_module, btf);
> +INIT_KERN_STRUCT_MEMBER(btf_module, module);
> +DECLARE_KERN_STRUCT_MEMBER(module, name);
> +INIT_KERN_STRUCT_MEMBER(btf, data);
> +INIT_KERN_STRUCT_MEMBER(btf, data_size);
> +
> +#define MEMBER_OFF(S, M) \
> + GET_KERN_STRUCT_MEMBER_MOFF(S, M) / 8
> +
> +bool init_module_btf(void)
> +{
> + struct btf *btf_mod;
> + uint64_t btf_modules, list;
> + uint64_t btf = 0, data = 0, module = 0;
> + int data_size = 0;
> + bool ret = false;
> + char *btf_buf = NULL;
> + char *modname = NULL;
> + struct ktype_info **p;
> +
> + btf_modules = GET_KERN_SYM(btf_modules);
> + if (!KERN_SYM_EXIST(btf_modules))
> + /* Maybe module is not enabled, this is not an error */
> + return true;
> +
> + if (!KERN_STRUCT_MEMBER_EXIST(btf_module, list) ||
> + !KERN_STRUCT_MEMBER_EXIST(btf_module, btf) ||
> + !KERN_STRUCT_MEMBER_EXIST(btf_module, module) ||
> + !KERN_STRUCT_MEMBER_EXIST(btf, data) ||
> + !KERN_STRUCT_MEMBER_EXIST(btf, data_size)) {
> + /* Fail when module enabled but any required types not found */
> + fprintf(stderr, "%s: Missing required btf syms/types!", __func__);
> + goto out;
> + }
> +
> + modname = (char *)malloc(GET_KERN_STRUCT_MEMBER_MSIZE(module, name));
> + if (!modname)
> + goto no_mem;
> +
> + for (list = next_list(btf_modules); list != btf_modules; list = next_list(list)) {
> + readmem(VADDR, list - MEMBER_OFF(btf_module, list) +
> + MEMBER_OFF(btf_module, btf),
> + &btf, GET_KERN_STRUCT_MEMBER_MSIZE(btf_module, btf));
> + readmem(VADDR, list - MEMBER_OFF(btf_module, list) +
> + MEMBER_OFF(btf_module, module),
> + &module, GET_KERN_STRUCT_MEMBER_MSIZE(btf_module, module));
> + readmem(VADDR, module + MEMBER_OFF(module, name),
> + modname, GET_KERN_STRUCT_MEMBER_MSIZE(module, name));
> + if (!check_ktypes_require_modname(modname, NULL)) {
> + continue;
> + }
> + readmem(VADDR, btf + MEMBER_OFF(btf, data),
> + &data, GET_KERN_STRUCT_MEMBER_MSIZE(btf, data));
> + readmem(VADDR, btf + MEMBER_OFF(btf, data_size),
> + &data_size, GET_KERN_STRUCT_MEMBER_MSIZE(btf, data_size));
> + btf_buf = (char *)malloc(data_size);
> + if (!btf_buf)
> + goto no_mem;
> + readmem(VADDR, data, btf_buf, data_size);
> + btf_mod = btf__new_split(btf_buf, data_size, btf_arr[0]->btf);
> + free(btf_buf);
> + if (libbpf_get_error(btf_mod) != 0 ||
> + add_to_btf_arr(btf_mod, strdup(modname)) == false) {
> + fprintf(stderr, "%s: init %s btf fail\n", __func__, modname);
> + goto out;
> + }
> + }
> +
> + /* OK, we have loaded all needed modules's btf, now resolve the types */
> + for (int i = 0; i < sr_len; i++) {
> + for (p = (struct ktype_info **)(sr[i]->start);
> + p < (struct ktype_info **)(sr[i]->stop);
> + p++)
> + get_ktype_info(*p, NULL);
> + }
> +
> + ret = true;
> + goto out;
> +
> +no_mem:
> + fprintf(stderr, "%s: Not enough memory!\n", __func__);
> +out:
> + if (modname)
> + free(modname);
> + return ret;
> +}
> +
> +static void cleanup_btf_arr(void)
> +{
> + for (int i = 0; i < btf_arr_len; i++) {
> + free(btf_arr[i]->module);
> + btf__free(btf_arr[i]->btf);
> + free(btf_arr[i]);
> + }
> + if (btf_arr) {
> + free(btf_arr);
> + btf_arr = NULL;
> + }
> + btf_arr_len = 0;
> + btf_arr_cap = 0;
> +}
> +
> +void cleanup_btf(void)
> +{
> + cleanup_btf_arr();
> + cleanup_ktypes_section_range();
> + cleanup_ktypes_modname();
> +}
> diff --git a/btf_info.h b/btf_info.h
> index 2cf6b07..b7f6810 100644
> --- a/btf_info.h
> +++ b/btf_info.h
> @@ -22,6 +22,8 @@ struct ktype_info {
> bool check_ktypes_require_modname(char *modname, int *total);
> bool register_ktype_section(char *start, char *stop);
> bool init_kernel_btf(void);
> +bool init_module_btf(void);
> +void cleanup_btf(void);
>
> #define QUATE(x) #x
> #define INIT_MOD_STRUCT_MEMBER_RQD(MOD, S, M, R) \
> --
> 2.47.0
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v4][makedumpfile 6/7] Add makedumpfile extensions support
2026-03-17 15:07 [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering Tao Liu
` (4 preceding siblings ...)
2026-03-17 15:07 ` [PATCH v4][makedumpfile 5/7] Implement kernel module's btf resolving Tao Liu
@ 2026-03-17 15:07 ` Tao Liu
2026-04-03 0:11 ` Stephen Brennan
2026-04-03 8:14 ` HAGIO KAZUHITO(萩尾 一仁)
2026-03-17 15:07 ` [PATCH v4][makedumpfile 7/7] Filter amdgpu mm pages Tao Liu
` (2 subsequent siblings)
8 siblings, 2 replies; 21+ messages in thread
From: Tao Liu @ 2026-03-17 15:07 UTC (permalink / raw)
To: yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, stephen.s.brennan, Tao Liu
Extensions are specified with the makedumpfile command-line option
"--extension", followed by the extension's filename or absolute path. If
only a filename is given, "./extensions" and "/usr/lib64/makedumpfile/extensions/"
will be searched.
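
The lookup rule can be sketched as below. This is an illustrative
assumption, not the patch's code: demo_dirs and demo_candidate_path() are
hypothetical names, and the directory order simply mirrors the dirs[] array
in extension.c. An absolute path yields exactly one candidate; a bare
filename yields one candidate per search directory.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Search directories, in the order used by dirs[] in extension.c. */
static const char *demo_dirs[] = {
	"/usr/lib64/makedumpfile/extensions/",
	"./extensions/",
};

/*
 * Write the idx-th candidate path for option `opt` into `out`.
 * Returns 1 on success, 0 if there is no idx-th candidate or the
 * path does not fit into `out`.
 */
static int demo_candidate_path(const char *opt, size_t idx,
			       char *out, size_t outlen)
{
	int n;

	if (opt[0] == '/') {
		/* Absolute path: used as-is, no directory search. */
		if (idx != 0)
			return 0;
		n = snprintf(out, outlen, "%s", opt);
	} else {
		/* Bare filename: prefix it with the idx-th directory. */
		if (idx >= sizeof(demo_dirs) / sizeof(demo_dirs[0]))
			return 0;
		n = snprintf(out, outlen, "%s%s", demo_dirs[idx], opt);
	}
	return n >= 0 && (size_t)n < outlen;
}
```

A caller would try idx = 0, 1, ... and hand the first existing candidate to
dlopen(), which is roughly what load_extensions() does with access() and
dlopen().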
The procedure for extensions is as follows:
Step 0: Each extension declares, at programming time, which kernel symbols/types
it needs. This info is stored within the .init_ksyms/.init_ktypes sections.
Each extension also provides a callback function for makedumpfile to call.
Step 1: Register the .init_ksyms and .init_ktypes sections of makedumpfile
itself and of the extensions' .so files, which tells the kallsyms/btf subcomponents
which kernel symbols/types need to be resolved. The callbacks are also registered.
Step 2: Init the kernel's/modules' btf/kallsyms on demand. Any unneeded kernel
modules will be skipped.
Step 3: During btf/kallsyms parsing, the needed info is filled in. Syms/types
defined via the INIT_OPT(...) macro are optional: parsing won't fail if any
are missing; instead, they need to be checked within extension_init() of each
extension. Syms/types defined via the INIT_(...) macro are must-have: if any
are missing, the extension fails at this step and as a result
is skipped.
After this step, the required kernel symbol values and kernel type sizes/offsets
are resolved, and the extensions are ready to go.
Step 4: When makedumpfile does page filtering, in addition to its
original filtering mechanism, it calls the extensions' callbacks for advice on
whether the page should be included/excluded.
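
How the verdicts of several extensions combine in Step 4 can be sketched as
follows. The enum values and the callback signature are taken from this
patch; demo_combine() and the two demo callbacks are hypothetical and only
illustrate the precedence rule of run_extension_callback(): the first
PG_INCLUDE wins immediately, otherwise any PG_EXCLUDE is remembered, and
PG_UNDECID is returned when no extension has an opinion.

```c
#include <assert.h>
#include <stddef.h>

enum {
	PG_INCLUDE,	/* Extension will keep the page */
	PG_EXCLUDE,	/* Extension will discard the page */
	PG_UNDECID,	/* Extension makes no decision */
};

typedef int (*demo_callback_fn)(unsigned long pfn, const void *pcache);

/* Combine the verdicts of all registered callbacks for one page. */
static int demo_combine(demo_callback_fn *cbs, size_t n,
			unsigned long pfn, const void *pcache)
{
	int verdict = PG_UNDECID;

	for (size_t i = 0; i < n; i++) {
		int r;

		if (!cbs[i])	/* extension without a callback */
			continue;
		r = cbs[i](pfn, pcache);
		if (r == PG_INCLUDE)
			return PG_INCLUDE;	/* keeping a page always wins */
		if (r == PG_EXCLUDE)
			verdict = PG_EXCLUDE;
	}
	return verdict;
}

/* Hypothetical callback: wants to drop every even pfn. */
static int demo_exclude_even(unsigned long pfn, const void *pcache)
{
	(void)pcache;
	return (pfn % 2 == 0) ? PG_EXCLUDE : PG_UNDECID;
}

/* Hypothetical callback: insists on keeping pfns below 4. */
static int demo_keep_low(unsigned long pfn, const void *pcache)
{
	(void)pcache;
	return (pfn < 4) ? PG_INCLUDE : PG_UNDECID;
}
```

Letting PG_INCLUDE override PG_EXCLUDE is the conservative choice: when two
extensions disagree, the page stays in the dump.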
Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Tao Liu <ltao@redhat.com>
---
Makefile | 7 +-
extension.c | 300 ++++++++++++++++++++++++++++++++++++++++++++
extension.h | 12 ++
extensions/Makefile | 10 ++
makedumpfile.c | 38 +++++-
makedumpfile.h | 2 +
6 files changed, 363 insertions(+), 6 deletions(-)
create mode 100644 extension.c
create mode 100644 extension.h
create mode 100644 extensions/Makefile
diff --git a/Makefile b/Makefile
index 320677d..1bb67d9 100644
--- a/Makefile
+++ b/Makefile
@@ -45,7 +45,7 @@ CFLAGS_ARCH += -m32
endif
SRC_BASE = makedumpfile.c makedumpfile.h diskdump_mod.h sadump_mod.h sadump_info.h
-SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c btf_info.c
+SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c btf_info.c extension.c
OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c arch/mips64.c arch/loongarch64.c arch/riscv64.c
OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
@@ -126,6 +126,7 @@ eppic_makedumpfile.so: extension_eppic.c
clean:
rm -f $(OBJ) $(OBJ_PART) $(OBJ_ARCH) makedumpfile makedumpfile.8 makedumpfile.conf.5
+ $(MAKE) -C extensions clean
install:
install -m 755 -d ${DESTDIR}/${SBINDIR} ${DESTDIR}/usr/share/man/man5 ${DESTDIR}/usr/share/man/man8
@@ -135,3 +136,7 @@ install:
mkdir -p ${DESTDIR}/usr/share/makedumpfile/eppic_scripts
install -m 644 -D $(VPATH)makedumpfile.conf ${DESTDIR}/usr/share/makedumpfile/makedumpfile.conf.sample
install -m 644 -t ${DESTDIR}/usr/share/makedumpfile/eppic_scripts/ $(VPATH)eppic_scripts/*
+
+.PHONY: extensions
+extensions:
+ $(MAKE) -C extensions CC=$(CC)
\ No newline at end of file
diff --git a/extension.c b/extension.c
new file mode 100644
index 0000000..35e2756
--- /dev/null
+++ b/extension.c
@@ -0,0 +1,300 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <dirent.h>
+#include <dlfcn.h>
+#include <stdbool.h>
+#include <unistd.h>
+#include "kallsyms.h"
+#include "btf_info.h"
+#include "extension.h"
+
+typedef int (*callback_fn)(unsigned long, const void *);
+
+struct extension_handle_cb {
+ void *handle;
+ callback_fn cb;
+};
+
+/* Extension .so extension_handle_cb array */
+static struct extension_handle_cb **handle_cbs = NULL;
+static int handle_cbs_len = 0;
+static int handle_cbs_cap = 0;
+
+/* Extension option array */
+static char **extension_opts = NULL;
+static int extension_opts_len = 0;
+static int extension_opts_cap = 0;
+
+static const char *dirs[] = {
+ "/usr/lib64/makedumpfile/extensions/",
+ "./extensions/",
+};
+
+void add_extension_opts(char *opt)
+{
+ if (!add_to_arr((void ***)&extension_opts, &extension_opts_len,
+ &extension_opts_cap, opt))
+ /*
+ * If fail, print error info and skip the extension.
+ */
+ fprintf(stderr, "%s: Fail to add extension %s\n", __func__, opt);
+}
+
+static bool init_kallsyms_btf(void)
+{
+ int count;
+ bool ret = false;
+ /* We will load module's btf/kallsyms on demand */
+ bool init_ksyms_module = false;
+ bool init_ktypes_module = false;
+
+ if (check_ksyms_require_modname("vmlinux", &count)) {
+ if (!init_kernel_kallsyms())
+ goto out;
+ if (count >= 2)
+ init_ksyms_module = true;
+ }
+ if (check_ktypes_require_modname("vmlinux", &count)) {
+ if (!init_kernel_btf())
+ goto out;
+ if (count >= 2)
+ init_ktypes_module = true;
+ }
+ if (init_ksyms_module && !init_module_kallsyms())
+ goto out;
+ if (init_ktypes_module && !init_module_btf())
+ goto out;
+ ret = true;
+out:
+ return ret;
+}
+
+static void cleanup_kallsyms_btf(void)
+{
+ cleanup_kallsyms();
+ cleanup_btf();
+}
+
+static void load_extensions(void)
+{
+ char path[512];
+ int len, i, j;
+ void *handle;
+ struct extension_handle_cb *ehc;
+
+ for (i = 0; i < extension_opts_len; i++) {
+ handle = NULL;
+ if (!extension_opts[i])
+ continue;
+ if ((len = strlen(extension_opts[i])) <= 3 ||
+ (strcmp(extension_opts[i] + len - 3, ".so") != 0)) {
+ fprintf(stderr, "%s: Skip invalid extension: %s\n",
+ __func__, extension_opts[i]);
+ continue;
+ }
+
+ if (extension_opts[i][0] == '/') {
+ /* Path & filename */
+ snprintf(path, sizeof(path), "%s", extension_opts[i]);
+ handle = dlopen(path, RTLD_NOW);
+ if (!handle) {
+ fprintf(stderr, "%s: Failed to load %s\n",
+ __func__, dlerror());
+ continue;
+ }
+ } else {
+ /* Only filename */
+ for (j = 0; j < sizeof(dirs) / sizeof(char *); j++) {
+ snprintf(path, sizeof(path), "%s", dirs[j]);
+ len = strlen(path);
+ snprintf(path + len, sizeof(path) - len, "%s",
+ extension_opts[i]);
+ if (access(path, F_OK) == 0) {
+ handle = dlopen(path, RTLD_NOW);
+ if (handle)
+ break;
+ else
+ fprintf(stderr, "%s: Failed to load %s\n",
+ __func__, dlerror());
+ }
+ }
+ if (!handle && j >= sizeof(dirs) / sizeof(char *)) {
+ fprintf(stderr, "%s: Not found %s\n",
+ __func__, extension_opts[i]);
+ continue;
+ }
+ }
+
+ if (dlsym(handle, "extension_init") == NULL) {
+ fprintf(stderr, "%s: Skip extension %s: No extension_init()\n",
+ __func__, path);
+ dlclose(handle);
+ continue;
+ }
+
+ if ((ehc = malloc(sizeof(struct extension_handle_cb))) == NULL) {
+ fprintf(stderr, "%s: Skip extension %s: No memory\n",
+ __func__, path);
+ dlclose(handle);
+ continue;
+ }
+
+ ehc->handle = handle;
+ ehc->cb = dlsym(handle, "extension_callback");
+
+ if (!add_to_arr((void ***)&handle_cbs, &handle_cbs_len, &handle_cbs_cap, ehc)) {
+ fprintf(stderr, "%s: Failed to load %s\n", __func__,
+ extension_opts[i]);
+ free(ehc);
+ dlclose(handle);
+ continue;
+ }
+ printf("Loaded extension: %s\n", path);
+ }
+}
+
+static bool register_extension_sections(void)
+{
+ char *start, *stop;
+ int i;
+ bool ret = false;
+
+ for (i = 0; i < handle_cbs_len; i++) {
+ start = dlsym(handle_cbs[i]->handle, "__start_init_ksyms");
+ stop = dlsym(handle_cbs[i]->handle, "__stop_init_ksyms");
+ if (!register_ksym_section(start, stop))
+ goto out;
+
+ start = dlsym(handle_cbs[i]->handle, "__start_init_ktypes");
+ stop = dlsym(handle_cbs[i]->handle, "__stop_init_ktypes");
+ if (!register_ktype_section(start, stop))
+ goto out;
+ }
+ ret = true;
+out:
+ return ret;
+}
+
+void cleanup_extensions(void)
+{
+ for (int i = 0; i < handle_cbs_len; i++) {
+ dlclose(handle_cbs[i]->handle);
+ free(handle_cbs[i]);
+ }
+ if (handle_cbs) {
+ free(handle_cbs);
+ handle_cbs = NULL;
+ }
+ handle_cbs_len = 0;
+ handle_cbs_cap = 0;
+ if (extension_opts) {
+ free(extension_opts);
+ extension_opts = NULL;
+ }
+ extension_opts_len = 0;
+ extension_opts_cap = 0;
+
+ cleanup_kallsyms_btf();
+}
+
+static bool check_required_ksyms_all_resolved(void *handle)
+{
+ char *start, *stop;
+ struct ksym_info **p;
+ bool ret = true;
+
+ start = dlsym(handle, "__start_init_ksyms");
+ stop = dlsym(handle, "__stop_init_ksyms");
+
+ for (p = (struct ksym_info **)start;
+ p < (struct ksym_info **)stop;
+ p++) {
+ if ((*p)->sym_required && !SYM_EXIST(*p)) {
+ ret = false;
+ fprintf(stderr, "Symbol %s in %s not found\n",
+ (*p)->symname, (*p)->modname);
+ }
+ }
+
+ return ret;
+}
+
+static bool check_required_ktypes_all_resolved(void *handle)
+{
+ char *start, *stop;
+ struct ktype_info **p;
+ bool ret = true;
+
+ start = dlsym(handle, "__start_init_ktypes");
+ stop = dlsym(handle, "__stop_init_ktypes");
+
+ for (p = (struct ktype_info **)start;
+ p < (struct ktype_info **)stop;
+ p++) {
+ if (!TYPE_EXIST(*p)) {
+ if ((*p)->member_required) {
+ ret = false;
+ fprintf(stderr, "Member %s of struct %s in %s not found\n",
+ (*p)->member_name, (*p)->struct_name, (*p)->modname);
+ } else if ((*p)->struct_required) {
+ ret = false;
+ fprintf(stderr, "Struct %s in %s not found\n",
+ (*p)->struct_name, (*p)->modname);
+ }
+ }
+ }
+
+ return ret;
+}
+
+static bool extension_runnable(void *handle)
+{
+ return check_required_ksyms_all_resolved(handle) &&
+ check_required_ktypes_all_resolved(handle);
+}
+
+void init_extensions(void)
+{
+ /* Entry of extension init */
+ void (*init)(void);
+
+ load_extensions();
+ if (!register_extension_sections())
+ goto fail;
+ if (!init_kallsyms_btf())
+ goto fail;
+ for (int i = 0; i < handle_cbs_len; i++) {
+ if (extension_runnable(handle_cbs[i]->handle)) {
+ init = dlsym(handle_cbs[i]->handle, "extension_init");
+ init();
+ } else {
+ fprintf(stderr, "%s: Skip %dth extension\n",
+ __func__, i + 1);
+ }
+ }
+ return;
+fail:
+ fprintf(stderr, "%s: fail & skip all extensions\n", __func__);
+ cleanup_extensions();
+}
+
+int run_extension_callback(unsigned long pfn, const void *pcache)
+{
+ int result;
+ int ret = PG_UNDECID;
+
+ for (int i = 0; i < handle_cbs_len; i++) {
+ if (handle_cbs[i]->cb) {
+ result = handle_cbs[i]->cb(pfn, pcache);
+ if (result == PG_INCLUDE) {
+ ret = result;
+ goto out;
+ } else if (result == PG_EXCLUDE) {
+ ret = result;
+ }
+ }
+ }
+out:
+ return ret;
+}
\ No newline at end of file
diff --git a/extension.h b/extension.h
new file mode 100644
index 0000000..dc5902e
--- /dev/null
+++ b/extension.h
@@ -0,0 +1,12 @@
+#ifndef _EXTENSION_H
+#define _EXTENSION_H
+
+enum {
+ PG_INCLUDE, // Extension will keep the page
+ PG_EXCLUDE, // Extension will discard the page
+ PG_UNDECID, // Extension makes no decision
+};
+int run_extension_callback(unsigned long pfn, const void *pcache);
+void init_extensions(void);
+void cleanup_extensions(void);
+#endif /* _EXTENSION_H */
\ No newline at end of file
diff --git a/extensions/Makefile b/extensions/Makefile
new file mode 100644
index 0000000..b8bbfbc
--- /dev/null
+++ b/extensions/Makefile
@@ -0,0 +1,10 @@
+CC ?= gcc
+CONTRIB_SO :=
+
+all: $(CONTRIB_SO)
+
+$(CONTRIB_SO): %.so: %.c
+ $(CC) -O2 -g -fPIC -shared -Wl,-T,../makedumpfile.ld -o $@ $^
+
+clean:
+ rm -f $(CONTRIB_SO)
\ No newline at end of file
diff --git a/makedumpfile.c b/makedumpfile.c
index dba3628..ef7468f 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -28,6 +28,7 @@
#include <assert.h>
#include <zlib.h>
#include "kallsyms.h"
+#include "extension.h"
struct symbol_table symbol_table;
struct size_table size_table;
@@ -102,6 +103,7 @@ mdf_pfn_t pfn_free;
mdf_pfn_t pfn_hwpoison;
mdf_pfn_t pfn_offline;
mdf_pfn_t pfn_elf_excluded;
+mdf_pfn_t pfn_extension;
mdf_pfn_t num_dumped;
@@ -6459,6 +6461,7 @@ __exclude_unnecessary_pages(unsigned long mem_map,
unsigned int order_offset, dtor_offset;
unsigned long flags, mapping, private = 0;
unsigned long compound_dtor, compound_head = 0;
+ int filter_pg;
/*
* If a multi-page exclusion is pending, do it first
@@ -6531,6 +6534,14 @@ __exclude_unnecessary_pages(unsigned long mem_map,
pfn_read_end = pfn + pfn_mm - 1;
}
+ /*
+ * Include pages that are specified by the user via
+ * makedumpfile extensions
+ */
+ filter_pg = run_extension_callback(pfn, pcache);
+ if (filter_pg == PG_INCLUDE)
+ continue;
+
flags = ULONG(pcache + OFFSET(page.flags));
_count = UINT(pcache + OFFSET(page._refcount));
mapping = ULONG(pcache + OFFSET(page.mapping));
@@ -6687,6 +6698,14 @@ check_order:
else if (isOffline(flags, _mapcount)) {
pfn_counter = &pfn_offline;
}
+ /*
+ * Exclude pages that are specified by the user via
+ * makedumpfile extensions
+ */
+ else if (filter_pg == PG_EXCLUDE) {
+ nr_pages = 1;
+ pfn_counter = &pfn_extension;
+ }
/*
* Unexcludable page
*/
@@ -8234,7 +8253,7 @@ write_elf_pages_cyclic(struct cache_data *cd_header, struct cache_data *cd_page)
*/
if (info->flag_cyclic) {
pfn_zero = pfn_cache = pfn_cache_private = 0;
- pfn_user = pfn_free = pfn_hwpoison = pfn_offline = 0;
+ pfn_user = pfn_free = pfn_hwpoison = pfn_offline = pfn_extension = 0;
pfn_memhole = info->max_mapnr;
}
@@ -9579,7 +9598,7 @@ write_kdump_pages_and_bitmap_cyclic(struct cache_data *cd_header, struct cache_d
* Reset counter for debug message.
*/
pfn_zero = pfn_cache = pfn_cache_private = 0;
- pfn_user = pfn_free = pfn_hwpoison = pfn_offline = 0;
+ pfn_user = pfn_free = pfn_hwpoison = pfn_offline = pfn_extension = 0;
pfn_memhole = info->max_mapnr;
/*
@@ -10528,7 +10547,7 @@ print_report(void)
pfn_original = info->max_mapnr - pfn_memhole;
pfn_excluded = pfn_zero + pfn_cache + pfn_cache_private
- + pfn_user + pfn_free + pfn_hwpoison + pfn_offline;
+ + pfn_user + pfn_free + pfn_hwpoison + pfn_offline + pfn_extension;
REPORT_MSG("\n");
REPORT_MSG("Original pages : 0x%016llx\n", pfn_original);
@@ -10544,6 +10563,7 @@ print_report(void)
REPORT_MSG(" Free pages : 0x%016llx\n", pfn_free);
REPORT_MSG(" Hwpoison pages : 0x%016llx\n", pfn_hwpoison);
REPORT_MSG(" Offline pages : 0x%016llx\n", pfn_offline);
+ REPORT_MSG(" Extension filter pages : 0x%016llx\n", pfn_extension);
REPORT_MSG(" Remaining pages : 0x%016llx\n",
pfn_original - pfn_excluded);
@@ -10584,7 +10604,7 @@ print_mem_usage(void)
pfn_original = info->max_mapnr - pfn_memhole;
pfn_excluded = pfn_zero + pfn_cache + pfn_cache_private
- + pfn_user + pfn_free + pfn_hwpoison + pfn_offline;
+ + pfn_user + pfn_free + pfn_hwpoison + pfn_offline + pfn_extension;
shrinking = (pfn_original - pfn_excluded) * 100;
shrinking = shrinking / pfn_original;
total_size = info->page_size * pfn_original;
@@ -10878,6 +10898,7 @@ create_dumpfile(void)
}
print_vtop();
+ init_extensions();
num_retry = 0;
retry:
@@ -10888,8 +10909,11 @@ retry:
&& !gather_filter_info())
return FALSE;
- if (!create_dump_bitmap())
+ if (!create_dump_bitmap()) {
+ cleanup_extensions();
return FALSE;
+ }
+ cleanup_extensions();
if (info->flag_split) {
if ((status = writeout_multiple_dumpfiles()) == FALSE)
@@ -12130,6 +12154,7 @@ static struct option longopts[] = {
{"check-params", no_argument, NULL, OPT_CHECK_PARAMS},
{"dry-run", no_argument, NULL, OPT_DRY_RUN},
{"show-stats", no_argument, NULL, OPT_SHOW_STATS},
+ {"extension", required_argument, NULL, OPT_EXTENSION},
{0, 0, 0, 0}
};
@@ -12317,6 +12342,9 @@ main(int argc, char *argv[])
case OPT_SHOW_STATS:
flag_show_stats = TRUE;
break;
+ case OPT_EXTENSION:
+ add_extension_opts(optarg);
+ break;
case '?':
MSG("Commandline parameter is invalid.\n");
MSG("Try `makedumpfile --help' for more information.\n");
diff --git a/makedumpfile.h b/makedumpfile.h
index 0f13743..d880ae7 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -2747,6 +2747,7 @@ struct elf_prstatus {
#define OPT_CHECK_PARAMS OPT_START+18
#define OPT_DRY_RUN OPT_START+19
#define OPT_SHOW_STATS OPT_START+20
+#define OPT_EXTENSION OPT_START+21
/*
* Function Prototype.
@@ -2777,5 +2778,6 @@ int write_and_check_space(int fd, void *buf, size_t buf_size,
int open_dump_file(void);
int dump_lockless_dmesg(void);
unsigned long long memparse(char *ptr, char **retptr);
+void add_extension_opts(char *opt);
#endif /* MAKEDUMPFILE_H */
--
2.47.0
* Re: [PATCH v4][makedumpfile 6/7] Add makedumpfile extensions support
2026-03-17 15:07 ` [PATCH v4][makedumpfile 6/7] Add makedumpfile extensions support Tao Liu
@ 2026-04-03 0:11 ` Stephen Brennan
2026-04-03 8:14 ` HAGIO KAZUHITO(萩尾 一仁)
1 sibling, 0 replies; 21+ messages in thread
From: Stephen Brennan @ 2026-04-03 0:11 UTC (permalink / raw)
To: Tao Liu, yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, Tao Liu
Tao Liu <ltao@redhat.com> writes:
> Extensions can be specified with the makedumpfile command-line parameter
> "--extension", followed by the extension's filename or absolute path. If
> only a filename is given, "./extensions/" and
> "/usr/lib64/makedumpfile/extensions/" will be searched.
>
> The procedure for extensions is as follows:
>
> Step 0: Every extension declares which kernel symbols/types it needs
> at programming time. This info is stored in the .init_ksyms/.init_ktypes
> sections. Each extension also provides a callback function for
> makedumpfile to call.
>
> Step 1: Register the .init_ksyms and .init_ktypes sections of makedumpfile
> itself and of the extensions' .so files, then tell the kallsyms/btf
> subcomponents which kernel symbols/types need to be resolved. Callbacks
> are also registered.
>
> Step 2: Init the kernel's/modules' btf/kallsyms on demand. Any unneeded
> kernel modules are skipped.
>
> Step 3: During btf/kallsyms parsing, the needed info is filled in.
> Syms/types defined via the INIT_OPT(...) macro are optional: parsing
> does not fail if any are missing; instead, they need to be checked
> within extension_init() of each extension. Syms/types defined via the
> INIT_(...) macro are mandatory: if any are missing, the extension fails
> at this step and is skipped.
>
> After this step, the required kernel symbol values and kernel type
> sizes/offsets are resolved, and the extensions are ready to go.
>
> Step 4: When makedumpfile does page filtering, in addition to its
> original filtering mechanism, it calls the extensions' callbacks for
> advice on whether a page should be included or excluded.
>
> Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> Signed-off-by: Tao Liu <ltao@redhat.com>
> ---
> Makefile | 7 +-
> extension.c | 300 ++++++++++++++++++++++++++++++++++++++++++++
> extension.h | 12 ++
> extensions/Makefile | 10 ++
> makedumpfile.c | 38 +++++-
> makedumpfile.h | 2 +
I think the manual page will need updating with this change.
Documentation updates could probably be done in a separate patch, though.
> 6 files changed, 363 insertions(+), 6 deletions(-)
> create mode 100644 extension.c
> create mode 100644 extension.h
> create mode 100644 extensions/Makefile
>
> diff --git a/Makefile b/Makefile
> index 320677d..1bb67d9 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -45,7 +45,7 @@ CFLAGS_ARCH += -m32
> endif
>
> SRC_BASE = makedumpfile.c makedumpfile.h diskdump_mod.h sadump_mod.h sadump_info.h
> -SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c btf_info.c
> +SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c btf_info.c extension.c
> OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
> SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c arch/mips64.c arch/loongarch64.c arch/riscv64.c
> OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
> @@ -126,6 +126,7 @@ eppic_makedumpfile.so: extension_eppic.c
>
> clean:
> rm -f $(OBJ) $(OBJ_PART) $(OBJ_ARCH) makedumpfile makedumpfile.8 makedumpfile.conf.5
> + $(MAKE) -C extensions clean
>
> install:
> install -m 755 -d ${DESTDIR}/${SBINDIR} ${DESTDIR}/usr/share/man/man5 ${DESTDIR}/usr/share/man/man8
> @@ -135,3 +136,7 @@ install:
> mkdir -p ${DESTDIR}/usr/share/makedumpfile/eppic_scripts
> install -m 644 -D $(VPATH)makedumpfile.conf ${DESTDIR}/usr/share/makedumpfile/makedumpfile.conf.sample
> install -m 644 -t ${DESTDIR}/usr/share/makedumpfile/eppic_scripts/ $(VPATH)eppic_scripts/*
> +
> +.PHONY: extensions
> +extensions:
> + $(MAKE) -C extensions CC=$(CC)
> \ No newline at end of file
> diff --git a/extension.c b/extension.c
> new file mode 100644
> index 0000000..35e2756
> --- /dev/null
> +++ b/extension.c
> @@ -0,0 +1,300 @@
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <dirent.h>
> +#include <dlfcn.h>
> +#include <stdbool.h>
> +#include <unistd.h>
> +#include "kallsyms.h"
> +#include "btf_info.h"
> +#include "extension.h"
> +
> +typedef int (*callback_fn)(unsigned long, const void *);
> +
> +struct extension_handle_cb {
> + void *handle;
> + callback_fn cb;
> +};
> +
> +/* Extension .so extension_handle_cb array */
> +static struct extension_handle_cb **handle_cbs = NULL;
> +static int handle_cbs_len = 0;
> +static int handle_cbs_cap = 0;
> +
> +/* Extension option array */
> +static char **extension_opts = NULL;
> +static int extension_opts_len = 0;
> +static int extension_opts_cap = 0;
> +
> +static const char *dirs[] = {
> + "/usr/lib64/makedumpfile/extensions/",
> + "./extensions/",
> +};
> +
> +void add_extension_opts(char *opt)
> +{
> + if (!add_to_arr((void ***)&extension_opts, &extension_opts_len,
> + &extension_opts_cap, opt))
> + /*
> + * If fail, print error info and skip the extension.
> + */
> + fprintf(stderr, "%s: Fail to add extension %s\n", __func__, opt);
> +}
> +
> +static bool init_kallsyms_btf(void)
> +{
> + int count;
> + bool ret = false;
> + /* We will load module's btf/kallsyms on demand */
> + bool init_ksyms_module = false;
> + bool init_ktypes_module = false;
> +
> + if (check_ksyms_require_modname("vmlinux", &count)) {
> + if (!init_kernel_kallsyms())
> + goto out;
> + if (count >= 2)
> + init_ksyms_module = true;
> + }
> + if (check_ktypes_require_modname("vmlinux", &count)) {
> + if (!init_kernel_btf())
> + goto out;
> + if (count >= 2)
> + init_ktypes_module = true;
> + }
> + if (init_ksyms_module && !init_module_kallsyms())
> + goto out;
> + if (init_ktypes_module && !init_module_btf())
> + goto out;
> + ret = true;
> +out:
> + return ret;
> +}
> +
> +static void cleanup_kallsyms_btf(void)
> +{
> + cleanup_kallsyms();
> + cleanup_btf();
> +}
> +
> +static void load_extensions(void)
> +{
> + char path[512];
> + int len, i, j;
> + void *handle;
> + struct extension_handle_cb *ehc;
> +
> + for (i = 0; i < extension_opts_len; i++) {
> + handle = NULL;
> + if (!extension_opts[i])
> + continue;
> + if ((len = strlen(extension_opts[i])) <= 3 ||
> + (strcmp(extension_opts[i] + len - 3, ".so") != 0)) {
> + fprintf(stderr, "%s: Skip invalid extension: %s\n",
> + __func__, extension_opts[i]);
> + continue;
> + }
> +
> + if (extension_opts[i][0] == '/') {
> + /* Path & filename */
> + snprintf(path, sizeof(path), "%s", extension_opts[i]);
> + handle = dlopen(path, RTLD_NOW);
> + if (!handle) {
> + fprintf(stderr, "%s: Failed to load %s\n",
> + __func__, dlerror());
> + continue;
> + }
> + } else {
> + /* Only filename */
> + for (j = 0; j < sizeof(dirs) / sizeof(char *); j++) {
> + snprintf(path, sizeof(path), "%s", dirs[j]);
> + len = strlen(path);
> + snprintf(path + len, sizeof(path) - len, "%s",
> + extension_opts[i]);
> + if (access(path, F_OK) == 0) {
> + handle = dlopen(path, RTLD_NOW);
> + if (handle)
> + break;
> + else
> + fprintf(stderr, "%s: Failed to load %s\n",
> + __func__, dlerror());
> + }
> + }
> + if (!handle && j >= sizeof(dirs) / sizeof(char *)) {
> + fprintf(stderr, "%s: Not found %s\n",
> + __func__, extension_opts[i]);
> + continue;
> + }
> + }
> +
> + if (dlsym(handle, "extension_init") == NULL) {
> + fprintf(stderr, "%s: Skip extension %s: No extension_init()\n",
> + __func__, path);
> + dlclose(handle);
> + continue;
> + }
> +
> + if ((ehc = malloc(sizeof(struct extension_handle_cb))) == NULL) {
> + fprintf(stderr, "%s: Skip extension %s: No memory\n",
> + __func__, path);
> + dlclose(handle);
> + continue;
> + }
> +
> + ehc->handle = handle;
> + ehc->cb = dlsym(handle, "extension_callback");
> +
> + if (!add_to_arr((void ***)&handle_cbs, &handle_cbs_len, &handle_cbs_cap, ehc)) {
> + fprintf(stderr, "%s: Failed to load %s\n", __func__,
> + extension_opts[i]);
> + free(ehc);
> + dlclose(handle);
> + continue;
> + }
> + printf("Loaded extension: %s\n", path);
> + }
> +}
> +
> +static bool register_extension_sections(void)
> +{
> + char *start, *stop;
> + int i;
> + bool ret = false;
> +
> + for (i = 0; i < handle_cbs_len; i++) {
> + start = dlsym(handle_cbs[i]->handle, "__start_init_ksyms");
> + stop = dlsym(handle_cbs[i]->handle, "__stop_init_ksyms");
> + if (!register_ksym_section(start, stop))
> + goto out;
> +
> + start = dlsym(handle_cbs[i]->handle, "__start_init_ktypes");
> + stop = dlsym(handle_cbs[i]->handle, "__stop_init_ktypes");
> + if (!register_ktype_section(start, stop))
> + goto out;
> + }
> + ret = true;
> +out:
> + return ret;
> +}
> +
> +void cleanup_extensions(void)
> +{
> + for (int i = 0; i < handle_cbs_len; i++) {
> + dlclose(handle_cbs[i]->handle);
> + free(handle_cbs[i]);
> + }
> + if (handle_cbs) {
> + free(handle_cbs);
> + handle_cbs = NULL;
> + }
> + handle_cbs_len = 0;
> + handle_cbs_cap = 0;
> + if (extension_opts) {
> + free(extension_opts);
> + extension_opts = NULL;
> + }
> + extension_opts_len = 0;
> + extension_opts_cap = 0;
> +
> + cleanup_kallsyms_btf();
> +}
> +
> +static bool check_required_ksyms_all_resolved(void *handle)
> +{
> + char *start, *stop;
> + struct ksym_info **p;
> + bool ret = true;
> +
> + start = dlsym(handle, "__start_init_ksyms");
> + stop = dlsym(handle, "__stop_init_ksyms");
> +
> + for (p = (struct ksym_info **)start;
> + p < (struct ksym_info **)stop;
> + p++) {
> + if ((*p)->sym_required && !SYM_EXIST(*p)) {
> + ret = false;
> + fprintf(stderr, "Symbol %s in %s not found\n",
> + (*p)->symname, (*p)->modname);
> + }
> + }
> +
> + return ret;
> +}
> +
> +static bool check_required_ktypes_all_resolved(void *handle)
> +{
> + char *start, *stop;
> + struct ktype_info **p;
> + bool ret = true;
> +
> + start = dlsym(handle, "__start_init_ktypes");
> + stop = dlsym(handle, "__stop_init_ktypes");
> +
> + for (p = (struct ktype_info **)start;
> + p < (struct ktype_info **)stop;
> + p++) {
> + if (!TYPE_EXIST(*p)) {
> + if ((*p)->member_required) {
> + ret = false;
> + fprintf(stderr, "Member %s of struct %s in %s not found\n",
> + (*p)->member_name, (*p)->struct_name, (*p)->modname);
> + } else if ((*p)->struct_required) {
> + ret = false;
> + fprintf(stderr, "Struct %s in %s not found\n",
> + (*p)->struct_name, (*p)->modname);
> + }
> + }
> + }
> +
> + return ret;
> +}
> +
> +static bool extension_runnable(void *handle)
> +{
> + return check_required_ksyms_all_resolved(handle) &&
> + check_required_ktypes_all_resolved(handle);
> +}
> +
> +void init_extensions(void)
> +{
> + /* Entry of extension init */
> + void (*init)(void);
> +
> + load_extensions();
> + if (!register_extension_sections())
> + goto fail;
> + if (!init_kallsyms_btf())
> + goto fail;
> + for (int i = 0; i < handle_cbs_len; i++) {
> + if (extension_runnable(handle_cbs[i]->handle)) {
> + init = dlsym(handle_cbs[i]->handle, "extension_init");
> + init();
> + } else {
> + fprintf(stderr, "%s: Skip %dth extension\n",
> + __func__, i + 1);
> + }
> + }
> + return;
> +fail:
> + fprintf(stderr, "%s: fail & skip all extensions\n", __func__);
> + cleanup_extensions();
> +}
> +
> +int run_extension_callback(unsigned long pfn, const void *pcache)
> +{
> + int result;
> + int ret = PG_UNDECID;
> +
> + for (int i = 0; i < handle_cbs_len; i++) {
> + if (handle_cbs[i]->cb) {
> + result = handle_cbs[i]->cb(pfn, pcache);
> + if (result == PG_INCLUDE) {
> + ret = result;
> + goto out;
> + } else if (result == PG_EXCLUDE) {
> + ret = result;
> + }
> + }
> + }
> +out:
> + return ret;
> +}
> \ No newline at end of file
> diff --git a/extension.h b/extension.h
> new file mode 100644
> index 0000000..dc5902e
> --- /dev/null
> +++ b/extension.h
> @@ -0,0 +1,12 @@
> +#ifndef _EXTENSION_H
> +#define _EXTENSION_H
> +
> +enum {
> + PG_INCLUDE, // Extension will keep the page
> + PG_EXCLUDE, // Extension will discard the page
> + PG_UNDECID, // Extension makes no decision
> +};
> +int run_extension_callback(unsigned long pfn, const void *pcache);
> +void init_extensions(void);
> +void cleanup_extensions(void);
> +#endif /* _EXTENSION_H */
> \ No newline at end of file
> diff --git a/extensions/Makefile b/extensions/Makefile
> new file mode 100644
> index 0000000..b8bbfbc
> --- /dev/null
> +++ b/extensions/Makefile
> @@ -0,0 +1,10 @@
> +CC ?= gcc
> +CONTRIB_SO :=
> +
> +all: $(CONTRIB_SO)
> +
> +$(CONTRIB_SO): %.so: %.c
> + $(CC) -O2 -g -fPIC -shared -Wl,-T,../makedumpfile.ld -o $@ $^
> +
> +clean:
> + rm -f $(CONTRIB_SO)
> \ No newline at end of file
> diff --git a/makedumpfile.c b/makedumpfile.c
> index dba3628..ef7468f 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -28,6 +28,7 @@
> #include <assert.h>
> #include <zlib.h>
> #include "kallsyms.h"
> +#include "extension.h"
>
> struct symbol_table symbol_table;
> struct size_table size_table;
> @@ -102,6 +103,7 @@ mdf_pfn_t pfn_free;
> mdf_pfn_t pfn_hwpoison;
> mdf_pfn_t pfn_offline;
> mdf_pfn_t pfn_elf_excluded;
> +mdf_pfn_t pfn_extension;
>
> mdf_pfn_t num_dumped;
>
> @@ -6459,6 +6461,7 @@ __exclude_unnecessary_pages(unsigned long mem_map,
> unsigned int order_offset, dtor_offset;
> unsigned long flags, mapping, private = 0;
> unsigned long compound_dtor, compound_head = 0;
> + int filter_pg;
>
> /*
> * If a multi-page exclusion is pending, do it first
> @@ -6531,6 +6534,14 @@ __exclude_unnecessary_pages(unsigned long mem_map,
> pfn_read_end = pfn + pfn_mm - 1;
> }
>
> + /*
> + * Include pages that are specified by the user via
> + * makedumpfile extensions
> + */
> + filter_pg = run_extension_callback(pfn, pcache);
> + if (filter_pg == PG_INCLUDE)
> + continue;
> +
> flags = ULONG(pcache + OFFSET(page.flags));
> _count = UINT(pcache + OFFSET(page._refcount));
> mapping = ULONG(pcache + OFFSET(page.mapping));
> @@ -6687,6 +6698,14 @@ check_order:
> else if (isOffline(flags, _mapcount)) {
> pfn_counter = &pfn_offline;
> }
> + /*
> + * Exclude pages that are specified by the user via
> + * makedumpfile extensions
> + */
> + else if (filter_pg == PG_EXCLUDE) {
> + nr_pages = 1;
> + pfn_counter = &pfn_extension;
> + }
> /*
> * Unexcludable page
> */
> @@ -8234,7 +8253,7 @@ write_elf_pages_cyclic(struct cache_data *cd_header, struct cache_data *cd_page)
> */
> if (info->flag_cyclic) {
> pfn_zero = pfn_cache = pfn_cache_private = 0;
> - pfn_user = pfn_free = pfn_hwpoison = pfn_offline = 0;
> + pfn_user = pfn_free = pfn_hwpoison = pfn_offline = pfn_extension = 0;
> pfn_memhole = info->max_mapnr;
> }
>
> @@ -9579,7 +9598,7 @@ write_kdump_pages_and_bitmap_cyclic(struct cache_data *cd_header, struct cache_d
> * Reset counter for debug message.
> */
> pfn_zero = pfn_cache = pfn_cache_private = 0;
> - pfn_user = pfn_free = pfn_hwpoison = pfn_offline = 0;
> + pfn_user = pfn_free = pfn_hwpoison = pfn_offline = pfn_extension = 0;
> pfn_memhole = info->max_mapnr;
>
> /*
> @@ -10528,7 +10547,7 @@ print_report(void)
> pfn_original = info->max_mapnr - pfn_memhole;
>
> pfn_excluded = pfn_zero + pfn_cache + pfn_cache_private
> - + pfn_user + pfn_free + pfn_hwpoison + pfn_offline;
> + + pfn_user + pfn_free + pfn_hwpoison + pfn_offline + pfn_extension;
>
> REPORT_MSG("\n");
> REPORT_MSG("Original pages : 0x%016llx\n", pfn_original);
> @@ -10544,6 +10563,7 @@ print_report(void)
> REPORT_MSG(" Free pages : 0x%016llx\n", pfn_free);
> REPORT_MSG(" Hwpoison pages : 0x%016llx\n", pfn_hwpoison);
> REPORT_MSG(" Offline pages : 0x%016llx\n", pfn_offline);
> + REPORT_MSG(" Extension filter pages : 0x%016llx\n", pfn_extension);
> REPORT_MSG(" Remaining pages : 0x%016llx\n",
> pfn_original - pfn_excluded);
>
> @@ -10584,7 +10604,7 @@ print_mem_usage(void)
> pfn_original = info->max_mapnr - pfn_memhole;
>
> pfn_excluded = pfn_zero + pfn_cache + pfn_cache_private
> - + pfn_user + pfn_free + pfn_hwpoison + pfn_offline;
> + + pfn_user + pfn_free + pfn_hwpoison + pfn_offline + pfn_extension;
> shrinking = (pfn_original - pfn_excluded) * 100;
> shrinking = shrinking / pfn_original;
> total_size = info->page_size * pfn_original;
> @@ -10878,6 +10898,7 @@ create_dumpfile(void)
> }
>
> print_vtop();
> + init_extensions();
>
> num_retry = 0;
> retry:
> @@ -10888,8 +10909,11 @@ retry:
> && !gather_filter_info())
> return FALSE;
>
> - if (!create_dump_bitmap())
> + if (!create_dump_bitmap()) {
> + cleanup_extensions();
> return FALSE;
> + }
> + cleanup_extensions();
If a retry happens, then all of the data related to extensions will be
cleared: including the list of extensions, the callbacks, etc.
Functionally, extensions will be disabled on the retry. Is that
intentional?
If not, then I think we'll need to move this cleanup so that it happens
when we exit the retry loop (either by an early return, or on success).
If this is intentional, then it probably ought to be documented.
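One possible shape for "cleanup when we exit the retry loop" is sketched
below. It is only an illustration under stated assumptions: the stubs
(create_dump_bitmap(), write_dump_needs_retry()) and the wrapper name
create_dumpfile_sketch() are hypothetical stand-ins, not the real
makedumpfile functions, and the retry is modelled as a plain loop instead
of the "retry:" label.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real makedumpfile functions. */
static int extensions_loaded;   /* 1 between init and cleanup */
static int attempts;            /* number of bitmap passes so far */
static int active_attempts;     /* passes that still saw extensions */

static void init_extensions(void)    { extensions_loaded = 1; }
static void cleanup_extensions(void) { extensions_loaded = 0; }

static bool create_dump_bitmap(void)
{
	attempts++;
	active_attempts += extensions_loaded;
	return true;
}

/* Pretend the first write runs out of space and forces one retry. */
static bool write_dump_needs_retry(void)
{
	return attempts < 2;
}

/* Body of the retry loop; it never touches extension state. */
static bool create_dumpfile_body(void)
{
	for (;;) {	/* models the "retry:" label */
		if (!create_dump_bitmap())
			return false;
		if (!write_dump_needs_retry())
			return true;
	}
}

/* Wrapper with a single cleanup point covering every exit path. */
static bool create_dumpfile_sketch(void)
{
	bool ok;

	init_extensions();
	ok = create_dumpfile_body();
	cleanup_extensions();
	return ok;
}
```

With this arrangement the second bitmap pass still sees the loaded
extensions. The trade-off is that extension state (and whatever
kallsyms/btf data it holds) stays allocated during the write phase, which
may or may not be acceptable here.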
Thanks,
Stephen
> if (info->flag_split) {
> if ((status = writeout_multiple_dumpfiles()) == FALSE)
> @@ -12130,6 +12154,7 @@ static struct option longopts[] = {
> {"check-params", no_argument, NULL, OPT_CHECK_PARAMS},
> {"dry-run", no_argument, NULL, OPT_DRY_RUN},
> {"show-stats", no_argument, NULL, OPT_SHOW_STATS},
> + {"extension", required_argument, NULL, OPT_EXTENSION},
> {0, 0, 0, 0}
> };
>
> @@ -12317,6 +12342,9 @@ main(int argc, char *argv[])
> case OPT_SHOW_STATS:
> flag_show_stats = TRUE;
> break;
> + case OPT_EXTENSION:
> + add_extension_opts(optarg);
> + break;
> case '?':
> MSG("Commandline parameter is invalid.\n");
> MSG("Try `makedumpfile --help' for more information.\n");
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 0f13743..d880ae7 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -2747,6 +2747,7 @@ struct elf_prstatus {
> #define OPT_CHECK_PARAMS OPT_START+18
> #define OPT_DRY_RUN OPT_START+19
> #define OPT_SHOW_STATS OPT_START+20
> +#define OPT_EXTENSION OPT_START+21
>
> /*
> * Function Prototype.
> @@ -2777,5 +2778,6 @@ int write_and_check_space(int fd, void *buf, size_t buf_size,
> int open_dump_file(void);
> int dump_lockless_dmesg(void);
> unsigned long long memparse(char *ptr, char **retptr);
> +void add_extension_opts(char *opt);
>
> #endif /* MAKEDUMPFILE_H */
> --
> 2.47.0
* Re: [PATCH v4][makedumpfile 6/7] Add makedumpfile extensions support
2026-03-17 15:07 ` [PATCH v4][makedumpfile 6/7] Add makedumpfile extensions support Tao Liu
2026-04-03 0:11 ` Stephen Brennan
@ 2026-04-03 8:14 ` HAGIO KAZUHITO(萩尾 一仁)
1 sibling, 0 replies; 21+ messages in thread
From: HAGIO KAZUHITO(萩尾 一仁) @ 2026-04-03 8:14 UTC (permalink / raw)
To: Tao Liu
Cc: stephen.s.brennan@oracle.com,
YAMAZAKI MASAMITSU(山崎 真光),
kexec@lists.infradead.org
On 2026/03/18 0:07, Tao Liu wrote:
> Extensions can be specified with the makedumpfile command-line parameter
> "--extension", followed by the extension's filename or absolute path. If
> only a filename is given, "./extensions/" and
> "/usr/lib64/makedumpfile/extensions/" will be searched.
>
> The procedure for extensions is as follows:
>
> Step 0: Every extension declares which kernel symbols/types it needs
> at programming time. This info is stored in the .init_ksyms/.init_ktypes
> sections. Each extension also provides a callback function for
> makedumpfile to call.
>
> Step 1: Register the .init_ksyms and .init_ktypes sections of makedumpfile
> itself and of the extensions' .so files, then tell the kallsyms/btf
> subcomponents which kernel symbols/types need to be resolved. Callbacks
> are also registered.
>
> Step 2: Init the kernel's/modules' btf/kallsyms on demand. Any unneeded
> kernel modules are skipped.
>
> Step 3: During btf/kallsyms parsing, the needed info is filled in.
> Syms/types defined via the INIT_OPT(...) macro are optional: parsing
> does not fail if any are missing; instead, they need to be checked
> within extension_init() of each extension. Syms/types defined via the
> INIT_(...) macro are mandatory: if any are missing, the extension fails
> at this step and is skipped.
>
> After this step, the required kernel symbol values and kernel type
> sizes/offsets are resolved, and the extensions are ready to go.
>
> Step 4: When makedumpfile does page filtering, in addition to its
> original filtering mechanism, it calls the extensions' callbacks for
> advice on whether a page should be included or excluded.
>
> Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
> Signed-off-by: Tao Liu <ltao@redhat.com>
> ---
> Makefile | 7 +-
> extension.c | 300 ++++++++++++++++++++++++++++++++++++++++++++
> extension.h | 12 ++
> extensions/Makefile | 10 ++
> makedumpfile.c | 38 +++++-
> makedumpfile.h | 2 +
> 6 files changed, 363 insertions(+), 6 deletions(-)
> create mode 100644 extension.c
> create mode 100644 extension.h
> create mode 100644 extensions/Makefile
>
> diff --git a/Makefile b/Makefile
> index 320677d..1bb67d9 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -45,7 +45,7 @@ CFLAGS_ARCH += -m32
> endif
>
> SRC_BASE = makedumpfile.c makedumpfile.h diskdump_mod.h sadump_mod.h sadump_info.h
> -SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c btf_info.c
> +SRC_PART = print_info.c dwarf_info.c elf_info.c erase_info.c sadump_info.c cache.c tools.c printk.c detect_cycle.c kallsyms.c btf_info.c extension.c
> OBJ_PART=$(patsubst %.c,%.o,$(SRC_PART))
> SRC_ARCH = arch/arm.c arch/arm64.c arch/x86.c arch/x86_64.c arch/ia64.c arch/ppc64.c arch/s390x.c arch/ppc.c arch/sparc64.c arch/mips64.c arch/loongarch64.c arch/riscv64.c
> OBJ_ARCH=$(patsubst %.c,%.o,$(SRC_ARCH))
> @@ -126,6 +126,7 @@ eppic_makedumpfile.so: extension_eppic.c
>
> clean:
> rm -f $(OBJ) $(OBJ_PART) $(OBJ_ARCH) makedumpfile makedumpfile.8 makedumpfile.conf.5
> + $(MAKE) -C extensions clean
>
> install:
> install -m 755 -d ${DESTDIR}/${SBINDIR} ${DESTDIR}/usr/share/man/man5 ${DESTDIR}/usr/share/man/man8
> @@ -135,3 +136,7 @@ install:
> mkdir -p ${DESTDIR}/usr/share/makedumpfile/eppic_scripts
> install -m 644 -D $(VPATH)makedumpfile.conf ${DESTDIR}/usr/share/makedumpfile/makedumpfile.conf.sample
> install -m 644 -t ${DESTDIR}/usr/share/makedumpfile/eppic_scripts/ $(VPATH)eppic_scripts/*
> +
> +.PHONY: extensions
> +extensions:
> + $(MAKE) -C extensions CC=$(CC)
> \ No newline at end of file
> diff --git a/extension.c b/extension.c
> new file mode 100644
> index 0000000..35e2756
> --- /dev/null
> +++ b/extension.c
> @@ -0,0 +1,300 @@
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <dirent.h>
> +#include <dlfcn.h>
> +#include <stdbool.h>
> +#include <unistd.h>
> +#include "kallsyms.h"
> +#include "btf_info.h"
> +#include "extension.h"
> +
> +typedef int (*callback_fn)(unsigned long, const void *);
> +
> +struct extension_handle_cb {
> + void *handle;
> + callback_fn cb;
> +};
> +
> +/* Extension .so extension_handle_cb array */
> +static struct extension_handle_cb **handle_cbs = NULL;
> +static int handle_cbs_len = 0;
> +static int handle_cbs_cap = 0;
> +
> +/* Extension option array */
> +static char **extension_opts = NULL;
> +static int extension_opts_len = 0;
> +static int extension_opts_cap = 0;
> +
> +static const char *dirs[] = {
> + "/usr/lib64/makedumpfile/extensions/",
> + "./extensions/",
> +};
I think we should add "./" as the first entry and arrange them in this order:
"./",
"./extensions/",
"/usr/lib64/makedumpfile/extensions/"
Without this, a relative path like the one below cannot be used, which is unnatural:
$ ls
amdgpu_filter.so makedumpfile vmcore
$ ./makedumpfile -ld31 --extension amdgpu_filter.so vmcore dump
load_extensions: Not found amdgpu_filter.so
Copying data : [ 26.5 %]
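A minimal sketch of the lookup with the proposed order, for illustration
only. The helper name resolve_extension_path() and the array name
search_dirs are hypothetical and do not appear in the patch; the directory
strings are the ones listed above. Because every candidate path contains a
slash, dlopen() would treat it as a pathname and not search the library
path.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Search order proposed in this review: current directory first. */
static const char *search_dirs[] = {
	"./",
	"./extensions/",
	"/usr/lib64/makedumpfile/extensions/",
};

/*
 * Hypothetical helper: store the first existing candidate for "name"
 * in buf. Returns 0 on success, -1 if nothing was found or the path
 * did not fit into buf.
 */
static int resolve_extension_path(const char *name, char *buf, size_t size)
{
	size_t i;
	int n;

	if (name[0] == '/') {
		/* Absolute path: use it as given. */
		n = snprintf(buf, size, "%s", name);
		if (n < 0 || (size_t)n >= size)
			return -1;
		return access(buf, F_OK) == 0 ? 0 : -1;
	}

	for (i = 0; i < sizeof(search_dirs) / sizeof(search_dirs[0]); i++) {
		n = snprintf(buf, size, "%s%s", search_dirs[i], name);
		if (n < 0 || (size_t)n >= size)
			continue;	/* truncated, try the next directory */
		if (access(buf, F_OK) == 0)
			return 0;
	}
	return -1;
}
```

With amdgpu_filter.so in the current directory, as in the listing above,
this would yield "./amdgpu_filter.so", which can then be handed to
dlopen().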
> +
> +void add_extension_opts(char *opt)
> +{
> + if (!add_to_arr((void ***)&extension_opts, &extension_opts_len,
> + &extension_opts_cap, opt))
> + /*
> + * If fail, print error info and skip the extension.
> + */
> + fprintf(stderr, "%s: Fail to add extension %s\n", __func__, opt);
> +}
> +
> +static bool init_kallsyms_btf(void)
> +{
> + int count;
> + bool ret = false;
> + /* We will load module's btf/kallsyms on demand */
> + bool init_ksyms_module = false;
> + bool init_ktypes_module = false;
> +
> + if (check_ksyms_require_modname("vmlinux", &count)) {
> + if (!init_kernel_kallsyms())
> + goto out;
> + if (count >= 2)
> + init_ksyms_module = true;
I may have misread this, but if an extension depends only on a module's
symbols, are no init functions run? Or can such a situation not occur?
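To make the question concrete, here is a toy model of the kallsyms half of
the control flow quoted above. It rests on an assumption that is not
confirmed by this patch: that check_ksyms_require_modname() returns true
only when some registered symbol names the given module, and stores the
number of distinct required module names in *count. The table
required_mods, the flags, and the module name "amdgpu" are purely
illustrative.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical requirement table: module names of registered symbols. */
static const char **required_mods;
static int required_mods_len;

/* Record which init paths would have run. */
static bool kernel_inited, module_inited;

/*
 * Assumed semantics (not taken from the patch): return true if
 * "modname" is among the required names, and store the number of
 * distinct required names in *count.
 */
static bool check_ksyms_require_modname(const char *modname, int *count)
{
	bool found = false;
	int i, j, distinct = 0;

	for (i = 0; i < required_mods_len; i++) {
		if (strcmp(required_mods[i], modname) == 0)
			found = true;
		for (j = 0; j < i; j++)
			if (strcmp(required_mods[i], required_mods[j]) == 0)
				break;
		if (j == i)	/* first occurrence of this name */
			distinct++;
	}
	*count = distinct;
	return found;
}

/* Same branching as the kallsyms part of init_kallsyms_btf(). */
static void init_kallsyms_sketch(const char **mods, int len)
{
	int count;
	bool init_ksyms_module = false;

	required_mods = mods;
	required_mods_len = len;
	kernel_inited = module_inited = false;

	if (check_ksyms_require_modname("vmlinux", &count)) {
		kernel_inited = true;	/* stands for init_kernel_kallsyms() */
		if (count >= 2)
			init_ksyms_module = true;
	}
	if (init_ksyms_module)
		module_inited = true;	/* stands for init_module_kallsyms() */
}
```

Under that assumption, a requirement list of { "amdgpu" } alone leaves both
flags false. Whether this can happen in practice depends on whether
makedumpfile's own sections always register a "vmlinux" entry, which is not
visible in this patch.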
> + }
> + if (check_ktypes_require_modname("vmlinux", &count)) {
> + if (!init_kernel_btf())
> + goto out;
> + if (count >= 2)
> + init_ktypes_module = true;
> + }
> + if (init_ksyms_module && !init_module_kallsyms())
> + goto out;
> + if (init_ktypes_module && !init_module_btf())
> + goto out;
> + ret = true;
> +out:
> + return ret;
> +}
> +
> +static void cleanup_kallsyms_btf(void)
> +{
> + cleanup_kallsyms();
> + cleanup_btf();
> +}
> +
> +static void load_extensions(void)
> +{
> + char path[512];
> + int len, i, j;
> + void *handle;
> + struct extension_handle_cb *ehc;
> +
> + for (i = 0; i < extension_opts_len; i++) {
> + handle = NULL;
> + if (!extension_opts[i])
> + continue;
> + if ((len = strlen(extension_opts[i])) <= 3 ||
> + (strcmp(extension_opts[i] + len - 3, ".so") != 0)) {
> + fprintf(stderr, "%s: Skip invalid extension: %s\n",
> + __func__, extension_opts[i]);
> + continue;
> + }
> +
> + if (extension_opts[i][0] == '/') {
> + /* Path & filename */
> + snprintf(path, sizeof(path), "%s", extension_opts[i]);
> + handle = dlopen(path, RTLD_NOW);
> + if (!handle) {
> + fprintf(stderr, "%s: Failed to load %s\n",
> + __func__, dlerror());
> + continue;
> + }
> + } else {
> + /* Only filename */
> + for (j = 0; j < sizeof(dirs) / sizeof(char *); j++) {
> + snprintf(path, sizeof(path), "%s", dirs[j]);
> + len = strlen(path);
> + snprintf(path + len, sizeof(path) - len, "%s",
> + extension_opts[i]);
> + if (access(path, F_OK) == 0) {
> + handle = dlopen(path, RTLD_NOW);
> + if (handle)
> + break;
> + else
> + fprintf(stderr, "%s: Failed to load %s\n",
> + __func__, dlerror());
> + }
> + }
> + if (!handle && j >= sizeof(dirs) / sizeof(char *)) {
> + fprintf(stderr, "%s: Not found %s\n",
> + __func__, extension_opts[i]);
> + continue;
> + }
> + }
> +
> + if (dlsym(handle, "extension_init") == NULL) {
> + fprintf(stderr, "%s: Skip extension %s: No extension_init()\n",
> + __func__, path);
> + dlclose(handle);
> + continue;
> + }
> +
> + if ((ehc = malloc(sizeof(struct extension_handle_cb))) == NULL) {
> + fprintf(stderr, "%s: Skip extension %s: No memory\n",
> + __func__, path);
> + dlclose(handle);
> + continue;
Some lines have trailing spaces; please remove them.
Thanks,
Kazu
> + }
> +
> + ehc->handle = handle;
> + ehc->cb = dlsym(handle, "extension_callback");
> +
> + if (!add_to_arr((void ***)&handle_cbs, &handle_cbs_len, &handle_cbs_cap, ehc)) {
> + fprintf(stderr, "%s: Failed to load %s\n", __func__,
> + extension_opts[i]);
> + free(ehc);
> + dlclose(handle);
> + continue;
> + }
> + printf("Loaded extension: %s\n", path);
> + }
> +}
> +
> +static bool register_extension_sections(void)
> +{
> + char *start, *stop;
> + int i;
> + bool ret = false;
> +
> + for (i = 0; i < handle_cbs_len; i++) {
> + start = dlsym(handle_cbs[i]->handle, "__start_init_ksyms");
> + stop = dlsym(handle_cbs[i]->handle, "__stop_init_ksyms");
> + if (!register_ksym_section(start, stop))
> + goto out;
> +
> + start = dlsym(handle_cbs[i]->handle, "__start_init_ktypes");
> + stop = dlsym(handle_cbs[i]->handle, "__stop_init_ktypes");
> + if (!register_ktype_section(start, stop))
> + goto out;
> + }
> + ret = true;
> +out:
> + return ret;
> +}
> +
> +void cleanup_extensions(void)
> +{
> + for (int i = 0; i < handle_cbs_len; i++) {
> + dlclose(handle_cbs[i]->handle);
> + free(handle_cbs[i]);
> + }
> + if (handle_cbs) {
> + free(handle_cbs);
> + handle_cbs = NULL;
> + }
> + handle_cbs_len = 0;
> + handle_cbs_cap = 0;
> + if (extension_opts) {
> + free(extension_opts);
> + extension_opts = NULL;
> + }
> + extension_opts_len = 0;
> + extension_opts_cap = 0;
> +
> + cleanup_kallsyms_btf();
> +}
> +
> +static bool check_required_ksyms_all_resolved(void *handle)
> +{
> + char *start, *stop;
> + struct ksym_info **p;
> + bool ret = true;
> +
> + start = dlsym(handle, "__start_init_ksyms");
> + stop = dlsym(handle, "__stop_init_ksyms");
> +
> + for (p = (struct ksym_info **)start;
> + p < (struct ksym_info **)stop;
> + p++) {
> + if ((*p)->sym_required && !SYM_EXIST(*p)) {
> + ret = false;
> + fprintf(stderr, "Symbol %s in %s not found\n",
> + (*p)->symname, (*p)->modname);
> + }
> + }
> +
> + return ret;
> +}
> +
> +static bool check_required_ktypes_all_resolved(void *handle)
> +{
> + char *start, *stop;
> + struct ktype_info **p;
> + bool ret = true;
> +
> + start = dlsym(handle, "__start_init_ktypes");
> + stop = dlsym(handle, "__stop_init_ktypes");
> +
> + for (p = (struct ktype_info **)start;
> + p < (struct ktype_info **)stop;
> + p++) {
> + if (!TYPE_EXIST(*p)) {
> + if ((*p)->member_required) {
> + ret = false;
> + fprintf(stderr, "Member %s of struct %s in %s not found\n",
> + (*p)->member_name, (*p)->struct_name, (*p)->modname);
> + } else if ((*p)->struct_required) {
> + ret = false;
> + fprintf(stderr, "Struct %s in %s not found\n",
> + (*p)->struct_name, (*p)->modname);
> + }
> + }
> + }
> +
> + return ret;
> +}
> +
> +static bool extension_runnable(void *handle)
> +{
> + return check_required_ksyms_all_resolved(handle) &&
> + check_required_ktypes_all_resolved(handle);
> +}
> +
> +void init_extensions(void)
> +{
> + /* Entry of extension init */
> + void (*init)(void);
> +
> + load_extensions();
> + if (!register_extension_sections())
> + goto fail;
> + if (!init_kallsyms_btf())
> + goto fail;
> + for (int i = 0; i < handle_cbs_len; i++) {
> + if (extension_runnable(handle_cbs[i]->handle)) {
> + init = dlsym(handle_cbs[i]->handle, "extension_init");
> + init();
> + } else {
> + fprintf(stderr, "%s: Skip %dth extension\n",
> + __func__, i + 1);
> + }
> + }
> + return;
> +fail:
> + fprintf(stderr, "%s: fail & skip all extensions\n", __func__);
> + cleanup_extensions();
> +}
> +
> +int run_extension_callback(unsigned long pfn, const void *pcache)
> +{
> + int result;
> + int ret = PG_UNDECID;
> +
> + for (int i = 0; i < handle_cbs_len; i++) {
> + if (handle_cbs[i]->cb) {
> + result = handle_cbs[i]->cb(pfn, pcache);
> + if (result == PG_INCLUDE) {
> + ret = result;
> + goto out;
> + } else if (result == PG_EXCLUDE) {
> + ret = result;
> + }
> + }
> + }
> +out:
> + return ret;
> +}
> \ No newline at end of file
> diff --git a/extension.h b/extension.h
> new file mode 100644
> index 0000000..dc5902e
> --- /dev/null
> +++ b/extension.h
> @@ -0,0 +1,12 @@
> +#ifndef _EXTENSION_H
> +#define _EXTENSION_H
> +
> +enum {
> + PG_INCLUDE, // Extension will keep the page
> + PG_EXCLUDE, // Extension will discard the page
> + PG_UNDECID, // Extension makes no decision
> +};
> +int run_extension_callback(unsigned long pfn, const void *pcache);
> +void init_extensions(void);
> +void cleanup_extensions(void);
> +#endif /* _EXTENSION_H */
> \ No newline at end of file
> diff --git a/extensions/Makefile b/extensions/Makefile
> new file mode 100644
> index 0000000..b8bbfbc
> --- /dev/null
> +++ b/extensions/Makefile
> @@ -0,0 +1,10 @@
> +CC ?= gcc
> +CONTRIB_SO :=
> +
> +all: $(CONTRIB_SO)
> +
> +$(CONTRIB_SO): %.so: %.c
> + $(CC) -O2 -g -fPIC -shared -Wl,-T,../makedumpfile.ld -o $@ $^
> +
> +clean:
> + rm -f $(CONTRIB_SO)
> \ No newline at end of file
> diff --git a/makedumpfile.c b/makedumpfile.c
> index dba3628..ef7468f 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -28,6 +28,7 @@
> #include <assert.h>
> #include <zlib.h>
> #include "kallsyms.h"
> +#include "extension.h"
>
> struct symbol_table symbol_table;
> struct size_table size_table;
> @@ -102,6 +103,7 @@ mdf_pfn_t pfn_free;
> mdf_pfn_t pfn_hwpoison;
> mdf_pfn_t pfn_offline;
> mdf_pfn_t pfn_elf_excluded;
> +mdf_pfn_t pfn_extension;
>
> mdf_pfn_t num_dumped;
>
> @@ -6459,6 +6461,7 @@ __exclude_unnecessary_pages(unsigned long mem_map,
> unsigned int order_offset, dtor_offset;
> unsigned long flags, mapping, private = 0;
> unsigned long compound_dtor, compound_head = 0;
> + int filter_pg;
>
> /*
> * If a multi-page exclusion is pending, do it first
> @@ -6531,6 +6534,14 @@ __exclude_unnecessary_pages(unsigned long mem_map,
> pfn_read_end = pfn + pfn_mm - 1;
> }
>
> + /*
> + * Include pages that are specified by the user via
> + * makedumpfile extensions
> + */
> + filter_pg = run_extension_callback(pfn, pcache);
> + if (filter_pg == PG_INCLUDE)
> + continue;
> +
> flags = ULONG(pcache + OFFSET(page.flags));
> _count = UINT(pcache + OFFSET(page._refcount));
> mapping = ULONG(pcache + OFFSET(page.mapping));
> @@ -6687,6 +6698,14 @@ check_order:
> else if (isOffline(flags, _mapcount)) {
> pfn_counter = &pfn_offline;
> }
> + /*
> + * Exclude pages that are specified by the user via
> + * makedumpfile extensions
> + */
> + else if (filter_pg == PG_EXCLUDE) {
> + nr_pages = 1;
> + pfn_counter = &pfn_extension;
> + }
> /*
> * Unexcludable page
> */
> @@ -8234,7 +8253,7 @@ write_elf_pages_cyclic(struct cache_data *cd_header, struct cache_data *cd_page)
> */
> if (info->flag_cyclic) {
> pfn_zero = pfn_cache = pfn_cache_private = 0;
> - pfn_user = pfn_free = pfn_hwpoison = pfn_offline = 0;
> + pfn_user = pfn_free = pfn_hwpoison = pfn_offline = pfn_extension = 0;
> pfn_memhole = info->max_mapnr;
> }
>
> @@ -9579,7 +9598,7 @@ write_kdump_pages_and_bitmap_cyclic(struct cache_data *cd_header, struct cache_d
> * Reset counter for debug message.
> */
> pfn_zero = pfn_cache = pfn_cache_private = 0;
> - pfn_user = pfn_free = pfn_hwpoison = pfn_offline = 0;
> + pfn_user = pfn_free = pfn_hwpoison = pfn_offline = pfn_extension = 0;
> pfn_memhole = info->max_mapnr;
>
> /*
> @@ -10528,7 +10547,7 @@ print_report(void)
> pfn_original = info->max_mapnr - pfn_memhole;
>
> pfn_excluded = pfn_zero + pfn_cache + pfn_cache_private
> - + pfn_user + pfn_free + pfn_hwpoison + pfn_offline;
> + + pfn_user + pfn_free + pfn_hwpoison + pfn_offline + pfn_extension;
>
> REPORT_MSG("\n");
> REPORT_MSG("Original pages : 0x%016llx\n", pfn_original);
> @@ -10544,6 +10563,7 @@ print_report(void)
> REPORT_MSG(" Free pages : 0x%016llx\n", pfn_free);
> REPORT_MSG(" Hwpoison pages : 0x%016llx\n", pfn_hwpoison);
> REPORT_MSG(" Offline pages : 0x%016llx\n", pfn_offline);
> + REPORT_MSG(" Extension filter pages : 0x%016llx\n", pfn_extension);
> REPORT_MSG(" Remaining pages : 0x%016llx\n",
> pfn_original - pfn_excluded);
>
> @@ -10584,7 +10604,7 @@ print_mem_usage(void)
> pfn_original = info->max_mapnr - pfn_memhole;
>
> pfn_excluded = pfn_zero + pfn_cache + pfn_cache_private
> - + pfn_user + pfn_free + pfn_hwpoison + pfn_offline;
> + + pfn_user + pfn_free + pfn_hwpoison + pfn_offline + pfn_extension;
> shrinking = (pfn_original - pfn_excluded) * 100;
> shrinking = shrinking / pfn_original;
> total_size = info->page_size * pfn_original;
> @@ -10878,6 +10898,7 @@ create_dumpfile(void)
> }
>
> print_vtop();
> + init_extensions();
>
> num_retry = 0;
> retry:
> @@ -10888,8 +10909,11 @@ retry:
> && !gather_filter_info())
> return FALSE;
>
> - if (!create_dump_bitmap())
> + if (!create_dump_bitmap()) {
> + cleanup_extensions();
> return FALSE;
> + }
> + cleanup_extensions();
>
> if (info->flag_split) {
> if ((status = writeout_multiple_dumpfiles()) == FALSE)
> @@ -12130,6 +12154,7 @@ static struct option longopts[] = {
> {"check-params", no_argument, NULL, OPT_CHECK_PARAMS},
> {"dry-run", no_argument, NULL, OPT_DRY_RUN},
> {"show-stats", no_argument, NULL, OPT_SHOW_STATS},
> + {"extension", required_argument, NULL, OPT_EXTENSION},
> {0, 0, 0, 0}
> };
>
> @@ -12317,6 +12342,9 @@ main(int argc, char *argv[])
> case OPT_SHOW_STATS:
> flag_show_stats = TRUE;
> break;
> + case OPT_EXTENSION:
> + add_extension_opts(optarg);
> + break;
> case '?':
> MSG("Commandline parameter is invalid.\n");
> MSG("Try `makedumpfile --help' for more information.\n");
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 0f13743..d880ae7 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -2747,6 +2747,7 @@ struct elf_prstatus {
> #define OPT_CHECK_PARAMS OPT_START+18
> #define OPT_DRY_RUN OPT_START+19
> #define OPT_SHOW_STATS OPT_START+20
> +#define OPT_EXTENSION OPT_START+21
>
> /*
> * Function Prototype.
> @@ -2777,5 +2778,6 @@ int write_and_check_space(int fd, void *buf, size_t buf_size,
> int open_dump_file(void);
> int dump_lockless_dmesg(void);
> unsigned long long memparse(char *ptr, char **retptr);
> +void add_extension_opts(char *opt);
>
> #endif /* MAKEDUMPFILE_H */
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v4][makedumpfile 7/7] Filter amdgpu mm pages
2026-03-17 15:07 [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering Tao Liu
` (5 preceding siblings ...)
2026-03-17 15:07 ` [PATCH v4][makedumpfile 6/7] Add makedumpfile extensions support Tao Liu
@ 2026-03-17 15:07 ` Tao Liu
2026-04-03 0:16 ` Stephen Brennan
2026-04-03 8:06 ` [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering HAGIO KAZUHITO(萩尾 一仁)
2026-04-03 18:26 ` Stephen Brennan
8 siblings, 1 reply; 21+ messages in thread
From: Tao Liu @ 2026-03-17 15:07 UTC (permalink / raw)
To: yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, stephen.s.brennan, Tao Liu
This patch introduces the maple_tree and amdgpu mm page filtering extension.
mm pages allocated to amdgpu are discarded from the vmcore in order to shrink
its size, since those pages are of no use for kernel crash analysis and may
contain sensitive data.
Signed-off-by: Tao Liu <ltao@redhat.com>
---
extensions/Makefile | 4 +-
extensions/amdgpu_filter.c | 190 +++++++++++++++++++++++
extensions/maple_tree.c | 307 +++++++++++++++++++++++++++++++++++++
extensions/maple_tree.h | 6 +
4 files changed, 506 insertions(+), 1 deletion(-)
create mode 100644 extensions/amdgpu_filter.c
create mode 100644 extensions/maple_tree.c
create mode 100644 extensions/maple_tree.h
diff --git a/extensions/Makefile b/extensions/Makefile
index b8bbfbc..55b789b 100644
--- a/extensions/Makefile
+++ b/extensions/Makefile
@@ -1,8 +1,10 @@
CC ?= gcc
-CONTRIB_SO :=
+CONTRIB_SO := amdgpu_filter.so
all: $(CONTRIB_SO)
+amdgpu_filter.so: maple_tree.c
+
$(CONTRIB_SO): %.so: %.c
$(CC) -O2 -g -fPIC -shared -Wl,-T,../makedumpfile.ld -o $@ $^
diff --git a/extensions/amdgpu_filter.c b/extensions/amdgpu_filter.c
new file mode 100644
index 0000000..3a1e9f2
--- /dev/null
+++ b/extensions/amdgpu_filter.c
@@ -0,0 +1,190 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include "maple_tree.h"
+#include "../makedumpfile.h"
+#include "../btf_info.h"
+#include "../kallsyms.h"
+#include "../extension.h"
+
+/*
+ * These syms/types are must-have for the extension.
+*/
+INIT_KERN_STRUCT_MEMBER(task_struct, tasks);
+INIT_KERN_STRUCT_MEMBER(task_struct, mm);
+INIT_KERN_STRUCT_MEMBER(mm_struct, mm_mt);
+INIT_KERN_STRUCT_MEMBER(vm_area_struct, vm_ops);
+INIT_KERN_STRUCT_MEMBER(vm_area_struct, vm_private_data);
+INIT_MOD_STRUCT_MEMBER(amdgpu, ttm_buffer_object, ttm);
+INIT_MOD_STRUCT_MEMBER(amdgpu, ttm_tt, pages);
+INIT_MOD_STRUCT_MEMBER(amdgpu, ttm_tt, num_pages);
+INIT_KERN_STRUCT(page);
+
+INIT_KERN_SYM(init_task);
+INIT_KERN_SYM(vmemmap_base);
+INIT_MOD_SYM(amdgpu, amdgpu_gem_vm_ops);
+
+struct ft_page_info {
+ unsigned long pfn;
+ unsigned long num;
+ struct ft_page_info *next;
+};
+
+static struct ft_page_info *ft_head_discard = NULL;
+
+static void update_filter_pages_info(unsigned long pfn, unsigned long num)
+{
+ struct ft_page_info *p, **ft_head;
+ struct ft_page_info *new_p = malloc(sizeof(struct ft_page_info));
+
+ ft_head = &ft_head_discard;
+
+ if (!new_p) {
+ fprintf(stderr, "%s: Can't allocate memory for ft_page_info\n",
+ __func__);
+ return;
+ }
+ new_p->pfn = pfn;
+ new_p->num = num;
+ new_p->next = NULL;
+
+ if (!(*ft_head) || (*ft_head)->pfn > new_p->pfn) {
+ new_p->next = (*ft_head);
+ (*ft_head) = new_p;
+ return;
+ }
+
+ p = (*ft_head);
+ while (p->next != NULL && p->next->pfn < new_p->pfn) {
+ p = p->next;
+ }
+
+ new_p->next = p->next;
+ p->next = new_p;
+}
+
+static int filter_page(unsigned long pfn, struct ft_page_info **p)
+{
+ struct ft_page_info *ft_head = ft_head_discard;
+
+ if (ft_head == NULL)
+ return PG_UNDECID;
+
+ if (*p == NULL)
+ *p = ft_head;
+
+ /* The gap before 1st block */
+ if (pfn >= 0 && pfn < ft_head->pfn)
+ return PG_UNDECID;
+
+ /* Handle 1~(n-1) blocks and following gaps */
+ while ((*p)->next) {
+ if (pfn >= (*p)->pfn && pfn < (*p)->pfn + (*p)->num)
+ return PG_EXCLUDE; // hit the block
+ if (pfn >= (*p)->pfn + (*p)->num && pfn < (*p)->next->pfn)
+ return PG_UNDECID; // the gap after the block
+ *p = (*p)->next;
+ }
+
+ /* The last block and gap */
+ if (pfn >= (*p)->pfn + (*p)->num)
+ return PG_UNDECID;
+ else
+ return PG_EXCLUDE;
+}
+
+static void do_cleanup(struct ft_page_info **ft_head)
+{
+ struct ft_page_info *p, *p_tmp;
+
+ for (p = *ft_head; p;) {
+ p_tmp = p;
+ p = p->next;
+ free(p_tmp);
+ }
+ *ft_head = NULL;
+}
+
+#define KERN_MEMBER_OFF(S, M) \
+ GET_KERN_STRUCT_MEMBER_MOFF(S, M) / 8
+#define MOD_MEMBER_OFF(MOD, S, M) \
+ GET_MOD_STRUCT_MEMBER_MOFF(MOD, S, M) / 8
+
+static void gather_amdgpu_mm_range_info(void)
+{
+ uint64_t init_task, list, list_offset, amdgpu_gem_vm_ops;
+ uint64_t mm, vm_ops, tbo, ttm, num_pages, pages, pfn, vmemmap_base;
+ int array_len;
+ unsigned long *array_out;
+ init_task = GET_KERN_SYM(init_task);
+ amdgpu_gem_vm_ops = GET_MOD_SYM(amdgpu, amdgpu_gem_vm_ops);
+
+ list = init_task + KERN_MEMBER_OFF(task_struct, tasks);
+
+ do {
+ readmem(VADDR, list - KERN_MEMBER_OFF(task_struct, tasks) +
+ KERN_MEMBER_OFF(task_struct, mm),
+ &mm, sizeof(uint64_t));
+ if (!mm) {
+ list = next_list(list);
+ continue;
+ }
+
+ array_out = mt_dump(mm + KERN_MEMBER_OFF(mm_struct, mm_mt), &array_len);
+ if (!array_out)
+ return;
+
+ for (int i = 0; i < array_len; i++) {
+ num_pages = 0;
+ readmem(VADDR, array_out[i] + KERN_MEMBER_OFF(vm_area_struct, vm_ops),
+ &vm_ops, GET_KERN_STRUCT_MEMBER_MSIZE(vm_area_struct, vm_ops));
+ if (vm_ops == amdgpu_gem_vm_ops) {
+ readmem(VADDR, array_out[i] +
+ KERN_MEMBER_OFF(vm_area_struct, vm_private_data),
+ &tbo, GET_KERN_STRUCT_MEMBER_MSIZE(vm_area_struct, vm_private_data));
+ readmem(VADDR, tbo + MOD_MEMBER_OFF(amdgpu, ttm_buffer_object, ttm),
+ &ttm, GET_MOD_STRUCT_MEMBER_MSIZE(amdgpu, ttm_buffer_object, ttm));
+ if (ttm) {
+ readmem(VADDR, ttm + MOD_MEMBER_OFF(amdgpu, ttm_tt, num_pages),
+ &num_pages, GET_MOD_STRUCT_MEMBER_MSIZE(amdgpu, ttm_tt, num_pages));
+ readmem(VADDR, ttm + MOD_MEMBER_OFF(amdgpu, ttm_tt, pages),
+ &pages, GET_MOD_STRUCT_MEMBER_MSIZE(amdgpu, ttm_tt, pages));
+ readmem(VADDR, pages, &pages, sizeof(unsigned long));
+ readmem(VADDR, GET_KERN_SYM(vmemmap_base),
+ &vmemmap_base, sizeof(unsigned long));
+ pfn = (pages - vmemmap_base) / GET_KERN_STRUCT_SSIZE(page);
+ update_filter_pages_info(pfn, num_pages);
+ }
+ }
+ }
+
+ free(array_out);
+ list = next_list(list);
+ } while (list != init_task + KERN_MEMBER_OFF(task_struct, tasks));
+
+ return;
+}
+
+/* Extension callback invoked when makedumpfile does page filtering */
+int extension_callback(unsigned long pfn, const void *pcache)
+{
+ struct ft_page_info *cur = NULL;
+
+ return filter_page(pfn, &cur);
+}
+
+/* Entry of extension */
+void extension_init(void)
+{
+ if (!maple_init()) {
+ goto out;
+ }
+ gather_amdgpu_mm_range_info();
+out:
+ return;
+}
+
+__attribute__((destructor))
+void extension_cleanup(void)
+{
+ do_cleanup(&ft_head_discard);
+}
diff --git a/extensions/maple_tree.c b/extensions/maple_tree.c
new file mode 100644
index 0000000..e367940
--- /dev/null
+++ b/extensions/maple_tree.c
@@ -0,0 +1,307 @@
+#include <stdio.h>
+#include <stdbool.h>
+#include "../btf_info.h"
+#include "../kallsyms.h"
+#include "../makedumpfile.h"
+
+static unsigned char mt_slots[4] = {0};
+static unsigned char mt_pivots[4] = {0};
+static unsigned long mt_max[4] = {0};
+
+INIT_OPT_KERN_SYM(mt_slots);
+INIT_OPT_KERN_SYM(mt_pivots);
+
+INIT_OPT_KERN_STRUCT(maple_tree);
+INIT_OPT_KERN_STRUCT(maple_node);
+INIT_OPT_KERN_STRUCT_MEMBER(maple_tree, ma_root);
+INIT_OPT_KERN_STRUCT_MEMBER(maple_node, ma64);
+INIT_OPT_KERN_STRUCT_MEMBER(maple_node, mr64);
+INIT_OPT_KERN_STRUCT_MEMBER(maple_node, slot);
+INIT_OPT_KERN_STRUCT_MEMBER(maple_arange_64, pivot);
+INIT_OPT_KERN_STRUCT_MEMBER(maple_arange_64, slot);
+INIT_OPT_KERN_STRUCT_MEMBER(maple_range_64, pivot);
+INIT_OPT_KERN_STRUCT_MEMBER(maple_range_64, slot);
+
+#define MEMBER_OFF(S, M) \
+ GET_KERN_STRUCT_MEMBER_MOFF(S, M) / 8
+
+#define MAPLE_BUFSIZE 512
+
+enum {
+ maple_dense_enum,
+ maple_leaf_64_enum,
+ maple_range_64_enum,
+ maple_arange_64_enum,
+};
+
+#define MAPLE_NODE_MASK 255UL
+#define MAPLE_NODE_TYPE_MASK 0x0F
+#define MAPLE_NODE_TYPE_SHIFT 0x03
+#define XA_ZERO_ENTRY xa_mk_internal(257)
+
+static unsigned long xa_mk_internal(unsigned long v)
+{
+ return (v << 2) | 2;
+}
+
+static bool xa_is_internal(unsigned long entry)
+{
+ return (entry & 3) == 2;
+}
+
+static bool xa_is_node(unsigned long entry)
+{
+ return xa_is_internal(entry) && entry > 4096;
+}
+
+static bool xa_is_value(unsigned long entry)
+{
+ return entry & 1;
+}
+
+static bool xa_is_zero(unsigned long entry)
+{
+ return entry == XA_ZERO_ENTRY;
+}
+
+static unsigned long xa_to_internal(unsigned long entry)
+{
+ return entry >> 2;
+}
+
+static unsigned long xa_to_value(unsigned long entry)
+{
+ return entry >> 1;
+}
+
+static unsigned long mte_to_node(unsigned long entry)
+{
+ return entry & ~MAPLE_NODE_MASK;
+}
+
+static unsigned long mte_node_type(unsigned long maple_enode_entry)
+{
+ return (maple_enode_entry >> MAPLE_NODE_TYPE_SHIFT) &
+ MAPLE_NODE_TYPE_MASK;
+}
+
+static unsigned long mt_slot(void **slots, unsigned char offset)
+{
+ return (unsigned long)slots[offset];
+}
+
+static bool ma_is_leaf(unsigned long type)
+{
+ return type < maple_range_64_enum;
+}
+
+static bool mte_is_leaf(unsigned long maple_enode_entry)
+{
+ return ma_is_leaf(mte_node_type(maple_enode_entry));
+}
+
+static void mt_dump_entry(unsigned long entry, unsigned long min,
+ unsigned long max, unsigned int depth,
+ unsigned long **array_out, int *array_len,
+ int *array_cap)
+{
+ if (entry == 0)
+ return;
+
+ add_to_arr((void ***)array_out, array_len, array_cap, (void *)entry);
+}
+
+static void mt_dump_node(unsigned long entry, unsigned long min,
+ unsigned long max, unsigned int depth,
+ unsigned long **array_out, int *array_len,
+ int *array_cap);
+
+static void mt_dump_range64(unsigned long entry, unsigned long min,
+ unsigned long max, unsigned int depth,
+ unsigned long **array_out, int *array_len,
+ int *array_cap)
+{
+ unsigned long maple_node_m_node = mte_to_node(entry);
+ char node_buf[MAPLE_BUFSIZE];
+ bool leaf = mte_is_leaf(entry);
+ unsigned long first = min, last;
+ int i;
+ char *mr64_buf;
+
+ readmem(VADDR, maple_node_m_node, node_buf, GET_KERN_STRUCT_SSIZE(maple_node));
+ mr64_buf = node_buf + MEMBER_OFF(maple_node, mr64);
+
+ for (i = 0; i < mt_slots[maple_range_64_enum]; i++) {
+ last = max;
+
+ if (i < (mt_slots[maple_range_64_enum] - 1))
+ last = ULONG(mr64_buf + MEMBER_OFF(maple_range_64, pivot) +
+ sizeof(ulong) * i);
+
+ else if (!VOID_PTR(mr64_buf + MEMBER_OFF(maple_range_64, slot) +
+ sizeof(void *) * i) &&
+ max != mt_max[mte_node_type(entry)])
+ break;
+ if (last == 0 && i > 0)
+ break;
+ if (leaf)
+ mt_dump_entry(mt_slot((void **)(mr64_buf +
+ MEMBER_OFF(maple_range_64, slot)), i),
+ first, last, depth + 1, array_out, array_len, array_cap);
+ else if (VOID_PTR(mr64_buf + MEMBER_OFF(maple_range_64, slot) +
+ sizeof(void *) * i)) {
+ mt_dump_node(mt_slot((void **)(mr64_buf +
+ MEMBER_OFF(maple_range_64, slot)), i),
+ first, last, depth + 1, array_out, array_len, array_cap);
+ }
+
+ if (last == max)
+ break;
+ if (last > max) {
+ printf("node %p last (%lu) > max (%lu) at pivot %d!\n",
+ mr64_buf, last, max, i);
+ break;
+ }
+ first = last + 1;
+ }
+}
+
+static void mt_dump_arange64(unsigned long entry, unsigned long min,
+ unsigned long max, unsigned int depth,
+ unsigned long **array_out, int *array_len,
+ int *array_cap)
+{
+ unsigned long maple_node_m_node = mte_to_node(entry);
+ char node_buf[MAPLE_BUFSIZE];
+ unsigned long first = min, last;
+ int i;
+ char *ma64_buf;
+
+ readmem(VADDR, maple_node_m_node, node_buf, GET_KERN_STRUCT_SSIZE(maple_node));
+ ma64_buf = node_buf + MEMBER_OFF(maple_node, ma64);
+
+ for (i = 0; i < mt_slots[maple_arange_64_enum]; i++) {
+ last = max;
+
+ if (i < (mt_slots[maple_arange_64_enum] - 1))
+ last = ULONG(ma64_buf + MEMBER_OFF(maple_arange_64, pivot) +
+ sizeof(void *) * i);
+ else if (!VOID_PTR(ma64_buf + MEMBER_OFF(maple_arange_64, slot) +
+ sizeof(void *) * i))
+ break;
+ if (last == 0 && i > 0)
+ break;
+
+ if (ULONG(ma64_buf + MEMBER_OFF(maple_arange_64, slot) + sizeof(void *) * i))
+ mt_dump_node(mt_slot((void **)(ma64_buf +
+ MEMBER_OFF(maple_arange_64, slot)), i),
+ first, last, depth + 1, array_out, array_len, array_cap);
+
+ if (last == max)
+ break;
+ if (last > max) {
+ printf("node %p last (%lu) > max (%lu) at pivot %d!\n",
+ ma64_buf, last, max, i);
+ break;
+ }
+ first = last + 1;
+ }
+}
+
+static void mt_dump_node(unsigned long entry, unsigned long min,
+ unsigned long max, unsigned int depth,
+ unsigned long **array_out, int *array_len,
+ int *array_cap)
+{
+ unsigned long maple_node = mte_to_node(entry);
+ unsigned long type = mte_node_type(entry);
+ int i;
+ char node_buf[MAPLE_BUFSIZE];
+
+ readmem(VADDR, maple_node, node_buf, GET_KERN_STRUCT_SSIZE(maple_node));
+
+ switch (type) {
+ case maple_dense_enum:
+ for (i = 0; i < mt_slots[maple_dense_enum]; i++) {
+ if (min + i > max)
+ printf("OUT OF RANGE: ");
+ mt_dump_entry(mt_slot((void **)(node_buf + MEMBER_OFF(maple_node, slot)), i),
+ min + i, min + i, depth, array_out, array_len, array_cap);
+ }
+ break;
+ case maple_leaf_64_enum:
+ case maple_range_64_enum:
+ mt_dump_range64(entry, min, max, depth, array_out, array_len, array_cap);
+ break;
+ case maple_arange_64_enum:
+ mt_dump_arange64(entry, min, max, depth, array_out, array_len, array_cap);
+ break;
+ default:
+ printf(" UNKNOWN TYPE\n");
+ }
+}
+
+unsigned long *mt_dump(unsigned long mt, int *array_len)
+{
+ char tree_buf[MAPLE_BUFSIZE];
+ unsigned long entry;
+ unsigned long *array_out = NULL;
+ int array_cap = 0;
+ *array_len = 0;
+
+ readmem(VADDR, mt, tree_buf, GET_KERN_STRUCT_SSIZE(maple_tree));
+ entry = ULONG(tree_buf + MEMBER_OFF(maple_tree, ma_root));
+
+ if (xa_is_node(entry))
+ mt_dump_node(entry, 0, mt_max[mte_node_type(entry)], 0,
+ &array_out, array_len, &array_cap);
+ else if (entry)
+ mt_dump_entry(entry, 0, 0, 0, &array_out, array_len, &array_cap);
+ else
+ printf("(empty)\n");
+
+ return array_out;
+}
+
+bool maple_init(void)
+{
+ unsigned long mt_slots_ptr;
+ unsigned long mt_pivots_ptr;
+
+ if (!KERN_SYM_EXIST(mt_slots) ||
+ !KERN_SYM_EXIST(mt_pivots) ||
+ !KERN_STRUCT_EXIST(maple_tree) ||
+ !KERN_STRUCT_EXIST(maple_node) ||
+ !KERN_STRUCT_MEMBER_EXIST(maple_tree, ma_root) ||
+ !KERN_STRUCT_MEMBER_EXIST(maple_node, ma64) ||
+ !KERN_STRUCT_MEMBER_EXIST(maple_node, mr64) ||
+ !KERN_STRUCT_MEMBER_EXIST(maple_node, slot) ||
+ !KERN_STRUCT_MEMBER_EXIST(maple_arange_64, pivot) ||
+ !KERN_STRUCT_MEMBER_EXIST(maple_arange_64, slot) ||
+ !KERN_STRUCT_MEMBER_EXIST(maple_range_64, pivot) ||
+ !KERN_STRUCT_MEMBER_EXIST(maple_range_64, slot)) {
+ printf("%s: Missing required maple tree syms/types\n",
+ __func__);
+ return false;
+ }
+
+ mt_slots_ptr = GET_KERN_SYM(mt_slots);
+ mt_pivots_ptr = GET_KERN_SYM(mt_pivots);
+
+ if (GET_KERN_STRUCT_SSIZE(maple_tree) > MAPLE_BUFSIZE ||
+ GET_KERN_STRUCT_SSIZE(maple_node) > MAPLE_BUFSIZE) {
+ printf("%s: MAPLE_BUFSIZE should be larger than maple_node/tree struct\n",
+ __func__);
+ return false;
+ }
+
+ readmem(VADDR, mt_slots_ptr, mt_slots, sizeof(mt_slots));
+ readmem(VADDR, mt_pivots_ptr, mt_pivots, sizeof(mt_pivots));
+
+ mt_max[maple_dense_enum] = mt_slots[maple_dense_enum];
+ mt_max[maple_leaf_64_enum] = ULONG_MAX;
+ mt_max[maple_range_64_enum] = ULONG_MAX;
+ mt_max[maple_arange_64_enum] = ULONG_MAX;
+
+ return true;
+}
\ No newline at end of file
diff --git a/extensions/maple_tree.h b/extensions/maple_tree.h
new file mode 100644
index 0000000..c96624c
--- /dev/null
+++ b/extensions/maple_tree.h
@@ -0,0 +1,6 @@
+#ifndef _MAPLE_TREE_H
+#define _MAPLE_TREE_H
+#include <stdbool.h>
+unsigned long *mt_dump(unsigned long mt, int *array_len);
+bool maple_init(void);
+#endif /* _MAPLE_TREE_H */
\ No newline at end of file
--
2.47.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v4][makedumpfile 7/7] Filter amdgpu mm pages
2026-03-17 15:07 ` [PATCH v4][makedumpfile 7/7] Filter amdgpu mm pages Tao Liu
@ 2026-04-03 0:16 ` Stephen Brennan
0 siblings, 0 replies; 21+ messages in thread
From: Stephen Brennan @ 2026-04-03 0:16 UTC (permalink / raw)
To: Tao Liu, yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, Tao Liu
Tao Liu <ltao@redhat.com> writes:
> This patch introduces the maple_tree and amdgpu mm page filtering extension.
> mm pages allocated to amdgpu are discarded from the vmcore in order to shrink
> its size, since those pages are of no use for kernel crash analysis and may
> contain sensitive data.
>
> Signed-off-by: Tao Liu <ltao@redhat.com>
> ---
> extensions/Makefile | 4 +-
> extensions/amdgpu_filter.c | 190 +++++++++++++++++++++++
> extensions/maple_tree.c | 307 +++++++++++++++++++++++++++++++++++++
> extensions/maple_tree.h | 6 +
> 4 files changed, 506 insertions(+), 1 deletion(-)
> create mode 100644 extensions/amdgpu_filter.c
> create mode 100644 extensions/maple_tree.c
> create mode 100644 extensions/maple_tree.h
>
> diff --git a/extensions/Makefile b/extensions/Makefile
> index b8bbfbc..55b789b 100644
> --- a/extensions/Makefile
> +++ b/extensions/Makefile
> @@ -1,8 +1,10 @@
> CC ?= gcc
> -CONTRIB_SO :=
> +CONTRIB_SO := amdgpu_filter.so
>
> all: $(CONTRIB_SO)
>
> +amdgpu_filter.so: maple_tree.c
> +
> $(CONTRIB_SO): %.so: %.c
> $(CC) -O2 -g -fPIC -shared -Wl,-T,../makedumpfile.ld -o $@ $^
>
> diff --git a/extensions/amdgpu_filter.c b/extensions/amdgpu_filter.c
> new file mode 100644
> index 0000000..3a1e9f2
> --- /dev/null
> +++ b/extensions/amdgpu_filter.c
> @@ -0,0 +1,190 @@
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include "maple_tree.h"
> +#include "../makedumpfile.h"
> +#include "../btf_info.h"
> +#include "../kallsyms.h"
> +#include "../extension.h"
> +
> +/*
> + * These syms/types are must-have for the extension.
> +*/
> +INIT_KERN_STRUCT_MEMBER(task_struct, tasks);
> +INIT_KERN_STRUCT_MEMBER(task_struct, mm);
> +INIT_KERN_STRUCT_MEMBER(mm_struct, mm_mt);
> +INIT_KERN_STRUCT_MEMBER(vm_area_struct, vm_ops);
> +INIT_KERN_STRUCT_MEMBER(vm_area_struct, vm_private_data);
> +INIT_MOD_STRUCT_MEMBER(amdgpu, ttm_buffer_object, ttm);
> +INIT_MOD_STRUCT_MEMBER(amdgpu, ttm_tt, pages);
> +INIT_MOD_STRUCT_MEMBER(amdgpu, ttm_tt, num_pages);
> +INIT_KERN_STRUCT(page);
> +
> +INIT_KERN_SYM(init_task);
> +INIT_KERN_SYM(vmemmap_base);
> +INIT_MOD_SYM(amdgpu, amdgpu_gem_vm_ops);
> +
> +struct ft_page_info {
> + unsigned long pfn;
> + unsigned long num;
> + struct ft_page_info *next;
> +};
> +
> +static struct ft_page_info *ft_head_discard = NULL;
> +
> +static void update_filter_pages_info(unsigned long pfn, unsigned long num)
> +{
> + struct ft_page_info *p, **ft_head;
> + struct ft_page_info *new_p = malloc(sizeof(struct ft_page_info));
> +
> + ft_head = &ft_head_discard;
> +
> + if (!new_p) {
> + fprintf(stderr, "%s: Can't allocate memory for ft_page_info\n",
> + __func__);
> + return;
> + }
> + new_p->pfn = pfn;
> + new_p->num = num;
> + new_p->next = NULL;
> +
> + if (!(*ft_head) || (*ft_head)->pfn > new_p->pfn) {
> + new_p->next = (*ft_head);
> + (*ft_head) = new_p;
> + return;
> + }
> +
> + p = (*ft_head);
> + while (p->next != NULL && p->next->pfn < new_p->pfn) {
> + p = p->next;
> + }
> +
> + new_p->next = p->next;
> + p->next = new_p;
> +}
> +
> +static int filter_page(unsigned long pfn, struct ft_page_info **p)
> +{
> + struct ft_page_info *ft_head = ft_head_discard;
> +
> + if (ft_head == NULL)
> + return PG_UNDECID;
> +
> + if (*p == NULL)
> + *p = ft_head;
> +
> + /* The gap before 1st block */
> + if (pfn >= 0 && pfn < ft_head->pfn)
> + return PG_UNDECID;
> +
> + /* Handle 1~(n-1) blocks and following gaps */
> + while ((*p)->next) {
> + if (pfn >= (*p)->pfn && pfn < (*p)->pfn + (*p)->num)
> + return PG_EXCLUDE; // hit the block
> + if (pfn >= (*p)->pfn + (*p)->num && pfn < (*p)->next->pfn)
> + return PG_UNDECID; // the gap after the block
> + *p = (*p)->next;
> + }
> +
> + /* The last block and gap */
> + if (pfn >= (*p)->pfn + (*p)->num)
> + return PG_UNDECID;
> + else
> + return PG_EXCLUDE;
> +}
> +
> +static void do_cleanup(struct ft_page_info **ft_head)
> +{
> + struct ft_page_info *p, *p_tmp;
> +
> + for (p = *ft_head; p;) {
> + p_tmp = p;
> + p = p->next;
> + free(p_tmp);
> + }
> + *ft_head = NULL;
> +}
> +
> +#define KERN_MEMBER_OFF(S, M) \
> + GET_KERN_STRUCT_MEMBER_MOFF(S, M) / 8
> +#define MOD_MEMBER_OFF(MOD, S, M) \
> + GET_MOD_STRUCT_MEMBER_MOFF(MOD, S, M) / 8
> +
> +static void gather_amdgpu_mm_range_info(void)
> +{
> + uint64_t init_task, list, list_offset, amdgpu_gem_vm_ops;
> + uint64_t mm, vm_ops, tbo, ttm, num_pages, pages, pfn, vmemmap_base;
> + int array_len;
> + unsigned long *array_out;
> + init_task = GET_KERN_SYM(init_task);
> + amdgpu_gem_vm_ops = GET_MOD_SYM(amdgpu, amdgpu_gem_vm_ops);
> +
> + list = init_task + KERN_MEMBER_OFF(task_struct, tasks);
> +
> + do {
> + readmem(VADDR, list - KERN_MEMBER_OFF(task_struct, tasks) +
> + KERN_MEMBER_OFF(task_struct, mm),
> + &mm, sizeof(uint64_t));
> + if (!mm) {
> + list = next_list(list);
> + continue;
> + }
> +
> + array_out = mt_dump(mm + KERN_MEMBER_OFF(mm_struct, mm_mt), &array_len);
> + if (!array_out)
> + return;
> +
> + for (int i = 0; i < array_len; i++) {
> + num_pages = 0;
> + readmem(VADDR, array_out[i] + KERN_MEMBER_OFF(vm_area_struct, vm_ops),
> + &vm_ops, GET_KERN_STRUCT_MEMBER_MSIZE(vm_area_struct, vm_ops));
> + if (vm_ops == amdgpu_gem_vm_ops) {
> + readmem(VADDR, array_out[i] +
> + KERN_MEMBER_OFF(vm_area_struct, vm_private_data),
> + &tbo, GET_KERN_STRUCT_MEMBER_MSIZE(vm_area_struct, vm_private_data));
> + readmem(VADDR, tbo + MOD_MEMBER_OFF(amdgpu, ttm_buffer_object, ttm),
> + &ttm, GET_MOD_STRUCT_MEMBER_MSIZE(amdgpu, ttm_buffer_object, ttm));
> + if (ttm) {
> + readmem(VADDR, ttm + MOD_MEMBER_OFF(amdgpu, ttm_tt, num_pages),
> + &num_pages, GET_MOD_STRUCT_MEMBER_MSIZE(amdgpu, ttm_tt, num_pages));
> + readmem(VADDR, ttm + MOD_MEMBER_OFF(amdgpu, ttm_tt, pages),
> + &pages, GET_MOD_STRUCT_MEMBER_MSIZE(amdgpu, ttm_tt, pages));
Are these pages guaranteed to be contiguous?
If not, you're applying the page filter rule based on just the first
element of the pages array.
If they are guaranteed contiguous, maybe a comment would be helpful?
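To illustrate the non-contiguous case, here is a rough sketch (not code from
this series) of how per-page PFNs could be merged into ranges before they are
handed to update_filter_pages_info(). The names struct pfn_range, cmp_pfn()
and coalesce_pfns() are hypothetical, and the sketch assumes the caller has
already converted each entry of the ttm_tt pages array into a PFN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical range record, mirroring struct ft_page_info without the list link. */
struct pfn_range {
	unsigned long pfn;
	unsigned long num;
};

/* qsort() comparator for unsigned long values, ascending. */
static int cmp_pfn(const void *a, const void *b)
{
	unsigned long x = *(const unsigned long *)a;
	unsigned long y = *(const unsigned long *)b;

	return (x > y) - (x < y);
}

/*
 * Sort pfns[] in place and merge runs of consecutive PFNs into ranges.
 * Returns the number of ranges written to out[], which must have room
 * for n entries (the worst case, where no two PFNs are adjacent).
 */
static size_t coalesce_pfns(unsigned long *pfns, size_t n,
			    struct pfn_range *out)
{
	size_t nr = 0;

	assert(pfns != NULL && out != NULL);
	qsort(pfns, n, sizeof(*pfns), cmp_pfn);

	for (size_t i = 0; i < n; i++) {
		if (nr > 0 && pfns[i] < out[nr - 1].pfn + out[nr - 1].num)
			continue;	/* duplicate PFN, already covered */
		if (nr > 0 && pfns[i] == out[nr - 1].pfn + out[nr - 1].num) {
			out[nr - 1].num++;	/* extends the current run */
			continue;
		}
		out[nr].pfn = pfns[i];	/* start a new run */
		out[nr].num = 1;
		nr++;
	}
	return nr;
}
```

Each resulting range could then be passed to update_filter_pages_info(pfn, num),
so a fragmented buffer would yield several short ranges rather than one range
derived from the first page only.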
Thanks,
Stephen
> + readmem(VADDR, pages, &pages, sizeof(unsigned long));
> + readmem(VADDR, GET_KERN_SYM(vmemmap_base),
> + &vmemmap_base, sizeof(unsigned long));
> + pfn = (pages - vmemmap_base) / GET_KERN_STRUCT_SSIZE(page);
> + update_filter_pages_info(pfn, num_pages);
> + }
> + }
> + }
> +
> + free(array_out);
> + list = next_list(list);
> + } while (list != init_task + KERN_MEMBER_OFF(task_struct, tasks));
> +
> + return;
> +}
> +
> +/* Extension callback invoked when makedumpfile does page filtering */
> +int extension_callback(unsigned long pfn, const void *pcache)
> +{
> + struct ft_page_info *cur = NULL;
> +
> + return filter_page(pfn, &cur);
> +}
> +
> +/* Entry of extension */
> +void extension_init(void)
> +{
> + if (!maple_init()) {
> + goto out;
> + }
> + gather_amdgpu_mm_range_info();
> +out:
> + return;
> +}
> +
> +__attribute__((destructor))
> +void extension_cleanup(void)
> +{
> + do_cleanup(&ft_head_discard);
> +}
> diff --git a/extensions/maple_tree.c b/extensions/maple_tree.c
> new file mode 100644
> index 0000000..e367940
> --- /dev/null
> +++ b/extensions/maple_tree.c
> @@ -0,0 +1,307 @@
> +#include <stdio.h>
> +#include <stdbool.h>
> +#include "../btf_info.h"
> +#include "../kallsyms.h"
> +#include "../makedumpfile.h"
> +
> +static unsigned char mt_slots[4] = {0};
> +static unsigned char mt_pivots[4] = {0};
> +static unsigned long mt_max[4] = {0};
> +
> +INIT_OPT_KERN_SYM(mt_slots);
> +INIT_OPT_KERN_SYM(mt_pivots);
> +
> +INIT_OPT_KERN_STRUCT(maple_tree);
> +INIT_OPT_KERN_STRUCT(maple_node);
> +INIT_OPT_KERN_STRUCT_MEMBER(maple_tree, ma_root);
> +INIT_OPT_KERN_STRUCT_MEMBER(maple_node, ma64);
> +INIT_OPT_KERN_STRUCT_MEMBER(maple_node, mr64);
> +INIT_OPT_KERN_STRUCT_MEMBER(maple_node, slot);
> +INIT_OPT_KERN_STRUCT_MEMBER(maple_arange_64, pivot);
> +INIT_OPT_KERN_STRUCT_MEMBER(maple_arange_64, slot);
> +INIT_OPT_KERN_STRUCT_MEMBER(maple_range_64, pivot);
> +INIT_OPT_KERN_STRUCT_MEMBER(maple_range_64, slot);
> +
> +#define MEMBER_OFF(S, M) \
> + GET_KERN_STRUCT_MEMBER_MOFF(S, M) / 8
> +
> +#define MAPLE_BUFSIZE 512
> +
> +enum {
> + maple_dense_enum,
> + maple_leaf_64_enum,
> + maple_range_64_enum,
> + maple_arange_64_enum,
> +};
> +
> +#define MAPLE_NODE_MASK 255UL
> +#define MAPLE_NODE_TYPE_MASK 0x0F
> +#define MAPLE_NODE_TYPE_SHIFT 0x03
> +#define XA_ZERO_ENTRY xa_mk_internal(257)
> +
> +static unsigned long xa_mk_internal(unsigned long v)
> +{
> + return (v << 2) | 2;
> +}
> +
> +static bool xa_is_internal(unsigned long entry)
> +{
> + return (entry & 3) == 2;
> +}
> +
> +static bool xa_is_node(unsigned long entry)
> +{
> + return xa_is_internal(entry) && entry > 4096;
> +}
> +
> +static bool xa_is_value(unsigned long entry)
> +{
> + return entry & 1;
> +}
> +
> +static bool xa_is_zero(unsigned long entry)
> +{
> + return entry == XA_ZERO_ENTRY;
> +}
> +
> +static unsigned long xa_to_internal(unsigned long entry)
> +{
> + return entry >> 2;
> +}
> +
> +static unsigned long xa_to_value(unsigned long entry)
> +{
> + return entry >> 1;
> +}
> +
> +static unsigned long mte_to_node(unsigned long entry)
> +{
> + return entry & ~MAPLE_NODE_MASK;
> +}
> +
> +static unsigned long mte_node_type(unsigned long maple_enode_entry)
> +{
> + return (maple_enode_entry >> MAPLE_NODE_TYPE_SHIFT) &
> + MAPLE_NODE_TYPE_MASK;
> +}
> +
> +static unsigned long mt_slot(void **slots, unsigned char offset)
> +{
> + return (unsigned long)slots[offset];
> +}
> +
> +static bool ma_is_leaf(unsigned long type)
> +{
> + return type < maple_range_64_enum;
> +}
> +
> +static bool mte_is_leaf(unsigned long maple_enode_entry)
> +{
> + return ma_is_leaf(mte_node_type(maple_enode_entry));
> +}
> +
> +static void mt_dump_entry(unsigned long entry, unsigned long min,
> + unsigned long max, unsigned int depth,
> + unsigned long **array_out, int *array_len,
> + int *array_cap)
> +{
> + if (entry == 0)
> + return;
> +
> + add_to_arr((void ***)array_out, array_len, array_cap, (void *)entry);
> +}
> +
> +static void mt_dump_node(unsigned long entry, unsigned long min,
> + unsigned long max, unsigned int depth,
> + unsigned long **array_out, int *array_len,
> + int *array_cap);
> +
> +static void mt_dump_range64(unsigned long entry, unsigned long min,
> + unsigned long max, unsigned int depth,
> + unsigned long **array_out, int *array_len,
> + int *array_cap)
> +{
> + unsigned long maple_node_m_node = mte_to_node(entry);
> + char node_buf[MAPLE_BUFSIZE];
> + bool leaf = mte_is_leaf(entry);
> + unsigned long first = min, last;
> + int i;
> + char *mr64_buf;
> +
> + readmem(VADDR, maple_node_m_node, node_buf, GET_KERN_STRUCT_SSIZE(maple_node));
> + mr64_buf = node_buf + MEMBER_OFF(maple_node, mr64);
> +
> + for (i = 0; i < mt_slots[maple_range_64_enum]; i++) {
> + last = max;
> +
> + if (i < (mt_slots[maple_range_64_enum] - 1))
> + last = ULONG(mr64_buf + MEMBER_OFF(maple_range_64, pivot) +
> + sizeof(ulong) * i);
> +
> + else if (!VOID_PTR(mr64_buf + MEMBER_OFF(maple_range_64, slot) +
> + sizeof(void *) * i) &&
> + max != mt_max[mte_node_type(entry)])
> + break;
> + if (last == 0 && i > 0)
> + break;
> + if (leaf)
> + mt_dump_entry(mt_slot((void **)(mr64_buf +
> + MEMBER_OFF(maple_range_64, slot)), i),
> + first, last, depth + 1, array_out, array_len, array_cap);
> + else if (VOID_PTR(mr64_buf + MEMBER_OFF(maple_range_64, slot) +
> + sizeof(void *) * i)) {
> + mt_dump_node(mt_slot((void **)(mr64_buf +
> + MEMBER_OFF(maple_range_64, slot)), i),
> + first, last, depth + 1, array_out, array_len, array_cap);
> + }
> +
> + if (last == max)
> + break;
> + if (last > max) {
> + printf("node %p last (%lu) > max (%lu) at pivot %d!\n",
> + mr64_buf, last, max, i);
> + break;
> + }
> + first = last + 1;
> + }
> +}
> +
> +static void mt_dump_arange64(unsigned long entry, unsigned long min,
> + unsigned long max, unsigned int depth,
> + unsigned long **array_out, int *array_len,
> + int *array_cap)
> +{
> + unsigned long maple_node_m_node = mte_to_node(entry);
> + char node_buf[MAPLE_BUFSIZE];
> + unsigned long first = min, last;
> + int i;
> + char *ma64_buf;
> +
> + readmem(VADDR, maple_node_m_node, node_buf, GET_KERN_STRUCT_SSIZE(maple_node));
> + ma64_buf = node_buf + MEMBER_OFF(maple_node, ma64);
> +
> + for (i = 0; i < mt_slots[maple_arange_64_enum]; i++) {
> + last = max;
> +
> + if (i < (mt_slots[maple_arange_64_enum] - 1))
> + last = ULONG(ma64_buf + MEMBER_OFF(maple_arange_64, pivot) +
> + sizeof(void *) * i);
> + else if (!VOID_PTR(ma64_buf + MEMBER_OFF(maple_arange_64, slot) +
> + sizeof(void *) * i))
> + break;
> + if (last == 0 && i > 0)
> + break;
> +
> + if (ULONG(ma64_buf + MEMBER_OFF(maple_arange_64, slot) + sizeof(void *) * i))
> + mt_dump_node(mt_slot((void **)(ma64_buf +
> + MEMBER_OFF(maple_arange_64, slot)), i),
> + first, last, depth + 1, array_out, array_len, array_cap);
> +
> + if (last == max)
> + break;
> + if (last > max) {
> + printf("node %p last (%lu) > max (%lu) at pivot %d!\n",
> + ma64_buf, last, max, i);
> + break;
> + }
> + first = last + 1;
> + }
> +}
> +
> +static void mt_dump_node(unsigned long entry, unsigned long min,
> + unsigned long max, unsigned int depth,
> + unsigned long **array_out, int *array_len,
> + int *array_cap)
> +{
> + unsigned long maple_node = mte_to_node(entry);
> + unsigned long type = mte_node_type(entry);
> + int i;
> + char node_buf[MAPLE_BUFSIZE];
> +
> + readmem(VADDR, maple_node, node_buf, GET_KERN_STRUCT_SSIZE(maple_node));
> +
> + switch (type) {
> + case maple_dense_enum:
> + for (i = 0; i < mt_slots[maple_dense_enum]; i++) {
> + if (min + i > max)
> + printf("OUT OF RANGE: ");
> + mt_dump_entry(mt_slot((void **)(node_buf + MEMBER_OFF(maple_node, slot)), i),
> + min + i, min + i, depth, array_out, array_len, array_cap);
> + }
> + break;
> + case maple_leaf_64_enum:
> + case maple_range_64_enum:
> + mt_dump_range64(entry, min, max, depth, array_out, array_len, array_cap);
> + break;
> + case maple_arange_64_enum:
> + mt_dump_arange64(entry, min, max, depth, array_out, array_len, array_cap);
> + break;
> + default:
> + printf(" UNKNOWN TYPE\n");
> + }
> +}
> +
> +unsigned long *mt_dump(unsigned long mt, int *array_len)
> +{
> + char tree_buf[MAPLE_BUFSIZE];
> + unsigned long entry;
> + unsigned long *array_out = NULL;
> + int array_cap = 0;
> + *array_len = 0;
> +
> + readmem(VADDR, mt, tree_buf, GET_KERN_STRUCT_SSIZE(maple_tree));
> + entry = ULONG(tree_buf + MEMBER_OFF(maple_tree, ma_root));
> +
> + if (xa_is_node(entry))
> + mt_dump_node(entry, 0, mt_max[mte_node_type(entry)], 0,
> + &array_out, array_len, &array_cap);
> + else if (entry)
> + mt_dump_entry(entry, 0, 0, 0, &array_out, array_len, &array_cap);
> + else
> + printf("(empty)\n");
> +
> + return array_out;
> +}
> +
> +bool maple_init(void)
> +{
> + unsigned long mt_slots_ptr;
> + unsigned long mt_pivots_ptr;
> +
> + if (!KERN_SYM_EXIST(mt_slots) ||
> + !KERN_SYM_EXIST(mt_pivots) ||
> + !KERN_STRUCT_EXIST(maple_tree) ||
> + !KERN_STRUCT_EXIST(maple_node) ||
> + !KERN_STRUCT_MEMBER_EXIST(maple_tree, ma_root) ||
> + !KERN_STRUCT_MEMBER_EXIST(maple_node, ma64) ||
> + !KERN_STRUCT_MEMBER_EXIST(maple_node, mr64) ||
> + !KERN_STRUCT_MEMBER_EXIST(maple_node, slot) ||
> + !KERN_STRUCT_MEMBER_EXIST(maple_arange_64, pivot) ||
> + !KERN_STRUCT_MEMBER_EXIST(maple_arange_64, slot) ||
> + !KERN_STRUCT_MEMBER_EXIST(maple_range_64, pivot) ||
> + !KERN_STRUCT_MEMBER_EXIST(maple_range_64, slot)) {
> + printf("%s: Missing required maple tree syms/types\n",
> + __func__);
> + return false;
> + }
> +
> + mt_slots_ptr = GET_KERN_SYM(mt_slots);
> + mt_pivots_ptr = GET_KERN_SYM(mt_pivots);
> +
> + if (GET_KERN_STRUCT_SSIZE(maple_tree) > MAPLE_BUFSIZE ||
> + GET_KERN_STRUCT_SSIZE(maple_node) > MAPLE_BUFSIZE) {
> + printf("%s: MAPLE_BUFSIZE should be larger than maple_node/tree struct\n",
> + __func__);
> + return false;
> + }
> +
> + readmem(VADDR, mt_slots_ptr, mt_slots, sizeof(mt_slots));
> + readmem(VADDR, mt_pivots_ptr, mt_pivots, sizeof(mt_pivots));
> +
> + mt_max[maple_dense_enum] = mt_slots[maple_dense_enum];
> + mt_max[maple_leaf_64_enum] = ULONG_MAX;
> + mt_max[maple_range_64_enum] = ULONG_MAX;
> + mt_max[maple_arange_64_enum] = ULONG_MAX;
> +
> + return true;
> +}
> \ No newline at end of file
> diff --git a/extensions/maple_tree.h b/extensions/maple_tree.h
> new file mode 100644
> index 0000000..c96624c
> --- /dev/null
> +++ b/extensions/maple_tree.h
> @@ -0,0 +1,6 @@
> +#ifndef _MAPLE_TREE_H
> +#define _MAPLE_TREE_H
> +#include <stdbool.h>
> +unsigned long *mt_dump(unsigned long mt, int *array_len);
> +bool maple_init(void);
> +#endif /* _MAPLE_TREE_H */
> \ No newline at end of file
> --
> 2.47.0
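For readers following the enode helpers in the quoted maple_tree.c, here is a minimal standalone sketch of how the masks take an encoded entry apart. The mask and shift values and the helper logic are copied from the patch; the sample address and the make_sample_entry() helper are made up for illustration, and fixed 64-bit integers are used instead of unsigned long so the sketch behaves the same everywhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the quoted patch. */
#define MAPLE_NODE_MASK		255UL
#define MAPLE_NODE_TYPE_MASK	0x0F
#define MAPLE_NODE_TYPE_SHIFT	0x03

enum {
	maple_dense_enum,
	maple_leaf_64_enum,
	maple_range_64_enum,
	maple_arange_64_enum,
};

/* Low two bits equal to 2 mark an internal entry. */
static bool xa_is_internal(uint64_t entry)
{
	return (entry & 3) == 2;
}

/* Internal entries above 4096 are treated as node pointers. */
static bool xa_is_node(uint64_t entry)
{
	return xa_is_internal(entry) && entry > 4096;
}

/* Clear the low 8 bits to recover the node address. */
static uint64_t mte_to_node(uint64_t entry)
{
	return entry & ~(uint64_t)MAPLE_NODE_MASK;
}

/* The node type sits in bits 3..6 of the encoded entry. */
static uint64_t mte_node_type(uint64_t entry)
{
	return (entry >> MAPLE_NODE_TYPE_SHIFT) & MAPLE_NODE_TYPE_MASK;
}

/* Dense and leaf_64 nodes are leaves; range_64 and arange_64 are not. */
static bool mte_is_leaf(uint64_t entry)
{
	return mte_node_type(entry) < maple_range_64_enum;
}

/*
 * Hypothetical helper, not part of the patch: build an encoded entry from
 * a 256-byte aligned node address and a node type, with the low bits set
 * so that xa_is_node() accepts it.
 */
static uint64_t make_sample_entry(uint64_t node_addr, unsigned int type)
{
	return node_addr | ((uint64_t)type << MAPLE_NODE_TYPE_SHIFT) | 2;
}
```

Feeding make_sample_entry(0xffff8880abcd1200, maple_range_64_enum) through the helpers gives back the same address and type, which is the round trip mt_dump_node() relies on when it picks the dump routine.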
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering
2026-03-17 15:07 [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering Tao Liu
` (6 preceding siblings ...)
2026-03-17 15:07 ` [PATCH v4][makedumpfile 7/7] Filter amdgpu mm pages Tao Liu
@ 2026-04-03 8:06 ` HAGIO KAZUHITO(萩尾 一仁)
2026-04-03 18:26 ` Stephen Brennan
8 siblings, 0 replies; 21+ messages in thread
From: HAGIO KAZUHITO(萩尾 一仁) @ 2026-04-03 8:06 UTC (permalink / raw)
To: Tao Liu
Cc: stephen.s.brennan@oracle.com,
YAMAZAKI MASAMITSU(山崎 真光),
kexec@lists.infradead.org
On 2026/03/18 0:07, Tao Liu wrote:
> A) This patchset will introduce the following features to makedumpfile:
>
> 1) Add .so extension support to makedumpfile
> 2) Enable btf and kallsyms for symbol type and address resolving.
>
> B) The purpose of the features are:
>
> 1) Currently makedumpfile filters mm pages based on page flags, because flags
> can help to determine one page's usage. But this page-flag-checking method
> lacks of flexibility in certain cases, e.g. if we want to filter those mm
> pages occupied by GPU during vmcore dumping due to:
>
> a) GPU may be taking a large memory and contains sensitive data;
> b) GPU mm pages have no relations to kernel crash and useless for vmcore
> analysis.
>
> But there is no GPU mm page specific flags, and apparently we don't need
> to create one just for kdump use. A programmable filtering tool is more
> suitable for such cases. In addition, different GPU vendors may use
> different ways for mm pages allocating, programmable filtering is better
> than hard coding these GPU specific logics into makedumpfile in this case.
>
> 2) Currently makedumpfile already contains a programmable filtering tool, aka
> eppic script, which allows user to write customized code for data erasing.
> However it has the following drawbacks:
>
> a) cannot do mm page filtering.
> b) need to access to debuginfo of both kernel and modules, which is not
> applicable in the 2nd kernel.
> c) eppic library has memory leaks which are not all resolved [1]. This
> is not acceptable in 2nd kernel.
>
> makedumpfile need to resolve the dwarf data from debuginfo, to get symbols
> types and addresses. In recent kernel there are dwarf alternatives such
> as btf/kallsyms which can be used for this purpose. And btf/kallsyms info
> are already packed within vmcore, so we can use it directly.
>
> With these, this patchset introduces makedumpfile extensions, which is based
> on btf/kallsyms symbol resolving, and is programmable for mm page filtering.
> The following section shows its usage and performance, please note the tests
> are performed in 1st kernel.
>
> 3) Compile and run makedumpfile extensions:
>
> $ make LINKTYPE=dynamic USELZO=on USESNAPPY=on USEZSTD=on
> $ make extensions
>
> $ /usr/bin/time -v ./makedumpfile -d 31 -l /var/crash/127.0.0.1-2025-06-10-18\:03\:12/vmcore
> /tmp/extension.out --extension amdgpu_filter.so
> Loaded extension: ./extensions/amdgpu_filter.so
> makedumpfile Completed.
> User time (seconds): 5.08
> System time (seconds): 0.84
> Percent of CPU this job got: 99%
> Elapsed (wall clock) time (h:mm:ss or m:ss): 0:05.95
> Maximum resident set size (kbytes): 17360
> ...
>
> To contrast with eppic script of v2 [2]:
>
> $ /usr/bin/time -v ./makedumpfile -d 31 -l /var/crash/127.0.0.1-2025-06-10-18\:03\:12/vmcore
> /tmp/eppic.out --eppic eppic_scripts/filter_amdgpu_mm_pages.c
> makedumpfile Completed.
> User time (seconds): 8.23
> System time (seconds): 0.88
> Elapsed (wall clock) time (h:mm:ss or m:ss): 0:09.16
> Maximum resident set size (kbytes): 57128
> ...
>
> -rw------- 1 root root 367475074 Jan 19 19:01 /tmp/extension.out
> -rw------- 1 root root 367475074 Jan 19 19:48 /tmp/eppic.out
> -rw------- 1 root root 387181418 Jun 10 18:03 /var/crash/127.0.0.1-2025-06-10-18:03:12/vmcore
>
> C) Discussion:
>
> 1) GPU types: Currently only tested with amdgpu's mm page filtering, others
> are not tested.
> 2) OS: The code can work on rhel-10+/rhel9.5+ on x86_64/arm64/s390/ppc64.
> Others are not tested.
>
> D) Testing:
>
> If you don't want to create your vmcore, you can find a vmcore which I
> created with amdgpu mm pages unfiltered [3], the amdgpu mm pages are
> allocated by program [4]. You can use the vmcore in 1st kernel to filter
> the amdgpu mm pages by the previous performance testing cmdline. To
> verify the pages are filtered in crash:
>
> Unfiltered:
> crash> search -c "!QAZXSW@#EDC"
> ffff96b7fa800000: !QAZXSW@#EDCXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> ffff96b87c800000: !QAZXSW@#EDCXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> crash> rd ffff96b7fa800000
> ffff96b7fa800000: 405753585a415121 !QAZXSW@
> crash> rd ffff96b87c800000
> ffff96b87c800000: 405753585a415121 !QAZXSW@
>
> Filtered:
> crash> search -c "!QAZXSW@#EDC"
> crash> rd ffff96b7fa800000
> rd: page excluded: kernel virtual address: ffff96b7fa800000 type: "64-bit KVADDR"
> crash> rd ffff96b87c800000
> rd: page excluded: kernel virtual address: ffff96b87c800000 type: "64-bit KVADDR"
>
> [1]: https://github.com/lucchouina/eppic/pull/32
> [2]: https://lore.kernel.org/kexec/20251020222410.8235-1-ltao@redhat.com/
> [3]: https://people.redhat.com/~ltao/core/vmcore
> [4]: https://gist.github.com/liutgnu/a8cbce1c666452f1530e1410d1f352df
>
> v4 -> v3:
>
> 1) Get rid of all hash table usage. So only required syms/types info
> will be stored, rather than install all kernel's syms/types. To do this,
> special elf sections as .init_ksyms/ktypes are used for the required info
> declaration/storage.
>
> 2) Support extension callback for makedumpfile, so during mm page
> filtering, extension can help to decide if keep/discard the page.
>
> 3) The patches are organized as follows:
>
> --- <only for test purpose, don't merge> ---
> 7. Filter amdgpu mm pages
>
> --- <code should be merged> ---
> 6. Add makedumpfile extensions support
> 5. Implement kernel module's btf resolving
> 4. Implement kernel module's kallsyms resolving
> 3. Implement kernel btf resolving
> 2. Implement kernel kallsyms resolving
> 1. Reserve sections for makedumpfile and extenions
>
> Patch 7 is customization specific, which can be maintained separately.
> Patch 1 ~ 6 are common code which should be integrate with makedumpfile.
Hi Tao,
thank you for the update, this is a nice and interesting implementation :-)
Here are my comments on the whole patchset:
- I think we need documentation for this feature, and it would be better to
have a simple example extension (one that does not need to be updated) if
possible.
- It could not be built on RHEL8. That is fine if libbpf is just too old, but
which version of libbpf is required? It should be stated in the
documentation.
- Looking at the code below, e.g. "makedumpfile -d 1 --extension foo.so" does not
run run_extension_callback(). Is that intended?
/*
* Exclude cache pages, cache private pages, user data pages,
* and hwpoison pages.
*/
if (info->dump_level & DL_EXCLUDE_CACHE ||
info->dump_level & DL_EXCLUDE_CACHE_PRI ||
info->dump_level & DL_EXCLUDE_USER_DATA ||
NUMBER(PG_hwpoison) != NOT_FOUND_NUMBER ||
((info->dump_level & DL_EXCLUDE_FREE) &&
info->page_is_buddy)) {
if (!exclude_unnecessary_pages(cycle)) {
ERRMSG("Can't exclude unnecessary pages.\n");
return FALSE;
}
}
- I'm concerned that the readmem() calls don't check their return values. It
would be better to be able to determine which value could not be read.
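One way to address the last point is a small wrapper that names the value being read when a read fails. The sketch below is only an illustration: read_checked(), the READ_OR_FAIL() macro and the fake_readmem() backend are hypothetical names standing in for makedumpfile's readmem(), and it assumes the reader returns nonzero on success and 0 on failure.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for makedumpfile's readmem(): copies from a fake
 * 64-byte "memory" image and returns 0 when the range is out of bounds.
 */
static unsigned char fake_mem[64];

static size_t fake_readmem(unsigned long addr, void *buf, size_t size)
{
	if (addr > sizeof(fake_mem) || size > sizeof(fake_mem) - addr)
		return 0;
	memcpy(buf, fake_mem + addr, size);
	return size;
}

/* Read 'size' bytes and report which value could not be read on failure. */
static bool read_checked(const char *what, unsigned long addr,
			 void *buf, size_t size)
{
	if (!fake_readmem(addr, buf, size)) {
		fprintf(stderr, "Can't read %s at %#lx (%zu bytes)\n",
			what, addr, size);
		return false;
	}
	return true;
}

/* Stringify the destination so the error message names the variable. */
#define READ_OR_FAIL(addr, var) \
	read_checked(#var, (addr), &(var), sizeof(var))
```

With something along these lines, a caller such as maple_init() could write "if (!READ_OR_FAIL(mt_slots_ptr, mt_slots)) return false;" and the error message would say that mt_slots was the value that failed.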
The other comments are in each patch.
Thanks,
Kazu
>
> Link to v3: https://lore.kernel.org/kexec/20260120025500.25095-1-ltao@redhat.com/
> Link to v2: https://lore.kernel.org/kexec/20251020222410.8235-1-ltao@redhat.com/
> Link to v1: https://lore.kernel.org/kexec/20250610095743.18073-1-ltao@redhat.com/
>
> Tao Liu (7):
> Reserve sections for makedumpfile and extenions
> Implement kernel kallsyms resolving
> Implement kernel btf resolving
> Implement kernel module's kallsyms resolving
> Implement kernel module's btf resolving
> Add makedumpfile extensions support
> Filter amdgpu mm pages
>
> Makefile | 11 +-
> btf_info.c | 345 +++++++++++++++++++++++++++
> btf_info.h | 92 ++++++++
> extension.c | 300 +++++++++++++++++++++++
> extension.h | 12 +
> extensions/Makefile | 12 +
> extensions/amdgpu_filter.c | 190 +++++++++++++++
> extensions/maple_tree.c | 307 ++++++++++++++++++++++++
> extensions/maple_tree.h | 6 +
> kallsyms.c | 473 +++++++++++++++++++++++++++++++++++++
> kallsyms.h | 94 ++++++++
> makedumpfile.c | 41 +++-
> makedumpfile.h | 13 +
> makedumpfile.ld | 15 ++
> 14 files changed, 1903 insertions(+), 8 deletions(-)
> create mode 100644 btf_info.c
> create mode 100644 btf_info.h
> create mode 100644 extension.c
> create mode 100644 extension.h
> create mode 100644 extensions/Makefile
> create mode 100644 extensions/amdgpu_filter.c
> create mode 100644 extensions/maple_tree.c
> create mode 100644 extensions/maple_tree.h
> create mode 100644 kallsyms.c
> create mode 100644 kallsyms.h
> create mode 100644 makedumpfile.ld
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering
2026-03-17 15:07 [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering Tao Liu
` (7 preceding siblings ...)
2026-04-03 8:06 ` [PATCH v4][makedumpfile 0/7] btf/kallsyms based makedumpfile extension for mm page filtering HAGIO KAZUHITO(萩尾 一仁)
@ 2026-04-03 18:26 ` Stephen Brennan
8 siblings, 0 replies; 21+ messages in thread
From: Stephen Brennan @ 2026-04-03 18:26 UTC (permalink / raw)
To: Tao Liu, yamazaki-msmt, k-hagio-ab, kexec; +Cc: aravinda, Tao Liu
Hello,
My testing of the patch series involves my own extension, userstack.so,
which I've implemented here[1] on top of this branch:
[1]: https://github.com/brenns10/makedumpfile/commits/stepbren_userstack_upstream_v4/
To test, I have a vmcore created at dump-level 23, and I'm using
makedumpfile to re-filter the vmcore at dump-level 31, either including
the userspace stacks, or not using an extension at all:
$ /usr/bin/time ./makedumpfile -z -d31 --extension userstack.so ./dump.withuser.img ./extension.img
...
9.66user 0.86system 0:10.56elapsed 99%CPU (0avgtext+0avgdata 15108maxresident)k
0inputs+1133800outputs (0major+2184minor)pagefaults 0swaps
$ /usr/bin/time ./makedumpfile -z -d31 ./dump.withuser.img ./baseline.img
...
9.28user 0.84system 0:10.17elapsed 99%CPU (0avgtext+0avgdata 3336maxresident)k
0inputs+1132120outputs (0major+236minor)pagefaults 0swaps
$ ls -l *.img
-rw------- 1 stepbren stepbren 577384093 Apr 3 10:55 baseline.img
-rw------- 1 stepbren stepbren 1746475073 Apr 2 15:21 dump.withuser.img
-rw------- 1 stepbren stepbren 578253346 Apr 3 10:54 extension.img
With drgn's contrib/pstack.py, we can dump a stack trace for an example
process in the vmcore. I'll show it here three times. First, with the
original vmcore containing all userspace pages: it works, but the vmcore
is about 1.6 GiB, more than 3x larger than a dump-level 31 vmcore. Second, with
the baseline, which is at dump-level 31: the userspace stack cannot be
retrieved. Third, with the userstack.so extension: despite being less
than 1 MiB larger than the baseline, it contains the necessary stack
pages and thus the stack trace is the same as the original.
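The size comparison can be checked directly against the ls -l listing above. The sketch below is plain arithmetic on those three byte counts; the macro and function names are invented for this illustration.

```c
#include <stdint.h>
#include <stdio.h>

/* Byte counts taken from the ls -l listing above. */
#define BASELINE_BYTES	 577384093ULL	/* baseline.img: -d31, no extension */
#define ORIGINAL_BYTES	1746475073ULL	/* dump.withuser.img: dump-level 23 */
#define EXTENSION_BYTES	 578253346ULL	/* extension.img: -d31 + userstack.so */

/* Convert a byte count to GiB (2^30 bytes). */
static double to_gib(uint64_t bytes)
{
	return (double)bytes / (1024.0 * 1024.0 * 1024.0);
}

/* How many times larger 'a' is than 'b'. */
static double size_ratio(uint64_t a, uint64_t b)
{
	return (double)a / (double)b;
}

/* Bytes added by the extension on top of the baseline dump. */
static uint64_t extra_bytes(uint64_t with_ext, uint64_t baseline)
{
	return with_ext - baseline;
}

/* Print the three figures discussed in the text. */
void print_size_summary(void)
{
	uint64_t extra = extra_bytes(EXTENSION_BYTES, BASELINE_BYTES);

	printf("original: %.2f GiB, %.2fx the baseline\n",
	       to_gib(ORIGINAL_BYTES),
	       size_ratio(ORIGINAL_BYTES, BASELINE_BYTES));
	printf("extension adds %llu bytes (%.2f MiB)\n",
	       (unsigned long long)extra, (double)extra / (1024.0 * 1024.0));
}
```

The extra cost of keeping the userspace stacks is the difference between extension.img and baseline.img, which stays under 1 MiB for this vmcore.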
$ drgn -c dump.withuser.img /usr/share/drgn/contrib/pstack.py pstack -p 19284
[PID: 19284 COMM: bash]
Thread 0 TID=19284 [S] CPU=18 ('bash')
#0 context_switch (kernel/sched/core.c:5367:2)
#1 __schedule (kernel/sched/core.c:6752:8)
#2 __schedule_loop (kernel/sched/core.c:6829:3)
#3 schedule (kernel/sched/core.c:6844:2)
#4 do_wait (kernel/exit.c:1698:3)
#5 kernel_wait4 (kernel/exit.c:1852:8)
#6 __do_sys_wait4 (kernel/exit.c:1880:13)
#7 do_syscall_x64 (arch/x86/entry/common.c:47:14)
#8 do_syscall_64 (arch/x86/entry/common.c:84:7)
#9 entry_SYSCALL_64+0xaf/0x14c (arch/x86/entry/entry_64.S:121)
#10 0x7f46e82b63d7
------ userspace ---------
#0 wait4+0x17/0xa3 (from /usr/lib64/libc.so.6 +0x1103d7)
#1 waitchld.isra.0+0x102/0xc42 (from /usr/bin/bash +0x10bd52)
#2 wait_for+0x15a/0xda6 (from /usr/bin/bash +0x5b54a)
#3 execute_command_internal+0x3186/0x3877 (from /usr/bin/bash +0x3e976)
#4 execute_command+0xcc/0x1c8 (from /usr/bin/bash +0x3f13c)
#5 reader_loop+0x1ea/0x39a (from /usr/bin/bash +0x311da)
#6 main+0xe2a/0x1a94 (from /usr/bin/bash +0x2548a)
#7 __libc_start_call_main+0x7e/0xac (from /usr/lib64/libc.so.6 +0x2d39e)
#8 __libc_start_main@@GLIBC_2.34+0x89/0x14c (from /usr/lib64/libc.so.6 +0x2d459)
#9 _start+0x25/0x26 (from /usr/bin/bash +0x26125)
#10 ???
$ drgn -c baseline.img /usr/share/drgn/contrib/pstack.py pstack -p 19284
[PID: 19284 COMM: bash]
Thread 0 TID=19284 [S] CPU=18 ('bash')
#0 context_switch (kernel/sched/core.c:5367:2)
#1 __schedule (kernel/sched/core.c:6752:8)
#2 __schedule_loop (kernel/sched/core.c:6829:3)
#3 schedule (kernel/sched/core.c:6844:2)
#4 do_wait (kernel/exit.c:1698:3)
#5 kernel_wait4 (kernel/exit.c:1852:8)
#6 __do_sys_wait4 (kernel/exit.c:1880:13)
#7 do_syscall_x64 (arch/x86/entry/common.c:47:14)
#8 do_syscall_64 (arch/x86/entry/common.c:84:7)
#9 entry_SYSCALL_64+0xaf/0x14c (arch/x86/entry/entry_64.S:121)
#10 0x7f46e82b63d7
------ userspace ---------
#0 wait4+0x17/0xa3 (from /usr/lib64/libc.so.6 +0x1103d7)
$ drgn -c extension.img /usr/share/drgn/contrib/pstack.py pstack -p 19284
[PID: 19284 COMM: bash]
Thread 0 TID=19284 [S] CPU=18 ('bash')
#0 context_switch (kernel/sched/core.c:5367:2)
#1 __schedule (kernel/sched/core.c:6752:8)
#2 __schedule_loop (kernel/sched/core.c:6829:3)
#3 schedule (kernel/sched/core.c:6844:2)
#4 do_wait (kernel/exit.c:1698:3)
#5 kernel_wait4 (kernel/exit.c:1852:8)
#6 __do_sys_wait4 (kernel/exit.c:1880:13)
#7 do_syscall_x64 (arch/x86/entry/common.c:47:14)
#8 do_syscall_64 (arch/x86/entry/common.c:84:7)
#9 entry_SYSCALL_64+0xaf/0x14c (arch/x86/entry/entry_64.S:121)
#10 0x7f46e82b63d7
------ userspace ---------
#0 wait4+0x17/0xa3 (from /usr/lib64/libc.so.6 +0x1103d7)
#1 waitchld.isra.0+0x102/0xc42 (from /usr/bin/bash +0x10bd52)
#2 wait_for+0x15a/0xda6 (from /usr/bin/bash +0x5b54a)
#3 execute_command_internal+0x3186/0x3877 (from /usr/bin/bash +0x3e976)
#4 execute_command+0xcc/0x1c8 (from /usr/bin/bash +0x3f13c)
#5 reader_loop+0x1ea/0x39a (from /usr/bin/bash +0x311da)
#6 main+0xe2a/0x1a94 (from /usr/bin/bash +0x2548a)
#7 __libc_start_call_main+0x7e/0xac (from /usr/lib64/libc.so.6 +0x2d39e)
#8 __libc_start_main@@GLIBC_2.34+0x89/0x14c (from /usr/lib64/libc.so.6 +0x2d459)
#9 _start+0x25/0x26 (from /usr/bin/bash +0x26125)
#10 ???
Beyond my testing / use case, the only other feedback I wanted to share
on this patch series is that I think it would be good to have the
amdgpu and userstack extensions contributed upstream, rather than kept
as external customizations.
Given that there's not a defined extension API or ABI, I don't think
it makes sense yet for extensions to be independent: they depend heavily on the
internals of makedumpfile. There's probably not going to be a
"makedumpfile-devel" package any time soon which allows building
and maintaining external extensions for your system makedumpfile. So the
only way to make use of the system would be either (a) the Linux distro
bundles its own extension, or (b) the user builds and manually
installs a custom version of makedumpfile & extensions.
The main purpose of the extension API is to provide these specific
functionalities which are impossible with page-based filtering. Ideally,
I think makedumpfile should provide the real, useful capability and not
just the framework for a motivated developer to make it happen.
That said, I know it's a maintenance overhead and there may not be a
clear test strategy for every extension.
Thanks,
Stephen
^ permalink raw reply [flat|nested] 21+ messages in thread