* [RFC][PATCH v2 00/29] introduce kmemdump
@ 2025-07-24 13:54 Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 01/29] kmemdump: " Eugen Hristev
` (29 more replies)
0 siblings, 30 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
kmemdump is a mechanism which allows the kernel to mark specific memory
areas for dumping or specific backend usage.
Once regions are marked, kmemdump keeps an internal list with the regions
and registers them in the backend.
Further, depending on the backend driver, these regions can be dumped using
firmware or a different hardware block.
Because regions are marked beforehand, while the system is up and running,
there is no need for, nor dependency on, a panic handler or a working kernel
that can dump the debug information.
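To make the flow above concrete, here is a minimal userspace sketch in plain C. It is not the code from this series: every `model_*` name, the tiny table size and the -1 "any free slot" marker are assumptions made up for illustration. It models a fixed table of regions, a no-op default backend, and the replay of already marked regions when a real backend shows up.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_MAX_ZONES	8	/* hypothetical table size */
#define MODEL_NO_ID	(-1)	/* hypothetical "pick any free slot" request */

struct model_zone {
	int used;
	void *addr;
	size_t size;
};

struct model_backend {
	const char *name;
	int (*register_region)(int id, void *addr, size_t size);
	int (*unregister_region)(int id);
};

/* Default backend: accepts everything and does nothing. */
static int noop_register(int id, void *addr, size_t size)
{
	(void)id; (void)addr; (void)size;
	return 0;
}

static int noop_unregister(int id)
{
	(void)id;
	return 0;
}

static const struct model_backend model_default_backend = {
	"default", noop_register, noop_unregister
};

/* Example backend that only counts the regions it currently knows about. */
int model_counting_seen;

static int counting_register(int id, void *addr, size_t size)
{
	(void)id; (void)addr; (void)size;
	model_counting_seen++;
	return 0;
}

static int counting_unregister(int id)
{
	(void)id;
	model_counting_seen--;
	return 0;
}

const struct model_backend model_counting_backend = {
	"counting", counting_register, counting_unregister
};

static const struct model_backend *model_cur = &model_default_backend;
static struct model_zone model_zones[MODEL_MAX_ZONES];

/* Mark a region; returns the slot used or a negative errno value. */
int model_register(int req_id, void *addr, size_t size)
{
	int id = req_id;
	int ret;

	if (id == MODEL_NO_ID)
		for (id = 0; id < MODEL_MAX_ZONES && model_zones[id].used; id++)
			;
	if (id < 0)
		return -EINVAL;
	if (id >= MODEL_MAX_ZONES)
		return -ENOSPC;
	if (model_zones[id].used)
		return -EALREADY;

	ret = model_cur->register_region(id, addr, size);
	if (ret)
		return ret;

	model_zones[id].used = 1;
	model_zones[id].addr = addr;
	model_zones[id].size = size;
	return id;
}

/* Drop a region from the current backend and from the table. */
void model_unregister(int id)
{
	if (id < 0 || id >= MODEL_MAX_ZONES || !model_zones[id].used)
		return;
	model_cur->unregister_region(id);
	memset(&model_zones[id], 0, sizeof(model_zones[id]));
}

/* Swap backends: detach known regions from the old one, replay to the new. */
void model_set_backend(const struct model_backend *be)
{
	int id;

	assert(be && be->register_region && be->unregister_region);
	for (id = 0; id < MODEL_MAX_ZONES; id++)
		if (model_zones[id].used)
			model_cur->unregister_region(id);
	model_cur = be;
	for (id = 0; id < MODEL_MAX_ZONES; id++)
		if (model_zones[id].used)
			be->register_region(id, model_zones[id].addr,
					    model_zones[id].size);
}
```

In this model, regions marked while only the default backend is present are not lost: `model_set_backend(&model_counting_backend)` hands every one of them to the new backend, which is the property that lets marking happen long before any crash.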
The kmemdump approach works when pstore, kdump, or another mechanism does not.
Pstore relies on persistent storage, either a dedicated RAM area or flash, which
has the disadvantage of keeping the memory reserved all the time, or on another
specific non-volatile memory. Some devices cannot keep the RAM contents across a
reboot, so ramoops does not work. Some devices do not allow kexec to run
another kernel to debug the crashed one.
For such devices that have another mechanism to help debugging, like
firmware, kmemdump is a viable solution.
kmemdump can create a core image, similar to /proc/vmcore, with only
the registered regions included. This can be loaded into the crash tool or gdb
and analyzed.
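As a rough illustration of what such an image looks like, the userspace sketch below fills an ELF64 core header followed by one PT_LOAD program header per region, using the standard <elf.h> definitions. It is not the coreimage code from this series: `sketch_region`, `sketch_build_core_header` and the assumption that region data follows the header region back to back are made up for illustration, and the vmcoreinfo note is left out entirely.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one marked region. */
struct sketch_region {
	unsigned long long vaddr;	/* kernel virtual address */
	size_t size;			/* size in bytes */
};

/*
 * Fill buf (the header region) with an ELF64 core header and one PT_LOAD
 * program header per region. Region data is assumed to follow the header
 * region back to back, in table order. Returns the number of header bytes
 * used, or 0 if the headers do not fit in buf.
 */
size_t sketch_build_core_header(void *buf, size_t buf_size,
				const struct sketch_region *regions,
				size_t count)
{
	Elf64_Ehdr ehdr;
	Elf64_Phdr phdr;
	unsigned char *out = buf;
	size_t need = sizeof(ehdr) + count * sizeof(phdr);
	size_t offset = buf_size;	/* first region data starts here */
	size_t i;

	if (count > 0xfffe || need > buf_size)
		return 0;

	memset(out, 0, buf_size);
	memset(&ehdr, 0, sizeof(ehdr));
	memcpy(ehdr.e_ident, ELFMAG, SELFMAG);
	ehdr.e_ident[EI_CLASS] = ELFCLASS64;
	ehdr.e_ident[EI_DATA] = ELFDATA2LSB;
	ehdr.e_ident[EI_VERSION] = EV_CURRENT;
	ehdr.e_type = ET_CORE;
	ehdr.e_machine = EM_AARCH64;
	ehdr.e_version = EV_CURRENT;
	ehdr.e_phoff = sizeof(ehdr);
	ehdr.e_ehsize = sizeof(ehdr);
	ehdr.e_phentsize = sizeof(phdr);
	ehdr.e_phnum = (Elf64_Half)count;
	memcpy(out, &ehdr, sizeof(ehdr));

	for (i = 0; i < count; i++) {
		memset(&phdr, 0, sizeof(phdr));
		phdr.p_type = PT_LOAD;
		phdr.p_flags = PF_R | PF_W;
		phdr.p_offset = offset;
		phdr.p_vaddr = regions[i].vaddr;
		phdr.p_filesz = regions[i].size;
		phdr.p_memsz = regions[i].size;
		memcpy(out + sizeof(ehdr) + i * sizeof(phdr), &phdr,
		       sizeof(phdr));
		offset += regions[i].size;
	}

	return need;
}
```

Because every program header carries both the virtual address and the file offset of its region, a debugger can resolve a symbol address from vmlinux to the right bytes in the assembled file, as long as the symbol lives in one of the included regions.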
For this to work, specific information from the kernel is registered
at kmemdump init time; the kmemdump user does not need to
do anything.
This version of the kmemdump patch series includes two backend drivers:
one is the Qualcomm Minidump backend, and the other one is the Debug Kinfo
backend for Android devices, reworked from this source here:
https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/android/debug_kinfo.c
written originally by Jone Chou <jonechou@google.com>
Initial version of kmemdump and discussion is available here:
https://lore.kernel.org/lkml/20250422113156.575971-1-eugen.hristev@linaro.org/
Kmemdump has been presented and discussed at Linaro Connect 2025,
including motivation, scope, usability and feasibility.
A video recording is available here for anyone interested:
https://www.youtube.com/watch?v=r4gII7MX9zQ&list=PLKZSArYQptsODycGiE0XZdVovzAwYNwtK&index=14
The implementation is based on the initial Pstore/directly mapped zones
published as an RFC here:
https://lore.kernel.org/all/20250217101706.2104498-1-eugen.hristev@linaro.org/
The back-end implementation for qcom_minidump is based on the minidump
patch series and driver written by Mukesh Ojha, thanks:
https://lore.kernel.org/lkml/20240131110837.14218-1-quic_mojha@quicinc.com/
*** How to use kmemdump with minidump backend on Qualcomm platform guide ***
Prerequisites:
A crash tool built with target=ARM64, with minor changes required for the usual
crash mode (minimal mode works without the patch).
A patch can be applied from here: https://p.calebs.dev/49a048
This patch will eventually be sent, in a reworked form, to the crash tool.
The target kernel must be built with:
CONFIG_DEBUG_INFO_REDUCED=n ; this makes vmlinux include all the debugging
information needed by the crash tool.
In addition, the kernel requires these options:
CONFIG_KMEMDUMP, CONFIG_KMEMDUMP_COREIMAGE, and the backend
CONFIG_KMEMDUMP_QCOM_MINIDUMP_BACKEND
Kernel arguments:
The firmware must be set to 'mini' mode with a kernel module parameter,
like this: qcom_scm.download_mode=mini
After the kernel boots and the qcom_minidump module is loaded, everything is ready for
a possible crash.
Once the crash happens, the firmware kicks in and the console shows
messages such as Sahara init, indicating that the firmware is
waiting in download mode (this is subject to the firmware supporting this
mode; I am using the sa8775p-ride board).
Example of log on the console:
"
[...]
B - 1096414 - usb: init start
B - 1100287 - usb: qusb_dci_platform , 0x19
B - 1105686 - usb: usb3phy: PRIM success: lane_A , 0x60
B - 1107455 - usb: usb2phy: PRIM success , 0x4
B - 1112670 - usb: dci, chgr_type_det_err
B - 1117154 - usb: ID:0x260, value: 0x4
B - 1121942 - usb: ID:0x108, value: 0x1d90
B - 1124992 - usb: timer_start , 0x4c4b40
B - 1129140 - usb: vbus_det_pm_unavail
B - 1133136 - usb: ID:0x252, value: 0x4
B - 1148874 - usb: SUPER , 0x900e
B - 1275510 - usb: SUPER , 0x900e
B - 1388970 - usb: ID:0x20d, value: 0x0
B - 1411113 - usb: ENUM success
B - 1411113 - Sahara Init
B - 1414285 - Sahara Open
"
Once the board is in download mode, you can use the qdl tool (I
personally use edl and have not tried qdl yet) to get all the regions as
separate files.
The tool on the host computer lists the regions in the order they
were downloaded.
Once you have all the files, simply use `cat` to put them all together,
in the order of the indexes.
For my kernel config and setup, here is my cat command (you could use a script
instead; I haven't done that so far):
`cat memory/md_KELF1.BIN memory/md_Kvmcorein2.BIN memory/md_Kconfig3.BIN \
memory/md_Kmemsect4.BIN memory/md_Ktotalram5.BIN memory/md_Kcpu_poss6.BIN \
memory/md_Kcpu_pres7.BIN memory/md_Kcpu_onli8.BIN memory/md_Kcpu_acti9.BIN \
memory/md_Kjiffies10.BIN memory/md_Klinux_ba11.BIN memory/md_Knr_threa12.BIN \
memory/md_Knr_irqs13.BIN memory/md_Ktainted_14.BIN memory/md_Ktaint_fl15.BIN \
memory/md_Kmem_sect16.BIN memory/md_Knode_dat17.BIN memory/md_Knode_sta18.BIN \
memory/md_K__per_cp19.BIN memory/md_Knr_swapf20.BIN memory/md_Kinit_uts21.BIN \
memory/md_Kprintk_r22.BIN memory/md_Kprintk_r23.BIN memory/md_Kprb24.BIN \
memory/md_Kprb_desc25.BIN memory/md_Kprb_info26.BIN memory/md_Kprb_data27.BIN \
memory/md_Krunqueue28.BIN memory/md_Khigh_mem29.BIN memory/md_Kinit_mm30.BIN \
memory/md_Kinit_mm_31.BIN memory/md_Kunknown32.BIN memory/md_Kunknown33.BIN \
memory/md_Kunknown34.BIN memory/md_Kunknown35.BIN memory/md_Kunknown36.BIN \
memory/md_Kunknown37.BIN memory/md_Kunknown38.BIN memory/md_Kunknown39.BIN \
memory/md_Kunknown40.BIN memory/md_Kunknown41.BIN memory/md_Kunknown42.BIN \
memory/md_Kunknown43.BIN memory/md_Kunknown44.BIN memory/md_Kunknown45.BIN \
memory/md_Kunknown46.BIN memory/md_Kunknown47.BIN memory/md_Kunknown50.BIN \
memory/md_Kunknown51.BIN memory/md_Kunknown52.BIN \
memory/md_Kunknown53.BIN > ~/minidump_image`
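Typing that list by hand gets tedious. A small helper could recover the index from each file name and sort on it. The sketch below assumes the index is simply the run of digits right before the extension, as in the names above; `region_file_index` and `sort_region_files` are names made up for illustration and are not part of any existing tool.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the index encoded at the end of a region file name, for example
 * 3 for "memory/md_Kconfig3.BIN", or -1 if the name carries no index.
 */
long region_file_index(const char *path)
{
	const char *base, *dot, *end, *p;

	assert(path != NULL);

	/* Only look at the last path component. */
	base = strrchr(path, '/');
	base = base ? base + 1 : path;

	/* The index ends where the extension starts (or at end of name). */
	dot = strrchr(base, '.');
	end = dot ? dot : base + strlen(base);

	/* Walk back over the trailing digits. */
	p = end;
	while (p > base && isdigit((unsigned char)p[-1]))
		p--;
	if (p == end)
		return -1;

	return strtol(p, NULL, 10);
}

/* qsort() comparator: order file names by their numeric index. */
static int cmp_region_files(const void *a, const void *b)
{
	long ia = region_file_index(*(const char *const *)a);
	long ib = region_file_index(*(const char *const *)b);

	return (ia > ib) - (ia < ib);
}

/* Sort an array of file names into the order needed for concatenation. */
void sort_region_files(const char **names, size_t count)
{
	qsort(names, count, sizeof(*names), cmp_region_files);
}
```

A numeric sort matters here: a plain lexical sort would place index 10 before index 2, which would put the regions at the wrong file offsets.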
Once you have the resulting file, use the `crash` tool to load it, like this:
`./crash --no_modules --no_panic --no_kmem_cache --zero_excluded vmlinux minidump_image`
There is also a --minimal mode for ./crash that works without any patch applied
to the crash tool, but then you can't inspect symbols, etc.
Once crash has loaded you will see something like this:
KERNEL: /home/eugen/linux-minidump/vmlinux [TAINTED]
DUMPFILE: /home/eugen/new
CPUS: 8 [OFFLINE: 7]
DATE: Thu Jan 1 02:00:00 EET 1970
UPTIME: 00:00:29
TASKS: 0
NODENAME: qemuarm64
RELEASE: 6.16.0-rc7-next-20250721-00029-gf8cffdbf0479-dirty
VERSION: #5 SMP PREEMPT Tue Jul 22 18:44:57 EEST 2025
MACHINE: aarch64 (unknown Mhz)
MEMORY: 34.2 GB
PANIC: ""
crash> log
[ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd4b2]
[ 0.000000] Linux version 6.16.0-rc7-next-20250721-00029-gf8cffdbf0479-dirty (eugen@eugen-station) (aarch64-none-linux-gnu-gcc (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 13.3.1 20240614, GNU ld (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 2.42.0.20240614) #5 SMP PREEMPT Tue Jul 22 18:44:57 EEST 2025
*** Debug Kinfo backend driver ***
I don't have any device to actually test this, so I have not.
I hacked the driver to use a kmalloc'ed area to save things instead
of the shared memory, dumped everything there and checked whether it looks
sane. If someone is willing to try it out, thanks, and please let me know.
I know there is no binding documentation for the compatible either.
Thanks to everyone reviewing and bringing ideas into the discussion.
Eugen
Changelog since the v1 of the RFC:
- Reworked the whole minidump implementation based on suggestions from Thomas Gleixner.
This means new API, macros, new way to store the regions inside kmemdump
(ditched the IDR, moved to static allocation, have a static default backend, etc)
- Reworked qcom_minidump driver based on review from Bjorn Andersson
- Reworked printk log buffer registration based on review from Petr Mladek
I apologize if I missed any review comments. I know there is still a lot of work
to do on this series, and I hope to keep improving it.
Patches are sent on top of next-20250721
Eugen Hristev (29):
kmemdump: introduce kmemdump
Documentation: add kmemdump
kmemdump: add coreimage ELF layer
Documentation: kmemdump: add section for coreimage ELF
kmemdump: introduce qcom-minidump backend driver
soc: qcom: smem: add minidump device
init/version: Annotate static information into Kmemdump
cpu: Annotate static information into Kmemdump
genirq/irqdesc: Annotate static information into Kmemdump
panic: Annotate static information into Kmemdump
sched/core: Annotate static information into Kmemdump
timers: Annotate static information into Kmemdump
kernel/fork: Annotate static information into Kmemdump
mm/page_alloc: Annotate static information into Kmemdump
mm/init-mm: Annotate static information into Kmemdump
mm/show_mem: Annotate static information into Kmemdump
mm/swapfile: Annotate static information into Kmemdump
mm/percpu: Annotate static information into Kmemdump
mm/mm_init: Annotate static information into Kmemdump
printk: Register information into Kmemdump
kernel/configs: Register dynamic information into Kmemdump
mm/numa: Register information into Kmemdump
mm/sparse: Register information into Kmemdump
kernel/vmcore_info: Register dynamic information into Kmemdump
kmemdump: Add additional symbols to the coreimage
init/version: Annotate init uts name separately into Kmemdump
kallsyms: Annotate static information into Kmemdump
mm/init-mm: Annotate additional information into Kmemdump
kmemdump: Add Kinfo backend driver
Documentation/debug/index.rst | 17 ++
Documentation/debug/kmemdump.rst | 104 +++++++++
MAINTAINERS | 18 ++
drivers/Kconfig | 4 +
drivers/Makefile | 2 +
drivers/debug/Kconfig | 55 +++++
drivers/debug/Makefile | 6 +
drivers/debug/kinfo.c | 304 +++++++++++++++++++++++++
drivers/debug/kmemdump.c | 239 +++++++++++++++++++
drivers/debug/kmemdump_coreimage.c | 223 ++++++++++++++++++
drivers/debug/qcom_minidump.c | 353 +++++++++++++++++++++++++++++
drivers/soc/qcom/smem.c | 10 +
include/asm-generic/vmlinux.lds.h | 13 ++
include/linux/kmemdump.h | 219 ++++++++++++++++++
init/version.c | 6 +
kernel/configs.c | 6 +
kernel/cpu.c | 5 +
kernel/fork.c | 2 +
kernel/irq/irqdesc.c | 2 +
kernel/kallsyms.c | 10 +
kernel/panic.c | 4 +
kernel/printk/printk.c | 28 ++-
kernel/sched/core.c | 2 +
kernel/time/timer.c | 3 +-
kernel/vmcore_info.c | 3 +
mm/init-mm.c | 12 +
mm/mm_init.c | 2 +
mm/numa.c | 5 +-
mm/page_alloc.c | 2 +
mm/percpu.c | 3 +
mm/show_mem.c | 2 +
mm/sparse.c | 16 +-
mm/swapfile.c | 2 +
33 files changed, 1670 insertions(+), 12 deletions(-)
create mode 100644 Documentation/debug/index.rst
create mode 100644 Documentation/debug/kmemdump.rst
create mode 100644 drivers/debug/Kconfig
create mode 100644 drivers/debug/Makefile
create mode 100644 drivers/debug/kinfo.c
create mode 100644 drivers/debug/kmemdump.c
create mode 100644 drivers/debug/kmemdump_coreimage.c
create mode 100644 drivers/debug/qcom_minidump.c
create mode 100644 include/linux/kmemdump.h
--
2.43.0
* [RFC][PATCH v2 01/29] kmemdump: introduce kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-26 3:33 ` Randy Dunlap
2025-07-26 3:36 ` Randy Dunlap
2025-07-24 13:54 ` [RFC][PATCH v2 02/29] Documentation: add kmemdump Eugen Hristev
` (28 subsequent siblings)
29 siblings, 2 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
The kmemdump mechanism allows any driver to mark a specific memory area
for later dumping, depending on the functionality
of the attached backend. The backend interfaces with any hardware
mechanism that allows dumping to complete regardless of the
state of the kernel (running, frozen, crashed, or any other
state).
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
MAINTAINERS | 6 +
drivers/Kconfig | 4 +
drivers/Makefile | 2 +
drivers/debug/Kconfig | 16 +++
drivers/debug/Makefile | 3 +
drivers/debug/kmemdump.c | 214 ++++++++++++++++++++++++++++++
include/asm-generic/vmlinux.lds.h | 13 ++
include/linux/kmemdump.h | 135 +++++++++++++++++++
8 files changed, 393 insertions(+)
create mode 100644 drivers/debug/Kconfig
create mode 100644 drivers/debug/Makefile
create mode 100644 drivers/debug/kmemdump.c
create mode 100644 include/linux/kmemdump.h
diff --git a/MAINTAINERS b/MAINTAINERS
index 70d1a0a62a8e..7e8da575025c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13617,6 +13617,12 @@ L: linux-iio@vger.kernel.org
S: Supported
F: drivers/iio/accel/kionix-kx022a*
+KMEMDUMP
+M: Eugen Hristev <eugen.hristev@linaro.org>
+S: Maintained
+F: drivers/debug/kmemdump.c
+F: include/linux/kmemdump.h
+
KMEMLEAK
M: Catalin Marinas <catalin.marinas@arm.com>
S: Maintained
diff --git a/drivers/Kconfig b/drivers/Kconfig
index e0777f5ed543..412ef182d5c2 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -245,4 +245,8 @@ source "drivers/hte/Kconfig"
source "drivers/cdx/Kconfig"
+source "drivers/dpll/Kconfig"
+
+source "drivers/debug/Kconfig"
+
endmenu
diff --git a/drivers/Makefile b/drivers/Makefile
index b5749cf67044..e4cc23f4aba2 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -196,3 +196,5 @@ obj-$(CONFIG_CDX_BUS) += cdx/
obj-$(CONFIG_DPLL) += dpll/
obj-$(CONFIG_S390) += s390/
+
+obj-y += debug/
diff --git a/drivers/debug/Kconfig b/drivers/debug/Kconfig
new file mode 100644
index 000000000000..b86585c5d621
--- /dev/null
+++ b/drivers/debug/Kconfig
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0
+menu "Generic Debug Options"
+
+config KMEMDUMP
+ bool "Allow the kernel to register memory regions for dumping purpose"
+ help
+ Kmemdump mechanism allows any driver to register a specific memory
+ area for later dumping purpose, depending on the functionality
+ of the attached backend. The backend would interface any hardware
+ mechanism that will allow dumping to happen regardless of the
+ state of the kernel (running, frozen, crashed, or any particular
+ state).
+
+ Note that modules using this feature must be rebuilt if option
+ changes.
+endmenu
diff --git a/drivers/debug/Makefile b/drivers/debug/Makefile
new file mode 100644
index 000000000000..8ed6ec2d8a0d
--- /dev/null
+++ b/drivers/debug/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_KMEMDUMP) += kmemdump.o
diff --git a/drivers/debug/kmemdump.c b/drivers/debug/kmemdump.c
new file mode 100644
index 000000000000..b6d418aafbef
--- /dev/null
+++ b/drivers/debug/kmemdump.c
@@ -0,0 +1,214 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/device.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+#include <linux/kmemdump.h>
+
+#define MAX_ZONES 201
+
+static int default_register_region(const struct kmemdump_backend *be,
+ enum kmemdump_uid id, void *area, size_t sz)
+{
+ return 0;
+}
+
+static int default_unregister_region(const struct kmemdump_backend *be,
+ enum kmemdump_uid id)
+{
+ return 0;
+}
+
+static const struct kmemdump_backend kmemdump_default_backend = {
+ .name = "default",
+ .register_region = default_register_region,
+ .unregister_region = default_unregister_region,
+};
+
+static const struct kmemdump_backend *backend = &kmemdump_default_backend;
+static DEFINE_MUTEX(kmemdump_lock);
+static struct kmemdump_zone kmemdump_zones[MAX_ZONES];
+
+static int __init init_kmemdump(void)
+{
+ const struct kmemdump_zone *e;
+
+ /* Walk the kmemdump section for static variables and register them */
+ for_each_kmemdump_entry(e)
+ kmemdump_register_id(e->id, e->zone, e->size);
+
+ return 0;
+}
+late_initcall(init_kmemdump);
+
+/**
+ * kmemdump_register_id() - Register region into kmemdump with given ID.
+ * @req_id: Requested unique kmemdump_uid that identifies the region
+ * This can be KMEMDUMP_ID_NO_ID, in which case the function will
+ * find an unused ID and return it.
+ * @zone: pointer to the zone of memory
+ * @size: region size
+ *
+ * Return: On success, it returns the unique id for the region.
+ * On failure, it returns negative error value.
+ */
+int kmemdump_register_id(enum kmemdump_uid req_id, void *zone, size_t size)
+{
+ struct kmemdump_zone *z;
+ enum kmemdump_uid uid = req_id;
+ int ret;
+
+ if (uid < KMEMDUMP_ID_START)
+ return -EINVAL;
+
+ if (uid >= MAX_ZONES)
+ return -ENOSPC;
+
+ mutex_lock(&kmemdump_lock);
+
+ if (uid == KMEMDUMP_ID_NO_ID)
+ while (uid < MAX_ZONES) {
+ if (!kmemdump_zones[uid].id)
+ break;
+ uid++;
+ }
+
+ if (uid == MAX_ZONES) {
+ mutex_unlock(&kmemdump_lock);
+ return -ENOSPC;
+ }
+
+ z = &kmemdump_zones[uid];
+
+ if (z->id) {
+ mutex_unlock(&kmemdump_lock);
+ return -EALREADY;
+ }
+
+ ret = backend->register_region(backend, uid, zone, size);
+ if (ret) {
+ mutex_unlock(&kmemdump_lock);
+ return ret;
+ }
+
+ z->zone = zone;
+ z->size = size;
+ z->id = uid;
+
+ mutex_unlock(&kmemdump_lock);
+
+ return uid;
+}
+EXPORT_SYMBOL_GPL(kmemdump_register_id);
+
+/**
+ * kmemdump_unregister() - Unregister region from kmemdump.
+ * @id: unique id that was returned when this region was successfully
+ * registered initially.
+ *
+ * Return: None
+ */
+void kmemdump_unregister(enum kmemdump_uid id)
+{
+ struct kmemdump_zone *z = NULL;
+
+ mutex_lock(&kmemdump_lock);
+
+ z = &kmemdump_zones[id];
+ if (!z->id) {
+ mutex_unlock(&kmemdump_lock);
+ return;
+ }
+
+ backend->unregister_region(backend, z->id);
+
+ memset(z, 0, sizeof(*z));
+
+ mutex_unlock(&kmemdump_lock);
+}
+EXPORT_SYMBOL_GPL(kmemdump_unregister);
+
+/**
+ * kmemdump_register_backend() - Register a backend into kmemdump.
+ * @be: Pointer to a driver allocated backend. This backend must have
+ * two callbacks for registering and deregistering a zone from the
+ * backend.
+ *
+ * Only one backend is supported at a time.
+ *
+ * Return: On success, it returns 0, negative error value otherwise.
+ */
+int kmemdump_register_backend(const struct kmemdump_backend *be)
+{
+ enum kmemdump_uid uid;
+ int ret;
+
+ if (!be || !be->register_region || !be->unregister_region)
+ return -EINVAL;
+
+ mutex_lock(&kmemdump_lock);
+
+ /* Try to call the old backend for all existing regions */
+ for (uid = KMEMDUMP_ID_START; uid < MAX_ZONES; uid++)
+ if (kmemdump_zones[uid].id)
+ backend->unregister_region(backend,
+ kmemdump_zones[uid].id);
+
+ backend = be;
+ pr_debug("kmemdump backend %s registered successfully.\n",
+ backend->name);
+
+ /* Call the new backend for all existing regions */
+ for (uid = KMEMDUMP_ID_START; uid < MAX_ZONES; uid++) {
+ if (!kmemdump_zones[uid].id)
+ continue;
+ ret = backend->register_region(backend,
+ kmemdump_zones[uid].id,
+ kmemdump_zones[uid].zone,
+ kmemdump_zones[uid].size);
+ if (ret)
+ pr_debug("register region failed with %d\n", ret);
+ }
+
+ mutex_unlock(&kmemdump_lock);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(kmemdump_register_backend);
+
+/**
+ * kmemdump_unregister_backend() - Unregister the backend from kmemdump.
+ * @be: Pointer to a driver allocated backend. This backend must match
+ * the initially registered backend.
+ *
+ * Only one backend is supported at a time.
+ * Before deregistering, this will call the backend to unregister all the
+ * previously registered zones.
+ *
+ * Return: None
+ */
+void kmemdump_unregister_backend(const struct kmemdump_backend *be)
+{
+ enum kmemdump_uid uid;
+
+ mutex_lock(&kmemdump_lock);
+
+ if (backend != be) {
+ mutex_unlock(&kmemdump_lock);
+ return;
+ }
+
+ /* Try to call the old backend for all existing regions */
+ for (uid = KMEMDUMP_ID_START; uid < MAX_ZONES; uid++)
+ if (kmemdump_zones[uid].id)
+ backend->unregister_region(backend,
+ kmemdump_zones[uid].id);
+
+ pr_debug("kmemdump backend %s removed successfully.\n", be->name);
+
+ backend = &kmemdump_default_backend;
+
+ mutex_unlock(&kmemdump_lock);
+}
+EXPORT_SYMBOL_GPL(kmemdump_unregister_backend);
+
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index fa5f19b8d53a..433719442a5e 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -488,6 +488,8 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
FW_LOADER_BUILT_IN_DATA \
TRACEDATA \
\
+ KMEMDUMP_TABLE \
+ \
PRINTK_INDEX \
\
/* Kernel symbol table: Normal symbols */ \
@@ -891,6 +893,17 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
#define TRACEDATA
#endif
+#ifdef CONFIG_KMEMDUMP
+#define KMEMDUMP_TABLE \
+ . = ALIGN(8); \
+ .kmemdump : AT(ADDR(.kmemdump) - LOAD_OFFSET) { \
+ BOUNDED_SECTION_POST_LABEL(.kmemdump, __kmemdump_table, \
+ , _end) \
+ }
+#else
+#define KMEMDUMP_TABLE
+#endif
+
#ifdef CONFIG_PRINTK_INDEX
#define PRINTK_INDEX \
.printk_index : AT(ADDR(.printk_index) - LOAD_OFFSET) { \
diff --git a/include/linux/kmemdump.h b/include/linux/kmemdump.h
new file mode 100644
index 000000000000..c3690423a347
--- /dev/null
+++ b/include/linux/kmemdump.h
@@ -0,0 +1,135 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#ifndef _KMEMDUMP_H
+#define _KMEMDUMP_H
+
+enum kmemdump_uid {
+ KMEMDUMP_ID_START = 0,
+ KMEMDUMP_ID_USER_START,
+ KMEMDUMP_ID_USER_END,
+ KMEMDUMP_ID_NO_ID,
+};
+
+#ifdef CONFIG_KMEMDUMP
+/**
+ * struct kmemdump_zone - region mark zone information
+ * @id: unique id for this zone
+ * @zone: pointer to the memory area for this zone
+ * @size: size of the memory area of this zone
+ */
+struct kmemdump_zone {
+ enum kmemdump_uid id;
+ void *zone;
+ size_t size;
+};
+
+/* kmemdump section table markers */
+extern const struct kmemdump_zone __kmemdump_table[];
+extern const struct kmemdump_zone __kmemdump_table_end[];
+
+/* Annotate a variable into the given kmemdump UID */
+#define KMEMDUMP_VAR_ID(idx, sym, sz) \
+ static const struct kmemdump_zone __UNIQUE_ID(__kmemdump_entry_##sym) \
+ __used __section(".kmemdump") = { .id = idx, \
+ .zone = (void *)&(sym), \
+ .size = (sz), \
+ }
+
+/* Iterate through kmemdump section entries */
+#define for_each_kmemdump_entry(__entry) \
+ for (__entry = __kmemdump_table; \
+ __entry < __kmemdump_table_end; \
+ __entry++)
+
+#else
+#define KMEMDUMP_VAR_ID(...)
+#endif
+/*
+ * Wrapper over an existing fn allocator
+ * It will :
+ * - unregister the memory already registered into kmemdump at the given UID
+ * - register the memory into kmemdump at the given UID
+ * - take an argument for the ID and the wanted size
+ */
+#define kmemdump_alloc_id_size_replace(id, sz, fn, ...) \
+ ({ \
+ void *__p = fn(__VA_ARGS__); \
+ \
+ if (__p) { \
+ kmemdump_unregister(id); \
+ kmemdump_register_id(id, __p, sz); \
+ } \
+ __p; \
+ })
+/*
+ * Wrapper over an existing fn allocator
+ * It will :
+ * - fail if the given UID is already registered
+ * - register the memory into kmemdump at the given UID
+ * - take an argument for the ID and the wanted size
+ */
+
+#define kmemdump_alloc_id_size(id, sz, fn, ...) \
+ ({ \
+ void *__p = fn(__VA_ARGS__); \
+ \
+ if (__p) \
+ kmemdump_register_id(id, __p, sz); \
+ __p; \
+ })
+
+#define kmemdump_alloc_size(...) \
+ kmemdump_alloc_id_size(KMEMDUMP_ID_NO_ID, __VA_ARGS__)
+
+#define kmemdump_phys_alloc_id_size(id, sz, fn, ...) \
+ ({ \
+ phys_addr_t __p = fn(__VA_ARGS__); \
+ \
+ if (__p) \
+ kmemdump_register_id(id, __va(__p), sz); \
+ __p; \
+ })
+
+#define kmemdump_phys_alloc_size(...) \
+ kmemdump_phys_alloc_id_size(KMEMDUMP_ID_NO_ID, __VA_ARGS__)
+
+#define kmemdump_free_id(id, fn, ...) \
+ ({ \
+ kmemdump_unregister(id); \
+ fn(__VA_ARGS__); \
+ })
+
+#ifdef CONFIG_KMEMDUMP
+
+#define KMEMDUMP_BACKEND_MAX_NAME 128
+/**
+ * struct kmemdump_backend - region mark backend information
+ * @name: the name of the backend
+ * @register_region: callback to register region in the backend
+ * @unregister_region: callback to unregister region in the backend
+ */
+struct kmemdump_backend {
+ char name[KMEMDUMP_BACKEND_MAX_NAME];
+ int (*register_region)(const struct kmemdump_backend *be,
+ enum kmemdump_uid uid, void *vaddr, size_t size);
+ int (*unregister_region)(const struct kmemdump_backend *be,
+ enum kmemdump_uid uid);
+};
+
+int kmemdump_register_backend(const struct kmemdump_backend *backend);
+void kmemdump_unregister_backend(const struct kmemdump_backend *backend);
+
+int kmemdump_register_id(enum kmemdump_uid id, void *zone, size_t size);
+void kmemdump_unregister(enum kmemdump_uid id);
+#else
+static inline int kmemdump_register_id(enum kmemdump_uid uid, void *area,
+ size_t size)
+{
+ return 0;
+}
+
+static inline void kmemdump_unregister(enum kmemdump_uid id)
+{
+}
+#endif
+
+#endif
--
2.43.0
* [RFC][PATCH v2 02/29] Documentation: add kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 01/29] kmemdump: " Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 14:13 ` Jonathan Corbet
2025-07-24 13:54 ` [RFC][PATCH v2 03/29] kmemdump: add coreimage ELF layer Eugen Hristev
` (27 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Document the new kmemdump kernel feature.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
Documentation/debug/index.rst | 17 ++++++
Documentation/debug/kmemdump.rst | 98 ++++++++++++++++++++++++++++++++
MAINTAINERS | 1 +
3 files changed, 116 insertions(+)
create mode 100644 Documentation/debug/index.rst
create mode 100644 Documentation/debug/kmemdump.rst
diff --git a/Documentation/debug/index.rst b/Documentation/debug/index.rst
new file mode 100644
index 000000000000..9a9365c62f02
--- /dev/null
+++ b/Documentation/debug/index.rst
@@ -0,0 +1,17 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========
+kmemdump
+========
+
+.. toctree::
+ :maxdepth: 1
+
+ kmemdump
+
+.. only:: subproject and html
+
+ Indices
+ =======
+
+ * :ref:`genindex`
diff --git a/Documentation/debug/kmemdump.rst b/Documentation/debug/kmemdump.rst
new file mode 100644
index 000000000000..3301abcaed7e
--- /dev/null
+++ b/Documentation/debug/kmemdump.rst
@@ -0,0 +1,98 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+==========================
+kmemdump
+==========================
+
+This document provides information about the kmemdump feature.
+
+Overview
+========
+
+kmemdump is a mechanism that allows any driver or producer to register a
+chunk of memory into kmemdump, to be used at a later time for a specific
+purpose like debugging or memory dumping.
+
+kmemdump allows a backend to be connected, this backend interfaces a
+specific hardware that can debug or dump the memory registered into
+kmemdump.
+
+kmemdump Internals
+==================
+
+API
+----
+
+A memory region is registered with a call to `kmemdump_register_id`, which
+takes as parameters the ID of the region, a pointer to the virtual memory
+start address and the size. If successful, this call returns a unique ID for
+the allocated zone (either the requested ID or an allocated ID).
+IDs are predefined in the kmemdump header. A second registration with the
+same ID is not allowed; the caller needs to unregister first.
+A dedicated `KMEMDUMP_ID_NO_ID` is defined, which has kmemdump allocate a
+new unique ID for the request and return it. This case is useful with
+multiple dynamic loop allocations where the ID is not significant.
+
+The region is unregistered with a call to `kmemdump_unregister`, which
+takes the ID as a parameter.
+
+For dynamically allocated memory, kmemdump defines a variety of wrappers
+on top of allocation functions which are given as parameters.
+This makes the dynamic allocation easy to use without additional calls
+to registration functions. However kmemdump still exposes the register API
+for cases where it may be needed (e.g. size is not exactly known at allocation
+time).
+
+For static variables, a variety of annotation macros are provided. These
+macros will create an annotation struct inside a separate section.
+
+
+Backend
+-------
+
+The backend is represented by a `struct kmemdump_backend` which has to be
+filled in by the backend driver. This struct is then passed to kmemdump with
+a `kmemdump_register_backend` call. `kmemdump_unregister_backend` removes the
+backend from kmemdump.
+
+Once a backend is registered, all previously registered regions are
+sent to the backend for registration.
+
+When the backend is removed, all regions are first unregistered
+from the backend.
+
+kmemdump requests the backend to register a region with the `register_region`
+call, and to unregister a region with the `unregister_region` call. A backend
+must provide both functions at registration time.
+
+Data structures
+---------------
+
+`struct kmemdump_backend` represents the kmemdump backend and it has two
+function pointers, one called `register_region` and the other
+`unregister_region`.
+There is a default no-op backend that is initially registered
+and is registered back if the current working backend is removed.
+
+The regions are stored in a simple fixed size array. It avoids
+memory allocation overhead. This is not performance critical nor does
+allocating a few hundred entries create a memory consumption problem.
+
+The static variables registered into kmemdump are annotated into
+a dedicated `.kmemdump` memory section. This is then walked by kmemdump
+at a later time and each variable is registered.
+
+kmemdump Initialization
+-----------------------
+
+After the system boots, kmemdump is ready to accept region registration
+from producer drivers. Even if the backend may not be registered yet,
+there is a default no-op backend that is registered. At any time the backend
+can be replaced with a real backend, in which case all regions are
+registered to the new backend.
+
+Backend functionality
+---------------------
+
+A kmemdump backend can keep its own list of regions and use the specific
+hardware available to dump the memory regions or use them for debugging.
diff --git a/MAINTAINERS b/MAINTAINERS
index 7e8da575025c..ef0ffdfaf3de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13620,6 +13620,7 @@ F: drivers/iio/accel/kionix-kx022a*
KMEMDUMP
M: Eugen Hristev <eugen.hristev@linaro.org>
S: Maintained
+F: Documentation/debug/kmemdump.rst
F: drivers/debug/kmemdump.c
F: include/linux/kmemdump.h
--
2.43.0
* [RFC][PATCH v2 03/29] kmemdump: add coreimage ELF layer
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 01/29] kmemdump: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 02/29] Documentation: add kmemdump Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 04/29] Documentation: kmemdump: add section for coreimage ELF Eugen Hristev
` (26 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Implement kmemdumping into an ELF coreimage.
With this feature enabled, kmemdump will assemble all the regions
into a coreimage, by having an initial first region with an ELF header,
a second region with vmcoreinfo data, and then register vital kernel
information in the subsequent regions.
This image can then be dumped, assembled into a single file and loaded
into debugging tools like crash/gdb.
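The program header bookkeeping described above can be illustrated with a
small userspace sketch. It is not the kernel code from this patch: the
`sketch_*` names, the 8-entry table and the use of the userspace `<elf.h>`
types are assumptions made for illustration. It only shows how each region
gets a PT_LOAD entry at a running file offset, and how removing a region
pulls the offsets of the later entries back.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_PHDRS 8
#define SKETCH_ALIGN4(x) (((x) + 3u) & ~(size_t)3u)

/* Hypothetical stand-in for the header region and its running offset. */
struct sketch_image {
	Elf64_Ehdr ehdr;
	Elf64_Phdr phdr[SKETCH_MAX_PHDRS];
	size_t next_offset;	/* file offset where the next region's data lands */
};

void sketch_init(struct sketch_image *img)
{
	memset(img, 0, sizeof(*img));
	memcpy(img->ehdr.e_ident, ELFMAG, SELFMAG);
	img->ehdr.e_ident[EI_CLASS] = ELFCLASS64;
	img->ehdr.e_ident[EI_DATA] = ELFDATA2LSB;
	img->ehdr.e_ident[EI_VERSION] = EV_CURRENT;
	img->ehdr.e_type = ET_CORE;
	img->ehdr.e_version = EV_CURRENT;
	img->ehdr.e_ehsize = sizeof(Elf64_Ehdr);
	img->ehdr.e_phentsize = sizeof(Elf64_Phdr);
	img->ehdr.e_phoff = offsetof(struct sketch_image, phdr);
	/* Region data starts right after the header and the full phdr table. */
	img->next_offset = sizeof(img->ehdr) + sizeof(img->phdr);
}

/* Append a PT_LOAD entry; returns its index, or -1 if the table is full. */
int sketch_add(struct sketch_image *img, uint64_t vaddr, uint64_t paddr,
	       size_t size)
{
	Elf64_Phdr *p;

	if (img->ehdr.e_phnum >= SKETCH_MAX_PHDRS)
		return -1;
	p = &img->phdr[img->ehdr.e_phnum];
	p->p_type = PT_LOAD;
	p->p_flags = PF_R | PF_W;
	p->p_offset = img->next_offset;
	p->p_vaddr = vaddr;
	p->p_paddr = paddr;
	p->p_filesz = p->p_memsz = SKETCH_ALIGN4(size);
	img->next_offset += SKETCH_ALIGN4(size);
	return img->ehdr.e_phnum++;
}

/* Remove the entry matching paddr/size and pull later offsets back. */
int sketch_remove(struct sketch_image *img, uint64_t paddr, size_t size)
{
	size_t len = SKETCH_ALIGN4(size);
	unsigned int i, n = img->ehdr.e_phnum;

	for (i = 0; i < n; i++)
		if (img->phdr[i].p_type == PT_LOAD &&
		    img->phdr[i].p_paddr == paddr &&
		    img->phdr[i].p_memsz == len)
			break;
	if (i == n)
		return -1;
	for (; i + 1 < n; i++) {
		img->phdr[i] = img->phdr[i + 1];
		img->phdr[i].p_offset -= len;
	}
	memset(&img->phdr[n - 1], 0, sizeof(img->phdr[0]));
	img->ehdr.e_phnum--;
	img->next_offset -= len;
	return 0;
}
```

Reserving the whole program header table up front, as the sketch does, keeps
`e_phoff` and the offset of the first data region stable while entries come
and go.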
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
MAINTAINERS | 1 +
drivers/debug/Kconfig | 14 ++
drivers/debug/Makefile | 1 +
drivers/debug/kmemdump.c | 25 ++++
drivers/debug/kmemdump_coreimage.c | 223 +++++++++++++++++++++++++++++
include/linux/kmemdump.h | 70 ++++++++-
6 files changed, 333 insertions(+), 1 deletion(-)
create mode 100644 drivers/debug/kmemdump_coreimage.c
diff --git a/MAINTAINERS b/MAINTAINERS
index ef0ffdfaf3de..b43a43b61e19 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13622,6 +13622,7 @@ M: Eugen Hristev <eugen.hristev@linaro.org>
S: Maintained
F: Documentation/debug/kmemdump.rst
F: drivers/debug/kmemdump.c
+F: drivers/debug/kmemdump_coreimage.c
F: include/linux/kmemdump.h
KMEMLEAK
diff --git a/drivers/debug/Kconfig b/drivers/debug/Kconfig
index b86585c5d621..903e3e2805b7 100644
--- a/drivers/debug/Kconfig
+++ b/drivers/debug/Kconfig
@@ -13,4 +13,18 @@ config KMEMDUMP
Note that modules using this feature must be rebuilt if option
changes.
+
+config KMEMDUMP_COREIMAGE
+ depends on KMEMDUMP
+ select VMCORE_INFO
+ bool "Assemble memory regions into a coredump readable with debuggers"
+ help
+ Enabling this will assemble all the memory regions into a
+ core ELF file. The first region will include program headers for
+ all the regions. The second region is the vmcoreinfo and specific
+ coredump structures.
+ All the other regions follow. Specific kernel variables required
+ for debug tools are being registered.
+ The coredump file can then be loaded into GDB or crash tool and
+ further inspected.
endmenu
diff --git a/drivers/debug/Makefile b/drivers/debug/Makefile
index 8ed6ec2d8a0d..2b67673393a6 100644
--- a/drivers/debug/Makefile
+++ b/drivers/debug/Makefile
@@ -1,3 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_KMEMDUMP) += kmemdump.o
+obj-$(CONFIG_KMEMDUMP_COREIMAGE) += kmemdump_coreimage.o
diff --git a/drivers/debug/kmemdump.c b/drivers/debug/kmemdump.c
index b6d418aafbef..6dd49359d8ef 100644
--- a/drivers/debug/kmemdump.c
+++ b/drivers/debug/kmemdump.c
@@ -28,15 +28,34 @@ static const struct kmemdump_backend kmemdump_default_backend = {
static const struct kmemdump_backend *backend = &kmemdump_default_backend;
static DEFINE_MUTEX(kmemdump_lock);
static struct kmemdump_zone kmemdump_zones[MAX_ZONES];
+static bool kmemdump_initialized;
static int __init init_kmemdump(void)
{
const struct kmemdump_zone *e;
+ enum kmemdump_uid uid;
+
+ init_elfheader();
/* Walk the kmemdump section for static variables and register them */
for_each_kmemdump_entry(e)
kmemdump_register_id(e->id, e->zone, e->size);
+ mutex_lock(&kmemdump_lock);
+ /*
+ * Some regions may have been registered very early.
+ * Update the elf header for all existing regions,
+ * except for KMEMDUMP_ID_COREIMAGE_ELF and
+ * KMEMDUMP_ID_COREIMAGE_VMCOREINFO, those are included in the
+ * ELF header upon its creation.
+ */
+ for (uid = KMEMDUMP_ID_COREIMAGE_CONFIG; uid < MAX_ZONES; uid++)
+ if (kmemdump_zones[uid].id)
+ update_elfheader(&kmemdump_zones[uid]);
+
+ kmemdump_initialized = true;
+ mutex_unlock(&kmemdump_lock);
+
return 0;
}
late_initcall(init_kmemdump);
@@ -95,6 +114,9 @@ int kmemdump_register_id(enum kmemdump_uid req_id, void *zone, size_t size)
z->size = size;
z->id = uid;
+ if (kmemdump_initialized)
+ update_elfheader(z);
+
mutex_unlock(&kmemdump_lock);
return uid;
@@ -122,6 +144,9 @@ void kmemdump_unregister(enum kmemdump_uid id)
backend->unregister_region(backend, z->id);
+ if (kmemdump_initialized)
+ clear_elfheader(z);
+
memset(z, 0, sizeof(*z));
mutex_unlock(&kmemdump_lock);
diff --git a/drivers/debug/kmemdump_coreimage.c b/drivers/debug/kmemdump_coreimage.c
new file mode 100644
index 000000000000..2cdab22d0c5c
--- /dev/null
+++ b/drivers/debug/kmemdump_coreimage.c
@@ -0,0 +1,223 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/elfcore.h>
+#include <linux/kmemdump.h>
+#include <linux/vmcore_info.h>
+
+#define CORE_STR "CORE"
+
+#define MAX_NUM_ENTRIES 201
+
+static struct elfhdr *ehdr;
+static size_t elf_offset;
+
+static void append_kcore_note(char *notes, size_t *i, const char *name,
+ unsigned int type, const void *desc,
+ size_t descsz)
+{
+ struct elf_note *note = (struct elf_note *)&notes[*i];
+
+ note->n_namesz = strlen(name) + 1;
+ note->n_descsz = descsz;
+ note->n_type = type;
+ *i += sizeof(*note);
+ memcpy(&notes[*i], name, note->n_namesz);
+ *i = ALIGN(*i + note->n_namesz, 4);
+ memcpy(&notes[*i], desc, descsz);
+ *i = ALIGN(*i + descsz, 4);
+}
+
+static void append_kcore_note_nodesc(char *notes, size_t *i, const char *name,
+ unsigned int type, size_t descsz)
+{
+ struct elf_note *note = (struct elf_note *)&notes[*i];
+
+ note->n_namesz = strlen(name) + 1;
+ note->n_descsz = descsz;
+ note->n_type = type;
+ *i += sizeof(*note);
+ memcpy(&notes[*i], name, note->n_namesz);
+ *i = ALIGN(*i + note->n_namesz, 4);
+}
+
+static struct elf_phdr *elf_phdr_entry_addr(struct elfhdr *ehdr, int idx)
+{
+ struct elf_phdr *ephdr = (struct elf_phdr *)((size_t)ehdr + ehdr->e_phoff);
+
+ return &ephdr[idx];
+}
+
+/**
+ * clear_elfheader() - Remove the program header for a specific memory zone
+ * @z: pointer to the kmemdump zone
+ *
+ * Return: On success, it returns 0, errno otherwise
+ */
+int clear_elfheader(const struct kmemdump_zone *z)
+{
+ struct elf_phdr *phdr;
+ struct elf_phdr *tmp_phdr;
+ unsigned int phidx;
+ unsigned int i;
+
+ for (i = 0; i < ehdr->e_phnum; i++) {
+ phdr = elf_phdr_entry_addr(ehdr, i);
+ if (phdr->p_paddr == virt_to_phys(z->zone) &&
+ phdr->p_memsz == ALIGN(z->size, 4))
+ break;
+ }
+
+ if (i == ehdr->e_phnum) {
+ pr_debug("Cannot find program header entry in elf\n");
+ return -EINVAL;
+ }
+
+ phidx = i;
+
+ /* Clear program header */
+ tmp_phdr = elf_phdr_entry_addr(ehdr, phidx);
+ for (i = phidx; i < ehdr->e_phnum - 1; i++) {
+ tmp_phdr = elf_phdr_entry_addr(ehdr, i + 1);
+ phdr = elf_phdr_entry_addr(ehdr, i);
+ memcpy(phdr, tmp_phdr, sizeof(*phdr));
+ phdr->p_offset = phdr->p_offset - ALIGN(z->size, 4);
+ }
+ memset(tmp_phdr, 0, sizeof(*tmp_phdr));
+ ehdr->e_phnum--;
+
+ elf_offset -= ALIGN(z->size, 4);
+
+ return 0;
+}
+
+/**
+ * update_elfheader() - Add the program header for a specific memory zone
+ * @z: pointer to the kmemdump zone
+ *
+ * Return: None
+ */
+void update_elfheader(const struct kmemdump_zone *z)
+{
+ struct elf_phdr *phdr;
+
+ phdr = elf_phdr_entry_addr(ehdr, ehdr->e_phnum++);
+
+ phdr->p_type = PT_LOAD;
+ phdr->p_offset = elf_offset;
+ phdr->p_vaddr = (elf_addr_t)z->zone;
+ phdr->p_paddr = (elf_addr_t)virt_to_phys(z->zone);
+ phdr->p_filesz = phdr->p_memsz = ALIGN(z->size, 4);
+ phdr->p_flags = PF_R | PF_W;
+
+ elf_offset += ALIGN(z->size, 4);
+}
+
+/**
+ * init_elfheader() - Prepare coreinfo elf header
+ * This function prepares the elf header for the coredump image.
+ * Initially there is a single program header for the elf NOTE.
+ * The note contains the usual core dump information, and the
+ * vmcoreinfo.
+ *
+ * Return: 0 on success, errno otherwise
+ */
+int init_elfheader(void)
+{
+ struct elf_phdr *phdr;
+ void *notes;
+ unsigned int elfh_size;
+ unsigned int phdr_off;
+ size_t note_len, i = 0;
+
+ struct elf_prstatus prstatus = {};
+ struct elf_prpsinfo prpsinfo = {
+ .pr_sname = 'R',
+ .pr_fname = "vmlinux",
+ };
+
+ /*
+ * Header buffer contains:
+ * ELF header, Note entry with PR status, PR ps info, and vmcoreinfo
+ * MAX_NUM_ENTRIES Program headers,
+ */
+ elfh_size = sizeof(*ehdr);
+ elfh_size += sizeof(struct elf_prstatus);
+ elfh_size += sizeof(struct elf_prpsinfo);
+ elfh_size += sizeof(VMCOREINFO_NOTE_NAME);
+ elfh_size += ALIGN(vmcoreinfo_size, 4);
+ elfh_size += (sizeof(*phdr)) * (MAX_NUM_ENTRIES);
+
+ elfh_size = ALIGN(elfh_size, 4);
+
+ /* Never freed */
+ ehdr = kzalloc(elfh_size, GFP_KERNEL);
+ if (!ehdr)
+ return -ENOMEM;
+
+ /* Assign Program headers offset, it's right after the elf header. */
+ phdr = (struct elf_phdr *)(ehdr + 1);
+ phdr_off = sizeof(*ehdr);
+
+ memcpy(ehdr->e_ident, ELFMAG, SELFMAG);
+ ehdr->e_ident[EI_CLASS] = ELF_CLASS;
+ ehdr->e_ident[EI_DATA] = ELF_DATA;
+ ehdr->e_ident[EI_VERSION] = EV_CURRENT;
+ ehdr->e_ident[EI_OSABI] = ELF_OSABI;
+ ehdr->e_type = ET_CORE;
+ ehdr->e_machine = ELF_ARCH;
+ ehdr->e_version = EV_CURRENT;
+ ehdr->e_ehsize = sizeof(*ehdr);
+ ehdr->e_phentsize = sizeof(*phdr);
+
+ elf_offset = elfh_size;
+
+ notes = (void *)(((char *)ehdr) + elf_offset);
+
+ /* we have a single program header now */
+ ehdr->e_phnum = 1;
+
+ /* The length of the note is made up of:
+ * 3 elf notes structs (prstatus, prpsinfo, vmcoreinfo)
+ * 3 notes names (2 core strings, 1 vmcoreinfo name)
+ * sizeof each note
+ */
+ note_len = (3 * sizeof(struct elf_note) +
+ 2 * ALIGN(sizeof(CORE_STR), 4) +
+ VMCOREINFO_NOTE_NAME_BYTES +
+ ALIGN(sizeof(struct elf_prstatus), 4) +
+ ALIGN(sizeof(struct elf_prpsinfo), 4) +
+ ALIGN(vmcoreinfo_size, 4));
+
+ phdr->p_type = PT_NOTE;
+ phdr->p_offset = elf_offset;
+ phdr->p_filesz = note_len;
+
+ /* advance elf offset */
+ elf_offset += note_len;
+
+ strscpy(prpsinfo.pr_psargs, saved_command_line,
+ sizeof(prpsinfo.pr_psargs));
+
+ append_kcore_note(notes, &i, CORE_STR, NT_PRSTATUS, &prstatus,
+ sizeof(prstatus));
+ append_kcore_note(notes, &i, CORE_STR, NT_PRPSINFO, &prpsinfo,
+ sizeof(prpsinfo));
+ append_kcore_note_nodesc(notes, &i, VMCOREINFO_NOTE_NAME, 0,
+ ALIGN(vmcoreinfo_size, 4));
+
+ ehdr->e_phoff = phdr_off;
+
+ /* This is the first kmemdump region, the ELF header */
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_ELF, ehdr,
+ elfh_size + note_len - ALIGN(vmcoreinfo_size, 4));
+
+ /*
+ * The second region is the vmcoreinfo, which goes right after.
+ * It's being registered through vmcoreinfo.
+ */
+
+ return 0;
+}
+
diff --git a/include/linux/kmemdump.h b/include/linux/kmemdump.h
index c3690423a347..7933915c2c78 100644
--- a/include/linux/kmemdump.h
+++ b/include/linux/kmemdump.h
@@ -4,6 +4,37 @@
enum kmemdump_uid {
KMEMDUMP_ID_START = 0,
+ KMEMDUMP_ID_COREIMAGE_ELF,
+ KMEMDUMP_ID_COREIMAGE_VMCOREINFO,
+ KMEMDUMP_ID_COREIMAGE_CONFIG,
+ KMEMDUMP_ID_COREIMAGE_MEMSECT,
+ KMEMDUMP_ID_COREIMAGE__totalram_pages,
+ KMEMDUMP_ID_COREIMAGE___cpu_possible_mask,
+ KMEMDUMP_ID_COREIMAGE___cpu_present_mask,
+ KMEMDUMP_ID_COREIMAGE___cpu_online_mask,
+ KMEMDUMP_ID_COREIMAGE___cpu_active_mask,
+ KMEMDUMP_ID_COREIMAGE_jiffies_64,
+ KMEMDUMP_ID_COREIMAGE_linux_banner,
+ KMEMDUMP_ID_COREIMAGE_nr_threads,
+ KMEMDUMP_ID_COREIMAGE_nr_irqs,
+ KMEMDUMP_ID_COREIMAGE_tainted_mask,
+ KMEMDUMP_ID_COREIMAGE_taint_flags,
+ KMEMDUMP_ID_COREIMAGE_mem_section,
+ KMEMDUMP_ID_COREIMAGE_node_data,
+ KMEMDUMP_ID_COREIMAGE_node_states,
+ KMEMDUMP_ID_COREIMAGE___per_cpu_offset,
+ KMEMDUMP_ID_COREIMAGE_nr_swapfiles,
+ KMEMDUMP_ID_COREIMAGE_init_uts_ns,
+ KMEMDUMP_ID_COREIMAGE_printk_rb_static,
+ KMEMDUMP_ID_COREIMAGE_printk_rb_dynamic,
+ KMEMDUMP_ID_COREIMAGE_prb,
+ KMEMDUMP_ID_COREIMAGE_prb_descs,
+ KMEMDUMP_ID_COREIMAGE_prb_infos,
+ KMEMDUMP_ID_COREIMAGE_prb_data,
+ KMEMDUMP_ID_COREIMAGE_runqueues,
+ KMEMDUMP_ID_COREIMAGE_high_memory,
+ KMEMDUMP_ID_COREIMAGE_init_mm,
+ KMEMDUMP_ID_COREIMAGE_init_mm_pgd,
KMEMDUMP_ID_USER_START,
KMEMDUMP_ID_USER_END,
KMEMDUMP_ID_NO_ID,
@@ -33,7 +64,20 @@ extern const struct kmemdump_zone __kmemdump_table_end[];
.zone = (void *)&(sym), \
.size = (sz), \
}
-
+/* Annotate a variable into the KMEMDUMP_ID_COREIMAGE_sym UID */
+#define KMEMDUMP_VAR_CORE(sym, sz) \
+ static const struct kmemdump_zone __UNIQUE_ID(__kmemdump_entry_##sym) \
+ __used __section(".kmemdump") = { .id = KMEMDUMP_ID_COREIMAGE_##sym, \
+ .zone = (void *)&(sym), \
+ .size = (sz), \
+ }
+/* Annotate a variable into the KMEMDUMP_ID_COREIMAGE_name UID */
+#define KMEMDUMP_VAR_CORE_NAMED(name, sym, sz) \
+ static const struct kmemdump_zone __UNIQUE_ID(__kmemdump_entry_##name) \
+ __used __section(".kmemdump") = { .id = KMEMDUMP_ID_COREIMAGE_##name, \
+ .zone = (void *)&(sym), \
+ .size = (sz), \
+ }
/* Iterate through kmemdump section entries */
#define for_each_kmemdump_entry(__entry) \
for (__entry = __kmemdump_table; \
@@ -42,6 +86,9 @@ extern const struct kmemdump_zone __kmemdump_table_end[];
#else
#define KMEMDUMP_VAR_ID(...)
+#define KMEMDUMP_VAR_CORE(...)
+#define KMEMDUMP_VAR_CORE_NAMED(...)
+#define KMEMDUMP_VAR_CORE_NAMED(...)
#endif
/*
* Wrapper over an existing fn allocator
@@ -132,4 +179,25 @@ static inline void kmemdump_unregister(enum kmemdump_uid id)
}
#endif
+#ifdef CONFIG_KMEMDUMP
+#ifdef CONFIG_KMEMDUMP_COREIMAGE
+int init_elfheader(void);
+void update_elfheader(const struct kmemdump_zone *z);
+int clear_elfheader(const struct kmemdump_zone *z);
+#else
+static inline int init_elfheader(void)
+{
+ return 0;
+}
+
+static inline void update_elfheader(const struct kmemdump_zone *z)
+{
+}
+
+static inline int clear_elfheader(const struct kmemdump_zone *z)
+{
+ return 0;
+}
+#endif
+#endif
#endif
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 04/29] Documentation: kmemdump: add section for coreimage ELF
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (2 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 03/29] kmemdump: add coreimage ELF layer Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 05/29] kmemdump: introduce qcom-minidump backend driver Eugen Hristev
` (25 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Add section describing the utility of coreimage ELF generation for
kmemdump.
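For readers unfamiliar with the ELF NOTE layout that the coreimage relies on,
here is a minimal userspace sketch of how a single note is laid out: a fixed
header, the NUL-terminated name padded to 4 bytes, then the descriptor padded
to 4 bytes. The `sketch_*` helpers are hypothetical and use the userspace
`Elf64_Nhdr` type rather than the kernel's `struct elf_note`; the caller is
assumed to pass a zeroed buffer that is large enough and a 4-byte aligned
starting offset.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

#define NOTE_ALIGN4(x) (((x) + 3u) & ~(size_t)3u)

/*
 * Append one note at offset *pos in buf and advance *pos.
 * The caller must size buf for header + padded name + padded desc.
 */
void sketch_append_note(unsigned char *buf, size_t *pos, const char *name,
			unsigned int type, const void *desc, size_t descsz)
{
	Elf64_Nhdr nhdr;

	nhdr.n_namesz = strlen(name) + 1;	/* name length includes the NUL */
	nhdr.n_descsz = descsz;
	nhdr.n_type = type;

	memcpy(&buf[*pos], &nhdr, sizeof(nhdr));
	*pos += sizeof(nhdr);
	memcpy(&buf[*pos], name, nhdr.n_namesz);
	*pos = NOTE_ALIGN4(*pos + nhdr.n_namesz);
	memcpy(&buf[*pos], desc, descsz);
	*pos = NOTE_ALIGN4(*pos + descsz);
}

/* Bytes one note occupies when it starts at a 4-byte aligned offset. */
size_t sketch_note_size(const char *name, size_t descsz)
{
	return sizeof(Elf64_Nhdr) + NOTE_ALIGN4(strlen(name) + 1) +
	       NOTE_ALIGN4(descsz);
}
```

Summing `sketch_note_size()` over all notes gives the `p_filesz` that a
PT_NOTE program header would have to advertise.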
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
Documentation/debug/kmemdump.rst | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Documentation/debug/kmemdump.rst b/Documentation/debug/kmemdump.rst
index 3301abcaed7e..9c2c23911242 100644
--- a/Documentation/debug/kmemdump.rst
+++ b/Documentation/debug/kmemdump.rst
@@ -17,6 +17,12 @@ kmemdump allows a backend to be connected, this backend interfaces a
specific hardware that can debug or dump the memory registered into
kmemdump.
+kmemdump can also prepare specific regions of the kernel that can be
+put together to form a minimal core image file. To achieve this, the first
+region is an ELF header with program headers for each region, and a specific
+ELF NOTE section with vmcoreinfo. To enable this feature, use
+`CONFIG_KMEMDUMP_COREIMAGE`.
+
kmemdump Internals
=============
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 05/29] kmemdump: introduce qcom-minidump backend driver
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (3 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 04/29] Documentation: kmemdump: add section for coreimage ELF Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 06/29] soc: qcom: smem: add minidump device Eugen Hristev
` (24 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Qualcomm Minidump is a backend driver for kmemdump.
Regions are being registered into the shared memory on Qualcomm platforms
and into the table of contents.
Further, the firmware can read the table of contents and dump the memory
accordingly.
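As an illustration of the table-of-contents handling described above, the
following userspace sketch models a fixed-size region table with registration
and removal. It is a simplified assumption-laden model, not the driver: the
`sketch_*` names and the 4-entry limit are invented here, and it ignores
shared memory, endianness conversion and locking.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_REGIONS 4
#define SKETCH_NAME_LEN 16
/* Same byte packing as MINIDUMP_REGION_VALID in the patch */
#define SKETCH_REGION_VALID ((uint32_t)'V' << 24 | 'A' << 16 | 'L' << 8 | 'I')

struct sketch_region {
	char name[SKETCH_NAME_LEN];
	uint32_t seq_num;
	uint32_t valid;
	uint64_t address;
	uint64_t size;
};

struct sketch_toc {
	struct sketch_region regions[SKETCH_MAX_REGIONS];
	uint32_t region_count;
};

/* Return the index of the region with this id, or -1 if absent. */
int sketch_find(const struct sketch_toc *toc, uint32_t id)
{
	for (uint32_t i = 0; i < toc->region_count; i++)
		if (toc->regions[i].seq_num == id)
			return (int)i;
	return -1;
}

int sketch_register(struct sketch_toc *toc, uint32_t id, const char *name,
		    uint64_t addr, uint64_t size)
{
	struct sketch_region *r;

	if (!size || sketch_find(toc, id) >= 0 ||
	    toc->region_count >= SKETCH_MAX_REGIONS)
		return -1;
	r = &toc->regions[toc->region_count];
	snprintf(r->name, sizeof(r->name), "K%.8s", name);
	r->seq_num = id;
	r->address = addr;
	r->size = (size + 3) & ~(uint64_t)3;	/* round up to 4 bytes */
	r->valid = SKETCH_REGION_VALID;	/* set once the entry is filled in */
	toc->region_count++;
	return 0;
}

int sketch_unregister(struct sketch_toc *toc, uint32_t id)
{
	int idx = sketch_find(toc, id);	/* signed, so the check below works */
	uint32_t n = toc->region_count;

	if (idx < 0)
		return -1;
	/* Close the gap, then clear the now-unused last slot. */
	memmove(&toc->regions[idx], &toc->regions[idx + 1],
		(n - idx - 1) * sizeof(toc->regions[0]));
	memset(&toc->regions[n - 1], 0, sizeof(toc->regions[0]));
	toc->region_count = n - 1;
	return 0;
}
```

Keeping the table dense means a reader of the table only needs
`region_count` to know which entries are live.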
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
MAINTAINERS | 5 +
drivers/debug/Kconfig | 12 ++
drivers/debug/Makefile | 1 +
drivers/debug/qcom_minidump.c | 353 ++++++++++++++++++++++++++++++++++
4 files changed, 371 insertions(+)
create mode 100644 drivers/debug/qcom_minidump.c
diff --git a/MAINTAINERS b/MAINTAINERS
index b43a43b61e19..68797717175c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13625,6 +13625,11 @@ F: drivers/debug/kmemdump.c
F: drivers/debug/kmemdump_coreimage.c
F: include/linux/kmemdump.h
+KMEMDUMP QCOM MINIDUMP BACKEND DRIVER
+M: Eugen Hristev <eugen.hristev@linaro.org>
+S: Maintained
+F: drivers/debug/qcom_minidump.c
+
KMEMLEAK
M: Catalin Marinas <catalin.marinas@arm.com>
S: Maintained
diff --git a/drivers/debug/Kconfig b/drivers/debug/Kconfig
index 903e3e2805b7..d34ceaf99bd8 100644
--- a/drivers/debug/Kconfig
+++ b/drivers/debug/Kconfig
@@ -27,4 +27,16 @@ config KMEMDUMP_COREIMAGE
for debug tools are being registered.
The coredump file can then be loaded into GDB or crash tool and
further inspected.
+
+config KMEMDUMP_QCOM_MINIDUMP_BACKEND
+ tristate "Qualcomm Minidump kmemdump backend driver"
+ depends on ARCH_QCOM || COMPILE_TEST
+ depends on KMEMDUMP
+ help
+ Say y here to enable the Qualcomm Minidump kmemdump backend
+ driver.
+ With this backend, the registered regions are being linked
+ into the minidump table of contents. Further on, the firmware
+ will be able to read the table of contents and extract the
+ memory regions on a case-by-case basis.
endmenu
diff --git a/drivers/debug/Makefile b/drivers/debug/Makefile
index 2b67673393a6..7f70b84049cb 100644
--- a/drivers/debug/Makefile
+++ b/drivers/debug/Makefile
@@ -2,3 +2,4 @@
obj-$(CONFIG_KMEMDUMP) += kmemdump.o
obj-$(CONFIG_KMEMDUMP_COREIMAGE) += kmemdump_coreimage.o
+obj-$(CONFIG_KMEMDUMP_QCOM_MINIDUMP_BACKEND) += qcom_minidump.o
diff --git a/drivers/debug/qcom_minidump.c b/drivers/debug/qcom_minidump.c
new file mode 100644
index 000000000000..49b0b6ef193b
--- /dev/null
+++ b/drivers/debug/qcom_minidump.c
@@ -0,0 +1,353 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Qualcomm Minidump backend driver for Kmemdump
+ * Copyright (C) 2016,2024-2025 Linaro Ltd
+ * Copyright (C) 2015 Sony Mobile Communications Inc
+ * Copyright (c) 2012-2013, The Linux Foundation. All rights reserved.
+ */
+
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/soc/qcom/smem.h>
+#include <linux/kmemdump.h>
+#include <linux/container_of.h>
+
+/*
+ * On some older Qualcomm devices, the boot firmware statically allocates 300
+ * entries as the total number of supported regions (including all
+ * co-processors) in the minidump table, of which Linux uses 201. In the
+ * future, the boot firmware might lift this limit by allocating regions
+ * dynamically. To stay compatible with older devices, keep the current limit
+ * for Linux at 201.
+ */
+#define MAX_NUM_REGIONS 201
+
+#define MAX_NUM_SUBSYSTEMS 10
+#define MAX_REGION_NAME_LENGTH 16
+#define SBL_MINIDUMP_SMEM_ID 602
+#define MINIDUMP_REGION_VALID ('V' << 24 | 'A' << 16 | 'L' << 8 | 'I' << 0)
+#define MINIDUMP_SS_ENCR_DONE ('D' << 24 | 'O' << 16 | 'N' << 8 | 'E' << 0)
+#define MINIDUMP_SS_ENABLED ('E' << 24 | 'N' << 16 | 'B' << 8 | 'L' << 0)
+
+#define MINIDUMP_SS_ENCR_NOTREQ (0 << 24 | 0 << 16 | 'N' << 8 | 'R' << 0)
+
+#define MINIDUMP_SUBSYSTEM_APSS 0
+
+const char *kmemdump_id_to_md_string[] = {
+ "",
+ "ELF",
+ "vmcoreinfo",
+ "config",
+ "memsect",
+ "totalram",
+ "cpu_possible",
+ "cpu_present",
+ "cpu_online",
+ "cpu_active",
+ "jiffies",
+ "linux_banner",
+ "nr_threads",
+ "nr_irqs",
+ "tainted_mask",
+ "taint_flags",
+ "mem_section",
+ "node_data",
+ "node_states",
+ "__per_cpu_offset",
+ "nr_swapfiles",
+ "init_uts_ns",
+ "printk_rb_static",
+ "printk_rb_dynamic",
+ "prb",
+ "prb_descs",
+ "prb_infos",
+ "prb_data",
+ "runqueues",
+ "high_memory",
+ "init_mm",
+ "init_mm_pgd",
+};
+
+/**
+ * struct minidump_region - Minidump region
+ * @name : Name of the region to be dumped
+ * @seq_num : Used to differentiate regions with the same name.
+ * @valid : This entry to be dumped (if set to 1)
+ * @address : Physical address of region to be dumped
+ * @size : Size of the region
+ */
+struct minidump_region {
+ char name[MAX_REGION_NAME_LENGTH];
+ __le32 seq_num;
+ __le32 valid;
+ __le64 address;
+ __le64 size;
+};
+
+/**
+ * struct minidump_subsystem - Subsystem's SMEM Table of content
+ * @status : Subsystem toc init status
+ * @enabled : if set to 1, this region would be copied during coredump
+ * @encryption_status: Encryption status for this subsystem
+ * @encryption_required : Decides to encrypt the subsystem regions or not
+ * @region_count : Number of regions added in this subsystem toc
+ * @regions_baseptr : regions base pointer of the subsystem
+ */
+struct minidump_subsystem {
+ __le32 status;
+ __le32 enabled;
+ __le32 encryption_status;
+ __le32 encryption_required;
+ __le32 region_count;
+ __le64 regions_baseptr;
+};
+
+/**
+ * struct minidump_global_toc - Global Table of Content
+ * @status : Global Minidump init status
+ * @revision : Minidump revision
+ * @enabled : Minidump enable status
+ * @subsystems : Array of subsystems toc
+ */
+struct minidump_global_toc {
+ __le32 status;
+ __le32 revision;
+ __le32 enabled;
+ struct minidump_subsystem subsystems[MAX_NUM_SUBSYSTEMS];
+};
+
+#define MINIDUMP_MAX_NAME_LENGTH 12
+/**
+ * struct qcom_minidump_region - Minidump region information
+ *
+ * @name: Minidump region name
+ * @virt_addr: Virtual address of the entry.
+ * @phys_addr: Physical address of the entry to dump.
+ * @size: Number of bytes to dump from the @phys_addr location;
+ * it should be 4-byte aligned.
+ * @id: Region id.
+ */
+struct qcom_minidump_region {
+ char name[MINIDUMP_MAX_NAME_LENGTH];
+ void *virt_addr;
+ phys_addr_t phys_addr;
+ size_t size;
+ unsigned int id;
+};
+
+/**
+ * struct minidump - Minidump driver data information
+ *
+ * @dev: Minidump device struct.
+ * @toc: Minidump table of contents subsystem.
+ * @regions: Minidump regions array.
+ * @md_be: Minidump backend.
+ */
+struct minidump {
+ struct device *dev;
+ struct minidump_subsystem *toc;
+ struct minidump_region *regions;
+ struct kmemdump_backend md_be;
+};
+
+static struct minidump *md;
+
+#define be_to_minidump(be) container_of(be, struct minidump, md_be)
+
+/**
+ * qcom_apss_md_table_init() - Initialize the minidump table
+ * @md: minidump data
+ * @mdss_toc: minidump subsystem table of contents
+ *
+ * Return: On success, it returns 0 and negative error value on failure.
+ */
+static int qcom_apss_md_table_init(struct minidump *md,
+ struct minidump_subsystem *mdss_toc)
+{
+ md->toc = mdss_toc;
+ md->regions = devm_kcalloc(md->dev, MAX_NUM_REGIONS,
+ sizeof(*md->regions), GFP_KERNEL);
+ if (!md->regions)
+ return -ENOMEM;
+
+ md->toc->regions_baseptr = cpu_to_le64(virt_to_phys(md->regions));
+ md->toc->enabled = cpu_to_le32(MINIDUMP_SS_ENABLED);
+ md->toc->status = cpu_to_le32(1);
+ md->toc->region_count = cpu_to_le32(0);
+
+ /* Tell bootloader not to encrypt the regions of this subsystem */
+ md->toc->encryption_status = cpu_to_le32(MINIDUMP_SS_ENCR_DONE);
+ md->toc->encryption_required = cpu_to_le32(MINIDUMP_SS_ENCR_NOTREQ);
+
+ return 0;
+}
+
+/**
+ * qcom_md_get_region_index() - Lookup minidump region by kmemdump id
+ * @md: minidump data
+ * @id: minidump region id
+ *
+ * Return: On success, it returns the internal region index, on failure,
+ * returns negative error value
+ */
+static int qcom_md_get_region_index(struct minidump *md, int id)
+{
+ unsigned int count = le32_to_cpu(md->toc->region_count);
+ unsigned int i;
+
+ for (i = 0; i < count; i++)
+ if (md->regions[i].seq_num == id)
+ return i;
+
+ return -ENOENT;
+}
+
+/**
+ * register_md_region() - Register a new minidump region
+ * @be: kmemdump backend, this should be the minidump backend
+ * @id: unique id to identify the region
+ * @vaddr: virtual memory address of the region start
+ * @size: size of the region
+ *
+ * Return: On success, it returns 0 and negative error value on failure.
+ */
+static int register_md_region(const struct kmemdump_backend *be,
+ enum kmemdump_uid id, void *vaddr, size_t size)
+{
+ struct minidump *md = be_to_minidump(be);
+ struct minidump_region *mdr;
+ unsigned int num_region, region_cnt;
+ const char *name = "unknown";
+
+ if (!vaddr || !size)
+ return -EINVAL;
+
+ if (id < ARRAY_SIZE(kmemdump_id_to_md_string))
+ name = kmemdump_id_to_md_string[id];
+
+ if (qcom_md_get_region_index(md, id) >= 0) {
+ dev_dbg(md->dev, "%s:%d region is already registered\n",
+ name, id);
+ return -EEXIST;
+ }
+
+ /* Check if there is room for a new entry */
+ num_region = le32_to_cpu(md->toc->region_count);
+ if (num_region >= MAX_NUM_REGIONS) {
+ dev_err(md->dev, "maximum region limit %u reached\n",
+ num_region);
+ return -ENOSPC;
+ }
+
+ region_cnt = le32_to_cpu(md->toc->region_count);
+ mdr = &md->regions[region_cnt];
+ scnprintf(mdr->name, MAX_REGION_NAME_LENGTH, "K%.8s", name);
+ mdr->seq_num = id;
+ mdr->address = cpu_to_le64(__pa(vaddr));
+ mdr->size = cpu_to_le64(ALIGN(size, 4));
+ mdr->valid = cpu_to_le32(MINIDUMP_REGION_VALID);
+ region_cnt++;
+ md->toc->region_count = cpu_to_le32(region_cnt);
+
+ return 0;
+}
+
+/**
+ * unregister_md_region() - Unregister a previously registered minidump region
+ * @be: pointer to backend
+ * @id: unique id to identify the region
+ *
+ * Return: On success, it returns 0 and negative error value on failure.
+ */
+static int unregister_md_region(const struct kmemdump_backend *be,
+ unsigned int id)
+{
+ struct minidump *md = be_to_minidump(be);
+ struct minidump_region *mdr;
+ unsigned int region_cnt;
+ int idx;
+
+ idx = qcom_md_get_region_index(md, id);
+ if (idx < 0) {
+ dev_err(md->dev, "%d region is not present\n", id);
+ return idx;
+ }
+
+ mdr = &md->regions[0];
+ region_cnt = le32_to_cpu(md->toc->region_count);
+ /*
+ * Shift all the regions that follow the removed region left
+ * by one index to fill the gap, and zero out the last region
+ * at the end.
+ */
+ memmove(&mdr[idx], &mdr[idx + 1], (region_cnt - idx - 1) * sizeof(*mdr));
+ memset(&mdr[region_cnt - 1], 0, sizeof(*mdr));
+ region_cnt--;
+ md->toc->region_count = cpu_to_le32(region_cnt);
+
+ return 0;
+}
+
+static int qcom_md_probe(struct platform_device *pdev)
+{
+ struct minidump_global_toc *mdgtoc;
+ size_t size;
+ int ret;
+
+ md = kzalloc(sizeof(*md), GFP_KERNEL);
+ if (!md)
+ return -ENOMEM;
+
+ md->dev = &pdev->dev;
+
+ strscpy(md->md_be.name, "qcom_minidump");
+ md->md_be.register_region = register_md_region;
+ md->md_be.unregister_region = unregister_md_region;
+
+ mdgtoc = qcom_smem_get(QCOM_SMEM_HOST_ANY, SBL_MINIDUMP_SMEM_ID, &size);
+ if (IS_ERR(mdgtoc)) {
+ ret = PTR_ERR(mdgtoc);
+ dev_err(md->dev, "Couldn't find minidump smem item %d\n", ret);
+ goto qcom_md_probe_fail;
+ }
+
+ if (size < sizeof(*mdgtoc) || !mdgtoc->status) {
+ ret = -ENAVAIL;
+ dev_err(md->dev, "minidump table is not initialized %d\n", ret);
+ goto qcom_md_probe_fail;
+ }
+
+ ret = qcom_apss_md_table_init(md, &mdgtoc->subsystems[MINIDUMP_SUBSYSTEM_APSS]);
+ if (ret)
+ goto qcom_md_probe_fail;
+
+ return kmemdump_register_backend(&md->md_be);
+
+qcom_md_probe_fail:
+ kfree(md);
+ return ret;
+}
+
+static void qcom_md_remove(struct platform_device *pdev)
+{
+ kmemdump_unregister_backend(&md->md_be);
+ kfree(md);
+}
+
+static struct platform_driver qcom_md_driver = {
+ .probe = qcom_md_probe,
+ .remove = qcom_md_remove,
+ .driver = {
+ .name = "qcom-minidump",
+ },
+};
+
+module_platform_driver(qcom_md_driver);
+
+MODULE_AUTHOR("Eugen Hristev <eugen.hristev@linaro.org>");
+MODULE_AUTHOR("Mukesh Ojha <quic_mojha@quicinc.com>");
+MODULE_DESCRIPTION("Qualcomm kmemdump minidump backend driver");
+MODULE_LICENSE("GPL");
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 06/29] soc: qcom: smem: add minidump device
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (4 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 05/29] kmemdump: introduce qcom-minidump backend driver Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 07/29] init/version: Annotate static information into Kmemdump Eugen Hristev
` (23 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Add a minidump platform device.
Minidump can collect various memory snippets using dedicated firmware.
To know which snippets to collect, each snippet must be registered
by the kernel into a specific shared memory table which is controlled
by the qcom smem driver.
To instantiate the minidump platform driver, register its data using
platform_device_register_data.
Later on, the minidump driver will probe and register itself into
kmemdump as a backend.
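The late backend registration relies on the behaviour documented earlier in
the series: regions registered while only the default no-op backend is
present are handed to the real backend once it appears. A rough userspace
model of that replay, with hypothetical `sketch_*` names and no locking,
could look like this:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_ZONES 8

struct sketch_backend {
	const char *name;
	int (*register_region)(struct sketch_backend *be, unsigned int id,
			       void *zone, size_t size);
	unsigned int seen;	/* counts regions handed to this backend */
};

struct sketch_zone {
	unsigned int id;	/* 0 means the slot is free */
	void *zone;
	size_t size;
};

/* Default backend: accept every region and do nothing with it. */
static int sketch_noop_register(struct sketch_backend *be, unsigned int id,
				void *zone, size_t size)
{
	(void)be; (void)id; (void)zone; (void)size;
	return 0;
}

/* Stand-in for a real backend: just count what it is given. */
static int sketch_counting_register(struct sketch_backend *be, unsigned int id,
				    void *zone, size_t size)
{
	(void)id; (void)zone; (void)size;
	be->seen++;
	return 0;
}

static struct sketch_backend sketch_default_backend = {
	"noop", sketch_noop_register, 0
};
struct sketch_backend sketch_real_backend = {
	"real", sketch_counting_register, 0
};

static struct sketch_backend *sketch_current = &sketch_default_backend;
static struct sketch_zone sketch_zones[SKETCH_MAX_ZONES];

/* Record the zone locally, then pass it to whichever backend is current. */
int sketch_register_zone(unsigned int id, void *zone, size_t size)
{
	if (id == 0 || id >= SKETCH_MAX_ZONES || sketch_zones[id].id)
		return -1;
	sketch_zones[id] = (struct sketch_zone){ id, zone, size };
	return sketch_current->register_region(sketch_current, id, zone, size);
}

/* Install a new backend and replay every zone registered so far. */
int sketch_register_backend(struct sketch_backend *be)
{
	sketch_current = be;
	for (unsigned int i = 1; i < SKETCH_MAX_ZONES; i++)
		if (sketch_zones[i].id)
			be->register_region(be, sketch_zones[i].id,
					    sketch_zones[i].zone,
					    sketch_zones[i].size);
	return 0;
}
```

Because the core keeps its own zone list, producers do not need to know
whether the platform device for the backend has probed yet.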
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
drivers/soc/qcom/smem.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/soc/qcom/smem.c b/drivers/soc/qcom/smem.c
index cf425930539e..2aae0e696150 100644
--- a/drivers/soc/qcom/smem.c
+++ b/drivers/soc/qcom/smem.c
@@ -270,6 +270,7 @@ struct smem_region {
* @partitions: list of partitions of current processor/host
* @item_count: max accepted item number
* @socinfo: platform device pointer
+ * @mdinfo: minidump device pointer
* @num_regions: number of @regions
* @regions: list of the memory regions defining the shared memory
*/
@@ -280,6 +281,7 @@ struct qcom_smem {
u32 item_count;
struct platform_device *socinfo;
+ struct platform_device *mdinfo;
struct smem_ptable *ptable;
struct smem_partition global_partition;
struct smem_partition partitions[SMEM_HOST_COUNT];
@@ -1236,12 +1238,20 @@ static int qcom_smem_probe(struct platform_device *pdev)
if (IS_ERR(smem->socinfo))
dev_dbg(&pdev->dev, "failed to register socinfo device\n");
+ smem->mdinfo = platform_device_register_data(&pdev->dev, "qcom-minidump",
+ PLATFORM_DEVID_AUTO, NULL,
+ 0);
+ if (IS_ERR(smem->mdinfo))
+ dev_err(&pdev->dev, "failed to register platform md device\n");
+
return 0;
}
static void qcom_smem_remove(struct platform_device *pdev)
{
platform_device_unregister(__smem->socinfo);
+ if (!IS_ERR(__smem->mdinfo))
+ platform_device_unregister(__smem->mdinfo);
hwspin_lock_free(__smem->hwlock);
__smem = NULL;
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 07/29] init/version: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (5 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 06/29] soc: qcom: smem: add minidump device Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 08/29] cpu: " Eugen Hristev
` (22 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- init_uts_ns
- linux_banner
Information on these variables is stored into a dedicated kmemdump section.
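The section-based annotation can be tried out in userspace. The sketch below
is an illustration under assumptions, not the kernel macro: it uses a
hypothetical `sketch_tbl` section and relies on GNU ld or lld providing
`__start_`/`__stop_` symbols for sections whose names are valid C
identifiers, whereas the kernel defines its table bounds in the linker
script. The explicit alignment attribute is there so the compiler does not
pad between descriptors.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_entry {
	unsigned int id;
	void *zone;
	size_t size;
};

/* Place one descriptor for `sym` in the "sketch_tbl" section. */
#define SKETCH_VAR(uid, sym)						\
	static const struct sketch_entry sketch_entry_##sym		\
	__attribute__((used, section("sketch_tbl"),			\
		       aligned(__alignof__(struct sketch_entry)))) = {	\
		.id = (uid),						\
		.zone = (void *)&(sym),					\
		.size = sizeof(sym),					\
	}

/* Section bounds, provided by the linker (assumes GNU ld or lld on ELF). */
extern const struct sketch_entry __start_sketch_tbl[];
extern const struct sketch_entry __stop_sketch_tbl[];

int sketch_counter = 42;
char sketch_banner[] = "hello";
SKETCH_VAR(1, sketch_counter);
SKETCH_VAR(2, sketch_banner);

/* Walk the section like an array and add up the annotated sizes. */
size_t sketch_total_size(void)
{
	const struct sketch_entry *e;
	size_t total = 0;

	for (e = __start_sketch_tbl; e < __stop_sketch_tbl; e++)
		total += e->size;
	return total;
}
```

The order of entries inside the section is up to the compiler and linker, so
a walker should not depend on it.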
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
init/version.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/init/version.c b/init/version.c
index 94c96f6fbfe6..f5910c027948 100644
--- a/init/version.c
+++ b/init/version.c
@@ -16,6 +16,7 @@
#include <linux/uts.h>
#include <linux/utsname.h>
#include <linux/proc_ns.h>
+#include <linux/kmemdump.h>
static int __init early_hostname(char *arg)
{
@@ -51,4 +52,7 @@ const char linux_banner[] __weak;
#include "version-timestamp.c"
+KMEMDUMP_VAR_CORE(init_uts_ns, sizeof(init_uts_ns));
+KMEMDUMP_VAR_CORE(linux_banner, sizeof(linux_banner));
+
EXPORT_SYMBOL_GPL(init_uts_ns);
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
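For readers following the series: KMEMDUMP_VAR_CORE itself is defined in an earlier patch and is not reproduced here. What follows is only a minimal userspace sketch of the general technique the commit message describes, namely recording a variable's name, address and size in a dedicated section at build time. It assumes GCC or Clang with GNU ld on an ELF target. All demo_* and DEMO_* names are hypothetical and are not the kmemdump API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: name, start address and size of one region. */
struct demo_region {
	const char *name;
	const void *addr;
	size_t size;
};

/*
 * Hypothetical stand-in for an annotation macro. It emits a descriptor and
 * places a pointer to it in the "demo_regions" section. Storing pointers
 * keeps the section a densely packed array, whatever alignment the compiler
 * picks for the descriptors themselves.
 */
#define DEMO_ANNOTATE_VAR(var, sz)					\
	static const struct demo_region demo_region_##var = {		\
		.name = #var, .addr = &(var), .size = (sz),		\
	};								\
	static const struct demo_region *const demo_region_ptr_##var	\
	__attribute__((used, section("demo_regions"))) = &demo_region_##var

/* GNU ld provides these for sections whose name is a valid C identifier. */
extern const struct demo_region *const __start_demo_regions[];
extern const struct demo_region *const __stop_demo_regions[];

/* Two example variables, annotated where they are defined. */
static unsigned long demo_jiffies;
static char demo_banner[] = "demo banner";

DEMO_ANNOTATE_VAR(demo_jiffies, sizeof(demo_jiffies));
DEMO_ANNOTATE_VAR(demo_banner, sizeof(demo_banner));

/* Number of annotated regions found in the section. */
size_t demo_region_count(void)
{
	return (size_t)(__stop_demo_regions - __start_demo_regions);
}

/* Walk the section and look a region up by variable name. */
const struct demo_region *demo_region_find(const char *name)
{
	const struct demo_region *const *p;

	for (p = __start_demo_regions; p < __stop_demo_regions; p++)
		if (strcmp((*p)->name, name) == 0)
			return *p;
	return NULL;
}
```

In such a scheme an init routine can walk the section once and hand every entry to the backend, so the code that owns the variable needs nothing beyond the one-line annotation.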
* [RFC][PATCH v2 08/29] cpu: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (6 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 07/29] init/version: Annotate static information into Kmemdump Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 09/29] genirq/irqdesc: " Eugen Hristev
` (21 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- __cpu_present_mask
- __cpu_online_mask
- __cpu_possible_mask
- __cpu_active_mask
Information on these variables is stored in a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/cpu.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index faf0f23fc5d8..d48e4dd979e9 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -38,6 +38,7 @@
#include <linux/random.h>
#include <linux/cc_platform.h>
#include <linux/parser.h>
+#include <linux/kmemdump.h>
#include <trace/events/power.h>
#define CREATE_TRACE_POINTS
@@ -3092,18 +3093,22 @@ struct cpumask __cpu_possible_mask __ro_after_init
struct cpumask __cpu_possible_mask __ro_after_init;
#endif
EXPORT_SYMBOL(__cpu_possible_mask);
+KMEMDUMP_VAR_CORE(__cpu_possible_mask, sizeof(__cpu_possible_mask));
struct cpumask __cpu_online_mask __read_mostly;
EXPORT_SYMBOL(__cpu_online_mask);
+KMEMDUMP_VAR_CORE(__cpu_online_mask, sizeof(__cpu_online_mask));
struct cpumask __cpu_enabled_mask __read_mostly;
EXPORT_SYMBOL(__cpu_enabled_mask);
struct cpumask __cpu_present_mask __read_mostly;
EXPORT_SYMBOL(__cpu_present_mask);
+KMEMDUMP_VAR_CORE(__cpu_present_mask, sizeof(__cpu_present_mask));
struct cpumask __cpu_active_mask __read_mostly;
EXPORT_SYMBOL(__cpu_active_mask);
+KMEMDUMP_VAR_CORE(__cpu_active_mask, sizeof(__cpu_active_mask));
struct cpumask __cpu_dying_mask __read_mostly;
EXPORT_SYMBOL(__cpu_dying_mask);
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 09/29] genirq/irqdesc: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (7 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 08/29] cpu: " Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 10/29] panic: " Eugen Hristev
` (20 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- nr_irqs
Information on this variable is stored in a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/irq/irqdesc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index b64c57b44c20..6d11b85be2b3 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -12,6 +12,7 @@
#include <linux/export.h>
#include <linux/interrupt.h>
#include <linux/kernel_stat.h>
+#include <linux/kmemdump.h>
#include <linux/maple_tree.h>
#include <linux/irqdomain.h>
#include <linux/sysfs.h>
@@ -140,6 +141,7 @@ static void desc_set_defaults(unsigned int irq, struct irq_desc *desc, int node,
}
static unsigned int nr_irqs = NR_IRQS;
+KMEMDUMP_VAR_CORE(nr_irqs, sizeof(nr_irqs));
/**
* irq_get_nr_irqs() - Number of interrupts supported by the system.
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 10/29] panic: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (8 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 09/29] genirq/irqdesc: " Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 11/29] sched/core: " Eugen Hristev
` (19 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- tainted_mask
- taint_flags
Information on these variables is stored in a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/panic.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/kernel/panic.c b/kernel/panic.c
index ccee04378d2e..fb561a2fdb59 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -39,6 +39,7 @@
#include <linux/sys_info.h>
#include <trace/events/error_report.h>
#include <asm/sections.h>
+#include <linux/kmemdump.h>
#define PANIC_TIMER_STEP 100
#define PANIC_BLINK_SPD 18
@@ -56,6 +57,7 @@ static unsigned int __read_mostly sysctl_oops_all_cpu_backtrace;
int panic_on_oops = CONFIG_PANIC_ON_OOPS_VALUE;
static unsigned long tainted_mask =
IS_ENABLED(CONFIG_RANDSTRUCT) ? (1 << TAINT_RANDSTRUCT) : 0;
+KMEMDUMP_VAR_CORE(tainted_mask, sizeof(tainted_mask));
static int pause_on_oops;
static int pause_on_oops_flag;
static DEFINE_SPINLOCK(pause_on_oops_lock);
@@ -601,6 +603,8 @@ const struct taint_flag taint_flags[TAINT_FLAGS_COUNT] = {
TAINT_FLAG(FWCTL, 'J', ' ', true),
};
+KMEMDUMP_VAR_CORE(taint_flags, sizeof(taint_flags));
+
#undef TAINT_FLAG
static void print_tainted_seq(struct seq_buf *s, bool verbose)
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 11/29] sched/core: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (9 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 10/29] panic: " Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 12/29] timers: " Eugen Hristev
` (18 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- runqueues
Information on this variable is stored in a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/sched/core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 2343f5691c54..18ba6c1e174f 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -67,6 +67,7 @@
#include <linux/wait_api.h>
#include <linux/workqueue_api.h>
#include <linux/livepatch_sched.h>
+#include <linux/kmemdump.h>
#ifdef CONFIG_PREEMPT_DYNAMIC
# ifdef CONFIG_GENERIC_IRQ_ENTRY
@@ -119,6 +120,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp);
EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
+KMEMDUMP_VAR_CORE(runqueues, sizeof(runqueues));
#ifdef CONFIG_SCHED_PROXY_EXEC
DEFINE_STATIC_KEY_TRUE(__sched_proxy_exec);
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 12/29] timers: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (10 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 11/29] sched/core: " Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 13/29] kernel/fork: " Eugen Hristev
` (17 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- jiffies_64
Information on this variable is stored in a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/time/timer.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 553fa469d7cc..a5698e3ace2d 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -44,6 +44,7 @@
#include <linux/compat.h>
#include <linux/random.h>
#include <linux/sysctl.h>
+#include <linux/kmemdump.h>
#include <linux/uaccess.h>
#include <asm/unistd.h>
@@ -60,7 +61,7 @@
__visible u64 jiffies_64 __cacheline_aligned_in_smp = INITIAL_JIFFIES;
EXPORT_SYMBOL(jiffies_64);
-
+KMEMDUMP_VAR_CORE(jiffies_64, sizeof(jiffies_64));
/*
* The timer wheel has LVL_DEPTH array levels. Each level provides an array of
* LVL_SIZE buckets. Each level is driven by its own clock and therefore each
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 13/29] kernel/fork: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (11 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 12/29] timers: " Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 14/29] mm/page_alloc: " Eugen Hristev
` (16 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- nr_threads
Information on this variable is stored in a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/fork.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/kernel/fork.c b/kernel/fork.c
index edc6579f736b..ae8ae9b9180b 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -105,6 +105,7 @@
#include <uapi/linux/pidfd.h>
#include <linux/pidfs.h>
#include <linux/tick.h>
+#include <linux/kmemdump.h>
#include <asm/pgalloc.h>
#include <linux/uaccess.h>
@@ -137,6 +138,7 @@
*/
unsigned long total_forks; /* Handle normal Linux uptimes. */
int nr_threads; /* The idle threads do not count.. */
+KMEMDUMP_VAR_CORE(nr_threads, sizeof(nr_threads));
static int max_threads; /* tunable limit on nr_threads */
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 14/29] mm/page_alloc: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (12 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 13/29] kernel/fork: " Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 15/29] mm/init-mm: " Eugen Hristev
` (15 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- node_states
Information on this variable is stored in a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
mm/page_alloc.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index fa09154a799c..5f0015e27a30 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -55,6 +55,7 @@
#include <linux/delayacct.h>
#include <linux/cacheinfo.h>
#include <linux/pgalloc_tag.h>
+#include <linux/kmemdump.h>
#include <asm/div64.h>
#include "internal.h"
#include "shuffle.h"
@@ -207,6 +208,7 @@ nodemask_t node_states[NR_NODE_STATES] __read_mostly = {
#endif /* NUMA */
};
EXPORT_SYMBOL(node_states);
+KMEMDUMP_VAR_CORE(node_states, sizeof(node_states));
gfp_t gfp_allowed_mask __read_mostly = GFP_BOOT_MASK;
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 15/29] mm/init-mm: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (13 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 14/29] mm/page_alloc: " Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 16/29] mm/show_mem: " Eugen Hristev
` (14 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- init_mm
- init_mm.pgd
Information on these variables is stored in a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
mm/init-mm.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/mm/init-mm.c b/mm/init-mm.c
index 4600e7605cab..2dbbaf640cf4 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -7,6 +7,7 @@
#include <linux/cpumask.h>
#include <linux/mman.h>
#include <linux/pgtable.h>
+#include <linux/kmemdump.h>
#include <linux/atomic.h>
#include <linux/user_namespace.h>
@@ -48,6 +49,9 @@ struct mm_struct init_mm = {
INIT_MM_CONTEXT(init_mm)
};
+KMEMDUMP_VAR_CORE(init_mm, sizeof(init_mm));
+KMEMDUMP_VAR_CORE_NAMED(init_mm_pgd, init_mm.pgd, sizeof(*init_mm.pgd));
+
void setup_initial_init_mm(void *start_code, void *end_code,
void *end_data, void *brk)
{
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 16/29] mm/show_mem: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (14 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 15/29] mm/init-mm: " Eugen Hristev
@ 2025-07-24 13:54 ` Eugen Hristev
2025-07-30 13:55 ` David Hildenbrand
2025-07-24 13:55 ` [RFC][PATCH v2 17/29] mm/swapfile: " Eugen Hristev
` (13 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:54 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- _totalram_pages
Information on this variable is stored in a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
mm/show_mem.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/show_mem.c b/mm/show_mem.c
index 41999e94a56d..93a5dc041ae1 100644
--- a/mm/show_mem.c
+++ b/mm/show_mem.c
@@ -14,12 +14,14 @@
#include <linux/mmzone.h>
#include <linux/swap.h>
#include <linux/vmstat.h>
+#include <linux/kmemdump.h>
#include "internal.h"
#include "swap.h"
atomic_long_t _totalram_pages __read_mostly;
EXPORT_SYMBOL(_totalram_pages);
+KMEMDUMP_VAR_CORE(_totalram_pages, sizeof(_totalram_pages));
unsigned long totalreserve_pages __read_mostly;
unsigned long totalcma_pages __read_mostly;
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 17/29] mm/swapfile: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (15 preceding siblings ...)
2025-07-24 13:54 ` [RFC][PATCH v2 16/29] mm/show_mem: " Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 18/29] mm/percpu: " Eugen Hristev
` (12 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- nr_swapfiles
Information on this variable is stored in a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
mm/swapfile.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index b4f3cc712580..ac5a2307a278 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -42,6 +42,7 @@
#include <linux/suspend.h>
#include <linux/zswap.h>
#include <linux/plist.h>
+#include <linux/kmemdump.h>
#include <asm/tlbflush.h>
#include <linux/swapops.h>
@@ -64,6 +65,7 @@ static inline void unlock_cluster(struct swap_cluster_info *ci);
static DEFINE_SPINLOCK(swap_lock);
static unsigned int nr_swapfiles;
+KMEMDUMP_VAR_CORE(nr_swapfiles, sizeof(nr_swapfiles));
atomic_long_t nr_swap_pages;
/*
* Some modules use swappable objects and may try to swap them out under
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 18/29] mm/percpu: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (16 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 17/29] mm/swapfile: " Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 19/29] mm/mm_init: " Eugen Hristev
` (11 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- __per_cpu_offset
Information on this variable is stored in a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
mm/percpu.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/mm/percpu.c b/mm/percpu.c
index d9cbaee92b60..0cfe4d7818e9 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -87,6 +87,7 @@
#include <linux/sched.h>
#include <linux/sched/mm.h>
#include <linux/memcontrol.h>
+#include <linux/kmemdump.h>
#include <asm/cacheflush.h>
#include <asm/sections.h>
@@ -3342,6 +3343,8 @@ void __init setup_per_cpu_areas(void)
#endif /* CONFIG_SMP */
+KMEMDUMP_VAR_CORE(__per_cpu_offset, sizeof(__per_cpu_offset));
+
/*
* pcpu_nr_pages - calculate total number of populated backing pages
*
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 19/29] mm/mm_init: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (17 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 18/29] mm/percpu: " Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 20/29] printk: Register " Eugen Hristev
` (10 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- high_memory
Information on this variable is stored in a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
mm/mm_init.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 5c21b3af216b..fd577f988f79 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -32,6 +32,7 @@
#include <linux/vmstat.h>
#include <linux/kexec_handover.h>
#include <linux/hugetlb.h>
+#include <linux/kmemdump.h>
#include "internal.h"
#include "slab.h"
#include "shuffle.h"
@@ -52,6 +53,7 @@ EXPORT_SYMBOL(mem_map);
*/
void *high_memory;
EXPORT_SYMBOL(high_memory);
+KMEMDUMP_VAR_CORE(high_memory, sizeof(high_memory));
#ifdef CONFIG_DEBUG_MEMORY_INIT
int __meminitdata mminit_loglevel;
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 20/29] printk: Register information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (18 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 19/29] mm/mm_init: " Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 21/29] kernel/configs: Register dynamic " Eugen Hristev
` (9 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- prb_descs
- prb_infos
- prb
- prb_data
- printk_rb_static
- printk_rb_dynamic
Information on these variables is stored in a dedicated kmemdump section.
Register dynamic information into kmemdump:
- new_descs
- new_infos
- new_log_buf
When the log buffer is replaced at runtime by a dynamically allocated
version, call kmemdump to register the new data with the replace flag, so
that the previously registered data is removed.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/printk/printk.c | 28 +++++++++++++++++++++++-----
1 file changed, 23 insertions(+), 5 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 0efbcdda9aab..f7d60dbe5e5a 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -48,6 +48,7 @@
#include <linux/sched/clock.h>
#include <linux/sched/debug.h>
#include <linux/sched/task_stack.h>
+#include <linux/kmemdump.h>
#include <linux/uaccess.h>
#include <asm/sections.h>
@@ -540,10 +541,16 @@ static u32 log_buf_len = __LOG_BUF_LEN;
#endif
_DEFINE_PRINTKRB(printk_rb_static, CONFIG_LOG_BUF_SHIFT - PRB_AVGBITS,
PRB_AVGBITS, &__log_buf[0]);
+KMEMDUMP_VAR_CORE_NAMED(prb_descs, _printk_rb_static_descs, sizeof(_printk_rb_static_descs));
+KMEMDUMP_VAR_CORE_NAMED(prb_infos, _printk_rb_static_infos, sizeof(_printk_rb_static_infos));
+KMEMDUMP_VAR_CORE_NAMED(prb_data, __log_buf, __LOG_BUF_LEN);
+KMEMDUMP_VAR_CORE(printk_rb_static, sizeof(printk_rb_static));
static struct printk_ringbuffer printk_rb_dynamic;
+KMEMDUMP_VAR_CORE(printk_rb_dynamic, sizeof(printk_rb_dynamic));
struct printk_ringbuffer *prb = &printk_rb_static;
+KMEMDUMP_VAR_CORE(prb, sizeof(prb));
/*
* We cannot access per-CPU data (e.g. per-CPU flush irq_work) before
@@ -1211,7 +1218,10 @@ void __init setup_log_buf(int early)
goto out;
}
- new_log_buf = memblock_alloc(new_log_buf_len, LOG_ALIGN);
+ new_log_buf = kmemdump_alloc_id_size_replace(KMEMDUMP_ID_COREIMAGE_prb_data,
+ new_log_buf_len,
+ memblock_alloc,
+ new_log_buf_len, LOG_ALIGN);
if (unlikely(!new_log_buf)) {
pr_err("log_buf_len: %lu text bytes not available\n",
new_log_buf_len);
@@ -1219,7 +1229,10 @@ void __init setup_log_buf(int early)
}
new_descs_size = new_descs_count * sizeof(struct prb_desc);
- new_descs = memblock_alloc(new_descs_size, LOG_ALIGN);
+ new_descs = kmemdump_alloc_id_size_replace(KMEMDUMP_ID_COREIMAGE_prb_descs,
+ new_descs_size, memblock_alloc,
+ new_descs_size, LOG_ALIGN);
+
if (unlikely(!new_descs)) {
pr_err("log_buf_len: %zu desc bytes not available\n",
new_descs_size);
@@ -1227,7 +1240,10 @@ void __init setup_log_buf(int early)
}
new_infos_size = new_descs_count * sizeof(struct printk_info);
- new_infos = memblock_alloc(new_infos_size, LOG_ALIGN);
+ new_infos = kmemdump_alloc_id_size_replace(KMEMDUMP_ID_COREIMAGE_prb_infos,
+ new_infos_size, memblock_alloc,
+ new_infos_size, LOG_ALIGN);
+
if (unlikely(!new_infos)) {
pr_err("log_buf_len: %zu info bytes not available\n",
new_infos_size);
@@ -1284,9 +1300,11 @@ void __init setup_log_buf(int early)
return;
err_free_descs:
- memblock_free(new_descs, new_descs_size);
+ kmemdump_free_id(KMEMDUMP_ID_COREIMAGE_prb_descs,
+ memblock_free, new_descs, new_descs_size);
err_free_log_buf:
- memblock_free(new_log_buf, new_log_buf_len);
+ kmemdump_free_id(KMEMDUMP_ID_COREIMAGE_prb_data,
+ memblock_free, new_log_buf, new_log_buf_len);
out:
print_log_buf_usage_stats();
}
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
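The kmemdump_alloc_id_size_replace() and kmemdump_free_id() helpers used above are defined elsewhere in the series and are not shown in this message. As a rough illustration of the pattern only, the userspace sketch below wraps an arbitrary allocator call in a macro and records the returned buffer under a fixed ID, replacing whatever was registered under that ID before. Every demo_* and DEMO_* name is hypothetical, and keeping the old registration when the allocation fails is a choice made for this sketch, not a statement about kmemdump's behaviour.

```c
#include <stddef.h>
#include <stdlib.h>

#define DEMO_MAX_IDS	8
#define DEMO_ID_LOG	1	/* hypothetical fixed ID for the log buffer */

/* One registration slot per fixed ID. */
struct demo_slot {
	void *addr;
	size_t size;
};

static struct demo_slot demo_slots[DEMO_MAX_IDS];

/*
 * Record (ptr, size) under id, replacing any earlier registration.
 * A NULL ptr (failed allocation) leaves the old registration untouched.
 * Returns ptr so the call can wrap an allocation expression.
 */
void *demo_track_id(unsigned int id, void *ptr, size_t size)
{
	if (ptr && id < DEMO_MAX_IDS) {
		demo_slots[id].addr = ptr;
		demo_slots[id].size = size;
	}
	return ptr;
}

/* Drop the registration stored under id, if any. */
void demo_untrack_id(unsigned int id)
{
	if (id < DEMO_MAX_IDS) {
		demo_slots[id].addr = NULL;
		demo_slots[id].size = 0;
	}
}

/* Call fn(args...) and register its result under id with the given size. */
#define demo_alloc_id_size_replace(id, size, fn, ...)	\
	demo_track_id((id), (fn)(__VA_ARGS__), (size))

/* Unregister id first, then release the memory with fn(args...). */
#define demo_free_id(id, fn, ...)		\
	do {					\
		demo_untrack_id(id);		\
		(fn)(__VA_ARGS__);		\
	} while (0)

/* Allocator that always fails, to exercise the error path. */
void *demo_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}
```

A call such as demo_alloc_id_size_replace(DEMO_ID_LOG, 64, malloc, 64) then reads much like the converted memblock_alloc() call sites in the diff: the allocator and its arguments stay visible, and the registration happens as a side effect.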
* [RFC][PATCH v2 21/29] kernel/configs: Register dynamic information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (19 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 20/29] printk: Register " Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 22/29] mm/numa: Register " Eugen Hristev
` (8 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Register kernel_config_data information into kmemdump.
Debugging tools look for the start and end markers, so the registered
region needs to include them as well.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
kernel/configs.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/kernel/configs.c b/kernel/configs.c
index a28c79c5f713..ec94b695f234 100644
--- a/kernel/configs.c
+++ b/kernel/configs.c
@@ -15,6 +15,7 @@
#include <linux/seq_file.h>
#include <linux/init.h>
#include <linux/uaccess.h>
+#include <linux/kmemdump.h>
/*
* "IKCFG_ST" and "IKCFG_ED" are used to extract the config data from
@@ -64,6 +65,11 @@ static int __init ikconfig_init(void)
proc_set_size(entry, &kernel_config_data_end - &kernel_config_data);
+ /* Register 8 bytes before and after, to catch the markers too */
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_CONFIG,
+ (void *)&kernel_config_data - 8,
+ &kernel_config_data_end - &kernel_config_data + 16);
+
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
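The pointer arithmetic in this patch relies on the layout in kernel/configs.c, where the 8-byte "IKCFG_ST" marker sits directly before kernel_config_data and the 8-byte "IKCFG_ED" marker directly after kernel_config_data_end. The userspace sketch below reproduces that arithmetic on a small stand-in buffer. The payload string and all demo_* and DEMO_* names are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MARKER_LEN	8	/* strlen("IKCFG_ST") == strlen("IKCFG_ED") */
#define DEMO_PAYLOAD	"payload-bytes"
#define DEMO_PAYLOAD_LEN (sizeof(DEMO_PAYLOAD) - 1)

/* Stand-in for the config blob: start marker, payload, end marker. */
static const char demo_blob[] = "IKCFG_ST" DEMO_PAYLOAD "IKCFG_ED";

/* A region described by start address and length. */
struct demo_span {
	const char *start;
	size_t len;
};

/* Counterpart of &kernel_config_data: first byte after the start marker. */
const char *demo_data(void)
{
	return demo_blob + DEMO_MARKER_LEN;
}

/* Counterpart of &kernel_config_data_end: first byte of the end marker. */
const char *demo_data_end(void)
{
	return demo_data() + DEMO_PAYLOAD_LEN;
}

/*
 * Widen [data, data_end) by one marker on each side:
 * start moves back 8 bytes and the length grows by 16.
 */
struct demo_span demo_span_with_markers(const char *data, const char *data_end)
{
	struct demo_span s = {
		.start = data - DEMO_MARKER_LEN,
		.len = (size_t)(data_end - data) + 2 * DEMO_MARKER_LEN,
	};

	return s;
}
```

The computed span starts at the first byte of "IKCFG_ST" and ends with the last byte of "IKCFG_ED", which is what a tool scanning a dump for those markers would expect to find.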
* [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (20 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 21/29] kernel/configs: Register dynamic " Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-07-30 13:52 ` David Hildenbrand
2025-07-24 13:55 ` [RFC][PATCH v2 23/29] mm/sparse: " Eugen Hristev
` (7 subsequent siblings)
29 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- node_data
Information on this variable is stored in a dedicated kmemdump section.
Register dynamic information into kmemdump:
- dynamic node data for each node
This information is allocated for each node as a physical address, so
call kmemdump_phys_alloc_size, which allocates a unique kmemdump uid and
registers the virtual address.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
mm/numa.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/numa.c b/mm/numa.c
index 7d5e06fe5bd4..88cada571171 100644
--- a/mm/numa.c
+++ b/mm/numa.c
@@ -4,9 +4,11 @@
#include <linux/printk.h>
#include <linux/numa.h>
#include <linux/numa_memblks.h>
+#include <linux/kmemdump.h>
struct pglist_data *node_data[MAX_NUMNODES];
EXPORT_SYMBOL(node_data);
+KMEMDUMP_VAR_CORE(node_data, sizeof(node_data));
/* Allocate NODE_DATA for a node on the local memory */
void __init alloc_node_data(int nid)
@@ -16,7 +18,8 @@ void __init alloc_node_data(int nid)
int tnid;
/* Allocate node data. Try node-local memory and then any node. */
- nd_pa = memblock_phys_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
+ nd_pa = kmemdump_phys_alloc_size(nd_size, memblock_phys_alloc_try_nid,
+ nd_size, SMP_CACHE_BYTES, nid);
if (!nd_pa)
panic("Cannot allocate %zu bytes for node %d data\n",
nd_size, nid);
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 23/29] mm/sparse: Register information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (21 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 22/29] mm/numa: Register " Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 24/29] kernel/vmcore_info: Register dynamic " Eugen Hristev
` (6 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- mem_section
Information on this variable is stored in a dedicated kmemdump section.
Register dynamic information into kmemdump:
- section
- mem_section_usage
This information is allocated for each node, so call kmemdump_alloc_size,
which allocates a unique kmemdump uid and registers the address.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
mm/sparse.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/mm/sparse.c b/mm/sparse.c
index 3c012cf83cc2..04b1b679a2ad 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -15,6 +15,7 @@
#include <linux/swapops.h>
#include <linux/bootmem_info.h>
#include <linux/vmstat.h>
+#include <linux/kmemdump.h>
#include "internal.h"
#include <asm/dma.h>
@@ -30,6 +31,7 @@ struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT]
____cacheline_internodealigned_in_smp;
#endif
EXPORT_SYMBOL(mem_section);
+KMEMDUMP_VAR_CORE(mem_section, sizeof(mem_section));
#ifdef NODE_NOT_IN_PAGE_FLAGS
/*
@@ -67,10 +69,11 @@ static noinline struct mem_section __ref *sparse_index_alloc(int nid)
sizeof(struct mem_section);
if (slab_is_available()) {
- section = kzalloc_node(array_size, GFP_KERNEL, nid);
+ section = kmemdump_alloc_size(array_size, kzalloc_node,
+ array_size, GFP_KERNEL, nid);
} else {
- section = memblock_alloc_node(array_size, SMP_CACHE_BYTES,
- nid);
+ section = kmemdump_alloc_size(array_size, memblock_alloc_node,
+ array_size, SMP_CACHE_BYTES, nid);
if (!section)
panic("%s: Failed to allocate %lu bytes nid=%d\n",
__func__, array_size, nid);
@@ -252,7 +255,9 @@ static void __init memblocks_present(void)
size = sizeof(struct mem_section *) * NR_SECTION_ROOTS;
align = 1 << (INTERNODE_CACHE_SHIFT);
- mem_section = memblock_alloc_or_panic(size, align);
+ mem_section = kmemdump_alloc_id_size(KMEMDUMP_ID_COREIMAGE_MEMSECT,
+ size, memblock_alloc_or_panic,
+ size, align);
}
#endif
@@ -338,7 +343,8 @@ sparse_early_usemaps_alloc_pgdat_section(struct pglist_data *pgdat,
limit = goal + (1UL << PA_SECTION_SHIFT);
nid = early_pfn_to_nid(goal >> PAGE_SHIFT);
again:
- usage = memblock_alloc_try_nid(size, SMP_CACHE_BYTES, goal, limit, nid);
+ usage = kmemdump_alloc_size(size, memblock_alloc_try_nid, size,
+ SMP_CACHE_BYTES, goal, limit, nid);
if (!usage && limit) {
limit = MEMBLOCK_ALLOC_ACCESSIBLE;
goto again;
--
2.43.0
^ permalink raw reply related [flat|nested] 61+ messages in thread
* [RFC][PATCH v2 24/29] kernel/vmcore_info: Register dynamic information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (22 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 23/29] mm/sparse: " Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 25/29] kmemdump: Add additional symbols to the coreimage Eugen Hristev
` (5 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Register vmcoreinfo information into kmemdump.
Because the size of the info is only known after all entries have been
added, there is no point in registering the whole page. Instead, call
the kmemdump registration once everything is in place, with the right size.
A second reason is that the vmcoreinfo is added as a region inside
the ELF coreimage note, so there is no point in having blank space at the end.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
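For illustration only, here is a minimal user-space sketch of the idea
described above: fill the buffer first, then register only the bytes that
are actually used. Every name in it (info_append(), mock_register(),
struct region, INFO_PAGE_SIZE) is hypothetical and is not part of the
kmemdump API.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define INFO_PAGE_SIZE 4096 /* hypothetical stand-in for the vmcoreinfo page */

/* Hypothetical record of what a backend was asked to register. */
struct region {
	const void *vaddr;
	size_t size;
};

static char info_data[INFO_PAGE_SIZE];
static size_t info_size; /* bytes actually used so far */

/* Append one "KEY=value\n" entry; returns 0 on success, -1 if it does not fit. */
static int info_append(const char *key, const char *value)
{
	size_t room = INFO_PAGE_SIZE - info_size;
	int n = snprintf(info_data + info_size, room, "%s=%s\n", key, value);

	if (n < 0 || (size_t)n >= room)
		return -1;
	info_size += (size_t)n;
	return 0;
}

/* Mock registration: it only remembers the address and size it was given. */
static struct region mock_register(const void *vaddr, size_t size)
{
	struct region r = { .vaddr = vaddr, .size = size };

	return r;
}
```

Calling mock_register(info_data, info_size) after the last info_append()
yields a region that covers the text and none of the unused tail of the
page, which is what the ELF note wants.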
kernel/vmcore_info.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
index e066d31d08f8..d808c5e67f35 100644
--- a/kernel/vmcore_info.c
+++ b/kernel/vmcore_info.c
@@ -14,6 +14,7 @@
#include <linux/cpuhotplug.h>
#include <linux/memblock.h>
#include <linux/kmemleak.h>
+#include <linux/kmemdump.h>
#include <asm/page.h>
#include <asm/sections.h>
@@ -227,6 +228,8 @@ static int __init crash_save_vmcoreinfo_init(void)
arch_crash_save_vmcoreinfo();
update_vmcoreinfo_note();
+ kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_VMCOREINFO,
+ (void *)vmcoreinfo_data, vmcoreinfo_size);
return 0;
}
--
2.43.0
* [RFC][PATCH v2 25/29] kmemdump: Add additional symbols to the coreimage
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (23 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 24/29] kernel/vmcore_info: Register dynamic " Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 26/29] init/version: Annotate init uts name separately into Kmemdump Eugen Hristev
` (4 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Add additional symbol IDs which are required by the firmware of specific
platforms for dumping an image.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
include/linux/kmemdump.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/include/linux/kmemdump.h b/include/linux/kmemdump.h
index 7933915c2c78..94493297d643 100644
--- a/include/linux/kmemdump.h
+++ b/include/linux/kmemdump.h
@@ -35,6 +35,22 @@ enum kmemdump_uid {
KMEMDUMP_ID_COREIMAGE_high_memory,
KMEMDUMP_ID_COREIMAGE_init_mm,
KMEMDUMP_ID_COREIMAGE_init_mm_pgd,
+ KMEMDUMP_ID_COREIMAGE__sinittext,
+ KMEMDUMP_ID_COREIMAGE__einittext,
+ KMEMDUMP_ID_COREIMAGE__end,
+ KMEMDUMP_ID_COREIMAGE__text,
+ KMEMDUMP_ID_COREIMAGE__stext,
+ KMEMDUMP_ID_COREIMAGE__etext,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_num_syms,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_relative_base,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_offsets,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_names,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_token_table,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_token_index,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_markers,
+ KMEMDUMP_ID_COREIMAGE_kallsyms_seqs_of_names,
+ KMEMDUMP_ID_COREIMAGE_swapper_pg_dir,
+ KMEMDUMP_ID_COREIMAGE_init_uts_ns_name,
KMEMDUMP_ID_USER_START,
KMEMDUMP_ID_USER_END,
KMEMDUMP_ID_NO_ID,
--
2.43.0
* [RFC][PATCH v2 26/29] init/version: Annotate init uts name separately into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (24 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 25/29] kmemdump: Add additional symbols to the coreimage Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 27/29] kallsyms: Annotate static information " Eugen Hristev
` (3 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Some firmware looks for the init uts name region.
Consequently, this has to be registered as a dedicated region in
kmemdump.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
init/version.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/init/version.c b/init/version.c
index f5910c027948..364e7768da68 100644
--- a/init/version.c
+++ b/init/version.c
@@ -53,6 +53,8 @@ const char linux_banner[] __weak;
#include "version-timestamp.c"
KMEMDUMP_VAR_CORE(init_uts_ns, sizeof(init_uts_ns));
+KMEMDUMP_VAR_CORE_NAMED(init_uts_ns_name, init_uts_ns.name,
+ sizeof(init_uts_ns.name));
KMEMDUMP_VAR_CORE(linux_banner, sizeof(linux_banner));
EXPORT_SYMBOL_GPL(init_uts_ns);
--
2.43.0
* [RFC][PATCH v2 27/29] kallsyms: Annotate static information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (25 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 26/29] init/version: Annotate init uts name separately into Kmemdump Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 28/29] mm/init-mm: Annotate additional " Eugen Hristev
` (2 subsequent siblings)
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate vital static information into kmemdump:
- kallsyms_num_syms
- kallsyms_relative_base
- kallsyms_offsets
- kallsyms_names
- kallsyms_token_table
- kallsyms_token_index
- kallsyms_markers
- kallsyms_seqs_of_names
Information on these variables is stored into a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
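For readers unfamiliar with section-based annotation, below is a rough
user-space sketch of the general technique: each annotated variable drops a
descriptor pointer into a dedicated section, which is later walked like an
array. This is not the kmemdump implementation. DEMO_VAR(), struct
demo_annotation, the section name demo_kmemdump and the demo_* variables
are all made up for the example, and it assumes an ELF target linked with
GNU ld or lld, which provide the __start_/__stop_ symbols for sections
whose names are valid C identifiers.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: what is being annotated, where it is, how big. */
struct demo_annotation {
	const char *name;
	const void *vaddr;
	size_t size;
};

/*
 * Emit a descriptor for @sym and place a pointer to it in the
 * "demo_kmemdump" section. Storing pointers (not structs) keeps the
 * section a dense array regardless of how the compiler aligns structs.
 */
#define DEMO_VAR(sym, sz)						\
	static const struct demo_annotation demo_ann_##sym =		\
		{ #sym, &(sym), (sz) };					\
	static const struct demo_annotation *const demo_ptr_##sym	\
		__attribute__((section("demo_kmemdump"), used)) =	\
		&demo_ann_##sym

/* Section bounds, provided by the linker. */
extern const struct demo_annotation *const __start_demo_kmemdump[];
extern const struct demo_annotation *const __stop_demo_kmemdump[];

static unsigned int demo_num_syms = 3;
static int demo_offsets[] = { 0, 16, 64 };

DEMO_VAR(demo_num_syms, sizeof(demo_num_syms));
DEMO_VAR(demo_offsets, sizeof(demo_offsets));

/* Walk the section and return the descriptor called @name, or NULL. */
static const struct demo_annotation *demo_find(const char *name)
{
	const struct demo_annotation *const *p;

	for (p = __start_demo_kmemdump; p < __stop_demo_kmemdump; p++)
		if (strcmp((*p)->name, name) == 0)
			return *p;
	return NULL;
}
```

Note that sizeof() gives the full array size here only because the array
definition is visible. For a symbol that is merely declared as an array of
unknown bound, only its address is known at the annotation site.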
kernel/kallsyms.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 1e7635864124..442dc13d00cf 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -31,9 +31,19 @@
#include <linux/kernel.h>
#include <linux/bsearch.h>
#include <linux/btf_ids.h>
+#include <linux/kmemdump.h>
#include "kallsyms_internal.h"
+KMEMDUMP_VAR_CORE(kallsyms_num_syms, sizeof(kallsyms_num_syms));
+KMEMDUMP_VAR_CORE(kallsyms_relative_base, sizeof(kallsyms_relative_base));
+KMEMDUMP_VAR_CORE(kallsyms_offsets, sizeof(&kallsyms_offsets));
+KMEMDUMP_VAR_CORE(kallsyms_names, sizeof(&kallsyms_names));
+KMEMDUMP_VAR_CORE(kallsyms_token_table, sizeof(&kallsyms_token_table));
+KMEMDUMP_VAR_CORE(kallsyms_token_index, sizeof(&kallsyms_token_index));
+KMEMDUMP_VAR_CORE(kallsyms_markers, sizeof(&kallsyms_markers));
+KMEMDUMP_VAR_CORE(kallsyms_seqs_of_names, sizeof(&kallsyms_seqs_of_names));
+
/*
* Expand a compressed symbol data into the resulting uncompressed string,
* if uncompressed string is too long (>= maxlen), it will be truncated,
--
2.43.0
* [RFC][PATCH v2 28/29] mm/init-mm: Annotate additional information into Kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (26 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 27/29] kallsyms: Annotate static information " Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 29/29] kmemdump: Add Kinfo backend driver Eugen Hristev
2025-08-26 17:14 ` [RFC][PATCH v2 00/29] introduce kmemdump Mukesh Ojha
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Annotate additional static information into kmemdump:
- _sinittext
- _einittext
- _end
- _text
- _stext
- _etext
- swapper_pg_dir
Information on these variables is stored into a dedicated kmemdump section.
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
mm/init-mm.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/mm/init-mm.c b/mm/init-mm.c
index 2dbbaf640cf4..01ff91f35b23 100644
--- a/mm/init-mm.c
+++ b/mm/init-mm.c
@@ -20,6 +20,13 @@
const struct vm_operations_struct vma_dummy_vm_ops;
+KMEMDUMP_VAR_CORE(_sinittext, sizeof(void *));
+KMEMDUMP_VAR_CORE(_einittext, sizeof(void *));
+KMEMDUMP_VAR_CORE(_end, sizeof(void *));
+KMEMDUMP_VAR_CORE(_text, sizeof(void *));
+KMEMDUMP_VAR_CORE(_stext, sizeof(void *));
+KMEMDUMP_VAR_CORE(_etext, sizeof(void *));
+
/*
* For dynamically allocated mm_structs, there is a dynamically sized cpumask
* at the end of the structure, the size of which depends on the maximum CPU
@@ -51,6 +58,7 @@ struct mm_struct init_mm = {
KMEMDUMP_VAR_CORE(init_mm, sizeof(init_mm));
KMEMDUMP_VAR_CORE_NAMED(init_mm_pgd, init_mm.pgd, sizeof(*init_mm.pgd));
+KMEMDUMP_VAR_CORE(swapper_pg_dir, sizeof(&swapper_pg_dir));
void setup_initial_init_mm(void *start_code, void *end_code,
void *end_data, void *brk)
--
2.43.0
* [RFC][PATCH v2 29/29] kmemdump: Add Kinfo backend driver
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (27 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 28/29] mm/init-mm: Annotate additional " Eugen Hristev
@ 2025-07-24 13:55 ` Eugen Hristev
2025-08-26 17:14 ` [RFC][PATCH v2 00/29] introduce kmemdump Mukesh Ojha
29 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-07-24 13:55 UTC (permalink / raw)
To: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, corbet, mojha,
rostedt, jonechou, tudor.ambarus
Add Kinfo backend driver.
This backend driver selects only the regions that are of interest to the
firmware and copies their addresses into a shared memory area that is
supplied via OF.
The firmware is only interested in the addresses of some symbols.
The list format is kinfo-compatible, as used by devices like the Google
Pixel phone.
Based on original work from Jone Chou <jonechou@google.com>
Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
---
As stated in the cover letter, I did not test this on a device as I do not
have one. Testing is appreciated and feedback is welcome!
This kinfo backend is how we envision it to look, while preserving
compatibility with existing devices and firmware.
Yes, I also know the compatible is not documented. But if we want to have
this driver in the kernel, I can easily add one.
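To make the data flow easier to review, here is a reduced user-space sketch
of what the backend does on each registration: keep the address only for
IDs it cares about, ignore the rest, then refresh the magic number and the
XOR checksum over the info block. The demo_* names and the two-field layout
are simplifications made up for this note; only the magic value and the
32-bit XOR scheme are taken from the driver below.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_MAGIC 0xCCEEDDFFu /* same value as DEBUG_KINFO_MAGIC */

/* Hypothetical region IDs; DEMO_ID_OTHER stands for anything kinfo ignores. */
enum demo_id { DEMO_ID_TEXT, DEMO_ID_END, DEMO_ID_OTHER };

/* Reduced stand-in for struct kernel_info: two addresses only. */
struct demo_info {
	uint64_t text_pa;
	uint64_t end_pa;
} __attribute__((packed));

struct demo_all_info {
	uint32_t magic_number;
	uint32_t combined_checksum;
	struct demo_info info;
} __attribute__((packed));

/* XOR all 32-bit words of the info block into the checksum field. */
static void demo_update_checksum(struct demo_all_info *all)
{
	const unsigned char *bytes = (const unsigned char *)&all->info;
	uint32_t word, sum = 0;
	size_t i;

	for (i = 0; i + sizeof(word) <= sizeof(all->info); i += sizeof(word)) {
		memcpy(&word, bytes + i, sizeof(word)); /* alignment-safe read */
		sum ^= word;
	}
	all->magic_number = DEMO_MAGIC;
	all->combined_checksum = sum;
}

/* Keep the address for IDs of interest; silently accept everything else. */
static int demo_register(struct demo_all_info *all, enum demo_id id,
			 uint64_t pa)
{
	switch (id) {
	case DEMO_ID_TEXT:
		all->info.text_pa = pa;
		break;
	case DEMO_ID_END:
		all->info.end_pa = pa;
		break;
	default:
		break; /* not of interest to the firmware */
	}
	demo_update_checksum(all);
	return 0;
}
```

Since the checksum is a plain XOR, two identical 64-bit fields cancel each
other out, so it only guards against a stale or partially written table,
not against arbitrary corruption.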
MAINTAINERS | 5 +
drivers/debug/Kconfig | 13 ++
drivers/debug/Makefile | 1 +
drivers/debug/kinfo.c | 304 +++++++++++++++++++++++++++++++++++++++++
4 files changed, 323 insertions(+)
create mode 100644 drivers/debug/kinfo.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 68797717175c..bc605480d6e8 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13625,6 +13625,11 @@ F: drivers/debug/kmemdump.c
F: drivers/debug/kmemdump_coreimage.c
F: include/linux/kmemdump.h
+KMEMDUMP KINFO BACKEND DRIVER
+M: Eugen Hristev <eugen.hristev@linaro.org>
+S: Maintained
+F: drivers/debug/kinfo.c
+
KMEMDUMP QCOM MINIDUMP BACKEND DRIVER
M: Eugen Hristev <eugen.hristev@linaro.org>
S: Maintained
diff --git a/drivers/debug/Kconfig b/drivers/debug/Kconfig
index d34ceaf99bd8..1a44990c2824 100644
--- a/drivers/debug/Kconfig
+++ b/drivers/debug/Kconfig
@@ -39,4 +39,17 @@ config KMEMDUMP_QCOM_MINIDUMP_BACKEND
into the minidump table of contents. Further on, the firmware
will be able to read the table of contents and extract the
memory regions on case-by-case basis.
+
+config KMEMDUMP_KINFO_BACKEND
+ tristate "Shared memory KInfo compatible backend"
+ depends on KMEMDUMP
+ help
+ Say y here to enable the Shared memory KInfo compatible backend
+ driver.
+ With this backend, the registered regions are copied to a shared
+ memory zone at register time.
+ The shared memory zone is supplied via OF.
+ This backend will select only regions that are of interest,
+ and keep only addresses. The format of the list is Kinfo compatible.
+
endmenu
diff --git a/drivers/debug/Makefile b/drivers/debug/Makefile
index 7f70b84049cb..861f2e2c4fe2 100644
--- a/drivers/debug/Makefile
+++ b/drivers/debug/Makefile
@@ -3,3 +3,4 @@
obj-$(CONFIG_KMEMDUMP) += kmemdump.o
obj-$(CONFIG_KMEMDUMP_COREIMAGE) += kmemdump_coreimage.o
obj-$(CONFIG_KMEMDUMP_QCOM_MINIDUMP_BACKEND) += qcom_minidump.o
+obj-$(CONFIG_KMEMDUMP_KINFO_BACKEND) += kinfo.o
diff --git a/drivers/debug/kinfo.c b/drivers/debug/kinfo.c
new file mode 100644
index 000000000000..bdf50254fa92
--- /dev/null
+++ b/drivers/debug/kinfo.c
@@ -0,0 +1,304 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/platform_device.h>
+#include <linux/kallsyms.h>
+#include <linux/vmalloc.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_reserved_mem.h>
+#include <linux/moduleparam.h>
+#include <linux/sizes.h>
+#include <linux/slab.h>
+#include <linux/kmemdump.h>
+#include <linux/string.h>
+#include <linux/utsname.h>
+
+#define BUILD_INFO_LEN 256
+#define DEBUG_KINFO_MAGIC 0xCCEEDDFF
+
+/*
+ * Header structure must be byte-packed, since the table is provided to
+ * bootloader.
+ */
+struct kernel_info {
+ /* For kallsyms */
+ __u8 enabled_all;
+ __u8 enabled_base_relative;
+ __u8 enabled_absolute_percpu;
+ __u8 enabled_cfi_clang;
+ __u32 num_syms;
+ __u16 name_len;
+ __u16 bit_per_long;
+ __u16 module_name_len;
+ __u16 symbol_len;
+ __u64 _relative_pa;
+ __u64 _text_pa;
+ __u64 _stext_pa;
+ __u64 _etext_pa;
+ __u64 _sinittext_pa;
+ __u64 _einittext_pa;
+ __u64 _end_pa;
+ __u64 _offsets_pa;
+ __u64 _names_pa;
+ __u64 _token_table_pa;
+ __u64 _token_index_pa;
+ __u64 _markers_pa;
+ __u64 _seqs_of_names_pa;
+
+ /* For frame pointer */
+ __u32 thread_size;
+
+ /* For virt_to_phys */
+ __u64 swapper_pg_dir_pa;
+
+ /* For linux banner */
+ __u8 last_uts_release[__NEW_UTS_LEN];
+
+ /* Info of running build */
+ __u8 build_info[BUILD_INFO_LEN];
+
+ /* For module kallsyms */
+ __u32 enabled_modules_tree_lookup;
+ __u32 mod_mem_offset;
+ __u32 mod_kallsyms_offset;
+} __packed;
+
+struct kernel_all_info {
+ __u32 magic_number;
+ __u32 combined_checksum;
+ struct kernel_info info;
+} __packed;
+
+struct debug_kinfo {
+ struct device *dev;
+ void *all_info_addr;
+ u32 all_info_size;
+ struct kmemdump_backend kinfo_be;
+};
+
+static struct debug_kinfo *kinfo;
+
+#define be_to_kinfo(be) container_of(be, struct debug_kinfo, kinfo_be)
+
+static void update_kernel_all_info(struct kernel_all_info *all_info)
+{
+ int index;
+ struct kernel_info *info;
+ u32 *checksum_info;
+
+ all_info->magic_number = DEBUG_KINFO_MAGIC;
+ all_info->combined_checksum = 0;
+
+ info = &all_info->info;
+ checksum_info = (u32 *)info;
+ for (index = 0; index < sizeof(*info) / sizeof(u32); index++)
+ all_info->combined_checksum ^= checksum_info[index];
+}
+
+static int build_info_set(const char *str, const struct kernel_param *kp)
+{
+ struct kernel_all_info *all_info;
+ size_t build_info_size;
+
+ if (!kinfo || !kinfo->all_info_addr || !kinfo->all_info_size)
+ return -ENAVAIL;
+
+ all_info = kinfo->all_info_addr;
+ build_info_size = sizeof(all_info->info.build_info);
+
+ memcpy(&all_info->info.build_info, str, min(build_info_size - 1,
+ strlen(str)));
+ update_kernel_all_info(all_info);
+
+ if (strlen(str) >= build_info_size) {
+ pr_warn("%s: Build info buffer (len: %zu) can't hold entire string '%s'\n",
+ __func__, build_info_size, str);
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static const struct kernel_param_ops build_info_op = {
+ .set = build_info_set,
+};
+
+module_param_cb(build_info, &build_info_op, NULL, 0200);
+MODULE_PARM_DESC(build_info, "Write build info to field 'build_info' of debug kinfo.");
+
+/**
+ * register_kinfo_region() - Register a new kinfo region
+ * @be: pointer to backend
+ * @id: unique id to identify the region
+ * @vaddr: virtual memory address of the region start
+ * @size: size of the region
+ *
+ * Return: On success, it returns 0 and negative error value on failure.
+ */
+static int register_kinfo_region(const struct kmemdump_backend *be,
+ enum kmemdump_uid id, void *vaddr, size_t size)
+{
+ struct debug_kinfo *kinfo = be_to_kinfo(be);
+ struct kernel_all_info *all_info = kinfo->all_info_addr;
+ struct kernel_info *info = &all_info->info;
+
+ switch (id) {
+ case KMEMDUMP_ID_COREIMAGE__sinittext:
+ info->_sinittext_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE__einittext:
+ info->_einittext_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE__end:
+ info->_end_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE__text:
+ info->_text_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE__stext:
+ info->_stext_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE__etext:
+ info->_etext_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_num_syms:
+ info->num_syms = *(__u32 *)vaddr;
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_relative_base:
+ info->_relative_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_offsets:
+ info->_offsets_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_names:
+ info->_names_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_token_table:
+ info->_token_table_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_token_index:
+ info->_token_index_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_markers:
+ info->_markers_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_kallsyms_seqs_of_names:
+ info->_seqs_of_names_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_swapper_pg_dir:
+ info->swapper_pg_dir_pa = (u64)__pa(vaddr);
+ break;
+ case KMEMDUMP_ID_COREIMAGE_init_uts_ns_name:
+ strscpy(info->last_uts_release, vaddr, __NEW_UTS_LEN);
+ break;
+ default:
+ break;
+ }
+ update_kernel_all_info(all_info);
+ return 0;
+}
+
+/**
+ * unregister_kinfo_region() - Unregister a previously registered kinfo region
+ * @id: unique id to identify the region
+ *
+ * Return: On success, it returns 0 and negative error value on failure.
+ */
+static int unregister_kinfo_region(const struct kmemdump_backend *be,
+ enum kmemdump_uid id)
+{
+ return 0;
+}
+
+static int debug_kinfo_probe(struct platform_device *pdev)
+{
+ struct device_node *mem_region;
+ struct reserved_mem *rmem;
+ struct kernel_info *info;
+ struct kernel_all_info *all_info;
+
+ mem_region = of_parse_phandle(pdev->dev.of_node, "memory-region", 0);
+ if (!mem_region) {
+ dev_warn(&pdev->dev, "no such memory-region\n");
+ return -ENODEV;
+ }
+
+ rmem = of_reserved_mem_lookup(mem_region);
+ if (!rmem) {
+ dev_warn(&pdev->dev, "no such reserved mem of node name %s\n",
+ pdev->dev.of_node->name);
+ return -ENODEV;
+ }
+
+ /* Need to wait for reserved memory to be mapped */
+ if (!rmem->priv)
+ return -EPROBE_DEFER;
+
+ if (!rmem->base || !rmem->size) {
+ dev_warn(&pdev->dev, "unexpected reserved memory\n");
+ return -EINVAL;
+ }
+
+ if (rmem->size < sizeof(struct kernel_all_info)) {
+ dev_warn(&pdev->dev, "unexpected reserved memory size\n");
+ return -EINVAL;
+ }
+
+ kinfo = kzalloc(sizeof(*kinfo), GFP_KERNEL);
+ if (!kinfo)
+ return -ENOMEM;
+
+ kinfo->dev = &pdev->dev;
+
+ strscpy(kinfo->kinfo_be.name, "debug_kinfo");
+ kinfo->kinfo_be.register_region = register_kinfo_region;
+ kinfo->kinfo_be.unregister_region = unregister_kinfo_region;
+ kinfo->all_info_addr = rmem->priv;
+ kinfo->all_info_size = rmem->size;
+
+ all_info = kinfo->all_info_addr;
+
+ memset(all_info, 0, sizeof(struct kernel_all_info));
+ info = &all_info->info;
+ info->enabled_all = IS_ENABLED(CONFIG_KALLSYMS_ALL);
+ info->enabled_absolute_percpu = IS_ENABLED(CONFIG_KALLSYMS_ABSOLUTE_PERCPU);
+ info->enabled_cfi_clang = IS_ENABLED(CONFIG_CFI_CLANG);
+ info->name_len = KSYM_NAME_LEN;
+ info->bit_per_long = BITS_PER_LONG;
+ info->module_name_len = MODULE_NAME_LEN;
+ info->symbol_len = KSYM_SYMBOL_LEN;
+ info->thread_size = THREAD_SIZE;
+ info->enabled_modules_tree_lookup = IS_ENABLED(CONFIG_MODULES_TREE_LOOKUP);
+ info->mod_mem_offset = offsetof(struct module, mem);
+ info->mod_kallsyms_offset = offsetof(struct module, kallsyms);
+
+ return kmemdump_register_backend(&kinfo->kinfo_be);
+}
+
+static void debug_kinfo_remove(struct platform_device *pdev)
+{
+ kmemdump_unregister_backend(&kinfo->kinfo_be);
+ kfree(kinfo);
+}
+
+static const struct of_device_id debug_kinfo_of_match[] = {
+ { .compatible = "google,debug-kinfo" },
+ {},
+};
+MODULE_DEVICE_TABLE(of, debug_kinfo_of_match);
+
+static struct platform_driver debug_kinfo_driver = {
+ .probe = debug_kinfo_probe,
+ .remove = debug_kinfo_remove,
+ .driver = {
+ .name = "debug-kinfo",
+ .of_match_table = of_match_ptr(debug_kinfo_of_match),
+ },
+};
+module_platform_driver(debug_kinfo_driver);
+
+MODULE_AUTHOR("Eugen Hristev <eugen.hristev@linaro.org>");
+MODULE_AUTHOR("Jone Chou <jonechou@google.com>");
+MODULE_DESCRIPTION("Debug Kinfo Driver");
+MODULE_LICENSE("GPL");
--
2.43.0
* Re: [RFC][PATCH v2 02/29] Documentation: add kmemdump
2025-07-24 13:54 ` [RFC][PATCH v2 02/29] Documentation: add kmemdump Eugen Hristev
@ 2025-07-24 14:13 ` Jonathan Corbet
0 siblings, 0 replies; 61+ messages in thread
From: Jonathan Corbet @ 2025-07-24 14:13 UTC (permalink / raw)
To: Eugen Hristev, linux-kernel, linux-arm-msm, linux-arch, linux-mm,
tglx, andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, eugen.hristev, mojha, rostedt,
jonechou, tudor.ambarus
Eugen Hristev <eugen.hristev@linaro.org> writes:
> Document the new kmemdump kernel feature.
Thanks for including documentation!
> Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
> ---
> Documentation/debug/index.rst | 17 ++++++
> Documentation/debug/kmemdump.rst | 98 ++++++++++++++++++++++++++++++++
> MAINTAINERS | 1 +
> 3 files changed, 116 insertions(+)
> create mode 100644 Documentation/debug/index.rst
> create mode 100644 Documentation/debug/kmemdump.rst
>
> diff --git a/Documentation/debug/index.rst b/Documentation/debug/index.rst
> new file mode 100644
> index 000000000000..9a9365c62f02
> --- /dev/null
> +++ b/Documentation/debug/index.rst
> @@ -0,0 +1,17 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +===
> +kmemdump
> +===
> +
> +.. toctree::
> + :maxdepth: 1
> +
> + kmemdump
> +
> +.. only:: subproject and html
> +
> + Indices
> + =======
> +
> + * :ref:`genindex`
Please don't create a new top-level directory for just this tool - I've
been working for years to get Documentation/ under control. This seems
best placed under Documentation/dev-tools/ ?
> diff --git a/Documentation/debug/kmemdump.rst b/Documentation/debug/kmemdump.rst
> new file mode 100644
> index 000000000000..3301abcaed7e
> --- /dev/null
> +++ b/Documentation/debug/kmemdump.rst
> @@ -0,0 +1,98 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +==========================
> +kmemdump
> +==========================
A nit, but it's nicer to match the markup line lengths with the enclosed
text.
> +This document provides information about the kmemdump feature.
> +
> +Overview
> +========
> +
> +kmemdump is a mechanism that allows any driver or producer to register a
> +chunk of memory into kmemdump, to be used at a later time for a specific
> +purpose like debugging or memory dumping.
> +
> +kmemdump allows a backend to be connected, this backend interfaces a
> +specific hardware that can debug or dump the memory registered into
> +kmemdump.
> +
> +kmemdump Internals
> +=============
> +
> +API
> +----
> +
> +A memory region is being registered with a call to `kmemdump_register` which
Please just say kmemdump_register() - that will let our carefully
written automatic markup machinery do its thing. Among other things, it
will create a cross-reference link to the kerneldoc documentation for
this function (if any). All function references should be written that
way.
> +takes as parameters the ID of the region, a pointer to the virtual memory
> +start address and the size. If successful, this call returns a unique ID for
> +the allocated zone (either the requested ID or an allocated ID).
> +IDs are predefined in the kmemdump header. A second registration with the
> +same ID is not allowed, the caller needs to deregister first.
> +A dedicated NO_ID is defined, which has kmemdump allocate a new unique ID
> +for the request and return it. This case is useful with multiple dynamic
> +loop allocations where ID is not significant.
> +
> +A region is unregistered with a call to `kmemdump_unregister` which
> +takes the id as a parameter.
> +
> +For dynamically allocated memory, kmemdump defines a variety of wrappers
> +on top of allocation functions which are given as parameters.
> +This makes the dynamic allocation easy to use without additional calls
> +to registration functions. However kmemdump still exposes the register API
> +for cases where it may be needed (e.g. size is not exactly known at allocation
> +time).
> +
> +For static variables, a variety of annotation macros are provided. These
> +macros will create an annotation struct inside a separate section.
> +
> +
> +Backend
> +-------
> +
> +Backend is represented by a `struct kmemdump_backend` which has to be filled
Structures, too, can be mentioned without explicit markup.
> +in by the backend driver. Further, this struct is being passed to kmemdump
> +with a `backend_register` call. `backend_unregister` will remove the backend
> +from kmemdump.
> +
> +Once a backend is being registered, all previously registered regions are
> +being sent to the backend for registration.
> +
> +When the backend is being removed, all regions are being first deregistered
> +from the backend.
> +
> +kmemdump will request the backend to register a region with `register_region`
> +call, and deregister a region with `unregister_region` call. These two
> +functions are mandatory to be provided by a backend at registration time.
> +
> +Data structures
> +---------------
> +
> +`struct kmemdump_backend` represents the kmemdump backend and it has two
> +function pointers, one called `register_region` and the other
> +`unregister_region`.
> +There is a default backend that does a no-op that is initially registered
> +and is registered back if the current working backend is being removed.
Rather than this sort of handwavy description, why not just use the
kerneldoc comments you have written for this structure?
> +The regions are being stored in a simple fixed size array. It avoids
> +memory allocation overhead. This is not performance critical nor does
> +allocating a few hundred entries create a memory consumption problem.
> +
> +The static variables registered into kmemdump are being annotated into
> +a dedicated `.kmemdump` memory section. This is then walked by kmemdump
> +at a later time and each variable is registered.
> +
> +kmemdump Initialization
> +------------------
> +
> +After system boots, kmemdump will be ready to accept region registration
> +from producer drivers. Even if the backend may not be registered yet,
> +there is a default no-op backend that is registered. At any time the backend
> +can be changed with a real backend in which case all regions are being
> +registered to the new backend.
> +
> +backend functionality
> +-----------------
> +
> +kmemdump backend can keep its own list of regions and use the specific
> +hardware available to dump the memory regions or use them for debugging.
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 7e8da575025c..ef0ffdfaf3de 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -13620,6 +13620,7 @@ F: drivers/iio/accel/kionix-kx022a*
> KMEMDUMP
> M: Eugen Hristev <eugen.hristev@linaro.org>
> S: Maintained
> +F: Documentation/debug/kmemdump.rst
> F: drivers/debug/kmemdump.c
> F: include/linux/kmemdump.h
Thanks,
jon
* Re: [RFC][PATCH v2 01/29] kmemdump: introduce kmemdump
2025-07-24 13:54 ` [RFC][PATCH v2 01/29] kmemdump: " Eugen Hristev
@ 2025-07-26 3:33 ` Randy Dunlap
2025-07-26 3:36 ` Randy Dunlap
1 sibling, 0 replies; 61+ messages in thread
From: Randy Dunlap @ 2025-07-26 3:33 UTC (permalink / raw)
To: Eugen Hristev, linux-kernel, linux-arm-msm, linux-arch, linux-mm,
tglx, andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, corbet, mojha, rostedt,
jonechou, tudor.ambarus
On 7/24/25 6:54 AM, Eugen Hristev wrote:
> diff --git a/drivers/Kconfig b/drivers/Kconfig
> index e0777f5ed543..412ef182d5c2 100644
> --- a/drivers/Kconfig
> +++ b/drivers/Kconfig
> @@ -245,4 +245,8 @@ source "drivers/hte/Kconfig"
>
> source "drivers/cdx/Kconfig"
>
> +source "drivers/dpll/Kconfig"
Why add dpll here? It's already in this Kconfig file.
> +
> +source "drivers/debug/Kconfig"
> +
> endmenu
--
~Randy
* Re: [RFC][PATCH v2 01/29] kmemdump: introduce kmemdump
2025-07-24 13:54 ` [RFC][PATCH v2 01/29] kmemdump: " Eugen Hristev
2025-07-26 3:33 ` Randy Dunlap
@ 2025-07-26 3:36 ` Randy Dunlap
1 sibling, 0 replies; 61+ messages in thread
From: Randy Dunlap @ 2025-07-26 3:36 UTC (permalink / raw)
To: Eugen Hristev, linux-kernel, linux-arm-msm, linux-arch, linux-mm,
tglx, andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, corbet, mojha, rostedt,
jonechou, tudor.ambarus
Hi,
On 7/24/25 6:54 AM, Eugen Hristev wrote:
> diff --git a/drivers/debug/Kconfig b/drivers/debug/Kconfig
> new file mode 100644
> index 000000000000..b86585c5d621
> --- /dev/null
> +++ b/drivers/debug/Kconfig
> @@ -0,0 +1,16 @@
> +# SPDX-License-Identifier: GPL-2.0
> +menu "Generic Debug Options"
> +
> +config KMEMDUMP
> + bool "Allow the kernel to register memory regions for dumping purpose"
> + help
> + Kmemdump mechanism allows any driver to register a specific memory
> + area for later dumping purpose, depending on the functionality
> + of the attached backend. The backend would interface any hardware
> + mechanism that will allow dumping to happen regardless of the
> + state of the kernel (running, frozen, crashed, or any particular
> + state).
> +
> + Note that modules using this feature must be rebuilt if option
> + changes.
It seems to me that this (all of the KMEMDUMP Kconfig options) could live in
mm/Kconfig.debug instead of creating a new subdir for it.
--
~Randy
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-07-24 13:55 ` [RFC][PATCH v2 22/29] mm/numa: Register " Eugen Hristev
@ 2025-07-30 13:52 ` David Hildenbrand
2025-07-30 13:57 ` Eugen Hristev
0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2025-07-30 13:52 UTC (permalink / raw)
To: Eugen Hristev, linux-kernel, linux-arm-msm, linux-arch, linux-mm,
tglx, andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, corbet, mojha, rostedt,
jonechou, tudor.ambarus
On 24.07.25 15:55, Eugen Hristev wrote:
> Annotate vital static information into kmemdump:
> - node_data
>
> Information on these variables is stored into dedicated kmemdump section.
>
> Register dynamic information into kmemdump:
> - dynamic node data for each node
>
> This information is being allocated for each node, as physical address,
> so call kmemdump_phys_alloc_size that will allocate a unique kmemdump
> uid, and register the virtual address.
>
> Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
> ---
> mm/numa.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/numa.c b/mm/numa.c
> index 7d5e06fe5bd4..88cada571171 100644
> --- a/mm/numa.c
> +++ b/mm/numa.c
> @@ -4,9 +4,11 @@
> #include <linux/printk.h>
> #include <linux/numa.h>
> #include <linux/numa_memblks.h>
> +#include <linux/kmemdump.h>
>
> struct pglist_data *node_data[MAX_NUMNODES];
> EXPORT_SYMBOL(node_data);
> +KMEMDUMP_VAR_CORE(node_data, MAX_NUMNODES * sizeof(struct pglist_data));
>
> /* Allocate NODE_DATA for a node on the local memory */
> void __init alloc_node_data(int nid)
> @@ -16,7 +18,8 @@ void __init alloc_node_data(int nid)
> int tnid;
>
> /* Allocate node data. Try node-local memory and then any node. */
> - nd_pa = memblock_phys_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
> + nd_pa = kmemdump_phys_alloc_size(nd_size, memblock_phys_alloc_try_nid,
> + nd_size, SMP_CACHE_BYTES, nid);
Do we really want to wrap memblock allocations in such a way? :/
Gah, no, no no.
Can't we pass that as some magical flag, or just ... register *after*
allocating?
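A minimal userspace sketch of the "register *after* allocating"
alternative, for illustration only. The names dump_register_region() and
alloc_node_data_sketch() are hypothetical and are not the API from this
series; malloc() stands in for memblock_phys_alloc_try_nid().

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical registry entry: one memory range to include in a dump. */
struct dump_region {
	const void *start;
	size_t size;
};

#define MAX_REGIONS 8
static struct dump_region regions[MAX_REGIONS];
static size_t nr_regions;

/* Hypothetical helper: record a range that already exists.
 * Returns 0 on success, -1 on a bad argument or a full table. */
static int dump_register_region(const void *start, size_t size)
{
	if (!start || !size || nr_regions >= MAX_REGIONS)
		return -1;
	regions[nr_regions].start = start;
	regions[nr_regions].size = size;
	nr_regions++;
	return 0;
}

/* Stand-in for alloc_node_data(): allocate first, register afterwards. */
static void *alloc_node_data_sketch(size_t nd_size)
{
	/* malloc() stands in for memblock_phys_alloc_try_nid(). */
	void *nd = malloc(nd_size);

	if (!nd)
		return NULL;
	/* The allocation call itself is untouched; registration is a
	 * separate, explicit step that could be dropped without changing
	 * how the memory is obtained. */
	dump_register_region(nd, nd_size);
	return nd;
}
```

With this shape the allocator call keeps its original form, at the cost
of one extra line at each call site.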
--
Cheers,
David / dhildenb
* Re: [RFC][PATCH v2 16/29] mm/show_mem: Annotate static information into Kmemdump
2025-07-24 13:54 ` [RFC][PATCH v2 16/29] mm/show_mem: " Eugen Hristev
@ 2025-07-30 13:55 ` David Hildenbrand
2025-07-30 14:04 ` Eugen Hristev
0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2025-07-30 13:55 UTC (permalink / raw)
To: Eugen Hristev, linux-kernel, linux-arm-msm, linux-arch, linux-mm,
tglx, andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, corbet, mojha, rostedt,
jonechou, tudor.ambarus
On 24.07.25 15:54, Eugen Hristev wrote:
> Annotate vital static information into kmemdump:
> - _totalram_pages
>
> Information on these variables is stored into dedicated kmemdump section.
>
> Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
> ---
> mm/show_mem.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/mm/show_mem.c b/mm/show_mem.c
> index 41999e94a56d..93a5dc041ae1 100644
> --- a/mm/show_mem.c
> +++ b/mm/show_mem.c
> @@ -14,12 +14,14 @@
> #include <linux/mmzone.h>
> #include <linux/swap.h>
> #include <linux/vmstat.h>
> +#include <linux/kmemdump.h>
>
> #include "internal.h"
> #include "swap.h"
>
> atomic_long_t _totalram_pages __read_mostly;
> EXPORT_SYMBOL(_totalram_pages);
> +KMEMDUMP_VAR_CORE(_totalram_pages, sizeof(_totalram_pages));
Tagging these variables that way is really rather ... controversial.
As these are exported globals, isn't there a way to have a list of what
to include and what not somewhere else?
Not sure if any of that would win a beauty prize, though.
--
Cheers,
David / dhildenb
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-07-30 13:52 ` David Hildenbrand
@ 2025-07-30 13:57 ` Eugen Hristev
2025-07-30 14:04 ` David Hildenbrand
0 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-07-30 13:57 UTC (permalink / raw)
To: David Hildenbrand, linux-kernel, linux-arm-msm, linux-arch,
linux-mm, tglx, andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, corbet, mojha, rostedt,
jonechou, tudor.ambarus
Hello,
On 7/30/25 16:52, David Hildenbrand wrote:
> On 24.07.25 15:55, Eugen Hristev wrote:
>> Annotate vital static information into kmemdump:
>> - node_data
>>
>> Information on these variables is stored into dedicated kmemdump section.
>>
>> Register dynamic information into kmemdump:
>> - dynamic node data for each node
>>
>> This information is being allocated for each node, as physical address,
>> so call kmemdump_phys_alloc_size that will allocate an unique kmemdump
>> uid, and register the virtual address.
>>
>> Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
>> ---
>> mm/numa.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/numa.c b/mm/numa.c
>> index 7d5e06fe5bd4..88cada571171 100644
>> --- a/mm/numa.c
>> +++ b/mm/numa.c
>> @@ -4,9 +4,11 @@
>> #include <linux/printk.h>
>> #include <linux/numa.h>
>> #include <linux/numa_memblks.h>
>> +#include <linux/kmemdump.h>
>>
>> struct pglist_data *node_data[MAX_NUMNODES];
>> EXPORT_SYMBOL(node_data);
>> +KMEMDUMP_VAR_CORE(node_data, MAX_NUMNODES * sizeof(struct pglist_data));
>>
>> /* Allocate NODE_DATA for a node on the local memory */
>> void __init alloc_node_data(int nid)
>> @@ -16,7 +18,8 @@ void __init alloc_node_data(int nid)
>> int tnid;
>>
>> /* Allocate node data. Try node-local memory and then any node. */
>> - nd_pa = memblock_phys_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
>> + nd_pa = kmemdump_phys_alloc_size(nd_size, memblock_phys_alloc_try_nid,
>> + nd_size, SMP_CACHE_BYTES, nid);
>
> Do we really want to wrap memblock allocations in such a way? :/
>
> Gah, no, no no.
>
> Can't we pass that as some magical flag, or just ... register *after*
> allocating?
>
Thanks for looking into my patch.
Yes, registering after is also an option. This is how I initially
designed the kmemdump API, and I also had in mind to add a flag. However,
when we discussed it, Thomas Gleixner came up with the macro wrapper idea
here:
https://lore.kernel.org/lkml/87ikkzpcup.ffs@tglx/
Do you think we can continue that discussion, or maybe start it here?
Eugen
* Re: [RFC][PATCH v2 16/29] mm/show_mem: Annotate static information into Kmemdump
2025-07-30 13:55 ` David Hildenbrand
@ 2025-07-30 14:04 ` Eugen Hristev
2025-07-30 14:10 ` David Hildenbrand
0 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-07-30 14:04 UTC (permalink / raw)
To: David Hildenbrand, linux-kernel, linux-arm-msm, linux-arch,
linux-mm, tglx, andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, corbet, mojha, rostedt,
jonechou, tudor.ambarus
On 7/30/25 16:55, David Hildenbrand wrote:
> On 24.07.25 15:54, Eugen Hristev wrote:
>> Annotate vital static information into kmemdump:
>> - _totalram_pages
>>
>> Information on these variables is stored into dedicated kmemdump section.
>>
>> Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
>> ---
>> mm/show_mem.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/mm/show_mem.c b/mm/show_mem.c
>> index 41999e94a56d..93a5dc041ae1 100644
>> --- a/mm/show_mem.c
>> +++ b/mm/show_mem.c
>> @@ -14,12 +14,14 @@
>> #include <linux/mmzone.h>
>> #include <linux/swap.h>
>> #include <linux/vmstat.h>
>> +#include <linux/kmemdump.h>
>>
>> #include "internal.h"
>> #include "swap.h"
>>
>> atomic_long_t _totalram_pages __read_mostly;
>> EXPORT_SYMBOL(_totalram_pages);
>> +KMEMDUMP_VAR_CORE(_totalram_pages, sizeof(_totalram_pages));
>
> Tagging these variables that way is really rather ... controversial.
>
> As these are exported globals, isn't there a way to have a list of what
> to include and what not somewhere else?
>
> Not sure if any of that would win a beauty price, though.
>
Annotating the variable was suggested here:
https://lore.kernel.org/lkml/87h61wn2qq.ffs@tglx/
It does not win a beauty prize, but at least it's simple and efficient.
Do you think it would be better to gather all the annotations for the
globals in a single place?
Eugen
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-07-30 13:57 ` Eugen Hristev
@ 2025-07-30 14:04 ` David Hildenbrand
2025-08-04 10:54 ` Michal Hocko
0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2025-07-30 14:04 UTC (permalink / raw)
To: Eugen Hristev, linux-kernel, linux-arm-msm, linux-arch, linux-mm,
tglx, andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, corbet, mojha, rostedt,
jonechou, tudor.ambarus
On 30.07.25 15:57, Eugen Hristev wrote:
> Hello,
>
> On 7/30/25 16:52, David Hildenbrand wrote:
>> On 24.07.25 15:55, Eugen Hristev wrote:
>>> Annotate vital static information into kmemdump:
>>> - node_data
>>>
>>> Information on these variables is stored into dedicated kmemdump section.
>>>
>>> Register dynamic information into kmemdump:
>>> - dynamic node data for each node
>>>
>>> This information is being allocated for each node, as physical address,
>>> so call kmemdump_phys_alloc_size that will allocate an unique kmemdump
>>> uid, and register the virtual address.
>>>
>>> Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
>>> ---
>>> mm/numa.c | 5 ++++-
>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/numa.c b/mm/numa.c
>>> index 7d5e06fe5bd4..88cada571171 100644
>>> --- a/mm/numa.c
>>> +++ b/mm/numa.c
>>> @@ -4,9 +4,11 @@
>>> #include <linux/printk.h>
>>> #include <linux/numa.h>
>>> #include <linux/numa_memblks.h>
>>> +#include <linux/kmemdump.h>
>>>
>>> struct pglist_data *node_data[MAX_NUMNODES];
>>> EXPORT_SYMBOL(node_data);
>>> +KMEMDUMP_VAR_CORE(node_data, MAX_NUMNODES * sizeof(struct pglist_data));
>>>
>>> /* Allocate NODE_DATA for a node on the local memory */
>>> void __init alloc_node_data(int nid)
>>> @@ -16,7 +18,8 @@ void __init alloc_node_data(int nid)
>>> int tnid;
>>>
>>> /* Allocate node data. Try node-local memory and then any node. */
>>> - nd_pa = memblock_phys_alloc_try_nid(nd_size, SMP_CACHE_BYTES, nid);
>>> + nd_pa = kmemdump_phys_alloc_size(nd_size, memblock_phys_alloc_try_nid,
>>> + nd_size, SMP_CACHE_BYTES, nid);
>>
>> Do we really want to wrap memblock allocations in such a way? :/
>>
>> Gah, no, no no.
>>
>> Can't we pass that as some magical flag, or just ... register *after*
>> allocating?
>>
>
> Thanks for looking into my patch.
>
> Yes, registering after is also an option. Initially this is how I
> designed the kmemdump API, I also had in mind to add a flag, but, after
> discussing with Thomas Gleixner, he came up with the macro wrapper idea
> here:
> https://lore.kernel.org/lkml/87ikkzpcup.ffs@tglx/
> Do you think we can continue that discussion , or maybe start it here ?
Yeah, I don't like that, but I can see how we ended up here.
I also don't quite like the idea that we must encode here what to
include in a dump and what not ...
For the vmcore we construct it at runtime in
crash_save_vmcoreinfo_init(), where we e.g., have
VMCOREINFO_STRUCT_SIZE(pglist_data);
Could we similarly have some place where we construct what to dump,
just not using the current values, but the memory ranges?
Did you consider that?
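A rough userspace sketch of that idea, for illustration only: one central
table, in the spirit of crash_save_vmcoreinfo_init(), that records
start + size of each range instead of the current values. DUMP_RANGE(),
struct dump_range and dump_range_find() are hypothetical names, and the
two globals are stand-ins for the real ones in mm/.

```c
#include <stddef.h>
#include <string.h>

/* Stand-ins for kernel globals; the real definitions live in mm/. */
#define MAX_NUMNODES 4
struct pglist_data { long placeholder; };
struct pglist_data *node_data[MAX_NUMNODES];
long totalram_pages_sketch;

/* Hypothetical description of one range to dump. */
struct dump_range {
	const char *name;
	const void *start;
	size_t size;
};

/* Record the address and size of an object, not its current value. */
#define DUMP_RANGE(sym) { #sym, &(sym), sizeof(sym) }

/* The single place that decides what goes into a dump. */
static const struct dump_range core_ranges[] = {
	DUMP_RANGE(node_data),
	DUMP_RANGE(totalram_pages_sketch),
};

/* Look a range up by name; returns NULL if it is not in the table. */
static const struct dump_range *dump_range_find(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(core_ranges) / sizeof(core_ranges[0]); i++) {
		if (strcmp(core_ranges[i].name, name) == 0)
			return &core_ranges[i];
	}
	return NULL;
}
```

Because the table uses sizeof(sym), the size always matches the object as
declared, and the files defining the objects need no dump-specific code.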
--
Cheers,
David / dhildenb
* Re: [RFC][PATCH v2 16/29] mm/show_mem: Annotate static information into Kmemdump
2025-07-30 14:04 ` Eugen Hristev
@ 2025-07-30 14:10 ` David Hildenbrand
0 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2025-07-30 14:10 UTC (permalink / raw)
To: Eugen Hristev, linux-kernel, linux-arm-msm, linux-arch, linux-mm,
tglx, andersson, pmladek
Cc: linux-arm-kernel, linux-hardening, corbet, mojha, rostedt,
jonechou, tudor.ambarus
On 30.07.25 16:04, Eugen Hristev wrote:
>
>
> On 7/30/25 16:55, David Hildenbrand wrote:
>> On 24.07.25 15:54, Eugen Hristev wrote:
>>> Annotate vital static information into kmemdump:
>>> - _totalram_pages
>>>
>>> Information on these variables is stored into dedicated kmemdump section.
>>>
>>> Signed-off-by: Eugen Hristev <eugen.hristev@linaro.org>
>>> ---
>>> mm/show_mem.c | 2 ++
>>> 1 file changed, 2 insertions(+)
>>>
>>> diff --git a/mm/show_mem.c b/mm/show_mem.c
>>> index 41999e94a56d..93a5dc041ae1 100644
>>> --- a/mm/show_mem.c
>>> +++ b/mm/show_mem.c
>>> @@ -14,12 +14,14 @@
>>> #include <linux/mmzone.h>
>>> #include <linux/swap.h>
>>> #include <linux/vmstat.h>
>>> +#include <linux/kmemdump.h>
>>>
>>> #include "internal.h"
>>> #include "swap.h"
>>>
>>> atomic_long_t _totalram_pages __read_mostly;
>>> EXPORT_SYMBOL(_totalram_pages);
>>> +KMEMDUMP_VAR_CORE(_totalram_pages, sizeof(_totalram_pages));
>>
>> Tagging these variables that way is really rather ... controversial.
>>
>> As these are exported globals, isn't there a way to have a list of what
>> to include and what not somewhere else?
>>
>> Not sure if any of that would win a beauty price, though.
>>
>
> Annotating the variable was suggested here :
>
> https://lore.kernel.org/lkml/87h61wn2qq.ffs@tglx/
>
> It does not win a beauty prize but it's simple and efficient at least.
> Do you think it would be better to gather all the annotations for the
> globals in a single place ?
See my other mail regarding VMCOREINFO.
--
Cheers,
David / dhildenb
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-07-30 14:04 ` David Hildenbrand
@ 2025-08-04 10:54 ` Michal Hocko
2025-08-04 11:06 ` Eugen Hristev
2025-08-04 12:16 ` David Hildenbrand
0 siblings, 2 replies; 61+ messages in thread
From: Michal Hocko @ 2025-08-04 10:54 UTC (permalink / raw)
To: David Hildenbrand
Cc: Eugen Hristev, linux-kernel, linux-arm-msm, linux-arch, linux-mm,
tglx, andersson, pmladek, linux-arm-kernel, linux-hardening,
corbet, mojha, rostedt, jonechou, tudor.ambarus
On Wed 30-07-25 16:04:28, David Hildenbrand wrote:
> On 30.07.25 15:57, Eugen Hristev wrote:
[...]
> > Yes, registering after is also an option. Initially this is how I
> > designed the kmemdump API, I also had in mind to add a flag, but, after
> > discussing with Thomas Gleixner, he came up with the macro wrapper idea
> > here:
> > https://lore.kernel.org/lkml/87ikkzpcup.ffs@tglx/
> > Do you think we can continue that discussion , or maybe start it here ?
>
> Yeah, I don't like that, but I can see how we ended up here.
>
> I also don't quite like the idea that we must encode here what to include in
> a dump and what not ...
>
> For the vmcore we construct it at runtime in crash_save_vmcoreinfo_init(),
> where we e.g., have
>
> VMCOREINFO_STRUCT_SIZE(pglist_data);
>
> Could we similar have some place where we construct what to dump similarly,
> just not using the current values, but the memory ranges?
All those symbols are part of kallsyms, right? Can we just use kallsyms
infrastructure and a list of symbols to get what we need from there?
In other words the list of symbols to be completely external to the code
that is defining them?
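A small userspace sketch of such an external list, for illustration only.
The symbol table here is a mock and does not use the real kallsyms
interfaces; mock_symtab, wanted[] and resolve_wanted() are hypothetical
names.

```c
#include <stddef.h>
#include <string.h>

/* Mock symbol table entry standing in for a kallsyms-style lookup. */
struct mock_sym {
	const char *name;
	void *addr;
	size_t size;
};

static long totalram_pages_mock;
static int nr_nodes_mock;

static const struct mock_sym mock_symtab[] = {
	{ "totalram_pages_mock", &totalram_pages_mock, sizeof(totalram_pages_mock) },
	{ "nr_nodes_mock", &nr_nodes_mock, sizeof(nr_nodes_mock) },
};

/* The list of names lives apart from the code defining the symbols. */
static const char *const wanted[] = {
	"totalram_pages_mock",
	"nr_nodes_mock",
	"not_in_symtab",
};

struct dump_range {
	const void *start;
	size_t size;
};

static const struct mock_sym *mock_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(mock_symtab) / sizeof(mock_symtab[0]); i++) {
		if (strcmp(mock_symtab[i].name, name) == 0)
			return &mock_symtab[i];
	}
	return NULL;
}

/* Resolve every wanted name; returns how many ranges were filled in. */
static size_t resolve_wanted(struct dump_range *out, size_t max)
{
	size_t i, n = 0;

	for (i = 0; i < sizeof(wanted) / sizeof(wanted[0]) && n < max; i++) {
		const struct mock_sym *s = mock_lookup(wanted[i]);

		if (!s)
			continue; /* names without a symbol are skipped */
		out[n].start = s->addr;
		out[n].size = s->size;
		n++;
	}
	return n;
}
```

Names that cannot be resolved are silently skipped in this sketch; a real
implementation would probably want to report them.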
--
Michal Hocko
SUSE Labs
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-04 10:54 ` Michal Hocko
@ 2025-08-04 11:06 ` Eugen Hristev
2025-08-04 12:18 ` David Hildenbrand
2025-08-04 12:16 ` David Hildenbrand
1 sibling, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-08-04 11:06 UTC (permalink / raw)
To: Michal Hocko, David Hildenbrand
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus
On 8/4/25 13:54, Michal Hocko wrote:
> On Wed 30-07-25 16:04:28, David Hildenbrand wrote:
>> On 30.07.25 15:57, Eugen Hristev wrote:
> [...]
>>> Yes, registering after is also an option. Initially this is how I
>>> designed the kmemdump API, I also had in mind to add a flag, but, after
>>> discussing with Thomas Gleixner, he came up with the macro wrapper idea
>>> here:
>>> https://lore.kernel.org/lkml/87ikkzpcup.ffs@tglx/
>>> Do you think we can continue that discussion , or maybe start it here ?
>>
>> Yeah, I don't like that, but I can see how we ended up here.
>>
>> I also don't quite like the idea that we must encode here what to include in
>> a dump and what not ...
>>
>> For the vmcore we construct it at runtime in crash_save_vmcoreinfo_init(),
>> where we e.g., have
>>
>> VMCOREINFO_STRUCT_SIZE(pglist_data);
>>
>> Could we similar have some place where we construct what to dump similarly,
>> just not using the current values, but the memory ranges?
>
> All those symbols are part of kallsyms, right? Can we just use kallsyms
> infrastructure and a list of symbols to get what we need from there?
>
> In other words the list of symbols to be completely external to the code
> that is defining them?
Some static symbols are indeed part of kallsyms. But some symbols are
not exported, for example in patch 20/29, where the printk-related symbols
are not to be exported. Another example is static variables, like in
patch 17/29, which are not exported as symbols but are required for the dump.
Dynamic memory regions also have to be considered: have a look, for
example, at patch 23/29, where dynamically allocated memory needs to
be registered.
Do you think that I should move the annotations for all kallsyms-related
symbols into a separate place and keep the ones for the static/dynamic
regions in place?
Thanks for looking into my patch,
Eugen
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-04 10:54 ` Michal Hocko
2025-08-04 11:06 ` Eugen Hristev
@ 2025-08-04 12:16 ` David Hildenbrand
1 sibling, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2025-08-04 12:16 UTC (permalink / raw)
To: Michal Hocko
Cc: Eugen Hristev, linux-kernel, linux-arm-msm, linux-arch, linux-mm,
tglx, andersson, pmladek, linux-arm-kernel, linux-hardening,
corbet, mojha, rostedt, jonechou, tudor.ambarus
On 04.08.25 12:54, Michal Hocko wrote:
> On Wed 30-07-25 16:04:28, David Hildenbrand wrote:
>> On 30.07.25 15:57, Eugen Hristev wrote:
> [...]
>>> Yes, registering after is also an option. Initially this is how I
>>> designed the kmemdump API, I also had in mind to add a flag, but, after
>>> discussing with Thomas Gleixner, he came up with the macro wrapper idea
>>> here:
>>> https://lore.kernel.org/lkml/87ikkzpcup.ffs@tglx/
>>> Do you think we can continue that discussion , or maybe start it here ?
>>
>> Yeah, I don't like that, but I can see how we ended up here.
>>
>> I also don't quite like the idea that we must encode here what to include in
>> a dump and what not ...
>>
>> For the vmcore we construct it at runtime in crash_save_vmcoreinfo_init(),
>> where we e.g., have
>>
>> VMCOREINFO_STRUCT_SIZE(pglist_data);
>>
>> Could we similar have some place where we construct what to dump similarly,
>> just not using the current values, but the memory ranges?
>
> All those symbols are part of kallsyms, right? Can we just use kallsyms
> infrastructure and a list of symbols to get what we need from there?
>
> In other words the list of symbols to be completely external to the code
> that is defining them?
That was the idea. All we should need is the start+size of the ranges.
No need to have these kmemdump specifics all over the kernel.
--
Cheers,
David / dhildenb
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-04 11:06 ` Eugen Hristev
@ 2025-08-04 12:18 ` David Hildenbrand
2025-08-04 12:29 ` Eugen Hristev
0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2025-08-04 12:18 UTC (permalink / raw)
To: Eugen Hristev, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus
On 04.08.25 13:06, Eugen Hristev wrote:
>
>
> On 8/4/25 13:54, Michal Hocko wrote:
>> On Wed 30-07-25 16:04:28, David Hildenbrand wrote:
>>> On 30.07.25 15:57, Eugen Hristev wrote:
>> [...]
>>>> Yes, registering after is also an option. Initially this is how I
>>>> designed the kmemdump API, I also had in mind to add a flag, but, after
>>>> discussing with Thomas Gleixner, he came up with the macro wrapper idea
>>>> here:
>>>> https://lore.kernel.org/lkml/87ikkzpcup.ffs@tglx/
>>>> Do you think we can continue that discussion , or maybe start it here ?
>>>
>>> Yeah, I don't like that, but I can see how we ended up here.
>>>
>>> I also don't quite like the idea that we must encode here what to include in
>>> a dump and what not ...
>>>
>>> For the vmcore we construct it at runtime in crash_save_vmcoreinfo_init(),
>>> where we e.g., have
>>>
>>> VMCOREINFO_STRUCT_SIZE(pglist_data);
>>>
>>> Could we similar have some place where we construct what to dump similarly,
>>> just not using the current values, but the memory ranges?
>>
>> All those symbols are part of kallsyms, right? Can we just use kallsyms
>> infrastructure and a list of symbols to get what we need from there?
>>
>> In other words the list of symbols to be completely external to the code
>> that is defining them?
>
> Some static symbols are indeed part of kallsyms. But some symbols are
> not exported, for example patch 20/29, where printk related symbols are
> not to be exported. Another example is with static variables, like in
> patch 17/29 , not exported as symbols, but required for the dump.
> Dynamic memory regions are not have to also be considered, have a look
> for example at patch 23/29 , where dynamically allocated memory needs to
> be registered.
>
> Do you think that I should move all kallsyms related symbols annotation
> into a separate place and keep it for the static/dynamic regions in place ?
If you want to use a symbol from kmemdump, then make that symbol
available to kmemdump.
IOW, if we were to rip out kmemdump tomorrow, we wouldn't have to touch
any non-kmemdump-specific files.
--
Cheers,
David / dhildenb
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-04 12:18 ` David Hildenbrand
@ 2025-08-04 12:29 ` Eugen Hristev
2025-08-04 12:49 ` David Hildenbrand
0 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-08-04 12:29 UTC (permalink / raw)
To: David Hildenbrand, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus
On 8/4/25 15:18, David Hildenbrand wrote:
> On 04.08.25 13:06, Eugen Hristev wrote:
>>
>>
>> On 8/4/25 13:54, Michal Hocko wrote:
>>> On Wed 30-07-25 16:04:28, David Hildenbrand wrote:
>>>> On 30.07.25 15:57, Eugen Hristev wrote:
>>> [...]
>>>>> Yes, registering after is also an option. Initially this is how I
>>>>> designed the kmemdump API, I also had in mind to add a flag, but, after
>>>>> discussing with Thomas Gleixner, he came up with the macro wrapper idea
>>>>> here:
>>>>> https://lore.kernel.org/lkml/87ikkzpcup.ffs@tglx/
>>>>> Do you think we can continue that discussion , or maybe start it here ?
>>>>
>>>> Yeah, I don't like that, but I can see how we ended up here.
>>>>
>>>> I also don't quite like the idea that we must encode here what to include in
>>>> a dump and what not ...
>>>>
>>>> For the vmcore we construct it at runtime in crash_save_vmcoreinfo_init(),
>>>> where we e.g., have
>>>>
>>>> VMCOREINFO_STRUCT_SIZE(pglist_data);
>>>>
>>>> Could we similar have some place where we construct what to dump similarly,
>>>> just not using the current values, but the memory ranges?
>>>
>>> All those symbols are part of kallsyms, right? Can we just use kallsyms
>>> infrastructure and a list of symbols to get what we need from there?
>>>
>>> In other words the list of symbols to be completely external to the code
>>> that is defining them?
>>
>> Some static symbols are indeed part of kallsyms. But some symbols are
>> not exported, for example patch 20/29, where printk related symbols are
>> not to be exported. Another example is with static variables, like in
>> patch 17/29 , not exported as symbols, but required for the dump.
>> Dynamic memory regions are not have to also be considered, have a look
>> for example at patch 23/29 , where dynamically allocated memory needs to
>> be registered.
>>
>> Do you think that I should move all kallsyms related symbols annotation
>> into a separate place and keep it for the static/dynamic regions in place ?
>
> If you want to use a symbol from kmemdump, then make that symbol
> available to kmemdump.
That's what I am doing, registering symbols with kmemdump.
Maybe I do not understand what you mean. Do you have any suggestion for
the static variables case (symbols not exported)?
>
> IOW, if we were to rip out kmemdump tomorrow, we wouldn't have to touch
> any non-kmemdump-specific files.
>
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-04 12:29 ` Eugen Hristev
@ 2025-08-04 12:49 ` David Hildenbrand
2025-08-04 13:03 ` Eugen Hristev
0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2025-08-04 12:49 UTC (permalink / raw)
To: Eugen Hristev, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus
On 04.08.25 14:29, Eugen Hristev wrote:
>
>
> On 8/4/25 15:18, David Hildenbrand wrote:
>> On 04.08.25 13:06, Eugen Hristev wrote:
>>>
>>>
>>> On 8/4/25 13:54, Michal Hocko wrote:
>>>> On Wed 30-07-25 16:04:28, David Hildenbrand wrote:
>>>>> On 30.07.25 15:57, Eugen Hristev wrote:
>>>> [...]
>>>>>> Yes, registering after is also an option. Initially this is how I
>>>>>> designed the kmemdump API, I also had in mind to add a flag, but, after
>>>>>> discussing with Thomas Gleixner, he came up with the macro wrapper idea
>>>>>> here:
>>>>>> https://lore.kernel.org/lkml/87ikkzpcup.ffs@tglx/
>>>>>> Do you think we can continue that discussion , or maybe start it here ?
>>>>>
>>>>> Yeah, I don't like that, but I can see how we ended up here.
>>>>>
>>>>> I also don't quite like the idea that we must encode here what to include in
>>>>> a dump and what not ...
>>>>>
>>>>> For the vmcore we construct it at runtime in crash_save_vmcoreinfo_init(),
>>>>> where we e.g., have
>>>>>
>>>>> VMCOREINFO_STRUCT_SIZE(pglist_data);
>>>>>
>>>>> Could we similar have some place where we construct what to dump similarly,
>>>>> just not using the current values, but the memory ranges?
>>>>
>>>> All those symbols are part of kallsyms, right? Can we just use kallsyms
>>>> infrastructure and a list of symbols to get what we need from there?
>>>>
>>>> In other words the list of symbols to be completely external to the code
>>>> that is defining them?
>>>
>>> Some static symbols are indeed part of kallsyms. But some symbols are
>>> not exported, for example patch 20/29, where printk related symbols are
>>> not to be exported. Another example is with static variables, like in
>>> patch 17/29 , not exported as symbols, but required for the dump.
>>> Dynamic memory regions are not have to also be considered, have a look
>>> for example at patch 23/29 , where dynamically allocated memory needs to
>>> be registered.
>>>
>>> Do you think that I should move all kallsyms related symbols annotation
>>> into a separate place and keep it for the static/dynamic regions in place ?
>>
>> If you want to use a symbol from kmemdump, then make that symbol
>> available to kmemdump.
>
> That's what I am doing, registering symbols with kmemdump.
> Maybe I do not understand what you mean, do you have any suggestion for
> the static variables case (symbols not exported) ?
Let's use patch #20 as an example:
What I am thinking is that you would not include "linux/kmemdump.h" and
not leak all of that KMEMDUMP_ stuff into all these files/subsystems that
couldn't care less about kmemdump.
Instead of doing
static struct printk_ringbuffer printk_rb_dynamic;
You'd do
struct printk_ringbuffer printk_rb_dynamic;
and have it in some header file, from where kmemdump could look up the
address.
So you move the logic of what goes into a dump from the subsystems to
the kmemdump core.
--
Cheers,
David / dhildenb
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-04 12:49 ` David Hildenbrand
@ 2025-08-04 13:03 ` Eugen Hristev
2025-08-04 13:26 ` David Hildenbrand
0 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-08-04 13:03 UTC (permalink / raw)
To: David Hildenbrand, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
On 8/4/25 15:49, David Hildenbrand wrote:
> On 04.08.25 14:29, Eugen Hristev wrote:
>>
>>
>> On 8/4/25 15:18, David Hildenbrand wrote:
>>> On 04.08.25 13:06, Eugen Hristev wrote:
>>>>
>>>>
>>>> On 8/4/25 13:54, Michal Hocko wrote:
>>>>> On Wed 30-07-25 16:04:28, David Hildenbrand wrote:
>>>>>> On 30.07.25 15:57, Eugen Hristev wrote:
>>>>> [...]
>>>>>>> Yes, registering after is also an option. Initially this is how I
>>>>>>> designed the kmemdump API, I also had in mind to add a flag, but, after
>>>>>>> discussing with Thomas Gleixner, he came up with the macro wrapper idea
>>>>>>> here:
>>>>>>> https://lore.kernel.org/lkml/87ikkzpcup.ffs@tglx/
>>>>>>> Do you think we can continue that discussion , or maybe start it here ?
>>>>>>
>>>>>> Yeah, I don't like that, but I can see how we ended up here.
>>>>>>
>>>>>> I also don't quite like the idea that we must encode here what to include in
>>>>>> a dump and what not ...
>>>>>>
>>>>>> For the vmcore we construct it at runtime in crash_save_vmcoreinfo_init(),
>>>>>> where we e.g., have
>>>>>>
>>>>>> VMCOREINFO_STRUCT_SIZE(pglist_data);
>>>>>>
>>>>>> Could we similar have some place where we construct what to dump similarly,
>>>>>> just not using the current values, but the memory ranges?
>>>>>
>>>>> All those symbols are part of kallsyms, right? Can we just use kallsyms
>>>>> infrastructure and a list of symbols to get what we need from there?
>>>>>
>>>>> In other words the list of symbols to be completely external to the code
>>>>> that is defining them?
>>>>
>>>> Some static symbols are indeed part of kallsyms. But some symbols are
>>>> not exported, for example patch 20/29, where printk related symbols are
>>>> not to be exported. Another example is with static variables, like in
>>>> patch 17/29 , not exported as symbols, but required for the dump.
>>>> Dynamic memory regions are not have to also be considered, have a look
>>>> for example at patch 23/29 , where dynamically allocated memory needs to
>>>> be registered.
>>>>
>>>> Do you think that I should move all kallsyms related symbols annotation
>>>> into a separate place and keep it for the static/dynamic regions in place ?
>>>
>>> If you want to use a symbol from kmemdump, then make that symbol
>>> available to kmemdump.
>>
>> That's what I am doing, registering symbols with kmemdump.
>> Maybe I do not understand what you mean, do you have any suggestion for
>> the static variables case (symbols not exported) ?
>
> Let's use patch #20 as example:
>
> What I am thinking is that you would not include "linux/kmemdump.h" and
> not leak all of that KMEMDUMP_ stuff in all these files/subsystems that
> couldn't less about kmemdump.
>
> Instead of doing
>
> static struct printk_ringbuffer printk_rb_dynamic;
>
> You'd do
>
> struct printk_ringbuffer printk_rb_dynamic;
>
> and have it in some header file, from where kmemdump could lookup the
> address.
>
> So you move the logic of what goes into a dump from the subsystems to
> the kmemdump core.
>
That works if the people maintaining these subsystems agree with it.
Attempts to export symbols from printk, for example, have been nacked:
https://lore.kernel.org/all/20250218-175733-neomutt-senozhatsky@chromium.org/
So I am unsure whether just removing the static and adding them into
header files would be more acceptable.
I added Christoph Hellwig and Sergey Senozhatsky in CC; maybe they could
tell us directly whether they like or dislike this approach, as kmemdump
would be built-in and would not require exports.
One other thing to mention is that the printk code dynamically
allocates memory that would need to be registered. There is no mechanism
for kmemdump to know when this allocation has completed (or even whether it
happened at all, because it happens on demand under certain conditions).
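To illustrate the concern with a userspace sketch (all names here are
hypothetical, and malloc() stands in for the on-demand allocation): a
central table filled in at init time only sees what exists at that
moment, so a buffer allocated later is missed unless something tells the
core about it.

```c
#include <stddef.h>
#include <stdlib.h>

struct dump_range {
	const void *start;
	size_t size;
};

/* Stand-in for a subsystem buffer that is only allocated on demand. */
static char *late_buf;
static size_t late_buf_size;

/* What a central, init-time snapshot would record for that buffer. */
static struct dump_range snapshot_late_buf(void)
{
	struct dump_range r = { late_buf, late_buf_size };

	return r;
}

/* The subsystem allocates the buffer later, under its own conditions.
 * Returns 0 on success, -1 if the allocation fails. */
static int late_buf_setup(size_t size)
{
	char *p = malloc(size);

	if (!p)
		return -1;
	late_buf = p;
	late_buf_size = size;
	return 0;
}
```

A snapshot taken before late_buf_setup() runs describes an empty range
and stays stale afterwards; only a second snapshot, or a notification
from the subsystem, would pick up the new buffer.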
Thanks!
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-04 13:03 ` Eugen Hristev
@ 2025-08-04 13:26 ` David Hildenbrand
2025-08-25 12:55 ` Eugen Hristev
0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2025-08-04 13:26 UTC (permalink / raw)
To: Eugen Hristev, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
On 04.08.25 15:03, Eugen Hristev wrote:
>
>
> On 8/4/25 15:49, David Hildenbrand wrote:
>> On 04.08.25 14:29, Eugen Hristev wrote:
>>>
>>>
>>> On 8/4/25 15:18, David Hildenbrand wrote:
>>>> On 04.08.25 13:06, Eugen Hristev wrote:
>>>>>
>>>>>
>>>>> On 8/4/25 13:54, Michal Hocko wrote:
>>>>>> On Wed 30-07-25 16:04:28, David Hildenbrand wrote:
>>>>>>> On 30.07.25 15:57, Eugen Hristev wrote:
>>>>>> [...]
>>>>>>>> Yes, registering after is also an option. Initially this is how I
>>>>>>>> designed the kmemdump API, I also had in mind to add a flag, but, after
>>>>>>>> discussing with Thomas Gleixner, he came up with the macro wrapper idea
>>>>>>>> here:
>>>>>>>> https://lore.kernel.org/lkml/87ikkzpcup.ffs@tglx/
>>>>>>>> Do you think we can continue that discussion , or maybe start it here ?
>>>>>>>
>>>>>>> Yeah, I don't like that, but I can see how we ended up here.
>>>>>>>
>>>>>>> I also don't quite like the idea that we must encode here what to include in
>>>>>>> a dump and what not ...
>>>>>>>
>>>>>>> For the vmcore we construct it at runtime in crash_save_vmcoreinfo_init(),
>>>>>>> where we e.g., have
>>>>>>>
>>>>>>> VMCOREINFO_STRUCT_SIZE(pglist_data);
>>>>>>>
>>>>>>> Could we similarly have some place where we construct what to dump,
>>>>>>> just not using the current values, but the memory ranges?
>>>>>>
>>>>>> All those symbols are part of kallsyms, right? Can we just use kallsyms
>>>>>> infrastructure and a list of symbols to get what we need from there?
>>>>>>
>>>>>> In other words the list of symbols to be completely external to the code
>>>>>> that is defining them?
>>>>>
>>>>> Some static symbols are indeed part of kallsyms. But some symbols are
>>>>> not exported, for example patch 20/29, where printk related symbols are
>>>>> not to be exported. Another example is with static variables, like in
>>>>> patch 17/29 , not exported as symbols, but required for the dump.
>>>>> Dynamic memory regions also have to be considered, have a look
>>>>> for example at patch 23/29 , where dynamically allocated memory needs to
>>>>> be registered.
>>>>>
>>>>> Do you think that I should move all kallsyms related symbols annotation
>>>>> into a separate place and keep it for the static/dynamic regions in place ?
>>>>
>>>> If you want to use a symbol from kmemdump, then make that symbol
>>>> available to kmemdump.
>>>
>>> That's what I am doing, registering symbols with kmemdump.
>>> Maybe I do not understand what you mean, do you have any suggestion for
>>> the static variables case (symbols not exported) ?
>>
>> Let's use patch #20 as example:
>>
>> What I am thinking is that you would not include "linux/kmemdump.h" and
>> not leak all of that KMEMDUMP_ stuff in all these files/subsystems that
>> couldn't care less about kmemdump.
>>
>> Instead of doing
>>
>> static struct printk_ringbuffer printk_rb_dynamic;
>>
>> You'd do
>>
>> struct printk_ringbuffer printk_rb_dynamic;
>>
>> and have it in some header file, from where kmemdump could lookup the
>> address.
>>
>> So you move the logic of what goes into a dump from the subsystems to
>> the kmemdump core.
>>
>
> That works if the people maintaining these systems agree with it.
> Attempts to export symbols from printk e.g. have been nacked :
>
> https://lore.kernel.org/all/20250218-175733-neomutt-senozhatsky@chromium.org/
Do you really need the EXPORT_SYMBOL?

Can't you just not export the symbols, and build the relevant kmemdump
part into the core rather than as a module?

IIRC, kernel/vmcore_info.c is never built as a module, as it also
accesses non-exported symbols.
>
> So I am unsure whether just removing the static and adding them into
> header files would be more acceptable.
>
> Added in CC Cristoph Hellwig and Sergey Senozhatsky maybe they could
> tell us directly whether they like or dislike this approach, as kmemdump
> would be builtin and would not require exports.
>
> One other thing to mention is the fact that the printk code dynamically
> allocates memory that would need to be registered. There is no mechanism
> for kmemdump to know when this process has been completed (or even if it
> was at all, because it happens on demand in certain conditions).
If we are talking about memblock allocations, they sure are finished at
the time ... the buddy is up.

So it's just a matter of placing yourself late in the init stage where
the buddy is already up and running.

I assume dumping any dynamically allocated stuff through the buddy is
out of the picture for now.
--
Cheers,
David / dhildenb
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-04 13:26 ` David Hildenbrand
@ 2025-08-25 12:55 ` Eugen Hristev
2025-08-25 13:20 ` David Hildenbrand
0 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-08-25 12:55 UTC (permalink / raw)
To: David Hildenbrand, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
On 8/4/25 16:26, David Hildenbrand wrote:
> On 04.08.25 15:03, Eugen Hristev wrote:
>>
>>
>> On 8/4/25 15:49, David Hildenbrand wrote:
>>> On 04.08.25 14:29, Eugen Hristev wrote:
>>>>
>>>>
>>>> On 8/4/25 15:18, David Hildenbrand wrote:
>>>>> On 04.08.25 13:06, Eugen Hristev wrote:
>>>>>>
>>>>>>
>>>>>> On 8/4/25 13:54, Michal Hocko wrote:
>>>>>>> On Wed 30-07-25 16:04:28, David Hildenbrand wrote:
>>>>>>>> On 30.07.25 15:57, Eugen Hristev wrote:
>>>>>>> [...]
>>>>>>>>> Yes, registering after is also an option. Initially this is how I
>>>>>>>>> designed the kmemdump API, I also had in mind to add a flag, but, after
>>>>>>>>> discussing with Thomas Gleixner, he came up with the macro wrapper idea
>>>>>>>>> here:
>>>>>>>>> https://lore.kernel.org/lkml/87ikkzpcup.ffs@tglx/
>>>>>>>>> Do you think we can continue that discussion , or maybe start it here ?
>>>>>>>>
>>>>>>>> Yeah, I don't like that, but I can see how we ended up here.
>>>>>>>>
>>>>>>>> I also don't quite like the idea that we must encode here what to include in
>>>>>>>> a dump and what not ...
>>>>>>>>
>>>>>>>> For the vmcore we construct it at runtime in crash_save_vmcoreinfo_init(),
>>>>>>>> where we e.g., have
>>>>>>>>
>>>>>>>> VMCOREINFO_STRUCT_SIZE(pglist_data);
>>>>>>>>
>>>>>>>> Could we similarly have some place where we construct what to dump,
>>>>>>>> just not using the current values, but the memory ranges?
>>>>>>>
>>>>>>> All those symbols are part of kallsyms, right? Can we just use kallsyms
>>>>>>> infrastructure and a list of symbols to get what we need from there?
>>>>>>>
>>>>>>> In other words the list of symbols to be completely external to the code
>>>>>>> that is defining them?
>>>>>>
>>>>>> Some static symbols are indeed part of kallsyms. But some symbols are
>>>>>> not exported, for example patch 20/29, where printk related symbols are
>>>>>> not to be exported. Another example is with static variables, like in
>>>>>> patch 17/29 , not exported as symbols, but required for the dump.
>>>>>> Dynamic memory regions also have to be considered, have a look
>>>>>> for example at patch 23/29 , where dynamically allocated memory needs to
>>>>>> be registered.
>>>>>>
>>>>>> Do you think that I should move all kallsyms related symbols annotation
>>>>>> into a separate place and keep it for the static/dynamic regions in place ?
>>>>>
>>>>> If you want to use a symbol from kmemdump, then make that symbol
>>>>> available to kmemdump.
>>>>
>>>> That's what I am doing, registering symbols with kmemdump.
>>>> Maybe I do not understand what you mean, do you have any suggestion for
>>>> the static variables case (symbols not exported) ?
>>>
>>> Let's use patch #20 as example:
>>>
>>> What I am thinking is that you would not include "linux/kmemdump.h" and
>>> not leak all of that KMEMDUMP_ stuff in all these files/subsystems that
>>> couldn't care less about kmemdump.
>>>
>>> Instead of doing
>>>
>>> static struct printk_ringbuffer printk_rb_dynamic;
>>>
>>> You'd do
>>>
>>> struct printk_ringbuffer printk_rb_dynamic;
>>>
>>> and have it in some header file, from where kmemdump could lookup the
>>> address.
>>>
>>> So you move the logic of what goes into a dump from the subsystems to
>>> the kmemdump core.
>>>
>>
>> That works if the people maintaining these systems agree with it.
>> Attempts to export symbols from printk e.g. have been nacked :
>>
>> https://lore.kernel.org/all/20250218-175733-neomutt-senozhatsky@chromium.org/
>
> Do you really need the EXPORT_SYMBOL?
>
> Can't you just not export symbols, building the relevant kmemdump part
> into the core not as a module.
>
> IIRC, kernel/vmcore_info.c is never built as a module, as it also
> accesses non-exported symbols.
Hello David,

I am looking into this again, and there are some things which, in my
opinion, would be difficult to achieve.
For example, I looked into my patch #11, which adds the `runqueues` into
kmemdump.

`runqueues` is a variable of type `struct rq`, which is defined in
kernel/sched/sched.h, a header that is not supposed to be included
outside of sched.
Moving the whole struct definition out of sched.h into another public
header would be rather painful, and I don't think it's a really good
option (the struct would be needed to compute the sizeof inside
vmcoreinfo). It would also imply moving all the nested struct
definitions out as well. I doubt this is something that we want for
the sched subsystem. As far as I understand, the subsystem is designed
to keep these internal structs opaque to outside code.

From my perspective it's much simpler and cleaner to just add the
kmemdump annotation macro inside sched/core.c, as is done in my
patch. This macro translates to a no-op if kmemdump is not selected.

How do you see this being done another way?
>
>>
>> So I am unsure whether just removing the static and adding them into
>> header files would be more acceptable.
>>
>> Added in CC Cristoph Hellwig and Sergey Senozhatsky maybe they could
>> tell us directly whether they like or dislike this approach, as kmemdump
>> would be builtin and would not require exports.
>>
>> One other thing to mention is the fact that the printk code dynamically
>> allocates memory that would need to be registered. There is no mechanism
>> for kmemdump to know when this process has been completed (or even if it
>> was at all, because it happens on demand in certain conditions).
>
> If we are talking about memblock allocations, they sure are finished at
> the time ... the buddy is up.
>
> So it's just a matter of placing yourself late in the init stage where
> the buddy is already up and running.
>
> I assume dumping any dynamically allocated stuff through the buddy is
> out of the picture for now.
>
The dumping mechanism needs to work for dynamically allocated stuff, and
right now it works for e.g. printk, if the buffer is dynamically
allocated later on in the boot process.

To have this working outside of printk, it would be required to walk
through all the printk structs/allocations and select the required info.
Is this something that we want to do outside of printk? E.g. for the
printk panic-dump case, the whole dumping is done by registering a
dumper that does the job inside printk. There is no mechanism walking
through printk data in another subsystem (in my example, pstore).

So for me it is logical to register the data inside printk.
Does this make sense?
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-25 12:55 ` Eugen Hristev
@ 2025-08-25 13:20 ` David Hildenbrand
2025-08-25 13:36 ` Eugen Hristev
0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2025-08-25 13:20 UTC (permalink / raw)
To: Eugen Hristev, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
>>
>> IIRC, kernel/vmcore_info.c is never built as a module, as it also
>> accesses non-exported symbols.
>
> Hello David,
>
> I am looking again into this, and there are some things which in my
> opinion would be difficult to achieve.
> For example I looked into my patch #11 , which adds the `runqueues` into
> kmemdump.
>
> The runqueues is a variable of `struct rq` which is defined in
> kernel/sched/sched.h , which is not supposed to be included outside of
> sched.
> Now moving all the struct definition outside of sched.h into another
> public header would be rather painful and I don't think it's a really
> good option (The struct would be needed to compute the sizeof inside
> vmcoreinfo). Secondly, it would also imply moving all the nested struct
> definitions outside as well. I doubt this is something that we want for
> the sched subsys. How the subsys is designed, out of my understanding,
> is to keep these internal structs opaque outside of it.
All the kmemdump module needs is a start and a length, correct? So the
only tricky part is getting the length.
One could just add a const variable that holds this information, or even
better, a simple helper function to calculate that.
Maybe someone else reading along has a better idea.
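A minimal user-space sketch of the helper idea above, assuming a subsystem that keeps its struct private and only hands out an address and a size. Everything here is hypothetical and only models the shape of the suggestion: `struct rq_model`, `rq_model_get_region()`, `struct dump_region` and `compose_regions()` are illustration names, not part of the kernel or of the kmemdump series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "Subsystem" side: the struct layout stays private to this part. */
struct rq_model {                /* hypothetical stand-in for an opaque struct */
	uint64_t nr_running;
	uint64_t clock;
	char pad[48];
};

static struct rq_model runqueue_model;   /* stays static, nothing exported */

/* Hypothetical helper: exposes only a start address and a length. */
void rq_model_get_region(const void **start, size_t *len)
{
	*start = &runqueue_model;
	*len = sizeof(runqueue_model);
}

/* "Composer" side: works on regions, never on the struct definition. */
struct dump_region {
	const void *start;
	size_t len;
};

typedef void (*region_provider_t)(const void **start, size_t *len);

/* Single table that lists everything that goes into the dump. */
static const region_provider_t providers[] = {
	rq_model_get_region,
};

/* Fill out[] with at most max regions and return how many were written. */
size_t compose_regions(struct dump_region *out, size_t max)
{
	size_t n = sizeof(providers) / sizeof(providers[0]);
	size_t i;

	if (n > max)
		n = max;
	for (i = 0; i < n; i++)
		providers[i](&out[i].start, &out[i].len);
	return n;
}
```

In this model the decision about what is dumped lives in the `providers` table, while the subsystem contributes one small accessor and keeps its types private.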
Interestingly, runqueues is a percpu variable, which makes me wonder if
what you had would work as intended (maybe it does, not sure).
>
> From my perspective it's much simpler and cleaner to just add the
> kmemdump annotation macro inside the sched/core.c as it's done in my
> patch. This macro translates to a noop if kmemdump is not selected.
I really don't like how we are spreading kmemdump all over the kernel,
and adding complexity with __section when really, all we need is a place
to obtain a start and a length.
So we should explore if there is anything easier possible.
>>
>>>
>>> So I am unsure whether just removing the static and adding them into
>>> header files would be more acceptable.
>>>
>>> Added in CC Cristoph Hellwig and Sergey Senozhatsky maybe they could
>>> tell us directly whether they like or dislike this approach, as kmemdump
>>> would be builtin and would not require exports.
>>>
>>> One other thing to mention is the fact that the printk code dynamically
>>> allocates memory that would need to be registered. There is no mechanism
>>> for kmemdump to know when this process has been completed (or even if it
>>> was at all, because it happens on demand in certain conditions).
>>
>> If we are talking about memblock allocations, they sure are finished at
>> the time ... the buddy is up.
>>
>> So it's just a matter of placing yourself late in the init stage where
>> the buddy is already up and running.
>>
>> I assume dumping any dynamically allocated stuff through the buddy is
>> out of the picture for now.
>>
>
> The dumping mechanism needs to work for dynamically allocated stuff, and
> right now, it works for e.g. printk, if the buffer is dynamically
> allocated later on in the boot process.
You are talking about the memblock_alloc() result, correct? Like

new_log_buf = memblock_alloc(new_log_buf_len, LOG_ALIGN);

The current version is always stored in

static char *log_buf = __log_buf;

Once early boot is done and memblock gets torn down, you can just use
log_buf and be sure that it will not change anymore.
>
> To have this working outside of printk, it would be required to walk
> through all the printk structs/allocations and select the required info.
> Is this something that we want to do outside of printk ?
I don't follow, please elaborate.
How is e.g., log_buf_len_get() + log_buf_addr_get() not sufficient,
given that you run your initialization after setup_log_buf() ?
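The ordering argument above can be sketched in user space as follows. This is a model under stated assumptions, not kernel code: the `model_*` functions are hypothetical stand-ins for setup_log_buf(), log_buf_addr_get() and log_buf_len_get(), `calloc()` stands in for memblock_alloc(), and `struct dump_region` is an illustration type. It covers only the text buffer, not any other ringbuffer structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define STATIC_LOG_BUF_LEN 128

/* Stand-ins for the static buffer and the pointer that tracks the current one. */
static char static_log_buf[STATIC_LOG_BUF_LEN];
static char *log_buf = static_log_buf;
static uint32_t log_buf_len = STATIC_LOG_BUF_LEN;

/* Early boot step: may swap in a larger buffer. Returns 0 on success. */
int model_setup_log_buf(uint32_t new_len)
{
	char *new_buf;

	if (new_len <= log_buf_len)
		return 0;                /* static buffer is large enough */
	new_buf = calloc(1, new_len);    /* models the early allocation */
	if (!new_buf)
		return -1;
	log_buf = new_buf;
	log_buf_len = new_len;
	return 0;
}

char *model_log_buf_addr_get(void)
{
	return log_buf;
}

uint32_t model_log_buf_len_get(void)
{
	return log_buf_len;
}

struct dump_region {
	const void *start;
	size_t len;
};

/* Late init step: reads the accessors once the buffer can no longer move. */
struct dump_region model_register_log_buf(void)
{
	struct dump_region r = {
		.start = model_log_buf_addr_get(),
		.len = model_log_buf_len_get(),
	};
	return r;
}
```

If the late step ran before the setup step, it would record the static buffer instead of the reallocated one, which is why the order of initialization matters in this model.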
--
Cheers
David / dhildenb
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-25 13:20 ` David Hildenbrand
@ 2025-08-25 13:36 ` Eugen Hristev
2025-08-25 13:58 ` David Hildenbrand
0 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-08-25 13:36 UTC (permalink / raw)
To: David Hildenbrand, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
On 8/25/25 16:20, David Hildenbrand wrote:
>
>>>
>>> IIRC, kernel/vmcore_info.c is never built as a module, as it also
>>> accesses non-exported symbols.
>>
>> Hello David,
>>
>> I am looking again into this, and there are some things which in my
>> opinion would be difficult to achieve.
>> For example I looked into my patch #11 , which adds the `runqueues` into
>> kmemdump.
>>
>> The runqueues is a variable of `struct rq` which is defined in
>> kernel/sched/sched.h , which is not supposed to be included outside of
>> sched.
>> Now moving all the struct definition outside of sched.h into another
>> public header would be rather painful and I don't think it's a really
>> good option (The struct would be needed to compute the sizeof inside
>> vmcoreinfo). Secondly, it would also imply moving all the nested struct
>> definitions outside as well. I doubt this is something that we want for
>> the sched subsys. How the subsys is designed, out of my understanding,
>> is to keep these internal structs opaque outside of it.
>
> All the kmemdump module needs is a start and a length, correct? So the
> only tricky part is getting the length.
I also have the in-kernel user's case in mind. How would a kernel programmer
add some kernel structs/info/buffers to kmemdump such that the
dump would contain their data? Having "KMEMDUMP_VAR(...)" looks simple
enough.
Otherwise the programmer may have to write helpers to compute lengths
etc., and stitch them into the kmemdump core.
I am not saying it's impossible, but perhaps just tiresome.
>
> One could just add a const variable that holds this information, or even
> better, a simple helper function to calculate that.
>
> Maybe someone else reading along has a better idea.
This could work, but it again requires adding some code to the
specific subsystem, e.g. struct_rq_get_size().
I am open to ideas, and thank you very much for your thoughts.
>
> Interestingly, runqueues is a percpu variable, which makes me wonder if
> what you had would work as intended (maybe it does, not sure).
I would not really need to dump the runqueues, but the crash tool, which
I am using for testing, requires them. Without the runqueues it will not
progress further in loading the kernel dump.
So I am not really sure what it does with the runqueues, but it works.
Perhaps using crash/gdb more, to actually do something with this data,
would give more insight into its utility.
For me, it is a prerequisite to run crash, and then to be able to
extract the log buffer from the dump.
>
>>
>> From my perspective it's much simpler and cleaner to just add the
>> kmemdump annotation macro inside the sched/core.c as it's done in my
>> patch. This macro translates to a noop if kmemdump is not selected.
>
> I really don't like how we are spreading kmemdump all over the kernel,
> and adding complexity with __section when really, all we need is a place
> to obtain a start and a length.
>
I understand. The section idea was suggested by Thomas. Initially I was
skeptical, but I like how it turned out.
> So we should explore if there is anything easier possible.
>
>>>
>>>>
>>>> So I am unsure whether just removing the static and adding them into
>>>> header files would be more acceptable.
>>>>
>>>> Added in CC Cristoph Hellwig and Sergey Senozhatsky maybe they could
>>>> tell us directly whether they like or dislike this approach, as kmemdump
>>>> would be builtin and would not require exports.
>>>>
>>>> One other thing to mention is the fact that the printk code dynamically
>>>> allocates memory that would need to be registered. There is no mechanism
>>>> for kmemdump to know when this process has been completed (or even if it
>>>> was at all, because it happens on demand in certain conditions).
>>>
>>> If we are talking about memblock allocations, they sure are finished at
>>> the time ... the buddy is up.
>>>
>>> So it's just a matter of placing yourself late in the init stage where
>>> the buddy is already up and running.
>>>
>>> I assume dumping any dynamically allocated stuff through the buddy is
>>> out of the picture for now.
>>>
>>
>> The dumping mechanism needs to work for dynamically allocated stuff, and
>> right now, it works for e.g. printk, if the buffer is dynamically
>> allocated later on in the boot process.
>
> You are talking about the memblock_alloc() result, correct? Like
>
> new_log_buf = memblock_alloc(new_log_buf_len, LOG_ALIGN);
>
> The current version is always stored in
>
> static char *log_buf = __log_buf;
>
>
> Once early boot is done and memblock gets torn down, you can just use
> log_buf and be sure that it will not change anymore.
>
>>
>> To have this working outside of printk, it would be required to walk
>> through all the printk structs/allocations and select the required info.
>> Is this something that we want to do outside of printk ?
>
> I don't follow, please elaborate.
>
> How is e.g., log_buf_len_get() + log_buf_addr_get() not sufficient,
> given that you run your initialization after setup_log_buf() ?
>
>
My initial thought was the same. However, I got some feedback from Petr
Mladek here:
https://lore.kernel.org/lkml/aBm5QH2p6p9Wxe_M@localhost.localdomain/
where he explained how to register the structs correctly.
It may be that setup_log_buf() is called again at a later time.
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-25 13:36 ` Eugen Hristev
@ 2025-08-25 13:58 ` David Hildenbrand
2025-08-27 11:59 ` Eugen Hristev
0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2025-08-25 13:58 UTC (permalink / raw)
To: Eugen Hristev, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
On 25.08.25 15:36, Eugen Hristev wrote:
>
>
> On 8/25/25 16:20, David Hildenbrand wrote:
>>
>>>>
>>>> IIRC, kernel/vmcore_info.c is never built as a module, as it also
>>>> accesses non-exported symbols.
>>>
>>> Hello David,
>>>
>>> I am looking again into this, and there are some things which in my
>>> opinion would be difficult to achieve.
>>> For example I looked into my patch #11 , which adds the `runqueues` into
>>> kmemdump.
>>>
>>> The runqueues is a variable of `struct rq` which is defined in
>>> kernel/sched/sched.h , which is not supposed to be included outside of
>>> sched.
>>> Now moving all the struct definition outside of sched.h into another
>>> public header would be rather painful and I don't think it's a really
>>> good option (The struct would be needed to compute the sizeof inside
>>> vmcoreinfo). Secondly, it would also imply moving all the nested struct
>>> definitions outside as well. I doubt this is something that we want for
>>> the sched subsys. How the subsys is designed, out of my understanding,
>>> is to keep these internal structs opaque outside of it.
>>
>> All the kmemdump module needs is a start and a length, correct? So the
>> only tricky part is getting the length.
>
> I also have in mind the kernel user case. How would a kernel programmer
> want to add some kernel structs/info/buffers into kmemdump such that the
> dump would contain their data ? Having "KMEMDUMP_VAR(...)" looks simple
> enough.
The other way around: why should anybody have a say in adding their
data to kmemdump? Why do we have that all over the kernel?

Is your mechanism really so special?

A single composer should take care of that, and it's really just start +
len of physical memory areas.
> Otherwise maybe the programmer has to write helpers to compute lengths
> etc, and stitch them into kmemdump core.
> I am not saying it's impossible, but just tiresome perhaps.
In your patch set, how many of these instances did you encounter where
that was a problem?
>>
>> One could just add a const variable that holds this information, or even
>> better, a simple helper function to calculate that.
>>
>> Maybe someone else reading along has a better idea.
>
> This could work, but it requires again adding some code into the
> specific subsystem. E.g. struct_rq_get_size()
> I am open to ideas , and thank you very much for your thoughts.
>
>>
>> Interestingly, runqueues is a percpu variable, which makes me wonder if
>> what you had would work as intended (maybe it does, not sure).
>
> I would not really need to dump the runqueues. But the crash tool which
> I am using for testing, requires it. Without the runqueues it will not
> progress further to load the kernel dump.
> So I am not really sure what it does with the runqueues, but it works.
> Perhaps using crash/gdb more, to actually do something with this data,
> would give more insight about its utility.
> For me, it is a prerequisite to run crash, and then to be able to
> extract the log buffer from the dump.
I have the faint recollection that percpu vars might not be stored in a
single contiguous physical memory area, but maybe my memory is just
wrong, that's why I was raising it.
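In the kernel, a per-CPU variable resolves to a different address on each CPU, so a single start/length pair taken from one address would describe one copy at most. The user-space sketch below models that concern by emitting one region per CPU. It is an assumption-laden illustration: `NR_CPUS_MODEL`, `struct rq_model`, `model_percpu_init()` and `model_percpu_regions()` are hypothetical, separate `calloc()` allocations stand in for per-CPU areas, and whether the real areas are physically contiguous is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS_MODEL 4

struct rq_model {                /* hypothetical stand-in for a per-CPU struct */
	uint64_t nr_running;
	uint64_t clock;
};

struct dump_region {
	const void *start;
	size_t len;
};

/* One separately allocated copy per CPU, modelling per-CPU areas. */
static struct rq_model *percpu_copy[NR_CPUS_MODEL];

/* Allocate every per-CPU copy. Returns 0 on success, -1 on failure. */
int model_percpu_init(void)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		percpu_copy[cpu] = calloc(1, sizeof(struct rq_model));
		if (!percpu_copy[cpu])
			return -1;
	}
	return 0;
}

/* Emit one region per CPU and return the number of regions written. */
size_t model_percpu_regions(struct dump_region *out, size_t max)
{
	size_t cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS_MODEL && n < max; cpu++) {
		out[n].start = percpu_copy[cpu];
		out[n].len = sizeof(struct rq_model);
		n++;
	}
	return n;
}
```

A composer built this way never assumes that the copies sit next to each other; it records each copy as its own region.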
>
>>
>>>
>>> From my perspective it's much simpler and cleaner to just add the
>>> kmemdump annotation macro inside the sched/core.c as it's done in my
>>> patch. This macro translates to a noop if kmemdump is not selected.
>>
>> I really don't like how we are spreading kmemdump all over the kernel,
>> and adding complexity with __section when really, all we need is a place
>> to obtain a start and a length.
>>
>
> I understand. The section idea was suggested by Thomas. Initially I was
> skeptic, but I like how it turned out.
Yeah, I don't like it. Taste differs ;)
I am in particular unhappy about custom memblock wrappers.
[...]
>>>
>>> To have this working outside of printk, it would be required to walk
>>> through all the printk structs/allocations and select the required info.
>>> Is this something that we want to do outside of printk ?
>>
>> I don't follow, please elaborate.
>>
>> How is e.g., log_buf_len_get() + log_buf_addr_get() not sufficient,
>> given that you run your initialization after setup_log_buf() ?
>>
>>
>
> My initial thought was the same. However I got some feedback from Petr
> Mladek here :
>
> https://lore.kernel.org/lkml/aBm5QH2p6p9Wxe_M@localhost.localdomain/
>
> Where he explained how to register the structs correctly.
> It can be that setup_log_buf is called again at a later time perhaps.
>
setup_log_buf() is an __init function, so there is only a certain time
frame where it can be called.

In particular, once the buddy is up, memblock allocations are impossible
and it would be deeply flawed to call this function again.

Let's not over-engineer this.

Petr is on CC, so hopefully he can share his thoughts.
--
Cheers
David / dhildenb
* Re: [RFC][PATCH v2 00/29] introduce kmemdump
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
` (28 preceding siblings ...)
2025-07-24 13:55 ` [RFC][PATCH v2 29/29] kmemdump: Add Kinfo backend driver Eugen Hristev
@ 2025-08-26 17:14 ` Mukesh Ojha
2025-08-27 6:42 ` Eugen Hristev
29 siblings, 1 reply; 61+ messages in thread
From: Mukesh Ojha @ 2025-08-26 17:14 UTC (permalink / raw)
To: Eugen Hristev
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus
On Thu, Jul 24, 2025 at 04:54:43PM +0300, Eugen Hristev wrote:
> kmemdump is a mechanism which allows the kernel to mark specific memory
> areas for dumping or specific backend usage.
> Once regions are marked, kmemdump keeps an internal list with the regions
> and registers them in the backend.
> Further, depending on the backend driver, these regions can be dumped using
> firmware or different hardware block.
> Regions being marked beforehand, when the system is up and running, there
> is no need nor dependency on a panic handler, or a working kernel that can
> dump the debug information.
> The kmemdump approach works when pstore, kdump, or another mechanism do not.
> Pstore relies on persistent storage, a dedicated RAM area or flash, which
> has the disadvantage of having the memory reserved all the time, or another
> specific non volatile memory. Some devices cannot keep the RAM contents on
> reboot so ramoops does not work. Some devices do not allow kexec to run
> another kernel to debug the crashed one.
> For such devices, that have another mechanism to help debugging, like
> firmware, kmemdump is a viable solution.
>
> kmemdump can create a core image, similar to /proc/vmcore, with only
> the registered regions included. This can be loaded into crash tool/gdb and
> analyzed.
> To have this working, specific information from the kernel is registered,
> and this is done at kmemdump init time, no need for the kmemdump user to
> do anything.
>
> This version of the kmemdump patch series includes two backend drivers:
> one is the Qualcomm Minidump backend, and the other one is the Debug Kinfo
> backend for Android devices, reworked from this source here:
> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/android/debug_kinfo.c
> written originally by Jone Chou <jonechou@google.com>
>
> Initial version of kmemdump and discussion is available here:
> https://lore.kernel.org/lkml/20250422113156.575971-1-eugen.hristev@linaro.org/
>
> Kmemdump has been presented and discussed at Linaro Connect 2025,
> including motivation, scope, usability and feasibility.
> Video of the recording is available here for anyone interested:
> https://www.youtube.com/watch?v=r4gII7MX9zQ&list=PLKZSArYQptsODycGiE0XZdVovzAwYNwtK&index=14
>
> The implementation is based on the initial Pstore/directly mapped zones
> published as an RFC here:
> https://lore.kernel.org/all/20250217101706.2104498-1-eugen.hristev@linaro.org/
>
> The back-end implementation for qcom_minidump is based on the minidump
> patch series and driver written by Mukesh Ojha, thanks:
> https://lore.kernel.org/lkml/20240131110837.14218-1-quic_mojha@quicinc.com/
>
> *** How to use kmemdump with minidump backend on Qualcomm platform guide ***
>
> Prerequisites:
> Crash tool with target=ARM64 and minor changes required for usual crash mode
> (minimal mode works without the patch)
> A patch can be applied from here https://p.calebs.dev/49a048
> This patch will be eventually sent in a reworked way to crash tool.
>
> Target kernel must be built with :
> CONFIG_DEBUG_INFO_REDUCED=n ; this will have vmlinux include all the debugging
> information needed for crash tool.
>
> Otherwise, the kernel requires these as well:
> CONFIG_KMEMDUMP, CONFIG_KMEMDUMP_COREIMAGE, and the backend
> CONFIG_KMEMDUMP_QCOM_MINIDUMP_BACKEND
>
> Kernel arguments:
> Kernel firmware must be set to mode 'mini' by kernel module parameter
> like this : qcom_scm.download_mode=mini
>
> After the kernel boots, and qcom_minidump module is loaded, everything is ready for
> a possible crash.
>
> Once the crash happens, the firmware will kick in and you will see on
> the console the message saying Sahara init, etc, that the firmware is
> waiting in download mode. (this is subject to firmware supporting this
> mode, I am using sa8775p-ride board)
>
> Example of log on the console:
> "
> [...]
> B - 1096414 - usb: init start
> B - 1100287 - usb: qusb_dci_platform , 0x19
> B - 1105686 - usb: usb3phy: PRIM success: lane_A , 0x60
> B - 1107455 - usb: usb2phy: PRIM success , 0x4
> B - 1112670 - usb: dci, chgr_type_det_err
> B - 1117154 - usb: ID:0x260, value: 0x4
> B - 1121942 - usb: ID:0x108, value: 0x1d90
> B - 1124992 - usb: timer_start , 0x4c4b40
> B - 1129140 - usb: vbus_det_pm_unavail
> B - 1133136 - usb: ID:0x252, value: 0x4
> B - 1148874 - usb: SUPER , 0x900e
> B - 1275510 - usb: SUPER , 0x900e
> B - 1388970 - usb: ID:0x20d, value: 0x0
> B - 1411113 - usb: ENUM success
> B - 1411113 - Sahara Init
> B - 1414285 - Sahara Open
> "
>
> Once the board is in download mode, you can use the qdl tool (I
> personally use edl , have not tried qdl yet), to get all the regions as
> separate files.
> The tool from the host computer will list the regions in the order they
> were downloaded.
>
> Once you have all the files simply use `cat` to put them all together,
> in the order of the indexes.
> For my kernel config and setup, here is my cat command : (you can use a script
> or something, I haven't done that so far):
>
> `cat memory/md_KELF1.BIN memory/md_Kvmcorein2.BIN memory/md_Kconfig3.BIN \
> memory/md_Kmemsect4.BIN memory/md_Ktotalram5.BIN memory/md_Kcpu_poss6.BIN \
> memory/md_Kcpu_pres7.BIN memory/md_Kcpu_onli8.BIN memory/md_Kcpu_acti9.BIN \
> memory/md_Kjiffies10.BIN memory/md_Klinux_ba11.BIN memory/md_Knr_threa12.BIN \
> memory/md_Knr_irqs13.BIN memory/md_Ktainted_14.BIN memory/md_Ktaint_fl15.BIN \
> memory/md_Kmem_sect16.BIN memory/md_Knode_dat17.BIN memory/md_Knode_sta18.BIN \
> memory/md_K__per_cp19.BIN memory/md_Knr_swapf20.BIN memory/md_Kinit_uts21.BIN \
> memory/md_Kprintk_r22.BIN memory/md_Kprintk_r23.BIN memory/md_Kprb24.BIN \
> memory/md_Kprb_desc25.BIN memory/md_Kprb_info26.BIN memory/md_Kprb_data27.BIN \
> memory/md_Krunqueue28.BIN memory/md_Khigh_mem29.BIN memory/md_Kinit_mm30.BIN \
> memory/md_Kinit_mm_31.BIN memory/md_Kunknown32.BIN memory/md_Kunknown33.BIN \
> memory/md_Kunknown34.BIN memory/md_Kunknown35.BIN memory/md_Kunknown36.BIN \
> memory/md_Kunknown37.BIN memory/md_Kunknown38.BIN memory/md_Kunknown39.BIN \
> memory/md_Kunknown40.BIN memory/md_Kunknown41.BIN memory/md_Kunknown42.BIN \
> memory/md_Kunknown43.BIN memory/md_Kunknown44.BIN memory/md_Kunknown45.BIN \
> memory/md_Kunknown46.BIN memory/md_Kunknown47.BIN memory/md_Kunknown50.BIN \
> memory/md_Kunknown51.BIN memory/md_Kunknown52.BIN \
> memory/md_Kunknown53.BIN > ~/minidump_image`
>
> Once you have the resulted file, use `crash` tool to load it, like this:
> `./crash --no_modules --no_panic --no_kmem_cache --zero_excluded vmlinux minidump_image`
>
> There is also a --minimal mode for ./crash that would work without any patch applied
> to crash tool, but you can't inspect symbols, etc.

Unfortunately, I could see the 'log' only with the --minimal option.

./crash --no_modules --no_panic --no_kmem_cache --zero_excluded vmlinux minidump_image

WARNING: kernel version inconsistency between vmlinux and dumpfile
crash: read error: kernel virtual address: ffffff8ed7f380d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7f510d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7f6a0d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7f830d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7f9c0d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7fb50d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7fce0d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffff8ed7fe70d8 type: "IRQ stack pointer"
crash: read error: kernel virtual address: ffffffc0817c5d80 type: "maple_init read mt_slots"
crash: read error: kernel virtual address: ffffffc0817c5d78 type: "maple_init read mt_pivots"
crash: read error: kernel virtual address: ffffff8efb89e2c0 type: "memory section root table"

It looks like you are using something more in your setup to make it work.

-Mukesh
>
> Once you load crash you will see something like this :
>
> KERNEL: /home/eugen/linux-minidump/vmlinux [TAINTED]
> DUMPFILE: /home/eugen/new
> CPUS: 8 [OFFLINE: 7]
> DATE: Thu Jan 1 02:00:00 EET 1970
> UPTIME: 00:00:29
> TASKS: 0
> NODENAME: qemuarm64
> RELEASE: 6.16.0-rc7-next-20250721-00029-gf8cffdbf0479-dirty
> VERSION: #5 SMP PREEMPT Tue Jul 22 18:44:57 EEST 2025
> MACHINE: aarch64 (unknown Mhz)
> MEMORY: 34.2 GB
> PANIC: ""
> crash> log
> [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd4b2]
> [ 0.000000] Linux version 6.16.0-rc7-next-20250721-00029-gf8cffdbf0479-dirty (eugen@eugen-station) (aarch64-none-linux-gnu-gcc (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 13.3.1 20240614, GNU ld (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 2.42.0.20240614) #5 SMP PREEMPT Tue Jul 22 18:44:57 EEST 2025
>
>
> *** Debug Kinfo backend driver ***
> I don't have any device to actually test this. So I have not.
> I hacked the driver to just use a kmalloc'ed area to save things instead
> of the shared memory, and dumped everything there and checked whether it looks
> sane. If someone is willing to try it out, thanks ! and let me know.
> I know there is no binding documentation for the compatible either.
>
> Thanks for everyone reviewing and bringing ideas into the discussion.
>
> Eugen
>
> Changelog since the v1 of the RFC:
> - Reworked the whole minidump implementation based on suggestions from Thomas Gleixner.
> This means new API, macros, new way to store the regions inside kmemdump
> (ditched the IDR, moved to static allocation, have a static default backend, etc)
> - Reworked qcom_minidump driver based on review from Bjorn Andersson
> - Reworked printk log buffer registration based on review from Petr Mladek
>
> I apologize if I missed any review comments. I know there is still lots of work
> on this series and hope I will improve it more and more.
> Patches are sent on top of next-20250721
>
> Eugen Hristev (29):
> kmemdump: introduce kmemdump
> Documentation: add kmemdump
> kmemdump: add coreimage ELF layer
> Documentation: kmemdump: add section for coreimage ELF
> kmemdump: introduce qcom-minidump backend driver
> soc: qcom: smem: add minidump device
> init/version: Annotate static information into Kmemdump
> cpu: Annotate static information into Kmemdump
> genirq/irqdesc: Annotate static information into Kmemdump
> panic: Annotate static information into Kmemdump
> sched/core: Annotate static information into Kmemdump
> timers: Annotate static information into Kmemdump
> kernel/fork: Annotate static information into Kmemdump
> mm/page_alloc: Annotate static information into Kmemdump
> mm/init-mm: Annotate static information into Kmemdump
> mm/show_mem: Annotate static information into Kmemdump
> mm/swapfile: Annotate static information into Kmemdump
> mm/percpu: Annotate static information into Kmemdump
> mm/mm_init: Annotate static information into Kmemdump
> printk: Register information into Kmemdump
> kernel/configs: Register dynamic information into Kmemdump
> mm/numa: Register information into Kmemdump
> mm/sparse: Register information into Kmemdump
> kernel/vmcore_info: Register dynamic information into Kmemdump
> kmemdump: Add additional symbols to the coreimage
> init/version: Annotate init uts name separately into Kmemdump
> kallsyms: Annotate static information into Kmemdump
> mm/init-mm: Annotate additional information into Kmemdump
> kmemdump: Add Kinfo backend driver
>
> Documentation/debug/index.rst | 17 ++
> Documentation/debug/kmemdump.rst | 104 +++++++++
> MAINTAINERS | 18 ++
> drivers/Kconfig | 4 +
> drivers/Makefile | 2 +
> drivers/debug/Kconfig | 55 +++++
> drivers/debug/Makefile | 6 +
> drivers/debug/kinfo.c | 304 +++++++++++++++++++++++++
> drivers/debug/kmemdump.c | 239 +++++++++++++++++++
> drivers/debug/kmemdump_coreimage.c | 223 ++++++++++++++++++
> drivers/debug/qcom_minidump.c | 353 +++++++++++++++++++++++++++++
> drivers/soc/qcom/smem.c | 10 +
> include/asm-generic/vmlinux.lds.h | 13 ++
> include/linux/kmemdump.h | 219 ++++++++++++++++++
> init/version.c | 6 +
> kernel/configs.c | 6 +
> kernel/cpu.c | 5 +
> kernel/fork.c | 2 +
> kernel/irq/irqdesc.c | 2 +
> kernel/kallsyms.c | 10 +
> kernel/panic.c | 4 +
> kernel/printk/printk.c | 28 ++-
> kernel/sched/core.c | 2 +
> kernel/time/timer.c | 3 +-
> kernel/vmcore_info.c | 3 +
> mm/init-mm.c | 12 +
> mm/mm_init.c | 2 +
> mm/numa.c | 5 +-
> mm/page_alloc.c | 2 +
> mm/percpu.c | 3 +
> mm/show_mem.c | 2 +
> mm/sparse.c | 16 +-
> mm/swapfile.c | 2 +
> 33 files changed, 1670 insertions(+), 12 deletions(-)
> create mode 100644 Documentation/debug/index.rst
> create mode 100644 Documentation/debug/kmemdump.rst
> create mode 100644 drivers/debug/Kconfig
> create mode 100644 drivers/debug/Makefile
> create mode 100644 drivers/debug/kinfo.c
> create mode 100644 drivers/debug/kmemdump.c
> create mode 100644 drivers/debug/kmemdump_coreimage.c
> create mode 100644 drivers/debug/qcom_minidump.c
> create mode 100644 include/linux/kmemdump.h
>
> --
> 2.43.0
>
--
-Mukesh Ojha
* Re: [RFC][PATCH v2 00/29] introduce kmemdump
2025-08-26 17:14 ` [RFC][PATCH v2 00/29] introduce kmemdump Mukesh Ojha
@ 2025-08-27 6:42 ` Eugen Hristev
0 siblings, 0 replies; 61+ messages in thread
From: Eugen Hristev @ 2025-08-27 6:42 UTC (permalink / raw)
To: Mukesh Ojha
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus
On 8/26/25 20:14, Mukesh Ojha wrote:
> On Thu, Jul 24, 2025 at 04:54:43PM +0300, Eugen Hristev wrote:
>> kmemdump is a mechanism which allows the kernel to mark specific memory
>> areas for dumping or specific backend usage.
>> Once regions are marked, kmemdump keeps an internal list with the regions
>> and registers them in the backend.
>> Further, depending on the backend driver, these regions can be dumped using
>> firmware or different hardware block.
>> Regions being marked beforehand, when the system is up and running, there
>> is no need nor dependency on a panic handler, or a working kernel that can
>> dump the debug information.
>> The kmemdump approach works when pstore, kdump, or another mechanism do not.
>> Pstore relies on persistent storage, a dedicated RAM area or flash, which
>> has the disadvantage of having the memory reserved all the time, or another
>> specific non volatile memory. Some devices cannot keep the RAM contents on
>> reboot so ramoops does not work. Some devices do not allow kexec to run
>> another kernel to debug the crashed one.
>> For such devices, that have another mechanism to help debugging, like
>> firmware, kmemdump is a viable solution.
>>
>> kmemdump can create a core image, similar with /proc/vmcore, with only
>> the registered regions included. This can be loaded into crash tool/gdb and
>> analyzed.
>> To have this working, specific information from the kernel is registered,
>> and this is done at kmemdump init time, no need for the kmemdump user to
>> do anything.
>>
>> This version of the kmemdump patch series includes two backend drivers:
>> one is the Qualcomm Minidump backend, and the other one is the Debug Kinfo
>> backend for Android devices, reworked from this source here:
>> https://android.googlesource.com/kernel/common/+/refs/heads/android-mainline/drivers/android/debug_kinfo.c
>> written originally by Jone Chou <jonechou@google.com>
>>
>> Initial version of kmemdump and discussion is available here:
>> https://lore.kernel.org/lkml/20250422113156.575971-1-eugen.hristev@linaro.org/
>>
>> Kmemdump has been presented and discussed at Linaro Connect 2025,
>> including motivation, scope, usability and feasibility.
>> Video of the recording is available here for anyone interested:
>> https://www.youtube.com/watch?v=r4gII7MX9zQ&list=PLKZSArYQptsODycGiE0XZdVovzAwYNwtK&index=14
>>
>> The implementation is based on the initial Pstore/directly mapped zones
>> published as an RFC here:
>> https://lore.kernel.org/all/20250217101706.2104498-1-eugen.hristev@linaro.org/
>>
>> The back-end implementation for qcom_minidump is based on the minidump
>> patch series and driver written by Mukesh Ojha, thanks:
>> https://lore.kernel.org/lkml/20240131110837.14218-1-quic_mojha@quicinc.com/
>>
>> *** How to use kmemdump with minidump backend on Qualcomm platform guide ***
>>
>> Prerequisites:
>> Crash tool with target=ARM64 and minor changes required for usual crash mode
>> (minimal mode works without the patch)
>> A patch can be applied from here https://p.calebs.dev/49a048
>> This patch will be eventually sent in a reworked way to crash tool.
>>
>> Target kernel must be built with :
>> CONFIG_DEBUG_INFO_REDUCED=n ; this will have vmlinux include all the debugging
>> information needed for crash tool.
>>
>> Otherwise, the kernel requires these as well:
>> CONFIG_KMEMDUMP, CONFIG_KMEMDUMP_COREIMAGE, and the backend
>> CONFIG_KMEMDUMP_QCOM_MINIDUMP_BACKEND
>>
>> Kernel arguments:
>> Kernel firmware must be set to mode 'mini' by kernel module parameter
>> like this : qcom_scm.download_mode=mini
>>
>> After the kernel boots, and qcom_minidump module is loaded, everything is ready for
>> a possible crash.
>>
>> Once the crash happens, the firmware will kick in and you will see on
>> the console the message saying Sahara init, etc, that the firmware is
>> waiting in download mode. (this is subject to firmware supporting this
>> mode, I am using sa8775p-ride board)
>>
>> Example of log on the console:
>> "
>> [...]
>> B - 1096414 - usb: init start
>> B - 1100287 - usb: qusb_dci_platform , 0x19
>> B - 1105686 - usb: usb3phy: PRIM success: lane_A , 0x60
>> B - 1107455 - usb: usb2phy: PRIM success , 0x4
>> B - 1112670 - usb: dci, chgr_type_det_err
>> B - 1117154 - usb: ID:0x260, value: 0x4
>> B - 1121942 - usb: ID:0x108, value: 0x1d90
>> B - 1124992 - usb: timer_start , 0x4c4b40
>> B - 1129140 - usb: vbus_det_pm_unavail
>> B - 1133136 - usb: ID:0x252, value: 0x4
>> B - 1148874 - usb: SUPER , 0x900e
>> B - 1275510 - usb: SUPER , 0x900e
>> B - 1388970 - usb: ID:0x20d, value: 0x0
>> B - 1411113 - usb: ENUM success
>> B - 1411113 - Sahara Init
>> B - 1414285 - Sahara Open
>> "
>>
>> Once the board is in download mode, you can use the qdl tool (I
>> personally use edl , have not tried qdl yet), to get all the regions as
>> separate files.
>> The tool from the host computer will list the regions in the order they
>> were downloaded.
>>
>> Once you have all the files simply use `cat` to put them all together,
>> in the order of the indexes.
>> For my kernel config and setup, here is my cat command : (you can use a script
>> or something, I haven't done that so far):
>>
>> `cat memory/md_KELF1.BIN memory/md_Kvmcorein2.BIN memory/md_Kconfig3.BIN \
>> memory/md_Kmemsect4.BIN memory/md_Ktotalram5.BIN memory/md_Kcpu_poss6.BIN \
>> memory/md_Kcpu_pres7.BIN memory/md_Kcpu_onli8.BIN memory/md_Kcpu_acti9.BIN \
>> memory/md_Kjiffies10.BIN memory/md_Klinux_ba11.BIN memory/md_Knr_threa12.BIN \
>> memory/md_Knr_irqs13.BIN memory/md_Ktainted_14.BIN memory/md_Ktaint_fl15.BIN \
>> memory/md_Kmem_sect16.BIN memory/md_Knode_dat17.BIN memory/md_Knode_sta18.BIN \
>> memory/md_K__per_cp19.BIN memory/md_Knr_swapf20.BIN memory/md_Kinit_uts21.BIN \
>> memory/md_Kprintk_r22.BIN memory/md_Kprintk_r23.BIN memory/md_Kprb24.BIN \
>> memory/md_Kprb_desc25.BIN memory/md_Kprb_info26.BIN memory/md_Kprb_data27.BIN \
>> memory/md_Krunqueue28.BIN memory/md_Khigh_mem29.BIN memory/md_Kinit_mm30.BIN \
>> memory/md_Kinit_mm_31.BIN memory/md_Kunknown32.BIN memory/md_Kunknown33.BIN \
>> memory/md_Kunknown34.BIN memory/md_Kunknown35.BIN memory/md_Kunknown36.BIN \
>> memory/md_Kunknown37.BIN memory/md_Kunknown38.BIN memory/md_Kunknown39.BIN \
>> memory/md_Kunknown40.BIN memory/md_Kunknown41.BIN memory/md_Kunknown42.BIN \
>> memory/md_Kunknown43.BIN memory/md_Kunknown44.BIN memory/md_Kunknown45.BIN \
>> memory/md_Kunknown46.BIN memory/md_Kunknown47.BIN memory/md_Kunknown50.BIN \
>> memory/md_Kunknown51.BIN memory/md_Kunknown52.BIN \
>> memory/md_Kunknown53.BIN > ~/minidump_image`
>>
>> Once you have the resulted file, use `crash` tool to load it, like this:
>> `./crash --no_modules --no_panic --no_kmem_cache --zero_excluded vmlinux minidump_image`
>>
>> There is also a --minimal mode for ./crash that would work without any patch applied
>> to crash tool, but you can't inspect symbols, etc.
>
> Unfortunately for me, only with --minimal option, I could see the 'log'.
>
> ./crash --no_modules --no_panic --no_kmem_cache --zero_excluded vmlinux minidump_image
>
> WARNING: kernel version inconsistency between vmlinux and dumpfile
>
> crash: read error: kernel virtual address: ffffff8ed7f380d8 type: "IRQ stack pointer"
> crash: read error: kernel virtual address: ffffff8ed7f510d8 type: "IRQ stack pointer"
> crash: read error: kernel virtual address: ffffff8ed7f6a0d8 type: "IRQ stack pointer"
> crash: read error: kernel virtual address: ffffff8ed7f830d8 type: "IRQ stack pointer"
> crash: read error: kernel virtual address: ffffff8ed7f9c0d8 type: "IRQ stack pointer"
> crash: read error: kernel virtual address: ffffff8ed7fb50d8 type: "IRQ stack pointer"
> crash: read error: kernel virtual address: ffffff8ed7fce0d8 type: "IRQ stack pointer"
> crash: read error: kernel virtual address: ffffff8ed7fe70d8 type: "IRQ stack pointer"
> crash: read error: kernel virtual address: ffffffc0817c5d80 type: "maple_init read mt_slots"
> crash: read error: kernel virtual address: ffffffc0817c5d78 type: "maple_init read mt_pivots"
> crash: read error: kernel virtual address: ffffff8efb89e2c0 type: "memory section root table"
>
> Looks like something more you are using in your setup to make it work.

Hello Mukesh,

Thanks for trying this out. Have you applied the indicated patch to the
crash tool before compiling it?
If so, and you are still facing issues, could you run it with "-d 31" to
enable debug mode and send me the output log, please?

Eugen

>
> -Mukesh
>
>>
>> Once you load crash you will see something like this :
>>
>> KERNEL: /home/eugen/linux-minidump/vmlinux [TAINTED]
>> DUMPFILE: /home/eugen/new
>> CPUS: 8 [OFFLINE: 7]
>> DATE: Thu Jan 1 02:00:00 EET 1970
>> UPTIME: 00:00:29
>> TASKS: 0
>> NODENAME: qemuarm64
>> RELEASE: 6.16.0-rc7-next-20250721-00029-gf8cffdbf0479-dirty
>> VERSION: #5 SMP PREEMPT Tue Jul 22 18:44:57 EEST 2025
>> MACHINE: aarch64 (unknown Mhz)
>> MEMORY: 34.2 GB
>> PANIC: ""
>> crash> log
>> [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x410fd4b2]
>> [ 0.000000] Linux version 6.16.0-rc7-next-20250721-00029-gf8cffdbf0479-dirty (eugen@eugen-station) (aarch64-none-linux-gnu-gcc (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 13.3.1 20240614, GNU ld (Arm GNU Toolchain 13.3.Rel1 (Build arm-13.24)) 2.42.0.20240614) #5 SMP PREEMPT Tue Jul 22 18:44:57 EEST 2025
>>
>>
>> *** Debug Kinfo backend driver ***
>> I don't have any device to actually test this. So I have not.
>> I hacked the driver to just use a kmalloc'ed area to save things instead
>> of the shared memory, and dumped everything there and checked whether it looks
>> sane. If someone is willing to try it out, thanks ! and let me know.
>> I know there is no binding documentation for the compatible either.
>>
>> Thanks for everyone reviewing and bringing ideas into the discussion.
>>
>> Eugen
>>
>> Changelog since the v1 of the RFC:
>> - Reworked the whole minidump implementation based on suggestions from Thomas Gleixner.
>> This means new API, macros, new way to store the regions inside kmemdump
>> (ditched the IDR, moved to static allocation, have a static default backend, etc)
>> - Reworked qcom_minidump driver based on review from Bjorn Andersson
>> - Reworked printk log buffer registration based on review from Petr Mladek
>>
>> I apologize if I missed any review comments. I know there is still lots of work
>> on this series and hope I will improve it more and more.
>> Patches are sent on top of next-20250721
>>
>> Eugen Hristev (29):
>> kmemdump: introduce kmemdump
>> Documentation: add kmemdump
>> kmemdump: add coreimage ELF layer
>> Documentation: kmemdump: add section for coreimage ELF
>> kmemdump: introduce qcom-minidump backend driver
>> soc: qcom: smem: add minidump device
>> init/version: Annotate static information into Kmemdump
>> cpu: Annotate static information into Kmemdump
>> genirq/irqdesc: Annotate static information into Kmemdump
>> panic: Annotate static information into Kmemdump
>> sched/core: Annotate static information into Kmemdump
>> timers: Annotate static information into Kmemdump
>> kernel/fork: Annotate static information into Kmemdump
>> mm/page_alloc: Annotate static information into Kmemdump
>> mm/init-mm: Annotate static information into Kmemdump
>> mm/show_mem: Annotate static information into Kmemdump
>> mm/swapfile: Annotate static information into Kmemdump
>> mm/percpu: Annotate static information into Kmemdump
>> mm/mm_init: Annotate static information into Kmemdump
>> printk: Register information into Kmemdump
>> kernel/configs: Register dynamic information into Kmemdump
>> mm/numa: Register information into Kmemdump
>> mm/sparse: Register information into Kmemdump
>> kernel/vmcore_info: Register dynamic information into Kmemdump
>> kmemdump: Add additional symbols to the coreimage
>> init/version: Annotate init uts name separately into Kmemdump
>> kallsyms: Annotate static information into Kmemdump
>> mm/init-mm: Annotate additional information into Kmemdump
>> kmemdump: Add Kinfo backend driver
>>
>> Documentation/debug/index.rst | 17 ++
>> Documentation/debug/kmemdump.rst | 104 +++++++++
>> MAINTAINERS | 18 ++
>> drivers/Kconfig | 4 +
>> drivers/Makefile | 2 +
>> drivers/debug/Kconfig | 55 +++++
>> drivers/debug/Makefile | 6 +
>> drivers/debug/kinfo.c | 304 +++++++++++++++++++++++++
>> drivers/debug/kmemdump.c | 239 +++++++++++++++++++
>> drivers/debug/kmemdump_coreimage.c | 223 ++++++++++++++++++
>> drivers/debug/qcom_minidump.c | 353 +++++++++++++++++++++++++++++
>> drivers/soc/qcom/smem.c | 10 +
>> include/asm-generic/vmlinux.lds.h | 13 ++
>> include/linux/kmemdump.h | 219 ++++++++++++++++++
>> init/version.c | 6 +
>> kernel/configs.c | 6 +
>> kernel/cpu.c | 5 +
>> kernel/fork.c | 2 +
>> kernel/irq/irqdesc.c | 2 +
>> kernel/kallsyms.c | 10 +
>> kernel/panic.c | 4 +
>> kernel/printk/printk.c | 28 ++-
>> kernel/sched/core.c | 2 +
>> kernel/time/timer.c | 3 +-
>> kernel/vmcore_info.c | 3 +
>> mm/init-mm.c | 12 +
>> mm/mm_init.c | 2 +
>> mm/numa.c | 5 +-
>> mm/page_alloc.c | 2 +
>> mm/percpu.c | 3 +
>> mm/show_mem.c | 2 +
>> mm/sparse.c | 16 +-
>> mm/swapfile.c | 2 +
>> 33 files changed, 1670 insertions(+), 12 deletions(-)
>> create mode 100644 Documentation/debug/index.rst
>> create mode 100644 Documentation/debug/kmemdump.rst
>> create mode 100644 drivers/debug/Kconfig
>> create mode 100644 drivers/debug/Makefile
>> create mode 100644 drivers/debug/kinfo.c
>> create mode 100644 drivers/debug/kmemdump.c
>> create mode 100644 drivers/debug/kmemdump_coreimage.c
>> create mode 100644 drivers/debug/qcom_minidump.c
>> create mode 100644 include/linux/kmemdump.h
>>
>> --
>> 2.43.0
>>
>
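
As an aside on the manual `cat` step quoted above (joining the downloaded
region files in index order), the same thing could be scripted. The sketch
below is only a minimal example under stated assumptions: it assumes the
files follow the md_K<name><index>.BIN pattern seen in the command above and
that the trailing digits before ".BIN" are the index. The function names are
hypothetical and are not part of kmemdump, edl or qdl.

```python
import re
import shutil
from pathlib import Path

# Assumption: region files look like md_KELF1.BIN or md_Kunknown53.BIN, and
# the digits right before ".BIN" are the download index.
REGION_RE = re.compile(r"^md_K.*?(\d+)\.BIN$")


def region_index(path):
    """Return the trailing index of a region file, or None if the name does not match."""
    match = REGION_RE.match(Path(path).name)
    return int(match.group(1)) if match else None


def ordered_regions(directory):
    """List the region files found in `directory`, sorted by numeric index."""
    files = [p for p in Path(directory).iterdir() if region_index(p) is not None]
    return sorted(files, key=region_index)


def concatenate(directory, output):
    """Concatenate all region files from `directory` into `output`, in index order."""
    regions = ordered_regions(directory)
    with open(output, "wb") as out:
        for region in regions:
            with open(region, "rb") as src:
                shutil.copyfileobj(src, out)
    return regions
```

For the layout used in this thread, a call such as
concatenate("memory", Path.home() / "minidump_image") would stand in for the
long `cat` command. The script joins whichever indexes are present and does
not check for gaps, and a region name that itself ends in a digit would be
misparsed, so the resulting order is worth checking against the list printed
by the download tool.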
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-25 13:58 ` David Hildenbrand
@ 2025-08-27 11:59 ` Eugen Hristev
2025-08-27 12:18 ` David Hildenbrand
0 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-08-27 11:59 UTC (permalink / raw)
To: David Hildenbrand, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
On 8/25/25 16:58, David Hildenbrand wrote:
> On 25.08.25 15:36, Eugen Hristev wrote:
>>
>>
>> On 8/25/25 16:20, David Hildenbrand wrote:
>>>
>>>>>
>>>>> IIRC, kernel/vmcore_info.c is never built as a module, as it also
>>>>> accesses non-exported symbols.
>>>>
>>>> Hello David,
>>>>
>>>> I am looking again into this, and there are some things which in my
>>>> opinion would be difficult to achieve.
>>>> For example I looked into my patch #11 , which adds the `runqueues` into
>>>> kmemdump.
>>>>
>>>> The runqueues is a variable of `struct rq` which is defined in
>>>> kernel/sched/sched.h , which is not supposed to be included outside of
>>>> sched.
>>>> Now moving all the struct definition outside of sched.h into another
>>>> public header would be rather painful and I don't think it's a really
>>>> good option (The struct would be needed to compute the sizeof inside
>>>> vmcoreinfo). Secondly, it would also imply moving all the nested struct
>>>> definitions outside as well. I doubt this is something that we want for
>>>> the sched subsys. How the subsys is designed, out of my understanding,
>>>> is to keep these internal structs opaque outside of it.
>>>
>>> All the kmemdump module needs is a start and a length, correct? So the
>>> only tricky part is getting the length.
>>
>> I also have in mind the kernel user case. How would a kernel programmer
>> want to add some kernel structs/info/buffers into kmemdump such that the
>> dump would contain their data ? Having "KMEMDUMP_VAR(...)" looks simple
>> enough.
>
> The other way around, why should anybody have a saying in adding their
> data to kmemdump? Why do we have that all over the kernel?
>
> Is your mechanism really so special?
>
> A single composer should take care of that, and it's really just start +
> len of physical memory areas.
>
>> Otherwise maybe the programmer has to write helpers to compute lengths
>> etc, and stitch them into kmemdump core.
>> I am not saying it's impossible, but just tiresome perhaps.
>
> In your patch set, how many of these instances did you encounter where
> that was a problem?
>
>>>
>>> One could just add a const variable that holds this information, or even
>>> better, a simple helper function to calculate that.
>>>
>>> Maybe someone else reading along has a better idea.
>>
>> This could work, but it requires again adding some code into the
>> specific subsystem. E.g. struct_rq_get_size()
>> I am open to ideas , and thank you very much for your thoughts.
>>
>>>
>>> Interestingly, runqueues is a percpu variable, which makes me wonder if
>>> what you had would work as intended (maybe it does, not sure).
>>
>> I would not really need to dump the runqueues. But the crash tool which
>> I am using for testing, requires it. Without the runqueues it will not
>> progress further to load the kernel dump.
>> So I am not really sure what it does with the runqueues, but it works.
>> Perhaps using crash/gdb more, to actually do something with this data,
>> would give more insight about its utility.
>> For me, it is a prerequisite to run crash, and then to be able to
>> extract the log buffer from the dump.
>
> I have the faint recollection that percpu vars might not be stored in a
> single contiguous physical memory area, but maybe my memory is just
> wrong, that's why I was raising it.
>
>>
>>>
>>>>
>>>> From my perspective it's much simpler and cleaner to just add the
>>>> kmemdump annotation macro inside the sched/core.c as it's done in my
>>>> patch. This macro translates to a noop if kmemdump is not selected.
>>>
>>> I really don't like how we are spreading kmemdump all over the kernel,
>>> and adding complexity with __section when really, all we need is a place
>>> to obtain a start and a length.
>>>
>>
>> I understand. The section idea was suggested by Thomas. Initially I was
>> skeptic, but I like how it turned out.
>
> Yeah, I don't like it. Taste differs ;)
>
> I am in particular unhappy about custom memblock wrappers.
>
> [...]
>
>>>>
>>>> To have this working outside of printk, it would be required to walk
>>>> through all the printk structs/allocations and select the required info.
>>>> Is this something that we want to do outside of printk ?
>>>
>>> I don't follow, please elaborate.
>>>
>>> How is e.g., log_buf_len_get() + log_buf_addr_get() not sufficient,
>>> given that you run your initialization after setup_log_buf() ?
>>>
>>>
>>
>> My initial thought was the same. However I got some feedback from Petr
>> Mladek here :
>>
>> https://lore.kernel.org/lkml/aBm5QH2p6p9Wxe_M@localhost.localdomain/
>>
>> Where he explained how to register the structs correctly.
>> It can be that setup_log_buf is called again at a later time perhaps.
>>
>
> setup_log_buf() is a __init function, so there is only a certain time
> frame where it can be called.
>
> In particular, once the buddy is up, memblock allocations are impossible
> and it would be deeply flawed to call this function again.
>
> Let's not over-engineer this.
>
> Peter is on CC, so hopefully he can share his thoughts.
>

Hello David,

I tested out this snippet (on top of my series, so you can see what I
changed):

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 18ba6c1e174f..7ac4248a00e5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -67,7 +67,6 @@
 #include <linux/wait_api.h>
 #include <linux/workqueue_api.h>
 #include <linux/livepatch_sched.h>
-#include <linux/kmemdump.h>
 
 #ifdef CONFIG_PREEMPT_DYNAMIC
 # ifdef CONFIG_GENERIC_IRQ_ENTRY
@@ -120,7 +119,12 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp);
 EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
 
 DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
-KMEMDUMP_VAR_CORE(runqueues, sizeof(runqueues));
+
+size_t runqueues_get_size(void);
+size_t runqueues_get_size(void)
+{
+	return sizeof(runqueues);
+}
 
 #ifdef CONFIG_SCHED_PROXY_EXEC
 DEFINE_STATIC_KEY_TRUE(__sched_proxy_exec);
diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
index d808c5e67f35..c6dd2d6e96dd 100644
--- a/kernel/vmcore_info.c
+++ b/kernel/vmcore_info.c
@@ -24,6 +24,12 @@
 #include "kallsyms_internal.h"
 #include "kexec_internal.h"
 
+typedef void* kmemdump_opaque_t;
+
+size_t runqueues_get_size(void);
+
+extern kmemdump_opaque_t runqueues;
+
 /* vmcoreinfo stuff */
 unsigned char *vmcoreinfo_data;
 size_t vmcoreinfo_size;
@@ -230,6 +236,9 @@ static int __init crash_save_vmcoreinfo_init(void)
 
 	kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_VMCOREINFO,
 			     (void *)vmcoreinfo_data, vmcoreinfo_size);
+	kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_runqueues,
+			     (void *)&runqueues, runqueues_get_size());
+
 	return 0;
 }

With this, there is no more .section and no kmemdump code in sched. However,
there are a few issues:

First, the size function is quite dull and does not fit into sched very
well.
Second, the extern declaration uses a different "opaque" type to avoid
exposing the struct rq definition, which is quite hackish.

What do you think?
My opinion is that it's ugly, but maybe you have a better idea of how to
write this more nicely?
(I am also not 100% sure I did this the way you wanted.)

Thanks for helping out,
Eugen

* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-27 11:59 ` Eugen Hristev
@ 2025-08-27 12:18 ` David Hildenbrand
2025-08-27 14:08 ` Eugen Hristev
0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2025-08-27 12:18 UTC (permalink / raw)
To: Eugen Hristev, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
On 27.08.25 13:59, Eugen Hristev wrote:
>
>
> On 8/25/25 16:58, David Hildenbrand wrote:
>> On 25.08.25 15:36, Eugen Hristev wrote:
>>>
>>>
>>> On 8/25/25 16:20, David Hildenbrand wrote:
>>>>
>>>>>>
>>>>>> IIRC, kernel/vmcore_info.c is never built as a module, as it also
>>>>>> accesses non-exported symbols.
>>>>>
>>>>> Hello David,
>>>>>
>>>>> I am looking again into this, and there are some things which in my
>>>>> opinion would be difficult to achieve.
>>>>> For example I looked into my patch #11 , which adds the `runqueues` into
>>>>> kmemdump.
>>>>>
>>>>> The runqueues is a variable of `struct rq` which is defined in
>>>>> kernel/sched/sched.h , which is not supposed to be included outside of
>>>>> sched.
>>>>> Now moving all the struct definition outside of sched.h into another
>>>>> public header would be rather painful and I don't think it's a really
>>>>> good option (The struct would be needed to compute the sizeof inside
>>>>> vmcoreinfo). Secondly, it would also imply moving all the nested struct
>>>>> definitions outside as well. I doubt this is something that we want for
>>>>> the sched subsys. How the subsys is designed, out of my understanding,
>>>>> is to keep these internal structs opaque outside of it.
>>>>
>>>> All the kmemdump module needs is a start and a length, correct? So the
>>>> only tricky part is getting the length.
>>>
>>> I also have in mind the kernel user case. How would a kernel programmer
>>> want to add some kernel structs/info/buffers into kmemdump such that the
>>> dump would contain their data ? Having "KMEMDUMP_VAR(...)" looks simple
>>> enough.
>>
>> The other way around, why should anybody have a saying in adding their
>> data to kmemdump? Why do we have that all over the kernel?
>>
>> Is your mechanism really so special?
>>
>> A single composer should take care of that, and it's really just start +
>> len of physical memory areas.
>>
>>> Otherwise maybe the programmer has to write helpers to compute lengths
>>> etc, and stitch them into kmemdump core.
>>> I am not saying it's impossible, but just tiresome perhaps.
>>
>> In your patch set, how many of these instances did you encounter where
>> that was a problem?
>>
>>>>
>>>> One could just add a const variable that holds this information, or even
>>>> better, a simple helper function to calculate that.
>>>>
>>>> Maybe someone else reading along has a better idea.
>>>
>>> This could work, but it requires again adding some code into the
>>> specific subsystem. E.g. struct_rq_get_size()
>>> I am open to ideas , and thank you very much for your thoughts.
>>>
>>>>
>>>> Interestingly, runqueues is a percpu variable, which makes me wonder if
>>>> what you had would work as intended (maybe it does, not sure).
>>>
>>> I would not really need to dump the runqueues. But the crash tool which
>>> I am using for testing, requires it. Without the runqueues it will not
>>> progress further to load the kernel dump.
>>> So I am not really sure what it does with the runqueues, but it works.
>>> Perhaps using crash/gdb more, to actually do something with this data,
>>> would give more insight about its utility.
>>> For me, it is a prerequisite to run crash, and then to be able to
>>> extract the log buffer from the dump.
>>
>> I have the faint recollection that percpu vars might not be stored in a
>> single contiguous physical memory area, but maybe my memory is just
>> wrong, that's why I was raising it.
>>
>>>
>>>>
>>>>>
>>>>> From my perspective it's much simpler and cleaner to just add the
>>>>> kmemdump annotation macro inside the sched/core.c as it's done in my
>>>>> patch. This macro translates to a noop if kmemdump is not selected.
>>>>
>>>> I really don't like how we are spreading kmemdump all over the kernel,
>>>> and adding complexity with __section when really, all we need is a place
>>>> to obtain a start and a length.
>>>>
>>>
>>> I understand. The section idea was suggested by Thomas. Initially I was
>>> skeptic, but I like how it turned out.
>>
>> Yeah, I don't like it. Taste differs ;)
>>
>> I am in particular unhappy about custom memblock wrappers.
>>
>> [...]
>>
>>>>>
>>>>> To have this working outside of printk, it would be required to walk
>>>>> through all the printk structs/allocations and select the required info.
>>>>> Is this something that we want to do outside of printk ?
>>>>
>>>> I don't follow, please elaborate.
>>>>
>>>> How is e.g., log_buf_len_get() + log_buf_addr_get() not sufficient,
>>>> given that you run your initialization after setup_log_buf() ?
>>>>
>>>>
>>>
>>> My initial thought was the same. However I got some feedback from Petr
>>> Mladek here :
>>>
>>> https://lore.kernel.org/lkml/aBm5QH2p6p9Wxe_M@localhost.localdomain/
>>>
>>> Where he explained how to register the structs correctly.
>>> It can be that setup_log_buf is called again at a later time perhaps.
>>>
>>
>> setup_log_buf() is a __init function, so there is only a certain time
>> frame where it can be called.
>>
>> In particular, once the buddy is up, memblock allocations are impossible
>> and it would be deeply flawed to call this function again.
>>
>> Let's not over-engineer this.
>>
>> Peter is on CC, so hopefully he can share his thoughts.
>>
>
> Hello David,
>
> I tested out this snippet (on top of my series, so you can see what I
> changed):
>
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 18ba6c1e174f..7ac4248a00e5 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -67,7 +67,6 @@
> #include <linux/wait_api.h>
> #include <linux/workqueue_api.h>
> #include <linux/livepatch_sched.h>
> -#include <linux/kmemdump.h>
>
> #ifdef CONFIG_PREEMPT_DYNAMIC
> # ifdef CONFIG_GENERIC_IRQ_ENTRY
> @@ -120,7 +119,12 @@
> EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp);
> EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
>
> DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
> -KMEMDUMP_VAR_CORE(runqueues, sizeof(runqueues));
> +
> +size_t runqueues_get_size(void);
> +size_t runqueues_get_size(void)
> +{
> + return sizeof(runqueues);
> +}
>
> #ifdef CONFIG_SCHED_PROXY_EXEC
> DEFINE_STATIC_KEY_TRUE(__sched_proxy_exec);
> diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
> index d808c5e67f35..c6dd2d6e96dd 100644
> --- a/kernel/vmcore_info.c
> +++ b/kernel/vmcore_info.c
> @@ -24,6 +24,12 @@
> #include "kallsyms_internal.h"
> #include "kexec_internal.h"
>
> +typedef void* kmemdump_opaque_t;
> +
> +size_t runqueues_get_size(void);
> +
> +extern kmemdump_opaque_t runqueues;
I would have tried that through:
struct rq;
extern struct rq runqueues;
But the whole PER_CPU_SHARED_ALIGNED makes this all weird, and likely
not the way we would want to handle that.
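To illustrate the forward-declaration idea in isolation, here is a minimal user-space sketch. It deliberately ignores the per-CPU part, and every name in it (demo_rq, demo_runqueues and the two helpers) is a hypothetical stand-in, not something from the series. It shows that the address of an object with an incomplete type can be taken, while its size still has to come from code that sees the full definition:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rq: only a forward declaration here. */
struct demo_rq;
extern struct demo_rq demo_runqueues;

/* Taking the address of an object of incomplete type is allowed. */
const void *demo_runqueues_start(void)
{
	return &demo_runqueues;
}

/*
 * sizeof(demo_runqueues) would not compile at this point, because the
 * type is still incomplete. The size has to be provided by the code
 * that sees the full definition.
 */
size_t demo_runqueues_size(void);

/* "Subsystem" side: the full definition is only visible from here on. */
struct demo_rq {
	unsigned int nr_running;
	unsigned long clock;
};

struct demo_rq demo_runqueues;

size_t demo_runqueues_size(void)
{
	return sizeof(demo_runqueues);
}
```

In a real tree the two halves would live in different files; they share one file here only so the sketch compiles on its own.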
> /* vmcoreinfo stuff */
> unsigned char *vmcoreinfo_data;
> size_t vmcoreinfo_size;
> @@ -230,6 +236,9 @@ static int __init crash_save_vmcoreinfo_init(void)
>
> kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_VMCOREINFO,
> (void *)vmcoreinfo_data, vmcoreinfo_size);
> + kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_runqueues,
> + (void *)&runqueues, runqueues_get_size());
> +
> return 0;
> }
>
> With this, no more .section, no kmemdump code into sched, however, there
> are few things :
I would really just do here something like the following:
/**
* sched_get_runqueues_area - obtain the runqueues area for dumping
* @start: ...
* @size: ...
*
* The obtained area is only to be used for dumping purposes.
*/
void sched_get_runqueues_area(void **start, size_t *size)
{
	*start = &runqueues;
	*size = sizeof(runqueues);
}
might be cleaner.
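A minimal user-space sketch of such an accessor, assuming the caller passes pointers to receive the two results. The array demo_runqueues and the function name are hypothetical stand-ins, since struct rq and the real per-CPU runqueues are not available outside the kernel:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the scheduler's private runqueue data. */
static unsigned long demo_runqueues[8];

/*
 * Out-parameters have to be pointers: assigning to by-value arguments
 * would only change the callee's local copies, and the caller would
 * never see the result.
 */
void demo_get_runqueues_area(void **start, size_t *size)
{
	assert(start && size);
	*start = demo_runqueues;
	*size = sizeof(demo_runqueues);
}
```

A caller would declare `void *start; size_t size;` and pass `&start, &size`, then hand both values to whatever registers the region.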
Having said that, if you realize that there is a fundamental issue with
what I propose, please speak up.
So far, I feel like there is only a limited number of "suboptimal" cases
of this kind, but I might be wrong of course.
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-27 12:18 ` David Hildenbrand
@ 2025-08-27 14:08 ` Eugen Hristev
2025-08-27 20:06 ` David Hildenbrand
0 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-08-27 14:08 UTC (permalink / raw)
To: David Hildenbrand, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
On 8/27/25 15:18, David Hildenbrand wrote:
> On 27.08.25 13:59, Eugen Hristev wrote:
>>
>>
>> On 8/25/25 16:58, David Hildenbrand wrote:
>>> On 25.08.25 15:36, Eugen Hristev wrote:
>>>>
>>>>
>>>> On 8/25/25 16:20, David Hildenbrand wrote:
>>>>>
>>>>>>>
>>>>>>> IIRC, kernel/vmcore_info.c is never built as a module, as it also
>>>>>>> accesses non-exported symbols.
>>>>>>
>>>>>> Hello David,
>>>>>>
>>>>>> I am looking again into this, and there are some things which in my
>>>>>> opinion would be difficult to achieve.
>>>>>> For example I looked into my patch #11 , which adds the `runqueues` into
>>>>>> kmemdump.
>>>>>>
>>>>>> The runqueues is a variable of `struct rq` which is defined in
>>>>>> kernel/sched/sched.h , which is not supposed to be included outside of
>>>>>> sched.
>>>>>> Now moving all the struct definition outside of sched.h into another
>>>>>> public header would be rather painful and I don't think it's a really
>>>>>> good option (The struct would be needed to compute the sizeof inside
>>>>>> vmcoreinfo). Secondly, it would also imply moving all the nested struct
>>>>>> definitions outside as well. I doubt this is something that we want for
>>>>>> the sched subsys. How the subsys is designed, out of my understanding,
>>>>>> is to keep these internal structs opaque outside of it.
>>>>>
>>>>> All the kmemdump module needs is a start and a length, correct? So the
>>>>> only tricky part is getting the length.
>>>>
>>>> I also have in mind the kernel user case. How would a kernel programmer
>>>> want to add some kernel structs/info/buffers into kmemdump such that the
>>>> dump would contain their data ? Having "KMEMDUMP_VAR(...)" looks simple
>>>> enough.
>>>
>>> The other way around, why should anybody have a saying in adding their
>>> data to kmemdump? Why do we have that all over the kernel?
>>>
>>> Is your mechanism really so special?
>>>
>>> A single composer should take care of that, and it's really just start +
>>> len of physical memory areas.
>>>
>>>> Otherwise maybe the programmer has to write helpers to compute lengths
>>>> etc, and stitch them into kmemdump core.
>>>> I am not saying it's impossible, but just tiresome perhaps.
>>>
>>> In your patch set, how many of these instances did you encounter where
>>> that was a problem?
>>>
>>>>>
>>>>> One could just add a const variable that holds this information, or even
>>>>> better, a simple helper function to calculate that.
>>>>>
>>>>> Maybe someone else reading along has a better idea.
>>>>
>>>> This could work, but it requires again adding some code into the
>>>> specific subsystem. E.g. struct_rq_get_size()
>>>> I am open to ideas , and thank you very much for your thoughts.
>>>>
>>>>>
>>>>> Interestingly, runqueues is a percpu variable, which makes me wonder if
>>>>> what you had would work as intended (maybe it does, not sure).
>>>>
>>>> I would not really need to dump the runqueues. But the crash tool which
>>>> I am using for testing, requires it. Without the runqueues it will not
>>>> progress further to load the kernel dump.
>>>> So I am not really sure what it does with the runqueues, but it works.
>>>> Perhaps using crash/gdb more, to actually do something with this data,
>>>> would give more insight about its utility.
>>>> For me, it is a prerequisite to run crash, and then to be able to
>>>> extract the log buffer from the dump.
>>>
>>> I have the faint recollection that percpu vars might not be stored in a
>>> single contiguous physical memory area, but maybe my memory is just
>>> wrong, that's why I was raising it.
>>>
>>>>
>>>>>
>>>>>>
>>>>>> From my perspective it's much simpler and cleaner to just add the
>>>>>> kmemdump annotation macro inside the sched/core.c as it's done in my
>>>>>> patch. This macro translates to a noop if kmemdump is not selected.
>>>>>
>>>>> I really don't like how we are spreading kmemdump all over the kernel,
>>>>> and adding complexity with __section when really, all we need is a place
>>>>> to obtain a start and a length.
>>>>>
>>>>
>>>> I understand. The section idea was suggested by Thomas. Initially I was
>>>> skeptic, but I like how it turned out.
>>>
>>> Yeah, I don't like it. Taste differs ;)
>>>
>>> I am in particular unhappy about custom memblock wrappers.
>>>
>>> [...]
>>>
>>>>>>
>>>>>> To have this working outside of printk, it would be required to walk
>>>>>> through all the printk structs/allocations and select the required info.
>>>>>> Is this something that we want to do outside of printk ?
>>>>>
>>>>> I don't follow, please elaborate.
>>>>>
>>>>> How is e.g., log_buf_len_get() + log_buf_addr_get() not sufficient,
>>>>> given that you run your initialization after setup_log_buf() ?
>>>>>
>>>>>
>>>>
>>>> My initial thought was the same. However I got some feedback from Petr
>>>> Mladek here :
>>>>
>>>> https://lore.kernel.org/lkml/aBm5QH2p6p9Wxe_M@localhost.localdomain/
>>>>
>>>> Where he explained how to register the structs correctly.
>>>> It can be that setup_log_buf is called again at a later time perhaps.
>>>>
>>>
>>> setup_log_buf() is a __init function, so there is only a certain time
>>> frame where it can be called.
>>>
>>> In particular, once the buddy is up, memblock allocations are impossible
>>> and it would be deeply flawed to call this function again.
>>>
>>> Let's not over-engineer this.
>>>
>>> Peter is on CC, so hopefully he can share his thoughts.
>>>
>>
>> Hello David,
>>
>> I tested out this snippet (on top of my series, so you can see what I
>> changed):
>>
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 18ba6c1e174f..7ac4248a00e5 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -67,7 +67,6 @@
>> #include <linux/wait_api.h>
>> #include <linux/workqueue_api.h>
>> #include <linux/livepatch_sched.h>
>> -#include <linux/kmemdump.h>
>>
>> #ifdef CONFIG_PREEMPT_DYNAMIC
>> # ifdef CONFIG_GENERIC_IRQ_ENTRY
>> @@ -120,7 +119,12 @@
>> EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp);
>> EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
>>
>> DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
>> -KMEMDUMP_VAR_CORE(runqueues, sizeof(runqueues));
>> +
>> +size_t runqueues_get_size(void);
>> +size_t runqueues_get_size(void)
>> +{
>> + return sizeof(runqueues);
>> +}
>>
>> #ifdef CONFIG_SCHED_PROXY_EXEC
>> DEFINE_STATIC_KEY_TRUE(__sched_proxy_exec);
>> diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
>> index d808c5e67f35..c6dd2d6e96dd 100644
>> --- a/kernel/vmcore_info.c
>> +++ b/kernel/vmcore_info.c
>> @@ -24,6 +24,12 @@
>> #include "kallsyms_internal.h"
>> #include "kexec_internal.h"
>>
>> +typedef void* kmemdump_opaque_t;
>> +
>> +size_t runqueues_get_size(void);
>> +
>> +extern kmemdump_opaque_t runqueues;
>
> I would have tried that through:
>
> struct rq;
> extern struct rq runqueues;
>
> But the whole PER_CPU_SHARED_ALIGNED makes this all weird, and likely
> not the way we would want to handle that.
>
>> /* vmcoreinfo stuff */
>> unsigned char *vmcoreinfo_data;
>> size_t vmcoreinfo_size;
>> @@ -230,6 +236,9 @@ static int __init crash_save_vmcoreinfo_init(void)
>>
>> kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_VMCOREINFO,
>> (void *)vmcoreinfo_data, vmcoreinfo_size);
>> + kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_runqueues,
>> + (void *)&runqueues, runqueues_get_size());
>> +
>> return 0;
>> }
>>
>> With this, no more .section, no kmemdump code into sched, however, there
>> are few things :
>
> I would really just do here something like the following:
>
> /**
> * sched_get_runqueues_area - obtain the runqueues area for dumping
> * @start: ...
> * @size: ...
> *
> * The obtained area is only to be used for dumping purposes.
> */
> void sched_get_runqueues_area(void *start, size_t size)
> {
> start = &runqueues;
> size = sizeof(runqueues);
> }
>
> might be cleaner.
>
How about this in the header:
#define DECLARE_DUMP_AREA_FUNC(subsys, symbol) \
void subsys ## _get_ ## symbol ##_area(void **start, size_t *size);
#define DEFINE_DUMP_AREA_FUNC(subsys, symbol) \
void subsys ## _get_ ## symbol ##_area(void **start, size_t *size)\
{\
*start = &symbol;\
*size = sizeof(symbol);\
}
then, in sched just
DECLARE_DUMP_AREA_FUNC(sched, runqueues);
DEFINE_DUMP_AREA_FUNC(sched, runqueues);
or a single macro that wraps both, which would make it shorter and
neater.
What do you think ?
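For reference, a user-space sketch of how the two macros would expand. The variable demo_counters and the "demo" subsystem prefix are hypothetical stand-ins. One deviation from the macros above is an assumption of this sketch: the trailing semicolon is left out of the DECLARE macro so that the invocation supplies it.

```c
#include <assert.h>
#include <stddef.h>

/* Declares <subsys>_get_<symbol>_area(); the invocation adds the ';'. */
#define DECLARE_DUMP_AREA_FUNC(subsys, symbol) \
	void subsys ## _get_ ## symbol ## _area(void **start, size_t *size)

/* Defines the accessor: it reports the address and size of @symbol. */
#define DEFINE_DUMP_AREA_FUNC(subsys, symbol) \
	void subsys ## _get_ ## symbol ## _area(void **start, size_t *size) \
	{ \
		*start = &symbol; \
		*size = sizeof(symbol); \
	}

/* Hypothetical stand-in for a subsystem-private variable. */
static long demo_counters[4];

/* Expands to: void demo_get_demo_counters_area(void **, size_t *); */
DECLARE_DUMP_AREA_FUNC(demo, demo_counters);
DEFINE_DUMP_AREA_FUNC(demo, demo_counters)
```

Note that the generated function name is built from both arguments, so grepping for demo_get_demo_counters_area finds only its callers, not its definition.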
>
> Having said that, if you realize that there is a fundamental issue with
> what I propose, please speak up.
>
> So far, I feel like there are only limited number of "suboptimal" cases
> of this kind, but I might be wrong of course.
>
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-27 14:08 ` Eugen Hristev
@ 2025-08-27 20:06 ` David Hildenbrand
2025-09-01 8:57 ` Eugen Hristev
0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2025-08-27 20:06 UTC (permalink / raw)
To: Eugen Hristev, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
On 27.08.25 16:08, Eugen Hristev wrote:
>
>
> On 8/27/25 15:18, David Hildenbrand wrote:
>> On 27.08.25 13:59, Eugen Hristev wrote:
>>>
>>>
>>> On 8/25/25 16:58, David Hildenbrand wrote:
>>>> On 25.08.25 15:36, Eugen Hristev wrote:
>>>>>
>>>>>
>>>>> On 8/25/25 16:20, David Hildenbrand wrote:
>>>>>>
>>>>>>>>
>>>>>>>> IIRC, kernel/vmcore_info.c is never built as a module, as it also
>>>>>>>> accesses non-exported symbols.
>>>>>>>
>>>>>>> Hello David,
>>>>>>>
>>>>>>> I am looking again into this, and there are some things which in my
>>>>>>> opinion would be difficult to achieve.
>>>>>>> For example I looked into my patch #11 , which adds the `runqueues` into
>>>>>>> kmemdump.
>>>>>>>
>>>>>>> The runqueues is a variable of `struct rq` which is defined in
>>>>>>> kernel/sched/sched.h , which is not supposed to be included outside of
>>>>>>> sched.
>>>>>>> Now moving all the struct definition outside of sched.h into another
>>>>>>> public header would be rather painful and I don't think it's a really
>>>>>>> good option (The struct would be needed to compute the sizeof inside
>>>>>>> vmcoreinfo). Secondly, it would also imply moving all the nested struct
>>>>>>> definitions outside as well. I doubt this is something that we want for
>>>>>>> the sched subsys. How the subsys is designed, out of my understanding,
>>>>>>> is to keep these internal structs opaque outside of it.
>>>>>>
>>>>>> All the kmemdump module needs is a start and a length, correct? So the
>>>>>> only tricky part is getting the length.
>>>>>
>>>>> I also have in mind the kernel user case. How would a kernel programmer
>>>>> want to add some kernel structs/info/buffers into kmemdump such that the
>>>>> dump would contain their data ? Having "KMEMDUMP_VAR(...)" looks simple
>>>>> enough.
>>>>
>>>> The other way around, why should anybody have a saying in adding their
>>>> data to kmemdump? Why do we have that all over the kernel?
>>>>
>>>> Is your mechanism really so special?
>>>>
>>>> A single composer should take care of that, and it's really just start +
>>>> len of physical memory areas.
>>>>
>>>>> Otherwise maybe the programmer has to write helpers to compute lengths
>>>>> etc, and stitch them into kmemdump core.
>>>>> I am not saying it's impossible, but just tiresome perhaps.
>>>>
>>>> In your patch set, how many of these instances did you encounter where
>>>> that was a problem?
>>>>
>>>>>>
>>>>>> One could just add a const variable that holds this information, or even
>>>>>> better, a simple helper function to calculate that.
>>>>>>
>>>>>> Maybe someone else reading along has a better idea.
>>>>>
>>>>> This could work, but it requires again adding some code into the
>>>>> specific subsystem. E.g. struct_rq_get_size()
>>>>> I am open to ideas , and thank you very much for your thoughts.
>>>>>
>>>>>>
>>>>>> Interestingly, runqueues is a percpu variable, which makes me wonder if
>>>>>> what you had would work as intended (maybe it does, not sure).
>>>>>
>>>>> I would not really need to dump the runqueues. But the crash tool which
>>>>> I am using for testing, requires it. Without the runqueues it will not
>>>>> progress further to load the kernel dump.
>>>>> So I am not really sure what it does with the runqueues, but it works.
>>>>> Perhaps using crash/gdb more, to actually do something with this data,
>>>>> would give more insight about its utility.
>>>>> For me, it is a prerequisite to run crash, and then to be able to
>>>>> extract the log buffer from the dump.
>>>>
>>>> I have the faint recollection that percpu vars might not be stored in a
>>>> single contiguous physical memory area, but maybe my memory is just
>>>> wrong, that's why I was raising it.
>>>>
>>>>>
>>>>>>
>>>>>>>
>>>>>>> From my perspective it's much simpler and cleaner to just add the
>>>>>>> kmemdump annotation macro inside the sched/core.c as it's done in my
>>>>>>> patch. This macro translates to a noop if kmemdump is not selected.
>>>>>>
>>>>>> I really don't like how we are spreading kmemdump all over the kernel,
>>>>>> and adding complexity with __section when really, all we need is a place
>>>>>> to obtain a start and a length.
>>>>>>
>>>>>
>>>>> I understand. The section idea was suggested by Thomas. Initially I was
>>>>> skeptic, but I like how it turned out.
>>>>
>>>> Yeah, I don't like it. Taste differs ;)
>>>>
>>>> I am in particular unhappy about custom memblock wrappers.
>>>>
>>>> [...]
>>>>
>>>>>>>
>>>>>>> To have this working outside of printk, it would be required to walk
>>>>>>> through all the printk structs/allocations and select the required info.
>>>>>>> Is this something that we want to do outside of printk ?
>>>>>>
>>>>>> I don't follow, please elaborate.
>>>>>>
>>>>>> How is e.g., log_buf_len_get() + log_buf_addr_get() not sufficient,
>>>>>> given that you run your initialization after setup_log_buf() ?
>>>>>>
>>>>>>
>>>>>
>>>>> My initial thought was the same. However I got some feedback from Petr
>>>>> Mladek here :
>>>>>
>>>>> https://lore.kernel.org/lkml/aBm5QH2p6p9Wxe_M@localhost.localdomain/
>>>>>
>>>>> Where he explained how to register the structs correctly.
>>>>> It can be that setup_log_buf is called again at a later time perhaps.
>>>>>
>>>>
>>>> setup_log_buf() is a __init function, so there is only a certain time
>>>> frame where it can be called.
>>>>
>>>> In particular, once the buddy is up, memblock allocations are impossible
>>>> and it would be deeply flawed to call this function again.
>>>>
>>>> Let's not over-engineer this.
>>>>
>>>> Peter is on CC, so hopefully he can share his thoughts.
>>>>
>>>
>>> Hello David,
>>>
>>> I tested out this snippet (on top of my series, so you can see what I
>>> changed):
>>>
>>>
>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>> index 18ba6c1e174f..7ac4248a00e5 100644
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -67,7 +67,6 @@
>>> #include <linux/wait_api.h>
>>> #include <linux/workqueue_api.h>
>>> #include <linux/livepatch_sched.h>
>>> -#include <linux/kmemdump.h>
>>>
>>> #ifdef CONFIG_PREEMPT_DYNAMIC
>>> # ifdef CONFIG_GENERIC_IRQ_ENTRY
>>> @@ -120,7 +119,12 @@
>>> EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp);
>>> EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
>>>
>>> DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
>>> -KMEMDUMP_VAR_CORE(runqueues, sizeof(runqueues));
>>> +
>>> +size_t runqueues_get_size(void);
>>> +size_t runqueues_get_size(void)
>>> +{
>>> + return sizeof(runqueues);
>>> +}
>>>
>>> #ifdef CONFIG_SCHED_PROXY_EXEC
>>> DEFINE_STATIC_KEY_TRUE(__sched_proxy_exec);
>>> diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
>>> index d808c5e67f35..c6dd2d6e96dd 100644
>>> --- a/kernel/vmcore_info.c
>>> +++ b/kernel/vmcore_info.c
>>> @@ -24,6 +24,12 @@
>>> #include "kallsyms_internal.h"
>>> #include "kexec_internal.h"
>>>
>>> +typedef void* kmemdump_opaque_t;
>>> +
>>> +size_t runqueues_get_size(void);
>>> +
>>> +extern kmemdump_opaque_t runqueues;
>>
>> I would have tried that through:
>>
>> struct rq;
>> extern struct rq runqueues;
>>
>> But the whole PER_CPU_SHARED_ALIGNED makes this all weird, and likely
>> not the way we would want to handle that.
>>
>>> /* vmcoreinfo stuff */
>>> unsigned char *vmcoreinfo_data;
>>> size_t vmcoreinfo_size;
>>> @@ -230,6 +236,9 @@ static int __init crash_save_vmcoreinfo_init(void)
>>>
>>> kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_VMCOREINFO,
>>> (void *)vmcoreinfo_data, vmcoreinfo_size);
>>> + kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_runqueues,
>>> + (void *)&runqueues, runqueues_get_size());
>>> +
>>> return 0;
>>> }
>>>
>>> With this, no more .section, no kmemdump code into sched, however, there
>>> are few things :
>>
>> I would really just do here something like the following:
>>
>> /**
>> * sched_get_runqueues_area - obtain the runqueues area for dumping
>> * @start: ...
>> * @size: ...
>> *
>> * The obtained area is only to be used for dumping purposes.
>> */
>> void sched_get_runqueues_area(void *start, size_t size)
>> {
>> start = &runqueues;
>> size = sizeof(runqueues);
>> }
>>
>> might be cleaner.
>>
>
> How about this in the header:
>
> #define DECLARE_DUMP_AREA_FUNC(subsys, symbol) \
> void subsys ## _get_ ## symbol ##_area(void **start, size_t *size);
> #define DEFINE_DUMP_AREA_FUNC(subsys, symbol) \
> void subsys ## _get_ ## symbol ##_area(void **start, size_t *size)\
> {\
> *start = &symbol;\
> *size = sizeof(symbol);\
> }
>
>
> then, in sched just
>
> DECLARE_DUMP_AREA_FUNC(sched, runqueues);
> DEFINE_DUMP_AREA_FUNC(sched, runqueues);
>
> or a single macro that wraps both.
>
> would make it shorter and neater.
>
> What do you think ?
Looks a bit over-engineered, and will require us to import a header
(likely kmemdump.h) in these files, which I don't really enjoy.
I would start simple, without any such macro-magic. It's a very simple
function after all, and likely you won't end up having many of these?
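A rough user-space sketch of what "simple" could look like: a couple of hand-written accessors and one place that collects start + length from them. All names here (the demo_ prefix, struct demo_region, demo_collect_regions) are hypothetical and only illustrate the shape; none of this is the kmemdump API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subsystem-private data with hand-written accessors. */
static unsigned long demo_runqueues[8];
static char demo_log_buf[64];

void demo_sched_get_runqueues_area(void **start, size_t *size)
{
	*start = demo_runqueues;
	*size = sizeof(demo_runqueues);
}

void demo_printk_get_log_buf_area(void **start, size_t *size)
{
	*start = demo_log_buf;
	*size = sizeof(demo_log_buf);
}

/* One dumpable area: just a start address and a length. */
struct demo_region {
	void *start;
	size_t size;
};

typedef void (*demo_area_fn)(void **start, size_t *size);

/* The single place that knows which areas get collected. */
static const demo_area_fn demo_area_fns[] = {
	demo_sched_get_runqueues_area,
	demo_printk_get_log_buf_area,
};

#define DEMO_NR_AREAS (sizeof(demo_area_fns) / sizeof(demo_area_fns[0]))

/* Fill @regions with up to @max entries; returns how many were written. */
size_t demo_collect_regions(struct demo_region *regions, size_t max)
{
	size_t i, n = 0;

	for (i = 0; i < DEMO_NR_AREAS && n < max; i++, n++)
		demo_area_fns[i](&regions[n].start, &regions[n].size);
	return n;
}
```

With only a handful of such accessors, the table stays short and the subsystems need no dump-specific header.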
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 61+ messages in thread
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-08-27 20:06 ` David Hildenbrand
@ 2025-09-01 8:57 ` Eugen Hristev
2025-09-01 10:01 ` David Hildenbrand
0 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-09-01 8:57 UTC (permalink / raw)
To: David Hildenbrand, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
On 8/27/25 23:06, David Hildenbrand wrote:
> On 27.08.25 16:08, Eugen Hristev wrote:
>>
>>
>> On 8/27/25 15:18, David Hildenbrand wrote:
>>> On 27.08.25 13:59, Eugen Hristev wrote:
>>>>
>>>>
>>>> On 8/25/25 16:58, David Hildenbrand wrote:
>>>>> On 25.08.25 15:36, Eugen Hristev wrote:
>>>>>>
>>>>>>
>>>>>> On 8/25/25 16:20, David Hildenbrand wrote:
>>>>>>>
>>>>>>>>>
>>>>>>>>> IIRC, kernel/vmcore_info.c is never built as a module, as it also
>>>>>>>>> accesses non-exported symbols.
>>>>>>>>
>>>>>>>> Hello David,
>>>>>>>>
>>>>>>>> I am looking again into this, and there are some things which in my
>>>>>>>> opinion would be difficult to achieve.
>>>>>>>> For example I looked into my patch #11 , which adds the `runqueues` into
>>>>>>>> kmemdump.
>>>>>>>>
>>>>>>>> The runqueues is a variable of `struct rq` which is defined in
>>>>>>>> kernel/sched/sched.h , which is not supposed to be included outside of
>>>>>>>> sched.
>>>>>>>> Now moving all the struct definition outside of sched.h into another
>>>>>>>> public header would be rather painful and I don't think it's a really
>>>>>>>> good option (The struct would be needed to compute the sizeof inside
>>>>>>>> vmcoreinfo). Secondly, it would also imply moving all the nested struct
>>>>>>>> definitions outside as well. I doubt this is something that we want for
>>>>>>>> the sched subsys. How the subsys is designed, out of my understanding,
>>>>>>>> is to keep these internal structs opaque outside of it.
>>>>>>>
>>>>>>> All the kmemdump module needs is a start and a length, correct? So the
>>>>>>> only tricky part is getting the length.
>>>>>>
>>>>>> I also have in mind the kernel user case. How would a kernel programmer
>>>>>> want to add some kernel structs/info/buffers into kmemdump such that the
>>>>>> dump would contain their data ? Having "KMEMDUMP_VAR(...)" looks simple
>>>>>> enough.
>>>>>
>>>>> The other way around, why should anybody have a saying in adding their
>>>>> data to kmemdump? Why do we have that all over the kernel?
>>>>>
>>>>> Is your mechanism really so special?
>>>>>
>>>>> A single composer should take care of that, and it's really just start +
>>>>> len of physical memory areas.
>>>>>
>>>>>> Otherwise maybe the programmer has to write helpers to compute lengths
>>>>>> etc, and stitch them into kmemdump core.
>>>>>> I am not saying it's impossible, but just tiresome perhaps.
>>>>>
>>>>> In your patch set, how many of these instances did you encounter where
>>>>> that was a problem?
>>>>>
>>>>>>>
>>>>>>> One could just add a const variable that holds this information, or even
>>>>>>> better, a simple helper function to calculate that.
>>>>>>>
>>>>>>> Maybe someone else reading along has a better idea.
>>>>>>
>>>>>> This could work, but it requires again adding some code into the
>>>>>> specific subsystem. E.g. struct_rq_get_size()
>>>>>> I am open to ideas , and thank you very much for your thoughts.
>>>>>>
>>>>>>>
>>>>>>> Interestingly, runqueues is a percpu variable, which makes me wonder if
>>>>>>> what you had would work as intended (maybe it does, not sure).
>>>>>>
>>>>>> I would not really need to dump the runqueues. But the crash tool which
>>>>>> I am using for testing, requires it. Without the runqueues it will not
>>>>>> progress further to load the kernel dump.
>>>>>> So I am not really sure what it does with the runqueues, but it works.
>>>>>> Perhaps using crash/gdb more, to actually do something with this data,
>>>>>> would give more insight about its utility.
>>>>>> For me, it is a prerequisite to run crash, and then to be able to
>>>>>> extract the log buffer from the dump.
>>>>>
>>>>> I have the faint recollection that percpu vars might not be stored in a
>>>>> single contiguous physical memory area, but maybe my memory is just
>>>>> wrong, that's why I was raising it.
>>>>>
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> From my perspective it's much simpler and cleaner to just add the
>>>>>>>> kmemdump annotation macro inside the sched/core.c as it's done in my
>>>>>>>> patch. This macro translates to a noop if kmemdump is not selected.
>>>>>>>
>>>>>>> I really don't like how we are spreading kmemdump all over the kernel,
>>>>>>> and adding complexity with __section when really, all we need is a place
>>>>>>> to obtain a start and a length.
>>>>>>>
>>>>>>
>>>>>> I understand. The section idea was suggested by Thomas. Initially I was
>>>>>> skeptic, but I like how it turned out.
>>>>>
>>>>> Yeah, I don't like it. Taste differs ;)
>>>>>
>>>>> I am in particular unhappy about custom memblock wrappers.
>>>>>
>>>>> [...]
>>>>>
>>>>>>>>
>>>>>>>> To have this working outside of printk, it would be required to walk
>>>>>>>> through all the printk structs/allocations and select the required info.
>>>>>>>> Is this something that we want to do outside of printk ?
>>>>>>>
>>>>>>> I don't follow, please elaborate.
>>>>>>>
>>>>>>> How is e.g., log_buf_len_get() + log_buf_addr_get() not sufficient,
>>>>>>> given that you run your initialization after setup_log_buf() ?
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> My initial thought was the same. However I got some feedback from Petr
>>>>>> Mladek here :
>>>>>>
>>>>>> https://lore.kernel.org/lkml/aBm5QH2p6p9Wxe_M@localhost.localdomain/
>>>>>>
>>>>>> Where he explained how to register the structs correctly.
>>>>>> It can be that setup_log_buf is called again at a later time perhaps.
>>>>>>
>>>>>
>>>>> setup_log_buf() is a __init function, so there is only a certain time
>>>>> frame where it can be called.
>>>>>
>>>>> In particular, once the buddy is up, memblock allocations are impossible
>>>>> and it would be deeply flawed to call this function again.
>>>>>
>>>>> Let's not over-engineer this.
>>>>>
>>>>> Peter is on CC, so hopefully he can share his thoughts.
>>>>>
>>>>
>>>> Hello David,
>>>>
>>>> I tested out this snippet (on top of my series, so you can see what I
>>>> changed):
>>>>
>>>>
>>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>>> index 18ba6c1e174f..7ac4248a00e5 100644
>>>> --- a/kernel/sched/core.c
>>>> +++ b/kernel/sched/core.c
>>>> @@ -67,7 +67,6 @@
>>>> #include <linux/wait_api.h>
>>>> #include <linux/workqueue_api.h>
>>>> #include <linux/livepatch_sched.h>
>>>> -#include <linux/kmemdump.h>
>>>>
>>>> #ifdef CONFIG_PREEMPT_DYNAMIC
>>>> # ifdef CONFIG_GENERIC_IRQ_ENTRY
>>>> @@ -120,7 +119,12 @@
>>>> EXPORT_TRACEPOINT_SYMBOL_GPL(sched_update_nr_running_tp);
>>>> EXPORT_TRACEPOINT_SYMBOL_GPL(sched_compute_energy_tp);
>>>>
>>>> DEFINE_PER_CPU_SHARED_ALIGNED(struct rq, runqueues);
>>>> -KMEMDUMP_VAR_CORE(runqueues, sizeof(runqueues));
>>>> +
>>>> +size_t runqueues_get_size(void);
>>>> +size_t runqueues_get_size(void)
>>>> +{
>>>> + return sizeof(runqueues);
>>>> +}
>>>>
>>>> #ifdef CONFIG_SCHED_PROXY_EXEC
>>>> DEFINE_STATIC_KEY_TRUE(__sched_proxy_exec);
>>>> diff --git a/kernel/vmcore_info.c b/kernel/vmcore_info.c
>>>> index d808c5e67f35..c6dd2d6e96dd 100644
>>>> --- a/kernel/vmcore_info.c
>>>> +++ b/kernel/vmcore_info.c
>>>> @@ -24,6 +24,12 @@
>>>> #include "kallsyms_internal.h"
>>>> #include "kexec_internal.h"
>>>>
>>>> +typedef void* kmemdump_opaque_t;
>>>> +
>>>> +size_t runqueues_get_size(void);
>>>> +
>>>> +extern kmemdump_opaque_t runqueues;
>>>
>>> I would have tried that through:
>>>
>>> struct rq;
>>> extern struct rq runqueues;
>>>
>>> But the whole PER_CPU_SHARED_ALIGNED makes this all weird, and likely
>>> not the way we would want to handle that.
>>>
>>>> /* vmcoreinfo stuff */
>>>> unsigned char *vmcoreinfo_data;
>>>> size_t vmcoreinfo_size;
>>>> @@ -230,6 +236,9 @@ static int __init crash_save_vmcoreinfo_init(void)
>>>>
>>>> kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_VMCOREINFO,
>>>> (void *)vmcoreinfo_data, vmcoreinfo_size);
>>>> + kmemdump_register_id(KMEMDUMP_ID_COREIMAGE_runqueues,
>>>> + (void *)&runqueues, runqueues_get_size());
>>>> +
>>>> return 0;
>>>> }
>>>>
>>>> With this, there is no more .section and no kmemdump code in sched;
>>>> however, there are a few things:
>>>
>>> I would really just do here something like the following:
>>>
>>> /**
>>> * sched_get_runqueues_area - obtain the runqueues area for dumping
>>> * @start: ...
>>> * @size: ...
>>> *
>>> * The obtained area is only to be used for dumping purposes.
>>> */
>>> void sched_get_runqueues_area(void **start, size_t *size)
>>> {
>>> *start = &runqueues;
>>> *size = sizeof(runqueues);
>>> }
>>>
>>> might be cleaner.
>>>
>>
>> How about this in the header:
>>
>> #define DECLARE_DUMP_AREA_FUNC(subsys, symbol) \
>> void subsys ## _get_ ## symbol ## _area(void **start, size_t *size);
>>
>> #define DEFINE_DUMP_AREA_FUNC(subsys, symbol) \
>> void subsys ## _get_ ## symbol ## _area(void **start, size_t *size) \
>> { \
>> *start = &symbol; \
>> *size = sizeof(symbol); \
>> }
>>
>>
>> then, in sched just
>>
>> DECLARE_DUMP_AREA_FUNC(sched, runqueues);
>>
>> DEFINE_DUMP_AREA_FUNC(sched, runqueues);
>>
>> or a single macro that wraps both.
>>
>> would make it shorter and neater.
>>
>> What do you think ?
>
> Looks a bit over-engineered, and will require us to import a header
> (likely kmemdump.h) in these files, which I don't really enjoy.
>
> I would start simple, without any such macro-magic. It's a very simple
> function after all, and likely you won't end up having many of these?
>
Thanks David, I will do it as you suggested and see what comes out of it.

I have one side question that you can probably answer much better than I can:
each region has a start and a size, and this start is a virtual address.
The firmware/coprocessor that reads the memory and dumps it requires
physical addresses. What do you suggest to use to retrieve that address?
virt_to_phys() might be problematic; __pa() or __pa_symbol()? Or better,
lm_alias()?

kmemdump is agnostic of the memory region that `start` comes from, and it
should be portable and platform independent.

Thanks again,
Eugen
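
For reference, the plain helper discussed above can be sketched in userspace
C as follows. This is only an illustration under stated assumptions: the
`struct rq` here is a stand-in with made-up fields, `runqueues` is an
ordinary static array rather than a per-CPU variable, and `NR_FAKE_CPUS` is
hypothetical. The sketch only shows that the helper has to take pointers to
its outputs, otherwise the assignments are lost when it returns.

```c
#include <stddef.h>

/* Stand-in for the scheduler's struct rq; the fields are made up. */
struct rq {
	unsigned int nr_running;
	unsigned long clock;
};

#define NR_FAKE_CPUS 4	/* hypothetical, for illustration only */

/* Private to this file, the way runqueues is private to sched/core.c. */
static struct rq runqueues[NR_FAKE_CPUS];

/*
 * sched_get_runqueues_area - report where runqueues lives and how big it is
 * @start: set to the address of the area
 * @size:  set to the size of the area in bytes
 *
 * Both outputs are passed by pointer so that the caller sees the values.
 */
void sched_get_runqueues_area(void **start, size_t *size)
{
	*start = runqueues;
	*size = sizeof(runqueues);
}
```

A caller would declare `void *start; size_t size;`, call
`sched_get_runqueues_area(&start, &size)`, and hand the pair on to whatever
registers the region for dumping.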
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-09-01 8:57 ` Eugen Hristev
@ 2025-09-01 10:01 ` David Hildenbrand
2025-09-01 12:02 ` Eugen Hristev
0 siblings, 1 reply; 61+ messages in thread
From: David Hildenbrand @ 2025-09-01 10:01 UTC (permalink / raw)
To: Eugen Hristev, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
>>> What do you think ?
>>
>> Looks a bit over-engineered, and will require us to import a header
>> (likely kmemdump.h) in these files, which I don't really enjoy.
>>
>> I would start simple, without any such macro-magic. It's a very simple
>> function after all, and likely you won't end up having many of these?
>>
>
> Thanks David, I will do it as you suggested and see what comes out of it.
>
> I have one side question you might know much better to answer:
> As we have a start and a size for each region, this start is a virtual
> address. The firmware/coprocessor that reads the memory and dumps it,
> requires physical addresses.
Right. I was asking myself the same question while reviewing: should we
directly export physical ranges here instead of virtual ones? I guess
virtual ones are ok.
> What do you suggest to use to retrieve that
> address ? virt_to_phys might be problematic, __pa or __pa_symbol? or
> better lm_alias ?
All areas should either come from memblock or be global variables, right?
IIRC, virt_to_phys() should work for these. Did you run into any
problems with them or why do you think virt_to_phys could be problematic?
--
Cheers
David / dhildenb
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-09-01 10:01 ` David Hildenbrand
@ 2025-09-01 12:02 ` Eugen Hristev
2025-09-01 12:17 ` David Hildenbrand
0 siblings, 1 reply; 61+ messages in thread
From: Eugen Hristev @ 2025-09-01 12:02 UTC (permalink / raw)
To: David Hildenbrand, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
On 9/1/25 13:01, David Hildenbrand wrote:
>>>> What do you think ?
>>>
>>> Looks a bit over-engineered, and will require us to import a header
>>> (likely kmemdump.h) in these files, which I don't really enjoy.
>>>
>>> I would start simple, without any such macro-magic. It's a very simple
>>> function after all, and likely you won't end up having many of these?
>>>
>>
>> Thanks David, I will do it as you suggested and see what comes out of it.
>>
>> I have one side question you might know much better to answer:
>> As we have a start and a size for each region, this start is a virtual
>> address. The firmware/coprocessor that reads the memory and dumps it,
>> requires physical addresses.
>
> Right. I was asking myself the same question while reviewing: should we
> directly export physical ranges here instead of virtual ones. I guess
> virtual ones is ok.
In patch 22/29, some areas are registered using
memblock_phys_alloc_try_nid(), which returns a physical address.
In this case, phys_to_virt() didn't work for me; it was returning a
wrong address. I used __va() and this worked, so there is a difference
between them.
>
>> What do you suggest to use to retrieve that
>> address ? virt_to_phys might be problematic, __pa or __pa_symbol? or
>> better lm_alias ?
>
> All areas should either come from memblock or be global variables, right?
I would like to be able to register from anywhere. For example, someone
debugging their driver could just register a kmalloc'ed struct.
Another use case is to register DMA-coherent CMA areas.
>
> IIRC, virt_to_phys() should work for these. Did you run into any
> problems with them or why do you think virt_to_phys could be problematic?
>
I am wondering whether it would work in all cases, considering that
its source code comments say it shall not be used because it does not
work for arbitrary addresses.
Someone also reported its unavailability like this:
drivers/debug/kmemdump_coreimage.c:67:24: error: call to undeclared
function 'virt_to_phys'; ISO C99 and later do not support implicit
function declarations [-Wimplicit-function-declaration]
I have yet to figure out which config fails.
* Re: [RFC][PATCH v2 22/29] mm/numa: Register information into Kmemdump
2025-09-01 12:02 ` Eugen Hristev
@ 2025-09-01 12:17 ` David Hildenbrand
0 siblings, 0 replies; 61+ messages in thread
From: David Hildenbrand @ 2025-09-01 12:17 UTC (permalink / raw)
To: Eugen Hristev, Michal Hocko
Cc: linux-kernel, linux-arm-msm, linux-arch, linux-mm, tglx,
andersson, pmladek, linux-arm-kernel, linux-hardening, corbet,
mojha, rostedt, jonechou, tudor.ambarus, Christoph Hellwig,
Sergey Senozhatsky
On 01.09.25 14:02, Eugen Hristev wrote:
>
>
> On 9/1/25 13:01, David Hildenbrand wrote:
>>>>> What do you think ?
>>>>
>>>> Looks a bit over-engineered, and will require us to import a header
>>>> (likely kmemdump.h) in these files, which I don't really enjoy.
>>>>
>>>> I would start simple, without any such macro-magic. It's a very simple
>>>> function after all, and likely you won't end up having many of these?
>>>>
>>>
>>> Thanks David, I will do it as you suggested and see what comes out of it.
>>>
>>> I have one side question you might know much better to answer:
>>> As we have a start and a size for each region, this start is a virtual
>>> address. The firmware/coprocessor that reads the memory and dumps it,
>>> requires physical addresses.
>>
>> Right. I was asking myself the same question while reviewing: should we
>> directly export physical ranges here instead of virtual ones. I guess
>> virtual ones is ok.
>
> In patch 22/29, some areas are registered using
> memblock_phys_alloc_try_nid() which allocates physical.
> In this case , phys_to_virt() didn't work for me, it was returning a
> wrong address. I used __va() and this worked. So there is a difference
> between them.
memblock_alloc_internal() calls memblock_alloc_range_nid() to then
perform a phys_to_virt().
memblock_phys_alloc_try_nid() calls memblock_alloc_range_nid() without
the phys_to_virt().
So it's rather surprising that a phys_to_virt() would not work in that case.
Maybe for these cases where you export the area through a new helper,
you can just export the physical addr + length instead.
Then, it's also clear that this area is actually physically contiguous.
>
>>
>>> What do you suggest to use to retrieve that
>>> address ? virt_to_phys might be problematic, __pa or __pa_symbol? or
>>> better lm_alias ?
>>
>> All areas should either come from memblock or be global variables, right?
>
> I would like to be able to register from anywhere. For example someone
> debugging their driver, to just register kmalloc'ed struct.
> Other use case is to register dma coherent CMA areas.
Then probably better to export physical addresses (that you need either
way) directly from the helpers you have to add.
>
>>
>> IIRC, virt_to_phys() should work for these. Did you run into any
>> problems with them or why do you think virt_to_phys could be problematic?
>>
>
> I am pondering about whether it would work in all cases, considering
> it's source code comments that it shall not be used because it does not
> work for any address.
Yeah, for example, it does not work for kernel stacks, IIRC.
--
Cheers
David / dhildenb
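
The point about virt_to_phys() can be illustrated with a toy userspace
model. Everything below is assumed for illustration: the constants
`FAKE_PAGE_OFFSET`, `FAKE_PHYS_BASE` and `FAKE_LINEAR_SIZE` and the helper
names are hypothetical and do not match any real architecture. The model
only shows that a linear-map translation is plain arithmetic that holds for
addresses inside the linear map and nowhere else, which is one reason a
helper that hands out a physical address and a length directly is attractive.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout of a toy linear map. */
#define FAKE_PAGE_OFFSET  UINT64_C(0xffff800000000000)  /* virtual base */
#define FAKE_PHYS_BASE    UINT64_C(0x0000000080000000)  /* physical base */
#define FAKE_LINEAR_SIZE  UINT64_C(0x0000000040000000)  /* 1 GiB mapped */

/* True if vaddr falls inside the toy linear map. */
bool fake_in_linear_map(uint64_t vaddr)
{
	return vaddr >= FAKE_PAGE_OFFSET &&
	       vaddr - FAKE_PAGE_OFFSET < FAKE_LINEAR_SIZE;
}

/*
 * Translate a toy virtual address to a toy physical address.
 * Returns false (and leaves *paddr untouched) for addresses outside
 * the linear map, where a blind subtraction would give a wrong result.
 */
bool fake_virt_to_phys(uint64_t vaddr, uint64_t *paddr)
{
	if (!fake_in_linear_map(vaddr))
		return false;
	*paddr = vaddr - FAKE_PAGE_OFFSET + FAKE_PHYS_BASE;
	return true;
}
```

In this model an address that lies outside the mapped window is rejected
instead of being silently translated to a bogus physical address.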
end of thread, other threads:[~2025-09-01 12:17 UTC | newest]
Thread overview: 61+ messages
2025-07-24 13:54 [RFC][PATCH v2 00/29] introduce kmemdump Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 01/29] kmemdump: " Eugen Hristev
2025-07-26 3:33 ` Randy Dunlap
2025-07-26 3:36 ` Randy Dunlap
2025-07-24 13:54 ` [RFC][PATCH v2 02/29] Documentation: add kmemdump Eugen Hristev
2025-07-24 14:13 ` Jonathan Corbet
2025-07-24 13:54 ` [RFC][PATCH v2 03/29] kmemdump: add coreimage ELF layer Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 04/29] Documentation: kmemdump: add section for coreimage ELF Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 05/29] kmemdump: introduce qcom-minidump backend driver Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 06/29] soc: qcom: smem: add minidump device Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 07/29] init/version: Annotate static information into Kmemdump Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 08/29] cpu: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 09/29] genirq/irqdesc: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 10/29] panic: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 11/29] sched/core: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 12/29] timers: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 13/29] kernel/fork: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 14/29] mm/page_alloc: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 15/29] mm/init-mm: " Eugen Hristev
2025-07-24 13:54 ` [RFC][PATCH v2 16/29] mm/show_mem: " Eugen Hristev
2025-07-30 13:55 ` David Hildenbrand
2025-07-30 14:04 ` Eugen Hristev
2025-07-30 14:10 ` David Hildenbrand
2025-07-24 13:55 ` [RFC][PATCH v2 17/29] mm/swapfile: " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 18/29] mm/percpu: " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 19/29] mm/mm_init: " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 20/29] printk: Register " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 21/29] kernel/configs: Register dynamic " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 22/29] mm/numa: Register " Eugen Hristev
2025-07-30 13:52 ` David Hildenbrand
2025-07-30 13:57 ` Eugen Hristev
2025-07-30 14:04 ` David Hildenbrand
2025-08-04 10:54 ` Michal Hocko
2025-08-04 11:06 ` Eugen Hristev
2025-08-04 12:18 ` David Hildenbrand
2025-08-04 12:29 ` Eugen Hristev
2025-08-04 12:49 ` David Hildenbrand
2025-08-04 13:03 ` Eugen Hristev
2025-08-04 13:26 ` David Hildenbrand
2025-08-25 12:55 ` Eugen Hristev
2025-08-25 13:20 ` David Hildenbrand
2025-08-25 13:36 ` Eugen Hristev
2025-08-25 13:58 ` David Hildenbrand
2025-08-27 11:59 ` Eugen Hristev
2025-08-27 12:18 ` David Hildenbrand
2025-08-27 14:08 ` Eugen Hristev
2025-08-27 20:06 ` David Hildenbrand
2025-09-01 8:57 ` Eugen Hristev
2025-09-01 10:01 ` David Hildenbrand
2025-09-01 12:02 ` Eugen Hristev
2025-09-01 12:17 ` David Hildenbrand
2025-08-04 12:16 ` David Hildenbrand
2025-07-24 13:55 ` [RFC][PATCH v2 23/29] mm/sparse: " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 24/29] kernel/vmcore_info: Register dynamic " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 25/29] kmemdump: Add additional symbols to the coreimage Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 26/29] init/version: Annotate init uts name separately into Kmemdump Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 27/29] kallsyms: Annotate static information " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 28/29] mm/init-mm: Annotate additional " Eugen Hristev
2025-07-24 13:55 ` [RFC][PATCH v2 29/29] kmemdump: Add Kinfo backend driver Eugen Hristev
2025-08-26 17:14 ` [RFC][PATCH v2 00/29] introduce kmemdump Mukesh Ojha
2025-08-27 6:42 ` Eugen Hristev