public inbox for linux-kernel@vger.kernel.org
* [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys
@ 2025-02-07  8:08 Coiby Xu
  2025-02-07  8:08 ` [PATCH v8 1/7] kexec_file: allow to place kexec_buf randomly Coiby Xu
                   ` (10 more replies)
  0 siblings, 11 replies; 24+ messages in thread
From: Coiby Xu @ 2025-02-07  8:08 UTC (permalink / raw)
  To: kexec
  Cc: Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov

LUKS is the standard for Linux disk encryption, widely adopted by users,
and in some cases, such as Confidential VMs, it is a requirement. With 
kdump enabled, when the first kernel crashes, the system can boot into
the kdump/crash kernel to dump the memory image (i.e., /proc/vmcore) 
to a specified target. However, there are two challenges when dumping
vmcore to a LUKS-encrypted device:

 - The kdump kernel may not be able to decrypt the LUKS partition. On some
   machines, a system administrator may not have a chance to enter the
   password to decrypt the device in the kdump initramfs after the 1st
   kernel crashes. For cloud confidential VMs, depending on the policy,
   the kdump kernel may not be able to unseal the keys with the TPM, and
   the console virtual keyboard is untrusted.

 - LUKS2 by default uses the memory-hard Argon2 key derivation function,
   which is quite memory-consuming compared to the limited memory reserved
   for kdump. Take Fedora as an example: by default, only 256M is reserved
   for systems having between 4G and 64G of memory. With LUKS enabled,
   ~1300M needs to be reserved for kdump. Note that the memory reserved
   for kdump can't be used by the 1st kernel, i.e. a user sees ~1300M of
   memory missing in the 1st kernel.

Besides, users (at least on Fedora) usually expect kdump to work out of
the box, i.e. no manual password input or custom crashkernel value is
needed. It also doesn't make sense to derive the keys again in the kdump
kernel, which would be redundant work.

This patch set addresses the above issues by making the LUKS volume keys
persistent for the kdump kernel with the help of cryptsetup's new APIs
(--link-vk-to-keyring/--volume-key-keyring). Here is the life cycle of
the kdump copies of LUKS volume keys:

 1. After the 1st kernel loads the initramfs during boot, systemd
    uses a user-input passphrase to decrypt the LUKS volume keys
    or the TPM-sealed key, and then saves the volume keys to the
    specified keyring (using the --link-vk-to-keyring API). The keys
    will expire within a specified time.

 2. A user space tool (a kdump initramfs loader like kdump-utils) creates
    key items inside /sys/kernel/config/crash_dm_crypt_keys to inform
    the 1st kernel which keys are needed.

 3. When the kdump initramfs is loaded by the kexec_file_load
    syscall, the 1st kernel iterates over the created key items and
    saves the keys to kdump reserved memory.

 4. When the 1st kernel crashes and the kdump initramfs is booted, the
    kdump initramfs asks the kdump kernel to create a user key using the
    key stored in kdump reserved memory by writing yes to
    /sys/kernel/crash_dm_crypt_keys/restore. Then the LUKS-encrypted
    device is unlocked with libcryptsetup's --volume-key-keyring API.

 5. After dumping vmcore to the LUKS-encrypted device is finished, the
    system reboots into the 1st kernel.

After libcryptsetup saves the LUKS volume keys to the specified keyring,
whoever takes the keys from there is responsible for the safety of these
copies. The keys are saved in the memory area exclusively reserved
for kdump, to which even the 1st kernel has no direct access.
Furthermore, two additional protections are added:
 - save the copy at a random position in kdump reserved memory, as
   suggested by Jan
 - clear the _PAGE_PRESENT flag of the page that stores the copy, as
   suggested by Pingfan

This patch set only supports x86. There will be patches to support other
architectures once this patch set gets merged.

v8
 - improve documentation [Randy]
 - rebase onto 6.14.0-rc1

v7
 - Baoquan
   - differentiate between failing to get dm crypt keys and no dm crypt keys
   - add code comments, change function name and etc. to improve code readability
 - add documentation for configfs API [Dave]
 - fix building error found by kernel test robot

v6
 - Baoquan
   - support AMD SEV
   - drop unnecessary keys_header_size
   - improve commit message of [PATCH 4/7]
 
 - Greg
   - switch to configfs
   - move ifdef from .c to .h files and rework kexec_random_start
   - use tab instead of space for appended code comment
 
 - Process key description in a more flexible way to address problems
   found by Ondrej
 - improve cover letter
 - fix a compilation error found by kernel test robot

v5
 - Baoquan
   - limit the feature of placing kexec_buf randomly to kdump (CONFIG_CRASH_DUMP)
   - add documentation for added sysfs API 
   - allow to re-send init command to support the case of user switching to
     a different LUKS-encrypted target
   - make CONFIG_CRASH_DM_CRYPT depends on CONFIG_DM_CRYPT
   - check if the number of keys exceed KEY_NUM_MAX
   - rename (struct keys_header).key_count as (struct keys_header).total_keys
     to improve code readability
   - improve commit message
   - fix the failure of calling crash_exclude_mem_range (there is a split
     of mem_range)
   - use ret instead of r as return code
 
 - Greg
   - add documentation for added sysfs API 
   - avoid spamming kernel logs 
   - fix a buffer overflow issue
   - keep the state enums synced up with the string values
   - use sysfs_emit rather than sprintf
   - explain KEY_NUM_MAX and KEY_SIZE_MAX
   - s/EXPORT_SYMBOL_GPL/EXPORT_SYMBOL/g
   - improve code readability
 
 - Rebase onto latest Linus tree


v4
- rebase onto latest Linus tree so Baoquan can apply the patches for
  code review
- fix kernel test robot warnings

v3
 - Support CPU/memory hot-plugging [Baoquan]
 - Don't save the keys temporarily to simplify the implementation [Baoquan]
 - Support multiple LUKS encrypted volumes
 - Read logon key instead of user key to improve security [Ondrej]
 - A kernel config option CRASH_DM_CRYPT for this feature (disabled by default)
 - Fix warnings found by kernel test robot
 - Rebase the code onto 6.9.0-rc5+

v2
 - work together with libcryptsetup's --link-vk-to-keyring/--volume-key-keyring APIs [Milan and Ondrej]
 - add the case where console virtual keyboard is untrusted for confidential VM
 - use dm_crypt_key instead of LUKS volume key [Milan and Eric]
 - fix some code format issues
 - don't move "struct kexec_segment" declaration
 - Rebase the code onto latest Linus tree (6.7.0)

v1
 - "Put the luks key handling related to crash_dump out into a separate
   file kernel/crash_dump_luks.c" [Baoquan]
 - Put the generic luks handling code before the x86 specific code to
   make it easier for other arches to follow suit [Baoquan]
 - Use phys_to_virt instead of "pfn -> page -> vaddr" [Dave Hansen]
 - Drop the RFC prefix [Dave Young]
 - Rebase the code onto latest Linus tree (6.4.0-rc4)

RFC v2
 - libcryptsetup interacts with the kernel via sysfs instead of "hacking"
   dm-crypt
   - to save a kdump copy of the LUKS volume key in 1st kernel
   - to add a logon key using the copy for libcryptsetup in kdump kernel [Milan]
   - to avoid the incorrect usage of LUKS master key in dm-crypt [Milan]
 - save the kdump copy of LUKS volume key randomly [Jan]
 - mark the kdump copy inaccessible [Pingfan]
 - Miscellaneous
   - explain when operations related to the LUKS volume key happen [Jan]
   - s/master key/volume key/g
   - use crash_ instead of kexec_ as function prefix
   - fix commit subject prefixes e.g. "x86, kdump" to x86/crash


Coiby Xu (7):
  kexec_file: allow to place kexec_buf randomly
  crash_dump: make dm crypt keys persist for the kdump kernel
  crash_dump: store dm crypt keys in kdump reserved memory
  crash_dump: reuse saved dm crypt keys for CPU/memory hot-plugging
  crash_dump: retrieve dm crypt keys in kdump kernel
  x86/crash: pass dm crypt keys to kdump kernel
  x86/crash: make the page that stores the dm crypt keys inaccessible

 Documentation/admin-guide/kdump/kdump.rst |  32 ++
 arch/x86/kernel/crash.c                   |  26 +-
 arch/x86/kernel/kexec-bzimage64.c         |  11 +
 arch/x86/kernel/machine_kexec_64.c        |  22 ++
 include/linux/crash_core.h                |   7 +-
 include/linux/crash_dump.h                |   2 +
 include/linux/kexec.h                     |  34 ++
 kernel/Kconfig.kexec                      |  10 +
 kernel/Makefile                           |   1 +
 kernel/crash_dump_dm_crypt.c              | 459 ++++++++++++++++++++++
 kernel/kexec_file.c                       |   3 +
 11 files changed, 604 insertions(+), 3 deletions(-)
 create mode 100644 kernel/crash_dump_dm_crypt.c


base-commit: bb066fe812d6fb3a9d01c073d9f1e2fd5a63403b
-- 
2.48.1



* [PATCH v8 1/7] kexec_file: allow to place kexec_buf randomly
  2025-02-07  8:08 [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Coiby Xu
@ 2025-02-07  8:08 ` Coiby Xu
  2025-02-07  8:08 ` [PATCH v8 2/7] crash_dump: make dm crypt keys persist for the kdump kernel Coiby Xu
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Coiby Xu @ 2025-02-07  8:08 UTC (permalink / raw)
  To: kexec
  Cc: Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Jan Pazdziora, Eric Biederman

Currently, kexec_buf is placed in order, which means that for the same
machine, the info in the kexec_buf is always located at the same
position each time the machine is booted. This poses a risk for
sensitive information like the LUKS volume key. Now struct kexec_buf has
a new field, random, which indicates that the buffer should be placed at
a random position.

Note that this feature is available only when CONFIG_CRASH_DUMP is
enabled, so it only takes effect for kdump and won't impact kexec reboot.
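
The offset arithmetic used for the random placement can be tried outside
the kernel. The following is a minimal user-space sketch, not the kernel
code itself: random_range_start() is a hypothetical name, and the 16-bit
value r is passed in by the caller instead of being read with
get_random_bytes(), so that the result is reproducible.

```c
#include <limits.h>

/*
 * User-space sketch of the arithmetic in kexec_random_range_start().
 * Hypothetical helper: `r` stands in for the 16-bit value that the
 * kernel reads with get_random_bytes().
 */
static unsigned long random_range_start(unsigned long start,
					unsigned long end,
					unsigned short r)
{
	/*
	 * Split [start, end] into USHRT_MAX equal steps and move r steps
	 * up from start. Dividing before multiplying keeps the offset
	 * within end - start.
	 */
	return start + (end - start) / USHRT_MAX * r;
}
```

With start = 0x1000 and end = 0x1000 + 16 * USHRT_MAX, r = 0 yields
start, r = 1 yields start + 16 and r = USHRT_MAX yields end. If the range
is shorter than USHRT_MAX bytes, the integer division gives a step of
zero and start is returned unchanged.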

Suggested-by: Jan Pazdziora <jpazdziora@redhat.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 include/linux/kexec.h | 30 ++++++++++++++++++++++++++++++
 kernel/kexec_file.c   |  3 +++
 2 files changed, 33 insertions(+)

diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index f0e9f8eda7a3..61269e97502a 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -25,6 +25,10 @@
 
 extern note_buf_t __percpu *crash_notes;
 
+#ifdef CONFIG_CRASH_DUMP
+#include <linux/prandom.h>
+#endif
+
 #ifdef CONFIG_KEXEC_CORE
 #include <linux/list.h>
 #include <linux/compat.h>
@@ -171,6 +175,7 @@ int kexec_image_post_load_cleanup_default(struct kimage *image);
  * @buf_min:	The buffer can't be placed below this address.
  * @buf_max:	The buffer can't be placed above this address.
  * @top_down:	Allocate from top of memory.
+ * @random:	Place the buffer at a random position.
  */
 struct kexec_buf {
 	struct kimage *image;
@@ -182,8 +187,33 @@ struct kexec_buf {
 	unsigned long buf_min;
 	unsigned long buf_max;
 	bool top_down;
+#ifdef CONFIG_CRASH_DUMP
+	bool random;
+#endif
 };
 
+
+#ifdef CONFIG_CRASH_DUMP
+static inline void kexec_random_range_start(unsigned long start,
+					    unsigned long end,
+					    struct kexec_buf *kbuf,
+					    unsigned long *temp_start)
+{
+	unsigned short i;
+
+	if (kbuf->random) {
+		get_random_bytes(&i, sizeof(unsigned short));
+		*temp_start = start + (end - start) / USHRT_MAX * i;
+	}
+}
+#else
+static inline void kexec_random_range_start(unsigned long start,
+					    unsigned long end,
+					    struct kexec_buf *kbuf,
+					    unsigned long *temp_start)
+{}
+#endif
+
 int kexec_load_purgatory(struct kimage *image, struct kexec_buf *kbuf);
 int kexec_purgatory_get_set_symbol(struct kimage *image, const char *name,
 				   void *buf, unsigned int size,
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 3eedb8c226ad..875fe108cc83 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -445,6 +445,7 @@ static int locate_mem_hole_top_down(unsigned long start, unsigned long end,
 
 	temp_end = min(end, kbuf->buf_max);
 	temp_start = temp_end - kbuf->memsz + 1;
+	kexec_random_range_start(temp_start, temp_end, kbuf, &temp_start);
 
 	do {
 		/* align down start */
@@ -483,6 +484,8 @@ static int locate_mem_hole_bottom_up(unsigned long start, unsigned long end,
 
 	temp_start = max(start, kbuf->buf_min);
 
+	kexec_random_range_start(temp_start, end, kbuf, &temp_start);
+
 	do {
 		temp_start = ALIGN(temp_start, kbuf->buf_align);
 		temp_end = temp_start + kbuf->memsz - 1;
-- 
2.48.1



* [PATCH v8 2/7] crash_dump: make dm crypt keys persist for the kdump kernel
  2025-02-07  8:08 [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Coiby Xu
  2025-02-07  8:08 ` [PATCH v8 1/7] kexec_file: allow to place kexec_buf randomly Coiby Xu
@ 2025-02-07  8:08 ` Coiby Xu
  2025-04-23 20:44   ` Arnaud Lefebvre
  2025-02-07  8:08 ` [PATCH v8 3/7] crash_dump: store dm crypt keys in kdump reserved memory Coiby Xu
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Coiby Xu @ 2025-02-07  8:08 UTC (permalink / raw)
  To: kexec
  Cc: Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Vivek Goyal, Jonathan Corbet, open list:DOCUMENTATION

A configfs interface, /sys/kernel/config/crash_dm_crypt_keys, is provided
for user space to make the dm crypt keys persist for the kdump kernel.
Taking the case of dumping to a LUKS-encrypted target as an example, here
is the life cycle of the kdump copies of LUKS volume keys:

 1. After the 1st kernel loads the initramfs during boot, systemd uses
    a user-input passphrase to decrypt the LUKS volume keys, or simply
    the TPM-sealed volume keys, and then saves the volume keys to the
    specified keyring (using the --link-vk-to-keyring API). The keys
    will expire within a specified time.

 2. A user space tool (a kdump initramfs loader like kdump-utils) creates
    key items inside /sys/kernel/config/crash_dm_crypt_keys to inform
    the 1st kernel which keys are needed.

 3. When the kdump initramfs is loaded by the kexec_file_load
    syscall, the 1st kernel iterates over the created key items and
    saves the keys to kdump reserved memory.

 4. When the 1st kernel crashes and the kdump initramfs is booted, the
    kdump initramfs asks the kdump kernel to create a user key using the
    key stored in kdump reserved memory by writing yes to
    /sys/kernel/crash_dm_crypt_keys/restore. Then the LUKS-encrypted
    device is unlocked with libcryptsetup's --volume-key-keyring API.

 5. After dumping vmcore to the LUKS-encrypted device is finished, the
    system reboots into the 1st kernel.

Eventually the keys have to stay in the kdump reserved memory for the
kdump kernel to unlock encrypted volumes. During this process, some
measures like letting the keys expire within specified time are
desirable to reduce security risk.

This patch assumes:
1) there are at most 128 LUKS devices to be unlocked, thus
   KEY_NUM_MAX=128.

2) a key description won't exceed 128 bytes, thus KEY_DESC_MAX_LEN=128.

And here is a demo on how to interact with
/sys/kernel/config/crash_dm_crypt_keys,

    # Add key #1
    mkdir /sys/kernel/config/crash_dm_crypt_keys/7d26b7b4-e342-4d2d-b660-7426b0996720
    # Add key #1's description
    echo cryptsetup:7d26b7b4-e342-4d2d-b660-7426b0996720 > /sys/kernel/config/crash_dm_crypt_keys/7d26b7b4-e342-4d2d-b660-7426b0996720/description

    # how many keys do we have now?
    cat /sys/kernel/config/crash_dm_crypt_keys/count
    1

    # Add key #2 in the same way

    # how many keys do we have now?
    cat /sys/kernel/config/crash_dm_crypt_keys/count
    2

    # the tree structure of /crash_dm_crypt_keys configfs
    tree /sys/kernel/config/crash_dm_crypt_keys/
    /sys/kernel/config/crash_dm_crypt_keys/
    ├── 7d26b7b4-e342-4d2d-b660-7426b0996720
    │   └── description
    ├── count
    ├── fce2cd38-4d59-4317-8ce2-1fd24d52c46a
    │   └── description

Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 Documentation/admin-guide/kdump/kdump.rst |  28 ++++
 kernel/Kconfig.kexec                      |  10 ++
 kernel/Makefile                           |   1 +
 kernel/crash_dump_dm_crypt.c              | 154 ++++++++++++++++++++++
 4 files changed, 193 insertions(+)
 create mode 100644 kernel/crash_dump_dm_crypt.c

diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index 5376890adbeb..83d422d761b6 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -551,6 +551,34 @@ from within add_taint() whenever the value set in this bitmask matches with the
 bit flag being set by add_taint().
 This will cause a kdump to occur at the add_taint()->panic() call.
 
+Write the dump file to encrypted disk volume
+============================================
+
+CONFIG_CRASH_DM_CRYPT can be enabled to support saving the dump file to an
+encrypted disk volume. User space can interact with
+/sys/kernel/config/crash_dm_crypt_keys for setup,
+
+1. Tell the first kernel what keys are needed to unlock the disk volumes,
+    # Add key #1
+    mkdir /sys/kernel/config/crash_dm_crypt_keys/7d26b7b4-e342-4d2d-b660-7426b0996720
+    # Add key #1's description
+    echo cryptsetup:7d26b7b4-e342-4d2d-b660-7426b0996720 > /sys/kernel/config/crash_dm_crypt_keys/7d26b7b4-e342-4d2d-b660-7426b0996720/description
+
+    # how many keys do we have now?
+    cat /sys/kernel/config/crash_dm_crypt_keys/count
+    1
+
+    # Add key #2 in the same way
+
+    # how many keys do we have now?
+    cat /sys/kernel/config/crash_dm_crypt_keys/count
+    2
+
+2. Load the dump-capture kernel
+
+3. After the dump-capture kerne get booted, restore the keys to user keyring
+   echo yes > /sys/kernel/crash_dm_crypt_keys/restore
+
 Contact
 =======
 
diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec
index 4d111f871951..5226775fd4c6 100644
--- a/kernel/Kconfig.kexec
+++ b/kernel/Kconfig.kexec
@@ -116,6 +116,16 @@ config CRASH_DUMP
 	  For s390, this option also enables zfcpdump.
 	  See also <file:Documentation/arch/s390/zfcpdump.rst>
 
+config CRASH_DM_CRYPT
+	bool "Support saving crash dump to dm-crypt encrypted volume"
+	depends on KEXEC_FILE
+	depends on CRASH_DUMP
+	depends on DM_CRYPT
+	help
+	  With this option enabled, user space can intereact with
+	  /sys/kernel/config/crash_dm_crypt_keys to make the dm crypt keys
+	  persistent for the dump-capture kernel.
+
 config CRASH_HOTPLUG
 	bool "Update the crash elfcorehdr on system configuration changes"
 	default y
diff --git a/kernel/Makefile b/kernel/Makefile
index 87866b037fbe..9d1cabf1ec46 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -72,6 +72,7 @@ obj-$(CONFIG_VMCORE_INFO) += vmcore_info.o elfcorehdr.o
 obj-$(CONFIG_CRASH_RESERVE) += crash_reserve.o
 obj-$(CONFIG_KEXEC_CORE) += kexec_core.o
 obj-$(CONFIG_CRASH_DUMP) += crash_core.o
+obj-$(CONFIG_CRASH_DM_CRYPT) += crash_dump_dm_crypt.o
 obj-$(CONFIG_KEXEC) += kexec.o
 obj-$(CONFIG_KEXEC_FILE) += kexec_file.o
 obj-$(CONFIG_KEXEC_ELF) += kexec_elf.o
diff --git a/kernel/crash_dump_dm_crypt.c b/kernel/crash_dump_dm_crypt.c
new file mode 100644
index 000000000000..62a3c47d8b3b
--- /dev/null
+++ b/kernel/crash_dump_dm_crypt.c
@@ -0,0 +1,154 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <keys/user-type.h>
+#include <linux/crash_dump.h>
+#include <linux/configfs.h>
+#include <linux/module.h>
+
+#define KEY_NUM_MAX 128	/* maximum dm crypt keys */
+#define KEY_DESC_MAX_LEN 128	/* maximum dm crypt key description size */
+
+static unsigned int key_count;
+
+struct config_key {
+	struct config_item item;
+	const char *description;
+};
+
+static inline struct config_key *to_config_key(struct config_item *item)
+{
+	return container_of(item, struct config_key, item);
+}
+
+static ssize_t config_key_description_show(struct config_item *item, char *page)
+{
+	return sprintf(page, "%s\n", to_config_key(item)->description);
+}
+
+static ssize_t config_key_description_store(struct config_item *item,
+					    const char *page, size_t count)
+{
+	struct config_key *config_key = to_config_key(item);
+	size_t len;
+	int ret;
+
+	ret = -EINVAL;
+	len = strcspn(page, "\n");
+
+	if (len > KEY_DESC_MAX_LEN) {
+		pr_err("The key description shouldn't exceed %u characters\n", KEY_DESC_MAX_LEN);
+		return ret;
+	}
+
+	if (!len)
+		return ret;
+
+	kfree(config_key->description);
+	ret = -ENOMEM;
+	config_key->description = kmemdup_nul(page, len, GFP_KERNEL);
+	if (!config_key->description)
+		return ret;
+
+	return count;
+}
+
+CONFIGFS_ATTR(config_key_, description);
+
+static struct configfs_attribute *config_key_attrs[] = {
+	&config_key_attr_description,
+	NULL,
+};
+
+static void config_key_release(struct config_item *item)
+{
+	kfree(to_config_key(item));
+	key_count--;
+}
+
+static struct configfs_item_operations config_key_item_ops = {
+	.release = config_key_release,
+};
+
+static const struct config_item_type config_key_type = {
+	.ct_item_ops = &config_key_item_ops,
+	.ct_attrs = config_key_attrs,
+	.ct_owner = THIS_MODULE,
+};
+
+static struct config_item *config_keys_make_item(struct config_group *group,
+						 const char *name)
+{
+	struct config_key *config_key;
+
+	if (key_count >= KEY_NUM_MAX) {
+		pr_err("Only %u keys at maximum to be created\n", KEY_NUM_MAX);
+		return ERR_PTR(-EINVAL);
+	}
+
+	config_key = kzalloc(sizeof(struct config_key), GFP_KERNEL);
+	if (!config_key)
+		return ERR_PTR(-ENOMEM);
+
+	config_item_init_type_name(&config_key->item, name, &config_key_type);
+
+	key_count++;
+
+	return &config_key->item;
+}
+
+static ssize_t config_keys_count_show(struct config_item *item, char *page)
+{
+	return sprintf(page, "%d\n", key_count);
+}
+
+CONFIGFS_ATTR_RO(config_keys_, count);
+
+static struct configfs_attribute *config_keys_attrs[] = {
+	&config_keys_attr_count,
+	NULL,
+};
+
+/*
+ * Note that, since no extra work is required on ->drop_item(),
+ * no ->drop_item() is provided.
+ */
+static struct configfs_group_operations config_keys_group_ops = {
+	.make_item = config_keys_make_item,
+};
+
+static const struct config_item_type config_keys_type = {
+	.ct_group_ops = &config_keys_group_ops,
+	.ct_attrs = config_keys_attrs,
+	.ct_owner = THIS_MODULE,
+};
+
+static struct configfs_subsystem config_keys_subsys = {
+	.su_group = {
+		.cg_item = {
+			.ci_namebuf = "crash_dm_crypt_keys",
+			.ci_type = &config_keys_type,
+		},
+	},
+};
+
+static int __init configfs_dmcrypt_keys_init(void)
+{
+	int ret;
+
+	config_group_init(&config_keys_subsys.su_group);
+	mutex_init(&config_keys_subsys.su_mutex);
+	ret = configfs_register_subsystem(&config_keys_subsys);
+	if (ret) {
+		pr_err("Error %d while registering subsystem %s\n", ret,
+		       config_keys_subsys.su_group.cg_item.ci_namebuf);
+		goto out_unregister;
+	}
+
+	return 0;
+
+out_unregister:
+	configfs_unregister_subsystem(&config_keys_subsys);
+
+	return ret;
+}
+
+module_init(configfs_dmcrypt_keys_init);
-- 
2.48.1



* [PATCH v8 3/7] crash_dump: store dm crypt keys in kdump reserved memory
  2025-02-07  8:08 [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Coiby Xu
  2025-02-07  8:08 ` [PATCH v8 1/7] kexec_file: allow to place kexec_buf randomly Coiby Xu
  2025-02-07  8:08 ` [PATCH v8 2/7] crash_dump: make dm crypt keys persist for the kdump kernel Coiby Xu
@ 2025-02-07  8:08 ` Coiby Xu
  2025-02-07  8:08 ` [PATCH v8 4/7] crash_dump: reuse saved dm crypt keys for CPU/memory hot-plugging Coiby Xu
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Coiby Xu @ 2025-02-07  8:08 UTC (permalink / raw)
  To: kexec
  Cc: Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Vivek Goyal, Eric Biederman, Kees Cook, Gustavo A. R. Silva,
	open list:KERNEL HARDENING (not covered by other areas):Keyword:\b__counted_by(_le|_be)?\b

When the kdump kernel image and initrd are loaded, the dm crypt keys
will be read from the keyring and then stored in kdump reserved memory.
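
The size of the buffer that ends up in kdump reserved memory follows from
the layout of the header and its key entries. The following is a minimal
user-space sketch of that layout, assuming KEY_SIZE_MAX of 256 and
KEY_DESC_MAX_LEN of 128 as in this series. keys_header_size() is a
hypothetical stand-in for the kernel's struct_size() helper and, unlike
struct_size(), does not saturate on overflow.

```c
#include <stddef.h>

#define KEY_SIZE_MAX 256	/* maximum dm crypt key size */
#define KEY_DESC_MAX_LEN 128	/* maximum dm crypt key description size */

/* One entry per key: size, description and the key bytes themselves. */
struct dm_crypt_key {
	unsigned int key_size;
	char key_desc[KEY_DESC_MAX_LEN];
	unsigned char data[KEY_SIZE_MAX];
};

/* A key count followed by a flexible array of key entries. */
struct keys_header {
	unsigned int total_keys;
	struct dm_crypt_key keys[];
};

/* Bytes needed for a header that holds total_keys entries. */
static size_t keys_header_size(size_t total_keys)
{
	return offsetof(struct keys_header, keys) +
	       total_keys * sizeof(struct dm_crypt_key);
}
```

With a 4-byte unsigned int, each entry takes 388 bytes, so the header for
the maximum of 128 keys takes 49668 bytes, about 48.5 KiB.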

Assume a key won't exceed 256 bytes, thus KEY_SIZE_MAX=256, according to
"cryptsetup benchmark".

Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 include/linux/crash_core.h   |   6 +-
 include/linux/kexec.h        |   4 ++
 kernel/crash_dump_dm_crypt.c | 128 +++++++++++++++++++++++++++++++++++
 3 files changed, 137 insertions(+), 1 deletion(-)

diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index 44305336314e..2e6782239034 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -34,7 +34,11 @@ static inline void arch_kexec_protect_crashkres(void) { }
 static inline void arch_kexec_unprotect_crashkres(void) { }
 #endif
 
-
+#ifdef CONFIG_CRASH_DM_CRYPT
+int crash_load_dm_crypt_keys(struct kimage *image);
+#else
+static inline int crash_load_dm_crypt_keys(struct kimage *image) {return 0; }
+#endif
 
 #ifndef arch_crash_handle_hotplug_event
 static inline void arch_crash_handle_hotplug_event(struct kimage *image, void *arg) { }
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index 61269e97502a..ec7504ba80e9 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -398,6 +398,10 @@ struct kimage {
 	void *elf_headers;
 	unsigned long elf_headers_sz;
 	unsigned long elf_load_addr;
+
+	/* dm crypt keys buffer */
+	unsigned long dm_crypt_keys_addr;
+	unsigned long dm_crypt_keys_sz;
 };
 
 /* kexec interface functions */
diff --git a/kernel/crash_dump_dm_crypt.c b/kernel/crash_dump_dm_crypt.c
index 62a3c47d8b3b..00dc6a3f71ca 100644
--- a/kernel/crash_dump_dm_crypt.c
+++ b/kernel/crash_dump_dm_crypt.c
@@ -1,14 +1,62 @@
 // SPDX-License-Identifier: GPL-2.0-only
+#include <linux/key.h>
+#include <linux/keyctl.h>
 #include <keys/user-type.h>
 #include <linux/crash_dump.h>
 #include <linux/configfs.h>
 #include <linux/module.h>
 
 #define KEY_NUM_MAX 128	/* maximum dm crypt keys */
+#define KEY_SIZE_MAX 256	/* maximum dm crypt key size */
 #define KEY_DESC_MAX_LEN 128	/* maximum dm crypt key description size */
 
 static unsigned int key_count;
 
+struct dm_crypt_key {
+	unsigned int key_size;
+	char key_desc[KEY_DESC_MAX_LEN];
+	u8 data[KEY_SIZE_MAX];
+};
+
+static struct keys_header {
+	unsigned int total_keys;
+	struct dm_crypt_key keys[] __counted_by(total_keys);
+} *keys_header;
+
+static size_t get_keys_header_size(size_t total_keys)
+{
+	return struct_size(keys_header, keys, total_keys);
+}
+
+static int read_key_from_user_keying(struct dm_crypt_key *dm_key)
+{
+	const struct user_key_payload *ukp;
+	struct key *key;
+
+	kexec_dprintk("Requesting key %s", dm_key->key_desc);
+	key = request_key(&key_type_logon, dm_key->key_desc, NULL);
+
+	if (IS_ERR(key)) {
+		pr_warn("No such key %s\n", dm_key->key_desc);
+		return PTR_ERR(key);
+	}
+
+	ukp = user_key_payload_locked(key);
+	if (!ukp)
+		return -EKEYREVOKED;
+
+	if (ukp->datalen > KEY_SIZE_MAX) {
+		pr_err("Key size %u exceeds maximum (%u)\n", ukp->datalen, KEY_SIZE_MAX);
+		return -EINVAL;
+	}
+
+	memcpy(dm_key->data, ukp->data, ukp->datalen);
+	dm_key->key_size = ukp->datalen;
+	kexec_dprintk("Get dm crypt key (size=%u) %s: %8ph\n", dm_key->key_size,
+		      dm_key->key_desc, dm_key->data);
+	return 0;
+}
+
 struct config_key {
 	struct config_item item;
 	const char *description;
@@ -130,6 +178,86 @@ static struct configfs_subsystem config_keys_subsys = {
 	},
 };
 
+static int build_keys_header(void)
+{
+	struct config_item *item = NULL;
+	struct config_key *key;
+	int i, r;
+
+	if (keys_header != NULL)
+		kvfree(keys_header);
+
+	keys_header = kzalloc(get_keys_header_size(key_count), GFP_KERNEL);
+	if (!keys_header)
+		return -ENOMEM;
+
+	keys_header->total_keys = key_count;
+
+	i = 0;
+	list_for_each_entry(item, &config_keys_subsys.su_group.cg_children,
+			    ci_entry) {
+		if (item->ci_type != &config_key_type)
+			continue;
+
+		key = to_config_key(item);
+
+		strscpy(keys_header->keys[i].key_desc, key->description,
+			KEY_DESC_MAX_LEN);
+		r = read_key_from_user_keying(&keys_header->keys[i]);
+		if (r != 0) {
+			kexec_dprintk("Failed to read key %s\n",
+				      keys_header->keys[i].key_desc);
+			return r;
+		}
+		i++;
+		kexec_dprintk("Found key: %s\n", item->ci_name);
+	}
+
+	return 0;
+}
+
+int crash_load_dm_crypt_keys(struct kimage *image)
+{
+	struct kexec_buf kbuf = {
+		.image = image,
+		.buf_min = 0,
+		.buf_max = ULONG_MAX,
+		.top_down = false,
+		.random = true,
+	};
+	int r;
+
+
+	if (key_count <= 0) {
+		kexec_dprintk("No dm-crypt keys\n");
+		return -ENOENT;
+	}
+
+	image->dm_crypt_keys_addr = 0;
+	r = build_keys_header();
+	if (r)
+		return r;
+
+	kbuf.buffer = keys_header;
+	kbuf.bufsz = get_keys_header_size(key_count);
+
+	kbuf.memsz = kbuf.bufsz;
+	kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
+	kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
+	r = kexec_add_buffer(&kbuf);
+	if (r) {
+		kvfree((void *)kbuf.buffer);
+		return r;
+	}
+	image->dm_crypt_keys_addr = kbuf.mem;
+	image->dm_crypt_keys_sz = kbuf.bufsz;
+	kexec_dprintk(
+		"Loaded dm crypt keys to kexec_buffer bufsz=0x%lx memsz=0x%lx\n",
+		kbuf.bufsz, kbuf.memsz);
+
+	return r;
+}
+
 static int __init configfs_dmcrypt_keys_init(void)
 {
 	int ret;
-- 
2.48.1



* [PATCH v8 4/7] crash_dump: reuse saved dm crypt keys for CPU/memory hot-plugging
  2025-02-07  8:08 [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Coiby Xu
                   ` (2 preceding siblings ...)
  2025-02-07  8:08 ` [PATCH v8 3/7] crash_dump: store dm crypt keys in kdump reserved memory Coiby Xu
@ 2025-02-07  8:08 ` Coiby Xu
  2025-02-07  8:08 ` [PATCH v8 5/7] crash_dump: retrieve dm crypt keys in kdump kernel Coiby Xu
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Coiby Xu @ 2025-02-07  8:08 UTC (permalink / raw)
  To: kexec
  Cc: Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Vivek Goyal, Jonathan Corbet, open list:DOCUMENTATION

When there are CPU and memory hot un/plugs, the dm crypt keys may need
to be reloaded, depending on the solution used for crash hotplug
support. Currently, there are two solutions. One is to utilize udev to
instruct user space to reload the kdump kernel image, initrd,
elfcorehdr, etc. The other is to only update the elfcorehdr
segment, introduced in commit 247262756121 ("crash:
add generic infrastructure for crash hotplug support").

For the 1st solution, the dm crypt keys need to be reloaded. User
space can write true to
/sys/kernel/config/crash_dm_crypt_keys/reuse so the stored keys can be
reused.

For the 2nd solution, the dm crypt keys don't need to be reloaded.
Currently, only x86 supports the 2nd solution. If the 2nd solution
gets extended to all arches, this patch can be dropped.

Signed-off-by: Coiby Xu <coxu@redhat.com>
---
 Documentation/admin-guide/kdump/kdump.rst |  4 ++
 kernel/crash_dump_dm_crypt.c              | 52 +++++++++++++++++++++--
 2 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index 83d422d761b6..1283f0244614 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -574,6 +574,10 @@ encrypted disk volume. User space can interact with
     cat /sys/kernel/config/crash_dm_crypt_keys/count
     2
 
+    # To support CPU/memory hot-plugging, re-use keys already saved to reserved
+    # memory
+    echo true > /sys/kernel/config/crash_dm_crypt_keys/reuse
+
 2. Load the dump-capture kernel
 
 3. After the dump-capture kerne get booted, restore the keys to user keyring
diff --git a/kernel/crash_dump_dm_crypt.c b/kernel/crash_dump_dm_crypt.c
index 00dc6a3f71ca..e4e0cc1c3399 100644
--- a/kernel/crash_dump_dm_crypt.c
+++ b/kernel/crash_dump_dm_crypt.c
@@ -28,6 +28,20 @@ static size_t get_keys_header_size(size_t total_keys)
 	return struct_size(keys_header, keys, total_keys);
 }
 
+static void get_keys_from_kdump_reserved_memory(void)
+{
+	struct keys_header *keys_header_loaded;
+
+	arch_kexec_unprotect_crashkres();
+
+	keys_header_loaded = kmap_local_page(pfn_to_page(
+		kexec_crash_image->dm_crypt_keys_addr >> PAGE_SHIFT));
+
+	memcpy(keys_header, keys_header_loaded, get_keys_header_size(key_count));
+	kunmap_local(keys_header_loaded);
+	arch_kexec_protect_crashkres();
+}
+
 static int read_key_from_user_keying(struct dm_crypt_key *dm_key)
 {
 	const struct user_key_payload *ukp;
@@ -150,8 +164,36 @@ static ssize_t config_keys_count_show(struct config_item *item, char *page)
 
 CONFIGFS_ATTR_RO(config_keys_, count);
 
+static bool is_dm_key_reused;
+
+static ssize_t config_keys_reuse_show(struct config_item *item, char *page)
+{
+	return sprintf(page, "%d\n", is_dm_key_reused);
+}
+
+static ssize_t config_keys_reuse_store(struct config_item *item,
+					   const char *page, size_t count)
+{
+	if (!kexec_crash_image || !kexec_crash_image->dm_crypt_keys_addr) {
+		kexec_dprintk(
+			"dm-crypt keys haven't been saved to crash-reserved memory\n");
+		return -EINVAL;
+	}
+
+	if (kstrtobool(page, &is_dm_key_reused))
+		return -EINVAL;
+
+	if (is_dm_key_reused)
+		get_keys_from_kdump_reserved_memory();
+
+	return count;
+}
+
+CONFIGFS_ATTR(config_keys_, reuse);
+
 static struct configfs_attribute *config_keys_attrs[] = {
 	&config_keys_attr_count,
+	&config_keys_attr_reuse,
 	NULL,
 };
 
@@ -233,10 +275,12 @@ int crash_load_dm_crypt_keys(struct kimage *image)
 		return -ENOENT;
 	}
 
-	image->dm_crypt_keys_addr = 0;
-	r = build_keys_header();
-	if (r)
-		return r;
+	if (!is_dm_key_reused) {
+		image->dm_crypt_keys_addr = 0;
+		r = build_keys_header();
+		if (r)
+			return r;
+	}
 
 	kbuf.buffer = keys_header;
 	kbuf.bufsz = get_keys_header_size(key_count);
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v8 5/7] crash_dump: retrieve dm crypt keys in kdump kernel
  2025-02-07  8:08 [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Coiby Xu
                   ` (3 preceding siblings ...)
  2025-02-07  8:08 ` [PATCH v8 4/7] crash_dump: reuse saved dm crypt keys for CPU/memory hot-plugging Coiby Xu
@ 2025-02-07  8:08 ` Coiby Xu
  2025-02-07  8:08 ` [PATCH v8 6/7] x86/crash: pass dm crypt keys to " Coiby Xu
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Coiby Xu @ 2025-02-07  8:08 UTC (permalink / raw)
  To: kexec
  Cc: Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Vivek Goyal

The crash kernel retrieves the dm crypt keys from the address given by
the dmcryptkeys command line parameter. When user space writes yes to
/sys/kernel/config/crash_dm_crypt_keys/restore, the crash kernel saves
the encryption keys to the user keyring. User space, e.g. cryptsetup via
its --volume-key-keyring API, can then use them to unlock the encrypted
device.

Signed-off-by: Coiby Xu <coxu@redhat.com>
---
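To make the dmcryptkeys= handling easier to follow, here is a user-space
sketch of the same parsing logic as setup_dmcryptkeys(). It is not the
kernel code: parse_dmcryptkeys() is a hypothetical name, and strtoull()
stands in for memparse(), so K/M/G suffixes are not handled.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse the value of "dmcryptkeys=<addr>". On success store the address
 * in *addr and return 0. If no digits were consumed, reset *addr to 0
 * and return -EINVAL, like the early_param handler in the patch.
 */
int parse_dmcryptkeys(const char *arg, unsigned long long *addr)
{
	char *end;

	assert(addr != NULL);
	*addr = 0;
	if (!arg)
		return -EINVAL;

	/* base 0 accepts both decimal and 0x-prefixed hex */
	*addr = strtoull(arg, &end, 0);
	if (end > arg)
		return 0;

	*addr = 0;
	return -EINVAL;
}
```

With this, "0x1000" yields address 4096, while an empty or non-numeric
value leaves the address at 0, which is what the later code treats as
"no keys passed".
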
 include/linux/crash_core.h   |   1 +
 include/linux/crash_dump.h   |   2 +
 kernel/crash_dump_dm_crypt.c | 133 +++++++++++++++++++++++++++++++++++
 3 files changed, 136 insertions(+)

diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index 2e6782239034..d35726d6a415 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -36,6 +36,7 @@ static inline void arch_kexec_unprotect_crashkres(void) { }
 
 #ifdef CONFIG_CRASH_DM_CRYPT
 int crash_load_dm_crypt_keys(struct kimage *image);
+ssize_t dm_crypt_keys_read(char *buf, size_t count, u64 *ppos);
 #else
 static inline int crash_load_dm_crypt_keys(struct kimage *image) {return 0; }
 #endif
diff --git a/include/linux/crash_dump.h b/include/linux/crash_dump.h
index 2f2555e6407c..dd6fc3b2133b 100644
--- a/include/linux/crash_dump.h
+++ b/include/linux/crash_dump.h
@@ -15,6 +15,8 @@
 extern unsigned long long elfcorehdr_addr;
 extern unsigned long long elfcorehdr_size;
 
+extern unsigned long long dm_crypt_keys_addr;
+
 #ifdef CONFIG_CRASH_DUMP
 extern int elfcorehdr_alloc(unsigned long long *addr, unsigned long long *size);
 extern void elfcorehdr_free(unsigned long long addr);
diff --git a/kernel/crash_dump_dm_crypt.c b/kernel/crash_dump_dm_crypt.c
index e4e0cc1c3399..993d9e08d774 100644
--- a/kernel/crash_dump_dm_crypt.c
+++ b/kernel/crash_dump_dm_crypt.c
@@ -3,6 +3,7 @@
 #include <linux/keyctl.h>
 #include <keys/user-type.h>
 #include <linux/crash_dump.h>
+#include <linux/cc_platform.h>
 #include <linux/configfs.h>
 #include <linux/module.h>
 
@@ -28,6 +29,61 @@ static size_t get_keys_header_size(size_t total_keys)
 	return struct_size(keys_header, keys, total_keys);
 }
 
+unsigned long long dm_crypt_keys_addr;
+EXPORT_SYMBOL_GPL(dm_crypt_keys_addr);
+
+static int __init setup_dmcryptkeys(char *arg)
+{
+	char *end;
+
+	if (!arg)
+		return -EINVAL;
+	dm_crypt_keys_addr = memparse(arg, &end);
+	if (end > arg)
+		return 0;
+
+	dm_crypt_keys_addr = 0;
+	return -EINVAL;
+}
+
+early_param("dmcryptkeys", setup_dmcryptkeys);
+
+/*
+ * Architectures may override this function to read dm crypt keys
+ */
+ssize_t __weak dm_crypt_keys_read(char *buf, size_t count, u64 *ppos)
+{
+	struct kvec kvec = { .iov_base = buf, .iov_len = count };
+	struct iov_iter iter;
+
+	iov_iter_kvec(&iter, READ, &kvec, 1, count);
+	return read_from_oldmem(&iter, count, ppos, cc_platform_has(CC_ATTR_MEM_ENCRYPT));
+}
+
+static int add_key_to_keyring(struct dm_crypt_key *dm_key,
+			      key_ref_t keyring_ref)
+{
+	key_ref_t key_ref;
+	int r;
+
+	/* create or update the requested key and add it to the target keyring */
+	key_ref = key_create_or_update(keyring_ref, "user", dm_key->key_desc,
+				       dm_key->data, dm_key->key_size,
+				       KEY_USR_ALL, KEY_ALLOC_IN_QUOTA);
+
+	if (!IS_ERR(key_ref)) {
+		r = key_ref_to_ptr(key_ref)->serial;
+		key_ref_put(key_ref);
+		kexec_dprintk("Success adding key %s\n", dm_key->key_desc);
+	} else {
+		r = PTR_ERR(key_ref);
+		kexec_dprintk("Error when adding key\n");
+	}
+
+	key_ref_put(keyring_ref);
+	return r;
+}
+
 static void get_keys_from_kdump_reserved_memory(void)
 {
 	struct keys_header *keys_header_loaded;
@@ -42,6 +98,47 @@ static void get_keys_from_kdump_reserved_memory(void)
 	arch_kexec_protect_crashkres();
 }
 
+static int restore_dm_crypt_keys_to_thread_keyring(void)
+{
+	struct dm_crypt_key *key;
+	size_t keys_header_size;
+	key_ref_t keyring_ref;
+	u64 addr;
+
+	/* find the target keyring (which must be writable) */
+	keyring_ref =
+		lookup_user_key(KEY_SPEC_USER_KEYRING, 0x01, KEY_NEED_WRITE);
+	if (IS_ERR(keyring_ref)) {
+		kexec_dprintk("Failed to get the user keyring\n");
+		return PTR_ERR(keyring_ref);
+	}
+
+	addr = dm_crypt_keys_addr;
+	dm_crypt_keys_read((char *)&key_count, sizeof(key_count), &addr);
+	if (key_count < 0 || key_count > KEY_NUM_MAX) {
+		kexec_dprintk("Failed to read the number of dm-crypt keys\n");
+		return -1;
+	}
+
+	kexec_dprintk("There are %u keys\n", key_count);
+	addr = dm_crypt_keys_addr;
+
+	keys_header_size = get_keys_header_size(key_count);
+	keys_header = kzalloc(keys_header_size, GFP_KERNEL);
+	if (!keys_header)
+		return -ENOMEM;
+
+	dm_crypt_keys_read((char *)keys_header, keys_header_size, &addr);
+
+	for (int i = 0; i < keys_header->total_keys; i++) {
+		key = &keys_header->keys[i];
+		kexec_dprintk("Get key (size=%u)\n", key->key_size);
+		add_key_to_keyring(key, keyring_ref);
+	}
+
+	return 0;
+}
+
 static int read_key_from_user_keying(struct dm_crypt_key *dm_key)
 {
 	const struct user_key_payload *ukp;
@@ -211,6 +308,37 @@ static const struct config_item_type config_keys_type = {
 	.ct_owner = THIS_MODULE,
 };
 
+static bool restore;
+
+static ssize_t config_keys_restore_show(struct config_item *item, char *page)
+{
+	return sprintf(page, "%d\n", restore);
+}
+
+static ssize_t config_keys_restore_store(struct config_item *item,
+					  const char *page, size_t count)
+{
+	if (!restore)
+		restore_dm_crypt_keys_to_thread_keyring();
+
+	if (kstrtobool(page, &restore))
+		return -EINVAL;
+
+	return count;
+}
+
+CONFIGFS_ATTR(config_keys_, restore);
+
+static struct configfs_attribute *kdump_config_keys_attrs[] = {
+	&config_keys_attr_restore,
+	NULL,
+};
+
+static const struct config_item_type kdump_config_keys_type = {
+	.ct_attrs = kdump_config_keys_attrs,
+	.ct_owner = THIS_MODULE,
+};
+
 static struct configfs_subsystem config_keys_subsys = {
 	.su_group = {
 		.cg_item = {
@@ -306,6 +434,11 @@ static int __init configfs_dmcrypt_keys_init(void)
 {
 	int ret;
 
+	if (is_kdump_kernel()) {
+		config_keys_subsys.su_group.cg_item.ci_type =
+			&kdump_config_keys_type;
+	}
+
 	config_group_init(&config_keys_subsys.su_group);
 	mutex_init(&config_keys_subsys.su_mutex);
 	ret = configfs_register_subsystem(&config_keys_subsys);
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v8 6/7] x86/crash: pass dm crypt keys to kdump kernel
  2025-02-07  8:08 [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Coiby Xu
                   ` (4 preceding siblings ...)
  2025-02-07  8:08 ` [PATCH v8 5/7] crash_dump: retrieve dm crypt keys in kdump kernel Coiby Xu
@ 2025-02-07  8:08 ` Coiby Xu
  2025-04-23 20:59   ` Arnaud Lefebvre
  2025-02-07  8:13 ` [PATCH v8 7/7] x86/crash: make the page that stores the dm crypt keys inaccessible Coiby Xu
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 24+ messages in thread
From: Coiby Xu @ 2025-02-07  8:08 UTC (permalink / raw)
  To: kexec
  Cc: Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Vivek Goyal, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin,
	open list:DOCUMENTATION

The 1st kernel builds the dmcryptkeys kernel command line parameter,
similar to elfcorehdr, to pass the memory address of the stored dm crypt
key info to the kdump kernel.

Signed-off-by: Coiby Xu <coxu@redhat.com>
---
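As an illustration of the resulting command line, the prefix that
setup_cmdline() prepends can be sketched in user space as below.
build_crash_cmdline_prefix() is a hypothetical name, and snprintf() with
explicit bounds checks is used here instead of the kernel's sprintf().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "elfcorehdr=0x... [dmcryptkeys=0x... ]" into buf.
 * Returns the prefix length, or -1 if buf is too small.
 */
int build_crash_cmdline_prefix(char *buf, size_t size,
			       unsigned long elf_load_addr,
			       unsigned long dm_crypt_keys_addr)
{
	int len, n;

	assert(buf != NULL && size > 0);

	len = snprintf(buf, size, "elfcorehdr=0x%lx ", elf_load_addr);
	if (len < 0 || (size_t)len >= size)
		return -1;

	/* dmcryptkeys= is only added when keys were actually loaded */
	if (dm_crypt_keys_addr != 0) {
		n = snprintf(buf + len, size - len, "dmcryptkeys=0x%lx ",
			     dm_crypt_keys_addr);
		if (n < 0 || (size_t)n >= size - len)
			return -1;
		len += n;
	}

	return len;
}
```

The user-supplied command line is then appended after this prefix, so
the kdump kernel sees both parameters before anything else.
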
 Documentation/admin-guide/kdump/kdump.rst |  4 ++--
 arch/x86/kernel/crash.c                   | 26 +++++++++++++++++++++--
 arch/x86/kernel/kexec-bzimage64.c         | 11 ++++++++++
 3 files changed, 37 insertions(+), 4 deletions(-)

diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index 1283f0244614..2209caf36d79 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -555,8 +555,8 @@ Write the dump file to encrypted disk volume
 ============================================
 
 CONFIG_CRASH_DM_CRYPT can be enabled to support saving the dump file to an
-encrypted disk volume. User space can interact with
-/sys/kernel/config/crash_dm_crypt_keys for setup,
+encrypted disk volume (only x86_64 supported for now). User space can interact
+with /sys/kernel/config/crash_dm_crypt_keys for setup,
 
 1. Tell the first kernel what keys are needed to unlock the disk volumes,
     # Add key #1
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 340af8155658..a525ee639b63 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -278,6 +278,7 @@ static int memmap_exclude_ranges(struct kimage *image, struct crash_mem *cmem,
 				 unsigned long long mend)
 {
 	unsigned long start, end;
+	int ret;
 
 	cmem->ranges[0].start = mstart;
 	cmem->ranges[0].end = mend;
@@ -286,22 +287,43 @@ static int memmap_exclude_ranges(struct kimage *image, struct crash_mem *cmem,
 	/* Exclude elf header region */
 	start = image->elf_load_addr;
 	end = start + image->elf_headers_sz - 1;
-	return crash_exclude_mem_range(cmem, start, end);
+	ret = crash_exclude_mem_range(cmem, start, end);
+
+	if (ret)
+		return ret;
+
+	/* Exclude dm crypt keys region */
+	if (image->dm_crypt_keys_addr) {
+		start = image->dm_crypt_keys_addr;
+		end = start + image->dm_crypt_keys_sz - 1;
+		return crash_exclude_mem_range(cmem, start, end);
+	}
+
+	return ret;
 }
 
 /* Prepare memory map for crash dump kernel */
 int crash_setup_memmap_entries(struct kimage *image, struct boot_params *params)
 {
+	unsigned int nr_ranges = 0;
 	int i, ret = 0;
 	unsigned long flags;
 	struct e820_entry ei;
 	struct crash_memmap_data cmd;
 	struct crash_mem *cmem;
 
-	cmem = vzalloc(struct_size(cmem, ranges, 1));
+	/*
+	 * Using random kexec_buf for passing dm crypt keys may cause a range
+	 * split. So use two slots here.
+	 */
+	nr_ranges = 2;
+	cmem = vzalloc(struct_size(cmem, ranges, nr_ranges));
 	if (!cmem)
 		return -ENOMEM;
 
+	cmem->max_nr_ranges = nr_ranges;
+	cmem->nr_ranges = 0;
+
 	memset(&cmd, 0, sizeof(struct crash_memmap_data));
 	cmd.params = params;
 
diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
index 68530fad05f7..5604a5109858 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -76,6 +76,10 @@ static int setup_cmdline(struct kimage *image, struct boot_params *params,
 	if (image->type == KEXEC_TYPE_CRASH) {
 		len = sprintf(cmdline_ptr,
 			"elfcorehdr=0x%lx ", image->elf_load_addr);
+
+		if (image->dm_crypt_keys_addr != 0)
+			len += sprintf(cmdline_ptr + len,
+					"dmcryptkeys=0x%lx ", image->dm_crypt_keys_addr);
 	}
 	memcpy(cmdline_ptr + len, cmdline, cmdline_len);
 	cmdline_len += len;
@@ -441,6 +445,13 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
 		ret = crash_load_segments(image);
 		if (ret)
 			return ERR_PTR(ret);
+		ret = crash_load_dm_crypt_keys(image);
+		if (ret == -ENOENT) {
+			kexec_dprintk("No dm crypt key to load\n");
+		} else if (ret) {
+			pr_err("Failed to load dm crypt keys\n");
+			return ERR_PTR(ret);
+		}
 	}
 #endif
 
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v8 7/7] x86/crash: make the page that stores the dm crypt keys inaccessible
  2025-02-07  8:08 [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Coiby Xu
                   ` (5 preceding siblings ...)
  2025-02-07  8:08 ` [PATCH v8 6/7] x86/crash: pass dm crypt keys to " Coiby Xu
@ 2025-02-07  8:13 ` Coiby Xu
  2025-02-11 10:25 ` [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Baoquan He
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 24+ messages in thread
From: Coiby Xu @ 2025-02-07  8:13 UTC (permalink / raw)
  To: kexec
  Cc: Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin

This adds an additional layer of protection for the saved copy of the dm
crypt keys. Trying to access the saved copy will cause a page fault.

Suggested-by: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Coiby Xu <coxu@redhat.com>
---
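The page count passed to set_memory_np() comes from simple alignment
arithmetic. The user-space sketch below reproduces it next to a variant
that rounds up the exclusive end address, purely for comparison. It
assumes 4 KiB pages, and all names in it are hypothetical.

```c
#include <assert.h>

/* Assumption for this sketch: 4 KiB pages, as on x86_64. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Round up to a page boundary, like PAGE_ALIGN(). */
unsigned long sketch_page_align(unsigned long addr)
{
	return (addr + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK;
}

/* Round down to a page boundary, like PAGE_ALIGN_DOWN(). */
unsigned long sketch_page_align_down(unsigned long addr)
{
	return addr & SKETCH_PAGE_MASK;
}

/* Same arithmetic as kexec_mark_dm_crypt_keys(): inclusive end address. */
unsigned long nr_pages_inclusive_end(unsigned long start, unsigned long size)
{
	unsigned long end;

	assert(size > 0);
	end = start + size - 1;
	return (sketch_page_align(end) - sketch_page_align_down(start)) /
	       SKETCH_PAGE_SIZE;
}

/* Variant for comparison: round up the exclusive end (start + size). */
unsigned long nr_pages_exclusive_end(unsigned long start, unsigned long size)
{
	assert(size > 0);
	return (sketch_page_align(start + size) -
		sketch_page_align_down(start)) / SKETCH_PAGE_SIZE;
}
```

For a buffer at 0x1800 of size 0x1000 both give 2 pages. If I read the
arithmetic right, they differ only when the last byte sits exactly on a
page boundary, e.g. start 0x1000 with size 1 gives 0 for the first and 1
for the second, which may be worth a look in a follow-up.
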
 arch/x86/kernel/machine_kexec_64.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index a68f5a0a9f37..f615fcb6d35d 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -598,13 +598,35 @@ static void kexec_mark_crashkres(bool protect)
 	kexec_mark_range(control, crashk_res.end, protect);
 }
 
+/* make the memory storing dm crypt keys in/accessible */
+static void kexec_mark_dm_crypt_keys(bool protect)
+{
+	unsigned long start_paddr, end_paddr;
+	unsigned int nr_pages;
+
+	if (kexec_crash_image->dm_crypt_keys_addr) {
+		start_paddr = kexec_crash_image->dm_crypt_keys_addr;
+		end_paddr = start_paddr + kexec_crash_image->dm_crypt_keys_sz - 1;
+		nr_pages = (PAGE_ALIGN(end_paddr) - PAGE_ALIGN_DOWN(start_paddr))/PAGE_SIZE;
+		if (protect)
+			set_memory_np((unsigned long)phys_to_virt(start_paddr), nr_pages);
+		else
+			__set_memory_prot(
+				(unsigned long)phys_to_virt(start_paddr),
+				nr_pages,
+				__pgprot(_PAGE_PRESENT | _PAGE_NX | _PAGE_RW));
+	}
+}
+
 void arch_kexec_protect_crashkres(void)
 {
 	kexec_mark_crashkres(true);
+	kexec_mark_dm_crypt_keys(true);
 }
 
 void arch_kexec_unprotect_crashkres(void)
 {
+	kexec_mark_dm_crypt_keys(false);
 	kexec_mark_crashkres(false);
 }
 #endif
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys
  2025-02-07  8:08 [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Coiby Xu
                   ` (6 preceding siblings ...)
  2025-02-07  8:13 ` [PATCH v8 7/7] x86/crash: make the page that stores the dm crypt keys inaccessible Coiby Xu
@ 2025-02-11 10:25 ` Baoquan He
  2025-02-12  0:43   ` Coiby Xu
  2025-02-24  1:36   ` Baoquan He
  2025-03-10  3:30 ` Baoquan He
                   ` (2 subsequent siblings)
  10 siblings, 2 replies; 24+ messages in thread
From: Baoquan He @ 2025-02-11 10:25 UTC (permalink / raw)
  To: Coiby Xu
  Cc: akpm, kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Dave Young,
	linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov

This v8 looks good to me, thanks for the great effort, Coiby.

Acked-by: Baoquan He <bhe@redhat.com>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys
  2025-02-11 10:25 ` [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Baoquan He
@ 2025-02-12  0:43   ` Coiby Xu
  2025-02-24  1:36   ` Baoquan He
  1 sibling, 0 replies; 24+ messages in thread
From: Coiby Xu @ 2025-02-12  0:43 UTC (permalink / raw)
  To: Baoquan He
  Cc: akpm, kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Dave Young,
	linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov

On Tue, Feb 11, 2025 at 06:25:18PM +0800, Baoquan He wrote:
>This v8 looks good to me, thanks for the great effort, Coiby.
>
>Acked-by: Baoquan He <bhe@redhat.com>

Great, thanks for reviewing and acknowledging the patch set!

-- 
Best regards,
Coiby


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys
  2025-02-11 10:25 ` [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Baoquan He
  2025-02-12  0:43   ` Coiby Xu
@ 2025-02-24  1:36   ` Baoquan He
  2025-03-21  6:54     ` Coiby Xu
  1 sibling, 1 reply; 24+ messages in thread
From: Baoquan He @ 2025-02-24  1:36 UTC (permalink / raw)
  To: akpm
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Dave Young,
	linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov, Coiby Xu

Hi Andrew,

On 02/11/25 at 06:25pm, Baoquan He wrote:
> ......snip...

Could you pick this patchset into your tree, since there are no concerns
from other reviewers?

Thanks
Baoquan

> 
> This v8 looks good to me, thanks for the great effort, Coiby.
> 
> Acked-by: Baoquan He <bhe@redhat.com>
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys
  2025-02-07  8:08 [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Coiby Xu
                   ` (7 preceding siblings ...)
  2025-02-11 10:25 ` [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Baoquan He
@ 2025-03-10  3:30 ` Baoquan He
  2025-04-14  5:44 ` Baoquan He
  2025-04-24  0:08 ` Arnaud Lefebvre
  10 siblings, 0 replies; 24+ messages in thread
From: Baoquan He @ 2025-03-10  3:30 UTC (permalink / raw)
  To: akpm
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Dave Young,
	linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov, Coiby Xu, dwmw2

Hi Andrew,

On 02/07/25 at 04:08pm, Coiby Xu wrote:
......snip...

Ping again.

This patchset adds generic infrastructure for LUKS support in crash
dumping, and adds that support for the x86 ARCH. Since the x86-related
changes are located only in kdump-only files, they won't impact other
x86 code. Could you consider picking this into your tree?

And by the way, this is a kdump-only fix, not related to KHO (Kexec
HandOver), which David earlier suggested adapting to. I explain it here
to avoid misunderstanding.

Thanks
Baoquan

>  Documentation/admin-guide/kdump/kdump.rst |  32 ++
>  arch/x86/kernel/crash.c                   |  26 +-
>  arch/x86/kernel/kexec-bzimage64.c         |  11 +
>  arch/x86/kernel/machine_kexec_64.c        |  22 ++
>  include/linux/crash_core.h                |   7 +-
>  include/linux/crash_dump.h                |   2 +
>  include/linux/kexec.h                     |  34 ++
>  kernel/Kconfig.kexec                      |  10 +
>  kernel/Makefile                           |   1 +
>  kernel/crash_dump_dm_crypt.c              | 459 ++++++++++++++++++++++
>  kernel/kexec_file.c                       |   3 +
>  11 files changed, 604 insertions(+), 3 deletions(-)
>  create mode 100644 kernel/crash_dump_dm_crypt.c
> 
> 
> base-commit: bb066fe812d6fb3a9d01c073d9f1e2fd5a63403b
> -- 
> 2.48.1
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys
  2025-02-24  1:36   ` Baoquan He
@ 2025-03-21  6:54     ` Coiby Xu
  0 siblings, 0 replies; 24+ messages in thread
From: Coiby Xu @ 2025-03-21  6:54 UTC (permalink / raw)
  To: akpm, Dave Hansen, Baoquan He
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Dave Young,
	linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov

On Mon, Feb 24, 2025 at 09:36:48AM +0800, Baoquan He wrote:
>Hi Andrew,
>
>On 02/11/25 at 06:25pm, Baoquan He wrote:
>> On 02/07/25 at 04:08pm, Coiby Xu wrote:
>> > LUKS is the standard for Linux disk encryption, widely adopted by users,
>> > and in some cases, such as Confidential VMs, it is a requirement. With
>> > kdump enabled, when the first kernel crashes, the system can boot into
>> > the kdump/crash kernel to dump the memory image (i.e., /proc/vmcore)
>> > to a specified target. However, there are two challenges when dumping
>> > vmcore to a LUKS-encrypted device:
>> > [...]
>> >
>> > This patch set only supports x86. There will be patches to support other
>> > architectures once this patch set gets merged.
>
>Could you pick this patchset into your tree since no concern from other
>reviewers?

Thanks to Baoquan for endorsing the patch set!

Hi Andrew and Dave,

If there is anything further I need to do, any suggestion or feedback
will be appreciated!

Or, if it's more appropriate for Dave to take the patch set through the
x86 tree, that would be even better.

>
>Thanks
>Baoquan
>
>>
>> This v8 looks good to me, thanks for the great effort, Coiby.
>>
>> Acked-by: Baoquan He <bhe@redhat.com>
>>
>

-- 
Best regards,
Coiby


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys
  2025-02-07  8:08 [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Coiby Xu
                   ` (8 preceding siblings ...)
  2025-03-10  3:30 ` Baoquan He
@ 2025-04-14  5:44 ` Baoquan He
  2025-04-24  0:08 ` Arnaud Lefebvre
  10 siblings, 0 replies; 24+ messages in thread
From: Baoquan He @ 2025-04-14  5:44 UTC (permalink / raw)
  To: x86, akpm, Coiby Xu
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Dave Young,
	linux-kernel, Dave Hansen, Vitaly Kuznetsov

Hi X86 maintainers, Andrew,

On 02/07/25 at 04:08pm, Coiby Xu wrote:
......snip...
> This patch set only supports x86. There will be patches to support other
> architectures once this patch set gets merged.

Who can help pick up this patchset? It has been through many rounds of
review and is now ready for merging from the kdump reviewers' side.
Or are there any comments or concerns that need further work?

Thanks
Baoquan

> 
> v8
>  - improve documentation [Randy]
>  - rebase onto 6.14.0-rc1
> 
> v7
>  - Baoquan
>    - differentiate between failing to get dm crypt keys and no dm crypt keys
>    - add code comments, change function name and etc. to improve code readability
>  - add documentation for configfs API [Dave]
>  - fix building error found by kernel test robot
> 
> v6
>  - Baoquan
>    - support AMD SEV
>    - drop uncessary keys_header_size
>    - improve commit message of [PATCH 4/7]
>  
>  - Greg
>    - switch to configfs
>    - move ifdef from .c to .h files and rework kexec_random_start
>    - use tab instead of space for appended code comment
>  
>  - Process key description in a more flexible way to address problems
>    found by Ondrej
>  - improve cover letter
>  - fix an compilation error as found by kernel test robot 
> 
> v5
>  - Baoquan
>    - limit the feature of placing kexec_buf randomly to kdump (CONFIG_CRASH_DUMP)
>    - add documentation for added sysfs API 
>    - allow to re-send init command to support the case of user switching to
>      a different LUKS-encrypted target
>    - make CONFIG_CRASH_DM_CRYPT depends on CONFIG_DM_CRYPT
>    - check if the number of keys exceed KEY_NUM_MAX
>    - rename (struct keys_header).key_count as (struct keys_header).total_keys
>      to improve code readability
>    - improve commit message
>    - fix the failure of calling crash_exclude_mem_range (there is a split
>      of mem_range)
>    - use ret instead of r as return code
>  
>  - Greg
>    - add documentation for added sysfs API 
>    - avoid spamming kernel logs 
>    - fix a buffer overflow issue
>    - keep the state enums synced up with the string values
>    - use sysfs_emit other than sprintf
>    - explain KEY_NUM_MAX and KEY_SIZE_MAX
>    - s/EXPORT_SYMBOL_GPL/EXPORT_SYMBOL/g
>    - improve code readability
>  
>  - Rebase onto latest Linus tree
> 
> 
> v4
> - rebase onto latest Linus tree so Baoquan can apply the patches for
>   code review
> - fix kernel test robot warnings
> 
> v3
>  - Support CPU/memory hot-plugging [Baoquan]
>  - Don't save the keys temporarily to simplify the implementation [Baoquan]
>  - Support multiple LUKS encrypted volumes
>  - Read logon key instead of user key to improve security [Ondrej]
>  - A kernel config option CRASH_DM_CRYPT for this feature (disabled by default)
>  - Fix warnings found by kernel test robot
>  - Rebase the code onto 6.9.0-rc5+
> 
> v2
>  - work together with libscryptsetup's --link-vk-to-keyring/--volume-key-keyring APIs [Milan and Ondrej]
>  - add the case where console virtual keyboard is untrusted for confidential VM
>  - use dm_crypt_key instead of LUKS volume key [Milan and Eric]
>  - fix some code format issues
>  - don't move "struct kexec_segment" declaration
>  - Rebase the code onto latest Linus tree (6.7.0)
> 
> v1
>  - "Put the luks key handling related to crash_dump out into a separate
>    file kernel/crash_dump_luks.c" [Baoquan]
>  - Put the generic luks handling code before the x86 specific code to
>    make it easier for other arches to follow suit [Baoquan]
>  - Use phys_to_virt instead of "pfn -> page -> vaddr" [Dave Hansen]
>  - Drop the RFC prefix [Dave Young]
>  - Rebase the code onto latest Linus tree (6.4.0-rc4)
> 
> RFC v2
>  - libcryptsetup interacts with the kernel via sysfs instead of "hacking"
>    dm-crypt
>    - to save a kdump copy of the LUKS volume key in 1st kernel
>    - to add a logon key using the copy for libcryptsetup in kdump kernel [Milan]
>    - to avoid the incorrect usage of LUKS master key in dm-crypt [Milan]
>  - save the kdump copy of LUKS volume key randomly [Jan]
>  - mark the kdump copy inaccessible [Pingfan]
>  - Miscellaneous
>    - explain when operations related to the LUKS volume key happen [Jan]
>    - s/master key/volume key/g
>    - use crash_ instead of kexec_ as function prefix
>    - fix commit subject prefixes e.g. "x86, kdump" to x86/crash
> 
> 
> Coiby Xu (7):
>   kexec_file: allow to place kexec_buf randomly
>   crash_dump: make dm crypt keys persist for the kdump kernel
>   crash_dump: store dm crypt keys in kdump reserved memory
>   crash_dump: reuse saved dm crypt keys for CPU/memory hot-plugging
>   crash_dump: retrieve dm crypt keys in kdump kernel
>   x86/crash: pass dm crypt keys to kdump kernel
>   x86/crash: make the page that stores the dm crypt keys inaccessible
> 
>  Documentation/admin-guide/kdump/kdump.rst |  32 ++
>  arch/x86/kernel/crash.c                   |  26 +-
>  arch/x86/kernel/kexec-bzimage64.c         |  11 +
>  arch/x86/kernel/machine_kexec_64.c        |  22 ++
>  include/linux/crash_core.h                |   7 +-
>  include/linux/crash_dump.h                |   2 +
>  include/linux/kexec.h                     |  34 ++
>  kernel/Kconfig.kexec                      |  10 +
>  kernel/Makefile                           |   1 +
>  kernel/crash_dump_dm_crypt.c              | 459 ++++++++++++++++++++++
>  kernel/kexec_file.c                       |   3 +
>  11 files changed, 604 insertions(+), 3 deletions(-)
>  create mode 100644 kernel/crash_dump_dm_crypt.c
> 
> 
> base-commit: bb066fe812d6fb3a9d01c073d9f1e2fd5a63403b
> -- 
> 2.48.1
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 2/7] crash_dump: make dm crypt keys persist for the kdump kernel
  2025-02-07  8:08 ` [PATCH v8 2/7] crash_dump: make dm crypt keys persist for the kdump kernel Coiby Xu
@ 2025-04-23 20:44   ` Arnaud Lefebvre
  2025-04-29  9:34     ` Coiby Xu
  0 siblings, 1 reply; 24+ messages in thread
From: Arnaud Lefebvre @ 2025-04-23 20:44 UTC (permalink / raw)
  To: Coiby Xu
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Vivek Goyal, Jonathan Corbet, open list:DOCUMENTATION

On Fri, Feb 07, 2025 at 04:08:10PM +0800, Coiby Xu wrote:
>+config CRASH_DM_CRYPT
>+	bool "Support saving crash dump to dm-crypt encrypted volume"
>+	depends on KEXEC_FILE
>+	depends on CRASH_DUMP
>+	depends on DM_CRYPT
>+	help
>+	  With this option enabled, user space can intereact with
>+	  /sys/kernel/config/crash_dm_crypt_keys to make the dm crypt keys
>+	  persistent for the dump-capture kernel.
>+

Maybe also add a dependency on CONFIG_CONFIGFS_FS? Without it, the code
in this series fails to link:

Last build lines:

    GEN     modules.builtin
    MODPOST vmlinux.symvers
    UPD     include/generated/utsversion.h
    CC      init/version-timestamp.o
    KSYMS   .tmp_vmlinux0.kallsyms.S
    AS      .tmp_vmlinux0.kallsyms.o
    LD      .tmp_vmlinux1
  ld: vmlinux.o: in function `config_keys_make_item':
  /usr/src/linux/kernel/crash_dump_dm_crypt.c:250:(.text+0x228028): undefined reference to `config_item_init_type_name'
  ld: vmlinux.o: in function `configfs_dmcrypt_keys_init':
  /usr/src/linux/kernel/crash_dump_dm_crypt.c:442:(.init.text+0x71e5c): undefined reference to `config_group_init'
  ld: /usr/src/linux/kernel/crash_dump_dm_crypt.c:444:(.init.text+0x71e82): undefined reference to `configfs_register_subsystem'
  ld: /usr/src/linux/kernel/crash_dump_dm_crypt.c:454:(.init.text+0x71ef7): undefined reference to `configfs_unregister_subsystem'
  make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
  make[1]: *** [/usr/src/linux/Makefile:1226: vmlinux] Error 2
  make: *** [Makefile:251: __sub-make] Error 2


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 6/7] x86/crash: pass dm crypt keys to kdump kernel
  2025-02-07  8:08 ` [PATCH v8 6/7] x86/crash: pass dm crypt keys to " Coiby Xu
@ 2025-04-23 20:59   ` Arnaud Lefebvre
  2025-04-29  9:40     ` Coiby Xu
  0 siblings, 1 reply; 24+ messages in thread
From: Arnaud Lefebvre @ 2025-04-23 20:59 UTC (permalink / raw)
  To: Coiby Xu
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Vivek Goyal, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin,
	open list:DOCUMENTATION

>diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
>index 68530fad05f7..5604a5109858 100644
>--- a/arch/x86/kernel/kexec-bzimage64.c
>+++ b/arch/x86/kernel/kexec-bzimage64.c
>@@ -76,6 +76,10 @@ static int setup_cmdline(struct kimage *image, struct boot_params *params,
> 	if (image->type == KEXEC_TYPE_CRASH) {
> 		len = sprintf(cmdline_ptr,
> 			"elfcorehdr=0x%lx ", image->elf_load_addr);
>+
>+		if (image->dm_crypt_keys_addr != 0)
>+			len += sprintf(cmdline_ptr + len,
>+					"dmcryptkeys=0x%lx ", image->dm_crypt_keys_addr);
> 	}
> 	memcpy(cmdline_ptr + len, cmdline, cmdline_len);
> 	cmdline_len += len;

You are adding another kernel parameter, but I believe its length is not
taken into account. See the MAX_ELFCOREHDR_STR_LEN constant, which is
added to the params_cmdline_sz variable for the elfcorehdr= parameter.

This will (at least in my tests) truncate the cmdline given to the crash
kernel, because the next section (efi_map_offset) starts at an offset
inside the cmdline section and may overwrite the end of it:

kexec-bzimage64.c:480:
params_cmdline_sz = sizeof(struct boot_params) + cmdline_len +
			MAX_ELFCOREHDR_STR_LEN; <<< Should have + 31 here for "dmcryptkeys=0x<ptr> "
params_cmdline_sz = ALIGN(params_cmdline_sz, 16);
kbuf.bufsz = params_cmdline_sz + ALIGN(efi_map_sz, 16) +
			sizeof(struct setup_data) +
			sizeof(struct efi_setup_data) +
			sizeof(struct setup_data) +
			RNG_SEED_LENGTH;

And I believe the buffer might be too small.

Also, there is another check a few lines above that needs to take the size into account:

/*
  * In case of crash dump, we will append elfcorehdr=<addr> to
  * command line. Make sure it does not overflow
  */
if (cmdline_len + MAX_ELFCOREHDR_STR_LEN > header->cmdline_size) {
	pr_err("Appending elfcorehdr=<addr> to command line exceeds maximum allowed length\n");
	return ERR_PTR(-EINVAL);
}
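
To illustrate the sizing concern above, here is a minimal user-space
sketch, not the kernel code itself. MAX_DMCRYPTKEYS_STR_LEN,
crash_cmdline_extra() and build_crash_cmdline() are hypothetical names
introduced only for this example; MAX_ELFCOREHDR_STR_LEN is assumed to be
30, and the two format strings are the ones from the hunk quoted above.
The idea is to reserve worst-case room for both prepended parameters
before the user command line is copied:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* "elfcorehdr=0x" (13) + 16 hex digits + trailing space (assumed value) */
#define MAX_ELFCOREHDR_STR_LEN	30
/* Hypothetical: "dmcryptkeys=0x" (14) + 16 hex digits + trailing space */
#define MAX_DMCRYPTKEYS_STR_LEN	31

static_assert(sizeof("elfcorehdr=0x") - 1 + 16 + 1 == MAX_ELFCOREHDR_STR_LEN,
	      "elfcorehdr= worst case");
static_assert(sizeof("dmcryptkeys=0x") - 1 + 16 + 1 == MAX_DMCRYPTKEYS_STR_LEN,
	      "dmcryptkeys= worst case");

/* Worst-case number of bytes prepended to the user command line. */
static size_t crash_cmdline_extra(int has_dm_crypt_keys)
{
	return MAX_ELFCOREHDR_STR_LEN +
	       (has_dm_crypt_keys ? MAX_DMCRYPTKEYS_STR_LEN : 0);
}

/*
 * Build "elfcorehdr=... [dmcryptkeys=...] <cmdline>" into buf.
 * Returns the resulting string length, or -EINVAL if the worst case
 * would not fit in bufsz bytes.
 */
static int build_crash_cmdline(char *buf, size_t bufsz, const char *cmdline,
			       unsigned long elf_load_addr,
			       unsigned long dm_crypt_keys_addr)
{
	size_t cmdline_len = strlen(cmdline) + 1;	/* includes the NUL */
	size_t len;

	/* Check both parameters up front, not only elfcorehdr=. */
	if (cmdline_len + crash_cmdline_extra(dm_crypt_keys_addr != 0) > bufsz)
		return -EINVAL;

	len = (size_t)snprintf(buf, bufsz, "elfcorehdr=0x%lx ", elf_load_addr);
	if (dm_crypt_keys_addr != 0)
		len += (size_t)snprintf(buf + len, bufsz - len,
					"dmcryptkeys=0x%lx ",
					dm_crypt_keys_addr);

	memcpy(buf + len, cmdline, cmdline_len);
	return (int)(len + cmdline_len - 1);
}
```

In the kernel, the same worst-case term would presumably have to be added
both to the overflow check against header->cmdline_size and to
params_cmdline_sz, so that efi_map_offset starts after the full cmdline.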

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys
  2025-02-07  8:08 [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Coiby Xu
                   ` (9 preceding siblings ...)
  2025-04-14  5:44 ` Baoquan He
@ 2025-04-24  0:08 ` Arnaud Lefebvre
  2025-04-28  9:02   ` Coiby Xu
  10 siblings, 1 reply; 24+ messages in thread
From: Arnaud Lefebvre @ 2025-04-24  0:08 UTC (permalink / raw)
  To: Coiby Xu
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov

On Fri, Feb 07, 2025 at 04:08:08PM +0800, Coiby Xu wrote:
>LUKS is the standard for Linux disk encryption, widely adopted by users,
>and in some cases, such as Confidential VMs, it is a requirement. With
>kdump enabled, when the first kernel crashes, the system can boot into
>the kdump/crash kernel to dump the memory image (i.e., /proc/vmcore)
>to a specified target. However, there are two challenges when dumping
>vmcore to a LUKS-encrypted device:
>
> - Kdump kernel may not be able to decrypt the LUKS partition. For some
>   machines, a system administrator may not have a chance to enter the
>   password to decrypt the device in kdump initramfs after the 1st kernel
>   crashes; For cloud confidential VMs, depending on the policy the
>   kdump kernel may not be able to unseal the keys with TPM and the
>   console virtual keyboard is untrusted.
>
> - LUKS2 by default use the memory-hard Argon2 key derivation function
>   which is quite memory-consuming compared to the limited memory reserved
>   for kdump. Take Fedora example, by default, only 256M is reserved for
>   systems having memory between 4G-64G. With LUKS enabled, ~1300M needs
>   to be reserved for kdump. Note if the memory reserved for kdump can't
>   be used by 1st kernel i.e. an user sees ~1300M memory missing in the
>   1st kernel.
>
>Besides users (at least for Fedora) usually expect kdump to work out of
>the box i.e. no manual password input or custom crashkernel value is
>needed. And it doesn't make sense to derivate the keys again in kdump
>kernel which seems to be redundant work.
>
>This patch set addresses the above issues by making the LUKS volume keys
>persistent for kdump kernel with the help of cryptsetup's new APIs
>(--link-vk-to-keyring/--volume-key-keyring). Here is the life cycle of
>the kdump copies of LUKS volume keys,
>
> 1. After the 1st kernel loads the initramfs during boot, systemd
>    use an user-input passphrase to de-crypt the LUKS volume keys
>    or TPM-sealed key and then save the volume keys to specified keyring
>    (using the --link-vk-to-keyring API) and the key will expire within
>    specified time.
>
> 2. A user space tool (kdump initramfs loader like kdump-utils) create
>    key items inside /sys/kernel/config/crash_dm_crypt_keys to inform
>    the 1st kernel which keys are needed.
>
> 3. When the kdump initramfs is loaded by the kexec_file_load
>    syscall, the 1st kernel will iterate created key items, save the
>    keys to kdump reserved memory.
>
> 4. When the 1st kernel crashes and the kdump initramfs is booted, the
>    kdump initramfs asks the kdump kernel to create a user key using the
>    key stored in kdump reserved memory by writing yes to
>    /sys/kernel/crash_dm_crypt_keys/restore. Then the LUKS encrypted
>    device is unlocked with libcryptsetup's --volume-key-keyring API.
>
> 5. The system gets rebooted to the 1st kernel after dumping vmcore to
>    the LUKS encrypted device is finished
>
>After libcryptsetup saving the LUKS volume keys to specified keyring,
>whoever takes this should be responsible for the safety of these copies
>of keys. The keys will be saved in the memory area exclusively reserved
>for kdump where even the 1st kernel has no direct access. And further
>more, two additional protections are added,
> - save the copy randomly in kdump reserved memory as suggested by Jan
> - clear the _PAGE_PRESENT flag of the page that stores the copy as
>   suggested by Pingfan
>
>This patch set only supports x86. There will be patches to support other
>architectures once this patch set gets merged.
>

I'm not sure what the problem is here, but I can reliably trigger a kernel
panic on a QEMU VM with a custom kernel (compiled from
bb066fe812d6fb3a9d01c073d9f1e2fd5a63403b + your patches).

When I configure the crash configfs and call kexec in a systemd service
using ExecStart=, the panic occurs when I start the service:

~ # cat /etc/systemd/system/my-kexec.service
[Unit]
Description=kexec loading for the crash capture kernel

[Service]
Type=oneshot
ExecStart=/usr/bin/mkdir /sys/kernel/config/crash_dm_crypt_keys/mykey
ExecStart=/usr/bin/echo cryptsetup:mykey >/sys/kernel/config/crash_dm_crypt_keys/mykey/description
ExecStart=/usr/host/bin/kexec --debug --load-panic /linux-hv --initrd /crash-initrd
KeyringMode=shared

[Install]
WantedBy=default.target


Starting the service:

~ # systemctl start my-kexec.service
kexec_file: kernel: 00000000ace85dcc kernel_size: 0x16e3000
crash_core: Crash PT_LOAD ELF header. phdr=00000000d08940fa
vaddr=0xffff888000100000, paddr=0x100000, sz=0x700000 e_phnum=11
p_offset=0x100000
crash_core: Crash PT_LOAD ELF header. phdr=00000000304ef570
vaddr=0xffff888000808000, paddr=0x808000, sz=0x3000 e_phnum=12
p_offset=0x808000
crash_core: Crash PT_LOAD ELF header. phdr=000000000275e248
vaddr=0xffff88800080c000, paddr=0x80c000, sz=0x5000 e_phnum=13
p_offset=0x80c000
crash_core: Crash PT_LOAD ELF header. phdr=000000004e47ca09
vaddr=0xffff888000900000, paddr=0x900000, sz=0xa5700000 e_phnum=14
p_offset=0x900000
crash_core: Crash PT_LOAD ELF header. phdr=00000000e56c8350
vaddr=0xffff8880b6000000, paddr=0xb6000000, sz=0x7d51018 e_phnum=15
p_offset=0xb6000000
crash_core: Crash PT_LOAD ELF header. phdr=0000000099d67ff3
vaddr=0xffff8880bdd51018, paddr=0xbdd51018, sz=0x27440 e_phnum=16
p_offset=0xbdd51018
crash_core: Crash PT_LOAD ELF header. phdr=00000000461a2f21
vaddr=0xffff8880bdd78458, paddr=0xbdd78458, sz=0xbc0 e_phnum=17
p_offset=0xbdd78458
crash_core: Crash PT_LOAD ELF header. phdr=0000000058149b54
vaddr=0xffff8880bdd79018, paddr=0xbdd79018, sz=0x9a40 e_phnum=18
p_offset=0xbdd79018
crash_core: Crash PT_LOAD ELF header. phdr=000000001e30ff2c
vaddr=0xffff8880bdd82a58, paddr=0xbdd82a58, sz=0xdbc5a8 e_phnum=19
p_offset=0xbdd82a58
crash_core: Crash PT_LOAD ELF header. phdr=00000000e67a9768
vaddr=0xffff8880bec00000, paddr=0xbec00000, sz=0xaed000 e_phnum=20
p_offset=0xbec00000
crash_core: Crash PT_LOAD ELF header. phdr=000000005909c4c6
vaddr=0xffff8880bf9ff000, paddr=0xbf9ff000, sz=0x453000 e_phnum=21
p_offset=0xbf9ff000
crash_core: Crash PT_LOAD ELF header. phdr=00000000473d74ef
vaddr=0xffff8880bfe58000, paddr=0xbfe58000, sz=0x64000 e_phnum=22
p_offset=0xbfe58000
crash_core: Crash PT_LOAD ELF header. phdr=00000000abde8123
vaddr=0xffff888100000000, paddr=0x100000000, sz=0x23f000000 e_phnum=23
p_offset=0x100000000
crash_core: Crash PT_LOAD ELF header. phdr=00000000bda3e0bf
vaddr=0xffff88843f000000, paddr=0x43f000000, sz=0x1000000 e_phnum=24
p_offset=0x43f000000
kexec: Loaded ELF headers at 0x33f000000 bufsz=0x1000 memsz=0xe1000
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 5 UID: 0 PID: 3812 Comm: kexec Not tainted 6.14.0-rc1+ #20
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 2025.02-6
04/08/2025
RIP: 0010:sized_strscpy+0x71/0x150
Code: b9 80 80 80 80 80 80 80 80 48 c1 e8 03 48 8d 1c c5 08 00 00 00 31 c0 eb
11
48 89 34 07 48 83 c0 08 48 39 d8 0f 84 83 00 00 00 <49> 8b 34 00 4a 8d 14 1e 49
89 f2 49 f7 d2 4c 21 d2 4c 8d 14 07 4c
RSP: 0018:ffffc9000420fc68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000080
RDX: 0000000000000080 RSI: 0000000000000000 RDI: ffff8881030ec808
RBP: ffff888109724000 R08: 0000000000000000 R09: 8080808080808080
R10: ffffc9000420fc78 R11: fefefefefefefeff R12: ffffc90004219000
R13: ffff888104a80000 R14: 0000000000000008 R15: 0000000000000000
FS:  00007f09ea73f740(0000) GS:ffff88843fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000120760002 CR4: 0000000000772ef0
PKRU: 55555554
Call Trace:
  <TASK>
  ? __die+0x23/0x60
  ? page_fault_oops+0x177/0x510
  ? _prb_read_valid+0x2e7/0x370
  ? exc_page_fault+0x6f/0x130
  ? asm_exc_page_fault+0x26/0x30
  ? sized_strscpy+0x71/0x150
  crash_load_dm_crypt_keys+0x1bc/0x370
  bzImage64_load+0x41b/0xa30
  __do_sys_kexec_file_load+0x2af/0x8a0
  do_syscall_64+0x4b/0x110
  entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f09ea848d6d
Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48
89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01
c3 48 8b 0d 6b 70 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007fff8cf979e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000140
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f09ea848d6d
RDX: 0000000000000001 RSI: 0000000000000004 RDI: 0000000000000003
RBP: 0000000000000003 R08: 000000000000000a R09: 00007fff8cf97a10
R10: 000055de70eee9a0 R11: 0000000000000206 R12: 0000000000000003
R13: 00007fff8cf97d08 R14: 000055de4c336448 R15: 0000000000000004
  </TASK>
CR2: 0000000000000000
---[ end trace 0000000000000000 ]---
RIP: 0010:sized_strscpy+0x71/0x150
Code: b9 80 80 80 80 80 80 80 80 48 c1 e8 03 48 8d 1c c5 08 00 00 00 31 c0 eb
11 48 89 34 07 48 83 c0 08 48 39 d8 0f 84 83 00 00 00 <49> 8b 34 00 4a 8d 14 1e
49 89 f2 49 f7 d2 4c 21 d2 4c 8d 14 07 4c
RSP: 0018:ffffc9000420fc68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000080
RDX: 0000000000000080 RSI: 0000000000000000 RDI: ffff8881030ec808
RBP: ffff888109724000 R08: 0000000000000000 R09: 8080808080808080
R10: ffffc9000420fc78 R11: fefefefefefefeff R12: ffffc90004219000
R13: ffff888104a80000 R14: 0000000000000008 R15: 0000000000000000
FS:  00007f09ea73f740(0000) GS:ffff88843fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000120760002 CR4: 0000000000772ef0
PKRU: 55555554
Kernel panic - not syncing: Fatal exception
Kernel Offset: disabled


Calling a script that does the same thing works fine and loads the keys
correctly:

[Service]
ExecStart=/root/kexec.sh

~ # cat /root/kexec.sh
#!/bin/bash

mkdir /sys/kernel/config/crash_dm_crypt_keys/mykey
echo cryptsetup:mykey > /sys/kernel/config/crash_dm_crypt_keys/mykey/description
/usr/host/bin/kexec --debug --load-panic /linux-hv --initrd /crash-initrd

If that's any help, my crypttab:

~ # cat /etc/crypttab
root UUID=8001fca4-2e54-48e9-9235-031c19fc6e36 none luks,link-volume-key=@u::%logon:cryptsetup:mykey

If you can't reproduce it, I can help track this down. Just let me know
if you need any help.
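
A possible explanation, unverified: systemd does not run ExecStart= lines
through a shell, so ">" is passed to echo as a literal argument and the
description attribute is never written. The fault at address 0 inside
sized_strscpy would be consistent with copying from a description that is
still NULL. The user-space sketch below shows the kind of guard that would
turn this into an error return instead of an oops. struct key_item,
copy_key_description() and KEY_DESC_MAX_LEN are hypothetical stand-ins for
illustration, not the names used in the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Assumption: 0x80, the count that appears in RDX in the oops above. */
#define KEY_DESC_MAX_LEN	128

/* Hypothetical stand-in for a configfs key item. */
struct key_item {
	const char *description;	/* NULL until user space writes it */
};

/*
 * Copy the key description into dst.
 * Returns 0 on success, -ENOENT if no description was ever set,
 * or -E2BIG if it does not fit (including the terminating NUL).
 */
static int copy_key_description(char *dst, size_t dst_len,
				const struct key_item *item)
{
	size_t n;

	assert(dst != NULL && dst_len > 0);

	/* Reject items whose description was never written. */
	if (item == NULL || item->description == NULL)
		return -ENOENT;

	n = strlen(item->description);
	if (n >= dst_len)
		return -E2BIG;

	memcpy(dst, item->description, n + 1);
	return 0;
}
```

With a check like this, loading with an incomplete key item would fail
the kexec_file_load call rather than crash the 1st kernel.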


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys
  2025-04-24  0:08 ` Arnaud Lefebvre
@ 2025-04-28  9:02   ` Coiby Xu
  2025-04-28 18:40     ` Arnaud Lefebvre
  0 siblings, 1 reply; 24+ messages in thread
From: Coiby Xu @ 2025-04-28  9:02 UTC (permalink / raw)
  To: Arnaud Lefebvre
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov

On Thu, Apr 24, 2025 at 02:08:55AM +0200, Arnaud Lefebvre wrote:
>On Fri, Feb 07, 2025 at 04:08:08PM +0800, Coiby Xu wrote:
>>LUKS is the standard for Linux disk encryption, widely adopted by users,
>>and in some cases, such as Confidential VMs, it is a requirement. With
>>kdump enabled, when the first kernel crashes, the system can boot into
>>the kdump/crash kernel to dump the memory image (i.e., /proc/vmcore)
>>to a specified target. However, there are two challenges when dumping
>>vmcore to a LUKS-encrypted device:
>>
>>- Kdump kernel may not be able to decrypt the LUKS partition. For some
>>  machines, a system administrator may not have a chance to enter the
>>  password to decrypt the device in kdump initramfs after the 1st kernel
>>  crashes; For cloud confidential VMs, depending on the policy the
>>  kdump kernel may not be able to unseal the keys with TPM and the
>>  console virtual keyboard is untrusted.
>>
>>- LUKS2 by default use the memory-hard Argon2 key derivation function
>>  which is quite memory-consuming compared to the limited memory reserved
>>  for kdump. Take Fedora example, by default, only 256M is reserved for
>>  systems having memory between 4G-64G. With LUKS enabled, ~1300M needs
>>  to be reserved for kdump. Note if the memory reserved for kdump can't
>>  be used by 1st kernel i.e. an user sees ~1300M memory missing in the
>>  1st kernel.
>>
>>Besides users (at least for Fedora) usually expect kdump to work out of
>>the box i.e. no manual password input or custom crashkernel value is
>>needed. And it doesn't make sense to derivate the keys again in kdump
>>kernel which seems to be redundant work.
>>
>>This patch set addresses the above issues by making the LUKS volume keys
>>persistent for kdump kernel with the help of cryptsetup's new APIs
>>(--link-vk-to-keyring/--volume-key-keyring). Here is the life cycle of
>>the kdump copies of LUKS volume keys,
>>
>>1. After the 1st kernel loads the initramfs during boot, systemd
>>   use an user-input passphrase to de-crypt the LUKS volume keys
>>   or TPM-sealed key and then save the volume keys to specified keyring
>>   (using the --link-vk-to-keyring API) and the key will expire within
>>   specified time.
>>
>>2. A user space tool (kdump initramfs loader like kdump-utils) create
>>   key items inside /sys/kernel/config/crash_dm_crypt_keys to inform
>>   the 1st kernel which keys are needed.
>>
>>3. When the kdump initramfs is loaded by the kexec_file_load
>>   syscall, the 1st kernel will iterate created key items, save the
>>   keys to kdump reserved memory.
>>
>>4. When the 1st kernel crashes and the kdump initramfs is booted, the
>>   kdump initramfs asks the kdump kernel to create a user key using the
>>   key stored in kdump reserved memory by writing yes to
>>   /sys/kernel/crash_dm_crypt_keys/restore. Then the LUKS encrypted
>>   device is unlocked with libcryptsetup's --volume-key-keyring API.
>>
>>5. The system gets rebooted to the 1st kernel after dumping vmcore to
>>   the LUKS encrypted device is finished
>>
>>After libcryptsetup saving the LUKS volume keys to specified keyring,
>>whoever takes this should be responsible for the safety of these copies
>>of keys. The keys will be saved in the memory area exclusively reserved
>>for kdump where even the 1st kernel has no direct access. And further
>>more, two additional protections are added,
>>- save the copy randomly in kdump reserved memory as suggested by Jan
>>- clear the _PAGE_PRESENT flag of the page that stores the copy as
>>  suggested by Pingfan
>>
>>This patch set only supports x86. There will be patches to support other
>>architectures once this patch set gets merged.
>>
>
>I'm not sure what's the problem here but I can reliably trigger a kernel
>panic on a qemu VM + custom kernel (compiled from
>bb066fe812d6fb3a9d01c073d9f1e2fd5a63403b + your patches).

Hi Arnaud,

Thanks for testing the patches, finding this issue and also sharing the
details to reproduce it!

>
>When I configure the crash configfs and call kexec in a systemd service
>using ExecStart=, the panic occurs when I start the service:
>
>~ # cat /etc/systemd/system/my-kexec.service
>[Unit]
>Description=kexec loading for the crash capture kernel
>
>[Service]
>Type=oneshot
>ExecStart=/usr/bin/mkdir /sys/kernel/config/crash_dm_crypt_keys/mykey
>ExecStart=/usr/bin/echo cryptsetup:mykey >/sys/kernel/config/crash_dm_crypt_keys/mykey/description
>ExecStart=/usr/host/bin/kexec --debug --load-panic /linux-hv --initrd /crash-initrd
>KeyringMode=shared

Can you try putting the above commands into a script e.g.
/usr/local/bin/my-kexec.sh and then using
ExecStart=/usr/local/bin/my-kexec.sh
so I can be more sure that I've reproduced your issue?

>
>[Install]
>WantedBy=default.target
>
>
>Starting the service:
>
>~ # systemctl start my-kexec.service
>kexec_file: kernel: 00000000ace85dcc kernel_size: 0x16e3000
>crash_core: Crash PT_LOAD ELF header. phdr=00000000d08940fa
>vaddr=0xffff888000100000, paddr=0x100000, sz=0x700000 e_phnum=11
>p_offset=0x100000
>crash_core: Crash PT_LOAD ELF header. phdr=00000000304ef570
>vaddr=0xffff888000808000, paddr=0x808000, sz=0x3000 e_phnum=12
>p_offset=0x808000
>crash_core: Crash PT_LOAD ELF header. phdr=000000000275e248
>vaddr=0xffff88800080c000, paddr=0x80c000, sz=0x5000 e_phnum=13
>p_offset=0x80c000
>crash_core: Crash PT_LOAD ELF header. phdr=000000004e47ca09
>vaddr=0xffff888000900000, paddr=0x900000, sz=0xa5700000 e_phnum=14
>p_offset=0x900000
>crash_core: Crash PT_LOAD ELF header. phdr=00000000e56c8350
>vaddr=0xffff8880b6000000, paddr=0xb6000000, sz=0x7d51018 e_phnum=15
>p_offset=0xb6000000
>crash_core: Crash PT_LOAD ELF header. phdr=0000000099d67ff3
>vaddr=0xffff8880bdd51018, paddr=0xbdd51018, sz=0x27440 e_phnum=16
>p_offset=0xbdd51018
>crash_core: Crash PT_LOAD ELF header. phdr=00000000461a2f21
>vaddr=0xffff8880bdd78458, paddr=0xbdd78458, sz=0xbc0 e_phnum=17
>p_offset=0xbdd78458
>crash_core: Crash PT_LOAD ELF header. phdr=0000000058149b54
>vaddr=0xffff8880bdd79018, paddr=0xbdd79018, sz=0x9a40 e_phnum=18
>p_offset=0xbdd79018
>crash_core: Crash PT_LOAD ELF header. phdr=000000001e30ff2c
>vaddr=0xffff8880bdd82a58, paddr=0xbdd82a58, sz=0xdbc5a8 e_phnum=19
>p_offset=0xbdd82a58
>crash_core: Crash PT_LOAD ELF header. phdr=00000000e67a9768
>vaddr=0xffff8880bec00000, paddr=0xbec00000, sz=0xaed000 e_phnum=20
>p_offset=0xbec00000
>crash_core: Crash PT_LOAD ELF header. phdr=000000005909c4c6
>vaddr=0xffff8880bf9ff000, paddr=0xbf9ff000, sz=0x453000 e_phnum=21
>p_offset=0xbf9ff000
>crash_core: Crash PT_LOAD ELF header. phdr=00000000473d74ef
>vaddr=0xffff8880bfe58000, paddr=0xbfe58000, sz=0x64000 e_phnum=22
>p_offset=0xbfe58000
>crash_core: Crash PT_LOAD ELF header. phdr=00000000abde8123
>vaddr=0xffff888100000000, paddr=0x100000000, sz=0x23f000000 e_phnum=23
>p_offset=0x100000000
>crash_core: Crash PT_LOAD ELF header. phdr=00000000bda3e0bf
>vaddr=0xffff88843f000000, paddr=0x43f000000, sz=0x1000000 e_phnum=24
>p_offset=0x43f000000
>kexec: Loaded ELF headers at 0x33f000000 bufsz=0x1000 memsz=0xe1000
>BUG: kernel NULL pointer dereference, address: 0000000000000000
>#PF: supervisor read access in kernel mode
>#PF: error_code(0x0000) - not-present page
>PGD 0 P4D 0
>Oops: Oops: 0000 [#1] SMP NOPTI
>CPU: 5 UID: 0 PID: 3812 Comm: kexec Not tainted 6.14.0-rc1+ #20
>Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 2025.02-6
>04/08/2025
>RIP: 0010:sized_strscpy+0x71/0x150
>Code: b9 80 80 80 80 80 80 80 80 48 c1 e8 03 48 8d 1c c5 08 00 00 00 31 c0 eb
>11
>48 89 34 07 48 83 c0 08 48 39 d8 0f 84 83 00 00 00 <49> 8b 34 00 4a 8d 14 1e 49
>89 f2 49 f7 d2 4c 21 d2 4c 8d 14 07 4c
>RSP: 0018:ffffc9000420fc68 EFLAGS: 00010246
>RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000080
>RDX: 0000000000000080 RSI: 0000000000000000 RDI: ffff8881030ec808
>RBP: ffff888109724000 R08: 0000000000000000 R09: 8080808080808080
>R10: ffffc9000420fc78 R11: fefefefefefefeff R12: ffffc90004219000
>R13: ffff888104a80000 R14: 0000000000000008 R15: 0000000000000000
>FS:  00007f09ea73f740(0000) GS:ffff88843fc80000(0000) knlGS:0000000000000000
>CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>CR2: 0000000000000000 CR3: 0000000120760002 CR4: 0000000000772ef0
>PKRU: 55555554
>Call Trace:
> <TASK>
> ? __die+0x23/0x60
> ? page_fault_oops+0x177/0x510
> ? _prb_read_valid+0x2e7/0x370
> ? exc_page_fault+0x6f/0x130
> ? asm_exc_page_fault+0x26/0x30
> ? sized_strscpy+0x71/0x150
> crash_load_dm_crypt_keys+0x1bc/0x370
> bzImage64_load+0x41b/0xa30
> __do_sys_kexec_file_load+0x2af/0x8a0
> do_syscall_64+0x4b/0x110
> entry_SYSCALL_64_after_hwframe+0x76/0x7e
>RIP: 0033:0x7f09ea848d6d
>Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48
>89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01
>c3 48 8b 0d 6b 70 0d 00 f7 d8 64 89 01 48
>RSP: 002b:00007fff8cf979e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000140
>RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f09ea848d6d
>RDX: 0000000000000001 RSI: 0000000000000004 RDI: 0000000000000003
>RBP: 0000000000000003 R08: 000000000000000a R09: 00007fff8cf97a10
>R10: 000055de70eee9a0 R11: 0000000000000206 R12: 0000000000000003
>R13: 00007fff8cf97d08 R14: 000055de4c336448 R15: 0000000000000004
> </TASK>
>CR2: 0000000000000000
>---[ end trace 0000000000000000 ]---
>RIP: 0010:sized_strscpy+0x71/0x150
>Code: b9 80 80 80 80 80 80 80 80 48 c1 e8 03 48 8d 1c c5 08 00 00 00 31 c0 eb
>11 48 89 34 07 48 83 c0 08 48 39 d8 0f 84 83 00 00 00 <49> 8b 34 00 4a 8d 14 1e
>49 89 f2 49 f7 d2 4c 21 d2 4c 8d 14 07 4c
>RSP: 0018:ffffc9000420fc68 EFLAGS: 00010246
>RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000080
>RDX: 0000000000000080 RSI: 0000000000000000 RDI: ffff8881030ec808
>RBP: ffff888109724000 R08: 0000000000000000 R09: 8080808080808080
>R10: ffffc9000420fc78 R11: fefefefefefefeff R12: ffffc90004219000
>R13: ffff888104a80000 R14: 0000000000000008 R15: 0000000000000000
>FS:  00007f09ea73f740(0000) GS:ffff88843fc80000(0000) knlGS:0000000000000000
>CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>CR2: 0000000000000000 CR3: 0000000120760002 CR4: 0000000000772ef0
>PKRU: 55555554
>Kernel panic - not syncing: Fatal exception
>Kernel Offset: disabled
>
>
>Calling a script that does the same thing works fine and loads the keys
>correctly:
>
>[Service]
>ExecStart=/root/kexec.sh
>
>~ # cat /root/kexec.sh
>#!/bin/bash
>
>mkdir /sys/kernel/config/crash_dm_crypt_keys/mykey
>echo cryptsetup:mykey > /sys/kernel/config/crash_dm_crypt_keys/mykey/description
>/usr/host/bin/kexec --debug --load-panic /linux-hv --initrd /crash-initrd
>
>If that's any help, my crypttab:
>
>~ # cat /etc/crypttab
>root UUID=8001fca4-2e54-48e9-9235-031c19fc6e36 none luks,link-volume-key=@u::%logon:cryptsetup:mykey
>
>If you can't reproduce, I can help track this. Just let me know if you need
>any help.
>

-- 
Best regards,
Coiby


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys
  2025-04-28  9:02   ` Coiby Xu
@ 2025-04-28 18:40     ` Arnaud Lefebvre
  2025-04-28 23:56       ` Coiby Xu
  0 siblings, 1 reply; 24+ messages in thread
From: Arnaud Lefebvre @ 2025-04-28 18:40 UTC (permalink / raw)
  To: Coiby Xu
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov

On Mon, Apr 28, 2025 at 05:02:23PM +0800, Coiby Xu wrote:
>On Thu, Apr 24, 2025 at 02:08:55AM +0200, Arnaud Lefebvre wrote:
>>On Fri, Feb 07, 2025 at 04:08:08PM +0800, Coiby Xu wrote:
>>>LUKS is the standard for Linux disk encryption, widely adopted by users,
>>>and in some cases, such as Confidential VMs, it is a requirement. With
>>>kdump enabled, when the first kernel crashes, the system can boot into
>>>the kdump/crash kernel to dump the memory image (i.e., /proc/vmcore)
>>>to a specified target. However, there are two challenges when dumping
>>>vmcore to a LUKS-encrypted device:
>>>
>>>- Kdump kernel may not be able to decrypt the LUKS partition. For some
>>> machines, a system administrator may not have a chance to enter the
>>> password to decrypt the device in kdump initramfs after the 1st kernel
>>> crashes; For cloud confidential VMs, depending on the policy the
>>> kdump kernel may not be able to unseal the keys with TPM and the
>>> console virtual keyboard is untrusted.
>>>
>>>- LUKS2 by default use the memory-hard Argon2 key derivation function
>>> which is quite memory-consuming compared to the limited memory reserved
>>> for kdump. Take Fedora example, by default, only 256M is reserved for
>>> systems having memory between 4G-64G. With LUKS enabled, ~1300M needs
>>> to be reserved for kdump. Note if the memory reserved for kdump can't
>>> be used by 1st kernel i.e. an user sees ~1300M memory missing in the
>>> 1st kernel.
>>>
>>>Besides users (at least for Fedora) usually expect kdump to work out of
>>>the box i.e. no manual password input or custom crashkernel value is
>>>needed. And it doesn't make sense to derivate the keys again in kdump
>>>kernel which seems to be redundant work.
>>>
>>>This patch set addresses the above issues by making the LUKS volume keys
>>>persistent for kdump kernel with the help of cryptsetup's new APIs
>>>(--link-vk-to-keyring/--volume-key-keyring). Here is the life cycle of
>>>the kdump copies of LUKS volume keys,
>>>
>>>1. After the 1st kernel loads the initramfs during boot, systemd
>>>  use an user-input passphrase to de-crypt the LUKS volume keys
>>>  or TPM-sealed key and then save the volume keys to specified keyring
>>>  (using the --link-vk-to-keyring API) and the key will expire within
>>>  specified time.
>>>
>>>2. A user space tool (kdump initramfs loader like kdump-utils) create
>>>  key items inside /sys/kernel/config/crash_dm_crypt_keys to inform
>>>  the 1st kernel which keys are needed.
>>>
>>>3. When the kdump initramfs is loaded by the kexec_file_load
>>>  syscall, the 1st kernel will iterate created key items, save the
>>>  keys to kdump reserved memory.
>>>
>>>4. When the 1st kernel crashes and the kdump initramfs is booted, the
>>>  kdump initramfs asks the kdump kernel to create a user key using the
>>>  key stored in kdump reserved memory by writing yes to
>>>  /sys/kernel/crash_dm_crypt_keys/restore. Then the LUKS encrypted
>>>  device is unlocked with libcryptsetup's --volume-key-keyring API.
>>>
>>>5. The system gets rebooted to the 1st kernel after dumping vmcore to
>>>  the LUKS encrypted device is finished
>>>
>>>After libcryptsetup saving the LUKS volume keys to specified keyring,
>>>whoever takes this should be responsible for the safety of these copies
>>>of keys. The keys will be saved in the memory area exclusively reserved
>>>for kdump where even the 1st kernel has no direct access. And further
>>>more, two additional protections are added,
>>>- save the copy randomly in kdump reserved memory as suggested by Jan
>>>- clear the _PAGE_PRESENT flag of the page that stores the copy as
>>> suggested by Pingfan
>>>
>>>This patch set only supports x86. There will be patches to support other
>>>architectures once this patch set gets merged.
>>>
>>
>>I'm not sure what's the problem here but I can reliably trigger a kernel
>>panic on a qemu VM + custom kernel (compiled from
>>bb066fe812d6fb3a9d01c073d9f1e2fd5a63403b + your patches).
>
>Hi Arnaud,
>
>Thanks for testing the patches, finding this issue and also sharing the
>details to reproduce it!
>

Hello,

You're welcome, and thank you for this patch series!

>>
>>When I configure the crash configfs and call kexec in a systemd service
>>using ExecStart=, the panic occurs when I start the service:
>>
>>~ # cat /etc/systemd/system/my-kexec.service
>>[Unit]
>>Description=kexec loading for the crash capture kernel
>>
>>[Service]
>>Type=oneshot
>>ExecStart=/usr/bin/mkdir /sys/kernel/config/crash_dm_crypt_keys/mykey
>>ExecStart=/usr/bin/echo cryptsetup:mykey >/sys/kernel/config/crash_dm_crypt_keys/mykey/description
>>ExecStart=/usr/host/bin/kexec --debug --load-panic /linux-hv --initrd /crash-initrd
>>KeyringMode=shared
>
>Can you try putting the above commands into a script e.g.
>/usr/local/bin/my-kexec.sh and then using
>ExecStart=/usr/local/bin/my-kexec.sh
>so I can be more sure that I've reproduced your issue?
>

I believe that's what I wrote at the end of my previous message
(see below, after the panic trace). It works fine when using a script
like that.

Did you miss it, or is what you're asking for different?

>>
>>[Install]
>>WantedBy=default.target
>>
>>
>>Starting the service:
>>
>>~ # systemctl start my-kexec.service
>>kexec_file: kernel: 00000000ace85dcc kernel_size: 0x16e3000
>>crash_core: Crash PT_LOAD ELF header. phdr=00000000d08940fa
>>vaddr=0xffff888000100000, paddr=0x100000, sz=0x700000 e_phnum=11
>>p_offset=0x100000
>>crash_core: Crash PT_LOAD ELF header. phdr=00000000304ef570
>>vaddr=0xffff888000808000, paddr=0x808000, sz=0x3000 e_phnum=12
>>p_offset=0x808000
>>crash_core: Crash PT_LOAD ELF header. phdr=000000000275e248
>>vaddr=0xffff88800080c000, paddr=0x80c000, sz=0x5000 e_phnum=13
>>p_offset=0x80c000
>>crash_core: Crash PT_LOAD ELF header. phdr=000000004e47ca09
>>vaddr=0xffff888000900000, paddr=0x900000, sz=0xa5700000 e_phnum=14
>>p_offset=0x900000
>>crash_core: Crash PT_LOAD ELF header. phdr=00000000e56c8350
>>vaddr=0xffff8880b6000000, paddr=0xb6000000, sz=0x7d51018 e_phnum=15
>>p_offset=0xb6000000
>>crash_core: Crash PT_LOAD ELF header. phdr=0000000099d67ff3
>>vaddr=0xffff8880bdd51018, paddr=0xbdd51018, sz=0x27440 e_phnum=16
>>p_offset=0xbdd51018
>>crash_core: Crash PT_LOAD ELF header. phdr=00000000461a2f21
>>vaddr=0xffff8880bdd78458, paddr=0xbdd78458, sz=0xbc0 e_phnum=17
>>p_offset=0xbdd78458
>>crash_core: Crash PT_LOAD ELF header. phdr=0000000058149b54
>>vaddr=0xffff8880bdd79018, paddr=0xbdd79018, sz=0x9a40 e_phnum=18
>>p_offset=0xbdd79018
>>crash_core: Crash PT_LOAD ELF header. phdr=000000001e30ff2c
>>vaddr=0xffff8880bdd82a58, paddr=0xbdd82a58, sz=0xdbc5a8 e_phnum=19
>>p_offset=0xbdd82a58
>>crash_core: Crash PT_LOAD ELF header. phdr=00000000e67a9768
>>vaddr=0xffff8880bec00000, paddr=0xbec00000, sz=0xaed000 e_phnum=20
>>p_offset=0xbec00000
>>crash_core: Crash PT_LOAD ELF header. phdr=000000005909c4c6
>>vaddr=0xffff8880bf9ff000, paddr=0xbf9ff000, sz=0x453000 e_phnum=21
>>p_offset=0xbf9ff000
>>crash_core: Crash PT_LOAD ELF header. phdr=00000000473d74ef
>>vaddr=0xffff8880bfe58000, paddr=0xbfe58000, sz=0x64000 e_phnum=22
>>p_offset=0xbfe58000
>>crash_core: Crash PT_LOAD ELF header. phdr=00000000abde8123
>>vaddr=0xffff888100000000, paddr=0x100000000, sz=0x23f000000 e_phnum=23
>>p_offset=0x100000000
>>crash_core: Crash PT_LOAD ELF header. phdr=00000000bda3e0bf
>>vaddr=0xffff88843f000000, paddr=0x43f000000, sz=0x1000000 e_phnum=24
>>p_offset=0x43f000000
>>kexec: Loaded ELF headers at 0x33f000000 bufsz=0x1000 memsz=0xe1000
>>BUG: kernel NULL pointer dereference, address: 0000000000000000
>>#PF: supervisor read access in kernel mode
>>#PF: error_code(0x0000) - not-present page
>>PGD 0 P4D 0
>>Oops: Oops: 0000 [#1] SMP NOPTI
>>CPU: 5 UID: 0 PID: 3812 Comm: kexec Not tainted 6.14.0-rc1+ #20
>>Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 2025.02-6
>>04/08/2025
>>RIP: 0010:sized_strscpy+0x71/0x150
>>Code: b9 80 80 80 80 80 80 80 80 48 c1 e8 03 48 8d 1c c5 08 00 00 00 31 c0 eb
>>11
>>48 89 34 07 48 83 c0 08 48 39 d8 0f 84 83 00 00 00 <49> 8b 34 00 4a 8d 14 1e 49
>>89 f2 49 f7 d2 4c 21 d2 4c 8d 14 07 4c
>>RSP: 0018:ffffc9000420fc68 EFLAGS: 00010246
>>RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000080
>>RDX: 0000000000000080 RSI: 0000000000000000 RDI: ffff8881030ec808
>>RBP: ffff888109724000 R08: 0000000000000000 R09: 8080808080808080
>>R10: ffffc9000420fc78 R11: fefefefefefefeff R12: ffffc90004219000
>>R13: ffff888104a80000 R14: 0000000000000008 R15: 0000000000000000
>>FS:  00007f09ea73f740(0000) GS:ffff88843fc80000(0000) knlGS:0000000000000000
>>CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>CR2: 0000000000000000 CR3: 0000000120760002 CR4: 0000000000772ef0
>>PKRU: 55555554
>>Call Trace:
>><TASK>
>>? __die+0x23/0x60
>>? page_fault_oops+0x177/0x510
>>? _prb_read_valid+0x2e7/0x370
>>? exc_page_fault+0x6f/0x130
>>? asm_exc_page_fault+0x26/0x30
>>? sized_strscpy+0x71/0x150
>>crash_load_dm_crypt_keys+0x1bc/0x370
>>bzImage64_load+0x41b/0xa30
>>__do_sys_kexec_file_load+0x2af/0x8a0
>>do_syscall_64+0x4b/0x110
>>entry_SYSCALL_64_after_hwframe+0x76/0x7e
>>RIP: 0033:0x7f09ea848d6d
>>Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48
>>89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01
>>c3 48 8b 0d 6b 70 0d 00 f7 d8 64 89 01 48
>>RSP: 002b:00007fff8cf979e8 EFLAGS: 00000206 ORIG_RAX: 0000000000000140
>>RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f09ea848d6d
>>RDX: 0000000000000001 RSI: 0000000000000004 RDI: 0000000000000003
>>RBP: 0000000000000003 R08: 000000000000000a R09: 00007fff8cf97a10
>>R10: 000055de70eee9a0 R11: 0000000000000206 R12: 0000000000000003
>>R13: 00007fff8cf97d08 R14: 000055de4c336448 R15: 0000000000000004
>></TASK>
>>CR2: 0000000000000000
>>---[ end trace 0000000000000000 ]---
>>RIP: 0010:sized_strscpy+0x71/0x150
>>Code: b9 80 80 80 80 80 80 80 80 48 c1 e8 03 48 8d 1c c5 08 00 00 00 31 c0 eb
>>11 48 89 34 07 48 83 c0 08 48 39 d8 0f 84 83 00 00 00 <49> 8b 34 00 4a 8d 14 1e
>>49 89 f2 49 f7 d2 4c 21 d2 4c 8d 14 07 4c
>>RSP: 0018:ffffc9000420fc68 EFLAGS: 00010246
>>RAX: 0000000000000000 RBX: 0000000000000080 RCX: 0000000000000080
>>RDX: 0000000000000080 RSI: 0000000000000000 RDI: ffff8881030ec808
>>RBP: ffff888109724000 R08: 0000000000000000 R09: 8080808080808080
>>R10: ffffc9000420fc78 R11: fefefefefefefeff R12: ffffc90004219000
>>R13: ffff888104a80000 R14: 0000000000000008 R15: 0000000000000000
>>FS:  00007f09ea73f740(0000) GS:ffff88843fc80000(0000) knlGS:0000000000000000
>>CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>CR2: 0000000000000000 CR3: 0000000120760002 CR4: 0000000000772ef0
>>PKRU: 55555554
>>Kernel panic - not syncing: Fatal exception
>>Kernel Offset: disabled
>>
>>
>>Calling a script that does the same thing works fine and loads the keys
>>correctly:
>>
>>[Service]
>>ExecStart=/root/kexec.sh
>>
>>~ # cat /root/kexec.sh
>>#!/bin/bash
>>
>>mkdir /sys/kernel/config/crash_dm_crypt_keys/mykey
>>echo cryptsetup:mykey > /sys/kernel/config/crash_dm_crypt_keys/mykey/description
>>/usr/host/bin/kexec --debug --load-panic /linux-hv --initrd /crash-initrd
>>
>>If that's any help, my crypttab:
>>
>>~ # cat /etc/crypttab
>>root UUID=8001fca4-2e54-48e9-9235-031c19fc6e36 none luks,link-volume-key=@u::%logon:cryptsetup:mykey
>>
>>If you can't reproduce, I can help track this. Just let me know if you need
>>any help.
>>
>
>-- 
>Best regards,
>Coiby
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys
  2025-04-28 18:40     ` Arnaud Lefebvre
@ 2025-04-28 23:56       ` Coiby Xu
  0 siblings, 0 replies; 24+ messages in thread
From: Coiby Xu @ 2025-04-28 23:56 UTC (permalink / raw)
  To: Arnaud Lefebvre
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov

On Mon, Apr 28, 2025 at 08:40:44PM +0200, Arnaud Lefebvre wrote:
>On Mon, Apr 28, 2025 at 05:02:23PM +0800, Coiby Xu wrote:
>>On Thu, Apr 24, 2025 at 02:08:55AM +0200, Arnaud Lefebvre wrote:
>>>On Fri, Feb 07, 2025 at 04:08:08PM +0800, Coiby Xu wrote:
>>>>LUKS is the standard for Linux disk encryption, widely adopted by users,
>>>>and in some cases, such as Confidential VMs, it is a requirement. With
>>>>kdump enabled, when the first kernel crashes, the system can boot into
>>>>the kdump/crash kernel to dump the memory image (i.e., /proc/vmcore)
>>>>to a specified target. However, there are two challenges when dumping
>>>>vmcore to a LUKS-encrypted device:
>>>>
>>>>- Kdump kernel may not be able to decrypt the LUKS partition. For some
>>>>machines, a system administrator may not have a chance to enter the
>>>>password to decrypt the device in kdump initramfs after the 1st kernel
>>>>crashes; For cloud confidential VMs, depending on the policy the
>>>>kdump kernel may not be able to unseal the keys with TPM and the
>>>>console virtual keyboard is untrusted.
>>>>
>>>>- LUKS2 by default use the memory-hard Argon2 key derivation function
>>>>which is quite memory-consuming compared to the limited memory reserved
>>>>for kdump. Take Fedora example, by default, only 256M is reserved for
>>>>systems having memory between 4G-64G. With LUKS enabled, ~1300M needs
>>>>to be reserved for kdump. Note if the memory reserved for kdump can't
>>>>be used by 1st kernel i.e. an user sees ~1300M memory missing in the
>>>>1st kernel.
>>>>
>>>>Besides users (at least for Fedora) usually expect kdump to work out of
>>>>the box i.e. no manual password input or custom crashkernel value is
>>>>needed. And it doesn't make sense to derivate the keys again in kdump
>>>>kernel which seems to be redundant work.
>>>>
>>>>This patch set addresses the above issues by making the LUKS volume keys
>>>>persistent for kdump kernel with the help of cryptsetup's new APIs
>>>>(--link-vk-to-keyring/--volume-key-keyring). Here is the life cycle of
>>>>the kdump copies of LUKS volume keys,
>>>>
>>>>1. After the 1st kernel loads the initramfs during boot, systemd
>>>> use an user-input passphrase to de-crypt the LUKS volume keys
>>>> or TPM-sealed key and then save the volume keys to specified keyring
>>>> (using the --link-vk-to-keyring API) and the key will expire within
>>>> specified time.
>>>>
>>>>2. A user space tool (kdump initramfs loader like kdump-utils) create
>>>> key items inside /sys/kernel/config/crash_dm_crypt_keys to inform
>>>> the 1st kernel which keys are needed.
>>>>
>>>>3. When the kdump initramfs is loaded by the kexec_file_load
>>>> syscall, the 1st kernel will iterate created key items, save the
>>>> keys to kdump reserved memory.
>>>>
>>>>4. When the 1st kernel crashes and the kdump initramfs is booted, the
>>>> kdump initramfs asks the kdump kernel to create a user key using the
>>>> key stored in kdump reserved memory by writing yes to
>>>> /sys/kernel/crash_dm_crypt_keys/restore. Then the LUKS encrypted
>>>> device is unlocked with libcryptsetup's --volume-key-keyring API.
>>>>
>>>>5. The system gets rebooted to the 1st kernel after dumping vmcore to
>>>> the LUKS encrypted device is finished
>>>>
>>>>After libcryptsetup saving the LUKS volume keys to specified keyring,
>>>>whoever takes this should be responsible for the safety of these copies
>>>>of keys. The keys will be saved in the memory area exclusively reserved
>>>>for kdump where even the 1st kernel has no direct access. And further
>>>>more, two additional protections are added,
>>>>- save the copy randomly in kdump reserved memory as suggested by Jan
>>>>- clear the _PAGE_PRESENT flag of the page that stores the copy as
>>>>suggested by Pingfan
>>>>
>>>>This patch set only supports x86. There will be patches to support other
>>>>architectures once this patch set gets merged.
>>>>
>>>
>>>I'm not sure what's the problem here but I can reliably trigger a kernel
>>>panic on a qemu VM + custom kernel (compiled from
>>>bb066fe812d6fb3a9d01c073d9f1e2fd5a63403b + your patches).
>>
>>Hi Arnaud,
>>
>>Thanks for testing the patches, finding this issue and also sharing the
>>details to reproduce it!
>>
>
>Hello,
>
>You're welcome, and thank you for this patch series!
>
>>>
>>>When I configure the crash configfs and call kexec in a systemd service
>>>using ExecStart=, the panic occurs when I start the service:
>>>
>>>~ # cat /etc/systemd/system/my-kexec.service
>>>[Unit]
>>>Description=kexec loading for the crash capture kernel
>>>
>>>[Service]
>>>Type=oneshot
>>>ExecStart=/usr/bin/mkdir /sys/kernel/config/crash_dm_crypt_keys/mykey
>>>ExecStart=/usr/bin/echo cryptsetup:mykey >/sys/kernel/config/crash_dm_crypt_keys/mykey/description
>>>ExecStart=/usr/host/bin/kexec --debug --load-panic /linux-hv --initrd /crash-initrd
>>>KeyringMode=shared
>>
>>Can you try putting the above commands into a script e.g.
>>/usr/local/bin/my-kexec.sh and then using
>>ExecStart=/usr/local/bin/my-kexec.sh
>>so I can be more sure that I've reproduced your issue?
>>
>
>I believe that's what I wrote at the end of my previous message
>(see below, after the panic trace). It works fine when using a script
>like that.
>
>Did you miss it, or is what you're asking for different?

Oh, I missed it, thanks for the reminder! Now I'm sure I have reproduced
the issue and understand how it happens. systemd doesn't invoke bash to
run ExecStart= commands, so the redirection ">" is not applied. You will
see a message like the following in the systemd logs:
   warning: ignoring excess arguments, starting with ‘>/sys/kernel/config/...

So the key's configfs item is created, but its description is never
set. Unfortunately, the kernel doesn't check whether the key description
is NULL and crashes when trying to copy it. I'll send a new version of
the patches to resolve this issue, thanks!
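
A minimal user-space sketch of the kind of check that is missing,
assuming a key item whose description stays NULL until user space writes
it. This is only an illustration, not the actual fix: struct key_item,
copy_key_desc() and KEY_DESC_MAX_LEN are hypothetical names, and the
kernel code would use its own types and strscpy().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound for a key description, including the NUL. */
#define KEY_DESC_MAX_LEN 128

/* Hypothetical stand-in for a configfs key item. */
struct key_item {
	const char *description;	/* NULL until user space sets it */
};

/*
 * Copy the description of a key item into dst.
 * Returns 0 on success, -EINVAL if the description was never set (or is
 * empty), and -E2BIG if it does not fit into dst.
 */
static int copy_key_desc(char *dst, size_t dst_size,
			 const struct key_item *item)
{
	size_t len;

	/* Reject the item instead of dereferencing a NULL description. */
	if (!item->description || item->description[0] == '\0')
		return -EINVAL;

	len = strlen(item->description);
	if (len >= dst_size)
		return -E2BIG;

	memcpy(dst, item->description, len + 1);	/* include the NUL */
	return 0;
}
```

With a check like this, loading the crash kernel would fail with an
error for an item that has no description, instead of panicking.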

>
> [...]
>>>Kernel panic - not syncing: Fatal exception
>>>Kernel Offset: disabled
>>>
>>>
>>>Calling a script that does the same thing works fine and loads the keys
>>>correctly:
>>>
>>>[Service]
>>>ExecStart=/root/kexec.sh
>>>
>>>~ # cat /root/kexec.sh
>>>#!/bin/bash
>>>
>>>mkdir /sys/kernel/config/crash_dm_crypt_keys/mykey
>>>echo cryptsetup:mykey > /sys/kernel/config/crash_dm_crypt_keys/mykey/description
>>>/usr/host/bin/kexec --debug --load-panic /linux-hv --initrd /crash-initrd
>>>
>>>If that's any help, my crypttab:
>>>
>>>~ # cat /etc/crypttab
>>>root UUID=8001fca4-2e54-48e9-9235-031c19fc6e36 none luks,link-volume-key=@u::%logon:cryptsetup:mykey
>>>
>>>If you can't reproduce, I can help track this. Just let me know if you need
>>>any help.
>>>
>>
>>-- 
>>Best regards,
>>Coiby
>>
>

-- 
Best regards,
Coiby


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 2/7] crash_dump: make dm crypt keys persist for the kdump kernel
  2025-04-23 20:44   ` Arnaud Lefebvre
@ 2025-04-29  9:34     ` Coiby Xu
  0 siblings, 0 replies; 24+ messages in thread
From: Coiby Xu @ 2025-04-29  9:34 UTC (permalink / raw)
  To: Arnaud Lefebvre
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Vivek Goyal, Jonathan Corbet, open list:DOCUMENTATION

On Wed, Apr 23, 2025 at 10:44:22PM +0200, Arnaud Lefebvre wrote:
>On Fri, Feb 07, 2025 at 04:08:10PM +0800, Coiby Xu wrote:
>>+config CRASH_DM_CRYPT
>>+	bool "Support saving crash dump to dm-crypt encrypted volume"
>>+	depends on KEXEC_FILE
>>+	depends on CRASH_DUMP
>>+	depends on DM_CRYPT
>>+	help
>>+	  With this option enabled, user space can intereact with
>>+	  /sys/kernel/config/crash_dm_crypt_keys to make the dm crypt keys
>>+	  persistent for the dump-capture kernel.
>>+
>
>Maybe also add CONFIG_CONFIGFS_FS option? Without it this series code doesn't compile:

I'll add the dependency on CONFIG_CONFIGFS_FS, thanks for your
suggestion!

>
>Last build lines:
>
>   GEN     modules.builtin
>   MODPOST vmlinux.symvers
>   UPD     include/generated/utsversion.h
>   CC      init/version-timestamp.o
>   KSYMS   .tmp_vmlinux0.kallsyms.S
>   AS      .tmp_vmlinux0.kallsyms.o
>   LD      .tmp_vmlinux1
> ld: vmlinux.o: in function `config_keys_make_item':
> /usr/src/linux/kernel/crash_dump_dm_crypt.c:250:(.text+0x228028): undefined reference to `config_item_init_type_name'
> ld: vmlinux.o: in function `configfs_dmcrypt_keys_init':
> /usr/src/linux/kernel/crash_dump_dm_crypt.c:442:(.init.text+0x71e5c): undefined reference to `config_group_init'
> ld: /usr/src/linux/kernel/crash_dump_dm_crypt.c:444:(.init.text+0x71e82): undefined reference to `configfs_register_subsystem'
> ld: /usr/src/linux/kernel/crash_dump_dm_crypt.c:454:(.init.text+0x71ef7): undefined reference to `configfs_unregister_subsystem'
> make[2]: *** [scripts/Makefile.vmlinux:77: vmlinux] Error 1
> make[1]: *** [/usr/src/linux/Makefile:1226: vmlinux] Error 2
> make: *** [Makefile:251: __sub-make] Error 2



-- 
Best regards,
Coiby


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 6/7] x86/crash: pass dm crypt keys to kdump kernel
  2025-04-23 20:59   ` Arnaud Lefebvre
@ 2025-04-29  9:40     ` Coiby Xu
  2025-04-30 14:48       ` Arnaud Lefebvre
  0 siblings, 1 reply; 24+ messages in thread
From: Coiby Xu @ 2025-04-29  9:40 UTC (permalink / raw)
  To: Arnaud Lefebvre
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Vivek Goyal, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin,
	open list:DOCUMENTATION

On Wed, Apr 23, 2025 at 10:59:06PM +0200, Arnaud Lefebvre wrote:
>>diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
>>index 68530fad05f7..5604a5109858 100644
>>--- a/arch/x86/kernel/kexec-bzimage64.c
>>+++ b/arch/x86/kernel/kexec-bzimage64.c
>>@@ -76,6 +76,10 @@ static int setup_cmdline(struct kimage *image, struct boot_params *params,
>>	if (image->type == KEXEC_TYPE_CRASH) {
>>		len = sprintf(cmdline_ptr,
>>			"elfcorehdr=0x%lx ", image->elf_load_addr);
>>+
>>+		if (image->dm_crypt_keys_addr != 0)
>>+			len += sprintf(cmdline_ptr + len,
>>+					"dmcryptkeys=0x%lx ", image->dm_crypt_keys_addr);

sprintf() returns the length of the "dmcryptkeys=xxx " string, which is
added to len.

>>	}
>>	memcpy(cmdline_ptr + len, cmdline, cmdline_len);
>>	cmdline_len += len;

Then cmdline_len will include the new len.

>
>You are adding another kernel parameter but I believe without taking its
>length into account. See the MAX_ELFCOREHDR_STR_LEN constant which is added to the
>params_cmdline_sz variable for the elfcorehdr= parameter.

Thanks for raising the concern! I believe this issue has already been
taken care of. Please check the above two inline comments :)


>
>This will (at least during my tests) truncate the cmdline given to the crash kernel because
>the next section (efi_map_offset) will have an offset starting inside the cmdline section
>and it might overwrite the end of it:
>
>kexec-bzimage64.c:480:
>params_cmdline_sz = sizeof(struct boot_params) + cmdline_len +
>			MAX_ELFCOREHDR_STR_LEN; <<< Should have + 31 here for "dmcryptkeys=0x<ptr> "
>params_cmdline_sz = ALIGN(params_cmdline_sz, 16);
>kbuf.bufsz = params_cmdline_sz + ALIGN(efi_map_sz, 16) +
>			sizeof(struct setup_data) +
>			sizeof(struct efi_setup_data) +
>			sizeof(struct setup_data) +
>			RNG_SEED_LENGTH;
>
>And I believe the buffer might be too small.
>
>Also, there is another check a few lines above that needs to take the size into account:
>
>/*
> * In case of crash dump, we will append elfcorehdr=<addr> to
> * command line. Make sure it does not overflow
> */
>if (cmdline_len + MAX_ELFCOREHDR_STR_LEN > header->cmdline_size) {
>	pr_err("Appending elfcorehdr=<addr> to command line exceeds maximum allowed length\n");
>	return ERR_PTR(-EINVAL);
>}
>

-- 
Best regards,
Coiby


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v8 6/7] x86/crash: pass dm crypt keys to kdump kernel
  2025-04-29  9:40     ` Coiby Xu
@ 2025-04-30 14:48       ` Arnaud Lefebvre
  2025-05-02  0:13         ` Coiby Xu
  0 siblings, 1 reply; 24+ messages in thread
From: Arnaud Lefebvre @ 2025-04-30 14:48 UTC (permalink / raw)
  To: Coiby Xu
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Vivek Goyal, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin,
	open list:DOCUMENTATION

On Tue, Apr 29, 2025 at 05:40:21PM +0800, Coiby Xu wrote:
>On Wed, Apr 23, 2025 at 10:59:06PM +0200, Arnaud Lefebvre wrote:
>>>diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
>>>index 68530fad05f7..5604a5109858 100644
>>>--- a/arch/x86/kernel/kexec-bzimage64.c
>>>+++ b/arch/x86/kernel/kexec-bzimage64.c
>>>@@ -76,6 +76,10 @@ static int setup_cmdline(struct kimage *image, struct boot_params *params,
>>>	if (image->type == KEXEC_TYPE_CRASH) {
>>>		len = sprintf(cmdline_ptr,
>>>			"elfcorehdr=0x%lx ", image->elf_load_addr);
>>>+
>>>+		if (image->dm_crypt_keys_addr != 0)
>>>+			len += sprintf(cmdline_ptr + len,
>>>+					"dmcryptkeys=0x%lx ", image->dm_crypt_keys_addr);
>
>sprintf() returns the length of the "dmcryptkeys=xxx " string, which is
>added to len.
>
>>>	}
>>>	memcpy(cmdline_ptr + len, cmdline, cmdline_len);
>>>	cmdline_len += len;
>
>Then cmdline_len will include the new len.

Yes, the cmdline_len is correct. No issue there.

>
>>
>>You are adding another kernel parameter but I believe without taking its
>>length into account. See the MAX_ELFCOREHDR_STR_LEN constant which is added to the
>>params_cmdline_sz variable for the elfcorehdr= parameter.
>
>Thanks for raising the concern! I believe this issue has already been
>taken care of. Please check the above two inline comments :)
>

I'm sorry but I don't think it is. If you look at my comments below:

>
>>
>>This will (at least during my tests) truncate the cmdline given to the crash kernel because
>>the next section (efi_map_offset) will have an offset starting inside the cmdline section
>>and it might overwrite the end of it:
>>
>>kexec-bzimage64.c:480:
>>params_cmdline_sz = sizeof(struct boot_params) + cmdline_len +
>>			MAX_ELFCOREHDR_STR_LEN; <<< Should have + 31 here for "dmcryptkeys=0x<ptr> "
>>params_cmdline_sz = ALIGN(params_cmdline_sz, 16);
>>kbuf.bufsz = params_cmdline_sz + ALIGN(efi_map_sz, 16) +
>>			sizeof(struct setup_data) +
>>			sizeof(struct efi_setup_data) +
>>			sizeof(struct setup_data) +
>>			RNG_SEED_LENGTH;
>>
>>And I believe the buffer might be too small.
>>
>>Also, there is another check a few lines above that needs to take the size into account:
>>
>>/*
>>* In case of crash dump, we will append elfcorehdr=<addr> to
>>* command line. Make sure it does not overflow
>>*/
>>if (cmdline_len + MAX_ELFCOREHDR_STR_LEN > header->cmdline_size) {
>>	pr_err("Appending elfcorehdr=<addr> to command line exceeds maximum allowed length\n");
>>	return ERR_PTR(-EINVAL);
>>}
>>
>

To try to explain a bit more, we pass a lot of arguments to the crash kernel so
the initrd (dracut) can mount the encrypted disk. When I run kexec using
the following:

/usr/host/bin/kexec --debug --load-panic /linux-hv '--append=maxcpus=1
reset_devices rd.info rd.cc.kdump
root=UUID=d039277c-2ee3-466a-85eb-db9524398135 console=ttyS0 rd.timeout=10
rd.shell=1 rd.cc.kdump.encrypted
rd.cc.kdump.device=UUID=908234b1-c1f3-4150-bfdf-c260907a2447
rd.cc.kdump.keyring=cryptsetup:908234b1-c1f3-4150-bfdf-c260907a2447' --initrd
/crash-initrd

kexec's debug output prints these logs:

<snip>
[   53.642483] kexec-bzImage64: Loaded purgatory at 0xb6ffb000
[   53.642828] kexec-bzImage64: Loaded boot_param, command line and misc at
0xb6ff9000 bufsz=0x12f0 memsz=0x2000
[   53.643366] kexec-bzImage64: Loaded 64bit kernel at 0xb1000000
bufsz=0x16a5000 memsz=0x550d000
[   53.643918] kexec-bzImage64: Loaded initrd at 0xaeb90000 bufsz=0x246f2a1
memsz=0x246f2a1
[   53.644363] kexec-bzImage64: Final command line is: elfcorehdr=0x77000000
dmcryptkeys=0xa81fc000 maxcpus=1 reset_devices rd.info rd.cc.kdump
root=UUID=d039277c-2ee3-466a-85eb-db9524398135  console=ttyS0 rd.timeout=10
rd.shell=1 rd.cc.kdump.encrypted
rd.cc.kdump.device=UUID=908234b1-c1f3-4150-bfdf-c260907a2447
rd.cc.kdump.keyring=cryptsetup:908234b1-c1f3-4150-bfdf-c260907a2447
<snip>

Here, we see the full command line, as expected. But when I trigger a panic
using `echo c > /proc/sysrq-trigger`, the first two lines of the crash kernel's
boot log are:

[    0.000000] Linux version 6.12.23+ (arnaud@exherbo) (gcc (GCC) 12.3.0, GNU ld
(GNU Binutils) 2.44) #4 SMP Wed Apr 30 16:11:39 CEST 2025
[    0.000000] Command line: elfcorehdr=0x77000000 dmcryptkeys=0x9ec14000
maxcpus=1 reset_devices rd.info rd.cc.kdump
root=UUID=d039277c-2ee3-466a-85eb-db9524398135 console=ttyS0 rd.timeout=10
rd.shell=1 rd.cc.kdump.encrypted
rd.cc.kdump.device=UUID=908234b1-c1f3-4150-bfdf-c260907a2447
rd.cc.kdump.keyring=cryptsetup:908234b1-c1f3-4150-bfdf-c26090

You can see that the end of it is truncated: it's missing `7a2447`. My guess is
that it gets overwritten.
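
A rough sketch of the arithmetic I suspect is behind this. It assumes the next
section starts right at params_cmdline_sz and that the command line is placed
directly after the boot params. cmdline_overlap(), ALIGN_UP and BOOT_PARAMS_SZ
are hypothetical names for illustration, and the numbers are not meant to
reproduce my exact setup:

```c
#include <stddef.h>

/* Round x up to the next multiple of a (a must be a power of two). */
#define ALIGN_UP(x, a) (((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Assumed stand-in for sizeof(struct boot_params); illustrative only. */
#define BOOT_PARAMS_SZ 4096u

/*
 * Number of command line bytes that land past the region reserved for
 * boot params + command line, i.e. inside the next section.
 *
 * user_cmdline_len: length of the user-supplied command line (with NUL)
 * prefix_len:       bytes actually prepended ("elfcorehdr=... dmcryptkeys=... ")
 * reserved_extra:   bytes added to params_cmdline_sz for the prefixes
 */
static size_t cmdline_overlap(size_t user_cmdline_len, size_t prefix_len,
			      size_t reserved_extra)
{
	size_t params_cmdline_sz =
		ALIGN_UP(BOOT_PARAMS_SZ + user_cmdline_len + reserved_extra, 16);
	size_t cmdline_end = BOOT_PARAMS_SZ + prefix_len + user_cmdline_len;

	return cmdline_end > params_cmdline_sz ?
		cmdline_end - params_cmdline_sz : 0;
}
```

With 32-bit-looking addresses like the ones in my logs, the two prefixes take
22 + 23 = 45 bytes while only 30 are reserved. Whether the tail is actually
clobbered then depends on how much slack the 16-byte alignment happens to
leave, which would explain why only a few characters go missing.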

My comment above explains where and why this might happen. If I add the length
of the dmcryptkeys string to the params_cmdline_sz variable, we allocate
enough space to hold it all. With the patch below, it works fine and I
get the full cmdline when my crash kernel boots:

[    0.000000] Linux version 6.12.23+ (arnaud@exherbo) (gcc (GCC) 12.3.0, GNU ld
(GNU Binutils) 2.44) #3 SMP Thu Apr 24 16:42:18 CEST 2025
[    0.000000] Command line: elfcorehdr=0x77000000 dmcryptkeys=0xa81fc000
maxcpus=1 reset_devices rd.info rd.cc.kdump
root=UUID=d039277c-2ee3-466a-85eb-db9524398135 console=ttyS0 rd.timeout=10
rd.shell=1 rd.cc.kdump.encrypted
rd.cc.kdump.device=UUID=908234b1-c1f3-4150-bfdf-c260907a2447
rd.cc.kdump.keyring=cryptsetup:908234b1-c1f3-4150-bfdf-c260907a2447


diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
index 5604a5109858..06fc1f412af4 100644
--- a/arch/x86/kernel/kexec-bzimage64.c
+++ b/arch/x86/kernel/kexec-bzimage64.c
@@ -27,6 +27,7 @@
  #include <asm/kexec-bzimage64.h>
  
  #define MAX_ELFCOREHDR_STR_LEN	30	/* elfcorehdr=0x<64bit-value> */
+#define MAX_DMCRYPTKEYS_STR_LEN 31
  
  /*
   * Defines lowest physical address for various segments. Not sure where
@@ -434,7 +435,7 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
  	 * In case of crash dump, we will append elfcorehdr=<addr> to
  	 * command line. Make sure it does not overflow
  	 */
-	if (cmdline_len + MAX_ELFCOREHDR_STR_LEN > header->cmdline_size) {
+	if (cmdline_len + MAX_ELFCOREHDR_STR_LEN + MAX_DMCRYPTKEYS_STR_LEN > header->cmdline_size) {
  		pr_err("Appending elfcorehdr=<addr> to command line exceeds maximum allowed length\n");
  		return ERR_PTR(-EINVAL);
  	}
@@ -478,7 +479,7 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
  	 */
  	efi_map_sz = efi_get_runtime_map_size();
  	params_cmdline_sz = sizeof(struct boot_params) + cmdline_len +
-				MAX_ELFCOREHDR_STR_LEN;
+				MAX_ELFCOREHDR_STR_LEN + MAX_DMCRYPTKEYS_STR_LEN;
  	params_cmdline_sz = ALIGN(params_cmdline_sz, 16);
  	kbuf.bufsz = params_cmdline_sz + ALIGN(efi_map_sz, 16) +
  				sizeof(struct setup_data) +
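

For reference, a small sketch of where the 31 comes from, following the same
convention as the existing elfcorehdr comment (prefix, "0x", up to 16 hex
digits for a 64-bit value, and the trailing space). cmdline_fits() is a
hypothetical helper that mirrors the updated overflow check. It is not code
from the patch:

```c
#include <stddef.h>

#define MAX_ELFCOREHDR_STR_LEN	30	/* elfcorehdr=0x<64bit-value> */
#define MAX_DMCRYPTKEYS_STR_LEN	31	/* dmcryptkeys=0x<64bit-value> */

/* sizeof("...") - 1 is the prefix length; + 16 hex digits + 1 space. */
_Static_assert(sizeof("elfcorehdr=0x") - 1 + 16 + 1 == MAX_ELFCOREHDR_STR_LEN,
	       "elfcorehdr= worst-case length");
_Static_assert(sizeof("dmcryptkeys=0x") - 1 + 16 + 1 == MAX_DMCRYPTKEYS_STR_LEN,
	       "dmcryptkeys= worst-case length");

/*
 * Hypothetical helper mirroring the updated check: returns nonzero when
 * the user command line plus both worst-case prefixes fits within
 * cmdline_size (header->cmdline_size in the patch).
 */
static int cmdline_fits(size_t cmdline_len, size_t cmdline_size)
{
	return cmdline_len + MAX_ELFCOREHDR_STR_LEN +
	       MAX_DMCRYPTKEYS_STR_LEN <= cmdline_size;
}
```

Reserving the worst case unconditionally costs a few bytes when no keys are
passed, but it keeps the size calculation independent of
image->dm_crypt_keys_addr.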


Let me know if it makes more sense!


* Re: [PATCH v8 6/7] x86/crash: pass dm crypt keys to kdump kernel
  2025-04-30 14:48       ` Arnaud Lefebvre
@ 2025-05-02  0:13         ` Coiby Xu
  0 siblings, 0 replies; 24+ messages in thread
From: Coiby Xu @ 2025-05-02  0:13 UTC (permalink / raw)
  To: Arnaud Lefebvre
  Cc: kexec, Ondrej Kozina, Milan Broz, Thomas Staudt,
	Daniel P . Berrangé, Kairui Song, Pingfan Liu, Baoquan He,
	Dave Young, linux-kernel, x86, Dave Hansen, Vitaly Kuznetsov,
	Vivek Goyal, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, H. Peter Anvin,
	open list:DOCUMENTATION

On Wed, Apr 30, 2025 at 04:48:25PM +0200, Arnaud Lefebvre wrote:
>On Tue, Apr 29, 2025 at 05:40:21PM +0800, Coiby Xu wrote:
>>On Wed, Apr 23, 2025 at 10:59:06PM +0200, Arnaud Lefebvre wrote:
>>>>diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
>>>>index 68530fad05f7..5604a5109858 100644
>>>>--- a/arch/x86/kernel/kexec-bzimage64.c
>>>>+++ b/arch/x86/kernel/kexec-bzimage64.c
>>>>@@ -76,6 +76,10 @@ static int setup_cmdline(struct kimage *image, struct boot_params *params,
>>>>	if (image->type == KEXEC_TYPE_CRASH) {
>>>>		len = sprintf(cmdline_ptr,
>>>>			"elfcorehdr=0x%lx ", image->elf_load_addr);
>>>>+
>>>>+		if (image->dm_crypt_keys_addr != 0)
>>>>+			len += sprintf(cmdline_ptr + len,
>>>>+					"dmcryptkeys=0x%lx ", image->dm_crypt_keys_addr);
>>
>>sprintf will return the length of dmcryptkeys=xxx which will be added to
>>len.
>>
>>>>	}
>>>>	memcpy(cmdline_ptr + len, cmdline, cmdline_len);
>>>>	cmdline_len += len;
>>
>>Then cmdline_len will include the new len.
>
>Yes, the cmdline_len is correct. No issue there.

Thanks for confirming it!

>
>>
>>>
>>>You are adding another kernel parameter but I believe without taking its
>>>length into account. See the MAX_ELFCOREHDR_STR_LEN constant which is added to the
>>>params_cmdline_sz variable for the elfcorehdr= parameter.
>>
>>Thanks for raising the concern! I believe this issue has already been
>>taken care of. Please check the above two inline comments:)
>>
>
>I'm sorry but I don't think it is. If you look at my comments below:
>
>>
>>>
>>>This will (at least during my tests) truncate the cmdline given to the crash kernel because
>>>the next section (efi_map_offset) will have an offset starting inside the cmdline section
>>>and it might overwrite the end of it:
>>>
>>>kexec-bzimage64.c:480:
>>>params_cmdline_sz = sizeof(struct boot_params) + cmdline_len +
>>>			MAX_ELFCOREHDR_STR_LEN; <<< Should have + 31 here for "dmcryptkeys=0x<ptr> "
>>>params_cmdline_sz = ALIGN(params_cmdline_sz, 16);
>>>kbuf.bufsz = params_cmdline_sz + ALIGN(efi_map_sz, 16) +
>>>			sizeof(struct setup_data) +
>>>			sizeof(struct efi_setup_data) +
>>>			sizeof(struct setup_data) +
>>>			RNG_SEED_LENGTH;
>>>
>>>And I believe the buffer might be too small.
>>>
>>>Also, there is another check a few lines above that needs to take the size into account:
>>>
>>>/*
>>>* In case of crash dump, we will append elfcorehdr=<addr> to
>>>* command line. Make sure it does not overflow
>>>*/
>>>if (cmdline_len + MAX_ELFCOREHDR_STR_LEN > header->cmdline_size) {
>>>	pr_err("Appending elfcorehdr=<addr> to command line exceeds maximum allowed length\n");
>>>	return ERR_PTR(-EINVAL);
>>>}
>>>
>>
>
>To try to explain a bit more, we pass a lot of arguments to the crash kernel so
>the initrd (dracut) can mount the encrypted disk. When I run kexec using
>the following:
>
>/usr/host/bin/kexec --debug --load-panic /linux-hv '--append=maxcpus=1
>reset_devices rd.info rd.cc.kdump
>root=UUID=d039277c-2ee3-466a-85eb-db9524398135 console=ttyS0 rd.timeout=10
>rd.shell=1 rd.cc.kdump.encrypted
>rd.cc.kdump.device=UUID=908234b1-c1f3-4150-bfdf-c260907a2447
>rd.cc.kdump.keyring=cryptsetup:908234b1-c1f3-4150-bfdf-c260907a2447' --initrd
>/crash-initrd
>
>kexec's debug output prints these logs:
>
><snip>
>[   53.642483] kexec-bzImage64: Loaded purgatory at 0xb6ffb000
>[   53.642828] kexec-bzImage64: Loaded boot_param, command line and misc at
>0xb6ff9000 bufsz=0x12f0 memsz=0x2000
>[   53.643366] kexec-bzImage64: Loaded 64bit kernel at 0xb1000000
>bufsz=0x16a5000 memsz=0x550d000
>[   53.643918] kexec-bzImage64: Loaded initrd at 0xaeb90000 bufsz=0x246f2a1
>memsz=0x246f2a1
>[   53.644363] kexec-bzImage64: Final command line is: elfcorehdr=0x77000000
>dmcryptkeys=0xa81fc000 maxcpus=1 reset_devices rd.info rd.cc.kdump
>root=UUID=d039277c-2ee3-466a-85eb-db9524398135  console=ttyS0 rd.timeout=10
>rd.shell=1 rd.cc.kdump.encrypted
>rd.cc.kdump.device=UUID=908234b1-c1f3-4150-bfdf-c260907a2447
>rd.cc.kdump.keyring=cryptsetup:908234b1-c1f3-4150-bfdf-c260907a2447
><snip>
>
>Here, we see the full command line, as expected. But when I trigger a panic
>using `echo c > /proc/sysrq-trigger`, the first two lines of the crash kernel's
>boot log are:
>
>[    0.000000] Linux version 6.12.23+ (arnaud@exherbo) (gcc (GCC) 12.3.0, GNU ld
>(GNU Binutils) 2.44) #4 SMP Wed Apr 30 16:11:39 CEST 2025
>[    0.000000] Command line: elfcorehdr=0x77000000 dmcryptkeys=0x9ec14000
>maxcpus=1 reset_devices rd.info rd.cc.kdump
>root=UUID=d039277c-2ee3-466a-85eb-db9524398135 console=ttyS0 rd.timeout=10
>rd.shell=1 rd.cc.kdump.encrypted
>rd.cc.kdump.device=UUID=908234b1-c1f3-4150-bfdf-c260907a2447
>rd.cc.kdump.keyring=cryptsetup:908234b1-c1f3-4150-bfdf-c26090
>
>You can see that the end of it is truncated: it's missing `7a2447`. My guess is
>that it gets overwritten.
>
>My comment above explains where and why this might happen. If I add the length
>of the dmcryptkeys string to the params_cmdline_sz variable, we allocate
>enough space to hold it all. With the patch below, it works fine and I
>get the full cmdline when my crash kernel boots:
>
>[    0.000000] Linux version 6.12.23+ (arnaud@exherbo) (gcc (GCC) 12.3.0, GNU ld
>(GNU Binutils) 2.44) #3 SMP Thu Apr 24 16:42:18 CEST 2025
>[    0.000000] Command line: elfcorehdr=0x77000000 dmcryptkeys=0xa81fc000
>maxcpus=1 reset_devices rd.info rd.cc.kdump
>root=UUID=d039277c-2ee3-466a-85eb-db9524398135 console=ttyS0 rd.timeout=10
>rd.shell=1 rd.cc.kdump.encrypted
>rd.cc.kdump.device=UUID=908234b1-c1f3-4150-bfdf-c260907a2447
>rd.cc.kdump.keyring=cryptsetup:908234b1-c1f3-4150-bfdf-c260907a2447
>
>
>diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c
>index 5604a5109858..06fc1f412af4 100644
>--- a/arch/x86/kernel/kexec-bzimage64.c
>+++ b/arch/x86/kernel/kexec-bzimage64.c
>@@ -27,6 +27,7 @@
> #include <asm/kexec-bzimage64.h>
> #define MAX_ELFCOREHDR_STR_LEN	30	/* elfcorehdr=0x<64bit-value> */
>+#define MAX_DMCRYPTKEYS_STR_LEN 31
> /*
>  * Defines lowest physical address for various segments. Not sure where
>@@ -434,7 +435,7 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
> 	 * In case of crash dump, we will append elfcorehdr=<addr> to
> 	 * command line. Make sure it does not overflow
> 	 */
>-	if (cmdline_len + MAX_ELFCOREHDR_STR_LEN > header->cmdline_size) {
>+	if (cmdline_len + MAX_ELFCOREHDR_STR_LEN + MAX_DMCRYPTKEYS_STR_LEN > header->cmdline_size) {
> 		pr_err("Appending elfcorehdr=<addr> to command line exceeds maximum allowed length\n");
> 		return ERR_PTR(-EINVAL);
> 	}
>@@ -478,7 +479,7 @@ static void *bzImage64_load(struct kimage *image, char *kernel,
> 	 */
> 	efi_map_sz = efi_get_runtime_map_size();
> 	params_cmdline_sz = sizeof(struct boot_params) + cmdline_len +
>-				MAX_ELFCOREHDR_STR_LEN;
>+				MAX_ELFCOREHDR_STR_LEN + MAX_DMCRYPTKEYS_STR_LEN;
> 	params_cmdline_sz = ALIGN(params_cmdline_sz, 16);
> 	kbuf.bufsz = params_cmdline_sz + ALIGN(efi_map_sz, 16) +
> 				sizeof(struct setup_data) +
>
>
>Let me know if it makes more sense!

Yes, thanks for providing a crystal clear explanation and also a fix! I
appreciate your elaboration to show me what the problem is! I'll fix it
in v9.

-- 
Best regards,
Coiby




Thread overview: 24+ messages
2025-02-07  8:08 [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Coiby Xu
2025-02-07  8:08 ` [PATCH v8 1/7] kexec_file: allow to place kexec_buf randomly Coiby Xu
2025-02-07  8:08 ` [PATCH v8 2/7] crash_dump: make dm crypt keys persist for the kdump kernel Coiby Xu
2025-04-23 20:44   ` Arnaud Lefebvre
2025-04-29  9:34     ` Coiby Xu
2025-02-07  8:08 ` [PATCH v8 3/7] crash_dump: store dm crypt keys in kdump reserved memory Coiby Xu
2025-02-07  8:08 ` [PATCH v8 4/7] crash_dump: reuse saved dm crypt keys for CPU/memory hot-plugging Coiby Xu
2025-02-07  8:08 ` [PATCH v8 5/7] crash_dump: retrieve dm crypt keys in kdump kernel Coiby Xu
2025-02-07  8:08 ` [PATCH v8 6/7] x86/crash: pass dm crypt keys to " Coiby Xu
2025-04-23 20:59   ` Arnaud Lefebvre
2025-04-29  9:40     ` Coiby Xu
2025-04-30 14:48       ` Arnaud Lefebvre
2025-05-02  0:13         ` Coiby Xu
2025-02-07  8:13 ` [PATCH v8 7/7] x86/crash: make the page that stores the dm crypt keys inaccessible Coiby Xu
2025-02-11 10:25 ` [PATCH v8 0/7] Support kdump with LUKS encryption by reusing LUKS volume keys Baoquan He
2025-02-12  0:43   ` Coiby Xu
2025-02-24  1:36   ` Baoquan He
2025-03-21  6:54     ` Coiby Xu
2025-03-10  3:30 ` Baoquan He
2025-04-14  5:44 ` Baoquan He
2025-04-24  0:08 ` Arnaud Lefebvre
2025-04-28  9:02   ` Coiby Xu
2025-04-28 18:40     ` Arnaud Lefebvre
2025-04-28 23:56       ` Coiby Xu
