[PATCH v2 0/6] crashdump: Kernel handling of CPU and memory hot un/plug

From: Eric DeVolder <eric.devolder@oracle.com>
To: kexec@lists.infradead.org
Cc: boris.ostrovsky@oracle.com, eric.devolder@oracle.com
Subject: [PATCH v2 0/6] crashdump: Kernel handling of CPU and memory hot un/plug
Date: Wed,  3 May 2023 18:16:05 -0400
Message-ID: <20230503221611.2119-1-eric.devolder@oracle.com>

When the kdump service is loaded, if a CPU or memory is hot
un/plugged, the crash elfcorehdr, which describes the CPUs and memory
in the system, must also be updated, else the resulting vmcore is
inaccurate (e.g., missing either CPU context or memory regions).

The current solution (e.g., RHEL /usr/lib/udev/rules.d/98-kexec.rules)
utilizes udev to initiate an unload-then-reload of the *entire* kdump
image (e.g., kernel, initrd, boot_params, purgatory and elfcorehdr) by
the userspace kexec utility. In a previous kernel patch post I
outlined the significant performance problems related to offloading
this activity to userspace.

As such, I've been working to provide the ability for the Linux kernel
to directly modify the elfcorehdr in response to hotplug changes.

 https://lore.kernel.org/lkml/20230404180326.6890-1-eric.devolder@oracle.com/

The series linked above is v21; the upcoming v22 contains changes
that work in concert with this kexec-tools v2 series. (I'm posting
the kexec-tools changes first so I can reference them in the kernel
v22 posting.)

I believe this work to be nearing the finish line. As such, I'd like
to start posting the kexec-tools userspace changes for review in order
to minimize the time to adoption.

This kexec-tools patch series is for supporting the kexec_load
syscall only. The kernel patch series cited above is self-contained
for the kexec_file_load syscall, requiring no userspace help.

There are two basic obstacles/requirements for the kexec-tools to
overcome in order to support kernel hotplug rewriting of the
elfcorehdr.

First, the buffer containing the elfcorehdr must be excluded from the
purgatory checksum/digest, which is computed at load time. Otherwise
kernel run-time changes to the elfcorehdr, as a result of hot un/plug,
would result in the checksum failing (specifically in purgatory at
panic kernel boot time), and the kdump capture kernel failing to start.
To let the kernel know it is okay to modify the elfcorehdr, kexec
sets the KEXEC_UPDATE_ELFCOREHDR flag.

NOTE: The kernel specifically does *NOT* attempt to recompute the
checksum/digest as that would ultimately require patching the in-
memory purgatory image with the updated checksum. As that purgatory
image is already fully linked, it is a binary blob containing no ELF
information which would allow it to be re-linked or patched. Thus
excluding the elfcorehdr from the checksum/digests avoids all these
problems.
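
A minimal sketch of the exclusion idea, not the code from this
series: the digest walks every loaded segment but skips the one that
holds the elfcorehdr, so later changes to that buffer cannot
invalidate the stored value. The struct and function names are
hypothetical, and FNV-1a is used only as a small stand-in for the
real digest algorithm.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a loaded kexec segment. */
struct sketch_segment {
	const unsigned char *buf;
	size_t len;
};

/* FNV-1a (64-bit); a stand-in for the real purgatory digest. */
static uint64_t sketch_fnv1a(uint64_t h, const unsigned char *p, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/*
 * Digest all segments except the one at index 'skip'.
 * Pass skip = -1 to cover every segment (the non-hotplug case).
 */
uint64_t sketch_digest(const struct sketch_segment *segs, int nr, int skip)
{
	uint64_t h = 0xcbf29ce484222325ULL;

	for (int i = 0; i < nr; i++) {
		if (i == skip)
			continue; /* elfcorehdr: may change after load */
		h = sketch_fnv1a(h, segs[i].buf, segs[i].len);
	}
	return h;
}
```

With the elfcorehdr index passed as 'skip', rewriting that buffer
leaves the digest unchanged; with skip = -1 the same rewrite changes
the digest, which is the failure mode described above.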

Second, the size of the elfcorehdr buffer must be large enough
to accommodate growth of the number of CPUs and/or memory regions.
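
To give a feel for what "large enough" means, here is a rough
estimate under stated assumptions, not the kernel's actual formula:
a 64-bit ELF header followed by one program header per possible CPU
(PT_NOTE) and one per memory region (PT_LOAD). The two extra program
headers (assumed here to be a vmcoreinfo note and a kernel text
mapping) and the function name are my assumptions for illustration.

```c
#include <elf.h>
#include <stddef.h>

/*
 * Estimate the elfcorehdr buffer size needed to describe up to
 * max_cpus CPUs and max_mem_ranges memory regions.
 */
size_t sketch_elfcorehdr_size(size_t max_cpus, size_t max_mem_ranges)
{
	/* Assumption: two fixed program headers beyond CPUs and memory. */
	size_t nr_phdr = 2 + max_cpus + max_mem_ranges;

	return sizeof(Elf64_Ehdr) + nr_phdr * sizeof(Elf64_Phdr);
}
```

Each additional CPU or memory region costs one more Elf64_Phdr, so
sizing for the maximum expected counts up front avoids having to
reallocate the buffer on hot plug.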

To satisfy the first requirement, this patch series introduces the
--hotplug option to indicate to kexec-tools that kexec should exclude
the elfcorehdr buffer from the purgatory checksum/digest calculation
and set the KEXEC_UPDATE_ELFCOREHDR flag.
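
As a hedged illustration of the flag handling, the sketch below ORs
the new flag into the kexec_load flags only when hotplug support was
requested and the load is a crash load. The SKETCH_ names are
hypothetical, the numeric value of the new flag is an assumption
(see patch 1 for the real definition), and restricting it to crash
loads is my assumption as well.

```c
#include <stdbool.h>

#define SKETCH_KEXEC_ON_CRASH          0x00000001UL
/* Assumed value; the authoritative definition is in the series. */
#define SKETCH_KEXEC_UPDATE_ELFCOREHDR 0x00000004UL

/* Return the flags to pass to kexec_load for this load request. */
unsigned long sketch_kexec_flags(unsigned long flags, bool hotplug)
{
	/* Only a crash (kdump) load carries an elfcorehdr segment. */
	if (hotplug && (flags & SKETCH_KEXEC_ON_CRASH))
		flags |= SKETCH_KEXEC_UPDATE_ELFCOREHDR;
	return flags;
}
```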

To satisfy the second requirement, the size is obtained from the
(proposed in the kernel series above)
/sys/kernel/crash_elfcorehdr_size node, or it can be specified
manually with the new --elfcorehdrsz= option.
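
The size lookup could be sketched roughly as follows. This is an
illustration only: the helper names are hypothetical, the node is
assumed to contain a single number, and letting a manually supplied
size take precedence over the sysfs value is my assumption rather
than documented behavior of this series.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse a non-negative size. Base 0 accepts decimal as well as
 * 0x-prefixed hex and 0-prefixed octal. Returns -1 on error.
 */
long long sketch_parse_size(const char *text)
{
	char *end;
	long long v;

	errno = 0;
	v = strtoll(text, &end, 0);
	if (errno || end == text || v < 0)
		return -1;
	return v;
}

/* Read the size from a sysfs-style node; -1 if unreadable. */
long long sketch_read_size_node(const char *path)
{
	char line[64];
	long long v = -1;
	FILE *fp = fopen(path, "r");

	if (!fp)
		return -1;
	if (fgets(line, sizeof(line), fp))
		v = sketch_parse_size(line);
	fclose(fp);
	return v;
}

/*
 * Pick the elfcorehdr size: a manual value (> 0) wins, otherwise
 * the node value is used. Returns 0 when neither is available,
 * meaning the caller keeps its default sizing.
 */
long long sketch_pick_elfcorehdr_size(long long manual, const char *node)
{
	long long v;

	if (manual > 0)
		return manual;
	v = sketch_read_size_node(node);
	return v > 0 ? v : 0;
}
```

Returning 0 when the node is absent keeps the behavior unchanged on
kernels that do not provide /sys/kernel/crash_elfcorehdr_size.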

I am intentionally posting this series before the kernel changes
have been merged. I'm hoping to facilitate discussion as to how
kexec-tools wants to handle the soon-to-be new kernel feature.

Discussion items:

- It is worth noting that deploying kexec-tools with this series
  included, on kernels that do NOT have the kernel hotplug series
  cited above, is safe. Running kexec-tools with the --hotplug option
  on a kernel without hotplug elfcorehdr support simply removes the
  elfcorehdr buffer from the digest. This does not prevent kdump from
  operating; the only risk is a slight chance of elfcorehdr corruption
  going undetected, as it is no longer covered by the checksum.
  Using the --elfcorehdrsz option on a kernel without hotplug
  elfcorehdr support simply results in a possibly oversized buffer for
  the elfcorehdr; there is no harm in that.

- While I currently have --hotplug as an option, the option could
  be eliminated (or have its polarity reversed) if it is considered
  safe to *always* omit the elfcorehdr from the purgatory
  checksum/digest.
  If this were the case, then distros would not have to make any
  changes to kdump scripts to pass the --hotplug option. Then, when
  their kernel does include the kernel patch series cited above,
  kdump and hotplug would "just work".

- I'm unsure if these options should be kept as common/global
  kexec options, or moved to arch options.

- I'm only showing x86 support (and testing) at this time, but
  it would be straightforward to provide similar support for the
  other architectures in a future patch revision.

Thanks!
eric

---
v2: 3may2023
 - Setting KEXEC_UPDATE_ELFCOREHDR flag
 - Utilizing /sys/kernel/crash_elfcorehdr_size info.

v1: 20oct2022
 http://lists.infradead.org/pipermail/kexec/2022-October/026032.html
 - Initial patch series

RFC:
 https://lore.kernel.org/lkml/b04ed259-dc5f-7f30-6661-c26f92d9096a@oracle.com/
 s/vmcoreinfo/elfcorehdr/g
---

Eric DeVolder (6):
  kexec: define KEXEC_UPDATE_ELFCOREHDR
  crashdump: introduce the hotplug command line options
  crashdump: setup hotplug support
  crashdump: exclude elfcorehdr segment from digest for hotplug
  crashdump/x86: identify elfcorehdr segment for hotplug
  crashdump/x86: set the elfcorehdr segment size for hotplug

 kexec/arch/i386/crashdump-x86.c |  8 ++++++
 kexec/kexec-syscall.h           |  1 +
 kexec/kexec.c                   | 45 +++++++++++++++++++++++++++++++++
 kexec/kexec.h                   | 10 +++++++-
 4 files changed, 63 insertions(+), 1 deletion(-)

-- 
2.31.1
