* [PATCH v5 0/5] TDX host: kexec() support
@ 2024-08-15 12:29 Kai Huang
From: Kai Huang @ 2024-08-15 12:29 UTC
  To: dave.hansen, kirill.shutemov, bp, tglx, peterz, mingo
  Cc: x86, hpa, luto, linux-kernel, thomas.lendacky, pbonzini, seanjc

Currently kexec() support and TDX host are mutually exclusive in the
Kconfig.  This series adds TDX host kexec support so that the two can
work together and be enabled at the same time in the Kconfig.

Hi maintainers,

This series aims to go through the tip tree, but I also CC'ed Sean/Paolo
because once KVM TDX comes into play, a KVM patch [*] is needed to
complete the kexec support for TDX.  Also copying Dan for TDX connect.

Thanks for your time!

=== More information ===

If the kernel has ever enabled TDX, part of system memory remains TDX
private memory when kexec happens.  E.g., the PAMT (Physical Address
Metadata Table) pages used by the TDX module to track each TDX memory
page's state are never freed once the TDX module is initialized.  TDX
guests also have guest private memory and secure-EPT pages.

Similar to AMD SME, to support kexec the kernel needs to flush dirty
cachelines for TDX private memory before booting to the second kernel.
Also, when the platform has the "partial write machine check" erratum,
the kernel needs to reset TDX private memory to normal (using
MOVDIR64B) before booting to the second kernel, otherwise the second
kernel may see an unexpected machine check.
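
As a rough illustration of the "reset" step, here is a userspace model
of converting a range one full cacheline at a time.  It is only a
sketch, not the code in this series: write_full_line() and
reset_private_range() are hypothetical names, and memcpy() merely
stands in for a MOVDIR64B-style 64-byte direct store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHELINE_SIZE 64

/*
 * Userspace model only: stand-in for a MOVDIR64B-style 64-byte store.
 * memcpy() just models "always write the whole line, never part of it".
 */
static void write_full_line(void *dst, const void *src)
{
	memcpy(dst, src, CACHELINE_SIZE);
}

/* Hypothetical helper: reset [base, base + size) in full cachelines. */
static void reset_private_range(void *base, size_t size)
{
	static const uint8_t zero_line[CACHELINE_SIZE];
	uint8_t *p = base;
	size_t off;

	/* Partial lines are what must be avoided, so insist on alignment. */
	assert(((uintptr_t)base % CACHELINE_SIZE) == 0);
	assert((size % CACHELINE_SIZE) == 0);

	for (off = 0; off < size; off += CACHELINE_SIZE)
		write_full_line(p + off, zero_line);
}
```

The point of the model is only that every write covers a whole 64-byte
line, so no line is ever left partially written.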

The majority of the code change in this series handles "resetting TDX
private memory" (the cache flushing part is relatively straightforward).
Because the kernel currently has no unified way to tell whether a given
page is TDX private or not, this series chooses to only reset PAMT in
the core-kernel kexec code, and requires the in-kernel TDX users (e.g.,
KVM) to reset the TDX private pages that they manage (see [*]).

Other options are also mentioned in the changelog of patch:

  x86/kexec: Reset TDX private memory on platforms with TDX erratum

..which also contains more information about the above TDX erratum.

This series also covers crash kexec, but no special handling is needed
for crash kexec:

1) kdump kernel uses reserved memory from the first kernel, but the
   reserved memory will never be used as TDX memory.
2) /proc/vmcore in the kdump kernel is only read, and a read by itself
   won't poison TDX private memory, so it won't cause an unexpected
   machine check (only a "partial write" will).
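
The resulting policy can be summarized in a small decision sketch.
This is only my reading of the cover letter, not code from the
patches: struct kexec_ctx, its fields and both helpers are
hypothetical names used for illustration.

```c
#include <stdbool.h>

/* Hypothetical inputs; none of these names come from the series. */
struct kexec_ctx {
	bool bare_metal;            /* not a TDX or SEV-ES/SEV-SNP guest */
	bool crash;                 /* crash kexec (kdump), not normal kexec */
	bool tdx_module_inited;     /* TDX module initialized, so PAMT exists */
	bool partial_write_erratum; /* "partial write machine check" erratum */
};

/* Cache flush: done unconditionally on bare-metal, skipped in guests. */
static bool should_wbinvd(const struct kexec_ctx *c)
{
	return c->bare_metal;
}

/* Reset of TDX private memory: erratum platforms only, never for kdump. */
static bool should_reset_tdx_private(const struct kexec_ctx *c)
{
	if (c->crash)
		return false;

	return c->tdx_module_inited && c->partial_write_erratum;
}
```

In this sketch the module state stands in for "TDX private memory may
exist", which is why the series makes that state immutable in the
reboot notifier.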


v4 -> v5:
 - Rebase to tip/master.
 - Remove the TDX-specific callback, because there is no need to reset
   TDX private memory for crash kexec.
 - Add a new patch to make module status immutable in reboot notifier
   (split from v1) in order to use module status to tell the presence of
   TDX private memory.
 - Minor changelog updates, trivial comments improvements.
 - Add Tom's Reviewed-by tag.

 v4: https://lore.kernel.org/all/cover.1713439632.git.kai.huang@intel.com/

v3 -> v4:
 - Updated changelog and comments of patch 1/2 per comments from
   Kirill and Tom (see specific patch for details).

 v3: https://lore.kernel.org/linux-kernel/cover.1712493366.git.kai.huang@intel.com/

v2 -> v3:
 - Change to only do WBINVD for bare-metal, as Kirill/Tom pointed out
   WBINVD in TDX guests and SEV-ES/SEV-SNP guests triggers #VE.

 v2: https://lore.kernel.org/linux-kernel/cover.1710811610.git.kai.huang@intel.com/

v1 -> v2:
 - Do unconditional WBINVD during kexec() -- Boris
 - Change to cover crash kexec() -- Rick
 - Add a new patch (last one) to add a mechanism to reset all TDX private
   pages due to having to cover crash kexec().
 - Other code improvements  -- Dave
 - Rebase to latest tip/master.

 v1: https://lore.kernel.org/linux-kernel/cover.1706698706.git.kai.huang@intel.com/

[*]: https://github.com/intel/tdx/commit/513e24d7913457ba87b6f25644d02fbed0848f21


Kai Huang (5):
  x86/kexec: do unconditional WBINVD for bare-metal in stop_this_cpu()
  x86/kexec: do unconditional WBINVD for bare-metal in relocate_kernel()
  x86/virt/tdx: Make module initializatiton state immutable in reboot
    notifier
  x86/kexec: Reset TDX private memory on platforms with TDX erratum
  x86/virt/tdx: Remove the !KEXEC_CORE dependency

 arch/x86/Kconfig                     |  1 -
 arch/x86/include/asm/kexec.h         |  2 +-
 arch/x86/include/asm/tdx.h           |  2 +
 arch/x86/kernel/machine_kexec_64.c   | 29 +++++++++--
 arch/x86/kernel/process.c            | 19 ++++---
 arch/x86/kernel/relocate_kernel_64.S | 19 +++++--
 arch/x86/virt/vmx/tdx/tdx.c          | 78 ++++++++++++++++++++++++++++
 7 files changed, 129 insertions(+), 21 deletions(-)


base-commit: b8c7cbc324dc17b9e42379b42603613580bec2d8
-- 
2.45.2



Thread overview: 17+ messages
2024-08-15 12:29 [PATCH v5 0/5] TDX host: kexec() support Kai Huang
2024-08-15 12:29 ` [PATCH v5 1/5] x86/kexec: do unconditional WBINVD for bare-metal in stop_this_cpu() Kai Huang
2024-08-15 12:29 ` [PATCH v5 2/5] x86/kexec: do unconditional WBINVD for bare-metal in relocate_kernel() Kai Huang
2024-08-15 23:45   ` Huang, Kai
2024-09-04 15:30   ` Borislav Petkov
2024-09-04 23:55     ` Huang, Kai
2024-08-15 12:29 ` [PATCH v5 3/5] x86/virt/tdx: Make module initializatiton state immutable in reboot notifier Kai Huang
2024-08-15 12:29 ` [PATCH v5 4/5] x86/kexec: Reset TDX private memory on platforms with TDX erratum Kai Huang
2024-08-15 12:29 ` [PATCH v5 5/5] x86/virt/tdx: Remove the !KEXEC_CORE dependency Kai Huang
2024-08-19 21:21 ` [PATCH v5 0/5] TDX host: kexec() support Sagi Shahar
2024-08-19 22:16   ` Huang, Kai
2024-08-19 22:28     ` Sagi Shahar
2024-08-19 22:43       ` Huang, Kai
2024-08-23 16:15         ` Sagi Shahar
2024-08-24  9:31           ` Huang, Kai
2024-08-26 19:22             ` Sagi Shahar
2024-08-26 22:50               ` Huang, Kai
