Linux Confidential Computing Development
 help / color / mirror / Atom feed
From: Chao Gao <chao.gao@intel.com>
To: kvm@vger.kernel.org, linux-coco@lists.linux.dev,
	linux-kernel@vger.kernel.org
Cc: binbin.wu@linux.intel.com, dave.hansen@linux.intel.com,
	djbw@kernel.org, ira.weiny@intel.com, kai.huang@intel.com,
	kas@kernel.org, nik.borisov@suse.com, paulmck@kernel.org,
	pbonzini@redhat.com, reinette.chatre@intel.com,
	rick.p.edgecombe@intel.com, sagis@google.com, seanjc@google.com,
	tony.lindgren@linux.intel.com, vannapurve@google.com,
	vishal.l.verma@intel.com, yilun.xu@linux.intel.com,
	xiaoyao.li@intel.com, yan.y.zhao@intel.com,
	Chao Gao <chao.gao@intel.com>, Thomas Gleixner <tglx@kernel.org>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>
Subject: [PATCH v10 11/25] coco/tdx-host: Don't expose P-SEAMLDR information on CPUs with erratum
Date: Wed, 20 May 2026 06:38:14 -0700	[thread overview]
Message-ID: <20260520133909.409394-12-chao.gao@intel.com> (raw)
In-Reply-To: <20260520133909.409394-1-chao.gao@intel.com>

TDX-capable CPUs clobber the current VMCS on return from P-SEAMLDR, as
documented in Intel® Trust Domain CPU Architectural Extensions:

  SEAMRET from the P-SEAMLDR clears the current VMCS structure pointed
  to by the current-VMCS pointer. A VMM that invokes the P-SEAMLDR using
  SEAMCALL must reload the current-VMCS, if required, using the VMPTRLD
  instruction.

Clearing the current VMCS behind KVM's back will break KVM.

Future CPUs will fix this by preserving the current VMCS across
P-SEAMLDR calls. A future specification update will describe the SEAMRET
VMCS-clearing behavior as an erratum and to state that it does not occur
when IA32_VMX_BASIC[60] is set.

Add a CPU bug bit for this erratum and refuse to expose P-SEAMLDR
information on affected CPUs, because even reading the P-SEAMLDR sysfs
knobs would enter and exit P-SEAMLDR.

Use a CPU bug bit to stay consistent with X86_BUG_TDX_PW_MCE. As a bonus,
the bug bit is visible to userspace, which allows userspace to determine
why these sysfs files are not exposed, and it can also be checked by other
kernel components in the future if needed.

== Alternatives ==
Two workarounds were considered but both were rejected:

1. Save/restore the current VMCS around P-SEAMLDR calls. This produces ugly
   assembly code [1] and doesn't play well with #MCE or #NMI if they
   need to use the current VMCS.

2. Move KVM's VMCS tracking logic to the TDX core code, which would break
   the boundary between KVM and the TDX core code [2].

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Reviewed-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/kvm/fedb3192-e68c-423c-93b2-a4dc2f964148@intel.com/ # [1]
Link: https://lore.kernel.org/kvm/aYIXFmT-676oN6j0@google.com/ # [2]
---
This is split into a separate patch rather than folded into the previous one,
because the erratum handling warrants a longer changelog and discussion of
the alternatives.

v10:
 - Make it clear that clearing VMCS is the current architecture behavior.
   but will be fixed by a later doc update.
---
 arch/x86/include/asm/cpufeatures.h    |  1 +
 arch/x86/include/asm/vmx.h            |  1 +
 arch/x86/virt/vmx/tdx/tdx.c           | 11 +++++++++++
 drivers/virt/coco/tdx-host/tdx-host.c |  8 ++++++++
 4 files changed, 21 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 1d506e5d6f46..7b572bc24265 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -573,4 +573,5 @@
 #define X86_BUG_ITS_NATIVE_ONLY		X86_BUG( 1*32+ 8) /* "its_native_only" CPU is affected by ITS, VMX is not affected */
 #define X86_BUG_TSA			X86_BUG( 1*32+ 9) /* "tsa" CPU is affected by Transient Scheduler Attacks */
 #define X86_BUG_VMSCAPE			X86_BUG( 1*32+10) /* "vmscape" CPU is affected by VMSCAPE attacks from guests */
+#define X86_BUG_SEAMRET_INVD_VMCS	X86_BUG( 1*32+11) /* "seamret_invd_vmcs" SEAMRET from P-SEAMLDR clears the current VMCS */
 #endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 37080382df54..49d8551d285d 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -147,6 +147,7 @@ struct vmcs {
 #define VMX_BASIC_INOUT				BIT_ULL(54)
 #define VMX_BASIC_TRUE_CTLS			BIT_ULL(55)
 #define VMX_BASIC_NO_HW_ERROR_CODE_CC		BIT_ULL(56)
+#define VMX_BASIC_NO_SEAMRET_INVD_VMCS		BIT_ULL(60)
 
 static inline u32 vmx_basic_vmcs_revision_id(u64 vmx_basic)
 {
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 5fb0441a9ac6..53cf99c41dbb 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -42,6 +42,7 @@
 #include <asm/processor.h>
 #include <asm/mce.h>
 #include <asm/virt.h>
+#include <asm/vmx.h>
 
 #include "seamcall_internal.h"
 #include "tdx.h"
@@ -1450,6 +1451,8 @@ static struct notifier_block tdx_memory_nb = {
 
 static void __init check_tdx_erratum(void)
 {
+	u64 basic_msr;
+
 	/*
 	 * These CPUs have an erratum.  A partial write from non-TD
 	 * software (e.g. via MOVNTI variants or UC/WC mapping) to TDX
@@ -1461,6 +1464,14 @@ static void __init check_tdx_erratum(void)
 	case INTEL_EMERALDRAPIDS_X:
 		setup_force_cpu_bug(X86_BUG_TDX_PW_MCE);
 	}
+
+	/*
+	 * Some TDX-capable CPUs have an erratum where the current VMCS is
+	 * cleared after calling into P-SEAMLDR.
+	 */
+	rdmsrq(MSR_IA32_VMX_BASIC, basic_msr);
+	if (!(basic_msr & VMX_BASIC_NO_SEAMRET_INVD_VMCS))
+		setup_force_cpu_bug(X86_BUG_SEAMRET_INVD_VMCS);
 }
 
 void __init tdx_init(void)
diff --git a/drivers/virt/coco/tdx-host/tdx-host.c b/drivers/virt/coco/tdx-host/tdx-host.c
index 2997311f72fa..2cd7be7bb404 100644
--- a/drivers/virt/coco/tdx-host/tdx-host.c
+++ b/drivers/virt/coco/tdx-host/tdx-host.c
@@ -98,6 +98,14 @@ static umode_t seamldr_group_visible(struct kobject *kobj, struct attribute *att
 	if (!tdx_supports_runtime_update(sysinfo))
 		return 0;
 
+	/*
+	 * Calling P-SEAMLDR on CPUs with the seamret_invd_vmcs bug clears
+	 * the current VMCS, which breaks KVM. Verify the erratum is not
+	 * present before exposing P-SEAMLDR features.
+	 */
+	if (boot_cpu_has_bug(X86_BUG_SEAMRET_INVD_VMCS))
+		return 0;
+
 	return attr->mode;
 }
 
-- 
2.52.0


  parent reply	other threads:[~2026-05-20 13:40 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-20 13:38 [PATCH v10 00/25] Runtime TDX module update support Chao Gao
2026-05-20 13:38 ` [PATCH v10 01/25] x86/virt/tdx: Clarify try_init_module_global() result caching Chao Gao
2026-05-20 13:38 ` [PATCH v10 02/25] x86/virt/tdx: Move TDX global initialization states to file scope Chao Gao
2026-05-20 13:38 ` [PATCH v10 03/25] x86/virt/tdx: Consolidate TDX global initialization states Chao Gao
2026-05-20 13:38 ` [PATCH v10 04/25] x86/virt/tdx: Move TDX_FEATURES0 bits to asm/tdx.h Chao Gao
2026-05-20 13:38 ` [PATCH v10 05/25] x86/virt/tdx: Move low level SEAMCALL helpers out of <asm/tdx.h> Chao Gao
2026-05-20 13:38 ` [PATCH v10 06/25] coco/tdx-host: Introduce a "tdx_host" device Chao Gao
2026-05-20 13:38 ` [PATCH v10 07/25] coco/tdx-host: Expose TDX module version Chao Gao
2026-05-20 13:38 ` [PATCH v10 08/25] x86/virt/seamldr: Introduce a wrapper for P-SEAMLDR SEAMCALLs Chao Gao
2026-05-20 13:38 ` [PATCH v10 09/25] x86/virt/seamldr: Add a helper to retrieve P-SEAMLDR information Chao Gao
2026-05-20 13:38 ` [PATCH v10 10/25] coco/tdx-host: Expose P-SEAMLDR information via sysfs Chao Gao
2026-05-20 13:38 ` Chao Gao [this message]
2026-05-20 13:38 ` [PATCH v10 12/25] coco/tdx-host: Implement firmware upload sysfs ABI for TDX module updates Chao Gao
2026-05-20 13:38 ` [PATCH v10 13/25] x86/virt/seamldr: Allocate and populate a module update request Chao Gao
2026-05-20 13:38 ` [PATCH v10 14/25] x86/virt/seamldr: Introduce skeleton for TDX module updates Chao Gao
2026-05-20 13:38 ` [PATCH v10 15/25] x86/virt/seamldr: Abort updates after a failed step Chao Gao
2026-05-20 17:38   ` Dave Hansen
2026-05-20 13:38 ` [PATCH v10 16/25] x86/virt/seamldr: Shut down the current TDX module Chao Gao
2026-05-20 13:38 ` [PATCH v10 17/25] x86/virt/tdx: Reset software states during TDX module shutdown Chao Gao
2026-05-20 13:38 ` [PATCH v10 18/25] x86/virt/seamldr: Install a new TDX module Chao Gao
2026-05-20 13:38 ` [PATCH v10 19/25] x86/virt/seamldr: Do TDX global and per-CPU init after module installation Chao Gao
2026-05-20 13:38 ` [PATCH v10 20/25] x86/virt/tdx: Restore TDX module state Chao Gao
2026-05-20 13:38 ` [PATCH v10 21/25] x86/virt/tdx: Refresh TDX module version after update Chao Gao
2026-05-20 13:38 ` [PATCH v10 22/25] x86/virt/tdx: Reject updates during compatibility-sensitive operations Chao Gao
2026-05-20 19:35   ` Dave Hansen
2026-05-20 13:38 ` [PATCH v10 23/25] x86/virt/tdx: Enable TDX module runtime updates Chao Gao
2026-05-20 13:38 ` [PATCH v10 24/25] coco/tdx-host: Document TDX module update compatibility criteria Chao Gao
2026-05-20 19:22   ` Dave Hansen
2026-05-20 13:38 ` [PATCH v10 25/25] x86/virt/tdx: Document TDX module update Chao Gao
2026-05-20 13:46 ` [PATCH v10 00/25] Runtime TDX module update support Chao Gao
2026-05-20 19:42 ` Dave Hansen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260520133909.409394-12-chao.gao@intel.com \
    --to=chao.gao@intel.com \
    --cc=binbin.wu@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=djbw@kernel.org \
    --cc=hpa@zytor.com \
    --cc=ira.weiny@intel.com \
    --cc=kai.huang@intel.com \
    --cc=kas@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=nik.borisov@suse.com \
    --cc=paulmck@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=reinette.chatre@intel.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=sagis@google.com \
    --cc=seanjc@google.com \
    --cc=tglx@kernel.org \
    --cc=tony.lindgren@linux.intel.com \
    --cc=vannapurve@google.com \
    --cc=vishal.l.verma@intel.com \
    --cc=x86@kernel.org \
    --cc=xiaoyao.li@intel.com \
    --cc=yan.y.zhao@intel.com \
    --cc=yilun.xu@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox