public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Dave Hansen <dave.hansen@intel.com>
To: Chao Gao <chao.gao@intel.com>
Cc: linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org,
	kvm@vger.kernel.org, x86@kernel.org, reinette.chatre@intel.com,
	ira.weiny@intel.com, kai.huang@intel.com,
	dan.j.williams@intel.com, yilun.xu@linux.intel.com,
	sagis@google.com, vannapurve@google.com, paulmck@kernel.org,
	nik.borisov@suse.com, zhenzhong.duan@intel.com,
	seanjc@google.com, rick.p.edgecombe@intel.com, kas@kernel.org,
	dave.hansen@linux.intel.com, vishal.l.verma@intel.com,
	Farrah Chen <farrah.chen@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH v3 07/26] x86/virt/seamldr: Introduce a wrapper for P-SEAMLDR SEAMCALLs
Date: Fri, 30 Jan 2026 08:18:07 -0800	[thread overview]
Message-ID: <fedb3192-e68c-423c-93b2-a4dc2f964148@intel.com> (raw)
In-Reply-To: <aXywVcqbXodADg4a@intel.com>

On 1/30/26 05:21, Chao Gao wrote:
...
>>> +	/*
>>> +	 * SEAMRET from P-SEAMLDR invalidates the current VMCS.  Save/restore
>>> +	 * the VMCS across P-SEAMLDR SEAMCALLs to avoid clobbering KVM state.
>>> +	 * Disable interrupts as KVM is allowed to do VMREAD/VMWRITE in IRQ
>>> +	 * context (but not NMI context).
>>> +	 */
>>
>> I think you mean:
>>
>> 	WARN_ON(in_nmi());
> 
> This function only disables interrupts, not NMIs. Kirill questioned whether any
> KVM operations might execute from NMI context and do VMREAD/VMWRITE. If such
> operations exist and an NMI interrupts seamldr_call(), they could encounter
> an invalid current VMCS.
> 
> The problematic scenario is:
> 
> 	seamldr_call()			KVM code in NMI handler
> 
> 1.	vmptrst // save current-vmcs
> 2.	seamcall // clobber current-vmcs
> 3.					// NMI handler start
> 					call into some KVM code and do vmread/vmwrite
> 					// consume __invalid__ current-vmcs
> 					// NMI handler end
> 4.	vmptrld // restore current-vmcs
> 
> The comment clarifies that KVM doesn't do VMREAD/VMWRITE during NMI handling.

How about something like:

	P-SEAMLDR calls invalidate the current VMCS. It must be saved
	and restored around the call. Exclude KVM access to the VMCS
	by disabling interrupts. This is not safe against VMCS use in
	NMIs, but there are none of those today.

Ideally, you'd also pair that with _some_ checks in the KVM code that
use lockdep or warnings to reiterate that NMI access to the VMCS is not OK.

>>> +	local_irq_save(flags);
>>> +
>>> +	asm goto("1: vmptrst %0\n\t"
>>> +		 _ASM_EXTABLE(1b, %l[error])
>>> +		 : "=m" (vmcs) : : "cc" : error);
>>
>> I'd much rather this be wrapped up in a helper function. We shouldn't
>> have to look at the horrors of inline assembly like this.
>>
>> But this *REALLY* wants the KVM folks to look at it. One argument is
>> that with the inline assembly this is nice and self-contained. The other
>> argument is that this completely ignores all existing KVM infrastructure
>> and is parallel VMCS management.
> 
> Exactly. Sean suggested this approach [*]. He prefers inline assembly rather than
> adding new, inferior wrappers
> 
> *: https://lore.kernel.org/linux-coco/aHEYtGgA3aIQ7A3y@google.com/

Get his explicit reviews on the patch, please.

Also, I 100% object to inline assembly in the main flow. Please at least
make a wrapper for these and stick them in:

	arch/x86/include/asm/special_insns.h

so the inline assembly spew is hidden from view.

>> I'd be shocked if this is the one and only place in the whole kernel
>> that can unceremoniously zap VMX state.
>>
>> I'd *bet* that you don't really need to do the vmptrld and that KVM can
>> figure it out because it can vmptrld on demand anyway. Something along
>> the lines of:
>>
>> 	local_irq_disable();
>> 	list_for_each(handwaving...)
>> 		vmcs_clear();
>> 	ret = seamldr_prerr(fn, args);
>> 	local_irq_enable();	
>>
>> Basically, zap this CPU's vmcs state and then make KVM reload it at some
>> later time.
> 
> The idea is feasible. But just calling vmcs_clear() won't work. We need to
> reset all the tracking state associated with each VMCS. We should call
> vmclear_local_loaded_vmcss() instead, similar to what's done before VMXOFF.
> 
>>
>> I'm sure Sean and Paolo will tell me if I'm crazy.
> 
> To me, this approach needs more work since we need to either move 
> vmclear_local_loaded_vmcss() to the kernel or allow KVM to register a callback.
> 
> I don't think it's as straightforward as just doing the save/restore.

Could you please just do me a favor and spend 20 minutes to see what
this looks like in practice and if the KVM folks hate it?

>>> diff --git a/drivers/virt/coco/tdx-host/Kconfig b/drivers/virt/coco/tdx-host/Kconfig
>>> index e58bad148a35..6a9199e6c2c6 100644
>>> --- a/drivers/virt/coco/tdx-host/Kconfig
>>> +++ b/drivers/virt/coco/tdx-host/Kconfig
>>> @@ -8,3 +8,13 @@ config TDX_HOST_SERVICES
>>>  
>>>  	  Say y or m if enabling support for confidential virtual machine
>>>  	  support (CONFIG_INTEL_TDX_HOST). The module is called tdx_host.ko
>>> +
>>> +config INTEL_TDX_MODULE_UPDATE
>>> +	bool "Intel TDX module runtime update"
>>> +	depends on TDX_HOST_SERVICES
>>> +	help
>>> +	  This enables the kernel to support TDX module runtime update. This
>>> +	  allows the admin to update the TDX module to another compatible
>>> +	  version without the need to terminate running TDX guests.
>>
>> ... as opposed to the method that the kernel has to update the module
>> without terminating guests? ;)
> 
> I will reduce this to:
> 
> 	  This enables the kernel to update the TDX Module to another compatible
> 	  version.

I guess I'll be explicit: Remove this Kconfig prompt.

I think you should remove INTEL_TDX_MODULE_UPDATE entirely. But I'll
settle for:

	config INTEL_TDX_MODULE_UPDATE
		bool
		default TDX_HOST_SERVICES

so that users don't have to see it. Don't bother users with it. Period.

  reply	other threads:[~2026-01-30 16:18 UTC|newest]

Thread overview: 132+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-23 14:55 [PATCH v3 00/26] Runtime TDX Module update support Chao Gao
2026-01-23 14:55 ` [PATCH v3 01/26] x86/virt/tdx: Print SEAMCALL leaf numbers in decimal Chao Gao
2026-01-26 10:01   ` Tony Lindgren
2026-01-28  1:28   ` Binbin Wu
2026-01-28 16:26   ` Dave Hansen
2026-01-29  5:44     ` Chao Gao
2026-01-23 14:55 ` [PATCH v3 02/26] x86/virt/tdx: Use %# prefix for hex values in SEAMCALL error messages Chao Gao
2026-01-26 10:02   ` Tony Lindgren
2026-01-28  1:34   ` Binbin Wu
2026-01-28 12:16     ` Chao Gao
2026-01-28 15:18   ` Dave Hansen
2026-01-23 14:55 ` [PATCH v3 03/26] x86/virt/tdx: Move low level SEAMCALL helpers out of <asm/tdx.h> Chao Gao
2026-01-26 10:02   ` Tony Lindgren
2026-01-28  1:37   ` Binbin Wu
2026-01-28 12:42     ` Chao Gao
2026-01-28 16:31       ` Dave Hansen
2026-01-29 14:02         ` Chao Gao
2026-01-29 16:03           ` Dave Hansen
2026-01-28 16:37   ` Dave Hansen
2026-01-29  8:04     ` Chao Gao
2026-01-23 14:55 ` [PATCH v3 04/26] coco/tdx-host: Introduce a "tdx_host" device Chao Gao
2026-01-26  9:52   ` Tony Lindgren
2026-01-28 16:53     ` Dave Hansen
2026-01-28  3:24   ` Binbin Wu
2026-01-29  7:26     ` Xu Yilun
2026-01-23 14:55 ` [PATCH v3 05/26] coco/tdx-host: Expose TDX Module version Chao Gao
2026-01-26  9:54   ` Tony Lindgren
2026-01-28  3:48   ` Binbin Wu
2026-01-28 17:01   ` Dave Hansen
2026-01-29 14:07     ` Chao Gao
2026-01-29  7:38   ` Xu Yilun
2026-01-23 14:55 ` [PATCH v3 06/26] x86/virt/tdx: Prepare to support P-SEAMLDR SEAMCALLs Chao Gao
2026-01-26 10:05   ` Tony Lindgren
2026-01-28  5:58   ` Binbin Wu
2026-01-28 23:03   ` Dave Hansen
2026-01-29  9:46     ` Xu Yilun
2026-01-29 16:08       ` Dave Hansen
2026-01-29 14:55     ` Chao Gao
2026-01-29 16:59       ` Dave Hansen
2026-01-23 14:55 ` [PATCH v3 07/26] x86/virt/seamldr: Introduce a wrapper for " Chao Gao
2026-01-26 10:12   ` Tony Lindgren
2026-01-28  6:38   ` Binbin Wu
2026-01-28 23:04   ` Dave Hansen
2026-01-30  8:08     ` Chao Gao
2026-01-30 16:23       ` Dave Hansen
2026-01-28 23:36   ` Dave Hansen
2026-01-30 13:21     ` Chao Gao
2026-01-30 16:18       ` Dave Hansen [this message]
2026-02-03 12:15         ` Chao Gao
2026-02-03 15:41           ` Sean Christopherson
2026-02-03 16:12             ` Dave Hansen
2026-02-03 23:54             ` Chao Gao
2026-02-05 16:29               ` Sean Christopherson
2026-02-05 16:37                 ` Dave Hansen
2026-01-23 14:55 ` [PATCH v3 08/26] x86/virt/seamldr: Retrieve P-SEAMLDR information Chao Gao
2026-01-26 10:15   ` Tony Lindgren
2026-01-28  6:50   ` Binbin Wu
2026-01-28 23:54   ` Dave Hansen
2026-01-30  4:01     ` Xu Yilun
2026-01-30 16:35       ` Dave Hansen
2026-02-02  0:16         ` Xu Yilun
2026-01-30 13:55     ` Chao Gao
2026-01-30 16:06       ` Dave Hansen
2026-01-28 23:57   ` Dave Hansen
2026-01-30 13:30     ` Chao Gao
2026-01-23 14:55 ` [PATCH v3 09/26] coco/tdx-host: Expose P-SEAMLDR information via sysfs Chao Gao
2026-01-26  9:56   ` Tony Lindgren
2026-01-28  3:07   ` Huang, Kai
2026-01-29  0:08   ` Dave Hansen
2026-01-30 14:44     ` Chao Gao
2026-01-30 16:02       ` Dave Hansen
2026-01-23 14:55 ` [PATCH v3 10/26] coco/tdx-host: Implement FW_UPLOAD sysfs ABI for TDX Module updates Chao Gao
2026-01-26 10:00   ` Tony Lindgren
2026-01-28  3:30   ` Huang, Kai
2026-01-30 14:07   ` Xu Yilun
2026-02-06 17:15   ` Xing, Cedric
2026-01-23 14:55 ` [PATCH v3 11/26] x86/virt/seamldr: Block TDX Module updates if any CPU is offline Chao Gao
2026-01-26 10:16   ` Tony Lindgren
2026-02-02  0:31   ` Xu Yilun
2026-01-23 14:55 ` [PATCH v3 12/26] x86/virt/seamldr: Verify availability of slots for TDX Module updates Chao Gao
2026-01-26 10:17   ` Tony Lindgren
2026-01-23 14:55 ` [PATCH v3 13/26] x86/virt/seamldr: Allocate and populate a module update request Chao Gao
2026-01-26 10:23   ` Tony Lindgren
2026-01-27  3:21   ` Huang, Kai
2026-01-28 11:28     ` Chao Gao
2026-01-28 22:33       ` Huang, Kai
2026-01-28  4:03   ` Huang, Kai
2026-01-30 14:56     ` Chao Gao
2026-02-02  3:08   ` Xu Yilun
2026-01-23 14:55 ` [PATCH v3 14/26] x86/virt/seamldr: Introduce skeleton for TDX Module updates Chao Gao
2026-01-26 10:28   ` Tony Lindgren
2026-02-02  6:01   ` Xu Yilun
2026-01-23 14:55 ` [PATCH v3 15/26] x86/virt/seamldr: Abort updates if errors occurred midway Chao Gao
2026-01-26 10:31   ` Tony Lindgren
2026-02-02  6:08   ` Xu Yilun
2026-01-23 14:55 ` [PATCH v3 16/26] x86/virt/seamldr: Shut down the current TDX module Chao Gao
2026-01-26 10:42   ` Tony Lindgren
2026-02-02  6:31   ` Xu Yilun
2026-01-23 14:55 ` [PATCH v3 17/26] x86/virt/tdx: Reset software states after TDX module shutdown Chao Gao
2026-01-26 10:43   ` Tony Lindgren
2026-01-23 14:55 ` [PATCH v3 18/26] x86/virt/seamldr: Log TDX Module update failures Chao Gao
2026-01-26 10:45   ` Tony Lindgren
2026-02-02  7:11   ` Xu Yilun
2026-01-23 14:55 ` [PATCH v3 19/26] x86/virt/seamldr: Install a new TDX Module Chao Gao
2026-01-26 10:52   ` Tony Lindgren
2026-01-23 14:55 ` [PATCH v3 20/26] x86/virt/seamldr: Do TDX per-CPU initialization after updates Chao Gao
2026-01-26 10:53   ` Tony Lindgren
2026-02-02  7:32   ` Xu Yilun
2026-01-23 14:55 ` [PATCH v3 21/26] x86/virt/tdx: Establish contexts for the new TDX Module Chao Gao
2026-01-26 10:54   ` Tony Lindgren
2026-01-23 14:55 ` [PATCH v3 22/26] x86/virt/tdx: Update tdx_sysinfo and check features post-update Chao Gao
2026-01-26 11:07   ` Tony Lindgren
2026-02-02  7:33   ` Xu Yilun
2026-01-23 14:55 ` [PATCH v3 23/26] x86/virt/tdx: Enable TDX Module runtime updates Chao Gao
2026-01-26 11:14   ` Tony Lindgren
2026-02-04 10:03     ` Tony Lindgren
2026-02-02  7:41   ` Xu Yilun
2026-01-23 14:55 ` [PATCH v3 24/26] x86/virt/seamldr: Extend sigstruct to 16KB Chao Gao
2026-01-26 11:15   ` Tony Lindgren
2026-01-27  3:58   ` Huang, Kai
2026-01-28 23:01   ` Huang, Kai
2026-01-30 14:25     ` Chao Gao
2026-02-02 11:57       ` Huang, Kai
2026-01-23 14:55 ` [PATCH v3 25/26] x86/virt/tdx: Avoid updates during update-sensitive operations Chao Gao
2026-01-26 11:23   ` Tony Lindgren
2026-01-23 14:55 ` [PATCH v3 26/26] coco/tdx-host: Set and document TDX Module update expectations Chao Gao
2026-01-26 11:28   ` Tony Lindgren
2026-01-26 22:14   ` dan.j.williams
2026-01-27 12:17     ` Chao Gao
2026-01-27 17:23       ` dan.j.williams
2026-01-28 17:52 ` [PATCH v3 00/26] Runtime TDX Module update support Sagi Shahar
2026-01-29  1:51   ` Chao Gao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fedb3192-e68c-423c-93b2-a4dc2f964148@intel.com \
    --to=dave.hansen@intel.com \
    --cc=bp@alien8.de \
    --cc=chao.gao@intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=farrah.chen@intel.com \
    --cc=hpa@zytor.com \
    --cc=ira.weiny@intel.com \
    --cc=kai.huang@intel.com \
    --cc=kas@kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=nik.borisov@suse.com \
    --cc=paulmck@kernel.org \
    --cc=reinette.chatre@intel.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=sagis@google.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=vannapurve@google.com \
    --cc=vishal.l.verma@intel.com \
    --cc=x86@kernel.org \
    --cc=yilun.xu@linux.intel.com \
    --cc=zhenzhong.duan@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox