kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: Rick P Edgecombe <rick.p.edgecombe@intel.com>,
	Kai Huang <kai.huang@intel.com>,
	 "ackerleytng@google.com" <ackerleytng@google.com>,
	Vishal Annapurve <vannapurve@google.com>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Ira Weiny <ira.weiny@intel.com>,
	 "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"michael.roth@amd.com" <michael.roth@amd.com>,
	 "pbonzini@redhat.com" <pbonzini@redhat.com>
Subject: Re: [RFC PATCH v2 12/18] KVM: TDX: Bug the VM if extended the initial measurement fails
Date: Tue, 2 Sep 2025 10:04:38 -0700	[thread overview]
Message-ID: <aLcjppW1eiCrxJPC@google.com> (raw)
In-Reply-To: <aLa34QCJCXGLk/fl@yzhao56-desk.sh.intel.com>

On Tue, Sep 02, 2025, Yan Zhao wrote:
> But during writing another concurrency test, I found a sad news :
> 
> SEAMCALL TDH_VP_INIT requires to hold exclusive lock for resource TDR when its
> leaf_opcode.version > 0. So, when I use v1 (which is the current value in
> upstream, for x2apic?) to test executing ioctl KVM_TDX_INIT_VCPU on different
> vCPUs concurrently, the TDX_BUG_ON() following tdh_vp_init() will print error
> "SEAMCALL TDH_VP_INIT failed: 0x8000020000000080".
> 
> If I switch to using v0 version of TDH_VP_INIT, the contention will be gone.

Uh, so that's exactly the type of breaking ABI change that isn't acceptable.  If
it's really truly necessary, then we can can probably handle the change in KVM
since TDX is so new, but generally speaking such changes simply must not happen.

> Note: this acquiring of exclusive lock was not previously present in the public
> repo https://github.com/intel/tdx-module.git, branch tdx_1.5.
> (The branch has been force-updated to new implementation now).

Lovely.

> > Acquire kvm->lock to prevent VM-wide things from happening, slots_lock to prevent
> > kvm_mmu_zap_all_fast(), and _all_ vCPU mutexes to prevent vCPUs from interefering.
> Nit: we should have no worry to kvm_mmu_zap_all_fast(), since it only zaps
> !mirror roots. The slots_lock should be for slots deletion.

Oof, I missed that.  We should have required nx_huge_pages=never for tdx=1.
Probably too late for that now though :-/

> > Doing that for a vCPU ioctl is a bit awkward, but not awful.  E.g. we can abuse
> > kvm_arch_vcpu_async_ioctl().  In hindsight, a more clever approach would have
> > been to make KVM_TDX_INIT_MEM_REGION a VM-scoped ioctl that takes a vCPU fd.  Oh
> > well.
> > 
> > Anyways, I think we need to avoid the "synchronous" ioctl path anyways, because
> > taking kvm->slots_lock inside vcpu->mutex is gross.  AFAICT it's not actively
> > problematic today, but it feels like a deadlock waiting to happen.
> Note: Looks kvm_inhibit_apic_access_page() also takes kvm->slots_lock inside
> vcpu->mutex.

Yikes.  As does kvm_alloc_apic_access_page(), which is likely why I thought it
was ok to take slots_lock.  But while kvm_alloc_apic_access_page() appears to be
called with vCPU scope, it's actually called from VM scope during vCPU creation.

I'll chew on this, though if someone has any ideas...

> So, do we need to move KVM_TDX_INIT_VCPU to tdx_vcpu_async_ioctl() as well?

If it's _just_ INIT_VCPU that can race (assuming the VM-scoped state transtitions
take all vcpu->mutex locks, as proposed), then a dedicated mutex (spinlock?) would
suffice, and probably would be preferable.  If INIT_VCPU needs to take kvm->lock
to protect against other races, then I guess the big hammer approach could work?

  reply	other threads:[~2025-09-02 17:04 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-29  0:06 [RFC PATCH v2 00/18] KVM: x86/mmu: TDX post-populate cleanups Sean Christopherson
2025-08-29  0:06 ` [RFC PATCH v2 01/18] KVM: TDX: Drop PROVE_MMU=y sanity check on to-be-populated mappings Sean Christopherson
2025-08-29  6:20   ` Binbin Wu
2025-08-29  0:06 ` [RFC PATCH v2 02/18] KVM: x86/mmu: Add dedicated API to map guest_memfd pfn into TDP MMU Sean Christopherson
2025-08-29 18:34   ` Edgecombe, Rick P
2025-08-29 20:27     ` Sean Christopherson
2025-08-29  0:06 ` [RFC PATCH v2 03/18] Revert "KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU" Sean Christopherson
2025-08-29 19:00   ` Edgecombe, Rick P
2025-08-29  0:06 ` [RFC PATCH v2 04/18] KVM: x86/mmu: Rename kvm_tdp_map_page() to kvm_tdp_page_prefault() Sean Christopherson
2025-08-29 19:03   ` Edgecombe, Rick P
2025-08-29  0:06 ` [RFC PATCH v2 05/18] KVM: TDX: Drop superfluous page pinning in S-EPT management Sean Christopherson
2025-08-29  8:36   ` Binbin Wu
2025-08-29 19:53   ` Edgecombe, Rick P
2025-08-29 20:19     ` Sean Christopherson
2025-08-29 21:54       ` Edgecombe, Rick P
2025-08-29 22:02         ` Sean Christopherson
2025-08-29 22:17           ` Edgecombe, Rick P
2025-08-29 22:58             ` Sean Christopherson
2025-08-29 22:59               ` Edgecombe, Rick P
2025-09-01  1:25     ` Yan Zhao
2025-09-02 17:33       ` Sean Christopherson
2025-09-02 18:55         ` Edgecombe, Rick P
2025-09-04  8:45           ` Sean Christopherson
2025-08-29  0:06 ` [RFC PATCH v2 06/18] KVM: TDX: Return -EIO, not -EINVAL, on a KVM_BUG_ON() condition Sean Christopherson
2025-08-29  9:40   ` Binbin Wu
2025-08-29 16:58   ` Ira Weiny
2025-08-29 19:59   ` Edgecombe, Rick P
2025-08-29  0:06 ` [RFC PATCH v2 07/18] KVM: TDX: Fold tdx_sept_drop_private_spte() into tdx_sept_remove_private_spte() Sean Christopherson
2025-08-29  9:49   ` Binbin Wu
2025-08-29  0:06 ` [RFC PATCH v2 08/18] KVM: x86/mmu: Drop the return code from kvm_x86_ops.remove_external_spte() Sean Christopherson
2025-08-29  9:52   ` Binbin Wu
2025-08-29  0:06 ` [RFC PATCH v2 09/18] KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte() Sean Christopherson
2025-08-29  9:52   ` Binbin Wu
2025-08-29  0:06 ` [RFC PATCH v2 10/18] KVM: TDX: Use atomic64_dec_return() instead of a poor equivalent Sean Christopherson
2025-08-29 10:06   ` Binbin Wu
2025-08-29  0:06 ` [RFC PATCH v2 11/18] KVM: TDX: Fold tdx_mem_page_record_premap_cnt() into its sole caller Sean Christopherson
2025-09-02 22:46   ` Edgecombe, Rick P
2025-08-29  0:06 ` [RFC PATCH v2 12/18] KVM: TDX: Bug the VM if extended the initial measurement fails Sean Christopherson
2025-08-29  8:18   ` Yan Zhao
2025-08-29 18:16     ` Edgecombe, Rick P
2025-08-29 20:11       ` Sean Christopherson
2025-08-29 22:39         ` Edgecombe, Rick P
2025-08-29 23:15           ` Edgecombe, Rick P
2025-08-29 23:18             ` Sean Christopherson
2025-09-02  9:24         ` Yan Zhao
2025-09-02 17:04           ` Sean Christopherson [this message]
2025-09-03  0:18             ` Edgecombe, Rick P
2025-09-03  3:34               ` Yan Zhao
2025-09-03  9:19                 ` Yan Zhao
2025-08-29  0:06 ` [RFC PATCH v2 13/18] KVM: TDX: ADD pages to the TD image while populating mirror EPT entries Sean Christopherson
2025-08-29 23:42   ` Edgecombe, Rick P
2025-09-02 17:09     ` Sean Christopherson
2025-08-29  0:06 ` [RFC PATCH v2 14/18] KVM: TDX: Fold tdx_sept_zap_private_spte() into tdx_sept_remove_private_spte() Sean Christopherson
2025-09-02 17:31   ` Edgecombe, Rick P
2025-08-29  0:06 ` [RFC PATCH v2 15/18] KVM: TDX: Combine KVM_BUG_ON + pr_tdx_error() into TDX_BUG_ON() Sean Christopherson
2025-08-29  9:03   ` Binbin Wu
2025-08-29 14:19     ` Sean Christopherson
2025-09-01  1:46       ` Binbin Wu
2025-09-02 18:55   ` Edgecombe, Rick P
2025-08-29  0:06 ` [RFC PATCH v2 16/18] KVM: TDX: Derive error argument names from the local variable names Sean Christopherson
2025-08-30  0:00   ` Edgecombe, Rick P
2025-08-29  0:06 ` [RFC PATCH v2 17/18] KVM: TDX: Assert that mmu_lock is held for write when removing S-EPT entries Sean Christopherson
2025-08-29  0:06 ` [RFC PATCH v2 18/18] KVM: TDX: Add macro to retry SEAMCALLs when forcing vCPUs out of guest Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aLcjppW1eiCrxJPC@google.com \
    --to=seanjc@google.com \
    --cc=ackerleytng@google.com \
    --cc=ira.weiny@intel.com \
    --cc=kai.huang@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michael.roth@amd.com \
    --cc=pbonzini@redhat.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=vannapurve@google.com \
    --cc=yan.y.zhao@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).