From: Sean Christopherson <seanjc@google.com>
To: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: ackerleytng@google.com, anup@brainfault.org,
aou@eecs.berkeley.edu, binbin.wu@linux.intel.com,
borntraeger@linux.ibm.com, chenhuacai@kernel.org,
frankja@linux.ibm.com, imbrenda@linux.ibm.com,
ira.weiny@intel.com, kai.huang@intel.com, kas@kernel.org,
kvm-riscv@lists.infradead.org, kvm@vger.kernel.org,
kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org,
linux-mips@vger.kernel.org, linux-riscv@lists.infradead.org,
linuxppc-dev@lists.ozlabs.org, loongarch@lists.linux.dev,
maddy@linux.ibm.com, maobibo@loongson.cn, maz@kernel.org,
michael.roth@amd.com, oliver.upton@linux.dev,
palmer@dabbelt.com, pbonzini@redhat.com, pjw@kernel.org,
vannapurve@google.com, x86@kernel.org, yan.y.zhao@intel.com,
zhaotianrui@loongson.cn
Subject: Re: [PATCH] KVM: TDX: Take MMU lock around tdh_vp_init()
Date: Tue, 18 Nov 2025 15:31:22 -0800
Message-ID: <aR0Byu3bd3URxzhu@google.com>
In-Reply-To: <20251028002824.1470939-1-rick.p.edgecombe@intel.com>
On Mon, Oct 27, 2025, Rick Edgecombe wrote:
> Take MMU lock around tdh_vp_init() in KVM_TDX_INIT_VCPU to prevent
> meeting contention during retries in some no-fail MMU paths.
>
> The TDX module takes various try-locks internally, which can cause
> SEAMCALLs to return an error code when contention is met. Dealing with
> an error in some of the MMU paths that make SEAMCALLs is not straightforward,
> so KVM takes steps to ensure that these will meet no contention
> during a single BUSY error retry. The whole scheme relies on KVM to take
> appropriate steps to avoid making any SEAMCALLs that could contend while
> the retry is happening.
>
> Unfortunately, there is a case where contention could be met if userspace
> does something unusual. Specifically, hole punching a gmem fd while
> initializing the TD vCPU. The impact would be triggering a KVM_BUG_ON().
>
> The resource being contended is called the "TDR resource" in TDX docs
> parlance. tdh_vp_init() can take this resource as exclusive if the
> 'version' passed is 1, which happens to be the version the kernel passes. The
> various MMU operations (tdh_mem_range_block(), tdh_mem_track() and
> tdh_mem_page_remove()) take it as shared.
>
> There isn't a KVM lock that maps conceptually and in a lock order friendly
> way to the TDR lock. So to minimize infrastructure, just take MMU lock
> around tdh_vp_init(). This makes the operations we care about mutually
> exclusive. Since the other operations are under a write mmu_lock, the code
> could just take the lock for read, however this is weirdly inverted from
> the actual underlying resource being contended. Since this is covering an
> edge case that shouldn't be hit in normal usage, be a little less weird
> and take the mmu_lock for write around the call.
>
> Fixes: 02ab57707bdb ("KVM: TDX: Implement hooks to propagate changes of TDP MMU mirror page table")
> Reported-by: Yan Zhao <yan.y.zhao@intel.com>
> Suggested-by: Yan Zhao <yan.y.zhao@intel.com>
> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
> ---
> Hi,
>
> It was indeed awkward, as Sean must have sniffed. But seems ok enough to
> close the issue.
>
> Yan, can you give it a look?
>
> Posted here, but applies on top of this series.

In the future, please don't post in-reply-to, as it mucks up my b4 workflow.

Applied to kvm-x86 tdx, with a more verbose comment as suggested by Binbin.

[1/1] KVM: TDX: Take MMU lock around tdh_vp_init()
      https://github.com/kvm-x86/linux/commit/9a89894f30d5