From: Yan Zhao <yan.y.zhao@intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Marc Zyngier <maz@kernel.org>,
Oliver Upton <oliver.upton@linux.dev>,
Tianrui Zhao <zhaotianrui@loongson.cn>,
Bibo Mao <maobibo@loongson.cn>,
Huacai Chen <chenhuacai@kernel.org>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Anup Patel <anup@brainfault.org>, Paul Walmsley <pjw@kernel.org>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
Christian Borntraeger <borntraeger@linux.ibm.com>,
Janosch Frank <frankja@linux.ibm.com>,
Claudio Imbrenda <imbrenda@linux.ibm.com>,
Paolo Bonzini <pbonzini@redhat.com>,
"Kirill A. Shutemov" <kas@kernel.org>,
<linux-arm-kernel@lists.infradead.org>, <kvmarm@lists.linux.dev>,
<kvm@vger.kernel.org>, <loongarch@lists.linux.dev>,
<linux-mips@vger.kernel.org>, <linuxppc-dev@lists.ozlabs.org>,
<kvm-riscv@lists.infradead.org>,
<linux-riscv@lists.infradead.org>, <x86@kernel.org>,
<linux-coco@lists.linux.dev>, <linux-kernel@vger.kernel.org>,
Ira Weiny <ira.weiny@intel.com>, Kai Huang <kai.huang@intel.com>,
Michael Roth <michael.roth@amd.com>,
Vishal Annapurve <vannapurve@google.com>,
Rick Edgecombe <rick.p.edgecombe@intel.com>,
Ackerley Tng <ackerleytng@google.com>,
Binbin Wu <binbin.wu@linux.intel.com>
Subject: Re: [PATCH v3 24/25] KVM: TDX: Guard VM state transitions with "all" the locks
Date: Mon, 27 Oct 2025 17:26:09 +0800 [thread overview]
Message-ID: <aP86sdBZxXm5I17f@yzhao56-desk.sh.intel.com> (raw)
In-Reply-To: <aPuv8F8iDp3SLb9q@google.com>
On Fri, Oct 24, 2025 at 09:57:20AM -0700, Sean Christopherson wrote:
> On Fri, Oct 24, 2025, Yan Zhao wrote:
> > On Thu, Oct 16, 2025 at 05:32:42PM -0700, Sean Christopherson wrote:
> > We may have missed a zap in the guest_memfd punch hole path:
> >
> > The SEAMCALLs tdh_mem_range_block(), tdh_mem_track() tdh_mem_page_remove()
> > in the guest_memfd punch hole path are only protected by the filemap invaliate
> > lock and mmu_lock, so they could contend with v1 version of tdh_vp_init().
> >
> > (I'm writing a selftest to verify this, haven't been able to reproduce
> > tdh_vp_init(v1) returning BUSY yet. However, this race condition should be
> > theoretically possible.)
> >
> > Resources SHARED users EXCLUSIVE users
> > ------------------------------------------------------------------------
> > (1) TDR tdh_mng_rdwr tdh_mng_create
> > tdh_vp_create tdh_mng_add_cx
> > tdh_vp_addcx tdh_mng_init
> > tdh_vp_init(v0) tdh_mng_vpflushdone
> > tdh_vp_enter tdh_mng_key_config
> > tdh_vp_flush tdh_mng_key_freeid
> > tdh_vp_rd_wr tdh_mr_extend
> > tdh_mem_sept_add tdh_mr_finalize
> > tdh_mem_sept_remove tdh_vp_init(v1)
> > tdh_mem_page_aug tdh_mem_page_add
> > tdh_mem_page_remove
> > tdh_mem_range_block
> > tdh_mem_track
> > tdh_mem_range_unblock
> > tdh_phymem_page_reclaim
> >
> > Do you think we can acquire the mmu_lock for cmd KVM_TDX_INIT_VCPU?
>
> Ugh, I'd rather not? Refresh me, what's the story with "v1"? Are we now on v2?
No... We are now on v1.
As in [1], I found that TDX module changed SEAMCALL TDH_VP_INIT behavior to
require exclusive lock on resource TDR when leaf_opcode.version > 0.
Therefore, we moved KVM_TDX_INIT_VCPU to tdx_vcpu_unlocked_ioctl() in patch 22.
[1] https://lore.kernel.org/all/aLa34QCJCXGLk%2Ffl@yzhao56-desk.sh.intel.com/
> If this is effectively limited to deprecated TDX modules, my vote would be to
> ignore the flaw and avoid even more complexity in KVM.
Conversely, the new TDX module has this flaw...
> > > @@ -3155,12 +3198,13 @@ int tdx_vcpu_unlocked_ioctl(struct kvm_vcpu *vcpu, void __user *argp)
> > > if (r)
> > > return r;
> > >
> > > + CLASS(tdx_vm_state_guard, guard)(kvm);
> > Should we move the guard to inside each cmd? Then there's no need to acquire the
> > locks in the default cases.
>
> No, I don't think it's a good tradeoff. We'd also need to move vcpu_{load,put}()
> into the cmd helpers, and theoretically slowing down a bad ioctl invocation due
> to taking extra locks is a complete non-issue.
Fair enough.
next prev parent reply other threads:[~2025-10-27 9:28 UTC|newest]
Thread overview: 97+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-17 0:32 [PATCH v3 00/25] KVM: x86/mmu: TDX post-populate cleanups Sean Christopherson
2025-10-17 0:32 ` [PATCH v3 01/25] KVM: Make support for kvm_arch_vcpu_async_ioctl() mandatory Sean Christopherson
2025-10-17 9:12 ` Claudio Imbrenda
2025-10-17 0:32 ` [PATCH v3 02/25] KVM: Rename kvm_arch_vcpu_async_ioctl() to kvm_arch_vcpu_unlocked_ioctl() Sean Christopherson
2025-10-17 9:13 ` Claudio Imbrenda
2025-10-17 0:32 ` [PATCH v3 03/25] KVM: TDX: Drop PROVE_MMU=y sanity check on to-be-populated mappings Sean Christopherson
2025-10-22 3:15 ` Binbin Wu
2025-10-17 0:32 ` [PATCH v3 04/25] KVM: x86/mmu: Add dedicated API to map guest_memfd pfn into TDP MMU Sean Christopherson
2025-10-21 0:10 ` Edgecombe, Rick P
2025-10-21 4:06 ` Yan Zhao
2025-10-21 16:36 ` Sean Christopherson
2025-10-22 8:05 ` Yan Zhao
2025-10-22 18:12 ` Sean Christopherson
2025-10-23 6:48 ` Yan Zhao
2025-10-22 4:53 ` Yan Zhao
2025-10-30 8:34 ` Yan Zhao
2025-11-04 17:57 ` Sean Christopherson
2025-11-05 7:32 ` Yan Zhao
2025-11-05 7:47 ` Yan Zhao
2025-11-05 15:26 ` Sean Christopherson
2025-10-23 10:28 ` Huang, Kai
2025-10-17 0:32 ` [PATCH v3 05/25] Revert "KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU" Sean Christopherson
2025-10-22 5:56 ` Binbin Wu
2025-10-23 10:30 ` Huang, Kai
2025-10-17 0:32 ` [PATCH v3 06/25] KVM: x86/mmu: Rename kvm_tdp_map_page() to kvm_tdp_page_prefault() Sean Christopherson
2025-10-22 5:57 ` Binbin Wu
2025-10-23 10:38 ` Huang, Kai
2025-10-17 0:32 ` [PATCH v3 07/25] KVM: TDX: Drop superfluous page pinning in S-EPT management Sean Christopherson
2025-10-21 0:10 ` Edgecombe, Rick P
2025-10-17 0:32 ` [PATCH v3 08/25] KVM: TDX: Return -EIO, not -EINVAL, on a KVM_BUG_ON() condition Sean Christopherson
2025-10-17 0:32 ` [PATCH v3 09/25] KVM: TDX: Fold tdx_sept_drop_private_spte() into tdx_sept_remove_private_spte() Sean Christopherson
2025-10-23 10:53 ` Huang, Kai
2025-10-23 14:59 ` Sean Christopherson
2025-10-23 22:20 ` Huang, Kai
2025-10-17 0:32 ` [PATCH v3 10/25] KVM: x86/mmu: Drop the return code from kvm_x86_ops.remove_external_spte() Sean Christopherson
2025-10-22 8:46 ` Yan Zhao
2025-10-22 19:08 ` Sean Christopherson
2025-10-17 0:32 ` [PATCH v3 11/25] KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte() Sean Christopherson
2025-10-23 22:21 ` Huang, Kai
2025-10-17 0:32 ` [PATCH v3 12/25] KVM: TDX: Use atomic64_dec_return() instead of a poor equivalent Sean Christopherson
2025-10-17 0:32 ` [PATCH v3 13/25] KVM: TDX: Fold tdx_mem_page_record_premap_cnt() into its sole caller Sean Christopherson
2025-10-23 22:32 ` Huang, Kai
2025-10-24 7:21 ` Huang, Kai
2025-10-24 7:38 ` Binbin Wu
2025-10-24 16:33 ` Sean Christopherson
2025-10-27 9:01 ` Binbin Wu
2025-10-28 0:29 ` Sean Christopherson
2025-10-17 0:32 ` [PATCH v3 14/25] KVM: TDX: Bug the VM if extended the initial measurement fails Sean Christopherson
2025-10-21 0:10 ` Edgecombe, Rick P
2025-10-23 17:27 ` Sean Christopherson
2025-10-23 22:48 ` Huang, Kai
2025-10-24 16:35 ` Sean Christopherson
2025-10-27 9:31 ` Yan Zhao
2025-10-17 0:32 ` [PATCH v3 15/25] KVM: TDX: ADD pages to the TD image while populating mirror EPT entries Sean Christopherson
2025-10-24 7:18 ` Huang, Kai
2025-10-17 0:32 ` [PATCH v3 16/25] KVM: TDX: Fold tdx_sept_zap_private_spte() into tdx_sept_remove_private_spte() Sean Christopherson
2025-10-24 9:53 ` Huang, Kai
2025-10-17 0:32 ` [PATCH v3 17/25] KVM: TDX: Combine KVM_BUG_ON + pr_tdx_error() into TDX_BUG_ON() Sean Christopherson
2025-10-17 0:32 ` [PATCH v3 18/25] KVM: TDX: Derive error argument names from the local variable names Sean Christopherson
2025-10-17 0:32 ` [PATCH v3 19/25] KVM: TDX: Assert that mmu_lock is held for write when removing S-EPT entries Sean Christopherson
2025-10-23 7:37 ` Yan Zhao
2025-10-23 15:14 ` Sean Christopherson
2025-10-24 10:05 ` Yan Zhao
2025-10-17 0:32 ` [PATCH v3 20/25] KVM: TDX: Add macro to retry SEAMCALLs when forcing vCPUs out of guest Sean Christopherson
2025-10-24 10:09 ` Huang, Kai
2025-10-27 19:20 ` Sean Christopherson
2025-10-27 22:00 ` Huang, Kai
2025-10-17 0:32 ` [PATCH v3 21/25] KVM: TDX: Add tdx_get_cmd() helper to get and validate sub-ioctl command Sean Christopherson
2025-10-21 0:12 ` Edgecombe, Rick P
2025-10-24 10:11 ` Huang, Kai
2025-10-17 0:32 ` [PATCH v3 22/25] KVM: TDX: Convert INIT_MEM_REGION and INIT_VCPU to "unlocked" vCPU ioctl Sean Christopherson
2025-10-24 10:36 ` Huang, Kai
2025-10-17 0:32 ` [PATCH v3 23/25] KVM: TDX: Use guard() to acquire kvm->lock in tdx_vm_ioctl() Sean Christopherson
2025-10-21 0:10 ` Edgecombe, Rick P
2025-10-21 16:56 ` Sean Christopherson
2025-10-21 19:03 ` Edgecombe, Rick P
2025-10-24 10:36 ` Huang, Kai
2025-10-17 0:32 ` [PATCH v3 24/25] KVM: TDX: Guard VM state transitions with "all" the locks Sean Christopherson
2025-10-24 10:02 ` Yan Zhao
2025-10-24 16:57 ` Sean Christopherson
2025-10-27 9:26 ` Yan Zhao [this message]
2025-10-27 17:46 ` Edgecombe, Rick P
2025-10-27 18:10 ` Sean Christopherson
2025-10-28 0:28 ` [PATCH] KVM: TDX: Take MMU lock around tdh_vp_init() Rick Edgecombe
2025-10-28 5:37 ` Yan Zhao
2025-10-29 6:37 ` Binbin Wu
2025-11-18 23:31 ` Sean Christopherson
2025-11-19 0:01 ` Edgecombe, Rick P
2025-11-19 0:02 ` Edgecombe, Rick P
2025-10-28 1:37 ` [PATCH v3 24/25] KVM: TDX: Guard VM state transitions with "all" the locks Yan Zhao
2025-10-28 17:40 ` Edgecombe, Rick P
2025-10-24 10:53 ` Huang, Kai
2025-10-28 0:28 ` Huang, Kai
2025-10-28 0:37 ` Sean Christopherson
2025-10-28 1:01 ` Huang, Kai
2025-10-17 0:32 ` [PATCH v3 25/25] KVM: TDX: Fix list_add corruption during vcpu_load() Sean Christopherson
2025-10-20 8:50 ` Yan Zhao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aP86sdBZxXm5I17f@yzhao56-desk.sh.intel.com \
--to=yan.y.zhao@intel.com \
--cc=ackerleytng@google.com \
--cc=anup@brainfault.org \
--cc=aou@eecs.berkeley.edu \
--cc=binbin.wu@linux.intel.com \
--cc=borntraeger@linux.ibm.com \
--cc=chenhuacai@kernel.org \
--cc=frankja@linux.ibm.com \
--cc=imbrenda@linux.ibm.com \
--cc=ira.weiny@intel.com \
--cc=kai.huang@intel.com \
--cc=kas@kernel.org \
--cc=kvm-riscv@lists.infradead.org \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-coco@lists.linux.dev \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mips@vger.kernel.org \
--cc=linux-riscv@lists.infradead.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=loongarch@lists.linux.dev \
--cc=maddy@linux.ibm.com \
--cc=maobibo@loongson.cn \
--cc=maz@kernel.org \
--cc=michael.roth@amd.com \
--cc=oliver.upton@linux.dev \
--cc=palmer@dabbelt.com \
--cc=pbonzini@redhat.com \
--cc=pjw@kernel.org \
--cc=rick.p.edgecombe@intel.com \
--cc=seanjc@google.com \
--cc=vannapurve@google.com \
--cc=x86@kernel.org \
--cc=zhaotianrui@loongson.cn \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox