From: Sean Christopherson <seanjc@google.com>
To: Yan Zhao <yan.y.zhao@intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
Michael Roth <michael.roth@amd.com>,
Ira Weiny <ira.weiny@intel.com>,
Vishal Annapurve <vannapurve@google.com>,
Rick Edgecombe <rick.p.edgecombe@intel.com>
Subject: Re: [RFC PATCH 09/12] KVM: TDX: Fold tdx_mem_page_record_premap_cnt() into its sole caller
Date: Wed, 27 Aug 2025 12:08:27 -0700 [thread overview]
Message-ID: <aK9Xqy0W1ghonWUL@google.com> (raw)
In-Reply-To: <aK7Ji3kAoDaEYn3h@yzhao56-desk.sh.intel.com>
On Wed, Aug 27, 2025, Yan Zhao wrote:
> On Tue, Aug 26, 2025 at 05:05:19PM -0700, Sean Christopherson wrote:
> > @@ -1641,14 +1618,30 @@ static int tdx_sept_set_private_spte(struct kvm *kvm, gfn_t gfn,
> > return -EIO;
> >
> > /*
> > - * Read 'pre_fault_allowed' before 'kvm_tdx->state'; see matching
> > - * barrier in tdx_td_finalize().
> > + * Ensure pre_fault_allowed is read by kvm_arch_vcpu_pre_fault_memory()
> > + * before kvm_tdx->state. Userspace must not be allowed to pre-fault
> > + * arbitrary memory until the initial memory image is finalized. Pairs
> > + * with the smp_wmb() in tdx_td_finalize().
> > */
> > smp_rmb();
> > - if (likely(kvm_tdx->state == TD_STATE_RUNNABLE))
> > - return tdx_mem_page_aug(kvm, gfn, level, pfn);
> >
> > - return tdx_mem_page_record_premap_cnt(kvm, gfn, level, pfn);
> > + /*
> > + * If the TD isn't finalized/runnable, then userspace is initializing
> > + * the VM image via KVM_TDX_INIT_MEM_REGION. Increment the number of
> > + * pages that need to be initialized via TDH.MEM.PAGE.ADD (PAGE.ADD
> > + * requires a pre-existing S-EPT mapping). KVM_TDX_FINALIZE_VM checks
> > + * the counter to ensure all mapped pages have been added to the image,
> > + * to prevent running the TD with uninitialized memory.
> To prevent the mismatch between mirror EPT and the S-EPT?
No? Because KVM bumps the count when installing the S-EPT and decrements it
on AUG, so I don't see how nr_premapped guards against M-EPT vs. S-EPT issues?
> e.g., Before KVM_TDX_FINALIZE_VM, if userspace performs a zap after the
> TDH.MEM.PAGE.ADD, the page will be removed from the S-EPT. The count of
> nr_premapped will not change after the successful TDH.MEM.RANGE.BLOCK and
> TDH.MEM.PAGE.REMOVE.
Eww. It would be nice to close that hole, but I suppose it's futile, e.g. the
underlying problem is unexpectedly removing pages from the initial, whether the
VMM is doing stupid things before vs. after FINALIZE doesn't really matter.
> As a result, the TD will still run with uninitialized memory.
No? Because BLOCK+REMOVE means there are no valid S-EPT mappings. There's a
"hole" that the guest might not expect, but that hole will trigger an EPT
violation and only get "filled" if the guest explicitly accepts an AUG'd page.
Side topic, why does KVM tolerate tdh_mem_page_add() failure? IIUC, playing
nice with tdh_mem_page_add() failure necessitates both the
tdx_is_sept_zap_err_due_to_premap() craziness and the check in tdx_td_finalize()
that all pending pages have been consumed.
What reasonable use case is there for gracefully handling tdh_mem_page_add() failure?
If there is a need to handle failure, I gotta imagine it's only for the -EBUSY
case. And if it's only for -EBUSY, why can't that be handled by retrying in
tdx_vcpu_init_mem_region()? If tdx_vcpu_init_mem_region() guarantees that all
pages mapped into the S-EPT are ADDed, then it can assert that there are no
pending pages when it completes (even if it "fails"), and similarly
tdx_td_finalize() can KVM_BUG_ON/WARN_ON the number of pending pages being
non-zero.
> > + */
> > + if (unlikely(kvm_tdx->state != TD_STATE_RUNNABLE)) {
> > + if (KVM_BUG_ON(kvm->arch.pre_fault_allowed, kvm))
> > + return -EIO;
> > +
> > + atomic64_inc(&kvm_tdx->nr_premapped);
> > + return 0;
> > + }
> > +
> > + return tdx_mem_page_aug(kvm, gfn, level, pfn);
> > }
> >
> > static int tdx_sept_drop_private_spte(struct kvm *kvm, gfn_t gfn,
> > --
> > 2.51.0.268.g9569e192d0-goog
> >
next prev parent reply other threads:[~2025-08-27 19:08 UTC|newest]
Thread overview: 85+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-27 0:05 [RFC PATCH 00/12] KVM: x86/mmu: TDX post-populate cleanups Sean Christopherson
2025-08-27 0:05 ` [RFC PATCH 01/12] KVM: TDX: Drop PROVE_MMU=y sanity check on to-be-populated mappings Sean Christopherson
2025-08-27 8:14 ` Yan Zhao
2025-08-28 0:37 ` Ira Weiny
2025-08-28 2:13 ` Huang, Kai
2025-08-27 0:05 ` [RFC PATCH 02/12] KVM: x86/mmu: Add dedicated API to map guest_memfd pfn into TDP MMU Sean Christopherson
2025-08-27 8:25 ` Yan Zhao
2025-08-28 0:54 ` Edgecombe, Rick P
2025-08-28 1:26 ` Edgecombe, Rick P
2025-08-28 6:23 ` Yan Zhao
2025-08-28 19:40 ` Sean Christopherson
2025-08-29 1:16 ` Yan Zhao
2025-09-01 0:39 ` Yan Zhao
2025-08-28 6:55 ` Yan Zhao
2025-08-28 0:40 ` Ira Weiny
2025-08-28 1:51 ` Edgecombe, Rick P
2025-08-28 19:57 ` Sean Christopherson
2025-08-27 0:05 ` [RFC PATCH 03/12] Revert "KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU" Sean Christopherson
2025-08-27 0:05 ` [RFC PATCH 04/12] KVM: x86/mmu: Rename kvm_tdp_map_page() to kvm_tdp_prefault_page() Sean Christopherson
2025-08-28 2:01 ` Edgecombe, Rick P
2025-08-28 18:50 ` Sean Christopherson
2025-08-28 19:04 ` Edgecombe, Rick P
2025-08-27 0:05 ` [RFC PATCH 05/12] KVM: TDX: Drop superfluous page pinning in S-EPT management Sean Christopherson
2025-08-27 8:33 ` Yan Zhao
2025-08-28 2:05 ` Edgecombe, Rick P
2025-08-28 20:16 ` Sean Christopherson
2025-08-28 0:36 ` Ira Weiny
2025-08-28 7:08 ` Yan Zhao
2025-08-28 15:54 ` Ira Weiny
2025-08-28 2:45 ` Huang, Kai
2025-08-27 0:05 ` [RFC PATCH 06/12] KVM: TDX: Return -EIO, not -EINVAL, on a KVM_BUG_ON() condition Sean Christopherson
2025-08-27 8:39 ` Yan Zhao
2025-08-27 17:26 ` Sean Christopherson
2025-08-28 2:11 ` Edgecombe, Rick P
2025-08-28 19:21 ` Sean Christopherson
2025-08-28 20:13 ` Edgecombe, Rick P
2025-08-28 21:00 ` Sean Christopherson
2025-08-28 21:19 ` Edgecombe, Rick P
2025-08-28 21:34 ` Sean Christopherson
2025-08-28 15:03 ` Ira Weiny
2025-08-27 0:05 ` [RFC PATCH 07/12] KVM: TDX: Avoid a double-KVM_BUG_ON() in tdx_sept_zap_private_spte() Sean Christopherson
2025-08-28 2:19 ` Edgecombe, Rick P
2025-08-28 14:50 ` Edgecombe, Rick P
2025-08-29 1:10 ` Yan Zhao
2025-08-28 15:02 ` Ira Weiny
2025-08-27 0:05 ` [RFC PATCH 08/12] KVM: TDX: Use atomic64_dec_return() instead of a poor equivalent Sean Christopherson
2025-08-28 2:56 ` Edgecombe, Rick P
2025-08-28 6:48 ` Yan Zhao
2025-08-28 19:14 ` Edgecombe, Rick P
2025-08-28 22:33 ` Sean Christopherson
2025-08-28 23:18 ` Edgecombe, Rick P
2025-08-28 15:03 ` Ira Weiny
2025-08-27 0:05 ` [RFC PATCH 09/12] KVM: TDX: Fold tdx_mem_page_record_premap_cnt() into its sole caller Sean Christopherson
2025-08-27 9:02 ` Yan Zhao
2025-08-27 19:08 ` Sean Christopherson [this message]
2025-08-28 3:13 ` Edgecombe, Rick P
2025-08-28 5:56 ` Yan Zhao
2025-08-28 19:08 ` Edgecombe, Rick P
2025-08-28 5:43 ` Yan Zhao
2025-08-28 17:00 ` Sean Christopherson
2025-08-28 18:52 ` Edgecombe, Rick P
2025-08-28 20:26 ` Sean Christopherson
2025-08-28 21:33 ` Edgecombe, Rick P
2025-08-28 21:57 ` Sean Christopherson
2025-08-28 23:17 ` Edgecombe, Rick P
2025-08-29 6:08 ` Yan Zhao
2025-08-28 22:06 ` Ira Weiny
2025-08-28 23:17 ` Sean Christopherson
2025-08-29 0:35 ` Ira Weiny
2025-08-29 6:06 ` Yan Zhao
2025-08-28 21:44 ` Sean Christopherson
2025-08-29 2:42 ` Binbin Wu
2025-08-29 2:31 ` Yan Zhao
2025-08-29 6:33 ` Yan Zhao
2025-08-28 15:30 ` Ira Weiny
2025-08-28 15:28 ` Ira Weiny
2025-08-27 0:05 ` [RFC PATCH 10/12] KVM: TDX: Assert that slots_lock is held when nr_premapped is accessed Sean Christopherson
2025-08-27 0:05 ` [RFC PATCH 11/12] KVM: TDX: Track nr_premapped as an "unsigned long", not an "atomic64_t" Sean Christopherson
2025-08-27 9:12 ` Yan Zhao
2025-08-27 0:05 ` [RFC PATCH 12/12] KVM: TDX: Rename nr_premapped to nr_pending_tdh_mem_page_adds Sean Christopherson
2025-08-27 9:22 ` Yan Zhao
2025-08-28 15:23 ` Ira Weiny
2025-08-27 9:48 ` [RFC PATCH 00/12] KVM: x86/mmu: TDX post-populate cleanups Yan Zhao
2025-08-28 19:01 ` Edgecombe, Rick P
2025-08-28 23:19 ` Sean Christopherson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aK9Xqy0W1ghonWUL@google.com \
--to=seanjc@google.com \
--cc=ira.weiny@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=michael.roth@amd.com \
--cc=pbonzini@redhat.com \
--cc=rick.p.edgecombe@intel.com \
--cc=vannapurve@google.com \
--cc=yan.y.zhao@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).