From: Isaku Yamahata <isaku.yamahata@intel.com>
To: "Huang, Kai" <kai.huang@intel.com>
Cc: Isaku Yamahata <isaku.yamahata@intel.com>,
Sean Christopherson <seanjc@google.com>,
Rick Edgecombe <rick.p.edgecombe@intel.com>,
pbonzini@redhat.com, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com,
erdemaktas@google.com, sagis@google.com, yan.y.zhao@intel.com,
dmatlack@google.com, isaku.yamahata@linux.intel.com
Subject: Re: [PATCH 08/16] KVM: x86/mmu: Bug the VM if kvm_zap_gfn_range() is called for TDX
Date: Wed, 15 May 2024 17:15:30 -0700 [thread overview]
Message-ID: <20240516001530.GG168153@ls.amr.corp.intel.com> (raw)
In-Reply-To: <eab9201e-702e-46bc-9782-d6dfe3da2127@intel.com>
On Thu, May 16, 2024 at 10:17:50AM +1200,
"Huang, Kai" <kai.huang@intel.com> wrote:
> On 16/05/2024 4:22 am, Isaku Yamahata wrote:
> > On Wed, May 15, 2024 at 08:34:37AM -0700,
> > Sean Christopherson <seanjc@google.com> wrote:
> >
> > > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > > index d5cf5b15a10e..808805b3478d 100644
> > > > --- a/arch/x86/kvm/mmu/mmu.c
> > > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > > @@ -6528,8 +6528,17 @@ void kvm_zap_gfn_range(struct kvm *kvm, gfn_t gfn_start, gfn_t gfn_end)
> > > > flush = kvm_rmap_zap_gfn_range(kvm, gfn_start, gfn_end);
> > > > - if (tdp_mmu_enabled)
> > > > + if (tdp_mmu_enabled) {
> > > > + /*
> > > > + * kvm_zap_gfn_range() is used when MTRR or PAT memory
> > > > + * type was changed. TDX can't handle zapping the private
> > > > + * mapping, but it's ok because KVM doesn't support either of
> > > > + * those features for TDX. In case a new caller appears, BUG
> > > > + * the VM if it's called for solutions with private aliases.
> > > > + */
> > > > + KVM_BUG_ON(kvm_gfn_shared_mask(kvm), kvm);
> > >
> > > Please stop using kvm_gfn_shared_mask() as a proxy for "is this TDX". Using a
> > > generic name quite obviously doesn't prevent TDX details for bleeding into common
> > > code, and dancing around things just makes it all unnecessarily confusing.
> > >
> > > If we can't avoid bleeding TDX details into common code, my vote is to bite the
> > > bullet and simply check vm_type.
> >
> > TDX has several aspects related to the TDP MMU.
> > 1) Based on the faulting GPA, determine which KVM page table to walk.
> > (private-vs-shared)
> > 2) Need to call TDX SEAMCALL to operate on Secure-EPT instead of direct memory
> > load/store. TDP MMU needs hooks for it.
> > 3) The tables must be zapped from the leaf. not the root or the middle.
> >
> > For 1) and 2), what about something like this? TDX backend code will set
> > kvm->arch.has_mirrored_pt = true; I think we will use kvm_gfn_shared_mask() only
> > for address conversion (shared<->private).
> >
> > For 1), maybe we can add struct kvm_page_fault.walk_mirrored_pt
> > (or whatever preferable name)?
> >
> > For 3), flag of memslot handles it.
> >
> > ---
> > arch/x86/include/asm/kvm_host.h | 3 +++
> > 1 file changed, 3 insertions(+)
> >
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index aabf1648a56a..218b575d24bd 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1289,6 +1289,7 @@ struct kvm_arch {
> > u8 vm_type;
> > bool has_private_mem;
> > bool has_protected_state;
> > + bool has_mirrored_pt;
> > struct hlist_head mmu_page_hash[KVM_NUM_MMU_PAGES];
> > struct list_head active_mmu_pages;
> > struct list_head zapped_obsolete_pages;
> > @@ -2171,8 +2172,10 @@ void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
> > #ifdef CONFIG_KVM_PRIVATE_MEM
> > #define kvm_arch_has_private_mem(kvm) ((kvm)->arch.has_private_mem)
> > +#define kvm_arch_has_mirrored_pt(kvm) ((kvm)->arch.has_mirrored_pt)
> > #else
> > #define kvm_arch_has_private_mem(kvm) false
> > +#define kvm_arch_has_mirrored_pt(kvm) false
> > #endif
> > static inline u16 kvm_read_ldt(void)
>
> I think this 'has_mirrored_pt' (or a better name) is better, because it
> clearly conveys it is for the "page table", but not the actual page that any
> page table entry maps to.
>
> AFAICT we need to split the concept of "private page table itself" and the
> "memory type of the actual GFN".
>
> E.g., both SEV-SNP and TDX has concept of "private memory" (obviously), but
> I was told only TDX uses a dedicated private page table which isn't directly
> accessible for KVV. SEV-SNP on the other hand just uses normal page table +
> additional HW managed table to make sure the security.
kvm_mmu_page_role.is_private is not good name now. Probably is_mirrored_pt or
need_callback or whatever makes sense.
> In other words, I think we should decide whether to invoke TDP MMU callback
> for private mapping (the page table itself may just be normal one) depending
> on the fault->is_private, but not whether the page table is private:
>
> if (fault->is_private && kvm_x86_ops->set_private_spte)
> kvm_x86_set_private_spte(...);
> else
> tdp_mmu_set_spte_atomic(...);
This doesn't work for two reasons.
- We need to pass down struct kvm_page_fault fault deep only for this.
We could change the code in such way.
- We don't have struct kvm_page_fault fault for zapping case.
We could create a dummy one and pass it around.
Essentially the issue is how to pass down is_private or stash the info
somewhere or determine it somehow. Options I think of are
- Pass around fault:
Con: fault isn't passed down
Con: Create fake fault for zapping case
- Stash it in struct tdp_iter and pass around iter:
Pro: work for zapping case
Con: we need to change the code to pass down tdp_iter
- Pass around is_private (or mirrored_pt or whatever):
Pro: Don't need to add member to some structure
Con: We need to pass it around still.
- Stash it in kvm_mmu_page:
The patch series uses kvm_mmu_page.role.
Pro: We don't need to pass around because we know struct kvm_mmu_page
Con: Need to twist root page allocation
- Use gfn. kvm_is_private_gfn(kvm, gfn):
Con: The use of gfn is confusing. It's too TDX specific.
> And the 'has_mirrored_pt' should be only used to select the root of the page
> table that we want to operate on.
We can add one more bool to struct kvm_page_fault.follow_mirrored_pt or
something to represent it. We can initialize it in __kvm_mmu_do_page_fault().
.follow_mirrored_pt = kvm->arch.has_mirrored_pt && kvm_is_private_gpa(gpa);
> This also gives a chance that if there's anything special needs to be done
> for page allocated for the "non-leaf" middle page table for SEV-SNP, it can
> just fit.
Can you please elaborate on this?
--
Isaku Yamahata <isaku.yamahata@intel.com>
next prev parent reply other threads:[~2024-05-16 0:15 UTC|newest]
Thread overview: 152+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-15 0:59 [PATCH 00/16] TDX MMU prep series part 1 Rick Edgecombe
2024-05-15 0:59 ` [PATCH 01/16] KVM: x86: Add a VM type define for TDX Rick Edgecombe
2024-05-15 0:59 ` [PATCH 02/16] KVM: x86/mmu: Introduce a slot flag to zap only slot leafs on slot deletion Rick Edgecombe
2024-05-15 13:24 ` Huang, Kai
2024-05-15 19:09 ` Sean Christopherson
2024-05-15 19:23 ` Edgecombe, Rick P
2024-05-15 20:05 ` Sean Christopherson
2024-05-15 20:53 ` Edgecombe, Rick P
2024-05-15 22:47 ` Sean Christopherson
2024-05-15 23:06 ` Huang, Kai
2024-05-15 23:20 ` Sean Christopherson
2024-05-15 23:36 ` Huang, Kai
2024-05-16 1:12 ` Xiaoyao Li
2024-05-17 15:30 ` Paolo Bonzini
2024-05-22 1:29 ` Yan Zhao
2024-05-22 2:31 ` Sean Christopherson
2024-05-22 6:48 ` Yan Zhao
2024-05-22 15:45 ` Paolo Bonzini
2024-05-24 1:50 ` Yan Zhao
2024-05-15 23:56 ` Edgecombe, Rick P
2024-05-16 2:21 ` Edgecombe, Rick P
2024-05-16 3:56 ` Yan Zhao
2024-05-17 15:27 ` Paolo Bonzini
2024-05-17 15:25 ` Paolo Bonzini
2024-05-15 18:03 ` Isaku Yamahata
2024-05-15 0:59 ` [PATCH 03/16] KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU Rick Edgecombe
2024-05-17 7:44 ` Chao Gao
2024-05-17 9:08 ` Isaku Yamahata
2024-05-15 0:59 ` [PATCH 04/16] KVM: x86/mmu: Add address conversion functions for TDX shared bit of GPA Rick Edgecombe
2024-05-15 22:34 ` Huang, Kai
2024-05-15 23:21 ` Edgecombe, Rick P
2024-05-15 23:31 ` Huang, Kai
2024-05-15 23:38 ` Edgecombe, Rick P
2024-05-15 23:44 ` Huang, Kai
2024-05-15 23:59 ` Edgecombe, Rick P
2024-05-16 0:12 ` Huang, Kai
2024-05-16 0:19 ` Edgecombe, Rick P
2024-05-16 0:25 ` Huang, Kai
2024-05-16 0:35 ` Edgecombe, Rick P
2024-05-16 1:04 ` Huang, Kai
2024-05-16 1:20 ` Edgecombe, Rick P
2024-05-16 1:40 ` Huang, Kai
2024-05-16 5:52 ` Yan Zhao
2024-05-18 0:25 ` Edgecombe, Rick P
2024-05-16 23:08 ` Edgecombe, Rick P
2024-05-17 0:37 ` Huang, Kai
2024-05-17 1:51 ` Edgecombe, Rick P
2024-05-17 4:26 ` Huang, Kai
2024-05-17 21:12 ` Edgecombe, Rick P
2024-05-15 0:59 ` [PATCH 05/16] KVM: Add member to struct kvm_gfn_range for target alias Rick Edgecombe
2024-05-17 20:58 ` Edgecombe, Rick P
2024-05-15 0:59 ` [PATCH 06/16] KVM: x86/mmu: Add a new is_private member for union kvm_mmu_page_role Rick Edgecombe
2024-05-15 0:59 ` [PATCH 07/16] KVM: x86/mmu: Add a private pointer to struct kvm_mmu_page Rick Edgecombe
2024-05-15 0:59 ` [PATCH 08/16] KVM: x86/mmu: Bug the VM if kvm_zap_gfn_range() is called for TDX Rick Edgecombe
2024-05-15 13:27 ` Huang, Kai
2024-05-15 15:22 ` Edgecombe, Rick P
2024-05-15 23:14 ` Huang, Kai
2024-05-15 15:34 ` Sean Christopherson
2024-05-15 15:49 ` Edgecombe, Rick P
2024-05-15 15:56 ` Edgecombe, Rick P
2024-05-15 16:02 ` Sean Christopherson
2024-05-15 16:12 ` Edgecombe, Rick P
2024-05-15 18:09 ` Sean Christopherson
2024-05-15 18:22 ` Edgecombe, Rick P
2024-05-15 19:48 ` Sean Christopherson
2024-05-15 20:32 ` Edgecombe, Rick P
2024-05-15 23:26 ` Sean Christopherson
2024-05-15 16:22 ` Isaku Yamahata
2024-05-15 22:17 ` Huang, Kai
2024-05-15 23:14 ` Edgecombe, Rick P
2024-05-15 23:38 ` Huang, Kai
2024-05-16 0:13 ` Edgecombe, Rick P
2024-05-16 0:27 ` Isaku Yamahata
2024-05-16 1:11 ` Huang, Kai
2024-05-16 0:15 ` Isaku Yamahata [this message]
2024-05-16 0:52 ` Edgecombe, Rick P
2024-05-16 1:21 ` Huang, Kai
2024-05-16 17:27 ` Isaku Yamahata
2024-05-16 21:46 ` Edgecombe, Rick P
2024-05-16 22:23 ` Huang, Kai
2024-05-16 22:38 ` Edgecombe, Rick P
2024-05-16 23:16 ` Huang, Kai
2024-05-15 0:59 ` [PATCH 09/16] KVM: x86/mmu: Make kvm_tdp_mmu_alloc_root() return void Rick Edgecombe
2024-05-15 0:59 ` [PATCH 10/16] KVM: x86/tdp_mmu: Support TDX private mapping for TDP MMU Rick Edgecombe
2024-05-15 17:35 ` Isaku Yamahata
2024-05-15 18:00 ` Edgecombe, Rick P
2024-05-16 0:52 ` Huang, Kai
2024-05-16 1:27 ` Edgecombe, Rick P
2024-05-16 2:07 ` Huang, Kai
2024-05-16 2:57 ` Edgecombe, Rick P
2024-05-16 13:04 ` Huang, Kai
2024-05-16 16:36 ` Edgecombe, Rick P
2024-05-16 19:42 ` Isaku Yamahata
2024-05-17 2:35 ` Edgecombe, Rick P
2024-05-17 9:03 ` Isaku Yamahata
2024-05-17 18:16 ` Edgecombe, Rick P
2024-05-17 19:16 ` Isaku Yamahata
2024-05-20 23:32 ` Isaku Yamahata
2024-05-21 15:07 ` Edgecombe, Rick P
2024-05-21 16:15 ` Isaku Yamahata
2024-05-22 22:34 ` Isaku Yamahata
2024-05-22 23:09 ` Edgecombe, Rick P
2024-05-22 23:47 ` Isaku Yamahata
2024-05-22 23:50 ` Edgecombe, Rick P
2024-05-23 0:01 ` Isaku Yamahata
2024-05-23 18:27 ` Edgecombe, Rick P
2024-05-24 7:55 ` Isaku Yamahata
2024-05-28 16:27 ` Edgecombe, Rick P
2024-05-28 17:47 ` Paolo Bonzini
2024-05-29 2:13 ` Edgecombe, Rick P
2024-05-29 7:25 ` Paolo Bonzini
2024-05-31 14:11 ` Isaku Yamahata
2024-05-28 17:43 ` Paolo Bonzini
2024-05-28 17:16 ` Paolo Bonzini
2024-05-28 18:29 ` Edgecombe, Rick P
2024-05-29 1:06 ` Isaku Yamahata
2024-05-29 1:51 ` Edgecombe, Rick P
2024-05-17 2:36 ` Huang, Kai
2024-05-17 8:14 ` Isaku Yamahata
2024-05-18 5:42 ` Huang, Kai
2024-05-18 15:41 ` Edgecombe, Rick P
2024-05-20 10:38 ` Huang, Kai
2024-05-20 18:58 ` Isaku Yamahata
2024-05-20 19:02 ` Edgecombe, Rick P
2024-05-20 23:39 ` Edgecombe, Rick P
2024-05-21 2:25 ` Isaku Yamahata
2024-05-21 2:57 ` Edgecombe, Rick P
2024-05-20 22:34 ` Huang, Kai
2024-05-16 1:48 ` Isaku Yamahata
2024-05-16 2:00 ` Edgecombe, Rick P
2024-05-16 2:10 ` Huang, Kai
2024-05-28 16:59 ` Paolo Bonzini
2024-05-16 17:10 ` Isaku Yamahata
2024-05-23 23:14 ` Edgecombe, Rick P
2024-05-24 8:20 ` Isaku Yamahata
2024-05-28 21:48 ` Edgecombe, Rick P
2024-05-29 1:16 ` Isaku Yamahata
2024-05-29 1:50 ` Edgecombe, Rick P
2024-05-29 2:20 ` Isaku Yamahata
2024-05-29 2:29 ` Edgecombe, Rick P
2024-05-28 20:54 ` Edgecombe, Rick P
2024-05-29 1:24 ` Isaku Yamahata
2024-05-28 23:06 ` Edgecombe, Rick P
2024-05-29 1:57 ` Isaku Yamahata
2024-05-29 2:13 ` Edgecombe, Rick P
2024-05-29 16:55 ` Isaku Yamahata
2024-05-15 0:59 ` [PATCH 11/16] KVM: x86/tdp_mmu: Extract root invalid check from tdx_mmu_next_root() Rick Edgecombe
2024-05-15 0:59 ` [PATCH 12/16] KVM: x86/tdp_mmu: Introduce KVM MMU root types to specify page table type Rick Edgecombe
2024-05-15 0:59 ` [PATCH 13/16] KVM: x86/tdp_mmu: Introduce shared, private KVM MMU root types Rick Edgecombe
2024-05-15 0:59 ` [PATCH 14/16] KVM: x86/tdp_mmu: Take root types for kvm_tdp_mmu_invalidate_all_roots() Rick Edgecombe
2024-05-15 0:59 ` [PATCH 15/16] KVM: x86/tdp_mmu: Make mmu notifier callbacks to check kvm_process Rick Edgecombe
2024-05-15 0:59 ` [PATCH 16/16] KVM: x86/tdp_mmu: Invalidate correct roots Rick Edgecombe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240516001530.GG168153@ls.amr.corp.intel.com \
--to=isaku.yamahata@intel.com \
--cc=dmatlack@google.com \
--cc=erdemaktas@google.com \
--cc=isaku.yamahata@gmail.com \
--cc=isaku.yamahata@linux.intel.com \
--cc=kai.huang@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=rick.p.edgecombe@intel.com \
--cc=sagis@google.com \
--cc=seanjc@google.com \
--cc=yan.y.zhao@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox