From: Isaku Yamahata <isaku.yamahata@intel.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"seanjc@google.com" <seanjc@google.com>,
"sagis@google.com" <sagis@google.com>,
"dmatlack@google.com" <dmatlack@google.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"isaku.yamahata@gmail.com" <isaku.yamahata@gmail.com>,
"Zhao, Yan Y" <yan.y.zhao@intel.com>,
"Aktas, Erdem" <erdemaktas@google.com>
Subject: Re: [PATCH 10/16] KVM: x86/tdp_mmu: Support TDX private mapping for TDP MMU
Date: Tue, 28 May 2024 18:57:49 -0700 [thread overview]
Message-ID: <20240529015749.GF386318@ls.amr.corp.intel.com> (raw)
In-Reply-To: <a5ad658b1eec38f621315431a24e028187b5ca2d.camel@intel.com>
On Tue, May 28, 2024 at 11:06:45PM +0000,
"Edgecombe, Rick P" <rick.p.edgecombe@intel.com> wrote:
> On Tue, 2024-05-14 at 17:59 -0700, Rick Edgecombe wrote:
> > static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
> > - u64 old_spte, u64 new_spte, int level,
> > - bool shared)
> > + u64 old_spte, u64 new_spte,
> > + union kvm_mmu_page_role role, bool shared)
> > {
> > + bool is_private = kvm_mmu_page_role_is_private(role);
> > + int level = role.level;
> > bool was_present = is_shadow_present_pte(old_spte);
> > bool is_present = is_shadow_present_pte(new_spte);
> > bool was_leaf = was_present && is_last_spte(old_spte, level);
> > bool is_leaf = is_present && is_last_spte(new_spte, level);
> > - bool pfn_changed = spte_to_pfn(old_spte) != spte_to_pfn(new_spte);
> > + kvm_pfn_t old_pfn = spte_to_pfn(old_spte);
> > + kvm_pfn_t new_pfn = spte_to_pfn(new_spte);
> > + bool pfn_changed = old_pfn != new_pfn;
> >
> > WARN_ON_ONCE(level > PT64_ROOT_MAX_LEVEL);
> > WARN_ON_ONCE(level < PG_LEVEL_4K);
> > @@ -513,7 +636,7 @@ static void handle_changed_spte(struct kvm *kvm, int
> > as_id, gfn_t gfn,
> >
> > if (was_leaf && is_dirty_spte(old_spte) &&
> > (!is_present || !is_dirty_spte(new_spte) || pfn_changed))
> > - kvm_set_pfn_dirty(spte_to_pfn(old_spte));
> > + kvm_set_pfn_dirty(old_pfn);
> >
> > /*
> > * Recursively handle child PTs if the change removed a subtree from
> > @@ -522,15 +645,21 @@ static void handle_changed_spte(struct kvm *kvm, int
> > as_id, gfn_t gfn,
> > * pages are kernel allocations and should never be migrated.
> > */
> > if (was_present && !was_leaf &&
> > - (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed)))
> > + (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed))) {
> > + KVM_BUG_ON(is_private !=
> > is_private_sptep(spte_to_child_pt(old_spte, level)),
> > + kvm);
> > handle_removed_pt(kvm, spte_to_child_pt(old_spte, level),
> > shared);
> > + }
> > +
> > + if (is_private && !is_present)
> > + handle_removed_private_spte(kvm, gfn, old_spte, new_spte,
> > role.level);
>
> I'm a little bothered by the asymmetry of where the mirrored hooks get called
> between setting and zapping PTEs. Tracing through the code, the relevent
> operations that are needed for TDX are:
> 1. tdp_mmu_iter_set_spte() from tdp_mmu_zap_leafs() and __tdp_mmu_zap_root()
> 2. tdp_mmu_set_spte_atomic() is used for mapping, linking
>
> (1) is a simple case because the mmu_lock is held for writes. It updates the
> mirror root like normal, then has extra logic to call out to update the S-EPT.
>
> (2) on the other hand just has the read lock, so it has to do the whole
> operation in a special way. First set REMOVED_SPTE, then update the private
> copy, then write to the mirror page tables. It can't get stuffed into
> handle_changed_spte() because it has to write REMOVED_SPTE first.
>
> In some ways it makes sense to update the S-EPT. Despite claiming
> "handle_changed_spte() only updates stats.", it does some updating of other PTEs
> based on the current PTE change. Which is pretty similar to what the mirrored
> PTEs are doing. But we can't really do the setting of present PTEs because of
> the REMOVED_SPTE stuff.
>
> So we could only make it more symmetrical by moving the S-EPT ops out of
> handle_changed_spte() and manually call it in the two places relevant for TDX,
> like the below.
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index e966986bb9f2..c9ddb1c2a550 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -438,6 +438,9 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t
> pt, bool shared)
> */
> old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte,
> REMOVED_SPTE, level);
> +
> + if (is_mirror_sp(sp))
> + reflect_removed_spte(kvm, gfn, old_spte,
> REMOVED_SPTE, level);
The callback before handling lower level will result in error.
> }
> handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), gfn,
> old_spte, REMOVED_SPTE, sp->role, shared);
We should call it here after processing lower level.
> @@ -667,9 +670,6 @@ static void handle_changed_spte(struct kvm *kvm, int as_id,
> gfn_t gfn,
> handle_removed_pt(kvm, spte_to_child_pt(old_spte, level),
> shared);
> }
>
> - if (is_mirror && !is_present)
> - reflect_removed_spte(kvm, gfn, old_spte, new_spte, role.level);
> -
> if (was_leaf && is_accessed_spte(old_spte) &&
> (!is_present || !is_accessed_spte(new_spte) || pfn_changed))
> kvm_set_pfn_accessed(spte_to_pfn(old_spte));
> @@ -839,6 +839,9 @@ static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id,
> tdp_ptep_t sptep,
> new_spte, level), kvm);
> }
>
> + if (is_mirror_sptep(sptep))
> + reflect_removed_spte(kvm, gfn, old_spte, REMOVED_SPTE, level);
> +
Ditto.
> role = sptep_to_sp(sptep)->role;
> role.level = level;
> handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, role, false);
The callback should be here. It should be after handling the lower level.
> Otherwise, we could move the "set present" mirroring operations into
> handle_changed_spte(), and have some earlier conditional logic do the
> REMOVED_SPTE parts. It starts to become more scattered.
> Anyway, it's just a code clarity thing arising from having hard time explaining
> the design in the log. Any opinions?
Originally I tried to consolidate the callbacks by following TDP MMU using
handle_changed_spte(). Anyway we can pick from two outcomes based on which is
easy to understand/maintain.
> A separate but related comment is below.
>
> >
> > if (was_leaf && is_accessed_spte(old_spte) &&
> > (!is_present || !is_accessed_spte(new_spte) || pfn_changed))
> > kvm_set_pfn_accessed(spte_to_pfn(old_spte));
> > }
> >
> > @@ -648,6 +807,8 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
> > static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
> > u64 old_spte, u64 new_spte, gfn_t gfn, int level)
> > {
> > + union kvm_mmu_page_role role;
> > +
> > lockdep_assert_held_write(&kvm->mmu_lock);
> >
> > /*
> > @@ -660,8 +821,16 @@ static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id,
> > tdp_ptep_t sptep,
> > WARN_ON_ONCE(is_removed_spte(old_spte) || is_removed_spte(new_spte));
> >
> > old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, new_spte, level);
> > + if (is_private_sptep(sptep) && !is_removed_spte(new_spte) &&
> > + is_shadow_present_pte(new_spte)) {
> > + /* Because write spin lock is held, no race. It should
> > success. */
> > + KVM_BUG_ON(__set_private_spte_present(kvm, sptep, gfn,
> > old_spte,
> > + new_spte, level), kvm);
> > + }
>
> Based on the above enumeration, I don't see how this hunk gets used.
I should've removed it. This is leftover from the old patches.
--
Isaku Yamahata <isaku.yamahata@intel.com>
next prev parent reply other threads:[~2024-05-29 1:57 UTC|newest]
Thread overview: 152+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-15 0:59 [PATCH 00/16] TDX MMU prep series part 1 Rick Edgecombe
2024-05-15 0:59 ` [PATCH 01/16] KVM: x86: Add a VM type define for TDX Rick Edgecombe
2024-05-15 0:59 ` [PATCH 02/16] KVM: x86/mmu: Introduce a slot flag to zap only slot leafs on slot deletion Rick Edgecombe
2024-05-15 13:24 ` Huang, Kai
2024-05-15 19:09 ` Sean Christopherson
2024-05-15 19:23 ` Edgecombe, Rick P
2024-05-15 20:05 ` Sean Christopherson
2024-05-15 20:53 ` Edgecombe, Rick P
2024-05-15 22:47 ` Sean Christopherson
2024-05-15 23:06 ` Huang, Kai
2024-05-15 23:20 ` Sean Christopherson
2024-05-15 23:36 ` Huang, Kai
2024-05-16 1:12 ` Xiaoyao Li
2024-05-17 15:30 ` Paolo Bonzini
2024-05-22 1:29 ` Yan Zhao
2024-05-22 2:31 ` Sean Christopherson
2024-05-22 6:48 ` Yan Zhao
2024-05-22 15:45 ` Paolo Bonzini
2024-05-24 1:50 ` Yan Zhao
2024-05-15 23:56 ` Edgecombe, Rick P
2024-05-16 2:21 ` Edgecombe, Rick P
2024-05-16 3:56 ` Yan Zhao
2024-05-17 15:27 ` Paolo Bonzini
2024-05-17 15:25 ` Paolo Bonzini
2024-05-15 18:03 ` Isaku Yamahata
2024-05-15 0:59 ` [PATCH 03/16] KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU Rick Edgecombe
2024-05-17 7:44 ` Chao Gao
2024-05-17 9:08 ` Isaku Yamahata
2024-05-15 0:59 ` [PATCH 04/16] KVM: x86/mmu: Add address conversion functions for TDX shared bit of GPA Rick Edgecombe
2024-05-15 22:34 ` Huang, Kai
2024-05-15 23:21 ` Edgecombe, Rick P
2024-05-15 23:31 ` Huang, Kai
2024-05-15 23:38 ` Edgecombe, Rick P
2024-05-15 23:44 ` Huang, Kai
2024-05-15 23:59 ` Edgecombe, Rick P
2024-05-16 0:12 ` Huang, Kai
2024-05-16 0:19 ` Edgecombe, Rick P
2024-05-16 0:25 ` Huang, Kai
2024-05-16 0:35 ` Edgecombe, Rick P
2024-05-16 1:04 ` Huang, Kai
2024-05-16 1:20 ` Edgecombe, Rick P
2024-05-16 1:40 ` Huang, Kai
2024-05-16 5:52 ` Yan Zhao
2024-05-18 0:25 ` Edgecombe, Rick P
2024-05-16 23:08 ` Edgecombe, Rick P
2024-05-17 0:37 ` Huang, Kai
2024-05-17 1:51 ` Edgecombe, Rick P
2024-05-17 4:26 ` Huang, Kai
2024-05-17 21:12 ` Edgecombe, Rick P
2024-05-15 0:59 ` [PATCH 05/16] KVM: Add member to struct kvm_gfn_range for target alias Rick Edgecombe
2024-05-17 20:58 ` Edgecombe, Rick P
2024-05-15 0:59 ` [PATCH 06/16] KVM: x86/mmu: Add a new is_private member for union kvm_mmu_page_role Rick Edgecombe
2024-05-15 0:59 ` [PATCH 07/16] KVM: x86/mmu: Add a private pointer to struct kvm_mmu_page Rick Edgecombe
2024-05-15 0:59 ` [PATCH 08/16] KVM: x86/mmu: Bug the VM if kvm_zap_gfn_range() is called for TDX Rick Edgecombe
2024-05-15 13:27 ` Huang, Kai
2024-05-15 15:22 ` Edgecombe, Rick P
2024-05-15 23:14 ` Huang, Kai
2024-05-15 15:34 ` Sean Christopherson
2024-05-15 15:49 ` Edgecombe, Rick P
2024-05-15 15:56 ` Edgecombe, Rick P
2024-05-15 16:02 ` Sean Christopherson
2024-05-15 16:12 ` Edgecombe, Rick P
2024-05-15 18:09 ` Sean Christopherson
2024-05-15 18:22 ` Edgecombe, Rick P
2024-05-15 19:48 ` Sean Christopherson
2024-05-15 20:32 ` Edgecombe, Rick P
2024-05-15 23:26 ` Sean Christopherson
2024-05-15 16:22 ` Isaku Yamahata
2024-05-15 22:17 ` Huang, Kai
2024-05-15 23:14 ` Edgecombe, Rick P
2024-05-15 23:38 ` Huang, Kai
2024-05-16 0:13 ` Edgecombe, Rick P
2024-05-16 0:27 ` Isaku Yamahata
2024-05-16 1:11 ` Huang, Kai
2024-05-16 0:15 ` Isaku Yamahata
2024-05-16 0:52 ` Edgecombe, Rick P
2024-05-16 1:21 ` Huang, Kai
2024-05-16 17:27 ` Isaku Yamahata
2024-05-16 21:46 ` Edgecombe, Rick P
2024-05-16 22:23 ` Huang, Kai
2024-05-16 22:38 ` Edgecombe, Rick P
2024-05-16 23:16 ` Huang, Kai
2024-05-15 0:59 ` [PATCH 09/16] KVM: x86/mmu: Make kvm_tdp_mmu_alloc_root() return void Rick Edgecombe
2024-05-15 0:59 ` [PATCH 10/16] KVM: x86/tdp_mmu: Support TDX private mapping for TDP MMU Rick Edgecombe
2024-05-15 17:35 ` Isaku Yamahata
2024-05-15 18:00 ` Edgecombe, Rick P
2024-05-16 0:52 ` Huang, Kai
2024-05-16 1:27 ` Edgecombe, Rick P
2024-05-16 2:07 ` Huang, Kai
2024-05-16 2:57 ` Edgecombe, Rick P
2024-05-16 13:04 ` Huang, Kai
2024-05-16 16:36 ` Edgecombe, Rick P
2024-05-16 19:42 ` Isaku Yamahata
2024-05-17 2:35 ` Edgecombe, Rick P
2024-05-17 9:03 ` Isaku Yamahata
2024-05-17 18:16 ` Edgecombe, Rick P
2024-05-17 19:16 ` Isaku Yamahata
2024-05-20 23:32 ` Isaku Yamahata
2024-05-21 15:07 ` Edgecombe, Rick P
2024-05-21 16:15 ` Isaku Yamahata
2024-05-22 22:34 ` Isaku Yamahata
2024-05-22 23:09 ` Edgecombe, Rick P
2024-05-22 23:47 ` Isaku Yamahata
2024-05-22 23:50 ` Edgecombe, Rick P
2024-05-23 0:01 ` Isaku Yamahata
2024-05-23 18:27 ` Edgecombe, Rick P
2024-05-24 7:55 ` Isaku Yamahata
2024-05-28 16:27 ` Edgecombe, Rick P
2024-05-28 17:47 ` Paolo Bonzini
2024-05-29 2:13 ` Edgecombe, Rick P
2024-05-29 7:25 ` Paolo Bonzini
2024-05-31 14:11 ` Isaku Yamahata
2024-05-28 17:43 ` Paolo Bonzini
2024-05-28 17:16 ` Paolo Bonzini
2024-05-28 18:29 ` Edgecombe, Rick P
2024-05-29 1:06 ` Isaku Yamahata
2024-05-29 1:51 ` Edgecombe, Rick P
2024-05-17 2:36 ` Huang, Kai
2024-05-17 8:14 ` Isaku Yamahata
2024-05-18 5:42 ` Huang, Kai
2024-05-18 15:41 ` Edgecombe, Rick P
2024-05-20 10:38 ` Huang, Kai
2024-05-20 18:58 ` Isaku Yamahata
2024-05-20 19:02 ` Edgecombe, Rick P
2024-05-20 23:39 ` Edgecombe, Rick P
2024-05-21 2:25 ` Isaku Yamahata
2024-05-21 2:57 ` Edgecombe, Rick P
2024-05-20 22:34 ` Huang, Kai
2024-05-16 1:48 ` Isaku Yamahata
2024-05-16 2:00 ` Edgecombe, Rick P
2024-05-16 2:10 ` Huang, Kai
2024-05-28 16:59 ` Paolo Bonzini
2024-05-16 17:10 ` Isaku Yamahata
2024-05-23 23:14 ` Edgecombe, Rick P
2024-05-24 8:20 ` Isaku Yamahata
2024-05-28 21:48 ` Edgecombe, Rick P
2024-05-29 1:16 ` Isaku Yamahata
2024-05-29 1:50 ` Edgecombe, Rick P
2024-05-29 2:20 ` Isaku Yamahata
2024-05-29 2:29 ` Edgecombe, Rick P
2024-05-28 20:54 ` Edgecombe, Rick P
2024-05-29 1:24 ` Isaku Yamahata
2024-05-28 23:06 ` Edgecombe, Rick P
2024-05-29 1:57 ` Isaku Yamahata [this message]
2024-05-29 2:13 ` Edgecombe, Rick P
2024-05-29 16:55 ` Isaku Yamahata
2024-05-15 0:59 ` [PATCH 11/16] KVM: x86/tdp_mmu: Extract root invalid check from tdx_mmu_next_root() Rick Edgecombe
2024-05-15 0:59 ` [PATCH 12/16] KVM: x86/tdp_mmu: Introduce KVM MMU root types to specify page table type Rick Edgecombe
2024-05-15 0:59 ` [PATCH 13/16] KVM: x86/tdp_mmu: Introduce shared, private KVM MMU root types Rick Edgecombe
2024-05-15 0:59 ` [PATCH 14/16] KVM: x86/tdp_mmu: Take root types for kvm_tdp_mmu_invalidate_all_roots() Rick Edgecombe
2024-05-15 0:59 ` [PATCH 15/16] KVM: x86/tdp_mmu: Make mmu notifier callbacks to check kvm_process Rick Edgecombe
2024-05-15 0:59 ` [PATCH 16/16] KVM: x86/tdp_mmu: Invalidate correct roots Rick Edgecombe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240529015749.GF386318@ls.amr.corp.intel.com \
--to=isaku.yamahata@intel.com \
--cc=dmatlack@google.com \
--cc=erdemaktas@google.com \
--cc=isaku.yamahata@gmail.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=rick.p.edgecombe@intel.com \
--cc=sagis@google.com \
--cc=seanjc@google.com \
--cc=yan.y.zhao@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.