From: Isaku Yamahata <isaku.yamahata@intel.com>
To: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
Cc: "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
"pbonzini@redhat.com" <pbonzini@redhat.com>,
"seanjc@google.com" <seanjc@google.com>,
"sagis@google.com" <sagis@google.com>,
"dmatlack@google.com" <dmatlack@google.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"isaku.yamahata@gmail.com" <isaku.yamahata@gmail.com>,
"Zhao, Yan Y" <yan.y.zhao@intel.com>,
"Aktas, Erdem" <erdemaktas@google.com>
Subject: Re: [PATCH 10/16] KVM: x86/tdp_mmu: Support TDX private mapping for TDP MMU
Date: Tue, 28 May 2024 18:57:49 -0700 [thread overview]
Message-ID: <20240529015749.GF386318@ls.amr.corp.intel.com> (raw)
In-Reply-To: <a5ad658b1eec38f621315431a24e028187b5ca2d.camel@intel.com>
On Tue, May 28, 2024 at 11:06:45PM +0000,
"Edgecombe, Rick P" <rick.p.edgecombe@intel.com> wrote:
> On Tue, 2024-05-14 at 17:59 -0700, Rick Edgecombe wrote:
> > static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
> > - u64 old_spte, u64 new_spte, int level,
> > - bool shared)
> > + u64 old_spte, u64 new_spte,
> > + union kvm_mmu_page_role role, bool shared)
> > {
> > + bool is_private = kvm_mmu_page_role_is_private(role);
> > + int level = role.level;
> > bool was_present = is_shadow_present_pte(old_spte);
> > bool is_present = is_shadow_present_pte(new_spte);
> > bool was_leaf = was_present && is_last_spte(old_spte, level);
> > bool is_leaf = is_present && is_last_spte(new_spte, level);
> > - bool pfn_changed = spte_to_pfn(old_spte) != spte_to_pfn(new_spte);
> > + kvm_pfn_t old_pfn = spte_to_pfn(old_spte);
> > + kvm_pfn_t new_pfn = spte_to_pfn(new_spte);
> > + bool pfn_changed = old_pfn != new_pfn;
> >
> > WARN_ON_ONCE(level > PT64_ROOT_MAX_LEVEL);
> > WARN_ON_ONCE(level < PG_LEVEL_4K);
> > @@ -513,7 +636,7 @@ static void handle_changed_spte(struct kvm *kvm, int
> > as_id, gfn_t gfn,
> >
> > if (was_leaf && is_dirty_spte(old_spte) &&
> > (!is_present || !is_dirty_spte(new_spte) || pfn_changed))
> > - kvm_set_pfn_dirty(spte_to_pfn(old_spte));
> > + kvm_set_pfn_dirty(old_pfn);
> >
> > /*
> > * Recursively handle child PTs if the change removed a subtree from
> > @@ -522,15 +645,21 @@ static void handle_changed_spte(struct kvm *kvm, int
> > as_id, gfn_t gfn,
> > * pages are kernel allocations and should never be migrated.
> > */
> > if (was_present && !was_leaf &&
> > - (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed)))
> > + (is_leaf || !is_present || WARN_ON_ONCE(pfn_changed))) {
> > + KVM_BUG_ON(is_private !=
> > is_private_sptep(spte_to_child_pt(old_spte, level)),
> > + kvm);
> > handle_removed_pt(kvm, spte_to_child_pt(old_spte, level),
> > shared);
> > + }
> > +
> > + if (is_private && !is_present)
> > + handle_removed_private_spte(kvm, gfn, old_spte, new_spte,
> > role.level);
>
> I'm a little bothered by the asymmetry of where the mirrored hooks get called
> between setting and zapping PTEs. Tracing through the code, the relevent
> operations that are needed for TDX are:
> 1. tdp_mmu_iter_set_spte() from tdp_mmu_zap_leafs() and __tdp_mmu_zap_root()
> 2. tdp_mmu_set_spte_atomic() is used for mapping, linking
>
> (1) is a simple case because the mmu_lock is held for writes. It updates the
> mirror root like normal, then has extra logic to call out to update the S-EPT.
>
> (2) on the other hand just has the read lock, so it has to do the whole
> operation in a special way. First set REMOVED_SPTE, then update the private
> copy, then write to the mirror page tables. It can't get stuffed into
> handle_changed_spte() because it has to write REMOVED_SPTE first.
>
> In some ways it makes sense to update the S-EPT. Despite claiming
> "handle_changed_spte() only updates stats.", it does some updating of other PTEs
> based on the current PTE change. Which is pretty similar to what the mirrored
> PTEs are doing. But we can't really do the setting of present PTEs because of
> the REMOVED_SPTE stuff.
>
> So we could only make it more symmetrical by moving the S-EPT ops out of
> handle_changed_spte() and manually call it in the two places relevant for TDX,
> like the below.
>
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
> index e966986bb9f2..c9ddb1c2a550 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.c
> +++ b/arch/x86/kvm/mmu/tdp_mmu.c
> @@ -438,6 +438,9 @@ static void handle_removed_pt(struct kvm *kvm, tdp_ptep_t
> pt, bool shared)
> */
> old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte,
> REMOVED_SPTE, level);
> +
> + if (is_mirror_sp(sp))
> + reflect_removed_spte(kvm, gfn, old_spte,
> REMOVED_SPTE, level);
The callback before handling lower level will result in error.
> }
> handle_changed_spte(kvm, kvm_mmu_page_as_id(sp), gfn,
> old_spte, REMOVED_SPTE, sp->role, shared);
We should call it here after processing lower level.
> @@ -667,9 +670,6 @@ static void handle_changed_spte(struct kvm *kvm, int as_id,
> gfn_t gfn,
> handle_removed_pt(kvm, spte_to_child_pt(old_spte, level),
> shared);
> }
>
> - if (is_mirror && !is_present)
> - reflect_removed_spte(kvm, gfn, old_spte, new_spte, role.level);
> -
> if (was_leaf && is_accessed_spte(old_spte) &&
> (!is_present || !is_accessed_spte(new_spte) || pfn_changed))
> kvm_set_pfn_accessed(spte_to_pfn(old_spte));
> @@ -839,6 +839,9 @@ static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id,
> tdp_ptep_t sptep,
> new_spte, level), kvm);
> }
>
> + if (is_mirror_sptep(sptep))
> + reflect_removed_spte(kvm, gfn, old_spte, REMOVED_SPTE, level);
> +
Ditto.
> role = sptep_to_sp(sptep)->role;
> role.level = level;
> handle_changed_spte(kvm, as_id, gfn, old_spte, new_spte, role, false);
The callback should be here. It should be after handling the lower level.
> Otherwise, we could move the "set present" mirroring operations into
> handle_changed_spte(), and have some earlier conditional logic do the
> REMOVED_SPTE parts. It starts to become more scattered.
> Anyway, it's just a code clarity thing arising from having hard time explaining
> the design in the log. Any opinions?
Originally I tried to consolidate the callbacks by following TDP MMU using
handle_changed_spte(). Anyway we can pick from two outcomes based on which is
easy to understand/maintain.
> A separate but related comment is below.
>
> >
> > if (was_leaf && is_accessed_spte(old_spte) &&
> > (!is_present || !is_accessed_spte(new_spte) || pfn_changed))
> > kvm_set_pfn_accessed(spte_to_pfn(old_spte));
> > }
> >
> > @@ -648,6 +807,8 @@ static inline int tdp_mmu_zap_spte_atomic(struct kvm *kvm,
> > static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id, tdp_ptep_t sptep,
> > u64 old_spte, u64 new_spte, gfn_t gfn, int level)
> > {
> > + union kvm_mmu_page_role role;
> > +
> > lockdep_assert_held_write(&kvm->mmu_lock);
> >
> > /*
> > @@ -660,8 +821,16 @@ static u64 tdp_mmu_set_spte(struct kvm *kvm, int as_id,
> > tdp_ptep_t sptep,
> > WARN_ON_ONCE(is_removed_spte(old_spte) || is_removed_spte(new_spte));
> >
> > old_spte = kvm_tdp_mmu_write_spte(sptep, old_spte, new_spte, level);
> > + if (is_private_sptep(sptep) && !is_removed_spte(new_spte) &&
> > + is_shadow_present_pte(new_spte)) {
> > + /* Because write spin lock is held, no race. It should
> > success. */
> > + KVM_BUG_ON(__set_private_spte_present(kvm, sptep, gfn,
> > old_spte,
> > + new_spte, level), kvm);
> > + }
>
> Based on the above enumeration, I don't see how this hunk gets used.
I should've removed it. This is leftover from the old patches.
--
Isaku Yamahata <isaku.yamahata@intel.com>
next prev parent reply other threads:[~2024-05-29 1:57 UTC|newest]
Thread overview: 152+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-05-15 0:59 [PATCH 00/16] TDX MMU prep series part 1 Rick Edgecombe
2024-05-15 0:59 ` [PATCH 01/16] KVM: x86: Add a VM type define for TDX Rick Edgecombe
2024-05-15 0:59 ` [PATCH 02/16] KVM: x86/mmu: Introduce a slot flag to zap only slot leafs on slot deletion Rick Edgecombe
2024-05-15 13:24 ` Huang, Kai
2024-05-15 19:09 ` Sean Christopherson
2024-05-15 19:23 ` Edgecombe, Rick P
2024-05-15 20:05 ` Sean Christopherson
2024-05-15 20:53 ` Edgecombe, Rick P
2024-05-15 22:47 ` Sean Christopherson
2024-05-15 23:06 ` Huang, Kai
2024-05-15 23:20 ` Sean Christopherson
2024-05-15 23:36 ` Huang, Kai
2024-05-16 1:12 ` Xiaoyao Li
2024-05-17 15:30 ` Paolo Bonzini
2024-05-22 1:29 ` Yan Zhao
2024-05-22 2:31 ` Sean Christopherson
2024-05-22 6:48 ` Yan Zhao
2024-05-22 15:45 ` Paolo Bonzini
2024-05-24 1:50 ` Yan Zhao
2024-05-15 23:56 ` Edgecombe, Rick P
2024-05-16 2:21 ` Edgecombe, Rick P
2024-05-16 3:56 ` Yan Zhao
2024-05-17 15:27 ` Paolo Bonzini
2024-05-17 15:25 ` Paolo Bonzini
2024-05-15 18:03 ` Isaku Yamahata
2024-05-15 0:59 ` [PATCH 03/16] KVM: x86/tdp_mmu: Add a helper function to walk down the TDP MMU Rick Edgecombe
2024-05-17 7:44 ` Chao Gao
2024-05-17 9:08 ` Isaku Yamahata
2024-05-15 0:59 ` [PATCH 04/16] KVM: x86/mmu: Add address conversion functions for TDX shared bit of GPA Rick Edgecombe
2024-05-15 22:34 ` Huang, Kai
2024-05-15 23:21 ` Edgecombe, Rick P
2024-05-15 23:31 ` Huang, Kai
2024-05-15 23:38 ` Edgecombe, Rick P
2024-05-15 23:44 ` Huang, Kai
2024-05-15 23:59 ` Edgecombe, Rick P
2024-05-16 0:12 ` Huang, Kai
2024-05-16 0:19 ` Edgecombe, Rick P
2024-05-16 0:25 ` Huang, Kai
2024-05-16 0:35 ` Edgecombe, Rick P
2024-05-16 1:04 ` Huang, Kai
2024-05-16 1:20 ` Edgecombe, Rick P
2024-05-16 1:40 ` Huang, Kai
2024-05-16 5:52 ` Yan Zhao
2024-05-18 0:25 ` Edgecombe, Rick P
2024-05-16 23:08 ` Edgecombe, Rick P
2024-05-17 0:37 ` Huang, Kai
2024-05-17 1:51 ` Edgecombe, Rick P
2024-05-17 4:26 ` Huang, Kai
2024-05-17 21:12 ` Edgecombe, Rick P
2024-05-15 0:59 ` [PATCH 05/16] KVM: Add member to struct kvm_gfn_range for target alias Rick Edgecombe
2024-05-17 20:58 ` Edgecombe, Rick P
2024-05-15 0:59 ` [PATCH 06/16] KVM: x86/mmu: Add a new is_private member for union kvm_mmu_page_role Rick Edgecombe
2024-05-15 0:59 ` [PATCH 07/16] KVM: x86/mmu: Add a private pointer to struct kvm_mmu_page Rick Edgecombe
2024-05-15 0:59 ` [PATCH 08/16] KVM: x86/mmu: Bug the VM if kvm_zap_gfn_range() is called for TDX Rick Edgecombe
2024-05-15 13:27 ` Huang, Kai
2024-05-15 15:22 ` Edgecombe, Rick P
2024-05-15 23:14 ` Huang, Kai
2024-05-15 15:34 ` Sean Christopherson
2024-05-15 15:49 ` Edgecombe, Rick P
2024-05-15 15:56 ` Edgecombe, Rick P
2024-05-15 16:02 ` Sean Christopherson
2024-05-15 16:12 ` Edgecombe, Rick P
2024-05-15 18:09 ` Sean Christopherson
2024-05-15 18:22 ` Edgecombe, Rick P
2024-05-15 19:48 ` Sean Christopherson
2024-05-15 20:32 ` Edgecombe, Rick P
2024-05-15 23:26 ` Sean Christopherson
2024-05-15 16:22 ` Isaku Yamahata
2024-05-15 22:17 ` Huang, Kai
2024-05-15 23:14 ` Edgecombe, Rick P
2024-05-15 23:38 ` Huang, Kai
2024-05-16 0:13 ` Edgecombe, Rick P
2024-05-16 0:27 ` Isaku Yamahata
2024-05-16 1:11 ` Huang, Kai
2024-05-16 0:15 ` Isaku Yamahata
2024-05-16 0:52 ` Edgecombe, Rick P
2024-05-16 1:21 ` Huang, Kai
2024-05-16 17:27 ` Isaku Yamahata
2024-05-16 21:46 ` Edgecombe, Rick P
2024-05-16 22:23 ` Huang, Kai
2024-05-16 22:38 ` Edgecombe, Rick P
2024-05-16 23:16 ` Huang, Kai
2024-05-15 0:59 ` [PATCH 09/16] KVM: x86/mmu: Make kvm_tdp_mmu_alloc_root() return void Rick Edgecombe
2024-05-15 0:59 ` [PATCH 10/16] KVM: x86/tdp_mmu: Support TDX private mapping for TDP MMU Rick Edgecombe
2024-05-15 17:35 ` Isaku Yamahata
2024-05-15 18:00 ` Edgecombe, Rick P
2024-05-16 0:52 ` Huang, Kai
2024-05-16 1:27 ` Edgecombe, Rick P
2024-05-16 2:07 ` Huang, Kai
2024-05-16 2:57 ` Edgecombe, Rick P
2024-05-16 13:04 ` Huang, Kai
2024-05-16 16:36 ` Edgecombe, Rick P
2024-05-16 19:42 ` Isaku Yamahata
2024-05-17 2:35 ` Edgecombe, Rick P
2024-05-17 9:03 ` Isaku Yamahata
2024-05-17 18:16 ` Edgecombe, Rick P
2024-05-17 19:16 ` Isaku Yamahata
2024-05-20 23:32 ` Isaku Yamahata
2024-05-21 15:07 ` Edgecombe, Rick P
2024-05-21 16:15 ` Isaku Yamahata
2024-05-22 22:34 ` Isaku Yamahata
2024-05-22 23:09 ` Edgecombe, Rick P
2024-05-22 23:47 ` Isaku Yamahata
2024-05-22 23:50 ` Edgecombe, Rick P
2024-05-23 0:01 ` Isaku Yamahata
2024-05-23 18:27 ` Edgecombe, Rick P
2024-05-24 7:55 ` Isaku Yamahata
2024-05-28 16:27 ` Edgecombe, Rick P
2024-05-28 17:47 ` Paolo Bonzini
2024-05-29 2:13 ` Edgecombe, Rick P
2024-05-29 7:25 ` Paolo Bonzini
2024-05-31 14:11 ` Isaku Yamahata
2024-05-28 17:43 ` Paolo Bonzini
2024-05-28 17:16 ` Paolo Bonzini
2024-05-28 18:29 ` Edgecombe, Rick P
2024-05-29 1:06 ` Isaku Yamahata
2024-05-29 1:51 ` Edgecombe, Rick P
2024-05-17 2:36 ` Huang, Kai
2024-05-17 8:14 ` Isaku Yamahata
2024-05-18 5:42 ` Huang, Kai
2024-05-18 15:41 ` Edgecombe, Rick P
2024-05-20 10:38 ` Huang, Kai
2024-05-20 18:58 ` Isaku Yamahata
2024-05-20 19:02 ` Edgecombe, Rick P
2024-05-20 23:39 ` Edgecombe, Rick P
2024-05-21 2:25 ` Isaku Yamahata
2024-05-21 2:57 ` Edgecombe, Rick P
2024-05-20 22:34 ` Huang, Kai
2024-05-16 1:48 ` Isaku Yamahata
2024-05-16 2:00 ` Edgecombe, Rick P
2024-05-16 2:10 ` Huang, Kai
2024-05-28 16:59 ` Paolo Bonzini
2024-05-16 17:10 ` Isaku Yamahata
2024-05-23 23:14 ` Edgecombe, Rick P
2024-05-24 8:20 ` Isaku Yamahata
2024-05-28 21:48 ` Edgecombe, Rick P
2024-05-29 1:16 ` Isaku Yamahata
2024-05-29 1:50 ` Edgecombe, Rick P
2024-05-29 2:20 ` Isaku Yamahata
2024-05-29 2:29 ` Edgecombe, Rick P
2024-05-28 20:54 ` Edgecombe, Rick P
2024-05-29 1:24 ` Isaku Yamahata
2024-05-28 23:06 ` Edgecombe, Rick P
2024-05-29 1:57 ` Isaku Yamahata [this message]
2024-05-29 2:13 ` Edgecombe, Rick P
2024-05-29 16:55 ` Isaku Yamahata
2024-05-15 0:59 ` [PATCH 11/16] KVM: x86/tdp_mmu: Extract root invalid check from tdx_mmu_next_root() Rick Edgecombe
2024-05-15 0:59 ` [PATCH 12/16] KVM: x86/tdp_mmu: Introduce KVM MMU root types to specify page table type Rick Edgecombe
2024-05-15 0:59 ` [PATCH 13/16] KVM: x86/tdp_mmu: Introduce shared, private KVM MMU root types Rick Edgecombe
2024-05-15 0:59 ` [PATCH 14/16] KVM: x86/tdp_mmu: Take root types for kvm_tdp_mmu_invalidate_all_roots() Rick Edgecombe
2024-05-15 0:59 ` [PATCH 15/16] KVM: x86/tdp_mmu: Make mmu notifier callbacks to check kvm_process Rick Edgecombe
2024-05-15 0:59 ` [PATCH 16/16] KVM: x86/tdp_mmu: Invalidate correct roots Rick Edgecombe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240529015749.GF386318@ls.amr.corp.intel.com \
--to=isaku.yamahata@intel.com \
--cc=dmatlack@google.com \
--cc=erdemaktas@google.com \
--cc=isaku.yamahata@gmail.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=pbonzini@redhat.com \
--cc=rick.p.edgecombe@intel.com \
--cc=sagis@google.com \
--cc=seanjc@google.com \
--cc=yan.y.zhao@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox