From: Sean Christopherson <seanjc@google.com>
To: David Matlack <dmatlack@google.com>
Cc: kvm@vger.kernel.org, Ben Gardon <bgardon@google.com>,
Joerg Roedel <joro@8bytes.org>, Jim Mattson <jmattson@google.com>,
Wanpeng Li <wanpengli@tencent.com>,
Vitaly Kuznetsov <vkuznets@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Junaid Shahid <junaids@google.com>,
Andrew Jones <drjones@redhat.com>
Subject: Re: [PATCH 5/8] KVM: x86/mmu: Also record spteps in shadow_page_walk
Date: Mon, 14 Jun 2021 22:59:14 +0000
Message-ID: <YMffQriMoxWw2V1f@google.com>
In-Reply-To: <20210611235701.3941724-6-dmatlack@google.com>

On Fri, Jun 11, 2021, David Matlack wrote:
> In order to use walk_shadow_page_lockless() in fast_page_fault() we need
> to also record the spteps.
>
> No functional change intended.
>
> Signed-off-by: David Matlack <dmatlack@google.com>
> ---
> arch/x86/kvm/mmu/mmu.c | 1 +
> arch/x86/kvm/mmu/mmu_internal.h | 3 +++
> arch/x86/kvm/mmu/tdp_mmu.c | 1 +
> 3 files changed, 5 insertions(+)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 8140c262f4d3..765f5b01768d 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3538,6 +3538,7 @@ static bool walk_shadow_page_lockless(struct kvm_vcpu *vcpu, u64 addr,
> spte = mmu_spte_get_lockless(it.sptep);
> walk->last_level = it.level;
> walk->sptes[it.level] = spte;
> + walk->spteps[it.level] = it.sptep;
>
> if (!is_shadow_present_pte(spte))
> break;
> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index 26da6ca30fbf..0fefbd5d6c95 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -178,6 +178,9 @@ struct shadow_page_walk {
>
> /* The spte value at each level. */
> u64 sptes[PT64_ROOT_MAX_LEVEL + 1];
> +
> + /* The spte pointers at each level. */
> + u64 *spteps[PT64_ROOT_MAX_LEVEL + 1];

Hrm.  I'm not sure how I feel about this patch, or about shadow_page_walk in
general.  On the one hand, I like having a common API.  On the other hand, I
really don't like mixing two different protection schemes, e.g. this really
should be

	tdp_ptep_t spteps;

in order to gain the RCU annotations for TDP, but those RCU annotations are
problematic because the legacy MMU doesn't use RCU to protect its page tables.
I also don't like forcing the caller to hold the "lock" for longer than is
necessary, e.g. get_mmio_spte() doesn't require access to the page tables after
the initial walk, but the common spteps[] kinda sorta forces its hand.
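
To make the annotation mismatch concrete, here is a rough userspace sketch, not
kernel code.  It assumes __rcu and __force compile away to nothing (roughly how
a build without sparse treats them) and that tdp_ptep_t is a __rcu-annotated
u64 pointer.  struct sketch_walk, SKETCH_MAX_LEVEL, sketch_read_plain() and
sketch_record() are hypothetical names invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;

/* Assumption: no sparse here, so the annotations are defined away. */
#define __rcu
#define __force

/* TDP MMU flavor: an SPTE pointer that carries the RCU annotation. */
typedef u64 __rcu *tdp_ptep_t;

/* Hypothetical bound, standing in for the real root-level limit. */
#define SKETCH_MAX_LEVEL 5

/* Hypothetical walk record: one value and one annotated pointer per level. */
struct sketch_walk {
	u64 sptes[SKETCH_MAX_LEVEL + 1];
	tdp_ptep_t spteps[SKETCH_MAX_LEVEL + 1];
};

/* Legacy-MMU-style consumer: takes a plain, unannotated pointer. */
static u64 sketch_read_plain(const u64 *sptep)
{
	assert(sptep);
	return *sptep;
}

/* Record one level of a walk. */
static void sketch_record(struct sketch_walk *walk, int level, tdp_ptep_t sptep)
{
	assert(level >= 1 && level <= SKETCH_MAX_LEVEL);

	/* The annotated pointer is stored as-is. */
	walk->spteps[level] = sptep;

	/*
	 * Handing it to code that expects a plain u64 * means stripping the
	 * annotation, which under sparse requires an explicit __force cast.
	 */
	walk->sptes[level] = sketch_read_plain((u64 __force *)sptep);
}
```

With a shared spteps[] array, every legacy MMU user would need a cast like the
one in sketch_record(), which is the part that makes the common struct awkward.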

The two use cases (and the only common use cases I can see) have fairly different
requirements.  The MMIO check wants the SPTEs at _all_ levels, whereas the fast
page fault handler wants the SPTE _and_ its pointer at a single level.  So I
wonder if by providing a super generic API we'd actually increase complexity.

I.e. rather than provide a completely generic API, maybe it would be better to
have two distinct APIs.  That wouldn't fix the tdp_ptep_t issue, but it would at
least bound it to some degree and make the code more obvious.  I suspect it would
also reduce the code churn, though that's not necessarily a goal in and of itself.

E.g. for fast_page_fault():

	walk_shadow_page_lockless_begin(vcpu);

	do {
		sptep = get_spte_lockless(..., &spte);
		if (!is_shadow_present_pte(spte))
			break;

		sp = sptep_to_sp(sptep);
		if (!is_last_spte(spte, sp->role.level))
			break;

		...
	} while (true);

	walk_shadow_page_lockless_end(vcpu);

and for get_mmio_spte():

	walk_shadow_page_lockless_begin(vcpu);
	leaf = get_sptes_lockless(vcpu, addr, sptes, &root);
	/* sptes[] holds copies, so the walk can end before they're inspected. */
	walk_shadow_page_lockless_end(vcpu);

	if (unlikely(leaf < 0)) {
		*sptep = 0ull;
		return reserved;
	}
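
For illustration only, the split can be modeled in userspace with a toy 4-level
table.  Nothing below is KVM code: the sk_* names, the two flag bits, and the
2-bit-per-level index are all made up for the sketch.  It just shows the shape
of the two helpers, one copying values at every level and one returning the
last SPTE visited plus its pointer, with a counter standing in for the "lock".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;

#define SK_ROOT_LEVEL 4            /* hypothetical 4-level table */
#define SK_ENTRIES    4            /* toy: 2 index bits per level */
#define SK_PRESENT    (1ULL << 0)  /* toy "present" bit */
#define SK_LEAF       (1ULL << 1)  /* toy "last-level" bit */

struct sk_table {
	u64 entries[SK_ENTRIES];
	struct sk_table *child[SK_ENTRIES]; /* kept separate for simplicity */
};

/* Stand-in for the begin/end "lock"; just tracks nesting depth. */
static int sk_lockless_depth;

static void sk_walk_begin(void) { sk_lockless_depth++; }

static void sk_walk_end(void)
{
	assert(sk_lockless_depth > 0);
	sk_lockless_depth--;
}

static unsigned int sk_index(u64 addr, int level)
{
	return (unsigned int)((addr >> (2 * (level - 1))) & (SK_ENTRIES - 1));
}

/* MMIO-style: copy the SPTE value at every level visited, return the last. */
static int sk_get_sptes(struct sk_table *root, u64 addr, u64 *sptes)
{
	struct sk_table *t = root;
	int leaf = -1;

	for (int level = SK_ROOT_LEVEL; level >= 1 && t; level--) {
		unsigned int idx = sk_index(addr, level);
		u64 spte = t->entries[idx];

		leaf = level;
		sptes[level] = spte;
		if (!(spte & SK_PRESENT) || (spte & SK_LEAF))
			break;
		t = t->child[idx];
	}
	return leaf;
}

/* Fast-page-fault-style: return the last SPTE visited and its pointer. */
static u64 *sk_get_spte(struct sk_table *root, u64 addr, u64 *spte)
{
	struct sk_table *t = root;
	u64 *sptep = NULL;

	*spte = 0;
	for (int level = SK_ROOT_LEVEL; level >= 1 && t; level--) {
		unsigned int idx = sk_index(addr, level);

		sptep = &t->entries[idx];
		*spte = *sptep;
		if (!(*spte & SK_PRESENT) || (*spte & SK_LEAF))
			break;
		t = t->child[idx];
	}
	return sptep;
}

/* Values-only caller: the "lock" is dropped before the copies are used. */
static int sk_mmio_style(struct sk_table *root, u64 addr, u64 *sptes)
{
	int leaf;

	sk_walk_begin();
	leaf = sk_get_sptes(root, addr, sptes);
	sk_walk_end();
	return leaf;
}
```

A caller of sk_get_spte() would keep sk_walk_begin()/sk_walk_end() around every
use of the returned pointer, whereas sk_mmio_style() can release immediately
because it only ever touches copies.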

And if we look at the TDP MMU implementations, they aren't sharing _that_ much
code, and the code that is shared isn't all that interesting, e.g. if we really
wanted to we could macro-magic away the boilerplate, but I think even I would
balk at the result :-)

int kvm_tdp_mmu_get_sptes_lockless(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
				   int *root_level)
{
	struct tdp_iter iter;
	struct kvm_mmu *mmu = vcpu->arch.mmu;
	gfn_t gfn = addr >> PAGE_SHIFT;
	int leaf = -1;

	*root_level = vcpu->arch.mmu->shadow_root_level;

	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
		leaf = iter.level;
		sptes[leaf] = iter.old_spte;
	}

	return leaf;
}

u64 *kvm_tdp_mmu_get_spte_lockless(struct kvm_vcpu *vcpu, u64 addr, u64 *spte)
{
	struct kvm_mmu *mmu = vcpu->arch.mmu;
	gfn_t gfn = addr >> PAGE_SHIFT;
	struct tdp_iter iter;
	u64 *sptep = NULL;

	*spte = 0ull;

	tdp_mmu_for_each_pte(iter, mmu, gfn, gfn + 1) {
		/*
		 * Here be a comment about the unfortunate differences between
		 * the TDP MMU and the legacy MMU.
		 */
		sptep = (u64 * __force)iter.sptep;
		*spte = iter.old_spte;
	}

	return sptep;
}