From: Oliver Upton <oliver.upton@linux.dev>
To: Yu Zhao <yuzhao@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Paolo Bonzini <pbonzini@redhat.com>,
Alistair Popple <apopple@nvidia.com>,
Anup Patel <anup@brainfault.org>, Ben Gardon <bgardon@google.com>,
Borislav Petkov <bp@alien8.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Chao Peng <chao.p.peng@linux.intel.com>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Dave Hansen <dave.hansen@linux.intel.com>,
Fabiano Rosas <farosas@linux.ibm.com>,
Gaosheng Cui <cuigaosheng1@huawei.com>,
Gavin Shan <gshan@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
Ingo Molnar <mingo@redhat.com>, James Morse <james.morse@arm.com>,
"Jason A. Donenfeld" <Jason@zx2c4.com>,
Jason Gunthorpe <jgg@ziepe.ca>, Jonathan Corbet <corbet@lwn.net>,
Marc Zyngier <maz@kernel.org>,
Masami Hiramatsu <mhiramat@kernel.org>,
Michael Ellerman <mpe@ellerman.id.au>,
Michael Larabel <michael@michaellarabel.com>,
Mike Rapoport <rppt@kernel.org>,
Nicholas Piggin <npiggin@gmail.com>,
Paul Mackerras <paulus@ozlabs.org>, Peter Xu <peterx@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Steven Rostedt <rostedt@goodmis.org>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
Thomas Gleixner <tglx@linutronix.de>,
Thomas Huth <thuth@redhat.com>, Will Deacon <will@kernel.org>,
Zenghui Yu <yuzenghui@huawei.com>,
kvmarm@lists.linux.dev, kvm@vger.kernel.org,
linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linuxppc-dev@lists.ozlabs.org,
linux-trace-kernel@vger.kernel.org, x86@kernel.org,
linux-mm@google.com
Subject: Re: [PATCH mm-unstable v2 05/10] kvm/arm64: add kvm_arch_test_clear_young()
Date: Wed, 31 May 2023 19:56:01 +0000
Message-ID: <ZHemUc3DiSbxQbxJ@linux.dev>
In-Reply-To: <20230526234435.662652-6-yuzhao@google.com>
Hi Yu,
On Fri, May 26, 2023 at 05:44:30PM -0600, Yu Zhao wrote:
> Implement kvm_arch_test_clear_young() to support the fast path in
> mmu_notifier_ops->test_clear_young().
>
> It focuses on a simple case, i.e., hardware sets the accessed bit in
> KVM PTEs and VMs are not protected, where it can rely on RCU and
> cmpxchg to safely clear the accessed bit without taking
> kvm->mmu_lock. Complex cases fall back to the existing slow path
> where kvm->mmu_lock is then taken.
>
> Signed-off-by: Yu Zhao <yuzhao@google.com>
> ---
> arch/arm64/include/asm/kvm_host.h | 6 ++++++
> arch/arm64/kvm/mmu.c | 36 +++++++++++++++++++++++++++++++
> 2 files changed, 42 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 7e7e19ef6993..da32b0890716 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -1113,4 +1113,10 @@ static inline void kvm_hyp_reserve(void) { }
> void kvm_arm_vcpu_power_off(struct kvm_vcpu *vcpu);
> bool kvm_arm_vcpu_stopped(struct kvm_vcpu *vcpu);
>
> +#define kvm_arch_has_test_clear_young kvm_arch_has_test_clear_young
> +static inline bool kvm_arch_has_test_clear_young(void)
> +{
> +	return cpu_has_hw_af() && !is_protected_kvm_enabled();
> +}
I would *strongly* suggest you consider supporting test_clear_young on
systems that do software Access Flag management. FEAT_HAFDBS is an
*optional* extension to the architecture, so we're going to support
software AF management for a very long time in KVM. It is also a valid
fallback option in the case of hardware errata which render HAFDBS
broken.
So, we should expect (and support) systems of all shapes and sizes that
do software AF. I'm sure we'll hear about more in the not-too-distant
future...
For future reference (even though I'm suggesting you support software
AF), decisions such as these need an extremely verbose comment
describing the rationale behind the decision.
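One possible shape for such a comment is sketched below, purely as an illustration. The rationale text is taken from the patch description quoted above; the two stub helpers and their backing variables are stand-ins so the fragment compiles outside the kernel tree, and they are not how the kernel implements cpu_has_hw_af() or is_protected_kvm_enabled().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel helpers, for illustration only. */
static bool stub_hw_af = true;      /* pretend hardware AF is available */
static bool stub_protected = false; /* pretend protected KVM is off */

static bool cpu_has_hw_af(void) { return stub_hw_af; }
static bool is_protected_kvm_enabled(void) { return stub_protected; }

/*
 * The fast path clears the accessed bit with RCU and cmpxchg, without
 * taking kvm->mmu_lock. As posted, it only covers the simple case:
 *
 *  - hardware sets the accessed bit in KVM PTEs (cpu_has_hw_af()), and
 *  - VMs are not protected (!is_protected_kvm_enabled()).
 *
 * Every other configuration falls back to the existing slow path,
 * which takes kvm->mmu_lock.
 */
static inline bool kvm_arch_has_test_clear_young(void)
{
	return cpu_has_hw_af() && !is_protected_kvm_enabled();
}

/* Small self-check of the default stub configuration. */
static void sketch_selfcheck(void)
{
	assert(kvm_arch_has_test_clear_young());
}
```

The comment spells out both conditions and what happens when they do not hold, so a reader does not have to reconstruct the rationale from the commit message.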
> +
> #endif /* __ARM64_KVM_HOST_H__ */
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index c3b3e2afe26f..26a8d955b49c 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
Please do not implement page table walkers outside of hyp/pgtable.c.
> @@ -1678,6 +1678,42 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
> range->start << PAGE_SHIFT);
> }
>
> +static int stage2_test_clear_young(const struct kvm_pgtable_visit_ctx *ctx,
> +				   enum kvm_pgtable_walk_flags flags)
> +{
> +	kvm_pte_t new = ctx->old & ~KVM_PTE_LEAF_ATTR_LO_S2_AF;
> +
> +	VM_WARN_ON_ONCE(!page_count(virt_to_page(ctx->ptep)));
This sort of sanity checking is a bit excessive. Isn't there a risk of
false negatives here too? IOW, if we tragically mess up RCU in the page
table code, what's stopping a prematurely freed page from being
allocated to another user?
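The false-negative concern can be shown with a toy model. This is a sketch under stated assumptions, not kernel code: `toy_page`, `toy_alloc_page()`, `toy_free_page()` and `toy_check_fires()` are hypothetical names, and the "allocator" is a four-slot array in which a zero refcount stands in for page_count() == 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: a "page" is nothing more than a refcount slot. */
#define TOY_NR_PAGES 4

struct toy_page {
	int refcount;	/* 0 means free, standing in for page_count() == 0 */
};

static struct toy_page toy_pages[TOY_NR_PAGES];

/* Hand out the first free page and take one reference on it. */
static struct toy_page *toy_alloc_page(void)
{
	for (size_t i = 0; i < TOY_NR_PAGES; i++) {
		if (toy_pages[i].refcount == 0) {
			toy_pages[i].refcount = 1;
			return &toy_pages[i];
		}
	}
	return NULL;
}

static void toy_free_page(struct toy_page *page)
{
	page->refcount = 0;
}

/* Stand-in for the VM_WARN_ON_ONCE(!page_count(...)) style of check. */
static bool toy_check_fires(const struct toy_page *page)
{
	return page->refcount == 0;
}

/*
 * Simulate a walker holding a stale pointer to a page-table page that
 * was freed too early. Returns true if the sanity check catches it.
 * With realloc_first set, another user takes the page before the check.
 */
static bool toy_stale_walk_detected(bool realloc_first)
{
	struct toy_page *table = toy_alloc_page();
	struct toy_page *stale = table;	/* the walker's stale pointer */
	bool detected;

	assert(table != NULL);
	toy_free_page(table);		/* premature free */
	if (realloc_first)
		(void)toy_alloc_page();	/* same slot goes to a new user */

	detected = toy_check_fires(stale);

	/* Reset the toy allocator so calls do not affect each other. */
	for (size_t i = 0; i < TOY_NR_PAGES; i++)
		toy_pages[i].refcount = 0;

	return detected;
}
```

In this model the check fires only while the freed page is still unallocated. Once the slot has been handed to another user, its refcount is nonzero again and the stale access goes unnoticed.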
> +	if (!kvm_pte_valid(new))
> +		return 0;
> +
> +	if (new == ctx->old)
> +		return 0;
> +
> +	if (kvm_should_clear_young(ctx->arg, ctx->addr / PAGE_SIZE))
> +		stage2_try_set_pte(ctx, new);
> +
> +	return 0;
> +}
> +
> +bool kvm_arch_test_clear_young(struct kvm *kvm, struct kvm_gfn_range *range)
> +{
> +	u64 start = range->start * PAGE_SIZE;
> +	u64 end = range->end * PAGE_SIZE;
> +	struct kvm_pgtable_walker walker = {
> +		.cb	= stage2_test_clear_young,
> +		.arg	= range,
> +		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_SHARED,
> +	};
> +
> +	BUILD_BUG_ON(is_hyp_code());
Delete this assertion.
> +	kvm_pgtable_walk(kvm->arch.mmu.pgt, start, end - start, &walker);
> +
> +	return false;
> +}
> +
> phys_addr_t kvm_mmu_get_httbr(void)
> {
> return __pa(hyp_pgtable->pgd);
> --
> 2.41.0.rc0.172.g3f132b7071-goog
>
--
Thanks,
Oliver