From: "Nicholas Piggin" <npiggin@gmail.com>
To: "Yu Zhao" <yuzhao@google.com>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Alistair Popple" <apopple@nvidia.com>,
"Anup Patel" <anup@brainfault.org>,
"Ben Gardon" <bgardon@google.com>,
"Borislav Petkov" <bp@alien8.de>,
"Catalin Marinas" <catalin.marinas@arm.com>,
"Chao Peng" <chao.p.peng@linux.intel.com>,
"Christophe Leroy" <christophe.leroy@csgroup.eu>,
"Dave Hansen" <dave.hansen@linux.intel.com>,
"Fabiano Rosas" <farosas@linux.ibm.com>,
"Gaosheng Cui" <cuigaosheng1@huawei.com>,
"Gavin Shan" <gshan@redhat.com>, "H. Peter Anvin" <hpa@zytor.com>,
"Ingo Molnar" <mingo@redhat.com>,
"James Morse" <james.morse@arm.com>,
"Jason A. Donenfeld" <Jason@zx2c4.com>,
"Jason Gunthorpe" <jgg@ziepe.ca>,
"Jonathan Corbet" <corbet@lwn.net>,
"Marc Zyngier" <maz@kernel.org>,
"Masami Hiramatsu" <mhiramat@kernel.org>,
"Michael Ellerman" <mpe@ellerman.id.au>,
"Michael Larabel" <michael@michaellarabel.com>,
"Mike Rapoport" <rppt@kernel.org>,
"Oliver Upton" <oliver.upton@linux.dev>,
"Paul Mackerras" <paulus@ozlabs.org>,
"Peter Xu" <peterx@redhat.com>,
"Sean Christopherson" <seanjc@google.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Suzuki K Poulose" <suzuki.poulose@arm.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Thomas Huth" <thuth@redhat.com>, "Will Deacon" <will@kernel.org>,
"Zenghui Yu" <yuzenghui@huawei.com>, <kvmarm@lists.linux.dev>,
<kvm@vger.kernel.org>, <linux-arm-kernel@lists.infradead.org>,
<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<linux-mm@kvack.org>, <linuxppc-dev@lists.ozlabs.org>,
<linux-trace-kernel@vger.kernel.org>, <x86@kernel.org>,
<linux-mm@google.com>
Subject: Re: [PATCH mm-unstable v2 06/10] kvm/powerpc: make radix page tables RCU safe
Date: Tue, 20 Jun 2023 20:49:56 +1000 [thread overview]
Message-ID: <CTHF42QOSYR0.1Y16SFUF11F3X@wheely> (raw)
In-Reply-To: <CAOUHufY8egkNrxQwd6ms4j6ziyUW5uDjD=yhkxHLqAAOGB4Ccw@mail.gmail.com>
On Tue Jun 20, 2023 at 6:00 PM AEST, Yu Zhao wrote:
> On Tue, Jun 20, 2023 at 12:33 AM Nicholas Piggin <npiggin@gmail.com> wrote:
> >
> > On Sat May 27, 2023 at 9:44 AM AEST, Yu Zhao wrote:
> > > KVM page tables are currently not RCU safe against remapping, i.e.,
> > > kvmppc_unmap_free_pmd_entry_table() et al. The previous
> >
> > Minor nit but the "page table" is not RCU-safe against something. It
> > is RCU-freed, and therefore some algorithm that accesses it can have
> > the existence guarantee provided by RCU (usually there still needs
> > to be more to it).
> >
> > > mmu_notifier_ops members rely on kvm->mmu_lock to synchronize with
> > > that operation.
> > >
> > > However, the new mmu_notifier_ops member test_clear_young() provides
> > > a fast path that does not take kvm->mmu_lock. To implement
> > > kvm_arch_test_clear_young() for that path, orphan page tables need to
> > > be freed by RCU.
> >
> > Short version: clear the referenced bit using RCU instead of MMU lock
> > to protect against page table freeing, and there is no problem with
> > clearing the bit in a table that has been freed.
> >
> > Seems reasonable.
>
> Thanks. All above points taken.
>
> > > Unmapping, specifically kvm_unmap_radix(), does not free page tables,
> > > hence not a concern.
> >
> > Not sure if you really need to make the distinction about why the page
> > table is freed, we might free them via unmapping. The point is just
> > anything that frees them while there can be concurrent access, right?
>
> Correct.
>
> > > Signed-off-by: Yu Zhao <yuzhao@google.com>
> > > ---
> > > arch/powerpc/kvm/book3s_64_mmu_radix.c | 6 ++++--
> > > 1 file changed, 4 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > > index 461307b89c3a..3b65b3b11041 100644
> > > --- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > > +++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
> > > @@ -1469,13 +1469,15 @@ int kvmppc_radix_init(void)
> > > {
> > > unsigned long size = sizeof(void *) << RADIX_PTE_INDEX_SIZE;
> > >
> > > - kvm_pte_cache = kmem_cache_create("kvm-pte", size, size, 0, pte_ctor);
> > > + kvm_pte_cache = kmem_cache_create("kvm-pte", size, size,
> > > + SLAB_TYPESAFE_BY_RCU, pte_ctor);
> > > if (!kvm_pte_cache)
> > > return -ENOMEM;
> > >
> > > size = sizeof(void *) << RADIX_PMD_INDEX_SIZE;
> > >
> > > - kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size, 0, pmd_ctor);
> > > + kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size,
> > > + SLAB_TYPESAFE_BY_RCU, pmd_ctor);
> > > if (!kvm_pmd_cache) {
> > > kmem_cache_destroy(kvm_pte_cache);
> > > return -ENOMEM;
> >
> > KVM PPC HV radix PUD level page tables use the arch/powerpc allocators
> > (for some reason), which are not RCU freed. I think you need them too?
>
> We don't. The use of the arch/powerpc allocator for PUD tables seems
> appropriate to me because, unlike PMD/PTE tables, we never free PUD
> tables during the lifetime of a VM:
Ah you're right, the pud_free only comes from the double alloc case
so it's never visible to concurrent threads.
> * We don't free PUD/PMD/PTE tables when they become empty, i.e., not
> mapping any pages but still attached. (We could in theory, as
> x86/aarch64 do.)
We may try to do that at some point, but that's not related to your
patch for now so no worries.
Thanks,
Nick
next prev parent reply other threads:[~2023-06-20 10:50 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-26 23:44 [PATCH mm-unstable v2 00/10] mm/kvm: locklessly clear the accessed bit Yu Zhao
2023-05-26 23:44 ` [PATCH mm-unstable v2 01/10] mm/kvm: add mmu_notifier_ops->test_clear_young() Yu Zhao
2023-06-06 8:34 ` Tzung-Bi Shih
2023-06-09 1:00 ` Yu Zhao
[not found] ` <ZHedMX470b7EMwbe@ziepe.ca>
2023-06-09 9:04 ` Paolo Bonzini
2023-06-15 17:42 ` Sean Christopherson
2023-06-20 7:30 ` Nicholas Piggin
2023-05-26 23:44 ` [PATCH mm-unstable v2 02/10] mm/kvm: use mmu_notifier_ops->test_clear_young() Yu Zhao
2023-05-26 23:44 ` [PATCH mm-unstable v2 03/10] kvm/arm64: export stage2_try_set_pte() and macros Yu Zhao
2023-05-26 23:44 ` [PATCH mm-unstable v2 04/10] kvm/arm64: make stage2 page tables RCU safe Yu Zhao
2023-05-27 18:08 ` Oliver Upton
2023-05-27 20:13 ` Yu Zhao
2023-05-30 19:37 ` Oliver Upton
2023-05-30 20:06 ` Yu Zhao
[not found] ` <ZHef0VsZvZ1Vnz0u@linux.dev>
2023-05-31 23:10 ` Yu Zhao
2023-05-31 23:22 ` Oliver Upton
2023-05-31 23:41 ` Yu Zhao
2023-05-26 23:44 ` [PATCH mm-unstable v2 05/10] kvm/arm64: add kvm_arch_test_clear_young() Yu Zhao
2023-05-26 23:44 ` [PATCH mm-unstable v2 06/10] kvm/powerpc: make radix page tables RCU safe Yu Zhao
2023-06-20 6:32 ` Nicholas Piggin
2023-06-20 8:00 ` Yu Zhao
2023-06-20 10:49 ` Nicholas Piggin [this message]
2023-05-26 23:44 ` [PATCH mm-unstable v2 07/10] kvm/powerpc: add kvm_arch_test_clear_young() Yu Zhao
2023-06-20 7:47 ` Nicholas Piggin
2023-06-21 0:38 ` Yu Zhao
2023-06-21 2:51 ` Nicholas Piggin
2023-05-26 23:44 ` [PATCH mm-unstable v2 08/10] kvm/x86: move tdp_mmu_enabled and shadow_accessed_mask Yu Zhao
2023-06-15 16:59 ` Sean Christopherson
2023-05-26 23:44 ` [PATCH mm-unstable v2 09/10] kvm/x86: add kvm_arch_test_clear_young() Yu Zhao
2023-06-09 9:06 ` Paolo Bonzini
2023-06-15 18:26 ` Sean Christopherson
2023-05-26 23:44 ` [PATCH mm-unstable v2 10/10] mm: multi-gen LRU: use mmu_notifier_test_clear_young() Yu Zhao
2023-06-09 0:59 ` kvm/arm64: Spark benchmark Yu Zhao
2023-06-09 13:04 ` Marc Zyngier
2023-06-18 20:11 ` Yu Zhao
2023-06-09 0:59 ` kvm/powerpc: memcached benchmark Yu Zhao
2023-06-09 0:59 ` kvm/x86: multichase benchmark Yu Zhao
2023-06-18 19:19 ` Yu Zhao
2023-06-09 9:07 ` [PATCH mm-unstable v2 00/10] mm/kvm: locklessly clear the accessed bit Paolo Bonzini
2023-06-20 2:19 ` Yu Zhao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CTHF42QOSYR0.1Y16SFUF11F3X@wheely \
--to=npiggin@gmail.com \
--cc=Jason@zx2c4.com \
--cc=akpm@linux-foundation.org \
--cc=anup@brainfault.org \
--cc=apopple@nvidia.com \
--cc=bgardon@google.com \
--cc=bp@alien8.de \
--cc=catalin.marinas@arm.com \
--cc=chao.p.peng@linux.intel.com \
--cc=christophe.leroy@csgroup.eu \
--cc=corbet@lwn.net \
--cc=cuigaosheng1@huawei.com \
--cc=dave.hansen@linux.intel.com \
--cc=farosas@linux.ibm.com \
--cc=gshan@redhat.com \
--cc=hpa@zytor.com \
--cc=james.morse@arm.com \
--cc=jgg@ziepe.ca \
--cc=kvm@vger.kernel.org \
--cc=kvmarm@lists.linux.dev \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@google.com \
--cc=linux-mm@kvack.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=maz@kernel.org \
--cc=mhiramat@kernel.org \
--cc=michael@michaellarabel.com \
--cc=mingo@redhat.com \
--cc=mpe@ellerman.id.au \
--cc=oliver.upton@linux.dev \
--cc=paulus@ozlabs.org \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=rostedt@goodmis.org \
--cc=rppt@kernel.org \
--cc=seanjc@google.com \
--cc=suzuki.poulose@arm.com \
--cc=tglx@linutronix.de \
--cc=thuth@redhat.com \
--cc=will@kernel.org \
--cc=x86@kernel.org \
--cc=yuzenghui@huawei.com \
--cc=yuzhao@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).