* [PATCH v1 0/3] mm: split PTE/PMD PT table Kconfig cleanups+clarifications
@ 2024-07-26 15:07 David Hildenbrand
2024-07-26 15:07 ` [PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PMD_PTLOCKS into Kconfig options David Hildenbrand
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: David Hildenbrand @ 2024-07-26 15:07 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, linux-arm-kernel, x86, linuxppc-dev, xen-devel,
linux-fsdevel, David Hildenbrand, Andrew Morton, Oscar Salvador,
Peter Xu, Muchun Song, Russell King, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Naveen N. Rao, Juergen Gross,
Boris Ostrovsky, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, H. Peter Anvin, Alexander Viro, Christian Brauner
This series is a follow-up to the fixes [1]:
"[PATCH v1 0/2] mm/hugetlb: fix hugetlb vs. core-mm PT locking"
When working on the fixes, I wondered why 8xx is fine (-> never uses split
PT locks) and how PT locking even works properly with PMD page table
sharing (-> always requires split PMD PT locks).
Let's improve the split PT lock detection, make hugetlb properly depend
on it and make 8xx bail out if it would ever get enabled by accident.
As an alternative to patch #3 we could extend the Kconfig SPLIT_PTE_PTLOCKS
option from patch #1 -- but enforcing it closer to the code that actually
implements it feels a bit nicer for documentation purposes, and there
is no need to actually disable the option there: on 8xx (!SMP) it is
always disabled anyway.
Did a bunch of cross-compilations to make sure that split PTE/PMD
PT locks are still getting used where we would expect them.
[1] https://lkml.kernel.org/r/20240725183955.2268884-1-david@redhat.com
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Peter Xu <peterx@redhat.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
David Hildenbrand (3):
mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PMD_PTLOCKS into Kconfig
options
mm/hugetlb: enforce that PMD PT sharing has split PMD PT locks
powerpc/8xx: document and enforce that split PT locks are not used
arch/arm/mm/fault-armv.c | 6 +++---
arch/powerpc/mm/pgtable.c | 6 ++++++
arch/x86/xen/mmu_pv.c | 7 ++++---
fs/Kconfig | 4 ++++
include/linux/hugetlb.h | 5 ++---
include/linux/mm.h | 8 ++++----
include/linux/mm_types.h | 2 +-
include/linux/mm_types_task.h | 3 ---
kernel/fork.c | 4 ++--
mm/Kconfig | 18 +++++++++++-------
mm/hugetlb.c | 8 ++++----
mm/memory.c | 2 +-
12 files changed, 42 insertions(+), 31 deletions(-)
--
2.45.2
^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PMD_PTLOCKS into Kconfig options
  2024-07-26 15:07 [PATCH v1 0/3] mm: split PTE/PMD PT table Kconfig cleanups+clarifications David Hildenbrand
@ 2024-07-26 15:07 ` David Hildenbrand
  2024-07-28 12:45   ` Mike Rapoport
    ` (2 more replies)
  2024-07-26 15:07 ` [PATCH v1 2/3] mm/hugetlb: enforce that PMD PT sharing has split PMD PT locks David Hildenbrand
  2024-07-26 15:07 ` [PATCH v1 3/3] powerpc/8xx: document and enforce that split PT locks are not used David Hildenbrand
  2 siblings, 3 replies; 8+ messages in thread
From: David Hildenbrand @ 2024-07-26 15:07 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, linux-arm-kernel, x86, linuxppc-dev, xen-devel,
	linux-fsdevel, David Hildenbrand, Andrew Morton, Oscar Salvador,
	Peter Xu, Muchun Song, Russell King, Michael Ellerman,
	Nicholas Piggin, Christophe Leroy, Naveen N. Rao, Juergen Gross,
	Boris Ostrovsky, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Alexander Viro, Christian Brauner

Let's clean that up a bit and prepare for depending on
CONFIG_SPLIT_PMD_PTLOCKS in other Kconfig options.

More cleanups would be reasonable (like the arch-specific "depends on"
for CONFIG_SPLIT_PTE_PTLOCKS), but we'll leave that for another day.
Signed-off-by: David Hildenbrand <david@redhat.com> --- arch/arm/mm/fault-armv.c | 6 +++--- arch/x86/xen/mmu_pv.c | 7 ++++--- include/linux/mm.h | 8 ++++---- include/linux/mm_types.h | 2 +- include/linux/mm_types_task.h | 3 --- kernel/fork.c | 4 ++-- mm/Kconfig | 18 +++++++++++------- mm/memory.c | 2 +- 8 files changed, 26 insertions(+), 24 deletions(-) diff --git a/arch/arm/mm/fault-armv.c b/arch/arm/mm/fault-armv.c index 2286c2ea60ec4..831793cd6ff94 100644 --- a/arch/arm/mm/fault-armv.c +++ b/arch/arm/mm/fault-armv.c @@ -61,7 +61,7 @@ static int do_adjust_pte(struct vm_area_struct *vma, unsigned long address, return ret; } -#if USE_SPLIT_PTE_PTLOCKS +#if defined(CONFIG_SPLIT_PTE_PTLOCKS) /* * If we are using split PTE locks, then we need to take the page * lock here. Otherwise we are using shared mm->page_table_lock @@ -80,10 +80,10 @@ static inline void do_pte_unlock(spinlock_t *ptl) { spin_unlock(ptl); } -#else /* !USE_SPLIT_PTE_PTLOCKS */ +#else /* !defined(CONFIG_SPLIT_PTE_PTLOCKS) */ static inline void do_pte_lock(spinlock_t *ptl) {} static inline void do_pte_unlock(spinlock_t *ptl) {} -#endif /* USE_SPLIT_PTE_PTLOCKS */ +#endif /* defined(CONFIG_SPLIT_PTE_PTLOCKS) */ static int adjust_pte(struct vm_area_struct *vma, unsigned long address, unsigned long pfn) diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c index f1ce39d6d32cb..f4a316894bbb4 100644 --- a/arch/x86/xen/mmu_pv.c +++ b/arch/x86/xen/mmu_pv.c @@ -665,7 +665,7 @@ static spinlock_t *xen_pte_lock(struct page *page, struct mm_struct *mm) { spinlock_t *ptl = NULL; -#if USE_SPLIT_PTE_PTLOCKS +#if defined(CONFIG_SPLIT_PTE_PTLOCKS) ptl = ptlock_ptr(page_ptdesc(page)); spin_lock_nest_lock(ptl, &mm->page_table_lock); #endif @@ -1553,7 +1553,8 @@ static inline void xen_alloc_ptpage(struct mm_struct *mm, unsigned long pfn, __set_pfn_prot(pfn, PAGE_KERNEL_RO); - if (level == PT_PTE && USE_SPLIT_PTE_PTLOCKS && !pinned) + if (level == PT_PTE && IS_ENABLED(CONFIG_SPLIT_PTE_PTLOCKS) && + !pinned) 
__pin_pagetable_pfn(MMUEXT_PIN_L1_TABLE, pfn); xen_mc_issue(XEN_LAZY_MMU); @@ -1581,7 +1582,7 @@ static inline void xen_release_ptpage(unsigned long pfn, unsigned level) if (pinned) { xen_mc_batch(); - if (level == PT_PTE && USE_SPLIT_PTE_PTLOCKS) + if (level == PT_PTE && IS_ENABLED(CONFIG_SPLIT_PTE_PTLOCKS)) __pin_pagetable_pfn(MMUEXT_UNPIN_TABLE, pfn); __set_pfn_prot(pfn, PAGE_KERNEL); diff --git a/include/linux/mm.h b/include/linux/mm.h index 0472a5090b180..dff43101572ec 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2843,7 +2843,7 @@ static inline void pagetable_free(struct ptdesc *pt) __free_pages(page, compound_order(page)); } -#if USE_SPLIT_PTE_PTLOCKS +#if defined(CONFIG_SPLIT_PTE_PTLOCKS) #if ALLOC_SPLIT_PTLOCKS void __init ptlock_cache_init(void); bool ptlock_alloc(struct ptdesc *ptdesc); @@ -2895,7 +2895,7 @@ static inline bool ptlock_init(struct ptdesc *ptdesc) return true; } -#else /* !USE_SPLIT_PTE_PTLOCKS */ +#else /* !defined(CONFIG_SPLIT_PTE_PTLOCKS) */ /* * We use mm->page_table_lock to guard all pagetable pages of the mm. */ @@ -2906,7 +2906,7 @@ static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pte_t *pte) static inline void ptlock_cache_init(void) {} static inline bool ptlock_init(struct ptdesc *ptdesc) { return true; } static inline void ptlock_free(struct ptdesc *ptdesc) {} -#endif /* USE_SPLIT_PTE_PTLOCKS */ +#endif /* defined(CONFIG_SPLIT_PTE_PTLOCKS) */ static inline bool pagetable_pte_ctor(struct ptdesc *ptdesc) { @@ -2966,7 +2966,7 @@ pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd, ((unlikely(pmd_none(*(pmd))) && __pte_alloc_kernel(pmd))? 
\ NULL: pte_offset_kernel(pmd, address)) -#if USE_SPLIT_PMD_PTLOCKS +#if defined(CONFIG_SPLIT_PMD_PTLOCKS) static inline struct page *pmd_pgtable_page(pmd_t *pmd) { diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 4854249792545..165c58b12ccc9 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -947,7 +947,7 @@ struct mm_struct { #ifdef CONFIG_MMU_NOTIFIER struct mmu_notifier_subscriptions *notifier_subscriptions; #endif -#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !defined(CONFIG_SPLIT_PMD_PTLOCKS) pgtable_t pmd_huge_pte; /* protected by page_table_lock */ #endif #ifdef CONFIG_NUMA_BALANCING diff --git a/include/linux/mm_types_task.h b/include/linux/mm_types_task.h index a2f6179b672b8..bff5706b76e14 100644 --- a/include/linux/mm_types_task.h +++ b/include/linux/mm_types_task.h @@ -16,9 +16,6 @@ #include <asm/tlbbatch.h> #endif -#define USE_SPLIT_PTE_PTLOCKS (NR_CPUS >= CONFIG_SPLIT_PTLOCK_CPUS) -#define USE_SPLIT_PMD_PTLOCKS (USE_SPLIT_PTE_PTLOCKS && \ - IS_ENABLED(CONFIG_ARCH_ENABLE_SPLIT_PMD_PTLOCK)) #define ALLOC_SPLIT_PTLOCKS (SPINLOCK_SIZE > BITS_PER_LONG/8) /* diff --git a/kernel/fork.c b/kernel/fork.c index a8362c26ebcb0..216ce9ba4f4e6 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -832,7 +832,7 @@ static void check_mm(struct mm_struct *mm) pr_alert("BUG: non-zero pgtables_bytes on freeing mm: %ld\n", mm_pgtables_bytes(mm)); -#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !defined(CONFIG_SPLIT_PMD_PTLOCKS) VM_BUG_ON_MM(mm->pmd_huge_pte, mm); #endif } @@ -1276,7 +1276,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p, RCU_INIT_POINTER(mm->exe_file, NULL); mmu_notifier_subscriptions_init(mm); init_tlb_flush_pending(mm); -#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && 
!defined(CONFIG_SPLIT_PMD_PTLOCKS) mm->pmd_huge_pte = NULL; #endif mm_init_uprobes_state(mm); diff --git a/mm/Kconfig b/mm/Kconfig index b72e7d040f789..7b716ac802726 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -585,17 +585,21 @@ config ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE # at the same time (e.g. copy_page_range()). # DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC spinlock_t also enlarge struct page. # -config SPLIT_PTLOCK_CPUS - int - default "999999" if !MMU - default "999999" if ARM && !CPU_CACHE_VIPT - default "999999" if PARISC && !PA20 - default "999999" if SPARC32 - default "4" +config SPLIT_PTE_PTLOCKS + def_bool y + depends on MMU + depends on NR_CPUS >= 4 + depends on !ARM || CPU_CACHE_VIPT + depends on !PARISC || PA20 + depends on !SPARC32 config ARCH_ENABLE_SPLIT_PMD_PTLOCK bool +config SPLIT_PMD_PTLOCKS + def_bool y + depends on SPLIT_PTE_PTLOCKS && ARCH_ENABLE_SPLIT_PMD_PTLOCK + # # support for memory balloon config MEMORY_BALLOON diff --git a/mm/memory.c b/mm/memory.c index 833d2cad6eb29..714589582fe15 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -6559,7 +6559,7 @@ long copy_folio_from_user(struct folio *dst_folio, } #endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_HUGETLBFS */ -#if USE_SPLIT_PTE_PTLOCKS && ALLOC_SPLIT_PTLOCKS +#if defined(CONFIG_SPLIT_PTE_PTLOCKS) && ALLOC_SPLIT_PTLOCKS static struct kmem_cache *page_ptl_cachep; -- 2.45.2 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PMD_PTLOCKS into Kconfig options
  2024-07-26 15:07 ` [PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PMD_PTLOCKS into Kconfig options David Hildenbrand
@ 2024-07-28 12:45   ` Mike Rapoport
  2024-07-29  7:56   ` Qi Zheng
  2024-07-29 11:33   ` Russell King (Oracle)
  2 siblings, 0 replies; 8+ messages in thread
From: Mike Rapoport @ 2024-07-28 12:45 UTC (permalink / raw)
To: David Hildenbrand
Cc: linux-kernel, linux-mm, linux-arm-kernel, x86, linuxppc-dev,
	xen-devel, linux-fsdevel, Andrew Morton, Oscar Salvador, Peter Xu,
	Muchun Song, Russell King, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N. Rao, Juergen Gross, Boris Ostrovsky,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Alexander Viro, Christian Brauner

On Fri, Jul 26, 2024 at 05:07:26PM +0200, David Hildenbrand wrote:
> Let's clean that up a bit and prepare for depending on
> CONFIG_SPLIT_PMD_PTLOCKS in other Kconfig options.
> 
> More cleanups would be reasonable (like the arch-specific "depends on"
> for CONFIG_SPLIT_PTE_PTLOCKS), but we'll leave that for another day.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: [PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PMD_PTLOCKS into Kconfig options
  2024-07-26 15:07 ` [PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PMD_PTLOCKS into Kconfig options David Hildenbrand
  2024-07-28 12:45   ` Mike Rapoport
@ 2024-07-29  7:56   ` Qi Zheng
  2024-07-29 11:33   ` Russell King (Oracle)
  2 siblings, 0 replies; 8+ messages in thread
From: Qi Zheng @ 2024-07-29  7:56 UTC (permalink / raw)
To: David Hildenbrand
Cc: linux-kernel, linux-mm, linux-arm-kernel, x86, linuxppc-dev,
	xen-devel, linux-fsdevel, Andrew Morton, Oscar Salvador, Peter Xu,
	Muchun Song, Russell King, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N. Rao, Juergen Gross, Boris Ostrovsky,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Alexander Viro, Christian Brauner

On 2024/7/26 23:07, David Hildenbrand wrote:
> Let's clean that up a bit and prepare for depending on
> CONFIG_SPLIT_PMD_PTLOCKS in other Kconfig options.
> 
> More cleanups would be reasonable (like the arch-specific "depends on"
> for CONFIG_SPLIT_PTE_PTLOCKS), but we'll leave that for another day.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  arch/arm/mm/fault-armv.c      | 6 +++---
>  arch/x86/xen/mmu_pv.c         | 7 ++++---
>  include/linux/mm.h            | 8 ++++----
>  include/linux/mm_types.h      | 2 +-
>  include/linux/mm_types_task.h | 3 ---
>  kernel/fork.c                 | 4 ++--
>  mm/Kconfig                    | 18 +++++++++++-------
>  mm/memory.c                   | 2 +-
>  8 files changed, 26 insertions(+), 24 deletions(-)

That's great. Thanks!

Reviewed-by: Qi Zheng <zhengqi.arch@bytedance.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread
* Re: [PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PMD_PTLOCKS into Kconfig options
  2024-07-26 15:07 ` [PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PMD_PTLOCKS into Kconfig options David Hildenbrand
  2024-07-28 12:45   ` Mike Rapoport
  2024-07-29  7:56   ` Qi Zheng
@ 2024-07-29 11:33   ` Russell King (Oracle)
  2 siblings, 0 replies; 8+ messages in thread
From: Russell King (Oracle) @ 2024-07-29 11:33 UTC (permalink / raw)
To: David Hildenbrand
Cc: linux-kernel, linux-mm, linux-arm-kernel, x86, linuxppc-dev,
	xen-devel, linux-fsdevel, Andrew Morton, Oscar Salvador, Peter Xu,
	Muchun Song, Michael Ellerman, Nicholas Piggin, Christophe Leroy,
	Naveen N. Rao, Juergen Gross, Boris Ostrovsky, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Alexander Viro, Christian Brauner

On Fri, Jul 26, 2024 at 05:07:26PM +0200, David Hildenbrand wrote:
> Let's clean that up a bit and prepare for depending on
> CONFIG_SPLIT_PMD_PTLOCKS in other Kconfig options.
> 
> More cleanups would be reasonable (like the arch-specific "depends on"
> for CONFIG_SPLIT_PTE_PTLOCKS), but we'll leave that for another day.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

Thanks!

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply	[flat|nested] 8+ messages in thread
* [PATCH v1 2/3] mm/hugetlb: enforce that PMD PT sharing has split PMD PT locks
  2024-07-26 15:07 [PATCH v1 0/3] mm: split PTE/PMD PT table Kconfig cleanups+clarifications David Hildenbrand
  2024-07-26 15:07 ` [PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PMD_PTLOCKS into Kconfig options David Hildenbrand
@ 2024-07-26 15:07 ` David Hildenbrand
  2024-07-28 12:47   ` Mike Rapoport
  2024-07-26 15:07 ` [PATCH v1 3/3] powerpc/8xx: document and enforce that split PT locks are not used David Hildenbrand
  2 siblings, 1 reply; 8+ messages in thread
From: David Hildenbrand @ 2024-07-26 15:07 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, linux-arm-kernel, x86, linuxppc-dev, xen-devel,
	linux-fsdevel, David Hildenbrand, Andrew Morton, Oscar Salvador,
	Peter Xu, Muchun Song, Russell King, Michael Ellerman,
	Nicholas Piggin, Christophe Leroy, Naveen N. Rao, Juergen Gross,
	Boris Ostrovsky, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Alexander Viro, Christian Brauner

Sharing page tables between processes but falling back to per-MM page
table locks cannot possibly work.

So, let's make sure that we do have split PMD locks by adding a new
Kconfig option and letting that depend on CONFIG_SPLIT_PMD_PTLOCKS.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 fs/Kconfig              | 4 ++++
 include/linux/hugetlb.h | 5 ++---
 mm/hugetlb.c            | 8 ++++----
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index a46b0cbc4d8f6..0e4efec1d92e6 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -288,6 +288,10 @@ config HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 	depends on ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
 	depends on SPARSEMEM_VMEMMAP
 
+config HUGETLB_PMD_PAGE_TABLE_SHARING
+	def_bool HUGETLB_PAGE
+	depends on ARCH_WANT_HUGE_PMD_SHARE && SPLIT_PMD_PTLOCKS
+
 config ARCH_HAS_GIGANTIC_PAGE
 	bool
 
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index da800e56fe590..4d2f3224ff027 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1243,7 +1243,7 @@ static inline __init void hugetlb_cma_reserve(int order)
 }
 #endif
 
-#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
+#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING
 static inline bool hugetlb_pmd_shared(pte_t *pte)
 {
 	return page_count(virt_to_page(pte)) > 1;
@@ -1279,8 +1279,7 @@ bool __vma_private_lock(struct vm_area_struct *vma);
 static inline pte_t *
 hugetlb_walk(struct vm_area_struct *vma, unsigned long addr, unsigned long sz)
 {
-#if defined(CONFIG_HUGETLB_PAGE) && \
-	defined(CONFIG_ARCH_WANT_HUGE_PMD_SHARE) && defined(CONFIG_LOCKDEP)
+#if defined(CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING) && defined(CONFIG_LOCKDEP)
 	struct hugetlb_vma_lock *vma_lock = vma->vm_private_data;
 
 	/*
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0858a18272073..c4d94e122c41f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7211,7 +7211,7 @@ long hugetlb_unreserve_pages(struct inode *inode, long start, long end,
 	return 0;
 }
 
-#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
+#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING
 static unsigned long page_table_shareable(struct vm_area_struct *svma,
 					  struct vm_area_struct *vma,
 					  unsigned long addr, pgoff_t idx)
@@ -7373,7 +7373,7 @@ int huge_pmd_unshare(struct mm_struct *mm, struct vm_area_struct *vma,
 	return 1;
 }
 
-#else /* !CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
+#else /* !CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING */
 
 pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma,
 		      unsigned long addr, pud_t *pud)
@@ -7396,7 +7396,7 @@ bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr)
 {
 	return false;
 }
-#endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
+#endif /* CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING */
 
 #ifdef CONFIG_ARCH_WANT_GENERAL_HUGETLB
 pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma,
@@ -7494,7 +7494,7 @@ unsigned long hugetlb_mask_last_page(struct hstate *h)
 /* See description above.  Architectures can provide their own version. */
 __weak unsigned long hugetlb_mask_last_page(struct hstate *h)
 {
-#ifdef CONFIG_ARCH_WANT_HUGE_PMD_SHARE
+#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING
 	if (huge_page_size(h) == PMD_SIZE)
 		return PUD_SIZE - PMD_SIZE;
 #endif
-- 
2.45.2
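For readers without patch 1/3 in front of them: the SPLIT_PMD_PTLOCKS symbol that the new HUGETLB_PMD_PAGE_TABLE_SHARING option depends on is introduced in that patch. A rough sketch of what such mm/Kconfig definitions look like follows; the symbol names come from the patch subjects in this thread, but the exact dependency lists below are an assumption of this sketch, not a quote from the patch.

```kconfig
# Sketch only -- the real definitions live in patch 1/3 (mm/Kconfig);
# the dependency lists here are an approximation, not the actual patch.
config SPLIT_PTE_PTLOCKS
	def_bool y
	depends on MMU
	depends on SMP
	depends on NR_CPUS >= SPLIT_PTLOCK_CPUS

config SPLIT_PMD_PTLOCKS
	def_bool y
	depends on SPLIT_PTE_PTLOCKS && ARCH_ENABLE_SPLIT_PMD_PTLOCK
```

With symbols along these lines in place, HUGETLB_PMD_PAGE_TABLE_SHARING can only be enabled on configurations that actually take a per-PMD-table lock, which is exactly what huge PMD sharing requires.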
* Re: [PATCH v1 2/3] mm/hugetlb: enforce that PMD PT sharing has split PMD PT locks
  2024-07-26 15:07 ` [PATCH v1 2/3] mm/hugetlb: enforce that PMD PT sharing has split PMD PT locks David Hildenbrand
@ 2024-07-28 12:47   ` Mike Rapoport
  0 siblings, 0 replies; 8+ messages in thread
From: Mike Rapoport @ 2024-07-28 12:47 UTC (permalink / raw)
To: David Hildenbrand
Cc: linux-kernel, linux-mm, linux-arm-kernel, x86, linuxppc-dev,
	xen-devel, linux-fsdevel, Andrew Morton, Oscar Salvador, Peter Xu,
	Muchun Song, Russell King, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Naveen N. Rao, Juergen Gross, Boris Ostrovsky,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Alexander Viro, Christian Brauner

On Fri, Jul 26, 2024 at 05:07:27PM +0200, David Hildenbrand wrote:
> Sharing page tables between processes but falling back to per-MM page
> table locks cannot possibly work.
>
> So, let's make sure that we do have split PMD locks by adding a new
> Kconfig option and letting that depend on CONFIG_SPLIT_PMD_PTLOCKS.
>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

-- 
Sincerely yours,
Mike.
* [PATCH v1 3/3] powerpc/8xx: document and enforce that split PT locks are not used
  2024-07-26 15:07 [PATCH v1 0/3] mm: split PTE/PMD PT table Kconfig cleanups+clarifications David Hildenbrand
  2024-07-26 15:07 ` [PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PTE_PTLOCKS into Kconfig options David Hildenbrand
  2024-07-26 15:07 ` [PATCH v1 2/3] mm/hugetlb: enforce that PMD PT sharing has split PMD PT locks David Hildenbrand
@ 2024-07-26 15:07 ` David Hildenbrand
  2 siblings, 0 replies; 8+ messages in thread
From: David Hildenbrand @ 2024-07-26 15:07 UTC (permalink / raw)
To: linux-kernel
Cc: linux-mm, linux-arm-kernel, x86, linuxppc-dev, xen-devel,
	linux-fsdevel, David Hildenbrand, Andrew Morton, Oscar Salvador,
	Peter Xu, Muchun Song, Russell King, Michael Ellerman,
	Nicholas Piggin, Christophe Leroy, Naveen N. Rao, Juergen Gross,
	Boris Ostrovsky, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Alexander Viro, Christian Brauner

Right now, we cannot have split PT locks because 8xx does not support
SMP.

But for the sake of documentation *why* 8xx is fine regarding what we
documented in huge_pte_lockptr(), let's just add code to enforce it at
the same time as documenting it.

This should also make everybody who wants to copy from the 8xx approach
of supporting such unusual ways of mapping hugetlb folios aware that it
gets tricky once multiple page tables are involved.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/powerpc/mm/pgtable.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c
index ab0656115424f..7316396e452d8 100644
--- a/arch/powerpc/mm/pgtable.c
+++ b/arch/powerpc/mm/pgtable.c
@@ -297,6 +297,12 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma,
 }
 
 #if defined(CONFIG_PPC_8xx)
+
+#if defined(CONFIG_SPLIT_PTE_PTLOCKS) || defined(CONFIG_SPLIT_PMD_PTLOCKS)
+/* We need the same lock to protect the PMD table and the two PTE tables. */
+#error "8M hugetlb folios are incompatible with split page table locks"
+#endif
+
 static void __set_huge_pte_at(pmd_t *pmd, pte_t *ptep, pte_basic_t val)
 {
 	pte_basic_t *entry = (pte_basic_t *)ptep;
-- 
2.45.2
end of thread, other threads: [~2024-07-29 11:33 UTC | newest]

Thread overview: 8+ messages
2024-07-26 15:07 [PATCH v1 0/3] mm: split PTE/PMD PT table Kconfig cleanups+clarifications David Hildenbrand
2024-07-26 15:07 ` [PATCH v1 1/3] mm: turn USE_SPLIT_PTE_PTLOCKS / USE_SPLIT_PTE_PTLOCKS into Kconfig options David Hildenbrand
2024-07-28 12:45   ` Mike Rapoport
2024-07-29  7:56     ` Qi Zheng
2024-07-29 11:33     ` Russell King (Oracle)
2024-07-26 15:07 ` [PATCH v1 2/3] mm/hugetlb: enforce that PMD PT sharing has split PMD PT locks David Hildenbrand
2024-07-28 12:47   ` Mike Rapoport
2024-07-26 15:07 ` [PATCH v1 3/3] powerpc/8xx: document and enforce that split PT locks are not used David Hildenbrand