linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH 00/10] Account page tables at all levels
@ 2024-12-19 16:44 Kevin Brodsky
  2024-12-19 16:44 ` [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers Kevin Brodsky
                   ` (10 more replies)
  0 siblings, 11 replies; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-19 16:44 UTC
  To: linux-mm
  Cc: Kevin Brodsky, Andrew Morton, Catalin Marinas, Dave Hansen,
	Linus Walleij, Andy Lutomirski, Peter Zijlstra,
	Mike Rapoport (IBM), Ryan Roberts, Thomas Gleixner, Will Deacon,
	Matthew Wilcox, linux-alpha, linux-arch, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86

We currently have a ctor/dtor pair for each of the lower page table
levels, up to PUD. At PTE and PMD level, these initialise and free the
split page table locks, if supported. In all cases, the helpers ensure
correct accounting of page table pages to the corresponding process.

This series takes that principle to its logical conclusion: account all
page table pages, at all levels and on all architectures (see caveat
below), through suitable ctor/dtor calls. This means concretely:

* Ensuring that the existing pagetable_{pte,pmd,pud}_[cd]tor are called
  on all architectures.

* Introducing pagetable_{p4d,pgd}_[cd]tor and calling them at P4D/PGD
  level.

The primary motivation for this series is not page accounting, though.
P4D/PGD-level pages represent a tiny proportion of the memory used by a
process. Rather, the appeal comes from the introduction of a single,
generic place where construction/destruction hooks can be called for all
page table pages at all levels. This will come in handy for protecting
page tables using kpkeys [1]. Peter Zijlstra suggested this approach [2]
to avoid handling this in arch code.

With this series, __pagetable_ctor() and __pagetable_dtor() (introduced
in patch 1) should be called when page tables are allocated/freed at any
level on any architecture. Note however that only P*D that consist of
one or more regular pages are handled. This excludes:

* All P*D allocated from a kmem_cache (or kmalloc).
* P*D that are not allocated via GFP (only an issue on sparc).

The table at the end of this email gives more details for each
architecture.
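
To illustrate the structure described above, here is a minimal userspace
sketch of a common ctor/dtor pair shared by every level, with lock
handling only at PTE/PMD level. It is a model, not kernel code: struct
model_ptdesc, nr_pagetable and the model_* functions are hypothetical
stand-ins for struct ptdesc, the NR_PAGETABLE counter and the
pagetable_*_[cd]tor helpers, and lock initialisation is assumed never to
fail.

```c
#include <assert.h>
#include <stdbool.h>

/* Page table levels, lowest first. */
enum pt_level { PT_PTE, PT_PMD, PT_PUD, PT_P4D, PT_PGD };

/* Hypothetical stand-in for struct ptdesc. */
struct model_ptdesc {
	bool is_pgtable;	/* models the "page table" type flag */
	bool has_ptlock;	/* models a split lock (PTE/PMD only) */
};

/* Hypothetical stand-in for the NR_PAGETABLE statistic. */
static long nr_pagetable;

/* Common part: the single place where per-page hooks would live. */
static void model_ctor_common(struct model_ptdesc *pt)
{
	pt->is_pgtable = true;
	nr_pagetable++;
}

static void model_dtor_common(struct model_ptdesc *pt)
{
	/* Destroying a page that was never constructed is a bug. */
	assert(pt->is_pgtable);
	pt->is_pgtable = false;
	nr_pagetable--;
}

/* Per-level wrappers: only PTE and PMD add lock handling. */
static void model_ctor(struct model_ptdesc *pt, enum pt_level level)
{
	if (level == PT_PTE || level == PT_PMD)
		pt->has_ptlock = true;
	model_ctor_common(pt);
}

static void model_dtor(struct model_ptdesc *pt, enum pt_level level)
{
	if (level == PT_PTE || level == PT_PMD)
		pt->has_ptlock = false;
	model_dtor_common(pt);
}
```

In this model, constructing one PGD page and one PTE page raises the
counter by two, and only the PTE page carries a lock. Any extra hook
added to the common functions would run for both.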

Patches in detail:

* Patch 1 factors out the common implementation of all
  pagetable_*_[cd]tor.

* Patches 2-4: PMD/PUD; add missing calls to pagetable_{pmd,pud}_[cd]tor
  on various architectures.

* Patches 5-7: P4D; move most architectures to the generic alloc/free
  functions at P4D level, and then have them call pagetable_p4d_[cd]tor.

* Patches 8-10: PGD; same principle at PGD level.

The patches were build-tested on all architectures (thanks Linus Walleij
for triggering the LKP CI for me!), and boot-tested on arm64 and x86_64.

- Kevin

[1] https://lore.kernel.org/linux-hardening/20241206101110.1646108-1-kevin.brodsky@arm.com/
[2] https://lore.kernel.org/linux-hardening/20241210122355.GN8562@noisy.programming.kicks-ass.net/
---

Overview of the situation on all architectures after this series is applied:

  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | arch          | #include                | Complete ctor/dtor    | ctor/dtor    | Notes                              |
  |               | <asm-generic/pgalloc.h> | calls up to p4d level | at pgd level |                                    |
  +===============+=========================+=======================+==============+====================================+
  | alpha         | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | arc           | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | arm           | Y                       | Y                     | Y/N          | kmalloc at pgd level if LPAE       |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | arm64         | Y                       | Y                     | Y/N          | kmem_cache if pgd not page-sized   |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | csky          | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | hexagon       | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | loongarch     | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | m68k (Sun3)   | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | m68k (others) | N                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | microblaze    | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | mips          | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | nios2         | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | openrisc      | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | parisc        | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | powerpc       | N                       | Y/N                   | N            | kmem_cache at:                     |
  |               |                         |                       |              | - pgd level                        |
  |               |                         |                       |              | - pud level in 64-bit              |
  |               |                         |                       |              | - pmd level in 64-bit on !book3s   |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | riscv         | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | s390          | N                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | sh            | Y                       | N                     | N            | kmem_cache at pmd/pgd level        |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | sparc         | N                       | N                     | N            | 32-bit: special memory             |
  |               |                         |                       |              | 64-bit: kmem_cache above pte level |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | um            | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | x86           | Y                       | Y                     | Y/N          | kmem_cache at pgd level if PAE     |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+
  | xtensa        | Y                       | Y                     | Y            |                                    |
  +---------------+-------------------------+-----------------------+--------------+------------------------------------+

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: linux-alpha@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-hexagon@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: linux-openrisc@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-um@lists.infradead.org
Cc: loongarch@lists.linux.dev
Cc: x86@kernel.org
---
Kevin Brodsky (10):
  mm: Move common parts of pagetable_*_[cd]tor to helpers
  parisc: mm: Ensure pagetable_pmd_[cd]tor are called
  m68k: mm: Add calls to pagetable_pmd_[cd]tor
  s390/mm: Add calls to pagetable_pud_[cd]tor
  riscv: mm: Skip pgtable level check in {pud,p4d}_alloc_one
  asm-generic: pgalloc: Provide generic p4d_{alloc_one,free}
  mm: Introduce ctor/dtor at P4D level
  ARM: mm: Rename PGD helpers
  asm-generic: pgalloc: Provide generic __pgd_{alloc,free}
  mm: Introduce ctor/dtor at PGD level

 arch/alpha/mm/init.c                     |  2 +-
 arch/arc/include/asm/pgalloc.h           |  9 +--
 arch/arm/mm/pgd.c                        | 16 +++--
 arch/arm64/include/asm/pgalloc.h         | 17 ------
 arch/arm64/mm/pgd.c                      |  4 +-
 arch/csky/include/asm/pgalloc.h          |  2 +-
 arch/hexagon/include/asm/pgalloc.h       |  2 +-
 arch/loongarch/mm/pgtable.c              |  7 +--
 arch/m68k/include/asm/mcf_pgalloc.h      |  2 +
 arch/m68k/include/asm/motorola_pgalloc.h |  6 +-
 arch/m68k/include/asm/sun3_pgalloc.h     |  2 +-
 arch/m68k/mm/motorola.c                  | 31 ++++++++--
 arch/microblaze/include/asm/pgalloc.h    |  7 +--
 arch/mips/include/asm/pgalloc.h          |  6 --
 arch/mips/mm/pgtable.c                   |  8 +--
 arch/nios2/mm/pgtable.c                  |  3 +-
 arch/openrisc/include/asm/pgalloc.h      |  6 +-
 arch/parisc/include/asm/pgalloc.h        | 39 ++++--------
 arch/riscv/include/asm/pgalloc.h         | 46 ++------------
 arch/s390/include/asm/pgalloc.h          | 33 +++++++---
 arch/um/kernel/mem.c                     |  7 +--
 arch/x86/include/asm/pgalloc.h           | 18 ------
 arch/x86/mm/pgtable.c                    | 27 +++++----
 arch/xtensa/include/asm/pgalloc.h        |  2 +-
 include/asm-generic/pgalloc.h            | 76 +++++++++++++++++++++++-
 include/linux/mm.h                       | 64 +++++++++++++-------
 26 files changed, 234 insertions(+), 208 deletions(-)


base-commit: 78d4f34e2115b517bcbfe7ec0d018bbbb6f9b0b8
-- 
2.47.0




* [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers
  2024-12-19 16:44 [PATCH 00/10] Account page tables at all levels Kevin Brodsky
@ 2024-12-19 16:44 ` Kevin Brodsky
  2024-12-19 17:19   ` Peter Zijlstra
  2024-12-19 16:44 ` [PATCH 02/10] parisc: mm: Ensure pagetable_pmd_[cd]tor are called Kevin Brodsky
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-19 16:44 UTC

Besides the ptlock management at PTE/PMD level, all the
pagetable_*_[cd]tor have the same implementation. Introduce common
helpers for all levels to reduce the duplication.

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
 include/linux/mm.h | 46 ++++++++++++++++++++++------------------------
 1 file changed, 22 insertions(+), 24 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c39c4945946c..a5b482362792 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2915,6 +2915,22 @@ static inline void pagetable_free(struct ptdesc *pt)
 	__free_pages(page, compound_order(page));
 }
 
+static inline void __pagetable_ctor(struct ptdesc *ptdesc)
+{
+	struct folio *folio = ptdesc_folio(ptdesc);
+
+	__folio_set_pgtable(folio);
+	lruvec_stat_add_folio(folio, NR_PAGETABLE);
+}
+
+static inline void __pagetable_dtor(struct ptdesc *ptdesc)
+{
+	struct folio *folio = ptdesc_folio(ptdesc);
+
+	__folio_clear_pgtable(folio);
+	lruvec_stat_sub_folio(folio, NR_PAGETABLE);
+}
+
 #if defined(CONFIG_SPLIT_PTE_PTLOCKS)
 #if ALLOC_SPLIT_PTLOCKS
 void __init ptlock_cache_init(void);
@@ -2992,22 +3008,16 @@ static inline void ptlock_free(struct ptdesc *ptdesc) {}
 
 static inline bool pagetable_pte_ctor(struct ptdesc *ptdesc)
 {
-	struct folio *folio = ptdesc_folio(ptdesc);
-
 	if (!ptlock_init(ptdesc))
 		return false;
-	__folio_set_pgtable(folio);
-	lruvec_stat_add_folio(folio, NR_PAGETABLE);
+	__pagetable_ctor(ptdesc);
 	return true;
 }
 
 static inline void pagetable_pte_dtor(struct ptdesc *ptdesc)
 {
-	struct folio *folio = ptdesc_folio(ptdesc);
-
 	ptlock_free(ptdesc);
-	__folio_clear_pgtable(folio);
-	lruvec_stat_sub_folio(folio, NR_PAGETABLE);
+	__pagetable_dtor(ptdesc);
 }
 
 pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp);
@@ -3110,22 +3120,16 @@ static inline spinlock_t *pmd_lock(struct mm_struct *mm, pmd_t *pmd)
 
 static inline bool pagetable_pmd_ctor(struct ptdesc *ptdesc)
 {
-	struct folio *folio = ptdesc_folio(ptdesc);
-
 	if (!pmd_ptlock_init(ptdesc))
 		return false;
-	__folio_set_pgtable(folio);
-	lruvec_stat_add_folio(folio, NR_PAGETABLE);
+	__pagetable_ctor(ptdesc);
 	return true;
 }
 
 static inline void pagetable_pmd_dtor(struct ptdesc *ptdesc)
 {
-	struct folio *folio = ptdesc_folio(ptdesc);
-
 	pmd_ptlock_free(ptdesc);
-	__folio_clear_pgtable(folio);
-	lruvec_stat_sub_folio(folio, NR_PAGETABLE);
+	__pagetable_dtor(ptdesc);
 }
 
 /*
@@ -3149,18 +3153,12 @@ static inline spinlock_t *pud_lock(struct mm_struct *mm, pud_t *pud)
 
 static inline void pagetable_pud_ctor(struct ptdesc *ptdesc)
 {
-	struct folio *folio = ptdesc_folio(ptdesc);
-
-	__folio_set_pgtable(folio);
-	lruvec_stat_add_folio(folio, NR_PAGETABLE);
+	__pagetable_ctor(ptdesc);
 }
 
 static inline void pagetable_pud_dtor(struct ptdesc *ptdesc)
 {
-	struct folio *folio = ptdesc_folio(ptdesc);
-
-	__folio_clear_pgtable(folio);
-	lruvec_stat_sub_folio(folio, NR_PAGETABLE);
+	__pagetable_dtor(ptdesc);
 }
 
 extern void __init pagecache_init(void);
-- 
2.47.0




* [PATCH 02/10] parisc: mm: Ensure pagetable_pmd_[cd]tor are called
  2024-12-19 16:44 [PATCH 00/10] Account page tables at all levels Kevin Brodsky
  2024-12-19 16:44 ` [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers Kevin Brodsky
@ 2024-12-19 16:44 ` Kevin Brodsky
  2024-12-19 16:44 ` [PATCH 03/10] m68k: mm: Add calls to pagetable_pmd_[cd]tor Kevin Brodsky
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-19 16:44 UTC

The implementation of pmd_{alloc_one,free} on parisc requires a
non-zero allocation order, but is completely standard aside from
that. Let's mirror the generic implementation of pmd_alloc_one(),
passing PMD_TABLE_ORDER to pagetable_alloc(). Explicit zeroing is
not needed as GFP_PGTABLE_KERNEL includes __GFP_ZERO. The generic
pmd_free() can handle higher allocation orders so we don't need to
define our own.

These changes ensure that pagetable_pmd_[cd]tor are called,
improving the accounting of page table pages.

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
 arch/parisc/include/asm/pgalloc.h | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/parisc/include/asm/pgalloc.h b/arch/parisc/include/asm/pgalloc.h
index e3e142b1c5c5..3e8dbd79670b 100644
--- a/arch/parisc/include/asm/pgalloc.h
+++ b/arch/parisc/include/asm/pgalloc.h
@@ -11,7 +11,6 @@
 #include <asm/cache.h>
 
 #define __HAVE_ARCH_PMD_ALLOC_ONE
-#define __HAVE_ARCH_PMD_FREE
 #define __HAVE_ARCH_PGD_FREE
 #include <asm-generic/pgalloc.h>
 
@@ -46,17 +45,19 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
 
 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long address)
 {
-	pmd_t *pmd;
+	struct ptdesc *ptdesc;
+	gfp_t gfp = GFP_PGTABLE_USER;
 
-	pmd = (pmd_t *)__get_free_pages(GFP_PGTABLE_KERNEL, PMD_TABLE_ORDER);
-	if (likely(pmd))
-		memset ((void *)pmd, 0, PAGE_SIZE << PMD_TABLE_ORDER);
-	return pmd;
-}
-
-static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
-{
-	free_pages((unsigned long)pmd, PMD_TABLE_ORDER);
+	if (mm == &init_mm)
+		gfp = GFP_PGTABLE_KERNEL;
+	ptdesc = pagetable_alloc(gfp, PMD_TABLE_ORDER);
+	if (!ptdesc)
+		return NULL;
+	if (!pagetable_pmd_ctor(ptdesc)) {
+		pagetable_free(ptdesc);
+		return NULL;
+	}
+	return ptdesc_address(ptdesc);
 }
 #endif
 
-- 
2.47.0




* [PATCH 03/10] m68k: mm: Add calls to pagetable_pmd_[cd]tor
  2024-12-19 16:44 [PATCH 00/10] Account page tables at all levels Kevin Brodsky
  2024-12-19 16:44 ` [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers Kevin Brodsky
  2024-12-19 16:44 ` [PATCH 02/10] parisc: mm: Ensure pagetable_pmd_[cd]tor are called Kevin Brodsky
@ 2024-12-19 16:44 ` Kevin Brodsky
  2024-12-19 16:44 ` [PATCH 04/10] s390/mm: Add calls to pagetable_pud_[cd]tor Kevin Brodsky
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-19 16:44 UTC

get_pointer_table() and free_pointer_table() already special-case
TABLE_PTE to call pagetable_pte_[cd]tor. Let's do the same at PMD
level to improve accounting further. TABLE_PGD and TABLE_PMD are
currently defined to the same value, so we first need to separate
them. That also implies separating ptable_list for PMD/PGD levels.
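
To show why the enum values need separating, here is a small standalone
sketch. The OLD_*/NEW_* names and the two helper functions are
hypothetical and exist only for this illustration. With aliased values,
a check for the PMD type also matches PGD tables, so a PMD-only hook
could not be told apart from a PGD one.

```c
#include <assert.h>

/* Old layout: PGD and PMD share a value (and so a list index). */
enum old_table_types {
	OLD_TABLE_PGD = 0,
	OLD_TABLE_PMD = 0,
	OLD_TABLE_PTE = 1,
};

/* New layout: one distinct value per level. */
enum new_table_types {
	NEW_TABLE_PGD,
	NEW_TABLE_PMD,
	NEW_TABLE_PTE,
};

/* Returns 1 if a PMD-only hook would run for this table type. */
static int old_runs_pmd_hook(int type)
{
	return type == OLD_TABLE_PMD;
}

static int new_runs_pmd_hook(int type)
{
	/* The type doubles as an array index, so it must stay in range. */
	assert(type >= NEW_TABLE_PGD && type <= NEW_TABLE_PTE);
	return type == NEW_TABLE_PMD;
}
```

With the old values, old_runs_pmd_hook(OLD_TABLE_PGD) is true, which is
the situation to avoid. With the new values only NEW_TABLE_PMD matches.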

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
 arch/m68k/include/asm/motorola_pgalloc.h |  6 +++---
 arch/m68k/mm/motorola.c                  | 25 +++++++++++++++++++-----
 2 files changed, 23 insertions(+), 8 deletions(-)

diff --git a/arch/m68k/include/asm/motorola_pgalloc.h b/arch/m68k/include/asm/motorola_pgalloc.h
index 74a817d9387f..5abe7da8ac5a 100644
--- a/arch/m68k/include/asm/motorola_pgalloc.h
+++ b/arch/m68k/include/asm/motorola_pgalloc.h
@@ -9,9 +9,9 @@ extern void mmu_page_ctor(void *page);
 extern void mmu_page_dtor(void *page);
 
 enum m68k_table_types {
-	TABLE_PGD = 0,
-	TABLE_PMD = 0, /* same size as PGD */
-	TABLE_PTE = 1,
+	TABLE_PGD,
+	TABLE_PMD,
+	TABLE_PTE,
 };
 
 extern void init_pointer_table(void *table, int type);
diff --git a/arch/m68k/mm/motorola.c b/arch/m68k/mm/motorola.c
index c1761d309fc6..37010ee15928 100644
--- a/arch/m68k/mm/motorola.c
+++ b/arch/m68k/mm/motorola.c
@@ -97,17 +97,19 @@ void mmu_page_dtor(void *page)
 
 typedef struct list_head ptable_desc;
 
-static struct list_head ptable_list[2] = {
+static struct list_head ptable_list[3] = {
 	LIST_HEAD_INIT(ptable_list[0]),
 	LIST_HEAD_INIT(ptable_list[1]),
+	LIST_HEAD_INIT(ptable_list[2]),
 };
 
 #define PD_PTABLE(page) ((ptable_desc *)&(virt_to_page((void *)(page))->lru))
 #define PD_PAGE(ptable) (list_entry(ptable, struct page, lru))
 #define PD_MARKBITS(dp) (*(unsigned int *)&PD_PAGE(dp)->index)
 
-static const int ptable_shift[2] = {
-	7+2, /* PGD, PMD */
+static const int ptable_shift[3] = {
+	7+2, /* PGD */
+	7+2, /* PMD */
 	6+2, /* PTE */
 };
 
@@ -156,12 +158,17 @@ void *get_pointer_table(int type)
 		if (!(page = (void *)get_zeroed_page(GFP_KERNEL)))
 			return NULL;
 
-		if (type == TABLE_PTE) {
+		switch (type) {
+		case TABLE_PTE:
 			/*
 			 * m68k doesn't have SPLIT_PTE_PTLOCKS for not having
 			 * SMP.
 			 */
 			pagetable_pte_ctor(virt_to_ptdesc(page));
+			break;
+		case TABLE_PMD:
+			pagetable_pmd_ctor(virt_to_ptdesc(page));
+			break;
 		}
 
 		mmu_page_ctor(page);
@@ -200,8 +207,16 @@ int free_pointer_table(void *table, int type)
 		/* all tables in page are free, free page */
 		list_del(dp);
 		mmu_page_dtor((void *)page);
-		if (type == TABLE_PTE)
+
+		switch (type) {
+		case TABLE_PTE:
 			pagetable_pte_dtor(virt_to_ptdesc((void *)page));
+			break;
+		case TABLE_PMD:
+			pagetable_pmd_dtor(virt_to_ptdesc((void *)page));
+			break;
+		}
+
 		free_page (page);
 		return 1;
 	} else if (ptable_list[type].next != dp) {
-- 
2.47.0




* [PATCH 04/10] s390/mm: Add calls to pagetable_pud_[cd]tor
  2024-12-19 16:44 [PATCH 00/10] Account page tables at all levels Kevin Brodsky
                   ` (2 preceding siblings ...)
  2024-12-19 16:44 ` [PATCH 03/10] m68k: mm: Add calls to pagetable_pmd_[cd]tor Kevin Brodsky
@ 2024-12-19 16:44 ` Kevin Brodsky
  2024-12-19 16:44 ` [PATCH 05/10] riscv: mm: Skip pgtable level check in {pud,p4d}_alloc_one Kevin Brodsky
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-19 16:44 UTC

Commit 55d2a0bd5ead ("mm: add statistics for PUD level pagetable")
introduced PUD-level ctor/dtor helpers to improve the accounting of
page table pages. s390 doesn't use the generic pgalloc
implementation and it seems that it got missed in the process. Add
the missing calls to the ctor/dtor helpers in pud_alloc_one/pud_free
to match the other architectures.

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
 arch/s390/include/asm/pgalloc.h | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/s390/include/asm/pgalloc.h b/arch/s390/include/asm/pgalloc.h
index 7b84ef6dc4b6..97db66ae06b9 100644
--- a/arch/s390/include/asm/pgalloc.h
+++ b/arch/s390/include/asm/pgalloc.h
@@ -67,15 +67,20 @@ static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long address)
 {
 	unsigned long *table = crst_table_alloc(mm);
-	if (table)
-		crst_table_init(table, _REGION3_ENTRY_EMPTY);
+
+	if (!table)
+		return NULL;
+	crst_table_init(table, _REGION3_ENTRY_EMPTY);
+	pagetable_pud_ctor(virt_to_ptdesc(table));
 	return (pud_t *) table;
 }
 
 static inline void pud_free(struct mm_struct *mm, pud_t *pud)
 {
-	if (!mm_pud_folded(mm))
-		crst_table_free(mm, (unsigned long *) pud);
+	if (mm_pud_folded(mm))
+		return;
+	pagetable_pud_dtor(virt_to_ptdesc(pud));
+	crst_table_free(mm, (unsigned long *) pud);
 }
 
 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long vmaddr)
-- 
2.47.0




* [PATCH 05/10] riscv: mm: Skip pgtable level check in {pud,p4d}_alloc_one
  2024-12-19 16:44 [PATCH 00/10] Account page tables at all levels Kevin Brodsky
                   ` (3 preceding siblings ...)
  2024-12-19 16:44 ` [PATCH 04/10] s390/mm: Add calls to pagetable_pud_[cd]tor Kevin Brodsky
@ 2024-12-19 16:44 ` Kevin Brodsky
  2025-01-03 10:31   ` Alexandre Ghiti
  2024-12-19 16:44 ` [PATCH 06/10] asm-generic: pgalloc: Provide generic p4d_{alloc_one,free} Kevin Brodsky
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-19 16:44 UTC

{pmd,pud,p4d}_alloc_one() is never called if the corresponding page
table level is folded, as {pmd,pud,p4d}_alloc() already does the
required check. We can therefore remove the runtime page table level
checks in {pud,p4d}_alloc_one. The PUD helper becomes equivalent to
the generic version, so we remove it altogether.

This is consistent with the way arm64 and x86 handle this situation
(runtime check in p4d_free() only).
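
The reasoning can be sketched with a simplified userspace model. All
names here are hypothetical, and the way the caller detects a folded
level is an assumption made for illustration; the kernel expresses that
check differently. The point is only that the caller never reaches the
allocation helper for a folded level, so the helper needs no check of
its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Counts calls into the model allocation helper. */
static int alloc_one_calls;

/* Models {pud,p4d}_alloc_one(): no level check of its own. */
static void *model_alloc_one(void)
{
	alloc_one_calls++;
	return calloc(1, 4096);
}

/*
 * Models the generic caller. A folded level is backed by the upper
 * entry itself, so nothing is allocated. Otherwise a table is
 * allocated only if the upper entry is still empty.
 */
static void *model_alloc(bool level_folded, void **upper_entry)
{
	assert(upper_entry != NULL);

	if (level_folded)
		return upper_entry;

	if (*upper_entry == NULL) {
		*upper_entry = model_alloc_one();
		if (*upper_entry == NULL)
			return NULL;
	}
	return *upper_entry;
}
```

Calling model_alloc() with level_folded set never increments
alloc_one_calls, which mirrors the argument made above for dropping
the checks from the helpers.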

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
 arch/riscv/include/asm/pgalloc.h | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
index f52264304f77..8ad0bbe838a2 100644
--- a/arch/riscv/include/asm/pgalloc.h
+++ b/arch/riscv/include/asm/pgalloc.h
@@ -12,7 +12,6 @@
 #include <asm/tlb.h>
 
 #ifdef CONFIG_MMU
-#define __HAVE_ARCH_PUD_ALLOC_ONE
 #define __HAVE_ARCH_PUD_FREE
 #include <asm-generic/pgalloc.h>
 
@@ -88,15 +87,6 @@ static inline void pgd_populate_safe(struct mm_struct *mm, pgd_t *pgd,
 	}
 }
 
-#define pud_alloc_one pud_alloc_one
-static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
-{
-	if (pgtable_l4_enabled)
-		return __pud_alloc_one(mm, addr);
-
-	return NULL;
-}
-
 #define pud_free pud_free
 static inline void pud_free(struct mm_struct *mm, pud_t *pud)
 {
@@ -118,15 +108,11 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
 #define p4d_alloc_one p4d_alloc_one
 static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
-	if (pgtable_l5_enabled) {
-		gfp_t gfp = GFP_PGTABLE_USER;
-
-		if (mm == &init_mm)
-			gfp = GFP_PGTABLE_KERNEL;
-		return (p4d_t *)get_zeroed_page(gfp);
-	}
+	gfp_t gfp = GFP_PGTABLE_USER;
 
-	return NULL;
+	if (mm == &init_mm)
+		gfp = GFP_PGTABLE_KERNEL;
+	return (p4d_t *)get_zeroed_page(gfp);
 }
 
 static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
-- 
2.47.0




* [PATCH 06/10] asm-generic: pgalloc: Provide generic p4d_{alloc_one,free}
  2024-12-19 16:44 [PATCH 00/10] Account page tables at all levels Kevin Brodsky
                   ` (4 preceding siblings ...)
  2024-12-19 16:44 ` [PATCH 05/10] riscv: mm: Skip pgtable level check in {pud,p4d}_alloc_one Kevin Brodsky
@ 2024-12-19 16:44 ` Kevin Brodsky
  2024-12-19 16:44 ` [PATCH 07/10] mm: Introduce ctor/dtor at P4D level Kevin Brodsky
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-19 16:44 UTC

Four architectures currently implement 5-level pgtables: arm64,
riscv, x86 and s390. The first three have essentially the same
implementation for p4d_alloc_one() and p4d_free(), so we've got an
opportunity to reduce duplication like at the lower levels.

Provide a generic version of p4d_alloc_one() and p4d_free(), and
make use of it on those architectures.

Their implementation is the same as at PUD level, except that
p4d_free() performs a runtime check by calling mm_p4d_folded().
5-level pgtables depend on a runtime-detected hardware feature on
all supported architectures, so we might as well include this check
in the generic implementation. No runtime check is required in
p4d_alloc_one() as the top-level p4d_alloc() already does the
required check.
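
The free-side behaviour described above can be sketched as a userspace
model. The model_* names, MODEL_PAGE_SIZE and the pages_freed counter
are hypothetical, and the folded state is passed as a plain flag rather
than derived from the mm as mm_p4d_folded() would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u

/* Counts pages actually released by the model. */
static int pages_freed;

/* Models p4d_alloc_one(): one zeroed, page-aligned page, no level check. */
static void *model_p4d_alloc_one(void)
{
	void *page = aligned_alloc(MODEL_PAGE_SIZE, MODEL_PAGE_SIZE);

	if (page)
		memset(page, 0, MODEL_PAGE_SIZE);
	return page;
}

/* Models p4d_free(): runtime folded check, then alignment check, then free. */
static void model_p4d_free(bool p4d_folded, void *p4d)
{
	if (p4d_folded)
		return;		/* nothing was allocated at this level */

	/* Same idea as the BUG_ON() in the arch versions being removed. */
	assert(((uintptr_t)p4d & (MODEL_PAGE_SIZE - 1)) == 0);
	free(p4d);
	pages_freed++;
}
```

When the folded flag is set the model frees nothing, which matches the
early return that the arm64 and x86 versions of p4d_free() perform when
5-level paging is not enabled.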

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
 arch/arm64/include/asm/pgalloc.h | 17 ------------
 arch/riscv/include/asm/pgalloc.h | 23 ----------------
 arch/x86/include/asm/pgalloc.h   | 18 -------------
 include/asm-generic/pgalloc.h    | 45 ++++++++++++++++++++++++++++++++
 4 files changed, 45 insertions(+), 58 deletions(-)

diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index e75422864d1b..2965f5a7e39e 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -85,23 +85,6 @@ static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
 	__pgd_populate(pgdp, __pa(p4dp), pgdval);
 }
 
-static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
-{
-	gfp_t gfp = GFP_PGTABLE_USER;
-
-	if (mm == &init_mm)
-		gfp = GFP_PGTABLE_KERNEL;
-	return (p4d_t *)get_zeroed_page(gfp);
-}
-
-static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
-{
-	if (!pgtable_l5_enabled())
-		return;
-	BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
-	free_page((unsigned long)p4d);
-}
-
 #define __p4d_free_tlb(tlb, p4d, addr)  p4d_free((tlb)->mm, p4d)
 #else
 static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t p4dp, pgdval_t prot)
diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
index 8ad0bbe838a2..551d614d3369 100644
--- a/arch/riscv/include/asm/pgalloc.h
+++ b/arch/riscv/include/asm/pgalloc.h
@@ -105,29 +105,6 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
 	}
 }
 
-#define p4d_alloc_one p4d_alloc_one
-static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
-{
-	gfp_t gfp = GFP_PGTABLE_USER;
-
-	if (mm == &init_mm)
-		gfp = GFP_PGTABLE_KERNEL;
-	return (p4d_t *)get_zeroed_page(gfp);
-}
-
-static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
-{
-	BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
-	free_page((unsigned long)p4d);
-}
-
-#define p4d_free p4d_free
-static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
-{
-	if (pgtable_l5_enabled)
-		__p4d_free(mm, p4d);
-}
-
 static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
 				  unsigned long addr)
 {
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index dcd836b59beb..dd4841231bb9 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -147,24 +147,6 @@ static inline void pgd_populate_safe(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4
 	set_pgd_safe(pgd, __pgd(_PAGE_TABLE | __pa(p4d)));
 }
 
-static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
-{
-	gfp_t gfp = GFP_KERNEL_ACCOUNT;
-
-	if (mm == &init_mm)
-		gfp &= ~__GFP_ACCOUNT;
-	return (p4d_t *)get_zeroed_page(gfp);
-}
-
-static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
-{
-	if (!pgtable_l5_enabled())
-		return;
-
-	BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
-	free_page((unsigned long)p4d);
-}
-
 extern void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d);
 
 static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
index 7c48f5fbf8aa..59131629ac9c 100644
--- a/include/asm-generic/pgalloc.h
+++ b/include/asm-generic/pgalloc.h
@@ -215,6 +215,51 @@ static inline void pud_free(struct mm_struct *mm, pud_t *pud)
 
 #endif /* CONFIG_PGTABLE_LEVELS > 3 */
 
+#if CONFIG_PGTABLE_LEVELS > 4
+
+static inline p4d_t *__p4d_alloc_one_noprof(struct mm_struct *mm, unsigned long addr)
+{
+	gfp_t gfp = GFP_PGTABLE_USER;
+	struct ptdesc *ptdesc;
+
+	if (mm == &init_mm)
+		gfp = GFP_PGTABLE_KERNEL;
+	gfp &= ~__GFP_HIGHMEM;
+
+	ptdesc = pagetable_alloc_noprof(gfp, 0);
+	if (!ptdesc)
+		return NULL;
+
+	return ptdesc_address(ptdesc);
+}
+#define __p4d_alloc_one(...)	alloc_hooks(__p4d_alloc_one_noprof(__VA_ARGS__))
+
+#ifndef __HAVE_ARCH_P4D_ALLOC_ONE
+static inline p4d_t *p4d_alloc_one_noprof(struct mm_struct *mm, unsigned long addr)
+{
+	return __p4d_alloc_one_noprof(mm, addr);
+}
+#define p4d_alloc_one(...)	alloc_hooks(p4d_alloc_one_noprof(__VA_ARGS__))
+#endif
+
+static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
+{
+	struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
+
+	BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
+	pagetable_free(ptdesc);
+}
+
+#ifndef __HAVE_ARCH_P4D_FREE
+static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
+{
+	if (!mm_p4d_folded(mm))
+		__p4d_free(mm, p4d);
+}
+#endif
+
+#endif /* CONFIG_PGTABLE_LEVELS > 4 */
+
 #ifndef __HAVE_ARCH_PGD_FREE
 static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 {
-- 
2.47.0




* [PATCH 07/10] mm: Introduce ctor/dtor at P4D level
  2024-12-19 16:44 [PATCH 00/10] Account page tables at all levels Kevin Brodsky
                   ` (5 preceding siblings ...)
  2024-12-19 16:44 ` [PATCH 06/10] asm-generic: pgalloc: Provide generic p4d_{alloc_one,free} Kevin Brodsky
@ 2024-12-19 16:44 ` Kevin Brodsky
  2024-12-19 16:44 ` [PATCH 08/10] ARM: mm: Rename PGD helpers Kevin Brodsky
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-19 16:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Kevin Brodsky, Andrew Morton, Catalin Marinas, Dave Hansen,
	Linus Walleij, Andy Lutomirski, Peter Zijlstra,
	Mike Rapoport (IBM), Ryan Roberts, Thomas Gleixner, Will Deacon,
	Matthew Wilcox, linux-alpha, linux-arch, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86

Commit 55d2a0bd5ead ("mm: add statistics for PUD level pagetable")
added accounting for PUD-level page tables. This patch does exactly
the same thing for P4D-level page tables, introducing
pagetable_p4d_[cd]tor with the same implementation as the PUD
ctor/dtor and calling them on all alloc/free paths.

Besides the small improvement in accounting accuracy, this also
enables adding construction/destruction hooks in just one generic
place for page tables at P4D level (like lower levels).
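
The pairing of ctor/dtor with the alloc/free paths can be sketched in
userspace as follows. This is only a model under stated assumptions:
struct ptdesc_model, the nr_page_table_pages counter and all model_*
functions are hypothetical, and the real __pagetable_ctor() and
__pagetable_dtor() do more than bump a counter.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical global standing in for page table accounting. */
static long nr_page_table_pages;

/* Hypothetical stand-in for struct ptdesc. */
struct ptdesc_model {
	void *addr;
};

/* Common helpers: the single place where a hook would be added. */
static void model_pagetable_ctor(struct ptdesc_model *pt)
{
	(void)pt;
	nr_page_table_pages++;
}

static void model_pagetable_dtor(struct ptdesc_model *pt)
{
	(void)pt;
	nr_page_table_pages--;
}

/* Per-level wrappers simply forward to the common helpers. */
static void model_p4d_ctor(struct ptdesc_model *pt)
{
	model_pagetable_ctor(pt);
}

static void model_p4d_dtor(struct ptdesc_model *pt)
{
	model_pagetable_dtor(pt);
}

/* Alloc path: construct only after a successful allocation. */
static void *model_p4d_alloc_one(struct ptdesc_model *pt)
{
	pt->addr = calloc(1, MODEL_PAGE_SIZE);
	if (!pt->addr)
		return NULL;

	model_p4d_ctor(pt);
	return pt->addr;
}

/* Free path: destruct before the page is released. */
static void model_p4d_free(struct ptdesc_model *pt)
{
	model_p4d_dtor(pt);
	free(pt->addr);
	pt->addr = NULL;
}
```

Every alloc path that skips the ctor, or free path that skips the
dtor, leaves the counter unbalanced, which is why the patch touches
each arch-specific path as well as the generic one.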

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
 arch/riscv/include/asm/pgalloc.h |  8 ++++++--
 arch/s390/include/asm/pgalloc.h  | 12 ++++++++----
 arch/x86/mm/pgtable.c            |  3 +++
 include/asm-generic/pgalloc.h    |  2 ++
 include/linux/mm.h               | 10 ++++++++++
 5 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
index 551d614d3369..3c364ecc3100 100644
--- a/arch/riscv/include/asm/pgalloc.h
+++ b/arch/riscv/include/asm/pgalloc.h
@@ -108,8 +108,12 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
 static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
 				  unsigned long addr)
 {
-	if (pgtable_l5_enabled)
-		riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(p4d));
+	if (pgtable_l5_enabled) {
+		struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
+
+		pagetable_p4d_dtor(ptdesc);
+		riscv_tlb_remove_ptdesc(tlb, ptdesc);
+	}
 }
 #endif /* __PAGETABLE_PMD_FOLDED */
 
diff --git a/arch/s390/include/asm/pgalloc.h b/arch/s390/include/asm/pgalloc.h
index 97db66ae06b9..85a5d07365aa 100644
--- a/arch/s390/include/asm/pgalloc.h
+++ b/arch/s390/include/asm/pgalloc.h
@@ -53,15 +53,19 @@ static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long address)
 {
 	unsigned long *table = crst_table_alloc(mm);
 
-	if (table)
-		crst_table_init(table, _REGION2_ENTRY_EMPTY);
+	if (!table)
+		return NULL;
+	crst_table_init(table, _REGION2_ENTRY_EMPTY);
+	pagetable_p4d_ctor(virt_to_ptdesc(table));
 	return (p4d_t *) table;
 }
 
 static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
 {
-	if (!mm_p4d_folded(mm))
-		crst_table_free(mm, (unsigned long *) p4d);
+	if (mm_p4d_folded(mm))
+		return;
+	pagetable_p4d_dtor(virt_to_ptdesc(p4d));
+	crst_table_free(mm, (unsigned long *) p4d);
 }
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long address)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 5745a354a241..c1bfdf7b4767 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -86,6 +86,9 @@ void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud)
 #if CONFIG_PGTABLE_LEVELS > 4
 void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
 {
+	struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
+
+	pagetable_p4d_dtor(ptdesc);
 	paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
 	paravirt_tlb_remove_table(tlb, virt_to_page(p4d));
 }
diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
index 59131629ac9c..bb482eeca0c3 100644
--- a/include/asm-generic/pgalloc.h
+++ b/include/asm-generic/pgalloc.h
@@ -230,6 +230,7 @@ static inline p4d_t *__p4d_alloc_one_noprof(struct mm_struct *mm, unsigned long
 	if (!ptdesc)
 		return NULL;
 
+	pagetable_p4d_ctor(ptdesc);
 	return ptdesc_address(ptdesc);
 }
 #define __p4d_alloc_one(...)	alloc_hooks(__p4d_alloc_one_noprof(__VA_ARGS__))
@@ -247,6 +248,7 @@ static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
 	struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
 
 	BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
+	pagetable_p4d_dtor(ptdesc);
 	pagetable_free(ptdesc);
 }
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index a5b482362792..e8b92f4bf3f1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3161,6 +3161,16 @@ static inline void pagetable_pud_dtor(struct ptdesc *ptdesc)
 	__pagetable_dtor(ptdesc);
 }
 
+static inline void pagetable_p4d_ctor(struct ptdesc *ptdesc)
+{
+	__pagetable_ctor(ptdesc);
+}
+
+static inline void pagetable_p4d_dtor(struct ptdesc *ptdesc)
+{
+	__pagetable_dtor(ptdesc);
+}
+
 extern void __init pagecache_init(void);
 extern void free_initmem(void);
 
-- 
2.47.0




* [PATCH 08/10] ARM: mm: Rename PGD helpers
  2024-12-19 16:44 [PATCH 00/10] Account page tables at all levels Kevin Brodsky
                   ` (6 preceding siblings ...)
  2024-12-19 16:44 ` [PATCH 07/10] mm: Introduce ctor/dtor at P4D level Kevin Brodsky
@ 2024-12-19 16:44 ` Kevin Brodsky
  2024-12-19 16:44 ` [PATCH 09/10] asm-generic: pgalloc: Provide generic __pgd_{alloc,free} Kevin Brodsky
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-19 16:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Kevin Brodsky, Andrew Morton, Catalin Marinas, Dave Hansen,
	Linus Walleij, Andy Lutomirski, Peter Zijlstra,
	Mike Rapoport (IBM), Ryan Roberts, Thomas Gleixner, Will Deacon,
	Matthew Wilcox, linux-alpha, linux-arch, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86

Generic implementations of __pgd_alloc and __pgd_free are about to
be introduced. Rename the macros in arch/arm/mm/pgd.c to avoid
clashes. While we're at it, also pass down the mm as argument to
those helpers, as it will be needed to call the generic
__pgd_{alloc,free}.

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
 arch/arm/mm/pgd.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm/mm/pgd.c b/arch/arm/mm/pgd.c
index f8e9bc58a84f..2a1077747758 100644
--- a/arch/arm/mm/pgd.c
+++ b/arch/arm/mm/pgd.c
@@ -17,11 +17,11 @@
 #include "mm.h"
 
 #ifdef CONFIG_ARM_LPAE
-#define __pgd_alloc()	kmalloc_array(PTRS_PER_PGD, sizeof(pgd_t), GFP_KERNEL)
-#define __pgd_free(pgd)	kfree(pgd)
+#define _pgd_alloc(mm)		kmalloc_array(PTRS_PER_PGD, sizeof(pgd_t), GFP_KERNEL)
+#define _pgd_free(mm, pgd)	kfree(pgd)
 #else
-#define __pgd_alloc()	(pgd_t *)__get_free_pages(GFP_KERNEL, 2)
-#define __pgd_free(pgd)	free_pages((unsigned long)pgd, 2)
+#define _pgd_alloc(mm)		(pgd_t *)__get_free_pages(GFP_KERNEL, 2)
+#define _pgd_free(mm, pgd)	free_pages((unsigned long)pgd, 2)
 #endif
 
 /*
@@ -35,7 +35,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 	pmd_t *new_pmd, *init_pmd;
 	pte_t *new_pte, *init_pte;
 
-	new_pgd = __pgd_alloc();
+	new_pgd = _pgd_alloc(mm);
 	if (!new_pgd)
 		goto no_pgd;
 
@@ -134,7 +134,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 no_pud:
 	p4d_free(mm, new_p4d);
 no_p4d:
-	__pgd_free(new_pgd);
+	_pgd_free(mm, new_pgd);
 no_pgd:
 	return NULL;
 }
@@ -207,5 +207,5 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd_base)
 		p4d_free(mm, p4d);
 	}
 #endif
-	__pgd_free(pgd_base);
+	_pgd_free(mm, pgd_base);
 }
-- 
2.47.0




* [PATCH 09/10] asm-generic: pgalloc: Provide generic __pgd_{alloc,free}
  2024-12-19 16:44 [PATCH 00/10] Account page tables at all levels Kevin Brodsky
                   ` (7 preceding siblings ...)
  2024-12-19 16:44 ` [PATCH 08/10] ARM: mm: Rename PGD helpers Kevin Brodsky
@ 2024-12-19 16:44 ` Kevin Brodsky
  2024-12-19 16:44 ` [PATCH 10/10] mm: Introduce ctor/dtor at PGD level Kevin Brodsky
  2024-12-19 17:13 ` [PATCH 00/10] Account page tables at all levels Dave Hansen
  10 siblings, 0 replies; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-19 16:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Kevin Brodsky, Andrew Morton, Catalin Marinas, Dave Hansen,
	Linus Walleij, Andy Lutomirski, Peter Zijlstra,
	Mike Rapoport (IBM), Ryan Roberts, Thomas Gleixner, Will Deacon,
	Matthew Wilcox, linux-alpha, linux-arch, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86

We already have a generic implementation of alloc/free up to P4D
level, as well as pgd_free(). Let's finish the work and add a
generic PGD-level alloc helper as well.

Unlike at lower levels, almost all architectures need some specific
magic at PGD level (typically initialising PGD entries), so
introducing a generic pgd_alloc() isn't worth it. Instead we
introduce two new helpers, __pgd_alloc() and __pgd_free(), and make
use of them in the arch-specific pgd_alloc() and pgd_free() wherever
possible. To accommodate as many architectures as possible,
__pgd_alloc() takes a page allocation order.

Because pagetable_alloc() allocates zeroed pages, we are also able
to get rid of zeroing code in some implementations of pgd_alloc().
Some trivial implementations of pgd_free() also become unnecessary
once __pgd_alloc() is used; remove them.

Another small improvement is consistent accounting of PGD pages by
using GFP_PGTABLE_{USER,KERNEL} as appropriate.

Not all PGD allocations can be handled by the generic helpers. In
particular, multiple architectures allocate PGDs from a kmem_cache,
and those PGDs may not be page-sized.
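
As a rough userspace illustration of what an order-based, zeroing
allocation helper provides, consider the sketch below. It is not the
kernel implementation: model_pgd_alloc(), model_pgd_free() and
MODEL_PAGE_SIZE are hypothetical, and C11 aligned_alloc() stands in
for the page allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u

/* Allocate 2^order page-sized, page-aligned, zeroed blocks. Because
 * the memory comes back zeroed, callers do not need to clear the
 * user part of the table themselves. */
static void *model_pgd_alloc(unsigned int order)
{
	size_t size = (size_t)MODEL_PAGE_SIZE << order;
	void *pgd = aligned_alloc(MODEL_PAGE_SIZE, size);

	if (!pgd)
		return NULL;

	memset(pgd, 0, size);
	return pgd;
}

/* Free a table, insisting on page alignment much like the BUG_ON()
 * in the generic free helpers. */
static void model_pgd_free(void *pgd)
{
	assert(((uintptr_t)pgd & (MODEL_PAGE_SIZE - 1)) == 0);
	free(pgd);
}
```

A PGD carved out of a slab-like cache would generally fail the
alignment assumption above, which mirrors why kmem_cache-backed PGDs
are left out of the generic helpers.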

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
 arch/alpha/mm/init.c                  |  2 +-
 arch/arc/include/asm/pgalloc.h        |  9 ++-------
 arch/arm/mm/pgd.c                     |  8 +++-----
 arch/arm64/mm/pgd.c                   |  4 ++--
 arch/csky/include/asm/pgalloc.h       |  2 +-
 arch/hexagon/include/asm/pgalloc.h    |  2 +-
 arch/loongarch/mm/pgtable.c           |  7 +++----
 arch/m68k/include/asm/sun3_pgalloc.h  |  2 +-
 arch/microblaze/include/asm/pgalloc.h |  7 +------
 arch/mips/include/asm/pgalloc.h       |  6 ------
 arch/mips/mm/pgtable.c                |  8 +++-----
 arch/nios2/mm/pgtable.c               |  3 ++-
 arch/openrisc/include/asm/pgalloc.h   |  6 ++----
 arch/parisc/include/asm/pgalloc.h     | 16 +---------------
 arch/riscv/include/asm/pgalloc.h      |  3 +--
 arch/um/kernel/mem.c                  |  7 +++----
 arch/x86/mm/pgtable.c                 | 24 +++++++++++-------------
 arch/xtensa/include/asm/pgalloc.h     |  2 +-
 include/asm-generic/pgalloc.h         | 27 ++++++++++++++++++++++++++-
 19 files changed, 65 insertions(+), 80 deletions(-)

diff --git a/arch/alpha/mm/init.c b/arch/alpha/mm/init.c
index 4fe618446e4c..61c2198b1359 100644
--- a/arch/alpha/mm/init.c
+++ b/arch/alpha/mm/init.c
@@ -42,7 +42,7 @@ pgd_alloc(struct mm_struct *mm)
 {
 	pgd_t *ret, *init;
 
-	ret = (pgd_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
+	ret = __pgd_alloc(mm, 0);
 	init = pgd_offset(&init_mm, 0UL);
 	if (ret) {
 #ifdef CONFIG_ALPHA_LARGE_VMALLOC
diff --git a/arch/arc/include/asm/pgalloc.h b/arch/arc/include/asm/pgalloc.h
index 096b8ef58edb..dfae070fe8d5 100644
--- a/arch/arc/include/asm/pgalloc.h
+++ b/arch/arc/include/asm/pgalloc.h
@@ -53,19 +53,14 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd, pgtable_t pte_
 
 static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 {
-	pgd_t *ret = (pgd_t *) __get_free_page(GFP_KERNEL);
+	pgd_t *ret = __pgd_alloc(mm, 0);
 
 	if (ret) {
 		int num, num2;
-		num = USER_PTRS_PER_PGD + USER_KERNEL_GUTTER / PGDIR_SIZE;
-		memzero(ret, num * sizeof(pgd_t));
 
+		num = USER_PTRS_PER_PGD + USER_KERNEL_GUTTER / PGDIR_SIZE;
 		num2 = VMALLOC_SIZE / PGDIR_SIZE;
 		memcpy(ret + num, swapper_pg_dir + num, num2 * sizeof(pgd_t));
-
-		memzero(ret + num + num2,
-			       (PTRS_PER_PGD - num - num2) * sizeof(pgd_t));
-
 	}
 	return ret;
 }
diff --git a/arch/arm/mm/pgd.c b/arch/arm/mm/pgd.c
index 2a1077747758..4eb81b7ed03a 100644
--- a/arch/arm/mm/pgd.c
+++ b/arch/arm/mm/pgd.c
@@ -17,11 +17,11 @@
 #include "mm.h"
 
 #ifdef CONFIG_ARM_LPAE
-#define _pgd_alloc(mm)		kmalloc_array(PTRS_PER_PGD, sizeof(pgd_t), GFP_KERNEL)
+#define _pgd_alloc(mm)		kmalloc_array(PTRS_PER_PGD, sizeof(pgd_t), GFP_KERNEL | __GFP_ZERO)
 #define _pgd_free(mm, pgd)	kfree(pgd)
 #else
-#define _pgd_alloc(mm)		(pgd_t *)__get_free_pages(GFP_KERNEL, 2)
-#define _pgd_free(mm, pgd)	free_pages((unsigned long)pgd, 2)
+#define _pgd_alloc(mm)		__pgd_alloc(mm, 2)
+#define _pgd_free(mm, pgd)	__pgd_free(mm, pgd)
 #endif
 
 /*
@@ -39,8 +39,6 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 	if (!new_pgd)
 		goto no_pgd;
 
-	memset(new_pgd, 0, USER_PTRS_PER_PGD * sizeof(pgd_t));
-
 	/*
 	 * Copy over the kernel and IO PGD entries
 	 */
diff --git a/arch/arm64/mm/pgd.c b/arch/arm64/mm/pgd.c
index 0c501cabc238..8160cff35089 100644
--- a/arch/arm64/mm/pgd.c
+++ b/arch/arm64/mm/pgd.c
@@ -33,7 +33,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 	gfp_t gfp = GFP_PGTABLE_USER;
 
 	if (pgdir_is_page_size())
-		return (pgd_t *)__get_free_page(gfp);
+		return __pgd_alloc(mm, 0);
 	else
 		return kmem_cache_alloc(pgd_cache, gfp);
 }
@@ -41,7 +41,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 {
 	if (pgdir_is_page_size())
-		free_page((unsigned long)pgd);
+		__pgd_free(mm, pgd);
 	else
 		kmem_cache_free(pgd_cache, pgd);
 }
diff --git a/arch/csky/include/asm/pgalloc.h b/arch/csky/include/asm/pgalloc.h
index 9c84c9012e53..9a3bbf16f03f 100644
--- a/arch/csky/include/asm/pgalloc.h
+++ b/arch/csky/include/asm/pgalloc.h
@@ -44,7 +44,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 	pgd_t *ret;
 	pgd_t *init;
 
-	ret = (pgd_t *) __get_free_page(GFP_KERNEL);
+	ret = __pgd_alloc(mm, 0);
 	if (ret) {
 		init = pgd_offset(&init_mm, 0UL);
 		pgd_init((unsigned long *)ret);
diff --git a/arch/hexagon/include/asm/pgalloc.h b/arch/hexagon/include/asm/pgalloc.h
index 55988625e6fb..289dace6d76d 100644
--- a/arch/hexagon/include/asm/pgalloc.h
+++ b/arch/hexagon/include/asm/pgalloc.h
@@ -22,7 +22,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 {
 	pgd_t *pgd;
 
-	pgd = (pgd_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
+	pgd = __pgd_alloc(mm, 0);
 
 	/*
 	 * There may be better ways to do this, but to ensure
diff --git a/arch/loongarch/mm/pgtable.c b/arch/loongarch/mm/pgtable.c
index 3fa69b23ff84..22a94bb3e6e8 100644
--- a/arch/loongarch/mm/pgtable.c
+++ b/arch/loongarch/mm/pgtable.c
@@ -23,11 +23,10 @@ EXPORT_SYMBOL(tlb_virt_to_page);
 
 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
-	pgd_t *init, *ret = NULL;
-	struct ptdesc *ptdesc = pagetable_alloc(GFP_KERNEL & ~__GFP_HIGHMEM, 0);
+	pgd_t *init, *ret;
 
-	if (ptdesc) {
-		ret = (pgd_t *)ptdesc_address(ptdesc);
+	ret = __pgd_alloc(mm, 0);
+	if (ret) {
 		init = pgd_offset(&init_mm, 0UL);
 		pgd_init(ret);
 		memcpy(ret + USER_PTRS_PER_PGD, init + USER_PTRS_PER_PGD,
diff --git a/arch/m68k/include/asm/sun3_pgalloc.h b/arch/m68k/include/asm/sun3_pgalloc.h
index 4a137eecb6fe..e91b0133df5d 100644
--- a/arch/m68k/include/asm/sun3_pgalloc.h
+++ b/arch/m68k/include/asm/sun3_pgalloc.h
@@ -43,7 +43,7 @@ static inline pgd_t * pgd_alloc(struct mm_struct *mm)
 {
 	pgd_t *new_pgd;
 
-	new_pgd = (pgd_t *)get_zeroed_page(GFP_KERNEL);
+	new_pgd = __pgd_alloc(mm, 0);
 	memcpy(new_pgd, swapper_pg_dir, PAGE_SIZE);
 	memset(new_pgd, 0, (PAGE_OFFSET >> PGDIR_SHIFT));
 	return new_pgd;
diff --git a/arch/microblaze/include/asm/pgalloc.h b/arch/microblaze/include/asm/pgalloc.h
index 6c33b05f730f..084a8a0dc239 100644
--- a/arch/microblaze/include/asm/pgalloc.h
+++ b/arch/microblaze/include/asm/pgalloc.h
@@ -21,12 +21,7 @@
 
 extern void __bad_pte(pmd_t *pmd);
 
-static inline pgd_t *get_pgd(void)
-{
-	return (pgd_t *)__get_free_pages(GFP_KERNEL|__GFP_ZERO, 0);
-}
-
-#define pgd_alloc(mm)		get_pgd()
+#define pgd_alloc(mm)		__pgd_alloc(mm, 0)
 
 extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm);
 
diff --git a/arch/mips/include/asm/pgalloc.h b/arch/mips/include/asm/pgalloc.h
index f4440edcd8fe..9ee8426b2e9d 100644
--- a/arch/mips/include/asm/pgalloc.h
+++ b/arch/mips/include/asm/pgalloc.h
@@ -15,7 +15,6 @@
 
 #define __HAVE_ARCH_PMD_ALLOC_ONE
 #define __HAVE_ARCH_PUD_ALLOC_ONE
-#define __HAVE_ARCH_PGD_FREE
 #include <asm-generic/pgalloc.h>
 
 static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
@@ -49,11 +48,6 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
 extern void pgd_init(void *addr);
 extern pgd_t *pgd_alloc(struct mm_struct *mm);
 
-static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
-{
-	pagetable_free(virt_to_ptdesc(pgd));
-}
-
 #define __pte_free_tlb(tlb, pte, address)			\
 do {								\
 	pagetable_pte_dtor(page_ptdesc(pte));			\
diff --git a/arch/mips/mm/pgtable.c b/arch/mips/mm/pgtable.c
index 1506e458040d..10835414819f 100644
--- a/arch/mips/mm/pgtable.c
+++ b/arch/mips/mm/pgtable.c
@@ -10,12 +10,10 @@
 
 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
-	pgd_t *init, *ret = NULL;
-	struct ptdesc *ptdesc = pagetable_alloc(GFP_KERNEL & ~__GFP_HIGHMEM,
-			PGD_TABLE_ORDER);
+	pgd_t *init, *ret;
 
-	if (ptdesc) {
-		ret = ptdesc_address(ptdesc);
+	ret = __pgd_alloc(mm, PGD_TABLE_ORDER);
+	if (ret) {
 		init = pgd_offset(&init_mm, 0UL);
 		pgd_init(ret);
 		memcpy(ret + USER_PTRS_PER_PGD, init + USER_PTRS_PER_PGD,
diff --git a/arch/nios2/mm/pgtable.c b/arch/nios2/mm/pgtable.c
index 7c76e8a7447a..6470ed378782 100644
--- a/arch/nios2/mm/pgtable.c
+++ b/arch/nios2/mm/pgtable.c
@@ -11,6 +11,7 @@
 #include <linux/sched.h>
 
 #include <asm/cpuinfo.h>
+#include <asm/pgalloc.h>
 
 /* pteaddr:
  *   ptbase | vpn* | zero
@@ -54,7 +55,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 {
 	pgd_t *ret, *init;
 
-	ret = (pgd_t *) __get_free_page(GFP_KERNEL);
+	ret = __pgd_alloc(mm, 0);
 	if (ret) {
 		init = pgd_offset(&init_mm, 0UL);
 		pgd_init(ret);
diff --git a/arch/openrisc/include/asm/pgalloc.h b/arch/openrisc/include/asm/pgalloc.h
index c6a73772a546..c068c4942467 100644
--- a/arch/openrisc/include/asm/pgalloc.h
+++ b/arch/openrisc/include/asm/pgalloc.h
@@ -41,15 +41,13 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
  */
 static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 {
-	pgd_t *ret = (pgd_t *)__get_free_page(GFP_KERNEL);
+	pgd_t *ret = __pgd_alloc(mm, 0);
 
-	if (ret) {
-		memset(ret, 0, USER_PTRS_PER_PGD * sizeof(pgd_t));
+	if (ret)
 		memcpy(ret + USER_PTRS_PER_PGD,
 		       swapper_pg_dir + USER_PTRS_PER_PGD,
 		       (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
 
-	}
 	return ret;
 }
 
diff --git a/arch/parisc/include/asm/pgalloc.h b/arch/parisc/include/asm/pgalloc.h
index 3e8dbd79670b..2ca74a56415c 100644
--- a/arch/parisc/include/asm/pgalloc.h
+++ b/arch/parisc/include/asm/pgalloc.h
@@ -11,26 +11,12 @@
 #include <asm/cache.h>
 
 #define __HAVE_ARCH_PMD_ALLOC_ONE
-#define __HAVE_ARCH_PGD_FREE
 #include <asm-generic/pgalloc.h>
 
 /* Allocate the top level pgd (page directory) */
 static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 {
-	pgd_t *pgd;
-
-	pgd = (pgd_t *) __get_free_pages(GFP_KERNEL, PGD_TABLE_ORDER);
-	if (unlikely(pgd == NULL))
-		return NULL;
-
-	memset(pgd, 0, PAGE_SIZE << PGD_TABLE_ORDER);
-
-	return pgd;
-}
-
-static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
-{
-	free_pages((unsigned long)pgd, PGD_TABLE_ORDER);
+	return __pgd_alloc(mm, PGD_TABLE_ORDER);
 }
 
 #if CONFIG_PGTABLE_LEVELS == 3
diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
index 3c364ecc3100..d527f141be1c 100644
--- a/arch/riscv/include/asm/pgalloc.h
+++ b/arch/riscv/include/asm/pgalloc.h
@@ -128,9 +128,8 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 {
 	pgd_t *pgd;
 
-	pgd = (pgd_t *)__get_free_page(GFP_KERNEL);
+	pgd = __pgd_alloc(mm, 0);
 	if (likely(pgd != NULL)) {
-		memset(pgd, 0, USER_PTRS_PER_PGD * sizeof(pgd_t));
 		/* Copy kernel mappings */
 		sync_kernel_mappings(pgd);
 	}
diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c
index 53248ed04771..d98812907493 100644
--- a/arch/um/kernel/mem.c
+++ b/arch/um/kernel/mem.c
@@ -214,14 +214,13 @@ void free_initmem(void)
 
 pgd_t *pgd_alloc(struct mm_struct *mm)
 {
-	pgd_t *pgd = (pgd_t *)__get_free_page(GFP_KERNEL);
+	pgd_t *pgd = __pgd_alloc(mm, 0);
 
-	if (pgd) {
-		memset(pgd, 0, USER_PTRS_PER_PGD * sizeof(pgd_t));
+	if (pgd)
 		memcpy(pgd + USER_PTRS_PER_PGD,
 		       swapper_pg_dir + USER_PTRS_PER_PGD,
 		       (PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
-	}
+
 	return pgd;
 }
 
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index c1bfdf7b4767..00917ef609b6 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -395,15 +395,14 @@ void __init pgtable_cache_init(void)
 				      SLAB_PANIC, NULL);
 }
 
-static inline pgd_t *_pgd_alloc(void)
+static inline pgd_t *_pgd_alloc(struct mm_struct *mm)
 {
 	/*
 	 * If no SHARED_KERNEL_PMD, PAE kernel is running as a Xen domain.
 	 * We allocate one page for pgd.
 	 */
 	if (!SHARED_KERNEL_PMD)
-		return (pgd_t *)__get_free_pages(GFP_PGTABLE_USER,
-						 PGD_ALLOCATION_ORDER);
+		return __pgd_alloc(mm, PGD_ALLOCATION_ORDER);
 
 	/*
 	 * Now PAE kernel is not running as a Xen domain. We can allocate
@@ -412,24 +411,23 @@ static inline pgd_t *_pgd_alloc(void)
 	return kmem_cache_alloc(pgd_cache, GFP_PGTABLE_USER);
 }
 
-static inline void _pgd_free(pgd_t *pgd)
+static inline void _pgd_free(struct mm_struct *mm, pgd_t *pgd)
 {
 	if (!SHARED_KERNEL_PMD)
-		free_pages((unsigned long)pgd, PGD_ALLOCATION_ORDER);
+		__pgd_free(mm, pgd);
 	else
 		kmem_cache_free(pgd_cache, pgd);
 }
 #else
 
-static inline pgd_t *_pgd_alloc(void)
+static inline pgd_t *_pgd_alloc(struct mm_struct *mm)
 {
-	return (pgd_t *)__get_free_pages(GFP_PGTABLE_USER,
-					 PGD_ALLOCATION_ORDER);
+	return __pgd_alloc(mm, PGD_ALLOCATION_ORDER);
 }
 
-static inline void _pgd_free(pgd_t *pgd)
+static inline void _pgd_free(struct mm_struct *mm, pgd_t *pgd)
 {
-	free_pages((unsigned long)pgd, PGD_ALLOCATION_ORDER);
+	__pgd_free(mm, pgd);
 }
 #endif /* CONFIG_X86_PAE */
 
@@ -439,7 +437,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 	pmd_t *u_pmds[MAX_PREALLOCATED_USER_PMDS];
 	pmd_t *pmds[MAX_PREALLOCATED_PMDS];
 
-	pgd = _pgd_alloc();
+	pgd = _pgd_alloc(mm);
 
 	if (pgd == NULL)
 		goto out;
@@ -482,7 +480,7 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 	if (sizeof(pmds) != 0)
 		free_pmds(mm, pmds, PREALLOCATED_PMDS);
 out_free_pgd:
-	_pgd_free(pgd);
+	_pgd_free(mm, pgd);
 out:
 	return NULL;
 }
@@ -492,7 +490,7 @@ void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 	pgd_mop_up_pmds(mm, pgd);
 	pgd_dtor(pgd);
 	paravirt_pgd_free(mm, pgd);
-	_pgd_free(pgd);
+	_pgd_free(mm, pgd);
 }
 
 /*
diff --git a/arch/xtensa/include/asm/pgalloc.h b/arch/xtensa/include/asm/pgalloc.h
index 7fc0f9126dd3..1919ee9c3dd6 100644
--- a/arch/xtensa/include/asm/pgalloc.h
+++ b/arch/xtensa/include/asm/pgalloc.h
@@ -29,7 +29,7 @@
 static inline pgd_t*
 pgd_alloc(struct mm_struct *mm)
 {
-	return (pgd_t*) __get_free_page(GFP_KERNEL | __GFP_ZERO);
+	return __pgd_alloc(mm, 0);
 }
 
 static inline void ptes_clear(pte_t *ptep)
diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
index bb482eeca0c3..daa8bea36952 100644
--- a/include/asm-generic/pgalloc.h
+++ b/include/asm-generic/pgalloc.h
@@ -262,10 +262,35 @@ static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
 
 #endif /* CONFIG_PGTABLE_LEVELS > 4 */
 
+static inline pgd_t *__pgd_alloc_noprof(struct mm_struct *mm, unsigned int order)
+{
+	gfp_t gfp = GFP_PGTABLE_USER;
+	struct ptdesc *ptdesc;
+
+	if (mm == &init_mm)
+		gfp = GFP_PGTABLE_KERNEL;
+	gfp &= ~__GFP_HIGHMEM;
+
+	ptdesc = pagetable_alloc_noprof(gfp, order);
+	if (!ptdesc)
+		return NULL;
+
+	return ptdesc_address(ptdesc);
+}
+#define __pgd_alloc(...)	alloc_hooks(__pgd_alloc_noprof(__VA_ARGS__))
+
+static inline void __pgd_free(struct mm_struct *mm, pgd_t *pgd)
+{
+	struct ptdesc *ptdesc = virt_to_ptdesc(pgd);
+
+	BUG_ON((unsigned long)pgd & (PAGE_SIZE-1));
+	pagetable_free(ptdesc);
+}
+
 #ifndef __HAVE_ARCH_PGD_FREE
 static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 {
-	pagetable_free(virt_to_ptdesc(pgd));
+	__pgd_free(mm, pgd);
 }
 #endif
 
-- 
2.47.0




* [PATCH 10/10] mm: Introduce ctor/dtor at PGD level
  2024-12-19 16:44 [PATCH 00/10] Account page tables at all levels Kevin Brodsky
                   ` (8 preceding siblings ...)
  2024-12-19 16:44 ` [PATCH 09/10] asm-generic: pgalloc: Provide generic __pgd_{alloc,free} Kevin Brodsky
@ 2024-12-19 16:44 ` Kevin Brodsky
  2024-12-19 17:13 ` [PATCH 00/10] Account page tables at all levels Dave Hansen
  10 siblings, 0 replies; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-19 16:44 UTC (permalink / raw)
  To: linux-mm
  Cc: Kevin Brodsky, Andrew Morton, Catalin Marinas, Dave Hansen,
	Linus Walleij, Andy Lutomirski, Peter Zijlstra,
	Mike Rapoport (IBM), Ryan Roberts, Thomas Gleixner, Will Deacon,
	Matthew Wilcox, linux-alpha, linux-arch, linux-arm-kernel,
	linux-csky, linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86

Following on from the introduction of P4D-level ctor/dtor, let's
finish the job and introduce ctor/dtor at PGD level. The resulting
improvement in page accounting is minimal; the main motivation is
to create a single, generic place where construction/destruction
hooks can be added for all page table pages.

This patch should cover all architectures and all configurations
where PGDs are one or more regular pages. This excludes any
configuration where PGDs are allocated from a kmem_cache object.

Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
---
 arch/m68k/include/asm/mcf_pgalloc.h |  2 ++
 arch/m68k/mm/motorola.c             |  6 ++++++
 arch/s390/include/asm/pgalloc.h     |  8 +++++++-
 include/asm-generic/pgalloc.h       |  2 ++
 include/linux/mm.h                  | 10 ++++++++++
 5 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/m68k/include/asm/mcf_pgalloc.h b/arch/m68k/include/asm/mcf_pgalloc.h
index 302c5bf67179..7bb9652e1d67 100644
--- a/arch/m68k/include/asm/mcf_pgalloc.h
+++ b/arch/m68k/include/asm/mcf_pgalloc.h
@@ -73,6 +73,7 @@ static inline void pte_free(struct mm_struct *mm, pgtable_t pgtable)
 
 static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 {
+	pagetable_pgd_dtor(virt_to_ptdesc(pgd));
 	pagetable_free(virt_to_ptdesc(pgd));
 }
 
@@ -84,6 +85,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 
 	if (!ptdesc)
 		return NULL;
+	pagetable_pgd_ctor(ptdesc);
 	new_pgd = ptdesc_address(ptdesc);
 
 	memcpy(new_pgd, swapper_pg_dir, PTRS_PER_PGD * sizeof(pgd_t));
diff --git a/arch/m68k/mm/motorola.c b/arch/m68k/mm/motorola.c
index 37010ee15928..b0fbb369589f 100644
--- a/arch/m68k/mm/motorola.c
+++ b/arch/m68k/mm/motorola.c
@@ -169,6 +169,9 @@ void *get_pointer_table(int type)
 		case TABLE_PMD:
 			pagetable_pmd_ctor(virt_to_ptdesc(page));
 			break;
+		case TABLE_PGD:
+			pagetable_pgd_ctor(virt_to_ptdesc(page));
+			break;
 		}
 
 		mmu_page_ctor(page);
@@ -215,6 +218,9 @@ int free_pointer_table(void *table, int type)
 		case TABLE_PMD:
 			pagetable_pmd_dtor(virt_to_ptdesc((void *)page));
 			break;
+		case TABLE_PGD:
+			pagetable_pgd_dtor(virt_to_ptdesc((void *)page));
+			break;
 		}
 
 		free_page (page);
diff --git a/arch/s390/include/asm/pgalloc.h b/arch/s390/include/asm/pgalloc.h
index 85a5d07365aa..00b69f4ddf17 100644
--- a/arch/s390/include/asm/pgalloc.h
+++ b/arch/s390/include/asm/pgalloc.h
@@ -126,11 +126,17 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
 
 static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 {
-	return (pgd_t *) crst_table_alloc(mm);
+	unsigned long *table = crst_table_alloc(mm);
+
+	if (!table)
+		return NULL;
+	pagetable_pgd_ctor(virt_to_ptdesc(table));
+	return (pgd_t *) table;
 }
 
 static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 {
+	pagetable_pgd_dtor(virt_to_ptdesc(pgd));
 	crst_table_free(mm, (unsigned long *) pgd);
 }
 
diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
index daa8bea36952..112b09dc992e 100644
--- a/include/asm-generic/pgalloc.h
+++ b/include/asm-generic/pgalloc.h
@@ -275,6 +275,7 @@ static inline pgd_t *__pgd_alloc_noprof(struct mm_struct *mm, unsigned int order
 	if (!ptdesc)
 		return NULL;
 
+	pagetable_pgd_ctor(ptdesc);
 	return ptdesc_address(ptdesc);
 }
 #define __pgd_alloc(...)	alloc_hooks(__pgd_alloc_noprof(__VA_ARGS__))
@@ -284,6 +285,7 @@ static inline void __pgd_free(struct mm_struct *mm, pgd_t *pgd)
 	struct ptdesc *ptdesc = virt_to_ptdesc(pgd);
 
 	BUG_ON((unsigned long)pgd & (PAGE_SIZE-1));
+	pagetable_pgd_dtor(ptdesc);
 	pagetable_free(ptdesc);
 }
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e8b92f4bf3f1..7347da3460c5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3171,6 +3171,16 @@ static inline void pagetable_p4d_dtor(struct ptdesc *ptdesc)
 	__pagetable_dtor(ptdesc);
 }
 
+static inline void pagetable_pgd_ctor(struct ptdesc *ptdesc)
+{
+	__pagetable_ctor(ptdesc);
+}
+
+static inline void pagetable_pgd_dtor(struct ptdesc *ptdesc)
+{
+	__pagetable_dtor(ptdesc);
+}
+
 extern void __init pagecache_init(void);
 extern void free_initmem(void);
 
-- 
2.47.0




* Re: [PATCH 00/10] Account page tables at all levels
  2024-12-19 16:44 [PATCH 00/10] Account page tables at all levels Kevin Brodsky
                   ` (9 preceding siblings ...)
  2024-12-19 16:44 ` [PATCH 10/10] mm: Introduce ctor/dtor at PGD level Kevin Brodsky
@ 2024-12-19 17:13 ` Dave Hansen
  2024-12-20 10:58   ` Kevin Brodsky
  10 siblings, 1 reply; 25+ messages in thread
From: Dave Hansen @ 2024-12-19 17:13 UTC (permalink / raw)
  To: Kevin Brodsky, linux-mm
  Cc: Andrew Morton, Catalin Marinas, Dave Hansen, Linus Walleij,
	Andy Lutomirski, Peter Zijlstra, Mike Rapoport (IBM),
	Ryan Roberts, Thomas Gleixner, Will Deacon, Matthew Wilcox,
	linux-alpha, linux-arch, linux-arm-kernel, linux-csky,
	linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86

On 12/19/24 08:44, Kevin Brodsky wrote:
>   +---------------+-------------------------+-----------------------+--------------+------------------------------------+
>   | x86           | Y                       | Y                     | Y/N          | kmem_cache at pgd level if PAE     |
>   +---------------+-------------------------+-----------------------+--------------+------------------------------------+

This is a really rare series that adds functionality _and_ removes code
overall. It looks really good to me. The x86 implementation seems to be
captured just fine in the generic one:

Acked-by: Dave Hansen <dave.hansen@linux.intel.com>

One super tiny nit is that the PAE pgd _can_ be allocated using
__get_free_pages(). It was originally there for Xen, but I think it's
being used for PTI only at this point and the comments are wrong-ish.

I kinda think we should just get rid of the 32-bit kmem_cache entirely.



* Re: [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers
  2024-12-19 16:44 ` [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers Kevin Brodsky
@ 2024-12-19 17:19   ` Peter Zijlstra
  2024-12-20 10:49     ` Kevin Brodsky
  0 siblings, 1 reply; 25+ messages in thread
From: Peter Zijlstra @ 2024-12-19 17:19 UTC (permalink / raw)
  To: Kevin Brodsky
  Cc: linux-mm, Andrew Morton, Catalin Marinas, Dave Hansen,
	Linus Walleij, Andy Lutomirski, Mike Rapoport (IBM), Ryan Roberts,
	Thomas Gleixner, Will Deacon, Matthew Wilcox, linux-alpha,
	linux-arch, linux-arm-kernel, linux-csky, linux-hexagon,
	linux-kernel, linux-m68k, linux-mips, linux-openrisc,
	linux-parisc, linux-riscv, linux-s390, linux-snps-arc, linux-um,
	loongarch, x86

On Thu, Dec 19, 2024 at 04:44:16PM +0000, Kevin Brodsky wrote:
> Besides the ptlock management at PTE/PMD level, all the
> pagetable_*_[cd]tor have the same implementation. Introduce common
> helpers for all levels to reduce the duplication.

Uff, I forgot to Cc you on the discussion here, sorry!:

  https://lkml.kernel.org/r/cover.1734526570.git.zhengqi.arch@bytedance.com

we now have two series doing more or less overlapping things :/

You can in fact trivially merge all the implementations -- the
apparent non-common bit (ptlock_free) is a no-op for all those other
levels because they'll be having ptdesc->lock == NULL.
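
To illustrate the point about the no-op, here is a minimal userspace
sketch, not kernel code. Every name ending in _sketch is hypothetical, and
the struct only models a descriptor whose lock pointer is NULL at levels
that never allocate a split lock. It shows why a single destructor can
serve all levels: the lock release simply returns early when there is
nothing to release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct ptdesc. */
struct ptdesc_sketch {
	void *lock;      /* split lock; NULL when the level never allocated one */
	bool accounted;  /* stands in for the page table accounting state */
};

/* Number of locks actually released, so the no-op case is observable. */
static int locks_freed;

/* Models ptlock_free(): does nothing when no lock was allocated. */
static void ptlock_free_sketch(struct ptdesc_sketch *pt)
{
	if (!pt->lock)
		return;
	free(pt->lock);
	pt->lock = NULL;
	locks_freed++;
}

/* PTE/PMD-style constructor: allocates a split lock and accounts the page. */
static bool ctor_with_lock_sketch(struct ptdesc_sketch *pt)
{
	pt->lock = malloc(1);
	if (!pt->lock)
		return false;
	pt->accounted = true;
	return true;
}

/* PUD/P4D/PGD-style constructor: accounting only, the lock stays NULL. */
static void ctor_no_lock_sketch(struct ptdesc_sketch *pt)
{
	pt->lock = NULL;
	pt->accounted = true;
}

/* One destructor usable at every level. */
static void dtor_common_sketch(struct ptdesc_sketch *pt)
{
	ptlock_free_sketch(pt);
	pt->accounted = false;
}
```

Calling dtor_common_sketch() on a descriptor set up by
ctor_no_lock_sketch() leaves locks_freed untouched, which is the behaviour
the merged implementation would rely on.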



* Re: [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers
  2024-12-19 17:19   ` Peter Zijlstra
@ 2024-12-20 10:49     ` Kevin Brodsky
  2024-12-20 11:46       ` Qi Zheng
  0 siblings, 1 reply; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-20 10:49 UTC (permalink / raw)
  To: Peter Zijlstra, Qi Zheng
  Cc: linux-mm, Andrew Morton, Catalin Marinas, Dave Hansen,
	Linus Walleij, Andy Lutomirski, Mike Rapoport (IBM), Ryan Roberts,
	Thomas Gleixner, Will Deacon, Matthew Wilcox, linux-alpha,
	linux-arch, linux-arm-kernel, linux-csky, linux-hexagon,
	linux-kernel, linux-m68k, linux-mips, linux-openrisc,
	linux-parisc, linux-riscv, linux-s390, linux-snps-arc, linux-um,
	loongarch, x86, Alexander Gordeev

Hi Peter, Qi,

On 19/12/2024 18:19, Peter Zijlstra wrote:
> On Thu, Dec 19, 2024 at 04:44:16PM +0000, Kevin Brodsky wrote:
>> Besides the ptlock management at PTE/PMD level, all the
>> pagetable_*_[cd]tor have the same implementation. Introduce common
>> helpers for all levels to reduce the duplication.
> Uff, I forgot to Cc you on the discussion here, sorry!:
>
>   https://lkml.kernel.org/r/cover.1734526570.git.zhengqi.arch@bytedance.com
>
> we now have two series doing more or less overlapping things :/
>
> You can in fact trivially merge all the implementations -- the
> apparent non-common bit (ptlock_free) is a no-op for all those other
> levels because they'll be having ptdesc->lock == NULL.

Ah that is good to know, thanks for letting me know about that and Qi's
series! Fortunately there isn't that much overlap between our series - I
think we can easily sort this out.

Qi, shall we collaborate to make our series complementary? I believe my
series covers patch 2 and 4 of your series, but it goes further by
covering all levels and all architectures, and patches introducing
ctor/dtor are already split as Alexander suggested on your series. So my
suggestion would be:

* Remove patch 1 in my series - I'd just introduce
pagetable_{p4d,pgd}_[cd]tor with the same implementation as
pagetable_pud_[cd]tor.
* Remove patch 2 and 4 from your series and rebase it on mine.

Let me know if that makes sense, if so I'll post a v2.

Cheers,
- Kevin



* Re: [PATCH 00/10] Account page tables at all levels
  2024-12-19 17:13 ` [PATCH 00/10] Account page tables at all levels Dave Hansen
@ 2024-12-20 10:58   ` Kevin Brodsky
  2024-12-20 14:45     ` Dave Hansen
  2024-12-20 19:31     ` Dave Hansen
  0 siblings, 2 replies; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-20 10:58 UTC (permalink / raw)
  To: Dave Hansen, linux-mm
  Cc: Andrew Morton, Catalin Marinas, Dave Hansen, Linus Walleij,
	Andy Lutomirski, Peter Zijlstra, Mike Rapoport (IBM),
	Ryan Roberts, Thomas Gleixner, Will Deacon, Matthew Wilcox,
	linux-alpha, linux-arch, linux-arm-kernel, linux-csky,
	linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86

On 19/12/2024 18:13, Dave Hansen wrote:
> On 12/19/24 08:44, Kevin Brodsky wrote:
>>   +---------------+-------------------------+-----------------------+--------------+------------------------------------+
>>   | x86           | Y                       | Y                     | Y/N          | kmem_cache at pgd level if PAE     |
>>   +---------------+-------------------------+-----------------------+--------------+------------------------------------+
> This is a really rare series that adds functionality _and_ removes code
> overall. It looks really good to me. The x86 implementation seems to be
> captured just fine in the generic one:

Thank you for the review, very appreciated!

> Acked-by: Dave Hansen <dave.hansen@linux.intel.com>

Just to double-check, are you ack'ing the x86 changes specifically? If
so I'll add your Acked-by on patch 6, 7 and 9.

> One super tiny nit is that the PAE pgd _can_ be allocated using
> __get_free_pages(). It was originally there for Xen, but I think it's
> being used for PTI only at this point and the comments are wrong-ish.
>
> I kinda think we should just get rid of the 32-bit kmem_cache entirely.

That would certainly simplify things on the x86 side! I'm not at all
familiar with that code though, would you be happy with providing a
patch? I could add it to this series if that's convenient.

- Kevin



* Re: [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers
  2024-12-20 10:49     ` Kevin Brodsky
@ 2024-12-20 11:46       ` Qi Zheng
  2024-12-20 13:50         ` Kevin Brodsky
  0 siblings, 1 reply; 25+ messages in thread
From: Qi Zheng @ 2024-12-20 11:46 UTC (permalink / raw)
  To: Kevin Brodsky
  Cc: Peter Zijlstra, linux-mm, Andrew Morton, Catalin Marinas,
	Dave Hansen, Linus Walleij, Andy Lutomirski, Mike Rapoport (IBM),
	Ryan Roberts, Thomas Gleixner, Will Deacon, Matthew Wilcox,
	linux-alpha, linux-arch, linux-arm-kernel, linux-csky,
	linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86, Alexander Gordeev

Hi Kevin,

On 2024/12/20 18:49, Kevin Brodsky wrote:
> Hi Peter, Qi,
> 
> On 19/12/2024 18:19, Peter Zijlstra wrote:
>> On Thu, Dec 19, 2024 at 04:44:16PM +0000, Kevin Brodsky wrote:
>>> Besides the ptlock management at PTE/PMD level, all the
>>> pagetable_*_[cd]tor have the same implementation. Introduce common
>>> helpers for all levels to reduce the duplication.
>> Uff, I forgot to Cc you on the discussion here, sorry!:
>>
>>    https://lkml.kernel.org/r/cover.1734526570.git.zhengqi.arch@bytedance.com
>>
>> we now have two series doing more or less overlapping things :/
>>
>> You can in fact trivially merge all the implementations -- the
>> apparent non-common bit (ptlock_free) is a no-op for all those other
>> levels because they'll be having ptdesc->lock == NULL.
> 
> Ah that is good to know, thanks for letting me know about that and Qi's
> series! Fortunately there isn't that much overlap between our series - I
> think we can easily sort this out.
> 
> Qi, shall we collaborate to make our series complementary? I believe my
> series covers patch 2 and 4 of your series, but it goes further by
> covering all levels and all architectures, and patches introducing
> ctor/dtor are already split as Alexander suggested on your series. So my
> suggestion would be:
> 
> * Remove patch 1 in my series - I'd just introduce
> pagetable_{p4d,pgd}_[cd]tor with the same implementation as
> pagetable_pud_[cd]tor.
> * Remove patch 2 and 4 from your series and rebase it on mine.

I quickly went through your patch series. It looks like my patch 2 and
your patch 6 are duplicated, so you want me to remove my patch 2.

But I think you may not be able to simply let arm64, riscv and x86
use generic p4d_{alloc_one,free}(). Because even if
CONFIG_PGTABLE_LEVELS > 4, the pgtable_l5_enabled() may not be true.

For example, in arm64:

#if CONFIG_PGTABLE_LEVELS > 4

static __always_inline bool pgtable_l5_enabled(void)
{
	if (!alternative_has_cap_likely(ARM64_ALWAYS_BOOT))
		return vabits_actual == VA_BITS;
	return alternative_has_cap_unlikely(ARM64_HAS_VA52);
}

Did I miss something?

My patch series is not only for cleanup, but also for fixes of
UAF issue [1], so is it possible to rebase your patch series onto
mine? I can post v3 ASAP.

[1]. 
https://lore.kernel.org/all/67548279.050a0220.a30f1.015b.GAE@google.com/

Thanks!

> 
> Let me know if that makes sense, if so I'll post a v2.
> 
> Cheers,
> - Kevin



* Re: [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers
  2024-12-20 11:46       ` Qi Zheng
@ 2024-12-20 13:50         ` Kevin Brodsky
  2024-12-20 14:16           ` Qi Zheng
  0 siblings, 1 reply; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-20 13:50 UTC (permalink / raw)
  To: Qi Zheng
  Cc: Peter Zijlstra, linux-mm, Andrew Morton, Catalin Marinas,
	Dave Hansen, Linus Walleij, Andy Lutomirski, Mike Rapoport (IBM),
	Ryan Roberts, Thomas Gleixner, Will Deacon, Matthew Wilcox,
	linux-alpha, linux-arch, linux-arm-kernel, linux-csky,
	linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86, Alexander Gordeev

On 20/12/2024 12:46, Qi Zheng wrote:
> Hi Kevin,
>
> On 2024/12/20 18:49, Kevin Brodsky wrote:
>> [...]
>>
>> Qi, shall we collaborate to make our series complementary? I believe my
>> series covers patch 2 and 4 of your series, but it goes further by
>> covering all levels and all architectures, and patches introducing
>> ctor/dtor are already split as Alexander suggested on your series. So my
>> suggestion would be:
>>
>> * Remove patch 1 in my series - I'd just introduce
>> pagetable_{p4d,pgd}_[cd]tor with the same implementation as
>> pagetable_pud_[cd]tor.
>> * Remove patch 2 and 4 from your series and rebase it on mine.
>
> I quickly went through your patch series. It looks like my patch 2 and
> your patch 6 are duplicated, so you want me to remove my patch 2.
>
> But I think you may not be able to simply let arm64, riscv and x86
> use generic p4d_{alloc_one,free}(). Because even if
> CONFIG_PGTABLE_LEVELS > 4, the pgtable_l5_enabled() may not be true.
>
> For example, in arm64:
>
> #if CONFIG_PGTABLE_LEVELS > 4
>
> static __always_inline bool pgtable_l5_enabled(void)
> {
>     if (!alternative_has_cap_likely(ARM64_ALWAYS_BOOT))
>         return vabits_actual == VA_BITS;
>     return alternative_has_cap_unlikely(ARM64_HAS_VA52);
> }

Correct. That's why the implementation of p4d_free() I introduce in
patch 6 checks mm_p4d_folded(), which is implemented as
!pgtable_l5_enabled() on those architectures (see last paragraph in
commit message). In fact it turns out Alexander suggested exactly this
approach [2].
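
As a rough illustration of that check, here is a userspace sketch, not
the code from patch 6. All names ending in _sketch are hypothetical and
the l5_enabled field merely models the runtime result of
pgtable_l5_enabled(). The idea is that the free path returns early when
the P4D level is folded at runtime, since in that case no separate P4D
page was allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_struct; only what the sketch needs. */
struct mm_sketch {
	bool l5_enabled;   /* models the runtime result of pgtable_l5_enabled() */
};

/* Number of P4D pages released, so the early return is observable. */
static int p4d_pages_freed;

/* Models mm_p4d_folded() being implemented as !pgtable_l5_enabled(). */
static bool mm_p4d_folded_sketch(const struct mm_sketch *mm)
{
	return !mm->l5_enabled;
}

/* Stands in for running the dtor and releasing the page. */
static void p4d_release_sketch(void *p4d)
{
	(void)p4d;
	p4d_pages_freed++;
}

/* Free path: skip the release when the P4D level is folded at runtime. */
static void p4d_free_sketch(const struct mm_sketch *mm, void *p4d)
{
	if (mm_p4d_folded_sketch(mm))
		return;
	p4d_release_sketch(p4d);
}
```

With l5_enabled false the call is a no-op even though the kernel was
built with five levels, which is the case the question above is about.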

>
> Did I miss something?
>
> My patch series is not only for cleanup, but also for fixes of
> UAF issue [1], so is it possible to rebase your patch series onto
> mine? I can post v3 ASAP.

I see, yours should be merged first then. The issue is that yours would
depend on some of the patches in mine, not the other way round.

My suggestion would then be for you to take patch 5, 6 and 7 from my
series, as they match Alexander's suggestions (and patch 5 is I think a
useful simplification), and replace patch 2 in your series with those. I
would then rebase my series on top and adapt it accordingly. Does that
sound reasonable?

- Kevin

[2]
https://lore.kernel.org/all/Z2RKpdv7pL34MIEt@tuxmaker.boeblingen.de.ibm.com/




* Re: [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers
  2024-12-20 13:50         ` Kevin Brodsky
@ 2024-12-20 14:16           ` Qi Zheng
  2024-12-20 14:28             ` Kevin Brodsky
  0 siblings, 1 reply; 25+ messages in thread
From: Qi Zheng @ 2024-12-20 14:16 UTC (permalink / raw)
  To: Kevin Brodsky
  Cc: Peter Zijlstra, linux-mm, Andrew Morton, Catalin Marinas,
	Dave Hansen, Linus Walleij, Andy Lutomirski, Mike Rapoport (IBM),
	Ryan Roberts, Thomas Gleixner, Will Deacon, Matthew Wilcox,
	linux-alpha, linux-arch, linux-arm-kernel, linux-csky,
	linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86, Alexander Gordeev



On 2024/12/20 21:50, Kevin Brodsky wrote:
> On 20/12/2024 12:46, Qi Zheng wrote:
>> Hi Kevin,
>>
>> On 2024/12/20 18:49, Kevin Brodsky wrote:
>>> [...]
>>>
>>> Qi, shall we collaborate to make our series complementary? I believe my
>>> series covers patch 2 and 4 of your series, but it goes further by
>>> covering all levels and all architectures, and patches introducing
>>> ctor/dtor are already split as Alexander suggested on your series. So my
>>> suggestion would be:
>>>
>>> * Remove patch 1 in my series - I'd just introduce
>>> pagetable_{p4d,pgd}_[cd]tor with the same implementation as
>>> pagetable_pud_[cd]tor.
>>> * Remove patch 2 and 4 from your series and rebase it on mine.
>>
>> I quickly went through your patch series. It looks like my patch 2 and
>> your patch 6 are duplicated, so you want me to remove my patch 2.
>>
>> But I think you may not be able to simply let arm64, riscv and x86
>> use generic p4d_{alloc_one,free}(). Because even if
>> CONFIG_PGTABLE_LEVELS > 4, the pgtable_l5_enabled() may not be true.
>>
>> For example, in arm64:
>>
>> #if CONFIG_PGTABLE_LEVELS > 4
>>
>> static __always_inline bool pgtable_l5_enabled(void)
>> {
>>      if (!alternative_has_cap_likely(ARM64_ALWAYS_BOOT))
>>          return vabits_actual == VA_BITS;
>>      return alternative_has_cap_unlikely(ARM64_HAS_VA52);
>> }
> 
> Correct. That's why the implementation of p4d_free() I introduce in
> patch 6 checks mm_p4d_folded(), which is implemented as
> !pgtable_l5_enabled() on those architectures (see last paragraph in
> commit message). In fact it turns out Alexander suggested exactly this
> approach [2].

OK, I see.

> 
>>
>> Did I miss something?
>>
>> My patch series is not only for cleanup, but also for fixes of
>> UAF issue [1], so is it possible to rebase your patch series onto
>> mine? I can post v3 ASAP.
> 
> I see, yours should be merged first then. The issue is that yours would
> depend on some of the patches in mine, not the other way round.
> 
> My suggestion would then be for you to take patch 5, 6 and 7 from my
> series, as they match Alexander's suggestions (and patch 5 is I think a
> useful simplification), and replace patch 2 in your series with those. I
> would then rebase my series on top and adapt it accordingly. Does that
> sound reasonable?

Sounds good. But maybe just patch 5 and 6. Because I actually did
the work of your patch 7 in my patch 2 and 4.

So, is it okay to do something like the following?

1. I separate the ctor()/dtor() part from my patch 2, and then replace
    the rest with your patch 6.
2. take your patch 5 from your series

If it's ok, I will post the v3 next Monday. ;)

Thanks!

> 
> - Kevin
> 
> [2]
> https://lore.kernel.org/all/Z2RKpdv7pL34MIEt@tuxmaker.boeblingen.de.ibm.com/
> 



* Re: [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers
  2024-12-20 14:16           ` Qi Zheng
@ 2024-12-20 14:28             ` Kevin Brodsky
  2024-12-20 14:35               ` Qi Zheng
  0 siblings, 1 reply; 25+ messages in thread
From: Kevin Brodsky @ 2024-12-20 14:28 UTC (permalink / raw)
  To: Qi Zheng
  Cc: Peter Zijlstra, linux-mm, Andrew Morton, Catalin Marinas,
	Dave Hansen, Linus Walleij, Andy Lutomirski, Mike Rapoport (IBM),
	Ryan Roberts, Thomas Gleixner, Will Deacon, Matthew Wilcox,
	linux-alpha, linux-arch, linux-arm-kernel, linux-csky,
	linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86, Alexander Gordeev

On 20/12/2024 15:16, Qi Zheng wrote:
>>>
>>> Did I miss something?
>>>
>>> My patch series is not only for cleanup, but also for fixes of
>>> UAF issue [1], so is it possible to rebase your patch series onto
>>> mine? I can post v3 ASAP.
>>
>> I see, yours should be merged first then. The issue is that yours would
>> depend on some of the patches in mine, not the other way round.
>>
>> My suggestion would then be for you to take patch 5, 6 and 7 from my
>> series, as they match Alexander's suggestions (and patch 5 is I think a
>> useful simplification), and replace patch 2 in your series with those. I
>> would then rebase my series on top and adapt it accordingly. Does that
>> sound reasonable?
>
> Sounds good. But maybe just patch 5 and 6. Because I actually did
> the work of your patch 7 in my patch 2 and 4.

Yes that's fair! You'd have to adapt my patch 7 to make it fit in
your series so I agree it makes more sense this way.

>
> So, is it okay to do something like the following?
>
> 1. I separate the ctor()/dtor() part from my patch 2, and then replace
>    the rest with your patch 6.
> 2. take your patch 5 from your series

Sounds good to me!

IIUC Dave Hansen gave his Acked-by for the x86 part of patch 6 [1], so
it would make sense to add it when you post your v3.

>
> If it's ok, I will post the v3 next Monday. ;)

Perfect. I'm going offline tonight, when I come back in the new year
I'll review your v3 series and post a new version of this one.

Cheers,
- Kevin

[1]
https://lore.kernel.org/linux-mm/a7398426-56d1-40b4-a1c9-40ae8c8a4b4b@intel.com/



* Re: [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers
  2024-12-20 14:28             ` Kevin Brodsky
@ 2024-12-20 14:35               ` Qi Zheng
  0 siblings, 0 replies; 25+ messages in thread
From: Qi Zheng @ 2024-12-20 14:35 UTC (permalink / raw)
  To: Kevin Brodsky
  Cc: Peter Zijlstra, linux-mm, Andrew Morton, Catalin Marinas,
	Dave Hansen, Linus Walleij, Andy Lutomirski, Mike Rapoport (IBM),
	Ryan Roberts, Thomas Gleixner, Will Deacon, Matthew Wilcox,
	linux-alpha, linux-arch, linux-arm-kernel, linux-csky,
	linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86, Alexander Gordeev



On 2024/12/20 22:28, Kevin Brodsky wrote:
> On 20/12/2024 15:16, Qi Zheng wrote:
>>>>
>>>> Did I miss something?
>>>>
>>>> My patch series is not only for cleanup, but also for fixes of
>>>> UAF issue [1], so is it possible to rebase your patch series onto
>>>> mine? I can post v3 ASAP.
>>>
>>> I see, yours should be merged first then. The issue is that yours would
>>> depend on some of the patches in mine, not the other way round.
>>>
>>> My suggestion would then be for you to take patch 5, 6 and 7 from my
>>> series, as they match Alexander's suggestions (and patch 5 is I think a
>>> useful simplification), and replace patch 2 in your series with those. I
>>> would then rebase my series on top and adapt it accordingly. Does that
>>> sound reasonable?
>>
>> Sounds good. But maybe just patch 5 and 6. Because I actually did
>> the work of your patch 7 in my patch 2 and 4.
> 
> Yes that's fair! You'd have to adapt my patch 7 to make it fit in
> your series so I agree it makes more sense this way.

Thanks!

> 
>>
>> So, is it okay to do something like the following?
>>
>> 1. I separate the ctor()/dtor() part from my patch 2, and then replace
>>     the rest with your patch 6.
>> 2. take your patch 5 from your series
> 
> Sounds good to me!
> 
> IIUC Dave Hansen gave his Acked-by for the x86 part of patch 6 [1], so
> it would make sense to add it when you post your v3.

OK, will add it!

> 
>>
>> If it's ok, I will post the v3 next Monday. ;)
> 
> Perfect. I'm going offline tonight, when I come back in the new year
> I'll review your v3 series and post a new version of this one.

Thank you very much! And Happy New Year!

> 
> Cheers,
> - Kevin
> 
> [1]
> https://lore.kernel.org/linux-mm/a7398426-56d1-40b4-a1c9-40ae8c8a4b4b@intel.com/



* Re: [PATCH 00/10] Account page tables at all levels
  2024-12-20 10:58   ` Kevin Brodsky
@ 2024-12-20 14:45     ` Dave Hansen
  2024-12-20 19:31     ` Dave Hansen
  1 sibling, 0 replies; 25+ messages in thread
From: Dave Hansen @ 2024-12-20 14:45 UTC (permalink / raw)
  To: Kevin Brodsky, linux-mm
  Cc: Andrew Morton, Catalin Marinas, Dave Hansen, Linus Walleij,
	Andy Lutomirski, Peter Zijlstra, Mike Rapoport (IBM),
	Ryan Roberts, Thomas Gleixner, Will Deacon, Matthew Wilcox,
	linux-alpha, linux-arch, linux-arm-kernel, linux-csky,
	linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86

On 12/20/24 02:58, Kevin Brodsky wrote:
>> Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
> Just to double-check, are you ack'ing the x86 changes specifically? If
> so I'll add your Acked-by on patch 6, 7 and 9.

Feel free to add it to each patch in the series.



* Re: [PATCH 00/10] Account page tables at all levels
  2024-12-20 10:58   ` Kevin Brodsky
  2024-12-20 14:45     ` Dave Hansen
@ 2024-12-20 19:31     ` Dave Hansen
  2025-01-03  9:28       ` Kevin Brodsky
  1 sibling, 1 reply; 25+ messages in thread
From: Dave Hansen @ 2024-12-20 19:31 UTC (permalink / raw)
  To: Kevin Brodsky, linux-mm
  Cc: Andrew Morton, Catalin Marinas, Dave Hansen, Linus Walleij,
	Andy Lutomirski, Peter Zijlstra, Mike Rapoport (IBM),
	Ryan Roberts, Thomas Gleixner, Will Deacon, Matthew Wilcox,
	linux-alpha, linux-arch, linux-arm-kernel, linux-csky,
	linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86, Joerg Roedel

On 12/20/24 02:58, Kevin Brodsky wrote:
>> One super tiny nit is that the PAE pgd _can_ be allocated using
>> __get_free_pages(). It was originally there for Xen, but I think it's
>> being used for PTI only at this point and the comments are wrong-ish.
>>
>> I kinda think we should just get rid of the 32-bit kmem_cache entirely.
> That would certainly simplify things on the x86 side! I'm not at all
> familiar with that code though, would you be happy with providing a
> patch? I could add it to this series if that's convenient.

I hacked this together yesterday:

> https://git.kernel.org/pub/scm/linux/kernel/git/daveh/devel.git/log/?h=simplify-pae-20241220
It definitely needs some more work. I'm particularly still puzzling
about why SHARED_KERNEL_PMD is used both as a trigger for 32b vs.
PAGE_SIZE PAE pgd allocations _and_ for the actual PMD sharing.

Xen definitely needed the whole page behavior but I'm not sure why PTI did.

Either way, that series should make the PAE PGDs a _bit_ less weird at
the cost of an extra ~2 pages per process for folks who are running
32-bit PAE kernels with PTI disabled.

But I think the diffstat is worth it:

 5 files changed, 16 insertions(+), 96 deletions(-)




* Re: [PATCH 00/10] Account page tables at all levels
  2024-12-20 19:31     ` Dave Hansen
@ 2025-01-03  9:28       ` Kevin Brodsky
  0 siblings, 0 replies; 25+ messages in thread
From: Kevin Brodsky @ 2025-01-03  9:28 UTC (permalink / raw)
  To: Dave Hansen, linux-mm
  Cc: Andrew Morton, Catalin Marinas, Dave Hansen, Linus Walleij,
	Andy Lutomirski, Peter Zijlstra, Mike Rapoport (IBM),
	Ryan Roberts, Thomas Gleixner, Will Deacon, Matthew Wilcox,
	linux-alpha, linux-arch, linux-arm-kernel, linux-csky,
	linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86, Joerg Roedel

On 20/12/2024 20:31, Dave Hansen wrote:
> On 12/20/24 02:58, Kevin Brodsky wrote:
>>> One super tiny nit is that the PAE pgd _can_ be allocated using
>>> __get_free_pages(). It was originally there for Xen, but I think it's
>>> being used for PTI only at this point and the comments are wrong-ish.
>>>
>>> I kinda think we should just get rid of the 32-bit kmem_cache entirely.
>> That would certainly simplify things on the x86 side! I'm not at all
>> familiar with that code though, would you be happy with providing a
>> patch? I could add it to this series if that's convenient.
> I hacked this together yesterday:
>
>> https://git.kernel.org/pub/scm/linux/kernel/git/daveh/devel.git/log/?h=simplify-pae-20241220
> It definitely needs some more work. I'm particularly still puzzling
> about why SHARED_KERNEL_PMD is used both as a trigger for 32b vs.
> PAGE_SIZE PAE pgd allocations _and_ for the actual PMD sharing.
>
> Xen definitely needed the whole page behavior but I'm not sure why PTI did.
>
> Either way, that series should make the PAE PGDs a _bit_ less weird at
> the cost of an extra ~2 pages per process for folks who are running
> 32-bit PAE kernels with PTI disabled.
>
> But I think the diffstat is worth it:
>
>  5 files changed, 16 insertions(+), 96 deletions(-)

That does look like a nice simplification! After the first patch, with
my series, we could get rid of _pgd_alloc() and _pgd_free() in
arch/x86/mm/pgtable.c and just call __pgd_alloc() and __pgd_free() directly.

Considering that these changes are not trivial and may need more work,
should I let you post those patches as a separate series? If it gets
merged soon, I'll adapt my series, otherwise I can post a follow-up
patch later if needed.

- Kevin



* Re: [PATCH 05/10] riscv: mm: Skip pgtable level check in {pud,p4d}_alloc_one
  2024-12-19 16:44 ` [PATCH 05/10] riscv: mm: Skip pgtable level check in {pud,p4d}_alloc_one Kevin Brodsky
@ 2025-01-03 10:31   ` Alexandre Ghiti
  2025-01-03 10:36     ` Kevin Brodsky
  0 siblings, 1 reply; 25+ messages in thread
From: Alexandre Ghiti @ 2025-01-03 10:31 UTC (permalink / raw)
  To: Kevin Brodsky, linux-mm
  Cc: Andrew Morton, Catalin Marinas, Dave Hansen, Linus Walleij,
	Andy Lutomirski, Peter Zijlstra, Mike Rapoport (IBM),
	Ryan Roberts, Thomas Gleixner, Will Deacon, Matthew Wilcox,
	linux-alpha, linux-arch, linux-arm-kernel, linux-csky,
	linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86

Hi Kevin,

On 19/12/2024 17:44, Kevin Brodsky wrote:
> {pmd,pud,p4d}_alloc_one() is never called if the corresponding page
> table level is folded, as {pmd,pud,p4d}_alloc() already does the
> required check. We can therefore remove the runtime page table level
> checks in {pud,p4d}_alloc_one. The PUD helper becomes equivalent to
> the generic version, so we remove it altogether.
>
> This is consistent with the way arm64 and x86 handle this situation
> (runtime check in p4d_free() only).
>
> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
> ---
>   arch/riscv/include/asm/pgalloc.h | 22 ++++------------------
>   1 file changed, 4 insertions(+), 18 deletions(-)
>
> diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
> index f52264304f77..8ad0bbe838a2 100644
> --- a/arch/riscv/include/asm/pgalloc.h
> +++ b/arch/riscv/include/asm/pgalloc.h
> @@ -12,7 +12,6 @@
>   #include <asm/tlb.h>
>   
>   #ifdef CONFIG_MMU
> -#define __HAVE_ARCH_PUD_ALLOC_ONE
>   #define __HAVE_ARCH_PUD_FREE
>   #include <asm-generic/pgalloc.h>
>   
> @@ -88,15 +87,6 @@ static inline void pgd_populate_safe(struct mm_struct *mm, pgd_t *pgd,
>   	}
>   }
>   
> -#define pud_alloc_one pud_alloc_one
> -static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
> -{
> -	if (pgtable_l4_enabled)
> -		return __pud_alloc_one(mm, addr);
> -
> -	return NULL;
> -}
> -
>   #define pud_free pud_free
>   static inline void pud_free(struct mm_struct *mm, pud_t *pud)
>   {
> @@ -118,15 +108,11 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
>   #define p4d_alloc_one p4d_alloc_one
>   static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
>   {
> -	if (pgtable_l5_enabled) {
> -		gfp_t gfp = GFP_PGTABLE_USER;
> -
> -		if (mm == &init_mm)
> -			gfp = GFP_PGTABLE_KERNEL;
> -		return (p4d_t *)get_zeroed_page(gfp);
> -	}
> +	gfp_t gfp = GFP_PGTABLE_USER;
>   
> -	return NULL;
> +	if (mm == &init_mm)
> +		gfp = GFP_PGTABLE_KERNEL;
> +	return (p4d_t *)get_zeroed_page(gfp);
>   }
>   
>   static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)


Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks,

Alex




* Re: [PATCH 05/10] riscv: mm: Skip pgtable level check in {pud,p4d}_alloc_one
  2025-01-03 10:31   ` Alexandre Ghiti
@ 2025-01-03 10:36     ` Kevin Brodsky
  0 siblings, 0 replies; 25+ messages in thread
From: Kevin Brodsky @ 2025-01-03 10:36 UTC (permalink / raw)
  To: Alexandre Ghiti, linux-mm
  Cc: Andrew Morton, Catalin Marinas, Dave Hansen, Linus Walleij,
	Andy Lutomirski, Peter Zijlstra, Mike Rapoport (IBM),
	Ryan Roberts, Thomas Gleixner, Will Deacon, Matthew Wilcox,
	linux-alpha, linux-arch, linux-arm-kernel, linux-csky,
	linux-hexagon, linux-kernel, linux-m68k, linux-mips,
	linux-openrisc, linux-parisc, linux-riscv, linux-s390,
	linux-snps-arc, linux-um, loongarch, x86, Qi Zheng

+Qi

On 03/01/2025 11:31, Alexandre Ghiti wrote:
> Hi Kevin,
>
> On 19/12/2024 17:44, Kevin Brodsky wrote:
>> {pmd,pud,p4d}_alloc_one() is never called if the corresponding page
>> table level is folded, as {pmd,pud,p4d}_alloc() already does the
>> required check. We can therefore remove the runtime page table level
>> checks in {pud,p4d}_alloc_one. The PUD helper becomes equivalent to
>> the generic version, so we remove it altogether.
>>
>> This is consistent with the way arm64 and x86 handle this situation
>> (runtime check in p4d_free() only).
>>
>> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
>> ---
>>   arch/riscv/include/asm/pgalloc.h | 22 ++++------------------
>>   1 file changed, 4 insertions(+), 18 deletions(-)
>>
>> diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
>> index f52264304f77..8ad0bbe838a2 100644
>> --- a/arch/riscv/include/asm/pgalloc.h
>> +++ b/arch/riscv/include/asm/pgalloc.h
>> @@ -12,7 +12,6 @@
>>   #include <asm/tlb.h>
>>
>>   #ifdef CONFIG_MMU
>> -#define __HAVE_ARCH_PUD_ALLOC_ONE
>>   #define __HAVE_ARCH_PUD_FREE
>>   #include <asm-generic/pgalloc.h>
>>
>> @@ -88,15 +87,6 @@ static inline void pgd_populate_safe(struct mm_struct *mm, pgd_t *pgd,
>>       }
>>   }
>>
>> -#define pud_alloc_one pud_alloc_one
>> -static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
>> -{
>> -    if (pgtable_l4_enabled)
>> -        return __pud_alloc_one(mm, addr);
>> -
>> -    return NULL;
>> -}
>> -
>>   #define pud_free pud_free
>>   static inline void pud_free(struct mm_struct *mm, pud_t *pud)
>>   {
>> @@ -118,15 +108,11 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
>>   #define p4d_alloc_one p4d_alloc_one
>>   static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
>>   {
>> -    if (pgtable_l5_enabled) {
>> -        gfp_t gfp = GFP_PGTABLE_USER;
>> -
>> -        if (mm == &init_mm)
>> -            gfp = GFP_PGTABLE_KERNEL;
>> -        return (p4d_t *)get_zeroed_page(gfp);
>> -    }
>> +    gfp_t gfp = GFP_PGTABLE_USER;
>>
>> -    return NULL;
>> +    if (mm == &init_mm)
>> +        gfp = GFP_PGTABLE_KERNEL;
>> +    return (p4d_t *)get_zeroed_page(gfp);
>>   }
>>
>>   static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
>
>
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>

Thanks for the review! Just FYI this patch is now part of Qi's series
[1], I will drop it when posting the next version of this series.

- Kevin

[1]
https://lore.kernel.org/linux-mm/84ddf857508b98a195a790bc6ff6ab8849b44633.1735549103.git.zhengqi.arch@bytedance.com/
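
The reasoning in the commit message (the caller already checks whether the
level is folded, so the allocator need not check again) can be illustrated
with a small userspace model. This is only a sketch, not kernel code: every
model_* name is hypothetical, model_l5_enabled stands in for the runtime
pgtable_l5_enabled flag, and calloc() stands in for get_zeroed_page(). It
assumes that the "none" test on a folded level always reports false, which is
what keeps the allocator from being reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the runtime pgtable_l5_enabled flag. */
static bool model_l5_enabled;

/* Counts allocator calls so the claim can be checked. */
static int model_alloc_calls;

/* One upper-level entry: NULL means "none", otherwise the next-level table. */
typedef struct { void *next; } model_pgd_t;

/* Models the idea of pgd_none(): a folded level never reports "none". */
static bool model_pgd_none(const model_pgd_t *pgd)
{
	if (model_l5_enabled)
		return pgd->next == NULL;
	return false;
}

/* Models p4d_alloc_one() after the patch: no level check in the allocator. */
static void *model_p4d_alloc_one(void)
{
	model_alloc_calls++;
	return calloc(1, 4096);
}

/* Models the idea of p4d_alloc(): allocate only when the entry is none. */
static void *model_p4d_alloc(model_pgd_t *pgd)
{
	assert(pgd != NULL);
	if (model_pgd_none(pgd)) {
		pgd->next = model_p4d_alloc_one();
		if (!pgd->next)
			return NULL;
	}
	/* Folded level: the P4D "table" is the PGD entry itself. */
	return model_l5_enabled ? pgd->next : (void *)pgd;
}
```

With model_l5_enabled set to false, model_p4d_alloc() returns the PGD entry
itself and model_alloc_calls stays at zero. With the flag set to true, the
first call allocates once and later calls reuse the same table.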



Thread overview: 25+ messages
2024-12-19 16:44 [PATCH 00/10] Account page tables at all levels Kevin Brodsky
2024-12-19 16:44 ` [PATCH 01/10] mm: Move common parts of pagetable_*_[cd]tor to helpers Kevin Brodsky
2024-12-19 17:19   ` Peter Zijlstra
2024-12-20 10:49     ` Kevin Brodsky
2024-12-20 11:46       ` Qi Zheng
2024-12-20 13:50         ` Kevin Brodsky
2024-12-20 14:16           ` Qi Zheng
2024-12-20 14:28             ` Kevin Brodsky
2024-12-20 14:35               ` Qi Zheng
2024-12-19 16:44 ` [PATCH 02/10] parisc: mm: Ensure pagetable_pmd_[cd]tor are called Kevin Brodsky
2024-12-19 16:44 ` [PATCH 03/10] m68k: mm: Add calls to pagetable_pmd_[cd]tor Kevin Brodsky
2024-12-19 16:44 ` [PATCH 04/10] s390/mm: Add calls to pagetable_pud_[cd]tor Kevin Brodsky
2024-12-19 16:44 ` [PATCH 05/10] riscv: mm: Skip pgtable level check in {pud,p4d}_alloc_one Kevin Brodsky
2025-01-03 10:31   ` Alexandre Ghiti
2025-01-03 10:36     ` Kevin Brodsky
2024-12-19 16:44 ` [PATCH 06/10] asm-generic: pgalloc: Provide generic p4d_{alloc_one,free} Kevin Brodsky
2024-12-19 16:44 ` [PATCH 07/10] mm: Introduce ctor/dtor at P4D level Kevin Brodsky
2024-12-19 16:44 ` [PATCH 08/10] ARM: mm: Rename PGD helpers Kevin Brodsky
2024-12-19 16:44 ` [PATCH 09/10] asm-generic: pgalloc: Provide generic __pgd_{alloc,free} Kevin Brodsky
2024-12-19 16:44 ` [PATCH 10/10] mm: Introduce ctor/dtor at PGD level Kevin Brodsky
2024-12-19 17:13 ` [PATCH 00/10] Account page tables at all levels Dave Hansen
2024-12-20 10:58   ` Kevin Brodsky
2024-12-20 14:45     ` Dave Hansen
2024-12-20 19:31     ` Dave Hansen
2025-01-03  9:28       ` Kevin Brodsky
