* [PATCH v2 01/10] powerpc/pgtable-frag: Fix bad page state in pte_frag_destroy
2026-03-09 18:14 [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Ritesh Harjani (IBM)
@ 2026-03-09 18:14 ` Ritesh Harjani (IBM)
2026-03-09 18:14 ` [PATCH v2 02/10] powerpc/64s: Fix unmap race with PMD migration entries Ritesh Harjani (IBM)
` (10 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Ritesh Harjani (IBM) @ 2026-03-09 18:14 UTC (permalink / raw)
To: linuxppc-dev
Cc: Madhavan Srinivasan, Christophe Leroy, Venkat Rao Bagalkote,
Nicholas Piggin, Sayali Patil, Aboorva Devarajan, Donet Tom,
Ritesh Harjani (IBM)
powerpc uses pt_frag_refcount as a reference counter for tracking its
PTE and PMD page table fragments. For PTE tables, in the case of Hash
with 64K page size, one 64K page holds 16 fragments of 4K each.
Patch series [1] "mm: free retracted page table by RCU" added
pte_free_defer() to defer the freeing of PTE tables when
retract_page_tables() is called for madvise MADV_COLLAPSE on a shmem
range.
[1]: https://lore.kernel.org/all/7cd843a9-aa80-14f-5eb2-33427363c20@google.com/
pte_free_defer() sets the active flag on the corresponding fragment's
folio and calls pte_fragment_free(), which decrements pt_frag_refcount.
When pt_frag_refcount reaches 0 (no active fragment is using the folio),
it checks the folio's active flag: if set, it uses call_rcu() to free
the folio; if unset, it calls pte_free_now() directly.
This can lead to the following problem in a corner case...
[ 265.351553][ T183] BUG: Bad page state in process a.out pfn:20d62
[ 265.353555][ T183] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x20d62
[ 265.355457][ T183] flags: 0x3ffff800000100(active|node=0|zone=0|lastcpupid=0x7ffff)
[ 265.358719][ T183] raw: 003ffff800000100 0000000000000000 5deadbeef0000122 0000000000000000
[ 265.360177][ T183] raw: 0000000000000000 c0000000119caf58 00000000ffffffff 0000000000000000
[ 265.361438][ T183] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
[ 265.362572][ T183] Modules linked in:
[ 265.364622][ T183] CPU: 0 UID: 0 PID: 183 Comm: a.out Not tainted 6.18.0-rc3-00141-g1ddeaaace7ff-dirty #53 VOLUNTARY
[ 265.364785][ T183] Hardware name: IBM pSeries (emulated by qemu) POWER10 (architected) 0x801200 0xf000006 of:SLOF,git-ee03ae pSeries
[ 265.364908][ T183] Call Trace:
[ 265.364955][ T183] [c000000011e6f7c0] [c000000001cfaa18] dump_stack_lvl+0x130/0x148 (unreliable)
[ 265.365202][ T183] [c000000011e6f7f0] [c000000000794758] bad_page+0xb4/0x1c8
[ 265.365384][ T183] [c000000011e6f890] [c00000000079c020] __free_frozen_pages+0x838/0xd08
[ 265.365554][ T183] [c000000011e6f980] [c0000000000a70ac] pte_frag_destroy+0x298/0x310
[ 265.365729][ T183] [c000000011e6fa30] [c0000000000aa764] arch_exit_mmap+0x34/0x218
[ 265.365912][ T183] [c000000011e6fa80] [c000000000751698] exit_mmap+0xb8/0x820
[ 265.366080][ T183] [c000000011e6fc30] [c0000000001b1258] __mmput+0x98/0x300
[ 265.366244][ T183] [c000000011e6fc80] [c0000000001c81f8] do_exit+0x470/0x1508
[ 265.366421][ T183] [c000000011e6fd70] [c0000000001c95e4] do_group_exit+0x88/0x148
[ 265.366602][ T183] [c000000011e6fdc0] [c0000000001c96ec] pid_child_should_wake+0x0/0x178
[ 265.366780][ T183] [c000000011e6fdf0] [c00000000003a270] system_call_exception+0x1b0/0x4e0
[ 265.366958][ T183] [c000000011e6fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
The bad page state error occurs when such a folio gets freed (with the
active flag still set) from the do_exit() path in parallel.
... this can happen when pte fragments were allocated from this folio,
but even after all allocated fragments were freed, pt_frag_refcount
still accounted for the unused fragments. If the process then exits
with such a folio cached as pte_frag in its mm->context, then during
pte_frag_destroy() we simply call pagetable_dtor() and
pagetable_free(), without clearing the active flag. This can lead to
the above bug. Since we are anyway in the do_exit() path, once the
refcount reaches 0 it should be safe to simply clear the folio's
active flag before calling pagetable_dtor() & pagetable_free().
Fixes: 32cc0b7c9d50 ("powerpc: add pte_free_defer() for pgtables sharing page")
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
arch/powerpc/mm/pgtable-frag.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c
index 77e55eac16e4..ae742564a3d5 100644
--- a/arch/powerpc/mm/pgtable-frag.c
+++ b/arch/powerpc/mm/pgtable-frag.c
@@ -25,6 +25,7 @@ void pte_frag_destroy(void *pte_frag)
count = ((unsigned long)pte_frag & ~PAGE_MASK) >> PTE_FRAG_SIZE_SHIFT;
/* We allow PTE_FRAG_NR fragments from a PTE page */
if (atomic_sub_and_test(PTE_FRAG_NR - count, &ptdesc->pt_frag_refcount)) {
+ folio_clear_active(ptdesc_folio(ptdesc));
pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
--
2.50.1 (Apple Git-155)
^ permalink raw reply related [flat|nested] 14+ messages in thread

* [PATCH v2 02/10] powerpc/64s: Fix unmap race with PMD migration entries
2026-03-09 18:14 [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Ritesh Harjani (IBM)
2026-03-09 18:14 ` [PATCH v2 01/10] powerpc/pgtable-frag: Fix bad page state in pte_frag_destroy Ritesh Harjani (IBM)
@ 2026-03-09 18:14 ` Ritesh Harjani (IBM)
2026-03-09 18:14 ` [PATCH v2 03/10] powerpc/64s: Fix _HPAGE_CHG_MASK to include _PAGE_SPECIAL bit Ritesh Harjani (IBM)
` (9 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Ritesh Harjani (IBM) @ 2026-03-09 18:14 UTC (permalink / raw)
To: linuxppc-dev
Cc: Madhavan Srinivasan, Christophe Leroy, Venkat Rao Bagalkote,
Nicholas Piggin, Sayali Patil, Aboorva Devarajan, Donet Tom,
Ritesh Harjani (IBM), Pavithra Prakash
The following race is possible with migration swap entries or
device-private THP entries, e.g. when move_pages() is called on a PMD
THP page. There may be an intermediate state where the PMD entry acts
as a migration swap entry (pmd_present() is false). If an munmap()
happens at the same time, the VM_BUG_ON() in
pmdp_huge_get_and_clear_full() can be hit.
This patch fixes that.
Thread A: move_pages() syscall
add_folio_for_migration()
mmap_read_lock(mm)
folio_isolate_lru(folio)
mmap_read_unlock(mm)
do_move_pages_to_node()
migrate_pages()
try_to_migrate_one()
spin_lock(ptl)
set_pmd_migration_entry()
pmdp_invalidate() # PMD: _PAGE_INVALID | _PAGE_PTE | pfn
set_pmd_at() # PMD: migration swap entry (pmd_present=0)
spin_unlock(ptl)
[page copy phase] # <--- RACE WINDOW -->
Thread B: munmap()
mmap_write_downgrade(mm)
unmap_vmas() -> zap_pmd_range()
zap_huge_pmd()
__pmd_trans_huge_lock()
pmd_is_huge(): # !pmd_present && !pmd_none -> TRUE (swap entry)
pmd_lock() -> # spin_lock(ptl), waits for Thread A to release ptl
pmdp_huge_get_and_clear_full()
VM_BUG_ON(!pmd_present(*pmdp)) # HITS!
[ 287.738700][ T1867] ------------[ cut here ]------------
[ 287.743843][ T1867] kernel BUG at arch/powerpc/mm/book3s64/pgtable.c:187!
cpu 0x0: Vector: 700 (Program Check) at [c00000044037f4f0]
pc: c000000000094ca4: pmdp_huge_get_and_clear_full+0x6c/0x23c
lr: c000000000645dec: zap_huge_pmd+0xb0/0x868
sp: c00000044037f790
msr: 800000000282b033
current = 0xc0000004032c1a00
paca = 0xc000000004fe0000 irqmask: 0x03 irq_happened: 0x09
pid = 1867, comm = a.out
kernel BUG at :187!
Linux version 6.19.0-12136-g14360d4f917c-dirty (powerpc64le-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #27 SMP PREEMPT Sun Feb 22 10:38:56 IST 2026
enter ? for help
[link register ] c000000000645dec zap_huge_pmd+0xb0/0x868
[c00000044037f790] c00000044037f7d0 (unreliable)
[c00000044037f7d0] c000000000645dcc zap_huge_pmd+0x90/0x868
[c00000044037f840] c0000000005724cc unmap_page_range+0x176c/0x1f40
[c00000044037fa00] c000000000572ea0 unmap_vmas+0xb0/0x1d8
[c00000044037fa90] c0000000005af254 unmap_region+0xb4/0x128
[c00000044037fb50] c0000000005af400 vms_complete_munmap_vmas+0x138/0x310
[c00000044037fbe0] c0000000005b0f1c do_vmi_align_munmap+0x1ec/0x238
[c00000044037fd30] c0000000005b3688 __vm_munmap+0x170/0x1f8
[c00000044037fdf0] c000000000587f74 sys_munmap+0x2c/0x40
[c00000044037fe10] c000000000032668 system_call_exception+0x128/0x350
[c00000044037fe50] c00000000000d05c system_call_vectored_common+0x15c/0x2ec
---- Exception: 3000 (System Call Vectored) at 0000000010064a2c
SP (7fff9b1ee9c0) is in userspace
0:mon> zh
commit a30b48bf1b24 ("mm/migrate_device: implement THP migration of
zone device pages") enabled migration for device-private PMD entries.
Hence this is another path from which this warning can be triggered.
------------[ cut here ]------------
WARNING: arch/powerpc/mm/book3s64/hash_pgtable.c:199 at hash__pmd_hugepage_update+0x48/0x284, CPU#3: hmm-tests/1905
Modules linked in: test_hmm
CPU: 3 UID: 0 PID: 1905 Comm: hmm-tests Tainted: G B W L N 7.0.0-rc1-01438-g7e2f0ee7581c #21 PREEMPT
Tainted: [B]=BAD_PAGE, [W]=WARN, [L]=SOFTLOCKUP, [N]=TEST
Hardware name: IBM pSeries (emulated by qemu) POWER10 (architected) 0x801200 0xf000006 of:SLOF,git-ee03ae pSeries
NIP [c000000000096b70] hash__pmd_hugepage_update+0x48/0x284
LR [c000000000096e7c] hash__pmdp_huge_get_and_clear+0xd0/0xd4
Call Trace:
[c000000604707670] [c000000004e102b8] 0xc000000004e102b8 (unreliable)
[c000000604707700] [c00000000064ec3c] set_pmd_migration_entry+0x414/0x498
[c000000604707760] [c00000000063e5a4] migrate_vma_collect_pmd+0x12e8/0x16c4
[c000000604707890] [c00000000059282c] walk_pgd_range+0x7fc/0xd2c
[c000000604707990] [c000000000592e40] __walk_page_range+0xe4/0x2ac
[c000000604707a10] [c000000000593534] walk_page_range_mm_unsafe+0x204/0x2a4
[c000000604707ab0] [c00000000063af10] migrate_vma_setup+0x1dc/0x2e8
[c000000604707b10] [c008000006a21838] dmirror_migrate_to_system.constprop.0+0x210/0x4b0 [test_hmm]
[c000000604707c30] [c008000006a245b0] dmirror_fops_unlocked_ioctl+0x454/0xa5c [test_hmm]
[c000000604707d20] [c0000000006aab84] sys_ioctl+0x4ec/0x1178
[c000000604707e10] [c0000000000326a8] system_call_exception+0x128/0x350
[c000000604707e50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
---- interrupt: 3000 at 0x7fffbe44f50c
Fixes: 75358ea359e7c ("powerpc/mm/book3s64: Fix MADV_DONTNEED and parallel page fault race")
Fixes: a30b48bf1b24 ("mm/migrate_device: implement THP migration of zone device pages")
Reported-by: Pavithra Prakash <pavrampu@linux.vnet.ibm.com>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
arch/powerpc/include/asm/book3s/64/pgtable.h | 15 +++++++++++++++
arch/powerpc/mm/book3s64/pgtable.c | 13 +++++++++----
2 files changed, 24 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 639cbf34f752..43d442a80a23 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1336,12 +1336,27 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
{
pmd_t old_pmd;
+ /*
+ * Non-present PMDs can be migration entries or device-private THP
+ * entries. This can happen at 2 places:
+ * - When the address space is being unmapped via zap_huge_pmd() and we
+ * encounter non-present pmds.
+ * - migrate_vma_collect_huge_pmd() could call this during migration
+ * of device-private pmd entries.
+ */
+ if (!pmd_present(*pmdp)) {
+ old_pmd = READ_ONCE(*pmdp);
+ pmd_clear(pmdp);
+ goto out;
+ }
+
if (radix_enabled()) {
old_pmd = radix__pmdp_huge_get_and_clear(mm, addr, pmdp);
} else {
old_pmd = hash__pmdp_huge_get_and_clear(mm, addr, pmdp);
}
+out:
page_table_check_pmd_clear(mm, addr, old_pmd);
return old_pmd;
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 4b09c04654a8..42c7906d0e43 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -209,16 +209,21 @@ pmd_t pmdp_huge_get_and_clear_full(struct vm_area_struct *vma,
unsigned long addr, pmd_t *pmdp, int full)
{
pmd_t pmd;
+ bool was_present = pmd_present(*pmdp);
+
VM_BUG_ON(addr & ~HPAGE_PMD_MASK);
- VM_BUG_ON((pmd_present(*pmdp) && !pmd_trans_huge(*pmdp)) ||
- !pmd_present(*pmdp));
+ VM_BUG_ON(was_present && !pmd_trans_huge(*pmdp));
+ /*
+ * See pmdp_huge_get_and_clear() for the non-present pmd case.
+ */
pmd = pmdp_huge_get_and_clear(vma->vm_mm, addr, pmdp);
/*
* if it not a fullmm flush, then we can possibly end up converting
* this PMD pte entry to a regular level 0 PTE by a parallel page fault.
- * Make sure we flush the tlb in this case.
+ * Make sure we flush the tlb in this case. TLB flush not needed for
+ * non-present case.
*/
- if (!full)
+ if (was_present && !full)
flush_pmd_tlb_range(vma, addr, addr + HPAGE_PMD_SIZE);
return pmd;
}
--
2.50.1 (Apple Git-155)
* [PATCH v2 03/10] powerpc/64s: Fix _HPAGE_CHG_MASK to include _PAGE_SPECIAL bit
2026-03-09 18:14 [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Ritesh Harjani (IBM)
2026-03-09 18:14 ` [PATCH v2 01/10] powerpc/pgtable-frag: Fix bad page state in pte_frag_destroy Ritesh Harjani (IBM)
2026-03-09 18:14 ` [PATCH v2 02/10] powerpc/64s: Fix unmap race with PMD migration entries Ritesh Harjani (IBM)
@ 2026-03-09 18:14 ` Ritesh Harjani (IBM)
2026-03-09 18:14 ` [PATCH v2 04/10] powerpc/64s/tlbflush-radix: Remove unused radix__flush_tlb_pwc() Ritesh Harjani (IBM)
` (8 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Ritesh Harjani (IBM) @ 2026-03-09 18:14 UTC (permalink / raw)
To: linuxppc-dev
Cc: Madhavan Srinivasan, Christophe Leroy, Venkat Rao Bagalkote,
Nicholas Piggin, Sayali Patil, Aboorva Devarajan, Donet Tom,
Ritesh Harjani (IBM)
commit af38538801c6a ("mm/memory: factor out common code from
vm_normal_page_*()") added a VM_WARN_ON_ONCE for the huge zero pfn.
This can lead to the following call stack.
------------[ cut here ]------------
WARNING: mm/memory.c:735 at vm_normal_page_pmd+0xf0/0x140, CPU#19: hmm-tests/3366
NIP [c00000000078d0c0] vm_normal_page_pmd+0xf0/0x140
LR [c00000000078d060] vm_normal_page_pmd+0x90/0x140
Call Trace:
[c00000016f56f850] [c00000000078d060] vm_normal_page_pmd+0x90/0x140 (unreliable)
[c00000016f56f8a0] [c0000000008a9e30] change_huge_pmd+0x7c0/0x870
[c00000016f56f930] [c0000000007b2bc4] change_protection+0x17a4/0x1e10
[c00000016f56fba0] [c0000000007b3440] mprotect_fixup+0x210/0x4c0
[c00000016f56fc30] [c0000000007b3c3c] do_mprotect_pkey+0x54c/0x780
[c00000016f56fdb0] [c0000000007b3ed8] sys_mprotect+0x68/0x90
[c00000016f56fdf0] [c00000000003ae40] system_call_exception+0x190/0x500
[c00000016f56fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
This happens when we call mprotect -> change_huge_pmd()
mprotect()
change_pmd_range()
pmd_modify(oldpmd, newprot) # this clears _PAGE_SPECIAL for zero huge pmd
pmdv = pmd_val(pmd);
pmdv &= _HPAGE_CHG_MASK; # -> gets cleared here
return pmd_set_protbits(__pmd(pmdv), newprot);
can_change_pmd_writable(vma, vmf->address, pmd)
vm_normal_page_pmd(vma, addr, pmd)
__vm_normal_page()
VM_WARN_ON(is_zero_pfn(pfn) || is_huge_zero_pfn(pfn)); # this gets hit as _PAGE_SPECIAL for the zero huge pmd was cleared.
It can be easily reproduced with the following testcase:

p = mmap(NULL, 2 * hpage_pmd_size, PROT_READ,
         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
madvise((void *)p, 2 * hpage_pmd_size, MADV_HUGEPAGE);
aligned = (char *)(((unsigned long)p + hpage_pmd_size - 1) &
                   ~(hpage_pmd_size - 1));
(void)(*(volatile char *)aligned); /* read fault installs the huge zero PMD */
mprotect((void *)aligned, hpage_pmd_size, PROT_READ | PROT_WRITE);
This patch adds _PAGE_SPECIAL to _HPAGE_CHG_MASK similar to
_PAGE_CHG_MASK, as we don't want to clear this bit when calling
pmd_modify() while changing protection bits.
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
arch/powerpc/include/asm/book3s/64/pgtable.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 43d442a80a23..6be7428fdde4 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -107,8 +107,8 @@
* in here, on radix we expect them to be zero.
*/
#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
- _PAGE_ACCESSED | H_PAGE_THP_HUGE | _PAGE_PTE | \
- _PAGE_SOFT_DIRTY)
+ _PAGE_ACCESSED | H_PAGE_THP_HUGE | _PAGE_SPECIAL | \
+ _PAGE_PTE | _PAGE_SOFT_DIRTY)
/*
* user access blocked by key
*/
--
2.50.1 (Apple Git-155)
* [PATCH v2 04/10] powerpc/64s/tlbflush-radix: Remove unused radix__flush_tlb_pwc()
2026-03-09 18:14 [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Ritesh Harjani (IBM)
` (2 preceding siblings ...)
2026-03-09 18:14 ` [PATCH v2 03/10] powerpc/64s: Fix _HPAGE_CHG_MASK to include _PAGE_SPECIAL bit Ritesh Harjani (IBM)
@ 2026-03-09 18:14 ` Ritesh Harjani (IBM)
2026-03-09 18:14 ` [PATCH v2 05/10] powerpc/64s: Move serialize_against_pte_lookup() to hash_pgtable.c Ritesh Harjani (IBM)
` (7 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Ritesh Harjani (IBM) @ 2026-03-09 18:14 UTC (permalink / raw)
To: linuxppc-dev
Cc: Madhavan Srinivasan, Christophe Leroy, Venkat Rao Bagalkote,
Nicholas Piggin, Sayali Patil, Aboorva Devarajan, Donet Tom,
Ritesh Harjani (IBM)
Commit 52162ec784fa
("powerpc/mm/book3s64/radix: Use freed_tables instead of need_flush_all")
removed the radix__flush_tlb_pwc() definition, but missed removing the
extern declaration. This patch removes it.
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
arch/powerpc/include/asm/book3s/64/tlbflush-radix.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index a38542259fab..de9b96660582 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -92,7 +92,6 @@ extern void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmad
#define radix__flush_tlb_page(vma,addr) radix__local_flush_tlb_page(vma,addr)
#define radix__flush_tlb_page_psize(mm,addr,p) radix__local_flush_tlb_page_psize(mm,addr,p)
#endif
-extern void radix__flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr);
extern void radix__flush_tlb_collapsed_pmd(struct mm_struct *mm, unsigned long addr);
extern void radix__flush_tlb_all(void);
--
2.50.1 (Apple Git-155)
* [PATCH v2 05/10] powerpc/64s: Move serialize_against_pte_lookup() to hash_pgtable.c
2026-03-09 18:14 [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Ritesh Harjani (IBM)
` (3 preceding siblings ...)
2026-03-09 18:14 ` [PATCH v2 04/10] powerpc/64s/tlbflush-radix: Remove unused radix__flush_tlb_pwc() Ritesh Harjani (IBM)
@ 2026-03-09 18:14 ` Ritesh Harjani (IBM)
2026-03-09 18:14 ` [PATCH v2 06/10] powerpc/64s: Kill the unused argument of exit_lazy_flush_tlb Ritesh Harjani (IBM)
` (6 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Ritesh Harjani (IBM) @ 2026-03-09 18:14 UTC (permalink / raw)
To: linuxppc-dev
Cc: Madhavan Srinivasan, Christophe Leroy, Venkat Rao Bagalkote,
Nicholas Piggin, Sayali Patil, Aboorva Devarajan, Donet Tom,
Ritesh Harjani (IBM)
Originally, commit fa4531f753f1 ("powerpc/mm: Don't send IPI to all
cpus on THP updates") introduced the serialize_against_pte_lookup()
call for both Radix and Hash.
However, commit 70cbc3cc78a9 ("mm: gup: fix the fast GUP race against
THP collapse") fixed that race for Radix, and commit bedf03416913
("powerpc/64s/radix: don't need to broadcast IPI for radix pmd
collapse flush") therefore removed the serialize_against_pte_lookup()
call from radix_pgtable.c.
Since serialize_against_pte_lookup() now only gets called from
hash__pmdp_collapse_flush(), move the related functions to
hash_pgtable.c.
Hence this patch:
- moves serialize_against_pte_lookup() from radix_pgtable.c to hash_pgtable.c
- removes the radix specific calls from do_serialize()
- renames do_serialize() to do_nothing()
There should be no functional change in this patch.
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
arch/powerpc/include/asm/book3s/64/pgtable.h | 1 -
arch/powerpc/mm/book3s64/hash_pgtable.c | 21 ++++++++++++++++
arch/powerpc/mm/book3s64/pgtable.c | 25 --------------------
3 files changed, 21 insertions(+), 26 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 6be7428fdde4..1b8916618f89 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1438,7 +1438,6 @@ static inline bool arch_needs_pgtable_deposit(void)
return false;
return true;
}
-extern void serialize_against_pte_lookup(struct mm_struct *mm);
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
diff --git a/arch/powerpc/mm/book3s64/hash_pgtable.c b/arch/powerpc/mm/book3s64/hash_pgtable.c
index ac2a24d15d2e..d9b5b751d7b7 100644
--- a/arch/powerpc/mm/book3s64/hash_pgtable.c
+++ b/arch/powerpc/mm/book3s64/hash_pgtable.c
@@ -221,6 +221,27 @@ unsigned long hash__pmd_hugepage_update(struct mm_struct *mm, unsigned long addr
return old;
}
+static void do_nothing(void *arg)
+{
+
+}
+
+/*
+ * Serialize against __find_linux_pte() which does lock-less
+ * lookup in page tables with local interrupts disabled. For huge pages
+ * it casts pmd_t to pte_t. Since format of pte_t is different from
+ * pmd_t we want to prevent transit from pmd pointing to page table
+ * to pmd pointing to huge page (and back) while interrupts are disabled.
+ * We clear pmd to possibly replace it with page table pointer in
+ * different code paths. So make sure we wait for the parallel
+ * __find_linux_pte() to finish.
+ */
+static void serialize_against_pte_lookup(struct mm_struct *mm)
+{
+ smp_mb();
+ smp_call_function_many(mm_cpumask(mm), do_nothing, mm, 1);
+}
+
pmd_t hash__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long address,
pmd_t *pmdp)
{
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 42c7906d0e43..faec2dc71a5c 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -150,31 +150,6 @@ void set_pud_at(struct mm_struct *mm, unsigned long addr,
return set_pte_at_unchecked(mm, addr, pudp_ptep(pudp), pud_pte(pud));
}
-static void do_serialize(void *arg)
-{
- /* We've taken the IPI, so try to trim the mask while here */
- if (radix_enabled()) {
- struct mm_struct *mm = arg;
- exit_lazy_flush_tlb(mm, false);
- }
-}
-
-/*
- * Serialize against __find_linux_pte() which does lock-less
- * lookup in page tables with local interrupts disabled. For huge pages
- * it casts pmd_t to pte_t. Since format of pte_t is different from
- * pmd_t we want to prevent transit from pmd pointing to page table
- * to pmd pointing to huge page (and back) while interrupts are disabled.
- * We clear pmd to possibly replace it with page table pointer in
- * different code paths. So make sure we wait for the parallel
- * __find_linux_pte() to finish.
- */
-void serialize_against_pte_lookup(struct mm_struct *mm)
-{
- smp_mb();
- smp_call_function_many(mm_cpumask(mm), do_serialize, mm, 1);
-}
-
/*
* We use this to invalidate a pmdp entry before switching from a
* hugepte to regular pmd entry.
--
2.50.1 (Apple Git-155)
* [PATCH v2 06/10] powerpc/64s: Kill the unused argument of exit_lazy_flush_tlb
2026-03-09 18:14 [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Ritesh Harjani (IBM)
` (4 preceding siblings ...)
2026-03-09 18:14 ` [PATCH v2 05/10] powerpc/64s: Move serialize_against_pte_lookup() to hash_pgtable.c Ritesh Harjani (IBM)
@ 2026-03-09 18:14 ` Ritesh Harjani (IBM)
2026-03-09 18:14 ` [PATCH v2 07/10] powerpc/64s: Rename tlbie_va_lpid to tlbie_va_pid_lpid Ritesh Harjani (IBM)
` (5 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Ritesh Harjani (IBM) @ 2026-03-09 18:14 UTC (permalink / raw)
To: linuxppc-dev
Cc: Madhavan Srinivasan, Christophe Leroy, Venkat Rao Bagalkote,
Nicholas Piggin, Sayali Patil, Aboorva Devarajan, Donet Tom,
Ritesh Harjani (IBM)
The previous patch removed the only caller of exit_lazy_flush_tlb()
that was passing always_flush = false as its second argument.
With that gone, all callers of exit_lazy_flush_tlb() are local to
radix_tlb.c and the additional argument is no longer needed.
This patch does the required cleanup. There should be no functional
change in this patch.
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
arch/powerpc/mm/book3s64/internal.h | 2 --
arch/powerpc/mm/book3s64/pgtable.c | 2 --
arch/powerpc/mm/book3s64/radix_tlb.c | 14 +++++---------
3 files changed, 5 insertions(+), 13 deletions(-)
diff --git a/arch/powerpc/mm/book3s64/internal.h b/arch/powerpc/mm/book3s64/internal.h
index cad08d83369c..f7055251c8b7 100644
--- a/arch/powerpc/mm/book3s64/internal.h
+++ b/arch/powerpc/mm/book3s64/internal.h
@@ -31,6 +31,4 @@ static inline bool slb_preload_disabled(void)
void hpt_do_stress(unsigned long ea, unsigned long hpte_group);
-void exit_lazy_flush_tlb(struct mm_struct *mm, bool always_flush);
-
#endif /* ARCH_POWERPC_MM_BOOK3S64_INTERNAL_H */
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index faec2dc71a5c..d32197d3298a 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -23,8 +23,6 @@
#include <mm/mmu_decl.h>
#include <trace/events/thp.h>
-#include "internal.h"
-
struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT];
EXPORT_SYMBOL_GPL(mmu_psize_defs);
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 9e1f6558d026..339bd276840b 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -19,8 +19,6 @@
#include <asm/cputhreads.h>
#include <asm/plpar_wrappers.h>
-#include "internal.h"
-
/*
* tlbiel instruction for radix, set invalidation
* i.e., r=1 and is=01 or is=10 or is=11
@@ -660,7 +658,7 @@ static bool mm_needs_flush_escalation(struct mm_struct *mm)
* If always_flush is true, then flush even if this CPU can't be removed
* from mm_cpumask.
*/
-void exit_lazy_flush_tlb(struct mm_struct *mm, bool always_flush)
+static void exit_lazy_flush_tlb(struct mm_struct *mm)
{
unsigned long pid = mm->context.id;
int cpu = smp_processor_id();
@@ -703,19 +701,17 @@ void exit_lazy_flush_tlb(struct mm_struct *mm, bool always_flush)
if (cpumask_test_cpu(cpu, mm_cpumask(mm))) {
dec_mm_active_cpus(mm);
cpumask_clear_cpu(cpu, mm_cpumask(mm));
- always_flush = true;
}
out:
- if (always_flush)
- _tlbiel_pid(pid, RIC_FLUSH_ALL);
+ _tlbiel_pid(pid, RIC_FLUSH_ALL);
}
#ifdef CONFIG_SMP
static void do_exit_flush_lazy_tlb(void *arg)
{
struct mm_struct *mm = arg;
- exit_lazy_flush_tlb(mm, true);
+ exit_lazy_flush_tlb(mm);
}
static void exit_flush_lazy_tlbs(struct mm_struct *mm)
@@ -777,7 +773,7 @@ static enum tlb_flush_type flush_type_needed(struct mm_struct *mm, bool fullmm)
* to trim.
*/
if (tick_and_test_trim_clock()) {
- exit_lazy_flush_tlb(mm, true);
+ exit_lazy_flush_tlb(mm);
return FLUSH_TYPE_NONE;
}
}
@@ -823,7 +819,7 @@ static enum tlb_flush_type flush_type_needed(struct mm_struct *mm, bool fullmm)
if (current->mm == mm)
return FLUSH_TYPE_LOCAL;
if (cpumask_test_cpu(cpu, mm_cpumask(mm)))
- exit_lazy_flush_tlb(mm, true);
+ exit_lazy_flush_tlb(mm);
return FLUSH_TYPE_NONE;
}
--
2.50.1 (Apple Git-155)
* [PATCH v2 07/10] powerpc/64s: Rename tlbie_va_lpid to tlbie_va_pid_lpid
2026-03-09 18:14 [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Ritesh Harjani (IBM)
` (5 preceding siblings ...)
2026-03-09 18:14 ` [PATCH v2 06/10] powerpc/64s: Kill the unused argument of exit_lazy_flush_tlb Ritesh Harjani (IBM)
@ 2026-03-09 18:14 ` Ritesh Harjani (IBM)
2026-03-09 18:14 ` [PATCH v2 08/10] powerpc/64s: Rename tlbie_lpid_va to tlbie_va_lpid Ritesh Harjani (IBM)
` (4 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Ritesh Harjani (IBM) @ 2026-03-09 18:14 UTC (permalink / raw)
To: linuxppc-dev
Cc: Madhavan Srinivasan, Christophe Leroy, Venkat Rao Bagalkote,
Nicholas Piggin, Sayali Patil, Aboorva Devarajan, Donet Tom,
Ritesh Harjani (IBM)
It makes sense to rename these functions so that the names better
reflect what they are supposed to do. For example, the name
__tlbie_va_pid_lpid() better reflects that it invalidates TLB entries
using the VA, PID and LPID.
No functional change in this patch.
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
arch/powerpc/mm/book3s64/radix_tlb.c | 20 ++++++++++----------
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 339bd276840b..1adf20798ca6 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -1411,7 +1411,7 @@ static __always_inline void __tlbie_pid_lpid(unsigned long pid,
trace_tlbie(0, 0, rb, rs, ric, prs, r);
}
-static __always_inline void __tlbie_va_lpid(unsigned long va, unsigned long pid,
+static __always_inline void __tlbie_va_pid_lpid(unsigned long va, unsigned long pid,
unsigned long lpid,
unsigned long ap, unsigned long ric)
{
@@ -1443,7 +1443,7 @@ static inline void fixup_tlbie_pid_lpid(unsigned long pid, unsigned long lpid)
if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) {
asm volatile("ptesync" : : : "memory");
- __tlbie_va_lpid(va, pid, lpid, mmu_get_ap(MMU_PAGE_64K),
+ __tlbie_va_pid_lpid(va, pid, lpid, mmu_get_ap(MMU_PAGE_64K),
RIC_FLUSH_TLB);
}
}
@@ -1474,7 +1474,7 @@ static inline void _tlbie_pid_lpid(unsigned long pid, unsigned long lpid,
asm volatile("eieio; tlbsync; ptesync" : : : "memory");
}
-static inline void fixup_tlbie_va_range_lpid(unsigned long va,
+static inline void fixup_tlbie_va_range_pid_lpid(unsigned long va,
unsigned long pid,
unsigned long lpid,
unsigned long ap)
@@ -1486,11 +1486,11 @@ static inline void fixup_tlbie_va_range_lpid(unsigned long va,
if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) {
asm volatile("ptesync" : : : "memory");
- __tlbie_va_lpid(va, pid, lpid, ap, RIC_FLUSH_TLB);
+ __tlbie_va_pid_lpid(va, pid, lpid, ap, RIC_FLUSH_TLB);
}
}
-static inline void __tlbie_va_range_lpid(unsigned long start, unsigned long end,
+static inline void __tlbie_va_range_pid_lpid(unsigned long start, unsigned long end,
unsigned long pid, unsigned long lpid,
unsigned long page_size,
unsigned long psize)
@@ -1499,12 +1499,12 @@ static inline void __tlbie_va_range_lpid(unsigned long start, unsigned long end,
unsigned long ap = mmu_get_ap(psize);
for (addr = start; addr < end; addr += page_size)
- __tlbie_va_lpid(addr, pid, lpid, ap, RIC_FLUSH_TLB);
+ __tlbie_va_pid_lpid(addr, pid, lpid, ap, RIC_FLUSH_TLB);
- fixup_tlbie_va_range_lpid(addr - page_size, pid, lpid, ap);
+ fixup_tlbie_va_range_pid_lpid(addr - page_size, pid, lpid, ap);
}
-static inline void _tlbie_va_range_lpid(unsigned long start, unsigned long end,
+static inline void _tlbie_va_range_pid_lpid(unsigned long start, unsigned long end,
unsigned long pid, unsigned long lpid,
unsigned long page_size,
unsigned long psize, bool also_pwc)
@@ -1512,7 +1512,7 @@ static inline void _tlbie_va_range_lpid(unsigned long start, unsigned long end,
asm volatile("ptesync" : : : "memory");
if (also_pwc)
__tlbie_pid_lpid(pid, lpid, RIC_FLUSH_PWC);
- __tlbie_va_range_lpid(start, end, pid, lpid, page_size, psize);
+ __tlbie_va_range_pid_lpid(start, end, pid, lpid, page_size, psize);
asm volatile("eieio; tlbsync; ptesync" : : : "memory");
}
@@ -1563,7 +1563,7 @@ void do_h_rpt_invalidate_prt(unsigned long pid, unsigned long lpid,
_tlbie_pid_lpid(pid, lpid, RIC_FLUSH_TLB);
return;
}
- _tlbie_va_range_lpid(start, end, pid, lpid,
+ _tlbie_va_range_pid_lpid(start, end, pid, lpid,
(1UL << def->shift), psize, false);
}
}
--
2.50.1 (Apple Git-155)
^ permalink raw reply related	[flat|nested] 14+ messages in thread
* [PATCH v2 08/10] powerpc/64s: Rename tlbie_lpid_va to tlbie_va_lpid
2026-03-09 18:14 [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Ritesh Harjani (IBM)
` (6 preceding siblings ...)
2026-03-09 18:14 ` [PATCH v2 07/10] powerpc/64s: Rename tlbie_va_lpid to tlbie_va_pid_lpid Ritesh Harjani (IBM)
@ 2026-03-09 18:14 ` Ritesh Harjani (IBM)
2026-03-09 18:14 ` [PATCH v2 09/10] powerpc/64s: Make use of H_RPTI_TYPE_ALL macro Ritesh Harjani (IBM)
` (3 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Ritesh Harjani (IBM) @ 2026-03-09 18:14 UTC (permalink / raw)
To: linuxppc-dev
Cc: Madhavan Srinivasan, Christophe Leroy, Venkat Rao Bagalkote,
Nicholas Piggin, Sayali Patil, Aboorva Devarajan, Donet Tom,
Ritesh Harjani (IBM)
The previous patch renamed the tlbie_va_lpid functions to
tlbie_va_pid_lpid(), since they operate on PIDs as well. That frees up
the tlbie_va_lpid name, so the tlbie_lpid_va functions can now be
renamed to tlbie_va_lpid, making all the tlbie function names
consistent.
No functional change in this patch.
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
arch/powerpc/mm/book3s64/radix_tlb.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 1adf20798ca6..6ce94eaefc1b 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -185,7 +185,7 @@ static __always_inline void __tlbie_va(unsigned long va, unsigned long pid,
trace_tlbie(0, 0, rb, rs, ric, prs, r);
}
-static __always_inline void __tlbie_lpid_va(unsigned long va, unsigned long lpid,
+static __always_inline void __tlbie_va_lpid(unsigned long va, unsigned long lpid,
unsigned long ap, unsigned long ric)
{
unsigned long rb,rs,prs,r;
@@ -249,17 +249,17 @@ static inline void fixup_tlbie_pid(unsigned long pid)
}
}
-static inline void fixup_tlbie_lpid_va(unsigned long va, unsigned long lpid,
+static inline void fixup_tlbie_va_lpid(unsigned long va, unsigned long lpid,
unsigned long ap)
{
if (cpu_has_feature(CPU_FTR_P9_TLBIE_ERAT_BUG)) {
asm volatile("ptesync": : :"memory");
- __tlbie_lpid_va(va, 0, ap, RIC_FLUSH_TLB);
+ __tlbie_va_lpid(va, 0, ap, RIC_FLUSH_TLB);
}
if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) {
asm volatile("ptesync": : :"memory");
- __tlbie_lpid_va(va, lpid, ap, RIC_FLUSH_TLB);
+ __tlbie_va_lpid(va, lpid, ap, RIC_FLUSH_TLB);
}
}
@@ -278,7 +278,7 @@ static inline void fixup_tlbie_lpid(unsigned long lpid)
if (cpu_has_feature(CPU_FTR_P9_TLBIE_STQ_BUG)) {
asm volatile("ptesync": : :"memory");
- __tlbie_lpid_va(va, lpid, mmu_get_ap(MMU_PAGE_64K), RIC_FLUSH_TLB);
+ __tlbie_va_lpid(va, lpid, mmu_get_ap(MMU_PAGE_64K), RIC_FLUSH_TLB);
}
}
@@ -529,14 +529,14 @@ static void do_tlbiel_va_range(void *info)
t->psize, t->also_pwc);
}
-static __always_inline void _tlbie_lpid_va(unsigned long va, unsigned long lpid,
+static __always_inline void _tlbie_va_lpid(unsigned long va, unsigned long lpid,
unsigned long psize, unsigned long ric)
{
unsigned long ap = mmu_get_ap(psize);
asm volatile("ptesync": : :"memory");
- __tlbie_lpid_va(va, lpid, ap, ric);
- fixup_tlbie_lpid_va(va, lpid, ap);
+ __tlbie_va_lpid(va, lpid, ap, ric);
+ fixup_tlbie_va_lpid(va, lpid, ap);
asm volatile("eieio; tlbsync; ptesync": : :"memory");
}
@@ -1147,7 +1147,7 @@ void radix__flush_tlb_lpid_page(unsigned int lpid,
{
int psize = radix_get_mmu_psize(page_size);
- _tlbie_lpid_va(addr, lpid, psize, RIC_FLUSH_TLB);
+ _tlbie_va_lpid(addr, lpid, psize, RIC_FLUSH_TLB);
}
EXPORT_SYMBOL_GPL(radix__flush_tlb_lpid_page);
--
2.50.1 (Apple Git-155)
* [PATCH v2 09/10] powerpc/64s: Make use of H_RPTI_TYPE_ALL macro
2026-03-09 18:14 [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Ritesh Harjani (IBM)
` (7 preceding siblings ...)
2026-03-09 18:14 ` [PATCH v2 08/10] powerpc/64s: Rename tlbie_lpid_va to tlbie_va_lpid Ritesh Harjani (IBM)
@ 2026-03-09 18:14 ` Ritesh Harjani (IBM)
2026-03-09 18:14 ` [PATCH v2 10/10] powerpc: Print MMU_FTRS_POSSIBLE & MMU_FTRS_ALWAYS at startup Ritesh Harjani (IBM)
` (2 subsequent siblings)
11 siblings, 0 replies; 14+ messages in thread
From: Ritesh Harjani (IBM) @ 2026-03-09 18:14 UTC (permalink / raw)
To: linuxppc-dev
Cc: Madhavan Srinivasan, Christophe Leroy, Venkat Rao Bagalkote,
Nicholas Piggin, Sayali Patil, Aboorva Devarajan, Donet Tom,
Ritesh Harjani (IBM)
Instead of open-coding the flag combination, use the predefined
H_RPTI_TYPE_ALL macro in the following places.
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
arch/powerpc/mm/book3s64/radix_tlb.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 6ce94eaefc1b..7de5760164a9 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -885,8 +885,7 @@ static void __flush_all_mm(struct mm_struct *mm, bool fullmm)
} else if (type == FLUSH_TYPE_GLOBAL) {
if (!mmu_has_feature(MMU_FTR_GTSE)) {
unsigned long tgt = H_RPTI_TARGET_CMMU;
- unsigned long type = H_RPTI_TYPE_TLB | H_RPTI_TYPE_PWC |
- H_RPTI_TYPE_PRT;
+ unsigned long type = H_RPTI_TYPE_ALL;
if (atomic_read(&mm->context.copros) > 0)
tgt |= H_RPTI_TARGET_NMMU;
@@ -982,8 +981,7 @@ void radix__flush_tlb_kernel_range(unsigned long start, unsigned long end)
{
if (!mmu_has_feature(MMU_FTR_GTSE)) {
unsigned long tgt = H_RPTI_TARGET_CMMU | H_RPTI_TARGET_NMMU;
- unsigned long type = H_RPTI_TYPE_TLB | H_RPTI_TYPE_PWC |
- H_RPTI_TYPE_PRT;
+ unsigned long type = H_RPTI_TYPE_ALL;
pseries_rpt_invalidate(0, tgt, type, H_RPTI_PAGE_ALL,
start, end);
@@ -1337,8 +1335,7 @@ void radix__flush_tlb_collapsed_pmd(struct mm_struct *mm, unsigned long addr)
unsigned long tgt, type, pg_sizes;
tgt = H_RPTI_TARGET_CMMU;
- type = H_RPTI_TYPE_TLB | H_RPTI_TYPE_PWC |
- H_RPTI_TYPE_PRT;
+ type = H_RPTI_TYPE_ALL;
pg_sizes = psize_to_rpti_pgsize(mmu_virtual_psize);
if (atomic_read(&mm->context.copros) > 0)
--
2.50.1 (Apple Git-155)
* [PATCH v2 10/10] powerpc: Print MMU_FTRS_POSSIBLE &amp; MMU_FTRS_ALWAYS at startup
2026-03-09 18:14 [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Ritesh Harjani (IBM)
` (8 preceding siblings ...)
2026-03-09 18:14 ` [PATCH v2 09/10] powerpc/64s: Make use of H_RPTI_TYPE_ALL macro Ritesh Harjani (IBM)
@ 2026-03-09 18:14 ` Ritesh Harjani (IBM)
2026-03-10 13:46 ` [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Venkat Rao Bagalkote
2026-03-30 10:21 ` Madhavan Srinivasan
11 siblings, 0 replies; 14+ messages in thread
From: Ritesh Harjani (IBM) @ 2026-03-09 18:14 UTC (permalink / raw)
To: linuxppc-dev
Cc: Madhavan Srinivasan, Christophe Leroy, Venkat Rao Bagalkote,
Nicholas Piggin, Sayali Patil, Aboorva Devarajan, Donet Tom,
Ritesh Harjani (IBM)
Similar to CPU_FTRS_[POSSIBLE|ALWAYS], also print
MMU_FTRS_[POSSIBLE|ALWAYS]. This is useful data to capture during
bootup.
Reviewed-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org>
Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
---
arch/powerpc/kernel/setup-common.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index cb5b73adc250..002b312eb7e9 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -866,6 +866,10 @@ static __init void print_system_info(void)
cur_cpu_spec->cpu_user_features,
cur_cpu_spec->cpu_user_features2);
pr_info("mmu_features = 0x%08x\n", cur_cpu_spec->mmu_features);
+ pr_info(" possible = 0x%016lx\n",
+ (unsigned long)MMU_FTRS_POSSIBLE);
+ pr_info(" always = 0x%016lx\n",
+ (unsigned long)MMU_FTRS_ALWAYS);
#ifdef CONFIG_PPC64
pr_info("firmware_features = 0x%016lx\n", powerpc_firmware_features);
#ifdef CONFIG_PPC_BOOK3S
--
2.50.1 (Apple Git-155)
* Re: [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups
2026-03-09 18:14 [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Ritesh Harjani (IBM)
` (9 preceding siblings ...)
2026-03-09 18:14 ` [PATCH v2 10/10] powerpc: Print MMU_FTRS_POSSIBLE & MMU_FTRS_ALWAYS at startup Ritesh Harjani (IBM)
@ 2026-03-10 13:46 ` Venkat Rao Bagalkote
2026-03-11 2:10 ` Ritesh Harjani
2026-03-30 10:21 ` Madhavan Srinivasan
11 siblings, 1 reply; 14+ messages in thread
From: Venkat Rao Bagalkote @ 2026-03-10 13:46 UTC (permalink / raw)
To: Ritesh Harjani (IBM), linuxppc-dev
Cc: Madhavan Srinivasan, Christophe Leroy, Nicholas Piggin,
Sayali Patil, Aboorva Devarajan, Donet Tom
On 09/03/26 11:44 pm, Ritesh Harjani (IBM) wrote:
> v1->v2:
>
> - dropped debug_vm_pgtable patch which adds a testcase to simulate the
> failure scenario. Since it belongs to linux-mm, I will send that out
> separately.
> - Modified Patch-2 in this series to also cover PMD device migration
> entry (in addition to PMD THP migration entry). Hence dropped the
> previous RB tag.
> - Added a new Patch-3 to fix another selftests WARNING.
> - Fixed commit subject of Patch-10.
> - Changed subject pre-fix of few patches to be consistent with others
> (powerpc/64s)
> - Added RB tags
>
> This patch series addresses selftests issues w.r.t warnings or
> VM_BUG_ONs seen mainly on book3s64 powerpc kernel. This also carries
> cleanups and refactoring changes which I identified while reviewing
> other's patches and/or during code walkthrough.
>
> Suggestions and feedback are welcome!
>
> Ritesh Harjani (IBM) (10):
> powerpc/pgtable-frag: Fix bad page state in pte_frag_destroy
> powerpc/64s: Fix unmap race with PMD migration entries
> powerpc/64s: Fix _HPAGE_CHG_MASK to include _PAGE_SPECIAL bit
> powerpc/64s/tlbflush-radix: Remove unused radix__flush_tlb_pwc()
> powerpc/64s: Move serialize_against_pte_lookup() to hash_pgtable.c
> powerpc/64s: Kill the unused argument of exit_lazy_flush_tlb
> powerpc/64s: Rename tlbie_va_lpid to tlbie_va_pid_lpid
> powerpc/64s: Rename tlbie_lpid_va to tlbie_va_lpid
> powerpc/64s: Make use of H_RPTI_TYPE_ALL macro
> powerpc: Print MMU_FTRS_POSSIBLE & MMU_FTRS_ALWAYS at startup
>
> arch/powerpc/include/asm/book3s/64/pgtable.h | 20 +++++-
> .../include/asm/book3s/64/tlbflush-radix.h | 1 -
> arch/powerpc/kernel/setup-common.c | 4 ++
> arch/powerpc/mm/book3s64/hash_pgtable.c | 21 +++++++
> arch/powerpc/mm/book3s64/internal.h | 2 -
> arch/powerpc/mm/book3s64/pgtable.c | 40 +++---------
> arch/powerpc/mm/book3s64/radix_tlb.c | 61 ++++++++-----------
> arch/powerpc/mm/pgtable-frag.c | 1 +
> 8 files changed, 79 insertions(+), 71 deletions(-)
>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
I applied the patch series on top of mainline and verified that the kernel
builds and boots successfully.
I also ran the following test suites on both RADIX (POWER11) and
HASH (POWER9) MMU configurations:
- tools/testing/selftests/mm
- tools/testing/selftests/memory-hotplug
- tools/testing/selftests/powerpc/mm
- tools/testing/selftests/powerpc/cache_shape
- tools/testing/selftests/powerpc/copyloops
In addition, I executed basic sanity and stress tests, including:
stutter, eatmemory, hugepage_sanity, fork_mem, memory_api, mprotect,
vatest, and several transparent-hugepage sanity checks.
All tests passed without regressions.
Regards,
Venkat
* Re: [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups
2026-03-10 13:46 ` [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Venkat Rao Bagalkote
@ 2026-03-11 2:10 ` Ritesh Harjani
0 siblings, 0 replies; 14+ messages in thread
From: Ritesh Harjani @ 2026-03-11 2:10 UTC (permalink / raw)
To: Venkat Rao Bagalkote, linuxppc-dev
Cc: Madhavan Srinivasan, Christophe Leroy, Nicholas Piggin,
Sayali Patil, Aboorva Devarajan, Donet Tom
Venkat Rao Bagalkote <venkat88@linux.ibm.com> writes:
> On 09/03/26 11:44 pm, Ritesh Harjani (IBM) wrote:
>> v1->v2:
>>
>> - dropped debug_vm_pgtable patch which adds a testcase to simulate the
>> failure scenario. Since it belongs to linux-mm, I will send that out
>> separately.
>> - Modified Patch-2 in this series to also cover PMD device migration
>> entry (in addition to PMD THP migration entry). Hence dropped the
>> previous RB tag.
>> - Added a new Patch-3 to fix another selftests WARNING.
>> - Fixed commit subject of Patch-10.
>> - Changed subject pre-fix of few patches to be consistent with others
>> (powerpc/64s)
>> - Added RB tags
>>
>> This patch series addresses selftests issues w.r.t warnings or
>> VM_BUG_ONs seen mainly on book3s64 powerpc kernel. This also carries
>> cleanups and refactoring changes which I identified while reviewing
>> other's patches and/or during code walkthrough.
>>
>> Suggestions and feedback are welcome!
>>
>> Ritesh Harjani (IBM) (10):
>> powerpc/pgtable-frag: Fix bad page state in pte_frag_destroy
>> powerpc/64s: Fix unmap race with PMD migration entries
>> powerpc/64s: Fix _HPAGE_CHG_MASK to include _PAGE_SPECIAL bit
>> powerpc/64s/tlbflush-radix: Remove unused radix__flush_tlb_pwc()
>> powerpc/64s: Move serialize_against_pte_lookup() to hash_pgtable.c
>> powerpc/64s: Kill the unused argument of exit_lazy_flush_tlb
>> powerpc/64s: Rename tlbie_va_lpid to tlbie_va_pid_lpid
>> powerpc/64s: Rename tlbie_lpid_va to tlbie_va_lpid
>> powerpc/64s: Make use of H_RPTI_TYPE_ALL macro
>> powerpc: Print MMU_FTRS_POSSIBLE & MMU_FTRS_ALWAYS at startup
>>
>> arch/powerpc/include/asm/book3s/64/pgtable.h | 20 +++++-
>> .../include/asm/book3s/64/tlbflush-radix.h | 1 -
>> arch/powerpc/kernel/setup-common.c | 4 ++
>> arch/powerpc/mm/book3s64/hash_pgtable.c | 21 +++++++
>> arch/powerpc/mm/book3s64/internal.h | 2 -
>> arch/powerpc/mm/book3s64/pgtable.c | 40 +++---------
>> arch/powerpc/mm/book3s64/radix_tlb.c | 61 ++++++++-----------
>> arch/powerpc/mm/pgtable-frag.c | 1 +
>> 8 files changed, 79 insertions(+), 71 deletions(-)
>>
> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Thanks a lot!
>
> I applied the patch series on top of mainline and verified that the kernel
> builds and boots successfully.
>
> I also ran the following test suites on both RADIX (POWER11) and HASH
> (POWER9)
> MMU configurations:
>
> - tools/testing/selftests/mm
> - tools/testing/selftests/memory-hotplug
> - tools/testing/selftests/powerpc/mm
> - tools/testing/selftests/powerpc/cache_shape
> - tools/testing/selftests/powerpc/copyloops
>
> In addition, I executed basic sanity and stress tests, including:
> stutter, eatmemory, hugepage_sanity, fork_mem, memory_api mprotect,
> vatest, and several transparent-hugepage sanity checks.
Thanks Venkat for verifying this extensively.
So other than a couple of hmm tests, there shouldn't be any other kernel
warnings or VM_BUG_ONs() hit after this patch series.
(We discussed this internally too!)
As for the warnings from the hmm tests, I will fix those in a
separate patch series later (as they look to be non-powerpc fixes).
This should also enable Venkat and other CI systems to run mm selftests
on book3s64 PowerPC without any kernel issues.
>
> All tests passed without regressions.
Thanks!
>
> Regards,
> Venkat
* Re: [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups
2026-03-09 18:14 [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Ritesh Harjani (IBM)
` (10 preceding siblings ...)
2026-03-10 13:46 ` [PATCH v2 00/10] Misc powerpc selftests kernel fixes and cleanups Venkat Rao Bagalkote
@ 2026-03-30 10:21 ` Madhavan Srinivasan
11 siblings, 0 replies; 14+ messages in thread
From: Madhavan Srinivasan @ 2026-03-30 10:21 UTC (permalink / raw)
To: linuxppc-dev, Ritesh Harjani (IBM)
Cc: Christophe Leroy, Venkat Rao Bagalkote, Nicholas Piggin,
Sayali Patil, Aboorva Devarajan, Donet Tom
On Mon, 09 Mar 2026 23:44:23 +0530, Ritesh Harjani (IBM) wrote:
> v1->v2:
>
> - dropped debug_vm_pgtable patch which adds a testcase to simulate the
> failure scenario. Since it belongs to linux-mm, I will send that out
> separately.
> - Modified Patch-2 in this series to also cover PMD device migration
> entry (in addition to PMD THP migration entry). Hence dropped the
> previous RB tag.
> - Added a new Patch-3 to fix another selftests WARNING.
> - Fixed commit subject of Patch-10.
> - Changed subject pre-fix of few patches to be consistent with others
> (powerpc/64s)
> - Added RB tags
>
> [...]
Applied to powerpc/next.
[01/10] powerpc/pgtable-frag: Fix bad page state in pte_frag_destroy
https://git.kernel.org/powerpc/c/fda4d71651f71c44b35829d13f3c8bf920032f77
[02/10] powerpc/64s: Fix unmap race with PMD migration entries
https://git.kernel.org/powerpc/c/bbcbf045d6c778e82b47a35fc8728387708e9a3d
[03/10] powerpc/64s: Fix _HPAGE_CHG_MASK to include _PAGE_SPECIAL bit
https://git.kernel.org/powerpc/c/68b1fa0ed5c84769e4e60d58f6a5af37e7273b51
[04/10] powerpc/64s/tlbflush-radix: Remove unused radix__flush_tlb_pwc()
https://git.kernel.org/powerpc/c/4a342f3e6f6848c816a661d8d7b10c75430598cf
[05/10] powerpc/64s: Move serialize_against_pte_lookup() to hash_pgtable.c
https://git.kernel.org/powerpc/c/bf7c1497d2568ff803a0b0fc6728a1c06d11bf6e
[06/10] powerpc/64s: Kill the unused argument of exit_lazy_flush_tlb
https://git.kernel.org/powerpc/c/4894e2fb7b9a25cef843ee2c3b2ac49fd808647d
[07/10] powerpc/64s: Rename tlbie_va_lpid to tlbie_va_pid_lpid
https://git.kernel.org/powerpc/c/7bcfba20e946ec160fd72c3a0b4cf6e3e845d629
[08/10] powerpc/64s: Rename tlbie_lpid_va to tlbie_va_lpid
https://git.kernel.org/powerpc/c/f074059c7a4d4b93914eee404391dcdb0fd60aa6
[09/10] powerpc/64s: Make use of H_RPTI_TYPE_ALL macro
https://git.kernel.org/powerpc/c/24eb6378408fc125eacc4ad498d120ecf7becc35
[10/10] powerpc: Print MMU_FTRS_POSSIBLE & MMU_FTRS_ALWAYS at startup
https://git.kernel.org/powerpc/c/07791ff060dd3aa270cc03861f2599d81a77b97f
cheers