* [PATCH v3 07/16] mm: handle VM_UFFD_RWP in khugepaged, rmap, and GUP
From: Kiryl Shutsemau @ 2026-05-22 13:38 UTC (permalink / raw)
To: akpm, rppt, peterx, david
Cc: ljs, surenb, vbabka, Liam.Howlett, ziy, corbet, skhan, seanjc,
pbonzini, jthoughton, aarcange, sj, usama.arif, linux-mm,
linux-kernel, linux-doc, linux-kselftest, kvm, kernel-team,
linux-man, alx, Kiryl Shutsemau (Meta)
In-Reply-To: <20260522133857.552279-1-kirill@shutemov.name>
From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
Three mm paths outside the fault handler gate on the uffd PTE bit
today: khugepaged (skip collapse on ranges carrying markers), rmap
(cap unmap batching), and GUP (force a fault through
gup_can_follow_protnone). Extend each to treat VM_UFFD_RWP the same
as VM_UFFD_WP; otherwise per-PTE RWP state is silently destroyed or
bypassed.
khugepaged: try_collapse_pte_mapped_thp() and
file_backed_vma_is_retractable() already refuse to collapse or
retract page tables on ranges carrying the uffd PTE bit. Broaden the
VMA predicate from userfaultfd_wp() to userfaultfd_protected() so
VM_UFFD_RWP ranges get the same protection. hpage_collapse_scan_pmd()
needs no change — its existing pte_uffd() check already catches an
RWP PTE because it carries the uffd bit.
rmap: folio_unmap_pte_batch() caps batching at 1 for VM_UFFD_RWP so
the restore path handles each PTE with its own marker.
GUP: gup_can_follow_protnone() forces a fault on VM_UFFD_RWP VMAs
regardless of FOLL_HONOR_NUMA_FAULT. RWP uses protnone as an
access-tracking marker, not for NUMA hinting, so any GUP — read or
write — must go through the userfaultfd fault path.
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Assisted-by: Claude:claude-opus-4-6
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
include/linux/mm.h | 10 +++++++++-
mm/khugepaged.c | 18 +++++++++++-------
mm/rmap.c | 2 +-
3 files changed, 21 insertions(+), 9 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0598b1482aeb..0ecfb0973b01 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4605,11 +4605,19 @@ static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags)
/*
* Indicates whether GUP can follow a PROT_NONE mapped page, or whether
- * a (NUMA hinting) fault is required.
+ * a (NUMA hinting or userfaultfd RWP) fault is required.
*/
static inline bool gup_can_follow_protnone(const struct vm_area_struct *vma,
unsigned int flags)
{
+ /*
+ * VM_UFFD_RWP uses protnone as an access-tracking marker, not for
+ * NUMA hinting. GUP must always take a fault so the access is
+ * delivered to userfaultfd, regardless of FOLL_HONOR_NUMA_FAULT.
+ */
+ if (vma->vm_flags & VM_UFFD_RWP)
+ return false;
+
/*
* If callers don't want to honor NUMA hinting faults, no need to
* determine if we would actually have to trigger a NUMA hinting fault.
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index de0644bde400..a798c542c849 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1532,8 +1532,11 @@ static enum scan_result try_collapse_pte_mapped_thp(struct mm_struct *mm, unsign
if (!thp_vma_allowable_order(vma, vma->vm_flags, TVA_FORCED_COLLAPSE, PMD_ORDER))
return SCAN_VMA_CHECK;
- /* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
- if (userfaultfd_wp(vma))
+ /*
+ * Keep pmd pgtable while the uffd bit is in use; see comment in
+ * retract_page_tables().
+ */
+ if (userfaultfd_protected(vma))
return SCAN_PTE_UFFD;
folio = filemap_lock_folio(vma->vm_file->f_mapping,
@@ -1746,13 +1749,14 @@ static bool file_backed_vma_is_retractable(struct vm_area_struct *vma)
return false;
/*
- * When a vma is registered with uffd-wp, we cannot recycle
+ * When a vma is registered with uffd-wp or RWP, we cannot recycle
* the page table because there may be pte markers installed.
- * Other vmas can still have the same file mapped hugely, but
- * skip this one: it will always be mapped in small page size
- * for uffd-wp registered ranges.
+ * VM_UFFD_RWP ranges similarly rely on per-PTE uffd state
+ * and cannot be recycled to a shared PMD. Other vmas can still
+ * have the same file mapped hugely, but skip this one: it will
+ * always be mapped in small page size for these registrations.
*/
- if (userfaultfd_wp(vma))
+ if (userfaultfd_protected(vma))
return false;
/*
diff --git a/mm/rmap.c b/mm/rmap.c
index 05056c213203..1426d1ece917 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1965,7 +1965,7 @@ static inline unsigned int folio_unmap_pte_batch(struct folio *folio,
if (pte_unused(pte))
return 1;
- if (userfaultfd_wp(vma))
+ if (userfaultfd_protected(vma))
return 1;
/*
--
2.51.2
^ permalink raw reply related
* [PATCH v3 06/16] mm: preserve RWP marker across PTE rewrites
From: Kiryl Shutsemau @ 2026-05-22 13:38 UTC (permalink / raw)
To: akpm, rppt, peterx, david
Cc: ljs, surenb, vbabka, Liam.Howlett, ziy, corbet, skhan, seanjc,
pbonzini, jthoughton, aarcange, sj, usama.arif, linux-mm,
linux-kernel, linux-doc, linux-kselftest, kvm, kernel-team,
linux-man, alx, Kiryl Shutsemau (Meta)
In-Reply-To: <20260522133857.552279-1-kirill@shutemov.name>
From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
The uffd PTE bit must survive any kernel path that rewrites a PTE
on a VM_UFFD_RWP VMA, otherwise the marker that carries PAGE_NONE
semantics is silently dropped and the next access leaks past RWP
tracking. Wire the preservation through every path that rewrites a
VM_UFFD_RWP PTE.
Swap and device-exclusive: do_swap_page(), restore_exclusive_pte(),
and unuse_pte() (swapoff()) re-apply PAGE_NONE when the swap PTE
carries the uffd bit and the VMA has VM_UFFD_RWP.
Migration: remove_migration_pte() and remove_migration_pmd() do the
same after the migration entry is replaced with a real PTE/PMD.
Fork: __copy_present_ptes(), copy_present_page(), copy_nonpresent_pte(),
copy_huge_pmd(), copy_huge_non_present_pmd(), and
copy_hugetlb_page_range() keep the uffd bit on the child when the
destination VMA has VM_UFFD_RWP, matching the existing VM_UFFD_WP
handling. Add VM_UFFD_RWP to VM_COPY_ON_FORK so the flag itself
propagates.
mprotect(): change_pte_range() and change_huge_pmd() restore PAGE_NONE
after pte_modify()/pmd_modify() have recomputed the base protection
from a (possibly user-changed) vm_page_prot. pte_modify() preserves
_PAGE_UFFD, so the bit stays; we just have to force PAGE_NONE back
on top.
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Assisted-by: Claude:claude-opus-4-6
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
include/linux/mm.h | 3 ++-
mm/huge_memory.c | 47 ++++++++++++++++++++++++++++++++++++++++++----
mm/hugetlb.c | 40 ++++++++++++++++++++++++++++++---------
mm/memory.c | 47 +++++++++++++++++++++++++++++++++++++++-------
mm/migrate.c | 8 ++++++++
mm/mprotect.c | 10 ++++++++++
mm/mremap.c | 13 +++++++++++--
mm/swapfile.c | 5 +++++
mm/userfaultfd.c | 14 ++++++++++++++
9 files changed, 164 insertions(+), 23 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9054468774b5..0598b1482aeb 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -663,7 +663,8 @@ enum {
* only and thus cannot be reconstructed on page
* fault.
*/
-#define VM_COPY_ON_FORK (VM_PFNMAP | VM_MIXEDMAP | VM_UFFD_WP | VM_MAYBE_GUARD)
+#define VM_COPY_ON_FORK (VM_PFNMAP | VM_MIXEDMAP | VM_UFFD_WP | VM_UFFD_RWP | \
+ VM_MAYBE_GUARD)
/*
* mapping from the currently active vm_flags protection bits (the
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index befc919de69e..189192ea45cf 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1918,7 +1918,7 @@ static void copy_huge_non_present_pmd(
add_mm_counter(dst_mm, MM_ANONPAGES, HPAGE_PMD_NR);
mm_inc_nr_ptes(dst_mm);
pgtable_trans_huge_deposit(dst_mm, dst_pmd, pgtable);
- if (!userfaultfd_wp(dst_vma))
+ if (!userfaultfd_protected(dst_vma))
pmd = pmd_swp_clear_uffd(pmd);
set_pmd_at(dst_mm, addr, dst_pmd, pmd);
}
@@ -2013,9 +2013,15 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
out_zero_page:
mm_inc_nr_ptes(dst_mm);
pgtable_trans_huge_deposit(dst_mm, dst_pmd, pgtable);
- pmdp_set_wrprotect(src_mm, addr, src_pmd);
- if (!userfaultfd_wp(dst_vma))
+
+ /* See __copy_present_ptes(): restore accessible protection. */
+ if (!userfaultfd_protected(dst_vma)) {
+ if (userfaultfd_rwp(src_vma))
+ pmd = pmd_modify(pmd, dst_vma->vm_page_prot);
pmd = pmd_clear_uffd(pmd);
+ }
+
+ pmdp_set_wrprotect(src_mm, addr, src_pmd);
pmd = pmd_wrprotect(pmd);
set_pmd:
pmd = pmd_mkold(pmd);
@@ -2601,8 +2607,16 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
pgtable_trans_huge_deposit(mm, new_pmd, pgtable);
}
pmd = move_soft_dirty_pmd(pmd);
- if (vma_has_uffd_without_event_remap(vma))
+ if (vma_has_uffd_without_event_remap(vma)) {
+ /*
+ * See __copy_present_ptes(): normalise RWP PMDs so
+ * the destination starts accessible instead of taking
+ * a numa-hinting fault on first access.
+ */
+ if (pmd_present(pmd) && userfaultfd_rwp(vma))
+ pmd = pmd_modify(pmd, vma->vm_page_prot);
pmd = clear_uffd_wp_pmd(pmd);
+ }
set_pmd_at(mm, new_addr, new_pmd, pmd);
if (force_flush)
flush_pmd_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
@@ -2739,6 +2753,10 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
*/
entry = pmd_clear_uffd(entry);
+ /* See change_pte_range(): preserve RWP protection across mprotect() */
+ if (userfaultfd_rwp(vma) && pmd_uffd(entry))
+ entry = pmd_modify(entry, PAGE_NONE);
+
/* See change_pte_range(). */
if ((cp_flags & MM_CP_TRY_CHANGE_WRITABLE) && !pmd_write(entry) &&
can_change_pmd_writable(vma, addr, entry))
@@ -2906,6 +2924,13 @@ int move_pages_huge_pmd(struct mm_struct *mm, pmd_t *dst_pmd, pmd_t *src_pmd, pm
_dst_pmd = move_soft_dirty_pmd(src_pmdval);
_dst_pmd = clear_uffd_wp_pmd(_dst_pmd);
}
+
+ /* Re-arm RWP on the moved PMD if dst_vma is RWP-registered. */
+ if (userfaultfd_rwp(dst_vma)) {
+ _dst_pmd = pmd_modify(_dst_pmd, PAGE_NONE);
+ _dst_pmd = pmd_mkuffd(_dst_pmd);
+ }
+
set_pmd_at(mm, dst_addr, dst_pmd, _dst_pmd);
src_pgtable = pgtable_trans_huge_withdraw(mm, src_pmd);
@@ -3082,6 +3107,11 @@ static void __split_huge_zero_page_pmd(struct vm_area_struct *vma,
entry = pte_mkspecial(entry);
if (pmd_uffd(old_pmd))
entry = pte_mkuffd(entry);
+
+ /* Restore PAGE_NONE so an RWP marker keeps trapping */
+ if (userfaultfd_rwp(vma) && pmd_uffd(old_pmd))
+ entry = pte_modify(entry, PAGE_NONE);
+
VM_BUG_ON(!pte_none(ptep_get(pte)));
set_pte_at(mm, addr, pte, entry);
pte++;
@@ -3356,6 +3386,10 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
if (uffd_wp)
entry = pte_mkuffd(entry);
+ /* Restore PAGE_NONE so an RWP marker keeps trapping */
+ if (userfaultfd_rwp(vma) && uffd_wp)
+ entry = pte_modify(entry, PAGE_NONE);
+
for (i = 0; i < HPAGE_PMD_NR; i++)
VM_WARN_ON(!pte_none(ptep_get(pte + i)));
@@ -5054,6 +5088,11 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
pmde = pmd_mkwrite(pmde, vma);
if (pmd_swp_uffd(*pvmw->pmd))
pmde = pmd_mkuffd(pmde);
+
+ /* See do_swap_page(): restore PAGE_NONE for RWP */
+ if (pmd_swp_uffd(*pvmw->pmd) && userfaultfd_rwp(vma))
+ pmde = pmd_modify(pmde, PAGE_NONE);
+
if (!softleaf_is_migration_young(entry))
pmde = pmd_mkold(pmde);
/* NOTE: this may contain setting soft-dirty on some archs */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3cdbf0057dce..eee32a325481 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4859,8 +4859,12 @@ hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long add
__folio_mark_uptodate(new_folio);
hugetlb_add_new_anon_rmap(new_folio, vma, addr);
- if (userfaultfd_wp(vma) && huge_pte_uffd(old))
+ if (userfaultfd_protected(vma) && huge_pte_uffd(old)) {
newpte = huge_pte_mkuffd(newpte);
+ /* Restore PAGE_NONE so the RWP marker keeps trapping. */
+ if (userfaultfd_rwp(vma))
+ newpte = huge_pte_modify(newpte, PAGE_NONE);
+ }
set_huge_pte_at(vma->vm_mm, addr, ptep, newpte, sz);
hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
folio_set_hugetlb_migratable(new_folio);
@@ -4933,7 +4937,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
softleaf = softleaf_from_pte(entry);
if (unlikely(softleaf_is_hwpoison(softleaf))) {
- if (!userfaultfd_wp(dst_vma))
+ if (!userfaultfd_protected(dst_vma))
entry = huge_pte_clear_uffd(entry);
set_huge_pte_at(dst, addr, dst_pte, entry, sz);
} else if (unlikely(softleaf_is_migration(softleaf))) {
@@ -4947,11 +4951,11 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
softleaf = make_readable_migration_entry(
swp_offset(softleaf));
entry = swp_entry_to_pte(softleaf);
- if (userfaultfd_wp(src_vma) && uffd)
+ if (userfaultfd_protected(src_vma) && uffd)
entry = pte_swp_mkuffd(entry);
set_huge_pte_at(src, addr, src_pte, entry, sz);
}
- if (!userfaultfd_wp(dst_vma))
+ if (!userfaultfd_protected(dst_vma))
entry = huge_pte_clear_uffd(entry);
set_huge_pte_at(dst, addr, dst_pte, entry, sz);
} else if (unlikely(pte_is_marker(entry))) {
@@ -5015,6 +5019,13 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
goto next;
}
+ /* See __copy_present_ptes(): restore accessible protection. */
+ if (!userfaultfd_protected(dst_vma)) {
+ if (userfaultfd_rwp(src_vma))
+ entry = huge_pte_modify(entry, dst_vma->vm_page_prot);
+ entry = huge_pte_clear_uffd(entry);
+ }
+
if (cow) {
/*
* No need to notify as we are downgrading page
@@ -5027,9 +5038,6 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
entry = huge_pte_wrprotect(entry);
}
- if (!userfaultfd_wp(dst_vma))
- entry = huge_pte_clear_uffd(entry);
-
set_huge_pte_at(dst, addr, dst_pte, entry, sz);
hugetlb_count_add(npages, dst);
}
@@ -5075,10 +5083,19 @@ static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
huge_pte_clear(mm, new_addr, dst_pte, sz);
} else {
if (need_clear_uffd_wp) {
- if (pte_present(pte))
+ if (pte_present(pte)) {
+ /*
+ * See __copy_present_ptes(): normalise RWP
+ * PTEs so the destination starts accessible
+ * instead of taking a numa-hinting fault on
+ * first access.
+ */
+ if (userfaultfd_rwp(vma))
+ pte = huge_pte_modify(pte, vma->vm_page_prot);
pte = huge_pte_clear_uffd(pte);
- else
+ } else {
pte = pte_swp_clear_uffd(pte);
+ }
}
set_huge_pte_at(mm, new_addr, dst_pte, pte, sz);
}
@@ -6529,6 +6546,11 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
pte = huge_pte_mkuffd(pte);
else if (uffd_wp_resolve || uffd_rwp_resolve)
pte = huge_pte_clear_uffd(pte);
+
+ /* Preserve RWP protection across mprotect() */
+ if (userfaultfd_rwp(vma) && huge_pte_uffd(pte))
+ pte = huge_pte_modify(pte, PAGE_NONE);
+
huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte);
pages++;
tlb_remove_huge_tlb_entry(h, &tlb, ptep, address);
diff --git a/mm/memory.c b/mm/memory.c
index f2e7e900b1b8..ea9616e3dbaf 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -880,6 +880,10 @@ static void restore_exclusive_pte(struct vm_area_struct *vma,
if (pte_swp_uffd(orig_pte))
pte = pte_mkuffd(pte);
+ /* See do_swap_page(): restore PAGE_NONE for RWP */
+ if (pte_swp_uffd(orig_pte) && userfaultfd_rwp(vma))
+ pte = pte_modify(pte, PAGE_NONE);
+
if ((vma->vm_flags & VM_WRITE) &&
can_change_pte_writable(vma, address, pte)) {
if (folio_test_dirty(folio))
@@ -1025,7 +1029,7 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
make_pte_marker(marker));
return 0;
}
- if (!userfaultfd_wp(dst_vma))
+ if (!userfaultfd_protected(dst_vma))
pte = pte_swp_clear_uffd(pte);
set_pte_at(dst_mm, addr, dst_pte, pte);
return 0;
@@ -1072,9 +1076,13 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
/* All done, just insert the new page copy in the child */
pte = folio_mk_pte(new_folio, dst_vma->vm_page_prot);
pte = maybe_mkwrite(pte_mkdirty(pte), dst_vma);
- if (userfaultfd_pte_wp(dst_vma, ptep_get(src_pte)))
- /* Uffd-wp needs to be delivered to dest pte as well */
+ if (userfaultfd_protected(dst_vma) && pte_uffd(ptep_get(src_pte))) {
+ /* The uffd bit needs to be delivered to the dest pte as well */
pte = pte_mkuffd(pte);
+ /* Restore PAGE_NONE so the RWP marker keeps trapping */
+ if (userfaultfd_rwp(dst_vma))
+ pte = pte_modify(pte, PAGE_NONE);
+ }
set_pte_at(dst_vma->vm_mm, addr, dst_pte, pte);
return 0;
}
@@ -1084,9 +1092,29 @@ static __always_inline void __copy_present_ptes(struct vm_area_struct *dst_vma,
pte_t pte, unsigned long addr, int nr)
{
struct mm_struct *src_mm = src_vma->vm_mm;
+ bool writable;
+
+ /*
+ * Snapshot writability before the RWP-disarm rewrite below: when the
+ * child is not RWP-armed, pte_modify(pte, dst_vma->vm_page_prot) can
+ * silently drop _PAGE_RW from a resolved (no-marker) writable PTE,
+ * so a later pte_write(pte) check would skip the COW wrprotect and
+ * leave the parent writable over a folio shared with the child.
+ */
+ writable = pte_write(pte);
+
+ /*
+ * Child is not RWP-armed: restore accessible protection so the
+ * inherited PAGE_NONE does not cost a fault on first read.
+ */
+ if (!userfaultfd_protected(dst_vma)) {
+ if (userfaultfd_rwp(src_vma))
+ pte = pte_modify(pte, dst_vma->vm_page_prot);
+ pte = pte_clear_uffd(pte);
+ }
/* If it's a COW mapping, write protect it both processes. */
- if (is_cow_mapping(src_vma->vm_flags) && pte_write(pte)) {
+ if (is_cow_mapping(src_vma->vm_flags) && writable) {
wrprotect_ptes(src_mm, addr, src_pte, nr);
pte = pte_wrprotect(pte);
}
@@ -1096,9 +1124,6 @@ static __always_inline void __copy_present_ptes(struct vm_area_struct *dst_vma,
pte = pte_mkclean(pte);
pte = pte_mkold(pte);
- if (!userfaultfd_wp(dst_vma))
- pte = pte_clear_uffd(pte);
-
set_ptes(dst_vma->vm_mm, addr, dst_pte, pte, nr);
}
@@ -5080,6 +5105,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
if (pte_swp_uffd(vmf->orig_pte))
pte = pte_mkuffd(pte);
+ /*
+ * A page reclaimed while RWP-protected carries the uffd bit on
+ * its swap entry. Re-apply PAGE_NONE on swap-in so the first access
+ * still traps as an RWP fault. pte_modify() preserves _PAGE_UFFD.
+ */
+ if (pte_swp_uffd(vmf->orig_pte) && userfaultfd_rwp(vma))
+ pte = pte_modify(pte, PAGE_NONE);
+
/*
* Same logic as in do_wp_page(); however, optimize for pages that are
* certainly not shared either because we just allocated them without
diff --git a/mm/migrate.c b/mm/migrate.c
index 9d81b7b881ec..633085130b7c 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -329,6 +329,10 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
if (pte_swp_uffd(old_pte))
newpte = pte_mkuffd(newpte);
+ /* See remove_migration_pte(): restore PAGE_NONE for RWP */
+ if (pte_swp_uffd(old_pte) && userfaultfd_rwp(pvmw->vma))
+ newpte = pte_modify(newpte, PAGE_NONE);
+
set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
@@ -394,6 +398,10 @@ static bool remove_migration_pte(struct folio *folio,
else if (pte_swp_uffd(old_pte))
pte = pte_mkuffd(pte);
+ /* See do_swap_page(): restore PAGE_NONE for RWP */
+ if (pte_swp_uffd(old_pte) && userfaultfd_rwp(vma))
+ pte = pte_modify(pte, PAGE_NONE);
+
if (folio_test_anon(folio) && !softleaf_is_migration_read(entry))
rmap_flags |= RMAP_EXCLUSIVE;
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 4a6b35482aee..e0b5fe7c66b2 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -296,6 +296,16 @@ static __always_inline void change_present_ptes(struct mmu_gather *tlb,
else if (uffd_prot_resolve)
ptent = pte_clear_uffd(ptent);
+ /*
+ * The uffd bit on a VM_UFFD_RWP VMA carries PROT_NONE
+ * semantics. If mprotect() or NUMA hinting changed the
+ * base protection, restore PAGE_NONE so the PTE still
+ * traps on any access. pte_modify() preserves
+ * _PAGE_UFFD.
+ */
+ if (userfaultfd_rwp(vma) && pte_uffd(ptent))
+ ptent = pte_modify(ptent, PAGE_NONE);
+
/*
* In some writable, shared mappings, we might want
* to catch actual write access -- see
diff --git a/mm/mremap.c b/mm/mremap.c
index 12732a5c547e..14e5df316f83 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -296,10 +296,19 @@ static int move_ptes(struct pagetable_move_control *pmc,
pte_clear(mm, new_addr, new_ptep);
else {
if (need_clear_uffd_wp) {
- if (pte_present(pte))
+ if (pte_present(pte)) {
+ /*
+ * See __copy_present_ptes(): normalise
+ * RWP PTEs so the destination starts
+ * accessible instead of taking a
+ * numa-hinting fault on first access.
+ */
+ if (userfaultfd_rwp(vma))
+ pte = pte_modify(pte, vma->vm_page_prot);
pte = pte_clear_uffd(pte);
- else
+ } else {
pte = pte_swp_clear_uffd(pte);
+ }
}
set_ptes(mm, new_addr, new_ptep, pte, nr_ptes);
}
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 9119efef7fe6..260239b260d5 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2338,6 +2338,11 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
new_pte = pte_mksoft_dirty(new_pte);
if (pte_swp_uffd(old_pte))
new_pte = pte_mkuffd(new_pte);
+
+ /* See do_swap_page(): restore PAGE_NONE for RWP */
+ if (pte_swp_uffd(old_pte) && userfaultfd_rwp(vma))
+ new_pte = pte_modify(new_pte, PAGE_NONE);
+
setpte:
set_pte_at(vma->vm_mm, addr, pte, new_pte);
folio_put_swap(swapcache, folio_file_page(swapcache, swp_offset(entry)));
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index d546ffd2f165..d4a1d340dab3 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1200,6 +1200,13 @@ static long move_present_ptes(struct mm_struct *mm,
if (pte_dirty(orig_src_pte))
orig_dst_pte = pte_mkdirty(orig_dst_pte);
orig_dst_pte = pte_mkwrite(orig_dst_pte, dst_vma);
+
+ /* Re-arm RWP on the moved PTE if dst_vma is RWP-registered. */
+ if (userfaultfd_rwp(dst_vma)) {
+ orig_dst_pte = pte_modify(orig_dst_pte, PAGE_NONE);
+ orig_dst_pte = pte_mkuffd(orig_dst_pte);
+ }
+
set_pte_at(mm, dst_addr, dst_pte, orig_dst_pte);
src_addr += PAGE_SIZE;
@@ -1307,6 +1314,13 @@ static int move_zeropage_pte(struct mm_struct *mm,
zero_pte = pte_mkspecial(pfn_pte(zero_pfn(dst_addr),
dst_vma->vm_page_prot));
+
+ /* Re-arm RWP on the moved PTE if dst_vma is RWP-registered. */
+ if (userfaultfd_rwp(dst_vma)) {
+ zero_pte = pte_modify(zero_pte, PAGE_NONE);
+ zero_pte = pte_mkuffd(zero_pte);
+ }
+
ptep_clear_flush(src_vma, src_addr, src_pte);
set_pte_at(mm, dst_addr, dst_pte, zero_pte);
double_pt_unlock(dst_ptl, src_ptl);
--
2.51.2
^ permalink raw reply related
* [PATCH v3 05/16] mm: add MM_CP_UFFD_RWP change_protection() flag
From: Kiryl Shutsemau @ 2026-05-22 13:38 UTC (permalink / raw)
To: akpm, rppt, peterx, david
Cc: ljs, surenb, vbabka, Liam.Howlett, ziy, corbet, skhan, seanjc,
pbonzini, jthoughton, aarcange, sj, usama.arif, linux-mm,
linux-kernel, linux-doc, linux-kselftest, kvm, kernel-team,
linux-man, alx, Kiryl Shutsemau (Meta)
In-Reply-To: <20260522133857.552279-1-kirill@shutemov.name>
From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
Preparatory patch. Add the change_protection() primitive that
userfaultfd RWP will use.
An RWP-protected PTE is PAGE_NONE with the uffd PTE bit set. The
PROT_NONE half makes the CPU fault on any access; the uffd bit
distinguishes an RWP fault from a plain mprotect(PROT_NONE) or NUMA
hinting fault. MM_CP_UFFD_WP and MM_CP_UFFD_RWP share the same PTE
bit, so the two cannot be used together on the same range.
Two new change_protection() flags:
MM_CP_UFFD_RWP install PAGE_NONE and set the uffd bit
MM_CP_UFFD_RWP_RESOLVE restore vma->vm_page_prot, clear the uffd bit
Both are wired through change_pte_range(), change_huge_pmd(), and
hugetlb_change_protection() so anon, shmem, THP, and hugetlb all
share the same semantics.
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Assisted-by: Claude:claude-opus-4-6
---
include/linux/mm.h | 5 ++++
include/linux/userfaultfd_k.h | 1 -
mm/huge_memory.c | 30 +++++++++++++----------
mm/hugetlb.c | 25 ++++++++++++++-----
mm/mprotect.c | 46 +++++++++++++++++++++++++++--------
5 files changed, 77 insertions(+), 30 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3f53d1e978c0..9054468774b5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3291,6 +3291,11 @@ int get_cmdline(struct task_struct *task, char *buffer, int buflen);
#define MM_CP_UFFD_WP_RESOLVE (1UL << 3) /* Resolve wp */
#define MM_CP_UFFD_WP_ALL (MM_CP_UFFD_WP | \
MM_CP_UFFD_WP_RESOLVE)
+/* Whether this change is for uffd RWP */
+#define MM_CP_UFFD_RWP (1UL << 4) /* do rwp */
+#define MM_CP_UFFD_RWP_RESOLVE (1UL << 5) /* resolve rwp */
+#define MM_CP_UFFD_RWP_ALL (MM_CP_UFFD_RWP | \
+ MM_CP_UFFD_RWP_RESOLVE)
bool can_change_pte_writable(struct vm_area_struct *vma, unsigned long addr,
pte_t pte);
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 889c7b45fec8..07766398d592 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -397,7 +397,6 @@ static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma,
return false;
}
-
static inline bool userfaultfd_armed(struct vm_area_struct *vma)
{
return false;
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d88fcccd386d..befc919de69e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2615,8 +2615,8 @@ bool move_huge_pmd(struct vm_area_struct *vma, unsigned long old_addr,
}
static void change_non_present_huge_pmd(struct mm_struct *mm,
- unsigned long addr, pmd_t *pmd, bool uffd_wp,
- bool uffd_wp_resolve)
+ unsigned long addr, pmd_t *pmd, bool uffd_prot,
+ bool uffd_prot_resolve)
{
softleaf_t entry = softleaf_from_pmd(*pmd);
const struct folio *folio = softleaf_to_folio(entry);
@@ -2642,9 +2642,9 @@ static void change_non_present_huge_pmd(struct mm_struct *mm,
newpmd = *pmd;
}
- if (uffd_wp)
+ if (uffd_prot)
newpmd = pmd_swp_mkuffd(newpmd);
- else if (uffd_wp_resolve)
+ else if (uffd_prot_resolve)
newpmd = pmd_swp_clear_uffd(newpmd);
if (!pmd_same(*pmd, newpmd))
set_pmd_at(mm, addr, pmd, newpmd);
@@ -2665,8 +2665,9 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
spinlock_t *ptl;
pmd_t oldpmd, entry;
bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
- bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
- bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
+ bool uffd_prot = cp_flags & (MM_CP_UFFD_WP | MM_CP_UFFD_RWP);
+ bool uffd_prot_resolve = cp_flags &
+ (MM_CP_UFFD_WP_RESOLVE | MM_CP_UFFD_RWP_RESOLVE);
int ret = 1;
tlb_change_page_size(tlb, HPAGE_PMD_SIZE);
@@ -2679,11 +2680,17 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
return 0;
if (thp_migration_supported() && pmd_is_valid_softleaf(*pmd)) {
- change_non_present_huge_pmd(mm, addr, pmd, uffd_wp,
- uffd_wp_resolve);
+ change_non_present_huge_pmd(mm, addr, pmd, uffd_prot,
+ uffd_prot_resolve);
goto unlock;
}
+ /* Already in the desired state */
+ if (prot_numa && pmd_protnone(*pmd))
+ goto unlock;
+ if ((cp_flags & MM_CP_UFFD_RWP) && pmd_protnone(*pmd) && pmd_uffd(*pmd))
+ goto unlock;
+
if (prot_numa) {
/*
@@ -2694,9 +2701,6 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
if (is_huge_zero_pmd(*pmd))
goto unlock;
- if (pmd_protnone(*pmd))
- goto unlock;
-
if (!folio_can_map_prot_numa(pmd_folio(*pmd), vma,
vma_is_single_threaded_private(vma)))
goto unlock;
@@ -2725,9 +2729,9 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
oldpmd = pmdp_invalidate_ad(vma, addr, pmd);
entry = pmd_modify(oldpmd, newprot);
- if (uffd_wp)
+ if (uffd_prot)
entry = pmd_mkuffd(entry);
- else if (uffd_wp_resolve)
+ else if (uffd_prot_resolve)
/*
* Leave the write bit to be handled by PF interrupt
* handler, then things like COW could be properly
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f770c6504e26..3cdbf0057dce 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6409,6 +6409,8 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
unsigned long last_addr_mask;
bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
+ bool uffd_rwp = cp_flags & MM_CP_UFFD_RWP;
+ bool uffd_rwp_resolve = cp_flags & MM_CP_UFFD_RWP_RESOLVE;
struct mmu_gather tlb;
/*
@@ -6434,6 +6436,11 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
ptep = hugetlb_walk(vma, address, psize);
if (!ptep) {
+ /*
+ * uffd_wp installs a pte marker on the unpopulated
+ * entry; uffd_rwp does not install markers so the
+ * allocation is unnecessary for it.
+ */
if (!uffd_wp) {
address |= last_addr_mask;
continue;
@@ -6455,7 +6462,8 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
* shouldn't happen at all. Warn about it if it
* happened due to some reason.
*/
- WARN_ON_ONCE(uffd_wp || uffd_wp_resolve);
+ WARN_ON_ONCE(uffd_wp || uffd_wp_resolve ||
+ uffd_rwp || uffd_rwp_resolve);
pages++;
spin_unlock(ptl);
address |= last_addr_mask;
@@ -6489,9 +6497,9 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
pages++;
}
- if (uffd_wp)
+ if (uffd_wp || uffd_rwp)
newpte = pte_swp_mkuffd(newpte);
- else if (uffd_wp_resolve)
+ else if (uffd_wp_resolve || uffd_rwp_resolve)
newpte = pte_swp_clear_uffd(newpte);
if (!pte_same(pte, newpte))
set_huge_pte_at(mm, address, ptep, newpte, psize);
@@ -6502,19 +6510,24 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
* pte_marker_uffd_wp()==true implies !poison
* because they're mutual exclusive.
*/
- if (pte_is_uffd_wp_marker(pte) && uffd_wp_resolve)
+ if (pte_is_uffd_wp_marker(pte) &&
+ (uffd_wp_resolve || uffd_rwp_resolve))
/* Safe to modify directly (non-present->none). */
huge_pte_clear(mm, address, ptep, psize);
} else {
pte_t old_pte;
unsigned int shift = huge_page_shift(hstate_vma(vma));
+ /* Already protnone with uffd bit set? Nothing to do. */
+ if (uffd_rwp && pte_protnone(pte) && huge_pte_uffd(pte))
+ goto next;
+
old_pte = huge_ptep_modify_prot_start(vma, address, ptep);
pte = huge_pte_modify(old_pte, newprot);
pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
- if (uffd_wp)
+ if (uffd_wp || uffd_rwp)
pte = huge_pte_mkuffd(pte);
- else if (uffd_wp_resolve)
+ else if (uffd_wp_resolve || uffd_rwp_resolve)
pte = huge_pte_clear_uffd(pte);
huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte);
pages++;
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 8340c8b228c6..4a6b35482aee 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -214,8 +214,9 @@ static __always_inline void set_write_prot_commit_flush_ptes(struct vm_area_stru
static long change_softleaf_pte(struct vm_area_struct *vma,
unsigned long addr, pte_t *pte, pte_t oldpte, unsigned long cp_flags)
{
- const bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
- const bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
+ const bool uffd_prot = cp_flags & (MM_CP_UFFD_WP | MM_CP_UFFD_RWP);
+ const bool uffd_prot_resolve = cp_flags &
+ (MM_CP_UFFD_WP_RESOLVE | MM_CP_UFFD_RWP_RESOLVE);
softleaf_t entry = softleaf_from_pte(oldpte);
pte_t newpte;
@@ -256,7 +257,7 @@ static long change_softleaf_pte(struct vm_area_struct *vma,
* to unprotect it, drop it; the next page
* fault will trigger without uffd trapping.
*/
- if (uffd_wp_resolve) {
+ if (uffd_prot_resolve) {
pte_clear(vma->vm_mm, addr, pte);
return 1;
}
@@ -265,9 +266,9 @@ static long change_softleaf_pte(struct vm_area_struct *vma,
newpte = oldpte;
}
- if (uffd_wp)
+ if (uffd_prot)
newpte = pte_swp_mkuffd(newpte);
- else if (uffd_wp_resolve)
+ else if (uffd_prot_resolve)
newpte = pte_swp_clear_uffd(newpte);
if (!pte_same(oldpte, newpte)) {
@@ -282,16 +283,17 @@ static __always_inline void change_present_ptes(struct mmu_gather *tlb,
int nr_ptes, unsigned long end, pgprot_t newprot,
struct folio *folio, struct page *page, unsigned long cp_flags)
{
- const bool uffd_wp_resolve = cp_flags & MM_CP_UFFD_WP_RESOLVE;
- const bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
+ const bool uffd_prot = cp_flags & (MM_CP_UFFD_WP | MM_CP_UFFD_RWP);
+ const bool uffd_prot_resolve = cp_flags &
+ (MM_CP_UFFD_WP_RESOLVE | MM_CP_UFFD_RWP_RESOLVE);
pte_t ptent, oldpte;
oldpte = modify_prot_start_ptes(vma, addr, ptep, nr_ptes);
ptent = pte_modify(oldpte, newprot);
- if (uffd_wp)
+ if (uffd_prot)
ptent = pte_mkuffd(ptent);
- else if (uffd_wp_resolve)
+ else if (uffd_prot_resolve)
ptent = pte_clear_uffd(ptent);
/*
@@ -325,6 +327,7 @@ static long change_pte_range(struct mmu_gather *tlb,
long pages = 0;
bool is_private_single_threaded;
bool prot_numa = cp_flags & MM_CP_PROT_NUMA;
+ bool uffd_rwp = cp_flags & MM_CP_UFFD_RWP;
bool uffd_wp = cp_flags & MM_CP_UFFD_WP;
int nr_ptes;
@@ -350,6 +353,14 @@ static long change_pte_range(struct mmu_gather *tlb,
/* Already in the desired state. */
if (prot_numa && pte_protnone(oldpte))
continue;
+ /*
+ * RWP-protected PTEs carry _PAGE_UFFD as a marker on
+ * top of PROT_NONE. Skip only entries already in that
+ * exact state; plain PROT_NONE from mprotect() still needs
+ * to be promoted so future faults can be distinguished.
+ */
+ if (uffd_rwp && pte_protnone(oldpte) && pte_uffd(oldpte))
+ continue;
page = vm_normal_page(vma, addr, oldpte);
if (page)
@@ -358,6 +369,8 @@ static long change_pte_range(struct mmu_gather *tlb,
/*
* Avoid trapping faults against the zero or KSM
* pages. See similar comment in change_huge_pmd.
+ * Skip this filter for uffd RWP which
+ * must set protnone regardless of NUMA placement.
*/
if (prot_numa &&
!folio_can_map_prot_numa(folio, vma,
@@ -667,7 +680,16 @@ long change_protection(struct mmu_gather *tlb,
pgprot_t newprot = vma->vm_page_prot;
long pages;
- BUG_ON((cp_flags & MM_CP_UFFD_WP_ALL) == MM_CP_UFFD_WP_ALL);
+ /*
+ * MM_CP_UFFD_{WP,RWP} and _RESOLVE are mutually exclusive within one
+ * change, and WP and RWP cannot mix. Miswired callers get a warn and
+ * a no-op; userspace cannot reach this state.
+ */
+ if (WARN_ON_ONCE((cp_flags & MM_CP_UFFD_WP_ALL) == MM_CP_UFFD_WP_ALL ||
+ (cp_flags & MM_CP_UFFD_RWP_ALL) == MM_CP_UFFD_RWP_ALL ||
+ ((cp_flags & MM_CP_UFFD_WP_ALL) &&
+ (cp_flags & MM_CP_UFFD_RWP_ALL))))
+ return 0;
#ifdef CONFIG_NUMA_BALANCING
/*
@@ -681,6 +703,10 @@ long change_protection(struct mmu_gather *tlb,
WARN_ON_ONCE(cp_flags & MM_CP_PROT_NUMA);
#endif
+ if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_PROTNONE) &&
+ (cp_flags & MM_CP_UFFD_RWP))
+ newprot = PAGE_NONE;
+
if (is_vm_hugetlb_page(vma))
pages = hugetlb_change_protection(vma, start, end, newprot,
cp_flags);
--
2.51.2
^ permalink raw reply related
* [PATCH v3 04/16] mm: add VM_UFFD_RWP VMA flag
From: Kiryl Shutsemau @ 2026-05-22 13:38 UTC (permalink / raw)
To: akpm, rppt, peterx, david
Cc: ljs, surenb, vbabka, Liam.Howlett, ziy, corbet, skhan, seanjc,
pbonzini, jthoughton, aarcange, sj, usama.arif, linux-mm,
linux-kernel, linux-doc, linux-kselftest, kvm, kernel-team,
linux-man, alx, Kiryl Shutsemau (Meta)
In-Reply-To: <20260522133857.552279-1-kirill@shutemov.name>
From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
Preparatory patch for userfaultfd read-write protection (RWP). RWP
extends userfaultfd protection from plain write-protection (WP) to
full read-write protection: accesses to an RWP-protected range --
reads as well as writes -- trap through userfaultfd.
Reserve VM_UFFD_RWP, add the userfaultfd_rwp() and
userfaultfd_protected() helpers, and wire up the smaps "ur" entry and
the trace-flag table the rest of the series will use. The flag is
gated on CONFIG_USERFAULTFD_RWP, which is introduced together with the
UAPI in a later patch; until then VM_UFFD_RWP aliases VM_NONE and
every downstream check folds to dead code.
Nothing sets or queries the flag yet.
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Assisted-by: Claude:claude-opus-4-6
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
---
Documentation/filesystems/proc.rst | 1 +
fs/proc/task_mmu.c | 3 +++
include/linux/mm.h | 28 ++++++++++++++++---------
include/linux/userfaultfd_k.h | 33 ++++++++++++++++++++++++------
include/trace/events/mmflags.h | 7 +++++++
5 files changed, 56 insertions(+), 16 deletions(-)
diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
index db6167befb7b..db28207c5290 100644
--- a/Documentation/filesystems/proc.rst
+++ b/Documentation/filesystems/proc.rst
@@ -607,6 +607,7 @@ encoded manner. The codes are the following:
um userfaultfd missing tracking
uw userfaultfd wr-protect tracking
ui userfaultfd minor fault
+ ur userfaultfd read-write-protect tracking
ss shadow/guarded control stack page
sl sealed
lf lock on fault pages
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 5827074962e7..fbaede228201 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1206,6 +1206,9 @@ static void show_smap_vma_flags(struct seq_file *m, struct vm_area_struct *vma)
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_MINOR
[ilog2(VM_UFFD_MINOR)] = "ui",
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_MINOR */
+#ifdef CONFIG_USERFAULTFD_RWP
+ [ilog2(VM_UFFD_RWP)] = "ur",
+#endif
#ifdef CONFIG_ARCH_HAS_USER_SHADOW_STACK
[ilog2(VM_SHADOW_STACK)] = "ss",
#endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0b776907152e..3f53d1e978c0 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -353,6 +353,7 @@ enum {
#endif
DECLARE_VMA_BIT(UFFD_MINOR, 41),
DECLARE_VMA_BIT(SEALED, 42),
+ DECLARE_VMA_BIT(UFFD_RWP, 43),
/* Flags that reuse flags above. */
DECLARE_VMA_BIT_ALIAS(PKEY_BIT0, HIGH_ARCH_0),
DECLARE_VMA_BIT_ALIAS(PKEY_BIT1, HIGH_ARCH_1),
@@ -496,6 +497,11 @@ enum {
#else
#define VM_UFFD_MINOR VM_NONE
#endif
+#ifdef CONFIG_USERFAULTFD_RWP
+#define VM_UFFD_RWP INIT_VM_FLAG(UFFD_RWP)
+#else
+#define VM_UFFD_RWP VM_NONE
+#endif
#ifdef CONFIG_64BIT
#define VM_ALLOW_ANY_UNCACHED INIT_VM_FLAG(ALLOW_ANY_UNCACHED)
#define VM_SEALED INIT_VM_FLAG(SEALED)
@@ -633,22 +639,24 @@ enum {
* reconsistuted upon page fault, so necessitate page table copying upon fork.
*
* Note that these flags should be compared with the DESTINATION VMA not the
- * source, as VM_UFFD_WP may not be propagated to destination, while all other
- * flags will be.
+ * source: VM_UFFD_WP and VM_UFFD_RWP may be cleared on the destination
+ * (dup_userfaultfd() -> userfaultfd_reset_ctx() when the parent context did
+ * not negotiate UFFD_FEATURE_EVENT_FORK), while all other flags propagate.
*
* VM_PFNMAP / VM_MIXEDMAP - These contain kernel-mapped data which cannot be
* reasonably reconstructed on page fault.
*
* VM_UFFD_WP - Encodes metadata about an installed uffd
- * write protect handler, which cannot be
- * reconstructed on page fault.
+ * VM_UFFD_RWP write- or read-write-protect handler, which
+ * cannot be reconstructed on page fault.
*
- * We always copy pgtables when dst_vma has uffd-wp
- * enabled even if it's file-backed
- * (e.g. shmem). Because when uffd-wp is enabled,
- * pgtable contains uffd-wp protection information,
- * that's something we can't retrieve from page cache,
- * and skip copying will lose those info.
+ * We always copy pgtables when dst_vma has the
+ * uffd PTE bit in use even if it's file-backed
+ * (e.g. shmem). Because when the uffd bit is
+ * in use, the pgtable contains the protection
+ * information, that's something we can't
+ * retrieve from page cache, and skip copying
+ * will lose those info.
*
* VM_MAYBE_GUARD - Could contain page guard region markers which
* by design are a property of the page tables
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index 98f546e83cd2..889c7b45fec8 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -21,10 +21,11 @@
#include <linux/hugetlb_inline.h>
/* The set of all possible UFFD-related VM flags. */
-#define __VM_UFFD_FLAGS (VM_UFFD_MISSING | VM_UFFD_WP | VM_UFFD_MINOR)
+#define __VM_UFFD_FLAGS (VM_UFFD_MISSING | VM_UFFD_MINOR | \
+ VM_UFFD_WP | VM_UFFD_RWP)
-#define __VMA_UFFD_FLAGS mk_vma_flags(VMA_UFFD_MISSING_BIT, VMA_UFFD_WP_BIT, \
- VMA_UFFD_MINOR_BIT)
+#define __VMA_UFFD_FLAGS mk_vma_flags(VMA_UFFD_MISSING_BIT, VMA_UFFD_MINOR_BIT, \
+ VMA_UFFD_WP_BIT, VMA_UFFD_RWP_BIT)
/*
* CAREFUL: Check include/uapi/asm-generic/fcntl.h when defining
@@ -192,7 +193,7 @@ static inline bool is_mergeable_vm_userfaultfd_ctx(struct vm_area_struct *vma,
*/
static inline bool uffd_disable_huge_pmd_share(struct vm_area_struct *vma)
{
- return vma->vm_flags & (VM_UFFD_WP | VM_UFFD_MINOR);
+ return vma->vm_flags & (VM_UFFD_MINOR | VM_UFFD_WP | VM_UFFD_RWP);
}
/*
@@ -222,6 +223,16 @@ static inline bool userfaultfd_minor(struct vm_area_struct *vma)
return vma->vm_flags & VM_UFFD_MINOR;
}
+static inline bool userfaultfd_rwp(struct vm_area_struct *vma)
+{
+ return vma->vm_flags & VM_UFFD_RWP;
+}
+
+static inline bool userfaultfd_protected(struct vm_area_struct *vma)
+{
+ return userfaultfd_wp(vma) || userfaultfd_rwp(vma);
+}
+
static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma,
pte_t pte)
{
@@ -364,6 +375,16 @@ static inline bool userfaultfd_minor(struct vm_area_struct *vma)
return false;
}
+static inline bool userfaultfd_rwp(struct vm_area_struct *vma)
+{
+ return false;
+}
+
+static inline bool userfaultfd_protected(struct vm_area_struct *vma)
+{
+ return false;
+}
+
static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma,
pte_t pte)
{
@@ -457,8 +478,8 @@ static inline bool userfaultfd_wp_use_markers(struct vm_area_struct *vma)
}
/*
- * Returns true if this is a swap pte and was uffd-wp wr-protected in either
- * forms (pte marker or a normal swap pte), false otherwise.
+ * Returns true if this swap pte carries uffd-tracked state in either
+ * form (pte marker or a normal swap pte), false otherwise.
*/
static inline bool pte_swp_uffd_any(pte_t pte)
{
diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h
index a6e5a44c9b42..bfface3d0203 100644
--- a/include/trace/events/mmflags.h
+++ b/include/trace/events/mmflags.h
@@ -194,6 +194,12 @@ IF_HAVE_PG_ARCH_3(arch_3)
# define IF_HAVE_UFFD_MINOR(flag, name)
#endif
+#ifdef CONFIG_USERFAULTFD_RWP
+# define IF_HAVE_UFFD_RWP(flag, name) {flag, name},
+#else
+# define IF_HAVE_UFFD_RWP(flag, name)
+#endif
+
#if defined(CONFIG_64BIT) || defined(CONFIG_PPC32)
# define IF_HAVE_VM_DROPPABLE(flag, name) {flag, name},
#else
@@ -215,6 +221,7 @@ IF_HAVE_UFFD_MINOR(VM_UFFD_MINOR, "uffd_minor" ) \
{VM_PFNMAP, "pfnmap" }, \
{VM_MAYBE_GUARD, "maybe_guard" }, \
{VM_UFFD_WP, "uffd_wp" }, \
+IF_HAVE_UFFD_RWP(VM_UFFD_RWP, "uffd_rwp" ) \
{VM_LOCKED, "locked" }, \
{VM_IO, "io" }, \
{VM_SEQ_READ, "seqread" }, \
--
2.51.2
^ permalink raw reply related
* [PATCH v3 03/16] mm: rename uffd-wp PTE accessors to uffd
From: Kiryl Shutsemau @ 2026-05-22 13:38 UTC (permalink / raw)
To: akpm, rppt, peterx, david
Cc: ljs, surenb, vbabka, Liam.Howlett, ziy, corbet, skhan, seanjc,
pbonzini, jthoughton, aarcange, sj, usama.arif, linux-mm,
linux-kernel, linux-doc, linux-kselftest, kvm, kernel-team,
linux-man, alx, Kiryl Shutsemau (Meta)
In-Reply-To: <20260522133857.552279-1-kirill@shutemov.name>
From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
Userfaultfd RWP will reuse the uffd-wp PTE bit to mark access-tracking
PTEs, alongside the write-protected ones it already marks. The bit's
meaning now depends on the VMA flag (WP or RWP), not on its name.
Rename the kernel-internal names that describe the bit:
- pte/pmd/huge_pte accessors (and swap variants)
- pgtable_supports_uffd() capability query
- SCAN_PTE_UFFD khugepaged enum
The ftrace string emitted by mm_khugepaged_scan_pmd for this enum is
kept as "pte_uffd_wp" so existing trace-based tooling keeps matching.
Pure mechanical rename -- no behavior change.
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Assisted-by: Claude:claude-opus-4-6
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
---
arch/arm64/include/asm/pgtable.h | 28 ++++++++--------
arch/riscv/include/asm/pgtable.h | 38 +++++++++++-----------
arch/s390/include/asm/hugetlb.h | 12 +++----
arch/x86/include/asm/pgtable.h | 24 +++++++-------
fs/proc/task_mmu.c | 44 ++++++++++++-------------
fs/userfaultfd.c | 4 +--
include/asm-generic/hugetlb.h | 18 +++++------
include/asm-generic/pgtable_uffd.h | 32 +++++++++---------
include/linux/leafops.h | 4 +--
include/linux/mm_inline.h | 4 +--
include/linux/swapops.h | 4 +--
include/linux/userfaultfd_k.h | 14 ++++----
include/trace/events/huge_memory.h | 2 +-
mm/huge_memory.c | 52 +++++++++++++++---------------
mm/hugetlb.c | 46 +++++++++++++-------------
mm/internal.h | 4 +--
mm/khugepaged.c | 20 ++++++------
mm/memory.c | 34 +++++++++----------
mm/migrate.c | 12 +++----
mm/migrate_device.c | 8 ++---
mm/mprotect.c | 12 +++----
mm/mremap.c | 4 +--
mm/page_table_check.c | 8 ++---
mm/rmap.c | 16 ++++-----
mm/swapfile.c | 4 +--
mm/userfaultfd.c | 2 +-
26 files changed, 225 insertions(+), 225 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 3eecb2c17711..c41e4d59dc9f 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -341,17 +341,17 @@ static inline pmd_t pmd_mknoncont(pmd_t pmd)
}
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-static inline int pte_uffd_wp(pte_t pte)
+static inline int pte_uffd(pte_t pte)
{
return !!(pte_val(pte) & PTE_UFFD);
}
-static inline pte_t pte_mkuffd_wp(pte_t pte)
+static inline pte_t pte_mkuffd(pte_t pte)
{
return pte_wrprotect(set_pte_bit(pte, __pgprot(PTE_UFFD)));
}
-static inline pte_t pte_clear_uffd_wp(pte_t pte)
+static inline pte_t pte_clear_uffd(pte_t pte)
{
return clear_pte_bit(pte, __pgprot(PTE_UFFD));
}
@@ -537,17 +537,17 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
}
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+static inline pte_t pte_swp_mkuffd(pte_t pte)
{
return set_pte_bit(pte, __pgprot(PTE_SWP_UFFD));
}
-static inline int pte_swp_uffd_wp(pte_t pte)
+static inline int pte_swp_uffd(pte_t pte)
{
return !!(pte_val(pte) & PTE_SWP_UFFD);
}
-static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+static inline pte_t pte_swp_clear_uffd(pte_t pte)
{
return clear_pte_bit(pte, __pgprot(PTE_SWP_UFFD));
}
@@ -590,13 +590,13 @@ static inline int pmd_protnone(pmd_t pmd)
#define pmd_mkvalid_k(pmd) pte_pmd(pte_mkvalid_k(pmd_pte(pmd)))
#define pmd_mkinvalid(pmd) pte_pmd(pte_mkinvalid(pmd_pte(pmd)))
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-#define pmd_uffd_wp(pmd) pte_uffd_wp(pmd_pte(pmd))
-#define pmd_mkuffd_wp(pmd) pte_pmd(pte_mkuffd_wp(pmd_pte(pmd)))
-#define pmd_clear_uffd_wp(pmd) pte_pmd(pte_clear_uffd_wp(pmd_pte(pmd)))
-#define pmd_swp_uffd_wp(pmd) pte_swp_uffd_wp(pmd_pte(pmd))
-#define pmd_swp_mkuffd_wp(pmd) pte_pmd(pte_swp_mkuffd_wp(pmd_pte(pmd)))
-#define pmd_swp_clear_uffd_wp(pmd) \
- pte_pmd(pte_swp_clear_uffd_wp(pmd_pte(pmd)))
+#define pmd_uffd(pmd) pte_uffd(pmd_pte(pmd))
+#define pmd_mkuffd(pmd) pte_pmd(pte_mkuffd(pmd_pte(pmd)))
+#define pmd_clear_uffd(pmd) pte_pmd(pte_clear_uffd(pmd_pte(pmd)))
+#define pmd_swp_uffd(pmd) pte_swp_uffd(pmd_pte(pmd))
+#define pmd_swp_mkuffd(pmd) pte_pmd(pte_swp_mkuffd(pmd_pte(pmd)))
+#define pmd_swp_clear_uffd(pmd) \
+ pte_pmd(pte_swp_clear_uffd(pmd_pte(pmd)))
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
#define pmd_write(pmd) pte_write(pmd_pte(pmd))
@@ -1512,7 +1512,7 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
* Encode and decode a swap entry:
* bits 0-1: present (must be zero)
* bits 2: remember PG_anon_exclusive
- * bit 3: remember uffd-wp state
+ * bit 3: remember uffd state
* bits 6-10: swap type
* bit 11: PTE_PRESENT_INVALID (must be zero)
* bits 12-61: swap offset
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index ca69948b3ed8..b111e134795e 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -400,35 +400,35 @@ static inline pte_t pte_wrprotect(pte_t pte)
}
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-#define pgtable_supports_uffd_wp() \
+#define pgtable_supports_uffd() \
riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)
-static inline bool pte_uffd_wp(pte_t pte)
+static inline bool pte_uffd(pte_t pte)
{
return !!(pte_val(pte) & _PAGE_UFFD);
}
-static inline pte_t pte_mkuffd_wp(pte_t pte)
+static inline pte_t pte_mkuffd(pte_t pte)
{
return pte_wrprotect(__pte(pte_val(pte) | _PAGE_UFFD));
}
-static inline pte_t pte_clear_uffd_wp(pte_t pte)
+static inline pte_t pte_clear_uffd(pte_t pte)
{
return __pte(pte_val(pte) & ~(_PAGE_UFFD));
}
-static inline bool pte_swp_uffd_wp(pte_t pte)
+static inline bool pte_swp_uffd(pte_t pte)
{
return !!(pte_val(pte) & _PAGE_SWP_UFFD);
}
-static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+static inline pte_t pte_swp_mkuffd(pte_t pte)
{
return __pte(pte_val(pte) | _PAGE_SWP_UFFD);
}
-static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+static inline pte_t pte_swp_clear_uffd(pte_t pte)
{
return __pte(pte_val(pte) & ~(_PAGE_SWP_UFFD));
}
@@ -886,34 +886,34 @@ static inline pud_t pud_mkspecial(pud_t pud)
#endif
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-static inline bool pmd_uffd_wp(pmd_t pmd)
+static inline bool pmd_uffd(pmd_t pmd)
{
- return pte_uffd_wp(pmd_pte(pmd));
+ return pte_uffd(pmd_pte(pmd));
}
-static inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+static inline pmd_t pmd_mkuffd(pmd_t pmd)
{
- return pte_pmd(pte_mkuffd_wp(pmd_pte(pmd)));
+ return pte_pmd(pte_mkuffd(pmd_pte(pmd)));
}
-static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+static inline pmd_t pmd_clear_uffd(pmd_t pmd)
{
- return pte_pmd(pte_clear_uffd_wp(pmd_pte(pmd)));
+ return pte_pmd(pte_clear_uffd(pmd_pte(pmd)));
}
-static inline bool pmd_swp_uffd_wp(pmd_t pmd)
+static inline bool pmd_swp_uffd(pmd_t pmd)
{
- return pte_swp_uffd_wp(pmd_pte(pmd));
+ return pte_swp_uffd(pmd_pte(pmd));
}
-static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+static inline pmd_t pmd_swp_mkuffd(pmd_t pmd)
{
- return pte_pmd(pte_swp_mkuffd_wp(pmd_pte(pmd)));
+ return pte_pmd(pte_swp_mkuffd(pmd_pte(pmd)));
}
-static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+static inline pmd_t pmd_swp_clear_uffd(pmd_t pmd)
{
- return pte_pmd(pte_swp_clear_uffd_wp(pmd_pte(pmd)));
+ return pte_pmd(pte_swp_clear_uffd(pmd_pte(pmd)));
}
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h
index 6983e52eaf81..cf8a176ff3d8 100644
--- a/arch/s390/include/asm/hugetlb.h
+++ b/arch/s390/include/asm/hugetlb.h
@@ -77,20 +77,20 @@ static inline void huge_ptep_set_wrprotect(struct mm_struct *mm,
__set_huge_pte_at(mm, addr, ptep, pte_wrprotect(pte));
}
-#define __HAVE_ARCH_HUGE_PTE_MKUFFD_WP
-static inline pte_t huge_pte_mkuffd_wp(pte_t pte)
+#define __HAVE_ARCH_HUGE_PTE_MKUFFD
+static inline pte_t huge_pte_mkuffd(pte_t pte)
{
return pte;
}
-#define __HAVE_ARCH_HUGE_PTE_CLEAR_UFFD_WP
-static inline pte_t huge_pte_clear_uffd_wp(pte_t pte)
+#define __HAVE_ARCH_HUGE_PTE_CLEAR_UFFD
+static inline pte_t huge_pte_clear_uffd(pte_t pte)
{
return pte;
}
-#define __HAVE_ARCH_HUGE_PTE_UFFD_WP
-static inline int huge_pte_uffd_wp(pte_t pte)
+#define __HAVE_ARCH_HUGE_PTE_UFFD
+static inline int huge_pte_uffd(pte_t pte)
{
return 0;
}
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 038c806b50a2..d14c84b2a332 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -411,17 +411,17 @@ static inline pte_t pte_wrprotect(pte_t pte)
}
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-static inline int pte_uffd_wp(pte_t pte)
+static inline int pte_uffd(pte_t pte)
{
return pte_flags(pte) & _PAGE_UFFD;
}
-static inline pte_t pte_mkuffd_wp(pte_t pte)
+static inline pte_t pte_mkuffd(pte_t pte)
{
return pte_wrprotect(pte_set_flags(pte, _PAGE_UFFD));
}
-static inline pte_t pte_clear_uffd_wp(pte_t pte)
+static inline pte_t pte_clear_uffd(pte_t pte)
{
return pte_clear_flags(pte, _PAGE_UFFD);
}
@@ -526,17 +526,17 @@ static inline pmd_t pmd_wrprotect(pmd_t pmd)
}
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-static inline int pmd_uffd_wp(pmd_t pmd)
+static inline int pmd_uffd(pmd_t pmd)
{
return pmd_flags(pmd) & _PAGE_UFFD;
}
-static inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+static inline pmd_t pmd_mkuffd(pmd_t pmd)
{
return pmd_wrprotect(pmd_set_flags(pmd, _PAGE_UFFD));
}
-static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+static inline pmd_t pmd_clear_uffd(pmd_t pmd)
{
return pmd_clear_flags(pmd, _PAGE_UFFD);
}
@@ -1548,32 +1548,32 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
#endif
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+static inline pte_t pte_swp_mkuffd(pte_t pte)
{
return pte_set_flags(pte, _PAGE_SWP_UFFD);
}
-static inline int pte_swp_uffd_wp(pte_t pte)
+static inline int pte_swp_uffd(pte_t pte)
{
return pte_flags(pte) & _PAGE_SWP_UFFD;
}
-static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+static inline pte_t pte_swp_clear_uffd(pte_t pte)
{
return pte_clear_flags(pte, _PAGE_SWP_UFFD);
}
-static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+static inline pmd_t pmd_swp_mkuffd(pmd_t pmd)
{
return pmd_set_flags(pmd, _PAGE_SWP_UFFD);
}
-static inline int pmd_swp_uffd_wp(pmd_t pmd)
+static inline int pmd_swp_uffd(pmd_t pmd)
{
return pmd_flags(pmd) & _PAGE_SWP_UFFD;
}
-static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+static inline pmd_t pmd_swp_clear_uffd(pmd_t pmd)
{
return pmd_clear_flags(pmd, _PAGE_SWP_UFFD);
}
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 751b9ba160fb..5827074962e7 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1948,14 +1948,14 @@ static pagemap_entry_t pte_to_pagemap_entry(struct pagemapread *pm,
page = vm_normal_page(vma, addr, pte);
if (pte_soft_dirty(pte))
flags |= PM_SOFT_DIRTY;
- if (pte_uffd_wp(pte))
+ if (pte_uffd(pte))
flags |= PM_UFFD_WP;
} else {
softleaf_t entry;
if (pte_swp_soft_dirty(pte))
flags |= PM_SOFT_DIRTY;
- if (pte_swp_uffd_wp(pte))
+ if (pte_swp_uffd(pte))
flags |= PM_UFFD_WP;
entry = softleaf_from_pte(pte);
if (pm->show_pfn) {
@@ -2021,7 +2021,7 @@ static int pagemap_pmd_range_thp(pmd_t *pmdp, unsigned long addr,
flags |= PM_PRESENT;
if (pmd_soft_dirty(pmd))
flags |= PM_SOFT_DIRTY;
- if (pmd_uffd_wp(pmd))
+ if (pmd_uffd(pmd))
flags |= PM_UFFD_WP;
if (pm->show_pfn)
frame = pmd_pfn(pmd) + idx;
@@ -2040,7 +2040,7 @@ static int pagemap_pmd_range_thp(pmd_t *pmdp, unsigned long addr,
flags |= PM_SWAP;
if (pmd_swp_soft_dirty(pmd))
flags |= PM_SOFT_DIRTY;
- if (pmd_swp_uffd_wp(pmd))
+ if (pmd_swp_uffd(pmd))
flags |= PM_UFFD_WP;
VM_WARN_ON_ONCE(!pmd_is_migration_entry(pmd));
page = softleaf_to_page(entry);
@@ -2146,14 +2146,14 @@ static int pagemap_hugetlb_range(pte_t *ptep, unsigned long hmask,
!hugetlb_pmd_shared(ptep))
flags |= PM_MMAP_EXCLUSIVE;
- if (huge_pte_uffd_wp(pte))
+ if (huge_pte_uffd(pte))
flags |= PM_UFFD_WP;
flags |= PM_PRESENT;
if (pm->show_pfn)
frame = pte_pfn(pte) +
((addr & ~hmask) >> PAGE_SHIFT);
- } else if (pte_swp_uffd_wp_any(pte)) {
+ } else if (pte_swp_uffd_any(pte)) {
flags |= PM_UFFD_WP;
}
@@ -2354,7 +2354,7 @@ static unsigned long pagemap_page_category(struct pagemap_scan_private *p,
categories = PAGE_IS_PRESENT;
- if (!pte_uffd_wp(pte))
+ if (!pte_uffd(pte))
categories |= PAGE_IS_WRITTEN;
if (p->masks_of_interest & PAGE_IS_FILE) {
@@ -2372,7 +2372,7 @@ static unsigned long pagemap_page_category(struct pagemap_scan_private *p,
categories = PAGE_IS_SWAPPED;
- if (!pte_swp_uffd_wp_any(pte))
+ if (!pte_swp_uffd_any(pte))
categories |= PAGE_IS_WRITTEN;
entry = softleaf_from_pte(pte);
@@ -2397,13 +2397,13 @@ static void make_uffd_wp_pte(struct vm_area_struct *vma,
pte_t old_pte;
old_pte = ptep_modify_prot_start(vma, addr, pte);
- ptent = pte_mkuffd_wp(old_pte);
+ ptent = pte_mkuffd(old_pte);
ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
} else if (pte_none(ptent)) {
set_pte_at(vma->vm_mm, addr, pte,
make_pte_marker(PTE_MARKER_UFFD_WP));
} else {
- ptent = pte_swp_mkuffd_wp(ptent);
+ ptent = pte_swp_mkuffd(ptent);
set_pte_at(vma->vm_mm, addr, pte, ptent);
}
}
@@ -2422,7 +2422,7 @@ static unsigned long pagemap_thp_category(struct pagemap_scan_private *p,
struct page *page;
categories |= PAGE_IS_PRESENT;
- if (!pmd_uffd_wp(pmd))
+ if (!pmd_uffd(pmd))
categories |= PAGE_IS_WRITTEN;
if (p->masks_of_interest & PAGE_IS_FILE) {
@@ -2437,7 +2437,7 @@ static unsigned long pagemap_thp_category(struct pagemap_scan_private *p,
categories |= PAGE_IS_SOFT_DIRTY;
} else {
categories |= PAGE_IS_SWAPPED;
- if (!pmd_swp_uffd_wp(pmd))
+ if (!pmd_swp_uffd(pmd))
categories |= PAGE_IS_WRITTEN;
if (pmd_swp_soft_dirty(pmd))
categories |= PAGE_IS_SOFT_DIRTY;
@@ -2461,10 +2461,10 @@ static void make_uffd_wp_pmd(struct vm_area_struct *vma,
if (pmd_present(pmd)) {
old = pmdp_invalidate_ad(vma, addr, pmdp);
- pmd = pmd_mkuffd_wp(old);
+ pmd = pmd_mkuffd(old);
set_pmd_at(vma->vm_mm, addr, pmdp, pmd);
} else if (pmd_is_migration_entry(pmd)) {
- pmd = pmd_swp_mkuffd_wp(pmd);
+ pmd = pmd_swp_mkuffd(pmd);
set_pmd_at(vma->vm_mm, addr, pmdp, pmd);
}
}
@@ -2486,7 +2486,7 @@ static unsigned long pagemap_hugetlb_category(pte_t pte)
if (pte_present(pte)) {
categories |= PAGE_IS_PRESENT;
- if (!huge_pte_uffd_wp(pte))
+ if (!huge_pte_uffd(pte))
categories |= PAGE_IS_WRITTEN;
if (!PageAnon(pte_page(pte)))
categories |= PAGE_IS_FILE;
@@ -2497,7 +2497,7 @@ static unsigned long pagemap_hugetlb_category(pte_t pte)
} else {
categories |= PAGE_IS_SWAPPED;
- if (!pte_swp_uffd_wp_any(pte))
+ if (!pte_swp_uffd_any(pte))
categories |= PAGE_IS_WRITTEN;
if (pte_swp_soft_dirty(pte))
categories |= PAGE_IS_SOFT_DIRTY;
@@ -2525,10 +2525,10 @@ static void make_uffd_wp_huge_pte(struct vm_area_struct *vma,
if (softleaf_is_migration(entry))
set_huge_pte_at(vma->vm_mm, addr, ptep,
- pte_swp_mkuffd_wp(ptent), psize);
+ pte_swp_mkuffd(ptent), psize);
else
huge_ptep_modify_prot_commit(vma, addr, ptep, ptent,
- huge_pte_mkuffd_wp(ptent));
+ huge_pte_mkuffd(ptent));
}
#endif /* CONFIG_HUGETLB_PAGE */
@@ -2759,8 +2759,8 @@ static int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start,
for (addr = start; addr != end; pte++, addr += PAGE_SIZE) {
pte_t ptent = ptep_get(pte);
- if ((pte_present(ptent) && pte_uffd_wp(ptent)) ||
- pte_swp_uffd_wp_any(ptent))
+ if ((pte_present(ptent) && pte_uffd(ptent)) ||
+ pte_swp_uffd_any(ptent))
continue;
make_uffd_wp_pte(vma, addr, pte, ptent);
if (!flush_end)
@@ -2777,8 +2777,8 @@ static int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start,
unsigned long next = addr + PAGE_SIZE;
pte_t ptent = ptep_get(pte);
- if ((pte_present(ptent) && pte_uffd_wp(ptent)) ||
- pte_swp_uffd_wp_any(ptent))
+ if ((pte_present(ptent) && pte_uffd(ptent)) ||
+ pte_swp_uffd_any(ptent))
continue;
ret = pagemap_scan_output(p->cur_vma_category | PAGE_IS_WRITTEN,
p, addr, &next);
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 4b53dc4a3266..0fdf28f62702 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1287,7 +1287,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING)
vm_flags |= VM_UFFD_MISSING;
if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) {
- if (!pgtable_supports_uffd_wp())
+ if (!pgtable_supports_uffd())
goto out;
vm_flags |= VM_UFFD_WP;
@@ -1997,7 +1997,7 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx,
uffdio_api.features &=
~(UFFD_FEATURE_MINOR_HUGETLBFS | UFFD_FEATURE_MINOR_SHMEM);
#endif
- if (!pgtable_supports_uffd_wp())
+ if (!pgtable_supports_uffd())
uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP;
if (!uffd_supports_wp_marker()) {
diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h
index e1a2e1b7c8e7..635c41cc3479 100644
--- a/include/asm-generic/hugetlb.h
+++ b/include/asm-generic/hugetlb.h
@@ -37,24 +37,24 @@ static inline pte_t huge_pte_modify(pte_t pte, pgprot_t newprot)
return pte_modify(pte, newprot);
}
-#ifndef __HAVE_ARCH_HUGE_PTE_MKUFFD_WP
-static inline pte_t huge_pte_mkuffd_wp(pte_t pte)
+#ifndef __HAVE_ARCH_HUGE_PTE_MKUFFD
+static inline pte_t huge_pte_mkuffd(pte_t pte)
{
- return huge_pte_wrprotect(pte_mkuffd_wp(pte));
+ return huge_pte_wrprotect(pte_mkuffd(pte));
}
#endif
-#ifndef __HAVE_ARCH_HUGE_PTE_CLEAR_UFFD_WP
-static inline pte_t huge_pte_clear_uffd_wp(pte_t pte)
+#ifndef __HAVE_ARCH_HUGE_PTE_CLEAR_UFFD
+static inline pte_t huge_pte_clear_uffd(pte_t pte)
{
- return pte_clear_uffd_wp(pte);
+ return pte_clear_uffd(pte);
}
#endif
-#ifndef __HAVE_ARCH_HUGE_PTE_UFFD_WP
-static inline int huge_pte_uffd_wp(pte_t pte)
+#ifndef __HAVE_ARCH_HUGE_PTE_UFFD
+static inline int huge_pte_uffd(pte_t pte)
{
- return pte_uffd_wp(pte);
+ return pte_uffd(pte);
}
#endif
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
index 0d85791efdf7..30e88fc1de2f 100644
--- a/include/asm-generic/pgtable_uffd.h
+++ b/include/asm-generic/pgtable_uffd.h
@@ -2,79 +2,79 @@
#define _ASM_GENERIC_PGTABLE_UFFD_H
/*
- * Some platforms can customize the uffd-wp bit, making it unavailable
+ * Some platforms can customize the uffd PTE bit, making it unavailable
* even if the architecture provides the resource.
* Adding this API allows architectures to add their own checks for the
* devices on which the kernel is running.
* Note: When overriding it, please make sure the
* CONFIG_HAVE_ARCH_USERFAULTFD_WP is part of this macro.
*/
-#ifndef pgtable_supports_uffd_wp
-#define pgtable_supports_uffd_wp() IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP)
+#ifndef pgtable_supports_uffd
+#define pgtable_supports_uffd() IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP)
#endif
static inline bool uffd_supports_wp_marker(void)
{
- return pgtable_supports_uffd_wp() && IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP);
+ return pgtable_supports_uffd() && IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP);
}
#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-static __always_inline int pte_uffd_wp(pte_t pte)
+static __always_inline int pte_uffd(pte_t pte)
{
return 0;
}
-static __always_inline int pmd_uffd_wp(pmd_t pmd)
+static __always_inline int pmd_uffd(pmd_t pmd)
{
return 0;
}
-static __always_inline pte_t pte_mkuffd_wp(pte_t pte)
+static __always_inline pte_t pte_mkuffd(pte_t pte)
{
return pte;
}
-static __always_inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+static __always_inline pmd_t pmd_mkuffd(pmd_t pmd)
{
return pmd;
}
-static __always_inline pte_t pte_clear_uffd_wp(pte_t pte)
+static __always_inline pte_t pte_clear_uffd(pte_t pte)
{
return pte;
}
-static __always_inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+static __always_inline pmd_t pmd_clear_uffd(pmd_t pmd)
{
return pmd;
}
-static __always_inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+static __always_inline pte_t pte_swp_mkuffd(pte_t pte)
{
return pte;
}
-static __always_inline int pte_swp_uffd_wp(pte_t pte)
+static __always_inline int pte_swp_uffd(pte_t pte)
{
return 0;
}
-static __always_inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+static __always_inline pte_t pte_swp_clear_uffd(pte_t pte)
{
return pte;
}
-static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+static inline pmd_t pmd_swp_mkuffd(pmd_t pmd)
{
return pmd;
}
-static inline int pmd_swp_uffd_wp(pmd_t pmd)
+static inline int pmd_swp_uffd(pmd_t pmd)
{
return 0;
}
-static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+static inline pmd_t pmd_swp_clear_uffd(pmd_t pmd)
{
return pmd;
}
diff --git a/include/linux/leafops.h b/include/linux/leafops.h
index 992cd8bd8ed0..2ce2f37ac883 100644
--- a/include/linux/leafops.h
+++ b/include/linux/leafops.h
@@ -100,8 +100,8 @@ static inline softleaf_t softleaf_from_pmd(pmd_t pmd)
if (pmd_swp_soft_dirty(pmd))
pmd = pmd_swp_clear_soft_dirty(pmd);
- if (pmd_swp_uffd_wp(pmd))
- pmd = pmd_swp_clear_uffd_wp(pmd);
+ if (pmd_swp_uffd(pmd))
+ pmd = pmd_swp_clear_uffd(pmd);
arch_entry = __pmd_to_swp_entry(pmd);
/* Temporary until swp_entry_t eliminated. */
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index a171070e15f0..2811caf4188d 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -600,14 +600,14 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
return false;
/* A uffd-wp wr-protected normal pte */
- if (unlikely(pte_present(pteval) && pte_uffd_wp(pteval)))
+ if (unlikely(pte_present(pteval) && pte_uffd(pteval)))
arm_uffd_pte = true;
/*
* A uffd-wp wr-protected swap pte. Note: this should even cover an
* existing pte marker with uffd-wp bit set.
*/
- if (unlikely(pte_swp_uffd_wp_any(pteval)))
+ if (unlikely(pte_swp_uffd_any(pteval)))
arm_uffd_pte = true;
if (unlikely(arm_uffd_pte)) {
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 8cfc966eae48..15c6440e38dd 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -73,8 +73,8 @@ static inline pte_t pte_swp_clear_flags(pte_t pte)
pte = pte_swp_clear_exclusive(pte);
if (pte_swp_soft_dirty(pte))
pte = pte_swp_clear_soft_dirty(pte);
- if (pte_swp_uffd_wp(pte))
- pte = pte_swp_clear_uffd_wp(pte);
+ if (pte_swp_uffd(pte))
+ pte = pte_swp_clear_uffd(pte);
return pte;
}
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index d2920f98ab86..98f546e83cd2 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -225,13 +225,13 @@ static inline bool userfaultfd_minor(struct vm_area_struct *vma)
static inline bool userfaultfd_pte_wp(struct vm_area_struct *vma,
pte_t pte)
{
- return userfaultfd_wp(vma) && pte_uffd_wp(pte);
+ return userfaultfd_wp(vma) && pte_uffd(pte);
}
static inline bool userfaultfd_huge_pmd_wp(struct vm_area_struct *vma,
pmd_t pmd)
{
- return userfaultfd_wp(vma) && pmd_uffd_wp(pmd);
+ return userfaultfd_wp(vma) && pmd_uffd(pmd);
}
static inline bool userfaultfd_armed(struct vm_area_struct *vma)
@@ -308,10 +308,10 @@ static inline bool userfaultfd_wp_use_markers(struct vm_area_struct *vma)
}
/*
- * Returns true if this is a swap pte and was uffd-wp wr-protected in either
- * forms (pte marker or a normal swap pte), false otherwise.
+ * Returns true if this swap pte carries uffd-tracked state in either
+ * form (pte marker or a normal swap pte), false otherwise.
*/
-static inline bool pte_swp_uffd_wp_any(pte_t pte)
+static inline bool pte_swp_uffd_any(pte_t pte)
{
if (!uffd_supports_wp_marker())
return false;
@@ -319,7 +319,7 @@ static inline bool pte_swp_uffd_wp_any(pte_t pte)
if (pte_present(pte))
return false;
- if (pte_swp_uffd_wp(pte))
+ if (pte_swp_uffd(pte))
return true;
if (pte_is_uffd_wp_marker(pte))
@@ -460,7 +460,7 @@ static inline bool userfaultfd_wp_use_markers(struct vm_area_struct *vma)
* Returns true if this is a swap pte and was uffd-wp wr-protected in either
* forms (pte marker or a normal swap pte), false otherwise.
*/
-static inline bool pte_swp_uffd_wp_any(pte_t pte)
+static inline bool pte_swp_uffd_any(pte_t pte)
{
return false;
}
diff --git a/include/trace/events/huge_memory.h b/include/trace/events/huge_memory.h
index bcdc57eea270..b4a314b06aef 100644
--- a/include/trace/events/huge_memory.h
+++ b/include/trace/events/huge_memory.h
@@ -16,7 +16,7 @@
EM( SCAN_EXCEED_SWAP_PTE, "exceed_swap_pte") \
EM( SCAN_EXCEED_SHARED_PTE, "exceed_shared_pte") \
EM( SCAN_PTE_NON_PRESENT, "pte_non_present") \
- EM( SCAN_PTE_UFFD_WP, "pte_uffd_wp") \
+ EM( SCAN_PTE_UFFD, "pte_uffd_wp") \
EM( SCAN_PTE_MAPPED_HUGEPAGE, "pte_mapped_hugepage") \
EM( SCAN_LACK_REFERENCED_PAGE, "lack_referenced_page") \
EM( SCAN_PAGE_NULL, "page_null") \
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 970e077019b7..d88fcccd386d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1884,8 +1884,8 @@ static void copy_huge_non_present_pmd(
pmd = swp_entry_to_pmd(entry);
if (pmd_swp_soft_dirty(*src_pmd))
pmd = pmd_swp_mksoft_dirty(pmd);
- if (pmd_swp_uffd_wp(*src_pmd))
- pmd = pmd_swp_mkuffd_wp(pmd);
+ if (pmd_swp_uffd(*src_pmd))
+ pmd = pmd_swp_mkuffd(pmd);
set_pmd_at(src_mm, addr, src_pmd, pmd);
} else if (softleaf_is_device_private(entry)) {
/*
@@ -1898,8 +1898,8 @@ static void copy_huge_non_present_pmd(
if (pmd_swp_soft_dirty(*src_pmd))
pmd = pmd_swp_mksoft_dirty(pmd);
- if (pmd_swp_uffd_wp(*src_pmd))
- pmd = pmd_swp_mkuffd_wp(pmd);
+ if (pmd_swp_uffd(*src_pmd))
+ pmd = pmd_swp_mkuffd(pmd);
set_pmd_at(src_mm, addr, src_pmd, pmd);
}
@@ -1919,7 +1919,7 @@ static void copy_huge_non_present_pmd(
mm_inc_nr_ptes(dst_mm);
pgtable_trans_huge_deposit(dst_mm, dst_pmd, pgtable);
if (!userfaultfd_wp(dst_vma))
- pmd = pmd_swp_clear_uffd_wp(pmd);
+ pmd = pmd_swp_clear_uffd(pmd);
set_pmd_at(dst_mm, addr, dst_pmd, pmd);
}
@@ -2015,7 +2015,7 @@ int copy_huge_pmd(struct mm_struct *dst_mm, struct mm_struct *src_mm,
pgtable_trans_huge_deposit(dst_mm, dst_pmd, pgtable);
pmdp_set_wrprotect(src_mm, addr, src_pmd);
if (!userfaultfd_wp(dst_vma))
- pmd = pmd_clear_uffd_wp(pmd);
+ pmd = pmd_clear_uffd(pmd);
pmd = pmd_wrprotect(pmd);
set_pmd:
pmd = pmd_mkold(pmd);
@@ -2556,9 +2556,9 @@ static pmd_t clear_uffd_wp_pmd(pmd_t pmd)
if (pmd_none(pmd))
return pmd;
if (pmd_present(pmd))
- pmd = pmd_clear_uffd_wp(pmd);
+ pmd = pmd_clear_uffd(pmd);
else
- pmd = pmd_swp_clear_uffd_wp(pmd);
+ pmd = pmd_swp_clear_uffd(pmd);
return pmd;
}
@@ -2643,9 +2643,9 @@ static void change_non_present_huge_pmd(struct mm_struct *mm,
}
if (uffd_wp)
- newpmd = pmd_swp_mkuffd_wp(newpmd);
+ newpmd = pmd_swp_mkuffd(newpmd);
else if (uffd_wp_resolve)
- newpmd = pmd_swp_clear_uffd_wp(newpmd);
+ newpmd = pmd_swp_clear_uffd(newpmd);
if (!pmd_same(*pmd, newpmd))
set_pmd_at(mm, addr, pmd, newpmd);
}
@@ -2726,14 +2726,14 @@ int change_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
entry = pmd_modify(oldpmd, newprot);
if (uffd_wp)
- entry = pmd_mkuffd_wp(entry);
+ entry = pmd_mkuffd(entry);
else if (uffd_wp_resolve)
/*
* Leave the write bit to be handled by PF interrupt
* handler, then things like COW could be properly
* handled.
*/
- entry = pmd_clear_uffd_wp(entry);
+ entry = pmd_clear_uffd(entry);
/* See change_pte_range(). */
if ((cp_flags & MM_CP_TRY_CHANGE_WRITABLE) && !pmd_write(entry) &&
@@ -3076,8 +3076,8 @@ static void __split_huge_zero_page_pmd(struct vm_area_struct *vma,
entry = pfn_pte(zero_pfn(addr), vma->vm_page_prot);
entry = pte_mkspecial(entry);
- if (pmd_uffd_wp(old_pmd))
- entry = pte_mkuffd_wp(entry);
+ if (pmd_uffd(old_pmd))
+ entry = pte_mkuffd(entry);
VM_BUG_ON(!pte_none(ptep_get(pte)));
set_pte_at(mm, addr, pte, entry);
pte++;
@@ -3161,7 +3161,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
folio = page_folio(page);
soft_dirty = pmd_swp_soft_dirty(old_pmd);
- uffd_wp = pmd_swp_uffd_wp(old_pmd);
+ uffd_wp = pmd_swp_uffd(old_pmd);
write = softleaf_is_migration_write(entry);
if (PageAnon(page))
@@ -3177,7 +3177,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
folio = page_folio(page);
soft_dirty = pmd_swp_soft_dirty(old_pmd);
- uffd_wp = pmd_swp_uffd_wp(old_pmd);
+ uffd_wp = pmd_swp_uffd(old_pmd);
write = softleaf_is_device_private_write(entry);
anon_exclusive = PageAnonExclusive(page);
@@ -3234,7 +3234,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
write = pmd_write(old_pmd);
young = pmd_young(old_pmd);
soft_dirty = pmd_soft_dirty(old_pmd);
- uffd_wp = pmd_uffd_wp(old_pmd);
+ uffd_wp = pmd_uffd(old_pmd);
VM_WARN_ON_FOLIO(!folio_ref_count(folio), folio);
VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio);
@@ -3305,7 +3305,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
if (soft_dirty)
entry = pte_swp_mksoft_dirty(entry);
if (uffd_wp)
- entry = pte_swp_mkuffd_wp(entry);
+ entry = pte_swp_mkuffd(entry);
VM_WARN_ON(!pte_none(ptep_get(pte + i)));
set_pte_at(mm, addr, pte + i, entry);
}
@@ -3332,7 +3332,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
if (soft_dirty)
entry = pte_swp_mksoft_dirty(entry);
if (uffd_wp)
- entry = pte_swp_mkuffd_wp(entry);
+ entry = pte_swp_mkuffd(entry);
VM_WARN_ON(!pte_none(ptep_get(pte + i)));
set_pte_at(mm, addr, pte + i, entry);
}
@@ -3350,7 +3350,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
if (soft_dirty)
entry = pte_mksoft_dirty(entry);
if (uffd_wp)
- entry = pte_mkuffd_wp(entry);
+ entry = pte_mkuffd(entry);
for (i = 0; i < HPAGE_PMD_NR; i++)
VM_WARN_ON(!pte_none(ptep_get(pte + i)));
@@ -5017,8 +5017,8 @@ int set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
pmdswp = swp_entry_to_pmd(entry);
if (pmd_soft_dirty(pmdval))
pmdswp = pmd_swp_mksoft_dirty(pmdswp);
- if (pmd_uffd_wp(pmdval))
- pmdswp = pmd_swp_mkuffd_wp(pmdswp);
+ if (pmd_uffd(pmdval))
+ pmdswp = pmd_swp_mkuffd(pmdswp);
set_pmd_at(mm, address, pvmw->pmd, pmdswp);
folio_remove_rmap_pmd(folio, page, vma);
folio_put(folio);
@@ -5048,8 +5048,8 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
pmde = pmd_mksoft_dirty(pmde);
if (softleaf_is_migration_write(entry))
pmde = pmd_mkwrite(pmde, vma);
- if (pmd_swp_uffd_wp(*pvmw->pmd))
- pmde = pmd_mkuffd_wp(pmde);
+ if (pmd_swp_uffd(*pvmw->pmd))
+ pmde = pmd_mkuffd(pmde);
if (!softleaf_is_migration_young(entry))
pmde = pmd_mkold(pmde);
/* NOTE: this may contain setting soft-dirty on some archs */
@@ -5069,8 +5069,8 @@ void remove_migration_pmd(struct page_vma_mapped_walk *pvmw, struct page *new)
if (pmd_swp_soft_dirty(*pvmw->pmd))
pmde = pmd_swp_mksoft_dirty(pmde);
- if (pmd_swp_uffd_wp(*pvmw->pmd))
- pmde = pmd_swp_mkuffd_wp(pmde);
+ if (pmd_swp_uffd(*pvmw->pmd))
+ pmde = pmd_swp_mkuffd(pmde);
}
if (folio_test_anon(folio)) {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f24bf49be047..f770c6504e26 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4859,8 +4859,8 @@ hugetlb_install_folio(struct vm_area_struct *vma, pte_t *ptep, unsigned long add
__folio_mark_uptodate(new_folio);
hugetlb_add_new_anon_rmap(new_folio, vma, addr);
- if (userfaultfd_wp(vma) && huge_pte_uffd_wp(old))
- newpte = huge_pte_mkuffd_wp(newpte);
+ if (userfaultfd_wp(vma) && huge_pte_uffd(old))
+ newpte = huge_pte_mkuffd(newpte);
set_huge_pte_at(vma->vm_mm, addr, ptep, newpte, sz);
hugetlb_count_add(pages_per_huge_page(hstate_vma(vma)), vma->vm_mm);
folio_set_hugetlb_migratable(new_folio);
@@ -4934,10 +4934,10 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
softleaf = softleaf_from_pte(entry);
if (unlikely(softleaf_is_hwpoison(softleaf))) {
if (!userfaultfd_wp(dst_vma))
- entry = huge_pte_clear_uffd_wp(entry);
+ entry = huge_pte_clear_uffd(entry);
set_huge_pte_at(dst, addr, dst_pte, entry, sz);
} else if (unlikely(softleaf_is_migration(softleaf))) {
- bool uffd_wp = pte_swp_uffd_wp(entry);
+ bool uffd = pte_swp_uffd(entry);
if (!softleaf_is_migration_read(softleaf) && cow) {
/*
@@ -4947,12 +4947,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
softleaf = make_readable_migration_entry(
swp_offset(softleaf));
entry = swp_entry_to_pte(softleaf);
- if (userfaultfd_wp(src_vma) && uffd_wp)
- entry = pte_swp_mkuffd_wp(entry);
+ if (userfaultfd_wp(src_vma) && uffd)
+ entry = pte_swp_mkuffd(entry);
set_huge_pte_at(src, addr, src_pte, entry, sz);
}
if (!userfaultfd_wp(dst_vma))
- entry = huge_pte_clear_uffd_wp(entry);
+ entry = huge_pte_clear_uffd(entry);
set_huge_pte_at(dst, addr, dst_pte, entry, sz);
} else if (unlikely(pte_is_marker(entry))) {
const pte_marker marker = copy_pte_marker(softleaf, dst_vma);
@@ -5028,7 +5028,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
}
if (!userfaultfd_wp(dst_vma))
- entry = huge_pte_clear_uffd_wp(entry);
+ entry = huge_pte_clear_uffd(entry);
set_huge_pte_at(dst, addr, dst_pte, entry, sz);
hugetlb_count_add(npages, dst);
@@ -5076,9 +5076,9 @@ static void move_huge_pte(struct vm_area_struct *vma, unsigned long old_addr,
} else {
if (need_clear_uffd_wp) {
if (pte_present(pte))
- pte = huge_pte_clear_uffd_wp(pte);
+ pte = huge_pte_clear_uffd(pte);
else
- pte = pte_swp_clear_uffd_wp(pte);
+ pte = pte_swp_clear_uffd(pte);
}
set_huge_pte_at(mm, new_addr, dst_pte, pte, sz);
}
@@ -5212,7 +5212,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
* drop the uffd-wp bit in this zap, then replace the
* pte with a marker.
*/
- if (pte_swp_uffd_wp_any(pte) &&
+ if (pte_swp_uffd_any(pte) &&
!(zap_flags & ZAP_FLAG_DROP_MARKER))
set_huge_pte_at(mm, address, ptep,
make_pte_marker(PTE_MARKER_UFFD_WP),
@@ -5248,7 +5248,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
if (huge_pte_dirty(pte))
folio_mark_dirty(folio);
/* Leave a uffd-wp pte marker if needed */
- if (huge_pte_uffd_wp(pte) &&
+ if (huge_pte_uffd(pte) &&
!(zap_flags & ZAP_FLAG_DROP_MARKER))
set_huge_pte_at(mm, address, ptep,
make_pte_marker(PTE_MARKER_UFFD_WP),
@@ -5452,7 +5452,7 @@ static vm_fault_t hugetlb_wp(struct vm_fault *vmf)
* can trigger this, because hugetlb_fault() will always resolve
* uffd-wp bit first.
*/
- if (!unshare && huge_pte_uffd_wp(pte))
+ if (!unshare && huge_pte_uffd(pte))
return 0;
/* Let's take out MAP_SHARED mappings first. */
@@ -5596,8 +5596,8 @@ static vm_fault_t hugetlb_wp(struct vm_fault *vmf)
huge_ptep_clear_flush(vma, vmf->address, vmf->pte);
hugetlb_remove_rmap(old_folio);
hugetlb_add_new_anon_rmap(new_folio, vma, vmf->address);
- if (huge_pte_uffd_wp(pte))
- newpte = huge_pte_mkuffd_wp(newpte);
+ if (huge_pte_uffd(pte))
+ newpte = huge_pte_mkuffd(newpte);
set_huge_pte_at(mm, vmf->address, vmf->pte, newpte,
huge_page_size(h));
folio_set_hugetlb_migratable(new_folio);
@@ -5875,7 +5875,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping,
* if populated.
*/
if (unlikely(pte_is_uffd_wp_marker(vmf->orig_pte)))
- new_pte = huge_pte_mkuffd_wp(new_pte);
+ new_pte = huge_pte_mkuffd(new_pte);
set_huge_pte_at(mm, vmf->address, vmf->pte, new_pte, huge_page_size(h));
hugetlb_count_add(pages_per_huge_page(h), mm);
@@ -6073,7 +6073,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
goto out_ptl;
/* Handle userfault-wp first, before trying to lock more pages */
- if (userfaultfd_wp(vma) && huge_pte_uffd_wp(huge_ptep_get(mm, vmf.address, vmf.pte)) &&
+ if (userfaultfd_wp(vma) && huge_pte_uffd(huge_ptep_get(mm, vmf.address, vmf.pte)) &&
(flags & FAULT_FLAG_WRITE) && !huge_pte_write(vmf.orig_pte)) {
if (!userfaultfd_wp_async(vma)) {
spin_unlock(vmf.ptl);
@@ -6082,7 +6082,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
return handle_userfault(&vmf, VM_UFFD_WP);
}
- vmf.orig_pte = huge_pte_clear_uffd_wp(vmf.orig_pte);
+ vmf.orig_pte = huge_pte_clear_uffd(vmf.orig_pte);
set_huge_pte_at(mm, vmf.address, vmf.pte, vmf.orig_pte,
huge_page_size(hstate_vma(vma)));
/* Fallthrough to CoW */
@@ -6366,7 +6366,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
_dst_pte = pte_mkyoung(_dst_pte);
if (wp_enabled)
- _dst_pte = huge_pte_mkuffd_wp(_dst_pte);
+ _dst_pte = huge_pte_mkuffd(_dst_pte);
set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte, size);
@@ -6490,9 +6490,9 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
}
if (uffd_wp)
- newpte = pte_swp_mkuffd_wp(newpte);
+ newpte = pte_swp_mkuffd(newpte);
else if (uffd_wp_resolve)
- newpte = pte_swp_clear_uffd_wp(newpte);
+ newpte = pte_swp_clear_uffd(newpte);
if (!pte_same(pte, newpte))
set_huge_pte_at(mm, address, ptep, newpte, psize);
} else if (unlikely(pte_is_marker(pte))) {
@@ -6513,9 +6513,9 @@ long hugetlb_change_protection(struct vm_area_struct *vma,
pte = huge_pte_modify(old_pte, newprot);
pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
if (uffd_wp)
- pte = huge_pte_mkuffd_wp(pte);
+ pte = huge_pte_mkuffd(pte);
else if (uffd_wp_resolve)
- pte = huge_pte_clear_uffd_wp(pte);
+ pte = huge_pte_clear_uffd(pte);
huge_ptep_modify_prot_commit(vma, address, ptep, old_pte, pte);
pages++;
tlb_remove_huge_tlb_entry(h, &tlb, ptep, address);
diff --git a/mm/internal.h b/mm/internal.h
index 5a2ddcf68e0b..b0c6d1621d7c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -413,8 +413,8 @@ static inline pte_t pte_move_swp_offset(pte_t pte, long delta)
new = pte_swp_mksoft_dirty(new);
if (pte_swp_exclusive(pte))
new = pte_swp_mkexclusive(new);
- if (pte_swp_uffd_wp(pte))
- new = pte_swp_mkuffd_wp(new);
+ if (pte_swp_uffd(pte))
+ new = pte_swp_mkuffd(new);
return new;
}
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b8452dbdb043..de0644bde400 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -37,7 +37,7 @@ enum scan_result {
SCAN_EXCEED_SWAP_PTE,
SCAN_EXCEED_SHARED_PTE,
SCAN_PTE_NON_PRESENT,
- SCAN_PTE_UFFD_WP,
+ SCAN_PTE_UFFD,
SCAN_PTE_MAPPED_HUGEPAGE,
SCAN_LACK_REFERENCED_PAGE,
SCAN_PAGE_NULL,
@@ -566,8 +566,8 @@ static enum scan_result __collapse_huge_page_isolate(struct vm_area_struct *vma,
result = SCAN_PTE_NON_PRESENT;
goto out;
}
- if (pte_uffd_wp(pteval)) {
- result = SCAN_PTE_UFFD_WP;
+ if (pte_uffd(pteval)) {
+ result = SCAN_PTE_UFFD;
goto out;
}
page = vm_normal_page(vma, addr, pteval);
@@ -1303,10 +1303,10 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
/*
* Always be strict with uffd-wp
* enabled swap entries. Please see
- * comment below for pte_uffd_wp().
+ * comment below for pte_uffd().
*/
- if (pte_swp_uffd_wp_any(pteval)) {
- result = SCAN_PTE_UFFD_WP;
+ if (pte_swp_uffd_any(pteval)) {
+ result = SCAN_PTE_UFFD;
goto out_unmap;
}
continue;
@@ -1316,7 +1316,7 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
goto out_unmap;
}
}
- if (pte_uffd_wp(pteval)) {
+ if (pte_uffd(pteval)) {
/*
* Don't collapse the page if any of the small
* PTEs are armed with uffd write protection.
@@ -1326,7 +1326,7 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
* userfault messages that falls outside of
* the registered range. So, just be simple.
*/
- result = SCAN_PTE_UFFD_WP;
+ result = SCAN_PTE_UFFD;
goto out_unmap;
}
@@ -1534,7 +1534,7 @@ static enum scan_result try_collapse_pte_mapped_thp(struct mm_struct *mm, unsign
/* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
if (userfaultfd_wp(vma))
- return SCAN_PTE_UFFD_WP;
+ return SCAN_PTE_UFFD;
folio = filemap_lock_folio(vma->vm_file->f_mapping,
linear_page_index(vma, haddr));
@@ -2876,7 +2876,7 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
/* Whitelisted set of results where continuing OK */
case SCAN_NO_PTE_TABLE:
case SCAN_PTE_NON_PRESENT:
- case SCAN_PTE_UFFD_WP:
+ case SCAN_PTE_UFFD:
case SCAN_LACK_REFERENCED_PAGE:
case SCAN_PAGE_NULL:
case SCAN_PAGE_COUNT:
diff --git a/mm/memory.c b/mm/memory.c
index ea6568571131..f2e7e900b1b8 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -877,8 +877,8 @@ static void restore_exclusive_pte(struct vm_area_struct *vma,
if (pte_swp_soft_dirty(orig_pte))
pte = pte_mksoft_dirty(pte);
- if (pte_swp_uffd_wp(orig_pte))
- pte = pte_mkuffd_wp(pte);
+ if (pte_swp_uffd(orig_pte))
+ pte = pte_mkuffd(pte);
if ((vma->vm_flags & VM_WRITE) &&
can_change_pte_writable(vma, address, pte)) {
@@ -968,8 +968,8 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
pte = softleaf_to_pte(entry);
if (pte_swp_soft_dirty(orig_pte))
pte = pte_swp_mksoft_dirty(pte);
- if (pte_swp_uffd_wp(orig_pte))
- pte = pte_swp_mkuffd_wp(pte);
+ if (pte_swp_uffd(orig_pte))
+ pte = pte_swp_mkuffd(pte);
set_pte_at(src_mm, addr, src_pte, pte);
}
} else if (softleaf_is_device_private(entry)) {
@@ -1002,8 +1002,8 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
entry = make_readable_device_private_entry(
swp_offset(entry));
pte = swp_entry_to_pte(entry);
- if (pte_swp_uffd_wp(orig_pte))
- pte = pte_swp_mkuffd_wp(pte);
+ if (pte_swp_uffd(orig_pte))
+ pte = pte_swp_mkuffd(pte);
set_pte_at(src_mm, addr, src_pte, pte);
}
} else if (softleaf_is_device_exclusive(entry)) {
@@ -1026,7 +1026,7 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
return 0;
}
if (!userfaultfd_wp(dst_vma))
- pte = pte_swp_clear_uffd_wp(pte);
+ pte = pte_swp_clear_uffd(pte);
set_pte_at(dst_mm, addr, dst_pte, pte);
return 0;
}
@@ -1074,7 +1074,7 @@ copy_present_page(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma
pte = maybe_mkwrite(pte_mkdirty(pte), dst_vma);
if (userfaultfd_pte_wp(dst_vma, ptep_get(src_pte)))
/* Uffd-wp needs to be delivered to dest pte as well */
- pte = pte_mkuffd_wp(pte);
+ pte = pte_mkuffd(pte);
set_pte_at(dst_vma->vm_mm, addr, dst_pte, pte);
return 0;
}
@@ -1097,7 +1097,7 @@ static __always_inline void __copy_present_ptes(struct vm_area_struct *dst_vma,
pte = pte_mkold(pte);
if (!userfaultfd_wp(dst_vma))
- pte = pte_clear_uffd_wp(pte);
+ pte = pte_clear_uffd(pte);
set_ptes(dst_vma->vm_mm, addr, dst_pte, pte, nr);
}
@@ -3909,8 +3909,8 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
if (unlikely(unshare)) {
if (pte_soft_dirty(vmf->orig_pte))
entry = pte_mksoft_dirty(entry);
- if (pte_uffd_wp(vmf->orig_pte))
- entry = pte_mkuffd_wp(entry);
+ if (pte_uffd(vmf->orig_pte))
+ entry = pte_mkuffd(entry);
} else {
entry = maybe_mkwrite(pte_mkdirty(entry), vma);
}
@@ -4245,7 +4245,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf)
* etc.) because we're only removing the uffd-wp bit,
* which is completely invisible to the user.
*/
- pte = pte_clear_uffd_wp(ptep_get(vmf->pte));
+ pte = pte_clear_uffd(ptep_get(vmf->pte));
set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte);
/*
@@ -5077,8 +5077,8 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
pte = mk_pte(page, vma->vm_page_prot);
if (pte_swp_soft_dirty(vmf->orig_pte))
pte = pte_mksoft_dirty(pte);
- if (pte_swp_uffd_wp(vmf->orig_pte))
- pte = pte_mkuffd_wp(pte);
+ if (pte_swp_uffd(vmf->orig_pte))
+ pte = pte_mkuffd(pte);
/*
* Same logic as in do_wp_page(); however, optimize for pages that are
@@ -5294,7 +5294,7 @@ void map_anon_folio_pte_nopf(struct folio *folio, pte_t *pte,
if (vma->vm_flags & VM_WRITE)
entry = pte_mkwrite(pte_mkdirty(entry), vma);
if (uffd_wp)
- entry = pte_mkuffd_wp(entry);
+ entry = pte_mkuffd(entry);
folio_ref_add(folio, nr_pages - 1);
folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
@@ -5360,7 +5360,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
return handle_userfault(vmf, VM_UFFD_MISSING);
}
if (vmf_orig_pte_uffd_wp(vmf))
- entry = pte_mkuffd_wp(entry);
+ entry = pte_mkuffd(entry);
set_pte_at(vma->vm_mm, addr, vmf->pte, entry);
/* No need to invalidate - it was non-present before */
@@ -5609,7 +5609,7 @@ void set_pte_range(struct vm_fault *vmf, struct folio *folio,
else if (pte_write(entry) && folio_test_dirty(folio))
entry = pte_mkdirty(entry);
if (unlikely(vmf_orig_pte_uffd_wp(vmf)))
- entry = pte_mkuffd_wp(entry);
+ entry = pte_mkuffd(entry);
/* copy-on-write page */
if (write && !(vma->vm_flags & VM_SHARED)) {
VM_BUG_ON_FOLIO(nr != 1, folio);
diff --git a/mm/migrate.c b/mm/migrate.c
index 8a64291ab5b4..9d81b7b881ec 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -326,8 +326,8 @@ static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
if (pte_swp_soft_dirty(old_pte))
newpte = pte_mksoft_dirty(newpte);
- if (pte_swp_uffd_wp(old_pte))
- newpte = pte_mkuffd_wp(newpte);
+ if (pte_swp_uffd(old_pte))
+ newpte = pte_mkuffd(newpte);
set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
@@ -391,8 +391,8 @@ static bool remove_migration_pte(struct folio *folio,
if (softleaf_is_migration_write(entry))
pte = pte_mkwrite(pte, vma);
- else if (pte_swp_uffd_wp(old_pte))
- pte = pte_mkuffd_wp(pte);
+ else if (pte_swp_uffd(old_pte))
+ pte = pte_mkuffd(pte);
if (folio_test_anon(folio) && !softleaf_is_migration_read(entry))
rmap_flags |= RMAP_EXCLUSIVE;
@@ -407,8 +407,8 @@ static bool remove_migration_pte(struct folio *folio,
pte = softleaf_to_pte(entry);
if (pte_swp_soft_dirty(old_pte))
pte = pte_swp_mksoft_dirty(pte);
- if (pte_swp_uffd_wp(old_pte))
- pte = pte_swp_mkuffd_wp(pte);
+ if (pte_swp_uffd(old_pte))
+ pte = pte_swp_mkuffd(pte);
}
#ifdef CONFIG_HUGETLB_PAGE
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index fbfe5715f635..f4058688522d 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -445,13 +445,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
if (pte_present(pte)) {
if (pte_soft_dirty(pte))
swp_pte = pte_swp_mksoft_dirty(swp_pte);
- if (pte_uffd_wp(pte))
- swp_pte = pte_swp_mkuffd_wp(swp_pte);
+ if (pte_uffd(pte))
+ swp_pte = pte_swp_mkuffd(swp_pte);
} else {
if (pte_swp_soft_dirty(pte))
swp_pte = pte_swp_mksoft_dirty(swp_pte);
- if (pte_swp_uffd_wp(pte))
- swp_pte = pte_swp_mkuffd_wp(swp_pte);
+ if (pte_swp_uffd(pte))
+ swp_pte = pte_swp_mkuffd(swp_pte);
}
set_pte_at(mm, addr, ptep, swp_pte);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9cbf932b028c..8340c8b228c6 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -240,8 +240,8 @@ static long change_softleaf_pte(struct vm_area_struct *vma,
*/
entry = make_readable_device_private_entry(swp_offset(entry));
newpte = swp_entry_to_pte(entry);
- if (pte_swp_uffd_wp(oldpte))
- newpte = pte_swp_mkuffd_wp(newpte);
+ if (pte_swp_uffd(oldpte))
+ newpte = pte_swp_mkuffd(newpte);
} else if (softleaf_is_marker(entry)) {
/*
* Ignore error swap entries unconditionally,
@@ -266,9 +266,9 @@ static long change_softleaf_pte(struct vm_area_struct *vma,
}
if (uffd_wp)
- newpte = pte_swp_mkuffd_wp(newpte);
+ newpte = pte_swp_mkuffd(newpte);
else if (uffd_wp_resolve)
- newpte = pte_swp_clear_uffd_wp(newpte);
+ newpte = pte_swp_clear_uffd(newpte);
if (!pte_same(oldpte, newpte)) {
set_pte_at(vma->vm_mm, addr, pte, newpte);
@@ -290,9 +290,9 @@ static __always_inline void change_present_ptes(struct mmu_gather *tlb,
ptent = pte_modify(oldpte, newprot);
if (uffd_wp)
- ptent = pte_mkuffd_wp(ptent);
+ ptent = pte_mkuffd(ptent);
else if (uffd_wp_resolve)
- ptent = pte_clear_uffd_wp(ptent);
+ ptent = pte_clear_uffd(ptent);
/*
* In some writable, shared mappings, we might want
diff --git a/mm/mremap.c b/mm/mremap.c
index e9c8b1d05832..12732a5c547e 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -297,9 +297,9 @@ static int move_ptes(struct pagetable_move_control *pmc,
else {
if (need_clear_uffd_wp) {
if (pte_present(pte))
- pte = pte_clear_uffd_wp(pte);
+ pte = pte_clear_uffd(pte);
else
- pte = pte_swp_clear_uffd_wp(pte);
+ pte = pte_swp_clear_uffd(pte);
}
set_ptes(mm, new_addr, new_ptep, pte, nr_ptes);
}
diff --git a/mm/page_table_check.c b/mm/page_table_check.c
index 53a8997ec043..3fb995e5d40d 100644
--- a/mm/page_table_check.c
+++ b/mm/page_table_check.c
@@ -188,8 +188,8 @@ static inline bool softleaf_cached_writable(softleaf_t entry)
static void page_table_check_pte_flags(pte_t pte)
{
if (pte_present(pte)) {
- WARN_ON_ONCE(pte_uffd_wp(pte) && pte_write(pte));
- } else if (pte_swp_uffd_wp(pte)) {
+ WARN_ON_ONCE(pte_uffd(pte) && pte_write(pte));
+ } else if (pte_swp_uffd(pte)) {
const softleaf_t entry = softleaf_from_pte(pte);
WARN_ON_ONCE(softleaf_cached_writable(entry));
@@ -216,9 +216,9 @@ EXPORT_SYMBOL(__page_table_check_ptes_set);
static inline void page_table_check_pmd_flags(pmd_t pmd)
{
if (pmd_present(pmd)) {
- if (pmd_uffd_wp(pmd))
+ if (pmd_uffd(pmd))
WARN_ON_ONCE(pmd_write(pmd));
- } else if (pmd_swp_uffd_wp(pmd)) {
+ } else if (pmd_swp_uffd(pmd)) {
const softleaf_t entry = softleaf_from_pmd(pmd);
WARN_ON_ONCE(softleaf_cached_writable(entry));
diff --git a/mm/rmap.c b/mm/rmap.c
index 78b7fb5f367c..05056c213203 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2316,13 +2316,13 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
if (likely(pte_present(pteval))) {
if (pte_soft_dirty(pteval))
swp_pte = pte_swp_mksoft_dirty(swp_pte);
- if (pte_uffd_wp(pteval))
- swp_pte = pte_swp_mkuffd_wp(swp_pte);
+ if (pte_uffd(pteval))
+ swp_pte = pte_swp_mkuffd(swp_pte);
} else {
if (pte_swp_soft_dirty(pteval))
swp_pte = pte_swp_mksoft_dirty(swp_pte);
- if (pte_swp_uffd_wp(pteval))
- swp_pte = pte_swp_mkuffd_wp(swp_pte);
+ if (pte_swp_uffd(pteval))
+ swp_pte = pte_swp_mkuffd(swp_pte);
}
set_pte_at(mm, address, pvmw.pte, swp_pte);
} else {
@@ -2690,14 +2690,14 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
swp_pte = swp_entry_to_pte(entry);
if (pte_soft_dirty(pteval))
swp_pte = pte_swp_mksoft_dirty(swp_pte);
- if (pte_uffd_wp(pteval))
- swp_pte = pte_swp_mkuffd_wp(swp_pte);
+ if (pte_uffd(pteval))
+ swp_pte = pte_swp_mkuffd(swp_pte);
} else {
swp_pte = swp_entry_to_pte(entry);
if (pte_swp_soft_dirty(pteval))
swp_pte = pte_swp_mksoft_dirty(swp_pte);
- if (pte_swp_uffd_wp(pteval))
- swp_pte = pte_swp_mkuffd_wp(swp_pte);
+ if (pte_swp_uffd(pteval))
+ swp_pte = pte_swp_mkuffd(swp_pte);
}
if (folio_test_hugetlb(folio))
set_huge_pte_at(mm, address, pvmw.pte, swp_pte,
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 9174f1eeffb0..9119efef7fe6 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -2336,8 +2336,8 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
new_pte = pte_mkold(mk_pte(page, vma->vm_page_prot));
if (pte_swp_soft_dirty(old_pte))
new_pte = pte_mksoft_dirty(new_pte);
- if (pte_swp_uffd_wp(old_pte))
- new_pte = pte_mkuffd_wp(new_pte);
+ if (pte_swp_uffd(old_pte))
+ new_pte = pte_mkuffd(new_pte);
setpte:
set_pte_at(vma->vm_mm, addr, pte, new_pte);
folio_put_swap(swapcache, folio_file_page(swapcache, swp_offset(entry)));
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 885da1e56466..d546ffd2f165 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -358,7 +358,7 @@ static int mfill_atomic_install_pte(pmd_t *dst_pmd,
if (writable)
_dst_pte = pte_mkwrite(_dst_pte, dst_vma);
if (flags & MFILL_ATOMIC_WP)
- _dst_pte = pte_mkuffd_wp(_dst_pte);
+ _dst_pte = pte_mkuffd(_dst_pte);
ret = -EAGAIN;
dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl);
--
2.51.2
^ permalink raw reply related
* [PATCH v3 02/16] mm: rename uffd-wp PTE bit macros to uffd
From: Kiryl Shutsemau @ 2026-05-22 13:38 UTC (permalink / raw)
To: akpm, rppt, peterx, david
Cc: ljs, surenb, vbabka, Liam.Howlett, ziy, corbet, skhan, seanjc,
pbonzini, jthoughton, aarcange, sj, usama.arif, linux-mm,
linux-kernel, linux-doc, linux-kselftest, kvm, kernel-team,
linux-man, alx, Kiryl Shutsemau (Meta)
In-Reply-To: <20260522133857.552279-1-kirill@shutemov.name>
From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
The uffd-wp PTE bit is about to gain a second consumer: userfaultfd
RWP will use the same bit to mark access-tracking PTEs, distinct
from mprotect(PROT_NONE) or NUMA-hinting PTEs. WP vs RWP semantics
come from the VMA flag; the bit is just "uffd has claimed this
entry." Drop the "_wp" suffix from the arch-private bit macros so
they reflect that.
x86: _PAGE_BIT_UFFD_WP -> _PAGE_BIT_UFFD
_PAGE_UFFD_WP -> _PAGE_UFFD
_PAGE_SWP_UFFD_WP -> _PAGE_SWP_UFFD
arm64: PTE_UFFD_WP -> PTE_UFFD
PTE_SWP_UFFD_WP -> PTE_SWP_UFFD
riscv: _PAGE_UFFD_WP -> _PAGE_UFFD
_PAGE_SWP_UFFD_WP -> _PAGE_SWP_UFFD
Pure mechanical rename -- no behavior change.
Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
Assisted-by: Claude:claude-opus-4-6
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: SeongJae Park <sj@kernel.org>
---
arch/arm64/include/asm/pgtable-prot.h | 8 ++++----
arch/arm64/include/asm/pgtable.h | 12 ++++++------
arch/riscv/include/asm/pgtable-bits.h | 12 ++++++------
arch/riscv/include/asm/pgtable.h | 14 +++++++-------
arch/x86/include/asm/pgtable.h | 24 ++++++++++++------------
arch/x86/include/asm/pgtable_types.h | 16 ++++++++--------
6 files changed, 43 insertions(+), 43 deletions(-)
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 212ce1b02e15..09d7c00cf405 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -28,11 +28,11 @@
#define PTE_PRESENT_VALID_KERNEL (PTE_VALID | PTE_MAYBE_NG)
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-#define PTE_UFFD_WP (_AT(pteval_t, 1) << 58) /* uffd-wp tracking */
-#define PTE_SWP_UFFD_WP (_AT(pteval_t, 1) << 3) /* only for swp ptes */
+#define PTE_UFFD (_AT(pteval_t, 1) << 58) /* userfaultfd tracking */
+#define PTE_SWP_UFFD (_AT(pteval_t, 1) << 3) /* only for swp ptes */
#else
-#define PTE_UFFD_WP (_AT(pteval_t, 0))
-#define PTE_SWP_UFFD_WP (_AT(pteval_t, 0))
+#define PTE_UFFD (_AT(pteval_t, 0))
+#define PTE_SWP_UFFD (_AT(pteval_t, 0))
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
#define _PROT_DEFAULT (PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 873f4ea2e288..3eecb2c17711 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -343,17 +343,17 @@ static inline pmd_t pmd_mknoncont(pmd_t pmd)
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
static inline int pte_uffd_wp(pte_t pte)
{
- return !!(pte_val(pte) & PTE_UFFD_WP);
+ return !!(pte_val(pte) & PTE_UFFD);
}
static inline pte_t pte_mkuffd_wp(pte_t pte)
{
- return pte_wrprotect(set_pte_bit(pte, __pgprot(PTE_UFFD_WP)));
+ return pte_wrprotect(set_pte_bit(pte, __pgprot(PTE_UFFD)));
}
static inline pte_t pte_clear_uffd_wp(pte_t pte)
{
- return clear_pte_bit(pte, __pgprot(PTE_UFFD_WP));
+ return clear_pte_bit(pte, __pgprot(PTE_UFFD));
}
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
@@ -539,17 +539,17 @@ static inline pte_t pte_swp_clear_exclusive(pte_t pte)
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
{
- return set_pte_bit(pte, __pgprot(PTE_SWP_UFFD_WP));
+ return set_pte_bit(pte, __pgprot(PTE_SWP_UFFD));
}
static inline int pte_swp_uffd_wp(pte_t pte)
{
- return !!(pte_val(pte) & PTE_SWP_UFFD_WP);
+ return !!(pte_val(pte) & PTE_SWP_UFFD);
}
static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
{
- return clear_pte_bit(pte, __pgprot(PTE_SWP_UFFD_WP));
+ return clear_pte_bit(pte, __pgprot(PTE_SWP_UFFD));
}
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
index b422d9691e60..d5a86b4df3ce 100644
--- a/arch/riscv/include/asm/pgtable-bits.h
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -40,20 +40,20 @@
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-/* ext_svrsw60t59b: Bit(60) for uffd-wp tracking */
-#define _PAGE_UFFD_WP \
+/* ext_svrsw60t59b: Bit(60) for userfaultfd tracking */
+#define _PAGE_UFFD \
((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \
(1UL << 60) : 0)
/*
* Bit 4 is not involved into swap entry computation, so we
- * can borrow it for swap page uffd-wp tracking.
+ * can borrow it for swap page userfaultfd tracking.
*/
-#define _PAGE_SWP_UFFD_WP \
+#define _PAGE_SWP_UFFD \
((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \
_PAGE_USER : 0)
#else
-#define _PAGE_UFFD_WP 0
-#define _PAGE_SWP_UFFD_WP 0
+#define _PAGE_UFFD 0
+#define _PAGE_SWP_UFFD 0
#endif
#define _PAGE_TABLE _PAGE_PRESENT
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 48a127323b21..ca69948b3ed8 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -405,32 +405,32 @@ static inline pte_t pte_wrprotect(pte_t pte)
static inline bool pte_uffd_wp(pte_t pte)
{
- return !!(pte_val(pte) & _PAGE_UFFD_WP);
+ return !!(pte_val(pte) & _PAGE_UFFD);
}
static inline pte_t pte_mkuffd_wp(pte_t pte)
{
- return pte_wrprotect(__pte(pte_val(pte) | _PAGE_UFFD_WP));
+ return pte_wrprotect(__pte(pte_val(pte) | _PAGE_UFFD));
}
static inline pte_t pte_clear_uffd_wp(pte_t pte)
{
- return __pte(pte_val(pte) & ~(_PAGE_UFFD_WP));
+ return __pte(pte_val(pte) & ~(_PAGE_UFFD));
}
static inline bool pte_swp_uffd_wp(pte_t pte)
{
- return !!(pte_val(pte) & _PAGE_SWP_UFFD_WP);
+ return !!(pte_val(pte) & _PAGE_SWP_UFFD);
}
static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
{
- return __pte(pte_val(pte) | _PAGE_SWP_UFFD_WP);
+ return __pte(pte_val(pte) | _PAGE_SWP_UFFD);
}
static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
{
- return __pte(pte_val(pte) & ~(_PAGE_SWP_UFFD_WP));
+ return __pte(pte_val(pte) & ~(_PAGE_SWP_UFFD));
}
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
@@ -1157,7 +1157,7 @@ static inline pud_t pud_modify(pud_t pud, pgprot_t newprot)
* bit 0: _PAGE_PRESENT (zero)
* bit 1 to 2: (zero)
* bit 3: _PAGE_SWP_SOFT_DIRTY
- * bit 4: _PAGE_SWP_UFFD_WP
+ * bit 4: _PAGE_SWP_UFFD
* bit 5: _PAGE_PROT_NONE (zero)
* bit 6: exclusive marker
* bits 7 to 11: swap type
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index c7f014cbf0a9..038c806b50a2 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -413,17 +413,17 @@ static inline pte_t pte_wrprotect(pte_t pte)
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
static inline int pte_uffd_wp(pte_t pte)
{
- return pte_flags(pte) & _PAGE_UFFD_WP;
+ return pte_flags(pte) & _PAGE_UFFD;
}
static inline pte_t pte_mkuffd_wp(pte_t pte)
{
- return pte_wrprotect(pte_set_flags(pte, _PAGE_UFFD_WP));
+ return pte_wrprotect(pte_set_flags(pte, _PAGE_UFFD));
}
static inline pte_t pte_clear_uffd_wp(pte_t pte)
{
- return pte_clear_flags(pte, _PAGE_UFFD_WP);
+ return pte_clear_flags(pte, _PAGE_UFFD);
}
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
@@ -528,17 +528,17 @@ static inline pmd_t pmd_wrprotect(pmd_t pmd)
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
static inline int pmd_uffd_wp(pmd_t pmd)
{
- return pmd_flags(pmd) & _PAGE_UFFD_WP;
+ return pmd_flags(pmd) & _PAGE_UFFD;
}
static inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
{
- return pmd_wrprotect(pmd_set_flags(pmd, _PAGE_UFFD_WP));
+ return pmd_wrprotect(pmd_set_flags(pmd, _PAGE_UFFD));
}
static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
{
- return pmd_clear_flags(pmd, _PAGE_UFFD_WP);
+ return pmd_clear_flags(pmd, _PAGE_UFFD);
}
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
@@ -1550,32 +1550,32 @@ static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
{
- return pte_set_flags(pte, _PAGE_SWP_UFFD_WP);
+ return pte_set_flags(pte, _PAGE_SWP_UFFD);
}
static inline int pte_swp_uffd_wp(pte_t pte)
{
- return pte_flags(pte) & _PAGE_SWP_UFFD_WP;
+ return pte_flags(pte) & _PAGE_SWP_UFFD;
}
static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
{
- return pte_clear_flags(pte, _PAGE_SWP_UFFD_WP);
+ return pte_clear_flags(pte, _PAGE_SWP_UFFD);
}
static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
{
- return pmd_set_flags(pmd, _PAGE_SWP_UFFD_WP);
+ return pmd_set_flags(pmd, _PAGE_SWP_UFFD);
}
static inline int pmd_swp_uffd_wp(pmd_t pmd)
{
- return pmd_flags(pmd) & _PAGE_SWP_UFFD_WP;
+ return pmd_flags(pmd) & _PAGE_SWP_UFFD;
}
static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
{
- return pmd_clear_flags(pmd, _PAGE_SWP_UFFD_WP);
+ return pmd_clear_flags(pmd, _PAGE_SWP_UFFD);
}
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h
index 2ec250ba467e..af08d98be930 100644
--- a/arch/x86/include/asm/pgtable_types.h
+++ b/arch/x86/include/asm/pgtable_types.h
@@ -31,7 +31,7 @@
#define _PAGE_BIT_SPECIAL _PAGE_BIT_SOFTW1
#define _PAGE_BIT_CPA_TEST _PAGE_BIT_SOFTW1
-#define _PAGE_BIT_UFFD_WP _PAGE_BIT_SOFTW2 /* userfaultfd wrprotected */
+#define _PAGE_BIT_UFFD _PAGE_BIT_SOFTW2 /* userfaultfd tracking */
#define _PAGE_BIT_SOFT_DIRTY _PAGE_BIT_SOFTW3 /* software dirty tracking */
#define _PAGE_BIT_KERNEL_4K _PAGE_BIT_SOFTW3 /* page must not be converted to large */
@@ -39,7 +39,7 @@
#define _PAGE_BIT_SAVED_DIRTY _PAGE_BIT_SOFTW5 /* Saved Dirty bit (leaf) */
#define _PAGE_BIT_NOPTISHADOW _PAGE_BIT_SOFTW5 /* No PTI shadow (root PGD) */
#else
-/* Shared with _PAGE_BIT_UFFD_WP which is not supported on 32 bit */
+/* Shared with _PAGE_BIT_UFFD which is not supported on 32 bit */
#define _PAGE_BIT_SAVED_DIRTY _PAGE_BIT_SOFTW2 /* Saved Dirty bit (leaf) */
#define _PAGE_BIT_NOPTISHADOW _PAGE_BIT_SOFTW2 /* No PTI shadow (root PGD) */
#endif
@@ -111,11 +111,11 @@
#endif
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
-#define _PAGE_UFFD_WP (_AT(pteval_t, 1) << _PAGE_BIT_UFFD_WP)
-#define _PAGE_SWP_UFFD_WP _PAGE_USER
+#define _PAGE_UFFD (_AT(pteval_t, 1) << _PAGE_BIT_UFFD)
+#define _PAGE_SWP_UFFD _PAGE_USER
#else
-#define _PAGE_UFFD_WP (_AT(pteval_t, 0))
-#define _PAGE_SWP_UFFD_WP (_AT(pteval_t, 0))
+#define _PAGE_UFFD (_AT(pteval_t, 0))
+#define _PAGE_SWP_UFFD (_AT(pteval_t, 0))
#endif
#if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
@@ -129,7 +129,7 @@
/*
* The hardware requires shadow stack to be Write=0,Dirty=1. However,
* there are valid cases where the kernel might create read-only PTEs that
- * are dirty (e.g., fork(), mprotect(), uffd-wp(), soft-dirty tracking). In
+ * are dirty (e.g., fork(), mprotect(), userfaultfd, soft-dirty tracking). In
* this case, the _PAGE_SAVED_DIRTY bit is used instead of the HW-dirty bit,
* to avoid creating a wrong "shadow stack" PTEs. Such PTEs have
* (Write=0,SavedDirty=1,Dirty=0) set.
@@ -151,7 +151,7 @@
#define _COMMON_PAGE_CHG_MASK (PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT | \
_PAGE_SPECIAL | _PAGE_ACCESSED | \
_PAGE_DIRTY_BITS | _PAGE_SOFT_DIRTY | \
- _PAGE_CC | _PAGE_UFFD_WP)
+ _PAGE_CC | _PAGE_UFFD)
#define _PAGE_CHG_MASK (_COMMON_PAGE_CHG_MASK | _PAGE_PAT)
#define _HPAGE_CHG_MASK (_COMMON_PAGE_CHG_MASK | _PAGE_PSE | _PAGE_PAT_LARGE)
--
2.51.2
^ permalink raw reply related
* [PATCH v3 01/16] mm: decouple protnone helpers from CONFIG_NUMA_BALANCING
From: Kiryl Shutsemau @ 2026-05-22 13:38 UTC (permalink / raw)
To: akpm, rppt, peterx, david
Cc: ljs, surenb, vbabka, Liam.Howlett, ziy, corbet, skhan, seanjc,
pbonzini, jthoughton, aarcange, sj, usama.arif, linux-mm,
linux-kernel, linux-doc, linux-kselftest, kvm, kernel-team,
linux-man, alx, Kiryl Shutsemau (Meta)
In-Reply-To: <20260522133857.552279-1-kirill@shutemov.name>
From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
pte_protnone() and pmd_protnone() detect present-but-inaccessible page
table entries. This capability is useful beyond NUMA balancing -- for
example, userfaultfd working set tracking uses protnone PTEs to track
page access without unmapping pages.
Introduce CONFIG_ARCH_HAS_PTE_PROTNONE to decouple the protnone PTE
infrastructure from CONFIG_NUMA_BALANCING. The six architectures that
support protnone PTEs (x86_64, arm64, powerpc, s390, riscv, loongarch)
now select this option, and CONFIG_NUMA_BALANCING depends on it.
No functional change -- the same set of architectures continues to have
working protnone support, but the infrastructure is now available
independently of NUMA balancing.
Signed-off-by: Kiryl Shutsemau (Meta) <kas@kernel.org>
Assisted-by: Claude:claude-opus-4-6
Acked-by: SeongJae Park <sj@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/pgtable.h | 7 ++---
arch/loongarch/Kconfig | 1 +
arch/loongarch/include/asm/pgtable.h | 4 +--
arch/powerpc/include/asm/book3s/64/pgtable.h | 8 ++---
arch/powerpc/platforms/Kconfig.cputype | 1 +
arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/pgtable.h | 7 ++---
arch/s390/Kconfig | 1 +
arch/s390/include/asm/pgtable.h | 4 +--
arch/x86/Kconfig | 1 +
arch/x86/include/asm/pgtable.h | 8 ++---
include/linux/pgtable.h | 32 ++++++++++++++------
init/Kconfig | 8 +++++
mm/debug_vm_pgtable.c | 4 +--
15 files changed, 52 insertions(+), 36 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fe60738e5943..319470b3b1bb 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -78,6 +78,7 @@ config ARM64
select ARCH_SUPPORTS_CFI
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
+ select ARCH_HAS_PTE_PROTNONE
select ARCH_SUPPORTS_NUMA_BALANCING
select ARCH_SUPPORTS_PAGE_TABLE_CHECK
select ARCH_SUPPORTS_PER_VMA_LOCK
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 4dfa42b7d053..873f4ea2e288 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -553,10 +553,7 @@ static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
}
#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
-#ifdef CONFIG_NUMA_BALANCING
-/*
- * See the comment in include/linux/pgtable.h
- */
+#ifdef CONFIG_ARCH_HAS_PTE_PROTNONE
static inline int pte_protnone(pte_t pte)
{
/*
@@ -575,7 +572,7 @@ static inline int pmd_protnone(pmd_t pmd)
{
return pte_protnone(pmd_pte(pmd));
}
-#endif
+#endif /* CONFIG_ARCH_HAS_PTE_PROTNONE */
#define pmd_present(pmd) pte_present(pmd_pte(pmd))
#define pmd_dirty(pmd) pte_dirty(pmd_pte(pmd))
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 3b042dbb2c41..229b3d1b7056 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -67,6 +67,7 @@ config LOONGARCH
select ARCH_SUPPORTS_LTO_CLANG
select ARCH_SUPPORTS_LTO_CLANG_THIN
select ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS
+ select ARCH_HAS_PTE_PROTNONE
select ARCH_SUPPORTS_NUMA_BALANCING if NUMA
select ARCH_SUPPORTS_PER_VMA_LOCK
select ARCH_SUPPORTS_RT
diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
index 2a0b63ae421f..d295447a2763 100644
--- a/arch/loongarch/include/asm/pgtable.h
+++ b/arch/loongarch/include/asm/pgtable.h
@@ -619,7 +619,7 @@ static inline pmd_t pmdp_huge_get_and_clear(struct mm_struct *mm,
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
-#ifdef CONFIG_NUMA_BALANCING
+#ifdef CONFIG_ARCH_HAS_PTE_PROTNONE
static inline long pte_protnone(pte_t pte)
{
return (pte_val(pte) & _PAGE_PROTNONE);
@@ -629,7 +629,7 @@ static inline long pmd_protnone(pmd_t pmd)
{
return (pmd_val(pmd) & _PAGE_PROTNONE);
}
-#endif /* CONFIG_NUMA_BALANCING */
+#endif /* CONFIG_ARCH_HAS_PTE_PROTNONE */
#define pmd_leaf(pmd) ((pmd_val(pmd) & _PAGE_HUGE) != 0)
#define pud_leaf(pud) ((pud_val(pud) & _PAGE_HUGE) != 0)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index e67e64ac6e8c..53a0c5892548 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -490,13 +490,13 @@ static inline pte_t pte_clear_soft_dirty(pte_t pte)
}
#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
-#ifdef CONFIG_NUMA_BALANCING
+#ifdef CONFIG_ARCH_HAS_PTE_PROTNONE
static inline int pte_protnone(pte_t pte)
{
return (pte_raw(pte) & cpu_to_be64(_PAGE_PRESENT | _PAGE_PTE | _PAGE_RWX)) ==
cpu_to_be64(_PAGE_PRESENT | _PAGE_PTE);
}
-#endif /* CONFIG_NUMA_BALANCING */
+#endif /* CONFIG_ARCH_HAS_PTE_PROTNONE */
static inline bool pte_hw_valid(pte_t pte)
{
@@ -1067,12 +1067,12 @@ static inline pte_t *pmdp_ptep(pmd_t *pmd)
#endif
#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
-#ifdef CONFIG_NUMA_BALANCING
+#ifdef CONFIG_ARCH_HAS_PTE_PROTNONE
static inline int pmd_protnone(pmd_t pmd)
{
return pte_protnone(pmd_pte(pmd));
}
-#endif /* CONFIG_NUMA_BALANCING */
+#endif /* CONFIG_ARCH_HAS_PTE_PROTNONE */
#define pmd_write(pmd) pte_write(pmd_pte(pmd))
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index bac02c83bb3e..36b64a24cf30 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -87,6 +87,7 @@ config PPC_BOOK3S_64
select ARCH_ENABLE_HUGEPAGE_MIGRATION if HUGETLB_PAGE && MIGRATION
select ARCH_ENABLE_SPLIT_PMD_PTLOCK
select ARCH_SUPPORTS_HUGETLBFS
+ select ARCH_HAS_PTE_PROTNONE
select ARCH_SUPPORTS_NUMA_BALANCING
select HAVE_MOVE_PMD
select HAVE_MOVE_PUD
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d235396c4514..9eb4a9315bdf 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -71,6 +71,7 @@ config RISCV
select ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS if 64BIT && MMU
select ARCH_SUPPORTS_PAGE_TABLE_CHECK if MMU
select ARCH_SUPPORTS_PER_VMA_LOCK if MMU
+ select ARCH_HAS_PTE_PROTNONE if MMU
select ARCH_SUPPORTS_RT
select ARCH_SUPPORTS_SHADOW_CALL_STACK if HAVE_SHADOW_CALL_STACK
select ARCH_SUPPORTS_SCHED_MC if SMP
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index a1a7c6520a09..48a127323b21 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -524,10 +524,7 @@ static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
PAGE_SIZE)
#endif
-#ifdef CONFIG_NUMA_BALANCING
-/*
- * See the comment in include/asm-generic/pgtable.h
- */
+#ifdef CONFIG_ARCH_HAS_PTE_PROTNONE
static inline int pte_protnone(pte_t pte)
{
return (pte_val(pte) & (_PAGE_PRESENT | _PAGE_PROT_NONE)) == _PAGE_PROT_NONE;
@@ -537,7 +534,7 @@ static inline int pmd_protnone(pmd_t pmd)
{
return pte_protnone(pmd_pte(pmd));
}
-#endif
+#endif /* CONFIG_ARCH_HAS_PTE_PROTNONE */
/* Modify page protection bits */
static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index ecbcbb781e40..bc5bef08454b 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -151,6 +151,7 @@ config S390
select ARCH_SUPPORTS_HUGETLBFS
select ARCH_SUPPORTS_INT128 if CC_HAS_INT128 && CC_IS_CLANG
select ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS
+ select ARCH_HAS_PTE_PROTNONE
select ARCH_SUPPORTS_NUMA_BALANCING
select ARCH_SUPPORTS_PAGE_TABLE_CHECK
select ARCH_SUPPORTS_PER_VMA_LOCK
diff --git a/arch/s390/include/asm/pgtable.h b/arch/s390/include/asm/pgtable.h
index 2c6cee8241e0..97241dea5573 100644
--- a/arch/s390/include/asm/pgtable.h
+++ b/arch/s390/include/asm/pgtable.h
@@ -842,7 +842,7 @@ static inline int pte_same(pte_t a, pte_t b)
return pte_val(a) == pte_val(b);
}
-#ifdef CONFIG_NUMA_BALANCING
+#ifdef CONFIG_ARCH_HAS_PTE_PROTNONE
static inline int pte_protnone(pte_t pte)
{
return pte_present(pte) && !(pte_val(pte) & _PAGE_READ);
@@ -853,7 +853,7 @@ static inline int pmd_protnone(pmd_t pmd)
/* pmd_leaf(pmd) implies pmd_present(pmd) */
return pmd_leaf(pmd) && !(pmd_val(pmd) & _SEGMENT_ENTRY_READ);
}
-#endif
+#endif /* CONFIG_ARCH_HAS_PTE_PROTNONE */
static inline bool pte_swp_exclusive(pte_t pte)
{
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f3f7cb01d69d..9da1119e8ff6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -123,6 +123,7 @@ config X86
select ARCH_SUPPORTS_DEBUG_PAGEALLOC
select ARCH_SUPPORTS_HUGETLBFS
select ARCH_SUPPORTS_PAGE_TABLE_CHECK if X86_64
+ select ARCH_HAS_PTE_PROTNONE if X86_64
select ARCH_SUPPORTS_NUMA_BALANCING if X86_64
select ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP if NR_CPUS <= 4096
select ARCH_SUPPORTS_CFI if X86_64
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 2187e9cfcefa..c7f014cbf0a9 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -985,11 +985,7 @@ static inline int pmd_present(pmd_t pmd)
return pmd_flags(pmd) & (_PAGE_PRESENT | _PAGE_PROTNONE | _PAGE_PSE);
}
-#ifdef CONFIG_NUMA_BALANCING
-/*
- * These work without NUMA balancing but the kernel does not care. See the
- * comment in include/linux/pgtable.h
- */
+#ifdef CONFIG_ARCH_HAS_PTE_PROTNONE
static inline int pte_protnone(pte_t pte)
{
return (pte_flags(pte) & (_PAGE_PROTNONE | _PAGE_PRESENT))
@@ -1001,7 +997,7 @@ static inline int pmd_protnone(pmd_t pmd)
return (pmd_flags(pmd) & (_PAGE_PROTNONE | _PAGE_PRESENT))
== _PAGE_PROTNONE;
}
-#endif /* CONFIG_NUMA_BALANCING */
+#endif /* CONFIG_ARCH_HAS_PTE_PROTNONE */
static inline int pmd_none(pmd_t pmd)
{
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index cdd68ed3ae1a..b6516a11adfa 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -2052,18 +2052,26 @@ static inline int pud_trans_unstable(pud_t *pud)
return 0;
}
-#ifndef CONFIG_NUMA_BALANCING
+#ifndef CONFIG_ARCH_HAS_PTE_PROTNONE
/*
- * In an inaccessible (PROT_NONE) VMA, pte_protnone() may indicate "yes". It is
- * perfectly valid to indicate "no" in that case, which is why our default
- * implementation defaults to "always no".
+ * In an inaccessible (PROT_NONE) VMA, pte_protnone() may indicate "yes". It
+ * is perfectly valid to indicate "no" in that case, which is why our
+ * default implementation defaults to "always no".
*
- * In an accessible VMA, however, pte_protnone() reliably indicates PROT_NONE
- * page protection due to NUMA hinting. NUMA hinting faults only apply in
- * accessible VMAs.
+ * In an accessible VMA, pte_protnone() reliably indicates a present
+ * PROT_NONE page protection. Today the kernel uses such PTEs for two
+ * purposes: NUMA hinting faults, and userfaultfd RWP tracking on
+ * VM_UFFD_RWP VMAs. The two are distinguished by the uffd PTE bit and
+ * the VMA flag; see include/linux/userfaultfd_k.h.
*
- * So, to reliably identify PROT_NONE PTEs that require a NUMA hinting fault,
- * looking at the VMA accessibility is sufficient.
+ * So, to reliably identify PROT_NONE PTEs that require kernel handling,
+ * looking at the VMA accessibility (and the uffd bit on RWP VMAs) is
+ * sufficient.
+ *
+ * Architectures without CONFIG_ARCH_HAS_PTE_PROTNONE get the always-zero
+ * stubs below; PAGE_NONE references that survive to runtime fire the
+ * BUILD_BUG() fallback, since callers should have folded such paths to
+ * dead code via IS_ENABLED(CONFIG_ARCH_HAS_PTE_PROTNONE).
*/
static inline int pte_protnone(pte_t pte)
{
@@ -2074,7 +2082,11 @@ static inline int pmd_protnone(pmd_t pmd)
{
return 0;
}
-#endif /* CONFIG_NUMA_BALANCING */
+
+#ifndef PAGE_NONE
+#define PAGE_NONE ({ BUILD_BUG(); (pgprot_t){0}; })
+#endif
+#endif /* CONFIG_ARCH_HAS_PTE_PROTNONE */
#endif /* CONFIG_MMU */
diff --git a/init/Kconfig b/init/Kconfig
index 2937c4d308ae..58abb7f19206 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -944,6 +944,13 @@ config SCHED_PROXY_EXEC
endmenu
+#
+# For architectures that support present-but-inaccessible (PROT_NONE) page
+# table entries detectable via pte_protnone() / pmd_protnone():
+#
+config ARCH_HAS_PTE_PROTNONE
+ bool
+
#
# For architectures that want to enable the support for NUMA-affine scheduler
# balancing logic:
@@ -1010,6 +1017,7 @@ config ARCH_WANT_NUMA_VARIABLE_LOCALITY
config NUMA_BALANCING
bool "Memory placement aware NUMA scheduler"
depends on ARCH_SUPPORTS_NUMA_BALANCING
+ depends on ARCH_HAS_PTE_PROTNONE
depends on !ARCH_WANT_NUMA_VARIABLE_LOCALITY
depends on SMP && NUMA_MIGRATION && !PREEMPT_RT
help
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 23dc3ee09561..5e9f3a35f924 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -672,7 +672,7 @@ static void __init pte_protnone_tests(struct pgtable_debug_args *args)
{
pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot_none);
- if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
+ if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_PROTNONE))
return;
pr_debug("Validating PTE protnone\n");
@@ -685,7 +685,7 @@ static void __init pmd_protnone_tests(struct pgtable_debug_args *args)
{
pmd_t pmd;
- if (!IS_ENABLED(CONFIG_NUMA_BALANCING))
+ if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_PROTNONE))
return;
if (!has_transparent_hugepage())
--
2.51.2
^ permalink raw reply related
* [PATCH v3 00/16] userfaultfd: working set tracking for VM guest memory
From: Kiryl Shutsemau @ 2026-05-22 13:38 UTC (permalink / raw)
To: akpm, rppt, peterx, david
Cc: ljs, surenb, vbabka, Liam.Howlett, ziy, corbet, skhan, seanjc,
pbonzini, jthoughton, aarcange, sj, usama.arif, linux-mm,
linux-kernel, linux-doc, linux-kselftest, kvm, kernel-team,
linux-man, alx, Kiryl Shutsemau (Meta)
From: "Kiryl Shutsemau (Meta)" <kas@kernel.org>
This series adds userfaultfd support for tracking the working set of
VM guest memory, so a VMM can identify cold pages and evict them to
tiered or remote storage.
v1: https://lore.kernel.org/all/20260427114607.4068647-1-kas@kernel.org/
v2: https://lore.kernel.org/all/cover.1778254670.git.kas@kernel.org/
== Changes since v2 ==
Review feedback from Mike Rapoport and SeongJae Park; tags folded in.
- 03/16: rename uffd_wp local in copy_hugetlb_page_range() (SJ).
- 04/16: group mode and protection bits in __VM_UFFD_FLAGS et al;
move CONFIG_USERFAULTFD_RWP to 08/16 where the UAPI lands.
- 05/16: fold uffd_wp/uffd_rwp bool pairs into uffd_prot in
change_huge_pmd(), change_softleaf_pte(), change_present_ptes();
nit fixes in comments.
- 08/16: pre-scan rewritten as bool found; CONFIG_USERFAULTFD_RWP
moved here from 04/16.
- 09/16: line reflow.
- 13/16: rewrite uffd_rwp_gup_test() with vmsplice() -- write()
went through copy_from_user(), not gup_can_follow_protnone();
plus selftest cleanups.
- 14/16: documentation wording fixes.
Patches 15-16 are the matching userfaultfd(2) and ioctl_userfaultfd(2)
man-page updates against the linux-man tree. Apply with
"git am --directory=" or by hand in the man-pages repo.
== Problem ==
A VMM managing guest memory needs to:
1. detect which pages are still being touched (working-set
tracking);
2. safely evict cold pages to slower tiered or remote storage;
3. fetch them back on demand when accessed again.
== Approach ==
UFFDIO_REGISTER_MODE_RWP is a new userfaultfd registration mode, in
parallel with the existing MODE_MISSING / MODE_WP / MODE_MINOR. It
uses the same mechanism on every backing -- anon, shmem, hugetlbfs:
- PAGE_NONE on the PTE (the same primitive NUMA balancing uses)
makes the page inaccessible while keeping it resident;
- the uffd PTE bit (the one MODE_WP already owns) marks the entry
as "userfaultfd-tracked" so the protnone fault path can tell an
RWP fault apart from an mprotect(PROT_NONE) or NUMA hinting
fault.
VM_UFFD_WP and VM_UFFD_RWP are mutually exclusive per VMA, so the
same PTE bit safely carries both meanings depending on the
registered VMA flag.
In sync mode, the kernel delivers a UFFD_PAGEFAULT_FLAG_RWP message
to the registered handler, and the handler resolves the fault with
UFFDIO_RWPROTECT clearing MODE_RWP. In async mode
(UFFD_FEATURE_RWP_ASYNC), the fault is auto-resolved in-place: the
kernel restores the original PTE permissions and the faulting thread
continues without a userfaultfd message ever being delivered.
Userspace then learns which pages were touched by reading
PAGE_IS_ACCESSED out of PAGEMAP_SCAN -- pages whose uffd bit is
still set were not re-accessed since the last RWP cycle.
UFFDIO_RWPROTECT is the protect/unprotect ioctl, mirroring
UFFDIO_WRITEPROTECT.
UFFDIO_SET_MODE flips RWP_ASYNC <-> sync at runtime under
mmap_write_lock(), so a VMM can run in async mode for detection and
switch to sync for race-free eviction without re-registering the
userfaultfd.
== Typical VMM workflow ==
/* arm */
UFFDIO_API(features = RWP | RWP_ASYNC)
UFFDIO_REGISTER(MODE_RWP)
/* detection cycle */
UFFDIO_RWPROTECT(range, RWP)
sleep(interval)
PAGEMAP_SCAN(!PAGE_IS_ACCESSED) -> cold pages
/* eviction */
UFFDIO_SET_MODE(disable = RWP_ASYNC) /* sync */
pwrite(cold) + fallocate(FALLOC_FL_PUNCH_HOLE, cold) /* races trapped */
UFFDIO_SET_MODE(enable = RWP_ASYNC) /* resume */
== Series layout ==
Patches 1 to 3 are preparatory:
1: decouple protnone helpers from CONFIG_NUMA_BALANCING.
2-3: rename _PAGE_BIT_UFFD_WP, pte_uffd_wp() and friends to drop
the _WP suffix, since the bit now carries WP and RWP meaning
depending on the VMA flag. The SCAN_PTE_UFFD enum's ftrace
output string is intentionally kept as "pte_uffd_wp" so
trace-based tooling does not silently break.
Patches 4 to 7 add the in-kernel mechanism:
4: VM_UFFD_RWP VMA flag (aliased to VM_NONE until 8/16 introduces
CONFIG_USERFAULTFD_RWP together with the UAPI).
5: MM_CP_UFFD_RWP change_protection() primitive (PAGE_NONE +
uffd bit, plus a RESOLVE counterpart).
6: marker preservation across swap, device-exclusive, migration,
fork, mremap, UFFDIO_MOVE, hugetlb copy, and mprotect().
7: handle VM_UFFD_RWP in khugepaged, rmap, and GUP.
Patches 8 to 12 wire the userspace surface:
8: UFFDIO_REGISTER_MODE_RWP and UFFDIO_RWPROTECT plumbing
(introduces CONFIG_USERFAULTFD_RWP).
9: RWP fault delivery and exposure of UFFDIO_REGISTER_MODE_RWP.
10: PAGE_IS_ACCESSED in PAGEMAP_SCAN.
11: UFFD_FEATURE_RWP_ASYNC for async fault resolution.
12: UFFDIO_SET_MODE for runtime sync/async toggle.
Patches 13 and 14 are kernel tests and Documentation/. Patches 15 and
16 update userfaultfd(2) and ioctl_userfaultfd(2) in the linux-man
tree.
Kiryl Shutsemau (Meta) (16):
mm: decouple protnone helpers from CONFIG_NUMA_BALANCING
mm: rename uffd-wp PTE bit macros to uffd
mm: rename uffd-wp PTE accessors to uffd
mm: add VM_UFFD_RWP VMA flag
mm: add MM_CP_UFFD_RWP change_protection() flag
mm: preserve RWP marker across PTE rewrites
mm: handle VM_UFFD_RWP in khugepaged, rmap, and GUP
userfaultfd: add UFFDIO_REGISTER_MODE_RWP and UFFDIO_RWPROTECT
plumbing
mm/userfaultfd: add RWP fault delivery and expose
UFFDIO_REGISTER_MODE_RWP
mm/pagemap: add PAGE_IS_ACCESSED for RWP tracking
userfaultfd: add UFFD_FEATURE_RWP_ASYNC for async fault resolution
userfaultfd: add UFFDIO_SET_MODE for runtime sync/async toggle
selftests/mm: add userfaultfd RWP tests
Documentation/userfaultfd: document RWP working set tracking
userfaultfd.2: Add read-write protect mode
ioctl_userfaultfd.2: Add read-write protect mode docs
Documentation/admin-guide/mm/pagemap.rst | 13 +-
Documentation/admin-guide/mm/userfaultfd.rst | 236 +++++-
Documentation/filesystems/proc.rst | 1 +
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/pgtable-prot.h | 8 +-
arch/arm64/include/asm/pgtable.h | 47 +-
arch/loongarch/Kconfig | 1 +
arch/loongarch/include/asm/pgtable.h | 4 +-
arch/powerpc/include/asm/book3s/64/pgtable.h | 8 +-
arch/powerpc/platforms/Kconfig.cputype | 1 +
arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/pgtable-bits.h | 12 +-
arch/riscv/include/asm/pgtable.h | 59 +-
arch/s390/Kconfig | 1 +
arch/s390/include/asm/hugetlb.h | 12 +-
arch/s390/include/asm/pgtable.h | 4 +-
arch/x86/Kconfig | 1 +
arch/x86/include/asm/pgtable.h | 56 +-
arch/x86/include/asm/pgtable_types.h | 16 +-
fs/proc/task_mmu.c | 108 ++-
fs/userfaultfd.c | 263 ++++++-
include/asm-generic/hugetlb.h | 18 +-
include/asm-generic/pgtable_uffd.h | 32 +-
include/linux/huge_mm.h | 7 +
include/linux/leafops.h | 4 +-
include/linux/mm.h | 46 +-
include/linux/mm_inline.h | 4 +-
include/linux/pgtable.h | 32 +-
include/linux/swapops.h | 4 +-
include/linux/userfaultfd_k.h | 78 +-
include/trace/events/huge_memory.h | 2 +-
include/trace/events/mmflags.h | 7 +
include/uapi/linux/fs.h | 1 +
include/uapi/linux/userfaultfd.h | 54 +-
init/Kconfig | 8 +
mm/Kconfig | 9 +
mm/debug_vm_pgtable.c | 4 +-
mm/huge_memory.c | 155 +++-
mm/hugetlb.c | 146 +++-
mm/internal.h | 4 +-
mm/khugepaged.c | 38 +-
mm/memory.c | 123 ++-
mm/migrate.c | 20 +-
mm/migrate_device.c | 8 +-
mm/mprotect.c | 68 +-
mm/mremap.c | 17 +-
mm/page_table_check.c | 8 +-
mm/rmap.c | 18 +-
mm/swapfile.c | 9 +-
mm/userfaultfd.c | 112 ++-
tools/include/uapi/linux/fs.h | 1 +
tools/testing/selftests/mm/uffd-unit-tests.c | 766 +++++++++++++++++++
man2/ioctl_userfaultfd.2 | 209 ++++++++++++++++++++++++++++++++++++++-
man2/userfaultfd.2 | 147 ++++++++++++++++++++++++++-
54 files changed, 2586 insertions(+), 426 deletions(-)
base-commit: 254f49634ee16a731174d2ae34bc50bd5f45e731
--
2.51.2
^ permalink raw reply
* Re: [PATCH v12 5/6] iio: adc: ad4691: add oversampling support
From: Jonathan Cameron @ 2026-05-22 13:38 UTC (permalink / raw)
To: Sabau, Radu bogdan
Cc: Lars-Peter Clausen, Hennerich, Michael, David Lechner, Sa, Nuno,
Andy Shevchenko, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
Uwe Kleine-König, Liam Girdwood, Mark Brown, Linus Walleij,
Bartosz Golaszewski, Philipp Zabel, Jonathan Corbet, Shuah Khan,
linux-iio@vger.kernel.org, devicetree@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-pwm@vger.kernel.org,
linux-gpio@vger.kernel.org, linux-doc@vger.kernel.org
In-Reply-To: <LV9PR03MB8414E63DA5C2C22A46E98215F70F2@LV9PR03MB8414.namprd03.prod.outlook.com>
On Fri, 22 May 2026 11:38:55 +0000
"Sabau, Radu bogdan" <Radu.Sabau@analog.com> wrote:
> > -----Original Message-----
> > From: Jonathan Cameron <jic23@kernel.org>
> > Sent: Friday, May 22, 2026 2:16 PM
>
> ...
>
> > > >
> > > > + iio_for_each_active_channel(indio_dev, bit) {
> > > > + ret = regmap_write(st->regmap,
> > > > AD4691_ACC_DEPTH_IN(bit), st->osr[bit]);
> > >
> > > Unfortunately enough, I think a v13 will come, too...
> > >
> > > Had a look again on what Sashiko had to say, and seeing the sampling
> > frequency
> > > shared_by_all comment again made me have a deeper look see how the
> > code could
> > > be commented so he wouldn't complain about this anymore, and...
> > >
> > > Perhaps he is a bit right after all. I found a section stating that in standard
> > > sequencer mode (which the driver uses right now), all the channels actually
> > use
> > > the ACC_DEPTH_IN0 for osr, and so changing ACC_DEPTH_INn for other
> > channels
> > > doesn't really do much. And so I tested this selecting both voltage0 and
> > voltage1
> > > for sampling with osr4 for voltage0 and osr1 for voltage1 and with a 100kHz
> > osc freq
> > > indeed DR fell after approximately 80us which points out both channels were
> > actually
> > > using OSR of 4. Perhaps the OSR should be shared by all and therefore the
> > > sampling frequency would also be shared by all, right?
> >
> > I kind of lost track on the modes. What are the chances we later move to or
> > add
> > support for a mode where the different OSRs do matter? If that's a possibility
> > we should avoid ABI change by allowing for it from the start.
> >
> > Then if we are in this mode, they'll have separate controls but change any,
> > changes
> > them all, if we are in a different mode that connection breaks.
> > If that's the case, just throw in a comment saying something to the effect this
> > may change.
> >
> > It's not wrong ABI to do this, it's just less intuitive for users which is why
> > we prefer the shared_by stuff where there isn't a disadvantage. That is at
> > most
> > a hint to what actually happens. A simple example is where different
> > channels have one OSR field but they aren't the same - i.e. channel 1 is twice
> > the OSR of channel 2. Hence we can't share the attribute but any change
> > effects
> > both.
> >
>
> Hi Jonathan,
>
> I don't think a mode where different OSR will matter will be added in the future. Better
> yet, this advanced sequencer functionality is not really mode dependent and is actually
> something that allows you to manually rearrange channels and samples in the
> sequence, and unless this functionality is active (it is not by default nor is it used by
> the driver, since we use the standard sequencer).
>
> Personally, I don't see any reason to have this advanced sequencer stuff implemented
> since DR is only falling at the end of the sequence no matter if it is standard config or not,
> the only "disadvantage" to say so is that the standard sequencer uses the same OSR field
> for all channels. But that advanced sequencer stuff would only complicate the buffer
> enable/disable functions even more, which I don't think it's worth the effort.
>
> So, with this in mind. Letting the driver use standard sequencer would ultimately mean
> that the osr would be the same for all the channels, and then effective rate the same for
> all channels, which I suggest having it like that from initial driver patch to the end, so no
> ABI change mid-patch series. This change will simplify the driver.
Ok. Thanks for the analysis. It may well be that those fancy sequencer things are
only really useful for very specific use cases we won't see in Linux.
So I'm fine with following the simple path!
Jonathan
>
> Radu
^ permalink raw reply
* Re: [PATCH] docs: fix typo in Sphinx custom CSS
From: Jonathan Corbet @ 2026-05-22 13:29 UTC (permalink / raw)
To: omkarbhor4011; +Cc: skhan, linux-doc, linux-kernel, Omkarbhor4011
In-Reply-To: <20260522003156.70389-1-omkarbhor4011@gmail.com>
omkarbhor4011@gmail.com writes:
> From: omkarbhor4011 <Omkarbhor4011@gmail.com>
>
> Signed-off-by: omkarbhor4011 <Omkarbhor4011@gmail.com>
For future reference, please include a proper changelog and your full
name in the signoff. That said,
> ---
> Documentation/sphinx-static/custom.css | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/sphinx-static/custom.css b/Documentation/sphinx-static/custom.css
> index f91393426..5aa0a1ed9 100644
> --- a/Documentation/sphinx-static/custom.css
> +++ b/Documentation/sphinx-static/custom.css
> @@ -30,7 +30,7 @@ img.logo {
> margin-bottom: 20px;
> }
>
> -/* The default is to use -1em, wich makes it override text */
> +/* The default is to use -1em, which makes it override text */
> li { text-indent: 0em; }
This one is already fixed in docs-next; that (or linux-next) is where
you should be preparing documentation patches.
Thanks,
jon
^ permalink raw reply
* Re: [PATCH v4 20/21] nfsd: track requested dir attributes
From: Chuck Lever @ 2026-05-22 13:24 UTC (permalink / raw)
To: Jeff Layton, Chuck Lever, NeilBrown, Olga Kornievskaia, Dai Ngo,
Tom Talpey, Trond Myklebust, Anna Schumaker
Cc: Alexander Aring, Amir Goldstein, Jan Kara, Alexander Viro,
Christian Brauner, Calum Mackay, linux-kernel, linux-doc,
linux-nfs
In-Reply-To: <20260522-dir-deleg-v4-20-2acb883ac6bc@kernel.org>
On Fri, May 22, 2026, at 8:29 AM, Jeff Layton wrote:
> Track the union of requested and supported dir attributes in the
> delegation, and only encode the attributes in that union when sending
> add/remove/rename updates.
Nit: The encode-time use of dl_notify_mask for NOTIFY4_CHANGE_DIR_ATTRS
is wired up in 21/21. This patch adds only the tracking; the per-event
encoder change lives in the subsequent patch.
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> fs/nfsd/nfs4proc.c | 9 ++++++---
> fs/nfsd/nfs4state.c | 14 +++++++++++++-
> fs/nfsd/state.h | 2 ++
> 3 files changed, 21 insertions(+), 4 deletions(-)
--
Chuck Lever
^ permalink raw reply
* Re: [PATCH v6 21/43] KVM: SEV: Make 'uaddr' parameter optional for KVM_SEV_SNP_LAUNCH_UPDATE
From: Sean Christopherson @ 2026-05-22 13:08 UTC (permalink / raw)
To: Ackerley Tng
Cc: Fuad Tabba, aik, andrew.jones, binbin.wu, brauner, chao.p.peng,
david, ira.weiny, jmattson, jthoughton, michael.roth, oupton,
pankaj.gupta, qperret, rick.p.edgecombe, rientjes, shivankg,
steven.price, willy, wyihan, yan.y.zhao, forkloop, pratyush,
suzuki.poulose, aneesh.kumar, liam, Paolo Bonzini,
Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86,
H. Peter Anvin, Steven Rostedt, Masami Hiramatsu,
Mathieu Desnoyers, Jonathan Corbet, Shuah Khan, Shuah Khan,
Vishal Annapurve, Andrew Morton, Chris Li, Kairui Song,
Kemeng Shi, Nhat Pham, Baoquan He, Barry Song, Axel Rasmussen,
Yuanchu Xie, Wei Xu, Youngjun Park, Qi Zheng, Shakeel Butt,
Kiryl Shutsemau, Jason Gunthorpe, Vlastimil Babka, kvm,
linux-kernel, linux-trace-kernel, linux-doc, linux-kselftest,
linux-mm, linux-coco
In-Reply-To: <CAEvNRgFB8ydih9JTmsH06H32j38tH-iViZqN_eZ_gQAmXpw+Dw@mail.gmail.com>
On Thu, May 21, 2026, Ackerley Tng wrote:
> Sean Christopherson <seanjc@google.com> writes:
>
> > On Thu, May 21, 2026, Fuad Tabba wrote:
> >> Hi,
> >>
> >> On Thu, 7 May 2026 at 21:22, Ackerley Tng via B4 Relay
> > diff --git include/linux/kvm_host.h include/linux/kvm_host.h
> > index 61a3430957f2..b83cda2870ba 100644
> > --- include/linux/kvm_host.h
> > +++ include/linux/kvm_host.h
> > @@ -2596,7 +2596,8 @@ int kvm_arch_gmem_prepare(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn, int max_ord
> > typedef int (*kvm_gmem_populate_cb)(struct kvm *kvm, gfn_t gfn, kvm_pfn_t pfn,
> > struct page *page, void *opaque);
> >
> > -long kvm_gmem_populate(struct kvm *kvm, gfn_t gfn, void __user *src, long npages,
> > +long kvm_gmem_populate(struct kvm *kvm, gfn_t start_gfn, void __user *src,
> > + long npages, bool writable,
>
> What do you think of need_writable_src instead of just writable for the
> variable name?
How about "may_write_src" or "may_writeback_src"?
> > kvm_gmem_populate_cb post_populate, void *opaque);
> > #endif
> >
> > diff --git virt/kvm/guest_memfd.c virt/kvm/guest_memfd.c
> > index a35a55571a2d..6553d4e032ce 100644
> > --- virt/kvm/guest_memfd.c
> > +++ virt/kvm/guest_memfd.c
> > @@ -858,7 +858,8 @@ static long __kvm_gmem_populate(struct kvm *kvm, struct kvm_memory_slot *slot,
> > return ret;
> > }
> >
> > -long kvm_gmem_populate(struct kvm *kvm, gfn_t start_gfn, void __user *src, long npages,
> > +long kvm_gmem_populate(struct kvm *kvm, gfn_t start_gfn, void __user *src,
> > + long npages, bool writable,
> > kvm_gmem_populate_cb post_populate, void *opaque)
> > {
> > struct kvm_memory_slot *slot;
> > @@ -892,8 +893,9 @@ long kvm_gmem_populate(struct kvm *kvm, gfn_t start_gfn, void __user *src, long
> >
> > if (src) {
> > unsigned long uaddr = (unsigned long)src + i * PAGE_SIZE;
> > + unsigned int flags = writable ? FOLL_WRITE : 0;
>
> How about using FOLL_WRITE | FOLL_NOFAULT so if it weren't writable to
> start with, don't CoW, just error out?
Eh, I don't see any value in value in erroring out if userspace is doing something
unusual. If breaking CoW was actually problematic somehow, then sure. But AFAICT
it's overall harmless.
> Like you said above the CPUID page provided as src_page would have been
> written to before, so it should have been mapped as writable.
^ permalink raw reply
* Re: [PATCH v4 27/30] KVM: x86: Add KVM_VCPU_TSC_EFFECTIVE_FREQ attribute
From: David Woodhouse @ 2026-05-22 13:07 UTC (permalink / raw)
To: Sean Christopherson
Cc: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
Vitaly Kuznetsov, Juergen Gross, Boris Ostrovsky, Paul Durrant,
Jonathan Cameron, Sascha Bischoff, Marc Zyngier, Joey Gouly,
Jack Allister, Dongli Zhang, joe.jin, kvm, linux-doc,
linux-kernel, xen-devel, linux-kselftest
In-Reply-To: <ahBQ7mXNaTtouT3C@google.com>
[-- Attachment #1: Type: text/plain, Size: 1205 bytes --]
On Fri, 2026-05-22 at 05:49 -0700, Sean Christopherson wrote:
>
> Oh, that's just an oversight, definitely not intentional. Easy enough to fix:
Want me to roll that into the series? As you eloquently put it the
other day, what's one more patch...?
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 1616b2eec6e7..cd4a244ca0c5 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2235,7 +2235,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> r = tdp_enabled;
> break;
> case KVM_CAP_X86_APIC_BUS_CYCLES_NS:
> - r = APIC_BUS_CYCLE_NS_DEFAULT;
> + r = kvm ? kvm->arch.apic_bus_cycle_ns : APIC_BUS_CYCLE_NS_DEFAULT;
> break;
> case KVM_CAP_EXIT_HYPERCALL:
> r = KVM_EXIT_HYPERCALL_VALID_MASK;
Please tell me that can never be zero. Because we divide by it when
reading HV_X64_MSR_APIC_FREQUENCY.
... checks ... it does look like it's initialised to
APIC_BUS_CYCLE_NS_DEFAULT in kvm_arch_init_vm(), and we don't allow
userspace to set it to zero.
[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]
^ permalink raw reply
* Re: [PATCH v5 2/4] kernel: param: initialize module_kset in a pure_initcall
From: Petr Pavlu @ 2026-05-22 13:06 UTC (permalink / raw)
To: Shashank Balaji
Cc: Suzuki K Poulose, James Clark, Alexander Shishkin,
Greg Kroah-Hartman, Rafael J. Wysocki, Danilo Krummrich,
Miguel Ojeda, Boqun Feng, Gary Guo, Björn Roy Baron,
Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross,
Jonathan Corbet, Shuah Khan, Luis Chamberlain, Daniel Gomez,
Sami Tolvanen, Aaron Tomlin, Mike Leach, Leo Yan, Thierry Reding,
Jonathan Hunter, Rahul Bukte, linux-kernel, coresight,
linux-arm-kernel, driver-core, rust-for-linux, linux-doc,
Daniel Palmer, Tim Bird, linux-modules, linux-tegra, Sumit Gupta
In-Reply-To: <20260518-acpi_mod_name-v5-2-705ccc430885@sony.com>
On 5/18/26 12:19 PM, Shashank Balaji wrote:
> Commit "driver core: platform: set mod_name in driver registration" will set
> struct device_driver's mod_name member for platform driver registration. For a
> driver to be registered with its mod_name set, module_kset needs to be
> initialized, which currently happens in a subsys_initcall in param_sysfs_init().
> The tegra cbb drivers register themselves before module_kset init, in a
> core_initcall. This works currently because lookup_or_create_module_kobject(),
> which dereferences module_kset via kset_find_obj(), is not called if mod_name
> is not set, which is the case now.
>
> So in preparation for the commit "driver core: platform: set mod_name in driver registration",
> move module_kset init to pure_initcall level, ensuring it happens before tegra
> cbb driver registration.
>
> Suggested-by: Gary Guo <gary@garyguo.net>
> Co-developed-by: Rahul Bukte <rahul.bukte@sony.com>
> Signed-off-by: Rahul Bukte <rahul.bukte@sony.com>
> Signed-off-by: Shashank Balaji <shashank.mahadasyam@sony.com>
> ---
> Patch 4 depends on this patch
> ---
> kernel/params.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/params.c b/kernel/params.c
> index 74d620bc2521..ac088d4b09a9 100644
> --- a/kernel/params.c
> +++ b/kernel/params.c
> @@ -957,7 +957,7 @@ static int __init param_sysfs_init(void)
>
> return 0;
> }
> -subsys_initcall(param_sysfs_init);
> +pure_initcall(param_sysfs_init);
>
> /*
> * param_sysfs_builtin_init - add sysfs version and parameter
>
The change looks ok to me functionality-wise. Sysfs is initialized
earlier in do_basic_setup() and other code, such as classes_init(),
calls kset_create_and_add() similarly early.
One minor issue is that pure_initcall() was originally intended for
static variable initialization. The file include/linux/init.h says:
| /*
| * A "pure" initcall has no dependencies on anything else, and purely
| * initializes variables that couldn't be statically initialized.
| *
| * This only exists for built-in code, not for modules.
| * Keep main.c:initcall_level_names[] in sync.
| */
| #define pure_initcall(fn) __define_initcall(fn, 0)
The patch stretches the intended use of pure_initcall() somewhat in this
regard. However, other code already appears to do the same, so I guess
this is ok.
Additionally, I think it would be good to update the comment preceding
param_sysfs_init(). It currently says:
| /*
| * param_sysfs_init - create "module" kset
| *
| * This must be done before the initramfs is unpacked and
| * request_module() thus becomes possible, because otherwise the
| * module load would fail in mod_sysfs_init.
| */
I suggest changing it to something like follows:
This must be done before any driver registration so that when a driver comes
from a built-in module, the driver core can add the module under /sys/module
and create the associated driver symlinks.
--
Thanks,
Petr
^ permalink raw reply
* Re: [PATCH v4 27/30] KVM: x86: Add KVM_VCPU_TSC_EFFECTIVE_FREQ attribute
From: Sean Christopherson @ 2026-05-22 12:49 UTC (permalink / raw)
To: David Woodhouse
Cc: Paolo Bonzini, Jonathan Corbet, Shuah Khan, Thomas Gleixner,
Ingo Molnar, Borislav Petkov, Dave Hansen, x86, H. Peter Anvin,
Vitaly Kuznetsov, Juergen Gross, Boris Ostrovsky, Paul Durrant,
Jonathan Cameron, Sascha Bischoff, Marc Zyngier, Joey Gouly,
Jack Allister, Dongli Zhang, joe.jin, kvm, linux-doc,
linux-kernel, xen-devel, linux-kselftest
In-Reply-To: <ab84153e33fbe7c25667f595c56b310d4d5a93ef.camel@infradead.org>
On Thu, May 21, 2026, David Woodhouse wrote:
> On Thu, 2026-05-21 at 15:30 -0700, Sean Christopherson wrote:
> > On Thu, May 21, 2026, David Woodhouse wrote:
> > > On Sat, 2026-05-09 at 23:46 +0100, David Woodhouse wrote:
> > > > From: David Woodhouse <dwmw@amazon.co.uk>
> > > That does leave userspace still needing a way to get the APIC bus
> > > frequency, to populate CPUID. So maybe I'll just make an attribute
> > > which returns that as a single value.
> >
> > Already exists, KVM_CAP_X86_APIC_BUS_CYCLES_NS. The TDX architecture decided
> > that unconditionally telling guests the virtual APIC bus runs at 400Mhz was a
> > brilliant idea.
>
> Ah, thanks.
>
> So KVM always exposes 1GHz by default regardless of the actual host?
> Which is why there's no *get* method?
>
> (Well... getting KVM_CAP_APIC_BUS_CYCLES_NS returns
> APIC_BUS_CYCLE_NS_DEFAULT which is 1, so it's basically just returning
> 1 like a lot of cap queries do, and *not* returning what the period is
> actually set to)
Oh, that's just an oversight, definitely not intentional. Easy enough to fix:
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1616b2eec6e7..cd4a244ca0c5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2235,7 +2235,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
r = tdp_enabled;
break;
case KVM_CAP_X86_APIC_BUS_CYCLES_NS:
- r = APIC_BUS_CYCLE_NS_DEFAULT;
+ r = kvm ? kvm->arch.apic_bus_cycle_ns : APIC_BUS_CYCLE_NS_DEFAULT;
break;
case KVM_CAP_EXIT_HYPERCALL:
r = KVM_EXIT_HYPERCALL_VALID_MASK;
^ permalink raw reply related
* Re: [PATCH mm-new] Documentation/admin-guide/mm: Fix typos in transhuge.rst
From: Lorenzo Stoakes @ 2026-05-22 12:43 UTC (permalink / raw)
To: Leon Hwang
Cc: linux-mm, Andrew Morton, David Hildenbrand, Zi Yan, Baolin Wang,
Liam R . Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
Lance Yang, Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan,
Michal Hocko, Jonathan Corbet, Shuah Khan, linux-doc,
linux-kernel
In-Reply-To: <20260520051751.74396-1-leon.hwang@linux.dev>
On Wed, May 20, 2026 at 01:17:51PM +0800, Leon Hwang wrote:
> Fix these two typos:
>
> 1. approporiately -> appropriately
> 2. presure -> pressure
>
> Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
LGTM, so:
Reviewed-by: Lorenzo Stoakes <ljs@kernel.org>
> ---
> Documentation/admin-guide/mm/transhuge.rst | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> index fc0127a36ef6..78a1b341a3b5 100644
> --- a/Documentation/admin-guide/mm/transhuge.rst
> +++ b/Documentation/admin-guide/mm/transhuge.rst
> @@ -57,7 +57,7 @@ prominent because the size of each page isn't as huge as the PMD-sized
> variant and there is less memory to clear in each page fault. Some
> architectures also employ TLB compression mechanisms to squeeze more
> entries in when a set of PTEs are virtually and physically contiguous
> -and approporiately aligned. In this case, TLB misses will occur less
> +and appropriately aligned. In this case, TLB misses will occur less
> often.
>
> THP can be enabled system wide or restricted to certain tasks or even
> @@ -211,7 +211,7 @@ PMD-mappable transparent hugepage::
> cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
>
> All THPs at fault and collapse time will be added to _deferred_list,
> -and will therefore be split under memory presure if they are considered
> +and will therefore be split under memory pressure if they are considered
> "underused". A THP is underused if the number of zero-filled pages in
> the THP is above max_ptes_none (see below). It is possible to disable
> this behaviour by writing 0 to shrink_underused, and enable it by writing
> --
> 2.54.0
>
^ permalink raw reply
* Re: [PATCH mm-unstable v17 11/14] mm/khugepaged: Introduce mTHP collapse support
From: Nico Pache @ 2026-05-22 12:39 UTC (permalink / raw)
To: Vernon Yang
Cc: linux-doc, linux-kernel, linux-mm, linux-trace-kernel, aarcange,
akpm, anshuman.khandual, apopple, baohua, baolin.wang, byungchul,
catalin.marinas, cl, corbet, dave.hansen, david, dev.jain, gourry,
hannes, hughd, jack, jackmanb, jannh, jglisse, joshua.hahnjy, kas,
lance.yang, liam, ljs, mathieu.desnoyers, matthew.brost, mhiramat,
mhocko, peterx, pfalcato, rakie.kim, raquini, rdunlap,
richard.weiyang, rientjes, rostedt, rppt, ryan.roberts, shivankg,
sunnanyong, surenb, thomas.hellstrom, tiwai, usamaarif642, vbabka,
vishal.moola, wangkefeng.wang, will, willy, yang, ying.huang, ziy,
zokeefe
In-Reply-To: <8f9834db-8981-4eb1-ae46-94908943da3d@gmail.com>
On Wed, May 20, 2026 at 8:36 PM Vernon Yang <vernon2gm@gmail.com> wrote:
>
> On Mon, May 11, 2026 at 12:58:11PM -0600, Nico Pache wrote:
> > Enable khugepaged to collapse to mTHP orders. This patch implements the
> > main scanning logic using a bitmap to track occupied pages and a stack
> > structure that allows us to find optimal collapse sizes.
> >
> > Previous to this patch, PMD collapse had 3 main phases, a light weight
> > scanning phase (mmap_read_lock) that determines a potential PMD
> > collapse, an alloc phase (mmap unlocked), then finally heavier collapse
> > phase (mmap_write_lock).
> >
> > To enabled mTHP collapse we make the following changes:
> >
> > During PMD scan phase, track occupied pages in a bitmap. When mTHP
> > orders are enabled, we remove the restriction of max_ptes_none during the
> > scan phase to avoid missing potential mTHP collapse candidates. Once we
> > have scanned the full PMD range and updated the bitmap to track occupied
> > pages, we use the bitmap to find the optimal mTHP size.
> >
> > Implement collapse_scan_bitmap() to perform binary recursion on the bitmap
> > and determine the best eligible order for the collapse. A stack structure
> > is used instead of traditional recursion to manage the search. This also
> > prevents a traditional recursive approach when the kernel stack struct is
> > limited. The algorithm recursively splits the bitmap into smaller chunks to
> > find the highest order mTHPs that satisfy the collapse criteria. We start
> > by attempting the PMD order, then moved on the consecutively lower orders
> > (mTHP collapse). The stack maintains a pair of variables (offset, order),
> > indicating the number of PTEs from the start of the PMD, and the order of
> > the potential collapse candidate.
> >
> > The algorithm for consuming the bitmap works as such:
> > 1) push (0, HPAGE_PMD_ORDER) onto the stack
> > 2) pop the stack
> > 3) check if the number of set bits in that (offset,order) pair
> > statisfy the max_ptes_none threshold for that order
> > 4) if yes, attempt collapse
> > 5) if no (or collapse fails), push two new stack items representing
> > the left and right halves of the current bitmap range, at the
> > next lower order
> > 6) repeat at step (2) until stack is empty.
> >
> > Below is a diagram representing the algorithm and stack items:
> >
> > offset mid_offset
> > | |
> > | |
> > v v
> > ____________________________________
> > | PTE Page Table |
> > --------------------------------------
> > <-------><------->
> > order-1 order-1
> >
> > mTHP collapses reject regions containing swapped out or shared pages.
> > This is because adding new entries can lead to new none pages, and these
> > may lead to constant promotion into a higher order mTHP. A similar
> > issue can occur with "max_ptes_none > HPAGE_PMD_NR/2" due to a collapse
> > introducing at least 2x the number of pages, and on a future scan will
> > satisfy the promotion condition once again. This issue is prevented via
> > the collapse_max_ptes_none() function which imposes the max_ptes_none
> > restrictions above.
> >
> > We currently only support mTHP collapse for max_ptes_none values of 0
> > and HPAGE_PMD_NR - 1. resulting in the following behavior:
> >
> > - max_ptes_none=0: Never introduce new empty pages during collapse
> > - max_ptes_none=HPAGE_PMD_NR-1: Always try collapse to the highest
> > available mTHP order
> >
> > Any other max_ptes_none value will emit a warning and skip mTHP collapse
> > attempts. There should be no behavior change for PMD collapse.
> >
> > Once we determine what mTHP sizes fits best in that PMD range a collapse
> > is attempted. A minimum collapse order of 2 is used as this is the lowest
> > order supported by anon memory as defined by THP_ORDERS_ALL_ANON.
> >
> > Currently madv_collapse is not supported and will only attempt PMD
> > collapse.
> >
> > We can also remove the check for is_khugepaged inside the PMD scan as
> > the collapse_max_ptes_none() function handles this logic now.
> >
> > Signed-off-by: Nico Pache <npache@redhat.com>
> > ---
> > mm/khugepaged.c | 182 +++++++++++++++++++++++++++++++++++++++++++++---
> > 1 file changed, 174 insertions(+), 8 deletions(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index 3492b135d667..39bf7ea8a6e8 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -100,6 +100,30 @@ static DEFINE_READ_MOSTLY_HASHTABLE(mm_slots_hash, MM_SLOTS_HASH_BITS);
> >
> > static struct kmem_cache *mm_slot_cache __ro_after_init;
> >
> > +#define KHUGEPAGED_MIN_MTHP_ORDER 2
> > +/*
> > + * mthp_collapse() does an iterative DFS over a binary tree, from
> > + * HPAGE_PMD_ORDER down to KHUGEPAGED_MIN_MTHP_ORDER. The max stack
> > + * size needed for a DFS on a binary tree is height + 1, where
> > + * height = HPAGE_PMD_ORDER - KHUGEPAGED_MIN_MTHP_ORDER.
> > + *
> > + * ilog2 is used in place of HPAGE_PMD_ORDER because some architectures
> > + * (e.g. ppc64le) do not define HPAGE_PMD_ORDER until after build time.
> > + */
> > +#define MTHP_STACK_SIZE (ilog2(MAX_PTRS_PER_PTE) - KHUGEPAGED_MIN_MTHP_ORDER + 1)
> > +
> > +/*
> > + * Defines a range of PTE entries in a PTE page table which are being
> > + * considered for mTHP collapse.
> > + *
> > + * @offset: the offset of the first PTE entry in a PMD range.
> > + * @order: the order of the PTE entries being considered for collapse.
> > + */
> > +struct mthp_range {
> > + u16 offset;
> > + u8 order;
> > +};
> > +
> > struct collapse_control {
> > bool is_khugepaged;
> >
> > @@ -111,6 +135,12 @@ struct collapse_control {
> >
> > /* nodemask for allocation fallback */
> > nodemask_t alloc_nmask;
> > +
> > + /* Each bit represents a single occupied (!none/zero) page. */
> > + DECLARE_BITMAP(mthp_bitmap, MAX_PTRS_PER_PTE);
> > + /* A mask of the current range being considered for mTHP collapse. */
> > + DECLARE_BITMAP(mthp_bitmap_mask, MAX_PTRS_PER_PTE);
> > + struct mthp_range mthp_bitmap_stack[MTHP_STACK_SIZE];
> > };
> >
> > /**
> > @@ -1404,20 +1434,140 @@ static enum scan_result collapse_huge_page(struct mm_struct *mm, unsigned long s
> > return result;
> > }
> >
> > +static void collapse_mthp_stack_push(struct collapse_control *cc, int *stack_size,
> > + u16 offset, u8 order)
> > +{
> > + const int size = *stack_size;
> > + struct mthp_range *stack = &cc->mthp_bitmap_stack[size];
> > +
> > + VM_WARN_ON_ONCE(size >= MTHP_STACK_SIZE);
> > + stack->order = order;
> > + stack->offset = offset;
> > + (*stack_size)++;
> > +}
> > +
> > +static struct mthp_range collapse_mthp_stack_pop(struct collapse_control *cc,
> > + int *stack_size)
> > +{
> > + const int size = *stack_size;
> > +
> > + VM_WARN_ON_ONCE(size <= 0);
> > + (*stack_size)--;
> > + return cc->mthp_bitmap_stack[size - 1];
> > +}
> > +
> > +static unsigned int collapse_mthp_count_present(struct collapse_control *cc,
> > + u16 offset, unsigned int nr_ptes)
> > +{
> > + bitmap_zero(cc->mthp_bitmap_mask, MAX_PTRS_PER_PTE);
> > + bitmap_set(cc->mthp_bitmap_mask, offset, nr_ptes);
> > + return bitmap_weight_and(cc->mthp_bitmap, cc->mthp_bitmap_mask, MAX_PTRS_PER_PTE);
> > +}
> > +
> > +/*
> > + * mthp_collapse() consumes the bitmap that is generated during
> > + * collapse_scan_pmd() to determine what regions and mTHP orders fit best.
> > + *
> > + * Each bit in cc->mthp_bitmap represents a single occupied (!none/zero) page.
> > + * A stack structure cc->mthp_bitmap_stack is used to check different regions
> > + * of the bitmap for collapse eligibility. The stack maintains a pair of
> > + * variables (offset, order), indicating the number of PTEs from the start of
> > + * the PMD, and the order of the potential collapse candidate respectively. We
> > + * start at the PMD order and check if it is eligible for collapse; if not, we
> > + * add two entries to the stack at a lower order to represent the left and right
> > + * halves of the PTE page table we are examining.
> > + *
> > + * offset mid_offset
> > + * | |
> > + * | |
> > + * v v
> > + * --------------------------------------
> > + * | cc->mthp_bitmap |
> > + * --------------------------------------
> > + * <-------><------->
> > + * order-1 order-1
> > + *
> > + * For each of these, we determine how many PTE entries are occupied in the
> > + * range of PTE entries we propose to collapse, then we compare this to a
> > + * threshold number of PTE entries which would need to be occupied for a
> > + * collapse to be permitted at that order (accounting for max_ptes_none).
> > + *
> > + * If a collapse is permitted, we attempt to collapse the PTE range into a
> > + * mTHP.
> > + */
> > +static int mthp_collapse(struct mm_struct *mm, unsigned long address,
> > + int referenced, int unmapped, struct collapse_control *cc,
> > + unsigned long enabled_orders)
> > +{
> > + unsigned int nr_occupied_ptes, nr_ptes;
> > + int max_ptes_none, collapsed = 0, stack_size = 0;
> > + unsigned long collapse_address;
> > + struct mthp_range range;
> > + u16 offset;
> > + u8 order;
> > +
> > + collapse_mthp_stack_push(cc, &stack_size, 0, HPAGE_PMD_ORDER);
> > +
> > + while (stack_size) {
> > + range = collapse_mthp_stack_pop(cc, &stack_size);
> > + order = range.order;
> > + offset = range.offset;
> > + nr_ptes = 1UL << order;
> > +
> > + if (!test_bit(order, &enabled_orders))
> > + goto next_order;
> > +
> > + max_ptes_none = collapse_max_ptes_none(cc, NULL, order);
> > +
> > + if (max_ptes_none < 0)
> > + return collapsed;
> > +
> > + nr_occupied_ptes = collapse_mthp_count_present(cc, offset,
> > + nr_ptes);
> > +
> > + if (nr_occupied_ptes >= nr_ptes - max_ptes_none) {
> > + int ret;
> > +
> > + collapse_address = address + offset * PAGE_SIZE;
> > + ret = collapse_huge_page(mm, collapse_address, referenced,
> > + unmapped, cc, order);
> > + if (ret == SCAN_SUCCEED) {
> > + collapsed += nr_ptes;
> > + continue;
> > + }
> > + }
> > +
> > +next_order:
> > + if (order > KHUGEPAGED_MIN_MTHP_ORDER) {
>
> Hi Nico, thank you very much for your contributions to this series.
>
> I found a minor issue, for MADV_COLLAPSE, if collapse_huge_page() fails
> for some reason (e.g. allocate folio), it goes to next_order and
> continues splitting to the next small order. However, enabled_orders
> only supports HPAGE_PMD_ORDER, so it keeps runing the split operations
> without any effective work until KHUGEPAGED_MIN_MTHP_ORDER is reached
> before exiting. For khugepaged, e.g. setting only 2MB to always, also
> same phenomenon.
>
> This does not affect the overall functionality of mthp collapse, just
> redundant.
>
> The redundant operations can be easily skipped with the following
> modification. If I miss some thing, please let me know. Thanks!
Hi Vernon!
Thank you for the report and very clean solution :) I will implement
your optimization into this commit.
Cheers,
-- Nico
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 1a25af3d6d0f..fa407cce525c 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -1574,7 +1574,7 @@ static int mthp_collapse(struct mm_struct *mm, unsigned long address,
> }
>
> next_order:
> - if (order > KHUGEPAGED_MIN_MTHP_ORDER) {
> + if ((BIT(order) - 1) & enabled_orders) {
> const u8 next_order = order - 1;
> const u16 mid_offset = offset + (nr_ptes / 2);
>
> --
> Cheers,
> Vernon
>
> > + const u8 next_order = order - 1;
> > + const u16 mid_offset = offset + (nr_ptes / 2);
> > +
> > + collapse_mthp_stack_push(cc, &stack_size, mid_offset,
> > + next_order);
> > + collapse_mthp_stack_push(cc, &stack_size, offset,
> > + next_order);
> > + }
> > + }
> > + return collapsed;
> > +}
> > +
> > static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
> > struct vm_area_struct *vma, unsigned long start_addr,
> > bool *lock_dropped, struct collapse_control *cc)
> > {
> > - const int max_ptes_none = collapse_max_ptes_none(cc, vma, HPAGE_PMD_ORDER);
> > + int max_ptes_none = collapse_max_ptes_none(cc, vma, HPAGE_PMD_ORDER);
> > const unsigned int max_ptes_shared = collapse_max_ptes_shared(cc, HPAGE_PMD_ORDER);
> > const unsigned int max_ptes_swap = collapse_max_ptes_swap(cc, HPAGE_PMD_ORDER);
> > + enum tva_type tva_flags = cc->is_khugepaged ? TVA_KHUGEPAGED : TVA_FORCED_COLLAPSE;
> > pmd_t *pmd;
> > - pte_t *pte, *_pte;
> > - int none_or_zero = 0, shared = 0, referenced = 0;
> > + pte_t *pte, *_pte, pteval;
> > + int i;
> > + int none_or_zero = 0, shared = 0, nr_collapsed = 0, referenced = 0;
> > enum scan_result result = SCAN_FAIL;
> > struct page *page = NULL;
> > struct folio *folio = NULL;
> > unsigned long addr;
> > + unsigned long enabled_orders;
> > spinlock_t *ptl;
> > int node = NUMA_NO_NODE, unmapped = 0;
> >
> > @@ -1429,8 +1579,19 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
> > goto out;
> > }
> >
> > + bitmap_zero(cc->mthp_bitmap, MAX_PTRS_PER_PTE);
> > memset(cc->node_load, 0, sizeof(cc->node_load));
> > nodes_clear(cc->alloc_nmask);
> > +
> > + enabled_orders = collapse_allowable_orders(vma, vma->vm_flags, tva_flags);
> > +
> > + /*
> > + * If PMD is the only enabled order, enforce max_ptes_none, otherwise
> > + * scan all pages to populate the bitmap for mTHP collapse.
> > + */
> > + if (enabled_orders != BIT(HPAGE_PMD_ORDER))
> > + max_ptes_none = KHUGEPAGED_MAX_PTES_LIMIT;
> > +
> > pte = pte_offset_map_lock(mm, pmd, start_addr, &ptl);
> > if (!pte) {
> > cc->progress++;
> > @@ -1438,11 +1599,13 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
> > goto out;
> > }
> >
> > - for (addr = start_addr, _pte = pte; _pte < pte + HPAGE_PMD_NR;
> > - _pte++, addr += PAGE_SIZE) {
> > + for (i = 0; i < HPAGE_PMD_NR; i++) {
> > + _pte = pte + i;
> > + addr = start_addr + i * PAGE_SIZE;
> > + pteval = ptep_get(_pte);
> > +
> > cc->progress++;
> >
> > - pte_t pteval = ptep_get(_pte);
> > if (pte_none_or_zero(pteval)) {
> > if (++none_or_zero > max_ptes_none) {
> > result = SCAN_EXCEED_NONE_PTE;
> > @@ -1522,6 +1685,8 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
> > }
> > }
> >
> > + /* Set bit for occupied pages */
> > + __set_bit(i, cc->mthp_bitmap);
> > /*
> > * Record which node the original page is from and save this
> > * information to cc->node_load[].
> > @@ -1580,10 +1745,11 @@ static enum scan_result collapse_scan_pmd(struct mm_struct *mm,
> > if (result == SCAN_SUCCEED) {
> > /* collapse_huge_page expects the lock to be dropped before calling */
> > mmap_read_unlock(mm);
> > - result = collapse_huge_page(mm, start_addr, referenced,
> > - unmapped, cc, HPAGE_PMD_ORDER);
> > + nr_collapsed = mthp_collapse(mm, start_addr, referenced, unmapped,
> > + cc, enabled_orders);
> > /* collapse_huge_page will return with the mmap_lock released */
> > *lock_dropped = true;
> > + result = nr_collapsed ? SCAN_SUCCEED : SCAN_FAIL;
> > }
> > out:
> > trace_mm_khugepaged_scan_pmd(mm, folio, referenced,
> > --
> > 2.54.0
> >
> >
>
^ permalink raw reply
* Re: [PATCH v2 14/14] Documentation/userfaultfd: document RWP working set tracking
From: Kiryl Shutsemau @ 2026-05-22 12:37 UTC (permalink / raw)
To: Mike Rapoport
Cc: akpm, peterx, david, ljs, surenb, vbabka, Liam.Howlett, ziy,
corbet, skhan, seanjc, pbonzini, jthoughton, aarcange, sj,
usama.arif, linux-mm, linux-kernel, linux-doc, linux-kselftest,
kvm, kernel-team
In-Reply-To: <agQZiTUQNviaGIim@kernel.org>
On Wed, May 13, 2026 at 09:26:17AM +0300, Mike Rapoport wrote:
> On Fri, May 08, 2026 at 04:55:26PM +0100, Kiryl Shutsemau (Meta) wrote:
> > Add an admin-guide section covering UFFDIO_REGISTER_MODE_RWP:
> >
> > - sync and async fault models;
> > - UFFDIO_RWPROTECT semantics;
> > - UFFD_FEATURE_RWP_ASYNC;
> > - UFFDIO_SET_MODE runtime mode flips.
> >
> > It also covers typical VMM working-set-tracking workflow from detection
> > loop through sync-mode eviction and back to async.
>
> We'd also need man page update at some point :)
Will add a patch for man-pages in v3.
> > Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> > Assisted-by: Claude:claude-opus-4-6
> > ---
> > Documentation/admin-guide/mm/userfaultfd.rst | 226 ++++++++++++++++++-
> > 1 file changed, 220 insertions(+), 6 deletions(-)
> >
> > diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
> > index 1e533639fd50..5ac4ae3dff1b 100644
> > --- a/Documentation/admin-guide/mm/userfaultfd.rst
> > +++ b/Documentation/admin-guide/mm/userfaultfd.rst
> > @@ -275,16 +275,16 @@ tracking and it can be different in a few ways:
> > - Dirty information will not get lost if the pte was zapped due to
> > various reasons (e.g. during split of a shmem transparent huge page).
> >
> > - - Due to a reverted meaning of soft-dirty (page clean when uffd-wp bit
> > - set; dirty when uffd-wp bit cleared), it has different semantics on
> > - some of the memory operations. For example: ``MADV_DONTNEED`` on
> > + - Due to a reverted meaning of soft-dirty (page clean when the uffd bit
> > + is set; dirty when the uffd bit is cleared), it has different semantics
> > + on some of the memory operations. For example: ``MADV_DONTNEED`` on
> > anonymous (or ``MADV_REMOVE`` on a file mapping) will be treated as
> > - dirtying of memory by dropping uffd-wp bit during the procedure.
> > + dirtying of memory by dropping the uffd bit during the procedure.
> >
> > The user app can collect the "written/dirty" status by looking up the
> > -uffd-wp bit for the pages being interested in /proc/pagemap.
> > +uffd bit for the pages being interested in /proc/pagemap.
> >
> > -The page will not be under track of uffd-wp async mode until the page is
> > +The page will not be under track of userfaultfd-wp async mode until the page is
> > explicitly write-protected by ``ioctl(UFFDIO_WRITEPROTECT)`` with the mode
> > flag ``UFFDIO_WRITEPROTECT_MODE_WP`` set. Trying to resolve a page fault
> > that was tracked by async mode userfaultfd-wp is invalid.
> > @@ -307,6 +307,220 @@ transparent to the guest, we want that same address range to act as if it was
> > still poisoned, even though it's on a new physical host which ostensibly
> > doesn't have a memory error in the exact same spot.
> >
> > +Read-Write Protection
> > +---------------------
> > +
> > +``UFFDIO_REGISTER_MODE_RWP`` enables read-write protection tracking on a
> > +memory range. It is similar to (but faster than) ``mprotect(PROT_NONE)``
> > +combined with a signal handler; unlike ``mprotect(PROT_NONE)``, RWP only
> > +traps accesses to *present* PTEs, so accesses to unpopulated addresses in a
> > +protected range fall through to the normal missing-page path. It uses the
> > +PROT_NONE hinting mechanism (same as NUMA balancing) to make pages
> > +inaccessible while keeping them resident in memory. Works on anonymous,
> > +shmem, and hugetlbfs memory.
> > +
> > +This is designed for VM memory managers that need to track the working set
>
> This feature? Or RWP mode?
RWP.
> > +of guest memory for cold page eviction to tiered or remote storage.
> > +
> > +**Setup:**
> > +
> > +1. Open a userfaultfd and enable ``UFFD_FEATURE_RWP`` via ``UFFDIO_API``.
> > + Optionally request ``UFFD_FEATURE_RWP_ASYNC`` as well — it requires
> > + ``UFFD_FEATURE_RWP`` to be set in the same ``UFFDIO_API`` call.
> > +
> > +2. Register the guest memory range with ``UFFDIO_REGISTER_MODE_RWP``
> > + (and ``UFFDIO_REGISTER_MODE_MISSING`` if evicted pages will need to be
> > + fetched back from storage).
> > +
> > +**Feature availability:**
> > +
> > +RWP is built on top of two kernel primitives: a spare PTE bit owned by
> > +userfaultfd (``CONFIG_HAVE_ARCH_USERFAULTFD_WP``) and arch support for
>
> Please spell out architecture.
Ack.
> > +present-but-inaccessible PTEs (``CONFIG_ARCH_HAS_PTE_PROTNONE``). When both
> > +are available on a 64-bit kernel, the build selects
> > +``CONFIG_USERFAULTFD_RWP=y`` and the ``VM_UFFD_RWP`` VMA flag becomes
> > +available.
> > +
> > +``UFFD_FEATURE_RWP`` and ``UFFD_FEATURE_RWP_ASYNC`` are masked out of the
> > +features returned by ``UFFDIO_API`` when the running kernel or architecture
> > +cannot support them — for example 32-bit kernels (where ``VM_UFFD_RWP`` is
> > +unavailable), kernels built without ``CONFIG_USERFAULTFD_RWP``, and
> > +architectures whose ptes cannot carry the uffd bit at runtime (e.g. riscv
> > +without the ``SVRSW60T59B`` extension). ``UFFDIO_API`` does not fail;
> > +unsupported bits are simply absent from ``uffdio_api.features`` on return.
> > +VMMs should inspect the returned ``features`` after ``UFFDIO_API`` and fall
>
> Lets s/VMM/Callers/.
> Although RWP is designed for VMMs, it's not limited to them and I expect
> other use-cases will be coming along.
Okay.
--
Kiryl Shutsemau / Kirill A. Shutemov
^ permalink raw reply
* [PATCH v15 28/28] drm/connector: Update docs of "colorspace" for color format prop
From: Nicolas Frattaroli @ 2026-05-22 12:32 UTC (permalink / raw)
To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
Christian König, David Airlie, Simona Vetter,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Andrzej Hajda, Neil Armstrong, Robert Foss, Laurent Pinchart,
Jonas Karlman, Jernej Skrabec, Sandy Huang, Heiko Stübner,
Andy Yan, Jani Nikula, Rodrigo Vivi, Joonas Lahtinen,
Tvrtko Ursulin, Dmitry Baryshkov, Sascha Hauer, Rob Herring,
Jonathan Corbet, Shuah Khan, Daniel Stone
Cc: kernel, amd-gfx, dri-devel, linux-kernel, linux-arm-kernel,
linux-rockchip, intel-gfx, intel-xe, linux-doc, wayland-devel,
Nicolas Frattaroli
In-Reply-To: <20260522-color-format-v15-0-21fb136c9df2@collabora.com>
The colorspace property's documentation states that BT2020_RGB and
BT2020_YCC are equivalent, and the output format depends on the driver.
Now that there is a "color format" property that userspace can use to
explicitly set a format, update the colorspace docs to mention this.
The behaviour here is not changed for userspace that doesn't know about
the color format property yet, as the color format property defaults to
"AUTO", where the choice of output format is left up to drivers.
Reviewed-by: Daniel Stone <daniel@fooishbar.org>
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/gpu/drm/drm_connector.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index b91c1f76355e..52f9a6a8daf7 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -2573,7 +2573,8 @@ EXPORT_SYMBOL(drm_mode_create_aspect_ratio_property);
* conversion matrix and convert to the appropriate quantization
* range.
* The variants BT2020_RGB and BT2020_YCC are equivalent and the
- * driver chooses between RGB and YCbCr on its own.
+ * driver chooses between RGB and YCbCr based on the color format
+ * property.
*
* SMPTE_170M_YCC:
* BT709_YCC:
--
2.54.0
^ permalink raw reply related
* [PATCH v15 27/28] drm/bridge: Document bridge chain format selection
From: Nicolas Frattaroli @ 2026-05-22 12:32 UTC (permalink / raw)
To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
Christian König, David Airlie, Simona Vetter,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Andrzej Hajda, Neil Armstrong, Robert Foss, Laurent Pinchart,
Jonas Karlman, Jernej Skrabec, Sandy Huang, Heiko Stübner,
Andy Yan, Jani Nikula, Rodrigo Vivi, Joonas Lahtinen,
Tvrtko Ursulin, Dmitry Baryshkov, Sascha Hauer, Rob Herring,
Jonathan Corbet, Shuah Khan, Daniel Stone
Cc: kernel, amd-gfx, dri-devel, linux-kernel, linux-arm-kernel,
linux-rockchip, intel-gfx, intel-xe, linux-doc, wayland-devel,
Nicolas Frattaroli
In-Reply-To: <20260522-color-format-v15-0-21fb136c9df2@collabora.com>
The bridge chain format selection behaviour was, until now,
undocumented. With the addition of the "color format" DRM property, it's
not sufficiently complex enough that documentation is warranted,
especially for driver authors trying to do the right thing.
Add a high-level overview of how the process is supposed to work, and
mention what the display driver is supposed to do if it wants to make
use of this functionality.
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Daniel Stone <daniel@fooishbar.org>
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
Documentation/gpu/drm-kms-helpers.rst | 6 ++++++
drivers/gpu/drm/drm_bridge.c | 40 +++++++++++++++++++++++++++++++++++
2 files changed, 46 insertions(+)
diff --git a/Documentation/gpu/drm-kms-helpers.rst b/Documentation/gpu/drm-kms-helpers.rst
index 80453dda33b8..94cfc26acecc 100644
--- a/Documentation/gpu/drm-kms-helpers.rst
+++ b/Documentation/gpu/drm-kms-helpers.rst
@@ -171,6 +171,12 @@ Bridge Operations
.. kernel-doc:: drivers/gpu/drm/drm_bridge.c
:doc: bridge operations
+Bridge Chain Format Selection
+-----------------------------
+
+.. kernel-doc:: drivers/gpu/drm/drm_bridge.c
+ :doc: bridge chain format selection
+
Bridge Connector Helper
-----------------------
diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c
index 6ec8c23b36b5..5cb170328850 100644
--- a/drivers/gpu/drm/drm_bridge.c
+++ b/drivers/gpu/drm/drm_bridge.c
@@ -198,6 +198,46 @@
* driver.
*/
+/**
+ * DOC: bridge chain format selection
+ *
+ * A bridge chain, from display output processor to connector, may contain
+ * bridges capable of converting between bus formats on their inputs, and
+ * output formats on their outputs. For example, a bridge may be able to convert
+ * from RGB to YCbCr 4:4:4, and pass through YCbCr 4:2:0 as-is, but not convert
+ * from RGB to YCbCr 4:2:0. This means not all input formats map to all output
+ * formats.
+ *
+ * Further adding to this, a desired output color format, as specified with the
+ * "color format" DRM property, might not correspond 1:1 to what the display
+ * driver should set at its output. The bridge chain it feeds into may only be
+ * able to reach the desired output format, if a conversion from a different
+ * starting format is performed.
+ *
+ * To deal with this complexity, the recursive bridge chain bus format selection
+ * logic starts with the last bridge in the chain, usually the connector, and
+ * then recursively walks the chain of bridges backwards to the first bridge,
+ * trying to find a path.
+ *
+ * For a display driver to work in such a scenario, it should read the first
+ * bridge's bridge state to figure out which bus format the chain resolved to.
+ * If the first bridge's input format resolved to %MEDIA_BUS_FMT_FIXED, then its
+ * output format should be used.
+ *
+ * Special handling is done for HDMI as it relates to format selection. Instead
+ * of directly using the "color format" DRM property for bridge chains that end
+ * in HDMI bridges, the bridge chain format selection logic will trust the logic
+ * that set the HDMI output format. For the common HDMI state helper
+ * functionality, this means that %DRM_CONNECTOR_COLOR_FORMAT_AUTO will allow
+ * fallbacks to YCBCr 4:2:0 if the bandwidth requirements would otherwise be too
+ * high but the mode and connector allow it.
+ *
+ * For bridge chains that do not end in an HDMI bridge,
+ * %DRM_CONNECTOR_COLOR_FORMAT_AUTO will be satisfied with the first output
+ * format on the last bridge for which it can find a path back to the first
+ * bridge.
+ */
+
/* Protect bridge_list and bridge_lingering_list */
static DEFINE_MUTEX(bridge_lock);
static LIST_HEAD(bridge_list);
--
2.54.0
^ permalink raw reply related
* Re: [PATCH] tpm: tpm_tis: Add optional delay after relinquish
From: Jarkko Sakkinen @ 2026-05-22 12:37 UTC (permalink / raw)
To: Jim Broadus; +Cc: linux-integrity, linux-kernel, linux-doc, peterhuewe, jgg
In-Reply-To: <CAKgEEwswj4in29_hoy_dQQ18+GF=Uwf0LnwS=w7bwZCSW=mwjw@mail.gmail.com>
On Thu, May 21, 2026 at 11:03:29PM -0700, Jim Broadus wrote:
> Thank you Jarkko. I'll do that.
>
> Jim
Yeah, in this form it is quite unsable e.g., for any Linux distribution,
and somewhat involved for the user :-)
Conditionally on is much better with appropriate detection. Also, this
way the change improves the code base a bit given that chip->did_vid is
much more applicable than chip->manufaturer_id.
BR, Jarkko
^ permalink raw reply
* [PATCH v15 26/28] drm/tests: bridge: Add test for HDMI output bus formats helper
From: Nicolas Frattaroli @ 2026-05-22 12:32 UTC (permalink / raw)
To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
Christian König, David Airlie, Simona Vetter,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Andrzej Hajda, Neil Armstrong, Robert Foss, Laurent Pinchart,
Jonas Karlman, Jernej Skrabec, Sandy Huang, Heiko Stübner,
Andy Yan, Jani Nikula, Rodrigo Vivi, Joonas Lahtinen,
Tvrtko Ursulin, Dmitry Baryshkov, Sascha Hauer, Rob Herring,
Jonathan Corbet, Shuah Khan, Daniel Stone
Cc: kernel, amd-gfx, dri-devel, linux-kernel, linux-arm-kernel,
linux-rockchip, intel-gfx, intel-xe, linux-doc, wayland-devel,
Nicolas Frattaroli
In-Reply-To: <20260522-color-format-v15-0-21fb136c9df2@collabora.com>
The common atomic_get_output_bus_fmts helper for HDMI bridge connectors,
called drm_atomic_helper_bridge_get_hdmi_output_bus_fmts, should return
an array of output bus formats depending on the supported formats of the
connector, and the current output BPC.
Add a test to exercise some of this helper.
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Daniel Stone <daniel@fooishbar.org>
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/gpu/drm/tests/drm_bridge_test.c | 184 ++++++++++++++++++++++++++++++++
1 file changed, 184 insertions(+)
diff --git a/drivers/gpu/drm/tests/drm_bridge_test.c b/drivers/gpu/drm/tests/drm_bridge_test.c
index 92f142ca6695..9d4fbe709df4 100644
--- a/drivers/gpu/drm/tests/drm_bridge_test.c
+++ b/drivers/gpu/drm/tests/drm_bridge_test.c
@@ -5,6 +5,7 @@
#include <linux/cleanup.h>
#include <linux/media-bus-format.h>
+#include <drm/drm_atomic_helper.h>
#include <drm/drm_atomic_state_helper.h>
#include <drm/drm_atomic_uapi.h>
#include <drm/drm_bridge.h>
@@ -118,6 +119,28 @@ static const struct drm_bridge_funcs drm_test_bridge_atomic_funcs = {
.atomic_reset = drm_atomic_helper_bridge_reset,
};
+static int dummy_clear_infoframe(struct drm_bridge *bridge)
+{
+ return 0;
+}
+
+static int dummy_write_infoframe(struct drm_bridge *bridge, const u8 *buffer,
+ size_t len)
+{
+ return 0;
+}
+
+static const struct drm_bridge_funcs drm_test_bridge_bus_fmts_funcs = {
+ .atomic_get_output_bus_fmts = drm_atomic_helper_bridge_get_hdmi_output_bus_fmts,
+ .atomic_destroy_state = drm_atomic_helper_bridge_destroy_state,
+ .atomic_duplicate_state = drm_atomic_helper_bridge_duplicate_state,
+ .atomic_reset = drm_atomic_helper_bridge_reset,
+ .hdmi_write_avi_infoframe = dummy_write_infoframe,
+ .hdmi_write_hdmi_infoframe = dummy_write_infoframe,
+ .hdmi_clear_avi_infoframe = dummy_clear_infoframe,
+ .hdmi_clear_hdmi_infoframe = dummy_clear_infoframe,
+};
+
/**
* struct fmt_tuple - a tuple of input/output MEDIA_BUS_FMT_*
*/
@@ -537,6 +560,83 @@ drm_test_bridge_chain_init(struct kunit *test, unsigned int num_bridges,
return priv;
}
+static struct drm_bridge_init_priv *
+drm_test_bridge_hdmi_init(struct kunit *test, const struct drm_bridge_funcs *funcs,
+ unsigned int supported_formats, int max_bpc)
+{
+ struct drm_bridge_init_priv *priv;
+ struct drm_encoder *enc;
+ struct drm_bridge *bridge;
+ struct drm_device *drm;
+ struct device *dev;
+ int ret;
+
+ dev = drm_kunit_helper_alloc_device(test);
+ if (IS_ERR(dev))
+ return ERR_CAST(dev);
+
+ priv = drm_kunit_helper_alloc_drm_device(test, dev,
+ struct drm_bridge_init_priv, drm,
+ DRIVER_MODESET | DRIVER_ATOMIC);
+ if (IS_ERR(priv))
+ return ERR_CAST(priv);
+
+ priv->test_bridge = devm_drm_bridge_alloc(dev, struct drm_bridge_priv, bridge, funcs);
+ if (IS_ERR(priv->test_bridge))
+ return ERR_CAST(priv->test_bridge);
+
+ priv->test_bridge->data = priv;
+
+ drm = &priv->drm;
+ priv->plane = drm_kunit_helper_create_primary_plane(test, drm,
+ NULL,
+ NULL,
+ NULL, 0,
+ NULL);
+ if (IS_ERR(priv->plane))
+ return ERR_CAST(priv->plane);
+
+ priv->crtc = drm_kunit_helper_create_crtc(test, drm,
+ priv->plane, NULL,
+ NULL,
+ NULL);
+ if (IS_ERR(priv->crtc))
+ return ERR_CAST(priv->crtc);
+
+ enc = &priv->encoder;
+ ret = drmm_encoder_init(drm, enc, NULL, DRM_MODE_ENCODER_TMDS, NULL);
+ if (ret)
+ return ERR_PTR(ret);
+
+ enc->possible_crtcs = drm_crtc_mask(priv->crtc);
+
+ bridge = &priv->test_bridge->bridge;
+ bridge->type = DRM_MODE_CONNECTOR_HDMIA;
+ bridge->supported_formats = supported_formats;
+ bridge->max_bpc = max_bpc;
+ bridge->ops |= DRM_BRIDGE_OP_HDMI;
+ bridge->vendor = "LNX";
+ bridge->product = "KUnit";
+
+ ret = drm_kunit_bridge_add(test, bridge);
+ if (ret)
+ return ERR_PTR(ret);
+
+ ret = drm_bridge_attach(enc, bridge, NULL, 0);
+ if (ret)
+ return ERR_PTR(ret);
+
+ priv->connector = drm_bridge_connector_init(drm, enc);
+ if (IS_ERR(priv->connector))
+ return ERR_CAST(priv->connector);
+
+ drm_connector_attach_encoder(priv->connector, enc);
+
+ drm_mode_config_reset(drm);
+
+ return priv;
+}
+
/*
* Test that drm_bridge_get_current_state() returns the last committed
* state for an atomic bridge.
@@ -784,10 +884,94 @@ static void drm_test_drm_bridge_helper_reset_crtc_legacy(struct kunit *test)
KUNIT_EXPECT_EQ(test, bridge_priv->disable_count, 1);
}
+/*
+ * Test that a bridge using the drm_atomic_helper_bridge_get_hdmi_output_bus_fmts()
+ * function for &drm_bridge_funcs.atomic_get_output_bus_fmts behaves as expected
+ * for an HDMI connector bridge. Does so by creating an HDMI bridge connector
+ * with RGB444, YCBCR444, and YCBCR420 (but not YCBCR422) as supported formats,
+ * sets the output depth to 8 bits per component, and then validates the returned
+ * list of bus formats.
+ */
+static void drm_test_drm_bridge_helper_hdmi_output_bus_fmts(struct kunit *test)
+{
+ struct drm_connector_state *conn_state;
+ struct drm_bridge_state *bridge_state;
+ struct drm_modeset_acquire_ctx ctx;
+ struct drm_bridge_init_priv *priv;
+ struct drm_crtc_state *crtc_state;
+ struct drm_atomic_commit *state;
+ struct drm_display_mode *mode;
+ unsigned int num_output_fmts;
+ struct drm_bridge *bridge;
+ u32 *out_bus_fmts;
+ int ret;
+
+ priv = drm_test_bridge_hdmi_init(test, &drm_test_bridge_bus_fmts_funcs,
+ BIT(DRM_OUTPUT_COLOR_FORMAT_RGB444) |
+ BIT(DRM_OUTPUT_COLOR_FORMAT_YCBCR444) |
+ BIT(DRM_OUTPUT_COLOR_FORMAT_YCBCR420),
+ 12);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv);
+
+ bridge = &priv->test_bridge->bridge;
+
+ drm_modeset_acquire_init(&ctx, 0);
+
+ state = drm_kunit_helper_atomic_state_alloc(test, &priv->drm, &ctx);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, state);
+
+retry_commit:
+ conn_state = drm_atomic_get_connector_state(state, priv->connector);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, conn_state);
+
+ conn_state->hdmi.output_bpc = 8;
+
+ mode = drm_kunit_display_mode_from_cea_vic(test, &priv->drm, 16);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, mode);
+
+ ret = drm_atomic_set_crtc_for_connector(conn_state, priv->crtc);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state = drm_atomic_get_crtc_state(state, priv->crtc);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, crtc_state);
+
+ ret = drm_atomic_set_mode_for_crtc(crtc_state, mode);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state->enable = true;
+ crtc_state->active = true;
+
+ bridge_state = drm_atomic_get_bridge_state(state, bridge);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, bridge_state);
+
+ out_bus_fmts = bridge->funcs->atomic_get_output_bus_fmts(
+ bridge, bridge_state, crtc_state, conn_state, &num_output_fmts);
+ KUNIT_EXPECT_NOT_NULL(test, out_bus_fmts);
+ KUNIT_EXPECT_EQ(test, num_output_fmts, 3);
+
+ KUNIT_EXPECT_EQ(test, out_bus_fmts[0], MEDIA_BUS_FMT_RGB888_1X24);
+ KUNIT_EXPECT_EQ(test, out_bus_fmts[1], MEDIA_BUS_FMT_YUV8_1X24);
+ KUNIT_EXPECT_EQ(test, out_bus_fmts[2], MEDIA_BUS_FMT_UYYVYY8_0_5X24);
+
+ drm_modeset_drop_locks(&ctx);
+ drm_modeset_acquire_fini(&ctx);
+
+ kfree(out_bus_fmts);
+}
+
static struct kunit_case drm_bridge_helper_reset_crtc_tests[] = {
KUNIT_CASE(drm_test_drm_bridge_helper_reset_crtc_atomic),
KUNIT_CASE(drm_test_drm_bridge_helper_reset_crtc_atomic_disabled),
KUNIT_CASE(drm_test_drm_bridge_helper_reset_crtc_legacy),
+ KUNIT_CASE(drm_test_drm_bridge_helper_hdmi_output_bus_fmts),
{ }
};
--
2.54.0
^ permalink raw reply related
* [PATCH v15 25/28] drm/tests: bridge: Add KUnit tests for bridge chain format selection
From: Nicolas Frattaroli @ 2026-05-22 12:32 UTC (permalink / raw)
To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
Christian König, David Airlie, Simona Vetter,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Andrzej Hajda, Neil Armstrong, Robert Foss, Laurent Pinchart,
Jonas Karlman, Jernej Skrabec, Sandy Huang, Heiko Stübner,
Andy Yan, Jani Nikula, Rodrigo Vivi, Joonas Lahtinen,
Tvrtko Ursulin, Dmitry Baryshkov, Sascha Hauer, Rob Herring,
Jonathan Corbet, Shuah Khan, Daniel Stone
Cc: kernel, amd-gfx, dri-devel, linux-kernel, linux-arm-kernel,
linux-rockchip, intel-gfx, intel-xe, linux-doc, wayland-devel,
Nicolas Frattaroli
In-Reply-To: <20260522-color-format-v15-0-21fb136c9df2@collabora.com>
With the "color format" property, the bridge chain format selection has
gained increased complexity. Instead of simply finding any sequence of
bus formats that works, the bridge chain format selection needs to pick
a sequence that results in the requested color format.
Add KUnit tests for this new logic. These take the form of some pleasant
preprocessor macros to make it less cumbersome to define test bridges
with a set of possible input and output formats.
The input and output formats are defined for bridges in the form of
tuples, where the first member defines the input format, and the second
member defines the output format that can be produced from this input
format. This means the tests can construct scenarios in which not all
inputs can be converted to all outputs.
Some tests are added to test interesting scenarios to exercise the bus
format selection in the presence of a specific color format request.
Furthermore, tests are added to verify that bridge chains that end in an
HDMI connector will always prefer RGB when the color format is
DRM_CONNECTOR_COLOR_FORMAT_AUTO, as is the behaviour in the HDMI state
helpers.
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Daniel Stone <daniel@fooishbar.org>
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/gpu/drm/tests/drm_bridge_test.c | 787 ++++++++++++++++++++++++++++++++
1 file changed, 787 insertions(+)
diff --git a/drivers/gpu/drm/tests/drm_bridge_test.c b/drivers/gpu/drm/tests/drm_bridge_test.c
index 64b665580a88..92f142ca6695 100644
--- a/drivers/gpu/drm/tests/drm_bridge_test.c
+++ b/drivers/gpu/drm/tests/drm_bridge_test.c
@@ -2,15 +2,23 @@
/*
* Kunit test for drm_bridge functions
*/
+#include <linux/cleanup.h>
+#include <linux/media-bus-format.h>
+
#include <drm/drm_atomic_state_helper.h>
+#include <drm/drm_atomic_uapi.h>
#include <drm/drm_bridge.h>
#include <drm/drm_bridge_connector.h>
#include <drm/drm_bridge_helper.h>
+#include <drm/drm_edid.h>
#include <drm/drm_kunit_helpers.h>
+#include <drm/drm_managed.h>
#include <kunit/device.h>
#include <kunit/test.h>
+#include "drm_kunit_edid.h"
+
/*
* Mimick the typical "private" struct defined by a bridge driver, which
* embeds a bridge plus other fields.
@@ -37,6 +45,21 @@ struct drm_bridge_init_priv {
bool destroyed;
};
+struct drm_bridge_chain_priv {
+ struct drm_device drm;
+ struct drm_encoder encoder;
+ struct drm_plane *plane;
+ struct drm_crtc *crtc;
+ struct drm_connector *connector;
+ unsigned int num_bridges;
+
+ /**
+ * @test_bridges: array of pointers to &struct drm_bridge_priv entries
+ * of which the first @num_bridges entries are valid.
+ */
+ struct drm_bridge_priv **test_bridges;
+};
+
static struct drm_bridge_priv *bridge_to_priv(struct drm_bridge *bridge)
{
return container_of(bridge, struct drm_bridge_priv, bridge);
@@ -95,6 +118,229 @@ static const struct drm_bridge_funcs drm_test_bridge_atomic_funcs = {
.atomic_reset = drm_atomic_helper_bridge_reset,
};
+/**
+ * struct fmt_tuple - a tuple of input/output MEDIA_BUS_FMT_*
+ */
+struct fmt_tuple {
+ u32 in_fmt;
+ u32 out_fmt;
+};
+
+/*
+ * Format mapping that only accepts RGB888, and outputs only RGB888
+ */
+static const struct fmt_tuple rgb8_passthrough[] = {
+ { MEDIA_BUS_FMT_RGB888_1X24, MEDIA_BUS_FMT_RGB888_1X24 },
+};
+
+/*
+ * Format mapping that only accepts YUV444, and outputs only YUV444
+ */
+static const struct fmt_tuple yuv8_passthrough[] = {
+ { MEDIA_BUS_FMT_YUV8_1X24, MEDIA_BUS_FMT_YUV8_1X24 },
+};
+
+/*
+ * Format mapping where 8bpc RGB -> 8bpc YUV444, or ID(RGB) or ID(YUV444)
+ */
+static const struct fmt_tuple rgb8_to_yuv8_or_id[] = {
+ { MEDIA_BUS_FMT_RGB888_1X24, MEDIA_BUS_FMT_RGB888_1X24 },
+ { MEDIA_BUS_FMT_YUV8_1X24, MEDIA_BUS_FMT_YUV8_1X24 },
+ { MEDIA_BUS_FMT_RGB888_1X24, MEDIA_BUS_FMT_YUV8_1X24 },
+};
+
+static const struct fmt_tuple rgb8_to_id_yuv8_or_yuv8_to_yuv422_yuv420[] = {
+ { MEDIA_BUS_FMT_RGB888_1X24, MEDIA_BUS_FMT_RGB888_1X24 },
+ { MEDIA_BUS_FMT_RGB888_1X24, MEDIA_BUS_FMT_YUV8_1X24 },
+ { MEDIA_BUS_FMT_YUV8_1X24, MEDIA_BUS_FMT_UYVY8_1X16 },
+ { MEDIA_BUS_FMT_YUV8_1X24, MEDIA_BUS_FMT_UYYVYY8_0_5X24 },
+};
+
+/*
+ * Format mapping where 8bpc YUV444 -> 8bpc RGB, or ID(YUV444)
+ */
+static const struct fmt_tuple yuv8_to_rgb8_or_id[] = {
+ { MEDIA_BUS_FMT_YUV8_1X24, MEDIA_BUS_FMT_YUV8_1X24 },
+ { MEDIA_BUS_FMT_YUV8_1X24, MEDIA_BUS_FMT_RGB888_1X24 },
+};
+
+/*
+ * A format mapping that acts like a video processor that generates an RGB signal
+ */
+static const struct fmt_tuple rgb_producer[] = {
+ { MEDIA_BUS_FMT_FIXED, MEDIA_BUS_FMT_RGB888_1X24 },
+ { MEDIA_BUS_FMT_FIXED, MEDIA_BUS_FMT_RGB101010_1X30 },
+ { MEDIA_BUS_FMT_FIXED, MEDIA_BUS_FMT_RGB121212_1X36 },
+};
+
+/*
+ * A format mapping that acts like a video processor that generates an 8-bit RGB,
+ * YUV444 or YUV420 signal
+ */
+static const struct fmt_tuple rgb_yuv444_yuv420_producer[] = {
+ { MEDIA_BUS_FMT_FIXED, MEDIA_BUS_FMT_RGB888_1X24 },
+ { MEDIA_BUS_FMT_FIXED, MEDIA_BUS_FMT_YUV8_1X24 },
+ { MEDIA_BUS_FMT_FIXED, MEDIA_BUS_FMT_UYYVYY8_0_5X24 },
+};
+
+static const struct fmt_tuple rgb8_yuv444_yuv422_passthrough[] = {
+ { MEDIA_BUS_FMT_RGB888_1X24, MEDIA_BUS_FMT_RGB888_1X24 },
+ { MEDIA_BUS_FMT_YUV8_1X24, MEDIA_BUS_FMT_YUV8_1X24 },
+ { MEDIA_BUS_FMT_UYVY8_1X16, MEDIA_BUS_FMT_UYVY8_1X16 },
+};
+
+static const struct fmt_tuple yuv444_yuv422_rgb8_passthrough[] = {
+ { MEDIA_BUS_FMT_YUV8_1X24, MEDIA_BUS_FMT_YUV8_1X24 },
+ { MEDIA_BUS_FMT_UYVY8_1X16, MEDIA_BUS_FMT_UYVY8_1X16 },
+ { MEDIA_BUS_FMT_RGB888_1X24, MEDIA_BUS_FMT_RGB888_1X24 },
+};
+
+static bool fmt_in_list(const u32 fmt, const u32 *out_fmts, const size_t num_fmts)
+{
+ size_t i;
+
+ for (i = 0; i < num_fmts; i++)
+ if (out_fmts[i] == fmt)
+ return true;
+
+ return false;
+}
+
+/**
+ * get_tuples_out_fmts - Get unique output formats of a &struct fmt_tuple list
+ * @fmt_tuples: array of &struct fmt_tuple
+ * @num_fmt_tuples: number of entries in @fmt_tuples
+ * @out_fmts: target array to store the unique output bus formats
+ *
+ * Returns the number of unique output formats, i.e. the number of entries in
+ * @out_fmts that were populated with sensible values.
+ */
+static size_t get_tuples_out_fmts(const struct fmt_tuple *fmt_tuples,
+ const size_t num_fmt_tuples, u32 *out_fmts)
+{
+ size_t num_unique = 0;
+ size_t i;
+
+ for (i = 0; i < num_fmt_tuples; i++)
+ if (!fmt_in_list(fmt_tuples[i].out_fmt, out_fmts, num_unique))
+ out_fmts[num_unique++] = fmt_tuples[i].out_fmt;
+
+ return num_unique;
+}
+
+#define DEFINE_FMT_FUNCS_FROM_TUPLES(name) \
+static u32 *drm_test_bridge_ ## name ## _out_fmts(struct drm_bridge *bridge, \
+ struct drm_bridge_state *bridge_state, \
+ struct drm_crtc_state *crtc_state, \
+ struct drm_connector_state *conn_state, \
+ unsigned int *num_output_fmts) \
+{ \
+ u32 *out_fmts = kcalloc(ARRAY_SIZE((name)), sizeof(u32), GFP_KERNEL); \
+ \
+ if (out_fmts) \
+ *num_output_fmts = get_tuples_out_fmts((name), ARRAY_SIZE((name)), out_fmts); \
+ else \
+ *num_output_fmts = 0; \
+ \
+ return out_fmts; \
+} \
+ \
+static u32 *drm_test_bridge_ ## name ## _in_fmts(struct drm_bridge *bridge, \
+ struct drm_bridge_state *bridge_state, \
+ struct drm_crtc_state *crtc_state, \
+ struct drm_connector_state *conn_state, \
+ u32 output_fmt, \
+ unsigned int *num_input_fmts) \
+{ \
+ u32 *in_fmts = kcalloc(ARRAY_SIZE((name)), sizeof(u32), GFP_KERNEL); \
+ unsigned int num_fmts = 0; \
+ size_t i; \
+ \
+ if (!in_fmts) { \
+ *num_input_fmts = 0; \
+ return NULL; \
+ } \
+ \
+ for (i = 0; i < ARRAY_SIZE((name)); i++) \
+ if ((name)[i].out_fmt == output_fmt) \
+ in_fmts[num_fmts++] = (name)[i].in_fmt; \
+ \
+ *num_input_fmts = num_fmts; \
+ \
+ return in_fmts; \
+}
+
+#define DRM_BRIDGE_ATOMIC_WITH_BUS_FMT_HDMI_FUNC(ident, input_fmts_func, output_fmts_func, \
+ hdmi_write_infoframe_func, \
+ hdmi_clear_infoframe_func) \
+static const struct drm_bridge_funcs (ident) = { \
+ .atomic_enable = drm_test_bridge_atomic_enable, \
+ .atomic_disable = drm_test_bridge_atomic_disable, \
+ .atomic_destroy_state = drm_atomic_helper_bridge_destroy_state, \
+ .atomic_duplicate_state = drm_atomic_helper_bridge_duplicate_state, \
+ .atomic_reset = drm_atomic_helper_bridge_reset, \
+ .atomic_get_input_bus_fmts = (input_fmts_func), \
+ .atomic_get_output_bus_fmts = (output_fmts_func), \
+ .hdmi_write_avi_infoframe = (hdmi_write_infoframe_func), \
+ .hdmi_clear_avi_infoframe = (hdmi_clear_infoframe_func), \
+ .hdmi_write_hdmi_infoframe = (hdmi_write_infoframe_func), \
+ .hdmi_clear_hdmi_infoframe = (hdmi_clear_infoframe_func), \
+}
+
+#define DRM_BRIDGE_ATOMIC_WITH_BUS_FMT_FUNC(ident, input_fmts_func, output_fmts_func) \
+ DRM_BRIDGE_ATOMIC_WITH_BUS_FMT_HDMI_FUNC(ident, input_fmts_func, output_fmts_func, \
+ NULL, NULL)
+
+#define DRM_BRIDGE_ATOMIC_WITH_BUS_FMT(_name) \
+ DRM_BRIDGE_ATOMIC_WITH_BUS_FMT_FUNC(_name ## _funcs, \
+ drm_test_bridge_ ## _name ## _in_fmts, \
+ drm_test_bridge_ ## _name ## _out_fmts)
+
+static int drm_test_bridge_write_infoframe_stub(struct drm_bridge *bridge,
+ const u8 *buffer, size_t len)
+{
+ return 0;
+}
+
+static int drm_test_bridge_clear_infoframe_stub(struct drm_bridge *bridge)
+{
+ return 0;
+}
+
+#define DRM_BRIDGE_ATOMIC_WITH_BUS_FMT_HDMI(_name) \
+ DRM_BRIDGE_ATOMIC_WITH_BUS_FMT_HDMI_FUNC(_name ## _hdmi ## _funcs, \
+ drm_test_bridge_ ## _name ## _in_fmts, \
+ drm_test_bridge_ ## _name ## _out_fmts, \
+ drm_test_bridge_write_infoframe_stub, \
+ drm_test_bridge_clear_infoframe_stub)
+DEFINE_FMT_FUNCS_FROM_TUPLES(rgb8_passthrough)
+DRM_BRIDGE_ATOMIC_WITH_BUS_FMT(rgb8_passthrough);
+
+DEFINE_FMT_FUNCS_FROM_TUPLES(yuv8_passthrough)
+DRM_BRIDGE_ATOMIC_WITH_BUS_FMT(yuv8_passthrough);
+
+DEFINE_FMT_FUNCS_FROM_TUPLES(rgb8_to_yuv8_or_id)
+DRM_BRIDGE_ATOMIC_WITH_BUS_FMT(rgb8_to_yuv8_or_id);
+
+DEFINE_FMT_FUNCS_FROM_TUPLES(yuv8_to_rgb8_or_id)
+DRM_BRIDGE_ATOMIC_WITH_BUS_FMT(yuv8_to_rgb8_or_id);
+
+DEFINE_FMT_FUNCS_FROM_TUPLES(rgb_producer)
+DRM_BRIDGE_ATOMIC_WITH_BUS_FMT(rgb_producer);
+
+DEFINE_FMT_FUNCS_FROM_TUPLES(rgb_yuv444_yuv420_producer)
+DRM_BRIDGE_ATOMIC_WITH_BUS_FMT(rgb_yuv444_yuv420_producer);
+
+DEFINE_FMT_FUNCS_FROM_TUPLES(rgb8_to_id_yuv8_or_yuv8_to_yuv422_yuv420)
+DRM_BRIDGE_ATOMIC_WITH_BUS_FMT(rgb8_to_id_yuv8_or_yuv8_to_yuv422_yuv420);
+
+DEFINE_FMT_FUNCS_FROM_TUPLES(rgb8_yuv444_yuv422_passthrough)
+DRM_BRIDGE_ATOMIC_WITH_BUS_FMT(rgb8_yuv444_yuv422_passthrough);
+
+DEFINE_FMT_FUNCS_FROM_TUPLES(yuv444_yuv422_rgb8_passthrough)
+DRM_BRIDGE_ATOMIC_WITH_BUS_FMT(yuv444_yuv422_rgb8_passthrough);
+DRM_BRIDGE_ATOMIC_WITH_BUS_FMT_HDMI(yuv444_yuv422_rgb8_passthrough);
+
KUNIT_DEFINE_ACTION_WRAPPER(drm_bridge_remove_wrapper,
drm_bridge_remove,
struct drm_bridge *);
@@ -178,6 +424,119 @@ drm_test_bridge_init(struct kunit *test, const struct drm_bridge_funcs *funcs)
return priv;
}
+static struct drm_bridge_chain_priv *
+drm_test_bridge_chain_init(struct kunit *test, unsigned int num_bridges,
+ const struct drm_bridge_funcs **funcs)
+{
+ struct drm_bridge_chain_priv *priv;
+ const struct drm_edid *edid;
+ struct drm_bridge *prev;
+ struct drm_encoder *enc;
+ struct drm_bridge *bridge;
+ struct drm_device *drm;
+ bool has_hdmi = false;
+ struct device *dev;
+ unsigned int i;
+ int ret;
+
+ dev = drm_kunit_helper_alloc_device(test);
+ if (IS_ERR(dev))
+ return ERR_CAST(dev);
+
+ priv = drm_kunit_helper_alloc_drm_device(test, dev, struct drm_bridge_chain_priv,
+ drm, DRIVER_MODESET | DRIVER_ATOMIC);
+ if (IS_ERR(priv))
+ return ERR_CAST(priv);
+
+ drm = &priv->drm;
+
+ priv->test_bridges = drmm_kmalloc_array(drm, num_bridges, sizeof(*priv->test_bridges),
+ GFP_KERNEL);
+ if (!priv->test_bridges)
+ return ERR_PTR(-ENOMEM);
+
+ priv->num_bridges = num_bridges;
+
+ for (i = 0; i < num_bridges; i++) {
+ priv->test_bridges[i] = devm_drm_bridge_alloc(dev, struct drm_bridge_priv,
+ bridge, funcs[i]);
+ if (IS_ERR(priv->test_bridges[i]))
+ return ERR_CAST(priv->test_bridges[i]);
+
+ priv->test_bridges[i]->data = priv;
+ }
+
+ priv->plane = drm_kunit_helper_create_primary_plane(test, drm, NULL, NULL,
+ NULL, 0, NULL);
+ if (IS_ERR(priv->plane))
+ return ERR_CAST(priv->plane);
+
+ priv->crtc = drm_kunit_helper_create_crtc(test, drm, priv->plane, NULL,
+ NULL, NULL);
+ if (IS_ERR(priv->crtc))
+ return ERR_CAST(priv->crtc);
+
+ enc = &priv->encoder;
+ ret = drmm_encoder_init(drm, enc, NULL, DRM_MODE_ENCODER_TMDS, NULL);
+ if (ret)
+ return ERR_PTR(ret);
+
+ enc->possible_crtcs = drm_crtc_mask(priv->crtc);
+
+ prev = NULL;
+ for (i = 0; i < num_bridges; i++) {
+ bridge = &priv->test_bridges[i]->bridge;
+ bridge->type = DRM_MODE_CONNECTOR_VIRTUAL;
+
+ if (bridge->funcs->hdmi_write_hdmi_infoframe) {
+ has_hdmi = true;
+ bridge->ops |= DRM_BRIDGE_OP_HDMI;
+ bridge->type = DRM_MODE_CONNECTOR_HDMIA;
+ bridge->vendor = "LNX";
+ bridge->product = "KUnit";
+ bridge->supported_formats = (BIT(DRM_OUTPUT_COLOR_FORMAT_RGB444) |
+ BIT(DRM_OUTPUT_COLOR_FORMAT_YCBCR444) |
+ BIT(DRM_OUTPUT_COLOR_FORMAT_YCBCR422) |
+ BIT(DRM_OUTPUT_COLOR_FORMAT_YCBCR420));
+ }
+
+ ret = drm_kunit_bridge_add(test, bridge);
+ if (ret)
+ return ERR_PTR(ret);
+
+ ret = drm_bridge_attach(enc, bridge, prev, 0);
+ if (ret)
+ return ERR_PTR(ret);
+
+ prev = bridge;
+ }
+
+ priv->connector = drm_bridge_connector_init(drm, enc);
+ if (IS_ERR(priv->connector))
+ return ERR_CAST(priv->connector);
+
+ drm_connector_attach_encoder(priv->connector, enc);
+
+ drm_mode_config_reset(drm);
+
+ if (!has_hdmi)
+ return priv;
+
+ scoped_guard(mutex, &drm->mode_config.mutex) {
+ edid = drm_edid_alloc(test_edid_hdmi_1080p_rgb_yuv_4k_yuv420_dc_max_200mhz,
+ ARRAY_SIZE(test_edid_hdmi_1080p_rgb_yuv_4k_yuv420_dc_max_200mhz));
+ if (!edid)
+ return ERR_PTR(-EINVAL);
+
+ drm_edid_connector_update(priv->connector, edid);
+ KUNIT_ASSERT_GT(test, drm_edid_connector_add_modes(priv->connector), 0);
+
+ ret = priv->connector->funcs->fill_modes(priv->connector, 4096, 4096);
+ }
+
+ return priv;
+}
+
/*
* Test that drm_bridge_get_current_state() returns the last committed
* state for an atomic bridge.
@@ -506,14 +865,442 @@ static struct kunit_suite drm_bridge_alloc_test_suite = {
.test_cases = drm_bridge_alloc_tests,
};
+/**
+ * drm_test_bridge_chain_verify_fmt - Verify bridge chain format selection
+ * @test: pointer to KUnit test object
+ * @priv: pointer to a &struct drm_bridge_chain_priv for this chain
+ * @expected: constant array of &struct fmt_tuple describing the expected
+ * input and output bus formats
+ * @num_expected: number of entries in @expected
+ *
+ * Runs the KUNIT_EXPECT clauses to verify the bridge chain format selection
+ * resulted in the expected formats. If %0 is given as a format in a
+ * &struct fmt_tuple, then it is understood to mean "any".
+ *
+ * Must be called with the modeset lock held.
+ */
+static void drm_test_bridge_chain_verify_fmt(struct kunit *test,
+ struct drm_bridge_chain_priv *priv,
+ const struct fmt_tuple *const expected,
+ const unsigned int num_expected)
+{
+ struct drm_bridge_state *bstate;
+ unsigned int i = 0;
+
+ drm_for_each_bridge_in_chain_scoped(&priv->encoder, bridge) {
+ KUNIT_ASSERT_LT(test, i, num_expected);
+
+ bstate = drm_bridge_get_current_state(bridge);
+ if (expected[i].in_fmt)
+ KUNIT_EXPECT_EQ(test, bstate->input_bus_cfg.format,
+ expected[i].in_fmt);
+ if (expected[i].out_fmt)
+ KUNIT_EXPECT_EQ(test, bstate->output_bus_cfg.format,
+ expected[i].out_fmt);
+
+ i++;
+ }
+
+ KUNIT_ASSERT_EQ_MSG(test, i, num_expected,
+ "Fewer bridges (%u) than expected (%u)\n", i, num_expected);
+}
+
+/*
+ * Test that constructs a bridge chain in which an RGB888 producer is chained to
+ * two bridges that will convert from RGB to YUV and from YUV to RGB respectively.
+ *
+ * The test requests an output color_format of RGB using the color_format property,
+ * so to satisfy this request, the bridge chain must take a detour over YUV.
+ */
+static void drm_test_bridge_rgb_yuv_rgb(struct kunit *test)
+{
+ static const struct drm_bridge_funcs *funcs[] = {
+ &rgb_producer_funcs,
+ &rgb8_to_yuv8_or_id_funcs,
+ &yuv8_to_rgb8_or_id_funcs,
+ };
+ static const struct fmt_tuple expected[] = {
+ { MEDIA_BUS_FMT_FIXED, MEDIA_BUS_FMT_RGB888_1X24 },
+ { MEDIA_BUS_FMT_RGB888_1X24, MEDIA_BUS_FMT_YUV8_1X24 },
+ { MEDIA_BUS_FMT_YUV8_1X24, MEDIA_BUS_FMT_RGB888_1X24 },
+ };
+ struct drm_connector_state *conn_state;
+ struct drm_modeset_acquire_ctx ctx;
+ struct drm_bridge_chain_priv *priv;
+ struct drm_crtc_state *crtc_state;
+ struct drm_atomic_commit *state;
+ struct drm_display_mode *mode;
+ int ret;
+
+ priv = drm_test_bridge_chain_init(test, ARRAY_SIZE(funcs), funcs);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv);
+
+ drm_modeset_acquire_init(&ctx, 0);
+
+ state = drm_kunit_helper_atomic_state_alloc(test, &priv->drm, &ctx);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, state);
+
+retry_commit:
+ conn_state = drm_atomic_get_connector_state(state, priv->connector);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, conn_state);
+
+ mode = drm_kunit_display_mode_from_cea_vic(test, &priv->drm, 16);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, mode);
+
+ conn_state->color_format = DRM_CONNECTOR_COLOR_FORMAT_RGB444;
+
+ ret = drm_atomic_set_crtc_for_connector(conn_state, priv->crtc);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state = drm_atomic_get_crtc_state(state, priv->crtc);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, crtc_state);
+
+ ret = drm_atomic_set_mode_for_crtc(crtc_state, mode);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state->enable = true;
+ crtc_state->active = true;
+
+ ret = drm_atomic_commit(state);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ drm_test_bridge_chain_verify_fmt(test, priv, expected, ARRAY_SIZE(expected));
+
+ drm_modeset_drop_locks(&ctx);
+ drm_modeset_acquire_fini(&ctx);
+}
+
+/*
+ * Test in which a bridge capable of producing RGB, YUV444 and YUV420 has to
+ * produce RGB and convert with a downstream bridge in the chain to reach the
+ * requested YUV444 color format, as no direct path exists between its YUV444
+ * and the last bridge.
+ *
+ * The rationale behind this test is to devise a scenario in which naively
+ * assuming any format the video processor can output, and the connector
+ * requests, is the right format to pick, does not work.
+ */
+static void drm_test_bridge_must_convert_to_yuv444(struct kunit *test)
+{
+ static const struct drm_bridge_funcs *funcs[] = {
+ &rgb_yuv444_yuv420_producer_funcs,
+ &rgb8_passthrough_funcs,
+ &rgb8_to_id_yuv8_or_yuv8_to_yuv422_yuv420_funcs,
+ &rgb8_yuv444_yuv422_passthrough_funcs,
+ };
+ static const struct fmt_tuple expected[] = {
+ { MEDIA_BUS_FMT_FIXED, MEDIA_BUS_FMT_RGB888_1X24 },
+ { MEDIA_BUS_FMT_RGB888_1X24, MEDIA_BUS_FMT_RGB888_1X24 },
+ { MEDIA_BUS_FMT_RGB888_1X24, MEDIA_BUS_FMT_YUV8_1X24 },
+ { MEDIA_BUS_FMT_YUV8_1X24, MEDIA_BUS_FMT_YUV8_1X24 },
+ };
+ struct drm_connector_state *conn_state;
+ struct drm_modeset_acquire_ctx ctx;
+ struct drm_bridge_chain_priv *priv;
+ struct drm_crtc_state *crtc_state;
+ struct drm_atomic_commit *state;
+ struct drm_display_mode *mode;
+ int ret;
+
+ priv = drm_test_bridge_chain_init(test, ARRAY_SIZE(funcs), funcs);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv);
+
+ drm_modeset_acquire_init(&ctx, 0);
+
+ state = drm_kunit_helper_atomic_state_alloc(test, &priv->drm, &ctx);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, state);
+
+retry_commit:
+ conn_state = drm_atomic_get_connector_state(state, priv->connector);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, conn_state);
+
+ mode = drm_kunit_display_mode_from_cea_vic(test, &priv->drm, 16);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, mode);
+
+ conn_state->color_format = DRM_CONNECTOR_COLOR_FORMAT_YCBCR444;
+
+ ret = drm_atomic_set_crtc_for_connector(conn_state, priv->crtc);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state = drm_atomic_get_crtc_state(state, priv->crtc);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, crtc_state);
+
+ ret = drm_atomic_set_mode_for_crtc(crtc_state, mode);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state->enable = true;
+ crtc_state->active = true;
+
+ ret = drm_atomic_commit(state);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ drm_test_bridge_chain_verify_fmt(test, priv, expected, ARRAY_SIZE(expected));
+
+ drm_modeset_drop_locks(&ctx);
+ drm_modeset_acquire_fini(&ctx);
+}
+
+/*
+ * Test which checks that no matter the order of bus formats returned by an
+ * HDMI bridge, RGB is preferred on DRM_CONNECTOR_COLOR_FORMAT_AUTO if it's
+ * available.
+ */
+static void drm_test_bridge_hdmi_auto_rgb(struct kunit *test)
+{
+ static const struct drm_bridge_funcs *funcs[] = {
+ &rgb_yuv444_yuv420_producer_funcs,
+ &yuv444_yuv422_rgb8_passthrough_hdmi_funcs,
+ };
+ static const struct fmt_tuple expected[] = {
+ { MEDIA_BUS_FMT_FIXED, MEDIA_BUS_FMT_RGB888_1X24 },
+ { MEDIA_BUS_FMT_RGB888_1X24, MEDIA_BUS_FMT_RGB888_1X24 },
+ };
+ struct drm_connector_state *conn_state;
+ struct drm_modeset_acquire_ctx ctx;
+ struct drm_bridge_chain_priv *priv;
+ struct drm_crtc_state *crtc_state;
+ struct drm_atomic_commit *state;
+ struct drm_display_mode *mode;
+ int ret;
+
+ priv = drm_test_bridge_chain_init(test, ARRAY_SIZE(funcs), funcs);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv);
+
+ drm_modeset_acquire_init(&ctx, 0);
+
+ state = drm_kunit_helper_atomic_state_alloc(test, &priv->drm, &ctx);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, state);
+
+retry_commit:
+ conn_state = drm_atomic_get_connector_state(state, priv->connector);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, conn_state);
+
+ mode = drm_kunit_display_mode_from_cea_vic(test, &priv->drm, 16);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, mode);
+
+ KUNIT_ASSERT_EQ(test, conn_state->color_format, DRM_CONNECTOR_COLOR_FORMAT_AUTO);
+
+ ret = drm_atomic_set_crtc_for_connector(conn_state, priv->crtc);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state = drm_atomic_get_crtc_state(state, priv->crtc);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, crtc_state);
+
+ ret = drm_atomic_set_mode_for_crtc(crtc_state, mode);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state->enable = true;
+ crtc_state->active = true;
+
+ ret = drm_atomic_commit(state);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ KUNIT_EXPECT_EQ(test, conn_state->hdmi.output_format, DRM_OUTPUT_COLOR_FORMAT_RGB444);
+
+ drm_test_bridge_chain_verify_fmt(test, priv, expected, ARRAY_SIZE(expected));
+
+ drm_modeset_drop_locks(&ctx);
+ drm_modeset_acquire_fini(&ctx);
+}
+
+/*
+ * Test which checks that DRM_CONNECTOR_COLOR_FORMAT_AUTO on non-HDMI connectors
+ * will result in the first bus format on the output to be picked.
+ */
+static void drm_test_bridge_auto_first(struct kunit *test)
+{
+ static const struct drm_bridge_funcs *funcs[] = {
+ &rgb_yuv444_yuv420_producer_funcs,
+ &yuv444_yuv422_rgb8_passthrough_funcs,
+ };
+ static const struct fmt_tuple expected[] = {
+ { MEDIA_BUS_FMT_FIXED, MEDIA_BUS_FMT_YUV8_1X24 },
+ { MEDIA_BUS_FMT_YUV8_1X24, MEDIA_BUS_FMT_YUV8_1X24 },
+ };
+ struct drm_connector_state *conn_state;
+ struct drm_modeset_acquire_ctx ctx;
+ struct drm_bridge_chain_priv *priv;
+ struct drm_crtc_state *crtc_state;
+ struct drm_atomic_commit *state;
+ struct drm_display_mode *mode;
+ int ret;
+
+ priv = drm_test_bridge_chain_init(test, ARRAY_SIZE(funcs), funcs);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv);
+
+ drm_modeset_acquire_init(&ctx, 0);
+
+ state = drm_kunit_helper_atomic_state_alloc(test, &priv->drm, &ctx);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, state);
+
+retry_commit:
+ conn_state = drm_atomic_get_connector_state(state, priv->connector);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, conn_state);
+
+ mode = drm_kunit_display_mode_from_cea_vic(test, &priv->drm, 16);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, mode);
+
+ KUNIT_ASSERT_EQ(test, conn_state->color_format, DRM_CONNECTOR_COLOR_FORMAT_AUTO);
+
+ ret = drm_atomic_set_crtc_for_connector(conn_state, priv->crtc);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state = drm_atomic_get_crtc_state(state, priv->crtc);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, crtc_state);
+
+ ret = drm_atomic_set_mode_for_crtc(crtc_state, mode);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state->enable = true;
+ crtc_state->active = true;
+
+ ret = drm_atomic_commit(state);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ drm_test_bridge_chain_verify_fmt(test, priv, expected, ARRAY_SIZE(expected));
+
+ drm_modeset_drop_locks(&ctx);
+ drm_modeset_acquire_fini(&ctx);
+}
+
+/*
+ * Test which checks that in a configuration of bridge chains where an RGB
+ * producer is hooked to a YUV444-only pass-through, the atomic commit fails as
+ * the bridge format selection cannot find a valid sequence of bus formats.
+ */
+static void drm_test_bridge_rgb_yuv_no_path(struct kunit *test)
+{
+ static const struct drm_bridge_funcs *funcs[] = {
+ &rgb_producer_funcs,
+ &yuv8_passthrough_funcs,
+ &rgb8_yuv444_yuv422_passthrough_funcs,
+ };
+ struct drm_connector_state *conn_state;
+ struct drm_modeset_acquire_ctx ctx;
+ struct drm_bridge_chain_priv *priv;
+ struct drm_crtc_state *crtc_state;
+ struct drm_atomic_commit *state;
+ struct drm_display_mode *mode;
+ int ret;
+
+ priv = drm_test_bridge_chain_init(test, ARRAY_SIZE(funcs), funcs);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, priv);
+
+ drm_modeset_acquire_init(&ctx, 0);
+
+ state = drm_kunit_helper_atomic_state_alloc(test, &priv->drm, &ctx);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, state);
+
+retry_commit:
+ conn_state = drm_atomic_get_connector_state(state, priv->connector);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, conn_state);
+
+ mode = drm_kunit_display_mode_from_cea_vic(test, &priv->drm, 16);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, mode);
+
+ ret = drm_atomic_set_crtc_for_connector(conn_state, priv->crtc);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state = drm_atomic_get_crtc_state(state, priv->crtc);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, crtc_state);
+
+ ret = drm_atomic_set_mode_for_crtc(crtc_state, mode);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state->enable = true;
+ crtc_state->active = true;
+
+ ret = drm_atomic_commit(state);
+ if (ret == -EDEADLK) {
+ drm_modeset_backoff(&ctx);
+ goto retry_commit;
+ }
+ KUNIT_EXPECT_EQ(test, ret, -ENOTSUPP);
+
+ drm_modeset_drop_locks(&ctx);
+ drm_modeset_acquire_fini(&ctx);
+}
+
+static struct kunit_case drm_bridge_bus_fmt_tests[] = {
+ KUNIT_CASE(drm_test_bridge_rgb_yuv_rgb),
+ KUNIT_CASE(drm_test_bridge_must_convert_to_yuv444),
+ KUNIT_CASE(drm_test_bridge_hdmi_auto_rgb),
+ KUNIT_CASE(drm_test_bridge_auto_first),
+ KUNIT_CASE(drm_test_bridge_rgb_yuv_no_path),
+ { }
+};
+
+static struct kunit_suite drm_bridge_bus_fmt_test_suite = {
+ .name = "drm_bridge_bus_fmt",
+ .test_cases = drm_bridge_bus_fmt_tests,
+};
+
kunit_test_suites(
&drm_bridge_get_current_state_test_suite,
&drm_bridge_helper_reset_crtc_test_suite,
&drm_bridge_alloc_test_suite,
+ &drm_bridge_bus_fmt_test_suite,
);
MODULE_AUTHOR("Maxime Ripard <mripard@kernel.org>");
MODULE_AUTHOR("Luca Ceresoli <luca.ceresoli@bootlin.com>");
+MODULE_AUTHOR("Nicolas Frattaroli <nicolas.frattaroli@collabora.com>");
MODULE_DESCRIPTION("Kunit test for drm_bridge functions");
MODULE_LICENSE("GPL");
--
2.54.0
^ permalink raw reply related
* [PATCH v15 24/28] drm/tests: hdmi: Add tests for HDMI helper's mode_valid
From: Nicolas Frattaroli @ 2026-05-22 12:32 UTC (permalink / raw)
To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
Christian König, David Airlie, Simona Vetter,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Andrzej Hajda, Neil Armstrong, Robert Foss, Laurent Pinchart,
Jonas Karlman, Jernej Skrabec, Sandy Huang, Heiko Stübner,
Andy Yan, Jani Nikula, Rodrigo Vivi, Joonas Lahtinen,
Tvrtko Ursulin, Dmitry Baryshkov, Sascha Hauer, Rob Herring,
Jonathan Corbet, Shuah Khan, Daniel Stone
Cc: kernel, amd-gfx, dri-devel, linux-kernel, linux-arm-kernel,
linux-rockchip, intel-gfx, intel-xe, linux-doc, wayland-devel,
Nicolas Frattaroli
In-Reply-To: <20260522-color-format-v15-0-21fb136c9df2@collabora.com>
Add some KUnit tests to verify that the HDMI state helper's mode_valid
implementation does not improperly reject chroma subsampled modes on the
basis of their clock rate not being satisfiable in RGB.
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Daniel Stone <daniel@fooishbar.org>
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c | 109 +++++++++++++++++++++
1 file changed, 109 insertions(+)
diff --git a/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c b/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c
index 504956cde348..eaada1af3029 100644
--- a/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c
+++ b/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c
@@ -77,6 +77,23 @@ static struct drm_display_mode *find_420_only_mode(struct drm_connector *connect
return NULL;
}
+static struct drm_display_mode *find_420_also_mode(struct drm_connector *connector)
+{
+ struct drm_device *drm = connector->dev;
+ struct drm_display_mode *mode;
+
+ mutex_lock(&drm->mode_config.mutex);
+ list_for_each_entry(mode, &connector->modes, head) {
+ if (drm_mode_is_420_also(&connector->display_info, mode)) {
+ mutex_unlock(&drm->mode_config.mutex);
+ return mode;
+ }
+ }
+ mutex_unlock(&drm->mode_config.mutex);
+
+ return NULL;
+}
+
static int set_connector_edid(struct kunit *test, struct drm_connector *connector,
const void *edid, size_t edid_len)
{
@@ -2745,11 +2762,103 @@ static void drm_test_check_mode_valid_reject_max_clock(struct kunit *test)
KUNIT_EXPECT_EQ(test, preferred->clock, 25200);
}
+/*
+ * Test that drm_hdmi_connector_mode_valid() will accept modes that require a
+ * 4:2:0 chroma subsampling, even if said mode would violate maximum clock
+ * constraints if it used RGB 4:4:4.
+ */
+static void drm_test_check_mode_valid_yuv420_only_max_clock(struct kunit *test)
+{
+ struct drm_atomic_helper_connector_hdmi_priv *priv;
+ struct drm_display_mode *dank;
+ struct drm_connector *conn;
+
+ priv = drm_kunit_helper_connector_hdmi_init_with_edid_funcs(test,
+ BIT(HDMI_COLORSPACE_RGB) |
+ BIT(HDMI_COLORSPACE_YUV420),
+ 8,
+ &dummy_connector_hdmi_funcs,
+ test_edid_hdmi_1080p_rgb_yuv_4k_yuv420_dc_max_200mhz);
+ KUNIT_ASSERT_NOT_NULL(test, priv);
+
+ conn = &priv->connector;
+ KUNIT_ASSERT_EQ(test, conn->display_info.max_tmds_clock, 200 * 1000);
+
+ dank = find_420_only_mode(conn);
+ KUNIT_ASSERT_NOT_NULL(test, dank);
+ KUNIT_EXPECT_EQ(test, dank->hdisplay, 3840);
+ KUNIT_EXPECT_EQ(test, dank->vdisplay, 2160);
+
+ /*
+ * Note: The mode's "clock" here is not accurate to the actual TMDS
+ * clock that HDMI will use for a subsampled mode. Hence, why the mode's
+ * clock is above the .max_tmds_clock of 200MHz.
+ */
+ KUNIT_EXPECT_EQ(test, dank->clock, 297000);
+}
+
+/*
+ * Test that drm_hdmi_connector_mode_valid() will reject modes that require
+ * 4:2:0 chroma subsampling, if the connector does not support 4:2:0.
+ */
+static void
+drm_test_check_mode_valid_reject_yuv420_only_connector(struct kunit *test)
+{
+ struct drm_atomic_helper_connector_hdmi_priv *priv;
+ struct drm_display_mode *dank;
+ struct drm_connector *conn;
+
+ priv = drm_kunit_helper_connector_hdmi_init_with_edid_funcs(test,
+ BIT(HDMI_COLORSPACE_RGB),
+ 8,
+ &dummy_connector_hdmi_funcs,
+ test_edid_hdmi_1080p_rgb_yuv_4k_yuv420_dc_max_200mhz);
+ KUNIT_ASSERT_NOT_NULL(test, priv);
+
+ conn = &priv->connector;
+ KUNIT_ASSERT_EQ(test, conn->display_info.max_tmds_clock, 200 * 1000);
+
+ dank = find_420_only_mode(conn);
+ KUNIT_EXPECT_NULL(test, dank);
+}
+
+/*
+ * Test that drm_hdmi_connector_mode_valid() will accept modes that allow (among
+ * other color formats) 4:2:0 chroma subsampling, even if the connector does not
+ * support 4:2:0, but the mode's clock works for RGB 4:4:4.
+ */
+static void
+drm_test_check_mode_valid_accept_yuv420_also_connector_rgb(struct kunit *test)
+{
+ struct drm_atomic_helper_connector_hdmi_priv *priv;
+ struct drm_display_mode *mode;
+ struct drm_connector *conn;
+
+ priv = drm_kunit_helper_connector_hdmi_init_with_edid_funcs(test,
+ BIT(HDMI_COLORSPACE_RGB),
+ 8,
+ &dummy_connector_hdmi_funcs,
+ test_edid_hdmi_4k_rgb_yuv420_dc_max_340mhz);
+ KUNIT_ASSERT_NOT_NULL(test, priv);
+
+ conn = &priv->connector;
+ KUNIT_ASSERT_EQ(test, conn->display_info.max_tmds_clock, 340 * 1000);
+
+ mode = find_420_also_mode(conn);
+ KUNIT_ASSERT_NOT_NULL(test, mode);
+ KUNIT_EXPECT_EQ(test, mode->hdisplay, 3840);
+ KUNIT_EXPECT_EQ(test, mode->vdisplay, 2160);
+ KUNIT_EXPECT_EQ(test, mode->clock, 297000);
+}
+
static struct kunit_case drm_atomic_helper_connector_hdmi_mode_valid_tests[] = {
KUNIT_CASE(drm_test_check_mode_valid),
KUNIT_CASE(drm_test_check_mode_valid_reject),
KUNIT_CASE(drm_test_check_mode_valid_reject_rate),
KUNIT_CASE(drm_test_check_mode_valid_reject_max_clock),
+ KUNIT_CASE(drm_test_check_mode_valid_yuv420_only_max_clock),
+ KUNIT_CASE(drm_test_check_mode_valid_reject_yuv420_only_connector),
+ KUNIT_CASE(drm_test_check_mode_valid_accept_yuv420_also_connector_rgb),
{ }
};
--
2.54.0
^ permalink raw reply related
* [PATCH v15 23/28] drm/tests: hdmi: Add tests for the color_format property
From: Nicolas Frattaroli @ 2026-05-22 12:32 UTC (permalink / raw)
To: Harry Wentland, Leo Li, Rodrigo Siqueira, Alex Deucher,
Christian König, David Airlie, Simona Vetter,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
Andrzej Hajda, Neil Armstrong, Robert Foss, Laurent Pinchart,
Jonas Karlman, Jernej Skrabec, Sandy Huang, Heiko Stübner,
Andy Yan, Jani Nikula, Rodrigo Vivi, Joonas Lahtinen,
Tvrtko Ursulin, Dmitry Baryshkov, Sascha Hauer, Rob Herring,
Jonathan Corbet, Shuah Khan, Daniel Stone
Cc: kernel, amd-gfx, dri-devel, linux-kernel, linux-arm-kernel,
linux-rockchip, intel-gfx, intel-xe, linux-doc, wayland-devel,
Nicolas Frattaroli
In-Reply-To: <20260522-color-format-v15-0-21fb136c9df2@collabora.com>
Add some KUnit tests to check the color_format property is working as
expected with the HDMI state helper.
Existing tests are extended to also test the
DRM_CONNECTOR_COLOR_FORMAT_AUTO case, in order to avoid duplicating test
cases. For the explicitly selected color format cases, parameterized
tests are added.
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Reviewed-by: Daniel Stone <daniel@fooishbar.org>
Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com>
---
drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c | 236 +++++++++++++++++++++
1 file changed, 236 insertions(+)
diff --git a/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c b/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c
index c9819c3fc635..504956cde348 100644
--- a/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c
+++ b/drivers/gpu/drm/tests/drm_hdmi_state_helper_test.c
@@ -60,6 +60,23 @@ static struct drm_display_mode *find_preferred_mode(struct drm_connector *connec
return preferred;
}
+static struct drm_display_mode *find_420_only_mode(struct drm_connector *connector)
+{
+ struct drm_device *drm = connector->dev;
+ struct drm_display_mode *mode;
+
+ mutex_lock(&drm->mode_config.mutex);
+ list_for_each_entry(mode, &connector->modes, head) {
+ if (drm_mode_is_420_only(&connector->display_info, mode)) {
+ mutex_unlock(&drm->mode_config.mutex);
+ return mode;
+ }
+ }
+ mutex_unlock(&drm->mode_config.mutex);
+
+ return NULL;
+}
+
static int set_connector_edid(struct kunit *test, struct drm_connector *connector,
const void *edid, size_t edid_len)
{
@@ -1547,6 +1564,7 @@ static void drm_test_check_max_tmds_rate_bpc_fallback_yuv420(struct kunit *test)
* RGB/10bpc
* - The chosen mode has a TMDS character rate lower than the display
* supports in YUV422/12bpc.
+ * - The HDMI connector state's color format property is unset (i.e. AUTO)
*
* Then we will prefer to keep the RGB format with a lower bpc over
* picking YUV422.
@@ -1609,6 +1627,7 @@ static void drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv422(struct kunit
conn_state = conn->state;
KUNIT_ASSERT_NOT_NULL(test, conn_state);
+ KUNIT_ASSERT_EQ(test, conn_state->color_format, DRM_CONNECTOR_COLOR_FORMAT_AUTO);
KUNIT_EXPECT_EQ(test, conn_state->hdmi.output_bpc, 10);
KUNIT_EXPECT_EQ(test, conn_state->hdmi.output_format, DRM_OUTPUT_COLOR_FORMAT_RGB444);
@@ -1626,6 +1645,7 @@ static void drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv422(struct kunit
* RGB/8bpc
* - The chosen mode has a TMDS character rate lower than the display
* supports in YUV420/12bpc.
+ * - The HDMI connector state's color format property is unset (i.e. AUTO)
*
* Then we will prefer to keep the RGB format with a lower bpc over
* picking YUV420.
@@ -1687,6 +1707,7 @@ static void drm_test_check_max_tmds_rate_bpc_fallback_ignore_yuv420(struct kunit
conn_state = conn->state;
KUNIT_ASSERT_NOT_NULL(test, conn_state);
+ KUNIT_ASSERT_EQ(test, conn_state->color_format, DRM_CONNECTOR_COLOR_FORMAT_AUTO);
KUNIT_EXPECT_EQ(test, conn_state->hdmi.output_bpc, 8);
KUNIT_EXPECT_EQ(test, conn_state->hdmi.output_format, DRM_OUTPUT_COLOR_FORMAT_RGB444);
@@ -2198,6 +2219,217 @@ static void drm_test_check_disable_connector(struct kunit *test)
drm_modeset_acquire_fini(&ctx);
}
+struct color_format_test_param {
+ enum drm_connector_color_format fmt;
+ enum drm_output_color_format expected;
+ int expected_ret;
+ const char *desc;
+};
+
+/* Test that if:
+ * - an HDMI connector supports RGB, YUV444, YUV422, and YUV420
+ * - the display supports RGB, YUV444, YUV422, and YUV420
+ * - the "color format" property is set
+ * then, for the preferred mode, for a given "color format" option:
+ * - DRM_CONNECTOR_COLOR_FORMAT_AUTO results in an output format of RGB
+ * - DRM_CONNECTOR_COLOR_FORMAT_YCBCR422 results in an output format of YUV422
+ * - DRM_CONNECTOR_COLOR_FORMAT_YCBCR420 results in an output format of YUV420
+ * - DRM_CONNECTOR_COLOR_FORMAT_YCBCR444 results in an output format of YUV444
+ * - DRM_CONNECTOR_COLOR_FORMAT_RGB results in an HDMI output format of RGB
+ */
+static void drm_test_check_hdmi_color_format(struct kunit *test)
+{
+ const struct color_format_test_param *param = test->param_value;
+ struct drm_atomic_helper_connector_hdmi_priv *priv;
+ struct drm_connector_state *conn_state;
+ struct drm_modeset_acquire_ctx ctx;
+ struct drm_crtc_state *crtc_state;
+ struct drm_atomic_commit *state;
+ struct drm_display_info *info;
+ struct drm_display_mode *preferred;
+ int ret;
+
+ priv = drm_kunit_helper_connector_hdmi_init_with_edid_funcs(test,
+ BIT(DRM_OUTPUT_COLOR_FORMAT_RGB444) |
+ BIT(DRM_OUTPUT_COLOR_FORMAT_YCBCR422) |
+ BIT(DRM_OUTPUT_COLOR_FORMAT_YCBCR420) |
+ BIT(DRM_OUTPUT_COLOR_FORMAT_YCBCR444),
+ 12,
+ &dummy_connector_hdmi_funcs,
+ test_edid_hdmi_4k_rgb_yuv420_dc_max_340mhz);
+ KUNIT_ASSERT_NOT_NULL(test, priv);
+
+ drm_modeset_acquire_init(&ctx, 0);
+
+ KUNIT_ASSERT_TRUE(test, priv->connector.ycbcr_420_allowed);
+
+ info = &priv->connector.display_info;
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, info);
+ preferred = find_preferred_mode(&priv->connector);
+ KUNIT_ASSERT_TRUE(test, drm_mode_is_420(info, preferred));
+
+ state = drm_kunit_helper_atomic_state_alloc(test, &priv->drm, &ctx);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, state);
+
+ conn_state = drm_atomic_get_connector_state(state, &priv->connector);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, conn_state);
+
+ conn_state->color_format = param->fmt;
+
+ ret = drm_atomic_set_crtc_for_connector(conn_state, priv->crtc);
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state = drm_atomic_get_crtc_state(state, priv->crtc);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, crtc_state);
+
+ ret = drm_atomic_set_mode_for_crtc(crtc_state, preferred);
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state->enable = true;
+ crtc_state->active = true;
+
+ ret = drm_atomic_check_only(state);
+ KUNIT_EXPECT_EQ(test, ret, param->expected_ret);
+ KUNIT_EXPECT_EQ(test, conn_state->hdmi.output_format, param->expected);
+
+ drm_modeset_drop_locks(&ctx);
+ drm_modeset_acquire_fini(&ctx);
+}
+
+static const struct color_format_test_param hdmi_color_format_params[] = {
+ {
+ .fmt = DRM_CONNECTOR_COLOR_FORMAT_AUTO,
+ .expected = DRM_OUTPUT_COLOR_FORMAT_RGB444,
+ .expected_ret = 0,
+ .desc = "AUTO -> RGB"
+ },
+ {
+ .fmt = DRM_CONNECTOR_COLOR_FORMAT_YCBCR422,
+ .expected = DRM_OUTPUT_COLOR_FORMAT_YCBCR422,
+ .expected_ret = 0,
+ .desc = "YCBCR422 -> YUV422"
+ },
+ {
+ .fmt = DRM_CONNECTOR_COLOR_FORMAT_YCBCR420,
+ .expected = DRM_OUTPUT_COLOR_FORMAT_YCBCR420,
+ .expected_ret = 0,
+ .desc = "YCBCR420 -> YUV420"
+ },
+ {
+ .fmt = DRM_CONNECTOR_COLOR_FORMAT_YCBCR444,
+ .expected = DRM_OUTPUT_COLOR_FORMAT_YCBCR444,
+ .expected_ret = 0,
+ .desc = "YCBCR444 -> YUV444"
+ },
+ {
+ .fmt = DRM_CONNECTOR_COLOR_FORMAT_RGB444,
+ .expected = DRM_OUTPUT_COLOR_FORMAT_RGB444,
+ .expected_ret = 0,
+ .desc = "RGB -> RGB"
+ },
+};
+
+KUNIT_ARRAY_PARAM_DESC(check_hdmi_color_format, hdmi_color_format_params, desc);
+
+/* Test that if:
+ * - the HDMI connector supports RGB, YUV422, YUV420, and YUV444
+ * - the display has a YUV420-only mode
+ * - the "color format" property is explicitly set (i.e. !AUTO)
+ * then:
+ * - color format DRM_CONNECTOR_COLOR_FORMAT_RGB444 will fail
+ * drm_atomic_check_only for the YUV420-only mode with -EINVAL
+ * - color format DRM_CONNECTOR_COLOR_FORMAT_YCBCR444 will fail
+ * drm_atomic_check_only for the YUV420-only mode with -EINVAL
+ * - color format DRM_CONNECTOR_COLOR_FORMAT_YCBCR422 will fail
+ * drm_atomic_check_only for the YUV420-only mode with -EINVAL
+ * - color format DRM_CONNECTOR_COLOR_FORMAT_YCBCR420 passes
+ * drm_atomic_check_only for the YUV420-only mode
+ */
+static void drm_test_check_hdmi_color_format_420_only(struct kunit *test)
+{
+ const struct color_format_test_param *param = test->param_value;
+ struct drm_atomic_helper_connector_hdmi_priv *priv;
+ struct drm_connector_state *conn_state;
+ struct drm_modeset_acquire_ctx ctx;
+ struct drm_crtc_state *crtc_state;
+ struct drm_atomic_commit *state;
+ struct drm_display_mode *dank;
+ int ret;
+
+ priv = drm_kunit_helper_connector_hdmi_init_with_edid_funcs(test,
+ BIT(DRM_OUTPUT_COLOR_FORMAT_RGB444) |
+ BIT(DRM_OUTPUT_COLOR_FORMAT_YCBCR422) |
+ BIT(DRM_OUTPUT_COLOR_FORMAT_YCBCR420) |
+ BIT(DRM_OUTPUT_COLOR_FORMAT_YCBCR444),
+ 12,
+ &dummy_connector_hdmi_funcs,
+ test_edid_hdmi_1080p_rgb_yuv_4k_yuv420_dc_max_200mhz);
+ KUNIT_ASSERT_NOT_NULL(test, priv);
+
+ drm_modeset_acquire_init(&ctx, 0);
+
+ dank = find_420_only_mode(&priv->connector);
+ KUNIT_ASSERT_NOT_NULL(test, dank);
+
+ state = drm_kunit_helper_atomic_state_alloc(test, &priv->drm, &ctx);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, state);
+
+ conn_state = drm_atomic_get_connector_state(state, &priv->connector);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, conn_state);
+
+ conn_state->color_format = param->fmt;
+
+ ret = drm_atomic_set_crtc_for_connector(conn_state, priv->crtc);
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state = drm_atomic_get_crtc_state(state, priv->crtc);
+ KUNIT_ASSERT_NOT_ERR_OR_NULL(test, crtc_state);
+
+ ret = drm_atomic_set_mode_for_crtc(crtc_state, dank);
+ KUNIT_ASSERT_EQ(test, ret, 0);
+
+ crtc_state->enable = true;
+ crtc_state->active = true;
+
+ ret = drm_atomic_check_only(state);
+ KUNIT_EXPECT_EQ(test, ret, param->expected_ret);
+ if (!param->expected_ret)
+ KUNIT_EXPECT_EQ(test, conn_state->hdmi.output_format, param->expected);
+
+ drm_modeset_drop_locks(&ctx);
+ drm_modeset_acquire_fini(&ctx);
+};
+
+static const struct color_format_test_param hdmi_color_format_420_only_params[] = {
+ {
+ .fmt = DRM_CONNECTOR_COLOR_FORMAT_RGB444,
+ .expected = DRM_OUTPUT_COLOR_FORMAT_RGB444,
+ .expected_ret = -EINVAL,
+ .desc = "RGB should fail"
+ },
+ {
+ .fmt = DRM_CONNECTOR_COLOR_FORMAT_YCBCR444,
+ .expected = DRM_OUTPUT_COLOR_FORMAT_YCBCR444,
+ .expected_ret = -EINVAL,
+ .desc = "YUV444 should fail"
+ },
+ {
+ .fmt = DRM_CONNECTOR_COLOR_FORMAT_YCBCR422,
+ .expected = DRM_OUTPUT_COLOR_FORMAT_YCBCR422,
+ .expected_ret = -EINVAL,
+ .desc = "YUV422 should fail"
+ },
+ {
+ .fmt = DRM_CONNECTOR_COLOR_FORMAT_YCBCR420,
+ .expected = DRM_OUTPUT_COLOR_FORMAT_YCBCR420,
+ .expected_ret = 0,
+ .desc = "YUV420 should work"
+ },
+};
+
+KUNIT_ARRAY_PARAM_DESC(check_hdmi_color_format_420_only,
+ hdmi_color_format_420_only_params, desc);
+
static struct kunit_case drm_atomic_helper_connector_hdmi_check_tests[] = {
KUNIT_CASE(drm_test_check_broadcast_rgb_auto_cea_mode),
KUNIT_CASE(drm_test_check_broadcast_rgb_auto_cea_mode_vic_1),
@@ -2227,6 +2459,10 @@ static struct kunit_case drm_atomic_helper_connector_hdmi_check_tests[] = {
KUNIT_CASE(drm_test_check_tmds_char_rate_rgb_8bpc),
KUNIT_CASE(drm_test_check_tmds_char_rate_rgb_10bpc),
KUNIT_CASE(drm_test_check_tmds_char_rate_rgb_12bpc),
+ KUNIT_CASE_PARAM(drm_test_check_hdmi_color_format,
+ check_hdmi_color_format_gen_params),
+ KUNIT_CASE_PARAM(drm_test_check_hdmi_color_format_420_only,
+ check_hdmi_color_format_420_only_gen_params),
/*
* TODO: We should have tests to check that a change in the
* format triggers a CRTC mode change just like we do for the
--
2.54.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox