* [PATCH V13 0/6] riscv: mm: Add soft-dirty and uffd-wp support
@ 2025-09-17 3:36 Chunyan Zhang
2025-09-17 3:36 ` [PATCH V13 1/6] mm: softdirty: Add pgtable_supports_soft_dirty() Chunyan Zhang
` (5 more replies)
0 siblings, 6 replies; 10+ messages in thread
From: Chunyan Zhang @ 2025-09-17 3:36 UTC (permalink / raw)
To: linux-riscv, linux-fsdevel, linux-mm, linux-kernel
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Conor Dooley, Deepak Gupta, Ved Shanbhogue, Alexander Viro,
Christian Brauner, Jan Kara, Andrew Morton, Peter Xu,
Arnd Bergmann, David Hildenbrand, Lorenzo Stoakes,
Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Axel Rasmussen, Yuanchu Xie,
Chunyan Zhang
This patchset adds support for the Svrsw60t59b [1] extension, which is now ratified,
and adds soft-dirty and userfaultfd write-protect tracking for RISC-V.
Patches 1 and 2 add macros that allow architectures to define their own checks
for whether the soft-dirty / uffd-wp PTE bits are available; for RISC-V this means
checking whether the device the kernel is running on supports the Svrsw60t59b
extension.
Patches 1 and 2 also remove "ifdef CONFIG_MEM_SOFT_DIRTY",
"ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP", and
"ifdef CONFIG_PTE_MARKER_UFFD_WP" in favor of runtime checks; if these are not
overridden by the architecture, no change in behavior is expected.
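As a reference, a minimal sketch of the shape of the new interface follows. The
generic fallbacks mirror what patches 1-2 add; the RISC-V soft-dirty override
mirrors patches 4-5 and is shown only as an illustration (see the patches for
the actual definitions):

/* Generic fallbacks (include/linux/pgtable.h, include/asm-generic/pgtable_uffd.h) */
#ifndef pgtable_supports_soft_dirty
#define pgtable_supports_soft_dirty()	IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)
#endif

#ifndef pgtable_supports_uffd_wp
#define pgtable_supports_uffd_wp()	IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP)
#endif

/* RISC-V override: the PTE bits only exist when the CPU implements Svrsw60t59b */
#define pgtable_supports_soft_dirty() \
	(IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && \
	 riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B))

Generic code then tests these macros at runtime instead of compiling the
soft-dirty / uffd-wp paths away, e.g. "if (!pgtable_supports_soft_dirty()) return;".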
This patchset has been tested with the kselftest mm suite, in which soft-dirty,
madv_populate, test_unmerge_uffd_wp, and uffd-unit-tests run and pass,
and no regressions are observed in any of the other tests.
This patchset applies on top of the latest mm-new branch.
[1] https://github.com/riscv-non-isa/riscv-iommu/pull/543
V13:
- Rebase on mm-new branch;
- Fixed build errors;
- Add more precise descriptions to the commit messages of patches 1-2;
- Replace '__always_inline' with 'inline' for uffd_supports_wp_marker();
- Add description to the extensions dt-binding in patch 6.
V12: https://lore.kernel.org/all/20250915101343.1449546-1-zhangchunyan@iscas.ac.cn/
- Rename the macro API to pgtable_supports_soft_dirty/uffd_wp();
- Add changes for setting VM_SOFTDIRTY flags conditionally;
- Drop changes to show_smap_vma_flags();
- Drop CONFIG_MEM_SOFT_DIRTY compile condition of clear_soft_dirty() and clear_soft_dirty_pmd();
- Fix typos;
- Add uffd_supports_wp_marker() and drop some ifdef CONFIG_PTE_MARKER_UFFD_WP.
V11: https://lore.kernel.org/all/20250911095602.1130290-1-zhangchunyan@iscas.ac.cn/
- Rename the macro API to pgtable_*_supported() since we also have PMD support;
- Change the default implementations of two macros, make CONFIG_MEM_SOFT_DIRTY or
CONFIG_HAVE_ARCH_USERFAULTFD_WP part of the macros;
- Correct the order of insertion of RISCV_ISA_EXT_SVRSW60T59B;
- Rephrase some comments.
V10: https://lore.kernel.org/all/20250909095611.803898-1-zhangchunyan@iscas.ac.cn/
- Fixed the issue reported by the kernel test robot <lkp@intel.com>.
V9: https://lore.kernel.org/all/20250905103651.489197-1-zhangchunyan@iscas.ac.cn/
- Add pte_soft_dirty/uffd_wp_available() API to allow dynamically checking
if the PTE bit is available for the platform on which the kernel is running.
V8: https://lore.kernel.org/all/20250619065232.1786470-1-zhangchunyan@iscas.ac.cn/
- Rebase on v6.16-rc1;
- Add dependencies to MMU && 64BIT for RISCV_ISA_SVRSW60T59B;
- Use 'Svrsw60t59b' instead of 'SVRSW60T59B' in Kconfig help paragraph;
- Add Alex's Reviewed-by tag in patch 1.
V7: https://lore.kernel.org/all/20250409095320.224100-1-zhangchunyan@iscas.ac.cn/
- Add Svrsw60t59b [1] extension support;
- Have soft-dirty and uffd-wp depend on the Svrsw60t59b extension to
avoid crashes on hardware which doesn't have this extension.
V6: https://lore.kernel.org/all/20250408084301.68186-1-zhangchunyan@iscas.ac.cn/
- Changed to use bits 59-60, which are made available by the Svrsw60t59b extension,
for soft-dirty and userfaultfd write-protect tracking.
V5: https://lore.kernel.org/all/20241113095833.1805746-1-zhangchunyan@iscas.ac.cn/
- Fixed typos and corrected some words in Kconfig and commit message;
- Removed pte_wrprotect() from pte_swp_mkuffd_wp(), which was a copy-paste
error;
- Added Alex's Reviewed-by tag in patch 2.
V4: https://lore.kernel.org/all/20240830011101.3189522-1-zhangchunyan@iscas.ac.cn/
- Added bit(4) descriptions into "Format of swap PTE".
V3: https://lore.kernel.org/all/20240805095243.44809-1-zhangchunyan@iscas.ac.cn/
- Fixed the issue reported by the kernel test robot <lkp@intel.com>.
V2: https://lore.kernel.org/all/20240731040444.3384790-1-zhangchunyan@iscas.ac.cn/
- Add uffd-wp support;
- Make soft-dirty, uffd-wp, and devmap mutually exclusive since they all use
the same PTE bit;
- Add test results of CRIU in the cover-letter.
Chunyan Zhang (6):
mm: softdirty: Add pgtable_supports_soft_dirty()
mm: userfaultfd: Add pgtable_supports_uffd_wp()
riscv: Add RISC-V Svrsw60t59b extension support
riscv: mm: Add soft-dirty page tracking support
riscv: mm: Add userfaultfd write-protect support
dt-bindings: riscv: Add Svrsw60t59b extension description
.../devicetree/bindings/riscv/extensions.yaml | 6 +
arch/riscv/Kconfig | 16 ++
arch/riscv/include/asm/hwcap.h | 1 +
arch/riscv/include/asm/pgtable-bits.h | 37 +++++
arch/riscv/include/asm/pgtable.h | 143 +++++++++++++++++-
arch/riscv/kernel/cpufeature.c | 1 +
fs/proc/task_mmu.c | 15 +-
fs/userfaultfd.c | 22 +--
include/asm-generic/pgtable_uffd.h | 17 +++
include/linux/mm.h | 3 +
include/linux/mm_inline.h | 12 +-
include/linux/pgtable.h | 12 ++
include/linux/userfaultfd_k.h | 114 ++++++++------
mm/debug_vm_pgtable.c | 10 +-
mm/huge_memory.c | 13 +-
mm/internal.h | 2 +-
mm/memory.c | 6 +-
mm/mmap.c | 6 +-
mm/mremap.c | 13 +-
mm/userfaultfd.c | 10 +-
mm/vma.c | 6 +-
mm/vma_exec.c | 5 +-
22 files changed, 365 insertions(+), 105 deletions(-)
--
2.34.1
* [PATCH V13 1/6] mm: softdirty: Add pgtable_supports_soft_dirty()
2025-09-17 3:36 [PATCH V13 0/6] riscv: mm: Add soft-dirty and uffd-wp support Chunyan Zhang
@ 2025-09-17 3:36 ` Chunyan Zhang
2025-09-17 3:36 ` [PATCH V13 2/6] mm: userfaultfd: Add pgtable_supports_uffd_wp() Chunyan Zhang
` (4 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Chunyan Zhang @ 2025-09-17 3:36 UTC (permalink / raw)
To: linux-riscv, linux-fsdevel, linux-mm, linux-kernel
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Conor Dooley, Deepak Gupta, Ved Shanbhogue, Alexander Viro,
Christian Brauner, Jan Kara, Andrew Morton, Peter Xu,
Arnd Bergmann, David Hildenbrand, Lorenzo Stoakes,
Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Axel Rasmussen, Yuanchu Xie,
Chunyan Zhang
Some platforms can customize the PTE/PMD entry soft-dirty bit, making it
unavailable even if the architecture provides the resource.
Add an API with which architectures can define their own implementations
to detect whether the soft-dirty bit is available on the device the kernel
is running on.
This patch removes "ifdef CONFIG_MEM_SOFT_DIRTY" in favor of
pgtable_supports_soft_dirty() checks, which default to
IS_ENABLED(CONFIG_MEM_SOFT_DIRTY); if not overridden by
the architecture, no change in behavior is expected.
We make sure to never set VM_SOFTDIRTY if !pgtable_supports_soft_dirty(),
so we will never run into VM_SOFTDIRTY checks.
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
fs/proc/task_mmu.c | 15 ++++++---------
include/linux/mm.h | 3 +++
include/linux/pgtable.h | 12 ++++++++++++
mm/debug_vm_pgtable.c | 10 +++++-----
mm/huge_memory.c | 13 +++++++------
mm/internal.h | 2 +-
mm/mmap.c | 6 ++++--
mm/mremap.c | 13 +++++++------
mm/userfaultfd.c | 10 ++++------
mm/vma.c | 6 ++++--
mm/vma_exec.c | 5 ++++-
11 files changed, 57 insertions(+), 38 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index ced01cf3c5ab..18c55e21bd16 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1582,8 +1582,6 @@ struct clear_refs_private {
enum clear_refs_types type;
};
-#ifdef CONFIG_MEM_SOFT_DIRTY
-
static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, pte_t pte)
{
struct folio *folio;
@@ -1603,6 +1601,8 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr,
static inline void clear_soft_dirty(struct vm_area_struct *vma,
unsigned long addr, pte_t *pte)
{
+ if (!pgtable_supports_soft_dirty())
+ return;
/*
* The soft-dirty tracker uses #PF-s to catch writes
* to pages, so write-protect the pte as well. See the
@@ -1625,19 +1625,16 @@ static inline void clear_soft_dirty(struct vm_area_struct *vma,
set_pte_at(vma->vm_mm, addr, pte, ptent);
}
}
-#else
-static inline void clear_soft_dirty(struct vm_area_struct *vma,
- unsigned long addr, pte_t *pte)
-{
-}
-#endif
-#if defined(CONFIG_MEM_SOFT_DIRTY) && defined(CONFIG_TRANSPARENT_HUGEPAGE)
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE)
static inline void clear_soft_dirty_pmd(struct vm_area_struct *vma,
unsigned long addr, pmd_t *pmdp)
{
pmd_t old, pmd = *pmdp;
+ if (!pgtable_supports_soft_dirty())
+ return;
+
if (pmd_present(pmd)) {
/* See comment in change_huge_pmd() */
old = pmdp_invalidate(vma, addr, pmdp);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d004fb7d805d..c5bc449a65d5 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -798,6 +798,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
static inline void vm_flags_init(struct vm_area_struct *vma,
vm_flags_t flags)
{
+ VM_WARN_ON_ONCE(!pgtable_supports_soft_dirty() && (flags & VM_SOFTDIRTY));
ACCESS_PRIVATE(vma, __vm_flags) = flags;
}
@@ -816,6 +817,7 @@ static inline void vm_flags_reset(struct vm_area_struct *vma,
static inline void vm_flags_reset_once(struct vm_area_struct *vma,
vm_flags_t flags)
{
+ VM_WARN_ON_ONCE(!pgtable_supports_soft_dirty() && (flags & VM_SOFTDIRTY));
vma_assert_write_locked(vma);
WRITE_ONCE(ACCESS_PRIVATE(vma, __vm_flags), flags);
}
@@ -823,6 +825,7 @@ static inline void vm_flags_reset_once(struct vm_area_struct *vma,
static inline void vm_flags_set(struct vm_area_struct *vma,
vm_flags_t flags)
{
+ VM_WARN_ON_ONCE(!pgtable_supports_soft_dirty() && (flags & VM_SOFTDIRTY));
vma_start_write(vma);
ACCESS_PRIVATE(vma, __vm_flags) |= flags;
}
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 32e8457ad535..b13b6f42be3c 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1553,6 +1553,18 @@ static inline pgprot_t pgprot_modify(pgprot_t oldprot, pgprot_t newprot)
#define arch_start_context_switch(prev) do {} while (0)
#endif
+/*
+ * Some platforms can customize the PTE soft-dirty bit making it unavailable
+ * even if the architecture provides the resource.
+ * Adding this API allows architectures to add their own checks for the
+ * devices on which the kernel is running.
+ * Note: When overriding it, please make sure the CONFIG_MEM_SOFT_DIRTY
+ * is part of this macro.
+ */
+#ifndef pgtable_supports_soft_dirty
+#define pgtable_supports_soft_dirty() IS_ENABLED(CONFIG_MEM_SOFT_DIRTY)
+#endif
+
#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
#ifndef CONFIG_ARCH_ENABLE_THP_MIGRATION
static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd)
diff --git a/mm/debug_vm_pgtable.c b/mm/debug_vm_pgtable.c
index 830107b6dd08..6a5b226bda28 100644
--- a/mm/debug_vm_pgtable.c
+++ b/mm/debug_vm_pgtable.c
@@ -690,7 +690,7 @@ static void __init pte_soft_dirty_tests(struct pgtable_debug_args *args)
{
pte_t pte = pfn_pte(args->fixed_pte_pfn, args->page_prot);
- if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
+ if (!pgtable_supports_soft_dirty())
return;
pr_debug("Validating PTE soft dirty\n");
@@ -702,7 +702,7 @@ static void __init pte_swap_soft_dirty_tests(struct pgtable_debug_args *args)
{
pte_t pte;
- if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
+ if (!pgtable_supports_soft_dirty())
return;
pr_debug("Validating PTE swap soft dirty\n");
@@ -718,7 +718,7 @@ static void __init pmd_soft_dirty_tests(struct pgtable_debug_args *args)
{
pmd_t pmd;
- if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
+ if (!pgtable_supports_soft_dirty())
return;
if (!has_transparent_hugepage())
@@ -734,8 +734,8 @@ static void __init pmd_swap_soft_dirty_tests(struct pgtable_debug_args *args)
{
pmd_t pmd;
- if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) ||
- !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION))
+ if (!pgtable_supports_soft_dirty() ||
+ !IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION))
return;
if (!has_transparent_hugepage())
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 5acca24bbabb..85dca384375e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2263,12 +2263,13 @@ static inline int pmd_move_must_withdraw(spinlock_t *new_pmd_ptl,
static pmd_t move_soft_dirty_pmd(pmd_t pmd)
{
-#ifdef CONFIG_MEM_SOFT_DIRTY
- if (unlikely(is_pmd_migration_entry(pmd)))
- pmd = pmd_swp_mksoft_dirty(pmd);
- else if (pmd_present(pmd))
- pmd = pmd_mksoft_dirty(pmd);
-#endif
+ if (pgtable_supports_soft_dirty()) {
+ if (unlikely(is_pmd_migration_entry(pmd)))
+ pmd = pmd_swp_mksoft_dirty(pmd);
+ else if (pmd_present(pmd))
+ pmd = pmd_mksoft_dirty(pmd);
+ }
+
return pmd;
}
diff --git a/mm/internal.h b/mm/internal.h
index 63e3ec8d63be..6a4219cdff58 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1530,7 +1530,7 @@ static inline bool vma_soft_dirty_enabled(struct vm_area_struct *vma)
* VM_SOFTDIRTY is defined as 0x0, then !(vm_flags & VM_SOFTDIRTY)
* will be constantly true.
*/
- if (!IS_ENABLED(CONFIG_MEM_SOFT_DIRTY))
+ if (!pgtable_supports_soft_dirty())
return false;
/*
diff --git a/mm/mmap.c b/mm/mmap.c
index 266711d1c91c..4ce7d4667766 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1451,8 +1451,10 @@ static struct vm_area_struct *__install_special_mapping(
return ERR_PTR(-ENOMEM);
vma_set_range(vma, addr, addr + len, 0);
- vm_flags_init(vma, (vm_flags | mm->def_flags |
- VM_DONTEXPAND | VM_SOFTDIRTY) & ~VM_LOCKED_MASK);
+ vm_flags |= mm->def_flags | VM_DONTEXPAND;
+ if (pgtable_supports_soft_dirty())
+ vm_flags |= VM_SOFTDIRTY;
+ vm_flags_init(vma, vm_flags & ~VM_LOCKED_MASK);
vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
vma->vm_ops = ops;
diff --git a/mm/mremap.c b/mm/mremap.c
index 35de0a7b910e..35a135cd149a 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -162,12 +162,13 @@ static pte_t move_soft_dirty_pte(pte_t pte)
* Set soft dirty bit so we can notice
* in userspace the ptes were moved.
*/
-#ifdef CONFIG_MEM_SOFT_DIRTY
- if (pte_present(pte))
- pte = pte_mksoft_dirty(pte);
- else if (is_swap_pte(pte))
- pte = pte_swp_mksoft_dirty(pte);
-#endif
+ if (pgtable_supports_soft_dirty()) {
+ if (pte_present(pte))
+ pte = pte_mksoft_dirty(pte);
+ else if (is_swap_pte(pte))
+ pte = pte_swp_mksoft_dirty(pte);
+ }
+
return pte;
}
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index af61b95c89e4..ea8ce18483fe 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1116,9 +1116,8 @@ static long move_present_ptes(struct mm_struct *mm,
orig_dst_pte = folio_mk_pte(src_folio, dst_vma->vm_page_prot);
/* Set soft dirty bit so userspace can notice the pte was moved */
-#ifdef CONFIG_MEM_SOFT_DIRTY
- orig_dst_pte = pte_mksoft_dirty(orig_dst_pte);
-#endif
+ if (pgtable_supports_soft_dirty())
+ orig_dst_pte = pte_mksoft_dirty(orig_dst_pte);
if (pte_dirty(orig_src_pte))
orig_dst_pte = pte_mkdirty(orig_dst_pte);
orig_dst_pte = pte_mkwrite(orig_dst_pte, dst_vma);
@@ -1205,9 +1204,8 @@ static int move_swap_pte(struct mm_struct *mm, struct vm_area_struct *dst_vma,
}
orig_src_pte = ptep_get_and_clear(mm, src_addr, src_pte);
-#ifdef CONFIG_MEM_SOFT_DIRTY
- orig_src_pte = pte_swp_mksoft_dirty(orig_src_pte);
-#endif
+ if (pgtable_supports_soft_dirty())
+ orig_src_pte = pte_swp_mksoft_dirty(orig_src_pte);
set_pte_at(mm, dst_addr, dst_pte, orig_src_pte);
double_pt_unlock(dst_ptl, src_ptl);
diff --git a/mm/vma.c b/mm/vma.c
index 1be297f7bb00..674b7a7c6132 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -2568,7 +2568,8 @@ static void __mmap_complete(struct mmap_state *map, struct vm_area_struct *vma)
* then new mapped in-place (which must be aimed as
* a completely new data area).
*/
- vm_flags_set(vma, VM_SOFTDIRTY);
+ if (pgtable_supports_soft_dirty())
+ vm_flags_set(vma, VM_SOFTDIRTY);
vma_set_page_prot(vma);
}
@@ -2843,7 +2844,8 @@ int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *vma,
mm->data_vm += len >> PAGE_SHIFT;
if (vm_flags & VM_LOCKED)
mm->locked_vm += (len >> PAGE_SHIFT);
- vm_flags_set(vma, VM_SOFTDIRTY);
+ if (pgtable_supports_soft_dirty())
+ vm_flags_set(vma, VM_SOFTDIRTY);
return 0;
mas_store_fail:
diff --git a/mm/vma_exec.c b/mm/vma_exec.c
index 922ee51747a6..a822fb73f4e2 100644
--- a/mm/vma_exec.c
+++ b/mm/vma_exec.c
@@ -107,6 +107,7 @@ int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift)
int create_init_stack_vma(struct mm_struct *mm, struct vm_area_struct **vmap,
unsigned long *top_mem_p)
{
+ unsigned long flags = VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP;
int err;
struct vm_area_struct *vma = vm_area_alloc(mm);
@@ -137,7 +138,9 @@ int create_init_stack_vma(struct mm_struct *mm, struct vm_area_struct **vmap,
BUILD_BUG_ON(VM_STACK_FLAGS & VM_STACK_INCOMPLETE_SETUP);
vma->vm_end = STACK_TOP_MAX;
vma->vm_start = vma->vm_end - PAGE_SIZE;
- vm_flags_init(vma, VM_SOFTDIRTY | VM_STACK_FLAGS | VM_STACK_INCOMPLETE_SETUP);
+ if (pgtable_supports_soft_dirty())
+ flags |= VM_SOFTDIRTY;
+ vm_flags_init(vma, flags);
vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
err = insert_vm_struct(mm, vma);
--
2.34.1
* [PATCH V13 2/6] mm: userfaultfd: Add pgtable_supports_uffd_wp()
2025-09-17 3:36 [PATCH V13 0/6] riscv: mm: Add soft-dirty and uffd-wp support Chunyan Zhang
2025-09-17 3:36 ` [PATCH V13 1/6] mm: softdirty: Add pgtable_supports_soft_dirty() Chunyan Zhang
@ 2025-09-17 3:36 ` Chunyan Zhang
2025-09-17 7:25 ` David Hildenbrand
2025-09-17 3:37 ` [PATCH V13 3/6] riscv: Add RISC-V Svrsw60t59b extension support Chunyan Zhang
` (3 subsequent siblings)
5 siblings, 1 reply; 10+ messages in thread
From: Chunyan Zhang @ 2025-09-17 3:36 UTC (permalink / raw)
To: linux-riscv, linux-fsdevel, linux-mm, linux-kernel
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Conor Dooley, Deepak Gupta, Ved Shanbhogue, Alexander Viro,
Christian Brauner, Jan Kara, Andrew Morton, Peter Xu,
Arnd Bergmann, David Hildenbrand, Lorenzo Stoakes,
Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Axel Rasmussen, Yuanchu Xie,
Chunyan Zhang
Some platforms can customize the PTE/PMD entry uffd-wp bit, making it
unavailable even if the architecture provides the resource.
This patch adds a macro API that allows architectures to define their own
implementations to check whether the uffd-wp bit is available on the device
the kernel is running on.
This patch also removes "ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP" and
"ifdef CONFIG_PTE_MARKER_UFFD_WP" in favor of pgtable_supports_uffd_wp()
and uffd_supports_wp_marker() checks respectively, which default to
IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) and
"IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) && IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP)";
if not overridden by the architecture, no change in behavior is expected.
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
fs/userfaultfd.c | 22 +++---
include/asm-generic/pgtable_uffd.h | 17 +++++
include/linux/mm_inline.h | 12 ++-
include/linux/userfaultfd_k.h | 114 ++++++++++++++++-------------
mm/memory.c | 6 +-
5 files changed, 106 insertions(+), 65 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 54c6cc7fe9c6..e41736ffa202 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1270,9 +1270,9 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MISSING)
vm_flags |= VM_UFFD_MISSING;
if (uffdio_register.mode & UFFDIO_REGISTER_MODE_WP) {
-#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
- goto out;
-#endif
+ if (!pgtable_supports_uffd_wp())
+ goto out;
+
vm_flags |= VM_UFFD_WP;
}
if (uffdio_register.mode & UFFDIO_REGISTER_MODE_MINOR) {
@@ -1980,14 +1980,14 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx,
uffdio_api.features &=
~(UFFD_FEATURE_MINOR_HUGETLBFS | UFFD_FEATURE_MINOR_SHMEM);
#endif
-#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
- uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP;
-#endif
-#ifndef CONFIG_PTE_MARKER_UFFD_WP
- uffdio_api.features &= ~UFFD_FEATURE_WP_HUGETLBFS_SHMEM;
- uffdio_api.features &= ~UFFD_FEATURE_WP_UNPOPULATED;
- uffdio_api.features &= ~UFFD_FEATURE_WP_ASYNC;
-#endif
+ if (!pgtable_supports_uffd_wp())
+ uffdio_api.features &= ~UFFD_FEATURE_PAGEFAULT_FLAG_WP;
+
+ if (!uffd_supports_wp_marker()) {
+ uffdio_api.features &= ~UFFD_FEATURE_WP_HUGETLBFS_SHMEM;
+ uffdio_api.features &= ~UFFD_FEATURE_WP_UNPOPULATED;
+ uffdio_api.features &= ~UFFD_FEATURE_WP_ASYNC;
+ }
ret = -EINVAL;
if (features & ~uffdio_api.features)
diff --git a/include/asm-generic/pgtable_uffd.h b/include/asm-generic/pgtable_uffd.h
index 828966d4c281..0d85791efdf7 100644
--- a/include/asm-generic/pgtable_uffd.h
+++ b/include/asm-generic/pgtable_uffd.h
@@ -1,6 +1,23 @@
#ifndef _ASM_GENERIC_PGTABLE_UFFD_H
#define _ASM_GENERIC_PGTABLE_UFFD_H
+/*
+ * Some platforms can customize the uffd-wp bit, making it unavailable
+ * even if the architecture provides the resource.
+ * Adding this API allows architectures to add their own checks for the
+ * devices on which the kernel is running.
+ * Note: When overriding it, please make sure the
+ * CONFIG_HAVE_ARCH_USERFAULTFD_WP is part of this macro.
+ */
+#ifndef pgtable_supports_uffd_wp
+#define pgtable_supports_uffd_wp() IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP)
+#endif
+
+static inline bool uffd_supports_wp_marker(void)
+{
+ return pgtable_supports_uffd_wp() && IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP);
+}
+
#ifndef CONFIG_HAVE_ARCH_USERFAULTFD_WP
static __always_inline int pte_uffd_wp(pte_t pte)
{
diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index d6c1011b38f2..c69162812ba6 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -553,7 +553,6 @@ static inline pte_marker copy_pte_marker(
return dstm;
}
-#endif
/*
* If this pte is wr-protected by uffd-wp in any form, arm the special pte to
@@ -571,9 +570,15 @@ static inline bool
pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
pte_t *pte, pte_t pteval)
{
-#ifdef CONFIG_PTE_MARKER_UFFD_WP
bool arm_uffd_pte = false;
+ /*
+ * Some platforms can customize the PTE uffd-wp bit, making it unavailable
+ * even if the architecture allows providing the PTE resource.
+ */
+ if (!uffd_supports_wp_marker())
+ return false;
+
/* The current status of the pte should be "cleared" before calling */
WARN_ON_ONCE(!pte_none(ptep_get(pte)));
@@ -602,7 +607,7 @@ pte_install_uffd_wp_if_needed(struct vm_area_struct *vma, unsigned long addr,
make_pte_marker(PTE_MARKER_UFFD_WP));
return true;
}
-#endif
+
return false;
}
@@ -616,5 +621,6 @@ static inline bool vma_has_recency(const struct vm_area_struct *vma)
return true;
}
+#endif
#endif
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index c0e716aec26a..4ccc79b5731e 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -228,15 +228,14 @@ static inline bool vma_can_userfault(struct vm_area_struct *vma,
if (wp_async && (vm_flags == VM_UFFD_WP))
return true;
-#ifndef CONFIG_PTE_MARKER_UFFD_WP
/*
* If user requested uffd-wp but not enabled pte markers for
* uffd-wp, then shmem & hugetlbfs are not supported but only
* anonymous.
*/
- if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma))
+ if (!uffd_supports_wp_marker() && (vm_flags & VM_UFFD_WP) &&
+ !vma_is_anonymous(vma))
return false;
-#endif
/* By default, allow any of anon|shmem|hugetlb */
return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
@@ -291,6 +290,67 @@ void userfaultfd_release_new(struct userfaultfd_ctx *ctx);
void userfaultfd_release_all(struct mm_struct *mm,
struct userfaultfd_ctx *ctx);
+static inline bool userfaultfd_wp_use_markers(struct vm_area_struct *vma)
+{
+ /* Only wr-protect mode uses pte markers */
+ if (!userfaultfd_wp(vma))
+ return false;
+
+ /* File-based uffd-wp always need markers */
+ if (!vma_is_anonymous(vma))
+ return true;
+
+ /*
+ * Anonymous uffd-wp only needs the markers if WP_UNPOPULATED
+ * enabled (to apply markers on zero pages).
+ */
+ return userfaultfd_wp_unpopulated(vma);
+}
+
+static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry)
+{
+ if (!uffd_supports_wp_marker())
+ return false;
+
+ return is_pte_marker_entry(entry) &&
+ (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
+}
+
+static inline bool pte_marker_uffd_wp(pte_t pte)
+{
+ swp_entry_t entry;
+
+ if (!uffd_supports_wp_marker())
+ return false;
+
+ if (!is_swap_pte(pte))
+ return false;
+
+ entry = pte_to_swp_entry(pte);
+
+ return pte_marker_entry_uffd_wp(entry);
+}
+
+/*
+ * Returns true if this is a swap pte and was uffd-wp wr-protected in either
+ * forms (pte marker or a normal swap pte), false otherwise.
+ */
+static inline bool pte_swp_uffd_wp_any(pte_t pte)
+{
+ if (!uffd_supports_wp_marker())
+ return false;
+
+ if (!is_swap_pte(pte))
+ return false;
+
+ if (pte_swp_uffd_wp(pte))
+ return true;
+
+ if (pte_marker_uffd_wp(pte))
+ return true;
+
+ return false;
+}
#else /* CONFIG_USERFAULTFD */
/* mm helpers */
@@ -415,68 +475,24 @@ static inline bool vma_has_uffd_without_event_remap(struct vm_area_struct *vma)
return false;
}
-#endif /* CONFIG_USERFAULTFD */
-
static inline bool userfaultfd_wp_use_markers(struct vm_area_struct *vma)
{
- /* Only wr-protect mode uses pte markers */
- if (!userfaultfd_wp(vma))
return false;
-
- /* File-based uffd-wp always need markers */
- if (!vma_is_anonymous(vma))
- return true;
-
- /*
- * Anonymous uffd-wp only needs the markers if WP_UNPOPULATED
- * enabled (to apply markers on zero pages).
- */
- return userfaultfd_wp_unpopulated(vma);
}
static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry)
{
-#ifdef CONFIG_PTE_MARKER_UFFD_WP
- return is_pte_marker_entry(entry) &&
- (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
-#else
- return false;
-#endif
+ return false;
}
static inline bool pte_marker_uffd_wp(pte_t pte)
{
-#ifdef CONFIG_PTE_MARKER_UFFD_WP
- swp_entry_t entry;
-
- if (!is_swap_pte(pte))
return false;
-
- entry = pte_to_swp_entry(pte);
-
- return pte_marker_entry_uffd_wp(entry);
-#else
- return false;
-#endif
}
-/*
- * Returns true if this is a swap pte and was uffd-wp wr-protected in either
- * forms (pte marker or a normal swap pte), false otherwise.
- */
static inline bool pte_swp_uffd_wp_any(pte_t pte)
{
-#ifdef CONFIG_PTE_MARKER_UFFD_WP
- if (!is_swap_pte(pte))
return false;
-
- if (pte_swp_uffd_wp(pte))
- return true;
-
- if (pte_marker_uffd_wp(pte))
- return true;
-#endif
- return false;
}
-
+#endif /* CONFIG_USERFAULTFD */
#endif /* _LINUX_USERFAULTFD_K_H */
diff --git a/mm/memory.c b/mm/memory.c
index 39ed698dfc37..a47621b35194 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1593,7 +1593,9 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma,
{
bool was_installed = false;
-#ifdef CONFIG_PTE_MARKER_UFFD_WP
+ if (!uffd_supports_wp_marker())
+ return false;
+
/* Zap on anonymous always means dropping everything */
if (vma_is_anonymous(vma))
return false;
@@ -1610,7 +1612,7 @@ zap_install_uffd_wp_if_needed(struct vm_area_struct *vma,
pte++;
addr += PAGE_SIZE;
}
-#endif
+
return was_installed;
}
--
2.34.1
* [PATCH V13 3/6] riscv: Add RISC-V Svrsw60t59b extension support
2025-09-17 3:36 [PATCH V13 0/6] riscv: mm: Add soft-dirty and uffd-wp support Chunyan Zhang
2025-09-17 3:36 ` [PATCH V13 1/6] mm: softdirty: Add pgtable_supports_soft_dirty() Chunyan Zhang
2025-09-17 3:36 ` [PATCH V13 2/6] mm: userfaultfd: Add pgtable_supports_uffd_wp() Chunyan Zhang
@ 2025-09-17 3:37 ` Chunyan Zhang
2025-09-17 3:37 ` [PATCH V13 4/6] riscv: mm: Add soft-dirty page tracking support Chunyan Zhang
` (2 subsequent siblings)
5 siblings, 0 replies; 10+ messages in thread
From: Chunyan Zhang @ 2025-09-17 3:37 UTC (permalink / raw)
To: linux-riscv, linux-fsdevel, linux-mm, linux-kernel
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Conor Dooley, Deepak Gupta, Ved Shanbhogue, Alexander Viro,
Christian Brauner, Jan Kara, Andrew Morton, Peter Xu,
Arnd Bergmann, David Hildenbrand, Lorenzo Stoakes,
Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Axel Rasmussen, Yuanchu Xie,
Chunyan Zhang
The Svrsw60t59b extension makes the reserved PTE bits 60 and 59 available
for software to use.
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
arch/riscv/Kconfig | 14 ++++++++++++++
arch/riscv/include/asm/hwcap.h | 1 +
arch/riscv/kernel/cpufeature.c | 1 +
3 files changed, 16 insertions(+)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index a4b233a0659e..d99df67cc7a4 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -862,6 +862,20 @@ config RISCV_ISA_ZICBOP
If you don't know what to do here, say Y.
+config RISCV_ISA_SVRSW60T59B
+ bool "Svrsw60t59b extension support for using PTE bits 60 and 59"
+ depends on MMU && 64BIT
+ depends on RISCV_ALTERNATIVE
+ default y
+ help
+ Adds support to dynamically detect the presence of the Svrsw60t59b
+ extension and enable its usage.
+
+ The Svrsw60t59b extension allows to free the PTE reserved bits 60
+ and 59 for software to use.
+
+ If you don't know what to do here, say Y.
+
config TOOLCHAIN_NEEDS_EXPLICIT_ZICSR_ZIFENCEI
def_bool y
# https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=aed44286efa8ae8717a77d94b51ac3614e2ca6dc
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index affd63e11b0a..f98fcb5c17d5 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -106,6 +106,7 @@
#define RISCV_ISA_EXT_ZAAMO 97
#define RISCV_ISA_EXT_ZALRSC 98
#define RISCV_ISA_EXT_ZICBOP 99
+#define RISCV_ISA_EXT_SVRSW60T59B 100
#define RISCV_ISA_EXT_XLINUXENVCFG 127
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 743d53415572..2ba71d2d3fa3 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -539,6 +539,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
__RISCV_ISA_EXT_DATA(svinval, RISCV_ISA_EXT_SVINVAL),
__RISCV_ISA_EXT_DATA(svnapot, RISCV_ISA_EXT_SVNAPOT),
__RISCV_ISA_EXT_DATA(svpbmt, RISCV_ISA_EXT_SVPBMT),
+ __RISCV_ISA_EXT_DATA(svrsw60t59b, RISCV_ISA_EXT_SVRSW60T59B),
__RISCV_ISA_EXT_DATA(svvptc, RISCV_ISA_EXT_SVVPTC),
};
--
2.34.1
* [PATCH V13 4/6] riscv: mm: Add soft-dirty page tracking support
2025-09-17 3:36 [PATCH V13 0/6] riscv: mm: Add soft-dirty and uffd-wp support Chunyan Zhang
` (2 preceding siblings ...)
2025-09-17 3:37 ` [PATCH V13 3/6] riscv: Add RISC-V Svrsw60t59b extension support Chunyan Zhang
@ 2025-09-17 3:37 ` Chunyan Zhang
2025-09-17 3:37 ` [PATCH V13 5/6] riscv: mm: Add userfaultfd write-protect support Chunyan Zhang
2025-09-17 3:37 ` [PATCH V13 6/6] dt-bindings: riscv: Add Svrsw60t59b extension description Chunyan Zhang
5 siblings, 0 replies; 10+ messages in thread
From: Chunyan Zhang @ 2025-09-17 3:37 UTC (permalink / raw)
To: linux-riscv, linux-fsdevel, linux-mm, linux-kernel
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Conor Dooley, Deepak Gupta, Ved Shanbhogue, Alexander Viro,
Christian Brauner, Jan Kara, Andrew Morton, Peter Xu,
Arnd Bergmann, David Hildenbrand, Lorenzo Stoakes,
Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Axel Rasmussen, Yuanchu Xie,
Chunyan Zhang
The Svrsw60t59b extension makes the reserved PTE bits 60 and 59 available
for software use; this patch uses bit 59 for soft-dirty tracking.
To add swap PTE soft-dirty tracking, we borrow bit 3, which is available
for swap PTEs on RISC-V systems.
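For context, the new bits are consumed through the existing generic soft-dirty
interfaces; below is a minimal userspace sketch (not part of this patch, error
handling omitted). The clear_refs "4" command and the pagemap soft-dirty flag
(bit 55) are the interfaces documented in
Documentation/admin-guide/mm/soft-dirty.rst:

/* Illustrative soft-dirty usage from userspace (not part of this patch). */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	static char page[4096];
	int clear = open("/proc/self/clear_refs", O_WRONLY);
	int pagemap = open("/proc/self/pagemap", O_RDONLY);
	uint64_t ent = 0;

	write(clear, "4", 1);		/* clear soft-dirty bits of this task */
	page[0] = 1;			/* dirty the page again */

	/* Each pagemap entry is 8 bytes, indexed by virtual page number. */
	pread(pagemap, &ent, sizeof(ent),
	      ((unsigned long)page / getpagesize()) * sizeof(ent));

	/* Bit 55 of a pagemap entry is the soft-dirty flag. */
	printf("soft-dirty: %d\n", !!(ent & (1ULL << 55)));

	close(clear);
	close(pagemap);
	return 0;
}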
Reviewed-by: Deepak Gupta <debug@rivosinc.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/pgtable-bits.h | 19 +++++++
arch/riscv/include/asm/pgtable.h | 75 ++++++++++++++++++++++++++-
3 files changed, 93 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d99df67cc7a4..53b73e4bdf3f 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -141,6 +141,7 @@ config RISCV
select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT
select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
select HAVE_ARCH_SECCOMP_FILTER
+ select HAVE_ARCH_SOFT_DIRTY if 64BIT && MMU && RISCV_ISA_SVRSW60T59B
select HAVE_ARCH_THREAD_STRUCT_WHITELIST
select HAVE_ARCH_TRACEHOOK
select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT && MMU
diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
index 179bd4afece4..f3bac2bbc157 100644
--- a/arch/riscv/include/asm/pgtable-bits.h
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -19,6 +19,25 @@
#define _PAGE_SOFT (3 << 8) /* Reserved for software */
#define _PAGE_SPECIAL (1 << 8) /* RSW: 0x1 */
+
+#ifdef CONFIG_MEM_SOFT_DIRTY
+
+/* ext_svrsw60t59b: bit 59 for soft-dirty tracking */
+#define _PAGE_SOFT_DIRTY \
+ ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \
+ (1UL << 59) : 0)
+/*
+ * Bit 3 is always zero for swap entry computation, so we
+ * can borrow it for swap page soft-dirty tracking.
+ */
+#define _PAGE_SWP_SOFT_DIRTY \
+ ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \
+ _PAGE_EXEC : 0)
+#else
+#define _PAGE_SOFT_DIRTY 0
+#define _PAGE_SWP_SOFT_DIRTY 0
+#endif /* CONFIG_MEM_SOFT_DIRTY */
+
#define _PAGE_TABLE _PAGE_PRESENT
/*
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index e69346307e78..d7fe0f78107b 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -427,7 +427,7 @@ static inline pte_t pte_mkwrite_novma(pte_t pte)
static inline pte_t pte_mkdirty(pte_t pte)
{
- return __pte(pte_val(pte) | _PAGE_DIRTY);
+ return __pte(pte_val(pte) | _PAGE_DIRTY | _PAGE_SOFT_DIRTY);
}
static inline pte_t pte_mkclean(pte_t pte)
@@ -455,6 +455,42 @@ static inline pte_t pte_mkhuge(pte_t pte)
return pte;
}
+#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
+#define pgtable_supports_soft_dirty() \
+ (IS_ENABLED(CONFIG_MEM_SOFT_DIRTY) && \
+ riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B))
+
+static inline bool pte_soft_dirty(pte_t pte)
+{
+ return !!(pte_val(pte) & _PAGE_SOFT_DIRTY);
+}
+
+static inline pte_t pte_mksoft_dirty(pte_t pte)
+{
+ return __pte(pte_val(pte) | _PAGE_SOFT_DIRTY);
+}
+
+static inline pte_t pte_clear_soft_dirty(pte_t pte)
+{
+ return __pte(pte_val(pte) & ~(_PAGE_SOFT_DIRTY));
+}
+
+static inline bool pte_swp_soft_dirty(pte_t pte)
+{
+ return !!(pte_val(pte) & _PAGE_SWP_SOFT_DIRTY);
+}
+
+static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
+{
+ return __pte(pte_val(pte) | _PAGE_SWP_SOFT_DIRTY);
+}
+
+static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
+{
+ return __pte(pte_val(pte) & ~(_PAGE_SWP_SOFT_DIRTY));
+}
+#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
+
#ifdef CONFIG_RISCV_ISA_SVNAPOT
#define pte_leaf_size(pte) (pte_napot(pte) ? \
napot_cont_size(napot_cont_order(pte)) :\
@@ -802,6 +838,40 @@ static inline pud_t pud_mkspecial(pud_t pud)
}
#endif
+#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
+static inline bool pmd_soft_dirty(pmd_t pmd)
+{
+ return pte_soft_dirty(pmd_pte(pmd));
+}
+
+static inline pmd_t pmd_mksoft_dirty(pmd_t pmd)
+{
+ return pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)));
+}
+
+static inline pmd_t pmd_clear_soft_dirty(pmd_t pmd)
+{
+ return pte_pmd(pte_clear_soft_dirty(pmd_pte(pmd)));
+}
+
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+static inline bool pmd_swp_soft_dirty(pmd_t pmd)
+{
+ return pte_swp_soft_dirty(pmd_pte(pmd));
+}
+
+static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd)
+{
+ return pte_pmd(pte_swp_mksoft_dirty(pmd_pte(pmd)));
+}
+
+static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
+{
+ return pte_pmd(pte_swp_clear_soft_dirty(pmd_pte(pmd)));
+}
+#endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */
+#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
+
static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
pmd_t *pmdp, pmd_t pmd)
{
@@ -994,7 +1064,8 @@ static inline pud_t pud_modify(pud_t pud, pgprot_t newprot)
*
* Format of swap PTE:
* bit 0: _PAGE_PRESENT (zero)
- * bit 1 to 3: _PAGE_LEAF (zero)
+ * bit 1 to 2: (zero)
+ * bit 3: _PAGE_SWP_SOFT_DIRTY
* bit 5: _PAGE_PROT_NONE (zero)
* bit 6: exclusive marker
* bits 7 to 11: swap type
--
2.34.1
* [PATCH V13 5/6] riscv: mm: Add userfaultfd write-protect support
2025-09-17 3:36 [PATCH V13 0/6] riscv: mm: Add soft-dirty and uffd-wp support Chunyan Zhang
` (3 preceding siblings ...)
2025-09-17 3:37 ` [PATCH V13 4/6] riscv: mm: Add soft-dirty page tracking support Chunyan Zhang
@ 2025-09-17 3:37 ` Chunyan Zhang
2025-09-17 3:37 ` [PATCH V13 6/6] dt-bindings: riscv: Add Svrsw60t59b extension description Chunyan Zhang
5 siblings, 0 replies; 10+ messages in thread
From: Chunyan Zhang @ 2025-09-17 3:37 UTC (permalink / raw)
To: linux-riscv, linux-fsdevel, linux-mm, linux-kernel
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Conor Dooley, Deepak Gupta, Ved Shanbhogue, Alexander Viro,
Christian Brauner, Jan Kara, Andrew Morton, Peter Xu,
Arnd Bergmann, David Hildenbrand, Lorenzo Stoakes,
Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Axel Rasmussen, Yuanchu Xie,
Chunyan Zhang
The Svrsw60t59b extension makes the reserved PTE bits 60 and 59 available
for software use; this patch uses bit 60 for uffd-wp tracking.
Additionally, for tracking the uffd-wp state as a PTE swap bit, we borrow
bit 4, which is not involved in swap entry computation.
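For context, the feature is exercised through the existing userfaultfd ABI;
below is a minimal write-protect registration sketch (not part of this patch;
error handling and the fault-reading thread are omitted):

/* Illustrative uffd-wp registration from userspace (not part of this patch). */
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
	size_t len = 4096;
	void *area = mmap(NULL, len, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	int uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);

	struct uffdio_api api = { .api = UFFD_API };
	ioctl(uffd, UFFDIO_API, &api);

	/* Register the range in write-protect mode. */
	struct uffdio_register reg = {
		.range = { .start = (unsigned long)area, .len = len },
		.mode = UFFDIO_REGISTER_MODE_WP,
	};
	ioctl(uffd, UFFDIO_REGISTER, &reg);

	/* Write-protect the range; later writes are reported through the uffd
	 * with UFFD_PAGEFAULT_FLAG_WP set in the fault message. */
	struct uffdio_writeprotect wp = {
		.range = { .start = (unsigned long)area, .len = len },
		.mode = UFFDIO_WRITEPROTECT_MODE_WP,
	};
	ioctl(uffd, UFFDIO_WRITEPROTECT, &wp);

	return 0;
}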
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/pgtable-bits.h | 18 +++++++
arch/riscv/include/asm/pgtable.h | 68 +++++++++++++++++++++++++++
3 files changed, 87 insertions(+)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 53b73e4bdf3f..f928768bb14a 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -147,6 +147,7 @@ config RISCV
select HAVE_ARCH_TRANSPARENT_HUGEPAGE if 64BIT && MMU
select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if 64BIT && MMU
select HAVE_ARCH_USERFAULTFD_MINOR if 64BIT && USERFAULTFD
+ select HAVE_ARCH_USERFAULTFD_WP if 64BIT && MMU && USERFAULTFD && RISCV_ISA_SVRSW60T59B
select HAVE_ARCH_VMAP_STACK if MMU && 64BIT
select HAVE_ASM_MODVERSIONS
select HAVE_CONTEXT_TRACKING_USER
diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
index f3bac2bbc157..b422d9691e60 100644
--- a/arch/riscv/include/asm/pgtable-bits.h
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -38,6 +38,24 @@
#define _PAGE_SWP_SOFT_DIRTY 0
#endif /* CONFIG_MEM_SOFT_DIRTY */
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+
+/* ext_svrsw60t59b: Bit(60) for uffd-wp tracking */
+#define _PAGE_UFFD_WP \
+ ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \
+ (1UL << 60) : 0)
+/*
+ * Bit 4 is not involved into swap entry computation, so we
+ * can borrow it for swap page uffd-wp tracking.
+ */
+#define _PAGE_SWP_UFFD_WP \
+ ((riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)) ? \
+ _PAGE_USER : 0)
+#else
+#define _PAGE_UFFD_WP 0
+#define _PAGE_SWP_UFFD_WP 0
+#endif
+
#define _PAGE_TABLE _PAGE_PRESENT
/*
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index d7fe0f78107b..71e84c114dc4 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -416,6 +416,41 @@ static inline pte_t pte_wrprotect(pte_t pte)
return __pte(pte_val(pte) & ~(_PAGE_WRITE));
}
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+#define pgtable_supports_uffd_wp() \
+ riscv_has_extension_unlikely(RISCV_ISA_EXT_SVRSW60T59B)
+
+static inline bool pte_uffd_wp(pte_t pte)
+{
+ return !!(pte_val(pte) & _PAGE_UFFD_WP);
+}
+
+static inline pte_t pte_mkuffd_wp(pte_t pte)
+{
+ return pte_wrprotect(__pte(pte_val(pte) | _PAGE_UFFD_WP));
+}
+
+static inline pte_t pte_clear_uffd_wp(pte_t pte)
+{
+ return __pte(pte_val(pte) & ~(_PAGE_UFFD_WP));
+}
+
+static inline bool pte_swp_uffd_wp(pte_t pte)
+{
+ return !!(pte_val(pte) & _PAGE_SWP_UFFD_WP);
+}
+
+static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+{
+ return __pte(pte_val(pte) | _PAGE_SWP_UFFD_WP);
+}
+
+static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+{
+ return __pte(pte_val(pte) & ~(_PAGE_SWP_UFFD_WP));
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
/* static inline pte_t pte_mkread(pte_t pte) */
static inline pte_t pte_mkwrite_novma(pte_t pte)
@@ -838,6 +873,38 @@ static inline pud_t pud_mkspecial(pud_t pud)
}
#endif
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline bool pmd_uffd_wp(pmd_t pmd)
+{
+ return pte_uffd_wp(pmd_pte(pmd));
+}
+
+static inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+{
+ return pte_pmd(pte_mkuffd_wp(pmd_pte(pmd)));
+}
+
+static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+{
+ return pte_pmd(pte_clear_uffd_wp(pmd_pte(pmd)));
+}
+
+static inline bool pmd_swp_uffd_wp(pmd_t pmd)
+{
+ return pte_swp_uffd_wp(pmd_pte(pmd));
+}
+
+static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+{
+ return pte_pmd(pte_swp_mkuffd_wp(pmd_pte(pmd)));
+}
+
+static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+{
+ return pte_pmd(pte_swp_clear_uffd_wp(pmd_pte(pmd)));
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
static inline bool pmd_soft_dirty(pmd_t pmd)
{
@@ -1066,6 +1133,7 @@ static inline pud_t pud_modify(pud_t pud, pgprot_t newprot)
* bit 0: _PAGE_PRESENT (zero)
* bit 1 to 2: (zero)
* bit 3: _PAGE_SWP_SOFT_DIRTY
+ * bit 4: _PAGE_SWP_UFFD_WP
* bit 5: _PAGE_PROT_NONE (zero)
* bit 6: exclusive marker
* bits 7 to 11: swap type
--
2.34.1
* [PATCH V13 6/6] dt-bindings: riscv: Add Svrsw60t59b extension description
2025-09-17 3:36 [PATCH V13 0/6] riscv: mm: Add soft-dirty and uffd-wp support Chunyan Zhang
` (4 preceding siblings ...)
2025-09-17 3:37 ` [PATCH V13 5/6] riscv: mm: Add userfaultfd write-protect support Chunyan Zhang
@ 2025-09-17 3:37 ` Chunyan Zhang
2025-09-18 0:10 ` Krzysztof Kozlowski
5 siblings, 1 reply; 10+ messages in thread
From: Chunyan Zhang @ 2025-09-17 3:37 UTC (permalink / raw)
To: linux-riscv, linux-fsdevel, linux-mm, linux-kernel
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Conor Dooley, Deepak Gupta, Ved Shanbhogue, Alexander Viro,
Christian Brauner, Jan Kara, Andrew Morton, Peter Xu,
Arnd Bergmann, David Hildenbrand, Lorenzo Stoakes,
Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Axel Rasmussen, Yuanchu Xie,
Chunyan Zhang
Add a description for the Svrsw60t59b (PTE Reserved for SW bits 60:59)
extension, which was recently ratified in riscv-non-isa/riscv-iommu.
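For illustration, a cpu node advertises the extension through the existing
riscv,isa-extensions string list covered by this binding (fragment only; the
other properties and extension names below are placeholders):

cpu@0 {
	compatible = "riscv";
	device_type = "cpu";
	reg = <0>;
	riscv,isa-base = "rv64i";
	riscv,isa-extensions = "i", "m", "a", "f", "d", "c", "svrsw60t59b";
};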
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
Documentation/devicetree/bindings/riscv/extensions.yaml | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/Documentation/devicetree/bindings/riscv/extensions.yaml b/Documentation/devicetree/bindings/riscv/extensions.yaml
index ede6a58ccf53..7e1a59c7d911 100644
--- a/Documentation/devicetree/bindings/riscv/extensions.yaml
+++ b/Documentation/devicetree/bindings/riscv/extensions.yaml
@@ -217,6 +217,12 @@ properties:
memory types as ratified in the 20191213 version of the privileged
ISA specification.
+ - const: svrsw60t59b
+ description:
+ The svrsw60t59b for providing two more bits[60:59] to PTE/PMD entry
+ as ratified at commit 28bde925e7a7 ("PTE Reserved for SW bits 60:59")
+ of riscv-non-isa/riscv-iommu.
+
- const: svvptc
description:
The standard Svvptc supervisor-level extension for
--
2.34.1
* Re: [PATCH V13 2/6] mm: userfaultfd: Add pgtable_supports_uffd_wp()
2025-09-17 3:36 ` [PATCH V13 2/6] mm: userfaultfd: Add pgtable_supports_uffd_wp() Chunyan Zhang
@ 2025-09-17 7:25 ` David Hildenbrand
2025-09-17 9:20 ` Chunyan Zhang
0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand @ 2025-09-17 7:25 UTC (permalink / raw)
To: Chunyan Zhang, linux-riscv, linux-fsdevel, linux-mm, linux-kernel
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Conor Dooley, Deepak Gupta, Ved Shanbhogue, Alexander Viro,
Christian Brauner, Jan Kara, Andrew Morton, Peter Xu,
Arnd Bergmann, Lorenzo Stoakes, Liam R . Howlett, Vlastimil Babka,
Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Axel Rasmussen,
Yuanchu Xie, Chunyan Zhang
On 17.09.25 05:36, Chunyan Zhang wrote:
> Some platforms can customize the PTE/PMD entry uffd-wp bit making
> it unavailable even if the architecture provides the resource.
> This patch adds a macro API that allows architectures to define their
> specific implementations to check if the uffd-wp bit is available
> on which device the kernel is running.
>
> Also this patch is removing "ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP" and
> "ifdef CONFIG_PTE_MARKER_UFFD_WP" in favor of pgtable_supports_uffd_wp()
> and uffd_supports_wp_marker() checks respectively that default to
> IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) and
> "IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) && IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP)"
> if not overridden by the architecture, no change in behavior is expected.
>
> Acked-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
> ---
[...]
Taking another look.
> /* mm helpers */
> @@ -415,68 +475,24 @@ static inline bool vma_has_uffd_without_event_remap(struct vm_area_struct *vma)
> return false;
> }
>
> -#endif /* CONFIG_USERFAULTFD */
> -
> static inline bool userfaultfd_wp_use_markers(struct vm_area_struct *vma)
> {
> - /* Only wr-protect mode uses pte markers */
> - if (!userfaultfd_wp(vma))
> return false;
Isn't this indented one level too deep?
> -
> - /* File-based uffd-wp always need markers */
> - if (!vma_is_anonymous(vma))
> - return true;
> -
> - /*
> - * Anonymous uffd-wp only needs the markers if WP_UNPOPULATED
> - * enabled (to apply markers on zero pages).
> - */
> - return userfaultfd_wp_unpopulated(vma);
> }
>
> static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry)
> {
> -#ifdef CONFIG_PTE_MARKER_UFFD_WP
> - return is_pte_marker_entry(entry) &&
> - (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
> -#else
> - return false;
> -#endif
> + return false;
Same here.
> }
>
> static inline bool pte_marker_uffd_wp(pte_t pte)
> {
> -#ifdef CONFIG_PTE_MARKER_UFFD_WP
> - swp_entry_t entry;
> -
> - if (!is_swap_pte(pte))
> return false;
Same here.
> -
> - entry = pte_to_swp_entry(pte);
> -
> - return pte_marker_entry_uffd_wp(entry);
> -#else
> - return false;
> -#endif
> }
--
Cheers
David / dhildenb
* Re: [PATCH V13 2/6] mm: userfaultfd: Add pgtable_supports_uffd_wp()
2025-09-17 7:25 ` David Hildenbrand
@ 2025-09-17 9:20 ` Chunyan Zhang
0 siblings, 0 replies; 10+ messages in thread
From: Chunyan Zhang @ 2025-09-17 9:20 UTC (permalink / raw)
To: David Hildenbrand
Cc: Chunyan Zhang, linux-riscv, linux-fsdevel, linux-mm, linux-kernel,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Conor Dooley, Deepak Gupta, Ved Shanbhogue, Alexander Viro,
Christian Brauner, Jan Kara, Andrew Morton, Peter Xu,
Arnd Bergmann, Lorenzo Stoakes, Liam R . Howlett, Vlastimil Babka,
Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Axel Rasmussen,
Yuanchu Xie
On Wed, 17 Sept 2025 at 15:25, David Hildenbrand <david@redhat.com> wrote:
>
> On 17.09.25 05:36, Chunyan Zhang wrote:
> > Some platforms can customize the PTE/PMD entry uffd-wp bit making
> > it unavailable even if the architecture provides the resource.
> > This patch adds a macro API that allows architectures to define their
> > specific implementations to check if the uffd-wp bit is available
> > on which device the kernel is running.
> >
> > Also this patch is removing "ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP" and
> > "ifdef CONFIG_PTE_MARKER_UFFD_WP" in favor of pgtable_supports_uffd_wp()
> > and uffd_supports_wp_marker() checks respectively that default to
> > IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) and
> > "IS_ENABLED(CONFIG_HAVE_ARCH_USERFAULTFD_WP) && IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP)"
> > if not overridden by the architecture, no change in behavior is expected.
> >
> > Acked-by: David Hildenbrand <david@redhat.com>
> > Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
> > ---
>
> [...]
>
> Taking another look.
>
> > /* mm helpers */
> > @@ -415,68 +475,24 @@ static inline bool vma_has_uffd_without_event_remap(struct vm_area_struct *vma)
> > return false;
> > }
> >
> > -#endif /* CONFIG_USERFAULTFD */
> > -
> > static inline bool userfaultfd_wp_use_markers(struct vm_area_struct *vma)
> > {
> > - /* Only wr-protect mode uses pte markers */
> > - if (!userfaultfd_wp(vma))
> > return false;
>
> Isn't this indented one level too deep?
Oh right, I will fix these.
Thanks for spotting them!
Chunyan
>
> > -
> > - /* File-based uffd-wp always need markers */
> > - if (!vma_is_anonymous(vma))
> > - return true;
> > -
> > - /*
> > - * Anonymous uffd-wp only needs the markers if WP_UNPOPULATED
> > - * enabled (to apply markers on zero pages).
> > - */
> > - return userfaultfd_wp_unpopulated(vma);
> > }
> >
> > static inline bool pte_marker_entry_uffd_wp(swp_entry_t entry)
> > {
> > -#ifdef CONFIG_PTE_MARKER_UFFD_WP
> > - return is_pte_marker_entry(entry) &&
> > - (pte_marker_get(entry) & PTE_MARKER_UFFD_WP);
> > -#else
> > - return false;
> > -#endif
> > + return false;
>
> Same here.
>
> > }
> >
> > static inline bool pte_marker_uffd_wp(pte_t pte)
> > {
> > -#ifdef CONFIG_PTE_MARKER_UFFD_WP
> > - swp_entry_t entry;
> > -
> > - if (!is_swap_pte(pte))
> > return false;
>
> Same here.
>
> > -
> > - entry = pte_to_swp_entry(pte);
> > -
> > - return pte_marker_entry_uffd_wp(entry);
> > -#else
> > - return false;
> > -#endif
> > }
>
>
> --
> Cheers
>
> David / dhildenb
>
* Re: [PATCH V13 6/6] dt-bindings: riscv: Add Svrsw60t59b extension description
2025-09-17 3:37 ` [PATCH V13 6/6] dt-bindings: riscv: Add Svrsw60t59b extension description Chunyan Zhang
@ 2025-09-18 0:10 ` Krzysztof Kozlowski
0 siblings, 0 replies; 10+ messages in thread
From: Krzysztof Kozlowski @ 2025-09-18 0:10 UTC (permalink / raw)
To: Chunyan Zhang, linux-riscv, linux-fsdevel, linux-mm, linux-kernel
Cc: Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexandre Ghiti,
Conor Dooley, Deepak Gupta, Ved Shanbhogue, Alexander Viro,
Christian Brauner, Jan Kara, Andrew Morton, Peter Xu,
Arnd Bergmann, David Hildenbrand, Lorenzo Stoakes,
Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
Suren Baghdasaryan, Michal Hocko, Axel Rasmussen, Yuanchu Xie,
Chunyan Zhang
On 17/09/2025 12:37, Chunyan Zhang wrote:
> Add description for the Svrsw60t59b extension (PTE Reserved for SW
> bits 60:59) extension which was ratified recently in
> riscv-non-isa/riscv-iommu.
>
> Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
> ---
> Documentation/devicetree/bindings/riscv/extensions.yaml | 6 ++++++
> 1 file changed, 6 insertions(+)
<form letter>
Please use scripts/get_maintainers.pl to get a list of necessary people
and lists to CC. It might happen that the command, when run on an older
kernel, gives you outdated entries. Therefore please be sure you base
your patches on a recent Linux kernel.
Tools like b4 or scripts/get_maintainer.pl provide you a proper list of
people, so fix your workflow. Tools might also fail if you work on some
ancient tree (don't, instead use mainline) or work on a fork of the kernel
(don't, instead use mainline). Just use b4 and everything should be
fine, although remember about `b4 prep --auto-to-cc` if you added new
patches to the patchset.
You missed at least the devicetree list (maybe more), so this won't be
tested by automated tooling. Performing review on untested code might be
a waste of time.
Please kindly resend and include all necessary To/Cc entries.
</form letter>
Best regards,
Krzysztof