* [PATCH V5 0/3] riscv: mm: Add soft-dirty and uffd-wp support
@ 2024-11-13 9:58 Chunyan Zhang
2024-11-13 9:58 ` [PATCH V5 1/3] riscv: mm: Prepare for reusing PTE RSW bit(9) Chunyan Zhang
` (4 more replies)
0 siblings, 5 replies; 13+ messages in thread
From: Chunyan Zhang @ 2024-11-13 9:58 UTC (permalink / raw)
To: Palmer Dabbelt, Albert Ou, Paul Walmsley, Alexandre Ghiti,
Andrew Morton
Cc: linux-riscv, linux-kernel, Chunyan Zhang
This patchset adds soft-dirty and userfaultfd write-protect tracking
support for RISC-V.
As described in the patches, only one PTE bit, bit(9), is available to
support three kernel features (devmap, soft-dirty, uffd-wp). These
features therefore cannot be enabled at the same time; one of them has
to be selected when building the kernel.
This patchset has been tested with:
1) The kselftest mm suite in which soft-dirty, madv_populate,
test_unmerge_uffd_wp, and uffd-unit-tests run and pass, and no regressions
are observed in any of the other tests.
2) CRIU:
- 'criu check --feature mem_dirty_track' returns supported;
- incremental_dumps [1] and simple_loop [2] dump and restore work fine;
- zdtm test suite can run under host mode.
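
For readers who want to check the tracking state by hand rather than through CRIU: the kernel reports it per page in /proc/<pid>/pagemap (one 64-bit entry per virtual page), and soft-dirty bits are reset by writing "4" to /proc/<pid>/clear_refs, as described in Documentation/admin-guide/mm/soft-dirty.rst and pagemap.rst. The sketch below only decodes pagemap entries. The PM_* macros and helper names are local to this example and are not part of the patchset or of CRIU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit positions of a /proc/<pid>/pagemap entry, as documented in
 * Documentation/admin-guide/mm/pagemap.rst. The macro names are local
 * to this sketch.
 */
#define PM_SOFT_DIRTY	(UINT64_C(1) << 55)	/* PTE is soft-dirty */
#define PM_UFFD_WP	(UINT64_C(1) << 57)	/* PTE is uffd-wp write-protected */
#define PM_PRESENT	(UINT64_C(1) << 63)	/* page is present in RAM */

static inline bool pagemap_soft_dirty(uint64_t entry)
{
	return (entry & PM_SOFT_DIRTY) != 0;
}

static inline bool pagemap_uffd_wp(uint64_t entry)
{
	return (entry & PM_UFFD_WP) != 0;
}

/* Byte offset in the pagemap file of the entry that describes vaddr. */
static inline uint64_t pagemap_offset(uint64_t vaddr, uint64_t page_size)
{
	assert(page_size != 0);
	return (vaddr / page_size) * sizeof(uint64_t);
}
```

A caller would typically pread() eight bytes at pagemap_offset() from the pagemap file and pass the value to the helpers above.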
This patchset applies on top of v6.12-rc7.
V5:
- Fixed typos and corrected some words in Kconfig and commit message;
- Removed pte_wrprotect() from pte_swp_mkuffd_wp(), which was a copy-paste error;
- Added Alex's Reviewed-by tag in patch 2.
V4:
- Added bit(4) descriptions into "Format of swap PTE".
V3:
- Fixed the issue reported by kernel test robot <lkp@intel.com>.
V1 -> V2:
- Added uffd-wp support;
- Made soft-dirty, uffd-wp and devmap mutually exclusive, since they all use the same PTE bit;
- Added test results of CRIU to the cover letter.
[1] https://www.criu.org/Incremental_dumps
[2] https://asciinema.org/a/232445
Chunyan Zhang (3):
riscv: mm: Prepare for reusing PTE RSW bit(9)
riscv: mm: Add soft-dirty page tracking support
riscv: mm: Add uffd write-protect support
arch/riscv/Kconfig | 34 ++++++-
arch/riscv/include/asm/pgtable-64.h | 2 +-
arch/riscv/include/asm/pgtable-bits.h | 31 ++++++
arch/riscv/include/asm/pgtable.h | 133 +++++++++++++++++++++++++-
4 files changed, 197 insertions(+), 3 deletions(-)
--
2.34.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH V5 1/3] riscv: mm: Prepare for reusing PTE RSW bit(9)
2024-11-13 9:58 [PATCH V5 0/3] riscv: mm: Add soft-dirty and uffd-wp support Chunyan Zhang
@ 2024-11-13 9:58 ` Chunyan Zhang
2025-01-30 8:42 ` Björn Töpel
2024-11-13 9:58 ` [PATCH V5 2/3] riscv: mm: Add soft-dirty page tracking support Chunyan Zhang
` (3 subsequent siblings)
4 siblings, 1 reply; 13+ messages in thread
From: Chunyan Zhang @ 2024-11-13 9:58 UTC (permalink / raw)
To: Palmer Dabbelt, Albert Ou, Paul Walmsley, Alexandre Ghiti,
Andrew Morton
Cc: linux-riscv, linux-kernel, Chunyan Zhang
The PTE bit(9) on RISC-V is reserved for software and is currently used
by devmap. Since there are no other free PTE bits on RISC-V, devmap has
to be disabled if we want to use bit(9) for other features.
So to make ARCH_HAS_PTE_DEVMAP selectable, this patch uses it as
the build condition of the devmap definitions.
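
As an illustration of why defining the mask as 0 is enough to free the bit, here is a minimal userspace model (not kernel code). MODEL_HAS_PTE_DEVMAP, model_pte_t and the helper names are hypothetical stand-ins for CONFIG_ARCH_HAS_PTE_DEVMAP, pte_t and the pte_devmap()/pte_mkdevmap() helpers.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace model, not kernel code. Build with -DMODEL_HAS_PTE_DEVMAP to
 * model a kernel that selects ARCH_HAS_PTE_DEVMAP.
 */
#ifdef MODEL_HAS_PTE_DEVMAP
#define MODEL_PAGE_DEVMAP	(UINT64_C(1) << 9)	/* RSW bit(9) */
#else
#define MODEL_PAGE_DEVMAP	UINT64_C(0)		/* bit(9) left free */
#endif

static_assert(MODEL_PAGE_DEVMAP == 0 ||
	      MODEL_PAGE_DEVMAP == (UINT64_C(1) << 9),
	      "devmap either owns bit(9) or no bit at all");

typedef struct { uint64_t val; } model_pte_t;

/* With a zero mask this always returns 0. */
static inline int model_pte_devmap(model_pte_t pte)
{
	return (pte.val & MODEL_PAGE_DEVMAP) != 0;
}

/* With a zero mask this returns the PTE unchanged. */
static inline model_pte_t model_pte_mkdevmap(model_pte_t pte)
{
	pte.val |= MODEL_PAGE_DEVMAP;
	return pte;
}
```

With the mask compiled to 0, callers need no #ifdef of their own: the test is constant false and the setter is a no-op, so bit(9) is never touched on behalf of devmap.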
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
arch/riscv/include/asm/pgtable-64.h | 2 +-
arch/riscv/include/asm/pgtable-bits.h | 6 ++++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h
index 0897dd99ab8d..babb8d2b0f0b 100644
--- a/arch/riscv/include/asm/pgtable-64.h
+++ b/arch/riscv/include/asm/pgtable-64.h
@@ -398,7 +398,7 @@ static inline struct page *pgd_page(pgd_t pgd)
#define p4d_offset p4d_offset
p4d_t *p4d_offset(pgd_t *pgd, unsigned long address);
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && defined(CONFIG_ARCH_HAS_PTE_DEVMAP)
static inline int pte_devmap(pte_t pte);
static inline pte_t pmd_pte(pmd_t pmd);
diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
index a8f5205cea54..5bcc73430829 100644
--- a/arch/riscv/include/asm/pgtable-bits.h
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -19,7 +19,13 @@
#define _PAGE_SOFT (3 << 8) /* Reserved for software */
#define _PAGE_SPECIAL (1 << 8) /* RSW: 0x1 */
+
+#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP
#define _PAGE_DEVMAP (1 << 9) /* RSW, devmap */
+#else
+#define _PAGE_DEVMAP 0
+#endif /* CONFIG_ARCH_HAS_PTE_DEVMAP */
+
#define _PAGE_TABLE _PAGE_PRESENT
/*
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH V5 2/3] riscv: mm: Add soft-dirty page tracking support
2024-11-13 9:58 [PATCH V5 0/3] riscv: mm: Add soft-dirty and uffd-wp support Chunyan Zhang
2024-11-13 9:58 ` [PATCH V5 1/3] riscv: mm: Prepare for reusing PTE RSW bit(9) Chunyan Zhang
@ 2024-11-13 9:58 ` Chunyan Zhang
2024-11-13 9:58 ` [PATCH V5 3/3] riscv: mm: Add uffd write-protect support Chunyan Zhang
` (2 subsequent siblings)
4 siblings, 0 replies; 13+ messages in thread
From: Chunyan Zhang @ 2024-11-13 9:58 UTC (permalink / raw)
To: Palmer Dabbelt, Albert Ou, Paul Walmsley, Alexandre Ghiti,
Andrew Morton
Cc: linux-riscv, linux-kernel, Chunyan Zhang
The PTE bit(9) is reserved for software and is currently used by devmap.
This patch reuses bit(9) for soft-dirty, which is enabled only
if !CONFIG_ARCH_HAS_PTE_DEVMAP; in other words, soft-dirty
and devmap are mutually exclusive on RISC-V.
To add swap PTE soft-dirty tracking, we borrow bit(4), which is
available for swap PTEs on RISC-V systems.
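
A rough userspace model of the two bit positions may help when reading the diff below. The SD_* names are local to this sketch, and sd_swap_out() is a hypothetical helper that only illustrates how the state can follow a page from a present PTE to a swap PTE; it is not a function from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model of the bit layout in this patch; SD_* names are local. */
#define SD_PAGE_PRESENT		(UINT64_C(1) << 0)
#define SD_PAGE_DIRTY		(UINT64_C(1) << 7)	/* hardware dirty bit */
#define SD_PAGE_SOFT_DIRTY	(UINT64_C(1) << 9)	/* RSW bit(9), present PTEs */
#define SD_PAGE_SWP_SOFT_DIRTY	(UINT64_C(1) << 4)	/* bit(4), swap PTEs only */

/* Mirrors the patched pte_mkdirty(): dirtying a page also marks it soft-dirty. */
static inline uint64_t sd_mkdirty(uint64_t pte)
{
	return pte | SD_PAGE_DIRTY | SD_PAGE_SOFT_DIRTY;
}

static inline int sd_soft_dirty(uint64_t pte)
{
	return (pte & SD_PAGE_SOFT_DIRTY) != 0;
}

/* Clearing soft-dirty leaves the hardware dirty bit alone. */
static inline uint64_t sd_clear_soft_dirty(uint64_t pte)
{
	return pte & ~SD_PAGE_SOFT_DIRTY;
}

static inline uint64_t sd_swp_mksoft_dirty(uint64_t swp_pte)
{
	assert(!(swp_pte & SD_PAGE_PRESENT));	/* swap PTEs are never present */
	return swp_pte | SD_PAGE_SWP_SOFT_DIRTY;
}

static inline int sd_swp_soft_dirty(uint64_t swp_pte)
{
	return (swp_pte & SD_PAGE_SWP_SOFT_DIRTY) != 0;
}

/* Hypothetical helper: carry soft-dirty over when a PTE becomes a swap PTE. */
static inline uint64_t sd_swap_out(uint64_t pte, uint64_t swp_pte)
{
	return sd_soft_dirty(pte) ? sd_swp_mksoft_dirty(swp_pte) : swp_pte;
}
```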
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
arch/riscv/Kconfig | 27 ++++++++++-
arch/riscv/include/asm/pgtable-bits.h | 12 +++++
arch/riscv/include/asm/pgtable.h | 69 ++++++++++++++++++++++++++-
3 files changed, 106 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index f4c570538d55..3bccdcae9445 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -40,7 +40,6 @@ config RISCV
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PMEM_API
select ARCH_HAS_PREPARE_SYNC_CORE_CMD
- select ARCH_HAS_PTE_DEVMAP if 64BIT && MMU
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SET_DIRECT_MAP if MMU
select ARCH_HAS_SET_MEMORY if MMU
@@ -966,6 +965,32 @@ config RANDOMIZE_BASE
If unsure, say N.
+choice
+ prompt "PTE RSW bit(9) usage"
+ default RISCV_HAS_PTE_DEVMAP
+ depends on MMU && 64BIT
+ help
+ RISC-V PTE bit(9) is reserved for software, and used by more than
+ one kernel feature which cannot be supported at the same time.
+ So we have to select one for it.
+
+config RISCV_HAS_PTE_DEVMAP
+ bool "devmap"
+ select ARCH_HAS_PTE_DEVMAP
+ help
+ The PTE bit(9) is used for devmap mark. ZONE_DEVICE pages need devmap
+ PTEs support to function.
+
+ So if you want to use ZONE_DEVICE, select this.
+
+config RISCV_HAS_SOFT_DIRTY
+ bool "soft-dirty"
+ select HAVE_ARCH_SOFT_DIRTY
+ help
+ The PTE bit(9) is used for soft-dirty tracking.
+
+endchoice
+
endmenu # "Kernel features"
menu "Boot options"
diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
index 5bcc73430829..c6d51fe9fc6f 100644
--- a/arch/riscv/include/asm/pgtable-bits.h
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -26,6 +26,18 @@
#define _PAGE_DEVMAP 0
#endif /* CONFIG_ARCH_HAS_PTE_DEVMAP */
+#ifdef CONFIG_MEM_SOFT_DIRTY
+#define _PAGE_SOFT_DIRTY (1 << 9) /* RSW: 0x2 for software dirty tracking */
+/*
+ * BIT 4 is not involved into swap entry computation, so we
+ * can borrow it for swap page soft-dirty tracking.
+ */
+#define _PAGE_SWP_SOFT_DIRTY _PAGE_USER
+#else
+#define _PAGE_SOFT_DIRTY 0
+#define _PAGE_SWP_SOFT_DIRTY 0
+#endif /* CONFIG_MEM_SOFT_DIRTY */
+
#define _PAGE_TABLE _PAGE_PRESENT
/*
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index e79f15293492..1779eae5cb49 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -424,7 +424,7 @@ static inline pte_t pte_mkwrite_novma(pte_t pte)
static inline pte_t pte_mkdirty(pte_t pte)
{
- return __pte(pte_val(pte) | _PAGE_DIRTY);
+ return __pte(pte_val(pte) | _PAGE_DIRTY | _PAGE_SOFT_DIRTY);
}
static inline pte_t pte_mkclean(pte_t pte)
@@ -457,6 +457,38 @@ static inline pte_t pte_mkhuge(pte_t pte)
return pte;
}
+#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
+static inline int pte_soft_dirty(pte_t pte)
+{
+ return pte_val(pte) & _PAGE_SOFT_DIRTY;
+}
+
+static inline pte_t pte_mksoft_dirty(pte_t pte)
+{
+ return __pte(pte_val(pte) | _PAGE_SOFT_DIRTY);
+}
+
+static inline pte_t pte_clear_soft_dirty(pte_t pte)
+{
+ return __pte(pte_val(pte) & ~(_PAGE_SOFT_DIRTY));
+}
+
+static inline int pte_swp_soft_dirty(pte_t pte)
+{
+ return pte_val(pte) & _PAGE_SWP_SOFT_DIRTY;
+}
+
+static inline pte_t pte_swp_mksoft_dirty(pte_t pte)
+{
+ return __pte(pte_val(pte) | _PAGE_SWP_SOFT_DIRTY);
+}
+
+static inline pte_t pte_swp_clear_soft_dirty(pte_t pte)
+{
+ return __pte(pte_val(pte) & ~(_PAGE_SWP_SOFT_DIRTY));
+}
+#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
+
#ifdef CONFIG_RISCV_ISA_SVNAPOT
#define pte_leaf_size(pte) (pte_napot(pte) ? \
napot_cont_size(napot_cont_order(pte)) :\
@@ -757,6 +789,40 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd)
return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
}
+#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
+static inline int pmd_soft_dirty(pmd_t pmd)
+{
+ return pte_soft_dirty(pmd_pte(pmd));
+}
+
+static inline pmd_t pmd_mksoft_dirty(pmd_t pmd)
+{
+ return pte_pmd(pte_mksoft_dirty(pmd_pte(pmd)));
+}
+
+static inline pmd_t pmd_clear_soft_dirty(pmd_t pmd)
+{
+ return pte_pmd(pte_clear_soft_dirty(pmd_pte(pmd)));
+}
+
+#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
+static inline int pmd_swp_soft_dirty(pmd_t pmd)
+{
+ return pte_swp_soft_dirty(pmd_pte(pmd));
+}
+
+static inline pmd_t pmd_swp_mksoft_dirty(pmd_t pmd)
+{
+ return pte_pmd(pte_swp_mksoft_dirty(pmd_pte(pmd)));
+}
+
+static inline pmd_t pmd_swp_clear_soft_dirty(pmd_t pmd)
+{
+ return pte_pmd(pte_swp_clear_soft_dirty(pmd_pte(pmd)));
+}
+#endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */
+#endif /* CONFIG_HAVE_ARCH_SOFT_DIRTY */
+
static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
pmd_t *pmdp, pmd_t pmd)
{
@@ -847,6 +913,7 @@ extern pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,
* Format of swap PTE:
* bit 0: _PAGE_PRESENT (zero)
* bit 1 to 3: _PAGE_LEAF (zero)
+ * bit 4: _PAGE_SWP_SOFT_DIRTY
* bit 5: _PAGE_PROT_NONE (zero)
* bit 6: exclusive marker
* bits 7 to 11: swap type
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH V5 3/3] riscv: mm: Add uffd write-protect support
2024-11-13 9:58 [PATCH V5 0/3] riscv: mm: Add soft-dirty and uffd-wp support Chunyan Zhang
2024-11-13 9:58 ` [PATCH V5 1/3] riscv: mm: Prepare for reusing PTE RSW bit(9) Chunyan Zhang
2024-11-13 9:58 ` [PATCH V5 2/3] riscv: mm: Add soft-dirty page tracking support Chunyan Zhang
@ 2024-11-13 9:58 ` Chunyan Zhang
2025-01-29 8:12 ` [PATCH V5 0/3] riscv: mm: Add soft-dirty and uffd-wp support Deepak Gupta
2025-03-30 13:51 ` Alexandre Ghiti
4 siblings, 0 replies; 13+ messages in thread
From: Chunyan Zhang @ 2024-11-13 9:58 UTC (permalink / raw)
To: Palmer Dabbelt, Albert Ou, Paul Walmsley, Alexandre Ghiti,
Andrew Morton
Cc: linux-riscv, linux-kernel, Chunyan Zhang
Reuse PTE bit(9) for uffd-wp tracking and make it mutually exclusive
with soft-dirty and devmap, which all use this PTE bit.
Additionally, to track the uffd-wp state in swap PTEs, we use
swap entry PTE bit(4), which is also used by swap
soft-dirty tracking.
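
The difference between the present-PTE and swap-PTE helpers can be sketched in userspace as follows. The WP_* names are local to this example. The point it tries to show is that marking a present PTE also drops its write permission, while the swap variant only sets the marker, because bits 1 to 3 of a swap PTE must stay zero.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model of the bit layout in this patch; WP_* names are local. */
#define WP_PAGE_PRESENT		(UINT64_C(1) << 0)
#define WP_PAGE_WRITE		(UINT64_C(1) << 2)
#define WP_PAGE_UFFD_WP		(UINT64_C(1) << 9)	/* RSW bit(9), present PTEs */
#define WP_PAGE_SWP_UFFD_WP	(UINT64_C(1) << 4)	/* bit(4), swap PTEs only */

static inline uint64_t wp_wrprotect(uint64_t pte)
{
	return pte & ~WP_PAGE_WRITE;
}

/* Set the marker and remove write permission so the next write faults. */
static inline uint64_t wp_mkuffd_wp(uint64_t pte)
{
	return wp_wrprotect(pte | WP_PAGE_UFFD_WP);
}

static inline uint64_t wp_clear_uffd_wp(uint64_t pte)
{
	return pte & ~WP_PAGE_UFFD_WP;
}

static inline int wp_uffd_wp(uint64_t pte)
{
	return (pte & WP_PAGE_UFFD_WP) != 0;
}

/* Swap PTEs carry no permission bits, so only the marker is set. */
static inline uint64_t wp_swp_mkuffd_wp(uint64_t swp_pte)
{
	assert(!(swp_pte & WP_PAGE_PRESENT));	/* swap PTEs are never present */
	return swp_pte | WP_PAGE_SWP_UFFD_WP;
}
```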
Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn>
---
arch/riscv/Kconfig | 7 +++
arch/riscv/include/asm/pgtable-bits.h | 13 ++++++
arch/riscv/include/asm/pgtable.h | 66 ++++++++++++++++++++++++++-
3 files changed, 85 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 3bccdcae9445..920071a05512 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -989,6 +989,13 @@ config RISCV_HAS_SOFT_DIRTY
help
The PTE bit(9) is used for soft-dirty tracking.
+config RISCV_HAS_USERFAULTFD_WP
+ bool "userfaultfd write protection"
+ select HAVE_ARCH_USERFAULTFD_WP
+ depends on USERFAULTFD
+ help
+ The PTE bit(9) is used for userfaultfd write-protected
+ tracking.
endchoice
endmenu # "Kernel features"
diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
index c6d51fe9fc6f..7de16141c049 100644
--- a/arch/riscv/include/asm/pgtable-bits.h
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -38,6 +38,19 @@
#define _PAGE_SWP_SOFT_DIRTY 0
#endif /* CONFIG_MEM_SOFT_DIRTY */
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+/*
+ * CONFIG_HAVE_ARCH_USERFAULTFD_WP is mutually exclusive with
+ * HAVE_ARCH_SOFT_DIRTY so we can use the same bit for uffd-wp
+ * and soft-dirty tracking.
+ */
+#define _PAGE_UFFD_WP (1 << 9) /* RSW: 0x2 for uffd-wp tracking */
+#define _PAGE_SWP_UFFD_WP _PAGE_USER
+#else
+#define _PAGE_UFFD_WP 0
+#define _PAGE_SWP_UFFD_WP 0
+#endif
+
#define _PAGE_TABLE _PAGE_PRESENT
/*
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 1779eae5cb49..f241c444cebd 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -413,6 +413,38 @@ static inline pte_t pte_wrprotect(pte_t pte)
return __pte(pte_val(pte) & ~(_PAGE_WRITE));
}
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pte_uffd_wp(pte_t pte)
+{
+ return pte_val(pte) & _PAGE_UFFD_WP;
+}
+
+static inline pte_t pte_mkuffd_wp(pte_t pte)
+{
+ return pte_wrprotect(__pte(pte_val(pte) | _PAGE_UFFD_WP));
+}
+
+static inline pte_t pte_clear_uffd_wp(pte_t pte)
+{
+ return __pte(pte_val(pte) & ~(_PAGE_UFFD_WP));
+}
+
+static inline int pte_swp_uffd_wp(pte_t pte)
+{
+ return pte_val(pte) & _PAGE_SWP_UFFD_WP;
+}
+
+static inline pte_t pte_swp_mkuffd_wp(pte_t pte)
+{
+ return __pte(pte_val(pte) | _PAGE_SWP_UFFD_WP);
+}
+
+static inline pte_t pte_swp_clear_uffd_wp(pte_t pte)
+{
+ return __pte(pte_val(pte) & ~(_PAGE_SWP_UFFD_WP));
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
/* static inline pte_t pte_mkread(pte_t pte) */
static inline pte_t pte_mkwrite_novma(pte_t pte)
@@ -789,6 +821,38 @@ static inline pmd_t pmd_mkdevmap(pmd_t pmd)
return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
}
+#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
+static inline int pmd_uffd_wp(pmd_t pmd)
+{
+ return pte_uffd_wp(pmd_pte(pmd));
+}
+
+static inline pmd_t pmd_mkuffd_wp(pmd_t pmd)
+{
+ return pte_pmd(pte_mkuffd_wp(pmd_pte(pmd)));
+}
+
+static inline pmd_t pmd_clear_uffd_wp(pmd_t pmd)
+{
+ return pte_pmd(pte_clear_uffd_wp(pmd_pte(pmd)));
+}
+
+static inline int pmd_swp_uffd_wp(pmd_t pmd)
+{
+ return pte_swp_uffd_wp(pmd_pte(pmd));
+}
+
+static inline pmd_t pmd_swp_mkuffd_wp(pmd_t pmd)
+{
+ return pte_pmd(pte_swp_mkuffd_wp(pmd_pte(pmd)));
+}
+
+static inline pmd_t pmd_swp_clear_uffd_wp(pmd_t pmd)
+{
+ return pte_pmd(pte_swp_clear_uffd_wp(pmd_pte(pmd)));
+}
+#endif /* CONFIG_HAVE_ARCH_USERFAULTFD_WP */
+
#ifdef CONFIG_HAVE_ARCH_SOFT_DIRTY
static inline int pmd_soft_dirty(pmd_t pmd)
{
@@ -913,7 +977,7 @@ extern pmd_t pmdp_collapse_flush(struct vm_area_struct *vma,
* Format of swap PTE:
* bit 0: _PAGE_PRESENT (zero)
* bit 1 to 3: _PAGE_LEAF (zero)
- * bit 4: _PAGE_SWP_SOFT_DIRTY
+ * bit 4: _PAGE_SWP_SOFT_DIRTY or _PAGE_SWP_UFFD_WP
* bit 5: _PAGE_PROT_NONE (zero)
* bit 6: exclusive marker
* bits 7 to 11: swap type
--
2.34.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH V5 0/3] riscv: mm: Add soft-dirty and uffd-wp support
2024-11-13 9:58 [PATCH V5 0/3] riscv: mm: Add soft-dirty and uffd-wp support Chunyan Zhang
` (2 preceding siblings ...)
2024-11-13 9:58 ` [PATCH V5 3/3] riscv: mm: Add uffd write-protect support Chunyan Zhang
@ 2025-01-29 8:12 ` Deepak Gupta
2025-03-30 13:51 ` Alexandre Ghiti
4 siblings, 0 replies; 13+ messages in thread
From: Deepak Gupta @ 2025-01-29 8:12 UTC (permalink / raw)
To: Chunyan Zhang
Cc: Palmer Dabbelt, Albert Ou, Paul Walmsley, Alexandre Ghiti,
Andrew Morton, linux-riscv, linux-kernel, Chunyan Zhang
On Wed, Nov 13, 2024 at 05:58:30PM +0800, Chunyan Zhang wrote:
>This patchset adds soft dirty and userfaultfd write protect tracking
>support for RISC-V.
>
>As described in the patches, we are trying to utilize only one free PTE
>bit(9) to support three kernel features (devmap, soft-dirty, uffd-wp).
>Users cannot have them supported at the same time (have to select
>one when building the kernel).
Why do we expect that a user won't be using all three of these kernel
features (devmap, soft-dirty and uffd-wp)? I do understand the part that
their interaction with each other is mutually exclusive, but their usage
(from a user's perspective) is not mutually exclusive. So forcing this
choice on the user at kernel build time is way too restrictive.
Additionally, this forces distros to carry 3 different builds (they don't
know which user is expecting to use which kernel build).
As an example, if I were running microVMs to host something like serverless
(lambda), I could be taking live snapshots, and that would require me to
enable uffd-wp. At the same time, I could be using CRIU to snapshot some
task.
Locking this in at kernel build time takes that choice away.
IMHO, this should be done in a way which doesn't take the choice away
from the user. And if there is no choice left from a sw workaround
perspective, then the right approach would be to ask RISC-V to cough up
more RSW bits.
>
>This patchset has been tested with:
>1) The kselftest mm suite in which soft-dirty, madv_populate,
>test_unmerge_uffd_wp, and uffd-unit-tests run and pass, and no regressions
>are observed in any of the other tests.
>
>2) CRIU:
>- 'criu check --feature mem_dirty_track' returns supported;
>- incremental_dumps[1] and simple_loop [2] dump and restores work fine;
>- zdtm test suite can run under host mode.
>
>This patchset applies on top of v6.12-rc7.
>
>V5:
>- Fixed typos and corrected some words in Kconfig and commit message;
>- Removed pte_wrprotect() from pte_swp_mkuffd_wp(), this is a copy-paste error;
>- Added Alex's Reviewed-by tag in patch 2.
>
>V4:
>- Added bit(4) descriptions into "Format of swap PTE".
>
>V3:
>- Fixed the issue reported by kernel test robot <lkp@intel.com>.
>
>V1 -> V2:
>- Add uffd-wp supported;
>- Make soft-dirty uffd-wp and devmap mutually exclusive which all use the same PTE bit;
>- Add test results of CRIU in the cover-letter.
>
>[1] https://www.criu.org/Incremental_dumps
>[2] https://asciinema.org/a/232445
>
>Chunyan Zhang (3):
> riscv: mm: Prepare for reusing PTE RSW bit(9)
> riscv: mm: Add soft-dirty page tracking support
> riscv: mm: Add uffd write-protect support
>
> arch/riscv/Kconfig | 34 ++++++-
> arch/riscv/include/asm/pgtable-64.h | 2 +-
> arch/riscv/include/asm/pgtable-bits.h | 31 ++++++
> arch/riscv/include/asm/pgtable.h | 133 +++++++++++++++++++++++++-
> 4 files changed, 197 insertions(+), 3 deletions(-)
>
>--
>2.34.1
>
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH V5 1/3] riscv: mm: Prepare for reusing PTE RSW bit(9)
2024-11-13 9:58 ` [PATCH V5 1/3] riscv: mm: Prepare for reusing PTE RSW bit(9) Chunyan Zhang
@ 2025-01-30 8:42 ` Björn Töpel
2025-02-05 0:19 ` Alistair Popple
2025-02-11 1:20 ` Chunyan Zhang
0 siblings, 2 replies; 13+ messages in thread
From: Björn Töpel @ 2025-01-30 8:42 UTC (permalink / raw)
To: Chunyan Zhang, Palmer Dabbelt, Albert Ou, Paul Walmsley,
Alexandre Ghiti, Andrew Morton
Cc: linux-riscv, linux-kernel, Chunyan Zhang, Alistair Popple,
linux-mm
Chunyan Zhang <zhangchunyan@iscas.ac.cn> writes:
> The PTE bit(9) on RISC-V is reserved for software, it is used by devmap
> now which has to be disabled if we want to use bit(9) for other features,
> since there's no more free PTE bit on RISC-V now.
>
> So to make ARCH_HAS_PTE_DEVMAP selectable, this patch uses it as
> the build condition of devmap definitions.
Heads-up: It seems like Alistair's series [1] that removes the devmap
PTE bit will most likely land in 6.15.
Björn
[1] https://lore.kernel.org/linux-mm/cover.11189864684e31260d1408779fac9db80122047b.1736488799.git-series.apopple@nvidia.com/
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH V5 1/3] riscv: mm: Prepare for reusing PTE RSW bit(9)
2025-01-30 8:42 ` Björn Töpel
@ 2025-02-05 0:19 ` Alistair Popple
2025-02-11 1:20 ` Chunyan Zhang
1 sibling, 0 replies; 13+ messages in thread
From: Alistair Popple @ 2025-02-05 0:19 UTC (permalink / raw)
To: Björn Töpel
Cc: Chunyan Zhang, Palmer Dabbelt, Albert Ou, Paul Walmsley,
Alexandre Ghiti, Andrew Morton, linux-riscv, linux-kernel,
Chunyan Zhang, linux-mm
On Thu, Jan 30, 2025 at 09:42:21AM +0100, Björn Töpel wrote:
> Chunyan Zhang <zhangchunyan@iscas.ac.cn> writes:
>
> > The PTE bit(9) on RISC-V is reserved for software, it is used by devmap
> > now which has to be disabled if we want to use bit(9) for other features,
> > since there's no more free PTE bit on RISC-V now.
> >
> > So to make ARCH_HAS_PTE_DEVMAP selectable, this patch uses it as
> > the build condition of devmap definitions.
>
> Heads-up: It seems like Alistair's series [1] that removes the devmap
> PTE bit will most likely land in 6.15.
That is indeed the plan/hope. I just reposted the series based on v6.14-rc1.
Note that I didn't include the devmap PTE change in the repost as I plan on
posting that as a separate followup series but I'm hoping both will make it
for v6.15.
- Alistair
> Björn
>
> [1] https://lore.kernel.org/linux-mm/cover.11189864684e31260d1408779fac9db80122047b.1736488799.git-series.apopple@nvidia.com/
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH V5 1/3] riscv: mm: Prepare for reusing PTE RSW bit(9)
2025-01-30 8:42 ` Björn Töpel
2025-02-05 0:19 ` Alistair Popple
@ 2025-02-11 1:20 ` Chunyan Zhang
2025-02-11 4:01 ` Deepak Gupta
1 sibling, 1 reply; 13+ messages in thread
From: Chunyan Zhang @ 2025-02-11 1:20 UTC (permalink / raw)
To: Björn Töpel
Cc: Chunyan Zhang, Palmer Dabbelt, Albert Ou, Paul Walmsley,
Alexandre Ghiti, Andrew Morton, linux-riscv, linux-kernel,
Alistair Popple, linux-mm
On Thu, 30 Jan 2025 at 16:42, Björn Töpel <bjorn@kernel.org> wrote:
>
> Chunyan Zhang <zhangchunyan@iscas.ac.cn> writes:
>
> > The PTE bit(9) on RISC-V is reserved for software, it is used by devmap
> > now which has to be disabled if we want to use bit(9) for other features,
> > since there's no more free PTE bit on RISC-V now.
> >
> > So to make ARCH_HAS_PTE_DEVMAP selectable, this patch uses it as
> > the build condition of devmap definitions.
>
> Heads-up: It seems like Alistair's series [1] that removes the devmap
> PTE bit will most likely land in 6.15.
Yes, I've been keeping an eye on Alistair's series, and I intend to update
this patchset after Alistair's patch that removes the devmap PTE bit
gets merged.
Thanks,
Chunyan
>
>
> Björn
>
> [1] https://lore.kernel.org/linux-mm/cover.11189864684e31260d1408779fac9db80122047b.1736488799.git-series.apopple@nvidia.com/
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH V5 1/3] riscv: mm: Prepare for reusing PTE RSW bit(9)
2025-02-11 1:20 ` Chunyan Zhang
@ 2025-02-11 4:01 ` Deepak Gupta
2025-02-11 8:05 ` Chunyan Zhang
0 siblings, 1 reply; 13+ messages in thread
From: Deepak Gupta @ 2025-02-11 4:01 UTC (permalink / raw)
To: Chunyan Zhang
Cc: Björn Töpel, Chunyan Zhang, Palmer Dabbelt, Albert Ou,
Paul Walmsley, Alexandre Ghiti, Andrew Morton, linux-riscv,
linux-kernel, Alistair Popple, linux-mm
On Tue, Feb 11, 2025 at 09:20:22AM +0800, Chunyan Zhang wrote:
>On Thu, 30 Jan 2025 at 16:42, Björn Töpel <bjorn@kernel.org> wrote:
>>
>> Chunyan Zhang <zhangchunyan@iscas.ac.cn> writes:
>>
>> > The PTE bit(9) on RISC-V is reserved for software, it is used by devmap
>> > now which has to be disabled if we want to use bit(9) for other features,
>> > since there's no more free PTE bit on RISC-V now.
>> >
>> > So to make ARCH_HAS_PTE_DEVMAP selectable, this patch uses it as
>> > the build condition of devmap definitions.
>>
>> Heads-up: It seems like Alistair's series [1] that removes the devmap
>> PTE bit will most likely land in 6.15.
>
>Yes, I've been keeping an eye on Alistair's series, intended to update
>this patchset after Alistair's patch that removes the devmap PTE bit
>got merged.
Please keep in mind that even after reclaiming the devmap PTE SW bit, a
compile-time decision to select between uffd-wp and soft-dirty is not desirable.
>
>Thanks,
>Chunyan
>>
>>
>> Björn
>>
>> [1] https://lore.kernel.org/linux-mm/cover.11189864684e31260d1408779fac9db80122047b.1736488799.git-series.apopple@nvidia.com/
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH V5 1/3] riscv: mm: Prepare for reusing PTE RSW bit(9)
2025-02-11 4:01 ` Deepak Gupta
@ 2025-02-11 8:05 ` Chunyan Zhang
2025-02-20 18:23 ` Deepak Gupta
0 siblings, 1 reply; 13+ messages in thread
From: Chunyan Zhang @ 2025-02-11 8:05 UTC (permalink / raw)
To: Deepak Gupta
Cc: Björn Töpel, Chunyan Zhang, Palmer Dabbelt, Albert Ou,
Paul Walmsley, Alexandre Ghiti, Andrew Morton, linux-riscv,
linux-kernel, Alistair Popple, linux-mm
On Tue, 11 Feb 2025 at 12:01, Deepak Gupta <debug@rivosinc.com> wrote:
>
> On Tue, Feb 11, 2025 at 09:20:22AM +0800, Chunyan Zhang wrote:
> >On Thu, 30 Jan 2025 at 16:42, Björn Töpel <bjorn@kernel.org> wrote:
> >>
> >> Chunyan Zhang <zhangchunyan@iscas.ac.cn> writes:
> >>
> >> > The PTE bit(9) on RISC-V is reserved for software, it is used by devmap
> >> > now which has to be disabled if we want to use bit(9) for other features,
> >> > since there's no more free PTE bit on RISC-V now.
> >> >
> >> > So to make ARCH_HAS_PTE_DEVMAP selectable, this patch uses it as
> >> > the build condition of devmap definitions.
> >>
> >> Heads-up: It seems like Alistair's series [1] that removes the devmap
> >> PTE bit will most likely land in 6.15.
> >
> >Yes, I've been keeping an eye on Alistair's series, intended to update
> >this patchset after Alistair's patch that removes the devmap PTE bit
> >got merged.
>
> Please keep in mind that even after claiming back devmap PTE SW bit, a compile
> time decision to select between uffd-wp and soft-dirty is not desirable.
Yes, I agree. I've read your other email. I also hope we can have
more RSW bits to use. So should we hold off on adding uffd-wp and
soft-dirty support on RISC-V until we have two RSW bits for these two
functions? Is an undesirable solution better than no solution for now?
I can optimize the code when we have more free RSW bits; that's not hard.
Thanks,
Chunyan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH V5 1/3] riscv: mm: Prepare for reusing PTE RSW bit(9)
2025-02-11 8:05 ` Chunyan Zhang
@ 2025-02-20 18:23 ` Deepak Gupta
0 siblings, 0 replies; 13+ messages in thread
From: Deepak Gupta @ 2025-02-20 18:23 UTC (permalink / raw)
To: Chunyan Zhang
Cc: Björn Töpel, Chunyan Zhang, Palmer Dabbelt, Albert Ou,
Paul Walmsley, Alexandre Ghiti, Andrew Morton, linux-riscv,
linux-kernel, Alistair Popple, linux-mm
Sorry for the late response.
On Tue, Feb 11, 2025 at 04:05:02PM +0800, Chunyan Zhang wrote:
>On Tue, 11 Feb 2025 at 12:01, Deepak Gupta <debug@rivosinc.com> wrote:
>>
>> On Tue, Feb 11, 2025 at 09:20:22AM +0800, Chunyan Zhang wrote:
>> >On Thu, 30 Jan 2025 at 16:42, Björn Töpel <bjorn@kernel.org> wrote:
>> >>
>> >> Chunyan Zhang <zhangchunyan@iscas.ac.cn> writes:
>> >>
>> >> > The PTE bit(9) on RISC-V is reserved for software, it is used by devmap
>> >> > now which has to be disabled if we want to use bit(9) for other features,
>> >> > since there's no more free PTE bit on RISC-V now.
>> >> >
>> >> > So to make ARCH_HAS_PTE_DEVMAP selectable, this patch uses it as
>> >> > the build condition of devmap definitions.
>> >>
>> >> Heads-up: It seems like Alistair's series [1] that removes the devmap
>> >> PTE bit will most likely land in 6.15.
>> >
>> >Yes, I've been keeping an eye on Alistair's series, intended to update
>> >this patchset after Alistair's patch that removes the devmap PTE bit
>> >got merged.
>>
>> Please keep in mind that even after claiming back devmap PTE SW bit, a compile
>> time decision to select between uffd-wp and soft-dirty is not desirable.
>
>Yes, I agree. I've read your other email. I also hope we can have
>more RSW bits to use. So should we add uffd-wp and soft-dirty support
>on RISC-V until we have two RSW bits for these two functions? Is an
>undesirable solution better than no solution for now?
The problem is that this undesirable solution doesn't solve anything for *most* users.
The kernel can't deviate from providing functionality (which is actually arch-agnostic) to
user mode depending on the architecture.
>I can optimize the code when we have more free RSW bits, that's not hard.
We have 3 use cases:
- pfnmap/pte_special
- uffd-wp
- softdirty
A 4th one would be devmap, but I hope we don't need to handle it; we should get that bit back.
https://lore.kernel.org/lkml/cover.95ff0627bc727f2bae44bea4c00ad7a83fbbcfac.1739941374.git-series.apopple@nvidia.com/#r
It looks like any work there would be wasted time.
There is a (fast track) proposal out there to get 2 more RSW bits.
https://lists.riscv.org/g/tech-privileged/message/2268
I hope it gets ratified soon. We will have a proper solution to this problem then.
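
To illustrate what two additional RSW bits would allow, here is a hedged userspace sketch in which soft-dirty and uffd-wp each get their own bit and can be enabled together. It assumes, purely for illustration, that the extra bits are PTE bits 59 and 60. The RSW_* names and the assignment of features to bits are hypothetical and are not taken from any posted patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical assignment for illustration only: assumes two extra
 * software-reserved PTE bits at positions 59 and 60.
 */
#define RSW_PAGE_SPECIAL	(UINT64_C(1) << 8)	/* existing RSW bit(8) */
#define RSW_PAGE_SOFT_DIRTY	(UINT64_C(1) << 59)	/* assumed new RSW bit */
#define RSW_PAGE_UFFD_WP	(UINT64_C(1) << 60)	/* assumed new RSW bit */

static_assert((RSW_PAGE_SOFT_DIRTY & RSW_PAGE_UFFD_WP) == 0 &&
	      (RSW_PAGE_SPECIAL & (RSW_PAGE_SOFT_DIRTY | RSW_PAGE_UFFD_WP)) == 0,
	      "each feature needs its own PTE bit");

static inline uint64_t rsw_mark_both(uint64_t pte)
{
	return pte | RSW_PAGE_SOFT_DIRTY | RSW_PAGE_UFFD_WP;
}

static inline int rsw_soft_dirty(uint64_t pte)
{
	return (pte & RSW_PAGE_SOFT_DIRTY) != 0;
}

static inline int rsw_uffd_wp(uint64_t pte)
{
	return (pte & RSW_PAGE_UFFD_WP) != 0;
}

/* Clearing one marker no longer disturbs the other. */
static inline uint64_t rsw_clear_soft_dirty(uint64_t pte)
{
	return pte & ~RSW_PAGE_SOFT_DIRTY;
}
```

With distinct bits there is no need for a Kconfig choice between the features, which is the build-time restriction objected to earlier in this thread.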
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH V5 0/3] riscv: mm: Add soft-dirty and uffd-wp support
2024-11-13 9:58 [PATCH V5 0/3] riscv: mm: Add soft-dirty and uffd-wp support Chunyan Zhang
` (3 preceding siblings ...)
2025-01-29 8:12 ` [PATCH V5 0/3] riscv: mm: Add soft-dirty and uffd-wp support Deepak Gupta
@ 2025-03-30 13:51 ` Alexandre Ghiti
2025-03-31 2:00 ` Chunyan Zhang
4 siblings, 1 reply; 13+ messages in thread
From: Alexandre Ghiti @ 2025-03-30 13:51 UTC (permalink / raw)
To: Chunyan Zhang, Palmer Dabbelt, Albert Ou, Paul Walmsley,
Andrew Morton
Cc: linux-riscv, linux-kernel, Chunyan Zhang
Hi Chunyan,
On 13/11/2024 10:58, Chunyan Zhang wrote:
> This patchset adds soft dirty and userfaultfd write protect tracking
> support for RISC-V.
>
> As described in the patches, we are trying to utilize only one free PTE
> bit(9) to support three kernel features (devmap, soft-dirty, uffd-wp).
> Users cannot have them supported at the same time (have to select
> one when building the kernel).
>
> This patchset has been tested with:
> 1) The kselftest mm suite in which soft-dirty, madv_populate,
> test_unmerge_uffd_wp, and uffd-unit-tests run and pass, and no regressions
> are observed in any of the other tests.
>
> 2) CRIU:
> - 'criu check --feature mem_dirty_track' returns supported;
> - incremental_dumps[1] and simple_loop [2] dump and restores work fine;
> - zdtm test suite can run under host mode.
>
> This patchset applies on top of v6.12-rc7.
>
> V5:
> - Fixed typos and corrected some words in Kconfig and commit message;
> - Removed pte_wrprotect() from pte_swp_mkuffd_wp(), this is a copy-paste error;
> - Added Alex's Reviewed-by tag in patch 2.
>
> V4:
> - Added bit(4) descriptions into "Format of swap PTE".
>
> V3:
> - Fixed the issue reported by kernel test robot <lkp@intel.com>.
>
> V1 -> V2:
> - Add uffd-wp supported;
> - Make soft-dirty uffd-wp and devmap mutually exclusive which all use the same PTE bit;
> - Add test results of CRIU in the cover-letter.
>
> [1] https://www.criu.org/Incremental_dumps
> [2] https://asciinema.org/a/232445
>
> Chunyan Zhang (3):
> riscv: mm: Prepare for reusing PTE RSW bit(9)
> riscv: mm: Add soft-dirty page tracking support
> riscv: mm: Add uffd write-protect support
>
> arch/riscv/Kconfig | 34 ++++++-
> arch/riscv/include/asm/pgtable-64.h | 2 +-
> arch/riscv/include/asm/pgtable-bits.h | 31 ++++++
> arch/riscv/include/asm/pgtable.h | 133 +++++++++++++++++++++++++-
> 4 files changed, 197 insertions(+), 3 deletions(-)
As mentioned by Deepak, there is a new proposed extension Svrsw60t59b
which will free 2 more bits. It would help if you can come up with a new
version of this patchset using this new extension, would you mind
working on this? If not possible, let's discuss how I can help.
Thanks,
Alex
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH V5 0/3] riscv: mm: Add soft-dirty and uffd-wp support
2025-03-30 13:51 ` Alexandre Ghiti
@ 2025-03-31 2:00 ` Chunyan Zhang
0 siblings, 0 replies; 13+ messages in thread
From: Chunyan Zhang @ 2025-03-31 2:00 UTC (permalink / raw)
To: Alexandre Ghiti
Cc: Chunyan Zhang, Palmer Dabbelt, Albert Ou, Paul Walmsley,
Andrew Morton, linux-riscv, linux-kernel
Hi Alex,
On Sun, 30 Mar 2025 at 21:51, Alexandre Ghiti <alex@ghiti.fr> wrote:
>
> Hi Chunyan,
>
> On 13/11/2024 10:58, Chunyan Zhang wrote:
> > This patchset adds soft dirty and userfaultfd write protect tracking
> > support for RISC-V.
> >
> > As described in the patches, we are trying to utilize only one free PTE
> > bit(9) to support three kernel features (devmap, soft-dirty, uffd-wp).
> > Users cannot have them supported at the same time (have to select
> > one when building the kernel).
> >
> > This patchset has been tested with:
> > 1) The kselftest mm suite in which soft-dirty, madv_populate,
> > test_unmerge_uffd_wp, and uffd-unit-tests run and pass, and no regressions
> > are observed in any of the other tests.
> >
> > 2) CRIU:
> > - 'criu check --feature mem_dirty_track' returns supported;
> > - incremental_dumps[1] and simple_loop [2] dump and restores work fine;
> > - zdtm test suite can run under host mode.
> >
> > This patchset applies on top of v6.12-rc7.
> >
> > V5:
> > - Fixed typos and corrected some words in Kconfig and commit message;
> > - Removed pte_wrprotect() from pte_swp_mkuffd_wp(), this is a copy-paste error;
> > - Added Alex's Reviewed-by tag in patch 2.
> >
> > V4:
> > - Added bit(4) descriptions into "Format of swap PTE".
> >
> > V3:
> > - Fixed the issue reported by kernel test robot <lkp@intel.com>.
> >
> > V1 -> V2:
> > - Add uffd-wp supported;
> > - Make soft-dirty uffd-wp and devmap mutually exclusive which all use the same PTE bit;
> > - Add test results of CRIU in the cover-letter.
> >
> > [1] https://www.criu.org/Incremental_dumps
> > [2] https://asciinema.org/a/232445
> >
> > Chunyan Zhang (3):
> > riscv: mm: Prepare for reusing PTE RSW bit(9)
> > riscv: mm: Add soft-dirty page tracking support
> > riscv: mm: Add uffd write-protect support
> >
> > arch/riscv/Kconfig | 34 ++++++-
> > arch/riscv/include/asm/pgtable-64.h | 2 +-
> > arch/riscv/include/asm/pgtable-bits.h | 31 ++++++
> > arch/riscv/include/asm/pgtable.h | 133 +++++++++++++++++++++++++-
> > 4 files changed, 197 insertions(+), 3 deletions(-)
>
>
> As mentioned by Deepak, there is a new proposed extension Svrsw60t59b
> which will free 2 more bits. It would help if you can come up with a new
> version of this patchset using this new extension, would you mind
Sure, I will cook up a new patchset using Svrsw60t59b, and will submit
the new patchset after v6.15-rc1 is released.
> working on this? If not possible, let's discuss how I can help.
No worries, and thanks for the reminder. I hadn't noticed that this new
extension is supported in the kernel.
Thanks,
Chunyan
>
> Thanks,
>
> Alex
>
^ permalink raw reply [flat|nested] 13+ messages in thread