Linux-ARM-Kernel Archive on lore.kernel.org
* [RFC PATCH 0/3] make persistent huge zero folio read-only
@ 2026-05-27  3:56 Xueyuan chen
  2026-05-27  3:56 ` [RFC PATCH 1/3] mm: " Xueyuan chen
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Xueyuan chen @ 2026-05-27  3:56 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, linux-arm-kernel, x86, catalin.marinas, will, tglx,
	mingo, bp, dave.hansen, hpa, david, ljs, ziy, baolin.wang,
	ryan.roberts, dev.jain, lance.yang, yang, jannh, Xueyuan Chen

From: Xueyuan Chen <xueyuan.chen21@gmail.com>

Hi all,

This series makes the persistent huge zero folio read-only in the direct
map.

The motivation comes from Jann Horn's read-only zero page work[1] and the
follow-up discussion[2] with Yang Shi. As Jann pointed out, the kernel has
had bugs, including security bugs, where pages taken with read-only
semantics were later written to. For the huge zero folio, making the direct
map read-only turns such writes into faults instead of silently corrupting
shared zero contents.

The permission change is best effort. If the architecture cannot safely
make the direct map read-only, the kernel keeps using the writable
persistent huge zero folio.
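
As a rough userspace analogy of this best-effort behaviour (not the kernel
code from this series), the sketch below maps a zero-filled region and then
tries to drop write permission with mprotect(). The names `zero_region` and
`zero_region_init()` are hypothetical and exist only for illustration. If the
permission change fails, the region simply stays writable and the caller
carries on.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical userspace stand-in for the shared zero folio. */
struct zero_region {
	void *addr;
	size_t len;
	bool readonly;	/* true only if the permission change succeeded */
};

/*
 * Map a zero-filled anonymous region, then try to drop write permission.
 * Returns false only if the mapping itself fails. A failed mprotect()
 * is tolerated: the region stays writable and readonly is left false.
 */
static bool zero_region_init(struct zero_region *zr, size_t len)
{
	void *p;

	assert(zr != NULL && len > 0);

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return false;

	zr->addr = p;
	zr->len = len;
	zr->readonly = (mprotect(p, len, PROT_READ) == 0);
	return true;
}
```

A caller would check only the return value for success and could log
`zr.readonly` as a diagnostic, much as the series warns when the permission
change fails but keeps the folio in use.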

Patch 1 adds the generic support for making the persistent huge zero folio
read-only. Patches 2 and 3 add arm64 and x86 support.

[1] https://lore.kernel.org/linux-mm/20260508-ro-zeropage-v1-1-9808abc20b49@google.com/
[2] https://lore.kernel.org/linux-mm/CAHbLzkrXXe7r3n3jXgDKtwZhRqj=jDx9E6dLOULohnhBguvi9A@mail.gmail.com/

Xueyuan Chen (3):
  mm: make persistent huge zero folio read-only
  arm64/mm: make huge zero folio read-only in linear map
  x86/mm: make huge zero folio read-only in direct map

 arch/arm64/Kconfig       |  1 +
 arch/arm64/mm/pageattr.c | 16 ++++++++++++++++
 arch/x86/Kconfig         |  1 +
 arch/x86/mm/init.c       | 11 +++++++++++
 include/linux/huge_mm.h  |  5 +++++
 mm/Kconfig               | 17 +++++++++++++++++
 mm/huge_memory.c         | 25 ++++++++++++++++++++++++-
 7 files changed, 75 insertions(+), 1 deletion(-)

-- 
2.47.3




* [RFC PATCH 1/3] mm: make persistent huge zero folio read-only
  2026-05-27  3:56 [RFC PATCH 0/3] make persistent huge zero folio read-only Xueyuan chen
@ 2026-05-27  3:56 ` Xueyuan chen
  2026-05-27 13:32   ` Dev Jain
  2026-05-27 15:55   ` Dave Hansen
  2026-05-27  3:56 ` [RFC PATCH 2/3] arm64/mm: make huge zero folio read-only in linear map Xueyuan chen
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 9+ messages in thread
From: Xueyuan chen @ 2026-05-27  3:56 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, linux-arm-kernel, x86, catalin.marinas, will, tglx,
	mingo, bp, dave.hansen, hpa, david, ljs, ziy, baolin.wang,
	ryan.roberts, dev.jain, lance.yang, yang, jannh, Xueyuan Chen

From: Xueyuan Chen <xueyuan.chen21@gmail.com>

The huge zero folio is shared globally, and its contents should never
change after initialization. As Jann Horn pointed out[1], the kernel has
had bugs, including security bugs, where read-only pages were later written
to. If the huge zero folio is read-only in the direct map, such writes
fault instead of silently corrupting shared zero contents.

For the persistent huge zero folio, set this up once after the folio is
allocated at boot.

The permission change is best-effort. If the architecture cannot safely
make the direct map read-only, keep using the writable persistent huge zero
folio.

While at it, mark the huge_zero_folio pointer itself __ro_after_init.
READONLY_HUGE_ZERO_FOLIO depends on PERSISTENT_HUGE_ZERO_FOLIO, so the
pointer is initialized during boot and never replaced.

This was inspired by Jann Horn's read-only zero page work[1] and follow-up
discussion[2] with Yang Shi.

[1] https://lore.kernel.org/linux-mm/20260508-ro-zeropage-v1-1-9808abc20b49@google.com/
[2] https://lore.kernel.org/linux-mm/CAHbLzkrXXe7r3n3jXgDKtwZhRqj=jDx9E6dLOULohnhBguvi9A@mail.gmail.com/

Co-developed-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
 include/linux/huge_mm.h |  5 +++++
 mm/Kconfig              | 17 +++++++++++++++++
 mm/huge_memory.c        | 25 ++++++++++++++++++++++++-
 3 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index edece3e26985..45d1352619d1 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -5,6 +5,7 @@
 #include <linux/mm_types.h>
 
 #include <linux/fs.h> /* only for vma_is_dax() */
+#include <linux/init.h>
 #include <linux/kobject.h>
 
 vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf);
@@ -554,6 +555,10 @@ static inline bool is_huge_zero_pmd(pmd_t pmd)
 struct folio *mm_get_huge_zero_folio(struct mm_struct *mm);
 void mm_put_huge_zero_folio(struct mm_struct *mm);
 
+#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
+bool __init arch_make_huge_zero_folio_readonly(struct folio *folio);
+#endif
+
 static inline struct folio *get_persistent_huge_zero_folio(void)
 {
 	if (!IS_ENABLED(CONFIG_PERSISTENT_HUGE_ZERO_FOLIO))
diff --git a/mm/Kconfig b/mm/Kconfig
index 776b67c66e82..f31200816646 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -787,6 +787,23 @@ config PERSISTENT_HUGE_ZERO_FOLIO
 	  Say Y if your system has lots of memory. Say N if you are
 	  memory constrained.
 
+config ARCH_HAS_READONLY_HUGE_ZERO_FOLIO
+	bool
+
+config READONLY_HUGE_ZERO_FOLIO
+	bool "Map the huge zero folio read-only in the direct map"
+	depends on PERSISTENT_HUGE_ZERO_FOLIO
+	depends on ARCH_HAS_READONLY_HUGE_ZERO_FOLIO
+	help
+	  The persistent huge zero folio is shared globally, and nothing
+	  should ever change its contents after initialization.
+
+	  When supported, mark the folio read-only in the direct map so such
+	  writes trigger a fault instead of silently corrupting the zero contents.
+
+	  If the permission change is not supported, the kernel keeps using
+	  the writable persistent huge zero folio.
+
 config MM_ID
 	def_bool n
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index bf9b480bb3b0..c568755dd58e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -75,7 +75,11 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 static bool split_underused_thp = true;
 
 static atomic_t huge_zero_refcount;
+#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
+struct folio *huge_zero_folio __ro_after_init;
+#else
 struct folio *huge_zero_folio __read_mostly;
+#endif
 unsigned long huge_zero_pfn __read_mostly = ~0UL;
 unsigned long huge_anon_orders_always __read_mostly;
 unsigned long huge_anon_orders_madvise __read_mostly;
@@ -305,6 +309,18 @@ static unsigned long shrink_huge_zero_folio_scan(struct shrinker *shrink,
 	return 0;
 }
 
+#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
+static bool __init make_huge_zero_folio_readonly(void)
+{
+	return arch_make_huge_zero_folio_readonly(READ_ONCE(huge_zero_folio));
+}
+#else
+static bool __init make_huge_zero_folio_readonly(void)
+{
+	return false;
+}
+#endif
+
 static struct shrinker *huge_zero_folio_shrinker;
 
 #ifdef CONFIG_SYSFS
@@ -965,8 +981,15 @@ static int __init thp_shrinker_init(void)
 		 * that get_huge_zero_folio() will most likely not fail as
 		 * thp_shrinker_init() is invoked early on during boot.
 		 */
-		if (!get_huge_zero_folio())
+		if (!get_huge_zero_folio()) {
 			pr_warn("Allocating persistent huge zero folio failed\n");
+			return 0;
+		}
+
+		if (IS_ENABLED(CONFIG_READONLY_HUGE_ZERO_FOLIO) &&
+		    !make_huge_zero_folio_readonly())
+			pr_warn("Making persistent huge zero folio read-only failed\n");
+
 		return 0;
 	}
 
-- 
2.47.3




* [RFC PATCH 2/3] arm64/mm: make huge zero folio read-only in linear map
  2026-05-27  3:56 [RFC PATCH 0/3] make persistent huge zero folio read-only Xueyuan chen
  2026-05-27  3:56 ` [RFC PATCH 1/3] mm: " Xueyuan chen
@ 2026-05-27  3:56 ` Xueyuan chen
  2026-05-27  3:56 ` [RFC PATCH 3/3] x86/mm: make huge zero folio read-only in direct map Xueyuan chen
  2026-05-27 15:58 ` [RFC PATCH 0/3] make persistent huge zero folio read-only Dave Hansen
  3 siblings, 0 replies; 9+ messages in thread
From: Xueyuan chen @ 2026-05-27  3:56 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, linux-arm-kernel, x86, catalin.marinas, will, tglx,
	mingo, bp, dave.hansen, hpa, david, ljs, ziy, baolin.wang,
	ryan.roberts, dev.jain, lance.yang, yang, jannh, Xueyuan Chen

From: Xueyuan Chen <xueyuan.chen21@gmail.com>

Implement arch_make_huge_zero_folio_readonly() for arm64. Once allocated,
try to make the folio read-only in the linear map so unexpected writes
fault instead of corrupting shared zero contents.

Respect can_set_direct_map() before touching the linear map, and treat the
pageattr update as best effort: it can still fail while splitting a leaf
mapping or applying new permissions. If that happens, generic THP keeps
using the writable persistent huge zero folio.

Co-developed-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
 arch/arm64/Kconfig       |  1 +
 arch/arm64/mm/pageattr.c | 16 ++++++++++++++++
 2 files changed, 17 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fe60738e5943..3cd705dd5251 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -44,6 +44,7 @@ config ARM64
 	select ARCH_HAS_PREEMPT_LAZY
 	select ARCH_HAS_PTDUMP
 	select ARCH_HAS_PTE_SPECIAL
+	select ARCH_HAS_READONLY_HUGE_ZERO_FOLIO
 	select ARCH_HAS_HW_PTE_YOUNG
 	select ARCH_HAS_SETUP_DMA_OPS
 	select ARCH_HAS_SET_DIRECT_MAP
diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
index ce035e1b4eaf..51ce31e74a18 100644
--- a/arch/arm64/mm/pageattr.c
+++ b/arch/arm64/mm/pageattr.c
@@ -3,7 +3,9 @@
  * Copyright (c) 2014, The Linux Foundation. All rights reserved.
  */
 #include <linux/kernel.h>
+#include <linux/init.h>
 #include <linux/mm.h>
+#include <linux/huge_mm.h>
 #include <linux/module.h>
 #include <linux/mem_encrypt.h>
 #include <linux/sched.h>
@@ -147,6 +149,20 @@ static int __change_memory_common(unsigned long start, unsigned long size,
 	return ret;
 }
 
+#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
+bool __init arch_make_huge_zero_folio_readonly(struct folio *folio)
+{
+	unsigned long addr = (unsigned long)folio_address(folio);
+
+	if (!can_set_direct_map())
+		return false;
+
+	return !__change_memory_common(addr, PMD_SIZE,
+				       __pgprot(PTE_RDONLY),
+				       __pgprot(PTE_WRITE));
+}
+#endif
+
 static int change_memory_common(unsigned long addr, int numpages,
 				pgprot_t set_mask, pgprot_t clear_mask)
 {
-- 
2.47.3




* [RFC PATCH 3/3] x86/mm: make huge zero folio read-only in direct map
  2026-05-27  3:56 [RFC PATCH 0/3] make persistent huge zero folio read-only Xueyuan chen
  2026-05-27  3:56 ` [RFC PATCH 1/3] mm: " Xueyuan chen
  2026-05-27  3:56 ` [RFC PATCH 2/3] arm64/mm: make huge zero folio read-only in linear map Xueyuan chen
@ 2026-05-27  3:56 ` Xueyuan chen
  2026-05-27 15:58 ` [RFC PATCH 0/3] make persistent huge zero folio read-only Dave Hansen
  3 siblings, 0 replies; 9+ messages in thread
From: Xueyuan chen @ 2026-05-27  3:56 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, linux-arm-kernel, x86, catalin.marinas, will, tglx,
	mingo, bp, dave.hansen, hpa, david, ljs, ziy, baolin.wang,
	ryan.roberts, dev.jain, lance.yang, yang, jannh, Xueyuan Chen

From: Xueyuan Chen <xueyuan.chen21@gmail.com>

Implement arch_make_huge_zero_folio_readonly() for x86-64. Once allocated,
try to make the folio read-only in the direct map so unexpected writes
fault instead of corrupting shared zero contents.

The set_memory_ro() update is best effort: if it fails, generic THP keeps
using the writable persistent huge zero folio.

Co-developed-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Lance Yang <lance.yang@linux.dev>
Signed-off-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
 arch/x86/Kconfig   |  1 +
 arch/x86/mm/init.c | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index f3f7cb01d69d..81f9478d2803 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -24,6 +24,7 @@ config X86_64
 	def_bool y
 	depends on 64BIT
 	# Options that are inherently 64-bit kernel only:
+	select ARCH_HAS_READONLY_HUGE_ZERO_FOLIO
 	select ARCH_HAS_GIGANTIC_PAGE
 	select ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS
 	select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index fb67217fddcd..ef721aa2ff0c 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -3,6 +3,8 @@
 #include <linux/ioport.h>
 #include <linux/swap.h>
 #include <linux/memblock.h>
+#include <linux/mm.h>
+#include <linux/huge_mm.h>
 #include <linux/swapfile.h>
 #include <linux/swapops.h>
 #include <linux/kmemleak.h>
@@ -38,6 +40,15 @@
 
 #include "mm_internal.h"
 
+#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
+bool __init arch_make_huge_zero_folio_readonly(struct folio *folio)
+{
+	unsigned long addr = (unsigned long)folio_address(folio);
+
+	return !set_memory_ro(addr, HPAGE_PMD_NR);
+}
+#endif
+
 /*
  * Tables translating between page_cache_type_t and pte encoding.
  *
-- 
2.47.3




* Re: [RFC PATCH 1/3] mm: make persistent huge zero folio read-only
  2026-05-27  3:56 ` [RFC PATCH 1/3] mm: " Xueyuan chen
@ 2026-05-27 13:32   ` Dev Jain
  2026-05-27 23:03     ` Xueyuan Chen
  2026-05-27 15:55   ` Dave Hansen
  1 sibling, 1 reply; 9+ messages in thread
From: Dev Jain @ 2026-05-27 13:32 UTC (permalink / raw)
  To: Xueyuan chen, akpm, linux-mm
  Cc: linux-kernel, linux-arm-kernel, x86, catalin.marinas, will, tglx,
	mingo, bp, dave.hansen, hpa, david, ljs, ziy, baolin.wang,
	ryan.roberts, lance.yang, yang, jannh



On 27/05/26 9:26 am, Xueyuan chen wrote:
> From: Xueyuan Chen <xueyuan.chen21@gmail.com>
> 
> The huge zero folio is shared globally, and its contents should never
> change after initialization. As Jann Horn pointed out[1], the kernel has
> had bugs, including security bugs, where read-only pages were later written
> to. If the huge zero folio is read-only in the direct map, such writes
> fault instead of silently corrupting shared zero contents.
> 
> For the persistent huge zero folio, set this up once after the folio is
> allocated at boot.
> 
> The permission change is best-effort. If the architecture cannot safely
> make the direct map read-only, keep using the writable persistent huge zero
> folio.
> 
> While at it, mark the huge_zero_folio pointer itself __ro_after_init.
> READONLY_HUGE_ZERO_FOLIO depends on PERSISTENT_HUGE_ZERO_FOLIO, so the
> pointer is initialized during boot and never replaced.
> 
> This was inspired by Jann Horn's read-only zero page work[1] and follow-up
> discussion[2] with Yang Shi.
> 
> [1] https://lore.kernel.org/linux-mm/20260508-ro-zeropage-v1-1-9808abc20b49@google.com/
> [2] https://lore.kernel.org/linux-mm/CAHbLzkrXXe7r3n3jXgDKtwZhRqj=jDx9E6dLOULohnhBguvi9A@mail.gmail.com/
> 
> Co-developed-by: Lance Yang <lance.yang@linux.dev>
> Signed-off-by: Lance Yang <lance.yang@linux.dev>
> Signed-off-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
> ---
>  include/linux/huge_mm.h |  5 +++++
>  mm/Kconfig              | 17 +++++++++++++++++
>  mm/huge_memory.c        | 25 ++++++++++++++++++++++++-
>  3 files changed, 46 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index edece3e26985..45d1352619d1 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -5,6 +5,7 @@
>  #include <linux/mm_types.h>
>  
>  #include <linux/fs.h> /* only for vma_is_dax() */
> +#include <linux/init.h>
>  #include <linux/kobject.h>
>  
>  vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf);
> @@ -554,6 +555,10 @@ static inline bool is_huge_zero_pmd(pmd_t pmd)
>  struct folio *mm_get_huge_zero_folio(struct mm_struct *mm);
>  void mm_put_huge_zero_folio(struct mm_struct *mm);
>  
> +#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
> +bool __init arch_make_huge_zero_folio_readonly(struct folio *folio);
> +#endif
> +
>  static inline struct folio *get_persistent_huge_zero_folio(void)
>  {
>  	if (!IS_ENABLED(CONFIG_PERSISTENT_HUGE_ZERO_FOLIO))
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 776b67c66e82..f31200816646 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -787,6 +787,23 @@ config PERSISTENT_HUGE_ZERO_FOLIO
>  	  Say Y if your system has lots of memory. Say N if you are
>  	  memory constrained.
>  
> +config ARCH_HAS_READONLY_HUGE_ZERO_FOLIO
> +	bool
> +
> +config READONLY_HUGE_ZERO_FOLIO
> +	bool "Map the huge zero folio read-only in the direct map"
> +	depends on PERSISTENT_HUGE_ZERO_FOLIO
> +	depends on ARCH_HAS_READONLY_HUGE_ZERO_FOLIO
> +	help
> +	  The persistent huge zero folio is shared globally, and nothing
> +	  should ever change its contents after initialization.
> +
> +	  When supported, mark the folio read-only in the direct map so such
> +	  writes trigger a fault instead of silently corrupting the zero contents.
> +
> +	  If the permission change is not supported, the kernel keeps using
> +	  the writable persistent huge zero folio.
> +
>  config MM_ID
>  	def_bool n
>  
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index bf9b480bb3b0..c568755dd58e 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -75,7 +75,11 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>  static bool split_underused_thp = true;
>  
>  static atomic_t huge_zero_refcount;
> +#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
> +struct folio *huge_zero_folio __ro_after_init;


Can we guard this with CONFIG_PERSISTENT_HUGE_ZERO_FOLIO? Since
in that case too the pointer is never written after init.

> +#else
>  struct folio *huge_zero_folio __read_mostly;
> +#endif
>  unsigned long huge_zero_pfn __read_mostly = ~0UL;
>  unsigned long huge_anon_orders_always __read_mostly;
>  unsigned long huge_anon_orders_madvise __read_mostly;
> @@ -305,6 +309,18 @@ static unsigned long shrink_huge_zero_folio_scan(struct shrinker *shrink,
>  	return 0;
>  }
>  
> +#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
> +static bool __init make_huge_zero_folio_readonly(void)
> +{
> +	return arch_make_huge_zero_folio_readonly(READ_ONCE(huge_zero_folio));


I think READ_ONCE is not required since no one is going to change the
pointer after allocation?


> +}
> +#else
> +static bool __init make_huge_zero_folio_readonly(void)
> +{
> +	return false;
> +}
> +#endif
> +
>  static struct shrinker *huge_zero_folio_shrinker;
>  
>  #ifdef CONFIG_SYSFS
> @@ -965,8 +981,15 @@ static int __init thp_shrinker_init(void)
>  		 * that get_huge_zero_folio() will most likely not fail as
>  		 * thp_shrinker_init() is invoked early on during boot.
>  		 */
> -		if (!get_huge_zero_folio())
> +		if (!get_huge_zero_folio()) {
>  			pr_warn("Allocating persistent huge zero folio failed\n");
> +			return 0;
> +		}
> +
> +		if (IS_ENABLED(CONFIG_READONLY_HUGE_ZERO_FOLIO) &&
> +		    !make_huge_zero_folio_readonly())
> +			pr_warn("Making persistent huge zero folio read-only failed\n");
> +
>  		return 0;
>  	}
>  




* Re: [RFC PATCH 1/3] mm: make persistent huge zero folio read-only
  2026-05-27  3:56 ` [RFC PATCH 1/3] mm: " Xueyuan chen
  2026-05-27 13:32   ` Dev Jain
@ 2026-05-27 15:55   ` Dave Hansen
  2026-05-27 16:20     ` Jann Horn
  1 sibling, 1 reply; 9+ messages in thread
From: Dave Hansen @ 2026-05-27 15:55 UTC (permalink / raw)
  To: Xueyuan chen, akpm, linux-mm
  Cc: linux-kernel, linux-arm-kernel, x86, catalin.marinas, will, tglx,
	mingo, bp, dave.hansen, hpa, david, ljs, ziy, baolin.wang,
	ryan.roberts, dev.jain, lance.yang, yang, jannh

On 5/26/26 20:56, Xueyuan chen wrote:
> +#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
> +bool __init arch_make_huge_zero_folio_readonly(struct folio *folio);
> +#endif

All of the #ifdeffery needs to die, IMNHO.

This function is also a bad idea. There is nothing "huge zero" specific
about it. It takes any old folio and tries to make it read only.

Just make it:

	bool __init arch_make_folio_readonly(struct folio *folio)

Make it a weak symbol with a stub in some mm/foo.c file, and then the
architectures can override it if they want.
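
A minimal userspace sketch of the weak-symbol pattern described above,
assuming the GCC/Clang `__attribute__((weak))` spelling (the kernel wraps
this as `__weak`). `struct folio` is an opaque stand-in here, not the
kernel type, and `describe_zero_folio()` is a hypothetical caller added only
to show that no #ifdef is needed at the call site.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct folio;	/* opaque stand-in, not the kernel's struct folio */

/*
 * Generic default: report that nothing was changed. An architecture
 * could supply a non-weak definition with the same signature in another
 * object file, and the linker would pick that one instead.
 */
__attribute__((weak)) bool arch_make_folio_readonly(struct folio *folio)
{
	(void)folio;
	return false;
}

/* Hypothetical caller: no #ifdef, it just acts on the return value. */
static const char *describe_zero_folio(struct folio *folio)
{
	return arch_make_folio_readonly(folio) ? "read-only" : "writable";
}
```

With no override linked in, `describe_zero_folio(NULL)` returns "writable",
which corresponds to the fallback path where the folio stays writable.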

>  static inline struct folio *get_persistent_huge_zero_folio(void)
>  {
>  	if (!IS_ENABLED(CONFIG_PERSISTENT_HUGE_ZERO_FOLIO))
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 776b67c66e82..f31200816646 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -787,6 +787,23 @@ config PERSISTENT_HUGE_ZERO_FOLIO
>  	  Say Y if your system has lots of memory. Say N if you are
>  	  memory constrained.
>  
> +config ARCH_HAS_READONLY_HUGE_ZERO_FOLIO
> +	bool
> +
> +config READONLY_HUGE_ZERO_FOLIO
> +	bool "Map the huge zero folio read-only in the direct map"
> +	depends on PERSISTENT_HUGE_ZERO_FOLIO
> +	depends on ARCH_HAS_READONLY_HUGE_ZERO_FOLIO
> +	help
> +	  The persistent huge zero folio is shared globally, and nothing
> +	  should ever change its contents after initialization.
> +
> +	  When supported, mark the folio read-only in the direct map so such
> +	  writes trigger a fault instead of silently corrupting the zero contents.
> +
> +	  If the permission change is not supported, the kernel keeps using
> +	  the writable persistent huge zero folio.

I vote for no Kconfig options here. Why? This adds "security" with
_basically_ no extra runtime cost. The runtime cost is, what, usually
one kernel TLB invalidation during boot?

If you desperately need one, you can add it without a prompt so folks
need to edit .config files to change the values.

> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index bf9b480bb3b0..c568755dd58e 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -75,7 +75,11 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
>  static bool split_underused_thp = true;
>  
>  static atomic_t huge_zero_refcount;
> +#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
> +struct folio *huge_zero_folio __ro_after_init;
> +#else
>  struct folio *huge_zero_folio __read_mostly;
> +#endif

#ifdefs bad. Bad. Bad. Bad. ;)

Seriously, just pick one. Lowest common denominator is more important
than being precisely correct all the time.

>  unsigned long huge_zero_pfn __read_mostly = ~0UL;
>  unsigned long huge_anon_orders_always __read_mostly;
>  unsigned long huge_anon_orders_madvise __read_mostly;
> @@ -305,6 +309,18 @@ static unsigned long shrink_huge_zero_folio_scan(struct shrinker *shrink,
>  	return 0;
>  }
>  
> +#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
> +static bool __init make_huge_zero_folio_readonly(void)
> +{
> +	return arch_make_huge_zero_folio_readonly(READ_ONCE(huge_zero_folio));
> +}
> +#else
> +static bool __init make_huge_zero_folio_readonly(void)
> +{
> +	return false;
> +}
> +#endif
> +
>  static struct shrinker *huge_zero_folio_shrinker;
>  
>  #ifdef CONFIG_SYSFS
> @@ -965,8 +981,15 @@ static int __init thp_shrinker_init(void)
>  		 * that get_huge_zero_folio() will most likely not fail as
>  		 * thp_shrinker_init() is invoked early on during boot.
>  		 */
> -		if (!get_huge_zero_folio())
> +		if (!get_huge_zero_folio()) {
>  			pr_warn("Allocating persistent huge zero folio failed\n");
> +			return 0;
> +		}
> +
> +		if (IS_ENABLED(CONFIG_READONLY_HUGE_ZERO_FOLIO) &&
> +		    !make_huge_zero_folio_readonly())
> +			pr_warn("Making persistent huge zero folio read-only failed\n");
> +
>  		return 0;

The IS_ENABLED() doesn't do anything from what I can tell. This is all
in one .c file. The compiler can see that
make_huge_zero_folio_readonly() is always false. It should optimize out
the pr_warn() for you.



* Re: [RFC PATCH 0/3] make persistent huge zero folio read-only
  2026-05-27  3:56 [RFC PATCH 0/3] make persistent huge zero folio read-only Xueyuan chen
                   ` (2 preceding siblings ...)
  2026-05-27  3:56 ` [RFC PATCH 3/3] x86/mm: make huge zero folio read-only in direct map Xueyuan chen
@ 2026-05-27 15:58 ` Dave Hansen
  3 siblings, 0 replies; 9+ messages in thread
From: Dave Hansen @ 2026-05-27 15:58 UTC (permalink / raw)
  To: Xueyuan chen, akpm, linux-mm
  Cc: linux-kernel, linux-arm-kernel, x86, catalin.marinas, will, tglx,
	mingo, bp, dave.hansen, hpa, david, ljs, ziy, baolin.wang,
	ryan.roberts, dev.jain, lance.yang, yang, jannh

On 5/26/26 20:56, Xueyuan chen wrote:
> The motivation comes from Jann Horn's read-only zero page work[1] and the
> follow-up discussion[2] with Yang Shi. As Jann pointed out, the kernel has
> had bugs, including security bugs, where pages taken with read-only
> semantics were later written to.

My overall concern with this is that it's just a code hack for the huge
zero page and nothing else. It's a total one-off.

I think you need to make the case here that the huge zero page truly is
a special snowflake and deserves a one-off special snowflake solution.
Because it doesn't seem *that* crazy that there are more things that the
kernel dynamically allocates that it wants to keep read only.

Maybe there aren't many things that get mapped to userspace like this.
But the case needs to get made either way.



* Re: [RFC PATCH 1/3] mm: make persistent huge zero folio read-only
  2026-05-27 15:55   ` Dave Hansen
@ 2026-05-27 16:20     ` Jann Horn
  0 siblings, 0 replies; 9+ messages in thread
From: Jann Horn @ 2026-05-27 16:20 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Xueyuan chen, akpm, linux-mm, linux-kernel, linux-arm-kernel, x86,
	catalin.marinas, will, tglx, mingo, bp, dave.hansen, hpa, david,
	ljs, ziy, baolin.wang, ryan.roberts, dev.jain, lance.yang, yang

On Wed, May 27, 2026 at 5:55 PM Dave Hansen <dave.hansen@intel.com> wrote:
> On 5/26/26 20:56, Xueyuan chen wrote:
> > +config READONLY_HUGE_ZERO_FOLIO
> > +     bool "Map the huge zero folio read-only in the direct map"
> > +     depends on PERSISTENT_HUGE_ZERO_FOLIO
> > +     depends on ARCH_HAS_READONLY_HUGE_ZERO_FOLIO
> > +     help
> > +       The persistent huge zero folio is shared globally, and nothing
> > +       should ever change its contents after initialization.
> > +
> > +       When supported, mark the folio read-only in the direct map so such
> > +       writes trigger a fault instead of silently corrupting the zero contents.
> > +
> > +       If the permission change is not supported, the kernel keeps using
> > +       the writable persistent huge zero folio.
>
> I vote for no Kconfig options here. Why? This adds "security" with
> _basically_ no extra runtime cost. The runtime cost is, what, usually
> one kernel TLB invalidation during boot?

Plus potentially a bit more TLB pressure from losing a huge PUD in the
linear map, IDK how much we care about that.



* Re: [RFC PATCH 1/3] mm: make persistent huge zero folio read-only
  2026-05-27 13:32   ` Dev Jain
@ 2026-05-27 23:03     ` Xueyuan Chen
  0 siblings, 0 replies; 9+ messages in thread
From: Xueyuan Chen @ 2026-05-27 23:03 UTC (permalink / raw)
  To: dev.jain
  Cc: xueyuan.chen21, akpm, linux-mm, linux-kernel, linux-arm-kernel,
	x86, catalin.marinas, will, tglx, mingo, bp, dave.hansen, hpa,
	david, ljs, ziy, baolin.wang, ryan.roberts, lance.yang, yang,
	jannh


On Wed, May 27, 2026 at 07:02:55PM +0530, Dev Jain wrote:

[...]
>>  static atomic_t huge_zero_refcount;
>> +#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
>> +struct folio *huge_zero_folio __ro_after_init;
>
>
>Can we guard this with CONFIG_PERSISTENT_HUGE_ZERO_FOLIO? Since
>in that case too the pointer is never written after init.
>

Yes, that makes sense. The pointer lifetime is tied to
CONFIG_PERSISTENT_HUGE_ZERO_FOLIO. I'll fix this in v2.

>>  
>> +#ifdef CONFIG_READONLY_HUGE_ZERO_FOLIO
>> +static bool __init make_huge_zero_folio_readonly(void)
>> +{
>> +	return arch_make_huge_zero_folio_readonly(READ_ONCE(huge_zero_folio));
>
>
>I think READ_ONCE is not required since no one is going to change the
>pointer after allocation?
>

Agreed, READ_ONCE is not needed here. I'll drop it in v2.

Thanks,
Xueyuan


