Generic Linux architectural discussions
* [PATCH] mm: make zeropage read-only
@ 2026-05-08 16:12 Jann Horn
  2026-05-08 16:26 ` Jann Horn
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Jann Horn @ 2026-05-08 16:12 UTC (permalink / raw)
  To: Mike Rapoport, Andrew Morton, Arnd Bergmann
  Cc: linux-mm, linux-arch, linux-kernel, linux-hardening, Jann Horn

Put the zeropage in the read-only data section - nothing should ever change
its contents. Set up a new section .rodata..page_aligned to mirror the
existing .data..page_aligned and .bss..page_aligned sections.

There have been several security bugs where the kernel grabs references to
pages from some userspace-specified source, via GUP or splice, with
read-only semantics; and then later on, the kernel loses track of the
pages' read-only semantics and writes into them.

I have seen such bugs in out-of-tree GPU drivers before, and recently
upstream Linux bugs of this shape have been discovered as well.

One problem with these bugs is that fuzzers and such will have a hard time
noticing them, because the kernel has no mechanism to directly detect that
such a bug has occurred. It would be nice if we had debug infrastructure to
keep track of whether file pages are supposed to be writable, or such; but
for now, the easiest way to make these bugs detectable in at least some
cases is to make sure that the 4K zeropage is mapped as read-only
in the kernel, so that attempting to write into it immediately crashes
(unless the write happens through a vmap mapping or such).

This patch might increase the size of vmlinux by 4K since .rodata is stored
in the ELF file while .bss is not; but the compressed kernel image size
shouldn't change much, since it's compressed.

I have tested that with this patch applied, calling
`get_user_pages_fast(address, 1, 0, &page)` on a freshly-created anonymous
VMA and writing into the page with
`*(volatile char *)page_address(page) = 0` will cause an oops.
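
The kernel-side test above cannot be run outside the kernel, but the
underlying effect can be approximated in userspace. The sketch below is an
analogy only: it assumes Linux, an ELF toolchain (GCC or Clang), and 4K
pages. The names `demo_zero_page`, `demo_bss_page`, and
`demo_write_faults()` are hypothetical and do not appear in the patch. A
`const` page-aligned array is placed in `.rodata`, so a store to it should
kill the writer with SIGSEGV, while the same store to a `.bss` array
succeeds.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define DEMO_PAGE_SIZE 4096 /* assumption: 4K pages, as in the commit message */

/* const + aligned: placed in .rodata, which is mapped without write permission. */
static const uint8_t demo_zero_page[DEMO_PAGE_SIZE]
	__attribute__((aligned(DEMO_PAGE_SIZE))) = { 0 };

/* Writable control case, loosely analogous to the old .bss placement. */
static uint8_t demo_bss_page[DEMO_PAGE_SIZE]
	__attribute__((aligned(DEMO_PAGE_SIZE)));

/*
 * Store one byte to addr in a forked child, so the parent survives a fault.
 * Returns 1 if the child was killed by SIGSEGV, 0 if the store succeeded,
 * and -1 for any other outcome.
 */
static int demo_write_faults(const void *addr)
{
	int status;
	pid_t pid;

	assert(addr != NULL);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Same shape as the store in the commit message. */
		*(volatile uint8_t *)(uintptr_t)addr = 0;
		_exit(0);
	}
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
		return 1;
	if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
		return 0;
	return -1;
}
```

Calling `demo_write_faults(demo_zero_page)` and
`demo_write_faults(demo_bss_page)` from a small `main()` contrasts the two
placements. The child process stands in for the oops: the fault is
immediate and observable instead of silently corrupting shared zeroes.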

Signed-off-by: Jann Horn <jannh@google.com>
---
 include/asm-generic/vmlinux.lds.h | 1 +
 include/linux/linkage.h           | 1 +
 mm/mm_init.c                      | 2 +-
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 60c8c22fd3e4..e6e96bce506f 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -479,6 +479,7 @@
 	. = ALIGN((align));						\
 	.rodata           : AT(ADDR(.rodata) - LOAD_OFFSET) {		\
 		__start_rodata = .;					\
+		*(.rodata..page_aligned)				\
 		*(.rodata) *(.rodata.*) *(.data.rel.ro*)		\
 		SCHED_DATA						\
 		RO_AFTER_INIT_DATA	/* Read only after init */	\
diff --git a/include/linux/linkage.h b/include/linux/linkage.h
index b11660b706c5..49997b292c01 100644
--- a/include/linux/linkage.h
+++ b/include/linux/linkage.h
@@ -38,6 +38,7 @@
 
 #define __page_aligned_data	__section(".data..page_aligned") __aligned(PAGE_SIZE)
 #define __page_aligned_bss	__section(".bss..page_aligned") __aligned(PAGE_SIZE)
+#define __page_aligned_rodata	__section(".rodata..page_aligned") __aligned(PAGE_SIZE)
 
 /*
  * For assembly routines.
diff --git a/mm/mm_init.c b/mm/mm_init.c
index f9f8e1af921c..67b260acc27e 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -57,7 +57,7 @@ unsigned long zero_page_pfn __ro_after_init;
 EXPORT_SYMBOL(zero_page_pfn);
 
 #ifndef __HAVE_COLOR_ZERO_PAGE
-uint8_t empty_zero_page[PAGE_SIZE] __page_aligned_bss;
+uint8_t empty_zero_page[PAGE_SIZE] __page_aligned_rodata;
 EXPORT_SYMBOL(empty_zero_page);
 
 struct page *__zero_page __ro_after_init;

---
base-commit: 917719c412c48687d4a176965d1fa35320ec457c
change-id: 20260508-ro-zeropage-86fb842965ae

--  
Jann Horn <jannh@google.com>



* Re: [PATCH] mm: make zeropage read-only
  2026-05-08 16:12 [PATCH] mm: make zeropage read-only Jann Horn
@ 2026-05-08 16:26 ` Jann Horn
  2026-05-12  2:31   ` Lance Yang
  2026-05-10 15:12 ` kernel test robot
  2026-05-10 21:40 ` kernel test robot
  2 siblings, 1 reply; 8+ messages in thread
From: Jann Horn @ 2026-05-08 16:26 UTC (permalink / raw)
  To: Mike Rapoport, Andrew Morton, Arnd Bergmann
  Cc: linux-mm, linux-arch, linux-kernel, linux-hardening,
	Ard Biesheuvel, Seth Jenkins

On Fri, May 8, 2026 at 6:12 PM Jann Horn <jannh@google.com> wrote:
> Put the zeropage in the read-only data section - nothing should ever change
> its contents. Set up a new section .rodata..page_aligned to mirror the
> existing .data..page_aligned and .bss..page_aligned sections.
>
> There have been several security bugs where the kernel grabs references to
> pages from some userspace-specified source, via GUP or splice, with
> read-only semantics; and then later on, the kernel loses track of the
> pages' read-only semantics and writes into them.
>
> I have seen such bugs in out-of-tree GPU drivers before, and recently
> upstream Linux bugs of this shape have been discovered as well.
>
> One problem with these bugs is that fuzzers and such will have a hard time
> noticing them, because the kernel has no mechanism to directly detect that
> such a bug has occurred. It would be nice if we had debug infrastructure to
> keep track of whether file pages are supposed to be writable, or such; but
> for now, the easiest way to make these bugs detectable in at least some
> cases is to make sure that the 4K zeropage is mapped as read-only
> in the kernel, so that attempting to write into it immediately crashes
> (unless the write happens through a vmap mapping or such).
>
> This patch might increase the size of vmlinux by 4K since .rodata is stored
> in the ELF file while .bss is not; but the compressed kernel image size
> shouldn't change much, since it's compressed.
>
> I have tested that with this patch applied, calling
> `get_user_pages_fast(address, 1, 0, &page)` on a freshly-created anonymous
> VMA and writing into the page with
> `*(volatile char *)page_address(page) = 0` will cause an oops.
>
> Signed-off-by: Jann Horn <jannh@google.com>
> ---
>  include/asm-generic/vmlinux.lds.h | 1 +
>  include/linux/linkage.h           | 1 +
>  mm/mm_init.c                      | 2 +-
>  3 files changed, 3 insertions(+), 1 deletion(-)

Seth pointed out that this is more or less a duplicate of Ard's
<https://lore.kernel.org/all/20260427153416.2103979-19-ardb+git@google.com/>.

So this patch is redundant; sorry for the noise.


* Re: [PATCH] mm: make zeropage read-only
  2026-05-08 16:12 [PATCH] mm: make zeropage read-only Jann Horn
  2026-05-08 16:26 ` Jann Horn
@ 2026-05-10 15:12 ` kernel test robot
  2026-05-10 21:40 ` kernel test robot
  2 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2026-05-10 15:12 UTC (permalink / raw)
  To: Jann Horn, Mike Rapoport, Andrew Morton, Arnd Bergmann
  Cc: oe-kbuild-all, Linux Memory Management List, linux-arch,
	linux-kernel, linux-hardening, Jann Horn

Hi Jann,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 917719c412c48687d4a176965d1fa35320ec457c]

url:    https://github.com/intel-lab-lkp/linux/commits/Jann-Horn/mm-make-zeropage-read-only/20260510-200814
base:   917719c412c48687d4a176965d1fa35320ec457c
patch link:    https://lore.kernel.org/r/20260508-ro-zeropage-v1-1-9808abc20b49%40google.com
patch subject: [PATCH] mm: make zeropage read-only
config: powerpc-allnoconfig (https://download.01.org/0day-ci/archive/20260510/202605102300.PHd18T7k-lkp@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 15.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260510/202605102300.PHd18T7k-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202605102300.PHd18T7k-lkp@intel.com/

All warnings (new ones prefixed by >>):

   /tmp/ccjZQT3J.s: Assembler messages:
>> /tmp/ccjZQT3J.s:3163: Warning: setting incorrect section attributes for .rodata..page_aligned
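
A plausible reading of this warning, which the thread does not confirm:
`empty_zero_page` is declared without `const`, so the compiler requests a
writable section (flags "aw"), while the assembler expects sections named
`.rodata*` to be read-only. The sketch below shows a `const`-qualified
declaration in a same-named section as a userspace illustration. The names
`demo_page_aligned_rodata`, `demo_empty_zero_page`, and both helpers are
hypothetical, and ELF with GCC or Clang is assumed. Whether `const` is
appropriate for the kernel symbol depends on its other users and is not
settled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096 /* assumption: 4K pages */

/* Hypothetical userspace stand-in for the patch's __page_aligned_rodata. */
#define demo_page_aligned_rodata \
	__attribute__((section(".rodata..page_aligned"), aligned(DEMO_PAGE_SIZE)))

/*
 * Because the object is const, the compiler emits the named section as
 * read-only data, which matches what the assembler expects for a
 * .rodata-prefixed section name.
 */
const uint8_t demo_empty_zero_page[DEMO_PAGE_SIZE] demo_page_aligned_rodata = { 0 };

static_assert(sizeof(demo_empty_zero_page) == DEMO_PAGE_SIZE,
	      "the demo zero page must span exactly one page");

/* Returns 1 if p sits on a page boundary, 0 otherwise. */
int demo_is_page_aligned(const void *p)
{
	return ((uintptr_t)p % DEMO_PAGE_SIZE) == 0;
}

/* Returns 1 if all len bytes at p are zero, 0 otherwise. */
int demo_is_all_zero(const uint8_t *p, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (p[i] != 0)
			return 0;
	}
	return 1;
}
```

Compiling this with `gcc -S` and inspecting the emitted `.section` line for
`.rodata..page_aligned` is one way to compare the flags with and without
`const`.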

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH] mm: make zeropage read-only
  2026-05-08 16:12 [PATCH] mm: make zeropage read-only Jann Horn
  2026-05-08 16:26 ` Jann Horn
  2026-05-10 15:12 ` kernel test robot
@ 2026-05-10 21:40 ` kernel test robot
  2 siblings, 0 replies; 8+ messages in thread
From: kernel test robot @ 2026-05-10 21:40 UTC (permalink / raw)
  To: Jann Horn, Mike Rapoport, Andrew Morton, Arnd Bergmann
  Cc: oe-kbuild-all, Linux Memory Management List, linux-arch,
	linux-kernel, linux-hardening, Jann Horn

Hi Jann,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 917719c412c48687d4a176965d1fa35320ec457c]

url:    https://github.com/intel-lab-lkp/linux/commits/Jann-Horn/mm-make-zeropage-read-only/20260510-200814
base:   917719c412c48687d4a176965d1fa35320ec457c
patch link:    https://lore.kernel.org/r/20260508-ro-zeropage-v1-1-9808abc20b49%40google.com
patch subject: [PATCH] mm: make zeropage read-only
config: x86_64-rhel-9.4 (https://download.01.org/0day-ci/archive/20260510/202605102324.g5xWKUc3-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20260510/202605102324.g5xWKUc3-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202605102324.g5xWKUc3-lkp@intel.com/

All warnings (new ones prefixed by >>):

   /tmp/ccZJHv5S.s: Assembler messages:
>> /tmp/ccZJHv5S.s:14940: Warning: setting incorrect section attributes for .rodata..page_aligned

--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH] mm: make zeropage read-only
  2026-05-08 16:26 ` Jann Horn
@ 2026-05-12  2:31   ` Lance Yang
  2026-05-12 13:32     ` Jann Horn
  0 siblings, 1 reply; 8+ messages in thread
From: Lance Yang @ 2026-05-12  2:31 UTC (permalink / raw)
  To: jannh
  Cc: rppt, akpm, arnd, linux-mm, linux-arch, linux-kernel,
	linux-hardening, ardb, sethjenkins, Lance Yang


On Fri, May 08, 2026 at 06:26:32PM +0200, Jann Horn wrote:
>On Fri, May 8, 2026 at 6:12 PM Jann Horn <jannh@google.com> wrote:
>> Put the zeropage in the read-only data section - nothing should ever change
>> its contents. Set up a new section .rodata..page_aligned to mirror the
>> existing .data..page_aligned and .bss..page_aligned sections.
>>
>> There have been several security bugs where the kernel grabs references to
>> pages from some userspace-specified source, via GUP or splice, with
>> read-only semantics; and then later on, the kernel loses track of the
>> pages' read-only semantics and writes into them.
>>
>> I have seen such bugs in out-of-tree GPU drivers before, and recently
>> upstream Linux bugs of this shape have been discovered as well.
>>
>> One problem with these bugs is that fuzzers and such will have a hard time
>> noticing them, because the kernel has no mechanism to directly detect that
>> such a bug has occurred. It would be nice if we had debug infrastructure to
>> keep track of whether file pages are supposed to be writable, or such; but
>> for now, the easiest way to make these bugs detectable in at least some
>> cases is to make sure that the 4K zeropage is mapped as read-only
>> in the kernel, so that attempting to write into it immediately crashes
>> (unless the write happens through a vmap mapping or such).
>>
>> This patch might increase the size of vmlinux by 4K since .rodata is stored
>> in the ELF file while .bss is not; but the compressed kernel image size
>> shouldn't change much, since it's compressed.
>>
>> I have tested that with this patch applied, calling
>> `get_user_pages_fast(address, 1, 0, &page)` on a freshly-created anonymous
>> VMA and writing into the page with
>> `*(volatile char *)page_address(page) = 0` will cause an oops.
>>
>> Signed-off-by: Jann Horn <jannh@google.com>
>> ---
>>  include/asm-generic/vmlinux.lds.h | 1 +
>>  include/linux/linkage.h           | 1 +
>>  mm/mm_init.c                      | 2 +-
>>  3 files changed, 3 insertions(+), 1 deletion(-)
>
>Seth pointed out that this is more or less a duplicate of Ard's
><https://lore.kernel.org/all/20260427153416.2103979-19-ardb+git@google.com/>.
>
>So this patch is redundant; sorry for the noise.

Would it make sense to apply a similar treatment to huge_zero_folio
as well?

With CONFIG_PERSISTENT_HUGE_ZERO_FOLIO=y, it is allocated at boot and
never freed, so it should never be written after initialization either :)

Cheers, Lance


* Re: [PATCH] mm: make zeropage read-only
  2026-05-12  2:31   ` Lance Yang
@ 2026-05-12 13:32     ` Jann Horn
  2026-05-12 17:14       ` Yang Shi
  0 siblings, 1 reply; 8+ messages in thread
From: Jann Horn @ 2026-05-12 13:32 UTC (permalink / raw)
  To: Lance Yang
  Cc: rppt, akpm, arnd, linux-mm, linux-arch, linux-kernel,
	linux-hardening, ardb, sethjenkins

On Tue, May 12, 2026 at 4:31 AM Lance Yang <lance.yang@linux.dev> wrote:
> Would it make sense to apply a similar treatment to huge_zero_folio
> as well?
>
> With CONFIG_PERSISTENT_HUGE_ZERO_FOLIO=y, it is allocated at boot and
> never freed, so it should never be written after initialization either :)

Oh, neat, I didn't realize that that feature exists.

I guess there are two aspects of making the huge zero folio RO that
could be problematic:

1. If the huge zero folio comes from the page allocator, making it
read-only might require splitting a huge PUD, which could have
performance implications.
2. I vaguely remember arm64 has rules about how PUD/PMD entries in the
linear mapping can't be split at runtime at all depending on hardware
capabilities, meaning the entire linear mapping may need to be mapped
without any huge PUD/PMD entries - IDK if thp_shrinker_init() runs
early enough to be excepted from that. See can_set_direct_map() and
force_pte_mapping() in arch/arm64/.

So making the huge zero folio RO in the linear map would probably
require adding a new config flag, connecting that to
ARCH_HAS_SET_DIRECT_MAP, and changing one or two places in arm64
memory management.
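
The difference from the 4K zeropage is that this memory is allocated at
runtime, so its permissions would have to be changed after setup rather
than fixed at link time. As a loose userspace analogy only (it uses
`mmap()` and `mprotect()`, not the kernel's direct-map interfaces, and it
says nothing about PUD/PMD splitting), the sketch below allocates a
zero-filled region, drops write permission once setup is done, and checks
that a later store faults. All `demo_` names and the 2 MiB size are
assumptions for illustration.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define DEMO_HUGE_SIZE (2UL << 20) /* assumption: 2 MiB, a common PMD size */

/*
 * Allocate a zero-filled anonymous region, then make it read-only.
 * Returns NULL on failure.
 */
void *demo_alloc_ro_zero_region(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return NULL;
	/* "After initialization": remove write permission from the whole range. */
	if (mprotect(p, len, PROT_READ) != 0) {
		munmap(p, len);
		return NULL;
	}
	return p;
}

/*
 * Store one byte to addr in a forked child.
 * Returns 1 if the child died with SIGSEGV, 0 if the store succeeded,
 * and -1 for any other outcome.
 */
int demo_region_write_faults(void *addr)
{
	int status;
	pid_t pid;

	assert(addr != NULL);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		*(volatile uint8_t *)addr = 0;
		_exit(0);
	}
	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
		return 1;
	if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
		return 0;
	return -1;
}
```

In userspace the permission change is a single call; the points raised
above concern what the equivalent change costs, or whether it is possible
at all, in the kernel's linear mapping.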


* Re: [PATCH] mm: make zeropage read-only
  2026-05-12 13:32     ` Jann Horn
@ 2026-05-12 17:14       ` Yang Shi
  2026-05-13  2:11         ` Lance Yang
  0 siblings, 1 reply; 8+ messages in thread
From: Yang Shi @ 2026-05-12 17:14 UTC (permalink / raw)
  To: Jann Horn
  Cc: Lance Yang, rppt, akpm, arnd, linux-mm, linux-arch, linux-kernel,
	linux-hardening, ardb, sethjenkins

On Tue, May 12, 2026 at 6:48 AM Jann Horn <jannh@google.com> wrote:
>
> On Tue, May 12, 2026 at 4:31 AM Lance Yang <lance.yang@linux.dev> wrote:
> > Would it make sense to apply a similar treatment to huge_zero_folio
> > as well?
> >
> > With CONFIG_PERSISTENT_HUGE_ZERO_FOLIO=y, it is allocated at boot and
> > never freed, so it should never be written after initialization either :)
>
> Oh, neat, I didn't realize that that feature exists.
>
> I guess there are two aspects of making the huge zero folio RO that
> could be problematic:
>
> 1. If the huge zero folio comes from the page allocator, making it
> read-only might require splitting a huge PUD, which could have
> performance implications.
> 2. I vaguely remember arm64 has rules about how PUD/PMD entries in the
> linear mapping can't be split at runtime at all depending on hardware
> capabilities, meaning the entire linear mapping may need to be mapped
> without any huge PUD/PMD entries - IDK if thp_shrinker_init() runs
> early enough to be excepted from that. See can_set_direct_map() and
> force_pte_mapping() in arch/arm64/.

Yes. First of all, this relies on the rodata mode. If rodata=on (used to
be called full), the linear mapping may be mapped by PUD/PMD if the
hardware supports BBML2_NOABORT; otherwise it is always mapped at PTE
level. But how the huge zero folio is mapped in the linear mapping
should not matter; you just need to change the linear mapping
permission to RO anyway.

If the rodata mode is off or noalias (used to be called on), the
linear mapping may be mapped by PUD/PMD, but the kernel basically does
not expect linear mapping permissions to be changed.

Thanks,
Yang

>
> So making the huge zero folio RO in the linear map would probably
> require adding a new config flag, connecting that to
> ARCH_HAS_SET_DIRECT_MAP, and changing one or two places in arm64
> memory management.
>


* Re: [PATCH] mm: make zeropage read-only
  2026-05-12 17:14       ` Yang Shi
@ 2026-05-13  2:11         ` Lance Yang
  0 siblings, 0 replies; 8+ messages in thread
From: Lance Yang @ 2026-05-13  2:11 UTC (permalink / raw)
  To: shy828301, jannh
  Cc: lance.yang, rppt, akpm, arnd, linux-mm, linux-arch, linux-kernel,
	linux-hardening, ardb, sethjenkins


On Tue, May 12, 2026 at 10:14:01AM -0700, Yang Shi wrote:
>On Tue, May 12, 2026 at 6:48 AM Jann Horn <jannh@google.com> wrote:
>>
>> On Tue, May 12, 2026 at 4:31 AM Lance Yang <lance.yang@linux.dev> wrote:
>> > Would it make sense to apply a similar treatment to huge_zero_folio
>> > as well?
>> >
>> > With CONFIG_PERSISTENT_HUGE_ZERO_FOLIO=y, it is allocated at boot and
>> > never freed, so it should never be written after initialization either :)
>>
>> Oh, neat, I didn't realize that that feature exists.
>>
>> I guess there are two aspects of making the huge zero folio RO that
>> could be problematic:
>>
>> 1. If the huge zero folio comes from the page allocator, making it
>> read-only might require splitting a huge PUD, which could have
>> performance implications.
>> 2. I vaguely remember arm64 has rules about how PUD/PMD entries in the
>> linear mapping can't be split at runtime at all depending on hardware
>> capabilities, meaning the entire linear mapping may need to be mapped
>> without any huge PUD/PMD entries - IDK if thp_shrinker_init() runs
>> early enough to be excepted from that. See can_set_direct_map() and
>> force_pte_mapping() in arch/arm64/.
>
>Yes. First of all, this relies on the rodata mode. If rodata=on (used to
>be called full), the linear mapping may be mapped by PUD/PMD if the
>hardware supports BBML2_NOABORT; otherwise it is always mapped at PTE
>level. But how the huge zero folio is mapped in the linear mapping
>should not matter; you just need to change the linear mapping
>permission to RO anyway.
>
>If the rodata mode is off or noalias (used to be called on), the
>linear mapping may be mapped by PUD/PMD, but the kernel basically does
>not expect linear mapping permissions to be changed.

Ah, right. So for huge_zero_folio the hard part is not just making the
backing memory read-only, but also whether we can change the linear
mapping permission for that range. That depends on the arm64 rodata mode
/ direct-map setup.

Thanks Jann and Yang for the explanations!
Lance

>Thanks,
>Yang
>
>>
>> So making the huge zero folio RO in the linear map would probably
>> require adding a new config flag, connecting that to
>> ARCH_HAS_SET_DIRECT_MAP, and changing one or two places in arm64
>> memory management.
>>
>
