linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 1/2] powerpc/64s/radix: Fix crash with unaligned relocated kernel
@ 2023-01-10 12:47 Michael Ellerman
  2023-01-10 12:47 ` [PATCH 2/2] powerpc/64s/radix: Fix RWX mapping with " Michael Ellerman
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Michael Ellerman @ 2023-01-10 12:47 UTC
  To: linuxppc-dev

If a relocatable kernel is loaded at an address that is not 2MB aligned
and told not to relocate to zero, the kernel can crash due to
mark_rodata_ro() incorrectly changing some read-write data to read-only.

Scenarios where the misalignment can occur are when the kernel is
loaded by kdump or using the RELOCATABLE_TEST config option.

Example crash with the kernel loaded at 5MB:

  Run /sbin/init as init process
  BUG: Unable to handle kernel data access on write at 0xc000000000452000
  Faulting instruction address: 0xc0000000005b6730
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
  CPU: 1 PID: 1 Comm: init Not tainted 6.2.0-rc1-00011-g349188be4841 #166
  Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-5b4c5a hv:linux,kvm pSeries
  NIP:  c0000000005b6730 LR: c000000000ae9ab8 CTR: 0000000000000380
  REGS: c000000004503250 TRAP: 0300   Not tainted  (6.2.0-rc1-00011-g349188be4841)
  MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 44288480  XER: 00000000
  CFAR: c0000000005b66ec DAR: c000000000452000 DSISR: 0a000000 IRQMASK: 0
  ...
  NIP memset+0x68/0x104
  LR  zero_user_segments.constprop.0+0xa8/0xf0
  Call Trace:
    ext4_mpage_readpages+0x7f8/0x830
    ext4_readahead+0x48/0x60
    read_pages+0xb8/0x380
    page_cache_ra_unbounded+0x19c/0x250
    filemap_fault+0x58c/0xae0
    __do_fault+0x60/0x100
    __handle_mm_fault+0x1230/0x1a40
    handle_mm_fault+0x120/0x300
    ___do_page_fault+0x20c/0xa80
    do_page_fault+0x30/0xc0
    data_access_common_virt+0x210/0x220

This happens because mark_rodata_ro() tries to change permissions on the
range _stext..__end_rodata, but _stext sits in the middle of the 2MB
page from 4MB to 6MB:

  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000000400000-0x0000000002400000 with 2.00 MiB pages (exec)

The logic that changes the permissions assumes the linear mapping was
split correctly at boot, so it marks the entire 2MB page read-only. That
leads to the write fault above.

To fix it, the boot time mapping logic needs to consider that if the
kernel is running at a non-zero address then _stext is a boundary where
it must split the mapping.

That leads to the mapping being split correctly, allowing the rodata
permission change to happen correctly, with no spillover:

  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000000400000-0x0000000000500000 with 64.0 KiB pages
  radix-mmu: Mapped 0x0000000000500000-0x0000000000600000 with 64.0 KiB pages (exec)
  radix-mmu: Mapped 0x0000000000600000-0x0000000002400000 with 2.00 MiB pages (exec)

If the kernel is loaded at a 2MB aligned address, the mapping continues
to use 2MB pages as before:

  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
  radix-mmu: Mapped 0x0000000000400000-0x0000000002c00000 with 2.00 MiB pages (exec)
  radix-mmu: Mapped 0x0000000002c00000-0x0000000100000000 with 2.00 MiB pages

Fixes: c55d7b5e6426 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/mm/book3s64/radix_pgtable.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index cac727b01799..5a2384ed1727 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -262,6 +262,17 @@ print_mapping(unsigned long start, unsigned long end, unsigned long size, bool e
 static unsigned long next_boundary(unsigned long addr, unsigned long end)
 {
 #ifdef CONFIG_STRICT_KERNEL_RWX
+	unsigned long stext_phys;
+
+	stext_phys = __pa_symbol(_stext);
+
+	// Relocatable kernel running at non-zero real address
+	if (stext_phys != 0) {
+		// Start of relocated kernel text is a rodata boundary
+		if (addr < stext_phys)
+			return stext_phys;
+	}
+
 	if (addr < __pa_symbol(__srwx_boundary))
 		return __pa_symbol(__srwx_boundary);
 #endif
-- 
2.39.0



* [PATCH 2/2] powerpc/64s/radix: Fix RWX mapping with relocated kernel
  2023-01-10 12:47 [PATCH 1/2] powerpc/64s/radix: Fix crash with unaligned relocated kernel Michael Ellerman
@ 2023-01-10 12:47 ` Michael Ellerman
  2023-01-11  5:01   ` Sachin Sant
  2023-01-11  5:06 ` [PATCH 1/2] powerpc/64s/radix: Fix crash with unaligned " Sachin Sant
  2023-02-05  9:41 ` Michael Ellerman
  2 siblings, 1 reply; 5+ messages in thread
From: Michael Ellerman @ 2023-01-10 12:47 UTC
  To: linuxppc-dev

If a relocatable kernel is loaded at a non-zero address and told not to
relocate to zero (kdump or RELOCATABLE_TEST), the mapping of the
interrupt code at zero is left with RWX permissions.

That is a security weakness, and leads to a warning at boot if
CONFIG_DEBUG_WX is enabled:

  powerpc/mm: Found insecure W+X mapping at address 00000000056435bc/0xc000000000000000
  WARNING: CPU: 1 PID: 1 at arch/powerpc/mm/ptdump/ptdump.c:193 note_page+0x484/0x4c0
  CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc1-00001-g8ae8e98aea82-dirty #175
  Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-dd0dca hv:linux,kvm pSeries
  NIP:  c0000000004a1c34 LR: c0000000004a1c30 CTR: 0000000000000000
  REGS: c000000003503770 TRAP: 0700   Not tainted  (6.2.0-rc1-00001-g8ae8e98aea82-dirty)
  MSR:  8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 24000220  XER: 00000000
  CFAR: c000000000545a58 IRQMASK: 0
  ...
  NIP note_page+0x484/0x4c0
  LR  note_page+0x480/0x4c0
  Call Trace:
    note_page+0x480/0x4c0 (unreliable)
    ptdump_pmd_entry+0xc8/0x100
    walk_pgd_range+0x618/0xab0
    walk_page_range_novma+0x74/0xc0
    ptdump_walk_pgd+0x98/0x170
    ptdump_check_wx+0x94/0x100
    mark_rodata_ro+0x30/0x70
    kernel_init+0x78/0x1a0
    ret_from_kernel_thread+0x5c/0x64

The fix has two parts. Firstly the pages from zero up to the end of
interrupts need to be marked read-only, so that they are left with R-X
permissions. Secondly the mapping logic needs to be taught to ensure
there is a page boundary at the end of the interrupt region, so that the
permission change only applies to the interrupt text, and not the region
following it.

Fixes: c55d7b5e6426 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/mm/book3s64/radix_pgtable.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 5a2384ed1727..26245aaf12b8 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -234,6 +234,14 @@ void radix__mark_rodata_ro(void)
 	end = (unsigned long)__end_rodata;
 
 	radix__change_memory_range(start, end, _PAGE_WRITE);
+
+	for (start = PAGE_OFFSET; start < (unsigned long)_stext; start += PAGE_SIZE) {
+		end = start + PAGE_SIZE;
+		if (overlaps_interrupt_vector_text(start, end))
+			radix__change_memory_range(start, end, _PAGE_WRITE);
+		else
+			break;
+	}
 }
 
 void radix__mark_initmem_nx(void)
@@ -268,6 +276,11 @@ static unsigned long next_boundary(unsigned long addr, unsigned long end)
 
 	// Relocatable kernel running at non-zero real address
 	if (stext_phys != 0) {
+		// The end of interrupts code at zero is a rodata boundary
+		unsigned long end_intr = __pa_symbol(__end_interrupts) - stext_phys;
+		if (addr < end_intr)
+			return end_intr;
+
 		// Start of relocated kernel text is a rodata boundary
 		if (addr < stext_phys)
 			return stext_phys;
-- 
2.39.0



* Re: [PATCH 2/2] powerpc/64s/radix: Fix RWX mapping with relocated kernel
  2023-01-10 12:47 ` [PATCH 2/2] powerpc/64s/radix: Fix RWX mapping with " Michael Ellerman
@ 2023-01-11  5:01   ` Sachin Sant
  0 siblings, 0 replies; 5+ messages in thread
From: Sachin Sant @ 2023-01-11  5:01 UTC
  To: Michael Ellerman; +Cc: linuxppc-dev




> On 10-Jan-2023, at 6:17 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> If a relocatable kernel is loaded at a non-zero address and told not to
> relocate to zero (kdump or RELOCATABLE_TEST), the mapping of the
> interrupt code at zero is left with RWX permissions.
> 
> That is a security weakness, and leads to a warning at boot if
> CONFIG_DEBUG_WX is enabled:
> 
>  powerpc/mm: Found insecure W+X mapping at address 00000000056435bc/0xc000000000000000
>  WARNING: CPU: 1 PID: 1 at arch/powerpc/mm/ptdump/ptdump.c:193 note_page+0x484/0x4c0
>  CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc1-00001-g8ae8e98aea82-dirty #175
>  Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-dd0dca hv:linux,kvm pSeries
>  NIP:  c0000000004a1c34 LR: c0000000004a1c30 CTR: 0000000000000000
>  REGS: c000000003503770 TRAP: 0700   Not tainted  (6.2.0-rc1-00001-g8ae8e98aea82-dirty)
>  MSR:  8000000002029033 <SF,VEC,EE,ME,IR,DR,RI,LE>  CR: 24000220  XER: 00000000
>  CFAR: c000000000545a58 IRQMASK: 0
>  ...
>  NIP note_page+0x484/0x4c0
>  LR  note_page+0x480/0x4c0
>  Call Trace:
>    note_page+0x480/0x4c0 (unreliable)
>    ptdump_pmd_entry+0xc8/0x100
>    walk_pgd_range+0x618/0xab0
>    walk_page_range_novma+0x74/0xc0
>    ptdump_walk_pgd+0x98/0x170
>    ptdump_check_wx+0x94/0x100
>    mark_rodata_ro+0x30/0x70
>    kernel_init+0x78/0x1a0
>    ret_from_kernel_thread+0x5c/0x64
> 
> The fix has two parts. Firstly the pages from zero up to the end of
> interrupts need to be marked read-only, so that they are left with R-X
> permissions. Secondly the mapping logic needs to be taught to ensure
> there is a page boundary at the end of the interrupt region, so that the
> permission change only applies to the interrupt text, and not the region
> following it.
> 
> Fixes: c55d7b5e6426 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE")
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---

Thanks Michael. This fixes the problem reported earlier:

https://lore.kernel.org/linuxppc-dev/48206911-FD3D-401A-A69D-1A79403E79E2@linux.ibm.com/

Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>

- Sachin



* Re: [PATCH 1/2] powerpc/64s/radix: Fix crash with unaligned relocated kernel
  2023-01-10 12:47 [PATCH 1/2] powerpc/64s/radix: Fix crash with unaligned relocated kernel Michael Ellerman
  2023-01-10 12:47 ` [PATCH 2/2] powerpc/64s/radix: Fix RWX mapping with " Michael Ellerman
@ 2023-01-11  5:06 ` Sachin Sant
  2023-02-05  9:41 ` Michael Ellerman
  2 siblings, 0 replies; 5+ messages in thread
From: Sachin Sant @ 2023-01-11  5:06 UTC
  To: Michael Ellerman; +Cc: linuxppc-dev



> On 10-Jan-2023, at 6:17 PM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> 
> If a relocatable kernel is loaded at an address that is not 2MB aligned
> and told not to relocate to zero, the kernel can crash due to
> mark_rodata_ro() incorrectly changing some read-write data to read-only.
> 
> Scenarios where the misalignment can occur are when the kernel is
> loaded by kdump or using the RELOCATABLE_TEST config option.
> 
> Example crash with the kernel loaded at 5MB:
> 
>  Run /sbin/init as init process
>  BUG: Unable to handle kernel data access on write at 0xc000000000452000
>  Faulting instruction address: 0xc0000000005b6730
>  Oops: Kernel access of bad area, sig: 11 [#1]
>  LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>  CPU: 1 PID: 1 Comm: init Not tainted 6.2.0-rc1-00011-g349188be4841 #166
>  Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,git-5b4c5a hv:linux,kvm pSeries
>  NIP:  c0000000005b6730 LR: c000000000ae9ab8 CTR: 0000000000000380
>  REGS: c000000004503250 TRAP: 0300   Not tainted  (6.2.0-rc1-00011-g349188be4841)
>  MSR:  8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 44288480  XER: 00000000
>  CFAR: c0000000005b66ec DAR: c000000000452000 DSISR: 0a000000 IRQMASK: 0
>  ...
>  NIP memset+0x68/0x104
>  LR  zero_user_segments.constprop.0+0xa8/0xf0
>  Call Trace:
>    ext4_mpage_readpages+0x7f8/0x830
>    ext4_readahead+0x48/0x60
>    read_pages+0xb8/0x380
>    page_cache_ra_unbounded+0x19c/0x250
>    filemap_fault+0x58c/0xae0
>    __do_fault+0x60/0x100
>    __handle_mm_fault+0x1230/0x1a40
>    handle_mm_fault+0x120/0x300
>    ___do_page_fault+0x20c/0xa80
>    do_page_fault+0x30/0xc0
>    data_access_common_virt+0x210/0x220
> 
> This happens because mark_rodata_ro() tries to change permissions on the
> range _stext..__end_rodata, but _stext sits in the middle of the 2MB
> page from 4MB to 6MB:
> 
>  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
>  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
>  radix-mmu: Mapped 0x0000000000400000-0x0000000002400000 with 2.00 MiB pages (exec)
> 
> The logic that changes the permissions assumes the linear mapping was
> split correctly at boot, so it marks the entire 2MB page read-only. That
> leads to the write fault above.
> 
> To fix it, the boot time mapping logic needs to consider that if the
> kernel is running at a non-zero address then _stext is a boundary where
> it must split the mapping.
> 
> That leads to the mapping being split correctly, allowing the rodata
> permission change to happen correctly, with no spillover:
> 
>  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
>  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
>  radix-mmu: Mapped 0x0000000000400000-0x0000000000500000 with 64.0 KiB pages
>  radix-mmu: Mapped 0x0000000000500000-0x0000000000600000 with 64.0 KiB pages (exec)
>  radix-mmu: Mapped 0x0000000000600000-0x0000000002400000 with 2.00 MiB pages (exec)
> 
> If the kernel is loaded at a 2MB aligned address, the mapping continues
> to use 2MB pages as before:
> 
>  radix-mmu: Mapped 0x0000000000000000-0x0000000000200000 with 2.00 MiB pages (exec)
>  radix-mmu: Mapped 0x0000000000200000-0x0000000000400000 with 2.00 MiB pages
>  radix-mmu: Mapped 0x0000000000400000-0x0000000002c00000 with 2.00 MiB pages (exec)
>  radix-mmu: Mapped 0x0000000002c00000-0x0000000100000000 with 2.00 MiB pages
> 
> Fixes: c55d7b5e6426 ("powerpc: Remove STRICT_KERNEL_RWX incompatibility with RELOCATABLE")
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---

Tested successfully with different crash kernel memory values.

Tested-by: Sachin Sant <sachinp@linux.ibm.com>

- Sachin


* Re: [PATCH 1/2] powerpc/64s/radix: Fix crash with unaligned relocated kernel
  2023-01-10 12:47 [PATCH 1/2] powerpc/64s/radix: Fix crash with unaligned relocated kernel Michael Ellerman
  2023-01-10 12:47 ` [PATCH 2/2] powerpc/64s/radix: Fix RWX mapping with " Michael Ellerman
  2023-01-11  5:06 ` [PATCH 1/2] powerpc/64s/radix: Fix crash with unaligned " Sachin Sant
@ 2023-02-05  9:41 ` Michael Ellerman
  2 siblings, 0 replies; 5+ messages in thread
From: Michael Ellerman @ 2023-02-05  9:41 UTC
  To: Michael Ellerman, linuxppc-dev

On Tue, 10 Jan 2023 23:47:52 +1100, Michael Ellerman wrote:
> If a relocatable kernel is loaded at an address that is not 2MB aligned
> and told not to relocate to zero, the kernel can crash due to
> mark_rodata_ro() incorrectly changing some read-write data to read-only.
> 
> Scenarios where the misalignment can occur are when the kernel is
> loaded by kdump or using the RELOCATABLE_TEST config option.
> 
> [...]

Applied to powerpc/fixes.

[1/2] powerpc/64s/radix: Fix crash with unaligned relocated kernel
      https://git.kernel.org/powerpc/c/98d0219e043e09013e883eacde3b93e0b2bf944d
[2/2] powerpc/64s/radix: Fix RWX mapping with relocated kernel
      https://git.kernel.org/powerpc/c/111bcb37385353f0510e5847d5abcd1c613dba23

cheers

