linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] mm/cma: make detection of highmem_start more robust
@ 2025-05-19 17:18 Mike Rapoport
  2025-05-19 21:55 ` Pratyush Yadav
  2025-05-20  8:30 ` Oscar Salvador
  0 siblings, 2 replies; 8+ messages in thread
From: Mike Rapoport @ 2025-05-19 17:18 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexandre Ghiti, Mike Rapoport, Pratyush Yadav, linux-arch,
	linux-kernel, linux-mm

From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>

Pratyush Yadav reports the following crash:

    ------------[ cut here ]------------
    kernel BUG at arch/x86/mm/physaddr.c:23!
    ception 0x06 IP 10:ffffffff812ebbf8 error 0 cr2 0xffff88903ffff000
    CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc6+ #231 PREEMPT(undef)
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
    RIP: 0010:__phys_addr+0x58/0x60
    Code: 01 48 89 c2 48 d3 ea 48 85 d2 75 05 e9 91 52 cf 00 0f 0b 48 3d ff ff ff 1f 77 0f 48 8b 05 20 54 55 01 48 01 d0 e9 78 52 cf 00 <0f> 0b 90 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
    RSP: 0000:ffffffff82803dd8 EFLAGS: 00010006 ORIG_RAX: 0000000000000000
    RAX: 000000007fffffff RBX: 00000000ffffffff RCX: 0000000000000000
    RDX: 000000007fffffff RSI: 0000000280000000 RDI: ffffffffffffffff
    RBP: ffffffff82803e68 R08: 0000000000000000 R09: 0000000000000000
    R10: ffffffff83153180 R11: ffffffff82803e48 R12: ffffffff83c9aed0
    R13: 0000000000000000 R14: 0000001040000000 R15: 0000000000000000
    FS:  0000000000000000(0000) GS:0000000000000000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: ffff88903ffff000 CR3: 0000000002838000 CR4: 00000000000000b0
    Call Trace:
     <TASK>
     ? __cma_declare_contiguous_nid+0x6e/0x340
     ? cma_declare_contiguous_nid+0x33/0x70
     ? dma_contiguous_reserve_area+0x2f/0x70
     ? setup_arch+0x6f1/0x870
     ? start_kernel+0x52/0x4b0
     ? x86_64_start_reservations+0x29/0x30
     ? x86_64_start_kernel+0x7c/0x80
     ? common_startup_64+0x13e/0x141

  The reason is that __cma_declare_contiguous_nid() does:

          highmem_start = __pa(high_memory - 1) + 1;

  If dma_contiguous_reserve_area() (or any other CMA declaration) is
  called before free_area_init(), high_memory is uninitialized. Without
  CONFIG_DEBUG_VIRTUAL, it will likely work but use the wrong value for
  highmem_start.

The issue occurs because commit e120d1bc12da ("arch, mm: set high_memory in
free_area_init()") moved initialization of high_memory after the call to
dma_contiguous_reserve() -> __cma_declare_contiguous_nid() on several
architectures.

In the case CONFIG_HIGHMEM is enabled, some architectures that actually
support HIGHMEM (arm, powerpc and x86) have initialization of high_memory
before a possible call to __cma_declare_contiguous_nid() and some
initialized high_memory late anyway (arc, csky, microblase, mips, sparc,
xtensa) even before the commit e120d1bc12da so they are fine with using
uninitialized value of high_memory.

And in the case CONFIG_HIGHMEM is disabled high_memory essentially becomes
the first address after memory end, so instead of relying on high_memory to
calculate highmem_start use memblock_end_of_DRAM() and eliminate the
dependency of CMA area creation on high_memory in majority of
configurations.

Reported-by: Pratyush Yadav <ptyadav@amazon.de>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
 mm/cma.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/cma.c b/mm/cma.c
index 15632939f20a..c04be488b099 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -608,7 +608,10 @@ static int __init __cma_declare_contiguous_nid(phys_addr_t *basep,
 	 * complain. Find the boundary by adding one to the last valid
 	 * address.
 	 */
-	highmem_start = __pa(high_memory - 1) + 1;
+	if (IS_ENABLED(CONFIG_HIGHMEM))
+		highmem_start = __pa(high_memory - 1) + 1;
+	else
+		highmem_start = memblock_end_of_DRAM();
 	pr_debug("%s(size %pa, base %pa, limit %pa alignment %pa)\n",
 		__func__, &size, &base, &limit, &alignment);
 
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/cma: make detection of highmem_start more robust
  2025-05-19 17:18 [PATCH] mm/cma: make detection of highmem_start more robust Mike Rapoport
@ 2025-05-19 21:55 ` Pratyush Yadav
  2025-05-20  5:45   ` Mike Rapoport
  2025-05-20  8:30 ` Oscar Salvador
  1 sibling, 1 reply; 8+ messages in thread
From: Pratyush Yadav @ 2025-05-19 21:55 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, Alexandre Ghiti, linux-arch, linux-kernel,
	linux-mm

Hi Mike,

On Mon, May 19 2025, Mike Rapoport wrote:

> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
>
> Pratyush Yadav reports the following crash:
>
>     ------------[ cut here ]------------
>     kernel BUG at arch/x86/mm/physaddr.c:23!
>     ception 0x06 IP 10:ffffffff812ebbf8 error 0 cr2 0xffff88903ffff000
>     CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc6+ #231 PREEMPT(undef)
>     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
>     RIP: 0010:__phys_addr+0x58/0x60
>     Code: 01 48 89 c2 48 d3 ea 48 85 d2 75 05 e9 91 52 cf 00 0f 0b 48 3d ff ff ff 1f 77 0f 48 8b 05 20 54 55 01 48 01 d0 e9 78 52 cf 00 <0f> 0b 90 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
>     RSP: 0000:ffffffff82803dd8 EFLAGS: 00010006 ORIG_RAX: 0000000000000000
>     RAX: 000000007fffffff RBX: 00000000ffffffff RCX: 0000000000000000
>     RDX: 000000007fffffff RSI: 0000000280000000 RDI: ffffffffffffffff
>     RBP: ffffffff82803e68 R08: 0000000000000000 R09: 0000000000000000
>     R10: ffffffff83153180 R11: ffffffff82803e48 R12: ffffffff83c9aed0
>     R13: 0000000000000000 R14: 0000001040000000 R15: 0000000000000000
>     FS:  0000000000000000(0000) GS:0000000000000000(0000) knlGS:0000000000000000
>     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>     CR2: ffff88903ffff000 CR3: 0000000002838000 CR4: 00000000000000b0
>     Call Trace:
>      <TASK>
>      ? __cma_declare_contiguous_nid+0x6e/0x340
>      ? cma_declare_contiguous_nid+0x33/0x70
>      ? dma_contiguous_reserve_area+0x2f/0x70
>      ? setup_arch+0x6f1/0x870
>      ? start_kernel+0x52/0x4b0
>      ? x86_64_start_reservations+0x29/0x30
>      ? x86_64_start_kernel+0x7c/0x80
>      ? common_startup_64+0x13e/0x141
>
>   The reason is that __cma_declare_contiguous_nid() does:
>
>           highmem_start = __pa(high_memory - 1) + 1;
>
>   If dma_contiguous_reserve_area() (or any other CMA declaration) is
>   called before free_area_init(), high_memory is uninitialized. Without
>   CONFIG_DEBUG_VIRTUAL, it will likely work but use the wrong value for
>   highmem_start.
>
> The issue occurs because commit e120d1bc12da ("arch, mm: set high_memory in
> free_area_init()") moved initialization of high_memory after the call to
> dma_contiguous_reserve() -> __cma_declare_contiguous_nid() on several
> architectures.
>
> In the case CONFIG_HIGHMEM is enabled, some architectures that actually
> support HIGHMEM (arm, powerpc and x86) have initialization of high_memory
> before a possible call to __cma_declare_contiguous_nid() and some
> initialized high_memory late anyway (arc, csky, microblase, mips, sparc,
> xtensa) even before the commit e120d1bc12da so they are fine with using
> uninitialized value of high_memory.

I don't know if they are fine or they haven't realized this is a bug
yet. Either way, this patch fixes the crash for me on x86_64, restores
how it should behave, and doesn't seem to make anything worse, so:

Tested-by: Pratyush Yadav <ptyadav@amazon.de>

Thanks for fixing this!

>
> And in the case CONFIG_HIGHMEM is disabled high_memory essentially becomes
> the first address after memory end, so instead of relying on high_memory to
> calculate highmem_start use memblock_end_of_DRAM() and eliminate the
> dependency of CMA area creation on high_memory in majority of
> configurations.
>
> Reported-by: Pratyush Yadav <ptyadav@amazon.de>
> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---
>  mm/cma.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/cma.c b/mm/cma.c
> index 15632939f20a..c04be488b099 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -608,7 +608,10 @@ static int __init __cma_declare_contiguous_nid(phys_addr_t *basep,
>  	 * complain. Find the boundary by adding one to the last valid
>  	 * address.
>  	 */
> -	highmem_start = __pa(high_memory - 1) + 1;
> +	if (IS_ENABLED(CONFIG_HIGHMEM))
> +		highmem_start = __pa(high_memory - 1) + 1;
> +	else
> +		highmem_start = memblock_end_of_DRAM();
>  	pr_debug("%s(size %pa, base %pa, limit %pa alignment %pa)\n",
>  		__func__, &size, &base, &limit, &alignment);

-- 
Regards,
Pratyush Yadav



Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/cma: make detection of highmem_start more robust
  2025-05-19 21:55 ` Pratyush Yadav
@ 2025-05-20  5:45   ` Mike Rapoport
  2025-05-22 13:31     ` Pratyush Yadav
  0 siblings, 1 reply; 8+ messages in thread
From: Mike Rapoport @ 2025-05-20  5:45 UTC (permalink / raw)
  To: Pratyush Yadav
  Cc: Andrew Morton, Alexandre Ghiti, linux-arch, linux-kernel,
	linux-mm

On Mon, May 19, 2025 at 11:55:05PM +0200, Pratyush Yadav wrote:
> Hi Mike,
> 
> On Mon, May 19 2025, Mike Rapoport wrote:
> 
> > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> >
> > Pratyush Yadav reports the following crash:
> >
> >     ------------[ cut here ]------------
> >     kernel BUG at arch/x86/mm/physaddr.c:23!
> >     ception 0x06 IP 10:ffffffff812ebbf8 error 0 cr2 0xffff88903ffff000
> >     CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc6+ #231 PREEMPT(undef)
> >     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
> >     RIP: 0010:__phys_addr+0x58/0x60
> >     Code: 01 48 89 c2 48 d3 ea 48 85 d2 75 05 e9 91 52 cf 00 0f 0b 48 3d ff ff ff 1f 77 0f 48 8b 05 20 54 55 01 48 01 d0 e9 78 52 cf 00 <0f> 0b 90 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
> >     RSP: 0000:ffffffff82803dd8 EFLAGS: 00010006 ORIG_RAX: 0000000000000000
> >     RAX: 000000007fffffff RBX: 00000000ffffffff RCX: 0000000000000000
> >     RDX: 000000007fffffff RSI: 0000000280000000 RDI: ffffffffffffffff
> >     RBP: ffffffff82803e68 R08: 0000000000000000 R09: 0000000000000000
> >     R10: ffffffff83153180 R11: ffffffff82803e48 R12: ffffffff83c9aed0
> >     R13: 0000000000000000 R14: 0000001040000000 R15: 0000000000000000
> >     FS:  0000000000000000(0000) GS:0000000000000000(0000) knlGS:0000000000000000
> >     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >     CR2: ffff88903ffff000 CR3: 0000000002838000 CR4: 00000000000000b0
> >     Call Trace:
> >      <TASK>
> >      ? __cma_declare_contiguous_nid+0x6e/0x340
> >      ? cma_declare_contiguous_nid+0x33/0x70
> >      ? dma_contiguous_reserve_area+0x2f/0x70
> >      ? setup_arch+0x6f1/0x870
> >      ? start_kernel+0x52/0x4b0
> >      ? x86_64_start_reservations+0x29/0x30
> >      ? x86_64_start_kernel+0x7c/0x80
> >      ? common_startup_64+0x13e/0x141
> >
> >   The reason is that __cma_declare_contiguous_nid() does:
> >
> >           highmem_start = __pa(high_memory - 1) + 1;
> >
> >   If dma_contiguous_reserve_area() (or any other CMA declaration) is
> >   called before free_area_init(), high_memory is uninitialized. Without
> >   CONFIG_DEBUG_VIRTUAL, it will likely work but use the wrong value for
> >   highmem_start.
> >
> > The issue occurs because commit e120d1bc12da ("arch, mm: set high_memory in
> > free_area_init()") moved initialization of high_memory after the call to
> > dma_contiguous_reserve() -> __cma_declare_contiguous_nid() on several
> > architectures.
> >
> > In the case CONFIG_HIGHMEM is enabled, some architectures that actually
> > support HIGHMEM (arm, powerpc and x86) have initialization of high_memory
> > before a possible call to __cma_declare_contiguous_nid() and some
> > initialized high_memory late anyway (arc, csky, microblase, mips, sparc,
> > xtensa) even before the commit e120d1bc12da so they are fine with using
> > uninitialized value of high_memory.
> 
> I don't know if they are fine or they haven't realized this is a bug
> yet.

For those that initialized high_memory in their mem_init() it would have
been a bug quite some time.

> Either way, this patch fixes the crash for me on x86_64, restores how it
> should behave, and doesn't seem to make anything worse, so:
> 
> Tested-by: Pratyush Yadav <ptyadav@amazon.de>

Thanks!
 
> Thanks for fixing this!
> 
> -- 
> Regards,
> Pratyush Yadav

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/cma: make detection of highmem_start more robust
  2025-05-19 17:18 [PATCH] mm/cma: make detection of highmem_start more robust Mike Rapoport
  2025-05-19 21:55 ` Pratyush Yadav
@ 2025-05-20  8:30 ` Oscar Salvador
  2025-05-20  9:14   ` David Hildenbrand
  1 sibling, 1 reply; 8+ messages in thread
From: Oscar Salvador @ 2025-05-20  8:30 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, Alexandre Ghiti, Pratyush Yadav, linux-arch,
	linux-kernel, linux-mm

On Mon, May 19, 2025 at 08:18:05PM +0300, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> 
> Pratyush Yadav reports the following crash:
> 
>     ------------[ cut here ]------------
>     kernel BUG at arch/x86/mm/physaddr.c:23!
>     ception 0x06 IP 10:ffffffff812ebbf8 error 0 cr2 0xffff88903ffff000
>     CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc6+ #231 PREEMPT(undef)
>     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
>     RIP: 0010:__phys_addr+0x58/0x60
>     Code: 01 48 89 c2 48 d3 ea 48 85 d2 75 05 e9 91 52 cf 00 0f 0b 48 3d ff ff ff 1f 77 0f 48 8b 05 20 54 55 01 48 01 d0 e9 78 52 cf 00 <0f> 0b 90 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
>     RSP: 0000:ffffffff82803dd8 EFLAGS: 00010006 ORIG_RAX: 0000000000000000
>     RAX: 000000007fffffff RBX: 00000000ffffffff RCX: 0000000000000000
>     RDX: 000000007fffffff RSI: 0000000280000000 RDI: ffffffffffffffff
>     RBP: ffffffff82803e68 R08: 0000000000000000 R09: 0000000000000000
>     R10: ffffffff83153180 R11: ffffffff82803e48 R12: ffffffff83c9aed0
>     R13: 0000000000000000 R14: 0000001040000000 R15: 0000000000000000
>     FS:  0000000000000000(0000) GS:0000000000000000(0000) knlGS:0000000000000000
>     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>     CR2: ffff88903ffff000 CR3: 0000000002838000 CR4: 00000000000000b0
>     Call Trace:
>      <TASK>
>      ? __cma_declare_contiguous_nid+0x6e/0x340
>      ? cma_declare_contiguous_nid+0x33/0x70
>      ? dma_contiguous_reserve_area+0x2f/0x70
>      ? setup_arch+0x6f1/0x870
>      ? start_kernel+0x52/0x4b0
>      ? x86_64_start_reservations+0x29/0x30
>      ? x86_64_start_kernel+0x7c/0x80
>      ? common_startup_64+0x13e/0x141
> 
>   The reason is that __cma_declare_contiguous_nid() does:
> 
>           highmem_start = __pa(high_memory - 1) + 1;
> 
>   If dma_contiguous_reserve_area() (or any other CMA declaration) is
>   called before free_area_init(), high_memory is uninitialized. Without
>   CONFIG_DEBUG_VIRTUAL, it will likely work but use the wrong value for
>   highmem_start.
> 
> The issue occurs because commit e120d1bc12da ("arch, mm: set high_memory in
> free_area_init()") moved initialization of high_memory after the call to
> dma_contiguous_reserve() -> __cma_declare_contiguous_nid() on several
> architectures.
> 
> In the case CONFIG_HIGHMEM is enabled, some architectures that actually
> support HIGHMEM (arm, powerpc and x86) have initialization of high_memory
> before a possible call to __cma_declare_contiguous_nid() and some
> initialized high_memory late anyway (arc, csky, microblase, mips, sparc,
> xtensa) even before the commit e120d1bc12da so they are fine with using
> uninitialized value of high_memory.
> 
> And in the case CONFIG_HIGHMEM is disabled high_memory essentially becomes
> the first address after memory end, so instead of relying on high_memory to
> calculate highmem_start use memblock_end_of_DRAM() and eliminate the
> dependency of CMA area creation on high_memory in majority of
> configurations.
> 
> Reported-by: Pratyush Yadav <ptyadav@amazon.de>
> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

I will note though that it is a bit akward to have highmem involved here
when we might not have CONFIG_HIGHMEM enabled.
I get that for !CONFIG_HIGHMEM it is a no-op situation, but still I
wonder whether we could abstract that from this function.


-- 
Oscar Salvador
SUSE Labs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/cma: make detection of highmem_start more robust
  2025-05-20  8:30 ` Oscar Salvador
@ 2025-05-20  9:14   ` David Hildenbrand
  2025-05-20 15:06     ` Mike Rapoport
  0 siblings, 1 reply; 8+ messages in thread
From: David Hildenbrand @ 2025-05-20  9:14 UTC (permalink / raw)
  To: Oscar Salvador, Mike Rapoport
  Cc: Andrew Morton, Alexandre Ghiti, Pratyush Yadav, linux-arch,
	linux-kernel, linux-mm

On 20.05.25 10:30, Oscar Salvador wrote:
> On Mon, May 19, 2025 at 08:18:05PM +0300, Mike Rapoport wrote:
>> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
>>
>> Pratyush Yadav reports the following crash:
>>
>>      ------------[ cut here ]------------
>>      kernel BUG at arch/x86/mm/physaddr.c:23!
>>      ception 0x06 IP 10:ffffffff812ebbf8 error 0 cr2 0xffff88903ffff000
>>      CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc6+ #231 PREEMPT(undef)
>>      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
>>      RIP: 0010:__phys_addr+0x58/0x60
>>      Code: 01 48 89 c2 48 d3 ea 48 85 d2 75 05 e9 91 52 cf 00 0f 0b 48 3d ff ff ff 1f 77 0f 48 8b 05 20 54 55 01 48 01 d0 e9 78 52 cf 00 <0f> 0b 90 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
>>      RSP: 0000:ffffffff82803dd8 EFLAGS: 00010006 ORIG_RAX: 0000000000000000
>>      RAX: 000000007fffffff RBX: 00000000ffffffff RCX: 0000000000000000
>>      RDX: 000000007fffffff RSI: 0000000280000000 RDI: ffffffffffffffff
>>      RBP: ffffffff82803e68 R08: 0000000000000000 R09: 0000000000000000
>>      R10: ffffffff83153180 R11: ffffffff82803e48 R12: ffffffff83c9aed0
>>      R13: 0000000000000000 R14: 0000001040000000 R15: 0000000000000000
>>      FS:  0000000000000000(0000) GS:0000000000000000(0000) knlGS:0000000000000000
>>      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>      CR2: ffff88903ffff000 CR3: 0000000002838000 CR4: 00000000000000b0
>>      Call Trace:
>>       <TASK>
>>       ? __cma_declare_contiguous_nid+0x6e/0x340
>>       ? cma_declare_contiguous_nid+0x33/0x70
>>       ? dma_contiguous_reserve_area+0x2f/0x70
>>       ? setup_arch+0x6f1/0x870
>>       ? start_kernel+0x52/0x4b0
>>       ? x86_64_start_reservations+0x29/0x30
>>       ? x86_64_start_kernel+0x7c/0x80
>>       ? common_startup_64+0x13e/0x141
>>
>>    The reason is that __cma_declare_contiguous_nid() does:
>>
>>            highmem_start = __pa(high_memory - 1) + 1;
>>
>>    If dma_contiguous_reserve_area() (or any other CMA declaration) is
>>    called before free_area_init(), high_memory is uninitialized. Without
>>    CONFIG_DEBUG_VIRTUAL, it will likely work but use the wrong value for
>>    highmem_start.
>>
>> The issue occurs because commit e120d1bc12da ("arch, mm: set high_memory in
>> free_area_init()") moved initialization of high_memory after the call to
>> dma_contiguous_reserve() -> __cma_declare_contiguous_nid() on several
>> architectures.
>>
>> In the case CONFIG_HIGHMEM is enabled, some architectures that actually
>> support HIGHMEM (arm, powerpc and x86) have initialization of high_memory
>> before a possible call to __cma_declare_contiguous_nid() and some
>> initialized high_memory late anyway (arc, csky, microblase, mips, sparc,
>> xtensa) even before the commit e120d1bc12da so they are fine with using
>> uninitialized value of high_memory.
>>
>> And in the case CONFIG_HIGHMEM is disabled high_memory essentially becomes
>> the first address after memory end, so instead of relying on high_memory to
>> calculate highmem_start use memblock_end_of_DRAM() and eliminate the
>> dependency of CMA area creation on high_memory in majority of
>> configurations.
>>
>> Reported-by: Pratyush Yadav <ptyadav@amazon.de>
>> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
>> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> 
> Reviewed-by: Oscar Salvador <osalvador@suse.de>
> 
> I will note though that it is a bit akward to have highmem involved here
> when we might not have CONFIG_HIGHMEM enabled.
> I get that for !CONFIG_HIGHMEM it is a no-op situation, but still I
> wonder whether we could abstract that from this function.

Same thought here.

Can't we do some IS_ENABLED(CONFIG_HIGHMEM) magic or similar to not even 
use that variable without CONFIG_HIGHMEM?

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/cma: make detection of highmem_start more robust
  2025-05-20  9:14   ` David Hildenbrand
@ 2025-05-20 15:06     ` Mike Rapoport
  2025-05-20 15:08       ` David Hildenbrand
  0 siblings, 1 reply; 8+ messages in thread
From: Mike Rapoport @ 2025-05-20 15:06 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Oscar Salvador, Andrew Morton, Alexandre Ghiti, Pratyush Yadav,
	linux-arch, linux-kernel, linux-mm

On Tue, May 20, 2025 at 11:14:28AM +0200, David Hildenbrand wrote:
> On 20.05.25 10:30, Oscar Salvador wrote:
> > On Mon, May 19, 2025 at 08:18:05PM +0300, Mike Rapoport wrote:
> > > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> > > 
> > > Pratyush Yadav reports the following crash:
> > > 
> > >      ------------[ cut here ]------------
> > >      kernel BUG at arch/x86/mm/physaddr.c:23!
> > >      ception 0x06 IP 10:ffffffff812ebbf8 error 0 cr2 0xffff88903ffff000
> > >      CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc6+ #231 PREEMPT(undef)
> > >      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
> > >      RIP: 0010:__phys_addr+0x58/0x60
> > >      Code: 01 48 89 c2 48 d3 ea 48 85 d2 75 05 e9 91 52 cf 00 0f 0b 48 3d ff ff ff 1f 77 0f 48 8b 05 20 54 55 01 48 01 d0 e9 78 52 cf 00 <0f> 0b 90 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
> > >      RSP: 0000:ffffffff82803dd8 EFLAGS: 00010006 ORIG_RAX: 0000000000000000
> > >      RAX: 000000007fffffff RBX: 00000000ffffffff RCX: 0000000000000000
> > >      RDX: 000000007fffffff RSI: 0000000280000000 RDI: ffffffffffffffff
> > >      RBP: ffffffff82803e68 R08: 0000000000000000 R09: 0000000000000000
> > >      R10: ffffffff83153180 R11: ffffffff82803e48 R12: ffffffff83c9aed0
> > >      R13: 0000000000000000 R14: 0000001040000000 R15: 0000000000000000
> > >      FS:  0000000000000000(0000) GS:0000000000000000(0000) knlGS:0000000000000000
> > >      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > >      CR2: ffff88903ffff000 CR3: 0000000002838000 CR4: 00000000000000b0
> > >      Call Trace:
> > >       <TASK>
> > >       ? __cma_declare_contiguous_nid+0x6e/0x340
> > >       ? cma_declare_contiguous_nid+0x33/0x70
> > >       ? dma_contiguous_reserve_area+0x2f/0x70
> > >       ? setup_arch+0x6f1/0x870
> > >       ? start_kernel+0x52/0x4b0
> > >       ? x86_64_start_reservations+0x29/0x30
> > >       ? x86_64_start_kernel+0x7c/0x80
> > >       ? common_startup_64+0x13e/0x141
> > > 
> > >    The reason is that __cma_declare_contiguous_nid() does:
> > > 
> > >            highmem_start = __pa(high_memory - 1) + 1;
> > > 
> > >    If dma_contiguous_reserve_area() (or any other CMA declaration) is
> > >    called before free_area_init(), high_memory is uninitialized. Without
> > >    CONFIG_DEBUG_VIRTUAL, it will likely work but use the wrong value for
> > >    highmem_start.
> > > 
> > > The issue occurs because commit e120d1bc12da ("arch, mm: set high_memory in
> > > free_area_init()") moved initialization of high_memory after the call to
> > > dma_contiguous_reserve() -> __cma_declare_contiguous_nid() on several
> > > architectures.
> > > 
> > > In the case CONFIG_HIGHMEM is enabled, some architectures that actually
> > > support HIGHMEM (arm, powerpc and x86) have initialization of high_memory
> > > before a possible call to __cma_declare_contiguous_nid() and some
> > > initialized high_memory late anyway (arc, csky, microblase, mips, sparc,
> > > xtensa) even before the commit e120d1bc12da so they are fine with using
> > > uninitialized value of high_memory.
> > > 
> > > And in the case CONFIG_HIGHMEM is disabled high_memory essentially becomes
> > > the first address after memory end, so instead of relying on high_memory to
> > > calculate highmem_start use memblock_end_of_DRAM() and eliminate the
> > > dependency of CMA area creation on high_memory in majority of
> > > configurations.
> > > 
> > > Reported-by: Pratyush Yadav <ptyadav@amazon.de>
> > > Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> > > Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> > 
> > Reviewed-by: Oscar Salvador <osalvador@suse.de>
> > 
> > I will note though that it is a bit akward to have highmem involved here
> > when we might not have CONFIG_HIGHMEM enabled.
> > I get that for !CONFIG_HIGHMEM it is a no-op situation, but still I
> > wonder whether we could abstract that from this function.

Highmem is there for some time now (see f7426b983a6a ("mm: cma: adjust
address limit to avoid hitting low/high memory boundary"))
We might try abstracting it from that function but I'd prefer not doing it
that late in the release cycle.
 
> Same thought here.
> 
> Can't we do some IS_ENABLED(CONFIG_HIGHMEM) magic or similar to not even use
> that variable without CONFIG_HIGHMEM?

You mean highmem_start or high_memory?

high_memory is one of the ways to say "end of directly/linearly addressable
memory" and some other places in the kernel (outside arch) still use it
regardless of CONFIG_HIGHMEM.

And I don't think we have another way to say where directly addressable
memory ends, and this IMHO is something that should replace high_memory.
 
> -- 
> Cheers,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/cma: make detection of highmem_start more robust
  2025-05-20 15:06     ` Mike Rapoport
@ 2025-05-20 15:08       ` David Hildenbrand
  0 siblings, 0 replies; 8+ messages in thread
From: David Hildenbrand @ 2025-05-20 15:08 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Oscar Salvador, Andrew Morton, Alexandre Ghiti, Pratyush Yadav,
	linux-arch, linux-kernel, linux-mm

On 20.05.25 17:06, Mike Rapoport wrote:
> On Tue, May 20, 2025 at 11:14:28AM +0200, David Hildenbrand wrote:
>> On 20.05.25 10:30, Oscar Salvador wrote:
>>> On Mon, May 19, 2025 at 08:18:05PM +0300, Mike Rapoport wrote:
>>>> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
>>>>
>>>> Pratyush Yadav reports the following crash:
>>>>
>>>>       ------------[ cut here ]------------
>>>>       kernel BUG at arch/x86/mm/physaddr.c:23!
>>>>       ception 0x06 IP 10:ffffffff812ebbf8 error 0 cr2 0xffff88903ffff000
>>>>       CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc6+ #231 PREEMPT(undef)
>>>>       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
>>>>       RIP: 0010:__phys_addr+0x58/0x60
>>>>       Code: 01 48 89 c2 48 d3 ea 48 85 d2 75 05 e9 91 52 cf 00 0f 0b 48 3d ff ff ff 1f 77 0f 48 8b 05 20 54 55 01 48 01 d0 e9 78 52 cf 00 <0f> 0b 90 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
>>>>       RSP: 0000:ffffffff82803dd8 EFLAGS: 00010006 ORIG_RAX: 0000000000000000
>>>>       RAX: 000000007fffffff RBX: 00000000ffffffff RCX: 0000000000000000
>>>>       RDX: 000000007fffffff RSI: 0000000280000000 RDI: ffffffffffffffff
>>>>       RBP: ffffffff82803e68 R08: 0000000000000000 R09: 0000000000000000
>>>>       R10: ffffffff83153180 R11: ffffffff82803e48 R12: ffffffff83c9aed0
>>>>       R13: 0000000000000000 R14: 0000001040000000 R15: 0000000000000000
>>>>       FS:  0000000000000000(0000) GS:0000000000000000(0000) knlGS:0000000000000000
>>>>       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>       CR2: ffff88903ffff000 CR3: 0000000002838000 CR4: 00000000000000b0
>>>>       Call Trace:
>>>>        <TASK>
>>>>        ? __cma_declare_contiguous_nid+0x6e/0x340
>>>>        ? cma_declare_contiguous_nid+0x33/0x70
>>>>        ? dma_contiguous_reserve_area+0x2f/0x70
>>>>        ? setup_arch+0x6f1/0x870
>>>>        ? start_kernel+0x52/0x4b0
>>>>        ? x86_64_start_reservations+0x29/0x30
>>>>        ? x86_64_start_kernel+0x7c/0x80
>>>>        ? common_startup_64+0x13e/0x141
>>>>
>>>>     The reason is that __cma_declare_contiguous_nid() does:
>>>>
>>>>             highmem_start = __pa(high_memory - 1) + 1;
>>>>
>>>>     If dma_contiguous_reserve_area() (or any other CMA declaration) is
>>>>     called before free_area_init(), high_memory is uninitialized. Without
>>>>     CONFIG_DEBUG_VIRTUAL, it will likely work but use the wrong value for
>>>>     highmem_start.
>>>>
>>>> The issue occurs because commit e120d1bc12da ("arch, mm: set high_memory in
>>>> free_area_init()") moved initialization of high_memory after the call to
>>>> dma_contiguous_reserve() -> __cma_declare_contiguous_nid() on several
>>>> architectures.
>>>>
>>>> In the case CONFIG_HIGHMEM is enabled, some architectures that actually
>>>> support HIGHMEM (arm, powerpc and x86) have initialization of high_memory
>>>> before a possible call to __cma_declare_contiguous_nid() and some
>>>> initialized high_memory late anyway (arc, csky, microblase, mips, sparc,
>>>> xtensa) even before the commit e120d1bc12da so they are fine with using
>>>> uninitialized value of high_memory.
>>>>
>>>> And in the case CONFIG_HIGHMEM is disabled high_memory essentially becomes
>>>> the first address after memory end, so instead of relying on high_memory to
>>>> calculate highmem_start use memblock_end_of_DRAM() and eliminate the
>>>> dependency of CMA area creation on high_memory in majority of
>>>> configurations.
>>>>
>>>> Reported-by: Pratyush Yadav <ptyadav@amazon.de>
>>>> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
>>>> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
>>>
>>> Reviewed-by: Oscar Salvador <osalvador@suse.de>
>>>
>>> I will note though that it is a bit akward to have highmem involved here
>>> when we might not have CONFIG_HIGHMEM enabled.
>>> I get that for !CONFIG_HIGHMEM it is a no-op situation, but still I
>>> wonder whether we could abstract that from this function.
> 
> Highmem is there for some time now (see f7426b983a6a ("mm: cma: adjust
> address limit to avoid hitting low/high memory boundary"))
> We might try abstracting it from that function but I'd prefer not doing it
> that late in the release cycle.

Agreed, assuming this will still make it into this release.

>   
>> Same thought here.
>>
>> Can't we do some IS_ENABLED(CONFIG_HIGHMEM) magic or similar to not even use
>> that variable without CONFIG_HIGHMEM?
> 
> You mean highmem_start or high_memory?

highmem_start in this function.

> 
> high_memory is one of the ways to say "end of directly/linearly addressable
> memory" and some other places in the kernel (outside arch) still use it
> regardless of CONFIG_HIGHMEM.
> 
> And I don't think we have another way to say where directly addressable
> memory ends, and this IMHO is something that should replace high_memory.

Agreed.


-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm/cma: make detection of highmem_start more robust
  2025-05-20  5:45   ` Mike Rapoport
@ 2025-05-22 13:31     ` Pratyush Yadav
  0 siblings, 0 replies; 8+ messages in thread
From: Pratyush Yadav @ 2025-05-22 13:31 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: Andrew Morton, Alexandre Ghiti, linux-arch, linux-kernel,
	linux-mm

Hi Mike,

On Tue, May 20 2025, Mike Rapoport wrote:

> On Mon, May 19, 2025 at 11:55:05PM +0200, Pratyush Yadav wrote:
>> Hi Mike,
>>
>> On Mon, May 19 2025, Mike Rapoport wrote:
>>
>> > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
>> >
>> > Pratyush Yadav reports the following crash:
>> >
>> >     ------------[ cut here ]------------
>> >     kernel BUG at arch/x86/mm/physaddr.c:23!
>> >     ception 0x06 IP 10:ffffffff812ebbf8 error 0 cr2 0xffff88903ffff000
>> >     CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.15.0-rc6+ #231 PREEMPT(undef)
>> >     Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
>> >     RIP: 0010:__phys_addr+0x58/0x60
>> >     Code: 01 48 89 c2 48 d3 ea 48 85 d2 75 05 e9 91 52 cf 00 0f 0b 48 3d ff ff ff 1f 77 0f 48 8b 05 20 54 55 01 48 01 d0 e9 78 52 cf 00 <0f> 0b 90 0f 1f 44 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90
>> >     RSP: 0000:ffffffff82803dd8 EFLAGS: 00010006 ORIG_RAX: 0000000000000000
>> >     RAX: 000000007fffffff RBX: 00000000ffffffff RCX: 0000000000000000
>> >     RDX: 000000007fffffff RSI: 0000000280000000 RDI: ffffffffffffffff
>> >     RBP: ffffffff82803e68 R08: 0000000000000000 R09: 0000000000000000
>> >     R10: ffffffff83153180 R11: ffffffff82803e48 R12: ffffffff83c9aed0
>> >     R13: 0000000000000000 R14: 0000001040000000 R15: 0000000000000000
>> >     FS:  0000000000000000(0000) GS:0000000000000000(0000) knlGS:0000000000000000
>> >     CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> >     CR2: ffff88903ffff000 CR3: 0000000002838000 CR4: 00000000000000b0
>> >     Call Trace:
>> >      <TASK>
>> >      ? __cma_declare_contiguous_nid+0x6e/0x340
>> >      ? cma_declare_contiguous_nid+0x33/0x70
>> >      ? dma_contiguous_reserve_area+0x2f/0x70
>> >      ? setup_arch+0x6f1/0x870
>> >      ? start_kernel+0x52/0x4b0
>> >      ? x86_64_start_reservations+0x29/0x30
>> >      ? x86_64_start_kernel+0x7c/0x80
>> >      ? common_startup_64+0x13e/0x141
>> >
>> >   The reason is that __cma_declare_contiguous_nid() does:
>> >
>> >           highmem_start = __pa(high_memory - 1) + 1;
>> >
>> >   If dma_contiguous_reserve_area() (or any other CMA declaration) is
>> >   called before free_area_init(), high_memory is uninitialized. Without
>> >   CONFIG_DEBUG_VIRTUAL, it will likely work but use the wrong value for
>> >   highmem_start.
>> >
>> > The issue occurs because commit e120d1bc12da ("arch, mm: set high_memory in
>> > free_area_init()") moved initialization of high_memory after the call to
>> > dma_contiguous_reserve() -> __cma_declare_contiguous_nid() on several
>> > architectures.
>> >
>> > In the case CONFIG_HIGHMEM is enabled, some architectures that actually
>> > support HIGHMEM (arm, powerpc and x86) have initialization of high_memory
>> > before a possible call to __cma_declare_contiguous_nid() and some
>> > initialized high_memory late anyway (arc, csky, microblase, mips, sparc,
>> > xtensa) even before the commit e120d1bc12da so they are fine with using
>> > uninitialized value of high_memory.
>>
>> I don't know if they are fine or they haven't realized this is a bug
>> yet.
>
> For those that initialized high_memory in their mem_init() it would have
> been a bug quite some time.

Agreed. This patch fixes the regression caused by e120d1bc12da5 ("arch,
mm: set high_memory in free_area_init()"). I don't think you need to go
around fixing long standing bugs. I was just thinking out loud with this
comment.

[...]

-- 
Regards,
Pratyush Yadav



Amazon Web Services Development Center Germany GmbH
Tamara-Danz-Str. 13
10243 Berlin
Geschaeftsfuehrung: Christian Schlaeger, Jonathan Weiss
Eingetragen am Amtsgericht Charlottenburg unter HRB 257764 B
Sitz: Berlin
Ust-ID: DE 365 538 597


^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-05-22 13:31 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-19 17:18 [PATCH] mm/cma: make detection of highmem_start more robust Mike Rapoport
2025-05-19 21:55 ` Pratyush Yadav
2025-05-20  5:45   ` Mike Rapoport
2025-05-22 13:31     ` Pratyush Yadav
2025-05-20  8:30 ` Oscar Salvador
2025-05-20  9:14   ` David Hildenbrand
2025-05-20 15:06     ` Mike Rapoport
2025-05-20 15:08       ` David Hildenbrand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).