linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH -next v2] crash: Fix riscv64 crash memory reserve dead loop
@ 2024-08-12  6:20 Jinjie Ruan
  2024-08-12 10:39 ` Catalin Marinas
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Jinjie Ruan @ 2024-08-12  6:20 UTC
  To: catalin.marinas, bhe, vgoyal, dyoung, paul.walmsley, palmer, aou,
	akpm, linux-kernel, kexec, linux-riscv, linux-arm-kernel,
	linux-arch
  Cc: ruanjinjie

On a RISCV64 QEMU machine with 512MB of memory, the cmdline
"crashkernel=500M,high" causes the system to stall as below:

	 Zone ranges:
	   DMA32    [mem 0x0000000080000000-0x000000009fffffff]
	   Normal   empty
	 Movable zone start for each node
	 Early memory node ranges
	   node   0: [mem 0x0000000080000000-0x000000008005ffff]
	   node   0: [mem 0x0000000080060000-0x000000009fffffff]
	 Initmem setup node 0 [mem 0x0000000080000000-0x000000009fffffff]
	(stall here)

Commit 5d99cadf1568 ("crash: fix x86_32 crash memory reserve dead loop
bug") fixed this on 32-bit architectures. However, the problem is not
completely solved. If `CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX` on a
64-bit architecture, for example, when system memory is equal to
CRASH_ADDR_LOW_MAX on RISCV64, the following infinite loop also occurs:

	-> reserve_crashkernel_generic() and high is true
	   -> alloc at [CRASH_ADDR_LOW_MAX, CRASH_ADDR_HIGH_MAX] fails
	      -> alloc at [0, CRASH_ADDR_LOW_MAX] fails and is retried forever
	         (because CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX, so the
	         search_end == CRASH_ADDR_HIGH_MAX check stays true).

As Catalin suggested, do not remove the ",high" reservation fallback to
",low" logic, since removing it would change arm64's kdump behavior.
Instead, fix it by skipping the retry in the above situation, similar to
commit d2f32f23190b ("crash: fix x86_32 crash memory reserve dead loop").
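
For illustration only, the retry path described above can be modelled in
user space. This is a simplified sketch, not the kernel function: it
covers only the ",high" branch, model_alloc() and model_reserve_high()
are hypothetical names, the allocator is assumed to always fail, and
max_attempts is an artificial cap so the endless loop can be observed.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's memblock allocation: it always
 * fails, modelling a crashkernel size that fits in neither range.
 */
static uint64_t model_alloc(uint64_t size, uint64_t base, uint64_t end)
{
	(void)size;
	(void)base;
	(void)end;
	return 0;	/* 0 means "allocation failed" */
}

/*
 * Simplified model of the ",high" retry path. Returns the number of
 * allocation attempts; max_attempts caps the otherwise endless loop so
 * the unpatched behaviour can be observed from user space.
 */
static int model_reserve_high(uint64_t low_max, uint64_t high_max,
			      bool patched, int max_attempts)
{
	uint64_t size = 500ULL << 20;	/* crashkernel=500M,high */
	uint64_t search_base = low_max;
	uint64_t search_end = high_max;
	int attempts = 0;

retry:
	if (attempts >= max_attempts)
		return attempts;	/* stand-in for "stalls forever" */
	attempts++;

	if (!model_alloc(size, search_base, search_end)) {
		if (search_end == high_max) {
			/* fall back from the high range to the low range */
			search_end = low_max;
			search_base = 0;
			/* the patch retries only if the range really changed */
			if (!patched || search_end != high_max)
				goto retry;
		}
		/* the kernel prints "cannot allocate crashkernel" here */
	}
	return attempts;
}
```

With low_max == high_max (for example both 0xa0000000, the end of memory
in the log above), the unpatched model runs until it hits the cap, while
the patched model gives up after a single attempt. With distinct limits,
both variants make exactly two attempts: the high range, then the low
range.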

After this patch, it prints:
	cannot allocate crashkernel (size:0x1f400000)

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
---
v2:
- Fix it in another way suggested by Catalin.
- Add Suggested-by.
---
 kernel/crash_reserve.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
index 5387269114f6..aae4a9e998d1 100644
--- a/kernel/crash_reserve.c
+++ b/kernel/crash_reserve.c
@@ -427,7 +427,8 @@ void __init reserve_crashkernel_generic(char *cmdline,
 		if (high && search_end == CRASH_ADDR_HIGH_MAX) {
 			search_end = CRASH_ADDR_LOW_MAX;
 			search_base = 0;
-			goto retry;
+			if (search_end != CRASH_ADDR_HIGH_MAX)
+				goto retry;
 		}
 		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
 			crash_size);
-- 
2.34.1




* Re: [PATCH -next v2] crash: Fix riscv64 crash memory reserve dead loop
  2024-08-12  6:20 [PATCH -next v2] crash: Fix riscv64 crash memory reserve dead loop Jinjie Ruan
@ 2024-08-12 10:39 ` Catalin Marinas
  2024-08-13  3:31 ` Baoquan He
  2024-09-24  6:40 ` patchwork-bot+linux-riscv
  2 siblings, 0 replies; 4+ messages in thread
From: Catalin Marinas @ 2024-08-12 10:39 UTC
  To: Jinjie Ruan
  Cc: bhe, vgoyal, dyoung, paul.walmsley, palmer, aou, akpm,
	linux-kernel, kexec, linux-riscv, linux-arm-kernel, linux-arch

On Mon, Aug 12, 2024 at 02:20:17PM +0800, Jinjie Ruan wrote:
> On RISCV64 Qemu machine with 512MB memory, cmdline "crashkernel=500M,high"
> will cause system stall as below:
> 
> 	 Zone ranges:
> 	   DMA32    [mem 0x0000000080000000-0x000000009fffffff]
> 	   Normal   empty
> 	 Movable zone start for each node
> 	 Early memory node ranges
> 	   node   0: [mem 0x0000000080000000-0x000000008005ffff]
> 	   node   0: [mem 0x0000000080060000-0x000000009fffffff]
> 	 Initmem setup node 0 [mem 0x0000000080000000-0x000000009fffffff]
> 	(stall here)
> 
> commit 5d99cadf1568 ("crash: fix x86_32 crash memory reserve dead loop
> bug") fix this on 32-bit architecture. However, the problem is not
> completely solved. If `CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX` on 64-bit
> architecture, for example, when system memory is equal to
> CRASH_ADDR_LOW_MAX on RISCV64, the following infinite loop will also occur:
> 
> 	-> reserve_crashkernel_generic() and high is true
> 	   -> alloc at [CRASH_ADDR_LOW_MAX, CRASH_ADDR_HIGH_MAX] fail
> 	      -> alloc at [0, CRASH_ADDR_LOW_MAX] fail and repeatedly
> 	         (because CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX).
> 
> As Catalin suggested, do not remove the ",high" reservation fallback to
> ",low" logic which will change arm64's kdump behavior, but fix it by
> skipping the above situation similar to commit d2f32f23190b ("crash: fix
> x86_32 crash memory reserve dead loop").
> 
> After this patch, it print:
> 	cannot allocate crashkernel (size:0x1f400000)
> 
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Thanks.



* Re: [PATCH -next v2] crash: Fix riscv64 crash memory reserve dead loop
  2024-08-12  6:20 [PATCH -next v2] crash: Fix riscv64 crash memory reserve dead loop Jinjie Ruan
  2024-08-12 10:39 ` Catalin Marinas
@ 2024-08-13  3:31 ` Baoquan He
  2024-09-24  6:40 ` patchwork-bot+linux-riscv
  2 siblings, 0 replies; 4+ messages in thread
From: Baoquan He @ 2024-08-13  3:31 UTC
  To: Jinjie Ruan
  Cc: catalin.marinas, vgoyal, dyoung, paul.walmsley, palmer, aou, akpm,
	linux-kernel, kexec, linux-riscv, linux-arm-kernel, linux-arch

On 08/12/24 at 02:20pm, Jinjie Ruan wrote:
> On RISCV64 Qemu machine with 512MB memory, cmdline "crashkernel=500M,high"
> will cause system stall as below:
> 
> 	 Zone ranges:
> 	   DMA32    [mem 0x0000000080000000-0x000000009fffffff]
> 	   Normal   empty
> 	 Movable zone start for each node
> 	 Early memory node ranges
> 	   node   0: [mem 0x0000000080000000-0x000000008005ffff]
> 	   node   0: [mem 0x0000000080060000-0x000000009fffffff]
> 	 Initmem setup node 0 [mem 0x0000000080000000-0x000000009fffffff]
> 	(stall here)
> 
> commit 5d99cadf1568 ("crash: fix x86_32 crash memory reserve dead loop
> bug") fix this on 32-bit architecture. However, the problem is not
> completely solved. If `CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX` on 64-bit
> architecture, for example, when system memory is equal to
> CRASH_ADDR_LOW_MAX on RISCV64, the following infinite loop will also occur:
> 
> 	-> reserve_crashkernel_generic() and high is true
> 	   -> alloc at [CRASH_ADDR_LOW_MAX, CRASH_ADDR_HIGH_MAX] fail
> 	      -> alloc at [0, CRASH_ADDR_LOW_MAX] fail and repeatedly
> 	         (because CRASH_ADDR_LOW_MAX = CRASH_ADDR_HIGH_MAX).
> 
> As Catalin suggested, do not remove the ",high" reservation fallback to
> ",low" logic which will change arm64's kdump behavior, but fix it by
> skipping the above situation similar to commit d2f32f23190b ("crash: fix
> x86_32 crash memory reserve dead loop").
> 
> After this patch, it print:
> 	cannot allocate crashkernel (size:0x1f400000)
> 
> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
> Suggested-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
> v2:
> - Fix it in another way suggested by Catalin.
> - Add Suggested-by.
> ---
>  kernel/crash_reserve.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Acked-by: Baoquan He <bhe@redhat.com>

> 
> diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
> index 5387269114f6..aae4a9e998d1 100644
> --- a/kernel/crash_reserve.c
> +++ b/kernel/crash_reserve.c
> @@ -427,7 +427,8 @@ void __init reserve_crashkernel_generic(char *cmdline,
>  		if (high && search_end == CRASH_ADDR_HIGH_MAX) {
>  			search_end = CRASH_ADDR_LOW_MAX;
>  			search_base = 0;
> -			goto retry;
> +			if (search_end != CRASH_ADDR_HIGH_MAX)
> +				goto retry;
>  		}
>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>  			crash_size);
> -- 
> 2.34.1
> 




* Re: [PATCH -next v2] crash: Fix riscv64 crash memory reserve dead loop
  2024-08-12  6:20 [PATCH -next v2] crash: Fix riscv64 crash memory reserve dead loop Jinjie Ruan
  2024-08-12 10:39 ` Catalin Marinas
  2024-08-13  3:31 ` Baoquan He
@ 2024-09-24  6:40 ` patchwork-bot+linux-riscv
  2 siblings, 0 replies; 4+ messages in thread
From: patchwork-bot+linux-riscv @ 2024-09-24  6:40 UTC
  To: Jinjie Ruan
  Cc: linux-riscv, catalin.marinas, bhe, vgoyal, dyoung, paul.walmsley,
	palmer, aou, akpm, linux-kernel, kexec, linux-arm-kernel,
	linux-arch

Hello:

This patch was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <palmer@rivosinc.com>:

On Mon, 12 Aug 2024 14:20:17 +0800 you wrote:
> On RISCV64 Qemu machine with 512MB memory, cmdline "crashkernel=500M,high"
> will cause system stall as below:
> 
> 	 Zone ranges:
> 	   DMA32    [mem 0x0000000080000000-0x000000009fffffff]
> 	   Normal   empty
> 	 Movable zone start for each node
> 	 Early memory node ranges
> 	   node   0: [mem 0x0000000080000000-0x000000008005ffff]
> 	   node   0: [mem 0x0000000080060000-0x000000009fffffff]
> 	 Initmem setup node 0 [mem 0x0000000080000000-0x000000009fffffff]
> 	(stall here)
> 
> [...]

Here is the summary with links:
  - [-next,v2] crash: Fix riscv64 crash memory reserve dead loop
    https://git.kernel.org/riscv/c/b3f835cd7339

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



