linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v2 0/6] powerpc/crash: use generic crashkernel reservation
@ 2025-01-21 11:54 Sourabh Jain
  2025-01-21 11:54 ` [PATCH v2 1/6] kexec: Initialize ELF lowest address to ULONG_MAX Sourabh Jain
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Sourabh Jain @ 2025-01-21 11:54 UTC
  To: linuxppc-dev
  Cc: Sourabh Jain, Andrew Morton, Baoquan he, Hari Bathini,
	Madhavan Srinivasan, Michael Ellerman, kexec, linux-kernel

Commit 0ab97169aa05 ("crash_core: add generic function to do
reservation") added a generic function to reserve crashkernel memory.
So let's use the same function on powerpc and remove the
architecture-specific code that essentially does the same thing.

The generic crashkernel reservation also provides a way to split the
crashkernel reservation into high and low memory reservations, which can
be enabled for powerpc in the future.

Additionally, move powerpc to the generic APIs for locating memory holes
for kexec segments while loading the kdump kernel.

Patch series summary:
=====================
Patch 1-3: generic changes
Patch 4-5: powerpc changes
Patch 6:   generic + powerpc changes

Changelog:

v1 Resend:
 - Rebased on top of 6.13-rc6

v2:
 - 01/06 new patch, fixes a bug with ELF load address
   It was posted to the community as an individual patch, but
   since it aligns with this patch series it is included here:
   https://lore.kernel.org/all/20241210091314.185785-1-sourabhjain@linux.ibm.com/
 - Fixed a typo in 06/06
 - Added Acked-by and Reviewed-by tags
 - Rebased it on 6.13
 - It was suggested to try HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY and
   see if 03/06 patch can be avoided:
   https://lore.kernel.org/all/Z35gnO2N%2FLFt1E7E@MiWiFi-R3L-srv/
   But after adding HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY the issue
   persisted and kernel prints below logs on boot:

   [    0.646339] ------------[ cut here ]------------
   [    0.646351] WARNING: CPU: 0 PID: 1 at arch/powerpc/mm/mem.c:379
   add_system_ram_resources+0xf0/0x15c
   [    0.646369] Modules linked in:
   [    0.646376] CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted
   6.13.0-rc6+ #1
   [    0.646386] Hardware name: IBM,9105-22A POWER10 (architected)
   0x800200 0xf000006 of:IBM,FW1060.00 (NL1060_027) hv:phyp pSeries
   [    0.646397] NIP:  c000000002016b3c LR: c000000002016b34 CTR:
   0000000000000000
   [    0.646406] REGS: c000000004817920 TRAP: 0700   Not tainted
   (6.13.0-rc6+)
   e8/0x15c (unreliable)
   [    0.646603] [c000000004817c20] [c000000000010a4c]
   do_one_initcall+0x7c/0x39c
   [    0.646618] [c000000004817d00] [c000000002005418]
   do_initcalls+0x144/0x18c
   0x290
   [    0.646648] [c000000004817df0] [c0000000000110f4]
   kernel_init+0x2c/0x1b8
   +0x14/0x1c
   [    0.646676] --- interrupt: 0 at 0x0
   010020 
   [    0.646720] ---[ end trace 0000000000000000 ]---

   So 03/06 is kept in the patch series.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Baoquan he <bhe@redhat.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
CC: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: sourabhjain@linux.ibm.com

Sourabh Jain (6):
  kexec: Initialize ELF lowest address to ULONG_MAX
  crash: remove an unused argument from reserve_crashkernel_generic()
  crash: let arch decide crash memory export to iomem_resource
  powerpc/kdump: preserve user-specified memory limit
  powerpc/crash: use generic crashkernel reservation
  crash: option to let arch decide mem range is usable

 arch/arm64/mm/init.c                     |   6 +-
 arch/loongarch/kernel/setup.c            |   5 +-
 arch/powerpc/Kconfig                     |   3 +
 arch/powerpc/include/asm/crash_reserve.h |  18 ++
 arch/powerpc/include/asm/kexec.h         |  10 +-
 arch/powerpc/kernel/prom.c               |   2 +-
 arch/powerpc/kexec/core.c                |  96 ++++-----
 arch/powerpc/kexec/file_load_64.c        | 259 +----------------------
 arch/riscv/mm/init.c                     |   6 +-
 arch/x86/kernel/setup.c                  |   6 +-
 include/linux/crash_reserve.h            |  22 +-
 include/linux/kexec.h                    |   9 +
 kernel/crash_reserve.c                   |  12 +-
 kernel/kexec_elf.c                       |   2 +-
 kernel/kexec_file.c                      |  12 ++
 15 files changed, 128 insertions(+), 340 deletions(-)
 create mode 100644 arch/powerpc/include/asm/crash_reserve.h

-- 
2.47.1




* [PATCH v2 1/6] kexec: Initialize ELF lowest address to ULONG_MAX
  2025-01-21 11:54 [PATCH v2 0/6] powerpc/crash: use generic crashkernel reservation Sourabh Jain
@ 2025-01-21 11:54 ` Sourabh Jain
  2025-01-23 10:04   ` Hari Bathini
  2025-01-21 11:54 ` [PATCH v2 2/6] crash: remove an unused argument from reserve_crashkernel_generic() Sourabh Jain
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Sourabh Jain @ 2025-01-21 11:54 UTC
  To: linuxppc-dev
  Cc: Sourabh Jain, Andrew Morton, Hari Bathini, Madhavan Srinivasan,
	Mahesh Salgaonkar, Michael Ellerman, kexec, linux-kernel,
	Baoquan He

kexec_elf_load() loads an ELF executable and stores the address of the
lowest PT_LOAD segment in the location pointed to by the
lowest_load_addr function argument.

To determine the lowest PT_LOAD address, a local variable lowest_addr
(type unsigned long) is initialized to UINT_MAX. After loading each
PT_LOAD, its address is compared to lowest_addr. If a loaded PT_LOAD
address is lower, lowest_addr is updated. However, setting lowest_addr
to UINT_MAX won't work when the kernel image is loaded above 4G, as the
returned lowest PT_LOAD address would be invalid. This is resolved by
initializing lowest_addr to ULONG_MAX instead.
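
The effect can be reproduced outside the kernel. The following is a
minimal userspace sketch, not the kernel code: lowest_load_addr() is a
hypothetical helper whose third argument stands in for the initial value
of lowest_addr, and it assumes a 64-bit (LP64) build where unsigned long
is wider than unsigned int.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Return the lowest value in addrs[], starting the comparison from init.
 * init models the value lowest_addr is initialized to before the
 * PT_LOAD addresses are compared against it (hypothetical helper).
 */
static unsigned long lowest_load_addr(const unsigned long *addrs, size_t n,
				      unsigned long init)
{
	unsigned long lowest = init;
	size_t i;

	for (i = 0; i < n; i++) {
		/* Only addresses below the current minimum replace it. */
		if (addrs[i] < lowest)
			lowest = addrs[i];
	}

	return lowest;
}
```

With two segments at 0x180000000 and 0x140000000 (both above 4G),
seeding with UINT_MAX leaves the result at 0xffffffff because neither
address is lower than the seed, whereas seeding with ULONG_MAX yields
0x140000000.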

This issue was discovered while implementing crashkernel high/low
reservation on the PowerPC architecture.

Fixes: a0458284f062 ("powerpc: Add support code for kexec_file_load()")
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: kexec@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 kernel/kexec_elf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/kexec_elf.c b/kernel/kexec_elf.c
index d3689632e8b9..3a5c25b2adc9 100644
--- a/kernel/kexec_elf.c
+++ b/kernel/kexec_elf.c
@@ -390,7 +390,7 @@ int kexec_elf_load(struct kimage *image, struct elfhdr *ehdr,
 			 struct kexec_buf *kbuf,
 			 unsigned long *lowest_load_addr)
 {
-	unsigned long lowest_addr = UINT_MAX;
+	unsigned long lowest_addr = ULONG_MAX;
 	int ret;
 	size_t i;
 
-- 
2.47.1




* [PATCH v2 2/6] crash: remove an unused argument from reserve_crashkernel_generic()
  2025-01-21 11:54 [PATCH v2 0/6] powerpc/crash: use generic crashkernel reservation Sourabh Jain
  2025-01-21 11:54 ` [PATCH v2 1/6] kexec: Initialize ELF lowest address to ULONG_MAX Sourabh Jain
@ 2025-01-21 11:54 ` Sourabh Jain
  2025-01-23 10:13   ` Hari Bathini
  2025-01-21 11:54 ` [PATCH v2 3/6] crash: let arch decide crash memory export to iomem_resource Sourabh Jain
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Sourabh Jain @ 2025-01-21 11:54 UTC
  To: linuxppc-dev
  Cc: Sourabh Jain, Andrew Morton, Hari Bathini, Madhavan Srinivasan,
	Mahesh Salgaonkar, Michael Ellerman, kexec, linux-kernel,
	Baoquan He

The cmdline argument is not used in reserve_crashkernel_generic(), so
remove it and update all the callers accordingly.

No functional change intended.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/arm64/mm/init.c          |  6 ++----
 arch/loongarch/kernel/setup.c |  5 ++---
 arch/riscv/mm/init.c          |  6 ++----
 arch/x86/kernel/setup.c       |  6 ++----
 include/linux/crash_reserve.h | 11 +++++------
 kernel/crash_reserve.c        |  9 ++++-----
 6 files changed, 17 insertions(+), 26 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 9c0b8d9558fc..6c5a1ee4b5d3 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -98,21 +98,19 @@ static void __init arch_reserve_crashkernel(void)
 {
 	unsigned long long low_size = 0;
 	unsigned long long crash_base, crash_size;
-	char *cmdline = boot_command_line;
 	bool high = false;
 	int ret;
 
 	if (!IS_ENABLED(CONFIG_CRASH_RESERVE))
 		return;
 
-	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
+	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
 				&crash_size, &crash_base,
 				&low_size, &high);
 	if (ret)
 		return;
 
-	reserve_crashkernel_generic(cmdline, crash_size, crash_base,
-				    low_size, high);
+	reserve_crashkernel_generic(crash_size, crash_base, low_size, high);
 }
 
 static phys_addr_t __init max_zone_phys(phys_addr_t zone_limit)
diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
index 56934fe58170..ece9c4266c3f 100644
--- a/arch/loongarch/kernel/setup.c
+++ b/arch/loongarch/kernel/setup.c
@@ -259,18 +259,17 @@ static void __init arch_reserve_crashkernel(void)
 	int ret;
 	unsigned long long low_size = 0;
 	unsigned long long crash_base, crash_size;
-	char *cmdline = boot_command_line;
 	bool high = false;
 
 	if (!IS_ENABLED(CONFIG_CRASH_RESERVE))
 		return;
 
-	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
+	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
 				&crash_size, &crash_base, &low_size, &high);
 	if (ret)
 		return;
 
-	reserve_crashkernel_generic(cmdline, crash_size, crash_base, low_size, high);
+	reserve_crashkernel_generic(crash_size, crash_base, low_size, high);
 }
 
 static void __init fdt_setup(void)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 8d167e09f1fe..16b81beb41bf 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1392,21 +1392,19 @@ static void __init arch_reserve_crashkernel(void)
 {
 	unsigned long long low_size = 0;
 	unsigned long long crash_base, crash_size;
-	char *cmdline = boot_command_line;
 	bool high = false;
 	int ret;
 
 	if (!IS_ENABLED(CONFIG_CRASH_RESERVE))
 		return;
 
-	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
+	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
 				&crash_size, &crash_base,
 				&low_size, &high);
 	if (ret)
 		return;
 
-	reserve_crashkernel_generic(cmdline, crash_size, crash_base,
-				    low_size, high);
+	reserve_crashkernel_generic(crash_size, crash_base, low_size, high);
 }
 
 void __init paging_init(void)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index f1fea506e20f..15b6823556c8 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -469,14 +469,13 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 static void __init arch_reserve_crashkernel(void)
 {
 	unsigned long long crash_base, crash_size, low_size = 0;
-	char *cmdline = boot_command_line;
 	bool high = false;
 	int ret;
 
 	if (!IS_ENABLED(CONFIG_CRASH_RESERVE))
 		return;
 
-	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
+	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
 				&crash_size, &crash_base,
 				&low_size, &high);
 	if (ret)
@@ -487,8 +486,7 @@ static void __init arch_reserve_crashkernel(void)
 		return;
 	}
 
-	reserve_crashkernel_generic(cmdline, crash_size, crash_base,
-				    low_size, high);
+	reserve_crashkernel_generic(crash_size, crash_base, low_size, high);
 }
 
 static struct resource standard_io_resources[] = {
diff --git a/include/linux/crash_reserve.h b/include/linux/crash_reserve.h
index 5a9df944fb80..1fe7e7d1b214 100644
--- a/include/linux/crash_reserve.h
+++ b/include/linux/crash_reserve.h
@@ -32,13 +32,12 @@ int __init parse_crashkernel(char *cmdline, unsigned long long system_ram,
 #define CRASH_ADDR_HIGH_MAX		memblock_end_of_DRAM()
 #endif
 
-void __init reserve_crashkernel_generic(char *cmdline,
-		unsigned long long crash_size,
-		unsigned long long crash_base,
-		unsigned long long crash_low_size,
-		bool high);
+void __init reserve_crashkernel_generic(unsigned long long crash_size,
+					unsigned long long crash_base,
+					unsigned long long crash_low_size,
+					bool high);
 #else
-static inline void __init reserve_crashkernel_generic(char *cmdline,
+static inline void __init reserve_crashkernel_generic(
 		unsigned long long crash_size,
 		unsigned long long crash_base,
 		unsigned long long crash_low_size,
diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
index a620fb4b2116..aff7c0fdbefa 100644
--- a/kernel/crash_reserve.c
+++ b/kernel/crash_reserve.c
@@ -375,11 +375,10 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
 	return 0;
 }
 
-void __init reserve_crashkernel_generic(char *cmdline,
-			     unsigned long long crash_size,
-			     unsigned long long crash_base,
-			     unsigned long long crash_low_size,
-			     bool high)
+void __init reserve_crashkernel_generic(unsigned long long crash_size,
+					unsigned long long crash_base,
+					unsigned long long crash_low_size,
+					bool high)
 {
 	unsigned long long search_end = CRASH_ADDR_LOW_MAX, search_base = 0;
 	bool fixed_base = false;
-- 
2.47.1




* [PATCH v2 3/6] crash: let arch decide crash memory export to iomem_resource
  2025-01-21 11:54 [PATCH v2 0/6] powerpc/crash: use generic crashkernel reservation Sourabh Jain
  2025-01-21 11:54 ` [PATCH v2 1/6] kexec: Initialize ELF lowest address to ULONG_MAX Sourabh Jain
  2025-01-21 11:54 ` [PATCH v2 2/6] crash: remove an unused argument from reserve_crashkernel_generic() Sourabh Jain
@ 2025-01-21 11:54 ` Sourabh Jain
  2025-01-23 10:26   ` Hari Bathini
  2025-01-21 11:54 ` [PATCH v2 4/6] powerpc/kdump: preserve user-specified memory limit Sourabh Jain
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Sourabh Jain @ 2025-01-21 11:54 UTC
  To: linuxppc-dev
  Cc: Sourabh Jain, Andrew Morton, Baoquan he, Hari Bathini,
	Madhavan Srinivasan, Mahesh Salgaonkar, Michael Ellerman, kexec,
	linux-kernel

insert_crashkernel_resources() adds crash memory to iomem_resource if
generic crashkernel reservation is enabled on an architecture.

On PowerPC, system RAM is added to iomem_resource. See commit
c40dd2f766440 ("powerpc: Add System RAM to /proc/iomem").

Enabling generic crashkernel reservation on PowerPC leads to a conflict
when system RAM is added to iomem_resource because a part of the system
RAM, the crashkernel memory, has already been added to iomem_resource.

A later commit in this series, "powerpc/crash: use generic crashkernel
reservation", enables generic crashkernel reservation on PowerPC. If the
crashkernel is added to iomem_resource, the kernel fails to add
system RAM to /proc/iomem and prints the following traces:

CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.13.0-rc2+
snip...
NIP [c000000002016b3c] add_system_ram_resources+0xf0/0x15c
LR [c000000002016b34] add_system_ram_resources+0xe8/0x15c
Call Trace:
[c00000000484bbc0] [c000000002016b34] add_system_ram_resources+0xe8/0x15c
[c00000000484bc20] [c000000000010a4c] do_one_initcall+0x7c/0x39c
[c00000000484bd00] [c000000002005418] do_initcalls+0x144/0x18c
[c00000000484bd90] [c000000002005714] kernel_init_freeable+0x21c/0x290
[c00000000484bdf0] [c0000000000110f4] kernel_init+0x2c/0x1b8
[c00000000484be50] [c00000000000dd3c] ret_from_kernel_user_thread+0x14/0x1c

To avoid this, an architecture hook is added in
insert_crashkernel_resources(), allowing the architecture to decide
whether crashkernel memory should be added to iomem_resource.
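
The override mechanism relies on the usual "#define name name" idiom. A
minimal userspace sketch of it follows, under stated assumptions: the
ARCH_OVERRIDES_HOOK macro and insert_crashkernel_resources_sketch() are
hypothetical names used only for illustration, and the sketch returns 1
where the kernel would call insert_resource().

```c
#include <stdbool.h>

/*
 * An architecture header defines the hook and a macro of the same name.
 * Build with -DARCH_OVERRIDES_HOOK to simulate an arch that opts out.
 */
#ifdef ARCH_OVERRIDES_HOOK
static inline bool arch_add_crash_res_to_iomem(void)
{
	return false;
}
#define arch_add_crash_res_to_iomem arch_add_crash_res_to_iomem
#endif

/* Generic default, compiled only if no architecture supplied the hook. */
#ifndef arch_add_crash_res_to_iomem
static inline bool arch_add_crash_res_to_iomem(void)
{
	return true;
}
#endif

/* Returns 1 if the range would be inserted, 0 if it is skipped. */
static int insert_crashkernel_resources_sketch(unsigned long long start,
					       unsigned long long end)
{
	if (!arch_add_crash_res_to_iomem())
		return 0;	/* architecture opted out */

	if (start < end)
		return 1;	/* kernel: insert_resource(&iomem_resource, ...) */

	return 0;		/* empty reservation, nothing to insert */
}
```

Built without the macro, the generic default applies and a non-empty
range is reported as inserted; built with -DARCH_OVERRIDES_HOOK, every
call returns 0.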

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Baoquan he <bhe@redhat.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 include/linux/crash_reserve.h | 11 +++++++++++
 kernel/crash_reserve.c        |  3 +++
 2 files changed, 14 insertions(+)

diff --git a/include/linux/crash_reserve.h b/include/linux/crash_reserve.h
index 1fe7e7d1b214..f1205d044dae 100644
--- a/include/linux/crash_reserve.h
+++ b/include/linux/crash_reserve.h
@@ -18,6 +18,17 @@ int __init parse_crashkernel(char *cmdline, unsigned long long system_ram,
 		unsigned long long *crash_size, unsigned long long *crash_base,
 		unsigned long long *low_size, bool *high);
 
+#ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
+
+#ifndef arch_add_crash_res_to_iomem
+static inline bool arch_add_crash_res_to_iomem(void)
+{
+	return true;
+}
+#endif
+
+#endif
+
 #ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
 #ifndef DEFAULT_CRASH_KERNEL_LOW_SIZE
 #define DEFAULT_CRASH_KERNEL_LOW_SIZE	(128UL << 20)
diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
index aff7c0fdbefa..190104f32fe1 100644
--- a/kernel/crash_reserve.c
+++ b/kernel/crash_reserve.c
@@ -460,6 +460,9 @@ void __init reserve_crashkernel_generic(unsigned long long crash_size,
 #ifndef HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY
 static __init int insert_crashkernel_resources(void)
 {
+	if (!arch_add_crash_res_to_iomem())
+		return 0;
+
 	if (crashk_res.start < crashk_res.end)
 		insert_resource(&iomem_resource, &crashk_res);
 
-- 
2.47.1




* [PATCH v2 4/6] powerpc/kdump: preserve user-specified memory limit
  2025-01-21 11:54 [PATCH v2 0/6] powerpc/crash: use generic crashkernel reservation Sourabh Jain
                   ` (2 preceding siblings ...)
  2025-01-21 11:54 ` [PATCH v2 3/6] crash: let arch decide crash memory export to iomem_resource Sourabh Jain
@ 2025-01-21 11:54 ` Sourabh Jain
  2025-01-23 10:30   ` Hari Bathini
  2025-01-21 11:54 ` [PATCH v2 5/6] powerpc/crash: use generic crashkernel reservation Sourabh Jain
  2025-01-21 11:54 ` [PATCH v2 6/6] crash: option to let arch decide mem range is usable Sourabh Jain
  5 siblings, 1 reply; 18+ messages in thread
From: Sourabh Jain @ 2025-01-21 11:54 UTC
  To: linuxppc-dev
  Cc: Sourabh Jain, Andrew Morton, Baoquan he, Hari Bathini,
	Madhavan Srinivasan, Michael Ellerman, kexec, linux-kernel,
	Mahesh Salgaonkar

Commit 59d58189f3d9 ("crash: fix crash memory reserve exceed system
memory bug") fails crashkernel parsing if the crash size is found to be
higher than system RAM, which makes the memory_limit adjustment code
ineffective due to an early exit from reserve_crashkernel().

Regardless, let's not violate the user-specified memory limit by
adjusting it. Remove this adjustment to ensure all reservations stay
within the limit. Commit f94f5ac07983 ("powerpc/fadump: Don't update
the user-specified memory limit") did the same for fadump.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Baoquan he <bhe@redhat.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
CC: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/kexec/core.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
index b8333a49ea5d..4945b33322ae 100644
--- a/arch/powerpc/kexec/core.c
+++ b/arch/powerpc/kexec/core.c
@@ -150,14 +150,6 @@ void __init reserve_crashkernel(void)
 		return;
 	}
 
-	/* Crash kernel trumps memory limit */
-	if (memory_limit && memory_limit <= crashk_res.end) {
-		memory_limit = crashk_res.end + 1;
-		total_mem_sz = memory_limit;
-		printk("Adjusted memory limit for crashkernel, now 0x%llx\n",
-		       memory_limit);
-	}
-
 	printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
 			"for crashkernel (System RAM: %ldMB)\n",
 			(unsigned long)(crash_size >> 20),
-- 
2.47.1




* [PATCH v2 5/6] powerpc/crash: use generic crashkernel reservation
  2025-01-21 11:54 [PATCH v2 0/6] powerpc/crash: use generic crashkernel reservation Sourabh Jain
                   ` (3 preceding siblings ...)
  2025-01-21 11:54 ` [PATCH v2 4/6] powerpc/kdump: preserve user-specified memory limit Sourabh Jain
@ 2025-01-21 11:54 ` Sourabh Jain
  2025-01-23 10:45   ` Hari Bathini
  2025-01-21 11:54 ` [PATCH v2 6/6] crash: option to let arch decide mem range is usable Sourabh Jain
  5 siblings, 1 reply; 18+ messages in thread
From: Sourabh Jain @ 2025-01-21 11:54 UTC
  To: linuxppc-dev
  Cc: Sourabh Jain, Andrew Morton, Baoquan he, Hari Bathini,
	Madhavan Srinivasan, Michael Ellerman, kexec, linux-kernel,
	Mahesh Salgaonkar

Commit 0ab97169aa05 ("crash_core: add generic function to do
reservation") added a generic function to reserve crashkernel memory.
So let's use the same function on powerpc and remove the
architecture-specific code that essentially does the same thing.

The generic crashkernel reservation also provides a way to split the
crashkernel reservation into high and low memory reservations, which can
be enabled for powerpc in the future.

Along with moving to the generic crashkernel reservation, the code
related to finding the base address for the crashkernel has been
moved into its own function, get_crash_base(), for better
readability and maintainability.
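
As a rough illustration of the base selection done in get_crash_base()
for the CONFIG_NONSTATIC_KERNEL, CONFIG_PPC64 case, here is a userspace
sketch. It is not the kernel function: pick_crash_base(), the is_lpar
flag (standing in for firmware_has_feature(FW_FEATURE_LPAR)), and the
4K SKETCH_PAGE_SIZE are assumptions made for this example.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096ULL		/* assumed page size */
#define SKETCH_SZ_128M		(128ULL << 20)
#define SKETCH_SZ_512M		(512ULL << 20)

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * requested: base from crashkernel=size@base, or 0 if none was given.
 * rma_size:  size of the real mode area (ppc64_rma_size in the kernel).
 * is_lpar:   true when running as an LPAR guest.
 */
static uint64_t pick_crash_base(uint64_t requested, uint64_t rma_size,
				bool is_lpar)
{
	uint64_t base = requested;

	/* No base given: use the middle of the RMA, capped per platform. */
	if (!base)
		base = min_u64(rma_size / 2,
			       is_lpar ? SKETCH_SZ_512M : SKETCH_SZ_128M);

	/* Round up to a page boundary, as PAGE_ALIGN() does. */
	return (base + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```

For a 2G RMA and no user-supplied base, this gives 512M on LPAR and
128M elsewhere; an unaligned request such as 0x10000001 is rounded up
to 0x10001000.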

To prevent crashkernel memory from being added to iomem_resource, the
function arch_add_crash_res_to_iomem() has been introduced. For further
details on why this should not be done for the PowerPC architecture,
please refer to the earlier commit in this series titled "crash: let
arch decide crash memory export to iomem_resource".

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Baoquan he <bhe@redhat.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
CC: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/Kconfig                     |  3 +
 arch/powerpc/include/asm/crash_reserve.h | 18 +++++
 arch/powerpc/include/asm/kexec.h         |  4 +-
 arch/powerpc/kernel/prom.c               |  2 +-
 arch/powerpc/kexec/core.c                | 90 ++++++++++--------------
 5 files changed, 63 insertions(+), 54 deletions(-)
 create mode 100644 arch/powerpc/include/asm/crash_reserve.h

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index db9f7b2d07bf..880d35fadf40 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -718,6 +718,9 @@ config ARCH_SUPPORTS_CRASH_HOTPLUG
 	def_bool y
 	depends on PPC64
 
+config ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
+	def_bool CRASH_RESERVE
+
 config FA_DUMP
 	bool "Firmware-assisted dump"
 	depends on CRASH_DUMP && PPC64 && (PPC_RTAS || PPC_POWERNV)
diff --git a/arch/powerpc/include/asm/crash_reserve.h b/arch/powerpc/include/asm/crash_reserve.h
new file mode 100644
index 000000000000..f5e60721de41
--- /dev/null
+++ b/arch/powerpc/include/asm/crash_reserve.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_POWERPC_CRASH_RESERVE_H
+#define _ASM_POWERPC_CRASH_RESERVE_H
+
+/* crash kernel regions are page size aligned */
+#define CRASH_ALIGN             PAGE_SIZE
+
+#ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
+
+static inline bool arch_add_crash_res_to_iomem(void)
+{
+	return false;
+}
+#define arch_add_crash_res_to_iomem arch_add_crash_res_to_iomem
+#endif
+
+#endif /* _ASM_POWERPC_CRASH_RESERVE_H */
+
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 270ee93a0f7d..64741558071f 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -113,9 +113,9 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct crash_mem
 
 #ifdef CONFIG_CRASH_RESERVE
 int __init overlaps_crashkernel(unsigned long start, unsigned long size);
-extern void reserve_crashkernel(void);
+extern void arch_reserve_crashkernel(void);
 #else
-static inline void reserve_crashkernel(void) {}
+static inline void arch_reserve_crashkernel(void) {}
 static inline int overlaps_crashkernel(unsigned long start, unsigned long size) { return 0; }
 #endif
 
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index e0059842a1c6..9ed9dde7d231 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -860,7 +860,7 @@ void __init early_init_devtree(void *params)
 	 */
 	if (fadump_reserve_mem() == 0)
 #endif
-		reserve_crashkernel();
+		arch_reserve_crashkernel();
 	early_reserve_mem();
 
 	if (memory_limit > memblock_phys_mem_size())
diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
index 4945b33322ae..b21cfa814492 100644
--- a/arch/powerpc/kexec/core.c
+++ b/arch/powerpc/kexec/core.c
@@ -80,38 +80,20 @@ void machine_kexec(struct kimage *image)
 }
 
 #ifdef CONFIG_CRASH_RESERVE
-void __init reserve_crashkernel(void)
-{
-	unsigned long long crash_size, crash_base, total_mem_sz;
-	int ret;
 
-	total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
-	/* use common parsing */
-	ret = parse_crashkernel(boot_command_line, total_mem_sz,
-			&crash_size, &crash_base, NULL, NULL);
-	if (ret == 0 && crash_size > 0) {
-		crashk_res.start = crash_base;
-		crashk_res.end = crash_base + crash_size - 1;
-	}
-
-	if (crashk_res.end == crashk_res.start) {
-		crashk_res.start = crashk_res.end = 0;
-		return;
-	}
-
-	/* We might have got these values via the command line or the
-	 * device tree, either way sanitise them now. */
-
-	crash_size = resource_size(&crashk_res);
+static unsigned long long __init get_crash_base(unsigned long long crash_base)
+{
 
 #ifndef CONFIG_NONSTATIC_KERNEL
-	if (crashk_res.start != KDUMP_KERNELBASE)
+	if (crash_base != KDUMP_KERNELBASE)
 		printk("Crash kernel location must be 0x%x\n",
 				KDUMP_KERNELBASE);
 
-	crashk_res.start = KDUMP_KERNELBASE;
+	return KDUMP_KERNELBASE;
 #else
-	if (!crashk_res.start) {
+	unsigned long long crash_base_align;
+
+	if (!crash_base) {
 #ifdef CONFIG_PPC64
 		/*
 		 * On the LPAR platform place the crash kernel to mid of
@@ -123,45 +105,51 @@ void __init reserve_crashkernel(void)
 		 * kernel starts at 128MB offset on other platforms.
 		 */
 		if (firmware_has_feature(FW_FEATURE_LPAR))
-			crashk_res.start = min_t(u64, ppc64_rma_size / 2, SZ_512M);
+			crash_base = min_t(u64, ppc64_rma_size / 2, SZ_512M);
 		else
-			crashk_res.start = min_t(u64, ppc64_rma_size / 2, SZ_128M);
+			crash_base = min_t(u64, ppc64_rma_size / 2, SZ_128M);
 #else
-		crashk_res.start = KDUMP_KERNELBASE;
+		crash_base = KDUMP_KERNELBASE;
 #endif
 	}
 
-	crash_base = PAGE_ALIGN(crashk_res.start);
-	if (crash_base != crashk_res.start) {
-		printk("Crash kernel base must be aligned to 0x%lx\n",
-				PAGE_SIZE);
-		crashk_res.start = crash_base;
-	}
+	crash_base_align = PAGE_ALIGN(crash_base);
+	if (crash_base != crash_base_align)
+		pr_warn("Crash kernel base must be aligned to 0x%lx\n", PAGE_SIZE);
 
+	return crash_base_align;
 #endif
-	crash_size = PAGE_ALIGN(crash_size);
-	crashk_res.end = crashk_res.start + crash_size - 1;
+}
 
-	/* The crash region must not overlap the current kernel */
-	if (overlaps_crashkernel(__pa(_stext), _end - _stext)) {
-		printk(KERN_WARNING
-			"Crash kernel can not overlap current kernel\n");
-		crashk_res.start = crashk_res.end = 0;
+void __init arch_reserve_crashkernel(void)
+{
+	unsigned long long crash_size, crash_base, crash_end;
+	unsigned long long kernel_start, kernel_size;
+	unsigned long long total_mem_sz;
+	int ret;
+
+	total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
+
+	/* use common parsing */
+	ret = parse_crashkernel(boot_command_line, total_mem_sz, &crash_size,
+				&crash_base, NULL, NULL);
+
+	if (ret)
 		return;
-	}
 
-	printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
-			"for crashkernel (System RAM: %ldMB)\n",
-			(unsigned long)(crash_size >> 20),
-			(unsigned long)(crashk_res.start >> 20),
-			(unsigned long)(total_mem_sz >> 20));
+	crash_base = get_crash_base(crash_base);
+	crash_end = crash_base + crash_size - 1;
 
-	if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
-	    memblock_reserve(crashk_res.start, crash_size)) {
-		pr_err("Failed to reserve memory for crashkernel!\n");
-		crashk_res.start = crashk_res.end = 0;
+	kernel_start = __pa(_stext);
+	kernel_size = _end - _stext;
+
+	/* The crash region must not overlap the current kernel */
+	if ((kernel_start + kernel_size > crash_base) && (kernel_start <= crash_end)) {
+		pr_warn("Crash kernel can not overlap current kernel\n");
 		return;
 	}
+
+	reserve_crashkernel_generic(crash_size, crash_base, 0, false);
 }
 
 int __init overlaps_crashkernel(unsigned long start, unsigned long size)
-- 
2.47.1




* [PATCH v2 6/6] crash: option to let arch decide mem range is usable
  2025-01-21 11:54 [PATCH v2 0/6] powerpc/crash: use generic crashkernel reservation Sourabh Jain
                   ` (4 preceding siblings ...)
  2025-01-21 11:54 ` [PATCH v2 5/6] powerpc/crash: use generic crashkernel reservation Sourabh Jain
@ 2025-01-21 11:54 ` Sourabh Jain
  2025-01-24  9:52   ` Hari Bathini
  5 siblings, 1 reply; 18+ messages in thread
From: Sourabh Jain @ 2025-01-21 11:54 UTC
  To: linuxppc-dev
  Cc: Sourabh Jain, Andrew Morton, Baoquan he, Hari Bathini,
	Madhavan Srinivasan, Mahesh Salgaonkar, Michael Ellerman, kexec,
	linux-kernel

On PowerPC, the memory reserved for the crashkernel can contain
components like RTAS, TCE, OPAL, etc., which should be avoided when
loading kexec segments into crashkernel memory. Due to these special
components, PowerPC has its own set of functions to locate holes in the
crashkernel memory for loading kexec segments for kdump. However, for
loading kexec segments in the kexec case, PowerPC uses generic functions
to locate holes.

So, let's use generic functions to locate memory holes for kdump on
PowerPC by adding an arch hook to handle such special regions while
loading kexec segments, and remove the PowerPC functions to locate
holes.
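
The idea can be sketched in userspace as a top-down hole search that
consults an exclusion check before accepting a candidate. This is only
an illustration under assumptions: struct range, range_is_excluded()
and locate_hole_top_down() are hypothetical names, ranges are inclusive,
align must be a power of two, and the real generic code in
kernel/kexec_file.c may step and bound the search differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t start;
	uint64_t end;	/* inclusive */
};

/* Stand-in for the arch hook: true if [start, end] hits an excluded range. */
static bool range_is_excluded(const struct range *excl, size_t n,
			      uint64_t start, uint64_t end)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (start <= excl[i].end && end >= excl[i].start)
			return true;
	}
	return false;
}

/*
 * Search [buf_min, buf_max] from the top for memsz bytes aligned to
 * align, skipping candidates that overlap an excluded range.
 * On success store the start address in *out and return true.
 */
static bool locate_hole_top_down(uint64_t buf_min, uint64_t buf_max,
				 uint64_t memsz, uint64_t align,
				 const struct range *excl, size_t n,
				 uint64_t *out)
{
	uint64_t start;

	if (!memsz || buf_max < buf_min || buf_max - buf_min + 1 < memsz)
		return false;

	/* Highest aligned start that still fits below buf_max. */
	start = (buf_max - memsz + 1) & ~(align - 1);

	while (start >= buf_min) {
		if (!range_is_excluded(excl, n, start, start + memsz - 1)) {
			*out = start;
			return true;
		}
		/* Stop before stepping below buf_min (avoids underflow). */
		if (start - buf_min < align)
			break;
		start -= align;
	}
	return false;
}
```

With a window of [0x1000, 0xffff], a 0x2000-byte buffer, 0x1000
alignment and [0xe000, 0xffff] excluded, the search rejects 0xe000 and
0xd000 and settles on 0xc000. Keeping the exclusion test behind a single
predicate is what lets the same loop serve both the kexec and kdump
cases.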

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Baoquan he <bhe@redhat.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: kexec@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
 arch/powerpc/include/asm/kexec.h  |   6 +-
 arch/powerpc/kexec/file_load_64.c | 259 ++----------------------------
 include/linux/kexec.h             |   9 ++
 kernel/kexec_file.c               |  12 ++
 4 files changed, 34 insertions(+), 252 deletions(-)

diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 64741558071f..5e4680f9ff35 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -95,8 +95,10 @@ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf, unsigned long
 int arch_kimage_file_post_load_cleanup(struct kimage *image);
 #define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
 
-int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf);
-#define arch_kexec_locate_mem_hole arch_kexec_locate_mem_hole
+int arch_check_excluded_range(struct kimage *image, unsigned long start,
+			      unsigned long end);
+#define arch_check_excluded_range  arch_check_excluded_range
+
 
 int load_crashdump_segments_ppc64(struct kimage *image,
 				  struct kexec_buf *kbuf);
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index dc65c1391157..e7ef8b2a2554 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -49,201 +49,18 @@ const struct kexec_file_ops * const kexec_file_loaders[] = {
 	NULL
 };
 
-/**
- * __locate_mem_hole_top_down - Looks top down for a large enough memory hole
- *                              in the memory regions between buf_min & buf_max
- *                              for the buffer. If found, sets kbuf->mem.
- * @kbuf:                       Buffer contents and memory parameters.
- * @buf_min:                    Minimum address for the buffer.
- * @buf_max:                    Maximum address for the buffer.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int __locate_mem_hole_top_down(struct kexec_buf *kbuf,
-				      u64 buf_min, u64 buf_max)
-{
-	int ret = -EADDRNOTAVAIL;
-	phys_addr_t start, end;
-	u64 i;
-
-	for_each_mem_range_rev(i, &start, &end) {
-		/*
-		 * memblock uses [start, end) convention while it is
-		 * [start, end] here. Fix the off-by-one to have the
-		 * same convention.
-		 */
-		end -= 1;
-
-		if (start > buf_max)
-			continue;
-
-		/* Memory hole not found */
-		if (end < buf_min)
-			break;
-
-		/* Adjust memory region based on the given range */
-		if (start < buf_min)
-			start = buf_min;
-		if (end > buf_max)
-			end = buf_max;
-
-		start = ALIGN(start, kbuf->buf_align);
-		if (start < end && (end - start + 1) >= kbuf->memsz) {
-			/* Suitable memory range found. Set kbuf->mem */
-			kbuf->mem = ALIGN_DOWN(end - kbuf->memsz + 1,
-					       kbuf->buf_align);
-			ret = 0;
-			break;
-		}
-	}
-
-	return ret;
-}
-
-/**
- * locate_mem_hole_top_down_ppc64 - Skip special memory regions to find a
- *                                  suitable buffer with top down approach.
- * @kbuf:                           Buffer contents and memory parameters.
- * @buf_min:                        Minimum address for the buffer.
- * @buf_max:                        Maximum address for the buffer.
- * @emem:                           Exclude memory ranges.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int locate_mem_hole_top_down_ppc64(struct kexec_buf *kbuf,
-					  u64 buf_min, u64 buf_max,
-					  const struct crash_mem *emem)
+int arch_check_excluded_range(struct kimage *image, unsigned long start,
+			      unsigned long end)
 {
-	int i, ret = 0, err = -EADDRNOTAVAIL;
-	u64 start, end, tmin, tmax;
-
-	tmax = buf_max;
-	for (i = (emem->nr_ranges - 1); i >= 0; i--) {
-		start = emem->ranges[i].start;
-		end = emem->ranges[i].end;
-
-		if (start > tmax)
-			continue;
-
-		if (end < tmax) {
-			tmin = (end < buf_min ? buf_min : end + 1);
-			ret = __locate_mem_hole_top_down(kbuf, tmin, tmax);
-			if (!ret)
-				return 0;
-		}
-
-		tmax = start - 1;
-
-		if (tmax < buf_min) {
-			ret = err;
-			break;
-		}
-		ret = 0;
-	}
-
-	if (!ret) {
-		tmin = buf_min;
-		ret = __locate_mem_hole_top_down(kbuf, tmin, tmax);
-	}
-	return ret;
-}
-
-/**
- * __locate_mem_hole_bottom_up - Looks bottom up for a large enough memory hole
- *                               in the memory regions between buf_min & buf_max
- *                               for the buffer. If found, sets kbuf->mem.
- * @kbuf:                        Buffer contents and memory parameters.
- * @buf_min:                     Minimum address for the buffer.
- * @buf_max:                     Maximum address for the buffer.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int __locate_mem_hole_bottom_up(struct kexec_buf *kbuf,
-				       u64 buf_min, u64 buf_max)
-{
-	int ret = -EADDRNOTAVAIL;
-	phys_addr_t start, end;
-	u64 i;
-
-	for_each_mem_range(i, &start, &end) {
-		/*
-		 * memblock uses [start, end) convention while it is
-		 * [start, end] here. Fix the off-by-one to have the
-		 * same convention.
-		 */
-		end -= 1;
-
-		if (end < buf_min)
-			continue;
-
-		/* Memory hole not found */
-		if (start > buf_max)
-			break;
-
-		/* Adjust memory region based on the given range */
-		if (start < buf_min)
-			start = buf_min;
-		if (end > buf_max)
-			end = buf_max;
-
-		start = ALIGN(start, kbuf->buf_align);
-		if (start < end && (end - start + 1) >= kbuf->memsz) {
-			/* Suitable memory range found. Set kbuf->mem */
-			kbuf->mem = start;
-			ret = 0;
-			break;
-		}
-	}
-
-	return ret;
-}
-
-/**
- * locate_mem_hole_bottom_up_ppc64 - Skip special memory regions to find a
- *                                   suitable buffer with bottom up approach.
- * @kbuf:                            Buffer contents and memory parameters.
- * @buf_min:                         Minimum address for the buffer.
- * @buf_max:                         Maximum address for the buffer.
- * @emem:                            Exclude memory ranges.
- *
- * Returns 0 on success, negative errno on error.
- */
-static int locate_mem_hole_bottom_up_ppc64(struct kexec_buf *kbuf,
-					   u64 buf_min, u64 buf_max,
-					   const struct crash_mem *emem)
-{
-	int i, ret = 0, err = -EADDRNOTAVAIL;
-	u64 start, end, tmin, tmax;
-
-	tmin = buf_min;
-	for (i = 0; i < emem->nr_ranges; i++) {
-		start = emem->ranges[i].start;
-		end = emem->ranges[i].end;
-
-		if (end < tmin)
-			continue;
-
-		if (start > tmin) {
-			tmax = (start > buf_max ? buf_max : start - 1);
-			ret = __locate_mem_hole_bottom_up(kbuf, tmin, tmax);
-			if (!ret)
-				return 0;
-		}
-
-		tmin = end + 1;
+	struct crash_mem *emem;
+	int i;
 
-		if (tmin > buf_max) {
-			ret = err;
-			break;
-		}
-		ret = 0;
-	}
+	emem = image->arch.exclude_ranges;
+	for (i = 0; i < emem->nr_ranges; i++)
+		if (start < emem->ranges[i].end && end > emem->ranges[i].start)
+			return 1;
 
-	if (!ret) {
-		tmax = buf_max;
-		ret = __locate_mem_hole_bottom_up(kbuf, tmin, tmax);
-	}
-	return ret;
+	return 0;
 }
 
 #ifdef CONFIG_CRASH_DUMP
@@ -1004,64 +821,6 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct crash_mem
 	return ret;
 }
 
-/**
- * arch_kexec_locate_mem_hole - Skip special memory regions like rtas, opal,
- *                              tce-table, reserved-ranges & such (exclude
- *                              memory ranges) as they can't be used for kexec
- *                              segment buffer. Sets kbuf->mem when a suitable
- *                              memory hole is found.
- * @kbuf:                       Buffer contents and memory parameters.
- *
- * Assumes minimum of PAGE_SIZE alignment for kbuf->memsz & kbuf->buf_align.
- *
- * Returns 0 on success, negative errno on error.
- */
-int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf)
-{
-	struct crash_mem **emem;
-	u64 buf_min, buf_max;
-	int ret;
-
-	/* Look up the exclude ranges list while locating the memory hole */
-	emem = &(kbuf->image->arch.exclude_ranges);
-	if (!(*emem) || ((*emem)->nr_ranges == 0)) {
-		pr_warn("No exclude range list. Using the default locate mem hole method\n");
-		return kexec_locate_mem_hole(kbuf);
-	}
-
-	buf_min = kbuf->buf_min;
-	buf_max = kbuf->buf_max;
-	/* Segments for kdump kernel should be within crashkernel region */
-	if (IS_ENABLED(CONFIG_CRASH_DUMP) && kbuf->image->type == KEXEC_TYPE_CRASH) {
-		buf_min = (buf_min < crashk_res.start ?
-			   crashk_res.start : buf_min);
-		buf_max = (buf_max > crashk_res.end ?
-			   crashk_res.end : buf_max);
-	}
-
-	if (buf_min > buf_max) {
-		pr_err("Invalid buffer min and/or max values\n");
-		return -EINVAL;
-	}
-
-	if (kbuf->top_down)
-		ret = locate_mem_hole_top_down_ppc64(kbuf, buf_min, buf_max,
-						     *emem);
-	else
-		ret = locate_mem_hole_bottom_up_ppc64(kbuf, buf_min, buf_max,
-						      *emem);
-
-	/* Add the buffer allocated to the exclude list for the next lookup */
-	if (!ret) {
-		add_mem_range(emem, kbuf->mem, kbuf->memsz);
-		sort_memory_ranges(*emem, true);
-	} else {
-		pr_err("Failed to locate memory buffer of size %lu\n",
-		       kbuf->memsz);
-	}
-	return ret;
-}
-
 /**
  * arch_kexec_kernel_image_probe - Does additional handling needed to setup
  *                                 kexec segments.
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index f0e9f8eda7a3..407f8b0346aa 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -205,6 +205,15 @@ static inline int arch_kimage_file_post_load_cleanup(struct kimage *image)
 }
 #endif
 
+#ifndef arch_check_excluded_range
+static inline int arch_check_excluded_range(struct kimage *image,
+					    unsigned long start,
+					    unsigned long end)
+{
+	return 0;
+}
+#endif
+
 #ifdef CONFIG_KEXEC_SIG
 #ifdef CONFIG_SIGNED_PE_FILE_VERIFICATION
 int kexec_kernel_verify_pe_sig(const char *kernel, unsigned long kernel_len);
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index 3eedb8c226ad..fba686487e3b 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -464,6 +464,12 @@ static int locate_mem_hole_top_down(unsigned long start, unsigned long end,
 			continue;
 		}
 
+		/* Make sure this does not conflict with exclude range */
+		if (arch_check_excluded_range(image, temp_start, temp_end)) {
+			temp_start = temp_start - PAGE_SIZE;
+			continue;
+		}
+
 		/* We found a suitable memory range */
 		break;
 	} while (1);
@@ -498,6 +504,12 @@ static int locate_mem_hole_bottom_up(unsigned long start, unsigned long end,
 			continue;
 		}
 
+		/* Make sure this does not conflict with exclude range */
+		if (arch_check_excluded_range(image, temp_start, temp_end)) {
+			temp_start = temp_start + PAGE_SIZE;
+			continue;
+		}
+
 		/* We found a suitable memory range */
 		break;
 	} while (1);
-- 
2.47.1
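
For readers following the thread, here is a minimal user-space sketch of how the generic bottom-up search and the new hook interact. It is not the kernel code: sketch_range, sketch_check_excluded(), sketch_locate_bottom_up() and SKETCH_PAGE_SIZE are hypothetical names, the page size is assumed to be 4 KiB, and the kernel's other per-candidate checks are left out. Only the overlap comparison is copied from arch_check_excluded_range() in the patch above.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, for illustration only */

/* Hypothetical stand-in for one entry of the exclude range list. */
struct sketch_range {
	unsigned long start;
	unsigned long end;
};

/* Same comparison as arch_check_excluded_range() in the patch. */
static int sketch_check_excluded(const struct sketch_range *ex, size_t nr,
				 unsigned long start, unsigned long end)
{
	for (size_t i = 0; i < nr; i++)
		if (start < ex[i].end && end > ex[i].start)
			return 1;
	return 0;
}

/*
 * Bottom-up search in [start, end] for memsz bytes; align must be a
 * power of two. Returns 1 and sets *mem on success, 0 if nothing fits.
 */
static int sketch_locate_bottom_up(unsigned long start, unsigned long end,
				   unsigned long memsz, unsigned long align,
				   const struct sketch_range *ex, size_t nr,
				   unsigned long *mem)
{
	unsigned long temp_start = start, temp_end;

	assert(memsz > 0 && align > 0 && (align & (align - 1)) == 0);

	for (;;) {
		temp_start = (temp_start + align - 1) & ~(align - 1);
		temp_end = temp_start + memsz - 1;

		/* Candidate runs past the window (or wrapped): give up. */
		if (temp_end > end || temp_end < temp_start)
			return 0;

		/* Candidate hits an excluded range: retry one page higher. */
		if (sketch_check_excluded(ex, nr, temp_start, temp_end)) {
			temp_start += SKETCH_PAGE_SIZE;
			continue;
		}

		*mem = temp_start;
		return 1;
	}
}
```

With a single excluded range 0x2000-0x3fff and a search window starting at 0x1000, the sketch steps one page at a time past the range and places an 8 KiB, page-aligned buffer at 0x4000.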




* Re: [PATCH v2 1/6] kexec: Initialize ELF lowest address to ULONG_MAX
  2025-01-21 11:54 ` [PATCH v2 1/6] kexec: Initialize ELF lowest address to ULONG_MAX Sourabh Jain
@ 2025-01-23 10:04   ` Hari Bathini
  2025-01-23 11:23     ` Sourabh Jain
  0 siblings, 1 reply; 18+ messages in thread
From: Hari Bathini @ 2025-01-23 10:04 UTC (permalink / raw)
  To: Sourabh Jain, linuxppc-dev
  Cc: Andrew Morton, Madhavan Srinivasan, Mahesh Salgaonkar,
	Michael Ellerman, kexec, linux-kernel, Baoquan He



On 21/01/25 5:24 pm, Sourabh Jain wrote:
> kexec_elf_load() loads an ELF executable and stores the address of the
> lowest PT_LOAD segment at the location pointed to by the
> lowest_load_addr function argument.
> 
> To determine the lowest PT_LOAD address, a local variable lowest_addr
> (type unsigned long) is initialized to UINT_MAX. After loading each
> PT_LOAD, its address is compared to lowest_addr. If a loaded PT_LOAD
> address is lower, lowest_addr is updated. However, setting lowest_addr
> to UINT_MAX won't work when the kernel image is loaded above 4G, as the
> returned lowest PT_LOAD address would be invalid. This is resolved by
> initializing lowest_addr to ULONG_MAX instead.
> 
> This issue was discovered while implementing crashkernel high/low
> reservation on the PowerPC architecture.
> 

Acked-by: Hari Bathini <hbathini@linux.ibm.com>

> Fixes: a0458284f062 ("powerpc: Add support code for kexec_file_load()")
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: kexec@lists.infradead.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-kernel@vger.kernel.org
> Acked-by: Baoquan He <bhe@redhat.com>
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> ---
>   kernel/kexec_elf.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/kexec_elf.c b/kernel/kexec_elf.c
> index d3689632e8b9..3a5c25b2adc9 100644
> --- a/kernel/kexec_elf.c
> +++ b/kernel/kexec_elf.c
> @@ -390,7 +390,7 @@ int kexec_elf_load(struct kimage *image, struct elfhdr *ehdr,
>   			 struct kexec_buf *kbuf,
>   			 unsigned long *lowest_load_addr)
>   {
> -	unsigned long lowest_addr = UINT_MAX;
> +	unsigned long lowest_addr = ULONG_MAX;
>   	int ret;
>   	size_t i;
>   
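
The failure mode described in the quoted changelog can be reproduced in user space. The sketch below is an illustration under stated assumptions, not the kexec_elf_load() code: sketch_lowest_addr() is a hypothetical helper, uint64_t stands in for the kernel's 64-bit unsigned long, and unsigned int is assumed to be 32 bits wide.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Track the lowest load address the way the changelog describes:
 * start from an initial value and lower it whenever a loaded
 * address is smaller.
 */
static uint64_t sketch_lowest_addr(const uint64_t *load_addrs, size_t n,
				   uint64_t initial)
{
	uint64_t lowest = initial;

	assert(load_addrs != NULL || n == 0);

	for (size_t i = 0; i < n; i++)
		if (load_addrs[i] < lowest)
			lowest = load_addrs[i];

	return lowest;
}
```

For load addresses 0x180000000 and 0x140000000, both above 4G, an initial value of UINT_MAX is never replaced and is returned unchanged, while an initial value of UINT64_MAX yields 0x140000000.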




* Re: [PATCH v2 2/6] crash: remove an unused argument from reserve_crashkernel_generic()
  2025-01-21 11:54 ` [PATCH v2 2/6] crash: remove an unused argument from reserve_crashkernel_generic() Sourabh Jain
@ 2025-01-23 10:13   ` Hari Bathini
  0 siblings, 0 replies; 18+ messages in thread
From: Hari Bathini @ 2025-01-23 10:13 UTC (permalink / raw)
  To: Sourabh Jain, linuxppc-dev
  Cc: Andrew Morton, Madhavan Srinivasan, Mahesh Salgaonkar,
	Michael Ellerman, kexec, linux-kernel, Baoquan He



On 21/01/25 5:24 pm, Sourabh Jain wrote:
> cmdline argument is not used in reserve_crashkernel_generic() so remove
> it. Correspondingly, all the callers have been updated as well.
> 
> No functional change intended.
> 

Looks good to me.

Acked-by: Hari Bathini <hbathini@linux.ibm.com>

> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: kexec@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Acked-by: Baoquan He <bhe@redhat.com>
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> ---
>   arch/arm64/mm/init.c          |  6 ++----
>   arch/loongarch/kernel/setup.c |  5 ++---
>   arch/riscv/mm/init.c          |  6 ++----
>   arch/x86/kernel/setup.c       |  6 ++----
>   include/linux/crash_reserve.h | 11 +++++------
>   kernel/crash_reserve.c        |  9 ++++-----
>   6 files changed, 17 insertions(+), 26 deletions(-)
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 9c0b8d9558fc..6c5a1ee4b5d3 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -98,21 +98,19 @@ static void __init arch_reserve_crashkernel(void)
>   {
>   	unsigned long long low_size = 0;
>   	unsigned long long crash_base, crash_size;
> -	char *cmdline = boot_command_line;
>   	bool high = false;
>   	int ret;
>   
>   	if (!IS_ENABLED(CONFIG_CRASH_RESERVE))
>   		return;
>   
> -	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
> +	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>   				&crash_size, &crash_base,
>   				&low_size, &high);
>   	if (ret)
>   		return;
>   
> -	reserve_crashkernel_generic(cmdline, crash_size, crash_base,
> -				    low_size, high);
> +	reserve_crashkernel_generic(crash_size, crash_base, low_size, high);
>   }
>   
>   static phys_addr_t __init max_zone_phys(phys_addr_t zone_limit)
> diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c
> index 56934fe58170..ece9c4266c3f 100644
> --- a/arch/loongarch/kernel/setup.c
> +++ b/arch/loongarch/kernel/setup.c
> @@ -259,18 +259,17 @@ static void __init arch_reserve_crashkernel(void)
>   	int ret;
>   	unsigned long long low_size = 0;
>   	unsigned long long crash_base, crash_size;
> -	char *cmdline = boot_command_line;
>   	bool high = false;
>   
>   	if (!IS_ENABLED(CONFIG_CRASH_RESERVE))
>   		return;
>   
> -	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
> +	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>   				&crash_size, &crash_base, &low_size, &high);
>   	if (ret)
>   		return;
>   
> -	reserve_crashkernel_generic(cmdline, crash_size, crash_base, low_size, high);
> +	reserve_crashkernel_generic(crash_size, crash_base, low_size, high);
>   }
>   
>   static void __init fdt_setup(void)
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 8d167e09f1fe..16b81beb41bf 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -1392,21 +1392,19 @@ static void __init arch_reserve_crashkernel(void)
>   {
>   	unsigned long long low_size = 0;
>   	unsigned long long crash_base, crash_size;
> -	char *cmdline = boot_command_line;
>   	bool high = false;
>   	int ret;
>   
>   	if (!IS_ENABLED(CONFIG_CRASH_RESERVE))
>   		return;
>   
> -	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
> +	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>   				&crash_size, &crash_base,
>   				&low_size, &high);
>   	if (ret)
>   		return;
>   
> -	reserve_crashkernel_generic(cmdline, crash_size, crash_base,
> -				    low_size, high);
> +	reserve_crashkernel_generic(crash_size, crash_base, low_size, high);
>   }
>   
>   void __init paging_init(void)
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index f1fea506e20f..15b6823556c8 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -469,14 +469,13 @@ static void __init memblock_x86_reserve_range_setup_data(void)
>   static void __init arch_reserve_crashkernel(void)
>   {
>   	unsigned long long crash_base, crash_size, low_size = 0;
> -	char *cmdline = boot_command_line;
>   	bool high = false;
>   	int ret;
>   
>   	if (!IS_ENABLED(CONFIG_CRASH_RESERVE))
>   		return;
>   
> -	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
> +	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>   				&crash_size, &crash_base,
>   				&low_size, &high);
>   	if (ret)
> @@ -487,8 +486,7 @@ static void __init arch_reserve_crashkernel(void)
>   		return;
>   	}
>   
> -	reserve_crashkernel_generic(cmdline, crash_size, crash_base,
> -				    low_size, high);
> +	reserve_crashkernel_generic(crash_size, crash_base, low_size, high);
>   }
>   
>   static struct resource standard_io_resources[] = {
> diff --git a/include/linux/crash_reserve.h b/include/linux/crash_reserve.h
> index 5a9df944fb80..1fe7e7d1b214 100644
> --- a/include/linux/crash_reserve.h
> +++ b/include/linux/crash_reserve.h
> @@ -32,13 +32,12 @@ int __init parse_crashkernel(char *cmdline, unsigned long long system_ram,
>   #define CRASH_ADDR_HIGH_MAX		memblock_end_of_DRAM()
>   #endif
>   
> -void __init reserve_crashkernel_generic(char *cmdline,
> -		unsigned long long crash_size,
> -		unsigned long long crash_base,
> -		unsigned long long crash_low_size,
> -		bool high);
> +void __init reserve_crashkernel_generic(unsigned long long crash_size,
> +					unsigned long long crash_base,
> +					unsigned long long crash_low_size,
> +					bool high);
>   #else
> -static inline void __init reserve_crashkernel_generic(char *cmdline,
> +static inline void __init reserve_crashkernel_generic(
>   		unsigned long long crash_size,
>   		unsigned long long crash_base,
>   		unsigned long long crash_low_size,
> diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
> index a620fb4b2116..aff7c0fdbefa 100644
> --- a/kernel/crash_reserve.c
> +++ b/kernel/crash_reserve.c
> @@ -375,11 +375,10 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
>   	return 0;
>   }
>   
> -void __init reserve_crashkernel_generic(char *cmdline,
> -			     unsigned long long crash_size,
> -			     unsigned long long crash_base,
> -			     unsigned long long crash_low_size,
> -			     bool high)
> +void __init reserve_crashkernel_generic(unsigned long long crash_size,
> +					unsigned long long crash_base,
> +					unsigned long long crash_low_size,
> +					bool high)
>   {
>   	unsigned long long search_end = CRASH_ADDR_LOW_MAX, search_base = 0;
>   	bool fixed_base = false;




* Re: [PATCH v2 3/6] crash: let arch decide crash memory export to iomem_resource
  2025-01-21 11:54 ` [PATCH v2 3/6] crash: let arch decide crash memory export to iomem_resource Sourabh Jain
@ 2025-01-23 10:26   ` Hari Bathini
  2025-01-23 11:50     ` Sourabh Jain
  0 siblings, 1 reply; 18+ messages in thread
From: Hari Bathini @ 2025-01-23 10:26 UTC (permalink / raw)
  To: Sourabh Jain, linuxppc-dev
  Cc: Andrew Morton, Baoquan he, Madhavan Srinivasan, Mahesh Salgaonkar,
	Michael Ellerman, kexec, linux-kernel

Hi Sourabh,

On 21/01/25 5:24 pm, Sourabh Jain wrote:
> insert_crashkernel_resources() adds crash memory to iomem_resource if
> generic crashkernel reservation is enabled on an architecture.
> 

> On PowerPC, system RAM is added to iomem_resource. See commit
> c40dd2f766440 ("powerpc: Add System RAM to /proc/iomem").

The changelog clearly says the patch was added for kdump. But neither
kexec-tools nor the kernel code for kdump on powerpc seems to have relied
on that. Were device-tree memory nodes used instead? Wondering if
dropping commit c40dd2f766440 is better..

- Hari

> 
> Enabling generic crashkernel reservation on PowerPC leads to a conflict
> when system RAM is added to iomem_resource because a part of the system
> RAM, the crashkernel memory, has already been added to iomem_resource.
> 
> The next commit in the series "powerpc/crash: use generic crashkernel
> reservation" enables generic crashkernel reservation on PowerPC. If the
> crashkernel is added to iomem_resource, the kernel fails to add
> system RAM to /proc/iomem and prints the following traces:
> 
> CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.13.0-rc2+
> snip...
> NIP [c000000002016b3c] add_system_ram_resources+0xf0/0x15c
> LR [c000000002016b34] add_system_ram_resources+0xe8/0x15c
> Call Trace:
> [c00000000484bbc0] [c000000002016b34] add_system_ram_resources+0xe8/0x15c
> [c00000000484bc20] [c000000000010a4c] do_one_initcall+0x7c/0x39c
> [c00000000484bd00] [c000000002005418] do_initcalls+0x144/0x18c
> [c00000000484bd90] [c000000002005714] kernel_init_freeable+0x21c/0x290
> [c00000000484bdf0] [c0000000000110f4] kernel_init+0x2c/0x1b8
> [c00000000484be50] [c00000000000dd3c] ret_from_kernel_user_thread+0x14/0x1c
> 
> To avoid this, an architecture hook is added in
> insert_crashkernel_resources(), allowing the architecture to decide
> whether crashkernel memory should be added to iomem_resource.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Baoquan he <bhe@redhat.com>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: kexec@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> ---
>   include/linux/crash_reserve.h | 11 +++++++++++
>   kernel/crash_reserve.c        |  3 +++
>   2 files changed, 14 insertions(+)
> 
> diff --git a/include/linux/crash_reserve.h b/include/linux/crash_reserve.h
> index 1fe7e7d1b214..f1205d044dae 100644
> --- a/include/linux/crash_reserve.h
> +++ b/include/linux/crash_reserve.h
> @@ -18,6 +18,17 @@ int __init parse_crashkernel(char *cmdline, unsigned long long system_ram,
>   		unsigned long long *crash_size, unsigned long long *crash_base,
>   		unsigned long long *low_size, bool *high);
>   
> +#ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
> +
> +#ifndef arch_add_crash_res_to_iomem
> +static inline bool arch_add_crash_res_to_iomem(void)
> +{
> +	return true;
> +}
> +#endif
> +
> +#endif
> +
>   #ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
>   #ifndef DEFAULT_CRASH_KERNEL_LOW_SIZE
>   #define DEFAULT_CRASH_KERNEL_LOW_SIZE	(128UL << 20)
> diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
> index aff7c0fdbefa..190104f32fe1 100644
> --- a/kernel/crash_reserve.c
> +++ b/kernel/crash_reserve.c
> @@ -460,6 +460,9 @@ void __init reserve_crashkernel_generic(unsigned long long crash_size,
>   #ifndef HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY
>   static __init int insert_crashkernel_resources(void)
>   {
> +	if (!arch_add_crash_res_to_iomem())
> +		return 0;
> +
>   	if (crashk_res.start < crashk_res.end)
>   		insert_resource(&iomem_resource, &crashk_res);
>   
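
The override idiom used by the quoted hunks (a generic static inline fallback guarded by #ifndef, which an architecture replaces by defining the function plus a same-named macro) can be sketched outside the kernel as follows. This is an illustration only: SKETCH_ARCH_OPTS_OUT and sketch_insert_crashkernel_resources() are hypothetical, and the stand-in returns 1 when it would insert the resource, unlike the real initcall.

```c
#include <assert.h>
#include <stdbool.h>

/* An "architecture header" opts out by defining the hook and its macro. */
#ifdef SKETCH_ARCH_OPTS_OUT
static inline bool arch_add_crash_res_to_iomem(void)
{
	return false;
}
#define arch_add_crash_res_to_iomem arch_add_crash_res_to_iomem
#endif

/* Generic fallback, compiled only when no architecture supplied the hook. */
#ifndef arch_add_crash_res_to_iomem
static inline bool arch_add_crash_res_to_iomem(void)
{
	return true;
}
#endif

/*
 * Hypothetical stand-in for insert_crashkernel_resources(): returns 1
 * when the crash region would be added to the resource tree, else 0.
 */
static int sketch_insert_crashkernel_resources(unsigned long start,
					       unsigned long end)
{
	assert(start <= end);

	if (!arch_add_crash_res_to_iomem())
		return 0;

	return start < end ? 1 : 0;
}
```

Built as is, the generic fallback is used and a non-empty region is reported as inserted. Building with -DSKETCH_ARCH_OPTS_OUT selects the architecture variant, so the function returns 0 before touching the resource.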




* Re: [PATCH v2 4/6] powerpc/kdump: preserve user-specified memory limit
  2025-01-21 11:54 ` [PATCH v2 4/6] powerpc/kdump: preserve user-specified memory limit Sourabh Jain
@ 2025-01-23 10:30   ` Hari Bathini
  2025-01-23 11:22     ` Sourabh Jain
  0 siblings, 1 reply; 18+ messages in thread
From: Hari Bathini @ 2025-01-23 10:30 UTC (permalink / raw)
  To: Sourabh Jain, linuxppc-dev
  Cc: Andrew Morton, Baoquan he, Madhavan Srinivasan, Michael Ellerman,
	kexec, linux-kernel, Mahesh Salgaonkar



On 21/01/25 5:24 pm, Sourabh Jain wrote:
> Commit 59d58189f3d9 ("crash: fix crash memory reserve exceed system
> memory bug") fails crashkernel parsing if the crash size is found to be
> higher than system RAM, which makes the memory_limit adjustment code
> ineffective due to an early exit from reserve_crashkernel().
> 
> Regardless lets not violated the user-specified memory limit by

s/violated/violate/

> adjusting it. Remove this adjustment to ensure all reservations stay
> within the limit. Commit f94f5ac07983 ("powerpc/fadump: Don't update
> the user-specified memory limit") did the same for fadump.
> 

Acked-by: Hari Bathini <hbathini@linux.ibm.com>

> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Baoquan he <bhe@redhat.com>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: kexec@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> ---
>   arch/powerpc/kexec/core.c | 8 --------
>   1 file changed, 8 deletions(-)
> 
> diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
> index b8333a49ea5d..4945b33322ae 100644
> --- a/arch/powerpc/kexec/core.c
> +++ b/arch/powerpc/kexec/core.c
> @@ -150,14 +150,6 @@ void __init reserve_crashkernel(void)
>   		return;
>   	}
>   
> -	/* Crash kernel trumps memory limit */
> -	if (memory_limit && memory_limit <= crashk_res.end) {
> -		memory_limit = crashk_res.end + 1;
> -		total_mem_sz = memory_limit;
> -		printk("Adjusted memory limit for crashkernel, now 0x%llx\n",
> -		       memory_limit);
> -	}
> -
>   	printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
>   			"for crashkernel (System RAM: %ldMB)\n",
>   			(unsigned long)(crash_size >> 20),




* Re: [PATCH v2 5/6] powerpc/crash: use generic crashkernel reservation
  2025-01-21 11:54 ` [PATCH v2 5/6] powerpc/crash: use generic crashkernel reservation Sourabh Jain
@ 2025-01-23 10:45   ` Hari Bathini
  2025-01-23 11:53     ` Sourabh Jain
  0 siblings, 1 reply; 18+ messages in thread
From: Hari Bathini @ 2025-01-23 10:45 UTC (permalink / raw)
  To: Sourabh Jain, linuxppc-dev
  Cc: Andrew Morton, Baoquan he, Madhavan Srinivasan, Michael Ellerman,
	kexec, linux-kernel, Mahesh Salgaonkar

Hi Sourabh,

On 21/01/25 5:24 pm, Sourabh Jain wrote:
> Commit 0ab97169aa05 ("crash_core: add generic function to do
> reservation") added a generic function to reserve crashkernel memory.
> So let's use the same function on powerpc and remove the
> architecture-specific code that essentially does the same thing.
> 
> The generic crashkernel reservation also provides a way to split the
> crashkernel reservation into high and low memory reservations, which can
> be enabled for powerpc in the future.
> 
> Along with moving to the generic crashkernel reservation, the code
> related to finding the base address for the crashkernel has been
> separated into its own function named get_crash_base() for better
> readability and maintainability.
> 
> To prevent crashkernel memory from being added to iomem_resource, the
> function arch_add_crash_res_to_iomem() has been introduced. For further
> details on why this should not be done for the PowerPC architecture,
> please refer to the previous commit titled "crash: let arch decide crash
> memory export to iomem_resource".
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Baoquan he <bhe@redhat.com>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: kexec@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> ---
>   arch/powerpc/Kconfig                     |  3 +
>   arch/powerpc/include/asm/crash_reserve.h | 18 +++++
>   arch/powerpc/include/asm/kexec.h         |  4 +-
>   arch/powerpc/kernel/prom.c               |  2 +-
>   arch/powerpc/kexec/core.c                | 90 ++++++++++--------------
>   5 files changed, 63 insertions(+), 54 deletions(-)
>   create mode 100644 arch/powerpc/include/asm/crash_reserve.h
> 
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index db9f7b2d07bf..880d35fadf40 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -718,6 +718,9 @@ config ARCH_SUPPORTS_CRASH_HOTPLUG
>   	def_bool y
>   	depends on PPC64
>   
> +config ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
> +	def_bool CRASH_RESERVE
> +
>   config FA_DUMP
>   	bool "Firmware-assisted dump"
>   	depends on CRASH_DUMP && PPC64 && (PPC_RTAS || PPC_POWERNV)
> diff --git a/arch/powerpc/include/asm/crash_reserve.h b/arch/powerpc/include/asm/crash_reserve.h
> new file mode 100644
> index 000000000000..f5e60721de41
> --- /dev/null
> +++ b/arch/powerpc/include/asm/crash_reserve.h
> @@ -0,0 +1,18 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_POWERPC_CRASH_RESERVE_H
> +#define _ASM_POWERPC_CRASH_RESERVE_H
> +
> +/* crash kernel regions are page-size aligned */
> +#define CRASH_ALIGN             PAGE_SIZE
> +
> +#ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
> +

> +static inline bool arch_add_crash_res_to_iomem(void)
> +{
> +	return false;
> +}
> +#define arch_add_crash_res_to_iomem arch_add_crash_res_to_iomem

This is probably not needed if commit c40dd2f766440 can be reverted.
Otherwise..

Acked-by: Hari Bathini <hbathini@linux.ibm.com>

> +#endif
> +
> +#endif /* _ASM_POWERPC_CRASH_RESERVE_H */
> +
> diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
> index 270ee93a0f7d..64741558071f 100644
> --- a/arch/powerpc/include/asm/kexec.h
> +++ b/arch/powerpc/include/asm/kexec.h
> @@ -113,9 +113,9 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct crash_mem
>   
>   #ifdef CONFIG_CRASH_RESERVE
>   int __init overlaps_crashkernel(unsigned long start, unsigned long size);
> -extern void reserve_crashkernel(void);
> +extern void arch_reserve_crashkernel(void);
>   #else
> -static inline void reserve_crashkernel(void) {}
> +static inline void arch_reserve_crashkernel(void) {}
>   static inline int overlaps_crashkernel(unsigned long start, unsigned long size) { return 0; }
>   #endif
>   
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index e0059842a1c6..9ed9dde7d231 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -860,7 +860,7 @@ void __init early_init_devtree(void *params)
>   	 */
>   	if (fadump_reserve_mem() == 0)
>   #endif
> -		reserve_crashkernel();
> +		arch_reserve_crashkernel();
>   	early_reserve_mem();
>   
>   	if (memory_limit > memblock_phys_mem_size())
> diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
> index 4945b33322ae..b21cfa814492 100644
> --- a/arch/powerpc/kexec/core.c
> +++ b/arch/powerpc/kexec/core.c
> @@ -80,38 +80,20 @@ void machine_kexec(struct kimage *image)
>   }
>   
>   #ifdef CONFIG_CRASH_RESERVE
> -void __init reserve_crashkernel(void)
> -{
> -	unsigned long long crash_size, crash_base, total_mem_sz;
> -	int ret;
>   
> -	total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
> -	/* use common parsing */
> -	ret = parse_crashkernel(boot_command_line, total_mem_sz,
> -			&crash_size, &crash_base, NULL, NULL);
> -	if (ret == 0 && crash_size > 0) {
> -		crashk_res.start = crash_base;
> -		crashk_res.end = crash_base + crash_size - 1;
> -	}
> -
> -	if (crashk_res.end == crashk_res.start) {
> -		crashk_res.start = crashk_res.end = 0;
> -		return;
> -	}
> -
> -	/* We might have got these values via the command line or the
> -	 * device tree, either way sanitise them now. */
> -
> -	crash_size = resource_size(&crashk_res);
> +static unsigned long long __init get_crash_base(unsigned long long crash_base)
> +{
>   
>   #ifndef CONFIG_NONSTATIC_KERNEL
> -	if (crashk_res.start != KDUMP_KERNELBASE)
> +	if (crash_base != KDUMP_KERNELBASE)
>   		printk("Crash kernel location must be 0x%x\n",
>   				KDUMP_KERNELBASE);
>   
> -	crashk_res.start = KDUMP_KERNELBASE;
> +	return KDUMP_KERNELBASE;
>   #else
> -	if (!crashk_res.start) {
> +	unsigned long long crash_base_align;
> +
> +	if (!crash_base) {
>   #ifdef CONFIG_PPC64
>   		/*
>   		 * On the LPAR platform place the crash kernel to mid of
> @@ -123,45 +105,51 @@ void __init reserve_crashkernel(void)
>   		 * kernel starts at 128MB offset on other platforms.
>   		 */
>   		if (firmware_has_feature(FW_FEATURE_LPAR))
> -			crashk_res.start = min_t(u64, ppc64_rma_size / 2, SZ_512M);
> +			crash_base = min_t(u64, ppc64_rma_size / 2, SZ_512M);
>   		else
> -			crashk_res.start = min_t(u64, ppc64_rma_size / 2, SZ_128M);
> +			crash_base = min_t(u64, ppc64_rma_size / 2, SZ_128M);
>   #else
> -		crashk_res.start = KDUMP_KERNELBASE;
> +		crash_base = KDUMP_KERNELBASE;
>   #endif
>   	}
>   
> -	crash_base = PAGE_ALIGN(crashk_res.start);
> -	if (crash_base != crashk_res.start) {
> -		printk("Crash kernel base must be aligned to 0x%lx\n",
> -				PAGE_SIZE);
> -		crashk_res.start = crash_base;
> -	}
> +	crash_base_align = PAGE_ALIGN(crash_base);
> +	if (crash_base != crash_base_align)
> +		pr_warn("Crash kernel base must be aligned to 0x%lx\n", PAGE_SIZE);
>   
> +	return crash_base_align;
>   #endif
> -	crash_size = PAGE_ALIGN(crash_size);
> -	crashk_res.end = crashk_res.start + crash_size - 1;
> +}
>   
> -	/* The crash region must not overlap the current kernel */
> -	if (overlaps_crashkernel(__pa(_stext), _end - _stext)) {
> -		printk(KERN_WARNING
> -			"Crash kernel can not overlap current kernel\n");
> -		crashk_res.start = crashk_res.end = 0;
> +void __init arch_reserve_crashkernel(void)
> +{
> +	unsigned long long crash_size, crash_base, crash_end;
> +	unsigned long long kernel_start, kernel_size;
> +	unsigned long long total_mem_sz;
> +	int ret;
> +
> +	total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
> +
> +	/* use common parsing */
> +	ret = parse_crashkernel(boot_command_line, total_mem_sz, &crash_size,
> +				&crash_base, NULL, NULL);
> +
> +	if (ret)
>   		return;
> -	}
>   
> -	printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
> -			"for crashkernel (System RAM: %ldMB)\n",
> -			(unsigned long)(crash_size >> 20),
> -			(unsigned long)(crashk_res.start >> 20),
> -			(unsigned long)(total_mem_sz >> 20));
> +	crash_base = get_crash_base(crash_base);
> +	crash_end = crash_base + crash_size - 1;
>   
> -	if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
> -	    memblock_reserve(crashk_res.start, crash_size)) {
> -		pr_err("Failed to reserve memory for crashkernel!\n");
> -		crashk_res.start = crashk_res.end = 0;
> +	kernel_start = __pa(_stext);
> +	kernel_size = _end - _stext;
> +
> +	/* The crash region must not overlap the current kernel */
> +	if ((kernel_start + kernel_size > crash_base) && (kernel_start <= crash_end)) {
> +		pr_warn("Crash kernel can not overlap current kernel\n");
>   		return;
>   	}
> +
> +	reserve_crashkernel_generic(crash_size, crash_base, 0, false);
>   }
>   
>   int __init overlaps_crashkernel(unsigned long start, unsigned long size)
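
For readers following the overlap test in arch_reserve_crashkernel()
above, here is a minimal user-space sketch of the same interval check.
It is an illustration only, not the kernel code: ranges_overlap() is a
hypothetical helper name, and it assumes the kernel image is a half-open
range [kernel_start, kernel_start + kernel_size) while crash_end is
inclusive (crash_base + crash_size - 1), as in the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the patch): returns true when the
 * kernel image [kernel_start, kernel_start + kernel_size) shares at
 * least one byte with the crash region [crash_base, crash_end].
 * crash_end is inclusive, i.e. crash_base + crash_size - 1.
 */
static bool ranges_overlap(unsigned long long kernel_start,
			   unsigned long long kernel_size,
			   unsigned long long crash_base,
			   unsigned long long crash_end)
{
	return (kernel_start + kernel_size > crash_base) &&
	       (kernel_start <= crash_end);
}
```

With a made-up crash region at 128MB of size 256MB (crash_end =
0x17ffffff), a kernel image that ends exactly at 0x8000000 does not
overlap, while one that starts at 0x17ffffff does.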




* Re: [PATCH v2 4/6] powerpc/kdump: preserve user-specified memory limit
  2025-01-23 10:30   ` Hari Bathini
@ 2025-01-23 11:22     ` Sourabh Jain
  0 siblings, 0 replies; 18+ messages in thread
From: Sourabh Jain @ 2025-01-23 11:22 UTC (permalink / raw)
  To: Hari Bathini, linuxppc-dev
  Cc: Andrew Morton, Baoquan he, Madhavan Srinivasan, Michael Ellerman,
	kexec, linux-kernel, Mahesh Salgaonkar

Hello Hari,


On 23/01/25 16:00, Hari Bathini wrote:
>
>
> On 21/01/25 5:24 pm, Sourabh Jain wrote:
>> Commit 59d58189f3d9 ("crash: fix crash memory reserve exceed system
>> memory bug") fails crashkernel parsing if the crash size is found to be
>> higher than system RAM, which makes the memory_limit adjustment code
>> ineffective due to an early exit from reserve_crashkernel().
>>
>> Regardless lets not violated the user-specified memory limit by
>
> s/violated/violate/

Noted. It will be fixed in the next version.

>
>> adjusting it. Remove this adjustment to ensure all reservations stay
>> within the limit. Commit f94f5ac07983 ("powerpc/fadump: Don't update
>> the user-specified memory limit") did the same for fadump.
>>
>
> Acked-by: Hari Bathini <hbathini@linux.ibm.com>

Thanks for the Ack!

- Sourabh Jain


>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Baoquan he <bhe@redhat.com>
>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>> CC: Madhavan Srinivasan <maddy@linux.ibm.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: kexec@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> ---
>>   arch/powerpc/kexec/core.c | 8 --------
>>   1 file changed, 8 deletions(-)
>>
>> diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
>> index b8333a49ea5d..4945b33322ae 100644
>> --- a/arch/powerpc/kexec/core.c
>> +++ b/arch/powerpc/kexec/core.c
>> @@ -150,14 +150,6 @@ void __init reserve_crashkernel(void)
>>           return;
>>       }
>>   -    /* Crash kernel trumps memory limit */
>> -    if (memory_limit && memory_limit <= crashk_res.end) {
>> -        memory_limit = crashk_res.end + 1;
>> -        total_mem_sz = memory_limit;
>> -        printk("Adjusted memory limit for crashkernel, now 0x%llx\n",
>> -               memory_limit);
>> -    }
>> -
>>       printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
>>               "for crashkernel (System RAM: %ldMB)\n",
>>               (unsigned long)(crash_size >> 20),
>




* Re: [PATCH v2 1/6] kexec: Initialize ELF lowest address to ULONG_MAX
  2025-01-23 10:04   ` Hari Bathini
@ 2025-01-23 11:23     ` Sourabh Jain
  0 siblings, 0 replies; 18+ messages in thread
From: Sourabh Jain @ 2025-01-23 11:23 UTC (permalink / raw)
  To: Hari Bathini, linuxppc-dev
  Cc: Andrew Morton, Madhavan Srinivasan, Mahesh Salgaonkar,
	Michael Ellerman, kexec, linux-kernel, Baoquan He

Hello Hari,


On 23/01/25 15:34, Hari Bathini wrote:
>
>
> On 21/01/25 5:24 pm, Sourabh Jain wrote:
>> kexec_elf_load() loads an ELF executable and sets the address of the
>> lowest PT_LOAD section to the address held by the lowest_load_addr
>> function argument.
>>
>> To determine the lowest PT_LOAD address, a local variable lowest_addr
>> (type unsigned long) is initialized to UINT_MAX. After loading each
>> PT_LOAD, its address is compared to lowest_addr. If a loaded PT_LOAD
>> address is lower, lowest_addr is updated. However, setting lowest_addr
>> to UINT_MAX won't work when the kernel image is loaded above 4G, as the
>> returned lowest PT_LOAD address would be invalid. This is resolved by
>> initializing lowest_addr to ULONG_MAX instead.
>>
>> This issue was discovered while implementing crashkernel high/low
>> reservation on the PowerPC architecture.
>>
>
> Acked-by: Hari Bathini <hbathini@linux.ibm.com>


Thanks for the Ack!

- Sourabh Jain


>
>> Fixes: a0458284f062 ("powerpc: Add support code for kexec_file_load()")
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
>> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: kexec@lists.infradead.org
>> Cc: linuxppc-dev@lists.ozlabs.org
>> Cc: linux-kernel@vger.kernel.org
>> Acked-by: Baoquan He <bhe@redhat.com>
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> ---
>>   kernel/kexec_elf.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/kexec_elf.c b/kernel/kexec_elf.c
>> index d3689632e8b9..3a5c25b2adc9 100644
>> --- a/kernel/kexec_elf.c
>> +++ b/kernel/kexec_elf.c
>> @@ -390,7 +390,7 @@ int kexec_elf_load(struct kimage *image, struct 
>> elfhdr *ehdr,
>>                struct kexec_buf *kbuf,
>>                unsigned long *lowest_load_addr)
>>   {
>> -    unsigned long lowest_addr = UINT_MAX;
>> +    unsigned long lowest_addr = ULONG_MAX;
>>       int ret;
>>       size_t i;
>
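
The effect of the sentinel described in the commit message can be
reproduced with a small user-space sketch. This is an illustration under
stated assumptions, not the code from kernel/kexec_elf.c:
lowest_load_addr() is a hypothetical stand-in for the PT_LOAD loop, the
addresses are made up, and unsigned long is assumed to be 64 bits wide
(LP64), as on ppc64.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the PT_LOAD loop in kexec_elf_load():
 * return the smallest address in addrs[], starting from the sentinel
 * value init. Assumes a 64-bit unsigned long (LP64).
 */
static unsigned long lowest_load_addr(const unsigned long *addrs, size_t n,
				      unsigned long init)
{
	unsigned long lowest = init;
	size_t i;

	for (i = 0; i < n; i++)
		if (addrs[i] < lowest)
			lowest = addrs[i];

	return lowest;
}
```

When every segment sits above 4G, a UINT_MAX sentinel is never replaced,
so the function hands back 0xffffffff rather than a real load address.
Starting from ULONG_MAX, the lowest segment address is returned as
expected.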




* Re: [PATCH v2 3/6] crash: let arch decide crash memory export to iomem_resource
  2025-01-23 10:26   ` Hari Bathini
@ 2025-01-23 11:50     ` Sourabh Jain
  0 siblings, 0 replies; 18+ messages in thread
From: Sourabh Jain @ 2025-01-23 11:50 UTC (permalink / raw)
  To: Hari Bathini, linuxppc-dev
  Cc: Andrew Morton, Baoquan he, Madhavan Srinivasan, Mahesh Salgaonkar,
	Michael Ellerman, kexec, linux-kernel

Hello Hari,


On 23/01/25 15:56, Hari Bathini wrote:
> Hi Sourabh,
>
> On 21/01/25 5:24 pm, Sourabh Jain wrote:
>> insert_crashkernel_resources() adds crash memory to iomem_resource if
>> generic crashkernel reservation is enabled on an architecture.
>>
>
>> On PowerPC, system RAM is added to iomem_resource. See commit
>> c40dd2f766440 ("powerpc: Add System RAM to /proc/iomem").
>
> The changelog clearly says the patch was added for kdump. But 
> kexec-tools or kernel code for kdump on powerpc don't seem to have relied
> on that. Device-tree memory nodes were used instead? Wondering if
> dropping commit c40dd2f766440 is better..
>

Hmm. I had also thought of removing that patch for the same reason.
But I noticed the comment below:

/*
  * System memory should not be in /proc/iomem but various tools expect it
  * (eg kdump).
  */

Since the comment says "eg kdump", I thought it might be needed for
tools other than kdump. But I think this was only added for kdump, and
as we know, the kexec/kdump tools no longer use it.

I will revert commit c40dd2f766440 ("powerpc: Add System RAM to
/proc/iomem") in the next series and take feedback from the community.

Thanks for the review.

- Sourabh Jain

>
>>
>> Enabling generic crashkernel reservation on PowerPC leads to a conflict
>> when system RAM is added to iomem_resource because a part of the system
>> RAM, the crashkernel memory, has already been added to iomem_resource.
>>
>> The next commit in the series "powerpc/crash: use generic crashkernel
>> reservation" enables generic crashkernel reservation on PowerPC. If the
>> crashkernel is added to iomem_resource, the kernel fails to add
>> system RAM to /proc/iomem and prints the following traces:
>>
>> CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.13.0-rc2+
>> snip...
>> NIP [c000000002016b3c] add_system_ram_resources+0xf0/0x15c
>> LR [c000000002016b34] add_system_ram_resources+0xe8/0x15c
>> Call Trace:
>> [c00000000484bbc0] [c000000002016b34] 
>> add_system_ram_resources+0xe8/0x15c
>> [c00000000484bc20] [c000000000010a4c] do_one_initcall+0x7c/0x39c
>> [c00000000484bd00] [c000000002005418] do_initcalls+0x144/0x18c
>> [c00000000484bd90] [c000000002005714] kernel_init_freeable+0x21c/0x290
>> [c00000000484bdf0] [c0000000000110f4] kernel_init+0x2c/0x1b8
>> [c00000000484be50] [c00000000000dd3c] 
>> ret_from_kernel_user_thread+0x14/0x1c
>>
>> To avoid this, an architecture hook is added in
>> insert_crashkernel_resources(), allowing the architecture to decide
>> whether crashkernel memory should be added to iomem_resource.
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Baoquan he <bhe@redhat.com>
>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
>> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: kexec@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> ---
>>   include/linux/crash_reserve.h | 11 +++++++++++
>>   kernel/crash_reserve.c        |  3 +++
>>   2 files changed, 14 insertions(+)
>>
>> diff --git a/include/linux/crash_reserve.h 
>> b/include/linux/crash_reserve.h
>> index 1fe7e7d1b214..f1205d044dae 100644
>> --- a/include/linux/crash_reserve.h
>> +++ b/include/linux/crash_reserve.h
>> @@ -18,6 +18,17 @@ int __init parse_crashkernel(char *cmdline, 
>> unsigned long long system_ram,
>>           unsigned long long *crash_size, unsigned long long 
>> *crash_base,
>>           unsigned long long *low_size, bool *high);
>>   +#ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
>> +
>> +#ifndef arch_add_crash_res_to_iomem
>> +static inline bool arch_add_crash_res_to_iomem(void)
>> +{
>> +    return true;
>> +}
>> +#endif
>> +
>> +#endif
>> +
>>   #ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
>>   #ifndef DEFAULT_CRASH_KERNEL_LOW_SIZE
>>   #define DEFAULT_CRASH_KERNEL_LOW_SIZE    (128UL << 20)
>> diff --git a/kernel/crash_reserve.c b/kernel/crash_reserve.c
>> index aff7c0fdbefa..190104f32fe1 100644
>> --- a/kernel/crash_reserve.c
>> +++ b/kernel/crash_reserve.c
>> @@ -460,6 +460,9 @@ void __init reserve_crashkernel_generic(unsigned 
>> long long crash_size,
>>   #ifndef HAVE_ARCH_ADD_CRASH_RES_TO_IOMEM_EARLY
>>   static __init int insert_crashkernel_resources(void)
>>   {
>> +    if (!arch_add_crash_res_to_iomem())
>> +        return 0;
>> +
>>       if (crashk_res.start < crashk_res.end)
>>           insert_resource(&iomem_resource, &crashk_res);
>
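
The override mechanism used by this patch (an arch header defines the
function plus a same-named macro, and the generic header supplies a
fallback only under #ifndef) can be sketched in a single user-space
file. This is an illustration, not the kernel sources: the two header
sections are collapsed into one file, and
insert_crashkernel_resources_sketch() is a hypothetical stand-in that
returns the number of resources it would insert.

```c
#include <stdbool.h>

/* --- what an arch header would provide (powerpc returns false) --- */
static inline bool arch_add_crash_res_to_iomem(void)
{
	return false;
}
#define arch_add_crash_res_to_iomem arch_add_crash_res_to_iomem

/* --- generic header: fallback, compiled only if no arch override --- */
#ifndef arch_add_crash_res_to_iomem
static inline bool arch_add_crash_res_to_iomem(void)
{
	return true;
}
#endif

/*
 * Hypothetical stand-in for insert_crashkernel_resources(): returns how
 * many resources would be inserted into iomem_resource (0 or 1).
 */
static int insert_crashkernel_resources_sketch(unsigned long long start,
						unsigned long long end)
{
	if (!arch_add_crash_res_to_iomem())
		return 0;

	return start < end ? 1 : 0;
}
```

Because the macro is defined before the generic section is seen, the
fallback is skipped and the hook returns false, so nothing is inserted.
Removing the arch section would make the generic default (true) apply.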




* Re: [PATCH v2 5/6] powerpc/crash: use generic crashkernel reservation
  2025-01-23 10:45   ` Hari Bathini
@ 2025-01-23 11:53     ` Sourabh Jain
  0 siblings, 0 replies; 18+ messages in thread
From: Sourabh Jain @ 2025-01-23 11:53 UTC (permalink / raw)
  To: Hari Bathini, linuxppc-dev
  Cc: Andrew Morton, Baoquan he, Madhavan Srinivasan, Michael Ellerman,
	kexec, linux-kernel, Mahesh Salgaonkar

Hello Hari,


On 23/01/25 16:15, Hari Bathini wrote:
> Hi Sourabh,
>
> On 21/01/25 5:24 pm, Sourabh Jain wrote:
>> Commit 0ab97169aa05 ("crash_core: add generic function to do
>> reservation") added a generic function to reserve crashkernel memory.
>> So let's use the same function on powerpc and remove the
>> architecture-specific code that essentially does the same thing.
>>
>> The generic crashkernel reservation also provides a way to split the
>> crashkernel reservation into high and low memory reservations, which can
>> be enabled for powerpc in the future.
>>
>> Along with moving to the generic crashkernel reservation, the code
>> related to finding the base address for the crashkernel has been
>> separated into its own function name get_crash_base() for better
>> readability and maintainability.
>>
>> To prevent crashkernel memory from being added to iomem_resource, the
>> function arch_add_crash_res_to_iomem() has been introduced. For further
>> details on why this should not be done for the PowerPC architecture,
>> please refer to the previous commit titled "crash: let arch decide crash
>> memory export to iomem_resource.
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Baoquan he <bhe@redhat.com>
>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>> CC: Madhavan Srinivasan <maddy@linux.ibm.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: kexec@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> ---
>>   arch/powerpc/Kconfig                     |  3 +
>>   arch/powerpc/include/asm/crash_reserve.h | 18 +++++
>>   arch/powerpc/include/asm/kexec.h         |  4 +-
>>   arch/powerpc/kernel/prom.c               |  2 +-
>>   arch/powerpc/kexec/core.c                | 90 ++++++++++--------------
>>   5 files changed, 63 insertions(+), 54 deletions(-)
>>   create mode 100644 arch/powerpc/include/asm/crash_reserve.h
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index db9f7b2d07bf..880d35fadf40 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -718,6 +718,9 @@ config ARCH_SUPPORTS_CRASH_HOTPLUG
>>       def_bool y
>>       depends on PPC64
>>   +config ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
>> +    def_bool CRASH_RESERVE
>> +
>>   config FA_DUMP
>>       bool "Firmware-assisted dump"
>>       depends on CRASH_DUMP && PPC64 && (PPC_RTAS || PPC_POWERNV)
>> diff --git a/arch/powerpc/include/asm/crash_reserve.h 
>> b/arch/powerpc/include/asm/crash_reserve.h
>> new file mode 100644
>> index 000000000000..f5e60721de41
>> --- /dev/null
>> +++ b/arch/powerpc/include/asm/crash_reserve.h
>> @@ -0,0 +1,18 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +#ifndef _ASM_POWERPC_CRASH_RESERVE_H
>> +#define _ASM_POWERPC_CRASH_RESERVE_H
>> +
>> +/* crash kernel regions are Page size agliged */
>> +#define CRASH_ALIGN             PAGE_SIZE
>> +
>> +#ifdef CONFIG_ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
>> +
>
>> +static inline bool arch_add_crash_res_to_iomem(void)
>> +{
>> +    return false;
>> +}
>> +#define arch_add_crash_res_to_iomem arch_add_crash_res_to_iomem
>
> This is probably not needed if commit c40dd2f766440 can be reverted.
> Otherwise..

Agreed. Since I am reverting c40dd2f766440 in the next series, I will
drop the above change.

>
> Acked-by: Hari Bathini <hbathini@linux.ibm.com>

Thanks for the Ack!

- Sourabh Jain


>
>> +#endif
>> +
>> +#endif /* _ASM_POWERPC_CRASH_RESERVE_H */
>> +
>> diff --git a/arch/powerpc/include/asm/kexec.h 
>> b/arch/powerpc/include/asm/kexec.h
>> index 270ee93a0f7d..64741558071f 100644
>> --- a/arch/powerpc/include/asm/kexec.h
>> +++ b/arch/powerpc/include/asm/kexec.h
>> @@ -113,9 +113,9 @@ int setup_new_fdt_ppc64(const struct kimage 
>> *image, void *fdt, struct crash_mem
>>     #ifdef CONFIG_CRASH_RESERVE
>>   int __init overlaps_crashkernel(unsigned long start, unsigned long 
>> size);
>> -extern void reserve_crashkernel(void);
>> +extern void arch_reserve_crashkernel(void);
>>   #else
>> -static inline void reserve_crashkernel(void) {}
>> +static inline void arch_reserve_crashkernel(void) {}
>>   static inline int overlaps_crashkernel(unsigned long start, 
>> unsigned long size) { return 0; }
>>   #endif
>>   diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
>> index e0059842a1c6..9ed9dde7d231 100644
>> --- a/arch/powerpc/kernel/prom.c
>> +++ b/arch/powerpc/kernel/prom.c
>> @@ -860,7 +860,7 @@ void __init early_init_devtree(void *params)
>>        */
>>       if (fadump_reserve_mem() == 0)
>>   #endif
>> -        reserve_crashkernel();
>> +        arch_reserve_crashkernel();
>>       early_reserve_mem();
>>         if (memory_limit > memblock_phys_mem_size())
>> diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
>> index 4945b33322ae..b21cfa814492 100644
>> --- a/arch/powerpc/kexec/core.c
>> +++ b/arch/powerpc/kexec/core.c
>> @@ -80,38 +80,20 @@ void machine_kexec(struct kimage *image)
>>   }
>>     #ifdef CONFIG_CRASH_RESERVE
>> -void __init reserve_crashkernel(void)
>> -{
>> -    unsigned long long crash_size, crash_base, total_mem_sz;
>> -    int ret;
>>   -    total_mem_sz = memory_limit ? memory_limit : 
>> memblock_phys_mem_size();
>> -    /* use common parsing */
>> -    ret = parse_crashkernel(boot_command_line, total_mem_sz,
>> -            &crash_size, &crash_base, NULL, NULL);
>> -    if (ret == 0 && crash_size > 0) {
>> -        crashk_res.start = crash_base;
>> -        crashk_res.end = crash_base + crash_size - 1;
>> -    }
>> -
>> -    if (crashk_res.end == crashk_res.start) {
>> -        crashk_res.start = crashk_res.end = 0;
>> -        return;
>> -    }
>> -
>> -    /* We might have got these values via the command line or the
>> -     * device tree, either way sanitise them now. */
>> -
>> -    crash_size = resource_size(&crashk_res);
>> +static unsigned long long __init get_crash_base(unsigned long long 
>> crash_base)
>> +{
>>     #ifndef CONFIG_NONSTATIC_KERNEL
>> -    if (crashk_res.start != KDUMP_KERNELBASE)
>> +    if (crash_base != KDUMP_KERNELBASE)
>>           printk("Crash kernel location must be 0x%x\n",
>>                   KDUMP_KERNELBASE);
>>   -    crashk_res.start = KDUMP_KERNELBASE;
>> +    return KDUMP_KERNELBASE;
>>   #else
>> -    if (!crashk_res.start) {
>> +    unsigned long long crash_base_align;
>> +
>> +    if (!crash_base) {
>>   #ifdef CONFIG_PPC64
>>           /*
>>            * On the LPAR platform place the crash kernel to mid of
>> @@ -123,45 +105,51 @@ void __init reserve_crashkernel(void)
>>            * kernel starts at 128MB offset on other platforms.
>>            */
>>           if (firmware_has_feature(FW_FEATURE_LPAR))
>> -            crashk_res.start = min_t(u64, ppc64_rma_size / 2, SZ_512M);
>> +            crash_base = min_t(u64, ppc64_rma_size / 2, SZ_512M);
>>           else
>> -            crashk_res.start = min_t(u64, ppc64_rma_size / 2, SZ_128M);
>> +            crash_base = min_t(u64, ppc64_rma_size / 2, SZ_128M);
>>   #else
>> -        crashk_res.start = KDUMP_KERNELBASE;
>> +        crash_base = KDUMP_KERNELBASE;
>>   #endif
>>       }
>>   -    crash_base = PAGE_ALIGN(crashk_res.start);
>> -    if (crash_base != crashk_res.start) {
>> -        printk("Crash kernel base must be aligned to 0x%lx\n",
>> -                PAGE_SIZE);
>> -        crashk_res.start = crash_base;
>> -    }
>> +    crash_base_align = PAGE_ALIGN(crash_base);
>> +    if (crash_base != crash_base_align)
>> +        pr_warn("Crash kernel base must be aligned to 0x%lx\n", 
>> PAGE_SIZE);
>>   +    return crash_base_align;
>>   #endif
>> -    crash_size = PAGE_ALIGN(crash_size);
>> -    crashk_res.end = crashk_res.start + crash_size - 1;
>> +}
>>   -    /* The crash region must not overlap the current kernel */
>> -    if (overlaps_crashkernel(__pa(_stext), _end - _stext)) {
>> -        printk(KERN_WARNING
>> -            "Crash kernel can not overlap current kernel\n");
>> -        crashk_res.start = crashk_res.end = 0;
>> +void __init arch_reserve_crashkernel(void)
>> +{
>> +    unsigned long long crash_size, crash_base, crash_end;
>> +    unsigned long long kernel_start, kernel_size;
>> +    unsigned long long total_mem_sz;
>> +    int ret;
>> +
>> +    total_mem_sz = memory_limit ? memory_limit : 
>> memblock_phys_mem_size();
>> +
>> +    /* use common parsing */
>> +    ret = parse_crashkernel(boot_command_line, total_mem_sz, 
>> &crash_size,
>> +                &crash_base, NULL, NULL);
>> +
>> +    if (ret)
>>           return;
>> -    }
>>   -    printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
>> -            "for crashkernel (System RAM: %ldMB)\n",
>> -            (unsigned long)(crash_size >> 20),
>> -            (unsigned long)(crashk_res.start >> 20),
>> -            (unsigned long)(total_mem_sz >> 20));
>> +    crash_base = get_crash_base(crash_base);
>> +    crash_end = crash_base + crash_size - 1;
>>   -    if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
>> -        memblock_reserve(crashk_res.start, crash_size)) {
>> -        pr_err("Failed to reserve memory for crashkernel!\n");
>> -        crashk_res.start = crashk_res.end = 0;
>> +    kernel_start = __pa(_stext);
>> +    kernel_size = _end - _stext;
>> +
>> +    /* The crash region must not overlap the current kernel */
>> +    if ((kernel_start + kernel_size > crash_base) && (kernel_start 
>> <= crash_end)) {
>> +        pr_warn("Crash kernel can not overlap current kernel\n");
>>           return;
>>       }
>> +
>> +    reserve_crashkernel_generic(crash_size, crash_base, 0, false);
>>   }
>>     int __init overlaps_crashkernel(unsigned long start, unsigned 
>> long size)
>
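
The base-address selection in get_crash_base() for the PPC64
CONFIG_NONSTATIC_KERNEL case can be sketched in user space as follows.
This is a rough illustration, not the kernel function: the SKETCH_*
constants, min_ull() and get_crash_base_sketch() are hypothetical names,
a 64K page size is assumed, the LPAR check is passed in as a plain bool,
and the pr_warn() on misalignment is omitted.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SIZE 0x10000ULL	/* assumption: 64K pages */
#define SKETCH_SZ_128M   (128ULL << 20)
#define SKETCH_SZ_512M   (512ULL << 20)

static unsigned long long min_ull(unsigned long long a, unsigned long long b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical user-space version of get_crash_base(): if no base was
 * given, pick min(rma_size / 2, 512M) on LPAR or min(rma_size / 2, 128M)
 * otherwise, then round the result up to the page size.
 */
static unsigned long long get_crash_base_sketch(unsigned long long crash_base,
						unsigned long long rma_size,
						bool is_lpar)
{
	if (!crash_base)
		crash_base = min_ull(rma_size / 2,
				     is_lpar ? SKETCH_SZ_512M : SKETCH_SZ_128M);

	/* Same effect as PAGE_ALIGN(): round up to the next page boundary. */
	return (crash_base + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}
```

With a made-up 1GB RMA and no base on the command line, this picks 512MB
on LPAR and 128MB elsewhere. A user-supplied base of 0x8000001 is rounded
up to 0x8010000 under the 64K page assumption.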




* Re: [PATCH v2 6/6] crash: option to let arch decide mem range is usable
  2025-01-21 11:54 ` [PATCH v2 6/6] crash: option to let arch decide mem range is usable Sourabh Jain
@ 2025-01-24  9:52   ` Hari Bathini
  2025-01-24 10:28     ` Sourabh Jain
  0 siblings, 1 reply; 18+ messages in thread
From: Hari Bathini @ 2025-01-24  9:52 UTC (permalink / raw)
  To: Sourabh Jain, linuxppc-dev
  Cc: Andrew Morton, Baoquan he, Madhavan Srinivasan, Mahesh Salgaonkar,
	Michael Ellerman, kexec, linux-kernel

Hi Sourabh,

On 21/01/25 5:24 pm, Sourabh Jain wrote:
> On PowerPC, the memory reserved for the crashkernel can contain
> components like RTAS, TCE, OPAL, etc., which should be avoided when
> loading kexec segments into crashkernel memory. Due to these special
> components, PowerPC has its own set of functions to locate holes in the
> crashkernel memory for loading kexec segments for kdump. However, for
> loading kexec segments in the kexec case, PowerPC uses generic functions
> to locate holes.
> 
> So, let's use generic functions to locate memory holes for kdump on
> PowerPC by adding an arch hook to handle such special regions while
> loading kexec segments, and remove the PowerPC functions to locate
> holes.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Baoquan he <bhe@redhat.com>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: kexec@lists.infradead.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> ---
>   arch/powerpc/include/asm/kexec.h  |   6 +-
>   arch/powerpc/kexec/file_load_64.c | 259 ++----------------------------
>   include/linux/kexec.h             |   9 ++
>   kernel/kexec_file.c               |  12 ++
>   4 files changed, 34 insertions(+), 252 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
> index 64741558071f..5e4680f9ff35 100644
> --- a/arch/powerpc/include/asm/kexec.h
> +++ b/arch/powerpc/include/asm/kexec.h
> @@ -95,8 +95,10 @@ int arch_kexec_kernel_image_probe(struct kimage *image, void *buf, unsigned long
>   int arch_kimage_file_post_load_cleanup(struct kimage *image);
>   #define arch_kimage_file_post_load_cleanup arch_kimage_file_post_load_cleanup
>   
> -int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf);
> -#define arch_kexec_locate_mem_hole arch_kexec_locate_mem_hole
> +int arch_check_excluded_range(struct kimage *image, unsigned long start,
> +			      unsigned long end);
> +#define arch_check_excluded_range  arch_check_excluded_range
> +
>   
>   int load_crashdump_segments_ppc64(struct kimage *image,
>   				  struct kexec_buf *kbuf);
> diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
> index dc65c1391157..e7ef8b2a2554 100644
> --- a/arch/powerpc/kexec/file_load_64.c
> +++ b/arch/powerpc/kexec/file_load_64.c
> @@ -49,201 +49,18 @@ const struct kexec_file_ops * const kexec_file_loaders[] = {
>   	NULL
>   };
>   
> -/**
> - * __locate_mem_hole_top_down - Looks top down for a large enough memory hole
> - *                              in the memory regions between buf_min & buf_max
> - *                              for the buffer. If found, sets kbuf->mem.
> - * @kbuf:                       Buffer contents and memory parameters.
> - * @buf_min:                    Minimum address for the buffer.
> - * @buf_max:                    Maximum address for the buffer.
> - *
> - * Returns 0 on success, negative errno on error.
> - */
> -static int __locate_mem_hole_top_down(struct kexec_buf *kbuf,
> -				      u64 buf_min, u64 buf_max)
> -{
> -	int ret = -EADDRNOTAVAIL;
> -	phys_addr_t start, end;
> -	u64 i;
> -
> -	for_each_mem_range_rev(i, &start, &end) {
> -		/*
> -		 * memblock uses [start, end) convention while it is
> -		 * [start, end] here. Fix the off-by-one to have the
> -		 * same convention.
> -		 */
> -		end -= 1;
> -
> -		if (start > buf_max)
> -			continue;
> -
> -		/* Memory hole not found */
> -		if (end < buf_min)
> -			break;
> -
> -		/* Adjust memory region based on the given range */
> -		if (start < buf_min)
> -			start = buf_min;
> -		if (end > buf_max)
> -			end = buf_max;
> -
> -		start = ALIGN(start, kbuf->buf_align);
> -		if (start < end && (end - start + 1) >= kbuf->memsz) {
> -			/* Suitable memory range found. Set kbuf->mem */
> -			kbuf->mem = ALIGN_DOWN(end - kbuf->memsz + 1,
> -					       kbuf->buf_align);
> -			ret = 0;
> -			break;
> -		}
> -	}
> -
> -	return ret;
> -}
> -
> -/**
> - * locate_mem_hole_top_down_ppc64 - Skip special memory regions to find a
> - *                                  suitable buffer with top down approach.
> - * @kbuf:                           Buffer contents and memory parameters.
> - * @buf_min:                        Minimum address for the buffer.
> - * @buf_max:                        Maximum address for the buffer.
> - * @emem:                           Exclude memory ranges.
> - *
> - * Returns 0 on success, negative errno on error.
> - */
> -static int locate_mem_hole_top_down_ppc64(struct kexec_buf *kbuf,
> -					  u64 buf_min, u64 buf_max,
> -					  const struct crash_mem *emem)
> +int arch_check_excluded_range(struct kimage *image, unsigned long start,
> +			      unsigned long end)
>   {
> -	int i, ret = 0, err = -EADDRNOTAVAIL;
> -	u64 start, end, tmin, tmax;
> -
> -	tmax = buf_max;
> -	for (i = (emem->nr_ranges - 1); i >= 0; i--) {
> -		start = emem->ranges[i].start;
> -		end = emem->ranges[i].end;
> -
> -		if (start > tmax)
> -			continue;
> -
> -		if (end < tmax) {
> -			tmin = (end < buf_min ? buf_min : end + 1);
> -			ret = __locate_mem_hole_top_down(kbuf, tmin, tmax);
> -			if (!ret)
> -				return 0;
> -		}
> -
> -		tmax = start - 1;
> -
> -		if (tmax < buf_min) {
> -			ret = err;
> -			break;
> -		}
> -		ret = 0;
> -	}
> -
> -	if (!ret) {
> -		tmin = buf_min;
> -		ret = __locate_mem_hole_top_down(kbuf, tmin, tmax);
> -	}
> -	return ret;
> -}
> -
> -/**
> - * __locate_mem_hole_bottom_up - Looks bottom up for a large enough memory hole
> - *                               in the memory regions between buf_min & buf_max
> - *                               for the buffer. If found, sets kbuf->mem.
> - * @kbuf:                        Buffer contents and memory parameters.
> - * @buf_min:                     Minimum address for the buffer.
> - * @buf_max:                     Maximum address for the buffer.
> - *
> - * Returns 0 on success, negative errno on error.
> - */
> -static int __locate_mem_hole_bottom_up(struct kexec_buf *kbuf,
> -				       u64 buf_min, u64 buf_max)
> -{
> -	int ret = -EADDRNOTAVAIL;
> -	phys_addr_t start, end;
> -	u64 i;
> -
> -	for_each_mem_range(i, &start, &end) {
> -		/*
> -		 * memblock uses [start, end) convention while it is
> -		 * [start, end] here. Fix the off-by-one to have the
> -		 * same convention.
> -		 */
> -		end -= 1;
> -
> -		if (end < buf_min)
> -			continue;
> -
> -		/* Memory hole not found */
> -		if (start > buf_max)
> -			break;
> -
> -		/* Adjust memory region based on the given range */
> -		if (start < buf_min)
> -			start = buf_min;
> -		if (end > buf_max)
> -			end = buf_max;
> -
> -		start = ALIGN(start, kbuf->buf_align);
> -		if (start < end && (end - start + 1) >= kbuf->memsz) {
> -			/* Suitable memory range found. Set kbuf->mem */
> -			kbuf->mem = start;
> -			ret = 0;
> -			break;
> -		}
> -	}
> -
> -	return ret;
> -}
> -
> -/**
> - * locate_mem_hole_bottom_up_ppc64 - Skip special memory regions to find a
> - *                                   suitable buffer with bottom up approach.
> - * @kbuf:                            Buffer contents and memory parameters.
> - * @buf_min:                         Minimum address for the buffer.
> - * @buf_max:                         Maximum address for the buffer.
> - * @emem:                            Exclude memory ranges.
> - *
> - * Returns 0 on success, negative errno on error.
> - */
> -static int locate_mem_hole_bottom_up_ppc64(struct kexec_buf *kbuf,
> -					   u64 buf_min, u64 buf_max,
> -					   const struct crash_mem *emem)
> -{
> -	int i, ret = 0, err = -EADDRNOTAVAIL;
> -	u64 start, end, tmin, tmax;
> -
> -	tmin = buf_min;
> -	for (i = 0; i < emem->nr_ranges; i++) {
> -		start = emem->ranges[i].start;
> -		end = emem->ranges[i].end;
> -
> -		if (end < tmin)
> -			continue;
> -
> -		if (start > tmin) {
> -			tmax = (start > buf_max ? buf_max : start - 1);
> -			ret = __locate_mem_hole_bottom_up(kbuf, tmin, tmax);
> -			if (!ret)
> -				return 0;
> -		}
> -
> -		tmin = end + 1;
> +	struct crash_mem *emem;
> +	int i;
>   
> -		if (tmin > buf_max) {
> -			ret = err;
> -			break;
> -		}
> -		ret = 0;
> -	}
> +	emem = image->arch.exclude_ranges;
> +	for (i = 0; i < emem->nr_ranges; i++)
> +		if (start < emem->ranges[i].end && end > emem->ranges[i].start)
> +			return 1;
>   
> -	if (!ret) {
> -		tmax = buf_max;
> -		ret = __locate_mem_hole_bottom_up(kbuf, tmin, tmax);
> -	}
> -	return ret;
> +	return 0;
>   }
>   
>   #ifdef CONFIG_CRASH_DUMP
> @@ -1004,64 +821,6 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct crash_mem
>   	return ret;
>   }
>   
> -/**
> - * arch_kexec_locate_mem_hole - Skip special memory regions like rtas, opal,
> - *                              tce-table, reserved-ranges & such (exclude
> - *                              memory ranges) as they can't be used for kexec
> - *                              segment buffer. Sets kbuf->mem when a suitable
> - *                              memory hole is found.
> - * @kbuf:                       Buffer contents and memory parameters.
> - *
> - * Assumes minimum of PAGE_SIZE alignment for kbuf->memsz & kbuf->buf_align.
> - *
> - * Returns 0 on success, negative errno on error.
> - */
> -int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf)
> -{
> -	struct crash_mem **emem;
> -	u64 buf_min, buf_max;
> -	int ret;
> -
> -	/* Look up the exclude ranges list while locating the memory hole */
> -	emem = &(kbuf->image->arch.exclude_ranges);
> -	if (!(*emem) || ((*emem)->nr_ranges == 0)) {
> -		pr_warn("No exclude range list. Using the default locate mem hole method\n");
> -		return kexec_locate_mem_hole(kbuf);
> -	}
> -
> -	buf_min = kbuf->buf_min;
> -	buf_max = kbuf->buf_max;
> -	/* Segments for kdump kernel should be within crashkernel region */
> -	if (IS_ENABLED(CONFIG_CRASH_DUMP) && kbuf->image->type == KEXEC_TYPE_CRASH) {
> -		buf_min = (buf_min < crashk_res.start ?
> -			   crashk_res.start : buf_min);
> -		buf_max = (buf_max > crashk_res.end ?
> -			   crashk_res.end : buf_max);
> -	}
> -
> -	if (buf_min > buf_max) {
> -		pr_err("Invalid buffer min and/or max values\n");
> -		return -EINVAL;
> -	}
> -
> -	if (kbuf->top_down)
> -		ret = locate_mem_hole_top_down_ppc64(kbuf, buf_min, buf_max,
> -						     *emem);
> -	else
> -		ret = locate_mem_hole_bottom_up_ppc64(kbuf, buf_min, buf_max,
> -						      *emem);
> -
> -	/* Add the buffer allocated to the exclude list for the next lookup */
> -	if (!ret) {
> -		add_mem_range(emem, kbuf->mem, kbuf->memsz);
> -		sort_memory_ranges(*emem, true);
> -	} else {
> -		pr_err("Failed to locate memory buffer of size %lu\n",
> -		       kbuf->memsz);
> -	}
> -	return ret;
> -}
> -
>   /**
>    * arch_kexec_kernel_image_probe - Does additional handling needed to setup
>    *                                 kexec segments.


> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index f0e9f8eda7a3..407f8b0346aa 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -205,6 +205,15 @@ static inline int arch_kimage_file_post_load_cleanup(struct kimage *image)
>   }
>   #endif
>   
> +#ifndef arch_check_excluded_range
> +static inline int arch_check_excluded_range(struct kimage *image,
> +					    unsigned long start,
> +					    unsigned long end)
> +{
> +	return 0;
> +}
> +#endif
> +
>   #ifdef CONFIG_KEXEC_SIG
>   #ifdef CONFIG_SIGNED_PE_FILE_VERIFICATION
>   int kexec_kernel_verify_pe_sig(const char *kernel, unsigned long kernel_len);
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 3eedb8c226ad..fba686487e3b 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -464,6 +464,12 @@ static int locate_mem_hole_top_down(unsigned long start, unsigned long end,
>   			continue;
>   		}
>   
> +		/* Make sure this does not conflict with exclude range */
> +		if (arch_check_excluded_range(image, temp_start, temp_end)) {
> +			temp_start = temp_start - PAGE_SIZE;
> +			continue;
> +		}
> +
>   		/* We found a suitable memory range */
>   		break;
>   	} while (1);
> @@ -498,6 +504,12 @@ static int locate_mem_hole_bottom_up(unsigned long start, unsigned long end,
>   			continue;
>   		}
>   
> +		/* Make sure this does not conflict with exclude range */
> +		if (arch_check_excluded_range(image, temp_start, temp_end)) {
> +			temp_start = temp_start + PAGE_SIZE;
> +			continue;
> +		}
> +
>   		/* We found a suitable memory range */
>   		break;
>   	} while (1);

Please split this arch-independent patch and have it as a preceding
patch. Arch-specific changes can go in a separate patch.

- Hari
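
For readers following the kernel/kexec_file.c hunks above, here is a minimal
userspace sketch of how the hook interacts with the bottom-up search. It is not
the kernel code. The names sketch_range, sketch_excluded,
sketch_check_excluded_range, sketch_locate_hole_bottom_up and SKETCH_PAGE_SIZE
are hypothetical, the exclude list is a fixed array instead of
image->arch.exclude_ranges, and the alignment handling and other checks in the
real loop are omitted. End addresses are assumed to be inclusive.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, for illustration only */

/* Hypothetical stand-in for one excluded range; 'end' is inclusive. */
struct sketch_range {
	unsigned long start;
	unsigned long end;
};

/* Made-up exclude list; the patch reads the real one from the kimage. */
static const struct sketch_range sketch_excluded[] = {
	{ 0x2000, 0x2fff },
	{ 0x5000, 0x6fff },
};

/* Same comparison as the patch: returns 1 on overlap, 0 otherwise. */
static int sketch_check_excluded_range(unsigned long start, unsigned long end)
{
	size_t i;

	for (i = 0; i < sizeof(sketch_excluded) / sizeof(sketch_excluded[0]); i++)
		if (start < sketch_excluded[i].end &&
		    end > sketch_excluded[i].start)
			return 1;

	return 0;
}

/*
 * Walk upwards through [start, end] one page at a time until a candidate
 * of memsz bytes avoids every excluded range. Returns 0 and sets *mem on
 * success, -1 if the region is exhausted.
 */
static int sketch_locate_hole_bottom_up(unsigned long start, unsigned long end,
					unsigned long memsz, unsigned long *mem)
{
	unsigned long temp_start = start, temp_end;

	assert(memsz != 0);

	for (;;) {
		temp_end = temp_start + memsz - 1;

		/* Ran past the region, or the addition wrapped around */
		if (temp_end > end || temp_end < temp_start)
			return -1;

		/* Candidate hits an excluded range: step one page up, retry */
		if (sketch_check_excluded_range(temp_start, temp_end)) {
			temp_start += SKETCH_PAGE_SIZE;
			continue;
		}

		*mem = temp_start;
		return 0;
	}
}
```

With this made-up list, a two-page request in [0x1000, 0x9fff] skips the
candidates at 0x1000 and 0x2000 and settles on 0x3000. Because the comparisons
are strict while the ends are inclusive, a candidate sharing only a single
boundary byte with an excluded range would not be flagged in this sketch;
page-aligned inputs like the ones here do not hit that case.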



* Re: [PATCH v2 6/6] crash: option to let arch decide mem range is usable
  2025-01-24  9:52   ` Hari Bathini
@ 2025-01-24 10:28     ` Sourabh Jain
  0 siblings, 0 replies; 18+ messages in thread
From: Sourabh Jain @ 2025-01-24 10:28 UTC (permalink / raw)
  To: Hari Bathini, linuxppc-dev
  Cc: Andrew Morton, Baoquan he, Madhavan Srinivasan, Mahesh Salgaonkar,
	Michael Ellerman, kexec, linux-kernel

Hello Hari,


On 24/01/25 15:22, Hari Bathini wrote:
> Hi Sourabh,
>
> On 21/01/25 5:24 pm, Sourabh Jain wrote:
>> On PowerPC, the memory reserved for the crashkernel can contain
>> components like RTAS, TCE, OPAL, etc., which should be avoided when
>> loading kexec segments into crashkernel memory. Due to these special
>> components, PowerPC has its own set of functions to locate holes in the
>> crashkernel memory for loading kexec segments for kdump. However, for
>> loading kexec segments in the kexec case, PowerPC uses generic functions
>> to locate holes.
>>
>> So, let's use generic functions to locate memory holes for kdump on
>> PowerPC by adding an arch hook to handle such special regions while
>> loading kexec segments, and remove the PowerPC functions to locate
>> holes.
>>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Baoquan he <bhe@redhat.com>
>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
>> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: kexec@lists.infradead.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> ---
>>   arch/powerpc/include/asm/kexec.h  |   6 +-
>>   arch/powerpc/kexec/file_load_64.c | 259 ++----------------------------
>>   include/linux/kexec.h             |   9 ++
>>   kernel/kexec_file.c               |  12 ++
>>   4 files changed, 34 insertions(+), 252 deletions(-)
>>
>> [...]
>
> Please split this arch-independent patch and have it as a preceding
> patch. Arch-specific changes can go in a separate patch.

Yes, I will split this patch in the next version.

Thanks,
Sourabh Jain




Thread overview: 18+ messages
2025-01-21 11:54 [PATCH v2 0/6] powerpc/crash: use generic crashkernel reservation Sourabh Jain
2025-01-21 11:54 ` [PATCH v2 1/6] kexec: Initialize ELF lowest address to ULONG_MAX Sourabh Jain
2025-01-23 10:04   ` Hari Bathini
2025-01-23 11:23     ` Sourabh Jain
2025-01-21 11:54 ` [PATCH v2 2/6] crash: remove an unused argument from reserve_crashkernel_generic() Sourabh Jain
2025-01-23 10:13   ` Hari Bathini
2025-01-21 11:54 ` [PATCH v2 3/6] crash: let arch decide crash memory export to iomem_resource Sourabh Jain
2025-01-23 10:26   ` Hari Bathini
2025-01-23 11:50     ` Sourabh Jain
2025-01-21 11:54 ` [PATCH v2 4/6] powerpc/kdump: preserve user-specified memory limit Sourabh Jain
2025-01-23 10:30   ` Hari Bathini
2025-01-23 11:22     ` Sourabh Jain
2025-01-21 11:54 ` [PATCH v2 5/6] powerpc/crash: use generic crashkernel reservation Sourabh Jain
2025-01-23 10:45   ` Hari Bathini
2025-01-23 11:53     ` Sourabh Jain
2025-01-21 11:54 ` [PATCH v2 6/6] crash: option to let arch decide mem range is usable Sourabh Jain
2025-01-24  9:52   ` Hari Bathini
2025-01-24 10:28     ` Sourabh Jain
