* [PATCH 0/4] powerpc/kdump: Support high crashkernel reservation
@ 2025-10-27 15:13 Sourabh Jain
2025-10-27 15:13 ` [PATCH 1/4] powerpc/mmu: do MMU type discovery before " Sourabh Jain
` (4 more replies)
0 siblings, 5 replies; 8+ messages in thread
From: Sourabh Jain @ 2025-10-27 15:13 UTC (permalink / raw)
To: linuxppc-dev
Cc: Sourabh Jain, Baoquan he, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Ritesh Harjani (IBM),
Shivang Upadhyay
Add support for reserving crashkernel memory in higher address ranges
using the crashkernel=xxM,high command-line option.
With this feature, most of the crashkernel memory for kdump will be
reserved in high memory regions, while only a small portion (64 MB) will
be reserved in low memory for the kdump kernel. This helps free up low
memory for other components that require allocations in that region.
For example, if crashkernel=2G,high is specified, the kernel will reserve
2 GB of crashkernel memory near the end of system RAM and an additional
64 MB of low memory (below 1 GB) for RTAS to function properly.
Currently, this feature is supported only on PPC64 systems with 64-bit
RTAS instantiation and Radix MMU enabled.
Two critical changes were made to support this feature:
- CPU feature discovery is now performed before crashkernel
reservation. This ensures the MMU type is determined before reserving
crashkernel memory. (Patch 01/04)
- RTAS instantiation has been moved to 64-bit mode. (Patch 02/04)
Apply the following patch first, and then apply this patch series:
https://lore.kernel.org/all/20251024170118.297472-1-sourabhjain@linux.ibm.com/
Cc: Baoquan he <bhe@redhat.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
Sourabh Jain (4):
powerpc/mmu: do MMU type discovery before crashkernel reservation
powerpc: move to 64-bit RTAS
powerpc/kdump: consider high crashkernel memory if enabled
powerpc/kdump: add support for high crashkernel reservation
arch/powerpc/include/asm/book3s/64/mmu.h | 1 +
arch/powerpc/include/asm/crash_reserve.h | 8 +++++
arch/powerpc/include/asm/kexec.h | 1 +
arch/powerpc/include/asm/mmu.h | 1 +
arch/powerpc/include/asm/rtas.h | 11 ++++++
arch/powerpc/kernel/prom.c | 28 ++++++++-------
arch/powerpc/kernel/prom_init.c | 26 +++++++++++---
arch/powerpc/kernel/rtas.c | 5 +++
arch/powerpc/kernel/rtas_entry.S | 17 ++++++++-
arch/powerpc/kexec/core.c | 45 +++++++++++++++++-------
arch/powerpc/kexec/elf_64.c | 10 ++++--
arch/powerpc/kexec/file_load_64.c | 5 +--
arch/powerpc/kexec/ranges.c | 24 +++++++++++--
arch/powerpc/mm/init_64.c | 27 ++++++++------
14 files changed, 161 insertions(+), 48 deletions(-)
--
2.51.0
* [PATCH 1/4] powerpc/mmu: do MMU type discovery before crashkernel reservation
2025-10-27 15:13 [PATCH 0/4] powerpc/kdump: Support high crashkernel reservation Sourabh Jain
@ 2025-10-27 15:13 ` Sourabh Jain
2025-10-31 4:53 ` Ritesh Harjani
2025-10-27 15:13 ` [PATCH 2/4] powerpc: move to 64-bit RTAS Sourabh Jain
` (3 subsequent siblings)
4 siblings, 1 reply; 8+ messages in thread
From: Sourabh Jain @ 2025-10-27 15:13 UTC (permalink / raw)
To: linuxppc-dev
Cc: Sourabh Jain, Baoquan he, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Ritesh Harjani (IBM),
Shivang Upadhyay
Crashkernel reservation on high memory depends on the MMU type, so
finalize the MMU type before calling arch_reserve_crashkernel().
With the changes introduced here, early_radix_enabled() becomes usable
and will be used in arch_reserve_crashkernel() in the upcoming patch.
early_radix_enabled() depends on cur_cpu_spec->mmu_features to find
out if the radix MMU is enabled. The radix MMU bit in mmu_features is
discovered from the FDT and kernel configs. To make sure the MMU type is
finalized before arch_reserve_crashkernel() is called, the function that
scans the FDT and sets mmu_features, along with some bits from
mmu_early_type_finalize(), has been moved above
arch_reserve_crashkernel().
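For reference, a minimal sketch of this dependency, with stand-in
definitions so it builds outside the kernel (the real definitions live in
the powerpc MMU headers; the feature bit value below is illustrative):
#include <stdbool.h>
#define MMU_FTR_TYPE_RADIX	0x00000040UL	/* illustrative bit value */
/* Stand-in for the kernel's cur_cpu_spec; mmu_features is filled in by the
 * FDT/CPU-features scan that this patch moves earlier. */
struct cpu_spec {
	unsigned long mmu_features;
};
static struct cpu_spec boot_cpu_spec;
static struct cpu_spec *cur_cpu_spec = &boot_cpu_spec;
/* Paraphrase of early_mmu_has_feature()/early_radix_enabled(): both only
 * test a bit in cur_cpu_spec->mmu_features, so that field must be final
 * before arch_reserve_crashkernel() runs. */
static inline bool early_mmu_has_feature(unsigned long feature)
{
	return !!(cur_cpu_spec->mmu_features & feature);
}
#define early_radix_enabled()	early_mmu_has_feature(MMU_FTR_TYPE_RADIX)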
Cc: Baoquan he <bhe@redhat.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
arch/powerpc/include/asm/book3s/64/mmu.h | 1 +
arch/powerpc/include/asm/mmu.h | 1 +
arch/powerpc/kernel/prom.c | 28 +++++++++++++-----------
arch/powerpc/mm/init_64.c | 27 ++++++++++++++---------
4 files changed, 34 insertions(+), 23 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 48631365b48c..7a3b2ff02041 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -208,6 +208,7 @@ extern int mmu_vmemmap_psize;
/* MMU initialization */
void mmu_early_init_devtree(void);
+void mmu_early_type_finalize(void);
void hash__early_init_devtree(void);
void radix__early_init_devtree(void);
#ifdef CONFIG_PPC_PKEY
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
index 5f9c5d436e17..c40dc6349e55 100644
--- a/arch/powerpc/include/asm/mmu.h
+++ b/arch/powerpc/include/asm/mmu.h
@@ -384,6 +384,7 @@ extern void early_init_mmu_secondary(void);
extern void setup_initial_memory_limit(phys_addr_t first_memblock_base,
phys_addr_t first_memblock_size);
static inline void mmu_early_init_devtree(void) { }
+static inline void mmu_early_type_finalize(void) { }
static inline void pkey_early_init_devtree(void) {}
diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
index 9ed9dde7d231..db1615f26075 100644
--- a/arch/powerpc/kernel/prom.c
+++ b/arch/powerpc/kernel/prom.c
@@ -853,6 +853,21 @@ void __init early_init_devtree(void *params)
if (PHYSICAL_START > MEMORY_START)
memblock_reserve(MEMORY_START, int_vector_size);
reserve_kdump_trampoline();
+
+ DBG("Scanning CPUs ...\n");
+
+ dt_cpu_ftrs_scan();
+
+ /* Retrieve CPU related informations from the flat tree
+ * (altivec support, boot CPU ID, ...)
+ */
+ of_scan_flat_dt(early_init_dt_scan_cpus, NULL);
+ if (boot_cpuid < 0) {
+ printk("Failed to identify boot CPU !\n");
+ BUG();
+ }
+
+ mmu_early_type_finalize();
#if defined(CONFIG_FA_DUMP) || defined(CONFIG_PRESERVE_FA_DUMP)
/*
* If we fail to reserve memory for firmware-assisted dump then
@@ -884,19 +899,6 @@ void __init early_init_devtree(void *params)
* FIXME .. and the initrd too? */
move_device_tree();
- DBG("Scanning CPUs ...\n");
-
- dt_cpu_ftrs_scan();
-
- /* Retrieve CPU related informations from the flat tree
- * (altivec support, boot CPU ID, ...)
- */
- of_scan_flat_dt(early_init_dt_scan_cpus, NULL);
- if (boot_cpuid < 0) {
- printk("Failed to identify boot CPU !\n");
- BUG();
- }
-
save_fscr_to_task();
#if defined(CONFIG_SMP) && defined(CONFIG_PPC64)
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index b6f3ae03ca9e..cd52c1baa3bc 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -622,8 +622,10 @@ static void __init early_init_memory_block_size(void)
of_scan_flat_dt(probe_memory_block_size, &memory_block_size);
}
-void __init mmu_early_init_devtree(void)
+
+void __init mmu_early_type_finalize(void)
{
+
bool hvmode = !!(mfmsr() & MSR_HV);
/* Disable radix mode based on kernel command line. */
@@ -634,6 +636,20 @@ void __init mmu_early_init_devtree(void)
pr_warn("WARNING: Ignoring cmdline option disable_radix\n");
}
+ /*
+ * Check /chosen/ibm,architecture-vec-5 if running as a guest.
+ * When running bare-metal, we can use radix if we like
+ * even though the ibm,architecture-vec-5 property created by
+ * skiboot doesn't have the necessary bits set.
+ */
+ if (!hvmode)
+ early_check_vec5();
+}
+
+void __init mmu_early_init_devtree(void)
+{
+ bool hvmode = !!(mfmsr() & MSR_HV);
+
of_scan_flat_dt(dt_scan_mmu_pid_width, NULL);
if (hvmode && !mmu_lpid_bits) {
if (early_cpu_has_feature(CPU_FTR_ARCH_207S))
@@ -646,15 +662,6 @@ void __init mmu_early_init_devtree(void)
mmu_pid_bits = 20; /* POWER9-10 */
}
- /*
- * Check /chosen/ibm,architecture-vec-5 if running as a guest.
- * When running bare-metal, we can use radix if we like
- * even though the ibm,architecture-vec-5 property created by
- * skiboot doesn't have the necessary bits set.
- */
- if (!hvmode)
- early_check_vec5();
-
early_init_memory_block_size();
if (early_radix_enabled()) {
--
2.51.0
* [PATCH 2/4] powerpc: move to 64-bit RTAS
2025-10-27 15:13 [PATCH 0/4] powerpc/kdump: Support high crashkernel reservation Sourabh Jain
2025-10-27 15:13 ` [PATCH 1/4] powerpc/mmu: do MMU type discovery before " Sourabh Jain
@ 2025-10-27 15:13 ` Sourabh Jain
2025-10-29 12:52 ` Sourabh Jain
2025-10-27 15:13 ` [PATCH 3/4] powerpc/kdump: consider high crashkernel memory if enabled Sourabh Jain
` (2 subsequent siblings)
4 siblings, 1 reply; 8+ messages in thread
From: Sourabh Jain @ 2025-10-27 15:13 UTC (permalink / raw)
To: linuxppc-dev
Cc: Sourabh Jain, Baoquan he, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Ritesh Harjani (IBM),
Shivang Upadhyay
Kdump kernels loaded at high addresses (above 4G) could not boot
because the kernel used 32-bit RTAS.
Until now, the kernel always used 32-bit RTAS, even for 64-bit kernels.
Before making an RTAS call, it clears the SF bit in the MSR and sets LR to
rtas_return_loc (in rtas_entry.S) as the return address.
For kdump kernels loaded above 4G, RTAS cannot jump back to this LR
correctly and instead jumps to a 32-bit truncated address. This usually
causes an exception, which leads to a kernel panic.
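The truncation can be illustrated with a small user-space sketch (the
address below is made up):
#include <stdio.h>
#include <stdint.h>
int main(void)
{
	/* Hypothetical return address of a kdump kernel loaded above 4G. */
	uint64_t lr = 0x0000000212345678ULL;
	/* With MSR[SF] = 0 the CPU runs in 32-bit mode, so only the low
	 * 32 bits of the return address take effect. */
	uint64_t effective = lr & 0xffffffffULL;
	printf("LR            = 0x%016llx\n", (unsigned long long)lr);
	printf("RTAS jumps to = 0x%016llx\n", (unsigned long long)effective);
	return 0;
}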
To fix this, the kernel initializes 64-bit RTAS and sets the SF bit in
the MSR register before each RTAS call, ensuring that RTAS jumps back
correctly if the LR address is higher than 4G. This allows kdump kernels
at high addresses to boot properly.
If 64-bit RTAS initialization fails or is not supported (e.g., in QEMU),
the kernel falls back to 32-bit RTAS. In this case, high-address kdump
kernels will not be allowed (handled in upcoming patches), and RTAS
calls will keep the SF bit off.
Changes made to achieve this:
- Initialize 64-bit RTAS in prom_init and add a new FDT property
linux,rtas-64
- Kernel reads linux,rtas-64 and sets a global variable rtas_64 to
indicate whether RTAS is 64-bit or 32-bit
- Prepare MSR register for RTAS calls based on whether RTAS is 32-bit
or 64-bit
Cc: Baoquan he <bhe@redhat.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
arch/powerpc/include/asm/rtas.h | 2 ++
arch/powerpc/kernel/prom_init.c | 26 ++++++++++++++++++++++----
arch/powerpc/kernel/rtas.c | 5 +++++
arch/powerpc/kernel/rtas_entry.S | 17 ++++++++++++++++-
4 files changed, 45 insertions(+), 5 deletions(-)
diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index d046bbd5017d..aaa4c3bc1d61 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -10,6 +10,8 @@
#include <linux/time.h>
#include <linux/cpumask.h>
+extern int rtas_64;
+
/*
* Definitions for talking to the RTAS on CHRP machines.
*
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index 827c958677f8..ab85b8bb8d4f 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -1841,6 +1841,7 @@ static void __init prom_instantiate_rtas(void)
u32 base, entry = 0;
__be32 val;
u32 size = 0;
+ u32 rtas_64 = 1;
prom_debug("prom_instantiate_rtas: start...\n");
@@ -1867,12 +1868,25 @@ static void __init prom_instantiate_rtas(void)
prom_printf("instantiating rtas at 0x%x...", base);
+ /*
+ * First, try to instantiate 64-bit RTAS. If that fails, fall back
+ * to 32-bit. Although 64-bit RTAS support has been available on
+ * real machines for some time, QEMU still lacks this support.
+ */
if (call_prom_ret("call-method", 3, 2, &entry,
- ADDR("instantiate-rtas"),
+ ADDR("instantiate-rtas-64"),
rtas_inst, base) != 0
- || entry == 0) {
- prom_printf(" failed\n");
- return;
+ || entry == 0) {
+
+ rtas_64 = 0;
+ if (call_prom_ret("call-method", 3, 2, &entry,
+ ADDR("instantiate-rtas"),
+ rtas_inst, base) != 0
+ || entry == 0) {
+
+ prom_printf(" failed\n");
+ return;
+ }
}
prom_printf(" done\n");
@@ -1884,6 +1898,9 @@ static void __init prom_instantiate_rtas(void)
val = cpu_to_be32(entry);
prom_setprop(rtas_node, "/rtas", "linux,rtas-entry",
&val, sizeof(val));
+ val = cpu_to_be32(rtas_64);
+ prom_setprop(rtas_node, "/rtas", "linux,rtas-64",
+ &val, sizeof(val));
/* Check if it supports "query-cpu-stopped-state" */
if (prom_getprop(rtas_node, "query-cpu-stopped-state",
@@ -1893,6 +1910,7 @@ static void __init prom_instantiate_rtas(void)
prom_debug("rtas base = 0x%x\n", base);
prom_debug("rtas entry = 0x%x\n", entry);
prom_debug("rtas size = 0x%x\n", size);
+ prom_debug("rtas 64-bit = 0x%x\n", rtas_64);
prom_debug("prom_instantiate_rtas: end...\n");
}
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 8d81c1e7a8db..723806468984 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -45,6 +45,8 @@
#include <asm/trace.h>
#include <asm/udbg.h>
+int rtas_64 = 1;
+
struct rtas_filter {
/* Indexes into the args buffer, -1 if not used */
const int buf_idx1;
@@ -2087,6 +2089,9 @@ int __init early_init_dt_scan_rtas(unsigned long node,
entryp = of_get_flat_dt_prop(node, "linux,rtas-entry", NULL);
sizep = of_get_flat_dt_prop(node, "rtas-size", NULL);
+ if (!of_get_flat_dt_prop(node, "linux,rtas-64", NULL))
+ rtas_64 = 0;
+
#ifdef CONFIG_PPC64
/* need this feature to decide the crashkernel offset */
if (of_get_flat_dt_prop(node, "ibm,hypertas-functions", NULL))
diff --git a/arch/powerpc/kernel/rtas_entry.S b/arch/powerpc/kernel/rtas_entry.S
index 6ce95ddadbcd..df776f0103c9 100644
--- a/arch/powerpc/kernel/rtas_entry.S
+++ b/arch/powerpc/kernel/rtas_entry.S
@@ -54,6 +54,10 @@ _ASM_NOKPROBE_SYMBOL(enter_rtas)
/*
* 32-bit rtas on 64-bit machines has the additional problem that RTAS may
* not preserve the upper parts of registers it uses.
+ *
+ * Note: In 64-bit RTAS, the SF bit is set so that RTAS can return
+ * correctly if the return address is above 4 GB. Everything else
+ * works the same as in 32-bit RTAS.
*/
_GLOBAL(enter_rtas)
mflr r0
@@ -113,7 +117,18 @@ __enter_rtas:
* from the saved MSR value and insert into the value RTAS will use.
*/
extrdi r0, r6, 1, 63 - MSR_HV_LG
- LOAD_REG_IMMEDIATE(r6, MSR_ME | MSR_RI)
+
+ LOAD_REG_ADDR(r7, rtas_64) /* Load the address rtas_64 into r7 */
+ ld r8, 0(r7) /* Load the value of rtas_64 from memory into r8 */
+ cmpdi r8, 0 /* Compare r8 with 0 (check if rtas_64 is zero) */
+ beq no_sf_bit /* Branch to no_sf_bit if rtas_64 is zero */
+ LOAD_REG_IMMEDIATE(r6, MSR_ME | MSR_RI | MSR_SF) /* r6 = ME|RI|SF */
+ b continue
+
+no_sf_bit:
+ LOAD_REG_IMMEDIATE(r6, MSR_ME | MSR_RI) /* r6 = ME|RI (NO SF bit in MSR) */
+
+continue:
insrdi r6, r0, 1, 63 - MSR_HV_LG
li r0,0
--
2.51.0
* [PATCH 3/4] powerpc/kdump: consider high crashkernel memory if enabled
2025-10-27 15:13 [PATCH 0/4] powerpc/kdump: Support high crashkernel reservation Sourabh Jain
2025-10-27 15:13 ` [PATCH 1/4] powerpc/mmu: do MMU type discovery before " Sourabh Jain
2025-10-27 15:13 ` [PATCH 2/4] powerpc: move to 64-bit RTAS Sourabh Jain
@ 2025-10-27 15:13 ` Sourabh Jain
2025-10-27 15:13 ` [PATCH 4/4] powerpc/kdump: add support for high crashkernel reservation Sourabh Jain
2025-10-28 6:23 ` [PATCH 0/4] powerpc/kdump: Support " Baoquan he
4 siblings, 0 replies; 8+ messages in thread
From: Sourabh Jain @ 2025-10-27 15:13 UTC (permalink / raw)
To: linuxppc-dev
Cc: Sourabh Jain, Baoquan he, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Ritesh Harjani (IBM),
Shivang Upadhyay
The next patch adds high crashkernel reservation support on powerpc, so
the kdump setup is updated to handle a high crashkernel reservation while
loading the kdump kernel.
With high crashkernel reservation, the crashkernel is split into two
regions: low crashkernel and high crashkernel. To ensure kdump loads
properly with the split reservation, the following changes are made:
- Load the kdump image in high memory if enabled
- Include both low and high crashkernel regions in usable memory
ranges for the kdump kernel
- Exclude both low and high crashkernel regions from the crash memory
ranges to prevent them from being exported through /proc/vmcore
Cc: Baoquan he <bhe@redhat.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
arch/powerpc/kexec/elf_64.c | 10 +++++++---
arch/powerpc/kexec/file_load_64.c | 5 +++--
arch/powerpc/kexec/ranges.c | 24 +++++++++++++++++++++---
3 files changed, 31 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/kexec/elf_64.c b/arch/powerpc/kexec/elf_64.c
index 5d6d616404cf..ab84ff6d3685 100644
--- a/arch/powerpc/kexec/elf_64.c
+++ b/arch/powerpc/kexec/elf_64.c
@@ -52,9 +52,13 @@ static void *elf64_load(struct kimage *image, char *kernel_buf,
if (IS_ENABLED(CONFIG_CRASH_DUMP) && image->type == KEXEC_TYPE_CRASH) {
/* min & max buffer values for kdump case */
kbuf.buf_min = pbuf.buf_min = crashk_res.start;
- kbuf.buf_max = pbuf.buf_max =
- ((crashk_res.end < ppc64_rma_size) ?
- crashk_res.end : (ppc64_rma_size - 1));
+
+ if (crashk_low_res.end)
+ kbuf.buf_max = pbuf.buf_max = crashk_res.end;
+ else
+ kbuf.buf_max = pbuf.buf_max =
+ ((crashk_res.end < ppc64_rma_size) ?
+ crashk_res.end : (ppc64_rma_size - 1));
}
ret = kexec_elf_load(image, &ehdr, &elf_info, &kbuf, &kernel_load_addr);
diff --git a/arch/powerpc/kexec/file_load_64.c b/arch/powerpc/kexec/file_load_64.c
index e7ef8b2a2554..d45f5748e75c 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -746,6 +746,7 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct crash_mem
int i, nr_ranges, ret;
#ifdef CONFIG_CRASH_DUMP
+ uint64_t crashk_start;
/*
* Restrict memory usage for kdump kernel by setting up
* usable memory ranges and memory reserve map.
@@ -765,8 +766,8 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct crash_mem
* Ensure we don't touch crashed kernel's memory except the
* first 64K of RAM, which will be backed up.
*/
- ret = fdt_add_mem_rsv(fdt, BACKUP_SRC_END + 1,
- crashk_res.start - BACKUP_SRC_SIZE);
+ crashk_start = crashk_low_res.end ? crashk_low_res.start : crashk_res.start;
+ ret = fdt_add_mem_rsv(fdt, BACKUP_SRC_END + 1, crashk_start - BACKUP_SRC_SIZE);
if (ret) {
pr_err("Error reserving crash memory: %s\n",
fdt_strerror(ret));
diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
index c61aa96f0942..53e52e1f07c8 100644
--- a/arch/powerpc/kexec/ranges.c
+++ b/arch/powerpc/kexec/ranges.c
@@ -524,9 +524,20 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
* Also, crashed kernel's memory must be added to reserve map to
* avoid kdump kernel from using it.
*/
- ret = add_mem_range(mem_ranges, 0, crashk_res.end + 1);
- if (ret)
- goto out;
+ if (crashk_low_res.end) {
+ ret = add_mem_range(mem_ranges, 0, crashk_low_res.end + 1);
+ if (ret)
+ goto out;
+
+ ret = add_mem_range(mem_ranges, crashk_res.start,
+ crashk_res.end - crashk_res.start + 1);
+ if (ret)
+ goto out;
+ } else {
+ ret = add_mem_range(mem_ranges, 0, crashk_res.end + 1);
+ if (ret)
+ goto out;
+ }
for (i = 0; i < crashk_cma_cnt; i++) {
ret = add_mem_range(mem_ranges, crashk_cma_ranges[i].start,
@@ -610,6 +621,13 @@ int get_crash_memory_ranges(struct crash_mem **mem_ranges)
if (ret)
goto out;
+ if (crashk_low_res.end) {
+ ret = crash_exclude_mem_range_guarded(mem_ranges, crashk_low_res.start,
+ crashk_low_res.end);
+ if (ret)
+ goto out;
+ }
+
for (i = 0; i < crashk_cma_cnt; ++i) {
ret = crash_exclude_mem_range_guarded(mem_ranges, crashk_cma_ranges[i].start,
crashk_cma_ranges[i].end);
--
2.51.0
* [PATCH 4/4] powerpc/kdump: add support for high crashkernel reservation
2025-10-27 15:13 [PATCH 0/4] powerpc/kdump: Support high crashkernel reservation Sourabh Jain
` (2 preceding siblings ...)
2025-10-27 15:13 ` [PATCH 3/4] powerpc/kdump: consider high crashkernel memory if enabled Sourabh Jain
@ 2025-10-27 15:13 ` Sourabh Jain
2025-10-28 6:23 ` [PATCH 0/4] powerpc/kdump: Support " Baoquan he
4 siblings, 0 replies; 8+ messages in thread
From: Sourabh Jain @ 2025-10-27 15:13 UTC (permalink / raw)
To: linuxppc-dev
Cc: Sourabh Jain, Baoquan he, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Ritesh Harjani (IBM),
Shivang Upadhyay
With these changes, crashkernel=xxM,high is supported on powerpc. This
allows users to reserve crashkernel memory in higher memory regions while
keeping the low memory reservation minimal.
The low memory reservation defaults to 64 MB and is placed below
RTAS_INSTANTIATE_MAX (1 GB) so that RTAS instantiation works properly.
powerpc uses the generic crashkernel parsing and reservation functions,
which can already handle high crashkernel reservations.
arch_reserve_crashkernel() is therefore updated to call them with the
appropriate options so that crashkernel=xxM,high is parsed and crashkernel
memory is reserved in higher memory regions.
Note: High crashkernel is supported only on PPC64 systems when 64-bit
RTAS is instantiated and the Radix MMU is enabled; otherwise, the
crashkernel reservation falls back to the default behaviour, even if the
kernel command line includes crashkernel=xxM,high.
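In condensed form, the flow described above looks roughly like the sketch
below (a simplified paraphrase of the change to arch_reserve_crashkernel()
in the diff that follows; it reuses the kernel's existing globals and
helpers, and omits the memory_limit handling and the kernel-overlap check):
void __init arch_reserve_crashkernel(void)
{
	unsigned long long crash_size, crash_base, low_size = 0;
	bool high = false;
	/* The generic parser understands crashkernel=xxM,high / =yyM,low. */
	if (parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
			      &crash_size, &crash_base, &low_size, &cma_size,
			      &high))
		return;
	/* Fall back to a regular low reservation when 64-bit RTAS or the
	 * Radix MMU is not available. */
	if (high && !high_crashkernel_supported()) {
		high = false;
		low_size = 0;
	}
	if (high)
		crash_base = 0;	/* let the generic code pick a high base */
	else
		crash_base = get_crash_base(crash_base);
	reserve_crashkernel_generic(crash_size, crash_base, low_size, high);
}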
Cc: Baoquan he <bhe@redhat.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
arch/powerpc/include/asm/crash_reserve.h | 8 +++++
arch/powerpc/include/asm/kexec.h | 1 +
arch/powerpc/include/asm/rtas.h | 9 +++++
arch/powerpc/kexec/core.c | 45 +++++++++++++++++-------
4 files changed, 51 insertions(+), 12 deletions(-)
diff --git a/arch/powerpc/include/asm/crash_reserve.h b/arch/powerpc/include/asm/crash_reserve.h
index 6467ce29b1fa..d96d7726104a 100644
--- a/arch/powerpc/include/asm/crash_reserve.h
+++ b/arch/powerpc/include/asm/crash_reserve.h
@@ -2,7 +2,15 @@
#ifndef _ASM_POWERPC_CRASH_RESERVE_H
#define _ASM_POWERPC_CRASH_RESERVE_H
+#include <asm/rtas.h>
+
/* crash kernel regions are Page size agliged */
#define CRASH_ALIGN PAGE_SIZE
+#define DEFAULT_CRASH_KERNEL_LOW_SIZE SZ_64M
+
+#define CRASH_ADDR_LOW_MAX RTAS_INSTANTIATE_MAX
+#define CRASH_ADDR_HIGH_MAX memblock_end_of_DRAM()
+
+
#endif /* _ASM_POWERPC_CRASH_RESERVE_H */
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index bd4a6c42a5f3..080fef2344b4 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -116,6 +116,7 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct crash_mem
int __init overlaps_crashkernel(unsigned long start, unsigned long size);
extern void arch_reserve_crashkernel(void);
extern void kdump_cma_reserve(void);
+unsigned long long __init get_crash_base(unsigned long long crash_base);
#else
static inline void arch_reserve_crashkernel(void) {}
static inline int overlaps_crashkernel(unsigned long start, unsigned long size) { return 0; }
diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
index aaa4c3bc1d61..d290437d8131 100644
--- a/arch/powerpc/include/asm/rtas.h
+++ b/arch/powerpc/include/asm/rtas.h
@@ -561,6 +561,14 @@ static inline int page_is_rtas_user_buf(unsigned long pfn)
return 0;
}
+static inline bool is_rtas_high_crashkernel_capable(void)
+{
+ if (rtas_64)
+ return true;
+
+ return false;
+}
+
/* Not the best place to put pSeries_coalesce_init, will be fixed when we
* move some of the rtas suspend-me stuff to pseries */
void pSeries_coalesce_init(void);
@@ -569,6 +577,7 @@ void rtas_initialize(void);
static inline int page_is_rtas_user_buf(unsigned long pfn) { return 0;}
static inline void pSeries_coalesce_init(void) { }
static inline void rtas_initialize(void) { }
+static inline bool is_rtas_high_crashkernel_capable(void) { return true; }
#endif
#ifdef CONFIG_HV_PERF_CTRS
diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
index 25744737eff5..09b7518bba36 100644
--- a/arch/powerpc/kexec/core.c
+++ b/arch/powerpc/kexec/core.c
@@ -15,6 +15,7 @@
#include <linux/irq.h>
#include <linux/ftrace.h>
+#include <asm/rtas.h>
#include <asm/kdump.h>
#include <asm/machdep.h>
#include <asm/pgalloc.h>
@@ -61,7 +62,7 @@ void machine_kexec(struct kimage *image)
#ifdef CONFIG_CRASH_RESERVE
-static unsigned long long __init get_crash_base(unsigned long long crash_base)
+unsigned long long __init get_crash_base(unsigned long long crash_base)
{
#ifndef CONFIG_NONSTATIC_KERNEL
@@ -101,35 +102,55 @@ static unsigned long long __init get_crash_base(unsigned long long crash_base)
#endif
}
+static bool high_crashkernel_supported(void)
+{
+#if defined(CONFIG_PPC64) && (defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_POWERNV))
+ if (early_radix_enabled() && is_rtas_high_crashkernel_capable())
+ return true;
+#endif
+ return false;
+}
+
void __init arch_reserve_crashkernel(void)
{
- unsigned long long crash_size, crash_base, crash_end;
+ unsigned long long crash_size, crash_base, crash_end, low_size = 0;
unsigned long long kernel_start, kernel_size;
unsigned long long total_mem_sz;
+ bool high = false;
int ret;
total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
/* use common parsing */
ret = parse_crashkernel(boot_command_line, total_mem_sz, &crash_size,
- &crash_base, NULL, &cma_size, NULL);
+ &crash_base, &low_size, &cma_size, &high);
if (ret)
return;
- crash_base = get_crash_base(crash_base);
- crash_end = crash_base + crash_size - 1;
+ if (high && !high_crashkernel_supported()) {
+ high = false;
+ low_size = 0;
+ pr_warn("High crashkernel unsupported, using standard reservation");
+ }
- kernel_start = __pa(_stext);
- kernel_size = _end - _stext;
+ if (high) {
+ crash_base = 0;
+ } else {
+ crash_base = get_crash_base(crash_base);
+ crash_end = crash_base + crash_size - 1;
- /* The crash region must not overlap the current kernel */
- if ((kernel_start + kernel_size > crash_base) && (kernel_start <= crash_end)) {
- pr_warn("Crash kernel can not overlap current kernel\n");
- return;
+ kernel_start = __pa(_stext);
+ kernel_size = _end - _stext;
+
+ /* The crash region must not overlap the current kernel */
+ if ((kernel_start + kernel_size > crash_base) && (kernel_start <= crash_end)) {
+ pr_warn("Crash kernel can not overlap current kernel\n");
+ return;
+ }
}
- reserve_crashkernel_generic(crash_size, crash_base, 0, false);
+ reserve_crashkernel_generic(crash_size, crash_base, low_size, high);
}
void __init kdump_cma_reserve(void)
--
2.51.0
* Re: [PATCH 0/4] powerpc/kdump: Support high crashkernel reservation
2025-10-27 15:13 [PATCH 0/4] powerpc/kdump: Support high crashkernel reservation Sourabh Jain
` (3 preceding siblings ...)
2025-10-27 15:13 ` [PATCH 4/4] powerpc/kdump: add support for high crashkernel reservation Sourabh Jain
@ 2025-10-28 6:23 ` Baoquan he
4 siblings, 0 replies; 8+ messages in thread
From: Baoquan he @ 2025-10-28 6:23 UTC (permalink / raw)
To: Sourabh Jain, kexec
Cc: linuxppc-dev, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Ritesh Harjani (IBM),
Shivang Upadhyay
Cc kexec mailing list.
On 10/27/25 at 08:43pm, Sourabh Jain wrote:
> Add support for reserving crashkernel memory in higher address ranges
> using the crashkernel=xxM,high command-line option.
>
> With this feature, most of the crashkernel memory for kdump will be
> reserved in high memory regions, while only a small portion (64 MB) will
> be reserved in low memory for the kdump kernel. This helps free up low
> memory for other components that require allocations in that region.
>
> For example, if crashkernel=2G,high is specified, the kernel will reserve
> 2 GB of crashkernel memory near the end of system RAM and an additional
> 64 MB of low memory (below 1 GB) for RTAS to function properly.
>
> Currently, this feature is supported only on PPC64 systems with 64-bit
> RTAS instantiation and Radix MMU enabled.
>
> Two critical changes were made to support this feature:
>
> - CPU feature discovery is now performed before crashkernel
> reservation. This ensures the MMU type is determined before reserving
> crashkernel memory. (Patch 01/04)
>
> - RTAS instantiation has been moved to 64-bit mode. (Patch 02/04)
>
> Apply the following patch first, and then apply this patch series:
> https://lore.kernel.org/all/20251024170118.297472-1-sourabhjain@linux.ibm.com/
>
> Cc: Baoquan he <bhe@redhat.com>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
>
> Sourabh Jain (4):
> powerpc/mmu: do MMU type discovery before crashkernel reservation
> powerpc: move to 64-bit RTAS
> powerpc/kdump: consider high crashkernel memory if enabled
> powerpc/kdump: add support for high crashkernel reservation
>
> arch/powerpc/include/asm/book3s/64/mmu.h | 1 +
> arch/powerpc/include/asm/crash_reserve.h | 8 +++++
> arch/powerpc/include/asm/kexec.h | 1 +
> arch/powerpc/include/asm/mmu.h | 1 +
> arch/powerpc/include/asm/rtas.h | 11 ++++++
> arch/powerpc/kernel/prom.c | 28 ++++++++-------
> arch/powerpc/kernel/prom_init.c | 26 +++++++++++---
> arch/powerpc/kernel/rtas.c | 5 +++
> arch/powerpc/kernel/rtas_entry.S | 17 ++++++++-
> arch/powerpc/kexec/core.c | 45 +++++++++++++++++-------
> arch/powerpc/kexec/elf_64.c | 10 ++++--
> arch/powerpc/kexec/file_load_64.c | 5 +--
> arch/powerpc/kexec/ranges.c | 24 +++++++++++--
> arch/powerpc/mm/init_64.c | 27 ++++++++------
> 14 files changed, 161 insertions(+), 48 deletions(-)
>
> --
> 2.51.0
>
* Re: [PATCH 2/4] powerpc: move to 64-bit RTAS
2025-10-27 15:13 ` [PATCH 2/4] powerpc: move to 64-bit RTAS Sourabh Jain
@ 2025-10-29 12:52 ` Sourabh Jain
0 siblings, 0 replies; 8+ messages in thread
From: Sourabh Jain @ 2025-10-29 12:52 UTC (permalink / raw)
To: linuxppc-dev
Cc: Baoquan he, Hari Bathini, Madhavan Srinivasan, Mahesh Salgaonkar,
Michael Ellerman, Ritesh Harjani (IBM), Shivang Upadhyay
On 27/10/25 20:43, Sourabh Jain wrote:
> Kdump kernels loaded at high addresses (above 4G) could not boot
> because the kernel used 32-bit RTAS.
>
> Until now, the kernel always used 32-bit RTAS, even for 64-bit kernels.
> Before making an RTAS call, it clears the SF bit in the MSR and sets LR to
> rtas_return_loc (in rtas_entry.S) as the return address.
> For kdump kernels loaded above 4G, RTAS cannot jump back to this LR
> correctly and instead jumps to a 32-bit truncated address. This usually
> causes an exception, which leads to a kernel panic.
>
> To fix this, the kernel initializes 64-bit RTAS and sets the SF bit in
> the MSR register before each RTAS call, ensuring that RTAS jumps back
> correctly if the LR address is higher than 4G. This allows kdump kernels
> at high addresses to boot properly.
>
> If 64-bit RTAS initialization fails or is not supported (e.g., in QEMU),
> the kernel falls back to 32-bit RTAS. In this case, high-address kdump
> kernels will not be allowed (handled in upcoming patches), and RTAS
> calls will keep the SF bit off.
>
> Changes made to achieve this:
> - Initialize 64-bit RTAS in prom_init and add a new FDT property
> linux,rtas-64
I just realized that when RTAS is instantiated using
instantiate-rtas-64, the kernel
must not only set the SF bit in the MSR but also make sure that each cell in
the RTAS Argument Call Buffer is a 64-bit, sign-extended value aligned to an
8-byte boundary. This is currently not handled correctly.
Please review the other patches; I’ll fix this issue in the next version.
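For context, here is a rough sketch of the difference I mean (hypothetical
layouts, not the kernel's actual struct rtas_args definition):
#include <stdint.h>
/* 32-bit convention: each cell in the argument buffer is a 4-byte
 * big-endian word (this mirrors the shape of struct rtas_args). */
struct rtas_args_32 {
	uint32_t token;
	uint32_t nargs;
	uint32_t nret;
	uint32_t args[16];
};
/* Hypothetical 64-bit convention per the note above: each cell becomes an
 * 8-byte, sign-extended value aligned to an 8-byte boundary. */
struct rtas_args_64 {
	uint64_t token;
	uint64_t nargs;
	uint64_t nret;
	int64_t  args[16];
};
/* Sign-extend a 32-bit cell when widening it into the 64-bit buffer. */
static inline int64_t rtas_cell_sext(uint32_t cell)
{
	return (int64_t)(int32_t)cell;
}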
> - Kernel reads linux,rtas-64 and sets a global variable rtas_64 to
> indicate whether RTAS is 64-bit or 32-bit
> - Prepare MSR register for RTAS calls based on whether RTAS is 32-bit
> or 64-bit
>
> Cc: Baoquan he <bhe@redhat.com>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> ---
> arch/powerpc/include/asm/rtas.h | 2 ++
> arch/powerpc/kernel/prom_init.c | 26 ++++++++++++++++++++++----
> arch/powerpc/kernel/rtas.c | 5 +++++
> arch/powerpc/kernel/rtas_entry.S | 17 ++++++++++++++++-
> 4 files changed, 45 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h
> index d046bbd5017d..aaa4c3bc1d61 100644
> --- a/arch/powerpc/include/asm/rtas.h
> +++ b/arch/powerpc/include/asm/rtas.h
> @@ -10,6 +10,8 @@
> #include <linux/time.h>
> #include <linux/cpumask.h>
>
> +extern int rtas_64;
> +
> /*
> * Definitions for talking to the RTAS on CHRP machines.
> *
> diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
> index 827c958677f8..ab85b8bb8d4f 100644
> --- a/arch/powerpc/kernel/prom_init.c
> +++ b/arch/powerpc/kernel/prom_init.c
> @@ -1841,6 +1841,7 @@ static void __init prom_instantiate_rtas(void)
> u32 base, entry = 0;
> __be32 val;
> u32 size = 0;
> + u32 rtas_64 = 1;
>
> prom_debug("prom_instantiate_rtas: start...\n");
>
> @@ -1867,12 +1868,25 @@ static void __init prom_instantiate_rtas(void)
>
> prom_printf("instantiating rtas at 0x%x...", base);
>
> + /*
> + * First, try to instantiate 64-bit RTAS. If that fails, fall back
> + * to 32-bit. Although 64-bit RTAS support has been available on
> + * real machines for some time, QEMU still lacks this support.
> + */
> if (call_prom_ret("call-method", 3, 2, &entry,
> - ADDR("instantiate-rtas"),
> + ADDR("instantiate-rtas-64"),
> rtas_inst, base) != 0
> - || entry == 0) {
> - prom_printf(" failed\n");
> - return;
> + || entry == 0) {
> +
> + rtas_64 = 0;
> + if (call_prom_ret("call-method", 3, 2, &entry,
> + ADDR("instantiate-rtas"),
> + rtas_inst, base) != 0
> + || entry == 0) {
> +
> + prom_printf(" failed\n");
> + return;
> + }
> }
> prom_printf(" done\n");
>
> @@ -1884,6 +1898,9 @@ static void __init prom_instantiate_rtas(void)
> val = cpu_to_be32(entry);
> prom_setprop(rtas_node, "/rtas", "linux,rtas-entry",
> &val, sizeof(val));
> + val = cpu_to_be32(rtas_64);
> + prom_setprop(rtas_node, "/rtas", "linux,rtas-64",
> + &val, sizeof(val));
>
> /* Check if it supports "query-cpu-stopped-state" */
> if (prom_getprop(rtas_node, "query-cpu-stopped-state",
> @@ -1893,6 +1910,7 @@ static void __init prom_instantiate_rtas(void)
> prom_debug("rtas base = 0x%x\n", base);
> prom_debug("rtas entry = 0x%x\n", entry);
> prom_debug("rtas size = 0x%x\n", size);
> + prom_debug("rtas 64-bit = 0x%x\n", rtas_64);
>
> prom_debug("prom_instantiate_rtas: end...\n");
> }
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 8d81c1e7a8db..723806468984 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -45,6 +45,8 @@
> #include <asm/trace.h>
> #include <asm/udbg.h>
>
> +int rtas_64 = 1;
> +
> struct rtas_filter {
> /* Indexes into the args buffer, -1 if not used */
> const int buf_idx1;
> @@ -2087,6 +2089,9 @@ int __init early_init_dt_scan_rtas(unsigned long node,
> entryp = of_get_flat_dt_prop(node, "linux,rtas-entry", NULL);
> sizep = of_get_flat_dt_prop(node, "rtas-size", NULL);
>
> + if (!of_get_flat_dt_prop(node, "linux,rtas-64", NULL))
> + rtas_64 = 0;
> +
> #ifdef CONFIG_PPC64
> /* need this feature to decide the crashkernel offset */
> if (of_get_flat_dt_prop(node, "ibm,hypertas-functions", NULL))
> diff --git a/arch/powerpc/kernel/rtas_entry.S b/arch/powerpc/kernel/rtas_entry.S
> index 6ce95ddadbcd..df776f0103c9 100644
> --- a/arch/powerpc/kernel/rtas_entry.S
> +++ b/arch/powerpc/kernel/rtas_entry.S
> @@ -54,6 +54,10 @@ _ASM_NOKPROBE_SYMBOL(enter_rtas)
> /*
> * 32-bit rtas on 64-bit machines has the additional problem that RTAS may
> * not preserve the upper parts of registers it uses.
> + *
> + * Note: In 64-bit RTAS, the SF bit is set so that RTAS can return
> + * correctly if the return address is above 4 GB. Everything else
> + * works the same as in 32-bit RTAS.
> */
> _GLOBAL(enter_rtas)
> mflr r0
> @@ -113,7 +117,18 @@ __enter_rtas:
> * from the saved MSR value and insert into the value RTAS will use.
> */
> extrdi r0, r6, 1, 63 - MSR_HV_LG
> - LOAD_REG_IMMEDIATE(r6, MSR_ME | MSR_RI)
> +
> + LOAD_REG_ADDR(r7, rtas_64) /* Load the address rtas_64 into r7 */
> + ld r8, 0(r7) /* Load the value of rtas_64 from memory into r8 */
> + cmpdi r8, 0 /* Compare r8 with 0 (check if rtas_64 is zero) */
> + beq no_sf_bit /* Branch to no_sf_bit if rtas_64 is zero */
> + LOAD_REG_IMMEDIATE(r6, MSR_ME | MSR_RI | MSR_SF) /* r6 = ME|RI|SF */
> + b continue
> +
> +no_sf_bit:
> + LOAD_REG_IMMEDIATE(r6, MSR_ME | MSR_RI) /* r6 = ME|RI (NO SF bit in MSR) */
> +
> +continue:
> insrdi r6, r0, 1, 63 - MSR_HV_LG
>
> li r0,0
* Re: [PATCH 1/4] powerpc/mmu: do MMU type discovery before crashkernel reservation
2025-10-27 15:13 ` [PATCH 1/4] powerpc/mmu: do MMU type discovery before " Sourabh Jain
@ 2025-10-31 4:53 ` Ritesh Harjani
0 siblings, 0 replies; 8+ messages in thread
From: Ritesh Harjani @ 2025-10-31 4:53 UTC (permalink / raw)
To: Sourabh Jain, linuxppc-dev
Cc: Sourabh Jain, Baoquan he, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Shivang Upadhyay
Sourabh Jain <sourabhjain@linux.ibm.com> writes:
> Crashkernel reservation on high memory depends on the MMU type, so
> finalize the MMU type before calling arch_reserve_crashkernel().
>
> With the changes introduced here, early_radix_enabled() becomes usable
> and will be used in arch_reserve_crashkernel() in the upcoming patch.
>
> early_radix_enabled() depends on cur_cpu_spec->mmu_features to find
> out if the radix MMU is enabled. The radix MMU bit in mmu_features is
> discovered from the FDT and kernel configs. To make sure the MMU type is
> finalized before arch_reserve_crashkernel() is called, the function that
> scans the FDT and sets mmu_features, along with some bits from
> mmu_early_type_finalize(), has been moved above
> arch_reserve_crashkernel().
>
That is correct. mmu_features may as well get set from this path too...
early_init_dt_scan_cpus() ->
if (!dt_cpu_ftrs_in_use())
-> check_cpu_features(node, "ibm_pa_features",...
cur_cpu_spec->mmu_features |= fp->mmu_features
...which I guess is controlled using CONFIG_PPC_DT_CPU_FTRS.
so it makes sense to move those dependent paths above.
Overall the patch looks good to me. Added a few minor nits below.
> Cc: Baoquan he <bhe@redhat.com>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> ---
> arch/powerpc/include/asm/book3s/64/mmu.h | 1 +
> arch/powerpc/include/asm/mmu.h | 1 +
> arch/powerpc/kernel/prom.c | 28 +++++++++++++-----------
> arch/powerpc/mm/init_64.c | 27 ++++++++++++++---------
> 4 files changed, 34 insertions(+), 23 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
> index 48631365b48c..7a3b2ff02041 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu.h
> @@ -208,6 +208,7 @@ extern int mmu_vmemmap_psize;
>
> /* MMU initialization */
> void mmu_early_init_devtree(void);
> +void mmu_early_type_finalize(void);
Minor nit:
Can we please rename this function to - mmu_early_init_vec5()?
Your naming isn't wrong, but it's just known that after the vec5 call, we
finalize the early MMU init type. So keeping this function name as
"mmu_early_init_vec5()" makes slightly more sense to me.
And then the order of function declarations can also be kept like below -
/* MMU initialization */
+void mmu_early_init_vec5(void);
void mmu_early_init_devtree(void);
diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
+static inline void mmu_early_init_vec5(void) { }
static inline void mmu_early_init_devtree(void) { }
> void hash__early_init_devtree(void);
> void radix__early_init_devtree(void);
> #ifdef CONFIG_PPC_PKEY
> diff --git a/arch/powerpc/include/asm/mmu.h b/arch/powerpc/include/asm/mmu.h
> index 5f9c5d436e17..c40dc6349e55 100644
> --- a/arch/powerpc/include/asm/mmu.h
> +++ b/arch/powerpc/include/asm/mmu.h
> @@ -384,6 +384,7 @@ extern void early_init_mmu_secondary(void);
> extern void setup_initial_memory_limit(phys_addr_t first_memblock_base,
> phys_addr_t first_memblock_size);
> static inline void mmu_early_init_devtree(void) { }
> +static inline void mmu_early_type_finalize(void) { }
>
> static inline void pkey_early_init_devtree(void) {}
>
> diff --git a/arch/powerpc/kernel/prom.c b/arch/powerpc/kernel/prom.c
> index 9ed9dde7d231..db1615f26075 100644
> --- a/arch/powerpc/kernel/prom.c
> +++ b/arch/powerpc/kernel/prom.c
> @@ -853,6 +853,21 @@ void __init early_init_devtree(void *params)
> if (PHYSICAL_START > MEMORY_START)
> memblock_reserve(MEMORY_START, int_vector_size);
> reserve_kdump_trampoline();
> +
> + DBG("Scanning CPUs ...\n");
> +
> + dt_cpu_ftrs_scan();
> +
> + /* Retrieve CPU related informations from the flat tree
> + * (altivec support, boot CPU ID, ...)
> + */
> + of_scan_flat_dt(early_init_dt_scan_cpus, NULL);
> + if (boot_cpuid < 0) {
> + printk("Failed to identify boot CPU !\n");
> + BUG();
> + }
> +
> + mmu_early_type_finalize();
> #if defined(CONFIG_FA_DUMP) || defined(CONFIG_PRESERVE_FA_DUMP)
> /*
> * If we fail to reserve memory for firmware-assisted dump then
> @@ -884,19 +899,6 @@ void __init early_init_devtree(void *params)
> * FIXME .. and the initrd too? */
> move_device_tree();
>
> - DBG("Scanning CPUs ...\n");
> -
> - dt_cpu_ftrs_scan();
> -
> - /* Retrieve CPU related informations from the flat tree
> - * (altivec support, boot CPU ID, ...)
> - */
> - of_scan_flat_dt(early_init_dt_scan_cpus, NULL);
> - if (boot_cpuid < 0) {
> - printk("Failed to identify boot CPU !\n");
> - BUG();
> - }
> -
> save_fscr_to_task();
>
> #if defined(CONFIG_SMP) && defined(CONFIG_PPC64)
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index b6f3ae03ca9e..cd52c1baa3bc 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -622,8 +622,10 @@ static void __init early_init_memory_block_size(void)
> of_scan_flat_dt(probe_memory_block_size, &memory_block_size);
> }
>
> -void __init mmu_early_init_devtree(void)
Let's also add a comment here to be more explicit, since it has
caused confusion in the past.
/*
* mmu_early_init_vec5(): For non-hv mode (Pseries LPAR), whether we can do
* radix or not depends upon hypervisor vec5 values. This function checks
* ibm,architecture-vec-5 and updates cur_cpu_spec->mmu_features bits
* accordingly.
* After this function returns, early_radix_enabled() can be used
* to check if radix is supported.
*/
> +
> +void __init mmu_early_type_finalize(void)
> {
> +
> bool hvmode = !!(mfmsr() & MSR_HV);
>
> /* Disable radix mode based on kernel command line. */
> @@ -634,6 +636,20 @@ void __init mmu_early_init_devtree(void)
> pr_warn("WARNING: Ignoring cmdline option disable_radix\n");
> }
>
> + /*
> + * Check /chosen/ibm,architecture-vec-5 if running as a guest.
> + * When running bare-metal, we can use radix if we like
> + * even though the ibm,architecture-vec-5 property created by
> + * skiboot doesn't have the necessary bits set.
> + */
> + if (!hvmode)
> + early_check_vec5();
> +}
-ritesh