* [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation
@ 2025-11-03 4:37 Sourabh Jain
2025-11-03 10:10 ` Ritesh Harjani
0 siblings, 1 reply; 9+ messages in thread
From: Sourabh Jain @ 2025-11-03 4:37 UTC (permalink / raw)
To: linuxppc-dev
Cc: Sourabh Jain, Baoquan he, Jiri Bohac, Hari Bathini,
Madhavan Srinivasan, Mahesh Salgaonkar, Michael Ellerman,
Ritesh Harjani (IBM), Shivang Upadhyay, kexec
Commit 35c18f2933c5 ("Add a new optional ",cma" suffix to the
crashkernel= command line option") and commit ab475510e042 ("kdump:
implement reserve_crashkernel_cma") added CMA support for kdump
crashkernel reservation.
Extend crashkernel CMA reservation support to powerpc.
The following changes are made to enable CMA reservation on powerpc:
- Parse and obtain the CMA reservation size along with other crashkernel
parameters
- Call reserve_crashkernel_cma() to allocate the CMA region for kdump
- Include the CMA-reserved ranges in the usable memory ranges for the
kdump kernel to use.
- Exclude the CMA-reserved ranges from the crash kernel memory to
prevent them from being exported through /proc/vmcore.
With the introduction of the CMA crashkernel regions,
crash_exclude_mem_range() needs to be called multiple times to exclude
both crashk_res and crashk_cma_ranges from the crash memory ranges. To
avoid repetitive logic for validating mem_ranges size and handling
reallocation when required, this functionality is moved to a new wrapper
function crash_exclude_mem_range_guarded().
To ensure proper CMA reservation, reserve_crashkernel_cma() is called
after pageblock_order is initialized.
Update kernel-parameters.txt to document CMA support for crashkernel on
powerpc architecture.
Cc: Baoquan he <bhe@redhat.com>
Cc: Jiri Bohac <jbohac@suse.cz>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
Cc: kexec@lists.infradead.org
Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
---
Changlog:
v3 -> v4
- Removed repeated initialization to tmem in
crash_exclude_mem_range_guarded()
- Call crash_exclude_mem_range() with right crashk ranges
v4 -> v5:
- Document CMA-based crashkernel support for ppc64 in kernel-parameters.txt
---
.../admin-guide/kernel-parameters.txt | 2 +-
arch/powerpc/include/asm/kexec.h | 2 +
arch/powerpc/kernel/setup-common.c | 4 +-
arch/powerpc/kexec/core.c | 10 ++++-
arch/powerpc/kexec/ranges.c | 43 ++++++++++++++-----
5 files changed, 47 insertions(+), 14 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6c42061ca20e..0f386b546cec 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1013,7 +1013,7 @@
It will be ignored when crashkernel=X,high is not used
or memory reserved is below 4G.
crashkernel=size[KMG],cma
- [KNL, X86] Reserve additional crash kernel memory from
+ [KNL, X86, ppc64] Reserve additional crash kernel memory from
CMA. This reservation is usable by the first system's
userspace memory and kernel movable allocations (memory
balloon, zswap). Pages allocated from this memory range
diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
index 4bbf9f699aaa..bd4a6c42a5f3 100644
--- a/arch/powerpc/include/asm/kexec.h
+++ b/arch/powerpc/include/asm/kexec.h
@@ -115,9 +115,11 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct crash_mem
#ifdef CONFIG_CRASH_RESERVE
int __init overlaps_crashkernel(unsigned long start, unsigned long size);
extern void arch_reserve_crashkernel(void);
+extern void kdump_cma_reserve(void);
#else
static inline void arch_reserve_crashkernel(void) {}
static inline int overlaps_crashkernel(unsigned long start, unsigned long size) { return 0; }
+static inline void kdump_cma_reserve(void) { }
#endif
#if defined(CONFIG_CRASH_DUMP)
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 68d47c53876c..c8c42b419742 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -35,6 +35,7 @@
#include <linux/of_irq.h>
#include <linux/hugetlb.h>
#include <linux/pgtable.h>
+#include <asm/kexec.h>
#include <asm/io.h>
#include <asm/paca.h>
#include <asm/processor.h>
@@ -995,11 +996,12 @@ void __init setup_arch(char **cmdline_p)
initmem_init();
/*
- * Reserve large chunks of memory for use by CMA for fadump, KVM and
+ * Reserve large chunks of memory for use by CMA for kdump, fadump, KVM and
* hugetlb. These must be called after initmem_init(), so that
* pageblock_order is initialised.
*/
fadump_cma_init();
+ kdump_cma_reserve();
kvm_cma_reserve();
gigantic_hugetlb_cma_reserve();
diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
index d1a2d755381c..25744737eff5 100644
--- a/arch/powerpc/kexec/core.c
+++ b/arch/powerpc/kexec/core.c
@@ -33,6 +33,8 @@ void machine_kexec_cleanup(struct kimage *image)
{
}
+unsigned long long cma_size;
+
/*
* Do not allocate memory (or fail in any way) in machine_kexec().
* We are past the point of no return, committed to rebooting now.
@@ -110,7 +112,7 @@ void __init arch_reserve_crashkernel(void)
/* use common parsing */
ret = parse_crashkernel(boot_command_line, total_mem_sz, &crash_size,
- &crash_base, NULL, NULL, NULL);
+ &crash_base, NULL, &cma_size, NULL);
if (ret)
return;
@@ -130,6 +132,12 @@ void __init arch_reserve_crashkernel(void)
reserve_crashkernel_generic(crash_size, crash_base, 0, false);
}
+void __init kdump_cma_reserve(void)
+{
+ if (cma_size)
+ reserve_crashkernel_cma(cma_size);
+}
+
int __init overlaps_crashkernel(unsigned long start, unsigned long size)
{
return (start + size) > crashk_res.start && start <= crashk_res.end;
diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
index 3702b0bdab14..3bd27c38726b 100644
--- a/arch/powerpc/kexec/ranges.c
+++ b/arch/powerpc/kexec/ranges.c
@@ -515,7 +515,7 @@ int get_exclude_memory_ranges(struct crash_mem **mem_ranges)
*/
int get_usable_memory_ranges(struct crash_mem **mem_ranges)
{
- int ret;
+ int ret, i;
/*
* Early boot failure observed on guests when low memory (first memory
@@ -528,6 +528,13 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
if (ret)
goto out;
+ for (i = 0; i < crashk_cma_cnt; i++) {
+ ret = add_mem_range(mem_ranges, crashk_cma_ranges[i].start,
+ crashk_cma_ranges[i].end - crashk_cma_ranges[i].start + 1);
+ if (ret)
+ goto out;
+ }
+
ret = add_rtas_mem_range(mem_ranges);
if (ret)
goto out;
@@ -546,6 +553,22 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
#endif /* CONFIG_KEXEC_FILE */
#ifdef CONFIG_CRASH_DUMP
+static int crash_exclude_mem_range_guarded(struct crash_mem **mem_ranges,
+ unsigned long long mstart,
+ unsigned long long mend)
+{
+ struct crash_mem *tmem = *mem_ranges;
+
+ /* Reallocate memory ranges if there is no space to split ranges */
+ if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
+ tmem = realloc_mem_ranges(mem_ranges);
+ if (!tmem)
+ return -ENOMEM;
+ }
+
+ return crash_exclude_mem_range(tmem, mstart, mend);
+}
+
/**
* get_crash_memory_ranges - Get crash memory ranges. This list includes
* first/crashing kernel's memory regions that
@@ -557,7 +580,6 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
int get_crash_memory_ranges(struct crash_mem **mem_ranges)
{
phys_addr_t base, end;
- struct crash_mem *tmem;
u64 i;
int ret;
@@ -582,19 +604,18 @@ int get_crash_memory_ranges(struct crash_mem **mem_ranges)
sort_memory_ranges(*mem_ranges, true);
}
- /* Reallocate memory ranges if there is no space to split ranges */
- tmem = *mem_ranges;
- if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
- tmem = realloc_mem_ranges(mem_ranges);
- if (!tmem)
- goto out;
- }
-
/* Exclude crashkernel region */
- ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
+ ret = crash_exclude_mem_range_guarded(mem_ranges, crashk_res.start, crashk_res.end);
if (ret)
goto out;
+ for (i = 0; i < crashk_cma_cnt; ++i) {
+ ret = crash_exclude_mem_range_guarded(mem_ranges, crashk_cma_ranges[i].start,
+ crashk_cma_ranges[i].end);
+ if (ret)
+ goto out;
+ }
+
/*
* FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
* regions are exported to save their context at the time of
--
2.51.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation
2025-11-03 4:37 [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation Sourabh Jain
@ 2025-11-03 10:10 ` Ritesh Harjani
2025-11-04 5:18 ` Sourabh Jain
0 siblings, 1 reply; 9+ messages in thread
From: Ritesh Harjani @ 2025-11-03 10:10 UTC (permalink / raw)
To: Sourabh Jain, linuxppc-dev
Cc: Sourabh Jain, Baoquan he, Jiri Bohac, Hari Bathini,
Madhavan Srinivasan, Mahesh Salgaonkar, Michael Ellerman,
Shivang Upadhyay, kexec
Sourabh Jain <sourabhjain@linux.ibm.com> writes:
> Commit 35c18f2933c5 ("Add a new optional ",cma" suffix to the
> crashkernel= command line option") and commit ab475510e042 ("kdump:
> implement reserve_crashkernel_cma") added CMA support for kdump
> crashkernel reservation.
>
> Extend crashkernel CMA reservation support to powerpc.
>
> The following changes are made to enable CMA reservation on powerpc:
>
> - Parse and obtain the CMA reservation size along with other crashkernel
> parameters
> - Call reserve_crashkernel_cma() to allocate the CMA region for kdump
> - Include the CMA-reserved ranges in the usable memory ranges for the
> kdump kernel to use.
> - Exclude the CMA-reserved ranges from the crash kernel memory to
> prevent them from being exported through /proc/vmcore.
>
> With the introduction of the CMA crashkernel regions,
> crash_exclude_mem_range() needs to be called multiple times to exclude
> both crashk_res and crashk_cma_ranges from the crash memory ranges. To
> avoid repetitive logic for validating mem_ranges size and handling
> reallocation when required, this functionality is moved to a new wrapper
> function crash_exclude_mem_range_guarded().
>
> To ensure proper CMA reservation, reserve_crashkernel_cma() is called
> after pageblock_order is initialized.
>
> Update kernel-parameters.txt to document CMA support for crashkernel on
> powerpc architecture.
>
> Cc: Baoquan he <bhe@redhat.com>
> Cc: Jiri Bohac <jbohac@suse.cz>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
> Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
> Cc: kexec@lists.infradead.org
> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
> ---
> Changlog:
>
> v3 -> v4
> - Removed repeated initialization to tmem in
> crash_exclude_mem_range_guarded()
> - Call crash_exclude_mem_range() with right crashk ranges
>
> v4 -> v5:
> - Document CMA-based crashkernel support for ppc64 in kernel-parameters.txt
> ---
> .../admin-guide/kernel-parameters.txt | 2 +-
> arch/powerpc/include/asm/kexec.h | 2 +
> arch/powerpc/kernel/setup-common.c | 4 +-
> arch/powerpc/kexec/core.c | 10 ++++-
> arch/powerpc/kexec/ranges.c | 43 ++++++++++++++-----
> 5 files changed, 47 insertions(+), 14 deletions(-)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 6c42061ca20e..0f386b546cec 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -1013,7 +1013,7 @@
> It will be ignored when crashkernel=X,high is not used
> or memory reserved is below 4G.
> crashkernel=size[KMG],cma
> - [KNL, X86] Reserve additional crash kernel memory from
> + [KNL, X86, ppc64] Reserve additional crash kernel memory from
Shouldn't this be PPC and not ppc64?
If I see the crash_dump support...
config ARCH_SUPPORTS_CRASH_DUMP
def_bool PPC64 || PPC_BOOK3S_32 || PPC_85xx || (44x && !SMP)
The changes below aren't specific to ppc64 correct?
> CMA. This reservation is usable by the first system's
> userspace memory and kernel movable allocations (memory
> balloon, zswap). Pages allocated from this memory range
> diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
> index 4bbf9f699aaa..bd4a6c42a5f3 100644
> --- a/arch/powerpc/include/asm/kexec.h
> +++ b/arch/powerpc/include/asm/kexec.h
> @@ -115,9 +115,11 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct crash_mem
> #ifdef CONFIG_CRASH_RESERVE
> int __init overlaps_crashkernel(unsigned long start, unsigned long size);
> extern void arch_reserve_crashkernel(void);
> +extern void kdump_cma_reserve(void);
> #else
> static inline void arch_reserve_crashkernel(void) {}
> static inline int overlaps_crashkernel(unsigned long start, unsigned long size) { return 0; }
> +static inline void kdump_cma_reserve(void) { }
> #endif
>
> #if defined(CONFIG_CRASH_DUMP)
> diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
> index 68d47c53876c..c8c42b419742 100644
> --- a/arch/powerpc/kernel/setup-common.c
> +++ b/arch/powerpc/kernel/setup-common.c
> @@ -35,6 +35,7 @@
> #include <linux/of_irq.h>
> #include <linux/hugetlb.h>
> #include <linux/pgtable.h>
> +#include <asm/kexec.h>
> #include <asm/io.h>
> #include <asm/paca.h>
> #include <asm/processor.h>
> @@ -995,11 +996,12 @@ void __init setup_arch(char **cmdline_p)
> initmem_init();
>
> /*
> - * Reserve large chunks of memory for use by CMA for fadump, KVM and
> + * Reserve large chunks of memory for use by CMA for kdump, fadump, KVM and
> * hugetlb. These must be called after initmem_init(), so that
> * pageblock_order is initialised.
> */
> fadump_cma_init();
> + kdump_cma_reserve();
> kvm_cma_reserve();
> gigantic_hugetlb_cma_reserve();
>
> diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
> index d1a2d755381c..25744737eff5 100644
> --- a/arch/powerpc/kexec/core.c
> +++ b/arch/powerpc/kexec/core.c
> @@ -33,6 +33,8 @@ void machine_kexec_cleanup(struct kimage *image)
> {
> }
>
> +unsigned long long cma_size;
> +
nit:
Since this is a gloabal powerpc variable you are defining, then can we
keep it's name to crashk_cma_size?
> /*
> * Do not allocate memory (or fail in any way) in machine_kexec().
> * We are past the point of no return, committed to rebooting now.
> @@ -110,7 +112,7 @@ void __init arch_reserve_crashkernel(void)
>
> /* use common parsing */
> ret = parse_crashkernel(boot_command_line, total_mem_sz, &crash_size,
> - &crash_base, NULL, NULL, NULL);
> + &crash_base, NULL, &cma_size, NULL);
>
> if (ret)
> return;
> @@ -130,6 +132,12 @@ void __init arch_reserve_crashkernel(void)
> reserve_crashkernel_generic(crash_size, crash_base, 0, false);
> }
>
> +void __init kdump_cma_reserve(void)
> +{
> + if (cma_size)
> + reserve_crashkernel_cma(cma_size);
> +}
> +
nit:
cma_size is already checked for null within reserve_crashkernel_cma(),
so we don't really need kdump_cma_reserve() function call as such.
Also kdump_cma_reserve() only make sense with #ifdef CRASHKERNEL_CMA..
so instead do you think we can directly call reserve_crashkernel_cma(cma_size)?
-ritesh
> int __init overlaps_crashkernel(unsigned long start, unsigned long size)
> {
> return (start + size) > crashk_res.start && start <= crashk_res.end;
> diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
> index 3702b0bdab14..3bd27c38726b 100644
> --- a/arch/powerpc/kexec/ranges.c
> +++ b/arch/powerpc/kexec/ranges.c
> @@ -515,7 +515,7 @@ int get_exclude_memory_ranges(struct crash_mem **mem_ranges)
> */
> int get_usable_memory_ranges(struct crash_mem **mem_ranges)
> {
> - int ret;
> + int ret, i;
>
> /*
> * Early boot failure observed on guests when low memory (first memory
> @@ -528,6 +528,13 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
> if (ret)
> goto out;
>
> + for (i = 0; i < crashk_cma_cnt; i++) {
> + ret = add_mem_range(mem_ranges, crashk_cma_ranges[i].start,
> + crashk_cma_ranges[i].end - crashk_cma_ranges[i].start + 1);
> + if (ret)
> + goto out;
> + }
> +
> ret = add_rtas_mem_range(mem_ranges);
> if (ret)
> goto out;
> @@ -546,6 +553,22 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
> #endif /* CONFIG_KEXEC_FILE */
>
> #ifdef CONFIG_CRASH_DUMP
> +static int crash_exclude_mem_range_guarded(struct crash_mem **mem_ranges,
> + unsigned long long mstart,
> + unsigned long long mend)
> +{
> + struct crash_mem *tmem = *mem_ranges;
> +
> + /* Reallocate memory ranges if there is no space to split ranges */
> + if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
> + tmem = realloc_mem_ranges(mem_ranges);
> + if (!tmem)
> + return -ENOMEM;
> + }
> +
> + return crash_exclude_mem_range(tmem, mstart, mend);
> +}
> +
> /**
> * get_crash_memory_ranges - Get crash memory ranges. This list includes
> * first/crashing kernel's memory regions that
> @@ -557,7 +580,6 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
> int get_crash_memory_ranges(struct crash_mem **mem_ranges)
> {
> phys_addr_t base, end;
> - struct crash_mem *tmem;
> u64 i;
> int ret;
>
> @@ -582,19 +604,18 @@ int get_crash_memory_ranges(struct crash_mem **mem_ranges)
> sort_memory_ranges(*mem_ranges, true);
> }
>
> - /* Reallocate memory ranges if there is no space to split ranges */
> - tmem = *mem_ranges;
> - if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
> - tmem = realloc_mem_ranges(mem_ranges);
> - if (!tmem)
> - goto out;
> - }
> -
> /* Exclude crashkernel region */
> - ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
> + ret = crash_exclude_mem_range_guarded(mem_ranges, crashk_res.start, crashk_res.end);
> if (ret)
> goto out;
>
> + for (i = 0; i < crashk_cma_cnt; ++i) {
> + ret = crash_exclude_mem_range_guarded(mem_ranges, crashk_cma_ranges[i].start,
> + crashk_cma_ranges[i].end);
> + if (ret)
> + goto out;
> + }
> +
> /*
> * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
> * regions are exported to save their context at the time of
> --
> 2.51.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation
2025-11-03 10:10 ` Ritesh Harjani
@ 2025-11-04 5:18 ` Sourabh Jain
2025-11-04 9:34 ` Sourabh Jain
2025-11-04 10:18 ` Ritesh Harjani
0 siblings, 2 replies; 9+ messages in thread
From: Sourabh Jain @ 2025-11-04 5:18 UTC (permalink / raw)
To: Ritesh Harjani (IBM), linuxppc-dev
Cc: Baoquan he, Jiri Bohac, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Shivang Upadhyay, kexec
On 03/11/25 15:40, Ritesh Harjani (IBM) wrote:
> Sourabh Jain <sourabhjain@linux.ibm.com> writes:
>
>> Commit 35c18f2933c5 ("Add a new optional ",cma" suffix to the
>> crashkernel= command line option") and commit ab475510e042 ("kdump:
>> implement reserve_crashkernel_cma") added CMA support for kdump
>> crashkernel reservation.
>>
>> Extend crashkernel CMA reservation support to powerpc.
>>
>> The following changes are made to enable CMA reservation on powerpc:
>>
>> - Parse and obtain the CMA reservation size along with other crashkernel
>> parameters
>> - Call reserve_crashkernel_cma() to allocate the CMA region for kdump
>> - Include the CMA-reserved ranges in the usable memory ranges for the
>> kdump kernel to use.
>> - Exclude the CMA-reserved ranges from the crash kernel memory to
>> prevent them from being exported through /proc/vmcore.
>>
>> With the introduction of the CMA crashkernel regions,
>> crash_exclude_mem_range() needs to be called multiple times to exclude
>> both crashk_res and crashk_cma_ranges from the crash memory ranges. To
>> avoid repetitive logic for validating mem_ranges size and handling
>> reallocation when required, this functionality is moved to a new wrapper
>> function crash_exclude_mem_range_guarded().
>>
>> To ensure proper CMA reservation, reserve_crashkernel_cma() is called
>> after pageblock_order is initialized.
>>
>> Update kernel-parameters.txt to document CMA support for crashkernel on
>> powerpc architecture.
>>
>> Cc: Baoquan he <bhe@redhat.com>
>> Cc: Jiri Bohac <jbohac@suse.cz>
>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
>> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>> Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
>> Cc: kexec@lists.infradead.org
>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>> ---
>> Changlog:
>>
>> v3 -> v4
>> - Removed repeated initialization to tmem in
>> crash_exclude_mem_range_guarded()
>> - Call crash_exclude_mem_range() with right crashk ranges
>>
>> v4 -> v5:
>> - Document CMA-based crashkernel support for ppc64 in kernel-parameters.txt
>> ---
>> .../admin-guide/kernel-parameters.txt | 2 +-
>> arch/powerpc/include/asm/kexec.h | 2 +
>> arch/powerpc/kernel/setup-common.c | 4 +-
>> arch/powerpc/kexec/core.c | 10 ++++-
>> arch/powerpc/kexec/ranges.c | 43 ++++++++++++++-----
>> 5 files changed, 47 insertions(+), 14 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index 6c42061ca20e..0f386b546cec 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -1013,7 +1013,7 @@
>> It will be ignored when crashkernel=X,high is not used
>> or memory reserved is below 4G.
>> crashkernel=size[KMG],cma
>> - [KNL, X86] Reserve additional crash kernel memory from
>> + [KNL, X86, ppc64] Reserve additional crash kernel memory from
> Shouldn't this be PPC and not ppc64?
>
> If I see the crash_dump support...
>
> config ARCH_SUPPORTS_CRASH_DUMP
> def_bool PPC64 || PPC_BOOK3S_32 || PPC_85xx || (44x && !SMP)
>
> The changes below aren't specific to ppc64 correct?
The thing is this feature is only supported with KEXEC_FILE and which
only supported on PPC64:
config ARCH_SUPPORTS_KEXEC_FILE
def_bool PPC64
Hence I kept it as ppc64.
I think I should update that in the commit message.
Also do you think is it good to restrict this feature to KEXEC_FILE?
>
>> CMA. This reservation is usable by the first system's
>> userspace memory and kernel movable allocations (memory
>> balloon, zswap). Pages allocated from this memory range
>> diff --git a/arch/powerpc/include/asm/kexec.h b/arch/powerpc/include/asm/kexec.h
>> index 4bbf9f699aaa..bd4a6c42a5f3 100644
>> --- a/arch/powerpc/include/asm/kexec.h
>> +++ b/arch/powerpc/include/asm/kexec.h
>> @@ -115,9 +115,11 @@ int setup_new_fdt_ppc64(const struct kimage *image, void *fdt, struct crash_mem
>> #ifdef CONFIG_CRASH_RESERVE
>> int __init overlaps_crashkernel(unsigned long start, unsigned long size);
>> extern void arch_reserve_crashkernel(void);
>> +extern void kdump_cma_reserve(void);
>> #else
>> static inline void arch_reserve_crashkernel(void) {}
>> static inline int overlaps_crashkernel(unsigned long start, unsigned long size) { return 0; }
>> +static inline void kdump_cma_reserve(void) { }
>> #endif
>>
>> #if defined(CONFIG_CRASH_DUMP)
>> diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
>> index 68d47c53876c..c8c42b419742 100644
>> --- a/arch/powerpc/kernel/setup-common.c
>> +++ b/arch/powerpc/kernel/setup-common.c
>> @@ -35,6 +35,7 @@
>> #include <linux/of_irq.h>
>> #include <linux/hugetlb.h>
>> #include <linux/pgtable.h>
>> +#include <asm/kexec.h>
>> #include <asm/io.h>
>> #include <asm/paca.h>
>> #include <asm/processor.h>
>> @@ -995,11 +996,12 @@ void __init setup_arch(char **cmdline_p)
>> initmem_init();
>>
>> /*
>> - * Reserve large chunks of memory for use by CMA for fadump, KVM and
>> + * Reserve large chunks of memory for use by CMA for kdump, fadump, KVM and
>> * hugetlb. These must be called after initmem_init(), so that
>> * pageblock_order is initialised.
>> */
>> fadump_cma_init();
>> + kdump_cma_reserve();
>> kvm_cma_reserve();
>> gigantic_hugetlb_cma_reserve();
>>
>> diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
>> index d1a2d755381c..25744737eff5 100644
>> --- a/arch/powerpc/kexec/core.c
>> +++ b/arch/powerpc/kexec/core.c
>> @@ -33,6 +33,8 @@ void machine_kexec_cleanup(struct kimage *image)
>> {
>> }
>>
>> +unsigned long long cma_size;
>> +
> nit:
> Since this is a gloabal powerpc variable you are defining, then can we
> keep it's name to crashk_cma_size?
Yeah make sense. I will update the variable name.
>
>> /*
>> * Do not allocate memory (or fail in any way) in machine_kexec().
>> * We are past the point of no return, committed to rebooting now.
>> @@ -110,7 +112,7 @@ void __init arch_reserve_crashkernel(void)
>>
>> /* use common parsing */
>> ret = parse_crashkernel(boot_command_line, total_mem_sz, &crash_size,
>> - &crash_base, NULL, NULL, NULL);
>> + &crash_base, NULL, &cma_size, NULL);
>>
>> if (ret)
>> return;
>> @@ -130,6 +132,12 @@ void __init arch_reserve_crashkernel(void)
>> reserve_crashkernel_generic(crash_size, crash_base, 0, false);
>> }
>>
>> +void __init kdump_cma_reserve(void)
>> +{
>> + if (cma_size)
>> + reserve_crashkernel_cma(cma_size);
>> +}
>> +
> nit:
> cma_size is already checked for null within reserve_crashkernel_cma(),
> so we don't really need kdump_cma_reserve() function call as such.
>
> Also kdump_cma_reserve() only make sense with #ifdef CRASHKERNEL_CMA..
> so instead do you think we can directly call reserve_crashkernel_cma(cma_size)?
I think the above kdump_cma_reserve() definition should come under
CONFIG_CRASH_RESERVE
because the way it is declared in arch/powerpc/include/asm/kexec.h.
I would like to keep kdump_cma_reserve() as is it because of two reasons:
- It keeps setup_arch() free from kdump #ifdefs
- In case if we want to add some condition on this reservation it would
straight forward.
So lets keep kdump_cma_reserve as is, unless you have strong opinion on
not to.
>> int __init overlaps_crashkernel(unsigned long start, unsigned long size)
>> {
>> return (start + size) > crashk_res.start && start <= crashk_res.end;
>> diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
>> index 3702b0bdab14..3bd27c38726b 100644
>> --- a/arch/powerpc/kexec/ranges.c
>> +++ b/arch/powerpc/kexec/ranges.c
>> @@ -515,7 +515,7 @@ int get_exclude_memory_ranges(struct crash_mem **mem_ranges)
>> */
>> int get_usable_memory_ranges(struct crash_mem **mem_ranges)
>> {
>> - int ret;
>> + int ret, i;
>>
>> /*
>> * Early boot failure observed on guests when low memory (first memory
>> @@ -528,6 +528,13 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
>> if (ret)
>> goto out;
>>
>> + for (i = 0; i < crashk_cma_cnt; i++) {
>> + ret = add_mem_range(mem_ranges, crashk_cma_ranges[i].start,
>> + crashk_cma_ranges[i].end - crashk_cma_ranges[i].start + 1);
>> + if (ret)
>> + goto out;
>> + }
>> +
>> ret = add_rtas_mem_range(mem_ranges);
>> if (ret)
>> goto out;
>> @@ -546,6 +553,22 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
>> #endif /* CONFIG_KEXEC_FILE */
>>
>> #ifdef CONFIG_CRASH_DUMP
>> +static int crash_exclude_mem_range_guarded(struct crash_mem **mem_ranges,
>> + unsigned long long mstart,
>> + unsigned long long mend)
>> +{
>> + struct crash_mem *tmem = *mem_ranges;
>> +
>> + /* Reallocate memory ranges if there is no space to split ranges */
>> + if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
>> + tmem = realloc_mem_ranges(mem_ranges);
>> + if (!tmem)
>> + return -ENOMEM;
>> + }
>> +
>> + return crash_exclude_mem_range(tmem, mstart, mend);
>> +}
>> +
>> /**
>> * get_crash_memory_ranges - Get crash memory ranges. This list includes
>> * first/crashing kernel's memory regions that
>> @@ -557,7 +580,6 @@ int get_usable_memory_ranges(struct crash_mem **mem_ranges)
>> int get_crash_memory_ranges(struct crash_mem **mem_ranges)
>> {
>> phys_addr_t base, end;
>> - struct crash_mem *tmem;
>> u64 i;
>> int ret;
>>
>> @@ -582,19 +604,18 @@ int get_crash_memory_ranges(struct crash_mem **mem_ranges)
>> sort_memory_ranges(*mem_ranges, true);
>> }
>>
>> - /* Reallocate memory ranges if there is no space to split ranges */
>> - tmem = *mem_ranges;
>> - if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
>> - tmem = realloc_mem_ranges(mem_ranges);
>> - if (!tmem)
>> - goto out;
>> - }
>> -
>> /* Exclude crashkernel region */
>> - ret = crash_exclude_mem_range(tmem, crashk_res.start, crashk_res.end);
>> + ret = crash_exclude_mem_range_guarded(mem_ranges, crashk_res.start, crashk_res.end);
>> if (ret)
>> goto out;
>>
>> + for (i = 0; i < crashk_cma_cnt; ++i) {
>> + ret = crash_exclude_mem_range_guarded(mem_ranges, crashk_cma_ranges[i].start,
>> + crashk_cma_ranges[i].end);
>> + if (ret)
>> + goto out;
>> + }
>> +
>> /*
>> * FIXME: For now, stay in parity with kexec-tools but if RTAS/OPAL
>> * regions are exported to save their context at the time of
>> --
>> 2.51.0
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation
2025-11-04 5:18 ` Sourabh Jain
@ 2025-11-04 9:34 ` Sourabh Jain
2025-11-04 10:24 ` Ritesh Harjani
2025-11-04 10:18 ` Ritesh Harjani
1 sibling, 1 reply; 9+ messages in thread
From: Sourabh Jain @ 2025-11-04 9:34 UTC (permalink / raw)
To: Ritesh Harjani (IBM), linuxppc-dev
Cc: Baoquan he, Jiri Bohac, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Shivang Upadhyay, kexec
On 04/11/25 10:48, Sourabh Jain wrote:
>
>
> On 03/11/25 15:40, Ritesh Harjani (IBM) wrote:
>> Sourabh Jain <sourabhjain@linux.ibm.com> writes:
>>
>>> Commit 35c18f2933c5 ("Add a new optional ",cma" suffix to the
>>> crashkernel= command line option") and commit ab475510e042 ("kdump:
>>> implement reserve_crashkernel_cma") added CMA support for kdump
>>> crashkernel reservation.
>>>
>>> Extend crashkernel CMA reservation support to powerpc.
>>>
>>> The following changes are made to enable CMA reservation on powerpc:
>>>
>>> - Parse and obtain the CMA reservation size along with other
>>> crashkernel
>>> parameters
>>> - Call reserve_crashkernel_cma() to allocate the CMA region for kdump
>>> - Include the CMA-reserved ranges in the usable memory ranges for the
>>> kdump kernel to use.
>>> - Exclude the CMA-reserved ranges from the crash kernel memory to
>>> prevent them from being exported through /proc/vmcore.
>>>
>>> With the introduction of the CMA crashkernel regions,
>>> crash_exclude_mem_range() needs to be called multiple times to exclude
>>> both crashk_res and crashk_cma_ranges from the crash memory ranges. To
>>> avoid repetitive logic for validating mem_ranges size and handling
>>> reallocation when required, this functionality is moved to a new
>>> wrapper
>>> function crash_exclude_mem_range_guarded().
>>>
>>> To ensure proper CMA reservation, reserve_crashkernel_cma() is called
>>> after pageblock_order is initialized.
>>>
>>> Update kernel-parameters.txt to document CMA support for crashkernel on
>>> powerpc architecture.
>>>
>>> Cc: Baoquan he <bhe@redhat.com>
>>> Cc: Jiri Bohac <jbohac@suse.cz>
>>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>>> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
>>> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>>> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>>> Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
>>> Cc: kexec@lists.infradead.org
>>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>>> ---
>>> Changlog:
>>>
>>> v3 -> v4
>>> - Removed repeated initialization to tmem in
>>> crash_exclude_mem_range_guarded()
>>> - Call crash_exclude_mem_range() with right crashk ranges
>>>
>>> v4 -> v5:
>>> - Document CMA-based crashkernel support for ppc64 in
>>> kernel-parameters.txt
>>> ---
>>> .../admin-guide/kernel-parameters.txt | 2 +-
>>> arch/powerpc/include/asm/kexec.h | 2 +
>>> arch/powerpc/kernel/setup-common.c | 4 +-
>>> arch/powerpc/kexec/core.c | 10 ++++-
>>> arch/powerpc/kexec/ranges.c | 43
>>> ++++++++++++++-----
>>> 5 files changed, 47 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt
>>> b/Documentation/admin-guide/kernel-parameters.txt
>>> index 6c42061ca20e..0f386b546cec 100644
>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>> @@ -1013,7 +1013,7 @@
>>> It will be ignored when crashkernel=X,high is not used
>>> or memory reserved is below 4G.
>>> crashkernel=size[KMG],cma
>>> - [KNL, X86] Reserve additional crash kernel memory from
>>> + [KNL, X86, ppc64] Reserve additional crash kernel
>>> memory from
>> Shouldn't this be PPC and not ppc64?
>>
>> If I see the crash_dump support...
>>
>> config ARCH_SUPPORTS_CRASH_DUMP
>> def_bool PPC64 || PPC_BOOK3S_32 || PPC_85xx || (44x && !SMP)
>>
>> The changes below aren't specific to ppc64 correct?
>
> The thing is this feature is only supported with KEXEC_FILE and which
> only supported on PPC64:
>
> config ARCH_SUPPORTS_KEXEC_FILE
> def_bool PPC64
>
> Hence I kept it as ppc64.
>
> I think I should update that in the commit message.
>
> Also do you think is it good to restrict this feature to KEXEC_FILE?
Putting this under KEXEC_FILE may not help much because KEXEC_FILE is
enabled
by default in most configurations. Once it is enabled, the CMA
reservation will
happen regardless of which system call is used to load the kdump kernel
(kexec_load or kexec_file_load).
However, not restricting this feature to KEXEC_FILE will allow the kexec
tool to
independently add support for this feature in the future for the kexec_load
system call.
With that logic, I think if we do not restrict this feature to
KEXEC_FILE, the support
will be available for ppc and not limited to ppc64.
>
>>
>>> CMA. This reservation is usable by the first system's
>>> userspace memory and kernel movable allocations (memory
>>> balloon, zswap). Pages allocated from this memory range
>>> diff --git a/arch/powerpc/include/asm/kexec.h
>>> b/arch/powerpc/include/asm/kexec.h
>>> index 4bbf9f699aaa..bd4a6c42a5f3 100644
>>> --- a/arch/powerpc/include/asm/kexec.h
>>> +++ b/arch/powerpc/include/asm/kexec.h
>>> @@ -115,9 +115,11 @@ int setup_new_fdt_ppc64(const struct kimage
>>> *image, void *fdt, struct crash_mem
>>> #ifdef CONFIG_CRASH_RESERVE
>>> int __init overlaps_crashkernel(unsigned long start, unsigned long
>>> size);
>>> extern void arch_reserve_crashkernel(void);
>>> +extern void kdump_cma_reserve(void);
>>> #else
>>> static inline void arch_reserve_crashkernel(void) {}
>>> static inline int overlaps_crashkernel(unsigned long start,
>>> unsigned long size) { return 0; }
>>> +static inline void kdump_cma_reserve(void) { }
>>> #endif
>>> #if defined(CONFIG_CRASH_DUMP)
>>> diff --git a/arch/powerpc/kernel/setup-common.c
>>> b/arch/powerpc/kernel/setup-common.c
>>> index 68d47c53876c..c8c42b419742 100644
>>> --- a/arch/powerpc/kernel/setup-common.c
>>> +++ b/arch/powerpc/kernel/setup-common.c
>>> @@ -35,6 +35,7 @@
>>> #include <linux/of_irq.h>
>>> #include <linux/hugetlb.h>
>>> #include <linux/pgtable.h>
>>> +#include <asm/kexec.h>
>>> #include <asm/io.h>
>>> #include <asm/paca.h>
>>> #include <asm/processor.h>
>>> @@ -995,11 +996,12 @@ void __init setup_arch(char **cmdline_p)
>>> initmem_init();
>>> /*
>>> - * Reserve large chunks of memory for use by CMA for fadump,
>>> KVM and
>>> + * Reserve large chunks of memory for use by CMA for kdump,
>>> fadump, KVM and
>>> * hugetlb. These must be called after initmem_init(), so that
>>> * pageblock_order is initialised.
>>> */
>>> fadump_cma_init();
>>> + kdump_cma_reserve();
>>> kvm_cma_reserve();
>>> gigantic_hugetlb_cma_reserve();
>>> diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
>>> index d1a2d755381c..25744737eff5 100644
>>> --- a/arch/powerpc/kexec/core.c
>>> +++ b/arch/powerpc/kexec/core.c
>>> @@ -33,6 +33,8 @@ void machine_kexec_cleanup(struct kimage *image)
>>> {
>>> }
>>> +unsigned long long cma_size;
>>> +
>> nit:
>> Since this is a gloabal powerpc variable you are defining, then can we
>> keep it's name to crashk_cma_size?
>
> Yeah make sense. I will update the variable name.
>
>
>>
>>> /*
>>> * Do not allocate memory (or fail in any way) in machine_kexec().
>>> * We are past the point of no return, committed to rebooting now.
>>> @@ -110,7 +112,7 @@ void __init arch_reserve_crashkernel(void)
>>> /* use common parsing */
>>> ret = parse_crashkernel(boot_command_line, total_mem_sz,
>>> &crash_size,
>>> - &crash_base, NULL, NULL, NULL);
>>> + &crash_base, NULL, &cma_size, NULL);
>>> if (ret)
>>> return;
>>> @@ -130,6 +132,12 @@ void __init arch_reserve_crashkernel(void)
>>> reserve_crashkernel_generic(crash_size, crash_base, 0, false);
>>> }
>>> +void __init kdump_cma_reserve(void)
>>> +{
>>> + if (cma_size)
>>> + reserve_crashkernel_cma(cma_size);
>>> +}
>>> +
>> nit:
>> cma_size is already checked for null within reserve_crashkernel_cma(),
>> so we don't really need kdump_cma_reserve() function call as such.
>>
>> Also kdump_cma_reserve() only make sense with #ifdef CRASHKERNEL_CMA..
>> so instead do you think we can directly call
>> reserve_crashkernel_cma(cma_size)?
>
> I think the above kdump_cma_reserve() definition should come under
> CONFIG_CRASH_RESERVE
> because the way it is declared in arch/powerpc/include/asm/kexec.h.
>
> I would like to keep kdump_cma_reserve() as is it because of two reasons:
>
> - It keeps setup_arch() free from kdump #ifdefs
> - In case if we want to add some condition on this reservation it
> would straight forward.
>
> So lets keep kdump_cma_reserve as is, unless you have strong opinion
> on not to.
>
>>> int __init overlaps_crashkernel(unsigned long start, unsigned long
>>> size)
>>> {
>>> return (start + size) > crashk_res.start && start <=
>>> crashk_res.end;
>>> diff --git a/arch/powerpc/kexec/ranges.c b/arch/powerpc/kexec/ranges.c
>>> index 3702b0bdab14..3bd27c38726b 100644
>>> --- a/arch/powerpc/kexec/ranges.c
>>> +++ b/arch/powerpc/kexec/ranges.c
>>> @@ -515,7 +515,7 @@ int get_exclude_memory_ranges(struct crash_mem
>>> **mem_ranges)
>>> */
>>> int get_usable_memory_ranges(struct crash_mem **mem_ranges)
>>> {
>>> - int ret;
>>> + int ret, i;
>>> /*
>>> * Early boot failure observed on guests when low memory
>>> (first memory
>>> @@ -528,6 +528,13 @@ int get_usable_memory_ranges(struct crash_mem
>>> **mem_ranges)
>>> if (ret)
>>> goto out;
>>> + for (i = 0; i < crashk_cma_cnt; i++) {
>>> + ret = add_mem_range(mem_ranges, crashk_cma_ranges[i].start,
>>> + crashk_cma_ranges[i].end -
>>> crashk_cma_ranges[i].start + 1);
>>> + if (ret)
>>> + goto out;
>>> + }
>>> +
>>> ret = add_rtas_mem_range(mem_ranges);
>>> if (ret)
>>> goto out;
>>> @@ -546,6 +553,22 @@ int get_usable_memory_ranges(struct crash_mem
>>> **mem_ranges)
>>> #endif /* CONFIG_KEXEC_FILE */
>>> #ifdef CONFIG_CRASH_DUMP
>>> +static int crash_exclude_mem_range_guarded(struct crash_mem
>>> **mem_ranges,
>>> + unsigned long long mstart,
>>> + unsigned long long mend)
>>> +{
>>> + struct crash_mem *tmem = *mem_ranges;
>>> +
>>> + /* Reallocate memory ranges if there is no space to split
>>> ranges */
>>> + if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
>>> + tmem = realloc_mem_ranges(mem_ranges);
>>> + if (!tmem)
>>> + return -ENOMEM;
>>> + }
>>> +
>>> + return crash_exclude_mem_range(tmem, mstart, mend);
>>> +}
>>> +
>>> /**
>>> * get_crash_memory_ranges - Get crash memory ranges. This list
>>> includes
>>> * first/crashing kernel's memory
>>> regions that
>>> @@ -557,7 +580,6 @@ int get_usable_memory_ranges(struct crash_mem
>>> **mem_ranges)
>>> int get_crash_memory_ranges(struct crash_mem **mem_ranges)
>>> {
>>> phys_addr_t base, end;
>>> - struct crash_mem *tmem;
>>> u64 i;
>>> int ret;
>>> @@ -582,19 +604,18 @@ int get_crash_memory_ranges(struct crash_mem
>>> **mem_ranges)
>>> sort_memory_ranges(*mem_ranges, true);
>>> }
>>> - /* Reallocate memory ranges if there is no space to split
>>> ranges */
>>> - tmem = *mem_ranges;
>>> - if (tmem && (tmem->nr_ranges == tmem->max_nr_ranges)) {
>>> - tmem = realloc_mem_ranges(mem_ranges);
>>> - if (!tmem)
>>> - goto out;
>>> - }
>>> -
>>> /* Exclude crashkernel region */
>>> - ret = crash_exclude_mem_range(tmem, crashk_res.start,
>>> crashk_res.end);
>>> + ret = crash_exclude_mem_range_guarded(mem_ranges,
>>> crashk_res.start, crashk_res.end);
>>> if (ret)
>>> goto out;
>>> + for (i = 0; i < crashk_cma_cnt; ++i) {
>>> + ret = crash_exclude_mem_range_guarded(mem_ranges,
>>> crashk_cma_ranges[i].start,
>>> + crashk_cma_ranges[i].end);
>>> + if (ret)
>>> + goto out;
>>> + }
>>> +
>>> /*
>>> * FIXME: For now, stay in parity with kexec-tools but if
>>> RTAS/OPAL
>>> * regions are exported to save their context at the
>>> time of
>>> --
>>> 2.51.0
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation
2025-11-04 5:18 ` Sourabh Jain
2025-11-04 9:34 ` Sourabh Jain
@ 2025-11-04 10:18 ` Ritesh Harjani
2025-11-04 10:35 ` Sourabh Jain
1 sibling, 1 reply; 9+ messages in thread
From: Ritesh Harjani @ 2025-11-04 10:18 UTC (permalink / raw)
To: Sourabh Jain, linuxppc-dev
Cc: Baoquan he, Jiri Bohac, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Shivang Upadhyay, kexec
Sourabh Jain <sourabhjain@linux.ibm.com> writes:
> I would like to keep kdump_cma_reserve() as is it because of two reasons:
>
> - It keeps setup_arch() free from kdump #ifdefs
Not really.
Instead of kdump_cma_reserve(crashk_cma_size), one could call
reserve_crashkernel_cma(crashk_cma_size) directly in setup_arch().
> - In case if we want to add some condition on this reservation it would
> straight forward.
>
Make sense.
> So lets keep kdump_cma_reserve as is, unless you have strong opinion on
> not to.
>
No strong opinion, as I said it was a minor nit. Feel free to keep the
function kdump_cma_reserve() as is then.
-ritesh
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation
2025-11-04 9:34 ` Sourabh Jain
@ 2025-11-04 10:24 ` Ritesh Harjani
2025-11-04 12:38 ` Sourabh Jain
0 siblings, 1 reply; 9+ messages in thread
From: Ritesh Harjani @ 2025-11-04 10:24 UTC (permalink / raw)
To: Sourabh Jain, linuxppc-dev
Cc: Baoquan he, Jiri Bohac, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Shivang Upadhyay, kexec
Sourabh Jain <sourabhjain@linux.ibm.com> writes:
> On 04/11/25 10:48, Sourabh Jain wrote:
>>
>>
>> On 03/11/25 15:40, Ritesh Harjani (IBM) wrote:
>>> Sourabh Jain <sourabhjain@linux.ibm.com> writes:
>>>
>>>> Commit 35c18f2933c5 ("Add a new optional ",cma" suffix to the
>>>> crashkernel= command line option") and commit ab475510e042 ("kdump:
>>>> implement reserve_crashkernel_cma") added CMA support for kdump
>>>> crashkernel reservation.
>>>>
>>>> Extend crashkernel CMA reservation support to powerpc.
>>>>
>>>> The following changes are made to enable CMA reservation on powerpc:
>>>>
>>>> - Parse and obtain the CMA reservation size along with other
>>>> crashkernel
>>>> parameters
>>>> - Call reserve_crashkernel_cma() to allocate the CMA region for kdump
>>>> - Include the CMA-reserved ranges in the usable memory ranges for the
>>>> kdump kernel to use.
>>>> - Exclude the CMA-reserved ranges from the crash kernel memory to
>>>> prevent them from being exported through /proc/vmcore.
>>>>
>>>> With the introduction of the CMA crashkernel regions,
>>>> crash_exclude_mem_range() needs to be called multiple times to exclude
>>>> both crashk_res and crashk_cma_ranges from the crash memory ranges. To
>>>> avoid repetitive logic for validating mem_ranges size and handling
>>>> reallocation when required, this functionality is moved to a new
>>>> wrapper
>>>> function crash_exclude_mem_range_guarded().
>>>>
>>>> To ensure proper CMA reservation, reserve_crashkernel_cma() is called
>>>> after pageblock_order is initialized.
>>>>
>>>> Update kernel-parameters.txt to document CMA support for crashkernel on
>>>> powerpc architecture.
>>>>
>>>> Cc: Baoquan he <bhe@redhat.com>
>>>> Cc: Jiri Bohac <jbohac@suse.cz>
>>>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>>>> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
>>>> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>>>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>>>> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>>>> Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
>>>> Cc: kexec@lists.infradead.org
>>>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>>>> ---
>>>> Changlog:
>>>>
>>>> v3 -> v4
>>>> - Removed repeated initialization to tmem in
>>>> crash_exclude_mem_range_guarded()
>>>> - Call crash_exclude_mem_range() with right crashk ranges
>>>>
>>>> v4 -> v5:
>>>> - Document CMA-based crashkernel support for ppc64 in
>>>> kernel-parameters.txt
>>>> ---
>>>> .../admin-guide/kernel-parameters.txt | 2 +-
>>>> arch/powerpc/include/asm/kexec.h | 2 +
>>>> arch/powerpc/kernel/setup-common.c | 4 +-
>>>> arch/powerpc/kexec/core.c | 10 ++++-
>>>> arch/powerpc/kexec/ranges.c | 43
>>>> ++++++++++++++-----
>>>> 5 files changed, 47 insertions(+), 14 deletions(-)
>>>>
>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt
>>>> b/Documentation/admin-guide/kernel-parameters.txt
>>>> index 6c42061ca20e..0f386b546cec 100644
>>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>>> @@ -1013,7 +1013,7 @@
>>>> It will be ignored when crashkernel=X,high is not used
>>>> or memory reserved is below 4G.
>>>> crashkernel=size[KMG],cma
>>>> - [KNL, X86] Reserve additional crash kernel memory from
>>>> + [KNL, X86, ppc64] Reserve additional crash kernel
>>>> memory from
>>> Shouldn't this be PPC and not ppc64?
>>>
>>> If I see the crash_dump support...
>>>
>>> config ARCH_SUPPORTS_CRASH_DUMP
>>> def_bool PPC64 || PPC_BOOK3S_32 || PPC_85xx || (44x && !SMP)
>>>
>>> The changes below aren't specific to ppc64 correct?
>>
>> The thing is this feature is only supported with KEXEC_FILE and which
>> only supported on PPC64:
>>
>> config ARCH_SUPPORTS_KEXEC_FILE
>> def_bool PPC64
>>
>> Hence I kept it as ppc64.
>>
I am not much familiar with the kexec_load v/s kexec_file_load
internals. Maybe because of that I am unable to clearly understand your
above points.
But let me try and explain what I think you meant :)
We first call "get_usable_memory_ranges(&umem)" which updates the usable
memory ranges in "umem". We then call "update_usable_mem_fdt(fdt, umem)"
which updates the FDT for the kdump kernel's fdt to inform about these
usable memory ranges to the kdump kernel.
Now since your patch only does that in get_usable_memory_range(), this
extra CMA reservation is mainly only useful when the kdump load happens
via kexec_file_load(), (because get_usable_memory_range() only gets
called from kexec_file_load() path)
Is this what you meant here?
>> I think I should update that in the commit message.
>>
>> Also do you think is it good to restrict this feature to KEXEC_FILE?
>
> Putting this under KEXEC_FILE may not help much because KEXEC_FILE is
> enabled
> by default in most configurations. Once it is enabled, the CMA
> reservation will
> happen regardless of which system call is used to load the kdump kernel
> (kexec_load or kexec_file_load).
>
What I understood from the feature was that, on the normal production
kernel this feature crashkernel=xM,cma allows to reserve an extra xMB of
memory as a CMA region for kdump kernel's memory allocations. But this
CMA reservation would happen in the normal kernel itself during
setup_arch() -> kdump_cma_reserve()..
And this CMA reservation happens irrespective of whether the kdump
kernel will get loaded via whichever system call.
> However, not restricting this feature to KEXEC_FILE will allow the kexec
> tool to
> independently add support for this feature in the future for the kexec_load
> system call.
Sure.
>
> With that logic, I think if we do not restrict this feature to
> KEXEC_FILE, the support
> will be available for ppc and not limited to ppc64.
>
Yes, that make sense.
If one doesn't want to make the CMA reservation, then we need not pass
the extra cmdline argument and no reservation will be made. So, no need
to restrict this to PPC64 by making it available only for KEXEC_FILE.
-ritesh
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation
2025-11-04 10:18 ` Ritesh Harjani
@ 2025-11-04 10:35 ` Sourabh Jain
2025-11-04 10:51 ` Ritesh Harjani
0 siblings, 1 reply; 9+ messages in thread
From: Sourabh Jain @ 2025-11-04 10:35 UTC (permalink / raw)
To: Ritesh Harjani (IBM), linuxppc-dev
Cc: Baoquan he, Jiri Bohac, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Shivang Upadhyay, kexec
On 04/11/25 15:48, Ritesh Harjani (IBM) wrote:
> Sourabh Jain <sourabhjain@linux.ibm.com> writes:
>
>
>> I would like to keep kdump_cma_reserve() as is it because of two reasons:
>>
>> - It keeps setup_arch() free from kdump #ifdefs
> Not really.
>
> Instead of kdump_cma_reserve(crashk_cma_size), one could call
>
> reserve_crashkernel_cma(crashk_cma_size) directly in setup_arch().
reserve_crashkernel_cma() is not available unless the kernel is built
with CONFIG_CRASH_RESERVE.
So, wouldn’t calling reserve_crashkernel_cma() directly from
setup_arch() lead to a build failure? Or
am I missing something?
>
>> - In case if we want to add some condition on this reservation it would
>> straight forward.
>>
> Make sense.
>
>> So lets keep kdump_cma_reserve as is, unless you have strong opinion on
>> not to.
>>
> No strong opinion, as I said it was a minor nit. Feel free to keep the
> function kdump_cma_reserve() as is then.
>
> -ritesh
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation
2025-11-04 10:35 ` Sourabh Jain
@ 2025-11-04 10:51 ` Ritesh Harjani
0 siblings, 0 replies; 9+ messages in thread
From: Ritesh Harjani @ 2025-11-04 10:51 UTC (permalink / raw)
To: Sourabh Jain, linuxppc-dev
Cc: Baoquan he, Jiri Bohac, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Shivang Upadhyay, kexec
Sourabh Jain <sourabhjain@linux.ibm.com> writes:
> On 04/11/25 15:48, Ritesh Harjani (IBM) wrote:
>> Sourabh Jain <sourabhjain@linux.ibm.com> writes:
>>
>>
>>> I would like to keep kdump_cma_reserve() as is it because of two reasons:
>>>
>>> - It keeps setup_arch() free from kdump #ifdefs
>> Not really.
>>
>> Instead of kdump_cma_reserve(crashk_cma_size), one could call
>>
>> reserve_crashkernel_cma(crashk_cma_size) directly in setup_arch().
>
>
> reserve_crashkernel_cma() is not available unless the kernel is built
> with CONFIG_CRASH_RESERVE.
> So, wouldn’t calling reserve_crashkernel_cma() directly from
> setup_arch() lead to a build failure? Or
> am I missing something?
>
OOps.. I was assuming the #else CRASHKERNEL_CMA definition should get
called, but all of that logic itself is protected in
CONFIG_CRASH_RESERVE :(
Right to avoid #ifdef or IS_ENABLED in setup_arch..
it's better to have kdump_cma_reserve()
Thanks for pointing that out.
<snip>
obj-$(CONFIG_CRASH_RESERVE) += crash_reserve.o
kernel/crash_reserve.c
#ifdef CRASHKERNEL_CMA
int crashk_cma_cnt;
void __init reserve_crashkernel_cma(unsigned long long cma_size)
{
...
}
#else /* CRASHKERNEL_CMA */
void __init reserve_crashkernel_cma(unsigned long long cma_size)
{
if (cma_size)
pr_warn("crashkernel CMA reservation not supported\n");
}
#endif
-ritesh
>>
>>> - In case if we want to add some condition on this reservation it would
>>> straight forward.
>>>
>> Make sense.
>>
>>> So lets keep kdump_cma_reserve as is, unless you have strong opinion on
>>> not to.
>>>
>> No strong opinion, as I said it was a minor nit. Feel free to keep the
>> function kdump_cma_reserve() as is then.
>>
>> -ritesh
>>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation
2025-11-04 10:24 ` Ritesh Harjani
@ 2025-11-04 12:38 ` Sourabh Jain
0 siblings, 0 replies; 9+ messages in thread
From: Sourabh Jain @ 2025-11-04 12:38 UTC (permalink / raw)
To: Ritesh Harjani (IBM), linuxppc-dev
Cc: Baoquan he, Jiri Bohac, Hari Bathini, Madhavan Srinivasan,
Mahesh Salgaonkar, Michael Ellerman, Shivang Upadhyay, kexec
On 04/11/25 15:54, Ritesh Harjani (IBM) wrote:
> Sourabh Jain <sourabhjain@linux.ibm.com> writes:
>
>> On 04/11/25 10:48, Sourabh Jain wrote:
>>>
>>> On 03/11/25 15:40, Ritesh Harjani (IBM) wrote:
>>>> Sourabh Jain <sourabhjain@linux.ibm.com> writes:
>>>>
>>>>> Commit 35c18f2933c5 ("Add a new optional ",cma" suffix to the
>>>>> crashkernel= command line option") and commit ab475510e042 ("kdump:
>>>>> implement reserve_crashkernel_cma") added CMA support for kdump
>>>>> crashkernel reservation.
>>>>>
>>>>> Extend crashkernel CMA reservation support to powerpc.
>>>>>
>>>>> The following changes are made to enable CMA reservation on powerpc:
>>>>>
>>>>> - Parse and obtain the CMA reservation size along with other
>>>>> crashkernel
>>>>> parameters
>>>>> - Call reserve_crashkernel_cma() to allocate the CMA region for kdump
>>>>> - Include the CMA-reserved ranges in the usable memory ranges for the
>>>>> kdump kernel to use.
>>>>> - Exclude the CMA-reserved ranges from the crash kernel memory to
>>>>> prevent them from being exported through /proc/vmcore.
>>>>>
>>>>> With the introduction of the CMA crashkernel regions,
>>>>> crash_exclude_mem_range() needs to be called multiple times to exclude
>>>>> both crashk_res and crashk_cma_ranges from the crash memory ranges. To
>>>>> avoid repetitive logic for validating mem_ranges size and handling
>>>>> reallocation when required, this functionality is moved to a new
>>>>> wrapper
>>>>> function crash_exclude_mem_range_guarded().
>>>>>
>>>>> To ensure proper CMA reservation, reserve_crashkernel_cma() is called
>>>>> after pageblock_order is initialized.
>>>>>
>>>>> Update kernel-parameters.txt to document CMA support for crashkernel on
>>>>> powerpc architecture.
>>>>>
>>>>> Cc: Baoquan he <bhe@redhat.com>
>>>>> Cc: Jiri Bohac <jbohac@suse.cz>
>>>>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>>>>> Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
>>>>> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
>>>>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>>>>> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>>>>> Cc: Shivang Upadhyay <shivangu@linux.ibm.com>
>>>>> Cc: kexec@lists.infradead.org
>>>>> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com>
>>>>> ---
>>>>> Changlog:
>>>>>
>>>>> v3 -> v4
>>>>> - Removed repeated initialization to tmem in
>>>>> crash_exclude_mem_range_guarded()
>>>>> - Call crash_exclude_mem_range() with right crashk ranges
>>>>>
>>>>> v4 -> v5:
>>>>> - Document CMA-based crashkernel support for ppc64 in
>>>>> kernel-parameters.txt
>>>>> ---
>>>>> .../admin-guide/kernel-parameters.txt | 2 +-
>>>>> arch/powerpc/include/asm/kexec.h | 2 +
>>>>> arch/powerpc/kernel/setup-common.c | 4 +-
>>>>> arch/powerpc/kexec/core.c | 10 ++++-
>>>>> arch/powerpc/kexec/ranges.c | 43
>>>>> ++++++++++++++-----
>>>>> 5 files changed, 47 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt
>>>>> b/Documentation/admin-guide/kernel-parameters.txt
>>>>> index 6c42061ca20e..0f386b546cec 100644
>>>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>>>> @@ -1013,7 +1013,7 @@
>>>>> It will be ignored when crashkernel=X,high is not used
>>>>> or memory reserved is below 4G.
>>>>> crashkernel=size[KMG],cma
>>>>> - [KNL, X86] Reserve additional crash kernel memory from
>>>>> + [KNL, X86, ppc64] Reserve additional crash kernel
>>>>> memory from
>>>> Shouldn't this be PPC and not ppc64?
>>>>
>>>> If I see the crash_dump support...
>>>>
>>>> config ARCH_SUPPORTS_CRASH_DUMP
>>>> def_bool PPC64 || PPC_BOOK3S_32 || PPC_85xx || (44x && !SMP)
>>>>
>>>> The changes below aren't specific to ppc64 correct?
>>> The thing is this feature is only supported with KEXEC_FILE and which
>>> only supported on PPC64:
>>>
>>> config ARCH_SUPPORTS_KEXEC_FILE
>>> def_bool PPC64
>>>
>>> Hence I kept it as ppc64.
>>>
> I am not much familiar with the kexec_load v/s kexec_file_load
> internals. Maybe because of that I am unable to clearly understand your
> above points.
>
> But let me try and explain what I think you meant :)
>
> We first call "get_usable_memory_ranges(&umem)" which updates the usable
> memory ranges in "umem". We then call "update_usable_mem_fdt(fdt, umem)"
> which updates the FDT for the kdump kernel's fdt to inform about these
> usable memory ranges to the kdump kernel.
>
> Now since your patch only does that in get_usable_memory_range(), this
> extra CMA reservation is mainly only useful when the kdump load happens
> via kexec_file_load(), (because get_usable_memory_range() only gets
> called from kexec_file_load() path)
>
> Is this what you meant here?
Yeah, for kexec_file_load, the FDT for the kdump kernel is prepared in
the Linux
kernel (using the functions you mentioned), whereas for kexec_load, it
is prepared
in the kexec tool (userspace). Hence, these changes are not sufficient
to support
this feature with the kexec_load syscall.
The kexec tool must be updated to ensure that the FDT is prepared in a
way that
marks the crashkernel CMA reservation as usable in the kdump FDT for the
kexec_load system call.
Anyway, it makes more sense to say that crashkernel=xM,cma support is
available
on ppc rather than ppc64, since restricting crashkernel CMA reservation to
KEXEC_FILE does not help. The details are explained below.
>
>
>>> I think I should update that in the commit message.
>>>
>>> Also do you think is it good to restrict this feature to KEXEC_FILE?
>> Putting this under KEXEC_FILE may not help much because KEXEC_FILE is
>> enabled
>> by default in most configurations. Once it is enabled, the CMA
>> reservation will
>> happen regardless of which system call is used to load the kdump kernel
>> (kexec_load or kexec_file_load).
>>
> What I understood from the feature was that, on the normal production
> kernel this feature crashkernel=xM,cma allows to reserve an extra xMB of
> memory as a CMA region for kdump kernel's memory allocations. But this
> CMA reservation would happen in the normal kernel itself during
> setup_arch() -> kdump_cma_reserve()..
>
> And this CMA reservation happens irrespective of whether the kdump
> kernel will get loaded via whichever system call.
Yeah that's right.
>
>> However, not restricting this feature to KEXEC_FILE will allow the kexec
>> tool to
>> independently add support for this feature in the future for the kexec_load
>> system call.
> Sure.
>
>> With that logic, I think if we do not restrict this feature to
>> KEXEC_FILE, the support
>> will be available for ppc and not limited to ppc64.
>>
> Yes, that make sense.
>
> If one doesn't want to make the CMA reservation, then we need not pass
> the extra cmdline argument and no reservation will be made. So, no need
> to restrict this to PPC64 by making it available only for KEXEC_FILE.
Agree.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2025-11-04 12:38 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-11-03 4:37 [PATCH v5] powerpc/kdump: Add support for crashkernel CMA reservation Sourabh Jain
2025-11-03 10:10 ` Ritesh Harjani
2025-11-04 5:18 ` Sourabh Jain
2025-11-04 9:34 ` Sourabh Jain
2025-11-04 10:24 ` Ritesh Harjani
2025-11-04 12:38 ` Sourabh Jain
2025-11-04 10:18 ` Ritesh Harjani
2025-11-04 10:35 ` Sourabh Jain
2025-11-04 10:51 ` Ritesh Harjani
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).