* [PATCH 0/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high
@ 2023-01-17 3:49 Baoquan He
  2023-01-17 3:49 ` [PATCH 1/2] " Baoquan He
  2023-01-17 3:49 ` [PATCH 2/2] arm64/kdump: add code comments for crashkernel reservation cases Baoquan He
  0 siblings, 2 replies; 12+ messages in thread
From: Baoquan He @ 2023-01-17 3:49 UTC (permalink / raw)
To: linux-kernel
Cc: kexec, linux-arm-kernel, catalin.marinas, will, thunder.leizhen, John.p.donnelly, wangkefeng.wang, Baoquan He

After crashkernel=,high support was added, our QE engineer found, surprisingly, that the reserved crashkernel high region could cross the boundary between high and low memory. E.g. on a system with 4G as the boundary, the crashkernel high region is [4G-64M, 4G+448M].

After investigation, I noticed that on arm64 the memory around 4G is contiguous, unlike x86, where firmware occupies the CPU address space just below 4G. When memblock searches top down for an available memory region, it can find a region that straddles the 4G boundary. In the end, we actually got two memory regions in low memory. This complicates the crashkernel=,high behaviour and will confuse people.

This patchset simplifies the behaviour of crashkernel=,high. When trying to reserve a memory region for crashkernel high, we only search high memory, e.g. above 4G. If that fails, we try searching low memory. This makes sure the crashkernel high region stays within high memory, which removes the confusion.

In patch 2, I add code comments above several crashkernel reservation cases to ease code reading. I put them in a separate patch because I want the code change in patch 1 to be straightforward and simple to review.

Please help review and check whether this is helpful and worthwhile.

Baoquan He (2):
  arm64: kdump: simplify the reservation behaviour of crashkernel=,high
  arm64/kdump: add code comments for crashkernel reservation cases

 arch/arm64/mm/init.c | 43 ++++++++++++++++++++++++++++++++++++-------
 1 file changed, 36 insertions(+), 7 deletions(-)

-- 
2.34.1

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high
  2023-01-17 3:49 [PATCH 0/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high Baoquan He
@ 2023-01-17 3:49 ` Baoquan He
  2023-01-20 9:04 ` Simon Horman
  2023-01-24 17:36 ` Catalin Marinas
  2023-01-17 3:49 ` [PATCH 2/2] arm64/kdump: add code comments for crashkernel reservation cases Baoquan He
  1 sibling, 2 replies; 12+ messages in thread
From: Baoquan He @ 2023-01-17 3:49 UTC (permalink / raw)
To: linux-kernel
Cc: kexec, linux-arm-kernel, catalin.marinas, will, thunder.leizhen, John.p.donnelly, wangkefeng.wang, Baoquan He

On arm64, the reservation for 'crashkernel=xM,high' is made by searching for a suitable memory region top down. If the 'xM' of crashkernel high memory is reserved from high memory successfully, it will then try to reserve crashkernel low memory accordingly. Otherwise, it will search the low memory area for a suitable 'xM' region.

However, we observed an unexpected case where a reserved region crosses the boundary between high and low memory. E.g. on a system with 4G as the end of low memory, where the user added the kernel parameter 'crashkernel=512M,high', the running kernel could end up with the regions [4G-126M, 4G+386M] and [1G, 1G+128M]. This looks very strange because we have two low memory regions, [4G-126M, 4G] and [1G, 1G+128M]. A lot of explanation is needed to tell why that happened.

Here, for crashkernel=xM,high, search high memory for a suitable region above the boundary between high and low memory. If that fails, try reserving a suitable region below the boundary. This way, the crashkernel high region will only exist in high memory, and the crashkernel low region will only exist in low memory. The reservation behaviour for crashkernel=,high becomes clearer and simpler.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/arm64/mm/init.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 58a0bb2c17f1..26a05af2bfa8 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -127,12 +127,13 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
  */
 static void __init reserve_crashkernel(void)
 {
-	unsigned long long crash_base, crash_size;
-	unsigned long long crash_low_size = 0;
+	unsigned long long crash_base, crash_size, search_base;
 	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
+	unsigned long long crash_low_size = 0;
 	char *cmdline = boot_command_line;
-	int ret;
 	bool fixed_base = false;
+	bool high = false;
+	int ret;
 
 	if (!IS_ENABLED(CONFIG_KEXEC_CORE))
 		return;
@@ -155,7 +156,9 @@ static void __init reserve_crashkernel(void)
 		else if (ret)
 			return;
 
+		search_base = CRASH_ADDR_LOW_MAX;
 		crash_max = CRASH_ADDR_HIGH_MAX;
+		high = true;
 	} else if (ret || !crash_size) {
 		/* The specified value is invalid */
 		return;
@@ -166,31 +169,44 @@ static void __init reserve_crashkernel(void)
 	/* User specifies base address explicitly. */
 	if (crash_base) {
 		fixed_base = true;
+		search_base = crash_base;
 		crash_max = crash_base + crash_size;
 	}
 
 retry:
 	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
-					       crash_base, crash_max);
+					       search_base, crash_max);
 	if (!crash_base) {
+		if (fixed_base) {
+			pr_warn("cannot reserve crashkernel region [0x%llx-0x%llx]\n",
+				search_base, crash_max);
+			return;
+		}
+
 		/*
 		 * If the first attempt was for low memory, fall back to
 		 * high memory, the minimum required low memory will be
 		 * reserved later.
 		 */
-		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
+		if (!high && crash_max == CRASH_ADDR_LOW_MAX) {
 			crash_max = CRASH_ADDR_HIGH_MAX;
+			search_base = CRASH_ADDR_LOW_MAX;
 			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
 			goto retry;
 		}
 
+		if (high && (crash_max == CRASH_ADDR_HIGH_MAX)) {
+			crash_max = CRASH_ADDR_LOW_MAX;
+			search_base = 0;
+			goto retry;
+		}
 		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
 			crash_size);
 		return;
 	}
 
-	if ((crash_base > CRASH_ADDR_LOW_MAX - crash_low_size) &&
-	     crash_low_size && reserve_crashkernel_low(crash_low_size)) {
+	if ((crash_base >= CRASH_ADDR_LOW_MAX) && crash_low_size &&
+	     reserve_crashkernel_low(crash_low_size)) {
 		memblock_phys_free(crash_base, crash_size);
 		return;
 	}
-- 
2.34.1

^ permalink raw reply related [flat|nested] 12+ messages in thread
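
The change in search order for crashkernel=,high can be illustrated outside the kernel with a small user-space model. This is only a sketch under stated assumptions: toy_alloc_top_down() is a hypothetical stand-in for memblock_phys_alloc_range() that handles a single free range, the constant values (a 4G boundary, a 2M alignment, a 48-bit upper limit) are illustrative rather than taken from the kernel headers, and the memory layout is chosen to reproduce the 512M example from the commit message.

```c
#include <stdint.h>

#define SZ_1M		(1ULL << 20)
#define SZ_1G		(1ULL << 30)

/* Stand-ins for the kernel constants; the values are assumptions. */
#define LOW_MAX		(4 * SZ_1G)	/* plays the role of CRASH_ADDR_LOW_MAX */
#define HIGH_MAX	(1ULL << 48)	/* plays the role of CRASH_ADDR_HIGH_MAX */
#define TOY_ALIGN	(2 * SZ_1M)	/* plays the role of CRASH_ALIGN */

/* One contiguous chunk of free memory, [start, end). */
struct toy_range {
	uint64_t start;
	uint64_t end;
};

/*
 * Hypothetical top-down allocator: return the highest aligned base such
 * that [base, base + size) fits inside both the free range and the
 * search window [lo, hi). Returns 0 on failure.
 */
static uint64_t toy_alloc_top_down(const struct toy_range *mem, uint64_t size,
				   uint64_t lo, uint64_t hi)
{
	uint64_t start = mem->start > lo ? mem->start : lo;
	uint64_t end = mem->end < hi ? mem->end : hi;
	uint64_t base;

	if (end <= start || end - start < size)
		return 0;

	base = (end - size) & ~(TOY_ALIGN - 1);
	return base >= start ? base : 0;
}

/* Before the patch: one top-down search over the whole address space. */
static uint64_t reserve_high_old(const struct toy_range *mem, uint64_t size)
{
	return toy_alloc_top_down(mem, size, 0, HIGH_MAX);
}

/* After the patch: search above the boundary first, then fall back below. */
static uint64_t reserve_high_new(const struct toy_range *mem, uint64_t size)
{
	uint64_t base = toy_alloc_top_down(mem, size, LOW_MAX, HIGH_MAX);

	if (!base)
		base = toy_alloc_top_down(mem, size, 0, LOW_MAX);
	return base;
}
```

With free memory modelled as [1G, 4G+386M) and a 512M request, reserve_high_old() returns 4G-126M, i.e. a region straddling the boundary, while reserve_high_new() fails above 4G (only 386M is available there) and falls back to 4G-512M, entirely in low memory.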
* Re: [PATCH 1/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high 2023-01-17 3:49 ` [PATCH 1/2] " Baoquan He @ 2023-01-20 9:04 ` Simon Horman 2023-01-24 2:28 ` Baoquan He 2023-01-24 17:36 ` Catalin Marinas 1 sibling, 1 reply; 12+ messages in thread From: Simon Horman @ 2023-01-20 9:04 UTC (permalink / raw) To: Baoquan He Cc: linux-kernel, kexec, linux-arm-kernel, catalin.marinas, will, thunder.leizhen, John.p.donnelly, wangkefeng.wang On Tue, Jan 17, 2023 at 11:49:20AM +0800, Baoquan He wrote: > On arm64, reservation for 'crashkernel=xM,high' is taken by searching for > suitable memory region up down. If the 'xM' of crashkernel high memory > is reserved from high memory successfully, it will try to reserve > crashkernel low memory later accoringly. Otherwise, it will try to search > low memory area for the 'xM' suitable region. > > While we observed an unexpected case where a reserved region crosses the > high and low meomry boundary. E.g on a system with 4G as low memory end, > user added the kernel parameters like: 'crashkernel=512M,high', it could > finally have [4G-126M, 4G+386M], [1G, 1G+128M] regions in running kernel. > This looks very strange because we have two low memory regions > [4G-126M, 4G] and [1G, 1G+128M]. Much explanation need be given to tell > why that happened. > > Here, for crashkernel=xM,high, search the high memory for the suitable > region above the high and low memory boundary. If failed, try reserving > the suitable region below the boundary. Like this, the crashkernel high > region will only exist in high memory, and crashkernel low region only > exists in low memory. The reservation behaviour for crashkernel=,high is > clearer and simpler. 
> > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > arch/arm64/mm/init.c | 30 +++++++++++++++++++++++------- > 1 file changed, 23 insertions(+), 7 deletions(-) > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > index 58a0bb2c17f1..26a05af2bfa8 100644 > --- a/arch/arm64/mm/init.c > +++ b/arch/arm64/mm/init.c > @@ -127,12 +127,13 @@ static int __init reserve_crashkernel_low(unsigned long long low_size) > */ > static void __init reserve_crashkernel(void) > { > - unsigned long long crash_base, crash_size; > - unsigned long long crash_low_size = 0; > + unsigned long long crash_base, crash_size, search_base; > unsigned long long crash_max = CRASH_ADDR_LOW_MAX; > + unsigned long long crash_low_size = 0; > char *cmdline = boot_command_line; > - int ret; > bool fixed_base = false; > + bool high = false; > + int ret; > > if (!IS_ENABLED(CONFIG_KEXEC_CORE)) > return; > @@ -155,7 +156,9 @@ static void __init reserve_crashkernel(void) > else if (ret) > return; > > + search_base = CRASH_ADDR_LOW_MAX; > crash_max = CRASH_ADDR_HIGH_MAX; > + high = true; > } else if (ret || !crash_size) { > /* The specified value is invalid */ > return; > @@ -166,31 +169,44 @@ static void __init reserve_crashkernel(void) > /* User specifies base address explicitly. */ > if (crash_base) { > fixed_base = true; > + search_base = crash_base; > crash_max = crash_base + crash_size; > } > > retry: > crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN, > - crash_base, crash_max); > + search_base, crash_max); > if (!crash_base) { > + if (fixed_base) { > + pr_warn("cannot reserve crashkernel region [0x%llx-0x%llx]\n", > + search_base, crash_max); > + return; > + } > + > /* > * If the first attempt was for low memory, fall back to > * high memory, the minimum required low memory will be > * reserved later. 
>  		 */
> -		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> +		if (!high && crash_max == CRASH_ADDR_LOW_MAX) {
>  			crash_max = CRASH_ADDR_HIGH_MAX;
> +			search_base = CRASH_ADDR_LOW_MAX;
>  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>  			goto retry;
>  		}
>
> +		if (high && (crash_max == CRASH_ADDR_HIGH_MAX)) {

nit: unnecessary (and inconsistent with code just above) parentheses.

> +			crash_max = CRASH_ADDR_LOW_MAX;
> +			search_base = 0;
> +			goto retry;
> +		}
>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>  			crash_size);
>  		return;
>  	}
>
> -	if ((crash_base > CRASH_ADDR_LOW_MAX - crash_low_size) &&
> -	     crash_low_size && reserve_crashkernel_low(crash_low_size)) {
> +	if ((crash_base >= CRASH_ADDR_LOW_MAX) && crash_low_size &&
> +	     reserve_crashkernel_low(crash_low_size)) {
>  		memblock_phys_free(crash_base, crash_size);
>  		return;
>  	}
> -- 
> 2.34.1

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high 2023-01-20 9:04 ` Simon Horman @ 2023-01-24 2:28 ` Baoquan He 0 siblings, 0 replies; 12+ messages in thread From: Baoquan He @ 2023-01-24 2:28 UTC (permalink / raw) To: Simon Horman Cc: linux-kernel, kexec, linux-arm-kernel, catalin.marinas, will, thunder.leizhen, John.p.donnelly, wangkefeng.wang On 01/20/23 at 10:04am, Simon Horman wrote: > On Tue, Jan 17, 2023 at 11:49:20AM +0800, Baoquan He wrote: > > On arm64, reservation for 'crashkernel=xM,high' is taken by searching for > > suitable memory region up down. If the 'xM' of crashkernel high memory > > is reserved from high memory successfully, it will try to reserve > > crashkernel low memory later accoringly. Otherwise, it will try to search > > low memory area for the 'xM' suitable region. > > > > While we observed an unexpected case where a reserved region crosses the > > high and low meomry boundary. E.g on a system with 4G as low memory end, > > user added the kernel parameters like: 'crashkernel=512M,high', it could > > finally have [4G-126M, 4G+386M], [1G, 1G+128M] regions in running kernel. > > This looks very strange because we have two low memory regions > > [4G-126M, 4G] and [1G, 1G+128M]. Much explanation need be given to tell > > why that happened. > > > > Here, for crashkernel=xM,high, search the high memory for the suitable > > region above the high and low memory boundary. If failed, try reserving > > the suitable region below the boundary. Like this, the crashkernel high > > region will only exist in high memory, and crashkernel low region only > > exists in low memory. The reservation behaviour for crashkernel=,high is > > clearer and simpler. 
> > > > Signed-off-by: Baoquan He <bhe@redhat.com> > > --- > > arch/arm64/mm/init.c | 30 +++++++++++++++++++++++------- > > 1 file changed, 23 insertions(+), 7 deletions(-) > > > > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > > index 58a0bb2c17f1..26a05af2bfa8 100644 > > --- a/arch/arm64/mm/init.c > > +++ b/arch/arm64/mm/init.c > > @@ -127,12 +127,13 @@ static int __init reserve_crashkernel_low(unsigned long long low_size) > > */ > > static void __init reserve_crashkernel(void) > > { > > - unsigned long long crash_base, crash_size; > > - unsigned long long crash_low_size = 0; > > + unsigned long long crash_base, crash_size, search_base; > > unsigned long long crash_max = CRASH_ADDR_LOW_MAX; > > + unsigned long long crash_low_size = 0; > > char *cmdline = boot_command_line; > > - int ret; > > bool fixed_base = false; > > + bool high = false; > > + int ret; > > > > if (!IS_ENABLED(CONFIG_KEXEC_CORE)) > > return; > > @@ -155,7 +156,9 @@ static void __init reserve_crashkernel(void) > > else if (ret) > > return; > > > > + search_base = CRASH_ADDR_LOW_MAX; > > crash_max = CRASH_ADDR_HIGH_MAX; > > + high = true; > > } else if (ret || !crash_size) { > > /* The specified value is invalid */ > > return; > > @@ -166,31 +169,44 @@ static void __init reserve_crashkernel(void) > > /* User specifies base address explicitly. */ > > if (crash_base) { > > fixed_base = true; > > + search_base = crash_base; > > crash_max = crash_base + crash_size; > > } > > > > retry: > > crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN, > > - crash_base, crash_max); > > + search_base, crash_max); > > if (!crash_base) { > > + if (fixed_base) { > > + pr_warn("cannot reserve crashkernel region [0x%llx-0x%llx]\n", > > + search_base, crash_max); > > + return; > > + } > > + > > /* > > * If the first attempt was for low memory, fall back to > > * high memory, the minimum required low memory will be > > * reserved later. 
> >  		 */
> > -		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> > +		if (!high && crash_max == CRASH_ADDR_LOW_MAX) {
> >  			crash_max = CRASH_ADDR_HIGH_MAX;
> > +			search_base = CRASH_ADDR_LOW_MAX;
> >  			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> >  			goto retry;
> >  		}
> >
> > +		if (high && (crash_max == CRASH_ADDR_HIGH_MAX)) {
>
> nit: unnecessary (and inconsistent with code just above) parentheses.

Indeed, will remove it. Thanks for reviewing.

>
> > +			crash_max = CRASH_ADDR_LOW_MAX;
> > +			search_base = 0;
> > +			goto retry;
> > +		}
> >  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
> >  			crash_size);
> >  		return;
> >  	}
> >
> > -	if ((crash_base > CRASH_ADDR_LOW_MAX - crash_low_size) &&
> > -	     crash_low_size && reserve_crashkernel_low(crash_low_size)) {
> > +	if ((crash_base >= CRASH_ADDR_LOW_MAX) && crash_low_size &&
> > +	     reserve_crashkernel_low(crash_low_size)) {
> >  		memblock_phys_free(crash_base, crash_size);
> >  		return;
> >  	}
> > -- 
> > 2.34.1

^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high 2023-01-17 3:49 ` [PATCH 1/2] " Baoquan He 2023-01-20 9:04 ` Simon Horman @ 2023-01-24 17:36 ` Catalin Marinas 2023-02-01 5:57 ` Baoquan He 1 sibling, 1 reply; 12+ messages in thread From: Catalin Marinas @ 2023-01-24 17:36 UTC (permalink / raw) To: Baoquan He Cc: linux-kernel, kexec, linux-arm-kernel, will, thunder.leizhen, John.p.donnelly, wangkefeng.wang On Tue, Jan 17, 2023 at 11:49:20AM +0800, Baoquan He wrote: > On arm64, reservation for 'crashkernel=xM,high' is taken by searching for > suitable memory region up down. If the 'xM' of crashkernel high memory > is reserved from high memory successfully, it will try to reserve > crashkernel low memory later accoringly. Otherwise, it will try to search > low memory area for the 'xM' suitable region. > > While we observed an unexpected case where a reserved region crosses the > high and low meomry boundary. E.g on a system with 4G as low memory end, > user added the kernel parameters like: 'crashkernel=512M,high', it could > finally have [4G-126M, 4G+386M], [1G, 1G+128M] regions in running kernel. > This looks very strange because we have two low memory regions > [4G-126M, 4G] and [1G, 1G+128M]. Much explanation need be given to tell > why that happened. > > Here, for crashkernel=xM,high, search the high memory for the suitable > region above the high and low memory boundary. If failed, try reserving > the suitable region below the boundary. Like this, the crashkernel high > region will only exist in high memory, and crashkernel low region only > exists in low memory. The reservation behaviour for crashkernel=,high is > clearer and simpler. Well, I guess it depends on how you look at the 'high' option: is it permitting to go into high addresses or forcing high addresses only? IIUC the x86 implementation has a similar behaviour to the arm64 one, it allows allocation across boundary. 
What x86 seems to do though is that if crash_base of the high allocation is below 4G, it gives up on further low allocation. On arm64 we had this initially but improved it slightly to check whether the low allocation is of sufficient size. In your example above, it is 126MB instead of 128MB, hence an explicit low allocation. Is the only problem that some users get confused? I don't see this as a significant issue. However, with your patch, there is a potential failure if there isn't sufficient memory to accommodate the request in either high or low ranges. -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
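
The "126MB instead of 128MB" point can be checked with the two conditions from the patch, lifted into standalone predicates. This is a sketch only: the function names are hypothetical, LOW_MAX stands in for CRASH_ADDR_LOW_MAX on a system with a 4G boundary, and the low size is passed in as a parameter rather than taken from DEFAULT_CRASH_KERNEL_LOW_SIZE.

```c
#include <stdbool.h>
#include <stdint.h>

#define SZ_1M	(1ULL << 20)
#define SZ_1G	(1ULL << 30)
#define LOW_MAX	(4 * SZ_1G)	/* assumed value of CRASH_ADDR_LOW_MAX */

/*
 * Condition removed by the patch: a separate low region is reserved
 * unless the high allocation already covers at least low_size bytes
 * below the boundary.
 */
static bool wants_low_region_old(uint64_t crash_base, uint64_t low_size)
{
	return low_size && crash_base > LOW_MAX - low_size;
}

/*
 * Condition added by the patch: a separate low region is reserved only
 * when the main allocation sits entirely at or above the boundary.
 */
static bool wants_low_region_new(uint64_t crash_base, uint64_t low_size)
{
	return low_size && crash_base >= LOW_MAX;
}
```

For the region [4G-126M, 4G+386M] with a 128M low size, wants_low_region_old() is true because 4G-126M lies above 4G-128M, which is why the extra [1G, 1G+128M] region appeared. Had the base been 4G-128M or lower, the old condition would have been false and no second low region would have been reserved.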
* Re: [PATCH 1/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high 2023-01-24 17:36 ` Catalin Marinas @ 2023-02-01 5:57 ` Baoquan He 2023-02-01 17:07 ` Catalin Marinas 0 siblings, 1 reply; 12+ messages in thread From: Baoquan He @ 2023-02-01 5:57 UTC (permalink / raw) To: Catalin Marinas Cc: linux-kernel, kexec, linux-arm-kernel, will, thunder.leizhen, John.p.donnelly, wangkefeng.wang On 01/24/23 at 05:36pm, Catalin Marinas wrote: > On Tue, Jan 17, 2023 at 11:49:20AM +0800, Baoquan He wrote: > > On arm64, reservation for 'crashkernel=xM,high' is taken by searching for > > suitable memory region up down. If the 'xM' of crashkernel high memory > > is reserved from high memory successfully, it will try to reserve > > crashkernel low memory later accoringly. Otherwise, it will try to search > > low memory area for the 'xM' suitable region. > > > > While we observed an unexpected case where a reserved region crosses the > > high and low meomry boundary. E.g on a system with 4G as low memory end, > > user added the kernel parameters like: 'crashkernel=512M,high', it could > > finally have [4G-126M, 4G+386M], [1G, 1G+128M] regions in running kernel. > > This looks very strange because we have two low memory regions > > [4G-126M, 4G] and [1G, 1G+128M]. Much explanation need be given to tell > > why that happened. > > > > Here, for crashkernel=xM,high, search the high memory for the suitable > > region above the high and low memory boundary. If failed, try reserving > > the suitable region below the boundary. Like this, the crashkernel high > > region will only exist in high memory, and crashkernel low region only > > exists in low memory. The reservation behaviour for crashkernel=,high is > > clearer and simpler. > Thanks for looking into this. Please see inline comments. > Well, I guess it depends on how you look at the 'high' option: is it > permitting to go into high addresses or forcing high addresses only? 
> IIUC the x86 implementation has a similar behaviour to the arm64 one, it
> allows allocation across boundary.

Hmm, x86 has no chance to allocate a memory region across the 4G boundary because it reserves many small regions near 4G to map firmware, PCI bus, etc. E.g. one x86 system has /proc/iomem as below. I haven't seen an x86 system which doesn't look like this.

[root@ ~]# cat /proc/iomem
00000000-00000fff : Reserved
00001000-0009fbff : System RAM
0009fc00-0009ffff : Reserved
000a0000-000bffff : PCI Bus 0000:00
000c0000-000c93ff : Video ROM
000c9800-000ca5ff : Adapter ROM
000ca800-000ccbff : Adapter ROM
000f0000-000fffff : Reserved
  000f0000-000fffff : System ROM
00100000-bffeffff : System RAM
  73200000-74001b07 : Kernel code
  74200000-74bebfff : Kernel rodata
  74c00000-75167cbf : Kernel data
  758a4000-75ffffff : Kernel bss
  af000000-beffffff : Crash kernel
bfff0000-bfffffff : Reserved
c0000000-febfffff : PCI Bus 0000:00
  fc000000-fdffffff : 0000:00:02.0
    fc000000-fdffffff : cirrus
  feb80000-febbffff : 0000:00:03.0
  febd0000-febd0fff : 0000:00:02.0
    febd0000-febd0fff : cirrus
  febd1000-febd1fff : 0000:00:03.0
  febd2000-febd2fff : 0000:00:04.0
  febd3000-febd3fff : 0000:00:06.0
  febd4000-febd4fff : 0000:00:07.0
  febd5000-febd5fff : 0000:00:08.0
  febd6000-febd6fff : 0000:00:09.0
  febd7000-febd7fff : 0000:00:0a.0
fec00000-fec003ff : IOAPIC 0
fee00000-fee00fff : Local APIC
feffc000-feffffff : Reserved
fffc0000-ffffffff : Reserved
100000000-13fffffff : System RAM

> What x86 seems to do though is that if crash_base of the high allocation
> is below 4G, it gives up on further low allocation. On arm64 we had this
> initially but improved it slightly to check whether the low allocation
> is of sufficient size. In your example above, it is 126MB instead of
> 128MB, hence an explicit low allocation.

Right. From the code, x86 tries to allocate the crashkernel high region top down. If the crashkernel high region is above 4G, it reserves 128M for crashkernel low. If it only allocates a region under 4G, no further action is taken.

But arm64 allocates crashkernel high memory top down and can cross the 4G boundary. This brings 3 issues:

1) For crashkernel=x,high, it could get a crashkernel high region across the 4G boundary. Then the user will see two memory regions under 4G and one memory region above 4G. The two low memory regions are confusing.

2) If people explicitly specify "crashkernel=x,high crashkernel=y,low" and y <= 128M, e.g. "crashkernel=256M,high crashkernel=64M,low", then when the crashkernel high region crosses the 4G boundary and the part of the crashkernel high reservation below 4G is bigger than y, the expected crashkernel low reservation will be skipped. But the expected crashkernel high reservation is shrunk and may not satisfy the user space requirement.

3) The boundary-crossing behaviour of the crashkernel high reservation differs from the x86 arch. From the distros' point of view, this brings inconsistency and confusion. Users need to dig into x86 and arm64 details to find out why.

For upstream kernel devs and maintainers, issue 3) may only have a slight impact, while issues 1) and 2) have actual effects. With a small code change to fix this, we can get simpler, more understandable crashkernel=,high reservation behaviour.

>
> Is the only problem that some users get confused? I don't see this as a
> significant issue. However, with your patch, there is a potential
> failure if there isn't sufficient memory to accommodate the request in
> either high or low ranges.

I think we don't need to worry about the potential failure. Before, without crashkernel=,high support, no matter how large the system memory was, you could only reserve crashkernel memory under 4G. With crashkernel=,high support, we don't have that limitation. If a system can only satisfy the crashkernel reservation across the 4G boundary, I think the user needs to consider decreasing the value of crashkernel=,high and trying again. However, the boundary-crossing reservation of the crashkernel high region brings obscure semantics and behaviour, and that is a problem we should fix.

Thanks
Baoquan

^ permalink raw reply [flat|nested] 12+ messages in thread
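
Issues 1) and 2) come down to how a boundary-crossing region splits into a part below 4G and a part above it. The arithmetic can be sketched as follows. The helper name and struct are hypothetical, LOW_MAX assumes a 4G boundary, and the base address used for the "crashkernel=256M,high crashkernel=64M,low" case (4G-100M) is an invented illustration, since the thread gives no concrete address for it.

```c
#include <stdint.h>

#define SZ_1M	(1ULL << 20)
#define SZ_1G	(1ULL << 30)
#define LOW_MAX	(4 * SZ_1G)	/* assumed low/high memory boundary */

/* Sizes of the parts of a region that lie below and above LOW_MAX. */
struct toy_split {
	uint64_t below;
	uint64_t above;
};

/* Split the region [base, base + size) at the boundary. */
static struct toy_split split_at_boundary(uint64_t base, uint64_t size)
{
	struct toy_split s = { 0, 0 };
	uint64_t end = base + size;

	if (base >= LOW_MAX) {
		s.above = size;			/* entirely in high memory */
	} else if (end <= LOW_MAX) {
		s.below = size;			/* entirely in low memory */
	} else {
		s.below = LOW_MAX - base;	/* straddles the boundary */
		s.above = end - LOW_MAX;
	}
	return s;
}
```

For the 512M region at 4G-126M from the commit message, this gives 126M below and 386M above the boundary. For a hypothetical 256M region at 4G-100M with a 64M low request, it gives 100M below and 156M above: the part below 4G exceeds 64M, so the pre-patch check skips the separate low reservation, and only 156M of the requested 256M ends up in high memory.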
* Re: [PATCH 1/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high 2023-02-01 5:57 ` Baoquan He @ 2023-02-01 17:07 ` Catalin Marinas 2023-02-02 2:55 ` Baoquan He 2023-02-03 9:55 ` Baoquan He 0 siblings, 2 replies; 12+ messages in thread From: Catalin Marinas @ 2023-02-01 17:07 UTC (permalink / raw) To: Baoquan He Cc: linux-kernel, kexec, linux-arm-kernel, will, thunder.leizhen, John.p.donnelly, wangkefeng.wang On Wed, Feb 01, 2023 at 01:57:17PM +0800, Baoquan He wrote: > On 01/24/23 at 05:36pm, Catalin Marinas wrote: > > On Tue, Jan 17, 2023 at 11:49:20AM +0800, Baoquan He wrote: > > > On arm64, reservation for 'crashkernel=xM,high' is taken by searching for > > > suitable memory region up down. If the 'xM' of crashkernel high memory > > > is reserved from high memory successfully, it will try to reserve > > > crashkernel low memory later accoringly. Otherwise, it will try to search > > > low memory area for the 'xM' suitable region. > > > > > > While we observed an unexpected case where a reserved region crosses the > > > high and low meomry boundary. E.g on a system with 4G as low memory end, > > > user added the kernel parameters like: 'crashkernel=512M,high', it could > > > finally have [4G-126M, 4G+386M], [1G, 1G+128M] regions in running kernel. > > > This looks very strange because we have two low memory regions > > > [4G-126M, 4G] and [1G, 1G+128M]. Much explanation need be given to tell > > > why that happened. > > > > > > Here, for crashkernel=xM,high, search the high memory for the suitable > > > region above the high and low memory boundary. If failed, try reserving > > > the suitable region below the boundary. Like this, the crashkernel high > > > region will only exist in high memory, and crashkernel low region only > > > exists in low memory. The reservation behaviour for crashkernel=,high is > > > clearer and simpler. 
> > > > Well, I guess it depends on how you look at the 'high' option: is it > > permitting to go into high addresses or forcing high addresses only? > > IIUC the x86 implementation has a similar behaviour to the arm64 one, it > > allows allocation across boundary. > > Hmm, x86 has no chance to allocate a memory region across 4G boundary > because it reserves many small regions to map firmware, pci bus, etc > near 4G. E.g one x86 system has /proc/iomem as below. I haven't seen a > x86 system which doesn't look like this. > > [root@ ~]# cat /proc/iomem [...] > fffc0000-ffffffff : Reserved > 100000000-13fffffff : System RAM Ah, that's why we don't see this problem on x86. Alright, for consistency I'm fine with having the same logic on arm64. I guess we don't need the additional check on whether the 'high' allocation reserved at least 128MB in the 'low' range. If it succeeded and the start is below 4GB, it's guaranteed that it got the full allocation in the 'low' range. I haven't checked whether your patch cleaned this up already, if not please do in the next version. And as already asked, please fold the comments with the same patch, it's easier to read. -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high 2023-02-01 17:07 ` Catalin Marinas @ 2023-02-02 2:55 ` Baoquan He 2023-02-03 9:55 ` Baoquan He 1 sibling, 0 replies; 12+ messages in thread From: Baoquan He @ 2023-02-02 2:55 UTC (permalink / raw) To: Catalin Marinas Cc: linux-kernel, kexec, linux-arm-kernel, will, thunder.leizhen, John.p.donnelly, wangkefeng.wang On 02/01/23 at 05:07pm, Catalin Marinas wrote: > On Wed, Feb 01, 2023 at 01:57:17PM +0800, Baoquan He wrote: > > On 01/24/23 at 05:36pm, Catalin Marinas wrote: > > > On Tue, Jan 17, 2023 at 11:49:20AM +0800, Baoquan He wrote: > > > > On arm64, reservation for 'crashkernel=xM,high' is taken by searching for > > > > suitable memory region up down. If the 'xM' of crashkernel high memory > > > > is reserved from high memory successfully, it will try to reserve > > > > crashkernel low memory later accoringly. Otherwise, it will try to search > > > > low memory area for the 'xM' suitable region. > > > > > > > > While we observed an unexpected case where a reserved region crosses the > > > > high and low meomry boundary. E.g on a system with 4G as low memory end, > > > > user added the kernel parameters like: 'crashkernel=512M,high', it could > > > > finally have [4G-126M, 4G+386M], [1G, 1G+128M] regions in running kernel. > > > > This looks very strange because we have two low memory regions > > > > [4G-126M, 4G] and [1G, 1G+128M]. Much explanation need be given to tell > > > > why that happened. > > > > > > > > Here, for crashkernel=xM,high, search the high memory for the suitable > > > > region above the high and low memory boundary. If failed, try reserving > > > > the suitable region below the boundary. Like this, the crashkernel high > > > > region will only exist in high memory, and crashkernel low region only > > > > exists in low memory. The reservation behaviour for crashkernel=,high is > > > > clearer and simpler. 
> > > > > > Well, I guess it depends on how you look at the 'high' option: is it > > > permitting to go into high addresses or forcing high addresses only? > > > IIUC the x86 implementation has a similar behaviour to the arm64 one, it > > > allows allocation across boundary. > > > > Hmm, x86 has no chance to allocate a memory region across 4G boundary > > because it reserves many small regions to map firmware, pci bus, etc > > near 4G. E.g one x86 system has /proc/iomem as below. I haven't seen a > > x86 system which doesn't look like this. > > > > [root@ ~]# cat /proc/iomem > [...] > > fffc0000-ffffffff : Reserved > > 100000000-13fffffff : System RAM > > Ah, that's why we don't see this problem on x86. > > Alright, for consistency I'm fine with having the same logic on arm64. I > guess we don't need the additional check on whether the 'high' > allocation reserved at least 128MB in the 'low' range. If it succeeded > and the start is below 4GB, it's guaranteed that it got the full > allocation in the 'low' range. I haven't checked whether your patch > cleaned this up already, if not please do in the next version. Yes, that checking has been cleaned away in this patch. > > And as already asked, please fold the comments with the same patch, it's > easier to read. Sure, will do. Thanks a lot for reviewing. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high
  2023-02-01 17:07 ` Catalin Marinas
  2023-02-02  2:55 ` Baoquan He
@ 2023-02-03  9:55 ` Baoquan He
From: Baoquan He @ 2023-02-03 9:55 UTC
To: Catalin Marinas
Cc: linux-kernel, kexec, linux-arm-kernel, will, thunder.leizhen, John.p.donnelly, wangkefeng.wang

Hi Catalin,

On 02/01/23 at 05:07pm, Catalin Marinas wrote:
> On Wed, Feb 01, 2023 at 01:57:17PM +0800, Baoquan He wrote:
> > On 01/24/23 at 05:36pm, Catalin Marinas wrote:
> > > On Tue, Jan 17, 2023 at 11:49:20AM +0800, Baoquan He wrote:
> > > > On arm64, reservation for 'crashkernel=xM,high' is taken by searching for
> > > > suitable memory region top down. If the 'xM' of crashkernel high memory
> > > > is reserved from high memory successfully, it will try to reserve
> > > > crashkernel low memory later accordingly. Otherwise, it will try to search
> > > > low memory area for the 'xM' suitable region.
> > > >
> > > > While we observed an unexpected case where a reserved region crosses the
> > > > high and low memory boundary. E.g on a system with 4G as low memory end,
> > > > user added the kernel parameters like: 'crashkernel=512M,high', it could
> > > > finally have [4G-126M, 4G+386M], [1G, 1G+128M] regions in running kernel.
> > > > This looks very strange because we have two low memory regions
> > > > [4G-126M, 4G] and [1G, 1G+128M]. Much explanation need be given to tell
> > > > why that happened.
> > > >
> > > > Here, for crashkernel=xM,high, search the high memory for the suitable
> > > > region above the high and low memory boundary. If failed, try reserving
> > > > the suitable region below the boundary. Like this, the crashkernel high
> > > > region will only exist in high memory, and crashkernel low region only
> > > > exists in low memory. The reservation behaviour for crashkernel=,high is
> > > > clearer and simpler.
> > >
> > > Well, I guess it depends on how you look at the 'high' option: is it
> > > permitting to go into high addresses or forcing high addresses only?
> > > IIUC the x86 implementation has a similar behaviour to the arm64 one, it
> > > allows allocation across boundary.
> >
> > Hmm, x86 has no chance to allocate a memory region across 4G boundary
> > because it reserves many small regions to map firmware, pci bus, etc
> > near 4G. E.g one x86 system has /proc/iomem as below. I haven't seen an
> > x86 system which doesn't look like this.
> >
> > [root@ ~]# cat /proc/iomem
> [...]
> > fffc0000-ffffffff : Reserved
> > 100000000-13fffffff : System RAM
>
> Ah, that's why we don't see this problem on x86.
>
> Alright, for consistency I'm fine with having the same logic on arm64. I
> guess we don't need the additional check on whether the 'high'
> allocation reserved at least 128MB in the 'low' range. If it succeeded
> and the start is below 4GB, it's guaranteed that it got the full
> allocation in the 'low' range. I haven't checked whether your patch
> cleaned this up already, if not please do in the next version.
>
> And as already asked, please fold the comments with the same patch, it's
> easier to read.

I have updated the patch according to your and Simon's suggestions, and
resent it as v2.

By the way, could you please have a look at the patchset below, to see
which solution we should take to solve the problem spotted on arm64?

===
arm64, kdump: enforce to take 4G as the crashkernel low memory end
https://lore.kernel.org/all/20220828005545.94389-1-bhe@redhat.com/T/#u

After thorough discussion, I think the problem and its root cause are
very clear to us. However, which way to solve it hasn't been decided.
In our distros, RHEL and Fedora, we enable both CONFIG_ZONE_DMA and
CONFIG_ZONE_DMA32 by default, and need to set crashkernel= on the
command line. And we don't set the 'rodata=' kernel parameter unless
we have to.

I am fine with either dropping the protection on the crashkernel
region, or taking the approach in my patchset.

Thanks
Baoquan
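The boundary-crossing case described in the quoted commit message can be checked with a few lines of arithmetic. What follows is only a minimal user-space sketch, not kernel code: `struct region`, `crosses_boundary()`, `bytes_below()` and the `LOW_MEM_END` constant are names invented here for illustration, and the 4G boundary is taken from the example in the thread rather than from any particular platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_1M		(1ULL << 20)
#define SZ_1G		(1ULL << 30)
/* Assumption: low memory ends at 4G, as in the example from the thread. */
#define LOW_MEM_END	(4 * SZ_1G)

/* Hypothetical half-open physical range [start, end). */
struct region {
	uint64_t start;
	uint64_t end;
};

/* True if the region has bytes on both sides of the boundary. */
static bool crosses_boundary(struct region r, uint64_t boundary)
{
	assert(r.start < r.end);
	return r.start < boundary && r.end > boundary;
}

/* Number of bytes of the region that lie below the boundary. */
static uint64_t bytes_below(struct region r, uint64_t boundary)
{
	assert(r.start < r.end);
	if (r.start >= boundary)
		return 0;
	return (r.end < boundary ? r.end : boundary) - r.start;
}
```

With the reported region [4G-126M, 4G+386M], `crosses_boundary()` is true and `bytes_below()` gives 126M, which is the unexpected extra low memory chunk next to the dedicated [1G, 1G+128M] low reservation.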
* [PATCH 2/2] arm64/kdump: add code comments for crashkernel reservation cases
  2023-01-17  3:49 [PATCH 0/2] arm64: kdump: simplify the reservation behaviour of crashkernel=,high Baoquan He
  2023-01-17  3:49 ` [PATCH 1/2] " Baoquan He
@ 2023-01-17  3:49 ` Baoquan He
  2023-01-20  9:02   ` Simon Horman
From: Baoquan He @ 2023-01-17 3:49 UTC
To: linux-kernel
Cc: kexec, linux-arm-kernel, catalin.marinas, will, thunder.leizhen, John.p.donnelly, wangkefeng.wang, Baoquan He

This will help understand codes on crashkernel reservations on arm64.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/arm64/mm/init.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 26a05af2bfa8..f88ad17cb20d 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -177,6 +177,10 @@ static void __init reserve_crashkernel(void)
 	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
 					       search_base, crash_max);
 	if (!crash_base) {
+		/*
+		 * For crashkernel=size[KMG]@offset[KMG], print out failure
+		 * message if can't reserve the specified region.
+		 */
 		if (fixed_base) {
 			pr_warn("cannot reserve crashkernel region [0x%llx-0x%llx]\n",
 				search_base, crash_max);
@@ -188,6 +192,11 @@ static void __init reserve_crashkernel(void)
 		 * high memory, the minimum required low memory will be
 		 * reserved later.
 		 */
+		/*
+		 * For crashkernel=size[KMG], if the first attempt was for
+		 * low memory, fall back to high memory, the minimum required
+		 * low memory will be reserved later.
+		 */
 		if (!high && crash_max == CRASH_ADDR_LOW_MAX) {
 			crash_max = CRASH_ADDR_HIGH_MAX;
 			search_base = CRASH_ADDR_LOW_MAX;
@@ -195,6 +204,10 @@ static void __init reserve_crashkernel(void)
 			goto retry;
 		}
 
+		/*
+		 * For crashkernel=size[KMG],high, if the first attempt was for
+		 * high memory, fall back to low memory.
+		 */
 		if (high && (crash_max == CRASH_ADDR_HIGH_MAX)) {
 			crash_max = CRASH_ADDR_LOW_MAX;
 			search_base = 0;
-- 
2.34.1
* Re: [PATCH 2/2] arm64/kdump: add code comments for crashkernel reservation cases
  2023-01-17  3:49 ` [PATCH 2/2] arm64/kdump: add code comments for crashkernel reservation cases Baoquan He
@ 2023-01-20  9:02 ` Simon Horman
  2023-01-24  2:49   ` Baoquan He
From: Simon Horman @ 2023-01-20 9:02 UTC
To: Baoquan He
Cc: linux-kernel, kexec, linux-arm-kernel, catalin.marinas, will, thunder.leizhen, John.p.donnelly, wangkefeng.wang

On Tue, Jan 17, 2023 at 11:49:21AM +0800, Baoquan He wrote:
> This will help understand codes on crashkernel reservations on arm64.

FWIW, I think you can fold this into the first patch.

And, although I have no good idea at this moment, I do wonder
if the logic can be simplified - I for one really needed the
comments to understand the retry logic.

> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  arch/arm64/mm/init.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 26a05af2bfa8..f88ad17cb20d 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -177,6 +177,10 @@ static void __init reserve_crashkernel(void)
>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>  					       search_base, crash_max);
>  	if (!crash_base) {
> +		/*
> +		 * For crashkernel=size[KMG]@offset[KMG], print out failure
> +		 * message if can't reserve the specified region.
> +		 */
>  		if (fixed_base) {
>  			pr_warn("cannot reserve crashkernel region [0x%llx-0x%llx]\n",
>  				search_base, crash_max);
> @@ -188,6 +192,11 @@ static void __init reserve_crashkernel(void)
>  		 * high memory, the minimum required low memory will be
>  		 * reserved later.
>  		 */
> +		/*
> +		 * For crashkernel=size[KMG], if the first attempt was for
> +		 * low memory, fall back to high memory, the minimum required
> +		 * low memory will be reserved later.
> +		 */

I think this duplicates the preceding comment.
Perhaps just replace the earlier comment with this one.

>  		if (!high && crash_max == CRASH_ADDR_LOW_MAX) {
>  			crash_max = CRASH_ADDR_HIGH_MAX;
>  			search_base = CRASH_ADDR_LOW_MAX;
> @@ -195,6 +204,10 @@ static void __init reserve_crashkernel(void)
>  			goto retry;
>  		}
>  
> +		/*
> +		 * For crashkernel=size[KMG],high, if the first attempt was for
> +		 * high memory, fall back to low memory.
> +		 */
>  		if (high && (crash_max == CRASH_ADDR_HIGH_MAX)) {
>  			crash_max = CRASH_ADDR_LOW_MAX;
>  			search_base = 0;
> -- 
> 2.34.1
* Re: [PATCH 2/2] arm64/kdump: add code comments for crashkernel reservation cases
  2023-01-20  9:02 ` Simon Horman
@ 2023-01-24  2:49 ` Baoquan He
From: Baoquan He @ 2023-01-24 2:49 UTC
To: Simon Horman
Cc: linux-kernel, kexec, linux-arm-kernel, catalin.marinas, will, thunder.leizhen, John.p.donnelly, wangkefeng.wang

On 01/20/23 at 10:02am, Simon Horman wrote:
> On Tue, Jan 17, 2023 at 11:49:21AM +0800, Baoquan He wrote:
> > This will help understand codes on crashkernel reservations on arm64.
>
> FWIW, I think you can fold this into the first patch.

Sure, will do. I had folded this into patch 1 at the very beginning,
then felt the added code comments increased the size of the change and
made the code change less straightforward. I admit that was just my
personal feeling.

>
> And, although I have no good idea at this moment, I do wonder
> if the logic can be simplified - I for one really needed the
> comments to understand the retry logic.

Got it. I will consider whether I can improve the readability of the
logic. Thanks.

>
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  arch/arm64/mm/init.c | 13 +++++++++++++
> >  1 file changed, 13 insertions(+)
> >
> > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> > index 26a05af2bfa8..f88ad17cb20d 100644
> > --- a/arch/arm64/mm/init.c
> > +++ b/arch/arm64/mm/init.c
> > @@ -177,6 +177,10 @@ static void __init reserve_crashkernel(void)
> >  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >  					       search_base, crash_max);
> >  	if (!crash_base) {
> > +		/*
> > +		 * For crashkernel=size[KMG]@offset[KMG], print out failure
> > +		 * message if can't reserve the specified region.
> > +		 */
> >  		if (fixed_base) {
> >  			pr_warn("cannot reserve crashkernel region [0x%llx-0x%llx]\n",
> >  				search_base, crash_max);
> > @@ -188,6 +192,11 @@ static void __init reserve_crashkernel(void)
> >  		 * high memory, the minimum required low memory will be
> >  		 * reserved later.
> >  		 */
> > +		/*
> > +		 * For crashkernel=size[KMG], if the first attempt was for
> > +		 * low memory, fall back to high memory, the minimum required
> > +		 * low memory will be reserved later.
> > +		 */
>
> I think this duplicates the preceding comment.
> Perhaps just replace the earlier comment with this one.
>
> >  		if (!high && crash_max == CRASH_ADDR_LOW_MAX) {
> >  			crash_max = CRASH_ADDR_HIGH_MAX;
> >  			search_base = CRASH_ADDR_LOW_MAX;
> > @@ -195,6 +204,10 @@ static void __init reserve_crashkernel(void)
> >  			goto retry;
> >  		}
> >  
> > +		/*
> > +		 * For crashkernel=size[KMG],high, if the first attempt was for
> > +		 * high memory, fall back to low memory.
> > +		 */
> >  		if (high && (crash_max == CRASH_ADDR_HIGH_MAX)) {
> >  			crash_max = CRASH_ADDR_LOW_MAX;
> >  			search_base = 0;
> > -- 
> > 2.34.1
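The retry cases that the added comments describe can be modelled outside the kernel. The following is a rough sketch under stated assumptions, not the code in arch/arm64/mm/init.c: `reserve_sketch()`, `alloc_fn` and the two toy allocators are hypothetical names, `LOW_MAX` and `HIGH_MAX` are stand-ins for CRASH_ADDR_LOW_MAX and CRASH_ADDR_HIGH_MAX, the crashkernel=size@offset case is left out, and the first search window for crashkernel=,high is assumed to start at the low memory end as the cover letter describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for CRASH_ADDR_LOW_MAX / CRASH_ADDR_HIGH_MAX (assumed values). */
#define LOW_MAX		(4ULL << 30)
#define HIGH_MAX	UINT64_MAX

/* Hypothetical allocator: returns the base of a free range, or 0 on failure. */
typedef uint64_t (*alloc_fn)(uint64_t size, uint64_t min, uint64_t max);

/*
 * Try the preferred window first, then fall back exactly once:
 *   crashkernel=size       low window first, then the high window
 *   crashkernel=size,high  high window first, then the low window
 */
static uint64_t reserve_sketch(uint64_t size, bool high, alloc_fn alloc)
{
	uint64_t search_base = high ? LOW_MAX : 0;
	uint64_t crash_max = high ? HIGH_MAX : LOW_MAX;
	uint64_t base = alloc(size, search_base, crash_max);

	if (base)
		return base;

	if (high) {
		/* high attempt failed: fall back to low memory */
		search_base = 0;
		crash_max = LOW_MAX;
	} else {
		/* low attempt failed: fall back to high memory */
		search_base = LOW_MAX;
		crash_max = HIGH_MAX;
	}
	return alloc(size, search_base, crash_max);
}

/* Toy allocator: only high memory has free space. */
static uint64_t alloc_high_only(uint64_t size, uint64_t min, uint64_t max)
{
	(void)size;
	(void)max;
	return min >= LOW_MAX ? min : 0;
}

/* Toy allocator: only low memory has free space, handed out top down. */
static uint64_t alloc_low_only(uint64_t size, uint64_t min, uint64_t max)
{
	(void)min;
	return max <= LOW_MAX ? max - size : 0;
}
```

Because each window lies entirely on one side of the boundary, a successful allocation in this model can never straddle it, which is the property the series is after.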