* [PATCH] MIPS: loongson64: fix boot failure
@ 2023-12-25 9:30 Huang Pei
2024-01-09 21:40 ` Thomas Bogendoerfer
2024-01-15 14:08 ` memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure) Jiaxun Yang
0 siblings, 2 replies; 50+ messages in thread
From: Huang Pei @ 2023-12-25 9:30 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
this:
----------------------------------------------------------------------
Call Trace:
[<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
[<ffffffff82333940>] memblock_free_all+0x104/0x2a8
[<ffffffff8231d8e4>] mem_init+0x84/0x94
[<ffffffff82330958>] mm_core_init+0xf8/0x308
[<ffffffff82318c38>] start_kernel+0x43c/0x86c
Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
64420070 00003025 24040003
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
----------------------------------------------------------------------
The root cause is no memory region "0x0-0x1fffff" paired with
memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
=debug":
----------------------------------------------------------------------
memory[0x0] [0x0000000000200000-0x000000000effffff],
0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
memory[0x1] [0x0000000090000000-0x00000000fdffffff],
0x000000006e000000 bytes on node 0 flags: 0x0
memory[0x2] [0x0000000100000000-0x000000027fffffff],
0x0000000180000000 bytes on node 0 flags: 0x0
memory[0x3] [0x0000100000000000-0x000010000fffffff],
0x0000000010000000 bytes on node 1 flags: 0x0
memory[0x4] [0x0000100090000000-0x000010027fffffff],
0x00000001f0000000 bytes on node 1 flags: 0x0
reserved.cnt = 0x1f
reserved[0x0] [0x0000000000000000-0x000000000190c80a],
0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
reserved[0x1] [0x000000000190c810-0x000000000190eea3],
0x0000000000002694 bytes flags: 0x0
----------------------------------------------------------------------
It caused memory-reserved region "0x0-0x1fffff" without valid node id
in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
"reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
accessing "node_data" out of bound.
To fix this bug, we should remove unnecessary memory block reservation.
+. no need to reserve 0x0-0x1fffff below kernel loading address, since
it is not registered by "memblock_add_node"
+. no need to reserve 0x0-0xfff for exception handling if it is not
registered by "memblock_add" either.
Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
Signed-off-by: Huang Pei <huangpei@loongson.cn>
---
arch/mips/kernel/traps.c | 3 ++-
arch/mips/loongson64/numa.c | 2 --
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
index 246c6a6b0261..9b632b4c10c3 100644
--- a/arch/mips/kernel/traps.c
+++ b/arch/mips/kernel/traps.c
@@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
void reserve_exception_space(phys_addr_t addr, unsigned long size)
{
- memblock_reserve(addr, size);
+ if(memblock_is_region_memory(addr, size))
+ memblock_reserve(addr, size);
}
void __init *set_except_vector(int n, void *addr)
diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
index 8f61e93c0c5b..0f516dde81da 100644
--- a/arch/mips/loongson64/numa.c
+++ b/arch/mips/loongson64/numa.c
@@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
memblock_reserve((node_addrspace_offset | 0xfe000000),
32 << 20);
- /* Reserve pfn range 0~node[0]->node_start_pfn */
- memblock_reserve(0, PAGE_SIZE * start_pfn);
}
}
--
2.20.1
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH] MIPS: loongson64: fix boot failure
2023-12-25 9:30 [PATCH] MIPS: loongson64: fix boot failure Huang Pei
@ 2024-01-09 21:40 ` Thomas Bogendoerfer
2024-01-13 9:55 ` [PATCH V2]: " Huang Pei
` (4 more replies)
2024-01-15 14:08 ` memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure) Jiaxun Yang
1 sibling, 5 replies; 50+ messages in thread
From: Thomas Bogendoerfer @ 2024-01-09 21:40 UTC (permalink / raw)
To: Huang Pei
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
On Mon, Dec 25, 2023 at 05:30:25PM +0800, Huang Pei wrote:
> arch/mips/loongson64/numa.c | 2 --
> 2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> index 246c6a6b0261..9b632b4c10c3 100644
> --- a/arch/mips/kernel/traps.c
> +++ b/arch/mips/kernel/traps.c
> @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
>
> void reserve_exception_space(phys_addr_t addr, unsigned long size)
> {
> - memblock_reserve(addr, size);
> + if(memblock_is_region_memory(addr, size))
> + memblock_reserve(addr, size);
> }
I don't think this is a good idea, reserve_exception_space() is called
in cpu_probe() and at that point memblock setup hasn't been done. So
we end up in a situation commit bd67b711bfaa ("MIPS: kernel: Reserve
exception base early to prevent corruption") fixed.
Thomas.
--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH V2]: MIPS: loongson64: fix boot failure
2024-01-09 21:40 ` Thomas Bogendoerfer
@ 2024-01-13 9:55 ` Huang Pei
2024-01-13 9:55 ` [PATCH 1/3] MIPS: adjust exception vector space revervation Huang Pei
` (2 more replies)
2024-01-18 12:39 ` [PATCH V3]: MIPS: loongson64: fix booting failure Huang Pei
` (3 subsequent siblings)
4 siblings, 3 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-13 9:55 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
There are two issues here:
+ one is that misuse of memblock_reserve, memblock_reserve is allocating
memory (owned by kernel) for kernel use, but the misuse is reserving
memory (not owned by kernel). Actually, not memblock_add{,_note}ing
memory is right use of reserving memory not owned by kernel.
+ the other one is the using memblock_reserve without knowing whether the
memory is owned by kernel, but this is an issue for MIPS R2+, so just
bypass it;
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH 1/3] MIPS: adjust exception vector space revervation
2024-01-13 9:55 ` [PATCH V2]: " Huang Pei
@ 2024-01-13 9:55 ` Huang Pei
2024-01-13 9:55 ` [PATCH 2/3] MIPS: loongson64: fix booting failure Huang Pei
2024-01-13 9:55 ` [PATCH 3/3] Revert "MIPS: Loongson64: Handle more memory types passed from firmware" Huang Pei
2 siblings, 0 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-13 9:55 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
MIPS R2+ use CP0_Ebase to setup a new exception vector space, instead
of using 0x0-0x1000, so it is unnecessary to reserve 0x0-0x1000 for
MIPS R2+. If this range is not within valid memoryblock.memory range,
memory initialization with CONFIG_DEFERRED_STRUCT_PAGE_INIT failed on
3A1000+(3A1000+ needs other machine-specific fix ) after 61167ad5fecd
("mm: pass nid to reserve_bootmem_region()")
Call reserve_exception_space(0,0x1000) only ONCE whether UP or SMP.
Signed-off-by: Huang Pei <huangpei@loongson.cn>
---
arch/mips/kernel/cpu-probe.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index b406d8bfb15a..3da35e20d70d 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -1992,12 +1992,14 @@ void cpu_probe(void)
*/
loongson3_cpucfg_synthesize_data(c);
+ if (cpu == 0) {
#ifdef CONFIG_64BIT
- if (cpu == 0)
__ua_limit = ~((1ull << cpu_vmbits) - 1);
#endif
+ if(!cpu_has_mips_r2_r6)
+ reserve_exception_space(0, 0x1000);
+ }
- reserve_exception_space(0, 0x1000);
}
void cpu_report(void)
--
2.30.2
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 2/3] MIPS: loongson64: fix booting failure
2024-01-13 9:55 ` [PATCH V2]: " Huang Pei
2024-01-13 9:55 ` [PATCH 1/3] MIPS: adjust exception vector space revervation Huang Pei
@ 2024-01-13 9:55 ` Huang Pei
2024-01-13 9:55 ` [PATCH 3/3] Revert "MIPS: Loongson64: Handle more memory types passed from firmware" Huang Pei
2 siblings, 0 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-13 9:55 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
Inappropriate memblock_reserve cause loongson64 booting failed with
CONFIG_DEFERRED_STRUCT_PAGE_INIT like this:
----------------------------------------------------------------------
Call Trace:
[<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
[<ffffffff82333940>] memblock_free_all+0x104/0x2a8
[<ffffffff8231d8e4>] mem_init+0x84/0x94
[<ffffffff82330958>] mm_core_init+0xf8/0x308
[<ffffffff82318c38>] start_kernel+0x43c/0x86c
Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
64420070 00003025 24040003
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
----------------------------------------------------------------------
The root cause is no memory region "0x0-0x1fffff" paired with
memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
=debug":
----------------------------------------------------------------------
memory[0x0] [0x0000000000200000-0x000000000effffff],
0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
memory[0x1] [0x0000000090000000-0x00000000fdffffff],
0x000000006e000000 bytes on node 0 flags: 0x0
memory[0x2] [0x0000000100000000-0x000000027fffffff],
0x0000000180000000 bytes on node 0 flags: 0x0
memory[0x3] [0x0000100000000000-0x000010000fffffff],
0x0000000010000000 bytes on node 1 flags: 0x0
memory[0x4] [0x0000100090000000-0x000010027fffffff],
0x00000001f0000000 bytes on node 1 flags: 0x0
reserved.cnt = 0x1f
reserved[0x0] [0x0000000000000000-0x000000000190c80a],
0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
reserved[0x1] [0x000000000190c810-0x000000000190eea3],
0x0000000000002694 bytes flags: 0x0
----------------------------------------------------------------------
To fix this bug, we should remove unnecessary memory block reservation.
Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
Signed-off-by: Huang Pei <huangpei@loongson.cn>
---
arch/mips/loongson64/numa.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
index 8f61e93c0c5b..9b4ce40c1d40 100644
--- a/arch/mips/loongson64/numa.c
+++ b/arch/mips/loongson64/numa.c
@@ -129,9 +129,6 @@ static void __init node_mem_init(unsigned int node)
if (node_end_pfn(0) >= (0xffffffff >> PAGE_SHIFT))
memblock_reserve((node_addrspace_offset | 0xfe000000),
32 << 20);
-
- /* Reserve pfn range 0~node[0]->node_start_pfn */
- memblock_reserve(0, PAGE_SIZE * start_pfn);
}
}
--
2.30.2
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 3/3] Revert "MIPS: Loongson64: Handle more memory types passed from firmware"
2024-01-13 9:55 ` [PATCH V2]: " Huang Pei
2024-01-13 9:55 ` [PATCH 1/3] MIPS: adjust exception vector space revervation Huang Pei
2024-01-13 9:55 ` [PATCH 2/3] MIPS: loongson64: fix booting failure Huang Pei
@ 2024-01-13 9:55 ` Huang Pei
2024-01-13 11:59 ` Jiaxun Yang
2 siblings, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-13 9:55 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
This reverts commit 264927e3538169fe2973a5732f03ea01b0f9f9e8.
The SMBIOS memory reserved region(0xfffe000-0xfffe7ff) is not OWNED
by kenel, here OWNED means the region is within the physical memory
given to kernel by firmware as SYSTEM_RAM_{LOW,HIGH}.
Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
this:
----------------------------------------------------------------------
Call Trace:
[<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
[<ffffffff82333940>] memblock_free_all+0x104/0x2a8
[<ffffffff8231d8e4>] mem_init+0x84/0x94
[<ffffffff82330958>] mm_core_init+0xf8/0x308
[<ffffffff82318c38>] start_kernel+0x43c/0x86c
Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
64420070 00003025 24040003
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
----------------------------------------------------------------------
This is another case of inappropriate usage of memblock_reserve
---
.../include/asm/mach-loongson64/boot_param.h | 6 +--
arch/mips/loongson64/init.c | 42 +++++++------------
2 files changed, 17 insertions(+), 31 deletions(-)
diff --git a/arch/mips/include/asm/mach-loongson64/boot_param.h b/arch/mips/include/asm/mach-loongson64/boot_param.h
index e007edd6b60a..c454ef734c45 100644
--- a/arch/mips/include/asm/mach-loongson64/boot_param.h
+++ b/arch/mips/include/asm/mach-loongson64/boot_param.h
@@ -14,11 +14,7 @@
#define ADAPTER_ROM 8
#define ACPI_TABLE 9
#define SMBIOS_TABLE 10
-#define UMA_VIDEO_RAM 11
-#define VUMA_VIDEO_RAM 12
-#define MAX_MEMORY_TYPE 13
-
-#define MEM_SIZE_IS_IN_BYTES (1 << 31)
+#define MAX_MEMORY_TYPE 11
#define LOONGSON3_BOOT_MEM_MAP_MAX 128
struct efi_memory_map_loongson {
diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
index f25caa6aa9d3..d62262f93069 100644
--- a/arch/mips/loongson64/init.c
+++ b/arch/mips/loongson64/init.c
@@ -49,7 +49,8 @@ void virtual_early_config(void)
void __init szmem(unsigned int node)
{
u32 i, mem_type;
- phys_addr_t node_id, mem_start, mem_size;
+ static unsigned long num_physpages;
+ u64 node_id, node_psize, start_pfn, end_pfn, mem_start, mem_size;
/* Otherwise come from DTB */
if (loongson_sysconf.fw_interface != LOONGSON_LEFI)
@@ -63,38 +64,27 @@ void __init szmem(unsigned int node)
mem_type = loongson_memmap->map[i].mem_type;
mem_size = loongson_memmap->map[i].mem_size;
-
- /* Memory size comes in MB if MEM_SIZE_IS_IN_BYTES not set */
- if (mem_size & MEM_SIZE_IS_IN_BYTES)
- mem_size &= ~MEM_SIZE_IS_IN_BYTES;
- else
- mem_size = mem_size << 20;
-
- mem_start = (node_id << 44) | loongson_memmap->map[i].mem_start;
+ mem_start = loongson_memmap->map[i].mem_start;
switch (mem_type) {
case SYSTEM_RAM_LOW:
case SYSTEM_RAM_HIGH:
- case UMA_VIDEO_RAM:
- pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes usable\n",
- (u32)node_id, mem_type, &mem_start, &mem_size);
- memblock_add_node(mem_start, mem_size, node,
+ start_pfn = ((node_id << 44) + mem_start) >> PAGE_SHIFT;
+ node_psize = (mem_size << 20) >> PAGE_SHIFT;
+ end_pfn = start_pfn + node_psize;
+ num_physpages += node_psize;
+ pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
+ (u32)node_id, mem_type, mem_start, mem_size);
+ pr_info(" start_pfn:0x%llx, end_pfn:0x%llx, num_physpages:0x%lx\n",
+ start_pfn, end_pfn, num_physpages);
+ memblock_add_node(PFN_PHYS(start_pfn),
+ PFN_PHYS(node_psize), node,
MEMBLOCK_NONE);
break;
case SYSTEM_RAM_RESERVED:
- case VIDEO_ROM:
- case ADAPTER_ROM:
- case ACPI_TABLE:
- case SMBIOS_TABLE:
- pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes reserved\n",
- (u32)node_id, mem_type, &mem_start, &mem_size);
- memblock_reserve(mem_start, mem_size);
- break;
- /* We should not reserve VUMA_VIDEO_RAM as it overlaps with MMIO */
- case VUMA_VIDEO_RAM:
- default:
- pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes unhandled\n",
- (u32)node_id, mem_type, &mem_start, &mem_size);
+ pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
+ (u32)node_id, mem_type, mem_start, mem_size);
+ memblock_reserve(((node_id << 44) + mem_start), mem_size << 20);
break;
}
}
--
2.30.2
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH 3/3] Revert "MIPS: Loongson64: Handle more memory types passed from firmware"
2024-01-13 9:55 ` [PATCH 3/3] Revert "MIPS: Loongson64: Handle more memory types passed from firmware" Huang Pei
@ 2024-01-13 11:59 ` Jiaxun Yang
2024-01-14 8:53 ` Huang Pei
0 siblings, 1 reply; 50+ messages in thread
From: Jiaxun Yang @ 2024-01-13 11:59 UTC (permalink / raw)
To: Huang Pei, Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Paul Burton, Li Xuefeng, Yang Tiezhu,
Gao Juxin, Huacai Chen
在 2024/1/13 09:55, Huang Pei 写道:
> This reverts commit 264927e3538169fe2973a5732f03ea01b0f9f9e8.
>
> The SMBIOS memory reserved region(0xfffe000-0xfffe7ff) is not OWNED
> by kenel, here OWNED means the region is within the physical memory
> given to kernel by firmware as SYSTEM_RAM_{LOW,HIGH}.
>
> Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
> loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
> this:
> ----------------------------------------------------------------------
> Call Trace:
> [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
> [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
> [<ffffffff8231d8e4>] mem_init+0x84/0x94
> [<ffffffff82330958>] mm_core_init+0xf8/0x308
> [<ffffffff82318c38>] start_kernel+0x43c/0x86c
>
> Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
> 64420070 00003025 24040003
>
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Attempted to kill the idle task!
> ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> ----------------------------------------------------------------------
>
> This is another case of inappropriate usage of memblock_reserve
In my test system with kunlun firmware they are actually covered by
SYSRAM type.
IMO, better to add those memory to memblock as well in your case.
Thanks
- Jiaxun
> ---
> .../include/asm/mach-loongson64/boot_param.h | 6 +--
> arch/mips/loongson64/init.c | 42 +++++++------------
> 2 files changed, 17 insertions(+), 31 deletions(-)
>
> diff --git a/arch/mips/include/asm/mach-loongson64/boot_param.h b/arch/mips/include/asm/mach-loongson64/boot_param.h
> index e007edd6b60a..c454ef734c45 100644
> --- a/arch/mips/include/asm/mach-loongson64/boot_param.h
> +++ b/arch/mips/include/asm/mach-loongson64/boot_param.h
> @@ -14,11 +14,7 @@
> #define ADAPTER_ROM 8
> #define ACPI_TABLE 9
> #define SMBIOS_TABLE 10
> -#define UMA_VIDEO_RAM 11
> -#define VUMA_VIDEO_RAM 12
> -#define MAX_MEMORY_TYPE 13
> -
> -#define MEM_SIZE_IS_IN_BYTES (1 << 31)
> +#define MAX_MEMORY_TYPE 11
>
> #define LOONGSON3_BOOT_MEM_MAP_MAX 128
> struct efi_memory_map_loongson {
> diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
> index f25caa6aa9d3..d62262f93069 100644
> --- a/arch/mips/loongson64/init.c
> +++ b/arch/mips/loongson64/init.c
> @@ -49,7 +49,8 @@ void virtual_early_config(void)
> void __init szmem(unsigned int node)
> {
> u32 i, mem_type;
> - phys_addr_t node_id, mem_start, mem_size;
> + static unsigned long num_physpages;
> + u64 node_id, node_psize, start_pfn, end_pfn, mem_start, mem_size;
>
> /* Otherwise come from DTB */
> if (loongson_sysconf.fw_interface != LOONGSON_LEFI)
> @@ -63,38 +64,27 @@ void __init szmem(unsigned int node)
>
> mem_type = loongson_memmap->map[i].mem_type;
> mem_size = loongson_memmap->map[i].mem_size;
> -
> - /* Memory size comes in MB if MEM_SIZE_IS_IN_BYTES not set */
> - if (mem_size & MEM_SIZE_IS_IN_BYTES)
> - mem_size &= ~MEM_SIZE_IS_IN_BYTES;
> - else
> - mem_size = mem_size << 20;
> -
> - mem_start = (node_id << 44) | loongson_memmap->map[i].mem_start;
> + mem_start = loongson_memmap->map[i].mem_start;
>
> switch (mem_type) {
> case SYSTEM_RAM_LOW:
> case SYSTEM_RAM_HIGH:
> - case UMA_VIDEO_RAM:
> - pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes usable\n",
> - (u32)node_id, mem_type, &mem_start, &mem_size);
> - memblock_add_node(mem_start, mem_size, node,
> + start_pfn = ((node_id << 44) + mem_start) >> PAGE_SHIFT;
> + node_psize = (mem_size << 20) >> PAGE_SHIFT;
> + end_pfn = start_pfn + node_psize;
> + num_physpages += node_psize;
> + pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
> + (u32)node_id, mem_type, mem_start, mem_size);
> + pr_info(" start_pfn:0x%llx, end_pfn:0x%llx, num_physpages:0x%lx\n",
> + start_pfn, end_pfn, num_physpages);
> + memblock_add_node(PFN_PHYS(start_pfn),
> + PFN_PHYS(node_psize), node,
> MEMBLOCK_NONE);
> break;
> case SYSTEM_RAM_RESERVED:
> - case VIDEO_ROM:
> - case ADAPTER_ROM:
> - case ACPI_TABLE:
> - case SMBIOS_TABLE:
> - pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes reserved\n",
> - (u32)node_id, mem_type, &mem_start, &mem_size);
> - memblock_reserve(mem_start, mem_size);
> - break;
> - /* We should not reserve VUMA_VIDEO_RAM as it overlaps with MMIO */
> - case VUMA_VIDEO_RAM:
> - default:
> - pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes unhandled\n",
> - (u32)node_id, mem_type, &mem_start, &mem_size);
> + pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
> + (u32)node_id, mem_type, mem_start, mem_size);
> + memblock_reserve(((node_id << 44) + mem_start), mem_size << 20);
> break;
> }
> }
--
---
Jiaxun Yang
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 3/3] Revert "MIPS: Loongson64: Handle more memory types passed from firmware"
2024-01-13 11:59 ` Jiaxun Yang
@ 2024-01-14 8:53 ` Huang Pei
2024-01-14 11:58 ` Jiaxun Yang
0 siblings, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-14 8:53 UTC (permalink / raw)
To: Jiaxun Yang
Cc: Thomas Bogendoerfer, Bibo Mao, linux-mips, Paul Burton,
Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen
On Sat, Jan 13, 2024 at 11:59:11AM +0000, Jiaxun Yang wrote:
>
>
> 在 2024/1/13 09:55, Huang Pei 写道:
> > This reverts commit 264927e3538169fe2973a5732f03ea01b0f9f9e8.
> >
> > The SMBIOS memory reserved region(0xfffe000-0xfffe7ff) is not OWNED
> > by kenel, here OWNED means the region is within the physical memory
> > given to kernel by firmware as SYSTEM_RAM_{LOW,HIGH}.
> >
> > Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
> > loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
> > this:
> > ----------------------------------------------------------------------
> > Call Trace:
> > [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
> > [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
> > [<ffffffff8231d8e4>] mem_init+0x84/0x94
> > [<ffffffff82330958>] mm_core_init+0xf8/0x308
> > [<ffffffff82318c38>] start_kernel+0x43c/0x86c
> >
> > Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
> > 64420070 00003025 24040003
> >
> > ---[ end trace 0000000000000000 ]---
> > Kernel panic - not syncing: Attempted to kill the idle task!
> > ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> > ----------------------------------------------------------------------
> >
> > This is another case of inappropriate usage of memblock_reserve
>
> In my test system with kunlun firmware they are actually covered by SYSRAM
> type.
> IMO, better to add those memory to memblock as well in your case.
>
My test machine is PMON-based 3B1500, the SMBIOS_TABLE located in
0xfffe000-0xfffe7ff, which is not included in SYSTEM_RAM_LOW(0x200000-
0xeffffff), I think we need a check like,
----------------------------------------------------------------------
if(memblock_is_region_memory(addr, size))
memblock_reserve(addr, size);
----------------------------------------------------------------------
to support both cases;
> Thanks
> - Jiaxun
> > ---
> > .../include/asm/mach-loongson64/boot_param.h | 6 +--
> > arch/mips/loongson64/init.c | 42 +++++++------------
> > 2 files changed, 17 insertions(+), 31 deletions(-)
> >
> > diff --git a/arch/mips/include/asm/mach-loongson64/boot_param.h b/arch/mips/include/asm/mach-loongson64/boot_param.h
> > index e007edd6b60a..c454ef734c45 100644
> > --- a/arch/mips/include/asm/mach-loongson64/boot_param.h
> > +++ b/arch/mips/include/asm/mach-loongson64/boot_param.h
> > @@ -14,11 +14,7 @@
> > #define ADAPTER_ROM 8
> > #define ACPI_TABLE 9
> > #define SMBIOS_TABLE 10
> > -#define UMA_VIDEO_RAM 11
> > -#define VUMA_VIDEO_RAM 12
> > -#define MAX_MEMORY_TYPE 13
> > -
> > -#define MEM_SIZE_IS_IN_BYTES (1 << 31)
> > +#define MAX_MEMORY_TYPE 11
> > #define LOONGSON3_BOOT_MEM_MAP_MAX 128
> > struct efi_memory_map_loongson {
> > diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
> > index f25caa6aa9d3..d62262f93069 100644
> > --- a/arch/mips/loongson64/init.c
> > +++ b/arch/mips/loongson64/init.c
> > @@ -49,7 +49,8 @@ void virtual_early_config(void)
> > void __init szmem(unsigned int node)
> > {
> > u32 i, mem_type;
> > - phys_addr_t node_id, mem_start, mem_size;
> > + static unsigned long num_physpages;
> > + u64 node_id, node_psize, start_pfn, end_pfn, mem_start, mem_size;
> > /* Otherwise come from DTB */
> > if (loongson_sysconf.fw_interface != LOONGSON_LEFI)
> > @@ -63,38 +64,27 @@ void __init szmem(unsigned int node)
> > mem_type = loongson_memmap->map[i].mem_type;
> > mem_size = loongson_memmap->map[i].mem_size;
> > -
> > - /* Memory size comes in MB if MEM_SIZE_IS_IN_BYTES not set */
> > - if (mem_size & MEM_SIZE_IS_IN_BYTES)
> > - mem_size &= ~MEM_SIZE_IS_IN_BYTES;
> > - else
> > - mem_size = mem_size << 20;
> > -
> > - mem_start = (node_id << 44) | loongson_memmap->map[i].mem_start;
> > + mem_start = loongson_memmap->map[i].mem_start;
> > switch (mem_type) {
> > case SYSTEM_RAM_LOW:
> > case SYSTEM_RAM_HIGH:
> > - case UMA_VIDEO_RAM:
> > - pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes usable\n",
> > - (u32)node_id, mem_type, &mem_start, &mem_size);
> > - memblock_add_node(mem_start, mem_size, node,
> > + start_pfn = ((node_id << 44) + mem_start) >> PAGE_SHIFT;
> > + node_psize = (mem_size << 20) >> PAGE_SHIFT;
> > + end_pfn = start_pfn + node_psize;
> > + num_physpages += node_psize;
> > + pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
> > + (u32)node_id, mem_type, mem_start, mem_size);
> > + pr_info(" start_pfn:0x%llx, end_pfn:0x%llx, num_physpages:0x%lx\n",
> > + start_pfn, end_pfn, num_physpages);
> > + memblock_add_node(PFN_PHYS(start_pfn),
> > + PFN_PHYS(node_psize), node,
> > MEMBLOCK_NONE);
> > break;
> > case SYSTEM_RAM_RESERVED:
> > - case VIDEO_ROM:
> > - case ADAPTER_ROM:
> > - case ACPI_TABLE:
> > - case SMBIOS_TABLE:
> > - pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes reserved\n",
> > - (u32)node_id, mem_type, &mem_start, &mem_size);
> > - memblock_reserve(mem_start, mem_size);
> > - break;
> > - /* We should not reserve VUMA_VIDEO_RAM as it overlaps with MMIO */
> > - case VUMA_VIDEO_RAM:
> > - default:
> > - pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes unhandled\n",
> > - (u32)node_id, mem_type, &mem_start, &mem_size);
> > + pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
> > + (u32)node_id, mem_type, mem_start, mem_size);
> > + memblock_reserve(((node_id << 44) + mem_start), mem_size << 20);
> > break;
> > }
> > }
>
> --
> ---
> Jiaxun Yang
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 3/3] Revert "MIPS: Loongson64: Handle more memory types passed from firmware"
2024-01-14 8:53 ` Huang Pei
@ 2024-01-14 11:58 ` Jiaxun Yang
2024-01-15 1:25 ` Huang Pei
0 siblings, 1 reply; 50+ messages in thread
From: Jiaxun Yang @ 2024-01-14 11:58 UTC (permalink / raw)
To: Huang Pei
Cc: Thomas Bogendoerfer, Bibo Mao, linux-mips, Paul Burton,
Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen
在 2024/1/14 08:53, Huang Pei 写道:
> On Sat, Jan 13, 2024 at 11:59:11AM +0000, Jiaxun Yang wrote:
[...]
>
> In my test system with kunlun firmware they are actually covered by SYSRAM
> type.
> IMO, better to add those memory to memblock as well in your case.
>
> My test machine is PMON-based 3B1500, the SMBIOS_TABLE located in
> 0xfffe000-0xfffe7ff, which is not included in SYSTEM_RAM_LOW(0x200000-
> 0xeffffff), I think we need a check like,
> ----------------------------------------------------------------------
> if(memblock_is_region_memory(addr, size))
> memblock_reserve(addr, size);
> ----------------------------------------------------------------------
> to support both cases;
Then we are running into ordering issue. memblock_add of SYSRAM may
later then reservation.
What about unconditionally add those ranges to memblock? As it's certain
that those regions will
be RAM.
Thanks
- Jiaxun
>> Thanks
>> - Jiaxun
>>> ---
>>> .../include/asm/mach-loongson64/boot_param.h | 6 +--
>>> arch/mips/loongson64/init.c | 42 +++++++------------
>>> 2 files changed, 17 insertions(+), 31 deletions(-)
>>>
>>> diff --git a/arch/mips/include/asm/mach-loongson64/boot_param.h b/arch/mips/include/asm/mach-loongson64/boot_param.h
>>> index e007edd6b60a..c454ef734c45 100644
>>> --- a/arch/mips/include/asm/mach-loongson64/boot_param.h
>>> +++ b/arch/mips/include/asm/mach-loongson64/boot_param.h
>>> @@ -14,11 +14,7 @@
>>> #define ADAPTER_ROM 8
>>> #define ACPI_TABLE 9
>>> #define SMBIOS_TABLE 10
>>> -#define UMA_VIDEO_RAM 11
>>> -#define VUMA_VIDEO_RAM 12
>>> -#define MAX_MEMORY_TYPE 13
>>> -
>>> -#define MEM_SIZE_IS_IN_BYTES (1 << 31)
>>> +#define MAX_MEMORY_TYPE 11
>>> #define LOONGSON3_BOOT_MEM_MAP_MAX 128
>>> struct efi_memory_map_loongson {
>>> diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
>>> index f25caa6aa9d3..d62262f93069 100644
>>> --- a/arch/mips/loongson64/init.c
>>> +++ b/arch/mips/loongson64/init.c
>>> @@ -49,7 +49,8 @@ void virtual_early_config(void)
>>> void __init szmem(unsigned int node)
>>> {
>>> u32 i, mem_type;
>>> - phys_addr_t node_id, mem_start, mem_size;
>>> + static unsigned long num_physpages;
>>> + u64 node_id, node_psize, start_pfn, end_pfn, mem_start, mem_size;
>>> /* Otherwise come from DTB */
>>> if (loongson_sysconf.fw_interface != LOONGSON_LEFI)
>>> @@ -63,38 +64,27 @@ void __init szmem(unsigned int node)
>>> mem_type = loongson_memmap->map[i].mem_type;
>>> mem_size = loongson_memmap->map[i].mem_size;
>>> -
>>> - /* Memory size comes in MB if MEM_SIZE_IS_IN_BYTES not set */
>>> - if (mem_size & MEM_SIZE_IS_IN_BYTES)
>>> - mem_size &= ~MEM_SIZE_IS_IN_BYTES;
>>> - else
>>> - mem_size = mem_size << 20;
>>> -
>>> - mem_start = (node_id << 44) | loongson_memmap->map[i].mem_start;
>>> + mem_start = loongson_memmap->map[i].mem_start;
>>> switch (mem_type) {
>>> case SYSTEM_RAM_LOW:
>>> case SYSTEM_RAM_HIGH:
>>> - case UMA_VIDEO_RAM:
>>> - pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes usable\n",
>>> - (u32)node_id, mem_type, &mem_start, &mem_size);
>>> - memblock_add_node(mem_start, mem_size, node,
>>> + start_pfn = ((node_id << 44) + mem_start) >> PAGE_SHIFT;
>>> + node_psize = (mem_size << 20) >> PAGE_SHIFT;
>>> + end_pfn = start_pfn + node_psize;
>>> + num_physpages += node_psize;
>>> + pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
>>> + (u32)node_id, mem_type, mem_start, mem_size);
>>> + pr_info(" start_pfn:0x%llx, end_pfn:0x%llx, num_physpages:0x%lx\n",
>>> + start_pfn, end_pfn, num_physpages);
>>> + memblock_add_node(PFN_PHYS(start_pfn),
>>> + PFN_PHYS(node_psize), node,
>>> MEMBLOCK_NONE);
>>> break;
>>> case SYSTEM_RAM_RESERVED:
>>> - case VIDEO_ROM:
>>> - case ADAPTER_ROM:
>>> - case ACPI_TABLE:
>>> - case SMBIOS_TABLE:
>>> - pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes reserved\n",
>>> - (u32)node_id, mem_type, &mem_start, &mem_size);
>>> - memblock_reserve(mem_start, mem_size);
>>> - break;
>>> - /* We should not reserve VUMA_VIDEO_RAM as it overlaps with MMIO */
>>> - case VUMA_VIDEO_RAM:
>>> - default:
>>> - pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes unhandled\n",
>>> - (u32)node_id, mem_type, &mem_start, &mem_size);
>>> + pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
>>> + (u32)node_id, mem_type, mem_start, mem_size);
>>> + memblock_reserve(((node_id << 44) + mem_start), mem_size << 20);
>>> break;
>>> }
>>> }
>> --
>> ---
>> Jiaxun Yang
>
--
---
Jiaxun Yang
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 3/3] Revert "MIPS: Loongson64: Handle more memory types passed from firmware"
2024-01-14 11:58 ` Jiaxun Yang
@ 2024-01-15 1:25 ` Huang Pei
2024-01-15 14:14 ` Jiaxun Yang
0 siblings, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-15 1:25 UTC (permalink / raw)
To: Jiaxun Yang
Cc: Thomas Bogendoerfer, Bibo Mao, linux-mips, Paul Burton,
Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen
On Sun, Jan 14, 2024 at 11:58:25AM +0000, Jiaxun Yang wrote:
>
>
> 在 2024/1/14 08:53, Huang Pei 写道:
> > On Sat, Jan 13, 2024 at 11:59:11AM +0000, Jiaxun Yang wrote:
> [...]
> >
> > In my test system with kunlun firmware they are actually covered by SYSRAM
> > type.
> > IMO, better to add those memory to memblock as well in your case.
> >
> > My test machine is PMON-based 3B1500, the SMBIOS_TABLE located in
> > 0xfffe000-0xfffe7ff, which is not included in SYSTEM_RAM_LOW(0x200000-
> > 0xeffffff), I think we need a check like,
> > ----------------------------------------------------------------------
> > if(memblock_is_region_memory(addr, size))
> > memblock_reserve(addr, size);
> > ----------------------------------------------------------------------
> > to support both cases;
>
> Then we are running into ordering issue. memblock_add of SYSRAM may later
> then reservation.
> What about unconditionally add those ranges to memblock? As it's certain
> that those regions will
> be RAM.
>
I think we can do szmem twice, one for memblock.memory, the other for
memblock_reserve.
> Thanks
> - Jiaxun
>
> > > Thanks
> > > - Jiaxun
> > > > ---
> > > > .../include/asm/mach-loongson64/boot_param.h | 6 +--
> > > > arch/mips/loongson64/init.c | 42 +++++++------------
> > > > 2 files changed, 17 insertions(+), 31 deletions(-)
> > > >
> > > > diff --git a/arch/mips/include/asm/mach-loongson64/boot_param.h b/arch/mips/include/asm/mach-loongson64/boot_param.h
> > > > index e007edd6b60a..c454ef734c45 100644
> > > > --- a/arch/mips/include/asm/mach-loongson64/boot_param.h
> > > > +++ b/arch/mips/include/asm/mach-loongson64/boot_param.h
> > > > @@ -14,11 +14,7 @@
> > > > #define ADAPTER_ROM 8
> > > > #define ACPI_TABLE 9
> > > > #define SMBIOS_TABLE 10
> > > > -#define UMA_VIDEO_RAM 11
> > > > -#define VUMA_VIDEO_RAM 12
> > > > -#define MAX_MEMORY_TYPE 13
> > > > -
> > > > -#define MEM_SIZE_IS_IN_BYTES (1 << 31)
> > > > +#define MAX_MEMORY_TYPE 11
> > > > #define LOONGSON3_BOOT_MEM_MAP_MAX 128
> > > > struct efi_memory_map_loongson {
> > > > diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
> > > > index f25caa6aa9d3..d62262f93069 100644
> > > > --- a/arch/mips/loongson64/init.c
> > > > +++ b/arch/mips/loongson64/init.c
> > > > @@ -49,7 +49,8 @@ void virtual_early_config(void)
> > > > void __init szmem(unsigned int node)
> > > > {
> > > > u32 i, mem_type;
> > > > - phys_addr_t node_id, mem_start, mem_size;
> > > > + static unsigned long num_physpages;
> > > > + u64 node_id, node_psize, start_pfn, end_pfn, mem_start, mem_size;
> > > > /* Otherwise come from DTB */
> > > > if (loongson_sysconf.fw_interface != LOONGSON_LEFI)
> > > > @@ -63,38 +64,27 @@ void __init szmem(unsigned int node)
> > > > mem_type = loongson_memmap->map[i].mem_type;
> > > > mem_size = loongson_memmap->map[i].mem_size;
> > > > -
> > > > - /* Memory size comes in MB if MEM_SIZE_IS_IN_BYTES not set */
> > > > - if (mem_size & MEM_SIZE_IS_IN_BYTES)
> > > > - mem_size &= ~MEM_SIZE_IS_IN_BYTES;
> > > > - else
> > > > - mem_size = mem_size << 20;
> > > > -
> > > > - mem_start = (node_id << 44) | loongson_memmap->map[i].mem_start;
> > > > + mem_start = loongson_memmap->map[i].mem_start;
> > > > switch (mem_type) {
> > > > case SYSTEM_RAM_LOW:
> > > > case SYSTEM_RAM_HIGH:
> > > > - case UMA_VIDEO_RAM:
> > > > - pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes usable\n",
> > > > - (u32)node_id, mem_type, &mem_start, &mem_size);
> > > > - memblock_add_node(mem_start, mem_size, node,
> > > > + start_pfn = ((node_id << 44) + mem_start) >> PAGE_SHIFT;
> > > > + node_psize = (mem_size << 20) >> PAGE_SHIFT;
> > > > + end_pfn = start_pfn + node_psize;
> > > > + num_physpages += node_psize;
> > > > + pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
> > > > + (u32)node_id, mem_type, mem_start, mem_size);
> > > > + pr_info(" start_pfn:0x%llx, end_pfn:0x%llx, num_physpages:0x%lx\n",
> > > > + start_pfn, end_pfn, num_physpages);
> > > > + memblock_add_node(PFN_PHYS(start_pfn),
> > > > + PFN_PHYS(node_psize), node,
> > > > MEMBLOCK_NONE);
> > > > break;
> > > > case SYSTEM_RAM_RESERVED:
> > > > - case VIDEO_ROM:
> > > > - case ADAPTER_ROM:
> > > > - case ACPI_TABLE:
> > > > - case SMBIOS_TABLE:
> > > > - pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes reserved\n",
> > > > - (u32)node_id, mem_type, &mem_start, &mem_size);
> > > > - memblock_reserve(mem_start, mem_size);
> > > > - break;
> > > > - /* We should not reserve VUMA_VIDEO_RAM as it overlaps with MMIO */
> > > > - case VUMA_VIDEO_RAM:
> > > > - default:
> > > > - pr_info("Node %d, mem_type:%d\t[%pa], %pa bytes unhandled\n",
> > > > - (u32)node_id, mem_type, &mem_start, &mem_size);
> > > > + pr_info("Node%d: mem_type:%d, mem_start:0x%llx, mem_size:0x%llx MB\n",
> > > > + (u32)node_id, mem_type, mem_start, mem_size);
> > > > + memblock_reserve(((node_id << 44) + mem_start), mem_size << 20);
> > > > break;
> > > > }
> > > > }
> > > --
> > > ---
> > > Jiaxun Yang
> >
>
> --
> ---
> Jiaxun Yang
^ permalink raw reply [flat|nested] 50+ messages in thread
* memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure)
2023-12-25 9:30 [PATCH] MIPS: loongson64: fix boot failure Huang Pei
2024-01-09 21:40 ` Thomas Bogendoerfer
@ 2024-01-15 14:08 ` Jiaxun Yang
2024-01-16 3:27 ` Huang Pei
2024-01-16 8:39 ` Mike Rapoport
1 sibling, 2 replies; 50+ messages in thread
From: Jiaxun Yang @ 2024-01-15 14:08 UTC (permalink / raw)
To: rppt, linux-mm
Cc: Bibo Mao, linux-mips, Paul Burton, Li Xuefeng, Yang Tiezhu,
Gao Juxin, Huacai Chen, Huang Pei, Thomas Bogendoerfer,
linux-kernel
Hi mm folks,
Just a quick question, what is the expected behavior of memblock_reserve
a region that is not added to memblock with memblock_add?
I'm unable to find any documentation about memblock_reserve in comments and
boot-time-mm, but as per my understanding to the code, this should be a
legit usage?
In practical we run into uninitialized nid of reserved block problem,
should we fix it
in our usage, or on memblock side?
Thanks
在 2023/12/25 09:30, Huang Pei 写道:
> Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
> loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
> this:
> ----------------------------------------------------------------------
> Call Trace:
> [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
> [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
> [<ffffffff8231d8e4>] mem_init+0x84/0x94
> [<ffffffff82330958>] mm_core_init+0xf8/0x308
> [<ffffffff82318c38>] start_kernel+0x43c/0x86c
>
> Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
> 64420070 00003025 24040003
>
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Attempted to kill the idle task!
> ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> ----------------------------------------------------------------------
>
> The root cause is no memory region "0x0-0x1fffff" paired with
> memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
> =debug":
>
> ----------------------------------------------------------------------
> memory[0x0] [0x0000000000200000-0x000000000effffff],
> 0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
> memory[0x1] [0x0000000090000000-0x00000000fdffffff],
> 0x000000006e000000 bytes on node 0 flags: 0x0
> memory[0x2] [0x0000000100000000-0x000000027fffffff],
> 0x0000000180000000 bytes on node 0 flags: 0x0
> memory[0x3] [0x0000100000000000-0x000010000fffffff],
> 0x0000000010000000 bytes on node 1 flags: 0x0
> memory[0x4] [0x0000100090000000-0x000010027fffffff],
> 0x00000001f0000000 bytes on node 1 flags: 0x0
> reserved.cnt = 0x1f
> reserved[0x0] [0x0000000000000000-0x000000000190c80a],
> 0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
> reserved[0x1] [0x000000000190c810-0x000000000190eea3],
> 0x0000000000002694 bytes flags: 0x0
> ----------------------------------------------------------------------
>
> It caused memory-reserved region "0x0-0x1fffff" without valid node id
> in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
> "reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
> accessing "node_data" out of bound.
>
> To fix this bug, we should remove unnecessary memory block reservation.
>
> +. no need to reserve 0x0-0x1fffff below kernel loading address, since
> it is not registered by "memblock_add_node"
>
> +. no need to reserve 0x0-0xfff for exception handling if it is not
> registered by "memblock_add" either.
>
> Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
> Signed-off-by: Huang Pei <huangpei@loongson.cn>
> ---
> arch/mips/kernel/traps.c | 3 ++-
> arch/mips/loongson64/numa.c | 2 --
> 2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> index 246c6a6b0261..9b632b4c10c3 100644
> --- a/arch/mips/kernel/traps.c
> +++ b/arch/mips/kernel/traps.c
> @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
>
> void reserve_exception_space(phys_addr_t addr, unsigned long size)
> {
> - memblock_reserve(addr, size);
> + if(memblock_is_region_memory(addr, size))
> + memblock_reserve(addr, size);
> }
>
> void __init *set_except_vector(int n, void *addr)
> diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> index 8f61e93c0c5b..0f516dde81da 100644
> --- a/arch/mips/loongson64/numa.c
> +++ b/arch/mips/loongson64/numa.c
> @@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
> memblock_reserve((node_addrspace_offset | 0xfe000000),
> 32 << 20);
>
> - /* Reserve pfn range 0~node[0]->node_start_pfn */
> - memblock_reserve(0, PAGE_SIZE * start_pfn);
> }
> }
>
--
---
Jiaxun Yang
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 3/3] Revert "MIPS: Loongson64: Handle more memory types passed from firmware"
2024-01-15 1:25 ` Huang Pei
@ 2024-01-15 14:14 ` Jiaxun Yang
2024-01-16 3:10 ` Huang Pei
0 siblings, 1 reply; 50+ messages in thread
From: Jiaxun Yang @ 2024-01-15 14:14 UTC (permalink / raw)
To: Huang Pei
Cc: Thomas Bogendoerfer, Bibo Mao, linux-mips, Paul Burton,
Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen
在 2024/1/15 01:25, Huang Pei 写道:
> On Sun, Jan 14, 2024 at 11:58:25AM +0000, Jiaxun Yang wrote:
>>
>> 在 2024/1/14 08:53, Huang Pei 写道:
>>> On Sat, Jan 13, 2024 at 11:59:11AM +0000, Jiaxun Yang wrote:
>> [...]
>>> In my test system with kunlun firmware they are actually covered by SYSRAM
>>> type.
>>> IMO, better to add those memory to memblock as well in your case.
>>>
>>> My test machine is PMON-based 3B1500, the SMBIOS_TABLE located in
>>> 0xfffe000-0xfffe7ff, which is not included in SYSTEM_RAM_LOW(0x200000-
>>> 0xeffffff), I think we need a check like,
>>> ----------------------------------------------------------------------
>>> if(memblock_is_region_memory(addr, size))
>>> memblock_reserve(addr, size);
>>> ----------------------------------------------------------------------
>>> to support both cases;
>> Then we are running into ordering issue. memblock_add of SYSRAM may later
>> then reservation.
>> What about unconditionally add those ranges to memblock? As it's certain
>> that those regions will
>> be RAM.
>>
> I think we can do szmem twice, one for memblock.memory, the other for
> memblock_reserve.
IMO this is a little bit insane.
LoongArch made a workaround to set all reserved memory to node zero [1].
[1]:
https://lore.kernel.org/all/20230911092810.3108092-1-chenhuacai@loongson.cn/
Thanks
--
---
Jiaxun Yang
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 3/3] Revert "MIPS: Loongson64: Handle more memory types passed from firmware"
2024-01-15 14:14 ` Jiaxun Yang
@ 2024-01-16 3:10 ` Huang Pei
0 siblings, 0 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-16 3:10 UTC (permalink / raw)
To: Jiaxun Yang
Cc: Thomas Bogendoerfer, Bibo Mao, linux-mips, Paul Burton,
Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen
On Mon, Jan 15, 2024 at 02:14:14PM +0000, Jiaxun Yang wrote:
>
>
> 在 2024/1/15 01:25, Huang Pei 写道:
> > On Sun, Jan 14, 2024 at 11:58:25AM +0000, Jiaxun Yang wrote:
> > >
> > > 在 2024/1/14 08:53, Huang Pei 写道:
> > > > On Sat, Jan 13, 2024 at 11:59:11AM +0000, Jiaxun Yang wrote:
> > > [...]
> > > > In my test system with kunlun firmware they are actually covered by SYSRAM
> > > > type.
> > > > IMO, better to add those memory to memblock as well in your case.
> > > >
> > > > My test machine is PMON-based 3B1500, the SMBIOS_TABLE located in
> > > > 0xfffe000-0xfffe7ff, which is not included in SYSTEM_RAM_LOW(0x200000-
> > > > 0xeffffff), I think we need a check like,
> > > > ----------------------------------------------------------------------
> > > > if(memblock_is_region_memory(addr, size))
> > > > memblock_reserve(addr, size);
> > > > ----------------------------------------------------------------------
> > > > to support both cases;
> > > Then we are running into ordering issue. memblock_add of SYSRAM may later
> > > then reservation.
> > > What about unconditionally add those ranges to memblock? As it's certain
> > > that those regions will
> > > be RAM.
> > >
> > I think we can do szmem twice, one for memblock.memory, the other for
> > memblock_reserve.
> IMO this is a little bit insane.
> LoongArch made a workaround to set all reserved memory to node zero [1].
>
> [1]:
> https://lore.kernel.org/all/20230911092810.3108092-1-chenhuacai@loongson.cn/
>
I think they forgot MIPS totally. Their fix is right, because all the
reserved memory is located on Node 0 by now. If we keep reserved
memory within range of the memoryblock.memory and keep this conresponcy,
then there would not be this bug. My fix is just another way to keep this
conresponcy and make the memory ownership more clear between kernel and
firmware.
> Thanks
>
> --
> ---
> Jiaxun Yang
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure)
2024-01-15 14:08 ` memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure) Jiaxun Yang
@ 2024-01-16 3:27 ` Huang Pei
2024-01-16 8:39 ` Mike Rapoport
1 sibling, 0 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-16 3:27 UTC (permalink / raw)
To: Jiaxun Yang
Cc: rppt, linux-mm, Bibo Mao, linux-mips, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen, Thomas Bogendoerfer,
linux-kernel
On Mon, Jan 15, 2024 at 02:08:21PM +0000, Jiaxun Yang wrote:
> Hi mm folks,
>
> Just a quick question, what is the expected behavior of memblock_reserve
> a region that is not added to memblock with memblock_add?
>
> I'm unable to find any documentation about memblock_reserve in comments and
> boot-time-mm, but as per my understanding to the code, this should be a
> legit usage?
>
> In practical we run into uninitialized nid of reserved block problem, should
> we fix it
> in our usage, or on memblock side?
>
> Thanks
I have same question. Kernel make a dmi info(see dmi_string, or add
memblock=debug) copy.
If reserved SMIBIOS memory belong to kernel, no need to make a copy.
But if not, how could it be included in a range of memblock.memory?
>
> 在 2023/12/25 09:30, Huang Pei 写道:
> > Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
> > loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
> > this:
> > ----------------------------------------------------------------------
> > Call Trace:
> > [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
> > [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
> > [<ffffffff8231d8e4>] mem_init+0x84/0x94
> > [<ffffffff82330958>] mm_core_init+0xf8/0x308
> > [<ffffffff82318c38>] start_kernel+0x43c/0x86c
> >
> > Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
> > 64420070 00003025 24040003
> >
> > ---[ end trace 0000000000000000 ]---
> > Kernel panic - not syncing: Attempted to kill the idle task!
> > ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> > ----------------------------------------------------------------------
> >
> > The root cause is no memory region "0x0-0x1fffff" paired with
> > memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
> > =debug":
> >
> > ----------------------------------------------------------------------
> > memory[0x0] [0x0000000000200000-0x000000000effffff],
> > 0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
> > memory[0x1] [0x0000000090000000-0x00000000fdffffff],
> > 0x000000006e000000 bytes on node 0 flags: 0x0
> > memory[0x2] [0x0000000100000000-0x000000027fffffff],
> > 0x0000000180000000 bytes on node 0 flags: 0x0
> > memory[0x3] [0x0000100000000000-0x000010000fffffff],
> > 0x0000000010000000 bytes on node 1 flags: 0x0
> > memory[0x4] [0x0000100090000000-0x000010027fffffff],
> > 0x00000001f0000000 bytes on node 1 flags: 0x0
> > reserved.cnt = 0x1f
> > reserved[0x0] [0x0000000000000000-0x000000000190c80a],
> > 0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
> > reserved[0x1] [0x000000000190c810-0x000000000190eea3],
> > 0x0000000000002694 bytes flags: 0x0
> > ----------------------------------------------------------------------
> >
> > It caused memory-reserved region "0x0-0x1fffff" without valid node id
> > in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
> > "reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
> > accessing "node_data" out of bound.
> >
> > To fix this bug, we should remove unnecessary memory block reservation.
> >
> > +. no need to reserve 0x0-0x1fffff below kernel loading address, since
> > it is not registered by "memblock_add_node"
> >
> > +. no need to reserve 0x0-0xfff for exception handling if it is not
> > registered by "memblock_add" either.
> >
> > Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
> > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > ---
> > arch/mips/kernel/traps.c | 3 ++-
> > arch/mips/loongson64/numa.c | 2 --
> > 2 files changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> > index 246c6a6b0261..9b632b4c10c3 100644
> > --- a/arch/mips/kernel/traps.c
> > +++ b/arch/mips/kernel/traps.c
> > @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
> > void reserve_exception_space(phys_addr_t addr, unsigned long size)
> > {
> > - memblock_reserve(addr, size);
> > + if(memblock_is_region_memory(addr, size))
> > + memblock_reserve(addr, size);
> > }
> > void __init *set_except_vector(int n, void *addr)
> > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > index 8f61e93c0c5b..0f516dde81da 100644
> > --- a/arch/mips/loongson64/numa.c
> > +++ b/arch/mips/loongson64/numa.c
> > @@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
> > memblock_reserve((node_addrspace_offset | 0xfe000000),
> > 32 << 20);
> > - /* Reserve pfn range 0~node[0]->node_start_pfn */
> > - memblock_reserve(0, PAGE_SIZE * start_pfn);
> > }
> > }
>
> --
> ---
> Jiaxun Yang
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure)
2024-01-15 14:08 ` memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure) Jiaxun Yang
2024-01-16 3:27 ` Huang Pei
@ 2024-01-16 8:39 ` Mike Rapoport
2024-01-16 12:23 ` Huang Pei
1 sibling, 1 reply; 50+ messages in thread
From: Mike Rapoport @ 2024-01-16 8:39 UTC (permalink / raw)
To: Jiaxun Yang
Cc: linux-mm, Bibo Mao, linux-mips, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen, Huang Pei,
Thomas Bogendoerfer, linux-kernel, Yajun Deng
On Mon, Jan 15, 2024 at 02:08:21PM +0000, Jiaxun Yang wrote:
> Hi mm folks,
>
> Just a quick question, what is the expected behavior of memblock_reserve
> a region that is not added to memblock with memblock_add?
>
> I'm unable to find any documentation about memblock_reserve in comments and
> boot-time-mm, but as per my understanding to the code, this should be a
> legit usage?
Yes, memblock allows reserving memory that was not added to memblock with
memblock_add().
> In practical we run into uninitialized nid of reserved block problem, should
> we fix it
> in our usage, or on memblock side?
Apparently it's a bug in memblock :(
If you revert 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
does the issue disappear?
> Thanks
>
> 在 2023/12/25 09:30, Huang Pei 写道:
> > Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
> > loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
> > this:
> > ----------------------------------------------------------------------
> > Call Trace:
> > [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
> > [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
> > [<ffffffff8231d8e4>] mem_init+0x84/0x94
> > [<ffffffff82330958>] mm_core_init+0xf8/0x308
> > [<ffffffff82318c38>] start_kernel+0x43c/0x86c
> >
> > Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
> > 64420070 00003025 24040003
> >
> > ---[ end trace 0000000000000000 ]---
> > Kernel panic - not syncing: Attempted to kill the idle task!
> > ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> > ----------------------------------------------------------------------
> >
> > The root cause is no memory region "0x0-0x1fffff" paired with
> > memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
> > =debug":
> >
> > ----------------------------------------------------------------------
> > memory[0x0] [0x0000000000200000-0x000000000effffff],
> > 0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
> > memory[0x1] [0x0000000090000000-0x00000000fdffffff],
> > 0x000000006e000000 bytes on node 0 flags: 0x0
> > memory[0x2] [0x0000000100000000-0x000000027fffffff],
> > 0x0000000180000000 bytes on node 0 flags: 0x0
> > memory[0x3] [0x0000100000000000-0x000010000fffffff],
> > 0x0000000010000000 bytes on node 1 flags: 0x0
> > memory[0x4] [0x0000100090000000-0x000010027fffffff],
> > 0x00000001f0000000 bytes on node 1 flags: 0x0
> > reserved.cnt = 0x1f
> > reserved[0x0] [0x0000000000000000-0x000000000190c80a],
> > 0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
> > reserved[0x1] [0x000000000190c810-0x000000000190eea3],
> > 0x0000000000002694 bytes flags: 0x0
> > ----------------------------------------------------------------------
> >
> > It caused memory-reserved region "0x0-0x1fffff" without valid node id
> > in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
> > "reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
> > accessing "node_data" out of bound.
> >
> > To fix this bug, we should remove unnecessary memory block reservation.
> >
> > +. no need to reserve 0x0-0x1fffff below kernel loading address, since
> > it is not registered by "memblock_add_node"
> >
> > +. no need to reserve 0x0-0xfff for exception handling if it is not
> > registered by "memblock_add" either.
> >
> > Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
> > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > ---
> > arch/mips/kernel/traps.c | 3 ++-
> > arch/mips/loongson64/numa.c | 2 --
> > 2 files changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> > index 246c6a6b0261..9b632b4c10c3 100644
> > --- a/arch/mips/kernel/traps.c
> > +++ b/arch/mips/kernel/traps.c
> > @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
> > void reserve_exception_space(phys_addr_t addr, unsigned long size)
> > {
> > - memblock_reserve(addr, size);
> > + if(memblock_is_region_memory(addr, size))
> > + memblock_reserve(addr, size);
> > }
> > void __init *set_except_vector(int n, void *addr)
> > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > index 8f61e93c0c5b..0f516dde81da 100644
> > --- a/arch/mips/loongson64/numa.c
> > +++ b/arch/mips/loongson64/numa.c
> > @@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
> > memblock_reserve((node_addrspace_offset | 0xfe000000),
> > 32 << 20);
> > - /* Reserve pfn range 0~node[0]->node_start_pfn */
> > - memblock_reserve(0, PAGE_SIZE * start_pfn);
> > }
> > }
>
> --
> ---
> Jiaxun Yang
>
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure)
2024-01-16 8:39 ` Mike Rapoport
@ 2024-01-16 12:23 ` Huang Pei
2024-01-17 2:20 ` Yajun Deng
0 siblings, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-16 12:23 UTC (permalink / raw)
To: Mike Rapoport
Cc: Jiaxun Yang, linux-mm, Bibo Mao, linux-mips, Paul Burton,
Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen,
Thomas Bogendoerfer, linux-kernel, Yajun Deng
On Tue, Jan 16, 2024 at 10:39:04AM +0200, Mike Rapoport wrote:
> On Mon, Jan 15, 2024 at 02:08:21PM +0000, Jiaxun Yang wrote:
> > Hi mm folks,
> >
> > Just a quick question, what is the expected behavior of memblock_reserve
> > a region that is not added to memblock with memblock_add?
> >
> > I'm unable to find any documentation about memblock_reserve in comments and
> > boot-time-mm, but as per my understanding to the code, this should be a
> > legit usage?
>
> Yes, memblock allows reserving memory that was not added to memblock with
> memblock_add().
I think arch/platform specific code should fix this bug, like,
--------------------------------------------------------------------------
//for loongson64
memblock_set_node(0, 1ULL << 44, &memblock.reserved, 0);
--------------------------------------------------------------------------
or maybe memblock provide something like memblock_reserve_node
>
> > In practical we run into uninitialized nid of reserved block problem, should
> > we fix it
> > in our usage, or on memblock side?
>
> Apparently it's a bug in memblock :(
>
> If you revert 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
> does the issue disappear?
Yes, I git bisect this commit.
But I don't think it is a bug in memblock. IMO, memblock_reserve under
NUMA set nid of reserved region to MAX_NUMNODES, which is the point
that cause the "memblock_get_region_node from memmap_init_reserved_pages "
passing a invalid node id(aka MAX_NUMNODES) to "reserver_bootmem_region
-> init_reserved_page -> early_pfn_to_nid". If arch-specific code DOES NOT
initialize the nid of reserved region(only it know that), or the reserved
region NOT within a memblock added by memblock_add, memblock can not
give a valid node id to the reserved region. Commit 61167ad5fecd ("mm: pass nid to
reserve_bootmem_region()") just reveals the embarrassment case by an
out of bound memory access.
>
> > Thanks
> >
> > 在 2023/12/25 09:30, Huang Pei 写道:
> > > Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
> > > loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
> > > this:
> > > ----------------------------------------------------------------------
> > > Call Trace:
> > > [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
> > > [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
> > > [<ffffffff8231d8e4>] mem_init+0x84/0x94
> > > [<ffffffff82330958>] mm_core_init+0xf8/0x308
> > > [<ffffffff82318c38>] start_kernel+0x43c/0x86c
> > >
> > > Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
> > > 64420070 00003025 24040003
> > >
> > > ---[ end trace 0000000000000000 ]---
> > > Kernel panic - not syncing: Attempted to kill the idle task!
> > > ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> > > ----------------------------------------------------------------------
> > >
> > > The root cause is no memory region "0x0-0x1fffff" paired with
> > > memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
> > > =debug":
> > >
> > > ----------------------------------------------------------------------
> > > memory[0x0] [0x0000000000200000-0x000000000effffff],
> > > 0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
> > > memory[0x1] [0x0000000090000000-0x00000000fdffffff],
> > > 0x000000006e000000 bytes on node 0 flags: 0x0
> > > memory[0x2] [0x0000000100000000-0x000000027fffffff],
> > > 0x0000000180000000 bytes on node 0 flags: 0x0
> > > memory[0x3] [0x0000100000000000-0x000010000fffffff],
> > > 0x0000000010000000 bytes on node 1 flags: 0x0
> > > memory[0x4] [0x0000100090000000-0x000010027fffffff],
> > > 0x00000001f0000000 bytes on node 1 flags: 0x0
> > > reserved.cnt = 0x1f
> > > reserved[0x0] [0x0000000000000000-0x000000000190c80a],
> > > 0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
> > > reserved[0x1] [0x000000000190c810-0x000000000190eea3],
> > > 0x0000000000002694 bytes flags: 0x0
> > > ----------------------------------------------------------------------
> > >
> > > It caused memory-reserved region "0x0-0x1fffff" without valid node id
> > > in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
> > > "reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
> > > accessing "node_data" out of bound.
> > >
> > > To fix this bug, we should remove unnecessary memory block reservation.
> > >
> > > +. no need to reserve 0x0-0x1fffff below kernel loading address, since
> > > it is not registered by "memblock_add_node"
> > >
> > > +. no need to reserve 0x0-0xfff for exception handling if it is not
> > > registered by "memblock_add" either.
> > >
> > > Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
> > > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > > ---
> > > arch/mips/kernel/traps.c | 3 ++-
> > > arch/mips/loongson64/numa.c | 2 --
> > > 2 files changed, 2 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> > > index 246c6a6b0261..9b632b4c10c3 100644
> > > --- a/arch/mips/kernel/traps.c
> > > +++ b/arch/mips/kernel/traps.c
> > > @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
> > > void reserve_exception_space(phys_addr_t addr, unsigned long size)
> > > {
> > > - memblock_reserve(addr, size);
> > > + if(memblock_is_region_memory(addr, size))
> > > + memblock_reserve(addr, size);
> > > }
> > > void __init *set_except_vector(int n, void *addr)
> > > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > > index 8f61e93c0c5b..0f516dde81da 100644
> > > --- a/arch/mips/loongson64/numa.c
> > > +++ b/arch/mips/loongson64/numa.c
> > > @@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
> > > memblock_reserve((node_addrspace_offset | 0xfe000000),
> > > 32 << 20);
> > > - /* Reserve pfn range 0~node[0]->node_start_pfn */
> > > - memblock_reserve(0, PAGE_SIZE * start_pfn);
> > > }
> > > }
> >
> > --
> > ---
> > Jiaxun Yang
> >
>
> --
> Sincerely yours,
> Mike.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure)
2024-01-16 12:23 ` Huang Pei
@ 2024-01-17 2:20 ` Yajun Deng
2024-01-17 3:01 ` Huang Pei
0 siblings, 1 reply; 50+ messages in thread
From: Yajun Deng @ 2024-01-17 2:20 UTC (permalink / raw)
To: Huang Pei, Mike Rapoport
Cc: Jiaxun Yang, linux-mm, Bibo Mao, linux-mips, Paul Burton,
Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen,
Thomas Bogendoerfer, linux-kernel
On 2024/1/16 20:23, Huang Pei wrote:
> On Tue, Jan 16, 2024 at 10:39:04AM +0200, Mike Rapoport wrote:
>> On Mon, Jan 15, 2024 at 02:08:21PM +0000, Jiaxun Yang wrote:
>>> Hi mm folks,
>>>
>>> Just a quick question, what is the expected behavior of memblock_reserve
>>> a region that is not added to memblock with memblock_add?
>>>
>>> I'm unable to find any documentation about memblock_reserve in comments and
>>> boot-time-mm, but as per my understanding to the code, this should be a
>>> legit usage?
>> Yes, memblock allows reserving memory that was not added to memblock with
>> memblock_add().
> I think arch/platform specific code should fix this bug, like,
> --------------------------------------------------------------------------
> //for loongson64
> memblock_set_node(0, 1ULL << 44, &memblock.reserved, 0);
>
> --------------------------------------------------------------------------
>
> or maybe memblock provide something like memblock_reserve_node
Hi pei,
Can you test the following patch to see if it fixes this bug?
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 2c19f5515e36..97721d99fdce 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -708,6 +708,9 @@ static void __meminit init_reserved_page(unsigned
long pfn, int nid)
pg_data_t *pgdat;
int zid;
+ if (unlikely(nid == NUMA_NO_NODE || nid >= MAX_NUMNODES))
+ nid = early_pfn_to_nid(pfn);
+
if (early_page_initialised(pfn, nid))
return;
>>
>>> In practical we run into uninitialized nid of reserved block problem, should
>>> we fix it
>>> in our usage, or on memblock side?
>> Apparently it's a bug in memblock :(
>>
>> If you revert 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
>> does the issue disappear?
> Yes, I git bisect this commit.
>
> But I don't think it is a bug in memblock. IMO, memblock_reserve under
> NUMA set nid of reserved region to MAX_NUMNODES, which is the point
> that cause the "memblock_get_region_node from memmap_init_reserved_pages "
> passing a invalid node id(aka MAX_NUMNODES) to "reserver_bootmem_region
> -> init_reserved_page -> early_pfn_to_nid". If arch-specific code DOES NOT
> initialize the nid of reserved region(only it know that), or the reserved
> region NOT within a memblock added by memblock_add, memblock can not
> give a valid node id to the reserved region. Commit 61167ad5fecd ("mm: pass nid to
> reserve_bootmem_region()") just reveals the embarrassment case by an
> out of bound memory access.
>
>>
>>> Thanks
>>>
>>> 在 2023/12/25 09:30, Huang Pei 写道:
>>>> Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
>>>> loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
>>>> this:
>>>> ----------------------------------------------------------------------
>>>> Call Trace:
>>>> [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
>>>> [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
>>>> [<ffffffff8231d8e4>] mem_init+0x84/0x94
>>>> [<ffffffff82330958>] mm_core_init+0xf8/0x308
>>>> [<ffffffff82318c38>] start_kernel+0x43c/0x86c
>>>>
>>>> Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
>>>> 64420070 00003025 24040003
>>>>
>>>> ---[ end trace 0000000000000000 ]---
>>>> Kernel panic - not syncing: Attempted to kill the idle task!
>>>> ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
>>>> ----------------------------------------------------------------------
>>>>
>>>> The root cause is no memory region "0x0-0x1fffff" paired with
>>>> memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
>>>> =debug":
>>>>
>>>> ----------------------------------------------------------------------
>>>> memory[0x0] [0x0000000000200000-0x000000000effffff],
>>>> 0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
>>>> memory[0x1] [0x0000000090000000-0x00000000fdffffff],
>>>> 0x000000006e000000 bytes on node 0 flags: 0x0
>>>> memory[0x2] [0x0000000100000000-0x000000027fffffff],
>>>> 0x0000000180000000 bytes on node 0 flags: 0x0
>>>> memory[0x3] [0x0000100000000000-0x000010000fffffff],
>>>> 0x0000000010000000 bytes on node 1 flags: 0x0
>>>> memory[0x4] [0x0000100090000000-0x000010027fffffff],
>>>> 0x00000001f0000000 bytes on node 1 flags: 0x0
>>>> reserved.cnt = 0x1f
>>>> reserved[0x0] [0x0000000000000000-0x000000000190c80a],
>>>> 0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
>>>> reserved[0x1] [0x000000000190c810-0x000000000190eea3],
>>>> 0x0000000000002694 bytes flags: 0x0
>>>> ----------------------------------------------------------------------
>>>>
>>>> It caused memory-reserved region "0x0-0x1fffff" without valid node id
>>>> in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
>>>> "reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
>>>> accessing "node_data" out of bound.
>>>>
>>>> To fix this bug, we should remove unnecessary memory block reservation.
>>>>
>>>> +. no need to reserve 0x0-0x1fffff below kernel loading address, since
>>>> it is not registered by "memblock_add_node"
>>>>
>>>> +. no need to reserve 0x0-0xfff for exception handling if it is not
>>>> registered by "memblock_add" either.
>>>>
>>>> Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
>>>> Signed-off-by: Huang Pei <huangpei@loongson.cn>
>>>> ---
>>>> arch/mips/kernel/traps.c | 3 ++-
>>>> arch/mips/loongson64/numa.c | 2 --
>>>> 2 files changed, 2 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
>>>> index 246c6a6b0261..9b632b4c10c3 100644
>>>> --- a/arch/mips/kernel/traps.c
>>>> +++ b/arch/mips/kernel/traps.c
>>>> @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
>>>> void reserve_exception_space(phys_addr_t addr, unsigned long size)
>>>> {
>>>> - memblock_reserve(addr, size);
>>>> + if(memblock_is_region_memory(addr, size))
>>>> + memblock_reserve(addr, size);
>>>> }
>>>> void __init *set_except_vector(int n, void *addr)
>>>> diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
>>>> index 8f61e93c0c5b..0f516dde81da 100644
>>>> --- a/arch/mips/loongson64/numa.c
>>>> +++ b/arch/mips/loongson64/numa.c
>>>> @@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
>>>> memblock_reserve((node_addrspace_offset | 0xfe000000),
>>>> 32 << 20);
>>>> - /* Reserve pfn range 0~node[0]->node_start_pfn */
>>>> - memblock_reserve(0, PAGE_SIZE * start_pfn);
>>>> }
>>>> }
>>> --
>>> ---
>>> Jiaxun Yang
>>>
>> --
>> Sincerely yours,
>> Mike.
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure)
2024-01-17 2:20 ` Yajun Deng
@ 2024-01-17 3:01 ` Huang Pei
2024-01-17 3:17 ` Yajun Deng
0 siblings, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-17 3:01 UTC (permalink / raw)
To: Yajun Deng
Cc: Mike Rapoport, Jiaxun Yang, linux-mm, Bibo Mao, linux-mips,
Paul Burton, Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen,
Thomas Bogendoerfer, linux-kernel
On Wed, Jan 17, 2024 at 10:20:00AM +0800, Yajun Deng wrote:
>
> On 2024/1/16 20:23, Huang Pei wrote:
> > On Tue, Jan 16, 2024 at 10:39:04AM +0200, Mike Rapoport wrote:
> > > On Mon, Jan 15, 2024 at 02:08:21PM +0000, Jiaxun Yang wrote:
> > > > Hi mm folks,
> > > >
> > > > Just a quick question, what is the expected behavior of memblock_reserve
> > > > a region that is not added to memblock with memblock_add?
> > > >
> > > > I'm unable to find any documentation about memblock_reserve in comments and
> > > > boot-time-mm, but as per my understanding to the code, this should be a
> > > > legit usage?
> > > Yes, memblock allows reserving memory that was not added to memblock with
> > > memblock_add().
> > I think arch/platform specific code should fix this bug, like,
> > --------------------------------------------------------------------------
> > //for loongson64
> > memblock_set_node(0, 1ULL << 44, &memblock.reserved, 0);
> >
> > --------------------------------------------------------------------------
> >
> > or maybe memblock provide something like memblock_reserve_node
>
> Hi pei,
>
> Can you test the following patch to see if it fixes this bug?
>
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 2c19f5515e36..97721d99fdce 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -708,6 +708,9 @@ static void __meminit init_reserved_page(unsigned long
> pfn, int nid)
> pg_data_t *pgdat;
> int zid;
>
> + if (unlikely(nid == NUMA_NO_NODE || nid >= MAX_NUMNODES))
> + nid = early_pfn_to_nid(pfn);
> +
> if (early_page_initialised(pfn, nid))
> return;
I do not think this fix set the right nid, ONLY arch/platform know that
nid
int __meminit early_pfn_to_nid(unsigned long pfn)
{
static DEFINE_SPINLOCK(early_pfn_lock);
int nid;
spin_lock(&early_pfn_lock);
nid = __early_pfn_to_nid(pfn,
&early_pfnnid_cache);
if (nid < 0)
//!!!first_online_node MAY NOT be the node the pfn belong to!!!
nid = first_online_node;
spin_unlock(&early_pfn_lock);
return
nid;
}
>
>
> > > > In practical we run into uninitialized nid of reserved block problem, should
> > > > we fix it
> > > > in our usage, or on memblock side?
> > > Apparently it's a bug in memblock :(
> > >
> > > If you revert 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
> > > does the issue disappear?
> > Yes, I git bisect this commit.
> >
> > But I don't think it is a bug in memblock. IMO, memblock_reserve under
> > NUMA set nid of reserved region to MAX_NUMNODES, which is the point
> > that cause the "memblock_get_region_node from memmap_init_reserved_pages "
> > passing a invalid node id(aka MAX_NUMNODES) to "reserver_bootmem_region
> > -> init_reserved_page -> early_pfn_to_nid". If arch-specific code DOES NOT
> > initialize the nid of reserved region(only it know that), or the reserved
> > region NOT within a memblock added by memblock_add, memblock can not
> > give a valid node id to the reserved region. Commit 61167ad5fecd ("mm: pass nid to
> > reserve_bootmem_region()") just reveals the embarrassment case by an
> > out of bound memory access.
> >
> > > > Thanks
> > > >
> > > > 在 2023/12/25 09:30, Huang Pei 写道:
> > > > > Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
> > > > > loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
> > > > > this:
> > > > > ----------------------------------------------------------------------
> > > > > Call Trace:
> > > > > [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
> > > > > [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
> > > > > [<ffffffff8231d8e4>] mem_init+0x84/0x94
> > > > > [<ffffffff82330958>] mm_core_init+0xf8/0x308
> > > > > [<ffffffff82318c38>] start_kernel+0x43c/0x86c
> > > > >
> > > > > Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
> > > > > 64420070 00003025 24040003
> > > > >
> > > > > ---[ end trace 0000000000000000 ]---
> > > > > Kernel panic - not syncing: Attempted to kill the idle task!
> > > > > ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> > > > > ----------------------------------------------------------------------
> > > > >
> > > > > The root cause is no memory region "0x0-0x1fffff" paired with
> > > > > memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
> > > > > =debug":
> > > > >
> > > > > ----------------------------------------------------------------------
> > > > > memory[0x0] [0x0000000000200000-0x000000000effffff],
> > > > > 0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
> > > > > memory[0x1] [0x0000000090000000-0x00000000fdffffff],
> > > > > 0x000000006e000000 bytes on node 0 flags: 0x0
> > > > > memory[0x2] [0x0000000100000000-0x000000027fffffff],
> > > > > 0x0000000180000000 bytes on node 0 flags: 0x0
> > > > > memory[0x3] [0x0000100000000000-0x000010000fffffff],
> > > > > 0x0000000010000000 bytes on node 1 flags: 0x0
> > > > > memory[0x4] [0x0000100090000000-0x000010027fffffff],
> > > > > 0x00000001f0000000 bytes on node 1 flags: 0x0
> > > > > reserved.cnt = 0x1f
> > > > > reserved[0x0] [0x0000000000000000-0x000000000190c80a],
> > > > > 0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
> > > > > reserved[0x1] [0x000000000190c810-0x000000000190eea3],
> > > > > 0x0000000000002694 bytes flags: 0x0
> > > > > ----------------------------------------------------------------------
> > > > >
> > > > > It caused memory-reserved region "0x0-0x1fffff" without valid node id
> > > > > in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
> > > > > "reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
> > > > > accessing "node_data" out of bound.
> > > > >
> > > > > To fix this bug, we should remove unnecessary memory block reservation.
> > > > >
> > > > > +. no need to reserve 0x0-0x1fffff below kernel loading address, since
> > > > > it is not registered by "memblock_add_node"
> > > > >
> > > > > +. no need to reserve 0x0-0xfff for exception handling if it is not
> > > > > registered by "memblock_add" either.
> > > > >
> > > > > Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
> > > > > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > > > > ---
> > > > > arch/mips/kernel/traps.c | 3 ++-
> > > > > arch/mips/loongson64/numa.c | 2 --
> > > > > 2 files changed, 2 insertions(+), 3 deletions(-)
> > > > >
> > > > > diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> > > > > index 246c6a6b0261..9b632b4c10c3 100644
> > > > > --- a/arch/mips/kernel/traps.c
> > > > > +++ b/arch/mips/kernel/traps.c
> > > > > @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
> > > > > void reserve_exception_space(phys_addr_t addr, unsigned long size)
> > > > > {
> > > > > - memblock_reserve(addr, size);
> > > > > + if(memblock_is_region_memory(addr, size))
> > > > > + memblock_reserve(addr, size);
> > > > > }
> > > > > void __init *set_except_vector(int n, void *addr)
> > > > > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > > > > index 8f61e93c0c5b..0f516dde81da 100644
> > > > > --- a/arch/mips/loongson64/numa.c
> > > > > +++ b/arch/mips/loongson64/numa.c
> > > > > @@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
> > > > > memblock_reserve((node_addrspace_offset | 0xfe000000),
> > > > > 32 << 20);
> > > > > - /* Reserve pfn range 0~node[0]->node_start_pfn */
> > > > > - memblock_reserve(0, PAGE_SIZE * start_pfn);
> > > > > }
> > > > > }
> > > > --
> > > > ---
> > > > Jiaxun Yang
> > > >
> > > --
> > > Sincerely yours,
> > > Mike.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure)
2024-01-17 3:01 ` Huang Pei
@ 2024-01-17 3:17 ` Yajun Deng
2024-01-17 3:59 ` Huang Pei
0 siblings, 1 reply; 50+ messages in thread
From: Yajun Deng @ 2024-01-17 3:17 UTC (permalink / raw)
To: Huang Pei
Cc: Mike Rapoport, Jiaxun Yang, linux-mm, Bibo Mao, linux-mips,
Paul Burton, Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen,
Thomas Bogendoerfer, linux-kernel
On 2024/1/17 11:01, Huang Pei wrote:
> On Wed, Jan 17, 2024 at 10:20:00AM +0800, Yajun Deng wrote:
>> On 2024/1/16 20:23, Huang Pei wrote:
>>> On Tue, Jan 16, 2024 at 10:39:04AM +0200, Mike Rapoport wrote:
>>>> On Mon, Jan 15, 2024 at 02:08:21PM +0000, Jiaxun Yang wrote:
>>>>> Hi mm folks,
>>>>>
>>>>> Just a quick question, what is the expected behavior of memblock_reserve
>>>>> a region that is not added to memblock with memblock_add?
>>>>>
>>>>> I'm unable to find any documentation about memblock_reserve in comments and
>>>>> boot-time-mm, but as per my understanding to the code, this should be a
>>>>> legit usage?
>>>> Yes, memblock allows reserving memory that was not added to memblock with
>>>> memblock_add().
>>> I think arch/platform specific code should fix this bug, like,
>>> --------------------------------------------------------------------------
>>> //for loongson64
>>> memblock_set_node(0, 1ULL << 44, &memblock.reserved, 0);
>>>
>>> --------------------------------------------------------------------------
>>>
>>> or maybe memblock provide something like memblock_reserve_node
>> Hi pei,
>>
>> Can you test the following patch to see if it fixes this bug?
>>
>> diff --git a/mm/mm_init.c b/mm/mm_init.c
>> index 2c19f5515e36..97721d99fdce 100644
>> --- a/mm/mm_init.c
>> +++ b/mm/mm_init.c
>> @@ -708,6 +708,9 @@ static void __meminit init_reserved_page(unsigned long
>> pfn, int nid)
>> pg_data_t *pgdat;
>> int zid;
>>
>> + if (unlikely(nid == NUMA_NO_NODE || nid >= MAX_NUMNODES))
>> + nid = early_pfn_to_nid(pfn);
>> +
>> if (early_page_initialised(pfn, nid))
>> return;
> I do not think this fix set the right nid, ONLY arch/platform know that
> nid
>
> int __meminit early_pfn_to_nid(unsigned long pfn)
> {
> static DEFINE_SPINLOCK(early_pfn_lock);
> int nid;
>
> spin_lock(&early_pfn_lock);
> nid = __early_pfn_to_nid(pfn,
> &early_pfnnid_cache);
> if (nid < 0)
> //!!!first_online_node MAY NOT be the node the pfn belong to!!!
> nid = first_online_node;
>
> spin_unlock(&early_pfn_lock);
>
> return
> nid;
> }
Okay, I don't think this bug is caused by commit 61167ad5fecd ("mm: pass
nid to reserve_bootmem_region()"),
because even if you revert this commit, it will still get nid by
early_pfn_to_nid(). Did I get that right?
>>
>>>>> In practical we run into uninitialized nid of reserved block problem, should
>>>>> we fix it
>>>>> in our usage, or on memblock side?
>>>> Apparently it's a bug in memblock :(
>>>>
>>>> If you revert 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
>>>> does the issue disappear?
>>> Yes, I git bisect this commit.
>>>
>>> But I don't think it is a bug in memblock. IMO, memblock_reserve under
>>> NUMA set nid of reserved region to MAX_NUMNODES, which is the point
>>> that cause the "memblock_get_region_node from memmap_init_reserved_pages "
>>> passing a invalid node id(aka MAX_NUMNODES) to "reserver_bootmem_region
>>> -> init_reserved_page -> early_pfn_to_nid". If arch-specific code DOES NOT
>>> initialize the nid of reserved region(only it know that), or the reserved
>>> region NOT within a memblock added by memblock_add, memblock can not
>>> give a valid node id to the reserved region. Commit 61167ad5fecd ("mm: pass nid to
>>> reserve_bootmem_region()") just reveals the embarrassment case by an
>>> out of bound memory access.
>>>
>>>>> Thanks
>>>>>
>>>>> 在 2023/12/25 09:30, Huang Pei 写道:
>>>>>> Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
>>>>>> loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
>>>>>> this:
>>>>>> ----------------------------------------------------------------------
>>>>>> Call Trace:
>>>>>> [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
>>>>>> [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
>>>>>> [<ffffffff8231d8e4>] mem_init+0x84/0x94
>>>>>> [<ffffffff82330958>] mm_core_init+0xf8/0x308
>>>>>> [<ffffffff82318c38>] start_kernel+0x43c/0x86c
>>>>>>
>>>>>> Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
>>>>>> 64420070 00003025 24040003
>>>>>>
>>>>>> ---[ end trace 0000000000000000 ]---
>>>>>> Kernel panic - not syncing: Attempted to kill the idle task!
>>>>>> ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
>>>>>> ----------------------------------------------------------------------
>>>>>>
>>>>>> The root cause is no memory region "0x0-0x1fffff" paired with
>>>>>> memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
>>>>>> =debug":
>>>>>>
>>>>>> ----------------------------------------------------------------------
>>>>>> memory[0x0] [0x0000000000200000-0x000000000effffff],
>>>>>> 0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
>>>>>> memory[0x1] [0x0000000090000000-0x00000000fdffffff],
>>>>>> 0x000000006e000000 bytes on node 0 flags: 0x0
>>>>>> memory[0x2] [0x0000000100000000-0x000000027fffffff],
>>>>>> 0x0000000180000000 bytes on node 0 flags: 0x0
>>>>>> memory[0x3] [0x0000100000000000-0x000010000fffffff],
>>>>>> 0x0000000010000000 bytes on node 1 flags: 0x0
>>>>>> memory[0x4] [0x0000100090000000-0x000010027fffffff],
>>>>>> 0x00000001f0000000 bytes on node 1 flags: 0x0
>>>>>> reserved.cnt = 0x1f
>>>>>> reserved[0x0] [0x0000000000000000-0x000000000190c80a],
>>>>>> 0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
>>>>>> reserved[0x1] [0x000000000190c810-0x000000000190eea3],
>>>>>> 0x0000000000002694 bytes flags: 0x0
>>>>>> ----------------------------------------------------------------------
>>>>>>
>>>>>> It caused memory-reserved region "0x0-0x1fffff" without valid node id
>>>>>> in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
>>>>>> "reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
>>>>>> accessing "node_data" out of bound.
>>>>>>
>>>>>> To fix this bug, we should remove unnecessary memory block reservation.
>>>>>>
>>>>>> +. no need to reserve 0x0-0x1fffff below kernel loading address, since
>>>>>> it is not registered by "memblock_add_node"
>>>>>>
>>>>>> +. no need to reserve 0x0-0xfff for exception handling if it is not
>>>>>> registered by "memblock_add" either.
>>>>>>
>>>>>> Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
>>>>>> Signed-off-by: Huang Pei <huangpei@loongson.cn>
>>>>>> ---
>>>>>> arch/mips/kernel/traps.c | 3 ++-
>>>>>> arch/mips/loongson64/numa.c | 2 --
>>>>>> 2 files changed, 2 insertions(+), 3 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
>>>>>> index 246c6a6b0261..9b632b4c10c3 100644
>>>>>> --- a/arch/mips/kernel/traps.c
>>>>>> +++ b/arch/mips/kernel/traps.c
>>>>>> @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
>>>>>> void reserve_exception_space(phys_addr_t addr, unsigned long size)
>>>>>> {
>>>>>> - memblock_reserve(addr, size);
>>>>>> + if(memblock_is_region_memory(addr, size))
>>>>>> + memblock_reserve(addr, size);
>>>>>> }
>>>>>> void __init *set_except_vector(int n, void *addr)
>>>>>> diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
>>>>>> index 8f61e93c0c5b..0f516dde81da 100644
>>>>>> --- a/arch/mips/loongson64/numa.c
>>>>>> +++ b/arch/mips/loongson64/numa.c
>>>>>> @@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
>>>>>> memblock_reserve((node_addrspace_offset | 0xfe000000),
>>>>>> 32 << 20);
>>>>>> - /* Reserve pfn range 0~node[0]->node_start_pfn */
>>>>>> - memblock_reserve(0, PAGE_SIZE * start_pfn);
>>>>>> }
>>>>>> }
>>>>> --
>>>>> ---
>>>>> Jiaxun Yang
>>>>>
>>>> --
>>>> Sincerely yours,
>>>> Mike.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure)
2024-01-17 3:17 ` Yajun Deng
@ 2024-01-17 3:59 ` Huang Pei
2024-01-17 6:46 ` Mike Rapoport
0 siblings, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-17 3:59 UTC (permalink / raw)
To: Yajun Deng
Cc: Mike Rapoport, Jiaxun Yang, linux-mm, Bibo Mao, linux-mips,
Paul Burton, Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen,
Thomas Bogendoerfer, linux-kernel
On Wed, Jan 17, 2024 at 11:17:18AM +0800, Yajun Deng wrote:
>
> On 2024/1/17 11:01, Huang Pei wrote:
> > On Wed, Jan 17, 2024 at 10:20:00AM +0800, Yajun Deng wrote:
> > > On 2024/1/16 20:23, Huang Pei wrote:
> > > > On Tue, Jan 16, 2024 at 10:39:04AM +0200, Mike Rapoport wrote:
> > > > > On Mon, Jan 15, 2024 at 02:08:21PM +0000, Jiaxun Yang wrote:
> > > > > > Hi mm folks,
> > > > > >
> > > > > > Just a quick question, what is the expected behavior of memblock_reserve
> > > > > > a region that is not added to memblock with memblock_add?
> > > > > >
> > > > > > I'm unable to find any documentation about memblock_reserve in comments and
> > > > > > boot-time-mm, but as per my understanding to the code, this should be a
> > > > > > legit usage?
> > > > > Yes, memblock allows reserving memory that was not added to memblock with
> > > > > memblock_add().
> > > > I think arch/platform specific code should fix this bug, like,
> > > > --------------------------------------------------------------------------
> > > > //for loongson64
> > > > memblock_set_node(0, 1ULL << 44, &memblock.reserved, 0);
> > > >
> > > > --------------------------------------------------------------------------
> > > >
> > > > or maybe memblock provide something like memblock_reserve_node
> > > Hi pei,
> > >
> > > Can you test the following patch to see if it fixes this bug?
> > >
> > > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > > index 2c19f5515e36..97721d99fdce 100644
> > > --- a/mm/mm_init.c
> > > +++ b/mm/mm_init.c
> > > @@ -708,6 +708,9 @@ static void __meminit init_reserved_page(unsigned long
> > > pfn, int nid)
> > > pg_data_t *pgdat;
> > > int zid;
> > >
> > > + if (unlikely(nid == NUMA_NO_NODE || nid >= MAX_NUMNODES))
> > > + nid = early_pfn_to_nid(pfn);
> > > +
> > > if (early_page_initialised(pfn, nid))
> > > return;
> > I do not think this fix set the right nid, ONLY arch/platform know that
> > nid
> >
> > int __meminit early_pfn_to_nid(unsigned long pfn)
> > {
> > static DEFINE_SPINLOCK(early_pfn_lock);
> > int nid;
> >
> > spin_lock(&early_pfn_lock);
> > nid = __early_pfn_to_nid(pfn,
> > &early_pfnnid_cache);
> > if (nid < 0)
> > //!!!first_online_node MAY NOT be the node the pfn belong to!!!
> > nid = first_online_node;
> >
> > spin_unlock(&early_pfn_lock);
> >
> > return
> > nid;
> > }
>
>
> Okay, I don't think this bug is caused by commit 61167ad5fecd ("mm: pass nid
> to reserve_bootmem_region()"),
>
> because even if you revert this commit, it will still get nid by
> early_pfn_to_nid(). Did I get that right?
Yes, more accurately, this bug is exposed by commit 61167ad5fecd. My
previous fix is based on presumptions that memory_reserve should reserve memory
added by memblock_add{,_node}, if going across this limitation, there need
to set the valid nid for reserved memory region.
>
> > >
> > > > > > In practical we run into uninitialized nid of reserved block problem, should
> > > > > > we fix it
> > > > > > in our usage, or on memblock side?
> > > > > Apparently it's a bug in memblock :(
> > > > >
> > > > > If you revert 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
> > > > > does the issue disappear?
> > > > Yes, I git bisect this commit.
> > > >
> > > > But I don't think it is a bug in memblock. IMO, memblock_reserve under
> > > > NUMA set nid of reserved region to MAX_NUMNODES, which is the point
> > > > that cause the "memblock_get_region_node from memmap_init_reserved_pages "
> > > > passing a invalid node id(aka MAX_NUMNODES) to "reserver_bootmem_region
> > > > -> init_reserved_page -> early_pfn_to_nid". If arch-specific code DOES NOT
> > > > initialize the nid of reserved region(only it know that), or the reserved
> > > > region NOT within a memblock added by memblock_add, memblock can not
> > > > give a valid node id to the reserved region. Commit 61167ad5fecd ("mm: pass nid to
> > > > reserve_bootmem_region()") just reveals the embarrassment case by an
> > > > out of bound memory access.
> > > >
> > > > > > Thanks
> > > > > >
> > > > > > 在 2023/12/25 09:30, Huang Pei 写道:
> > > > > > > Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
> > > > > > > loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
> > > > > > > this:
> > > > > > > ----------------------------------------------------------------------
> > > > > > > Call Trace:
> > > > > > > [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
> > > > > > > [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
> > > > > > > [<ffffffff8231d8e4>] mem_init+0x84/0x94
> > > > > > > [<ffffffff82330958>] mm_core_init+0xf8/0x308
> > > > > > > [<ffffffff82318c38>] start_kernel+0x43c/0x86c
> > > > > > >
> > > > > > > Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
> > > > > > > 64420070 00003025 24040003
> > > > > > >
> > > > > > > ---[ end trace 0000000000000000 ]---
> > > > > > > Kernel panic - not syncing: Attempted to kill the idle task!
> > > > > > > ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> > > > > > > ----------------------------------------------------------------------
> > > > > > >
> > > > > > > The root cause is no memory region "0x0-0x1fffff" paired with
> > > > > > > memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
> > > > > > > =debug":
> > > > > > >
> > > > > > > ----------------------------------------------------------------------
> > > > > > > memory[0x0] [0x0000000000200000-0x000000000effffff],
> > > > > > > 0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
> > > > > > > memory[0x1] [0x0000000090000000-0x00000000fdffffff],
> > > > > > > 0x000000006e000000 bytes on node 0 flags: 0x0
> > > > > > > memory[0x2] [0x0000000100000000-0x000000027fffffff],
> > > > > > > 0x0000000180000000 bytes on node 0 flags: 0x0
> > > > > > > memory[0x3] [0x0000100000000000-0x000010000fffffff],
> > > > > > > 0x0000000010000000 bytes on node 1 flags: 0x0
> > > > > > > memory[0x4] [0x0000100090000000-0x000010027fffffff],
> > > > > > > 0x00000001f0000000 bytes on node 1 flags: 0x0
> > > > > > > reserved.cnt = 0x1f
> > > > > > > reserved[0x0] [0x0000000000000000-0x000000000190c80a],
> > > > > > > 0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
> > > > > > > reserved[0x1] [0x000000000190c810-0x000000000190eea3],
> > > > > > > 0x0000000000002694 bytes flags: 0x0
> > > > > > > ----------------------------------------------------------------------
> > > > > > >
> > > > > > > It caused memory-reserved region "0x0-0x1fffff" without valid node id
> > > > > > > in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
> > > > > > > "reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
> > > > > > > accessing "node_data" out of bound.
> > > > > > >
> > > > > > > To fix this bug, we should remove unnecessary memory block reservation.
> > > > > > >
> > > > > > > +. no need to reserve 0x0-0x1fffff below kernel loading address, since
> > > > > > > it is not registered by "memblock_add_node"
> > > > > > >
> > > > > > > +. no need to reserve 0x0-0xfff for exception handling if it is not
> > > > > > > registered by "memblock_add" either.
> > > > > > >
> > > > > > > Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
> > > > > > > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > > > > > > ---
> > > > > > > arch/mips/kernel/traps.c | 3 ++-
> > > > > > > arch/mips/loongson64/numa.c | 2 --
> > > > > > > 2 files changed, 2 insertions(+), 3 deletions(-)
> > > > > > >
> > > > > > > diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> > > > > > > index 246c6a6b0261..9b632b4c10c3 100644
> > > > > > > --- a/arch/mips/kernel/traps.c
> > > > > > > +++ b/arch/mips/kernel/traps.c
> > > > > > > @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
> > > > > > > void reserve_exception_space(phys_addr_t addr, unsigned long size)
> > > > > > > {
> > > > > > > - memblock_reserve(addr, size);
> > > > > > > + if(memblock_is_region_memory(addr, size))
> > > > > > > + memblock_reserve(addr, size);
> > > > > > > }
> > > > > > > void __init *set_except_vector(int n, void *addr)
> > > > > > > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > > > > > > index 8f61e93c0c5b..0f516dde81da 100644
> > > > > > > --- a/arch/mips/loongson64/numa.c
> > > > > > > +++ b/arch/mips/loongson64/numa.c
> > > > > > > @@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
> > > > > > > memblock_reserve((node_addrspace_offset | 0xfe000000),
> > > > > > > 32 << 20);
> > > > > > > - /* Reserve pfn range 0~node[0]->node_start_pfn */
> > > > > > > - memblock_reserve(0, PAGE_SIZE * start_pfn);
> > > > > > > }
> > > > > > > }
> > > > > > --
> > > > > > ---
> > > > > > Jiaxun Yang
> > > > > >
> > > > > --
> > > > > Sincerely yours,
> > > > > Mike.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure)
2024-01-17 3:59 ` Huang Pei
@ 2024-01-17 6:46 ` Mike Rapoport
2024-01-17 7:45 ` Huang Pei
0 siblings, 1 reply; 50+ messages in thread
From: Mike Rapoport @ 2024-01-17 6:46 UTC (permalink / raw)
To: Huang Pei
Cc: Yajun Deng, Jiaxun Yang, linux-mm, Bibo Mao, linux-mips,
Paul Burton, Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen,
Thomas Bogendoerfer, linux-kernel
On Wed, Jan 17, 2024 at 11:59:10AM +0800, Huang Pei wrote:
> On Wed, Jan 17, 2024 at 11:17:18AM +0800, Yajun Deng wrote:
> >
> > On 2024/1/17 11:01, Huang Pei wrote:
> > > On Wed, Jan 17, 2024 at 10:20:00AM +0800, Yajun Deng wrote:
> > > > On 2024/1/16 20:23, Huang Pei wrote:
> > > > > On Tue, Jan 16, 2024 at 10:39:04AM +0200, Mike Rapoport wrote:
> > > > > > On Mon, Jan 15, 2024 at 02:08:21PM +0000, Jiaxun Yang wrote:
> > > > > > > Hi mm folks,
> > > > > > >
> > > > > > > Just a quick question, what is the expected behavior of memblock_reserve
> > > > > > > a region that is not added to memblock with memblock_add?
> > > > > > >
> > > > > > > I'm unable to find any documentation about memblock_reserve in comments and
> > > > > > > boot-time-mm, but as per my understanding to the code, this should be a
> > > > > > > legit usage?
> > > > > > Yes, memblock allows reserving memory that was not added to memblock with
> > > > > > memblock_add().
> > > > > I think arch/platform specific code should fix this bug, like,
> > > > > --------------------------------------------------------------------------
> > > > > //for loongson64
> > > > > memblock_set_node(0, 1ULL << 44, &memblock.reserved, 0);
> > > > >
> > > > > --------------------------------------------------------------------------
> > > > >
> > > > > or maybe memblock provide something like memblock_reserve_node
> > > > Hi pei,
> > > >
> > > > Can you test the following patch to see if it fixes this bug?
> > > >
> > > > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > > > index 2c19f5515e36..97721d99fdce 100644
> > > > --- a/mm/mm_init.c
> > > > +++ b/mm/mm_init.c
> > > > @@ -708,6 +708,9 @@ static void __meminit init_reserved_page(unsigned long
> > > > pfn, int nid)
> > > > pg_data_t *pgdat;
> > > > int zid;
> > > >
> > > > + if (unlikely(nid == NUMA_NO_NODE || nid >= MAX_NUMNODES))
> > > > + nid = early_pfn_to_nid(pfn);
> > > > +
> > > > if (early_page_initialised(pfn, nid))
> > > > return;
IMO this will fix the bug. Before 61167ad5fecd we had nid = first_online_node
for reserved pages that didn't have nid set in memblock.reserved. After
61167ad5fecd we try to initialize these pages with MAX_NUMNODES and
obviously crash when accessing node structure.
I think that the check for a valid nid should be moved to
memmap_init_reserved_pages() though. An entire reserved region will have
nid set to MAX_NUMNODES, so there is no point to check every page in it.
> > > I do not think this fix set the right nid, ONLY arch/platform know that
> > > nid
Why does it matter to have the right nid in a reserved page that is not
part of usable memory?
That's true that only arch knows on which node those reserved pages are,
but core mm does not care about reserved pages that are not in memory.
> > > int __meminit early_pfn_to_nid(unsigned long pfn)
> > > {
> > > static DEFINE_SPINLOCK(early_pfn_lock);
> > > int nid;
> > >
> > > spin_lock(&early_pfn_lock);
> > > nid = __early_pfn_to_nid(pfn,
> > > &early_pfnnid_cache);
> > > if (nid < 0)
> > > //!!!first_online_node MAY NOT be the node the pfn belong to!!!
> > > nid = first_online_node;
> > >
> > > spin_unlock(&early_pfn_lock);
> > >
> > > return
> > > nid;
> > > }
> >
> >
> > Okay, I don't think this bug is caused by commit 61167ad5fecd ("mm: pass nid
> > to reserve_bootmem_region()"),
> >
> > because even if you revert this commit, it will still get nid by
> > early_pfn_to_nid(). Did I get that right?
>
> Yes, more accurately, this bug is exposed by commit 61167ad5fecd. My
> previous fix is based on presumptions that memory_reserve should reserve memory
> added by memblock_add{,_node}, if going across this limitation, there need
> to set the valid nid for reserved memory region.
> >
> > > >
> > > > > > > In practical we run into uninitialized nid of reserved block problem, should
> > > > > > > we fix it
> > > > > > > in our usage, or on memblock side?
> > > > > > Apparently it's a bug in memblock :(
> > > > > >
> > > > > > If you revert 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
> > > > > > does the issue disappear?
> > > > > Yes, I git bisect this commit.
> > > > >
> > > > > But I don't think it is a bug in memblock. IMO, memblock_reserve under
> > > > > NUMA set nid of reserved region to MAX_NUMNODES, which is the point
> > > > > that cause the "memblock_get_region_node from memmap_init_reserved_pages "
> > > > > passing a invalid node id(aka MAX_NUMNODES) to "reserver_bootmem_region
> > > > > -> init_reserved_page -> early_pfn_to_nid". If arch-specific code DOES NOT
> > > > > initialize the nid of reserved region(only it know that), or the reserved
> > > > > region NOT within a memblock added by memblock_add, memblock can not
> > > > > give a valid node id to the reserved region. Commit 61167ad5fecd ("mm: pass nid to
> > > > > reserve_bootmem_region()") just reveals the embarrassment case by an
> > > > > out of bound memory access.
> > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > 在 2023/12/25 09:30, Huang Pei 写道:
> > > > > > > > Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
> > > > > > > > loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
> > > > > > > > this:
> > > > > > > > ----------------------------------------------------------------------
> > > > > > > > Call Trace:
> > > > > > > > [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
> > > > > > > > [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
> > > > > > > > [<ffffffff8231d8e4>] mem_init+0x84/0x94
> > > > > > > > [<ffffffff82330958>] mm_core_init+0xf8/0x308
> > > > > > > > [<ffffffff82318c38>] start_kernel+0x43c/0x86c
> > > > > > > >
> > > > > > > > Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
> > > > > > > > 64420070 00003025 24040003
> > > > > > > >
> > > > > > > > ---[ end trace 0000000000000000 ]---
> > > > > > > > Kernel panic - not syncing: Attempted to kill the idle task!
> > > > > > > > ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> > > > > > > > ----------------------------------------------------------------------
> > > > > > > >
> > > > > > > > The root cause is no memory region "0x0-0x1fffff" paired with
> > > > > > > > memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
> > > > > > > > =debug":
> > > > > > > >
> > > > > > > > ----------------------------------------------------------------------
> > > > > > > > memory[0x0] [0x0000000000200000-0x000000000effffff],
> > > > > > > > 0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
> > > > > > > > memory[0x1] [0x0000000090000000-0x00000000fdffffff],
> > > > > > > > 0x000000006e000000 bytes on node 0 flags: 0x0
> > > > > > > > memory[0x2] [0x0000000100000000-0x000000027fffffff],
> > > > > > > > 0x0000000180000000 bytes on node 0 flags: 0x0
> > > > > > > > memory[0x3] [0x0000100000000000-0x000010000fffffff],
> > > > > > > > 0x0000000010000000 bytes on node 1 flags: 0x0
> > > > > > > > memory[0x4] [0x0000100090000000-0x000010027fffffff],
> > > > > > > > 0x00000001f0000000 bytes on node 1 flags: 0x0
> > > > > > > > reserved.cnt = 0x1f
> > > > > > > > reserved[0x0] [0x0000000000000000-0x000000000190c80a],
> > > > > > > > 0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
> > > > > > > > reserved[0x1] [0x000000000190c810-0x000000000190eea3],
> > > > > > > > 0x0000000000002694 bytes flags: 0x0
> > > > > > > > ----------------------------------------------------------------------
> > > > > > > >
> > > > > > > > It caused memory-reserved region "0x0-0x1fffff" without valid node id
> > > > > > > > in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
> > > > > > > > "reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
> > > > > > > > accessing "node_data" out of bound.
> > > > > > > >
> > > > > > > > To fix this bug, we should remove unnecessary memory block reservation.
> > > > > > > >
> > > > > > > > +. no need to reserve 0x0-0x1fffff below kernel loading address, since
> > > > > > > > it is not registered by "memblock_add_node"
> > > > > > > >
> > > > > > > > +. no need to reserve 0x0-0xfff for exception handling if it is not
> > > > > > > > registered by "memblock_add" either.
> > > > > > > >
> > > > > > > > Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
> > > > > > > > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > > > > > > > ---
> > > > > > > > arch/mips/kernel/traps.c | 3 ++-
> > > > > > > > arch/mips/loongson64/numa.c | 2 --
> > > > > > > > 2 files changed, 2 insertions(+), 3 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> > > > > > > > index 246c6a6b0261..9b632b4c10c3 100644
> > > > > > > > --- a/arch/mips/kernel/traps.c
> > > > > > > > +++ b/arch/mips/kernel/traps.c
> > > > > > > > @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
> > > > > > > > void reserve_exception_space(phys_addr_t addr, unsigned long size)
> > > > > > > > {
> > > > > > > > - memblock_reserve(addr, size);
> > > > > > > > + if(memblock_is_region_memory(addr, size))
> > > > > > > > + memblock_reserve(addr, size);
> > > > > > > > }
> > > > > > > > void __init *set_except_vector(int n, void *addr)
> > > > > > > > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > > > > > > > index 8f61e93c0c5b..0f516dde81da 100644
> > > > > > > > --- a/arch/mips/loongson64/numa.c
> > > > > > > > +++ b/arch/mips/loongson64/numa.c
> > > > > > > > @@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
> > > > > > > > memblock_reserve((node_addrspace_offset | 0xfe000000),
> > > > > > > > 32 << 20);
> > > > > > > > - /* Reserve pfn range 0~node[0]->node_start_pfn */
> > > > > > > > - memblock_reserve(0, PAGE_SIZE * start_pfn);
> > > > > > > > }
> > > > > > > > }
> > > > > > > --
> > > > > > > ---
> > > > > > > Jiaxun Yang
> > > > > > >
> > > > > > --
> > > > > > Sincerely yours,
> > > > > > Mike.
>
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure)
2024-01-17 6:46 ` Mike Rapoport
@ 2024-01-17 7:45 ` Huang Pei
2024-01-17 11:08 ` Mike Rapoport
0 siblings, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-17 7:45 UTC (permalink / raw)
To: Mike Rapoport
Cc: Yajun Deng, Jiaxun Yang, linux-mm, Bibo Mao, linux-mips,
Paul Burton, Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen,
Thomas Bogendoerfer, linux-kernel
On Wed, Jan 17, 2024 at 08:46:34AM +0200, Mike Rapoport wrote:
> On Wed, Jan 17, 2024 at 11:59:10AM +0800, Huang Pei wrote:
> > On Wed, Jan 17, 2024 at 11:17:18AM +0800, Yajun Deng wrote:
> > >
> > > On 2024/1/17 11:01, Huang Pei wrote:
> > > > On Wed, Jan 17, 2024 at 10:20:00AM +0800, Yajun Deng wrote:
> > > > > On 2024/1/16 20:23, Huang Pei wrote:
> > > > > > On Tue, Jan 16, 2024 at 10:39:04AM +0200, Mike Rapoport wrote:
> > > > > > > On Mon, Jan 15, 2024 at 02:08:21PM +0000, Jiaxun Yang wrote:
> > > > > > > > Hi mm folks,
> > > > > > > >
> > > > > > > > Just a quick question, what is the expected behavior of memblock_reserve
> > > > > > > > a region that is not added to memblock with memblock_add?
> > > > > > > >
> > > > > > > > I'm unable to find any documentation about memblock_reserve in comments and
> > > > > > > > boot-time-mm, but as per my understanding to the code, this should be a
> > > > > > > > legit usage?
> > > > > > > Yes, memblock allows reserving memory that was not added to memblock with
> > > > > > > memblock_add().
> > > > > > I think arch/platform specific code should fix this bug, like,
> > > > > > --------------------------------------------------------------------------
> > > > > > //for loongson64
> > > > > > memblock_set_node(0, 1ULL << 44, &memblock.reserved, 0);
> > > > > >
> > > > > > --------------------------------------------------------------------------
> > > > > >
> > > > > > or maybe memblock provide something like memblock_reserve_node
> > > > > Hi pei,
> > > > >
> > > > > Can you test the following patch to see if it fixes this bug?
> > > > >
> > > > > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > > > > index 2c19f5515e36..97721d99fdce 100644
> > > > > --- a/mm/mm_init.c
> > > > > +++ b/mm/mm_init.c
> > > > > @@ -708,6 +708,9 @@ static void __meminit init_reserved_page(unsigned long
> > > > > pfn, int nid)
> > > > > pg_data_t *pgdat;
> > > > > int zid;
> > > > >
> > > > > + if (unlikely(nid == NUMA_NO_NODE || nid >= MAX_NUMNODES))
> > > > > + nid = early_pfn_to_nid(pfn);
> > > > > +
> > > > > if (early_page_initialised(pfn, nid))
> > > > > return;
>
> IMO this will fix the bug. Before 61167ad5fecd we had nid = first_online_node
> for reserved pages that didn't have nid set in memblock.reserved. After
> 61167ad5fecd we try to initialize these pages with MAX_NUMNODES and
> obviously crash when accessing node structure.
>
> I think that the check for a valid nid should be moved to
> memmap_init_reserved_pages() though. An entire reserved region will have
> nid set to MAX_NUMNODES, so there is no point to check every page in it.
>
> > > > I do not think this fix set the right nid, ONLY arch/platform know that
> > > > nid
>
> Why does it matter to have the right nid in a reserved page that is not
> part of usable memory?
>
IMO, if a reserved page DO have a valid nid, and archs knows that, archs
should set it right, and this is just the case of loongson64(and
loongarch).
> That's true that only arch knows on which node those reserved pages are,
> but core mm does not care about reserved pages that are not in memory.
>
> > > > int __meminit early_pfn_to_nid(unsigned long pfn)
> > > > {
> > > > static DEFINE_SPINLOCK(early_pfn_lock);
> > > > int nid;
> > > >
> > > > spin_lock(&early_pfn_lock);
> > > > nid = __early_pfn_to_nid(pfn,
> > > > &early_pfnnid_cache);
> > > > if (nid < 0)
> > > > //!!!first_online_node MAY NOT be the node the pfn belong to!!!
> > > > nid = first_online_node;
> > > >
> > > > spin_unlock(&early_pfn_lock);
> > > >
> > > > return
> > > > nid;
> > > > }
> > >
> > >
> > > Okay, I don't think this bug is caused by commit 61167ad5fecd ("mm: pass nid
> > > to reserve_bootmem_region()"),
> > >
> > > because even if you revert this commit, it will still get nid by
> > > early_pfn_to_nid(). Did I get that right?
> >
> > Yes, more accurately, this bug is exposed by commit 61167ad5fecd. My
> > previous fix is based on presumptions that memory_reserve should reserve memory
> > added by memblock_add{,_node}, if going across this limitation, there need
> > to set the valid nid for reserved memory region.
> > >
> > > > >
> > > > > > > > In practical we run into uninitialized nid of reserved block problem, should
> > > > > > > > we fix it
> > > > > > > > in our usage, or on memblock side?
> > > > > > > Apparently it's a bug in memblock :(
> > > > > > >
> > > > > > > If you revert 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
> > > > > > > does the issue disappear?
> > > > > > Yes, I git bisect this commit.
> > > > > >
> > > > > > But I don't think it is a bug in memblock. IMO, memblock_reserve under
> > > > > > NUMA set nid of reserved region to MAX_NUMNODES, which is the point
> > > > > > that cause the "memblock_get_region_node from memmap_init_reserved_pages "
> > > > > > passing a invalid node id(aka MAX_NUMNODES) to "reserver_bootmem_region
> > > > > > -> init_reserved_page -> early_pfn_to_nid". If arch-specific code DOES NOT
> > > > > > initialize the nid of reserved region(only it know that), or the reserved
> > > > > > region NOT within a memblock added by memblock_add, memblock can not
> > > > > > give a valid node id to the reserved region. Commit 61167ad5fecd ("mm: pass nid to
> > > > > > reserve_bootmem_region()") just reveals the embarrassment case by an
> > > > > > out of bound memory access.
> > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > 在 2023/12/25 09:30, Huang Pei 写道:
> > > > > > > > > Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
> > > > > > > > > loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
> > > > > > > > > this:
> > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > > Call Trace:
> > > > > > > > > [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
> > > > > > > > > [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
> > > > > > > > > [<ffffffff8231d8e4>] mem_init+0x84/0x94
> > > > > > > > > [<ffffffff82330958>] mm_core_init+0xf8/0x308
> > > > > > > > > [<ffffffff82318c38>] start_kernel+0x43c/0x86c
> > > > > > > > >
> > > > > > > > > Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
> > > > > > > > > 64420070 00003025 24040003
> > > > > > > > >
> > > > > > > > > ---[ end trace 0000000000000000 ]---
> > > > > > > > > Kernel panic - not syncing: Attempted to kill the idle task!
> > > > > > > > > ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > >
> > > > > > > > > The root cause is no memory region "0x0-0x1fffff" paired with
> > > > > > > > > memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
> > > > > > > > > =debug":
> > > > > > > > >
> > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > > memory[0x0] [0x0000000000200000-0x000000000effffff],
> > > > > > > > > 0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
> > > > > > > > > memory[0x1] [0x0000000090000000-0x00000000fdffffff],
> > > > > > > > > 0x000000006e000000 bytes on node 0 flags: 0x0
> > > > > > > > > memory[0x2] [0x0000000100000000-0x000000027fffffff],
> > > > > > > > > 0x0000000180000000 bytes on node 0 flags: 0x0
> > > > > > > > > memory[0x3] [0x0000100000000000-0x000010000fffffff],
> > > > > > > > > 0x0000000010000000 bytes on node 1 flags: 0x0
> > > > > > > > > memory[0x4] [0x0000100090000000-0x000010027fffffff],
> > > > > > > > > 0x00000001f0000000 bytes on node 1 flags: 0x0
> > > > > > > > > reserved.cnt = 0x1f
> > > > > > > > > reserved[0x0] [0x0000000000000000-0x000000000190c80a],
> > > > > > > > > 0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
> > > > > > > > > reserved[0x1] [0x000000000190c810-0x000000000190eea3],
> > > > > > > > > 0x0000000000002694 bytes flags: 0x0
> > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > >
> > > > > > > > > It caused memory-reserved region "0x0-0x1fffff" without valid node id
> > > > > > > > > in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
> > > > > > > > > "reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
> > > > > > > > > accessing "node_data" out of bound.
> > > > > > > > >
> > > > > > > > > To fix this bug, we should remove unnecessary memory block reservation.
> > > > > > > > >
> > > > > > > > > +. no need to reserve 0x0-0x1fffff below kernel loading address, since
> > > > > > > > > it is not registered by "memblock_add_node"
> > > > > > > > >
> > > > > > > > > +. no need to reserve 0x0-0xfff for exception handling if it is not
> > > > > > > > > registered by "memblock_add" either.
> > > > > > > > >
> > > > > > > > > Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
> > > > > > > > > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > > > > > > > > ---
> > > > > > > > > arch/mips/kernel/traps.c | 3 ++-
> > > > > > > > > arch/mips/loongson64/numa.c | 2 --
> > > > > > > > > 2 files changed, 2 insertions(+), 3 deletions(-)
> > > > > > > > >
> > > > > > > > > diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> > > > > > > > > index 246c6a6b0261..9b632b4c10c3 100644
> > > > > > > > > --- a/arch/mips/kernel/traps.c
> > > > > > > > > +++ b/arch/mips/kernel/traps.c
> > > > > > > > > @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
> > > > > > > > > void reserve_exception_space(phys_addr_t addr, unsigned long size)
> > > > > > > > > {
> > > > > > > > > - memblock_reserve(addr, size);
> > > > > > > > > + if(memblock_is_region_memory(addr, size))
> > > > > > > > > + memblock_reserve(addr, size);
> > > > > > > > > }
> > > > > > > > > void __init *set_except_vector(int n, void *addr)
> > > > > > > > > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > > > > > > > > index 8f61e93c0c5b..0f516dde81da 100644
> > > > > > > > > --- a/arch/mips/loongson64/numa.c
> > > > > > > > > +++ b/arch/mips/loongson64/numa.c
> > > > > > > > > @@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
> > > > > > > > > memblock_reserve((node_addrspace_offset | 0xfe000000),
> > > > > > > > > 32 << 20);
> > > > > > > > > - /* Reserve pfn range 0~node[0]->node_start_pfn */
> > > > > > > > > - memblock_reserve(0, PAGE_SIZE * start_pfn);
> > > > > > > > > }
> > > > > > > > > }
> > > > > > > > --
> > > > > > > > ---
> > > > > > > > Jiaxun Yang
> > > > > > > >
> > > > > > > --
> > > > > > > Sincerely yours,
> > > > > > > Mike.
> >
>
> --
> Sincerely yours,
> Mike.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure)
2024-01-17 7:45 ` Huang Pei
@ 2024-01-17 11:08 ` Mike Rapoport
2024-01-18 2:26 ` Huang Pei
0 siblings, 1 reply; 50+ messages in thread
From: Mike Rapoport @ 2024-01-17 11:08 UTC (permalink / raw)
To: Huang Pei
Cc: Yajun Deng, Jiaxun Yang, linux-mm, Bibo Mao, linux-mips,
Paul Burton, Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen,
Thomas Bogendoerfer, linux-kernel
On Wed, Jan 17, 2024 at 03:45:46PM +0800, Huang Pei wrote:
> On Wed, Jan 17, 2024 at 08:46:34AM +0200, Mike Rapoport wrote:
> > On Wed, Jan 17, 2024 at 11:59:10AM +0800, Huang Pei wrote:
> > > On Wed, Jan 17, 2024 at 11:17:18AM +0800, Yajun Deng wrote:
> > > >
> > > > On 2024/1/17 11:01, Huang Pei wrote:
> > > > > On Wed, Jan 17, 2024 at 10:20:00AM +0800, Yajun Deng wrote:
> > > > > > On 2024/1/16 20:23, Huang Pei wrote:
> > > > > > > On Tue, Jan 16, 2024 at 10:39:04AM +0200, Mike Rapoport wrote:
> > > > > > > > On Mon, Jan 15, 2024 at 02:08:21PM +0000, Jiaxun Yang wrote:
> > > > > > > > > Hi mm folks,
> > > > > > > > >
> > > > > > > > > Just a quick question, what is the expected behavior of memblock_reserve
> > > > > > > > > a region that is not added to memblock with memblock_add?
> > > > > > > > >
> > > > > > > > > I'm unable to find any documentation about memblock_reserve in comments and
> > > > > > > > > boot-time-mm, but as per my understanding to the code, this should be a
> > > > > > > > > legit usage?
> > > > > > > > Yes, memblock allows reserving memory that was not added to memblock with
> > > > > > > > memblock_add().
> > > > > > > I think arch/platform specific code should fix this bug, like,
> > > > > > > --------------------------------------------------------------------------
> > > > > > > //for loongson64
> > > > > > > memblock_set_node(0, 1ULL << 44, &memblock.reserved, 0);
> > > > > > >
> > > > > > > --------------------------------------------------------------------------
> > > > > > >
> > > > > > > or maybe memblock provide something like memblock_reserve_node
> > > > > > Hi pei,
> > > > > >
> > > > > > Can you test the following patch to see if it fixes this bug?
> > > > > >
> > > > > > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > > > > > index 2c19f5515e36..97721d99fdce 100644
> > > > > > --- a/mm/mm_init.c
> > > > > > +++ b/mm/mm_init.c
> > > > > > @@ -708,6 +708,9 @@ static void __meminit init_reserved_page(unsigned long
> > > > > > pfn, int nid)
> > > > > > pg_data_t *pgdat;
> > > > > > int zid;
> > > > > >
> > > > > > + if (unlikely(nid == NUMA_NO_NODE || nid >= MAX_NUMNODES))
> > > > > > + nid = early_pfn_to_nid(pfn);
> > > > > > +
> > > > > > if (early_page_initialised(pfn, nid))
> > > > > > return;
> >
> > IMO this will fix the bug. Before 61167ad5fecd we had nid = first_online_node
> > for reserved pages that didn't have nid set in memblock.reserved. After
> > 61167ad5fecd we try to initialize these pages with MAX_NUMNODES and
> > obviously crash when accessing node structure.
> >
> > I think that the check for a valid nid should be moved to
> > memmap_init_reserved_pages() though. An entire reserved region will have
> > nid set to MAX_NUMNODES, so there is no point to check every page in it.
> >
> > > > > I do not think this fix set the right nid, ONLY arch/platform know that
> > > > > nid
> >
> > Why does it matter to have the right nid in a reserved page that is not
> > part of usable memory?
> >
> IMO, if a reserved page DO have a valid nid, and archs knows that, archs
> should set it right, and this is just the case of loongson64(and
> loongarch).
An arch may choose to set nids for reserved regions that are never added to
memblock.memory, but mm shouldn't crash if it didn't.
> > That's true that only arch knows on which node those reserved pages are,
> > but core mm does not care about reserved pages that are not in memory.
> >
> > > > > int __meminit early_pfn_to_nid(unsigned long pfn)
> > > > > {
> > > > > static DEFINE_SPINLOCK(early_pfn_lock);
> > > > > int nid;
> > > > >
> > > > > spin_lock(&early_pfn_lock);
> > > > > nid = __early_pfn_to_nid(pfn,
> > > > > &early_pfnnid_cache);
> > > > > if (nid < 0)
> > > > > //!!!first_online_node MAY NOT be the node the pfn belong to!!!
> > > > > nid = first_online_node;
> > > > >
> > > > > spin_unlock(&early_pfn_lock);
> > > > >
> > > > > return
> > > > > nid;
> > > > > }
> > > >
> > > >
> > > > Okay, I don't think this bug is caused by commit 61167ad5fecd ("mm: pass nid
> > > > to reserve_bootmem_region()"),
> > > >
> > > > because even if you revert this commit, it will still get nid by
> > > > early_pfn_to_nid(). Did I get that right?
> > >
> > > Yes, more accurately, this bug is exposed by commit 61167ad5fecd. My
> > > previous fix is based on presumptions that memory_reserve should reserve memory
> > > added by memblock_add{,_node}, if going across this limitation, there need
> > > to set the valid nid for reserved memory region.
> > > >
> > > > > >
> > > > > > > > > In practical we run into uninitialized nid of reserved block problem, should
> > > > > > > > > we fix it
> > > > > > > > > in our usage, or on memblock side?
> > > > > > > > Apparently it's a bug in memblock :(
> > > > > > > >
> > > > > > > > If you revert 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
> > > > > > > > does the issue disappear?
> > > > > > > Yes, I git bisect this commit.
> > > > > > >
> > > > > > > But I don't think it is a bug in memblock. IMO, memblock_reserve under
> > > > > > > NUMA set nid of reserved region to MAX_NUMNODES, which is the point
> > > > > > > that cause the "memblock_get_region_node from memmap_init_reserved_pages "
> > > > > > > passing a invalid node id(aka MAX_NUMNODES) to "reserver_bootmem_region
> > > > > > > -> init_reserved_page -> early_pfn_to_nid". If arch-specific code DOES NOT
> > > > > > > initialize the nid of reserved region(only it know that), or the reserved
> > > > > > > region NOT within a memblock added by memblock_add, memblock can not
> > > > > > > give a valid node id to the reserved region. Commit 61167ad5fecd ("mm: pass nid to
> > > > > > > reserve_bootmem_region()") just reveals the embarrassment case by an
> > > > > > > out of bound memory access.
> > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > 在 2023/12/25 09:30, Huang Pei 写道:
> > > > > > > > > > Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
> > > > > > > > > > loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
> > > > > > > > > > this:
> > > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > > > Call Trace:
> > > > > > > > > > [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
> > > > > > > > > > [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
> > > > > > > > > > [<ffffffff8231d8e4>] mem_init+0x84/0x94
> > > > > > > > > > [<ffffffff82330958>] mm_core_init+0xf8/0x308
> > > > > > > > > > [<ffffffff82318c38>] start_kernel+0x43c/0x86c
> > > > > > > > > >
> > > > > > > > > > Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
> > > > > > > > > > 64420070 00003025 24040003
> > > > > > > > > >
> > > > > > > > > > ---[ end trace 0000000000000000 ]---
> > > > > > > > > > Kernel panic - not syncing: Attempted to kill the idle task!
> > > > > > > > > > ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> > > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > > >
> > > > > > > > > > The root cause is no memory region "0x0-0x1fffff" paired with
> > > > > > > > > > memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
> > > > > > > > > > =debug":
> > > > > > > > > >
> > > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > > > memory[0x0] [0x0000000000200000-0x000000000effffff],
> > > > > > > > > > 0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
> > > > > > > > > > memory[0x1] [0x0000000090000000-0x00000000fdffffff],
> > > > > > > > > > 0x000000006e000000 bytes on node 0 flags: 0x0
> > > > > > > > > > memory[0x2] [0x0000000100000000-0x000000027fffffff],
> > > > > > > > > > 0x0000000180000000 bytes on node 0 flags: 0x0
> > > > > > > > > > memory[0x3] [0x0000100000000000-0x000010000fffffff],
> > > > > > > > > > 0x0000000010000000 bytes on node 1 flags: 0x0
> > > > > > > > > > memory[0x4] [0x0000100090000000-0x000010027fffffff],
> > > > > > > > > > 0x00000001f0000000 bytes on node 1 flags: 0x0
> > > > > > > > > > reserved.cnt = 0x1f
> > > > > > > > > > reserved[0x0] [0x0000000000000000-0x000000000190c80a],
> > > > > > > > > > 0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
> > > > > > > > > > reserved[0x1] [0x000000000190c810-0x000000000190eea3],
> > > > > > > > > > 0x0000000000002694 bytes flags: 0x0
> > > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > > >
> > > > > > > > > > It caused memory-reserved region "0x0-0x1fffff" without valid node id
> > > > > > > > > > in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
> > > > > > > > > > "reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
> > > > > > > > > > accessing "node_data" out of bound.
> > > > > > > > > >
> > > > > > > > > > To fix this bug, we should remove unnecessary memory block reservation.
> > > > > > > > > >
> > > > > > > > > > +. no need to reserve 0x0-0x1fffff below kernel loading address, since
> > > > > > > > > > it is not registered by "memblock_add_node"
> > > > > > > > > >
> > > > > > > > > > +. no need to reserve 0x0-0xfff for exception handling if it is not
> > > > > > > > > > registered by "memblock_add" either.
> > > > > > > > > >
> > > > > > > > > > Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
> > > > > > > > > > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > > > > > > > > > ---
> > > > > > > > > > arch/mips/kernel/traps.c | 3 ++-
> > > > > > > > > > arch/mips/loongson64/numa.c | 2 --
> > > > > > > > > > 2 files changed, 2 insertions(+), 3 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> > > > > > > > > > index 246c6a6b0261..9b632b4c10c3 100644
> > > > > > > > > > --- a/arch/mips/kernel/traps.c
> > > > > > > > > > +++ b/arch/mips/kernel/traps.c
> > > > > > > > > > @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
> > > > > > > > > > void reserve_exception_space(phys_addr_t addr, unsigned long size)
> > > > > > > > > > {
> > > > > > > > > > - memblock_reserve(addr, size);
> > > > > > > > > > + if(memblock_is_region_memory(addr, size))
> > > > > > > > > > + memblock_reserve(addr, size);
> > > > > > > > > > }
> > > > > > > > > > void __init *set_except_vector(int n, void *addr)
> > > > > > > > > > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > > > > > > > > > index 8f61e93c0c5b..0f516dde81da 100644
> > > > > > > > > > --- a/arch/mips/loongson64/numa.c
> > > > > > > > > > +++ b/arch/mips/loongson64/numa.c
> > > > > > > > > > @@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
> > > > > > > > > > memblock_reserve((node_addrspace_offset | 0xfe000000),
> > > > > > > > > > 32 << 20);
> > > > > > > > > > - /* Reserve pfn range 0~node[0]->node_start_pfn */
> > > > > > > > > > - memblock_reserve(0, PAGE_SIZE * start_pfn);
> > > > > > > > > > }
> > > > > > > > > > }
> > > > > > > > > --
> > > > > > > > > ---
> > > > > > > > > Jiaxun Yang
> > > > > > > > >
> > > > > > > > --
> > > > > > > > Sincerely yours,
> > > > > > > > Mike.
> > >
> >
> > --
> > Sincerely yours,
> > Mike.
>
--
Sincerely yours,
Mike.
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure)
2024-01-17 11:08 ` Mike Rapoport
@ 2024-01-18 2:26 ` Huang Pei
0 siblings, 0 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-18 2:26 UTC (permalink / raw)
To: Mike Rapoport
Cc: Yajun Deng, Jiaxun Yang, linux-mm, Bibo Mao, linux-mips,
Paul Burton, Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen,
Thomas Bogendoerfer, linux-kernel
On Wed, Jan 17, 2024 at 01:08:48PM +0200, Mike Rapoport wrote:
> On Wed, Jan 17, 2024 at 03:45:46PM +0800, Huang Pei wrote:
> > On Wed, Jan 17, 2024 at 08:46:34AM +0200, Mike Rapoport wrote:
> > > On Wed, Jan 17, 2024 at 11:59:10AM +0800, Huang Pei wrote:
> > > > On Wed, Jan 17, 2024 at 11:17:18AM +0800, Yajun Deng wrote:
> > > > >
> > > > > On 2024/1/17 11:01, Huang Pei wrote:
> > > > > > On Wed, Jan 17, 2024 at 10:20:00AM +0800, Yajun Deng wrote:
> > > > > > > On 2024/1/16 20:23, Huang Pei wrote:
> > > > > > > > On Tue, Jan 16, 2024 at 10:39:04AM +0200, Mike Rapoport wrote:
> > > > > > > > > On Mon, Jan 15, 2024 at 02:08:21PM +0000, Jiaxun Yang wrote:
> > > > > > > > > > Hi mm folks,
> > > > > > > > > >
> > > > > > > > > > Just a quick question, what is the expected behavior of memblock_reserve
> > > > > > > > > > a region that is not added to memblock with memblock_add?
> > > > > > > > > >
> > > > > > > > > > I'm unable to find any documentation about memblock_reserve in comments and
> > > > > > > > > > boot-time-mm, but as per my understanding to the code, this should be a
> > > > > > > > > > legit usage?
> > > > > > > > > Yes, memblock allows reserving memory that was not added to memblock with
> > > > > > > > > memblock_add().
> > > > > > > > I think arch/platform specific code should fix this bug, like,
> > > > > > > > --------------------------------------------------------------------------
> > > > > > > > //for loongson64
> > > > > > > > memblock_set_node(0, 1ULL << 44, &memblock.reserved, 0);
> > > > > > > >
> > > > > > > > --------------------------------------------------------------------------
> > > > > > > >
> > > > > > > > or maybe memblock provide something like memblock_reserve_node
> > > > > > > Hi pei,
> > > > > > >
> > > > > > > Can you test the following patch to see if it fixes this bug?
> > > > > > >
> > > > > > > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > > > > > > index 2c19f5515e36..97721d99fdce 100644
> > > > > > > --- a/mm/mm_init.c
> > > > > > > +++ b/mm/mm_init.c
> > > > > > > @@ -708,6 +708,9 @@ static void __meminit init_reserved_page(unsigned long
> > > > > > > pfn, int nid)
> > > > > > > pg_data_t *pgdat;
> > > > > > > int zid;
> > > > > > >
> > > > > > > + if (unlikely(nid == NUMA_NO_NODE || nid >= MAX_NUMNODES))
> > > > > > > + nid = early_pfn_to_nid(pfn);
> > > > > > > +
> > > > > > > if (early_page_initialised(pfn, nid))
> > > > > > > return;
> > >
> > > IMO this will fix the bug. Before 61167ad5fecd we had nid = first_online_node
> > > for reserved pages that didn't have nid set in memblock.reserved. After
> > > 61167ad5fecd we try to initialize these pages with MAX_NUMNODES and
> > > obviously crash when accessing node structure.
> > >
> > > I think that the check for a valid nid should be moved to
> > > memmap_init_reserved_pages() though. An entire reserved region will have
> > > nid set to MAX_NUMNODES, so there is no point to check every page in it.
> > >
> > > > > > I do not think this fix set the right nid, ONLY arch/platform know that
> > > > > > nid
> > >
> > > Why does it matter to have the right nid in a reserved page that is not
> > > part of usable memory?
> > >
> > IMO, if a reserved page DO have a valid nid, and archs knows that, archs
> > should set it right, and this is just the case of loongson64(and
> > loongarch).
I will set the nid for reserved page on loongson64, just like what
loongarch did.
> An arch may choose to set nids for reserved regions that are never added to
> memblock.memory, but mm shouldn't crash if it didn't.
>
I agree.
> > > That's true that only arch knows on which node those reserved pages are,
> > > but core mm does not care about reserved pages that are not in memory.
> > >
> > > > > > int __meminit early_pfn_to_nid(unsigned long pfn)
> > > > > > {
> > > > > > static DEFINE_SPINLOCK(early_pfn_lock);
> > > > > > int nid;
> > > > > >
> > > > > > spin_lock(&early_pfn_lock);
> > > > > > nid = __early_pfn_to_nid(pfn,
> > > > > > &early_pfnnid_cache);
> > > > > > if (nid < 0)
> > > > > > //!!!first_online_node MAY NOT be the node the pfn belong to!!!
> > > > > > nid = first_online_node;
> > > > > >
> > > > > > spin_unlock(&early_pfn_lock);
> > > > > >
> > > > > > return
> > > > > > nid;
> > > > > > }
> > > > >
> > > > >
> > > > > Okay, I don't think this bug is caused by commit 61167ad5fecd ("mm: pass nid
> > > > > to reserve_bootmem_region()"),
> > > > >
> > > > > because even if you revert this commit, it will still get nid by
> > > > > early_pfn_to_nid(). Did I get that right?
> > > >
> > > > Yes, more accurately, this bug is exposed by commit 61167ad5fecd. My
> > > > previous fix is based on presumptions that memory_reserve should reserve memory
> > > > added by memblock_add{,_node}, if going across this limitation, there need
> > > > to set the valid nid for reserved memory region.
> > > > >
> > > > > > >
> > > > > > > > > > In practical we run into uninitialized nid of reserved block problem, should
> > > > > > > > > > we fix it
> > > > > > > > > > in our usage, or on memblock side?
> > > > > > > > > Apparently it's a bug in memblock :(
> > > > > > > > >
> > > > > > > > > If you revert 61167ad5fecd ("mm: pass nid to reserve_bootmem_region()")
> > > > > > > > > does the issue disappear?
> > > > > > > > Yes, I git bisect this commit.
> > > > > > > >
> > > > > > > > But I don't think it is a bug in memblock. IMO, memblock_reserve under
> > > > > > > > NUMA set nid of reserved region to MAX_NUMNODES, which is the point
> > > > > > > > that cause the "memblock_get_region_node from memmap_init_reserved_pages "
> > > > > > > > passing a invalid node id(aka MAX_NUMNODES) to "reserver_bootmem_region
> > > > > > > > -> init_reserved_page -> early_pfn_to_nid". If arch-specific code DOES NOT
> > > > > > > > initialize the nid of reserved region(only it know that), or the reserved
> > > > > > > > region NOT within a memblock added by memblock_add, memblock can not
> > > > > > > > give a valid node id to the reserved region. Commit 61167ad5fecd ("mm: pass nid to
> > > > > > > > reserve_bootmem_region()") just reveals the embarrassment case by an
> > > > > > > > out of bound memory access.
> > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > > >
> > > > > > > > > > 在 2023/12/25 09:30, Huang Pei 写道:
> > > > > > > > > > > Since commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()),
> > > > > > > > > > > loongson64 booting failed with CONFIG_DEFERRED_STRUCT_PAGE_INIT like
> > > > > > > > > > > this:
> > > > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > > > > Call Trace:
> > > > > > > > > > > [<ffffffff8235d088>] reserve_bootmem_region+0xa8/0x184
> > > > > > > > > > > [<ffffffff82333940>] memblock_free_all+0x104/0x2a8
> > > > > > > > > > > [<ffffffff8231d8e4>] mem_init+0x84/0x94
> > > > > > > > > > > [<ffffffff82330958>] mm_core_init+0xf8/0x308
> > > > > > > > > > > [<ffffffff82318c38>] start_kernel+0x43c/0x86c
> > > > > > > > > > >
> > > > > > > > > > > Code: 10400028 2402fff0 de420000 <dc432880> 0203182b 14600022
> > > > > > > > > > > 64420070 00003025 24040003
> > > > > > > > > > >
> > > > > > > > > > > ---[ end trace 0000000000000000 ]---
> > > > > > > > > > > Kernel panic - not syncing: Attempted to kill the idle task!
> > > > > > > > > > > ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> > > > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > > > >
> > > > > > > > > > > The root cause is no memory region "0x0-0x1fffff" paired with
> > > > > > > > > > > memory-reserved region "0x0-0x1fffff" and "0x0-0xfff", with "memblock
> > > > > > > > > > > =debug":
> > > > > > > > > > >
> > > > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > > > > memory[0x0] [0x0000000000200000-0x000000000effffff],
> > > > > > > > > > > 0x000000000ee00000 bytes on node 0 flags: 0x0 !!!!here
> > > > > > > > > > > memory[0x1] [0x0000000090000000-0x00000000fdffffff],
> > > > > > > > > > > 0x000000006e000000 bytes on node 0 flags: 0x0
> > > > > > > > > > > memory[0x2] [0x0000000100000000-0x000000027fffffff],
> > > > > > > > > > > 0x0000000180000000 bytes on node 0 flags: 0x0
> > > > > > > > > > > memory[0x3] [0x0000100000000000-0x000010000fffffff],
> > > > > > > > > > > 0x0000000010000000 bytes on node 1 flags: 0x0
> > > > > > > > > > > memory[0x4] [0x0000100090000000-0x000010027fffffff],
> > > > > > > > > > > 0x00000001f0000000 bytes on node 1 flags: 0x0
> > > > > > > > > > > reserved.cnt = 0x1f
> > > > > > > > > > > reserved[0x0] [0x0000000000000000-0x000000000190c80a],
> > > > > > > > > > > 0x000000000190c80b bytes flags: 0x0 !!!!oops 0x0-0x1fffff not in memory[0]
> > > > > > > > > > > reserved[0x1] [0x000000000190c810-0x000000000190eea3],
> > > > > > > > > > > 0x0000000000002694 bytes flags: 0x0
> > > > > > > > > > > ----------------------------------------------------------------------
> > > > > > > > > > >
> > > > > > > > > > > It caused memory-reserved region "0x0-0x1fffff" without valid node id
> > > > > > > > > > > in "memblock_get_region_node" from "memmap_init_reserved_pages", lead to
> > > > > > > > > > > "reserve_bootmem_region-> init_reserved_page -> early_pfn_to_nid()"
> > > > > > > > > > > accessing "node_data" out of bound.
> > > > > > > > > > >
> > > > > > > > > > > To fix this bug, we should remove unnecessary memory block reservation.
> > > > > > > > > > >
> > > > > > > > > > > +. no need to reserve 0x0-0x1fffff below kernel loading address, since
> > > > > > > > > > > it is not registered by "memblock_add_node"
> > > > > > > > > > >
> > > > > > > > > > > +. no need to reserve 0x0-0xfff for exception handling if it is not
> > > > > > > > > > > registered by "memblock_add" either.
> > > > > > > > > > >
> > > > > > > > > > > Fixes: commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region())
> > > > > > > > > > > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > > > > > > > > > > ---
> > > > > > > > > > > arch/mips/kernel/traps.c | 3 ++-
> > > > > > > > > > > arch/mips/loongson64/numa.c | 2 --
> > > > > > > > > > > 2 files changed, 2 insertions(+), 3 deletions(-)
> > > > > > > > > > >
> > > > > > > > > > > diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> > > > > > > > > > > index 246c6a6b0261..9b632b4c10c3 100644
> > > > > > > > > > > --- a/arch/mips/kernel/traps.c
> > > > > > > > > > > +++ b/arch/mips/kernel/traps.c
> > > > > > > > > > > @@ -2007,7 +2007,8 @@ unsigned long vi_handlers[64];
> > > > > > > > > > > void reserve_exception_space(phys_addr_t addr, unsigned long size)
> > > > > > > > > > > {
> > > > > > > > > > > - memblock_reserve(addr, size);
> > > > > > > > > > > + if(memblock_is_region_memory(addr, size))
> > > > > > > > > > > + memblock_reserve(addr, size);
> > > > > > > > > > > }
> > > > > > > > > > > void __init *set_except_vector(int n, void *addr)
> > > > > > > > > > > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > > > > > > > > > > index 8f61e93c0c5b..0f516dde81da 100644
> > > > > > > > > > > --- a/arch/mips/loongson64/numa.c
> > > > > > > > > > > +++ b/arch/mips/loongson64/numa.c
> > > > > > > > > > > @@ -130,8 +130,6 @@ static void __init node_mem_init(unsigned int node)
> > > > > > > > > > > memblock_reserve((node_addrspace_offset | 0xfe000000),
> > > > > > > > > > > 32 << 20);
> > > > > > > > > > > - /* Reserve pfn range 0~node[0]->node_start_pfn */
> > > > > > > > > > > - memblock_reserve(0, PAGE_SIZE * start_pfn);
> > > > > > > > > > > }
> > > > > > > > > > > }
> > > > > > > > > > --
> > > > > > > > > > ---
> > > > > > > > > > Jiaxun Yang
> > > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Sincerely yours,
> > > > > > > > > Mike.
> > > >
> > >
> > > --
> > > Sincerely yours,
> > > Mike.
> >
>
> --
> Sincerely yours,
> Mike.
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH V3]: MIPS: loongson64: fix booting failure
2024-01-09 21:40 ` Thomas Bogendoerfer
2024-01-13 9:55 ` [PATCH V2]: " Huang Pei
@ 2024-01-18 12:39 ` Huang Pei
2024-01-18 12:39 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
2024-01-18 12:39 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
2024-01-19 4:02 ` [PATCH V4]: MIPS: loongson64: fix boot failure Huang Pei
` (2 subsequent siblings)
4 siblings, 2 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-18 12:39 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
Patch 1:
Just reserve exception vector ONLY ONCE.
Patch 2:
workaround the boot failure. See [1] for the real fix
[1]: https://lore.kernel.org/all/Zad3ynuuu8S-51eI@kernel.org
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE
2024-01-18 12:39 ` [PATCH V3]: MIPS: loongson64: fix booting failure Huang Pei
@ 2024-01-18 12:39 ` Huang Pei
2024-01-18 12:39 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
1 sibling, 0 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-18 12:39 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
cpu_probe is called by BP an APs, but reserving exception vector
0x0-0x1000 need once. Call it on cpu 0.
Signed-off-by: Huang Pei <huangpei@loongson.cn>
---
arch/mips/kernel/cpu-probe.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index b406d8bfb15a..6939d0de2a03 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -1581,7 +1581,9 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips *c, unsigned int cpu)
__cpu_name[cpu] = "Broadcom BMIPS4380";
set_elf_platform(cpu, "bmips4380");
c->options |= MIPS_CPU_RIXI;
- reserve_exception_space(0x400, VECTORSPACING * 64);
+ if (cpu == 0) {
+ reserve_exception_space(0x400, VECTORSPACING * 64);
+ }
} else {
c->cputype = CPU_BMIPS4350;
__cpu_name[cpu] = "Broadcom BMIPS4350";
@@ -1598,7 +1600,9 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips *c, unsigned int cpu)
__cpu_name[cpu] = "Broadcom BMIPS5000";
set_elf_platform(cpu, "bmips5000");
c->options |= MIPS_CPU_ULRI | MIPS_CPU_RIXI;
- reserve_exception_space(0x1000, VECTORSPACING * 64);
+ if (cpu == 0) {
+ reserve_exception_space(0x1000, VECTORSPACING * 64);
+ }
break;
}
}
@@ -1992,12 +1996,13 @@ void cpu_probe(void)
*/
loongson3_cpucfg_synthesize_data(c);
+ if (cpu == 0) {
#ifdef CONFIG_64BIT
- if (cpu == 0)
__ua_limit = ~((1ull << cpu_vmbits) - 1);
#endif
+ reserve_exception_space(0, 0x1000);
+ }
- reserve_exception_space(0, 0x1000);
}
void cpu_report(void)
--
2.30.2
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region
2024-01-18 12:39 ` [PATCH V3]: MIPS: loongson64: fix booting failure Huang Pei
2024-01-18 12:39 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
@ 2024-01-18 12:39 ` Huang Pei
1 sibling, 0 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-18 12:39 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()) reveals
that reserved memblock regions have no valid node id set, just set it
right since loongson64 firmware makes it clear in memory layout info.
This work around booting failure on 3A1000+ since commit 61167ad5fecd
("mm: pass nid to reserve_bootmem_region()) under
CONFIG_DEFERRED_STRUCT_PAGE_INIT
Signed-off-by: Huang Pei <huangpei@loongson.cn>
---
arch/mips/loongson64/init.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
index f25caa6aa9d3..5bfabc67136a 100644
--- a/arch/mips/loongson64/init.c
+++ b/arch/mips/loongson64/init.c
@@ -97,6 +97,8 @@ void __init szmem(unsigned int node)
(u32)node_id, mem_type, &mem_start, &mem_size);
break;
}
+ /* set nid for reserved memory */
+ memblock_set_node(0, node_id << 44, &memblock.reserved, node_id);
}
/* Reserve vgabios if it comes from firmware */
--
2.30.2
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH V4]: MIPS: loongson64: fix boot failure
2024-01-09 21:40 ` Thomas Bogendoerfer
2024-01-13 9:55 ` [PATCH V2]: " Huang Pei
2024-01-18 12:39 ` [PATCH V3]: MIPS: loongson64: fix booting failure Huang Pei
@ 2024-01-19 4:02 ` Huang Pei
2024-01-19 4:02 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
2024-01-19 4:02 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
2024-01-22 8:08 ` [PATCH V6]: MIPS: loongson64: fix boot failure Huang Pei
2024-01-23 1:47 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
4 siblings, 2 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-19 4:02 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
Hi, Thomas
(Sorry, the Subject is not appropriate now, [1] fixed the bug)
Patch 1: adjust comment.
Patch 2: fix v3's bug. Test passed on 3B1500, with and without [1]
[1]:https://lore.kernel.org/all/ZajIQGtqUNWgBWkk@kernel.org/
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE
2024-01-19 4:02 ` [PATCH V4]: MIPS: loongson64: fix boot failure Huang Pei
@ 2024-01-19 4:02 ` Huang Pei
2024-01-19 15:23 ` Sergei Shtylyov
2024-01-19 16:15 ` Thomas Bogendoerfer
2024-01-19 4:02 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
1 sibling, 2 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-19 4:02 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
"cpu_probe" is called both by BP and APs, but reserving exception vector
(like 0x0-0x1000) called by "cpu_probe" need once and calling on BPs is
too late since memblock is unavailable at that time.
So, reserve exception vector ONLY by BP.
Signed-off-by: Huang Pei <huangpei@loongson.cn>
---
arch/mips/kernel/cpu-probe.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
index b406d8bfb15a..6939d0de2a03 100644
--- a/arch/mips/kernel/cpu-probe.c
+++ b/arch/mips/kernel/cpu-probe.c
@@ -1581,7 +1581,9 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips *c, unsigned int cpu)
__cpu_name[cpu] = "Broadcom BMIPS4380";
set_elf_platform(cpu, "bmips4380");
c->options |= MIPS_CPU_RIXI;
- reserve_exception_space(0x400, VECTORSPACING * 64);
+ if (cpu == 0) {
+ reserve_exception_space(0x400, VECTORSPACING * 64);
+ }
} else {
c->cputype = CPU_BMIPS4350;
__cpu_name[cpu] = "Broadcom BMIPS4350";
@@ -1598,7 +1600,9 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips *c, unsigned int cpu)
__cpu_name[cpu] = "Broadcom BMIPS5000";
set_elf_platform(cpu, "bmips5000");
c->options |= MIPS_CPU_ULRI | MIPS_CPU_RIXI;
- reserve_exception_space(0x1000, VECTORSPACING * 64);
+ if (cpu == 0) {
+ reserve_exception_space(0x1000, VECTORSPACING * 64);
+ }
break;
}
}
@@ -1992,12 +1996,13 @@ void cpu_probe(void)
*/
loongson3_cpucfg_synthesize_data(c);
+ if (cpu == 0) {
#ifdef CONFIG_64BIT
- if (cpu == 0)
__ua_limit = ~((1ull << cpu_vmbits) - 1);
#endif
+ reserve_exception_space(0, 0x1000);
+ }
- reserve_exception_space(0, 0x1000);
}
void cpu_report(void)
--
2.30.2
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region
2024-01-19 4:02 ` [PATCH V4]: MIPS: loongson64: fix boot failure Huang Pei
2024-01-19 4:02 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
@ 2024-01-19 4:02 ` Huang Pei
2024-01-19 10:05 ` Jiaxun Yang
1 sibling, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-19 4:02 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()) reveals
that reserved memblock regions have no valid node id set, just set it
right since loongson64 firmware makes it clear in memory layout info.
This works around booting failure on 3A1000+ since commit 61167ad5fecd
("mm: pass nid to reserve_bootmem_region()) under
CONFIG_DEFERRED_STRUCT_PAGE_INIT.
Signed-off-by: Huang Pei <huangpei@loongson.cn>
---
arch/mips/loongson64/init.c | 2 ++
arch/mips/loongson64/numa.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
index f25caa6aa9d3..000ba91c0886 100644
--- a/arch/mips/loongson64/init.c
+++ b/arch/mips/loongson64/init.c
@@ -103,6 +103,8 @@ void __init szmem(unsigned int node)
if (loongson_sysconf.vgabios_addr)
memblock_reserve(virt_to_phys((void *)loongson_sysconf.vgabios_addr),
SZ_256K);
+ /* set nid for reserved memory */
+ memblock_set_node((u64)node << 44, (u64)(node+1) << 44, &memblock.reserved, node);
}
#ifndef CONFIG_NUMA
diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
index 8f61e93c0c5b..6345e096c532 100644
--- a/arch/mips/loongson64/numa.c
+++ b/arch/mips/loongson64/numa.c
@@ -132,6 +132,8 @@ static void __init node_mem_init(unsigned int node)
/* Reserve pfn range 0~node[0]->node_start_pfn */
memblock_reserve(0, PAGE_SIZE * start_pfn);
+ /* set nid for reserved memory on node 0 */
+ memblock_set_node(0, (u64)1 << 44, &memblock.reserved, 1);
}
}
--
2.30.2
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region
2024-01-19 4:02 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
@ 2024-01-19 10:05 ` Jiaxun Yang
2024-01-21 2:14 ` Huang Pei
0 siblings, 1 reply; 50+ messages in thread
From: Jiaxun Yang @ 2024-01-19 10:05 UTC (permalink / raw)
To: Huang Pei, Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Paul Burton, Li Xuefeng, Yang Tiezhu,
Gao Juxin, Huacai Chen
在 2024/1/19 04:02, Huang Pei 写道:
> Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()) reveals
> that reserved memblock regions have no valid node id set, just set it
> right since loongson64 firmware makes it clear in memory layout info.
>
> This works around booting failure on 3A1000+ since commit 61167ad5fecd
> ("mm: pass nid to reserve_bootmem_region()) under
> CONFIG_DEFERRED_STRUCT_PAGE_INIT.
This should be done at MIPS arch level I guess.
Thanks
- Jiaxun
>
> Signed-off-by: Huang Pei <huangpei@loongson.cn>
> ---
> arch/mips/loongson64/init.c | 2 ++
> arch/mips/loongson64/numa.c | 2 ++
> 2 files changed, 4 insertions(+)
>
> diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
> index f25caa6aa9d3..000ba91c0886 100644
> --- a/arch/mips/loongson64/init.c
> +++ b/arch/mips/loongson64/init.c
> @@ -103,6 +103,8 @@ void __init szmem(unsigned int node)
> if (loongson_sysconf.vgabios_addr)
> memblock_reserve(virt_to_phys((void *)loongson_sysconf.vgabios_addr),
> SZ_256K);
> + /* set nid for reserved memory */
> + memblock_set_node((u64)node << 44, (u64)(node+1) << 44, &memblock.reserved, node);
> }
>
> #ifndef CONFIG_NUMA
> diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> index 8f61e93c0c5b..6345e096c532 100644
> --- a/arch/mips/loongson64/numa.c
> +++ b/arch/mips/loongson64/numa.c
> @@ -132,6 +132,8 @@ static void __init node_mem_init(unsigned int node)
>
> /* Reserve pfn range 0~node[0]->node_start_pfn */
> memblock_reserve(0, PAGE_SIZE * start_pfn);
> + /* set nid for reserved memory on node 0 */
> + memblock_set_node(0, (u64)1 << 44, &memblock.reserved, 1);
> }
> }
>
--
---
Jiaxun Yang
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE
2024-01-19 4:02 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
@ 2024-01-19 15:23 ` Sergei Shtylyov
2024-01-19 16:15 ` Thomas Bogendoerfer
1 sibling, 0 replies; 50+ messages in thread
From: Sergei Shtylyov @ 2024-01-19 15:23 UTC (permalink / raw)
To: Huang Pei, Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
On 1/19/24 7:02 AM, Huang Pei wrote:
> "cpu_probe" is called both by BP and APs, but reserving exception vector
> (like 0x0-0x1000) called by "cpu_probe" need once and calling on BPs is
> too late since memblock is unavailable at that time.
>
> So, reserve exception vector ONLY by BP.
>
> Signed-off-by: Huang Pei <huangpei@loongson.cn>
> ---
> arch/mips/kernel/cpu-probe.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
> index b406d8bfb15a..6939d0de2a03 100644
> --- a/arch/mips/kernel/cpu-probe.c
> +++ b/arch/mips/kernel/cpu-probe.c
> @@ -1581,7 +1581,9 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips *c, unsigned int cpu)
> __cpu_name[cpu] = "Broadcom BMIPS4380";
> set_elf_platform(cpu, "bmips4380");
> c->options |= MIPS_CPU_RIXI;
> - reserve_exception_space(0x400, VECTORSPACING * 64);
> + if (cpu == 0) {
No need for {} around single statement.
> + reserve_exception_space(0x400, VECTORSPACING * 64);
> + }
> } else {
> c->cputype = CPU_BMIPS4350;
> __cpu_name[cpu] = "Broadcom BMIPS4350";
> @@ -1598,7 +1600,9 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips *c, unsigned int cpu)
> __cpu_name[cpu] = "Broadcom BMIPS5000";
> set_elf_platform(cpu, "bmips5000");
> c->options |= MIPS_CPU_ULRI | MIPS_CPU_RIXI;
> - reserve_exception_space(0x1000, VECTORSPACING * 64);
> + if (cpu == 0) {
Ditto.
> + reserve_exception_space(0x1000, VECTORSPACING * 64);
> + }
> break;
> }
> }
[...]
MBR, Sergey
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE
2024-01-19 4:02 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
2024-01-19 15:23 ` Sergei Shtylyov
@ 2024-01-19 16:15 ` Thomas Bogendoerfer
2024-01-21 7:13 ` Huang Pei
1 sibling, 1 reply; 50+ messages in thread
From: Thomas Bogendoerfer @ 2024-01-19 16:15 UTC (permalink / raw)
To: Huang Pei
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
On Fri, Jan 19, 2024 at 12:02:39PM +0800, Huang Pei wrote:
> "cpu_probe" is called both by BP and APs, but reserving exception vector
> (like 0x0-0x1000) called by "cpu_probe" need once and calling on BPs is
> too late since memblock is unavailable at that time.
>
> So, reserve exception vector ONLY by BP.
>
> Signed-off-by: Huang Pei <huangpei@loongson.cn>
> ---
> arch/mips/kernel/cpu-probe.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
> index b406d8bfb15a..6939d0de2a03 100644
> --- a/arch/mips/kernel/cpu-probe.c
> +++ b/arch/mips/kernel/cpu-probe.c
> @@ -1581,7 +1581,9 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips *c, unsigned int cpu)
> __cpu_name[cpu] = "Broadcom BMIPS4380";
> set_elf_platform(cpu, "bmips4380");
> c->options |= MIPS_CPU_RIXI;
> - reserve_exception_space(0x400, VECTORSPACING * 64);
> + if (cpu == 0) {
> + reserve_exception_space(0x400, VECTORSPACING * 64);
> + }
why not do a
if (smp_processor_id() == 0)
memblock_reserve(...)
in reserve_exception_space() ?
Thomas.
--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region
2024-01-19 10:05 ` Jiaxun Yang
@ 2024-01-21 2:14 ` Huang Pei
2024-01-21 10:35 ` Jiaxun Yang
0 siblings, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-21 2:14 UTC (permalink / raw)
To: Jiaxun Yang
Cc: Thomas Bogendoerfer, Bibo Mao, linux-mips, Paul Burton,
Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen
On Fri, Jan 19, 2024 at 10:05:39AM +0000, Jiaxun Yang wrote:
>
>
> 在 2024/1/19 04:02, Huang Pei 写道:
> > Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()) reveals
> > that reserved memblock regions have no valid node id set, just set it
> > right since loongson64 firmware makes it clear in memory layout info.
> >
> > This works around booting failure on 3A1000+ since commit 61167ad5fecd
> > ("mm: pass nid to reserve_bootmem_region()) under
> > CONFIG_DEFERRED_STRUCT_PAGE_INIT.
>
> This should be done at MIPS arch level I guess.
You mean the real fix or the set nid for the reserved memblock region?
+ the real fix is [1](see old mail).
I do not think MIPS arch needs it:
+ This ONLY matters on NUMA, most of MIPS platforms is UMA.
+ MM does not care about the nid of reserved memblock region provided by
arch/platform, nor it provide "memblock_reserved_node" like
"memblock_add_node". Loongson64 knows about nid of reserved region and the
regular distribution of physical memory bewtween nodes, that is why it can
be done by simple "memblock_set_node" and I add it.
>
> Thanks
> - Jiaxun
>
> >
> > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > ---
> > arch/mips/loongson64/init.c | 2 ++
> > arch/mips/loongson64/numa.c | 2 ++
> > 2 files changed, 4 insertions(+)
> >
> > diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
> > index f25caa6aa9d3..000ba91c0886 100644
> > --- a/arch/mips/loongson64/init.c
> > +++ b/arch/mips/loongson64/init.c
> > @@ -103,6 +103,8 @@ void __init szmem(unsigned int node)
> > if (loongson_sysconf.vgabios_addr)
> > memblock_reserve(virt_to_phys((void *)loongson_sysconf.vgabios_addr),
> > SZ_256K);
> > + /* set nid for reserved memory */
> > + memblock_set_node((u64)node << 44, (u64)(node+1) << 44, &memblock.reserved, node);
> > }
> > #ifndef CONFIG_NUMA
> > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > index 8f61e93c0c5b..6345e096c532 100644
> > --- a/arch/mips/loongson64/numa.c
> > +++ b/arch/mips/loongson64/numa.c
> > @@ -132,6 +132,8 @@ static void __init node_mem_init(unsigned int node)
> > /* Reserve pfn range 0~node[0]->node_start_pfn */
> > memblock_reserve(0, PAGE_SIZE * start_pfn);
> > + /* set nid for reserved memory on node 0 */
> > + memblock_set_node(0, (u64)1 << 44, &memblock.reserved, 1);
> > }
> > }
>
> --
> ---
> Jiaxun Yang
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE
2024-01-19 16:15 ` Thomas Bogendoerfer
@ 2024-01-21 7:13 ` Huang Pei
0 siblings, 0 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-21 7:13 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
On Fri, Jan 19, 2024 at 05:15:02PM +0100, Thomas Bogendoerfer wrote:
> On Fri, Jan 19, 2024 at 12:02:39PM +0800, Huang Pei wrote:
> > "cpu_probe" is called both by BP and APs, but reserving exception vector
> > (like 0x0-0x1000) called by "cpu_probe" need once and calling on BPs is
> > too late since memblock is unavailable at that time.
> >
> > So, reserve exception vector ONLY by BP.
> >
> > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > ---
> > arch/mips/kernel/cpu-probe.c | 13 +++++++++----
> > 1 file changed, 9 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/mips/kernel/cpu-probe.c b/arch/mips/kernel/cpu-probe.c
> > index b406d8bfb15a..6939d0de2a03 100644
> > --- a/arch/mips/kernel/cpu-probe.c
> > +++ b/arch/mips/kernel/cpu-probe.c
> > @@ -1581,7 +1581,9 @@ static inline void cpu_probe_broadcom(struct cpuinfo_mips *c, unsigned int cpu)
> > __cpu_name[cpu] = "Broadcom BMIPS4380";
> > set_elf_platform(cpu, "bmips4380");
> > c->options |= MIPS_CPU_RIXI;
> > - reserve_exception_space(0x400, VECTORSPACING * 64);
> > + if (cpu == 0) {
> > + reserve_exception_space(0x400, VECTORSPACING * 64);
> > + }
>
> why not do a
>
> if (smp_processor_id() == 0)
> memblock_reserve(...)
>
> in reserve_exception_space() ?
This is better, I will include this in V6
>
> Thomas.
>
> --
> Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
> good idea. [ RFC1925, 2.3 ]
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region
2024-01-21 2:14 ` Huang Pei
@ 2024-01-21 10:35 ` Jiaxun Yang
0 siblings, 0 replies; 50+ messages in thread
From: Jiaxun Yang @ 2024-01-21 10:35 UTC (permalink / raw)
To: Huang Pei
Cc: Thomas Bogendoerfer, Bibo Mao, linux-mips@vger.kernel.org,
paulburton@kernel.org, Xuefeng Li, Tiezhu Yang, Gao Juxin,
Huacai Chen
在2024年1月21日一月 上午2:14,Huang Pei写道:
> On Fri, Jan 19, 2024 at 10:05:39AM +0000, Jiaxun Yang wrote:
>>
>>
>> 在 2024/1/19 04:02, Huang Pei 写道:
>> > Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()) reveals
>> > that reserved memblock regions have no valid node id set, just set it
>> > right since loongson64 firmware makes it clear in memory layout info.
>> >
>> > This works around booting failure on 3A1000+ since commit 61167ad5fecd
>> > ("mm: pass nid to reserve_bootmem_region()) under
>> > CONFIG_DEFERRED_STRUCT_PAGE_INIT.
>>
>> This should be done at MIPS arch level I guess.
> You mean the real fix or the set nid for the reserved memblock region?
Set nid for reserved region.
>
> + the real fix is [1](see old mail).
>
> I do not think MIPS arch needs it:
>
> + This ONLY matters on NUMA, most of MIPS platforms is UMA.
There are still some NUMA capable platforms around, such as SGI-IP27.
Thanks
- Jiaxun
>
> + MM does not care about the nid of reserved memblock region provided by
> arch/platform, nor it provide "memblock_reserved_node" like
> "memblock_add_node". Loongson64 knows about nid of reserved region and the
> regular distribution of physical memory bewtween nodes, that is why it can
> be done by simple "memblock_set_node" and I add it.
>
>>
>> Thanks
>> - Jiaxun
>>
>> >
>> > Signed-off-by: Huang Pei <huangpei@loongson.cn>
>> > ---
>> > arch/mips/loongson64/init.c | 2 ++
>> > arch/mips/loongson64/numa.c | 2 ++
>> > 2 files changed, 4 insertions(+)
>> >
>> > diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
>> > index f25caa6aa9d3..000ba91c0886 100644
>> > --- a/arch/mips/loongson64/init.c
>> > +++ b/arch/mips/loongson64/init.c
>> > @@ -103,6 +103,8 @@ void __init szmem(unsigned int node)
>> > if (loongson_sysconf.vgabios_addr)
>> > memblock_reserve(virt_to_phys((void *)loongson_sysconf.vgabios_addr),
>> > SZ_256K);
>> > + /* set nid for reserved memory */
>> > + memblock_set_node((u64)node << 44, (u64)(node+1) << 44, &memblock.reserved, node);
>> > }
>> > #ifndef CONFIG_NUMA
>> > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
>> > index 8f61e93c0c5b..6345e096c532 100644
>> > --- a/arch/mips/loongson64/numa.c
>> > +++ b/arch/mips/loongson64/numa.c
>> > @@ -132,6 +132,8 @@ static void __init node_mem_init(unsigned int node)
>> > /* Reserve pfn range 0~node[0]->node_start_pfn */
>> > memblock_reserve(0, PAGE_SIZE * start_pfn);
>> > + /* set nid for reserved memory on node 0 */
>> > + memblock_set_node(0, (u64)1 << 44, &memblock.reserved, 1);
>> > }
>> > }
>>
>> --
>> ---
>> Jiaxun Yang
--
- Jiaxun
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH V6]: MIPS: loongson64: fix boot failure
2024-01-09 21:40 ` Thomas Bogendoerfer
` (2 preceding siblings ...)
2024-01-19 4:02 ` [PATCH V4]: MIPS: loongson64: fix boot failure Huang Pei
@ 2024-01-22 8:08 ` Huang Pei
2024-01-22 8:08 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
2024-01-22 8:08 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
2024-01-23 1:47 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
4 siblings, 2 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-22 8:08 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
Patch 1: simplify the code, thanks Thomas.
Patch 2: correct comment and commit message.
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE
2024-01-22 8:08 ` [PATCH V6]: MIPS: loongson64: fix boot failure Huang Pei
@ 2024-01-22 8:08 ` Huang Pei
2024-01-22 8:08 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
1 sibling, 0 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-22 8:08 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
"cpu_probe" is called both by BP and APs, but reserving exception vector
(like 0x0-0x1000) called by "cpu_probe" need once and calling on APs is
too late since memblock is unavailable at that time.
So, reserve exception vector ONLY by BP.
Suggested-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Huang Pei <huangpei@loongson.cn>
---
arch/mips/kernel/traps.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
index 246c6a6b0261..5b778995d448 100644
--- a/arch/mips/kernel/traps.c
+++ b/arch/mips/kernel/traps.c
@@ -2007,7 +2007,13 @@ unsigned long vi_handlers[64];
void reserve_exception_space(phys_addr_t addr, unsigned long size)
{
- memblock_reserve(addr, size);
+ /*
+ * reserve exception space on CPUs other than CPU0
+ * is too late, since memblock is unavailable when APs
+ * up
+ */
+ if (smp_processor_id() == 0)
+ memblock_reserve(addr, size);
}
void __init *set_except_vector(int n, void *addr)
--
2.30.2
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region
2024-01-22 8:08 ` [PATCH V6]: MIPS: loongson64: fix boot failure Huang Pei
2024-01-22 8:08 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
@ 2024-01-22 8:08 ` Huang Pei
2024-01-22 8:20 ` Sergey Shtylyov
1 sibling, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-22 8:08 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()) reveals
that reserved memblock regions have no valid node id set, just set it
right since loongson64 firmware makes it clear in memory layout info.
This works around booting failure on 3A1000+ since commit 61167ad5fecd
("mm: pass nid to reserve_bootmem_region()) under
CONFIG_DEFERRED_STRUCT_PAGE_INIT.
Signed-off-by: Huang Pei <huangpei@loongson.cn>
---
arch/mips/loongson64/init.c | 2 ++
arch/mips/loongson64/numa.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
index f25caa6aa9d3..000ba91c0886 100644
--- a/arch/mips/loongson64/init.c
+++ b/arch/mips/loongson64/init.c
@@ -103,6 +103,8 @@ void __init szmem(unsigned int node)
if (loongson_sysconf.vgabios_addr)
memblock_reserve(virt_to_phys((void *)loongson_sysconf.vgabios_addr),
SZ_256K);
+ /* set nid for reserved memory */
+ memblock_set_node((u64)node << 44, (u64)(node+1) << 44, &memblock.reserved, node);
}
#ifndef CONFIG_NUMA
diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
index 8f61e93c0c5b..6345e096c532 100644
--- a/arch/mips/loongson64/numa.c
+++ b/arch/mips/loongson64/numa.c
@@ -132,6 +132,8 @@ static void __init node_mem_init(unsigned int node)
/* Reserve pfn range 0~node[0]->node_start_pfn */
memblock_reserve(0, PAGE_SIZE * start_pfn);
+ /* set nid for reserved memory on node 0 */
+ memblock_set_node(0, (u64)1 << 44, &memblock.reserved, 1);
}
}
--
2.30.2
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region
2024-01-22 8:08 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
@ 2024-01-22 8:20 ` Sergey Shtylyov
0 siblings, 0 replies; 50+ messages in thread
From: Sergey Shtylyov @ 2024-01-22 8:20 UTC (permalink / raw)
To: Huang Pei, Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
On 1/22/24 11:08 AM, Huang Pei wrote:
> Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()) reveals
You forgot a space before (" and " after reserve_bootmem_region(). :-)
> that reserved memblock regions have no valid node id set, just set it
> right since loongson64 firmware makes it clear in memory layout info.
>
> This works around booting failure on 3A1000+ since commit 61167ad5fecd
> ("mm: pass nid to reserve_bootmem_region()) under
Again, you forgot to close the quote...
> CONFIG_DEFERRED_STRUCT_PAGE_INIT.
>
> Signed-off-by: Huang Pei <huangpei@loongson.cn>
> ---
> arch/mips/loongson64/init.c | 2 ++
> arch/mips/loongson64/numa.c | 2 ++
> 2 files changed, 4 insertions(+)
>
> diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
> index f25caa6aa9d3..000ba91c0886 100644
> --- a/arch/mips/loongson64/init.c
> +++ b/arch/mips/loongson64/init.c
> @@ -103,6 +103,8 @@ void __init szmem(unsigned int node)
> if (loongson_sysconf.vgabios_addr)
> memblock_reserve(virt_to_phys((void *)loongson_sysconf.vgabios_addr),
> SZ_256K);
> + /* set nid for reserved memory */
> + memblock_set_node((u64)node << 44, (u64)(node+1) << 44, &memblock.reserved, node);
Please add spaces around +, at least for consistency with <<. :-)
[...]
> diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> index 8f61e93c0c5b..6345e096c532 100644
> --- a/arch/mips/loongson64/numa.c
> +++ b/arch/mips/loongson64/numa.c
> @@ -132,6 +132,8 @@ static void __init node_mem_init(unsigned int node)
>
> /* Reserve pfn range 0~node[0]->node_start_pfn */
> memblock_reserve(0, PAGE_SIZE * start_pfn);
> + /* set nid for reserved memory on node 0 */
> + memblock_set_node(0, (u64)1 << 44, &memblock.reserved, 1);
Could use 1ULL instead of (u64)1...
[...]
MBR, Sergey
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE
2024-01-09 21:40 ` Thomas Bogendoerfer
` (3 preceding siblings ...)
2024-01-22 8:08 ` [PATCH V6]: MIPS: loongson64: fix boot failure Huang Pei
@ 2024-01-23 1:47 ` Huang Pei
2024-01-23 1:47 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
2024-01-26 10:12 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Thomas Bogendoerfer
4 siblings, 2 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-23 1:47 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
"cpu_probe" is called both by BP and APs, but reserving exception vector
(like 0x0-0x1000) called by "cpu_probe" need once and calling on APs is
too late since memblock is unavailable at that time.
So, reserve exception vector ONLY by BP.
Suggested-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Huang Pei <huangpei@loongson.cn>
---
arch/mips/kernel/traps.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
index 246c6a6b0261..5b778995d448 100644
--- a/arch/mips/kernel/traps.c
+++ b/arch/mips/kernel/traps.c
@@ -2007,7 +2007,13 @@ unsigned long vi_handlers[64];
void reserve_exception_space(phys_addr_t addr, unsigned long size)
{
- memblock_reserve(addr, size);
+ /*
+ * reserve exception space on CPUs other than CPU0
+ * is too late, since memblock is unavailable when APs
+ * up
+ */
+ if (smp_processor_id() == 0)
+ memblock_reserve(addr, size);
}
void __init *set_except_vector(int n, void *addr)
--
2.30.2
^ permalink raw reply related [flat|nested] 50+ messages in thread
* [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region
2024-01-23 1:47 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
@ 2024-01-23 1:47 ` Huang Pei
2024-01-26 10:12 ` Thomas Bogendoerfer
2024-01-26 10:12 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Thomas Bogendoerfer
1 sibling, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-23 1:47 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals
that reserved memblock regions have no valid node id set, just set it
right since loongson64 firmware makes it clear in memory layout info.
This works around booting failure on 3A1000+ since commit 61167ad5fecd
("mm: pass nid to reserve_bootmem_region()") under
CONFIG_DEFERRED_STRUCT_PAGE_INIT.
Signed-off-by: Huang Pei <huangpei@loongson.cn>
---
arch/mips/loongson64/init.c | 2 ++
arch/mips/loongson64/numa.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
index f25caa6aa9d3..000ba91c0886 100644
--- a/arch/mips/loongson64/init.c
+++ b/arch/mips/loongson64/init.c
@@ -103,6 +103,8 @@ void __init szmem(unsigned int node)
if (loongson_sysconf.vgabios_addr)
memblock_reserve(virt_to_phys((void *)loongson_sysconf.vgabios_addr),
SZ_256K);
+ /* set nid for reserved memory */
+ memblock_set_node((u64)node << 44, (u64)(node+1) << 44, &memblock.reserved, node);
}
#ifndef CONFIG_NUMA
diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
index 8f61e93c0c5b..6345e096c532 100644
--- a/arch/mips/loongson64/numa.c
+++ b/arch/mips/loongson64/numa.c
@@ -132,6 +132,8 @@ static void __init node_mem_init(unsigned int node)
/* Reserve pfn range 0~node[0]->node_start_pfn */
memblock_reserve(0, PAGE_SIZE * start_pfn);
+ /* set nid for reserved memory on node 0 */
+ memblock_set_node(0, (u64)1 << 44, &memblock.reserved, 1);
}
}
--
2.30.2
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE
2024-01-23 1:47 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
2024-01-23 1:47 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
@ 2024-01-26 10:12 ` Thomas Bogendoerfer
1 sibling, 0 replies; 50+ messages in thread
From: Thomas Bogendoerfer @ 2024-01-26 10:12 UTC (permalink / raw)
To: Huang Pei
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
On Tue, Jan 23, 2024 at 09:47:57AM +0800, Huang Pei wrote:
> "cpu_probe" is called both by BP and APs, but reserving exception vector
> (like 0x0-0x1000) called by "cpu_probe" need once and calling on APs is
> too late since memblock is unavailable at that time.
>
> So, reserve exception vector ONLY by BP.
>
> Suggested-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> Signed-off-by: Huang Pei <huangpei@loongson.cn>
> ---
> arch/mips/kernel/traps.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/mips/kernel/traps.c b/arch/mips/kernel/traps.c
> index 246c6a6b0261..5b778995d448 100644
> --- a/arch/mips/kernel/traps.c
> +++ b/arch/mips/kernel/traps.c
> @@ -2007,7 +2007,13 @@ unsigned long vi_handlers[64];
>
> void reserve_exception_space(phys_addr_t addr, unsigned long size)
> {
> - memblock_reserve(addr, size);
> + /*
> + * reserve exception space on CPUs other than CPU0
> + * is too late, since memblock is unavailable when APs
> + * up
> + */
> + if (smp_processor_id() == 0)
> + memblock_reserve(addr, size);
> }
>
> void __init *set_except_vector(int n, void *addr)
> --
> 2.30.2
applied to mips-fixes.
Thomas.
--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region
2024-01-23 1:47 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
@ 2024-01-26 10:12 ` Thomas Bogendoerfer
2024-01-26 14:24 ` Huacai Chen
0 siblings, 1 reply; 50+ messages in thread
From: Thomas Bogendoerfer @ 2024-01-26 10:12 UTC (permalink / raw)
To: Huang Pei
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
On Tue, Jan 23, 2024 at 09:47:58AM +0800, Huang Pei wrote:
> Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals
> that reserved memblock regions have no valid node id set, just set it
> right since loongson64 firmware makes it clear in memory layout info.
>
> This works around booting failure on 3A1000+ since commit 61167ad5fecd
> ("mm: pass nid to reserve_bootmem_region()") under
> CONFIG_DEFERRED_STRUCT_PAGE_INIT.
>
> Signed-off-by: Huang Pei <huangpei@loongson.cn>
> ---
> arch/mips/loongson64/init.c | 2 ++
> arch/mips/loongson64/numa.c | 2 ++
> 2 files changed, 4 insertions(+)
>
> diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
> index f25caa6aa9d3..000ba91c0886 100644
> --- a/arch/mips/loongson64/init.c
> +++ b/arch/mips/loongson64/init.c
> @@ -103,6 +103,8 @@ void __init szmem(unsigned int node)
> if (loongson_sysconf.vgabios_addr)
> memblock_reserve(virt_to_phys((void *)loongson_sysconf.vgabios_addr),
> SZ_256K);
> + /* set nid for reserved memory */
> + memblock_set_node((u64)node << 44, (u64)(node+1) << 44, &memblock.reserved, node);
> }
>
> #ifndef CONFIG_NUMA
> diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> index 8f61e93c0c5b..6345e096c532 100644
> --- a/arch/mips/loongson64/numa.c
> +++ b/arch/mips/loongson64/numa.c
> @@ -132,6 +132,8 @@ static void __init node_mem_init(unsigned int node)
>
> /* Reserve pfn range 0~node[0]->node_start_pfn */
> memblock_reserve(0, PAGE_SIZE * start_pfn);
> + /* set nid for reserved memory on node 0 */
> + memblock_set_node(0, (u64)1 << 44, &memblock.reserved, 1);
> }
> }
>
> --
> 2.30.2
applied to mips-fixes.
Thomas.
--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region
2024-01-26 10:12 ` Thomas Bogendoerfer
@ 2024-01-26 14:24 ` Huacai Chen
2024-01-26 17:24 ` Thomas Bogendoerfer
0 siblings, 1 reply; 50+ messages in thread
From: Huacai Chen @ 2024-01-26 14:24 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Huang Pei, Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton,
Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen
On Fri, Jan 26, 2024 at 6:30 PM Thomas Bogendoerfer
<tsbogend@alpha.franken.de> wrote:
>
> On Tue, Jan 23, 2024 at 09:47:58AM +0800, Huang Pei wrote:
> > Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals
> > that reserved memblock regions have no valid node id set, just set it
> > right since loongson64 firmware makes it clear in memory layout info.
> >
> > This works around booting failure on 3A1000+ since commit 61167ad5fecd
> > ("mm: pass nid to reserve_bootmem_region()") under
> > CONFIG_DEFERRED_STRUCT_PAGE_INIT.
> >
> > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > ---
> > arch/mips/loongson64/init.c | 2 ++
> > arch/mips/loongson64/numa.c | 2 ++
> > 2 files changed, 4 insertions(+)
> >
> > diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
> > index f25caa6aa9d3..000ba91c0886 100644
> > --- a/arch/mips/loongson64/init.c
> > +++ b/arch/mips/loongson64/init.c
> > @@ -103,6 +103,8 @@ void __init szmem(unsigned int node)
> > if (loongson_sysconf.vgabios_addr)
> > memblock_reserve(virt_to_phys((void *)loongson_sysconf.vgabios_addr),
> > SZ_256K);
> > + /* set nid for reserved memory */
> > + memblock_set_node((u64)node << 44, (u64)(node+1) << 44, &memblock.reserved, node);
> > }
> >
> > #ifndef CONFIG_NUMA
> > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > index 8f61e93c0c5b..6345e096c532 100644
> > --- a/arch/mips/loongson64/numa.c
> > +++ b/arch/mips/loongson64/numa.c
> > @@ -132,6 +132,8 @@ static void __init node_mem_init(unsigned int node)
> >
> > /* Reserve pfn range 0~node[0]->node_start_pfn */
> > memblock_reserve(0, PAGE_SIZE * start_pfn);
> > + /* set nid for reserved memory on node 0 */
> > + memblock_set_node(0, (u64)1 << 44, &memblock.reserved, 1);
> > }
> > }
> >
> > --
> > 2.30.2
>
> applied to mips-fixes.
Oh, I'm sorry that I found a very stupid error in this patch. The
comment says set memory on node 0 but the code set it on node 1...
Huacai
>
> Thomas.
>
> --
> Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
> good idea. [ RFC1925, 2.3 ]
>
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region
2024-01-26 14:24 ` Huacai Chen
@ 2024-01-26 17:24 ` Thomas Bogendoerfer
2024-01-27 9:12 ` [PATCH] " Huang Pei
0 siblings, 1 reply; 50+ messages in thread
From: Thomas Bogendoerfer @ 2024-01-26 17:24 UTC (permalink / raw)
To: Huacai Chen
Cc: Huang Pei, Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton,
Li Xuefeng, Yang Tiezhu, Gao Juxin, Huacai Chen
On Fri, Jan 26, 2024 at 10:24:29PM +0800, Huacai Chen wrote:
> On Fri, Jan 26, 2024 at 6:30 PM Thomas Bogendoerfer
> <tsbogend@alpha.franken.de> wrote:
> >
> > On Tue, Jan 23, 2024 at 09:47:58AM +0800, Huang Pei wrote:
> > [...]
> > > diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
> > > index 8f61e93c0c5b..6345e096c532 100644
> > > --- a/arch/mips/loongson64/numa.c
> > > +++ b/arch/mips/loongson64/numa.c
> > > @@ -132,6 +132,8 @@ static void __init node_mem_init(unsigned int node)
> > >
> > > /* Reserve pfn range 0~node[0]->node_start_pfn */
> > > memblock_reserve(0, PAGE_SIZE * start_pfn);
> > > + /* set nid for reserved memory on node 0 */
> > > + memblock_set_node(0, (u64)1 << 44, &memblock.reserved, 1);
> > > }
> > > }
> > >
> > > --
> > > 2.30.2
> >
> > applied to mips-fixes.
> Oh, I'm sorry that I found a very stupid error in this patch. The
> comment says set memory on node 0 but the code set it on node 1...
and what is correct ? Please send a patch to fix it in mips-fixes.
Thomas.
--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH] MIPS: loongson64: set nid for reserved memblock region
2024-01-26 17:24 ` Thomas Bogendoerfer
@ 2024-01-27 9:12 ` Huang Pei
2024-01-27 9:12 ` Huang Pei
0 siblings, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-27 9:12 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
fix wrong nid for node0's reserved memblock region.
^ permalink raw reply [flat|nested] 50+ messages in thread
* [PATCH] MIPS: loongson64: set nid for reserved memblock region
2024-01-27 9:12 ` [PATCH] " Huang Pei
@ 2024-01-27 9:12 ` Huang Pei
2024-01-27 10:04 ` Thomas Bogendoerfer
0 siblings, 1 reply; 50+ messages in thread
From: Huang Pei @ 2024-01-27 9:12 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals
that reserved memblock regions have no valid node id set, just set it
right since loongson64 firmware makes it clear in memory layout info.
This works around booting failure on 3A1000+ since commit 61167ad5fecd
("mm: pass nid to reserve_bootmem_region()") under
CONFIG_DEFERRED_STRUCT_PAGE_INIT.
Signed-off-by: Huang Pei <huangpei@loongson.cn>
---
arch/mips/loongson64/init.c | 3 +++
arch/mips/loongson64/numa.c | 2 ++
2 files changed, 5 insertions(+)
diff --git a/arch/mips/loongson64/init.c b/arch/mips/loongson64/init.c
index f25caa6aa9d3..553142c1f14f 100644
--- a/arch/mips/loongson64/init.c
+++ b/arch/mips/loongson64/init.c
@@ -103,6 +103,9 @@ void __init szmem(unsigned int node)
if (loongson_sysconf.vgabios_addr)
memblock_reserve(virt_to_phys((void *)loongson_sysconf.vgabios_addr),
SZ_256K);
+ /* set nid for reserved memory */
+ memblock_set_node((u64)node << 44, (u64)(node + 1) << 44,
+ &memblock.reserved, node);
}
#ifndef CONFIG_NUMA
diff --git a/arch/mips/loongson64/numa.c b/arch/mips/loongson64/numa.c
index 8f61e93c0c5b..68dafd6d3e25 100644
--- a/arch/mips/loongson64/numa.c
+++ b/arch/mips/loongson64/numa.c
@@ -132,6 +132,8 @@ static void __init node_mem_init(unsigned int node)
/* Reserve pfn range 0~node[0]->node_start_pfn */
memblock_reserve(0, PAGE_SIZE * start_pfn);
+ /* set nid for reserved memory on node 0 */
+ memblock_set_node(0, 1ULL << 44, &memblock.reserved, 0);
}
}
--
2.30.2
^ permalink raw reply related [flat|nested] 50+ messages in thread
* Re: [PATCH] MIPS: loongson64: set nid for reserved memblock region
2024-01-27 9:12 ` Huang Pei
@ 2024-01-27 10:04 ` Thomas Bogendoerfer
2024-01-28 4:42 ` Huang Pei
0 siblings, 1 reply; 50+ messages in thread
From: Thomas Bogendoerfer @ 2024-01-27 10:04 UTC (permalink / raw)
To: Huang Pei
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
On Sat, Jan 27, 2024 at 05:12:21PM +0800, Huang Pei wrote:
> Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals
> that reserved memblock regions have no valid node id set, just set it
> right since loongson64 firmware makes it clear in memory layout info.
>
> This works around booting failure on 3A1000+ since commit 61167ad5fecd
> ("mm: pass nid to reserve_bootmem_region()") under
> CONFIG_DEFERRED_STRUCT_PAGE_INIT.
>
> Signed-off-by: Huang Pei <huangpei@loongson.cn>
> ---
> arch/mips/loongson64/init.c | 3 +++
> arch/mips/loongson64/numa.c | 2 ++
> 2 files changed, 5 insertions(+)
I would've needed a patch just fixing the one line since the broken commit
is already in mips-fixes, which is a public tree so no rebasing.
I'm going to revert the old commit instead and add the new one,
BTW. please send new patches as it's own thread and not as a reply.
Thomas.
--
Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
good idea. [ RFC1925, 2.3 ]
^ permalink raw reply [flat|nested] 50+ messages in thread
* Re: [PATCH] MIPS: loongson64: set nid for reserved memblock region
2024-01-27 10:04 ` Thomas Bogendoerfer
@ 2024-01-28 4:42 ` Huang Pei
0 siblings, 0 replies; 50+ messages in thread
From: Huang Pei @ 2024-01-28 4:42 UTC (permalink / raw)
To: Thomas Bogendoerfer
Cc: Bibo Mao, linux-mips, Jiaxun Yang, Paul Burton, Li Xuefeng,
Yang Tiezhu, Gao Juxin, Huacai Chen
On Sat, Jan 27, 2024 at 11:04:04AM +0100, Thomas Bogendoerfer wrote:
> On Sat, Jan 27, 2024 at 05:12:21PM +0800, Huang Pei wrote:
> > Commit 61167ad5fecd("mm: pass nid to reserve_bootmem_region()") reveals
> > that reserved memblock regions have no valid node id set, just set it
> > right since loongson64 firmware makes it clear in memory layout info.
> >
> > This works around booting failure on 3A1000+ since commit 61167ad5fecd
> > ("mm: pass nid to reserve_bootmem_region()") under
> > CONFIG_DEFERRED_STRUCT_PAGE_INIT.
> >
> > Signed-off-by: Huang Pei <huangpei@loongson.cn>
> > ---
> > arch/mips/loongson64/init.c | 3 +++
> > arch/mips/loongson64/numa.c | 2 ++
> > 2 files changed, 5 insertions(+)
>
> I would've needed a patch just fixing the one line since the broken commit
> is already in mips-fixes, which is a public tree so no rebasing.
> I'm going to revert the old commit instead and add the new one,
> BTW. please send new patches as it's own thread and not as a reply.
>
Ok, new patches always as its own thread not as a reply, I got it.
> Thomas.
>
> --
> Crap can work. Given enough thrust pigs will fly, but it's not necessarily a
> good idea. [ RFC1925, 2.3 ]
^ permalink raw reply [flat|nested] 50+ messages in thread
end of thread, other threads:[~2024-01-28 4:42 UTC | newest]
Thread overview: 50+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-12-25 9:30 [PATCH] MIPS: loongson64: fix boot failure Huang Pei
2024-01-09 21:40 ` Thomas Bogendoerfer
2024-01-13 9:55 ` [PATCH V2]: " Huang Pei
2024-01-13 9:55 ` [PATCH 1/3] MIPS: adjust exception vector space revervation Huang Pei
2024-01-13 9:55 ` [PATCH 2/3] MIPS: loongson64: fix booting failure Huang Pei
2024-01-13 9:55 ` [PATCH 3/3] Revert "MIPS: Loongson64: Handle more memory types passed from firmware" Huang Pei
2024-01-13 11:59 ` Jiaxun Yang
2024-01-14 8:53 ` Huang Pei
2024-01-14 11:58 ` Jiaxun Yang
2024-01-15 1:25 ` Huang Pei
2024-01-15 14:14 ` Jiaxun Yang
2024-01-16 3:10 ` Huang Pei
2024-01-18 12:39 ` [PATCH V3]: MIPS: loongson64: fix booting failure Huang Pei
2024-01-18 12:39 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
2024-01-18 12:39 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
2024-01-19 4:02 ` [PATCH V4]: MIPS: loongson64: fix boot failure Huang Pei
2024-01-19 4:02 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
2024-01-19 15:23 ` Sergei Shtylyov
2024-01-19 16:15 ` Thomas Bogendoerfer
2024-01-21 7:13 ` Huang Pei
2024-01-19 4:02 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
2024-01-19 10:05 ` Jiaxun Yang
2024-01-21 2:14 ` Huang Pei
2024-01-21 10:35 ` Jiaxun Yang
2024-01-22 8:08 ` [PATCH V6]: MIPS: loongson64: fix boot failure Huang Pei
2024-01-22 8:08 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
2024-01-22 8:08 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
2024-01-22 8:20 ` Sergey Shtylyov
2024-01-23 1:47 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Huang Pei
2024-01-23 1:47 ` [PATCH 2/2] MIPS: loongson64: set nid for reserved memblock region Huang Pei
2024-01-26 10:12 ` Thomas Bogendoerfer
2024-01-26 14:24 ` Huacai Chen
2024-01-26 17:24 ` Thomas Bogendoerfer
2024-01-27 9:12 ` [PATCH] " Huang Pei
2024-01-27 9:12 ` Huang Pei
2024-01-27 10:04 ` Thomas Bogendoerfer
2024-01-28 4:42 ` Huang Pei
2024-01-26 10:12 ` [PATCH 1/2] MIPS: reserve exception vector space ONLY ONCE Thomas Bogendoerfer
2024-01-15 14:08 ` memblock_reserve for unadded region (was: [PATCH] MIPS: loongson64: fix boot failure) Jiaxun Yang
2024-01-16 3:27 ` Huang Pei
2024-01-16 8:39 ` Mike Rapoport
2024-01-16 12:23 ` Huang Pei
2024-01-17 2:20 ` Yajun Deng
2024-01-17 3:01 ` Huang Pei
2024-01-17 3:17 ` Yajun Deng
2024-01-17 3:59 ` Huang Pei
2024-01-17 6:46 ` Mike Rapoport
2024-01-17 7:45 ` Huang Pei
2024-01-17 11:08 ` Mike Rapoport
2024-01-18 2:26 ` Huang Pei
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).