linux-mm.kvack.org archive mirror
* [PATCH] x86/mm: drop 4MB restriction on minimal NUMA node size
@ 2023-10-17  6:22 Mike Rapoport
  2023-10-17  7:28 ` David Hildenbrand
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Mike Rapoport @ 2023-10-17  6:22 UTC (permalink / raw)
  To: x86
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	David Hildenbrand, H. Peter Anvin, Ingo Molnar, Michal Hocko,
	Mike Rapoport, Peter Zijlstra, Qi Zheng, Thomas Gleixner,
	linux-kernel, linux-mm

From: "Mike Rapoport (IBM)" <rppt@kernel.org>

Qi Zheng reports crashes in a production environment and provides a
simplified example as a reproducer:

  For example, if we use qemu to start a two NUMA node kernel,
  one of the nodes has 2M memory (less than NODE_MIN_SIZE),
  and the other node has 2G, then we will encounter the
  following panic:

  [    0.149844] BUG: kernel NULL pointer dereference, address: 0000000000000000
  [    0.150783] #PF: supervisor write access in kernel mode
  [    0.151488] #PF: error_code(0x0002) - not-present page
  <...>
  [    0.156056] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
  <...>
  [    0.169781] Call Trace:
  [    0.170159]  <TASK>
  [    0.170448]  deactivate_slab+0x187/0x3c0
  [    0.171031]  ? bootstrap+0x1b/0x10e
  [    0.171559]  ? preempt_count_sub+0x9/0xa0
  [    0.172145]  ? kmem_cache_alloc+0x12c/0x440
  [    0.172735]  ? bootstrap+0x1b/0x10e
  [    0.173236]  bootstrap+0x6b/0x10e
  [    0.173720]  kmem_cache_init+0x10a/0x188
  [    0.174240]  start_kernel+0x415/0x6ac
  [    0.174738]  secondary_startup_64_no_verify+0xe0/0xeb
  [    0.175417]  </TASK>
  [    0.175713] Modules linked in:
  [    0.176117] CR2: 0000000000000000

The crashes happen because of inconsistency between nodemask that has
nodes with less than 4MB as memoryless and the actual memory fed into
core mm.

The commit 9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring
empty node in SRAT parsing") that introduced minimal size of a NUMA node
does not explain why a node size cannot be less than 4MB and what boot
failures this restriction might fix.

Since then a lot has changed and core mm won't confuse badly about small
node sizes.

Drop the limitation for the minimal node size.

Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Link: https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/
---
 arch/x86/include/asm/numa.h | 7 -------
 arch/x86/mm/numa.c          | 7 -------
 2 files changed, 14 deletions(-)

diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
index e3bae2b60a0d..ef2844d69173 100644
--- a/arch/x86/include/asm/numa.h
+++ b/arch/x86/include/asm/numa.h
@@ -12,13 +12,6 @@
 
 #define NR_NODE_MEMBLKS		(MAX_NUMNODES*2)
 
-/*
- * Too small node sizes may confuse the VM badly. Usually they
- * result from BIOS bugs. So dont recognize nodes as standalone
- * NUMA entities that have less than this amount of RAM listed:
- */
-#define NODE_MIN_SIZE (4*1024*1024)
-
 extern int numa_off;
 
 /*
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 2aadb2019b4f..55e3d895f15c 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -601,13 +601,6 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
 		if (start >= end)
 			continue;
 
-		/*
-		 * Don't confuse VM with a node that doesn't have the
-		 * minimum amount of memory:
-		 */
-		if (end && (end - start) < NODE_MIN_SIZE)
-			continue;
-
 		alloc_node_data(nid);
 	}
 

base-commit: 94f6f0550c625fab1f373bb86a6669b45e9748b3
-- 
2.39.2
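
The inconsistency described in the changelog can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct node_range`, `register_nodes()`, `count_inconsistent()` and the `node_data_allocated[]` array are hypothetical stand-ins, with the array playing the role of what alloc_node_data() sets up. Only the `start >= end` test and the `NODE_MIN_SIZE` test mirror the loop in numa_register_memblks() shown in the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 8
/* The limit removed by the patch. */
#define NODE_MIN_SIZE (4ULL * 1024 * 1024)

/* Hypothetical: physical address range [start, end) owned by one node. */
struct node_range {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical stand-in for the state alloc_node_data() would create. */
static bool node_data_allocated[MAX_NODES];

/*
 * Model of the registration loop. With enforce_min set, nodes smaller
 * than NODE_MIN_SIZE are skipped, as before the patch.
 */
static void register_nodes(const struct node_range *r, int n, bool enforce_min)
{
	assert(n >= 0 && n <= MAX_NODES);

	for (int nid = 0; nid < n; nid++) {
		node_data_allocated[nid] = false;

		if (r[nid].start >= r[nid].end)
			continue;

		if (enforce_min && r[nid].end &&
		    (r[nid].end - r[nid].start) < NODE_MIN_SIZE)
			continue;

		node_data_allocated[nid] = true;
	}
}

/*
 * Count nodes that own memory but were treated as memoryless. Each such
 * node is an instance of the mismatch the changelog describes.
 */
static int count_inconsistent(const struct node_range *r, int n)
{
	int bad = 0;

	for (int nid = 0; nid < n; nid++)
		if (r[nid].end > r[nid].start && !node_data_allocated[nid])
			bad++;

	return bad;
}
```

Modelling the reproducer as a 2M node followed by a 2G node, calling register_nodes() with enforce_min set leaves one node that has memory but no node data; with the check dropped, count_inconsistent() reports none.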




* Re: [PATCH] x86/mm: drop 4MB restriction on minimal NUMA node size
  2023-10-17  6:22 [PATCH] x86/mm: drop 4MB restriction on minimal NUMA node size Mike Rapoport
@ 2023-10-17  7:28 ` David Hildenbrand
  2023-10-17  7:35   ` David Hildenbrand
  2023-10-17  7:52   ` Mike Rapoport
  2023-10-18 10:42 ` [PATCH v2] x86/mm: Drop 4MB restriction on minimal NUMA node memory size Ingo Molnar
  2023-10-18 11:55 ` [PATCH] x86/mm: drop 4MB restriction on minimal NUMA node size Mario Casquero
  2 siblings, 2 replies; 12+ messages in thread
From: David Hildenbrand @ 2023-10-17  7:28 UTC (permalink / raw)
  To: Mike Rapoport, x86
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Ingo Molnar, Michal Hocko, Peter Zijlstra,
	Qi Zheng, Thomas Gleixner, linux-kernel, linux-mm

On 17.10.23 08:22, Mike Rapoport wrote:
> From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> 
> Qi Zheng reports crashes in a production environment and provides a
> simplified example as a reproducer:
> 
>    For example, if we use qemu to start a two NUMA node kernel,
>    one of the nodes has 2M memory (less than NODE_MIN_SIZE),
>    and the other node has 2G, then we will encounter the
>    following panic:
> 
>    [    0.149844] BUG: kernel NULL pointer dereference, address: 0000000000000000
>    [    0.150783] #PF: supervisor write access in kernel mode
>    [    0.151488] #PF: error_code(0x0002) - not-present page
>    <...>
>    [    0.156056] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
>    <...>
>    [    0.169781] Call Trace:
>    [    0.170159]  <TASK>
>    [    0.170448]  deactivate_slab+0x187/0x3c0
>    [    0.171031]  ? bootstrap+0x1b/0x10e
>    [    0.171559]  ? preempt_count_sub+0x9/0xa0
>    [    0.172145]  ? kmem_cache_alloc+0x12c/0x440
>    [    0.172735]  ? bootstrap+0x1b/0x10e
>    [    0.173236]  bootstrap+0x6b/0x10e
>    [    0.173720]  kmem_cache_init+0x10a/0x188
>    [    0.174240]  start_kernel+0x415/0x6ac
>    [    0.174738]  secondary_startup_64_no_verify+0xe0/0xeb
>    [    0.175417]  </TASK>
>    [    0.175713] Modules linked in:
>    [    0.176117] CR2: 0000000000000000
> 
> The crashes happen because of inconsistency between nodemask that has
> nodes with less than 4MB as memoryless and the actual memory fed into
> core mm.
> 
> The commit 9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring
> empty node in SRAT parsing") that introduced minimal size of a NUMA node
> does not explain why a node size cannot be less than 4MB and what boot
> failures this restriction might fix.
> 
> Since then a lot has changed and core mm won't confuse badly about small
> node sizes.
> 
> Drop the limitation for the minimal node size.
> 
> Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> Acked-by: David Hildenbrand <david@redhat.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Link: https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/

That's just a resend I assume? Or has anything changed?

-- 
Cheers,

David / dhildenb




* Re: [PATCH] x86/mm: drop 4MB restriction on minimal NUMA node size
  2023-10-17  7:28 ` David Hildenbrand
@ 2023-10-17  7:35   ` David Hildenbrand
  2023-10-17  7:52   ` Mike Rapoport
  1 sibling, 0 replies; 12+ messages in thread
From: David Hildenbrand @ 2023-10-17  7:35 UTC (permalink / raw)
  To: Mike Rapoport, x86
  Cc: Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Ingo Molnar, Michal Hocko, Peter Zijlstra,
	Qi Zheng, Thomas Gleixner, linux-kernel, linux-mm

On 17.10.23 09:28, David Hildenbrand wrote:
> On 17.10.23 08:22, Mike Rapoport wrote:
>> From: "Mike Rapoport (IBM)" <rppt@kernel.org>
>>
>> Qi Zheng reports crashes in a production environment and provides a
>> simplified example as a reproducer:
>>
>>     For example, if we use qemu to start a two NUMA node kernel,
>>     one of the nodes has 2M memory (less than NODE_MIN_SIZE),
>>     and the other node has 2G, then we will encounter the
>>     following panic:
>>
>>     [    0.149844] BUG: kernel NULL pointer dereference, address: 0000000000000000
>>     [    0.150783] #PF: supervisor write access in kernel mode
>>     [    0.151488] #PF: error_code(0x0002) - not-present page
>>     <...>
>>     [    0.156056] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
>>     <...>
>>     [    0.169781] Call Trace:
>>     [    0.170159]  <TASK>
>>     [    0.170448]  deactivate_slab+0x187/0x3c0
>>     [    0.171031]  ? bootstrap+0x1b/0x10e
>>     [    0.171559]  ? preempt_count_sub+0x9/0xa0
>>     [    0.172145]  ? kmem_cache_alloc+0x12c/0x440
>>     [    0.172735]  ? bootstrap+0x1b/0x10e
>>     [    0.173236]  bootstrap+0x6b/0x10e
>>     [    0.173720]  kmem_cache_init+0x10a/0x188
>>     [    0.174240]  start_kernel+0x415/0x6ac
>>     [    0.174738]  secondary_startup_64_no_verify+0xe0/0xeb
>>     [    0.175417]  </TASK>
>>     [    0.175713] Modules linked in:
>>     [    0.176117] CR2: 0000000000000000
>>
>> The crashes happen because of inconsistency between nodemask that has
>> nodes with less than 4MB as memoryless and the actual memory fed into
>> core mm.
>>
>> The commit 9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring
>> empty node in SRAT parsing") that introduced minimal size of a NUMA node
>> does not explain why a node size cannot be less than 4MB and what boot
>> failures this restriction might fix.
>>
>> Since then a lot has changed and core mm won't confuse badly about small
>> node sizes.
>>
>> Drop the limitation for the minimal node size.
>>
>> Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
>> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
>> Acked-by: David Hildenbrand <david@redhat.com>
>> Acked-by: Michal Hocko <mhocko@suse.com>
>> Link: https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/
> 
> That's just a resend I assume? Or has anything changed?

Saw the other mail now, so just a resend.

-- 
Cheers,

David / dhildenb




* Re: [PATCH] x86/mm: drop 4MB restriction on minimal NUMA node size
  2023-10-17  7:28 ` David Hildenbrand
  2023-10-17  7:35   ` David Hildenbrand
@ 2023-10-17  7:52   ` Mike Rapoport
  1 sibling, 0 replies; 12+ messages in thread
From: Mike Rapoport @ 2023-10-17  7:52 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: x86, Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Ingo Molnar, Michal Hocko, Peter Zijlstra,
	Qi Zheng, Thomas Gleixner, linux-kernel, linux-mm

On Tue, Oct 17, 2023 at 09:28:14AM +0200, David Hildenbrand wrote:
> On 17.10.23 08:22, Mike Rapoport wrote:
> > From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> > 
> > Qi Zheng reports crashes in a production environment and provides a
> > simplified example as a reproducer:
> > 
> >    For example, if we use qemu to start a two NUMA node kernel,
> >    one of the nodes has 2M memory (less than NODE_MIN_SIZE),
> >    and the other node has 2G, then we will encounter the
> >    following panic:
> > 
> >    [    0.149844] BUG: kernel NULL pointer dereference, address: 0000000000000000
> >    [    0.150783] #PF: supervisor write access in kernel mode
> >    [    0.151488] #PF: error_code(0x0002) - not-present page
> >    <...>
> >    [    0.156056] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
> >    <...>
> >    [    0.169781] Call Trace:
> >    [    0.170159]  <TASK>
> >    [    0.170448]  deactivate_slab+0x187/0x3c0
> >    [    0.171031]  ? bootstrap+0x1b/0x10e
> >    [    0.171559]  ? preempt_count_sub+0x9/0xa0
> >    [    0.172145]  ? kmem_cache_alloc+0x12c/0x440
> >    [    0.172735]  ? bootstrap+0x1b/0x10e
> >    [    0.173236]  bootstrap+0x6b/0x10e
> >    [    0.173720]  kmem_cache_init+0x10a/0x188
> >    [    0.174240]  start_kernel+0x415/0x6ac
> >    [    0.174738]  secondary_startup_64_no_verify+0xe0/0xeb
> >    [    0.175417]  </TASK>
> >    [    0.175713] Modules linked in:
> >    [    0.176117] CR2: 0000000000000000
> > 
> > The crashes happen because of inconsistency between nodemask that has
> > nodes with less than 4MB as memoryless and the actual memory fed into
> > core mm.
> > 
> > The commit 9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring
> > empty node in SRAT parsing") that introduced minimal size of a NUMA node
> > does not explain why a node size cannot be less than 4MB and what boot
> > failures this restriction might fix.
> > 
> > Since then a lot has changed and core mm won't confuse badly about small
> > node sizes.
> > 
> > Drop the limitation for the minimal node size.
> > 
> > Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
> > Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> > Acked-by: David Hildenbrand <david@redhat.com>
> > Acked-by: Michal Hocko <mhocko@suse.com>
> > Link: https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/
> 
> That's just a resend I assume? Or has anything changed?

Oh, I forgot the RESEND prefix, sorry.
 
> -- 
> Cheers,
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.



* [PATCH v2] x86/mm: Drop 4MB restriction on minimal NUMA node memory size
  2023-10-17  6:22 [PATCH] x86/mm: drop 4MB restriction on minimal NUMA node size Mike Rapoport
  2023-10-17  7:28 ` David Hildenbrand
@ 2023-10-18 10:42 ` Ingo Molnar
  2023-10-18 12:26   ` Qi Zheng
  2023-10-19  9:35   ` Mike Rapoport
  2023-10-18 11:55 ` [PATCH] x86/mm: drop 4MB restriction on minimal NUMA node size Mario Casquero
  2 siblings, 2 replies; 12+ messages in thread
From: Ingo Molnar @ 2023-10-18 10:42 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: x86, Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	David Hildenbrand, H. Peter Anvin, Ingo Molnar, Michal Hocko,
	Peter Zijlstra, Qi Zheng, Thomas Gleixner, linux-kernel, linux-mm


* Mike Rapoport <rppt@kernel.org> wrote:

> From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> 
> Qi Zheng reports crashes in a production environment and provides a
> simplified example as a reproducer:
> 
>   For example, if we use qemu to start a two NUMA node kernel,
>   one of the nodes has 2M memory (less than NODE_MIN_SIZE),
>   and the other node has 2G, then we will encounter the
>   following panic:
> 
>   [    0.149844] BUG: kernel NULL pointer dereference, address: 0000000000000000
>   [    0.150783] #PF: supervisor write access in kernel mode
>   [    0.151488] #PF: error_code(0x0002) - not-present page
>   <...>
>   [    0.156056] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
>   <...>
>   [    0.169781] Call Trace:
>   [    0.170159]  <TASK>
>   [    0.170448]  deactivate_slab+0x187/0x3c0
>   [    0.171031]  ? bootstrap+0x1b/0x10e
>   [    0.171559]  ? preempt_count_sub+0x9/0xa0
>   [    0.172145]  ? kmem_cache_alloc+0x12c/0x440
>   [    0.172735]  ? bootstrap+0x1b/0x10e
>   [    0.173236]  bootstrap+0x6b/0x10e
>   [    0.173720]  kmem_cache_init+0x10a/0x188
>   [    0.174240]  start_kernel+0x415/0x6ac
>   [    0.174738]  secondary_startup_64_no_verify+0xe0/0xeb
>   [    0.175417]  </TASK>
>   [    0.175713] Modules linked in:
>   [    0.176117] CR2: 0000000000000000
> 
> The crashes happen because of inconsistency between nodemask that has
> nodes with less than 4MB as memoryless and the actual memory fed into
> core mm.

Presumably the core MM got fixed too to not just crash, but provide some 
sort of warning?

> The commit 9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring
> empty node in SRAT parsing") that introduced minimal size of a NUMA node
> does not explain why a node size cannot be less than 4MB and what boot
> failures this restriction might fix.
> 
> Since then a lot has changed and core mm won't confuse badly about small
> node sizes.

Core MM won't get confused ... other than by the above weird Qemu topology, 
to which it responds with a ... NULL pointer dereference?

Seems quite close to the literal definition of 'get confused badly' to me, 
and doesn't give me the warm fuzzy feeling that giving the core MM even 
*more* weird topologies is super safe ... :-/

> Drop the limitation for the minimal node size.

While I agree with dropping the limitation, and I agree that 9391a3f9c7f1 
should have provided more of a justification, I believe a core MM fix is in 
order as well, for it to not crash. [ If it's fixed upstream already, 
please reference the relevant commit ID. ]

Also, the changelog spelling & general presentation were quite low quality 
- I've fixed it up a bit below, please carry this version going forward. 
Please spell-check your patches before sending out Nth versions of them, 
maybe maintainers are skipping them for a reason!

Thanks,

	Ingo

=================>
From: "Mike Rapoport (IBM)" <rppt@kernel.org>
Date: Tue, 17 Oct 2023 09:22:15 +0300
Subject: [PATCH] x86/mm: Drop 4MB restriction on minimal NUMA node memory size

Qi Zheng reported crashes in a production environment and provided a
simplified example as a reproducer:

 |  For example, if we use qemu to start a two NUMA node kernel,
 |  one of the nodes has 2M memory (less than NODE_MIN_SIZE),
 |  and the other node has 2G, then we will encounter the
 |  following panic:
 |
 |    BUG: kernel NULL pointer dereference, address: 0000000000000000
 |    <...>
 |    RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
 |    <...>
 |    Call Trace:
 |      <TASK>
 |      deactivate_slab()
 |      bootstrap()
 |      kmem_cache_init()
 |      start_kernel()
 |      secondary_startup_64_no_verify()

The crashes happen because of inconsistency between the nodemask that
has nodes with less than 4MB as memoryless, and the actual memory fed
into the core mm.

The commit:

  9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring empty node in SRAT parsing")

... that introduced minimal size of a NUMA node does not explain why
a node size cannot be less than 4MB and what boot failures this
restriction might fix.

In the 17 years since then a lot has changed and core mm won't get
confused about small node sizes.

Drop the limitation for the minimal node size.

[ mingo: Improved changelog clarity. ]

Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Not-Yet-Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Link: https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/
---
 arch/x86/include/asm/numa.h | 7 -------
 arch/x86/mm/numa.c          | 7 -------
 2 files changed, 14 deletions(-)

diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
index e3bae2b60a0d..ef2844d69173 100644
--- a/arch/x86/include/asm/numa.h
+++ b/arch/x86/include/asm/numa.h
@@ -12,13 +12,6 @@
 
 #define NR_NODE_MEMBLKS		(MAX_NUMNODES*2)
 
-/*
- * Too small node sizes may confuse the VM badly. Usually they
- * result from BIOS bugs. So dont recognize nodes as standalone
- * NUMA entities that have less than this amount of RAM listed:
- */
-#define NODE_MIN_SIZE (4*1024*1024)
-
 extern int numa_off;
 
 /*
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index c01c5506fd4a..aa39d678fe81 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -602,13 +602,6 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
 		if (start >= end)
 			continue;
 
-		/*
-		 * Don't confuse VM with a node that doesn't have the
-		 * minimum amount of memory:
-		 */
-		if (end && (end - start) < NODE_MIN_SIZE)
-			continue;
-
 		alloc_node_data(nid);
 	}
 



* Re: [PATCH] x86/mm: drop 4MB restriction on minimal NUMA node size
  2023-10-17  6:22 [PATCH] x86/mm: drop 4MB restriction on minimal NUMA node size Mike Rapoport
  2023-10-17  7:28 ` David Hildenbrand
  2023-10-18 10:42 ` [PATCH v2] x86/mm: Drop 4MB restriction on minimal NUMA node memory size Ingo Molnar
@ 2023-10-18 11:55 ` Mario Casquero
  2 siblings, 0 replies; 12+ messages in thread
From: Mario Casquero @ 2023-10-18 11:55 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: x86, Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	David Hildenbrand, H. Peter Anvin, Ingo Molnar, Michal Hocko,
	Peter Zijlstra, Qi Zheng, Thomas Gleixner, linux-kernel, linux-mm

This patch has been successfully tested by QE. We started a VM with two
NUMA nodes, one of them with less than 2M of memory, and checked that
there is no kernel panic and that the VM boots up smoothly.
Tested-by: Mario Casquero <mcasquer@redhat.com>

BR,
Mario




On Tue, Oct 17, 2023 at 8:24 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> From: "Mike Rapoport (IBM)" <rppt@kernel.org>
>
> Qi Zheng reports crashes in a production environment and provides a
> simplified example as a reproducer:
>
>   For example, if we use qemu to start a two NUMA node kernel,
>   one of the nodes has 2M memory (less than NODE_MIN_SIZE),
>   and the other node has 2G, then we will encounter the
>   following panic:
>
>   [    0.149844] BUG: kernel NULL pointer dereference, address: 0000000000000000
>   [    0.150783] #PF: supervisor write access in kernel mode
>   [    0.151488] #PF: error_code(0x0002) - not-present page
>   <...>
>   [    0.156056] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
>   <...>
>   [    0.169781] Call Trace:
>   [    0.170159]  <TASK>
>   [    0.170448]  deactivate_slab+0x187/0x3c0
>   [    0.171031]  ? bootstrap+0x1b/0x10e
>   [    0.171559]  ? preempt_count_sub+0x9/0xa0
>   [    0.172145]  ? kmem_cache_alloc+0x12c/0x440
>   [    0.172735]  ? bootstrap+0x1b/0x10e
>   [    0.173236]  bootstrap+0x6b/0x10e
>   [    0.173720]  kmem_cache_init+0x10a/0x188
>   [    0.174240]  start_kernel+0x415/0x6ac
>   [    0.174738]  secondary_startup_64_no_verify+0xe0/0xeb
>   [    0.175417]  </TASK>
>   [    0.175713] Modules linked in:
>   [    0.176117] CR2: 0000000000000000
>
> The crashes happen because of inconsistency between nodemask that has
> nodes with less than 4MB as memoryless and the actual memory fed into
> core mm.
>
> The commit 9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring
> empty node in SRAT parsing") that introduced minimal size of a NUMA node
> does not explain why a node size cannot be less than 4MB and what boot
> failures this restriction might fix.
>
> Since then a lot has changed and core mm won't confuse badly about small
> node sizes.
>
> Drop the limitation for the minimal node size.
>
> Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> Acked-by: David Hildenbrand <david@redhat.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Link: https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/
> ---
>  arch/x86/include/asm/numa.h | 7 -------
>  arch/x86/mm/numa.c          | 7 -------
>  2 files changed, 14 deletions(-)
>
> diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
> index e3bae2b60a0d..ef2844d69173 100644
> --- a/arch/x86/include/asm/numa.h
> +++ b/arch/x86/include/asm/numa.h
> @@ -12,13 +12,6 @@
>
>  #define NR_NODE_MEMBLKS                (MAX_NUMNODES*2)
>
> -/*
> - * Too small node sizes may confuse the VM badly. Usually they
> - * result from BIOS bugs. So dont recognize nodes as standalone
> - * NUMA entities that have less than this amount of RAM listed:
> - */
> -#define NODE_MIN_SIZE (4*1024*1024)
> -
>  extern int numa_off;
>
>  /*
> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> index 2aadb2019b4f..55e3d895f15c 100644
> --- a/arch/x86/mm/numa.c
> +++ b/arch/x86/mm/numa.c
> @@ -601,13 +601,6 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
>                 if (start >= end)
>                         continue;
>
> -               /*
> -                * Don't confuse VM with a node that doesn't have the
> -                * minimum amount of memory:
> -                */
> -               if (end && (end - start) < NODE_MIN_SIZE)
> -                       continue;
> -
>                 alloc_node_data(nid);
>         }
>
>
> base-commit: 94f6f0550c625fab1f373bb86a6669b45e9748b3
> --
> 2.39.2
>




* Re: [PATCH v2] x86/mm: Drop 4MB restriction on minimal NUMA node memory size
  2023-10-18 10:42 ` [PATCH v2] x86/mm: Drop 4MB restriction on minimal NUMA node memory size Ingo Molnar
@ 2023-10-18 12:26   ` Qi Zheng
  2023-10-18 12:44     ` Ingo Molnar
  2023-10-19  9:35   ` Mike Rapoport
  1 sibling, 1 reply; 12+ messages in thread
From: Qi Zheng @ 2023-10-18 12:26 UTC (permalink / raw)
  To: Ingo Molnar, Mike Rapoport, David Hildenbrand, Michal Hocko,
	Andrew Morton
  Cc: x86, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	linux-kernel, linux-mm

Hi all,

On 2023/10/18 18:42, Ingo Molnar wrote:
> 
> * Mike Rapoport <rppt@kernel.org> wrote:
> 
>> From: "Mike Rapoport (IBM)" <rppt@kernel.org>
>>
>> Qi Zheng reports crashes in a production environment and provides a
>> simplified example as a reproducer:
>>
>>    For example, if we use qemu to start a two NUMA node kernel,
>>    one of the nodes has 2M memory (less than NODE_MIN_SIZE),
>>    and the other node has 2G, then we will encounter the
>>    following panic:
>>
>>    [    0.149844] BUG: kernel NULL pointer dereference, address: 0000000000000000
>>    [    0.150783] #PF: supervisor write access in kernel mode
>>    [    0.151488] #PF: error_code(0x0002) - not-present page
>>    <...>
>>    [    0.156056] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
>>    <...>
>>    [    0.169781] Call Trace:
>>    [    0.170159]  <TASK>
>>    [    0.170448]  deactivate_slab+0x187/0x3c0
>>    [    0.171031]  ? bootstrap+0x1b/0x10e
>>    [    0.171559]  ? preempt_count_sub+0x9/0xa0
>>    [    0.172145]  ? kmem_cache_alloc+0x12c/0x440
>>    [    0.172735]  ? bootstrap+0x1b/0x10e
>>    [    0.173236]  bootstrap+0x6b/0x10e
>>    [    0.173720]  kmem_cache_init+0x10a/0x188
>>    [    0.174240]  start_kernel+0x415/0x6ac
>>    [    0.174738]  secondary_startup_64_no_verify+0xe0/0xeb
>>    [    0.175417]  </TASK>
>>    [    0.175713] Modules linked in:
>>    [    0.176117] CR2: 0000000000000000
>>
>> The crashes happen because of inconsistency between nodemask that has
>> nodes with less than 4MB as memoryless and the actual memory fed into
>> core mm.
> 
> Presumably the core MM got fixed too to not just crash, but provide some
> sort of warning?
> 
>> The commit 9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring
>> empty node in SRAT parsing") that introduced minimal size of a NUMA node
>> does not explain why a node size cannot be less than 4MB and what boot
>> failures this restriction might fix.
>>
>> Since then a lot has changed and core mm won't confuse badly about small
>> node sizes.
> 
> Core MM won't get confused ... other than by the above weird Qemu topology,
> to which it responds with a ... NULL pointer dereference?
> 
> Seems quite close to the literal definition of 'get confused badly' to me,
> and doesn't give me the warm fuzzy feeling that giving the core MM even
> *more* weird topologies is super safe ... :-/
> 
>> Drop the limitation for the minimal node size.
> 
> While I agree with dropping the limitation, and I agree that 9391a3f9c7f1
> should have provided more of a justification, I believe a core MM fix is in
> order as well, for it to not crash. [ If it's fixed upstream already,
> please reference the relevant commit ID. ]

Agreed. I posted a patchset [1] with a fix before; maybe we can
reconsider it. :)

[1]. https://lore.kernel.org/lkml/20230215152412.13368-1-zhengqi.arch@bytedance.com/

For a memoryless node, this patchset skips it and falls back to other
nodes when building its zonelists.

Say we have node0 and node1, and node0 is memoryless, then:

[    0.102400] Fallback order for Node 0: 1
[    0.102931] Fallback order for Node 1: 1

This way, we will not allocate pages from the memoryless node0, which
fixes the crash under the weird Qemu topology.
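
The fallback behaviour described here can be sketched in user space as follows. This is only an illustration under stated assumptions and is not the code from the patchset: `build_fallback_order()` and `has_memory[]` are hypothetical names, and the candidates are visited in simple round-robin order starting at the node itself, whereas the kernel orders fallback nodes by NUMA distance.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8

/*
 * Fill order[] with the nodes that allocations for node 'nid' may use,
 * skipping every node without memory (including 'nid' itself).
 * Returns the number of entries written to order[].
 */
static size_t build_fallback_order(int nid, const bool has_memory[],
				   int nr_nodes, int order[])
{
	size_t count = 0;

	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);
	assert(nid >= 0 && nid < nr_nodes);

	for (int i = 0; i < nr_nodes; i++) {
		/* Start at nid and wrap around (simplification, see above). */
		int candidate = (nid + i) % nr_nodes;

		if (!has_memory[candidate])
			continue;

		order[count++] = candidate;
	}

	return count;
}
```

With node0 memoryless and node1 populated, the order computed for both node 0 and node 1 contains only node 1, which matches the two "Fallback order" lines above.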

Thanks,
Qi

> 
> Also, the changelog spelling & general presentation were quite low quality
> - I've fixed it up a bit below, please carry this version going forward.
> Please spell-check your patches before sending out Nth versions of it,
> maybe maintainers are skipping them for a reason!
> 
> Thanks,
> 
> 	Ingo
> 
> =================>
> From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> Date: Tue, 17 Oct 2023 09:22:15 +0300
> Subject: [PATCH] x86/mm: Drop 4MB restriction on minimal NUMA node memory size
> 
> Qi Zheng reported crashes in a production environment and provided a
> simplified example as a reproducer:
> 
>   |  For example, if we use qemu to start a two NUMA node kernel,
>   |  one of the nodes has 2M memory (less than NODE_MIN_SIZE),
>   |  and the other node has 2G, then we will encounter the
>   |  following panic:
>   |
>   |    BUG: kernel NULL pointer dereference, address: 0000000000000000
>   |    <...>
>   |    RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
>   |    <...>
>   |    Call Trace:
>   |      <TASK>
>   |      deactivate_slab()
>   |      bootstrap()
>   |      kmem_cache_init()
>   |      start_kernel()
>   |      secondary_startup_64_no_verify()
> 
> The crashes happen because of inconsistency between the nodemask that
> has nodes with less than 4MB as memoryless, and the actual memory fed
> into the core mm.
> 
> The commit:
> 
>    9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring empty node in SRAT parsing")
> 
> ... that introduced minimal size of a NUMA node does not explain why
> a node size cannot be less than 4MB and what boot failures this
> restriction might fix.
> 
> In the 17 years since then a lot has changed and core mm won't get
> confused about small node sizes.
> 
> Drop the limitation for the minimal node size.
> 
> [ mingo: Improved changelog clarity. ]
> 
> Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> Not-Yet-Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Acked-by: David Hildenbrand <david@redhat.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Link: https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/
> ---
>   arch/x86/include/asm/numa.h | 7 -------
>   arch/x86/mm/numa.c          | 7 -------
>   2 files changed, 14 deletions(-)
> 
> diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
> index e3bae2b60a0d..ef2844d69173 100644
> --- a/arch/x86/include/asm/numa.h
> +++ b/arch/x86/include/asm/numa.h
> @@ -12,13 +12,6 @@
>   
>   #define NR_NODE_MEMBLKS		(MAX_NUMNODES*2)
>   
> -/*
> - * Too small node sizes may confuse the VM badly. Usually they
> - * result from BIOS bugs. So dont recognize nodes as standalone
> - * NUMA entities that have less than this amount of RAM listed:
> - */
> -#define NODE_MIN_SIZE (4*1024*1024)
> -
>   extern int numa_off;
>   
>   /*
> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> index c01c5506fd4a..aa39d678fe81 100644
> --- a/arch/x86/mm/numa.c
> +++ b/arch/x86/mm/numa.c
> @@ -602,13 +602,6 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
>   		if (start >= end)
>   			continue;
>   
> -		/*
> -		 * Don't confuse VM with a node that doesn't have the
> -		 * minimum amount of memory:
> -		 */
> -		if (end && (end - start) < NODE_MIN_SIZE)
> -			continue;
> -
>   		alloc_node_data(nid);
>   	}
>   



* Re: [PATCH v2] x86/mm: Drop 4MB restriction on minimal NUMA node memory size
  2023-10-18 12:26   ` Qi Zheng
@ 2023-10-18 12:44     ` Ingo Molnar
  2023-10-18 13:20       ` Qi Zheng
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2023-10-18 12:44 UTC (permalink / raw)
  To: Qi Zheng, Andrew Morton
  Cc: Mike Rapoport, David Hildenbrand, Michal Hocko, Andrew Morton,
	x86, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	linux-kernel, linux-mm


* Qi Zheng <zhengqi.arch@bytedance.com> wrote:

> > While I agree with dropping the limitation, and I agree that 
> > 9391a3f9c7f1 should have provided more of a justification, I believe a 
> > core MM fix is in order as well, for it to not crash. [ If it's fixed 
> > upstream already, please reference the relevant commit ID. ]
> 
> Agree. I posted a fixed patchset[1] before, maybe we can reconsider it. 
> :)
> 
> [1]. https://lore.kernel.org/lkml/20230215152412.13368-1-zhengqi.arch@bytedance.com/
> 
> For memoryless node, this patchset skip it and fallback to other nodes
> when build its zonelists.

Mind resubmitting that to the MM folks, with the NULL dereference crash 
mentioned prominently? Feel free to Cc: me.

Fixing hypothetical robustness problems is good, fixing specific crashes is 
better. :-)

Thanks,

	Ingo



* Re: [PATCH v2] x86/mm: Drop 4MB restriction on minimal NUMA node memory size
  2023-10-18 12:44     ` Ingo Molnar
@ 2023-10-18 13:20       ` Qi Zheng
  2023-10-20  8:46         ` Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Qi Zheng @ 2023-10-18 13:20 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Andrew Morton, Mike Rapoport, David Hildenbrand, Michal Hocko,
	x86, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	linux-kernel, linux-mm

Hi Ingo,

On 2023/10/18 20:44, Ingo Molnar wrote:
> 
> * Qi Zheng <zhengqi.arch@bytedance.com> wrote:
> 
>>> While I agree with dropping the limitation, and I agree that
>>> 9391a3f9c7f1 should have provided more of a justification, I believe a
>>> core MM fix is in order as well, for it to not crash. [ If it's fixed
>>> upstream already, please reference the relevant commit ID. ]
>>
>> Agree. I posted a fixed patchset[1] before, maybe we can reconsider it.
>> :)
>>
>> [1]. https://lore.kernel.org/lkml/20230215152412.13368-1-zhengqi.arch@bytedance.com/
>>
>> For memoryless node, this patchset skip it and fallback to other nodes
>> when build its zonelists.
> 
> Mind resubmitting that to the MM folks, with the NULL dereference crash
> mentioned prominently? Feel free to Cc: me.

OK, I will resend it if no one else objects. :)

Thanks,
Qi

> 
> Fixing hypothetical robustness problems is good, fixing specific crashes is
> better. :-)
> 
> Thanks,
> 
> 	Ingo



* Re: [PATCH v2] x86/mm: Drop 4MB restriction on minimal NUMA node memory size
  2023-10-18 10:42 ` [PATCH v2] x86/mm: Drop 4MB restriction on minimal NUMA node memory size Ingo Molnar
  2023-10-18 12:26   ` Qi Zheng
@ 2023-10-19  9:35   ` Mike Rapoport
  1 sibling, 0 replies; 12+ messages in thread
From: Mike Rapoport @ 2023-10-19  9:35 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: x86, Andrew Morton, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	David Hildenbrand, H. Peter Anvin, Ingo Molnar, Michal Hocko,
	Peter Zijlstra, Qi Zheng, Thomas Gleixner, linux-kernel, linux-mm

On Wed, Oct 18, 2023 at 12:42:50PM +0200, Ingo Molnar wrote:
> 
> * Mike Rapoport <rppt@kernel.org> wrote:
> 
> > From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> > 
> > Qi Zheng reports crashes in a production environment and provides a
> > simplified example as a reproducer:
> > 
> >   For example, if we use qemu to start a two NUMA node kernel,
> >   one of the nodes has 2M memory (less than NODE_MIN_SIZE),
> >   and the other node has 2G, then we will encounter the
> >   following panic:
> > 
> >   [    0.149844] BUG: kernel NULL pointer dereference, address: 0000000000000000
> >   [    0.150783] #PF: supervisor write access in kernel mode
> >   [    0.151488] #PF: error_code(0x0002) - not-present page
> >   <...>
> >   [    0.156056] RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
> >   <...>
> >   [    0.169781] Call Trace:
> >   [    0.170159]  <TASK>
> >   [    0.170448]  deactivate_slab+0x187/0x3c0
> >   [    0.171031]  ? bootstrap+0x1b/0x10e
> >   [    0.171559]  ? preempt_count_sub+0x9/0xa0
> >   [    0.172145]  ? kmem_cache_alloc+0x12c/0x440
> >   [    0.172735]  ? bootstrap+0x1b/0x10e
> >   [    0.173236]  bootstrap+0x6b/0x10e
> >   [    0.173720]  kmem_cache_init+0x10a/0x188
> >   [    0.174240]  start_kernel+0x415/0x6ac
> >   [    0.174738]  secondary_startup_64_no_verify+0xe0/0xeb
> >   [    0.175417]  </TASK>
> >   [    0.175713] Modules linked in:
> >   [    0.176117] CR2: 0000000000000000
> > 
> > The crashes happen because of inconsistency between nodemask that has
> > nodes with less than 4MB as memoryless and the actual memory fed into
> > core mm.
> 
> Presumably the core MM got fixed too to not just crash, but provide some 
> sort of warning?
> 
> > The commit 9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring
> > empty node in SRAT parsing") that introduced minimal size of a NUMA node
> > does not explain why a node size cannot be less than 4MB and what boot
> > failures this restriction might fix.
> > 
> > Since then a lot has changed and core mm won't confuse badly about small
> > node sizes.
> 
> Core MM won't get confused ... other than by the above weird Qemu topology, 
> to which it responds with a ... NULL pointer dereference?
> 
> Seems quite close to the literal definition of 'get confused badly' to me, 
> and doesn't give me the warm fuzzy feeling that giving the core MM even 
> *more* weird topologies is super safe ... :-/

The confusion is not about topology and not because of the small size of a
node.
The confusion is because x86 fails to consistently report its memory
layout.  At one point it says there is a memoryless node and at another
point it says that the same node actually has memory.

And dropping the kludge that says "Don't confuse VM with a node that
doesn't have the minimum amount of memory" fixes exactly that.
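
To make the mismatch concrete, here is a minimal userspace sketch, not
kernel code. Only NODE_MIN_SIZE and the removed size check come from the
patch; struct node_range and the three helpers are hypothetical names
made up for illustration, and the real code paths are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The limit that the patch removes from arch/x86/include/asm/numa.h. */
#define NODE_MIN_SIZE (4ULL * 1024 * 1024)

/* Hypothetical stand-in for one node's memory range [start, end). */
struct node_range {
	unsigned long long start;
	unsigned long long end;
};

/* Models the arch side: would this node get its node data allocated? */
static bool arch_sets_up_node(const struct node_range *r, bool enforce_min)
{
	if (r->start >= r->end)
		return false;
	/* The check dropped by the patch. */
	if (enforce_min && r->end && (r->end - r->start) < NODE_MIN_SIZE)
		return false;
	return true;
}

/* Models the other side: the range is fed to core mm regardless of size. */
static bool core_mm_gets_memory(const struct node_range *r)
{
	return r->end > r->start;
}

/* Counts nodes that the two views disagree about. */
static int count_inconsistent(const struct node_range *nodes, size_t n,
			      bool enforce_min)
{
	int bad = 0;

	assert(nodes != NULL);
	for (size_t i = 0; i < n; i++)
		if (arch_sets_up_node(&nodes[i], enforce_min) !=
		    core_mm_gets_memory(&nodes[i]))
			bad++;
	return bad;
}
```

With Qi's layout (one 2M node and one 2G node), count_inconsistent()
finds one mismatched node while the limit is enforced and none once it
is dropped.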
 
> > Drop the limitation for the minimal node size.
> 
> While I agree with dropping the limitation, and I agree that 9391a3f9c7f1 
> should have provided more of a justification, I believe a core MM fix is in 
> order as well, for it to not crash. [ If it's fixed upstream already, 
> please reference the relevant commit ID. ]

The core mm can be more strict about memory layouts it accepts indeed.
I'll look into that.
 
> Also, the changelog spelling & general presentation were quite low quality 
> - I've fixed it up a bit below, please carry this version going forward. 
> Please spell-check your patches before sending out Nth versions of it, 
> maybe maintainers are skipping them for a reason!
> 
> Thanks,
> 
> 	Ingo
> 
> =================>
> From: "Mike Rapoport (IBM)" <rppt@kernel.org>
> Date: Tue, 17 Oct 2023 09:22:15 +0300
> Subject: [PATCH] x86/mm: Drop 4MB restriction on minimal NUMA node memory size
> 
> Qi Zheng reported crashes in a production environment and provided a
> simplified example as a reproducer:
> 
>  |  For example, if we use qemu to start a two NUMA node kernel,
>  |  one of the nodes has 2M memory (less than NODE_MIN_SIZE),
>  |  and the other node has 2G, then we will encounter the
>  |  following panic:
>  |
>  |    BUG: kernel NULL pointer dereference, address: 0000000000000000
>  |    <...>
>  |    RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40
>  |    <...>
>  |    Call Trace:
>  |      <TASK>
>  |      deactivate_slab()
>  |      bootstrap()
>  |      kmem_cache_init()
>  |      start_kernel()
>  |      secondary_startup_64_no_verify()
> 
> The crashes happen because of inconsistency between the nodemask that
> has nodes with less than 4MB as memoryless, and the actual memory fed
> into the core mm.
> 
> The commit:
> 
>   9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring empty node in SRAT parsing")
> 
> ... that introduced minimal size of a NUMA node does not explain why
> a node size cannot be less than 4MB and what boot failures this
> restriction might fix.
> 
> In the 17 years since then a lot has changed and core mm won't get
> confused about small node sizes.
> 
> Drop the limitation for the minimal node size.
> 
> [ mingo: Improved changelog clarity. ]
> 
> Reported-by: Qi Zheng <zhengqi.arch@bytedance.com>
> Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
> Not-Yet-Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Acked-by: David Hildenbrand <david@redhat.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Link: https://lore.kernel.org/all/20230212110305.93670-1-zhengqi.arch@bytedance.com/
> ---
>  arch/x86/include/asm/numa.h | 7 -------
>  arch/x86/mm/numa.c          | 7 -------
>  2 files changed, 14 deletions(-)
> 
> diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h
> index e3bae2b60a0d..ef2844d69173 100644
> --- a/arch/x86/include/asm/numa.h
> +++ b/arch/x86/include/asm/numa.h
> @@ -12,13 +12,6 @@
>  
>  #define NR_NODE_MEMBLKS		(MAX_NUMNODES*2)
>  
> -/*
> - * Too small node sizes may confuse the VM badly. Usually they
> - * result from BIOS bugs. So dont recognize nodes as standalone
> - * NUMA entities that have less than this amount of RAM listed:
> - */
> -#define NODE_MIN_SIZE (4*1024*1024)
> -
>  extern int numa_off;
>  
>  /*
> diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
> index c01c5506fd4a..aa39d678fe81 100644
> --- a/arch/x86/mm/numa.c
> +++ b/arch/x86/mm/numa.c
> @@ -602,13 +602,6 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
>  		if (start >= end)
>  			continue;
>  
> -		/*
> -		 * Don't confuse VM with a node that doesn't have the
> -		 * minimum amount of memory:
> -		 */
> -		if (end && (end - start) < NODE_MIN_SIZE)
> -			continue;
> -
>  		alloc_node_data(nid);
>  	}
>  

-- 
Sincerely yours,
Mike.



* Re: [PATCH v2] x86/mm: Drop 4MB restriction on minimal NUMA node memory size
  2023-10-18 13:20       ` Qi Zheng
@ 2023-10-20  8:46         ` Ingo Molnar
  2023-10-20  8:59           ` Ingo Molnar
  0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2023-10-20  8:46 UTC (permalink / raw)
  To: Qi Zheng
  Cc: Andrew Morton, Mike Rapoport, David Hildenbrand, Michal Hocko,
	x86, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	linux-kernel, linux-mm


* Qi Zheng <zhengqi.arch@bytedance.com> wrote:

> Hi Ingo,
> 
> On 2023/10/18 20:44, Ingo Molnar wrote:
> > 
> > * Qi Zheng <zhengqi.arch@bytedance.com> wrote:
> > 
> > > > While I agree with dropping the limitation, and I agree that
> > > > 9391a3f9c7f1 should have provided more of a justification, I believe a
> > > > core MM fix is in order as well, for it to not crash. [ If it's fixed
> > > > upstream already, please reference the relevant commit ID. ]
> > > 
> > > Agree. I posted a fixed patchset[1] before, maybe we can reconsider it.
> > > :)
> > > 
> > > [1]. https://lore.kernel.org/lkml/20230215152412.13368-1-zhengqi.arch@bytedance.com/
> > > 
> > > For memoryless node, this patchset skip it and fallback to other nodes
> > > when build its zonelists.
> > 
> > Mind resubmitting that to the MM folks, with the NULL dereference crash
> > mentioned prominently? Feel free to Cc: me.
> 
> OK, I will resend it if no one else objects. :)

Thanks, much appreciated - and I see Andrew already applied your two fixes 
to -mm.

With that background I was able to apply the x86 fix as well - which can be 
backported without the MM changes. The current commit in tip:x86/mm is:

  a1e2b8b36820 ("x86/mm: Drop the 4 MB restriction on minimal NUMA node memory size")

Which should hit v6.7 in about 1.5 weeks, unless there are unexpected 
problems.

Thanks,

	Ingo



* Re: [PATCH v2] x86/mm: Drop 4MB restriction on minimal NUMA node memory size
  2023-10-20  8:46         ` Ingo Molnar
@ 2023-10-20  8:59           ` Ingo Molnar
  0 siblings, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2023-10-20  8:59 UTC (permalink / raw)
  To: Qi Zheng
  Cc: Andrew Morton, Mike Rapoport, David Hildenbrand, Michal Hocko,
	x86, Andy Lutomirski, Borislav Petkov, Dave Hansen,
	H. Peter Anvin, Ingo Molnar, Peter Zijlstra, Thomas Gleixner,
	linux-kernel, linux-mm


* Ingo Molnar <mingo@kernel.org> wrote:

> > > Mind resubmitting that to the MM folks, with the NULL dereference 
> > > crash mentioned prominently? Feel free to Cc: me.
> > 
> > OK, I will resend it if no one else objects. :)
> 
> Thanks, much appreciated - and I see Andrew already applied your two 
> fixes to -mm.
> 
> With that background I was able to apply the x86 fix as well - which can 
> be backported without the MM changes. The current commit in tip:x86/mm 
> is:
> 
>   a1e2b8b36820 ("x86/mm: Drop the 4 MB restriction on minimal NUMA node memory size")
> 
> Which should hit v6.7 in about 1.5 weeks, unless there are unexpected 
> problems.

Note that I haven't added a Cc: stable tag, a 17-year-old bug is not 
really a regression - but I have no objections against this fix getting 
into -stable once it gains a bit more testing and hits upstream.

Thanks,

	Ingo



Thread overview: 12+ messages
2023-10-17  6:22 [PATCH] x86/mm: drop 4MB restriction on minimal NUMA node size Mike Rapoport
2023-10-17  7:28 ` David Hildenbrand
2023-10-17  7:35   ` David Hildenbrand
2023-10-17  7:52   ` Mike Rapoport
2023-10-18 10:42 ` [PATCH v2] x86/mm: Drop 4MB restriction on minimal NUMA node memory size Ingo Molnar
2023-10-18 12:26   ` Qi Zheng
2023-10-18 12:44     ` Ingo Molnar
2023-10-18 13:20       ` Qi Zheng
2023-10-20  8:46         ` Ingo Molnar
2023-10-20  8:59           ` Ingo Molnar
2023-10-19  9:35   ` Mike Rapoport
2023-10-18 11:55 ` [PATCH] x86/mm: drop 4MB restriction on minimal NUMA node size Mario Casquero
