All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Rapoport <rppt@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>, Jan Beulich <jbeulich@suse.com>,
	Narasimhan V <Narasimhan.V@amd.com>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	stable@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [GIT PULL] memblock:fix validation of NUMA coverage
Date: Fri, 14 Jun 2024 11:17:52 +0300	[thread overview]
Message-ID: <Zmv8sMMGS8uosLQD@kernel.org> (raw)
In-Reply-To: <CAHk-=wgOMcScTviziAbL9Z2RDduaEFdZbHsESxqUS2eFfUmUVg@mail.gmail.com>

On Thu, Jun 13, 2024 at 10:38:28AM -0700, Linus Torvalds wrote:
> On Thu, 13 Jun 2024 at 10:09, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Is there some broken scripting that people have started using (or have
> > been using for a while and was recently broken)?
> 
> ... and then when I actually pull the code, I note that the problem
> where it checked _one_ bogus value has just been replaced with
> checking _another_ bogus value.
> 
> Christ.
> 
> What if people use a node ID that is simply outside the range
> entirely, instead of one of those special node IDs?
> 
> And now for memblock_set_node() you should apparently use NUMA_NO_NODE
> to not get a warning, but for memblock_set_region_node() apparently
> the right random constant to use is MAX_NUMNODES.
> 
> Does *any* of this make sense? No.
> 
> How about instead of having two random constants - and not having any
> range checking that I see - just have *one* random constant for "I
> have no range", call that NUMA_NO_NODE, and then have a simple helper
> for "do I have a valid range", and make that be
> 
>    static inline bool numa_valid_node(int nid)
>    { return (unsigned int)nid < MAX_NUMNODES; }
> 
> or something like that? Notice that now *all* of
> 
>  - NUMA_NO_NODE (explicitly no node)
> 
>  - MAX_NUMNODES (randomly used no node)
> 
>  - out of range node (who knows wth firmware tables do?)
> 
> will get the same result from that "numa_valid_node()" function.
> 
> And at that point you don't need to care, you don't need to warn, and
> you don't need to have these insane rules where "sometimes you *HAVE*
> to use NUMA_NO_NODE, or we warn, in other cases MAX_NUMNODES is the
> thing".
> 
> Please? IOW, instead of adding a warning for fragile code, then change
> some caller to follow the new rules, JUST FIX THE STUPID FRAGILITY!
> 
> Or hey, just do
> 
>     #define NUMA_NO_NODE MAX_NUMNODES
> 
> and have two names for the *same* constant, instead fo having two
> different constants with strange semantic differences that seem to
> make no sense and where the memblock code itself seems to go
> back-and-forth on it in different contexts.

A single constant is likely to backfire because I remember seeing checks
like 'if (nid < 0)' so redefining NUMA_NO_NODE will require auditing all
those.

But a helper function works great.
I could only lightly test it as I don't have a fleet of machines with
variety of memory layouts, so I'm planning to push it into -next early next
week (with subject replaced by a more informative one)

From 319eddd74b372cae840782c7d53832ab30533a6b Mon Sep 17 00:00:00 2001
From: "Mike Rapoport (IBM)" <rppt@kernel.org>
Date: Fri, 14 Jun 2024 11:05:43 +0300
Subject: [PATCH] memblock: FIX THE STUPID FRAGILITY

Introduce numa_valid_node(nid) that verifies that nid is a valid node ID
and use that instead of comparing nid parameter with either NUMA_NO_NODE
or MAX_NUMNODES.

This makes the checks for valid node IDs consistent and more robust and
allows to get rid of multiple WARNings.

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
---
 include/linux/numa.h |  5 +++++
 mm/memblock.c        | 28 +++++++---------------------
 2 files changed, 12 insertions(+), 21 deletions(-)

diff --git a/include/linux/numa.h b/include/linux/numa.h
index 1d43371fafd2..eb19503604fe 100644
--- a/include/linux/numa.h
+++ b/include/linux/numa.h
@@ -15,6 +15,11 @@
 #define	NUMA_NO_NODE	(-1)
 #define	NUMA_NO_MEMBLK	(-1)
 
+static inline bool numa_valid_node(int nid)
+{
+	return nid >= 0 && nid < MAX_NUMNODES;
+}
+
 /* optionally keep NUMA memory info available post init */
 #ifdef CONFIG_NUMA_KEEP_MEMINFO
 #define __initdata_or_meminfo
diff --git a/mm/memblock.c b/mm/memblock.c
index 08e9806b1cf9..e81fb68f7f88 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -754,7 +754,7 @@ bool __init_memblock memblock_validate_numa_coverage(unsigned long threshold_byt
 
 	/* calculate lose page */
 	for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
-		if (nid == NUMA_NO_NODE)
+		if (!numa_valid_node(nid))
 			nr_pages += end_pfn - start_pfn;
 	}
 
@@ -1061,7 +1061,7 @@ static bool should_skip_region(struct memblock_type *type,
 		return false;
 
 	/* only memory regions are associated with nodes, check it */
-	if (nid != NUMA_NO_NODE && nid != m_nid)
+	if (numa_valid_node(nid) && nid != m_nid)
 		return true;
 
 	/* skip hotpluggable memory regions if needed */
@@ -1118,10 +1118,6 @@ void __next_mem_range(u64 *idx, int nid, enum memblock_flags flags,
 	int idx_a = *idx & 0xffffffff;
 	int idx_b = *idx >> 32;
 
-	if (WARN_ONCE(nid == MAX_NUMNODES,
-	"Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n"))
-		nid = NUMA_NO_NODE;
-
 	for (; idx_a < type_a->cnt; idx_a++) {
 		struct memblock_region *m = &type_a->regions[idx_a];
 
@@ -1215,9 +1211,6 @@ void __init_memblock __next_mem_range_rev(u64 *idx, int nid,
 	int idx_a = *idx & 0xffffffff;
 	int idx_b = *idx >> 32;
 
-	if (WARN_ONCE(nid == MAX_NUMNODES, "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n"))
-		nid = NUMA_NO_NODE;
-
 	if (*idx == (u64)ULLONG_MAX) {
 		idx_a = type_a->cnt - 1;
 		if (type_b != NULL)
@@ -1303,7 +1296,7 @@ void __init_memblock __next_mem_pfn_range(int *idx, int nid,
 
 		if (PFN_UP(r->base) >= PFN_DOWN(r->base + r->size))
 			continue;
-		if (nid == MAX_NUMNODES || nid == r_nid)
+		if (!numa_valid_node(nid) || nid == r_nid)
 			break;
 	}
 	if (*idx >= type->cnt) {
@@ -1339,10 +1332,6 @@ int __init_memblock memblock_set_node(phys_addr_t base, phys_addr_t size,
 	int start_rgn, end_rgn;
 	int i, ret;
 
-	if (WARN_ONCE(nid == MAX_NUMNODES,
-		      "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n"))
-		nid = NUMA_NO_NODE;
-
 	ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn);
 	if (ret)
 		return ret;
@@ -1452,9 +1441,6 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
 	enum memblock_flags flags = choose_memblock_flags();
 	phys_addr_t found;
 
-	if (WARN_ONCE(nid == MAX_NUMNODES, "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n"))
-		nid = NUMA_NO_NODE;
-
 	if (!align) {
 		/* Can't use WARNs this early in boot on powerpc */
 		dump_stack();
@@ -1467,7 +1453,7 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size,
 	if (found && !memblock_reserve(found, size))
 		goto done;
 
-	if (nid != NUMA_NO_NODE && !exact_nid) {
+	if (numa_valid_node(nid) && !exact_nid) {
 		found = memblock_find_in_range_node(size, align, start,
 						    end, NUMA_NO_NODE,
 						    flags);
@@ -1987,7 +1973,7 @@ static void __init_memblock memblock_dump(struct memblock_type *type)
 		end = base + size - 1;
 		flags = rgn->flags;
 #ifdef CONFIG_NUMA
-		if (memblock_get_region_node(rgn) != MAX_NUMNODES)
+		if (numa_valid_node(memblock_get_region_node(rgn)))
 			snprintf(nid_buf, sizeof(nid_buf), " on node %d",
 				 memblock_get_region_node(rgn));
 #endif
@@ -2181,7 +2167,7 @@ static void __init memmap_init_reserved_pages(void)
 			start = region->base;
 			end = start + region->size;
 
-			if (nid == NUMA_NO_NODE || nid >= MAX_NUMNODES)
+			if (!numa_valid_node(nid))
 				nid = early_pfn_to_nid(PFN_DOWN(start));
 
 			reserve_bootmem_region(start, end, nid);
@@ -2272,7 +2258,7 @@ static int memblock_debug_show(struct seq_file *m, void *private)
 
 		seq_printf(m, "%4d: ", i);
 		seq_printf(m, "%pa..%pa ", &reg->base, &end);
-		if (nid != MAX_NUMNODES)
+		if (numa_valid_node(nid))
 			seq_printf(m, "%4d ", nid);
 		else
 			seq_printf(m, "%4c ", 'x');
-- 
2.43.0

 
>               Linus

-- 
Sincerely yours,
Mike.

  parent reply	other threads:[~2024-06-14  8:20 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-13 14:09 [GIT PULL] memblock:fix validation of NUMA coverage Mike Rapoport
2024-06-13 17:09 ` Linus Torvalds
2024-06-13 17:38   ` Linus Torvalds
2024-06-14  6:01     ` Jan Beulich
2024-06-14  7:31       ` Mike Rapoport
2024-06-14  8:17     ` Mike Rapoport [this message]
2024-06-14 16:28       ` Linus Torvalds
2024-06-13 19:30 ` pr-tracker-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zmv8sMMGS8uosLQD@kernel.org \
    --to=rppt@kernel.org \
    --cc=Narasimhan.V@amd.com \
    --cc=bp@alien8.de \
    --cc=jbeulich@suse.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=paulmck@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.