From: Dongjoo Seo <dongjoo.linux.dev@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Cc: dave@stgolabs.net, dan.j.williams@intel.com, nifan@outlook.com,
a.manzanares@samsung.com,
Dongjoo Seo <dongjoo.linux.dev@gmail.com>
Subject: [PATCH] mm/page_alloc: fix NUMA stats update for cpu-less nodes
Date: Wed, 23 Oct 2024 10:50:37 -0700 [thread overview]
Message-ID: <20241023175037.9125-1-dongjoo.linux.dev@gmail.com> (raw)
This patch corrects this issue by:
1. Checking if the zone or preferred zone is CPU-less before updating
the NUMA stats.
2. Ensuring NUMA_HIT is only updated if the zone is not CPU-less.
3. Ensuring NUMA_FOREIGN is only updated if the preferred zone is not
CPU-less.
Example Before and After Patch:
- Before Patch:
node0 node1 node2
numa_hit 86333181 114338269 5108
numa_miss 5199455 0 56844591
numa_foreign 32281033 29763013 0
interleave_hit 91 91 0
local_node 86326417 114288458 0
other_node 5206219 49768 56849702
- After Patch:
node0 node1 node2
numa_hit 2523058 9225528 0
numa_miss 150213 10226 21495942
numa_foreign 17144215 4501270 0
interleave_hit 91 94 0
local_node 2493918 9208226 0
other_node 179351 27528 21495942
In the case of memoryless node, when a process prefers a node
with no memory(e.g., because it is running on a CPU local to that
node), the kernel treats a nearby node with memory as the
preffered node. As a result, such allocation do not increment the
numa_foreign counter on the memoryless node, leading to skewed
NUMA_HIT, NUMA_MISS, and NUMA_FOREIGN stat for the nearest node.
Similarly, in the context of cpuless nodes, this patch ensures
that NUMA statistics are accurately updated by adding checks to
prevent the miscounting of memory allocations when the involved
nodes have no CPUs. This ensures more precise tracking of memory
access patterns accross all nodes, regardless of whether they
have CPUs or not, improving the overall reliability of NUMA stat.
The reason is that page allocation from dev_dax, cpuset, memcg ..
comes with preferred allocating zone in cpuless node and its hard
to track the zone info for miss information.
Signed-off-by: Dongjoo Seo <dongjoo.linux.dev@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Fan Ni <nifan@outlook.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Adam Manzanares <a.manzanares@samsung.com>
---
mm/page_alloc.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0f33dab6d344..2981466e8e1a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2894,19 +2894,21 @@ static inline void zone_statistics(struct zone *preferred_zone, struct zone *z,
{
#ifdef CONFIG_NUMA
enum numa_stat_item local_stat = NUMA_LOCAL;
+ bool z_is_cpuless = !node_state(zone_to_nid(z), N_CPU);
+ bool pref_is_cpuless = !node_state(zone_to_nid(preferred_zone), N_CPU);
- /* skip numa counters update if numa stats is disabled */
if (!static_branch_likely(&vm_numa_stat_key))
return;
- if (zone_to_nid(z) != numa_node_id())
+ if (zone_to_nid(z) != numa_node_id() || z_is_cpuless)
local_stat = NUMA_OTHER;
- if (zone_to_nid(z) == zone_to_nid(preferred_zone))
+ if (zone_to_nid(z) == zone_to_nid(preferred_zone) && !z_is_cpuless)
__count_numa_events(z, NUMA_HIT, nr_account);
else {
__count_numa_events(z, NUMA_MISS, nr_account);
- __count_numa_events(preferred_zone, NUMA_FOREIGN, nr_account);
+ if (!pref_is_cpuless)
+ __count_numa_events(preferred_zone, NUMA_FOREIGN, nr_account);
}
__count_numa_events(z, local_stat, nr_account);
#endif
--
2.39.2
next reply other threads:[~2024-10-23 17:50 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-23 17:50 Dongjoo Seo [this message]
2024-10-23 18:03 ` [PATCH] mm/page_alloc: fix NUMA stats update for cpu-less nodes Michal Hocko
2024-10-23 20:41 ` Andrew Morton
2024-10-23 21:38 ` Michal Hocko
2024-10-23 22:15 ` Dongjoo Seo
2024-10-23 22:23 ` Michal Hocko
2024-10-24 4:54 ` Dongjoo Seo
2024-10-24 8:24 ` Michal Hocko
2024-10-24 18:13 ` Dongjoo Seo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241023175037.9125-1-dongjoo.linux.dev@gmail.com \
--to=dongjoo.linux.dev@gmail.com \
--cc=a.manzanares@samsung.com \
--cc=akpm@linux-foundation.org \
--cc=dan.j.williams@intel.com \
--cc=dave@stgolabs.net \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=nifan@outlook.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox