From: Michal Hocko <mhocko@suse.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: mm-commits@vger.kernel.org, stable@vger.kernel.org,
nifan@outlook.com, dave@stgolabs.net, dan.j.williams@intel.com,
a.manzanares@samsung.com, dongjoo.linux.dev@gmail.com
Subject: Re: + mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes.patch added to mm-hotfixes-unstable branch
Date: Thu, 24 Oct 2024 11:10:16 +0200 [thread overview]
Message-ID: <ZxoO-NEVYaZQbpxN@tiehlicka> (raw)
In-Reply-To: <20241023204125.C4DFBC4CECC@smtp.kernel.org>
I believe this patch should be dropped. Not only it doesn't really
describes an actual problem I believe it is indeed incorrect as
explained https://lore.kernel.org/all/ZxoEWBakAv64wfhD@tiehlicka/T/#u
On Wed 23-10-24 13:41:25, Andrew Morton wrote:
> From: Dongjoo Seo <dongjoo.linux.dev@gmail.com>
> Subject: mm/page_alloc: fix NUMA stats update for cpu-less nodes
> Date: Wed, 23 Oct 2024 10:50:37 -0700
>
> In the case of memoryless node, when a process prefers a node with no
> memory(e.g., because it is running on a CPU local to that node), the
> kernel treats a nearby node with memory as the preferred node. As a
> result, such allocations do not increment the numa_foreign counter on the
> memoryless node, leading to skewed NUMA_HIT, NUMA_MISS, and NUMA_FOREIGN
> stats for the nearest node.
>
> This patch corrects this issue by:
> 1. Checking if the zone or preferred zone is CPU-less before updating
> the NUMA stats.
> 2. Ensuring NUMA_HIT is only updated if the zone is not CPU-less.
> 3. Ensuring NUMA_FOREIGN is only updated if the preferred zone is not
> CPU-less.
>
> Example Before and After Patch:
> - Before Patch:
> node0 node1 node2
> numa_hit 86333181 114338269 5108
> numa_miss 5199455 0 56844591
> numa_foreign 32281033 29763013 0
> interleave_hit 91 91 0
> local_node 86326417 114288458 0
> other_node 5206219 49768 56849702
>
> - After Patch:
> node0 node1 node2
> numa_hit 2523058 9225528 0
> numa_miss 150213 10226 21495942
> numa_foreign 17144215 4501270 0
> interleave_hit 91 94 0
> local_node 2493918 9208226 0
> other_node 179351 27528 21495942
>
> Similarly, in the context of cpuless nodes, this patch ensures that NUMA
> statistics are accurately updated by adding checks to prevent the
> miscounting of memory allocations when the involved nodes have no CPUs.
> This ensures more precise tracking of memory access patterns accross all
> nodes, regardless of whether they have CPUs or not, improving the overall
> reliability of NUMA stat. The reason is that page allocation from
> dev_dax, cpuset, memcg .. comes with preferred allocating zone in cpuless
> node and its hard to track the zone info for miss information.
>
> Link: https://lkml.kernel.org/r/20241023175037.9125-1-dongjoo.linux.dev@gmail.com
> Signed-off-by: Dongjoo Seo <dongjoo.linux.dev@gmail.com>
> Cc: Davidlohr Bueso <dave@stgolabs.net>
> Cc: Fan Ni <nifan@outlook.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Adam Manzanares <a.manzanares@samsung.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> ---
>
> mm/page_alloc.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> --- a/mm/page_alloc.c~mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes
> +++ a/mm/page_alloc.c
> @@ -2858,19 +2858,21 @@ static inline void zone_statistics(struc
> {
> #ifdef CONFIG_NUMA
> enum numa_stat_item local_stat = NUMA_LOCAL;
> + bool z_is_cpuless = !node_state(zone_to_nid(z), N_CPU);
> + bool pref_is_cpuless = !node_state(zone_to_nid(preferred_zone), N_CPU);
>
> - /* skip numa counters update if numa stats is disabled */
> if (!static_branch_likely(&vm_numa_stat_key))
> return;
>
> - if (zone_to_nid(z) != numa_node_id())
> + if (zone_to_nid(z) != numa_node_id() || z_is_cpuless)
> local_stat = NUMA_OTHER;
>
> - if (zone_to_nid(z) == zone_to_nid(preferred_zone))
> + if (zone_to_nid(z) == zone_to_nid(preferred_zone) && !z_is_cpuless)
> __count_numa_events(z, NUMA_HIT, nr_account);
> else {
> __count_numa_events(z, NUMA_MISS, nr_account);
> - __count_numa_events(preferred_zone, NUMA_FOREIGN, nr_account);
> + if (!pref_is_cpuless)
> + __count_numa_events(preferred_zone, NUMA_FOREIGN, nr_account);
> }
> __count_numa_events(z, local_stat, nr_account);
> #endif
> _
>
> Patches currently in -mm which might be from dongjoo.linux.dev@gmail.com are
>
> mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes.patch
--
Michal Hocko
SUSE Labs
prev parent reply other threads:[~2024-10-24 9:10 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-23 20:41 + mm-page_alloc-fix-numa-stats-update-for-cpu-less-nodes.patch added to mm-hotfixes-unstable branch Andrew Morton
2024-10-24 9:10 ` Michal Hocko [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZxoO-NEVYaZQbpxN@tiehlicka \
--to=mhocko@suse.com \
--cc=a.manzanares@samsung.com \
--cc=akpm@linux-foundation.org \
--cc=dan.j.williams@intel.com \
--cc=dave@stgolabs.net \
--cc=dongjoo.linux.dev@gmail.com \
--cc=mm-commits@vger.kernel.org \
--cc=nifan@outlook.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox