From: "Chen, Yu C" <yu.c.chen@intel.com>
To: "Michal Koutný" <mkoutny@suse.com>,
"Shakeel Butt" <shakeel.butt@linux.dev>
Cc: <peterz@infradead.org>, <akpm@linux-foundation.org>,
<mingo@redhat.com>, <tj@kernel.org>, <hannes@cmpxchg.org>,
<corbet@lwn.net>, <mgorman@suse.de>, <mhocko@kernel.org>,
<muchun.song@linux.dev>, <roman.gushchin@linux.dev>,
<tim.c.chen@intel.com>, <aubrey.li@intel.com>,
<libo.chen@oracle.com>, <kprateek.nayak@amd.com>,
<vineethr@linux.ibm.com>, <venkat88@linux.ibm.com>,
<ayushjai@amd.com>, <cgroups@vger.kernel.org>,
<linux-doc@vger.kernel.org>, <linux-mm@kvack.org>,
<linux-kernel@vger.kernel.org>, <yu.chen.surf@foxmail.com>
Subject: Re: [PATCH v5 2/2] sched/numa: add statistics of numa balance task
Date: Tue, 27 May 2025 17:20:54 +0800 [thread overview]
Message-ID: <a8314889-f036-49ff-9cda-01367ddccf51@intel.com> (raw)
In-Reply-To: <uuhyie7udxyvbdpccwi7dl5cy26ygkkuxjixpl247u5nqwpcqm@5whxlt5ddswo>
On 5/26/2025 9:35 PM, Michal Koutný wrote:
> On Fri, May 23, 2025 at 04:42:50PM -0700, Shakeel Butt <shakeel.butt@linux.dev> wrote:
>> Hmm these are scheduler events, how are these relevant to memory cgroup
>> or vmstat? Any reason to not expose these in cpu.stat?
>
> Good point. If I take it further -- this functionality needs neither
> memory controller (CONFIG_MEMCG) nor CPU controller
> (CONFIG_CGROUP_SCHED), so it might be technically calculated and exposed
> in _any_ cgroup (which would be same technical solution how cpu time is
> counted in cpu.stat regardless of CPU controller, cpu_stat_show()).
>
Yes, we can add it to cpu.stat. However, this might make it more difficult
for users to locate related events. Some statistics about NUMA page
migrations/faults are recorded in memory.stat, while others about NUMA task
migrations (triggered by NUMA faults periodicly) are stored in cpu.stat.
Do you recommend extending the struct cgroup_base_stat to include counters
for task_migrate/task_swap? Additionally, should we enhance
cgroup_base_stat_cputime_show() to parse task_migrate/task_swap in a manner
similar to cputime?
Alternatively, as Shakeel previously mentioned, could we reuse
"count_memcg_event_mm()" and related infrastructure while exposing these
statistics/events in cpu.stat? I assume Shakeel was referring to the
following
approach:
1. Skip task migration/swap in memory.stat:
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cdaab8a957f3..b8eea3eca46f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1529,6 +1529,11 @@ static void memcg_stat_format(struct mem_cgroup
*memcg, struct seq_buf *s)
if (memcg_vm_event_stat[i] == PGPGIN ||
memcg_vm_event_stat[i] == PGPGOUT)
continue;
+#endif
+#ifdef CONFIG_NUMA_BALANCING
+ if (memcg_vm_event_stat[i] == NUMA_TASK_MIGRATE ||
+ memcg_vm_event_stat[i] == NUMA_TASK_SWAP)
+ continue;
#endif
2.Skip task migration/swap in /proc/vmstat
diff --git a/mm/vmstat.c b/mm/vmstat.c
index ed08bb384ae4..ea8a8ae1cdac 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1912,6 +1912,10 @@ static void *vmstat_next(struct seq_file *m, void
*arg, loff_t *pos)
(*pos)++;
if (*pos >= NR_VMSTAT_ITEMS)
return NULL;
+#ifdef CONFIG_NUMA_BALANCING
+ if (*pos == NUMA_TASK_MIGRATE || *pos == NUMA_TASK_SWAP)
+ return NULL;
+#endif
3. Display task migration/swap events in cpu.stat:
seq_buf_printf(&s, "%s %lu\n",
+
vm_event_name(memcg_vm_event_stat[NUMA_TASK_MIGRATE]),
+ memcg_events(memcg,
memcg_vm_event_stat[NUMA_TASK_MIGRATE]));
It looks like more code is needed. Michal, Shakeel, could you please advise
which strategy is preferred, or should we keep the current version?
Thanks,
Chenyu
next prev parent reply other threads:[~2025-05-27 9:21 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-23 12:48 [PATCH v5 0/2] sched/numa: add statistics of numa balance task migration Chen Yu
2025-05-23 12:51 ` [PATCH v5 1/2] sched/numa: fix task swap by skipping kernel threads Chen Yu
2025-05-23 23:22 ` Shakeel Butt
2025-05-23 12:51 ` [PATCH v5 2/2] sched/numa: add statistics of numa balance task Chen Yu
2025-05-23 23:42 ` Shakeel Butt
2025-05-24 9:07 ` Chen, Yu C
2025-05-24 17:32 ` Shakeel Butt
2025-05-25 12:35 ` Chen, Yu C
2025-05-27 17:48 ` Shakeel Butt
2025-05-29 5:04 ` Chen, Yu C
2025-05-26 13:35 ` Michal Koutný
2025-05-27 9:20 ` Chen, Yu C [this message]
2025-05-27 18:15 ` Shakeel Butt
2025-06-02 16:53 ` Michal Koutný
2025-06-03 14:46 ` Chen, Yu C
2025-06-17 9:30 ` Michal Koutný
2025-06-19 13:03 ` Chen, Yu C
2025-06-19 14:06 ` Michal Koutný
2025-05-23 22:06 ` [PATCH v5 0/2] sched/numa: add statistics of numa balance task migration Andrew Morton
2025-05-23 23:52 ` Shakeel Butt
2025-05-28 0:21 ` Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=a8314889-f036-49ff-9cda-01367ddccf51@intel.com \
--to=yu.c.chen@intel.com \
--cc=akpm@linux-foundation.org \
--cc=aubrey.li@intel.com \
--cc=ayushjai@amd.com \
--cc=cgroups@vger.kernel.org \
--cc=corbet@lwn.net \
--cc=hannes@cmpxchg.org \
--cc=kprateek.nayak@amd.com \
--cc=libo.chen@oracle.com \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@kernel.org \
--cc=mingo@redhat.com \
--cc=mkoutny@suse.com \
--cc=muchun.song@linux.dev \
--cc=peterz@infradead.org \
--cc=roman.gushchin@linux.dev \
--cc=shakeel.butt@linux.dev \
--cc=tim.c.chen@intel.com \
--cc=tj@kernel.org \
--cc=venkat88@linux.ibm.com \
--cc=vineethr@linux.ibm.com \
--cc=yu.chen.surf@foxmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).