From: Shakeel Butt <shakeel.butt@linux.dev>
To: "JP Kobryn (Meta)" <jp.kobryn@linux.dev>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, mhocko@suse.com,
vbabka@suse.cz, apopple@nvidia.com, axelrasmussen@google.com,
byungchul@sk.com, cgroups@vger.kernel.org, david@kernel.org,
eperezma@redhat.com, gourry@gourry.net, jasowang@redhat.com,
hannes@cmpxchg.org, joshua.hahnjy@gmail.com,
Liam.Howlett@oracle.com, linux-kernel@vger.kernel.org,
lorenzo.stoakes@oracle.com, matthew.brost@intel.com,
mst@redhat.com, rppt@kernel.org, muchun.song@linux.dev,
zhengqi.arch@bytedance.com, rakie.kim@sk.com,
roman.gushchin@linux.dev, surenb@google.com,
virtualization@lists.linux.dev, weixugc@google.com,
xuanzhuo@linux.alibaba.com, ying.huang@linux.alibaba.com,
yuanchu@google.com, ziy@nvidia.com, kernel-team@meta.com
Subject: Re: [PATCH v2] mm/mempolicy: track page allocations per mempolicy
Date: Tue, 10 Mar 2026 07:53:17 -0700
Message-ID: <abAmMjkZZLN9LXXM@linux.dev>
In-Reply-To: <dcf2e654-ad2f-4390-9b62-078e664158de@linux.dev>
On Mon, Mar 09, 2026 at 09:17:43PM -0700, JP Kobryn (Meta) wrote:
> On 3/9/26 4:43 PM, Shakeel Butt wrote:
> > On Fri, Mar 06, 2026 at 08:55:20PM -0800, JP Kobryn (Meta) wrote:
[...]
> >
> > This seems like monotonic increasing metrics and I think you don't care about
> > their absolute value but rather rate of change. Any reason this can not be
> > achieved through tracepoints and BPF combination?
>
> We have the per-node reclaim stats (pg{steal,scan,refill}) in
> nodeN/vmstat and memory.numa_stat now. The new stats in this patch would
> be collected from the same source. They were meant to be used together,
> so it seemed like a reasonable location. I think the advantage over
> tracepoints is that the observability is on from the start, and it would
> be simple to extend existing programs that already read stats from the
> cgroup dir files.
Convenience does not really justify the cost of adding 18 counters,
particularly in memcg. We can argue about adding them as system-level
metrics, but not per-memcg.
counter_cost = nr_cpus * nr_nodes * nr_memcg * 16 bytes (the per-counter
footprint in struct lruvec_stats_percpu)
On a typical prod machine we can see thousands of memcgs, hundreds of CPUs,
and a couple of NUMA nodes, so a single counter's cost can range from
~200KiB to several MiB. This does not seem like a cost we should force
everyone to pay.
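To make the arithmetic concrete, here is a back-of-the-envelope sketch; the
machine sizes below are illustrative examples picked to land on the quoted
range, not measurements:

```python
# Rough cost model for a single per-memcg stat counter. The 16 bytes are
# taken from the equation above (presumably the counter plus its
# not-yet-flushed snapshot in struct lruvec_stats_percpu).
BYTES_PER_COUNTER = 16

def counter_cost(nr_cpus, nr_nodes, nr_memcg):
    """Machine-wide memory consumed by one counter, in bytes."""
    return nr_cpus * nr_nodes * nr_memcg * BYTES_PER_COUNTER

# Small end: 64 CPUs, 2 NUMA nodes, 100 memcgs -> 200 KiB per counter.
print(counter_cost(64, 2, 100) // 1024, "KiB")
# Large end: 200 CPUs, 2 nodes, 1000 memcgs -> ~6 MiB per counter.
print(round(counter_cost(200, 2, 1000) / (1024 * 1024), 1), "MiB")
```

Multiply either figure by the 18 proposed counters to get the total overhead
of the patch on such a machine.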
If you really want these per-memcg, and assuming these metrics are updated
in a non-performance-critical path, we can try to decouple them and other
reclaim-related stats from the rstat infra. That would at least reduce the
nr_cpus factor in the above equation to 1, though we would need to actually
evaluate the performance of the change before committing to it.
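As a sketch of the savings from the decoupling suggested above (same cost
model as before; the numbers and the shared-counter scheme are hypothetical,
not a committed design):

```python
BYTES_PER_COUNTER = 16  # per-counter footprint in struct lruvec_stats_percpu

def rstat_cost(nr_cpus, nr_nodes, nr_memcg):
    # Current scheme: every CPU holds its own copy of each counter.
    return nr_cpus * nr_nodes * nr_memcg * BYTES_PER_COUNTER

def decoupled_cost(nr_nodes, nr_memcg):
    # If reclaim stats were kept as shared (e.g. atomic) counters
    # outside the per-cpu rstat caches, the nr_cpus factor drops to 1.
    return nr_nodes * nr_memcg * BYTES_PER_COUNTER

# 200 CPUs, 2 nodes, 1000 memcgs:
print(rstat_cost(200, 2, 1000))   # 6400000 bytes per counter
print(decoupled_cost(2, 1000))    # 32000 bytes per counter, nr_cpus times less
```

The trade-off is that updates then contend on a shared counter instead of
touching a cheap per-cpu slot, which is why the performance evaluation
mentioned above would be needed.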
Thread overview: 30+ messages
2026-03-07 4:55 [PATCH v2] mm/mempolicy: track page allocations per mempolicy JP Kobryn (Meta)
2026-03-07 12:27 ` Huang, Ying
2026-03-08 19:20 ` Gregory Price
2026-03-09 4:11 ` JP Kobryn (Meta)
2026-03-09 4:31 ` JP Kobryn (Meta)
2026-03-11 2:56 ` Huang, Ying
2026-03-11 17:31 ` JP Kobryn (Meta)
2026-03-07 14:32 ` kernel test robot
2026-03-07 19:57 ` kernel test robot
2026-03-08 19:24 ` Usama Arif
2026-03-09 3:30 ` JP Kobryn (Meta)
2026-03-11 18:06 ` Johannes Weiner
2026-03-09 23:35 ` Shakeel Butt
2026-03-09 23:43 ` Shakeel Butt
2026-03-10 4:17 ` JP Kobryn (Meta)
2026-03-10 14:53 ` Shakeel Butt [this message]
2026-03-10 17:01 ` JP Kobryn (Meta)
2026-03-12 13:40 ` Vlastimil Babka (SUSE)
2026-03-12 16:13 ` JP Kobryn (Meta)
2026-03-13 5:07 ` Huang, Ying
2026-03-13 6:14 ` JP Kobryn (Meta)
2026-03-13 7:34 ` Vlastimil Babka (SUSE)
2026-03-13 9:31 ` Huang, Ying
2026-03-13 18:28 ` JP Kobryn (Meta)
2026-03-13 18:09 ` JP Kobryn (Meta)
2026-03-16 2:54 ` Huang, Ying
2026-03-17 4:37 ` JP Kobryn (Meta)
2026-03-17 6:44 ` Huang, Ying
2026-03-17 11:10 ` Vlastimil Babka (SUSE)
2026-03-17 17:55 ` JP Kobryn (Meta)