From: "Michael S. Tsirkin" <mst@redhat.com>
To: JP Kobryn <inwardvessel@gmail.com>
Cc: linux-mm@kvack.org, apopple@nvidia.com,
akpm@linux-foundation.org, axelrasmussen@google.com,
byungchul@sk.com, cgroups@vger.kernel.org, david@kernel.org,
eperezma@redhat.com, gourry@gourry.net, jasowang@redhat.com,
hannes@cmpxchg.org, joshua.hahnjy@gmail.com,
Liam.Howlett@oracle.com, linux-kernel@vger.kernel.org,
lorenzo.stoakes@oracle.com, matthew.brost@intel.com,
mhocko@suse.com, rppt@kernel.org, muchun.song@linux.dev,
zhengqi.arch@bytedance.com, rakie.kim@sk.com,
roman.gushchin@linux.dev, shakeel.butt@linux.dev,
surenb@google.com, virtualization@lists.linux.dev,
vbabka@suse.cz, weixugc@google.com, xuanzhuo@linux.alibaba.com,
ying.huang@linux.alibaba.com, yuanchu@google.com, ziy@nvidia.com,
kernel-team@meta.com
Subject: Re: [PATCH 2/2] mm: move pgscan and pgsteal to node stats
Date: Thu, 12 Feb 2026 02:08:24 -0500
Message-ID: <20260212020724-mutt-send-email-mst@kernel.org>
In-Reply-To: <20260212045109.255391-3-inwardvessel@gmail.com>
On Wed, Feb 11, 2026 at 08:51:09PM -0800, JP Kobryn wrote:
> It would be useful to narrow down reclaim to specific nodes.
>
> Provide per-node reclaim visibility by changing the pgscan and pgsteal
> stats from global vm_event_item's to node_stat_item's. Note this change has
> the side effect of now tracking these stats on a per-memcg basis.
>
> Signed-off-by: JP Kobryn <inwardvessel@gmail.com>
> Suggested-by: Johannes Weiner <hannes@cmpxchg.org>
virtio_balloon changes
> ---
> drivers/virtio/virtio_balloon.c | 8 ++++----
> include/linux/mmzone.h | 12 +++++++++++
> include/linux/vm_event_item.h | 12 -----------
> mm/memcontrol.c | 36 ++++++++++++++++++---------------
> mm/vmscan.c | 32 +++++++++++------------------
> mm/vmstat.c | 24 +++++++++++-----------
> 6 files changed, 60 insertions(+), 64 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 74fe59f5a78c..1341d9d1a2a1 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -374,13 +374,13 @@ static inline unsigned int update_balloon_vm_stats(struct virtio_balloon *vb)
> update_stat(vb, idx++, VIRTIO_BALLOON_S_ALLOC_STALL, stall);
>
> update_stat(vb, idx++, VIRTIO_BALLOON_S_ASYNC_SCAN,
> - pages_to_bytes(events[PGSCAN_KSWAPD]));
> + pages_to_bytes(global_node_page_state(PGSCAN_KSWAPD)));
> update_stat(vb, idx++, VIRTIO_BALLOON_S_DIRECT_SCAN,
> - pages_to_bytes(events[PGSCAN_DIRECT]));
> + pages_to_bytes(global_node_page_state(PGSCAN_DIRECT)));
> update_stat(vb, idx++, VIRTIO_BALLOON_S_ASYNC_RECLAIM,
> - pages_to_bytes(events[PGSTEAL_KSWAPD]));
> + pages_to_bytes(global_node_page_state(PGSTEAL_KSWAPD)));
> update_stat(vb, idx++, VIRTIO_BALLOON_S_DIRECT_RECLAIM,
> - pages_to_bytes(events[PGSTEAL_DIRECT]));
> + pages_to_bytes(global_node_page_state(PGSTEAL_DIRECT)));
>
> #ifdef CONFIG_HUGETLB_PAGE
> update_stat(vb, idx++, VIRTIO_BALLOON_S_HTLB_PGALLOC,
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 762609d5f0af..fc39c107a4b5 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -255,6 +255,18 @@ enum node_stat_item {
> PGDEMOTE_DIRECT,
> PGDEMOTE_KHUGEPAGED,
> PGDEMOTE_PROACTIVE,
> + PGSTEAL_KSWAPD,
> + PGSTEAL_DIRECT,
> + PGSTEAL_KHUGEPAGED,
> + PGSTEAL_PROACTIVE,
> + PGSTEAL_ANON,
> + PGSTEAL_FILE,
> + PGSCAN_KSWAPD,
> + PGSCAN_DIRECT,
> + PGSCAN_KHUGEPAGED,
> + PGSCAN_PROACTIVE,
> + PGSCAN_ANON,
> + PGSCAN_FILE,
> #ifdef CONFIG_NUMA
> PGALLOC_MPOL_DEFAULT,
> PGALLOC_MPOL_PREFERRED,
> diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
> index 92f80b4d69a6..6f1787680658 100644
> --- a/include/linux/vm_event_item.h
> +++ b/include/linux/vm_event_item.h
> @@ -40,19 +40,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
> PGLAZYFREED,
> PGREFILL,
> PGREUSE,
> - PGSTEAL_KSWAPD,
> - PGSTEAL_DIRECT,
> - PGSTEAL_KHUGEPAGED,
> - PGSTEAL_PROACTIVE,
> - PGSCAN_KSWAPD,
> - PGSCAN_DIRECT,
> - PGSCAN_KHUGEPAGED,
> - PGSCAN_PROACTIVE,
> PGSCAN_DIRECT_THROTTLE,
> - PGSCAN_ANON,
> - PGSCAN_FILE,
> - PGSTEAL_ANON,
> - PGSTEAL_FILE,
> #ifdef CONFIG_NUMA
> PGSCAN_ZONE_RECLAIM_SUCCESS,
> PGSCAN_ZONE_RECLAIM_FAILED,
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 86f43b7e5f71..bde0b6536be6 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -328,6 +328,18 @@ static const unsigned int memcg_node_stat_items[] = {
> PGDEMOTE_DIRECT,
> PGDEMOTE_KHUGEPAGED,
> PGDEMOTE_PROACTIVE,
> + PGSTEAL_KSWAPD,
> + PGSTEAL_DIRECT,
> + PGSTEAL_KHUGEPAGED,
> + PGSTEAL_PROACTIVE,
> + PGSTEAL_ANON,
> + PGSTEAL_FILE,
> + PGSCAN_KSWAPD,
> + PGSCAN_DIRECT,
> + PGSCAN_KHUGEPAGED,
> + PGSCAN_PROACTIVE,
> + PGSCAN_ANON,
> + PGSCAN_FILE,
> #ifdef CONFIG_HUGETLB_PAGE
> NR_HUGETLB,
> #endif
> @@ -441,14 +453,6 @@ static const unsigned int memcg_vm_event_stat[] = {
> #endif
> PSWPIN,
> PSWPOUT,
> - PGSCAN_KSWAPD,
> - PGSCAN_DIRECT,
> - PGSCAN_KHUGEPAGED,
> - PGSCAN_PROACTIVE,
> - PGSTEAL_KSWAPD,
> - PGSTEAL_DIRECT,
> - PGSTEAL_KHUGEPAGED,
> - PGSTEAL_PROACTIVE,
> PGFAULT,
> PGMAJFAULT,
> PGREFILL,
> @@ -1496,15 +1500,15 @@ static void memcg_stat_format(struct mem_cgroup *memcg, struct seq_buf *s)
>
> /* Accumulated memory events */
> seq_buf_printf(s, "pgscan %lu\n",
> - memcg_events(memcg, PGSCAN_KSWAPD) +
> - memcg_events(memcg, PGSCAN_DIRECT) +
> - memcg_events(memcg, PGSCAN_PROACTIVE) +
> - memcg_events(memcg, PGSCAN_KHUGEPAGED));
> + memcg_page_state(memcg, PGSCAN_KSWAPD) +
> + memcg_page_state(memcg, PGSCAN_DIRECT) +
> + memcg_page_state(memcg, PGSCAN_PROACTIVE) +
> + memcg_page_state(memcg, PGSCAN_KHUGEPAGED));
> seq_buf_printf(s, "pgsteal %lu\n",
> - memcg_events(memcg, PGSTEAL_KSWAPD) +
> - memcg_events(memcg, PGSTEAL_DIRECT) +
> - memcg_events(memcg, PGSTEAL_PROACTIVE) +
> - memcg_events(memcg, PGSTEAL_KHUGEPAGED));
> + memcg_page_state(memcg, PGSTEAL_KSWAPD) +
> + memcg_page_state(memcg, PGSTEAL_DIRECT) +
> + memcg_page_state(memcg, PGSTEAL_PROACTIVE) +
> + memcg_page_state(memcg, PGSTEAL_KHUGEPAGED));
>
> for (i = 0; i < ARRAY_SIZE(memcg_vm_event_stat); i++) {
> #ifdef CONFIG_MEMCG_V1
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 614ccf39fe3f..16a0f21e3ea1 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1977,7 +1977,7 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
> unsigned long nr_taken;
> struct reclaim_stat stat;
> bool file = is_file_lru(lru);
> - enum vm_event_item item;
> + enum node_stat_item item;
> struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> bool stalled = false;
>
> @@ -2003,10 +2003,8 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
>
> __mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, nr_taken);
> item = PGSCAN_KSWAPD + reclaimer_offset(sc);
> - if (!cgroup_reclaim(sc))
> - __count_vm_events(item, nr_scanned);
> - count_memcg_events(lruvec_memcg(lruvec), item, nr_scanned);
> - __count_vm_events(PGSCAN_ANON + file, nr_scanned);
> + mod_lruvec_state(lruvec, item, nr_scanned);
> + mod_lruvec_state(lruvec, PGSCAN_ANON + file, nr_scanned);
>
> spin_unlock_irq(&lruvec->lru_lock);
>
> @@ -2023,10 +2021,8 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
> stat.nr_demoted);
> __mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
> item = PGSTEAL_KSWAPD + reclaimer_offset(sc);
> - if (!cgroup_reclaim(sc))
> - __count_vm_events(item, nr_reclaimed);
> - count_memcg_events(lruvec_memcg(lruvec), item, nr_reclaimed);
> - __count_vm_events(PGSTEAL_ANON + file, nr_reclaimed);
> + mod_lruvec_state(lruvec, item, nr_reclaimed);
> + mod_lruvec_state(lruvec, PGSTEAL_ANON + file, nr_reclaimed);
>
> lru_note_cost_unlock_irq(lruvec, file, stat.nr_pageout,
> nr_scanned - nr_reclaimed);
> @@ -4536,7 +4532,7 @@ static int scan_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
> {
> int i;
> int gen;
> - enum vm_event_item item;
> + enum node_stat_item item;
> int sorted = 0;
> int scanned = 0;
> int isolated = 0;
> @@ -4595,13 +4591,11 @@ static int scan_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
> }
>
> item = PGSCAN_KSWAPD + reclaimer_offset(sc);
> - if (!cgroup_reclaim(sc)) {
> - __count_vm_events(item, isolated);
> + if (!cgroup_reclaim(sc))
> __count_vm_events(PGREFILL, sorted);
> - }
> - count_memcg_events(memcg, item, isolated);
> + mod_lruvec_state(lruvec, item, isolated);
> count_memcg_events(memcg, PGREFILL, sorted);
> - __count_vm_events(PGSCAN_ANON + type, isolated);
> + mod_lruvec_state(lruvec, PGSCAN_ANON + type, isolated);
> trace_mm_vmscan_lru_isolate(sc->reclaim_idx, sc->order, scan_batch,
> scanned, skipped, isolated,
> type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
> @@ -4686,7 +4680,7 @@ static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
> LIST_HEAD(clean);
> struct folio *folio;
> struct folio *next;
> - enum vm_event_item item;
> + enum node_stat_item item;
> struct reclaim_stat stat;
> struct lru_gen_mm_walk *walk;
> bool skip_retry = false;
> @@ -4750,10 +4744,8 @@ static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
> stat.nr_demoted);
>
> item = PGSTEAL_KSWAPD + reclaimer_offset(sc);
> - if (!cgroup_reclaim(sc))
> - __count_vm_events(item, reclaimed);
> - count_memcg_events(memcg, item, reclaimed);
> - __count_vm_events(PGSTEAL_ANON + type, reclaimed);
> + mod_lruvec_state(lruvec, item, reclaimed);
> + mod_lruvec_state(lruvec, PGSTEAL_ANON + type, reclaimed);
>
> spin_unlock_irq(&lruvec->lru_lock);
>
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index 74e0ddde1e93..e4b259989d58 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -1291,6 +1291,18 @@ const char * const vmstat_text[] = {
> [I(PGDEMOTE_DIRECT)] = "pgdemote_direct",
> [I(PGDEMOTE_KHUGEPAGED)] = "pgdemote_khugepaged",
> [I(PGDEMOTE_PROACTIVE)] = "pgdemote_proactive",
> + [I(PGSTEAL_KSWAPD)] = "pgsteal_kswapd",
> + [I(PGSTEAL_DIRECT)] = "pgsteal_direct",
> + [I(PGSTEAL_KHUGEPAGED)] = "pgsteal_khugepaged",
> + [I(PGSTEAL_PROACTIVE)] = "pgsteal_proactive",
> + [I(PGSTEAL_ANON)] = "pgsteal_anon",
> + [I(PGSTEAL_FILE)] = "pgsteal_file",
> + [I(PGSCAN_KSWAPD)] = "pgscan_kswapd",
> + [I(PGSCAN_DIRECT)] = "pgscan_direct",
> + [I(PGSCAN_KHUGEPAGED)] = "pgscan_khugepaged",
> + [I(PGSCAN_PROACTIVE)] = "pgscan_proactive",
> + [I(PGSCAN_ANON)] = "pgscan_anon",
> + [I(PGSCAN_FILE)] = "pgscan_file",
> #ifdef CONFIG_NUMA
> [I(PGALLOC_MPOL_DEFAULT)] = "pgalloc_mpol_default",
> [I(PGALLOC_MPOL_PREFERRED)] = "pgalloc_mpol_preferred",
> @@ -1344,19 +1356,7 @@ const char * const vmstat_text[] = {
>
> [I(PGREFILL)] = "pgrefill",
> [I(PGREUSE)] = "pgreuse",
> - [I(PGSTEAL_KSWAPD)] = "pgsteal_kswapd",
> - [I(PGSTEAL_DIRECT)] = "pgsteal_direct",
> - [I(PGSTEAL_KHUGEPAGED)] = "pgsteal_khugepaged",
> - [I(PGSTEAL_PROACTIVE)] = "pgsteal_proactive",
> - [I(PGSCAN_KSWAPD)] = "pgscan_kswapd",
> - [I(PGSCAN_DIRECT)] = "pgscan_direct",
> - [I(PGSCAN_KHUGEPAGED)] = "pgscan_khugepaged",
> - [I(PGSCAN_PROACTIVE)] = "pgscan_proactive",
> [I(PGSCAN_DIRECT_THROTTLE)] = "pgscan_direct_throttle",
> - [I(PGSCAN_ANON)] = "pgscan_anon",
> - [I(PGSCAN_FILE)] = "pgscan_file",
> - [I(PGSTEAL_ANON)] = "pgsteal_anon",
> - [I(PGSTEAL_FILE)] = "pgsteal_file",
>
> #ifdef CONFIG_NUMA
> [I(PGSCAN_ZONE_RECLAIM_SUCCESS)] = "zone_reclaim_success",
> --
> 2.47.3
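
For anyone wanting to check the per-node visibility from userspace, here is a
minimal sketch. It assumes the moved counters show up under their existing
names (pgscan_kswapd, pgsteal_kswapd, ...) in
/sys/devices/system/node/nodeN/vmstat, which is where node_stat_item counters
are normally exported; the patch does not state this explicitly. Both helper
names below are hypothetical and are not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): scan vmstat-style text,
 * one "name value" pair per line, and fetch the value for `name`.
 * Returns 0 on success, -1 if the counter is not present.
 */
int vmstat_lookup(const char *text, const char *name,
		  unsigned long long *out)
{
	char key[64];
	unsigned long long val;
	int used;

	assert(text && name && out);

	/* %n records how many bytes this pair consumed so we can advance. */
	while (sscanf(text, "%63s %llu%n", key, &val, &used) == 2) {
		if (strcmp(key, name) == 0) {
			*out = val;
			return 0;
		}
		text += used;
	}
	return -1;
}

/*
 * Hypothetical helper: read one counter for NUMA node `nid`.
 * Assumption: the counter is exported in the per-node vmstat file.
 */
int read_node_counter(int nid, const char *name,
		      unsigned long long *out)
{
	char path[64];
	char buf[16384];
	size_t n;
	FILE *fp;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/node/node%d/vmstat", nid);
	fp = fopen(path, "r");
	if (!fp)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, fp);
	fclose(fp);
	buf[n] = '\0';

	return vmstat_lookup(buf, name, out);
}
```

Calling read_node_counter(0, "pgscan_kswapd", &v) and comparing against the
same call for node 1 would show which node kswapd has been scanning, which is
the use case the commit message describes.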