From: "Harry Yoo (Oracle)" <harry@kernel.org>
To: Qi Zheng <qi.zheng@linux.dev>
Cc: hannes@cmpxchg.org, hughd@google.com, mhocko@suse.com,
roman.gushchin@linux.dev, shakeel.butt@linux.dev,
muchun.song@linux.dev, david@kernel.org,
lorenzo.stoakes@oracle.com, ziy@nvidia.com, harry.yoo@oracle.com,
yosry.ahmed@linux.dev, imran.f.khan@oracle.com,
kamalesh.babulal@oracle.com, axelrasmussen@google.com,
yuanchu@google.com, weixugc@google.com,
chenridong@huaweicloud.com, mkoutny@suse.com,
akpm@linux-foundation.org, hamzamahfooz@linux.microsoft.com,
apais@linux.microsoft.com, lance.yang@linux.dev, bhe@redhat.com,
usamaarif642@gmail.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
Qi Zheng <zhengqi.arch@bytedance.com>,
Yosry Ahmed <yosry@kernel.org>
Subject: Re: [PATCH v6 30/33] mm: memcontrol: prepare for reparenting non-hierarchical stats
Date: Mon, 23 Mar 2026 16:53:12 +0900
Message-ID: <acDxaEgnqPI-Z4be@hyeyoo>
In-Reply-To: <e862995c45a7101a541284b6ebee5e5c32c89066.1772711148.git.zhengqi.arch@bytedance.com>
On Thu, Mar 05, 2026 at 07:52:48PM +0800, Qi Zheng wrote:
> From: Qi Zheng <zhengqi.arch@bytedance.com>
>
> To resolve the dying memcg issue, we need to reparent LRU folios of child
> memcg to its parent memcg. This could cause problems for non-hierarchical
> stats.
>
> As Yosry Ahmed pointed out:
>
> ```
> In short, if memory is charged to a dying cgroup at the time of
> reparenting, when the memory gets uncharged the stats updates will occur
> at the parent. This will update both hierarchical and non-hierarchical
> stats of the parent, which would corrupt the parent's non-hierarchical
> stats (because those counters were never incremented when the memory was
> charged).
> ```
>
> Now we have the following two types of non-hierarchical stats, and they
> are only used in CONFIG_MEMCG_V1:
>
> a. memcg->vmstats->state_local[i]
> b. pn->lruvec_stats->state_local[i]
>
> To ensure that these non-hierarchical stats work properly, we need to
> reparent these non-hierarchical stats after reparenting LRU folios. To
> this end, this commit makes the following preparations:
>
> 1. implement reparent_state_local() to reparent non-hierarchical stats
> 2. make css_killed_work_fn() to be called in rcu work, and implement
> get_non_dying_memcg_start() and get_non_dying_memcg_end() to avoid race
> between mod_memcg_state()/mod_memcg_lruvec_state()
> and reparent_state_local()
>
> Co-developed-by: Yosry Ahmed <yosry@kernel.org>
> Signed-off-by: Yosry Ahmed <yosry@kernel.org>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
> ---
> kernel/cgroup/cgroup.c | 9 ++--
> mm/memcontrol-v1.c | 16 +++++++
> mm/memcontrol-v1.h | 7 +++
> mm/memcontrol.c | 97 ++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 125 insertions(+), 4 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 23b70bd80ddc9..b0519a16f5684 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -473,6 +501,30 @@ unsigned long lruvec_page_state_local(struct lruvec *lruvec,
> return x;
> }
>
> +#ifdef CONFIG_MEMCG_V1
> +static void __mod_memcg_lruvec_state(struct mem_cgroup_per_node *pn,
> + enum node_stat_item idx, int val);
> +
> +void reparent_memcg_lruvec_state_local(struct mem_cgroup *memcg,
> + struct mem_cgroup *parent, int idx)
> +{
> + int nid;
> +
> + for_each_node(nid) {
> + struct lruvec *child_lruvec = mem_cgroup_lruvec(memcg, NODE_DATA(nid));
> + struct lruvec *parent_lruvec = mem_cgroup_lruvec(parent, NODE_DATA(nid));
> + unsigned long value = lruvec_page_state_local(child_lruvec, idx);
> + struct mem_cgroup_per_node *child_pn, *parent_pn;
> +
> + child_pn = container_of(child_lruvec, struct mem_cgroup_per_node, lruvec);
> + parent_pn = container_of(parent_lruvec, struct mem_cgroup_per_node, lruvec);
> +
> + __mod_memcg_lruvec_state(child_pn, idx, -value);
> + __mod_memcg_lruvec_state(parent_pn, idx, value);
We should probably change the type of `@val` from int to long to avoid
losing non-hierarchical stats during reparenting?
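To illustrate the concern, here is a minimal userspace sketch, not kernel
code. `struct counter`, `mod_state_int()`, `mod_state_long()` and the two
`reparent_*()` helpers are hypothetical stand-ins for the real functions,
and the sketch assumes an LP64 target where out-of-range integer
conversions wrap (as they do with GCC and Clang):

```c
#include <assert.h>
#include <stdio.h>

/* Sketch only: assumes a 64-bit long, as on LP64 Linux targets. */
_Static_assert(sizeof(long) == 8, "sketch assumes LP64");

/* Hypothetical stand-in for one non-hierarchical (local) counter. */
struct counter {
	long state_local;
};

/* Mirrors the quoted helpers: the delta is passed as an int. */
static void mod_state_int(struct counter *c, int val)
{
	c->state_local += val;
}

/* Variant with a long delta, as suggested in the review comment. */
static void mod_state_long(struct counter *c, long val)
{
	c->state_local += val;
}

/* Same shape as the quoted reparenting code, with the int delta. */
static void reparent_int(struct counter *child, struct counter *parent)
{
	unsigned long value = (unsigned long)child->state_local;

	/* The casts spell out the conversion the int parameter implies. */
	mod_state_int(child, (int)-value);
	mod_state_int(parent, (int)value);
}

/* Same transfer, but the delta keeps its full width. */
static void reparent_long(struct counter *child, struct counter *parent)
{
	unsigned long value = (unsigned long)child->state_local;

	mod_state_long(child, (long)-value);
	mod_state_long(parent, (long)value);
}

/* Prints both counters so the two variants can be compared by eye. */
static void show(const char *tag, const struct counter *child,
		 const struct counter *parent)
{
	printf("%s: child=%ld parent=%ld\n", tag,
	       child->state_local, parent->state_local);
}
```

With a child counter of `1L << 32`, the low 32 bits of both `value` and
`-value` are zero, so under the stated assumptions `reparent_int()` moves
nothing to the parent, while `reparent_long()` transfers the full amount.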
> #ifdef CONFIG_MEMCG_V1
> static void __mod_memcg_state(struct mem_cgroup *memcg,
> enum memcg_stat_item idx, int val)
>
> @@ -769,6 +857,15 @@ unsigned long memcg_page_state_local(struct mem_cgroup *memcg, int idx)
> #endif
> return x;
> }
> +
> +void reparent_memcg_state_local(struct mem_cgroup *memcg,
> + struct mem_cgroup *parent, int idx)
> +{
> + unsigned long value = memcg_page_state_local(memcg, idx);
> +
> + __mod_memcg_state(memcg, idx, -value);
> + __mod_memcg_state(parent, idx, value);
> +}
Same here.
Otherwise LGTM.
> #endif
>
> static void __mod_memcg_lruvec_state(struct mem_cgroup_per_node *pn,
--
Cheers,
Harry / Hyeonggon