From: Qi Zheng <qi.zheng@linux.dev>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: hannes@cmpxchg.org, hughd@google.com, mhocko@suse.com,
roman.gushchin@linux.dev, muchun.song@linux.dev,
david@kernel.org, lorenzo.stoakes@oracle.com, ziy@nvidia.com,
harry.yoo@oracle.com, yosry.ahmed@linux.dev,
imran.f.khan@oracle.com, kamalesh.babulal@oracle.com,
axelrasmussen@google.com, yuanchu@google.com, weixugc@google.com,
chenridong@huaweicloud.com, mkoutny@suse.com,
akpm@linux-foundation.org, hamzamahfooz@linux.microsoft.com,
apais@linux.microsoft.com, lance.yang@linux.dev,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
cgroups@vger.kernel.org, Qi Zheng <zhengqi.arch@bytedance.com>
Subject: Re: [PATCH v3 26/30] mm: vmscan: prepare for reparenting MGLRU folios
Date: Mon, 19 Jan 2026 11:39:37 +0800
Message-ID: <7cd937c3-148c-43ed-ac51-18d90cb9a5cc@linux.dev>
In-Reply-To: <mhxt3oheta2enk562jt25qdac2tkq6ulc3olxairfow6tb5qpl@n5dhssfcyhea>
On 1/18/26 11:29 AM, Shakeel Butt wrote:
> I asked AI to review this patch and it found the following potential issues:
Thanks.
>
> Author: Qi Zheng <zhengqi.arch@bytedance.com>
>
> mm: vmscan: prepare for reparenting MGLRU folios
>
> This commit adds infrastructure for reparenting MGLRU folios from dying
> memcg to parent memcg, including functions to ensure parent has MAX_NR_GENS
> generations before reparenting.
>
> Link: https://lkml.kernel.org/r/92e0728fed3d68855173352416cf8077670610f0.1768389889.git.zhengqi.arch@bytedance.com
>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index f1a45c94c2fc..284292653d14 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>
> [ ... ]
>
>> +static void try_to_inc_max_seq_nowalk(struct mem_cgroup *memcg,
>> +				      struct lruvec *lruvec)
>> +{
>> +	struct lru_gen_mm_list *mm_list = get_mm_list(memcg);
>> +	struct lru_gen_mm_state *mm_state = get_mm_state(lruvec);
>> +	int swappiness = mem_cgroup_swappiness(memcg);
>> +	DEFINE_MAX_SEQ(lruvec);
>> +	bool success = false;
>> +
>> +	/*
>> +	 * We are not iterating the mm_list here, updating mm_state->seq is just
>> +	 * to make mm walkers work properly.
>> +	 */
>> +	if (mm_state) {
>> +		spin_lock(&mm_list->lock);
>> +		VM_WARN_ON_ONCE(mm_state->seq + 1 < max_seq);
>> +		if (max_seq > mm_state->seq) {
>               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> When mm_state->seq equals max_seq (which can happen when mm walkers have
> completed their iteration), this condition is false, success remains false,
> and inc_max_seq() is never called.
I expect that once the mm walkers complete their iterations, the number of
generations will have increased, so this won't cause an infinite loop.
>
> The caller max_lru_gen_memcg() loops with:
>
>     while (get_nr_gens(lruvec, type) < MAX_NR_GENS) {
>         try_to_inc_max_seq_nowalk(memcg, lruvec);
>         cond_resched();
>     }
>
> If try_to_inc_max_seq_nowalk() fails to increment max_seq, get_nr_gens()
> returns the same value, and the while loop continues indefinitely.
>
> Can this cause an infinite loop during memcg offline when mm_state->seq has
> caught up to max_seq?
>
> Suggested fix: Remove the success variable and always call inc_max_seq().
> The mm_state->seq update is only for coordination with mm walkers - it should
> not gate whether we increment the generation. inc_max_seq() has its own
> staleness check (seq < READ_ONCE(lrugen->max_seq)) so calling it unconditionally
> is safe:
>
>     if (mm_state) {
>         spin_lock(&mm_list->lock);
>         VM_WARN_ON_ONCE(mm_state->seq + 1 < max_seq);
>         if (max_seq > mm_state->seq)
>             WRITE_ONCE(mm_state->seq, mm_state->seq + 1);
>         spin_unlock(&mm_list->lock);
>     }
>
>     inc_max_seq(lruvec, max_seq, swappiness);
>
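To make the concern concrete, here is a toy userspace model of the two
sequence counters. It is only a sketch, not kernel code: struct seq_model,
step_gated(), step_unconditional() and run_model() are names made up for
illustration, inc_max_seq() is reduced to a plain increment, locking is
omitted, and no concurrent mm walker is modelled. It therefore only shows
what the loop does when nothing else advances max_seq.

```c
#include <stdbool.h>

/* Mirrors the kernel constant; everything else here is hypothetical. */
#define MAX_NR_GENS 4

/* Toy stand-in for the counters involved in the discussion. */
struct seq_model {
	unsigned long max_seq;	/* models lrugen->max_seq */
	unsigned long min_seq;	/* models lrugen->min_seq[type] */
	unsigned long mm_seq;	/* models mm_state->seq */
};

/* Number of generations in the model: max_seq - min_seq + 1. */
static int model_nr_gens(const struct seq_model *m)
{
	return (int)(m->max_seq - m->min_seq + 1);
}

/* Variant from the patch: only bump max_seq if mm_seq was behind. */
static void step_gated(struct seq_model *m)
{
	bool success = false;

	if (m->max_seq > m->mm_seq) {
		m->mm_seq++;
		success = true;
	}
	if (success)
		m->max_seq++;	/* stands in for inc_max_seq() */
}

/* Variant from the suggested fix: always bump max_seq. */
static void step_unconditional(struct seq_model *m)
{
	if (m->max_seq > m->mm_seq)
		m->mm_seq++;
	m->max_seq++;		/* stands in for inc_max_seq() */
}

/*
 * Models the caller's while loop, with an iteration cap so that a
 * stalled variant returns instead of spinning. Works on a copy of the
 * model and returns the generation count reached.
 */
static int run_model(struct seq_model m, void (*step)(struct seq_model *),
		     int max_iters)
{
	int i;

	for (i = 0; i < max_iters && model_nr_gens(&m) < MAX_NR_GENS; i++)
		step(&m);
	return model_nr_gens(&m);
}
```

Starting from mm_seq == max_seq, run_model() with step_gated() leaves the
generation count unchanged until the cap is hit, while step_unconditional()
reaches MAX_NR_GENS. Starting from mm_seq == max_seq - 1, both variants
reach MAX_NR_GENS. Whether the first state can persist in the kernel depends
on a walker finishing and advancing max_seq, which this model leaves out.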
>> +			WRITE_ONCE(mm_state->seq, mm_state->seq + 1);
>> +			success = true;
>> +		}
>> +		spin_unlock(&mm_list->lock);
>> +	} else {
>> +		success = true;
>> +	}
>> +
>> +	if (success)
>> +		inc_max_seq(lruvec, max_seq, swappiness);
>> +}
>
> [ ... ]
>
>> +static void __lru_gen_reparent_memcg(struct lruvec *child_lruvec, struct lruvec *parent_lruvec,
>> +				     int zone, int type)
>> +{
>> +	struct lru_gen_folio *child_lrugen, *parent_lrugen;
>> +	enum lru_list lru = type * LRU_INACTIVE_FILE;
>> +	int i;
>
> [ ... ]
>
>> +		WRITE_ONCE(child_lrugen->nr_pages[gen][type][zone], 0);
>> +		WRITE_ONCE(parent_lrugen->nr_pages[gen][type][zone],
>> +			   parent_lrugen->nr_pages[gen][type][zone] + nr_pages);
>> +
>> +		update_lru_size(parent_lruvec, lru + dst_lru_active, zone, nr_pages);
>               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> This update_lru_size() call updates both __update_lru_size() counters and
> mem_cgroup_update_lru_size(). However, lru_gen_reparent_memcg() also calls
> mem_cgroup_update_lru_size() for LRU_UNEVICTABLE. The combination appears
> to cause incorrect accounting.
>
> Note: This issue was fixed in a later commit ("mm: mglru: do not call
> update_lru_size() during reparenting").
Right.
>
>> +	}
>> +}
>