From: Chen Ridong <chenridong@huaweicloud.com>
To: Kairui Song <ryncsn@gmail.com>,
Axel Rasmussen <axelrasmussen@google.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
David Hildenbrand <david@kernel.org>,
Michal Hocko <mhocko@kernel.org>,
Qi Zheng <zhengqi.arch@bytedance.com>,
Shakeel Butt <shakeel.butt@linux.dev>,
Lorenzo Stoakes <ljs@kernel.org>, Barry Song <baohua@kernel.org>,
David Stevens <stevensd@google.com>, Leno Hou <lenohou@gmail.com>,
Yafang Shao <laoar.shao@gmail.com>, Yu Zhao <yuzhao@google.com>,
Zicheng Wang <wangzicheng@honor.com>,
Kalesh Singh <kaleshsingh@google.com>,
Suren Baghdasaryan <surenb@google.com>,
Chris Li <chrisl@kernel.org>, Vernon Yang <vernon2gm@gmail.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 4/8] mm/mglru: scan and count the exact number of folios
Date: Tue, 24 Mar 2026 15:22:43 +0800
Message-ID: <0427249d-c6c7-477a-aeff-e007198fcf45@huaweicloud.com>
In-Reply-To: <CAMgjq7CjwdaQk66=61sNLR_21eaPZrFiai2foaTQNbZ3uxQmRw@mail.gmail.com>
On 2026/3/23 0:20, Kairui Song wrote:
> On Sat, Mar 21, 2026 at 4:59 AM Axel Rasmussen <axelrasmussen@google.com> wrote:
>>
>> On Tue, Mar 17, 2026 at 12:11 PM Kairui Song via B4 Relay
>> <devnull+kasong.tencent.com@kernel.org> wrote:
>>>
>>> From: Kairui Song <kasong@tencent.com>
>>>
>>> Make the scan helpers return the exact number of folios being scanned
>>> or isolated. This should make the scan more accurate and easier to
>>> follow.
>>>
>>> Now there is no more need for special handling when there is no
>>> progress made. The old livelock prevention `(return isolated ||
>>> !remaining ? scanned : 0)` is replaced by the natural scan budget
>>> exhaustion in try_to_shrink_lruvec, and sort_folio moves ineligible
>>> folios to newer generations.
>>>
>>> Signed-off-by: Kairui Song <kasong@tencent.com>
>>> ---
>>> mm/vmscan.c | 27 +++++++++++----------------
>>> 1 file changed, 11 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index ed5b5f8dd3c7..4f4548ff3a17 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -4680,7 +4680,7 @@ static bool isolate_folio(struct lruvec *lruvec, struct folio *folio, struct sca
>>>
>>> static int scan_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
>>> struct scan_control *sc, int type, int tier,
>>> - struct list_head *list)
>>> + struct list_head *list, int *isolatedp)
>>> {
>>> int i;
>>> int gen;
>>> @@ -4750,11 +4750,9 @@ static int scan_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
>>> type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON);
>>> if (type == LRU_GEN_FILE)
>>> sc->nr.file_taken += isolated;
>>> - /*
>>> - * There might not be eligible folios due to reclaim_idx. Check the
>>> - * remaining to prevent livelock if it's not making progress.
>>> - */
>>> - return isolated || !remaining ? scanned : 0;
>>> +
>>> + *isolatedp = isolated;
>>> + return scanned;
>>> }
>>>
>>> static int get_tier_idx(struct lruvec *lruvec, int type)
>>> @@ -4819,23 +4817,24 @@ static int isolate_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
>>> int *type_scanned, struct list_head *list)
>>> {
>>> int i;
>>> + int scanned = 0;
>>> + int isolated = 0;
>>> int type = get_type_to_scan(lruvec, swappiness);
>>>
>>> for_each_evictable_type(i, swappiness) {
>>> - int scanned;
>>> int tier = get_tier_idx(lruvec, type);
>>>
>>> *type_scanned = type;
>>
>> I think this is problematic, now `isolate_folios` can scan a nonzero
>> amount of > 1 type of memory. Then the caller (`evict_folios`) calls
>> `trace_mm_vmscan_lru_shrink_inactive` with the total scanned amount,
>> with only the last type we scanned (misattributing part of the scan,
>> potentially). Not a "functional" issue, but it could mean confusing
>> data for anyone watching the tracepoint.
>
> Thanks! Nice catch, I'll introduce another variable for the tracepoint
> then it should be fine.
>
>>
>>
>>>
>>> - scanned = scan_folios(nr_to_scan, lruvec, sc,
>>> - type, tier, list);
>>> - if (scanned)
>>> + scanned += scan_folios(nr_to_scan, lruvec, sc,
>>> + type, tier, list, &isolated);
>>> + if (isolated)
>>> return scanned;
>>>
>>> type = !type;
>>> }
>>>
>>> - return 0;
>>> + return scanned;
>>> }
>>>
>>> static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
>>> @@ -4852,7 +4851,6 @@ static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
>>> struct reclaim_stat stat;
>>> struct lru_gen_mm_walk *walk;
>>> bool skip_retry = false;
>>> - struct lru_gen_folio *lrugen = &lruvec->lrugen;
>>> struct mem_cgroup *memcg = lruvec_memcg(lruvec);
>>> struct pglist_data *pgdat = lruvec_pgdat(lruvec);
>>>
>>> @@ -4860,10 +4858,7 @@ static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
>>>
>>> scanned = isolate_folios(nr_to_scan, lruvec, sc, swappiness, &type, &list);
>>>
>>> - scanned += try_to_inc_min_seq(lruvec, swappiness);
>>> -
>>> - if (evictable_min_seq(lrugen->min_seq, swappiness) + MIN_NR_GENS > lrugen->max_seq)
>>> - scanned = 0;
>>> + try_to_inc_min_seq(lruvec, swappiness);
>>
>> IIUC, this change is what introduces the issue patch 6 is trying to
>> resolve. Is it worth squashing patch 6 in to this one, so we don't
>> have this non-ideal intermediate state?
>
> Well it's not, patch 6 is fixing an existing problem, see the cover
> letter about the OOM issue.
>
> This part of the change just cleans up the loop code. It looks really
> strange to me that increasing min_seq is counted as scanning one
> folio. Aborting the scan if there are only 2 gens kind of makes sense,
> but this doesn't seem like the right place. These strange parts to avoid
> livelock can be dropped since we have an exact count of folios being
> scanned now. I'll add more words in the commit message.

This change confused me too.

IIUC, it looks conceptually tied to patch 3. The change below means that
evict_folios() should not be invoked when aging is needed, so the check can be
dropped there, right?

```
static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
{
	...
+		if (should_run_aging(lruvec, max_seq, sc, swappiness)) {
+			if (try_to_inc_max_seq(lruvec, max_seq, swappiness, false))
+				need_rotate = true;
+			break;
+		}
```
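
To make the question concrete, here is a rough user-space sketch of the ordering I have in mind. It is not kernel code: every toy_* name is hypothetical, and it assumes (as a simplification) that aging is needed exactly when min_seq + MIN_NR_GENS > max_seq, which is narrower than what the real should_run_aging() checks. Under that assumption, eviction is never entered with too few generations, because the loop bails out to aging first.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins only; names echo the kernel ones but nothing here is kernel code. */
#define MIN_NR_GENS 2

struct toy_lruvec {
	unsigned long min_seq;
	unsigned long max_seq;
	int evict_calls;          /* how many times eviction ran */
	int evict_calls_few_gens; /* eviction entered with too few generations */
};

/* Assumption: aging is needed only when there are too few generations. */
static bool toy_should_run_aging(const struct toy_lruvec *v)
{
	return v->min_seq + MIN_NR_GENS > v->max_seq;
}

static void toy_evict_folios(struct toy_lruvec *v)
{
	v->evict_calls++;
	if (v->min_seq + MIN_NR_GENS > v->max_seq)
		v->evict_calls_few_gens++; /* the case the removed check guarded */
	else
		v->min_seq++;              /* pretend the oldest generation was drained */
}

/* Returns true when the loop stopped to run aging (need_rotate in the patch). */
static bool toy_try_to_shrink_lruvec(struct toy_lruvec *v, int budget)
{
	while (budget-- > 0) {
		if (toy_should_run_aging(v)) {
			v->max_seq++;      /* stands in for try_to_inc_max_seq() */
			return true;
		}
		toy_evict_folios(v);
	}
	return false;
}

/* Runs one shrink pass and reports how often eviction saw too few generations. */
int toy_demo(void)
{
	struct toy_lruvec v = { .min_seq = 0, .max_seq = 3 };
	bool aged = toy_try_to_shrink_lruvec(&v, 10);

	assert(aged);
	assert(v.evict_calls == 2);
	return v.evict_calls_few_gens;
}
```

In this model the in-eviction check never fires, which is why I think it can go away once the aging check sits at the top of the loop. If the real should_run_aging() can return false while only MIN_NR_GENS generations remain, the sketch does not apply.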
--
Best regards,
Ridong