From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Kairui Song <ryncsn@gmail.com>
Cc: kasong@tencent.com, linux-mm@kvack.org,
Andrew Morton <akpm@linux-foundation.org>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
David Hildenbrand <david@kernel.org>,
Michal Hocko <mhocko@kernel.org>,
Qi Zheng <zhengqi.arch@bytedance.com>,
Shakeel Butt <shakeel.butt@linux.dev>,
Lorenzo Stoakes <ljs@kernel.org>, Barry Song <baohua@kernel.org>,
David Stevens <stevensd@google.com>,
Chen Ridong <chenridong@huaweicloud.com>,
Leno Hou <lenohou@gmail.com>, Yafang Shao <laoar.shao@gmail.com>,
Yu Zhao <yuzhao@google.com>, Zicheng Wang <wangzicheng@honor.com>,
Kalesh Singh <kaleshsingh@google.com>,
Suren Baghdasaryan <surenb@google.com>,
Chris Li <chrisl@kernel.org>, Vernon Yang <vernon2gm@gmail.com>,
linux-kernel@vger.kernel.org, Qi Zheng <qi.zheng@linux.dev>
Subject: Re: [PATCH v2 05/12] mm/mglru: scan and count the exact number of folios
Date: Tue, 31 Mar 2026 17:52:40 +0800
Message-ID: <51d646bf-87f0-4f56-892e-4c62940458a5@linux.alibaba.com>
In-Reply-To: <acuBldipHM5fj_pw@KASONG-MC4>
On 3/31/26 5:01 PM, Kairui Song wrote:
> On Tue, Mar 31, 2026 at 04:04:30PM +0800, Baolin Wang wrote:
>>
>>
>> On 3/29/26 3:52 AM, Kairui Song via B4 Relay wrote:
>>> From: Kairui Song <kasong@tencent.com>
>>>
>>> Make the scan helpers return the exact number of folios being scanned
>>> or isolated. Since the reclaim loop now has a natural scan budget that
>>> controls the scan progress, returning the scan number directly should
>>> make the scan more accurate and easier to follow.
>>>
>>> The number of scanned folios for each iteration is always positive and
>>> larger than 0, unless the reclaim must stop for a forced aging, so
>>> there is no more need for any special handling when there is no
>>> progress made:
>>>
>>> - `return isolated || !remaining ? scanned : 0` in scan_folios: both
>>> the function and the call now just return the exact scan count,
>>> combined with the scan budget introduced in the previous commit to
>>> avoid livelock or under scan.
>>
>> Makes sense to me.
>>
>>>
>>> - `scanned += try_to_inc_min_seq` in evict_folios: adding a bool as a
>>> scan count was kind of confusing and is no longer needed either, as the
>>> scan number will never be zero even if none of the folios in the oldest
>>> generation is isolated.
>>
>> Yes, agree.
>>
>>>
>>> - `evictable_min_seq + MIN_NR_GENS > max_seq` guard in evict_folios:
>>> the per-type get_nr_gens == MIN_NR_GENS check in scan_folios
>>> naturally returns 0 when only two gens remain and breaks the loop.
>>>
>>> Also move try_to_inc_min_seq before isolate_folios, so that any empty
>>> gens created by external folio freeing are also skipped.
>>
>> This part is somewhat confusing. You probably mean the case where the list
>> of that gen becomes empty via isolate_folio(), right?
>>
>> If that's the case, the original logic would remove the empty gens produced
>> by isolate_folio() after calling try_to_inc_min_seq().
>>
>> However, with your changes, this removal won't happen until the next
>> eviction. Does this provide any additional benefits? Or could you describe
>> how this change impacts your testing?
>
> Hi Baolin, thanks for the review.
>
> Yeah, I also noticed this issue after sending this while doing more
> self review.
>
> So I ran some tests with the patch below:
>
> static bool inc_max_seq(struct lruvec *lruvec, unsigned long seq, int swappiness)
> @@ -4818,11 +4814,15 @@ static int evict_folios(unsigned long nr_to_scan, struct lruvec *lruvec,
>
> lruvec_lock_irq(lruvec);
>
> + /* In case folio deletion created empty gen, flush them */
> try_to_inc_min_seq(lruvec, swappiness);
>
> scanned = isolate_folios(nr_to_scan, lruvec, sc, swappiness,
> &list, &isolated, &type, &type_scanned);
>
> + /* Isolation might have created empty gens, flush them */
> + try_to_inc_min_seq(lruvec, swappiness);
> +
> lruvec_unlock_irq(lruvec);
>
> if (list_empty(&list))
>
> The return value of try_to_inc_min_seq can also be dropped
> since it's no longer used, and the function call should be cheap.
>
> System time of a kernel build using 3G of memory and make -j96
> with ZRAM as swap, in seconds, averaged over 12 test runs each:
>
> mm-new:
> 9136.055833
>
> After V2:
> 8819.932222
>
> After V2, with above patch:
> 8783.944444
>
> After V2, without the above patch but with try_to_inc_min_seq moved
> back to after isolate_folios:
> 8807.874444
>
> This series is looking good. This inc_min change seems trivial,
> but in theory it does have a real effect.
>
> - Moving the try_to_inc_min_seq after isolate_folios may result in a
> wasted isolate_folios call and early abort of reclaim loop if there
> is a stalled oldest gen created by folio deletion.
Indeed.
> - Moving the try_to_inc_min_seq before isolate_folios may leave an
> empty gen after isolation. Usually it's fine because the next eviction
> will still reclaim it. But before the next eviction, during that period,
> new file folios could be added to the oldest gen and get reclaimed too
> early. That looks like a real problem.
>
> This may be trivial since MGLRU itself may also suffer from the same
> problem when the oldest gen is just too short, which is a much more
> common case (we can solve this short oldest gen issue later).
>
> - Having try_to_inc_min_seq both before and after isolate_folios
> seems the best choice here and somehow matches the benchmark
> result above, very close to the noise level though.
>
> Well, I only tested one case. The cover letter described a
> larger matrix, still all good with this series, and I'm not
> 100% sure how this particular change affects it; I guess
> it's still trivial.
>
> The try_to_inc_min_seq call should be cheap enough since it's
> called only for one batch of 64 folios, and it's only reading
> a few lists for the non inc path.
>
> What do you think about just calling it twice here?
Sounds reasonable to me.

I'm not sure whether this needs to be split out into a new patch that
carries the explanation above, as this patch mainly focuses on
optimizing the number of folios being scanned.