From: "JP Kobryn (Meta)" <jp.kobryn@linux.dev>
To: Barry Song <baohua@kernel.org>, Shakeel Butt <shakeel.butt@linux.dev>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, vbabka@kernel.org,
	mhocko@suse.com, willy@infradead.org, hannes@cmpxchg.org,
	riel@surriel.com, chrisl@kernel.org, kasong@tencent.com,
	shikemeng@huaweicloud.com, nphamcs@gmail.com, bhe@redhat.com,
	youngjun.park@lge.com, qi.zheng@linux.dev,
	axelrasmussen@google.com, yuanchu@google.com, weixugc@google.com,
	linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH] mm/lruvec: preemptively free dead folios during lru_add drain
Date: Fri, 24 Apr 2026 08:38:06 -0700
Message-ID: <a4c8a792-7256-4e4c-9f8e-5539a8f93459@linux.dev>
In-Reply-To: <CAGsJ_4xy+kGktiePL28DP3PAcVsdnz8Noann2UGhD=EV+3xjqA@mail.gmail.com>
On 4/23/26 4:53 PM, Barry Song wrote:
> On Fri, Apr 24, 2026 at 7:46 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
>>
>> On Fri, Apr 24, 2026 at 07:22:30AM +0800, Barry Song wrote:
>>> On Fri, Apr 24, 2026 at 12:43 AM JP Kobryn (Meta) <jp.kobryn@linux.dev> wrote:
>>>>
>>>> Of all observable lruvec lock contention in our fleet, we find that ~24%
>>>> occurs when dead folios are present in lru_add batches at drain time. This
>>>> is wasteful in the sense that the folio is added to the LRU just to be
>>>> immediately removed via folios_put_refs(), incurring two unnecessary lock
>>>> acquisitions.
>>>>
>>>> Eliminate this overhead by preemptively cleaning up dead folios before they
>>>> make it into the LRU. Use folio_ref_freeze() to filter folios whose only
>>>> remaining refcount is the batch ref. When dead folios are found, move them
>>>> off the add batch and onto a temporary batch to be freed.
>>>>
>>>> During A/B testing on one of our prod Instagram workloads (high-frequency
>>>> short-lived requests), the patch intercepted almost all dead folios before
>>>> they entered the LRU. Data collected using the mm_lru_insertion tracepoint
>>>> shows the effectiveness of the patch:
>>>>
>>>> Per-host LRU add averages at 95% CPU load
>>>> (60 hosts each side, 3 x 60s intervals)
>>>>
>>>>             dead folios/min  total folios/min  dead %
>>>> unpatched:        1,297,785        19,341,986  6.7097%
>>>> patched:                 14        19,039,996  0.0001%
>>>>
>>>> Within this workload, we save ~2.6M lock acquisitions per minute per host
>>>> as a result.
>>>>
>>>> System-wide memory stats improved on the patched side also at 95% CPU load:
>>>> - direct reclaim scanning reduced 7%
>>>> - allocation stalls reduced 5.2%
>>>> - compaction stalls reduced 12.3%
>>>> - page frees reduced 4.9%
>>>>
>>>> No regressions were observed in requests served per second or request tail
>>>> latency (p99). Both metrics showed directional improvement at higher CPU
>>>> utilization (comparing 85% to 95%).
>>>>
>>>> Signed-off-by: JP Kobryn (Meta) <jp.kobryn@linux.dev>
>>>
>>> Hi JP,
>>> I’m seeing a large number of "BAD page" bugs.
>>> Not sure if it’s related, but reverting this patch
>>> seems to fix the issue.
>>>
>>> [ 2869.365978] BUG: Bad page state in process uname pfn:3a5417
>>> [ 2869.365981] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x724884c20 pfn:0x3a5417
>>> [ 2869.365983] flags: 0x17ffffc0020908(uptodate|active|owner_2|swapbacked|node=0|zone=2|lastcpupid=0x1fffff)
>>
>> Hi Barry, are you using MGLRU? It seems like MGLRU sets the active flag in
>> folio_add_lru().
>
> Yes. If you are referring to this set_active, I think it is
> incorrect, so I have fixed it here and am waiting for review:
>
> https://lore.kernel.org/linux-mm/20260418120233.7162-1-baohua@kernel.org/
>
>>
>> JP, we need to clear the active flag, but let's check what else can be set
>> before folio_add_lru().
Barry/Shakeel,

We can do something like this as a special case for MGLRU:

diff --git a/mm/swap.c b/mm/swap.c
index 71607b0ce3d18..68ea929f65031 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -185,6 +185,8 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
 		 * deferred split list to avoid a dangling list entry.
 		 */
 		if (is_lru_add && folio_ref_freeze(folio, 1)) {
+			if (lru_gen_enabled())
+				__folio_clear_active(folio);
 			folio_unqueue_deferred_split(folio);
 			fbatch->folios[i] = NULL;
 			folio_batch_add(&free_fbatch, folio);

Unless Barry's patch works out... Any thoughts?
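
To make the intent concrete outside the kernel tree, here is a rough
userspace model of the freeze-and-filter step with the flag clearing
folded in. It is only a sketch: struct model_folio, MODEL_FLAG_ACTIVE,
model_ref_freeze() and model_filter_dead() are hypothetical stand-ins
and not kernel APIs, the gen_enabled argument stands in for
lru_gen_enabled(), and the batch is compacted in place, whereas the
actual patch sets the slot to NULL.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the folio active flag bit. */
#define MODEL_FLAG_ACTIVE (1u << 0)

/* Hypothetical, heavily reduced stand-in for struct folio. */
struct model_folio {
	atomic_int refcount;
	unsigned int flags;
};

/*
 * Models the semantics of a refcount freeze: succeed only when the
 * count equals 'expected', and in that case drop it to zero atomically.
 */
static bool model_ref_freeze(struct model_folio *f, int expected)
{
	return atomic_compare_exchange_strong(&f->refcount, &expected, 0);
}

/*
 * Walk an add batch. Folios whose only remaining reference is the
 * batch's own (refcount == 1) are frozen, have the active flag cleared
 * when gen_enabled is true, and are moved to free_batch. Everything
 * else is kept, compacted toward the front of 'batch'.
 *
 * Returns the number of dead folios; *nr is updated to the kept count.
 */
static size_t model_filter_dead(struct model_folio **batch, size_t *nr,
				struct model_folio **free_batch,
				bool gen_enabled)
{
	size_t kept = 0, dead = 0;

	assert(batch && nr && free_batch);

	for (size_t i = 0; i < *nr; i++) {
		struct model_folio *f = batch[i];

		if (model_ref_freeze(f, 1)) {
			/* A folio about to be freed must not carry the flag. */
			if (gen_enabled)
				f->flags &= ~MODEL_FLAG_ACTIVE;
			free_batch[dead++] = f;
		} else {
			batch[kept++] = f;
		}
	}

	*nr = kept;
	return dead;
}
```

With a batch of three model folios at refcounts 1, 2 and 1, the first
and third land in free_batch with the flag cleared and the second stays
in the add batch untouched. Whether other flags can also be set before
folio_add_lru() is still the open question from above; the model only
covers the active flag.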