From: Johannes Weiner <hannes@cmpxchg.org>
To: Dave Chinner <dgc@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
David Hildenbrand <david@redhat.com>, Zi Yan <ziy@nvidia.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Usama Arif <usama.arif@linux.dev>,
Kiryl Shutsemau <kas@kernel.org>,
Dave Chinner <david@fromorbit.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: switch deferred split shrinker to list_lru
Date: Thu, 12 Mar 2026 10:26:14 -0400
Message-ID: <abLNBhh4cMtUo6RD@cmpxchg.org>
In-Reply-To: <abHrZ7VoRIU71vAH@dread>

On Thu, Mar 12, 2026 at 09:23:35AM +1100, Dave Chinner wrote:
> On Wed, Mar 11, 2026 at 11:43:58AM -0400, Johannes Weiner wrote:
> > The deferred split queue handles cgroups in a suboptimal fashion. The
> > queue is per-NUMA node or per-cgroup, not the intersection. That means
> > on a cgrouped system, a node-restricted allocation entering reclaim
> > can end up splitting large pages on other nodes:
> >
> > 	alloc/unmap
> > 	  deferred_split_folio()
> > 	    list_add_tail(memcg->split_queue)
> > 	    set_shrinker_bit(memcg, node, deferred_shrinker_id)
> >
> > 	for_each_zone_zonelist_nodemask(restricted_nodes)
> > 	  mem_cgroup_iter()
> > 	    shrink_slab(node, memcg)
> > 	      shrink_slab_memcg(node, memcg)
> > 	        if test_shrinker_bit(memcg, node, deferred_shrinker_id)
> > 	          deferred_split_scan()
> > 	            walks memcg->split_queue
> >
> > The shrinker bit adds an imperfect guard rail. As soon as the cgroup
> > has a single large page on the node of interest, all large pages owned
> > by that memcg, including those on other nodes, will be split.
> >
> > list_lru properly sets up per-node, per-cgroup lists. As a bonus, it
> > streamlines a lot of the list operations and reclaim walks. It's used
> > widely by other major shrinkers already. Convert the deferred split
> > queue as well.
> >
> > The list_lru per-memcg heads are instantiated on demand when the first
> > object of interest is allocated for a cgroup, by calling
> > memcg_list_lru_alloc(). Add calls to where splittable pages are
> > created: anon faults, swapin faults, khugepaged collapse.
> >
> > These calls create all possible node heads for the cgroup at once, so
> > the migration code (between nodes) doesn't need any special care.
> >
> > The folio_test_partially_mapped() state is currently protected and
> > serialized wrt LRU state by the deferred split queue lock. To
> > facilitate the transition, add helpers to the list_lru API to allow
> > caller-side locking.
> >
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > ---
> >  include/linux/huge_mm.h    |   6 +-
> >  include/linux/list_lru.h   |  48 ++++++
> >  include/linux/memcontrol.h |   4 -
> >  include/linux/mmzone.h     |  12 --
> >  mm/huge_memory.c           | 326 +++++++++++--------------------------
> >  mm/internal.h              |   2 +-
> >  mm/khugepaged.c            |   7 +
> >  mm/list_lru.c              | 197 ++++++++++++++--------
> >  mm/memcontrol.c            |  12 +-
> >  mm/memory.c                |  52 +++---
> >  mm/mm_init.c               |  14 --
> >  11 files changed, 310 insertions(+), 370 deletions(-)
>
> Can you please split this up into multiple patches (i.e. one logical
> change per patch) to make it easier to review?

No problem, I'll do that and send out a v2.

The list_lru changes started as only the locking functions, then
things kept creeping in...

Thanks