public inbox for linux-mm@kvack.org
From: Johannes Weiner <hannes@cmpxchg.org>
To: Dave Chinner <dgc@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>, Zi Yan <ziy@nvidia.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Usama Arif <usama.arif@linux.dev>,
	Kiryl Shutsemau <kas@kernel.org>,
	Dave Chinner <david@fromorbit.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: switch deferred split shrinker to list_lru
Date: Thu, 12 Mar 2026 10:26:14 -0400	[thread overview]
Message-ID: <abLNBhh4cMtUo6RD@cmpxchg.org> (raw)
In-Reply-To: <abHrZ7VoRIU71vAH@dread>

On Thu, Mar 12, 2026 at 09:23:35AM +1100, Dave Chinner wrote:
> On Wed, Mar 11, 2026 at 11:43:58AM -0400, Johannes Weiner wrote:
> > The deferred split queue handles cgroups in a suboptimal fashion. The
> > queue is per-NUMA node or per-cgroup, not the intersection. That means
> > on a cgrouped system, a node-restricted allocation entering reclaim
> > can end up splitting large pages on other nodes:
> > 
> > 	alloc/unmap
> > 	  deferred_split_folio()
> > 	    list_add_tail(memcg->split_queue)
> > 	    set_shrinker_bit(memcg, node, deferred_shrinker_id)
> > 
> > 	for_each_zone_zonelist_nodemask(restricted_nodes)
> > 	  mem_cgroup_iter()
> > 	    shrink_slab(node, memcg)
> > 	      shrink_slab_memcg(node, memcg)
> > 	        if test_shrinker_bit(memcg, node, deferred_shrinker_id)
> > 	          deferred_split_scan()
> > 	            walks memcg->split_queue
> > 
> > The shrinker bit is only an imperfect guard rail: as soon as the
> > cgroup has a single large page on the node of interest, all large
> > pages owned by that memcg, including those on other nodes, will be
> > split.
> > 
> > list_lru properly sets up per-node, per-cgroup lists. As a bonus, it
> > streamlines a lot of the list operations and reclaim walks. It's used
> > widely by other major shrinkers already. Convert the deferred split
> > queue as well.
> > 
> > The list_lru per-memcg heads are instantiated on demand when the first
> > object of interest is allocated for a cgroup, by calling
> > memcg_list_lru_alloc(). Add calls to where splittable pages are
> > created: anon faults, swapin faults, khugepaged collapse.
> > 
> > These calls create all possible node heads for the cgroup at once, so
> > the migration code (between nodes) doesn't need any special care.
> > 
> > The folio_test_partially_mapped() state is currently protected and
> > serialized wrt LRU state by the deferred split queue lock. To
> > facilitate the transition, add helpers to the list_lru API to allow
> > caller-side locking.
> > 
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > ---
> >  include/linux/huge_mm.h    |   6 +-
> >  include/linux/list_lru.h   |  48 ++++++
> >  include/linux/memcontrol.h |   4 -
> >  include/linux/mmzone.h     |  12 --
> >  mm/huge_memory.c           | 326 +++++++++++--------------------------
> >  mm/internal.h              |   2 +-
> >  mm/khugepaged.c            |   7 +
> >  mm/list_lru.c              | 197 ++++++++++++++--------
> >  mm/memcontrol.c            |  12 +-
> >  mm/memory.c                |  52 +++---
> >  mm/mm_init.c               |  14 --
> >  11 files changed, 310 insertions(+), 370 deletions(-)
> 
> Can you please split this up into multiple patches (i.e. one logical
> change per patch) to make it easier to review?

No problem, I'll do that and send out a v2.

The list_lru changes started as only the locking functions, then
things kept creeping in...

Thanks



Thread overview: 11+ messages
2026-03-11 15:43 [PATCH] mm: switch deferred split shrinker to list_lru Johannes Weiner
2026-03-11 15:46 ` Johannes Weiner
2026-03-11 15:49   ` David Hildenbrand (Arm)
2026-03-11 17:00 ` Usama Arif
2026-03-11 17:42   ` Johannes Weiner
2026-03-11 19:24     ` Johannes Weiner
2026-03-11 20:09       ` Shakeel Butt
2026-03-11 21:59         ` Yosry Ahmed
2026-03-11 22:23 ` Dave Chinner
2026-03-12 14:26   ` Johannes Weiner [this message]
2026-03-12  9:14 ` [syzbot ci] " syzbot ci
