From: Johannes Weiner <hannes@cmpxchg.org>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Shakeel Butt <shakeel.butt@linux.dev>,
Yosry Ahmed <yosry.ahmed@linux.dev>, Zi Yan <ziy@nvidia.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Usama Arif <usama.arif@linux.dev>,
Kiryl Shutsemau <kas@kernel.org>,
Dave Chinner <david@fromorbit.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 7/7] mm: switch deferred split shrinker to list_lru
Date: Fri, 20 Mar 2026 12:02:38 -0400
Message-ID: <ab1vnhL4XftNdTSu@cmpxchg.org>
In-Reply-To: <e9631520-857f-4f67-8944-6af7a2b47e89@kernel.org>
On Thu, Mar 19, 2026 at 08:21:21AM +0100, David Hildenbrand (Arm) wrote:
> >>> @@ -3802,33 +3706,28 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
> >>> struct folio *new_folio, *next;
> >>> int old_order = folio_order(folio);
> >>> int ret = 0;
> >>> - struct deferred_split *ds_queue;
> >>> + struct list_lru_one *l;
> >>>
> >>> VM_WARN_ON_ONCE(!mapping && end);
> >>> /* Prevent deferred_split_scan() touching ->_refcount */
> >>> - ds_queue = folio_split_queue_lock(folio);
> >>> + rcu_read_lock();
> >>
> >> The RCU lock is for the folio_memcg(), right?
> >>
> >> I recall I raised in the past that some get/put-like logic (that wraps
> >> the rcu_read_lock() + folio_memcg()) might make this a lot easier to get right.
> >>
> >>
> >> memcg = folio_memcg_lookup(folio)
> >>
> >> ... do stuff
> >>
> >> folio_memcg_putback(folio, memcg);
> >>
> >> Or sth like that.
> >>
> >>
> >> Alternatively, you could have some helpers that do the
> >> list_lru_lock+unlock etc.
> >>
> >> folio_memcg_list_lru_lock()
> >> ...
> >> folio_memcg_list_lru_unlock(l);
> >>
> >> Just some thoughts as inspiration :)
> >
> > I remember you raising this in the objcg + reparenting patches. There
> > are a few more instances of
> >
> > rcu_read_lock()
> > foo = folio_memcg()
> > ...
> > rcu_read_unlock()
> >
> > in other parts of the code not touched by these patches here, so the
> > first pattern is a more universal encapsulation.
> >
> > Let me look into this. Would you be okay with a follow-up that covers
> > the others as well?
>
> Of course :) If list_lru lock helpers are the right thing to do, they
> might be better placed in this series.
I'm playing around with the below. But there are a few things that
seem suboptimal:
- We need a local @memcg, which makes call sites that just pass
  folio_memcg() on to another function fatter. More LOC per site on
  average (see the before/after sketch below).
- Despite being more verbose, it communicates less. rcu_read_lock()
is universally understood, folio_memcg_foo() is cryptic.
- It doesn't cover similar accessors with the same lifetime rules,
  like folio_lruvec() or folio_memcg_check(). (A sketch of what
  extending it would look like follows the diff.)
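
To make the first point concrete, here is the before/after shape of a
call site that used to pass folio_memcg() straight into another
function (mirroring the page_io.c and zswap.c hunks below, with the
surrounding code elided):

	/* before: folio_memcg() passed inline, no local needed */
	rcu_read_lock();
	if (!mem_cgroup_zswap_writeback_enabled(folio_memcg(folio)))
		...
	rcu_read_unlock();

	/* after: the helper forces a local @memcg on every such caller */
	struct mem_cgroup *memcg;

	memcg = folio_memcg_begin(folio);
	if (!mem_cgroup_zswap_writeback_enabled(memcg))
		...
	folio_memcg_end();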
include/linux/memcontrol.h | 35 ++++++++++++++++++++++++++---------
mm/huge_memory.c | 34 ++++++++++++++++++----------------
mm/list_lru.c | 5 ++---
mm/memcontrol.c | 21 +++++++++------------
mm/migrate.c | 5 ++---
mm/page_io.c | 14 +++++++-------
mm/vmscan.c | 7 ++++---
mm/workingset.c | 5 ++---
mm/zswap.c | 11 ++++++-----
9 files changed, 76 insertions(+), 61 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 0782c72a1997..5162145b9322 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -430,6 +430,17 @@ static inline struct mem_cgroup *folio_memcg(struct folio *folio)
return objcg ? obj_cgroup_memcg(objcg) : NULL;
}
+static inline struct mem_cgroup *folio_memcg_begin(struct folio *folio)
+{
+ rcu_read_lock();
+ return folio_memcg(folio);
+}
+
+static inline void folio_memcg_end(void)
+{
+ rcu_read_unlock();
+}
+
/*
* folio_memcg_charged - If a folio is charged to a memory cgroup.
* @folio: Pointer to the folio.
@@ -917,11 +928,10 @@ static inline void mod_memcg_page_state(struct page *page,
if (mem_cgroup_disabled())
return;
- rcu_read_lock();
- memcg = folio_memcg(page_folio(page));
+ memcg = folio_memcg_begin(page_folio(page));
if (memcg)
mod_memcg_state(memcg, idx, val);
- rcu_read_unlock();
+ folio_memcg_end();
}
unsigned long memcg_events(struct mem_cgroup *memcg, int event);
@@ -949,10 +959,9 @@ static inline void count_memcg_folio_events(struct folio *folio,
if (!folio_memcg_charged(folio))
return;
- rcu_read_lock();
- memcg = folio_memcg(folio);
+ memcg = folio_memcg_begin(folio);
count_memcg_events(memcg, idx, nr);
- rcu_read_unlock();
+ folio_memcg_end();
}
static inline void count_memcg_events_mm(struct mm_struct *mm,
@@ -1035,6 +1044,15 @@ static inline struct mem_cgroup *folio_memcg(struct folio *folio)
return NULL;
}
+static inline struct mem_cgroup *folio_memcg_begin(struct folio *folio)
+{
+ return NULL;
+}
+
+static inline void folio_memcg_end(void)
+{
+}
+
static inline bool folio_memcg_charged(struct folio *folio)
{
return false;
@@ -1546,11 +1564,10 @@ static inline void mem_cgroup_track_foreign_dirty(struct folio *folio,
if (!folio_memcg_charged(folio))
return;
- rcu_read_lock();
- memcg = folio_memcg(folio);
+ memcg = folio_memcg_begin(folio);
if (unlikely(&memcg->css != wb->memcg_css))
mem_cgroup_track_foreign_dirty_slowpath(folio, wb);
- rcu_read_unlock();
+ folio_memcg_end();
}
void mem_cgroup_flush_foreign(struct bdi_writeback *wb);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e90d08db219d..1aa20c1dd0c1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3769,9 +3769,10 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
/* Prevent deferred_split_scan() touching ->_refcount */
dequeue_deferred = folio_test_anon(folio) && old_order > 1;
if (dequeue_deferred) {
- rcu_read_lock();
- l = list_lru_lock(&deferred_split_lru,
- folio_nid(folio), folio_memcg(folio));
+ struct mem_cgroup *memcg;
+
+ memcg = folio_memcg_begin(folio);
+ l = list_lru_lock(&deferred_split_lru, folio_nid(folio), memcg);
}
if (folio_ref_freeze(folio, folio_cache_ref_count(folio) + 1)) {
struct swap_cluster_info *ci = NULL;
@@ -3786,7 +3787,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
}
list_lru_unlock(l);
- rcu_read_unlock();
+ folio_memcg_end();
}
if (mapping) {
@@ -3891,7 +3892,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
} else {
if (dequeue_deferred) {
list_lru_unlock(l);
- rcu_read_unlock();
+ folio_memcg_end();
}
return -EAGAIN;
}
@@ -4272,12 +4273,13 @@ bool __folio_unqueue_deferred_split(struct folio *folio)
int nid = folio_nid(folio);
unsigned long flags;
bool unqueued = false;
+ struct mem_cgroup *memcg;
WARN_ON_ONCE(folio_ref_count(folio));
WARN_ON_ONCE(!mem_cgroup_disabled() && !folio_memcg_charged(folio));
- rcu_read_lock();
- l = list_lru_lock_irqsave(&deferred_split_lru, nid, folio_memcg(folio), &flags);
+ memcg = folio_memcg_begin(folio);
+ l = list_lru_lock_irqsave(&deferred_split_lru, nid, memcg, &flags);
if (__list_lru_del(&deferred_split_lru, l, &folio->_deferred_list, nid)) {
if (folio_test_partially_mapped(folio)) {
folio_clear_partially_mapped(folio);
@@ -4287,7 +4289,7 @@ bool __folio_unqueue_deferred_split(struct folio *folio)
unqueued = true;
}
list_lru_unlock_irqrestore(l, &flags);
- rcu_read_unlock();
+ folio_memcg_end();
return unqueued; /* useful for debug warnings */
}
@@ -4322,8 +4324,7 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
nid = folio_nid(folio);
- rcu_read_lock();
- memcg = folio_memcg(folio);
+ memcg = folio_memcg_begin(folio);
l = list_lru_lock_irqsave(&deferred_split_lru, nid, memcg, &flags);
if (partially_mapped) {
if (!folio_test_partially_mapped(folio)) {
@@ -4339,7 +4340,7 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
}
__list_lru_add(&deferred_split_lru, l, &folio->_deferred_list, nid, memcg);
list_lru_unlock_irqrestore(l, &flags);
- rcu_read_unlock();
+ folio_memcg_end();
}
static unsigned long deferred_split_count(struct shrinker *shrink,
@@ -4445,16 +4446,17 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
* don't add it back to split_queue.
*/
if (!did_split && folio_test_partially_mapped(folio)) {
- rcu_read_lock();
+ struct mem_cgroup *memcg;
+
+ memcg = folio_memcg_begin(folio);
l = list_lru_lock_irqsave(&deferred_split_lru,
- folio_nid(folio),
- folio_memcg(folio),
+ folio_nid(folio), memcg,
&flags);
__list_lru_add(&deferred_split_lru, l,
&folio->_deferred_list,
- folio_nid(folio), folio_memcg(folio));
+ folio_nid(folio), memcg);
list_lru_unlock_irqrestore(l, &flags);
- rcu_read_unlock();
+ folio_memcg_end();
}
folio_put(folio);
}
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 1ccdd45b1d14..638d084bb0f5 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -604,10 +604,9 @@ int folio_memcg_list_lru_alloc(struct folio *folio, struct list_lru *lru,
return 0;
/* Fast path when list_lru heads already exist */
- rcu_read_lock();
- memcg = folio_memcg(folio);
+ memcg = folio_memcg_begin(folio);
res = memcg_list_lru_allocated(memcg, lru);
- rcu_read_unlock();
+ folio_memcg_end();
if (likely(res))
return 0;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f381cb6bdff1..14732f1542f2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -965,18 +965,17 @@ void lruvec_stat_mod_folio(struct folio *folio, enum node_stat_item idx,
pg_data_t *pgdat = folio_pgdat(folio);
struct lruvec *lruvec;
- rcu_read_lock();
- memcg = folio_memcg(folio);
+ memcg = folio_memcg_begin(folio);
/* Untracked pages have no memcg, no lruvec. Update only the node */
if (!memcg) {
- rcu_read_unlock();
+ folio_memcg_end();
mod_node_page_state(pgdat, idx, val);
return;
}
lruvec = mem_cgroup_lruvec(memcg, pgdat);
mod_lruvec_state(lruvec, idx, val);
- rcu_read_unlock();
+ folio_memcg_end();
}
EXPORT_SYMBOL(lruvec_stat_mod_folio);
@@ -1170,11 +1169,10 @@ struct mem_cgroup *get_mem_cgroup_from_folio(struct folio *folio)
if (!folio_memcg_charged(folio))
return root_mem_cgroup;
- rcu_read_lock();
- do {
- memcg = folio_memcg(folio);
- } while (unlikely(!css_tryget(&memcg->css)));
- rcu_read_unlock();
+ memcg = folio_memcg_begin(folio);
+ while (unlikely(!css_tryget(&memcg->css)))
+ memcg = folio_memcg(folio);
+ folio_memcg_end();
return memcg;
}
@@ -5535,8 +5533,7 @@ bool mem_cgroup_swap_full(struct folio *folio)
if (do_memsw_account() || !folio_memcg_charged(folio))
return ret;
- rcu_read_lock();
- memcg = folio_memcg(folio);
+ memcg = folio_memcg_begin(folio);
for (; !mem_cgroup_is_root(memcg); memcg = parent_mem_cgroup(memcg)) {
unsigned long usage = page_counter_read(&memcg->swap);
@@ -5546,7 +5543,7 @@ bool mem_cgroup_swap_full(struct folio *folio)
break;
}
}
- rcu_read_unlock();
+ folio_memcg_end();
return ret;
}
diff --git a/mm/migrate.c b/mm/migrate.c
index fdbb20163f66..a2d542ebf3ed 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -672,8 +672,7 @@ static int __folio_migrate_mapping(struct address_space *mapping,
struct lruvec *old_lruvec, *new_lruvec;
struct mem_cgroup *memcg;
- rcu_read_lock();
- memcg = folio_memcg(folio);
+ memcg = folio_memcg_begin(folio);
old_lruvec = mem_cgroup_lruvec(memcg, oldzone->zone_pgdat);
new_lruvec = mem_cgroup_lruvec(memcg, newzone->zone_pgdat);
@@ -700,7 +699,7 @@ static int __folio_migrate_mapping(struct address_space *mapping,
mod_lruvec_state(new_lruvec, NR_FILE_DIRTY, nr);
__mod_zone_page_state(newzone, NR_ZONE_WRITE_PENDING, nr);
}
- rcu_read_unlock();
+ folio_memcg_end();
}
local_irq_enable();
diff --git a/mm/page_io.c b/mm/page_io.c
index 63b262f4c5a9..862135a65848 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -239,6 +239,7 @@ static void swap_zeromap_folio_clear(struct folio *folio)
*/
int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug)
{
+ struct mem_cgroup *memcg;
int ret = 0;
if (folio_free_swap(folio))
@@ -277,13 +278,13 @@ int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug)
goto out_unlock;
}
- rcu_read_lock();
- if (!mem_cgroup_zswap_writeback_enabled(folio_memcg(folio))) {
+ memcg = folio_memcg_begin(folio);
+ if (!mem_cgroup_zswap_writeback_enabled(memcg)) {
- rcu_read_unlock();
+ folio_memcg_end();
folio_mark_dirty(folio);
return AOP_WRITEPAGE_ACTIVATE;
}
- rcu_read_unlock();
+ folio_memcg_end();
__swap_writepage(folio, swap_plug);
return 0;
@@ -314,11 +315,10 @@ static void bio_associate_blkg_from_page(struct bio *bio, struct folio *folio)
if (!folio_memcg_charged(folio))
return;
- rcu_read_lock();
- memcg = folio_memcg(folio);
+ memcg = folio_memcg_begin(folio);
css = cgroup_e_css(memcg->css.cgroup, &io_cgrp_subsys);
bio_associate_blkg_from_css(bio, css);
- rcu_read_unlock();
+ folio_memcg_end();
}
#else
#define bio_associate_blkg_from_page(bio, folio) do { } while (0)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 33287ba4a500..12ad40fa7d60 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3407,6 +3407,7 @@ static struct folio *get_pfn_folio(unsigned long pfn, struct mem_cgroup *memcg,
struct pglist_data *pgdat)
{
struct folio *folio = pfn_folio(pfn);
+ struct mem_cgroup *this_memcg;
if (folio_lru_gen(folio) < 0)
return NULL;
@@ -3414,10 +3415,10 @@ static struct folio *get_pfn_folio(unsigned long pfn, struct mem_cgroup *memcg,
if (folio_nid(folio) != pgdat->node_id)
return NULL;
- rcu_read_lock();
- if (folio_memcg(folio) != memcg)
+ this_memcg = folio_memcg_begin(folio);
+ if (this_memcg != memcg)
folio = NULL;
- rcu_read_unlock();
+ folio_memcg_end();
return folio;
}
diff --git a/mm/workingset.c b/mm/workingset.c
index 07e6836d0502..77bfec58b797 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -251,8 +251,7 @@ static void *lru_gen_eviction(struct folio *folio)
BUILD_BUG_ON(LRU_GEN_WIDTH + LRU_REFS_WIDTH >
BITS_PER_LONG - max(EVICTION_SHIFT, EVICTION_SHIFT_ANON));
- rcu_read_lock();
- memcg = folio_memcg(folio);
+ memcg = folio_memcg_begin(folio);
lruvec = mem_cgroup_lruvec(memcg, pgdat);
lrugen = &lruvec->lrugen;
min_seq = READ_ONCE(lrugen->min_seq[type]);
@@ -261,7 +260,7 @@ static void *lru_gen_eviction(struct folio *folio)
hist = lru_hist_from_seq(min_seq);
atomic_long_add(delta, &lrugen->evicted[hist][type][tier]);
memcg_id = mem_cgroup_private_id(memcg);
- rcu_read_unlock();
+ folio_memcg_end();
return pack_shadow(memcg_id, pgdat, token, workingset, type);
}
diff --git a/mm/zswap.c b/mm/zswap.c
index 4f2e652e8ad3..fb035dd70d8b 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -895,14 +895,15 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
* to the active LRU list in the case.
*/
if (comp_ret || !dlen || dlen >= PAGE_SIZE) {
- rcu_read_lock();
- if (!mem_cgroup_zswap_writeback_enabled(
- folio_memcg(page_folio(page)))) {
- rcu_read_unlock();
+ struct mem_cgroup *memcg;
+
+ memcg = folio_memcg_begin(page_folio(page));
+ if (!mem_cgroup_zswap_writeback_enabled(memcg)) {
+ folio_memcg_end();
comp_ret = comp_ret ? comp_ret : -EINVAL;
goto unlock;
}
- rcu_read_unlock();
+ folio_memcg_end();
comp_ret = 0;
dlen = PAGE_SIZE;
dst = kmap_local_page(page);
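
For the third point, extending the scheme to the other accessors means
duplicating the begin/end pair for each of them. A sketch for
folio_lruvec() (hypothetical, not part of the diff above):

	static inline struct lruvec *folio_lruvec_begin(struct folio *folio)
	{
		rcu_read_lock();
		return folio_lruvec(folio);
	}

	static inline void folio_lruvec_end(void)
	{
		rcu_read_unlock();
	}

which multiplies the cryptic-helper problem from the second point
rather than solving it.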