From: Johannes Weiner <hannes@cmpxchg.org>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Yosry Ahmed <yosry.ahmed@linux.dev>, Zi Yan <ziy@nvidia.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Usama Arif <usama.arif@linux.dev>,
	Kiryl Shutsemau <kas@kernel.org>,
	Dave Chinner <david@fromorbit.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 7/7] mm: switch deferred split shrinker to list_lru
Date: Fri, 20 Mar 2026 12:02:38 -0400
Message-ID: <ab1vnhL4XftNdTSu@cmpxchg.org>
In-Reply-To: <e9631520-857f-4f67-8944-6af7a2b47e89@kernel.org>

On Thu, Mar 19, 2026 at 08:21:21AM +0100, David Hildenbrand (Arm) wrote:
> >>> @@ -3802,33 +3706,28 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
> >>>  	struct folio *new_folio, *next;
> >>>  	int old_order = folio_order(folio);
> >>>  	int ret = 0;
> >>> -	struct deferred_split *ds_queue;
> >>> +	struct list_lru_one *l;
> >>>  
> >>>  	VM_WARN_ON_ONCE(!mapping && end);
> >>>  	/* Prevent deferred_split_scan() touching ->_refcount */
> >>> -	ds_queue = folio_split_queue_lock(folio);
> >>> +	rcu_read_lock();
> >>
> >> The RCU lock is for the folio_memcg(), right?
> >>
> >> I recall I raised in the past that some get/put-like logic (that wraps
> >> the rcu_read_lock() + folio_memcg()) might make this a lot easier to get.
> >>
> >>
> >> memcg = folio_memcg_lookup(folio)
> >>
> >> ... do stuff
> >>
> >> folio_memcg_putback(folio, memcg);
> >>
> >> Or sth like that.
> >>
> >>
> >> Alternatively, you could have some helpers that do the
> >> list_lru_lock+unlock etc.
> >>
> >> folio_memcg_list_lru_lock()
> >> ...
> >> folio_memcg_list_lru_unlock(l);
> >>
> >> Just some thoughts as inspiration :)
> > 
> > I remember you raising this in the objcg + reparenting patches. There
> > are a few more instances of
> > 
> > 	rcu_read_lock()
> > 	foo = folio_memcg()
> > 	...
> > 	rcu_read_unlock()
> > 
> > in other parts of the code not touched by these patches here, so the
> > first pattern is a more universal encapsulation.
> > 
> > Let me look into this. Would you be okay with a follow-up that covers
> > the others as well?
> 
> Of course :) If list_lru lock helpers are the right thing to do, they
> might be better placed in this series.

I'm playing around with the below. But there are a few things that
seem suboptimal:

- We need a local @memcg, which makes sites that merely pass
  folio_memcg() on to somewhere else fatter; more LOC per site on
  average (see the sketch after this list).
- Despite being more verbose, it communicates less. rcu_read_lock()
  is universally understood; folio_memcg_foo() is cryptic.
- It doesn't cover similar accessors with the same lifetime rules,
  like folio_lruvec() and folio_memcg_check().
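
To make the first point concrete, here's the shape change at a
minimal, made-up call site (a sketch only, assuming the usual
mod_memcg_state() signature; the real conversions are in the diff
below):

	/* today: the memcg is consumed inline, no local needed */
	rcu_read_lock();
	mod_memcg_state(folio_memcg(folio), idx, val);
	rcu_read_unlock();

	/* with the helpers: a local @memcg bridges begin and end */
	struct mem_cgroup *memcg;

	memcg = folio_memcg_begin(folio);
	mod_memcg_state(memcg, idx, val);
	folio_memcg_end();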

 include/linux/memcontrol.h | 41 ++++++++++++++++++++++++++++++++---------
 mm/huge_memory.c           | 34 ++++++++++++++++++----------------
 mm/list_lru.c              |  5 ++---
 mm/memcontrol.c            | 22 +++++++++++-----------
 mm/migrate.c               |  5 ++---
 mm/page_io.c               | 14 +++++++-------
 mm/vmscan.c                |  7 ++++---
 mm/workingset.c            |  5 ++---
 mm/zswap.c                 | 11 ++++++-----
 9 files changed, 84 insertions(+), 60 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 0782c72a1997..5162145b9322 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -430,6 +430,23 @@ static inline struct mem_cgroup *folio_memcg(struct folio *folio)
 	return objcg ? obj_cgroup_memcg(objcg) : NULL;
 }
 
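+/*
+ * folio_memcg_begin - look up a folio's memcg, stabilized by RCU
+ * @folio: Pointer to the folio.
+ *
+ * Only valid until the matching folio_memcg_end().
+ */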
+static inline struct mem_cgroup *folio_memcg_begin(struct folio *folio)
+{
+	rcu_read_lock();
+	return folio_memcg(folio);
+}
+
+static inline void folio_memcg_end(void)
+{
+	rcu_read_unlock();
+}
+
 /*
  * folio_memcg_charged - If a folio is charged to a memory cgroup.
  * @folio: Pointer to the folio.
@@ -917,11 +928,10 @@ static inline void mod_memcg_page_state(struct page *page,
 	if (mem_cgroup_disabled())
 		return;
 
-	rcu_read_lock();
-	memcg = folio_memcg(page_folio(page));
+	memcg = folio_memcg_begin(page_folio(page));
 	if (memcg)
 		mod_memcg_state(memcg, idx, val);
-	rcu_read_unlock();
+	folio_memcg_end();
 }
 
 unsigned long memcg_events(struct mem_cgroup *memcg, int event);
@@ -949,10 +959,9 @@ static inline void count_memcg_folio_events(struct folio *folio,
 	if (!folio_memcg_charged(folio))
 		return;
 
-	rcu_read_lock();
-	memcg = folio_memcg(folio);
+	memcg = folio_memcg_begin(folio);
 	count_memcg_events(memcg, idx, nr);
-	rcu_read_unlock();
+	folio_memcg_end();
 }
 
 static inline void count_memcg_events_mm(struct mm_struct *mm,
@@ -1035,6 +1044,15 @@ static inline struct mem_cgroup *folio_memcg(struct folio *folio)
 	return NULL;
 }
 
+static inline struct mem_cgroup *folio_memcg_begin(struct folio *folio)
+{
+	return NULL;
+}
+
+static inline void folio_memcg_end(void)
+{
+}
+
 static inline bool folio_memcg_charged(struct folio *folio)
 {
 	return false;
@@ -1546,11 +1564,10 @@ static inline void mem_cgroup_track_foreign_dirty(struct folio *folio,
 	if (!folio_memcg_charged(folio))
 		return;
 
-	rcu_read_lock();
-	memcg = folio_memcg(folio);
+	memcg = folio_memcg_begin(folio);
 	if (unlikely(&memcg->css != wb->memcg_css))
 		mem_cgroup_track_foreign_dirty_slowpath(folio, wb);
-	rcu_read_unlock();
+	folio_memcg_end();
 }
 
 void mem_cgroup_flush_foreign(struct bdi_writeback *wb);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e90d08db219d..1aa20c1dd0c1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3769,9 +3769,10 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
 	/* Prevent deferred_split_scan() touching ->_refcount */
 	dequeue_deferred = folio_test_anon(folio) && old_order > 1;
 	if (dequeue_deferred) {
-		rcu_read_lock();
-		l = list_lru_lock(&deferred_split_lru,
-				  folio_nid(folio), folio_memcg(folio));
+		struct mem_cgroup *memcg;
+
+		memcg = folio_memcg_begin(folio);
+		l = list_lru_lock(&deferred_split_lru, folio_nid(folio), memcg);
 	}
 	if (folio_ref_freeze(folio, folio_cache_ref_count(folio) + 1)) {
 		struct swap_cluster_info *ci = NULL;
@@ -3786,7 +3787,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
 					MTHP_STAT_NR_ANON_PARTIALLY_MAPPED, -1);
 			}
 			list_lru_unlock(l);
-			rcu_read_unlock();
+			folio_memcg_end();
 		}
 
 		if (mapping) {
@@ -3891,7 +3892,7 @@ static int __folio_freeze_and_split_unmapped(struct folio *folio, unsigned int n
 	} else {
 		if (dequeue_deferred) {
 			list_lru_unlock(l);
-			rcu_read_unlock();
+			folio_memcg_end();
 		}
 		return -EAGAIN;
 	}
@@ -4272,12 +4273,13 @@ bool __folio_unqueue_deferred_split(struct folio *folio)
 	int nid = folio_nid(folio);
 	unsigned long flags;
 	bool unqueued = false;
+	struct mem_cgroup *memcg;
 
 	WARN_ON_ONCE(folio_ref_count(folio));
 	WARN_ON_ONCE(!mem_cgroup_disabled() && !folio_memcg_charged(folio));
 
-	rcu_read_lock();
-	l = list_lru_lock_irqsave(&deferred_split_lru, nid, folio_memcg(folio), &flags);
+	memcg = folio_memcg_begin(folio);
+	l = list_lru_lock_irqsave(&deferred_split_lru, nid, memcg, &flags);
 	if (__list_lru_del(&deferred_split_lru, l, &folio->_deferred_list, nid)) {
 		if (folio_test_partially_mapped(folio)) {
 			folio_clear_partially_mapped(folio);
@@ -4287,7 +4289,7 @@ bool __folio_unqueue_deferred_split(struct folio *folio)
 		unqueued = true;
 	}
 	list_lru_unlock_irqrestore(l, &flags);
-	rcu_read_unlock();
+	folio_memcg_end();
 
 	return unqueued;	/* useful for debug warnings */
 }
@@ -4322,8 +4324,7 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
 
 	nid = folio_nid(folio);
 
-	rcu_read_lock();
-	memcg = folio_memcg(folio);
+	memcg = folio_memcg_begin(folio);
 	l = list_lru_lock_irqsave(&deferred_split_lru, nid, memcg, &flags);
 	if (partially_mapped) {
 		if (!folio_test_partially_mapped(folio)) {
@@ -4339,7 +4340,7 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
 	}
 	__list_lru_add(&deferred_split_lru, l, &folio->_deferred_list, nid, memcg);
 	list_lru_unlock_irqrestore(l, &flags);
-	rcu_read_unlock();
+	folio_memcg_end();
 }
 
 static unsigned long deferred_split_count(struct shrinker *shrink,
@@ -4445,16 +4446,17 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 		 * don't add it back to split_queue.
 		 */
 		if (!did_split && folio_test_partially_mapped(folio)) {
-			rcu_read_lock();
+			struct mem_cgroup *memcg;
+
+			memcg = folio_memcg_begin(folio);
 			l = list_lru_lock_irqsave(&deferred_split_lru,
-						  folio_nid(folio),
-						  folio_memcg(folio),
+						  folio_nid(folio), memcg,
 						  &flags);
 			__list_lru_add(&deferred_split_lru, l,
 				       &folio->_deferred_list,
-				       folio_nid(folio), folio_memcg(folio));
+				       folio_nid(folio), memcg);
 			list_lru_unlock_irqrestore(l, &flags);
-			rcu_read_unlock();
+			folio_memcg_end();
 		}
 		folio_put(folio);
 	}
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 1ccdd45b1d14..638d084bb0f5 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -604,10 +604,9 @@ int folio_memcg_list_lru_alloc(struct folio *folio, struct list_lru *lru,
 		return 0;
 
 	/* Fast path when list_lru heads already exist */
-	rcu_read_lock();
-	memcg = folio_memcg(folio);
+	memcg = folio_memcg_begin(folio);
 	res = memcg_list_lru_allocated(memcg, lru);
-	rcu_read_unlock();
+	folio_memcg_end();
 	if (likely(res))
 		return 0;
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f381cb6bdff1..14732f1542f2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -965,18 +965,17 @@ void lruvec_stat_mod_folio(struct folio *folio, enum node_stat_item idx,
 	pg_data_t *pgdat = folio_pgdat(folio);
 	struct lruvec *lruvec;
 
-	rcu_read_lock();
-	memcg = folio_memcg(folio);
+	memcg = folio_memcg_begin(folio);
 	/* Untracked pages have no memcg, no lruvec. Update only the node */
 	if (!memcg) {
-		rcu_read_unlock();
+		folio_memcg_end();
 		mod_node_page_state(pgdat, idx, val);
 		return;
 	}
 
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 	mod_lruvec_state(lruvec, idx, val);
-	rcu_read_unlock();
+	folio_memcg_end();
 }
 EXPORT_SYMBOL(lruvec_stat_mod_folio);
 
@@ -1170,11 +1169,13 @@ struct mem_cgroup *get_mem_cgroup_from_folio(struct folio *folio)
 	if (!folio_memcg_charged(folio))
 		return root_mem_cgroup;
 
-	rcu_read_lock();
 	do {
-		memcg = folio_memcg(folio);
-	} while (unlikely(!css_tryget(&memcg->css)));
-	rcu_read_unlock();
+		memcg = folio_memcg_begin(folio);
+		if (likely(css_tryget(&memcg->css)))
+			break;
+		folio_memcg_end();
+	} while (1);
+	folio_memcg_end();
 	return memcg;
 }
 
@@ -5535,8 +5533,7 @@ bool mem_cgroup_swap_full(struct folio *folio)
 	if (do_memsw_account() || !folio_memcg_charged(folio))
 		return ret;
 
-	rcu_read_lock();
-	memcg = folio_memcg(folio);
+	memcg = folio_memcg_begin(folio);
 	for (; !mem_cgroup_is_root(memcg); memcg = parent_mem_cgroup(memcg)) {
 		unsigned long usage = page_counter_read(&memcg->swap);
 
@@ -5546,7 +5543,7 @@ bool mem_cgroup_swap_full(struct folio *folio)
 			break;
 		}
 	}
-	rcu_read_unlock();
+	folio_memcg_end();
 
 	return ret;
 }
diff --git a/mm/migrate.c b/mm/migrate.c
index fdbb20163f66..a2d542ebf3ed 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -672,8 +672,7 @@ static int __folio_migrate_mapping(struct address_space *mapping,
 		struct lruvec *old_lruvec, *new_lruvec;
 		struct mem_cgroup *memcg;
 
-		rcu_read_lock();
-		memcg = folio_memcg(folio);
+		memcg = folio_memcg_begin(folio);
 		old_lruvec = mem_cgroup_lruvec(memcg, oldzone->zone_pgdat);
 		new_lruvec = mem_cgroup_lruvec(memcg, newzone->zone_pgdat);
 
@@ -700,7 +699,7 @@ static int __folio_migrate_mapping(struct address_space *mapping,
 			mod_lruvec_state(new_lruvec, NR_FILE_DIRTY, nr);
 			__mod_zone_page_state(newzone, NR_ZONE_WRITE_PENDING, nr);
 		}
-		rcu_read_unlock();
+		folio_memcg_end();
 	}
 	local_irq_enable();
 
diff --git a/mm/page_io.c b/mm/page_io.c
index 63b262f4c5a9..862135a65848 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -239,6 +239,7 @@ static void swap_zeromap_folio_clear(struct folio *folio)
  */
 int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug)
 {
+	struct mem_cgroup *memcg;
 	int ret = 0;
 
 	if (folio_free_swap(folio))
@@ -277,13 +278,13 @@ int swap_writeout(struct folio *folio, struct swap_iocb **swap_plug)
 		goto out_unlock;
 	}
 
-	rcu_read_lock();
-	if (!mem_cgroup_zswap_writeback_enabled(folio_memcg(folio))) {
+	memcg = folio_memcg_begin(folio);
+	if (!mem_cgroup_zswap_writeback_enabled(memcg)) {
-		rcu_read_unlock();
+		folio_memcg_end();
 		folio_mark_dirty(folio);
 		return AOP_WRITEPAGE_ACTIVATE;
 	}
-	rcu_read_unlock();
+	folio_memcg_end();
 
 	__swap_writepage(folio, swap_plug);
 	return 0;
@@ -314,11 +315,10 @@ static void bio_associate_blkg_from_page(struct bio *bio, struct folio *folio)
 	if (!folio_memcg_charged(folio))
 		return;
 
-	rcu_read_lock();
-	memcg = folio_memcg(folio);
+	memcg = folio_memcg_begin(folio);
 	css = cgroup_e_css(memcg->css.cgroup, &io_cgrp_subsys);
 	bio_associate_blkg_from_css(bio, css);
-	rcu_read_unlock();
+	folio_memcg_end();
 }
 #else
 #define bio_associate_blkg_from_page(bio, folio)		do { } while (0)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 33287ba4a500..12ad40fa7d60 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3407,6 +3407,7 @@ static struct folio *get_pfn_folio(unsigned long pfn, struct mem_cgroup *memcg,
 				   struct pglist_data *pgdat)
 {
 	struct folio *folio = pfn_folio(pfn);
+	struct mem_cgroup *this_memcg;
 
 	if (folio_lru_gen(folio) < 0)
 		return NULL;
@@ -3414,10 +3415,10 @@ static struct folio *get_pfn_folio(unsigned long pfn, struct mem_cgroup *memcg,
 	if (folio_nid(folio) != pgdat->node_id)
 		return NULL;
 
-	rcu_read_lock();
-	if (folio_memcg(folio) != memcg)
+	this_memcg = folio_memcg_begin(folio);
+	if (this_memcg != memcg)
 		folio = NULL;
-	rcu_read_unlock();
+	folio_memcg_end();
 
 	return folio;
 }
diff --git a/mm/workingset.c b/mm/workingset.c
index 07e6836d0502..77bfec58b797 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -251,8 +251,7 @@ static void *lru_gen_eviction(struct folio *folio)
 	BUILD_BUG_ON(LRU_GEN_WIDTH + LRU_REFS_WIDTH >
 		     BITS_PER_LONG - max(EVICTION_SHIFT, EVICTION_SHIFT_ANON));
 
-	rcu_read_lock();
-	memcg = folio_memcg(folio);
+	memcg = folio_memcg_begin(folio);
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 	lrugen = &lruvec->lrugen;
 	min_seq = READ_ONCE(lrugen->min_seq[type]);
@@ -261,7 +260,7 @@ static void *lru_gen_eviction(struct folio *folio)
 	hist = lru_hist_from_seq(min_seq);
 	atomic_long_add(delta, &lrugen->evicted[hist][type][tier]);
 	memcg_id = mem_cgroup_private_id(memcg);
-	rcu_read_unlock();
+	folio_memcg_end();
 
 	return pack_shadow(memcg_id, pgdat, token, workingset, type);
 }
diff --git a/mm/zswap.c b/mm/zswap.c
index 4f2e652e8ad3..fb035dd70d8b 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -895,14 +895,15 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	 * to the active LRU list in the case.
 	 */
 	if (comp_ret || !dlen || dlen >= PAGE_SIZE) {
-		rcu_read_lock();
-		if (!mem_cgroup_zswap_writeback_enabled(
-					folio_memcg(page_folio(page)))) {
-			rcu_read_unlock();
+		struct mem_cgroup *memcg;
+
+		memcg = folio_memcg_begin(page_folio(page));
+		if (!mem_cgroup_zswap_writeback_enabled(memcg)) {
+			folio_memcg_end();
 			comp_ret = comp_ret ? comp_ret : -EINVAL;
 			goto unlock;
 		}
-		rcu_read_unlock();
+		folio_memcg_end();
 		comp_ret = 0;
 		dlen = PAGE_SIZE;
 		dst = kmap_local_page(page);

