linux-mm.kvack.org archive mirror
From: Roman Gushchin <roman.gushchin@linux.dev>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	 Shakeel Butt <shakeel.butt@linux.dev>,
	 Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Michal Hocko <mhocko@kernel.org>,
	 David Hildenbrand <david@redhat.com>,
	linux-mm@kvack.org,  linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: skip lru_note_cost() when scanning only file or anon
Date: Fri, 11 Jul 2025 11:18:01 -0700
Message-ID: <8734b21tie.fsf@linux.dev>
In-Reply-To: <20250711155044.137652-1-roman.gushchin@linux.dev> (Roman Gushchin's message of "Fri, 11 Jul 2025 08:50:44 -0700")

Sorry, I sent the wrong version, which had a trivial bug. Below is the
correct one. The only difference is s/&sc/sc when calling
scan_balance_biased().

--

From c06530edfb8a11139f2d7878ce3956b9238cc702 Mon Sep 17 00:00:00 2001
From: Roman Gushchin <roman.gushchin@linux.dev>
Subject: [PATCH] mm: skip lru_note_cost() when scanning only file or anon

lru_note_cost() records the relative cost of incurring I/O and the CPU
time spent on LRU rotations, which is used to balance the pressure on
file and anon memory. The applied pressure is inversely proportional to
the recorded cost of reclaiming, but only within 2/3 of the range
(swappiness aside).

This is useful when both anon and file memory are reclaimable, but in
many cases they are not: e.g. there might be no swap, proactive reclaim
can target anon memory specifically, the memory pressure can come from
cgroup v1's memsw limit, etc. In all these cases recording the cost
will only bias all subsequent reclaim, potentially also outside the
scope of the original memcg.

So it's better not to record the cost when it comes from reclaim that
was biased to begin with.

lru_note_cost() is a relatively expensive function: it traverses the
memcg tree up to the root and takes the lruvec lock on each level.
It's responsible for about 50% of the cycles spent on the lruvec lock,
which can be a non-trivial number under heavy memory pressure. So
optimizing out a large number of lru_note_cost() calls is also
beneficial from the performance perspective.

Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
---
 mm/vmscan.c | 34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)
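
For readers who want to see how the recorded cost feeds into the
anon/file balance described in the changelog, here is a minimal
userspace sketch. It is an assumption modeled on the arithmetic that
get_scan_count() has used in mainline, not code from this patch: the
pressure_balance() helper, struct pressure and the MAX_SWAPPINESS value
of 200 are illustrative names and values.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the mainline upper bound for swappiness. */
#define MAX_SWAPPINESS 200

/* Hypothetical container for the two resulting pressure weights. */
struct pressure {
	uint64_t anon;
	uint64_t file;
};

/*
 * Userspace model, not kernel code. Each side's cost is padded with
 * the combined recorded cost before dividing, so a list with zero
 * recorded cost still cannot take all of the pressure.
 */
static struct pressure pressure_balance(uint64_t anon_recorded,
					uint64_t file_recorded,
					unsigned int swappiness)
{
	struct pressure p;
	uint64_t total, anon_cost, file_cost;

	assert(swappiness <= MAX_SWAPPINESS);

	total = anon_recorded + file_recorded;
	anon_cost = total + anon_recorded;
	file_cost = total + file_recorded;
	total = anon_cost + file_cost;

	/* Pressure is inversely proportional to the padded cost. */
	p.anon = swappiness * (total + 1) / (anon_cost + 1);
	p.file = (MAX_SWAPPINESS - swappiness) * (total + 1) / (file_cost + 1);

	return p;
}
```

In this model, with swappiness at 100 and all recorded cost on the anon
side, anon still receives roughly a third of the combined pressure. If
that cost was recorded during a file-only or anon-only pass, it skews
this ratio for later passes that scan both lists.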

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c86a2495138a..7d08606b08ea 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -71,6 +71,13 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/vmscan.h>
 
+enum scan_balance {
+	SCAN_EQUAL,
+	SCAN_FRACT,
+	SCAN_ANON,
+	SCAN_FILE,
+};
+
 struct scan_control {
 	/* How many pages shrink_list() should reclaim */
 	unsigned long nr_to_reclaim;
@@ -90,6 +97,7 @@ struct scan_control {
 	/*
 	 * Scan pressure balancing between anon and file LRUs
 	 */
+	enum scan_balance scan_balance;
 	unsigned long	anon_cost;
 	unsigned long	file_cost;
 
@@ -1988,6 +1996,17 @@ static int current_may_throttle(void)
 	return !(current->flags & PF_LOCAL_THROTTLE);
 }
 
+static bool scan_balance_biased(struct scan_control *sc)
+{
+	switch (sc->scan_balance) {
+	case SCAN_EQUAL:
+	case SCAN_FRACT:
+		return false;
+	default:
+		return true;
+	}
+}
+
 /*
  * shrink_inactive_list() is a helper for shrink_node().  It returns the number
  * of reclaimed pages
@@ -2054,7 +2073,9 @@ static unsigned long shrink_inactive_list(unsigned long nr_to_scan,
 	__count_vm_events(PGSTEAL_ANON + file, nr_reclaimed);
 	spin_unlock_irq(&lruvec->lru_lock);
 
-	lru_note_cost(lruvec, file, stat.nr_pageout, nr_scanned - nr_reclaimed);
+	if (!scan_balance_biased(sc))
+		lru_note_cost(lruvec, file, stat.nr_pageout,
+			      nr_scanned - nr_reclaimed);
 
 	/*
 	 * If dirty folios are scanned that are not queued for IO, it
@@ -2202,7 +2223,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	__mod_node_page_state(pgdat, NR_ISOLATED_ANON + file, -nr_taken);
 	spin_unlock_irq(&lruvec->lru_lock);
 
-	if (nr_rotated)
+	if (nr_rotated && !scan_balance_biased(sc))
 		lru_note_cost(lruvec, file, 0, nr_rotated);
 	trace_mm_vmscan_lru_shrink_active(pgdat->node_id, nr_taken, nr_activate,
 			nr_deactivate, nr_rotated, sc->priority, file);
@@ -2327,13 +2348,6 @@ static bool inactive_is_low(struct lruvec *lruvec, enum lru_list inactive_lru)
 	return inactive * inactive_ratio < active;
 }
 
-enum scan_balance {
-	SCAN_EQUAL,
-	SCAN_FRACT,
-	SCAN_ANON,
-	SCAN_FILE,
-};
-
 static void prepare_scan_control(pg_data_t *pgdat, struct scan_control *sc)
 {
 	unsigned long file;
@@ -2613,6 +2627,8 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 	calculate_pressure_balance(sc, swappiness, fraction, &denominator);
 
 out:
+	sc->scan_balance = scan_balance;
+
 	for_each_evictable_lru(lru) {
 		bool file = is_file_lru(lru);
 		unsigned long lruvec_size;
-- 
2.50.0
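
The gating added by the patch above can be modeled in userspace as
follows. This is a sketch only: struct scan_control_model,
note_cost_model() and reclaim_pass_model() are hypothetical stand-ins
that count how often the cost would be recorded. They do not reproduce
the real lru_note_cost(), which walks the memcg tree and takes the
lruvec lock on each level.

```c
#include <assert.h>
#include <stdbool.h>

enum scan_balance {
	SCAN_EQUAL,
	SCAN_FRACT,
	SCAN_ANON,
	SCAN_FILE,
};

/* Hypothetical, reduced stand-in for struct scan_control. */
struct scan_control_model {
	enum scan_balance scan_balance;
	unsigned long cost_updates;	/* times the cost was recorded */
};

/* Same decision as the helper in the patch: only ANON/FILE are biased. */
static bool scan_balance_biased(const struct scan_control_model *sc)
{
	assert(sc != 0);

	switch (sc->scan_balance) {
	case SCAN_EQUAL:
	case SCAN_FRACT:
		return false;
	default:
		return true;
	}
}

/* Stand-in for lru_note_cost(): only counts the call. */
static void note_cost_model(struct scan_control_model *sc)
{
	sc->cost_updates++;
}

/* Simulate a reclaim pass of 'batches' shrink calls under one balance. */
static unsigned long reclaim_pass_model(enum scan_balance balance,
					unsigned int batches)
{
	struct scan_control_model sc = {
		.scan_balance = balance,
		.cost_updates = 0,
	};
	unsigned int i;

	for (i = 0; i < batches; i++) {
		if (!scan_balance_biased(&sc))
			note_cost_model(&sc);
	}

	return sc.cost_updates;
}
```

Under this model a pass that scans only one LRU type records no cost at
all, while SCAN_EQUAL and SCAN_FRACT passes record it once per batch as
before.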



