From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Barry Song (Xiaomi)"
To: axelrasmussen@google.com
Cc: akpm@linux-foundation.org, baohua@kernel.org, kasong@tencent.com,
    lance.yang@linux.dev, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    liulei.rjpt@vivo.com, pfalcato@suse.de, qi.zheng@linux.dev,
    shakeel.butt@linux.dev, surenb@google.com, wangzicheng@honor.com,
    weixugc@google.com, will@kernel.org, willy@infradead.org,
    xueyuan.chen21@gmail.com, yuanchu@google.com
Subject: Re: [PATCH] mm/mglru: Use folio_mark_accessed to replace folio_set_active in PF
Date: Tue, 28 Apr 2026 09:35:20 +0800
Message-Id: <20260428013520.47417-1-baohua@kernel.org>
X-Mailer: git-send-email 2.39.3 (Apple Git-146)
In-Reply-To: 
References: 
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On Tue, Apr 28, 2026 at 2:23 AM Axel Rasmussen wrote:
>
> For what it's worth, I agree with this change in principle.
>
> In production we set fault_around_bytes to 4096. That setting is
> surprisingly load-bearing (i.e. if I change it, even at a small
> experimental scale, I expect workloads to notice and complain). So I
> don't think I have an easy way to test this change under production
> workloads.
>
> Like Andrew said, the workload in the commit message doesn't seem
> unreasonable, and the benefit is large.
>
> I guess the workload that would see a downside from this is one that
> heavily uses readahead pages but also generates many "one-time-use"
> pages instead of maintaining a "fixed" working set. Without activating
> the readahead pages, does it lose some of the readahead benefit
> because they are pushed out?
>
> About the Sashiko comments, the tier bits being cleared doesn't seem
> that problematic to me.
> However, the WORKINGSET_ACTIVATE counter issue
> seems worth fixing.
>

I am considering something more reasonable than simply "fixing" the
counter. Right now, MGLRU unconditionally counts PF folios toward
WORKINGSET_ACTIVATE_BASE and ignores all other folios. I am thinking of
a better approach that detects true recency. In the active/inactive
case, this is refault_distance < workingset_size. In MGLRU, we could
instead check whether the folio was reclaimed within the most recent
one or two generations.

I am queuing the following for testing:

diff --git a/mm/workingset.c b/mm/workingset.c
index 07e6836d0502..8b552b3d7e37 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -271,10 +271,11 @@ static void *lru_gen_eviction(struct folio *folio)
  * Fills in @lruvec, @token, @workingset with the values unpacked from shadow.
  */
 static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
-				unsigned long *token, bool *workingset, bool file)
+				unsigned long *token, bool *workingset, bool file,
+				unsigned long *gen_distance)
 {
 	int memcg_id;
-	unsigned long max_seq;
+	unsigned long max_seq, distance;
 	struct mem_cgroup *memcg;
 	struct pglist_data *pgdat;
@@ -286,7 +287,10 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
 	max_seq = READ_ONCE((*lruvec)->lrugen.max_seq);
 	max_seq &= (file ? EVICTION_MASK : EVICTION_MASK_ANON) >> LRU_REFS_WIDTH;
-	return abs_diff(max_seq, *token >> LRU_REFS_WIDTH) < MAX_NR_GENS;
+	distance = abs_diff(max_seq, *token >> LRU_REFS_WIDTH);
+	if (gen_distance)
+		*gen_distance = distance;
+	return distance < MAX_NR_GENS;
 }
 
 static void lru_gen_refault(struct folio *folio, void *shadow)
@@ -294,7 +298,7 @@ static void lru_gen_refault(struct folio *folio, void *shadow)
 	bool recent;
 	int hist, tier, refs;
 	bool workingset;
-	unsigned long token;
+	unsigned long token, distance;
 	struct lruvec *lruvec;
 	struct lru_gen_folio *lrugen;
 	int type = folio_is_file_lru(folio);
@@ -302,7 +306,8 @@ static void lru_gen_refault(struct folio *folio, void *shadow)
 
 	rcu_read_lock();
 
-	recent = lru_gen_test_recent(shadow, &lruvec, &token, &workingset, type);
+	recent = lru_gen_test_recent(shadow, &lruvec, &token, &workingset, type,
+				     &distance);
 
 	if (lruvec != folio_lruvec(folio))
 		goto unlock;
@@ -319,9 +324,11 @@ static void lru_gen_refault(struct folio *folio, void *shadow)
 
 	atomic_long_add(delta, &lrugen->refaulted[hist][type][tier]);
 
-	/* see folio_add_lru() where folio_set_active() will be called */
-	if (lru_gen_in_fault())
+	/* If the folio was reclaimed very recently. */
+	if (distance <= MIN_LRU_GENS) {
+		folio_set_active(folio);
 		mod_lruvec_state(lruvec, WORKINGSET_ACTIVATE_BASE + type, delta);
+	}
 
 	if (workingset) {
 		folio_set_workingset(folio);
@@ -442,7 +449,7 @@ bool workingset_test_recent(void *shadow, bool file, bool *workingset,
 	rcu_read_lock();
 
 	recent = lru_gen_test_recent(shadow, &eviction_lruvec, &eviction,
-				     workingset, file);
+				     workingset, file, NULL);
 
 	rcu_read_unlock();
 	return recent;
 }