From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 23 Apr 2026 11:21:10 -0700
From: "JP Kobryn (Meta)"
To: Matthew Wilcox
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, vbabka@kernel.org,
 mhocko@suse.com, hannes@cmpxchg.org, shakeel.butt@linux.dev,
 riel@surriel.com, chrisl@kernel.org, kasong@tencent.com,
 shikemeng@huaweicloud.com, nphamcs@gmail.com, bhe@redhat.com,
 baohua@kernel.org, youngjun.park@lge.com, qi.zheng@linux.dev,
 axelrasmussen@google.com, yuanchu@google.com, weixugc@google.com,
 linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH] mm/lruvec: preemptively free dead folios during lru_add drain
References: <20260423164307.29805-1-jp.kobryn@linux.dev>

On 4/23/26 10:15 AM, Matthew Wilcox wrote:
> On Thu, Apr 23, 2026 at 09:43:07AM -0700, JP Kobryn (Meta) wrote:
>> Of all observable lruvec lock contention in our fleet, we find that ~24%
>> occurs when dead folios are present in lru_add batches at drain time.
>> This is wasteful in the sense that the folio is added to the LRU just
>> to be immediately removed via folios_put_refs(), incurring two
>> unnecessary lock acquisitions.
>
> Well, this is a lovely patch with no obvious downsides.  Nicely done.

Thanks for the kind words and review :)

[...]

>> diff --git a/mm/swap.c b/mm/swap.c
>> index 5cc44f0de9877..71607b0ce3d18 100644
>> --- a/mm/swap.c
>> +++ b/mm/swap.c
>> @@ -160,13 +160,36 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
>>  	int i;
>>  	struct lruvec *lruvec = NULL;
>>  	unsigned long flags = 0;
>> +	struct folio_batch free_fbatch;
>> +	bool is_lru_add = (move_fn == lru_add);
>> +
>> +	/*
>> +	 * If we're adding to the LRU, preemptively filter dead folios. Use
>> +	 * this dedicated folio batch for temp storage and deferred cleanup.
>> +	 */
>> +	if (is_lru_add)
>> +		folio_batch_init(&free_fbatch);
>>
>>  	for (i = 0; i < folio_batch_count(fbatch); i++) {
>>  		struct folio *folio = fbatch->folios[i];
>>
>>  		/* block memcg migration while the folio moves between lru */
>> -		if (move_fn != lru_add && !folio_test_clear_lru(folio))
>> +		if (!is_lru_add && !folio_test_clear_lru(folio))
>> +			continue;
>> +
>> +		/*
>> +		 * Filter dead folios by moving them from the add batch to the
>> +		 * temp batch for freeing after this loop.
>> +		 *
>> +		 * Since the folio may be part of a huge page, unqueue from
>> +		 * deferred split list to avoid a dangling list entry.
>> +		 */
>> +		if (is_lru_add && folio_ref_freeze(folio, 1)) {
>> +			folio_unqueue_deferred_split(folio);
>
> Would it be better to do this outside the lru lock; it's just that we
> don't have a convenient batched version to do it?  It seems like
> there are a few places that could use a batched version in vmscan.c and
> swap.c.  Not that I think we should hold up this patch to investigate
> that micro-optimisation!  Just something you could look at as a
> follow-up.

Good call. I'll leave this patch as-is (unless other feedback), then
pursue the batched version of unqueuing the split in a separate
follow-up patch (rough sketch below).

>
> Reviewed-by: Matthew Wilcox (Oracle)
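
For the archives, a rough sketch of the shape that follow-up helper
could take. To be clear, this is untested and the name
folio_batch_unqueue_deferred_split() is made up; a naive loop like this
only hoists the calls out of the lruvec lock, and the real batching win
you describe would come from also taking the split queue lock once per
queue rather than once per folio:

/*
 * Hypothetical sketch, not part of this patch: unqueue deferred splits
 * for a whole batch before the lruvec lock is taken. Assumes the caller
 * has already frozen each folio's refcount, as folio_batch_move_lru()
 * does for the lru_add case.
 */
static void folio_batch_unqueue_deferred_split(struct folio_batch *fbatch)
{
	int i;

	for (i = 0; i < folio_batch_count(fbatch); i++)
		folio_unqueue_deferred_split(fbatch->folios[i]);
}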