From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 23 Apr 2026 11:21:10 -0700
From: "JP Kobryn (Meta)"
To: Matthew Wilcox
Cc: linux-mm@kvack.org, akpm@linux-foundation.org, vbabka@kernel.org,
 mhocko@suse.com, hannes@cmpxchg.org, shakeel.butt@linux.dev,
 riel@surriel.com, chrisl@kernel.org, kasong@tencent.com,
 shikemeng@huaweicloud.com, nphamcs@gmail.com, bhe@redhat.com,
 baohua@kernel.org, youngjun.park@lge.com, qi.zheng@linux.dev,
 axelrasmussen@google.com, yuanchu@google.com, weixugc@google.com,
 linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH] mm/lruvec: preemptively free dead folios during lru_add drain
References: <20260423164307.29805-1-jp.kobryn@linux.dev>

On 4/23/26 10:15 AM, Matthew Wilcox wrote:
> On Thu, Apr 23, 2026 at 09:43:07AM -0700, JP Kobryn (Meta) wrote:
>> Of all observable lruvec lock contention in our fleet, we find that ~24%
>> occurs when dead folios are present in lru_add batches at drain time.
>> This is wasteful in the sense that the folio is added to the LRU just
>> to be immediately removed via folios_put_refs(), incurring two
>> unnecessary lock acquisitions.
>
> Well, this is a lovely patch with no obvious downsides.  Nicely done.

Thanks for the kind words and review :)

[...]

>> diff --git a/mm/swap.c b/mm/swap.c
>> index 5cc44f0de9877..71607b0ce3d18 100644
>> --- a/mm/swap.c
>> +++ b/mm/swap.c
>> @@ -160,13 +160,36 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
>>  	int i;
>>  	struct lruvec *lruvec = NULL;
>>  	unsigned long flags = 0;
>> +	struct folio_batch free_fbatch;
>> +	bool is_lru_add = (move_fn == lru_add);
>> +
>> +	/*
>> +	 * If we're adding to the LRU, preemptively filter dead folios. Use
>> +	 * this dedicated folio batch for temp storage and deferred cleanup.
>> +	 */
>> +	if (is_lru_add)
>> +		folio_batch_init(&free_fbatch);
>>
>>  	for (i = 0; i < folio_batch_count(fbatch); i++) {
>>  		struct folio *folio = fbatch->folios[i];
>>
>>  		/* block memcg migration while the folio moves between lru */
>> -		if (move_fn != lru_add && !folio_test_clear_lru(folio))
>> +		if (!is_lru_add && !folio_test_clear_lru(folio))
>> +			continue;
>> +
>> +		/*
>> +		 * Filter dead folios by moving them from the add batch to the
>> +		 * temp batch for freeing after this loop.
>> +		 *
>> +		 * Since the folio may be part of a huge page, unqueue from
>> +		 * deferred split list to avoid a dangling list entry.
>> +		 */
>> +		if (is_lru_add && folio_ref_freeze(folio, 1)) {
>> +			folio_unqueue_deferred_split(folio);
>
> Would it be better to do this outside the lru lock; it's just that we
> don't have a convenient batched version to do it?  It seems like
> there are a few places that could use a batched version in vmscan.c and
> swap.c.  Not that I think we should hold up this patch to investigate
> that micro-optimisation!  Just something you could look at as a
> follow-up.

Good call. I'll leave this patch as-is (unless other feedback), then
pursue the batched version of unqueuing the split in a separate
follow-up patch (rough sketch below).

>
> Reviewed-by: Matthew Wilcox (Oracle)
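
For the archives, a rough sketch of the shape that follow-up helper
could take. To be clear, this is untested and the name
folio_batch_unqueue_deferred_split() is made up; a naive loop like this
only hoists the calls out of the lruvec lock, and the real batching win
you describe would come from also taking the split queue lock once per
queue rather than once per folio:

/*
 * Hypothetical sketch, not part of this patch: unqueue deferred splits
 * for a whole batch before the lruvec lock is taken. Assumes the caller
 * has already frozen each folio's refcount, as folio_batch_move_lru()
 * does for the lru_add case.
 */
static void folio_batch_unqueue_deferred_split(struct folio_batch *fbatch)
{
	int i;

	for (i = 0; i < folio_batch_count(fbatch); i++)
		folio_unqueue_deferred_split(fbatch->folios[i]);
}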